The problem of the applicability of mathematics is the problem of explaining why mathematics plays various important roles in the natural sciences (or in nature). This problem has a long history and has been addressed by many mathematicians, scientists, and philosophers. As Mark Steiner aptly put it, “to an unappreciated degree, the history of Western philosophy is the history of attempts to understand why mathematics is applicable to Nature….” The problem is fundamental in that every philosophy of physics and every philosophy of mathematics must explain the dependency of physics on mathematics. The discussion on this problem has been especially influenced by the leading theoretical physicist, Eugene Wigner, whose 1960 paper revived and reformulated the applicability problem as “The Unreasonable Effectiveness of Mathematics.”
What seemed puzzling and “mysterious” to Wigner was the strong relationship between mathematics and physics. As Wigner noted, without mathematics there would not be such a thing as modern physics. There are many ways that physics uses mathematics. The properties of physical objects are all quantified by numbers (natural numbers, real numbers, complex numbers, and so forth). The laws are expressed in mathematical syntax such as formulas and inequalities. The physical processes are described by mathematical structures such as manifolds, vector spaces, and groups. Looking at all these aspects, one cannot resist asking: Why is mathematics so central to physics? This is the applicability problem that Wigner raised, and on which this article focuses. There is no consensus on the solution to the problem.
The relationship between mathematics and modern physics has turned out to be enduring, important, and strong. It is not only the strength of this relationship but also its significance that has created a sense of wonder among scholars. What seems to be puzzling is the underlying distinctness between the relata of this relationship. Physics is the study of inanimate nature and the discovery of its laws, or symmetries, or models. Mathematics is the study of abstract concepts and structures, which seem to be removed from the empirical study of nature. How is it then that there is such a strong and long-lasting relationship?
The philosophical problem of the applicability of mathematics arises from a distinctness thesis. For some, the distinctness is ontological: mathematical objects belong to a different reality than physical objects. Physical objects are said to be causal and spatiotemporal. Mathematical objects, on the other hand, stand outside the flow of time and causality. How are these two realms related to each other? Why is it that knowledge about the abstract realm of mathematics proves to be effective in generating knowledge about the world of concrete physical phenomena?
Mathematics is also distinct from physics in its methodology. The research in mathematics is said to be driven by internal mathematical standards, such as rigor, mathematical beauty, and elegance. On this conception, mathematics is: (1) an autonomous discipline, (2) with its internal norms such as rigor, generality, consistency, mathematical beauty (notoriously, applicability is not among them), and (3) whose objects of study are abstract. Physics, on the other hand, is governed by standards such as empirical adequacy and producing empirically testable results. Why is it that a practice that is guided by rigor, elegance, simplicity, and formal beauty proves successful in an area that is governed by empirical standards (Islami & Wiltsche 2020)?
Given this distinctness, the applicability problem is the problem of explaining why mathematics is effective and useful for describing, explaining, and predicting the kinds of physical phenomena with which science is concerned. The dependency of physics on mathematics can easily be seen in the early 21st century academy, where every physics class requires many mathematics classes as prerequisites. To some extent, a physics statement (such as a physical law) is not complete until it is put into a mathematical language. One needs the correct differential equation, the exact function, or the right formula. Otherwise, it is not physics.
Wigner and many other prominent scientists highlighted the distinctness and a sense of wonder and puzzlement over the applicability:
Albert Einstein wrote:
At this point an enigma presents itself, which in all ages has agitated inquiring minds. How can it be that mathematics, being after all a product of human thought which is independent of experience, is so admirably appropriate to the objects of reality? Is human reason, then, without experience, merely by taking thought, able to fathom the properties of real things? (Einstein 1921)
Paul Dirac described the situation as follows:
[T]he mathematician plays a game in which he himself invents the rules while the physicist plays a game in which the rules are provided by Nature, but as time goes on it becomes increasingly evident that the rules which the mathematician finds interesting are the same as those which Nature has chosen. It is difficult to predict what the result of all this will be. (Dirac 1939)
David Hilbert wrote:
We are confronted with the peculiar fact that matter seems to comply well and truly to the formalism of mathematics. There arises an unforeseen unison of being and thinking, which for the present we have to accept like a miracle. (Hilbert 1919)
In Wigner’s important paper, he wrote:
The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve. We should be grateful for it and hope that it will remain valid in future research and that it will extend, for better or for worse, to our pleasure, even though perhaps also to our bafflement, to wide branches of learning. (Wigner 1960, p. 237)
[T]he enormous usefulness of mathematics in the natural sciences is something bordering on the mysterious and there is no rational explanation for it. (Wigner 1960, p. 223)
It is difficult to avoid the impression that a miracle confronts us here, quite comparable in its striking nature to the miracle that the human mind can string a thousand arguments together without getting itself into contradictions, or to the two miracles of the existence of laws of nature and of the human mind’s capacity to divine them. (Wigner 1960, p. 229)
In non-Western traditions, the question of the applicability of mathematics and the status of its entities also has a long history. In the Persian and Islamic worlds, for instance, the question of the relationship between mathematics and astronomy was subject to heated debates. See (Zarepour 2022).
b. Wigner’s Formulation of the Problem
In Wigner’s view, the differences between mathematics and physics are so fundamental that there is no way to bridge the gap between them and thus to explain the applicability of mathematics. To see why, look at how he defined mathematics and physics.
Wigner defined mathematics as the “science of skillful operations with concepts and rules invented for just this purpose” (Wigner 1960). The emphasis here is on the invention of mathematical concepts on the basis of their manipulability and formal beauty without paying attention to their applicability.
As G.H. Hardy wrote:
The mathematician’s patterns, like the painter’s or the poet’s, must be beautiful; the ideas, like the colors or the words, must fit together in a harmonious way. Beauty is the first test: there is no permanent place in the world for ugly mathematics. (Hardy 1940, p. 85)
Bertrand Russell also emphasized the role of beauty in mathematics when he wrote:
Mathematics, rightly viewed, possesses not only truth, but supreme beauty, cold and austere, like that of sculpture, without appeal to any part of our weaker nature, without the gorgeous trappings of painting or music, yet sublimely pure, and capable of a stern perfection such as only the greatest art can show. The true spirit of delight, the exaltation, the sense of being more than Man, which is the touchstone of the highest excellence, is to be found in mathematics as surely as in poetry. (Also quoted in Wigner 1960)
Wigner was an advocate of the Hilbert School of formalism, according to which mathematics is understood to be a game of meaningless symbols (Ferreiros 2017). While mathematicians, in Wigner’s view, are concerned with beautiful and manipulable patterns, physicists have a different objective: “The physicist, unlike the mathematician, is interested in inanimate nature and in discovering ‘regularities’ that exist among the natural phenomena” (Wigner 1960).
Given this distinctness, what seems to be puzzling is that virtually every law of physics is formulated mathematically: the second law of motion in classical mechanics is formulated using the second derivative of position with respect to time; the axioms of quantum mechanics are formulated using a complex Hilbert space, where the states are defined as vectors in the space and the observables as self-adjoint operators on it; relativity theory is formulated using smooth manifolds. These are just a few instances of the general phenomenon of the successful use of mathematical concepts in physical theories.
The situation, Wigner claimed, is like “a man who was provided with a bunch of keys and who, having to open several doors in succession, always hit on the right key on the first or second trial” (Wigner 1960). In this analogy, the “keys” are mathematical concepts, and the locks are regularities in natural phenomena.
Mark Steiner (Steiner 1998, p. 46) summarized Wigner’s argument as follows:
(1) Mathematical concepts arise from the aesthetic impulse in humans.
(2) It is unreasonable to expect that what arises from the aesthetic impulse in humans should be significantly effective in physics.
(3) Nevertheless, a significant number of these concepts are significantly effective in physics.
(4) Hence, mathematical concepts are unreasonably effective in physics.
c. Examples of Applicability
There are numerous examples of the applicability of mathematics discussed by scholars, beginning with the cases raised by Wigner in his 1960 paper. In some cases, mathematics is used “merely as a tool” to help us calculate the consequences of a law in a particular situation. More interesting is where mathematics plays a more “sovereign role” in the very formulation of the laws of physics, as mentioned above.
The first example is the case of planetary motion. One must see this from a historical perspective. The Greek geometer Apollonius lived prior to 200 BCE and studied the mathematical shapes that occur when a cone is intersected with a flat plane. Such intersections are called conic sections. He was able to describe circles, ellipses, parabolas, and hyperbolas using such methods. In his book, he stated about 400 theorems concerning the different properties of conic sections. Eighteen hundred years after Apollonius lived, Johannes Kepler was trying to figure out how to make sense of Copernicus’s radical new idea of having the sun at the center of the solar system with the planets going around it. He realized that the planets traveled around the sun in ellipses and not circles. Kepler was able to determine properties of planetary motion by reading a book that was almost two millennia old. A historian of science, William Whewell, wrote that “If the Greeks had not cultivated conic sections, Kepler could not have superseded Ptolemy” (Whewell 1837, p. vi). How is it possible that the abstract writings of an ancient Greek mathematician could help explain the motion of the planets? Along the same lines, the parabolas of Apollonius also helped Newton explain the motion of projectiles flying through space.
The law of freely falling bodies was established as a result of experiments which cannot be considered very accurate by human standards. Newton brought the case of freely falling bodies together with the motion of the moon. He realized that “the parabola of the thrown rock’s path on the earth and the circle of the moon’s path in the sky are particular cases of the same mathematical object of an ellipse” (Wigner 1960). On the basis of these approximate observations and numerical coincidences, he postulated the universal law of gravitation. This law is written in the language of mathematics, using the second derivative, and the law is known to be accurate to less than one ten thousandth of a percent.
Another example that Wigner discussed is the use of matrices in quantum mechanics. On the basis of experimental data from the hydrogen atom, Heisenberg came up with a form of basic quantum mechanics. When Max Born read Heisenberg’s paper, he realized that what Heisenberg had been working with was formally identical with the rules of computation for matrices, which had been established before by mathematicians. Born, Jordan, and Heisenberg then proposed to replace the position and momentum variables of the equations of classical mechanics with matrices. They applied the rules of matrix mechanics to a few highly idealized problems. At that time, there was no reason to think that one could apply these results to more realistic and complex cases such as the helium atom. The agreement of the calculations with experiments came as a pleasant surprise. The success of matrix mechanics is even more surprising given the fact that matrices were already known to mathematicians decades before the formulation of quantum mechanics (Wigner 1960).
There are, of course, many other examples of the applicability of mathematics that Wigner did not mention. Arithmetical operations such as addition and multiplication are an example of the basic use of mathematics in everyday life, rather than just physics. When there is a collection of x distinct objects and another collection of y other distinct objects, then when they are combined, there is a collection of x + y distinct objects. It is to be noted that the addition operation satisfies commutativity and associativity, that is:
x + y = y + x and (x + y) + z = x + (y + z)
which is also satisfied by grouping physical objects. In the case of multiplication, if there are x boxes, and each box contains y objects, then there is a total of x × y objects. Again, the rules of arithmetic, such as distributivity, that is:
x × (y + z) = (x × y) + (x × z)
are satisfied by collections of physical objects.
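These laws can be checked directly by modeling collections of physical objects in a few lines of code. The following is a minimal sketch, in which the object names (apples, pears, plums) are invented purely for illustration and collections are represented as multisets:

```python
from collections import Counter

# Model a "collection of distinct physical objects" as a multiset of labels.
apples = Counter({"apple": 3})
pears = Counter({"pear": 2})
plums = Counter({"plum": 4})

# Combining collections corresponds to addition: 3 + 2 = 5 objects in total.
combined = apples + pears
assert sum(combined.values()) == 3 + 2

# Commutativity and associativity hold for physical grouping as well.
assert apples + pears == pears + apples
assert (apples + pears) + plums == apples + (pears + plums)

# Multiplication: x boxes, each holding y objects, give x * y objects in all.
boxes, per_box = 4, 5
total = sum(per_box for _ in range(boxes))  # count box by box
assert total == boxes * per_box

# Distributivity: x * (y + z) = (x * y) + (x * z).
y, z = 2, 3
assert boxes * (y + z) == (boxes * y) + (boxes * z)
```

That every assertion passes is, of course, exactly the point of the example: the formal rules of arithmetic and the behavior of grouped physical objects agree.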
A more sophisticated use of mathematics is the use of differential and integral calculus to describe accelerated motion. Isaac Newton and Gottfried Leibniz invented independent versions of calculus so that they could describe the way physical objects interact. To solve problems in physics, one needs a fairly advanced knowledge of calculus. Calculus is used to calculate the times and speeds of balls rolling down ramps and of cannonballs tracing parabolas through the air. This extends to multidimensional vector calculus, which is applied to all types of physical phenomena. Differential equations permit the physicist to calculate many more physical phenomena.
Basic arithmetic and its applications arose together. Differential calculus was invented by Newton and Leibniz precisely in order to understand real-world physics. The applicability problem becomes much more interesting when the theoretical mathematics is formulated first, before its applications, as in the case of matrices and complex numbers. Algebraists in the 16th century came up with imaginary quantities in order to solve cubic equations; these strange and “nonsensical” quantities, as they were called, later found their way into the most fundamental equations of physics, in electromagnetism and quantum mechanics (Islami 2018).
The two major revolutionary theories of the twentieth century, relativity theory and quantum theory, are replete with examples of the “unreasonable effectiveness of mathematics,” as this phenomenon has come to be known. When Albert Einstein was interested in describing the curved spacetime which is central to the general theory of relativity, a friend, Marcel Grossmann, introduced him to non-Euclidean geometries (Pais, 1982, p. 22). Non-Euclidean geometries were introduced decades earlier as ways to replace the fifth postulate of Euclidean geometry—the parallel postulate—with its negation. It seems strange that an intellectual curiosity about the fifth postulate of geometry proves to be so helpful in understanding the universe at large.
Major aspects of quantum theory are described by previously existing mathematical structures and theorems: (1) As discussed, imaginary quantities were first introduced in the 16th century but nevertheless played a central role in quantum theory. (2) Werner Heisenberg used matrices to describe quantum phenomena, even though matrices were known to mathematicians well before quantum mechanics was known to physicists (Sudbery 1986).
Algebraic topology was originally formed to classify and categorize topological spaces. In the early 21st century, the tools of algebraic topology were used in the study of string theory. This relationship is so strong that some physicists who disparage string theory have taken it to be “just math” (Smolin 2006).
One of the most useful mathematical structures in all of physics is Évariste Galois’s notion of a group. The structure was invented at the beginning of the 19th century to describe the symmetries of polynomial equations. It was soon realized that this notion can describe any type of symmetry, and it has become central to all of physics, especially throughout quantum mechanics and specifically in particle physics. What was once taught only in advanced classes on obscure polynomial equations is now a prerequisite for much of theoretical physics. The physicist Hermann Weyl wrote the following about the discovery of the notion of a group:
Galois’ ideas, […] are contained in a farewell letter written to a friend on the eve of his death, which he met in a silly duel at the age of twenty-one. This letter, if judged by the novelty and profundity of ideas it contains, is perhaps the most substantial piece of writing in the whole literature of mankind. (Weyl, 1952, p. 138)
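The group concept itself can be made concrete in a few lines of code. Below, the rotational symmetries of an equilateral triangle are represented as permutations of its three vertices, and the defining group properties (closure, identity, inverses) are checked directly. This is a minimal sketch; the vertex labeling and variable names are arbitrary choices made for illustration:

```python
from itertools import product

# Represent each symmetry as a permutation of vertex indices (0, 1, 2):
# perm[i] tells where vertex i is sent.
identity = (0, 1, 2)
rot120 = (1, 2, 0)   # rotation by 120 degrees
rot240 = (2, 0, 1)   # rotation by 240 degrees
rotations = {identity, rot120, rot240}

def compose(p, q):
    """Perform symmetry q first, then p."""
    return tuple(p[q[i]] for i in range(3))

# Closure: composing any two rotations yields another rotation in the set.
assert all(compose(p, q) in rotations for p, q in product(rotations, repeat=2))

# Inverses: every rotation can be undone by some rotation in the set.
assert compose(rot120, rot240) == identity
assert all(any(compose(p, q) == identity for q in rotations) for p in rotations)
```

The same small interface (a set of elements plus a composition rule obeying these axioms) is what lets group theory describe symmetries of polynomial roots, crystals, and elementary particles alike.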
This list can be extended indefinitely. The main point is that in these cases, mathematics, invented for many different reasons, is extremely applicable to physical phenomena. Why should this be?
2. The Applicability Problem as a Pseudo-Problem
a. The Applicability Problem and Other Sciences
In response to the applicability problem, some scholars have pointed out the limitations of the use of mathematics in other sciences, which makes them convinced that Wigner’s sense of mystery is at best unjustified. Velupillai argues for the unreasonable ineffectiveness of mathematics in economics (Velupillai 2005). Longo and Montévil argue for the reasonable ineffectiveness of mathematics in biology (Longo and Montévil 2016). The mathematician Israel M. Gelfand is often quoted as saying:
There is only one thing which is more unreasonable than the unreasonable effectiveness of mathematics in physics, and this is the unreasonable ineffectiveness of mathematics in biology.
Hamming argues for the reasonable and limited effectiveness of mathematics in everyday life and engineering (Hamming 1980, Islami 2022). The ineffectiveness is even more pronounced in sociology, psychology, and anthropology.
With these limitations in mind, the problem of the unreasonable effectiveness of mathematics is partially answered. If Velupillai, Longo, Gelfand and other commentators are right, then mathematics is not so effective in the natural sciences, broadly construed. Rather, it is mostly effective in modern theoretical physics. Moreover, when discussing physics, there are also some limitations to the applicability of mathematics.
b. Ineffectiveness of Mathematics in Physics
Perhaps the effectiveness of mathematics in other sciences and in physical phenomena is too much to ask for. Wigner claimed to address the unreasonable effectiveness of mathematics in the “natural sciences.” However, the title is a bit misleading (Islami 2017). The body of the paper is entirely focused on the applicability of mathematics in modern physics. And even there, mathematics fails to be perfectly suited.
There are numerous examples of such limitations (for example, Kline 1980), but this article discusses only two: the three-body problem in classical physics and the measurement problem in quantum mechanics.
In a sense, this section contrasts with Section 1c. There, the article listed many examples of physical phenomena that are described by mathematics. The following are a few failure cases of the use of mathematics in physics.
Starting with classical mechanics, take the simple idea from Isaac Newton that two objects with masses m and m’ which are separated by a distance of r are attracted to each other by a force given by
F = Gmm’/r².
This mathematical law will give the correct answer only under perfect conditions. This means the objects are measured from their exact centers; the objects are perfectly spherical and exactly homogeneous; neither object has any electrical charge; they are far away from any third object that would influence them; they are large enough so that no quantum mechanical phenomena are active, small enough so that no general relativistic phenomena are active; and nothing else interferes. These requirements severely restrict the number of phenomena described by the rule. In order to use this rule, physicists make many idealizations which are not really true. In short, this mathematical formula that is at the center of classical mechanics is not very applicable.
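As a concrete illustration, the law can be evaluated for the Earth-Moon pair. The constants below are rounded textbook values, and treating each body as a point mass embodies exactly the idealizations just listed:

```python
# Newton's law of universal gravitation: F = G * m * m' / r**2.
# Rounded illustrative values; real applications need the idealizations above.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
m_earth = 5.972e24   # mass of the Earth, kg
m_moon = 7.342e22    # mass of the Moon, kg
r = 3.844e8          # mean Earth-Moon distance, m

def gravitational_force(m1: float, m2: float, distance: float) -> float:
    """Attractive force (in newtons) between two idealized point masses."""
    return G * m1 * m2 / distance**2

F = gravitational_force(m_earth, m_moon, r)
print(f"Earth-Moon force ≈ {F:.3e} N")  # on the order of 2e20 N
```

The formula itself is a one-liner; the difficulty lies entirely in the conditions under which that one line is an accurate description of nature.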
This is evident from examining the requirement that there is no third object near the two objects. While Isaac Newton gave us a powerful mathematical formula to describe the attractive force between two bodies, mathematics has not provided a simple formula for three bodies. How do three objects interact with each other? This is called the three-body problem, and it has no general closed-form solution. The three-body problem is an early, though long unrecognized, example of chaos. Chaos theory is replete with physical problems for which mathematics is of no help (Diacu and Holmes 1996). A double pendulum is one of the simplest examples of a chaotic system: a string connects to a body, which connects to another string and another body. Those two bodies and the Earth combine to form three bodies. While its motion is totally deterministic, it is not predictable. In contrast, a single pendulum is totally predictable. The Moon, Sun, and Earth form a three-body problem. The unpredictability of this three-body problem manifests itself in the fact that there is no exact length of a lunar month. On average it is about 29.5 days, but there is no way to predict exactly how long each month will be (Peale 2023).
One can go on and ask about more than three bodies. The more complicated n-body problem is, of course, also unsolvable in this sense. There are no simple formulas to tell us how three particles interact, and there are certainly no simple formulas to tell us about the interactions within the entire universe.
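The sense in which a system can be fully deterministic yet unpredictable can be demonstrated with a toy example far simpler than the double pendulum or the three-body problem: the logistic map, a standard textbook example of chaos. It is not one of the cases discussed above, and the parameter and initial values here are chosen only for illustration:

```python
def logistic_trajectory(x0: float, r: float = 4.0, steps: int = 60) -> list:
    """Iterate the fully deterministic rule x -> r * x * (1 - x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-9)  # a one-billionth nudge to the start

# The rule has no randomness at all, yet the tiny initial difference grows
# until the two trajectories bear no resemblance to each other.
gaps = [abs(x - y) for x, y in zip(a, b)]
print(f"gap after 5 steps:  {gaps[5]:.2e}")
print(f"gap after 50 steps: {gaps[50]:.2e}")
```

Early on, the gap is microscopic; by around step 30 it has grown to the size of the whole interval. This sensitive dependence on initial conditions, not any failure of determinism, is what makes chaotic systems like the double pendulum unpredictable in practice.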
Along the same lines, a large part of the work of physicists consists of working with differential equations. All types of problems in physics are described by differential equations. However, exact solutions to the vast majority of differential equations are unavailable. One might be able to describe a problem with mathematics, but not to solve it.
The measurement problem is another part of physics for which mathematics is of no help. Quantum mechanics is considered the most successful theory ever formulated. It is proclaimed to be correct up to the proverbial fourteenth decimal place. But there is a nagging lacuna concerning the applicability of mathematics to quantum mechanics. When a measurement is made in quantum mechanics, the Born rule tells us that the results are given probabilistically and not deterministically. In other words, quantum mechanics only offers a probability distribution rather than an exact answer. The theory that describes every interaction besides gravity is probabilistic and not deterministic. There are no mathematical formulas to tell us when a particular quantum state will actually collapse or, say, when a particular nucleus will undergo fission. One can defend mathematics by saying that the inherent nature of the physical universe is probabilistic, and not deterministic. Therefore, it simply is not a problem that mathematics fails to predict the future.
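The probabilistic character of the Born rule can be sketched numerically. For a hypothetical qubit with amplitudes 0.6 and 0.8 (chosen here only for illustration), the theory supplies nothing beyond the outcome probabilities |0.6|² = 0.36 and |0.8|² = 0.64; repeated simulated measurements merely approximate those frequencies, and no formula picks out any individual result:

```python
import random

random.seed(0)  # fixed seed so the simulation is repeatable

# A hypothetical qubit state with amplitudes c0, c1 (|c0|^2 + |c1|^2 = 1).
c0, c1 = 0.6, 0.8
born_probs = [abs(c0) ** 2, abs(c1) ** 2]  # [0.36, 0.64]

# Simulate repeated measurements: each run yields outcome 0 or 1 at random,
# with long-run frequencies governed by the Born rule. The mathematics says
# nothing about which outcome any single measurement will produce.
n = 10_000
outcomes = random.choices([0, 1], weights=born_probs, k=n)
freq0 = outcomes.count(0) / n
print(f"observed frequency of outcome 0: {freq0:.3f} (Born rule: 0.36)")
```

The observed frequency converges on 0.36 as n grows, which is precisely all the theory predicts: a distribution, never an individual result.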
Along the same lines, the fields of thermodynamics and statistical mechanics also only give probabilistic results. The universe is far more complicated than the mathematics of the early 21st century can handle, so there are no known ways to solve such problems exactly.
Sabine Hossenfelder has an interesting argument about the role of mathematics in physics. She says that sometimes mathematics leads physics astray (Hossenfelder, 2018). There are other philosophers such as Nancy Cartwright (Cartwright 1983) and Mark Wilson (Wilson 2000) who have critiqued the role of mathematics in applied physics.
Given these examples of where mathematics does not easily apply to physics, Wigner’s problem does not seem so mysterious. In vast parts of physics, mathematics is, in fact, not so helpful. The mystery arises only if one focuses on those parts of physics in which mathematics helps.
3. Ontological Solutions to the Problem
The article has outlined the limitations of mathematics in other sciences and even in physics. With those limitations in mind, the applicability problem becomes more nuanced. Why is it that mathematics is so well-suited for the parts of physics where it is effective? In other words, why does mathematics work where it does work? Moreover, mathematics seems to be effective in the most fundamental parts of physics. The standard model of particle physics and basic general relativity are described by mathematical structures. The nuanced question of the applicability of mathematics is: why is mathematics so effective in the most fundamental theories of physics?
The next few sections describe some of the solutions given to this more subtle and focused question. The list is not by any means exhaustive.
a. Classical Pythagoreanism and Platonism
One of the first schools of thought to deal with the relationship between mathematics and the physical world, in the Western tradition, was that of the Pythagoreans, who were active from the 6th century BCE. They believed that numbers were somehow imbued with the divine. These numbers create physical objects and express the harmony of existence. The Pythagoreans even incorporated aspects of aesthetics into their philosophy by stating relationships between the lengths of strings and the sounds they produce. Western musical scales come from these ideas.
For the Pythagoreans, the applicability of mathematics in the natural sciences is simple to explain. Since the universe is made out of numbers, it makes sense that the physical world is governed by mathematics. They believed that with pure thought about abstract numbers, one can come to understand the physical world. This unseen, pure realm cannot be accessed by the physical senses. The idea of such a pure realm came to have a deep and profound influence on Western philosophy and religion.
Platonism took the Pythagorean idea of the pure realm as its central dogma. To Platonists, the physical world was no longer imbued with the divine. Rather, the physical universe was simply a bad imitation of the perfect realm. This realm contains all numbers and all the relationships between numbers. In fact, it contains all of mathematics that ever existed and ever will exist. This Platonic heaven also contains all true physical laws that ever existed and perfect notions of truth, justice, and beauty. Their main point was that the physical world is unimportant and that the only true existence is the Platonic realm, accessed through the human mind.
Galileo perhaps provided the most famous “solution” to the applicability problem, following a version of Platonism. According to Galileo, the universe is written in a mathematical language so it should not be a surprise that empirical investigation of the universe (in physics for instance) is filled with mathematical statements:
Philosophy is written in that great book which continually lies open before us (I mean the Universe). But one cannot understand this book until one has learned to understand the language and to know the letters in which it is written. It is written in the language of mathematics, and the letters are triangles, circles and other geometric figures. Without these means it is impossible for mankind to understand a single word; without these means there is only vain stumbling in a dark labyrinth (Galileo 1623, p. 171).
According to Galileo, while the universe is fundamentally mathematical, the senses mask the mathematical essence of the universe. So, he divided the qualities of objects into two categories: primary and secondary. The primary qualities, such as size and shape, are mathematical, while the secondary qualities, such as color, taste, and scent, are not. But one should not be deceived by the secondary qualities into forgetting the mathematical essence of the universe. In Galileo’s own words:
Whenever I conceive any … corporeal substance, I immediately … think of it as … having this or that shape; as being large or small … and in some specific place at any given time; as being in motion or at rest; as touching or not touching some other body; and as being one in number, or few, or many. From these conditions I cannot separate such a substance by any stretch of my imagination. But that it must be white or red, bitter or sweet, noisy or silent, and of sweet or foul odor, my mind does not feel compelled … Without the senses … reason … would probably never arrive at qualities like these. Hence I think that tastes, odors, colors, and so on are no more than mere names so far as the object in which we place them is concerned, and that they reside only in the consciousness. Hence if the living creature were removed, all these qualities would be … annihilated (Galilei 1957, p.274).
One can think of Galileo as an advocate of Platonism, according to which the true reality is perfect and mathematical, but sensory experience gives us an imperfect image of this otherwise perfect reality. The same idea is almost ubiquitous in anti-Aristotelian natural philosophy, and it is formulated by Bacon (1620) in Novum Organon and by Descartes in The World (ca.1630) and Principles (1644). Descartes surely took it from Galileo.
b. Modern Platonism
Mathematical Platonists posit that mathematical entities exist in an independent Platonic realm. According to this view, mathematical objects are abstract and independent from thoughts, language and practices. One motivation behind adopting Platonism is the “Quine-Putnam indispensability argument” according to which, since mathematics is indispensable in physics, a commitment to both the existence of mathematical and physical entities is required. In other words, a realist commitment to the existence of objects described by theories of physics leads to a realist commitment to the existence of mathematical objects. See, for example, (Colyvan 2001), (Maddy 1990), and (Balaguer 1998). For a critical examination of the indispensability argument, see Chapter 8 of (Bueno and French 2018). See also (Soto 2019) and (Pincock 2007).
One of the greatest mathematical physicists of the 20th century, Roger Penrose, adopts a modern version of Platonism. According to him, there exist three worlds: a Platonic world, a physical world, and a mental world. While a typical Platonist has to explain how a small part of the Platonic world controls the physical world, Penrose additionally has to deal with how a small part of the physical world creates a sentient person with a mental world. He also has to explain how a small part of the mental world can create or perceive the Platonic world. Penrose admits that each of these connections is somewhat mysterious, but he defends the entire system as being the correct way to understand the universe (Penrose 1994, s.8.7).
The applicability problem for the Platonist is solved by positing that the perfect laws of physics are stated by perfect mathematics which all exist in the Platonic realm. While it is important to perform experiments in the physical universe, one must be well versed in the pure mathematics of the Platonic realm to be able to understand the physical universe.
While Platonists address the applicability problem by saying that the timeless physical laws are perfectly expressed by the timeless mathematical ideas contained in the same realm, it remains unclear how the two relate to each other. Moreover, many questions are left unanswered. How does one access this Platonic realm? Why is it that the regularities in the world have a mathematical formulation? What about a mathematical theory that was shown to be false or not always true? Is such a theory contained in the Platonic realm? See (Field 1980) and (Field 1989).
c. Modern Pythagoreanism
Some offer a version of Pythagoreanism as a solution to the applicability problem. The idea is that the universe is a mathematical structure and that people live in a mathematical universe. Viewed in this way, it could not be otherwise than that all laws of physics are written in the language of mathematics. Hence, there is no mystery about the applicability of mathematics in physics.
Several contemporary scientists advocate a version of Pythagoreanism. The cosmologist Max Tegmark notes that as science progresses, we learn that more and more aspects of the physical universe are described by mathematical structures. Even subatomic particles, which compose all physical matter, are described by irreducible representations of certain symmetry groups. Tegmark, however, takes this one step further. He posits that physical reality is not merely described by mathematical structures; physical reality is an all-encompassing mathematical structure, a claim he calls the mathematical universe hypothesis. In other words, subatomic particles are not described by irreducible representations of symmetry groups; they are irreducible representations of groups. The only entities that exist are mathematical. Even observers of the physical universe are simply complex mathematical structures. All of the trees, frogs, and tables that can be seen are an illusion: they are simply one way that one mathematical structure perceives another (Tegmark 2008, 2014).
From Tegmark’s point of view, the applicability of mathematics to physics is reasonable. Mathematics is applicable to the physical world because the physical world is simply mathematical structures that unfold in mathematical ways. Tegmark writes of his form of Pythagoreanism that “it explains the utility of mathematics for describing the physical world as a natural consequence of the fact that the latter is a mathematical structure, and we are simply uncovering this bit by bit” (Tegmark 2008).
More radical than saying that the entire universe is made of mathematical structures is saying that the entire multiverse is described by a single mathematical structure. Sean Carroll describes the modern form of Hugh Everett’s multiverse theory in this way (Carroll 2019). Rather than a new universe popping into existence every time a quantum measurement is made, he describes one giant quantum wave function of the multiverse. This single wave function describes the universe from the beginning of time to the end of time. When a measurement is made, the wave function does not collapse; rather, different parts of it branch off from one another. This theory, like all theories of the multiverse, strains the imagination, and it is important to note that no scientific experiment can show it to be false. Nevertheless, the existence of a single mathematical structure, the wave function, would explain why mathematics is so useful in describing physical phenomena.
4. Epistemic Solutions to the Problem
Whereas ontological solutions make claims about the kind of world there is, epistemic solutions focus on the kind of cognition and knowledge humans have in relation to physics and mathematics.
a. Transcendental Solutions
According to Kant’s transcendental idealism, cognition results from the joint work of the human faculties of sensibility and understanding. Sensibility is a passive or receptive capacity to be affected by objects of the world, while understanding organizes the received data into coherent concepts and judgments. For Kant, objects can be given to us only through sensibility.
In Kant’s view, mathematics, particularly geometry and arithmetic, is grounded in the a priori intuitions of space and time. These intuitions provide the necessary framework for organizing sensory experience, allowing us to cognize the world. Thus, mathematical concepts like geometric shapes and numerical relationships are not merely reflections of empirical reality but are fundamental to a mode of cognition.
Given this conception of mathematics and physics, the applicability of mathematics to modern physics is possible because both disciplines share a common foundation in the structure of human cognition. On Kant’s transcendental solution, mathematics is not an arbitrary tool imposed onto the world but a study of those aspects of human cognition that enable us to grasp the order and regularity of the universe. See (Ryckman 2023) and (Torretti 1990).
Scientists, by employing mathematical principles, extend the reach of understanding beyond immediate sensory experience, enabling them to uncover regularities of nature.
b. Laws of Physics and Invariance Principles
A good starting point for an epistemic solution to the applicability problem is to look at how mathematics is used in modern theoretical physics. In other words, one can ask: what is modern physics, and how can it be distinguished from Aristotelian physics, which was not mathematical?
Modern physics focuses on regularities that exist among the phenomena and their mathematical formulation. Since the world is a very complex place, some physicists have long given up the ideal of “explaining” nature. The great success of modern physics, as Wigner reminded us, is in restricting its objective. Nature, as Roman puts it, just is. “It cannot be ‘explained’, at least not as far as science is concerned. Existence is a primary category, including, by the way, ourselves, too” (Roman 2004, p. 2).
Modern physicists focus on the discovery of regularities in the behavior of inanimate objects. What has made the discovery of these regularities and their mathematical formulation possible is a division that physicists draw between initial conditions and regularities. Regularities govern the time evolution of a system, whereas initial conditions specify the state of the system at the outset. While the regularities can be formulated using mathematics, initial conditions belong to the domain of the accidental and unordered. Wigner wrote:
Man has, therefore, devised an artifice that permits the complicated nature of the world to be blamed on something which is called accidental and thus permits him to abstract a domain in which simple laws can be found. The complications are called initial conditions; the domain of regularities, laws of nature. Unnatural as this division of the world structure may appear from a detached point of view, and probable though it is that the possibility of such a division has its own limits, the underlying abstraction is probably one of the most fruitful ones that the human mind has made (Wigner 1949, p. 3).
This division is a condition for the possibility of formulating mathematical laws of nature. It is a condition for the possibility of modern mathematical physics. The contemporary physicist, Steven Weinberg, echoed Wigner’s remarks:
In modern quantum mechanics as well as in Newtonian mechanics there is a clear separation between the conditions that tell us the initial state of a system (whether the system is the whole universe, or just a part of it), and the laws that govern its subsequent evolution. (Weinberg 1992, p. 34)
Similarly, Frank Wilczek wrote:
Indeed, classical physics teaches us that the size of planetary orbits is not the sort of thing we should aspire to predict. It makes a sharp distinction between the basic laws, which govern the evolution of systems in time, and are expected to be simple, and the initial conditions, which must be given from outside. The equations of classical physics can be applied to any number of different types of solar system, having different sizes and shapes. There is nothing in Newton’s laws of gravity and mechanics, nor for that matter in the other pillar of classical physics, Maxwell’s electrodynamics, that could serve to fix a definite size. (Wilczek 1999, p. 303)
What, in turn, makes possible the division between regularities and initial conditions are invariance or symmetry principles. Invariance with respect to transformations (shifts) in space and time is one example of an invariance principle. Without such invariances, one could not assume that the result of an experiment done in a particular lab, at a particular time, holds universally. Were regularities to change from one place to another, or from one time to the next, it would be impossible to test the accuracy of laws.
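Translation invariance can be stated schematically (the notation below is generic and illustrative, not taken from the source): if φ describes a physical process, then

```latex
\varphi(x,\,t) \text{ satisfies the law}
\;\Longleftrightarrow\;
\varphi(x - a,\; t - \tau) \text{ satisfies the law,}
\qquad \text{for every spatial shift } a \text{ and time shift } \tau .
```

This is what licenses treating a result obtained in one laboratory at one time as holding everywhere and always.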
Invariance principles can be understood as laws of the laws of nature: regularities that exist among the laws themselves. They are conditions for the possibility of modern physics. In a nutshell, the invariance principles give laws their universality, a feature that makes their mathematical formulation possible. See Chapter 9 of (Bueno and French 2018), Chapter 1 of (Pincock 2012), and (Soto and Bueno 2019).
Noticing this particular structure of modern physics helps with solving the applicability problem by answering the question of what it is about modern physics that makes a mathematical formulation of its laws possible. Hermann Weyl argued that mathematics is the science concerned with the study of the most general relations that exist among objects, and that the invariance principles of physics call for the use of precisely such general relations (Islami 2017). Once this has been made clear, one can look at the co-constitution of mathematics and physics and their evolving, dynamic relationship (Islami and Wiltsche 2020).
Moreover, the influence runs in both directions: mathematics has shaped the constitution of physical concepts and relations, and physics in turn has changed mathematics drastically (see Levy-Leblond 1992; Lützen 2011; Gingras 2001). This connects closely to the point about the physical origins of mathematics.
c. Physical Origins of Mathematical Concepts
An epistemic solution to the applicability problem is that mathematics was formulated by observing the physical universe. Human beings constructed mathematical notions by seeing them at work in the physical world: people learn addition by looking at the way collections of objects are combined, calculus by looking at accelerating bodies, and symmetries by seeing physical transformations of objects (such ideas can be found in many of the papers in the Mickens 1990 collection on the unreasonable effectiveness of mathematics).
The mathematician Richard Hamming, in an article published twenty years after Wigner’s 1960 contribution, presents what he considered to be partial explanations for the unreasonable effectiveness of mathematics:
(1) People see what they look for. After all, no one should be surprised to find the whole world bluish when wearing blue glasses.
(2) People select the kind of mathematics to use. That is, when the mathematical tools that are chosen are not adequate in a particular case, people choose a different kind, such as vectors instead of scalars.
(3) Science in fact answers comparatively few problems. It of course does not say much about long-asked questions about God, Justice, Beauty, Truth and so on.
(4) The evolution of man has provided the model. Humans can think about objects, especially macroscopic ones, an ability that surely has had survival value (Hamming 1980).
In general, Hamming’s claim is that the number system that has been developed through abstraction, generalization, increased simplicity and aesthetic reasons (all internal to mathematics) provides us with a system that “is unreasonably effective even in mathematics itself.” Hamming asks, how is it that one can apply numbers to the concepts of physical objects? More precisely, what is it about the world that makes the abstraction of numbers possible?
His response is that the universe is constituted in such a way that abstractions of this sort are possible. Were the bodies that one attempts to count not reasonably stable, one would not be able to abstract numbers and arithmetic would lose its application. That is, objects that are important in everyday life and in science, to which arithmetic applies, remain the same under a group of translations in space and time.
The number of coins in my pocket, as Steiner pointed out, remains the same no matter where I am or what time of the day, month, or year it is (Steiner 1998). The number of coins remains the same, regardless of whether I put some nuts in my pocket or take out my keys. All countable objects “stay constant” for long enough to be counted.
What makes addition useful is that gathering or assembling objects preserves their stable existence. The operation of addition is also invariant under translations in time and space. The case of multiplication is similar. Given those facts, Hamming claims there is nothing unexpected and mysterious about the usefulness of counting, addition and multiplication. One can easily come up with requirements that a quantity needs to satisfy in order for it to have an “additive structure,” such that the operation of addition can be used with respect to it.
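Hamming’s point can be put concretely: counting is useful because cardinality is invariant under the transformations that everyday objects typically undergo. A minimal sketch (the coin example and labels are illustrative, not from the source):

```python
import random

# A collection of labeled objects ("coins in a pocket").
coins = ["penny", "nickel", "dime", "quarter", "quarter"]

# Counting is invariant under rearrangement (a permutation of the
# collection): random.sample with the full length returns a shuffle.
shuffled = random.sample(coins, len(coins))
assert len(shuffled) == len(coins)

# Counting is also invariant under a uniform "translation": relabeling
# every object (e.g., moving each coin to a new pocket) is a bijection,
# so the count is preserved.
moved = [coin + " (new pocket)" for coin in coins]
assert len(moved) == len(coins)
```

Were objects not stable under such transformations, as Hamming observes, the abstraction of number would fail and arithmetic would lose its application.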
d. Mathematics as the Science of Symmetry
Yanofsky and Zelcer (2016) and Yanofsky (2016) argue for a novel solution to the applicability problem that uses symmetry in a fundamental way. In order to understand this, one must go back and understand how symmetry works in modern physics. With that in hand, Yanofsky and Zelcer’s definition of mathematics can be understood. Finally, symmetry can be understood as the unifying factor in the relationship between physics and mathematics.
One must first understand a revolution that Albert Einstein started with his formulation of special relativity, which brought to light the fundamental role of symmetries in physics. Prior to Einstein’s work, symmetries were understood to be mere consequences of the laws of physics; for example, Galilean relativity was seen as a consequence of the laws of motion in Newton’s Principia. With Einstein’s work, symmetries came to be the cornerstone of the laws of physics and conditions that each law had to meet.
The real importance of symmetry came when Einstein formulated the laws of special relativity. Before him, one first found a law of nature and then found its symmetries. Einstein, in contrast, used the symmetries to discover the laws. In order to find the laws of special relativity, he posited that the laws must be the same for all uniformly moving observers, whether at rest or moving close to the speed of light. Given this presupposition, Einstein went on to formulate the equations that describe special relativity. This was revolutionary: Einstein had realized that symmetries are the defining characteristics of laws of physics. In summary, before Einstein, physicists would say “a physical law satisfies symmetries”; after Einstein, physicists said “whatever satisfies symmetries is a physical law.” See, for example, (Icke 1995) or (Schumm 2004).
In 1918, Emmy Noether showed that symmetry is even more central to physics. She proved a celebrated theorem connecting symmetry to the conservation laws that permeate physics. The theorem states that for every symmetry of a certain type there exists a conservation law, and vice versa. For example, the fact that the laws of physics are invariant with respect to translations in space corresponds to the conservation of linear momentum. Invariance under translations in time corresponds to the conservation of energy. Rotational invariance corresponds to the conservation of angular momentum. Equipped with the understanding given by Einstein and Noether of the centrality of symmetry, physicists have been searching for new and different types of symmetries in order to find new laws of physics.
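Noether’s first theorem can be stated schematically in the standard Lagrangian form (generic textbook notation, not taken from the source):

```latex
\text{If the action } S = \int L(q, \dot{q}, t)\, dt
\text{ is invariant under a continuous transformation } q \mapsto q + \epsilon\, \delta q,
\text{ then } Q = \frac{\partial L}{\partial \dot{q}}\, \delta q
\text{ satisfies } \frac{dQ}{dt} = 0
\text{ along solutions of the equations of motion.}
```

Taking δq to be a constant spatial shift recovers conservation of momentum; taking the transformation to be a time shift recovers conservation of energy.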
After the work of Einstein and Noether, much of theoretical physics became about finding symmetries. As the Nobel prize winning physicist David Gross wrote, “in the latter half of the 20th century symmetry has been the dominant concept in the exploration and formulation of the fundamental laws of physics” (Gross 1996). This trend is particularly strong in particle physics and the standard model. The symmetries are represented by certain mathematical groups, and one studies the particles by studying the representations of those groups in vector spaces.
The laws of physics have another symmetry that is so obvious it has hardly been articulated. Every law of physics concerns a type of object. For example, some laws are about fermions; some laws are about strings; some laws are about all subatomic particles. Classical mechanics is about macroscopic objects that are not moving close to the speed of light. Special relativity concerns objects that may be moving close to the speed of light. General relativity deals with objects that might be accelerating. There are many laws that deal with ideal gases. For every law, the type of physical object that it concerns is called its “domain of applicability.” With such a class of objects, one can then formulate a “symmetry of applicability”: one can swap any element of the domain of applicability for any other element and, after making the corresponding changes, the law still gives the correct result. This is just another way of saying that the law applies to all the elements of its domain of applicability without exception.
Yanofsky and Zelcer take this revolutionary way of looking at physics to the world of mathematical structures. Their conception of mathematics is firmly in the nominalist camp. They pose and answer the following question: of all thoughts, how does one demarcate the mathematical ones? Just as physics is defined as those phenomena that satisfy certain symmetries, they define mathematics as those thoughts that satisfy certain types of symmetries.
When communicating mathematics, one makes statements, and these statements are about certain structures or ideas. When a mathematician says “let n be a natural number…,” “for every closed curve…,” or “the following is true for all quasi-triangular coassociative quantum groups…,” in each case the statement describes a “domain of discourse.” The variable can be any element of the domain of discourse.
A mathematical statement has a form of symmetry associated with it and its domain of discourse, which Yanofsky and Zelcer call “symmetry of semantics.” It states that the statement is true for one element of the domain of discourse if and only if it remains true after swapping that element for any other. In other words, the statement is invariant with respect to changes of the element of the domain of discourse. The logician will recognize this requirement as saying that mathematical statements are valid for all elements of the domain of discourse. However, by stating validity as a form of symmetry, Yanofsky and Zelcer are drawing an analogy with other forms of symmetry. In particular, symmetry of semantics is very similar to symmetry of applicability. Their main point is that, with symmetry of applicability and symmetry of semantics, both physical and mathematical reasoning is exact and does not permit counterexamples.
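Schematically, for a predicate φ and a domain of discourse D (generic logical notation, not from the source), the symmetry of semantics requires:

```latex
\forall\, a, b \in D:\quad \varphi(a) \;\leftrightarrow\; \varphi(b),
```

which, for any φ that holds of at least one element of D, is equivalent to the validity claim that φ holds of every element of D.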
In the context of the previous discussion, mathematics and physics are not universal truths that exist in Platonic realms, nor do people learn mathematics from the physical world. Rather, statements are selected by sentient beings as mathematical or physical only if they satisfy certain symmetry requirements. Only those statements that satisfy the relevant symmetry requirements are deemed mathematical statements; similarly, only those that satisfy the relevant symmetry requirements are deemed physical laws. It is for this reason that both mathematics and physics have an objective, timeless feel to them: they are chosen that way.
If one accepts these conceptions of physics and mathematics, then it seems one has a “solution” to the applicability problem. The reason that mathematics is applicable in physics is that the structures of mathematics and the laws of physics are defined in the same way. There are complex relationships between the domains of discourse of mathematical statements and the domains of applicability of physical laws. Physical laws and their concomitant mathematical statements count as such only if they satisfy related symmetries. This solution answers the question, central to Wigner’s problem, of why physics is mathematical.
e. Anthropocentric Solutions
Mark Steiner, in his influential book The Applicability of Mathematics as a Philosophical Problem (Steiner 1998), presents a version of rational theism as a “solution” to the applicability problem. According to Steiner, humans have a central place in the universe, and the universe is not indifferent to human goals. In other words, the universe was created for humans, and people live in a user-friendly universe. Since God has created this universe for humanity, it follows that using human-made mathematical concepts leads to understanding, prediction, and new discoveries.
Steiner describes the early years of quantum mechanics as a showcase for his reasoning in favor of the special place of humans in the world. According to Steiner, by the early 20th century physicists already knew that they could not deduce the laws of atomic physics from the old classical laws, nor could atomic phenomena be directly observed. Moreover, the difficulties that accompanied experiments of any sort at the atomic level left early-20th-century physics in a state of despair, with the prevalent view that “[a]tomic physics seemed reduced to blind guessing with an uncertain future” (Steiner 1998, p. 48). Under these circumstances, Steiner claims, physicists had to guess the laws, as guessing was the only option available. Since there were infinitely many possible guesses (so the probability of hitting on the right one by chance was close to zero), they needed a strategy for guessing, and this strategy, Steiner claims, was based on mathematical analogies. The use of mathematics in discovering the laws is, in Steiner’s view, a sufficient reason to reject naturalism and accept anthropocentrism.
5. References and Further Reading
Balaguer, Mark. Platonism and Anti-Platonism in Mathematics. Oxford University Press, 1998.
Bueno, Otávio, and French, Steven. Applying Mathematics: Immersion, Inference, Interpretation. Oxford University Press, 2018.
Buyse, Filip. “The Distinction between Primary Properties and Secondary Qualities in Galileo Galilei’s Natural Philosophy.” Cahiers du Séminaire Québécois en Philosophie Moderne / Working Papers of the Quebec Seminar in Early Modern Philosophy, vol. 1, 2015, pp. 20-43.
Carroll, Sean. Something Deeply Hidden: Quantum Worlds and the Emergence of Spacetime. Dutton, 2019.
Cartwright, Nancy. How the Laws of Physics Lie. Cambridge University Press, 1983.
Colyvan, Mark. The Indispensability of Mathematics. Oxford University Press, 2001.
Dirac, Paul A. M. “The Relation Between Mathematics and Physics.” Proceedings of the Royal Society of Edinburgh, vol. 59, part 2, 1939, pp. 124-126.
Einstein, Albert. “Geometry and Experience.” 1921. http://www.relativitycalculator.com/pdfs/einstein_geometry_and_experience_1921.pdf.
Field, Hartry. Science without Numbers: A Defense of Nominalism. Blackwell, 1980.
Field, Hartry. Realism, Mathematics, and Modality. Blackwell, 1989.
Diacu, Florin, and Holmes, Philip. Celestial Encounters: The Origins of Chaos and Stability. Princeton University Press, 1996.
Galilei, Galileo. “The Assayer.” 1623. In Discoveries and Opinions of Galileo, edited by Stillman Drake, Doubleday, 1957.
Gingras, Yves. “What did Mathematics do to Physics?” History of Science, vol. 39, 2001.
Gross, David J. “The Role of Symmetry in Fundamental Physics.” Proceedings of the National Academy of Sciences of the United States of America, vol. 93, 1996, pp. 14256-14259.
Hamming, Richard W. “The Unreasonable Effectiveness of Mathematics.” The American Mathematical Monthly, vol. 87, no. 2, 1980, pp. 81–90.
Hardy, G.H. A Mathematician’s Apology. Cambridge University Press, 1940.
Hilbert, David. Natur und mathematisches Erkennen: Vorlesungen, gehalten 1919–1920 in Göttingen, edited by Paul Bernays and David E. Rowe. Birkhäuser, 1992.
Hossenfelder, Sabine. Lost in Math: How Beauty Leads Physics Astray. Basic Books, 2018.
Icke, Vincent. The Force of Symmetry. Cambridge University Press, 1995.
Islami, Arezoo. “A Match not Made in Heaven: On the Applicability of Mathematics in Physics.” Synthese, vol. 194, 2017.
Islami, Arezoo. “A Complex Problem for Formalists.” IfCoLog Journal of Logics and their Applications (FLAP), vol. 5, no. 6, 2018, pp. 1337–1364.
Islami, Arezoo, and Wiltsche, Harald. “A Match Made on Earth: On the Applicability of Mathematics in Physics.” In Phenomenological Approaches to Physics, edited by H.A. Wiltsche and P. Berghofer, Synthese Library, 2020.
Islami, Arezoo. “The Unreasonable Effectiveness of Mathematics: From Hamming to Wigner and Back Again.” Foundations of Physics, vol. 52, no. 4, 2022, pp. 72.
Kline, Morris. Mathematics: The Loss of Certainty. Oxford University Press, 1980.
Levy-Leblond, Jean-Marc. “Why Does Physics Need Mathematics?” Studies in History, Philosophy, and Sociology of Science, vol. 146, 1992.
Longo, Giuseppe, and Maël Montévil. “Extended Criticality, Phase Spaces and Enablement in Biology.” Chaos, Solitons & Fractals, vol. 91, 2016, pp. 1–10.
Lützen, Jasper. “The Physical Origin of Physically Useful Mathematics.” Interdisciplinary Science Reviews, vol. 36, 2011, pp. 229-243.
Maddy, Penelope. Realism in Mathematics. Oxford University Press, 1990.
Mickens, Ronald E. Mathematics and Science. Teaneck, NJ: World Scientific, 1990.
Pais, Abraham. Subtle Is the Lord: The Science and the Life of Albert Einstein. Oxford University Press, 1982.
Peale, Stanton J. “Celestial Mechanics.” Encyclopedia Britannica, 30 Aug. 2023. https://www.britannica.com/science/celestial-mechanics-physics.
Penrose, Roger. Shadows of the Mind: A Search for the Missing Science of Consciousness. Oxford University Press, 1994.
Pincock, Christopher. “The Applicability of Mathematics.” Internet Encyclopedia of Philosophy. https://iep.utm.edu/math-app/.
Pincock, Christopher. “A Role for Mathematics in the Physical Sciences.” Noûs, vol. 41, no. 2, 2007.
Pincock, Christopher. Mathematics and Scientific Representation. Oxford University Press, 2012.
Roman, Paul. “Why Symmetry? Some Personal Reflections.” Symmetries in Science, vol. 11, 2004, pp. 1–12.
Ryckman, Thomas. “Effective Field Theories: A Case Study for Torretti’s Perspective on Kantian Objectivity.” In Current Debates in Philosophy of Science: In Honor of Roberto Torretti, edited by Cristián Soto, Springer Verlag, 2023, pp. 61-79.
Schumm, Bruce A. Deep Down Things: The Breathtaking Beauty of Particle Physics. Johns Hopkins University Press, 2004.
Shapiro, Stewart. “Mathematics and Reality.” Philosophy of Science, vol. 50, no. 4, 1983, pp. 523–548.
Smolin, Lee. The Trouble with Physics: The Rise of String Theory, the Fall of Science, and What Comes Next. Houghton Mifflin Company, 2006.
Soto, Cristian, and Bueno, Otávio. “A Framework for an Inferential Conception of Physical Laws.” Principia: An International Journal of Epistemology, vol. 23, no. 3, 2019, pp. 423–444.
Soto, Cristian. “The Epistemic Indispensability Argument.” Journal for General Philosophy of Science / Zeitschrift Für Allgemeine Wissenschaftstheorie, vol. 50, no. 1, 2019, pp. 145–161.
Steiner, Mark. The Applicability of Mathematics as a Philosophical Problem. Harvard University Press, 1998.
Steiner, Mark. “Mathematics—Application and Applicability.” In The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro, Oxford University Press, 2005.
Sudbery, Anthony. Quantum Mechanics and the Particles of Nature: An Outline for Mathematicians. Cambridge University Press, 1986.
Tegmark, Max. Our Mathematical Universe: My Quest for the Ultimate Nature of Reality. Alfred A. Knopf, 2014.
Tegmark, Max. “The Mathematical Universe.” Foundations of Physics, vol. 38, 2008, pp. 101–150.
Torretti, Roberto. Creative Understanding. University of Chicago Press, 1990.
Velupillai, K. Vela, editor. Computability, Complexity and Constructivity in Economic Analysis. Blackwell Publishing, 2005.
Weinberg, Steven. Dreams of a Final Theory: The Scientist’s Search for the Ultimate Laws of Nature. Pantheon Books, 1992.
Weyl, Hermann. Symmetry. Princeton University Press, 1952.
Whewell, William. History of the Inductive Sciences, vol. 2. John W. Parker, 1837.
Wigner, Eugene P. “Localized States for Elementary Systems.” Reviews of Modern Physics, vol. 21, no. 3, 1949, pp. 400–406.
Wigner, Eugene P. “The Unreasonable Effectiveness of Mathematics in the Natural Sciences.” Communications on Pure and Applied Mathematics, vol. 13, no. 1, 1960, pp. 1–14.
Wilczek, Frank. “Quantum Field Theory.” Reviews of Modern Physics, vol. 71, 1999, pp. S85–S95.
Wilson, Mark. “The Unreasonable Uncooperativeness of Mathematics in the Natural Sciences.” The Monist, vol. 83, no. 2, 2000, pp. 296–314.
Yanofsky, Noson S. “Why Mathematics Works So Well.” In Trick or Truth?: The Mysterious Connection Between Physics and Mathematics, edited by Anthony Aguirre, Brendan Foster, and Zeeya Merali, Springer, 2016. http://fqxi.org/community/essay/winners/2015.1#Yanofsky.
Yanofsky, Noson S., and Zelcer, Mark. “The Role of Symmetry in Mathematics.” Foundations of Science, 2016. http://arxiv.org/abs/1502.07803.
Zarepour, Mohammad Saleh. “Arabic and Islamic Philosophy of Mathematics,” Stanford Encyclopedia of Philosophy. 2022. https://plato.stanford.edu/entries/arabic-islamic-phil-math/.
Author Information
Arezoo Islami
Email: arezooi@sfsu.edu
San Francisco State University
U. S. A.
Neither Socrates nor Euthyphro questions the existence or validity of this covariance; both agree that the biconditional is true. What they disagree about is how to answer the following questions: Which side of the biconditional has priority? Which fact constitutes the other? Or, as Wright puts it, what is the “order of determination” here (Wright, 1992, 108-109)? The biconditional allows for two possible explanatory directions:
First: Because an action is pious, it is loved by the gods.
Second: Because an action is loved by the gods, it is pious.
Socrates and Euthyphro each endorse a different order of determination. Socrates holds that if an action is pious, it is not because it is loved by the gods; rather, its piousness depends on something else, such as the nature or characteristics of the action itself. Nonetheless, since the gods love pious actions, they love that action too. As previously emphasized, both agree on the truth of the biconditional, but for Socrates, the fact that an action is pious is not constituted by the gods’ responses, judgments, or attitudes. Thus, Socrates treats piousness as a primary quality, constituted independently of the gods’ responses.
Euthyphro, by contrast, believes that the piousness of an action essentially depends on the gods’ loving it. This aligns with the notion of a secondary quality, where the fact of an action having the quality is constituted by the gods’ responses. Euthyphro thus takes piousness to be a secondary quality. Wright refers to this type of disagreement about the order of determination as the “Euthyphro Contrast” (1992, 108, 37). In the dialogue, neither Socrates nor Euthyphro offers an account of why his preferred order of determination is correct; the purpose of Wright’s account is to provide such an explanation.
To generalize this contrast, replace “the gods” with a subject, S, “the action” with an object or event, x, “being pious” with an arbitrary quality, F, and the gods’ loving responses with the subject’s “judgment” to arrive at the following formulation of the biconditional:
Basic Equation: (x)(S): x is F ↔ S judges that x is F, under certain conditions.
This equation is what Wright calls the “Basic Equation” (1992, 108). It represents the covariance between the fact that x is F and the fact that S judges that x is F. Different examples can be generated by substituting “F” with various qualities, such as being red, having a square-shaped surface, or being salty. For instance, people often accurately see an object as red if it is red, and if an object is red, it typically looks red to them. This covariance between x’s being red and S’s judging that x is red is a fact that people generally agree on. The challenge lies in explaining this covariance by determining which side of the equation accounts for the other.
Before introducing Wright’s approach to this issue, it will be helpful to clarify some basic but important points that may otherwise cause confusion in the discussion that follows.
a. Response-Dependence: Concepts, Predicates, and Properties
Judgment-dependent accounts (hereafter “J-D accounts”) are a subclass of the more general category of “response-dependent” accounts (hereafter “R-D accounts”). The term “response-dependence” was coined by Mark Johnston in his influential paper “Dispositional Theories of Value” (1989). Wright takes the relevant responses in these accounts to be certain mental responses, namely, one’s judgments or opinions. This is because, in areas of particular philosophical interest, non-judgmental responses, such as “nausea, amusement, certain forms of aesthetic response, having things look red, or square to one, and so on” (Wright, 1992, 111), are not sufficiently rich to capture the complex interdependence between a quality and the subject’s responses. Addressing the Euthyphronic question (or fixing the order of determination) in these areas of discourse, including intention, meaning, morality, politics, and economics, requires a more fine-grained notion.
“Judgment” often simply means a belief or opinion formed in response to the world. However, Wright adopts a more sophisticated definition: “the response of judgement… [is the response] of endorsing what is affirmed by a tokening of an assertoric sentence of the discourse” (1992, 111). Here, judgments are framed within the philosophy of language. Consider the discourse of color. An assertoric sentence within this discourse might be “That apple is red.” When a speaker makes this assertion, what is affirmed is the content of the sentence, namely, that the apple is red. Thus, to judge that the apple is red is to endorse what the assertoric sentence expresses. Given this, what are these judgments about?
Philosophical discussions reveal a diversity of notions employed in building R-D accounts, most notably properties (or qualities), concepts, and predicates. For instance, John McDowell, one of the first philosophers who employed the notion of secondary qualities to address philosophical problems concerning values, uses the notion of property (more specifically, disposition) and offers a disposition-oriented reading of secondary qualities: “A secondary quality is a property the ascription of which to an object is not adequately understood except as true, if it is true, in virtue of the object’s disposition to present a certain sort of perceptual appearance” (McDowell, 1985, 111-112). Thus, if red is a secondary quality, “that rose is red” is a true ascription of color to the rose if it is disposed to look red to people under optimal conditions.
Johnston, who first formally developed a disposition-based R-D account known as the “response-dispositional” account (1998, 14), shifts his focus from concepts to properties in a notable way. Initially, he was concerned with response-dependent concepts and their connection to the concept of objects having dispositions to appear a certain way to normal observers. In 1989, he defined a response-dependent concept as one interdependent with concepts of subjects’ responses under certain conditions (see Johnston, 1989, 145). By 1992, he introduced the idea of “response-dispositional concepts,” stating that one who claims that the following equation, or biconditional, is a priori knowable to be true is asserting that the concept of red is a response-dispositional (that is, secondary) concept:
The property red = the standardly realized disposition to look red to standard perceivers under standard conditions. (Johnston, 1992, 230)
While Johnston’s emphasis was initially on concepts, he argues that his account could also demonstrate that “redness would also be a genuine property” (1989, 10), partly because the term “red” has a reference. By 1993 and fully by 1998, Johnston’s account had shifted to focus explicitly on “response-dependent properties.” He asserts that a property, being F, is response-dependent if the following biconditional can be made a priori knowable and necessary:
x is F if and only if x is disposed to produce x-directed response R in all actual and possible subjects, S, under conditions C. (1998, 9)
In contrast, Richard Holton argues that “response-dependence is a feature of concepts, not of properties” (Holton, 1991, 4) and later defines response-dependent concepts as “concepts which are connected, in an a priori way, with certain human responses” (1992, 183).
Frank Jackson (1996) offers a definition of redness similar to Johnston’s but with an emphasis on determining the reference of the response-dependent term “red”: “‘red’ denotes the property of an object putatively presented in visual experience when that object looks red” (Jackson, 1996, 200).
Philip Pettit, in his extensive work on the topic, employs all the notions of properties, concepts, and predicates while aiming to remain neutral about the metaphysical status of properties. He states:
The property that we fix upon, the property that provides the referent of our concept of redness, is that property whose instances evoke red sensations in normal observers under normal circumstances. (Pettit, 1991, 609)
For Pettit, the concern of an R-D account is not to determine what this property precisely is: “In maintaining that the concept is response-dependent… all that we may mean is that… the concept hooks on to that property, whichever it is, that evokes red sensations under normal conditions” (1991, 609).
In 2002, Jackson and Pettit jointly sought to build on Johnston’s R-D account, committing to a communitarian R-D view, according to which “an English term ‘T’ will be response-dependent just so far as it is a priori that ‘T’ serves to pick out a property (or whatever) in something if and only if that thing would seem T under certain independent, favourable specifications” (Jackson and Pettit, 2002, 101). This view aligns with Johnston’s earlier approach in treating concepts (and terms) as the primary bearers of response-dependence, yet it allows that, although a response-dependent term or concept can successfully refer to an objective property, neither this referential success nor even the existence of the property is guaranteed, because “observers may not prove to converge on any property of the kind envisaged, under conditions that count independently as favourable” (2002, 102). In this way, their view integrates elements from Johnston, Jackson, and Pettit. Why is there disagreement among philosophers on these notions?
Simon Blackburn clarifies where these disputes arise: “For some of us there is not much of a difference here: I myself see properties as the semantic shadows of predicates, not as self-standing objects of investigation. But others see the matter differently” (Blackburn, 1993, 262). For the philosophers discussed above, significant distinctions between properties, concepts, and predicates affect whether these terms can be used interchangeably in an R-D account. Properties or qualities are features of objects or events in the world, such as an apple’s being red or a table’s having a square-shaped surface. R-D accounts seek to determine whether the fact that an object has a particular property depends on facts about normal observers’ responses. If red objects are thought of as forming a specific set, the R-D account asks whether an object’s membership in that set is determined by one’s responses or independently of them.
Concepts are typically regarded as mental items employed in thought, while predicates are linguistic items or structures that appear in sentences expressing those thoughts. Since concepts have a specific extension, which only specific objects fall under, they can be seen as representing the set of objects that instantiate a property. In this sense, the concept of red and the term “red” can be said to refer to, or represent, the property of being red.
However, two properties may be identical, while the concepts ascribing them are not. Blackburn notes that some philosophers “think of there being a substantive science of identifying the property [φ] which is not the same thing as identifying the concept [φ]” (1993, 262). The classic example is water and H2O: having the property of being water is the same as having the property of being H2O, whether or not a person actually knows this fact. However, having the concept of water is different from having the concept of H2O; one may possess the former without the latter. This is one reason why Holton argues that response-dependency is a feature of concepts, not properties. What about predicates?
A predicate such as “… is red” or “x is red” is an incomplete linguistic structure that forms a complete sentence when “…” or “x” refers to something, and it forms a true sentence when what “…” or “x” refers to is indeed red. A predicate is thus true only of certain objects (in this case, the members of the set of red objects), and predicates are said to have satisfaction, correctness, or application conditions. The predicate “x is red” is satisfied or correctly applied if and only if the object denoted by “x” has the property of being red. The relationship between properties, concepts, and predicates can be summarized as follows: “a predicate like ‘… is red’… can be used to ascribe the property of redness to things.… At the same time, this predicate ‘… is red’ expresses the concept red” (Wedgwood, 2013, 976). However, a similar issue arises concerning the relationship between predicates and concepts. In the philosophy of language, it is well known that two terms or predicates can refer to the same property while presenting it in different ways: they have the same referent but different Fregean senses, resulting in distinct concepts.
This matter will not be pursued further. The aim of the above discussion has been to clarify some of the significant disputes concerning the boundaries of response-dependency and to highlight the general features of R-D accounts. This article will now introduce Wright’s account by focusing first on two cases: color, as the typical example of a secondary quality, and shape, as the typical example of a primary quality. In each case, Wright discusses the relevant concepts and their extensions, though, as will be seen, he prefers to characterize response-dependence in terms of the distribution of truth-values across a class of judgments within each discourse. After presenting Wright’s J-D account in these two cases, I will explain how this account addresses the more complex case of intention and, finally, the most controversial case in this discussion: moral values.
2. Wright’s Account: The Case of Color
To begin with, consider the basic equation for color concepts, such as the concept of red:
RED: (x)(S): x is red ↔ S judges that x is red, under certain conditions.
The key question is whether it is S’s judgments that determine the extension of the concept of red. Here, the concern is with judgments formed “under certain conditions.” More precisely, S’s best judgments are needed, that is, those formed “under certain optimal conditions” (Wright, 1992, 108, 111). This is necessary in light of the familiar Wittgensteinian remark that not every judgment a subject forms can be trusted or treated as reliable: it is not the case that whatever seems right to the subject is right. If the subject is color-blind, suffers from brain damage affecting vision, observes the object in poor lighting, or is inattentive, then the subject’s judgment about x’s color cannot be trusted. What is needed are the subject’s best judgments about x’s color, and best judgments are those formed under optimal conditions. The first step, therefore, is to specify what these optimal conditions would be.
Some of these conditions have already been mentioned. Importantly, Wright is not after ideal conditions (for reasons explained below) but rather after what are ordinarily regarded as statistically normal conditions for color observation. These conditions have two components: one about the environment and one concerning the subject or observer. On the environmental side, the conditions include the following: “the surface must be in full view and in good light, relatively stationary, and not too far away” (2001, 192), with lighting conditions typically specified as “those which actually typically obtain out-of-doors and out-of-shadow at noon on a cloudy summer’s day” (2001, 195). These are generally treated as the conditions under which an object’s color can best be observed. On the part of the observer, “the thinker must know which object is in question, must observe it attentively, must be possessed of normal visual equipment and be otherwise cognitively lucid, and must be competent with the concept blue. In addition, the thinker must be free of doubt about the satisfaction of any of these conditions” (2001, 192-193).
It is crucial to note that having the concept of red is part of the optimal conditions. Without the concept, the subject cannot form any judgment about it; one cannot judge about something for which one has no concept or understanding. However, possessing the concept of red does not entail knowing the color of any particular object.
Once these conditions are met, the judgment formed can be taken as the subject’s best and most reliable judgment about the color of the object. The next step is to show what role this judgment plays in determining the extension of the concept, that is, in determining x’s color. Before turning to this, it is important to address a key complication with the “basic equation” as formulated above, in which the optimal conditions appear on the right-hand side of the biconditional. For Wright, this formulation introduces a potential problem that must be dealt with first.
a. The Chameleon Problem
Suppose it has been established that color is a secondary quality, and thus the fact that an object is red is constituted by the subject’s best judgment formed under optimal conditions. It is clear that the obtaining of optimal conditions should not change the fact being judged, namely, that the object in question is red. However, consider the following case.
Suppose our subject is ready to judge the color of a chameleon, and the chameleon’s skin is currently red. The fact in question is that the chameleon is red, and the judgment will be “the chameleon is red.” The goal is to examine the explanatory role of S’s judgment in determining the truth of this fact. The basic equation here would be:
CHRED: The chameleon is red ↔ S judges that it is red, under optimal conditions.
Consider the right-hand side of the above equation. Since it says that S forms their judgment regarding the chameleon’s color under optimal conditions, it can be written in the following conditional form:
Optimal conditions obtain → S judges that the chameleon is red.
Thus, the above basic equation becomes:
CHRED: The chameleon is red ↔ (Optimal conditions obtain → S judges that the chameleon is red).
Now, suppose the chameleon is in a dark room on a green carpet. To form their best judgment, the subject requires the optimal conditions to obtain, which include the conditions regarding appropriate lighting. Under these lighting conditions, however, the chameleon changes its skin color to green, matching the carpet. The subject’s best judgment, therefore, would not be that the chameleon is red, but that it is green. This would make the right-hand side of the biconditional false, since under optimal conditions S would not judge that the chameleon is red. Does this mean that red is not a secondary quality?
It may seem tempting to resolve the issue by rewriting the equation as:
CHGREEN: The chameleon is green ↔ (Optimal conditions obtain → S judges that the chameleon is green).
However, this move introduces a deeper problem. How could the subject know that the chameleon is green before the optimal conditions obtain and thus before the subject even forms their best judgment under those conditions? Before the obtaining of the optimal conditions, all that the subject knew was that the chameleon is red. If S knew and could already judge that the chameleon is green prior to the obtaining of optimal conditions, it would imply that the color had been fixed independently of their best judgment, undermining the treatment of color as a secondary quality.
This problem, identified by Johnston (1993, 121) as the problem of “altering,” highlights that the obtaining of optimal conditions may unpredictably alter the facts or truth-conditions of the sentences by which one describes those facts about the colors of objects, and thus directly affect the subject’s judgments. Wright offers a solution to this challenge.
According to Wright, the basic equation should be reformulated as follows:
PECH: Optimal conditions obtain → (the chameleon is green ↔ S judges that the chameleon is green).
Here, the condition that optimal conditions obtain is moved to the antecedent of a conditional, with the biconditional placed in the consequent. This means that once optimal conditions obtain, the fact to be examined will be that the chameleon’s skin is green, and the subject will judge it to be so. (Similarly, if the chameleon is red under optimal conditions, the subject will judge it to be red.) Thus, in the case of red, the equation that Wright’s account works with becomes:
PERED: (x)(S): Optimal conditions obtain → (x is red ↔ S judges that x is red).
This formulation, which Wright calls a “Provisional Equation” (PE) (1992, 119; 2001, 194), ensures that the obtaining of optimal conditions does not change the fact under scrutiny, while preserving the covariance between judgment and the property in question. The equation is called “provisional” precisely because it is provisioned by, or is conditional on, the obtaining of optimal conditions.
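The scope shift effected by the provisional equation can be summarized schematically, with C standing for the obtaining of optimal conditions and F for an arbitrary color predicate (this is a condensed restatement of the formulations above, not Wright’s own notation):

```latex
% Basic-equation form: C is embedded inside the biconditional,
% so securing C can interact with the very fact being judged.
\forall x \,\forall S :\; Fx \;\leftrightarrow\; \bigl( C \rightarrow S \text{ judges that } Fx \bigr)

% Provisional-equation form: C governs the whole biconditional,
% so the covariance is assessed only once C already obtains.
\forall x \,\forall S :\; C \;\rightarrow\; \bigl( Fx \;\leftrightarrow\; S \text{ judges that } Fx \bigr)
```

In the first form, bringing about C may itself alter the fact on the left-hand side, as the chameleon case shows; in the second, the fact and the judgment are compared only within circumstances in which C is already in place.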
As noted earlier, Wright still needs to demonstrate whether color is indeed a secondary quality, since PEs merely capture the covariance between judgments and colors. This covariance can be explained in two ways:
(1) Euthyphronic (Extension-Determining): Facts about the object’s color and S’s best judgments covary because it is S’s judgments that constitute facts about the color of the object. In this case, judgments are said to play an “extension-determining” role (2001, 192), making the concept of color “judgment-dependent” or secondary: the extension of color concepts is determined by S’s best judgments.
(2) Socratic (Extension-Tracking): Facts about the object’s color and S’s best judgments covary because the subject’s perceptual apparatus is well-suited to accurately capture independently constituted color facts. Here, judgments are said to play an “extension-reflecting” or “extension-tracking” role (2001, 192): they merely detect, track, or reflect the extension of color concepts, making them “judgment-independent” or primary.
Wright must establish which explanation applies to color concepts, such as the concept of red. To address this, he introduces four specific conditions that, if satisfied, would demonstrate that the concept under consideration is indeed judgment-dependent or secondary.
3. The Four Conditions
According to Wright, in order to show that red is a judgment-dependent or secondary concept, the provisional equation for red (PERED) must satisfy four specific conditions.
a. The A Priority Condition
This condition requires that the truth of PERED be knowable in an a priori way. First of all, if red is a secondary concept, PERED is true because S’s best judgments would then determine the facts about objects being red, ensuring that facts about color co-vary with facts about S’s best judgments. This is precisely what PERED states: under optimal conditions, x is red if and only if S judges that x is red.
Second, if red is a secondary quality, a subject who possesses the concept of red would know the truth of PERED a priori, that is, simply by reflecting on the concept itself, without needing to appeal to empirical experience (see the a priori and a posteriori article). Possessing the concept of red, understood as secondary, is to know that under appropriate conditions, if an object is red, it is judged to be red, and vice versa. If the A Priority Condition were not met, S would require knowledge beyond reflecting on the concept of red to know that PERED is true, implying that S’s best judgments could no longer serve as the conceptual ground for truths about color. This would undermine the secondary status of color. Thus, the A Priority Condition is a necessary condition that must be met if red is secondary.
PERED meets the A Priority Condition because any normal person possessing the concept of red would already know that under standard conditions, if an object is red, it will appear red, and vice versa. This is simply what is required to have the concept of red conceived as secondary.
Note, however, that while empirical observation is needed to determine whether a particular object is red, such an inevitable appeal to experience does not undermine the a priori knowability of the truth of PERED itself, that is, that what it expresses is true.
Nonetheless, meeting this condition alone does not suffice to establish that red is a secondary quality because PERED could be a priori knowable even if color is determined independently of (or by something other than) S’s best judgments.
b. The Substantiality Condition
The optimal conditions must be specified in a substantial, non-trivial manner, avoiding “whatever-it-takes” formulations that guarantee the correctness of S’s judgments by fiat (see Wright, 2001, 196). If the optimal conditions were defined as “whatever is necessary for S to form correct judgments about color” or contained conditions such as “… and there is no obstacle whatsoever for S to determine the color of x…,” the truth of PERED would be trivial. One would no longer be able to claim that the truth of PERED owes specifically and exclusively to the fact that S’s best judgments determine facts about redness, rather than simply to the obtaining of the conditions themselves. If the Substantiality Condition is violated, it remains entirely indeterminate whether S’s judgments, if formed under substantially specified optimal conditions, would still determine facts about x’s color.
The optimal conditions Wright specifies, such as S possessing the concept, being attentive, having normal visual equipment, and observing under appropriate lighting, do not guarantee that S will always form correct judgments about color, but they are sufficient to enable reliable judgments. This shows that the Substantiality Condition is met in the case of red.
Note that this is why Wright relies on statistically normal conditions rather than ideal conditions, which would involve “whatever conditions are conducive to the reliable appraisal of colour” (1992, 113). “Ideal conditions” would collapse into triviality because they would, by definition, be conditions specified in terms of the delivery of correct appraisals of color.
Yet, satisfying the Substantiality Condition alone does not rule out the possibility that independent facts may still constitutively be involved in determining color, leaving the judgment merely to track these facts.
c. The Independence Condition
The optimal conditions must not presuppose any fact about the color of the object. If red is a secondary quality, facts about an object’s being red should be determined by S’s best judgment, not assumed beforehand. Presupposing any fact about x’s color in the optimal conditions would raise this question: what determined those facts about x’s color before S formed their best judgment? Since such facts are fixed before S makes their best judgment, the judgment cannot be said to play an extension-determining role. Consider an example.
Suppose the optimal conditions included a condition like “x’s color remains stable during the observations S makes.” This requirement would presuppose that x has a specific color, F, that remains stable over a period, implying that the fact about x’s color has been determined somehow independently of S’s best judgment. This undermines the judgment-dependent status of color.
Wright’s formulation of optimal conditions avoids such presuppositions. It requires no assumptions about what color x has or about the stability of x’s color. One further point should be noted here.
By looking at PERED, one can see that the concept of red appears on both sides of the biconditional. Does this violate the Independence Condition? It has been assumed that S possesses the concept of red, which means that S knows this concept has a certain extension, E, which admits only specific objects. However, Wright argues that this does not violate the Independence Condition because his account is non-reductionist. It does not seek to reduce the concept of red to other concepts or facts about redness to any other facts but instead investigates the relation between S’s judgments and the object’s being red. Thus, using the concept within the biconditional does not assume an extension independently determined prior to judgment (see Wright, 1992, 120–121).
It remains to be seen, however, whether meeting these three conditions suffices to show that red is judgment-dependent, or whether a non-judgment-dependent account might also satisfy them.
d. The Extremal Condition
This condition requires that the satisfaction of the above three conditions owes only to the fact that S’s best judgments determine the concept’s extension. There must be no alternative explanation for why the PE in question satisfies these conditions other than the extension-determining role of S’s best judgment. Wright illustrates this condition using an example.
Consider the case of pain. The PE for this case can be written as follows:
PEPAIN: If S has the concepts requisite in order to entertain the judgment that she is in pain, then S is in pain at t if and only if S judges that she is in pain at t.
Here, while PEPAIN appears to meet the first three conditions, this is due to the infallibility of first-person judgments about pain, not because the judgments determine the extension of the concept of pain. Pain is the typical example of the mental states of which a person has a unique qualitative experience: when one feels pain, such a feeling is distinct from that of a tickle, an itch, or similar feelings. Moreover, one has direct, non-inferential knowledge of one’s own pains, without needing to form judgments, beliefs, or inferences to know one is in pain. This means that PEPAIN is a priori knowable to be true for someone who possesses the concept of pain; the optimal conditions were also substantially specified, and no fact about S’s pain was presupposed in them. Thus, in the case of pain, although the provisional equation meets the three conditions, it does not show that the concept of pain is judgment-dependent; rather, the conditions are satisfied because of the infallibility of first-person awareness (see Wright, 1992, 124).
Therefore, one must carefully distinguish between (a) the cases in which the three conditions are satisfied because S’s judgments have determined the extension of the concept, and (b) the cases in which they are satisfied because of our independent, infallible access to the extension of the concept, or because of other factors, in which case our judgments about the concept’s extension at most track or reflect facts about that extension. The notion of judgment-dependence applies only where no alternative explanation exists for why the conditions are met other than the extension-determining role of S’s judgment. The case of color, as indicated, meets this condition as well.
Therefore, since PERED satisfies all four conditions, Wright concludes that color can indeed be viewed as a judgment-dependent or secondary concept: S’s judgments about x’s color, formed under optimal conditions, do not merely track facts about x’s being red but determine them.
Before turning to the case of shape, it is worth recalling that Wright often frames these results in terms of the “distribution of truth values” across judgments within a discourse. For, if color is a judgment-dependent concept, then equations like PERED “purport to exhibit a way in which truth values in the discourse in question are … bounded by the deliverances of best opinion” (1992, 120; see also 2012, 404). This connects with Wright’s broader Dummettian anti-realist view of truth and the related notion of superassertibility (see Wright, 1992, Chapter 2; 2003, Chapter 1; see especially the related papers collected in Wright, 1993). A detailed exploration of Wright’s anti-realism is beyond the scope of this article, but it is important to note that for judgment-dependent concepts like color, the applicability of the truth predicate to assertions involving these concepts (their truth or falsity) depends on human judgments, which serve as the truth-makers for such assertions.
This concludes Wright’s J-D account of color, showing how red can be treated as a secondary, judgment-dependent concept. The next question is whether the same approach can explain why shape is typically considered a primary concept.
4. Wright’s Account: The Case of Shape
Shape concepts are typically viewed as primary concepts, and Wright argues that his account can explain why this is so. Consider Wright’s example (2001, 197): Suppose there is an object in view with a pear-shaped surface. As with color, one must first set out a provisional equation for x being pear-shaped and specify optimal conditions under which S’s judgments about x’s shape would count as their best judgments:
PEPEAR: (x)(S): Optimal conditions obtain → (x is pear-shaped ↔ S judges that x is pear-shaped).
Before examining whether PEPEAR meets the four conditions, the relevant optimal conditions must be identified. Some of them would naturally be similar to those for color because shape also involves visual appraisal: the object must be clearly visible, the environment must not be too dark, dusty, and so forth. The subject must also be perceptually normal, attentive to the shape of the object (rather than its color or other properties), and free from cognitive dysfunction or brain damage. Wright summarizes the optimal conditions for shape as follows:
S knows which object x is, and knowingly observes it in plain view from a sufficient variety of positions in normal perceptual conditions, and is fully attentive to these observations, and is perceptually normal and is prey to no other cognitive disfunction, and is free of doubt about the satisfaction of any of these conditions. (1988, 17)
Although the conditions appear innocuous, a deeper problem lurks beneath the surface.
a. Violation of the Independence Condition: The Stability Problem
Consider the following two questions about the specified optimal conditions:
(1) Would a single observation of x suffice for S to make their best judgment about its shape?
It does not seem so. Unlike color, where a single observation under optimal conditions often suffices, shape judgments typically require viewing the object from multiple angles. It is a common experience to see a square-shaped object as appearing rectangular, rhombus-like, or even trapezoidal. Only after viewing the object from multiple perspectives can one reliably conclude that its surface is square rather than another shape. For this reason, Wright adds the requirement that S observes the object “from a sufficient variety of positions.”
This requirement, however, gives rise to an additional worry: if S must make multiple observations to form their best judgment about x’s shape, then “we must ensure that no change in x’s shape takes place through the period of these several observations” (2001, 197). This leads to the second question:
(2) Can this shape stability requirement be safely included within the optimal conditions?
It does not seem so. For S to make multiple observations, x’s shape must remain stable throughout. This would require adding a condition such as: “the shape of the object remains stable during the period of observation.” However, adding this condition presupposes that x has a determinate shape, F, that remains stable during the period of observing the object from different directions; this is to presuppose some facts about x’s shape in the optimal conditions, which are inevitably constituted before and independently of S’s best judgments. Thus, the Independence Condition, which required that optimal conditions not presuppose facts about the target property, is violated. If shape were secondary, facts about x’s shape would be determined by S’s best judgments. Here, by contrast, x’s shape must already be fixed independently of S’s best judgment. Can the stability problem be avoided?
Wright suggests a teamwork strategy: instead of one subject making multiple observations, multiple subjects could simultaneously observe the object from different angles and report their judgments. If most agree that x is square, this could be taken as the best judgment about x’s shape (see 2001, 197). This might resolve the stability issue and thus preserve the Independence Condition.
However, even if the Independence Condition can be met, Wright argues that shape concepts face a deeper problem: the violation of the A Priority Condition.
b. Violation of the A Priority Condition: The Answerability Problem
Wright observes that “[t]he application of shape predicates, even ones as rough and ready as ‘pear-shaped’, is answerable to a variety of considerations besides visual appearance” (2001, 197). In particular, operational definitions for shape are used. For instance, a square is defined as having four equal sides and four right angles. With such definitions at hand, would it really matter how accurately, or otherwise, S observes the object under this or that set of optimal conditions? Under such definitions, visual appraisal alone appears insufficient or at least subordinate; measurement becomes decisive. As Wright puts it, “operational criteria dominate the visual” (2001, 198). What is needed is simply to measure the sides and angles to determine the shape approximately. Consequently, S’s visual judgments about shape lose their extension-determining role, instead serving to track facts established by measurement.
Of course, there is no surface in the actual world that is perfectly and ideally square-shaped; even if it is assumed that there are such surfaces, humans do not possess the visual apparatus to detect them accurately. Thus, it seems that all that is needed is to determine whether something is “approximately square.” To see the problem more clearly, consider the PE for “approximately square”:
PESQUARE1: (x)(S): Optimal conditions obtain → (S judges that x is approximately square ↔ x is approximately square).
Now, define approximate squareness operationally:
x is approximately square ↔ (if the four sides and angles of x were correctly measured, the sides would be approximately equal, and the angles approximately right).
The right-hand side of PESQUARE1, that is, that “x is approximately square,” can be replaced with this operational definition to arrive at PESQUARE2:
PESQUARE2: (x)(S): Optimal conditions obtain → (S judges that x is approximately square ↔ (if the four sides and angles of x were correctly measured, the sides would be approximately equal, and the angles approximately right)).
If square were a secondary concept, the A Priority Condition would require PESQUARE2 to be knowable to be true a priori by anyone competent with the concept, that is, merely by reflecting on the concept of square or approximately square. The problem is that it is not a priori knowable that judgments under optimal conditions align with operational definitions and precise measurements: this is a contingent, empirical fact. It is not the case that anyone possessing the concept of squareness knows a priori that if correct measurement is made, it would reveal the sides and angles of x to be approximately equal and right. As Wright states, this is a “deep fact of experience” (1988, 20) and only a posteriori knowable. The case becomes worse if more complex shapes and geometric figures are considered.
Hence, the A Priority Condition is violated: in the case of shape, one’s best judgments do not determine the extension of shape concepts but instead track or detect independently established facts.
The conclusion is that, even if one manages to address the stability problem and thus the violation of the Independence Condition, the violation of the A Priority Condition remains. In the case of shape, our best visual judgments serve only an extension-tracking role, rather than an extension-determining one. Shape, therefore, can only be viewed in a Socratic manner, as a primary, judgment-independent concept.
A note on the theoretical framework of this discussion is worth considering. It may be tempting to think that ordinary language users actually learn, come to know, and operate with explicit awareness of the biconditionals or PEs like PERED or PESQUARE. However, as Pettit clarifies, these biconditionals are theoretical constructs developed by philosophers to capture the structure of everyday practices:
The biconditional belongs to us theorists, not to the participants in the relevant practice. We theorists register how the participants fix on the property that they refer to as redness…. Although participants may have no notion of normal conditions in their repertoire… their practice ensures that it is indeed a priori that something is red just in case it is such as to look red in normal conditions. (Pettit, 1991, 601)
Pettit calls this sort of view “ethocentric” because such accounts, including Pettit’s, rely on the habits and correction practices of ordinary language users (see especially Pettit, 1998, 56-57). One learns to use terms like “red” or “square” through ostensive teaching and communal practices, not by grasping theoretical biconditionals or PEs explicitly: such knowledge is of a know-how sort, not propositional knowledge of some definition or of anything like what PEs express. Philosophers abstract these biconditionals from people’s ordinary practices to test whether concepts are response-dependent or response-independent.
Having examined Wright’s account applied to color and shape, this framework can now be applied to other domains, including mental concepts (such as “intention”) and evaluative concepts (such as “moral goodness”). Before doing so, it will be useful to summarize some of the broader implications of Wright’s discussion for understanding objectivity and subjectivity in the treatment of primary and secondary qualities.
5. A Mix of Subjectivity and Objectivity
A central issue in the discussion of primary and secondary qualities concerns the degree of objectivity and subjectivity that philosophers attribute to them. These notions typically hinge on the notion of dependence on human responses, including perceptions, dispositions, judgments, beliefs, observations, and related attitudes. A quality is considered subjective if its existence or instantiation depends on, or is constituted by, certain human responses; conversely, a quality is objective to the extent that its existence or instantiation is independent of such responses. Consequently, our knowledge of subjective qualities is often more direct, transparent, authoritative, and non-inferential than our knowledge of objective qualities, precisely due to this dependence.
Traditionally, secondary qualities have been regarded as almost entirely subjective, while primary qualities are viewed as objective. A secondary quality exists only because the perceivers are capable of perceiving or responding to it in a specific way, whereas a primary quality exists independently of perceivers. For example, an object’s being red is seen as subjective because it depends on how it appears to the observer, whereas being square-shaped is viewed as objective because it does not depend on an observer’s existence or perception.
Wright rejects this extreme dichotomy. According to Wright, his account shows that both primary and secondary qualities display degrees of subjectivity and objectivity. On one hand, the J-D account reveals that red is judgment-dependent: facts about redness are not constituted independently of human responses. On the other hand, this does not imply that “there is no standard to meet, that whatever we say about colour goes or — what comes to the same thing — that there is no such thing as an object’s real colour” (2001, 199). The fact that red is judgment-dependent does not entail that whatever seems right to the subject about color is right, that any judgment about color is correct. Rather, judgments about color are authoritative and extension-determining only when formed under the optimal conditions that include objective conditions, such as lighting, perceptual functioning, and cognitive conditions, all of which can be objectively assessed. Thus, certain states of affairs independent of the subject must obtain, and S’s judgments must be formed under those conditions for those judgments to reliably determine extensions.
In the case of square, by contrast, certain subjective criteria are also necessary: S must possess the concept of squareness, understand what it is for an object to have a shape, attend to the task, and be confident that relevant conditions obtain. These subjective requirements are crucial for S’s judgments to be reliable and to reflect the extension of the concept.
Traditionally, secondary qualities were dismissed as fully subjective and thus explanatorily irrelevant, as in Descartes’ view that secondary qualities should be excluded from a complete physical description of the world. Wright’s account challenges this view by demonstrating that, for a quality to count as secondary, certain objective conditions must obtain. Judgment-dependency itself relies on the existence and obtaining of objective states of affairs. Conversely, the judgment-independency of primary qualities depends, to some degree, on the subjective capacities of perceivers to recognize and apply concepts correctly. In this way, both primary and secondary qualities exhibit a mix of objectivity and subjectivity.
Nonetheless, the dependence of secondary or judgment-dependent qualities on human responses may appear to invite relativism: if facts about secondary qualities are constituted by subjects’ responses, these qualities might be seen as mere projections of human perception, akin to the expression of feelings or sensations. This raises concerns that if perception changes, the qualities themselves must also change. Must they?
6. Relativism, Objectivity, and Secondary Qualities
The worry here is that if color, for instance, is something that is dependent on one’s perception, then if the way one actually perceives the world changes, color must change as well: red would not be red anymore as it would not look red to the observer. Blackburn presses this issue when considering analogies between moral values and secondary qualities (as proposed by John McDowell). For Blackburn, there is a crucial difference between secondary qualities, such as color, taste, and so forth, and the case of moral values: relativism, he argues, does not arise in the latter case. As he puts it, “the analogue with moral qualities fails dramatically: if everyone comes to think of it as permissible to maltreat animals, this does nothing at all to make it permissible: it just means that everybody has deteriorated” (1985, 14; see also Blackburn, 2013, 49-50).
If this is correct, moral judgments possess a form of objectivity that color judgments do not. However, not all philosophers agree that dependence on perception entails relativism about color.
Many instead argue that the extension of color concepts is determined by our actual responses under normal conditions. In this view, if an “inverted spectrum” scenario were to occur, where an alien intervention or a neurological change caused red to appear green and vice versa, this would not mean that the colors of objects themselves had changed, for instance, that grass would not be green anymore. At most, one can conclude that grass, or whatever was actually and originally green, merely appears red to them, not that its color has truly changed to red. As Sydney Shoemaker, the philosopher who introduced this thought experiment, explains, “green things look the way red things used to, yellow things look the way blue things used to, and so on” (Shoemaker, 1994, 302). This “rigidification” strategy preserves a useful sense of objectivity for secondary qualities: from the fact that red is a secondary quality and thus dependent on humans’ perception, it does not follow that objects would not really have a color. Rather, as Peter Railton formulates it, color would be related to our perception in the following way:
x is red = x is such as to elicit in normal humans as they actually are (and in actual normal circumstances) the visual impression of redness. (Railton, 1998, 132)
In this way, Railton denies that color concepts invite relativism. He contends instead that color concepts enjoy a form of non-relativism or “objectivity,” conferred by the above equation and understood as “independence from certain changes in our attitudes or sensibilities” (1998, 133).
Railton’s position aligns with Wright’s view that, according to the Euthyphronic proposal, “it is the judgements of those who are actually statistically standard observers, in what are actually statistically standard conditions of observation, which count” (Wright, 1992, 114). If all people suddenly become color blind, they would obviously be disposed to make different judgments. Nonetheless, in such a counterfactual situation, our actual judgements, formed under standard conditions by standard observers, would still determine color facts: “Colours need not have changed; they need not have changed because the judgements which we would have made, had we still been visually constituted as we actually are, need not have changed” (1992, 114). Put differently, if in such a possible world people remained as they currently are, they would arrive at a judgment akin to the one they now hold. Thus, even for secondary qualities, judgment-dependency does not entail relativism but instead supports a qualified objectivity tied to the practices and capacities of actual observers under standard conditions.
This concludes our application of Wright’s J-D account to the cases of color and shape. This framework can now be employed to explore whether other concepts, particularly mental concepts and moral qualities, are best viewed as primary or secondary, or whether the analogy with primary and secondary qualities fails to illuminate their nature: a concept would be neither primary nor secondary if the account cannot be properly built for it. The following sections briefly outline how a J-D account might be developed for these more complex concepts.
7. Wright’s Account: The Case of Intention
What kind of concept is the concept of intention? To answer this, one can construct Wright’s J-D account for intention by (1) formulating a suitable PE for intention, (2) specifying the relevant optimal conditions, and (3) checking whether the A Priority, Substantiality, Independence, and Extremal Conditions are met.
Suppose that S intends to do j, such as go shopping. The PE for this case can be written as follows:
PEINT: (x)(S): Optimal conditions obtain → (S intends to j ↔ S judges that S intends to j).
Here, the fact under consideration is that S has a particular intention, expressed in a self-ascription such as “I intend to go shopping.” This is an instance of avowal, where the subject ascribes a mental state to themself. The question is whether S’s own judgment constitutes the fact that they have this intention or determines the truth of their self-ascription.
Note that one can describe the relevant concepts, predicates, and qualities: taking S as an object, the quality in question is having the intention to go shopping, with the associated concept and the predicate “intends to go shopping.” This concept has a specific extension: only certain entities (humans) can instantiate it.
Since PEINT concerns the subject’s judgment about their own intention, it represents a First-Personal J-D account: the same person or subject, whose intention is under scrutiny, is the judge too. An alternative would be a Third-Personal J-D account, where facts about S’s intention would be constituted by the judgments of another observer, or an interpreter, under optimal conditions (see Byrne, 1998; Child, 1994; Holton, 1992; Hossein Khani, 2023). The accounts considered in this article, including Wright’s, are first-personal.
In applying the J-D account to intention, Wright aims to answer two fundamental questions: (1) the metaphysical question of how the fact that S has the intention to do j is constituted, and (2) the epistemological question of how one can explain S’s direct and non-inferential knowledge of the content of their intention. If Wright’s J-D account successfully shows intention to be judgment-dependent, then facts about what S intends are constituted by S’s own judgments formed under optimal conditions, and S knows what they intend simply by forming a judgment under those conditions.
The next step is to specify the optimal conditions under which S can form their best judgment about their intention. Wright refers to these as “cognitively ideal conditions of both judge and circumstance” (Wright, 2001, 192), or “C-conditions.” In the case of intention, these naturally focus on the internal, cognitive functioning of S’s mind or brain. More particularly, Wright identifies three key conditions: “[a] grasp of the appropriate concepts, [a] lack of any material self-deception or anything relevantly similar, and [the] appropriate attentiveness” (2001, 201). These conditions are mostly clear: S must have the concept of intention, just as possessing the concept of redness was necessary for forming judgments about redness. Moreover, S must genuinely have the intention and not merely deceive themself into thinking so, avoiding distortions from intervening emotions, desires, or external pressures or environmental factors. Finally, S must be attentive to the question of their intention, not confusing it with hopes, fears, or other states. These conditions ensure that S’s judgment about their intention is reliable and can potentially play an extension-determining role.
Once PEINT is formulated and the C-conditions specified, the next step is to evaluate whether PEINT meets the A Priority, Substantiality, Independence, and Extremal conditions. First, the A Priority Condition.
If intention is a secondary concept, then the truth of PEINT must be knowable a priori. PEINT appears to satisfy this condition because possessing and reflecting on the concept of intention seems sufficient for subject S to recognize the truth of PEINT: anyone who grasps the concept of intention would, under the specified C-conditions, judge that they have an intention if and only if they actually do, and knowledge of this would need no appeal to empirical experience.
As for the Substantiality Condition, a subtle problem emerges in relation to the second criterion.
a. Violation of the Substantiality Condition
The problem is that including conditions such as “S must not be self-deceived” would make PEINT trivially true: this condition effectively requires that S not be in any situation that would prevent them from making correct judgments about their intention. However, the list of circumstances that might lead to self-deception is open-ended, and excluding all of them amounts to excluding all conditions under which S might be wrong about their intention.
As a result, the “no-self-deception” requirement functions as a “whatever-it-takes” condition, violating the Substantiality Condition: it ensures that S’s judgments are guaranteed to be correct under the C-conditions, regardless of whether the facts about intention are genuinely constituted by those judgments. This renders PEINT trivially true and prevents intention from being classified as judgment-dependent under the J-D account.
Does this mean intention is judgment-independent? Wright argues that such a conclusion is drawn too hastily, suggesting that there may be a way to remove the problematic condition while preserving substantiality.
b. Self-Deception as a Positive Presumption
Wright proposes that this issue can be resolved by treating the “no-self-deception” condition as a positive presumption (see Wright, 2001, 202). This means treating the absence of self-deception as a default assumption in everyday communication and interpretation, grounded in the presumption that others are rational, language-using, and capable of thinking and possessing a rich set of propositional attitudes. The list of such attitudes is open-ended, and they are linked to one another in a complex, holistic way. These assumptions about others are held by default, unless there is strong evidence to the contrary.
Dense “wholesale” self-deception would undermine a person’s rationality, but self-deception about a specific intention can often be detected through conflicts between self-ascriptions and actions. Interpreting others and being interpreted by others is necessary to detect such inconsistencies: one needs to be “an at least potential object of interpretation, with [their] claims about [their] own [intentions] essentially defeasible in the light of the shape assumed by [their] actual practice” (2001, 87).
If no inconsistencies are detectable, it is reasonable to positively presume that the no-self-deception condition holds (see 2001, 204). Doubt about self-deception arises only when there is strong evidence for it, and this allows the “no-self-deception” condition to be removed from the C-conditions. In this case, the C-conditions regain their substantiality: no condition remains that provides S with whatever it takes for a correct appraisal of intention. The problem of violating the Substantiality Condition is thus solved.
The resulting C-conditions, and consequently PEINT, are restricted because the positive presumptive no-self-deception condition has been removed from the C-conditions. The resulting C-conditions are thus substantially specified but conditional on, or restricted to, the obtaining of the fact that there is no evidence in favor of any considerable self-deception. As Wright puts it, “if—lacking evidence to the contrary—we are a priori justified in holding the no-self-deception condition to be met, we are also a priori justified in believing the result of deleting that condition from the provisional biconditional in question” (2001, 202).
In this restricted form, PEINT is no longer a priori true, but it is now “a priori reasonable to believe” that the restricted PEINT is true: PEINT is “a priori credible” (2001, 203). The truth of PEINT is conditional upon the absence of evidence of self-deception, making it defeasibly credible rather than unconditionally a priori true.
Despite this restriction, Wright emphasizes, intention can still be regarded as judgment-dependent in a constructive sense. Of the three C-conditions (concept possession, attentiveness, absence of self-deception), the first two remain intact, while the third is handled as a defeasible presumption. Given that the A Priority, Substantiality, and Independence Conditions have been met, is there (for the sake of satisfying the Extremal Condition) any better explanation of why it is a priori reasonable to believe the truth of something like PEINT?
For the Extremal Condition to be met, there must be no better explanation of the covariance between intention and judgment than the claim that S’s judgments formed under the restricted C-conditions play an extension-determining role. Wright argues that “the key thought of the variant approach will be that the matter will be nicely explained if the concept of intention works in such a way that… S’s opinions, formed under the restricted set of C-conditions, play a defeasible extension-determining role” (2001, 203).
Thus, the best available explanation is that S’s judgments defeasibly determine the extension of the concept of intention, where “defeasibly” means that this extension-determining role can be overridden by evidence of self-deception. Therefore, intention is a defeasible judgment-dependent (secondary) concept.
Wright suggests that a similar treatment applies to the concept of meaning, which, like intention, sustains authoritative first-person avowals and exhibits “disposition-like theoreticity” (2001, 178, 206), that is, a sensitivity to the subject’s future responses and dispositions. In both cases, self-ascriptions (for instance, “I mean addition by ‘plus’”) are taken by default as reliable but remain defeasible in light of the subject’s future responses, dispositions, and practices.
The discussion of intention now naturally leads to the next case: moral values.
8. Wright’s Account: The Case of the Moral
Are moral values secondary qualities? One often thinks of good and bad, right and wrong, permissible and impermissible as moral qualities that certain human actions may instantiate: kicking a dog for fun is wrong, while kicking a rabid dog attacking a child is not. Can Wright’s J-D account explain what metaphysically determines the moral fact that an action has a particular moral quality, and how one comes to know such facts? Is it S’s best judgment that determines the extension of concepts such as good, wrong, and permissible?
These questions form the core of meta-ethics, which typically divides into “cognitivism” (or sometimes “moral realism”) and “non-cognitivism.” Cognitivists hold that moral evaluations, such as “stealing that money was wrong,” express beliefs about actions. Beliefs have propositional contents that are truth-apt, that is, capable of being true or false. If “stealing that money was wrong” is true, it is so because some moral fact obtains. Non-cognitivists, by contrast, argue that moral judgments merely express subjective sentiments rather than beliefs. Judgments such as “helping others is good” do not describe the world and thus have no truth-conditions; they are not truth-apt as there is no moral reality that makes them true or false. Such statements merely express one’s approval or disapproval of an action.
If it succeeds, the J-D account would offer a cognitivist, though not strongly realist, perspective: moral judgments would still be truth-apt, but their truth or falsity depends on human responses rather than on independently constituted moral facts (see Miller, 2003, 5–6, Chapter 7). As before, discussion begins by formulating a PE and specifying the C-conditions.
Wright’s example is the judgment that “That remark of Jones’s was culpably insensitive.” Contrary to the case of intention, Wright adopts a Third-Personal J-D account here because he is dealing with a performed action and judgments about that particular action; any suitable subject would, under the C-conditions, arrive at the same judgment:
PEMORAL: (x)(S): The C-conditions obtain → (Jones’s remark was culpably insensitive ↔ S judges that Jones’s remark has been so).
In order to specify the C-conditions, one must see under what conditions S’s judgment about Jones’s remark would be their best judgment. The C-conditions, as before, have different parts: some would be about Jones, his personality, actions, thoughts, and life; some cover the features of the action itself; some concern the judge or S’s own moral capacities to form judgments about certain moral qualities. Wright offers the following C-conditions for the case at hand:
S scrutinises the motives, consequences and, for Jones, foreseeable consequences in the context of the remark; and does this in a fashion which involves no error concerning non-moral fact or logic, and embraces all morally-relevant considerations; and… S gives all this the fullest attention, and so is victim to no error or oversight concerning any relevant aspect of his/her deliberation; and… S is a morally-suitable subject – accepts the right moral principles, or has the right moral intuitions or sentiments, or whatever; and… S has no doubt about the satisfaction of any of these conditions. (1988, 22-23)
Most conditions are straightforward, and there is enough justification to include them in the C-conditions. For, in order for S to judge whether Jones’s remark was culpably insensitive, S must evaluate what motivated Jones to do what he did, what consequences the action had, has, or would have for him and the targeted people, and so on. Moreover, S must take into account the context of the action, the environment, those who were present at the time, and other relevant factors carefully and accurately. In addition, in doing all this, S must not make any logical or non-moral factual errors.
The critical condition, however, is “S is a morally suitable subject,” requiring that S accept appropriate moral principles and possess fitting moral intuitions, ensuring S is qualified to judge others’ actions. A person who consistently behaves immorally would not be a reliable moral judge. For instance, to be eligible for making a reliable judgment, S must not themself make culpably insensitive remarks, and so forth. However, including the condition that S is a morally suitable subject in the C-conditions creates a serious difficulty. It results in a violation of both the Independence Condition and the Substantiality Condition.
First, any account or assessment of S’s moral suitability inevitably presupposes certain moral facts and knowledge of the extensions of certain moral concepts. If S lacks such knowledge, they cannot reliably judge whether Jones’s remark was culpably insensitive or whether certain actions are wrong. This presupposes the very moral facts that the J-D account aims to explain precisely via S’s judgments. How could S judge whether Jones’s remark was culpably insensitive if S does not know what actions fall under the concept of culpably insensitive actions? Thus, the Independence Condition is violated because of presupposing, in the C-conditions, facts about the extension of moral concepts.
This mirrors the earlier problem in the case of shape. There, requiring “shape stability” in the C-conditions presupposed the object’s shape, F, before S forms their best judgment to determine it. Similarly, requiring “moral suitability” presupposes that the action has a moral quality, M, already known by a morally suitable S before their judgment is formed, and this undermines the extension-determining role of S’s judgments.
Second, the suitability requirement implies that S is guaranteed to form correct moral judgments due to their possession of the necessary moral principles and intuitions. It becomes unclear under what conditions such a subject might fail to form correct judgments. Thus, the Substantiality Condition is also violated (see 1988, 23).
Thus, including the “moral suitability requirement” in the C-conditions prevents the J-D account from showing that facts about the moral qualities of actions are constituted by S’s judgments under the specified conditions. Does this mean moral values are primary qualities, similar to shape?
a. Moral Values: Neither Primary nor Secondary
According to Wright, the analogy between moral values and secondary qualities such as color breaks down at a crucial point: meeting the C-conditions in the case of color does not require presupposing the extensions of color concepts, while “proper pedigree for moral judgements, by contrast, is a matter of meeting conditions the satisfaction of some of which is, irreducibly, a moral question” (1988, 23-24).
This disanalogy has significant consequences. In the case of moral qualities, one often wishes to preserve a degree of objectivity: for example, one regards the judgment that genocide is wrong as expressing more than a subjective sentiment; it is considered truly wrong, regardless of individual opinions.
For secondary qualities, Wright showed that S’s relevant judgments retain a sustainable objectivity despite their being secondary or judgment-dependent. However, since moral qualities are not analogous to secondary qualities, this helpful mix of objectivity and subjectivity is lost: “The mix of subjectivity and objectivity is simply not as in the case of secondary qualities. The comparison is misconceived, and can only encourage a misconceived confidence in the objectivity of morals” (1988, 24).
However, does this imply that moral qualities are primary qualities, like shape? Wright denies this analogy too. Primary qualities interact with other qualities of the object, as well as the perceiver, in various ways, but moral qualities “lack this diversity of interactive role” (1988, 25, fn. 36). Primary qualities, such as shape, extension, position, solidity, motion, and number, are those qualities by which one describes the fundamental nature of physical objects. Yet describing these qualities necessitates a network of interdependencies among them. For example, accounting for an object’s motion requires reference to its position; likewise, describing its size and shape involves its extension and solidity, among other attributes. According to Wright, such intricate interrelations and dependencies are absent in the realm of moral qualities.
Additionally, explanations of knowledge of primary qualities rely on such relationships and causal interactions. For instance, one may trace the covariance between the shape of an object and one’s (often very accurate) visual judgments of it to evolutionary processes and other natural factors. In the case of moral qualities, Wright argues, there is “no inkling of how such a story concerning our appraisal of moral value might run” (1988, 25, fn. 36). One does not know, or at least lacks any plausible account of, which components of the body or brain might plausibly be cited to explain one’s purported epistemic access to moral facts. Do humans possess a distinct faculty, capacity, or mechanism uniquely designed to connect them to the alleged moral reality and deliver moral knowledge? The assumption of such special, perhaps even queer or mysterious, faculties is fraught with well-known philosophical challenges.
In summary, unlike color, meaning, and intention, moral qualities cannot be viewed as secondary qualities, which means that the mix of objectivity and subjectivity they afford is lost. Yet moral qualities do not function as primary qualities either. From these considerations, Wright concludes that appealing to the distinction between primary and secondary qualities offers no constructive path for the metaphysics and epistemology of the moral. The moral cannot be located within this framework, requiring philosophers to look elsewhere for an account of moral values.
In this way, Wright’s J-D account demonstrates that the comparison of moral qualities with either secondary or primary qualities is flawed. This is still an important philosophical insight.
9. References and Further Reading
a. References
Blackburn, Simon. 1985. “Errors and the Phenomenology of Value.” In Morality and Objectivity: A Tribute to J.L. Mackie, edited by Ted Honderich, 1–22. London: Routledge.
Blackburn, Simon. 2013. “Relativism.” In The Blackwell Guide to Ethical Theory, edited by Hugh LaFollette, 43–58. Malden, MA: Blackwell.
Byrne, Alex. 1998. “Interpretivism.” European Review of Philosophy, 3: 199–223.
Child, William. 1994. Causality, Interpretation, and the Mind. Oxford: Oxford University Press.
Cooper, John M. (ed.). 1997. Plato: Complete Works. Indianapolis: Hackett. Euthyphro: 1–16.
Holton, Richard. 1991. “Intentions, Response-Dependence, and Immunity from Error.” ANU Working Papers in Philosophy, 1: 1–26.
Holton, Richard. 1992. “Response-Dependence and Infallibility.” Analysis, 52(3): 180–184.
Hossein Khani, Ali. 2023. “Intention and Judgment-Dependence: First-Personal vs. Third-Personal Accounts.” Philosophical Explorations, 27(1): 41–56.
Jackson, Frank. 1996. “The Primary Quality View of Color.” Noûs, 30: 199–219.
Jackson, Frank and Pettit, Philip. 2002. “Response-Dependence without Tears.” Philosophical Issues, 12: 97–117.
Johnston, Mark. 1989. “Dispositional Theories of Value.” Proceedings of the Aristotelian Society, 63: 139–74.
Johnston, Mark. 1992. “How to Speak of the Colors.” Philosophical Studies, 68: 221–263.
Johnston, Mark. 1993. “Objectivity Refigured: Pragmatism Without Verificationism.” In Reality, Representation and Projection, edited by John Haldane and Crispin Wright, 85–130. Oxford: Oxford University Press.
McDowell, John. 1985. “Values and Secondary Qualities.” In Morality and Objectivity: A Tribute to J. L. Mackie, edited by Ted Honderich, 110–129. London: Routledge.
Miller, Alexander. 2003. An Introduction to Contemporary Metaethics. Cambridge, MA: Blackwell.
Pettit, Philip. 1991. “Realism and Response-Dependence.” Mind, 100(400): 586–626.
Pettit, Philip. 1998. “Terms, Things and Response-Dependence.” European Review of Philosophy, 3: 55–66.
Railton, Peter. 1998. “Red, Bitter, Good.” European Review of Philosophy, 3: 67–84.
Shoemaker, Sydney. 1994. “Self-Knowledge and ‘Inner Sense’.” Philosophy and Phenomenological Research, 54: 249–314.
Wedgwood, Ralph. 2013. “Concepts vs. Properties, Moral.” In The International Encyclopedia of Ethics, edited by Hugh LaFollette, 976–984. London: John Wiley & Sons.
Wright, Crispin. 1988. “Moral Values, Projection and Secondary Qualities.” Proceedings of the Aristotelian Society, 62: 1–26.
Wright, Crispin. 1992. Truth and Objectivity. Cambridge, MA: Harvard University Press.
Wright, Crispin. 1993. Realism, Meaning and Truth. Second Edition. Cambridge, MA: Harvard University Press.
Wright, Crispin. 2001. Rails to Infinity: Essays on Themes from Wittgenstein’s Philosophical Investigations. Cambridge, MA: Harvard University Press.
Wright, Crispin. 2012. “Replies.” In Mind, Meaning, and Knowledge: Themes from the Philosophy of Crispin Wright, edited by Annalisa Coliva, 377–486. Oxford: Oxford University Press.
b. Further Reading
Boghossian, Paul. 2012. “Blind Rule-Following.” In Mind, Meaning, and Knowledge: Themes from the Philosophy of Crispin Wright, edited by Annalisa Coliva, 27–48. Oxford: Oxford University Press.
Boghossian, Paul and J. David Velleman. 1989. “Colour as a Secondary Quality.” Mind, 98(389): 81–103.
Brynjarsdóttir, Eyja M. 2008. “Response-Dependence of Concepts Is Not for Properties.” American Philosophical Quarterly, 45(4): 377–386.
Devitt, Michael. 2009. “Global Response Dependency and Worldmaking.” In Putting Metaphysics First, 121–136. Oxford: Oxford University Press.
Edwards, Jim. 1992a. “Secondary Qualities and the A Priori.” Mind, 101(402): 263–272.
Haukioja, Jussi. 2001. “The Modal Status of Basic Equations.” Philosophical Studies, 104: 115–122.
Haukioja, Jussi. 2007. “How (Not) to Specify Normal Conditions for Response-Dependent Concepts.” Australasian Journal of Philosophy, 85(2): 325–331.
Hindriks, Frank A. 2006. “Acceptance-Dependence: A Social Kind of Response-Dependence.” Pacific Philosophical Quarterly, 87(4): 481–498.
Hossein Khani, Ali. 2020. “Interpretationism and Judgement-Dependence.” Synthese, 198(10): 9639–9659.
Hossein Khani, Ali. 2023. “Intention, Judgment-Dependence, and Self-Deception.” Res Philosophica, 100(2): 203–226.
Menzies, Peter and Price, Huw. 1993. “Causation as a Secondary Quality.” The British Journal for the Philosophy of Science, 44(2): 187–203.
Miller, Alexander. 2007. “Another Objection to Wright’s Treatment of Intention.” Analysis, 67 (3): 257–263.
Miller, Alexander. 2009. “Primary Qualities, Secondary Qualities and the Truth about Intention.” Synthese 171 (3): 433–442.
Miller, Alexander and Divers, John. 1994. “Best Opinion, Intention-Detecting and Analytic Functionalism.” Philosophical Quarterly, 44(175): 239–245.
Norris, Christopher. 2002. “Realism, Projectivism and Response-Dependence: On the Limits of ‘Best Judgement’.” Philosophy and Social Criticism, 28(2):123–152.
Pettit, Philip. 1999. “A Theory of Normal and Ideal Conditions.” Philosophical Studies, 96: 21–44.
Powell, Mark. 1998. “Realism or Response Dependence?” European Review of Philosophy, 3: 1–13.
Railton, Peter. 1998. “Aesthetic Value, Moral Value, and the Ambitions of Naturalism.” In Aesthetics and Ethics: Essays at the Intersection, edited by Jerrold Levinson, 59–105. New York: Cambridge University Press.
Stroud, Barry. 2018. “Are the Colours of Things Secondary Qualities?” In Seeing, Knowing, Understanding, 179–192. Oxford: Oxford University Press.
Shoemaker, Sydney. 1990. “Qualities and Qualia: What’s in the Mind?” Philosophy and Phenomenological Research, 50: 109–131.
Thompson, Brad. 2006. “Moral Value, Response-Dependence, and Rigid Designation.” Canadian Journal of Philosophy, 36(1): 71–94.
In the philosophy of mathematics, intuitionism stems from the view originally developed by L. E. J. Brouwer that mathematics derives from intuition and is a creation of the mind. This view is prefigured most notably by Kant, Kronecker, Poincaré, Borel, and Lebesgue. Intuitionism maintains that a mathematical object exists only if it has been constructed and that a proposition is true only if a certain construction that realizes its truth has been carried out. Thus, intuitionism generally entails a form of anti-realism in ontology and truth-value, for mathematical objects exist and mathematical propositions have truth-values but never independently of our limited human cognitive faculties.
Intuitionism can also be understood as a reaction to Cantor’s set theory due to its attempt to tame the infinity in mathematics by accepting only potentially infinite objects, namely, actually finite objects which can always be extended into larger finite objects. In this respect, intuitionism preserves the spirit of ancient Greek mathematics, where actual infinity used to be avoided by techniques such as the method of exhaustion. In his systematic development of a new foundation of mathematics without actually infinite collections, Brouwer was famously led to abandon the law of excluded middle and to introduce a new finitary class of objects called “choice sequences” in order to reconstruct the theory of the continuum.
Intuitionism is one of the three major views that dominated debates in the foundations of mathematics during the first half of the 20th century, along with logicism and formalism. It maintains against logicism that logic is merely a part of mathematics and against formalism that mathematics has meaning and is unformalizable. Intuitionism is also the only school among the three that was in effect largely reinforced by Gödel’s incompleteness results. Although it is classical mathematics and not intuitionistic mathematics that remains widely practiced, intuitionism has borne many fruits in philosophy, mathematics, and computer science. Moreover, in addition to the historical significance of intuitionism, topics in its mathematics, logic, and philosophy continue to be actively explored.
This article surveys intuitionism as a philosophy of mathematics, with emphasis on the philosophical views endorsed by Brouwer, Heyting, and Dummett. Some preliminary remarks are in order. The term “intuitionism” is not synonymous with “constructivism”, an umbrella term that roughly refers to any particular form of mathematics that adopts “we can construct” as the appropriate interpretation of the phrase “there exists”. However, intuitionism remains one of the most prominent varieties of constructive mathematics. The curious reader can see the related article constructive mathematics for background. For more on the intuitionistic rejection of actual infinities, see also the article on the infinite. Finally, because intuitionism advocates a revision of classical mathematics, a certain amount of mathematical knowledge is required to fully appreciate some parts of this article, most importantly in the section on intuitionistic analysis. Readers can check any introductory textbook on classical real analysis or topology if they are not familiar with the terminology. Some familiarity with basic set theory is also presupposed in the first section on Brouwer, in particular regarding transfinite ordinals and uncountable cardinalities. Since the focus is intuitionistic mathematics, intuitionistic logic is not presented in this article. But for reference the appendix includes a list of notable theorems and non-theorems of intuitionistic logic.
The development of non-Euclidean geometry in the nineteenth century put serious pressure on the status of space as a pure form of intuition, for it is hard to determine which geometry best describes the space of our experience. Intuitionism represents Brouwer’s attempt to revise Kant’s philosophy of mathematics by renouncing the intuition of space altogether and developing a stronger commitment to the pure intuition of time. Simply put, Brouwer views all his intuitionistic mathematics as a product of temporal intuition. This means in particular that Brouwer is open to the use of the tools of analytic geometry to found geometry on real-number coordinates without appeal to spatial intuition.
According to Brouwer, mathematics is a human mental activity independent of language; mathematical objects exist only as mental constructions given in intuition; and there are no mathematical truths which are not graspable by intuition. This means that Brouwer is committed to a radical version of anti-realism better characterized as a form of mathematical idealism in ontology and truth-value: both the existence of mathematical objects and the truth-values of mathematical propositions depend on the mind.
Brouwer proposed various articulations of intuitionism throughout his career. The germ of his basic views on mathematical intuition, logic, and language can be traced back as far as his dissertation on the foundation of mathematics (Brouwer 1907). This section concentrates on the presentations found in Brouwer’s late writings. Their main distinctive feature is that the development of intuitionistic mathematics out of intuition is more methodically described by means of two acts (Brouwer 1952, 1954, 1981). These two acts are performed by an idealized mind known as the creating subject. The most comprehensive formulations of the background philosophy that he adopted to justify his two acts of intuitionism can be found in (Brouwer 1905, 1929, 1933, 1948b). The following sections examine each one of these acts in turn and discuss the role they play in Brouwer’s philosophical justification of his intuitionistic mathematics.
a. The First Act of Intuitionism
Intuitionistic mathematics is a creation of the mind and independent of language. Brouwer emphasizes this mental dimension of intuitionism with the first act, which at the same time separates mathematics from language and introduces the “intuition of twoity” as the foundation on which the whole edifice of intuitionistic mathematics rests:
[…] Intuitionistic mathematics is an essentially languageless activity of the mind having its origin in the perception of a move of time. This perception of a move of time may be described as the falling apart of a life moment into two distinct things, one of which gives way to the other, but is retained by memory. If the twoity thus born is divested of all quality, it passes into the empty form of the common substratum of all twoities. And it is this common substratum, this empty form, which is the basic intuition of mathematics (Brouwer 1981, pp. 4-5).
As this is a rather brief and difficult passage, the rest of this subsection takes a closer look at the mental construction of mathematical objects according to the intuition of twoity and explores the autonomy of intuitionistic mathematics from logic and language. This article draws from other works by Brouwer to clarify these aspects of the first act.
i. Mental Construction in Intuition
The empty twoity is the form “one thing and then another thing” shared by all twoities, where a twoity is a pair of a first sensation followed by a second one. We intuit a twoity by perceiving the succession of two experiences in time, which is understood not in the external sense of scientific time but in the internal sense of our time consciousness. For example, suppose that we listen to the sounds of two successive ticks of a clock. One tick is heard at the initial present stage of our awareness, then as the second tick is heard at a new stage, the first tick does not completely disappear from consciousness but is retained in our memory as just past. Finally, a twoity that pairs these sounds is experienced by thinking of both ticks together:
[…] The basic intuition of mathematics (and of every intellectual activity) as the substratum, divested of all quality, of any perception of change, a unity of continuity and discreteness, a possibility of thinking together several entities, connected by a ‘between’, which is never exhausted by the insertion of new entities. Since continuity and discreteness occur as inseparable complements, both having equal rights and being equally clear, it is impossible to avoid one of them as a primitive entity, trying to construe it from the other one, the latter being put forward as self-sufficient; in fact it is impossible to consider it as self-sufficient (Brouwer 1907, p. 8).
Some comments on the passage quoted above are needed. The formulation of the intuition of twoity found in Brouwer’s statement of the first act is repeated almost verbatim many times in his later writings, but, as we can see above, he already held these ideas in his dissertation (Brouwer 1907, p. 9). The greatest difference is that in these early formulations he emphasizes a “between” that connects the two elements of our perception of two things in time, and he insists that the discrete and the continuous are inseparable primitive complements of each other. From the introduction of the second act on, Brouwer almost never speaks of the irreducible continuum, but he still stresses the creation of the between with a twoity (Brouwer 1981, p. 40). The postulation of choice sequences and species with the second act merely allows Brouwer to recognize the intuitive continuum as a “matrix of ‘point cores’” and to study it mathematically.
Intuitionistically, the positive integers 1, 2, 3, … are represented in terms of finite ordinals. The intuition of twoity generates in the first instance the ordinal numbers two and one, as well as all other finite ordinal numbers by the indefinite repetition of the process. As the empty twoity is the form of two paired things, it can be viewed as a pair of units. One of its elements is defined as the number 1, and the empty twoity itself is the number 2. The discrete fragment of intuitionistic mathematics arises from the basic operation of mentally constructing new twoities in intuition by pairing the empty twoity and its units:
This empty two-ity and the two unities of which it is composed, constitute the basic mathematical systems. And the basic operation of mathematical construction is the mental creation of the two-ity of two mathematical systems previously acquired, and the consideration of this two-ity as a new mathematical system (Brouwer 1954, p. 2).
We should stress that, as for Brouwer the empty twoity is intuited before its elements are, the number 2 is constructed before the number 1 in his mathematical universe. This is because 1 can only be constructed by projecting a unit out of the empty twoity. The intuitive nature of projection is often emphasized by van Atten (2004, §1.1; 2024, p. 63). Brouwer thinks we could not have an empty unity as the starting point because the operation of adding another unit would already presuppose the intuition of twoity to form a pair of units:
The first act of construction has two discrete things thought together […] F. Meyer […] says that one thing is sufficient, because the circumstance that I think of it can be added as a second thing; this is false, for exactly this adding (i.e. setting it while the former is retained) presupposes the intuition of two-ity; only afterwards this simplest mathematical system is projected on the first thing and the ego which thinks the thing (Brouwer 1907, p. 179, fn. 1).
Indeed, the construction of all the positive integers proceeds as follows:
The inner experience (roughly sketched):
twoity;
twoity stored and preserved aseptically by memory;
twoity giving rise to the conception of invariable unity;
twoity and unity giving rise to the conception of unity plus unity;
threeity as twoity plus unity, and the sequence of natural numbers;
mathematical systems conceived in such a way that a unity is a mathematical system and that two mathematical systems, stored and aseptically preserved by memory, apart from each other, can be added; etc. (Brouwer 1981, p. 90).
It is important to note that Brouwer omits the construction of the number zero here. It is unclear whether this has to do with its lack of a direct representation in intuition, since it was common to exclude zero as a natural number during his time. In any event, Brouwer introduces zero into his mathematical ontology later as an integer. Kuiper (2004, ch. 2) proposes alternative ways to accommodate zero and describes constructions of the natural numbers and rationals from the intuition of twoity relying on the early notion of betweenness advocated in Brouwer’s dissertation. A recent interpretation of the construction of the positive integers based on Brouwer’s mature writings is developed by Bentzen (2023b, section 3.1) based on intuitive operations of pairing, projecting, and recollecting.
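Brouwer, of course, regards this construction as a languageless activity of the mind, so no program can capture it. Still, purely as an illustration of the iterated pairing just described (the names and representation below are the present sketch’s, not Brouwer’s), one might model each positive integer as a nested structure of “twoities”:

```python
# Illustrative sketch only: each positive integer is modeled as nested
# pairings. 1 is a bare unit (projected out of the empty twoity), 2 is
# the twoity of two units, and n+1 pairs the construction of n with a
# fresh unit ("n-ity as (n-1)-ity plus unity").

UNIT = ()  # a single unit

def twoity(first, second):
    """Pair two previously acquired systems into a new system."""
    return (first, second)

def construct(n):
    """Build a representation of the positive integer n."""
    if n == 1:
        return UNIT
    return twoity(construct(n - 1), UNIT)

def count(system):
    """Recover the integer a construction represents."""
    if system == UNIT:
        return 1
    first, second = system
    return count(first) + count(second)

assert construct(2) == ((), ())        # the twoity of two units
assert count(construct(5)) == 5        # "fiveity" counts as five units
```

The indefinite repeatability of `twoity` mirrors Brouwer’s point that the process of creating new twoities from previously acquired systems can be iterated without end.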
The intuition of twoity does not just yield the positive integers. Brouwer maintains that the first infinite ordinal number \(\omega\) and all subsequent countable ordinals can be obtained by the indefinite repetition of the creation of twoities in time (Brouwer 1907, pp. 144-145). The first act of intuitionism is therefore also able to produce the sequence of mathematical objects \(\omega,\omega+1,\omega+2,…,\omega + \omega, …,\omega \times \omega,\omega \times \omega + 1,…\), but it excludes uncountable ordinals. In fact, in his early writings, Brouwer can be found stressing that intuitionistic mathematics only recognizes the existence of countable sets (Brouwer 1913, p. 58). Recall that a set is countable if its elements can be put in a one-to-one correspondence with the elements of a subset of the set of natural numbers. Just as in classical mathematics, the continuum also remains uncountable intuitionistically. Of course, this does not mean that Brouwer is prepared to deny the existence of the continuum, but only that it cannot be reduced to a totality that is supposed to exist as a whole (Brouwer 1907, pp. 144-149). Intuitionistically, we can only make sense of an uncountable set as a “denumerably unfinished set”, namely, a set such that as soon as we think we have constructed more elements than those in a countable subset of it, we can immediately find new elements which should also belong to it. Notice that, after the introduction of the second act, this vague notion of set assumed in Brouwer’s early writings is abandoned in favor of his mature conception of species. In his later works the word “set” (Menge) is reserved for a spread, a particular kind of species of choice sequences. Choice sequences and spreads are examined in a later section.
ii. Logic and Language
Brouwer stresses that language can only play a non-mathematical auxiliary role in the construction of objects out of the intuition of twoity. For example, language can be used as a memory aid and to communicate some mathematical constructions to others, but what is written down can only be justified as an expression of our mental acts. This is why Brouwer maintains against Hilbert that freedom from contradiction can never guarantee existence. From the intuitionistic standpoint, to exist is to be constructed in intuition:
It is true that mathematics is quite independent of the material world, but to exist in mathematics means: to be constructed by intuition; and the question whether a corresponding language is consistent, is not only unimportant in itself, it is also not a test for mathematical existence (Brouwer 1907, p. 177).
In its purest form, intuitionistic mathematics consists of acts of construction in intuition. Logic and language only come into play later in its communication. This independence of mathematics from logic immediately leads to a repudiation of logicism, for, intuitionistically, mathematics is not part of logic. Rather, it is logic that is a part of mathematics:
While thus mathematics is independent of logic, logic does depend upon mathematics: in the first place intuitive logical reasoning is that special kind of mathematical reasoning which remains if, considering mathematical structures, one restricts oneself to relations of whole and part; the mathematical structures themselves are in no respect especially elementary, so they do not justify any priority of logical reasoning over ordinary mathematical reasoning (Brouwer 1907, p. 127).
Despite Brouwer’s general downplaying of logic, he often stresses that if our mathematical constructions are put into words, the logical principles of the law of identity, the law of noncontradiction, and the Barbara syllogism can always be used to arrive at linguistic descriptions of new constructions (Brouwer 1908, pp. 155-156). Nevertheless, he notes that the same cannot in general be said of the law of excluded middle, namely, the principle that, for any proposition, either that proposition or its negation holds. The law of excluded middle must be rejected in intuitionistic mathematics because it is taken to mean that every mathematical proposition can either be proved or refuted. The constant presence of open problems in mathematics indicates that this is not the case:
The question of the validity of the principium tertii exclusi is thus equivalent to the question concerning the possibility of unsolvable mathematical problems. For the already proclaimed conviction that unsolvable mathematical problems do not exist, no indication of a demonstration is present (Brouwer 1908, p. 156).
The rejection of the law of excluded middle is very subtle. To avoid confusion, notice that intuitionism simply does not accept that this law is generally valid for every proposition. It is not denied that the law might hold for certain propositions. Indeed, Brouwer insists that there will always be admissible uses of the law of excluded middle when dealing with properties about finite sets (Brouwer 1981, pp. 5-6). This is because an exhaustive verification running through the elements of the set one by one will always terminate. This leads to a proof or refutation of the finitary proposition in question. But we have to be careful when working with infinite sets because an exhaustive verification is not available to us. The repudiation of the law of the excluded middle for infinite domains is a direct product of Brouwer’s view of intuitionistic mathematics as an activity of the finite human mind.
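Brouwer’s concession for finite sets amounts to the observation that a terminating decision procedure exists: exhaustive search settles “some element satisfies A, or none does.” A minimal sketch (the function names are illustrative, not drawn from the literature):

```python
# For a finite set, the instance of excluded middle
#   "A holds for some element, or A holds for no element"
# is always decidable: exhaustive search terminates with a proof
# (a witness) or a refutation (an empty remainder). For infinite
# sets no such exhaustive procedure is available in general.

def decide_exists(finite_set, property_A):
    """Return True if some element satisfies A, False if none does.
    Guaranteed to terminate because the set is finite."""
    for x in finite_set:
        if property_A(x):
            return True   # witness found: the left disjunct holds
    return False          # search exhausted: the right disjunct holds

# Is there an even number among {1, 3, 5, 8}? The search finds 8.
assert decide_exists({1, 3, 5, 8}, lambda n: n % 2 == 0) is True
# And among {1, 3, 5}? The search terminates with a refutation.
assert decide_exists({1, 3, 5}, lambda n: n % 2 == 0) is False
```

For an infinite domain the loop need never terminate, which is exactly why the intuitionist withholds the law of excluded middle there.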
When discussing logic in the first act of intuitionism, Brouwer often connects the rejection of the law of excluded middle to the existence of what he calls “fleeing” properties. Roughly, a fleeing property is a property of the natural numbers such that for every individual number it can be decided whether it holds or not, but we do not know a particular number for which it holds, nor that there is no number for which it holds. Now, Brouwer often refused to adopt logical symbolism in his writings, perhaps due to his aversion to formalization, but we will not hesitate to employ formal notation in this entry for the sake of precision. We say that a property \(A\) of the natural numbers is fleeing if the following three conditions are met:
we know that \(\forall n (A(n) \lor \neg A(n))\);
it is not known so far whether \(\exists n A(n)\);
it is not known so far whether \(\neg \exists n A(n)\).
Every fleeing property therefore provides an unsettled instance of the excluded middle. A property may cease to be fleeing the moment an \(n\) is found such that \(A(n)\), or a proof emerges that no such number exists, in which case \(\exists n A(n) \lor \neg \exists n A(n)\) is decided. But there will always be other fleeing properties available to us. In fact, every open problem about the natural numbers naturally gives rise to a fleeing property, as the following example shows. Suppose that \(A(n)\) holds iff \(n\) is a counterexample to Goldbach’s Conjecture, which states that every even natural number greater than \(2\) is a sum of two primes. Since as of this writing this proposition remains a conjecture, we cannot tell whether there exists a counterexample, nor whether a counterexample is impossible. Yet, given any particular number, we can at least in principle, ignoring practical limitations of time, verify whether it is a counterexample or not, because, once again, we are dealing with a decidable property of the natural numbers. Put differently, there exists an effective method for checking, for each number, whether the property holds or not.
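The decidability of each instance \(A(n) \lor \neg A(n)\) can be made concrete with a small sketch. The following Python check (a hypothetical illustration; the helper names are our own) decides, for any fixed \(n\), whether \(n\) is a Goldbach counterexample, even though the existential questions remain open:

```python
def is_prime(k):
    """Trial-division primality test; adequate for small inputs."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True

def is_goldbach_counterexample(n):
    """Decide A(n): n is an even number > 2 that is NOT a sum of two primes.

    Each individual instance is decidable by finite search; what remains
    open is whether ANY n satisfies A, or whether none can."""
    if n <= 2 or n % 2 != 0:
        return False
    return not any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))

# Each instance A(n) ∨ ¬A(n) is decided by simply running the check:
print(all(not is_goldbach_counterexample(n) for n in range(4, 200, 2)))  # → True
```

Running the check for each even number below 200 decides every one of those instances; no finite run, however long, settles \(\exists n A(n)\) itself.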
Intuitionistic arithmetic is not very different from its classical counterpart. One reason for their similarity is that equality between natural numbers is decidable, so \(\forall n \forall m (n = m \lor \neg n = m)\) is an intuitionistically valid instance of the law of excluded middle. This is why Brouwer repeatedly insists that on the basis of the first act alone classical discrete mathematics “can be rebuilt in a slightly modified form” (Brouwer 1952, p. 142). Yet, there remain several theorems from classical number theory that cannot be proved intuitionistically. Perhaps the most notable case in point is the absence of the least number principle, which asserts that if a property of the natural numbers has a witness then it has a least one. To be exact, it can be expressed symbolically as \(\exists n A(n) \to \exists n (A(n) \land \forall i (i < n \to \neg A(i)))\). Its failure in intuitionistic arithmetic is due to the presence of fleeing properties. To see this, let \(B\) be a property such that \(B(n)\) holds for all \(n > 0\) but \(B(0)\) holds iff \(\exists n A(n)\), where \(A\) is the property considered above tracking counterexamples to Goldbach’s Conjecture. (If this conjecture is settled one day, we can just pick another open problem to replace \(A\).) If the least number principle were to hold for \(B\), we would already know whether \(\exists n A(n)\) or not. Given that \(A\) is a fleeing property, this is currently undecided (Posy 2020, section 2.2.1). Clearly there are some properties for which the least number principle holds. It is just not valid for every property in the intuitionistic setting, as it is in classical number theory.
b. The Second Act of Intuitionism
We have seen that the first act of intuitionism postulates the construction of the positive integers in intuition as finite ordinals and all countable infinite ordinals. We also saw that intuitionistic arithmetic already diverges to some extent from classical arithmetic. It is however in its distinctive treatment of the continuum that the greatest differences between classical and intuitionistic mathematics begin to show. To arrive at a satisfactory approach to real analysis from the intuitionistic standpoint, Brouwer thought that more powerful tools than those introduced in the first act were required. The second act postulates the admission of species and choice sequences into the domain of intuitionistic mathematical objects:
Admitting two ways of creating new mathematical entities: firstly in the shape of more or less freely proceeding infinite sequences of mathematical entities previously acquired (so that, for example, infinite decimal fractions having neither exact values, nor any guarantee of ever getting exact values are admitted); secondly in the shape of mathematical species, i.e. properties supposable for mathematical entities previously acquired, satisfying the condition that if they hold for a certain mathematical entity, they also hold for all mathematical entities which have been defined to be ‘equal’ to it, definitions of equality having to satisfy the conditions of symmetry, reflexivity and transitivity (Brouwer 1981, p. 8).
Once again, this is not an entirely clear formulation. In particular, Brouwer tends to be rather brief in his articulations of what precisely a choice sequence is. The first place where choice sequences are accepted is (Brouwer 1914). However, it is not until later that they begin to be put to use in the intuitionistic modeling of the continuum by means of spreads. Without spending much time discussing these novel ideas, Brouwer simply introduces choice sequences as the paths through a spread (Brouwer 1918, p. 3). The notion of spread is examined more carefully when looking at species in a later section, but for now it will suffice to think of a spread as an infinitely branching tree described according to a certain law, where of course infinity is understood only in the sense of potential infinity. So, a choice sequence is viewed as a path of nodes growing indefinitely with some degree of freedom. While Brouwer does not go far beyond this initial vague explanation, he later revisits this characterization of choice sequences, adding that we can restrict their freedom of continuation by a law after each choice as we wish (Brouwer 1925, fn. 3). Multiple passing remarks intended to elaborate on law restrictions can be found in his writings. For a period, Brouwer even hesitated about accepting higher-order restrictions, which allow for a restriction imposed on restrictions of choice of elements, and so on. But the details are unimportant for our purposes in this entry, given that the idea, first introduced in (Brouwer 1942), was ultimately abandoned a decade later (Brouwer 1952, p. 142 fn.*). We shall therefore focus here on the ordinary conception of choice sequences with first-order restrictions. Readers interested in higher-order restrictions can see (van Atten and van Dalen 2002, pp. 331-335).
The above articulation of the second act also leaves out the two special ingredients that Brouwer invokes in his intuitionistic approach to real analysis, namely, his principles of continuity and bar induction. It is unclear whether Brouwer omitted them from his formulation of the second act because he viewed them as mere consequences of it. In any case, he never attempted to offer a justification for these principles in his writings, though as we shall see in the remainder of this section, they are far from obvious. Regardless, the principles of continuity and bar induction are so important that intuitionistic analysis cannot be done without them. They are the ingredients that lead to notable results, such as the uniform continuity theorem, that actually contradict classical analysis. In the literature, results like this are referred to as “strong counterexamples” because they reveal that intuitionistic analysis is not merely a restriction of its classical counterpart, as in the case of intuitionistic and classical arithmetic, but rather an incomparable alternative to it.
i. Choice Sequences and Continuity
Choice sequences are potentially infinite sequences of mathematical objects. That is, at every stage in time only a finite initial segment of the sequence has been constructed, but the sequence can always be continued with the inclusion of a new element. For example, if so far the creating subject has only constructed some initial segment \(\langle 7, 5, 12, 8, 23 \rangle\), they can later pick another number of their choosing to extend the sequence with, say \(42\). It is not possible to make an actually infinite number of choices of elements. Due to the inherently finite nature of the human mind, the construction process can never be finished.
What is special about choice sequences, and what distinguishes intuitionism from all other variants of constructive mathematics such as Bishop’s brand of constructivism, is that it is explicitly admitted that their elements need not be given by some law. Simply put, a law is any procedure that our minds can express by means of an algorithm. The notion of choice sequence admits, in addition to elements determined algorithmically, elements selected arbitrarily by a method of free choice. We sometimes say that these elements are chosen by the free will of the creating subject. Throughout his career Brouwer saw free choice as a necessary ingredient to go beyond the computable reals, the real numbers which can be computed by an algorithm and are classically countable. Heyting (1975, p. 607) notes that this conviction was invalidated when Bishop showed that the set of computable reals remains uncountable constructively if all functions admitted are algorithmic. In any case, choice sequences remain significant for their philosophical and mathematical interest.
It is time to take a closer look at what choice sequences are. Although, as we saw earlier, Brouwer does not actually say much about them, it is common in the literature to think of a choice sequence as a generalization of an algorithmic sequence (Troelstra 1977, section 1.7), that is, a sequence whose elements are all determined by an algorithm. Unless explicitly stated otherwise, we shall follow the usual convention of denoting algorithmic sequences by the small Latin letters \(f, g, …\) and choice sequences by the small Greek letters \(\alpha, \beta, …\) Now, to see how this generalization works, note that, given any algorithmic sequence \(f\), an algorithm determines exactly one value \(f(n)\) for every positive integer \(n\). In contrast, given a choice sequence \(\alpha\), we still retain \(\alpha(n)\) as a value assigned to each \(n\), but we cannot require that this value be effectively calculable by an algorithm. In general, at every stage only an initial segment may be known. But recall that Brouwer (1925, 1942, 1952) does emphasize that law restrictions can be imposed at will on the freedom of continuation of the sequence after each step of its construction. To express this more precisely, we may demand that, for any finite initial segment \(\langle \alpha(1), …, \alpha(n) \rangle\), there is an algorithm determining a non-empty range of possible values for \(\alpha(n+1)\) onward. Simply put, instead of stipulating that an algorithm determines exactly one value, as in the case of algorithmic sequences, we just require that at each stage of the construction of a choice sequence its range of possible further values be determinable algorithmically. We can calculate the restrictions and even subject them to further restrictions at some later point in the ever-unfinished construction of the sequence. What need not be calculable ahead of time is the exact value of \(\alpha(n)\) for every \(n\).
In sum, a choice sequence is given by an initial finite segment and a law that determines at every subsequent step of the sequence its range of possible future values. From this it immediately follows that, from the intuitionistic standpoint, an algorithmic sequence is nothing but a special case of a choice sequence in which its range of possible values always has exactly one element.
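A toy model may help fix ideas. The following Python sketch (our own illustration, not a standard construction) represents a choice sequence by its finite initial segment together with a law giving the admissible range of next values; the actual next value remains a free choice:

```python
class ChoiceSequence:
    """Toy model of a choice sequence: a finite initial segment plus a
    law that, at each stage, algorithmically determines the non-empty
    range of admissible next values.  Which value is actually chosen
    from that range is left free."""

    def __init__(self, segment, allowed_next):
        self.segment = list(segment)      # what has been constructed so far
        self.allowed_next = allowed_next  # law: segment -> set of admissible values

    def extend(self, value):
        """Freely choose the next element, within the law's range."""
        if value not in self.allowed_next(self.segment):
            raise ValueError("choice violates the current restriction")
        self.segment.append(value)

# An algorithmic (lawlike) sequence is the special case where the range
# is always a singleton; here instead the law leaves genuine freedom:
alpha = ChoiceSequence([7, 5, 12, 8, 23], lambda seg: set(range(100)))
alpha.extend(42)              # the free choice from the earlier example
print(alpha.segment)          # → [7, 5, 12, 8, 23, 42]
```

Only the finite segment ever exists; the object is permanently open-ended, which the model reflects by never materializing more than `segment`.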
Now, let us look at some examples of choice sequences to illustrate the idea. The common practice in the literature is to distinguish between at least two different kinds of choice sequences depending on the information available to the creating subject:
lawlike sequences are choice sequences that are algorithmic. The simplest example is the sequence of positive integers, whose elements are completely determined by the number one and an effective rule that repeatedly applies the successor operation. But any more complex sequence calculable by an algorithm will do.
lawless sequences are choice sequences subject to no law restrictions. At each stage in their construction only a finite initial segment of them can be known. More precisely, after a fixed initial segment of the sequence is stipulated at the outset, there is no law limiting the range of possible future values. Perhaps the most well-known example is that of a sequence obtained by throwing a die with six faces \(\{1,2,3,4,5,6\}\). For example, if we let \(\alpha(1)=1\) be its initial segment, from this point on we cannot know in advance anything more than that \(\alpha(i) \in \{1,2,3,4,5,6\}\) for all \(i > 1\). The terminology “lawless” was suggested by Gödel and introduced by (Kreisel 1968).
Some remarks are in order to avoid confusion. First, note that lawlike sequences are to be distinguished from general recursive sequences, for, as Heyting (1966) observes, the intuitionist does not accept Church’s thesis with respect to laws, which essentially states that every lawlike sequence can be computed by a Turing machine. Even if it might be compelling to identify “mechanically computable” with “calculable by a Turing machine”, there are still good reasons to abstain from Church’s thesis. Given that intuitionism is a product of the human mind, a law is understood as what is humanly computable, and that is to be distinguished from a mechanically computable process (Troelstra 1977, section 1.3). From the intuitionistic perspective, as the states of a mental calculation involve intentional aspects having to do with the basic mathematical intuition, for all we know, they might go beyond the artificial states of a Turing machine (Tieszen 1989, p. 81). In Brouwer’s work the notion of a law is accepted as primitive and left without a rigorous definition. His introduction of choice sequences predates the work of Church and Turing by nearly two decades, but even after that he appears never to have commented on the status of Church’s thesis. For more on the acceptance of this thesis, see (McCarty 1987). Although most intuitionists tend to reject Church’s thesis, McCarty provides a systematic investigation of different ways in which it can be understood in intuitionistic mathematics. The alternative variety of constructivism that accepts this thesis is known as recursive constructivism.
Second, in the example given above of a lawless sequence, we have limited ahead of time the domain of the sequence to a proper subset of the positive integers. Put differently, we have restricted all elements of the sequence to \(\{1, 2, 3, 4, 5, 6\}\) from the very beginning, ruling out any other positive integers from the lawless sequence. However, how can this be possible if lawless sequences are supposed to be subject to no law restrictions? There is no contradiction, for our freedom to create a lawless sequence guarantees that the range of the sequence can be any range we want. We are always allowed to impose a “general a priori restriction” that all values belong to a certain domain (Troelstra 1977, section 2.2). It is conceivable, for instance, to have a lawless sequence of coin tosses that only has values in \(\{1, 2\}\), depending on whether we get heads or tails. What we cannot do is impose a law during the construction of a lawless sequence to further restrict that domain.
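The die example, with its general a priori restriction, can be simulated as follows (randomness here merely stands in for free choice, since a genuine lawless sequence is not given by any algorithm; the function name is our own):

```python
import random

def lawless_values(domain, rng=random):
    """Generator simulating a lawless sequence under the general a
    priori restriction that every value lies in `domain`.  No further
    law constrains the choices; random picks only SIMULATE the free
    will of the creating subject."""
    pool = list(domain)
    while True:
        yield rng.choice(pool)

die = lawless_values({1, 2, 3, 4, 5, 6})
segment = [next(die) for _ in range(5)]   # the finite part known "so far"
assert all(v in {1, 2, 3, 4, 5, 6} for v in segment)
```

At any moment only the finite `segment` is available, and nothing beyond membership in the fixed domain can be predicted about later values.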
Finally, the distinction between lawlike and lawless sequences is not exhaustive. It is a very common mistake to assume that all non-lawlike sequences are lawless. We simply call a sequence non-lawlike if it is a choice sequence that is not lawlike. Of course, every lawless sequence is non-lawlike by definition, but the converse is false. The following example illustrates why. Given two lawless sequences \(\alpha\) and \(\beta\), define a new choice sequence \(\gamma\) by \(\gamma(k)=\alpha(k)+\beta(k)\). Then \(\gamma\) is subject to a law restriction, but is not given algorithmically. Another example is the choice sequence \(\delta\) which oscillates between \(\alpha\) and \(\beta\) on even and odd arguments, that is, \(\delta(2k)=\alpha(k)\) and \(\delta(2k+1)=\beta(k)\). In general, non-lawlike sequences enjoy an intermediate degree of freedom, involving both law restrictions and free choice. The best-known example of a class of choice sequences that exhibits this intermediate nature is that of hesitant sequences (Troelstra and van Dalen 1988, section 4.6.2), namely, choice sequences that start out lawless but for which a law may or may not be adopted at some point to determine future values. If a law is accepted from the very beginning, or never accepted, the particular sequence is in fact lawlike or lawless respectively; otherwise it remains neither.
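The two combinations \(\gamma\) and \(\delta\) can be sketched directly. In this illustration lawlike stand-ins simulate \(\alpha\) and \(\beta\) (which in the text are lawless), and the sample values are our own:

```python
# Hypothetical sample values standing in for the lawless α and β.
alpha = lambda k: k + 1          # α: 1, 2, 3, …
beta  = lambda k: 10 * (k + 1)   # β: 10, 20, 30, …

# γ(k) = α(k) + β(k): a law-governed combination, yet not algorithmic
# when its sources are genuinely lawless.
gamma = lambda k: alpha(k) + beta(k)

# δ oscillates between α and β: δ(2k) = α(k), δ(2k+1) = β(k).
delta = lambda i: alpha(i // 2) if i % 2 == 0 else beta(i // 2)

print([gamma(k) for k in range(3)])   # → [11, 22, 33]
print([delta(i) for i in range(6)])   # → [1, 10, 2, 20, 3, 30]
```

The combining rules are fully lawlike; the resulting sequences are nonetheless non-lawlike because their values depend on the free choices feeding \(\alpha\) and \(\beta\).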
Continuity
Since choice sequences are permanently in a process of growth and can never be admitted as completed mathematical objects, how can we do mathematics with them? First of all, when dealing with choice sequences, we must carefully distinguish between intensional and extensional equality. As one might expect, two choice sequences \(\alpha\) and \(\beta\) are extensionally equal iff they have the same values for the same arguments: \[\alpha = \beta \leftrightarrow \forall n \alpha (n) = \beta (n).\]
Strictly speaking, however, we should also distinguish between the laws according to which the sequence might be given to us (Dummett 1977, section 3.1). Simply put, two sequences might be extensionally equal but still be intensionally different if they are given by different laws. We commonly write \(\alpha \equiv \beta\) to indicate that \(\alpha\) and \(\beta\) are intensionally equal. We are often especially interested in properties that respect extensionality in intuitionistic mathematics. We say that a property \(A\) of choice sequences is extensional iff the following holds: \[\forall \alpha \forall \beta (A(\alpha) \land \alpha = \beta \to A(\beta) ).\]
It is worth noting that a choice sequence might be extensionally equal to a lawlike sequence but not necessarily given as a lawlike one. It is not even generally possible to determine when a supposedly lawless sequence actually turns out to determine a lawlike sequence extensionally. To borrow Borel’s metaphor, if we give an infinite amount of time to a monkey hitting number keys at random on a typewriter, then in principle the monkey could end up reproducing the Fibonacci sequence, for instance. That is to say, the creating subject cannot be certain that they are not following an unknown law when constructing a choice sequence until the very end, and yet the construction will never end. To be clear, this does not mean that a lawless sequence can in fact be lawlike. We may say the sequence is intensionally lawless but extensionally lawlike in the sense above. In the remainder of this subsection, it is assumed that all properties are extensional.

Now, to motivate Brouwer’s principle of continuity, it might be useful to start with some general remarks on the acceptance of the axiom of choice in intuitionism. The simplest version of the axiom states that if for every natural number \(n\) there is some \(m\) such that \(A(n,m)\), then there exists a function \(f\) such that \(A(n,f(n))\) for every \(n\). In symbols: \[\forall n \exists m A(n,m) \to \exists f \forall n A(n,f(n)).\]
To avoid confusion, we stress that here \(f\) denotes a function and not a lawlike sequence. The above principle is in fact perfectly valid intuitionistically. It is a direct consequence of the meaning of the intuitionistic logical constants, for the antecedent has the form \(\forall x \exists y A(x,y)\) and its truth presupposes a procedure that transforms every object \(x\) into a pair that contains a specific \(y\) and a proof that \(A(x,y)\). We will look more closely at this informal semantics when studying Heyting’s meaning explanations in 2.2. The point is that the constructive meaning of existence yields an “operation” that in this particular case serves as a choice function because intensional and extensional equality coincide. Therefore, controversies surrounding the axiom of choice arise in intuitionistic mathematics only when intensional and extensional equality diverge. The reader is referred to (McCarty et al. 2023) for a comprehensive overview of the status of the axiom of choice in various intuitionistic and constructivist systems. In the general version of the axiom of choice, we drop the above restriction to the natural numbers and consider any set (or rather species). In this form, the axiom is false because intensional and extensional equality need not always agree. We have already seen that this is not the case for choice sequences in general.
What does this have to do with continuity? We can think of a continuity principle as a more assertive version of the axiom of choice that tells us how to manipulate choice sequences. If instead of the natural number \(n\) we consider a choice sequence \(\alpha\) in the above formula, the resulting version of the axiom of choice \(\forall \alpha \exists m A(\alpha,m) \to \exists f \forall \alpha A(\alpha,f(\alpha))\) is false. Even when \(A\) is an extensional property, the function \(f\) need not respect extensional equality. So, if \(\alpha = \beta\) then \(f(\alpha) = f(\beta)\) might not hold, unless \(\alpha \equiv \beta\) also holds (Dummett 1977, p. 57). It is possible that \(A(\alpha,f(\alpha))\) is true but not \(A(\beta,f(\beta))\), or vice versa, contradicting the assumption that \(A\) is an extensional property. To overcome this obstacle, intuitionists need to impose an additional requirement on the admissible choice functions to guarantee the preservation of extensional properties, and this is where continuity enters the picture. One solution is to restrict them to those belonging to a class of “neighborhood functions” \(\mathbf{K}\) that allow us to determine their value in a finitistic way. They are usually denoted by the letter “\(e\)” rather than “\(f\)” in the literature because of their special status. If \(e \in \mathbf{K}\), then \(e(\alpha)\) is calculable from a finite amount of information known about \(\alpha\), such as a finite initial segment of the sequence or the law restrictions imposed up until a certain stage. The value \(e(\alpha)\) never depends on all the elements of the sequence, but only on what we know about \(\alpha\) at some point. This means that such a function \(e\) is in effect only ever applied to initial segments.
So, as the domain of a neighborhood function actually consists of finite sequences, strictly speaking, the notation “\(e(\alpha)\)” makes a category mistake. However, we can treat \(e(\alpha)\) as an abbreviation for the value of \(e\) on the smallest sufficiently long initial segment of \(\alpha\). Put another way, if, intuitively, \(e(\overline{\alpha}(n))=0\) means that the initial segment \(\overline{\alpha}(n)\) is not sufficiently long to compute the value of \(e\) for \(\alpha\), then \[e(\alpha) \text{ is defined iff } \exists n (e(\overline{\alpha}(n)) > 0),\]\[e(\alpha)=k \text{ iff } \exists n (e(\overline{\alpha}(n)) = k + 1 \land \forall m (m < n \to e(\overline{\alpha}(m)) = 0)).\]
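The abbreviation just introduced has a simple computational reading. The sketch below (a hypothetical illustration with made-up names, and with a practical search bound that the genuine unbounded search would not have) probes ever longer initial segments until \(e\) returns a positive value:

```python
def apply_neighborhood(e, alpha, bound=1000):
    """Compute e(α) following the convention above: e(segment) == 0
    means "segment not yet long enough"; e(segment) == k + 1 means
    "the value is k".

    `alpha` maps positions to values (a sufficiently long initial
    segment suffices in practice); `bound` is a practical cutoff,
    standing in for an unbounded search."""
    for n in range(bound + 1):
        segment = tuple(alpha(i) for i in range(n))
        v = e(segment)
        if v > 0:
            return v - 1          # e(α) = k where e(ᾱ(n)) = k + 1
    raise ValueError("e undefined on this initial segment (within bound)")

# Example: e yields (first element) + 1 once at least 3 values are known.
e = lambda seg: seg[0] + 1 if len(seg) >= 3 else 0
print(apply_neighborhood(e, lambda i: i + 7))   # α = 7, 8, 9, …  → 7
```

Note that any \(\beta\) agreeing with \(\alpha\) on its first three values receives the same result, which is exactly the continuity behavior discussed next.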
Now, \(e(\alpha)=e(\beta)\) holds because \(e\) is continuous in a precise sense (Dummett 1977, p. 58). Although this is not the place to enter into detail, the basic idea is this. Suppose for the moment that we are dealing only with arbitrary, unrestricted choice sequences. Then the topological space in question can be studied as a Baire space, whose neighborhoods consist of all species of choice sequences sharing the same initial segment of some length. Readers need not worry if the concept of a Baire space is unfamiliar to them, for it essentially describes the universal spread soon to be discussed in 1.2.2. Topologically, \(e\) serves as a continuous function from this Baire space to the natural numbers with the discrete topology, the space consisting of every subset of the natural numbers as a neighborhood. Continuity means that, given any choice sequence \(\alpha\), for every neighborhood \(N(e(\alpha))\), there exists a neighborhood \(M(\alpha)\) such that if \(\beta \in M(\alpha)\) then \(e(\beta) \in N(e(\alpha))\). The equality \(e(\alpha)=e(\beta)\) immediately follows because, under the discrete topology, the singleton \(\{ e(\alpha) \}\) is a neighborhood of its element. Notice that we only focused our attention on the universal spread in the example above for the sake of simplicity, but similar considerations also apply to neighborhood functions associated with choice sequences that admit restrictions.
Once again, it is not our interest to dive into topology in this survey. As far as intuitionism is concerned in the philosophy of mathematics, the important thing to keep in mind is that such a \(\forall\alpha\exists n\)-continuity principle results from the imposition of continuity on the otherwise problematic choice principle \(\forall \alpha \exists m A(\alpha,m) \to \exists f \forall \alpha A(\alpha,f(\alpha))\) mentioned above. The antecedent remains unchanged, but the consequent states that there exists a neighborhood function \(e\) such that for every \(\alpha\) for which \(e\) is defined, \(A(\alpha, e(\alpha))\) holds. If we write \(e(\alpha) > 0\) to mean that \(e\) is defined for \(\alpha\), then the resulting principle can be stated as follows:
\[\tag{C-N} \forall \alpha \exists n A(\alpha, n) \to \exists e \in \mathbf{K} \forall \alpha (e(\alpha) > 0 \to A(\alpha, e(\alpha))).\]
This is one of the formulations of Brouwer’s continuity principle. It is commonly described as the full version of continuity to distinguish it from a weaker formulation for the natural numbers that can be derived from it. Let the equality \(\overline{\alpha}(m) = \overline{\beta}(m)\) abbreviate the fact that the initial segments of length \(m\) of \(\alpha\) and \(\beta\) agree, which we can symbolically represent as \(\forall i < m (\alpha(i) = {\beta}(i))\). We can state the weak continuity principle as:
\[\tag{WC-N} \label{wcn} \forall \alpha \exists n A(\alpha, n) \to \forall \alpha \exists m \exists n \forall \beta (\overline{\beta}(m) = \overline{\alpha}(m) \to A(\beta, n)).\]
The only difference between these two principles is the consequent. Indeed, in (WC-N) the requirement that there be a neighborhood function is dropped. Informally, the consequent says that, given an \(\alpha\), we can find a fixed length \(m\) such that for some \(n\) and any \(\beta\) whose initial segment of length \(m\) agrees with that of \(\alpha\), \(A(\beta,n)\) holds. As van Atten and van Dalen (2002) note, Brouwer used this continuity principle for the first time to show that the set of numerical choice sequences is not enumerable. They also investigate justifications for versions of the weak continuity principle (WC-N). As they point out, this principle is evident for lawless sequences because we never know more than a finite segment of them in advance. But for lawlike sequences it is less obvious, since law restrictions might be taken into account.
These are not the only continuity principles considered in intuitionism. An even stronger form of continuity is usually discussed under the name of \(\forall\alpha\exists\beta\)-continuity. It can be regarded as a generalization of the full continuity principle (C-N) where the existential statement about numbers is instead about choice sequences. It is articulated by considering a partial higher-order function \(|\) that given a neighborhood function \(e\) and choice sequence \(\alpha\) returns a choice sequence \(e|\alpha\) if it satisfies some technical conditions (Dummett 1977, p. 60):
\[\tag{SC-N} \forall \alpha \exists \beta A(\alpha, \beta) \to \exists e \in \mathbf{K} \forall \alpha A(\alpha, e | \alpha).\]
The status of the \(\forall\alpha\exists\beta\)-continuity principle is more controversial. Dummett (1977, p. 222) sees it as a dubious principle, as “it is far from plain that it is intuitionistically correct”. While it might initially seem that \(\forall\alpha\exists\beta\)-continuity leads to contradictions in creating subject contexts, Posy (2020, pp. 73-74) explains that it is valid intuitionistically and that the clash only arises if one forgets about the restriction to extensional properties.
Continuity principles allow us to refute some classical theorems: they provide theorems that contradict results in classical mathematics. Iemhoff (2024) stresses that even the weak continuity principle implies the negation of a quantified version of the law of excluded middle, \(\neg\forall \alpha (\forall n \alpha(n)=0 \lor \neg \forall n \alpha(n)=0)\). More precisely, unlike intuitionistic logic, which is not incompatible with classically valid principles, intuitionistic mathematics actually refutes classical logic (not just classical mathematics)! The full continuity principle can be used to prove that every total real function is continuous in intuitionistic analysis. The intuitionistic construction of the real numbers is discussed in the next sub-section. To be clear, however, it should be emphasized that discontinuous functions can be defined, but they are not total (Posy 2020, p. 35). For an overview of these and other consequences of the continuity principles, such as a derivation of weak continuity (WC-N) from full continuity (C-N), see (Dummett 1977, pp. 60-67). A more general mathematical account is given by (Veldman 2021, sections 12-13).
ii. Species, Spreads, and Bar Induction
Species are the closest analog in intuitionism to sets as used in classical mathematics. But at the same time they exhibit significant differences that should be emphasized. In a nutshell, a species is a property that previously constructed mathematical objects may possess. When a species \(S\) is defined by a certain property \(A\), we write \(S=\{x \mid A(x)\}\). If \(A(a)\) holds, we say that a formerly obtained object \(a\) is an element of \(S\) and we write \(a \in S\). Brouwer always highlights the following facts when introducing the notion of species in the second act:
1. a species can be an element of another species, but never an element of itself. That is to say that predicativity is a built-in feature of species. This is how the intuitionist is able to circumvent circularities like Russell’s paradox. In a sense, there is a clear resemblance with the theory of types (Heyting 1956b, section 3.2.3). The universe of species can be stratified into a hierarchy depending on their order: first-order species may only have elements which are not themselves given as species; second-order species may only have as elements objects given as first-order species and so on.
2. a species always has an equality relation defined among its elements. The equality associated with the definition of a species \(S\) must always be an equivalence relation \(R\) such that if \(a \in S\) and \(R(a,b)\) then \(b \in S\). We write \(a =_S b\) to denote \(R(a,b)\). This ternary equality relation is to be distinguished from the binary relations of intensional and extensional identity considered previously for choice sequences, for instance.
3. we can redefine the same species time and again. To understand why this is useful, recall that only for entities constructed prior to the definition of a species may we ask whether they possess the corresponding property. When a species \(S\) is introduced, no mathematical object \(b\) constructed after this point in time can be a member of it, even if it turns out to satisfy the same property \(A\) that defined \(S\)! In order to accommodate this missing element, we can redefine the species \(S\) after constructing \(b\). Thus, roughly speaking, a species can change its membership relation over time. We never lose its former elements, but we might gain new ones after the redefinition.
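Point 3 can be illustrated with a toy model (our own, highly simplified): a species freezes the stock of previously constructed objects at definition time, so a later construction joins only after a redefinition:

```python
class Species:
    """Toy model of a species: membership is tested only against
    objects already constructed when the species is (re)defined,
    so an object constructed later is excluded until a redefinition."""

    def __init__(self, prop, constructed_so_far):
        self.prop = prop
        self.snapshot = list(constructed_so_far)   # freeze prior constructions

    def __contains__(self, x):
        return x in self.snapshot and self.prop(x)

constructed = [1, 2, 3]
S = Species(lambda n: n % 2 == 1, constructed)   # odd numbers built so far
constructed.append(5)                            # a new construction, b = 5
print(5 in S)                                    # → False: 5 postdates S
S = Species(lambda n: n % 2 == 1, constructed)   # redefine the species
print(5 in S)                                    # → True: new element gained
print(1 in S)                                    # → True: former elements kept
```

The redefined species keeps all former elements while gaining the new one, mirroring the temporal membership described above.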
Just as choice sequences necessarily grow over time, species too bear witness to the temporality of the second act of intuitionism. This should not come as a surprise, for, as already seen earlier, intuitionistic mathematics is based on the intuition of time. Besides, like choice sequences, species are intrinsically intensional, since two species are in fact one and the same only when given by the same property (van Atten 2004, pp. 6-7).
How are species used to construct the intuitionistic continuum? First, let us remark that the classical constructions of the integers \(\mathbb{Z}\) and the rationals \(\mathbb{Q}\) are essentially acceptable intuitionistically, except that we are dealing with species instead, and they are not regarded as finished totalities. They can be defined with the usual equivalence relations by treating the equivalence classes as species. But because of the intensionality that species carry with them, we need to maintain some fine-grained distinctions not present in classical set theory. For example, the species of natural numbers within the integers, \(\mathbb{N}_\mathbb{Z}\), and within the rationals, \(\mathbb{N}_\mathbb{Q}\), are not identical. The construction of the species \(\mathbb{R}\) is sometimes said to be analogous to the classical construction of the reals in terms of Cauchy sequences. This is true in a sense, but the description downplays the role of choice sequences. In the intuitionistic approach to mathematics, the notion of choice sequence first comes into play in the definition of \(\mathbb{R}\) as the species of equivalence classes of convergent choice sequences of rationals (Heyting 1956b, section 3.1.1). Let \(\alpha=\langle r_i \rangle_i\) be a choice sequence of rational numbers. We say that \(\alpha\) is convergent iff its elements eventually get closer and closer to one another. More precisely, \[\forall k \exists n \forall m (|r_n - r_{n+m}| <_{\mathbb{Q}} 2^{-k}).\]
Two choice sequences \(\alpha=\langle r_i \rangle_i\) and \(\beta=\langle s_i \rangle_i\) coincide, in symbols \(\alpha \simeq \beta\), when their elements eventually get closer and closer to one another: \[\forall k \exists n \forall m (|r_{n+m} - s_{n+m}| <_{\mathbb{Q}} 2^{-k}).\]
We call a convergent choice sequence of rational numbers a real number generator. Now, let a real number be the equivalence class \(\{ \beta \mid \alpha \simeq \beta \}\) for some real number generator \(\alpha\). Finally, the species \(\mathbb{R}\) is simply defined as the species of all real numbers. For accessible accounts of the intuitionistic continuum, see (Dummett 1977, section 2.2) or (Posy 2020, section 2.2.2). It is worth emphasizing that, unlike in Bishop’s constructive mathematics, for example, the intuitionistic real numbers are not convergent sequences of rational numbers themselves. Real numbers are still treated as equivalence classes in intuitionism, just as they are in classical mathematics. In this sense, intuitionism stays relatively close to classical mathematics.
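Although choice sequences are not functions and cannot be faithfully captured by a program, a lawlike stand-in can illustrate the convergence condition above. The following Python sketch (our illustration; the names `r` and `convergence_witness` are ours) checks the condition for the lawlike sequence \(r_n = 1 - 2^{-n}\) using exact rational arithmetic:

```python
from fractions import Fraction

# Illustration only: intuitionistic choice sequences need not be lawlike,
# so a program can model at best a lawlike stand-in for a real number
# generator.  Here r_n = 1 - 2^(-n), a convergent sequence of rationals.
def r(n: int) -> Fraction:
    return 1 - Fraction(1, 2 ** n)

# A witness for the convergence condition: given k, return an n such that
# |r_n - r_(n+m)| < 2^(-k) for every m.  Since the tail differences are
# bounded by 2^(-n), taking n = k + 1 suffices.
def convergence_witness(k: int) -> int:
    return k + 1

# Spot-check the condition for small k and finitely many m.
for k in range(5):
    n = convergence_witness(k)
    assert all(abs(r(n) - r(n + m)) < Fraction(1, 2 ** k) for m in range(50))
```

Of course, such a finite spot-check is no substitute for the universally quantified condition; it merely displays the arithmetic of the definition.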
Spread
In fact, the species of convergent choice sequences of rationals just described is what we call a spread. We can view a spread as a tree in which each node is a finite sequence and every path is infinite. As mentioned earlier, choice sequences can be identified with the paths of the tree, so spreads are essentially species of choice sequences subject to a common lawlike restriction. Although in the contemporary literature the notion of spread is often reduced to that of species, spreads were originally introduced by Brouwer as primitive mathematical objects. To be exact, a spread is defined by a spread law and a complementary law (Heyting 1956b, section 3.1.2). As shown below, the former serves to determine the admissible finite sequences of natural numbers, and the latter is used to map them to other mathematical objects.
The spread law classifies finite sequences into admissible and inadmissible. This law dictates the overall structure of the spread with an exhaustive account of all the admissible ways the finite sequences of natural numbers keep growing in time. Because it is exhaustive, it is possible to decide whether a finite sequence is admissible or not, meaning that the excluded middle actually holds for finite sequences as far as the spread law is concerned. Moreover, the spread law always begins by including the empty sequence \(\langle\rangle\) as admissible and from then on proceeds by determining what its admissible extensions are. We say that the finite sequence \(\langle n_1, …, n_k, n \rangle\) is an extension of the finite sequence \(\langle n_1, …, n_k \rangle\). So, to sum up, every spread law has to satisfy the following four conditions:
1. the empty sequence is admissible;
2. every finite sequence is either admissible or not admissible;
3. at least one extension of an admissible sequence is admissible;
4. no extension of an inadmissible sequence is admissible.
The complementary law assigns mathematical objects to the non-empty finite admissible sequences. In effect, it maps numeric choice sequences to choice sequences of elements of some species. For example, recall that in the construction of the continuum sketched above we work with choice sequences of rational numbers. Given that the species of real number generators is actually treated as a spread, as already explained, in this case the complementary law assigns rational numbers to the non-empty finite sequences. Roughly, if the spread law provides the endoskeleton that supports the spread, the complementary law supplies the body tissue that reveals what it looks like. Spreads with only a spread law are called naked and those with a complementary law are called dressed. For example, the so-called “universal spread” is the naked spread of all finite sequences of natural numbers. Its tree structure can be pictured by starting from the empty sequence as the root and adding the admissible extensions of each node as its children. The species of real number generators mentioned above is an example of a dressed spread. For a more concrete example, consider the spread of binary sequences (Posy 2020, p. 31). The spread law specifies that for every admissible sequence \(\langle b_1,…,b_k \rangle\) there are exactly two admissible extensions, \(\langle b_1,…,b_k,0\rangle\) and \(\langle b_1,…,b_k,1 \rangle\). The complementary law maps the sequence \(\langle b_1,…,b_k\rangle\) to its last binary digit \(b_k\), so the spread can be visualized as an infinite binary tree. Finally, what does it mean for a choice sequence \(\alpha\) to be an element of a spread \(M\)? We write \(\alpha \in M\) iff for every \(n\), the complementary law assigns \(\alpha(n)\) to some admissible sequence. In the simpler case of naked spreads, \(\alpha \in M\) iff for every \(n\), \(\overline{\alpha}(n)\) is admissible. It should be noted that this treatment of membership is just a particular case of species membership.
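To make the spread law and complementary law concrete, here is a small Python sketch (our illustration, with helper names of our own choosing) of the binary spread described above; a program can of course only spot-check the conditions on finitely many sequences:

```python
# A sketch (our illustration, not Brouwer's formalism) of the binary spread:
# the spread law admits exactly the finite sequences of 0s and 1s, and the
# complementary law assigns to each non-empty admissible sequence its last digit.

def admissible(u: tuple) -> bool:
    """Spread law: u is admissible iff every entry is 0 or 1."""
    return all(b in (0, 1) for b in u)

def complementary(u: tuple) -> int:
    """Complementary law: map a non-empty admissible sequence to its last digit."""
    assert u and admissible(u)
    return u[-1]

# Spot-checking the four conditions on a spread law (finitely, of course):
assert admissible(())                                    # 1. empty sequence admissible
assert admissible((0, 1, 1)) and not admissible((0, 2))  # 2. admissibility is decidable
assert any(admissible((0, 1, n)) for n in (0, 1))        # 3. an admissible extension exists
assert not any(admissible((0, 2, n)) for n in range(10)) # 4. inadmissible stays inadmissible

# Membership: alpha lies in the spread iff every initial segment is admissible.
# A program can only check this up to a finite bound N.
def member_up_to(alpha, N: int) -> bool:
    return all(admissible(tuple(alpha(i) for i in range(n))) for n in range(N + 1))

assert member_up_to(lambda i: i % 2, 20)   # 0,1,0,1,... lies in the binary spread
assert not member_up_to(lambda i: i, 20)   # 0,1,2,3,... does not
```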
Fan
Fans are finitary spreads. Informally speaking, the paths along the tree structure of a fan remain infinite, but the tree admits only finite branching. More precisely, a fan is a spread whose spread law specifies, for every admissible sequence, that only a finite number of extensions are admissible. The spread of binary sequences described above is an example of a fan. Note also that the binary fan has uncountably many infinite paths, in the form of all sequences of zeros and ones. So despite the finite branching, fans need not even have a countable number of paths.
One of the most important results in intuitionism is the so-called “fan theorem”. It can be understood as the intuitionistic counterpart of König’s lemma in classical graph theory, which states that every infinite, finitely branching tree has at least one infinite path. Classically, its contrapositive thus states that if every path in a finitely branching tree is finite, then there is a finite bound on the lengths of the paths. That is precisely what the fan theorem says in intuitionistic mathematics, except that “there exists” must be interpreted constructively and the context is transposed to fans. One difficulty is that, strictly speaking, the intuitionist cannot have a finite path in a fan, since every path in a spread is identified with a choice sequence and is thus infinite. To make sense of the idea that a path terminates in another way, Brouwer introduced the concept of a bar. Simply put, a bar \(R\) is a species of “nodes” that every path in a spread \(M\) intersects. To be exact, \(R\) is a bar if every choice sequence in \(M\) has some initial segment in \(R\): \[\forall \alpha \in M \exists n \; \overline{\alpha}(n) \in R.\]
Since the bar associates every choice sequence with a number, it in essence simulates a kind of oracle telling us the lengths at which the paths terminate. A barred fan thus captures the intuition of a finitely branching tree all of whose paths are finite. Now, in this intuitionistic terminology, the consequent of the contrapositive of König’s lemma, the existence of a finite bound on the lengths of the paths, amounts to determining an \(m\) such that every choice sequence in \(M\) has an initial segment of length at most \(m\) in \(R\): \[\exists m \forall \alpha \in M \exists n (n \leq m \land \overline{\alpha}(n) \in R).\]
We are now in a position to state the fan theorem:
Theorem 1 (Fan theorem). Let \(M\) be a fan and \(R\) be a species of finite sequences. If every \(\alpha \in M\) has some initial segment in \(R\), then there exists an \(m\) such that every \(\alpha \in M\) has an initial segment of length at most \(m\) in \(R\):
\[
\tag{FT}
(\forall \alpha \in M \; \exists n \; \overline{\alpha}(n) \in R)
\;\to\;
(\exists m \; \forall \alpha \in M \; \exists n \; (n \leq m \;\land\; \overline{\alpha}(n) \in R)).
\]
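To convey the computational content of the theorem, the following Python sketch (our illustration, not an intuitionistic proof; it presupposes a decidable bar and terminates only because the fan really is barred) finds the bound \(m\) for a finitely branching fan by a level-by-level search:

```python
# Search the fan level by level, pruning barred nodes.  When no unbarred
# node survives at some depth m, every path was barred by length m, so m
# is the bound claimed by the fan theorem's consequent.
# (Our illustration; 'branches' and 'barred' are hypothetical parameters.)

def fan_bound(branches, barred, max_depth=1000):
    """branches(u): the finitely many admissible next choices after u.
    barred(u): decidable test for membership of u in the bar R."""
    frontier = [] if barred(()) else [()]   # unbarred sequences of length m
    for m in range(max_depth + 1):
        if not frontier:
            return m
        frontier = [u + (n,) for u in frontier for n in branches(u)
                    if not barred(u + (n,))]
    raise RuntimeError("no bound found up to max_depth")

# Binary fan; the bar consists of sequences with two 1s or of length 4.
binary = lambda u: (0, 1)
bar = lambda u: u.count(1) >= 2 or len(u) >= 4
assert fan_bound(binary, bar) == 4   # every path is barred by length 4
```

The search visits only finitely many nodes at each level precisely because the spread is a fan; for an infinitely branching spread no such exhaustive sweep is possible.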
But how is the fan theorem proved? The classical proof of König’s lemma makes use of an intuitionistically invalid inductive argument to show the existence of \(m\) (Dummett 1977, p. 51). The intuitionist must therefore appeal to a different argument.
The proof of the fan theorem relies on an original argument known as the principle of bar induction. It can be stated without loss of generality for the universal spread \(M\). Let \(A\) be a subspecies of the admissible sequences of \(M\). Call \(A\) “upward hereditary” if, whenever every one-element extension of a finite sequence is in \(A\), the finite sequence itself is in \(A\).
In the unrestricted form conceived by Brouwer, the principle of bar induction states that, if \(R\) is a bar, \(R\) is a subspecies of \(A\), and \(A\) is upward hereditary, then the empty sequence belongs to \(A\):
\[
\tag{BI}
\bigl(\forall \alpha \in M \; \exists n \; \overline{\alpha}(n) \in R \; \land\;\\
\forall u \,(u \in R \to u \in A) \; \land\;\\
\forall u \,(\forall n \,(u \cdot \langle n \rangle \in A) \to u \in A)\bigr) \;\to\; \langle \rangle \in A.
\]
Here \(u \cdot \langle n \rangle\) denotes the extension of a finite sequence \(u\) by one element \(n\). It helps to think of \(A\) as the species of those finite sequences for which we can find the bound required by the consequent of the fan theorem. Indeed, the proof proceeds by instantiating \(A\) with such a species, roughly \(\{ u \mid \exists m \,\forall \alpha \in M \,\exists n \,(n \leq m \land \overline{\alpha}(n) \in R) \land \exists k \; u = \overline{\alpha}(k) \}\).
Another key result in intuitionistic mathematics goes by the name of the “bar theorem”. The fan theorem is easily confused with this theorem because it too is a result about bars. In fact, the bar theorem is simply the statement that bar induction holds. It should be noted that bar induction drops the restriction that the spread must be finitary: it is maintained not just for fans but for spreads in general. Brouwer originally derived the fan theorem as a corollary of the bar theorem. The proof of the bar theorem is presented in three different articles (Brouwer 1924, 1927, 1954). His proofs have been extensively studied because they inaugurate a proof-theoretic method of analyzing the structure of canonical proofs. The bar theorem has even found applications in repairing Gentzen’s original proof of the consistency of arithmetic. But a careful study would be outside the scope of this introductory article. The reader is referred to Dummett (1977, section 3.4), Martino and Giaretta (1981), and Sundholm and van Atten (2008) for more on Brouwer’s proof, and to Tait (2015) for connections with Gentzen’s original consistency proof.
It should be emphasized, however, that the principle (BI) originally formulated by Brouwer turns out to be false intuitionistically (Kleene and Vesley 1965, section 7.14). In this unrestricted form, bar induction implies a quantified version of the law of excluded middle and thus, as can be inferred from 1.2.1.1, it must be inconsistent with the continuity principle. To fix this, Kleene proposed adding the requirement that \(R\) be decidable:
\[
\tag{BI$_D$}
\bigl(\forall \alpha \in M \; \exists n \; \overline{\alpha}(n) \in R \; \land\; \\ \forall u \,(u \in R \lor u \notin R) \;\land\; \\ \forall u \,(u \in R \to u \in A) \;\land\;\\ \forall u \,(\forall n \,(u \cdot \langle n \rangle \in A) \to u \in A)\bigr)
\;\to\; \langle \rangle \in A.
\]
Van Atten (2004, section 4.3) argues that the decidability condition is implicit in Brouwer’s earlier proofs from 1924 and 1927, though not in the 1954 proof. Another intuitionistically valid formulation introduces a requirement of monotonicity instead: \(R\) is monotonic if \(u \in R\) and \(u \preceq v\) (that is, \(v\) extends \(u\)) together imply \(v \in R\):
\[
\tag{BI$_M$}
\bigl(\forall \alpha \in M \; \exists n \overline{\alpha}(n) \in R \land\ \\
\forall u \,\forall v\, (u \in R \land u \preceq v \to v \in R) \land\ \\
\forall u (u \in R \to u \in A) \land\ \\
\forall u (\forall n (u \cdot \langle n \rangle \in A) \to u \in A)\bigr) \to \langle \rangle \in A.
\]
Brouwer’s “proof” of the bar theorem is not actually a proof. Indeed, it must be erroneous, since the unrestricted form of bar induction it uses does not hold intuitionistically. But even if we correctly derive the bar theorem from decidable or monotonic bar induction, these principles are no less in need of justification than the statement to be proved. Still, despite its mathematical shortcomings, Brouwer’s argument remains philosophically significant in that it sheds light on the intuitive content of the bar theorem. Before concluding this discussion of Brouwer’s views, it should be remarked that his primary motivation for the fan and bar theorems was their use as tools to prove the uniform continuity theorem, which states that every total real function is uniformly continuous. This is an even stronger result than the continuity theorem mentioned in 1.2.1. Finally, we note that the fan theorem is incompatible with Church’s thesis (Dummett 1977, p. 53).
c. The Creating Subject
Intuitionism is the product of the activity of the creating subject, an idealized mind carrying out constructions and experiencing truths at discrete stages of time. Brouwer (1948a) first introduced in print arguments relying on the creating subject to construct weak and strong counterexamples to classical principles based on the solvability of open problems. Such arguments are ultimately grounded in Brouwer’s conception of truth, according to which a proposition is true only if its truth has been experienced by the creating subject:
[…] truth is only in reality i.e. in the present and past experiences of consciousness. Amongst these are things, qualities of things, emotions, rules (state rules, cooperation rules, game rules) and deeds (material deeds, deeds of thought, mathematical deeds). But expected experiences, and experiences attributed to others are true only as anticipations and hypotheses; in their contents there is no truth (Brouwer 1948b, p. 1243).
At this stage [of introspection of a mental construction] the question whether a meaningful mathematical assertion is correct or incorrect, is freed of any recourse to forces independent of the thinking subject (such as the external world, mutual understanding, or axioms which are not inner experience) and becomes exclusively a matter of individual consciousness of the subject. Correctness of an assertion then has no other meaning than that its content has in fact appeared in the consciousness of the subject (van Stigt 1990, p. 450).
[…] in mathematics no truths could be recognized which had not been experienced […] (Brouwer 1955, p. 114).
There has been much debate about the exact nature of the creating subject. Brouwer is frequently accused of psychologism, subjectivism, or even solipsism. Placek (1999, section 2.4.1) defends Brouwer against the charges of psychologism and subjectivism by maintaining that the creating subject is a transcendental subject in Kant’s sense. The transcendence is hinted at in passages such as the above, where Brouwer stresses that all our physical and psychological limitations as human subjects are abstracted away, such as our finite lifespan, imperfect memory, mental state, mathematical proficiency. Although Brouwer does argue against the plurality of minds (Brouwer 1948b, pp. 1239-1240), Placek notes that there is still some room left for intersubjectivity, which allows for the possibility of communication with other minds equally equipped with all the cognitive structure assumed by Brouwer.
There is also some dispute as to whether the transcendental status of Brouwer’s creating subject is better understood from Kant’s or Husserl’s perspective. Perhaps because of the way Brouwer himself explicitly acknowledges his debt to Kant, the received view is the Kantian interpretation assumed by Placek (1999). Similarities between Brouwer and Kant in the context of their transcendental philosophies are most systematically drawn by Posy (1974, 1991, 1998). At the same time, however, there are similarities between Brouwer and Husserl that are just as difficult to ignore. Drawing on the phenomenology of inner time-consciousness, van Atten (2004, section 6.2) argues that Brouwer’s creating subject is best analyzed using Husserl’s phenomenology. Further connections between Brouwer’s intuitionism and Husserl’s phenomenology have been explored especially by Tieszen (1989, 1995, 2008) and van Atten (2006, 2017, 2024). 2.2 shows that this phenomenological interpretation of intuitionism originates in the meaning explanations advanced by Heyting in terms of fulfillments of intentions.
Creating Subject Argument:
It is time to return to the creating subject arguments. Brouwer realized that the way the creating subject experiences truths about open problems could be used to define choice sequences and generate counterexamples to emphasize the elements of indeterminacy in intuitionistic analysis. To see how this can be done, it might be instructive to examine a simple argument used by Brouwer as a weak counterexample to \(\forall r \in \mathbb{R}(r \neq 0 \to r > 0 \lor r < 0)\) (Brouwer 1948a, p. 322). We say that a proposition \(A\) is testable if the instance of the weak law of excluded middle \(\neg A \lor \neg\neg A\) holds. Given a proposition \(A\) not known to be testable, Brouwer maintains that the creating subject can construct a choice sequence \(\alpha\) of rationals by making the following choices:
if, during the choice of \(\alpha(n)\), the creating subject has experienced neither the truth of \(A\) nor its absurdity, then choose \(\alpha(n) = 0\);
if, between the choices of \(\alpha(k-1)\) and \(\alpha(k)\), the creating subject has experienced the truth of \(A\), then choose \(\alpha(k+n) = 2^{-k}\) for every \(n\);
if, between the choices of \(\alpha(k-1)\) and \(\alpha(k)\), the creating subject has experienced the absurdity of \(A\), then choose \(\alpha(k+n) = -2^{-k}\) for every \(n\).
Let us introduce some notation to make this definition a bit more mathematically precise. Brouwer’s tenet that “there are no non-experienced truths” can be understood as saying that, if a proposition \(A\) is true, then it is not possible that there is no stage at which the creating subject experiences the truth of \(A\). If we introduce a special notation \(\Box_n A\) to express that “the creating subject has experienced the truth of \(A\) at stage \(n\)”, this fundamental tenet can be put into symbolic notation as:
\[A \to \neg\neg \exists n \Box_n A.\]
The above definition can then be more rigorously stated as:
\[
\alpha(n) =
\begin{cases}
0 & \text{if } \neg \Box_n (A \lor \neg A) \\[6pt]
2^{-k} & \text{for all } n \geq k \text{ if } \Box_k A \text{ and } \neg \Box_{k-1}(A \lor \neg A) \\[6pt]
-2^{-k} & \text{for all } n \geq k \text{ if } \Box_k \neg A \text{ and } \neg \Box_{k-1}(A \lor \neg A).
\end{cases}
\]
\(\alpha\) is convergent and thus a real number generator. Yet the real number \(r\) generated by \(\alpha\) cannot be \(0\), because that would lead to a contradiction. If \(r\) were \(0\), then \(\alpha(n)=0\) for all \(n\), meaning that \(\forall n \neg \Box_n (A \lor \neg A)\). Intuitionistically, the De Morgan scheme \((\forall x \neg\phi(x)) \to \neg \exists x\, \phi(x)\) and contraposition \((\phi \to \psi) \to (\neg \psi \to \neg\phi)\) are valid; see the appendix for more information. Hence \(\neg \exists n \Box_n (A \lor \neg A)\), and the contradiction \(\neg A \land \neg\neg A\) follows from the contrapositive of Brouwer’s tenet, \(\neg \exists n \Box_n (A \lor \neg A) \to \neg (A \lor \neg A)\), together with the intuitionistic equivalence of \(\neg(A \lor \neg A)\) and \(\neg A \land \neg\neg A\). This shows that \(r \neq 0\). Yet if \(r > 0\), there is a \(k\) such that \(\alpha(n) = 2^{-k}\) for \(n \geq k\), and thus the truth of \(A\) has been experienced, contradicting our assumption that \(A\) is not known to be testable. Since the same holds mutatis mutandis for \(r < 0\), we cannot assert \(r > 0 \lor r < 0\). For an overview of this and other examples given by Brouwer, see (van Atten 2018).
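Since the creating subject is an idealized mind rather than a mechanical procedure, no program can implement the definition of \(\alpha\); still, a toy model can display its arithmetic. In the Python sketch below (entirely our illustration), the hypothetical parameter `first_stage` plays the role of the stage at which the truth (`outcome` +1) or absurdity (`outcome` -1) of \(A\) is first experienced:

```python
from fractions import Fraction

# A heuristic simulation only: the creating subject is an idealized mind,
# not a program.  'first_stage' is a hypothetical parameter recording when
# (if ever) the truth (+1) or absurdity (-1) of A was experienced.

def make_alpha(first_stage, outcome):
    def alpha(n: int) -> Fraction:
        if first_stage is None or n < first_stage:
            return Fraction(0)          # nothing experienced yet: choose 0
        return outcome * Fraction(1, 2 ** first_stage)  # settle at +/- 2^(-k)
    return alpha

# If neither A nor its absurdity is ever experienced, alpha is constantly 0;
# intuitionistically, as shown above, this case leads to a contradiction.
undecided = make_alpha(None, 0)
assert all(undecided(n) == 0 for n in range(10))

# If the truth of A is experienced between stages 2 and 3 (k = 3), the
# sequence settles at 2^(-3), so the generated real is positive.
decided = make_alpha(3, +1)
assert decided(2) == 0 and decided(7) == Fraction(1, 8)
```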
Creating subject arguments tend to be seen as a controversial aspect of intuitionism because it is not clear how they qualify as mathematical arguments (Heyting 1956b, section 8.12.1). In spite of the doubts surrounding them, they have been studied in the literature, and different formalizations have been offered to make precise the assumptions underlying them. We conclude this section by presenting the two most influential proposals:
Kripke’s Schema: \[\tag{KS} \label{wks}
\exists \alpha ((\neg A \leftrightarrow \forall n \alpha(n)=0)
\land (\exists n \alpha(n)\neq 0 \to A)).\]
This schema was proposed by Kripke in a letter to Kreisel around 1965 (Kripke 2019, fn. 8). This letter led Kreisel to formulate the theory of the creating subject, presented below. The schema is intended to describe the generation of a choice sequence \(\alpha\) out of the truth or absurdity of a proposition \(A\) experienced by the creating subject. Though Kripke originally formulated it in what is known as its weak form, there is a stronger and simpler form:
\[\tag{KS+} \label{sks} \exists \alpha (A \leftrightarrow \exists n \alpha(n)=1).\]
Van Atten (2018) stresses that both schemes are present in Brouwer’s published work.
The Theory of the Creating Subject:
\[
\begin{align*}
\tag{CS1} & \forall n (\Box_n A \lor \neg \Box_n A) \\
\tag{CS2} & \forall n \forall m (\Box_n A \to \Box_{n+m} A) \\
\tag{CS3} & \forall n (\Box_n A \to A) \\
\tag{CS4} & A \to \neg \neg \exists n \, \Box_n A.
\end{align*}
\]
The theory of the creating subject was originally proposed by Kreisel (1967, section 3) and subsequently refined by Troelstra (1969, section 16). In their expositions they employ a turnstile notation to represent the creating subject and omit the universal quantifiers. The modal notation used here was introduced by Troelstra and van Dalen (1988, p. 236). The four axioms above emphasize that the creating subject continuously performs their mental activities one after another at stages of time. (CS1) states that \(\Box_n A\) is decidable at every stage \(n\), for the creating subject either experiences a truth or not. (CS2) expresses that truths experienced at past stages are never forgotten at subsequent stages. (CS3) encapsulates the plausible principle that experienced truths are indeed truths: at any stage \(n\), if the creating subject has experienced the truth of \(A\), then \(A\) is true. (CS4) expresses the familiar tenet discussed before. Importantly, (KS) is derivable from these four axioms, and the stronger (KS+) can be derived with the strengthening of (CS4) proposed by Troelstra (1969, pp. 95-96): \[\tag{CS4+} \label{cs4plus} A \leftrightarrow \exists n \Box_n A.\]
For additional readings the reader is referred to (Posy 2020, sections 2.2.3 and 3.2.4).
2. Heyting
Although Brouwer downplays the roles of logic and language in his intuitionism, the development of intuitionistic logic by his student Arend Heyting and others has ironically made intuitionism more accessible and generated growing interest. Historically, intuitionistic logic came into being when the Dutch Mathematics Society held a contest involving the formalization of intuitionistic logic and mathematics; Heyting’s prize-winning work developed the first full axiomatization of intuitionistic first-order logic by isolating the intuitionistically acceptable axioms of Principia Mathematica. Heyting begins the undertaking with the following remark that echoes Brouwer’s attitude about language:
Intuitionistic mathematics is a mental activity, and for it every language, including the formalistic one, is only a tool of communication. It is in principle impossible to set up a system of formulas that would be equivalent to intuitionistic mathematics, for the possibilities of thought cannot be reduced to a finite number of rules set up in advance (Heyting 1930a, p. 191).
Perhaps because of Brouwer’s tendency towards anti-formalism, there is a popular narrative that his initial reaction to Heyting’s formalization work was negative. In reality, however, Brouwer was very pleasantly surprised (van Stigt 1990, section 5.12.2-3). This survey article does not delve into intuitionistic logic and its formal developments. Instead, we refer the interested reader to the appendix for a list of notable theorems of intuitionistic logic and of classical theorems that turn out to be unprovable in intuitionistic logic. Our focus is on the philosophical views that underlie Heyting’s presentation of intuitionistic mathematics. While Heyting continued to be true to the spirit of his teacher’s intuitionism as a mental creation independent of language, his thought evolved in many interesting directions. His liberalism and anti-metaphysics, his meaning explanations for the logical connectives, and his hierarchy of grades of evidence are some major contributions worth stressing.
a. Intuitionism Without Philosophical Subtleties
Perhaps the most relevant contrast in attitude is that Brouwer is a strong mathematical revisionist who sees classical mathematics as unjustified. Without making compromises, Brouwer was determined to make intuitionism prevail over formalism. Heyting, on the other hand, expresses a more liberal stance in the foundational debate and is even willing to see intuitionistic mathematics as standing alongside classical mathematics. He is interested in promoting intuitionism so that working mathematicians can engage with its practice, not in witnessing the end of classical mathematics. In order to popularize intuitionism and minimize conflict with formalists, Heyting decided to present it as a simple program that banishes metaphysics from mathematics.
Brouwer’s own account of intuitionism is actually inseparable from a mystical perspective towards life that supports the intuition of twoity. This rather arcane doctrine describes how consciousness moves away from its deepest home to the exterior world of the subject. The reader can refer to for an easy-to-follow overview of it. Given that these mystical elements understandably tend to unsettle mathematicians, it is hardly surprising that they are largely marginalized in Heyting’s work. Heyting emphasizes that to properly understand intuitionistic mathematics we do not need to adhere to this peculiar “psychological theory concerning the origination of knowledge”:
Brouwer’s explanations are incorporated in an exposition of his views on science in general and of an all-embracing conception of life and the world. Fortunately, in order to understand intuitionistic mathematics, it is not necessary to adhere to philosophic and psychological theories (Heyting 1958b, p. 102).
Heyting also dismisses the Kantian roots of intuitionism. Like Brouwer, Heyting stresses that intuitionism is an activity rather than a doctrine. It thus cannot be adequately described by means of a set of premises. Heyting sees this as reason to avoid looking for a basis for intuitionistic mathematics within some form of Kantian philosophical framework. He thinks the foundations of intuitionism should be erected on simple ordinary concepts that do not require any definite metaphysical considerations. For Heyting, intuitionism does not stand as a philosophical system on par with “realism, idealism, or existentialism”. In his view “the only philosophical thesis of mathematical intuitionism is that no philosophy is needed to understand mathematics” (Heyting 1974, p. 79). One might wonder how he may retain the mind-dependence of mathematical objects and truths without committing intuitionistic mathematics to a form of mathematical idealism. As we shall see, his answer seems to be that to practice intuitionism we do not need to pay attention to such questions.
Despite the rejection of any kind of philosophical system, Heyting still maintains the view that the intuition of twoity is the starting point of intuitionistic mathematics. The “two fundamental facts” of the “mental conception of an abstract entity, and that of an indefinite repetition of the conception of entities” are described not as part of a theory of knowledge but as natural phenomena of our daily lives. Everyone “knows how to build up mentally the natural numbers” from these two concepts and “knows what it means that their sequence can be made to proceed indefinitely” (Heyting 1958b, p. 102). Heyting stresses that these considerations about intuition are not metaphysical per se, for in intuitionistic mathematics we do not need to inquire how the two fundamental concepts are obtained:
They become so if one tries to build up a theory about them, e.g., to answer the question whether we form the notion of an entity by abstraction from actual perceptions of objects, or if, on the contrary, the notion of an entity must be present in our mind in order to enable us to perceive an object apart from the rest of the world. But such questions have nothing to do with mathematics. We simply state the fact that the concepts of an abstract entity and of a sequence of such entities are clear to every normal human being, even to young children (Heyting 1956b, p. 13).
b. The Meaning Explanations of the Intuitionistic Logical Constants
Brouwer viewed logic as a mere part of mathematics, never paying much attention to the study of intuitionistic logic as an independent discipline. He entrusted this task to Heyting, limiting himself to only a few scattered remarks on the proper interpretation of the intuitionistic logical connectives and the intuitionistic nature of truth in his work. In essence, according to intuitionism, a subject may only claim that a proposition is true when a construction required by the proposition has been carried out (Heyting 1958c, p. 278). Since truth requires an “introspective realization” of a mathematical construction, no subject-independent notion of truth is recognized in intuitionistic mathematics.
There was a heated debate in the twenties about the nature of intuitionistic logic, including contributions by Glivenko, Barzin and Errera, and others. In a series of papers, Heyting developed his “meaning explanations” to address questions about the meaning of expressions occurring in a formal system of intuitionistic logic (Heyting 1930b, 1931, 1934). The meaning explanations serve as an informal semantics that endows every compound mathematical proposition formed by the intuitionistic logical connectives with meaning by establishing the intended interpretation of these connectives. They led to the consolidation of what is today known as the BHK explanation of the logical connectives (Troelstra and van Dalen 1988, section 3.1), the acronym referring to its originators, Brouwer, Heyting, and Kolmogorov. Sometimes this explanation is instead described as an interpretation, but there are good reasons to avoid that terminology, since interpretations are commonly associated with formal semantics. The constructions that realize the truth of a proposition can be informally regarded as proofs; for this reason, the BHK explanation is also known as the proof explanation:
a proof of \(A \land B\) is a pair containing a proof of \(A\) and a proof of \(B\);
a proof of \(A \lor B\) is either a proof of \(A\) or a proof of \(B\), plus the information of whether the left or right disjunct is the one being proved;
a proof of \(A \to B\) is a method that transforms any proof of \(A\) into a proof of \(B\);
a proof of \(\bot\) is not possible; \(\neg A\) is defined as \(A \to \bot\);
a proof of \(\forall x \in A\, B(x)\) is a method that transforms any \(x \in A\) into a proof of \(B(x)\);
a proof of \(\exists x \in A\, B(x)\) is a pair containing an \(x \in A\) and a proof of \(B(x)\).
By themselves these conditions have no explanatory power as a formal semantics. Whether or not they can invalidate the law of excluded middle, for example, depends on a concrete interpretation of “proof” and “method” (Troelstra and van Dalen 1988, ex. 1.3.4). Note also that the domain of quantification is a species. It can be omitted when the species is simple enough that \(x \in A\) does not require a proof and we know the domain that \(B\) ranges over. This was done above in formulas about numbers or numeric sequences. But even when this is not made explicit, quantification is always restricted to a domain. Intuitionism sees the totality of mathematical constructions as open-ended (Brouwer 1907, pp. 148-149). It cannot make sense of an unrestricted quantification over all mathematical objects.
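As an illustrative aside (ours, not found in the intuitionists’ own texts), one concrete interpretation of “proof” and “method” reads the BHK clauses as typing rules for programs, the so-called Curry–Howard reading. The following sketch in the Lean proof assistant renders the connective clauses directly, with pairs for conjunction, tagged alternatives for disjunction, and functions for implication:

```lean
-- Conjunction: a proof of A ∧ B is a pair of proofs
example {A B : Prop} (ha : A) (hb : B) : A ∧ B := And.intro ha hb

-- Disjunction: the proof records which disjunct is established
example {A B : Prop} (ha : A) : A ∨ B := Or.inl ha

-- Implication: a method transforming proofs of A into proofs of B
example {A B : Prop} : A ∧ B → B ∧ A := fun h => And.intro h.right h.left

-- Negation: ¬A abbreviates A → False, so a proof of A → ¬¬A refutes refutations of A
example {A : Prop} : A → ¬¬A := fun ha hna => hna ha
```

Under this reading the clauses cease to be a mere informal gloss and become the typing discipline of a programming language, which is one way the explanation has since been made precise.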
It is important to distinguish the BHK explanation as presented above from Heyting’s own original ideas advanced in his meaning explanations. It is especially worth noting that he says nothing about the intuition of twoity. One might think that this is related to his general aversion to intricate philosophical considerations. Yet, in his explanations Heyting does borrow key concepts from Husserl’s theory of intentionality. This connection was first emphasized by Martin-Löf (1985) and Tieszen (1984). By interpreting proofs of propositions as fulfillments of intentions, Heyting alludes to the view of proofs as objects of intuition in Husserl’s sense. This suggests that the meaning explanations are elucidated in terms of intuition in the phenomenological sense of intentional fulfillments:
A mathematical proposition expresses a certain expectation; e.g. the proposition “The Euler constant \(C\) is rational” means the expectation that one can find two integers \(a\) and \(b\) in such a way that \(C = a / b\). Perhaps the word “intention” coined by the phenomenologists expresses even better what is meant here than the word “expectation”. I also use the word “proposition” for the intention expressed by the proposition. […] The assertion of a proposition means the fulfillment of the intention, e.g. the assertion “\(C\) is rational” would mean that one has actually found the wanted integers (Heyting 1931, p. 113, our translation).
Unfortunately, Heyting does not say much more about the idea, but connections between intuitionism and phenomenology have since been extensively explored in the literature. This propositions-as-intentions explanation in particular has been further extended by Tieszen (1989) to accommodate intuitions of natural numbers and finite sets and by van Atten (2004, 2006, 2017) to justify choice sequences as objects of intuition. It has also been severely criticized due to incompatibilities between Husserl’s strong platonist tendencies and the idealist aspects of Brouwer’s thought (Rosado Haddock 2010; Hill 2010). A stronger objection has since been raised: even setting these tensions aside, propositions cannot be intentions and proofs cannot be fulfillments, for while an intention is directed at a unique object and can only be fulfilled by the presence of that object, propositions, such as disjunctions and existential statements, can have different proofs (Bentzen 2023a, section 6.4).
Another significant contrast is that, as observed by Prawitz (2012), the term “proof” does not systematically appear in Heyting’s meaning explanations. Although in the BHK explanation the notion of provability is employed in every condition, Heyting himself only introduces proofs in his explanation of the clause for implication, writing that \(A \to B\) means the intention of a construction which leads from every proof of \(A\) to a proof of \(B\). Even then, a proof of a proposition is explained as the realization of the construction required by it, a mathematical construction which can itself be treated mathematically (Heyting 1934, p. 14; 1931, p. 114). That is, provability is not accepted as a primitive notion. This suggests that, in contrast to the formulation of the BHK explanation given above, proofs do not play a central semantic role in Heyting’s view. For him, proofs receive their Brouwerian reduction to mental constructions given in intuition through their identification with intentional fulfillments. One difficulty with this strategy is that proofs are treated as objects, while fulfillments are mental processes. Tieszen (1989, pp. 86-87) argues that the objectification of a mental process can be achieved through an act of reflection, a higher-order intentional act that one directs at one’s own intentional experiences.
Heyting rejects the rather widespread explanation of “\(A\) is true” as “\(A\) is provable”. He sees the latter as equivalent to the phrase “there exists a proof of \(A\)”, which suggests the mind-transcendent existence of proofs. Intuitionistically, truth is understood as the establishment of an “empirical fact” about the realization of expectations or the fulfillment of intentions. It has to do not with a state of affairs, but only with one’s own experiences. To assert the truth of a proposition \(A\) is thus to maintain “we know how to prove \(A\)” (Heyting 1930b, p. 959).
c. The Hierarchy of Grades of Evidence
Brouwer developed intuitionism as a response to the paradoxes of set theory, ensuring that every theorem possesses intuitive content to guarantee the absence of contradictions. But for Heyting not even in intuitionistic mathematics can we achieve absolute certainty. He proposed a hierarchy that classifies what is intuitively clear in a descending order:
It has proved not to be intuitively clear what is intuitively clear in mathematics. It is even possible to construct a descending scale of grades of evidence. The highest grade is that of such assertions as \(2 + 2 = 4\). \(1002 + 2 = 1004\) belongs to a lower grade; we show this not by actual counting, but by a reasoning which shows that in general \((n + 2) + 2 = n + 4\). Such general statements about natural numbers belong to a next grade. They have already the character of an implication: ‘If a natural number \(n\) is constructed, then we can effect the construction, expressed by \((n + 2) + 2 = n + 4\)’. This level is formalized in the free-variable calculus. I shall not try to arrange the other levels in a linear order; it will suffice to mention some notions which by their introduction lower the grade of evidence.
(1) The notion of the order type \(\omega\), as it occurs in the definition of constructible ordinals.
(2) The notion of negation, which involves a hypothetical construction which is shown afterwards to be impossible.
(3) The theory of quantification. The interpretation of the quantifiers themselves is not problematical, but the use of quantified expressions in logical formulas is.
(4) The introduction of infinitely proceeding sequences (choice sequences, arbitrary functions).
(5) The notion of a species, which suffers from the indefiniteness of the notion of a property. The natural numbers form a species; all species do not. It is doubtful whether all species of natural numbers form a species; therefore I prefer not to use this notion (Heyting 1966, p. 195).
Let us go over his hierarchy from the highest to the lowest grades of evidence. First, we have particular propositions about small numbers. They have the highest grade because they are immediately evident (Heyting 1958a, pp. 335-336). The second highest grade of evidence is that of particular propositions about larger numbers. Heyting maintains that there is a loss of evidence because they tend to be demonstrated as instances of a general rule. For example, when asked to show that \(135664 + 37863 = 37863 + 135664\), most people will appeal to the fact that \(n + m = m + n\) for every \(n\) and \(m\) instead of counting. The third highest grade of evidence is that of such general numeric propositions. The issue at stake here is that such a general proposition has a hypothetical nature. Heyting does not elaborate, but the loss of evidence presumably has to do with the fact that general numeric propositions cannot be verified by calculation and need to be proved by induction. What about the lowest grades described in (1)–(5)? Heyting follows Griss (1946) in admitting that we descend into an even lower grade when we introduce negations, since an implication is less certain when the hypothetical construction cannot be carried out. He also mentions the theory of quantification, but since general propositions have already been placed one level above, it is not entirely clear what kind of propositions he has in mind. Infinity is also found at the bottom of the hierarchy in the form of the introduction of \(\omega\) and choice sequences. Heyting places the notion of species in the lowest grade, for there is no precise definition of what a property is. Note that each of these notions represents a different and incomparable lowest grade.
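Heyting’s contrast between verification by calculation and justification via a general rule can be made vivid in a modern proof assistant. The following Lean sketch (ours, for illustration only) separates the grades using the article’s own numbers:

```lean
-- Highest grade: a particular equation about small numbers, immediately
-- checkable by computation
example : (2 : Nat) + 2 = 4 := rfl

-- Lower grade: a large instance, which in practice we justify by appeal to
-- a general rule rather than by counting
example : (135664 : Nat) + 37863 = 37863 + 135664 := Nat.add_comm 135664 37863

-- Still lower: the general statement itself, which cannot be verified by
-- calculation and must be proved by induction (packaged here in Nat.add_comm)
example : ∀ n m : Nat, n + m = m + n := Nat.add_comm
```

The formal machinery does not settle Heyting’s epistemological point, but it does display the difference in kind: the first line is a computation, while the last rests on an inductive proof.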
3. Dummett
Mental constructions in intuition play a key role in the writings of Brouwer and Heyting. Dummett’s seminal work marked a notable departure from their tradition. Restricting his attention to logic rather than the whole of intuitionistic mathematics, he wishes to justify the rejection of classical logic in favor of intuitionistic logic. This is done by means of an original meaning-theoretic argument in which intuition is replaced by provability as the fundamental notion that grounds meaning and truth (Dummett 1975, pp. 6-7, 16-17). More generally, unlike Brouwer and Heyting, Dummett denies the mind-independence of mathematical objects and truths without asserting their mind-dependence. This position leads to an intuitionism without intuition, one that subscribes only to anti-realism and not to idealism.
a. The Meaning-Theoretic Turn
For Dummett, the meaning of a sentence is manifested in its use. From this he concludes that meaning should be given in terms of proof, since in ordinary mathematical practice using a sentence expressing a theorem amounts to using a proof of it, or at least to recognizing such a proof when we are presented with one. Dummett’s program can be seen as a rearticulation of the meaning explanations proposed by Heyting through the late Wittgenstein’s idea that the meaning of a word is determined by its use.
Dummett forcefully rejects Brouwer’s views. He sees the meaning of a sentence as a product of its function as an instrument of communication, and an individual cannot make that meaning rest on a private mental image, which cannot be observed in communication. For him, explaining meaning by appealing to a subjectivist notion of mental construction in intuition invites the threat of solipsism. In fact, Dummett describes Brouwer’s philosophy as “psychologistic through and through” (Dummett 1980). We must instead understand meaning in terms of mind-independent verifiability conditions.
Realism violates the principle that use exhaustively determines meaning, for it determines meaning by truth-conditions which are in general not verifiable. The roots of Dummett’s rejection of realism and general appreciation of the intuitionistic standpoint can already be found in his earlier investigations about truth (Dummett 1959, pp. 160-162). The principle that meaning is use is perfectly consistent with the BHK explanation. Dummett observes that to conform with this principle we must also replace the notion of truth by that of proof. How should this meaning-theoretic argument for intuitionism analyze truth in such terms? There are two fundamental principles that in his view the notion of truth should satisfy:
(C) If a statement is true, there must be something in virtue of which it is true.
(K) If a statement is true, it must be in principle possible to know that it is true.
The first principle can be traced back to classical correspondence theories of truth. It is also valid intuitionistically as long as the truth-makers are identified with proofs. The second principle requires commitment to anti-realism. How to satisfy these principles? This raises several difficult questions about existence, temporality, and provability which Dummett acknowledges but does not offer a conclusive answer for. Consider how the truth of the proposition \(598017 + 246532 = 844549\), for example, may be verified. Dummett writes:
We may perform the computation, and discover that \(598017 + 246532\) does indeed equal \(844549\): but does that mean that the equation was already true before the computation was performed, or that it would have been true even if the computation had never been performed? The truth-definition leaves such questions quite unanswered, because it does not provide for inflections of tense or mood of the predicate ‘is true’: it has been introduced only as a predicate as devoid of tense as are all ordinary mathematical predicates (Dummett 1975, pp. 24-25).
Should one reduce “\(A\) is true” to “\(A\) is provable” or “\(A\) has been proved”? This introduces in Dummett’s rearticulation of intuitionism what Raatikainen (2004) calls the distinction between the actualist and possibilist conceptions of truth. To endorse the actualist view is to introduce tense in mathematics, as Dummett admits, while under the possibilist view truth remains timeless just as it is in a classical account. We should stress that the ambiguous modality involved in “provable” further complicates the possibilist view. In his attempts to elucidate the possibility of proof existence involved in the account, Prawitz even goes as far as endorsing ontological realism for proofs (Prawitz 1987, pp. 153-154). Dummett fears that allowing for the existence of unknown proofs in such a static mathematical reality leads to the validation of the excluded middle and thus cannot accept the proposal. Dummett does not consistently subscribe to either the actualist or possibilist conception in his writings. Unlike the account of truth envisioned by Brouwer and Heyting seen in the previous sections, in Dummett there is no reference to intentional content because truth has been desubjectivized. It is no longer, so to speak, a first-person notion that appeals to the creating subject, as it was since Brouwer, but a notion analyzed strictly from a third-person perspective.
One of Dummett’s major contributions to intuitionism is the introduction of a crucial distinction between canonical and non-canonical proofs. Roughly, a canonical proof has an explicit form by which it can be directly checked that it proves a proposition; a proof is non-canonical if it is not in canonical form, but an effective method can reduce it to a canonical proof in a finite period of time (Dummett 1975, pp. 31-35). The notion of proof presupposed in the BHK explanation has to be limited to that of canonical proof. Clearly, even in intuitionistic mathematics, a proof of a conjunction \(A \land B\), for example, need not consist in a pair of proofs of \(A\) and \(B\) explicitly. We may prove the conjunction indirectly, say, by modus ponens, if we have proofs \(C \to A \land B\) and \(C\) in hand. Dummett also contemplates connections between canonical proofs and normalized proofs in proof systems of natural deduction, on the one hand, and canonical proofs and Brouwer’s fully analyzed proofs in his proof of the bar theorem, on the other.
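The canonical/non-canonical distinction can be displayed with proof terms (a sketch in Lean, ours rather than Dummett’s own formalism): a proof of a conjunction obtained by modus ponens is not literally a pair, but it reduces to one.

```lean
-- A non-canonical proof of A ∧ B: obtained by modus ponens, not literally a pair
example {A B C : Prop} (f : C → A ∧ B) (c : C) : A ∧ B := f c

-- When the function is given explicitly, the application β-reduces to a
-- canonical pair: (fun x => And.intro (g x) (h x)) c normalizes to
-- And.intro (g c) (h c)
example {A B C : Prop} (g : C → A) (h : C → B) (c : C) : A ∧ B :=
  (fun x => And.intro (g x) (h x)) c
```

This mirrors Dummett’s requirement that a non-canonical proof come with an effective method, here β-reduction, taking it to canonical form in finitely many steps.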
The notion of canonical proof has been instrumental in the further development of mathematical constructivism with the Swedish school led by Prawitz, Martin-Löf, and Sundholm, which pushed forward the meaning-theoretic approach inaugurated by Dummett in different directions. Prawitz founded proof-theoretic semantics as an attempt to convert the BHK explanation into a formal semantics complete with respect to intuitionistic logic (Prawitz 1971, 1973, 1974). Martin-Löf kept the explanation informal but extended its scope to offer a philosophical justification for the axioms and rules of inference of his constructive type theory (Martin-Löf 1984), a formal system developed to establish a rigorous foundation for Bishop’s brand of constructive mathematics. The formalism is based on a correspondence between propositions and types sometimes known as the Curry–Howard isomorphism. See (Sundholm 2004) for an overview of anti-realism in this type-theoretic tradition.
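In constructive type theory, the quantifier clauses of the BHK explanation become dependent types: an existential proof is a pair of a witness and a proof about that witness, and a universal proof is a function producing a proof for each element. A minimal Lean sketch (ours, for illustration):

```lean
-- ∃ as a dependent pair: a witness together with a proof about that witness
example : ∃ n : Nat, n + 1 = 3 := ⟨2, rfl⟩

-- ∀ as a dependent function: a method yielding a proof for each element
example : ∀ n : Nat, n + 0 = n := fun _ => rfl
```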
b. Against the Ontological Route
Dummett (1975, pp. 38-40) argues that the meaning-theoretic argument outlined above is the only correct route to the justification of intuitionistic logic. Of course, questions of meaning were already considered to some extent in the works of Brouwer and Heyting. But their main focus has always been epistemological: their intuitionism reflects the mental activity of the creating subject carrying out constructions in intuition. With Dummett, the matter of how mathematical objects are given to us is given a semantic answer: what in analytic philosophy is known as Frege’s linguistic turn is replicated in intuitionistic mathematics. In other words, intuitionism ceases to be primarily a theory of knowledge with semantic overtones and becomes primarily a theory of meaning with epistemic overtones.
Part of the strength of Dummett’s meaning-theoretic turn derives from his rejection of another argument for intuitionistic logic that is ontological in nature. It starts with the doctrine that natural numbers are the creations of human thought (Dummett 1975, pp. 18-19). Without presupposing any sort of theory of meaning, he implies that we may only elucidate truth for decidable arithmetical statements by means of actually performed computations:
[…] it has now become apparent that there is one way in which the thesis that natural numbers are creations of the human mind might be taken, namely as relating precisely to the appropriate notion of truth for decidable statements of arithmetic, which would provide a ground for rejecting a platonistic interpretation of number-theoretic statements generally, without appeal to any general thesis concerning the notion of meaning. This way of taking the thesis would amount to holding that there is no notion of truth applicable even to numerical equations save that in which a statement is true when we have actually performed a computation (or effected a proof) which justifies that statement. Such a claim must rest, as we have seen, on the most resolute scepticism concerning subjunctive conditionals: it must deny that there exists any proposition which is now true about what the result of a computation which has not yet been performed would be if it were to be performed (Dummett 1975, p. 40).
The problem with this “hard-headed” view, according to Dummett, is that it entails a strong form of skepticism about subjunctive conditionals. If we can only explain truth for decidable arithmetical statements by appealing to actual computations, Dummett thinks we necessarily rule out the existence of propositions which are now true about the result of a yet-to-be-performed computation. For example, it is easy to tell now that \(10^{100} + 10^{10} = 10^{10} + 10^{100}\) is true even before performing the calculations involved in the equation. Dummett sees ontological intuitionism as an untenable position because, in his view, it must reject this kind of reasoning about subjunctive conditionals. One might resist Dummett’s argument by noting that actual computations need not play a decisive role in determining truth. If numbers are the creations of human thought, then some truth-makers of propositions may also come into being in time. Van Atten (2022) presents an excellent reconstruction of Dummett’s arguments and launches a defense of the ontological route in this direction.
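Dummett’s example can be put concretely (our illustration in Lean): the equation is known to be true now by instantiating the general commutativity law, without evaluating either of its astronomically large sides.

```lean
-- Known to be true by instantiating commutativity, with no computation of
-- either side of the equation
example : 10 ^ 100 + 10 ^ 10 = 10 ^ 10 + 10 ^ 100 :=
  Nat.add_comm (10 ^ 100) (10 ^ 10)
```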
c. Sense and Reference
Dummett (2021) also considers a sense-reference distinction for constructive semantics. In a nutshell, the basic insight behind his distinction is that the sense of an expression is related to its reference as a program is related to its execution. That is, under an intuitionistic point of view, the sense of an expression must be an effective method for computing its reference, whereas the reference must be what results after the method is fully carried out. The idea has its roots in Dummett’s early commentaries on Frege’s own semantic distinction:
Frege’s argument about identity statements would be met by supposing the sense of a singular term to be related to its reference as a programme to its execution, that is, if the sense provided an effective procedure of physical and mental operations whereby the reference would be determined (Dummett 1978, p. 133).
How to implement such a distinction? Dummett classifies expressions in two main groups, namely, that of predicates, relational expressions, and logical constants, on the one hand, and that of singular terms and functional expressions, on the other. To specify the meaning of predicates, relational expressions, and logical constants we must rely on the notion of proof, but a semantic analysis of singular terms and functional expressions can be done directly without explicit mention of proofs (Dummett 2021, pp. 489-494). Predicate-like functional expressions in the Fregean sense are excluded from the second group. The meaning of the logical constants is fixed by the BHK explanation. Dummett adds that the meaning of a predicate consists in a means of recognizing a proof that the predicate applies to a given object. Dummett relates this conception of meaning to sense and reference as follows:
We grasp the meaning of a predicate when we know how, for any element of the domain over which it is defined, to classify mathematical constructions into those that do and those that do not prove that it satisfies the predicate; and just that principle of classification is the semantic value of the predicate (Dummett 2021, p. 490).
What we grasp when we understand a predicate is its sense. Here “semantic value” is simply Dummett’s word for the intuitionistic counterpart of reference. Dummett concludes from this that there is no room for a sense-reference distinction for predicates, for “the semantic value of the predicate and what we grasp when we understand it are one and the same” (p. 491). He traces this collapse to the fact that the meaning of a predicate is explained in terms of proofs. Given that the same can be said of the meaning of relational expressions and logical constants, there can be no distinction between sense and reference for them either.
What about singular terms and functional expressions? Dummett argues that insofar as we are dealing with natural numbers or other finitely presented mathematical objects, we can distinguish the sense and reference of their corresponding expressions. To illustrate this, he points out that the same natural number may be given in different ways. The number thirteen can be presented to us as \(13\) or \(4 + 3^2\), for example, and the equality \(13 = 4 + 3^2\) may even extend our knowledge, if analytic judgments are allowed to be ampliative. We may thus regard numerals as “aiming at natural numbers by varying routes” (p. 493). Yet, we cannot do the same for expressions for real numbers or other not finitely given mathematical objects. The particular way a real number is given to us might affect what counts as a proof of a proposition about it. As a result, their sense is part of their semantic value:
A real number, such as \(\pi\), can be given in different, even though provably equivalent, ways. For classical mathematicians and constructivists alike, the specific way in which the number \(\pi\) is given affects what is required for a proof of a statement about it, at least until the different possible definitions have been proved equivalent; that is why the sense of an expression, and not just its reference, must be something common to all. In classical semantics, however, the semantic value of a term denoting \(\pi\) will be its contribution to what determines, not what is required to prove a statement in which it occurs, but its truth-value, and that is just the denotation of the term, the number \(\pi\) itself; that is why its sense is not part of its semantic value. In a constructivist meaning-theory, by contrast, the semantic value of the term is, precisely, its contribution to determining what is to count as a proof of any statement in which it occurs; and therefore the way in which the denotation is given to us is an integral ingredient of its semantic value (Dummett 2021, p. 494).
These considerations also extend to functional expressions, since infinite sequences may be regarded as functions from the natural numbers. In sum, for Dummett the intuitionist cannot generally maintain a sense-reference distinction even for singular terms or functional expressions, because of the intensional treatment that infinitary objects receive. A more optimistic rearticulation of Dummett’s semantic views is proposed by Martin-Löf (2021) against the background of constructive type theory. Bentzen (2020, 2022) advances yet another type-theoretic reformulation of the sense-reference distinction. See (Klev 2022) for a comprehensive account of identity in constructive type theory.
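The program/execution analogy behind Dummett’s distinction can be illustrated concretely (our sketch, using Lean’s evaluation of numerals): \(13\) and \(4 + 3^2\) are distinct expressions, that is, distinct senses, or “programs”, which compute to the same value, their common reference.

```lean
-- Two senses, one reference: distinct expressions whose execution yields
-- the same number; the equality is certified by computation alone
example : 4 + 3 ^ 2 = 13 := rfl
```

For a real number such as \(\pi\), no analogous finite evaluation is available, which is Dummett’s reason for folding the sense of such expressions into their semantic value.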
4. Appendix: Intuitionistic Logic
Intuitionistic logic can be roughly viewed as classical logic without the law of excluded middle. Below is a non-exhaustive list of some notable theorems of intuitionistic logic as well as some notable classical theorems that are not provable in intuitionistic logic. Some of them have been important enough in the literature to deserve their own names. Others have been grouped together under a certain theme. If you see an implication but not a biconditional in the intuitionistic theorems, it is typically because the converse fails.
\((\forall x (\phi \lor \psi(x))) \to (\phi \lor \forall x \psi(x))\)
\((\forall x \neg\neg \phi(x)) \to (\neg\neg \forall x \phi(x))\) (double negation shift)
\((\neg\neg \exists x \phi(x)) \to (\exists x \neg\neg \phi(x))\)
\((\neg \forall x \phi(x)) \to (\exists x \neg \phi(x))\) (generalized de Morgan law)
\((\neg \exists x \neg \phi(x)) \to (\forall x \phi(x))\)
\((\neg \phi \to \exists x \psi(x)) \to (\exists x (\phi \lor \psi(x)))\)
\((\forall x \phi(x) \to \psi) \to \exists x (\phi(x) \to \psi)\)
\((\phi \to \exists x \psi(x)) \to \exists x (\phi \to \psi(x))\) (independence of premise principle)
\(\exists x (\phi(x) \to \forall y \phi(y))\) (drinker paradox)
\(\neg\neg \forall x (\phi(x) \lor \neg \phi(x))\)
\(\neg \neg \forall x \forall y (x=y \lor x \neq y)\)
\(\forall x \exists y \phi(x,y) \lor \exists x \forall y \neg \phi(x,y)\)
\(\forall x \phi(x) \lor \exists x \neg \phi(x)\) (principle of omniscience)
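While the schemas above fail intuitionistically, several of their converses, as well as weakened double-negated forms, are intuitionistically provable. The following Lean sketch (ours) verifies three such facts constructively, with no appeal to the excluded middle:

```lean
-- Converse of the generalized de Morgan law: a counterexample refutes ∀
example {α : Type} {φ : α → Prop} : (∃ x, ¬ φ x) → ¬ ∀ x, φ x :=
  fun he h => he.elim (fun a hna => hna (h a))

-- Converse of double negation shift
example {α : Type} {φ : α → Prop} : (¬¬ ∀ x, φ x) → ∀ x, ¬¬ φ x :=
  fun h x hnx => h (fun g => hnx (g x))

-- The double negation of excluded middle, provable propositionally
example {φ : Prop} : ¬¬(φ ∨ ¬ φ) :=
  fun h => h (Or.inr (fun hφ => h (Or.inl hφ)))
```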
5. References and Further Reading
a. Further Reading
(Van Atten 2004) is one of the best entry points into Brouwer scholarship for philosophy students without extensive mathematical training. The canonical introduction to intuitionism and constructivism in general is (Troelstra and van Dalen 1988). For additional sources, philosophers and logicians may consult (Dummett 1977) and (Posy 2020). For historical accounts of Brouwer’s development of his intuitionism, see (van Stigt 1990). More mathematically-oriented sources are (Troelstra 1969), (Moschovakis and Vafeiadou 2020), and (Veldman 2021). (Heyting 1956b) remains a mostly readable introduction to the subject despite its unusual dialogue-driven style. Readers interested in Brouwer’s life can have a look at the biography (van Dalen 2012) as well as the edited collection of Brouwer’s correspondence (van Dalen 2011).
Bentzen, Bruno. 2023b. “Brouwer’s Intuition of Twoity and Constructions in Separable Mathematics.” History and Philosophy of Logic 45 (3): 341–61. https://doi.org/10.1080/01445340.2023.2210908.
Brouwer, L. E. J. 1905. “Leven, Kunst en Mystiek.” Delft: Waltman.
Brouwer, L. E. J. 1914. “A. Schoenflies und H. Hahn, Die Entwickelung der Mengenlehre und ihrer Anwendungen.” In Jahresbericht Der Deutschen Mathematiker-Vereinigung, 23:78–83.
Brouwer, L. E. J. 1918. “Begründung der Mengenlehre unabhängig vom logischen Satz vom ausgeschlossenen Dritten. Erster Teil: Allgemeine Mengenlehre.” Verhandelingen Der Koninklijke Nederlandsche Akademie van Wetenschappen Te Amsterdam 12. https://doi.org/10.1016/B978-0-7204-2076-0.50015-5.
Brouwer, L. E. J. 1925. “Zur Begründung der intuitionistischen Mathematik. I.” Mathematische Annalen 93 (1): 244–57. https://doi.org/10.1007/BF01449963.
Brouwer, L. E. J. 1948b. “Consciousness, Philosophy, and Mathematics.” Proceedings of the 10th International Congress of Philosophy, Amsterdam, 1235–49. https://doi.org/10.1016/B978-0-7204-2076-0.50054-4.
Dummett, Michael. 1975. “The Philosophical Basis of Intuitionistic Logic.” Studies in Logic and the Foundations of Mathematics 80: 5–40. https://doi.org/10.1016/S0049-237X(08)71941-4.
Dummett, Michael. 1977. Elements of Intuitionism. Oxford University Press.
Dummett, Michael. 1980. “Critical Notice: L. E. J. Brouwer, Collected Works. Volume 1, Philosophy and Foundations of Mathematics, edited by A. Heyting, 1975; Volume 2, Geometry, Analysis, Topology and Mechanics, edited by Hans Freudenthal, 1976. Amsterdam and Oxford: North-Holland.” Mind LXXXIX (356): 605–16. https://doi.org/10.1093/mind/LXXXIX.356.605.
Dummett, Michael. 1978. “Frege’s Distinction Between Sense and Reference.” In Truth and Other Enigmas, 116–44. Cambridge: Harvard Univ. Press.
Dummett, Michael. 2021. “Sense and Reference from a Constructivist Standpoint.” The Bulletin of Symbolic Logic 27 (4): 485–500. https://doi.org/10.1017/bsl.2021.60.
Franchella, Miriam. 1994. “Heyting’s Contribution to the Change in Research into the Foundations of Mathematics.” History and Philosophy of Logic 15 (2): 149–72. https://doi.org/10.1080/01445349408837229.
Heyting, Arend. 1958b. “Intuitionism in Mathematics.” In Philosophy in the Mid-Century. A Survey, i Logic and Philosophy of Science, edited by Raymond Klibansky, 101–15. Florence: La Nuova Italia Editrice.
Heyting, Arend. 1958c. “On Truth in Mathematics.” Verslag van de Plechtige Viering van Het Honderdvijftigjarig Bestaan Der Koninklijke Nederlandse Akademie van Wetenschappen Met de Teksten Der Bij Die Gelegenheid Gehouden Redevoeringen En Voordrachten 12: 227–79.
Heyting, Arend, ed. 1975. L. E. J. Brouwer Collected Works 1. Philosophy and Foundations of Mathematics. Amsterdam: North-Holland. https://doi.org/10.1016/C2013-0-11893-4.
Hill, Claire Ortiz. 2010. “Husserl on Axiomatization and Arithmetic.” In Phenomenology and Mathematics, 195:47–71. Dordrecht: Springer. https://doi.org/10.1007/978-90-481-3729-9_3.
Iemhoff, Rosalie. 2024. “Intuitionism in the Philosophy of Mathematics.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta and Uri Nodelman, Summer 2024. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2024/entries/intuitionism/.
Kleene, Stephen, and R. E. Vesley. 1965. The Foundations of Intuitionistic Mathematics, Especially in Relation to Recursive Functions. Amsterdam: North-Holland.
Kreisel, Georg. 1970. “Church’s Thesis: A Kind of Reducibility Axiom for Constructive Mathematics.” In Studies in Logic and the Foundations of Mathematics, 60:121–50. Elsevier.
Kripke, Saul A. 2019. “Free Choice Sequences: A Temporal Interpretation Compatible with Acceptance of Classical Mathematics.” Indagationes Mathematicae 30 (3): 492–99. https://doi.org/10.1016/j.indag.2019.01.003.
Mancosu, Paolo. 1997. From Brouwer to Hilbert: The Debate on the Foundations of Mathematics in the 1920s. Oxford University Pres.
Martin-Löf, Per. 1984. Intuitionistic Type Theory. Vol. 1. Studies in Proof Theory. Lecture Notes. Bibliopolis, Naples.
Martin-Löf, Per. 1985. “On the Meanings of the Logical Constants and the Justifications of the Logical Laws.” In Atti Degli Incontri Di Logica Matematica. Scuola Di Specializzazione in Logica Matematica, 2:203–81. Universitá di Siena. https://ncatlab.org/nlab/files/MartinLofOnTheMeaning96.pdf.
Martin-Löf, Per. 2021. “The Sense/Reference Distinction in Constructive Semantics.” Bulletin of Symbolic Logic 27 (4): 501–13. https://doi.org/10.1017/bsl.2021.61.
Martino, Enrico, and Pierdaniele Giaretta. 1981. “Brouwer, Dummett and the Bar Theorem.” In Atti Del Congresso Nazionale Di Logica, Montecatini Terme, 1–5 Ottobre 1979, edited by Sergio Bernini. Napoli: Bibliopolis.
Martino, Enrico, and Pierdaniele Giaretta. 2018. “Brouwer, Dummett and the Bar Theorem.” Intuitionistic Proof Versus Classical Truth: The Role of Brouwer’s Creative Subject in Intuitionistic Mathematics, 1–14. https://doi.org/10.1007/978-3-319-74357-8_1.
McCarty, Charles. 1987. “Variations on a Thesis: Intuitionism and Computability.” Notre Dame Journal of Formal Logic 28 (4): 536–80.
McCarty, Charles, Stewart Shapiro, and Ansten Klev. 2023. “The Axiom of Choice Is False Intuitionistically (in Most Contexts).” Bulletin of Symbolic Logic 29 (1): 71–96. https://doi.org/10.1017/bsl.2022.22.
Placek, Tomasz. 1999. Mathematical Intuitionism and Intersubjectivity: A Critical Exposition of Arguments for Intuitionism. Vol. 279. Springer Science & Business Media. https://doi.org/10.1007/978-94-015-9315-1.
Prawitz, Dag. 1971. “Ideas and Results in Proof Theory.” In Proceedings of the Second Scandinavian Logic Symposium, edited by J. E. Fenstad, 63:235–307. Studies in Logic and the Foundations of Mathematics. Elsevier. https://doi.org/10.1016/S0049-237X(08)70849-8.
Prawitz, Dag. 1973. “Towards a Foundation of a General Proof Theory.” In Proceedings of the Fourth International Congress for Logic, Methodology and Philosophy of Science, Bucharest, 1971, edited by Patrick Suppes, Leon Henkin, Athanase Joja, and Gr C. Moisil, 74:225–50. Studies in Logic and the Foundations of Mathematics. Elsevier. https://doi.org/10.1016/S0049-237X(09)70361-1.
Prawitz, Dag. 1987. “Dummett on a Theory of Meaning and Its Impact on Logic.” In Michael Dummett: Contributions to Philosophy, edited by Barry M. Taylor, 117–65. Dordrecht: Springer Netherlands. https://doi.org/10.1007/978-94-009-3541-9_5.
Rosado Haddock, Guillermo. 2010. “Platonism, Phenomenology, and Interderivability.” In Phenomenology and Mathematics, 195:23–46. Dordrecht: Springer. https://doi.org/10.1007/978-90-481-3729-9_2.
Sundholm, Göran. 2004. “Antirealism and the Roles of Truth.” In Handbook of Epistemology, edited by I. Niiniluoto, M. Sintonen, and J. Woleński, 437–66. Netherlands: Kluwer Academic Publishers.
Sundholm, Göran, and Mark van Atten. 2008. “The Proper Explanation of Intuitionistic Logic: On Brouwer’s Demonstration of the Bar Theorem.” In One Hundred Years of Intuitionism (1907–2007) the Cerisy Conference, 60–77. Birkhäuser Basel. https://doi.org/10.1007/978-3-7643-8653-5_5.
Tait, William W. 2015. “Gentzen’s Original Consistency Proof and the Bar Theorem.” Gentzen’s Centenary: The Quest for Consistency, 213–28. https://doi.org/10.1007/978-3-319-10103-3_8.
Tieszen, Richard. 1995. “What Is the Philosophical Basis of Intuitionistic Mathematics?” In Studies in Logic and the Foundations of Mathematics, 134:579–94. Elsevier. https://doi.org/10.1016/S0049-237X(06)80064-9.
Tieszen, Richard. 2008. “The intersection of intuitionism (Brouwer) and phenomenology (Husserl).” In One Hundred Years of Intuitionism (1907–2007), 78–95. Springer. https://doi.org/10.1007/978-3-7643-8653-5_6.
Troelstra, Anne S. 1969. Principles of Intuitionism. Vol. 95. Springer Lecture Notes. Berlin–Heidelberg–New York: Springer.
Troelstra, Anne S. 1977. Choice Sequences: A Chapter of Intuitionistic Mathematics. Clarendon Press Oxford.
Troelstra, Anne S., and Dirk van Dalen. 1988. Constructivism in Mathematics. Vol. I. Vol. 121. Studies in Logic and the Foundations of Mathematics. Amsterdam: North-Holland.
van Atten, Mark. 2004. On Brouwer. Belmont, CA: Wadsworth/Thomson Learning.
van Atten, Mark. 2006. Brouwer Meets Husserl: On the Phenomenology of Choice Sequences. Vol. 335. Springer Science & Business Media. https://doi.org/10.1007/978-1-4020-5087-9.
van Atten, Mark. 2017. “Construction and Constitution in Mathematics.” In Essays on Husserl’s Logic and Philosophy of Mathematics, 384:265–315. Dordrecht: Springer. https://doi.org/10.1007/978-3-319-10031-9_12.
van Atten, Mark. 2018. “The Creating Subject, the Brouwer–Kripke Schema, and infinite proofs.” Indagationes Mathematicae 29 (6): 1565–1636. https://doi.org/10.1016/j.indag.2018.06.005.
van Atten, Mark, and Göran Sundholm. 2017. “L. E. J. Brouwer’s ‘Unreliability of the Logical Principles’: A New Translation, with an Introduction.” History and Philosophy of Logic 38 (1): 24–47. https://doi.org/10.1080/01445340.2016.1210986.
van Atten, Mark, and Dirk van Dalen. 2002. “Arguments for the Continuity Principle.” Bulletin of Symbolic Logic 8 (3): 329–47. https://doi.org/10.2178/bsl/1182353892.
van Dalen, Dirk. 2012. L. E. J Brouwer Topologist, Intuitionist, Philosopher : How Mathematics Is Rooted in Life. Springer. https://doi.org/10.1007/978-1-4471-4616-2.
van Dalen, Dirk. 2005. Mystic, Geometer, and Intuitionist: The Life of LEJ Brouwer. Volume 2: Hope and Disillusion. Oxford: Clarendon Press.
Walter P van Stigt. Brouwer’s intuitionism. North-Holland Amsterdam, 1990.
Veldman, Wim. 2021. “Intuitionism: An Inspiration?” Jahresbericht Der Deutschen Mathematiker-Vereinigung 123: 221–84.
Susan Stebbing was the first female professor of philosophy in the United Kingdom, a founder of the journal Analysis, and president of the Mind Association and the Aristotelian Society.
One of her most notable academic achievements was her development of precise directional analysis, which begins with our common-sense beliefs and breaks them down into their component parts to help us better understand what the world would be like if those beliefs were true. Stebbing held that several philosophers of her day employed this type of analysis, but without an appropriately precise system to support its practice. She offered it as an alternative to the approach of the logical positivists, who began with our basic concepts as a foundation and built on them. Stebbing saw their approach as backward, arguing that we should instead start with what we know to be true and break those ideas down.
Stebbing had a wide variety of philosophical concerns beyond analysis and wrote books and articles on several topics. These interests include logic, where she wrote, for example, A Modern Introduction to Logic, the first textbook to take contemporary work in the field seriously. They also include the philosophy of science, where she published Philosophy and the Physicists, a book intended to show readers that two popular scientists of the time were misusing language and misleading their readership. Her dissertation was on the subject of truth.
Stebbing was also concerned that advances in logic and logical tools were locked away in “the ivory tower”, spending much of her career trying to bring them to the general public. She wrote several books on how to use logic to improve reasoning. One in particular, Thinking to Some Purpose, based on a series of lectures for BBC radio, achieved significant popularity. Stebbing’s work was instrumental in helping non-academic audiences gain access to important academic work and become better reasoners. She was motivated by the need to help people see through biased media and to reason properly about the information they were being given.
What follows presents Stebbing’s work thematically. It starts with her early work and her textbooks, then considers her philosophical work for an academic audience, consisting mostly of contributions on the role of analysis in philosophy. Finally, it examines her work aimed at a general audience, on clear language and on critical thinking in physics and everyday life.
Lizzie Susan Stebbing was born in 1885, but she preferred using her middle name “Susan”. Her published names include L.S.S., L.S. Stebbing, L. Susan Stebbing, and Susan Stebbing.
Stebbing was the youngest of six children and was orphaned at an early age. She read for her exams at Girton College, Cambridge, initially in history and eventually in moral sciences. She completed her master’s degree at the University of London in 1912. Stebbing spent most of her career at Bedford College, University of London, where she started as a part-time lecturer in 1915 and ended as the first female professor of philosophy in Britain. She served as president of the Aristotelian Society and the Mind Association, and was one of the founding members of the journal Analysis. She died of cancer in 1943.
Stebbing faced significant adversity throughout her life. She had Meniere’s disease, which would occasionally cause her to be on bed rest for extended periods. On top of that, being a woman in academia was unusual at this time, so much so that when she became a professor at Bedford, it was newsworthy enough that several papers and periodicals ran articles on her promotion. Siobhan Chapman (2013) comments: “[W]omen were by now an established presence, although certainly a minority one, in academia, but their place there was hard-won and still controversial”. Despite these adversities, Stebbing established herself as a well-respected philosopher.
During her lifetime, Stebbing was best known as the author of A Modern Introduction to Logic, a textbook written for students taking exams and for those wanting an introduction to the newest advances in logic. The book includes several diverse chapters, including one on the Aristotelian syllogisms and one on Bertrand Russell’s theory of definite descriptions. Stebbing wrote several other books on logic as well: A Modern Elementary Logic—a condensed and heavily revised version of A Modern Introduction to Logic; Logic in Practice—a text about clear thinking aimed at a general audience; and Thinking to Some Purpose—the book for which she is best known today. The aim of Thinking to Some Purpose is to inform the general public about critical and careful thinking. It reads like a how-to guide, starting with common-sense beliefs and teaching the reader how to think critically and carefully about examples in politics, advertising, and the media. Stebbing’s use of real everyday examples from newspapers, advertisements, and the radio was novel at the time.
Stebbing wrote two other books on critical thinking. In Philosophy and the Physicists, she argues that some scientists’ use of everyday language to describe complex phenomena has misled the public. In Ideals and Illusions, she purports to show that reason and careful critical thinking can ameliorate the human condition by “delivering humans from evil and the conditions resulting from it”. She wrote at least forty-nine papers, including “The Method of Analysis in Metaphysics”, where she provides a groundbreaking precisification of the type of analysis typically used by the Cambridge Analysts.
Stebbing was a straightforward, no-nonsense person concerning the work of others, as well as her own. Despite being harsh in her criticisms, she was also quick to praise the successes of other philosophers. Highly self-critical, she makes several references in her work to previous unclarities and often attempts to re-explain herself. Frederique Janssen-Lauret (2019) argues that Stebbing’s tendency to self-criticize has led some philosophers to dismiss her importance. This is unfortunate, as Stebbing’s work is original and insightful. We should also note that this self-criticism embodies her philosophy: resolving ambiguities wherever she sees them.
Stebbing’s abilities were well respected in her time. When G.E. Moore retired from Cambridge, Stebbing was put in competition with Ludwig Wittgenstein as a potential successor to Moore’s chair. Ultimately, she did not get the job: Wittgenstein was too renowned at the time, and once he applied, no other candidate stood a chance. That she was asked to apply in competition with Wittgenstein speaks to her prominence in the field. This is especially true given the pervasiveness of sexism in philosophy. For example, she recalls a comment by Gilbert Ryle: “Of course everyone thinks you are the right person to succeed Moore, except that you are a woman” (Chapman 2013).
Stebbing published many book reviews. At least eighty have been located thus far, but as Chapman (2013) notes, Stebbing makes mention of several others in her correspondence that have yet to be located. Given the amount of work this requires, she would have had to review a new book every four months of her career on average. Interestingly, Philosophical Studies: Essays in Memory of L. Susan Stebbing, makes no mention of this in the bibliography of her work, suggesting that this particular contribution was overlooked even in her own time.
2. Bergson and the Pragmatists
Stebbing began her academic career as a historian, but when she came across Bradley’s Appearance and Reality, she decided to pursue Moral Philosophy instead (Chapman, 2013). After she completed her history exams, she turned her attention to the exams for Moral Science and completed her graduate work at King’s College London.
Stebbing’s dissertation, Pragmatism and French Voluntarism, focuses on the topic of truth as conceived by the Bergsonian intuitionists and the pragmatists. Roughly, the Bergsonian intuitionists hold that our instincts are more basic than our rational thoughts (the reverse of, for example, Descartes’s view). The pragmatists, alternatively, argue that a claim is true only if it is useful and that the meaning of a proposition is determined by the consequences of accepting it as true.
Stebbing disagreed with the notion of truth proposed by the Bergsonian intuitionists and pragmatists, claiming both schools failed due to similar, though diametrically opposed, mistakes. According to Stebbing, neither the Bergsonian intuitionists nor the pragmatists distinguish the nature of truth from how we determine what is true. Stebbing argues that one can tell whether something is true without a conception of the nature of truth, and, additionally, one can have a conception of the nature of truth without being able to tell whether something is true. I can know, for example, that it is true that pigs are mammals without any deep knowledge of the nature of truth. I can also know things about the nature of truth without being able to tell whether any particular proposition is true in the actual world.
In Stebbing’s view, developed in her dissertation, these resulting theories of truth are in opposition. Bergson and the French Voluntarists identify “truth with reality”, and think the only thing that matters is truth’s nature. The pragmatists “identify truth with one of its consequences” and think the only thing that matters is how you determine whether a proposition is true, and hence the consequences of a proposition.
The Bergsonians conclude that we do not need to pursue absolute truth, as it is an affair of the intellect, and the intellect is inadequate for philosophy. The Bergsonians hold that intellect is only capable of analysis and cannot introduce any new ideas onto the scene. New ideas are essential to philosophy, they claim, and so the intellect is not enough for philosophical pursuits.
Stebbing argues that the pragmatists’ conclusion is that truths must be useful. This claim started an argument with F.C.S. Schiller, who accused Stebbing of conflating the slogan “all that works is true” with the slogan “all truths work” (“work” likely means “useful” in this case). Schiller held the former, but not the latter. Stebbing insisted that the pragmatist must hold both, and she and Schiller exchanged papers on the topic. Schiller may have been right in this case. Per Capps (2019), William James was the only pragmatist to relate truth and usefulness in this way, and even then James defined truth in terms of utility, so it is controversial to say that he held that a belief is true when it matches reality. It may be that although “all that works is true”, it is not the case that “all truths work”: usefulness may be a sufficient but not a necessary condition for the pragmatists. Although Stebbing may have erred in judging that the pragmatists held such a strong relationship between truth and usefulness, it is still notable that she was taken seriously by a major contemporary like Schiller.
3. Textbooks and The New Logic
Stebbing wrote A Modern Introduction to Logic as well as a condensed and heavily revised version of the same textbook, A Modern Elementary Logic. Beaney (2017) considers the former to be the first textbook on analytic philosophy.
Unlike other textbooks of the time, Stebbing’s are notable in that they cover the traditional logical topics, such as the syllogisms, as well as what was being called the “new logic”, in addition to a host of other problems in analytic philosophy. The “new logic”, as discussed by Stebbing, began with Gottlob Frege and was essential to the work of Bertrand Russell and Alfred Whitehead in their Principia Mathematica. Her discussions of the new logic included speculation on where traditional syllogistic logic failed, along with discussions of relations, classes, definite descriptions, the universal and existential quantifiers, variables, and material implication. Both textbooks also dealt with Mill’s methods and the methodology of science in general, which would today be categorized as philosophy of science. In this sense, they covered much more ground than their precursors.
Both Russell and Whitehead were contemporaries of Stebbing. She often discussed Whitehead’s philosophy of science (see section 5 and Janssen-Lauret (2019)) and was deeply engaged with Russell’s work (for a sample, see her works of 1917; 1931; 1933a; 1934b). Her engagement with both Russell and Whitehead was often quite critical. For example, she objected to logical atomism (1930a, Janssen-Lauret (2017)).
Stebbing also spent time defending the so-called “new logic”, especially from those who took reviews of her textbooks as opportunities to criticize the program. She engaged in a protracted exchange in Mind with H.W.B. Joseph, an Oxford philosopher and defender of Aristotelian logic, over whether the new logic was coherent (Joseph, 1932; Stebbing, 1933b; Joseph, 1933; Stebbing, 1934a; Joseph, 1934). Joseph claimed that Stebbing’s explanation of variables, and hence the use of variables in the new logic, was ill-founded and illegitimate. True to form, Stebbing complained in return that both she and Joseph lacked clarity in their language and were talking past each other.
4. Analysis
Stebbing’s work in metaphysics focused on using analysis in philosophy. Her most famous papers on the topic are “The Method of Analysis in Metaphysics” and “Logical Positivism and Analysis”. In these two papers, she argues that directional analysis serves metaphysics better than other forms of analysis and she gives it a precise set of rules.
Stebbing also had interests in and published work on logical constructions, the a priori, and the philosophy of language more generally (see, respectively, 1934b, 1933c, and 1939a for examples). The breadth of her contributions to metaphysics and analysis is wide. However, this section focuses on directional analysis and on the systems of analysis that Stebbing held were less suitable for metaphysics.
a. Non-Directional Analysis
Stebbing articulates three different types of analysis in “Logical Positivism and Analysis” and a fourth in “The Method of Analysis in Metaphysics”, all of which she contrasts with directional analysis. We will consider deductive metaphysics and postulational analysis in some detail, and then look briefly at analytic definition of a symbolic expression and analytic clarification of concepts, before articulating the details of directional analysis itself in the next subsection.
Deductive metaphysics engages with the world by starting with the ultimate nature of reality and making conclusions about how the world seems to us from this ultimate nature. Postulational analysis begins by setting up a deductive system, logic, or axioms, and making deductions about apparent reality from there. Both systems approach philosophical activity in the same way. First, they start with a set of basic things or definitions, and then they turn to the world to see how it looks in light of the definitions and/or basic objects. Stebbing claims that both systems go the wrong way—the philosopher must start with the world of common sense, rather than what is ultimate or basic.
“The Method of Analysis in Metaphysics”, one of Stebbing’s most highly regarded papers of her own time, focuses on deductive metaphysics. A deductive view of metaphysics requires:
(1) That the metaphysician be concerned with Reality;
(2) that Reality have an ultimate, as distinct from an apparent, nature;
(3) that metaphysics give us knowledge of this ultimate Reality.
Stebbing claims that in being concerned with Reality (capital “R”), a deductive metaphysician wishes to provide new facts about Reality rather than looking into the structure of facts we can access through our common-sense investigations. In “The Method of Analysis in Metaphysics”, she claims proper “metaphysics does not consist in creation but investigation”. The claim that good metaphysics would generate a new set of facts about Reality amounts to the claim that there is something “hidden” about the nature of the world, or something we cannot see with common sense. Since, for Stebbing, metaphysics investigates what we can already access through common sense rather than creating new facts, metaphysics cannot be deductive. Stebbing puts J.M.E. McTaggart squarely in the deductive metaphysics camp. She asserts that he starts with what is ultimate and then tries to deduce what is apparent from this ultimate nature.
“Logical Positivism and Analysis”, a paper Stebbing read to the British Academy in 1933, focuses on postulational analysis. Postulational analysis, she argues, “is the kind of analysis used in the construction of a deductive system”. It is the type of analysis that occurs when someone constructs a deductive system, or logic, and tries to use it to analyze the world. One of the aims of postulational analysis is to do the most with the fewest assumptions possible, even when this means the resulting system does not accord at all with common sense. She uses Rudolf Carnap’s Der Logische Aufbau der Welt as an example of this type of pursuit. Carnap starts with specific types of sense data and attempts to derive the world from that data logically. Though this may be an interesting linguistic project, Stebbing claims it is not the appropriate aim of metaphysics. For her, pursuing this type of analysis makes the grave mistake of treating something vague as precise and attempting to fit it into a deductive system.
Stebbing holds that Wittgenstein is doing this type of analysis in the Tractatus. In “Logical Positivism and Analysis”, she claims that Wittgenstein, and the logical positivists generally, employ postulational analysis:
[T]hey suppose that the first problem of philosophy is to determine the principles of symbolism, and from these principles to draw limits with regard to what we can think. This assumption has two important consequences. First, it leads to the view that philosophy is “the activity of finding meaning” . . . The second consequence is that they are apt to place too much reliance upon the construction of postulational systems.
In the same lecture, Stebbing argues that the issue of treating philosophy as the activity of finding meaning arises because (at least as she interprets Schlick), “the pursuit of meaning precedes the determination of the truth or falsity of a proposition”. For Stebbing, though, this simply gets things backwards:
Understanding more or less unclearly what we say, we nevertheless may know that what we say is true. We then inquire what must be the case if what we have said is true. In this way, we come to see more clearly what we were knowing.
We start with some level of understanding of the meaning of what we have said and knowledge of its truth. We then proceed to learn more about it, and, presumably, understand it better.
The second of these issues, an over-reliance on postulational analysis, leads Wittgenstein and the positivists to misapply a truth about a word to the object itself. For example, Stebbing claims that the positivists “regard tables . . . as constructs of the given”. Stebbing insists that it is correct to say that “tables are logical constructions”, but that it is a mistake to think that this statement is about what “table” refers to, rather than the word “table” itself: “[W]e are saying something about the way in which the word ‘table’ may be used in ordinary sentences; we are not saying that what the word ‘table’ refers to is itself a construct”. The tendency of Wittgenstein and the logical positivists to place too much reliance on postulational systems makes them likely to make this mistake because they take the postulational system as basic, rather than the tables themselves. Thus, it is a bad tendency.
Deductive metaphysics and postulational analysis commit the same error, according to Stebbing. They first create a system or set of definitions or the ultimate set of things and then turn to the world to apply their system. They both start with a definition and then seek out things in the world that meet that definition, rather than starting with what we already know and then seeking out what the components of those facts have in common.
When discussing directional analysis, Stebbing provides examples such as “Everyone on the committee is a professor” (see A Modern Introduction to Logic). There are two candidates for a postulational or deductive analysis of this sentence.
The first is to start by looking for definitions of complex phenomena. One way a deductive or postulational analyst might proceed is to start by looking for the basic building blocks and their composition relations. Assume, for example, that the basic building blocks are atoms and their composition relations are chemical bonds. We would then have to piece together the people on the committee out of these atoms and chemical bonds, eventually perhaps adding something basic for their minds. Only once we have the people can we go on to think about the committee and the nature of what it is to be a professor (which will assuredly require other complex concepts that we will have to break down, like university).
The second candidate for analysis is purely logical, something like (where C is “on the committee” and P is “professor”):
∀x(Cx → Px)
Though this type of analysis teaches us about the structure of the sentence, it does not help us see more clearly why or how it might be true. This is especially the case since there is no hidden structure, like the hidden structure in Russell’s definition of definite descriptions. For Stebbing, this type of analysis is not focused enough on showing us what the world would be like if our sentence were true. She does hold that there is a time and place for such analyses, but that these types of analyses are not the main goal of metaphysics.
Stebbing also considers what she calls “analytic definition of a symbolic expression” and “analytic clarification of concept” in “Logical Positivism and Analysis”. Stebbing defines the former for sentences:
“E′ is an analysis of E” is to be defined as follows: (i) E′ says what E says; (ii) if ‘a’ is a symbol occurring in E, then what ‘a’ refers to is not less distinctly referred to in E′, and there is some symbol ‘b’ occurring in E′ but not occurring in E.
So, E′ is an analysis of E when it says the same thing as E, refers at least as distinctly to the same things as E, and contains at least one symbol that E does not.
As an example of this type of analysis, Stebbing uses Russell’s analysis of definite descriptions. Russell (and Stebbing) analyzes the proposition “The author of Waverley is Scotch”, but we may as well use “The author of Thinking to Some Purpose is a woman”. Russell provides an analysis whereby “the author of Thinking to Some Purpose” is interpreted as a definite description, meaning that its referent is required to exist and to be unique for propositions containing it to be true. So, “the author of Thinking to Some Purpose is a woman”, upon analysis, becomes “there is exactly one author of Thinking to Some Purpose, and that author is a woman”. This new sentence, it is claimed, says the same thing as “the author of Thinking to Some Purpose is a woman”, makes reference to the same things (Thinking to Some Purpose and its author), and contains at least one symbol that the original sentence does not (a claim about existence, sometimes notated as ∃).
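The analyzed sentence can also be displayed in the quantifier notation used for the committee example above. Taking, purely for illustration, A for “is an author of Thinking to Some Purpose” and W for “is a woman”, Russell’s analysis reads:

∃x(Ax ∧ ∀y(Ay → y = x) ∧ Wx)

The three conjuncts correspond to the existence claim (there is an author), the uniqueness claim (anything that authors the book is identical to that author), and the predication (that author is a woman).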
Analytic definition of a symbolic expression could “clarify our thoughts . . . because in using ‘E′’ we may understand more clearly what we were saying when we used ‘E’ ”. In effect, producing an analytic definition of a symbolic expression amounts to producing a definition. It may also be a candidate for the grammatical analysis that Janssen-Lauret (2019) proposes as part of the solution to the paradox of analysis.
Analytic definition of a symbolic expression is a useful tool, but it is not enough to accomplish metaphysics properly, in Stebbing’s sense. Though it can sometimes help us see more clearly what the world would be like if a particular sentence were true, a definition is not always helpful in this way. For Stebbing, we can define terms when we know what they refer to or when we can use them “significantly in combination with other words” (see A Modern Introduction to Logic). But neither of these guarantees that a definition will help us see more clearly what our beliefs imply about the world, especially if we are producing a new and novel definition that does not match anything actual. So, analytic definition of a symbolic expression is not a type of analysis that will suit the purpose of metaphysics.
An analytic definition of “Everyone on the committee is a professor” might try to analyze the term “professor”. This could result in a sentence of the following form: “Everyone on the committee is a teacher of the highest academic rank at a university”. This analysis presumably says the same thing and refers to the same things as the original, and contains at least one symbol that the original does not (“rank”, for example). We learn something here about the logical or grammatical structure of the sentence.
Analytic clarification of concepts is far less systematic and may not result in a clarification of our thoughts. It “consists in the elimination of the elements supposed to be referred to whenever we use a symbol ‘S’, but which are not such that these elements must be referred to whenever we so use a sentence containing ‘S’ that the sentence says what is true”. Thus, a symbol is an analytic clarification of a concept “S” when it does not refer to the things that S mistakenly refers to but is in all other respects the same as S. She gives as examples of this type of analysis the clarifications of “mass”, “force”, and “simultaneity”. Her main target appears to be those who say something close to the truth but lack a clear understanding of the concept in question. Eric Schliesser argues that analytic clarification of concepts is “the effect of scientific development”. This would mean that analytic clarification of concepts occurs when we must adjust our ordinary beliefs in light of our scientific practices.
Stebbing relies on the example of simultaneity in “Logical Positivism and Analysis”. Prior to Einstein’s discovery that simultaneous events are not really simultaneous in the sense that we had thought, “we thought we knew quite well what was meant by saying ‘happening at the same time in London and New York’”. However, upon Einstein’s discovery, “we did not know quite well what we meant; we now understand that what we thought to be essential is not so”. In effect, we learn that we were systematically mistaken. Analytic clarification of concepts is supposed to rectify these situations.
Again, analytic clarification of concepts is not a strong enough tool to completely accomplish the goal of metaphysics. Although this may sometimes help clarify our beliefs, it can also have a distorting effect. Stebbing herself admits it may not always clarify our thoughts. Scientific developments can be complex and do not always help us see things more clearly, at least not without significant work to explain them.
An analytic clarification of concepts for the sentence “Everyone on the committee is a professor” is more complicated, since we do not yet know what we are confused about. Suppose, though, that the committee is open to professors of any rank. It might be that someone could utter the sentence and assume that “professor” meant only “full professor”, and excluded assistant and associate professors. Then, we might analytically clarify the sentence as “everyone on the committee is an assistant, associate, or full professor.” The person uttering the original might still have said something true, despite not quite knowing why they said something true, but now we are in a better position to see the actual extent of what the original sentence means.
b. Directional Analysis
Directional analysis is, very roughly, a form of analysis that starts with our common-sense beliefs and breaks them down into what they are made of. It thus allows us to see more clearly what the world must be like for our beliefs to be true. Stebbing claims that directional analysis is the type of analysis that Moore uses.
In “The Method of Analysis in Metaphysics”, she argues that “the aim of metaphysics is to reveal the structure of that to which reference is made in true statements”. Metaphysics does not reveal new facts about the world but rather teaches us about the structure of such facts. In “Logical Positivism and Analysis”, she maintains that the purpose of analysis is “to enable us to understand something more clearly”. This is a rough restatement of a claim she makes in “The Method of Analysis in Metaphysics”, where she insists that metaphysics needs to investigate:
(i) [W]hat exactly we are believing when we believe that there is a table in this room, that it was here three hours ago, and so on;
(ii) how our various beliefs are inter-related;
(iii) how our inconsistent beliefs may be adjusted, and which should be rejected.
Common sense is at the core of her approach to analysis, as expressed by her claim that we learn more about our common-sense beliefs through analysis:
We must begin with common-sense facts, such as I see this candle, or This blow on his head killed this man, or Her remarks made him angry. It is useless first to define “material thing,” or “cause,” and then to ask whether the terms so defined are exemplified in the world. Yet this is what the deductive metaphysician does, unless he takes the easier course of defining the terms, and then ruling out whatever does not conform to the definition as ‘mere appearance’.
To accommodate this intuition, and in response to the failures of the previous approaches, Stebbing attempts to make directional analysis precise. In “Logical Positivism and Analysis”, she claims:
A directional analysis of a sentence ‘S’ consists of a set of steps such that (i) each step results in a sentence (to be called ‘a resultant’) which is such that this sentence reveals more clearly the multiplicity of the fact (expressed both by ‘S’ and by the resultant) so that the resultant shows more clearly the structure of the fact expressed; and (ii) if the analysis were completed, the final resultant would have the same multiplicity as the fact expressed by ‘S’ and by the resultant at each step. Thus the final resultant would reveal the form, the elements, and the mode of their combination.
The multiplicity of a fact is a function of the number of components involved and the relationships between them. A directional analysis of a sentence more clearly demonstrates what the world is like if that sentence is true, and, if completed, results in a perfect picture of the facts involved in making the sentence true. It shows us how its precise elements are put together, and what its truthmakers must be.
Directional analysis can do things that the three other types of analysis Stebbing presents cannot. Stebbing expresses this clearly in “The Method of Analysis in Metaphysics”:
We understand more or less clearly. In the endeavour to understand more clearly we use words and sentences, and then reflect upon how we are using them, and whether we are so using them as to say what is true, or what might have been true although it happens to be false. Such reflection is required in the case of directional analysis.
Directional analysis, then, can show us what the world would be like if our beliefs were true, whereas the other types of analysis cannot (or at least do not have this as their primary purpose). It does not make the same mistakes that she accuses postulational analysis of making. Directional analysis starts with the goal of analyzing our common-sense beliefs. It has the right aim for metaphysics and does not do things backwards.
Turning to more details of the system, in “Substances, Events and Facts”, Stebbing claims that there are two presuppositions that the analysis of facts rests on:
(1) [T]hat some propositions can be known to be true; (2) that directional analysis is possible.
Here, we see the initial ideas that support Stebbing’s further discussions of directional analysis. We must be able to know that some propositions are true, and it must be the case that directional analysis is possible. These presuppositions play a major role in her most sustained articulation of directional analysis, “The Method of Analysis in Metaphysics”.
In “Substances, Events and Facts” she provides further details about what it takes for an analysis to be directional:
To say that the analysis of F is directional is to say that if F be analysed into a, b, c, then a, b, c, are on a lower level than F; and if a be analysed into a1, a2, then a1, a2 are on a lower level than a. The relation being on a lower level than is clearly asymmetrical and transitive. To say that a is on a lower level than F is to say that a is in some sense simpler than F.
This is one of Stebbing’s first definitions of directional analysis in print. We can see her emphasis on simplicity. We move from more complex facts to simpler ones, which allows us to learn “the structure of that to which reference is made in true statements”.
In “The Method of Analysis in Metaphysics”, Stebbing defines a level as “a Resultant in which every factor [configured element] has been carried to the same degree of analysis”. A Resultant (R) is any completed step in an analysis procedure, and is of higher order than another (R’) when it contains “configured elements, some of which are either replaced by unconfigured elements in the next step or are replaced by unconfigured elements in a fewer number of steps from R’ than is required to replace any configured element in R”.
A configuration of elements is a fact, for Stebbing, and some facts may contain other configurations as elements. In other words, some facts may have subfacts. Simple facts are configurations that contain no configured elements. If a fact contains a configured element, it is more complex than the configuration it contains. Thus, for Stebbing, a fact is of lower level than another when it contains fewer subfacts as elements.

Stebbing gives an example of the difference between same-level analysis and lower-level analysis in “Directional Analysis and Basic Facts”. She holds that one same-level analysis of “A and B are first cousins” is “One parent of A is a brother or a sister of one parent of B” because it is definition-like. Further, she holds that ordinary scientific definitions, and an analysis of “Every economist is fallible” as “Something is both an economist and fallible and it is not the case that something is an economist and not fallible”, are same-level analyses. On the other hand, an analysis of a sentence about a committee into a sentence about individuals is directional. If we further analyze a sentence about an individual into a sentence about “bodily and mental states”, then we again have a directional analysis. This marks the difference in level, for Stebbing. Sentences analyzed at the same level make reference to exactly the same things as the original sentence, but a directional analysis does not. For Stebbing, “the direction is from higher-level non-basic facts to lower-level non-basic facts, and finally to basic facts”.
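Stebbing’s same-level rendering of “Every economist is fallible” can be displayed in first-order notation. As a rough sketch (the predicate letters are ours, with E(x) read as “x is an economist” and F(x) as “x is fallible”):

```latex
\exists x\,\bigl(E(x) \land F(x)\bigr) \;\land\; \lnot\,\exists x\,\bigl(E(x) \land \lnot F(x)\bigr)
```

Notice that the analysis still quantifies over exactly the same things as the original sentence, which is what marks it as same-level rather than directional.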
In “Some Puzzles about Analysis”, Stebbing claims that John Wisdom’s distinction between same-level and new-level analysis is the same as her distinction between same-level and directional analysis. She reiterates this claim in her contribution to Moore’s Schilpp volume:
Mr. John Wisdom . . . at one time laid great stress upon what I call directional analysis, which he re-named ‘new level analysis’. His proposal [is] to analyse ‘nation-sentences’ (e.g., ‘Germany hates Russia’) into ‘individual sentences’ (e.g., ‘Fritz hates Ivan’), and these again into sense-data sentences.
Stebbing disagrees that sense-data sentences mark the end point of directional analysis, since they are the result of a process of abstraction and thus not basic facts. But she does seem to approve of Wisdom’s new-level proposal to analyze “nation” into “individual” (Wisdom, 1934). At the most basic level, for Stebbing, we have basic facts, or “absolutely specific facts” (see “The Method of Analysis in Metaphysics”). Rather than sense-data, she suggests absolutely specific facts might be about absolutely specific shades of color, tastes, or sounds, though she is far from committed to exactly what they are. Since she later repudiates basic facts, this noncommittal stance is not surprising.
A step in an analysis of a sentence is of a lower level than the original when it is simpler, in the sense that it contains fewer configured elements. In more colloquial terms, this means it breaks the original down into its component parts.
A directional analysis of “Everyone on the committee is a professor” is accordingly quite different from the other types of analysis. A candidate analysis is:
Person 1 is a professor, and Person 2 is a professor, and…
where “Person n” is the name of the nth committee member. In this way, we reduce the complexity of the components involved in the original statement, by eliminating one concept from it and replacing it with less complex ones. We have eliminated
“committee” in favor of the people involved, thereby reducing the proposition originally about a committee to the members of that committee. This, then, may serve as the first step in a directional analysis. We can go further here, too. The next step in this analysis might be something like:
These bodily and mental states make up a professor, these other bodily and mental states make up a professor…
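The first step of this analysis can be put schematically. Writing C(x) for “x is on the committee” and P(x) for “x is a professor” (our labels, for illustration), and letting p₁, …, pₙ name the committee members, the step takes us from a quantified sentence about the committee to a conjunction about individuals:

```latex
\forall x\,\bigl(C(x) \rightarrow P(x)\bigr) \;\longmapsto\; P(p_1) \land P(p_2) \land \cdots \land P(p_n)
```

The predicate C, and with it the concept “committee”, has been eliminated in favor of reference to the individual members, which is what makes the step directional rather than same-level.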
Very roughly, directional analyses break things down into what they are made of. One might wonder whether Stebbing is just renaming Russell’s logical atomism. In fact, there is good reason to think that she repudiated atomism, as she discusses several objections to the position over the course of her career (see Janssen-Lauret, 2017, for more details).
In her pursuit of providing a more formal set of rules for directional analysis, Stebbing defines a “fact” and some surrounding concepts in “The Method of Analysis in Metaphysics”. For Stebbing, a fact is “a configuration of elements”. She takes this to be equivalent to Wittgenstein’s claim that a fact is “a set of elements arranged in a certain way.” Facts (or configurations) may be elements in other configurations. A configuration containing no other configurations is a basic fact (or, as above, a fact with no subfacts is a basic fact). A resultant (little “r”) is the final step in an analysis of a fact, and a Resultant (capital “R”) is any completed step in the analysis. One Resultant is of higher order than another when it contains more configured elements.
Directional analysis rests on presuppositions, which Stebbing formulates as follows:
(a) The logical presupposition:
1. If p is to be analyzed, then p must be understood. It follows that there is at least one expression which unambiguously expresses p.
(b) The metaphysical presuppositions:
2. If p is to be analyzed, then it is not always the case that p is known to be false, and it is sometimes the case that p is known to be true.
3. Directional analysis is possible.
3.1 There is a way of analyzing p such that we can correctly speak of the analysis of p.
This assumption is equivalent to the assertion that we must be able to say, “There is one and only one analysis of p, and this is that analysis.”
3.2 If π₁, π₂, …, πₙ is the analysis of p, then p entails and is entailed by π₁, π₂, …, πₙ.
It follows that there is no sub-set out of π₁, π₂, …, πₙ which is the analysis of p.
3.3 Each Resultant of higher order contains more configured elements than the Resultant of the next step.
3.3.1 The lowest level Resultant is the resultant of the analysis.
3.4 Every Resultant refers to a set of basic facts.
3.5 Each factor in a given level must exclude every other factor in that level.
3.6 A basic fact is an absolutely specific fact.
The logical presupposition implies that we need to understand (on an immediate level) what we analyze. The metaphysical presuppositions are more complicated. Assumption 2, Stebbing claims, is equivalent to the assertion that some propositions are known to be true. Though directional analysis is still possible if p is false, it cannot be the case that all propositions are known to be false:
[W]e often do not know whether what we assert is true or false, and sometimes we are led to reject a possible view because the directional analysis of a proposition shows that something would have to be the case which we believe not to be the case.
All that is meant by assumption 2 is the anti-skeptical claim that some propositions are known to be true.
The presuppositions under 3, Stebbing continues, follow from the assumption that directional analysis is possible. 3.1 guarantees that we have a single analysis of a proposition, whereas 3.2 guarantees that the original proposition and its analysis entail each other. Further, 3.5 guarantees that the factors at a given level are genuinely distinct from one another.
Presupposition 3.3 (“Each Resultant of higher order contains more configured elements than the Resultant of the next step”) guarantees that each step in a directional analysis is simpler than the one before. Given that Stebbing (at this point) holds that there are basic facts, presupposition 3.4 requires that each analysis terminates in basic facts, and hence that the final resultant always exists. 3.6 describes the nature of basic facts: they must be absolutely specific. Stebbing asserts that the assumption that “absolutely specific elements (basic facts)” exist is “plausible”, and that she sees “no reasons against it”.
The existence of basic facts is contentious. Both Eugene Bronstein (1933) and Max Black (1933) wrote sustained criticisms of Stebbing’s use of basic facts in directional analysis. Both also suggested that directional analysis was not possible due to this use.
Bronstein identifies three problems for directional analysis based on “The Method of Analysis in Metaphysics” and “Logical Positivism and Analysis”. He claims that directional analysis is used to infer the existence of basic facts and that this inference is risky.
Bronstein’s arguments against basic facts amount more to an assertion that they are dubious than to a demonstration of it. He states:
[W]hen [directional analysis] ends, and it does end, it ends in basic facts; directional analysis becomes the deus ex machina, introduced to save basic facts.
Basic facts are dubious, per Bronstein. Since Stebbing argues that directional analysis is possible and terminates at basic facts, he holds that directional analysis is being used to prove the existence of something dubious. Stebbing insists he has things backward. What she argued is not that the possibility of directional analysis proves that basic facts exist, but that directional analysis is possible only if they do. Furthermore, she agrees that the existence of basic facts is dubious.
Black makes a much more sustained argument in favor of what he calls “logical analysis” as opposed to directional analysis. We will focus only on his concerns about basic facts in directional analysis. He claims, essentially, that we can never know whether there are such things as basic facts, and so we can never know whether directional analysis is indeed possible. He claims:
It can never be known that there are simple elements [basic facts], and hence, if [Stebbing] is right in her view of analysis, it can never be known whether metaphysical analysis is possible.
Stebbing’s response to both Bronstein and Black seems to be the same—she concedes that basic facts may not exist, making directional analysis potentially impossible.
This may represent a change of opinion for Stebbing. In “Substances, Events and Facts”, which appears just before “The Method of Analysis in Metaphysics”, Stebbing indicates that there are “at least five assumptions commonly made by those who employ analysis in metaphysics” but that “they are all disputable”. She adds that she “should be ready to grant them all” herself. In saying this, she points to her forthcoming “The Method of Analysis in Metaphysics”. Importantly, Stebbing maintains that she would be ready and willing to grant that the assumptions of directional analysis are true.
However, in “Directional Analysis and Basic Facts”, she claims that there may not be basic facts, or at least, they may not be epistemically accessible to us. She further claims that “The Method of Analysis in Metaphysics” was not an argument for directional analysis, but rather an exercise in pointing out what the presuppositions of such a system might be. In “Directional Analysis and Basic Facts”, she claims that:
[T]he method of analysis as practiced by philosophers required the assumption that the analysis terminate in basic facts, and that this assumption was not ‘certainly justified’ and was not ‘even very plausible’.
Still, she insists in “The Method of Analysis in Metaphysics” that she sees no reasons against postulating basic facts (“the assumption is plausible”). This seems to contradict her claims in the same paper that the presuppositions of directional analysis are “far from being certainly justified, and not even very plausible”. It may be, then, that earlier she hoped basic facts might be shown to exist, but later lost hope.
5. Philosophy of Science and Physics
Toward the beginning of her career, Stebbing focused on the philosophy of Alfred North Whitehead. She had a great respect for Whitehead, but this did not stop her from being critical. In her review of his work Process and Reality, she complains about his use of misleading language. This included his two definitions of God as “creativity” and as “infinitely patient”.
Elsewhere, she focuses on Whitehead’s notion of “perceptual object”. Whitehead contends that perceptual objects are based directly on common-sense things. Stebbing, on the other hand, claims that a proper explanation of them merely starts with common-sense things. In “Professor Whitehead’s ‘Perceptual Object’”, Stebbing notes that Whitehead makes this identification with “so little qualification that it can scarcely be denied that the common-sense ‘table’ is much too crude to be a perceptual object”. Moreover, as Stebbing goes on to say, Whitehead’s assertion that perceptual objects are common-sense things is misleading, since common-sense things are “a product of sense-awareness and unrecognized metaphysical prejudices that have no secure basis in fact”, while perceptual objects are “what remains when the prejudices and inconsistencies have been cleared away”. This is in part because Whitehead settles on a rather non-common-sense notion of what a perceptual object is: it is an adjective. Stebbing argues that this idea is contrary to the common-sense understanding of perceptual objects: “The notion that a perceptual object, a chair, is an adjective is certainly shocking to common sense and to many philosophers”. Janssen-Lauret (2022) provides more details on Stebbing’s philosophical interactions with Whitehead.

Stebbing’s work on Whitehead and analysis had a direct relationship to her work in the philosophy of physics. In her first symposium paper on the subject, she argues that once we clarify what “materialism” and “idealism” mean, we can use older work of Whitehead’s to see that the new physics (the term she used for post-Einsteinian work in physics) does not require idealism to be true.
In other work on similar topics, she argues that McTaggart’s discussions of time are ambiguous because he is not precise enough about what the question, “Is time real?” is asking (see “Some Ambiguities in Discussions Concerning Time”). She also criticizes Sir James Jeans for misleading his readership by using language inappropriately (see “Physics, Materialism and Free Will” and “The New Physics and Modern Materialism”). Generally, she argues against the lack of precise language use in physics and philosophy of physics.
Her main contribution to this field is Philosophy and the Physicists. This book is written for a general audience and takes issue with the philosophies that Jeans and Sir Arthur Eddington propose in their books on physics aimed at general audiences. She accuses Jeans and Eddington of making a grave error in their presentation of physics by trying to argue that “modern physics shows the world to be a very different sort of place than it indeed seems to be” (Willow, 1995, p 127). She thinks that they use inappropriate language to communicate their material to general readers, and thus prevent those readers from arriving at the truth and adjusting their thinking to accommodate it. Stebbing holds that “careful use of reason and knowledge” (Willow, 1995, p 127) would otherwise enable this adjustment in thought.

Stebbing alleges that their use of language, particularly Eddington’s, deliberately misled their audience into thinking that the world was other than what it is. For example, Stebbing found the claim that “the plank is not really solid” misleading, as discussed in Philosophy and the Physicists. We know via common sense that the plank is really solid, for “solid” could not mean anything other than what the plank is. Stebbing held that new language was needed to describe to a common audience what was meant here, rather than using everyday language, confusing readers, and leading them to doubt their common-sense beliefs without good reason. Stebbing herself is explicit about the relationship between the language of common sense and science:
No concepts drawn from the level of common-sense thinking are appropriate to sub-atomic, i.e. microphysical, phenomena. Consequently, the language of common sense is not appropriate to the description of such phenomena.
Common-sense language and concepts are inappropriate for some complex scientific investigations. The colloquial meaning of “solid” as something “firm” needs to be redefined as a scientific concept, such as “non-porous at the microscopic level”.
Eddington offers an example in The Nature of the Physical World:
I am standing on the threshold about to enter a room. It is a complicated business. In the first place I must shove against an atmosphere pressing with a force of fourteen pounds on every square inch of my body. I must make sure of landing on a plank traveling at twenty miles a second around the sun—a fraction of a second too early or too late, the plank would be miles away . . . The plank has no solidity of substance. To step on it is like stepping on a swarm of flies. Shall I not slip through?
Eddington is essentially suggesting that there are two worlds—that of common sense and that of the scientist. Stebbing disagrees with his reasoning in Philosophy and the Physicists:
The danger arises when the scientist uses the picture for the purpose of making explicit denials, and expresses these denials in common-sense language used in such a way as to be devoid of sense. This, unfortunately, is exactly what Eddington has done in the passage we are considering, and indeed, in many other passages as well.
Stebbing is convinced that Eddington is wrong here, in part because he is using the “language of common sense” to describe a micro-physical phenomenon. Because Eddington continues to pretend that “solid” still means “firm”, Stebbing held, his use of language does not help the audience learn anything; she adds in “Nebulous Philosophy” that “unfortunately, the popularizing scientist too often delights to mystify the reader rather than enlighten him”.
More broadly, Stebbing seems to be insisting that since physicists are not trained philosophers, they should not be doing philosophy. Both Jeans and Eddington make grandiose claims about what the world must be like for the new physics to be true: that there is a god, but he is a mathematician, that free will cannot exist, that the world consists only of pointer readings (the recordings of measurements by devices), et cetera. They fail to see, Stebbing argues, how these claims conflict with common sense and do so without reason. Stebbing holds that this overreaching was a result of a lack of critical thinking about their own subject and the relation of that subject to things beyond it. Had they been more precise in their language and clearer in their explanations, they would have seen for themselves that physics is not metaphysics and that their philosophical conclusions went well beyond their evidence.
6. Critical Thinking
Stebbing’s emphasis on clear and precise language and thinking went beyond her academic work. Much of her career was devoted to what is known today as public philosophy. She was concerned with two issues: the first was to point out common mistakes in everyday thinking, and the second was to offer insights into how to avoid them in the future. To address these issues, Stebbing adopts a dual role as critic and teacher, and this dual role influences her popular writings about critical thinking. She seemed incredibly optimistic about the potential effects of teaching people to think critically. In Ideals and Illusions and Men and Moral Principles, she suggests that the current political situation (mostly with respect to WWII) might have been less extreme if politicians and the media had not misled people or if they had had the tools to think through what was being said to them.
Thinking to Some Purpose is Stebbing’s best-known and best-received work on this subject (see Chapman 2013). Although there were several books on critical thinking and language intended for the general public on the market, hers was notable in that almost all of her examples came from sources that people would have had regular access to, such as political speeches, advertising campaigns, and newspaper articles.
A particularly striking example of this occurs when she compares several news outlets’ coverage of a contemporary event: Lord Cecil’s request that the government whip not be sent to him (essentially meaning that he would no longer vote along party lines). Over the course of almost two full pages (pp. 226-228), she lists the length and content of newspaper coverage of the letter Cecil sent announcing his decision. The coverage is wildly variable, depending on the political point of view of the newspaper, and runs from fully reprinting the letter to a complete lack of reporting on the event. For Stebbing, this variation was one of many examples of the misleading use of language and space, in this case through the omission of what might be thought of as important information. This use is egregious, she held, because newspapers are considered by many to be a source of information only and not to take stands on issues:
[Omissions] can be detected only if we form the habit of consulting newspapers representing different political views. That this should be necessary is deplorable. We are considering not views, but news.
This provides all the more reason to train ourselves to think clearly and critically about everyday affairs. Seeking out different perspectives on the same topic is a goal of the ideal reasoner, but it is frustrated when the presentation of a topic aims to persuade us of something rather than to convince us of its truth through reasoned argument. Consulting different perspectives on the same topic is, moreover, only one method to help distinguish between what we know to be true, what we merely believe to be true, and what we take for granted as probable on the basis of common sense.
Stebbing emphasizes early on that logical thinking is desirable, and that “to think logically is to think relevantly to the purpose that initiated the thinking; all effective thinking is directed to an end”. Further, logic helps us think clearly. Here, logic does not only encompass formal systems like syllogisms or the work being done by her contemporaries on the new logic. Logic, in the sense Stebbing is using it, is more than an attempt to derive a priori knowledge from some set of axioms (in keeping with her view that our philosophical aim is not to find indubitable truths); it encompasses much of what we would today call critical thinking.
For Stebbing, the best thinking is done when trying to solve a problem. Problems can be simple, like “What should I eat for lunch?” or “What did my dad say to me last night?” They can also be more complex, like “How do we find world peace?” However, there are additional complications to thinking clearly about a purpose. Our character traits, for example, sometimes prevent us from recognizing the truth by blinding us to certain evidence, or making us focus on unnecessary information. This may lead to concealed contradictions in our thinking. One way we can help someone think more clearly is by pointing out concealed contradictions in their thinking.
As always, determining what a concealed contradiction is, and what we know to be true versus what we merely believe to be true, is a challenge. As Stebbing says, we all have beliefs that are “cherished”. They are sometimes the things our parents believed, so we grew up believing them too, or sometimes the things associated with our chosen political party, or other things altogether. These types of prejudices can also come from an inability to see another person’s perspective. To have an accurate picture of the world, we need to try to see it from other people’s perspectives. For Stebbing, thinking clearly means questioning these beliefs. There is a danger, she holds, in not questioning these beliefs, not making their assumptions clear, and forgetting that we ourselves are prejudiced:
I do seek to convince the reader that it is of great importance that we ordinary men and women should think clearly, that there are many obstacles to thinking clearly, and that some of these obstacles can be overcome provided that we wish to overcome them and are willing to make an effort to do so.
We must put in the effort to overcome our prejudices if we want to be clear thinkers.
Language use plays an essential role in both clear thinking and problematic thinking. As Chapman (2013) points out, Stebbing acknowledged that throughout her career she was largely concerned with how to make language clear and coherent. In the domain of critical thinking and logic, Stebbing worried that poor use of language would impede clear thinking. Well-chosen words can communicate a fact or proposition effectively, but poorly chosen words can trick listeners into believing something that may not be true. Sometimes language is used to “induce . . . readers to agree [to a claim] . . . rather than to convince them that . . . it is sound”. We should use language to help readers see that a claim is sound and to help them think clearly about it. Poorly chosen language “makes for twisted thinking”, since it can mislead us into thinking we have good reasons for a belief when we do not.
“Potted thinking” is a term Stebbing uses for thinking that is too quick: “[P]otted thinking is easily accepted, is concentrated in form, and has lost the vitamins essential to mental nourishment”. Potted thinking is easy thinking. It occurs when we accept slogans as facts rather than as quips, or when we do not think through complicated things because doing so is too time-consuming. The term is a reference to potted meat, which is “a vacuum-packed product such as Spam that you might find in a wartime ration pack” (West, 2021), and although it can sustain you, it is not as healthy as regular meat and has lost many of its valuable nutrients. Potted thinking, though, is not always bad. Sometimes it is a necessary shortcut, and some of our unproblematic common-sense beliefs even seem to fit the bill (like “many hands make light work”). Stebbing claims we need to review, from time to time, the thinking that led us to a potted belief. This occasional review will save us time and keep us from the worst of potted thinking.
Other features of critical thinking that straddle this helpful/harmful divide, according to Stebbing, include metaphors and analogies. Metaphors and analogies can help make a difficult topic easier to understand, but if we accept them too readily and without thinking them through, they can be problematic and misleading. For this reason, Stebbing claims that analogies can be used to suggest conclusions but not to establish them.
There are a host of other influences that can twist our thinking, Stebbing says. She addresses propaganda, persuasion, misleading uses of statistics, and fallacious reasoning. Because we can be misled in all these ways, Stebbing argues that we must recognize that sometimes our beliefs need to be tested:
My thinking is twisted when I believe that I am thinking effectively and have discovered sound reasons for my conclusion but am mistaken in this belief.
The lack of clarity of my belief may cause me to think that it is well-justified when it is not. There can be many reasons our thinking is twisted. Sometimes, twisted thinking is benign: we may simply not have access to all the needed information. But sometimes it is not. Our thinking may be potted or mistaken. More problematically, people sometimes make crooked arguments. That is, they sometimes make arguments to persuade you that something is true while knowing full well they have not given you enough information to think clearly about the topic. To avoid this, we must regularly test our beliefs, just as scientists test their hypotheses.
Stebbing may have been overly optimistic in her hopes for what clear thinking could achieve. At times, it sounds as though she had convinced herself that if we were all clear thinkers and careful language users, WWII would never have occurred. Though clear and careful thinking and language use might indeed improve our situations, her hope that they could fix them completely is perhaps overly ambitious. She offers a particularly apt reflection on the political situation of her time at the end of Thinking to Some Purpose:
Amidst the ruins [of the current state of the world] it is still possible to preach the ideal of freedom, truth, happiness and love. The choice offered us is evil, but it is not necessary to choose the worse. The way before us is hard, but it is not impossible to make it lead towards a world where men can be free and happy because they are not afraid of the truth, however uncomforting, and have learnt that love casts out fear and brings peace.
If we are careful thinkers, we can be freer people. We must learn to accept the truths that we discover through careful thinking, even though that may be challenging to do. In this sense, Stebbing thought that despite its difficulty, training ourselves to be careful and precise everyday thinkers would ultimately help us make the world a better place.
7. References and Further Reading
Parts of this article are based on an English translation of Kouri Kissel (2019).
a. Further Reading
Suggested reading of primary sources:
A Modern Introduction to Logic: A textbook on logic, aimed at students, incorporating both older logical systems (Aristotelian) and more contemporary concerns (the work of Frege, Russell and Whitehead, as well as some philosophy of science).
“The Method of Analysis in Metaphysics”: A paper in which Stebbing makes precise the method of directional analysis, the system of analysis she holds to be used by the Cambridge Analysts.
Logical Positivism and Analysis: A lecture which further articulates the power of directional analysis and compares Moore’s work on analysis with that of the Vienna Circle.
Philosophy and The Physicists: A book which argues that two contemporary popular scientists (Jeans and Eddington) mislead the public in problematic ways about science.
Thinking to Some Purpose: A popular philosophy book about critical thinking.
b. References
Aristotelian Society for the Systematic Study of Philosophy (1948). Philosophical Studies: Essays in Memory of L. Susan Stebbing. London: George Allen and Unwin Ltd.
Beaney, M. (2017). Analytic Philosophy: A Very Short Introduction. Oxford: Oxford University Press.
Black, M. (1933). Philosophical Analysis. Proceedings of the Aristotelian Society 33, 237–258.
Bronstein, E. D. (1933). Miss Stebbing’s Directional Analysis and Basic Facts. Analysis 2 (1-2), 10–14.
Capps, J. (2019). The Pragmatic Theory of Truth. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2019 ed.). Metaphysics Research Lab, Stanford University.
Chapman, S. (2013). Susan Stebbing and the Language of Common Sense. New York: Palgrave Macmillan.
Eddington, A. S. (1928). The Nature of the Physical World. London: Dent.
Janssen-Lauret, F. (2017). Susan Stebbing, Incomplete Symbols and Foundherentist Meta-Ontology. Journal for the History of Analytical Philosophy 5 (2), 7–17.
Janssen-Lauret, F. (2019). Susan Stebbing’s Metaphysics and the Status of Common Sense Truths. In J. Peijnenburg and S. Verhaegh (Eds.), Women in the History of Analytic Philosophy, pp. 171–195. Cham: Springer Nature.
Janssen-Lauret, F. (2022). Susan Stebbing. Cambridge, UK: Cambridge University Press.
Joseph, H. (1932). A Defence of Free-Thinking in Logistics. Mind 41 (164), 424–440.
Joseph, H. (1933). A Defence of Free-Thinking in Logistics Resumed. Mind 42 (168), 417–443.
Joseph, H. (1934). A Last Plea for Free-Thinking in Logistics. Mind 43 (171), 315–320.
Kouri Kissel, T. (2019). Susan Stebbing. Portale Italiano di Filosofia Analitica 19, 34 pp. Accessed: 2024-07-22.
Schliesser, E. (2023). Stebbing on Clarity. https://philarchive.org/rec/SCHSOC-13.
Stebbing, L. S. (1914). Pragmatism and French Voluntarism: With Especial Reference to the Notion of Truth in the Development of French Philosophy From Maine de Biran to Professor Bergson. Girton College Studies, Volume 6.
Stebbing, L. S. (1917). Relation and Coherence. Proceedings of the Aristotelian Society 17, 459–480.
Stebbing, L. S. (1924). Mind and Nature in Prof. Whitehead’s Philosophy. Mind 33 (131), 289–303.
Stebbing, L. S. (1925). Universals and Professor Whitehead’s Theory of Objects. Proceedings of the Aristotelian Society 25, 305–330.
Stebbing, L. S. (1926). Professor Whitehead’s “perceptual object”. Journal of Philosophy 23 (8), 197–213.
Stebbing, L. S. (1928). Symposium: Materialism in the Light of Modern Scientific Thought. Proceedings of the Aristotelian Society, Supplementary Volumes 8, 113–119.
Stebbing, L. S. (1930a). A Modern Introduction to Logic. London: Methuen.
Stebbing, L. S. (1930b). Review of “Process and Reality: An Essay in Cosmology” by A.N. Whitehead. Mind 39 (156), 466–475.
Stebbing, L. S. (1931). Logical Constructions and Knowledge Through Description. In Proceedings of the Seventh International Congress of Philosophy, pp. 117–121.
Stebbing, L. S. (1932a). Substances, Events, and Facts. Journal of Philosophy 29 (12), 309–322.
Stebbing, L. S. (1932b). The Method of Analysis in Metaphysics. Proceedings of the Aristotelian Society 33, 65–94.
Stebbing, L. S. (1933a). Logical Positivism and Analysis. Annual Philosophical Lecture Henriette Hertz Trust.
Stebbing, L. S. (1933b). Mr. Joseph’s Defence of Free Thinking in Logistics. Mind 42 (167), 338–351.
Stebbing, L. S. (1933c). Symposium: The “A Priori”. Proceedings of the Aristotelian Society, Supplementary Volumes 12, 178–196.
Stebbing, L. S. (1934a). A Second Reply to Mr. Joseph. Mind 43 (170), 156–169.
Stebbing, L. S. (1934b). Constructions: The Presidential Address. Proceedings of the Aristotelian Society 34, 1–30.
Stebbing, L. S. (1934c). Directional Analysis and Basic Facts. Analysis 2 (3), 33–36.
Stebbing, L. S. (1934d). Logic in Practice. London: Methuen.
Stebbing, L. S. (1936). Some Ambiguities in Discussions Concerning Time. In Philosophy and History: Essays Presented to Ernst Cassirer, pp. 107–123. Oxford: Clarendon Press.
Stebbing, L. S. (1937a). Nebulous Philosophy – Jeans and Eddington. The American Scholar 6, 71–84.
Stebbing, L. S. (1937b). Philosophy and the Physicists. London: Methuen & Co. Ltd.
Stebbing, L. S. (1939a). Language and Misleading Questions. The Journal of Unified Science 8 (1), 1–6.
Stebbing, L. S. (1939b). Some Puzzles about Analysis. Proceedings of the Aristotelian Society 39, 69–84.
Stebbing, L. S. (1939c). Thinking to Some Purpose. London: Penguin Books.
Stebbing, L. S. (1941). Ideals and Illusions. London: Watts.
Stebbing, L. S. (1942). Moore’s Influence. In P. A. Schilpp (Ed.), The Philosophy of G.E. Moore, pp. 515–532. London: Cambridge University Press.
Stebbing, L. S. (1943a). A Modern Elementary Logic. London: Methuen.
Stebbing, L. S. (1943b). Physics, Materialism and Free Will. Nature 151, 686–687.
Stebbing, L. S. (1943c). Symposium: The New Physics and Metaphysical Materialism. Proceedings of the Aristotelian Society 43, 167–184.
Stebbing, L. S. (1944). Men and Moral Principles. London: Oxford University Press.
West, P. (2021). Pause. Reflect. Think. https://aeon.co/essays/on-susan-stebbing-and-the-role-of-public-philosophy. Accessed: 2024-05-16.
Willow, M. G. (1995). L. Susan Stebbing. In M. E. Waithe (Ed.), A History of Women Philosophers, Volume 4, pp. 125–155. Springer Science+Business Media Dordrecht.
Wisdom, J. T. (1934). Is Analysis a Useful Method in Philosophy? Proceedings of the Aristotelian Society, Supplementary Volumes 13, 65–89.
Author Information
Teresa Kouri Kissel
Email: tkouri@odu.edu
Old Dominion University
U. S. A.
Sarvāstivāda Buddhism
The Sarvāstivāda school of Indian Buddhism emerged from the attempts of early Buddhist communities to systematise the teachings of the Buddha into a complete and consistent description of reality. This often took the form of reducing our phenomenal experience of reality to the atomic, mind-independent fundamental constituents—known as dharmas—from which the mind constructs the objects of everyday experience. By understanding the mind-independent world, one is better equipped to achieve the goal of the Buddhist system—the cessation of suffering (i.e. Nirvana). What distinguished the Sarvāstivāda was its claim that the fundamental elements (dharmas) exist permanently, a point of major contention with other schools of Buddhism. Despite this controversy, the Sarvāstivāda provided a framework for the Buddha’s teachings that was utilised by other Buddhists, even their opponents, and the school successfully merged the philosophical and analytic aspects of Buddhist teaching with the broader ethical and soteriological goals of the system. This article begins by discussing the origins of the early schools of Buddhism in the attempt to interpret the teachings of the Buddha, and it highlights the key texts and thinkers of the Sarvāstivāda school. The article then presents the metaphysical picture defended by the Sarvāstivāda school, explaining the method of reduction used to distinguish between conceptually constructed entities (prajñapti-sat) and the fundamental mind-independent dharmas (dravya-sat), the role of intrinsic nature (svabhāva) in this method, the arguments that led the Sarvāstivāda to its characteristic claim for the permanent existence of dharmas, and the school’s understanding of causality as applied to dharmas. The process of categorising dharmas is then discussed, specifically the five-fold categorisation of the 75 dharma types, as developed by the Sarvāstivāda school.
Given that the Buddhist outlook is concerned with liberation from existential suffering, the article also considers the role of Sarvāstivāda thought with regard to this goal, from its understanding of liberation as the cessation of dharmas in a particular stream and the function of defilements in this account of suffering. Finally, the article touches upon criticisms of the Sarvāstivāda position by the Sautrāntika, Madhyamaka, and Yogacāra schools.
The Buddha (the “awakened” one) was born around the 5th century B.C.E. as Siddhārtha Gautama and, after spending time meditating as a renunciant, he reached enlightenment. He then proceeded to share his knowledge and teachings with disciples who formed the earliest Buddhist community.
Like other Indian philosophers of the time, the Buddha accepted a number of metaphysical views about the world. The first of these was that when a person dies, they are reborn as another being. This is seen as unpleasant, and the goal of practice is to gain liberation from this cycle of rebirth. A person is not always reborn as a human being, but may be reborn as a non-human animal, or even an insect. The kind of thing one is reborn as is determined by the law of karma-phala (“action-result”). Virtuous actions and intentions accrue good karma, which leads to rebirth in a higher state, but bad intentions and actions lead to rebirth in a lower state.
The teachings of the Buddha emerged with the above presuppositions. His teachings are varied, but the whole is often conveyed through the “Four Noble Truths,” derived from the first sermon he gave upon reaching enlightenment. These are:
(1) The truth of existential suffering
(2) The truth that suffering has a cause
(3) The truth that suffering has an end
(4) The truth that there is a path to end suffering
For the Buddha, the world and everything within it were seen to be impermanent, and this led to the feeling of suffering. This suffering is existential since it is inherent in our existence. The truth that we, our loved ones, the things we desire to possess, and the pleasures we feel are all impermanent is the basis for a feeling of dissatisfaction and anguish in life. This is considered to be self-evident to anybody who reflects, and the realisation of it is what leads one to consider a spiritual life.
The second truth provides an explanation for the suffering and, importantly, shows that suffering arises in dependence upon certain conditions. These conditions are our instinctive nature to crave or thirst after things, a tendency which is rooted in our desire for things we find pleasurable, our aversion to those we find painful, and our delusion at the way things really are—specifically the delusion that there is a permanently enduring self.
The truth that suffering has an end claims that if the above conditions are removed, then existential suffering is also removed. If craving is blocked, one is less inclined to intend or commit acts which lead to rebirth, and this will prevent rebirth. The state in which craving is absent is Nirvana, and upon death, one reaches “Complete Nirvana”—the cessation of rebirth.
The truth that there is a path to end suffering claims that there is a way to reach Nirvana that is understandable, communicable and attainable. It is referred to as the “Noble Eightfold Path” and is a combination of knowledge, action, and cultivation. The eight elements of the path consist of right view, right thought, right speech, right action, right livelihood, right effort, right mindfulness, and right concentration.
Returning to the second truth, and the delusion that leads to suffering, one of the defining features of the Buddha’s thought is the claim that there is no permanently enduring self. In place of this, the Buddha claimed that what was thought to be the self was merely an erroneous superimposition upon a series of impersonal elements called skandhas. These consist of matter, emotions, cognitions, volitions, and consciousness. These were sufficient to explain our experience in the world, and there was no need to posit a permanently enduring self over and above these. Even worse, positing a self perpetuates suffering. If one believes oneself to be a permanently enduring self, then it would be possible for that enduring entity to acquire possessions. But the possessions one desires are impermanent, and so one will inevitably suffer when they perish. In addition, if one posits a self, then one identifies with something in the world called a “self.” But if everything that exists is impermanent, this includes the self. The self, too, must then undergo old age and death, once again leading to existential suffering. The role of the illusion of the self in Buddhist thought is concisely captured in a passage from the Buddhist scriptures (Saṃyutta Nikāya 22):
And how does grasping lead to anxiety? It’s when an uneducated ordinary person regards form [i.e. physical matter] like this: ‘This is mine, I am this, this is my self.’ But that form of theirs decays and perishes, which gives rise to sorrow, lamentation, pain, sadness, and distress. (Saṃyutta Nikāya translated by Bhikku Sujato)
Thus, for the Buddha, the belief in a self is intimately tied to ignorance and suffering.
Another teaching closely related to the second and third Noble Truths is that of “dependent-origination” (pratītya-samutpāda). Though developed in different ways, the core of its philosophical aspect rests on the claim that everything that exists does so as a result of causes, and that anything that ceases to exist does so as a result of the removal of its causes. This formulation is notably stated in the Saṃyutta Nikāya (12.61), a discourse in which the Buddha states the following:
When this exists, that comes to be; with the arising of this, that arises. When this does not exist, that does not come to be; with the cessation of this, that ceases. (Translated by Bhikku Bodhi, n.d.)
It is this concept of “dependent-origination” that provides a metaphysical basis for the claim that everything is impermanent (since everything depends upon its causes), that suffering has a cause, and that there is an end to suffering (through the removal of its causes). This notion became increasingly important as Buddhist metaphysics continued to develop, as shall be seen below.
b. Early Buddhist Texts and the Abhidharma
After the Buddha’s death, his disciples gathered to solidify his teachings into an authoritative collection. These were initially transmitted orally but eventually written down. One of the earliest such written collections that survives is the Pali Canon of the Theravāda Buddhists of Sri Lanka. These are divided into three sections, or “baskets” (piṭaka), a practice of organisation followed by most other early Buddhist schools. The first of these is the basket of discourses (sūtra), and it contains a vast collection of narrative episodes in which the Buddha imparted his teachings to others, such as the examples given above relating to the Four Noble Truths and dependent-origination.
The second section includes detailed rules and regulations pertaining to monastic discipline. Whilst Buddhist schools were often tolerant of differing philosophical views, the monastic rules were closely guarded, and any attempt to alter these was seen as grounds for removal from the Buddhist community. In an important episode in early Buddhist history, a dispute involving such rules led to a split between two groups: the Sthavira Nikāya (“the group of elders”) and the Mahāsāṃghika (“the greater community”). Both of these groups continued to split into smaller schools over time, and it was from the former that the Sarvāstivāda school eventually emerged.
The final basket is the “Abhidharma,” and it contains texts which appear to be of a later date than the other two baskets. These texts attempt to systematise the teachings of the Buddha by cataloguing and organising the constituents of reality referenced in the discourses (the first basket), such as the elements that generate the illusion of a permanently enduring self. Any inconsistencies between the Buddha’s statements were explained away by reference to the doctrine of “two-truths” (dvaya-satya), the claim that the Buddha, out of his compassion and his desire to spread the teachings of how to achieve liberation, would adjust his manner of speaking to meet the intellectual stage of his hearer. In this way, the Buddha’s remarks may not correspond to the way the world actually was, but were merely a pragmatic way of speaking to help someone reach the next stage on the path. When the Buddha spoke in this way, it was held that he was speaking the “conventional truth” (saṃvṛti-satya). When he was held to be describing objective reality, his statements were said to correspond to the “ultimate truth” (paramārtha-satya). It is the Abhidharma basket that is believed to contain this ultimate truth, and its texts are written in the form of lists or “matrices” (mātṛkā) of the basic elements of reality according to different arrangements and categories. Many are also written in the form of a catechism, proving useful to monks for memorisation. The Abhidharma texts are mainly descriptive, and it was left to later Buddhist philosophers to philosophically defend the account of reality found in these texts. Those monks who are specialists in the Abhidharma texts and who offer a defence of its metaphysics are often referred to as “Ābhidharmika.” There are, however, major differences between these Ābhidharmika, leading to a variety of philosophical schools, one of which was the Sarvāstivāda.
c. The Sarvāstivāda and Their Texts
Much of the history of the Sarvāstivāda school is shrouded in mystery. It is believed that the school emerged from the earlier Sthavira Nikāya (“the group of elders”) of the Buddhist community. The school was especially prominent in the north-west of ancient India, in Central Asia, and in Kashmir. Their arguments for the permanent existence of basic elements through time, and their claim that these elements possessed an intrinsic nature (svabhāva), proved controversial, becoming a source of much disagreement within Buddhist philosophical circles.
The three baskets of the Sarvāstivāda survive in ancient translations into Chinese and Tibetan, although several Sanskrit texts from their sūtra basket have also been discovered. The Abhidharma basket of the Sarvāstivāda consists of seven texts: (i) the Dharmaskandha, (ii) the Saṃgītiparyāya, (iii) the Prajñapti, (iv) the Vijñānakāya, (v) the Prakaraṇa, (vi) the Dhātukāya, and (vii) the Jñānaprasthāna. A very significant commentary on these texts, known as the “Great Commentary” (Mahāvibhāṣā), came to be seen as the most coherent and comprehensive formulation of the Sarvāstivāda, collecting the arguments of different Sarvāstivāda teachers. This text was of such significance to the Sarvāstivāda school in Kashmir that the group was given the name Vaibhāṣikas, meaning “those of the commentary.” The Mahāvibhāṣā was extensively drawn upon by later Sarvāstivāda writers, and two particularly important philosophers who utilised and developed the arguments found in this commentary deserve mention: Vasubandhu and Saṃghabhadra.
The renowned Buddhist philosopher Vasubandhu, attracted by the prestige of the Sarvāstivāda school in Kashmir, studied with them and eventually composed the extremely influential “Treasury of Abhidharma” (Abhidharmakośa), which still survives in its original Sanskrit. It appears that during the composition of this text, he began to doubt the arguments of the Kashmiri Sarvāstivādins, and his commentary to the text (the Abhidharmakośa-bhāṣya) offers a range of criticisms of their doctrines. The brilliance of his text is reflected in the fact that it continued to be used by the Sarvāstivāda monks even after Vasubandhu became a critic of the school.
Saṃghabhadra was a philosopher of the Kashmiri Sarvāstivāda school, and he sought to defend their teachings against the criticisms of Vasubandhu. He wrote two influential texts—the Nyāyānusāra and the Abhidharmasamayapradīpikā—which only survive in early Chinese translations, and display a maturation and sophistication of the doctrines found in the earlier texts.
2. The Metaphysics of the Sarvāstivāda
The Sarvāstivāda school provided a detailed account of reality as consisting of permanently existing and causally interacting basic elements. To defend this model, they developed a method of metaphysical reduction, introduced a concept of a permanent intrinsic nature (svabhāva), offered arguments for their claim that these basic elements existed permanently, and provided a scheme of categorisation based on the causal roles of these elements. This section covers each of these features of Sarvāstivāda thought.
a. Dharmas (An Account of the Reductionist Method)
The Sarvāstivāda, along with the other Abhidharma schools, followed the Buddha in his reductionist project. But where the Buddha had simply reduced the self (and phenomena like cognition) into its constituent elements, claiming that the self was an illusion, the Sarvāstivāda went much further and applied this methodology to the whole of phenomenal experience and the external world. Everything, from concrete objects like pots to mental phenomena like thoughts, was to be reduced to their fundamental elements. This led to an ontological distinction between those entities that could not be reduced further, and therefore truly existed, and those that were reducible but perceived as real by a mind polluted with ignorance. The Sarvāstivāda essentially split the world into really existing ultimate elements, which they called “dharmas,” and those illusory entities built up from dharmas. Something existing on the level of dharmas was referred to as a “substantial entity” (dravya-sat), whereas that which was constructed from the dharmas, and erroneously perceived as real, was referred to as a “designated entity” (prajñapti-sat), the latter implying that the belief in these unreal entities results from our concepts and conventional use of language. These two ontological levels corresponded to the two levels of truth adopted by Buddhists in interpreting the Buddha’s statements. If there were any inconsistency in the Buddha’s remarks, one would look for whether the statement was a description of ultimate reality (paramārtha-satya) or a pragmatic use of the usual way people speak of the world when they are imbued with ignorance (saṃvṛti–satya). If the former, then the Buddha was speaking of dravya-sat entities (i.e. dharmas); if not, then he was speaking of prajñapti-sat entities.
Having adopted a two-tier ontology, the Sarvāstivāda needed to explain a way of determining which tier a given entity belonged to—an account of the reductive methodology had to be given. The Buddhist philosopher Vasubandhu, in the Abhidharmakośabhāsya (6.4), gives a concise statement of the approach adopted by the Sarvāstivāda:
Where a cognition of an object, by division or by abstraction, turns into something else, like water or a pot, then it is an [unreal] conventional entity (saṃvṛtisat); otherwise it is a [real] ultimate entity (paramārthasat).
Vasubandhu elaborates on this definition further. He states that sometimes a cognition of an object ceases when that object is broken into parts. Consider a pot. Once the pot is broken into parts, we no longer have a cognition of a pot, but a cognition of potsherds. The pot is, therefore, a conventional entity—ultimately unreal. We perceive it to exist due to our conventions, way of living, and our linguistic practices. But there are certain aspects of cognition that remain, even when the pot is broken into pieces. The cognition of colour and shape, for example, will remain, even if the potsherds are further broken. If these cannot be divided further without cognition ceasing, then they are to be taken as ultimately real entities (dharmas). This is the method of reduction by division.
Vasubandhu also suggests a form of reduction via abstraction. As previously mentioned, the Sarvāstivāda wished to apply their reductive method to all phenomenal experience, and the method of reduction by division only applies to physical objects. It is interesting to note that of the five elements (skandhas) to which the Buddha reduced the self, only one of them is physical matter (rūpa), with the other four being related to mental phenomena. As such, the Sarvāstivāda need an additional method of reduction to capture these entities. Vasubandhu describes reduction by abstraction as occurring when “the other things have been abstracted by thought and the cognition of the original entity does not arise.” He means that when we abstract different aspects of an entity and focus on these aspects, the experience no longer includes that original entity. We continue this until we can no longer abstract further without the cognitive experience ceasing. In terms of a pot, this would involve mentally abstracting different aspects from the pot to determine whether, in the absence of these aspects, the cognition of a pot still arises. So, for example, if we abstract and focus on colour, or shape, or resistance to touch, these are no longer a cognition of the pot, yet it seems we cannot go any further in our abstraction without the cognitive experience ceasing. It appears that all forms of reduction by division are included in reduction by abstraction, but not the reverse. For example, cognitive experiences cannot be broken like a pot, yet through abstraction one may determine the constituent elements of a given cognition, such as the eye, the pot, the thought that one is looking at a pot, and the emotional colouring of the cognition if one needs a pot (though this may sound obscure, section 3 of this article covers the Sarvāstivāda categorisation of dharmas, such as those that constitute a cognition). 
In the process of both forms of reduction, we eventually land on a certain quality that stands independently. These qualities are what explain and define the macro-objects that are built from them. In understanding what a pot is, we refer to its features of colour, shape, and resistance to touch. The pot derives its nature from these aspects, and yet these aspects stand independently. They contain a nature that is not borrowed from anything else—an intrinsic nature. It is from this logic that we land on another key component of the Sarvāstivāda metaphysics: svabhāva.
b. Intrinsic Nature (svabhāva) as the Mark of Ultimate Reality
The Sarvāstivāda did not simply wish to offer a reductive methodology; they sought to exhaustively catalogue our phenomenal experience and the external world. If one understands one’s own experiences and the world as they truly are, devoid of the distorting effects of ignorance, then one is better placed to attain spiritual liberation. In order to catalogue, the Sarvāstivāda needed a method of categorisation. The early Abhidharma schools experimented with a number of methods of categorisation, but the Sarvāstivāda eventually settled on grouping dharmas by the quality they could never be without. Combined with the reductive method detailed above, this meant that dharmas would possess only a single intrinsic nature, providing a strictly delineated categorisation. If it appeared that there was more than one intrinsic nature for a dharma, then the reduction had not been fully completed. Furthermore, rather than adopt a form of categorisation based on one dharma’s relation to another (such that one dharma has the nature of being colder when compared to a second dharma, but hotter when compared to a third), the reliance on an intrinsic nature provided an atemporal grouping, one not liable to fluctuate in the way a categorisation based on relational qualities might. The categories the Sarvāstivāda settled upon are detailed in Section 3 of this article.
The svabhāva, therefore, is something a dharma possesses. Given classical Indian philosophers’ penchant for etymological derivations of the meaning of concepts, Vasubandhu states that the etymology of “dharma” derives from “its bearing (dhāraṇa), its unique characteristic (svalakṣaṇa—a term often synonymous with svabhāva).” This svabhāva provides the specific qualities and nature of the unreal entities we mistakenly take as real, those built from dharmas. These conventionally real but ultimately unreal entities (saṃvṛti-sat/prajñapti-sat) are said to possess an external nature (parabhāva) derived from the svabhāva of dharmas. In contrast, a svabhāva is intrinsic to a dharma and does not depend on any other entity for its nature.
It is useful at this point to note that whilst each numerically distinct dharma has its own numerically distinct svabhāva, the categorisation groups these separate svabhāva by kind. There is, therefore, a token-svabhāva and a type-svabhāva. The categorisation of dharmas by their svabhāva offers an account of the latter, whilst the token-svabhāva fulfils a metaphysical role in the Sarvāstivāda account of reality.
Though the Sarvāstivāda contributed to developments in the Buddhist method of reduction and categorisation, and set the notion of svabhāva as foundational in their metaphysics, their lasting and most controversial contribution is the position that gives the school its name—that the dharmas (and their svabhāva) are eternal and fixed, that is, they exist at all times (sarvam asti).
c. The Existence of Dharmas at all Times (sarvam asti)
There appears to be no logical necessity for a given dharma to exist permanently, even if one were to grant it a svabhāva. It seems that one can consistently hold that there are irreducible elements that possess a single independent nature, and that these elements can also come into and out of existence (along with their essence). But the Sarvāstivāda made the additional leap of claiming that a dharma existed in the “three times”—in the past, in the present, and in the future. The motivation for this is the same as that which always motivates the Sarvāstivāda: a desire to systematise the Buddha’s teachings into a single coherent metaphysical picture. As such, the arguments for the tri-temporal existence of dharmas are a combination of appeals to the literature of the tradition and logical reasoning. Vasubandhu, in his Abhidharmakośa, provides four arguments. The first and second are presented as appeals to authoritative literature, and the third and fourth as a result of logical reasoning. The arguments are as follows:
(i) The tri-temporal existence of dharmas is supported by scripture.
In explaining this argument, Vasubandhu refers to a discourse of the Buddha in which the latter states that, should past dharmas not exist, monks could not take them (the past) into consideration, and, if future dharmas did not exist, monks could not delight in them (the future). If the tri-temporal existence of dharmas is false, then so is the Buddha’s claim. Whilst the argument is stated as an appeal to authoritative literature, an underlying reason for the claim can be found in the third argument.
(ii) The tri-temporal existence of dharmas is supported by the Buddha’s account of cognition.
The Buddha’s account of cognition held that, rather than there being a self that does the cognising, cognition simply consists of six sense-faculties, each of which has its own sphere of operation, the combination of which produces a particular type of cognition. So, a visual experience is the result of the sense-faculty (the eye) coming into contact with a visible object. The reason this entails the tri-temporal existence of dharmas is that the Sarvāstivāda thought that past and future dharmas could be sense-objects, as one can think of both the past and the future. Therefore, if the Buddha’s words are correct, the tri-temporal existence of dharmas must be accepted. Once again, the argument is presented as an appeal to authoritative literature, but the following argument seems to provide further reasoning for the claim in the scripture.
(iii) The tri-temporal existence of dharmas is supported by the fact that consciousness always has a real object.
The Sarvāstivāda hold that all mental activity is intentional—it has an object that it is about, and that object must exist for consciousness to occur. If past and future dharmas did not exist, then one could not have any cognitions involving a past or future object. But we do have such cognitions—one can think of the cake one ate last week, or the trip one plans to take next month. The Sarvāstivāda argue that if the tri-temporal existence of dharmas is denied, and past and future dharmas do not exist, we would have to conclude that consciousness can have an unreal object; but if we have consciousness of an object, that object must be real. Therefore, the intentional structure of cognition entails that past and future dharmas exist.
(iv) The tri-temporal existence of dharmas is supported by the fact that the past has a result in the present.
The Sarvāstivāda argue that if we believe actions to have consequences, then we must grant actions causal power. But most actions (which the Sarvāstivāda also reduce to dharmas) are in the past, especially when their consequences manifest. In order to make sense of this, we ought to accept that past dharmas exist, and it is through this continued existence that they are able to causally influence the present and the future. This reasoning also fits well with the Buddhist commitment to the framework of karma-phala—the pan-Indian belief that all intentional actions have consequences, the latter of which depend on the moral status of the intention.
If one accepts that dharmas exist in the past, present, and future, and also accepts that dharmas possess an unchanging intrinsic nature, then an account should be given for why we perceive these three temporal states. Why does there appear to be a difference between the past, present, and future, if the dharma itself does not change? But more importantly, if some notion of change is not included within the Sarvāstivāda metaphysics, then it seems to dismiss the Buddha’s claim that reality is characterised by impermanence. And if all is permanent and unchangeable, one cannot make sense of the possibility of religious liberation, for surely this involves a change from non-liberated to liberated.
The Mahāvibhāṣā and Vasubandhu’s Abhidharmakośa refer to four approaches to explaining the nature of temporal change, proposed by historic teachers of the Sarvāstivāda, as follows:
(a) Dharmatrāta: the difference between the times is a difference of being (bhāva-anyathātva)
Dharmatrāta argues that whilst a dharma retains its substance permanently, it changes its mode of being, and it is the change of mode of being that accounts for temporal difference. He compares this to a golden vessel, where its substance as a material form of gold remains even whilst the shape may change if it is melted down and remoulded. The intrinsic nature (svabhāva) of a dharma is like the gold substance, and the different times are like the different shapes the gold may be moulded into. In this way, a dharma is both permanent and yet undergoes change. Vasubandhu finds this argument unsatisfactory—if a mode of being is treated like a quality that inheres in an underlying permanent dharma, then in altering its qualities, the dharma also changes (Gold: 29-31). The error of this approach is compared to the non-Buddhist Sāṃkhya school of Indian philosophy, and their doctrine of transformation (pariṇāma), according to which cause and effect are merely the transformation of an underlying substance. The Sāṃkhya position implies that change is not, therefore, the coming into existence or falling out of existence of entities, contradicting the Buddha’s claim of universal impermanence (although see Dhammajoti (137-138) for a defence of Dharmatrāta).
(b) Ghoṣaka: the difference between the times is a difference of characteristic (lakṣaṇa-anyathātva)
Ghoṣaka claims, like Dharmatrāta, that the substance of a dharma (its intrinsic nature/svabhāva) remains the same, yet it alters in its temporal “character.” Importantly, Ghoṣaka proposes that whilst a dharma is connected to one temporal characteristic (for example, the past), it is not severed from the other temporal characteristics (such as the present and the future). A dharma is always connected to the three different temporal periods, only to varying degrees. He likens this to the desire a man may have for a woman, in that it is compatible with the man simultaneously desiring other women, even though the desire is, for the time being, focused upon the first woman. In this way, Ghoṣaka can avoid the kind of criticism that was raised against Dharmatrāta, since for Ghoṣaka, a dharma never loses the characteristics of past, present, and future, therefore maintaining its status as permanent. The problem with this approach, according to Vasubandhu, is that it does not do enough to distinguish the three times—if a dharma always, in fact, possesses the characteristics of past, present, and future, then what is there to distinguish the three times? Returning to Ghoṣaka’s example, even though the man’s desire is focused on a particular woman, Vasubandhu argues that, according to Ghoṣaka’s reasoning, the man would in fact always desire all of the women. Potential (the future) and enactment (the present) become the same (Gold: 32), and temporal difference is left unexplained.
(c) Buddhadeva: the difference between the times is a difference of relation to the other times (anyathā-anyathātva)
Buddhadeva tried to solve the issue by proposing that the terms “past,” “present,” and “future” are predicated of dharmas by reference to those that precede and those that follow them. So, a particular dharma may be “past” relative to a second dharma, and yet “future” relative to a third. The substance of the dharma remains the same—it does not change its intrinsic nature (svabhāva). The only change is its relation to other dharmas, and these relations account for the difference between the time periods. Buddhadeva gives the example of the same woman who is called “daughter” relative to her mother, and “mother” relative to her child. The woman does not change her substance, merely her relation to the other people differs. Vasubandhu criticises this position on the basis that it, too, fails to provide an adequate account of the differences between the three temporal periods. Just as the same woman can be both a mother and a daughter, the same dharma can then be past, present, and future. But the time periods are themselves defined in distinction to one another; they are supposed to be mutually exclusive, and so the account fails (Gold: 38-39).
(d) Vasumitra: the difference between the times is a difference of position (avasthā-anyathātva), according to activity (kāritra)
Vasumitra’s argument is considered to be the most promising by Vasubandhu. Vasumitra argues that the difference between the times is the difference in position or state of a dharma. We may imagine a chess piece on a board—the piece remains the same, but its position on the board may alter. This, for Vasumitra, is comparable to the dharma passing through the three temporal stages—it maintains its intrinsic nature permanently, and yet it changes its position from future, to present, to past. But then, how are these positions distinguished from one another? The response provides a great conceptual innovation: it is the exercise of a dharma’s activity (kāritra) that determines its position in the temporal order. Though a dharma always retains an intrinsic nature, the manifestation of that nature, its activity, occurs dependent on certain causes and conditions. So, whilst a dharma with the intrinsic nature of heat always exists with that nature, it is only when a match is struck that some of the heat dharmas which partially make up the match (since dharmas are atomic, there will be many different kinds in a single matchstick) will manifest that quality. A dharma that is yet to manifest its activity is a future dharma, one that is currently exercising its activity is a present dharma, and one that has already exercised its activity is a past dharma. In this way, Vasumitra maintains the permanent nature of dharmas whilst offering an account of temporal difference that does not appear to conflict with this permanent nature. This argument proved appealing, and Vasubandhu states that it is the most convincing of the four arguments put forward, although he went on to criticise it.
d. Causation Between Dharmas (The hetu-pratyaya Model)
Having developed an account of reality based on a reductive methodology and concluded that only the atomic, tri-temporally existing dharmas are ultimately real, the Sarvāstivāda sought to detail how these dharmas came to arise and cease in our phenomenal experience. In other words, they sought to understand the conditions under which a dharma carried out its activity (kāritra). One reason for this is that the Sarvāstivāda believed causal efficacy, as well as irreducibility, to be a mark of an ultimate existent. Objects built from dharmas appeared to have causal efficacy, but in reality, it was the dharmas that did the work. The reason for this requirement of causal efficacy is clear: if the Sarvāstivāda project is an account of reality which is supposed to provide guidance on how to reach Nirvana, then it ought to explain how to cause that outcome (and how to stop that which obstructs this outcome). In providing this account, the Sarvāstivāda, in their characteristic style, developed a taxonomy of causes (hetu) and conditions (pratyaya) for dharmas. So, for a dharma to be recognised, and therefore acknowledged to be ultimately real, it must fit into at least one of these causal categories.
The distinction between a “cause” (hetu) and a “condition” (pratyaya) is partially chronological. An explicit reference to the fourfold division of conditions is clearly found in the discourses of the Buddha. The four consist of: (i) the causal condition (hetu-pratyaya), (ii) the equal-immediate condition (samanantara-pratyaya), (iii) the object-condition (ālambana-pratyaya), and (iv) the dominant condition (adhipati-pratyaya). The Sarvāstivāda went on to develop another model—the sixfold division into causes (hetu). The six causes are: (a) homogeneous cause (sabhāga-hetu), (b) universal cause (sarvatraga-hetu), (c) resulting cause (vipāka-hetu), (d) co-existing cause (sahabhū-hetu), (e) conjoined cause (saṃprayuktaka-hetu), and (f) efficient cause (kāraṇa-hetu). This framework of causes overlaps with the fourfold division of conditions. Five of the causes (a-e) fall under the category of causal condition (hetu-pratyaya), while the sixth cause (f) corresponds to the fourth condition, the dominant condition (adhipati-pratyaya). The conceptual distinction between the two taxonomies appears to be that a condition (pratyaya) is a more generic and less specific account of cause, whereas a cause (hetu) is a more fine-grained account, although this is by no means a strictly maintained difference. In an oft-cited example, the seed is the cause of the plant, but the soil, water, temperature, and so on, are the conditions. The Sarvāstivāda account, therefore, offers a picture of complex collections of permanently existing independent dharmas continuously interacting with one another, leading to our phenomenal experience of a dynamic and fluctuating reality.
In what follows, the main headings are the divisions of conditions (pratyaya), with subheadings detailing the division of causes (hetu) where appropriate.
(i) the causal condition (hetu-pratyaya)
In the Abhidharmakośa (2.61), Vasubandhu simply states that the causal condition is equivalent to five of the six causes. These causes are as follows:
(a) homogeneous cause (sabhāga-hetu)
A homogeneous cause is one in which the effect is of the same kind as the cause; the examples Sarvāstivāda philosophers commonly use in explaining this cause often have a moral tone. For example, a morally good dharma will be the cause of another morally good dharma. In less moral terms, we might think that a given colour dharma will, in the absence of any other interfering dharma, act as a homogeneous cause for the immediately following activity of another colour dharma of the same kind. As such, the homogeneous cause applies to both mental and material dharmas. It helps to explain the phenomenal experience of continuity in a world of interacting dharmas.
(b) universal cause (sarvatraga-hetu)
A dharma is a universal cause if it falls under the category of “universal defilement” (see section 4). These are essentially dharmas that lead to ignorance and suffering, and can be obstructed or abandoned through knowledge of the Four Noble Truths and following the Buddhist path. They are similar to the previously described homogeneous cause in that they also produce an effect of the same kind, except that, unlike the homogeneous cause, the kind under consideration is broader. For example, the homogeneous cause can only produce a dharma of the same kind, so a material colour dharma, when conceived as a homogeneous cause, can only produce the same kind of material colour dharma as an effect. But the defilement dharma of “delusion” (moha: the dharma that is a mental occurrence of delusion) may act as the universal cause of defilement dharmas beyond its own kind, such as the defiled dharmas of “idleness” (kauśīdya) or “non-diligence” (pramāda), rather than simply another “delusion” dharma.
(c) resulting cause (vipāka-hetu)
This is the Sarvāstivāda method of incorporating the law of karmic consequences into its account of dharmas. Dharmas of all categories associated with the self (meaning the five skandhas) may be coloured with a moral quality. For example, a mental dharma of anger may contaminate the other dharmas within that matrix, and it will act as the resulting cause of future dharmas that are the karmic results of this anger. Note the importance of this cause in the Sarvāstivāda argument for dharmas existing at all times—it is because a dharma must act as a resulting cause of a karmic consequence that it is held to still exist even when its activity (the manifestation of its intrinsic nature) has ceased.
(d) co-existing cause (sahabhū-hetu)
It often appears that certain things are mutual causes. For example, when two playing cards lean against each other when a person begins to build a house of cards, it seems as though each card is the cause of the other remaining upright. Philosophers are cautious in accepting such scenarios as examples of causation, given that causation is characterised as an asymmetrical relationship between cause and effect. The Sarvāstivāda allowed for mutual causation in the form of their co-existing cause (sahabhū-hetu). Vasubandhu describes this causal category as “dharmas that are causes one of the other” (Vasubandhu 1988-1990: 257) and explains it as “in the same way that three staffs stand up supported one on the other” (Vasubandhu 1988-1990: 261). The reason for this is that the Sarvāstivāda took dharmas in the category of thought and the separate category of concomitant mental states to be mutual causes—when the thought manifests its activity, then so do the concomitant mental states. But it is also seen that when the associated mental states manifest their activity, then so does the mind (see section 3 for an explanation of these two categories). The later Sarvāstivāda philosopher Saṃghabhadra adds another explanation for this causal category—sometimes entities seem to have multiple causes that are all simultaneously required for the effect to take place (Dhammajoti: 175). When this occurs, they are said to be co-existent causes of the effect. This causal category is of particular importance to the Sarvāstivāda and their argument for the tri-temporal existence of dharmas, given their reliance on the Buddha’s claim that cognition arises with the sense-faculty and sense-object. The sense-faculty, sense-object, and cognition are therefore co-existing causes.
(e) conjoined cause (saṃprayuktaka-hetu)
The conjoined cause is a subset of the co-existing cause. It refers to dharmas of the thought (citta) category and mental concomitant (caitta) category, which share the same object. The Mahāvibhāṣā explains that these dharmas are weak in isolation and are only able to exercise their activity when together (Dhammajoti: 175-176). For example, a cognition of an object cannot take place with only one dharma of the thought or mental concomitant categories. Instead, it requires a number of different kinds of dharmas of both categories (as well as the dharmas that make up the object of cognition). The difference between the conjoined cause and the previous category of co-existing cause is that the conjoined causes always occur together, whereas co-existing causes bring about the same results. For the Sarvāstivāda, all conjoined causes are also co-existing causes, but not all co-existing causes are conjoined causes.
(ii) the equal-immediate condition (samanantara-pratyaya)
This is the characteristic of thought (citta) and mental concomitant (caitta) dharmas to “make way” for subsequent dharmas of the same kind (Vasubandhu 1988-1990: 297-302; Dhammajoti: 196-198). In a particular collection of dharmas that make up one’s mental experience, the Sarvāstivāda hold that no two dharmas of the same specific type can occur together (for example, you cannot have two separate instances of anger occurring together). So, the equal-immediate condition is the tendency of a dharma to cease and give space to another dharma of the same kind (thus being “equal”) to arise without another dharma intervening in between (thus being “immediate”).
(iii) the object-condition (ālambana-pratyaya)
As seen in the arguments for the tri-temporal existence of dharmas, the Sarvāstivāda held that all objects of cognition are reducible to ultimately real dharmas. The object-condition is simply this claim—that all dharmas may act as objects for cognition. (Vasubandhu 1988-1990: 302; Dhammajoti: 198-199). An interesting consequence of this view is that cognition cannot have an unreal object. This led the Sarvāstivāda to eventually distinguish between the “content” (ākāra) of a cognition and the real “object” (viṣaya) that causes the cognition, explaining a mistaken cognition as being one where the perceiver takes the illusory content to be the real object (Cox 1988).
(iv) the dominant condition (adhipati-pratyaya)
As Vasubandhu states, the “cause termed kāraṇa [efficient cause] is called adhipati, predominant” (Vasubandhu 1988-1990: 303). This condition is therefore equivalent to the causal category called “efficient cause.”
(f) the efficient cause (kāraṇa-hetu)
This category applies to all dharmas in relation to all other conditioned dharmas, and consists of the fact that, as Vasubandhu explains, “no dharma constitutes an obstacle to the arising of dharmas susceptible of arising” (Vasubandhu 1988-1990: 255). This category allows the Sarvāstivāda to grant a causal role to dharmas that would otherwise seem to lack one, and therefore lack reality, such as future dharmas. By not acting as an obstacle to the arising of dharmas, future dharmas are an efficient cause and therefore real.
The exhaustive and at times confusing treatment of causation is the result of the Sarvāstivāda desire to provide a comprehensive, systematic account of the Buddha’s teaching. As the teachings found in the discourses are so disparate, and the conservative nature of the religious tradition demanded that as much of them be retained as possible, many seemingly ad-hoc solutions were developed to maintain consistency and respond to critics. A similar approach can be found in the five-fold division of the 75 dharmas recognised by the Sarvāstivāda school.
3. The Ontology of the Sarvāstivāda (The Categorisation of Dharmas)
The Sarvāstivāda argue that reality consists of many individual dharmas interacting, manifesting their activity, then lying dormant again. This account of the general nature of dharmas and their causal relations is explained by the Sarvāstivāda’s metaphysics, as discussed in the previous section. But the Sarvāstivāda school also sought to list and categorise the various kinds of dharmas, and it is this categorisation that forms the school’s ontology. Though there were many dharmas, the Sarvāstivāda believed they could be categorised into 75 types (even these types could have further divisions, but it is the grouping of 75 types that is taken to be the most useful reduction). These types could then be further grouped into five separate categories, and this became the highly influential five-fold division of dharmas, the importance of which can be seen in the role the categories “Thought” and “Mental Concomitants” play in some of the Sarvāstivāda accounts of causation. The five-fold division was adopted by many Buddhists, including opponents of the Sarvāstivāda. Within these five-fold divisions are further separate groupings of dharmas, which become important in outlining the Buddhist path to liberation. The following is a useful reference, adapted from the list of the 75 dharmas provided by Dhammajoti (41-42):
I. Matter (rūpa): 11 dharma types
1. Visual-faculty (cakṣur-indriya)
2. Auditory-faculty (śrotra-indriya)
3. Olfactory-faculty (ghrāṇa-indriya)
4. Gustatory-faculty (jihvā-indriya)
5. Tactile-faculty (kāya-indriya)
6. Visual-object (rūpa-artha)
7. Auditory-object (śabda-artha)
8. Olfactory-object (gandha-artha)
9. Gustatory-object (rasa-artha)
10. Tactile-object (spraṣṭavya-artha)
11. Invisible matter (avijñapti-rūpa)
II. Thought (citta): 1 dharma type
1. Thought (citta), also named “mind” (manas) and “consciousness” (vijñāna)
III. Mental concomitants (caitta/caitasika): 46 dharma types
IV. Conditioned dharmas disconnected from Thought (cittaviprayukta saṃskāra dharma): 14 dharma types
V. Unconditioned dharmas (asaṃskṛtadharma): 3 dharma types
1. Space (ākāśa)
2. Cessation with knowledge (pratisaṃkhyā-nirodha)
3. Cessation without knowledge (apratisaṃkhyā-nirodha)
It is interesting to note that only one category consists of matter, and this contains 11 types of dharmas. Two categories relate to human mental experience, and these make up the majority of dharma types (47 in total). This highlights the fact that behind the exhaustive analytic reduction of the Sarvāstivāda lies the Buddhist emphasis on liberation from the phenomenal experience of the world and the condition of suffering.
Whilst it is outside the scope of this entry to discuss every dharma type, the following sections will discuss the general features of each of the five-fold categories, touching upon some unusual dharma types when appropriate.
a. Matter (rūpa)
The category of matter (rūpa) contains the sense-faculties and sense-objects of cognition. Each sense-faculty is distinct and has its own sphere of operation, so the visual sense-faculty cannot combine with the auditory sense-object to produce a cognition, just as the eyes cannot see music, though they may see the physical writing of musical notation. The division of ten types of sense-faculty and sense-object dharmas resembles the Buddha’s teaching of the sensory bases (āyatana) found in his discourses.
Vasubandhu provides two definitions for the classification of dharmas as matter: (i) that they are breakable, meaning that they are liable to transformation, and (ii) that they are resistant, so that a particular rūpa dharma acts as an obstacle to another rūpa dharma taking its space (Vasubandhu 1988-1990: 70). The latter definition is the one emphasised by most Sarvāstivāda philosophers, and accords well with an intuitive understanding of matter.
The unusual dharma in this category is that of invisible matter (avijñapti). There was much dispute over the precise nature of this type (see Dhammajoti: 425-427), but its function was to provide scope for karmic consequences within the rūpa division of dharmas, given that the Sarvāstivāda account of resulting cause (vipāka-hetu) applies to all five divisions of dharmas. The avijñapti dharma is, therefore, the invisible dharma that is activated following an intention to act. Not all intentions produce actions—a person may intend to cause harm or do good, and yet the opportunity to do these may not be available. A person can also be physically restrained from acting on a particular intention (for example, if one is held back from striking someone). As karmic consequences are thought to apply to intentions, the avijñapti dharma is one that can be produced in the absence of physical action, and will eventually produce a dharma that is the karmic consequence of the intention.
Its subsumption under a non-mental category is thought to maintain its existence outside of the fluctuations of mental states (Abhidharmakośabhāṣyam 1.11). For example, it continues to be active even when one is in a deep and dreamless slumber, and therefore lacks mental activity. At root, the avijñapti dharma acts as a guarantee of karmic consequences, carrying the seeds for their later fruition. The attempt to justify the inclusion of avijñapti in the matter category (rūpa) is, however, unconvincing, since the Sarvāstivāda accept that it is not resistant and so does not meet the definition of a rūpa dharma. Vasubandhu explains the Sarvāstivāda position as holding that, since the avijñapti depends on matter (and thought) to exercise its activity, it should also be considered under the matter category, in the way a shadow may be considered part of a tree because it depends on the tree (Vasubandhu 1988-1990: 71-72). Vasubandhu himself noted that this seems contrary to the Sarvāstivāda reductionist methodology.
b. Thought (citta)
Thought contains a single dharma type, though the terms “mind” (manas) and “cognition” (vijñāna) are said to name the same type, only highlighting its different aspects (Abhidharmakośabhāṣyam 2.34). Recalling the Buddha’s account where a sense-faculty and sense-object arise with the relevant sense-consciousness, Vasubandhu explains the character of this consciousness as “the impression relative to each object” and the “raw grasping of each object” (Vasubandhu 1988-1990: 74). Thought is the raw part of cognition that is uncoloured by other mental factors (which fall under the category of “mental concomitants”). It is only through being associated with the mental concomitants that a cognition has content in the form of what we call propositional knowledge. The thought dharma grasps a blue dharma, but it is the thought dharma occurring with certain mental concomitant dharmas that generates the experience “this is blue.” It is important to note that the Sarvāstivāda held that thought dharmas always occur with mental concomitant dharmas (and vice-versa), and not in isolation (Abhidharmakośabhāṣyam 2.23). Due to this, the thought dharmas and mental concomitant dharmas are the paradigm example of mutually caused dharmas, defined as co-existing causes (sahabhū-hetu) and conjoined causes (saṃprayuktaka-hetu) in the Sarvāstivāda causal hetu-pratyaya framework. It is for this reason that the categories of “thought” (citta) and “mental concomitants” (caitta) are often discussed together by Buddhist philosophers as citta-caitta or citta-caitasikadharmas.
c. Mental Concomitants (caitta/caitasika)
The category of mental concomitants contains 46 of the 75 dharma types and highlights the fact that the Sarvāstivāda were primarily concerned with understanding the phenomenology of human experience as it relates to ending existential suffering. They occur together with thought dharmas (citta) to produce contentful cognition, and the presence of particular mental concomitants will indicate the quality of the thought dharma occurring with them. This is because most of the mental concomitant dharmas are considered to have a moral value, the importance of which will become clear in section 4 of this article. Note that multiple types of mental concomitant dharmas occur in the same instance of cognition, alongside one another. The mental concomitants are further grouped into six kinds: (i) universal dharmas, (ii) skilful universal dharmas, (iii) unskilful universal dharmas, (iv) universal defilement dharmas, (v) non-universal defilement dharmas, and (vi) indeterminate dharmas.
Universal dharmas (mahābhūmika dharmas) are of 10 types, so called because they occur in all minds (Vasubandhu 1988-1990: 189), where “mind” means the collection of causally interacting dharmas that constitute our everyday understanding of the mind of an individual person. A number of these universal dharmas help explain contentful cognition, such as conceptual perception (saṃjñā), which grasps the mark of an object, providing the foundation on which to apply concepts and linguistic labels. For example, it discerns the “blueness” of an object, allowing one to apply the label “blue.” Other dharmas of this type account for the phenomenological colour of mental experiences, such as “feeling” dharmas (vedanā), which explain the attitude of desire or aversion to a perceived object. Finally, a number of universal dharma types, such as volition (cetanā) and inclination (chanda), capture the Sarvāstivāda emphasis on causal dependence for dharma activity and the importance of intention in the analysis of karmic consequences.
Further dharmas are classed as “skilful” (kuśala) or “unskilful” (akuśala), and are called “universal” because the skilful ones are found in all skilful minds, and the unskilful in all unskilful minds (Vasubandhu 1988-1990: 190, 195). Once again, “mind” here means the causally interacting dharmas that give rise to our everyday understanding of the mind of a person. A dharma is considered skilful when it helps produce further dharma activity that will end in liberation, and unskilful when it produces further dharmas that keep one in the realm of existential suffering. The terms are therefore imbued with moral significance, as is apparent from dharma types such as “faith” (śraddhā), “calm” (praśrabdhi), and “non-violence” (avihiṃsā) being classed as skilful, whilst “non-modesty” (āhrīkya) and “shamelessness” (anapatrāpya) are classed as unskilful. There are 10 universal skilful dharma types, and only two universal unskilful dharma types.
The concept of defilement (kleśa) is of significant importance to the Sarvāstivāda account of the Buddhist path to liberation. A defilement dharma is held to infect the collection of other dharmas that make up a person, and so to “defile” the collection: it causally interacts with other morally bad dharmas, causing them to manifest their activities and thereby keeping the collection (person) within the realm of existential suffering. The goal of Buddhist practice is to produce the cessation of these defilement dharmas. The universal defilement dharmas (kleśa-mahābhūmika dharma) are those that occur in every defiled mind, such as error (moha) and idleness (kauśīdya) (Vasubandhu 1988-1990: 193), and the non-universal defilement dharmas (parīttakleśa-bhūmika dharma) are those that occur only in some defiled minds, and include dharmas such as anger (krodha) and pride (mada) (Vasubandhu 1988-1990: 196). There are six types of universal defilement dharmas, and 10 types of non-universal defilement dharmas.
Finally, there are eight indeterminate dharmas of the mental concomitants category. As the name implies, the moral status of these dharmas is not determined without reference to context. For example, the “sleep” (middha) dharma is unskilful when indulged in unnecessarily, but skilful when one requires refreshment to continue engaging in morally good thought and actions (Dhammajoti 2015: 252).
The detailed analysis contained in this category demonstrates the Sarvāstivāda focus on explaining phenomenal experience. But alongside this, the school maintained a commitment to the objective reality of dharmas, holding that they do not depend upon our subjective experiences. The introduction of the categories of “conditioned dharmas separate from thought” (citta-viprayukta-saṃskāra) and unconditioned dharmas (asaṃskṛta) reflects this tendency in the Sarvāstivāda.
d. Conditioned Dharmas Separate from Thought (citta-viprayukta-saṃskāra)
As the Sarvāstivāda held that only dharmas are ultimately existent, then, alongside incorporating the elements of phenomenal experience into their description, they also needed to account for certain metaphysical principles that could not be reduced to phenomenal experience. This led to the two final categories of the five-fold division of dharmas, one of which is the category of “conditioned dharmas separate from thought” (citta-viprayukta-saṃskāra).
There are 14 dharma types included in this category. By “conditioned” (saṃskāra) is meant that these dharmas manifest their activity in dependence upon causes and conditions, like all dharmas of the previous categories discussed. The specification that they are “separate from thought” is used to indicate that these dharmas are non-material, and therefore do not fall under the “matter” (rūpa) category, and non-mental, therefore falling outside of the “thought” (citta) and “mental concomitant” (caitta/caitasika) categories (Vasubandhu 1988-1990: 206). In a sense, they are metaphysical principles conceived as ontologically real constituents of reality. In line with the Sarvāstivāda commitment to the objectivity of their description of reality, these principles are ultimately real entities, although neither material nor mental. Since they resemble metaphysical principles, they can affect both material and mental dharmas. Dhammajoti notes that in “some cases at least, they may be considered as laws of nature” (Dhammajoti 2015: 323). In other cases, they are simply the Sarvāstivāda attempt to fit explanations used by the Buddha in his discourses into the reductive dharma model of reality.
The motivation to posit a metaphysical principle can be seen in the inclusion of a “resemblance” (sabhāgatā) type of dharma, which causes the perception of resemblance between different entities. Vasubandhu points out the Sarvāstivāda reasoning for its inclusion. He notes that numerically distinct entities have a likeness, as indicated by our general concepts and expressions (such as using the term “tree” to refer to numerically distinct individual trees). Since there must be something that is the cause of their likeness to one another, and since only dharmas can have causal efficacy, there must be a dharma that fulfils this function. This is the “resemblance” dharma (Vasubandhu 1988-1990: 220).
This category also includes the “acquisition” (prāpti) and “non-acquisition” (aprāpti) dharma types. Almost all Buddhists hold that the self is an illusion. For the Sarvāstivāda, reality consists of complex interactions between different dharmas, and these interactions give rise to our phenomenal experience of the world. But since there is no self, this seems to undermine the value and moral impetus of individual agency. If there is no self, who is to pursue the Buddhist path? There is also the question of karmic consequences—if there is no self, what stops the bad actions of one “person” from leading to bad consequences for an unrelated second “person”? There is also a need to explain the phenomenal experience of continuity. The world is not an incoherent assemblage of phenomenal experience; it seems to follow a logical pattern. To address these concerns, the Sarvāstivāda claimed that what was mistaken to be the self was in fact a “personal series” or “stream” (santati/santāna) of dharmas. And it is the “acquisition” (prāpti) dharmas that hold together the series (Vasubandhu 1988-1990: 207). When a particular dharma arises based on causes and conditions, an “acquisition” dharma may also arise to connect it with the personal series. This can range from something as simple as maintaining the material continuity of what is taken to be a person to the acquisition of mental defilements which obstruct a person from attaining liberation. The “non-acquisition” (aprāpti) dharmas act in the opposite fashion. They may be produced by skilful dharmas in a given stream and disconnect certain defilement dharmas from that stream.
In order to support the Buddha’s claim that reality is characterised by impermanence, meaning that all entities arise and cease, the Sarvāstivāda also included four types of dharma in this category that are often known collectively as “characteristics of the conditioned” (saṃskṛta-lakṣaṇa). In this way, the Sarvāstivāda hope to show that the Buddha’s remarks were an accurate description of ultimate reality. The four characteristics of the conditioned are “origination” (jāti-lakṣaṇa), “duration” (sthiti-lakṣaṇa), “deterioration” (jarā-lakṣaṇa), and “destruction” (anityatā-lakṣaṇa) (Vasubandhu 1988-1990: 238). The “origination” dharma acts, along with other causes and conditions, on a dharma that has not yet manifested its activity. Once this future dharma arises, the “duration” dharma acts as a cause for it to manifest its activity and produce other dharmas for a period of time, until the “deterioration” dharma weakens this production, and finally, the “destruction” dharma causes the activity to cease completely.
The category of “conditioned dharmas separate from thought” (citta-viprayukta-saṃskāra) provides one of the more philosophically rich categories of dharmas, and highlights some of the key metaphysical concerns and innovations of the Sarvāstivāda.
e. Unconditioned Dharmas (asaṃskṛta)
The category of unconditioned dharmas is the most contentious and much criticised by the opponents of the Sarvāstivāda. As the name implies, the dharmas of this category are unconditioned, meaning they are neither causes nor effects. This seems to fly in the face of the Buddhist commitment to impermanence, and yet the Sarvāstivāda maintained the ultimate existence of three kinds of unconditioned dharmas: (i) space, (ii) cessation with knowledge, and (iii) cessation without knowledge.
Unconditioned space is defined by Vasubandhu as “not hindering matter,” since matter occurs in space, and also as “not being hindered by matter,” because matter dharmas do not displace the unconditioned space dharma (Vasubandhu 1988-1990: 59). The Sarvāstivāda teacher Vasumitra attempted to justify its ultimate existence by arguing that this could be inferred from our direct perceptions (Dhammajoti 2015: 555). For example, space has the nature of accommodating other dharmas. It is also due to this accommodation that we can specify the location of any entity. In the absence of an ultimately real space, this would not be possible. Similarly, since space is defined as that which does not hinder, then in the absence of its real existence, reality would be characterised as being hindered everywhere, preventing anything from arising. Space acts as the condition for the possibility of the arising and ceasing of dharmas due to its nature of non-hindrance. Such arguments show that the Sarvāstivāda were committed to an absolute conception of space in their dharma theory—the unconditioned space dharmas are like the background upon which other dharmas act, and it is for this reason that they are held to be beyond the requirement for causes and conditions.
The “cessation with knowledge” (pratisaṃkhyā-nirodha) is connected to the Sarvāstivāda formulation of Nirvana. Recalling that the Buddhist goal is the ending of existential suffering from the cycle of rebirths, the Sarvāstivāda metaphysical picture conceives of this as the cessation of future dharmas occurring in a given stream, such that the entire stream ceases at death. Certain dharmas, known as “defilements,” can negatively affect the stream, preventing this liberation from occurring by producing more dharmas. When one acquires knowledge of the truth of Buddhist teaching and learns to discern reality correctly, one’s stream disconnects from the defilement dharmas and acquires the “cessation with knowledge” dharma, which prevents these defilements from attaching to the stream again (Vasubandhu 1988-1990: 59). It should be noted that whilst this kind of dharma mirrors the Buddha’s account of Nirvana, there is a separate “cessation with knowledge” dharma for each impure dharma. These dharmas are considered unconditioned since they do not exercise any activity; it is their possession in a stream which explains the ability to maintain disconnection from a specific defilement dharma.
The “cessation without knowledge” (apratisaṃkhyā-nirodha) dharmas are those that also disconnect certain dharmas from one’s stream, but not due to any knowledge of the truths of the Buddha’s teachings. This makes sense of the limited sphere of our attention, and the fact that when one is focused on perceiving a particular object, this excludes the ability to perceive other objects (Vasubandhu 1988-1990: 60). For example, when focusing on reading, one may not hear one’s name being called. That sound dharma then moves from the present to the past. The stream is therefore disconnected from any mental dharmas that may have arisen from the audible perception of that sound dharma, such as desire or jealousy (say, if one is romantically attracted to the person calling one’s name). As the example highlights, the cessation involved in these dharmas is accidental to the intention of the person.
4. Dharma Theory as a Means to Liberation
A focus on the system building and taxonomy of the Sarvāstivāda, whilst necessary, can sometimes distract from the school’s overriding concern—providing an account of the Buddhist method for liberation. The whole Sarvāstivāda project is an attempt at developing this, and the dharma theory is the outcome. This section will explain how the metaphysical view and liberation aspirations of the Sarvāstivāda come together.
For Buddhists, existence in the cycle of rebirths is characterised by suffering (duḥkha). Given the law of karma, our existence when we are ignorant of the Buddhist path leads to actions that will generate karmic consequences. The need for the fruition of these consequences will ensure the prolongation of existence in the cycle of rebirth—one is reborn in order for the effect to take place. The cessation of this rebirth and suffering, which means liberation from existence, is Nirvana, and this acts as the goal of Buddhist practice.
For the Sarvāstivāda (and other Buddhist schools), what is taken to be a self is in fact a stream (santāna) of material and mental dharmas causally interacting and manifesting their nature. The activity of certain dharmas leads to the production of activity in other dharmas, which keeps the stream together across different lifetimes. Certain dharmas that attach to a stream act to continually produce further dharmas that maintain one within the cycle of rebirth. These are known as defilements (kleśa/anuśaya). They infect almost all other dharmas of a stream, causing their impurity. The defilement dharmas are often mental attitudes, and they fall under the “mental concomitant” category of the five-fold taxonomy (see section 3.c). They include ignorance (moha), idleness (kauśīdya), and agitation (auddhatya).
It is, however, possible for a stream to abandon these defilement dharmas. By abandoning these, the stream removes the causal conditions that lead to certain actions, which keep one within the cycle of rebirths. But in order to be able to abandon these negative dharmas, the practitioner needs to understand the dharmas, know the natures of each, know which are defilements, and know the methods of abandoning them. The link between knowledge of dharmas and liberation is eloquently stated by Vasubandhu in Abhidharmakośabhāṣyam (1.2): “Apart from the discernment of dharmas, there is no means to extinguish the defilements, and it is by reason of the defilements that the world wanders in the ocean of existence” (Vasubandhu 1988-1990: 57).
The final abandonment of all defilements from a stream is gradual and takes many lifetimes. It involves strict adherence to monastic discipline, developing the values of the Noble Eightfold Path, meditation, and cultivating the practices of skilful dharmas.
5. Criticisms of the Sarvāstivāda
The Sarvāstivāda, in producing a comprehensive and systematic elaboration of Buddhist thought, enriched Indian philosophy with its development of a new conceptual vocabulary and metaphysical model. The school’s influence within Buddhist philosophy can be seen by the many criticisms these developments received from other Buddhist philosophers, and the exchanges between the schools led to a growing complexity and sophistication of philosophical argumentation. This section will note three lines of criticism against the Sarvāstivāda from three separate Buddhist schools.
a. Sautrāntika Criticism of Ontological Promiscuity and the Tri-Temporal Existence of Dharmas
The name of the Sautrāntika school derives from the word “sūtra,” and reflects their belief that the Sarvāstivāda had strayed too far from the words of the Buddha in its attempt to provide a metaphysical system for these teachings. Their criticism of the Sarvāstivāda is all the more interesting since the Sautrāntika agreed that most of our experience can be reduced to basic elements, and that only what is irreducible after such analysis is ultimately real. But they argued that the Sarvāstivāda were too promiscuous in their ontology—they included too many entities as ultimately real dharmas, when these entities could in fact be further reduced. For example, the Sarvāstivāda claim that there are two kinds of dharmas known as “absorption” (asaṃjñi-samāpatti and nirodha-samāpatti) that belong to the category of “conditioned dharmas separate from thought.” They are produced by meditative practice, and once a stream acquires either of these dharmas, the dharma manifests its activity of arresting the mind and mental concomitant dharmas of that stream. But Vasubandhu, writing as a Sautrāntika, sees no reason to include this phenomenon as an ultimately real dharma. He admits that a halting of the mind and mental states can occur, but he writes that what “is called ‘absorption’ is simply the non-existence of the mind for a certain period of time” (Vasubandhu 1988-1990: 232). It is an absence rather than a positive entity. The admission of a dharma to account for the phenomenon is unnecessary, and so if the Sarvāstivāda were more rigorous in their reductive method, they would not include it (as well as many other supposed dharmas) as an ultimate constituent of reality.
The same attitude is held by the Sautrāntika in their criticisms of the Sarvāstivādin’s key doctrine—that dharmas exist in the past, present, and future. The Sarvāstivādin believed that there was a difference between the intrinsic nature of a dharma (svabhāva), which it retains eternally, and the manifestation—or activity (kāritra)—of this essence when certain causes and conditions are present. The idea of an intrinsic nature allowed the Sarvāstivādin to claim that dharmas existed in all times, and the concept of dharma “activity” allowed the Sarvāstivādin to explain the experience of temporal change. The Sautrāntika offer many responses, one of which argues against this distinction. As Vasubandhu explains, the Sarvāstivādin must either accept that the activity of a dharma is a separate thing from the dharma, or that it is not separate. If separate, then the activity itself must also be temporal, and exist first in the future, then in the present, and finally in the past. But in order to explain which of these stages it belongs to, the activity will also need a second activity, and this will require a further activity, leading to an infinite regress (Vasubandhu 1988-1990: 811). Alternatively, if the activity is not considered separate from the dharma, and if a dharma always exists (as the Sarvāstivādin maintains), then it will always exercise its activity, and there would be no distinction between the time periods (Vasubandhu 1988-1990: 812).
The Sautrāntika approach to the Buddhist reductive project differs and argues that reality consists of momentarily existing and causally interacting basic elements, which they call svalakṣaṇa. These can arise and cease to exist, yet the Sautrāntika believe their model can adequately explain reality in a more parsimonious manner, and without, in their view, the heretical innovations of notions like intrinsic nature (svabhāva) and unconditioned dharmas (asaṃskṛta). The dispute between the Sautrāntika and Sarvāstivāda is captured in full detail by Vasubandhu in his Abhidharmakośabhāṣya, where the verses present the Sarvāstivāda view and the commentary often argues in favour of Sautrāntika criticisms, reflecting Vasubandhu’s own intellectual movement towards the latter school.
b. Madhyamaka Criticisms of Intrinsic Nature (svabhāva) as Incompatible with Dependent-Origination (pratītya-samutpāda)
The Madhyamaka school of Buddhist philosophy emerged within the broader Mahāyāna movement of Buddhism. This movement had a significant impact on the practice, spread, and intellectual development of Buddhism throughout the world. It was characterised by a commitment to the bodhisattva path, which required those on the threshold of liberation to willingly return to the cycle of rebirths and spread their knowledge until all beings were liberated, and by a growing collection of sūtras which emphasised the concept of “emptiness” (śūnyatā). The philosopher Nāgārjuna provided a philosophical analysis of this concept of emptiness, composing the “Root Verses of the Madhyamaka” (Mūlamadhyamakakārikā), which became the foundational text of the school.
Nāgārjuna presents emptiness as the absence of an intrinsic nature (svabhāva), and by characterising reality as empty, his work is seen as a direct criticism of the Sarvāstivāda. He argues that emptiness is equivalent to the Buddha’s teaching of dependent-origination (pratītya-samutpāda), which holds that everything depends upon causes and conditions for its existence. By introducing a notion of an eternal intrinsic nature, Nāgārjuna believes the Sarvāstivāda view leads to a conception of reality as static and unchanging. He writes that “if something existed by essential nature (prakṛti), then there would not be the nonexistence of such a thing” (Nāgārjuna 2013: 160). He also argues that if entities already exist with an intrinsic nature, then there would be no place for causation, for an “intrinsic nature that was produced by causes and conditions would be a product” (Nāgārjuna 2013: 154), meaning that it would be something dependent and not eternally existing, as the Sarvāstivādin claims. The major issue with such a view is that it would render the Buddhist path impossible, for without allowing for change, there is no way to alter one’s state of existential suffering into one of liberation.
Nāgārjuna’s style of argumentation is often puzzling and relies on paradoxes and reductio ad absurdum. This has led to many diverse interpretations of his philosophy, though there is little doubt that for him, the Sarvāstivāda notion of intrinsic nature (svabhāva) was incompatible with the soteriological goal of Buddhism and the Buddha’s teaching of dependent-origination. (For more details, see the article on Nāgārjuna.)
c. Yogācāra Criticism of Material Atomism
The Yogācāra school, also known as the Vijñaptimātratā, also emerged within the Mahāyāna movement. They argued for a form of idealism, claiming that the concept of “emptiness” is best understood as the absence of a duality between the perceived and the perceiver. Vasubandhu eventually came to adopt the Yogācāra view and offers an interesting criticism of the Sarvāstivāda project in his “Twenty Verses on the Proof of Consciousness Only” (Viṃśatikāvijñaptimātratāsiddhiḥ).
The Sarvāstivāda dharma ontology holds that material dharmas (rūpa) lead to our everyday phenomenal experience of macroscopic objects. These macroscopic objects do not truly exist; they are an illusion caused by the rūpa dharmas and our ignorance regarding the nature of reality. The Sarvāstivāda believe that the existence of rūpadharmas is proven by our cognition of objects—the rūpadharmas are the intentional object of cognition. But Vasubandhu argues that we do not perceive dharmas in our cognition, since they are imperceptible atoms, so this argument will not work (Vasubandhu 2018: 212). If the Sarvāstivāda argues that the object we perceive is a combination of rūpa dharmas, then Vasubandhu asks what is meant by combination. If the object of cognition is the physical touching of dharmas until they form a large perceptible object, Vasubandhu claims that “in the simultaneous conjunction with a group of six [other atoms], the atom [would have to] have six parts” (Vasubandhu 2018: 213). Since an atom is by definition something without parts, this would mean that the elements making the perceived object are themselves ultimately unreal by the Sarvāstivāda’s own standards.
Later Sarvāstivādins, such as Saṃghabhadra, attempted to avoid such issues by drawing a distinction between the intentional object of a cognition and the object that is the real cause of this cognition (see Cox 1988).
6. Conclusion
The Sarvāstivāda school helped develop the Buddha’s teachings into a consciously philosophical project, attempting to provide a systematic account of the entire Buddhist outlook, from scriptural interpretation to meditative practice. Their metaphysical picture of permanently existing dharmas and their associated conceptual innovations were hugely influential in Indian philosophy. A knowledge of their philosophy is therefore essential for understanding the details of philosophy in classical India, and especially important for understanding the later Buddhist schools that emerged as critics of the Sarvāstivāda.
7. References and Further Reading
a. Translations of Sanskrit Texts
Nāgārjuna (2013). Nāgārjuna’s Middle Way: Mūlamadhyamakakārikā. Translated by Siderits, M. and Katsura, S. Somerville, MA, USA: Wisdom Publications. [originally written Mūlamadhyamakakārikā, c. 2nd century CE.]
Vasubandhu (2018). Materials Toward the Study of Vasubandhu’s Viṁśikā (I): Sanskrit and Tibetan Critical Editions of the Verses and Autocommentary; An English Translation and Annotations. Translated by Silk, J.A. Cambridge, MA: Department of South Asian Studies, Harvard University. [originally written Viṃśatikāvijñaptimātratāsiddhiḥ and Viṃśatikāvṛtti, c. 4th-5th century CE.]
Vasubandhu (1988-1990). Abhidharmakośabhāṣyam of Vasubandhu (4 volumes). Translated by Pruden, L.M. Berkeley, California: Asian Humanities Press. [originally written Abhidharmakośabhāṣyam, c. 4th-5th century CE]. (A highly influential text. The verses contain a comprehensive account of the Sarvāstivāda position, though the commentary that accompanies these verses is often critical of the Sarvāstivāda and appears to adopt a Sautrāntika perspective.)
b. Secondary Literature
Bartley, C. (2015). Indian Philosophy: Hindu and Buddhist Ideas from Original Sources. London, United Kingdom: Bloomsbury Academic. (Ch. 3 offers a helpful introduction to the Abhidharma/ Sarvāstivāda philosophy.)
Cox, C. (2004). “From Category to Ontology: The Changing Role of Dharma in Sarvāstivāda Abhidharma.” Journal of Indian Philosophy 32(5-6), 543-597. (A very detailed article on the philosophical motivations for certain conceptual developments in Sarvāstivāda thought.)
Cox, C. (1988). “On the Possibility of a Nonexistent Object of Consciousness: Sarvāstivādin and Dārṣṭāntika Theories.” The Journal of the International Association of Buddhist Studies 11(1), 31-87. (Includes an explanation and discussion of Saṃghabhadra’s defences of the Sarvāstivāda metaphysics.)
Dhammajoti, K.L. (2015). Sarvāstivāda Abhidharma (5th Revised Edition). Hong Kong: The Buddha-Dharma Centre of Hong Kong. (The most detailed and comprehensive account of Sarvāstivāda philosophy written in English, including many translations of passages from Sanskrit and Chinese texts.)
Gold, J.C. (2014). Paving the Great Way: Vasubandhu’s Unifying Buddhist Philosophy. New York, USA: Columbia University Press. (The early chapters offer a good discussion of philosophical disputes involving the Sarvāstivāda, focusing on Vasubandhu’s Sautrāntika criticisms.)
Potter, K. (ed.) (1996). Encyclopedia of Indian Philosophies Volume VII: Abhidharma Buddhism to 150 A.D. Delhi, India: Motilal Banarsidass Publishers Private Limited. (A very helpful introduction to Abhidharma thought, including the practical aspects of the spiritual path according to the Sarvāstivāda, such as meditation.)
Siderits, M. (2007). Buddhism as Philosophy: An Introduction. United Kingdom: Ashgate Publishing Limited. (Ch. 6 provides a good introduction to Abhidharma and Sarvāstivāda arguments for their reductionist metaphysics.)
Westerhoff, J. (2018). The Golden Age of Indian Buddhist Philosophy. Oxford, United Kingdom: Oxford University Press. (Ch. 1 gives a good introduction to Abhidharma thought, highlighting the aspects to which later Buddhist schools responded.)
This article explores philosophical ideas and concepts central to Afro-diasporic religions such as Vodou, Santeria, Candomblé, and Winti. Afro-diasporic religions have their origins in communities displaced by the Atlantic slave trade, blending elements from indigenous African traditions with Christian and original ideas.
Afro-diasporic religions have been largely overlooked in contemporary philosophy of religion, historically dismissed as folklore, mythology, or even as satanic. This article, instead, delves into the intellectual depth of these traditions, which offer ideas that differ significantly from Abrahamic religious traditions and which challenge dominant views in contemporary Western philosophy of mind, epistemology, and ethics.
The central ideas discussed include:
Afro-diasporic cosmology, in which the universe contains material and spiritual realms existing alongside one another, incorporating elements like ashé (spiritual, divine energy).
Alternative conceptions of divinities, viewing gods as imperfect and capable of possessing adherents’ bodies during possession episodes.
Tradition-specific sources of knowledge, such as divination and testimony during spirit-possession.
Afro-diasporic accounts of human personhood, viewing humans as composed of at least three elements: body, spirit, and soul.
Afro-diasporic ethics in which spirits and deities serve as moral exemplars and where evil is seen as inherent in reality, not as a distortion.
All ideas are discussed in detail and contrasted with alternative Western ideas.
Afro-diasporic religions are traditions that originated among descendants of Africans displaced by the Atlantic slave trade. These traditions are practiced primarily in the Caribbean, northeastern South America, Brazil and the Southern United States. Because of their prominent presence in the Caribbean, Afro-diasporic religions are often called ‘Afro-Caribbean religions’. That term, however, is less inclusive of Brazilian and North American traditions. Traditions include Candomblé, Santería (also known as Lucumí), Vodou (Haitian, Dominican and New Orleans), Hoodoo (rootwork) and Winti. These traditions share similarities with African religious practices.
Despite their richness and uniqueness, Afro-diasporic traditions have received limited attention from contemporary philosophers. While there are some notable exceptions (Porcher and Carlucci 2023; Schmidt 2024; Porcher 2024), these traditions have not been extensively explored philosophically. Most academic studies of these traditions are anthropological or psychological in nature, making it challenging to find systematic philosophical investigations of Afro-diasporic philosophy. The discussion below is based mainly on anthropological observations of practices and beliefs, as well as publications by practitioners.
2. What Are Afro-diasporic Traditions?
Distinguishing Afro-diasporic religions from other traditions poses some difficulties. Relying solely on geographical location is problematic because most traditions coexist with others like Christianity or Islam. Simply including all traditions that originated as a result of the Atlantic slave trade raises issues as well. Slave communities also gave rise to new forms of Christianity and to religious movements like Rastafari, which are not Afro-diasporic traditions. A better criterion is to include traditions shaped by or grounded in a specific way of thinking or logic. Jonathan Chimakonam argues that African philosophy should be distinguished by its specific logico-methodological grounding. African logic includes basic axioms that differ from Western axioms (like the law of the excluded middle). These differences are especially noticeable in African divinatory practices (Chimakonam 2021; 2019: 138-148). Chimakonam’s criterion allows us to isolate philosophical traditions characterized by African thought systems, as distinct from Asian or European thought systems. For our purposes, Chimakonam’s criterion can be applied with slight alterations. Afro-diasporic traditions have undergone syncretism to a greater extent than traditions on the African continent. Therefore, the thought system in which they are grounded cannot simply be called ‘African’. Instead, Afro-diasporic traditions are characterized by a distinct way of thinking, one that is somewhat closer to Western thought.
The Afro-diasporic thought system is arguably more aligned with the West. These traditions retained most central ideas and practices of their African parental traditions (e.g., divination practices, possession rituals, belief in multiple deities and spirits). Due to contact with Western ideas and traditions (mainly through colonialism and missionary efforts), they arguably became closer to Western ways of thinking. Other shared ideas among Afro-diasporic religions include beliefs about a spiritual realm populated by spirits and close personal contacts with these spirits.
Apart from a few exceptions, Afro-diasporic religions are not the majority religions in the regions where they are practiced. In the Americas, they coexist with various Christian churches and are actively opposed by some of them. Immigrant practitioners also form small communities in Europe, Asia, North America and Oceania.
Afro-diasporic religions typically rely on oral traditions and have very few written sources. They do not have sacred scriptures. The transmission of teachings and the initiation of clergy usually occur through apprenticeship. In some traditions, practices and beliefs are safeguarded by secret societies that maintain a high level of secrecy (Davis 2010). Most traditions are also open to new information regarding the nature of spirits and worship, received through ongoing inspiration. New information can also be revealed during mediumship or possession.
All Afro-diasporic religions were heavily influenced by the Atlantic slave trade and the subsequent experiences of slavery. Enslaved individuals brought African traditions from various regions of Africa and continued to practice them, often in secret. Key African influences included Yoruba, other West African, and Kongo religious beliefs and practices. Due to the diverse origins of enslaved individuals from across Africa, different practices, pantheons and beliefs were merged or syncretized. Traces of this initial syncretism can be observed in the worship of various spirits.
African traditions were not only syncretized with each other but also influenced by Christian ideas. This influence stemmed from Christian missionaries in African homelands and forceful Christian indoctrination in the Americas. This syncretism is most evident in the use of Christian symbols such as the cross and the veneration or worship of Christian saints (Desmangles 2000). Some traditions even incorporated Catholic hymns or prayers into their rituals (Nwokocha 2023).
Reflection on Afro-diasporic traditions is largely absent in the philosophy of religion. The following discussion highlights some ideas central to these traditions that are of clear philosophical interest. As discussed below, some ideas are similar to Abrahamic ideas discussed in the philosophy of religion, while others are distinctly different.
3. Metaphysics
3a. Cosmology
Due to regional and cultural differences, Afro-diasporic traditions do not have a uniform worldview or cosmology. Most traditions, however, do show stark similarities and shared ideas. The cosmology of Afro-diasporic traditions is heavily influenced by Yoruba cosmology. According to Yoruba myths, all of reality has its origin in the god Olódùmarè. From him, other deities came forth, including the sky, the sun, the moon and the stars. The earth was formed by the creator god Ọbàtálá. He pleaded with the supreme god Olódùmarè to create dry land for living creatures to inhabit. Ọbàtálá created the earth with help from various other deities. He cultivated plants and formed hills and valleys. Later, Ọbàtálá decided to make humans to keep him company. After he created humans out of clay, Ọbàtálá asked Olódùmarè to breathe life into them (Wilson 1991). Variants of the creation myth exist as well (see Gordon 1979; Porcher 2024). Some elements in the creation myths appear to be similar to the Genesis narrative. This may show the influence of Christian ideas on Yoruba cosmology. Nonetheless, the existence of a creator god who is different from the ultimate origin of reality is usually affirmed.
Like the Yoruba, adherents of Afro-diasporic traditions affirm that reality is not exhausted by the physical, perceivable reality. The Yoruba and their religious descendants represent their vision of the universe as similar to a calabash with two parts that form a whole. Like the calabash, reality is of one piece with both visible and invisible elements (Clark 2007; Ann 2020). Invisible elements include various deities or spirits (called lwa, orishas or orixás) and forces of the universe (like ashé, see below).
While the perceivable reality is not the only reality, the invisible reality is not strictly separated from it. The boundaries between both realities are permeable. Invisible realities like spirits continuously interact with the visible. They can be accessed through ritual practices, both private and public (Ann 2020). Some practitioners of Candomblé describe the goal of ritual behavior as establishing harmony between the human and spiritual realms (Walker 1990). Most Afro-diasporic traditions also do not share the idea of an end time or final stage of history where a spiritual reality will supplant the physical reality we inhabit, as is the case in Abrahamic traditions.
3b. Ashé/ Axé/ Àṣẹ
An important concept in most Afro-diasporic traditions is that of ashé (axé in Candomblé, ashé in Santería). The concept traces back to Yoruba thought and is difficult to pin down. It is sometimes used to refer to the fundamental reality that makes up everything. Emmanuel Eze defines it as “the principle of intelligibility in the universe and in humans (…) as rationality itself. It is creative power, the word, reason, the logos which ‘holds’ reality. More specifically, ashé is that principle which accounts for the uniqueness of humans; it is the rational and spiritual principle which confers upon humans their identity and destiny” (Eze 1998: 173). Schmidt defines it as “divine energy that pervades the universe and ensures creation” (Schmidt 2024). It may be a special kind of energy comparable to the Igbo concept of chi and the Chinese concept of qì. Some define it as “a sacred, mystical or invisible force that transcends human beings” (Da Silva and Brumana 2017). Ashé is believed to be a force that circulates among all beings. Ashé is also regarded as creative energy without which spirits and humans are not able to create or transform (Mandle, Rothschild, and others 1992, quoted in Schmidt 2024). Human subjects ought to seek a balance of ashé through proper ways of living and ritual behavior. The pure experience of ashé is said to be achievable only during trance (Schmidt 2024).
Although the concept remains somewhat vague, it seems as if ashé is primordial energy that makes up most of reality. Sometimes the breath of life by the god Olódùmarè is seen as an infusion of ashé. It can take different forms that may be exchanged among beings. Balancing ashé is key for maintaining well-being (Schmidt 2024). Offerings are a means for the transmission of ashé. When consuming offerings of food, spirits do not consume the physical food but rather the ashé present in the food (Schmidt 2024). This suggests that interaction between humans and spirits is partly an exchange of ashé.
Ashé is a key part of reality in most Afro-diasporic traditions. It has a key role in rituals and other religious practices and serves as a bridge between the human and non-human realms. Ashé has no counterpart in Western cosmology but bears similarities to concepts in other non-Western traditions.
4. Ontology
Most Afro-diasporic traditions acknowledge the existence of a transcendent ultimate origin of reality, known by various names such as Olódùmarè, Elédùmarè, Bondye, Anana, and others. In these traditions, the ultimate god typically plays a minor role in religious practices and the lives of followers. Some believe this is because the ultimate god is too transcendent for humans to access. This transcendent god is the only one with the power to give life and all other spirits or deities rely on him for their existence and sustenance.
Non-ultimate deities or spirits are more prominent in religious practices. It is difficult to pinpoint how many spirits are worshipped in each tradition. Some suggest that the number may be lower than in African traditions due to lost knowledge from forced migration. The number is intentionally kept vague as a sign of humility or ignorance on the part of practitioners. For instance, followers of Haitian Vodou may claim to worship 400 + 1 spirits, with the ‘plus one’ referring to additional spirits that could be added over time.
The deities or spirits govern different natural domains (Richman 2019), often reflecting Yoruba influences in their names. Most traditions have spirits overseeing love or protection, as well as spirits who facilitate access to the spirit-world through rituals or offerings (Legba in Haitian Vodou, Eleguá in Cuban Santería). Spirits are honored through prayer and offerings.
Spirits form different strands or nations in most Afro-diasporic traditions. For example, Haitian Vodou distinguishes 21 nations of spirits, of which the Rada, Petro and Congo nations are the most worshipped (Hebblethwaite 2021). Surinamese Winti also distinguishes different groups of spirits (Luzaragga 2019). Candomblé distinguishes nations of spirits as well. Most adherents worship spirits either from the Jêje-Nago lineage or from the Congo-Angola lineage. The former likely traces back to spirits worshipped in West Africa and the latter to spirits worshipped in the Congo Basin. Adherents of the two groups tend to have minor differences in ritual activity. For example, the Congo-Angola rite plays drums with hands rather than sticks and sings hymns in Portuguese (Da Silva and Brumana 2017). Some of these nations trace back to different geographical origins and are the result of syncretism. Spirits of the same nation tend to have similar properties as well. Spirits of the Petro nation in Haitian Vodou are regarded as ‘hot’, meaning they are easily ill-tempered and even considered dangerous to work with. Spirits of the Rada nation, by contrast, are considered ‘cool’. They are considered easier and less dangerous to petition.
Most deities and spirits are not attributed with omni-properties (omniscience, omnibenevolence or omnipresence) commonly attributed to God in Abrahamic traditions. They are usually bound by cosmic laws such as cause and effect or moral laws. Some also have a morally ambiguous nature, being both inclined to heal or help and to curse or cause harm. For example, an adherent of Vodou lwa Baron Samedi noted: “[I]f he is paid to do bad things, he’ll do it (…) but if you are a good child, and you serve him well (…) he’ll do anything for you” (Marcelin 1950: 168) quoted by (Hebblethwaite 2021: 190).
Spirits and most deities are not omnipresent and usually come when invited. Some inhabit specific trees or places of worship (Landry 2016) or only become present when invoked. Another perfection spirits lack is aseity, that is, being self-sufficient (see Craig 2016). Spirits require sustenance from humans in the form of offerings or devotion. As a result, the relationship between humans and the supernatural is much more reciprocal than in Abrahamic traditions.
Another difference between the deities and spirits worshipped in Afro-diasporic traditions and God in Abrahamic traditions is that the former show strong preferences. Spirits can have preferences for some people over others (usually those that show reverence). They also show preferences for worldly goods. The lwa of Haitian Vodou show strong preferences for specific offerings, garments or colors. For example, the lwa Danbala shows a strong preference for white attire. Worshippers wearing white are more likely to be possessed by Danbala or be granted favors (Nwokocha 2023). Similar preferences can be noted in Santería and Candomblé. Spirits also show a preference for particular musical rhythms and dances (Bahia 2016).
Most, if not all, Afro-diasporic traditions actively practice spirit possession. During possession, a spirit takes over a human’s bodily and behavioral functions. The possessed human is usually not aware of the spirit and has no memories of the episode, although there are exceptions (see Bourguignon 2005). Possession usually involves typical bodily reactions like shaking, thermic reactions and rolling of the eyes. In Abrahamic traditions, spirit possession is known only as demonic possession and has an exclusively negative connotation. In Afro-diasporic traditions, spirit possession can be malevolent but is usually regarded as good. It allows intimate connection to spirits and provides venues for help or advice from spirits.
Spirit possession is similar to shamanistic practices where subjects seek contact with spirits (Eliade 2024). However, there are important differences. In most shamanic practices, the subject reaches out into the spirit world and travels towards the spirits. The shamanic journey can involve elements like soul flights or astral travel. Spirit possession involves the reverse movement. The subject invites the spirits to come down into her body and temporarily take over her functions (Hebblethwaite 2021).
Spirit possession rituals often involve inviting spirits by making offerings and wearing suitable attire. Spirits are also welcomed by music, singing and dancing (Nwokocha 2023). Spirit possession is facilitated by ecstatic rituals that make humans prone to trance states (Hebblethwaite 2021). Spirits choose the best dancer or person otherwise deemed most suitable to host the spirit. When a spirit takes over a human body, it is greeted with the utmost respect and asked for blessings or favors.
Spirit possession presents a solution to a widely discussed problem in Western philosophy of religion. In the West, some have noted a problem with divine intervention. Since God is regarded as a transcendent, purely non-physical being of a profoundly different nature, interaction with the physical world (including physical humans) would be problematic. Various solutions to this problem have been proposed (Moser 2022). Spirit possession offers a way towards a solution. During spirit possession, spirits take control over humans. Haitian Vodou holds that a human’s ti bon ange temporarily gets replaced with a spirit’s (see also below). Spirits thus take control over a human’s functions in a similar way as a human mind takes control over a body. Interaction between a spirit and human during possession is, therefore, similar to mind-body interaction. Mind-body interaction has been problematized as well. Interactions between minds and bodies are, however, less mysterious than interactions between a transcendent God and a physical world. Minds and bodies may be of different nature or substances, but one is not highly transcendent or radically different from the other.
5. Epistemology
Afro-diasporic epistemology lacks a systematic discussion on the definitions of knowledge or detailed criteria for justification, which are common in Western epistemology. It also does not extensively discuss the justification of belief in God or spirits (for such a discussion, refer to Van Eyghen 2023). However, these traditions have specific ways of understanding the nature and will of spirits, typically through divination and possession.
The significance of divination highlights the African roots of Afro-diasporic religions. While all traditions incorporate forms of divination, they are usually less complex than those found in their African origins. Divination practices aim to receive messages or information from spirits which can be communicated through dreams (Nwokocha 2023), intuition or material means. The central practices and goals of divination are similar in Afro-diasporic traditions. Adherents contemplate a question, perform some actions (for example, throwing cowry shells or sticks) and interpret the outcome of the action (e.g. how many cowry shells face upwards or what pattern the sticks form). The outcome may indicate a positive or negative answer or provide more details. Sometimes, the divinatory action is repeated several times (Da Silva and Brumana 2017).
Divination is often used to address practical questions, such as the suitability of a romantic partner or how to cure an illness, as well as spiritual matters like discovering one’s met tet in Haitian Vodou or finding out what curses haunt one.
It is evident that divination differs from testimony in that it requires special skills and interpretation. Testimony (at least as commonly discussed in Western epistemology) is a direct transmission of information from subject to subject and usually does not require special skills or interpretation. Divination is also not mere interpretation; it is more like the application of a technique, involving careful consideration and use of background information. It requires considerable skill but is less idiosyncratic than free interpretation.
Divination is commonly not regarded as an infallible source of knowledge. Answers gained by divination may turn out to be false or predictions may not come true. Some attribute mistakes to errors in interpreting the signs, since the replies are often abstract (Nwokocha 2023). Some are open to the possibility that spirits may trick or deceive during divination. Many other sources of knowledge, like testimony, perception and reasoning, are not infallible either. Adherents of Afro-diasporic traditions do add that divination requires considerably more expertise than other sources of knowledge. It is nonetheless usually regarded as reliable.
Another source of knowledge characteristic of Afro-diasporic traditions is through possession. Many traditions hold rituals where spirits are invited in to temporarily take control of a subject’s body and functions. Possession can also occur spontaneously outside of ritual settings. Possession serves many functions besides gaining knowledge. However, possessing spirits are often consulted for practical advice or other matters.
Information obtained during possession is believed to be directly communicated by the possessing spirit or deity, akin to Western divine revelation. However, possessing spirits are not seen as omniscient and the knowledge gained is often practical and contextualized. Unlike Western revelations that focus on spiritual or moral matters, knowledge from possessing spirits tends to be more practical in nature.
6. Philosophy of Mind
We have observed how followers of Haitian Vodou believe that a person’s ti bon ange is temporarily replaced during possession episodes. This presents a view of the human person that differs from Western perspectives. While most Western views identify one or two fundamental parts or substances within a human person (matter or physicality for physicalists and matter or physicality combined with a mental substance or soul for dualists), some Afro-diasporic traditions identify at least three.
According to Haitian Vodou, a human person is composed of a body, a ti bon ange and a gros bon ange. The Vodou concept of the body aligns with Western understanding and refers to human corporality along with basic motor functions. The gros bon ange or big guardian angel is the life force shared by all humans. It can be seen as a specific form of the general ashé. The gros bon ange is not individualized and returns to the ultimate god after death. The ti bon ange (small guardian angel) is the individualizing aspect of the human person. The ti bon ange is believed to leave the body during dreams and possession episodes. During possession, the ti bon ange is replaced by a deity or spirit (Métraux 1946). Additionally, humans have a met tet (master of the head), a tutelary spirit that is closely connected to the individual and offers guidance and protection. The met tet guides and protects the subject, in some ways similar to the western concept of a guardian angel although the met tet also poses demands on humans. The identity of the met tet can be revealed through divination (Strongman 2008).
Followers of Surinam Winti also believe in a personal tutelary spirit or winti that guides them through life (Luzaragga 2019). Adherents of Candomblé believe that humans consist of three elements that need to be balanced: a physical body that perishes with biological death, breath or emi, which gives life, intelligence or consciousness (ori), and a divine element or spark. The divine element connects humans to the orixás. During initiation rituals, the orixá is firmly placed within the body of the initiate to achieve better harmony with the divine element (Schmidt 2024).
Candomblé shares with Haitian Vodou the belief in a tutelary deity or spirit: all humans have a connection to one or more orixás (Serra 2019). Initiation strengthens the bond with one’s personal orixá. Adherents also believe that each individual has a predetermined plan or odu before birth. Odus guide individuals through different life phases, and efforts are made to align with a child’s odu from birth to avoid misfortunes. Odus also govern animals, spaces and events (Da Silva and Brumana 2017).
While Afro-diasporic traditions recognize various non-bodily elements in human persons, they emphasize the importance of the body more than Western traditions. Ritual practices involve bodily behavior, like dancing, singing, shaking and drumming to a greater extent (Porcher and Carlucci 2023). Interaction with spirits through possession is a physical experience rather than a purely mental one. Many traditions also place significance on bodily purity.
7. Ethics
Afro-diasporic traditions lack a formalized moral code or systematic reflection on ethics. Ethical norms are typically based on tradition and customs (Clark 2005). Rules in these traditions often have a stronger communal focus, emphasizing obligations to the wider family and community. A common theme is the respect for elders and their guidance, both in ritual and practical matters (Mason 2002). This respect is reinforced by honoring ancestral spirits like the Egún in Cuban Santería, which are also significant in Haitian Vodou and Candomblé.
Similar to Western religious traditions, Afro-diasporic traditions believe that morality is derived from a cosmological order. Some rules are rooted in mythological stories about spirits. For example, a Cuban myth tells of Eleguá and Obatalá punishing a king for his pride and mistreatment of the poor (Olmos and Paravisini-Gebert 2022). Adherents do not attribute the origin of morality to God’s nature, will or commandments as in Abrahamic traditions but they do believe that spirits play an active role in guiding morality, sometimes appearing in dreams to offer moral guidance (Mason 2002: chapter 8).
Unlike many other traditions, Afro-diasporic traditions do not view evil as an aberration but as an integral part of the cosmological order. Some believe that committing evil serves a balancing purpose and restores order. Spirits may facilitate evil by assisting in curses or seeking revenge on behalf of practitioners (Davis 2010). However, the emphasis on curses in these traditions is often exaggerated.
Certain practices in Afro-diasporic traditions may involve socially transgressive or unacceptable activities such as the Guédé cult in Haitian Vodou which includes sexually suggestive displays and use of human remains (Hebblethwaite 2021). Rituals centered around Exu and Pomba Gira in Quimbanda exhibit similar characteristics and are more popular among marginalized segments of society.
In some Afro-diasporic traditions, ritual behavior can positively impact morality. In Cuban Santería, having a ‘good heart’ can enhance moral character and disposition. Practitioners describe a good heart as ‘clean’ and ‘disinterested’, leading to increased generosity towards others and the spirits. Developing a good heart involves proper ritual observance and allowing the spirits to influence one’s life (Mason 2002).
In Haitian Vodou, spirits (lwa) serve as guides, embodying virtues like courage (Ogun) or parental care (Ezili Danto). Through exposure to these spirits and their rituals, individuals are encouraged to emulate their virtues. Some spirits represent qualities associated with freedom or independence, especially of divinized ancestors who played a role in the Haitian revolution (Hebblethwaite 2021).
Certain moral rules apply to specific worshippers such as those who are initiated or have a closer connection to a spirit. In Haitian Vodou, some practitioners are spiritually married to a spirit, requiring them to show increased reverence and adhere to stricter rules, such as abstaining from alcohol, meat or sex for periods of time (Nwokocha 2023). While spirit marriages are unique to these traditions, initiates often follow more stringent guidelines (Van Eijk 2010).
8. Concluding Remarks
Afro-diasporic traditions have a rich history and a complex system of ideas, most of which are intertwined with religious beliefs and are distinct from Western ideologies. Reflection on metaphysical, epistemological and ethical concepts dates back centuries and continues to shape the lives of those who practice these traditions.
Despite the depth and significance of ideas from Afro-diasporic traditions, they have had minimal impact on Western academic discourse, particularly within the realm of philosophical discussions. This lack of influence may be attributed to stereotypical portrayals of these traditions in popular culture and society, as well as a general lack of awareness. However, as demonstrated in the preceding discussion, Afro-diasporic traditions offer unique perspectives that could enrich existing philosophical dialogues. These traditions present differing viewpoints on the nature of deities and humans, with less variation in epistemology and ethics, though most Afro-diasporic traditions draw from additional sources for both.
9. References and Further Reading
Bahia, Joana. 2016. “Dancing with the Orixás: Music, Body and the Circulation of African Candomblé Symbols in Germany.” African Diaspora 9 (1–2): 15–38.
A discussion of bodily postures and other ways the body influences Candomblé rituals.
Bourguignon, Erika. 2005. “Spirit Possession.” In A Companion to Psychological Anthropology: Modernity and Psychocultural Change, edited by Conerly Casey and Robert B. Edgerton, 374–87. Oxford: Blackwell Publishing Ltd.
In-depth discussion of spirit possession across cultures, with attention to psychological and social factors.
Chimakonam, Jonathan O. 2019. Ezumezu: A System of Logic for African Philosophy and Studies. Cham: Springer.
Discussion of African systems of logic and their differences from Western logic. Also discusses how African logic can tie various African traditions together.
Chimakonam, Jonathan O. 2021. “On the System of Conversational Thinking: An Overview.” Arumaruka: Journal of Conversational Thinking 1 (1): 1–46.
Discussion of how African logic is rooted in different modes of thinking.
Clark, Mary Ann. 2005. Where Men Are Wives And Mothers Rule: Santería Ritual Practices and Their Gender Implications. Gainesville: University Press of Florida.
Book about gender differences and their implications in Santería ritual practices.
Clark, Mary Ann. 2007. Santeria: Correcting the Myths and Uncovering the Realities of a Growing Religion. Bloomsbury Publishing USA.
Introduction to the central ideas of Santería.
Craig, William Lane. 2016. God over All: Divine Aseity and the Challenge of Platonism. Oxford: Oxford University Press.
Book with arguments for divine aseity over and against alternative conceptions of the divine.
Da Silva, Vagner Gonçalves, and Fernando Giobellina Brumana. 2017. “Candomblé: Religion, World Vision and Experience.” In Handbook of Contemporary Religions in Brazil, 170–85. Leiden: Brill.
Introduction to the main features of Candomblé.
Davis, Wade. 2010. The Serpent and the Rainbow. New York: Simon and Schuster.
Personal account of the author’s quest for the ‘Zombie Powder’ in Haitian Vodou. The account contains an in-depth discussion of secret societies within Haitian Vodou.
Desmangles, Leslie G. 2000. The Faces of the Gods: Vodou and Roman Catholicism in Haiti. Chapel Hill: University of North Carolina Press.
Introduction to central ideas in Haitian Vodou with focus on syncretism with Roman Catholicism.
Eliade, Mircea. 2024. Shamanism: Archaic Techniques of Ecstasy. Princeton NJ: Princeton University Press.
Reprint of Eliade’s classical account of shamanism. The book is an in-depth discussion of various recurring features across traditions with a large number of examples.
Eze, E Chukwudi. 1998. “The Problem of Knowledge in ‘Divination’: The Example of Ifa.” In African Philosophy: An Anthology, edited by E. Chukwudi Eze, 173–75. Oxford: Blackwell.
Discussion of Ifa divination in Yoruba with a discussion of Ashe.
Gordon, Jacob U. 1979. “Yoruba Cosmology and Culture in Brazil: A Study of African Survivals in the New World.” Journal of Black Studies 9 (4): 231–44.
Account of how Yoruba ideas feature in Brazilian traditions like Candomblé and Umbanda.
Hebblethwaite, Benjamin. 2021. A Transatlantic History of Haitian Vodou: Rasin Figuier, Rasin Bwa Kayiman, and the Rada and Gede Rites. Jackson: University Press of Mississippi.
Detailed account of the African roots of Haitian Vodou. Also contains a discussion of Haitian songs that features in Vodou rituals.
Hurston, Zora Neale. 1990. Tell My Horse. Voodoo and Life in Haiti and Jamaica. Harper perennial.
Anthropological study conducted by the author in the 1930’s in Haiti and Jamaica. The work is well known for its vivid description of possession rituals.
Landry, Timothy R. 2016. “Incarnating Spirits, Composing Shrines, and Cooking Divine Power in Vodun.” Material Religion 12 (1): 50–73.
Discussion of how materials like food are used in West-African rituals.
Luzaragga, Ramon. 2019. “Suriname.” In Encyclopedia of Latin American Religions, edited by Henri Gooren, 1519–23. Religions of the World. Cham: Springer.
Introduction to the religious traditions present in Surinam. Contains a discussion of Winti.
Mandle, Julia Barnes, Deborah Menaker Rothschild, and others. 1992. Sites of Recollection: Four Altars & a Rap Opera. Philadelphia: University of Pennsylvania Press.
Marcelin, Milo. 1950. Mythologie Vodou (Rite Arada II). Port au Prince: Editions Haitiennes.
Collection of Haitian Vodou myths.
Mason, Michael Atwood. 2002. Living Santería: Rituals and Experiences in an Afro-Cuban Religion. Smithsonian Institution.
Anthropological study of Santería. Contains the author’s account of his own initiation.
Métraux, Alfred. 1946. “The Concept of Soul in Haitian Vodu.” Southwestern Journal of Anthropology 2 (1): 84–92.
Discussion of the tri-partite division of the human soul in Haitian Vodou.
Moser, Paul K. 2022. “Faith, Power, and Philosophy: Divine-Human Interaction Reclaimed.” International Journal of Philosophy and Theology 83 (4): 281–95.
Philosophical account of how human-divine interaction can occur in Abrahamic traditions.
Nwokocha, Eziaku Atuama. 2023. Vodou En Vogue: Fashioning Black Divinities in Haiti and the United States. Chapel Hill: UNC Press Books.
Anthropological study of Haitian Vodou rituals with focus on the use of clothing to embody Vodou deities.
Olmos, Margarite Fernández, and Lizabeth Paravisini-Gebert. 2022. Creole Religions of the Caribbean: An Introduction from Vodou and Santería to Obeah and Espiritismo. New York: NYU Press.
Introduction to some Afro-diasporic traditions.
Porcher, José Eduardo. 2024. Afro-Brazilian Religions. Cambridge Elements. Cambridge: Cambridge University Press.
Introduction to key philosophical ideas in Afro-Brazilian religions, centered around Candomblé, with a focus on the role of mythology, possession-experiences and embodied knowledge.
Porcher, José Eduardo, and Fernando Carlucci. 2023. “Afro-Brazilian Religions and the Prospects for a Philosophy of Religious Practice.” Religions 14 (2): 146.
Discussion of how Candomblé rituals are much more embodied or have a stronger bodily element than Western traditions.
Richman, Karen. 2019. “Vodou, Voodoo.” In Encyclopedia of Latin American Religions, edited by Henri Gooren, 1618–24. Religions of the World. Cham: Springer.
Introduction to Haitian Vodou.
Rigaud, Milo. 1985. Secrets of Voodoo. City Lights Books.
Classical summary of central ideas in Haitian Vodou.
Schmidt, Bettina E. 2024. “Axé as the Cornerstone of Candomblé Philosophy and Its Significance for an Understanding of Well-Being (Bem Estar).” Religious Studies, 1–13.
Discussion of axé in Brazilian Candomblé as the philosophical backbone of the tradition.
Serra, Ordep. 2019. “Candomblé.” In Encyclopedia of Latin American Religions, edited by Henri Gooren, 270–78. Religions of the World. Cham: Springer.
Introduction to Candomblé.
Strongman, Roberto. 2008. “Transcorporeality in Vodou.” Journal of Haitian Studies, 4–29.
Discussion of possession in Haitian Vodou with focus on its bodily features.
Philosophers generally count Spinoza (1632-1677), along with Descartes (1596-1650) and Leibniz (1646-1716), as one of the great rationalists of the 17th century, but he was also a keen student of religion whose analysis has shaped our modern outlook. For those at home in secular liberal democracies, much seems familiar and right in Spinoza: he was suspicious of the clergy and wished to limit their power; he opposed the enforcement of orthodoxy and championed freedom of thought and expression; he affirmed scripture as a source of moral guidance but rejected it as a source of philosophical or scientific truth; and he constructed a worldview in which nature is law-governed and miracles are disallowed. Nevertheless, although a critic of religion in its traditional forms, Spinoza did not wish to see it disappear. In his view, genuine piety opens a path to blessedness that would otherwise be closed off to most. Such piety consists not in strict adherence to creeds or scrupulous performance of rituals but in simple obedience to the divine law, which commands the love of God and of neighbor through the practice of justice and charity above all else. Regarding the state, Spinoza advocates a tolerant society in which an individual’s religious belief is a matter of public indifference and there is broad freedom to philosophize without fear of censure.
Spinoza was born into Amsterdam’s Portuguese Jewish community a few decades after it had been constituted. The Jews of this community were of converso stock—they came from families, long established in the Iberian Peninsula, who had converted from Judaism to Christianity. Because their conversions were by and large coerced, these families often maintained some form of Jewish identity in their private lives while keeping up a public-facing Christianity. In doing so, they risked social sanction and, after the establishment of the Portuguese Inquisition in 1536 (a successor to the Spanish Inquisition, established in 1478), the possibility of inquisitorial proceedings. To escape these risks, as well as to find new economic opportunities, waves of conversos emigrated from Portugal in the 16th and 17th centuries. Some found their way to Amsterdam. The city offered the economic opportunities they were looking for and the chance to re-establish themselves as Jews.
As a child of Amsterdam’s Portuguese Jewish community, Spinoza received a traditional Jewish education. As would be expected, his family sent him to the Talmud Torah school operated by the united congregation of the same name. In the first four levels of the school, students read the prayer book, learned Hebrew, studied Torah, and worked through biblical commentary. The fifth and sixth levels were devoted to more advanced studies, focused on Talmud, and were primarily undertaken by those boys on a path to the rabbinate. There is little question that Spinoza completed the first four levels – likely by the age of fourteen – but controversy remains about whether and how far beyond that he went. It may well be that, with the expectation that he would become a merchant like his father Michael, Spinoza ended his formal education after completing the fourth level to work full-time in the family’s trading business. After his father’s death in 1654, he took over the business and entered a partnership with his younger brother Gabriel.
None of this means that Spinoza did not continue his education as a Jew. In addition to whatever self-study he pursued, he had the opportunity to attend one or more of the informal yeshivas operating in the community. Although the evidence is not conclusive, there is some indication that he attended the yeshiva Keter Torah (Crown of the Law) operated by Rabbi Saul Levi Mortera (1596-1660), a learned Talmudist from Venice. As Mortera was also conversant with the Jewish philosophical tradition, Spinoza may have gained exposure to Maimonides (1138-1204) and other Jewish philosophers through him. By the mid-1650s, when he was in his early twenties, Spinoza, while not trained for the rabbinate, had a thorough grounding in the Jewish religious and intellectual tradition.
It was at this time that a dramatic rupture occurred in Spinoza’s life. In July of 1656, the governing board of the Talmud Torah congregation subjected Spinoza, then twenty-three years of age, to a herem: a religious condemnation and order of expulsion from the community. In Spinoza’s case, the condemnation was severe, and the expulsion was permanent. It is not known precisely what offense triggered the herem. There is no first-hand information from this period, and Spinoza does not address the incident in his surviving correspondence. The document itself cites his “abominable heresies” and “monstrous deeds” but gives no specifics. It is tempting to look to Spinoza’s writings for answers – they are full of heterodox ideas that would have drawn the ire of his community – but it is impossible to know how much of what they contain was in his mind at the time of the herem. Whatever the offense was, it was serious enough in the minds of the community’s leaders to warrant his permanent expulsion.
Among the questions the herem raises is whether and in what sense Spinoza should be considered a Jewish philosopher or even a Jew at all. He was expelled from his community, and, if the words attributed to him by his biographer Lucas are to be believed, he was happy to go:
All the better; they do not force me to do anything that I would not have done of my own accord if I did not dread scandal; but, since they want it that way, I enter gladly on the path that is opened to me, with the consolation that my departure will be more innocent than was the exodus of the early Hebrews from Egypt. (Lucas, Life, p. 51)
Whether Spinoza continued to think of himself as a Jew after his expulsion is difficult to determine. Although many of his friends, associates, and critics saw him as one, he did not present himself as such in his writings. Among other things, he wrote under the Latin Benedictus rather than the Hebrew Baruch (or the Portuguese Bento). More telling is the fact that he never sought reintegration into his community. From the time of the herem on, he was content to live and work in a Christian social milieu, and he formed a number of intimate friendships with free-thinking Protestants, who became his most ardent supporters. There is no evidence, however, that he converted to Christianity or came to identify as a Christian. While he admired the religiosity of his friends, he disdained sectarian Christianity. This included both Roman Catholicism, which he regarded as little more than organized superstition, and Reformed Calvinism, which was dominant in the Netherlands. Thus, one may see in Spinoza an individual – so familiar in the early 21st century but so rare in his – who lived without any sense of religious identity.
2. The Sources for Spinoza on Religion
Spinoza treats religion throughout his corpus. His magnum opus, the Ethics, published posthumously, sets forth his account of God, the outflow of all things from God, bondage to the passions, liberation from the passions, and the attainment of happiness in the intellectual love of God. His most important treatment of religion, however, is the Theological-Political Treatise, published anonymously in 1670. The Treatise reflects much of the philosophy of the Ethics, but it does not articulate the grand philosophical vision of that work and on occasion adopts language at odds with it. Spinoza intended it to prepare his audience for the Ethics, which he feared would otherwise be met with religiously based prejudice.
The primary aim of the Treatise is to argue on behalf of the liberty of thought and expression in general and what Spinoza calls the freedom of philosophizing in particular. This is made clear in the full title of the work— Theological-Political Treatise: Several Discussions Showing that the Republic Can Grant Freedom of Philosophizing without Harming Its Peace or Piety and Cannot Deny It without Destroying Its Peace and Piety.
Since Spinoza sees the primary threat to the freedom of philosophizing to emanate from the enforcement of religious orthodoxy, the Treatise makes the case that genuine piety, as revealed to the prophets and communicated through scripture, consists in simple obedience to the divine law, which commands the love of God and neighbor through the practice of justice and charity above all else. As its belief requirements are minimal, genuine piety leaves speculative inquiry largely unconstrained. The freedom of philosophizing may thus be allowed without posing any threat to it.
In making his case, Spinoza is primarily concerned with what one might call biblical religion, encompassing Judaism, Christianity, and the precursors to those traditions among the ancient Hebrews and early followers of Jesus. (Islam comes up for occasional discussion but gets no sustained treatment.) Accordingly, a great deal of the Treatise is dedicated to developing an account of prophecy, revelation, miracles, the Mosaic law, the Hebrew state, Christ, and his apostles. Perhaps of most lasting significance is Spinoza’s understanding of the biblical texts and his method of interpreting them. It is in virtue of his contributions in this area that he is counted as one of the founders of modern biblical criticism.
3. God, Nature, and Miracles
Although Spinoza has often been accused of atheism, his account of God is foundational to his philosophical vision. Spinoza’s God, however, is not the God of the Bible. He is not the creator of the universe who rules over it with providential care. He does not execute justice or show mercy, and he does not act with plans, purposes, or ends in view. Most importantly, he is not personal in any meaningful sense of the word. He is, as Spinoza defines him, an infinite being or, more precisely, a substance consisting of infinite attributes: “By God I understand a being absolutely infinite, that is, a substance consisting of an infinity of attributes, of which each one expresses an eternal and infinite essence” (Ethics ID6). As God is a substance consisting of infinite attributes, his essence is maximally extensive. Of the attributes that constitute it, two are knowable to humans. These are the Cartesian attributes of thought and extension. While Descartes takes these attributes to constitute the essence of mind and body respectively (see Principles 1.63), Spinoza takes them to constitute that portion of the divine essence that one can cognize.
From this definition of God, Spinoza demonstrates that God exists of necessity from his essence and is therefore self-caused. An infinite substance, Spinoza argues, has infinite power to exist. More importantly, he demonstrates that God is the only substance. Spinoza does not mean by this that God is the only infinite substance, all others being finite, but that God is the only substance at all. Other things that one might take to be substances – rocks, trees, dogs, humans, and the like – are merely modes of God, existing in God as modifications of his substance that express his attributes in some determinate way: “Particular things are nothing but affections or modes by which God’s attributes are expressed in a certain and determinate way” (Ethics, IP25, corollary). In taking this position, Spinoza is rejecting traditional monotheism in favor of substance monism. According to his version of this doctrine, everything that exists is either God, the infinite substance, or a mode of God. Apart from God and his modes, there is nothing.
It is in the light of this doctrine that one can understand Spinoza’s controversial identification of God and Nature, expressed in his short formula Deus sive Natura. Nature, as Spinoza understands it, does not exist in addition to God as a distinct order of things that God creates, governs, or directs. It is one and the same as God. Here, however, Spinoza makes a distinction. Nature may be taken alternatively as the immanent, active principle from which all things arise or as the things themselves that arise from this immanent, active principle. The former, which Spinoza calls Natura naturans (literally: Nature naturing), is identifiable with God, the one substance, whereas the latter, which he calls Natura naturata (literally: Nature natured), is identifiable with the modes of this substance. This is important, for it means that Spinoza, although he identifies God and Nature, is not a pantheist, at least not if one takes pantheism according to its simplest formulation as the doctrine that the universe is God. The universe, for Spinoza, is the totality of modes – Natura naturata – of which God is the active, immanent principle – Natura naturans. Since these modes exist in God, the infinite substance, some commentators have proposed ‘panentheism’ as a more appropriate label for Spinoza’s view, although there is no consensus on this matter.
Nearly as controversial as Spinoza’s identification of God and Nature is his view that all things – all modes – follow from the divine essence with strict necessity: “From the necessity of the divine nature there must follow infinitely many things in infinitely many modes (i.e., everything which can fall under an infinite intellect)” (Ethics IP16). Although all modes follow from the divine essence with strict necessity, some of these modes – the infinite modes – do so from that essence taken absolutely, while others – the finite modes – do so from that essence taken as modified by one or more finite modifications, which is just to say that they are causally determined by other finite modes, which are in turn causally determined by still other finite modes, and so on. Such a view quite obviously rules out the creation of the universe by a free act of God’s will, but it also rules out contingency within the universe: “In nature there is nothing contingent, but all things have been determined from the necessity of the divine nature to exist and to produce an effect in a certain way” (Ethics IP29). In Spinoza’s view, since all things follow with strict necessity from the divine essence, things cannot be other than how they are. If they were, that essence would be other than it is, which is impossible. The universe as a whole and in its parts must thus be exactly as it is.
Given this view, it is not surprising that Spinoza would reject one of the central pillars on which biblical religion rests. This is the occurrence of miracles, understood as divinely caused events that contravene the laws of nature. Unlike Hume (1711-1776), who attacks the evidentiary basis of belief in miracles, arguing that the testimonial evidence in their favor is never such as to justify believing that one has occurred (see Enquiry, sec. 10), Spinoza attacks their very possibility.
The laws of nature, he argues, are decreed by God, but they are not arbitrary dictates of the divine will that he can suspend or revoke. Rather, they are necessary consequences of the divine nature. Hence, if God were to act in contravention of those laws – something he would do if he were to bring about a miracle – God would act in contravention not only of his decrees but of his nature as well. As such action is absurd, miracles must be judged impossible:
[T]he universal laws of nature are nothing but decrees of God, which follow from the necessity and perfection of the divine nature. Therefore, if anything were to happen in nature which was contrary to its universal laws, it would also necessarily be contrary to the divine decree, intellect, and nature. Or if someone were to maintain that God does something contrary to the laws of nature, he would be compelled to maintain at the same time also that God acts in a way contrary to his own nature. Nothing could be more absurd than that. (TTP 6, 154)
In Spinoza’s universe, there is no room for miracles. Everything that happens does so in accordance with the laws of nature, which are necessary consequences of the divine nature and admit of no exceptions.
What, then, should be made of the putative miracles to which so many individuals have testified throughout history? Spinoza answers that such “miracles” are nothing more than unusual events for which there are no available explanations:
It clearly follows that the term “miracle” cannot be understood except in relation to men’s opinions, and means nothing but a work whose natural cause we cannot explain by the example of another familiar thing, or at least cannot be so explained by the one who writes or relates the miracle. (TTP 6, 155)
According to Spinoza, there are no genuine miracles in the sense of divinely caused events that contravene the laws of nature, but there are events that fall outside ordinary experience which cannot be explained by causes that are familiar. Such events, because they excite wonder, are regarded as miracles, especially if one lacks a scientific worldview.
Notably, Spinoza argues that the authors of scripture lacked a scientific worldview, and it is this that accounts for the many reports of miracles in their narratives. Spinoza dismisses some of these reports as interpolated fictions, but his usual practice is to interpret them naturalistically. A case in point is his treatment of the report, given in Joshua 10:12-14, of God causing the sun to stand still in the sky during the battle of Gibeon. Picking up on a reference to hailstones in Joshua 10:11, Spinoza speculates that, on the day of the battle, there was an unusual amount of ice in the atmosphere, which caused an increase in the refraction of sunlight, which in turn caused a lengthening of the day. Joshua, having no understanding of astronomy, meteorology, or optics, reported this phenomenon as a miracle. In accordance with his understanding of the world, he reported it as God causing the sun to stand still in the sky.
Thus far, Spinoza has an argument against the possibility of miracles and an account of how putative miracles are to be understood. He adds to these the surprising claim that, even if a genuine miracle were to occur, it would be of dubious epistemic value, at least with respect to humanity’s knowledge of God. This is because miracles, having no natural causes, surpass human understanding, and what surpasses human understanding cannot serve as the basis of understanding anything, let alone God:
But from a work, and absolutely, from anything which surpasses our power of understanding, we can understand nothing. For whatever we understand clearly and distinctly must become known to us either through itself or through something else which through itself is understood clearly and distinctly. So from a miracle, or a work surpassing our power of understanding, we can understand neither God’s essence, nor his existence, nor absolutely anything concerning God and nature. (TTP 6, 157)
Instead of looking to miracles for an understanding of God, Spinoza suggests looking to the operations of the universe and the laws that govern it. As these are necessary consequences of the divine nature, they are the best way of gaining such an understanding:
On the contrary, since we know that all things are determined and ordained by God, that nature’s operations follow from God’s essence, indeed, that the laws of nature are God’s eternal decrees and volitions, we must conclude absolutely that the better we know natural things – the more clearly we understand how they depend on their first cause, and how they produce effects according to the eternal laws of nature – the better we know God and his will. (TTP 6, 157)
As Spinoza here makes clear, it is only by increasing one’s understanding of the operations of the universe and the laws that govern it that one increases one’s understanding of God. It is also the only means of increasing one’s understanding of Nature, which is one and the same. The attempt to do so through miracles is nothing but mystification.
4. Prophecy
As already noted, the concrete forms of religion with which the Theological-Political Treatise deals are Judaism, Christianity, and the precursors of those traditions among the ancient Hebrews and early followers of Jesus. Because these forms of religion are based on claims of prophetic revelation from God, it is fitting that the first two chapters of the Treatise are devoted to prophecy in the biblical tradition.
Spinoza begins his discussion by giving an unexceptional definition of prophecy as knowledge received by humans from God: “Prophecy, or Revelation, is the certain knowledge of some matter which God has revealed to men” (TTP 1, 76). In giving this definition, Spinoza does not wish to suggest that prophetic knowledge is in any way supernatural. His identification of God and Nature does not allow for this. Rather, he wishes to distinguish prophetic knowledge from knowledge derived from the natural light of reason. He thus wishes to distinguish prophetic knowledge from philosophical knowledge, although he acknowledges that the latter depends on God as well, insofar as it requires “knowledge of God and of his eternal decrees” (TTP 1, 76).
To understand how prophecy works, Spinoza argues, one has no recourse other than to consult scripture, which is a record of prophecy and conveys its contents to humanity. Doing so shows that prophecy is either auditory, that is, given through words heard by the prophet, or visionary, that is, given through appearances seen by the prophet. Furthermore, one finds that these words and appearances are either real or imaginary:
If, then, we run through the Sacred books, we will see that everything God revealed to the Prophets was revealed to them either in words, or in visible forms, or in both words and visible forms. The words and visible forms were either true, and outside the imagination of the Prophet who heard or saw them, or else imaginary, occurring because the imagination of the Prophet was so disposed, even while he was awake, that he clearly seemed to himself to hear words or to see something. (TTP 1, 79)
As it turns out, in virtually every case, the words and appearances experienced by the prophets were imaginary, and Spinoza lines up an impressive array of biblical passages to establish this. One exception is the case of Moses, to whom God spoke directly with real rather than imagined words. How this or any other form of prophetic communication was possible, Spinoza declines to speculate. He concludes, however, that what is required of a prophet is not exceptional intellectual ability but a fertile imagination:
We have asserted, then, that… no one has received God’s revelations without the aid of the imagination, i.e., without the aid of words or images. So, no one needed to have a more perfect mind in order to prophesy, but only a more vivid imagination. (TTP 1, 85)
In saying this, Spinoza is breaking with an important tradition in Jewish philosophy, most notably represented by Maimonides, which takes prophets to be individuals not only of great imagination but of great intellect as well (see Guide 2.36-38). According to this tradition, the perfection of both faculties is requisite to prophecy. An individual of great imagination who lacks a similarly great intellect may possess certain divinatory powers but may not rise to the level of a prophet. Prophets must rival or even surpass the greatest philosophers in intellectual power. Seen against this tradition, Spinoza’s view is decidedly deflationary.
The implications of Spinoza’s view are enormous. Among the most important is that, with respect to speculative matters, it is a mistake to look to the prophets for guidance. Although they are the recipients of revelation, they have no special insight into the nature of God, the origin and structure of the cosmos, the human mind, the human body, and so on. They do not even agree among themselves on these matters. What the prophets received, according to Spinoza, was penetrating moral insight, but this insight was adapted to their pre-existing beliefs, which were for the most part born of ignorance:
[We have established] that God accommodated his revelations to the power of understanding and to the opinions of the Prophets, and that the Prophets could be ignorant of things which concern only speculation, but not those things which concern loving-kindness and how to conduct our lives, and that they really were ignorant and had contrary opinions [regarding speculative matters]. So we really should not seek knowledge of natural and spiritual things from them. We conclude, therefore, that we are not bound to believe the prophets with regard to anything except what is the end and substance of revelation. In all other things, each person is free to believe as he pleases. (TTP 2, 109)
The lesson here is that, while the prophets offer valuable moral guidance, if one wishes to gain speculative understanding, one must cultivate the intellect and discipline one’s reason rather than consult the prophets’ texts and pore over their words. One must engage in philosophy rather than scriptural study.
5. Moses and the Law
In his treatment of prophecy, Spinoza devotes the greatest share of attention to Moses. It has already been seen that he considers Moses to be exceptional among prophets insofar as the words through which God communicated with him were real rather than imagined. This Spinoza takes from Numbers 12:6-8, in which God declares that, while he speaks to other prophets in dreams and visions, he speaks to Moses “mouth to mouth.” What is most important to Spinoza about Moses, however, is that he organized the Hebrew people into a political community after leading them out of bondage in Egypt. On Spinoza’s account, Moses did this by arrogating absolute power to himself and imposing a comprehensive set of laws governing all aspects of life and regulating all spheres of activity. Given that the Hebrew people had just emerged from long years of slavery and had no experience of self-rule, such measures were necessary. What is more, as Moses understood that the community would be stronger and more stable if its members were induced to obey this law willingly rather than coercively, he created a state religion in which obedience became the very expression of piety. It is in the formation of this religion that one sees the distant origins of what would become Judaism.
Central to the system of law that Moses laid down, according to Spinoza, is the ceremonial law, which consists of regulations concerning sacrifice, festivals, feasts, purity, dress, diet, sabbath observance, prayer, circumcision, and the like. Unlike the divine law, which is universal and conduces to moral development and the attainment of blessedness, this law has no moral valence. It is particular to the Hebrew people and conduces only to the prosperity of their state: “[I]t is established that ceremonies contribute nothing to blessedness, and that those of the Old Testament, indeed, the whole law of Moses, were concerned with nothing but the Hebrew state, and consequently, with nothing but corporeal advantages” (TTP 5, 146). As the ceremonial law conduces to the prosperity of the Hebrew state, Spinoza argues, it is part and parcel of the election of the Hebrew people. That election, as Spinoza understands it, has reference to the state, its perseverance, and its flourishing. It has no reference beyond that:
Their election, therefore, and their calling consisted only in the enduring prosperity of their state and in other [temporal advantages]. Nor do we see that God promised the Patriarchs or their successors anything more than this. Indeed, all the Law promises for obedience is the continual prosperity of their state and the other advantages of this life. Conversely, [it threatened] nothing for obstinacy and breaking the covenant except the ruin of their state and the greatest [temporal] disadvantages. (TTP 3, 115)
The conclusion that Spinoza draws from these observations is that the ceremonial law – or the Mosaic law more generally – is binding on those individuals living within the Hebrew state during the period in which it endured and its people were elected. Those living after that state ceased to exist and who can make no claim to election, including the Jews of Spinoza’s day, are in no way bound by it and gain no benefit from observing it:
So there is no doubt that after their state was dissolved the Jews were no more bound by the law of Moses than they were before their social order and the Republic began. For before the exodus from Egypt, when they lived among other nations, they had no laws peculiar to themselves, and were not bound by any law, except natural law and, no doubt, the legislation of the Republic in which they were living (insofar as it was not contrary to the divine natural law). (TTP 5, 142-143)
It is difficult to overstate the radicalness of what Spinoza is saying here. In denying the continued validity of the Mosaic law, Spinoza is challenging the fundamental basis of Jewish life and practice as he understands it. This does not mean that he is calling for a cessation of that life and practice, but it is unclear what rationale he sees for its continuation, and he makes no attempt to provide one.
In this context, it is interesting to note that Spinoza does credit the persistence of ceremonial observance by Jews – particularly, the practice of circumcision – for their perseverance as a diaspora people in Christian and Islamic lands. He even entertains the possibility that they will at some future time reestablish the Hebrew state:
Indeed, if the foundations of their religion did not make their hearts unmanly, I would absolutely believe that someday, given the opportunity, they would set up their state again, and God would choose them anew. That’s how changeable human affairs are. (TTP 3, 124)
Such an observation is not an expression of proto-Zionism. It is rather a judgment about the character of a people and the vicissitudes of history. Assuredly, the state that Spinoza has in mind here is not a secular state, such as the modern state of Israel, but a theocratic state based on the Mosaic law. In such a state, that law would once again be binding, and the people would once again be elected.
6. Christ
A second figure of great interest to Spinoza is Jesus of Nazareth, whom he consistently refers to by his Christian appellation, Christ. According to Spinoza, Christ, unlike Moses, was uninterested in statecraft. His concern was with blessedness and the precepts of the divine law that lead to it. Chief among these precepts is the love of God and neighbor, which manifests itself in the practice of justice and charity. In Spinoza’s view, while Moses taught these precepts as well, he understood them as special dictates of the divine will given to the Hebrew people alone. In contrast, Christ understood them as eternal truths applying to all people everywhere. His apostles, having learned these precepts from him, took them outside the Jewish community and proclaimed them to every nation. In this way, Christ became the founder of a truly catholic religion, although one that quickly lapsed into sectarianism as his teaching became mixed with theological dogma and philosophical speculation. This movement toward sectarianism, Spinoza argues, began with the apostles themselves as they adapted Christ’s teaching to their individual understanding and preached it to diverse audiences.
Perhaps most interesting in Spinoza’s treatment of Christ is his claim that the way in which he received revelation was even more direct than the way in which Moses received it. Finding no instance in the gospels in which God communicated with Christ through words, real or imagined, Spinoza asserts that God communicated with him “mind to mind.” This is to say that Christ enjoyed unmediated intellectual apprehension of the precepts of the divine law, which is precisely what allowed him to understand them as eternal truths applying to all people everywhere. Such apprehension, Spinoza claims, is something no one else, including Moses, has so far enjoyed:
[F]or a man to perceive by the mind alone things that aren’t contained in the first foundations of our knowledge, and can’t be deduced from them, his mind would necessarily have to be more excellent than, and far superior to the human mind. So I do not believe that anyone else has reached such a perfection, surpassing all others, except Christ, to whom the decisions of God, which lead men to salvation, were revealed immediately – without words or visions. So God revealed himself to the Apostles through Christ’s mind, as previously he had revealed himself to Moses by means of a heavenly voice. And therefore Christ’s voice, like the one Moses heard, can be called the voice of God. (TTP 1, 84)
As Christ enjoyed unmediated intellectual apprehension of the precepts of the divine law, he is set apart from all the prophets, whose apprehension of those precepts was by way of the imagination. His voice, Spinoza goes so far as to say, can thus be likened to the voice of God that spoke to Moses. For this reason, he concludes, “Christ was not so much a Prophet as the mouth of God” (TTP 4, 133).
That Christ comes in for such favorable treatment as compared with Moses and other Hebrew prophets has aroused suspicion and even consternation among many of Spinoza’s readers, especially his Jewish readers. Given that he was writing for a largely Christian audience, some see Spinoza’s comments as more strategic than sincere. Others see them as a betrayal of his own Jewishness. Whatever the case may be, it is important to note that Spinoza stopped well short of endorsing some of the most fundamental Christian dogmas about Christ. In the Theological-Political Treatise, for example, he writes vaguely about “the things some of the churches maintain about Christ,” and adds, “I freely confess that I do not grasp them” (TTP 1, 84). In a letter to Oldenburg, he is more forthcoming. After asserting that God’s eternal wisdom is manifest most fully in Christ’s mind, he says this about the incarnation:
As to what certain Churches add to this – that God assumed a certain nature – I warned expressly that I don’t know what they mean. Indeed, to confess the truth, they seem to me to speak no less absurdly than if someone were to say to me that a circle had assumed the nature of a square. (Ep. 73: to Oldenburg)
Finally, in a subsequent letter to Oldenburg, after acknowledging that the apostles believed that Christ physically rose from the dead, he insists that this event must nevertheless be understood in a purely spiritual and moral sense:
I conclude, then, that the resurrection of Christ from the dead was really spiritual, and was revealed only to the faithful, according to their power of understanding . . . and at the same time that he gave, by his life and death, an example of singular holiness, and to that extent he raises his disciples from the dead, insofar as they follow this example of his life and death. (Ep. 75: to Oldenburg)
From all this, it is clear that Spinoza, whether sincere or not in his remarks about Christ, could not sign on to anything close to Christian orthodoxy about him.
7. The Interpretation and Aim of Scripture
Spinoza’s discussion of prophecy naturally leads to a discussion of scripture, and this discussion has scandalized readers of the Theological-Political Treatise as much as anything else. In one sense, Spinoza considers the Bible, consisting of both the Hebrew scriptures and the New Testament, to be a thoroughly human document. It contains diverse writings composed by diverse authors writing in different places and times for different purposes and audiences. In many cases there is little or no information about how its books came to be, and in no cases are their originals extant.
Perhaps because he considers the Bible to be such a thoroughly human book, Spinoza advocates a thoroughly naturalistic method of scriptural interpretation: “To sum up briefly, I say that the method of interpreting Scripture does not differ at all from the method of interpreting nature, but agrees with it completely” (TTP 7, 171). What Spinoza means by this is deceptively simple. Just as an understanding of Nature comes by a careful examination of Nature itself, so an understanding of scripture must come by a careful examination of scripture itself. In other words, one cannot attribute anything to scripture that is not found through a close study of its words: “Therefore, the universal rule in interpreting Scripture is to attribute nothing to Scripture as its teaching which we have not understood as clearly as possible from its history” (TTP 7, 172). In Spinoza’s view, this study involves more than a simple reading. It involves, first, a thorough study of the languages – Hebrew and Greek – in which scripture is written; second, a collection of the main pronouncements of each book, organized under appropriate headings, together with a note of the obscurities and contradictions to be found among them; and third, a historical study of the authors of scripture, the circumstances under which they wrote, and the transmission and canonization of their writings.
What should be sought in all this is what Spinoza calls the true meaning of scripture, which he identifies with authorial intention or what the author wishes to convey. This meaning, he insists, is not to be confused with the truth of fact, which may or may not align with it and which cannot be used to determine it. For example, when one is confronted with Moses’ claim that God is fire or that God is jealous (Deut. 4:24), the fact that God is neither of these things gives no guidance as to what Moses may have meant. Now since Moses elsewhere states that God is incorporeal (Deut. 4:15), there is reason to take the first of these claims metaphorically, provided, of course, that doing so accords with accepted linguistic usage among the Hebrews. However, since Moses never states that God is free of affection, there is no reason to do the same with respect to the second of these claims. So, despite the fact that God is not jealous, one cannot take Moses to have meant anything other than what he appears to have meant. One must take the meaning of his claim at face value and adhere to the literal sense.
In saying this, Spinoza is again breaking with important figures in the Jewish (as well as the Christian) tradition. Maimonides, whom Spinoza criticizes at length on this point, famously argues that as scripture is true in everything it teaches, all passages that appear to conflict with demonstrated truth must be interpreted in such a way as to align with that truth. He even goes so far as to say that if Aristotle and his successors had succeeded in demonstrating the eternity of the world – something, he argues at great length, they did not succeed in doing – all passages asserting the world’s creation in time would have to undergo radical reinterpretation (see Guide 2.25). As Spinoza points out, however, this assumes that the biblical authors had philosophical and scientific knowledge far beyond what is reasonable to suppose. In addition, it has the consequence that, until the truth of those matters on which scripture speaks is known, one can never be sure of what it is saying about them:
[Maimonides] cannot be certain of the true meaning of Scripture, however clear it may be, so long as he can doubt whether the proposition it seems to assert is true, or so long as he thinks the truth of that proposition has not been established. For so long as the truth of the matter is not established, we don’t know whether the thing agrees with reason or is contrary to it. So [on this theory] we wouldn’t know whether the literal meaning is the correct interpretation or not. (TTP 7, 188)
This, Spinoza thinks, is madness. One’s understanding of scripture must be drawn from scripture itself and not from independent considerations of what is true or false. The true meaning of scripture cannot be derived from the truth of fact.
What, then, is found when one turns to scripture? Spinoza’s answer is that, despite the diverse and sometimes conflicting outlooks of the biblical texts, one finds agreement over certain basic theological affirmations: that God exists; that God exercises providence over creation; that God rewards the righteous and punishes the wicked; and so on. One also finds agreement over certain basic moral affirmations: that one should uphold justice; that one should help the needy; that one should not murder; and so on. At the core of scripture, however, is a simple command to love God and neighbor by practicing justice and charity. It is this command, rather than any speculative doctrine, that scripture communicates first and foremost: “From Scripture itself we have perceived its most important themes without any difficulty or ambiguity: to love God above all else, and to love your neighbor as yourself” (TTP 12, 255). According to Spinoza, scripture places so much importance on this command that it identifies obedience to it as the means of attaining blessedness. In doing so, it opens a more democratic path to blessedness than that which is offered by philosophy. Everyone is capable of obedience in response to a command, but few are capable of following the guidance of reason and achieving enlightened understanding. This message of obedience, Spinoza insists, is what makes scripture so vital to human welfare. In its absence, humanity could never know that blessedness can be achieved by this means, and its hope for it would be greatly diminished:
We can’t perceive by the natural light that simple obedience is a path to salvation. Only revelation teaches that this happens, by a special grace of God, which we cannot grasp by reason. It follows that Scripture has brought great comfort to mortals. Everyone, without exception, can obey. But only a very few… acquire a habit of virtue from the guidance of reason alone. So, if we didn’t have this testimony of Scripture, we would doubt nearly everyone’s salvation. (TTP 15, 281-282)
In short, it is by teaching that blessedness can be attained through obedience – obedience to the command to love God and neighbor by practicing justice and charity – that scripture offers hope and guidance to the mass of humans in a way that philosophy, which speaks to the few, cannot. It is in virtue of this guidance and hope that scripture can be considered the word of God.
8. A Universal Faith
It is because he takes obedience as a means of attaining blessedness that Spinoza sees the value of faith to lie not in the truth of what it affirms but in the obedience to which it leads. In fact, a saving faith need not be true at all. If it leads to obedience, that is all that matters for the attainment of blessedness: “Faith requires, not so much true doctrines, as pious doctrines, i.e., doctrines that move the heart to obedience, even if many of them do not have even a shadow of the truth” (TTP 14, 267). As a consequence of this view, Spinoza does not think that individuals should be judged faithful because their beliefs are orthodox or faithless because their beliefs are heterodox. Such judgments should be based solely on the love of God and neighbor they do or do not exhibit. More concretely, they should be based on the justice and charity they do or do not practice.
This view of faith quite obviously leaves broad latitude in belief, particularly concerning speculative matters that do not bear directly on action. It thus leaves broad latitude for philosophy to go its way, unconstrained by whatever orthodoxy may prevail in society. Even so, Spinoza thinks that there are certain doctrines one must affirm if one is to be obedient – they are, as it were, conditions of the possibility of obedience – and he defines faith in terms of them: “[Faith is] thinking such things about God that if you had no knowledge of them, obedience to God would be destroyed, whereas if you are obedient to God, you necessarily have these thoughts” (TTP 14, 266). Spinoza enumerates seven of these doctrines, each of which enjoys consensus among the writers of scripture. These he calls doctrines of the universal faith:
God, a supreme being, exists.
God is one.
God is omnipresent and omniscient.
God has dominion over all things.
Obedience to God consists of justice and charity.
Those who obey God are saved.
God forgives those who repent.
It is easy to see why Spinoza would think that affirming these doctrines is necessary for obedience: If one did not believe that God exists, one would have no reason to obey him; if one did not believe that God is one, one would not regard him as pre-eminent; if one did not believe that God is omnipresent and omniscient, one would think it possible to conceal one’s actions from him; and so on. Each of these doctrines thus belongs to the universal faith precisely insofar as affirming it is necessary for obedience.
There are two additional things to note about these doctrines. The first is that they are subject to a variety of interpretations according to the diversity of individual understanding. In affirming that God exists, for example, one might believe that God is fire or light or spirit or something else altogether. In affirming that God has dominion over all things, to take another example, one might believe that God directs all things by his will or by the necessity of his nature or by some other means altogether. Each of these doctrines is thus to a large degree philosophically indeterminate. The only restriction is that they not be interpreted in a way that leads to disobedience. The second thing to note is that these doctrines are decidedly non-sectarian. It was Spinoza’s hope that they could be endorsed by Christians of all kinds and presumably by Jews as well. This would not eliminate doctrinal disagreements among those groups, but it might blunt charges of impiety leveled by one group against others on account of those disagreements. In this way, the widespread embrace of these doctrines, together with the understanding of faith that stands behind them, might lead to greater toleration and peace within society.
9. Religion and the State
Spinoza’s views on religion are interwoven with his views on the state. Of particular concern is the relationship between civil authority and religious authority, which is explored throughout the last five chapters of the Theological-Political Treatise. The unfinished Political Treatise, although indispensable as a source for Spinoza’s views on the state, is less helpful on this topic. It is focused on the most basic principles of political philosophy and is little concerned with religion.
Spinoza sees the state as emerging from an agreement among individuals to form a community and be governed by a common will. In making this agreement, they transfer their right to live as they please to the community and ultimately to the sovereign who has charge of it. This sovereign may be the entire people, as in a democracy; a restricted class of people, as in an aristocracy; or a single person, as in a monarchy. Among the consequences of this transference is that the sovereign holds authority in religious as well as civil matters. It thus falls to the sovereign to regulate such things as the performance of ceremonies, the activities of clergy, and the functioning of churches, synagogues, and other religious institutions. More broadly, it falls to the sovereign to determine how piety is to be manifest in society, which means that the sovereign must determine what justice and charity demand of individuals. Religious functionaries may advise the sovereign in this work, but they can have no independent authority. To grant them such authority, Spinoza warns, is to divide sovereignty and weaken it. It is also to invite opposition from clergy, who will use such authority to advance their interests against the sovereign.
This placement of religion under the sovereign’s authority may appear illiberal – it is directly contrary to the principle of separation of church and state – but Spinoza offers qualifications that mitigate this appearance. Foremost among these is that the transference of rights from individuals to the sovereign is of necessity incomplete:
No one will ever be able to transfer to another his power, or consequently, his right, in such a way that he ceases to be a man. And there can never be a supreme power who can get everything to happen just as he wishes…. So it must be granted that each person reserves to himself many things of which he remains the master, things which therefore depend on no one’s decision but his own. (TTP 17, 296)
Among the rights that individuals inevitably retain, according to Spinoza, is the right to think and express themselves as they like. Individuals may not engage in sedition, which the sovereign can and must suppress, but they may otherwise hold and voice whatever opinions suit them. If the sovereign seeks to deprive them of this right, it becomes tyrannical and sets itself in opposition to the very humanity of its subjects. In so doing, it risks destabilizing itself by fostering resentment and encouraging resistance.
Importantly, Spinoza thinks that the right of individuals to think and express themselves as they like extends to religion. Individuals can thus determine for themselves what to believe about God or any other matter of religious concern:
Therefore, since each person has the supreme right to think freely, even about Religion, and it’s inconceivable that anyone can abandon his claim to this right, each person will also have the supreme right and the supreme authority to judge freely concerning Religion, and hence to explain it and interpret it for himself. (TTP 7, 191)
In view of this, Spinoza recommends that the sovereign focus its attention on action. Specifically, it should focus its attention on whether individuals act toward one another according to standards of justice and charity that the sovereign itself determines. If they do, it should regard them as pious irrespective of what they believe:
[W]e conclude that nothing is safer for the republic than that piety and Religion should improve the practice of Loving-kindness and Equity, and that the right of the supreme powers concerning both sacred and secular matters should relate only to actions. For the rest, everyone should be granted the right to think what he wants and to say what he thinks. (TTP 20, 353)
The outcome of this policy is a tolerant society in which an individual’s religious belief is a matter of public indifference. It is also a society that allows individuals the freedom to philosophize without fear of being charged with impiety: “[T]he best republic concedes to everyone the same freedom to philosophize as we’ve shown that faith does” (TTP 20, 348). As far as the state is concerned, what matters is how one comports oneself in one’s dealings with others. In the final pages of the Treatise, Spinoza lauds Amsterdam as a city that to a remarkable degree embodies this toleration:
In this most flourishing republic, this most outstanding city, all men no matter what their nation or sect, live in the greatest harmony. When they entrust their goods to someone, the only thing they care to know is whether that person is rich or poor, and whether he usually acts in good faith or not. They don’t care at all what his religion or sect is, for that would do nothing to justify or discredit their case before a judge. Provided they harm no one, give each person his due, and live honestly, there is absolutely no sect so hated that its followers are not protected by the public authority of the magistrates and their forces. (TTP 20, 351-352)
Spinoza may be putting these words forward more as exhortation than description. The very text in which they were written came in for widespread condemnation, and in 1674 it was banned (together with Meijer’s Philosophy as the Interpreter of Holy Scripture and Hobbes’ Leviathan) not only in Amsterdam but throughout the Dutch Republic as well. Nevertheless, the relative openness of Dutch society, however imperfect and fragile it may have been, was something he recognized as unique and wished to see preserved. That such openness is now taken for granted in so many liberal democracies is in no small measure due to his ideas.
10. References and Further Reading
a. References
All passages from Spinoza are quoted from the translations provided in Curley, Edwin, ed., The Collected Works of Spinoza, vols. 1 & 2 (Princeton: Princeton University Press, 1985 & 2016).
Passages from the Ethics are cited by book and definition or proposition (Ethics IP11 = Ethics, Book I, Proposition 11).
Passages from the Theological-Political Treatise are cited by chapter and page number as found in the second volume of Curley (TTP 7, 170 = Theological-Political Treatise, chapter 7, page 170).
Passages from Spinoza’s correspondence are cited by letter number and correspondent (Ep. 75: to Oldenburg = Letter 75: to Oldenburg).
b. Further Reading
Carlisle, Clare. Spinoza’s Religion. (Princeton: Princeton University Press, 2021).
Curley, Edwin. “Notes on a Neglected Masterpiece (I): Spinoza and the Science of Hermeneutics,” in Graeme Hunter, ed., Spinoza: The Enduring Questions. (Toronto: University of Toronto Press, 1994) 64-99.
Donagan, Alan. “Spinoza’s Theology,” in Don Garrett, ed., The Cambridge Companion to Spinoza. (Cambridge: Cambridge University Press, 1996) 343-382.
Fraenkel, Carlos. Philosophical Religions from Plato to Spinoza: Reason, Religion, and Autonomy. (Cambridge: Cambridge University Press, 2012).
Fraenkel, Carlos. “Spinoza on Miracles and the Truth of the Bible.” Journal of the History of Ideas 74 (2013) 643-658.
Garber, Daniel. “Should Spinoza Have Published His Philosophy?,” in Charlie Huenemann, ed., Interpreting Spinoza: Critical Essays. (Cambridge: Cambridge University Press, 2008) 166-187.
Goldstein, Rebecca. Betraying Spinoza: The Renegade Jew Who Gave Us Modernity. (New York: Schocken Books, 2006).
Harvey, Warren Zev. “Spinoza on Biblical Miracles.” Journal of the History of Ideas 74 (2013), 659-675.
Israel, Jonathan. Spinoza, Life and Legacy. (Oxford: Oxford University Press, 2023).
James, Susan. Spinoza on Philosophy, Religion, and Politics: The Theological-Political Treatise. (Oxford: Oxford University Press, 2012).
James, Susan. “Narrative as the Means to Freedom: Spinoza on the Uses of Imagination,” in Yitzhak Melamed & Michael Rosenthal, eds., Spinoza’s Theological-Political Treatise: A Critical Guide. (Cambridge: Cambridge University Press, 2010) 250-267.
Lærke, Mogens. Spinoza and the Freedom of Philosophizing. (Oxford: Oxford University Press, 2021).
Levene, Nancy. Spinoza’s Revelation: Religion, Democracy, and Reason. (Cambridge: Cambridge University Press, 2004).
Lucas, Jean Maximilien. The Oldest Biography of Spinoza. Tr. A. Wolf (New York: The Dial Press, 1927).
Mason, Richard. The God of Spinoza: A Philosophical Study. (Cambridge: Cambridge University Press, 1997).
Matheron, Alexandre. Le Christ et le salut des ignorants chez Spinoza. (Paris: Aubier-Montaigne, 1971).
Nadler, Steven. Spinoza’s Heresy: Immortality and the Jewish Mind. (Oxford: Oxford University Press, 2001).
Nadler, Steven. A Book Forged in Hell: Spinoza’s Scandalous Treatise and the Birth of the Secular Age. (Princeton: Princeton University Press, 2011).
Nadler, Steven. “Scripture and Truth: A Problem in Spinoza’s Tractatus Theologico-Politicus.” Journal of the History of Ideas 74 (2013) 623-642.
Nadler, Steven. Spinoza: A Life. 2nd Edition. (Cambridge: Cambridge University Press, 2018).
Popkin, Richard. “Spinoza and Bible Scholarship,” in Don Garrett, ed., The Cambridge Companion to Spinoza. (Cambridge: Cambridge University Press, 1996) 383-407.
Ravven, Heidi. “Some Thoughts on What Spinoza Learned from Maimonides about the Prophetic Imagination, Part One: Maimonides and the Imagination.” Journal of the History of Philosophy 39 (2001) 193-215.
Ravven, Heidi. “Some Thoughts on What Spinoza Learned from Maimonides about the Prophetic Imagination, Part Two: Spinoza’s Maimonideanism.” Journal of the History of Philosophy 39 (2001) 385-406.
Rosenthal, Michael. “Spinoza’s Dogmas of the Universal Faith and the Problem of Religion.” Philosophy and Theology 13 (2001) 53-73.
Rutherford, Donald. “Spinoza’s Conception of Law: Metaphysics and Ethics,” in Yitzhak Melamed & Michael Rosenthal, eds., Spinoza’s Theological-Political Treatise: A Critical Guide. (Cambridge: Cambridge University Press, 2010) 143-167.
Smith, Steven. Spinoza, Liberalism, and the Question of Jewish Identity. (New Haven: Yale University Press, 1997).
Steinberg, Justin. “Spinoza’s Curious Defense of Toleration,” in Yitzhak Melamed & Michael Rosenthal, eds., Spinoza’s Theological-Political Treatise: A Critical Guide. (Cambridge: Cambridge University Press, 2010) 210-230.
Strauss, Leo. “How to Study Spinoza’s Theological-Political Treatise,” Proceedings of the American Academy of Jewish Research 17 (1947/48) 69-131.
Strauss, Leo. Spinoza’s Critique of Religion. Tr. E.M. Sinclair. (Chicago: University of Chicago Press, 1965).
Yovel, Yirmiyahu. Spinoza and Other Heretics: The Marrano of Reason. (Princeton: Princeton University Press, 1989).
Zac, Sylvain. Spinoza et l’interprétation de l’Écriture. (Paris: Presses Universitaires de France, 1965).
Zac, Sylvain. Philosophie, théologie, et politique dans l’œuvre de Spinoza. (Paris: Vrin, 1979).
Author Information
Blake D. Dutton
Email: bdutton@luc.edu
Loyola University Chicago
U. S. A.
Epistemic Injustice
The term “epistemic injustice” refers to a distinctive type of injustice in which a wrong is done to someone specifically in their capacity as a knower. Philosophers working at the intersection of epistemology and ethics, under the general umbrella of virtue ethics, have recognized that individuals are sometimes not given adequate credit for their claims, resulting in a credibility deficit that harms them in their capacity as rational agents and as knowers. The first explicit and groundbreaking account of epistemic injustice was given by Miranda Fricker in her 2007 book Epistemic Injustice: Power and the Ethics of Knowing. This article begins by providing a schematic overview of her influential account. It then examines in more detail the two main forms of epistemic injustice and her suggestions for overcoming them. It concludes with a brief survey of subsequent work that critically extends her initial account. An attractive feature of work on epistemic injustice is its attention to the central connections between moral value, knowledge, and power, especially as these intersect within the complex practices of everyday human lives. The harms of being denied credibility on the basis of gender, political identity, or socioeconomic class all highlight forms of epistemic injustice that matter to us and need to be overcome. As a contribution to social philosophy, this article highlights how these harms rest on flawed epistemic practices from the standpoint of human agents situated in complex social situations.
Fricker introduces the term to mark out a distinctive type of injustice in which a wrong is done to someone specifically in their capacity as a knower (Fricker 2007, 1). In her canonical account, epistemic injustice occurs in two basic forms: testimonial injustice and hermeneutical injustice. The first occurs when prejudice or bias causes a hearer to give a deflated level of credibility to what someone says; the second happens at an earlier stage, when a gap in our collective interpretive resources unfairly disadvantages someone in making sense of their own social experiences. These two injustices highlight ethical features of our everyday practices of knowing, features that emerge from the use of social power within our epistemic relationships and interactions.
In order to locate these forms of injustice properly, a fuller account of social power is needed. Fricker understands social power as a socially situated capacity to control others’ actions. Of special interest is the specific type of social power called ‘identity power’, which depends on a shared understanding of social identities. Inequalities in social power help to create testimonial injustices, where prejudice causes the hearer to give less credibility to a speaker than is deserved. The key type of prejudice in such cases is identity prejudice: the injustice suffered by a speaker is due to an identity prejudice held by the hearer, and it is this bias that results in a credibility deficit. The central case of testimonial injustice then turns on this identity-prejudicial credibility deficit (2007, 28). The identity prejudice that leads to testimonial injustice is enabled through the use of the social power described as identity power.
Identity prejudice influences a hearer’s judgments about speaker credibility by way of the ‘social imagination’, which involves prejudicial stereotypes as distorted images of social types (e.g., the image of immigrants as lazy). Fricker views credibility judgements as a type of perception, so the negative impact of prejudice and stereotypes is measured in terms of the way these distort our perceptions of someone’s credibility.
The nature of the ethical wrong done to the speaker in cases of testimonial injustice is that they are harmed in their capacity as a knower, as a giver of knowledge or informant. This capacity is further recognized as essential to human value and experience. As a result, this primary ethical harm is an intrinsic injustice, as it is tied to the value of one’s own personal identity as a knower. As we will see, experiencing this harm can result in serious negative consequences for one’s self-development.
Identity prejudice and its consequent epistemic harms are at the root of a further type of epistemic injustice known as hermeneutical injustice or interpretive injustice (these terms will be used interchangeably). This injustice results from having some important part of one’s social experience discounted because of a gap in society’s collective interpretive resources. Without the concepts necessary to make sense of this experience, the individual suffers epistemic harm. In contrast to testimonial injustice, the prejudice that prevents individuals from sharing their experiences is structural, as it indicates a lacuna in the social environment within which we make sense of human experience. This gap in interpretive resources leads to stereotypical identity evaluations that can be prejudicial precisely because they fail to engage with the experiences of the individual. Consider the case of post-partum depression prior to its recognition as a serious medical condition. Women suffering from this depression would be met with suspicion by husbands, doctors, and friends. Their experiences were not taken seriously because they could only be viewed through the existing conceptual repertoire, and the available social interpretive resources could not make sense of this experience as a genuine medical condition suffered by women. The lack of credibility given these women is the result of a prejudice built into the very social conditions of interpretation, which prove insufficient to make sense of their individual experience.
In both testimonial and interpretive injustice, then, identity prejudice serves as the main cause of the injustice. The more detailed exploration of these types of injustice below shows how the virtue of epistemic justice is offered as a local, transactional remedy for minimizing such forms of prejudice.
2. Testimonial Injustice and Social Power
We have been introduced to the idea of social power as a socially situated capacity to control others’ actions, and to the way its use as identity power contributes to the central case of testimonial injustice involving identity prejudice. In this section we explore this use of social power and its connection to testimonial injustice in greater detail.
We first begin with one of Fricker’s central illustrations. Consider the following example from the screenplay of The Talented Mr. Ripley, when Herbert Greenleaf attempts to understand what might have led to the disappearance of his son Dickie. Greenleaf silences his son’s fiancée Marge Sherwood, dismissing her suspicion concerning the circumstances of Dickie’s disappearance and the further suggestion that Mr. Ripley may somehow be behind it (Fricker 2007, 9). Marge’s suspicion is based on the surprising fact that Ripley has Dickie’s rings, which Dickie told Marge he would never remove. Greenleaf dismisses Marge’s suspicion as a case of ‘female intuition’, emotional and not based on facts. Here Greenleaf’s male identity enables him to silence and control Marge, in a clear use of social power as identity power. The result is his use of a gender identity prejudice against her ability to offer credible testimony in this emotionally charged situation.
The type of social power briefly illustrated here is defined as the “capacity we have as social agents to influence how things go in the social world” (2007, 9). To claim that power is a capacity is to affirm its existence even when it is not in active use. Power thus has both an active and a passive dimension. For example, the periodic giving of parking tickets by the traffic warden is an active use of power that remains in effect passively when tickets are not given. A further contrast can be drawn between agential and structural power. The first requires the exercise of passive or active power on the part of an agent, while the second requires no such agent but is spread out across the social system, or the part of the system, in question. An example of structural power would be a minority social group that tends not to vote. The fact that this group fails to vote is attributed to the structural power that operates on them in such a way that they do not exercise their voting rights.
However, the first type, agential power, still depends on social structure, since its effective exercise depends on practical coordination with other social agents. When university professors assign grades, this capacity depends on a whole host of administrators to be effective. The key point here is that all relations of power, whether agential or structural, depend on the specific social alignments of people. There is, in addition, more to social power than this basic idea of sociability, since Fricker rejects the classic definition of power, given by Lukes (1974), as involving the disruption of someone’s interests. In Fricker’s view, this perspective is far too narrow, since the use of power is not always against others’ interests. For example, students accept the professor’s power to assign grades. The key feature of the use of social power is rather to “effect social control” (Fricker 2007, 13).
This view can be seen in both types of power we have considered. In agential power, someone controls the actions of another person or group. In structural power there is no agent, but the actions of the group are still subject to control by means of the social structure in which they find themselves. Such structural uses of power work to create social order. With the example of nonvoting groups, we see that a complex network of institutions, laws, and their implementation can result in some groups engaging in forms of nonaction that undermine their basic rights. We are thus led to the following working definition of social power: “a practically socially situated capacity to control others’ actions, where this capacity may be exercised (actively or passively) by particular social agents, or alternatively, it may operate purely structurally” (Fricker 2007, 13). This definition is neutral with regard to the possible uses of power, yet it still supports a critical stance toward attributions of power and social control: whenever we recognize power being used, we should ask who is using this power, for what reasons, and whom exactly they are controlling or attempting to control.
We now need to consider the more specific type of social power that Fricker calls ‘identity power’ (2007, 14-17). This is a form of social power that depends on a shared view of social identity. Identity power can be both active and passive, as illustrated in terms of gender by the example of The Talented Mr. Ripley and Herbert Greenleaf’s treatment of his son’s fiancée Marge. Greenleaf’s use of identity power is active, since he uses it to silence her. But it might also be exercised passively, since she could be silenced without him doing anything at all. It is important to note that the participants in a social interaction need not accept the social stereotypes that promote the use of identity power in that situation for it to operate. Identity power works at the level of the “collective social imagination” (2007, 15): it is able to control actions whether or not we accept or believe the stereotypes involved. Identity power can further operate alongside other forms of social power, such as material forms (for examples, see Fricker 2007, 15-16). There can also be structural operations of identity power, as can once again be seen in the case of the non-voting group. The members of this group have adopted a social identity that includes the view that its members tend not to vote, which further supports their resulting decision not to vote.
The key issue with regard to identity power is the way it forms an important part of the mechanism of testimonial exchange. This is because we frequently appeal to social stereotypes in order to make credibility judgements concerning what others say. This use of stereotypes may or may not be acceptable depending on the actual stereotype used. If a stereotype promotes a bias or prejudice that works against the speaker, it can result in a loss of knowledge, since a biased, deflated judgment of the speaker’s credibility results in a failure to learn what the speaker may know. This further raises an ethical problem, since the speaker is harmed by our wrongly undermining their capacity as a knower.
The core case of this epistemic injustice is found in Fricker’s emphasis on ‘testimonial injustice’. Generally, there are two kinds of prejudice at work in testimony: ones that lead to credibility excess and those that lead to credibility deficit. In any face-to-face discussion we need to make an attribution of credibility. Error is no doubt part of that process, but generally assignments of excess credibility are advantageous, while those involving a credibility deficit are disadvantageous. Fricker argues that the central cases of testimonial injustice involve credibility deficit, precisely because it is here that one is most clearly wronged in one’s capacity as a knower. (Below, we will see that Fricker has been effectively criticized on this point.) Given the fallible nature of human judgment, assigning an unfair level of credibility to someone can often occur. However, in cases where no prejudice is involved, no ethical or epistemic harm has been done (2007, 21-22). The source of testimonial injustice is found in the prejudice or bias that results in an unfair and harmful lowering of a speaker’s credibility. Fricker concludes that the central cases of testimonial injustice involve identity-prejudicial credibility deficit (2007, 28).
In sum, social power in the form of identity power plays a fundamental role in undermining the credibility of speakers. It enables negative prejudice to influence our judgements of others’ testimony leading to epistemic injustice.
3. Stereotypes, Prejudice and Testimonial Injustice
We saw that the central cases of testimonial injustice turn on the way identity power is used to prejudice the judgement of someone’s credibility as an informant. Such uses of identity power further depend on the ease with which we communicate. In this section, we consider how communication enables prejudicial negative judgments through our use of stereotypes.
In order to engage effectively in communication, we must swiftly assess the credibility of what is said. This is accomplished through the use of stereotypes concerning the speakers we encounter, which facilitates the ease with which we are able to communicate with others. However, while the use of stereotypes is indispensable, it can be prejudicial, thereby corrupting these judgements of speaker credibility. Fricker provides an in-depth examination of how prejudicial stereotypes negatively influence our credibility judgments.
Stereotypes are defined as “widely held associations between a given social group and one or more attributes” (2007, 30). This definition is broad in the following three central ways:
It is neutral with respect to the reliability of the generalization made using a stereotype.
It highlights other types of cognitive commitments. Stereotypes need not be held as beliefs but can include feelings that are less transparent than beliefs.
It is open to stereotypes being positive or negative (2007, 30).
The value of stereotype use can be seen by looking briefly at the nature of testimonial exchange. In such situations we are faced with the following sort of question: How do I immediately determine how likely it is that what is said is true? Here, one makes a credibility judgment about epistemic trustworthiness, which includes both the competency and the sincerity of those of a specific social type. It is then, as Fricker further argues, both inevitable and desirable that hearers use generalizations based on reliable stereotypes. Familiar examples would include social stereotypes about the local family doctor or neighborhood shopkeeper. Everyday dialogue requires the kind of social categorization, in the form of stereotypes, that makes discussion and communication possible.
What may then happen within testimonial contexts is that an identity prejudice influences the use of stereotypes. This point can be explored through Arpaly’s example of Solomon, which highlights the difference between a non-culpable epistemic error and negative prejudicial judgments (Arpaly 2003, 103; Fricker 2007, 33-34). Stated briefly, Solomon, a boy growing up in a small rural community, is raised to believe that women are not inclined to abstract thinking and therefore are not as good abstract thinkers as men. In his local environment, he faces no counterevidence to this view, and so he is not proceeding irrationally. Fricker concludes from this case that not all prejudice involves ethical mistakes or flaws (2007, 34). However, once Solomon enters university and studies beside very capable female students, this changes: continuing to hold his previous belief would be irrational and the result of prejudice. Here the prejudice in question becomes a negative identity prejudice, the central case of testimonial injustice, where it gives rise to a firmly held belief accompanied by morally suspect feelings about a group of individuals. A negative identity-prejudicial stereotype at work in such cases of systematic testimonial injustice is then defined as “A widely held disparaging association between a social group and one or more attributes, where this association embodies a generalization that displays some (typically, epistemically culpable) resistance to counter-evidence owing to an ethically bad affective investment” (2007, 35). Here the stereotypical association is maintained in the face of contrary evidence due to morally suspect feelings one may have about a group. Not revising one’s belief in such circumstances reveals an epistemic failing that yields an epistemic injustice.
We can apply this clarified understanding of stereotypical associations and their role in identity prejudice to the case of our everyday talk with others. In such discussions, speakers and hearers perceive each other as trustworthy, that is, as both sincere and competent concerning what is said. One may perceive someone’s testimony in terms of background assumptions about how likely this person (as a member of a social type) is to be trustworthy. In the process, reliable stereotypes are used, or, at least, they are taken to be reliable (2007, 36). In the case of negative identity-prejudicial stereotypes, however, the prejudicial bias distorts this credibility judgement, negatively influencing the hearer’s perception of the speaker.
In the cases discussed so far, the prejudice in question is based on beliefs, but stereotypical associations may also be based in other, non-doxastic (not belief-based) influences on our perception of others. Testimonial injustice may even occur through such associations when our beliefs are inconsistent with them. Prejudicial stereotypes can then influence our credibility judgments without doxastic mediation, thereby making them very difficult to detect and correct. This non-doxastic mediation is further clarified through the work of Walter Lippmann, who treats a stereotype as a social image creating an association between a social group and certain characteristics, resulting in generalizations about this group (Lippmann 1965; Fricker 2007). Stereotypes based on this social imagery can have a deep visceral impact on judgements, influencing them without our explicit awareness. They can further influence our patterns of judgment even when the content of such imagery conflicts with our own beliefs. Fricker illustrates this conflict with the example of the feminist who, while believing in gender equality, harbors seemingly negative attitudes toward female political candidates (2007, 37). In this case, her prior cognitive commitments, held in imagination, retain their impact on how she perceives the social world even after the corresponding beliefs have been explicitly rejected. As a result, they are very hard to detect and change, requiring a special kind of self-awareness to see this kind of prejudice in one’s thinking and to take steps to correct it.
If images are prejudicial, they can be an ethical-epistemic liability, since they can influence our judgments without our being aware of it. This suggests that testimonial injustice happens often (2007, 39). Moreover, we lack the moral discourse or vocabulary necessary for the detection and criticism of these types of epistemic injustice. While the so-called social imagination leads to prejudicial images that remain difficult to detect, it may also, in a more positive vein, offer resources for social change. While prejudice can negatively influence perception and judgment, if the inconsistency between our imagination and belief can be brought to our attention, then our beliefs can be used to correct the prejudicial imagery. The opposite may also be possible: our prejudicial beliefs can be corrected by an unprejudiced social perception. For example, Fricker describes the case of Huckleberry Finn from Mark Twain’s Adventures of Huckleberry Finn, who believes that, because Jim is a slave, he is doing something wrong by helping Jim escape, but who still maintains an unbiased, empathic understanding of Jim as a fellow human being (2007, 40; Twain 1884). Our social imagination can then be seen as a potentially powerful force for social change. The key here is that the disagreement between social imagery and belief can serve as a crucial epistemic and ethical resource for reducing prejudice in our credibility judgements (2007, 45).
4. The Wrong of Testimonial Injustice
We have seen how prejudice can corrupt our credibility judgements, leading to testimonial injustice. The next issue to consider briefly concerns the nature of the injustice or harm done in such cases. Initially, we can note a kind of general epistemic harm, since testimonial injustice prevents us from acquiring the possible knowledge that someone is attempting to communicate, and which we might also find helpful (2007, 43). Even more significantly, however, we need to locate the precise nature of the additional ethical harm experienced by the speaker. Here we can follow Fricker, who usefully distinguishes between primary and secondary dimensions of this harm.
The primary harm highlights how individuals are wronged in their capacity as knowers. Being treated as a purveyor of knowledge is an essential human value, so that when it is violated, one suffers an intrinsic injustice. The human capacity for giving and sharing knowledge and information is central to our humanity, as it is crucially tied to our ability to think and reason. When this is denied us, individuals suffer from this epistemic wrong, and suffer further from what it means to be treated in this way. The import of this harm can be emphasized by considering what we might call ‘epistemic trustworthiness’, which at a minimum includes the attribution of both competence and sincerity (2007, 45). In cases of testimonial injustice, the denial of either of these features, singly or together, excludes one from the epistemic community. As a result, our basic capacity to reason and share knowledge, which is basic to our identity as human creatures, is denied.
Secondary harms occur as further consequences of this primary harm and can be both practical and epistemic. In terms of practical consequences, we might consider one-time instances of testimonial injustice where, for example, in a court of law, someone is found guilty rather than innocent. Other cases might involve a testimonial injustice that influences someone’s working life, such that they are seen as lacking the judgment needed for a management position; examples would include the professional disadvantages some women suffer through testimonial injustice at work. Secondary epistemic consequences yield clear cases of epistemic harm, where there is a general loss of what might be termed one’s ‘epistemic confidence’. Here we can consider cases of persistent testimonial injustice that lead to an overall loss of confidence in one’s general intellectual abilities, which could further hinder or limit one’s educational and intellectual development. Identity prejudice in the form of testimonial injustice can thus be seen as a general attack on someone’s epistemic authority (2007, 48). It is reasonable to wonder how deep an impact such secondary effects can have on someone’s life. As has been suggested, testimonial injustice could lead to an overall loss of confidence in one’s beliefs and their justification; knowledge may be lost as well. If this lack of confidence continues, there may be a general loss of interest in inquiry or in any engagement in the process of acquiring further knowledge. Less directly, this loss of epistemic confidence might prevent the development of certain intellectual virtues: for example, the intellectual courage needed to examine possible views, to stand firm until convinced of a mistake, or to stay focused until a position is adequately defended.
Such a stance requires an epistemic confidence that testimonial injustice could weaken. One’s beliefs may be given up too easily, resulting in the loss of knowledge. Furthermore, one’s intellectual character, which plays a key role in intellectual confidence, may be damaged.
Consider once again Fricker’s example from The Talented Mr. Ripley, when Herbert Greenleaf unjustly silences Marge Sherwood. This example lets us detect the testimonial injustice he does to her, which will later help to locate the epistemic virtue he lacks and to highlight the way his moral standing depends on his historical and social context. Clearly, he undermines Marge as a possessor of knowledge (2007, 87). His negative attitude towards what she says results in his ignoring a key reason for her correct hypothesis about Ripley’s involvement in Dickie’s death. Here we have the primary harm of testimonial injustice, where he unjustly downplays Marge’s remarks. Furthermore, Marge is constructed as a hysterical individual. She becomes what she is constructed as: emotional, intuitive, and unable to see the facts for what they are. This makes it seem as if Greenleaf’s credibility judgement of her is correct (2007, 88-89). This construction of her identity is both unfair and unjust. In addition, she is in fact correct in her suspicions and has good evidence in support of her view. Here we can note some of the epistemic consequences of the secondary harm, involving the loss of epistemic confidence that stems from one’s identity being negatively shaped by an epistemic bias. Once we take all of this into account, it becomes increasingly clear that Greenleaf should give her more credibility. As we will consider further below, Greenleaf’s ability to perceive Marge as a credible witness is flawed because it is formed in part through the gender prejudice of his time. We should further note that this example highlights the way a hearer can clearly fail to correct the identity prejudice at work in their credibility assessments (2007, 89). Greenleaf fails to bring any critical awareness to bear on the prejudice that distorts his perception of Marge into that of an unreliable informant.
Lastly, such reflections return us to the primary harm and its possible impact on one’s psychology. For example, Fricker describes how participation in mutually trustful dialogue is a chief part of the process for settling beliefs, and establishing one’s own identity, including social identity (2007, 53). Testimonial injustice excludes persons from the forms of trustful conversation required for the creation of these central features of personal identity. This exclusion is based on prejudice further inhibiting the development of personal identity and the formation of the self (2007, 54-55).
5. The Virtue of Testimonial Justice
We have been reviewing the serious harms that result from testimonial injustice, the most serious of which involve the devaluing of someone’s capability as a sharer of knowledge. Given the severity of these harms, it is vital to consider options for minimizing their negative influence. This section explores this issue in terms of the virtue of testimonial justice. Central to this examination is Fricker’s view that a hearer’s credibility judgment should be treated as a perception of the speaker as more or less trustworthy (2007, 70). A hearer’s ability to trust what is said is a capacity for a type of social perception that is informed by a background theory of the competence and motivations of social types in various social contexts (2007, 71). This further requires a well-trained ‘testimonial sensibility’, which a virtuous hearer gains through the right kind of epistemic socialization (2007, 72). This section provides an overview of this core idea and its importance for overcoming the harms seen with testimonial injustice.
Our testimonial sensibility develops through participation in, and observation of, practices of testimonial exchange. A body of collective and individual testimonial experience thus influences the development of the testimonial sensibility required of a virtuous hearer. It is through such experiences that we learn to extend trust when it is appropriate (2007, 83). Importantly, this sensibility is not fixed or static: over time we can develop a critical attitude toward the passive inheritance that conditions our credibility judgements. This is perhaps most apparent when new experiences of testimony are in tension with the past experience that has passively shaped our testimonial sensibility. In order to be epistemically responsible, we then need to adjust our sensibility to fit this new experience. Here we can revisit the example of the feminist who continues to distrust female political candidates (2007, 83). Her sensibility needs to be critically updated in light of her feminist beliefs, with the result being a more trusting attitude towards the statements of women involved in politics. Through critical reflection on our perceptual habits, hearers can change their sensibility, and this can be seen as part of our maturation in the face of new testimonial experiences. Testimonial sensibility is then viewed as a spontaneous critical sensitivity continuously adapting on the basis of new individual and collective experience.
We are then presented with a basic picture of human testimony in which hearers perceive speakers as more or less credible informants. Our perceptual faculty results from training that is historical, social, and individual. The existence of identity prejudice presents a danger to the ongoing development of our testimonial sensibility. With identity prejudices present in our social environment, we are at risk of causing testimonial injustice, the key case of which involves an identity-prejudicial credibility deficit. To counter this danger, we need a particular virtue, understood here as a way to prevent the harmful influence of prejudice on our credibility perceptions. Such a virtue would provide a kind of protection against the distorting influence of prejudice. We have already seen the kinds of epistemic harm done by Herbert Greenleaf when he unjustly silences Marge Sherwood: he undermines Marge as a possessor of knowledge, with the further consequence of weakening her confidence as an epistemic agent, and he fails to critically assess the prejudice that distorts his view of Marge as a reliable informant. We can now consider more precisely what sort of critical awareness was missing and the possible steps needed to prevent the harm from being done.
Our question then concerns what type of critical awareness is needed to avoid testimonial injustice (2007, 90). The influence of a prejudicial stereotype in Greenleaf’s testimonial sensibility highlights a corrupt operation of identity power between speaker and hearer. The concept “women” yields a social stereotype that distorts the credibility judgement assigned to Marge. We saw earlier that the operation or use of identity power controls who can convey knowledge to whom. In this case, Greenleaf uses agential identity power over Marge, thereby controlling who can provide knowledge and denying Marge this capability. Greenleaf has failed to correct the irrational operation of identity power that distorts his credibility judgement. This suggests that testimonial responsibility requires a distinctly reflexive critical social awareness (2007, 91). In other words, the needed corrective anti-prejudicial virtue should be reflexive in structure: the virtuous hearer recognizes both their own identity and its interrelations with the identity of the person they are listening to. This further requires examining one’s judgments for the possible influence of prejudice and then adjusting the level of credibility accordingly. In this way, exhibiting the virtue of testimonial justice involves neutralizing the impact of prejudice in the making of credibility judgements.
Fricker discusses several different ways this might be achieved (2007, 93-95). Usually, this requires an active corrective process, in which self-reflective monitoring is needed to correct one’s judgments. While this active reflection is the default approach, correction can also come about through familiarity. On meeting someone for the first time, we might harbor negative attitudes against them given the way they speak or their social class; however, as we come to know this person and become friends, such attitudes can disappear without any conscious reflection (for further examples, see 2007, 96). Because the presence of prejudice in society changes over time, our best hope of achieving the required reflexivity in judgement is through constant critical reflection. A further consequence is that possession of the virtue may always be partial: acquiring the virtue of testimonial justice involves a combination of spontaneity and ongoing critical alertness to the influence of prejudices, but even in the best of cases this will remain incomplete.
6. Hermeneutical Injustice
We now turn to a discussion of the other type of epistemic injustice described by Fricker, ‘hermeneutical injustice’, which is related in important ways to testimonial injustice but is distinct and arguably more fundamental. Feminist philosophers have emphasized the way relations of power limit women’s ability to understand their own experiences. Seen from an epistemological perspective, this means that those in power have an unfair advantage in structuring our collective social understanding and knowledge. These issues of power and their connection to knowledge come increasingly to the fore with hermeneutical injustice.
Shared understandings and knowledge reflect the perspectives of different social groups. Relations of unequal power can distort these shared interpretive resources in ways that enable the powerful to make sense of their social experiences while those with less power cannot (2007, 148). Women have collectively had to overcome routine social interpretive habits and work out exceptional interpretations of some of their confusing experiences together. This required them to develop resources for understanding their experience that were largely implicit in, or even absent from, the interpretive practices of their time. In the case of post-partum depression, for example, women were unable to understand their experience precisely because it was collectively ill-understood (2007, 149). In part this rests on a social unfairness, one that results in a kind of structural inequality between men and women. This points to the possibility of a kind of epistemic injustice that may need to be overcome. Significant parts of a woman’s social experience may be downgraded or not taken seriously, with the result being a loss of self-understanding. Once again, individuals are wronged in their capacity as knowers, resulting in the distinct type of epistemic injustice known as hermeneutical injustice.
Fricker locates such problematic experience within the ‘credibility economy’, which concerns the distribution of epistemic resources across a domain. In terms of epistemology, an uneven distribution of resources involving concepts, credibility, and knowledge, can lead to “hermeneutical marginalization” (2007, 153). A socially disadvantaged group can be blocked from access to knowledge, or access to communicating knowledge because of a gap in interpretive resources. This is especially so when these resources would help them understand the existence and nature of their marginalization. Such interpretive resources form part of our shared ‘social imagination’, concepts widely known and available for use in understanding ourselves and for communicating with others. They represent our shared tools of social interpretation and communication (see McKinnon 2016, 441).
This hermeneutical marginalization takes on an additional moral and political dimension since it indicates subordination and exclusion from some practice that would be of value to the participant (2007, 153). It can impact a wide range of social experiences or perhaps only a certain area of experience, and it can also be socially coerced. Given the complexity of social identities, one might be marginalized in one context (‘women’) but not in another (‘middle-class’), where one’s identity might determine one’s level of participation. Marginalization can be an effect of material power (socio-economic background) or of identity power, involving prejudicial stereotypes that present someone as unsuitable for a job. Moreover, this marginalization can be structural or agent centered (consider the example of ageism among employers (2007, 154)). What is negative about this form of marginalization is a structural prejudice that is present in our collective interpretive resources, and which is discriminatory (for examples see 2007, 155). Hermeneutical injustice can then be defined as “the injustice of having some significant area of one’s social experience obscured from collective understanding owing to a structural identity prejudice in the collective hermeneutical resource” (2007, 155).
Consider the further example of sexual harassment prior to its recognition as a serious social problem. Before being able to name or categorize a variety of workplace behaviors, women were unable to understand their own experiences and the discomfort they felt in terms of harassment (2007, 149-150). This changed with the emergence of consciousness-raising movements that enabled a better understanding of such experiences as falling under the category of harassment. “Sexual harassment” as a concept was absent from our social imagination before this occurred. Once the concept was available, however, women were able to understand their experiences and communicate them to others. They were also in a better position to limit such behavior and prevent it from happening. As a result, their hermeneutical marginalization was lessened, and in the process one type of hermeneutical injustice was minimized.
In the case of hermeneutical injustice, there is no agent; it is, as Fricker notes, “purely structural” (2007, 159). In addition, hermeneutical injustice may be further worsened through the testimonial injustice that can follow in its wake. Women suffering from post-partum depression prior to its recognition as a serious medical condition not only struggled with the structural interpretive gap that rendered their experience ill understood and therefore difficult to communicate; this was worsened when they were not taken seriously, due to gender prejudice, in testimonial contexts with, for example, doctors and husbands. Hermeneutical injustice can then be systematic, as members of marginalized groups are subject to an identity prejudice. Trying to explain their own poorly understood experience, they will receive a low credibility judgment because of the general difficulties associated with understanding it. If this is based on an identity prejudice, they are doubly wronged (2007, 159). Here, the structural prejudice in our shared interpretive resources is coupled with an identity-prejudicial credibility judgement in our testimony with others (2007, 159-160).
7. The Wrong of Interpretive Injustice and the Virtue of Hermeneutical Justice
We have seen that in cases of hermeneutical injustice, or what will also be referred to as ‘interpretive injustice’, there exists an asymmetrical cognitive disadvantage, where a collective hermeneutical gap impacts members of different groups in different ways. The doctor unable to diagnose post-partum depression is not harmed, but the patient is. In cases of harassment, the harasser is not harmed, but the harassed are. This asymmetry results from a concrete, social, and practical context that is influenced by a gap in our interpretive resources. Hermeneutical injustice is therefore a kind of structural discrimination, where a collective gap places some group at a discriminatory disadvantage. In this section we consider the exact nature of the harm in cases of interpretive injustice and, as was the case with testimonial injustice, seek remedies in terms of virtue.
We can begin by considering the analogy of a welfare state with free healthcare but no dental care health plan (2007, 161-162). The social structure of the state provides general access to healthcare, a notable good. But the further lack of dental coverage leads to a gap in healthcare provisions that can be seen as discriminatory for those who lack the material means to pay for such coverage. Like the cases seen with hermeneutical injustice, some group is asymmetrically disadvantaged by a collective structural social lacuna. As a result, those unable to pay for dental coverage are unjustly harmed by the group gap in social services.
We can then appreciate that the primary harm of interpretive injustice is located in this situated hermeneutic inequality (2007, 162). It is useful to briefly compare this with the wrong associated with testimonial injustice. The primary harm common to both involves a prejudicial exclusion from participation in the spread of knowledge. We have seen further secondary harms revolving around practical consequences that can negatively impact everyday practical life (2007, 162-163). There are also secondary epistemic disadvantages concerning a loss of epistemic confidence (2007, 163). However, there remains a central difference between the primary harm seen in testimonial injustice and that found in cases of interpretive injustice (2007, 168). The structural nature of the hermeneutical harm suggests that we need a new type of virtue to deal with hermeneutical injustice, something more than the virtue seen with testimonial justice. The virtue of testimonial justice secures the reliability of what is said without prejudice and can at times be achieved without reflective monitoring of prejudicial influence on one’s judgements (2007, 169). In the case of hermeneutical injustice, we must counteract a gap in the social environment that threatens not simply the reliability of what is said, but its very intelligibility. As a result, and unlike the case of testimonial injustice, the virtue must always be corrective in structure. What this means is that the virtue of hermeneutical justice must involve awareness (or alertness) to the possibility that someone has a problem communicating their experience when they are not the source of the problem. Instead, it is due to a gap in the social resources available to communicate that experience, the key point being that the speaker is struggling with an objective problem, not a subjective one.
This critical awareness or sensitivity, that is, the reflexive awareness on the part of a hearer, highlights how they must be aware of the relationship between their social identity and that of the speaker and its influence on the intelligibility of what is being said (2007, 169). Both testimonial justice and hermeneutical justice share this reflexive awareness since it guards against types of identity prejudice. Furthermore, this reflexive awareness enables a correction in credibility judgements where a lack of understanding of what is said may have led to a judgment of low credibility. However, in the case of hermeneutical justice, this involves not simply an assessment of whether the speaker is reliable but a further effort to recognize the truthfulness of what is said. In other words, in hermeneutical contexts, the responsible hearer makes a credibility judgment that is an assessment of the degree to which what is said ‘makes good sense’, that is, is a truthful interpretation: “The guiding ideal is that the degree of credibility is adjusted upwards to compensate for the cognitive and expressive handicap imposed on the hermeneutically marginalized speaker by the non-inclusive hermeneutical climate, by structural identity prejudice” (2007, 170). The hearer proceeds to minimize the way structural identity prejudice prevents the speaker from being understood. How would the virtuous hearer attempt to do this? With enough time, patience, careful attention, and understanding of the possible existence of epistemic injustice, the hearer can attempt to provide a more inclusive interpretive environment. This might include engaging in a more extended, engaged discussion and a commitment to the kind of ‘epistemic humility’ toward interpretation needed to create this more inclusive setting.
What else might be involved in this more proactive, more socially aware kind of listening, where one is attentive both to what is said and to what is not said? The lack of shared social experience can make this difficult. The responsible hearer needs to seek out further supporting evidence, as well as grounds for doubt concerning the reliability of what is said. Often there may not be enough time, requiring that one reserve judgement and keep an open mind about the credibility of the testimony. The aim or end of the virtue is to neutralize the impact of structural identity prejudice on one’s credibility judgment. The ultimate or final aim may sometimes be understanding, which falls more on the intellectual side. Other times the aim is justice and ethical virtue, reflected in our actions and practices. Sometimes it may be both. Fricker emphasizes that the practical context will decide the precise nature of the end pursued.
8. Extensions and Challenges
There have been many further extensions of these ideas, both in terms of additional clarifications of the nature of epistemic injustice and its further application in the fields of education, healthcare, law, science, religion and elsewhere. This concluding section briefly discusses three areas. First, the possibility of credibility excess as a form of epistemic injustice. Second, institutional forms of epistemic injustice and their possible remedies. Lastly, the possible scope and sources of hermeneutical injustice.
We earlier saw Fricker stress credibility deficit as the core case of epistemic injustice. Many commentators have stressed the need to recognize instances of credibility excess as leading to related types of epistemic injustice and harm (Davis 2016; Medina 2011). For example, Davis has argued that positive prejudicial judgments, and the inflated credibility that they engender, lead to epistemic harm. The ideas that Asians are all good at mathematics, or that African Americans are all good at sports, are two familiar examples. In these and related cases, Davis points to the possibility of identity-prejudicial credibility excess, noting that such credibility excess includes an ethically flawed affective investment, where this involves a marked inability to recognize the individuality of members of the same social group (2016, 487-488). Briefly, the harm of this credibility excess involves the speaker’s subjectivity being limited by other dominant hearers, in that they are allowed to contribute to the conversation only in ways that are viewed as distinct and unique, as captured in the positive stereotype used (2016, 490). To counter this, Davis suggests that the reflexive corrective virtue proposed by Fricker include a more expansive understanding of the larger epistemic landscape where discussion takes place (for details see 2016, 493-496).
Related ideas are discussed by Anderson, who argues that the existence of structural, institutional forms of epistemic injustice requires greater emphasis on the need for structural remedies (2012). Fricker’s exclusive use of individual epistemic virtues as remedies fails to address the kinds of structural injustices that may have locally innocent (non-prejudicial) causes. Structural causes of group-based credibility deficits may include what Anderson describes as ethnocentrism and a ‘shared reality bias’ (2012, 169-170). Ethnocentrism is a bias in favor of groups to which one belongs; while morally innocent and epistemically useful when directed inward, it can transmit structural injustice from one setting to another. If different groups engaged in inquiry are segregated based on their social identities, this may lead to systematic unjust group inequalities, where, for example, an advantaged group disregards the testimony of a less advantaged group. This serves to reinforce the epistemic disadvantages of the less advantaged group and to damage the epistemic standing of its members. In such cases, ethnocentrism causes a form of structural testimonial injustice.
The shared reality bias is a tendency of individuals who interact frequently to converge on their perspectives and judgments about the world. Again, such a bias is epistemically useful for individuals engaged in joint inquiry. However, when groups of inquirers are segregated along the same lines that define group inequalities, the shared reality bias will tend to insulate members of advantaged groups from the perspectives of the systematically disadvantaged. Members of the advantaged group will be unable to make sense of the perspective of those in the disadvantaged group. This is because the interpretive resources available have been developed for members of the advantaged group and their experiences. Such resources remain inadequate for understanding the experiences of those from whom they are isolated in the other disadvantaged group.
Anderson further notes that this shared reality bias, when coupled with inequality-grounding group segregation, is a cause of hermeneutical injustice (2012, 170). This epistemic injustice may in turn cause a structural testimonial injustice. Anderson’s account of hermeneutical injustice allows that marginalized communities may succeed in developing coherent accounts of their own experience, something that Fricker was, perhaps, slow to appreciate. They may nevertheless suffer from hermeneutical injustice in that the more advantaged are unable to understand them. It is, then, another case of a transactionally innocent cognitive bias that can spread a structural injustice into new areas.
Anderson concludes that epistemic virtue is needed at both individual and structural levels. In the cases of ethnocentrism and the shared reality bias, group segregation along lines of social inequality is the key structural feature that turns innocent, if cognitively biased, epistemic transactions into vectors of epistemic injustice. If group segregation is the structural backdrop to this structural type of epistemic injustice, then Anderson proposes that greater group integration is a structural remedy, and therefore a virtue of epistemic institutions (institutions concerned with the distribution of knowledge, such as educational or scientific communities) (2012, 171-172).
The idea that marginalized groups suffering from hermeneutical injustice can still develop an understanding of their own experience is further explored by Goetze (2018). He frames this central point in terms of the concept of ‘hermeneutical dissent’, where marginalized groups have developed their own interpretive tools for understanding their experiences. He first explains how to view our collective interpretive resources in order to accommodate such dissent and then shows how this results in interpretive injustice and harm. On his account, the primary harm of hermeneutical injustice occurs when a subject has a distinctive and important social experience that lacks intelligibility at a crucial moment (2018, 79). This intelligibility may be cognitive, when they lack the tools needed to make sense of their experience, or communicative, when they are unable to communicate their understanding to others. On the basis of this disjunctive characterization of the primary harm found in hermeneutical injustice, Goetze is able to offer a further taxonomy of the various species of such injustice in terms of where the gap in our collective hermeneutical resources appears (for details see 2018, 80-85). Lastly, he shows how this impacts our understanding of epistemic justice and virtue in such cases. As we have seen, he too recommends that the virtuous hearer exhibit epistemic humility with respect to their interpretive resources, and further recognize the need for a more expansive view of our epistemic environment, where this may include the kind of social integration championed by Anderson (2018, 86-87).
9. References and Further Reading
The following list highlights several classical treatments and some important extensions. For an extensive bibliography of earlier work, see the article on epistemic injustice in the Oxford Bibliographies Online. For two anthologies that treat a number of central themes and topics, see the Routledge Handbook of Epistemic Injustice, edited by Ian James Kidd, José Medina, and Gaile Pohlhaus, 2017 and Overcoming Epistemic Injustice, edited by Benjamin R. Sherman and Stacey Goguen, Rowman and Littlefield International, 2019.
Anderson, Elizabeth. (2012). Epistemic Justice as a Virtue of Social Institutions. Social Epistemology 26.2: 163-173.
Focuses on institutional injustice and wonders what kind of remedies are required to counter this structural form of epistemic injustice.
Arpaly, Nomy. (2003). Unprincipled Virtue: An Inquiry into Moral Agency. Oxford: Oxford University Press.
Discusses the difference between a non-culpable mistake and prejudice through the Solomon example.
Begby, Endre. (2021). Prejudice: A Study in Non-Ideal Epistemology. Oxford University Press.
An important study that discusses how our epistemic circumstances may result in prejudicial beliefs that are not morally blameworthy. Provides a contrasting position on the Solomon case discussed by Arpaly 2003, and Fricker 2007.
Davis, Emmalon. (2016). Typecasts, Tokens, and Spokespersons: A Case for Credibility Excess as Testimonial Injustice. Hypatia 31 (3): 485-501.
Convincingly argues that for many minority groups, credibility excess leads to serious types of epistemic injustice.
Doan, Michael D. (2018). Resisting Structural Epistemic Injustice. Feminist Philosophy Quarterly 4 (4).
A further consideration of institutional forms of epistemic injustice focusing on struggles that are political.
Dotson, Kristie. (2012). A Cautionary Tale: On Limiting Epistemic Oppression. Frontiers 33 (1): 24-47.
Presents a third type of epistemic injustice: contributory injustice. Here, a hearer is willfully insensitive to the interpretive resources used by a speaker thereby compromising their epistemic agency.
Freedman, Karyn L. (2020). The Epistemic Significance of #MeToo. Feminist Philosophy Quarterly 6 (2). Article 2.
Argues for the epistemic value of #MeToo testimony for survivors as hearers, as tellers, and for society at large.
Fricker, Miranda. (2007). Epistemic Injustice: Power and the Ethics of Knowing. Oxford: Oxford University Press.
Classic presentation of two main kinds of epistemic injustice: ‘testimonial injustice’ and ‘hermeneutical injustice’. Further offers the virtues of ‘testimonial justice’ and ‘hermeneutical justice’ as possible remedies.
Fricker, Miranda. (2012). Silence and Institutional Prejudice. In Sharon Crasnow and Anita Superson (eds.), Out From the Shadows: Analytical Feminist Contributions to Traditional Philosophy. Oxford: Oxford University Press.
Traces the development of Fricker’s interest in epistemic injustice and argues for the importance of institutional virtues of epistemic justice.
Fricker, Miranda. (2013). Epistemic Justice as a Condition of Political Freedom. Synthese 190 (7): 1317-1332.
Extends the idea of epistemic injustice into the political domain. Argues that institutions should cultivate institutional virtues of epistemic justice in order to prevent dominating citizens by denying them resources to question abuses of power.
Goetze, Trystan S. (2018). Hermeneutical Dissent and the Species of Hermeneutical Injustice. Hypatia 33 (1):73-90.
Provides a helpful taxonomy of the sources and scope of hermeneutical injustice through a discussion of the possibility of interpretive dissent.
Jackson, Debra L. (2018). “Me Too”: Epistemic Injustice and the Struggle for Recognition. Feminist Philosophy Quarterly 4 (4). Article 7.
Examines the lack of recognition found in forms of epistemic injustice experienced by victims of sexual harassment and assault.
Jenkins, Katharine (2017). Rape Myths and Domestic Abuse Myths as Hermeneutical Injustices. Journal of Applied Philosophy 34 (2):191-205.
Examines the way myths about rape and domestic abuse lead to distinctive types of interpretive injustice.
Lippmann, Walter. (1965). Public Opinion. New York: Free Press. First published in 1922.
Famous study that examines the use of stereotypes in the framing of public opinion.
Lukes, Steven. (1974). Power: A Radical View. London: Macmillan.
Classic analysis of power and its uses.
McConkey, Jane. (2004). Knowledge and Acknowledgement: ‘Epistemic Injustice’ as a Problem of Recognition. Politics 24 (3): 198-205.
Explores testimonial injustice as involving a failure of recognition.
Medina, José. (2011). The Relevance of Credibility Excess in a Proportional View of Epistemic Injustice: Differential Epistemic Authority and the Social Imaginary. Social Epistemology 25 (1): 15-35.
Argues that prejudicial credibility excess should be seen as a kind of epistemic injustice.
Pohlhaus, Gaile. (2014). Discerning the Primary Epistemic Harm in Cases of Testimonial Injustice. Social Epistemology 28 (2): 99-114.
Argues that the harm of testimonial injustice is not a kind of epistemic objectification as in Fricker’s work but should be drawn in terms of the subject-other relation, as seen in the work of Ann Cahill and Simone de Beauvoir.
Pohlhaus, Gaile, Jr. (2012). Relational Knowing and Epistemic Injustice: Toward a Theory of Willful Hermeneutical Ignorance. Hypatia 27 (4): 715-735.
Argues that “willful hermeneutical ignorance” should be considered a third type of epistemic injustice.
Twain, Mark. (1998). The Adventures of Tom Sawyer, New York: Dover Publications. First published 1876.
Classic American tale introducing Huckleberry Finn.
Yap, Audrey S. (2017). Credibility Excess and the Social Imaginary in Cases of Sexual Assault. Feminist Philosophy Quarterly 3 (4). Article 1.
Considers the epistemic harms of credibility excess through application to the case of sexual assault.
Postmodernism is a style of doing philosophy that is often distinguished from the analytic style. The Postmodern era is the time period when postmodernism was popular, especially in Europe.
Many scholars struggle to provide a definition that characterizes postmodernism. The fact that it lacks a unique identifying name but is called “post”—meaning that it comes after modernism—is a sign of this uncertainty. Ihab Hassan offers a useful list of characteristics applied to the postmodern from various theoretical and artistic fields. These include claims that it is playful, rhetorical, indeterminate, anarchic, performative, participatory, polymorphous, antithetical, dispersed, and anti-narrative, to name some of the most prevalent throughout the literature. He implies that each of these can be broken down further. Indeterminacy, for example, evokes ambiguity, discontinuity, heterodoxy, pluralism, randomness, revolt, perversion, and deformation. Although it is possible to list the names of likely postmodern thinkers, actors, and their productions from philosophy, culture, and the arts, there is considerable disagreement about who or what belongs to this category and who remains modern. The long list of concepts and the uncertainty about which thinkers should retain the title of modern make it useful to begin with a brief account of the Modern era in order to indicate which aspects of it might have continued to shape and influence its aftermath—the Postmodern.
Taken in its most specific sense, modernity has roots that were already visible in the 18th century. The Modern era arose in the mid to late 19th century and continued through the mid-20th century. The Marxist scholar David Harvey (1990) cites the philosopher Jürgen Habermas (1983), who defines modernity as the intellectual and cultural project that produced objectivity in science, the domination of nature, universal morality and law, and autonomy in the arts and culture (TCP 12, MIP 9). It was commonly assumed that the norms of scientific objectivity and factual truth were the tools that make possible the refutation of myth, superstition, religion, and hereditary systems of privilege such as a ruling aristocracy. Carrying this out required increased rationality in multiple areas of society.
Like the postmodern to follow, the modern or modernity “entails a ruthless break with any or all preceding historical conditions”, and so it is “characterized by a never-ending process of internal ruptures and fragmentations within itself” (TCP 12). This condition may appear to contradict the rationality of modernity, but it makes sense for an era that reached a low point with the onset of World War II and the Holocaust. The destruction of values and customs is not a peaceful process, and many of the luminaries of the Modern era were among its most active destroyers. Among them we find Georges-Eugène Haussmann (1809-1891) ripping up medieval Paris to create boulevards; Frank Lloyd Wright (1867-1959) tearing down the boundaries between domesticity and nature; Pablo Picasso (1881-1973) destroying the image to create cubism; Ezra Pound (1885-1972) remaking poetry with precise and clear language while seduced by fascism; Robert Moses (1888-1981) bulldozing neighborhoods to build parks and highways; and the positivist philosophers who fled the Nazis for the United States, while declaring their philosophy to be the only true one (TCP 18-19).
These actions have been justified as important preliminaries to the promotion and creation of something stronger; namely, significant rational and functional cultural forms that introduce more and better order than what had existed prior to these efforts (CCC 52). The image of the machine and technology provided an easy and comprehensible symbol for many modernists. Nation-states and cities were envisioned as machines for living and the expression of universal values that would guard against encroachment from nationalistic politics and reactionary forces. Nevertheless, these values did not stop others from making use of the technology and bureaucracy of modernism to support socialist, fascist, and communist revolutions (TCP 33). The defeat of Nazism and fascism in World War II established a period referred to as High Modernism, which envisioned linear progress, absolute truths, and rational planning as the means to an ideal social order bringing human emancipation and economic benefits to war-torn countries. Modernist values were presumed universal, relevant not only to first-world developed countries but also exportable to what was then called the “backward Third World” under the control of Western nations (TCP 35).
The modernist drive for efficiency served the growing corporate bureaucratic power by abandoning ornamentation and personalization, especially for homes and public buildings. Often massive geometrical structures prevailed in public spaces. This uniformity spread into the arts and literature as universities established the Western canon in those fields. This ideal of classic literature, philosophy, and visual and performing arts became the model for Western culture. In the visual arts, abstraction became dominant as it avoided the representation of the horrors of war including the human and cultural destruction caused by the atom bomb (TCP 36). In the 1940s and 50s, abstract expressionism was practiced by artists like Mark Rothko, Jackson Pollock, Helen Frankenthaler, and Willem de Kooning. These artists, along with many others in the United States and Europe, sought to express the alienation, anxiety, and violence of the war, even as they celebrated “rugged individualism and creative expression” (TCP 37). These values were popularly taken up and became the basis of a drive toward universalization promoting America as “the essence of Western culture” and as the site of the celebration of freedom of expression (TCP 37). Thus, expressionist arts arose as part of the cultural protest against the dominant political authorities advocating policies that advanced corporate power and consumer culture in the United States and elsewhere.
This largely first-world interpretation of modern life also gave rise to anti-modernist movements opposed to technocratic rationality and institutionalized power (CCC 38). The sociologist Daniel Bell (1976) blames modernism for making the postmodern rebellion possible. He maintains that the spread of modernist art into the social and political mainstream was bolstered by the idea that it celebrates “the free creative spirit at war” (CCC 40). This is the idea of a cultural hero who is an individual at war with a civilization that curtails their freedom. Bell argues that this sustained a culture of rebellion that came to exert an outsized influence given its relatively small numbers. He argues that this prevented most everyday citizens from making what Bell calls an intellectually respectable culture of their own (CCC 41). Modernism is Bell’s most inclusive term for a “self-willed effort of a style and sensibility” that sought to remain in the forefront of the rest of society, the so-called avant-garde (CCC 46). It was exacerbated when what he refers to as radical intellectuals—mostly in the arts and letters—were invited to contribute to mass media magazines and newspapers. The reality, Bell maintains, is that mass society generally seeks entertainment and not elitist culture. In his view, radical elements dominated culture by embracing the mass media and using it as a lifeline to proclaim their cultural superiority (CCC 45).
If Bell is correct, then it would have been mass media celebrity that gave rise to postmodernism as an extension of the most extreme aspect of modernism, in part governed by impulse and pleasure. For Bell, postmodernism is dedicated to tearing down boundaries without replacing them with stable forms linked to social and political hierarchies and an idealized pre-modern past (CCC 50, 51). Whether or not this holds up to scrutiny, it has been argued that the integration of modernist ideas and productions into mainstream consumer culture, on any scale, implies that modernism no longer serves as a “revolutionary antidote” to traditional cultural practices (TCP 37).
2. Postmodern Cultural Transformation
Andreas Huyssen (1984), who regards the postmodern as “a slowly emerging cultural transformation in Western societies” (TCP 39), reconciles these opposing viewpoints to an extent. According to Huyssen, something may appear on one level as the latest fad, advertising pitch, or hollow spectacle, when in fact it belongs to the change in sensibility and cultural transformation that occurs regularly in Western societies (MP 8). Thus, the movement from era to era does not invoke a wholesale cultural, social, and economic paradigm shift (TCP 39). Instead, the postmodern must be addressed as a historical condition and not merely a style. Understanding it benefits from examining and analyzing its critical and artistic practices, but it cannot be neatly defined. The point, Huyssen states, is not to ridicule the postmodern but to salvage it from both champions and detractors (MP 8, 9, 10). This calls for an increased emphasis on the outsize role the arts play in achieving this shift, because the activities of artists, writers, filmmakers, architects, and performers have propelled culture beyond the narrow vision of modernism. In its place, they demanded a new version of modernism (MP 9)—a demand that leads to the postmodern.
Huyssen seeks to clarify the confusion surrounding the meaning of postmodernism by distinguishing two versions of it. One arose in the 1960s and sought to mount a rebellious American revival of modernist European avant-garde movements such as surrealism. Its methods were both apocalyptic and visionary, sometimes an extension of modernism and sometimes an attempt at disruption. Americanization was one of the chief characteristics of this iteration of postmodernism, attacking European bourgeois institutions of modernist art and culture that separated art from everyday life. It embraced media, film, and photographic technology and validated popular culture, especially rock ‘n roll and folk music, as well as popular literature. Its goal, Huyssen states, was to undermine elitism (MP 20-3).
The second version of postmodernism emerged in the 1970s and turned its criticisms to the remaining modernist values of the 1960s in favor of a “culture of eclecticism” (MP 16). It incorporated motifs and codes from pre-modern and non-modern cultures, as well as those of popular mass culture, which it recognized in all its diversity (MP 25, 27). Additionally, the intellectual and cultural contributions of women and global minorities emerged as a challenge to essentialism and the Western canon and its traditions. Recent reassessments of the postmodern condition have drawn attention to the rise of gender, race, sexuality, and postcolonial studies, fields widely associated with the postmodern culture wars. These fields may still serve as a lightning rod for polemics in the face of global pandemics, ecological catastrophe, and technological encroachment.
The increasing pluralism of voices and perspectives creates a plurality of interpretations, which for some amounted to a loss of meaning and the dissolution of any unified representation of reality. This diversity, widely noted in the literature, nevertheless got its start in the project of modernity, championed by those who, like Habermas, retained faith in the modernist project and hoped that its values would create and maintain the rational organization of everyday life. However, given the increasing distance between the culture of experts and specialists and that of the general public, the modernist hold on the values of the Enlightenment appeared incomplete. The project to integrate objective science, morality, and art into everyday life remained unfulfilled. Nonetheless, although postmodernism seems contrary to the general values of the Enlightenment, aspects of the modernist project may still be present in theories of postmodernism. Many theorists do not view the efforts of postmodernism’s advocates and practitioners as eclectic and fragmentary. Instead, they describe the postmodern as a necessary discontinuity that contains elements of modernist liberatory values. If so, a full appreciation of the postmodern must address both its eclectic and fragmentary perspective and its liberatory aspect. The leading philosopher of postmodernism, Jean-Francois Lyotard (1924-1998), perhaps best articulates the theoretical framework of this view.
3. Lyotard and The Conditions of the Postmodern
At the end of The Postmodern Condition: A Report on Knowledge (1984 [1979]), Jean-Francois Lyotard declares, “The nineteenth and twentieth centuries have given us as much terror as we can take” (PMC 81-82). He objects to the social, cultural, political, and economic rationalization of society by highly bureaucratic governments and business entities that seek efficiency, uniformity, and control, asserting this new version of terror is a threat to the proliferation of new and diverse ways of life. Uniformity and diversity are among the oppositions that characterize Lyotard’s understanding of the modern and its distinction from the postmodern in the realms of philosophy, science, history, aesthetics, and culture. The implication is that the modern and postmodern do not exist merely in the philosophical or the theoretical realms but embrace all aspects of society and culture. They correspond to the epistemes, the acts, ideas, and new methodologies giving rise to knowledge in an era, as articulated by Michel Foucault in The Order of Things (1970 [1966]). From this perspective, the Modern era is a foreclosure of one recent episteme, that of modernism. Its closure is due to the reorganization of knowledge in the modern humanist tradition—its influence, development, evolution, source, and origin—followed by its replacement by the postmodern episteme of discontinuity, rupture, threshold, limit, and transformation.
Lyotard uses the word “postmodern” (a word he claims was already in use among sociologists and critics in America) to describe the condition of knowledge in the most highly developed societies. It has become a truism that the postmodern reflects skepticism about metanarratives, that is, theories that claim to be comprehensive or universal throughout the institutions of science, literature, and the arts, as well as philosophy. This definition does not yet clarify the use and practice of the term postmodern. For Lyotard, what it omits is a discussion of the rise of the postmodern as an aspect of computerized societies. The focus of science and technology on computers and their languages, the impact of computational technical transformations on research, and the transmission of acquired learning that values quantities of information are, for him, fundamental to the postmodern. Lyotard argues that society moved from the training of minds to the exteriorization of knowledge in the machine, knowledge available for sale and consumption acting as the principal driver in the worldwide competition for power. Learning circulates like money once did, and value becomes a matter of units of knowledge exchanged for decision-making, as opposed to knowledge dedicated to optimizing work performance for the sake of paying life’s ordinary expenses.
Computerization and science do not exist in a vacuum. Lyotard links them to politics and ethics, to the question of who decides what knowledge is, and who knows what needs to be decided. The experts with access to computer-generated data do not simply receive information. In their analyses, every “self” exists in a network of relations as sender, referent, and receiver, and the messages received are performative, denotative, evaluative, or descriptive. All are situated within the structures and uses prescribed by Ludwig Wittgenstein’s (1889-1951) language games, a method used to make sense of any episteme, not merely the postmodern. The philosopher Max Black (1979) describes language games as following one of three models: primitive, that is, simple patterns of language use such as those of a group of workers; sophisticated, such as language use requiring a background like that needed by translators; or finally, the general use of words that can degenerate into cliché (WLG 338-9).
Specifically, for the postmodern language game, there is “[a] cybernetic machine [that] does indeed run on information, but the goals programmed into it, for example, originate in prescriptive and evaluative statements it has no way to correct in the course of its functioning” (PMC 16). This procedure has removed Western societies from the narrative of traditional knowledge, such as that proposed by enlightenment rationality. Traditional narratives, Lyotard argues, have a three-fold competence: know-how, knowing how to speak, and knowing how to hear. These competencies transmit the pragmatic and useful rules of the social language into a fixed eternal form whose meaning is often obscure, yet they generally receive no independent verification. For Lyotard, the language game of the classic pragmatics of Western science is quite sophisticated and already produces judgments. It questions the validity of simple narrative statements and characterizes them as primitive (without objective knowledge) or opinion.
On the other hand, classic science uses only one language game, involving the production of verifiable or falsifiable statements about its referent. The verifiability or falsifiability of its statements contributes to its evolution as those statements follow strict rules that define proper scientific practices and make up its language game (PMC 25). Complementary to the “research game” is the “teaching game”—students who become experts, or competent equals. Separately, institutions form according to their own language games. People who work in civil institutions deliberate and formulate prescriptive utterances for justice, legislation, and civil laws. They are abstract subjects, that is, individuals who exclusively send and receive statements to be verified as either true or false. The effect is to destroy the traditional knowledge of peoples and communities not part of those institutions, and in principle, to reintroduce new narratives of knowledge, liberty, or freedom by creating new epistemes.
Still, the language games of science and those of ethical, social, and political practice conflict. Unification of these diverse language games would require a meta-narrative whose narrator must therefore be a meta-subject. This would be someone able to formulate and validate the legitimacy of the discourses of the empirical sciences and the institutions of popular cultures (PMC 34). Lyotard situates the initial unification process in the universities: first in the University of Berlin, founded early in the 19th century, and then, through the 19th and 20th centuries, spreading to the United States and other mostly Western nations. Institutional research and the spread of learning do not serve the state. For these universities, knowledge is entitled to define what constitutes the state and society (PMC 34). This is not knowledge of specifics such as nature, the state, or society, but knowledge of that knowledge; indirect knowledge, composed of reported statements incorporated into the metanarrative, the comprehensive account of a subject who vouches for the legitimacy of the method used to study nature, the state, or society.
Likewise, the practical subject—humanity—is animated by the self-management of its freedom and does not look to nature or natural law. Humanity’s laws are assumed to be just, because the very citizens subject to them also constitute them. This language game turns morality into an effect of knowledge, including science, by claiming that knowledge exists only to serve the practical ethical subject. Of course, as Lyotard points out, scientists will cooperate only if they agree that the prescriptions of humanity are just. Richard Rorty (1983) reiterates this view, arguing that for postmodernism there is no ground for loyalty and conviction apart from the fact that the beliefs, desires, and emotions of the members of various groups overlap with those of many others with whom they identify for purposes of moral or political discussion. These beliefs, desires, and emotions must be distinctive group features accumulated over time and used by the group to construct its self-image in contrast to other groups. The group narrative is not likely to be philosophical. Generally, it is an historical narrative largely legitimized by appealing to a multitude of artistic practices about the past and predictions about the future derived from history and the arts.
4. The Delegitimation and Legitimation of Knowledge
For Lyotard, these specific language games and how the meta-narrative legitimizes knowledge declined with the rise of technologies developed since World War II that emphasize technological advances, that is, means rather than ends. Post-World War II technologies also tended to promote the benefits of the individual enjoyment of goods and services under advanced liberal capitalism. Lyotard states that skepticism toward knowledge is already implicit in G.W.F. Hegel’s (1770-1831) speculative philosophy. In Hegel’s analysis, discourse referring to nature in all its breadth and depth must be sublated, that is, raised or transformed into another discourse without negating the previous ones. The original discourse about natural science, for example, can be cited in a second-level discourse, meaning it is not self-legitimating, and therefore not a true science but an ideology or instrument of power. This erodes the principle of legitimizing knowledge and dissolves the hierarchy of disciplines into flat networks of areas of inquiry dominated by institutes and foundations. From this point of view, universities are no longer at the forefront of research; they merely transmit knowledge “didactically replicating teachers rather than producing researchers” (PMC 39). The appeal to ethical, social, and political practices to justify discourses further corrodes the truth value of the search for knowledge in favor of the question of its relevance, which is exacerbated by the claim that science, as a language game with its own rules, has no privilege concerning other practices. Given this situation, for Lyotard, postmodernism arises to make sense of whether and where legitimation can be found.
For scientific research, legitimation has come to be situated in the syntax of formal deterministic systems which demand completeness, decidability, and independent axioms governed by the metalanguages of logic and mathematics. However, Kurt Gödel’s (1906-1978) incompleteness theorems expose the limits of formal systems, which cannot be both complete and consistent, leaving room for paradoxes and paralogisms. Gödel himself eventually offered the anti-mechanistic statement that “either . . . the human mind (even within the realm of pure mathematics) infinitely surpasses the power of any finite machine, or else there exist unsolvable diophantine problems,” where “diophantine” refers to polynomial equations whose solutions are required to be whole numbers (1951). Yet if the referent of science is reality, then scientific claims still seem to call for a standard of proof independent of inconsistent and unreliable sensory perceptions. If logic and mathematics cannot provide completeness, then technology will. Technology, however, operates on the principle of optimal performance, maximizing output and minimizing input. Its game is not truth, but efficiency. Insofar as technology is marketed, science must meet the demands of performance and the commercialization that produces wealth.
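For reference, the first incompleteness theorem invoked above can be put in its standard modern form (a textbook formulation, not Gödel’s or Lyotard’s own wording):

```latex
% First incompleteness theorem (standard statement):
% any consistent, effectively axiomatizable theory T containing arithmetic
% leaves some sentence G_T neither provable nor refutable.
T \text{ consistent, effectively axiomatizable, } T \supseteq \mathsf{PA}
\;\Longrightarrow\;
\exists\, G_T \,\bigl(\, T \nvdash G_T \ \text{ and }\ T \nvdash \neg G_T \,\bigr)
```

The second theorem adds that such a theory cannot prove its own consistency, which is the sense in which completeness and self-legitimation fail together.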
With completeness and legitimation eroded, Lyotard taps performativity—the power of language to effect action—as the only viable standard of proof. Insofar as performativity exists for the sake of power (understood as making the best move in the technical language game) rather than terror (eliminating other players), the performativity of procedures can substitute for the normativity previously provided by laws. As Lyotard famously writes, “[p]ower is not only good performativity but also effective verification and good verdicts” (PMC 47). This becomes the method by which the new logic of power legitimates science and law.
Educational institutions do not escape the new logic. In Lyotard’s analysis, they create skills with the guidance of experts in technology, mathematics, logic, and related fields. Thus, professional and technical training begins to dominate educational institutions, leaving behind students in the arts and human sciences for whom jobs are scarce. Job retraining and continuing education are offered randomly to facilitate employment and promotion to those left behind. Insofar as knowledge has become translatable into computer languages, the role of teachers is greatly diminished. Effectively, Lyotard claims, the teaching function is easily replaced by the data banks of machine learning, as the professor is held to be no more competent than these data banks and interdisciplinary teams. In place of learning to speak foreign languages, students carry out computer information searches and become fluent in the use of computers. As data banks replace the encyclopedia, usefulness emerges as the highest value. Interdisciplinarity and teamwork become “nature for the postmodern man” (PMC 51-2).
In science, the question of legitimacy serves as the driving force for the renewal of scientific discourse and the replacement of each generation of scientists. In support of this, determinism remains central to scientific aims, but deterministic predictions require knowledge of all the relevant variables. Classical mechanics satisfies this demand for events on the human scale, which move in continuous, orderly, and predictable patterns. This is not the case for microscopic quantum particles whose position and momentum are uncertain, so that knowledge of present states yields only probable knowledge of future states. Catastrophe theory and complexity theory further muddy the waters of determinism. Isabelle Stengers and Ilya Prigogine’s (1984) theory of an evolutionary universe, consisting of dissipative systems, irreversibility, and non-equilibrium mechanics, contributed significantly to limiting the sphere of deterministic natural events.
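The quantum uncertainty mentioned here is standardly expressed by Heisenberg’s relation, which bounds the joint precision of position and momentum (a textbook formulation, not Lyotard’s own):

```latex
% Heisenberg uncertainty relation: the product of the uncertainties in
% position (x) and momentum (p) is bounded below by hbar/2.
\Delta x \,\Delta p \;\geq\; \frac{\hbar}{2}
```

Because sharpening one quantity necessarily blurs the other, knowledge of a particle’s present state yields only probabilistic predictions of its future states.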
Lyotard concludes that these scientific developments imply that science has evolved to the point where it no longer produces knowledge. Few scientists would agree with this judgment, and fewer still would likely characterize their methods as “paralogy”, reasoning which contradicts logic. Lyotard identifies paralogy with the unpredictability of postmodern science. He traces this to the philosopher of science Paul Feyerabend’s (1970) anti-method epistemological anarchism, which includes open systems and local determinism. Searching for a validation principle, Lyotard turns to Niklas Luhmann (1995 [1984]), whose systems theory exhibits paralogy by advocating moves played in the so-called pragmatics of knowledge. This occurs when “someone” disturbs the order of reason within any given paradigm and establishes new rules for a new field of research within the language games of science. The result is discoveries that are unpredictable and only locally determinable, thus not universalizable. For Lyotard, scientific research neither advocates nor uses this “terrorist” model. In pursuit of performance, decision-makers tend to declare that society is ignorant of its needs. Yet in modern scientific research, every case is open and subject to possible revision. Interesting statements are those that depart from what is already known and so generate additional game rules.
The rules governing the statements of the pragmatics of science referring to their objects of study follow the appropriate moves of scientific language games. They impose a method that allows them to generate new ideas, but not all social systems operate alike. Universal consensus is dismissed for presupposing an impossible universal validity for the pragmatics of social life. Universal rules are viewed as outmoded and suspect, at most an end and not a process, leaving the idea and practice of justice unresolved. The pragmatics of science must be differentiated from social language games, which deviate from norms. For example, if the society renounces terror, local moves must be agreed to by players who accept the reality of certain uses of language that lead to actions in a specific space and time. The social contract can then be accepted as a temporary contract extending to all social and institutional domains, but it must remain ambiguous and impure. The necessity that players assume responsibility for rules and their effects is frequently ignored. Currently, computerization can either contribute to the instantiation of terror or support meta-prescriptions by supplying information and giving the public free access to data banks and computer memory. In principle, this can satisfy both the desire for justice and the desire for the unknown. However, it raises the serious question of whether it discourages individual and group responsibility.
5. The Limits of Postmodern Communication
Given the divergent rules of social systems, conflicts seem inevitable, especially if they are subject to a single rule of judgment. In The Differend: Phrases in Dispute (1988 [1983]), Lyotard defines this term by asking under what conditions a rule and a judgment produce a wrong. If a universal rule of judgment between heterogeneous parties, situations, or genres is lacking, how can conflicts between parties be resolved, in whose favor, and under what principles? Not only currently, but in disputes going back to ancient philosophy, phrases (speech acts including linguistic and extra-linguistic events, statements, objects, and so forth) cannot justly be translated into one another. As speech acts, they exist in the world and affect it. Genres of discourse link phrases to persuade, convince, vanquish, make laugh or cry, and so forth. But it remains doubtful that rules (regimens) governing reasoning, knowing, describing, recounting, questioning, showing, and ordering can link heterogeneous phrases. What happens when this kind of difference or heterogeneity, which Lyotard refers to as the differend, occurs between two sets of phrases, two genres used by different parties? What philosophical reflection could address the gap between these two ways of speaking and acting to “save the honor of thinking” (TD xii)? The question became crucial in the Postmodern era when formerly marginal groups and individuals began to speak and act publicly using novel methods.
For Lyotard, the task of philosophy is to show how problematic and difficult it is to link the discourse outside of philosophy, such as that of economics, exchange, and capital, with the discourse inside philosophy, that of academic mastery. It demands bearing witness to the differend, the difference between the outside and the inside, because there are stakes tied to every genre of discourse. This is less a conflict between humans than a conflict of phrases, the effect of phrases. Humans are situated in heterogeneous phrase regimens and so are tied to heterogeneous genres of discourse. In language, as Lyotard succinctly points out, attempts to resolve the differend through a supreme genre that encompasses the difference between them are undone by Russell’s paradox, according to which the class of all classes that are not members of themselves both is and is not a member of itself (Irvine and Deutsch 2021). The same can be said of a genre that claims to encompass everything, including itself: as one genre among the others, it cannot actually include itself. In other words, the genre that is part of a set of genres cannot be supreme. Thus, any genre of discourse claiming supremacy is making a paradoxical claim. It follows that the claim by capital or modernism or any other genre (especially politics) to encompass other genres is unjust and in vain.
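The set-theoretic form of the paradox invoked here can be written compactly (a standard formulation, not Lyotard’s notation):

```latex
% Russell's paradox: let R be the class of all classes that are not
% members of themselves; then R is a member of itself iff it is not.
R \;=\; \{\, x \mid x \notin x \,\}
\quad\Longrightarrow\quad
R \in R \;\Longleftrightarrow\; R \notin R
```

Substituting “genre of discourse” for “class” yields the analogous contradiction for any supreme genre that would have to contain itself among the genres it governs.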
For politics and philosophy, this situation is characterized by the victimization of interlocutors by language understood as communication. There always appears to be a residue of differends, conflicts that cannot be regulated by any idiom. Without new addressees, new addressors, new significations, and new referents, wrongs cannot be discussed and corrected. Furthermore, plaintiffs (those who lodge a complaint and request restitution or remuneration within a given set of rules or a genre of discourse) usually become victims. Conflicts will always be regulated in the idiom of one party, which does not recognize the harm or wrong suffered by the other. As a result, no tribunal can recognize the wrong suffered, and it is precisely this non-recognition that constitutes the injured parties as victims. For Lyotard, the process does not start with the differend being put into phrases but is signaled by a feeling, the sense that one cannot find the right words immediately, so new idioms are needed. These new idioms begin with silence, that is, with the recognition that what needs to be phrased exceeds what can currently be phrased. This is illustrated by victims of the Holocaust who did not see the inside of the gas chambers and are therefore refused the designation of “plaintiff” when confronted by Holocaust deniers who narrowly define the applicable terms. This is also the case for the labor power of the undocumented laborer who does not exist within the idiom of bourgeois social and economic law.
Lyotard’s commitment to saving the honor of thinking has been overlooked by philosophers and non-philosophers eager to reinstate metanarratives, as much as by those who embrace the conflict inherent in the differend. He is clear that genocide or terror cannot be justified or even represented, as any representation is already a misrepresentation. Yet he maintains that the incommensurability of regimens does not condemn philosophers to silence. Human historical-political progress may be incapable of direct presentation but stands as an object of Ideas of Reason according to the schema specified by Immanuel Kant (1724-1804). For Kant, reason is the highest human faculty, and its ideas meet with nothing in our experience. Instead, they are thought only in the mode of “as-if.” We think and act as if the ideas of reason are possible in experience. This means that the referent of the feeling Lyotard proposes is still to come. He argues the referent will also have to be a different feeling from the one that inaugurated the differend, calling this feeling enthusiasm. Lyotard maintains that it is impossible to have a sensible or empirical representation of terror. In place of the experience of a feeling arising from conflict, the philosopher can engage with the feeling of non-reconciliation. This would be the pleasure of displeasure, the incommensurability of the sensibility with the ideas of reason, and the anticipation of a phrase still to come. Enthusiasm must remain undetermined regarding time and so cannot be subordinate to mechanical causation. Neither a cause nor an effect, this event is an index, the sign of history that recalls, shows, and anticipates, presenting only free causality. For Lyotard, given the human predisposition to speculative reason, this mode of thinking, this enthusiasm, is already progress.
The potential for communicability between parties begins with the experience of the original feeling of conflict. It is an effect of the Kantian idea of the beautiful, the phrase of good taste, insofar as Kant argued that an appreciation of the beautiful requires good taste. Although one cannot find words for it, in principle appreciation remains possible. Enthusiasm, however, is a modality of the feeling of the sublime, a feeling of respect. It arises from the pain of the incapacity to present a solution to the differend and the joy in discovering that the latter, as part of nature, is exceeded by the ideas of reason. Thus, the passage between conflicting genres is an impasse, akin to the prohibition against attempting to represent the unrepresentable. Enthusiasm, aesthetically sublime and so indeterminate, exists beyond the bounds of sensibility. It is an impulse of the mind exceeding any sensible representation—formless and without empirical validation. Enthusiasm, Lyotard offers, encompasses the potentially universal idea of a communal sense, an ideal but indeterminate norm, a sentimental anticipation. It is the as-if presentation of the idea of a society free of national and local prejudices and imbued with the idea of a communal morality, both of which exist only in the idea and for which only indirect presentations will ever be found.
6. Rorty and The Broken Mirror
The American philosopher Richard Rorty offers a contrary approach to language games and social consensus. Rorty (1979) asserts the equality of all language games, ostensibly operating within the confines of his own metanarrative. He begins with the demand for the elimination of epistemology as the search for foundations that constrain and frame and so cause conflict between language games. He then argues for replacing epistemology with a version of hermeneutics, understood as an expression of his objection to sets of rules that lay out how to reach rational agreement. The work of epistemology, as Rorty sees it, is to find maximal common ground with others, but in so doing he finds that it creates conflict. His goal of reducing conflict is somewhat like Lyotard’s project, yet it operates by other means. The practitioner of hermeneutics intervenes in conflicting language games and carries out the job of an “informed dilettante”, mediating between various incommensurable discourses and hoping for agreement or fruitful and exciting disagreement (PMN 317). Rorty characterizes most of what this requires as translation, which he attributes to physicalism (the idea that everything is ultimately physical). Persons capable of carrying out mediation are united in a societas; they are persons united by civility, although it seems there is nothing more than this uniting them.
Rorty lays most of the blame for the epistemological view on Descartes and Kant: Descartes, whose demand for cognitive certainty produces the separation of body and mind, and Kant, whose transcendental philosophy claims that knowledge of the world is an inner mental representation. Rorty eliminates these two by embracing anti-dualism. According to anti-dualism, as advocated by Donald Davidson, beliefs and their referents are not two separate phenomena. He also takes up what he calls the anti-essentialist pragmatism of John Dewey, who argued that the human organism physically adapts to its environment.
However, even hermeneutics must initially operate with some discourse. Rorty refers to this as “normal” discourse by analogy with Thomas Kuhn’s (1962) concept of “normal science”, defined as an agreed-upon set of conventions or the conceptual scheme “of our own culture”, a normative conventional discourse (PMN 324). Presumably what is normal is some version of standardized English as used by philosophers. It is not a “language of unified science” that contains everything anyone could say, but “neutral – if unhelpful – observations language” (PMN 349). Hermeneutics must also take account of “abnormal” discourse—the failure to either know or follow normal conventions. It does so from the point of view of normal discourse, which it initially takes for granted. What is abnormal may be in the eye of the beholder, and Rorty works to clarify this point.
Rorty’s restrictions on discourse extend to the terms “objective” and “cognitive.” He characterizes these terms as products of the mirrored world of rational belief understood as correspondence to the real. He claims they are part of the wished-for agreement among inquirers in morality, physics, and psychology. Even more debatable, for Rorty, are the disciplines of chemistry, literary criticism, and sociology, whose signifiers are non-cognitive or merely operational terms. It becomes clear that his epistemological behaviorism is about agreements between persons charged with moral feeling, a consequence of Enlightenment values, which Rorty personally feels are justified (PMN 335).
For the cohort of agreeable persons on the other side of the epistemological binary, the term “subjective” refers only to something unfamiliar to that cohort. It can reveal nothing about what the term means for an individual, as this aspect appears to have been swept away, leaving only references to cohort behavior. Personal considerations, whether historical, physiological, or psychological, are illogical. They have withered away with anti-representationalism and the smashing of the mirror of nature. Rorty does not hesitate to wave aside the discourse of the mentally impaired or those he considers unintelligent, accounting for their intentions and actions without consulting them by basing it on “our science” (PMN 349).
This is amplified by Rorty’s criticism of a hypothetical transcendental self, which he defines as a self that functions as a moral and epistemological agent constituting a phenomenal world. This world is structured largely by the mind, specifically by the a priori categories of understanding, a position attributed to Kant. Of course, if one does not find coherent generalizations and categories in nature, one cannot simply create them. There must be a constant parallel between nature and knowledge of nature. Thus, Rorty cites the need for some constants. Only nature and matter, the subjects of physics, provide a stable background for stories about historical change more commonly attributed to moral law or poetry. Rorty’s “naturalism” gradually emerges, as it is the ability to predict others’ future behavior, rather than the meaning of their discourse, that dominates his thinking. It emerges also because he believes physics is a deterministic science that will eventually have the capacity to predict every aspect of human and animal behavior. There is no point in claiming that a person alters themselves in any respect through self-reflection when in fact they are altering themselves through a change of diet or sexual partner. It is not merely that there is no difference between nature and what he refers to as “spirit”. Rather, he argues that the latter is best explicated in terms of the former, ensuring that spirit or mind does not emerge as constituting or creating phenomenal reality. Ultimately, Rorty seeks to eliminate epistemology as the activity of spirit decisively and once and for all.
7. Baudrillard’s Logic of Simulation
Breaking the mirror of nature is not celebrated unequivocally by all postmodern thinkers. Jean Baudrillard does not embrace the loss of the magic and charm of the real, nor the passing of the equivalence of the sign and the real. In Simulacra and Simulation (1994 [1981]), he agrees that there is “[n]o more mirror of being and appearances, of the real and its concept” (SS 2). The consequences of this transition, however, are rather different from those Rorty predicted. Representation, Baudrillard argues, at least attempted to define simulation as a false representation and so absorbed it. But simulation on its own envelops any representation as just another simulacrum. The real truly becomes merely operational and is no longer real at all. All referentials, terms applied to specific phenomena, are liquidated, then artificially resurrected in another system, the system of signs, where they cease to refer to specifics and often play the role of metaphors. The simulation process threatens, if not destroys, all distinctions between true and false, real and imaginary. The disappearance is orderly, as the image, the representation, passes through several phases. First, it reflects a profound reality; then it masks and denatures that reality; then it masks the absence of any profound reality; finally, it bears no relation to reality at all and becomes pure simulacrum, pure imitation. As it transitions, the image proceeds from a sacramental good appearance, to a maleficent evil one, to the sorcery of a play of appearances, to pure simulation—from something to nothing.
The many entanglements of the simulacrum are profoundly evident to Louis Marin in the Disneyland amusement park. In his essay “Disneyland, A Degenerate Utopia” (1977), Marin describes Disneyland as symptomatic of the society of control, a logistically constituted illusion that masks itself as a site of freedom and equality. He argues it serves a much more disturbing function:
[I]t has become dedicated to the production of consumable collective images, which create and maintain the system of ideas and values used by social, political, and economic forces to maintain their dominance and somewhat secondarily, to promulgate their values, since ‘values’ are only in service to dominance (TRR 184).
Baudrillard affirms this, adding that the reason Disneyland is imaginary is to convince us the rest of America is real. For him, the rest is hyperreal; it is the effect of “the generation of models of a real without origin or reality”, a representation without a referent (SS 14). The child’s imaginary world of Disneyland is everywhere in America. Baudrillard states that many cities like Los Angeles and even all of California are immense scenarios recycling what he theorizes is the typical American imaginary of children and adults, the toxic waste of a hyperreal civilization. For this reason, the world outside of Disneyland is no different from the world inside. It is overrun by sexual, psychic, and somatic “institutes” that recycle lost faculties, bodies, sociality, and the lost taste for food in the form of natural and health food.
Baudrillard refers to this as the logic of simulation, in which models precede facts, essentially doing away with them. These models, or simulacra, circulate, constitute a magnetic field, and make simultaneous and contradictory interpretations possible, interpretations that appear true due to the subtle and elusive twisting of meaning. The logic of simulation effectively conjoins a system to its extreme alternative, like two sides of a curved mirror. Baudrillard cites the conjoining of the political left to the political right, and of desire to value, capital, and law. As Gilles Deleuze and Félix Guattari have claimed (1983 [1972]), desire is found to want its own repression and its investment in paranoid (isolated) and fascist (authoritarian) systems, due to this conjoining of desire with value, capital, or law.
The logic of simulation shows itself in numerous forms, such as that of the panopticon, a system of constant surveillance, which has infiltrated what it was once meant to survey. Likewise, the neutral, implosive violence of systems of deterrence has infiltrated the threat of nuclear terror, serving as a pretext for ever more sophisticated weapons. Space exploration, Baudrillard claims, was always a cover-up for “satellization”, the deployment of information-gathering satellites driven by programming and technical manipulation. He sees them operating to “model vectors of a system of planetary control”, which included securing the non-intervention of China in Vietnam and the rapid withdrawal of American forces once this was assured (SS 34). Novelist George Orwell’s concept of “war is peace” was thus verified as part of a series of artificial occurrences initiated to maintain the illusion of actuality—“wag the dog”, as it is called in the movies.
8. The Great Trauma of Postmodern Thought, Deterrence, and the Implosion of Meaning
The great trauma of Baudrillard’s era is the decline of strong referentials, which represents the death pangs of the real and the rational and results in fetishizing the most recent eras and events of real history to the complete indifference of spectators. This is evident in cinema, a “hot” medium, says Marshall McLuhan (1977), which plagiarizes earlier versions of itself and loses all relation to the imaginary. For example, the Vietnam-era film Apocalypse Now (1979) hits its viewers primarily with the horror of filmmaking in place of the horror of the actual Vietnam War. Television, a “cool” medium in McLuhan’s terms, acts as another deterrent. The cold light of television mesmerizes the viewer and transistorizes their neurons. Yet, it is unable to produce even an image from its studios that resembles the control centers of nuclear power reactors. Therefore, the televised, media-driven accounts of near-nuclear reactor meltdowns transform the hot power of destruction into the cold power of deterrence and the simulation of catastrophe. The only possible mitigation comes from an act of God or terrorists, both of whom can produce real, palpable violence.
Commensurate with these events is the circular relationship between information and meaning. Following the hypothesis put forth by Claude Shannon (1949), information operates as a purely functional code without relation to meaning. Baudrillard reformulates this to argue that information is also exhausted through staging, that is, by inviting participants to join the simulation of communication. It lures them into something they do not fully believe is real. The medium seduces them, despite its myth-like, indecipherable message concerning the real and the dissipation of its power. This makes the media akin to a terrorist operation, carrying both the simulation internal to the social system and that which destroys it. At the precise moment when the media urges the masses to take up the active subject position of freedom, emancipation, expression, or politics, they are equally pressed into the object position, renouncing the practices of the subject and renouncing meaning in their alienation and passivity.
9. Seizing Reality and Embracing Nihilism
The renunciation of meaning is evident in advertising, defined by the convergence of the economic, the political, and eventually, the social. Advertising functions as a matter of supply and demand and as an extension of propaganda, Baudrillard’s zero degree of meaning. He argues that Las Vegas, where advertising effaces all architecture, streets, and walls, absorbing everything onto the surface in a mockery of all signs, exemplifies this.
Similarly, cloning purges the genetic code of the uncertainty of sex. It is the site where the body becomes a message, serially reproducible like a cancer. The mind is no different, able to be psychotropically modeled from inside—without perspectival representation, mirror or discourse required. The three-dimensional hologram tracks the imaginary aura of the double, allowing the passing through of one’s ghostly body like an unborn dead twin, extracting the imaginary, synthesizing and materializing it. No longer an image, it is another hyperreality, a three-dimensional simulacrum, the “luminous ectoplasm of your own body” (SS 105). For Baudrillard, such technology is now an extension of the body as in J.G. Ballard’s novel Crash (1973), which he calls the first great novel of simulation—neither fiction nor reality. The novel describes functionalist machines (automobiles), functional sexuality, language, and the body, fused and confused with technology in its violating and violent dimensions. The advent of technological hyperreality makes it likely that Ballard’s novel is no longer science fiction, a genre Baudrillard declares is over because it is no longer possible to even conceive of an alternative universe in a world where technology has no bounds.
Baudrillard reminds the reader that animals once had a more sacred and divine character than humans and were thought worthy of being sacrificed as gods. Now, he concludes, “we have made of them a racially inferior world” (SS 130). They are no longer worthy of justice, only of sentimentality (as pets); nor are they worthy of punishment, only of experimentation in laboratories and extermination for consumption (SS 130). Like children and the so-called “mad” and “primitives”, animals do not speak, yet their silence weighs on those who do. They are made to serve as models of unreason and metaphors for virtue and vice, ecological systems, artificial intelligence, and the “phantasmatic register of the unconscious” (SS 135). Most paradoxical is animal-becoming as the model of deterritorialization or displacement of desire, the productive power of all forms of life. Because the animal is the most territorial creature of all, and its territory is open and circumscribed, it has become the site of animal and vegetal cycles, parentage and species, women and ritual, exchange and not death.
Finally, there is Baudrillard, the self-proclaimed nihilist accepting the radical destruction of appearances, the disenchantment of the world, and its abandonment to violence and interpretation. Baudrillard endorses nihilism without the nostalgia associated with it by cultural theorists like Theodor Adorno and Walter Benjamin. Thus, birth, death, value, sexuality, the social, and the real are remainders left behind by the rupture of alliances and the stockpile of repressed effects and representations. Even the universities are sites of the remainder—functioning only as zones for the shelter and surveillance of a certain class or a certain age of human, the youth. Against the system’s hegemony, Baudrillard advocates certain small reversals, such as the ruses of desire, the quotidian, even cooking. For him, however, even this fails. Whatever is carried out against the system by either theoretical or real violence merely checks it (without ending it) in broad daylight, and so is hopeless.
10. The Arts and The Ruin of Representation
Without irony, postmodernism is not one singular line of thought. There is a postmodern nihilation of modernism, a reactionary postmodernism, and a postmodernism of resistance, although none preclude other versions of postmodernism. The word “postmodernist” was reportedly used first by the art critic Leo Steinberg in his essay “Other Criteria” (1972) and later appeared in his book by the same name. It was used to describe works of art, specifically “a picture conceived as an image of an image” (OC 91). Such an image guarantees that no image will be presented as a real-world space, and so it allows “any experience” to be the content of its representation. Significantly, it belongs to a shakeup that goes far beyond works of art to anything that “contaminates purified categories” (OC 91). Thus, postmodernism is not limited to art, and anything may now move into strange territories and abandon the long-established “stand-by criteria”, the typical standards of evaluation (OC 91). This version of postmodern resistance is powered by the turn from purely theoretical constructs to cultural and artistic ones, which in many instances precede their theoretical siblings.
Rosalind Krauss (1983) maintains that postwar American art kneaded, stretched, and twisted the categories of sculpture and painting with enormous elasticity in the name of the new. Nevertheless, its covert message was historicism, diminishing newness and mitigating difference. She argues that the logic of sculptural monumentality began to fail by the late 19th century, when works such as Rodin’s monument to Balzac became siteless, suitable in any museum or cultural context because they were abstract and self-referential, embodying modernist negativity without positive cultural content. Krauss claims this idealist space was exhausted by the 1950s. Despite Fredric Jameson’s and Baudrillard’s protests that the postmodern emerged as nothing more than pastiche, for many it was an expression of the social order of late capitalism. It embodied the dangerous, explosive, subversive aspects of modernism and the hyperrealism of simulations, the great advertising screens, to which many theorists and artists objected.
Edward Said (1983) famously criticized Jameson for endorsing the discontinuity of a global perspective with local alliances. He argued that Jameson advocated a strong hermeneutic globalism, a Marxism that subsumes the local into the larger context and thereby abandons the “extra-academic world” to the politics of the new right and Reaganism. This left a small yet humane function to the humanities, which could only represent a humane marginality. They served the preservation and concealment of the hierarchy of powers that occupy the center, define the social terrain, and fix the limits of use functions, fields, and marginality.
For the art theorist and critic Craig Owens (1983), this is evidence that Paul Ricoeur (1913-2005) was correct in observing that the discovery of a plurality of cultures is never harmless. Plurality puts into question the hegemony of Western culture and exposes the system of power that authorizes some representations and blocks others. Out of this plurality, the psychoanalyst Michèle Montrelay (1978) identified women as the “ruin of representation”, exposing women as the blindspot in the discussion of postmodernism. Montrelay asserts that the ruin of representation refers to the unconscious representation of women as objects, which, unlike conscious representation, no longer refers to anything—it is just words removed from reality, no more than a text.
Referring to Montrelay, Owens argues that male theorists like Jameson tried to neutralize the issues women placed on a critical agenda by subordinating them to concerns about class. While Lyotard objected on behalf of postmodernism to the “grands récits”, the master narratives of modernity, it is the artist Mary Kelly who pragmatically catalogued the period from her child’s birth up to age five in her work “Post-Partum Document” (1973-1979), thereby rejecting and refuting the reduction of all narratives to a single grand story, whether about motherhood or female artists. Another of Kelly’s works, Interim (1984-89), documents a series of Kelly’s conversations with women in various social settings. She situates these discussions in terms of four categories: body (corpus), money (pecunia), history (historia), and power (potentia). Emblematic of this method, the images and conversations about the category “body” personify the five passionate attitudes of “hysterical” women photographed by the French neurologist Jean-Martin Charcot (1825-1893). Charcot, Sigmund Freud’s Parisian mentor, treated the women in his care with hypnotism. Charcot’s photographic subjects (who may have been imitating the hysterics with whom they were housed) and Kelly’s images make what she calls “a play dedicated to the production of unreason as a tangible event” a visible spectacle (OI 57).
Kelly was not the first feminist artist to work against tradition and stereotype. Her work contradicts the claim that artworks presenting “an image of an image” are removed from reality. Along with Kelly, multiple feminist performance and body artists took advantage of the plasticity of human personality made possible by the malleability of surfaces and appearances. Their art made use of self-referentiality to situate themselves as subjects and to expose their realities. Martha Rosler’s Semiotics of the Kitchen (1974/75) uses the sign system of domesticity—kitchen utensils. She names them in her performance, abstractly demonstrating their function with increasing ferocity while remaining impassive. Rosler has described this work as a postmodern image of TV culture and celebrated male chefs, while the language and objects of the kitchen reduce women to domesticity.
Valie Export’s “Women’s Art” manifesto (1972) set out the parameters for women artists shaping their own image in the work of art as an act of resistance. She theorizes that the mythology of male values allows “men [to] create and control the social and communication media such as science and art, word and image, fashion and architecture, social transportation and division of labor” (AW 187). Export’s most important material was the female body, used as a code or sign; she usually performed fully or partially nude. To challenge pornographic/erotic conventions, she exposed body parts, including her breasts or genitals, to be seen or touched while the rest of her body remained hidden. These acts always took place in the presence of attendees in the public sphere. Export directly observed each person who approached or touched her, returning their gaze. Non-sexualized parts of the body were covered or placed in a shocking context, such as rolling over broken glass, to deny the expected erotic bodily sign to the male gaze. Other significant postmodern feminist performance artists include Carolee Schneemann, Karen Finley, and Diamanda Galás.
Cultural studies scholar Meaghan Morris (1988) calls attention to the “sense of intrigue” that developed around the absence or withholding of “women’s speech” in discussions of postmodernism (TPF 376). She points to male counterparts who speculated that postmodernism was not pertinent to feminist concerns, admitting at best to the intersection of feminism and postmodernism. Morris asks under what conditions women’s work can figure in postmodernism other than being appropriately framed by male theorists’ discourse. She asserts the necessity of reclaiming women’s work and their names in the context of postmodern debates. This is necessary because an “overwhelmingly male pantheon of proper names” taken up as “ritual objects of academic exegesis and commentary” has successfully constituted postmodernism (TPF 378). Thus, she sees the occasional inclusion of women artists and theorists—by Owens and others—as token acts. The recognition of artists and theorists acknowledged to be postmodern but still not engaged with in the literature of postmodernism reenacts the “(feminized) object-language and a (masculine) meta-language” divide (TPF 379). It is precisely this type of division that has been brought into question by feminist scholarship, which has to be seen and read before it can be used.
11. The Architecture of Multivalence
The public visibility of architecture contributed to its extensive role in the practice and discourse of postmodernism. Architectural historian Charles Jencks (1977) points to the July 15, 1972 dynamiting of the Corbusier-inspired Pruitt-Igoe low-income housing development in St. Louis, declaring it the end of uninhabitable modernism and the beginning of postmodernism in architecture. An articulate spokesperson for postmodern architecture, Jencks argues that what is notable about postmodern buildings is that they speak to both the professional architect and the public concerned with comfort and tradition. Especially critical of Robert Venturi, who he judged could not reconcile opposing meanings within his buildings, Jencks advocates for the architectural language he called “radical eclecticism” (LPA 87). With this new language, a variety of styles could form a semantically justified creative synthesis that is at once variegated and witty, messy and ordered. To execute this effectively, the architect had to work in at least two directions and train in more than one style, to the point of developing a kind of stylistic schizophrenia.
Jencks sets out canons for production and preconditions for creativity, which he calls “emergent rules”. Not all these rules can be universally practiced; notably, they include principles of dissonant beauty or disharmonious harmony. Cultural and political pluralism justifies these apparent contradictions. They are realized in buildings that manifest a long list of characteristics: urban contextualism that fits into the street; anthropomorphic moldings and ornaments suggestive of the human body; parody, nostalgia, or pastiche with respect to past architectural styles; pluralistic content and signification or intertextuality; double-coding through irony, ambiguity, and contradiction; multivalence with respect to the environment; dramatic reinterpretation of tradition; new rhetorical figures or signs; and the absence of a center. Jencks states that every style is possible due to modern methods of fabrication and new modes of communication and scholarship. These canons remain subject to evolution, and the architecture they create can still be subject to dead-ends, imbalances, and urban problems.
Possibly first among the architects embracing the postmodern turn are Robert Venturi, Denise Scott-Brown, and Steven Izenour, whose book Learning From Las Vegas (1972) celebrated the style of suburban strip malls, Las Vegas casinos, and billboards. Their designs culminated in the “decorated shed”, using signs and images taken from popular culture in place of the architectural forms championed by the modernist Bauhaus (CCA 87). Venturi argues for complexity and contradiction in architecture, and by this he means to include complexity of meaning, derived from interior characteristics and context, and perceptual ambiguity and tension (LFL 16). He likens this to what he calls Kurt Gödel’s (1906-1978) proof of inconsistency in mathematics (strictly speaking, Gödel proved the incompleteness, not the inconsistency, of formal systems). It is an inclusivity that does not separate architecture from the experience of life and the needs of society. The elements of a building defined by the inclusive “tradition of both-and” make it possible for a single feature to serve multiple functions, even if it produces spaces that are good and bad, as long as they exist for the sake of the whole (LFL 23).
Venturi discovers these characteristics in architecture throughout its history. For example, he notes that even the apparent aesthetic simplicity of a Doric temple is only achieved through the subtleties and precision found in its distorted geometry, which give rise to the contradictions and tensions inherent in its apparent order. Venturi was much admired for his original ideas regarding architecture and popular culture. Yet, paradoxically, he also argued that architects should narrow their concerns and focus only on architecture rather than cultural change. Some criticized his concept and his focus on the creation of architectural functional boxes as supporting the ugly and banal. Despite his claim to plurality, Venturi ultimately made no effort to include Scott-Brown when he was awarded architecture’s Pritzker Prize in 1991.
12. The End of Postmodernism in Astral America
The theoretical and visual aspects of postmodernism converge explicitly in Baudrillard’s America (1988 [1986]), a text signaling the fulfillment and the dénouement of the postmodern condition. For the French intellectual, it is the journey from postmodern theory to postmodern reality. Baudrillard discovers the America of Venturi-Scott-Brown, the aesthetically sublime landscape envisioned by Lyotard, the indeterminate that is beyond the bounds of sensibility, the impulse of the mind exceeding sensible representation, and the formless, existing without validation. From the perspective of America, Europe simply disappears. Astral, from the Ancient Greek ἄστρον (star), refers in theosophy to the supersensible substance of the sensible realm. Baudrillard reveals the astral America of open freeways, deserts, motels, and minerals. He finds the America of the affectless succession of signs and images that can be understood only by traveling its physical and mental deserts. This aspect of America is the reality of the postmodern for Baudrillard, actualizing the “inhumanity of our ulterior, asocial, superficial world [which] immediately finds its aesthetic form here, its ecstatic form” in its critique of culture and the ecstatic form of disappearance (A 5).
Despite the desert’s silence and its luminous, fossilized network of inhuman intelligence, it is, for Baudrillard, the site of the society of complexity, hybridity, and intermingling. These are terms usually attributed to the postmodern and its alleged extermination of meaning. It is where the land, faces, and things are just what they are and nothing more. Baudrillard finds America to be the real hyperreality, the perfect simulacrum, whose inhabitants are the models for all simulation. Its culture arises in relation to the deserts, thus empty and radically nude. It is not the high culture familiar to visitors to Europe, not the old world, but a whole chasm of modernity, a simulation of culture that is a mirage. Its towns without centers and its nightlife take place on the strips where cars cruise up and down. It takes time for Baudrillard to adjust, yet eventually he concedes that the deserts are exemplary sites that banish sociality, sentimentality, and sexuality; they are the first places where words and intellect have no meaning. In the sublime beauty of Death Valley, Baudrillard finally revels in its pastel colors, soft air, and mineral light. Akin to an alien planet, it is the sign of 180 million years making plain the enigma of his own existence and the likelihood of humanity’s disappearance, which is quite possibly postmodernity’s ultimate prediction.
Ballard, J.G. Crash. New York: Fourth Estate, 1973.
Baudrillard, Jean. America. trans. Chris Turner. London and New York: Verso, 1988. Amérique. Paris: Grasset, 1986. (A)
Baudrillard, Jean. Simulacra and Simulation. trans. Sheila Faria Glaser. Ann Arbor: University of Michigan Press, 1994. Simulacres et simulation. Paris: Éditions Galilée, 1981. (SS)
Bell, Daniel. The Cultural Contradictions of Capitalism. New York: Basic Books, 1976. (CCC)
Beville, Kieran. “Understanding Postmodernism, A Preliminary Discussion of Issues Pertaining to the Feasibility of the Homiletic Task in the Contemporary Epistemological Context”. in Preaching Christ in a Postmodern Culture. Cambridge UK: Cambridge Scholars Publishing, 2011. (UPM)
Boolos, George. “Introductory Note to 1951 Gibbs lecture”. Kurt Gödel 1995: 290–304.
Gödel, Kurt. Collected Works III: Unpublished Essays and Lectures. S. Feferman et al. (eds.). Oxford: Oxford University Press, 1995.
Deleuze, Gilles, and Félix Guattari. Anti-Oedipus. trans. Robert Hurley, Mark Seem, and Helen R. Lane. Vol. 1 of Capitalism and Schizophrenia. Minneapolis: University of Minnesota Press, 1983. Originally published as L’anti-Oedipe: Capitalisme et Schizophrénie. Paris: Minuit, 1972.
Export, Valie. “Interview”. in Juno, A. and Vale, V. (eds.) Re/Search 13: Angry Women. San Francisco: Re/Search Publications, 1991, 186-93. (AW)
Feyerabend, Paul K. Against Method: Outline of an Anarchistic Theory of Knowledge. Minneapolis: University of Minnesota Press, 1991.
Foster, Hal. The Anti-Aesthetic, Essays on Postmodern Culture. Seattle: Bay Press, 1983. (AA)
Foucault, Michel. The Order of Things. trans. Alan Sheridan. London: Routledge, 1970. Les mots et les choses. Paris: Gallimard, 1966.
Habermas, Jürgen. “Modernity—An Incomplete Project”, in Hal Foster (ed.) The Anti-Aesthetic, Essays on Postmodern Culture, Seattle: Bay Press, 1983. (MIP)
Harvey, David. The Condition of the Postmodern, An Enquiry into the Origins of Cultural Change. Cambridge and Oxford: Blackwell Publishers, 1990. (TCP)
Hassan, Ihab. The Postmodern Turn, Essays in Postmodern Theory and Culture. Ohio State University Press, 1987. (TPT)
Huyssen, Andreas. “Mapping the Postmodern”. New German Critique. No. 33. Modernity and Postmodernity. Autumn (1984):5-52. (MP)
Jameson, Fredric. “The Cultural Logic of Late Capitalism”. New Left Review. 146 (July/Aug 1984): 53-92. (CLL)
Jencks, Charles. The Language of Postmodern Architecture. New York: Rizzoli, 1977. (LPA)
Kelly, Mary. “Re-Presenting the Body: On Interim, Part I”. in Psychoanalysis and Cultural Theory: Thresholds. New York: St. Martin’s Press, 1986. (OI)
Krauss, Rosalind. “Sculpture in the Expanded Field”. Hal Foster. ed. The Anti-Aesthetic, Essays on Postmodern Culture. Seattle: Bay Press, 1983. (AA)
Kuhn, Thomas. M. The Structure of Scientific Revolutions. Chicago: University of Chicago Press, 1962.
Luhmann, Niklas. Social Systems Writing Science. Redwood City: Stanford University Press, 1996.
Lyotard, Jean-François. The Postmodern Condition: A Report on Knowledge. trans. Geoff Bennington and Brian Massumi. Manchester: Manchester University Press. 1984. La Condition post-moderne: rapport sur le savoir. Paris: Minuit, 1979. (PMC)
Lyotard, Jean-François. The Differend: Phrases in Dispute. trans. Georges Van Den Abbeele, Manchester: Manchester University Press, 1988. Le Différend, Paris: Minuit, 1983. (TD)
Morris, Meaghan. The Pirates Fiancée. New York and London: Verso Books, 1988. (TPF)
Olkowski, Dorothea. Postmodern Philosophy and the Scientific Turn. Indiana University Press, 2012.
Olkowski, Dorothea. Gilles Deleuze and The Ruin of Representation, University of California Press, 1999.
Olkowski, Dorothea. The Universal (In the Realm of the Sensible). Edinburgh and New York: Edinburgh University Press and Columbia University Press co-publication, 2007. (URS)
Owens, Craig. “The Discourse of Others: Feminists and Postmodernism”. Hal Foster. ed. The Anti-Aesthetic, Essays on Postmodern Culture, Seattle: Bay Press, 1983. (DO)
Poli, Francesco. Postmodern Art, from the Post-War to Today, New York: Harper Collins, 2008.
Rorty, Richard. Philosophy and the Mirror of Nature. Princeton: Princeton University Press, 1979. (PMN)
Said, Edward W. “Opponents, Audiences, Constituencies, and Community”. Hal Foster. ed. The Anti-Aesthetic, Essays on Postmodern Culture. Seattle: Bay Press. 1983. (OACC)
Shannon, Claude E. The Mathematical Theory of Communication. Urbana:The University of Illinois Press, 1949. (MTC)
Steinberg, Leo. Other Criteria, Confrontations with Twentieth Century Art. New York: Oxford University Press, 1972. (OC)
Stengers, Isabelle and Ilya Prigogine. Order Out of Chaos, Man’s New Dialogue with Nature. New York: Bantam Books, 1984. (OOC)
Venturi, Robert. Complexity and Contradiction in Architecture. New York: Museum of Modern Art, 1985. (CCA)
Venturi, Robert, Denise Scott-Brown, and Steven Izenour. Learning From Las Vegas. Cambridge: MIT Press, 1972. (LFL)
The metaphysics of quantum gravity explores metaphysical issues related to research programs in theoretical physics clustered under the term quantum gravity. These research programs aim at the formulation of a theory that reconciles the theory of general relativity with quantum theory. The goal is not necessarily to come up with a unified single theory but, more pragmatically, to describe phenomena with a dual nature, embodying both quantum and relativistic features—such as black holes and the early universe.
Approaches to quantum gravity are not yet fully worked-out theories. Nevertheless, they already provide a certain partial understanding of physical reality in different ways. Remarkably, they do so with a striking similarity: they virtually all deny the existence of some features usually regarded as essential to the existence of spacetime (or space and/or time) such as its four-dimensionality, the existence of distances and durations between events, or even the very partial ordering of events.
This observation is particularly noteworthy, considering the pervasive influence of spatial and temporal organisation on the human mind across various facets of daily life and theoretical thinking, ranging from most ancient religions to contemporary scientific worldviews. The metaphysics of quantum gravity takes as its starting point the puzzling observation that physics could teach us that space and time are not fundamental. It draws on resources from traditional metaphysics to tackle a set of issues related to the possible non-fundamentality of spacetime, and it investigates its potential implications for venerable traditional issues in metaphysics.
The metaphysics of quantum gravity is a relatively small and new research field, and thus as of now, its focus has been on explaining how spacetime could emerge from a more fundamental and non-spatiotemporal ontology. Consequently, this article is equally focused on questions regarding the status of spacetime and the emergence of spacetime. Section 1 situates the field within metaphysics of science more broadly. Sections 2 and 3 investigate, respectively, the status of spacetime in different approaches to quantum gravity and a number of potential issues with its lack of fundamentality. The article then covers the nature of the emerging spatiotemporal ontology (Section 4) and the building relation that relates it to the underlying non-spatiotemporal ontology (Section 5). Section 6 surveys various potential applications of spacetime emergence to a number of debates in metaphysics.
The metaphysics of quantum gravity is both a part of the more general philosophy of quantum gravity, which encompasses other epistemological and technical issues, and of metaphysics. This section situates the metaphysics of quantum gravity in this more general context.
Metaphysics, as traditionally conceived, aims to ascertain the most abstract structure of reality. Some questions metaphysicians are typically concerned with are: What is time? How do objects relate to the spatial regions they occupy? What kind of relation is the one that relates fundamental to non-fundamental entities? Using results from the development of a number of approaches to quantum gravity, the metaphysics of quantum gravity thus pursues the traditional tasks of metaphysics while shifting its perspective in two ways.
First, quantum gravity raises new metaphysical questions—in particular that of how to categorize the nature of spacetime if it turns out not to be fundamental according to fundamental physics. However, as novel as the problems we are confronted with might be, they might lead us to support some philosophical claims that have already been argued for on completely different grounds. Here is one example of such views: we might be led to conclude that spacetime simply does not exist (see Section 5d). Arguments to the effect that space and/or time are unreal have been put forward independently of considerations about quantum gravity (famously so by McTaggart in 1908). Second, quantum gravity might also suggest novel answers to a number of preexisting metaphysical questions (see Section 6). On the one hand, this concerns further metaphysical questions about the nature of space, time, and spacetime, beyond the question of their relation to fundamental reality. On the other hand, an overwhelming number of metaphysical concepts have received analyses that rely on the existence of space and time, or of spacetime. If the fundamentality of spacetime is challenged by quantum gravity, then these analyses are equally called into question—at least in so far as they are supposed to apply to the quantum gravity level.
By taking the preliminary results of quantum gravity research as a basis for philosophical investigation, the metaphysics of quantum gravity belongs to the metaphysics of science: an approach to metaphysics according to which metaphysical arguments, claims and theories should be informed by our best science. The metaphysics of quantum gravity shares many methodological and conceptual resources with other areas of physics-oriented metaphysics. For example, problems of emergence have also been discussed in the context of non-relativistic quantum mechanics regarding the status of space (but, importantly, not of spacetime): if one accepts configuration space realism (Albert 1996, Ney 2012 and Ney 2021b)—according to which the fundamental physical space is a physical counterpart of the high-dimensional mathematical configuration space in which the wavefunction is defined—then the question arises whether and how the four-dimensional spacetime emerges from this underlying structure. In contrast to other areas of the metaphysics of science, however, metaphysicians of quantum gravity do not reflect on empirically established scientific theories but on approaches to quantum gravity that are currently under construction.
The need for a theory of quantum gravity arises from the fact that general relativity and quantum physics can hardly both be entirely correct. General relativity and quantum field theories are our best theories in their respective domains of description. As such, they provide excellent descriptions of the world. However, their predictive and theoretical capacities are effectively limited to these respective domains. General relativity, on the one hand, only produces good results in situations where we can neglect the quantum behaviour of matter. The Standard Model of particle physics, on the other hand, offers an excellent description of quantum matter to the extent to which gravitational phenomena involving high energy can be neglected. We thus lack a theory to fully describe phenomena with both quantum and relativistic features, such as black holes and the early universe. Overall, theoretical physics presents us with a situation involving two distinct frameworks with different physical ideologies or philosophies (here understood as sets of ideas suggestive of an ontology), and in which we have no satisfactory reason to privilege one or the other of the ideologies to guide us towards the ontology of the physical world. These two theoretical approaches cannot be easily unified beyond their respective domains.
The highly speculative character of quantum gravity research might raise doubts about the feasibility and relevance of pursuing a metaphysics of quantum gravity. What if none of the approaches to quantum gravity on the market turn out to be correct? Even worse, what if the successful theory of quantum gravity is so different from past approaches that it does not share any of the features deemed metaphysically important, and that past approaches display?
To react to this challenge, it is useful to distinguish two different strategies one might pursue in doing metaphysics of quantum gravity: one that deals with abstract issues across the board of different approaches to quantum gravity, and one that focuses on specific approaches to quantum gravity. A predominant view is that the work can be divided in this way for at least two reasons. First, pragmatically, working out general issues with spacetime emergence can be helpful to then solve more specific issues. Second, spacetime emergence may be related to general issues from the metaphysics literature with far-reaching implications. Note this strategy does not necessarily require investigating all approaches to quantum gravity. It can also focus on a limited set of approaches or aim at formulating results based on pre-theoretic constraints sufficiently disconnected from the theory, in the guise of what has been dubbed experimental metaphysics by Abner Shimony (Cohen et al., 1997). A notable exception to the separation of labour into these two equally legitimate strategies is Jaksland and Salimkhani (2023), who argue that the only valid metaphysics of quantum gravity should focus on specific approaches. Instead, this article follows the standard distinction between general and specific issues.
Now, let us return to the challenge formulated above. The first formulation of the challenge, which doubted that any of the existing approaches to quantum gravity will turn out to be correct, need not affect the more general strategy: (some of) the general features investigated on that strategy, and their metaphysical consequences, might survive in the correct theory. The second version of the challenge affects both strategies alike. Indeed, most metaphysicians of quantum gravity will reject the very sceptical attitude expressed in this challenge. However, even if current research in quantum gravity is as fundamentally misguided as the challenge suggests, this need not render research into the metaphysical consequences of existing quantum gravity approaches futile: maybe such metaphysical considerations can help to open up conceptual possibilities needed to develop the unheard-of correct theory of quantum gravity.
2. Quantum Gravity
As of December 2024, there is no consensus on what is the most promising approach to formulating a theory of quantum gravity. The most conservative attempt is called semi-classical gravity. It tries to conserve elements of each framework by combining them without drastic modifications. This approach is a conceptually hybrid creature akin to a computational tool, which does not appear to wear any clear and complete ontological commitment on its sleeve. So, it is natural to regard semi-classical gravity as a mere step on the path to quantum gravity. Beyond that, different approaches have been advanced and are under constant development. String theory, loop quantum gravity and causal set theory, to name some candidates, are at different stages of elaboration. Some of them, like string theory, are mature research programs involving thousands of researchers. Others, like causal set theory, are at a much earlier phase of development and involve only dozens of researchers. No empirical test has been able to give the edge to one of these approaches over the others, and experimental procedures are currently being developed (Huggett, Linnemann, and Schneider, 2023).
Approaches to quantum gravity each come with their specific issues, including problems of spacetime emergence. The present section briefly introduces three approaches to quantum gravity, demonstrates how spacetime could fail to be fundamental in these approaches, and discusses the prospects for spacetime to remain fundamental in quantum gravity. The approaches discussed here are by no means exhaustive, and their selection simply reflects the knowledge of the authors of this article. Other popular approaches that are philosophically fruitful include but are not limited to: canonical quantum gravity, group field theory, shape dynamics, asymptotic safety, Penrose’s gravitationally-induced collapse approach, non-commutative geometry and causal dynamical triangulation.
a. String Theory
String theory is the most popular research programme in quantum gravity. (For accessible presentations, see Greene 1999; Dawid 2013; Zimmerman Jones and Sfondrini 2022; for textbooks, see Zwiebach 2009; Blumenhagen et al. 2013; Tomasiello 2022.) According to a rough understanding of the formalism, reality is constituted by one-dimensional strings and other higher-dimensional entities called “branes.” Those entities have various properties, such as vibrational modes, size and topology. A number of states of this underlying ontology correspond to the particles of the Standard Model of particle physics. Some states of closed strings correspond to the graviton, the particle posited to mediate the gravitational interaction. There are not one but five string theories, and they are usually regarded as approximating an even more fundamental theory. (For an introduction to string theory aimed at philosophers, see Le Bihan 2023.)
String theory jeopardises the fundamentality of spacetime in at least three different ways.
First, for reasons of mathematical consistency, the background spacetime has not four but rather ten dimensions: nine spatial dimensions and one temporal dimension. The dimensionality of spacetime thus becomes problematic, and a story about the emergence of the four-dimensional spacetime from a ten-dimensional spacetime is required. To make things even worse, the five ten-dimensional theories are conjectured to approximate a more fundamental, non-approximative, eleven-dimensional theory called M-theory, involving ten spatial dimensions and one temporal dimension, or perhaps a twelve-dimensional, non-approximative, theory named F-theory, postulating ten spatial dimensions and two temporal dimensions.
Second, the five string theories can be described as quantum field theories on two-dimensional worldsheets, which one can visualise, at least to a good approximation, as the extension of one-dimensional strings in an external temporal direction, just as we can view the trajectory of a particle in time as a spacetime line, a one-dimensional worldline, in the more familiar relativistic context (Le Bihan, 2020, Section 3). This worldsheet perspective presents us with a picture of quantum fields fluctuating on a two-dimensional manifold, in which strings and branes do not exist qua objects. The manifold’s metric is conformally invariant, strongly suggesting that there is no fact of the matter about distances and durations between elements of the manifold. If the worldsheet approach has ontological teeth, then we need to understand the emergence of the relativistic four-dimensional spacetime of general relativity from a two-dimensional surface lacking meaningful notions of distance and duration between its elements.
Third, the five string theories have a surprising feature. They have been shown to be empirically equivalent in a remarkable way, casting doubt on the very existence of the spacetime in relation to which they are defined. They are not merely empirically equivalent but also physically equivalent in a stronger sense. The empirical equivalence of two theories can be defined as the existence of a systematic correspondence between all possible measurable quantities of the two theories, such that empirical evidence cannot decide in favour of one of the theories over the other. Physical equivalence is a stricter condition insofar as there is also a systematic correspondence between the unobservable quantities of the two theories, thus generating inter-theoretical “giant symmetries” (De Haro and Butterfield, 2021, p. 2974). These correspondences are called “duality relations,” holding between duality-related models, theories, and quantities. The philosophy of duality is usually approached in a very mathematical and non-metaphysical way (but see Le Bihan and Read (2018) and Le Bihan (2023) for an introduction and discussion of the ontology of duality aimed at philosophers). Duality has been used to argue against the reality of the ten-dimensional spacetime, since duality-related models of two theories will not share the spacetime metric (T-duality) and sometimes not even the same topology (mirror symmetry) (Huggett 2017; Matsubara and Johansson 2018). There is therefore no general agreement on the exact ontology of string theory, but there are strong reasons to doubt that the structure we refer to as spacetime in relativistic physics remains present at the more fundamental level described by string theory.
Aside from questions over the fundamentality of spacetime, string theory leads to questions regarding the reality, fundamentality and ontological categorisation of strings and branes: are strings genuinely fundamental objects according to the general framework of string theory, or are the branes the only fundamental entities of the approach (Le Bihan, 2023)? That strings should be eliminated from the ontology of string theory in favour of branes can be motivated by the fact that M-theory appears to merely include branes and not one-dimensional strings. The opposite view that strings are more fundamental than branes has also been defended, but in the context of the five string theories (Vistarini, 2019).
b. Loop Quantum Gravity
Unlike string theory, which starts with a modification of the Standard Model of particle physics and tries to recover gravity, loop quantum gravity (LQG) is a general-relativity-first approach. It is also a geometry-first approach in that it focuses on the construction of spacetime and the gravitational field, without taking into account the quantum physics of matter. This section presents an extremely condensed and superficial overview of the emergence of space and time in LQG.
LQG refers to two distinct approaches: canonical loop quantum gravity and covariant loop quantum gravity (Rovelli, 2004; Rovelli and Vidotto, 2014).
Canonical LQG is built on a Hamiltonian reformulation of general relativity that is easier to quantise than the standard formulation. This formulation of general relativity goes against the original spirit of general relativity by forcing a foliation of the spacetime into an objective ordering of three-dimensional spaces and a universal time, thereby ruling out solutions of the theory that cannot be foliated this way (the non-globally-hyperbolic solutions). Then, to move from classical spacetime to a quantum structure, those classical three-dimensional spaces are transformed into quantum states via a technical procedure called canonical quantisation, and these states can enter quantum superpositions, like quantum matter in textbook quantum mechanics. These three-dimensional quantum states are defined over a Hilbert space (the mathematical configuration space that describes the possible states of the system at hand). They are spin network states, which can be described by combinatorial graphs of links and nodes, and by numbers associated with both links and nodes.
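As a purely illustrative aid (with no physical content), the combinatorial skeleton just described, a graph whose links and nodes each carry a number, can be sketched as a small data structure. The particular nodes, labels and spin values below are made up for the example:

```python
# Toy combinatorial skeleton of a spin network: a graph whose links and
# nodes each carry a number (half-integer "spins" on links, labels on
# nodes). All labels here are arbitrary illustrations, not physics.

class SpinNetwork:
    def __init__(self):
        self.node_labels = {}    # node -> label on the node
        self.link_spins = {}     # frozenset({a, b}) -> spin on the link

    def add_node(self, node, label=0):
        self.node_labels[node] = label

    def add_link(self, a, b, spin):
        if a not in self.node_labels or b not in self.node_labels:
            raise ValueError("both endpoints must already be nodes")
        self.link_spins[frozenset((a, b))] = spin

    def valence(self, node):
        """Number of links meeting at a node."""
        return sum(1 for link in self.link_spins if node in link)

net = SpinNetwork()
for n in ("n1", "n2", "n3"):
    net.add_node(n)
net.add_link("n1", "n2", spin=0.5)
net.add_link("n2", "n3", spin=1.0)
net.add_link("n1", "n3", spin=0.5)

print(net.valence("n2"))  # each node in this triangle has valence 2
```

The sketch only captures the classical graph data; the quantum aspect (superpositions of such graphs in a Hilbert space) is precisely what it leaves out.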
Prima facie, the naive ontology of LQG appears to be one of a quantum superposition of discrete elements (the links and nodes), and one might be tempted to argue that spacetime just is this quantum structure. However, at least four reasons can be provided for why this structure differs significantly from spacetime.
First, according to a number of interpretations of quantum mechanics, its ontology is metaphysically indeterminate (one popular exception being the many-worlds interpretation; see Glick and Le Bihan 2024). By being quantum, the fundamental LQG structure could thus be metaphysically indeterminate as well. Whether geometry needs to be well-determined to qualify as spacetime remains debated and might relate closely to the question of whether the world in general harbours metaphysical indeterminacy.
Second, the spatial status of the spin networks can be questioned because of disordered locality. Many models of LQG have adjacency relations between their elements that diverge from the adjacency relations existing between the corresponding elements in the general relativity description, taken to approximate the underlying LQG ontology. The well-defined ordering of events around us could thus turn out to be a statistical approximation, such that when zooming in on the deep fabric of spacetime, we would find anomalies such as adjacency relations that correspond to long spacetime intervals.
Third, there is a problem of frozen dynamics called the problem of time, because the time variable appears to be missing from the equations supposed to describe the evolution of spin networks. Thus, both time and change appear to be at best perspectival or relational, describing relations between specific sub-systems in the universe. But there no longer seems to be any strong sense of a physical system evolving with respect to the rest of the universe.
Fourth, not all spin network states of the underlying structure are expected to give rise to an effectively spatiotemporal geometry. Thus, at best, spacetime could be identical to a spacetime state or property of the underlying ontology, but not to the bearer of the state or property itself.
The second version of LQG is covariant LQG. It describes a four-dimensional extension of spin networks called “spinfoams.” Because of technical difficulties with the canonical approach, most efforts in the LQG community currently focus on developing the spinfoam framework. It exploits a path integral approach to dynamical evolution. A path integral formulation computes the evolution of a physical state from an initial state to a final state by weighting all the possible paths between the initial state and the final state. In LQG, the paths are identified with spinfoam trajectories, roughly understood as spin network “evolutions.”
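The sum-over-paths idea can be illustrated with a deliberately simple toy computation, again with no physical content: enumerate every route from an initial to a final state on a small directed graph and add up a weight for each route. The graph and weights below are invented for the example; a genuine path integral sums complex phases over continuous trajectories rather than real weights over a finite graph:

```python
# Toy "sum over paths": the total amplitude from start to goal is the
# sum, over all directed paths, of the product of per-step weights.

def amplitude(graph, start, goal):
    """Sum the weight-products of all paths from start to goal."""
    if start == goal:
        return 1.0
    total = 0.0
    for nxt, weight in graph.get(start, []):
        total += weight * amplitude(graph, nxt, goal)
    return total

# Two intermediate routes from 'i' to 'f' (a DAG, so recursion stops).
graph = {
    "i": [("a", 0.5), ("b", 0.5)],
    "a": [("f", 1.0)],
    "b": [("f", 1.0)],
}

print(amplitude(graph, "i", "f"))  # 0.5*1.0 + 0.5*1.0 = 1.0
```

In the spinfoam setting, the analogue of a "path" is an entire spinfoam interpolating between an initial and a final spin network state.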
The covariant approach performs well for local descriptions of spacetime regions limited to the astrophysical scale. This contrasts with the canonical approach, which can be deployed to produce simplified toy models of the entire cosmos, resulting in loop quantum cosmology, not unlike the Λ-CDM model, the standard model of cosmology, which exploits highly simplified models of general relativity to deliver cosmological models of the whole universe (see Bojowald 2011 in physics, and Huggett and Wüthrich 2018, Section 3 for a philosophy of physics perspective). Covariant LQG, by contrast, requires feeding the equations with a great deal of information about the boundary conditions, both at the beginning and at the end, but also at the spatial edges of the spacetime region being described, and thus operates better at the astrophysical than at the cosmological scale.
Among the four reasons to deny the fundamental status of spacetime in canonical LQG, three remain in covariant LQG. Metaphysical indeterminacy (again with the possible exception of the many-worlds interpretation) and disordered locality both remain. However, there does not seem to be a problem of time, since the dynamics is understood as the solution of a set of constraints between the initial and the final state (and not as the evolution of a state according to equations that do not feature a time parameter, as in the Hamiltonian formulation of the canonical approach). As for the fourth reason, namely that certain states of the underlying ontology do not embed a spacetime profile, that certainly continues to be the case with the covariant approach: not all states give rise to a spacetime geometry, and even those that do are quantum superpositions of more fine-grained states, some of which fail to be spacetime-like.
c. Causal Set Theory
Causal set theory (CST) aims to rebuild the seemingly continuous spacetime world from a discrete structure of elements and partial ordering relations between these elements (Bombelli et al., 1987; Rideout and Sorkin, 1999; Dowker, 2006, 2013; Major et al., 2009; Rideout and Wallden, 2009). Unlike string theory, which seeks to push the Standard Model further, and loop quantum gravity, which strives to generalize general relativity to the quantum realm, CST sets out to reconstruct familiar physics from scratch within a new paradigm. The structures of partially ordered elements, causal sets, are expected to collectively give rise to spacetime and its material content, as described, at least to a good approximation, by the general theory of relativity. The approach is premised on a theorem from general relativity (Malament, 1977), which states that the metric structure of a spacetime region can be derived from its causal structure, up to a conformal factor. This technical result has been taken to suggest that almost all the structure of a spacetime region could be built from scratch from the causal structure of said region. Causal set theory attempts this construction.
Causal sets are usually described as evolving through the operation of a dynamical law that adds elements one-by-one, connecting them to the pre-existing stage of the causal set (although this description might be misleading as will be explained below). The approach tries to develop a number of different dynamics in order to reproduce, up to some approximation, a class of models of general relativity consistent with the actual world.
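The growth description can be given a schematic, purely illustrative rendering (it is not one of the actual dynamics studied in CST, such as Rideout and Sorkin's classical sequential growth models): elements are added one at a time, each acquiring a randomly chosen subset of the existing elements as direct ancestors, with the causal past closed under transitivity so that the relation remains a partial order:

```python
import random

# Toy causal-set growth: the causal set is a dict mapping each element
# to the set of all its causal ancestors. Each new element picks a
# random subset of existing elements as direct ancestors; closing under
# transitivity keeps the order relation a genuine partial order.
# The 0.5 ancestor probability is an arbitrary choice for illustration.

def grow(n_elements, seed=0):
    rng = random.Random(seed)
    past = {}                      # element -> set of all its ancestors
    for e in range(n_elements):
        direct = {x for x in past if rng.random() < 0.5}
        full = set(direct)
        for x in direct:           # transitive closure: inherit pasts
            full |= past[x]
        past[e] = full
    return past

causet = grow(6)
# Acyclic by construction: no element is its own ancestor.
assert all(e not in ancestors for e, ancestors in causet.items())
# Transitive: the past of my ancestor is contained in my past.
assert all(causet[x] <= causet[e] for e in causet for x in causet[e])
print(len(causet))
```

Note that the integer labels impose a total "birth order" that the causal order itself need not reflect: two elements neither of which is in the other's past are causally unrelated, which is the formal counterpart of the merely partial ordering emphasised below.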
CST offers a clear framework for discussing the possible emergence of continuous structures from discrete ones. Whether causal set theory should be interpreted as a case of spacetime emergence is still debated, as two main philosophical interpretations of the ontology of CST are in competition. According to the growing block approach, CST suggests an interestingly novel and radical form of the growing block theory of time, which is standardly defined as the view that the past and the present exist, but the future does not. The flow of time is then identified with the coming into existence of new slices of existence at the border between the past-present block and the future. With CST, there is no longer a sharp differentiation between three regions of reality (the past, the present and the future), as the coming into existence of being takes the form of single elements (single building blocks of a local time) instead of three-dimensional hypersurfaces (global times). The growing block interpretation of CST has been defended by physicists (Dowker, 2006, 2014) and assessed by philosophers (Earman 2008; Wüthrich and Callender 2016).
The growing block approach to CST differs from the traditional growing block theory in a number of ways. First, the growth is only local, and there is often no definite fact of the matter regarding which element from a pair of elements comes into existence before the other. The growth of causal sets can be thought of as happening in various directions, a visual simplification to emphasise that the ordering between the elements is merely partial. This local growth is better visualised as a growing octopus, or n-pus, than as an expanding block (Le Bihan, 2020). Another implication of the model is a commitment to metaphysical indeterminacy of a new kind (Wüthrich and Callender, 2016). This metaphysical indeterminacy is novel in that it applies not only to the future but also to the past of the growing block structure. Indeed, due to a property of the dynamics called “discrete general covariance,” it is (seemingly metaphysically) indeterminate which past configuration of the set, among a number of distinct possible configurations, led to any particular configuration in the growth of the octopus. This might generate contradictions motivating an alternative interpretation of the ontology of CST.
According to a natural alternative ontological interpretation, the growth description is merely heuristic, and we should really think of the maximal set produced at the limit of the dynamics (intuitively, when the process of elements coming into existence is infinitely complete) as a perspicuous representation of reality (Huggett, 2014). This approach is thus in the spirit of the block universe interpretation of general relativity. The view still includes metaphysical indeterminacy but does not appear to generate problematic contradictions.
Other metaphysical issues discussed in relation to CST include: Is CST committed to a form of realism about causation (Wüthrich and Huggett 2020)? Could a fundamental ontology of causal relations ground or compose a derivative ontology of spacetime relations (Baron and Le Bihan, 2022a, 2024)?
d. Is Spacetime Non-Fundamental?
Is spacetime really non-fundamental according to most approaches to quantum gravity? Negative answers have been articulated or voiced by a number of scholars. They can be categorised in two classes: a priori and empirical.
The first category of objections to non-fundamental spacetime is a priori in that it is not grounded in the analysis of specific approaches to quantum gravity, but in a priori motivations that have nothing to do with the content of theoretical physics. Such objections can be made based on the claim that the non-fundamentality of spacetime stands in the way of our conception of concrete physical entities (Lam and Esfeld, 2013, p. 287), or that the concept of spacetime involves the property of fundamentality (Baker, 2021, Section 6). More generally, there is a long tradition in metaphysics of associating physicality with spatiotemporality, and the notion of a fundamental spacetime indeed plays a central role in many metaphysical views. Whether those views can be amended to account for a fundamentally non-spatiotemporal reality constitutes another direction of research in the metaphysics of quantum gravity (to which we turn in Section 6), and one can see the conservative pressure coming from analytic metaphysics to preserve the rock-bottom fundamental status of spacetime.
A second type of reason to doubt the non-fundamentality of spacetime is empirical (see, for example, Esfeld 2021). Two lines of reasoning in this direction are possible. First, it could be that the correct theory of quantum gravity will be one that does not question the fundamentality of spacetime. This could potentially happen with a number of approaches to quantum gravity, as working out their ontology remains a vast project. For example, although the received view of canonical quantum gravity is that time disappears in a problematic way (Huggett and Thébault, 2023), the claim remains disputed (Chua and Callender, 2021). Turning to another example, and as mentioned before, according to a certain version of causal set theory endorsed by Dowker (2014), spacetime fundamentally exists, although in a peculiar way. Another possible view in that direction is Bohmian quantum gravity, which would extend the Bohmian interpretation of non-relativistic quantum mechanics not only to quantum field theories but also to quantum gravity (Vassallo and Esfeld 2014). Second, it could be that one of the theories of quantum gravity generally considered to deny crucial features of spacetime actually turns out not to deny them and could be reinterpreted differently in the future.
3. Problems with the Non-Fundamentality of Spacetime
This section surveys a number of philosophical problems that emerge if spacetime does not exist fundamentally according to quantum gravity. Intuitively, that physical reality could fail to be fundamentally spatiotemporal appears troublesome: it clashes drastically with the way we usually conceive of the world as being fundamentally spatial and temporal, and with the scientific method, which seems to be based on collecting observations localised in space and time. Sections 3a to 3d review a number of problems resulting from different ways of making the nature of the clash precise and discuss what has or needs to be done to address them. The order in which the problems are presented reflects the extent to which their solution can be expected to follow from physics alone.
a. The Scientific Problem
The scientific problem is the problem of providing a theoretical derivation of spacetime physics from a non-spatiotemporal physics, namely a derivation of our best physics from a theory of quantum gravity. This should be done for two frameworks: general relativity and quantum physics. This problem is scientific in that it is an actual problem that quantum gravity physicists are facing. Indeed, absent new and independent empirical evidence, the most reliable guiding principle in the formulation of a theory of quantum gravity is the ability of the latter to derive our currently best, empirically-confirmed theoretical frameworks in physics—namely, general relativity and the Standard Model of particle physics based on a family of quantum field theories. Applied to spacetime, the problem amounts to the possibility of deriving, at least as a mathematical approximation and with bridge principles between the primitive notions of the two theories, the piece of apparatus that plays the spacetime role in general relativity and quantum field theories from a non-spatiotemporal theory of quantum gravity, along the lines of a heterogeneous Nagelian reduction (Nagel, 1979).
The problem has a different form depending on whether the focus is on general relativity or quantum physics. General relativity is a very successful theory in the low-energy regime of description that came with its share of conceptual revolutions—for example, with the possibility of intrinsically curved and expanding geometries. If spacetime is not fundamental, then how can this success be accounted for, and how should we rethink the conceptual revolutions mentioned above? A possible answer is that deriving general relativity as an approximation from the theory of quantum gravity would suffice to explain its predictive power—we should then look for conceptual revolutions in the new theory. The lessons learned from general relativity could then perish or survive the move to the new theory. The issue thus becomes the task of building one model (that is, solution) of general relativity consistent with the distribution of matter in the actual world from a solution to a theory of quantum gravity. This has been done to some degree, for example in string theory where the metric field—standardly regarded as representing spacetime in general relativity—can be described as a coherent state of an underlying ontology of strings and branes (Huggett and Vistarini, 2015).
When it comes to quantum physics, quantum gravity physicists do not focus on non-relativistic quantum mechanics, as is often the case in the metaphysics of quantum physics, but instead on quantum field theories, a class of which corresponds to the Standard Model of particle physics. The Standard Model describes fundamental particles and fields, such as electrons and quarks, groups them into families, and provides a catalogue of their possible interactions. The Standard Model incorporates the relativistic effects described by special relativity, yet it does not account for the gravitational aspects of general relativity. As of 2025, a comprehensive quantum field theory approach to gravity is missing. A low-energy approach to quantum gravity has been developed, but it cannot be extended to high-energy interactions (Wallace 2022). The Standard Model should be derived via approximation procedures from a more comprehensive theory of quantum gravity.
The scientific problem may take different guises, depending on the approach to quantum gravity under study. For instance, in the context of string theory, a number of quantum field theories have been derived from string theory solutions. However, none of those quantum field theories are the right ones, namely the ones involved in the Standard Model of particle physics. The number of solutions to string theory is incredibly large, making it, apparently, virtually impossible to get our hands on solutions corresponding to the Standard Model. This is the infamous landscape problem (Read and Le Bihan 2021).
Overall, the scientific problem is a problem for scientists, one that should be carefully distinguished from more philosophical issues introduced below.
b. The Problem of Empirical Coherence
The problem of empirical coherence for spacetime emergence arises when considering physical theories positing that spacetime does not exist fundamentally while simultaneously resting on empirical evidence that is manifestly localised in space and time. A similar problem was first formulated in the context of quantum mechanics by Barrett (1996), who pointed out the tension between an ontology of so-called local beables and realism about the wave function. It was then discussed by Healey (2002) for canonical quantum gravity, and systematically studied for a wide range of quantum gravity approaches by Huggett and Wüthrich (2013).
The formulation of the problem of empirical coherence usually employs the now standard concept of local beable introduced by Bell (1987). The beables of a theory are the things the theory postulates as being physically real. They are deemed “be-able” because they manifest as degrees of freedom, that is, determinable properties that can take on various determinate values. Beables are local if they have a location in space and time. Local beables are regarded as crucial for the possibility of observation, and hence for the empirical justification of theories in physics. Thus, the problem goes, a theory that would deny the fundamental existence of spacetime would appear to be empirically incoherent: the truth of the theory would erode the reasons that initially motivated endorsement of the theory.
Local beables already appear to be lacking in wave-function ontologies, prompting many questions on how to interpret the fundamental ontology of non-relativistic quantum mechanics (Albert, 1996; Ney, 2021b). According to one such interpretation, configuration space realism (also called wave function realism or wave function fundamentalism), the fundamental ontology of the theory is a distribution of quantitative properties, a physical counterpart of the mathematical wave function. Importantly, those properties are not localised within the ordinary three-dimensional space but in a modal space whose regions correspond to possible configurations of physical systems in the three-dimensional space—the so-called configuration space. This configuration space is a mathematical tool designed to facilitate calculation. Its dimensionality is three times the number of apparent particles in the physical system being described. Configuration space realism goes beyond regarding the mathematical space as a mere calculation convenience. It states that the mathematical configuration space reflects the existence of an actual, physical configuration space. Consequently, the fundamental arena of reality would be this configuration space, not the three-dimensional space.
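Schematically, for a system of N apparent particles, the wave function is defined not on ordinary three-dimensional space but on a 3N-dimensional configuration space:

```latex
\Psi : \mathbb{R}^{3N} \times \mathbb{R} \to \mathbb{C},
\qquad \Psi(q_1, \dots, q_{3N}, t)
```

A single point $(q_1, \dots, q_{3N})$ encodes the positions of all N particles at once; configuration space realism takes this high-dimensional space, rather than the familiar three-dimensional one, to be the fundamental arena.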
An especially difficult issue is then to understand the relation between this fundamental configuration space wherein the wave-function is defined, on the one hand, and the emergent, ordinary three-dimensional space, on the other hand. Configuration space realism faces the charge of being incoherent, since the ordinary space and its local beables are not part of the fundamental ontology of the theory (Maudlin, 2007). However, it has been noticed that local beables are not logically necessary for the possibility of observation (Ney, 2015). Intersubjective access to evidence localised in configuration space might turn out to be just as effective as the more intuitive access to three-dimensional objects localised in space and time.
An important difference between non-relativistic quantum mechanics and quantum gravity is the status of time. In non-relativistic quantum mechanics, time is regarded as a fixed external parameter. What is at stake for configuration space realism, then, is the possible emergence of the ordinary space from the more fundamental configuration space, both located inside a non-relativistic external time. Some proponents of configuration space realism in the context of quantum mechanics take time to be necessary for observation, as temporality is an explicit part of virtually all formal theories of empirical confirmation (Ney, 2015). The problem of empirical coherence thus appears to be much more difficult in the context of quantum gravity, when it is spacetime and not only space that comes under attack.
A somewhat natural thought is that the empirical coherence of a quantum gravity theory can be straightforwardly achieved by asserting that spacetime exists, yet not fundamentally, thus divorcing existence from fundamentality (Huggett and Wüthrich, 2013; Wüthrich, 2017). That is, we can establish the empirical coherence of theories of quantum gravity by formally deriving the general-relativistic spacetime from the more fundamental theory of quantum gravity. But such a formal derivation will not suffice to establish the reality of spacetime. Additionally, the mathematical derivation needs to be “physically salient”: it cannot be a mere mathematical curiosity. That spacetime exists in a non-fundamental way can thus be understood as the claim that it is physically salient, yet not fundamental. Spacetime would exist over and above the fundamental ontology of quantum gravity, for instance as a structure permeating the ontology of quantum gravity. Whether physical salience can be freed from fundamental spatiotemporality remains debated. Huggett and Wüthrich (2013) argue that it can; Maudlin (2007) argues that it cannot. The problem of empirical coherence thus intersects with the problem of whether spacetime really exists, and if so how exactly. Positing the existence of a non-fundamental spacetime could be key to solving the problem of empirical coherence. Other solutions might instead posit non-local beables, without requiring the existence of even a non-fundamental spacetime. More work remains to be done to review issues of empirical coherence related to the non-fundamentality of time beyond the non-fundamentality of space.
c. The Ontological Problem
The ontological problem is the related problem of the status of spacetime. Is spacetime real? If so, what does it mean that it is not fundamental? If not, how can we make sense of the world around us, which certainly seems to be spatial and temporal? Answers to this question belong to one of three strands: eliminativism, reductionism, and dualism. Eliminativism about spacetime claims that spacetime is not fundamental because it does not exist at all. Spacetime would be a sort of theoretical artefact, and even space and time might turn out to be akin to perceptual illusions. A second option, reductionism, is to maintain the existence of spacetime and to identify it with (parts of) the non-spatiotemporal structure. Finally, according to dualism, spacetime exists and is distinct from the fundamental structure. To spell out such a dualist account, one should specify how the non-spatiotemporal structure relates to the spatiotemporal one. See Section 5 for an overview of different candidate relations.
It is an intricate issue whether a specific solution to the problem of empirical coherence implies a certain answer to the question of the ontological status of spacetime. As shown below, this might be so for some, but not necessarily all proposed solutions to the problem of empirical coherence. Also, someone who denies that there is any problem of empirical coherence can agree that the question of the ontological status of spacetime should receive some answer.
d. The Conceptual Problem
The claim that space and time do not exist fundamentally might at first be met with scepticism. For how could it possibly be the case that the physical world is not spatial and not temporal? The claim raises concerns as it goes against the deep belief in the fundamentality of space and time. Does the non-fundamentality of spacetime present a conceptual problem in light of those beliefs? Unlike the scientific and empirical coherence issues, this conceptual problem calls into question the coherence and metaphysical plausibility of the view that spacetime could fail to be fundamental.
The problem can be elaborated more precisely in the following way. It relies on an experience of discrepancy between non-spatiotemporal and spatiotemporal concepts that cannot be fully addressed by gesturing at a formal reduction of a theory involving spatiotemporal concepts to another theory involving only non-spatiotemporal concepts. Or, at the very least, more needs to be said on how to relate primitive spatiotemporal concepts to primitive non-spatiotemporal concepts, beyond a simple analysis in terms of bridge principles relating them, as in heterogeneous Nagelian reductions (Nagel, 1979). One wants to know whether there is something so specific about spatiotemporal concepts that they could not possibly be explained away in terms of non-spatiotemporal concepts.
Whether the emergence of spacetime really poses a conceptual problem is controversial. It has been disputed to what extent an analogy with the hard conceptual problem of consciousness, which is supposed to illustrate what the conceptual problem of spacetime emergence consists in, can be carried through. The idea behind the analogy is that, just as there might be a hard problem for explaining the relations between physical and mental entities, one could ask whether there is something akin to spacetime qualia or spacetime qualities (those are not supposed to be mental in this context), analogous to qualia in the philosophy of mind (Le Bihan, 2021). Qualia in the philosophy of mind are potential “what is it like to be conscious” properties, especially difficult to reduce to purely physical entities. Likewise, one might wonder if there are “what it is to be spacetime” properties, especially difficult to reduce to purely non-spatiotemporal entities.
The existence of spacetime qualia has been denied by Knox (2014) and Lam and Wüthrich (2018). Le Bihan (2021) argues that the concept should at least be taken seriously, as the existence of a conceptual discrepancy associated with irreducible spacetime qualities might ground the intuition, shared by a number of scholars, that spacetime cannot possibly fail to be fundamental. The conceptual problem of spacetime would thus be a hard problem of spacetime, similar to the hard problem of consciousness. Conversely, realising that there is no hard problem could alleviate the worry that spacetime emergence is logically or physically incoherent, by showing that resistance to the logical or physical possibility of spacetime emergence originates in deceptive, pre-theoretical cognitive intuitions.
Another formulation of the hard problem can be expressed as the concern that it would be impossible to understand a non-spatiotemporal theory. However, by dissociating understanding from conceivability, it could be granted that we cannot imagine a world without spacetime while still being able to understand it in a more theoretical way. This would call for a conception of understanding that does not make visual imagination a prerequisite. Rather, for instance, understanding might require the ability to use the theory in certain ways (De Haro and De Regt, 2020).
4. What is Spacetime?
Investigating the status of spacetime in quantum gravity requires agreeing beforehand on the defining features of spacetime. What is this phenomenon or theoretical entity that is supposed to emerge from the non-spatiotemporal ontology? Spacetime is a generic term that can be associated with a number of more precise concepts. These concepts can be classified into two broad families. First, conceptions of theoretical spacetime are built on notions found in theoretical physics, especially in special and general relativity, as these are our standard theories of spacetime. Second, conceptions of phenomenal spacetime build on the phenomenology of spatial and temporal phenomena, rooted in our perceptual experience of the world. For instance, space, time, motion, repetitions, local beables (localised objects) and, more generally, any notion essentially tied, at least to some degree, to our concepts of space and time, altogether constitute this broad class of spatial and temporal phenomena.
This section surveys various conceptions of theoretical spacetime and phenomenal spacetime, and how they constitute reasonable targets for the recovery of spacetime in the context of quantum gravity.
a. Theoretical Spacetime
The most obvious concept of spacetime to be recovered from a non-spatiotemporal ontology is the one appearing in theoretical physics. However, an immediate challenge for this project is that there might be more than one concept of spacetime in theoretical physics. First, theoretical physics is not a monolithic block. It is made of a number of distinct theoretical frameworks, and spacetime is not conceptualised in the same way in all of these approaches. Second, even in general relativity, arguably our most solid and advanced theory of spacetime, there is no universal consensus on the nature of spacetime. Let us review the two issues in turn.
Spacetime seems to enjoy a special affiliation with special relativity and, by extension, general relativity. The first scientific concept of spacetime was put forward by Hermann Minkowski in 1908, providing a beautiful and compelling formulation of special relativity. Both Minkowski’s flat spacetime and the curved pseudo-Riemannian spacetime of general relativity appear to be prime candidates for spacetime recovery. And as the Minkowskian spacetime of special relativity appears to be a local approximation of the spacetime of general relativity when curvature is negligible, or can be neglected for various purposes, the pseudo-Riemannian concept of general relativity would seem to be the most suitable target for a definition of spacetime.
However, it seems at least logically possible to temper this demand for a special relation with relativistic physics and envision spacetime as a more autonomous notion, which, although born from special relativity, could feature in other, potentially non-relativistic, theories. For consider Newtonian physics. It may be reformulated and generalised using a four-dimensional ideology, resulting in the Newton-Cartan theory (Cartan, 1923). Thus, whether the only viable concept of theoretical spacetime is the one found in general relativity is a legitimate concern (Baron and Le Bihan, 2022c). However, there is no doubt that the concept of spacetime found in general relativity is of paramount importance for analysing the emergence of spacetime. Hence, setting aside other possible targets for the theoretical concept of spacetime, we now focus on the spacetime concepts from special and general relativity.
The geometric approach is the standard interpretation of special and general relativity. In fact, for many it is not even an “interpretation” of the theory; it is an essential feature of the theory itself. Consider first special relativity. Its geometric interpretation states that special-relativistic effects—including time dilation and length contraction—manifest the geometric structure of the four-dimensional Minkowski spacetime. This geometric structure exists in itself and is metaphysically independent of the rest of the world. This interpretation, pioneered by Minkowski, became the standard reading of special relativity, eventually gaining acceptance from Einstein himself despite his initial reservations. This spacetime structure is a four-dimensional manifold equipped with a metric field, describing how things can and cannot move when acted upon by other material systems. Importantly, the structure demarcates inertial from non-inertial motion. Both massless and massive bodies, when not acted upon by other bodies, follow straight lines (geodesics) of spacetime.
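The geometric structure at issue can be illustrated, in standard textbook fashion, by the Minkowski line element, which assigns a frame-invariant interval to any pair of neighbouring events:

```latex
ds^2 \;=\; -\,c^2\,dt^2 \;+\; dx^2 \;+\; dy^2 \;+\; dz^2
```

Time dilation and length contraction then appear as consequences of the invariance of $ds^2$ under Lorentz transformations, rather than as dynamical effects on rods and clocks considered in isolation.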
However, Einstein was not completely satisfied with the geometric approach to special relativity, pointing out that this spacetime acts upon matter but cannot be acted upon (Brown and Pooley, 2006). That goes against a deeply-wired principle of action/reaction typical of substances, that is, of real entities. Interestingly, the action/reaction principle comes back with general relativity. The geometric spacetime of general relativity also reacts to the presence of massive bodies. Indeed, massive bodies curve spacetime around them, notably explaining the presence of what we effectively perceive as a force of gravitation pulling things towards massive objects. Einstein’s initial reservations about the geometric approach thereby disappear when factoring in the dynamical back-reaction of matter on spacetime, in the context of general relativity. The geometric approach thus remains the standard view: spacetime is a structure existing on its own, partially responsible (together with the dynamical laws) for the motion of material systems. Ignoring the vivid debates about the status of the relation between the metric field and the manifold on the one hand, and between the metric field and matter fields on the other, this metric field constitutes the target of the recovery of spacetime when one subscribes to the geometric approach.
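The restored two-way action/reaction can be illustrated by the Einstein field equations, in which the matter content (right-hand side) determines the curvature of spacetime (left-hand side), while the curved metric in turn constrains how matter moves:

```latex
G_{\mu\nu} \;\equiv\; R_{\mu\nu} - \tfrac{1}{2}\,R\,g_{\mu\nu}
\;=\; \frac{8\pi G}{c^{4}}\, T_{\mu\nu}
```

Here $g_{\mu\nu}$ is the metric field standardly taken to represent spacetime, and $T_{\mu\nu}$ is the stress-energy tensor encoding the distribution of matter and energy.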
Consider now the competing dynamical approach, which was championed by Harvey Brown and developed in detail by the Oxford philosophy of physics group (Brown 2005, Brown and Pooley 2001, Brown and Pooley 2006, Read et al. 2018). It demotes the Minkowski spacetime from its fundamental status by analysing relativistic effects as properties of the dynamics of material bodies (more precisely, of the symmetries of the dynamical laws). The dynamical approach relocates the origin of special-relativistic effects from the ontological category of spacetime to the ontological category of laws. It is thereby better suited to special relativity than to general relativity. Indeed, if relativistic effects are the manifestations of symmetries of the dynamical laws, and not of spacetime, then there is a bit of a mystery as to why the symmetries of the metric field coincide with the symmetries of the matter fields. An immediate reply is that one could be a realist about the metric field without identifying it with the spacetime geometry. The metric field should thus be rethought not as a representation of an independent spacetime, but rather as another material field. Accordingly, Brown expresses sympathy for Rovelli’s view that the metric field is another material field, the world being composed of fields on top of fields (Brown, 2005, p. 159-160). Overall, the prospects for applying the dynamical approach to general relativity remain highly debated.
The question then arises as to whether the general relativistic concept of spacetime to be derived from the physics of quantum gravity should be that of the relationist-in-spirit dynamical approach or that of the substantivalist-in-spirit geometric approach. The dynamical approach (by already de-reifying spacetime to a great extent) might be easier to identify with an emerging structure. So, if the dynamical and geometric approaches turn out to be empirically equivalent, considering general relativity in the dynamical apparatus might be the right kind of re-conceptualisation to narrow the explanatory gap between the general theory of relativity and a non-spatiotemporal theory of quantum gravity. Indeed, it has been argued that it might be easier to relate a dynamical reading of spacetime to a non-spatiotemporal theory of gravity, since the very existence of spacetime (in technical parlance, the chronogeometricity of the metric field) turns out to be contingent, depending as it does on the actual coupling of the metric field with the matter fields (Le Bihan and Linnemann, 2019).
Since the geometric and dynamical approaches are regarded as interpretations of the formalism of general relativity, it is reasonable to expect the two approaches to be empirically equivalent. This justifies taking a step back and asking whether spacetime should not rather be understood in a more abstract way, by what it does. What is more, since the dynamical approach is more difficult to square with general relativity than with special relativity, it has been argued that the dynamical approach should culminate in a functionalist rewriting or adjustment of Brown’s original project (Knox, 2019).
Spacetime functionalism is a wide-ranging family of views that either attempts to understand the concept of spacetime in functionalist terms in relativistic physics (Knox, 2011, 2014, 2019), or attempts to analyse the relation of spacetime emergence in the context of quantum gravity (Lam and Wüthrich, 2018, 2021; Yates, 2021; Chalmers, 2021). We focus here on the functionalist concept of spacetime; the functionalist approach to the relation of emergence will be discussed in Section 5.
According to a broad definition of spacetime functionalism, spacetime is the theoretical concept that appears in general relativity (or possibly, as mentioned above, any other relevant spacetime theory in physics). For consider the Ramsey sentence for general relativity. This sentence defines spacetime in terms of all the relevant predicates it bears in the context of general relativity. Spacetime is the entity selected by the bound variable in the sentence, namely the entity that plays all the spacetime roles described by the Ramsey sentence. Hence the slogan that spacetime is as spacetime does. What this spacetime role or roles are, exactly, remains highly debated. According to a popular account by Knox (2019), spacetime is associated with inertial motion.
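Schematically, if the claims of general relativity involving spacetime are regimented as a single sentence $T(\mathrm{spacetime};\, o_1, \dots, o_n)$, where the $o_i$ are the theory's other terms, Ramsification replaces the spacetime term with an existentially bound variable:

```latex
T(\mathrm{spacetime};\, o_1, \dots, o_n)
\;\longrightarrow\;
\exists x \; T(x;\, o_1, \dots, o_n)
```

Spacetime is then whatever entity realises the variable $x$, that is, whatever plays the spacetime roles encoded in $T$: a schematic rendering of the slogan that spacetime is as spacetime does.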
Two other views, similar to but distinct from spacetime functionalism, can be articulated. One is spacetime operationalism (see, for example, Le Bihan and Linnemann 2019; Menon 2021). It states that spacetime is the entity recorded by rods and clocks, concrete tools used for probing the structure of spacetime. It bears similarity to spacetime functionalism, sharing the slogan that spacetime is as spacetime does (in this case, what it does to probes made of matter fields, namely the rods and clocks). However, there is an important difference between spacetime functionalism and spacetime operationalism. While the former identifies the spatiotemporal roles within the physical theory, the latter associates them with experimental practice, as the structure that explains the nature of the data collected (the movement and direction of the rods, the durations measured by the clocks). In light of all the (too) many conceptions of spacetime, another option is spacetime quietism (Baron and Le Bihan, 2022c). Given the plurality of views on the proper theoretical conception of spacetime, future agreement among the participants in the discussion indeed appears unlikely. Spacetime quietism is the view that it is not necessary to agree on the theoretical nature of spacetime to make progress with the problems of spacetime, motivating a shift towards more phenomenal concepts of space and time. Before moving to the phenomenal notions, it should be noted, however, that the scientific problem (Section 3a) makes it necessary to attribute a special status to the derivation of general relativity from a theory of quantum gravity, since this derivation is one of the very ingredients in the development of any such theory. It will thus be necessary to derive at least one particular conception of theoretical spacetime consistent with general relativity to address the scientific problem.
In the next section, we turn to phenomenal conceptions of space and time as an alternative potential target of metaphysical recovery.
b. Phenomenal Space and Time
The shift to the way things appear to us might justify abandoning the notion of theoretical spacetime in favour of the two distinct notions of space and time, or finding a way to combine the two. One way to ascribe an important function to both notions can be found, for instance, in Chalmers (2018), who argues that spacetime can be functionally individuated by its role in triggering phenomenal space and time. But one could alternatively insist that only phenomenal space and time exist, unlike the more theoretical notion of spacetime. Indeed, it can be argued that in the way things appear to us, space and time are not primarily intertwined in a spatiotemporal unity. In fact, the notions of phenomenal space and time may themselves prove too coarse. More refined notions associated with phenomenal space could be notions of local and non-local beables, localised observations, spatial localisation, and so forth. Similarly, more refined notions associated with phenomenal time could be notions of local change, series of experiments, repetition, duration, statistical data, and so forth. The retreat from theoretical spacetime could thus be more or less profound, depending on whether one wishes to preserve monolithic notions of phenomenal space and time beyond the diversity of spatial and temporal features of the manifest world.
This retreat might be more or less appealing depending on one’s allegiance to the primacy of the external world over phenomenological content, or the other way around. This debate revives to some degree the one that once took place in the Vienna Circle between Neurath (1931) on the one hand, and Schlick (1934) on the other.
According to Neurath’s physicalism, observational statements derive their truth from physical states in the world. They are therefore based on the existence of intersubjective invariants that transcend the private sphere of each individual’s experiences. These invariants take the form of objects located in space and time and instantiating properties. Observational statements are therefore fallible, but objective, by positing the existence of a mind-independent grid enabling the coordination of cognitive experiences and guaranteeing the intersubjective validity of observations made by different observers at different locations in spacetime. If this mind-independent spacetime can be characterised by empirical science—as we are entitled to assume, given the immense success of general relativity—then it is none other than the theoretical spacetime discussed in the previous section.
For Schlick’s psychologism, on the contrary, observational statements derive their truth from mental states. They have the form “here, now, this and that”, but these spatial and temporal notions are linked to the way things appear to us (and therefore cannot be questioned), and not to an external objective, mind-independent spatiotemporal arena of reality. It is the private experiences of individuals producing observational statements that provide the infallible, subjective justification for scientific knowledge. Infallible as it may be, this sort of justification at least leaves open the question of whether, beyond the phenomenal notions of space and time found in ordinary life and scientific practice, there exists a spatiotemporal structure.
Schlick was naturally criticised for opening a Pandora’s box, the subjective tenor of his approach seemingly leading to an unpleasant form of solipsism. This difficulty arises just as much in the case of the emergence of spacetime: for, if there is no spacetime but only phenomenal notions of space and time, how can we ever salvage the intersubjective validity of science, the fact that different observers can compare notes taken from different standpoints and collectively assemble an ontology of the world? One promising answer is to recognize the existence of a fundamental ontological grid which, although not spatiotemporal for various reasons yet to be made explicit, nevertheless makes it possible to coordinate the experiences of observers. One such approach is found in Baron and Le Bihan’s (2024) causal theory of spacetime, on which spacetime emerges from a more fundamental causal network. The fundamental ontology of causal relations, even though not spatiotemporal, could thus act as the coordination grid allowing intersubjective agreement between observers.
We can thus see an at least partial resemblance between Neurath and Schlick’s debate over how best to conceive the epistemological foundations of the empirical sciences and the recent discussions on whether space and time belong to the external world in the guise of theoretical spacetime, or instead to the fundamental conceptual categories sentient beings project onto the world in order to experience it and interact with it.
The distinction between theoretical spacetime and phenomenal space and time opens up a new line of thought: if no theoretical notion of spacetime is to be found in contemporary physics, then perhaps we should bid adieu to the concept. Perhaps the lesson to be gained from the emergence of spacetime is that spacetime does not exist, and that the only useful concepts for understanding the nature of reality are the notions of phenomenal space and time, associated with the way sentient beings experience the world. This echoes phenomenological approaches in a broad sense that we can trace back, for instance, to Immanuel Kant’s transcendental philosophy, which envisions space and time, along with many other fundamental categories of sentient experience, as a priori categories necessary to shape our sensory experience.
Moving from theoretical spacetime to phenomenal space and time will have a number of implications for the problems of spacetime emergence. Consider first the ontological problem. Space and time phenomenalism appears to lead to spacetime eliminativism, the view that spacetime does not exist (Ismael, 2021; Baron, 2023; Miller, 2024). If there is no theoretical spacetime, and “spacetime” names at most the conjunction of space and time, then an obvious terminological choice is to say that spacetime is not fundamental because it does not exist. What about space and time? The elimination of spacetime that follows from space and time phenomenalism opens up two theoretical options. Either space and time do not exist (space and time eliminativism), or they do exist (space and time realism). This might end up being a purely conventional choice depending on what one takes the defining features of the concepts of space and time to be (Le Bihan, 2015).
5. Bridging the Gap
Section 3 has presented a variety of problems that the gap, or discrepancy, between the fundamental and the spatiotemporal levels engenders. Solutions to these problems attempt to bridge this gap. As such, they mostly take the form of philosophical articulations of the emergence relation that is supposed to connect the non-spatiotemporal ontology to the spatiotemporal ontology. “Emergence” is here intended as an umbrella term, or placeholder, that can be filled in by the relations we consider in more detail: primitive emergence, functional realisation, grounding, and mereological composition. This is not to deny that “emergence” could also denote a specific, primitive relation from the philosopher’s toolbox, or that analyses of the notion of emergence — such as the distinction between a weak and a strong form of emergence (as in Wilson 2021b) — could prove fruitful in application to the spacetime case. These issues are mostly open for future research.
In the final part of this section, we examine a different way of trying to resolve the problems surrounding the non-fundamentality of spacetime, which consists in denying that spacetime exists at all. On this conception, there is thus no gap to be bridged.
a. Functional Realisation
We have already encountered functionalism in Section 4, as one option for specifying what needs to be recovered to recover spacetime. But spacetime functionalism can also serve as an analysis of the emergence relation.
Spacetime functionalism in quantum gravity is inspired by functionalist projects from other areas—notably, mental states in the philosophy of mind, and space in non-relativistic quantum mechanics. It differs from these projects in a number of ways.
First, in the philosophy of mind, functional realisation is often understood causally: the functions with which the emergent entity becomes identified are spelled out in terms of this entity’s causal interactions with other things. If spacetime is not fundamental, then the status of causation is equally questionable. In particular, it is questionable whether what is present at the fundamental level could stand in causal relations. For spacetime functionalism to succeed, then, the notion of functional realisation must be broad enough that the realisation need not be causal.
Another distinguishing feature of spacetime functionalism in quantum gravity concerns the epistemic status of the entities related by functional realisation. Standardly, the realised entities are the ones that are conceptually problematic. This is reversed in the case of spacetime emergence in quantum gravity: a successful functional realisation of spacetime is supposed to help us understand the possible emergence of spacetime from a puzzling non-spatiotemporal ontology (Huggett and Wüthrich, 2020).
Which problems from Section 3, then, does functionalism address? Spacetime functionalism was specifically designed as a solution to the problem of empirical coherence (Huggett and Wüthrich, 2013). However, proponents of a deflationary take on the problem of empirical coherence have denied that the resources of functionalism are needed to address the problem (Linnemann, 2020). As for the ontological problem of spacetime emergence, Lam and Wüthrich maintain, on the one hand, that functionalism amounts “to the denial that there is a ‘hard problem’ beyond the ‘easy problem’ of the emergence of spacetime” (Lam and Wüthrich, 2018, p. 44), and, on the other hand, that functionalism is orthogonal to the ontological question (Lam and Wüthrich, 2018, p. 40). Distinguishing between different sorts of functionalism, Le Bihan (2021) argues that there is a tension in this pair of claims. Indeed, functionalism comes in a number of versions with different answers to the ontological problem. For instance, if there is no spacetime, then there is no hard problem. This amounts to dissolving the hard problem by endorsing a particular solution to the ontological problem, based on a particular sort of functionalism, namely eliminativist functionalism. Thus, denying that there is a hard problem because there is no ontological problem might rely on a specific approach to functionalism which already presupposes a particular answer to the ontological problem.
In brief, the functionalist machinery might not be that independent from the hard and ontological problems. Introducing terminology familiar from the philosophy of mind, different sorts of functionalism can be distinguished along two parameters: the first distinguishes role from realiser functionalism; the second, ontic from linguistic functionalism. The various sorts of ontic functionalism—realiser functionalism, role functionalism, and eliminativist functionalism—more or less implicitly entail an answer to the ontological question. According to realiser functionalism, spacetime is identical to what fulfils the spacetime role at the quantum gravity level. Role functionalism entails a dualist view on which spacetime is derivative. And according to eliminativist functionalism, there is no spacetime at all, but only linguistic roles that we wrongly reify beyond the language. Linguistic functionalism, on the other hand, is a thesis only about the meanings of certain concepts—namely, that their meaning and reference should be functionally analysed. It thus remains ontologically neutral.
A functionalist solution to the problem of empirical coherence can thus be orthogonal to, that is, independent of, a solution to the ontological problem. Butterfield and Gomes (2020) argue that the right way to understand spacetime functionalism is as a species of reduction (and hence not as neutral with respect to the ontological question). They take this to be the lesson from Lewis (1972), who argues that if one accepts that two entities fulfil the same role, then one is committed, by logic and meaning alone—that is, without needing to posit any additional bridge laws—to their identity. Pace Butterfield and Gomes, Knox and Wallace (2023) present an argument against reductive functionalism in the spacetime context. The argument points out that functional identifications in physics typically rely heavily on approximation procedures. However, the argument merely targets versions of reductive functionalism relying on strict identity. As both Lewis (1972) and Butterfield and Gomes (2020) acknowledge, reductive functionalism needs to, and can, accommodate approximations.
b. Grounding
The notion of grounding was developed to capture metaphysical relations of non-causal dependence possibly involved in non-causal explanations. Typically cited examples of such dependence relations include: the relation between a set and its members; the relation between a conjunction and its conjuncts; or the relation between the fact that a flower is coloured and the fact that the same flower is red. In all these cases, the first relatum can be described as grounded in, and non-causally explained by, the second. Such non-causal explanations appeal to metaphysical principles, such as that colours are determinables which must have determinate instances, or the logical structure of the conjunction.
How promising are grounding-based approaches to spacetime emergence? Wilson (2021a) provides a modal argument against a grounding-based approach to the emergence of spacetime. (The argument is formulated in terms of constitution rather than grounding, a difference we neglect for ease of exposition.) According to Wilson, the modal status commonly ascribed to grounding claims (as necessary) and the modal status commonly ascribed to the existence of spacetime (as contingent) are incompatible with a grounding account of spacetime emergence. More precisely, working with the example of loop quantum gravity, the following four claims cannot be true together:
(1) Spacetime is grounded in a superposition of spinfoams.
(2) The grounding of spacetime is metaphysically non-contingent.
(3) Newtonian spacetime is metaphysically possible.
(4) Newtonian spacetime is not grounded in a superposition of spinfoams. (Wilson, 2021a, p. 189; adapted terminology)
Proponents of a grounding approach to spacetime emergence will have to reject (2), (3) or (4), none of which is a palatable option according to Wilson.
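The joint inconsistency of (1)–(4) can be made vivid with a minimal possible-worlds sketch. The following toy encoding is ours, not Wilson’s own formalization: it represents only what grounds spacetime at the actual world and at a possible Newtonian world, with illustrative placeholder names for the candidate grounds, and checks by brute force that no assignment satisfies all four claims.

```python
from itertools import product

# Candidate grounds for spacetime at a world; names are illustrative
# placeholders (None: spacetime is ungrounded/fundamental at that world).
grounds = ["spinfoams", "something_else", None]

satisfying = []
for g_actual, g_newton in product(grounds, repeat=2):
    c1 = g_actual == "spinfoams"   # (1) actually, spacetime is grounded in spinfoams
    c2 = g_newton == g_actual      # (2) non-contingency: the same grounding fact
                                   #     holds at every world containing spacetime
    c3 = True                      # (3) the Newtonian spacetime world is possible,
                                   #     i.e., included in the model
    c4 = g_newton != "spinfoams"   # (4) Newtonian spacetime is not grounded in spinfoams
    if c1 and c2 and c3 and c4:
        satisfying.append((g_actual, g_newton))

print(satisfying)  # [] — no assignment makes (1)-(4) jointly true
```

The check makes the structure of the argument explicit: (1) and (2) together force the Newtonian world’s spacetime to be grounded in spinfoams, directly contradicting (4); dropping any one of (2), (3) or (4) restores consistency.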
Let us now examine some consequences of the non-fundamentality of spacetime for the very understanding of the notion of grounding itself. Grounding is often characterised as a metaphysical analogue of causation, or more rarely even as a kind of metaphysical causation (Schaffer 2016, Wilson 2018; for a dissenting view, see Bernstein 2016). Typically, a criterion for distinguishing the two notions makes reference to time: causation happens over time, whereas grounding is synchronic (if what stands in the grounding relation is temporal at all). While this simple temporal criterion arguably needs some refinement irrespective of quantum gravity (Baron et al., 2020), no version of the temporal criterion can apply at the quantum gravity level if time is not present there. One can draw one of the three following consequences. It could be that: (i) there is no causation or no grounding at the quantum gravity level; (ii) causation and grounding are indistinguishable at the quantum gravity level; or (iii) some non-temporal criterion distinguishes causation from grounding at the quantum gravity level. Which of these consequences is drawn affects the outlook of a grounding-based account of spacetime emergence. Wilson (2021a) opts for the third option, distinguishing grounding from causation by the kind of law by which each is governed. On this account, causal relations are those governed by laws of nature, while grounding relations are governed by constitutive principles—that is, by principles that tell us what it is to be a certain kind of thing.
c. Mereological Composition
Objects we encounter in daily life, such as chairs or tables, do not figure in theoretical physics. But we have a relatively straightforward explanation for how chairs and tables emerge from the entities posited by theoretical physics: they are mereologically composed from these entities, whatever these turn out to be (e.g., particles or quantum fields). Of course, there is still a bit of mystery, at least according to many, about how properties of a whole can emerge from parts that lack such properties, but those kinds of potential explanatory gaps are ubiquitous. The mereological approach to spacetime emergence suggests using the same compositional approach to explain the emergence of spacetime. Spacetime would emerge from more fundamental ingredients roughly as chairs and tables emerge from more fundamental entities (Le Bihan, 2018a,b). This means that spacetime would be composed of non-spatiotemporal parts. In what follows, we focus on approaches that try to give a mereological account of spacetime regions (rather than, e.g., distance relations).
The comparison between the composition of ordinary objects and the emergence of spacetime faces the following difficulty. Parthood is typically associated with a number of formal properties—for example, it is typically assumed to be a partial order and to obey certain decomposition principles. Although virtually all such properties have been confronted with putative counterexamples, there is widespread agreement about certain core characteristics of the parthood relation. If the relation at work in the supposedly mereological composition of spacetime departs too much from these characteristics, then it becomes questionable whether this relation is really the same as, or at least of the same family as, the one familiar from the composition of chairs and tables.
One such characteristic typically attributed to parthood that might be missing in mereological models of spacetime emergence concerns the linkage between parthood and location (Baron, 2020). Chairs and their parts are located in spacetime, and the relation between them seems to be mirrored by the relation between their respective locations: just as the chair back is part of the chair, the region of spacetime filled by the chair back is a subregion of the region filled by the chair. Such intuitions have been captured more rigorously by a number of so-called harmony principles, one of which is the following:
x is a part of y iff x’s location is a subregion of y’s location. (Saucedo, 2011, p. 227)
Whether principles such as the above can be maintained in mereological approaches to spacetime emergence depends on a number of choices that need to be made in spelling out such a mereological approach and how it interacts with a theory of location. For example, it needs to be specified how subregionhood relates to parthood (a popular option is to define subregionhood as parthood between regions); whether locations are themselves located somewhere (if so, then most plausibly they are located at themselves); and one needs to decide whether entities at the non-spatiotemporal level can still be attributed a location, albeit a non-spatiotemporal one (Le Bihan, 2018a). In the case of causal set theory, for example, one could make sense of non-spatiotemporal location in terms of location within the causal set structure.
To illustrate the point, let us have a look at a simple toy model in which the harmony principle stated above does fail (see Figure 1). In this toy model, there are just two objects (o1 and o2) at the fundamental non-spatiotemporal level, each having a non-spatiotemporal location (l1 and l2, respectively)—so, we assume that there is a meaningful notion of non-spatiotemporal location available. The two non-spatiotemporal objects o1 and o2 compose the only entity existing at the spatiotemporal level, region r. We let parthood be reflexive, so everything is a part of itself. We further stipulate that a location is a subregion of another just in case it is a part of it, and that every (spatiotemporal or non-spatiotemporal) location is located at itself. Then the harmony principle is violated since o1 is part of r, but o1’s location (l1) is not a subregion of r’s location (which is just r itself). That l1 is not a subregion of r is because we did not assume l1 to be a part of r. Indeed, it seems unclear how we could say that non-spatiotemporal locations could be part of spatiotemporal ones, as they are not located in a common spatial framework.
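The toy model can be spelled out in a few lines. The encoding below is our own illustrative sketch following the stipulations just given (reflexive parthood, subregionhood as parthood, self-located locations); the entity names match those in the text.

```python
# Toy model: two fundamental non-spatiotemporal objects o1 and o2, their
# non-spatiotemporal locations l1 and l2, and the single spatiotemporal
# region r that o1 and o2 compose.
entities = {"o1", "o2", "r", "l1", "l2"}

# Parthood: reflexive, plus the two stipulated composition facts.
parthood = {(x, x) for x in entities} | {("o1", "r"), ("o2", "r")}

# Exact locations: o1 at l1, o2 at l2; r and each location at themselves.
location = {"o1": "l1", "o2": "l2", "r": "r", "l1": "l1", "l2": "l2"}

def subregion(x, y):
    """Stipulation: x is a subregion of y iff x is a part of y."""
    return (x, y) in parthood

def harmony(x, y):
    """Harmony: x is part of y iff x's location is a subregion of y's location."""
    return ((x, y) in parthood) == subregion(location[x], location[y])

# The violating instance from the text: o1 is part of r, but o1's location
# l1 is not a subregion of r's location (which is r itself).
print(harmony("o1", "r"))  # False - the harmony principle fails here
```

The violation falls directly out of the stipulations: since l1 was never made a part of r, the left-hand side of the biconditional holds for the pair (o1, r) while the right-hand side fails.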
Figure 1: A mereological model of spacetime emergence violating harmony principles. Thick lines represent parthood (going upwards).
Harmony principles could either be used as guiding principles for the development of mereological approaches to spacetime emergence, or one could endorse a mereological approach violating these principles. The latter strategy could be justified by urging that novel insights from scientific enquiry might trump intuitions about harmony principles (Le Bihan 2018a and Baron and Le Bihan 2022a). So, advances in fundamental physics can call into question location principles and principles about mereological composition. Discussions over locality based on common sense intuitions are then dismissed as irrelevant for discussions over the composition or decomposition of spacetime.
A different challenge for a mereological approach to spacetime emergence might lie in the use working physicists actually make of decomposition. Physicists use decomposition techniques in a highly pragmatic way, and such techniques are arguably not suitable for disclosing a hierarchical structure of reality (Healey, 2013). For example, how physicists decompose light (into particles, electromagnetic waves, or a quantum mixture of states of electromagnetic fields) might depend on the intended application of the decomposition—these decompositions are not necessarily supposed to reveal the fundamental mereological structure of light. Furthermore, superposition and mixture, the composition relations invoked in quantum physics, seem to have formal properties other than those of parthood. On this view, then, the viability of the mereological approach becomes a question of usefulness: what matters is whether such a decomposition is useful for working physicists, not whether it solves the philosophical problems from Section 3.
d. Eliminativism
Spacetime eliminativism rejects the assumption that spacetime really emerges from the non-spatiotemporal fundamental structure: spacetime, on this view, simply does not exist (Baron, 2023). The view has been defended by Miller (2024) under the name of spacetime projectivism: spatiotemporal properties would be projected onto a world which lacks such properties. The problem of empirical coherence seems especially thorny for this approach. If there is no spacetime at all, not even derivative, then how are we to make sense of evidence seemingly localised in spacetime?
To solve the problem of empirical coherence, it must be shown how a non-spatiotemporal theory could be observationally justified. A spacetime eliminativist thus needs to dissociate observation from spacetime (Baron and Le Bihan, 2022c). Spacetime eliminativism comes in different versions, depending on the sort of entities that are invoked to replace spacetime in order to account for the problem of empirical coherence. According to a first version, the local beables of experimental physics still exist, but not in a way which also requires spacetime to be real (Baron, 2023). According to the second, more radical version, even local beables turn out not to exist.
The moderate version faces two difficulties. Firstly, it is not immediately clear how to understand the notion of local beables without reference to spacetime, or space and time. Questions that need to be addressed are as follows: in which sense is a local beable local, if it is not in a spacetime sense? How can we run statistical analyses of runs of experiments involving local beables, if there is no time to organise the data?
Secondly, and more importantly, dissociating the notion of local beables from the notion of spacetime might only shift the problem from spacetime emergence to the emergence of local beables, which now has to be accounted for independently. The attractiveness of spacetime eliminativism seems to depend on whether this new problem turns out to be easier to solve, or less salient than the problem we started out with.
The second version of spacetime eliminativism is more radical in dispensing even with local beables. To solve the problem of empirical coherence, a defender of this version of eliminativism can argue that what needs to be recovered, strictly speaking, is not physical space, but the spatiotemporality of human perception (Ismael, 2021). And it seems at least possible that this phenomenology does not transparently describe the physical world as it is, which might in fact have a non-spatiotemporal physical structure. Moving to such a phenomenal or phenomenological approach, one can thus maintain a form of realism about physics without realism about spacetime (see Section 4b). More work is needed to assess whether and how it could be possible to articulate a non-spatiotemporal account of the physical processes involved in (apparently spatiotemporal) human perception.
6. Implications
We have mentioned on several occasions that the spatiotemporality of reality plays a crucial role in many philosophical outlooks, and that denying spacetime a fundamental status will thus have important implications for a broad range of philosophical questions. We have already come across some potential candidates; this section presents further such consequences in a bit more detail.
a. Philosophy of Time
What is the fate of classical debates in the philosophy of time in light of quantum gravity? This will of course depend greatly on the approach to quantum gravity investigated (for a survey, see Huggett et al. 2013). Consider for instance the dispute between A- and B-theorists over whether time passes, or the one between presentists (only the present exists), growing-block theorists (only the entities we regard as past and present exist) and eternalists (entities categorised as past, present or future equally exist) over the domain of existence in time, or again the debate between relationalism and substantivalism as to whether spacetime should be conceived as a relational structure between material entities or as a substance with an existence of its own. Arguably, considerations from quantum gravity will have major repercussions on these views.
Let us focus on presentism and the objective foliation it requires, setting aside for now the possible emergence of spacetime, to ask the following question: could we find one unique objective foliation of spacetime in quantum gravity? The predominant view appears to be that we could not: quantum gravity should not resuscitate a non-relativistic world by imposing an objective, unique foliation onto the fundamental ontology (Callender 2000, Belot and Earman 2001, p. 241). However, it has been argued that, on the contrary, quantum gravity could provide a hospitable home to such a foliation, and hence to presentism (Monton, 2006). Although this is certainly a logical possibility, the proposal encounters a number of issues. Among a number of technical objections, the most devastating one, raised by Wüthrich (2010, 2013), is that even if it turned out that there was a genuine single foliation of the fundamental structure, there would be no reason to expect that our presentist intuitions could hook onto it. The situation is very similar to the now-now objection against the growing block theory (Braddon-Mitchell, 2004): if the present is really the edge of the past-present block, how do you know that your present, from your own perspective, corresponds to the objective boundary of being, to the real objective present, and that you are not lost in the past of the block?
Now, if the fundamental structure is genuinely non-spatiotemporal, then the situation appears even grimmer for the presentist (and the growing block theorist). Since these views require the existence and fundamentality of time, spacetime emergence supports either standard eternalism or a new form of eternalism, atemporal eternalism, which states that all proper parts of the natural world co-exist simpliciter, even though the natural world is not temporal (Le Bihan, 2020).
It has also been argued that some cosmological models based on quantum gravity might suggest not that there is no time, but on the contrary that we need two times (Wüthrich, 2022), a claim also found in one particular approach to string theory, namely F-theory (Le Bihan, 2023; Cinti and Sanchioni, 2023). Whether the denial of the uniqueness of time, and thus of the existence of a single fundamental time, is regarded as a genuine expression of the non-fundamentality of time is, of course, a matter of convention. But it could have important repercussions on debates in the metaphysics of time concerning the plausibility of the hypertime hypothesis, especially since the view has been described as “just insane” (Skow, 2015, p. 47). This is the view that reality could encompass a second-order time allowing for the possibility of variations of first-order time with respect to second-order time, and thus for a veritable flow of time, the first-order present “moving” with respect to second-order time (Smith, 2011). If hypertime were to gain justification from quantum gravity, it might thereby offer a route to a certain class of dynamical A-theories, contrary to what are generally considered to be the lessons of quantum gravity for our understanding of time.
Furthermore, it is interesting to note that a large part of the argument in metaphysics against this hypertime hypothesis builds on the belief that the two times must share a similar structure, an assumption that is questionable at best, as demonstrated by Baron and Lin (2022). Arguably, the approaches from quantum gravity and cosmology underwriting a two-times approach could provide a concrete blueprint for evaluating the discussion in more detail. Virtually all the work remains to be done to connect the philosophy of quantum gravity to the metaphysics of hypertime literature.
Another debate in the philosophy of time concerns the possibility of time travel and closed timelike curves. A certain category of time travel seems to be possible according to general relativity, as it allows for closed timelike curves, that is, closed spacetime trajectories that would permit a forward time traveller to return to their past (Earman et al., 2009). One might wonder whether this result is expected to carry over to the prospective theory of quantum gravity. At this stage, there is no clear answer to this question, as shown by Wüthrich (2021). But one can already articulate possibilities and debate whether closed timelike curves could survive the absence of closed curves in the fundamental ontology. Interestingly, according to a certain metaphysical interpretation of a speculative cosmological model based on quantum gravity ideas and developed by Penrose, closed timelike curves might turn out to be the rule, and not the exception, within spacetime. His conformal cyclic cosmology could indeed be teaching us that the world is a gigantic cosmic loop, the whole universe being closed on itself in all timelike directions that do not terminate in black holes (Le Bihan, 2024).
b. Modality, Laws of Nature, Causation
Our next stop is modality, laws of nature, and causation. Accounts of these three notions can come in package deals, of which David Lewis’s is a particularly influential one (Lewis, 1986). Lewis gives reductive accounts of causation and laws of nature, and crucial to these reductions is his modal realism: the view that all ways the world could be exist concretely as possible worlds. To individuate the possible worlds within modal space, some kind of “world-making relation” is needed, and Lewis identifies spatiotemporal relations as these world-making relations. This will not do if, as quantum gravity suggests, spacetime is not fundamental. As Wüthrich (2019) argues, if a quantum gravity programme such as causal set theory turns out to be true of our world, it will be doubtful whether we can find any relation holding at the fundamental level that can fulfil the role of the world-making relation. Naturally, if it proved impossible to find a non-spatiotemporal world-making relation, Lewis’s theory of modality and the conceptions of the laws of nature and causality that it underpins would fail in unison. One option could be to use entanglement relations, rather than spatiotemporal relations, as building relations (Jaksland, 2021; Ney, 2021a; Cinti et al., 2022; Cinti and Sanchioni, 2021).
However, the problem is by no means unique to Lewis’s account of laws of nature. As Lam and Wüthrich (2023) demonstrate, most of the popular accounts of law have bad prospects of surviving the shift to non-spatiotemporal fundamental physics. The minimal primitivist account developed by Chen and Goldstein (2022) might be an exception, as it aims to give an explanation of how laws govern that does not necessitate a dynamical evolution from earlier states to later states.
In the case of causation, the situation is slightly different: we do not necessarily need an account of causation that applies to non-spatiotemporal settings. This is because, contrary to the case of laws of nature which should arguably be present at the quantum gravity level, it seems a viable option that causation emerges together with spacetime, and many take causation not to be a part of physics anyway. Accounts of causation that presuppose spacetime are thus not necessarily ruled out, if spacetime is not fundamental, but will plausibly relegate causation to an equally non-fundamental status.
However, one could ask the further question as to whether causation could be a fundamental feature of reality even if spacetime is not. On the one hand, time seems essential to differentiating causes from effects, since—disregarding the possibility of backwards causation—causes precede their effects. On the other hand, not everyone agrees that spacetime is essential to, or more fundamental than, causation. First, interventionist accounts of causation seem in principle applicable to non-spatiotemporal settings (Baron et al., 2010; Baron and Miller, 2014). Second, taking causation as more fundamental than time, and reducing spatiotemporal relations to causal relations, has some philosophical precedent that could be brought to bear on quantum gravity settings: such a view was already defended by Leibniz and Kant, and in the 20th century causal theories of spacetime were prominently advocated by Reichenbach (1956), Grünbaum (1973), and van Fraassen (1970) in the context of relativistic physics, before being abandoned in the late seventies and put back on the philosophical agenda recently in the context of quantum gravity (Baron and Le Bihan, 2024).
c. Other Topics
Further metaphysical positions incompatible with the non-fundamentality of spacetime might include Armstrong’s definition of naturalism, at least under a certain interpretation. According to his naturalism, the spacetime world is all that exists (Armstrong, 2004, p. 101). The claim that the spacetime world is all there is could collide with the view that spacetime is not fundamental. Indeed, a plausible position seems to be that there is more in the fundamental world than in the emergent world; and in the context of spacetime emergence, this excess of structure in the fundamental nonetheless appears to be physical, and should thus be accommodated by any suitable definition of naturalism.
Another example might be Schaffer’s combination of priority monism and supersubstantivalism (Schaffer, 2009, 2010). Priority monism is the view that the cosmos is more fundamental than any of its proper parts. Supersubstantivalism identifies the cosmos with a spacetime structure directly instantiating natural properties, without the mediation of objects. Supersubstantivalism appears to be incompatible with the view that the whole cosmos should be identified with a non-spatiotemporal substance. There is thus a question as to whether priority monism can be developed without supersubstantivalism in the context of quantum gravity. Such a view has been proposed by Le Bihan (2018b): spacetime entities and spacetime itself are regarded as identical to (non-spatiotemporal) proper parts of the whole cosmos.
Another discussion concerns the existence of extended simples, that is, entities that are both extended and devoid of proper parts. These could be constituent parts of material objects, of spacetime, or of any other relevant aspect of the natural world. It has been argued that string theory and loop quantum gravity lead to a conception of discrete spacetime, and that this fact provides a justification for the existence of extended simples (Rettler, 2018, p. 851). However, there is not much evidence to support the claim that the ontology of string theory is discrete (Baker, 2016). And even in the more suggestive case of loop quantum gravity, such a stance presupposes that the discrete entities associated with the ultimate nature of spacetime are not only simple, but also extended. If the fundamental structure is not spatiotemporal, a dilemma then arises: either the discrete entities are not extended, or they are not spatiotemporally extended, which then calls for a non-spatiotemporal notion of extension (Baron and Le Bihan, 2022b).
Finally, note that the philosophical consequences of the non-fundamentality of spacetime might go beyond what is sometimes perhaps narrowly conceived of as the domain of metaphysics, extending to the philosophy of mind and language. Braddon-Mitchell and Miller (2019), for example, argue that the non-fundamentality of (space)time stands in the way of naturalistic theories of representation. In a nutshell, if representation requires causation, and causation requires time, then timelessness could undermine these naturalistic theories of representation. One can thereby appreciate how considerations originating in the metaphysics of quantum gravity might have far-reaching implications, branching out into a number of distinct philosophical debates.
7. References and Further Reading
Albert, D. Z. Elementary quantum mechanics. In J. T. Cushing, A. Fine, and S. Goldstein, editors, Bohmian Mechanics and Quantum Theory: An Appraisal. Kluwer, Dordrecht, 1996.
Armstrong, David Malet. Truth and Truthmakers. Cambridge University Press, Cambridge, 2004.
Baker, David John. Does string theory posit extended simples? Philosopher’s Imprint, 16(18):1–15, 2016.
Baker, David John. Knox’s inertial spacetime functionalism (and a better alternative). Synthese, 199:277–298, 2021.
Baron, Sam. The curious case of spacetime emergence. Philosophical Studies, 177:2207–2226, 2020.
Baron, Sam and Baptiste Le Bihan. Composing spacetime. Journal of Philosophy, 119(1): 33–54, 2022a.
Baron, Sam and Baptiste Le Bihan. Quantum gravity and mereology: Not so simple. Philosophical Quarterly, 72(1):19–41, 2022b.
Baron, Sam and Baptiste Le Bihan. Spacetime quietism in quantum gravity. In Antonio Vassallo, editor, The Foundations of Spacetime Physics: Philosophical Perspectives. Routledge, New York, 2022c.
Baron, Sam and Baptiste Le Bihan. Causal theories of spacetime. Noûs, 58(1):202–224, 2024.
Baron, Sam and Yi-Cheng Lin. Time, and time again. The Philosophical Quarterly, 72(2):259–282, 2022.
Baron, Sam and Kristie Miller. Causation in a timeless world. Synthese, 191:2867–2886, 2014.
Baron, Sam, Peter Evans, and Kristie Miller. From timeless physical theory to timelessness. Humana.Mente, 13:35–59, 2010.
Baron, Sam, Kristie Miller, and Jonathan Tallant. Grounding at a distance. Philosophical Studies, 177:3373–3390, 2020.
Barrett, Jeffrey A. Empirical adequacy and the availability of reliable records in quantum mechanics. Philosophy of Science, 63(1):49–64, 1996.
Bell, John S. Speakable and Unspeakable in Quantum Mechanics. Cambridge University Press, Cambridge, 1987.
Belot, Gordon and John Earman. Pre-Socratic quantum gravity. In Craig Callender and Nick Huggett, editors, Physics Meets Philosophy at the Planck Scale, pages 213–255. Cambridge University Press, Cambridge, 2001.
Bernstein, Sara. Grounding is not causation. Philosophical Perspectives, 30:21–38, 2016.
Blumenhagen, Ralph, Dieter Lüst, and Stefan Theisen. Basic Concepts of String Theory, volume 17. Springer, 2013.
Bombelli, Luca, Joohan Lee, David Meyer, and Rafael D Sorkin. Space-time as a causal set. Physical Review Letters, 59(5):521, 1987.
Braddon-Mitchell, David. How do we know it is now now? Analysis, 64(3):199–203, 2004.
Braddon-Mitchell, David and Kristie Miller. Quantum gravity, timelessness, and the contents of thought. Philosophical Studies, 176(7):1807–1829, 2019.
Brown, H. R. and O. Pooley. The origins of the spacetime metric: Bell’s Lorentzian pedagogy and its significance in general relativity. In C. Callender and N. Huggett, editors, Physics Meets Philosophy at the Planck Scale. Cambridge University Press, Cambridge, 2001.
Brown, H. R. and O. Pooley. Minkowski space-time: A glorious non-entity. In D. Dieks, editor, The Ontology of Spacetime. Elsevier, Amsterdam, 2006.
Brown, Harvey R. Physical Relativity: Space-time Structure from a Dynamical Perspective. Oxford University Press, Oxford, 2005.
Butterfield, Jeremy and Henrique Gomes. Functionalism as a species of reduction. arXiv preprint arXiv:2008.13366, 2020.
Callender, Craig. Shedding light on time. Philosophy of Science, 67(S3):S587–S599, 2000.
Cartan, Élie. Sur les variétés à connexion affine et la théorie de la relativité généralisée (première partie). In Annales scientifiques de l’École normale supérieure, volume 40, pages 325–412, 1923.
Chalmers, David. Finding space in a non-spatial world. In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime, pages 154–181. Oxford University Press, Oxford, 2021.
Chalmers, David J. The meta-problem of consciousness. Journal of Consciousness Studies, 25(9–10):6–61, 2018.
Chen, Eddy Keming and Sheldon Goldstein. Governing without a fundamental direction of time: Minimal primitivism about laws of nature. In Jerusalem Studies in Philosophy and History of Science, pages 21–64. Springer International Publishing, 2022.
Chua, Eugene Y. S. and Craig Callender. No time for time from no-time. Philosophy of Science, 88(5):1172–1184, 2021.
Cinti, Enrico and Marco Sanchioni. Humeanism in light of quantum gravity. Synthese, 199(3):10839–10863, 2021.
Cinti, Enrico and Marco Sanchioni. Time, spacetime, and f-theory. Manuscript, 2023.
Cinti, Enrico, Alberto Corti, and Marco Sanchioni. On entanglement as a relation. European Journal for Philosophy of Science, 12(1):10, 2022.
Cohen, Robert Sonné, Michael Horne, and John J. Stachel. Experimental Metaphysics: Quantum Mechanical Studies for Abner Shimony, Volume One. Springer, 1997.
Dawid, Richard. String Theory and the Scientific Method. Cambridge University Press, Cambridge, 2013.
De Haro, Sebastian and Jeremy Butterfield. On symmetry and duality. Synthese, 198(4): 2973–3013, 2021.
De Haro, Sebastian and Henk W. de Regt. A precipice below which lies absurdity? Theories without a spacetime and scientific understanding. Synthese, 197:3121–3149, 2020.
Dowker, Fay. Causal sets as discrete spacetime. Contemporary Physics, 47(1):1–9, 2006.
Dowker, Fay. Introduction to causal sets and their phenomenology. General Relativity and Gravitation, 45:1651–1667, 2013.
Dowker, Fay. The birth of spacetime atoms as the passage of time. Annals of the New York Academy of Sciences, 1326(1):18–25, 2014.
Earman, John. Reassessing the prospects for a growing block model of the universe. International Studies in the Philosophy of Science, 22(2):135–164, 2008.
Earman, John, Christopher Smeenk, and Christian Wüthrich. Do the laws of physics forbid the operation of time machines? Synthese, 169:91–124, 2009.
Esfeld, Michael. Against the disappearance of spacetime in quantum gravity. Synthese, 199(Suppl 2):355–369, 2021.
Glick, David and Baptiste Le Bihan. Metaphysical indeterminacy in Everettian quantum mechanics. European Journal for Philosophy of Science, 14(1):3, 2024.
Greene, Brian. The Elegant Universe. Vintage Books, New York, 1999.
Grünbaum, Adolf. The causal theory of time. In Robert S. Cohen and Marx W. Wartofsky, editors, Philosophical Problems of Space and Time, volume 64, pages 179–208. Springer Netherlands, Dordrecht, 1973.
Healey, Richard. Can physics coherently deny the reality of time? Royal Institute of Philosophy Supplements, 50:293–316, 2002.
Healey, Richard. Physical composition. Studies in History and Philosophy of Modern Physics, 44:148–62, 2013.
Huggett, Nick. Skeptical notes on a physics of passage. Annals of the New York Academy of Sciences, 1326:9–17, 2014.
Huggett, Nick. Target space ≠ space. Studies in History and Philosophy of Modern Physics, 59:81–88, 2017.
Huggett, Nick and Karim PY Thébault. Finding time for Wheeler-DeWitt cosmology. arXiv preprint arXiv:2310.11072, 2023.
Huggett, Nick and Tiziana Vistarini. Deriving general relativity from string theory. Philosophy of Science, 82(5):1163–1174, 2015.
Huggett, Nick and Christian Wüthrich. Emergent spacetime and empirical (in)coherence. Studies in History and Philosophy of Modern Physics, 44(3):276–285, 2013.
Huggett, Nick and Christian Wüthrich. The (a)temporal emergence of spacetime. Philosophy of Science, 85(5):1190–1203, 2018.
Huggett, Nick, Tiziana Vistarini, and Christian Wüthrich. Time in quantum gravity. A Companion to the Philosophy of Time, pages 242–261, 2013.
Huggett, Nick, Niels Linnemann, and Mike D. Schneider. Quantum Gravity in a Laboratory? Elements in the Foundations of Contemporary Physics. Cambridge University Press, 2023.
Ismael, Jenann. Do you see space? How to recover the visible and tangible reality of space (without space). In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime. Oxford University Press, Oxford, 2021.
Jaksland, Rasmus. Entanglement as the world-making relation: Distance from entanglement. Synthese, 198(10):9661–9693, 2021.
Jaksland, Rasmus and Kian Salimkhani. The many problems of spacetime emergence. The British Journal for the Philosophy of Science, 2023. doi: 10.1086/727052.
Knox, Eleanor. Newton-Cartan theory and teleparallel gravity: The force of a formulation. Studies in History and Philosophy of Modern Physics, 42(4):264–275, 2011.
Knox, Eleanor. Spacetime structuralism or spacetime functionalism, 2014. Manuscript.
Knox, Eleanor. Physical relativity from a functionalist perspective. Studies in History and Philosophy of Modern Physics, 67:118–124, 2019.
Knox, Eleanor and David Wallace. Functionalism fit for physics. Unpublished manuscript, 2023.
Lam, Vincent and Michael Esfeld. A dilemma for the emergence of spacetime in canonical quantum gravity. Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics, 44(3):286–293, 2013.
Lam, Vincent and Christian Wüthrich. Spacetime is as spacetime does. Studies in History and Philosophy of Modern Physics, 64:39–51, 2018.
Lam, Vincent and Christian Wüthrich. Spacetime functionalism from a realist perspective. Synthese, 199:335–353, 2021.
Lam, Vincent and Christian Wüthrich. Laws beyond spacetime. Synthese, 202(71), 2023.
Le Bihan, Baptiste. The unrealities of time. Dialogue, 54(1):25–44, 2015.
Le Bihan, Baptiste. Space emergence in contemporary physics: Why we do not need fundamentality, layers of reality and emergence. Disputatio, 10(49):71–95, 2018a.
Le Bihan, Baptiste. Priority monism beyond spacetime. Metaphysica, 19(1):95–111, 2018b.
Le Bihan, Baptiste. String theory, loop quantum gravity and eternalism. European Journal for Philosophy of Science, 10(2):1–22, 2020. doi: 10.1007/s13194-020-0275-3.
Le Bihan, Baptiste. Spacetime emergence in quantum gravity: Functionalism and the hard problem. Synthese, 199(2):371–393, 2021.
Le Bihan, Baptiste. String theory for metaphysicians. Unpublished manuscript, 2023.
Le Bihan, Baptiste. The great loop: From conformal cyclic cosmology to aeon monism. Journal for General Philosophy of Science / Zeitschrift für Allgemeine Wissenschaftstheorie, 2024.
Le Bihan, Baptiste and Niels Linnemann. Have we lost spacetime on the way? Narrowing the gap between general relativity and quantum gravity. Studies in History and Philosophy of Modern Physics, 65:112–121, 2019.
Le Bihan, Baptiste and James Read. Duality and ontology. Philosophy Compass, 13(12):e12555, 2018.
Lewis, David. Psychophysical and theoretical identifications. Australasian Journal of Philosophy, 50(3):249–258, 1972.
Linnemann, Niels. On the empirical coherence and the spatiotemporal gap problem in quantum gravity: And why functionalism does not (have to) help. Synthese, 199(2): 395–412, 2020.
Major, Seth, David Rideout, and Sumati Surya. Stable homology as an indicator of manifoldlikeness in causal set theory. Classical and Quantum Gravity, 16(17):175008, 2009.
Malament, David B. The class of continuous timelike curves determines the topology of spacetime. Journal of Mathematical Physics, 18(7):1399–1404, 1977.
Matsubara, Keizo and Lars-Göran Johansson. Spacetime in string theory: A conceptual clarification. Journal for General Philosophy of Science, 49(3):333–353, 2018.
Maudlin, Tim. Completeness, supervenience and ontology. Journal of Physics A: Mathematical and Theoretical, 40(12):3151–3171, 2007.
McTaggart, John E. M. The unreality of time. Mind, 17:457–484, 1908.
Menon, Tushar. Taking Up Superspace: The Spacetime Setting for Supersymmetric Field Theory. In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime: Implications from Quantum Gravity. Oxford University Press, Oxford, 2021.
Miller, Kristie. Spatiotemporal projectivism. In Stephen Hetherington, editor, Extreme Philosophy, pages 47–61. Routledge, 2024.
Monton, Bradley. Presentism and quantum gravity. In Dennis Dieks, editor, The Ontology of Spacetime, volume 1 of Philosophy and Foundations of Physics, pages 263–280. Elsevier, 2006.
Neurath, Otto. Soziologie im Physikalismus. Erkenntnis, 2:393–431, 1931.
Ney, Alyssa. The status of our ordinary three dimensions in a quantum universe. Noûs, 46(3):525–560, 2012.
Ney, Alyssa. Fundamental physical ontologies and the constraint of empirical coherence: A defense of wave function realism. Synthese, 192(10):3105–3124, 2015.
Ney, Alyssa. From quantum entanglement to spatiotemporal distance. In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime, pages 78–102. Oxford University Press, Oxford, 2021a.
Ney, Alyssa. The World in the Wave Function: A Metaphysics for Quantum Physics. Oxford University Press, New York, 2021b.
Read, James and Baptiste Le Bihan. The landscape and the multiverse: What’s the problem? Synthese, 199(3):7749–7771, 2021.
Read, James, Harvey R. Brown, and Dennis Lehmkuhl. Two miracles of general relativity. Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics, 64:14–25, 2018.
Reichenbach, Hans. The Direction of Time. Dover Publications, 1956.
Rettler, Bradley. Mereological nihilism and puzzles about material objects. Pacific Philosophical Quarterly, 99(4):842–868, 2018.
Rideout, David and Petros Wallden. Emergence of spatial structure from causal sets. In Journal of Physics: Conference Series, volume 174, page 012017. IOP Publishing, 2009.
Rideout, David P and Rafael D Sorkin. Classical sequential growth dynamics for causal sets. Physical Review D, 61(2):024002, 1999.
Rovelli, Carlo. Quantum Gravity. Cambridge University Press, Cambridge, 2004.
Rovelli, Carlo and Francesca Vidotto. Covariant Loop Quantum Gravity: An Elementary Introduction to Quantum Gravity and Spinfoam Theory. Cambridge University Press, Cambridge, 2014.
Saucedo, Raul. Parthood and location. In Dean W. Zimmerman and Karen Bennett, editors, Oxford Studies in Metaphysics Vol. 5, pages 225–286. Oxford University Press, Oxford, 2011.
Schaffer, Jonathan. Spacetime the one substance. Philosophical Studies, 145(1):131–148, 2009.
Schaffer, Jonathan. Monism: The priority of the whole. Philosophical Review, 119(1):31–76, 2010.
Schaffer, Jonathan. Grounding in the image of causation. Philosophical Studies, 173(1):49–100, 2016.
Schlick, Moritz. Über das Fundament der Erkenntnis. Erkenntnis, 4:79–99, 1934.
Skow, Bradford. Objective Becoming. Oxford University Press, 2015.
Smith, Nicholas JJ. Inconsistency in the A-theory. Philosophical Studies, 156:231–247, 2011.
Tomasiello, Alessandro. Geometry of String Theory Compactifications. Cambridge University Press, Cambridge, 2022.
van Fraassen, Bas. An Introduction to the Philosophy of Time and Space. Columbia University Press, 1970.
Vassallo, Antonio and Michael Esfeld. A proposal for a Bohmian ontology of quantum gravity. Foundations of Physics, 44:1–18, 2014.
Vistarini, Tiziana. The Emergence of Spacetime in String Theory. Routledge, New York, 2019.
Wallace, David. Quantum gravity at low energies. Studies in History and Philosophy of Science Part A, 94(C):31–46, 2022. doi: 10.1016/j.shpsa.2022.04.003.
Wilson, Alastair. Explanations of and in time. In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime: Implications from Quantum Gravity, pages 182–198. Oxford University Press, Oxford, 2021a.
Wilson, Jessica. Metaphysical Emergence. Oxford University Press, Oxford, 2021b.
Wüthrich, Christian. No presentism in quantum gravity. In Vesselin Petkov, editor, Space, Time, and Spacetime, pages 257–278. Springer, 2010.
Wüthrich, Christian. The fate of presentism in modern physics. In Robert Ciuni, Kristie Miller, and Giuliano Torrengo, editors, New Papers On The Present–Focus On Presentism, pages 91–131. Philosophia Verlag, Munich, 2013.
Wüthrich, Christian. Raiders of the lost spacetime. In D. Lehmkuhl, G. Schiemann, and Scholz, editors, Towards a Theory of Spacetime Theories, pages 297–335. Birkhäuser, Basel, 2017.
Wüthrich, Christian. Time travelling in emergent spacetime. In Judit Madarász and Gergely Székely, editors, Hajnal Andréka and István Németi on Unity of Science: From Computing to Relativity Theory Through Algebraic Logic, pages 453–474. Springer, 2021.
Wüthrich, Christian and Craig Callender. What becomes of a causal set? British Journal for the Philosophy of Science, 68(3):907–925, 2016.
Wüthrich, Christian. When the actual world is not even possible. In George Darby, David Glick, and Anna Marmodoro, editors, The Foundation of Reality: Fundamentality, Space and Time. Oxford University Press, Oxford, 2019.
Wüthrich, Christian. One time, two times, or no time? In Alessandra Campo and Simone Gozzano, editors, Einstein vs. Bergson: An Enduring Quarrel on Time, pages 209–230. de Gruyter, Berlin, 2022.
Yates, David. Thinking about spacetime. In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime, pages 129–153. Oxford University Press, 2021.
Zimmerman Jones, Andrew and Alessandro Sfondrini. String Theory for Dummies. John Wiley & Sons, 2nd edition, 2022.
Zwiebach, Barton. A First Course in String Theory. Cambridge University Press, Cambridge, 2009.
From virtually the beginning of the Western tradition, philosophers have at least sporadically recognized that being is the primordial issue in philosophy. It is such because every theoretical sentence is implicitly or explicitly governed by a theoretical operator including a conjugated form of the verb “to be”; hence, everything we think or talk about is either being itself or an instance or aspect of being. In the language of concepts, the concept being presupposes no other concepts, but is itself presupposed by all other concepts. According to the Eleatic Visitor in Plato’s Sophist, being [ousia] was of sufficient philosophical importance to his predecessors to have instigated “something like a battle of gods and giants among them.” Shortly thereafter, being qua being [to on he on] (or, more accurately, the entity as entity, or better yet, the be-er as be-er) is identified in Aristotle’s Metaphysics—among the most influential books in the history of philosophy—as the issue at the heart of first philosophy. Nevertheless, there are central issues concerning being that are not recognized by any Greek philosopher, indeed not identified until the thirteenth century C.E., in some works by Thomas Aquinas. But these issues are not adequately treated by Aquinas, and, after Aquinas, they are recognized only quite rarely. They are, however, a focus of the structural-systematic philosophy (SSP), wherein their importance is indicated by the inclusion of the word “being” in the titles of the books Structure and Being, Being and God, and Being and Nothing. This article sketches a theory of being that relies on a linguistic resource that appears to be available only in English. This is the present-continuous tense, as used in the sentence “It’s being” and the sentence operator “It’s being such that.” These formulations make possible the articulation of the primordiality, universality, and uniqueness of being.
One centrally important reason for the lack of sufficient clarity in philosophical treatments of being is that at least most of the ordinary languages that philosophers have relied on throughout the history of philosophy, emphatically including English, articulate being in a variety of inadequate ways. That they do so adds needless and often misleading complications to being’s articulation. A first source of complications in ordinary English is the vocabulary available for the articulation of being; a second source is the structures of the sentences it provides for articulating being. Each of the following two subsections first identifies specific problems with ordinary-English ways of articulating being, and then introduces refinements to the SSP’s language that enable it to avoid these problems. This article’s version of the SSP differs from Lorenz B. Puntel’s version, particularly in the ways it articulates being and hence in its theory of being.
a. Refinements of Vocabulary
Three peculiarities of the words used by ordinary English to articulate being are of particular importance as far as philosophical articulations of being are concerned. The first peculiarity is that the word “being” has (most relevantly) the following distinct senses: (1) a nominal sense, in which “being” is roughly synonymous with “entity,” and (2) a verbal sense, in which “being” is roughly synonymous with “existing,” in the sense articulated in the Oxford English Dictionary (henceforth, OED) as “the fact of belonging to the universe of things material or immaterial.” Because the two senses are available, one can say both “To be is to be a being” and “To be is to be being,” or “I am a being” and “I am being.” Philosophical uses of the word “being” that do not clearly distinguish these senses, or that do not make clear, in all relevant cases, which sense is intended, require clarification.
The second important peculiarity of words used in ordinary English to articulate being is that, because the word “is” is so often used as copula (or, on an alternative interpretation, as a component of predicates, as in “is red” or “is human”), such sentences as “Fred is” can appear to be incomplete; hearing the sentence “Fred is,” one might well wonder, “Fred is what?”. Similarly, the question “Does God exist?” is more readily intelligible than is the question, “Is God?”.
Presumably because sentences ending with “is” so easily appear incomplete, the use of “is” as the final word of a sentence, situating the referent of the subject term within being, is rare. Instead, for this sense, the “is” usually follows the word “there,” in the phrase “there is.” In this phrase (as in “there are”), the “there” does not perform its usual role of indicating a location that is specified later in the sentence (as in, “There is a pizza restaurant on the corner”). It instead signals that the “is” situates the referent of its subject term within being, rather than functioning as copulative or predicative. That English can express this sense of being by “is,” by “exists,” and by “there is,” with the first of these being the most problematic, introduces avoidable confusion.
The third important peculiarity of the ordinary-English words used to articulate being is that several conjugated forms of the verb “to be” have roots different both from that of the infinitive and from those of one another; these include, among others, “am,” “is,” and “are.” Consequently, although any sentence using any one of these words at least co-articulates being, the words themselves do not make that fully explicit.
One way to improve talk about being, using a slightly modified version of ordinary English, is to introduce a capitalized version of the word (as is done in Being and God and Being and Nothing), and to explicitly link that version to the use of “being” in sentences that situate the referents of their subject terms within being. That remedy, however, avoids only the first of the three peculiarities just identified. Therefore, this article, following White’s Toward a Philosophical Theory of Being (henceforth, TAPTOE) and “Rearticulating Being,” proceeds differently. First, instead of using the word “being” in a nominal sense, in which it would be roughly synonymous with the word “entity,” it introduces for that sense the technical term “be-er,” a word similar to such ordinary-language terms as “runner,” “swimmer,” “writer,” and “philosopher.” Just as running is not a runner and does not run, being is not a be-er and does not be; instead, runners run, and be-ers be. Second, the SSP often uses “be” when ordinary English would require “am,” “is,” or “are,” that is, as the sole form of its verb “to be” in the simple present, and as a component of present-continuous verbs. A native speaker of Jamaican English has confirmed to me that that language indeed uses the sentence “We be jamming,” and that sentence is intelligible to speakers of other versions of English, as are (for example) “I be talking,” “You be reading,” and “We be philosophizing.” In the technical language of the SSP, there be human be-ers; human be-ers be the be-ers that be human. Their unavoidable mode of being be being human.
Grammatically, these variants of parts of the verb “to be” make that verb much more regular than its counterpart in ordinary English. Philosophically, they enable the SSP to directly and explicitly articulate being and its ubiquity. Thereby, it is hoped, this modified, technical English presents a more powerful obstacle than does ordinary English to the tendency that Heidegger calls forgetfulness or oblivion of being. In other words, these changes are meant to make it harder for us to fail to notice the ubiquity of being—to notice that whenever we are speaking or thinking, we are speaking or thinking either (rarely) of being itself, or (usually) of instances or aspects of being.
b. A Refinement of Semantically Significant Sentence-Structures
Drawing on several works by Étienne Gilson, this subsection shows how the vocabulary introduced above can make possible the direct and explicit articulation of being. Such articulation is the strongest obstacle to the oblivion of being (of which more below) only if it is accompanied by a refinement of semantically significant sentence-structures. A first step is taken with clarification of the ubiquity of being as articulable in theories.
Theories are articulated as collections of indicative sentences and, as Gilson 1952 (197) points out (using the term “affirmations” rather than “indicative sentences”),
the principal function of the verb is to affirm, and since affirmation remains the same whatever may happen to be affirmed, a single verb should suffice for all affirmations. In point of fact, there is such a verb, and it is “to be.” If only spoken usage allowed it, we would never use any other one…. Not I live, or I sit, but I am living, I am sitting and likewise in all other cases.
If the “ams” in “am living” and “am sitting” are understood as components of present-continuous verbs rather than as being copulative or predicative uses of “am,” then of course they are not simply forms of the verb “to be,” but even then, the present-continuous verbs that include them co-articulate being. Moreover, additional distinctions are necessary, for example between the likes of “She be running” and “She be a runner” or, more expansively, “She be a be-er who also (more specifically) be running (right now)” and “She be a be-er who also (more specifically) be a runner (even if not running right now).”
A consequence of the possibility of such reformulations is that every theoretical sentence can be made to co-articulate being. Yet, for reasons given in Gilson 1948 (284-5), co-articulation of being has not sufficed, historically, to counter the oblivion of being. One reason for this is that the sentences considered in that text, like most sentences in English and in most and perhaps all other languages that have been used by philosophers, include semantically significant grammatical subjects.
According to Gilson, being is most directly articulated in sentences of the form “S [a semantically significant subject] be”; any such sentence articulates “the composition of the subject with its act of being, it unites them in thought as they are already united in reality.” [original text altered slightly to incorporate this article’s language of being]. Yet although any such sentence unites them, the text tells us that the human intellect (284) tends to focus on the subject—the be-er—and thereby to neglect the “act of being.” In any such sentence, being is articulated only “as included in the [be-er].” That it is only so articulated:
is often serious, to the point of sometimes being catastrophic, because, as history has made us see, the spontaneous conceptualism of ordinary thought tends constantly to reinforce the essence of the [be-er] to the detriment of its act of [being]. Let us also add that this fact is easily explained because the [be-er] has more than its [being], that is, it has its concept… (Gilson 1948, 284–5).
Gilson 1948 recognizes, as the only sentence-forms that articulate being, “S is p” and “S is.” The SSP, however, links its semantics and ontology not to sentence-forms including semantically significant subject-terms, but instead to sentences of the form “It’s such-and-suching.” This form makes it easy to explicitly and exclusively articulate being itself, not being as included in any be-er. This is done with the sentences “It’s being” and “It be being,” whose only semantically significant terms are their present-continuous verbs (the “It” in such sentences is considered below). The decisive contributions that these formulations make to the SSP’s theory of being are explained in greater detail in what follows, but one that links to Gilson 1948 is appropriately included here. According to that text (248), if a sentence of the form “S is” is understood as articulating being, it says “not that the subject is itself, which is always true of everything, but [instead] that it is, which is not true, and moreover not always, except for some.”
As is more fully explained in what follows, the SSP’s sentence “It be being” is always true; this is a first indication of the primordiality of being.
2. Theories of Being and Theories of Be-ers
Throughout the history of philosophy, most theories that theoretical frameworks relying on ordinary English would classify as theories of being are, according to the SSP, instead theories of be-ers. Moreover, in this terminology (and, more broadly, in that of the SSP), theories of be-ers are ontologies. Theories of be-ers are concisely summarized by sentences of the form “To be a be-er is to be x,” with x generally replaced by one or more nouns, with appropriate article(s). Most ontologies, both historically and at present, hold that to be a be-er is to be either a thing (or object or substance), or an attribute of a thing (a property or, in some variants, a relation); according to such ontologies, the apple that is red and is on the table is (or be), as do the redness of that apple and the table that the apple is on. The SSP’s ontology holds instead, as discussed in various contexts in TAPTOE, that to be a be-er is to be a fact (or, in TAPTOE’s technical term, a facting).
Why is there need for a theory of being? To answer this question, compare running: one could assemble a list of runners, but unless one had a theory of running, one would be unable to explain why the list contained the items that it did. Similarly, the one thing that all be-ers have in common is that they be, so in the absence of a theory of being, one is unable to explain why the list of be-ers contains the items that it does.
In what follows, clarity is served by speaking of frameworks for theories of be-ers that include thing- or substance- or object-ontologies as whatness-frameworks, because at least the most prominent members of that family of frameworks include identifiable versions of the thesis of the primacy of whatness, that is, the thesis that every be-er is, primarily, its whatness. In Aristotelian frameworks, the problem of being is not recognized, so the primacy of whatness is the primacy of substance over at least accidental attributes; according to such frameworks, Alan White is, primarily, either a human be-er (essence) or the specific human be-er that he is (individual), but he must be that whatness in order to be, at some specific time, doing anything else, for example, sitting rather than standing.
Far later than Aristotle, Thomas Aquinas takes an important step in recognizing, at least occasionally, the primacy of being over whatness but, as Gilson 2002 indicates (163), he has important predecessors:
Other philosophers had preceded Thomas along this path, and all of them helped him to follow it through to the end, particularly those among them who clearly raised the problem of [being]. Alfarabi, Algazel, Avicenna among the Arabs, Moses Maimonides among the Jews, had already noted the truly exceptional position that [being] occupies in relation to essence…. What seems to have especially intrigued these philosophers is that, however far you push the analysis of essence, [being] must be added to it in some way from outside, as an extrinsic determination conferring on it the act of [being]…. These philosophers started from essence, and using analysis they sought to discover [being] within it, but they could not find it there. Hence their conclusion: [being] was extraneous to essence as such…. So Alfarabi concludes: “[Being] is not a constitutive factor; it is only an accessory accident.”
As this passage makes clear, these predecessors move beyond Aristotle in positing the primacy of whatness not only over attributes, but also over being. Yet if what is called being is extraneous to essence, then essence is extraneous to what is called being. If, however, essence is, and is extraneous to what is called being, then what is called being is not being in its ubiquity. Additional steps must be taken.
As indicated above and considered in somewhat more detail below, some of those steps are detectable in various works by Thomas Aquinas although, as shown in Being and God (1.3.2), those works contain no theory of being. Moreover, those works’ articulations of being have not been widely influential. Indeed, as noted in Gilson 1952 (118), “the genuine meaning of the Thomistic notion of being is, around 1729, completely and absolutely forgotten,” thanks chiefly to the dominant influence of Suarez. A revival of so-called existential Thomism develops in the 1930s, and is sufficiently developed by the 1950s that Clarke 1955 includes (61-2) the following announcement:
What is now widely known as the existential interpretation of Thomistic metaphysics has definitely come of age. (By existential I mean that interpretation which sees in the act of [being] the source of all perfection and intelligibility, hence the center of gravity of St. Thomas’[s] whole philosophy)…. As speculation and text work proceed hand in hand, each illuminating the other, it is becoming more and more evident that this perspective is by no means some short-lived fad borrowed from the contemporary Existentialist movements and superimposed extrinsically on St. Thomas’[s] own thought, but rather that it is that one luminous center … in the light of which alone the total body of St. Thomas’[s] texts takes on full intelligibility and coherence.
Fad or not, the revival appears to have been relatively short-lived, and certainly had no influence on the mainstream analytic philosophy that has been dominant since around the time of the revival. Worse yet, as the SSP shows, “the total body of St. Thomas’[s] texts” cannot “[take] on full intelligibility and coherence,” not only because the heart of the substance ontology relied on by those texts is unintelligible (see TAPTOE 2.5), but also—and more importantly, for this article—because although some of those texts recognize the primacy of being, none adequately articulates the ubiquity of being because in them, essence remains somehow distinct from being (see also Being and God 1.3.2.2). The remainder of this section, however, focuses on semantic and ontological problems that would remain even if a variant were developed according to which essence was fully intrinsic to being.
The semantic and ontological problems endemic to Thomistic being-frameworks are linked to the one identified in a passage from Gilson 1948 considered above. They arise from the at least tacit reliance of Thomistic frameworks on compositional semantics—according to which, roughly speaking, the meanings or semantic values of sentences are functions of the meanings or semantic values of their sub-sentential components—and on ontologies strongly linked to semantically significant grammatical subjects and predicates. Within these frameworks, the semantic counterparts to both subjects and predicates are concepts, such that, for example, the sentence “Unicorns are mythological” links the word “unicorns” to the concept unicorn and the word “mythological” to the concept mythological. The semantic status of being, within such frameworks, is problematic because there is no concept that can be linked to the word “is” (or “are”) in any manner comparable to that in which the concept unicorn is linked to the word “unicorn,” and the concept mythological to the word “mythological.” This makes both the semantic and the ontological status of being obscure.
This point is worth emphasizing. Asked to clarify “unicorns,” the utterer of the sentence would easily be able to say that unicorns are animals that are like horses except that they have single sharp horns growing from their foreheads, and asked about “mythological,” the utterer could easily say that “mythological” beings are ones that occur only in stories told by human be-ers but not, unlike for example horses, cats, and dogs, in reality. But asked about the “are,” the utterer would have no comparably available explanation.
Gilson 2002 (172) addresses this problem by noting initially that “Being is the first of all concepts,” because being is co-articulated in every sentence of the form ‘S is’ or ‘S is p’ and, as shown above, any indicative sentence can be rewritten into a sentence with such a form. Being is also, however, “the most universal and abstract [concept], the richest in extension and the poorest in comprehension.” It is “richest in extension” because any be-er can be articulated in a sentence whose verb is “is,” but “poorest in comprehension” because sentences of the form “S is” can appear to say nothing specific about the be-er. Because of this poverty of comprehension, Gilson 2002 voices the suspicion that we would need either “an intuition of [being]” or an “intellectual intuition of being as being” in order to comprehend it, yet such an intuition would be inarticulable and thus would not make being conceivable. “But,” the book continues (174), “reason dislikes what is inconceivable, and because this is true of [being], philosophy does all it can do to avoid it.”
Concerning the articulation of being, then, Thomistic frameworks face first the at least partially avoidable problem that, because their sentences at best co-articulate being, they tend to place more emphasis on be-ers than on being. This problem is avoided by those who emphasize being, but those who do, even if they also were to recognize the ubiquity of being, would face the insuperable obstacles to clarification of being posed by those frameworks’ semantics and ontologies. What is needed, then, if philosophy is to cease to avoid being, is not a new Thomistic theory of being, but instead a different theory of being.
3. Central Aspects of the SSP’s Theory of Being: The Primordiality, Ubiquity, Uniqueness, and Universal Intelligibility of Being
Given that a theory of being (or of anything else) must be a collection of meaningful, or semantically significant, sentences, any adequate theory of being must explain its semantics, that is, how its sentences are meaningful.
As indicated in various contexts in TAPTOE, the SSP rejects both compositional semantics and any ontology or semantics strongly linked to the structural components of subject-predicate sentences. It links its ontology and semantics instead to sentences with the structure “It’s such-and-suching.” Because such sentences—technically, sentencings—have no semantically significant subsentential components, their semantics cannot be compositional; they are instead contextual, that is, ones according to which words have determinate meanings or semantic values only within the contexts of sentences. The semantic contents of sentences are propositions (technically, propositionings), and these semantic contents relate to the SSP’s ontology such that every propositioning is identical to a facting. The absolutely comprehensive facting IT’S BEING (or IT BE BEING) is identical to the true propositioning It’s being (or It be being), which is expressible by the true sentencing “It’s being” (or “It be being”).
Being as articulated in the sentence “It be being” may appear to be, in Gilson 2002’s terms, “universal and abstract,” but it is straightforwardly and transparently concretized by means of that sentence’s expansion into the operator “It be being such that,” which can govern any and every sentencing that expresses a propositioning. An example: “It be being such that It be Alan Whiting such that It be revising It’s TAPTOEing,” a sentencing true at the time of its initial composition. Hence, although being is of problematic intelligibility within the theoretical frameworks considered and relied on in Gilson’s works, because their semantic focus is on concepts and being has no clear conceptual status within them, within the framework of the SSP, being is directly articulated by a true sentencing that expresses a true propositioning identical to an actual facting.
The uniqueness of being mentioned in this subsection’s title is clarified by more detailed comparison of being with running (running being simply one of a vast number of possible comparative items). Most human be-ers are capable of running or (in slightly different terms) have the capacity to run. The human be-er who runs is activating that capacity—that human be-er be at work running, be engaged in running—whereas the human be-er who sits generally retains the capacity to run while not activating it, while not engaged in running. In contrast, every human be-er who actually be cannot avoid engaging in being, cannot avoid being at work being; any human be-er not engaging in being would be a merely possible human be-er, like a future grandchild or perhaps a human be-er who, although having been, be no longer. Thus, whereas running is an ontological capacity because human be-ers (along with be-ers of many other kinds) can but need not be engaged in running, being is not a capacity, because be-ers, of whatsoever sort, have any capacities, activated or not, only if and when they be. This, then, is the most central way that the being of be-ers differs from all of their other engagements (or, more accurately, modes of being): being is the engagement or being-at-work (or, again, mode of being) that is not an ontological capacity. In this way, it is absolutely unique.
The uniqueness of being among the engagements of be-ers is further illuminated by the phenomenon of cryopreservation. Some organisms—including human embryos and adult members of a few species of vertebrates (chiefly amphibians)—can continue to be—and to be the organisms that they are—while they are frozen. While they are frozen, all of their metabolic processes cease. Hence, when frozen, they do not activate their capacities for aging or even for living, in anything like the usual sense of living, although they are not dead. They are not dead because they retain the capacity to live; that capacity is reactivated when they cease to be frozen. Even as frozen, then, they continue to be, to engage in being.
An additional step leads, in a manner different from those introduced above, from the being of be-ers to being itself. No organism has the capacity to bring itself into being, because before the organism is, the organism has no capacities. And yet, the coming into being of the organism reveals that it was possible that the organism come into being. The coming into being of the organism therefore reveals the capacity of being to be manifest, to manifest itself, as that organism. The birth of the organism is being’s reconfiguration of itself so as to include that organism; it is the emergence of the organism into and hence within being. For this reason, the gestation of the organism that grows into a salamander is also articulable as being engaging in salamandering: It be being such that It be salamandering.
“Salamandering” is of course a peculiar word, but one whose inclusion in this article should not be surprising, given that one way to be, according to the SSP’s ontology, is to be an actual facting identical to a true propositioning expressible by a true sentencing such as, for example, “It’s salamandering.” A fuller consideration of the “It” in this sentencing further clarifies the involvement of being in salamandering (on this topic, see also Being and God 3.2.1.1).
According to the OED, the “it” of “It’s raining” (and that of “It’s salamandering”) is “the subject of an impersonal verb or impersonal statement, expressing action or a condition of things simply, without reference to any agent.” The OED includes the following examples: “It has fared badly with the soldiers; How is it in the city? It will soon come to a rupture between them; It is all over with poor Jack; It is very pleasant here.” Resituated within the theoretical framework of the SSP, none of these sentences can articulate any condition of things, but each can articulate, intelligibly and coherently, a configuration of being. That each can indeed do so is revealed by the fact that each remains intelligible—although, of course, also becomes peculiar—if its “it” is replaced with “being”: “Being has fared badly with the soldiers; How is being in the city? Being will soon come to a rupture between them; Being is all over with poor Jack; Being is very pleasant here.” Given its rejection of the semantics linked to subject-predicate sentences, the SSP cannot of course identify being, in any of these cases, as the referent of the “it” that the word “being” replaces. Instead, it takes each “it” to indicate being, a configuration of which is articulated by the words that complete the sentence. Hence, an alternative formulation of one of the sentences introduced above is, “It’s being such that It’s faring badly with the soldiers.”
A further step is taken following the introduction of a second instance of the impersonal “it,” one that is of central importance to the SSP. This is the “it” of the theoretical operator, which has the forms “It is (it be) the case that” and “It is (it be) true that.” As explained more fully in TAPTOE (see 3.4-5, 6.3, and 6.3.1), prefixing the theoretical operator to any indicative (hence, theoretical) sentence can make explicit the semantic status of that sentence. Hence, for example, the semantic status of “It’s raining,” as asserted, is explicitly articulated in the sentence “It is the case that it’s raining.”
As Being and God explains, in terms of indicative function, the “it” of any sentencing, such as “It’s raining,” and the “it” of the theoretical operator are not simply identical. The “it” of “It’s raining” usually indicates a configuration of being at a specific spatio-temporal location. Because, however, the theoretical operator makes explicit the semantic status of every theoretical sentence, its scope is absolutely unrestricted: it thus indicates being as a whole. The example sentence is thus understood as follows: being as a whole is configured such that being here-and-now is configured such that raining is ongoing. Or: It be being as a whole such that It be being here-and-now such that It be raining. Or: It be being as a whole such that It always be being such that It be 2+2=4ing.
In closing this section, note that a thesis central to the SSP’s theory of being as such is that being is universally intelligible.
4. Being and Whatness
A central advantage of being-frameworks (over essence- or whatness-frameworks) is that they can make the essence or whatness of any be-er intrinsic to the being of that be-er. By contrast, even in whatness-frameworks that recognize being, being is generally taken to be extrinsic to essence or whatness. This section shows that and how, according to the SSP, (1) being is prior to the whatness of any be-er and (2) whatness is intrinsic to the being of any be-er. To this end, this article, following TAPTOE, introduces two examples, one of a biological be-er’s coming- and ceasing-to-be, and one of an artifactual be-er’s coming- and ceasing-to-be.
The biological example is a possible case of in-vitro fertilization (in-vitro so as not to add a woman’s womb as a complicating factor). Prior to fertilization, there are, of central importance to this example, two be-ers, thus, two be-ers that be, that be engaged in being: these be a sperm cell and an egg cell. If fertilization occurs, egg and sperm will have ceased to be, and a zygote will have come to be. Much about specifically what that zygote will be, if it comes to be, is presumably determined by the genetic make-ups of the sperm and egg cells. If a zygote comes to be, it may be the only zygote that could have come to be, in this situation, but it will be that zygote only when it comes to be. There be one cell at work being a sperm cell, and one cell at work being an egg cell; if fertilization occurs, it will be because egg and sperm have jointly reconfigured themselves into constituents of a new be-er, the zygote.
Considered somewhat differently: sperm and egg are both specific configurations of being, of being because they be, and specific because each has specific capacities, and lacks capacities had by other kinds of be-ers; each has the capacity to unite with the other, and, in so uniting, to be reconfigured, or to reconfigure itself, such that some of what had been its constituents become constituents of a zygote. They lack the capacity to be jointly reconfigured, or to jointly reconfigure themselves, into anything other than a zygote. If they unite, the zygote will come to be as itself a specific or restricted configuration of being.
Even if the zygote as a new organism comes to be, it will not, of course, continue forever to be. At some point, it will die—possibly quite abruptly. The physical changes in the organism, when it dies, can appear to be relatively slight, but the ontological change could not be more profound. Following death, many organs that had been components of the organism’s body may continue to be, and to be at work continuing to be, for some time, but they will no longer be at work as organs, because there will no longer be an organism.
The artifactual example is the following: contemplating what to cook for breakfast, I may narrow my choices to oatmeal or an omelet. If I choose to make the omelet, I have—in one terminology—determined the essence of the be-er that will come to be if I succeed in making it. I can have before me all of the ingredients I will use—eggs and, say, sausage, onion, cheese, and the butter with which I will coat the frying pan—and if I proceed, what omelet will come to be, if an omelet comes to be, is highly determinate. But there is not yet an omelet, and nothing about the possible omelet’s essence or whatness determines whether or not it will come to be. What have come to be, already, are the ingredients, with their constituents jointly at work enabling them to continue to be, and I myself, the potential chef. If I opt for the omelet rather than for oatmeal, and if I successfully follow the requisite procedures, the ingredients will begin to work together in such a way that, soon thereafter, an omelet will begin to be. But the beginning to be is the beginning of the omelet—there be no omelet until the omelet be; until the omelet be, the ingredients be, but the omelet does not. After I have eaten the omelet, of course, the omelet no longer be, but its constituents continue to be, as they—or at least some of them—are reconfigured, temporarily, into constituents of my body.
5. Neglectfulness of Being
This subsection is metasystematic because it treats texts external to the SSP, but it is included to further clarify the central importance of being as a topic for systematic philosophy. Its phrase “neglectfulness of being” is a formulation more accurate than the Heideggerian “forgetfulness” or “oblivion” of being, mentioned above. The reason for introducing this phrase is suggested by what is said in subsection 3 about the theoretical operator. Because the theoretical operator indicates and indeed discloses being as a whole, being cannot be wholly absent from any theoretical framework. In many—indeed, presumably, in the overwhelming majority—it is tacitly presupposed, and nowhere denied. This is generally wholly non-problematic, although it would be a fatal flaw in any systematic philosophy, because any systematic philosophy not including a theory of being would be incomplete. Also fatally flawed, however, are theories that appear to deny being, despite (unavoidably) presupposing being. This is clarified by examples.
a. Examples of Neglectfulness of Being
i. Paired Philosophical Examples: van Inwagen and Lowe
van Inwagen 1996 includes (96) the following:
If the notion of an abstract object makes sense at all, it seems evident that if everything were an abstract object, if the only objects were abstract objects, there is an obvious and perfectly good sense in which there would be nothing at all, for there would be no physical things, no stuffs, no events, no space, no time, no Cartesian egos, no God…. When people want to know why there is anything at all, they want to know why that bleak state of affairs does not obtain.
Worth noting in passing is that speaking of states of affairs as “obtaining” or not “obtaining”—rather than as being or not being—is common in analytic philosophy, and is an evasion or neglect of being; what other than being could “obtaining” be? Be that as it may, by “there would be nothing at all,” van Inwagen 1996 explicitly means that there would be no non-abstract objects; there would, however, be, in his scenario, abstract objects, hence not the utter absence of being, which, as shown below, cannot be.
Lowe 1996, in response to van Inwagen 1996, includes (115) the following:
Suppose we could show that there couldn’t be a world containing only abstract objects, perhaps by arguing that abstract objects necessarily depend for their existence upon concrete objects: what would follow? Clearly, it would follow that van Inwagen’s ‘bleak’ state of affairs couldn’t obtain. And yet, in a perfectly clear sense, this wouldn’t suffice to show that it was necessary for something concrete to exist: for we wouldn’t have foreclosed the possibility that nothing at all—nothing either concrete or abstract—might have existed. To foreclose that possibility, it seems, we would need also to show that at least some objects, abstract or concrete, exist in every possible world.
For Lowe 1996, the possibility that “nothing at all—nothing either concrete or abstract—might have existed” is open if there is a possible world containing no concrete or abstract objects (see 111-12). That possible world would however have to be a possible world that would not only itself be but would be distinct from other possible worlds, including the actual world. It would, that is, be situated within being, and would not be—impossibly—the utter absence of being.
ii. An Additional Philosophical Example: van Inwagen
Other works by Peter van Inwagen are among the relatively few by analytic philosophers that recognize that there even might be a significant distinction between what the SSP terms being and be-ers, and hence a need for theories of being. van Inwagen 2008b includes (278) a conversation in which a fictional Alice argues that “being is a feature of everything,” asking, “who could deny that everything there is is?” The conversation leads to “the identification of being with self-identity” (287). The text recognizes as a possible alternative—attributed to Sartre, among unnamed others—that being is “an activity that things engage in, the most general activity that they engage in.” van Inwagen 2009’s treatment of this alternative includes (477) the following (quoted in part in Being and God, 196):
If there is a most general activity that a human be-er (or anything else that engages in activities) engages in—presumably it would be something like ‘living’ or ‘getting older’ [the phenomenon of cryopreservation, introduced above, reveals that this is not the case]—it is simply wrong to call it ‘being’. And it is equally wrong to apply to it any word containing a root related to ‘être’ or ‘esse’ or ‘existere’ or ‘toon’ or ‘einai’ or ‘Sein’ or ‘be’ or ‘am’ or ‘is.’ One cannot, of course, engage in this most general activity (supposing there to be such an activity) unless one is, but this obvious truth is simply a consequence of the fact that one can’t engage in any activity unless one is: if an activity is being engaged in, there has to be something to engage in it.
As Being and God notes, this passage fails to clarify or even to recognize being because it makes no attempt to explain the “is” of “unless one is,” or the “be” of “there has to be something.” According to the SSP, if one actually is, then one be being, and actually to be something is to be being something.
Perhaps also worth noting is that van Inwagen 2009 attempts to show that being is somehow superfluous or avoidable by introducing (478) a fictional Martian language with the following characteristics:
There are in Martian no substantives in any way semantically related to ‘être’ or ‘esse’ or ‘existere’ or ‘toon’ or ‘einai’ or ‘Sein’ or ‘be’ or ‘am’ or ‘is.’ (In particular, Martian lacks the nouns ‘being’ and ‘existence’….) There is, moreover, no such verb in Martian as ‘to exist’ and no adjectives like ‘existent’ or ‘extant’. Finally, the Martians do not even have the phrases ‘there is’ and ‘there are’.
van Inwagen’s Martian language does, however, include the following sentences (478-79, emphases added):
Everything is not a dragon.
It is not the case that everything is not (a) God.
I think, therefore not everything is not I.
It makes me strangely uneasy to contemplate the fact that it might have been the case that everything was not always I.
It makes me strangely uneasy to contemplate the fact that everything is not (identical with) anything.
It is a great mystery why it is not the case that everything is not (identical with) anything.
As the italicizations clearly show, each of these sentences includes a form of the verb “to be.” Being is thus neither superfluous nor avoided in Martian, and it would be open to Martian philosophers to introduce counterparts to “being,” “be-er,” “It be being,” and “It be being such that” into their philosophical languages.
iii. An Example from Physics: Krauss
A Universe from Nothing (Krauss 2012) presents itself (xiii) as responding to the question “Why is there something rather than nothing?” That it exhibits neglectfulness of being is evident from its assertion (xiv) that “‘nothing’ is every bit as physical as ‘something,’ especially if it is to be defined as the ‘absence of something.’” Any “nothing” that is physical is not, obviously, utter non-being. Nevertheless, additional details are worth noting.
According to Krauss 2012 (xvii), “perhaps the most surprising discovery in physics in the past century … has produced remarkable new support for the idea that our universe arose from precisely nothing.” The text later (58) clarifies “precisely nothing” as follows: “By nothing, I do not mean nothing, but rather nothing—in this case, the nothingness of what we normally call empty space.” Yet later (98), this “precisely nothing” is supplemented by several other factors, and becomes “essentially nothing”: “if inflation indeed is responsible for all the small fluctuations in the density of matter and radiation that would later result in the gravitational collapse of matter into galaxies and stars and planets and people, then it can be truly said that we are all here today because of quantum fluctuations in what is essentially nothing.” This passage clearly presupposes that matter, radiation, and quantum fluctuations be. Moreover, what is first described as empty space is later (104) said to be endowed with energy. As so endowed, it is “Nothing,” and (152) it “can effectively create everything we see, along with an unbelievably large and flat universe.” And yet,
it would be disingenuous to suggest that empty space, which drives inflation, is really nothing. In this picture one must assume that space exists and can store energy, and one uses the laws of physics like general relativity to calculate the consequences. So if we stopped here, one might be justified in claiming that modern science is a long way from really addressing how to get something from nothing. This is just the first step, however. As we expand our understanding, we will next see that inflation can represent simply the tip of a cosmic iceberg of nothingness.
So: a universe “created” by empty space endowed with energy is not a universe from nothing, despite earlier contentions to the contrary, and nothingness as a whole is “a cosmic iceberg.”
Krauss 2012’s cosmic-iceberg sense of nothing/nothingness is (170) “the absence of space and time” but the presence of quantum gravity, and although the text asserts at the outset (xiv) that all of its uses of “nothing” will be “scientific,” the following (174) passage indicates that rather than being required by any scientific theory, these uses are what “work” for the text’s author:
When I have thus far described how something almost always can come from “nothing,” I have focused on either the creation of something from preexisting empty space or the creation of empty space from no space at all. Both initial conditions work for me when I think of the “absence of being” and therefore are possible candidates for nothingness.
The continuation of this passage indicates that neither of these candidates adequately explains the universe as originating from nothing: “I have not addressed directly, however, the issues of what might have existed, if anything, before such creation, what laws governed the creation, or, put more generally, I have not discussed what some may view as the question of First Cause.”
Krauss’s suggested answer to this question is the multiverse (175), although Krauss 2012 nowhere asserts that the multiverse is nothing. Instead, it says (177) that “In a multiverse of any of the types that have been discussed, there could be an infinite number of regions, potentially infinitely big or infinitesimally small, in which there is simply ‘nothing,’ and there could be regions where there is ‘something.’” The empty regions would, of course, be regions. Moreover, Krauss includes the just-quoted contention about regions in which there is “simply ‘nothing’” despite having acknowledged (176) that “we don’t currently have a fundamental theory that explains the detailed character of the landscape of a multiverse … (… we generally assume that certain properties, like quantum mechanics, permeate all possibilities…).” In order to permeate all possibilities, the “property” quantum mechanics must of course somehow be.
So, Krauss 2012 in fact does not argue that the universe is created from nothing, even if “create” and “nothing” are understood in the idiosyncratic ways in which the book explains them. Each of its senses of “nothing” is an absence of be-ers of some kinds or other; yet each presupposes being.
6. Being and Existing
“Being” and “existing,” and “to be” and “to exist,” are synonymous within some philosophical frameworks. As indicated above, in the SSP they are not. In the SSP, existence is the mode of being only of factings within the contingently actual dimension of being (this term is fully explained below). Thus: in the SSP’s terminology, merely possible worlds and the entities within them are (or be), but do not exist.
Some works by Peter van Inwagen follow works of Quine in equating being and existing. The difficulties that ensue, particularly in van Inwagen 2008b, are instructive. The following passage (283) provides a fruitful starting point:
if one says of some woman that she doesn’t exist, one has to be wrong. If the woman in question is “there” to have something said about her, then she exists.
What, one might wonder, if the woman is “there” in a work of fiction? Of Sherlock Holmes, the text asserts (295) the following:
There does exist such a fictional character as Sherlock Holmes. He is as much a part of the World as is any of the short stories and novels in which he “occurs.”
This is problematic at best, because whereas one can buy copies of stories and novels wherein Sherlock Holmes is a character, one cannot acquire the services of Sherlock Holmes; this is an enormous ontological difference. Moreover, van Inwagen 2008b also asserts (111) the following:
Words like ‘dragon’ and ‘unicorn’ are not names for kinds of non-existent things. Rather, they are not names for anything of any sort, for there are no dragons for them to name.
This introduces an inconsistency: if Sherlock Holmes is “as much a part of the World as is any of the short stories and novels in which he ‘occurs,’” then the dragon Smaug is as much a part of the World as is J.R.R. Tolkien’s novel The Hobbit. This inconsistency might plausibly result from a failure to adequately revise, given that the passage about Sherlock Holmes appears in the Coda found only in the third edition of van Inwagen’s Metaphysics, whereas the passage about dragons also appears in the earlier editions. But a comparable inconsistency emerges within the Coda itself. That text denies (296) that “the maps that accompany copies of The Lord of the Rings must be maps of something,” but again, if Sherlock Holmes is a part of what van Inwagen 2008b calls the World because he appears in short stories and novels, then Middle Earth is a part of the World because it appears in novels, and Middle Earth is precisely what the maps accompanying copies of The Lord of the Rings are maps of. One might also ask the following: how could the maps in The Lord of the Rings be maps—rather than mere drawings—if they were not maps of anything?
Distinguishing between being and existing facilitates avoidance of problems of the sorts just identified in van Inwagen 2008b. According to the SSP, Sherlock Holmes, Smaug, and Middle Earth do not exist, but each is (or be), within the non-actual world within which it appears in fictional accounts.
7. Dimensions of Being
Everywhere, there be being, because all be-ers be, or engage in being. But qualification is necessary, because only actual be-ers be; possible but non-actual be-ers do not. This point may also be put as follows: every actual be-er is actively being, is engaged in being. In terms closer to Aristotle’s, to be an actual be-er is to be at work being that be-er. For organisms, as indicated above, to die is to cease to be at work being organisms.
Because possible but non-actual be-ers are not at work being themselves, their mode of being is derivative (see Structure and Being 463, 471). There is, then, no be-er at work being Sherlock Holmes. Sherlock Holmes’s being is derivative in the first instance from the being-at-work of Arthur Conan Doyle, and in additional instances from the being-at-work of those who read or recall Conan Doyle’s novels and stories, and those who present versions of Sherlock Holmes in films, other works of literature, and so forth, and those who assimilate or recall such versions. A volume of Holmes stories on a library shelf is at work preserving those stories, and it retains the capacity to present them; that capacity is activated when anyone reads the stories.
These modalities of being—contingently actual being (for example, of the volume of Holmes stories) and contingently non-actual being (for example, of Holmes)—require both explanation and supplementation. According to the SSP, there are three distinct modalities of being. Most broadly, there is the absolutely necessary dimension of being and the contingent dimension of being, which can also be termed the dimension of contingent be-ers. The contingent dimension of being includes the dimension of contingently actual be-ers and the dimension of contingently non-actual be-ers. In some other philosophical frameworks, the SSP’s dimension of contingently actual be-ers is termed the actual world, and its dimension of contingently non-actual be-ers the realm of merely possible worlds; for convenience, this article occasionally uses this terminology.
(To clarify: throughout the discussion that follows, the reader may substitute “propositioning” for “proposition.” The former is a technical term in the SSP’s philosophical language, while the latter is taken from ordinary English. The same holds for “sentencing” and “sentence” and “facting” and “fact.”)
According to the SSP, because modalities qualify or determine true propositions expressible by true sentences, and because true propositions expressible by true sentences are identical to actual facts, modalities qualify actual facts. They are, therefore, being’s own modalities. These modalities of being can be made explicit by means of a number of sentence operators, all of which articulate modalities of being. These operators include the following (with examples of arguments included).
(1) It is absolutely necessarily the case that it’s being.
This is considered below.
(2) It is contingently actually the case that there are parents.
There currently are parents, but there is no necessity that there be parents: there were no parents shortly after the Big Bang, and the time may come when there are no longer any parents.
(3) It is conditionally necessarily the case that every parent has at least one child.
Because it is only contingently actually the case that there are parents, it is not absolutely necessarily the case that every parent has at least one child. Because, however, to be a parent is to have at least one child, the modality of the relationship is nevertheless one of necessity. Differently put: if it is contingently actually the case that there are parents, then it is necessarily the case that every parent has at least one child.
(4) It is contingently non-actually the case that Sherlock Holmes is a detective (or: It is the case in the contingently non-actual worlds presented in various stories, novels, and films that Sherlock Holmes is a detective).
(5) It is necessarily not the case that Fred drew a round square.
(6) It is necessarily not the case that there is nothing (or: that nothing is).
Concerning the dimensions of being, the most important of these sentences are (1) and (6). Both Structure and Being and Being and God include arguments from the truth of versions of (6) to the truth of versions of (1). A variant is the following: By definition, it is possible for any contingent be-er not to be: for each contingent be-er, it is possible that it be, and possible that it not be, so its non-being is possible. Similarly, if contingent being were exhaustive of being—if all being were contingent being—then it would be possible that being not be. But being’s not being would be possible only if it were possible for non-being to be, and that is not possible. Therefore, being is not exhausted by—is not exhaustively—contingent being, and so must include necessary being as well.
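The bare shape of this argument can be displayed in standard modal notation. The rendering below is an illustrative gloss, not the SSP’s own formalism (the SSP’s full argument also relies on the thesis, developed in the following paragraphs, that “There is nothing” expresses no genuine proposition); read B as abbreviating “there be being”:

```latex
% Schematic gloss of the argument from sentence (6) to sentence (1).
% B abbreviates "there be being."
\begin{align*}
\text{(P1)}\quad & \text{If all being were contingent being, then } \Diamond\neg B
  && \text{(being's non-being would be possible)}\\
\text{(P2)}\quad & \neg\Diamond\neg B
  && \text{(non-being cannot be; compare sentence (6))}\\
\text{(C)}\quad  & \text{Not all being is contingent being; hence } \Box B
  && \text{(compare sentence (1))}
\end{align*}
```

From (P1) and (P2), modus tollens yields that not all being is contingent; the equivalence of ¬◇¬B with □B then gives the necessity claim in (C).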
Differently put: it would be possible for all being to be contingent being only if either “It be being such that it be absolutely non-being” or “It be being such that it be absolute-nothinging” expressed a proposition, because if either of these sentences expressed a proposition, that proposition would be identical to a fact at least in some possible world, and possibly (at some point) in the actual world. But these sentences, like the sentence “Fred drew a round square,” do not express propositions. According to the SSP, they express pseudo-propositions, and pseudo-propositions are not identical to facts in any world, actual or possible. Such sentences are therefore necessarily false. As indicated in Structure and Being (239n48), the sentence “Fred drew a round square” can be analyzed into the sentences “What Fred drew was round” and “What Fred drew was a square.” Each of these sentences expresses a proposition, but the conjunction “What Fred drew was round and was a square,” although grammatically correct, does not.
The sentence “It be being such that it be absolutely non-being” is similar, but somewhat more complicated. Its status is clarified by consideration of the more ordinary-sounding “There is nothing,” understood as expressing the pseudo-proposition There is absolute nothingness or There is absolute non-being. What makes these items pseudo-propositions is the fact that sentences of the form “There is such-and-such” express propositions only if the such-and-such somehow is. Any such-and-such, however, that in any way is, is not absolute nothingness, not absolute non-being. But if it is not even possibly the case that there be nothing, then it is absolutely necessarily the case that there be being.
Because there are contingent be-ers, and hence a contingent dimension of being or dimension of contingent be-ers, being is two-dimensional, including both the contingent dimension of being and the absolutely necessary dimension of being. And because it be possible that the entire contingent dimension of being not be, the primacy of being is, more specifically, the primacy of the absolutely necessary dimension of being.
The line of thought developed in the preceding paragraph can be put more technically as follows. The theoretical operator formulable as “It be being the case that,” which (as explained above) implicitly or explicitly governs every indicative sentence that expresses a proposition, situates its arguments within being, as does each of its modal variants (“It be being absolutely necessarily the case that,” “It be being contingently actually the case that,” and “It be being contingently not actually the case that”). All propositions or propositionings are arguments of such operators, hence so too are the sentences or sentencings expressing them. Pseudo-propositions, however, are not arguments of these operators, but sentences and sentencings can express pseudo-propositions; those that do are necessarily false. The sentencing “It be being such that it be absolute-nothinging” expresses a pseudo-proposition because the “It be being such that” applicable (in one of its forms) to every sentence and sentencing expressing a proposition situates that proposition within being, and absolute-nothinging can in no way be, hence cannot be situated within being.
Perhaps worth noting at least in passing is that the word “nothing” is non-problematically and indeed often helpfully included within everyday theoretical frameworks relying on ordinary English. For example, “Nothing” can answer the question “What are you doing?”, but this would not mean that the respondent was not breathing, metabolizing, holding their body in some position or other, and so forth, but would instead mean that the respondent was not doing anything that would prevent them from doing something with the questioner. As another example, “There’s nothing in the refrigerator” would not mean that the refrigerator contained no shelves, air, and so forth, but instead that the refrigerator contained nothing that the utterer of the sentence wanted to eat or to drink.
Two final words on this topic may be in order. First, such sentences as “Nothing might exist” and “There might someday be nothing” are, of course, grammatically non-problematic. From that it does not follow that they are semantically non-problematic. Again, the same holds for “Fred drew a round square.” Second: the question “Why is there something rather than nothing?”, said to be addressed in Krauss 2012, can now be easily shown to be incoherent. It would be coherent only if both “There be being” and “There be nothing” expressed genuine possibilities. But, as shown above in this section, “There be nothing” does not express a possibility. Therefore, the question cannot be answered; it can only—at best—be shown to rest on a false presupposition. This may be clarified by a more everyday example. Let’s say that Smith and Jones agree both that wine A is better than wine B, and that in any venue—wine shop, grocery store, restaurant, or whatever—that offers both wine A and wine B, wine A is significantly cheaper. If, at a restaurant, Jones asks Smith why the latter ordered wine B rather than wine A, the question can be answered only if both wines are available at the restaurant. If wine A is not available, Smith’s most appropriate response would be something like, “I didn’t order wine B instead of wine A, because wine A was not an option; of the wines that were options, I thought wine B was the best, although had wine A been on the menu, I would indeed have ordered it instead of wine B.”
8. Being and God
This section is far shorter than the book with which it shares its title, so a reasonable beginning for it is an explanation of the major differences between the two accounts. A thesis central to both accounts is put as follows in Being and God (1): “Any conception of ‘God’ that is not situated within an explicitly presented or implicitly presupposed theory of being as such and as a whole—and hence, obviously, any such conception presented in conjunction with the rejection of such theories—can only be a conception of something or other, an X, that putatively does or does not ‘exist’ beyond the world familiar to us and somehow separately from it, but that cannot ultimately be made either intelligible or reasonable.” Chapter 1 of Being and God criticizes various historical and contemporary approaches to the issue of God as inadequate because they are not situated within theories of being as such and as a whole; this article includes no such critiques. Chapter 2 of Being and God turns to Heidegger, at the heart of whose thought is the question of being, and it argues at length that Heidegger utterly fails to respond to that question in a philosophically defensible manner; this article does not repeat that critique. Being and God’s Chapter 3 develops the SSP’s theory of absolute being to the point at which coherence and intelligibility are increased by the introduction of the term “God”; this article presents a version of this theory (with minor alterations, and in its different terminology). Chapter 4 of Being and God, finally, criticizes Emmanuel Levinas and Jean-Luc Marion, the most important and influential of those thinkers who attempt—in the language of the central thesis introduced above—to produce conceptions of “God” in conjunction with rejections of theories of being. This article does not consider either Levinas or Marion.
As indicated in the preceding paragraph, the most important way the SSP’s treatment of the issue of God diverges from other treatments of that topic is by situating it within a theory of being as such and as a whole. A second divergence is also worth noting at this point. In contemporary philosophy, the issue “God” is generally treated within what is called the philosophy of religion. According to the SSP, this begs various questions and introduces various unnecessary complications. As is clear from (for example) Plato’s Euthyphro and Aristotle’s Metaphysics, the issue of God (or gods) is—no matter what else it may be—one that can be treated purely theoretically. That is how the SSP treats it. Consequently, the question addressed in this section is the following: does the inclusion of a facting appropriately designated as God increase the coherence and intelligibility of the SSP?
a. The Relation between the Contingent Dimension of Being and the Absolutely Necessary Dimension of Being
Given the preceding clarifications of the modalities of being and the status of absolute nothingness, the SSP’s alternative to the famous but—for reasons just given, incoherent—question “Why is there something rather than nothing?” is easily formulated and explained. The SSP’s question is the following: How is the inclusion within being of a contingently actual dimension best explained? There are in principle only three paths for exploration, and two of those paths are merely apparent. The first merely apparent path does not move beyond the contingently actual dimension of being, and thus leads—if it can even be said to lead—only to such superficial responses as “Well, there just are contingent be-ers.” The “path” that “leads” to such responses is merely apparent because no such response provides an explanation. The second merely apparent path would lead to the contingently non-actual dimension of being. That is indeed a distinct dimension of being, but it is one that, as non-actual, has no resources that could explain the inclusion within being of a contingently actual dimension and that, as derivative, cannot in any way be the source of any dimension from which it derives. The exclusion of these two merely apparent paths leaves open only the path to the absolutely necessary dimension of being. Because this is the only path, the questions to be asked are the following: how is that path followed, and where does it lead?
The first step along this path consists in determining the relation of the contingent dimension of being to the absolutely necessary dimension of being. According to Structure and Being (454-5, 458), Being and God (234-5), and TAPTOE (171), that relation is one of total dependence. Why? First, to say that the contingently actual dimension of being is independent of the absolutely necessary dimension of being would be to take the first of the two merely apparent paths rejected in the preceding paragraph. What then if the contingently actual dimension of being were said to be partially dependent on the absolutely necessary dimension of being? Such partial dependence is perhaps posited by some accounts of a deus absconditus, according to which God—or, one might say, the absolutely necessary dimension of being—brought the contingently actual dimension of being into being, and then severed relations with it. The problem is that no such account could explain the continuation in being of the contingently actual dimension of being; none, that is, could explain why that dimension of being does not cease to be. The thesis that the contingently actual dimension of being is totally dependent on the absolutely necessary dimension of being, however, does explain the continuing being of the contingently actual dimension of being: it is sustained in being by the absolutely necessary dimension of being.
The point made in the preceding paragraph can also be put as follows: being veridically manifests itself, according to SSP, such that it includes both an absolutely necessary dimension and a contingently actual dimension, and such that the latter dimension is totally dependent, for its initial and continuing being, on the former dimension. Challenges to these theses could be only of two sorts. First, it could in principle be argued that the SSP’s theoretical framework would be concretized with greater intelligibility and coherence if one or both of these theses were rejected or altered. Arguments given above in this section at least weigh heavily against any such course of argumentation, and perhaps even show that no such course of argumentation could be viable. Second, an alternative theory of being, lacking any version of the theses introduced at the beginning of this paragraph, could develop within an alternative theoretical framework. Were this to happen, that framework could be evaluated at a meta-systematic level of the SSP. In the absence of such an alternative theory, objections to the SSP’s theory along the lines of “Well, even if it’s the best explanation you can come up with, it might not be true” are vacuous. The SSP’s explanation is true, within its theoretical framework, and as true, it articulates factings that are constituents of being.
The next question is, does the total dependence of the contingently actual dimension of being on the absolutely necessary dimension of being make possible the further explication of the absolutely necessary dimension of being? Important to addressing this question is noting the inclusion within the contingently actual dimension of being of human be-ers as be-ers who are, both as thinking and as freely willing, intentionally coextensive with being as such and as a whole, and hence with the absolutely necessary dimension of being. The total dependence of such be-ers on the absolutely necessary dimension of being is however intelligible only if the absolutely necessary dimension of being likewise thinks and freely wills and is thereby intentionally coextensive with being as such and as a whole. Otherwise, what is intelligible to human be-ers would not be intelligible to the absolutely necessary dimension of being. The total dependence of human be-ers, in their being, cannot be explained as a relation to a dimension that is in no way cognizant of them or is in any way inferior to them.
The previous paragraph argues that a non-minded absolutely necessary dimension of being is not intelligible as that upon which the contingently actual dimension of being is totally dependent. What, then, of a minded absolutely necessary dimension of being? Such a dimension would not only be cognizant of the contingently actual dimension of being, but would also, as freely willing, be intelligible as that upon which the contingently actual dimension of being would be fully dependent: that there is within being a contingently actual dimension is explained by the free willing, by the absolutely necessary dimension of being, that it be.
In part because the contingently actual dimension of being includes human be-ers who make free decisions, the total dependence of that dimension on the free willing of the absolutely necessary dimension of being cannot be one of being determined in all respects. Instead, according to the SSP, what is freely willed by the absolutely necessary dimension of being is the being, as a whole, of the contingently actual dimension of being. This explains the inclusion within the dimension of being as a whole of the contingently actual dimension of being. Explanations of specific phenomena within the contingently actual dimension of being, on the other hand, are at least in the overwhelming majority of cases explained by other phenomena within that dimension.
At this point, the following question might be raised: even granting that the only way the inclusion within being of a contingently actual dimension can be explained is by its being freely willed by the absolutely necessary dimension of being, might this explanation nonetheless be false? The first thing to be said in response to this possible objection is that within the theoretical framework of the SSP, the explanation emerges as true. Because it does, it is the case that this is one of the ways in which being veridically manifests itself according to the SSP. The thesis that the inclusion within being of a contingently actual dimension is unintelligible and hence inexplicable cannot be situated within the SSP’s theoretical framework given the centrality, to that framework, of the thesis that being is universally intelligible. This of course does not rule out the possibility of theoretical frameworks within which some such thesis could be included, but if some such framework were to be developed and presented, then it could be assessed in comparison with that of the SSP. Only if it proved superior would the SSP give way to it.
b. God
Once the absolutely necessary dimension of being has been determined to have freely willed the being of the contingently actual dimension of being, and it has been determined, as for example in TAPTOE 5.2, that for be-ers within the contingently actual dimension of being it is good to be, it is appropriate to designate the absolutely necessary dimension of being as God.
To further explain this designation of the absolutely necessary dimension of being as God, it is helpful to introduce the principle of rank within being. This principle is the following:
(PRWB) No facting can arise exclusively from or be explained exclusively by any facting of a lower rank within being.
The rank within being of a given facting is determined by the extent of its sphere of influence, the latter understood as including both what the facting can influence, and what can influence the facting. Given this criterion, rocks have a relatively low rank within being, because (for example) they cannot be influenced by threats from animals or from human be-ers. Because of the ways they interact with other animals and with human be-ers, animals have considerably higher ranks within being than do rocks, but because they cannot be influenced by such things as arguments, they rank well below human be-ers. The sphere of influence of human be-ers has no limits, in that—given that human be-ers are intentionally coextensive with being as such and as a whole—humans can in principle be influenced by any constituent of being, precisely by thinking about it.
From the PRWB and the total dependence of the contingently actual dimension of being on the absolutely necessary dimension of being, it follows that the absolutely necessary dimension of being must be intentionally coextensive with being as such and as a whole, and must be free, because if it were not, then it would be of a lower rank within being than the human be-er.
Once the absolutely necessary dimension of being has been determined to be absolutely freely sustaining the being of the contingently actual dimension of being and to be appropriately designated as God, two additional lines of inquiry open. Following the first would involve confronting the many problems that arise following the introduction of God into the SSP; prominent among these is the problem of evil.
The second line of inquiry would require the crossing of a methodological watershed. The reason is that additional determination of the absolutely necessary dimension of being, or of God, may become possible through investigation of the contingently actual dimension of being as wholly dependent on the freedom of God. The question is, does the course of history provide evidence of God’s self-revelation within it such that the interpretive examination of history will make possible further determination of God—possibly as trinity, and possibly as having been incarnate? Both Structure and Being (459-60) and Being and God (3.7.4.1) identify this interpretive examination of history, which could include interpretive examination of such historical texts as the Bible, as a task for the SSP, but neither pursues this task. Nor does TAPTOE, and nor does this article.
c. The Principle of Rank within Being and Evolution
Biology, relying on its specific theoretical framework, treats specific empirical questions with specific concepts, assumptions, procedures, and so forth. What is essential is that it establishes that there has been development within the domain of animals and that among the many stages of this development there are similarities and dissimilarities. From this it concludes that there are specific connections among these stages. Finally, it interprets these connections as constituting a history of descent (particularly: human be-ers descended from some ancestor of the currently extant apes). All of this is correct if it is governed by the qualifier “according to the theoretical framework of biology.” What that means is, among other things, the following: within that framework, only certain questions are addressed; other questions have no place therein. Among the latter are the following: How is it possible that such an ascending development can have taken place? How is this ascending development ultimately to be explained, particularly given that within it there are be-ers with enormously different ranks within being?
The first and most central thesis that emerges in the SSP’s response to these questions is the following: If a development to higher ranks within being has taken place, then it was possible for it to have taken place. How is this possibility to be explained? First, this possibility was always a genuine ontological factor included among the be-ers within the contingently actual dimension of being, where evolution occurs. Already in the earliest and lowest (the purely physical) stages of the cosmos, the possibility for developments to all possible forms and stages, including that for the development of ontologically higher forms, is contained as an immanent factor in the be-ers found at those stages. If this were not the case, then it would be a miracle that these more highly-ranked entities developed as they in fact developed. But how is the immanent ontological status of this possibility of development to be clarified?
The SSP clarifies it as follows: First, comparison of any evolutionarily pre-human organism, with its sphere of influence, with any normal adult human be-er, with its sphere of influence, indeed reveals that the human’s sphere of influence is greater, and thereby that the human be-er is of a higher ontological rank (that is, of a higher rank among be-ers). But human be-ers, prior to their emergence in the course of evolution, are not simply absent from the contingently actual dimension of being; they are instead ontologically included within this dimension of being as possibilities, in that if and when the requisite complex configuration of non-human factings emerges, that configuration will be a human be-er. The emergence of human be-ers in the course of evolution is thus nothing like a teleportation from the contingently non-actual dimension of being (or from some merely possible world) into the contingently actual dimension of being. Instead, prior to the emergence of human be-ers in the course of evolution, there be non-human be-ers that have the capacity, in conjunction, to reconfigure themselves such that they cease to be when be-ers of higher ontological ranks, and eventually human be-ers, come to be (this is wholly comparable to the reconfiguration of sperm and egg cells considered above). The span of time, whatever its extent, that precedes the emergence of human be-ers within the contingently actual dimension of being is thus a gestation period for human be-ers. The same holds for organisms of all other kinds.
d. The SSP and Christianity
According to Structure and Being (332), “within the philosophical perspective developed here, Christianity is the incomparably superior religion.” The SSP includes this thesis because Christianity satisfies the following explicitly identified criterion (443): “only Christianity has developed a genuine theology: one that satisfies the highest demands and challenges of theoreticity.” The Christian religion thus provides the theoretician working within the framework of the SSP with a potentially valuable starting point in that Christian theology provides the theoretician with data potentially incorporable into the SSP’s theory of God. That no other religion provides such data is an empirical thesis. If it were shown to be false, or if it were to become false in the future—if a genuinely theoretical theology linked to any other religion were developed, identified, or discovered—then that theology, too, would provide data potentially incorporable into the SSP, and Christianity would, according to the SSP, cease to be the incomparably superior religion.
In part because Christian theology provides data for potential incorporation into the SSP, Being and God envisages, as the first central question to be addressed as the SSP seeks to further develop its theory of God by examining the history of the contingently actual dimension of being, the question of the degree to which God as articulated by that theory can be identified as the adequately articulated biblical-Christian God (see 252-3). It also, however, explicitly recognizes (271-2) the possibility that that degree would be insignificant. In addition, theoreticians working to further develop the SSP’s theory of God could focus on religions other than Christianity. Whether historical investigation will make possible further determination of God as articulated by the SSP and, if it does, how closely God, as further determined within the SSP, will resemble the God of any religion, are at this point open questions.
A final remark is in order. It concerns the relation between engaging in philosophy and being of religious faith. The philosopher who as a philosopher engages in theorization about God may or may not also be of religious faith, Christian or otherwise, and the Christian or person of other religious faith may or may not engage in philosophy. The philosopher who is not of religious faith may or may not be led by theoretical engagement with the issue of God to become of religious faith, Christian or otherwise, and the philosopher who is a Christian or of other religious faith may or may not be led by their theoretical engagement to alter or abandon that faith.
9. References and Further Reading
Gilson, Étienne. (1948) L’Être et L’Essence. Paris: Librairie Philosophique J. Vrin.
Gilson, Étienne. (1952) Being and Some Philosophers (2nd edition). Toronto: Pontifical Institute of Mediaeval Studies.
Gilson, Étienne. (2002) Thomism. The Philosophy of Thomas Aquinas. A Translation of Le Thomisme (6th and final edition). Toronto: Pontifical Institute of Mediaeval Studies.
Krauss, Lawrence. (2012) A Universe from Nothing. Why There Is Something Rather Than Nothing. New York: Free Press.
Lowe, E. J. (1996) “Why is there anything at all?” Proceedings of the Aristotelian Society, Supplementary Volume 70: 111-29.
Puntel, Lorenz B. (2008) Structure and Being. A Theoretical Framework for a Systematic Philosophy. Translated by and in collaboration with Alan White. University Park, PA: Penn State University Press.
Puntel, Lorenz B. (2011) Being and God. A Systematic Approach in Confrontation with Martin Heidegger, Emmanuel Levinas, and Jean-Luc Marion. Translated by and in collaboration with Alan White. Evanston, IL: Northwestern University Press.
van Inwagen, Peter. (1996) “Why is there anything at all?” Proceedings of the Aristotelian Society, Supplementary Volume 70: 95-110.
van Inwagen, Peter. (2009) “Being, Existence, and Ontological Commitment.” In Metametaphysics: New Essays on the Foundations of Ontology, edited by David J. Chalmers, David Manley, and Ryan Wasserman. Oxford: Clarendon, 472-506.
White, Alan. (2014) Toward a Philosophical Theory of Everything. New York: Bloomsbury Academic.
Franz Brentano was a major philosopher of the second half of the 19th century who had a strong impact on the development of early phenomenology and analytic philosophy of mind. Brentano’s influence on students such as Carl Stumpf and Edmund Husserl was extensive, but Sigmund Freud was also much inspired by Brentano’s teaching and personality. Along with Bernard Bolzano, Brentano is acknowledged today as the co-founder of the Austrian tradition of philosophy.
Two of his theses have been the focus of important debates in 20th century philosophy: the thesis of the intentional nature of mental phenomena, and the thesis that all mental phenomena have a self-directed structure which makes them objects of inner perception. The first thesis has been taken up by proponents of the representational theory of mind, while the second thesis continues to inspire philosophers who advocate a self-representational theory of consciousness.
Brentano’s interests, however, were not limited to the philosophy of mind. His ambition was greater: to make the study of mental phenomena the basis for renewing philosophy altogether. This renewal would encompass all philosophical disciplines, but especially logic, ethics, and aesthetics. Moreover, Brentano was a committed metaphysician, much in contrast to Kant’s transcendental idealism and its further developments in German philosophy. Brentano advocated a scientific method to rival Kantianism that combined Aristotelian ideas with Cartesian rationalism and English empiricism. He was a firm believer in philosophical progress backed up by a theistic worldview.
Franz Brentano was born into a distinguished German family of Italian descent whose influence on Germany’s cultural and academic life was considerable. His uncle Clemens Brentano (1778-1842) and his aunt Bettina von Arnim (1785-1859) were major figures of German Romanticism, and his brother Lujo Brentano (1844-1931) became an eminent German economist and social reformer. The Brentano brothers considered their family as “zealously Catholic” (Franz Brentano) and “highly conservative” (Lujo Brentano). Brentano’s early association with the Catholic Church complicated his life and affected his academic career in Würzburg and Vienna.
a. Life
Brentano was born on January 16, 1838, in Marienberg on the Rhine. He was one of five children who reached adulthood. He studied philosophy and theology in Munich, Würzburg, Berlin, and Münster. Among his teachers were the philologist Ernst von Lasaulx (1805-1861), the Aristotle scholar Friedrich Trendelenburg (1802-1872), and the Catholic philosopher Franz Jacob Clemens (1815-1862). Under Clemens’s supervision, Brentano first began a dissertation on Francisco Suárez, but then took his doctorate at Tübingen in 1862 with a thesis on the concept of being in Aristotle. He then enrolled in theology and was ordained a priest in 1864. After his habilitation in philosophy in 1866 in Würzburg, Brentano began his teaching career there, first as a Privatdozent, and from 1872 as an Extraordinarius. Among his students in Würzburg were Anton Marty and Carl Stumpf. The fact that he left the University of Würzburg shortly thereafter was due to a falling out with the Catholic Church. In the fight between liberal and conservative groups, which took place both inside and outside the Catholic Church at the time, Brentano wrote in a letter to a Benedictine abbot that he felt himself “caught between a rock and a hard place” (quoted in Binder 2019, p. 79). A key role was played by a document he had written for the bishop of Mainz in which he was critical of the dogma of papal infallibility proclaimed by the First Vatican Council in 1870.
Despite this rift, Brentano was able to continue his academic career. After he convinced the responsible authorities in Vienna that he was neither anti-clerical nor atheistic, he was appointed to a professorship of philosophy at the University of Vienna in 1874, supported among others by Hermann Lotze (1817-1881). As in Würzburg, Brentano quickly found popularity among the students in Vienna. These included Edmund Husserl, Alexius Meinong, Alois Höfler, Kasimir Twardowski, Thomas Masaryk, Christian von Ehrenfels, and Sigmund Freud. Privately, Brentano found connections to the society of the Viennese bourgeoisie. He met Ida von Lieben, daughter of a wealthy Jewish family, finally left the Catholic Church, and married her in Leipzig in 1880. This union was followed by a protracted legal dispute in which Brentano tried in vain to regain his professorship, which he had to give up as a married former priest in Austria. In 1894, after the unexpected death of his wife Ida, Brentano was left alone with their six-year-old son Johannes (also called “Giovanni” or “Gio”). One year later, he decided to end his teaching career, now as a Privatdozent, and left Austria first for Switzerland, then for Italy.
Although Brentano was offered several professorships in both countries, the 57-year-old decided to live as a private scholar from then on. In 1896, he acquired Italian citizenship and lived in Florence and Palermo. He continued to spend summers in Austria at his vacation home in Schönbühel on the Danube. In 1897 Brentano married his second wife, the Austrian Emilie Rüprecht. She not only took care of the household and his son, but also increasingly supported Brentano in his scientific work, since his eyesight had begun to fail around 1903. Due to the outbreak of World War I and the entry of Italy into the war, Brentano moved with his family to Zurich in 1915. During his time as a private scholar, he not only kept in touch with a small circle of students who had meanwhile made careers in Germany and Austria, but he was also in lively exchange with philosophers and intellectuals across Europe. Inspired by these contacts, Brentano remained highly active intellectually until his death on March 17, 1917.
b. Works
Brentano’s philosophical work consists of his published writings and an extensive Nachlass, which includes a large amount of lecture notes, manuscripts, and dictates from his later years. The bequest also contains a wealth of correspondence that Brentano exchanged with his former students (notably Anton Marty) and students of his students (notably Oskar Kraus). Other correspondents of Brentano were eminent scientists and philosophers such as Ludwig Boltzmann, Gustav Theodor Fechner, Ernst Mach, John Stuart Mill and Herbert Spencer.
As Brentano was a prolific writer, but reluctant to prepare his works for publication, it was mostly left to his students and later generations of editors to prepare publications from his Nachlass manuscripts. In doing so, the editors often took the liberty of adapting and modifying Brentano’s original text, often without marking the changes as such. As a result, only the works published by Brentano himself are a truly reliable source, while the volumes edited after Brentano’s death vary widely in editorial quality (see the warnings in References and Further Readings, section 3).
2. Philosophy of Mind
Brentano believed that psychology should follow the example of physiology and become a full-fledged science, while continuing to play a special role within philosophy. To meet this goal, he conceived of psychology as a standalone discipline with its own subject matter, namely mental phenomena. The basic principles that inform us about these phenomena are emblematic of his conception of the mind and will be discussed in more detail below: his thesis of the intentional nature of mental phenomena; his thesis of inner perception as a secondary consciousness; and his classification of mental acts into presentations, judgements, and phenomena of love and hate.
In later years, Brentano drew an important distinction between descriptive and explanatory (“genetic”) psychology. These sub-disciplines of psychology differ both in their task and the methods they need to accomplish that task. According to Brentano, an analytical method is needed to describe psychological phenomena. We must analyze the experiences, which often only appear indistinct to us in inner perception, in order to precisely determine their characteristics. Genetic psychology, on the other hand, requires methods of explanation; it must be able to explain how the experiences that we perceive come about.
Brentano prepares the ground for that distinction in his seminal work Psychology from an Empirical Standpoint, first published in 1874, and fleshes it out fully in the second edition of the Classification of Mental Phenomena in 1911. The second edition contains several new appendices, but it is still far from completing the book project as Brentano originally conceived it. According to this plan Brentano wanted to add four more books that give a full treatment of the three main classes of mental phenomena (presentations, judgements, acts of the will and emotional phenomena), as well as a treatment of the mind-body problem, which shows that some notion of immortality is compatible with our scientific knowledge of the mind (see section 4d).
While the Psychology from an Empirical Standpoint secured Brentano’s place in the history of philosophy of mind, it is not an isolated piece in Brentano’s oeuvre. It stands somewhat in the middle between his earlier work on The Psychology of Aristotle (1867) and his lectures on Descriptive Psychology (1884-1889). If we add to this Brentano’s lectures on psychology in Würzburg (1871-1873) and the works on sensory psychology in his later years (1892-1907), we see a continuous preoccupation with questions of psychology over more than 40 years.
a. Philosophy and Psychology
Brentano’s interest in the philosophy of mind was driven by the question of how psychology can claim for itself the status of a proper science. Taking his inspiration from Aristotle’s De Anima, Brentano holds that progress in psychology depends on progress in philosophy, but he takes this dependence to go both ways. This means that we can ascribe to Brentano two programmatic ideas:
Philosophy helps psychology to clarify its empirical basis as well as to determine its object of research.
Conversely, psychology contributes in various ways to many areas of philosophy, especially epistemology, logic, ethics, aesthetics, and metaphysics.
Implementing the first idea involved Brentano in addressing thorny questions of methodology and classification: What is the proper method for studying mental phenomena? How do mental phenomena differ from non-mental phenomena? Is consciousness a characteristic of the mental? How does consciousness of outer objects differ from what Brentano calls “inner consciousness”? How can we classify mental phenomena? The first two books of the Psychology from an Empirical Standpoint provide ample answers to these questions.
During his years in Vienna (1874-1895), Brentano’s interest shifted more and more towards his second programmatic idea. In this context, Brentano found it necessary to distinguish more sharply between a “descriptive” and a “genetic” psychology. With this distinction in hand, he tried to show what psychology might contribute to various parts of philosophy. While philosophy may be autonomous from genetic psychology, it builds on the resources of descriptive psychology. How this shift towards descriptive psychology gradually took hold of Brentano’s thinking can be seen from the titles of his Vienna lectures: “Psychology” (1874-1880), “Selected questions of psychology and aesthetics” (1883-86), “Descriptive psychology” and “Psychognosy” (1887-1891).
It was only in later years that Brentano returned to questions of genetic psychology. Texts from the last decade of his life were published posthumously in a volume entitled On Sensory and Noetic Consciousness. Psychology from an Empirical Standpoint III (1928, English translation 1981). The subtitle is misleading because this volume is not a continuation of Brentano’s earlier book. How Brentano planned to continue his Psychology has been succinctly described in Rollinger (2012).
b. Inner Perception
In studying mental phenomena, philosophers often emphasize a crucial distinction between two questions:
How can we obtain knowledge of our own mental acts?
How can we obtain knowledge of mental acts in other subjects?
In dealing with these questions, Brentano rejects the idea that separating the two questions means to postulate an “inner” sense in addition to the outer senses. This old empiricist idea inherited from Locke was still popular among philosophers and psychologists of Brentano’s time. As an alternative to this old view, Brentano suggests that we access our own mental phenomena by simply having them, without the need of any extra activity such as introspection or reflection.
To appreciate the large step that Brentano takes here, one must address several much-contested questions about the nature of our self-knowledge (see Soldati 2017). To begin with, how does Brentano distinguish between “inner perception” and what he calls “inner observation”? One way to do so is to consider the role that attention and memory play in accessing our own mental states. Like John Stuart Mill, Brentano argues that attending to one’s current experiences involves the possibility of changing those very experiences. Brentano therefore suggests that inner perception involves no act of attention at all, while inner observation requires attending to past experiences as we remember them (see Brentano 1973, p. 35).
But Brentano goes further than that. According to him, we must not think of inner perception as a separate mental act that accompanies a current experience. Instead, we should think of an experience as a complex mental phenomenon that includes a self-directed perception as a necessary part. Some scholars have taken this view to imply that inner perception is an infallible source of knowledge for Brentano. But this conclusion needs to be drawn with care. Take, for instance, a sailor who mistakenly thinks he sees a piece of land on the horizon. Inner perception is telling him with self-evidence that he is having a visual experience of a piece of land. And yet he is mistaken about what he sees, if he mistakes a cloud, let’s say, for a piece of land. Still, Brentano would say that it was not inner perception that misled the sailor, but rather his interpretation of the visual content as of a piece of land. Such misinterpretations are errors of judgement, or attentional lapses that occur in observing or attending to the content of our experience. In the end, Brentano seems committed to the view that unlike observation, inner perception has no proper content and therefore has nothing it could be wrong about.
How can Brentano defend this commitment? One way to do so would be to appeal to the authority of Aristotle and his view that we are seeing objects, while at the same time experiencing seeing them. While Brentano is always happy to follow Aristotle, he also bolsters his view with new arguments. In the present case he does so by drawing on the ontology of parts and wholes. A common understanding of parts and wholes has it that a whole is constituted by detachable parts, as a heap of grain is constituted by individual grains as detachable parts. But this model does not apply to inner perception, says Brentano. A conscious experience is not constituted like a heap of grain: you can’t detach the person who sees something from the subject who innerly perceives the seeing. This leads Brentano to the key insight that what appears to us in inner perception is nothing other than the entire mental act that presents itself to us.
How then does Brentano explain the immediate knowledge we have of our own mental phenomena? Having removed any appeal to a faculty of inner sense, Brentano ends up with a form of conceptual insight that could be summarized in the following way: Immediate knowledge of our present mental states comes with the insight that we can only conceptually separate what we see, feel, or think from the act of seeing, feeling or thinking. Such immediate knowledge becomes impossible when we consider the life of other people. In this case we are restricted to a form of “indirect knowledge” of their feelings and thoughts by listening to what they say or observing their behavior. This indirectness implies that we may only be certain that the other person feels or thinks something, without knowing what exactly it is that they feel or think. Here we are not just dealing with conceptual differences, but with a real difference between our own experiences and the mental phenomena we discover in the minds of others.
But are we able to “read” the minds of other people? In book II of his Psychology from an Empirical Standpoint, Brentano argues that psychology could not acquire the status of a proper science merely on the basis of the data of inner perception. We need to make sure that our conception of the mind is not biased by our own experiences. The data that help us to guard against such an egocentric bias include (a) biological facts and the behavior of others indicating, for example, that they feel hunger or thirst like we do, (b) the mutual understanding of communicative acts such as gestures or linguistic behavior, as well as (c) the recognition of behavior as voluntary actions performed with certain intentions.
In drawing on these resources, Brentano shows no skepticism towards our social instincts. It is part of our daily practice to infer from the behavior of others whether someone is hungry, whether he is ashamed because he has done something wrong, and so on. For Brentano, there is no reason why a scientific psychology should dispense with such inferences. On the contrary, he acknowledges that a psychology that limits itself exclusively to the knowledge of its own mental phenomena is exposed to the danger of far-reaching self-deceptions. Psychology must face the task of tracking down such deceptions.
Brentano thus provides a thoroughly optimistic picture of the empirical basis of psychology. With inner perception, it can count on the immediate and potentially error-free access that we have to our own experiences, while the fallacies that arise in self-observation can be rectified by relying on a rich repertoire of inferences about other people’s mental states.
c. Intentionality
Psychology is a self-standing science, says Brentano, because it has a specific subject matter, namely mental phenomena. In his Psychology from an Empirical Standpoint (1874), Brentano argues that these phenomena form a single and unified class thanks to a general characteristic that distinguishes them from all other phenomena: their intentional directedness to objects.
This so-called “intentionality thesis” has sparked a wide-ranging debate among Brentano’s pupils. Husserl and Twardowski, to mention just two, came up with different readings of what Brentano describes as the “intentional directedness towards an object.” Their disagreement paved the way for an interpretation (due to Oskar Kraus and later taken up by Roderick Chisholm) that ascribes to Brentano a fundamental change of view about the nature of intentionality. According to this interpretation, we find in the early Brentano a rich ontology of intentional objects that is threatened by inconsistency. Therefore, Brentano later developed a radical critique of non-real objects that forced him to accept a non-relational theory of intentionality. In the meantime, this interpretation has been disputed for various reasons: Some scholars (for example, Barry Smith and Arkadiusz Chrudzimski) have tried to resolve the alleged inconsistencies in Brentano’s early ontology, thus removing the need for a radical shift to a non-relational theory of intentionality, while other scholars (for example, Mauro Antonelli and Werner Sauer) have tried to show that Brentano’s later view is not very different from his earlier conception. Still others have discussed whether one finds at the core of Brentano’s theory a relational concept of intentionality that applies to some but not all mental phenomena, thus putting pressure on the thesis that all mental phenomena share the same feature of intentionality (see Brandl, 2023).
These different descriptions of intentionality played an important role in the reception of the concept by Brentano’s students. Following Twardowski, Husserl insisted on the distinction between the content and the object of an intentional act, stressing (against Twardowski) that some acts are intentional and are yet objectless, e.g. my presentation of a golden mountain. Following this line, Husserl developed a non-relational, semantic view of intentionality in his Logical Investigations (1900/1901), according to which intentional acts are acts of “etwas meinen” (meaning something). Going beyond Brentano and Twardowski, Husserl argues that acts of meaning instantiate ideal species that account for the objectivity of meaning. Although Brentano seemed to be aware of the semantic problem of the objectivity of meaning in some of his lectures, this was clearly not a central concern for his understanding of intentionality. Meinong, on the other hand, extends Brentano’s concept of intentionality in the opposite direction to Husserl’s: not only is there no sharp distinction between semantics and ontology in Meinong, but all mental acts, including my presentation of a golden mountain, in his view have an object, which may or may not exist, or may simply belong to the extra-ontological realm (Außersein). In any case, Meinong defends a fully relational view of intentionality.
Given the plurality of developments of the concept within his school and the plurality of descriptions offered by Brentano himself, one lesson to be drawn from this debate is that the scholastic terminology used by Brentano is much less informative than it has been taken to be. Brentano seems to use different terms for object-directedness in the same way that one uses different numerical terms to measure the temperature of a body in Celsius or Fahrenheit. In putting mental phenomena on different scales, metaphorically speaking, he highlights some of their differences and commonalities and describes these as “having the same intentional content” or “being directed at the same object.” Take sense perception, for example. If every mental phenomenon has an object, then both my seeing and my imagining an elephant have an object. While this is true according to one use of the term “object,” it is not true if we want to compare an act of perception with an act of hallucination. Then, the so-called “object” of the sensory experience might be better described as a “content” of experience. It fills the grammatical gap in the expression “I see X,” which the hallucinator might also use.
Another controversy concerns phenomena that do not appear to be intentional at all. For instance, one can simply be in a sad mood without being sad about a particular event. Or one may generally tend to jealousy, without that jealousy being triggered by any particular object. Brentano’s intentionality thesis has been defended on the grounds that all mental states fulfill a representational function. The function of being sad could be, for example, to cast any goal we are striving for in a negative light. This could explain the paralyzing effect of sadness. But there is also the possibility of allowing that some mental phenomena may have objects only in a derived sense: for instance, when a subject perceives herself as being in a certain mood. The mood or character trait in itself might be “undirected,” but it would still be an object of inner consciousness when the subject perceives herself as being sad.
Brentano was not yet concerned with the many further questions raised by a representational theory of the mind. Thus, it is not clear how he would treat activities, processes, and operations that underlie our sensory experiences, feelings or acts of the will. Would they have intentional content for him only if they are part of our consciousness? Or would Brentano regard them as non-intentional “physical” phenomena to be studied by the physiological sciences? Perhaps the best way to relate Brentano to the contemporary debate about such questions is to say that he promotes a kind of qualitative psychology that does not need to invoke unconscious mental processes (see Citlak, 2023).
d. Descriptive Psychology
In the manuscripts starting from 1875, in which Brentano worked on the continuation of his Psychology from an Empirical Standpoint, one finds the first explicit mention of a separation between descriptive and genetic psychology. The basic idea behind this distinction is that a science of psychology cannot get off the ground unless it is able to identify the components of consciousness and their interrelationships. This requires something like a “geography” of mental concepts, which descriptive psychology is supposed to provide.
Although Brentano approaches this issue in the spirit of the empiricism of Locke and Hume, his objectives are somewhat closer to Descartes’, that is, to find the source of conceptual truths that are self-evident. Combining empiricism about concepts with the search for self-evident conceptual truths is the key to understanding the aim of descriptive psychology, which can be summarized as follows: Descriptive psychology aims to determine the elements of consciousness on the basis of inner perception, and thus to show how we arrive at concepts and judgements that we accept as self-evident and absolutely true. How Brentano tries to implement this plan can be seen in some examples that he discusses in his lecture courses. In one example, he asks whether it is possible to make contradictory judgements. Brentano denies that this is possible as soon as we have a clear perception of the fact that the judgements “A exists” and “A does not exist” cannot both be true. In another example, he asks whether one can feel a sensation without attributing content to that experience. Again, Brentano argues that this is impossible, and that we know this once we have a clear grasp of what “feeling a sensation” means. And we get a clear grasp of what “feeling a sensation” means by describing or analyzing the experience of feeling a sensation into its more basic constituents, by providing examples of such cases, by contrasting different cases, and so on. Doubts about the validity of these results are possible, but they can be treated as temporary. They can only arise while our knowledge of the elements of consciousness and their connections is incomplete.
In summary, we can say that Brentano conceived of descriptive psychology as an epistemological tool. It is aimed at principles which, like the axioms of mathematics, are immediately self-evident or can be traced back, in principle, to immediately self-evident truths. The caveat “in principle” reminds us that we may have good reason to believe that these self-evident truths exist even if we do not yet know what they are.
3. The Triad of Truth, Goodness and Beauty
In the second half of the 19th century the question of the objective character of logic, ethics and aesthetics became a much-contested issue in philosophy. Brentano’s answer to this question is guided by a grand idea: In these disciplines, the theoretical principles of psychology find their application. The domain of application is fixed in each case by one of the three fundamental classes of mental phenomena: logic is the art of making correct judgements, notably judgements inferred from other judgements; ethics deals with attitudes of interest, such as emotional and volitional acts, which direct us to what is good; while aesthetics examines presentations and our pleasure in having them, which makes us acquainted with what is beautiful.
Brentano thus places psychology at the base of what appears to be a philosophical system. The idea of such a system, however, was severely criticized by Husserl, who accused Brentano of psychologism. From the point of view of logic, this accusation weighs heavily since Brentano transforms the laws of logic into laws of correct thinking. But the objection here is a general one. It can also be used to deny that ethics should be based on moral psychology, and that aesthetics should be based on the psychology of imagination. If Brentano has a good answer to Husserl, it must be a general one.
a. A Philosophical System
Brentano was never convinced by Husserl’s claim that phenomenology can only play a foundational role within philosophy if it is freed from its psychological roots. His answer to Husserl’s reproach was a counter-reproach: Whoever speaks of “psychologism” really means subjectivism, which says that knowledge claims may only be valid for a single subject. For instance, an argument may be valid for me, even if others do not share my view. Pointing to his notion of self-evidence, which is not a subjective feeling at all, Brentano sees his position as shielded from this danger. He regards Husserl’s accusation as based on confusion or deliberate misunderstanding.
The jury is still out on this. While most phenomenologists tend to agree with Husserl, others see nothing wrong with the kind of psychologism that Brentano advocates. Some go even further and insist that only by reviving Brentano’s project of grounding mental concepts in experience can we hope to avoid the dead ends in the study of consciousness to which materialism and functionalism lead (see Tim Crane, 2021).
An interpretation of Brentano that also tries to make progress on the issue of psychologism has been proposed by Uriah Kriegel:
In order to understand the true, the good and the beautiful, we must get a clear idea of (i) the distinctive mental acts that aim at them, and (ii) the success of this aim. According to Brentano, the true is that which is right or fitting or appropriate to believe; the good is that which is right/appropriate to love or like or approve of; and the beautiful is that which is right/appropriate to be pleased with (Kriegel, The Routledge Handbook of Franz Brentano and the Brentano School, p. 21).
Kriegel’s interpretation is inspired by a tradition, going back to Moses Mendelssohn, of conceiving of truth, goodness, and beauty as closely related concepts. Brentano gives this idea a psychological twist:
It is necessary, then, to interpret this triad of the Beautiful, the True, and the Good, in a somewhat different fashion. In so doing, it will emerge that they are related to three aspects of our mental life: not, however, to knowledge, feeling and will, but to the triad that we have distinguished in the three basic classes of mental phenomena (Brentano 1995, 261).
Brentano scholars must decide how much weight to give to the idea expressed in this quotation. While Kriegel believes that the idea deserves full development, others are more sceptical. They point out that such a system-oriented interpretation goes against the spirit of Brentano’s philosophising. Wolfgang Huemer takes this line when he suggests that “Brentano’s hostility to German system philosophy and his empiricist and positivist approach made him immune to the temptation to construct a system of his own” (Huemer 2021, p. 11).
Another question that arises at this point is how to integrate metaphysics into the system proposed by Kriegel. Brentano hints at a possible answer to this question by adding to the triad “the ideal of ideals,” which consists of “the unity of all truth, goodness and beauty” (Brentano 1995, 262). Ideals are achieved by the correct use of some mental faculties. Whether this also holds for the “ideal of ideals” is unclear. Perhaps Brentano is referring here to a form of wisdom that emerges from our ability to perceive, analyse and describe the facts of our mental life (see Susan Gabriel, 2013).
b. Judgement and Truth
Brentano adopts a psychological approach to logic which stands opposed to the anti-psychologistic consensus in modern logic that takes propositions, sentences, or assertions as the primary bearers of truth. Brentano’s approach starts from the observation that simple judgements can easily be divided into positive and negative ones. For Brentano, a simple judgement is correct if it makes the right choice between two responses. We can either acknowledge a presented object when we judge that it exists, or reject it when we judge that it does not exist. To illustrate this, suppose you have an auditory experience in which you are suddenly presented with a gunshot. The sound wakes you up and you have no clue as to whether what you heard was real or just a dream. You are presented with content (the gunshot heard), and now you take a stance on this content, either by accepting it (“yes, it’s a gunshot”) or rejecting it (“no, it’s not a gunshot”). The stance you take will be right or wrong, and the resulting judgement will be true or false.
Now, there are two different ideas here that need to be carefully distinguished. One is the following concept acquisition principle:
We acquire the concept of truth by abstracting it from experiences of correct judgement.
The other is a definition of truth:
A judgement is true if and only if it is correct to acknowledge its object or if it is correct to reject its object.
The idea of defining truth in this way raises the question of how it relates to the classical correspondence theory of truth. The critical issue here is the concept of “object,” and the ontological commitments connected with this term. The common view is that Brentano rejected the view that facts or states of affairs could be objects standing in a correspondence relation to a judgement. In a lecture given in 1889, entitled “On the Concept of Truth,” Brentano rehearses some of the criticisms that the correspondence theory has received. He draws particular attention to the problem of negative existential judgements such as “there are no dragons.” This judgement seems to be true precisely because nothing in reality corresponds to the term “dragon.” For Brentano, this speaks not only against the introduction of negative facts, such as the fact that there are no dragons, but also against the acceptance in one’s ontology of states of affairs that might obtain if there were dragons, for example the state of affairs that dragons can sing.
But abandoning the correspondence theory has its price. How can Brentano distinguish between truth on the one hand, and the more demanding notion of correctness on the other? Correct judgements should help us to attain knowledge. If we want to know whether the noise that we heard was a gunshot, we are not engaged in a game of guessing, but rather we are trying to be as reasonable as possible about the probability that it was in fact a real gunshot. What is it that makes our judgement correct, if it is not the correspondence with a real object or fact?
Brentano tries to capture this more demanding notion of correctness with his notion of “self-evidence.” He points out that the judgement of a subject can be true even if it is not self-evident to that subject. This leaves open the possibility that it is evident to another subject whose judgement would then be correct. Suppose your judgement is that John is happy. You may have good reasons for judging so, but the truth of this judgement is not self-evident to you. There is however a person, John, who might be able to judge with self-evidence that he is happy right now. To say that your judgement is true means that it agrees with the judgement of John when asked whether he is happy or not.
But what about judgements like “Jack has won the lottery”? In this case we can ask the company running the lottery whether this assertion is true, but we will find no one in a position to resolve this question in a self-evident judgement. How then can Brentano’s definition be considered a general definition of truth that applies to all judgements?
It is at this point that the distinction between a definition of truth and a principle of concept acquisition becomes crucial. When we ask how we acquire the concept of truth, the slogan “Correctness First!” tells us that we acquire the concept of truth only later: the first step is to recognize that judging with self-evidence means to judge correctly (see Textor 2019). But we must not conclude from this that the slogan also applies when we define the concept of truth. Towards the end of his lecture “On the Concept of Truth” Brentano notices that one can remove the notion of “correspondence” from the classical Aristotelian definition of truth, without making it incomprehensible or false. “A judgement of the form ‘A exists’ is true if and only if A exists; a judgement of the form ‘A does not exist’ is true if A does not exist.” Was Brentano then a pioneer of a minimalist theory of truth? It has been argued that this is at least an interesting alternative to the epistemological interpretation described above (see Brandl, 2017). It explains why Brentano in his lecture on the concept of truth finds it unproblematic that a definition of truth may seem trivial. In doing so, he allows that a definition need not be informative about how we acquire the concept of truth.
c. Interest and the Good
The distinction between defining a concept and explaining how we acquire that concept also plays a role in Brentano’s meta-ethical theory of correctness. A few months before his lecture on the concept of truth, the Viennese Law Society had asked Brentano to present his views on whether there was such a thing as a natural sense of justice. In his lecture “On the Origin of our Knowledge of Right and Wrong” (1889), Brentano gives a positive answer to the question posed by the Society, but he makes it clear that the term “natural sense” can be understood in different ways. For him, it is not an innate ability to see what is just or unjust. Rather, what Brentano is defending is the idea that there are “rules which can be known to be right and binding, in and for themselves, and by virtue of their own nature” (Brentano, 1889, p.3).
Brentano’s meta-ethical theory of right and wrong can therefore be seen as a close cousin of his theory of truth. It is a highly original theory because it steers a middle course between the empirical sentimentalism of Hume and the a priori rationalism of Kant. Brentano carves out a third option by asking: What are the phenomena of interest (as Brentano calls them) that form the basis of our moral attitudes and decisions? Introducing the notion of “correct love,” he proposes the following principle of concept acquisition: We acquire the concept of the good by abstracting it from instances of correct love. The term “love” here stands for a positive interest, in polar opposition to “hate,” which for Brentano includes any negative form of interest. For Brentano, these are phenomena in the same category as simple feelings of benevolence, but with a more complex structure that makes them cognitively much more powerful. By introducing these more powerful notions, Brentano hopes to show how one can take a psychological approach in meta-ethics that still “radically and completely [breaks] with ethical subjectivism” (ibid., p. xi, transl. modified).
We have already mentioned that Brentano denies that our moral attitudes and decisions are based on an innate, and in this sense “natural,” instinct. It may be true that we instinctively love children and cute pets, and that these creatures fully deserve our caring response. But such instinctive or habitual responses can also be misleading. We may instinctively or habitually love things that do not deserve our love, for example if we have become addicted to them. A theory that breaks with ethical subjectivism must be able to tell us why our love for children and pets is right and our love for a drug is wrong.
One way to approach this matter is to interpret Brentano as a precursor of a “fitting attitude” theory (see Kriegel 2017, p. 224ff). When we love an object that deserves our love, we may call this a “fitting attitude.” But to know what “fittingness” means, we have to turn to inner perception. Inner perception tells us, for example, when we love the kindness of a person, that this is a correct emotional response. Once we know that a person is kind, we know immediately that her kindness is something to be loved. This is “self-imposing,” as Kriegel says.
The question now is whether inner perception will also provide us with a list of preferences that is beyond doubt. For example, does inner perception tell us that being healthy or happy is better than being sick or sad, or as Brentano would put it: that it is correct to love health and happiness more than sickness or sadness? Such claims are open to counterexamples, or so it would seem. A person might want to get sick, or give in to sadness, or deliberately hurt herself. But there is a response Brentano can make to defend the self-imposing character of a preference-order. He could say that people have such deviant preferences only for the purpose of achieving some further goal. If we ask for good things that can be final goals, i.e. goals that we do not pursue for the sake of some other goal, then no one can reasonably doubt that health, happiness, and knowledge are better than sickness, sadness, and error.
Following this line, these cases could be treated in the same way as true judgements that are not self-evident to the subject making them: Even if it is not self-evident to a subject that health is a good, there may be subjects who take it as an ultimate goal and for whom it is a self-imposing good. Brentano can thus also claim to avoid a dilemma that threatens his meta-ethics: If goodness just depends on our emotional responses, this would make this notion fully subjective. But if it covers only those cases when the correctness of our love is self-evident to us, its domain of application would be very small indeed. Hence the idea that goodness must correlate with the responses of a perfect moral agent.
Brentano’s notion of correct love may explain how we acquire the concept of goodness, but it need not figure as part of a substantive definition of this concept. Cases of moral behavior whose correctness is self-imposing may still be informative, without comparing our moral behavior with an ideal moral agent. These cases suggest a preference order of goodness. For example, cases in which people sacrifice all their possessions to get medical treatment may help us to see why health is such a high good, perhaps even an ultimate good. And cases in which people risk their health for moral reasons may help us to see that there are goods that are ranked even higher than good health. Just as self-evident judgements lead us to the idea that truth is the highest epistemic value, so acts of love, whose correctness is self-imposing, can lead us to the idea of a supreme good.
d. Presentation and Beauty
Brentano concludes his analysis of normative concepts with an analysis of the concept of beauty. Extending his arguments against subjectivism, he attacks the common view that beauty exists only “in the eyes of the beholder.” For Brentano, beauty is a form of goodness that is no less objective than other forms of goodness. What the common view rightly points out is merely a fact about how we acquire the concept of beauty, namely by recourse to experience. Since this also holds for the concepts of truth and moral goodness, as Brentano says, beauty is no exception.
While we do not have a fully worked out version of Brentano’s aesthetic theory, its outlines are clear from lecture notes in his Nachlass (see Brentano: Grundzüge der Ästhetik, 1959). Again, taking a psychological approach, Brentano argues that it is not a simple form of pleasure that makes us see or feel the beauty of an object. As in the case of moral judgements, it is a more complex mental state with a two-level structure. When we judge something to be beautiful, the first-level acts are acts of presentation: we perceive something or recall an image from memory. To this, we add a second-level phenomenon of interest: either “delight” or “disgust.” This gives us the following principle of concept acquisition: We acquire the concept of beauty by abstracting it from experiences of correct delight in a presentation.
Brentano’s theory nicely incorporates the fact that people differ in their aesthetic feelings. Musicians feel delight when hearing their favorite piece of music, art lovers when looking at a favorite painting, and nature lovers when enjoying a good view of the landscape. What they have in common is a particular kind of experience, which is why they can use the term “beautiful” in the same way. It is the experience of delight that underlies their understanding of beauty. Yet it is not a simple enjoyment like the pleasure we may feel when we indulge in ice cream or when we take a hot bath. Aesthetic delight is a response that requires a more reflective mode. To feel a higher-order pleasure, we must pay attention to the way objects appear to us and contemplate the peculiarities of these representations.
This reconstruction of Brentano’s aesthetic theory suggests that Brentano does not need a substantive definition of beauty. He can get by with general principles that connect the notions of beauty and delight, mimicking those that connect goodness with love, or truth with judgement and existence. As early as 1866, Brentano used this formulation of such a principle in its application to aesthetics: something is beautiful if its representation is the object of a correct emotion (see Brentano, “Habilitation Theses”). While these principles come with a sense of obviousness, interesting consequences follow from Brentano’s explanation of how we acquire our knowledge of them.
First, it follows that just as we can be mistaken in our judgements and emotional responses, so we can fail to appreciate the aesthetic qualities of an object. People may feel pleasure from things that do not warrant such pleasure, or they may fail to respond with pleasure to even the most beautiful things in front of their eyes. A plausible explanation for such cases of aesthetic incompetence is that people can perceive things very differently. They may not hear what the music lover hears, or fail to see what the nature lover sees, and therefore cannot understand why they find these things so pleasurable.
Second, Brentano’s theory implies that no relation of correspondence between mind and reality will explain what justifies aesthetic pleasure. Such justification can only come from inner perception, which provides us with exemplary cases of aesthetic delight. In such cases, as in the case of moral feelings, the correctness of the enjoyment is self-imposing. Such experiences may serve as a yardstick for judging cases whose beauty is less obvious and therefore more controversial.
4. Epistemology and Metaphysics
Brentano’s interest in questions of psychology is matched by an equally deep interest in questions of metaphysics. In both areas Brentano draws inspiration from Aristotle and the Aristotelian tradition. We can see this from his broad notion of metaphysics, encompassing ontological, cosmological, and theological questions. The central ontological question for Aristotle is the question of “being as such”; what does it mean to say that something is, is being, or has being? This is followed by questions concerning the categories of being, e.g.: What are the highest categories into which being can be divided? The second part of metaphysics, cosmology, seeks to establish the first principles of the cosmic order. It addresses questions concerning space, time and causality. Finally, the pinnacle of metaphysics is to be found in natural theology, which asks for the reason of all being: Does the world have a first cause, and does the order of the world suggest a wise and benevolent creator of the world?
For Brentano, as for Kant, these metaphysical questions pose an epistemological challenge first of all because they tend to exceed the limits of human understanding. To defend the possibility of metaphysical knowledge against skepticism, Brentano takes a preliminary step which he calls “Transcendental Philosophy,” but without adopting Kant’s Copernican turn. On the contrary, for Brentano, Kant himself counts as a skeptic because he declares things in themselves to be unknowable. What Kant overlooked, Brentano argues, is the self-evidence with which we make certain judgements. From this experience, he believes we can derive metaphysical principles whose validity is unquestionable.
a. Kinds of Knowledge
Skeptics of metaphysical knowledge often draw a contrast between metaphysical and scientific knowledge. Brentano resisted such an opposition, opting instead for an integration of scientific knowledge into metaphysics. He must therefore face the following conundrum: How can philosophy integrate the results obtained by the special sciences while at the same time making claims that go beyond the scope of any of the individual sciences?
Kant’s attempt to resolve this dilemma is based on the special nature of synthetic judgements a priori. Such judgements have empirical content, Kant holds, but we recognize them as true by the understanding alone, independently of sensory experience. Metaphysical knowledge thus becomes possible, but it is constrained by the scope of the synthetic judgements a priori. Brentano rejects this Kantian solution as inconclusive. Whatever a synthetic judgement a priori may be, it lacks what Brentano calls “self-evidence.” Metaphysics must not be constrained by such “blind prejudices,” Brentano holds.
Rejecting Kant’s view, Brentano proposes his own classification of judgements based on two distinctions: He divides judgements into those that are made with and without self-evidence, and he distinguishes between judgements that are purely assertive in character and apodictic (necessarily true or necessarily false) judgements. Knowledge in the narrow sense can only be found in judgements that are self-evident or that can be deduced from self-evident judgements. This does not mean, however, that non-evident judgements have no epistemic value for Brentano. To see this, we can replace the assertive/apodictic distinction with a distinction between purely empirical and not purely empirical judgements. Crossing this distinction with the evident/not-evident distinction gives us four possible kinds of knowledge:
knowledge through self-evident purely empirical judgements
knowledge through non-self-evident purely empirical judgements
knowledge through self-evident not purely empirical judgements, and
knowledge through non-self-evident not purely empirical judgements.
To the first category belong judgements of inner perception in which we acknowledge the existence of a current mental phenomenon. Such judgements are always made with self-evidence, according to Brentano. For instance, you immediately recognize the truth of “I am presently thinking about Socrates” when you do have such thoughts and therefore cannot reasonably doubt their existence.
Category 2 contains the non-self-evident empirical judgements, including all empirical hypotheses. These judgements differ in the degree of their confirmation, which is expressed in probability judgements. Although they lack self-evidence, Brentano allows that they may have what he calls “physical certainty.” Repeated observation may lead us to the certainty, for instance, that the sun will rise tomorrow. It is a judgement with a very high probability.
Category 3 contains apodictic universal judgements which, according to Brentano, are always negative in character. For instance, the true form of the law of excluded middle is not expressed by “for all x, either x is F or x is not-F,” but rather by the existential negative form: “there is no x which is F and not-F.” The truth of such a judgement is self-evident in a negative sense, i.e. it expresses the impossibility of acknowledging the conjunction of a property and its negation.
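Brentano’s contrast between the two readings of the law of excluded middle can be set side by side in modern first-order notation (a rendering Brentano himself did not use; it is added here only for illustration):

```latex
% Affirmative universal form, which Brentano rejects as the true reading:
\forall x\,(Fx \lor \neg Fx)

% Brentano's negative existential form:
\neg\exists x\,(Fx \land \neg Fx)
```

In classical logic the two formulas are interderivable, so Brentano’s point concerns not their truth conditions but the character of the judgement involved: on his view the law is grasped as the self-evident rejection of an impossible combination, not as the affirmation of a universal disjunction.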
Finally, what about judgements in category 4? These judgements are similar to those which Kant classified as synthetic a priori. While Brentano denies that mathematical propositions fall into this category, taking them to be analytic and self-evident, he recognises a special status for judgements such as “Something red cannot possibly be blue,” that is, cannot be blue and red in the same place (see Brentano: Versuch über die Erkenntnis, p. 47). Or consider the judgement “It is impossible for parallels to cross”. These judgements are based on experiences described in the framework of commonsense psychology or Euclidean geometry. We can imagine alternative frameworks in which these judgements turn out to be false. But for practical reasons we can ignore this possibility and classify them as axioms and hence as knowledge.
Brentano’s official doctrine is that there are no degrees of evidence and that all axioms are self-evident judgements in category 3, despite the fact that many of them have been disputed. This puts a lot of pressure on explaining away these doubts as unjustified. Brentano was confident that this was a promising project and that it was the only way to show how metaphysics could aim at a form of wisdom that goes beyond the individual sciences. A more modest approach would emphasize the fact that axioms are not purely empirical judgements: they are part of conceptual frameworks like commonsense psychology or Euclidean geometry. These frameworks operate with relations (for example, of opposition or correlation) that may seem self-evident, but alternative frameworks in which other relations hold are conceivable. Treating axioms as knowledge in category 4 leads to a more modest epistemology, which might still serve metaphysics in the way Brentano conceived its role.
b. A World of Things
There are many possible answers to the question “What is the world made of?” The first task of ontology is therefore to compare the different possible answers and to provide criteria for choosing one ontology over another. After considering various options, Brentano settled on the view that the world is made of real things and nothing else. To emphasize that Brentano defends a particular version of realism, his view is called “Reism.”
To better understand Brentano’s Reism, it is necessary to follow his psychological approach. Crucial to Brentano’s view is the idea that irrealia cannot be primary objects of presentations. When we affirm something, we may mention all kinds of non-real entities. But the judgements we make must be based exclusively on presentations of real things. There are many terms that obscure this fact. Brentano calls these terms “linguistic fictions,” which give the false impression that our thoughts could also concern non-real entities. Brentano’s list of such misleading expressions is long:
One cannot make the being or nonbeing of a centaur an object [i.e. a primary object] as one can a centaur. […] Neither the present, past, nor future, neither present things, past things, nor future things, nor existence and non-existence, nor necessity, nor non-necessity, neither possibility nor impossibility, nor the necessary nor the non-necessary, neither the possible nor the impossible, neither truth nor falsity, neither the true nor the false, nor good nor bad. Nor can […] such abstractions as redness, shape, human nature, and the like, ever be the [primary] objects of a mental reference (F. Brentano: Psychology from an Empirical Standpoint, 1973, p. 294).
Commentators have spent a great deal of time examining the arguments Brentano uses to support his Reism (see Sauer 2017, p. 139). His method may be illustrated with the following example. Suppose that someone judges correctly:
Rain is likely to come.
Brentano proposes the following analysis of this statement:
Someone correctly acknowledges rain as a likely event.
This analysis shows that it is not the probability of rain that gets acknowledged, which would be something non-real. If the first statement is equivalent to the second, then the existence of a real thing is sufficient for the truth of both statements. This “truth-maker” is revealed in the second statement: It is a thinking thing that recognizes rain as an impending event, or, as Brentano says, a being that makes the judgement “rain exists” in the mode of the future and with some degree of probability.
With examples like this, Brentano inspired many philosophers in the analytic tradition to use linguistic analysis to promote ontological parsimony. Not many, however, would go as far as Brentano when it comes to analysing perceptual judgements. Brentano is committed to the view that in an act of perception there is only one real thing that must exist: the subject who enjoys the sensory experience. To defend this position, he relies on the following dictum: “What is real does not appear to us, and what appears to us is not real.”
Interpreting this dictum is a serious task. Does Brentano mean to say that all secondary qualities, such as colours, sounds and tastes, are not real because they are mere appearances? Or is he saying that secondary qualities have a special form of “phenomenal reality”? There seem to be good reasons counting against both claims. The phenomenal reality of colours, shapes and tastes counts against classifying them as “non-real,” and the deceptive character that appearances can have counts against a notion of “phenomenal reality” that applies to all appearances across the board. It is only by resolving such questions that a proper understanding of Brentano’s Reism can be achieved.
c. Substance and Accidents
Another classic theme in Aristotelian metaphysics is the relationship between substance and accidents, which include both properties and relations. Brentano divides substances into material and immaterial ones, and, in line with his reistic position, holds that both substances and accidents are real things in the broad sense of the word. Material substances are real because they exist independently of our sensory and mental activities. Mental substances, on the other hand, are real things because we innerly perceive them with self-evidence.
Both substances and accidents raise the question of how they exist in space and time. Let us start with space: Brentano holds on to the common view that material substances occupy portions of space (space regions), much in line with a container view of space. Extending this geometric view to mental substances, Brentano considers modeling them as points that occupy no location. To make sense of this idea, he introduces the limiting case of a zero-dimensional topology. It is a limiting case because the lack of any dimensions means that the totality of space in this topology is represented as a single point. Since, by definition, the totality of space does not exist in any other space, we get the desired result that a mental substance occupies no location. In a further step, Brentano then compares material substances with one- or multi-dimensional continua and argues that one can represent mental substances as boundaries of such continua without assigning them a location. This analogy between mental substances and points on the one hand, and material substances and one- or multidimensional continua on the other, forms the basis of Brentano’s version of substance dualism (on dualism, see below).
What about time? As in the case of space, Brentano represents points of time as boundaries of a continuum. With the exception of the now-point, they are fictions cum fundamentum in re, which Brentano sometimes also calls metaphysical parts. In contrast with space however, Brentano holds that time is not a real continuum. More precisely, it is an unfinished (unfertiges) continuum of which only the now-point is real. This makes Brentano a presentist about the reality of time. It is a view that seems counterintuitive since it implies that a temporal continuum, such as a melody, can only be perceived at the now-point because, strictly speaking, your hearing and your awareness of hearing the melody cannot extend beyond the now-point. Brentano’s solution to this problem was to argue for persistence in the tonal presentations of the melody through an original (that is, innate) association between these presentations. When you hear the fourth note of a melody, your presentations of the first three notes are retained, which accounts for the impression of time-consciousness (although technically there is no time consciousness). This account was very influential on Husserl (see Fréchette 2017 on Brentano’s conception of time-consciousness).
Now back to substance. We often characterize substantial change as a change of accidents inhering in a substance. For example, a substance may lose weight; a leaf on a tree may change its color from red to green, thereby losing an accident and gaining a new one. This common view is not Brentano’s view. On the contrary, for Brentano, spatial and temporal accidents are absolute accidents. This means that a substance cannot lose or gain any such properties. To make sense of this, Brentano does two things: first, he turns the traditional view of the substance-accident relation upside down. Instead of seeing the substance as a fundamental being and its accidents as inhering in it, Brentano changes the order: for him, accidents are more fundamental than substances. Second, he rejects inherence as a tie between substance and accident, and replaces it with the parthood relation: In his account, accidents are wholes of which a substance is a part. To illustrate this, take a glass of water which contains 200 ml of water at t1. After you have taken a sip from the glass, it then contains 170 ml of water at t2. According to Brentano, what changes are the wholes while the same substance is part of two different wholes at t1 and t2. Similarly, Brentano argues, when I heat the water in my cup, the water not only expands in space, but also takes up more or less “space” on a temporal continuum of thermal states. This temporal continuum of thermal states is also a continuum of wholes containing the same substance. Expansion does not require new properties to be added to a substance. It just requires an increase that is measurable on some continuum.
It did not escape Brentano that an Aristotelian realism about universals poses a serious problem. According to this view, universals exist in re, that is, not independently of the substances in which they occur. It follows, in the case of material substances, that an accident can exist in more than one place at the same time. It may also be that an accident ceases to exist altogether when all its instances disappear, but later begins to exist again as soon as there is a new instance in which it occurs.
At this point, one must consider again the role that Brentano assigns to linguistic analysis. Brentano relies on such an analysis when he interprets simple categorical judgements as existential judgements. This analysis should remove the root from which the problem of multiple existence arises. Consider, for example, the simple categorical judgement:
A. Some men are bald-headed.
Brentano reduces this judgement to the following existential judgement:
B. A bald-headed man exists.
In the categorical judgement A, the term “man” denotes a substance and the term “bald-headed” denotes an accident of that substance if the judgement is true. In the existential judgement B, we have a single complex term, “bald-headed man,” which corresponds to a complex presentation, independently of whether a bald man exists or not. The function of the judgement is to acknowledge the existence of the object presented in this way, without adding any further “tie” between substance and accident.
Brentano’s mereology provides us with an ontology that matches this semantic analysis. The idea is the following: The complex term “bald-headed man” denotes a whole, and its two component terms (“bald-headed” and “man”) both seem to denote parts of this whole. Now, which is here the substance, and which is the accident? Brentano decides that only the term “man” denotes a substance, and only the complex term “bald-headed man” denotes an accident. This means a substance, say the man Socrates, can be part of a whole (e.g. the bald-headed man Socrates) without there having to be another part which must be added to this substance in order for this whole to exist.
Brentano’s account of substance is peculiar because it goes against the principle of supplementation in extensional mereology, according to which a whole that has a proper part must also have a further proper part disjoint from it. Brentano does not follow that intuition: for him, properties like “bald-headedness” are not further parts that must be added to the substance to make it a whole. He thereby corrects the Aristotelian form of realism about universals at the very point that raises the problem of multiple existence.
d. Dualism, Immortality, God
One of the oldest problems in metaphysics is the so-called “mind-body problem.” Brentano confronts this problem in a traditional as well as in a modern way. The traditional setting is provided by the Aristotelian doctrine that different forms of life are explained by the different kinds of souls that organisms possess. Plants and animals have embodied souls that form the lower levels of the human soul, which also includes a “thinking soul” as a higher part. In this framework, the mind-body problem consists of two questions: (1) how do the lower parts of the human soul fulfill their body-bound function; and (2) how, if at all, does the thinking soul depend on the body?
The modern setting of the mind-body problem is provided by Descartes’ arguments for the immateriality of the soul. These arguments make use of considerations that suggest that the activity of the mind is thinking, and that thinking requires a substance that thinks, but not a body with sensory organs. Following the Cartesian argument, the mind-body problem becomes primarily a problem of causation: (1) how can sensory processes have a causal effect on the thinking mind; and (2) how can our thoughts have a causal effect on our behavior?
Brentano’s ambitious aim in his Psychology from an Empirical Standpoint was to bridge the gap between these two historical frameworks. If he succeeded, he would be able to use Cartesian ideas to answer questions arising within the Aristotelian framework, and he would be able to use Aristotelian ideas to answer questions about mental causation. But since Brentano left his Psychology unfinished, we do not know for sure how Brentano hoped to resolve the question, “whether it is conceivable that mental life continues after the dissolution of the body” (Brentano 1973, xxvii).
While Brentano did not get to the part of his Psychology that was meant to deal with the immortality question, he prepares this discussion in a chapter on the unity of consciousness. He begins with the following observation:
We are forced to take the multiplicity of the various acts of sensing […] as well as the inner perception which provides us with knowledge of all of them, as parts of one single phenomenon in which they are contained, as one single and unified thing (Brentano 1973, p. 97).
Here Brentano is only talking about the unity of the experience we have when, for example, we simultaneously hear and see a musician playing an instrument. But the same idea can be extended when we think about our own future. Suppose I am looking forward to a vacation trip I have planned. There is a special unity involved in this because I could be planning a trip without looking forward to it, and I could be looking forward to a trip that someone else is planning. In the present case, however, my intention and my joy are linked, and in a double way. Both phenomena relate to the same object: my future self that enjoys the trip. Using examples like this, an argument for the immortality of the soul could be made along the following lines:
The unity I perceive between my intentions and the pleasures I feel now does not rely on any local part of my body.
If it doesn’t rely on any part of my body now, it will not rely on my body in the future.
Therefore, the unity I perceive now may well outlive my body.
While the question of immortality concerns the end of our life, there is also a question about its beginning: How does it come about that each human being has an individual soul? Brentano takes up this question in a lengthy debate with the philologist and philosopher Eduard Zeller. The subject of this debate was whether Aristotle could be credited with the so-called “creationist” view, according to which the existence of each individual soul is due to God’s creation. Brentano affirms such an interpretation, and we may assume that it coincides with his own view of the matter. It is a view that presupposes a fundamental difference between human and non-human creatures, but also allows some continuity in the way souls enter the bodies of living creatures:
Lest this divine intercession should appear too incredible, Aristotle calls attention to the fact that the powers of the lower elements do not suffice for the genesis of any living being whatever. Rather, the forces of the heavenly substances participate in a certain way as a cause, thereby making such beings more godlike. The participation of the deity in the creation of man, therefore, has its analogy in the generation of lower life (Brentano 1978, 111ff.).
What this suggests is that individual souls of all kinds are created in such a way that God’s contribution to the process is recognizable. This is remarkable, because it means that the human soul owes its existence to a process of creation that is in many ways analogous to processes that enable the existence of plants and animals.
Such considerations fit nicely into the Aristotelian framework of the mind-body problem. Some scholars have therefore questioned whether Brentano could also advocate a Cartesian substance dualism. Dieter Münch for instance suggests that there are clear traces of a “monistic tendency” in Brentano (D. Münch, 1995/96, p. 137).
On the other hand, we also find in Brentano considerations that favor the Cartesian framework. For example, when Brentano offers a psychological proof of the existence of God, he follows closely in the footsteps of Descartes (On the Existence of God, sections 435-464). In this proof, Brentano criticizes what he calls an “Aristotelian semi-materialism,” and argues that the unity of consciousness is incompatible with any form of materialism. The tension between these arguments (which may have been modified somewhat by Brentano’s disciples) and the passages quoted above seems difficult to resolve. We may take this as a sign that the gap between the Aristotelian and Cartesian frameworks is too wide to be bridged (see Textor 2017).
5. History of Philosophy and Metaphilosophy
Brentano’s contributions to the history of philosophy are above all an expression of his metaphilosophical optimism. Brentano believed that philosophy develops according to a model that distinguishes phases of progress and phases of decline. Phases of progress are relatively rare and are followed by much longer phases of decline. Only a few philosophers, such as Aristotle, Thomas Aquinas, Leibniz and Descartes, meet the highest standards that Brentano applies. Nevertheless, he was optimistic that another phase of progress would come, and that it would provide concrete solutions to philosophical problems.
Brentano’s phase model is undoubtedly speculative. It is the result of his reflections on how to approach the history of philosophy. In his view, it makes a big difference whether one studies the history of philosophy as an historian or as a philosopher. Brentano tried to convey to his students the relevance of this distinction, and it is for this purpose that he most often invoked his phase model.
a. How to do History of Philosophy
In a lecture given to the Viennese Philosophical Society in 1888, entitled “On the Method of Historical Research in the Field of Philosophy” (a draft of which has been published in Franz Brentano: Geschichte der Philosophie der Neuzeit (1987, pp. 81-105)), Brentano offers his recommendations for doing history of philosophy. One should “approach the author’s thought in a philosophical way,” he says. This requires two critical competences: a specific hermeneutic competence, and a broad understanding of the main currents of progress and decline in philosophy.
The hermeneutic competence Brentano requires “consists in allowing oneself to be penetrated, as it were, by the spirit of the philosopher whose teachings one is studying” (ibid., p. 90). In his debate with Zeller, Brentano uses this requirement as an argument against a purely historical interpretation of Aristotle’s texts:
One must try to resemble as closely as possible the spirit whose imperfectly expressed thoughts one wants to understand. In other words, one must prepare the way for understanding by first meeting the philosopher philosophically, before concluding as a historian (Brentano: Aristoteles’ Lehre vom Ursprung des menschlichen Geistes (1911), p. 165).
The second requirement, namely an awareness of the main currents structuring the development of philosophy, brings us back to Brentano’s phase model. Such models were popular among historians in Brentano’s time. One of these historians was Ernst von Lasaulx, whose lectures Brentano attended as a student in Munich (see Schäfer 2020). A few years later, Brentano came across Auguste Comte’s model and subjected it to critical examination (see Brentano’s lecture “The Four Phases of Philosophy and its Current State” (1895)).
Brentano believes that progress in philosophy results from a combination of metaphysical interest with a strictly scientific attitude. He therefore disagrees with Comte on three points. Firstly, Comte fails to notice the repetitive cycles of progress and decline. Secondly, he does not see the classical period of Greek philosophy as a phase in which philosophers were driven by a purely theoretical interest. And thirdly, Comte mistakenly believes that modern philosophy had to pass through a theological and metaphysical phase before it could enter its scientific phase.
The broad perspective that Brentano takes on the history of philosophy leads him to cast a shadow over all philosophers who belong to a phase of decline. But this is not the only form of criticism to be found in Brentano. There are also independent and highly illuminating discussions of the works of Thomas Reid and Ernst Mach, as well as profound criticisms of Windelband, Sigwart and other logicians of Brentano’s time. From today’s point of view, these are all important contributions to the history of philosophy.
b. Aristotle’s Worldview
For Brentano, the study of Aristotle’s works was a major source of inspiration and went hand in hand with the development of his own ideas. Almost all his works contain commentaries on Aristotle, often in the form of a fictitious dialogue in which Brentano tested the viability of his own ideas and sought support for his own views from a historical authority.
Brentano was concerned with Aristotle’s philosophy throughout his life, from his early writings on ontology (1862) and psychology (1867), through his debate with Eduard Zeller on the origin of the human soul (1883), to his treatise on Aristotle’s worldview a few years before his death. Throughout these studies, we can observe Brentano applying his hermeneutic method to fill in the gaps he finds in Aristotle’s argumentation and to resolve apparent contradictions.
In his final treatment of Aristotle, Brentano focuses on the doctrine of wisdom as the highest form of knowledge. He approaches this problem by not confining himself to what Aristotle says in the Metaphysics, but “by using incidental remarks from various works” (Aristotle and His World View, p. ix), and by including commentaries such as Theophrastus’s Metaphysics. With the help of Theophrastus and other sources, he hopes to resolve what he believes are only apparent contradictions in Aristotle’s writings. Brentano’s ultimate aim in these late texts is to present his own doctrine of wisdom as the highest form of knowledge (see the manuscripts published in the volume Über Aristoteles 1986).
Another thorny issue taken up by Brentano is Aristotle’s analysis of induction. Brentano praises Aristotle for having recognized the importance of induction for empirical knowledge when, for example, he discussed the question of how we can deduce the spherical shape of the moon from observing its phases. It was, however, “left for a much later age to shed full light, by means of the probability calculus, upon the theory of measure of justified confidence in induction and analogy” (Aristotle and his World View, p. 35). Brentano’s own attempt to solve the problem of induction is tentative. Its solution, he says, will depend on the extent to which the future mathematical analysis of the concept of probability coincides with the intuitive judgements of common sense (ibid.).
Brentano follows in Aristotle’s footsteps in treating metaphysics as a discipline that includes not only ontology, but also cosmology and (natural) theology. The pinnacle of metaphysics, for Brentano, would be a proof of the existence of God. Brentano already hints at this idea in his fifth habilitation thesis: “The multiplicity in the world refutes pantheism and the unity in it refutes atheism” (Brentano, 1867).
Brentano worked extensively on a proof of God’s existence that relies on both a priori and a posteriori forms of reasoning (see Brentano: On the Existence of God, 1987). Here, too, it is not difficult to find the Aristotelian roots of Brentano’s thinking. In a manuscript from May 1901, Brentano writes: “Aristotle called theology the first philosophy because, just as God is the first among all things, the knowledge of God (in the factual, if not in the temporal order) is the first among all knowledge” (Religion and Philosophy, 90).
We can see from these late writings that there were at least two constants in Brentano’s work: one is his engagement with Aristotle, another one is his theism. But these early imprints are not the only ones. There is yet another historical source that became decisive for Brentano early on, namely the contemporary positivism of Mill, Spencer and Comte.
c. Positivism and the Renewal of Philosophy
In 1859, the Franco-Luxembourgish philosopher and sociologist Théophile Funck, who became Brentano’s brother-in-law in 1860, published a book entitled Philosophie et lois de l’histoire [Philosophy and Laws of History], in which he connects the positivist movement with a model of historical development resembling Brentano’s phase model. How much Brentano was impressed by positivism at this point in time can be seen in his review of another book by the same author:
In the most recent epoch there has appeared in the person of Auguste Comte a thinker lacking neither the enthusiastic zeal for the most sublime questions, nor the insight capable of linking ideas, which elevate the truly great philosopher above the mass of lesser minds. Mill does not hesitate to put him on the same level as Descartes and Leibniz; he even calls him superior to them, if not deeper, if only because he was able to bring to bear a similar spiritual force in a more advanced cultural epoch (Brentano 1876, 3. Our transl.).
This enthusiasm for Comte’s philosophy may have cooled in light of the anti-metaphysical tendencies in Comte’s thought and his reservations about psychology becoming a proper science. But there are other factors, too, that may have played a role in this context: For example, the discovery of Mill’s monograph on Comte, which Brentano read in French translation in 1868, may have suggested to him a fundamental agreement between British empiricism and French positivism. Also, at this young age, Brentano was preparing for an academic career in philosophy as a Catholic priest. This was a difficult task in Germany in those days. Brentano had to expect fierce opposition to his beliefs, especially in Würzburg. The study of English and French positivism may have seemed to him an appropriate means of countering such resistance.
One way of preparing for this opposition was to ensure that his psychology had a firm empirical basis. To do this, Brentano wanted to show that mental phenomena are subject to distinct laws which are nevertheless similar or analogous to the laws of physics. Here, Brentano could rely on his doctrine that philosophy (including psychology) and the natural sciences share a common method, a doctrine which he defended as one of his theses for the habilitation. But Brentano had yet to substantiate this claim with concrete examples. In this respect, the debate on the Weber-Fechner law provided him with a welcome opportunity.
Brentano states the law as follows: “It has been found that the increase of the physical stimulus which produces a just barely noticeable increase in the strength of the sensations always bears a constant relation to the magnitude of the stimulus to which it is added” (Psychology from an Empirical Standpoint, p. 67). Brentano, then, goes on to correct what he takes to be a common mistake in applying this law:
Since it was assumed to be self-evident that each barely noticeable increase of sensation is to be regarded as equal, the law was formulated that the intensity of sensation increases by equal amounts when the relative increase of the physical stimulus is the same. In reality, it is by no means self-evident that each barely noticeable increase in sensation is equal, but only that it is equally noticeable (ibid.).
This example highlights two key moves in Brentano’s thinking. While he fully acknowledges the significance of the Weber-Fechner law, he points out a mistake in the formulation of the law. It takes the sharp eye of a philosopher, trained in making precise conceptual distinctions, to reveal this mistake. For Brentano, the consequence is that there are two kinds of laws which together explain the experimental findings of Weber and Fechner: a physiological law, as well as a law of descriptive psychology. These laws are perfectly tuned to each other, which shows that they are similar or at least analogous (for details on the Fechner-Brentano debate, see the introduction by Mauro Antonelli to Brentano and Fechner 2015, and Seron 2023).
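Brentano’s objection can be made precise in modern notation. The following formalization is not in Brentano’s text; it is added here only as a gloss on the step he criticizes:

```latex
% Weber's empirical finding: the just-noticeable increase \Delta I of a
% stimulus stands in a constant ratio k to the stimulus magnitude I.
\[
  \frac{\Delta I}{I} = k
\]
% Fechner's logarithmic law follows only under the further assumption
% that every just-noticeable increase corresponds to an equal increment
% of sensation S; treating the increments as differentials and
% integrating then yields
\[
  S = c \,\ln \frac{I}{I_0},
\]
% where I_0 is the threshold stimulus. Brentano's point targets the
% added assumption: just-noticeable increments are equally noticeable,
% but not thereby equal in magnitude.
```

On this reading, Brentano accepts the first, empirical formula while rejecting the auxiliary premise needed to derive the second.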
In conclusion, it must be admitted that Brentano’s attitude to the positivist tradition remains somewhat ambiguous. This can also be seen in his criticism of Ernst Mach, who was appointed professor of philosophy at the University of Vienna in the same year that Brentano left the city. They shared the ambition of renewing philosophy by drawing upon the resources of the natural sciences. However, Brentano could not accept Mach’s theory of sensation, and the monism it implies. He believed that a firm distinction between physical and mental phenomena precluded such monism (see Brentano: Über Ernst Mach’s “Erkenntnis und Irrtum” 1905/1906, first published in 1988).
d. Philosophical Optimism
Brentano was a philosophical optimist through and through, firmly convinced that philosophy not only had a great past, but an even greater future ahead of it. He emphasized the continuity between the endeavors of philosophy and the empirical sciences, and he attacked in his lectures with great eloquence a cultural pessimism that was popular at the time. In addition, he backed up his optimism with his theistic worldview.
In his lecture “On the Reasons for a Loss of Confidence in the Area of Philosophy” (1874), Brentano admits that so far philosophy has not been able to keep pace with the progress of the natural sciences. It lacks continuity, consensus, and practical usefulness. However, Brentano denies that this is a permanent deficiency and counters the widespread view that the questions of philosophy cannot be treated as precisely as scientific questions. He therefore urges philosophers to orient themselves more closely to the natural sciences and, in particular, to take advantage of the new findings of physiology: “Now that even physiology is beginning to thrive more vigorously, we no longer lack for signs pointing to the time for philosophy, too, to awaken to productive life” (Brentano: “On the Reasons for a Loss of Confidence in the Area of Philosophy” 2022, 499).
With such colorful words, Brentano’s aim was to instill in his students a belief in the integrity of philosophy as a rigorous discipline. If this was not good enough to convince his audience, Brentano reminded them of the human capacity to search for ultimate reasons, thereby appealing to a religious impulse in some of his students. One of them was Alfred Kastil, who often changed Brentano’s words as an editor, but claims that the following passage comes from one of Brentano’s manuscripts:
Man demands by nature not merely a knowledge of what is [Kenntnis des Dass], but also a knowledge of why [Kenntnis des Warum]. For this reason alone, the knowledge of God, as that of the first reason, is a great good, but it is also so insofar as the most joyful conception of the world, the most blissful hopes are attached to it (Brentano: Religion und Philosophie, 253 fn. Our translation).
Yet it would be rash to reduce Brentano’s philosophical optimism to religious enthusiasm. Looking back on his own life, he speaks of the “duty of the wise man, having reached the age of maturity, to subject his religious convictions to an examination”, and in doing so, with all due respect for popular religion, to retain the freedom to oppose the unauthorized restrictions on research by an “ecclesiastical government” (Ibid., p. 251).
Clearly, Brentano’s conflict with the Catholic Church and the Austrian state left deep marks on his personality. But it was not only religious institutions that Brentano opposed. As a philosopher, he fought all his life against Kantian philosophy and the skepticism and relativism he saw as its corollary. And although he saw much of philosophy as grounded in psychology, he fought against the popular view that such psychologism implied a form of subjectivism.
6. References and Further Reading
a. Monographs published by Brentano
On the Several Senses of Being in Aristotle. Trans. R. George. Berkeley: University of California Press 1975. [first published in 1862]
Brentano’s PhD dissertation. Argues that being in the sense of the categories is the most basic meaning of being according to Aristotle.
The Psychology of Aristotle, Especially His Doctrine of the Nous Poietikos. Trans. R. George. Berkeley: University of California Press, 1977. [first published in 1867]
Brentano’s habilitation thesis in which he challenges the traditional view of the active intellect in Aristotle.
The Origin of Our Knowledge of Right and Wrong. Trans. R. M. Chisholm and E. H. Schneewind. London: Routledge 1960. [first published in 1889]
An expanded version of Brentano’s lecture bearing the same title, along with extensive footnotes and additions, which elaborate and defend his conception of the Good in terms of correct love.
Psychology from an Empirical Standpoint. Trans. A. C. Rancurello, D. B. Terrell, and L. L. McAlister. London: Routledge 1973. [first published in 1874]
Brentano’s most important book, in which he develops his conception of psychology as an empirical science based on inner perception. It was conceived as the first two parts of a six-part treatise on psychology, the remaining four parts of which were never published.
Von der Klassifikation der psychischen Phänomene. Berlin: Duncker & Humblot, 1911. [On the Classification of Mental Phenomena]
Reprint of the second book of his Psychology from an Empirical Standpoint along with an important appendix in which Brentano explains some changes regarding his earlier views. Included in the English translation of the Psychology from an Empirical Standpoint.
Untersuchungen zur Sinnespsychologie. Berlin: Duncker & Humblot 1907. [Investigations on Sensory Psychology]
A collection of papers previously published between 1890 and 1906 on sensory psychology. These papers are largely based on the Vienna lectures on descriptive psychology. Not yet translated into English.
Aristotle and His World View. Trans. R. George and R. M. Chisholm. Berkeley: University of California Press 1978. [first published in 1911]
Originally published as a chapter of an edited book on the great figures of philosophy and expanded into a book. Offers a general presentation of Aristotle’s metaphysics, in many respects influenced by Theophrastus’ reading of Aristotle.
Aristoteles Lehre vom Ursprung des menschlichen Geistes. Leipzig: Veit & Comp. 1911. [Aristotle’s Doctrine of the Origins of the Human Mind]
Brentano’s last word on the origin of the human soul in Aristotle in his debate with Eduard Zeller. Not yet translated into English.
b. Other Philosophical Works Published by Brentano
“Habilitation Theses”. Trans. S. Gabriel. In I. Tănăsescu et al. (eds.), Brentano and the Positive Philosophy of Comte and Mill. Berlin: DeGruyter 2022, 433-436. [first published in 1867]
A list of 25 theses (originally in Latin) that Brentano defended for his habilitation.
“Auguste Comte and Positive Philosophy”. Trans. S. Gabriel. In I. Tănăsescu et al. (eds.), Brentano and the Positive Philosophy of Comte and Mill. Berlin: DeGruyter 2022, 437-455. [first published in 1869]
The first and only published article of a planned series of eight in which Brentano critically examines Comte’s conception of philosophy, his classification of the sciences, and his view on the cycles and phases of the history of philosophy.
“Der Atheismus und die Wissenschaft”. 1873. [“Atheism and Science”] Reprinted in: Franz Brentano: Sämtliche veröffentlichte Schriften. Hgg. von Th. Binder und A. Chrudzimski. Volume IX: Vermischte Schriften. Berlin: DeGruyter 2019, 37-61.
Brentano’s reply to an article in a Viennese newspaper published anonymously a few weeks earlier. Not yet translated into English.
“Der neueste philosophische Versuch in Frankreich”. 1876. [“The Latest Philosophical Attempt in France”] Reprinted in: Franz Brentano: Sämtliche veröffentlichte Schriften. Hgg. von Th. Binder und A. Chrudzimski. Volume III: Schriften zur Ethik und Ästhetik. Berlin: DeGruyter 2011, 1-17.
Brentano’s anonymous review of a book by T. Funck-Brentano. Not yet translated into English.
“Das Genie”. Berlin: Duncker & Humblot. 1892. [On Genius]. Reprinted in: Franz Brentano: Sämtliche veröffentlichte Schriften. Hgg. von Th. Binder und A. Chrudzimski. Volume III: Schriften zur Ethik und Ästhetik. Berlin: DeGruyter 2019, 99-127.
Offers an exposition of his aesthetics according to which beauty is a property of acts of presentation and gives an account of the genetic preconditions of beautiful presentations. Not yet translated into English.
“On the Reasons for a Loss of Confidence in the Area of Philosophy”. Trans. S. Gabriel. In I. Tănăsescu et al. (eds.), Brentano and the Positive Philosophy of Comte and Mill. Berlin: DeGruyter 2022, 489-500. [first published in 1874]
Brentano’s Inaugural Lecture at the University of Vienna, in which he articulates his optimism about the progress in philosophy.
“Miklosich über subjektlose Sätze”. 1883. [Miklosich on Sentences without Subject Terms]. Reprinted in: Franz Brentano: Sämtliche veröffentlichte Schriften. Hgg. von Th. Binder und A. Chrudzimski. Volume IX: Vermischte Schriften. Berlin: DeGruyter 2019, 105-115.
An appreciative review of a short treatise by Franz Miklosich, a contemporary linguist, which Brentano included (minus the final paragraph) as an appendix in Our Knowledge of Right and Wrong (1889).
“The Four Phases of Philosophy and Its Current State.” Trans. B. M. Mezei and B. Smith. In Balázs M. Mezei and Barry Smith (eds.), The Four Phases of Philosophy. Amsterdam: Rodopi. 1998. [first published in 1895].
Brentano’s exposition of his cyclic view of the history of philosophy in phases of ascent and decline.
“On the Future of Philosophy”. Trans. S. Gabriel. In I. Tănăsescu et al. (eds.), Brentano and the Positive Philosophy of Comte and Mill. Berlin: DeGruyter 2022, 523-570. [first published in 1893]
Brentano’s very critical reply to a lecture on political education delivered by Adolf Exner on the occasion of Exner’s inauguration as rector of the University of Vienna.
c. Selected Works Published Posthumously from Brentano’s Nachlass
Since Brentano’s death, many editions of his manuscripts and lectures have been produced. Given the editorial policy of his first editors, which consisted in adapting the text to what they took to be Brentano’s considered view on these matters, some of these texts are heavily edited and not always historically reliable.
Philosophy of Mind and Psychology
Sensory and Noetic Consciousness. Trans. M. Schättle and L. L. McAlister. London: Routledge 1981.
A collection of manuscripts on psychology and metaphysics, misleadingly presented as the third book of the Psychology from an Empirical Standpoint.
Descriptive Psychology. Trans. B. Müller. London: Routledge 1995.
Material from the lecture courses held in Vienna in the late 1880s, which Husserl often refers to.
Briefwechsel über Psychophysik 1874-1878. [Correspondence on Psychophysics 1874-1878] Berlin: DeGruyter 2015. With an Introduction by Mauro Antonelli.
Brentano’s correspondence with Gustav Fechner.
Epistemology and Truth
Versuch über die Erkenntnis [Essay on Knowledge]. Leipzig: Felix Meiner 1925.
Contains an edition of the treatise Nieder mit den Vorurteilen! [Down with prejudices!], the most explicit attack of Brentano on Kant’s notion of the synthetic apriori.
The True and the Evident. Trans. R. M. Chisholm, I. Politzer, and K. R. Fischer. London: Routledge 1966.
Manuscripts and lectures from the period between 1889 and 1915 on truth and ontology, including the lecture “On the concept of truth”.
Logic, Ethics, and Aesthetics
Die Lehre vom richtigen Urteil [The Theory of Correct Judgement]. Bern: Francke 1956.
An edition of Brentano’s logic lectures, dealing among other things with the existential reduction of judgements.
The Foundation and Construction of Ethics. Trans. E. H. Schneewind. London: Routledge 1973.
An edition of various lectures by Brentano on practical philosophy.
Grundzüge der Ästhetik [Outlines of Aesthetics]. Bern: Francke 1959.
Brentano’s Vienna lectures on psychology and aesthetics, which include an account of the distinction between intuitive and conceptual presentations.
Ontology and Cosmology
Die Abkehr vom Nichtrealen. [The Turn Away from the Non-Real]. Bern: Francke 1966.
A collection of selected letters from Brentano to Marty and his later students on the motives which led Brentano to adopt Reism.
The Theory of Categories. Trans. R. M. Chisholm and N. Guterman. The Hague: Martinus Nijhoff 1981.
A collection of manuscripts of the late Brentano on metaphysics.
Philosophical Investigations on Space, Time, and the Continuum. Trans. B. Smith. London: Routledge and Kegan Paul 1988.
A collection of manuscripts of the late Brentano dealing in detail with his mereology and his conception of boundaries.
Religion and Christian Faith
On the Existence of God. Trans. S. Krantz. The Hague: Martinus Nijhoff 1987.
An edition of various lectures on metaphysics (especially cosmology) and on the proofs of the existence of God.
Religion und Philosophie [Religion and Philosophy]. Bern: Francke 1954.
A collection of numerous essays on cosmology and immortality, as well as on wisdom, chance and theodicy.
The Teachings of Jesus and Their Enduring Significance. New York: Springer 2021.
Further studies prepared shortly before Brentano’s death offering a final word on his conception of Christian belief.
On the History of Philosophy
Über Aristoteles [On Aristotle]. Hamburg: Felix Meiner 1986.
A collection of manuscripts on Aristotle, along with correspondence on related topics.
Geschichte der Philosophie der Neuzeit [History of Modern Philosophy]. Hamburg: Meiner 1987.
Material on Brentano’s lecture course on the history of philosophy from Bacon to Schopenhauer, as well as notes for his lecture on the proper method of doing history of philosophy.
Über Ernst Machs Erkenntnis und Irrtum [On Ernst Mach’s Knowledge and Error]. Amsterdam: Rodopi 1988.
Manuscripts and excerpts of lectures in which Brentano critically examines Mach’s positivism.
d. Secondary Sources
A valuable selection of earlier literature on Brentano up to 2010 can be found in the 4-volume collection Franz Brentano: Critical Assessment, edited by M. Antonelli and F. Boccaccini, Routledge 2019.
Works with an asterisk * are quoted in the text.
Antonelli, Mauro, and Thomas Binder (eds.) The Philosophy of Brentano. Studien zur Österreichischen Philosophie Vol. 49: Brill 2021.
Antonelli, Mauro, and Federico Boccaccini (eds.) Franz Brentano: Critical Assessment. Routledge Critical Assessment of Leading Philosophers. 4 volumes. Routledge 2019.
*Binder, Thomas. Franz Brentano und sein philosophischer Nachlass. DeGruyter 2019.
*Brandl, Johannes L. “Was Brentano an Early Deflationist about Truth?”, The Monist 100 (2017), 1-14.
*Brandl, Johannes L. (ed.) Brentano on Intentional Inexistence and Intentionality as the Mark of the Mental. Special Issue of Grazer Philosophische Studien. Volume 100 (2023).
*Citlak, Amadeusz. “Qualitative Psychology of the Brentano School and its Inspirations”, Theory and Psychology (2023), 1-22.
*Crane, Tim. Aspects of Psychologism. Harvard University Press, 2014.
Curvello, Flávio Vieira. “Brentano on Scientific Philosophy and Positivism.” Kriterion: Revista de Filosofia 62 (2021): 657-79.
Dewalque, Arnauld. “Brentano’s Case for Optimism.” Rivista di filosofia neoscolastica, CXL, 4 (2019): 835-47.
Fisette, Denis. La philosophie de Franz Brentano. Vrin 2022.
Fisette, Denis, and Guillaume Fréchette (eds.) Themes from Brentano. Vol. 44: Rodopi 2013.
Fisette, Denis, Guillaume Fréchette, and Hynek Janoušek (eds.). Franz Brentano’s Philosophy after One Hundred Years: From History of Philosophy to Reism. Springer Nature, 2021.
Fisette, Denis, Guillaume Fréchette, and Friedrich Stadler (eds.) Franz Brentano and Austrian Philosophy. Springer, 2020.
*Fréchette, Guillaume. “Brentano on Time-Consciousness.” In: U. Kriegel (ed.), The Routledge Handbook of Franz Brentano and the Brentano School. New York 2017, 75-86.
Fréchette, Guillaume, and Hamid Taieb (eds.) Descriptive Psychology: Franz Brentano’s Project Today. Special Issue of European Journal of Philosophy Issue 31 (2023).
*Gabriel, Susan. “Brentano at the Intersection of Psychology, Ontology, and the Good.” In: D. Fisette and G. Fréchette (eds.), Themes from Brentano. Brill, 2013. 247-71.
*Huemer, Wolfgang. “Was Brentano a Systematic Philosopher?” In: Antonelli, Mauro, and Thomas Binder (eds.), The Philosophy of Brentano. Studien zur Österreichischen Philosophie Vol. 49: Brill 2021, 11-27.
Kriegel, Uriah (ed.) The Routledge Handbook of Franz Brentano and the Brentano School. Taylor & Francis, 2017.
Massin, Olivier, and Kevin Mulligan. Décrire: La Psychologie De Franz Brentano. Vrin 2021.
*Münch, Dieter. “Die Einheit Von Geist und Leib: Brentanos Habilitationsschrift über Die Psychologie des Aristoteles als Antwort auf Zeller.” Brentano Studien. Internationales Jahrbuch der Franz Brentano Forschung 6 (1996), 125-144.
*Rollinger, Robin D. “Brentano’s Psychology from an Empirical Standpoint: Its Background and Conception.” In: I. Tănăsescu: Franz Brentano’s Metaphysics and Psychology (2012): 261-309.
Rollinger, Robin D. Concept and Judgment in Brentano’s Logic Lectures: Analysis and Materials. Vol. 48: Brill 2020.
*Sauer, Werner. “Brentano’s Reism.” The Routledge Handbook of Franz Brentano and the Brentano School. Routledge, 2017. 133-43.
Schaefer, Richard. “Learning from Lasaulx: The Origins of Brentano’s Four Phases Theory.” Franz Brentano and Austrian Philosophy. Brill (2020): 181-96.
*Seron, Denis. “Psychology First!” The Philosophy of Brentano. Brill, 2021. 141-55.
*Seron, Denis. “The Fechner-Brentano Controversy on the Measurement of Sensation.” In: D. Fisette et al. (eds.), Franz Brentano and Austrian Philosophy. Springer, 2020, 344-67.
*Soldati, Gianfranco. “Brentano on Self-Knowledge.” In: U. Kriegel (ed.), The Routledge Handbook of Franz Brentano and the Brentano School. Taylor & Francis, 2017, 124-129.
Tănăsescu, Ion (ed.) Franz Brentano’s Metaphysics and Psychology. Zeta Books 2012.
Tănăsescu, Ion, et al. (eds.) Brentano and the Positive Philosophy of Comte and Mill: With Translations of Original Writings on Philosophy as Science by Franz Brentano. De Gruyter 2022.
Tassone, Biagio G. From Psychology to Phenomenology: Franz Brentano’s ‘Psychology from an Empirical Standpoint’ and Contemporary Philosophy of Mind. Palgrave-Macmillan 2012.
Textor, Mark. Brentano’s Mind. Oxford University Press 2017.
Textor, Mark. “Correctness First: Brentano on Judgment and Truth.” The Act and Object of Judgment: Historical and Philosophical Perspectives. Eds. Ball, Brian Andrew and Christoph Schuringa: Routledge, 2019.
*Textor, Mark. “From Mental Holism to the Soul and Back.” The Monist 100.1 (2017): 133-54.
Textor, Mark. “That’s Correct! Brentano on Intuitive Judgement.” British Journal for the History of Philosophy 31.4 (2022): 805-24.
A definition of art attempts to spell out what the word “art” means. In everyday life, we sometimes debate whether something qualifies as art: Can video games be considered artworks? Should my 6-year-old’s painting belong to the same category as Wallis’ Hold House Port Mear Square Island (see picture)? Is the flamboyant Christmas tree at the mall fundamentally different from a Louvre sculpture? Is a banana taped to a wall really art? Definitions of art in analytic philosophy typically answer these questions by proposing necessary and sufficient conditions for an entity x to fall under the category of art.
Defining art is distinct from the ontological question of what kind of entities artworks are (for example, material objects, mental entities, abstractions, universals…). We do not, for example, need to know whether a novel and a sculpture have a distinct ontological status to decide whether they can be called “artworks.”
Definitions of art can be classified into six families. (1) Classical views hold that all artworks share certain characteristics that are recognizable within the works themselves (that is, internal properties), such as imitating nature (mimesis), representing and arousing emotions (expressivism), or having a notable form (formalism). A modified version of this last option is enjoying a revival in 21st century philosophy, where art is said (2) to have been produced with the aim of instantiating aesthetic properties (functionalism). Classical definitions initially met with negative reactions, so much so that in the mid-twentieth century, some philosophers advocated (3) skepticism about the possibility of defining art while others critiqued the bias of the existing definitions. Taking up the challenge laid out by these critics, (4) a fourth family of approaches defines art in terms of the relations that artworks enjoy with certain institutions (institutionalism) or historical practices (historicism). (5) A fifth family of approaches proposes to analyze art by focusing on the specific art forms—music, cinema, painting, and so on—rather than on art in general (determinable-determinate definitions). (6) A last family claims that “art” must be defined by a disjunctive list of traits, some borrowed from classical and relational approaches (disjunctivism).
The concept expressed by the word “art” may have a relatively recent, and geographically specific, origin. According to some, the semantic distinction between “art” and “crafts” emerged in Europe in the 18th century with the notion of “fine arts,” which includes music, sculpture, painting, and poetry (Kivy 1997, Chapter 1). Indeed, terms such as ars in Latin and tekhnê in Ancient Greek bear some relation to today’s concept of art, but they also referred to trades or techniques such as carpentry or blacksmithing. In the Middle Ages, “liberal arts” included things such as mathematics and astronomy, not only crafts. This old meaning of “art” survives in expressions such as “the art of…”—for example, the art of opening a bottle of beer. Similar remarks can be made for related non-Western notions, such as the Hindu notion of “kala” (कला), which involves sixty-four practices, not all of which we would call artistic (Ganguly 1962). Nowadays, “art” is usually understood in a more restricted sense than these traditional meanings.
These differences in terminology do not mean that past or non-Western cultures did not make art. On the contrary, making art is arguably a typical human activity (Dutton 2006). Moreover, the fact that a culture does not have a word co-referent with “art” does not mean that it does not have the concept of art or, at least, a concept that largely overlaps with it—see Porter (2009) against the idea that the concept of art emerged only with the notion of “fine arts.” The following definitions of art are thus intended to apply to the practices and productions of all cultures, whether or not they possess a specific term for the concept.
Defining art typically relies on conceptual analysis. Philosophers aim to provide criteria that capture what people mean in everyday life when they talk and think about art while, at the same time, avoiding conceptual inconsistencies. This methodology goes hand in hand with a certain number of criteria that any definition of art must respect.
a. Criteria
Despite the immense diversity of definitions of art, philosophers usually agree on a set of minimum criteria that a good definition must meet in order to respect both folk and specialist uses of the term (in art history, art criticism, aesthetics…) while avoiding mere trivialities (Lamarque 2010). Three classes of criteria can be distinguished: those specifying what a definition must include, those specifying what it must exclude, and those specifying the cases that a good definition must take into account.
i. What a Definition of Art Should Include
[i] Art of a mediocre quality.
[ii] Art produced by people who do not possess the concept of art.
[iii] Avant-garde, future, or possible art (for example, extraterrestrial art).
Criterion [i] touches on what might be called the descriptive project of an art definition. As noted by Dickie (1969), we sometimes use the word “art” descriptively—“The archeologists found tools, clothes, and artworks”—and sometimes evaluatively—“Wow, mom, your couscous is a real artwork!” Dickie points out that, in the past, the descriptive and evaluative (or prescriptive) uses of the term have often been confused. Introducing mediocre art as a criterion excludes the prescriptive or evaluative use. One reason for this is the practice of folk and professional criticism: we may talk of “bad plays”, “insipid books”, or “kitsch songs”, without denying that they are artworks. Doing so also avoids confusing personal or cultural preferences with the essence of art.
Criterion [ii] avoids excluding art produced before the 18th century, non-Western art, as well as the art brut (or outsider art) produced by people whose education or cognitive conditions make it unlikely that they are familiar with the concept of art. A good definition, therefore, does not require the artist to possess the contemporary, Western notion of art to produce artworks.
Criterion [iii] implies that a good definition must do more than designate a set of entities; it must be able to play an explanatory role and make fruitful predictions (Gaut 2000; Lopes 2014). Considering the upheaval that traditional art has undergone in the twentieth century with the emergence of cinema, conceptual art, and so on, this criterion takes on particular importance: any definition that fails to be predictive is doomed to soon be outdated.
ii. What a Definition of Art Should Exclude
[iv] Purely natural objects
[v] What purports to be art but has failed completely
[vi] Non-artistic artifacts (including those with an aesthetic function)
One of the few consensuses in aesthetics is that an artwork is an artifact [iv]—an object intentionally created for a certain purpose. Although a tree or the feathers of a peacock possess undeniable aesthetic qualities, they are not called artworks in the descriptive sense. Note that criterion [iv], as formulated, does not necessarily exclude productions by Artificial Intelligence, even if one denies that AI models are genuine creators of artifacts (see Mikalonytė and Kneer 2022 for empirical explorations of who is considered the creator of AI-generated art).
A “minimal achievement” is also required for an object to be an artwork [v]. Thus, if Sandrine attempts to play a violin sonata without ever having touched an instrument in her life, she risks failing to produce something identifiable as an artwork. It’s not that she will have played a “bad piece,” but rather that she will have failed to produce a piece of music at all.
Finally, [vi] a good definition of art must also exclude certain artifacts—and this, despite their aesthetic qualities. Even if the maker of a shoelace, a nail, or an alarm bell may have intentionally endowed them with aesthetic properties, this does not seem sufficient to qualify them as art. It is possible to create non-artistic aesthetic artifacts. It is also possible to create shoelaces that are artworks. However, this is—and should—not be the case for most of them.
iii. What a Definition of Art Should Account For
[vii] Borderline cases
A good definition must be able to reflect the fact [vii] that there are many borderline cases, cases where it is not clear whether the concept applies—such as children’s drawings, lullabies, paintings produced by animals, Christmas trees, rituals, jokes, YouTube playthroughs, drafts… A good definition might exclude or include these cases or even account for their tricky nature; in any case, it should not remain silent about them. After all, these are the cases that most often raise the question “What does ‘art’ mean?”
Most contemporary definitions respect these criteria or consider the difficulties posed by some of them. This is less the case for definitions dating from before the second half of the twentieth century.
2. Classical Definitions of Art
Although the focus here is on contemporary definitions of art within the analytic philosophy tradition, it is worth taking a quick tour of the definitions that previously prevailed in the West and the problems they face.
a. Mimesis
i. The Ancients
Plato, Aristotle, and their contemporaries grouped most of what is called “art” today under the heading of “mimesis”. In the Republic or the Poetics (for example, 1447a14-15), the focus is on works that would undeniably be considered artworks today: the poems of Homer, the sculptures of Phidias, works of music, architecture, painting, and so on.
Mimesis is an imitation or representation of the natural, in the sense that artists depict, represent, or copy movements, forms, emotions, and concepts found in nature, human beings, and even gods. The aim is not to achieve hyper-realism (à la Duane Hanson) or naturalism (à la Emile Zola), but to represent possible situations or even universals.
Perhaps surprisingly, even purely instrumental music was considered to be imitative. Beyond the representation of birdsong or the human voice, Aristotle among others believed that music could resemble the movements of the soul, and thus imitate the emotions, moods, and even character traits (vices and virtues) of sentient beings (see, for example, Aristotle’s Politics, 1340a). From that point of view, there is no such thing as non-representational art.
ii. The Moderns
The idea of mimetic art was hegemonic for around two millennia. Witness Charles Batteux who, in Les Beaux-arts réduits à un même principe (1746), popularized the concept of “fine arts” (Beaux-arts) and argued that the essence of the fine arts is their ability to represent. In particular, he sees them as an assemblage of rules for imitating what is Beautiful in Nature.
It is interesting to note that when Batteux wrote these lines, he was faced with the unprecedented development of music whose aim was not to imitate anything, as the genres of concerto, ricercare, and sonata were emerging. Batteux is opposed to the idea of non-representational music or dance, which, in his view, “goes astray” (“s’égarer”) (Batteux 1746, 1, chap. 2, §§14-15). Rousseau’s remarks (for example, 1753) on music—and, more particularly, on sonatas—also reflect the conception of art as mimetic among 18th-century thinkers.
iii. The Limits of the Mimesis Theory
The first problem with this theory is that imitation (mimesis) is not a necessary condition for the definition of art. Today, it seems clear that abstract art—that is, works that are not representational—is possible. Just think of Yayoi Kusama’s infinity nets. Thus, mimesis theorists seem to violate criterion [iii]: a definition of art must be able to include avant-garde, future, or possible art. Besides, Batteux’s assertion that non-representational music is defective seems to be a case where the descriptive and the prescriptive views of art are conflated. In the same vein, the notion of fine arts has also been vigorously criticized by feminist philosophers as it tends to exclude many works from the realm of genuine arts, including art practices traditionally associated with women, such as embroidery or quilting.
Nevertheless, a more charitable reading of the mimesis theory would see its criterion as going beyond obviously figurative representations. For example, Philippe Schlenker (2017) and Roger Scruton (1999) argue that all music, however abstract, represents or is a metaphor for real-world events, such as spatial movements or emotional changes. This idea captures the fact that, when listening to purely instrumental music, it’s easy to imagine a whole host of shapes or situations, sometimes cheerful and lively, sometimes melancholic and dark, that correspond more or less to the music’s rhythms, melodies, and harmony. Animations illustrating instrumental music also come to mind, as in Walt Disney’s Fantasia or Oskar Fischinger’s Optical Poem. In this broad sense, all art, even if it is not intended to be representational, can be seen as mimetic.
This attempt to save the mimesis theory can be criticized—see, for example, Zangwill (2010) on Scruton. However, even if one dismisses these criticisms, there seems to be a fatal problem: mimesis is in any case not a sufficient condition. For example, it cannot exclude non-artistic artifacts (criterion [vi]), nor can it properly account for borderline cases (criterion [vii]). Indeed, objects such as passports, souvenir key rings, or any sentence endowed with meaning are mimetic without falling into the category of art. This theory is therefore unsatisfactory.
b. Expressivism
Expressivist theories of art have been championed by many Romantic and post-Romantic philosophers, including Leo Tolstoy, R. G. Collingwood, Benedetto Croce, John Dewey, and Susanne Langer. Let us start with Tolstoy, often cited as a paradigmatic representative.
i. Tolstoy
The hegemony of the mimesis theory was gradually replaced by expressivist theories during the (late) Romantic period. Tolstoy is a flamboyant example. In What Is Art? (1898, quoted here from the 2023 edition), he defends the following thesis:
To evoke in oneself a feeling one has once experienced, and having evoked it in oneself, then, by means of movements, lines, colors, sounds, or forms expressed in words, so to transmit that feeling that others may experience the same feeling—this is the activity of art (Tolstoy 1898 [2023], 52).
Tolstoy’s expressivist view is particularly strong in that it implies, on the one hand, that the artist experiences a feeling and, on the other, that this feeling is evoked in the audience. Other versions of the expressivist thesis require only one or the other of these conditions, but for Tolstoy, until the audience feels what the artist feels, there is no art. Note also that, in an expressivist approach, communicating both positive and negative emotions can lead to a successful work. Thus, Francis Bacon’s tortured works are expressivist masterpieces since they communicate the author’s near-suicidal state (see Freeland, 2002, Chap. 6).
At first glance, expressivism is seductive: if we go to the movies, read a novel, or listen to a love song, is it not to undergo certain emotions? A second look, however, reveals an obvious problem: the existence of “cold” art. There seem to exist artworks whose purpose is not to communicate any affect. We can think of modern or contemporary cases, such as Malevich’s White on White or Warhol’s Empire—an 8-hour-long static shot of the Empire State Building. But we can also think of more traditional artworks, such as Albrecht Dürer’s Hare (see picture), a masterpiece of observation from nature that does not clearly meet Tolstoy’s expressivist criteria.
ii. Collingwood and Langer
A less demanding expressivist theory is advocated by Collingwood and refined by Langer. Unlike Tolstoy, Collingwood’s theory does not require the audience to feel anything: it’s enough that the artist has felt certain “emotions” and expresses them. Langer, for her part, argues that the artist does not need to feel an emotion, but she should be able to imagine it—and thus to have a certain knowledge of that emotion (see the IEP article on Susanne K. Langer).
According to one charitable interpretation, by “emotions” Collingwood means something similar to Susanne Langer’s notion of feelings:
The word ‘feeling’ must be taken here in its broadest sense, meaning everything that can be felt, from physical sensation, pain and comfort, excitement and repose, to the most complex emotions, intellectual tensions, or the steady feeling-tones of a conscious human life (1957, p. 15).
So, a broad category of mental states can be qualified as feelings, including being struck by a contrast of colors or a combination of sounds (Wiltsher 2018).
By “expression”, Collingwood means an exercise in “transmuting” emotion (in the broad sense) into a medium that makes it shareable, which would require an exercise of the imagination (Wiltsher 2018, 771-8). Langer’s theory of art is similar: she emphasizes that expressing feelings requires the use of a symbolic form to transmit what the artist grasps about the value that an event has for them (Langer 1967). In that sense, it is possible that Dürer expressed the elements that struck him in the hare he depicts.
iii. Limits of Expressivism
Although Collingwood and Langer’s theories are rich and sophisticated, they seem to run up against the same kind of objections as Tolstoy’s view, since they require that the artist has actually experienced what is supposed to be expressed in the work. However, it is doubtful that every artwork meets this criterion, and it is possible that many do not. An example is provided in Edgar Allan Poe’s The Philosophy of Composition (1846). He explains that he wrote The Raven purely procedurally, “with the precision and rigorous logic of a mathematical problem” (idem, 349), without reference to anything like feelings or emotions. Of course, Poe must have gone through certain mental states and feelings when writing this famous poem, but the point is that it is arguably not necessary that he had a feeling that he then expressed in order to create this artwork.
Even if one remains convinced that Poe and any other artist must in fact express feelings in the artworks they create, there is another problem with an expressivist theory which, as in the mimesis case, seems insurmountable: the expression of feelings is not a sufficient condition for art. Love letters, insults, emoticons, and a host of other human productions have an expressive purpose, sometimes brilliantly achieved, but this does not make them artworks. Expressivists, like mimesis theorists, seem unable to accommodate criteria [vi] and [vii]: excluding non-artistic artifacts and accounting for borderline cases (Lüdeking 1988, Chap. 1).
c. Formalism
After the Romantic period and the apex of expressivism, a new, more objectivist trend emerged, closer to the spirit of its contemporary abstract artists. Instead of focusing on the artist’s feelings, formalism attempts to define art by concentrating on formal aesthetic properties—that is, aesthetic properties that are internal to the object and accessible by direct sensation, such as the aesthetic properties of colors, sounds, or movements (see the IEP article on Aesthetic Formalism). This (quasi-)objectivist approach is infused by Kant’s view on beauty (Freeland 2002, 15)—see Immanuel Kant: Aesthetics.
Note that formalism has affinities with aesthetic perceptualism, the view that any aesthetic property is a formal aesthetic property (Shelley 2003). However, moderate formalism does not imply aesthetic perceptualism (Zangwill 2000): A formalist may accept that artworks possess non-formal relevant properties—for example, originality, expressiveness, or aesthetic properties that depend on symbols (for example, in a poem). Nevertheless, for the formalist, these properties do not define art. This is developed in section 6.
i. Clive Bell
One of the leading figures of formalism is Clive Bell (1914)—and, later, Harold Osborne (1952). According to Bell, the essence of visual art is to possess a “significant form,” that is, a combination of lines and colors that is aesthetically moving. This is how he introduces this idea:
There must be some one quality without which a work of art cannot exist; possessing which, in the least degree, no work is altogether worthless. What is this quality? What quality is shared by all objects that provoke our aesthetic emotions? What quality is common to Sta. Sophia and the windows at Chartres, Mexican sculpture, a Persian bowl, Chinese carpets, Giotto’s frescoes at Padua, and the masterpieces of Poussin, Piero della Francesca, and Cézanne? Only one answer seems possible–significant form (Bell 1914, 22).
A clear advantage of formalism over expressivism is that it can account for what was above called “cold art”. Malevich’s White on White, Warhol’s Empire, and Dürer’s Hare do not necessarily trigger strong emotions, but the arrangements of colors and lines in these works nevertheless possess notable aesthetic properties. Similarly, Edgar Allan Poe’s poem may not express the author’s feelings, but its formal properties are noteworthy. Another notable advantage of formalism—especially over the mimesis theory—is that it readily accounts for non-representational art, contemporary as well as ancient or non-Western. Indeed, Bell was particularly sensitive to the emergence of abstract art among his contemporaries.
ii. Limits of Formalism
The first problem for formalism is that there seem to be artworks that lack formal aesthetic properties—particularly among conceptual art. The prototypical example is Duchamp’s famous Fountain. While some might argue that the urinal used by Duchamp possesses certain formal aesthetic properties—its immaculate whiteness or its generous curves—these are irrelevant to identifying the artwork that is Fountain (Binkley 1977). Duchamp specifically chose an object which, in his opinion, was devoid of aesthetic qualities, and, in general, his ready-mades can be composed of any everyday object selected by the author (a shovel, a bottle rack…). Similarly, a performance like Joseph Beuys’ I Like America and America Likes Me seems devoid of any formal aesthetic properties—Beuys’ performance consisted mainly of being locked in a cage with a coyote for three days. Formalism is thus threatened by criterion [iii]: there are avant-garde or possible art forms that go beyond formalism. It should be noted, however, that Zangwill offers possible answers to these counterexamples, which are discussed in section 6.a.
A second problem for formalism is that the possession of formal aesthetic properties is not sufficient to be art. Again, the problem concerns criteria [vi] and [vii]: excluding non-artistic artifacts and accounting for borderline cases. As noted above (1.c.), there are a whole host of artifacts that are elegant, pretty, catchy, and so on in virtue of their perceptual properties but that are not artworks. Formalism seems unable to answer this objection (we will see how neo-formalism tries to avoid it in section 6).
3. The Skeptical Reaction
From the 1950s onwards, a general tendency against attempts to define art emerged among analytic philosophers. The main classical theories were being challenged by avant-garde art which constantly pushed back the boundaries of the concept. At the same time, a general suspicion towards definitions employing necessary and sufficient conditions emerged under the influence of Wittgenstein. In response, philosophers such as Margaret MacDonald and Morris Weitz adopted a radical attitude: they argued that art simply cannot be defined.
a. Anti-Essentialist Approaches
Wittgenstein (1953, §§ 66–67) famously endorses a form of skepticism with regard to the definition of games in terms of necessary and sufficient conditions. He suggests that there is no feature common to all games, only non-systematic features shared by some sub-categories of the super-category “games”—for example, one can win at football and chess—from which a general impression of similarity among all members emerges, which he calls a “family resemblance”.
Margaret MacDonald, a student of Wittgenstein, is historically the first to take up this argument for art (see Whiting 2022). In her pioneering work, she argues that artworks should be compared “to a family having different branches rather than to a class united by common properties which can be expressed in a simple and comprehensive definition.” (MacDonald 1952, 206–207) In other words: art has no essence and cannot be defined with sufficient and necessary conditions.
Several philosophers came to the same conclusion in the years following MacDonald’s publications. Among them, Morris Weitz’s (1956) argument is the most influential. It claims that the sub-categories of artworks (such as novel, sonata, sculpture…) are concepts that can be described as “open” in the sense that one can always modify their intensions, as their extensions grow due to artistic innovations. Since all art sub-categories are open, the general “art” category should be open too. For instance: is Finnegans Wake a novel (or something brand new)? Is Les Années a novel (or a memoir)? To include these works in the “novel” category, the definition of the term needs to be revised. The same applies to all the other sub-categories of artworks: they need to be revised as avant-garde art progresses. So, since the sub-categories of art cannot be closed, we cannot provide a “closed” definition of art with necessary and sufficient conditions. Weitz moreover thinks that art should not be defined since, in his view, this hinders artists’ creativity.
The view of art as family resemblances is neither a definition nor even a characterization of art—it’s essentially a negative theory. Indeed, the family-resemblance approach does not seem very promising as a positive theory of art. For instance, Duchamp’s In Advance of a Broken Arm looks more like a standard shovel than any other artwork. A naive family resemblance approach might lead to the unfortunate conclusion that either this work is not art or that all shovels are (Carroll 1999, 223). This is probably of little importance to a true skeptic who doesn’t think the category of artworks will ever be definitively settled. However, more moderate approaches have attempted to lend stronger epistemic power to the family resemblance approach.
b. The Cluster Approach
Berys Gaut (2000) agrees with MacDonald and Weitz that any attempt to define art in terms of necessary and sufficient conditions is doomed to fail. Nevertheless, he defends an approach that can guide us to minimally understand the term “art,” capture borderline cases (cf. criterion [vii]), and establish fruitful theories with other human domains (cf. criterion [iii]). This is the cluster approach, which takes the form of a disjunction of relevant properties, none of which is individually necessary, but various subsets of which are jointly sufficient. The idea is that, for something to qualify as an artwork, it must meet a certain number of these criteria, though none of them needs to be met by all artworks. Gaut provides the following list:
(1) Possessing positive aesthetic properties … (2) being expressive of emotion … (3) being intellectually challenging … (4) being formally complex and coherent … (5) having a capacity to convey complex meanings; (6) exhibiting an individual point of view; (7) being an exercise of creative imagination (being original); (8) being an artifact or performance which is the product of a high degree of skill; (9) belonging to an established artistic form (music, painting, film, and so forth); (10) being the product of an intention to make an artwork (2000, 28).
For the most part, these criteria correspond to definitions proposed by other approaches. Roughly speaking, criteria (1) and (4) correspond to formalism, (2) and (6) to expressivism, (9) and (10) correspond to relationalism, and criteria (7) and (8) correspond to Kantian art theory—see Immanuel Kant: Aesthetics.
Gaut points out that the list can incorporate other elements in such a way as to modify the content but not the global form of the account. The approach is therefore good at incorporating new types of art—criterion [iii].
c. Limits of Skepticism
Skeptical approaches, and in particular the cluster approach, have been vigorously criticized. Note, however, that the cluster approach has also inspired a non-skeptical family of “disjunctive definitions,” which is discussed in section 8.
A first and obvious objection against the cluster approach concerns the question of which criteria can be added to the list. For instance, why is the criterion “costs a lot” not on the list, when statistically many artworks cost a lot? (Meskin 2007) More profoundly, the cluster approach has no resources to reject irrelevant criteria—such as “has been made on a Tuesday,” which can be true of some artworks. It would be absurd to extend the disjunction indefinitely with such criteria. Hence, without an element to connect the properties on the list, we run the risk of arbitrary clusters such as “flamp,” which stands for “either a flower or a lamp” (Beardsley 2004, 57). Intuitively, the term “art” is not as arbitrary as “flamp.”
Another question one may ask regarding the cluster theory is how many criteria an item must meet to be art. Some philosophical papers possess properties (3-6) and yet are not artworks (Adajian 2003). Presumably, some criteria are weighted more heavily than others, but this again leads to problems of arbitrariness (Fokt 2014).
Regarding skepticism more generally, Adajian (2003) points out that it has few resources for demonstrating that art has no essence. For instance, Dickie (1969) notices a problem with the most influential argument (Weitz’s): it does not follow from the fact that all the sub-concepts of a category are open that the super-category is itself open. One can conceive of a closed super-category, such as “insect,” and sub-categories open to new, as of yet unknown, types of individuals—for example, a new species of dragonfly. Conversely, Wittgenstein may be right to argue that the super-category “game” is open, but this does not mean that sub-categories such as “football,” “role-playing game,” or “chess” are open. It seems, then, that there is no necessary symmetrical relation between open or closed sub-categories and open or closed super-categories.
4. Relational Definitions
Another reaction to classical definitions emerged in the second part of the 20th century. Contrary to skepticism, it did not give up on the attempt to spell out the necessary and sufficient conditions of art. Instead, it claimed that these conditions should not be found in the internal properties of artworks, but in the relational properties that hold between artworks and other entities, namely art institutions—for institutionalism—and art history—for historicism.
a. Institutionalism
Arthur Danto, in his seminal paper “The Artworld” (1964), describes Andy Warhol’s Brillo Boxes as what inspired his institutionalist definition. These boxes, exhibited in an art gallery, are visually indistinguishable from the Brillo boxes found in supermarkets. But, crucially, the latter are not artworks. Danto concluded that the identity of an artwork depends not only on its internal properties, but also (and crucially) on relational properties such as the context of its creation, its place of exhibition, or the profession of its author. This led him to propose that artworks are constituted by being interpreted (see Irvin 2005 for a discussion). So, Warhol’s Brillo boxes are distinct from commonplace objects because they are interpreted in a specific way. This means that when Warhol made his Brillo boxes, he had an intention to inscribe the work in an artistic tradition or to depart from it, to comment on other artists, to mock them, and so on. Warhol’s interpretation of his Brillo boxes is of a specific kind: it is related to the artworld—that is, to a set of institutions and institutional players made up of art galleries, art critics, museum curators, collection managers, artists, conservatories, art schools, art historians, and so on.
According to Danto, a child dipping a tie in blue paint could not, on his own, make an artwork, even if he intended to make it prettier. Picasso, on the other hand, could dip the same tie in the same pot and turn it into an artwork. He would achieve this by virtue of his knowledge of the artworld, which he intends to summon through such a gesture. As the example of the child and the tie shows, Danto’s position implies that the creator of art possesses explicit knowledge of the artworld. Danto is thus led to deny that Paleolithic paintings, such as those in the Lascaux caves, can be art—a consequence he seems happy to accept (Danto 1964, 195). This result is highly counter-intuitive given that almost everyone intuitively attributes artistic status to these frescoes. What’s more, Danto excludes the vast majority of non-Western art and art brut. These problems are discussed in more detail below (4.a.ii).
George Dickie’s (1969) institutionalist definition aims to avoid Danto’s counterintuitive results. Dickie distinguishes three meanings of the term “art”: a primary (classificatory) meaning, a secondary (derivative) meaning, and an evaluative (prescriptive) meaning. For Dickie, throughout the history of art up to, roughly, Duchamp, the three meanings were intertwined. The primary meaning of art is what Dickie seeks to define; it reflects the sense of the term which would unify all artworks. The secondary meaning refers to objects that resemble paradigmatic works—for example, a seashell may be metaphorically qualified as an “artwork” given its remarkable proportions, which are also found in many artworks. Finally, the third meaning corresponds to the evaluative use of the term art, as in “Your cake is a real work of art!”. Duchamp’s great innovation was to have succeeded in separating the primary from the secondary meaning, by creating works that in no way resembled paradigmatic works of the past (secondary meaning), but which nevertheless managed to be classified as artworks (primary meaning).
Indeed, Duchamp’s Bicycle Wheel and Camille Claudel’s La petite châtelaine are not linked by a relationship of perceptual resemblance but are nevertheless linked by the fact that both sculptures have been recognized by the guarantors of the artworld as possessing certain qualities. It is this institutional recognition that would allow us to classify these two very different sculptures as artworks.
Dickie’s original definition of art is the following:
A work of art in the descriptive sense is (1) an artifact (2) upon which [a] some society or some sub-group of a society [b] has conferred [c] the status of candidate for appreciation (Dickie 1969, 254).
Condition (1) is simply the criterion that artworks are produced by people. It is in condition (2) that one finds Dickie’s most significant contribution to the debate. Let’s consider its various features: the society or sub-group to which [a] refers should be understood as the relevant actors from the artworld (1969, 254). It may be a single person, such as the artist who created the artifact, but more commonly involves various actors such as gallerists, museum curators, critics, other artists, and so on. Condition [b] refers to the symbolic act of conferring a special status. Dickie compares it to the act of declaring someone as a candidate for alderman (idem, 255)—in a way reminiscent of Austin’s (1962) performative speech acts. This comparison illustrates that the symbolic act of conferring cannot be performed by just anyone in any context: the conferrer needs to have a special role that allows them to act on behalf of the artworld.
To return to the above example from Danto, a child dipping a tie in paint doesn’t have the institutional status necessary to turn the artifact into an artwork. On the other hand, his father, a museum curator, could take hold of the tie and confer on it the status of art brut by exhibiting it. To perform this act successfully, the father must, according to Dickie, confer on it [c] the status of candidate for appreciation. This doesn’t mean that anyone needs to actually appreciate the tie, but only that, thanks to the status the curator has conferred on it, people apprehend the tie in ways that they usually apprehend other artworks so as to find the experience worthy or valuable (Dickie 1969, 255). Of course, it seems circular to define an artwork using the notion of the artworld. Dickie readily admits this. Nevertheless, he argues that this circularity is not vicious, as it is sufficiently informative: we can qualify the artworld through a whole host of notions that do not directly summon up the notion of art itself. Dickie describes the artworld through, for instance, historical, organizational, and sociological descriptions that make the notion more substantial than describing it uninformatively in terms of the “people who make art art.”
It should also be noted that Dickie reformulated and then substantially modified his original definition in the face of various criticisms (see the section Further Reading below).
i. Advantages of Institutionalism
A definite advantage of institutionalist definitions is that their extension corresponds much better to the standard use of the term “art” than the classical definitions discussed above. Thus, since Dickie’s improvement over Danto’s initial idea, institutionalism fulfills criteria [i] to [vi], at least partially.
Indeed, it allows for the inclusion of [i] mediocre art, [ii] art produced by people who do not possess the concept “art,” and [iii] avant-garde, future, or possible art—insofar as the status of artwork is attributed by the institutions of the artworld to these works. We’ll see that this last condition is open to criticism, but in any case, it allows institutionalism to explain without any problem why Duchamp’s readymades are artworks, which is not obvious from classical or skeptical theories. Art brut, cult objects (for example, prehistoric ones), video games, chimpanzee paintings, music created by AIs, non-existent or extraterrestrial types of objects, but also natural objects (for example, a banana) can become art as soon as these entities are adequately recognized by the artworld. In this way, arguments such as “It’s not art, my 5-year-old could do the same” that one sometimes hears about contemporary art lose all their weight.
Institutionalism also makes it possible to exclude [iv] purely natural objects, [v] what purports to be art but has completely failed, [vi] and non-artistic artifacts, including those with an aesthetic function—as long as the status of artwork has not been conferred on these objects by the relevant institutions. Again, this last condition is debatable, but it does help to explain why Aunt Jeanne’s beautiful Christmas tree or Jeremy’s selfie on the beach are not considered artworks.
ii. Limits of Institutionalism
Levinson (1979) and Monseré (2012) argue that primitive or non-Western arts, as well as art brut, are not made with an intention related to the artworld. But they do not need to wait for pundits—a fortiori from the Western artworld—to be appropriately qualified as art. Think for instance of Aloïse Corbaz’s work before Dubuffet or Breton exhibited her, or Judith Scott’s before her recognition in the art brut world. Through their exhibitions of art brut and non-Western art, Dubuffet and Breton did not transform non-artistic objects into artworks; rather, they helped to reveal works that were already art. Turning to another example: in the culture of the Pirahãs, a small, isolated group of hunter-gatherers living in the forests of Amazonia, there is no such thing as the artworld (Everett 2008). Yet this group produces songs whose complexity, aesthetic interest, and expressiveness clearly make them, in the eyes of many, artworks. Similar remarks can be made about prehistoric works, such as those in the Chauvet cave.
What underlies counterexamples such as art brut, non-Western, or prehistoric art is that there seem to be reasons independent of institutionalization itself that justify artworld participants institutionalizing certain artifacts and not others. In line with this idea, Wollheim (1980, 157-166) proposed a major challenge for any institutional theory: either artworld participants have reasons for conferring the status of art or they do not. If they have reasons, then these are sufficient to determine whether an artifact is art. If they do not, then institutionalization is arbitrary, and there’s no reason to take artworld participants seriously. An institutional definition is therefore either redundant or arbitrary and untrustworthy.
To revisit a previous example, the museum curator must somehow convince his peers in the artworld that his son’s tie dipped in paint is a legitimate candidate for appreciation. For this, it seems intuitive that he must be able to invoke aesthetic, expressive, conceptual, or other reasons to justify his desire to exhibit this object. But if these reasons are valid, then the institutional theory is putting the cart before the horse: it’s not because the artifact is institutionalized that it is art, but because there are good reasons to institutionalize it (see also Zangwill’s Appendix on Dickie, 2007). Conversely, if there are no good reasons to institutionalize the artifact, then the father simply had a whim. If institutionalism still allows the father to confer the status of artwork, then institutionalization is arbitrary, and it becomes impossible for participants in the artworld ever to be in error.
The objections discussed have been partially answered by more recent versions of institutionalism, see notably Abell (2011) and Fokt (2017).
b. Historicism
Starting in 1979, Jerrold Levinson sought to develop a definition of art that, while inspired by the institutionalist theses of Danto and Dickie, also avoided certain of their undesirable consequences. Levinson’s theory retains the intuition that art must be defined by relational properties of the works. However, instead of basing his definition on the artworld, Levinson emphasizes a work’s relationship to art history. Noël Carroll (1993) is another well-known advocate of historicism while theories put forward by Stecker (1997) and Davies (2015) also contain a historicist element and are discussed in section 8 below. Here, the focus is on Levinson’s account, which is the oldest, most elaborate, and most influential.
Levinson summarizes his position as follows:
[A]n artwork is a thing (item, object, entity) that has been seriously intended for regard-as-a-work-of-art, i.e., regard in any way preexisting artworks are or were correctly regarded (Levinson 1989, 21).
Before delving into the details, let’s consider an example. We have seen that the institutional definitions of Danto or Dickie struggled to account for art produced by someone from a culture that lacked our concept of art (4.a.ii). Levinson (1979, 238) discusses the case of an entirely isolated individual—imagine, for instance, Mowgli from The Jungle Book. Mowgli could create something beautiful, let’s say a colored stone sculpture, with the intention, among other things, of eliciting admiration from Bagheera. Although Mowgli does not possess the concept of art, and although his sculpture has not been instituted by any representative of the artworld, his artifact is related to past works through the type of intention deployed by Mowgli.
Indeed, one can highlight at least three types of resemblance between the intention with which Mowgli created his sculpture and that of past artists (1979, 235). Firstly, Mowgli wanted to endow his sculpture with formal aesthetic properties—symmetry, vibrant colors, skillful assembly… Secondly, he aimed to evoke a particular kind of experience in his spectators—aesthetic pleasure, admiration, interest… Thirdly, he intended his spectators to adopt a specific attitude towards his sculpture—contemplate it, closely examine its skillfully assembled form, recognize the success of the color palette.
To produce art, according to Levinson, Mowgli does not need to have in mind works from the past, but his production must have been created with these types of intention—as long as it is with these types of intention that the art of the past was created.
The resemblance between an artist’s intentions and those of past artists may seem to lead Levinson to a form of circularity, but that is not the case; it leads him instead to a successive referral of past intentions to past intentions that themselves refer to older ones until arriving, at the end of the chain, at the first art productions the world has known. For Levinson, one does not need to know precisely what these first arts are. What matters is that, at some point in the prehistory of art, there are objects that can easily be called art—like the Chauvet cave paintings or the Venus of Willendorf.
This way of defining art can be compared to how biologists typically classify the living world. A biological genus is defined through its common origins, even though individuals of different species may have evolved divergently. The first arts are comparable to the common ancestors of species in the same genus. To know what this common ancestor looked like, one must trace the genetic lineages. But there is no need to know exactly what these ancestors looked like to know that the species belong to the same genus.
Notice that Carroll (1993) proposes a historicist definition of art that does not require an intentional connection like Levinson’s. For Carroll, there is no need for artists to have intentions comparable to those of previous artists; instead, their works must allow a connection to the past history that forms a coherent narrative. If an object can be given an intelligible place in the development of existing artistic practices, then it can also be considered art.
i. Advantages of Historicism
The first advantage of historicism over the theories reviewed so far is to explain the diversification of ways of making and appreciating art. From classical painting to horror movies, from ballet to readymades, the property of being regarded-as-art gradually becomes more complex. Thus, the apparent impression of disparity or lack of unity in what is called “art” is attenuated. A whale, a bat, and a horse strike us as being very different, yet their common ancestor can be traced, explaining their common classification as mammals. It would be the same for a video game, a cathedral, and a performance by Joseph Beuys: although quite different, their common origin can be traced, thus explaining what binds them together. And their historical connection would be the only essential element shared by all these artworks.
Historicism also enjoys several advantages over institutionalism while being able to account for cases that institutionalism deals with successfully. The first advantage is that it sidesteps the issue concerning works not recognized by the artworld (criterion [ii]). A second advantage over institutionalism is what one may call “the primacy of artists over the public.” Danto’s or Dickie’s institutionalism implies that someone belonging to the artworld simply cannot be mistaken in their judgment that something is art. This is not the case with Levinson’s theory. Thus, an art historian who comes across an Inuit toy that was not created with artistic intentions and still catalogs it in a book as an “Inuit artwork” would be mistaken about the status of this artifact, even if, from their perspective, there are good reasons to classify it as such: according to Levinson, if this toy was not created with the right intentions, then it cannot become art just because an art historian or curator considers it as such (cf. Levinson’s discussion of ritual objects 1979, 237). By the same token, historicism has no problem explaining how an artifact can be an artwork before being recognized by institutions, for example, the early works of Aloïse Corbaz and Judith Scott. Before being institutionalized by art brut curators, these artists nonetheless created artifacts with the intention that they be regarded-as-works-of-art.
For other advantages of historicism, see the references in the section Further Reading below.
ii. Limits of Historicism
The first difficulty of historicism concerns the first arts. It can be introduced through the age-old problem of Euthyphro, in which Socrates shows Euthyphro that he is unable to say whether a person is pious because loved by the gods or loved by the gods because pious. Here, the question would be: Is something considered art because it fits into the history of art, or does something fit into the history of art because it is art? (Davies 1997)
Outside of first arts, Levinson’s answer is clear: something is considered art when it fits into the history of art through the intention-based relation described above. However, this answer is not possible for first arts because, by definition, they have no artistic predecessors. In his initial definition, Levinson simply stipulates that first arts are arts (Levinson 1979, 249). Consequently, he is forced to admit that first arts are art not because they fit into the history of art; hence the problem of Euthyphro.
Gregory Currie (1993) highlights a similar problem through the following thought experiment: imagine the discovery on Mars of a civilization older than any that has ever existed on Earth. Long before the first arts appeared on our planet, Martians were creating objects that we would unquestionably call “art.” The reason for labeling these objects as “art” does not seem tied to the contingent history of human art but rather to ahistorical aspects.
A last problematic case that can be raised in connection with criterion [vii] concerns the intentions governing activities seemingly more mundane than art. In particular, a five-year-old child drawing a picture to hang on the fridge has a similar intention to that of a painter who wants to present their work—the five-year-old wants to evoke admiration, create an object that is as beautiful as possible, and so on. Examples can be multiplied easily. Think of the careful preparation of a Christmas tree, the elegant furnishing of a living room, a spicy diary, or vacation photos on Instagram (see Carroll 1999, 247). In all these cases, the intentions clearly resemble those of paradigmatic works of the past, but we don’t want to label all of them as artworks.
Levinson replied to some of these objections, but this goes beyond the scope of this article (see the section Further Reading below).
5. Feminist Aesthetics, Black Aesthetics, and Anti-Discriminatory Approaches
In addition to the skeptical approaches discussed in section 3, important criticisms against the project of defining art have been raised within feminist aesthetics, Black aesthetics, and anti-discriminatory approaches to art. In contrast with skepticism, these critiques constitute a constructive challenge for a definition of art—and especially for relational views (see section 4)—rather than an anti-essentialist position.
Two kinds of criticisms are distinguished: (a) those made by artists and (b) those made by philosophers and theorists of art. Art practiced by women, Black people, queer artists, or other groups underrepresented in the history of (Western) art can challenge traditional approaches to art notably by highlighting how the works and perspectives from members of these groups have been unfairly marginalized, ignored, discarded, or stripped of credibility and value. In parallel, one can find critiques from philosophers whose approach is guided by their sensitivity to discrimination, which leads them to detect problematic issues in the necessary and sufficient conditions of existing art definitions.
a. Art as Critique
Let’s start by stating something quite obvious: not all art produced by a person who suffers from discrimination constitutes a critique of this discrimination. Similarly, not all feminist or anti-racist criticisms go in the same direction. For instance, Judy Chicago’s work The Dinner Party—a 1979 installation where plates shaped like vaginas and bearing the names of important women are arranged on a triangular table—has been praised as a seminal feminist artwork but has also been criticized as naively essentialist (Freeland 2002, Chap. 5). It is thus difficult to make general statements that apply to all the relevant artworks.
That being said, much feminist art, which gained prominence during the 20th century and especially its second half, bears a certain continuity with Dadaism and conceptual art (Korsmeyer and Brand Weiser 2021). Works created by feminist artists have often radically challenged the most traditional conceptions of art (see section 2.a.)—think, for instance, of the protest art of the Guerrilla Girls or Marina Abramović’s performances.
Non-Western art and art made by marginalized communities, such as artworks by women, have also often been excluded or not considered central to the project of defining art. For this reason, some of these works can unsettle and destabilize the traditional Western conceptions of art discussed above. For instance, many cultures in Africa do not make a rigid distinction between the interpretation and creation of art, everyday practices of crafting, and “contemplation” of beauty (see Taylor 2016, 24). As Taylor notes, these diverse practices can serve as points of comparison with classical Western art for those who seek to elaborate a more general theory of art and so respect criteria [ii] and [iii].
It is noteworthy that philosophers have not always had this inclusive attitude towards art produced by marginalized artists. Batteux’s or later Hegel’s notion of “fine art” (see section 2.a.ii) leads to the neglect of any type of art that does not fit into the “noble” categories of painting, sculpture, architecture, music, poetry, dance, or theater. These categories exclude art practices that have been associated with women or non-Western cultures, relegating these practices to a lesser status, such as craft. Think, for instance, about Otomi embroidery—a Mesoamerican art form—or about quilts, which have traditionally been made by women. These textile creations are not among the fine art categories. It was only in the latter part of the 20th century that some quilts started to be recognized as artworks, exhibited in art galleries and museums, thanks notably to the work of feminist artists such as Radka Donnell, who have used quilts as a means to subvert traditional, male-dominated perspectives on art. Donnell even considered quilts a liberation issue for women (for example, Donnell 1990).
The emergence of expressivism and formalism, which also challenged the fine art categories, has perhaps helped philosophers and art theorists to adopt a more universalist and hence more inclusive approach to the diversity of the arts (Korsmeyer 2004, 111). Unfortunately, it is unclear whether this helped women and discriminated minorities to be recognized as artists since the art forms favored by formalism or expressivism have remained male- and white-dominated (see Freeland 2002, Chap. 5, and Korsmeyer and Brand Weiser 2021, Sect. 4 for a discussion).
b. Philosophical Critiques
Korsmeyer and Brand Weiser (2021), two important feminist aestheticians, state that “There is no particular feminist ‘definition’ of art, but there are many uses to which feminists and postfeminists turn their creative efforts.” Similar remarks have been made for Black aesthetics (Taylor 2016, 23). As far as definitions of art are concerned, much effort in these traditions has concentrated on pointing out biases in prevalent proposals. A point of emphasis has been on explaining the structures of power reflected in the concept of art, often from a Marxist perspective (Korsmeyer 2004, 109). Some philosophers working on minority viewpoints view typical attempts to provide necessary and sufficient conditions for art with suspicion, regarding them as systematically guilty of biases concerning gender, race, or class (Brand Weiser 2000, 182).
A striking example of such criticisms is the work of Peg Brand Weiser (for example, 2000). In addition to pointing out the exclusionary nature of the fine art categories stressed above (section 5.a.), she raises important objections to the relational definitions (section 4). Concerning institutionalism (section 4.a.), a way to reconstruct Brand Weiser’s argument is the following. According to this theory, institutional authorities have the power to make art. However, these authorities have historically been (and still mostly are) white men who acquired their institutional power by virtue of their male and white privilege (among other factors) and whose perspective on what should count as art is biased. Thus, the institutional definition is flawed because it inherits the biases of the authorities that decide what is art.
Note that this criticism holds even if institutions change and gradually become less patriarchal. According to institutionalism, indeed, quilts weren’t art until the 1970s—when art authorities started giving institutional accolades to this practice. However, it seems that some quilts always have been art, only of a neglected kind (cf. the counterexample of art brut in section 4.a.ii).
Brand Weiser also criticizes historicism and in particular Carroll’s version (see section 4.b.). What is art, according to this definition, depends on what has been considered to be well-established art practices in the past. However, since these practices have long excluded the arts practiced by marginalized and underrepresented groups, this makes the historicist definition decidedly suspect. Like the institutionalism of Danto or Dickie, this definition is flawed because it inherits historically prevalent biases.
In addition to these criticisms, Brand Weiser has also made positive proposals for a non-biased definition. She offers six recommendations that a definition of art should follow; they can be summarized in three points: (1) we must recognize that past art and aesthetic theories have been dominated by people with a particular taste and agenda that may suffer from racist and sexist biases, (2) the definition of fine arts is flawed, (3) “gender and race are essential components of the context in which an artwork is created and thus cannot be excluded from consideration in procedural […] definitions of “art”” (Brand Weiser 2000, 194).
There are potential limitations to Brand Weiser’s critique. First, concerning institutionalism, while her criticisms apply to Danto and Dickie’s versions, it should be noted that more recent versions may be able to avoid them. For instance, Abel (2011) and Fokt (2017) propose characterizations of art institutions that can exist independently of the Western art world and its white-dominated authorities (see section 4.a.ii).
Concerning historicism, while Brand Weiser’s criticisms may apply to Carroll’s version, it is less evident that they apply to Levinson’s. This is because, according to the latter, new artworks must be connected to artworks from the past through the intentions of artists. Thus, for Levinson, what makes something an artwork today is not the art practices, art genres, art institutions, famous artworks, or even written art histories. These may well be biased even when the relevant artistic intentions are not. Think again about our Mowgli example (section 4.b.): his colored rock sculpture is art because his intention in creating it connected to past artistic intentions—let’s say intentions to overtly endow the sculpture with formal aesthetic properties. Arguably, these kinds of intentions need not be polluted with racism or sexism. If they are not, then Levinson’s historicism can avoid Brand Weiser’s criticism.
Regarding Brand Weiser’s positive suggestions, while (1) and (2) are important historical lessons that philosophers such as Dutton (2006), Abel (2011), Davies (2015), Fokt (2017), and Lopes (2018) have taken into account, (3) may seem too strong. Gender, social class, and sexual orientation are important sociological contextual factors, but one should resist importing them into (any) definition of art, on pain of being led, by parity of reasoning, to import an endless series of further contextual factors. For instance, Michael Baxandall (1985) shows that the intention to create, say, the Eiffel Tower could have emerged only thanks to a dozen contextual conditions, such as the artist’s social circle, the views on aesthetics popular at the time, the trend for positivism regarding science, the state of technological advances, and Gustave Eiffel’s interest in the technique of puddling. These too are essential components of the context in which this artwork was created (to use Brand Weiser’s formulation), but it does not seem that all such variables, and more, should be included in a definition of art, however important they are to understanding the relevant artworks.
It bears repeating that feminist and anti-racist approaches do not have an essential and particular definition of art. Nevertheless, their positive proposals show that such approaches can lead to new definitions of art, definitions that would pay particular attention to avoiding the negative biases and prejudicial attitudes towards minorities and marginalized groups, attitudes that have all-too-often polluted the (Western) history of art.
6. Functionalist Definitions
A lesson that may be drawn from the discussion of relational approaches is that it is difficult to define art without reference to the non-relational or internal properties of artifacts if one wants to avoid arbitrary definitions. The family of functionalist theories resurrects the idea shared by classical definitions that if x is art, it is in virtue of a non-relational property of x. Here, what x must possess is an aesthetic function.
a. Neo-Formalist Functionalism
A significant portion of functionalist approaches resembles and is inspired by classical formalism (see section 2.c.) since its main proponents acknowledge a close connection between the aesthetic function of artworks and formal aesthetic properties directly accessible by the senses. This sub-family, however, distinguishes itself from classical formalism since it is the aesthetic function in an artifact that makes it an artwork rather than the mere presence of aesthetic properties. Let’s call this sub-family neo-formalist functionalism (or just neo-formalism).
Monroe Beardsley (1961) and more recently Nick Zangwill are the major representatives of this line of thought. Zangwill is skeptical of attempts to include the most extreme cases of contemporary art in a definition of art (2007, 33). Instead, he focuses on the pleasure most of us experience in engaging in more traditional artistic activities and on the metaphysics of formal aesthetic properties.
Formal aesthetic properties supervene on properties that can be perceived through our five senses. For instance, the elegance of a statue supervenes on its curves—that is, there can be no difference in the elegance of the statue without a difference in its curves (see Supervenience and Determination). The elegance of a statue is a formal aesthetic property since its curves are non-aesthetic properties we can perceive through our eyes or hands. This metaphysical position leads Zangwill to believe that what fascinates us primarily when we create or contemplate an artwork are formal aesthetic properties: what strikes us when we listen to music is primarily the balance between instruments, and this balance supervenes on the sound properties of the instruments and tempo; what pleases us when we post a photo on Instagram is finding the appropriate filter to make our image enchanting. By choosing a filter, we do nothing more than decide to endow our photo with formal aesthetic properties based on non-aesthetic properties that can be perceived.
Zangwill’s neo-formalist definition reflects this idea: x is an artwork if and only if (a) x is an artifact endowed with aesthetic properties supervening on non-aesthetic properties, (b) the designer of x had the insight to provide x with aesthetic properties via non-aesthetic properties, and (c) this initial intention has been minimally successful (Zangwill 1995, 307).
Criterion (a) reflects Zangwill’s metaphysical position; (b) reflects the functionalist nature of this approach: to have an aesthetic function, something needs to have been conceived to possess aesthetic properties—which excludes natural objects (see criterion [iv]). Of course, the aesthetic function thus endowed is not necessarily the only one that artworks possess; religious artifacts can have both an aesthetic and a sacred function. More importantly, just like Kant, Zangwill argues that art cannot be created simply by applying aesthetic rules; one must have insight, a moment of “aesthetic understanding” (Zangwill 2007, 44), about how aesthetic properties will be instantiated by non-aesthetic properties. Finally, (c) ensures that an artist’s intention must be minimally successful. There can be dysfunctional artworks, but not so dysfunctional that they fail to have any aesthetic properties at all—this respects criterion [v] by excluding what purports to be art but has failed completely.
As a “moderate” neo-formalist (as he calls himself), Zangwill does not deny that properties broader than formal aesthetic properties can make an artwork overall interesting—this is the case for relational properties such as originality, for example (2000, 490). However, genuine artworks (and, in fact, most artworks) must possess formal aesthetic functions.
This neo-formalist idea, though moderate, excludes conceptual artworks, since they have no formal aesthetic properties. If we consider Duchamp’s Fountain, we see only an ordinary urinal devoid of relevant formal aesthetic properties—in fact, we would not understand Duchamp’s attempt if we interpreted it as a work aiming to display the elegance of a urinal’s curves (Binkley 1977). But remember that this is precisely the point of neo-formalism, which is a reaction against institutionalism. To be more specific, Zangwill considers that these works indirectly involve aesthetic functions since they refer to, or are inspired by, traditional works of art, which are, in turn, endowed with aesthetic functions. Provocatively, Zangwill labels these cases as “second-order” and even “parasite” types of artworks (2002, 113). In doing so, Zangwill accounts for the primacy of formal aesthetic properties in defining art, while at the same time accounting for contemporary art.
i. Advantages of Neo-Formalist Functionalism
The main advantage of neo-formalism lies in its ability to capture an important aspect of our daily engagement with artworks: they are “evaluative artifacts” (Zangwill 2007, 28) in the sense that, from Minoan frescos to Taeuber-Arp’s non-representational paintings, these artifacts have been positively evaluated based on the formal beauty intended by their creator. Interestingly, as Korsmeyer (2004, 111) points out, formalism (and its heirs) paves the way for the appreciation of work created by non-Western cultures—just as we can appreciate beautiful creations of the past.
By contrast, art theorists and philosophers still have a hard time explaining to the person on the street why the most radical cases of conceptual works, such as Comedian by Maurizio Cattelan (a banana stuck on a wall with duct tape), should be considered art. It seems that “non-aesthetic” works—works that do not aim at instantiating aesthetic properties, such as hyper-realistic sculptures or conceptual art—are indeed not considered paradigmatic examples of artworks by laypeople. Zangwill’s ambition, in opposition to institutionalism, consists precisely in focusing on paradigmatic cases (for example, The Fall of Icarus by Bruegel the Elder) and avoiding a definition of art based on exceptions. In the same vein, Roger Pouivet (2007, 29) adds that practicing a “theory for exceptions” as institutionalism does is even harmful. The risk would be to produce a definition that is a purely scholastic exercise, no longer related to what mainly interests us in art.
Another advantage of neo-formalism is that the definition of art clearly aligns with its ontology within this theory. Zangwill’s definition is indeed based on an ontological analysis of artifacts and formal aesthetic properties. By contrast, the ontology underlying institutionalism is much less clear (Irvin and Dodd 2017).
ii. Limits of Neo-Formalist Functionalism
Limitations to the neo-formalist approach are numerous, given that its very ambition is not to capture all so-called “exceptions” of contemporary art. On paper, neo-formalism meets criterion [iii]: a new type of art can emerge as long as the artworks constituting it possess an aesthetic function. However, its rejection of many contemporary artworks is at odds with the ambition to offer a definition of art across the board. In such a case, it is good to apply the principle of reflective equilibrium, which attempts to determine a balance between the coherence of the theory and its ability to capture our intuitions.
A first notable worry concerns whether “non-aesthetic” works are considered as genuine artworks by Zangwill or as artworks “in name only.” According to his own definition, Fountain is not a genuine artwork, but Zangwill wants to account for it anyway. The status of “second-order” or “parasite” art should thus be clarified.
Another worry concerns artworks such as Tracey Emin’s My Bed—an unkempt bed littered with debris—or, to a lesser extent, Francis Bacon’s tortured works. Unlike counterexamples such as conceptual art, these cases were realized with an aesthetic insight, namely that of endowing them with negative formal aesthetic properties. Since neo-formalism focuses on positive formal aesthetic properties, it says almost nothing about these negative cases—see, by contrast, expressivism (section 2.b.). These cases are nevertheless quite complex, since an artwork can possess an overall (or organic) positive value by possessing intentionally negative proper parts (Ingarden 1963, Osborne 1952). If one wishes to retain the idea that art involves the insight to endow artifacts with positive aesthetic properties (which leads to positive appreciation), one must significantly complicate the neo-formalist approach. Indeed, the insight should focus on two layers of aesthetic properties—those concerning the aesthetic properties of the proper parts and those concerning the aesthetic property of the whole.
A broader objection challenges Zangwill’s metaphysical claim that formal aesthetic properties supervene on properties perceived by the five senses. Indeed, many internal aesthetic properties of literary works are central to this art form but are not directly accessible through the senses. Think for instance of the dramatic beauty of a tragedy, the elegance with which a character’s psychology is depicted, or the exquisite comedy of a remark in a dialogue. If this is true, it is a major objection to most metaphysical approaches to formal aesthetic properties via perceptual properties (Binkley 1977; Shelley 2003). Since a novel cannot be conceived with the intention that it possess formal aesthetic properties supervening on perceptual properties, a novel cannot be art. Zangwill swallows this bitter pill: by extending “art” to objects that do not possess formal properties, we would commit the same mistake as naming “jade” both jadeite and nephrite (Zangwill 2002). Literature would be taken for art due to metaphorical resemblances with formal aesthetic artifacts. It is not a matter of rejecting the value of literature; it is simply a matter of denying that it has formal aesthetic properties and is therefore genuine art.
This response is highly problematic. Firstly, while it is true that narrative properties do not supervene on perceptual properties, we nevertheless enjoy them in the same way as we enjoy formal properties, namely through our sensibility (emotions, feelings, impressions…). Moreover, we attribute evaluative properties supervening on the internal properties of literary works with the same predicates used for formal properties: we speak of elegant narration, stunning style, the compelling rhythm of the plot, the sketchy psychology of a character… Why should these narrative properties be only metaphorically aesthetic, since they share these two relevant features with formal aesthetic properties (Goffin 2018)? In a nutshell, narrative properties cannot be excluded from the domain of aesthetic properties just because they would constitute an exception in our ontological system—this is the “No true Scotsman” fallacy (but see Zangwill 2002, 116).
Finally, Zangwill’s definition is both too narrow and too broad. It is too narrow since it excludes literature from art, although works such as Jane Eyre, The Book of Sand, or the Iliad seem to be paradigmatic examples of artworks. It is too broad since it includes any artifact created for aesthetic purposes from an insight—such as floral arrangements or even (some) cake decorations (Zangwill 1995). Such inclusions are fatal if we want to preserve criterion [vi] and exclude non-artistic artifacts (including those with an aesthetic function).
Patrick Grafton-Cardwell (2021) and, to a certain extent, Marcia Eaton (2000) suggest a definition of art able to keep the spirit of functionalism while bypassing the too-broad objection. They argue that the function of artworks consists in aesthetically engaging—that is, to direct our attention to aesthetic properties.
Cake decorations may indeed be designed with the insight to endow them with aesthetic properties, but not with the intention of making us contemplate or reflect on the aesthetic properties of these cakes. One can even argue, with Eaton, that in our culture, cake decorations do not deserve the relevant aesthetic attention—while leaving the door open for another culture that would consider cake decoration as deserving it.
While the aesthetic engagement approach sidesteps a major objection of neo-formalist functionalism, it provides no resource (except our intuitions) to distinguish aesthetic engagement from other types of engagement—for example, epistemic engagement toward a philosophical question. It thus seems too uninformative and vague to have sufficient predictive power—cf. criterion [iii]. That being said, Eaton argues that aesthetic engagement concerns attention toward the internal features of an object that can produce delight. This is, however, still too vague (see Fudge 2003).
7. Determinable-Determinate Definitions
Up to this point, this article has considered general definitions of art. To define the individual arts (for example, painting), one would then have to add further conditions to the general definition. One may consider this a genus-species view of art: the super-category has a positive definition, involving necessary and sufficient conditions, that is independent of the sub-categories. In biology, the class of mammals is defined by attributes such as being vertebrate, having warm blood, nursing their young with maternal milk, and so on. The different species belonging to this class (bats, whales, ponies, humans…) share these properties and distinguish themselves by species-specific properties. In aesthetics, art would be a super-category that contains a set of sub-categories such as literature, music, sculpture… This idea is precisely what is attacked by skeptics.
A promising strategy is to rethink the relationship between art and the specific arts through a determinable-determinate approach—where the arts (painting, cinema, music…) possess determinate definitions whereas art is understood as “one of the arts to be determined”. Take the relationship between being colored and the property of having a specific color. It is difficult, if not impossible, to define “being colored” independently of the different colors. “Being colored” is defined as the disjunction “being blue” or “being burgundy” or “being saffron yellow,”… Thus, “being colored” is determinable whereas “being burgundy” is determinate. What is determinate should have a specific, independent definition, but what is determinable must be defined by way of its determinates. If art is like color, then the responsibility for defining “art” must be transferred to the specific arts.
a. Buck-passing
Dominic Lopes embraces this idea. Here is his definition:
(R) item x is a work of art if and only if x is a work in activity P and P is one of the arts. (2008, 109)
The philosopher’s task would thus be to define each specific art, and once this task is accomplished, there is no need to add anything to define the category “art.” Just as by knowing that an object is red, you thereby know that it is colored, by knowing that an entity is a musical work, you thereby know it is art, with no further explanation needed. This definition is called “buck-passing” because the definition of art is delegated to the definitions of specific arts.
As Lopes notices, this matches the idea that the generic concept of art plays a marginal role in our daily interaction with the arts. When seeking advice, do we look for an “expert in art”? That sounds pompous and somehow out of place. We rather seek advice from a photographer, a garden designer, or a book critic (see Lopes 2018, Chap.1). As Kennick provocatively puts it:
[O]nly a man corrupted by aesthetics would think of judging a work of art as a work of art in general, as opposed to this poem, that picture, or this symphony (in Lopes 2008, 121).
This may be true even when it comes to avant-garde works. Expecting to listen to a classical musical work, the audience will certainly be baffled by John Cage’s 4’33. The audience’s bewilderment does not come from a difficulty in considering 4’33 as art; what baffles them is that they do not know what kind of work they are dealing with–is it a piece of music? A theatrical performance? A joke? Thus, when the audience learns that this work should be classified as being halfway between a noise piece and a conceptual artwork, their bafflement should diminish.
The elephant in the room concerning buck-passing is, therefore, the following: Which are the individual arts? How can we determine whether a particular artifact belongs to one of the arts or not? How many arts are there? Without good answers to these questions, buck-passing risks being nothing more than a circular or non-informative theory.
Intuitively, the list of arts contains the fine arts—architecture, sculpture, painting, music, poetry—as well as dance (the sixth art), cinema (the seventh art), and so forth. Each of these arts has a medium, meaning a set of (material) resources exploited and transformed through a “technique” (Lopes 2014, 139). Typically, sculptures can be made with several materials (bronze, marble, wood…) shaped by specific tools (chisels, burins…).
However, this is not enough. Buck-passing still has to overcome two challenges: (a) disqualify non-artistic artifacts belonging to the same medium as artistic artifacts (see criterion [vi]), and (b) account for cases in which an artifact does not seem to fall under an already-known medium but we want to say it is or could be an artistic medium (see criterion [iii]).
The first challenge (a), named by Lopes the “coffee mug objection,” is arguably the most important one. Although ceramic can be an artistic medium, there are many ceramic objects, such as our everyday coffee mugs, which arguably do not qualify as art. So, a criterion must be found to distinguish ceramic objects that are artworks from those that are not. To solve this challenge, Lopes has a surprisingly simple strategy: “[artworks] are works that exploit a medium in order to realize artistic properties and values.” (Lopes 2014, 144) In other words, what distinguishes a coffee mug from a ceramic sculpture is that the latter is an “appreciative kind” of object (2014, 121): in the latter, the way resources are exploited in a medium aims to realize aesthetic or artistic properties.
Now, regarding (b), let us return to the analogy with colors. Some new media resemble mixtures of colors; they are mixes between existing media—the most well-known example being cinema. This poses no problem for buck-passers, as the same reasons that make these pre-existing media arts can make the new medium an art too. However, some artifacts belong to totally innovative media—for example, Robert Barry’s Inert Gas Series, in which the artist simply released inert gases into the desert. These cases resemble the discovery of a color never seen before. Lopes admits that his approach would struggle to explain why these new media should enter the category of arts. However, he considers this a challenge not for buck-passing alone but for any account of art.
i. Advantages of Buck-passing
Let’s start by pointing out that since the notion of art does not play a central role in buck-passing, an individual (such as a non-Western individual) could produce artworks without necessarily knowing the Western notion of “art” (Lopes 2014, 78). Buck-passing also aligns with our daily experience of the arts and explains our reactions to borderline cases (such as 4’33).
However, the main advantage of buck-passing can be found elsewhere. Since its definition of art is “decentralized,” there can be a focus on descriptive as well as evaluative specificities of specific arts. In a nutshell, buck-passing encompasses diversity regarding the ontological status and the relevant way we evaluate each art type (Lopes 2008, 125). To see why this might be fruitful, let’s look at the example of video games.
Grant Tavinor defines video games as “artifacts of a digital visual medium” created with the intention of entertaining either through the deployment of rules, “objective gameplay,” or through “interactive fiction” (Tavinor 2009, 32-33). Tavinor’s definition highlights the aesthetic specificity of video games compared to other arts: gameplay and the interactive nature of (possible) narratives. Thus, even if the videogame Journey features landscapes of splendid dunes, as can be found in Lawrence of Arabia, an aesthetic specificity of Journey compared to a film comes from the fact that these dunes can be explored through a set of game rules requiring the player to press keys in a clever order and with the right timing (that is, the gameplay). The specificity of this art is at the heart of the aesthetic judgments of video game critics. It is not uncommon for a game with undeniable formal aesthetic properties to be criticized by specialists for the mediocrity of its gameplay. Conversely, games without notable formal aesthetic properties have been praised for the high interactivity of their narrative (for example, Undertale) or the depth of their gameplay (for example, Super Meat Boy).
Within genus-species approaches, the focus is on the genus—“Are video games art?”—rather than on the specificities of the medium. This leads to a debate on the artistic nature of video games precisely because of the focus on shared criteria across all arts. Some scholars exclude video games from the realm of art because of their interactive nature—which would leave no room for an appropriate aesthetic attitude (or contemplation) toward art in general (Rough 2018). Others exclude them because of their gameplay—one can win or lose at video games, and since one cannot win or lose by watching a film or a painting, video games cannot be art (Ebert 2010).
Buck-passing scores a point over skeptical, relationalist, or neo-formalist theories by highlighting that each art requires specific approaches. It does not limit its efforts to setting out abstract properties for “art” that would govern all the arts, it directly considers the diversity of the arts, and it is thereby better suited to accounting for their ontological and evaluative specificities (for references on the ontological diversity of the arts, see the section Further Reading below).
ii. Limits of Buck-passing
Although promising in many respects, buck-passing turns out, upon closer inspection, to fall short of providing a satisfying response to the worries raised against it.
Lopes’ response to the coffee mug problem is that arts are “appreciative kinds”, media used to realize artistic properties (Lopes 2014, 144). But this dangerously resembles a neo-formalist approach “with an extra step”: [buck-passing] item x is an artwork if and only if x is a work in activity P and P is one of the arts…[formalism] an activity P is one of the arts if and only if P is a medium intentionally used to provide any x with aesthetic properties. In such a definition, the aesthetic function is distributed over P and x; ultimately, both derive their artistic character from this aesthetic function. We are thus dealing with a buck-stopping theory (Young 2016). An elegant way out would be to argue that not all arts necessarily need to possess an aesthetic function. For instance, Xhignesse (2020) suggests that new artistic media can emerge through convention. The risk, once again, would be to end up with a historicist view “with an extra step”.
Another issue concerns the way one talks about art in the public sphere, which challenges the claim that no definition of art independent of the individual arts is needed. Annelies Monseré (2016) emphasizes that artists must often justify “making art” to secure funding, be exhibited, and attract the attention of critics. If an artist simply said, “I am going to make a film, therefore I am going to make art,” they would certainly not obtain funds to subsidize their project. When artists justify themselves before committees, they seem to refer to a generic definition of art that is more substantial than that of buck-passing. Monseré’s objection highlights the evaluative nature of the generic term “art”: making and consuming art engages us in a particular way. Saying that something is art simply because it belongs to an artform fails to capture this evaluative character. Lopes is aware of this, since he admits that artistic mediums are “appreciative kinds.” But this brings us back to the first objection discussed.
8. Disjunctive Definitions
A last strategy for approaching the definition of art is to partially concede to skeptics such as MacDonald, Weitz, and Gaut that there are no necessary and sufficient conditions for art. However, rather than concluding that it is not possible to define art, the strategy is to give a definition of art by enumerating a list of conditions while acknowledging that, although none of them is necessary (in contrast to regular definitions), some of them, or certain conjunctions of them, are sufficient. For a formalization and a detailed analysis of the very notion of disjunctive definition, see Longworth and Scarantino (2010).
There are two approaches within this strategy: a symptomatic approach—which lists a large number of “typical” or “symptomatic” disjuncts—and a synthetic approach—which combines different definitions of art through disjunction.
a. Symptomatic Disjunctivism
In a pioneering article from 1975, E. J. Bond responded to Weitz’s skeptical theory (1956) by observing that a set of conditions can be sufficient for something to be art without any member of that set being necessary. Bond proposed several conditions such that, if all are satisfied by an artifact, that artifact is undoubtedly an artwork (1975, 180). An artifact meeting fewer criteria could still be considered art; however, the fewer criteria fulfilled, the more dubious its artistic status. An analogy can be drawn with the symptoms of a syndrome in psychiatric manuals like the DSM: a person is diagnosed with autism spectrum disorder when they exhibit a certain number of symptoms, yet no particular symptom is necessary for the diagnosis (cf. Dutton 2006, 373).
Note that this approach is disjunctive in the sense that the list of conditions for art is disjunctive, but being art requires fulfilling a conjunction of conditions. This contrasts with the synthetic disjunctive approach presented below (8.b.).
Bond’s paper was not very influential, but a similar approach has gained interest, notably through the work of Denis Dutton (especially 2006), who drew inspiration from Bond. Dutton provides a list of disjuncts that he calls “criteria for recognition,” some of which are similar to Bond’s, and a few resemble Gaut’s criteria (2000). Here is the list; only the items absent from Gaut’s list (see section 3.b.) are described:
(i) Direct pleasure: the artistic object or performance is appreciated as a source of immediately pleasing experience. (ii) Skill or virtuosity. (iii) Style: it is executed in a recognizable style, following rules of form, composition, or expression. (iv) Novelty and creativity. (v) Criticism: there is a practice of criticism, often elaborate, commenting on this type of object or performance. (vi) Representation: the object or performance represents or imitates real or imaginary experiences. (vii) “Special” focus: it tends to be set apart from ordinary life and becomes a distinct and dramatic center of attention. (viii) Expressive individuality. (ix) Emotional saturation. (x) Intellectual challenge. (xi) Art traditions and institutions. (xii) Imaginative experience: the object or performance offers an imaginative experience to producers and the audience (2006, 369–372).
i. Advantages of Symptomatic Disjunctivism
As Dutton himself points out, his disjunctive approach avoids several objections raised against Gaut’s cluster approach (2000). Although the two approaches superficially resemble each other, Dutton aims to provide a genuine definition of art.
Firstly, since the symptomatic disjunctive approach does not rely on the idea of family resemblance, it can exclude the embarrassing possibility that any superficial resemblance (such as “executed on a Tuesday,” see section 3.c.) could join the list. For example, Dutton rejects the form/substance contrast and eccentricity as criteria, since these would apply to too many artifacts (2006, 373).
Furthermore, the symptomatic disjunctive approach can accept new criteria without becoming an ad hoc theory. Even if a new criterion (xiii) were accepted, an artifact fulfilling only (xiii) could not be considered an artwork (2006, 375). Given the evolution of art in our societies—which, Dutton believes, forged criteria (i)–(xii)—it is highly improbable that another criterion will appear (or already exists) that could be sufficient on its own.
ii. Limits of Symptomatic Disjunctivism
A first objection that arises when reading Dutton’s (or Bond’s) list is that finding what unifies the different disjoints is a challenge—an objection that had already been raised against Gaut (see 3.c. above).
A second objection, which also applies to the cluster approach (see Fokt 2014), points out that the weight assigned to certain criteria is arbitrary. For instance, criterion (xii) is central for Dutton: art would be primarily linked to exercises of imagination. But why should this criterion be more central than emotional saturation (ix), as the expressivists suggest (section 2.b.)?
b. Synthetic Disjunctivism
Whereas symptomatic disjunctivism uses recognition criteria, synthetic disjunctivism blends, so to speak, multiple definitions of art (see, for example, Stecker 1997; Davies 2015). Consider, for example, the definition proposed by Davies:
I propose that something is art (a) if it shows excellence of skill and achievement in realizing significant aesthetic goals, and either doing so is its primary […] identifying function, or (b) if it falls under an art genre or art form established and publicly recognized within an art tradition, or (c) if it is intended by its maker/presenter to be art and its maker/presenter does what is necessary and appropriate to realizing that intention. (Davies 2015, 377-8)
Disjunct (a) closely resembles a neo-formalist definition through the notion of aesthetic goals, disjunct (b) borrows the notion of artistic tradition from the historicist definition, and disjunct (c) is inspired by the institutionalist idea that certain individuals can confer the status of art.
Davies’ approach is not intended to be a mere collage of different definitions; he assigns them different roles in the emergence of artistic practices and in the subsequent evolution of the concept of art. According to him, the first arts (in the sense used in section 4.b) are art by virtue of disjunct (a), which “does the important work of getting the art-ball rolling” (Davies 2015, 378). Disjuncts (b) and (c) are then elaborated on the basis of the historical development of the first arts. In this regard, Davies aligns with the historicist definition while subsuming the first arts under functionalism.
Note that this approach is disjunctive in the sense that the list of conditions for art is disjunctive, and being art requires meeting at least one condition rather than a conjunction of symptomatic criteria, as in the symptomatic disjunctive approach (8.a.).
i. Advantages of Synthetic Disjunctivism
Synthetic disjunctivism has numerous advantages, especially compared to the formalist, institutionalist, and historicist definitions from which it draws inspiration. For instance, it avoids the Euthyphro problem posed for historicism (see section 4.b.ii.) by clearly assuming that the first arts are art in virtue of their aesthetic functions. In other words, the Euthyphro problem is resolved as follows: something is art either (a) because it possesses a formal function that makes it art or (b) because it fits into the history of art. Consequently, it also bypasses Currie’s (1993) objection about Martian art (see section 4.b.ii.): Martian art may not share the same history, but it should have the same “formalist” beginning.
Synthetic disjunctivism also has an advantage over neo-formalism: it can account for the fact that certain works lack formal quality without having to label them as secondary (or parasitic) arts or exclude them from the realm of art (see section 6.b.ii).
Finally, and perhaps most importantly, the synthetic disjunctive approach has great descriptive power: thanks to its disjunctive form, it has no problem accounting for neglected art genres and non-Western art traditions that some of the above definitions wrongly excluded (see sections 4.a.ii and 5 above), such as the sophisticated make-up and masks of Kathakali dancers. In fact, one can hardly find an artwork that escapes this definition.
ii. Limits of Synthetic Disjunctivism
The encompassing aspect of synthetic disjunctivism is a double-edged sword. By borrowing characteristics from previous theories, it inherits some of their shortcomings. Davies’s definition thus accumulates difficulties of overgeneralization from functionalism, institutionalism, and historicism: it struggles to exclude certain non-artistic artifacts and activities or to account for borderline cases (criteria [vi] and [vii]). For example, disjunct (c) imports from institutionalism the problem of the zealous curator who, with arbitrary authority, decides to bring an object into the artworld (see section 4.a.ii). And with disjunct (a), it seems to include everyday objects such as flower arrangements, just as neo-formalism does (section 6.a.ii).
A second objection argues that the synthetic disjunctive definition is not a theory in itself. Davies explains how the notion of art has become more complex, transitioning from a practice that a formalist definition can capture to practices requiring a disjunctive definition. However, formalism and institutionalism are opposing theories with antithetical insights. Bringing together functionalism, institutionalism, and historicism in a single disjunction does capture a maximal number of cases, but at the expense of the unity offered by each definition taken separately. Davies could counter that this is not a problem: after all, if the concept of art is rich and complex, it is precisely because it has a rich history that has led to a disjunctive understanding of the concept. We have already seen this strategy applied to symptomatic disjunctivism (section 8.a.i.).
9. Conclusion
This overview of definitions of art leads us to identify four broad strategies that philosophers have employed: a focus on the internal elements of artistic artifacts (classical definitions; functionalism); a focus on the relational elements of artistic artifacts (institutionalism, historicism); an emphasis on the artistic media rather than art in general (determinable-determinate definitions); and the combination of both internal and relational elements (disjunctive definitions). Opposing these four strategies are the skeptics, for whom art is indefinable; at best, one can provide typical characteristics (family resemblance) or an open cluster of properties. Each of these approaches possesses its advantages and drawbacks, reflecting their contributions to the literature as well as their limitations.
The most important advantage of classical and functionalist definitions seems to be their intuitive simplicity. Aesthetics is the science of beauty, and artworks are a subcategory of the objects aestheticians study for their formal beauty. An important drawback of these definitions is their inadequate extension, which is both too broad and too narrow compared to common conceptions of art. A definition that counts every song hummed while doing the dishes as an artwork while simultaneously excluding cold art, abstract art, or literature is problematic.
A major advantage of relationalist definitions seems to be their great adaptability to borderline cases and new types of art. Inevitably, new forms of art will emerge and are emerging; institutionalism has clear criteria for accepting them. A crucial drawback of these definitions is the room left for arbitrariness, concerning either the authorities capable of institutionalizing an artifact or the indefinable nature of the first arts.
One major advantage of determinable-determinate definitions is their ability to capture the aesthetic and ontological specificities of each art. Except perhaps for Oscar Wilde, no one can claim to be an expert “in art”; each domain requires a particular approach and expertise. One drawback of these definitions is the difficulty of finding a determinable-determinate approach to the arts that is not reducible to another definition “with an extra step.”
Disjunctive definitions have the advantage of being inclusive—a quality that helps address the challenges raised by anti-discriminatory approaches to the definition of art. It seems reasonable to think that a good definition of art contains not a single criterion but a set of internal and relational elements. However, a major drawback of these definitions is that they are over-encompassing, since they fail to exclude many kinds of non-artistic artifacts.
Faced with these definitions, skepticism seems to stand out by predicting that a positive definition of art cannot be given. However, skeptical theories have not proven that a good definition is impossible, still less that the project is a failure. Rather, the positive theories, with their commendable though imperfect efforts, have taught us much by revealing hidden complexities in the concept of art. Contrary to skeptical predictions, it seems that progress has been made in understanding what art is.
10. References and Further Reading
a. References
Abell, Catharine. 2011. “Art: What It Is and Why It Matters.” Philosophy and Phenomenological Research 85 (3): 671–91.
Adajian, Thomas. 2003. “On the Cluster Account of Art.” The British Journal of Aesthetics 43 (4): 379–85.
Austin, John. L. 1962. How to Do Things with Words. Oxford: Clarendon Press.
Batteux, Charles. 1746. Les Beaux arts réduits à un même principe. Paris: Chez Durand.
Baxandall, Michael. 1985. Patterns of Intention: On the Historical Explanation of Pictures. New Haven: Yale University Press.
Beardsley, Monroe C. 1961. “The Definitions of the Arts.” Journal of Aesthetics and Art Criticism 20 (2): 175–87.
Beardsley, Monroe C. 2004. “The Concept of Literature.” In The Philosophy of Literature: Contemporary and Classic Readings—An Anthology, edited by Eileen John and Dominic McIver Lopes, 1st edition, 51–58. Malden, MA: Wiley-Blackwell.
Bell, Clive. 1914. Art. London: Chatto and Windus.
Binkley, Timothy. 1977. “Piece: Contra Aesthetics.” Journal of Aesthetics and Art Criticism 35 (3): 265–77.
Bond, Edward. J. 1975. “The Essential Nature of Art.” American Philosophical Quarterly 12 (2): 177–83.
Brand Weiser, Peg Zeglin. 2000. “Glaring Omissions in Traditional Theories of Art.” In Theories of Art Today, edited by Noël Carroll, 175–98. Wisconsin: University of Wisconsin Press.
Carroll, Noël. 1993. “Historical Narratives and the Philosophy of Art.” Journal of Aesthetics and Art Criticism, 51(3), 313–326.
Carroll, Noël. 1999. Philosophy of Art: A Contemporary Introduction. London ; New York: Routledge.
Collingwood, Robin G. 1938. The Principles of Art. Oxford: Oxford University Press.
Currie, Gregory. 1993. “Aliens, Too.” Analysis 53 (2): 116–18.
Danto, Arthur. 1964. “The Artworld.” The Journal of Philosophy 61 (19): 571–84.
Davies, Stephen. 1997. “First Art and Art’s Definition.” Southern Journal of Philosophy 35 (1): 19–34.
Davies, Stephen. 2015. “Defining Art and Artworlds.” The Journal of Aesthetics and Art Criticism 73 (4): 375–84.
Everett, Daniel. 2008. Don’t Sleep, There Are Snakes. London: Profile Books.
Fokt, Simon. 2014. “The Cluster Account of Art: A Historical Dilemma.” Contemporary Aesthetics 12.
Fokt, Simon. 2017. “The Cultural Definition of Art.” Metaphilosophy 48 (4): 404–29.
Freeland, Cynthia. 2002. But Is It Art?: An Introduction to Art Theory. Oxford, New York: Oxford University Press.
Fudge, Robert. 2003. “Problems with Contextualizing Aesthetic Properties.” Journal of Aesthetics and Art Criticism 61 (1): 67–70.
Ganguly, Anil Baran. 1962. Sixty-Four Arts in Ancient India. English Book Store, New Delhi.
Gaut, Berys. 2000. “‘Art’ as a Cluster Concept.” In Theories of Art Today, edited by Noël Carroll, 25–44. Wisconsin: University of Wisconsin Press.
Goffin, Kris. 2018. “The Affective Experience of Aesthetic Properties.” Pacific Philosophical Quarterly 0 (0): 1–18.
Grafton-Cardwell, Patrick. 2021. “The Aesthetic Engagement Theory of Art.” Ergo an Open Access Journal of Philosophy 8 (0).
Ingarden, Roman. 1964. “Artistic and aesthetic values.” The British Journal of Aesthetics 4 (3): 198–213.
Irvin, Sherri. 2005. “Interpretation et description d’une oeuvre d’art.” Philosophiques 32 (1): 135‑48.
Irvin, Sherri, and Julian Dodd. 2017 “In Advance of the Broken Theory: Philosophy and Contemporary Art.” Journal of Aesthetics and Art Criticism 75 (4): 375‑86.
Kivy, Peter. 1997. Philosophies of Arts: An Essay in Differences. Cambridge ; New York: Cambridge University Press.
Korsmeyer, Carolyn. 2004. Gender and Aesthetics: An Introduction. New York: Routledge.
Korsmeyer, Carolyn, and Peg Brand Weiser. 2021. “Feminist Aesthetics.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Winter 2021. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/win2021/entries/feminism-aesthetics/.
Lamarque, Peter. 2010. Work and Object: Explorations in the Metaphysics of Art. Oxford, New York: Oxford University Press.
Langer, Susanne. 1942. Philosophy in a New Key: A Study in the Symbolism of Reason, Rite and Art. Cambridge, MA: Harvard University Press.
Langer, Susanne. 1957. Problems of Art: Ten Philosophical Lectures. New York: Charles Scribner’s.
Langer, Susanne. 1967. Mind: An Essay on Human Feeling. Baltimore: Johns Hopkins Press.
Levinson, Jerrold. 1979. “Defining Art Historically.” British Journal of Aesthetics 19 (3): 21–33.
Levinson, Jerrold. 1989. “Refining Art Historically.” Journal of Aesthetics and Art Criticism 47 (1): 21–33.
Longworth, Francis, and Scarantino, Andrea. 2010. “The Disjunctive Theory of Art: The Cluster Account Reformulated.” The British Journal of Aesthetics 50 (2): 151–67.
Lopes, Dominic McIver. 2008. “Nobody Needs a Theory of Art.” The Journal of Philosophy 105 (3): 109–27.
Lopes, Dominic McIver. 2014. Beyond Art. Oxford ; New York, NY: OUP Oxford.
Lopes, Dominic McIver. 2018. Being for Beauty: Aesthetic Agency and Value. Oxford, New York: Oxford University Press.
Lüdeking, Karlheinz. 1988. Analytische Philosophie Der Kunst. Frankfurt am Main: Athenäum.
MacDonald, Margaret. 1952. “Art and Imagination.” Proceedings of the Aristotelian Society 53: 205–26.
Meskin, Aaron. 2007. “The Cluster Account of Art Reconsidered.” The British Journal of Aesthetics 47 (4): 388–400.
Mikalonytė, Elzė Sigutė, and Markus Kneer. 2022. “Can Artificial Intelligence Make Art?: Folk Intuitions as to Whether AI-Driven Robots Can Be Viewed as Artists and Produce Art.” ACM Transactions on Human-Robot Interaction 11 (4): 43:1-43:19.
Monseré, Annelies. 2012. “Non-Western Art and the Concept of Art: Can Cluster Theories of Art Account for the Universality of Art?” Estetika 49 (2): 148-165.
Monseré, Annelies. 2016. “Why We Need a Theory of Art.” Estetika 53 (2): 165–83.
Osborne, Harold. 1952. Theory of Beauty: An Introduction to Aesthetics. Routledge and K. Paul.
Poe, Edgar Allan. 1846. “The Philosophy of Composition.” Graham’s Magazine, 28 (4), 163-167.
Porter, James I. 2009. “Is Art Modern? Kristeller’s ‘Modern System of the Arts’ Reconsidered.” British Journal of Aesthetics 49 (1): 1–24.
Rough, Brock. 2018. “The Incompatibility of Games and Artworks.” Journal of the Philosophy of Games 1 (1).
Rousseau, Jean-Jacques. 1753. Lettre sur la musique françoise. Unidentified publisher.
Schlenker, Philippe. 2017. “Outline of Music Semantics.” Music Perception: An Interdisciplinary Journal, 35(1), 3‑37.
Scruton, Roger. 1999. The Aesthetics of Music. Oxford, New York: Oxford University Press.
Shelley, James. 2003. “The Problem of Non-Perceptual Art.” British Journal of Aesthetics 43 (4): 363–78.
Stecker, Robert. 1997. Artworks: Definition, Meaning, Value. University Park: Pennsylvania State University Press.
Tavinor, Grant. 2009. The Art of Videogames. New Directions in Aesthetics 10. Malden, Mass.: Wiley-Blackwell.
Taylor, Paul C. 2016. Black Is Beautiful: A Philosophy of Black Aesthetics. 1st edition. Chichester, West Sussex: Wiley-Blackwell.
Tolstoy, Leo. 1898 [2023]. What Is Art? Germany: Culturae.
Weitz, Morris. 1956. “The Role of Theory in Aesthetics.” The Journal of Aesthetics and Art Criticism 15 (1): 27–35.
Whiting, Daniel. 2022. “Margaret Macdonald on the Definition of Art.” British Journal for the History of Philosophy 30 (6): 1074–95.
Wiltsher, Nick. 2018. “Feeling, emotion and imagination: In defence of Collingwood’s expression theory of art.” British Journal for the History of Philosophy, 26(4), 759‑781.
Wittgenstein, Ludwig. 1953. Philosophical Investigations. Translated by G. E. M. Anscombe. 3rd ed., 2000. Englewood Cliffs, NJ: Prentice Hall.
Wollheim, Richard. 1980. Art and Its Objects: With Six Supplementary Essays. Cambridge University Press.
Xhignesse, Michel-Antoine. 2020. “What Makes a Kind an Art-Kind?” British Journal of Aesthetics 60 (4): 471–88.
Young, James O. 2016. “The Buck Passing Theory of Art.” Symposion: Theoretical and Applied Inquiries in Philosophy and Social Sciences 3 (4): 421–33.
Zangwill, Nick. 1995. “The Creative Theory of Art.” American Philosophical Quarterly 32 (4): 307–23.
Zangwill, Nick. 2000. “In Defence of Moderate Aesthetic Formalism.” The Philosophical Quarterly (1950-) 50 (201): 476–93.
Zangwill, Nick. 2002. “Are There Counterexamples to Aesthetic Theories of Art?” Journal of Aesthetics and Art Criticism 60 (2): 111–18.
Zangwill, Nick. 2007. Aesthetic Creation. Oxford, New York: Oxford University Press.
b. Further Reading
This sub-section provides further references that were not discussed for reasons of space and of accessibility. These references contain elaborations on most of the theories discussed in this article and many also respond to objections raised above.
Expressivism:
Croce, Benedetto. 1902. Estetica come scienza dell’espressione e linguistica generale, Florence: Sandron.
Dewey, John. 1934. Art as Experience. New York: Capricorn Books.
Skepticism:
Gaut, Berys. 2005. “The Cluster Account of Art Defended.” The British Journal of Aesthetics 45 (3): 273–88.
Institutionalism:
Danto, Arthur. 1981. The Transfiguration of the Commonplace: A Philosophy of Art. Cambridge: Harvard University Press.
Dickie, George. 1974. Art and the Aesthetic: An Institutional Analysis. Ithaca: Cornell University Press.
Dickie, George. 1984. “The New Institutional Theory of Art.” Proceedings of the 8th Wittgenstein Symposium, no. 10: 57–64.
Historicism:
Carney, James D. 1994. “Defining Art Externally.” British Journal of Aesthetics 34 (2): 114–23.
Levinson, Jerrold. 1993. “Extending Art Historically.” Journal of Aesthetics and Art Criticism 51 (3): 411–23.
Levinson, Jerrold. 2002. “The Irreducible Historicality of the Concept of Art.” British Journal of Aesthetics 42 (4): 367–79.
Pignocchi, Alessandro. 2012. “The Intuitive Concept of Art.” Philosophical Psychology 27 (3): 425–44.
Functionalism:
Zangwill, Nick. 2001. The Metaphysics of Beauty. Ithaca, NY: Cornell University Press.
Determinable-Determinate Definitions, on the Ontological Diversity of the Arts:
Davies, Stephen. 2009. “Ontology of Art.” In The Oxford Handbook of Aesthetics, edited by Jerrold Levinson. Oxford: Oxford University Press.
Kania, Andrew. 2005. “Pieces of Music: The Ontology of Classical, Rock, and Jazz Music.” University of Maryland. http://drum.lib.umd.edu/handle/1903/2689.
Walton, Kendall L. 1970. “Categories of Art.” The Philosophical Review 79 (3): 334–67.
Disjunctivism:
Davies, Stephen. 2007. Philosophical Perspectives on Art. Oxford: Oxford University Press.
Davies, Stephen. 2012. The Artful Species: Aesthetics, Art, and Evolution. Oxford, New York: Oxford University Press.
Dutton, Denis. 2000. “But They Don’t Have Our Concept of Art.” In Theories of Art Today, edited by Noël Carroll, 217–40. Wisconsin: University of Wisconsin Press.
Dutton, Denis. 2009. The Art Instinct: Beauty, Pleasure, and Human Evolution. Oxford: Oxford University Press.
Stecker, Robert. 2000. “Is It Reasonable to Attempt to Define Art?” In Theories of Art Today, edited by Noël Carroll, 45–64. Wisconsin: University of Wisconsin Press.
Al-Ghazālī did not regard himself as a philosopher, since in his period of Islamic intellectual history philosophy was associated with the Aristotelian tradition promulgated primarily by Avicenna (Ibn Sina), and, for al-Ghazālī, Avicenna was undoubtedly an unbeliever whose philosophical views (such as his commitment to the eternity of the world) fell outside the scope of orthodox Sunni Islam. From the perspective of Islamic orthodoxy, a serious stigma would have attached to al-Ghazālī’s identifying with the philosophers. Instead, al-Ghazālī regarded himself primarily as a Sufi (mystic), theologian, and jurist.
Yet despite his aversion to particular philosophical theses, it is clear that al-Ghazālī was not only sympathetic to particular disciplines and methodologies of philosophy (for example, logic and ethics) but also produced work that would certainly qualify as philosophical both in his day and in ours. Indeed, he contributed immensely to the history of Islamic philosophy and to the history of philosophy more generally, and he is considered one of the greatest and most influential thinkers in Islamic intellectual history. Al-Ghazālī’s philosophical work spans epistemology, metaphysics, philosophy of mind, natural philosophy, and ethics. It had a wide-reaching influence within the Islamic world, and his Incoherence of the Philosophers, in particular, received close attention from other medieval philosophers and from the Latin philosophical tradition.
Al-Ghazālī, who holds the title of the “Proof of Islam,” was a Persian-Islamic jurist, mystic, theologian, and philosopher, born c. 1058 in Tus, Khorasan (a region in the northeast of modern-day Iran, near Mashhad). According to tradition, his father, before he died, entrusted the young al-Ghazālī and his brother Ahmad to the tutelage of an Islamic teacher and Sufi; after this teacher died, al-Ghazālī was transferred to a local madrasa (Islamic school), where he continued his religious studies. After this early period of Islamic learning, al-Ghazālī received an advanced education in the Islamic sciences, particularly theology, from Imam al-Juwaynī (the leading Ash‘arite theologian of the day) in Nishapur.
An early experience that shaped al-Ghazālī’s scholarly character occurred when his caravan was raided during his travels. Al-Ghazālī demanded of the leader of the robbers that his notebooks and texts not be seized. When the robber asked what these writings were, al-Ghazālī responded: “My writings contain all of the knowledge that I have acquired during my travels.” The leader then asked: “If you claim that you possess this knowledge, then how can I take it away from you?” Al-Ghazālī was deeply moved by this question, regarding the robber’s utterance as divinely sent, and afterwards resolved to commit everything he studied to memory.
His reputation as a young scholar led to his appointment by Niẓām al-Mulk as a professor at the Nizamiyya College in Baghdad in 1091. Although al-Ghazālī was highly successful as a professor, accruing a large following of students (he claims at one point to have had 300 students), during his professorship he grew skeptical of his intentions with respect to teaching. That is, he questioned whether his professional ambitions were truly for the sake of God or for personal aggrandizement, and he ultimately underwent a skeptical and spiritual crisis (distinct from the skeptical crisis discussed in section 2). As he describes it, this crisis had psychosomatic effects, preventing him from speaking and thereby from teaching his classes. Resolved to purify his intentions and recover an authentic experience of Islam, al-Ghazālī left his post and his family in Baghdad in 1095 to travel for ten years, pursuing and cultivating the Sufi or mystical way of life. He visited Damascus, Jerusalem, and Hebron, and made a pilgrimage to Mecca and Medina. Of this period, he writes:
My only occupation was seclusion and solitude and spiritual exercise and combat with a view to devoting myself to the purification of my soul and the cultivation of virtues and cleansing my heart for the remembrance of God Most High, in the way I had learned from the writings of the sufis. (Deliverance: 80)
Al-Ghazālī eventually returned to teaching in Nishapur (northeast Iran), seemingly motivated to rid the masses of theological, spiritual, and philosophical confusions, and convinced by his colleagues that he could help revive the true theory and practice of Islam. However, he later retired to Tus, where he established a Sufi school and lived a quiet life of scholarly work and meditation. He died in 1111, at the age of 55, in Tus.
Al-Ghazālī is regarded as one of the most important intellectuals in all of Islamic history. He is best known for his Revival of the Religious Sciences, a work divided into forty books covering creed, theology, mysticism, ethics, and jurisprudence. His philosophical and theological work emerges in texts such as Deliverance from Error, Incoherence of the Philosophers, and Niche of Lights.
2. Skepticism in the Deliverance from Error
Written in the style of a spiritual and intellectual autobiography (Menn 2003), the Deliverance from Error is one of al-Ghazālī’s best-known texts. It covers a number of topics, from the errors of the Islamic philosophers (in particular, al-Farabi and Avicenna) and the nature of prophecy to the truth of Sufism. However, it has attracted the attention of historians of philosophy especially for its engagement with skepticism in the first part of the text. This section discusses al-Ghazālī’s skeptical arguments, his solution to skepticism, and the popular question of the similarities between al-Ghazālī’s skepticism and Descartes’.
a. Motivations
Before introducing his skeptical arguments, al-Ghazālī first motivates his engagement with skepticism and then offers a theory of knowledge that grounds a strategy for determining whether a subject’s belief in a proposition amounts to knowledge. Al-Ghazālī claims that from a young age he had a strong desire to seek knowledge: “The thirst for grasping the real meaning of things was indeed my habit and wont from my early years and in the prime of my life” (Deliverance: 54). However, the following observation raised a nagging doubt as to whether this desire would lead him to objective knowledge:
The children of Christians always grew up embracing Christianity, and the children of Jews grew up adhering to Judaism, and the children of Muslims always grew up following the religion of Islam. (Ibid. 55)
The worry al-Ghazālī has is that what we take to be knowledge (for example, a religious claim to truth) might be due more to parental or societal conditioning (blind imitation, or taqlīd) than to objective epistemic standards. This led al-Ghazālī to inquire into the true nature of knowledge. What, then, are the objective standards for knowledge? Al-Ghazālī writes:
Sure and certain knowledge is that in which the thing known is made so manifest that no doubt clings to it, nor is it accompanied by the possibility of error and deception. (Ibid.)
Al-Ghazālī’s account of knowledge here has received sophisticated treatment in the literature (see Albertini 2005; Hadisi 2022). It is also important to note that this is not the only type of knowledge that al-Ghazālī recognizes. For example, later in the Deliverance from Error al-Ghazālī identifies other types of knowledge, such as dhawq or fruitional experience (a type of knowledge-by-acquaintance; see Götz 2003).
However, an operative definition of Ghazālīan knowledge, at least as it is used in this part of the Deliverance from Error, is the following: a subject S knows that p if and only if (1) S believes that p, (2) p is true, and (3) p is absolutely certain. Certainty is the key criterion here, as for al-Ghazālī there can be no room for doubt in a genuine item of knowledge. (For an in-depth analysis of al-Ghazālī’s conception of absolute certainty, see Hadisi 2022.)
With this definition of knowledge in hand, al-Ghazālī devises a strategy for determining whether his beliefs—particularly sensory beliefs and rational beliefs—amount to knowledge. If a proposition or belief lacks certainty (that is, if it is dubitable), then it cannot count as knowledge: “Whatever I did not know in this way and was not certain of with this kind of certainty was unreliable and unsure knowledge” (see also Book of Knowledge: 216). Al-Ghazālī then turns to his sensory and rational beliefs to see if they meet this standard.
b. Sensory Perception Doubt and Dream Doubt
With this theory of knowledge and strategy for examining his beliefs in place, al-Ghazālī raises two separate skeptical arguments that generate his skepticism: a sensory perception doubt, which targets the reliability of sense data; and a dream doubt, which targets the reliability of the primary truths of the intellect.
The sensory perception doubt runs as follows:
The strongest of the senses is the sense of sight. Now this looks at a shadow and sees it standing still and motionless and judges that motion must be denied. Then, due to experience and observation an hour later it knows that the shadow is moving, and that did not move in a sudden spurt, but so gradually and imperceptibly that it was never completely at rest. (Deliverance: 56)
Here is a plausible reconstruction of the argument:
If the senses misrepresent the properties of external objects, then sense data cannot be a source of knowledge about the external world.
The senses misrepresent the properties of external objects.
Therefore, sense data cannot be a source of knowledge about the external world.
After having engaged the sensory perception doubt, al-Ghazālī admits that his sensory beliefs cannot amount to knowledge. Instead, he proposes that his rational beliefs, particularly his beliefs in primary truths, are still secure. For al-Ghazālī, primary truths are necessary truths which are foundational to proofs, such as the law of non-contradiction: “One and the same thing cannot be simultaneously affirmed and denied” (Deliverance: 56). However, against his reliance on primary truths, al-Ghazālī raises the dream doubt, which runs as follows:
Don’t you see that when you are asleep you believe certain things and imagine certain circumstances and believe they are fixed and lasting and entertain no doubts about that being their status? Then you wake up and know that all your imaginings and beliefs were groundless and unsubstantial. So while everything you believe through sensation or intellection in your waking state may be true in relation to that state, what assurance have you that you may not suddenly experience a state which would have the same relation to your waking state as the latter has to your dreaming, and your waking state would be dreaming in relation to that new and further state? If you found yourself in such a state, you would be sure that all your rational beliefs were unsubstantial fancies. (Ibid. 57)
Here is a plausible reconstruction of the argument:
In our current cognitive position, we cannot doubt primary truths.
However, if our current cognitive position may be akin to a dream state where what we perceive is false, then what we take to be primary truths cannot be a source of knowledge.
We cannot rule out the possibility that our current cognitive position is akin to such a dream state.
It is possible that we can wake up from this dream state into a new and higher cognitive position where we realize that the primary truths we held to be necessarily true are actually false.
Therefore, our belief in primary truths in our current cognitive position cannot amount to knowledge.
After having raised the dream doubt, al-Ghazālī finds himself in a skeptical crisis. (It is important to note that these two doubts do not generate a hyperbolic doubt, under which every proposition is considered dubious. This is evidenced by the fact that al-Ghazālī does not rule out the existence of God during his skeptical crisis.) He writes that he attempted to refute the dream doubt; however, his:
Effort was unsuccessful, since the objections could be refuted only by proof. But the only way to put together a proof was to combine primary cognitions. So if, in my case, these were inadmissible, it was impossible to construct the proof. (Ibid.)
The skeptical challenge al-Ghazālī has raised is particularly pernicious. In order to refute either skeptical doubt, al-Ghazālī would have to put together a proof. However, proofs require primary truths, which are now suspect. As such, al-Ghazālī has his hands tied, as it were, for he cannot construct a proof. Consequently, there is no way for al-Ghazālī himself to defeat skepticism.
c. Resolution of Skepticism
Al-Ghazālī claims that his skepticism lasted for two months. He eventually overcame this skepticism, but not through any proof; rather, his recovery came about through God’s intervention via a divine light:
At length God Most High cured me of that sickness. My soul regained its health and equilibrium and once again I accepted the self-evident data of reason and relied on them with safety and certainty. But that was not achieved by constructing a proof or putting together an argument. On the contrary, it was the effect of a light which God Most High cast into my breast. And that light is the key to most knowledge. (Ibid.)
Commentators generally understand al-Ghazālī as having an experience of dhawq via the divine light, which is supposed to secure the foundations of his knowledge. In the Niche of Lights, al-Ghazālī explains that the function of a light is to reveal the existence of something, and a divine light in particular reveals the true nature of that thing and is the source of certainty (Hesova 2012; Loumia 2020). On Hadisi’s (2022) reading, it is the engagement in Sufi practices that cultivate the imagination which makes this experience of the divine light possible. While there is much scholarly debate about the nature of this divine light and the conditions for its experience, it is argued that (unlike Descartes) al-Ghazālī clearly does not defeat skepticism through any rational efforts of his own. Rather, he is rescued from skepticism through a divine intervention.
d. Al-Ghazālī and Descartes
On the face of it, there are many similarities between al-Ghazālī’s Deliverance from Error and Descartes’ Discourse on Method (1637) and Meditations on First Philosophy (1641), both of which were written some 500 years after the Deliverance from Error. As such, many modern commentators have been interested in studying the relationship between Descartes and al-Ghazālī (See Albertini 2005; Azadpur 2003; Götz 2003; Moad 2009). Commentators have debated two issues on this score: what the exact similarities are between al-Ghazālī’s and Descartes’ engagement with skepticism, and whether there is historical evidence that al-Ghazālī had any real influence on Descartes.
Commentators generally agree that there are similarities between the two philosophers’ search for certainty and, in particular, the skeptical arguments they raise. In 1857, George Henry Lewes wrote that the Deliverance from Error “bears so remarkable a resemblance to the Discours sur la Méthode of Descartes, that had any translation of it existed in the days of Descartes, everyone would have cried out against the plagiarism” (1970: 306). For starters, both al-Ghazālī and Descartes place absolute certainty at the center of their epistemology and develop a similar strategy for evaluating whether their beliefs amount to knowledge, that is, a belief that is dubitable in any respect cannot amount to true knowledge. Furthermore, both al-Ghazālī and Descartes employ a sensory perception doubt and a dream doubt. However, while the sensory perception doubt functions similarly for both philosophers, the dream doubt does not. Recall that al-Ghazālī uses the dream doubt to target the reliability of intellectual beliefs or primary truths. However, in the Meditations on First Philosophy Descartes uses the dream doubt to question the existence of the external world. Descartes uses a separate doubt, what is often called the defective nature doubt, to question the reliability of the intellect and intellectual beliefs. Moreover, while al-Ghazālī’s skeptical arguments do not generate hyperbolic doubt (because he never questions the existence of God), Descartes’ skepticism does end up in global skepticism. There also seems to be a difference in how the philosophers defeat skepticism. As is well known, Descartes ultimately claims to defeat skepticism through the natural light of reason. However, al-Ghazālī requires supernatural assistance through the divine light and explicitly denies that he can defeat skepticism through the use of reason and rational proofs.
As for whether Descartes had access to al-Ghazālī’s works, there is a theory that there was a translation of the Deliverance from Error in Latin that Descartes had access to; however, this theory is unlikely. Indeed, the Deliverance from Error was not translated until 1842, when it was translated into French (Van Ess 2018). There is also the possibility that the Deliverance from Error was orally translated for Descartes by Golius, an orientalist who had access to the Arabic Deliverance from Error (Götz 2003). However, there is only circumstantial evidence for this claim. In the end, the best evidence for there being an influence seems to be the striking similarities between al-Ghazālī and Descartes’ engagement with skepticism.
3. Assessment of Philosophy
a. Materialists, Naturalists, and Theists
After having overcome skepticism, al-Ghazālī proceeds to examine the “various categories of those seeking the truth,” to determine whether he can gain anything from them in his newfound path to knowledge, and in particular, whether these categories of seekers conform with the truth on religious matters according to orthodox Islam or are in a state of unbelief (kufr). Regarding his study of philosophy, al-Ghazālī claims that he studied and reflected on philosophy independently (without a teacher) for just under three years. He claims: “God Most High gave me an insight into the farthest reaches of the philosophers’ sciences in less than two years,” and he reflected “assiduously on it for nearly a year” until he understood philosophy’s “precision and delusion” (Deliverance: 61).
The philosophers, al-Ghazālī claims, are those “who maintain that they are the men of logic and apodeictic demonstration” (Ibid. 58) and he divides them into three groups: materialists, naturalists, and theists. Regarding the materialists, al-Ghazālī writes that these were ancient philosophers who denied the existence of an omniscient and omnipotent “Creator-Ruler.” Their fundamental claim is that the world is eternal, and it exists without a creator. According to al-Ghazālī, these philosophers are “the godless in the full sense of the term” (Ibid. 61).
Regarding the naturalists, al-Ghazālī writes that these were the ancient philosophers who devoted themselves to the study of “nature and the marvels found in animals and plants” (Ibid. 62). Unlike the materialists, these philosophers found God’s wisdom in the creation of the universe. What was problematic about these philosophers is that they denied the immortality of the soul, claiming that the soul ceased to exist upon the corruption of its humors, and thus it would be impossible to bring back a non-existent soul. As such, these philosophers denied the afterlife, despite believing in God and his attributes. These philosophers were not thoroughly godless in the way that the materialists were, but still held beliefs that are objectionable from the perspective of orthodox Islam. They are unbelievers as well.
Regarding the theists, al-Ghazālī writes that these were the later ancient philosophers, such as Socrates, Plato, and Aristotle who admitted a Creator-Ruler in their philosophical systems. These philosophers, according to al-Ghazālī, refuted the materialists and naturalists and also refined the philosophical sciences. Nonetheless, all of these philosophers and their transmitters among the Islamic philosophers (that is, al-Farabi and Ibn Sina) must be charged with unbelief due to their still subscribing to theses such as the eternity of the world.
b. The Philosophical Sciences
Al-Ghazālī divides the sciences of philosophy into six fields: mathematics, logic, physical science, metaphysics, political philosophy, and moral philosophy or ethics (In the Book of Knowledge, he divides them into four: mathematics, logic, physical science, and metaphysics). Here, we will discuss his approach to mathematics, logic, physical science, political philosophy, and ethics, reserving his views on metaphysics for section 4 on the “Incoherence of the Philosophers.”
One might think that since al-Ghazālī wrote the Incoherence of the Philosophers, and claims that the ancient philosophers al-Farabi and Avicenna are unbelievers, he is completely antithetical to philosophy. However, it is important to note that the title of his famous text is not the Incoherence of Philosophy. Rather, al-Ghazālī, in general, takes specific issues with certain philosophical theses, rather than with the discipline of philosophy itself. Nonetheless, there are ways in which he thinks that an improper engagement with philosophy can lead to certain evils.
Regarding mathematics, which concerns arithmetic, geometry, and astronomy, he claims that: “nothing in them entails denial or affirmation of religious matters” (Deliverance: 63); rather, they contain facts which cannot be denied. Nonetheless, he identifies two evils that can follow from an improper engagement with mathematics.
The first evil occurs when a person who studies mathematics assumes that since philosophers have precision in their mathematical proofs, such precision in arriving at the truth extends to other areas of philosophy, such as metaphysics. This will make them susceptible to believing false metaphysical positions that are not actually demonstrated according to al-Ghazālī. Furthermore, this makes the person susceptible to unbelief because they will conform to the philosophers’ disdain (in al-Ghazālī’s view) for the law and religion more generally.
The second evil comes from the champion of Islam who believes that everything in philosophy must be rejected. When another person hears such negative claims about philosophy as a whole, he will come to have a low opinion of Islam, for it seems that Islam is incompatible with clear mathematical truths established by demonstration, so “he becomes all the more enamored with philosophy and envenomed against Islam” (Ibid. 64).
Similarly, al-Ghazālī claims that logic does not conflict with religion, for it only concerns proofs, syllogisms, the conditions of demonstration, and the requisites for sound definitions. He admits that the philosophers have a greater refinement in their study and use of logic, which exceeds that of the theologians. Nonetheless, two evils can come from an improper understanding of the philosophers’ use of logic. First, if logic is rejected by the believer, this will lead the philosophers to have a low opinion of the rejector, and of his religion more generally. Second, if the student of logic has too high an opinion of the philosophers’ use of logic, they may come to believe that the philosophers’ metaphysical theses that amount to unbelief must actually be backed by demonstration, before even coming to examine the metaphysics itself.
The physical sciences or natural philosophy, which concern the study of the heavens, stars, the sublunar world, and composite bodies, are also not objectionable with respect to religion, except for certain issues that al-Ghazālī covers in the Incoherence of the Philosophers. Al-Ghazālī stresses that one must remember that every aspect of physics is “totally subject to God Most High: it does not act of itself but is used as an instrument by its Creator” (Deliverance: 66). Any physical science whose theory rests on a denial of the dependency of an aspect of the physical world on the divine is objectionable (see section 4d for al-Ghazālī’s views on causation and occasionalism).
Regarding political philosophy, al-Ghazālī does not regard the philosophers as contributing anything novel, for they “simply took these over from the scriptures revealed to the prophets” (Ibid. 67). As such, it seems that al-Ghazālī does not find issues with the philosophers’ views on political philosophy.
In al-Ghazālī’s view, ethics concerns “the qualities and habits of the soul, and recording their generic and specific kinds, and the way to cultivate the good ones and combat the bad” (Ibid.). According to al-Ghazālī, the philosophers’ ethical views derive from the sayings of the Sufis and prophets, and thus in general their ethical views are not objectionable (Kukkonen 2016; cf. Quasem 1974). Nonetheless, there are two evils that can arise from an improper engagement with philosophical ethics.
First, there is an evil that arises for the person who rejects philosophical ethics because they subscribe to a wholesale rejection of philosophy, due to their knowledge of other errors of the philosophers (for example, in metaphysics). This is problematic because, for example, the utterances of the prophets are true and should not be rejected. The fundamental problem here, according to al-Ghazālī, is that this “dim-witted” person does not know how to evaluate the truth: “The intelligent man, therefore, first knows the truth, then he considers what is actually said by someone. If it is true, he accepts it, whether the speaker be wrong or right in other matters” (Ibid. 68). Second, there is the evil that arises for the person who accepts philosophical ethics wholesale. The evil here is that one will slip into accepting the metaphysical errors of the philosophers that are mixed in with their ethical teachings.
4. Incoherence of the Philosophers
In the Incoherence of the Philosophers, al-Ghazālī famously condemns al-Farabi and Avicenna as unbelievers for their philosophical views. The notion of unbelief has come up thus far in this article, but what does it mean to be an unbeliever from the Islamic perspective? Al-Ghazālī defines unbelief or kufr as follows: “‘unbelief’ is to deem anything the Prophet brought to be a lie” (On the Boundaries of Theological Tolerance: 92). Unbelief is contrasted with its opposite, faith or iman: “‘faith’ is to deem everything he brought to be true” (Ibid.). From this basic definition, it follows:
Every Unbeliever deems one or more of the prophets to be a liar. And everyone who deems one or more of the prophets to be a liar is an Unbeliever. This is the criterion that should be applied evenly across the board. (Ibid. 93)
According to al-Ghazālī, the so-called “Islamic philosophers” subscribe to philosophical theses that are fundamentally incompatible with the religion of Islam; that is, these theses imply that the Prophet Muhammad, in particular, is a liar, and thus put their adherents outside the fold of orthodoxy. Out of the twenty theses he identifies, three in particular conflict fundamentally with Islam: (1) belief in the eternity of the world, (2) belief that God can only know universals and not particulars, and (3) denial of the bodily resurrection. The other seventeen are innovations and are thus heterodox positions from the perspective of orthodox Islam. They do not, strictly speaking, constitute unbelief, and thus one could still technically remain a Muslim while believing one of these innovations. But according to al-Ghazālī, the three doctrines named above:
Do not agree with Islam in any respect. The one who believes them believes that prophets utter falsehoods and that they said whatever they have said by way of [promoting common] utility, to give examples and explanation to the multitudes of created mankind. This is manifest infidelity which none of the Islamic sects have believed. (Incoherence: 226)
In addition to holding heretical beliefs, al-Ghazālī claims that the philosophers are heretics in practice as well, for they:
Have rejected the Islamic duties regarding acts of worship, disdained religious rites pertaining to the offices of prayer and the avoidance of prohibited things, belittled the devotions and ordinances prescribed by the divine law, not halting in the face of its prohibitions and restrictions. (Ibid. 1-2)
With respect to their philosophical views, however, al-Ghazālī does not view the Islamic philosophers as original. In his view, al-Farabi and Ibn Sina do not produce anything philosophically novel; rather, they merely regurgitate and reproduce ancient Greek philosophical views. As he writes: “There is no basis for their unbelief other than traditional, conventional imitation” (Ibid. 2; for more on the philosophers’ imitation or taqlid see Griffel 2005). The charge of unbelief or apostasy against al-Farabi and Avicenna has severe implications; in particular, it allows for them to be killed. As Frank Griffel makes explicit, for al-Ghazālī, “Whoever publicly supports or teaches the three named positions indeed deserves to be killed” (2007: 103).
Al-Ghazālī employs a unique strategy in revealing the philosophers’ “incoherence,” which is that he wants to beat the philosophers at their own game. The philosophers claim to establish their theses based on valid demonstrations with true and certain premises. However, al-Ghazālī aims to show that their demonstrations actually do not meet their own standards for truth, certainty, and validity:
There is neither firm foundation nor perfection in the doctrine they hold; that they judge in terms of supposition and surmise, without verification or certainty; that they use the appearance of their mathematical and logical sciences as evidential proof for the truth of their metaphysical sciences, using [this] as a gradual enticement for the weak in mind. (Incoherence: 4)
While al-Ghazālī does use scripture as a motivation for his own views, his objections to the philosophers do not amount to merely citing scripture that conflicts with a philosophical thesis. Rather, he raises philosophical objections against them with the aim of showing that their views imply problematic consequences, which they should concede given their own logical standards and epistemic position:
I do not enter into [argument] objecting to them, except as one who demands and denies, not as one who claims [and] affirms. I will render murky what they believe in [by showing] conclusively that they must hold to various consequences [of their theories]. (Ibid. 7)
It is important to note that for each Discussion in the Incoherence of the Philosophers, al-Ghazālī considers a variety of proofs for the philosophers’ views, and correspondingly raises many objections against them. Here, we will only consider a portion of these arguments.
a. The Eternity of the World
In the First Discussion of the Incoherence of the Philosophers, al-Ghazālī considers four distinct proofs for the eternity of the world. The thesis of the eternity of the world is objectionable within orthodox Islam, according to al-Ghazālī, because scripture is clear that the world was created ex nihilo. For example, there is a verse in the Qur’an that states: “All it takes, when He wills something is simply to say to it: “Be!” And it is!” (36: 82). What this verse essentially implies is that God can create an existent out of nothing. Of course, this serves as a motivation to refute the philosophers’ proofs for the eternity of the world but does not constitute a refutation on its own, as al-Ghazālī wants to show that the philosophers’ own demonstrations fail.
Let us consider the first proof for the eternity of the world, which he claims is the philosophers’ strongest and most imaginative one. The first proof depends on the fundamental concepts of will and causality (Hourani 1958). Every change (whether physical or mental) requires some cause. For example, a billiard ball is moved by a billiard stick, and the perception of a bear can raise the emotion of fear in a subject. Thus, if God wills some new state of affairs, this must occur due to some cause external to God. Let us suppose, then, that the world was created ex nihilo at the beginning of time and is not eternal. If God created the world in this way, then something must have acted upon his constant, eternal, and unmoved nature so as to move his will to this volition. But we are supposing that nothing exists besides God. If so, then there is nothing outside of God’s will to have such an influence on Him, because there are no causes that exist external to God. This would seem to imply that the world could never exist and that God exists alone eternally without anything else existing alongside Him. But we know that the world does exist. The only option that follows is that the world has existed eternally, as an eternal emanation from the divine essence.
Al-Ghazālī responds by arguing that God’s will can, as it were, postdate the existence of the world so that the world comes into existence at a designated point in time. As such, the world would not have to be eternal. The philosophers (on al-Ghazālī’s reconstruction) would answer to this as follows:
It is impossible for that which necessitates [a thing] to exist with all the conditions of its being necessitating, [all the conditions] of its principles and causes fulfilled, such that nothing at all remains awaited, and then for the necessitated [effect] to be delayed. (Incoherence: 15)
In other words, postdating the existence of the world is impossible because if God has the will for the world to exist, then this volition must have existed eternally. Unless there is an obstacle to delay God’s will from coming to fruition, then there should be no delay in the creation of the world since all the conditions for the world to exist have been met. Thus, the world must be eternal.
Al-Ghazālī objects that the philosophers do not truly know that it is impossible for God’s will to postdate the existence of the world:
It is incumbent on you to set up a demonstrative proof according to the condition of logic that would show the impossibility of this. For, in all of what you have stated, there is nothing but [an expression of] unlikelihood and the drawing of an analogy with our resolve and will, this being false, since the eternal will does not resemble temporal [human] intentions. As regards the sheer deeming of something as unlikely, without demonstrative proof, [this] is not sufficient. (Ibid. 17)
The philosophers claim that a finite and temporal world being created by an eternal will is impossible, but they have not shown the contradiction in God’s will postdating the existence of the world, even without there being an obstacle to his will creating the world earlier. All they have done is provide an analogy between human will and divine will, which is not sufficient for proof.
b. God’s Knowledge: Universals vs. Particulars
In the Thirteenth Discussion of the Incoherence of the Philosophers, al-Ghazālī aims to refute the philosophers’ claim that God only knows universals, but not particulars. This is objectionable within Islam because the Qur’an claims that: “Not an atom’s weight is hidden from Him in the heavens or the earth” (34: 3). According to al-Ghazālī, scripture is clear that God’s knowledge is infinite and all-encompassing, extending to everything that exists and everything that is possible (Moderation in Belief: 104). Again, scripture serves as a motivation for refuting the philosophers on this score, but al-Ghazālī will provide independent objections against their demonstrations.
According to Avicenna, God does not have knowledge of particulars per se, rather, He has knowledge of particulars in a universal manner. This is an implication of Avicenna’s conception of God or the necessary being (Belo 2006). According to Avicenna, God is an intellect that consists of pure thought and activity and is a being wholly distinct from matter and extension. Insofar as God’s essence is thought, this implies that God’s essence is identical with knowledge, since any type of thought that does not qualify as knowledge is not fitting to the perfection of God. Furthermore, the perfection of God’s knowledge requires that God himself is the first object of his knowledge, a reflective act which requires an identity between subject and object of knowledge.
Insofar as God is wholly distinct from matter, this implies that God’s knowledge does not depend on matter or sensory perception in any way, as it does for human beings. As such, God cannot have knowledge of particulars (at least in the way that humans do) because he is not subject to the same physical and temporal processes involved in being a human being, and sensory perception more generally. The example discussed here by Avicenna and al-Ghazālī is that of an eclipse (Belo 2006). We can divide the knowledge an astronomer has of an eclipse into three stages. First, before an eclipse occurs, an astronomer will know that an eclipse is not occurring but will be expecting it. Second, when the eclipse occurs, the astronomer will know that it is occurring. Third, when the eclipse ends, the astronomer will know that the eclipse is a past event. With each stage of the eclipse, there is a corresponding change in the knowledge of the astronomer, and thus a change in the astronomer himself. But this is problematic with respect to God’s knowledge of the eclipse. As al-Ghazālī formulates Avicenna’s view, the claim is that:
If the object known changes, knowledge changes; and if knowledge changes, the knower inescapably changes. But change in God is impossible. (Incoherence: 135)
Since knowing particulars per se would make God subject to change, Avicenna instead maintains that God has knowledge of particulars in a universal manner. This holds in three senses (Belo 2006): First, God’s knowledge is universal in the sense that it is intellectual, and not sensory in any way. Second, God’s knowledge is universal in the sense that his knowledge precedes the objects of his knowledge because he is the cause of their existence. Third, God’s knowledge of particulars extends only to their general qualities. Al-Ghazālī finds this view entirely objectionable from the perspective of orthodox Islam:
This is a principle which they believed and through which they uprooted religious laws in their entirety, since it entails that if Zayd, for example, obeys or disobeys God, God would not know what of his states has newly come about, because He does not know Zayd in his particularity. (Ibid. 136)
Al-Ghazālī agrees that God’s essence cannot admit change, but claims that Avicenna’s analysis of the eclipse example rests on confusion. The essence of Avicenna’s argument, in al-Ghazālī’s view, is that God would undergo change if he knew the particulars of an eclipse because he would move from a state of ignorance to a state of knowledge about the eclipse (in the transition from stages one to two), which would constitute a change in God’s essence. Al-Ghazālī objects that God’s knowledge is singular throughout the process of the changes in the eclipse. That is, God’s knowledge of the eclipse before it exists, when it exists, and after it exists is one and the same knowledge. In al-Ghazālī’s opinion, the differences in the states of the eclipse are relational—that is, in the way God relates to the changes in the eclipse—and thus do not require a change in the intrinsic knowledge or the essence of the knower (God).
c. Bodily Resurrection
In the Twentieth Discussion of the Incoherence of the Philosophers, al-Ghazālī aims to refute the philosophers’ denial of bodily resurrection. The philosophers maintain that the soul is immortal, but deny a bodily resurrection in the afterlife. For al-Ghazālī, this is in clear contradiction with scripture and religious law as well. Al-Ghazālī reconstructs three different ways one might conceive of bodily resurrection, how the philosophers would respond to each type of resurrection, and he provides his analysis of each type of resurrection.
First, one might view the resurrection as “God’s returning the annihilated body back to existence and the returning of the life which had been annihilated” (Incoherence: 216). In this view, both the body and the soul are annihilated upon death, and newly created upon resurrection. The philosophers argue that this view of resurrection is false because the human being that is resurrected will not be identical to the original human being, but merely similar to the original human being. This is because there is not something that continues to exist between the annihilation of the body and the soul, and the human being’s “resurrection,” for “unless there is something that [continues to] exist, there [also] being numerically two things that are similar [but] separated by time, then [the meaning of] the term ‘return’ is not fulfilled” (Incoherence: 216). Al-Ghazālī responds by agreeing with the philosophers that this option does not amount to a resurrection:
For the human is a human not by virtue of his matter and the earth that is in him, since all or most of [his] parts are changed for him through nourishment while he remains that very same individual. For he is what he is by virtue of his spirit or soul. If, then, life or spirit ceases to exist, then the return of what ceases to exist is unintelligible. (Ibid. 216)
Second, one might view the resurrection as occurring when “the soul exists and survives death but that the first body is changed back [into existence] with all its very parts” (Ibid. 215). The philosophers argue that if this option were conceivable, it would constitute a resurrection. However, they argue that this option is impossible. The philosophers raise several objections here. One of them appeals to cannibalism:
If a human eats the flesh of another human, which is customary in some lands and becomes frequent in times of famine, the [bodily] resurrection of both together becomes impossible because one substance was the body of the individual eaten and has become, through eating, [part of] the body of the eater. And it is impossible to return two souls to the same body. (Ibid. 217)
In short, the philosophers claim that it is impossible to resurrect the body with its original matter.
Third, one might view the resurrection as occurring when the soul is returned to a body, regardless of whether that body is constituted by its original matter, for: “The [person] resurrected would be that [identical] human inasmuch as the soul would be that [same] soul” (Ibid. 215). The philosophers argue that this option does not work either because the individual resurrected “would not be a human unless the parts of his body divide into flesh, bone, and [the four] humors” (Ibid. 218). In other words, the philosophers claim that the raw materials of the earth, for example, wood and iron, are not sufficient to reconstitute a body: “it is impossible to return the human and his body [to life] from wood or iron” (Ibid.).
Al-Ghazālī’s response to the philosophers’ dismissal of the second and third types of resurrection is to argue that the body does not matter at all in the identity of the individual resurrected. Rather, it is the continuity of the soul that matters:
This is possible by returning [the soul] to the body, whatever body this might be, whether [composed] of the matter of the first body [or from that] of another or from matter whose creation commences anew. For [an individual] is what he is by virtue of his soul, not his body, since the parts of the body change over for him from childhood to old age through being emaciated, becoming fat, and [undergoing] change of nourishment. His temperament changes with [all] this, while yet remaining that very same human. (Ibid. 219)
According to al-Ghazālī, during the life of a particular human being, the body will undergo a variety of changes while the person remains the same, due to the continued existence of their soul. In the same way, it does not matter from which materials the body is reconstituted when the human being is resurrected, for if the same soul exists in the afterlife as it did in the previous life in the new body, then the individual is the same.
d. Causation and Occasionalism
In the Seventeenth Discussion of the Incoherence of the Philosophers, al-Ghazālī aims to refute the philosophers’ claim that there is a necessary connection between cause and effect. This view is problematic with respect to Islam because it conflicts with the existence of miracles and God’s ability to change the course of nature at will. Technically, this is one of the theses of the philosophers that al-Ghazālī classifies as an innovation rather than as unbelief.
According to Avicenna’s conception of (efficient) causation, there is a necessary connection between cause and effect (the necessitation thesis—see Richardson 2020). Avicenna’s argument for the necessitation thesis is in part based on his modal metaphysics of the necessary and the possible or contingent. Avicenna distinguishes between the necessary and the possible as follows: “That which in itself is a necessary existence has no cause, while that which in itself is a possible existent has a cause” (Metaphysics of the Healing I, 6: 30). According to Avicenna, the only necessary being is God because existence is a part of the essential nature of God. There is nothing external to God, a cause, that brings God into existence. All other beings, however, are merely possible in themselves because existence is not a part of their essence. Because they are possible, something external to them, a cause, is required to bring them into existence; otherwise they would not exist. As such, there must be an external cause that necessitates the existence of possible things. As Avicenna explains it in his discussion of ontological priority in the Metaphysics of the Healing:
The existence of the second is from the first, so that it [derives] from the first the necessary existence which is neither from nor in itself, having in itself only possibility—allowing [that is] that the first is such that, as long as it exists, it follows as a necessary consequence of its existence that it is the cause of the necessary existence of the second—then the first is prior in existence to the second. (Metaphysics of the Healing IV, 1: 126)
Al-Ghazālī rejects the necessitation thesis:
The connection between what is habitually believed to be a cause and what is habitually believed to be an effect is not necessary, according to us. But [with] any two things, where “this” is not “that” and “that” is not “this” and where neither the affirmation of the one entails the affirmation of the other nor the negation of the one entails negation of the other, it is not a necessity of the existence of the one that the other should exist, and it is not a necessity of the nonexistence of the one that the other should not exist—for example, the quenching of thirst and drinking, satiety and eating, burning and contact with fire, light and the appearance of the sun, death and decapitation, healing and the drinking of medicine, the purging of the bowels and the using of a purgative, and so on to [include] all [that is] observable among connected things in medicine, astronomy, arts, and crafts. Their connection is due to the prior decree of God, who creates them side by side, not to its being necessary in itself, incapable of separation. On the contrary, it is within [divine] power to create satiety without eating, to create death without decapitation, to continue life after decapitation, and so on to all connected things. The philosophers denied the possibility of [this] and claimed it to be impossible. (Incoherence: 166)
According to al-Ghazālī, there is no necessary connection between cause and effect, because one can affirm the existence of the cause without having to affirm the existence of the effect, and one can deny the existence of the cause without having to deny the existence of the effect.
Consider the example of the burning of cotton when it comes in contact with fire. According to the philosophers, there is a necessary connection between the fire and the burning of the cotton. The fire is the cause or agent that necessitates the effect of burning, and denying this causal relation would result in a contradiction. According to al-Ghazālī, however, the connection between fire and burning is not one of necessity; rather, God has created these two events concomitantly or side by side. Our repeated, habitual perception of these events occurring side by side leads us to believe that there is a genuine causal relation:
But the continuous habit of their occurrence repeatedly, one time after another, fixes unshakably in our minds the belief in their occurrence according to past habit. (Ibid. 170)
Here, al-Ghazālī anticipates Hume’s theory of causation, according to which causation is nothing more than a constant conjunction between two events (On the differences between Hume and al-Ghazālī, see Moad 2008).
On the standard reading, al-Ghazālī subscribes to occasionalism, according to which creatures do not have any causal power; rather, God is the sole source of causal power in the world (Moad 2005). As al-Ghazālī writes:
The one who enacts the burning by creating blackness in the cotton, [causing] separation in its parts, and making it cinder or ashes is God, either through the mediation of His angels or without mediation. As for fire, which is inanimate, it has no action. (Incoherence: 167)
On behalf of the philosophers, al-Ghazālī raises a unique objection against his own theory of causality, namely, that it entails a type of radical skepticism (Dutton 2001). If one denies the necessary connection between cause and effect, then one cannot know what events God—through his unrestrained freedom—will create as conjoined side by side:
If someone leaves a book in the house, let him allow as possible its change on his returning home into a beardless slave boy—intelligent, busy with his tasks—or into an animal; or if he leaves a boy in his house, let him allow the possibility of his changing into a dog; or [again] if he leaves ashes, [let him allow] the possibility of its change into musk; and let him allow the possibility of stone changing into gold and gold into stone. (Ibid. 170)
As Blake Dutton formulates it, the radical skepticism of al-Ghazālī’s theory of causality runs as follows:
Given that we must rely on causal inferences for our knowledge of the world and our ability to navigate our way through it, Ghazali’s position entails that we must face the world with no expectations and adopt a position of skeptical uncertainty. (Dutton 2001: 39)
Al-Ghazālī’s response is to argue that although such transformations are possible, God institutes a habitual course of nature: “The idea is that although God is free to bring about whatever he desires in any order at all, the actual sequence of events that he creates in the world is regular” (Ibid.). As such, we can rely on our normal inferences about how events unfold in the world.
e. Averroes’ Response to the Charge of Unbelief
Averroes (Ibn Rushd) provides a systematic response to the Incoherence of the Philosophers in his Incoherence of the Incoherence. It is beyond the scope of this article to discuss Averroes’ specific responses to al-Ghazālī’s criticisms. But in his Decisive Treatise, Averroes provides a philosophical and legal defense of the compatibility of philosophy with Islam, and of al-Farabi and Avicenna as Islamic philosophers, that is worth briefly examining.
According to Averroes, philosophy is obligatory according to scripture and law, because there are a variety of verses that call for a rational reflection on the nature of the world, which amounts to philosophical activity. Furthermore, Averroes claims that studying ancient philosophers is obligatory for the qualified Muslim—the one with intellect and moral virtue—because it is necessary to see what philosophical progress has already been made, in order to make further philosophical developments.
However, there seems to be a conflict between philosophical truth and religious truth when one looks at scripture. According to Averroes, however, there are no distinct standards of truth, for truth cannot contradict truth. When it comes to interpreting scripture from a philosophical perspective, then, one must engage in appropriate forms of allegorical interpretation in order for scripture to conform with demonstrated truths. An allegorical interpretation must be offered for metaphysical claims in scripture whose literal meaning does not conform to a demonstrated metaphysical truth. For example, a verse of scripture that implies that God has hands is problematic because God is not corporeal. However, the reference to God’s hands can be allegorically interpreted as an indication of God’s power. Engaging in allegorical interpretation, however, is only appropriate for philosophers who are capable of logical demonstrations; it is not suitable for the masses or even the theologians.
The philosophers who do engage in allegorical interpretation have some leeway with respect to making errors because there is no consensus amongst scholars about metaphysical issues in the Qur’an in particular. Indeed, it seems that there cannot be a consensus—or at least it is difficult to establish one—because scholars hold to the principle that interpretations of esoteric and theoretical matters should not be divulged to others. A philosopher, then, can assent to (what may turn out to be) a false allegorical interpretation and still be a Muslim. Applying this view of unbelief to al-Farabi and Avicenna, Averroes argues that al-Ghazālī’s condemnation of al-Farabi and Avicenna can only be tentative because there is no consensus amongst the scholars about such theoretical matters as the eternity of the world, God’s knowledge, and bodily resurrection.
5. Revival of the Religious Sciences
The Revival of the Religious Sciences constitutes al-Ghazālī’s magnum opus. The primary purpose of the Revival of the Religious Sciences is ethical and spiritual in nature, as al-Ghazālī aims to instruct the reader in both theory and practice that will lead to spiritual enlightenment and an experiential knowledge of God. This section will briefly consider al-Ghazālī’s views on the heart, intellect, Sufism, and theodicy that arise across the Revival of the Religious Sciences.
a. The Heart
The non-physical heart, according to al-Ghazālī, is a “subtle tenuous substance of an ethereal spiritual sort, which is connected with the physical heart” (Marvels of the Heart: 6). The heart in this sense constitutes “the real essence of man” (Ibid.). Furthermore, it is through the perception of the heart that human beings can understand reality, by reflecting reality in the mirror of the heart:
In its relationship to the real nature of intelligibles, it is like a mirror in its relationship to the forms of changing appearances. For even as that which changes has a form, and the image of that form is reflected in the mirror and represented therein, so also every intelligible has its specific nature, and this specific nature has a form that is reflected and made manifest in the mirror of the heart. Even as the mirror is one thing, the forms of individuals another, and the representation of their image in the mirror another, there being thus three things in all, so here, too, there are three things: the heart, the specific natures of things, and the representation and presence of these in the heart. (Ibid. 35)
Al-Ghazālī identifies five different reasons why a heart may fail to understand reality, that is, to perceive the real nature of intelligibles (Ibid. 36-38). First, there may be an imperfection in the constitution of the heart’s mirror. Second, the mirror of the heart may be dull due to acts of disobedience, which preclude the “purity and cleanness of heart.” Third, the heart may not be directed to reality because the person is excessively preoccupied with his appetites and worldly pursuits (such as wealth and livelihood). Fourth, there is a veil that can block even the obedient person who has conquered his appetites and devoted himself to understanding reality. This veil consists of theological and legal beliefs that one has accepted blindly from one’s youth. These beliefs can “harden the soul,” constituting a veil between the person and their perception of reality. Fifth, in order to obtain knowledge of the unknown, a person must be able to place his prior knowledge in a “process of deduction” to find the “direction of the thing sought”:
For the things that are not instinctive, which one desires to know, cannot be caught save in the net of acquired knowledge; indeed, no item of knowledge is acquired except from two preceding items of knowledge that are related and combined in a special way, and from their combination a third item of knowledge is gained. (Ibid. 38)
In other words, logic and knowledge of syllogisms in particular are necessary to comprehend the true nature of the intelligible.
The primary upshot is that in order for a person to understand reality they must, broadly construed, be in a state of obedience and their heart must be purified. According to al-Ghazālī, “the purpose of improvement is to achieve the illumination of faith in it; I mean the shining of the light of knowledge [of God]” (Ibid. 41). The illumination of faith comes in three degrees, according to al-Ghazālī: faith based on blind imitation (what the masses possess), faith based on partial logical reasoning (what the theologians possess), and faith based on the light of certainty (what the Sufis possess).
To illustrate the difference between these degrees of faith, al-Ghazālī offers the following example. Suppose that Zayd is in a house. One way of arriving at the belief that Zayd is in the house is to be told that he is in the house by someone trustworthy and truthful. This is blind imitation and the lowest level of faith. The second way of arriving at the belief that Zayd is in the house is by hearing the voice of Zayd from inside the house. This is a stronger form of faith, as it is based on a type of experience of Zayd, and not mere hearsay. But this type of belief requires logical reasoning, as there is an inference made from hearing Zayd’s voice to the conclusion that he is in the house. The third way of arriving at the belief that Zayd is in the house is to go inside the house and see Zayd within it. This is experiential knowledge that possesses the light of certainty. This is the level of belief and faith that should be the goal of the believer (see section c. on Sufism and Ethics, below).
b. The Intellect
According to al-Ghazālī, the seat of knowledge is the spiritual heart. The intellect is a faculty of the heart, which he defines as “an expression for the heart in which there exists the image of the specific nature of things” (Marvels of the Heart: 35). In the Book of Knowledge, al-Ghazālī distinguishes four different senses of the intellect and divides it into various categories of knowledge. Before explaining these different senses, however, it will be useful to discuss al-Ghazālī’s views on the original condition (fiṭra) of the human being.
In the Deliverance from Error, al-Ghazālī writes that:
Man’s essence, in his original condition, is created in blank simplicity without any information about the “worlds” of God Most High…. Man gets his information about the “worlds” by means of perception. Each one of his kinds of perception is created in order that man may get to know thereby a “world” of the existents—and by “worlds” we mean the categories of existing things. (Deliverance: 83)
Here, al-Ghazālī seems to espouse the Lockean tabula rasa claim, namely, that the mind is initially devoid of any information, and that all information is ultimately acquired through perception. On this view, then, the intellect would acquire its information through empirical means, that is, through perception. This would make al-Ghazālī close to Avicenna on this matter; although there is some scholarly debate as to whether primary truths or axiomatic knowledge is acquired via perception. Nonetheless, according to Alexander Treiger (2012), al-Ghazālī’s four-part division of the intellect is partly indebted to Avicenna’s conception of the intellect. The first three stages of intellect in particular correspond to Avicenna’s conceptions of the material intellect, the intellect in habitu, and the actual intellect. For Avicenna, the material intellect is devoid of content, consisting in only a disposition to acquire information; the intellect in habitu is the stage of intellection where primary truths are abstracted from sensory experience; and the actual intellect is the stage of intellection where the intellect has access to and understanding of the intelligible.
According to al-Ghazālī, the order of perceptual capacities created in man is touch, sight, hearing, taste, discernment (around the age of seven), and then the intellect, which perceives things “not found in the previous stages” (Ibid. 83). On the first meaning of the intellect, the intellect “is the attribute that differentiates human beings from all other animals and affords them the ability to apprehend the speculative sciences and to organize the subtle rational disciplines” (Book of Knowledge: 253). In other words, the intellect is a distinctive quality of human beings to understand theoretical matters, which is not shared by other animals.
On the second meaning of the intellect, the intellect “is the science that comes or types of knowledge that come into being in the disposition of the child” (Ibid. 255). More specifically, these types of knowledge concern possibility, necessity, and impossibility, such as the knowledge that “two is more than one, and that one person cannot be in two places at the same time.” According to al-Ghazālī, this type of axiomatic knowledge does not come about through blind imitation or instruction but is an endowment to the soul. This is different from acquired knowledge, which is acquired through learning and deduction (Marvels of the Heart: 45). Furthermore, axiomatic knowledge enters the disposition of the child after sensory perception has been created in the child.
On the third meaning of the intellect, the intellect “is the sciences derived through the observation of events and circumstances as they arise” (Ibid.). In other words, the intellect at this stage allows for the rational processing of empirical events. On the fourth meaning of the intellect, the intellect is “the capacity to discern the consequences of one’s actions” (Ibid.). More specifically, it includes the capacity to overcome one’s own desires that motivate one towards immediate pleasures. This sense of intellect is also what distinguishes human beings from animals. This fourth sense of the intellect is “the final fruit and its ultimate goal” (Book of Knowledge: 256). (For alternative conceptions of the intellect in al-Ghazālī’s work, see Treiger 2012.)
c. Sufism and Ethics
According to al-Ghazālī, a central goal of Sufism is not a discursive understanding of God, but an experiential one. The concept al-Ghazālī employs for experiencing the divine is dhawq or fruition experience:
It became clear to me that their [the Sufis] most distinctive characteristic is something that can be attained, not by study, but rather by fruitful experience and the state of ecstasy and “the exchange of qualities.” How great a difference there is between your knowing the definitions and causes and conditions of health and satiety and your being healthy and sated! And how great a difference there is between your knowing the definition of drunkenness…and your actually being drunk! (Deliverance: 78)
Dhawq is a type of knowledge-by-acquaintance, an unmediated experience of the divine (gnosis), as opposed to propositional knowledge of God that would depend on prior beliefs and inferences. Al-Ghazālī claims that this is the highest type of knowledge one can attain, superior even to proofs of God and faith. Ultimately, to verify the Sufi claim that an experiential knowledge of God is possible, one must embark on the Sufi path.
Broadly construed, the means to attaining dhawq is purifying the heart, that is, ridding it of spiritual diseases and replacing them with virtues. As we have seen, for al-Ghazālī, the spiritual heart is the seat of perception, intelligence, and knowledge, and the primary function of the heart is “the acquisition of wisdom and gnosis, which is the specific property of the human soul which distinguishes man from animals” (Disciplining the Soul: 46-47). However, the heart has to be in the right condition to acquire knowledge generally. It must be in a purified and disciplined state to acquire experiential knowledge of God. To do so, the believer must engage in a variety of spiritual practices.
In Marvels of the Heart, al-Ghazālī explains, in general, the method of the Sufis for purifying the heart. First, the believer must cut off ties with the present world by “taking away concern for family, possessions, children, homeland, knowledge, rule, and rank” (Marvels of the Heart: 54). Then, he must withdraw into a private place of seclusion in order to fulfill the obligatory and supererogatory religious duties: “He must sit with an empty heart” and “strive [such] that nothing save God, the Exalted, shall come into his mind” (Ibid.) Next, in his seclusion, he must repeatedly recite the name of God (Allah) with his heart fixed on God, until “the form and letters of the expression and the very appearance of the word is effaced from the heart and there remains present in it nought save the ideal meaning” (Marvels of the Heart: 54-55). Engaging in these practices seems to be a necessary, but not sufficient condition for an unveiling of the divine realities. According to al-Ghazālī, it is ultimately up to God’s mercy to grant the believer gnosis if, again, he has purified his heart: “By what he has done thus far he has exposed himself to the breezes of God’s mercy, and it only remains for him to wait for such mercy” (Marvels of the Heart: 55).
In Disciplining the Soul, al-Ghazālī lays out an ethical program for cultivating the good character that is necessary for acquiring gnosis. Here we see the influence of ancient Greek virtue ethics on al-Ghazālī’s ethical thought. According to al-Ghazālī, a character trait “is a firmly established condition of the soul, from which actions proceed easily without any need for thinking or forethought” (Disciplining the Soul: 17). Good character traits lead to beautiful acts, whereas bad character traits lead to ugly acts. Character, however, is not the same as action; rather, it is a “term for the condition and inner aspect of the soul” (Ibid. 18).
According to al-Ghazālī, the fundamental character traits are wisdom, courage, temperance, and justice, from which a variety of other character traits derive (Ibid. 18-19). According to al-Ghazālī, wisdom is the condition of the soul that allows it to distinguish truth from falsity in all acts of volition. Justice is the condition of the soul that controls desire through wisdom. Courage is when the intellect exerts control over anger or the irascible faculty. Temperance is the control of the appetitive faculty through the intellect and religious law. The purpose of cultivating this character is ultimately to cut off the love of the world in order to make space for the love of God in the heart (Ibid. 33).
The key to curing the diseases of the heart, according to al-Ghazālī, ultimately comes down to renouncing one’s desires. Al-Ghazālī writes that the essence of self-discipline consists in the soul “not taking pleasure in anything which will not be present in the grave” (Ibid. 60). The believer should restrict the fulfillment of their desires to the absolute necessities of life (for example, food, marriage, and clothing), and occupy their time in devotion to God. The believer who is solely occupied with the remembrance of God in this way is one of the “Truthful Saints” (Ibid.)
d. Theodicy: The Best of All Possible Worlds
In Faith in Divine Unity and Trust in Divine Providence, al-Ghazālī makes a remarkable claim (anticipating Leibniz’s optimism) about the omnipotence and benevolence of God, namely, that God created the best of all possible worlds: “Nor is anything more fitting, more perfect, and more attractive within the realm of possibility” (Faith in Divine Unity: 45-6). (Similarly, in Principles of the Creed, al-Ghazālī writes that everything “proceeds forth from His justice in the best, most perfect, most complete, and most equitable way” [Principles of the Creed: 14].) This constitutes a theodicy for the existence of evil in the actual world. Here is the immediate context for this claim:
For everything which God Most High distributes among His servants: care and an appointed time, happiness and sadness, weakness and power, faith and unbelief, obedience and apostasy—all of it is unqualifiedly just with no injustice in it, true with no wrong infecting it. Indeed, all this happens according to a necessary and true order, according to what is appropriate as it is appropriate and in the measure that is proper to it; nor is anything more fitting, more perfect, and more attractive within the realm of possibility. For if something were to exist and remind one of the sheer omnipotence [of God] and not of the good things accomplished by His action, that would utterly contradict [God’s] generosity, and be an injustice contrary to the Just One. And if God were not omnipotent, He would be impotent, thereby contradicting the nature of divinity. (Faith in Divine Unity: 45-6)
According to al-Ghazālī, the world and existence more generally reflect God’s complete benevolence and justice. God’s maximal power that is manifested in the creation of the existing world is balanced by his maximal benevolence. As Ormsby puts it in his seminal study of this text, al-Ghazālī’s arguments for having trust in God in part depends on divine providence and the actual world being the best of all possible worlds:
The aspirant to trust in God must therefore learn to see the world as it really is—not as the product of blind chance or of any series of causes and effects, nor as the arena of his own endeavors, but as the direct expression of the divine will and wisdom, down to the least particular. Trust in God presupposes the recognition of the perfect Rightness of the actual. (1984: 43)
Al-Ghazālī, however, received much criticism for his best-of-all-possible-worlds thesis from later theologians. Broadly construed, three primary and interrelated criticisms were raised (Ormsby 1984; cf. Ogden 2016). The first criticism is that the actual world is not the best of all possible worlds, as it could certainly be improved upon in terms of reducing suffering and increasing goodness. The second criticism is that al-Ghazālī was following the views of the philosophers in thinking that the world is created by a natural necessity, in that God’s creation necessarily emanates from his essence; relatedly, since God is perfect, the creation that follows from his essence is necessarily perfect as well. This conflicts with the view that God has freedom and decrees what he wills. The third criticism is that al-Ghazālī’s best-of-all-possible-worlds thesis is dangerously close to the Mu’tazilite doctrine according to which God is obligated to do the best for creation. This conflicts with the Ash’arite doctrine according to which God is not obligated to create at all, because he is omnipotent and no constraints can be placed on Him.
6. Islamic Philosophy after Al-Ghazālī
Al-Ghazālī, with his writing of the Incoherence of the Philosophers and his condemnation of al-Farabi and Avicenna, is often charged with causing a decline of philosophy and science in the Islamic world. After al-Ghazālī, the thought goes, Islamic intellectuals abandoned philosophical and scientific inquiry in favor of mysticism, theology, and the traditional Islamic sciences. As Steven Weinberg writes:
Alas, Islam turned against science in the twelfth century. The most influential figure was the philosopher Abu Hamid al-Ghazālī, who argued in The Incoherence of the Philosophers against the very idea of laws of nature…after al-Ghazālī, there was no more science worth mentioning in Islamic countries. (2007)
In Montgomery Watt’s view, such a charge against al-Ghazālī is unjustified because Islamic philosophy was arguably already on the decline after the death of Avicenna:
It is tempting to conclude that his attack on the philosophers had been so devastating that philosophy was killed off; but such a conclusion is not justified. It is true that there were no outstanding philosophers in the east after 1100 who stood within the ‘pure’ Aristotelian and Neoplatonic tradition; but it is also true that the last great philosopher there, Avicenna, had died in 1037, twenty years before al-Ghazālī was born; and so the decline of philosophy may have begun long before the Tahāfut [Incoherence of the Philosophers] appeared. (1962: 91)
According to other scholars, however, it is simply false that there was a decline in philosophy after the appearance of the Incoherence of the Philosophers. While the Incoherence of the Philosophers may have convinced certain orthodox thinkers to steer clear of philosophy and even encouraged some persecution, the Islamic philosophical tradition lived on. As Frank Griffel argues: “If al-Ghazālī tried to establish thought-police in Islam, he remained unsuccessful. There was simply no Inquisition in Islam” (2007: 158). In addition to Averroes, who wrote the Incoherence of the Incoherence, there is a rich tradition of post-classical Islamic philosophy, full of a variety of Islamic philosophers who have been neglected in the standard story of Islamic philosophy (for example, Suhrawardī and the illuminationist school of thought). As such, while al-Ghazālī’s Incoherence of the Philosophers was influential, it arguably did not put an end to philosophy in the Islamic world.
7. References and Further Reading
a. Primary Sources
Al-Ghazālī. 1980. Deliverance from Error. Translated by R.J. McCarthy. Louisville: Fons Vitae.
Al-Ghazālī. 1998. The Niche of Lights: A Parallel English-Arabic Text. Edited and Translated by D. Buchman. Provo: Brigham Young University Press.
Al-Ghazālī. 2000. The Incoherence of the Philosophers: A Parallel English-Arabic Text. Edited and Translated by M.E. Marmura. 2nd ed. Provo: Brigham Young University Press.
Al-Ghazālī. 2001. Kitāb al-Tawḥid wa’l-Tawakkul (Faith in Divine Unity and Trust in Divine Providence) [Book XXXV]. Translated by D.B. Burrell. Louisville: Fons Vitae.
Al-Ghazālī. 2002. On the Boundaries of Theological Tolerance in Islam: Abu Hamid al-Ghazālī’s Faysal al-Tafriqa. Translated by S.A. Jackson. Karachi: Oxford University Press.
Al-Ghazālī. 2010. Kitāb sharḥ ‘ajā’ib al-qalb (The Marvels of the Heart) [Book XXI]. Translated by W.J. Skellie. Louisville: Fons Vitae.
Al-Ghazālī. 2013. Al-Ghazali’s Moderation in Belief. Translated by A.M. Yaqub. Chicago: University of Chicago Press.
Al-Ghazālī. 2015. Kitāb al-‘ilm (The Book of Knowledge) [Book I]. Translated by K. Honerkamp. Louisville: Fons Vitae.
Al-Ghazālī. 2016. Kitāb qawā‘id al-‘aqā’id (The Principles of the Creed) [Book II]. Translated by K. Williams. Louisville: Fons Vitae.
Al-Ghazālī. 2016. On Disciplining the Soul and On Breaking the Two Desires: Books XXII and XXIII of the Revival of the Religious Sciences. Translated by T.J. Winter. Cambridge: The Islamic Texts Society.
Averroes. 1954. Averroes’ Tahafut Al-Tahafut (The Incoherence of the Incoherence). Translated by S. van den Bergh. 2 vols. London: Luzac.
Averroes. 2001. The Book of the Decisive Treatise Determining the Connection Between Law and Wisdom. Edited and Translated by C. Butterworth. Provo: Brigham Young University Press.
Avicenna. 2005. The Metaphysics of The Healing: A Parallel English-Arabic Text. Edited and Translated by M.E. Marmura. Provo: Brigham Young University Press.
b. Secondary Sources
Abrahamov, Binyamin. 1988. “Al-Ghazālī’s Theory of Causality.” Studia Islamica 67: 75-98.
Abrahamov, Binyamin. 1993. “Al-Ghazālī’s Supreme Way to Know God.” Studia Islamica 77: 141-168.
Al-Allaf, Mashhad. 2006. “Al-Ghazālī on Logical Necessity, Causality, and Miracles.” Journal of Islamic Philosophy 2 (1): 37-52.
Albertini, Tamara. 2005. “Crisis and Certainty of Knowledge in Al-Ghazālī and Descartes.” Philosophy East and West 55: 1-14.
Azadpur, Mohammad. 2003. “Unveiling the Hidden: On the Meditations of Descartes & al-Ghazālī.” In The Passions of the Soul: A Dialogue Between Phenomenology and Islamic Philosophy, edited by Anna-Teresa Tymieniecka, 219-240. Kluwer.
Belo, C. 2006. “Averroes on God’s Knowledge of Particulars.” Journal of Islamic Studies 17 (2): 177-199.
Burrell, David B. 1987. “The Unknowability of God in Al-Ghazali.” Religious Studies 23 (2): 171-182.
Campanini, Massimo. 2018. Al-Ghazali and the Divine. New York: Routledge.
Dutton, Blake. 2001. “Al-Ghazālī on Possibility and the Critique of Causality.” Medieval Philosophy and Theology 10 (1): 23-46.
Ferhat, Loumia. 2020. “Al-Ghazālī’s Heart as a Medium of Light: Illumination and the Soteriological Process.” Journal of Islamic Ethics 4 (1-2): 201-222.
Götz, Ignacio. 2003. “The Quest for Certainty: Al-Ghazālī and Descartes.” Journal of Philosophical Research 28: 1–22.
Griffel, Frank. 2001. “Toleration and Exclusion: Al-Shāfiʾī and al-Ghazālī on the Treatment of Apostates.” Bulletin of the School of Oriental and African Studies, University of London 64 (3): 339-354.
Griffel, Frank. 2005. “Taqlîd of the Philosophers. Al-Ghazâlî’s Initial Accusation in the Tahâfut.” In Ideas, Images, and Methods of Portrayal. Insights into Arabic Literature and Islam, edited by S. Günther, 253-273. Leiden: Brill.
Griffel, Frank. 2009. al-Ghazālī’s Philosophical Theology. New York: Oxford University Press.
Griffel, Frank. 2011. “The Western Reception of al-Ghazālī’s Cosmology from the Middle Ages to the 21st Century.” Dîvân: Disiplinlerarası Çalışmalar Dergisi/Dîvân: Journal of Interdisciplinary Studies 16: 33-62.
Griffel, Frank. 2012. “Al-Ghazālī’s Use of “Original Human Disposition” (Fiṭra) and Its Background in the Teachings of Al-Fārābī and Avicenna.” The Muslim World 102 (1): 1-32.
Griffel, Frank. 2021. The Formation of Post-Classical Philosophy in Islam. New York: Oxford University Press.
Hadisi, Reza. 2022. “Ghazālī’s Transformative Answer to Scepticism.” Theoria 88 (1): 109-142.
Hasan, Ali. 2013. “Al-Ghazali and Ibn Rushd (Averroes) on Creation and the Divine Attributes.” In Models of God and Alternative Ultimate Realities, edited by Jeanine Diller & Asa Kasher, 141-156. The Netherlands: Springer.
Hesova, Zora. 2012. “The Notion of Illumination in the Perspective of Ghazali’s Mishkāt-al-Anwār.” Journal of Islamic Thought and Civilization 2: 65-85.
Hourani, George. 1958. “The Dialogue Between Al-Ghazālī and the Philosophers on the Origin of the World.” Muslim World 48: 183-191.
Kukkonen, Taneli. 2000. “Possible Worlds in the Tahâfut al-Falâsifa: Al-Ghazālī on Creation and Contingency.” Journal of the History of Philosophy 38: 479-502.
Kukkonen, Taneli. 2010. “Al-Ghazālī’s Skepticism Revisited.” In Rethinking the History of Skepticism: The Missing Medieval Background, edited by Henrik Lagerlund, 103-129. Leiden: Brill.
Kukkonen, Taneli. 2010. “Al-Ghazālī on the Signification of Names.” Vivarium 48 (1/2): 55-74.
Kukkonen, Taneli. 2012. “Receptive to Reality: Al-Ghazālī on the Structure of the Soul.” The Muslim World 102: 541-561.
Kukkonen, Taneli. 2016. “Al-Ghazālī on the Origins of Ethics.” Numen 63 (2/3): 271-298.
Lewes, George Henry. 1970. The Biographical History of Philosophy [originally published 1857]. London: J.W. Parker & Son.
Marmura, Michael E. 1965. “Ghazālī and Demonstrative Science.” Journal of the History of Philosophy 3: 183-204.
Marmura, Michael E. 1981. “Al-Ghazālī’s Second Causal Theory in the 17th Discussion of His Tahâfut.” In Islamic Philosophy and Mysticism, edited by Parviz Morewedge, 85-112. Delmar: Caravan Books.
Marmura, Michael E. 1995. “Ghazālīan Causes and Intermediaries.” Journal of the American Oriental Society 115: 89-100.
Martin, Nicholas. 2017. “Simplicity’s Deficiency: Al-Ghazālī’s Defense of the Divine Attributes and Contemporary Trinitarian Metaphysics.” Topoi 36 (4): 665-673.
Menn, Stephen. 2003. “The Discourse on the Method and the Tradition of Intellectual Autobiography.” In Hellenistic and Early Modern Philosophy, edited by Jon Miller and Brad Inwood, 141-191. Cambridge: Cambridge University Press.
Moad, Edward Omar. 2005. “Al-Ghazali’s Occasionalism and the Natures of Creatures.” International Journal for Philosophy of Religion 58 (2): 95-101.
Moad, Edward Omar. 2007. “Al-Ghazali on Power, Causation, and Acquisition.” Philosophy East and West 57 (1): 1-13.
Moad, Edward Omar. 2008. “A Significant Difference Between Al-Ghazālī and Hume on Causation.” Journal of Islamic Philosophy 3: 22-39.
Moad, Edward Omar. 2009. “Comparing Phases of Skepticism in al-Ghazālī and Descartes: Some First Meditations on Deliverance from Error.” Philosophy East and West 59 (1): 88-101.
Naumkin, V.V. 1987. “Some Problems Related to the Study of Works by al-Ghazālī.” In Ghazālī, la raison et le miracle, 119-124. Paris: UNESCO.
Ogden, Stephen R. 2016. “Problems in Al-Ghazālī’s Perfect World.” In Islam and Rationality, edited by F. Griffel, 54-89. Leiden: Brill.
Ormsby, Eric Linn. 1984. Theodicy in Islamic Thought. The Dispute over al-Ghazālī’s ‘Best of All Possible Worlds.’ Princeton: Princeton University Press.
Özel, Aytekin. 2008. “Al-Ghazālī’s Method of Doubt and its Epistemological and Logical Criticism.” Journal of Islamic Philosophy 4: 69-76.
Quasem, Muhammad Abul. 1974. “Al-Ghazali’s Rejection of Philosophic Ethics.” Islamic Studies 13 (2): 111-127.
Richardson, Kara. 2015. “Causation in Arabic and Islamic Thought.” In The Stanford Encyclopedia of Philosophy (Winter 2020 Edition), edited by Edward N. Zalta. https://plato.stanford.edu/archives/win2020/entries/arabic-islamic-causation/.
Riker, Stephen. 1996. “Al-Ghazālī on Necessary Causality in ‘The Incoherence of the Philosophers.’” The Monist 79 (3): 315-324.
Ruddle-Miyamoto, Akira O. 2017. “Regarding Doubt and Certainty in al-Ghazālī’s Deliverance from Error and Descartes’ Meditations.” Philosophy East and West 67 (1): 160-176.
Treiger, Alexander. 2007. “Monism and Monotheism in al-Ghazālī’s Mishkāt al-Anwār.” Journal of Qur’anic Studies 9 (1): 1-27.
Treiger, Alexander. 2012. Inspired Knowledge in Islamic Thought. Al-Ghazālī’s Theory of Mystical Cognition and its Avicennian Foundation. London and New York: Routledge.
Van Ess, Josef. 2018. “Quelques Remarques Sur le Munqid min al-dalal.” In Kleine Schriften Vol. 3, edited by Hinrich Biesterfeldt, 1526-1537. Leiden: Brill.
Watt, W. Montgomery. 1962. Islamic Philosophy and Theology. Edinburgh: Edinburgh University Press.
Watt, W. Montgomery. 1963. Muslim Intellectual: A Study of al-Ghazālī. Edinburgh: Edinburgh University Press.
Weinberg, Steven. 2007. “A Deadly Certitude.” Times Literary Supplement, January 17.
Wilson, Catherine. 1996. “Modern Western Philosophy.” In History of Islamic Philosophy, edited by Seyyed Hossein Nasr and Oliver Leaman. London: Routledge.
Zamir, Syed Rizwan. 2010. “Descartes and al-Ghazālī: Doubt, Certitude and Light.” Islamic Studies 49 (2): 219-251.
Many philosophers and physicists claim that time has an arrow that points from the past to the future. The Roman poet Ovid alluded to this one-way property of time when he said, “Time itself glides on with constant motion, ever as a flowing river.”
Experts who accept the existence of time’s arrow divide into two broad camps on the question of the arrow’s foundation. Members of the intrinsic camp say time’s arrow is an intrinsic feature of time itself, a property it has on its own independently of what physical processes there are. These persons are apt to describe the arrow as an uninterrupted passage or flow from the past to the future. Members of the extrinsic camp say the arrow is not intrinsic to time itself but instead is all the physical processes that happen to go regularly and naturally in only one direction.
If the extrinsic position were correct, one might naturally expect the underlying laws of physics to reveal why all these macroscopic processes go regularly in only one of the two time directions; but the laws do not—at least all the significant fundamental ones do not. A normal documentary movie shown in reverse surprises the viewer because the arrow of time is reversed, yet the reversed movie shows nothing that is forbidden by the fundamental laws, even though those laws supposedly describe the microscopic processes that produce all the macroscopic processes. To explain the apparent inconsistency between the observed time-asymmetry of macroscopic processes and the time-symmetry of all the significant, fundamental, microscopic processes, some philosophers of physics point to the need to discover a new fundamental law that implies an arrow of time. Others suggest the arrow might be inexplicable, a brute fact of nature. Members of the “entropy camp” claim there is an explanation if we pay close attention to entropy, to how random or mixed up a system of particles is.
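The time-symmetry of the underlying dynamics can be made vivid with a small numerical sketch (an invented illustration, not anything from the article; it assumes unit mass and a simple harmonic force): run a Newtonian system forward, flip the velocity, and the same laws carry it straight back to where it began.

```python
# Illustrative sketch (assumptions: unit mass, harmonic force F = -x; none
# of this is from the article): evolve a particle under time-symmetric
# Newtonian dynamics, then flip its velocity, the "film played backward",
# and the very same laws make it retrace its path.

def step(x, v, dt, force):
    """One velocity-Verlet step; this integrator is itself time-reversible."""
    v_half = v + 0.5 * dt * force(x)
    x_new = x + dt * v_half
    v_new = v_half + 0.5 * dt * force(x_new)
    return x_new, v_new

def force(x):
    return -x                       # harmonic restoring force

x, v = 1.0, 0.0                     # initial state
for _ in range(1000):               # run forward in time
    x, v = step(x, v, 0.01, force)
v = -v                              # reverse the velocity
for _ in range(1000):               # run the same dynamics again
    x, v = step(x, v, 0.01, force)

# The particle is back where it started (to rounding error): the laws
# permit the reversed history just as readily as the original one.
print(abs(x - 1.0) < 1e-9, abs(v) < 1e-9)   # True True
```

Nothing in the dynamics singles out a direction; the one-way character of ordinary macroscopic processes has to come from somewhere else, which is exactly the puzzle the paragraph above describes.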
Commenting upon the observed asymmetry between past and future, a leading member of the extrinsic camp said, “The arrow of time is all of the ways in which the past is different from the future.” Some of these many ways are entropy increasing in the future, causes never producing effects in the past, the universe’s continuing to expand in volume, and our remembering the past but never the future. Can some of these ways be used to explain others, but not vice versa? This is called the taxonomy problem, and there are competing attempts to solve the problem.
Some philosophers even ask whether there could be distant regions of space and time having an arrow of time pointing in reverse compared to our arrow. If so, and if one of those distant regions had people somewhat like us, would they naturally walk backwards on their way to becoming infants while they remember the future?
In 1927, Arthur Eddington coined the term time’s arrow when he said, “I shall use the phrase ‘time’s arrow’ to express this one-way property of time which has no analogue in space. It is a singularly interesting property from a philosophical standpoint.” Time’s arrow, if it exists and is not an illusion, is somewhat like an arrow that is shot from an archer’s bow. The archer’s arrow has an intrinsic difference between its head and its tail, as you can tell just by looking at its shape, and it was built to have this shape for clear and easily understood aerodynamic reasons. However, describing and explaining time’s arrow is more challenging.
Writers use a wide variety of terms to characterize the arrow. They say, for example, that time has a distinguished direction or orientation favoring just one of its two possible directions. The philosopher Max Black offered these other characterizations:
Instead of saying that time has a “direction,” writers will sometimes say that time is “asymmetrical” or “irreversible”, or that the passing of time is “irrevocable.” Although these alternative expressions are not exact synonyms, they are closely connected in meaning, and may well be considered together….Those who say that time has a direction will often say, by way of contrast, that space has no direction, or that space is “symmetrical.” Or they will say that events are reversible with respect to their spatial locations, but not with respect to their temporal locations.
Time is very often represented by physicists as a straight line of point instants. Given a time line of these instants, any segment between instants such as instants A and B has two directions or orientations: from A to B, and from B to A. Specifying time’s arrow distinguishes one of those directions from the other. Expressed in the language of mathematical physics, the antisymmetric relation happens-before-or-simultaneous-with provides a linear ordering of the instants. But so does the antisymmetric relation happens-after-or-simultaneous-with. Philosophers want to know what in the external world or in consciousness distinguishes one of these relations from the other. They want to know both the nature and the origin of time’s arrow. This is the “problem of the arrow.”
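The two rival orderings of instants can be made concrete in a few lines (purely illustrative; the instant labels and names are invented for this sketch): both relations linearly order the instants, and they disagree only about direction, which is exactly what specifying the arrow must settle.

```python
# Purely illustrative sketch: a line of point instants admits two linear
# orderings, and specifying time's arrow amounts to privileging one.
instants = ["A", "B", "C", "D"]                 # labels along a time line
index = {t: i for i, t in enumerate(instants)}  # position of each instant

def before_or_simultaneous(x, y):
    """The ordering usually abbreviated x <= y."""
    return index[x] <= index[y]

def after_or_simultaneous(x, y):
    """The reverse ordering, x >= y."""
    return index[x] >= index[y]

# Each relation linearly orders the instants; they differ only in direction.
forward = sorted(instants, key=index.get)
backward = sorted(instants, key=index.get, reverse=True)
print(forward)    # ['A', 'B', 'C', 'D']
print(backward)   # ['D', 'C', 'B', 'A']
```

Nothing inside the mathematics picks out `forward` over `backward`; that is why the philosophers' question of what, in the world or in consciousness, distinguishes the two relations is substantive.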
Before Ludwig Boltzmann entered the discussion, physicists and philosophers of physics did not recognize any problem of the arrow of time. They believed there simply was an arrow; the past just was always different from the future. Explaining this became a problem only when people tried to account for it using Newton’s mechanistic theory, with its time-reversible laws, and realized how difficult it is to get asymmetry out of symmetry. Boltzmann was the first to recognize this problem clearly.
Here is a long list of topics and issues to be resolved in formulating a successful philosophical theory of time’s arrow. They are all discussed below. The order on the list is not very important. Conceptual clarity is the philosophical goal here.
There should be definitions of the terms “time’s arrow” and “arrow of time” and some decision on whether the arrow is objective or subjective.
Is there more than one arrow?
Is time’s arrow simply the fact that the future is different from the past?
Where is the arrow pointing?
Does it necessarily point there, or is this a contingent fact?
What might count as good grounds for saying time has an arrow, and do we have those grounds?
What is the relationship between time reversal and arrow reversal, and what exactly do those terms mean?
Was Hans Reichenbach correct when he said that “we cannot speak of a direction of time for the whole; only certain sections of time have directions, and these directions are not the same”?
Is time’s arrow a spatially-limited, local feature or instead a larger, perhaps global feature of the universe?
If the universe had started out differently, but with time existing, might time never have had the arrow it has?
How do we reconcile the fact that we humans always experience processes as one-way processes with the fact that the fundamental laws of physics allow the fundamental processes to go either way in time, and these laws give no indication why the two directions should not be equally common among all the actual processes?
Researchers do not agree on what counts as evidence of the arrow nor on what is a good example of the presence of the arrow, so a resolution is required.
There are some very rare processes of radioactive decay that do “prefer” to go more in one temporal direction than the other. Could this account for the origin of time’s arrow?
One of the main questions is whether the arrow is intrinsic to time itself or instead only an extrinsic feature due to processes that happen to occur over time the way they do.
Is time itself totally undirected like a long line with no preference for one direction over the other, and with the apparent directedness arising from some other feature of nature?
Do the past and future merely happen to be different, or do they have to be different by definition or for some other reason?
Can it be shown that time’s arrow is natural and to be expected, or is its existence merely a primitive fact, as some have argued?
Is the arrow fundamental, or can it be derived from something more fundamental?
Does the existence of an arrow have ontological implications such as placing special demands on the structure of space-time other than what is required by relativity theory?
Is it clear that temporal phenomena even seem to us to have an arrow? Researchers are divided on that question, too.
So, this set of issues is fertile territory for philosophers. The questions need answers or, in some cases, explanations of why they are not good questions.
Much has been said on these issues in the academic literature. This article provides only introductions to and clarifications of some of them, and it does not attempt to settle any open issues.
Looking back over history since the late 1800s, it is clear that our civilization is continually learning more about the philosophical issues involving the arrow of time. The academic accomplishments here provide a paradigm counterexample to the carelessly repeated quip that there is no progress in the field of philosophy. Progress in philosophy need not simply be a change from no consensus to consensus on an open issue.
Regarding the issue of whether the arrow is a local or a global feature, the philosopher Geoffrey Matthews remarked that, “Much of the recent literature on the problem of the direction of time assumes…it must be solved globally, for the entire universe at once, rather than locally, for some small part or parts of the universe,” but he recommends that it be solved locally (Matthews 1979, 82). However, the cosmologist George Ellis has a different perspective. He says: “The direction of time is the cosmologically determined direction in which time flows globally.” (Ellis 2013).
Max Black claimed time has an arrow if, but only if, ordinary discourse about time such as “This event happens earlier than that event” is objectively true or false and is not dependent upon who is speaking. That approach to the arrow via ordinary language retained few followers into the twenty-first century.
To say the arrow exists objectively implies that it exists independently of what we think, feel, do, or say. This is often expressed briefly by saying the arrow is mind-independent. One might consider whether there is a sense in which the arrow is mind-independent and another sense in which it is not. Or perhaps there are two arrows, an objective one and a subjective one, or a physical one and a wholly phenomenological one. It takes a conscious observer to notice time’s arrow, but is the arrow that is noticed dependent on that observer? Money is observer-dependent. If there is no agreement to treat that piece of paper with an ex-president’s picture on it as money, then it is not money and is merely a piece of paper. Is time’s arrow like money in this sense, or in a sense that gives the arrow some other observer-dependent existence?
On this issue, Huw Price presents an argument inspired by David Hume’s claim that necessary connections among events are projected into the external world by our minds. Price says that, when seen from the Archimedean point of view that objectively looks down upon all the past, present and future events, it becomes clear that just as the color blue is a subjective, secondary quality of external reality so also time’s arrow is merely a subjective or intersubjective projection onto an inherently time-symmetric external reality (Price 1996). Seen from this atemporal perspective, the arrow exists only in our collective imagination. Physical time is not subjective, but its arrow is, he says. Craig Callender objected to Price’s use of the word “merely” here. He claimed Price tells only half the story. In addition to the subjective projection, there is also an objective asymmetry of time because, “Thermodynamic behaviour not only shapes asymmetric agents but also provides an objective asymmetry in the world” (Callender 1998, 157). The thermodynamic behavior Callender is talking about is entropy increase that everyone agrees is somehow positively correlated with time’s increase. Callender agreed with Eddington that time’s arrow is due to entropy increase. Critics complain that entropy, too, is subjective. This issue is explored later.
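The entropy increase that Callender and Eddington appeal to can be illustrated with a toy model (everything here is an invented illustration, not from the article): particles hopping at random between the two halves of a box start fully unmixed, and a coarse-grained entropy measure then climbs overwhelmingly in one direction.

```python
# Toy "mixing" model (an invented illustration): n particles randomly hop
# between the left and right halves of a box. The coarse-grained entropy of
# the left/right split starts at zero in the fully unmixed state and tends
# upward, the one-way behavior the entropy camp ties to time's arrow.
import math
import random

random.seed(0)                       # deterministic run for reproducibility

def mixing_entropy(left, n):
    """Shannon entropy (in bits) of the left/right split of n particles."""
    p = left / n
    if p in (0.0, 1.0):
        return 0.0                   # fully unmixed: zero entropy
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

n = 1000
left = n                             # start fully unmixed: all on the left
history = [mixing_entropy(left, n)]
for _ in range(5000):                # each step, one random particle hops
    if random.random() < left / n:   # a left-side particle was picked
        left -= 1
    else:                            # a right-side particle was picked
        left += 1
    history.append(mixing_entropy(left, n))

print(history[0])                    # 0.0: the unmixed starting state
print(history[-1] > 0.95)            # ends near the 1-bit maximum
```

The individual hops are as reversible as anything in the microphysics; the overwhelming tendency toward higher entropy comes from the statistics of the configurations, which is the sense in which Callender's "objective asymmetry" is supposed to arise.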
Physical time is what clocks are designed to measure. Phenomenological time, or psychological time, unlike physical time, is private time. It is also called subjective time. Compared to the physical time shown on a clock, our psychological time can change its rate depending on whether we are bored or intensely involved. Many philosophers in the twenty-first century believe psychological time is best understood not as a kind of time but rather as awareness of physical time. Psychological time is what people usually are thinking of when they ask whether time is just a construct of the mind. But see (Freundlich 1973) for a defense of the claim that “physical time acquires meaning only through phenomenological time.”
It is surprising to many people to learn that nearly all scientists believe there does not exist a physical time that is independent of space. These scientists say what does have independent existence is space-time. They are drawing this conclusion from Einstein’s theory of relativity. The theory implies space-time is the set of all events; it is an amalgam of space and time, but one that divides into space and time differently from different perspectives. According to the orthodox interpretation of relativity theory, the amount of time that an event lasts is relative to perspective, to someone’s choice of a reference frame (or coordinate system or vantage point). For example, how long your dinner party lasted last night is very different depending on whether it is measured by a clock on the dinner table or by a clock in a spaceship speeding by at close to the speed of light. If no reference frame has been pre-selected, then it is a violation of relativity theory to say one duration is correct and the other is incorrect. In this article, advocates of the extrinsic theory presume time exists, but only as a feature of space-time and only relative to a reference frame. Advocates of the intrinsic theory may or may not presume this.
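The frame-relativity of durations follows the standard special-relativistic time-dilation relation, elapsed time = γ × proper time, with γ the Lorentz factor. A small numerical sketch (the dinner-party duration and spaceship speed are made-up illustrations):

```python
# Minimal sketch of time dilation (the numbers are invented illustrations).
import math

C = 299_792_458.0                    # speed of light in m/s

def gamma(v):
    """Lorentz factor 1/sqrt(1 - v^2/c^2); grows without bound as v -> C."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

# A two-hour dinner party as timed by the clock on the table (proper time),
# judged from a spaceship passing at 99% of light speed:
proper_hours = 2.0
dilated_hours = proper_hours * gamma(0.99 * C)
print(round(dilated_hours, 2))       # about 14.18 hours in the ship's frame
```

Neither figure is "the" duration of the party; each is correct relative to its reference frame, which is the point the paragraph above makes.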
One minor point concerns the word “state,” which is used ahead. A “state” or “state of affairs” means a state at one time. For a physical system with multiple, fundamental parts, the state of the system at a time is also called its “configuration.” Every system, except the universe itself, has an environment that is external to the system.
2. The Intrinsic Theory
Philosophers ask whether there is an arrow OF time or, instead, IN time—that is, whether (i) there is an arrow of time itself in the sense of its being part of time’s intrinsic structure, or instead (e) there is only an arrow in time that is extrinsic to time itself and that is due to time’s contents, specifically to physical processes that evolve one way over time. The intrinsic theory is committed to claim (i), and the extrinsic theory is committed to claim (e). The intrinsic theory is interested in the asymmetry of time itself, over and above the asymmetry of processes in time. This difference in the two philosophical theories is sometimes expressed as their differing on whether time’s arrow is due to time’s inherent form or to time’s content.
There is no end to the number of one-way processes. People grow older but never younger, candles burn but never un-burn, and coffee and milk mix easily but never un-mix once mixed. Defenders of both the intrinsic theory and the extrinsic theory agree that the arrow of time reveals itself in the wide variety of these one-way processes. Yet only those persons promoting the extrinsic theory suggest that time’s arrow is identical to, or produced by, the set of these one-way processes.
You can imagine nothing changing in your neighbor’s home for five minutes. How about nothing changing anywhere for five minutes? Advocates of the intrinsic theory suggest that it is physically possible for all changes to stop while time continues to flow; nearly all advocates of the extrinsic theory believe this is not possible.
The image of time proposed by the class of intrinsic theories is closer to the commonsense understanding of time, to what Wilfrid Sellars called the “manifest image,” than it is to the scientific image of time, which is time as seen through the lens of contemporary science.
All who say the arrow is intrinsic to time believe that strong evidence for the arrow is found in personal experience, usually both internal and external experience. What motivates most persons in the intrinsic camp is that having an intrinsic arrow seems to be the best explanation for the everyday flux of their own temporal experiences plus the similar experiences of others. Some imagine themselves advancing through time; others, that they stand still while time flows like a river past them. Some speak of more events becoming real. The philosopher Jenann Ismael mentions “the felt whoosh of experience.” It is not simply that we all occasionally experience time as directed but that this is relentless. Time stops for no one, says the proverb.
Robin Le Poidevin points out that the experience of the arrow is both internal and external. He says, “we are not only aware of [time’s passage] when we reflect on our memories of what has happened. We just see time passing in front of us, in the movement of a second hand around a clock, or the falling of sand through an hourglass, or indeed any motion or change at all.”
Assuming those in the intrinsic camp are correct in claiming that there is strong evidence for the arrow being found in personal experience, the following question is relevant: “Did time’s arrow exist before consciousness evolved?”
Most twenty-first century experts in the intrinsic camp find the arrow not just within experience but also in what is experienced. It is found in what is experienced in the sense that, even though it may take conscious creatures to detect the arrow, what is detected was a feature of nature long before beings with conscious minds evolved on Earth to detect it, and it deserves a central place in any temporal metaphysics. To summarize this extended bit of argumentation briefly: the external world seems to us to have an arrow of time because it really does have one, and our evidence that it does is that we all experience it.
When it is said the arrow is a feature of time’s intrinsic structure, what is meant by the term “structure”? That term is about overall form rather than specific content. The structure is not a feature of a single three-dimensional object or event, nor is it detectable in an experience that lasts for only an instant.
When it is said the arrow is intrinsic to time, or a feature of time’s intrinsic structure, what is meant by the term “intrinsic”? The term is akin to “internal” or “inherent.” An intrinsic property can apply to just one thing, like a quality; an extrinsic property is a relational property. An object’s mass is intrinsic to it, but its weight is not, because weight is relative to, say, what planet the object is resting upon. Here is another example, from the metaphysician Theodore Sider. He reminds us that a woman’s having long hair is very different from her having a long-haired brother. Having long hair is an intrinsic property she has without involving anyone else. But having a long-haired brother is not intrinsic to her; it is an extrinsic or relational property that she has because of her relationship to someone else, namely her brother. Notice that her having the intrinsic property of long hair does not imply that long hair is essential to her. The metaphysicians David Lewis and Bernard Katz offer a deeper discussion of the difference between intrinsic and extrinsic in (Lewis 1983) and (Katz 1983).
Although the concept of being intrinsic is not the same as the concept of being essential, many researchers in the intrinsic camp believe the arrow is essential to time in the sense that time necessarily has an arrow, that it is not a contingent feature, and that there would not be real time without it. Thus, arrow reversal would imply time reversal, and vice versa. Those in the extrinsic camp are much less apt to hold this position, and they are content to say that the arrow is a contingent feature, but that time would be very strange without it. They also have a different definition of time reversal from that of the intrinsic camp.
Within the intrinsic camp there are different explanations of why time is intrinsically directed. The largest sub-camp is the dynamic camp that promotes a dynamic theory of time’s arrow. Members are commonly called “temporal dynamists.”
[Figure: an Euler diagram of the philosophical theories of time’s arrow that are discussed below.]
Each point in the Euler two-circle diagram represents a specific theory of the arrow of time.
Let’s consider first the dynamic sub-camp of the intrinsic theorists. Its members frequently say time has an internal structure that is robustly “dynamic” or “transitory” or “active.” What does this mean? In answering, some philosophers offer a very picturesque style of description and appeal to the idea of a “river of time” that is now gushing out of nothingness. Less picturesquely, most answer by saying time “passes” or “flows” or “has a flux” or “lapses” or “runs” or has a “moving present” or has a feature called “becoming.” All these non-technical terms of ordinary discourse are intended to help describe time’s intrinsic arrow.
Let’s attempt to be clearer. Most advocates of time’s passing claim that passage or becoming is part of the physical world. What part?
Here is one clarification of time’s passing:
We take temporal passage to consist in (a) there being a fact of the matter regarding which entities are objectively present, and (b) there being changes in which [of the] entities are objectively present. Presentism, the growing block theory, the dropping branches theory and the moving spotlight theory are all theories according to which time passes (Miller and Norton 2021, 21).
If you believe in time’s passing, then you must believe that the future is in some sense not as real as the present or the past.
But how does a researcher go about showing that a dynamic theory is true or false? George Schlesinger made an interesting remark about this. Think of what he calls the “transient theory of time” as what this article calls the “dynamic theory.”
There is no doubt that the transient theory of time is consistent and intelligible. Is it true? I do not believe that this is the kind of question to which a final conclusive answer is possible. As with all genuinely metaphysical theories, what we may reasonably expect is further clarification concerning its precise presuppositions and implications and an increasingly more detailed list of its advantages and disadvantages. (Schlesinger 1985, 92).
The various, prominent dynamic theories of time are presented in (Zimmerman 2005) and (Dainton 2020). The present article introduces some of them.
In 1927, C.D. Broad said becoming is the transitory aspect of time that must be added to the “mere” ordering of events via McTaggart’s B-series relations. The B-series is too static to capture the dynamic character of time, Broad claimed, because if one event happens before another then it always does, and this fact never changes. Yet facts do change. It was once a fact that dinosaurs exist, but it is no longer a fact. The B-series is a static representation. Broad said absolute becoming “seems to me to be the rock-bottom peculiarity of time, distinguishing temporal sequence from all other instances of one-dimensional order, such as that of points on a line, numbers in order of magnitude, and so on.” He believed the A-theory was needed to capture this objective, non-phenomenological sense of becoming. Arthur Eddington said, “we have direct insight into ‘becoming’ which sweeps aside all symbolic knowledge as being on an inferior plane.”
Broad’s moving spotlight theory is suggested with the following metaphor from the Spanish-American philosopher George Santayana: “The essence of nowness runs like fire along the fuse of time.” The theory treats the dimension of time much like a single dimension of space with all the past, present, and future events arranged in temporal order along the time dimension (the fuse of time), and with simultaneous events having the same location. Promoting the moving spotlight theory as a way of clarifying what he meant by becoming, C.D. Broad said, “What is illuminated is the present, what has been illuminated is the past, and what has not yet been illuminated is the future” (Broad 1923, 59). The theory assumes eternalism in the sense that all past, present, and future events exist (in the tenseless sense of the term), but it has the extra assumptions that the present time is a metaphysically privileged time and that A-theory sentences about the past such as “The year 1923 is no longer our present year” are metaphysically basic, unlike the B-theory which might analyze that sentence as “The year 1923 happens before the year in which you are reading this sentence.” Most versions of the moving spotlight theory imply being present is primitive in the sense of being an unanalyzable feature of reality. According to the theory’s advocates Timothy Williamson and Quentin Smith, the metaphysical privilege of the present implies that future events are non-spatial; an event acquires the property of being spatial, or in space, as the spotlight highlights it, and then sheds this property and becomes a non-spatial past event. For an examination of the spotlight theory, see (Zimmerman 2005) and (Miller 2019).
Some dynamists, such as C. D. Broad in his 1923 book Scientific Thought, have explained time’s passage as reality’s growing by the continual accretion of new moments or new facts or the creation of more states of affairs. This is neither an eternalist theory nor a presentist theory. It employs a growing-block model of events in Einstein’s sense of the technical term “block,” but without the future events that exist in the traditional block universe. That is, the growing block consists of all real present and past point-events; and the latest moment that exists in the block is metaphysically privileged and called “the present” and “now.” All longer-duration events are composed of point-events. In classical physics, the block can have a four-dimensional Cartesian coordinate system of three spatial dimensions and one time dimension. The block grows in volume over time as more moments and events occur and more facts about the present and past are created. The direction of the arrow of time is the direction that the block grows.
Enamored of the idea of reality’s growing by the accretion of facts, Michael Tooley argued that “the world is dynamic, and dynamic in a certain way: it is a world where tenseless states of affairs come into existence, but never drop out of existence, and therefore a world where the past and the present are real, but the future is not.” A tenseless state of affairs is a fact in which any use of tense in its description is not semantically or ontologically significant. When someone says, “one plus two is three,” the word “is” occurs in the present tense, but we know to ignore this and treat it as insignificant and assume the speaker’s statement is not only about the present. See (Dainton 2020) to learn more about tenseless vs. tensed characteristics.
An important feature of Tooley’s and others’ growing-block theories is that they make what is real depend on what time it is. Adherents to the theory say one of its virtues is its promotion of the unreality of the future, because this naturally allows the future to be “open” or indeterminate, unlike the past and present, which are “closed” and determinate and so cannot change. This openness is part of the manifest image of time that almost all persons hold, though it means different things to different people. It might mean the future is non-existent, or that it is not straightforwardly knowable as the past is, or that we human beings are able to shape the future but not the past. Some researchers say this openness shows itself semantically in the fact that a contingent proposition about the future such as “There will be a sea battle tomorrow” is presently neither true nor false. The eternalist is more apt to say the proposition has a truth value and is eternally true (that is, true at all times) or else is eternally false, but we just do not know which.
See (Miller 2013) for a comparison of the growing-block ontology of time with its competitors: presentism and eternalism. For some subsequent research on the growing block, see (Grandjean 2022).
Let us turn to intrinsic theories that are promoted by those outside of the large, dynamic camp. One theory implies time has an intrinsic arrow because time is intrinsically anisotropic (that is, one direction of time is intrinsically privileged over the other), and sufficient evidence of time’s being intrinsically anisotropic would be the existence of time-anisotropic processes obeying time-anisotropic laws of nature. (The terms time-anisotropic and time-asymmetric and temporally directed have different senses, but they denote the same thing.)
Ferrel Christensen’s argument in favor of time’s being intrinsically asymmetric appealed to the evidence of there being so many kinds of one-way processes in nature, and the simplest explanation of this, he suggested, is that time itself is intrinsically asymmetric.
The bare fact that our experience of time is mediated by processes that take place in time doesn’t argue that any or all of the structural features of the latter aren’t also possessed by temporality in its own right. …Is it not plausible to suggest that a single asymmetry is responsible for them all, namely that of time itself? For reasons having to do with economy, the ability of a single feature to theoretically explain a diversity of phenomena is in general regarded in science as good evidence for the reality of that feature; it has the effect of unifying and organizing our picture of the world. Surely there must be some common reason, one is tempted to argue, for the evidence of the various asymmetries in time—what else might it be if not the asymmetry of time itself? (Christensen 1987, 238 and 243).
Tim Maudlin is a member of the intrinsic camp who also does not promote a dynamic theory of time’s arrow. See the letter “m” for Maudlin’s theory in the diagram above. He accepts the block universe theory in the sense that the past, present, and future are equally real, but he also accepts the passage of time and does not characterize the block as static. Maudlin’s reasoning here is that the word “static” refers to objects that persist through time and never change. The block universe does not persist through time. Instead, time exists within it.
Maudlin says that time’s passage is an intrinsic or inherent asymmetry in the structure of space-time itself and that time passes objectively and independently of the material contents of space-time and their processes. That is how he interprets “becoming.” Maudlin argues that the direction of time is primitive and so cannot be deduced or otherwise explained more deeply. He believes that “except in a metaphorical sense, time does not move or flow,” but it does pass. Maudlin adds that, “the passing of time…is the foundation of our asymmetrical treatment of the initial and final states of the universe” (Maudlin 2007, 142). Stephen Savitt, Dennis Dieks, and Mauro Dorato also have claimed that time passes in the block universe. Many philosophers (for example, Dorato 2006) would agree that if becoming is to be an objective feature of nature and not merely a subjective feature of human experience, then necessarily there is some sort of ontological asymmetry between past and future events.
However, if passage were merely having a one-dimensional asymmetric continuum structure, then the real numbers would pass from less to greater numbers, which would be an odd use of the word “pass.” Related to this point, Maudlin said, “The passage of time connotes more than just an intrinsic asymmetry; not just any asymmetry would produce passing … the passage of time underwrites claims about one state ‘coming out of’ or ‘being produced from’ another, while a generic spatial (or temporal) asymmetry would not underwrite such locutions….” Maudlin’s notion of production means that states at later times exist in virtue of states at earlier times.
Christian Loew explains Maudlin’s position this way:
(E)arlier states are metaphysically more fundamental than the later states that exist in virtue of them…. The laws of nature, given the state of the universe at some earlier time, constrain its state at all later times (either deterministically or by specifying probabilities). But it is only because of the intrinsic direction of time that earlier states are metaphysically more fundamental than later states and that later states exist in virtue of them. Think of the laws like the bed of a river and think of the direction of time like water pressure. The laws of nature chart out how the world must evolve, but the intrinsic direction of time is the driving force that gets this evolution going….Maudlin’s notion of production captures this generation of all other states from the initial state and the laws of nature: these other states are produced from the laws of nature operating on the initial state. By contrast, if time has no direction, then all states in time are equally fundamental and there is no explanation in terms of production (Loew 2018, 489, 490).
Maudlin believes he has pinpointed why so many physicists do not agree with him that it is a fundamental, irreducible fact of nature that time is an intrinsically directed object. Overly influenced by Einstein’s theory of relativity, these physicists treat time as if it were a spatial dimension, note that no spatial dimension has an arrow, and conclude that the time dimension has no arrow either. Einstein himself never made such an argument. According to Maudlin:
I think the reason it’s hard for physicists to see the direction of time is they use the piece of mathematics that was developed to analyze space, and there is no direction in space. So, you have this mathematical theory built for something with no direction in it, you then try to analyze space-time with it, and you say “Gosh, I don’t see a direction anymore; it must be an illusion.” …It’s not an illusion.
a. Criticisms
A variety of criticisms of the intrinsic theory have been offered. For example, those in the extrinsic camp often say there is an over-emphasis on the phenomenology of temporal awareness. In reply, those in the intrinsic camp often accuse those in the extrinsic camp of scientism for irresponsibly rejecting our valid temporal intuitions.
One argument given for why the intrinsic theory improperly describes time is that relativity theory treats time as the fourth dimension, a one-dimensional subspace of space-time, and we know space has no direction. To elaborate on this criticism, notice that people on earth sometimes believe space has a direction toward “down,” but that is a mistake. Space seems to have a down arrow only because we happen to live on the surface of a very massive object which pulls objects toward its center, but if we lived out in space away from any very massive objects, we would not be inclined to assign a direction to space. Similarly, people commonly suppose time has a direction because they experience so many one-way processes, but they experience these only because their past is associated with an extremely low entropy state, namely, the big bang. If this peculiar event were not in their past, then they would appreciate that time could just as well have had no arrow or had a reversed arrow. Once they free their minds from the influence of the presence of earth and the influence of a low-entropy big bang in their past, they could see more clearly that neither space nor time has an intrinsic arrow. Advocates of the intrinsic theory usually respond to this criticism by saying it makes too much of the weak analogy between time and space.
One other broad criticism claims the intrinsic theory is not coherent. Huw Price said, “I am not convinced that it is possible to make sense of the possibility that time itself might have a certain direction” (Price 2002, 87). Dynamists typically, but not universally, promote an A-theory of time in which the concept of time’s passage depends upon the coherence of the idea that pastness and presentness are intrinsic properties of events, properties that are gained and lost over time. This dependence is illustrated by the fact that, according to the A-theory, the birthday party occurred last week because the party event has a week’s degree of pastness, a degree that will keep increasing. J.M.E. McTaggart famously claimed that the movement of presentness from one moment to another is the basis of the passage of time. Critics argue that these technical A-concepts are superficially coherent but ultimately not coherent and that B-concepts are sufficient for the task of explaining and grounding time and its arrow.
Nathan Oaklander criticized the moving-now theory or spotlight theory because it seems to be committed to the claim that the same NOW exists at every time, but he doubted anything sensible can be made of the NOW being the “same” (1985). For more discussion of this criticism, see (Prosser 2016).
Other critics of the intrinsic theory say the problem is not that the theory is inconsistent or nonsense but that it is obscure and unexplanatory.
Still others complain about subjectivity. They say the advocates of the intrinsic theory and its passage of time are relying ultimately on McTaggart’s A-series, but “A-series change and the passage of time are mind dependent in the sense of being merely matters of psychological projection,” unlike the B-theory with its arrow in time (Bardon 2013, 102).
Cognitive scientists and biochemists are naturally interested in learning more about the bodily mechanisms, including the mental mechanisms that allow people to detect time and also to detect time’s arrow. However, say the critics, they should attend more to the difference between the two. We see sand fall in the hourglass. Our detection of the change is correctly said to be evidence for us that time exists, but is it also evidence that the arrow of time exists, as Le Poidevin believes? No, say some critics from the extrinsic camp. What would be evidence would be noticing that the sand never falls up.
A motivation for adopting the dynamic theory is that its advocates seem to themselves to experience directly the dynamic character of time. “It would be futile to try to deny these experiences,” said D.C. Williams, who believed it seems to all of us that time passes. George Schlesinger said, “Practically all agree that the passage of time intuitively appears to be one of the most central features of reality.” Claims about their phenomenology of time do clearly motivate those experts in the intrinsic camp to say time passes, but the motivation does not seem as strong for the average person who is not an expert on issues of time’s arrow. Kristie Miller and her associates “found that, on average, participants only weakly agreed that it seems as though time passes, suggesting that most people do not unambiguously have a phenomenology as of time passing.” Her results in experimental philosophy suggest “that ~70% of people represent actual time as dynamical and ~30% represent it as non-dynamical” (Miller 2020).
Some critics question the effectiveness of the argument that, if time intuitively seems to be dynamic, then it is. There are several very different ways this criticism is made. Many critics say that, even though much of our experience is not an illusion, our experience of the passage of time is merely an illusion—something we experience that should be explained away:
A moving spotlight theorist might…argue: his theory is superior because it is only in his theory that things are as they seem. But this is not a good argument. A B-theorist might have an excellent story to tell about why things are not as they seem. If he does, then it should not count against his theory that it says we are subject to an illusion (Skow 2011, 361).
In this spirit, other critics of dynamical time say that it seems to most of us that rocks are perfectly solid and contain no empty space; nevertheless, science rightly tells us we are mistaken. These critics say the intrinsic camp’s supposed intrinsic asymmetry of time, which so many seem to find in their own experience and the experience of others, is only a product of people, including certain philosophers of physics, overly relying on their intuitions and uncritical impressions, while misinterpreting their temporal experiences and being insufficiently sensitive to science. The dynamists’ arguments, these critics add, do not take into account that science is properly in the business of precisification of concepts and of promoting concepts that are maximally useful for understanding nature. For more detailed criticism along these lines, see (Callender 2017). For a challenge to Callender’s approach plus an extended defense of the claim that science might have very little to tell us about real time, see (Baron et al. 2022).
Dennis Dieks argued that, “on closer inspection it appears that the scientific B-theory may explain our intuition better than the A-theory, even though the latter at first sight seems to completely mirror our direct experience…. There is becoming and change in this picture in the following sense: events occur after each other in time, displaying different qualities at different instants” (Dieks 2012, 103 and 111). Opponents of a dynamic sense of becoming often say that becoming is real only in the sense that an event comes into being out of others in its local past. So, the B-theorist need not deny temporal passage provided it is not the robust passage promoted by the A-theorist.
The intrinsic theory is commonly criticized for its language use, for its violation of what philosophers of language call “logical grammar.” For example, pointing out how those in the intrinsic camp use the word becoming, J.J.C. Smart said:
Events happen, things become, and things do not just become, they become something or other. “Become” is a transitive verb; if we start using it intransitively, we can expect nothing but trouble. This is part of what is wrong with Whitehead’s metaphysics; see, for example, Process and Reality, p. 111, where he says that actual occasions “become.” (Smart 1949, 486).
C.D. Broad does not make this mistake in his use of language, says Smart, but he makes another mistake with language: his use of the transitive phrase “become existent” is misleading philosophically. Emphasizing Broad’s faulty use of language, Smart declares:
With what sorts of words can we use the expressions “to change” and “to become”? …I think that if certain philosophers, notably Whitehead and McTaggart, had asked themselves this question…they would have saved themselves from much gratuitous metaphysics… (Smart 1949, 486).
One prominent complaint made against the growing-block theory (GBT) is that it cannot give us a good reason to believe that we are now in the objective present and Napoleon is not. Craig Bourne says, “We are in no better epistemic position than thinking subjects located in the objective past who are wrongly believing that they are located in the objective present, since ‘[…] we would have all the same beliefs […] even if we were past’” (Bourne 2002, 362). Vincent Grandjean adds, “the epistemic objection does not merely concern GBT, but is equally applicable to every A-theory of time that distinguishes between the notions of existing at the present time and just existing. For example, the epistemic objection is equally applicable to the moving spotlight theorist.”
Another frequent complaint made against the growing-block theory is that it is not compatible with the theory of relativity because it presumes absolute simultaneity rather than simultaneity relative to a conventionally chosen reference frame.
For consideration of the variety of these and other philosophical objections that are made to the tensed account of the dynamic, growing block, see chapter 10 of (Tooley 1997). See (Dorato 2006) for an argument that the unreality of the future, a proclaimed virtue of Tooley’s theory, is not a necessary condition for temporal passage. See (Earman 2008) regarding the prospects for revisions of a growing-block model of the universe.
The role the fields of psychology and cognitive science can and should play in understanding time’s arrow is an interesting issue. The human mind and body have some clock-like features, but clearly there is no single neuron that tracks time’s arrow. There may be some mental and neuronal structures to be found that do track time’s arrow, but there is no consensus that these have been found. Presumably there would be multiple mental procedures involved, and the neuronal structures would be both complex and distributed around the brain. But researchers cannot simply presume that what accounts for our temporal phenomenology is, among other things, time’s arrow. It will not be if there are pervasive phenomenal illusions regarding the arrow. Perhaps the mechanisms that account for our phenomenology that is purported to be about time’s arrow do not actually track the arrow. So, there is much useful future research ahead. For more about these issues, see (Braddon-Mitchell and Miller 2017).
There is a subtle sub-issue here about how things seem. A distinction can be made between phenomenal error and cognitive error:
Temporal non-dynamists hold that there is no temporal passage, but [they] concede that many of us judge that it seems as though time passes. Phenomenal Illusionists suppose that things do seem this way, even though things are not this way. They attempt to explain how it is that we are subject to a pervasive phenomenal illusion. More recently, Cognitive Error Theorists have argued that our experiences do not seem that way; rather, we are subject to an error that leads us mistakenly to believe that our experiences seem that way. …We aim to show that Cognitive Error Theory is a plausible competitor to Phenomenal Illusion Theory (Miller et al. 2020).
Adolf Grünbaum complained that the main weakness of dynamic theories is that passage and becoming and the arrow have no appropriate place in the fundamental laws. He probably would have found support in Jill North’s remark that, “There is no more structure to the world than what the fundamental laws indicate there is.” Some members of the dynamic camp reacted to Grünbaum’s complaint by saying the intrinsic theory does not need a law of physics to recognize the arrow: “[T]here is in the world an asymmetric relation holding among events, the temporal priority relation, and…we can know when this relation holds or fails to hold, at least sometimes, without relying upon any features of the lawlike nature of the world in time” (Sklar 1974, 407-410).
Other members of the dynamic camp reacted very differently to Grünbaum’s complaint by saying the fundamental laws do need to be revised in order to recognize some extra structure that reveals time’s intrinsic arrow. This suggestion has faced fierce resistance. Frank Wilczek, a Nobel Laureate in physics, objected to any revision like this. Coining the term “Core theory” for the theories of relativity and quantum mechanics (including quantum field theory and the standard model of particle physics), which are our two currently accepted fundamental theories of physics, Wilczek declared:
The Core has such a proven record of success over an enormous range of applications that I can’t imagine people will ever want to junk it. I’ll go further: I think the Core provides a complete foundation for biology, chemistry, and stellar astrophysics that will never require modification. (Well, “never” is a long time. Let’s say for a few billion years.)
Some members of the intrinsic camp might say, “We are not junking it, just supplementing it so it can be used to explain even more.”
Many critics of dynamic theories of time’s arrow speak approvingly of the 1951 article, “The Myth of Passage,” in which Harvard University metaphysician D.C. Williams argued that the passage of time is a myth, and that time does not really move or flow or pass or have any inherent dynamic character whatsoever. According to Williams, all proponents of a dynamic theory of time believe that:
Over and above the sheer spread of events, with their several qualities, along the time axis, …there is something extra, something active and dynamic, which is often and perhaps best described as “passage.” This something extra I think is a myth…one which is fundamentally false, deceiving us about the facts, and blocking our understanding of them. The literature of “passage” is immense, but it is naturally not very exact and lucid, and we cannot be sure of distinguishing in it between mere harmless metaphorical phenomenology and the special metaphysical declaration which I criticize. But “passage,” it would seem, is a character supposed to inhabit and glorify the present, “the passing present,” “the moving present,” the “travelling now….” It is James’s “passing moment.” It is what Broad calls “the transitory aspect” of time…. It is Bergson’s living felt duration. It is Heidegger’s Zeitlichkeit. It is Tillich’s “moment that is creation and fact….” It is “the dynamic essence” which Professor Ushenko believes that Einstein omits from the world. It is the mainspring of McTaggart’s “A-series” which puts movement in time, and it is Broad’s pure becoming.
The dynamic theories lead to other troubles, says J.J.C. Smart. For instance, when we critically examine the metaphor of time’s passage and ask about the rate of flow of time:
We are postulating a second time-scale with respect to which the flow of events along the first time-dimension is measured…the speed of flow of the second stream is a rate of change with respect to a third time-dimension, and so we can go on indefinitely postulating fresh streams…. Sooner or later we shall have to stop thinking of time as a stream….
With respect to motion in space it is always possible to ask “how fast is it?” …Contrast the pseudo-question “how fast am I advancing through time?” or “How fast did time flow yesterday?” …We do not even know the sort of units in which our answer should be expressed. “I am advancing through time at how many seconds per ___?” we might begin, and then we should have to stop. What could possibly fill the blank? Not “seconds” surely. In that case the most we could hope for would be the not very illuminating remark that there is just one second in every second. (Smart 1949, 485).
D.C. Williams agreed with Smart, and he added:
Bergson, Broad, and some of the followers of Whitehead have tried to soften the paradoxes of passage by supposing that the present does not move across the total time level, but that it is the very fountain where the river of time gushes out of nothingness (or out of the power of God). The past, then, having swum into being and floated away, is eternally real, but the future has no existence at all. This may be a more appealing figure, but logically it involves the same anomalies of meta-happening and meta-time which we observed in the other version.
Huw Price has complained that, “A rate of second per second is not a rate at all in physical terms. It is a dimensionless quantity, rather than a rate of any sort. (We might just as well say that the ratio of the circumference of a circle to its diameter flows at pi seconds per second!).”
Tim Maudlin (who advocates an intrinsic theory but not a dynamic theory) and others have bitten the bullet and argued that time actually does pass at the rate of one second per second. What other rate could it have? It is trivial, but if you ask a trivial question you get a trivial answer. The numerator and denominator do not “cancel out.” Critics of the intrinsic arrow ask: If the rate of one second per second does make sense, then so does a rate of two seconds per second, and what would that be like? It would be absurd. George Schlesinger claimed the rate of two seconds per second does make sense (Schlesinger 1985). See (Skow 2012) and (Miller and Norton 2021) for more discussion of differential passage and of time’s rate of passage with and without a hyper-time against which time’s rate is compared.
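Price’s dimensional point can be made concrete with a toy units sketch. The small Quantity class below is hypothetical, written only for this illustration; it tracks the exponent on the time dimension the way a units library would. Dividing seconds by seconds subtracts the exponents and leaves a pure number, whereas a genuine rate retains a leftover dimension:

```python
# Toy dimensional analysis (hypothetical sketch, not a real library)
# illustrating why "one second per second" is a dimensionless number,
# not a rate in the physical sense.

class Quantity:
    def __init__(self, value, seconds=0):
        self.value = value      # numerical magnitude
        self.seconds = seconds  # exponent on the time dimension

    def __truediv__(self, other):
        # Dividing quantities divides magnitudes and subtracts exponents.
        return Quantity(self.value / other.value, self.seconds - other.seconds)

    @property
    def dimensionless(self):
        return self.seconds == 0

elapsed = Quantity(1, seconds=1)   # one second of time "passed"
per = Quantity(1, seconds=1)       # per one second of clock time
rate = elapsed / per

print(rate.value)          # 1.0
print(rate.dimensionless)  # True: the seconds cancel, leaving a bare number

# Contrast a genuine rate, e.g. "3 of something per second":
genuine = Quantity(3) / Quantity(1, seconds=1)
print(genuine.dimensionless)  # False: a time dimension is left over
```

On this sketch, "two seconds per second" would likewise reduce to the bare number 2, which is one way of putting the critics' worry that no genuine rate is being reported.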
An additional criticism of the dynamic camp’s position made by members of the extrinsic camp is that grounding time’s arrow on new nows being produced is a mistake because the concept of now, in the sense of a present for all of us, is inconsistent with scientific fact. According to the theory of relativity, two observers moving relative to each other (each stationary in their own reference frame) must disagree about which distant events are happening now, and the disagreement grows with the distance to the events and with the relative speed between the two frames. So the concept of “now” cannot be objective; it is relative to a person’s favored frame of reference. The proper way to understand the word “now,” say most of these critics, is as an indexical that changes its reference from person to person and from time to time, as does the word “here;” and just as the changing reference of “here” indicates no arrow of space, neither does the changing reference of “now” indicate an arrow of time. For a defense of the moving spotlight theory against this criticism, see (Skow 2009).
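The relativistic point can be made quantitative with the standard Lorentz transformation (a sketch added here for illustration, not from the article): for two events that are simultaneous in one frame and separated by distance dx, an observer moving at speed v assigns them a time offset of -γ·v·dx/c², which grows with both the separation and the relative speed.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def now_offset(dx_meters, v_mps):
    """Time offset (seconds) a moving observer assigns to an event that the
    stationary observer takes to be happening "now" at distance dx.
    Lorentz transformation with dt = 0: dt' = -gamma * v * dx / c**2."""
    gamma = 1.0 / math.sqrt(1.0 - (v_mps / C) ** 2)
    return -gamma * v_mps * dx_meters / C ** 2

LY = 9.4607e15  # one light-year in meters

# The disagreement about "now" grows with distance...
print(abs(now_offset(4 * LY, 30_000)))   # event 4 light-years away, v = 30 km/s
print(abs(now_offset(40 * LY, 30_000)))  # ten times farther: ten times larger
# ...and with relative speed.
print(abs(now_offset(4 * LY, 300_000)))  # same distance, ten times the speed
```

Even at an everyday astronomical speed like 30 km/s, the offset for an event a few light-years away amounts to hours, which is the critics' point: there is no frame-independent fact about which distant events are happening "now."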
For many members of the intrinsic camp, to explain time’s arrow is to explain the intrinsic difference between the past and the future. In this regard, some say there is something irrevocable or closed about past events that distinguishes them from future events. This, they say, is a deep and metaphysically significant fact. In response, Williams said in 1951, “As for the irrevocability of past time, it seems to be no more than the trivial fact that the particular events of 1902, let us say, cannot also be the events of 1952.”
Tim Maudlin has promoted an intrinsic theory but not a dynamic theory. Instead, he claimed it is a fundamental, irreducible fact that time is a directed object. The key metaphor for Maudlin is that of production: the present “produces” the future. Christian Loew highlights what he believes is a problem for Maudlin’s attempt to explain the thermodynamic arrow of entropy change in terms of production:
It is unclear what role an intrinsic direction of time that underwrites production could play in explaining the thermodynamic asymmetry. Nothing about production seems to rule out that low entropy macrostates have been produced from earlier states of higher entropy…. The time asymmetry of production, therefore, cannot explain the thermodynamic asymmetry by itself. To account for why entropy increase is typical toward the future but not toward the past, Maudlin’s account needs to be supplemented with restrictions on the boundary conditions. Maudlin seems to acknowledge this need…when he emphasizes that production guarantees that microstates in our universe have an atypical, low-entropy past only “given how it [i.e., production] started.” This appeal to how production started seems to presuppose the special initial boundary condition of the actual universe. But bringing in the boundary conditions in this way threatens to make production superfluous in an explanation of the thermodynamic asymmetry. (Loew 2018, 487).
Some other critics of Maudlin’s position on time’s arrow claim that, when Maudlin says, “change and flow and motion all presuppose the passage of time,” he should have said instead that they all presuppose the existence of time, not its passage. Maudlin was once asked “What does it mean for time to pass? Is that synonymous with ‘time has a direction,’ or is there something in addition?” Maudlin responded: “There’s something in addition. ‘For time to pass’ means for events to be linearly ordered by earlier and later.” Maudlin’s opponent in the extrinsic camp can be expected to say, “Wait! That linear ordering is just what I mean by time existing.”
Focusing on undermining objections to his position that time passes, Maudlin said:
There are three sorts of objections to the passage of time, which we may group as logical, scientific, and epistemological. Logical objections contend that there is something incoherent about the idea of the passage of time per se. Scientific objections claim that the notion of the passage of time is incompatible with current scientific theory, and so would demand a radical revision of the account of temporal structure provided by physics itself. Epistemological objections contend that even if there were such a thing as the passage of time, we could not know that there was, or in which direction time passes (Maudlin 2002, 260).
Maudlin proceeded from there to argue that there are adequate responses to all three kinds of objections. He praised Huw Price’s book Time’s Arrow & Archimedes’ Point: New Directions for the Physics of Time for carefully presenting these objections and the responses.
3. The Extrinsic Theory
Have you ever seen this happen? What would you think if some morning you noticed the contents of a broken egg in a bowl on your kitchen table suddenly rise up out of the bowl and enter into the two halves of its eggshell in your hands where it then became an unbroken egg? You would think something is wrong here. Change does not occur that way naturally. Perhaps someone is secretly intervening to play a trick on you. If you could wait patiently for trillions and trillions of years, it is overwhelmingly probable you still would never witness an egg naturally and spontaneously behaving that way, yet this strange reverse-process does not violate the fundamental laws. (What it means for a process to be spontaneous is that no one intervenes and uses outside energy to manipulate the process.)
What if you could take a God’s eye view of the universe, and some morning you noticed that every process played out in a reverse direction to what you have learned to expect? You probably would conclude that time’s arrow had reversed. Appreciating this interpretation of the scenario provides a motivation for adopting the extrinsic theory of time’s arrow which implies that the arrow is due only to processes regularly showing one-way behavior spontaneously, and it is not due to some inherent structure within time itself as those in the intrinsic camp believe. It is a real pattern in time’s content in Daniel Dennett’s sense of the term “real pattern.” The extrinsic theory is more popular among physicists than among philosophers.
The extrinsic theory is committed to the claims that: (1) time’s arrow is extrinsic to time itself, (2) time’s arrow is identical to, or produced by, the presence of physical processes that are never observed to go the other way spontaneously even if the laws allow them to go that way, and (3) if anything depends on our choice of reference frame—or our choice of coordinate system—it is thereby not an objective feature of the world. It is not independently or intrinsically “real.”
Regarding point (3), time’s arrow does not have this frame dependence, which is why there can be a frame-free master arrow of time but not a frame-free master clock.
Those in the extrinsic camp consider the following to be the major feature of nature they need to explain: What emerges at a higher scale has an arrow, but what it emerges from at the lowest scale does not.
Those in the intrinsic camp disagree with clause (2) above and say the one-way physical processes illustrate time’s arrow or indicate it or exemplify it, but they do not produce it, and the arrow is not founded upon them. The intrinsic camp and extrinsic camp also give different answers to the question, “What is the relationship between time’s directedness and time’s arrow?” Those in the intrinsic camp are very likely to say they are the same. Many in the extrinsic camp are likely to say they are not the same because time itself, like space itself, has no direction. It just appears to have an arrow because of an important event in our past, the big bang. It could have been so different that time’s arrow would now go in reverse to how it now goes. Similarly, space has no direction even though it seems to have the direction we call “down,” but that is just because we have an important object below our feet, the earth. In outer space, it would be clearer that there is no intrinsic arrow of space. Analogously, with a different origin of the universe, the arrow of processes might go in reverse or might not exist. That is why there is no intrinsic arrow of time.
If you were in the extrinsic camp, and you were looking for science to tell you about time’s arrow, you would naturally look to confirmed theories of physics rather than to the theories of the special sciences such as geology and plant science. The theories of physics underlie or explain the proper use of the word “time” in all the sciences. Our two comprehensive and fundamental theories of physics are the general theory of relativity and quantum mechanics. They are fundamental because they cannot be inferred from other theories of physics. Time is primitive in all the fundamental theories. Surprisingly, their laws appear to be nearly oblivious to time’s arrow. Advocates of the extrinsic theory have concluded from this that time itself has no arrow or nearly no arrow. The need for the hedge term “nearly” is discussed in a later section. Some physicists, such as Ilya Prigogine, have concluded instead that if the laws are oblivious to the arrow, then new time asymmetric laws are needed.
The largest sub-group within the extrinsic camp is the entropy camp. Its members believe they have uncovered a physical foundation for time’s arrow. It is entropy increase. Entropy-increase plus the fact that the universe had a minimal amount of entropy in the past is why our universe has our current B-relation rather than its inverse, they would say. A more detailed presentation of what entropy is and what role it plays in time’s arrow is provided later in this article, but loosely it can be said that entropy is a quantitative measure of a closed and isolated system’s closeness to equilibrium: how run down it has become, how disordered, how decayed, how close it is to being homogeneous, or to having its energy spread out, dispersed, and unusable. A system in which no matter can cross the system’s boundary is said to be closed, and a system in which no energy can cross the system’s boundary is said to be isolated. A closed and isolated system is radically insulated from its external environment beyond its boundary.
Entropy is higher after the light bulb has burned out, the cup of hot coffee has cooled to room temperature, the neat pile of leaves you just raked was scattered by a gust of wind, the battery has run down, and the tree has died. These macroscopic processes are always found to run in only one direction in time. Even the best swinging pendulum is a one-way process because of the friction it produces.
Here is a brief description of the key idea of the entropy camp:
We all have the incontestable experience of the flow of time. The problem is to explain it in terms of physics; for we feel that the direction of time is not merely “subjective,” but rooted in the nature of things or “objective.” …Many physicists and philosophers believed it to be solved when the second law of thermodynamics was discovered. (Hutten 1959).
Sean Carroll commented on this point:
We can explain our impressions that time flows and that there is something more real about now than about other moments as consequences of the arrow of time, which is itself a consequence of increasing entropy (Carroll 2022c, 136).
The second law says the total entropy of a closed and isolated physical system has a high probability of increasing in any spontaneous process. The law is expressed in terms of probabilities of what will happen, not what must happen, so, for the entropy camp, the appeal to probability is the key to explaining the arrow of time.
Here are seven tenets of the entropy camp. They believe the presence of time’s arrow in a closed and isolated region, including the region we call our observable universe itself, (1) can be explained or defined by the overall entropy increase in the region, (2) depends on the region having a very large number of atoms or molecules in motion so entropy can be well-defined, (3) emerges only at the macroscopic scale of description, and (4) depends on the fact that entropy was low earlier. (5) Rather than saying time’s arrow necessarily points towards the future, as members of the intrinsic camp say, members of the extrinsic camp say it points away from the Big Bang and towards equilibrium. Equilibrium is the state in which entropy has its maximum value. As that state is approached, time’s arrow fades away. (6) The direction toward equilibrium is, as a matter of fact, the future direction, regardless of reference frame. (7) The informal remark that time flows, or passes, or lapses is explicated as time’s existing.
The term “region” is purposefully vague to allow different claims about the size of the region. Also, there is a slight vagueness in the concept of entropy because there is no minimum number of particles required and no minimum number of new configurations or states per second that need to be present for a dynamic system to have a definite value for its entropy. At room temperature, there are about 10²⁴ molecules in three to four teaspoons of water. Also, at room temperature, a single water molecule collides with its neighbors about 10¹⁴ times per second. Therefore, the number of new configurations per second of this amount of water is enormous. It is so enormous that most experts believe the various sources of vagueness just mentioned are irrelevant to philosophical issues involving time’s arrow.
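The arithmetic behind “enormous” is easy to check. A back-of-the-envelope sketch, using the two figures quoted above (roughly 10²⁴ molecules and roughly 10¹⁴ collisions per molecule per second):

```python
# Back-of-the-envelope estimate of how many collision events (and so
# roughly how many new microscopic configurations) occur per second in
# a few teaspoons of room-temperature water, using the figures above.
molecules = 10**24                        # ~10^24 molecules of water
collisions_per_molecule = 10**14          # ~10^14 collisions per second each
events_per_second = molecules * collisions_per_molecule
print(f"{events_per_second:.0e}")         # on the order of 10^38 per second
```

Even granting that these round figures are only order-of-magnitude estimates, a system reshuffling itself some 10³⁸ times a second leaves no practical room for the definitional vagueness to matter.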
Members of the extrinsic camp would say that, if you clean your messy room and so decrease its entropy, you are not reversing the room’s arrow. They would say the arrow of time applies to the room or is manifest in the room, but there is no “room’s arrow.” Time’s arrow is overall entropy increase throughout the universe, they would say, even though there can be entropy decreases in some smaller sub-systems.
According to the entropy camp, the arrow emerges as the scale increases. This kind of emergence is not a process in time such as when an oak tree emerges from an acorn. It is a coarse-graining feature. It is something that reveals itself as the unhelpful information in the finer details is not taken into account. The arrow’s emergence is not strong emergence of something new and independent of whatever happens at the lower scale, but only weak emergence. Carroll explained the intended sense of the term “emergent”:
To say that something is emergent is to say that it’s part of an approximate description of reality that is valid at a certain (usually macroscopic) level and is to be contrasted with “fundamental” things, which are part of an exact description at the microscopic level…. Fundamental versus emergent is one distinction, and real versus not-real is a completely separate one.
The term “microscopic level” is a vague term that designates the atomic scale, that is, the world at the level of atoms and molecules, or much smaller scales. It is not tied to the use of a microscope. Emergent features are those we posit because they transcend the obscuring details of the microscopic level and give us useful, succinct information that improves our understanding of the phenomena we are interested in at a larger scale. When you want to understand why you just heard a loud noise, as a practical matter you must ignore the information about positions and velocities of all the molecules (even if you were to have some of this information) and focus on the more useful coarse-grained information that the room contained a glass of water which fell onto a hard floor and broke, thereby sending a loud sound throughout the room and into your ear. However, the lower-scale information about each molecule having this or that momentum and position and what external forces are acting on the molecules can in principle be used to explain both any exceptions to the higher-level regularities and the limits of those regularities. An example of an exception is when a falling glass of water does not break even though it hits the floor.
Because the law of entropy increase is a coarse-grained feature of nature, it is irrelevant to Laplace’s Demon. The demon can use the fine-grained features and not bother with entropy. However, coarse-graining is absolutely essential for a human being’s ability to understand nature.
How should one explain why the direction of entropy increase coincides with the direction of time? Perhaps it is an inexplicable fact, perhaps a coincidence, or perhaps there is a deeper explanation. Julian Barbour said it is a bad question: “It is wrong to seek to explain why the direction of entropy increase coincides with the direction of time. The direction of entropy increase is the direction of time.” Everyone in the intrinsic camp disagrees with Barbour.
Although adherents to the extrinsic theory often speak of the “ways” in which the past differs from the future as being “arrows,” this article often calls them “mini-arrows” in order to distinguish a mini-arrow from time’s master arrow that includes all the mini-arrows. The term “mini-arrow” is not a term commonly used in the literature. Typical mini-arrows recognized by the extrinsic camp are entropy increasing, causes always preceding their effects, space’s constantly expanding and never contracting, radiation flowing away from accelerated charges (such as in a candle flame or light bulb) and not into them, people having access to records of the past but not of the future, heat flowing naturally only from hot to cold, and our being able to intervene and affect the future but never the past. Explaining time’s arrow in more depth requires solving the problem of showing how these mini-arrows are related to each other. Perhaps some can be used to explain others but not vice versa. Huw Price called this the taxonomy problem. Attempts to solve the problem are explored in a later section.
The main goals of the entropy camp are (i) to describe how emergence works in more detail, (ii) to explain why the universe and its sub-systems have not yet reached equilibrium, (iii) to understand why entropy was lower in the past, and (iv) to solve the taxonomy problem.
Carroll commented:
In reality, as far as the laws of physics are concerned, all directions in space are created equal. If you were an astronaut, floating in your spacesuit while you performed an extravehicular activity, you wouldn’t notice any difference between one direction in space and another. The reason why there’s a noticeable distinction between up and down for us isn’t because of the nature of space; it’s because we live in the vicinity of an extremely influential object: the Earth…. Time works the same way. In our everyday world, time’s arrow is unmistakable, and you would be forgiven for thinking that there is an intrinsic difference between past and future. In reality, both directions of time are created equal. The reason why there’s a noticeable distinction between past and future isn’t because of the nature of time….
Instead, he would say, it is because of the nature of entropy change—a universal tendency to evolve toward equilibrium—and the fact that entropy was low in the distant past. Also, when he said, “both directions of time are created equal” he did not mean to imply that there was intentional creation involved.
In his 1956 book The Direction of Time, Hans Reichenbach proposed an influential version of the entropy theory. He said, “positive time is the direction toward higher entropy,” and he defined the future direction of time as the direction of the entropy increase of most branch sub-systems. These are sub-systems that become isolated temporarily from the main system of objects being analyzed. Very probably, these isolated branch systems undergo an entropy increase over time, and the overall direction they all go is toward equilibrium. Reichenbach’s overall goal was to explain the direction of time in terms of the direction from causes to their effects.
a. Criticisms
A variety of criticisms has been directed at the extrinsic theory of time’s arrow. A very popular one is that the theory is too static. It misses the dynamic feature of time that is the key feature of the intrinsic arrow. It misses what is customarily called time’s becoming or time’s passage. These critics believe time either is asymmetric or at least appears to be asymmetric even if it is not, but the extrinsic theory of the arrow with its static theory of time cannot account for this asymmetry. What can account for it, they say, is a dynamic theory of time in which time flows or passes.
A-theorists complain that the B-theorists of the extrinsic camp mistakenly promote a non-dynamic or static block theory of time in which insufficient attention is paid to change because the B-series of events is only about what events occur before what other events. For example, the B-theory fails to capture the dynamical fact that the present keeps moving along the A-series of events as time goes by in one direction.
Maudlin said the extrinsic theory misses the dynamical fact that the present “produces” the future. Those in the entropy camp believe entropy tends to increase over time in isolated systems because higher entropy states are “typical” or “likely,” compared to lower entropy states. These theorists, says Maudlin, do not appreciate that their notion of typicality needs to rely on an assumption of the “production” of events that serves as a “driving force” that gets states to evolve into other states. Yet it is the intrinsic arrow that provides this production, this driving force. The intrinsic arrow is why production is an asymmetric relation: If A produces B, then B does not produce A. In this sense, earlier states are metaphysically more fundamental than the later states that are produced by them and exist because of them. So, those in the entropy camp have things backward.
Bertrand Russell was an influential promoter of the B-series. His colleague at Cambridge University J.M.E. McTaggart disagreed with Russell about the nature of time. Based upon passages §313–315 of McTaggart’s book The Nature of Existence, one can imagine McTaggart saying to Russell: No, Russell, no. What you identify as “change” isn’t change at all. The B-series world you think is the real world is a world without becoming, a world in which nothing happens.
By “B-series world,” he means a world without an A-series. If Russell were to have lived in the twenty-first century, he might have responded to McTaggart by saying McTaggart’s mistake is to imply by analogy that a video file in a computer could never represent anything that changes because the file itself does not change.
Strictly speaking, the A-theory and the B-theory are theories about the ordering of events, not times. But if you think of a time as a set of simultaneous, instantaneous events, then the theories are about the ordering of times.
Members of the extrinsic camp have a responsibility to answer the following criticism: If, as they believe, objectively there is no dynamic flow or passage inherent in physical time, then why do so many people believe there is? Surely these people have gotten something right about the nature of time. Craig Callender has tried to defend the extrinsic theory against this criticism:
While physical time does not itself flow, we can explain why creatures like us embedded in a world like this one would nonetheless claim that it does…. In contrast to simply positing a primitive metaphysical flow and crossing one’s fingers in hope that somehow we sense it, the present theory advances independently suggested mechanisms, makes a number of specific claims, unifies some types of phenomena and theory, and suggests fruitful lines of inquiry. By any reasonable standard of theory choice, it is a better theory of passage than any currently on offer in metaphysics (Callender 2017, 227 and 263).
Let us turn now to criticisms that are more specific to the entropy camp. A critic might complain that members of the entropy camp cannot successfully defend their belief that the relation of earlier-than reduces to the lower-to-higher entropy relation rather than to the higher-to-lower entropy relation. At best they must presuppose what they need to prove.
Many critics of the entropy camp say that, even though entropy increase is directly correlated with the increase of time, this is an interesting but merely contingent feature of the universe that is not crucial to characterizing or explaining time’s arrow, although it might be a sign of the arrow’s presence. Stop signs near intersections of roads are signs of the presence of cars in the world, but cars do not need stop signs in order to be cars.
Another significant criticism of the entropy theory is that it is a pseudotheory because it makes no testable predictions. What experimental results can it explain? U.C. Berkeley physicist Richard Muller, appealing to the views of the philosopher Karl Popper, said:
A theory that makes no predictions cannot be falsified. I suggest we use the term pseudotheory for claimed theories that can be verified but not falsified. If time is related to entropy would you expect to see some effects? Relativity is full of such phenomena. Local gravity affects the rate of clocks; should not local entropy do the same? When entropy of the Earth’s surface decreases at night, shouldn’t we expect to see a change in the rate of time, perhaps a local slowing? But it doesn’t happen. Why not?…
Remember, there is no established connection between entropy rise and time flow; it is merely a speculation based on a correlation—the fact that both are moving forward. There is no actual theory, in the sense that general relativity is a theory. Maybe someday there will be a…theory. I can’t rule it out… (Muller 2016).
One thought is that perhaps we should look for testable effects of global entropy, not the local entropy of its sub-systems.
Does time need an entropy arrow or any kind of extrinsic arrow at all? Ferrell Christensen has an interesting perspective on this. “It is puzzling that Boltzmann’s thesis of extrinsic temporal asymmetry is accepted so widely without question—it is virtually an article of faith among philosophers and physicists in certain quarters. I suggest that such an attitude is unjustified. At least for now, the assumption that time needs an extrinsic arrow is in error” (Christensen 1987, 247).
Castagnino and Lombardi argue that the arrow is intrinsic to relativistic space-time, but it is not based on or reducible to entropy changes. They embrace:
Earman’s ‘Time Direction Heresy’, according to which the arrow of time, if it exists, is an intrinsic feature of space-time which does not need and cannot be reduced to non-temporal features [such as entropy] (Earman 1974, 20) and it cannot be “defined” in terms of entropy…[and the] geometrical approach to the problem of the arrow of time has conceptual priority over the entropic approach, since the geometrical properties of the universe are more basic than its thermodynamic properties…. [T]o confidently transfer the concept of entropy from the field of thermodynamics to cosmology is a controversial move. The definition of entropy in cosmology is still a very problematic issue, even more than in thermodynamics: there is not a consensus among physicists regarding how to define a global entropy for the universe. In fact, it is usual to work only with the entropy associated with matter and radiation because there is not yet a clear idea about how to define the entropy due to the gravitational field (Castagnino and Lombardi 2009, 8).
Critics of the entropy theory have complained that entropy is anthropomorphic and person-dependent, but time’s arrow is not. This criticism is explored later in this article after more has been said about the nature of entropy and after we consider the ideas of Edwin T. Jaynes.
Turning back to broader criticisms of the extrinsic theory that do not apply only to the entropy theory, Tim Maudlin and others complain that proponents of the extrinsic theory do not understand the nature of time-reversal symmetry. Saying the universe obeys time-reversal symmetry or time-reversal invariance, which is the same thing, is the technical way of saying that whatever happens could have happened in reverse. Maudlin believes misunderstanding this feature is a mistake that causes so many of those in the extrinsic camp to accept the following faulty argument: The fundamental physical laws are time-reversal invariant. Therefore, there is no direction of time in fundamental physics. Maudlin points out that Isaac Newton knew that his own laws were time-symmetric, yet he never doubted that time has an arrow.
The term “time-reversal” is ambiguous. The time-reversal operator T in physics replaces the time variable “t” in an equation by its negative “-t“. The intrinsic camp says T is reversing time. Many in the extrinsic camp say T is reversing only the dynamics and not time itself. That is why advocates of the extrinsic theory complain that it is misleading to call T the “time-reversal operation.” It is a naive time-reversal operation, they say, because it does not take into account that time reversal depends on which theories of physics are being applied. But this point is controversial. Someone like Sean Carroll would say a full time-reversal would take quantum theory into account and would reverse CPT, namely charge, parity and time. Nevertheless, almost everyone in the intrinsic camp continues to call T the “time-reversal operator” even though the two camps disagree about what time-reversal is.
The idea that entropy is intimately connected to time’s arrow originated from noticing that increases of entropy in closed and isolated systems are correlated with increases in time. The two increases are directly correlated and highly correlated but not perfectly correlated because entropy increase is a strong tendency, not a certainty, although this difference between tendency and certainty was not clear to the physics community in the late 1800s before thermodynamics became grounded in statistical mechanics.
Regarding this notion of tendency, consider that when two dice are rolled, the tendency is to not get two sixes. The failure to roll two sixes is not certain; it is merely likely. Similarly for entropy increase. When the state of a closed and isolated system of many molecules changes, it has a tendency or propensity to change to a state with higher total entropy because that is the most likely thing to happen. It is enormously likely, almost certain in systems with a great many particles that frequently change their configurations. However, the tendency does not hold for systems with a very small number of parts or for systems of parts that do not change their configurations. To summarize this point, we say the tendency emerges as the scale increases.
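The dice analogy can be made concrete with a quick calculation and simulation (an illustration of the point above, not drawn from the source): for a single throw, avoiding double sixes is merely likely, yet across many throws the aggregate frequency settles reliably near that likelihood.

```python
import random
from fractions import Fraction

random.seed(0)  # fixed seed so the run is reproducible

# Exact single-throw probabilities for two fair dice.
p_double_six = Fraction(1, 36)
p_not_double_six = 1 - p_double_six   # 35/36, roughly 0.9722: likely, not certain

# Simulate many throws: the tendency (avoiding double sixes) holds in
# roughly 35 out of 36 cases, yet individual exceptions still occur.
throws = 100_000
misses = sum(1 for _ in range(throws)
             if (random.randint(1, 6), random.randint(1, 6)) != (6, 6))
print(p_not_double_six, misses / throws)
```

On the entropy camp’s picture, entropy increase is the molecular analogue of not rolling double sixes: for systems with vastly many frequently reshuffling parts, the probable outcome is so overwhelmingly probable that exceptions are never observed, though they are never logically ruled out.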
The word “entropy” is not part of the vocabulary of the ordinary person, but the concept is indispensable in modern science. It has many helpful characterizations depending on the situation. For example, in some systems it can be a useful numerical measure of the system’s randomness or disorder. The system can be the universe as a whole or some subsystem. The concept of disorder here is not a subjective one that two people could reasonably disagree about, but one that is a quantitatively precise number of different, physically possible configurations of a physical system. Ludwig Boltzmann was the first person to understand this. His key idea was to conceive of cases of entropy increase as cases of mixing of particles with two assumptions. There must be many particles, and they must be spontaneously shaking.
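Boltzmann’s idea is standardly summarized by his entropy formula (standard textbook notation, not drawn from this article):

```latex
S = k_B \ln W
```

Here \(W\) is the number of distinct microscopic configurations compatible with the system’s macroscopic state, and \(k_B\) is Boltzmann’s constant. A macrostate compatible with few arrangements of the particles has low entropy; one compatible with astronomically many has high entropy, which is what makes disorder quantitatively precise rather than a matter of opinion.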
For example, consider a cardboard box containing an inch of black sand on the bottom and covered by an inch of white sand on top of this. The box system is at relatively low entropy because a white grain cannot be just anywhere, and ditto for a black grain. Randomly shaking will eventually mix the colors of sand and produce a mixed box at higher entropy. But if you randomly shake a box of half white and half black sand that is already mixed, you will never see the two colors separate into a layer of black under a layer of white. You could shake for much longer than the age of the universe and never see significant un-mixing. This phenomenon is explained by statistical physics, not thermodynamics. Boltzmann’s insight was to treat entropy change as a statistical phenomenon. He envisioned all material as if it is made of extremely tiny sand-like particles in motion, particles that we now call molecules. Very few physicists at the time believed molecules were actually real.
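Boltzmann’s counting argument can be sketched in a few lines of code. The toy model below is my illustration with made-up parameters: the box is N black and N white grains in a row, “shaking” is a sequence of random swaps, and the entropy of the coarse-grained macrostate “k black grains in the top half” is the logarithm of the number of grain arrangements compatible with that count.

```python
import math
import random

random.seed(1)  # fixed seed so the run is reproducible

N = 500  # 500 black grains and 500 white grains
# Start fully separated: black grains occupy the bottom half of the row.
box = ["black"] * N + ["white"] * N

def entropy(box):
    """Boltzmann-style entropy (in units of k_B) of the macrostate
    'k black grains in the top half': the log of the number of
    microscopic arrangements compatible with that count."""
    half = len(box) // 2
    k = box[half:].count("black")
    # Ways to place k blacks in the top half times ways to place the
    # remaining N - k blacks in the bottom half.
    return math.log(math.comb(half, k) * math.comb(half, half - k))

s_start = entropy(box)  # 0.0: exactly one compatible coarse-grained count

# Shake the box: many random pairwise swaps of grains.
for _ in range(100_000):
    i, j = random.randrange(2 * N), random.randrange(2 * N)
    box[i], box[j] = box[j], box[i]

s_end = entropy(box)
print(s_start, round(s_end, 1))  # entropy climbs from 0 toward its maximum
```

After shaking, the count k hovers near N/2, where the number of compatible arrangements (about 686 in these log units for N = 500) is maximal. Spontaneous un-mixing back to the separated state is possible in principle but absurdly improbable, which is the statistical content Boltzmann gave to the second law.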
Is entropy increase the same as complexity increase? No. The sand box starts off simple with just two separate layers. During the mixing it becomes quite complex, but at the end of the mixing it is simple again, having become just an even distribution of all the particles. Similarly, today’s state of the universe is very complex, but a universe near equilibrium in the far future will be very simple; it will just be a homogeneous soup of elementary particles at much higher entropy. At equilibrium, the system’s macroscopic properties are constant and nothing interesting is happening macroscopically even though its microstate does keep changing, so time still exists. What time needs in order to exist is merely change; that change does not need to be macroscopic, nor alter the entropy, nor have an arrow, according to members of the extrinsic camp.
Is entropy increase the same as energy increase? In a closed and isolated and non-expanding system, total energy stays the same and merely changes form over time while total entropy increases over time. So, entropy is not energy. But there is a close connection. Entropy is a measure of how much energy of a system is unavailable for doing work. As entropy increases, energy is wasted, though not lost. So, the lesson here is that entropy is not a kind of substance that retains its identity over time, but is a special property of a changing, multi-particle system.
A small minority of cosmologists have argued that since the entropy of the universe remained constant during the period prior to the Higgs field turning on, when no objects yet had mass, “time should have stopped” (Muller 2016, 170). But time did not stop, so, it is argued, the arrow of time cannot be due to entropy increase. In my mind, says Muller, “the expression ‘not even wrong’ equally describes the entropy explanation for the arrow of time.” The majority of cosmologists say it is not time that should have stopped during this period; it is the arrow that should have stopped.
In the late 17th century, Robert Boyle claimed without strong evidence that physical phenomena can be explained by “the particular sizes, shapes, and situations of the extremely little bodies that cause them.” In the 1870s, it still was not generally accepted by physicists that there are these “little bodies” that we now call atoms and molecules. At that time, Boltzmann envisioned thought-experiments for a universe that he treated as a discrete particle system, as if it were a system of tiny, colliding billiard balls in constant motion in three dimensions of space with configurations of the balls changing continually. However, he believed this atomism, like Isaac Newton’s atomism, was merely a counterfactual assumption, one that was literally false yet helpful in describing thermodynamic behavior. Almost all physicists, including Boltzmann, believed at the time that matter is continuous and infinitely divisible. But Boltzmann soon changed his mind and became an early exponent of the atomic thesis—that atoms are real and not just useful mathematical artifacts. But his contemporaries were slow to accept atoms, and they severely criticized him for promoting the atomic hypothesis. They argued that atoms cannot be seen and never would be, so why believe in them? Boltzmann’s response was that atoms are real because they are so helpful for explaining thermodynamic principles and phenomena.
Boltzmann had several insights that eventually were developed into the science of statistical thermodynamics. The notion of a macrostate as opposed to a microstate of a physical system was first suggested by him. When the atoms of air in a room are in a single macrostate described as having a constant density throughout the room, there are a zillion different configurations or microstates of those atoms that could have produced that same macrostate. At any point in time, a system is in one macrostate, which is produced by exactly one microstate but which could have been produced by any one of a great many other microstates, because at the macrolevel no one except Laplace’s Demon could practically tell one of these microstates from another.
Scientists never actually know the microstate at a given time of any macroscopic system, such as the exact positions and momenta of all of its atoms and molecules; they know only the values of some of the system’s useful, emergent, macroscopic variables such as its volume, pressure, temperature, and (going beyond thermodynamics) voltage, color, species, and the number of cows milked this morning in barn 6. The values of those macro-variables constitute the system’s macrostate. Boltzmann’s most important insight in statistical thermodynamics was that there is multiple realizability of any macrostate by an enormous number of microstates and that the probability of any macrostate occurring is proportional to the number of different microstates that could produce it. This assumption is now called the ergodic hypothesis.
He then defined entropy as being directly proportional to the logarithm of how many ways the constituent particles of a closed and isolated physical system can be re-configured so that external observers cannot tell the difference macroscopically among all those configurations. If W represents the number of those ways, and k is the constant of direct proportionality, then his equation for entropy is:
Entropy = k log W
When Boltzmann’s calculations for a few examples matched known values of entropy that had been measured by the engineers who worked on steam engines, he realized that he had discovered a quantitative concept that applied everywhere, not just to steam engines. Eventually he requested that this equation be etched on his tombstone, and it was.
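The equation can be tried on a toy system. In the sketch below, the two-state “particles” and the choice of macrostate are illustrative assumptions of mine; the point is only to show how W, the number of microstates realizing a macrostate, drives the entropy:

```python
import math

k = 1.380649e-23  # Boltzmann's constant, in joules per kelvin

def boltzmann_entropy(W):
    """Entropy = k log W, with W the number of microstates
    that realize the given macrostate."""
    return k * math.log(W)

# Toy system: 100 two-state particles ("up"/"down"). Take the macrostate
# to be the number of "up" particles; W is then a binomial coefficient.
N = 100
for n_up in (0, 10, 50):
    W = math.comb(N, n_up)
    print(n_up, W, boltzmann_entropy(W))

# n_up = 50 has by far the most microstates (about 1.0e29), so it is the
# highest-entropy, most probable macrostate; n_up = 0 has W = 1 and
# entropy exactly zero.
```

The evenly split macrostate is most probable simply because it can be realized in the most ways, which is Boltzmann’s central point.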
Boltzmann’s insight was that entropy changes in a way that produces the most probable outcome. Here is a helpful analogy for understanding Boltzmann’s ‘counting argument’ for why a closed system’s high-entropy states can happen in such an enormous variety of ways that they occupy the ‘greatest volume’ of possibilities.
It is like imagining a house with a thousand blue rooms and … (several rooms that are red). …Think of the blue rooms as analogous to high-entropy states (of the isolated house). Now, suppose that you were to leave a red room and enter an arbitrary new room (with a uniform probability of entering any given room); it is overwhelmingly likely that the new room will be blue. Moreover, it is likely that if you continue to repeat this process, your room colour will (with high probability) be unchanging or ‘stationary’ over time…. Unfortunately, this is not enough to explain the time-asymmetric behaviour of such systems. Returning to the house analogy, suppose we find a person in a red room and ask what colour room they are most likely to have come from? The very same volume argument concludes: a blue one. That is, the counting argument by itself provides equally good evidence for a high entropy state to the future and to the past (Roberts 2022).
(The first two parenthetic phrases in the quotation were added by the author for clarification.) Your entering a new room is analogous to the system’s entering a new configuration, a new state.
Explaining why Boltzmann’s reasoning is useful for predicting the future behavior of the house but not for retrodicting its past behavior requires additional assumptions, discussed below.
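Roberts’s house analogy can be simulated. In this sketch (the room counts and the uniform-draw rule are illustrative assumptions), the same uniform counting measure is applied to both the forward and the backward inference, so both deliver the same “probably blue” verdict; that symmetry is exactly the problem the quotation raises:

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

# A house with 1000 blue rooms (high-entropy states) and 10 red rooms.
rooms = ["blue"] * 1000 + ["red"] * 10

def random_room():
    """One step of the counting argument: a uniform draw over all rooms."""
    return random.choice(rooms)

# Forward inference: starting from a red room, the next room entered
# is overwhelmingly likely to be blue.
next_rooms = [random_room() for _ in range(10_000)]
print(next_rooms.count("blue") / len(next_rooms))  # about 0.99

# Backward inference: the very same uniform measure says the room a
# person in a red room most likely came FROM is also blue. The counting
# argument alone is time-symmetric.
prev_rooms = [random_room() for _ in range(10_000)]
print(prev_rooms.count("blue") / len(prev_rooms))  # about 0.99
```

Nothing in the counting breaks the symmetry between “next room” and “previous room”; some extra assumption about the past is needed, which is the role later played by the Past Hypothesis.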
The principal law in physics that involves entropy is the second law of thermodynamics. There is agreement that it has never been violated, but there is no agreement on what has not been violated. The physicist Percy Bridgman quipped that “There have been nearly as many formulations of the second law as there have been discussions of it.” Despite Bridgman’s caution, here is a very good, short, non-mathematical version of the second law that is especially useful for exploring the philosophical considerations of this article:
Second Law of Thermodynamics: There is a strong tendency for entropy to increase with time as a closed, isolated system moves towards equilibrium.
This is the case provided the system is not already at equilibrium. If it is already in equilibrium, then entropy would have a strong tendency to stay the same. The above statement of the second law for a system’s total entropy is recommended by and quoted from the physicist Richard Feynman.
Another way to express the second law is to say that, for a closed and isolated system having a nonequilibrium initial condition, it is far more likely to move closer to equilibrium than away from equilibrium. So, the proper answer to the question, “Why does total entropy increase over time?” is that this is what is very, very likely to happen.
The second law is not a fundamental law of physics. Physicists agree that it should be derivable from the fundamental laws using the techniques of statistical mechanics (statistical techniques for handling a system with a very large number of components), perhaps with other assumptions. However, there is no consensus on the details of the derivation. From 1902 to 1905, Einstein worked unsuccessfully to derive the second law from basic physical features, but he stopped working on the problem in 1905 when he produced the theory of relativity. The derivation difficulty continues in the 21st century. In the 20th century the field of thermodynamics came more and more to be understood as the field of coarse-grained statistical mechanics in which the coarse graining is due to the loss of a considerable amount of information about the configuration of the system’s constituent particles.
All other things being equal, physicists prefer exact laws to probabilistic laws, and the standard form of the second law is probabilistic. The second law is about a strong tendency or propensity, not a necessity, although in the early days of thermodynamics the law was mistakenly presented as a necessity—that total entropy never decreases in a closed and isolated system. This mistake led some critics to make the inaccurate comment that the growth of life on Earth is inconsistent with the second law because growing beings are decreasing their entropy. It is true that life represents an entropy decrease, but the reason the comment is a mistake is that life is a local decrease in entropy, and there is no problem with local decreases so long as they are compensated for by larger entropy increases elsewhere in the system. For example, think of human civilization on Earth as a sub-system of the effectively isolated solar system. Civilization can thrive on Earth and decrease its entropy—but at the expense of entropy increase due to the sun’s continuing to burn its nuclear fuel.
Another reason there can be life on Earth is because our sun sends high-temperature, low-entropy, yellow sunlight to Earth where it is used for photosynthesis and so forth, and then it is radiated away as lower-temperature, higher-entropy, infrared energy. If the sun’s energy were not continually re-radiated, the Earth would slowly heat up, and we all would die of global warming.
Thanks to this insight of Boltzmann’s, it is considered a misunderstanding of the second law and of the concept of entropy to believe there is an asymmetric force or mechanism that causes entropy to increase.
The second law is expressed as an inequality, not an equation, and the actual speed or rate with which a system increases its entropy is not specified, even by the quantified version of the law. The rate depends on many factors that are not discussed in this article and that do not affect the philosophical claims about time’s arrow.
There are several background assumptions made in founding thermodynamics upon statistical mechanics. The system must have very many particles, and they must readily change their configurations so that it looks as if chance is operating. More specifically, it has proved useful for explaining a system’s evolution in the future to assume that all or almost all the microscopic states underpinning a given macroscopic state are equiprobable, and it is presumed this idealization is not misleading. (But is it legitimate to assume that all the blue rooms in the above analogy are sufficiently equiprobable? In the real world beyond this toy model, the “blue rooms” surely are not exactly equiprobable.)
It is also clear that, if one wants to explain evolution from the past and not just evolution away from the past toward the future, there needs to be some assumption about why the entropy started out lower rather than higher than it is today. Unfortunately, there is no consensus among physicists regarding whether more assumptions or which assumptions are needed in order to establish in detail that thermodynamics is founded upon statistical thermodynamics.
Another point of controversy is the complaint that entropy is subjective or anthropomorphic. The influential physicist Edwin T. Jaynes remarked that:
Entropy is an anthropomorphic concept, not only in the well-known statistical sense that it measures the extent of human ignorance as to the microstate. Even at the purely phenomenological level, entropy is an anthropomorphic concept. For it is a property, not of the physical system, but of the particular experiments you or I choose to perform on it.
This is because of our choice of what level of coarse-graining to use. It is we humans who choose what to count as a macrostate. For example, temperature might be a macrovariable that we require to specify the system, or temperature might be deemed irrelevant to the macrostate. It is our choice. Jaynes’ position is presented and then attacked by the philosopher Adolf Grünbaum as a misunderstanding of the concept of entropy. It is clear that entropy is somewhat observer-dependent, but what is more important is the extent and significance of this observer dependence. Grünbaum’s point is that all the different ways of coarse-graining lead to nearly the same result, to nearly the same value for the entropy. Thus, entropy is not significantly subjective (Grünbaum 1973, 648-659). The Nobel-Prize-winning physicist Roger Penrose agreed:
In view of these problems of subjectivity, it is remarkable that the concept of entropy is useful at all in precise scientific descriptions—which it certainly is! The reason for this utility is that the changes from order to disorder in a system, in terms of detailed particle positions and velocities, are utterly enormous, and (in almost all circumstances) will completely swamp any reasonable differences of viewpoint as to what is or is not ‘manifest order’ on the macroscopic scale (Penrose 1989, 310).
Grünbaum’s and Penrose’s point is that the subjectivity exists, but it is trivial, and to emphasize that entropy is subjective is to mislead people about the nature of entropy.
Not everyone adopted the position that Grünbaum and Penrose advocated. Carlo Rovelli and Huw Price did not. Rovelli said, “The directionality of time is…real but perspectival…: the entropy of the world in relation to us increases…and…the increase in entropy which we observe depends on our interaction with the universe….”
Time’s arrow is not illustrated by the fact that tomorrow comes after today. That fact is true by definition. Instead, according to the entropy camp, the arrow of our universe is shown by the fact that today has a greater value of entropy than yesterday and so on for tomorrow and the foreseeable days ahead. The universe will never be seen to have a state just like it has today. Why is this? If things change, why can’t they change back? They can, but the probability that they will is insignificant.
According to the entropy camp, there are two ways to have time without an arrow. There would be no arrow if entropy were to stop changing. There also would be no arrow if the entropy changes were to become randomly directed. Members of the intrinsic camp disagree with these two exceptions and say there is no way to have time without an arrow. Even at equilibrium, time would continue to go from past to future, they say.
Muller offers an interesting criticism of the entropy camp. Its proponents say we notice the direction of time from entropy decrease not entropy increase. Wrong, says Muller. “Humans take these magnificent low-entropy trees, cut them into boards, and construct buildings. If you watched a movie of a house being built, you would know the direction of time from the increasing order, not the increasing confusion; you would know it from the decreasing entropy.”
For a not-too-mathematical introduction to entropy and some of its many sub-issues, see the chapter “Entropy and Disorder” in (Carroll 2010). For a more mathematical, but easy-to-understand introduction, see (Lebowitz 1993). For an examination of how entropy has been misunderstood in the literature, see (Lazarovici and Reichert 2015).
a. The Past Hypothesis
Physicists presume that all the fundamental laws relevant to understanding entropy change are reversible in the sense that they have time-reversal symmetry. This implies (among many other things) that, for every solution of the equations for which entropy increases, there is also a time-reversed (that is, process-reversed) solution in which entropy decreases. Yet we never notice any significant entropy decreases. We invariably experience entropy increases. These increases taken all together are what those in the entropy camp call the arrow of time.
But what about the other half of the solutions to the equations? Philosophers and physicists want to know why the other half of the equations’ solutions are not manifested. So, careful attention is needed in order to explain in detail why the entropy in our universe generally increases from past to future and not from future to past. If not in our universe, then at least in our observable universe. Ludwig Boltzmann, with a little subsequent help from modern statistical mechanics, tried to explain this by appealing to the second law of thermodynamics with no assumption that time is intrinsically directed, but he invariably had to assume that entropy was low in the distant past. Without that assumption, his theory implies not only that entropy increases toward the future but also that it increases toward the past. Yet we never observe entropy increasing toward the past. So, the problem of the relevance of the other 50% of the solutions to the equations still has not been solved.
Is entropy low in the distant past of our observable universe? Yes, it is, but why is this so? This fact about low entropy is not derivable from any of the fundamental laws, and it is not known a priori. Richard Feynman highlighted the need for this assumption when he said in 1963:
So far as we know, all the fundamental laws of physics, like Newton’s equations, are reversible. Then where does irreversibility come from? It comes from going from order to disorder, but we do not understand this till we know the origin of the order… for some reason the universe at one time had a very low entropy for its energy content, and since then the entropy has increased. So that is the way towards the future. That is the origin of all irreversibility, that is what makes the process of growth and decay, that makes us remember the past and not the future…. One possible explanation of the high degree of order in the present-day world is that it is just a question of luck. Perhaps our universe happened to have had a fluctuation of some kind in the past…. We would like to argue that this is not the case.
In 1965, Feynman said: “I think it necessary to add to the physical laws the hypothesis that in the past the universe was more ordered, in the technical sense, than it is today” (Feynman 1965, 116). In 2000, the philosopher David Albert suggested we assume that the entropy of the observable universe was minimal all the way back in time, and if the time began at the Big Bang, then entropy was minimal at the Big Bang. This low-entropy boundary condition in the past is his Past Hypothesis. The hypothesis is not a dynamical law, but it is a law in the sense that it provides a lot of information in a compact and simple expression, which is all David Lewis requires of a law.
A box of particles at low entropy today would be highly ordered, perhaps with all the particles collected into a corner; here the mutual gravitational attraction of the particles is irrelevant. That is today, but at a time early in the big bang the gravitational potential energy would be very relevant. It would be the dominant factor, and since gravity causes lumpiness, a highly ordered, low-entropy gravitational situation would be one in which the early objects were not lumpy but were smoothly distributed. Early smoothness is low entropy; smoothness today is high entropy. The Past Hypothesis is the hypothesis that there was early smoothness and low entropy.
Sean Carroll defended the Past Hypothesis:
You need the Past Hypothesis…. Now, to be fair, the story I am telling you here, this is the standard story that most physicists or philosophers would tell you. There are people who don’t go along with the standard story. There are people who…think that time just has a direction built into it…that there is a flow from the past to the future. I don’t think that. Most working physicists don’t think that, but there are people who think that.… Even if you believe that, it doesn’t by itself tell you whether the past had low entropy.
To me the logic goes in the following way. You might want to think that time fundamentally has a direction—or that time doesn’t fundamentally have a direction [and] it’s just that it started with low entropy and so we perceive it to have a direction macroscopically. But if you think that time fundamentally has a direction, you still need to explain why the early universe had low entropy. That doesn’t come for granted. There is no feature about saying time has a direction that then says if I take the current state of the universe and evolve it into the past, the entropy goes down. There is no connection there, right? So, even if you believe that time has a direction, you still need to have some past hypothesis. And once you have the past hypothesis, you don’t need to assume that time [itself] has a direction because it will have a direction macroscopically [because of the second law of thermodynamics] even if microscopically it’s completely reversible. I think that’s why most people like the Past Hypothesis package when it comes to explaining the asymmetry of time (Carroll 2022b).
In the above quotation, Carroll supported two claims: (1) The Past Hypothesis is needed in order to successfully use entropy to explain the existence of time’s arrow. (2) The extrinsic theory, especially the entropy theory, is more appropriate than any intrinsic theory for explaining time’s arrow.
The Past Hypothesis does not require that the low entropy state at the big bang had no prior state nor that it is the very lowest possible value for entropy. Entropy in the distant past just needs to be low enough that the universe could easily have evolved into the observable universe of today.
But this raises another fine point: the difference between a boundary condition on the past as opposed to a boundary condition on the future. If our goal were to explain only why entropy increases in the future, then we could assume the Principle of Indifference—that we are indifferent about which microstate is producing a given macrostate—and not bother with the Past Hypothesis. Not so, when it comes to explaining why entropy decreases in the past. In that case, as explained in (Carroll 2020):
[W]e have to supplement the Principle of Indifference with the Past Hypothesis. When it comes to picking out microstates within our macrostate, we do not assign every one equal probability: We choose only those microstates that are compatible with a much lower-entropy past (a very tiny fraction), and take all of those to have equal probability. …If we use statistical mechanics to predict the future behavior of a system, the predictions we get based on the Principle of Indifference plus the Past Hypothesis are indistinguishable from those we would get from the Principle of Indifference alone. As long as there is no assumption of any special future boundary conditions, all is well.
Speaking for the community of cosmologists, Brian Greene issued a warning: The Past Hypothesis is “observationally motivated but theoretically unexplained.” Instead of merely adopting Albert’s hypothesis, cosmologists want a theoretical reason why the Big Bang began at a relatively low entropy macrostate, a reason that makes this low entropy natural and not merely assumed ad hoc. The search for that theoretical reason has turned out to be extremely difficult. About this search, Roger Penrose declared, “To me, it’s the greatest puzzle about the Big Bang.”
Craig Callender proposed a solution to Penrose’s puzzle: “It seems a philosophically respectable position to maintain that the Past Hypothesis doesn’t need explanation” because it is a brute fact, a rock-bottom truth.
Anything whatsoever could be explained by the right choice of unusual initial conditions. Is the Past Hypothesis true merely because of some random fluctuation? Remarking on what he believed is the weakness of that explanation, Carroll said:
The state of the early universe was not chosen randomly among all possible states. Everyone in the world who has thought about the problem agrees with that. What they don’t agree on is why the early universe was so special—what is the mechanism that put it in that state? And, since we shouldn’t be temporal chauvinists about it, why doesn’t the same mechanism put the late universe in a similar state? (Carroll 2010 301-2).
Motivated by this explanatory optimism, many cosmologists have produced speculative theories that appeal to special conditions long before the Big Bang that have led naturally to low entropy at the Big Bang. However, none of these theories has attracted many supporters.
Among cosmologists, the most widely supported explanation of why the Big Bang was at relatively low entropy is that this is implied by cosmic inflation, a special version of the Big Bang theory that supposes there was early, exponential inflation of space, a swelling that proceeded much faster than the speed of light. This inflation theory establishes what direction the arrow of time points, and it is attractive also because it provides a ready explanation for many other unsolved problems in physics such as why the oldest and most distant microwave radiation arriving now on Earth from all directions is so uniform in frequency and temperature.
The leading theory for why this inflation began is that it was ignited by a fluctuation in a pre-existing inflaton field (not inflation field) that was at even lower entropy. It is believed that at a very early time all the energy of the universe was contained within the inflaton field. Unfortunately, there is no convincing reason why the inflaton field exists and why it fluctuated as it did and why entropy was lower before then—a convincing reason that this was natural and to be expected—other than that these assumptions help explain the value of entropy just as inflation began. So, the conclusion has to be accepted that Penrose’s puzzle remains unsolved.
See (Wallace 2017) for an exploration of the controversy about the Past Hypothesis and how it should be properly formulated. The entropy theory of time’s arrow is actually a family of theories. See (Carroll 2010, 364-5) for different versions of the theory.
5. Other Arrows
Mini-arrows are time-asymmetries of kinds of macro processes. Time has many mini-arrows that distinguish the future from the past, and these are part of what constitutes or exemplifies time’s master arrow according to the extrinsic camp. This article has mentioned several, but there are others. These mini-arrows are deep and interesting asymmetries of nature, and philosophers and physicists would like to know how the mini-arrows are related to each other. Can some be used to explain others? This is the taxonomy problem.
Sean Carroll has a precisely expressed position on the taxonomy problem:
All of the macroscopic manifestations of the arrow of time…can be traced to the tendency of entropy to increase, in accordance with the Second Law of Thermodynamics.
So, that is the single thing that enables all these other asymmetries between past and future. The fact that entropy is increasing is the reason why there is an arrow of time. I would not say it is the arrow of time.
Not all members of the entropy camp agree with Carroll that entropy is the fundamental mini-arrow in terms of which all the other mini-arrows can be explained. See (Maudlin 2007) for more discussion of which of time’s mini-arrows can be explained in terms of which others.
The following sub-sections consider only three mini-arrows—the memory arrow, the cosmological arrow, and the causal arrow.
They do not discuss some of the other mini-arrows such as the arrow of quantum measurement or the action arrow or the electromagnetic arrow. The action arrow is shown by our ability to act on the future and not the past. The electromagnetic radiation arrow is shown by light leaving a candle flame rather than converging from all directions into the flame. In the late 1940s, Richard Feynman, as a Princeton graduate student, was able to convince physicists that this electromagnetic radiation is symmetric in time and so does not define an arrow of time. The candle light could have gone in reverse and still obeyed all the laws of electromagnetism, he argued.
a. The Memory Arrow
The memory mini-arrow or psychological mini-arrow shows itself in the fact that we remember the past and never the future. The most popular explanation of this mini-arrow appeals to entropy.
Stephen Hawking, for example, asserts that the psychological arrow is based on the entropy arrow. But this is a tricky conclusion to make. It is not usually argued, but simply stated as self-evident. Hawking says, “Disorder increases with time because we measure time in the direction in which disorder increases. You can’t have a safer bet than that!” This statement is an example of the logical fallacy known as…proof by emphatic assertion, by authority (Muller 2016, 182).
Despite Hawking’s reasoning, physicists are still generally convinced that the memory arrow is due to entropy increases, but the details are subtle. Past events often have present traces, but future events never have them. Remembering an event is a mental process that interrogates the brain’s stored trace of the event. It is reviewing but not re-viewing the event. The trace in the sand requires the sand to increase its entropy; but the trace in our brains requires our neuron structure to be more ordered and to decrease its entropy, although at the expense of increased entropy for the larger system of which it is a sub-system.
Adrian Bardon offers a summary of the principal account:
In forming a memory, we reconfigure our neurons. This creates a local increase in order (within parts of our brain responsible for memory), but only at the expense of a slight expenditure of energy, a dissipation of bodily heat, and an overall entropy increase. Therefore, on this account, the formation of memories is relative to the larger thermodynamic trend. Our brains getting themselves into better order happens within the context of the trend towards overall heat dissipation. In a universe where systems necessarily decrease in entropy, our brains couldn’t be getting themselves into better order. According to the theory, then, the psychological order is dependent on the entropic arrow—and is thus just as contingent as the entropic arrow (Bardon 2013, 121).
b. The Cosmological Arrow
In 1937, Arthur Eddington coined the phrase “cosmological arrow” as the name for the mini-arrow of time produced by the relentless increase in the volume of the universe over time.
The most-favored explanation of the cosmological mini-arrow, and of why it is directly correlated with the increase in time, involves dark energy. In 1998, cosmologists discovered the universal presence of dark energy. It exerts a small, negative, repulsive pressure on space, making space expand everywhere. For billions of years, the rate of this expansion, that is, the rate at which clusters of galaxies recede from each other, has slowly been increasing. The universe will never stop expanding because dark energy never dilutes away: when the volume doubles, so does the amount of dark energy (or so it is predicted, though this has never been experimentally or observationally established). Things might have started out differently, but they did not. This is the standard explanation of why there is a cosmological mini-arrow.
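The claim that dark energy never dilutes can be illustrated with a toy calculation (the numbers below are hypothetical, chosen only for illustration): ordinary matter is a fixed total amount spread over a growing volume, so its density falls, whereas dark energy is predicted to keep a constant density, so its total amount grows in proportion to the volume:

```python
# Sketch, not from the source: contrast how matter and dark energy
# behave as the volume of space grows. Matter dilutes; dark energy's
# density stays fixed, so doubling the volume doubles its total amount.

def total_amounts(volume, matter_total=100.0, dark_energy_density=2.0):
    """Return (matter density, total dark energy) at a given volume.

    matter_total and dark_energy_density are arbitrary illustrative units.
    """
    matter_density = matter_total / volume            # fixed total, so density falls
    dark_energy_total = dark_energy_density * volume  # fixed density, so total grows
    return matter_density, dark_energy_total

for v in (1.0, 2.0, 4.0):
    m_rho, de_tot = total_amounts(v)
    print(f"volume={v}: matter density={m_rho}, dark energy total={de_tot}")
```

Each doubling of the volume halves the matter density but doubles the total dark energy, which is why, on the standard picture, the repulsive push never weakens and the expansion never stops.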
The physicist Richard Muller argued that this cosmological arrow grounds time’s arrow. Muller is in the intrinsic camp. He said the problem of time’s arrow is really the problem of “why time flows forward rather than backward.” And: “The flow of time consists of the continuous creation of new moments, new nows, that accompany the creation of new space” during cosmic expansion. So, the arrow of time is cosmic expansion.
Almost all cosmologists believe the Big Bang’s expansion is an expansion only of 3-D space and not of 4-D space-time. Muller challenged this popular position. He said, “The progression of time can be understood by assuming that the Hubble expansion takes place in 4 dimensions rather than in 3.” This is a version of the growing-block theory.
The usual assumption in cosmology is that 3-D spatial expansion has no effect on the value of the universe’s entropy. According to Muller, this is not so. See his article, “Throwing Entropy under the Bus.” Penrose believes Muller’s proposal about entropy lacks promise. Penrose has said, “There is a common view that the entropy increase in the second law is somehow just a necessary consequence of the expansion of the universe…. This opinion seems to be based on…misunderstanding” (Penrose 2004, 701).
George Ellis promoted the cosmological arrow of spatial expansion as the key to understanding time’s arrow. He advocated an intrinsic theory of time’s arrow via a growing-block theory in which:
The cosmological direction of time…is set by the start of the universe. There is no mechanism that can stop or reverse the cosmological flow of time, set by the start of the universe. It sets the direction of flow of the time parameter t…; time starts at t = 0 and then increases monotonically…. The gravitational equations…are time symmetric (because the Einstein equations are time symmetric), but the actual universe had a start. This broke the time symmetry and set the master arrow of time: the universe is expanding, not contracting, because it started off from a zero-volume state. It had nowhere to grow but larger….
A ‘past condition’ cascades down from cosmological to micro scales, being realized in many microstructures and setting the arrow of time at the quantum level by top-down causation. This physics arrow of time then propagates up, through underlying emergence of higher-level structures, to geology, astronomy, engineering, and biology. …The overall picture that emerges is one of the arrow of time in physical systems being determined in a top-down manner, starting off from a special initial condition at the cosmological scale where the cosmological arrow of time sets the basic direction of causation, but then emerging in complex systems through bottom-up causation… (Ellis 2013).
c. The Causal Arrow
Noting that causes happen before their effects, some researchers have suggested that time’s arrow and its mini-arrows can be explained or defined in terms of the causal mini-arrow. This causal theory is a bold proposal for solving the taxonomy problem, in part because most philosophers believe causality is derivative and not fundamental.
Some philosophers believe that it is true by definition that causes precede their effects. Others disagree and say this definition inappropriately rules out backward causation because the existence or non-existence of backward causation should be an empirical matter, not a matter of definition. Everyone agrees, though, that normally causes do happen before their effects.
Tooley says that causes “fix” their effects in the sense of making them real. If so, are uncaused events real for some other reason? Do they fit into the temporal order, or are uncaused events not real because they have no causes to make them real?
One would like to know more specifically how cause-effect relations are tied to the arrow of time. There have been many suggestions. For example, in his 1956 book The Direction of Time, Hans Reichenbach advocated a causal theory of time. Like Leibniz, he believed time order reduces to causal order. Reichenbach believed that macroscopic causality produces a temporal ordering on events, although the ordering alone is insufficient for supplying time’s direction (that is, specifying which of the two possible orderings is actual). Think of a horizontal line. Its points are ordered from left to right but also ordered from right to left. Intrinsically the two orders have the same structure; one order is just as good as the other. What is needed in addition for distinguishing one direction in time from the other, says Reichenbach, is entropy flow in branch systems. His point is explained below. He does not rely on a hypothesis about entropy starting off at a minimum. For another causal theory of time, see chapters 10 and 11 of Real Time II by D.H. Mellor. For commentary on the effectiveness of the program of using causation to establish an ordering on time, see (Smart 1969) and “Time and Causation” by Mattias Frisch in (Dyke and Bardon 2013). Here are some highlights and background.
An important issue is whether causes exist at both the microlevel and macrolevel. The physicist Lee Smolin insists that time’s arrow is intrinsic to time and that causes exist at any scale, no matter how small. Sean Carroll disagrees. He argues that time’s arrow is extrinsic and that, at the microlevel, the fundamental laws of physics imply there is no distinction between past and future and thus no causality. Commenting upon the fact that particle physicists do speak of cause and effect when discussing the microlevel, Carroll says they are using a different notion of causality. Physicists usually mean only that signaling occurs no faster than the speed of light: a cause can have no possible local effect outside its forward light cone, nor can an effect have a possible local cause outside its backward light cone.
Many researchers have considered the concepts of cause and effect to be metaphysically dubious compared to the clearer concept of temporal order. Is it even appropriate to assume that the concept of cause is coherent? In the nineteenth century, the distinguished physicist Gustav Kirchhoff said the concept of cause is “infected with vagueness,” and Ernst Mach said the concept has no place in physics. In 1912, Bertrand Russell declared the concept to be a “relic of a bygone era” that is not useful in fundamental physics, and so physicists who aim to speak clearly about the foundations of physics should confine themselves to using differential equations and avoid causal discourse.
In the early twenty-first century, the philosophers Nancy Cartwright, David Albert, and Judea Pearl argued for at least the coherence and usefulness of causal discourse. Pearl remarked that, “Physicists write equations, but they talk cause and effect in the cafeteria.”
There is also the issue of whether the causal arrow is objective or subjective. The philosopher Huw Price (Price 1992) declared that causal discourse is “perspectival” or subjective, and causal order is not an objective asymmetry of nature. One implication of this is that backward causation is possible.
Assuming for the moment that the concept of causality is in fact coherent, consistent and an objective asymmetry in nature, how might it help us understand order relations for physical time? Some have said that to understand temporal precedence, it is sufficient to say:
Event C temporally precedes event E just in case event C could have been part of the cause of event E, the effect.
If this is correct, we can understand the “happens before” relation if we can understand the modal notion of “could have been part of the cause.” This proposal presupposes that we can be clear about how to distinguish causes from effects without relying on our knowledge of what happens before what. Can we? Mauro Dorato, among others, has argued that, “there is no physical property, attributable only to an event qua cause, that intrinsically (non-relationally) differentiates it from its effect” (Dorato 2000, S524). If causes can be distinguished from effects only by assuming causes happen before effects, then we have the fallacy of begging the question, which is a form of circular reasoning.
Here are some suggestions that have been offered to avoid this circularity. The first comes from Frank Ramsey, and it was adopted by Hans Reichenbach. Using a macroscopic concept of causality, we can know what causes what, independently of knowing whether causes happen before effects, by one of the following appeals:
Appeal to intervention:
Event C is a cause of event E if controlling C is an agent’s effective means of controlling E and not vice versa.
Appeal to probability:
One event is the cause of another if the appearance of the first event is followed with a high probability by the appearance of the second, and there is no third event that we can use to factor out the probability relationship between the first and second events and thereby declare the relationship to be spurious.
Appeal to conditional probability:
Fact C causes fact E if the chance of E, given C, is greater than the chance of E, given not-C.
Appeal to counterfactuals:
What “C causes E” means is that, if C had been different, but everything else had stayed the same, then E would have been different.
Appeal to possible worlds:
What “C causes E” means is that in a possible world like ours in which E doesn’t happen, C doesn’t happen.
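The probabilistic appeals above can be made concrete with a toy joint distribution (all the numbers and event names below are hypothetical, invented for illustration). The appeal to conditional probability says C causes E when the chance of E given C exceeds the chance of E given not-C; the appeal to probability adds that there must be no third event Z that, once factored out, makes the C-E relationship spurious. The sketch below builds a distribution in which a common cause Z raises the probability of both C and E, so C raises the chance of E unconditionally, yet conditioning on Z screens C off from E:

```python
# Sketch with hypothetical numbers: P(z, c, e) = P(z) * P(c|z) * P(e|z),
# so C and E are correlated only through the common cause Z.
from itertools import product

P_z = {True: 0.5, False: 0.5}
P_c_given_z = {True: 0.9, False: 0.1}  # Z makes C likely
P_e_given_z = {True: 0.9, False: 0.1}  # Z makes E likely

joint = {}
for z, c, e in product([True, False], repeat=3):
    pc = P_c_given_z[z] if c else 1 - P_c_given_z[z]
    pe = P_e_given_z[z] if e else 1 - P_e_given_z[z]
    joint[(z, c, e)] = P_z[z] * pc * pe

def prob(pred):
    """Probability of the set of outcomes satisfying pred(z, c, e)."""
    return sum(p for k, p in joint.items() if pred(*k))

# Unconditionally, C raises the chance of E (conditional-probability test passes).
p_e_given_c = prob(lambda z, c, e: c and e) / prob(lambda z, c, e: c)
p_e_given_not_c = prob(lambda z, c, e: (not c) and e) / prob(lambda z, c, e: not c)
print(p_e_given_c, p_e_given_not_c)  # 0.82 vs 0.18: C "raises the chance" of E

# But given Z, the C-E correlation vanishes: Z screens C off from E,
# so the appeal to probability declares the C-E relationship spurious.
p_e_given_c_and_z = prob(lambda z, c, e: z and c and e) / prob(lambda z, c, e: z and c)
p_e_given_z_only = prob(lambda z, c, e: z and e) / prob(lambda z, c, e: z)
print(abs(p_e_given_c_and_z - p_e_given_z_only) < 1e-9)  # True: spurious
```

This is the familiar barometer case in miniature: a falling barometer (C) raises the probability of a storm (E), but only because a drop in atmospheric pressure (Z) causes both.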
Philosophers of physics must assess whether any of these appeals succeed, perhaps with revision.
Some other scholars, such as Tim Maudlin, recommend not relying upon any of these appeals because causal order, and thus the distinction between cause and effect, is a primitive feature of the universe that cannot be defined or explained in terms of anything more fundamental. His claim successfully avoids the charge of circular reasoning, but it faces other problems involving how our knowledge of patterns of events, such as this kind of event being followed by that kind, ever produces our knowledge of causal relations.
For a detailed discussion of attempts to avoid the charge of circular reasoning when defining or explaining temporal precedence in terms of causes preceding their effects, see (Papineau 1996). See also (Woodward 2014). The philosophical literature on the nature of causation is voluminous, and here we touch briefly on only a few points, but a point of entry into the literature is this encyclopedia’s article on causation.
Can entropy increase be explained in terms of causality? Can the cosmological mini-arrow (the expansion of the universe) also be explained in terms of causality? These are difficult questions to answer positively, but some researchers are optimistic that this can be done as part of a broader program aimed at the taxonomy problem.
Even if temporal precedence can be explained in terms of causal order, there is an additional issue involving the intrinsic camp vs. the extrinsic camp. Many in the intrinsic camp say that once we have temporal precedence we can say the arrow is simply the transformation from past to future. Those in the extrinsic camp disagree. More is needed, they say. To explain the arrow, we need to explain why so many processes go one-way in time; they do not do this because of temporal precedence. Perhaps a story about entropy increase is required.
Instead of trying to define or explain time’s arrow in terms of the direction of causation, Ferrel Christensen suggested doing the reverse. Perhaps the features giving time its intrinsic arrow are what is responsible for the direction of causation. See (Christensen 1987) for more about this research program.
Or perhaps the features giving time its extrinsic arrow rather than intrinsic arrow are what is responsible for causation and its direction. That is a position taken by many members of the entropy camp. Sean Carroll offered a promissory note: “We should be able to trace the fact that causes precede effects to the fact that entropy is increasing over time” (with the help of the Past Hypothesis). He means all causes, not merely some causes, but only causes at the macroscopic level. As Carroll describes causality, one can distinguish causes from effects at the macrolevel because the causes have “leverage” over the future, and this does not work the other way in time. He explains leverage in terms of intervention by saying that causation occurs when a small change at one time produces a large change at a later time (and the converse fails). We are confident, he says, that intervening and making a small change to the effect would not have made a change to the cause, and this is so due to the presence of the arrow of time. Because of this leverage, we can say the small change is the cause and the large change is the effect. At the fundamental microscale, says Carroll, there are no causes and effects, just patterns of events that particles follow with no arrow of time being apparent. To summarize, Carroll’s position is that causes make sense only on the large scale, and causes must occur before their effects, and this is so because of the direction of time which in turn is due to entropy increases. He provides a sixty-second video of his argument at “Do Causes and Effects Really Exist?”.
Huw Price, who also is an advocate of a causal theory of time, has objected to Carroll’s position:
I argue that the asymmetry of causation cannot be reduced to any of the available physical asymmetries, such as the second law of thermodynamics. The basic problem for such a reduction is that the available physical asymmetries are essentially macroscopic, and therefore cannot account for causal asymmetry in microphysics (Price 1996, pp. 9-10).
Many physicists do not agree with Price’s assumption that there is causal asymmetry in microphysics. Brian Greene, for example, insists that a causal relationship is an emergent, macroscopic phenomenon.
6. Living with Arrow-Reversal
What would it be like to live with time’s arrow going in reverse to the way it actually does go? Before examining the many proposed answers to this question, one should keep in mind that the intrinsic and extrinsic camps disagree about the relationship between time reversal and arrow reversal. Those in the extrinsic camp are likely to say time never runs backward, but the arrow could run backward if all processes in the universe reversed and evolved the other way in time. Those in the intrinsic camp would be likely to disagree and say arrow reversal without time reversal is impossible.
Could we use a telescope to look back in time to some place and find the arrow there pointing in reverse compared to our arrow? What would that look like? The answer might depend on how we answer the question, “Is the arrow local or always global?” Sean Carroll said, “The thought experiment of an entire universe with a reversed arrow of time is much less interesting than that of some subsystem of the universe with a reversed arrow. The reason is simple: Nobody would ever notice…. If the whole thing ran in reverse, it would be precisely the same as it appears now.”
Roger Penrose disagreed. He claimed an entire universe with a reversed arrow of time is quite interesting. He said that, if we lived there, then we would ascribe teleological effects to omelets assembling themselves into unbroken eggs or water droplets distributed across the floor along with nearby broken shards of glass assembling themselves into an unbroken glass of water. According to Penrose,
‘Look!’, we would say, ‘It’s happening again. That mess is going to assemble itself into another glass of water!’
There is a significant amount of empirical evidence that some processes in distant galaxies unfold in the same time direction as they do here on Earth, and there is no contrary empirical evidence. For example, light signals are received only after they are sent, never before. Nevertheless, Horwich said: “I will defend the idea that the ‘directional character’ of time might vary from one region to another” (Horwich 1987 42). Boltzmann and Reichenbach tried to define the arrow locally, so they, too, supported the idea that the arrow could point in different directions in different regions.
How about the “directional character” of time pointing in different directions for different persons? Ferrel Christensen said:
Conceivably, then, the earlier-later asymmetry of common experience is limited to our region of time or of space. Indeed, suppose it were so highly spatially localized that different persons could have opposite time-senses: then one would remember events which for another are still in the future (Christensen 1987 232-3).
In 1902 in his Appearance and Reality, the British idealist philosopher and member of the intrinsic camp F.H. Bradley said that when time runs backward, “Death would come before birth, the blow would follow the wound, and all must seem irrational.” The philosopher J.J.C. Smart disagreed about the irrationality. He said all would seem as it is now because memory would become precognition, so an inhabitant of a time-reversed region would feel the blow and then the wound, just as in our normal region.
Stephen Hawking, also in the extrinsic camp with Smart, suggested in 1988 in A Brief History of Time:
Suppose, however, that God decided that…disorder would decrease with time. You would see broken cups gathering themselves together and jumping back onto the table. However, any human beings who were observing the cups would be living in a universe in which disorder decreased with time. I shall argue that such beings would have a psychological arrow of time that was backward. That is, they would remember events in the future, and not remember events in the past.
Hilary Putnam investigated the possibility of communication between our region of space with a normal arrow and a region with a reversed arrow:
Suppose…there is a solar system X in which the people “live backwards in time” (relative to us). Then if we go and spy on these people (bringing our own radiation source, since their sun sucks radiation in, and doesn’t supply it), we will see the sort of thing we see when we watch a motion picture run backwards…. It is difficult to talk about such extremely weird situations without deviating from ordinary idiomatic usage of English. But this difficulty should not be mistaken for a proof that these situations could not arise.
Tim Maudlin disagreed with Putnam and argued that there is a convincing argument that these situations could not arise. Assuming naturalism and the supervenience of the mental on the physical, and introducing a creative use of the asterisk symbol, Maudlin said:
[G]iven the actual sequence of physical states of your body over the last ten minutes, the time-reversed sequence of time-reversed states is also physically possible…. Let’s call this sequence of states your time-reversed Doppelgänger. […Introducing an asterisk notation] I will speak of the Doppelgänger’s neuron*s; these are just the bits of the Doppelgänger which correspond, under the obvious mapping, to the original’s neurons. …[T]he physical processes going on the Doppelgänger’s brain* are quite unlike the processes going on in a normal brain. …The visual system* of the Doppelgänger is also quite unusual: rather than absorbing light from the environment, the retina*s emit light out into the environment. …In every detail, the physical processes going on in the Doppelgänger are completely unlike any physical processes we have ever encountered or studied in a laboratory, quite unlike any biological processes we have ever met. We have no reason whatsoever to suppose that any mental state at all would be associated with the physical processes in the Doppelgänger. Given that the Doppelgänger anti-metabolizes, etc., it is doubtful that it could even properly be called a living organism (rather than a living* organism*), much less conscious living organism.
Norbert Wiener claimed any attempt to communicate between the normal region and the arrow-reversed region would “ruin everything” because one of the regions would rapidly collapse—the one that is very delicately balanced so that the entropy flows in reverse compared to our region. Sean Carroll agreed. A microstate that leads to entropy decrease is extraordinarily unstable under small perturbations, and entropy increase would take over again very quickly.
Sean Carroll proposed an argument against there actually being any time-reversed regions. Throughout the universe, cosmic rays continually hurtle from one region into another, so if there were a time-reversed region, it would continually be encountering cosmic rays, but those rays would be anti-particles relative to that region. However, any encounter between particles and anti-particles creates large releases of energy, much larger than the energy arriving on Earth from any distant star. Those large releases have never been observed, but they would have been if they existed.
7. References and Further Reading
Albert, David Z. 2000. Time and Chance. Harvard University Press. Cambridge, MA.
A technical treatise surveying the philosophical and physical issues involving time’s direction. The book never uses the word “arrow.” Reading between the lines, Albert holds that time has no intrinsic arrow, and that the arrow is basically due to processes taking place over time, but he is not optimistic that all the mini-arrows can be explained in terms of entropy change. On p. 11, Albert defines what it is for something to happen backward. On p. 20, he says, “classical electrodynamics is not time-reversal invariant.” Chapter 4 introduces his Past-Hypothesis, which he calls a “fundamental…law of nature.” Albert describes a connection between the problem of time’s direction and the measurement problem in quantum mechanics.
Arntzenius, Frank. 1997. “Mirrors and the Direction of Time.” Philosophy of Science, December, Vol. 64, Supplement. Proceedings of the 1996 Biennial Meetings of the Philosophy of Science Association. Part II: Symposia Papers. The University of Chicago Press, pp. S213-S222.
Challenges an argument he had made two years earlier that if even one of the laws of nature is not time-reversal symmetric, then that is all that is required for us to infer that time has an objective direction. Assumes familiarity with quantum mechanics.
Arntzenius, Frank and Hilary Greaves. 2009. “Time Reversal in Classical Electromagnetism,” The British Journal for the Philosophy of Science, Volume 60, Number 3, pp. 557-584.
Surveys the debate between David Albert and David Malament regarding what time-reversal means, especially whether it always means reversing the order of states in a trajectory.
Augustynek, Zdzisław. 1968. “Homogeneity of Time,” American Journal of Physics, 36, pp. 126-132.
A discussion of the physical equivalence of all time’s instants and the principles of invariance and symmetry involving time. The author worries about whether the principles of time symmetry are analytic or synthetic. Is time’s symmetry tautological or empirical? Explains why the principle of time’s symmetry implies, via Noether’s Theorem, the principle of the conservation of energy. Aimed at an audience of professional physicists. In Noether’s 1935 New York Times obituary, Einstein said, “In the judgment of the most competent living mathematicians, Fräulein Noether was the most significant creative mathematical genius thus far produced since the higher education of women began.”
Barbour, Julian B. 2020. The Janus Point: A New Theory of Time. Basic Books, New York.
Contains an argument that the Past Hypothesis is a necessary consequence of a new fundamental law of the universe yet to be discovered.
Bardon, Adrian. 2013. A Brief History of the Philosophy of Time. Oxford University Press.
Chapter five offers a brief analysis of the relationships among the psychological arrow, the causal arrow, and the entropic arrow.
Baron, Samuel, Kristie Miller and Jonathan Tallant. 2022. Out of Time: A Philosophical Study of Timelessness. Oxford University Press.
It explores the possibility that time does not exist, and it challenges the position promoted by Craig Callender and others that science tells us what time really is.
Black, Max. 1959. “The Direction of Time.” Analysis, Vol. 19, No. 3, pp. 54-63.
Contains this philosopher’s proposal to explain the direction of time in terms of the objectivity of the truth values of ordinary language statements involving the temporal relation is-earlier-than.
Bourne, Craig. 2002. “When Am I? A Tense Time for Some Tense Theorists?” Australasian Journal of Philosophy, 80, 359-371.
Criticizes the growing-block model for its inability to distinguish our own objective present.
Braddon-Mitchell, David and Kristie Miller. 2017. “On Time and the Varieties of Science.” Boston Studies in the Philosophy and History of Science, vol. 326, pp. 67-85.
A study of how physics and the other sciences should work together to understand time. The authors say, “The special sciences…tell us where, amongst a theory of the physical world, we should expect to locate phenomena such as temporality; they tell us what it would take for there to be time. Physical theory tells us whether there is anything like that in the world and what its hidden nature is.”
Broad, Charlie Dunbar. 1923. Scientific Thought. London: Kegan Paul.
C.D. Broad describes the moving spotlight metaphor and defends a version of the growing-block theory.
Broad, Charlie Dunbar. 1938. Examination of McTaggart’s Philosophy, Volume II. Cambridge University Press.
Examines McTaggart’s proposals, including the existence of a universal now. Oaklander has written extensively on Broad’s treatment of time and how it changed during his lifetime. Broad’s 1938 position is considered to be his clearest and most defensible treatment.
Callender, Craig. 1998. “Review: The View from No-When” in The British Journal for the Philosophy of Science, Vol. 49, March. pp. 135-159.
This is a review of Huw Price’s book Time’s Arrow and Archimedes’ Point: New Directions for the Physics of Time. He says Price aims to answer the question: What does the world look like when we remove the effects of our temporally asymmetric prejudices?
Callender, Craig. 1999. “Reducing Thermodynamics to Statistical Mechanics: The Case of Entropy.” Journal of Philosophy vol. 96, pp. 348-373.
Examines the issue of how to explain thermodynamics in terms of statistical mechanics. The techniques of statistical physics are needed when systems are so complicated that statistical features are more useful than exact values of the variables; for example, temperature, the statistical feature of average kinetic energy, is more useful than knowledge of the position of this or that molecule at a given time. From 1902 to 1905, Einstein worked unsuccessfully to derive the Second Law of Thermodynamics from basic physical features.
Callender, Craig. 2004. “There is No Puzzle About the Low Entropy Past.” In Contemporary Debates in Philosophy of Science, edited by C. Hitchcock, pp. 240-55. Malden: Wiley-Blackwell.
Explores some critical comments made about the Past Hypothesis.
Callender, Craig. 2017. What Makes Time Special? Oxford University Press, Oxford, U.K.
A comprehensive monograph on the relationship between the manifest image of time and the scientific image. He claims philosophers who defend parts of the manifest image have created all sorts of technical models (that is, theories) that try to revise and improve the scientific image. According to Callender, “These models of time are typically sophisticated products and shouldn’t be confused with manifest time. Instead, they are models that adorn the time of physics with all manner of fancy temporal dress: primitive flows, tensed presents, transient presents, ersatz presents, Meinongian times, existent presents, priority presents, thick and skipping presents, moving spotlights, becoming, and at least half a dozen different types of branching! What unites this otherwise motley class is that each model has features that allegedly vindicate core aspects of manifest time. However, these tricked out times have not met with much success” (p. 29). Chapter 11 is devoted to the flow of time.
Carroll, Sean. 2008. “The Cosmic Origins of Time’s Arrow.” Scientific American.
Describes the thermodynamic arrow and speculates that to solve the problem of the direction of time, one should accept a multiverse in which in some universes time runs in reverse to how it runs in ours.
Carroll, Sean. 2010. From Eternity to Here: The Quest for the Ultimate Theory of Time. Dutton/Penguin Group: New York.
A popular, lucid, and deep presentation of what can be learned from current science about the nature of time. Of all Carroll’s popular publications, this is the one that has the most to say about the arrow of time. See especially chapter 15.
Carroll, Sean. 2016, The Big Picture. Dutton/Penguin Random House. New York.
“The parts address how a complex universe can emerge from basic physical laws, how we can discover these laws, what we already know about them, and what implications they have for the evolution of life, and for consciousness, and for human values,” says David Kordahl in his review in The New Atlantis. Carroll explains how entropy can rise even as a system becomes less complex.
Carroll, Sean. 2019. “Sean Carroll on Causality and the Arrow of Time, ” FQXI Foundational Questions Institute, August 21. Available on YouTube.
He sketches his program to explain how entropy increase can explain the causal arrow. He admits that his explanation is still a work in progress.
Carroll, Sean. 2020. “Why Boltzmann Brains are Bad,” in Current Controversies in Philosophy of Science, 1st Edition, edited by Shamik Dasgupta, Ravit Dotan, and Brad Weslake. Routledge. pp. 7-20.
Argues that theories predicting Boltzmann Brains cannot simultaneously be true and justifiably believed.
Carroll, Sean. 2022a. “The Arrow of Time in Causal Networks.” U.C. Berkeley Physics Colloquium, April 22. YouTube https://www.youtube.com/watch?v=6slug9rjaIQ.
Discussion of how the thermodynamic arrow can explain the causal arrow. This talk is aimed at mathematical physicists.
Carroll, Sean. 2022b. “Ask Me Anything,” Mindscape podcasts, April AMA and May AMA. https://www.preposterousuniverse.com/podcast/.
Carroll discusses time’s having no fundamental or intrinsic arrow, why we need to adopt the Past Hypothesis, and how to define the term “arrow of time.”
Carroll, Sean. 2022c. The Biggest Ideas in the Universe: Space, Time, and Motion. Dutton/Penguin Random House.
A sophisticated survey of what modern physics implies about space, time, and motion, especially relativity theory without quantum mechanics. There is some emphasis on the philosophical issues. Introduces the relevant equations, but it is aimed at a general audience and not physicists. Chapter Five on Time is highly recommended for disentangling the meaning of time reversibility from the meaning of time reversal symmetry. Advocates the extrinsic theory of time’s arrow in terms of entropy.
Castagnino, Mario and Olimpia Lombardi. 2009. “The Global Non-Entropic Arrow of Time: From Global Geometrical Asymmetry to Local Energy Flow,” Synthese, vol. 169, no. 1 July, pp. 1-25.
Challenges the claim that time’s arrow should not be explicated in terms of entropy. The authors’ goal is to show how to define a global arrow of time from the geometrical properties of space-time and how this arrow can be “transferred to the local level, where it takes the form of a non-spacelike local energy flow that provides the criterion for breaking the symmetry resulting from the time-reversal invariant laws of local physics.”
Christensen, Ferrel. 1987. “Time’s Error: Is Time’s Asymmetry Extrinsic?” Erkenntnis, March, pp. 231-248.
Examination of whether time’s arrow is intrinsic or extrinsic. He claims, “there are no very strong arguments in favor of the view that time is only extrinsically anisotropic. Moreover, there are some serious arguments in opposition to the claim.” He is in the intrinsic camp, but he says the concept of time passing is nonsensical.
Dainton, Barry. 2010. Time and Space, Second Edition. McGill-Queen’s University Press.
An easy-to-read textbook that surveys the major philosophical issues about time and offers many arguments. It is not primarily about time’s arrow. Regarding time’s arrow, Dainton suggests the goal is “defining the direction of time in terms of entropy” (p. 49) rather than explaining the direction in terms of entropy.
Davies, Paul C. W. 1974. The Physics of Time Asymmetry. University of California Press. Berkeley and Los Angeles.
A survey by a proponent of the extrinsic theories of time.
Deng, Natalja M. 2017. “On ‘Experiencing Time’: A Response to Simon Prosser,” Inquiry: An Interdisciplinary Journal of Philosophy 61(3), pp. 281-301.
A chapter-by-chapter critique of (Prosser 2016). Explores the psychology of time.
Dieks, Dennis. 1986. “Physics and the Direction of Causation,” Erkenntnis, vol. 25, no. 1, July, pp. 85-110.
Explores how physics can recognize the direction of causation.
Dieks, Dennis. 2012. “The Physics and Metaphysics of Time,” European Journal of Analytic Philosophy, pp. 103-119.
Surveys the physics and metaphysics of time and argues in favor of the B-theory over the A-theory. Confronts the claim that physics needs to be revised to account for the arrow and the claim that the B-theory cannot give an accurate description of our temporal experiences.
Dorato, Mauro. 2000. “Becoming and the Arrow of Causation.” Philosophy of Science, Vol. 67, Supplement: Proceedings of the 1998 Biennial Meetings of the Philosophy of Science Association, Part II: Symposia Papers, September, pp. S523-S534.
The author focuses on what would be required to establish the objectivity of becoming. He recommends solving the taxonomy problem by taking causation to be the main philosophical asymmetry, in Horwich’s sense of that term, among the philosophical asymmetries of trace, knowledge, explanation, action, counterfactual dependence, and our subjective sense of the passage of time.
Dorato, Mauro. 2006. “Absolute Becoming, Relational Becoming and the Arrow of Time: Some Non-Conventional Remarks on the Relationship Between Physics and Metaphysics,” Studies in History and Philosophy of Modern Physics, vol. 37, no. 3, pp. 559-576. Reprinted in (Oaklander 2008).
Provides an in-depth analysis of becoming. Argues that the arrow of becoming is more fundamental than the arrow of entropy change. And he asserts that, because the conceptual link between becoming and the issue of the direction of time requires regarding the asymmetry of causation as fundamental, such an asymmetry cannot turn out to be merely extrinsically correlated to irreversible physical processes.
Dyke, Heather and Adrian Bardon, eds. 2013. A Companion to the Philosophy of Time. Wiley-Blackwell.
A collection of academic articles on a wide variety of issues in the philosophy of time.
Earman, John. 1974. “An Attempt to Add a Little Direction to ‘The Problem of the Direction of Time.’” Philosophy of Science. 41: 15-47.
Comments on the role of semantic ambiguity in discussions of time’s arrow. Speculates on life in a time-reversed world. Argues that the arrow of time is an intrinsic feature of space-time.
Earman, John. 2002. “What Time Reversal Invariance Is and Why It Matters.” International Studies in the Philosophy of Science, 16, 245-264.
Explains how the time reversal operator must be defined differently in different situations.
Earman, John. 2006. “The ‘Past Hypothesis’: Not Even False.” Studies in History and Philosophy of Modern Physics 37, 399-430.
Criticizes the Past Hypothesis and the view that the asymmetry of entropy can be explicated through its role within cosmological theories.
Earman, John. 2008. “Reassessing the Prospects for a Growing Block Model of the Universe,” International Studies in the Philosophy of Science 22, 135-164.
Explains the growing-block model and examines arguments for and against it. Hermann Minkowski invented the block model in 1908. His block contains not all the future events that might happen but rather all the future events that will happen.
Ellis, George. 2013. “The Arrow of Time and the Nature of Space-Time.” Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 44 (3): 242-262.
Promotes a growing block theory with top-down causation due to a past condition of the universe. What he means by a past condition is the idea that global conditions determine the arrow of time by top-down causation. His ideas are developed with the tools of quantum field theory.
Falk, Dan. 2008. In Search of Time: The History, Physics, and Philosophy of Time. St. Martin’s Griffin. New York.
A popular survey by a reliable guide.
Farr, Matt and Alexander Reutlinger. 2013. “A Relic of a Bygone Age? Causation, Time Symmetry and the Directionality Argument.” Erkenntnis 78, Supplement 2, pp. 215-235.
An assessment of Russell’s argument that the time symmetry of fundamental physics is inconsistent with the time asymmetry of causation.
Freundlich, Yehudah. 1973. “‘Becoming’ and the Asymmetries of Time,” Philosophy of Science, Vol. 40, No. 4., pp. 496-517.
Examines the senses in which time’s arrow is mind-dependent, and the relationship between the possible asymmetries of phenomenological and physical time. He says, “We find that physical time acquires meaning only through phenomenological time, and that phenomenological time is fundamentally asymmetric. …The central thesis of this paper will be that merely to differentiate between appearance and reality is implicitly to assume a directed flow of time [from past to future]. …The focal point of any phenomenalist position is the assertion that the meaningful content of any physical statement is exhausted by the claims that statement makes as regards the ways we are appeared to.”
Frisch, Mathias. 2013. “Time and Becoming” in Dyke and Bardon 2013.
Endorses the dynamic theory and develops the causal theory.
Frisch, Mathias. 2014. Causal Reasoning in Physics. Cambridge: Cambridge University Press.
Explores the variety of issues involved in using causal reasoning in physics, including the relationship of the causal mini-arrow to other mini-arrows.
Grandjean, Vincent. 2022. The Asymmetric Nature of Time: Accounting for the Open Future and the Fixed Past. Synthese Library, volume 468. Springer. https://link.springer.com/book/10.1007/978-3-031-09763-8.
This book develops and defends a version of the growing-block theory.
Greene, Brian. 2004. The Fabric of the Cosmos: Space, Time, and the Texture of Reality. Alfred A. Knopf. New York.
A leading theoretical physicist provides a popular introduction to cosmology, relativity, string theory, and time’s arrow.
Greene, Brian. 2020. “Your Daily Equation #30: What Sparked the Big Bang?” May 20. https://www.youtube.com/watch?v=7QkT7evF2-E.
Describes repulsive gravity and cosmic inflation. Presupposes the viewer’s facility with partial differential equations.
Grünbaum, Adolf. 1973. Philosophical Problems of Space and Time. Second edition. Alfred A. Knopf. New York.
His views on time’s arrow are chiefly presented in the two chapters “The Anisotropy of Time,” and “Is the Coarse-Grained Entropy of Classical Statistical Mechanics an Anthropomorphism?” The first edition of 1963 was expanded in 1973 with new material.
Horwich, Paul. 1987. Asymmetries in Time: Problems in the Philosophy of Science. The MIT Press. Cambridge.
An analysis of many theories of time’s arrow. Horwich claims there is no intrinsic difference between the past and the future. Time itself is symmetric and does not itself have an arrow. David Hume was correct, says Horwich, in asserting that causes happen before their effects only because of our human convention about what those words mean. Horwich has a unique solution to the taxonomy problem that gives special weight to the knowledge mini-arrow and its explanation in terms of the fork asymmetry. This book is written for experts in the field.
Hutten, Ernest H. 1959. “Reviewed Work(s): The Direction of Time by H. Reichenbach.” Philosophy, Vol. 34, No. 128, January, pp. 65-66.
Briefly summarizes the main themes in Reichenbach’s causal theory of time. Hutten believes Reichenbach makes several serious, irreparable mistakes in his argument.
Ismael, Jenann T. 2017. “Passage, Flow, and the Logic of Temporal Perspectives,” in Time of Nature and the Nature of Time: Philosophical Perspectives of Time in Natural Sciences. Ed. by Christophe Bouton and Philippe Huneman. Boston Studies in the History and Philosophy of Science 326. Springer International Publishing. Pp. 23-38.
A careful examination of some of the formal features of temporal perspectives such as time’s passage. Explores the logic of the content of temporal experience rather than of the quality of that experience.
Kajimoto, Naoyuki and Kristie Miller and James Norton. 2020. “Primitive Directionality and Diachronic Grounding,” Acta Analytica, pp. 195-211.
Considers how to defend the claim that time’s directionality is primitive by using the concept of grounding.
Katz, Bernard D. 1983. “The Identity of Indiscernibles Revisited,” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 44, July, pp. 37-44.
Explores the difficulties of distinguishing intrinsic properties from extrinsic properties.
Lazarovici, Dustin, and Peter Reichert. 2015. “Typicality, Irreversibility and the Status of Macroscopic Laws.” Erkenntnis 80:4, pp. 689-716.
Examines how entropy and the second law have been misunderstood in the academic literature. Considers theories that imply the Past Hypothesis.
Le Poidevin, Robin. 2007. The Images of Time: An Essay on Temporal Representation, Oxford University Press. Oxford.
Explores temporal representation and temporal phenomenology.
Lebowitz, Joel L. 1993. “Boltzmann’s Entropy and Time’s Arrow.” Physics Today. September. Pp. 32-38.
A popular treatment of the entropy arrow aimed at high school and college undergraduate physics teachers.
Lees, J. P., et al. 2012. “Observation of Time-Reversal Violation in the B0 Meson System.” Physical Review Letters, 109, 211801.
SLAC National Accelerator Laboratory at Stanford University reported the first direct test of time-reversal symmetry without any dependence on charge reversal or parity reversal. B0 mesons failed to decay time-reversibly via the weak interaction.
Explores the possibility of explaining time’s arrow in terms of the causal arrow.
Lewis, David. 1983. “New Work for a Theory of Universals,” Australasian Journal of Philosophy 61, pp. 343-377.
Explores the difficulties in distinguishing intrinsic properties from extrinsic properties.
Loew, Christian. 2018. “Fundamentality and Time’s Arrow.” Philosophy of Science, vol. 85, July, p. 483.
Develops and defends Maudlin’s views on time. Claims the intrinsic arrow is needed to explain why there is an asymmetry of entropy change. He says, “My goal is to flesh out a way of understanding the idea that time has an intrinsic direction that can underwrite explanation.”
Matthews, Geoffrey. 1979. “Time’s Arrow and the Structure of Space-Time,” Philosophy of Science, vol. 46, pp. 82-97.
Argues that time’s arrow is a local rather than a global feature of the universe.
Maudlin, Tim. 2002. “Remarks on the Passing of Time,” Proceedings of the Aristotelian Society, New Series, Vol. 102, pp. 259-274. Oxford University Press.
Defends his claim that the passage of time is an intrinsic asymmetry in the structure of space-time itself, an asymmetry that is metaphysically independent of the material contents of space-time such as the entropy gradient. Focuses not on the positive reasons to accept his claim but rather on the negative program of undermining arguments given against his claim.
Maudlin, Tim. 2007. The Metaphysics Within Physics. Oxford University Press. Oxford.
Argues that time passes. Maudlin says the fundamental laws of nature and the direction of time require no philosophical analysis, but causation does. He says space-time has a fundamental, intrinsic, inexplicable temporal direction, and this explains why earlier states produce later states but not vice versa. He objects to the Humean program of analysis in philosophy of physics that implies (1) “laws are nothing but patterns in the physical state of the world,” and (2) the direction of time is “nothing but a matter of how physical systems are disposed to behave throughout the universe” (which is the extrinsic theory). Maudlin advocates a non-Humean primitivist approach to both the fundamental laws and time’s arrow.
Mellor, D. H. 1991. “Causation and the Direction of Time.” Erkenntnis, vol. 35, pp. 191–203.
A defense of a causal theory of time.
Mellor, D. H. 1995. The Facts of Causation. Routledge.
An influential analysis of the concept of causation that emphasizes singular causation, the causation of one fact by another.
Miller, Kristie. 2013. “Presentism, Eternalism, and the Growing Block,” in (Dyke and Bardon 2013, 345-364).
A careful presentation of the three main ontologies of time, followed by an investigation of whether advocates of the ontologies are engaged in merely semantic disagreements and are “talking past” each other. The pros and cons of each ontology are considered.
Miller, Kristie. 2019. “The Cresting Wave: A New Moving Spotlight Theory.” Canadian Journal of Philosophy, 49, pp. 94-122.
A revision of the moving spotlight theory that adds a cresting wave of causal efficacy. Miller is not a temporal dynamist.
Miller, Kristie and A. Holcombe and A.J. Latham. 2020. “Temporal Phenomenology: Phenomenological Illusion versus Cognitive Error.” Synthese 197, pp. 751–771.
Our temporal phenomenology is our experience of temporal properties and relations such as order, succession, duration, and passage. The article defends the claim that a person can make a cognitive error in saying it seems to them that time passes because they fail to make a careful distinction between “how actual time is taken to be” and “a representation of what it is to be time: of what is essential to time.” Investigates how we represent time in all possible worlds.
Miller, Kristie and John Norton. 2021. “If Time Can Pass, Time Can Pass at Different Rates,” Analytic Philosophy Vol. 62, March, pp. 21–32.
Offers an explication of the notion of time passing and considers whether it always must pass at the same rate.
Muller, Richard A. 2016. Now: The Physics of Time, New York: W. W. Norton & Co.
Helpful introduction to the physics of time for philosophers. Its main original idea is that the arrow of time is not due to entropy increase but is only correlated with it. The relevant chapter is titled “Throwing Entropy under the Bus.” The arrow is due to the cosmic expansion of space-time, he says.
Muller, Richard A. and Shaun Maguire. 2016. “Now and the Flow of Time,” arXiv. https://arxiv.org/pdf/1606.07975.pdf.
An original argument for why the thermodynamic arrow is not the fundamental arrow of time. The progression of time can be understood, they say, by assuming the flow of time consists of the continuous creation of new moments, new nows, that accompany the creation of new space. This is a version of the growing block theory of time.
Musser, George. 2017. “A Defense of the Reality of Time,” Quanta Magazine. May 16. https://www.quantamagazine.org/a-defense-of-the-reality-of-time-20170516/.
A condensed interview with Tim Maudlin.
North, Jill. 2002. “What Is the Problem about the Time-Asymmetry of Thermodynamics? A Reply to Price.” The British Journal for the Philosophy of Science, Vol. 53, No. 1, March, pp. 121-136.
Commentary and critique of the positions taken by Huw Price. “Price argues that there are two conceptions of the puzzle of the time-asymmetry of thermodynamics. He thinks this puzzle has remained unsolved for so long partly due to a misunderstanding about which of these conceptions is the right one and what form a solution ought to take.” North argues that it is Price’s understanding that is mistaken.
North, Jill. 2008. “Two Views on Time Reversal.” Philosophy of Science, Vol. 75, No. 2, April, pp. 201-223.
Clearly addresses the issue of what we could and should mean by time reversal in the context of classical physics.
North Jill. 2009. “The ‘Structure’ of Physics: A Case Study,” Journal of Philosophy, vol. 106, pp. 57–88.
North asks what a fundamental theory of physics says about the structure of the world when the theory has two significantly different mathematical formulations, such as Newton’s mechanics in its Lagrangian and Hamiltonian versions. Each of the two has its own system of coordinates and equations of motion. North considers scenarios in which the two versions of a theory indicate different structures of the world itself versus when they indicate simply two different descriptions of the same underlying structure.
Oaklander, L. Nathan. 1985. “A Reply to Schlesinger.” The Philosophical Quarterly, Vol. 35, No. 138, January, pp. 93-94.
Criticizes the moving-now theory that was presented by Schlesinger. One criticism is that the theory is committed to the claim that the same NOW applies to all times, but that impales the theory on the horns of a dilemma: it is either incoherent or circular.
Oaklander, L. Nathan. 2008. Editor. The Ontology of Time. Routledge.
A collection of diverse, but influential, articles on the major issues about time.
Papineau, David. 1996. “Philosophy of Science,” in The Blackwell Companion to Philosophy edited by Nicholas Bunnin and E. P. Tsui-James, Blackwell Publishers Inc.: Oxford. pp. 290-324.
Discusses a variety of attempts to escape the circularity problem that arises in trying to define or explain time’s arrow.
Penrose, Oliver. 2001. “The Direction of Time.” in Chance in Physics: Foundations and Perspectives, edited by J. Bricmont, D. Dürr, M. C. Galavotti, G. C. Ghirardi, F. Petruccione and N. Zanghi. Springer Verlag.
Adopts an extrinsic theory of time’s arrow. Argues that Reichenbach’s principle of the common cause is the proper approach to understanding the time direction of asymmetric processes. Presumes a familiarity with advanced mathematics and physics.
Penrose, Roger. 1989. The Emperor’s New Mind: Concerning Computers, Minds, and The Laws of Physics. Oxford University Press: Oxford. Reprinted with corrections in 1990.
A wide-ranging, popular physics book that contains speculations on living in a time-reversed world plus other philosophical commentary by this future Nobel Prize winner.
Penrose, Roger. 2004. The Road to Reality: A Complete Guide to the Laws of the Universe. Alfred A. Knopf: New York.
An expert in general relativity, Penrose provides an advanced presentation of all the most important laws of physics, interspersed with philosophical comments.
Pooley, Oliver. 2013. “Relativity, the Open Future, and the Passage of Time,” Proceedings of the Aristotelian Society, CXIII part 3: 321-363.
Discusses whether time passes according to the B-theory and the theory of relativity.
Price, Huw. 1992. “The Direction of Causation: Ramsey’s Ultimate Contingency,” PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association. Pp. 253-267.
Following Frank Ramsey, Price argues that the best way to account for causal asymmetry is to consider it to be a feature that agents project onto the world.
Price, Huw. 1996. Time’s Arrow & Archimedes’ Point: New Directions for the Physics of Time. Oxford University Press, Inc.: Oxford.
This book is filled with clear expository material, but Price also offers much that is original. He is interested in more clearly separating subjective asymmetries from objective asymmetries. He argues that, objectively, time has no arrow, it has no direction, it does not flow, the future can affect the past, and philosophers of physics need to adopt an Archimedean point of view outside of time in order to discuss time in an unbiased way. Although Price admits we cannot literally step outside time, we can take the Archimedean point of view of “nowhen,” from which we can view the timeless block universe and see that time’s arrow is inherently anthropomorphic, as is the directionality of causation. Makes a case that many arguments trying to show how and why temporal order exists presuppose that temporal order exists. He claims there is no good reason to rule out the possibility that what happens now can depend on what happens in the future, at least for microphysical phenomena. The book is reviewed in (Callender 1998).
Price, Huw. 2002. “Boltzmann’s Time Bomb,” The British Journal for the Philosophy of Science, March, Vol. 53, No. 1, pp. 83-119.
Price agrees that statistical arguments alone do not give us good reason to expect that entropy will always continue to increase because a past hypothesis is also needed. He clarifies the different assumptions being made in attempts to explain thermodynamic asymmetry, and he emphasizes that if thermodynamics were not time-asymmetric this could be for two different reasons: “The world might exhibit entropic gradients in both temporal directions, without a global temporal preference, at least on the large scale. For example, there might be a single long period of entropy ‘increase’, ‘followed’ by a matching period of entropy ‘decrease’.” Or, instead, “entropy might be approximately constant everywhere, and at all times” (pp. 89-90). Price’s position is challenged in (North 2002).
Prosser, Simon. 2016. Experiencing Time. Oxford University Press, Oxford.
Covers a broad range of topics in the interface between the philosophy of mind and the philosophy of time. Claims the A-theory is unintelligible. Says it is impossible to experience the passage of time. Argues that the present is not an ontologically privileged time. But see (Deng 2017).
Reichenbach, Hans. 1956. The Direction of Time. University of California Press: Berkeley.
An influential, technical treatise on the causal theory of time. One noteworthy feature is that it tries to establish the direction of time using the concept of entropy but without relying on a past hypothesis. Reichenbach died before he could write the book’s final chapter, which was to be on the relationship between time’s objective properties and our subjective experience of time.
Roberts, Brian W. 2022. Reversing the Arrow of Time, Cambridge University Press.
This philosopher of physics surveys the issues involved in understanding the arrow of time. He tries to debunk the extrinsic theory’s claim that the many mini-arrows commonly dealt with in the taxonomy problem are in fact arrows, and he argues that classical thermodynamics is not a temporally asymmetric theory. He maintains that there is “an asymmetry in the structure of time itself, which…holds independently of the contingent facts about matter and energy.”
Rovelli, Carlo. 2018. The Order of Time. Riverhead Books. New York.
A popular presentation of this physicist’s many original ideas about time. Claims that both entropy and the Past-Hypothesis are human-dependent.
Russell, Bertrand. 1913. “On the Notion of Cause,” Proceedings of the Aristotelian Society 13, pp. 1-26.
Russell argues that causes and causal relations should not be part of the fundamental physical description of the world.
Salmon, W. 1971. Statistical Explanation and Statistical Relevance. University of Pittsburgh Press: Pittsburgh.
Salmon argues that a good definition of causation should ward off counterexamples due to common causes.
Savitt, Steven F. 1991. “Asymmetries in Time: Problems in the Philosophy of Science by Paul Horwich,” Canadian Journal of Philosophy, Volume 21, no. 3, pp. 399-417.
A review of (Horwich 1987).
Savitt, Steven F., ed. 1995. Time’s Arrows Today: Recent Physical and Philosophical Work on the Direction of Time. Cambridge University Press. Cambridge.
A collection of independent research papers by distinguished investigators of the topic.
Schlegel, Richard. 1968. Time and the Physical World. Dover Publications, Inc., New York. This Dover reprint was originally published by Michigan State University Press in 1961.
This book in the philosophy of physics compares the manifest image to the scientific image of physical time as it was understood in 1961.
Schlesinger, George. 1985. “How to Navigate the River of Time”, The Philosophical Quarterly. Vol. 35, No. 138. January. Pp. 91-92.
Presents and defends the moving-now theory. He agrees that time seems to everyone to pass. Schlesinger defends the claim that time can change its rate. In the same journal two years earlier, Oaklander had claimed Schlesinger’s position is incoherent.
Sklar, Lawrence. 1974. Space, Time, and Spacetime. University of California Press: Berkeley, CA.
Attacks the intrinsic theory of the arrow. Surveys the causal theory of time and various theories of time’s arrow. Pages 379-394 describe the changes in Boltzmann’s thinking about the second law.
Skow, Bradford. 2009. “Relativity and the Moving Spotlight,” The Journal of Philosophy 106, pp. 666-678.
Argues that the moving spotlight theory is consistent with special relativity, particularly with its implication that the present or the NOW is relative to a reference frame.
Skow, Bradford. 2011. “Experience and the Passage of Time.” Philosophical Perspectives, 25, Metaphysics, pp. 359-387.
An examination of the argument that the best explanation of our experience is that time passes. Focuses on the moving spotlight theory which he believes is the best A-theory of time.
Skow, Bradford. 2012. “One Second Per Second.” Philosophy and Phenomenological Research 85, pp. 377-389.
Analyzes various arguments for and against the coherence of the phrase “a rate of one second per second,” and describes its multiple interpretations.
Smart, J.J.C. 1949. “The River of Time.” Mind, October, Vol. 58, No. 232, pp. 483-494.
Provides a variety of arguments against the intrinsic theory of time’s arrow. He emphasizes that things change, and events happen, but events do not change except when they are described too vaguely. This article emphasizes the analysis and clarification of ordinary language using the techniques of “logical grammar.”
Smart, J.J.C. 1967. “Time” in The Encyclopedia of Philosophy, ed. by Paul Edwards, volume 8. Macmillan Publishing Co., Inc. & The Free Press: New York, pp. 126-134.
A survey of philosophical issues about time from a member of the extrinsic camp. The views of the intrinsic camp are not given much attention.
Smart, J.J.C. 1969. “Causal Theories of Time,” The Monist, Vol. 53, No. 3. July. Pp. 385-395.
Criticizes some of the early causal theories of time from Hans Reichenbach, Adolf Grünbaum, and Henryk Mehlberg.
Thébault, Karim P. Y. 2021. “The Problem of Time,” in Routledge Companion to the Philosophy of Physics, edited by Eleanor Knox and Alastair Wilson. Routledge: London.
Explores the representation of time in classical, relativistic, and quantum physics. Not written at the easy intellectual level of the present encyclopedia article, but it provides a broad, accurate introduction to the problem of time and its academic literature.
Tooley, Michael. 1997. Time, Tense, and Causation. Oxford University Press, Clarendon Press: Oxford.
A growing-block model. In his dynamic and tensed theory of time, facts are tenseless states of affairs that come into existence, never to go out of existence. Causation is primitive, and the theory of relativity needs modification to allow for our common present. Tooley believes that both the tenseless theory and the standard tensed theory are false.
Wallace, David. 2013. “The Arrow of Time in Physics,” in (Dyke and Bardon 2013).
In this chapter, Wallace concentrates on the arrow of time as it occurs in physics. He explores how the arrow can exist even though there is time symmetry in thermodynamics and statistical mechanics. He provides a broad discussion of arrows other than the entropy arrow.
Wallace, D. 2017. “The Nature of the Past Hypothesis,” in The Philosophy of Cosmology, edited by K. Chamcham, et al. Cambridge University Press: Cambridge, pp. 486–99.
Explores the controversies about the acceptability of the Past Hypothesis and which formulation of it is appropriate.
Williams, Donald C. 1951. “The Myth of Passage,” Journal of Philosophy, volume 48, July 19, pp. 457-472.
Influential argument that time’s passage is a myth.
Woodward, James. 2014. “A Functional Account of Causation; or, A Defense of the Legitimacy of Causal Thinking by Reference to the Only Standard That Matters—Usefulness (as Opposed to Metaphysics or Agreement with Intuitive Judgment),” Philosophy of Science, volume 81, pp. 691–713.
A technically demanding treatment of the question “How does causation fit with physics?” and of the impact an appropriate answer has for understanding the metaphysical, descriptive, and functional roles of causation.
Zimmerman, Dean. 2005. “The A-Theory of Time, the B-Theory of Time, and ‘Taking Tense Seriously’.” Dialectica volume 59, number 4, pp. 401-457.
In exploring the issues mentioned in the title, Zimmerman considers different versions of the spotlight theory, those with a real future and those without, those having events shedding their A-properties and those without this shedding.
Zimmerman, Dean. 2011. “Presentism and the Space-Time Manifold”, in The Oxford Handbook of Philosophy of Time, ed. C. Callender, pp. 163–244, Oxford: Oxford University Press.
Considers how to reconcile presentism with relativity theory by finding a privileged reference frame.
Author Information
Bradley H. Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.
1. What Are Durations, Instants, Moments, and Points of Time?
In informal conversation, an instant or moment is a very short duration. In physics, however, an instant is even shorter. It is the briefest of moments. It is instantaneous, having a zero duration. This is perhaps what the poet T.S. Eliot was thinking of when he said, “History is a pattern of timeless moments.” Midnight is an instant in this sense.
Does an instant exist across space or is it more limited? In classical physics, an instant is considered to be a three-dimensional spatial object, namely a ‘snapshot’ of the universe. This is the sense of the word instant meant by a determinist who says the state of the universe at one instant determines the state of the universe at any later instant. This is a Leibnizian notion of what a state is and what an instant is. The 21st century notion of a state at an instant is defined by physicists more exotically in terms of light cones, but that notion is not explored further in this supplement or the other supplements or the main Time article.
Time is a composite object. It is assumed in all currently accepted fundamental theories of physics that any interval of time is a linear continuum of the points of time that compose it, but it is an interesting philosophical question to ask how physicists know time is a continuum. Nobody could ever measure time that finely, even indirectly. Points of time cannot be detected. That is, there is no physically possible way to measure that the time is exactly noon even if it is true that the time is noon. Noon is 12 to an infinite number of decimal places, and no measuring apparatus is infinitely precise, and no measurement fails to have a margin of error. But given what we know about points, we should not be trying to detect points of anything. Belief in the existence of points of time is justified holistically by appealing to how they contribute to scientific success, that is, to how the points give our science extra power to explain, describe, predict, and enrich our understanding. In order to justify belief in the existence of points, we need confidence that our science would lose too many of these virtues without the points. One reason given for this confidence is that without points existing we could not use calculus to describe change in nature.
Consider what a point in time really is. Any interval of time is a real-world model of a segment of the real numbers in their normal order. So, each instant corresponds to just one real number and vice versa. To put this another way, time is a line-like structure on sets of point-events. Just as the real numbers are an actually infinite set of decimal numbers that can be linearly ordered by the less-than relation, so time is an actually infinite set of instants or instantaneous moments that can be linearly ordered by the happens-before relation in a single reference frame. An instant or moment can be thought of as a set of point-events that are simultaneous in a single reference frame.
Although some philosophers differ on this, all physicists would claim that a moment is not able to change because change is something that is detectable only by comparing different moments.
There is a deep philosophical dispute about whether points of time actually exist, just as there is a similar dispute about whether spatial points actually exist. The dispute began when Plato said, “[T]his queer thing, the instant, …occupies no time at all….” (Plato 1961, p. 156d). Some philosophers do not wish to allow point-events and point-times. They want to make do with intervals, and want an instant always to have a positive duration. The philosopher Michael Dummett, in (Dummett 2000), said time is not made of point-times but rather is a composition of overlapping intervals, each having a non-zero duration. Dummett required the endpoints of those intervals to be the initiation and termination of actual physical processes. This idea of treating time without instants developed from a 1936 proposal of Bertrand Russell and Alfred North Whitehead. The central philosophical issue about Dummett’s treatment of motion is whether its adoption would negatively affect other areas of mathematics and science. It is likely that it would. For the history of the dispute between advocates of point-times and advocates of intervals, see (Øhrstrøm and Hasle 1995). The term interval in the phrase spacetime interval is a different kind of interval.
Even if time is made of points, it does not follow that matter is. It sometimes can be a useful approximation to say an electron or a quark is a point particle, but it remains an approximation. They are really spatially-extended vibrations of quantized fields.
2. What Is an Event?
In the manifest image, the universe is more fundamentally made of objects than events. In the scientific image, the universe is more fundamentally made of events than objects.
But the term event has multiple senses. There is sense 1 and sense 2. In ordinary discourse, one uses sense 1 in which an event is a happening lasting some duration during which some object changes its properties. For example, this morning’s event of buttering the toast is the toast’s changing from having the property of being unbuttered this morning to having the property of being buttered later this morning.
The philosopher Jaegwon Kim, among others, claimed that an event should be defined as an object’s having a property at a time. So, two events are the same if they are both events of the same object having the same property at the same time. This suggestion captures sense 1 of our informal concept of event, but with Kim’s suggestion it is difficult to make sense of the remark, “The vacation could have started an hour earlier.” On Kim’s analysis, the vacation event could not have started earlier because, if it did, it would be a different event. A possible-worlds analysis of events might be the way to solve this problem of change.
Physicists do sometimes use the term event this way, but they also use it differently—in what we here call sense 2—when they say events are point-events or regions of point-events often with no reference to any other properties of those events, such as their having the property of being buttered toast at that time. The simplest point-event in sense 2 is a location in spacetime with zero volume and zero duration. The set of all these events is called spacetime. Hopefully, when the term event occurs, the context is there to help disambiguate sense 1 from sense 2. For instance, when an eternalist says our universe is a block of events, the eternalist means the universe is the set of all point-events with their actual properties.
To a non-quantum physicist, any physical process is all its point-events plus the values of all their intrinsic properties. For example, the process of a ball’s falling down is a continuous, infinite series of point-events along the path in spacetime of the ball. One of those events would be this particular point piece of the ball being at a specific spatial location at some specific time. The reason for the qualification about “non-quantum” is discussed at the end of this section.
The physicists’ notion of point-event in real, physical space (rather than in mathematical space) is metaphysically unacceptable to some philosophers, in part because it deviates so much from the way the word event is used in ordinary language and in our manifest image. That is, sense 2 deviates too much from sense 1. For other philosophers, it is unacceptable because of its size, its infinitesimal size. In 1936, in order to avoid point-events altogether in physical space, Bertrand Russell and A. N. Whitehead developed a theory of time that is based on the assumption that every event in spacetime has a finite, non-zero duration. They believed this definition of an event is closer to our common sense beliefs, which it is. Unfortunately, they had to assume that any finite part of an event is also an event, and this assumption indirectly appeals to the concept of the infinitesimal and so is no closer to common sense than the physicist’s assumption that all events are composed of point-events.
McTaggart argued early in the twentieth century that events change. For example, he said the event of Queen Anne’s death is changing because it is receding ever farther into the past as time goes on. Many other philosophers (those of the so-called B-camp) believe it is improper to consider an event to be something that can change, and that the error is in not using the word change properly. This is still an open question in philosophy, but physicists use the term event and the term change as the B-theorists do, namely as something that does not change.
In non-quantum physics, specifying the state of a physical system at a time involves specifying the masses, positions and velocities of each of the system’s particles at that time. Not so in quantum mechanics. The simultaneous precise position and velocity of a particle—the key ingredients of a classical particle event—do not exist according to quantum physics. The more precise the position is, the less precise is the velocity, and vice versa. This feature is described by Heisenberg’s Uncertainty Principle that is discussed in the other supplement. Also, many physicists consider the notion of event in physics to be emergent at a higher scale from a more fundamental lower scale that has no events. The philosopher David Wallace, among others, has emphasized this idea.
The ontology of quantum physics is very different from that of non-quantum physics. The main Time article intentionally downplays this. But, says the physicist Sean Carroll, “at the deepest level, events are not a useful concept,” and one should focus on the wave function.
More than half the physicists in the first quarter of the twenty-first century believed that a theory of quantum gravity will require (1) quantizing time, (2) having time or spacetime be emergent from a more fundamental entity, or (3) having only a finite maximum number of events that can occur in a finite volume. Current relativity theory and quantum theory have none of these three features.
For more discussion of what an event is, see the article on Events.
3. What Is a Reference Frame?
A reference frame is a standard viewpoint or perspective chosen by someone to display quantitative measurements about places of interest in a space plus the phenomena that take place there. It is not an objective feature of nature. To be suited for its quantitative purpose, a reference frame needs to include a coordinate system, that is, a system of assigning numerical locations or ordered sets of numerical locations to points of the space. If the space is physical spacetime, then each point needs to be assigned at least four numbers, three for its location in space and one for its location in time. These numbers are called “coordinates.” For every coordinate system, every point-event in spacetime has three spatial coordinate numbers and one time coordinate number. It is a convention that we usually choose the time axis to be straight rather than some other shape, but this is not required; on the globe we use longitudes as coordinate lines, and these are neither straight nor parallel.
Choosing a coordinate system requires selecting some point to be called the system’s “origin” and selecting the appropriate number of coordinate axes that orient the frame in the space. You need at least as many axes as there are dimensions to the space. To add a coordinate system to a reference frame for a space is to add an arrangement of reference lines to the space so that all points of space have unique names. It is often assumed that an observer is located at the origin, but this is not required; it is sufficient to treat the frame “as if” it had an observer. The notion of a reference frame is modern; Newton did not know about reference frames.
The name of a point in a two-dimensional space is an ordered set of two numbers (the coordinates). If a Cartesian coordinate system is assigned to the space, then a point’s coordinate is its signed distance projected along each axis from the origin point, and the axes are straight and mutually perpendicular. The origin is customarily named “(0,0).” For a four-dimensional space, a point is named with a set of four numbers. A coordinate system for n-dimensional space is a mapping from each point to an ordered set of its n coordinate numbers. The most useful numbers to assign as coordinates are real numbers because real numbers enable us to use the techniques of calculus and because their use makes it easy to satisfy the helpful convention that nearby points have nearby coordinates.
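The mapping from points to ordered sets of coordinate numbers, and the Cartesian way of measuring distance, can be sketched in a few lines of Python. This is a hypothetical illustration, not part of any physics library: a point is just a tuple of real numbers, and the Cartesian metric is the n-dimensional Pythagorean theorem.

```python
import math

def euclidean_distance(p, q):
    """Distance between two points named by Cartesian coordinates.

    In a Cartesian system this is the n-dimensional Pythagorean
    theorem: the square root of the sum of squared coordinate
    differences along each axis.
    """
    assert len(p) == len(q), "points must belong to the same n-dimensional space"
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# A 2-D example, with the origin named (0, 0):
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

The same function works unchanged for a four-dimensional space, since a coordinate system for an n-dimensional space is just a mapping from each point to its n coordinate numbers.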
Physicists usually suggest that time is like a line. This means time is composed of durationless instants and the set of instants have a linearly-ordered structure under the happens-before relation, so time is like what mathematicians call “the continuum” and what non-mathematicians call “a line.”
When we speak of the distance between two points in a space, we implicitly mean the distance along the shortest path between them because there might be an infinite number of paths one could take. If a space has a coordinate system, then it has an infinite number of available coordinate systems because there is an unlimited number of choices for an origin, or an orientation of the axes, or the scale.
There are many choices for kinds of reference frames, although the Cartesian coordinate system is the most popular. Its coordinate axes are straight lines and are mutually perpendicular. Assuming Euclidean geometry (and so no curvature of space), the equation of the circle of radius one centered on the origin of a Cartesian coordinate system is x² + y² = 1. This same circle has a very different equation if a polar coordinate system is used instead.
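A short Python sketch (hypothetical, for illustration only) shows that the two descriptions pick out the same set of points: in polar coordinates the unit circle is simply r = 1, and converting any such polar point back to Cartesian coordinates satisfies x² + y² = 1.

```python
import math

# In Cartesian coordinates the unit circle is x**2 + y**2 == 1.
# In polar coordinates the same circle is simply r == 1; which
# equation a figure gets depends on the coordinate system chosen.
for theta in (0.0, math.pi / 6, math.pi / 2, 1.234):
    r = 1.0                    # the polar description of the circle
    x = r * math.cos(theta)    # polar -> Cartesian conversion
    y = r * math.sin(theta)
    assert abs(x ** 2 + y ** 2 - 1.0) < 1e-12
```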
Reference frames can be created for physical space, or for time, or for spacetime, or for things having nothing to do with real space and time. One might create a two-dimensional (2-D) Cartesian coordinate system with one coordinate axis for displaying the salaries of a company’s sales persons and a second coordinate axis for displaying their names. Even if the space represented by the coordinate system were to be real physical space, its coordinates would not be physically real: you can add two coordinate numbers, but you cannot add two points of the space. From this fact it can be concluded that not all the mathematical structures in the coordinate system are also reflected in what the system represents. These extraneous mathematical structures are called “mathematical artifacts.”
Below is a picture of a reference frame spanning a space that contains a solid ball. The coordinates are not confined to the surface of the ball but also cover the surrounding space. What we have here is a 3-dimensional Euclidean space that uses a Cartesian coordinate system with three mutually perpendicular axes. The space contains a 3-dimensional (3-D) solid ball:
The origin of the coordinate system is at the center of the ball, and the y-axis intersects the north pole and the south pole. Two of the three coordinate axes intersect the blue equator at specified places. The red line represents a typical longitude, but this longitude is not a coordinate axis. The three coordinates of any point in this space form an ordered triple (x,y,z) of the point’s x, y, and z coordinates. Thinking of the ball as the globe, there are points on the Earth, inside the Earth, and outside the Earth. For 3-D space, the individual coordinates normally would be real numbers. For example, we might say a point of interest deep inside the ball (the Earth) has the three coordinates (4.1,2,0), where it is assumed all three numbers have the same units, such as meters. It is customary in a three-dimensional space to label the three axes with the letters x, y, and z, so (4.1,2,0) means that 4.1 meters is the x-coordinate of the point, 2 meters is its y-coordinate, and 0 meters is its z-coordinate. The center of the Earth in this graph is located at the origin of the coordinate system; the origin of a frame has the coordinates (0,0,0). Mathematical physicists frequently suppress talk of the units and speak of 2 being the y-coordinate, although strictly speaking the y-coordinate is 2 meters. The x-axis is all the points (x,0,0); the y-axis is all the points (0,y,0); and the z-axis is all the points (0,0,z), for all possible values of x, y, and z.
In a coordinate system, the axes need not be mutually perpendicular, but in order to be a Cartesian coordinate system, the axes must be mutually perpendicular, and the coordinates of a point in spacetime must be the values along axes of the perpendicular projections of the point onto the axes. All Euclidean spaces can have Cartesian coordinate systems. If the space were the surface of the sphere above, not including its insides or outside, then this two-dimensional space would be a sphere, and it could not have a two-dimensional Cartesian coordinate system because all the axes could not lie within the space. The 2D surface could have a 3D Cartesian coordinate system, though. This coordinate system was used in our diagram above. A more useful coordinate system might be a 3D spherical coordinate system. Space and time in the theory of special relativity are traditionally represented by a frame with four independent, real coordinates (t,x,y,z).
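The contrast can be made concrete with a hypothetical Python sketch: a spherical coordinate system names a point on the sphere’s 2-D surface with just two angles, whereas a Cartesian system needs three numbers for the same point because its straight axes cannot lie within the curved surface.

```python
import math

def surface_point(lat, lon, R=1.0):
    """Convert spherical surface coordinates (two angles, in radians)
    into the three Cartesian coordinates that the same point requires
    when described from the embedding 3-D Euclidean space."""
    x = R * math.cos(lat) * math.cos(lon)
    y = R * math.cos(lat) * math.sin(lon)
    z = R * math.sin(lat)
    return (x, y, z)

# Every surface point sits at distance exactly R from the origin:
x, y, z = surface_point(0.7, 2.1)
assert abs(math.sqrt(x * x + y * y + z * z) - 1.0) < 1e-12
```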
Changing from one reference frame to another does not change any phenomenon in the real world being described with the reference frame, but is merely changing the perspective on the phenomena. If an object has certain coordinates in one reference frame, it usually has different coordinates in a different reference frame, and this is why coordinates are not physically real—they are not frame-free. Durations are not frame-free. Neither are positions, directions, and speeds. An object’s speed is different in different reference frames, with one exception. The upper limit on the speed of any object in space satisfying the principles of special relativity is c, the speed of light in a vacuum. This claim is not relative to a reference frame. This speed c is the upper limit on the speed of transmission from any cause to its effect. This c is the c in the equation E = mc2. It is the speed of any particle with zero rest mass such as a photon. The notion of speed of travel through spacetime rather than through space is usually considered by physicists not to be sensible. Whether the notion of speed through time also is not sensible is a controversial topic in the philosophy of physics. See the main Time article’s section “The Passage or Flow of Time” for a discussion of whether it is sensible.
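The frame-relativity of speed, and the single exception for c, can be illustrated with the standard special-relativistic rule for composing collinear velocities, w = (u + v)/(1 + uv/c²). The Python below is a hypothetical sketch with units chosen so that c = 1.

```python
def add_velocities(u, v, c=1.0):
    """Special-relativistic composition of collinear velocities.

    An object moving at speed u in one frame moves at
    (u + v) / (1 + u*v/c**2) in a second frame that moves at -v
    relative to the first. Units are chosen so that c = 1.
    """
    return (u + v) / (1 + u * v / c ** 2)

print(add_velocities(0.5, 0.5))  # 0.8: ordinary speeds are frame-relative
print(add_velocities(1.0, 0.5))  # 1.0: light's speed is the same in every frame
```

Notice that no choice of v can push the result past c, which is why c is the frame-independent upper limit on speeds through space.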
The word reference is often dropped from the phrase reference frame, and the terms frame and coordinate system are often used interchangeably. A frame for the physical space in which a particular object always has zero velocity is called the object’s rest frame or proper frame.
A reference frame is a possible viewpoint. When choosing to place a frame upon a space, there are an infinite number of legitimate choices. Choosing a frame carefully can make a situation much easier to describe. For example, suppose we are interested in events that occur along a highway. We might orient the z-axis by saying it points up away from the center of Earth, while the x-axis points along the highway, and the y-axis is perpendicular to the other two axes and points across the highway. If events are to be described, then a fourth axis for time would be needed, but its units would be temporal units and not spatial units. It usually is most helpful to make the time axis be perpendicular to the three spatial axes, and to require successive seconds along the axis to be the same duration as seconds of the standard clock. By applying a coordinate system to spacetime, a point of spacetime is specified uniquely by its four independent coordinate numbers, three spatial coordinates and one time coordinate. The word independent implies that knowing one coordinate of a point gives no information about the point’s other coordinates.
Coordinate systems of reference frames have to obey rules to be useful in science. No accepted theory of physics allows a time axis to be shaped like a figure eight. Frames need to honor the laws if they are to be perspectives on real events. For all reference frames allowed by relativity theory, if a particle collides with another particle, they must collide in all allowed reference frames. Relativity theory does not allow reference frames in which a particle of light is at rest. Quantum mechanics does. A frame with a time axis in which your shooting a gun is simultaneous with your bullet hitting a distant target is not allowed by relativity theory. Informally, we say it violates the fact that causes occur before their effects in all legitimate reference frames for relativity theory. Formally, we say it violates the light cone structure required by relativity theory.
How is the time axis oriented in the world? This is done by choosing t = 0 to be the time when a specific event occurs such as the Big Bang, or the birth of Jesus. A second along the t-axis usually is required to be congruent to a second of our civilization’s standard clock, especially for clocks not moving with respect to that clock.
A space with a topology defined on it and having any number of dimensions is called a manifold. Newtonian mechanics, special relativity, general relativity, and quantum theory all require the set of all events (in the sense of possible space-time locations) to form a four-dimensional manifold. Informally, what it means to be four-dimensional is that each point cannot be specified with less than four independent numbers. Formally, the definition of dimension is somewhat complicated.
Treating time as a special dimension of spacetime is called spatializing time, and doing this is what makes time precisely describable mathematically in a way that treating time only as becoming does not. It is a major reason why mathematical physics can be mathematical.
One needs to be careful not to confuse the features of time with the features of the mathematics used to describe time. Einstein admitted [see (Einstein 1982) p. 67] that even he often made this mistake of failing to distinguish the representation from the object represented, and it added years to the time it took him to create his general theory of relativity.
Times are not numbers, but time coordinates are. If the laws of physics are time-translation symmetric, which is the normal assumption, then the laws of mathematical physics are invariant under the group of transformations of the time coordinate t expressed by t → t + t0, where t0 is an arbitrarily chosen constant real number.
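This invariance can be exhibited concretely in a hypothetical Python sketch: for a time-translation-symmetric law such as dv/dt = −g, every prediction depends only on durations, so relabeling all time coordinates by t → t + t0 changes nothing.

```python
G = 9.8  # gravitational acceleration near Earth's surface, in m/s^2

def velocity_change(t_start, t_end):
    # For the law dv/dt = -G, the change in velocity over an interval
    # depends only on the duration, never on where t = 0 was placed.
    return -G * (t_end - t_start)

t0 = 1000.0  # relabel every time coordinate: t -> t + t0
assert velocity_change(2.0, 5.0) == velocity_change(2.0 + t0, 5.0 + t0)
```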
Some features of reality are relative to a reference frame and some are not. Duration is relative. Distance is relative. Acceleration and spacetime interval are not. The speed of light in a vacuum is not.
4. Curved Space and Cartesian Coordinates
According to general relativity theory, space curves near all masses. Here are the three main types of geometries for representing curvature:
Euclidean geometry.
Hyperbolic geometry.
Elliptical geometry.
The following diagram shows how the three, when viewed from a higher dimension, differ in curvature and in the character of their parallel lines, circles, and triangles:
The geometry of a space exists independently of whatever coordinate system is used to describe it, so one has to take care to distinguish what is a real feature of the geometry from what is merely an artifact of the mathematics used to characterize the geometry.
A Cartesian coordinate system can handle all sorts of curved paths and curved objects, but it fails whenever the space itself curves. What we just called “the space” could be real physical space or an abstract mathematical space or spacetime or just time.
Any Euclidean space can have a Cartesian coordinate system. A reference frame fixed to the surface of the Earth cannot have a Cartesian coordinate system covering all the surface because the surface curves and the space is therefore not Euclidean. Spaces with a curved geometry require curvilinear coordinate systems, in which the axes curve as seen from a higher-dimensional Euclidean space in which the lower-dimensional space is embedded. This higher dimension can be real or unreal.
If the physical world were two-dimensional and curved like the surface of a sphere, then a two-dimensional Cartesian coordinate system for that space must fail to give coordinates to most places in the world. To give all the points of the 2D world their own Cartesian coordinates, one would need a 3D Cartesian system, and each point in the world would be assigned three coordinates, not merely two. For the same reason, if we want an arbitrary point in our real, curving 4D-spacetime to have only four coordinates and not five, then the coordinate system must be curvilinear and not Cartesian. But what if we are stubborn and say we want to stick with the Cartesian coordinate system and we don’t care that we have to bring in an extra dimension and give our points of spacetime five coordinates instead of four? In that case we cannot trust the coordinate system’s standard metric to give correct answers.
Let’s see why this is so. Although the coordinate system can be chosen arbitrarily for any space or spacetime, different choices usually require different metrics. Suppose the universe is two-dimensional and shaped like the surface of a sphere when seen from a higher dimension. The 2D sphere has no inside or outside; the extra dimension is merely for our visualization purposes. Then when we use the 3D system’s metric, based on the 3D version of the Pythagorean Theorem, to measure the spatial distance between two points in the space, say, the North Pole and the equator, the value produced is too low. The correct value is higher because it is along a longitude and must stay confined to the surface. The 3D Cartesian metric says the shortest line between the North Pole and a point on the equator cuts through the Earth and so escapes the universe, which indicates the Cartesian metric cannot be correct. The correct metric would compute distance within the space along a geodesic line (a great circle in this case such as a longitude) that is confined to the sphere’s surface.
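The numerical gap between the two metrics is easy to exhibit. In this hypothetical Python sketch the sphere’s radius is set to 1: the straight-line chord from the North Pole to the equator comes out as √2 ≈ 1.414, while the surface-confined geodesic distance is π/2 ≈ 1.571.

```python
import math

R = 1.0  # radius of the 2-D spherical universe (for visualization only)

north_pole = (0.0, 0.0, R)
equator_point = (R, 0.0, 0.0)

# 3-D Cartesian metric: the straight-line chord, which cuts through
# the sphere and so leaves the 2-D universe entirely.
chord = math.sqrt(sum((a - b) ** 2 for a, b in zip(north_pole, equator_point)))

# Intrinsic metric: distance along a great circle confined to the
# surface; pole to equator is a quarter of a great circle.
geodesic = (math.pi / 2) * R

assert chord < geodesic  # the Cartesian metric's answer is too low
```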
The orbit of the Earth around the Sun is curved in 3D space, but “straight” in 4D spacetime. The scare quotes are present because the orbit is straight only in the sense that a geodesic is straight. A geodesic path between two points of spacetime is a path of extremal spacetime interval between the points; for the timelike paths followed by material objects, this is the path of greatest elapsed proper time, not the shortest.
One could cover a curved 4D-spacetime with a special Cartesian-like coordinate system by breaking up the spacetime into infinitesimal regions, giving each region its own Cartesian coordinate system, and then stitching the coordinate systems all together where they meet their neighbors. The stitching produces what is customarily called an atlas. Each point would have its own four unique coordinates, but when the flat Cartesian metric is used to compute intervals, lengths, and durations from the coordinate numbers of the atlas, the values will be incorrect.
Instead of considering a universe that is the surface of a sphere, consider a universe that is the surface of a cylinder. This 2D universe is curved when visualized from a 3D Euclidean space in which the cylinder is embedded. Surprisingly, it is not intrinsically curved at all. The measures of the three angles of any triangle sum to 180 degrees. Circumferences of its circles always equal pi times their diameters. We say that, unlike the sphere, the surface of a cylinder is extrinsically curved but intrinsically flat.
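One way to see the intrinsic flatness: distances within the cylinder’s surface can be computed by mentally unrolling the cylinder onto a flat plane and then using ordinary plane geometry. The Python below is a hypothetical sketch of that procedure.

```python
import math

def cylinder_distance(theta1, z1, theta2, z2, R=1.0):
    """Intrinsic (surface) distance on a cylinder of radius R.

    Unroll the cylinder onto a plane: the arc length R * theta becomes
    an ordinary straight coordinate, and plane geometry (the
    Pythagorean theorem) gives the answer. That this works at all is
    what it means for the surface to be intrinsically flat.
    """
    d_theta = abs(theta2 - theta1) % (2 * math.pi)
    d_theta = min(d_theta, 2 * math.pi - d_theta)  # shorter way around
    return math.hypot(R * d_theta, z2 - z1)

# Straight up the side of the cylinder: pure plane geometry.
assert cylinder_distance(0.0, 0.0, 0.0, 5.0) == 5.0
```

No such unrolling is possible for a sphere without stretching or tearing, which is why the sphere, unlike the cylinder, is intrinsically curved.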
For a more sophisticated treatment of reference frames and coordinates, see Coordinate Systems. For an introduction to the notion of curvature of space, see chapter 42 in The Feynman Lectures on Physics by Richard Feynman.
5. What Is an Inertial Frame?
Galileo first had the idea that motion is relative. If you are inside a boat with no windows and are floating on a calm sea, you cannot tell whether the boat is moving. Even if it is moving, you won’t detect this inside a closed cabin of the boat, say, by seeing a dropped ball curve as it falls or by feeling a push on yourself or seeing all the flies near you being pushed to the back of the room. Galileo believed steady motion is motion relative to other objects, and there is no such thing as simply motion relative to nothing, or motion relative to fixed, absolute space. Newton disagreed.
Newton believed in absolute motion. This is motion of an object that is not dependent upon its relations with any other object. Newton would say an inertial frame is a reference frame moving at constant velocity relative to absolute space. Today we are more likely to say it is a frame in which momentum is conserved and observers fixed in the frame feel no forces acting on them.
An inertial observer is someone who feels weightless, as if they are floating. They feel no acceleration and no gravitational field, yet all the laws of physics apply to this observer as they do to anything else.
Einstein described an inertial frame as a reference frame in which Newton’s first law of motion holds. Newton’s first law says an isolated object, that is, an object affected by no total extrinsic force, has a constant velocity over time. It does not accelerate. In any inertial frame, any two separate objects that are moving in parallel and coasting along with no outside forces on them, will remain moving in parallel forever. Einstein described his special theory of relativity in 1905 by saying it requires the laws of physics to have the same form in any inertial frame of reference. Einstein’s general theory is not restricted to inertial frames.
According to the general theory of relativity, there are no global inertial reference frames at all because Newton’s first law is not strictly true globally. It holds to an acceptable degree of approximation in some restricted regions that are sufficiently far away from masses.
Newton’s first law can be thought of as providing a definition of the concept of zero total external force; an object has zero total external force if it is moving with constant velocity. In the real world, no objects behave this way; they cannot be isolated from the force of gravity. Gravity cannot be turned off, and so Newton’s first law fails, and there are no inertial frames. But the first law does hold approximately. That is, it holds well enough for various purposes in many situations. It holds in any infinitesimal region. In larger regions, if spacetime curvature can be ignored for a certain phenomenon of interest, then one can find an inertial frame for the phenomenon. A Cartesian coordinate system fixed to Earth usually will serve adequately as an inertial frame for describing cars on a race track or describing the flight of a tennis ball, but not for describing a rocket’s flight from Paris to Mars. A coordinate frame for space that is fixed on the distant stars and is used by physicists only to describe phenomena far from any of those stars, and far from planets, and far from other massive objects, is very nearly an inertial frame in that region. Given that some frame is inertial, any frame that rotates or otherwise accelerates relative to this first frame is non-inertial.
Newton’s theory requires a flat, Euclidean geometry for space and for spacetime. Special relativity requires a flat Euclidean geometry for space but a flat, non-Euclidean geometry for spacetime. General relativity allows all these but also allows curvature for spacetime as well as space. If we demand that our reference frame’s coordinate system span all of spacetime, then a flat frame does not exist for the real world, just as a 2D plane cannot cover the surface of a sphere. The existence of gravity requires there to be curvature of space around any object that has mass, thereby making a flat frame fail to span some of the space near the object.
Perhaps most importantly, it became generally accepted since the 1920s that Euclid and Kant were mistaken about the geometry of the universe because they failed to distinguish mathematical geometry (which is a priori) from physical geometry (which is empirical). In philosophy, this point was made most strenuously by Hans Reichenbach.
For a deeper philosophical introduction to inertial frames, see chapter 2 of (Maudlin 2012).
6. What Is Spacetime?
Spacetime is a certain combination of space and time. It is the set of locations of events, or it can be considered to be a field where all events are located. It makes sense to speak of speed through space, but not of speed through spacetime.
There are actual spacetimes and imaginary spacetimes. Our real four-dimensional spacetime has a single time dimension and at least three space dimensions. It is still an open question whether there are more than three spatial dimensions. But there definitely are imaginary spacetimes with twenty-seven dimensions or three hundred. There could be a three-dimensional spacetime composed of two spatial dimensions and a time dimension in which points in space indicate the latitude and longitude in Canada for the sale of a company’s widget, and points along the time dimension indicate the date of the sale of the widget. In any spacetime, real or imaginary, the coordinates are the names of locations in space and time. Coordinates are mathematical artifacts.
In 1908, Einstein’s mathematics teacher Hermann Minkowski was the first person to say that real spacetime is fundamental and that space and time are just aspects of spacetime. And he was the first to say different reference frames will divide spacetime differently but correctly into their time part and space part. Einstein was soon convinced by Minkowski’s reasoning.
Later, Einstein discovered that real spacetime is dynamic and not static as in special relativity theory. It is dynamic because its structure, such as its geometry, changes over time. Einstein said it changes as the distribution of matter-energy changes. In special relativity and in Newton’s theory, spacetime is not dynamic; it stays the same regardless of what matter and energy are doing. In any spacetime obeying either the special or the general theory of relativity, the key idea about time is that there is a light-cone structure such that every point in spacetime has both a forward light-cone of future events and a backward light-cone of past events. What this means is explained momentarily.
In his general theory of relativity, Einstein said gravity is a feature of spacetime, namely its curvature. Spacetime curves near gravitational fields, and it curves more the stronger the field strength. The overall, cosmic curvature of space was far from zero at the Big Bang, but it is now about zero, although many cosmologists believe it is evolving toward a positive value. These days the largest curvature of spacetime is in black holes.
In general relativity, spacetime is assumed to be a fundamental feature of reality. It is very interesting to investigate whether this assumption is true. There have been serious attempts to construct theories of physics in which spacetime is not fundamental but instead emerges from something more fundamental such as quantum fields, but none of these attempts has yet been supported by empirical observations or experiments that could show the new theory to be superior to the presently accepted theories.
The metaphysical question of whether spacetime is a substantial object or merely a relationship among events, or neither, is considered in the discussion of the relational theory of time in the main Time article. For some other philosophical questions about what spacetime is, see What is a Field?
According to the physicist George Musser, “Gravity is not a force that propagates through space but a feature of spacetime itself. When you throw a ball high into the air, it arcs back to the ground because Earth distorts the spacetime around it, so that the paths of the ball and the ground intersect again.”
How do we detect that a space is curved if we cannot look down upon it from a higher dimension and see the curvature, and must instead make the decision from within the space? The answer is that we can detect deviations from Euclidean geometry, such as (1) initially parallel lines becoming less parallel as they are extended, or (2) the interior angles of a triangle failing to sum to 180 degrees, or (3) the circumference of a circle failing to equal the product of pi and its diameter.
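Deviation (2) can be illustrated with Girard's theorem for the surface of a sphere, a two-dimensional curved space: the interior angles of a spherical triangle exceed 180 degrees by the triangle's area divided by the square of the sphere's radius. The following sketch is only an illustration; the function name is not from the article.

```python
import math

# Girard's theorem: on a sphere of radius R, the interior angles of a
# triangle exceed pi radians (180 degrees) by area / R^2.  On a flat
# Euclidean plane the excess is zero, so a surveyor confined to the
# surface can detect curvature by measuring this deviation.

def angle_sum(triangle_area, radius):
    """Interior-angle sum (in radians) of a spherical triangle."""
    return math.pi + triangle_area / radius**2

# Example: the triangle covering one octant of a unit sphere (vertices
# at the north pole and at two equatorial points 90 degrees apart) has
# area 4*pi/8 = pi/2, so its angles sum to 270 degrees, not 180.
octant = angle_sum(triangle_area=math.pi / 2, radius=1.0)
print(round(math.degrees(octant), 6))  # 270.0
```

As the triangle shrinks (its area approaches zero), the angle sum approaches 180 degrees, which is why curvature is hard to detect over small regions.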
Many physicists who promote string theory believe that spacetime really has many more dimensions of space than three. There are the three common ones, customarily called our “brane” (short for “membrane”), plus others. The higher-dimensional “hyperspace” in which our brane resides is called the “bulk.” Our 3D brane bends into the bulk. It is believed that light cannot escape our brane but gravity can.
7. What Is a Spacetime Diagram and a Light Cone?
A spacetime diagram is a graphical representation of the coordinates of events in spacetime. Think of the diagram as a picture of a reference frame. In classical spacetime diagrams, one designated coordinate axis is for time. The other axes are for space. A Minkowski spacetime diagram is a special kind of spacetime graph. It is a particular 4-dimensional generalization of 3-D Cartesian coordinates, one that represents phenomena that obey the laws of special relativity. A Minkowski diagram allows no curvature of spacetime itself, although objects themselves can have curving sides, and they can have curving paths in space.
The following diagram is an example of a three-dimensional Minkowski spacetime diagram containing two spatial dimensions (with straight lines for the two axes) and a time dimension (with a vertical straight line for the time axis). If you are located at the origin, then the space part of this spacetime frame constitutes your rest frame; it’s the frame in which you have zero velocity. Two cones emerge upward and downward from the point-event of you, the zero-volume observer being here now at the origin of the reference frame of your spacetime diagram. These cones are your future and past light cones. The two cones are composed of green paths of possible unimpeded light rays emerging from the observer or converging into the observer. The light cone at a point of space exists even if there is no actual light there.
By convention, in a Minkowski spacetime diagram, a Cartesian (rectangular) coordinate system is used, the time axis is shown vertically, and one or two of the three spatial dimensions are suppressed (that is, not included).
If the Minkowski diagram has only one spatial dimension, then a flash of light in a vacuum has a perfectly straight-line representation, but it has a cone-shaped representation if the Minkowski diagram has two spatial dimensions, and it is a sphere if there are three spatial dimensions. Because light travels at such a high speed, it is common to choose the units along the axes so that the path of a light ray makes a 45-degree angle and the value of c is 1 light year per year, with light years being the units along each space axis and years being the units along the time axis. Or the value of c could have been chosen to be one light nanosecond per nanosecond. The careful choice of units for the axes in the diagram is important in order to prevent the light cones’ appearing too flat to be informative.
Below is an example of a Minkowski diagram having only one space dimension, so every future light cone has the shape of the letter “V.”
This Minkowski diagram represents a spatially-point-sized Albert Einstein standing still midway between two special places, places where there is an instantaneous flash of light at time t = 0 in coordinate time. At t = 0, Einstein cannot yet see the flashes because they are too far away for the light to reach him yet. The directed arrows represent the paths of the four light rays from the flashes. In a Minkowski diagram, a physical point-object of zero volume is represented as occupying a single point at one time and as occupying a line containing all the spacetime points at which it exists. That line is called the world line of the (point) object. All world lines representing real objects are continuous paths in spacetime. Accelerating objects have curved paths in spacetime. Real objects that are not just points have a world tube rather than merely a world line. If an object moves slower than the speed of light, its world line is confined within its future cone. The line stops if the object stops existing.
Events on the same horizontal line of the Minkowski diagram are simultaneous in the reference frame. The more tilted an object’s world line is away from the vertical, the faster the object is moving. Given the units chosen for the above diagram, no world line can tilt away from the vertical by more than 45 degrees, or else that object would be moving faster than c, the cosmic speed limit according to special relativity.
In the above diagram, Einstein’s world line is straight, indicating no total external force is acting on him. If an object’s world line meets another object’s world line, then the two objects collide.
The set of all possible photon histories or light-speed world lines going through a specific point-event defines the two light cones of that event, namely its past light cone and its future light cone. The future cone or forward cone is called a cone because, if the spacetime diagram were to have two space dimensions, then light emitted from a flash would spread out in the two spatial dimensions in a circle of ever-growing diameter, producing a cone shape over time. In a diagram for three-dimensional space, the light’s wavefront is an expanding sphere and not an expanding cone, but sometimes physicists still will speak informally of its cone.
Every point of spacetime has its own pair of light cones, but the light cone has to do with the structure of spacetime, not its contents, so the light cone of a point exists even if there is no light there.
Whether a member of a pair of events could have had a causal impact upon the other event is an objective feature of the universe and is not relative to a reference frame. A pair of events inside the same light cone are said to be causally-connectible because they could have affected each other by a signal going from one to the other at no faster than the speed of light, assuming there were no obstacles that would interfere. For two causally-connectible events, the relation between the two events is said to be timelike. If you were once located in spacetime at, let’s say, (x1,y1,z1,t1), then for the rest of your life you cannot affect or participate in any event that occurs outside of the forward light cone whose apex is at (x1,y1,z1,t1). Light cones are an especially helpful tool because different observers in different rest frames should agree on the light cones of any event, despite their disagreeing on what is simultaneous with what and their disagreeing on the duration between two events. So, the light-cone structure of spacetime is objectively real.
Einstein’s Special Theory does apply to gravitation, but it does so very poorly. It falsely assumes that gravitational processes have no effect on the structure of spacetime. When attention needs to be given to the real effect of gravitational processes on the structure of spacetime, that is, when general relativity needs to be used, then Minkowski diagrams become inappropriate for spacetime. General relativity assumes that the geometry of spacetime is locally Minkowskian, but not globally Minkowskian. That is, spacetime is locally flat in the sense that in any infinitesimally-sized region one always finds spacetime to be 4D Minkowskian (which is 3D Euclidean for space but not 4D Euclidean for spacetime). When we say spacetime is curved and not flat, we mean it deviates from 4D Minkowskian geometry. In discussions like this, more often the term “Lorentzian” is used in place of “Minkowskian.”
8. What Are Time’s Metric and Spacetime’s Interval?
The metric of a space contains geometric information about the space, namely the curvature at points and distance between any two points along a curve containing the two points. If you change to a different coordinate system, generally you must change the metric. In that sense, the metric is not objective.
In simple situations, such as in a Euclidean space with a Cartesian coordinate system, time’s metric is a procedure that says that in order to find the duration of an event subtract the event’s starting time from its ending time. This is the standard way to compute durations when curvature of spacetime is not involved. When it is involved, such as in general relativity, we need a more exotic metric, and the computations can be extremely complicated.
The spacetime metric tells the spacetime interval between any two point events along any specific spacetime path. The spacetime interval has both space aspects and time aspects.
Let’s delve into this issue a little more deeply. There are multiple senses of the word space. A mathematical space is not a physical space. A physicist often represents physical time as a one-dimensional mathematical space, represents physical space as a three-dimensional mathematical space, and physical spacetime as a four-dimensional mathematical space. A metric for any sort of space is an equation that says how to compute the distance or interval between any two points in that space along a curve in the space, given the location coordinates of the points on that curve. Note the coordinate dependence. For ordinary Euclidean space, the usual metric is just the three-dimensional version of the Pythagorean Theorem. In a Euclidean four-dimensional space, the metric is the four-dimensional version of the Pythagorean Theorem.
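As an elementary illustration, the three-dimensional Euclidean metric mentioned above is just the Pythagorean Theorem extended to Cartesian coordinates in three dimensions. The sketch below is only illustrative; the function name is not from the article.

```python
import math

# Euclidean metric for 3D Cartesian coordinates: the straight-line
# distance between two points is the three-dimensional version of
# the Pythagorean Theorem, sqrt(dx^2 + dy^2 + dz^2).

def euclidean_distance(p, q):
    """Distance between points p and q, each an (x, y, z) triple."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean_distance((0, 0, 0), (3, 4, 12)))  # 13.0
```

The four-dimensional Euclidean metric simply adds a fourth squared term under the square root; the Minkowski metric of spacetime, discussed below, differs in the sign given to the space terms.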
What counts as an acceptable reference frame? The value of any shortest interval between two point events must not change if the reference frame changes. So, acceptable reference frames are those that preserve the shortest intervals between points.
To quantitatively handle this path dependence, mathematicians say the infinitesimal proper time dτ, rather than the differential coordinate-time dt, is the duration shown by a clock carried along the infinitesimal spacetime interval ds. It is defined in any spacetime obeying special relativity to be:
dτ² = ds²/c².
Usually, dτ is not equal to dt. They are equal only if the two point-events have the same spatial location so that dx = 0. If you sit still, the duration of your coordinate time is the same as that of your proper time. Not so if you move.
As we have seen, the length of a path’s interval in spacetime is not calculated the way we calculate the length of a path in space. In space we use the Euclidean method; in spacetime we use the Minkowski method, which contains a minus sign for the space terms. The equation is ds² = c²dt² – dx² for a spacetime of only one spatial dimension.
Because spacetime “distances” (intervals) can be negative, and because the spacetime interval between two different events can be zero even when the events are far apart in spatial distance (but reachable by a light ray if intervening material were not an obstacle), the term interval here is not what is normally meant by the term distance.
If you move but do not move on a straight line in spacetime between two points, and so follow some spacetime path, then you integrate along the path to get the interval. There are three kinds of spacetime intervals: timelike, spacelike, and null. In spacetime, if two point-events are in principle connectable by a signal that travels from one event to the other at less than light speed, the interval between the two events is called timelike. The interval is spacelike if there is no reference frame in which the two events occur at the same place, so they must occur at different places and be some spatial distance apart—thus the choice of the word spacelike. Two events that are connectable by a signal moving exactly at light speed are separated by a null interval, an interval of magnitude zero. So, if you move at light speed you can get anywhere in no time at all.
Here is an equivalent way of describing the three kinds of spacetime intervals. If one of the two events occurs at the origin or apex of a light cone, and the other event is within either the forward light cone or backward light cone, then the two events have a timelike interval. If the other event is outside the light cones, then the two events have a spacelike interval [and are in each other’s so-called absolute elsewhere]. If the two events lie along the same light cone, then their interval is null or zero.
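The three-way classification can be computed directly from the interval equation ds² = c²dt² – dx² given earlier, with c = 1 and one spatial dimension. This is only a sketch with illustrative names, not notation from the article.

```python
def interval_type(dt, dx, c=1.0):
    """Classify the spacetime interval between two point-events
    separated by coordinate time dt and spatial distance dx,
    using ds^2 = c^2 dt^2 - dx^2 (one spatial dimension)."""
    ds2 = (c * dt) ** 2 - dx ** 2
    if ds2 > 0:
        return "timelike"   # connectable by a slower-than-light signal
    if ds2 < 0:
        return "spacelike"  # in each other's absolute elsewhere
    return "null"           # connectable only at exactly light speed

print(interval_type(dt=2, dx=1))  # timelike
print(interval_type(dt=1, dx=2))  # spacelike
print(interval_type(dt=1, dx=1))  # null
```

In the light-cone picture, the first case puts the second event inside the cone, the second case outside it, and the third case on the cone itself.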
The spacetime interval between any two events in a human being’s life must be a timelike interval. No human being can do anything to affect an event outside their future light cone. Such is the human condition according to relativity theory.
The information in the more complicated metric for general relativity also enables a computation of the curvature at any spacetime point. This more complicated metric is the Riemannian metric tensor field. This is what you know when you know the metric of spacetime. The metric of spacetime in general relativity is not global but varies from place to place due to the presence of matter and gravitation, and it varies over time as the spatial distribution of matter and energy varies with time. So, spacetime cannot be given its coordinate numbers without our knowing the distribution of matter and energy. That is the principal reason why the assignment of time coordinates to times is so complicated.
A space’s metric provides a complete description of the local properties of the space. By contrast, the space’s topology provides a complete description only of the global properties, such as whether it has extrinsic curvature like a cylinder or no extrinsic curvature as in a plane. A cylinder and a plane are locally the same, but they are globally very different.
For a helpful and more detailed presentation of the spacetime interval and the spacetime metric, see chapter 4 of (Maudlin 2012) and especially the chapter “Geometry” in The Biggest Ideas in the Universe: Space, Time, and Motion by Sean Carroll. For a simpler presentation, see the YouTube video “Special Relativity: This is why you misunderstand it” by Sabine Hossenfelder.
9. How Does Proper Time Differ from Standard Time and Coordinate Time?
Authors and speakers who use the word time often do not specify whether they mean proper time or standard time or coordinate time. They assume the context is sufficient for us to know what they mean. Coordinate time is time measured in some coordinate system. Proper time of any object is the time measured by a clock attached to the object. Standard time is the proper time reported by the standard clock at the origin of our conventionally-chosen standard coordinate system.
Every properly functioning clock measures its own proper time, the time along its own world line, no matter how the clock is moving or what forces are acting upon it. Standard time is the time reported by a clock for the Greenwich Observatory in London, England (sometimes called Greenwich Mean Time or GMT). By convention, this is the correct time for the time zone containing London. The Observatory is assumed to be stationary in the standard coordinate system.
If you are stationary, then the proper time elapsed between two events on your world line is the same as the coordinate time that elapses. Otherwise the two are not the same. For example, consider two distinct point events A and B. Suppose event A is your saying goodbye to your twin in San Francisco, and event B is your saying hello to your twin who arrives back in San Francisco at the place you never left. The proper time that has elapsed for you is the same as the difference in coordinate times of the two events, namely |t(B) – t(A)|. Suppose your twin took a spaceship from San Francisco and traveled at near the speed of light until returning to San Francisco to meet you. You would no longer be the same age as your twin. This surprising truth of relativity theory is called the Twin Paradox, and it is discussed below.
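The gap between the twins' ages can be estimated with the standard special-relativistic formula for proper time along a constant-speed world line, τ = t√(1 − v²/c²). This idealized sketch ignores the turnaround acceleration in the twin scenario, and its names are only illustrative.

```python
import math

def proper_time(coordinate_time, v, c=1.0):
    """Proper time elapsed on a clock moving at constant speed v
    while the given coordinate time elapses in the rest frame:
    tau = t * sqrt(1 - v^2 / c^2)."""
    return coordinate_time * math.sqrt(1.0 - (v / c) ** 2)

# The stay-at-home twin ages 10 years of coordinate time; the
# traveling twin moves at 0.8c (out and back at constant speed),
# so only 6 years of proper time elapse on the traveler's clock.
print(round(proper_time(10.0, 0.8), 6))  # 6.0
```

For everyday speeds the factor under the square root is so close to 1 that the two twins' clocks disagree by an undetectably small amount, which is why the effect escaped notice before the twentieth century.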
The actual process by which coordinate time is computed from the proper times of real clocks and the process by which a distant clock is synchronized with a local clock are very complicated. Some of the details are discussed below in later sections of this supplement.
10. Is Time the Fourth Dimension?
Yes and no; it depends on what is meant by the question. It is correct to say time is a dimension but not to say time is a spatial dimension. Time is the fourth dimension of 4D spacetime, but time is not the fourth dimension of our physical space because that space has only three dimensions, as far as we know. In 4D spacetime, the time dimension is special and differs in a fundamental way from the other three dimensions.
Mathematicians have a broader notion of the term space than the average person. In their sense, a space need not contain any geographical locations nor any times, and it can have any number of dimensions, even an infinite number. A space might be two-dimensional and contain points represented by the ordered pairs in which a pair’s first member is the name of a voter in London and its second member is the average monthly income of that voter. Not paying attention to the two meanings of the term space is the source of all the confusion about whether time is the fourth dimension.
The mathematical space used by mathematical physicists to represent physical spacetime that obeys the laws of relativity is four-dimensional; and in that mathematical space, the space of places is a 3D sub-space, and time is another sub-space, a 1D one. The mathematician Hermann Minkowski was the first person to envision such a 4D mathematical space for spacetime, although in 1895 H. G. Wells treated time informally as the fourth dimension in his novel The Time Machine.
In 1908, Einstein’s mathematics teacher Minkowski remarked that “Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.” Many people at the time mistakenly took this to mean that time is partly space, and vice versa. At the time, the philosopher C. D. Broad made the helpful remark that the discovery of spacetime did not break down the distinction between time and space but only their independence of each other.
The reason why time is not partly space is that, speaking informally, when you set up a rectangular coordinate system on a spacetime with an origin at, say, some important event, you may point the x-axis east or north or up or any of an infinity of other directions, but you may not point it forward in time—you may do that only with the t-axis, the time axis.
For any coordinate system on spacetime, mathematicians of the early twentieth century believed it was necessary to specify a point-event with at least four independent numbers in order to account for the four-dimensionality of spacetime. It can be treated this way, but it need not be. This treatment with four numbers uses the 19th-century definition of dimensionality, which is due to Bernhard Riemann. It is not necessary because mathematicians have subsequently discovered how to map a plane onto a line and thereby assign each point on the plane to a point on the line without any two points on the plane being assigned to the same point on the line. The idea comes from the work of Georg Cantor. Because of this one-to-one correspondence between the plane’s points and the line’s points, the points on a plane could be specified with just one number instead of two. If so, then the line and the plane must have the same dimension according to the Riemann definition of dimension. To avoid this result, and to keep the plane a 2D object, the notion of dimensionality of space has been given a new, but more complex, definition that is not explored here.
There has been much research in string theory regarding whether space has more than three dimensions and whether the dimensionality can differ in different regions of spacetime. If string theory is correct, space might have many more dimensions than three, but string theory is an unconfirmed theory.
11. How Is Time Relative to the Observer?
The rate at which a clock ticks is relative to the observer. Given one event, the first observer’s clock can measure one value for its duration, but a second clock can measure a different value if it is moving or being affected differently by gravity. Yet, says Einstein, both measurements can be correct. That is what it means to say time is relative to the observer. This relativity is quite a shock to someone presupposing the manifest image of time. According to Newton’s physics, in principle there is no reason why observers cannot agree on what time it is now, or how long an event lasts, or when some distant event occurred, regardless of the reference frame involved. Einstein’s theory disagrees with Newton’s on all this.
The term “observer” in relativity theory has a technical meaning. The observer has no effect on what is observed; the observer at a point is idealized as a massless point particle having no impact on its environment. Ideally, an observer is a conscious being who can report an observation and who has a certain orientation to what is observed, such as being next to the measured event or being three light years away. Because the observer does not affect what is observed, the observation is called objective. An observation is the result of the act of observing. It establishes the values of one or more variables, as in: “It was noon on my spaceship’s clock when the asteroid impact was detected, so because of the travel time of light I compute that the impact occurred at 11:00.”
Think of an observer as being an omniscient reference frame. Consider what is involved in being an omniscient reference frame. Information about any desired variable is reported from a point-sized spectator at each spacetime location. A spectator is always accompanied by an ideal, point-sized, massless, perfectly functioning clock that is synchronized with the clocks of other spectators at all other points of spacetime. The observer at a location has all the tools needed for reporting values of variables such as voltage or the presence or absence of grape jelly at that location.
12. What Is the Relativity of Simultaneity?
The relativity of simultaneity is the feature of spacetime in which observers using different reference frames disagree on which events are simultaneous. Simultaneity is relative to the chosen reference frame. A large percentage of both physicists and philosophers of time suggest that this implies simultaneity is not objectively real, and they conclude also that the present is not objectively real, the present being all the events that are simultaneous with being here now.
Why is there disagreement about what is simultaneous with what? It occurs because the two events occur spatially far from each other without either possibly having a causal effect upon the other. If event A could cause event B, then A occurs before B in all reference frames.
In our ordinary lives, we can neglect all this because we are interested in nearby events. If two events occur near us, we can just look and see whether they occurred simultaneously. But suppose we are on a spaceship circling Mars when a time signal is received saying it is noon in London, England. Did the sending and the receiving occur simultaneously? No. Light takes an hour and twenty minutes to travel from the Earth to the spaceship. If we want to use this time signal to synchronize our clock with the Earth clock, then instead of setting our spaceship clock to noon, we should set it to an hour and twenty minutes after noon.
This scenario conveys the essence of properly synchronizing distant clocks with our nearby clock. There are some assumptions that are ignored for now, namely that we can determine that the spaceship was relatively stationary with respect to Earth and was not in a different gravitational potential field from that of the Earth clock.
The diagram below illustrates the relativity of simultaneity for the so-called midway method of synchronization. There are two light flashes. Did they occur simultaneously? Yes and no; it is relative to the frame used in the answer. Yes, for the frame in which Einstein is at rest; no, for the frame in which Lorentz is at rest.
This Minkowski diagram represents Einstein sitting still in the reference frame indicated by the coordinate system with the thick black axes. Lorentz is traveling rapidly away from him and toward the source of flash 2. Because Lorentz’s world line is a straight line, we can tell that he is moving at a constant speed. The two flashes of light arrive simultaneously at their midpoint according to Einstein but not according to Lorentz. Lorentz sees flash 2 before flash 1. That is, the event A of Lorentz seeing flash 2 occurs before event C of Lorentz seeing flash 1. So, Einstein will readily say the flashes are simultaneous, but Lorentz will have to do some computing to figure out that the flashes are simultaneous in the Einstein frame because they are not simultaneous to him in a reference frame in which he is at rest. However, if we’d chosen a different reference frame from the one above, one in which Lorentz is not moving but Einstein is, then it would be correct to say flash 2 occurs before flash 1. So, whether the flashes are or are not simultaneous depends on which reference frame is used in making the judgment. It’s all relative.
There is a related philosophical issue involved with assumptions being made in, say, claiming that Einstein was initially midway between the two flashes. Can the midway determination be made independently of adopting a convention about whether the speed of light is independent of its direction of travel? This is the issue of whether there is a ‘conventionality’ of simultaneity.
13. What Is the Conventionality of Simultaneity?
The relativity of simultaneity is philosophically less controversial than the conventionality of simultaneity. To appreciate the difference, consider what is involved in making a determination regarding simultaneity. The central problem is that you can measure the speed of light only for a roundtrip, not a one-way trip, so you cannot simultaneously check what time it is on your clock and on a distant clock. A related, simpler problem is to determine whether the speed of light is the same in opposite directions.
Given two events that happen essentially at the same place, physicists assume they can tell by direct observation whether the events happened simultaneously. If they cannot detect that one of them is happening first, then they say they happened simultaneously, and they assign the events the same time coordinate in the reference frame. The determination of simultaneity is very much more difficult if the two events happen very far apart, such as claiming that the two flashes of light reaching Einstein in the scenario of the previous section began at the same time. One way to measure (operationally define) simultaneity at a distance is the midway method. Say that two events are simultaneous in the reference frame in which we are stationary if unobstructed light signals caused by the two events reach us simultaneously when we are midway between the two places where they occurred. This is the operational definition of simultaneity used by Einstein in his theory of special relativity.
This midway method has a significant presumption: that the light beams coming from opposite directions travel at the same speed. Is this a fact or just a convenient convention to adopt? Einstein and the philosophers of time Hans Reichenbach and Adolf Grünbaum have called this a reasonable convention because any attempt to experimentally confirm the equality of speeds, they believed, presupposes that we already know how to determine simultaneity at a distance.
Hilary Putnam, Michael Friedman, and Graham Nerlich object to calling it a convention—on the grounds that to make any other assumption about light’s speed would unnecessarily complicate our description of nature, and we often make choices about how nature is on the basis of simplification of our description of nature.
To understand the dispute from another perspective, notice that the midway method above is not the only way to define simultaneity. Consider a second method, the mirror reflection method. Select an Earth-based frame of reference, and send a flash of light from Earth to Mars where it hits a mirror and is reflected back to its source. The flash occurred at 12:00 according to a correct Earth clock, let’s say, and its reflection arrived back on Earth 20 minutes later. The light traveled the same empty, undisturbed path coming and going. At what time did the light flash hit the mirror? The answer involves the conventionality of simultaneity. All physicists agree one should say the reflection event occurred at 12:10 because they assume it took ten minutes going to Mars, and ten minutes coming back. The difficult philosophical question is whether this way of calculating the ten minutes is really just a convention. Einstein pointed out that there would be no inconsistency in our saying that the flash hit the mirror at 12:17, provided we live with the awkward consequence that light was relatively slow reaching the mirror, but then traveled back to Earth at a faster speed.
Suppose we want to synchronize a Mars clock with our clock on Earth using the reflection method. Let’s draw a Minkowski diagram of the situation and consider just one spatial dimension in which we are at location A on Earth next to the standard clock used for the time axis of the reference frame. The distant clock on Mars that we want to synchronize with Earth time is at location B. See the diagram.
The fact that the world line of the B-clock is parallel to the time axis shows that the two clocks are assumed to be relatively stationary. (If they are not, and we know their relative speed, we might be able to correct for this.) We send light signals from Earth in order to synchronize the two clocks. Send a light signal from A at time t1 to B, where it is reflected back to us at A, arriving at time t3. So, the total travel time for the light signal is t3 – t1, as judged by the Earth-based frame of reference. Then the reading tr on the distant clock at the time of the reflection event should be set to t2, where:
t2 = t1 + (1/2)(t3 – t1).
If tr = t2, then the two spatially separated clocks are supposedly synchronized.
Einstein noticed that the use of the fraction 1/2 rather than the use of some other fraction implicitly assumes that the light speed to and from B is the same. He said this assumption is a convention, the so-called conventionality of simultaneity, and is not something we could check to see whether it is correct. Only with the fraction (1/2) are the travel speeds the same going and coming back.
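The role of the fraction can be sketched numerically. The code below is a minimal illustration, not anything from the text's sources: it assigns a time to the reflection event using an arbitrary fraction `eps` (Reichenbach's ε in the literature), and shows that only eps = 1/2 makes the implied one-way light speeds equal. The Earth-to-Mars figures and the assumed one-way distance of 10 light-minutes are illustrative.

```python
# A sketch of the synchronization convention. The fraction eps is the
# conventional choice; eps = 1/2 makes the one-way light speeds equal.

def reflection_time(t1, t3, eps=0.5):
    """Time assigned to the reflection event at B, given emission time t1
    and return time t3 at A, using the fraction eps."""
    return t1 + eps * (t3 - t1)

def one_way_speeds(t1, t3, distance, eps=0.5):
    """Implied outbound and return light speeds for a given eps."""
    t2 = reflection_time(t1, t3, eps)
    return distance / (t2 - t1), distance / (t3 - t2)

# Earth-to-Mars example from the text: flash sent at 12:00, reflection
# received 20 minutes later; assume a one-way distance of 10 light-minutes.
t1, t3, d = 0.0, 20.0, 10.0   # minutes and light-minutes

# With eps = 1/2 the reflection is assigned t = 10 (12:10), and both
# one-way speeds equal 1 light-minute per minute, that is, c.
print(reflection_time(t1, t3, 0.5))        # 10.0
print(one_way_speeds(t1, t3, d, 0.5))      # (1.0, 1.0)

# With eps = 0.85 the reflection is assigned t = 17 (12:17, Einstein's
# example): light is slower going out and faster coming back.
out_v, back_v = one_way_speeds(t1, t3, d, 0.85)
print(round(out_v, 3), round(back_v, 3))   # 0.588 3.333
```

Either choice is internally consistent; the round-trip average speed is c no matter what eps is, which is why no round-trip experiment can decide between them.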
Suppose we try to check whether the two light speeds really are the same. We would send a light signal from A to B, and see if the travel time was the same as when we sent it from B to A. But to trust these durations we would already need to have synchronized the clocks at A and B. But that synchronization process will presuppose some value for the fraction, said Einstein.
Not all philosophers of science agree with Einstein that the choice of (1/2) is a convention, nor with those philosophers such as Putnam who say the messiness of any other choice shows that the choice of 1/2 must be correct. Everyone does agree, though, that any other choice than 1/2 would make for messy physics.
Some researchers suggest that there is a way to check on the light speeds and not simply presume they are the same. Create two duplicate, correct clocks at A. Transport one of the clocks to B at an infinitesimal speed. Going this slow, the clock will arrive at B without having its own time reports deviate from that of the A-clock. That is, the two clocks will be synchronized even though they are distant from each other. Now the two clocks can be used to find the time when a light signal left A and the time when it arrived at B, and similarly for a return trip. The difference of the two time reports on the A and B clocks can be used to compute the light speed in each direction, given the distance of separation. This speed can be compared with the speed computed with the midway method. The experiment has never been performed, but the recommenders are sure that the speeds to and from will turn out to be identical, so they are sure that the (1/2) is correct and not a convention.
Sean Carroll has yet another position on the issue. He says “The right strategy is to give up on the idea of comparing clocks that are far away from each other” (Carroll 2022, 150).
The conventionality of simultaneity is discussed in the encyclopedia article on Eternalism. For additional discussion of the controversies involved in the conventionality of simultaneity, see (Callender 2017, p. 51) and pp. 179-184 of The Blackwell Guide to the Philosophy of Science, edited by Peter Machamer and Michael Silberstein, Blackwell Publishers, Inc., 2002.
14. What are the Absolute Past and the Absolute Elsewhere?
What does it mean to say the human condition is one in which you never will be able to affect an event outside your forward light cone? Here is a visual representation of the human condition according to the special theory of relativity, whose spacetime can always be represented by a Minkowski diagram of the following sort:
The absolutely past events for the observer’s here and now are the events in the Past Light Cone of the diagram above. The backward light cone of event Q is the imaginary cone-shaped surface of spacetime points formed by the paths of all light rays reaching Q from the past.
The events in your absolute past zone or region are those that could have directly or indirectly affected you, the observer, at the present moment, assuming there were no intervening obstacles (so the space was empty). The events in your absolute future zone are those that you could directly or indirectly affect.
An event’s being in another event’s absolute past is a feature of spacetime itself because the first event is in the second event’s past in all possible reference frames. This feature is frame-independent. For any event in your absolute past, every observer in the universe (who is not making an error) will agree the event happened in your past. Not so for events that are in your past but not in your absolute past. Past events not in your absolute past are in what Eddington called your absolute elsewhere. The absolute elsewhere is the region of spacetime containing events that are not causally connectible to your here-and-now. Your absolute elsewhere is the region of spacetime that is neither in nor on either your forward or backward light cones. No event here and now can affect any event in your absolute elsewhere; and no event in your absolute elsewhere can affect you here and now.
If you look through a telescope you can see a galaxy that is a million light-years away, and you see it as it was a million years ago. But you cannot see what it looks like now because the present version of that galaxy is outside your light cone, and is in your absolute elsewhere.
A single point’s absolute elsewhere, absolute future, and absolute past form a partition of all spacetime into three disjoint regions. If point-event A is in point-event B’s absolute elsewhere, the two events are said to be spacelike related. If the two are in each other’s forward or backward light cones they are said to be time-like related or to be causally connectible. We can affect or be affected by events that are time-like related to us here and now; we cannot affect or be affected by events that are space-like separated from our here and now. Whether a space-like event occurs before the event of your being here now depends on the chosen frame of reference, but the order of occurrence of a time-like event and our here-and-now is not frame-relative. Another way to make the point is to say that, when choosing a reference frame, we have a free choice about the time order of two events that are space-like related, but we have no freedom when it comes to two events that are time-like related because the causal order determines their time order. That is why the absolute elsewhere is also called the extended present. There is no fact of the matter about whether a point in your absolute elsewhere is in your present, your past, or your future. It is simply a conventional choice of reference frame that fixes what events in your absolute elsewhere are present events.
For any two events in spacetime, they are time-like, space-like, or light-like separated, and this is an objective feature of the pair that cannot change with a change in the reference frame. This is another implication of the fact that the light-cone structure of spacetime is real and objective, unlike features such as durations and lengths.
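The three-way classification above can be sketched numerically. In units with c = 1, the sign of the squared interval Δt² − (Δx² + Δy² + Δz²) between two events determines the class, and that sign is the same in every reference frame. A minimal sketch:

```python
# Classify pairs of events as time-like, space-like, or light-like using
# the spacetime interval, in units where c = 1. The classification is
# frame-independent, as the text notes.

def classify(dt, dx, dy=0.0, dz=0.0):
    """Classify the separation between two events from their coordinate
    differences. Returns 'time-like', 'space-like', or 'light-like'."""
    interval2 = dt**2 - (dx**2 + dy**2 + dz**2)
    if interval2 > 0:
        return "time-like"    # causally connectible
    if interval2 < 0:
        return "space-like"   # in each other's absolute elsewhere
    return "light-like"       # on each other's light cone

print(classify(dt=5, dx=3))   # time-like: a slower-than-light signal connects them
print(classify(dt=3, dx=5))   # space-like: no causal connection is possible
print(classify(dt=4, dx=4))   # light-like: connectible only by a light ray
```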
The past light cone looks like a cone in small regions in a spacetime diagram with one dimension of time and two of space. However, the past light cone is not cone-shaped in a large cosmological region, but rather has a pear-shape because all very ancient light lines must have come from the infinitesimal volume at the Big Bang.
15. What Is Time Dilation?
Time dilation causes two synchronized clocks to lose their synchrony. Time dilation or time’s “stretching” can occur because of speed or because of being affected by different gravitational forces for a period of time. For the kind of time dilation that is due to speed, the specific amount of time dilation depends on the relative speed of one clock toward or away from the other. If one clock circles the other perfectly, their relative speed toward or away from each other is zero, so there is no time dilation due to speed, regardless of how fast the circling clock moves. A clock in a circular orbit around Earth does have a speed relative to a clock on the Earth’s surface because the Earth clock is not at the Earth’s center.
The clock of an astronaut on the Moon ticks faster than it would back on the Earth’s surface, where gravity is stronger. After about 50 years, the astronaut would be one second older than if he or she had remained on Earth. The clock in a satellite orbiting Earth disagrees with the standard Earth clock: it ticks slower because of time dilation due to its speed, yet faster because of gravitational time dilation. These two time dilation effects cancel out when the satellite is about 2,000 miles above Earth.
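The "one second in about 50 years" figure for the Moon can be roughly checked. The sketch below compares only the surface gravitational potentials GM/(Rc²) of the two bodies, ignoring smaller contributions (orbital speeds, Earth's potential at the Moon's distance); the constants are standard reference values.

```python
# Rough check: how long until a Moon clock gets one second ahead of an
# Earth-surface clock, comparing only surface potentials GM/(R c^2).

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8          # speed of light, m/s

M_earth, R_earth = 5.972e24, 6.371e6   # kg, m
M_moon,  R_moon  = 7.342e22, 1.737e6   # kg, m

# Fractional rate difference: the Moon clock runs fast by about this much.
rate_diff = G * M_earth / (R_earth * c**2) - G * M_moon / (R_moon * c**2)

seconds_per_year = 3.156e7
years_for_one_second = 1.0 / rate_diff / seconds_per_year
print(round(years_for_one_second))   # 48, i.e., roughly 50 years
```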
The sister of time dilation is length contraction, also called space contraction. The length of an object changes in different reference frames to compensate for time dilation so that the speed of light c in a vacuum is constant in every frame. The object’s length measured perpendicular to the direction of motion is not affected by the motion, but the length measured in the direction of the motion is. If you are doing the measuring, then moving sticks get shorter whether they are moving toward you or away from you. The length changes not because of forces, but rather because space itself contracts. No one notices that the space around themselves is contracting, only that the space somewhere else seems to be affected. What a shock to our manifest image!
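Both effects are governed by the same Lorentz factor. The sketch below uses an illustrative speed of 0.8c; the 10-hour and 10-meter figures are assumptions chosen to make the arithmetic clean.

```python
# A minimal sketch of the Lorentz factor, which governs both time dilation
# (moving clocks tick slower) and length contraction (moving sticks are
# shorter along their direction of motion). Speeds are fractions of c.

import math

def gamma(v):
    """Lorentz factor for relative speed v, given as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - v * v)

v = 0.8
g = gamma(v)
print(round(g, 4))        # 1.6667 (i.e., 5/3)

# Time dilation: while 10 hours pass for us, a clock moving at 0.8c
# ticks off only 10/gamma = 6 hours of its own proper time.
print(round(10 / g, 6))   # 6.0

# Length contraction: a 10-meter stick moving lengthwise at 0.8c measures
# 10/gamma = 6 meters to us; its perpendicular width is unchanged.
print(round(10 / g, 6))   # 6.0
```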
Here is a picture of the visual distortion of moving objects due to space contraction:
Image: Corvin Zahn, Institute of Physics, Universität Hildesheim,
Space Time Travel (http://www.spacetimetravel.org/)
The picture describes the same wheel in different colors: (green) rotating in place just below the speed of light; (blue) moving left to right just below the speed of light; and (red) remaining still.
To give some idea of the quantitative effect of time dilation:
“Among particles in cosmic rays we find protons…that move so fast that their velocities differ infinitesimally from the speed of light: the difference occurs only in the twentieth (sic!) non-zero decimal after the decimal point. Time for them flows more slowly than for us by a factor of ten billion. If, by our clock, such a proton takes a hundred thousand years to cross our stellar system—the Galaxy—then by ‘its own clock’ the proton needs only five minutes to cover the same distance” (Novikov 1998, p. 59).
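Novikov's figures check out arithmetically, as this quick sketch shows:

```python
# Quick check of Novikov's figures: with a time-dilation factor of ten
# billion, a trip lasting 100,000 years by our clock takes about five
# minutes of the proton's own (proper) time.

gamma = 1e10       # dilation factor from the quotation
trip_years = 1e5   # time to cross the Galaxy, by our clock

minutes_per_year = 365.25 * 24 * 60
proper_minutes = trip_years * minutes_per_year / gamma
print(round(proper_minutes, 2))   # 5.26 -- "only five minutes"
```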
16. How Does Gravity Affect Time?
According to the general theory of relativity, gravitational differences affect time by dilating it—in the sense that observers in a less intense gravitational potential field find that clocks in a more intense gravitational potential field run slow relative to their own clocks. It’s as if the time of the clock in the intense gravitational field is stretched out and not ticking fast enough. For this reason, people in ground-floor apartments outlive their twins in penthouses, all other things being equal. Light from basement flashlights will be shifted toward the red end of the visible spectrum compared to light from flashlights in attics. All these phenomena are effects of gravitational time dilation.
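The size of the apartment-building effect can be estimated with the standard weak-field approximation, in which the fractional rate difference between two clocks separated by a small height h near Earth's surface is about gh/c². The 100-meter height and 80-year lifetime below are illustrative assumptions.

```python
# A sketch of gravitational time dilation over a small height difference
# near Earth's surface, using the weak-field approximation g*h/c^2.

g = 9.81      # m/s^2, Earth's surface gravity
c = 2.998e8   # m/s, speed of light

def fractional_rate_difference(height_m):
    """By roughly how much a higher clock runs fast relative to a lower one."""
    return g * height_m / c**2

h = 100.0   # assumed meters between basement and penthouse
frac = fractional_rate_difference(h)
print(f"{frac:.2e}")   # 1.09e-14

# Over an assumed 80-year life, the penthouse twin ages extra by only:
seconds_in_80_years = 80 * 3.156e7
print(f"{frac * seconds_in_80_years:.2e} seconds")   # 2.76e-05 seconds
```

So the effect is real but amounts to only tens of microseconds over a lifetime, which is why "all other things being equal" is doing a lot of work in the penthouse example.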
Spacetime in the presence of gravity is curved, according to general relativity. So, time is curved, too. When time curves, clocks do not bend in space as if in a Salvador Dali painting. Instead they undergo gravitational time dilation.
Information from the Global Positioning System (GPS) of satellites orbiting Earth is used by your cell phone to tell you whether you should turn right at the next intersection. The GPS is basically a group of flying atomic clocks that broadcast the time. The curvature of spacetime near Earth is significant enough that gravitational time dilation must be accounted for by these clocks to keep us from making navigation errors. GPS clocks slow down because of time dilation due to their high speed and speed up because of gravitational time dilation. The combined effect is that a GPS clock runs fast by about 38 microseconds per day relative to a clock on the ground. Therefore, these GPS satellites are launched with their clock rates adjusted to compensate for this gain and are periodically readjusted so that they stay synchronized with Earth’s standard time. The smaller the error in the atomic clock in the satellite and in the standard clock, the better the GPS system works. That is one reason physicists keep trying to build better clocks. In 2018, gravitational time dilation was measured in Boulder, Colorado, U.S.A. so carefully that it detected the difference in ticking of two initially synchronized atomic clocks that differed in height by only a centimeter. Unfortunately, the actual atomic clocks used in GPS satellites are much less accurate than these two atomic clocks.
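The two competing GPS effects can be estimated with textbook formulas, treating the orbit as circular at the standard GPS altitude of about 20,200 km. This is a rough sketch with reference-value constants, not an engineering calculation.

```python
# Rough estimate of the two competing GPS clock effects, as fractional
# rate differences relative to a clock on Earth's surface.

import math

G = 6.674e-11   # m^3 kg^-1 s^-2
c = 2.998e8     # m/s
M = 5.972e24    # Earth mass, kg
R = 6.371e6     # Earth radius, m
r = R + 2.02e7  # GPS orbital radius, m

# Gravitational effect: the satellite sits higher in Earth's potential
# well, so its clock runs FAST relative to the ground.
grav = G * M / c**2 * (1 / R - 1 / r)

# Velocity effect: the satellite's orbital speed makes its clock run SLOW.
v = math.sqrt(G * M / r)
vel = v**2 / (2 * c**2)

us_per_day = 86400 * 1e6   # convert fractional rate to microseconds/day
print(round(grav * us_per_day, 1))          # 45.7 us/day fast (gravity)
print(round(vel * us_per_day, 1))           # 7.2 us/day slow (speed)
print(round((grav - vel) * us_per_day, 1))  # 38.5 us/day net fast
```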
When a metaphysician asks the question, “What is gravity?” there are three legitimate, but very different, answers. Gravity is (1) a force, (2) intrinsic curvature of spacetime, and (3) exchanges of virtual particles. All three answers are correct and have their uses. When speaking of spilling milk or designing a rocket to visit the moon, the first answer is most appropriate to use. In the context of general relativity, the second answer is most appropriate. For ontology, the third answer is best.
In the context of a future theory of quantum gravity that incorporates gravity into the best features of quantum mechanics and the best features of the Standard Model of Particle Physics, the third answer is expected to be best. At this more fundamental level, forces are features of field activity. The virtual gravity particles are called gravitons, and they are fluctuations within the gravitational field. What is happening with milk spilling is that pairs of virtual entangled particles bubble up out of the relevant fields. Normally one member of the pair has positive momentum, and the other member has negative momentum. Those particles with negative momentum are exchanged between the milk and the Earth and floor, thereby causing the milk to be attracted to the floor in analogy to how, when someone throws a boomerang beyond you, it can hit you on its way back and push you closer to the thrower.
17. What Happens to Time near a Black Hole?
Time slows down for all processes that occur near a black hole, as measured by people far from the black hole. As measured by the people on the spaceship near a black hole, the processes are normal.
What is a black hole? It is a volume of space that has such a strong gravitational field that, if you were within it, it would require you to go faster than the speed of light to escape it. Princeton physicist Richard Gott described a black hole as a hotel in which you can check in but cannot check out. You can shoot a bullet through an apple, but not through a black hole. The bullet will go in and never come out. Even a light beam sent back toward Earth from inside the hole will get dragged back and never escape the black hole’s edge, its “event horizon.” Black holes can be bright on the inside of the horizon, but no one on the outside can see this brightness.
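The escape-velocity characterization above can be turned into a number. Setting the Newtonian escape velocity √(2GM/r) equal to c gives the Schwarzschild radius r = 2GM/c², the size below which a mass must be compressed to form a black hole. (The simple Newtonian argument happens to give the same radius as general relativity.) A sketch:

```python
# The classical condition in the text: the event horizon sits where the
# escape velocity reaches the speed of light, giving r = 2GM/c^2.

G = 6.674e-11   # m^3 kg^-1 s^-2
c = 2.998e8     # m/s

def schwarzschild_radius(mass_kg):
    """Radius below which a mass must be compressed to form a black hole."""
    return 2 * G * mass_kg / c**2

M_sun = 1.989e30    # kg
M_earth = 5.972e24  # kg

print(round(schwarzschild_radius(M_sun) / 1000, 1))    # 3.0 -- about 3 km for the Sun
print(round(schwarzschild_radius(M_earth) * 1000, 1))  # 8.9 -- about 9 mm for the Earth
```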
Most black holes are made from stars that have burned out and been crushed by their own gravity to a small volume. Larger black holes are formed from a group of stars or group of black holes that congeal together. Black holes cannot be seen directly, but their effects on their neighbors can be seen because the hole tugs on its neighbors gravitationally. The particles attracted in this way outside the horizon are called the black hole’s “accretion disk.” In our Milky Way, about 1% of the visible stars will eventually become black holes.
Here is a processed photograph of a black hole surrounded by its colorful but dangerous accretion disk that is radiating electromagnetic radiation (mostly high-energy x-rays) due to particles outside the hole crashing into each other as they are gravitationally pulled toward the hole:
The M87 black hole image produced by the European Southern Observatory
The red and orange area is the accretion disk outside the hole where incoming particles crash into each other. The black region in the center is produced by the black hole’s blocking the light from its own accretion disk that is behind it. The colors in the picture are artifacts added by a computer because the real light (when shifted from x-ray frequencies to optical frequencies) would be white and because humans can detect differences among colors better than differences in the brightness of white light. It is believed that nearly all black holes spin, and they spin faster the smaller they are, but even if a black hole is not spinning, its surrounding accretion disk beyond its event horizon will surely be spinning. Because of the spinning, the accretion disk is not spherical, but is pizza-shaped. It can have a temperature of anywhere from a thousand to many millions of degrees.
Black holes in otherwise empty space are dark as seen from the outside, but in the real world the hole has neighbors, mostly surrounding plasma and gas. The accretion disk formed by these neighbors can eject hot, magnetized gas and shine as a quasar. Consequently, some supermassive black holes are the most luminous objects in all the universe.
If you were unlucky enough to fall through the event horizon, you could see out continually, but you could not send a signal out, nor could you yourself escape even if your spaceship had an extremely powerful thrust. Also, the space around you increasingly collapses, so you would be sucked toward the center and be squeezed on your way there—a process called “spaghettification.” Spaghettification is not very significant as you cross the horizon of a very large black hole, but it gets very significant as you get near the center. Despite your soon being crushed to a tiny volume on your way in, you would continue to affect the world outside the black hole but only via your contribution to its gravity.
The center of a black hole is often called its “singularity.” This is where information is lost. Strictly speaking the center and the singularity are different. The spatial center is a very small region of very large spatial curvature. It might be a point, or it might be a ring. The singularity, on the other hand, is the end of the proper time of any object that plunges into the hole. Nevertheless, it is common to casually use the two terms interchangeably. Most contemporary physicists do not believe the center is of zero volume because that much confinement of mass would violate Heisenberg’s Uncertainty Principle of quantum theory. On this issue, contemporary physicists trust quantum mechanics more than relativity theory.
Any macroscopic object can become a black hole if sufficiently compressed. An object made of anti-matter can become a black hole, too. If you bang two rocks together fast enough, they will produce a black hole, and the black hole will begin pulling in nearby particles, including you, the experimenter. Luckily, even our best particle colliders in Earth’s laboratories are not powerful enough to create black holes, but the creators of those first laboratories were gambling with the future of humanity when they guessed no black hole would be created. High-speed black holes the mass of an asteroid could easily pass through our globe regularly without anyone noticing. Their diameter would be nanometer-sized and would grow as they passed through Earth, but they would exit Earth long before they had time to absorb any significant amount of mass.
All massive stars will become black holes when their fuel runs out and they stop radiating so that the star’s gravity takes over and crushes the star. Some black holes are amalgamations of many stars and perhaps many other black holes. Supermassive black holes have been found that are a billion times as massive as our Sun.
Our Sun is not quite massive enough to collapse to a black hole when its nuclear fuel runs out. Instead, it will collapse into a white dwarf, which will slowly cool over trillions of years, but eventually this future ash of our Sun will find its way into a black hole somewhere.
If an electron were a point particle, then it would have an enormous density and be dense enough to become a black hole. That electrons exist around us and do not become black holes is the best reason to believe electrons are not point particles.
The black hole M87, pictured above, has a mass of about 6.5 billion of our suns, so it is too massive to have originated from the collapse of a single star. It probably has eaten many nearby stars and much cosmic dust. It is not in the Milky Way but in a distant galaxy 55 million light-years from Earth. There is another, smaller yet still supermassive black hole at the center of the Milky Way. It, too, probably grew by feeding on neighboring stars and other nearby infalling material. The black hole at the center of the Milky Way gobbles one of its neighbors about every million years. Almost all galaxies have a black hole at their center, but black holes also exist elsewhere. Most black holes are not powerful enough to suck in all the stars and dust and particles around them, just as our Sun will never suck in all the planets of our solar system. The neighbors have enough angular momentum, and thus centrifugal force, to resist the gravitational pull.
A black hole’s accretion disk spins, and because of this the Doppler effect shown in the picture above requires the redness at the top to be less bright than at the bottom of the picture. The picture has been altered to remove the blurriness that would otherwise be present due to the refraction from the plasma and dust between the Earth and the black hole. The plasma close to the black hole has a temperature of hundreds of billions of degrees.
The matter orbiting the black hole is a diffuse gas of electrons and protons. …The black hole pulls that matter from the atmospheres of stars orbiting it. Not that it pulls very much. Sagittarius A* is on a starvation diet—less than 1 percent of the stuff captured by the black hole’s gravity ever makes it to the event horizon. (Seth Fletcher. Scientific American, September 2022 p. 53.)
Relativity theory implies an infalling spaceship suffers an infinite time dilation at the event horizon and so does not fall through the horizon in a finite time. Most physicists believe Relativity theory’s implication is incorrect and is a reason that relativity theory needs to be revised. When quantum mechanics is taken into account, it seems that the unfortunate spaceship does fall through in a finite time. This is because the gravitational field produced by the spaceship itself acts on the black hole. As the spaceship gets very, very close to the event horizon such as an atom’s width away, the time dilation does radically increase, but the event horizon slightly expands enough to swallow the spaceship in a finite time—a trivially short time as judged from the spaceship, but a very long time as judged from Earth. This occurrence of slight expansion is one sign that the event horizon is fluidlike. After the spaceship is swallowed, the event horizon returns to its original shape, but it is slightly larger now because it is more massive by one spaceship-mass.
Relativity theory does not allow a black hole to radiate, but quantum theory does. A black hole is not actually black because it always emits “Hawking radiation” at its horizon. The strength of the Hawking radiation is inversely proportional to the black hole’s mass, so this radiation is extremely weak for large black holes. Nevertheless, the radiation continually reduces the mass of the black hole, causing it eventually to evaporate. Large black holes take about 10^67 years to completely evaporate, assuming they are not continually fed new material. To appreciate how long a black hole lives, remember that the Big Bang occurred less than 2 × 10^10 years ago. Quantum theory suggests black holes get warmer as they shrink. The Hawking radiation has a continually increasing frequency as the black hole’s mass gets smaller. Hawking first realized that the holes shrink by absorbing particles at their event horizon that have negative energy and thus negative mass (yes, “negative”). When a black hole shrinks to the size of a bacterium, its outgoing radiation becomes white-colored, producing a white black-hole. At the very last instant of its life, it completely evaporates as it explodes in a flash of extremely hot, high-energy particles.
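The two relations mentioned here, temperature inversely proportional to mass and an evaporation time growing as the cube of the mass, can be sketched with the standard textbook formulas. The numbers below use reference-value constants and a solar-mass hole as the example.

```python
# A sketch of the standard Hawking-radiation relations: temperature
# T ~ 1/M and evaporation time t ~ M^3, for a solar-mass black hole.

import math

hbar = 1.055e-34   # reduced Planck constant, J s
G = 6.674e-11      # gravitational constant
c = 2.998e8        # speed of light, m/s
k_B = 1.381e-23    # Boltzmann constant, J/K

def hawking_temperature(M):
    """Hawking temperature in kelvin for a black hole of mass M (kg)."""
    return hbar * c**3 / (8 * math.pi * G * M * k_B)

def evaporation_time_years(M):
    """Approximate evaporation time: 5120*pi*G^2*M^3 / (hbar*c^4)."""
    seconds = 5120 * math.pi * G**2 * M**3 / (hbar * c**4)
    return seconds / 3.156e7

M_sun = 1.989e30
print(f"{hawking_temperature(M_sun):.1e} K")      # 6.2e-08 K: far colder than the CMB
print(f"{evaporation_time_years(M_sun):.1e} yr")  # 2.1e+67 yr
```

Since T grows as M shrinks, the radiation gets hotter and faster as evaporation proceeds, matching the runaway final flash described above.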
Another odd property is that nearly all physical objects tend to get warmer when you shine a light on them. Think of your ice cream cone in the sunshine. A black hole is an exception. It gets colder.
Black holes produce startling visual effects. A light ray can circle outside a black hole once or many times depending upon its angle of incidence to the event horizon. A light ray grazing a black hole can leave at any angle, so a person viewing a black hole from outside can see multiple copies of themselves at various angles. Each copy arrives at a different time. We can see the same galaxy more than once because of this. See http://www.spacetimetravel.org/reiseziel/reiseziel1.html for some of these visual effects. An external viewer can also see the part of the accretion disk that is behind the black hole and would be obscured by the hole itself if it were not for gravitational lensing.
Every spherical black hole has the odd geometric feature that its diameter is very much larger than its circumference, which is very unlike the sphere of Euclidean geometry. The diameter “plunges” into the hole.
Some popularizers have said that the roles of time and space are reversed within a black hole, but this is not correct. Instead, it is only coordinates that reverse their roles. Given a coordinate system whose origin is outside a black hole, its time-like coordinates become space-like coordinates inside the horizon. If you were to fall into a black hole, your clock would not begin measuring distance. See (Carroll 2022c 251-255) for more explanation of this role reversal.
Black Holes and Information
Do black holes destroy information? Stephen Hawking’s calculation was that information is lost to outsiders as an object hits the singularity; it is an irreversible process. The information is lost before any measurements are made, so it is a second kind of information loss for quantum theory, beyond the loss during measurement. By the word “information,” Hawking meant all the details of an initial state at some time before the object plunges through the event horizon. According to Hawking’s calculation, the details are lost to outsiders because the information is destroyed at the singularity. This loss is inconsistent with standard quantum theory, which says information is never lost except during measurements. The leading hypothesis in the first quarter of the 21st century is that the information is not lost, and it escapes back out in the Hawking radiation during the evaporation process of the black hole. Unfortunately, this hypothesis cannot be experimentally tested because the Hawking radiation is far too weak to be practically measured, and we still do not have a theory of quantum gravity that says whether information is or is not lost at the singularity. In the absence of both theory and data, physicists have produced many hypotheses about what might happen in black hole evaporation. For example, maybe black holes do not entirely evaporate, but leave behind remnants. Maybe information cannot ever fall in but instead remains on the event horizon. Maybe it bounces back at the singularity. Maybe it comes out with the Hawking radiation. The latter is the favored hypothesis.
History
In 1783, John Michell speculated that there may be a star with a large enough diameter that the velocity required to escape its gravitational pull would be so great that not even Newton’s particles of light could escape. He called them “dark stars.” Einstein invented the General Theory of Relativity in 1915, and the next year the German physicist Karl Schwarzschild discovered that Einstein’s equations imply that if a non-rotating, perfectly spherical star with no electric charge, in an otherwise empty universe were massive enough and its radius were somehow small enough so that it had extremely high density, then it would undergo an unstoppable collapse.
Soon during the collapse, the gravitational force from the object would be so strong that not even light within the hole could escape the inward pull of gravity. In 1935, Arthur Eddington commented upon this discovery that relativity theory allowed a star to collapse this way:
I think there should be a law of nature to stop a star behaving in this absurd way.
Because of Eddington’s prestige, other physicists (with the notable exception of Subrahmanyan Chandrasekhar) agreed. Then in 1939, J. Robert Oppenheimer and his student Hartland Snyder first seriously suggested that some stars would in fact collapse naturally into black holes when they burned out, and they first clearly described the defining features of a black hole—that “The star thus tends to close itself off from any communication with a distant observer; only its gravitational field persists.” The term “black hole” was first explicitly mentioned by physicist Robert Dicke some time in the early 1960s when he made the casual comparison to a notorious dungeon of the same name in India, the Black Hole of Calcutta. The term was first published in the American magazine Science News Letter in 1964. John Wheeler subsequently promoted use of the term, following a suggestion from one of his students.
Roger Penrose won a Nobel Prize for proving that if you perturb Schwarzschild’s solution by making the black hole not quite spherical, the new hole still must have a singularity.
It is now believed that most of the entropy of the universe is within black holes. Any black hole has more entropy than that of the material from which it was made, as is required by the Second Law of Thermodynamics.
Despite all this “knowledge” of black holes, as of 2025, astrophysicists are not certain that black holes actually exist. There are many theoretical candidates for black hole mimickers, which are dense, massive bodies without an event horizon. One candidate is the boson star, a dense accumulation of axions. Another proposed mimicker is the fuzzball, a dense bundle of one-dimensional strings.
18. What Is the Twin Paradox?
The twin paradox is a faulty argument that uses the theory of relativity to produce a contradiction about the ages of two twins, namely that each is older than the other.
Before giving the paradoxical argument, let’s set up a typical situation that can be used to describe the paradox. Consider two twins at rest relative to the Earth and the fixed stars. Their correct clocks are synchronized. One twin climbs into a spaceship, flies far away at a high, constant speed, then reverses course and flies back at the same speed. The assumption of a high, constant speed is a simplification that is not required. An application of the equations of special relativity theory shows that the twin in the spaceship will be the younger twin when they reunite because the spaceship’s clock has had fewer ticks; its time has dilated relative to the Earth-based clock. Isaac Newton’s mechanics cannot be used to produce this difference in ages, and Henri Bergson said there could not really be a difference, but experimental evidence has convinced the physics community that the phenomenon actually does occur as Einstein predicted.
Here is a graphic representation of the situation, using a Minkowski spacetime diagram. We let the vertical coordinate axis represent our civilization’s standard time. The Earth-based twin is stationary in this coordinate system.
The spaceship twin takes a curved path through spacetime according to this coordinate system; the Earth-based twin does not. The coordinate axes display coordinate time, which happens also to be the proper time of the Earth-based twin. The count of the dots along the two spacetime paths displays each person’s proper time, the time on the clock that travels with them. The nodes (dots) represent ticks of their clocks with the duration between adjacent ticks being the same. The number and the spacing of nodes along their two world lines is fixed by an application of the laws of special relativity. The theory implies that the proper time in the spaceship is stretched or dilated compared to the proper time of the stay-at-home twin. The key idea why their clocks get out of synchrony is that, during the trip, the traveling twin experiences less time but more new space. The Earth-based twin experiences more time and less space. If the spaceship approached the speed of light, the twin on board would cover an enormous amount of space but hardly any time at all.
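The age difference can be computed with the Lorentz factor. The sketch below is a minimal numeric version of the scenario; the speed (0.8c), the one-way distance (8 light-years), and the instantaneous turnaround are illustrative assumptions, not figures from the text.

```python
# A numeric twin scenario: constant speed 0.8c out and back, with the
# turnaround treated as instantaneous. All distances are measured in the
# Earth frame; speeds are fractions of c.

import math

def gamma(v):
    """Lorentz factor for relative speed v, a fraction of c."""
    return 1.0 / math.sqrt(1.0 - v * v)

v = 0.8          # assumed cruising speed, as a fraction of c
distance = 8.0   # assumed one-way distance in light-years, Earth frame

earth_time = 2 * distance / v       # years elapsed for the stay-at-home twin
ship_time = earth_time / gamma(v)   # proper time along the traveler's path

print(round(earth_time, 1))   # 20.0 years for the Earth-based twin
print(round(ship_time, 1))    # 12.0 years for the traveling twin
```

The traveler's world line has fewer clock ticks along it even though, in the diagram, it looks longer; that is the non-Euclidean geometry of Minkowski spacetime at work.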
Now for the faulty argument: Either twin could claim to be the stationary one, say that only the other twin turned around, and then use relativity theory to compute that the other twin comes back younger. That is paradoxical.
There has been considerable disagreement among the experts regarding how to explain why there is no contradiction. Herbert Dingle was the President of London’s Royal Astronomical Society in the early 1950s. He argued in the 1960s that this twin paradox reveals an inconsistency in special relativity. This caused quite a stir, but eventually scientists and almost all philosophers united in disagreement with Dingle.
The most commonly accepted solution to the paradox says the situation of the two twins is not the same. Yes, either twin could reasonably claim to be the stationary twin, but only the twin in the spaceship had to fire the retrorockets and had to feel additional acceleration.
Richard Feynman said the presence and absence of acceleration is the key to the resolution of the paradox, but Tim Maudlin (Maudlin 2012) argues that acceleration is not the key to resolving the paradox because the paradox could be expressed in a spacetime obeying special relativity in which neither twin accelerates.
Is the presence of Earth important? What if all matter including the background stars were removed from the universe except for the twins and one spaceship? Could the twin paradox argument still be produced in that situation?
When coordinate systems are assigned to spaces, coordinates are assigned to points. The space can be physical space or mathematical space. The coordinates hopefully are assigned in a way that a helpful metric can be defined for computing the distances between any pair of point-places, or, in the case of time, the duration between any pair of point-times. Points, including times, cannot be added, subtracted, or squared, but their coordinates can be. Coordinates applied to the space are not physically real; they are tools invented, not discovered, by the analyst or physicist. The coordinate system gives each instant a unique name.
Technically, the question, “How do time coordinates get assigned to points in spacetime?” presupposes knowing how we coordinatize the four-dimensional manifold that we call spacetime. The manifold is a collection of points (technically, it is a topological space) which behaves as a Euclidean space in neighborhoods around any point. The focus in this section is on its time coordinates.
There is very good reason for believing that time is one-dimensional, and so, given any three different point events, one of them will happen between the other two. This feature is reflected in the fact that when real number time coordinates are assigned to three point events, one of the three coordinates is between the other two.
Every event on the world-line of the standard clock (the master clock) is assigned a t-coordinate by that clock. The clock also can be used to provide measures of the duration between two point-events that occur along the coordinate line. For example, if some event e along the time-line of the master clock occurs at the spatial location of the clock while the clock shows, say, t = 4 seconds, then the time coordinate of the event e is declared to be 4 seconds; that is, t(e) = 4. We assume that e occurs spatially at an infinitesimal distance from the master clock, and that we have no difficulty in telling when this situation occurs. So, even though determinations of distant simultaneity are somewhat difficult, determinations of local simultaneity in the coordinate system are not. In this way, every event along the master clock’s time-line is assigned a time of occurrence in the coordinate system.
In order to extend the t-coordinate to events that do not occur where the standard clock is located, we can imagine having a stationary, calibrated, and synchronized clock at every other point in the space part of spacetime at t = 0, and we can imagine using those clocks to tell the time along their world lines. In practice we do not have so many accurate clocks, so the details for assigning times to these events are fairly complicated, and they are not discussed here. The main philosophical issue is whether simultaneity may be defined for events anywhere in the universe. The sub-issues involve the relativity of simultaneity and the conventionality of simultaneity. Both issues are discussed in other sections of this supplement.
Isaac Newton conceived of points of space and time as absolute in the sense that they retained their identity over time. Modern physicists do not have that conception of points; points are identified relative to events, for example, the halfway point in space between this object and that object, and ten seconds after that point-event.
In the late 16th century, the Italian mathematician Rafael Bombelli interpreted real numbers as lengths on a line and interpreted addition, subtraction, multiplication, and division as “movements” along the line. His work eventually led to our assigning real numbers to instants. Subsequently, physicists have found no reason to use complex numbers or other exotic numbers for this purpose, although some physicists believe that the future theory of quantum gravity might show that discrete numbers such as integers will suffice and the exotically structured real numbers will no longer be required.
To assign the time coordinate numbers or dates, we use a system of clocks and some calculations, and the procedure is rather complicated the deeper one probes. For some of the details, the reader is referred to (Maudlin 2012, pp. 87-105). On pp. 88-89, Maudlin says:
Every event on the world-line of the master clock will be assigned a t-coordinate by the clock. Extending the t-coordinate to events off the trajectory of the master clock requires making use of…a collection of co-moving clocks. Intuitively, two clocks are co-moving if they are both on inertial trajectories and are neither approaching each other nor receding from each other. …An observer situated at the master clock can identify a co-moving inertial clock by radar ranging. That is, the observer sends out light rays from the master clock and then notes how long it takes (according to the master clock) for the light rays to be reflected off the target clock and return. …If the target clock is co-moving, the round-trip time for the light will always be the same. …[W]e must calibrate and synchronize the co-moving clocks.
The master clock is the standard clock. Co-moving inertial clocks do not generally exist according to general relativity, so the issue of how to assign time coordinates is complicated in the real world. Below are a few more interesting comments about the assignment.
The main point of having a time coordinate is to get agreement from others about which time values to use for which events, namely which time coordinates to use. Relativity theory implies every person, and even every object, has its own proper time, which is the time of the clock accompanying it. Unfortunately, these personal clocks do not usually stay in synchrony with other well-functioning clocks, although Isaac Newton falsely believed they do. According to relativity theory, if you were to synchronize two perfectly-performing clocks and give one of them a speed relative to the other, then the two clocks’ readings must differ (as would be obvious if they were reunited), so once you have moved a clock away from the standard clock, you can no longer trust it to report the correct coordinate time at its new location.
The process of assigning time coordinates assumes that the structure of the set of instantaneous events is the same as, or is embeddable within, the structure of our time numbers. Showing that this is so is called solving the representation problem for our theory of time measurement. The problem has been solved. This article does not go into detail on how to solve this problem, but the main idea is that the assignment of coordinates should reflect the structure of the space of instantaneous times, namely its geometrical structure, which includes its topological structure, diffeomorphic structure, affine structure, and metrical structure. It turns out that the geometrical structure of our time numbers is well represented by the structure of the real numbers, the decimals.
The features that a space has without its points being assigned any coordinates whatsoever are its topological features, its differential structures, and its affine structures. The topological features include its dimensionality, whether it goes on forever or has a boundary, and how many points there are. The mathematician will be a bit more precise and say the topological structure tells us which subsets of points form the open sets, the sets that have no boundaries within them. The affine structure is about which lines are straight and which are curved. The diffeomorphic structure distinguishes smooth curves from bent ones (curves with a point at which no derivative exists).
If the space has a certain geometry, then the procedure of assigning numbers to time must reflect this geometry. For example, if event A occurs before event B, then the time coordinate of event A, namely t(A), must be less than t(B). If event B occurs after event A but before event C, then we should assign coordinates so that t(A) < t(B) < t(C).
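The ordering requirement can be stated as a simple check: a coordinate assignment is admissible only if it preserves the before-relation among events. Here is a minimal sketch with hypothetical events and coordinate values:

```python
def order_preserving(events, t):
    """Check that a time-coordinate assignment t respects the
    given before-ordering of events (listed earliest first)."""
    coords = [t[e] for e in events]
    return all(a < b for a, b in zip(coords, coords[1:]))

# Hypothetical assignment for three events, A before B before C.
t = {"A": 0.0, "B": 1.5, "C": 4.2}
print(order_preserving(["A", "B", "C"], t))  # True

# An assignment with t(B) < t(A) would fail the requirement.
bad = {"A": 2.0, "B": 1.5, "C": 4.2}
print(order_preserving(["A", "B", "C"], bad))  # False
```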
Consider a space as a class of fundamental entities: points. The class of points has “structure” imposed upon it, constituting it as a geometry—say the full structure of space as described by Euclidean geometry. [By assigning coordinates] we associate another class of entities with the class of points, for example a class of ordered n-tuples of real numbers [for an n-dimensional space], and by means of this “mapping” associate structural features of the space described by the geometry with structural features generated by the relations that may hold among the new class of entities—say functional relations among the reals. We can then study the geometry by studying, instead, the structure of the new associated system [of coordinates]. (Sklar 1976, p. 28)
But we always have to worry that there is structure among the numbers that is not among the entities numbered. Such structures are “mathematical artifacts.”
The goal in assigning coordinates to a space is to create a reference system; this is a reference frame plus (or that includes [the literature is ambiguous on this point]) a coordinate system. For 4D spacetime obeying special relativity with its Lorentzian geometry, a Lorentzian coordinate system is a grid of smooth timelike and spacelike curves on the spacetime that assigns to each point three space-coordinate numbers and one time-coordinate number. No two distinct points of the spacetime can have the same set of four coordinate numbers. Technically, being continuous is a weaker requirement than being smooth, but the difference is not of concern here.
As we get more global, we have to make adjustments. Consider two coordinate systems in adjacent regions. For the adjacent regions, we make sure that the ‘edges’ of the two coordinate systems match up in the sense that each point near the intersection of the two coordinate systems gets a unique set of four coordinates and that nearby points get nearby coordinate numbers. The result is an atlas on spacetime. Inertial frames can have global coordinate systems, but in general, we have to use atlases for more realistic frames. If we are working with general relativity where spacetime can curve and we cannot assume inertial frames, then the best we can do without atlases is to assign a coordinate system to a small region of spacetime where the laws of special relativity hold to a good approximation. General relativity requires special relativity to hold locally, that is, in any infinitesimal region, and thus for space to be Euclidean locally. That means that locally the 3-d space is correctly described by 3-d Euclidean solid geometry. Adding time is a complication. Spacetime is not Euclidean in relativity theory. Infinitesimally, it is Minkowskian or Lorentzian.
Regarding any event represented in the atlas, we demand that nearby events get nearby coordinates. When this feature holds everywhere, the coordinate assignment is said to be monotonic or to “obey the continuity requirement.” We satisfy this requirement by using real numbers as time coordinates.
To approach the question of the assignment of coordinates to spacetime points more philosophically, consider this challenging remark:
Minkowski, Einstein, and Weyl invite us to take a microscope and look, as it were, for little featureless grains of sand, which, closely packed, make up space-time. But Leibniz and Mach suggest that if we want to get a true idea of what a point of space-time is like we should look outward at the universe, not inward into some supposed amorphous treacle called the space-time manifold. The complete notion of a point of space-time in fact consists of the appearance of the entire universe as seen from that point. Copernicus did not convince people that the Earth was moving by getting them to examine the Earth but rather the heavens. Similarly, the reality of different points of space-time rests ultimately on the existence of different (coherently related) viewpoints of the universe as a whole. Modern theoretical physics will have us believe the points of space are uniform and featureless; in reality, they are incredibly varied, as varied as the universe itself.
—From “Relational Concepts of Space and Time” by Julian B. Barbour, The British Journal for the Philosophy of Science, Vol. 33, No. 3 (Sep., 1982), p. 265.
For a sophisticated and philosophically-oriented approach to assigning time coordinates to times, see Philosophy of Physics: Space and Time by Tim Maudlin, pp. 24-34.
21. How Do Dates Get Assigned to Actual Events?
For example, you might wonder how the ancients decided which event counts as starting a new day. They noticed that the shadow of the same tree or stick is shortest each day at a time we would now call “noon” in English. Is there some signal to indicate when a new year starts? Yes. The ancient Egyptians noticed that the Nile river flooded repeatedly in what came to be called in English “every year.” Their astronomers also noticed a positive correlation between this re-flooding and the rising of the bright star Sirius next to the Sun. They also noticed that this annual flooding happened every 365 days. Based on the regularity of this pattern of events, they devised a 365-day calendar in about 3100 BCE. Then they recorded during which year noteworthy events occurred.
If we generalize this process, we say they were producing a coordinate system for time. The purpose in choosing a coordinate system is to express time-order relationships (Did this event occur between those two or before them or after them?) and magnitude-duration relationships (How long after A did B occur?) and date-time relationships (When did event A itself occur?). The date of a (point) event is the time coordinate number of the spacetime coordinate of the event. We expect all these assignments of dates to events to satisfy certain mathematical requirements. For example, we require that event A happens before event B iff t(A) < t(B), where t(A) is the time coordinate of A, namely its date. The term “iff” abbreviates the expression “if and only if.” The assignments of dates to events also must satisfy the demands of our physical theories, and in this case we face serious problems involving inconsistency if a geologist gives one date for the birth of Earth, an astronomer gives a different date, and a theologian gives yet another date. We must either negotiate a reconciliation that removes the inconsistency or else begin talking of geological time, astronomical time, and theological time.
To continue with the more general process of assigning dates to events by constructing a coordinate system: ideally, we would like to partition the set of all actual events into simultaneity equivalence classes by some reliable method. All events in one equivalence class happen at the same time in the frame, and every event is in some class or other. This is the Leibnizian method of defining a simultaneity equivalence class. Since Einstein, the method of light cones is used instead, but that method is not explored in detail in this article.
These equivalence classes cannot be created perfectly for various reasons including spacetime curvature, but it is interesting to know how close we can come to doing it and how we should go about doing it. Here we face the problem of the relativity of simultaneity and the problem of the conventionality of simultaneity, as we shall see.
We will need to choose a standard clock and then consider how we calibrate and synchronize our own clock with the standard clock. Suppose we have already assigned a date of zero to the event that we choose to be at the origin of our coordinate system. To assign dates (that is, time coordinates) to other events, we must have access to information from the standard clock, our master clock, and be able to use this information to declare correctly that the time intervals between any two consecutive ticks of our own clock are the same. The second is our conventional unit of time measurement, and it is defined to be the duration required for a specific number of ticks of the standard clock. Minutes, weeks, years, and microseconds then can be defined in terms of the second.
We hope to synchronize other clocks with the standard clock so the clocks show equal readings at the same time. We cannot do this. What are the obstacles? An event occurs at some place. We want to know when. If there is a correct clock there, then the time or date at which a point-event occurs is the number reading on the clock at rest there. If there is no clock there, the assignment process is more complicated. One could try to transport a synchronized clock to that place, but any clock movement or influence by a gravitational field during the transport will cause time dilation and will need to be compensated for. If “that place” is across the galaxy, then any practical transport is out of the question, and other means must be used.
We want to use clocks to assign a time coordinate even to very distant events, not only to events in the immediate vicinity of the standard clock. The major difficulty is that two nearby synchronized clocks, namely clocks that have been calibrated and set to show the same time when they are next to each other, will not in general stay synchronized if one is transported somewhere else. If they undergo the same motions and gravitational influences, and thus have the same world line or time line, then they will stay synchronized; otherwise, they will not. There is no privileged transportation process that we can appeal to. Einstein offered a solution to this problem. He suggested the following method. Assume in principle that we have stationary, ideal clocks located anywhere, with timekeepers there who keep records and adjust clocks. Suppose we want to synchronize a distant clock, say an ideal clock infinitesimally near a spaceship, with the master clock back in London. Being stationary in the coordinate system implies the distant clock co-moves with the master clock. We need to establish that the two clocks remain the same distance apart, so how could we determine that they are stationary? We determine that, each time we send a light signal from London and bounce it off the distant clock, the roundtrip travel time remains constant. That procedure also can be used to synchronize the two clocks, or at least it can in a world that obeys special relativity, provided we know how far away the distant clock is. For example, suppose the spaceship is known to be a distance d away from London, and the roundtrip travel time is, say, 2t seconds. When someone at the spaceship receives a signal from London saying it is noon, the person at the spaceship sets their clock to t seconds after noon. This is an ideal method of establishing simultaneity for distant events.
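Einstein’s procedure amounts to simple arithmetic: the distant clock is set to the emission time plus half the measured roundtrip time. A minimal sketch with illustrative numbers (the 8-second roundtrip is hypothetical):

```python
def radar_sync(master_emission_time, roundtrip_time):
    """Einstein synchronization: the distant clock's reading on
    receiving the signal is the master clock's emission time plus
    half the (constant) roundtrip light-travel time."""
    return master_emission_time + roundtrip_time / 2.0

# Signal leaves London at noon (t = 0 s); the roundtrip takes
# 2t = 8 s, so the spaceship sets its clock to 4 s after noon.
print(radar_sync(0.0, 8.0))  # 4.0
```

The halving encodes the convention that light takes the same time in each direction, which is the point at issue in the debate over the conventionality of simultaneity.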
This method has some hidden assumptions that have not been mentioned. For more about this and about how to assign dates to distant events, see the discussions of the relativity of simultaneity and the conventionality of simultaneity.
As a practical matter, dates are assigned to events in a wide variety of ways. The date of the birth of the Sun is assigned very differently from dates assigned to two successive crests of a light wave in a laboratory laser. For example, there are lasers whose successive crests of visible light waves pass by a given location in the laboratory every 10⁻¹⁵ seconds. This short time is not measured with a stopwatch; it is computed from measurements of the light’s wavelength. We rely on electromagnetic theory for the equation connecting the period of the wave to its wavelength and speed. Dates for other kinds of events, such as the birth of Mohammad or the origin of the Sun, are computed from historical records rather than directly measured with a clock.
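The computation from wavelength to period uses the relation T = λ/c. A minimal sketch (the 500 nm wavelength is an illustrative value, not one from the text):

```python
c = 299_792_458.0  # speed of light in vacuum, m/s

def period_from_wavelength(wavelength_m):
    """Period T = wavelength / c for a light wave in vacuum."""
    return wavelength_m / c

# Green light at roughly 500 nm:
T = period_from_wavelength(500e-9)
print(f"{T:.2e}")  # 1.67e-15, i.e. on the order of a femtosecond
```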
22. What Is Essential to Being a Clock?
“Clock” is not a precisely defined term in science, but normally a clock is used for one of three purposes: to tell what time it is, to determine which of two events happened first, and to decide how long an event lasts. In order to do this, the clock needs three sub-systems: (1) fairly regular ticking, (2) the counting of those ticks, and (3) conversion of this count into a report of what time it is presently.
Regarding (1), the goal in building the ticking sub-system is to have a tick rate that is stable for a long enough time to be useful. Stability implies regularity of ticking in the sense of not drifting very much over time. The tick rate in clocks that use cyclic processes is called the clock’s “frequency,” and it is measured in cycles per second. So, a clock is intended to have a sufficiently stable frequency for a reasonably continuous period of time. If a clock is stable for only half a second, but you intend to use it to measure events that last a minute, then it is probably not stable enough.
Regarding (2), the counting sub-system counts the ticks in order to measure how much time has elapsed between two events of interest. All other things being equal, the higher the frequency of our best clocks the better. Earth rotations have a low frequency. Pendulums are better. With a quartz clock (used in all our computers and cellphones), a piece of quartz crystal is stimulated with a voltage in order to cause it to vibrate at its characteristic frequency, usually 32,768 cycles per second. So, when 32,768 ticks occur, the quartz clock advances its count of seconds by one. Our civilization’s standard atomic clock ticks at a frequency of 9,192,631,770 ticks per second. After that many ticks, it advances its count by one second.
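The conversion from tick counts to elapsed seconds is simple division by the clock’s frequency. A minimal sketch using the frequencies quoted above:

```python
def elapsed_seconds(tick_count, ticks_per_second):
    """Convert a raw tick count into elapsed seconds."""
    return tick_count / ticks_per_second

QUARTZ_HZ = 32_768          # typical quartz-crystal frequency
CESIUM_HZ = 9_192_631_770   # cesium standard-clock frequency

print(elapsed_seconds(65_536, QUARTZ_HZ))         # 2.0
print(elapsed_seconds(9_192_631_770, CESIUM_HZ))  # 1.0
```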
The longer a clock can tick without gaining or losing a second, the more useful it is for many measurements made by physicists, such as measuring whether supposed universal constants do not vary or drift.
While the definition of a second has stayed the same since 1967, the technology of atomic clocks has not. The best atomic clocks are predicted to be off by less than a second in 31 billion years. The cesium atomic clock of 1967 drifted quite a bit more. Nuclear clocks of the future will depend on transitions between energy levels of an atom’s nucleus instead of its outer electrons. Nuclear clocks are predicted to be off by only a second in 300 billion years. That is why the world’s 1967 time-standard using cesium atomic clocks is likely to be revised in the 21st century.
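The quoted drift rates can be restated as fractional errors, that is, seconds gained or lost per second elapsed. A minimal sketch using the figures from the text (the 365.25-day year is an assumption of the calculation):

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.16e7 seconds

def fractional_error(seconds_off, years):
    """Seconds gained or lost per second of elapsed time."""
    return seconds_off / (years * SECONDS_PER_YEAR)

best_atomic = fractional_error(1, 31e9)   # roughly 1e-18
nuclear     = fractional_error(1, 300e9)  # roughly 1e-19
print(best_atomic < 1e-17 and nuclear < best_atomic)  # True
```

On this arithmetic, the projected nuclear clocks would be roughly ten times more stable than today’s best atomic clocks.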
Expressed a bit technically in the language of relativity theory, what a clock does measure is its own “proper” time along its trajectory in spacetime. An object’s world-line is its trajectory through spacetime. Commenting on this, the philosopher Tim Maudlin said:
An ideal clock is some observable physical device by means of which numbers can be assigned to events on the device’s world-line, such that the ratios of differences in the numbers are proportional to the ratios of interval lengths of segments of the world-line that have those events as endpoints.
So, for example, if an ideal clock somehow assigns the numbers 4, 6, and 10 to events p, q, and r on its world-line, then the ratio of the length of the segment pq to the segment qr is 1:2, and so on. (Maudlin 2012, 108).
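Maudlin’s proportionality condition can be checked on his own example readings. A minimal sketch:

```python
def interval_ratio(readings):
    """Given three ideal-clock readings at events p, q, r on the
    clock's world-line, return the ratio of the segment lengths
    pq : qr implied by Maudlin's proportionality condition."""
    p, q, r = readings
    return (q - p) / (r - q)

# Maudlin's example: readings 4, 6, 10 at events p, q, r.
print(interval_ratio([4, 6, 10]))  # 0.5, i.e. the ratio 1:2
```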
A clock’s ticking needs to be a regular process, but not necessarily a repeatable process. There are two very different ways to achieve a clock’s regular ticking. The most important way is by repetition, namely by cyclic behavior. The goal is that any one cycle lasts just as long as any other cycle, which implies the durations between any pair of successive ticks are congruent. This point is sometimes expressed by saying the clock’s frequency should be constant.
A second way for a clock to contain a regular process or stable ticking is very different, and it does not require there to be any cycles or repeatable process. A burning candle can be the heart of a clock in which duration is directly correlated with, and measured by, how short the candle has become since the burning began. Two ideal candles will regularly burn down the same distance over the same duration. There will be a regular rate of burning, but no cyclic, repeatable burning because, once some part of the candle has burned, it no longer exists to be burned again. This candle timer is analogous to the behavior of sub-atomic ‘clocks’ based on radioactive decay that are used for carbon dating of ancient trees and mammoths.
A daily calendar alone is not a clock unless it is connected to a regular process. It could be part of a clock in which daily progress along the calendar is measured by a process that regularly takes a day per cycle, such as the process of sunrise followed by sunset. A pendulum alone is not a clock because it has no counting mechanism. Your circadian rhythm is often called your biological clock, because it produces a regular cycle of waking and sleeping, but it is not a complete clock because there is no counting of the completed cycles. A stopwatch is not a clock. It is designed to display only the duration between when it is turned on and turned off. But it could easily be converted into a clock by adding a counting and reporting mechanism. Similarly for radioactive decay that measures the time interval between now and when a fossilized organism last absorbed Earth’s air.
Here are some examples of cyclical processes that are useful for clocks: the swings of a pendulum, repeated sunrises, cycles of a shadow on a sundial, revolutions of the Earth around the Sun, bouncing mechanical springs, and vibrations of a quartz crystal. Regularity of the repetitive process is essential because we want a second today to be equal to a second tomorrow, although as a practical matter we have to accept some margin of error or frequency drift. Note that all these repetitive processes for clocks are absolute physical quantities in the sense that they do not depend upon assigning any coordinate system, nor are they dependent on any process occurring in a living being, including any thought.
The larger enterprise of practical time-keeping for our civilization requires that clock readings be available at locations of interest, including onboard our spaceships and inside submarines. This availability can be accomplished in various ways. A standard clock sitting in a room in Paris is a practical standard only if either its times can be broadcast quickly to the desired distant location, or the clock can be copied and calibrated so that the copies stay adequately synchronized even though they are transported to different places. If the copies cannot always stay sufficiently synchronized (calibrated) with the standard clock back in Paris, then we need to know how we can compensate for this deviation from synchrony.
The count of a clock’s ticks is normally converted and displayed in seconds or in some other unit of time such as minutes, nanoseconds, hours, or years. This counting of ticks can be difficult. Our civilization’s 1967 standard clock is a cesium atomic clock that ticks 9,192,631,770 times per second. Nobody sat down for a second and counted this number; an indirect procedure was required.
It is an arbitrary convention that we design clocks to count up to higher numbers rather than down to lower numbers. It is also a convention that we re-set our clocks by one hour as we move across a time zone on the Earth’s surface so that the sun is nearly overhead at noon in each zone. In order to keep the calendar aligned with the seasons, we also add leap days in leap years. However, it is no convention that the duration from instantaneous event A to instantaneous event B plus the duration from B to instantaneous event C is equal to the duration from A to C. It is one of the objective characteristics of time, and failure for this to work out numerically for your clock is a sure sign your clock is faulty.
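The additivity requirement can serve as a consistency test on a clock’s duration reports. A minimal sketch with hypothetical measured durations:

```python
def clock_consistent(d_ab, d_bc, d_ac, tol=1e-6):
    """A clock's separately measured durations must satisfy
    d(A,B) + d(B,C) = d(A,C); a failure signals a faulty clock."""
    return abs(d_ab + d_bc - d_ac) < tol

# Hypothetical duration reports, in seconds:
print(clock_consistent(3.0, 6.0, 9.0))  # True: the clock passes
print(clock_consistent(3.0, 6.0, 9.5))  # False: the clock is faulty
```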
A clock’s ticking needs to be a practically irreversible process. Any clock must use entropy increase in quantifying time. Some entropy must be created to ensure that the clock ticks forward and does not suffer a fluctuation that causes an occasional tick backward. The more entropy produced the less likely such an unwanted fluctuation will occur.
In addition to our clocks being regular and precise, we also desire our clocks to be accurate. What that means and implies is discussed in the next section.
23. What Does It Mean for a Clock to Be Accurate?
A group of clock readings is very precise if the readings are very close to each other, even if they are all inaccurate; for example, they might all report that it is 12:08 when actually it is noon.
A clock is accurate if it reports the same time as the standard clock. A properly working clock correctly measures the interval along its own trajectory in spacetime, its so-called proper time. The interval in spacetime is the spatio-temporal length of its trajectory, so a clock is analogous to an odometer for spacetime. Just as a car’s odometer can give a different reading for the distance between two locations if the car takes a different route between two locations, so also a properly working clock can give different measures of the duration of time between two events if the clock takes different spacetime trajectories between them. That is why it is easiest to keep two clocks in synchrony if they are sitting next to each other, and that is why it is easiest to get an accurate measure of the time between two events if they occur at the same place.
Because clocks are intended to be used to measure events external to themselves, a goal in clock building is to ensure there is no difficulty in telling which clock tick is simultaneous with which external event. For most nearby situations and nearby clocks and everyday purposes, the sound made by the ticking helps us make this determination. We hear the tick just as we hear or see the brief event occur that we wish to “time.” Humans actually react faster to what they hear than what they see. Trusting what we see or hear presupposes that we can ignore the difference in time between when a sound reaches our ears and when it is consciously recognized in our brain; and it presupposes that we can safely ignore the difference between the speed of sound and the speed of light.
If a clock is synchronized with the standard clock, works properly, and has the same trajectory in spacetime as the standard clock, then it will remain accurate, that is, stay in synchrony with the standard clock. According to the general theory of relativity, if a clock takes a different trajectory from the standard clock, then its readings will deviate from those of the standard clock, and when the second clock is brought back to be adjacent to the standard clock, the two will give different readings of what time it is. That is, if your well-functioning clock were at rest adjacent to the standard clock, and the two were synchronized, then they would stay synchronized. But if your clock moved away from the standard clock and took a different path through space, then the two would not give the same readings when they were reunited, even though both continued to be correct clocks. This complicates the question of whether a clock that is distant from the standard clock is telling us standard time. To appreciate the complication, ask yourself: When our standard clock shows noon today, what event within a spaceship on Mars occurs simultaneously? Or: How do you “set” the correct time on the Mars clock?
There is an underlying philosophical problem and a psychological problem. If we assign a coordinate system to spacetime, and somehow operationally define what it is for a clock at one place to be in synch with a clock at another place, then we can define distant simultaneity in that coordinate system. However, whether the readings of spatially separated clocks are simultaneous is a coordinate-dependent artifact. Even when people understand this philosophical point, which arises because of the truth of the theory of relativity, they still seem unable to resist the temptation to demand a uniquely correct answer to the question “What event on a spaceship circling Mars is simultaneous with noon today here on Earth?” and unable to appreciate that this notion of simultaneity is a convention that exists simply for human convenience.
The quartz clock in your cellphone drifts, losing about a second every day or two, so it needs to be repeatedly “reset” (that is, restored to synchrony with our society’s standard clock).
Our best atomic clocks need to be reset by one second every 100 million years.
Suppose we ask the question, “Can the time shown on a properly functioning standard clock ever be inaccurate?” The answer is “no” if the target for proper functioning is synchrony with the current standard clock, as the conventionalists believe, but the answer is “yes” if there is another target. Objectivists can propose at least three other distinct targets: (1) synchrony with absolute time (as Isaac Newton proposed in the 17th century), (2) synchrony with the best possible clock, and (3) synchrony with the best-known clock. We do not have a way of knowing whether our current standard clock is close to target 1 or target 2. But if the best-known clock is known to not yet have been chosen to be the standard clock, then the current standard clock can be inaccurate in sense 3 and perhaps it is time to call an international convention to discuss adopting a new time standard.
Practically, a reading of ‘the’ standard clock is a report of the average value of many conventionally-designated standard clocks, hundreds of them distributed around the globe. Any one of these clocks can fail to stay in sync with the average, and when this happens it is reset (that is, recalibrated to the average reading of the other standard clocks). In the 21st century, this resetting occurs about once a month to restore accuracy.
Physicists do not have an answer to the question, “How precise and accurate can a future clock ever be?” However, there is a physical limit to the shortest duration measurable by a given clock because no clock can measure events whose duration is shorter than the time it takes a signal to travel between the components of that clock, the components in the part that generates the regular ticks. This theoretical limit places a lower limit on the margin of error of any measurement of time made with that clock.
Every physical motion of every clock is subject to disturbances. So, we want to minimize the disturbance, and we want our clock to be adjustable in case it drifts out of synchrony a bit. To achieve this goal, it helps to keep the clock isolated from environmental influences such as heat, dust, unusual electromagnetic fields, physical blows (such as dropping the clock), immersion in liquids, and differences in gravitational force. And it helps to be able to predict how much a specific influence affects the drift out of synchrony so that there can be an adjustment for this influence.
Sailors can use clocks to discover the longitude of where they are in the ocean. Finding a sufficiently accurate clock was how 18th and 19th century sailors eventually were able to locate themselves when they could not see land. At sea at night, the angle of the North Star above the horizon is their latitude. Without a clock, they had no way to determine their longitude except by dead reckoning, which is very error-prone. A pendulum clock does not work well when the sea is not smooth. If they had an accurate mechanical clock with them that wasn’t affected by choppy seas, they could use it to find their longitude. Ideally, before setting sail away from the home port, they would synchronize their clock with the standard clock that sits at zero degrees longitude. Out on the ocean or on some island, this clock would tell them the time back at zero degrees longitude. Then at sea on a particular day, the sailors could wait until the Sun was at its highest point and know the local time was 12 noon. If at that moment their clock read 1500 (that is, 3:00 P.M.), then they would know that their local noon came three hours after noon at zero degrees longitude. Because the Earth turns through 360 degrees of longitude every day, and thus 15 degrees every hour, the sailors could compute that they were 3 × 15 = 45 degrees west of zero degrees longitude, namely at 45 degrees west longitude. Knowing both their latitude and longitude, they could use a map to locate themselves. The first reasonably reliable mechanical clock that could be used to do this, that is, to measure longitude at sea, was invented by British clock maker John Harrison in 1727. It was accurate to one second a month. When mariners adopted similarly accurate mechanical clocks, the number of ships per year that crashed into rocks plummeted.
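The arithmetic in this example can be sketched in a few lines of code. This is a minimal illustration; the function name and sign convention are ours, not a navigational standard, though the underlying fact (local noon that comes after home-port noon places the ship west of the prime meridian) is standard.

```python
# Earth turns 360 degrees of longitude in 24 hours: 15 degrees per hour.
DEGREES_PER_HOUR = 360 / 24

def longitude_from_home_clock(home_time_at_local_noon):
    """Degrees of longitude, positive meaning west of zero degrees,
    given the home-port (zero-longitude) clock reading, in hours,
    at the moment of local solar noon."""
    return (home_time_at_local_noon - 12) * DEGREES_PER_HOUR

print(longitude_from_home_clock(15))  # home clock reads 3:00 P.M. -> 45.0 (west)
```

A ship whose local noon coincides with a home-clock reading of exactly 12:00 is, by the same rule, at zero degrees longitude.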
24. What Is Our Standard Clock or Master Clock?
Our civilization’s standard clock or master clock is the clock that other clocks are synchronized with. It reports ‘the correct time’ because our civilization agrees that it does. This standard clock, at the beginning of the second quarter of the 21st century, is a designated cesium atomic clock in Paris, France. Your cell phone synchronizes its internal clock with this standard clock about once a week.
More specifically, the standard clock reports the proper time for the Royal Observatory, Greenwich, in London, England, which sits at zero degrees longitude (the prime meridian), even though the report of its time is created in a laboratory near Paris. The report is the result of a computational average of reports supplied from a network of many designated atomic clocks situated around the globe.
a. How Does an Atomic Clock Work?
First, a one-paragraph answer to this question. Then a much more detailed answer and explanation.
An atomic clock is a very regular clock that measures the time taken for a fixed number of jumps by electrons between energy levels in atoms. There are many kinds of atomic clocks, but the kind adopted worldwide in 1967 for Coordinated Universal Time relies on the very regular behavior of the cesium-133 atom. What is regular is the frequency of the microwave radiation that a maser must direct at cesium atoms in a vacuum chamber in order to stimulate the cesium’s outer electrons to jump to the next higher orbit and then drop back to the original orbit while radiating a specific color. The oscillation of the wave in the maser is analogous to the swing of a tiny pendulum.
Resonance occurs when the cesium isotope’s atoms are stimulated by a special incoming microwave frequency that makes them shine brightly with the expected color. Each element and each transition is associated with a characteristic color or “resonant frequency.” Cesium-133 is especially good for this process because, at other frequencies, the isotope will radiate very little compared to what it can radiate if properly stimulated. The stimulation causes the outer electron to transition from a low-energy ground state to the next higher-energy “hyperfine” ground state and then to fall back down again to the original state while emitting the same microwave frequency that caused it to bump up in the first place. The oscillation or “waving” of this incoming and outgoing radiation from the maser is the ticking of the clock. Counting those ticks tells us the time. A count of 9,192,631,770 was defined to be one standard second. Nobody sits by the clock and counts the waves flying in and out of the cesium in a cesium clock, but the engineering details of counting are not discussed here.
Pendulum clocks work by counting swings of the pendulum. Quartz clocks work by counting the shakes of a small piece of quartz crystal set in motion when electrical voltage is applied to it. Astronomical clocks count rotations or revolutions of the Earth. Atomic clocks work by producing a wave process such as a microwave, and counting a specific number of those waves that pass by a single point in space, then declaring that the time taken for this many waves is one second.
The key idea for all objects that deserve to be called “clocks” is that they can be relied upon to produce nearly the same, fixed number of ticks per second. Call that number n. So, for every occurrence of n oscillations, the clock reports that a second has passed. For every 60n oscillations, it reports a minute has passed. For every 60(60n) oscillations it reports an hour, and so forth. The frequency (or, equivalently, the number of oscillations per second) is the clock’s rate of ticking. If the frequency doesn’t drift very much, it is called a “stable” frequency. The more stable the better. The reason why all the above clocks successfully work as clocks is that they can produce relatively stable frequencies compared to those of the rest of the universe’s processes.
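The counting scheme just described can be sketched as follows, using the cesium value of n. The function is an illustrative toy, not how real clock electronics work:

```python
N = 9_192_631_770  # oscillations per second, n for the cesium standard

def report(tick_count, n=N):
    """Convert a raw oscillation count into whole (hours, minutes, seconds)."""
    total_seconds = tick_count // n           # every n oscillations is one second
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds

print(report(60 * N))  # one minute's worth of oscillations -> (0, 1, 0)
```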
The advantage of using an atomic clock that relies on a single, specific isotope is that (1) all its atoms behave exactly alike, so the atomic clock’s ticking is very regular compared to any non-atomic clock, (2) it ticks at a very fast rate (high frequency) so it is useful for measurements of events having a very brief duration, (3) the clock can easily be copied and constructed elsewhere, (4) the clock is not easily perturbed by environmental influences, and (5) there is no deep mystery about why it is a better master clock than other candidates for a master clock.
An atomic clock’s stable frequency is very easy to detect because the isotope “fluoresces” or “shines” or “resonates” in a characteristic, easily-detectable narrow band of frequencies (that is, colors) when its electrons are stimulated or excited to rise to a higher energy level and then spontaneously cascade back to the original energy level while emitting the characteristic color. That is, when the cesium is properly stimulated, it shines or resonates with a single, known color. Its frequency distribution has a very narrow central peak that clearly differs from the peaks of radiation that can be produced by electron transitions between all other energy levels in the same isotope. By the way, no radioactivity is involved in an atomic clock.
In 1879, James Clerk Maxwell was the first person to suggest using the frequency of atomic radiation as a kind of invariant natural pendulum. This remark showed great foresight because it was made before the rest of the physics community had yet accepted the existence of atoms.
The key innovation that led to atomic clocks came from American physicist Isidor Isaac Rabi of Columbia University, who won the Nobel Prize in Physics in 1944 for developing a way to precisely measure atoms’ resonance frequencies. His technique, called the molecular-beam magnetic resonance method, finely tuned a radio frequency to cause atoms’ quantum states to transition. In 1939 Rabi suggested using this method to build a clock, and the next year his colleagues at Columbia applied his technique to determine the resonance frequency of cesium (Jay Bennett, “Redefining Time,” Scientific American 2025, p. 62).
A cesium atomic clock was adopted in 1967 as the world’s standard clock, and it remains the standard in the 2020s. At the 1967 convention, physicists agreed that when 9,192,631,770 cycles of microwave radiation in the clock’s special, characteristic process are counted, then the atomic clock should report that a duration of one atomic second has occurred.
What is this mysterious “special, characteristic process” in cesium clocks that is so stable? This question can be answered assuming every cesium atom behaves according to the Bohr model of atoms. The model is easy to visualize, but it provides a less accurate description than does a description in terms of quantum theory. However, quantum theory is more difficult to understand, so mention of it is minimized in this article.
Every atom of a single isotope behaves just like every other, unlike two manufactured pendulums or even two rotations of the Earth. It is not that every atom of an isotope is in the same position or has the same energy or the same velocity, but rather that, apart from those properties, the atoms in the cesium are all alike. Cesium is very similar to mercury, but it is the most chemically reactive metal, so builders of atomic clocks must be careful to keep the cesium away from all other kinds of atoms.
An atom’s electrons normally stay in orbit and don’t fly away, nor do they crash into the nucleus. Electrons stay in their orbits until perturbed, and each orbit has a characteristic energy level, a specific value of its energy for any electron in that orbit. When stimulated by incoming electromagnetic radiation, such as from a laser or maser, the electrons can absorb the incoming radiation and transition to higher, more energetic orbits. Which orbit the electron moves to depends on the energy of the incoming radiation that it absorbs. Higher orbits are orbits that are more distant from the nucleus. Also, an electron orbiting in a higher, more energetic orbit is said to be excited because it might emit some radiation spontaneously and transition into one of the lower orbits. There are an infinite number of energy levels and orbits, but they do not differ continuously. They differ by discrete steps. The various energies that can be absorbed and emitted are unique to each isotope of each element. Examining the various frequencies of the emitted radiation of an object gives sufficient information to identify which isotope and element is present. Ditto for the signature of the absorption frequencies. Famously, finding the frequency signature for helium in sunlight was the first evidence that there was helium in the Sun and not only on Earth.
A cesium atom’s outer electron shell contains only a single electron, making the atom especially responsive to incoming microwave radiation. To take advantage of this feature in a cesium atomic clock, an outer electron in its lowest-energy orbit around the cesium-133 nucleus is targeted by incoming microwave radiation from the atomic clock’s maser. Doing so makes the electron transition to a higher energy orbit around the cesium nucleus, thus putting the electron into an “excited” state. Properly choosing the frequency of the incoming radiation that hits the target cesium (called successfully “tuning” the maser) can control which orbit the electron transitions to. Tuning the maser is a matter of controlling its frequency with a feedback loop that keeps it generating the desired, stable frequency. Initially, the cesium is heated to produce a vapor or gas; then the cesium atoms are cooled as a group to reduce their kinetic energy; and then they are magnetically filtered to select only the atoms whose outer electrons are in the lowest possible energy state.
Our Bohr model supposes, following a suggestion from Einstein, that any electromagnetic wave such as a light wave or a microwave or a radio wave can just as well be considered to be composed of small, discrete particle-like objects called photons. The photon’s energy is directly correlated with the wave’s frequency—higher energy photons correspond to higher frequency waves. If a photon of exactly the right energy from the maser arrives and hits a cesium atom’s electron, the electron can totally absorb the photon by taking all its energy and making the electron transition up to a higher energy level. Energy is conserved during absorption and emission.
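The direct correlation between a photon’s energy and its wave’s frequency is Planck’s relation E = hf, where h is Planck’s constant. As a quick check of scale (our arithmetic, not a figure from the source), the photon that drives the cesium clock transition carries a minuscule energy:

```python
PLANCK_H = 6.62607015e-34   # Planck's constant in joule-seconds (exact since 2019)
CESIUM_F = 9_192_631_770    # frequency of the cesium clock transition, in hertz

photon_energy = PLANCK_H * CESIUM_F  # E = h * f, roughly 6.1e-24 joules
```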
Later, the electron in a higher, excited state might spontaneously fall back down to one of the various lower energy levels, while emitting a photon of some specific frequency. The value of that frequency is determined by the energy difference in the two energy levels of the transition. If it is still in an excited state, the (or an) electron might spontaneously fall again to an even lower energy level, and perhaps cascade all the way down to the lowest possible energy level. There is an infinite number of energy levels of any atom, so potentially there is an infinite number of frequencies of photons that can be absorbed and an infinite number of frequencies of photons that can be emitted in the transitions. There are an infinite number, but not just any number, because the frequencies or energies differ in small, discrete steps from each other.
If the electron in a specific energy level were hit with a sufficiently energetic incoming photon, the electron would fly away from the atom altogether, leaving the atom ionized.
For any atom of any isotope of any element with its outer electron in its lowest ground state, there is a characteristic, unique energy value for that state. There is also a characteristic minimum energy for an incoming photon to be able to knock the outer electron up to the very next higher level and no higher, and this is the same energy (or frequency) that is emitted when that higher-level electron spontaneously transitions back to the lowest level. This ground-state behavior of transitioning to the next higher level and back down again is the key behavior of an atom that is exploited in the operation of an atomic clock.
In a cesium atomic clock using the isotope 133Cs, the cesium gas is cooled and manipulated so that nearly all its atoms are in their unexcited, lowest ground state. This manipulation exploits the fact that atoms in the two different states have different magnetic properties, so they can be separated magnetically. Then the maser’s frequency is tuned until it is able to knock the outer electrons from their ground state up to the next higher hyperfine energy state (but no higher), so that the excited electrons then transition back down spontaneously to the ground level and produce radiation of exactly the same frequency as that of the maser. That is, the target cesium shines or fluoresces with the same frequency it was bombarded with. When this easily-detectable fluorescence occurs, the counting can begin, and the clock can measure elapsed time.
For more details on how an atomic clock works, see (Gibbs, 2002) and (Bennett 2025).
b. How Do We Find and Report the Standard Time?
If we were standing next to the standard clock, we could find the standard time by looking at its display. Almost all countries use a standard time report called Coordinated Universal Time; other names for it are UTC and Zulu Time. It once was named Greenwich Mean Time (GMT). Some countries prefer their own, different name.
How we find out what time it is when we are not next to the standard clock is quite complicated. First, ignoring the problems of time dilation and the relativity of simultaneity raised by Einstein’s theory of relativity that are discussed above, let’s consider the details of how standard time is reported around the world for the vast majority of countries. The international standard time that gets reported is called U.T.C. time, from the initials of the French name for Coordinated Universal Time. The report of U.T.C. time is based on computations and revisions made from the time reports of the Atomic Time (A.T.) of many cesium clocks in many countries.
U.T.C. time is, by agreement, the time at zero degrees longitude. This longitude is an imaginary great circle that runs through the North Pole, the South Pole, and a certain astronomical observatory in London, England, although the report itself is produced near Paris, France. This U.T.C. time is used by the Internet and by the aviation industry throughout the world. Different geographical regions of the world have their own time because the world is divided into time zones, approximately by longitude. Usually a time zone differs by one hour from its neighboring zone.
U.T.C. time is produced from T.A.I. time by offsetting it by an appropriate integral number of leap seconds, which are inserted as needed. T.A.I. time is computed from a variety of reports received of A.T. time (Atomic Time), the time of our standard, conventionally-designated cesium-based atomic clocks. All A.T. times are reported in units called S.I. seconds. A.T. time produces T.A.I. time, which produces U.T.C. time.
An S.I. second (that is, a Système International second or a second of Le Système International d’Unités) is defined to be the numerical measure of the time it takes for the motionless (motionless relative to the Greenwich-London observatory), designated, master cesium atomic clock to emit exactly 9,192,631,770 cycles of radiation. The number “9,192,631,770” was chosen rather than some other number by vote at an international convention for the purpose of making the new second be as close as scientists could come to the duration of what was called a “second” back in 1957 when the initial measurements were made on cesium-133 using the best solar-based clocks available then.
The T.A.I. scale from which U.T.C. time is computed is the average of the reports of A.T. time from about 200 designated cesium atomic clocks that are distributed around the world in about fifty selected laboratories, all reporting to Paris. One of those laboratories is the National Institute of Standards and Technology (NIST) in Boulder, Colorado, U.S.A. The calculated average time of the 200 reports is the T.A.I. time, the abbreviation of the French phrase for International Atomic Time. The International Bureau of Weights and Measures (BIPM) near Paris performs the averaging about once a month. If your designated laboratory in the T.A.I. system sent in your clock’s reading for a certain specified event that occurred in the previous month, then in the present month the BIPM calculates the average of all 200 reported clock readings and sends you a notice of how far your report deviated from the average, so you can reset (that is, adjust) your atomic clock in the hope that it will agree better with next month’s average. Time physicists follow the lead of their designated clocks because there is nothing better to follow.
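The monthly averaging-and-correction cycle can be sketched as follows. The laboratory names and readings here are invented for illustration, and the real BIPM computation is a weighted, far more sophisticated average:

```python
# Each laboratory's clock reading (in seconds) for the same agreed-upon event.
lab_reports = {
    "lab_A": 100.0000003,
    "lab_B":  99.9999998,
    "lab_C": 100.0000001,
}

# The T.A.I.-style value is the average over the whole ensemble of clocks.
average = sum(lab_reports.values()) / len(lab_reports)

# Each lab is told how far its clock strayed from the average, so it can
# steer its clock back toward the ensemble before next month's average.
corrections = {lab: reading - average for lab, reading in lab_reports.items()}
```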
A.T. time, T.A.I. time, and U.T.C. time are not kinds of physical time but rather are kinds of reports of physical time.
In the 17th century, Christiaan Huygens recommended dividing a solar day into 24 hours, each hour into 60 minutes, and each minute into 60 seconds, making a second be 1/86,400 of a solar day. This is called Universal Time 1, or UT1. This is rotational time. Subsequently, the second was redefined by declaring there to be 31,556,925.9747 seconds in the tropical year 1900. At the 13th General Conference on Weights and Measures in 1967, the definition of a second was changed again, to a specific number of periods of radiation produced by a standard cesium atomic clock (actually, the average of about 200 standard atomic clocks). This second is the so-called standard second, or S.I. second. It is defined to be the duration of 9,192,631,770 periods (cycles, oscillations, vibrations) of a certain kind of microwave radiation absorbed in the standard cesium atomic clock. More specifically, the second is defined to be the duration of exactly 9,192,631,770 periods of the microwave radiation required to produce the maximum fluorescence of a small gas cloud of cesium-133 atoms as the single outer-shell electron in these atoms transitions between two specific energy levels of the atom. This is the internationally agreed-upon unit for atomic time in the T.A.I. system. In 1967 the atomic clocks were accurate to one second every 300 years. The accuracy of atomic clocks has subsequently become very much better.
All metrologists expect there to be an eventual change in the standard clock by appeal to higher frequency clocks. The higher ticking rate is important for many reasons, one of which is that the more precise the clock that is used the better physicists can test the time-translation invariance of the fundamental laws of physics, such as checking whether the supposed constants of nature do in fact stay constant over time.
Leap years (with their leap days) are adjustments to the calendar, needed because a year does not last a whole number of days; without the extra day added every four years, the calendar dates would slowly drift through the seasons. Leap seconds are a different adjustment. The Earth is spinning slower every day, but not uniformly, and without an adjustment the time called “midnight” eventually would drift into the daylight. The effect on the rotational period is not practically predictable, so a leap second is introduced (or would be removed) as needed whenever the standard atomic clock gets ahead of (or behind) the old astronomical clock (Universal Time, UT1) by more than 0.9 seconds.
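The leap-second rule just stated can be written as a toy decision procedure. The 0.9-second threshold is the figure given above; everything else is an illustrative simplification of what the timekeeping agencies actually do:

```python
THRESHOLD = 0.9  # seconds of divergence that triggers an adjustment

def leap_second_adjustment(ut1_minus_atomic):
    """+1 means insert a leap second into the broadcast atomic time scale,
    -1 means remove one (never yet needed), 0 means do nothing."""
    if ut1_minus_atomic <= -THRESHOLD:
        return +1   # atomic time has run ahead of the Earth's rotation
    if ut1_minus_atomic >= THRESHOLD:
        return -1   # the Earth's rotation has run ahead of atomic time
    return 0

print(leap_second_adjustment(-0.95))  # 1
```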
The meter depends on the second, so time measurement is more basic than space measurement. It does not follow from this, though, that time itself is more basic than space. In 1983, scientists agreed that the meter is how far light travels in 1/299,792,458 of a second in a vacuum. This conventional number was chosen for four reasons: (i) choosing the number 299,792,458 made the new meter very close to the old meter, which was once defined as the distance between two specific marks on a platinum bar kept in the Paris Observatory; (ii) light propagation is very stable or regular: its speed is either constant, or, when not constant, we know how to compensate for the influence of the medium; (iii) a light wave’s frequency can be made extremely stable (that is, it drifts very little); and (iv) distance cannot be measured more accurately in other ways; using the platinum bar in Paris is a less accurate means of measuring distance.
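The 1983 definition makes distance a derived quantity: a meter is whatever distance light covers in that fixed fraction of a second. The arithmetic (ours, for illustration) is a one-liner:

```python
C = 299_792_458  # speed of light in meters per second, exact by definition

travel_time = 1 / C          # seconds for light to cross one meter
distance = C * travel_time   # recovers (up to floating-point rounding) 1.0 meter
```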
Time can be measured more accurately and precisely than distance, voltage, temperature, mass, or anything else.
So why bother to improve atomic clocks? The duration of the second can already be measured to 14 or 15 decimal places, a precision 1,000 times that of any other fundamental unit. One reason to do better is that the second is increasingly the fundamental unit. Three of the six other basic units—the meter, lumen and ampere—are defined in terms of the second. (Gibbs, 2002)
One philosophical implication of the standard definition of the second and of the meter is that they fix the numerical value of the speed of light in a vacuum in all inertial frames. The speed is exactly 299,792,458 meters per second. There can no longer be any direct measurement to check whether that is how fast light really moves; it is defined to be moving that fast. Any measurement that produced a different value for the speed of light is presumed to have an error. The error would be in accounting for the influence of gravitation and acceleration, or in its assumption that the light was moving in a vacuum. This initial presumption of where the error lies comes from a deep reliance by scientists on Einstein’s general theory of relativity. However, if it were eventually decided by the community of scientists that the speed of light should not have been fixed as it was, then the scientists would call for a new world convention to re-define the second or the meter.
25. Why Are Some Standard Clocks Better than Others?
Other clocks ideally are calibrated by being synchronized to “the” standard clock, our master clock. It is normally assumed that the standard clock is the most reliable and regular clock. Vibrations in atomic radiation are the most stable periodic events that scientists have been able to agree to use for clock building. Physicists have chosen the currently-accepted standard clock for two reasons: (1) they believe it will tick very regularly in the sense that all periods between adjacent ticks are sufficiently congruent, that is, have the same duration; and (2) there is no better choice of a standard clock. Choosing a standard clock that is based on the beats of a president’s heart would be a poor choice because clocks everywhere would suddenly and mysteriously get out of synchrony with the standard heartbeat clock when the president goes jogging.
So, some choices of standard clock are better than others. Some philosophers of time believe one choice is better than another because the best choice is closest to a clock that tells what time it really is. Most philosophers of time argue that there is no access to what time it really is except by first having selected the standard clock.
Let’s consider the various goals we want to achieve in choosing one standard clock rather than another. One goal is to choose a clock with a precise tick rate that does not drift very much. That is, we want a clock that has a very regular period—so the durations between ticks are congruent. On many occasions throughout history, scientists have detected that their currently-chosen standard clock seemed to be drifting. In about 1700, scientists discovered that the duration from one day to the next, as determined by the duration between sunrises, varied throughout the year. They did not notice any variation in the duration of a year, so they began to rely on the duration of the year rather than the day.
As more was learned about astronomy, the definition of the second was changed. In the 19th century and before the 1950s, the standard clock was defined astronomically in terms of the mean rotation of the Earth upon its axis (solar time), and the second was defined to be 1/86,400 of the mean solar day, which is the average throughout the year of the rotational period of the Earth with respect to the Sun. For a short period in the 1950s and 1960s, the standard clock was instead defined in terms of the revolution of the Earth about the Sun (ephemeris time), with the second defined as 1/31,556,925.9747 of the tropical year 1900. But all these clocks were eventually discovered to drift too much.
To solve these drift problems, physicists chose a certain kind of atomic clock as the standard, and they said it reported atomic time. All atomic clocks measure time in terms of the natural resonant frequencies of electromagnetic radiation absorbed and emitted from the electrons within certain atoms of the clock. The accurate dates of adoption of these standard clocks are omitted in this section because different international organizations adopted different standards in different years. The U.S.A.’s National Institute of Standards and Technology’s F-1 atomic fountain clock is so accurate that it drifts by less than one second every 30 million years. We know there is this drift because it is implied by the laws of physics, not because we have a better clock that measures this drift.
Atomic clocks use the frequency of a specific atomic transition as an extremely stable time standard. While the second is currently defined by caesium-based clocks that operate at microwave frequencies, physicists have built much more accurate clocks that are based on light. These optical clocks tick at much higher frequencies than microwave clocks and can keep time that is accurate to about one part in 10¹⁸, which is about 100 times better than the best caesium clocks.
“The international metrology community aims to replace the microwave time standard with an optical clock, but first must choose from one of several clock designs being developed worldwide” (Hamish Johnston, Physics World, 26 March 2021).
Optical atomic clocks resonate at light frequencies rather than microwave frequencies, and this is why they tick about 100,000 times faster than microwave atomic clocks.
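The “about 100,000 times faster” figure is just the ratio of the two kinds of frequencies. Here is the order-of-magnitude arithmetic, using a representative optical frequency rather than any specific adopted standard:

```python
cesium_microwave_hz = 9_192_631_770  # the cesium standard, about 9.19 GHz
optical_hz = 1.0e15                  # representative optical frequency, ~10^15 Hz

ratio = optical_hz / cesium_microwave_hz  # roughly 1e5, i.e. about 100,000
```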
To achieve the goal of restricting drift, and thus stabilizing the clock, any clock chosen to become the standard clock should be maximally isolated from outside effects. A practical goal in selecting a standard clock is to find a clock that can be well insulated from environmental impacts such as convection currents in the Earth’s molten core, comets impacting the Earth, earthquakes, stray electric fields, heavy trucks driving on nearby bumpy roads, the invasion of dust and rust into the clock, extraneous heat, variation in gravitational force, and adulteration of the clock’s gas (for example, the cesium) with other stray elements.
If not insulation, then compensation. If there is some theoretically predictable effect of an environmental influence upon the standard clock, then the clock can be regularly adjusted to compensate for this effect. For example, thanks to knowing the general theory of relativity, we know how to adjust for the difference in gravitational force between being at sea level and being a meter above sea level. Commenting on the insulation problem, Nobel Prize winner Frank Wilczek said that the basic laws of the universe are local, so:
“Thankfully, you don’t have to worry about the distant universe, what happened in the past, or what will happen in the future…and it is philosophically important to notice that it is unnecessary to take into account what people, or hypothetical superhuman beings, are thinking. Our experience with delicate, ultra-precise experiments puts severe pressure on the idea that minds can act directly on matter, through will. There’s an excellent opportunity here for magicians to cast spells, for someone with extrasensory powers to show their stuff, or for an ambitious experimenter to earn everlasting glory by demonstrating the power of prayer or wishful thinking. Even very small effects could be detected, but nobody has ever done this successfully” (Fundamentals: Ten Keys to Reality).
Consider the insulation problem we would have if we were to replace the atomic clock as our standard clock and use instead the mean yearly motion of the Earth around the Sun. Can we compensate for all the relevant disturbing effects on the motion of the Earth around the Sun? Not easily nor precisely. The principal problem is that the Earth’s rate of spin varies in a practically unpredictable manner. This affects the behavior of the solar clock, but not the atomic clock.
The old standard Earth-Sun clock is now known to have lost more than three hours in the last 2,000 years. Leap days and leap seconds are occasionally added to or subtracted from our atomic-based calendar in order to keep it in synchrony with the rotations and revolutions of the Earth. We do this because we want to keep atomic-noons occurring on astronomical-noons and ultimately because we want to prevent Northern hemisphere winters from occurring in some future July. These changes do not affect the duration of a second, but they do affect the duration of a year because not all years last the same number of seconds. In this way, we compensate for the Earth-Sun clock’s falling out of synchrony with our standard atomic clock.
Another desirable feature of a standard clock is that reproductions of it stay in synchrony with each other when environmental conditions are the same. Otherwise, we may be limited to relying on a specifically-located standard clock that cannot be trusted elsewhere and that can be broken, vandalized or stolen.
The principal goal in selecting a standard clock is to reduce mystery in physics. The point is to find a clock process that, if adopted as our standard, makes the resulting system of physical laws simpler and more useful, and allows us to explain phenomena that otherwise would be mysterious. Choosing an atomic clock as standard is much better for this purpose than choosing the periodic revolution of the Earth about the Sun. Suppose scientists had retained the Earth-Sun astronomical clock as the standard clock and had said that, by definition, the Earth does not slow down in any rotation or in any revolution. Then, when a comet collided with Earth, the scientists would be tempted to say the Earth’s period of rotation and revolution changed, but they would be forced not to say this and instead to alter, among many other things, their atomic theory and to say the frequency of light emitted from cesium atoms mysteriously increases all over the universe when comets collide with the Earth. By switching to the cesium atomic standard, these alterations are unnecessary, and the mystery vanishes.
To make this point a little more simply, suppose the President’s heartbeats were chosen as our standard clock and so the count of heartbeats always showed the correct time. It would become a mystery why pendulums (and cesium radiation in atomic clocks) changed their frequency whenever the President went jogging; and scientists would have to postulate some new causal influence that joggers have on pendulums and on atomic clocks across the globe.
To achieve the goal of choosing a standard clock that maximally reduces mystery, we want the clock’s readings to be consistent with the accepted laws of motion, in the following sense. Newton’s first law of motion says that a body in motion should continue to cover the same distance during the same time interval unless acted upon by an external force. Suppose we used our standard clock to run a series of tests of the time intervals as a body coasted along a carefully measured path, and we found that the law was violated, we could not account for this mysterious violation by finding external forces to blame, and we were sure that there was no problem otherwise with Newton’s law or with the measurement of the length of the path. Then the problem would be with the clock. Leonhard Euler (1707-1783) was the first person to suggest this consistency requirement on our choice of a standard clock. A similar argument holds today, but using the laws of motion from Einstein’s general theory of relativity, one of the two fundamental theories of physics.
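Euler’s consistency test can be pictured with a toy computation. The position readings below are invented purely for illustration: a force-free body should cover equal distances between successive ticks of a trustworthy standard clock, and a clock that fails this test falls under suspicion.

```python
# Euler's consistency test, sketched: a force-free body should cover
# equal distances in equal clock-ticks. Positions along a measured track
# at successive ticks of a candidate standard clock (made-up data):
positions_good_clock = [0.0, 1.0, 2.0, 3.0, 4.0]   # uniform: law upheld
positions_bad_clock  = [0.0, 1.0, 2.1, 3.3, 4.6]   # nonuniform: suspect clock

def velocity_per_tick(positions):
    """Distance covered between successive clock ticks."""
    return [b - a for a, b in zip(positions, positions[1:])]

def consistent_with_first_law(positions, tol=1e-9):
    """True iff the body covers the same distance every tick."""
    v = velocity_per_tick(positions)
    return all(abs(x - v[0]) < tol for x in v)

print(consistent_with_first_law(positions_good_clock))  # True
print(consistent_with_first_law(positions_bad_clock))   # False
```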
When we want to know how long a basketball game lasts, why do we subtract the start time from the end time? The answer is that we accept a metric for duration in which we subtract the two time numbers. Why do we not choose another metric and, say, subtract the square root of the start time from the square root of the end time? This question is implicitly asking whether our choice of metric can be incorrect or merely inconvenient.
When we choose a standard clock, we are choosing a metric. By agreeing to read the clock so that a duration from 3:00 to 5:00 is 5-3 hours, and so 2 hours, we are making a choice about how to compare two durations in order to decide whether they are equal, that is, congruent. We suppose the duration from 3:00 to 5:00 as shown by yesterday’s reading of the standard clock was the same as the duration from 3:00 to 5:00 on the readings from two days ago and will be the same for today’s readings and tomorrow’s readings.
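The contrast between the two metrics can be made concrete with a small sketch. The square-root metric below is the rival convention mentioned above, not anything anyone uses: under the standard subtraction metric, the 3:00-to-5:00 interval is congruent from one day to the next, while under the square-root metric the same pair of clock readings yields a different duration on different days.

```python
import math

def standard_duration(t1, t2):
    """The conventional metric: subtract the two time coordinates."""
    return t2 - t1

def sqrt_duration(t1, t2):
    """A rival (consistent but perverse) metric: subtract square roots."""
    return math.sqrt(t2) - math.sqrt(t1)

# The game from 3:00 to 5:00, with time measured in hours since an origin:
yesterday = (3, 5)     # hours 3 to 5 on day one
today = (27, 29)       # the "same" 3:00-5:00 slot, 24 hours later

# Under the standard metric the two intervals are congruent:
print(standard_duration(*yesterday), standard_duration(*today))  # 2 2

# Under the sqrt metric, the same clock readings a day apart give
# different durations, so "congruent" intervals come apart:
print(sqrt_duration(*yesterday), sqrt_duration(*today))
```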
Philosophers of time continue to dispute the extent to which the choice of metric is conventional rather than objective in the sense of being forced on us by nature. The objectivist says the choice is forced and that the success of the standard atomic clock over the standard solar clock shows that we were more accurate in our choice of the standard clock. An objectivist says it is just as forced on us as our choosing to say the Earth is round rather than flat. It would be ridiculous to insist the Earth is flat. Taking the conventional side on this issue, Adolf Grünbaum argued that time is metrically amorphous. It has no intrinsic metric. Instead, we choose the metric we do in order only to achieve the goals of reducing mystery in science, but satisfying those goals is no sign of being correct.
The conventionalist, as opposed to the objectivist, would say that if we were to require by convention that the instant at which Jesus was born and the instant at which Abraham Lincoln was assassinated are to be only 24 seconds apart, whereas the duration between Lincoln’s assassination and his burial is to be 24 billion seconds, then we could not be mistaken. It is up to us as a civilization to say what is correct when we first create our conventions about measuring duration. We can consistently assign any numerical time coordinates we wish, subject only to the condition that the assignment properly reflects the betweenness relations of the events that occur at those instants. That is, if event J (birth of Jesus) occurs before event L (Lincoln’s assassination) and this, in turn, occurs before event B (burial of Lincoln), then the time assigned to J must be numerically less than the time assigned to L, and both must be less than the time assigned to B so that t(J) < t(L) < t(B). A simple requirement. Yes, but the implication is that this relationship among J, L, and B must hold for all events simultaneous with J, for all events simultaneous with L, and so forth.
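The conventionalist’s constraint amounts to a simple order-preservation check: any assignment of time coordinates is admissible so long as it is strictly increasing along the order of the events. The coordinate values below are illustrative only.

```python
# Admissibility of a time-coordinate assignment: it must preserve the
# betweenness (order) of events J, L, and B from the text.
events_in_order = ["J", "L", "B"]  # birth of Jesus, assassination, burial

# A familiar assignment, in (approximate) years:
t_standard = {"J": 0, "L": 1865.0, "B": 1865.1}

# The wildly different 24-seconds / 24-billion-seconds convention,
# which is equally admissible by the conventionalist's lights:
t_weird = {"J": 0, "L": 24, "B": 24e9}

# An assignment that scrambles the order is inadmissible:
t_scrambled = {"J": 10, "L": 2, "B": 30}

def respects_order(t, ordered_events):
    """True iff the assignment t is strictly increasing along the order."""
    coords = [t[e] for e in ordered_events]
    return all(a < b for a, b in zip(coords, coords[1:]))

print(respects_order(t_standard, events_in_order))   # True
print(respects_order(t_weird, events_in_order))      # True
print(respects_order(t_scrambled, events_in_order))  # False
```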
It is other features of nature that lead us to reject the above convention about 24 seconds and 24 billion seconds. What features? There are many periodic processes in nature that have a special relationship to each other; their periods are very nearly constant multiples of each other, and this constant stays the same over a long time. For example, the period of the rotation of the Earth is a fairly constant multiple of the period of the revolution of the Earth around the Sun, and both these periods are a constant multiple of the periods of a swinging pendulum and of vibrations of quartz crystals. The class of these periodic processes is very large, so the world will be easier to describe if we choose our standard clock from one of these periodic processes. A good convention for what is regular will make it easier for scientists to find simple laws of nature and to explain what causes other events to be irregular. It is the search for regularity and simplicity and removal of mystery that leads us to adopt the conventions we do for the numerical time coordinate assignments and thus leads us to choose the standard clock we do choose. Objectivists disagree and say this search for regularity and simplicity and removal of mystery is all fine, but it is directing us toward the correct metric, not simply the useful metric.
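These near-constant ratios can be illustrated numerically with familiar round figures; the two-second pendulum below is an assumption chosen purely for illustration.

```python
# The "special relationship" among natural periodic processes: their
# periods stay in very nearly fixed ratios. Illustrative figures:
day = 86400.0              # one Earth rotation, in SI seconds
year = 365.2422 * day      # one Earth revolution
pendulum = 2.0             # a two-second pendulum period (assumed)

# Ratios such as these stay nearly constant over long stretches of
# time, which is what makes any one process usable as a clock:
print(year / day)       # about 365.24 rotations per revolution
print(day / pendulum)   # 43200.0 pendulum swings per rotation
```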
For additional discussion of some of the points made in this section, including the issue of how to distinguish an accurate clock from an inaccurate one, see chapter 8 of (Carnap 1966).
26. What Is a Field?
The technical word “field” in physics (as opposed to mathematics or to farming) means something that extends throughout space and that exerts a force on various things it encounters. The most familiar physical field is the temperature field for a single time; it is displayed on the screen during a weather report. It shows the temperature at each place represented on the map—such as 70 degrees Fahrenheit now in New York, and 73 degrees now in Washington, DC. A field is a special kind of extended object of three or more dimensions in which each place has a value for some physical variable—temperature in this example. Sound waves in a room are oscillations of its air density field due to changing air density in different places at different times. Objects move through fields, but the fields themselves do not move.
Unlike a temperature field or an air density field, some fields always have directions at their points. A wind field has a wind speed and a wind direction at each place. A magnetic field has a magnetic strength and a direction at each place.
What is the advantage of treating the world in terms of fields? Briefly expressed, in any field theory with the property called “locality,” any change in the field’s value at a place can directly induce changes only in infinitesimally-nearby places. Think of points in the field as interacting only with their nearest neighbors, which in turn interact with their own neighbors, and so forth. So, field theory with locality has the advantage that, if you want to know what will happen next at a place, you do not have to consider the influence of everything everywhere in the universe but only the field values at the place of interest and the rates of change of those values. Computing the effect of a change can be much simpler this way.
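Locality can be illustrated with a toy discretized field. The sketch below uses a standard finite-difference form of the one-dimensional wave equation (not anything specific to this article): each point’s next value depends only on its own value and its two nearest neighbors, so a disturbance spreads outward at most one grid cell per time step.

```python
# A toy illustration of "locality": update a one-dimensional field so
# that each point's next value depends only on itself and its nearest
# neighbors (a simple discretized wave equation).
def step(field, prev, c=0.5):
    """One time step; c is the dimensionless wave speed (c <= 1)."""
    nxt = field[:]  # copy; the two boundary points are held fixed
    for i in range(1, len(field) - 1):
        laplacian = field[i - 1] - 2 * field[i] + field[i + 1]
        nxt[i] = 2 * field[i] - prev[i] + (c ** 2) * laplacian
    return nxt

# A small bump in the middle of an otherwise flat field:
n = 21
field = [0.0] * n
field[n // 2] = 1.0
prev = field[:]  # start at rest

for _ in range(5):
    field, prev = step(field, prev), field

# After 5 steps the disturbance has reached at most 5 grid points away;
# points farther out are still exactly zero -- influence spreads locally.
print(field[0], field[4], field[n // 2])
```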
In Newton’s mechanics, gravity acts from one object to another instantaneously. In relativistic mechanics, this is not so; the two distant objects act on each other no faster than the speed of light. However, Newton’s theory of gravity is sometimes more practical to use because gravitational forces get weaker with distance, and the gravitational influence of all the distant particles can be ignored for many practical purposes. Newton’s theory of gravity is sufficient for sending rockets to Mars.
The universe at a time is approximately a system of particles in spacetime, but physicists believe that more fundamentally, the universe at a time is really a system of co-existing quantized fields acting on the vacuum or being the vacuum. We know this is so for all non-gravitational phenomena, but we do not have a definitive theory of the gravitational field that involves quantum phenomena. In the early years of using the concept of fields, the fields were considered something added to systems of particles, but the modern viewpoint (influenced by quantum theory) is that particles themselves are only local vibrations or excitations of fields; the particles are wave packets that are fairly stable in the sense of persisting (for the particle’s lifetime) and not occupying a large spatial region as the field itself does. Particles of light, called photons, are these kinds of particles. Light itself is a traveling wave in the electromagnetic field.
The classical concept of there being a particle at a point does not quite hold in quantum field theory. The key ontological idea is that the particles supervene on the fields. Particles are epiphenomena. Also, the particles of quantum fields do not change their values continuously as do particles in classical fields. A particle in a quantum field is able to change its energy only in discrete jumps.
The concept of a field originated with Pierre-Simon Laplace in about 1800. He suggested Newton’s theory of gravity could be treated as if it were a field theory. In this theory, the notion of action at a distance was eliminated. Michael Faraday was the first physicist to assert that fields in this sense are real, not just mathematical artifacts.
Newton would have been happy with this idea of a field because he always doubted that gravity worked by one object’s mass acting directly on another distant object’s mass instantaneously. In a letter to Richard Bentley, he said:
It is inconceivable that inanimate brute matter should, without the intervention of something else which is not material, operate upon and affect other matter, and have an effect upon it, without mutual contact.
Instantaneous actions were removed from the treatment of electricity and magnetism by Maxwell in the 1860s when he created his theory of electromagnetism as a field theory. Changes in electromagnetic forces were propagated, not instantaneously, but at the speed c of light. Instantaneous actions were eventually removed from gravitational theory in Einstein’s general theory of relativity of 1915. It was Einstein who first claimed that spacetime itself is the field associated with gravity. According to Einstein,
As the Earth moves, the direction of its gravitational pull does not change instantly throughout the universe. Rather, it changes right where the Earth is located, and then the field at that point tugs on the field nearby, which tugs on the field a little farther away, and so on in a wave moving outward at the speed of light. (Carroll 2019, p. 249)
Gravitational force, according to Einstein’s theory, is not really a force in the usual sense of the term, but is the curvature of spacetime.
Depending upon the field, a field’s value at a point in space might be a simple number (as in the Higgs field), or a vector (as in the classical electromagnetic field), or a tensor (as in Einstein’s gravitational potential field), or even a matrix. Fields obey laws, and these laws usually are systems of partial differential equations that hold at each point.
As mentioned briefly above, with the rise of quantum field theory, instead of a particle being treated as a definite-size object within spacetime, it is treated as a special kind of disturbance of the field itself, a little “hill” or deviation above its average value nearby. For example, an electron is a localized disturbance in the electron field, and so is the anti-electron; a photon is a localized disturbance in the electromagnetic field. The disturbance is a fuzzy bundle of quantized energy occupying a region of space bigger than a single point, but having a maximum at a place that would classically have been called the “particle’s location.” A particle is a little hill in the field. These hills can be stationary or moving. The hills can pass by each other or pass through other hills or bounce off them, depending on the kinds of hills. Moving hills carry information and energy from one place to another. New energy inputted into the field can increase the size of the hill, but only in discrete sizes. Any hill has a next bigger possible size (or energy).
So, the manifest image of a particle cannot easily be reconciled with the quantum mechanical image of a particle. Although fields, not particles, are ontologically basic, it does not follow from this that particles are not real. They are real but odd because they are emergent and epiphenomenal entities having no sharply defined diameter and not being able to change their sizes gradually. Although an electron does have a greater probability of being detected at some places than at others, in any single detection at a single time the electron is detected only at a point, not a region. The electron is a disturbance that spreads throughout space, although the high-amplitude parts are in a very small region. Despite its having no sharp boundary, the electron is physically basic in the sense that it has no sub-structure. The proton is not basic because it is made of quarks and gluons. Particles with no sub-structure are called elementary particles.
Relativity theory’s biggest ontological impact is that whether a particle is present depends on the observer. An accelerating observer might observe (that is, detect) particles being present in a specific region while a non-accelerating observer can see (that is, measure) no particles there. For a single region of spacetime, there can be particles in the region in one reference frame and no particles in that region for another frame, yet both frames are correct descriptions of reality!
Because all these fields are quantum fields, their disturbances or excitations can occur only in quantized chunks, namely integer multiples of some baseline energy, the so-called zero-point energy, which is the lowest possible positive energy. These chunks are called “quanta,” and a single one is called a “quantum.”
Although fields that exist cannot go out of existence, they can wake up from their slumbers and turn on. Soon after the Big Bang, the Higgs field, which had a value of zero everywhere, began to increase in value as the universe started cooling. When the universe’s temperature fell below a certain critical value, the field grew spontaneously. From then on, any particle that interacted with the Higgs field acquired a mass. Before that, all particles were massless. The more a particle interacts with the Higgs field, the heavier it is. The photon does not interact at all with the Higgs field; of the particles that do interact, the neutrino interacts the least.
What is the relationship between spacetime and all these fields? Are the fields in space or, as Einstein once asked, are they properties of space, or is there a different relationship? Some physicists believe the gravitational field does reside within spacetime, but others believe it does not.
There is some serious speculation that fields are not the ontologically basic entities; information is the universe’s basic entity.
For an elementary introduction to quantum fields, see the video https://www.youtube.com/watch?v=X5rAGfjPSWE.
For references, go back to the main “Time” article.
Author Information
Bradley Dowden
Email: dowden@csus.edu
California State University Sacramento
U. S. A.
Time
Time is a distinguished dimension of space-time that we measure with a clock, said the scientist Albert Einstein. That remark misrepresents what time really is, said the philosopher Henri Bergson, because it treats time too much like a spatial dimension and it de-emphasizes time’s relation to our consciousness.
Everyone agrees that information about time tells the durations of events, and the times when events occur, and which events happen before which others, so time plays a very significant role in the universe’s structure, including the structure of our lives. However, the attempt to carefully describe time’s properties has led to many unresolved issues, both philosophical and scientific.
Consider this issue upon which philosophers are deeply divided: What sort of ontological differences are there among the present, the past and the future? There are three competing philosophical theories. Presentism implies that necessarily only present objects and present events are real, and we conscious beings can recognize this in the special vividness of our present experiences compared to our relatively dim memories of past experiences and dim expectations of future experiences. So, the dinosaurs have slipped out of reality even though our present ideas of them have not. However, the growing-past theory implies the past and present are both real, but the future is not, because the future is indeterminate or merely potential. Dinosaurs are real, but our future death is not. The third theory, eternalism, is that there are no objective ontological differences among present, past, and future because the differences are merely subjective, depending upon whose present we are referring to. Einstein was an eternalist, and Bergson was a presentist.
In no particular order, here is a list of other issues about time that are discussed in this article:
• Whether there was a moment without an earlier one.
• Whether time itself passes or flows.
• What kinds of time travel are possible.
• Whether time has an arrow.
• Whether any of the fundamental laws depend on what time it is.
• How to distinguish an accurate clock from an inaccurate one.
• Whether what happens in the present is the same for everyone.
• Which features of our ordinary sense of the word time are, or should be, captured by the concept of time in physics.
• Whether sentences about the future have truth-values now.
• The proper formalism or logic for capturing the special role that time plays in reasoning.
• Whether an instant can have a zero duration and also a very next instant.
• Whether time can exist when nothing changes.
• What neural mechanisms account for our experience of time.
• Whether time is objective or only an artifact of consciousness.
• Whether time branches or is linear.
• Whether there is a timeless substratum from which time emerges.
• Which specific aspects of time are conventions.
• How to settle the disputes between proponents of McTaggart’s A-theory and B-theory of time.
This article does not explore how time is treated within different cultures and languages, how persons can more efficiently manage their time, nor what entities are timeless.
The great philosophical dispute in the first half of the twentieth century about the nature of time between Albert Einstein, the most famous scientist of the day, and Henri Bergson, the most famous European philosopher of the day, has been characterized by some as based upon a substantial mistake, an equivocation between two senses of the word “time.” Einstein spoke of clock time or scientific time or what philosophers call “physical time,” whereas Bergson spoke of human time or “phenomenological time.” The mistake is that each considered their time to be real time, while believing their opponent was not appreciating this point. Whether this is so remains an open question in metaphysics and the philosophy of physics, but one that the reader is well advised to keep in mind during the presentations that follow.
Researchers in the philosophy of time want to build a robust and defensible theory of time that resolves as many as they can of the issues on the list of philosophical issues mentioned in the opening summary of this article, or at least they want to provide a mutually consistent set of proposed answers to them that is supported by the majority of experts on these issues. That list of issues is very long. Here is a shorter list of the most important issues in the philosophy of time:
How do we clarify the fact that we conceive of time as something that flows or passes?
Are past events real?
Is time fundamental, or does it emerge from something more fundamental?
Does time itself have an intrinsic arrow?
Is there a smallest allowable duration?
Does the full extent of time include an infinite past and infinite future?
What aspects of time are conventional or subjective?
Is time best understood with McTaggart’s A-theory or his B-theory?
How is the manifest image of time related to the scientific image?
This last question is asking about the complicated relationship between beliefs about time held by ordinary speakers of our language and beliefs about time as understood through the lens of contemporary science, particularly physics and its fundamental theories. Why physics? Because its fundamental theories have a universality that other theories of science do not.
When this article uses language to explain what a scientific theory says about time, there is often much information that is withheld. Attempting to put into words what scientific equations imply about reality is a sophisticated process that invariably is slightly misleading, but the point of this article is to minimize misunderstanding. When describing time, the commonsense approach is expressed with non-technical terms such as now, flow, and past and not with the scientific terms such as continuum and reference frame. The manifest image or folk image or commonsense image or everyday image of time is the understanding of time as it appears to us using our common sense that is unaffected by most of the advances in contemporary science. It does not qualify as a theory in the technical sense of that term but is more an assortment of tacit beliefs. The concept is vague, and there is no good reason to believe that there is a single shared concept. Different cultures have somewhat different concepts of time.
A reasonable way to make our manifest image of time less vague is to say it contains all the following beliefs about time [some of which are declared to be false according to the scientific image]: (1) The universe has existed for longer than five minutes. (2) We experience time by experiencing changes. (3) The future must be different from the past. (4) Time exists in all places. (5) You can change the direction you are going in space but not in time. (6) Every event has a duration that is never negative. (7) Every event occurs at some time or other. (8) Time is made of instants, that is, times. (9) Past events cannot be affected by our actions now, whereas what happens in the future can be. (10) The set of past events grows larger over time. (11) A nearby present event cannot directly and immediately influence a distant present event. (12) Time has an intrinsic arrow pointing toward the future. (13) Time has nothing to do with space. (14) Given any two events, they have some objective order such as one happening before the other, or else their being simultaneous. (15) Time passes; it moves; metaphorically, we can say it flows like a river that passes us as we view it from a place along the river bank; and we directly experience this flow. (16) There is a present that is objective, that every living person shares, and that divides everyone’s past from their future. (17) Time is independent of the presence or absence of physical objects and what they are doing. (18) We never remember the future because this fact is true by definition. Of these 18, only items 1 through 9 have clearly survived the impact of modern science.
The scientific image and the manifest image are not images of different worlds. They are images of the same reality. Both images have changed over the years, and often the scientific image will increase the precision of a word that is used in the manifest image. The changes have sometimes been abrupt. The most significant and abrupt impact on time’s scientific image was the acceptance of the theory of relativity that was created by Einstein early in the 20th century. See (Callender 2017) and (Baron et al. 2022) for more discussion of how science assaults our manifest image.
A popular methodology used by some metaphysicians is to tentatively accept as real a feature of the manifest image and then change it only if there are good reasons to do so. Unfortunately, there is no consensus among philosophers of time about what counts as a good reason, although there is much more consensus among physicists. Does conflict with relativity theory count as a good reason? Yes, say physicists; but Husserl’s classic 1936 work on phenomenology, The Crisis of European Sciences and Transcendental Phenomenology, criticized the scientific image because of its acceptance of so many of the implications of the orthodox interpretation of relativity theory, and in this spirit A. N. Prior said that the theory of relativity is for this reason not about real time. For more exploration of the possibility that theories of physics do not have any role for real time, see (Baron et al. 2022).
Ever since the downfall of the Logical Positivists’ program of requiring all meaningful, non-tautological sentences to be reducible to commonsense sentences about what is given in our sense experiences (via seeing, hearing, feeling, and so forth), few philosophers of science would advocate any direct translation of sentences expressed in the manifest image to sentences expressed in the scientific image, or vice versa, but the proper relationship between the two images is an open question.
With the rise in popularity of naturalism and scientific realism in both metaphysics and the philosophy of science in the latter part of the twentieth century, many philosophers of science would summarize the relationship between the two images by saying our commonsense experience of reality is real but overrated. They suggest that defenders of the manifest image have been creative, but ultimately they have wasted their time in trying to revise and improve the manifest image to lessen its conflict with the scientific image. Regarding these attempts in support of the manifest image, the philosopher of physics Craig Callender offered this sharp criticism:
These models of time are typically sophisticated products and shouldn’t be confused with manifest time. Instead they are models that adorn the time of physics with all manner of fancy temporal dress: primitive flows, tensed presents, transient presents, ersatz presents, Meinongian times, existent presents, priority presents, thick and skipping presents, moving spotlights, becoming, and at least half a dozen different types of branching! What unites this otherwise motley class is that each model has features that allegedly vindicate core aspects of manifest time. However, these tricked out times have not met with much success (Callender 2017, p. 29).
In some very loose and coarse-grained sense, manifest time might be called an illusion without any harm done. However, for many of its aspects, it’s a bit like calling our impression of a shape an illusion, and that seems wrong (Callender 2017, p. 310).
Some issues listed in the opening summary are intimately related to others, so it is reasonable to expect a resolution of one to have deep implications for another. For example, there is an important subset of related philosophical issues about time that cause many philosophers of time to divide into two broad camps, the A-camp and the B-camp, because the camps are on the opposite sides of so many controversial issues about time.
The next two paragraphs summarize the claims of the two camps. Later parts of this article provide more introduction to the philosophical controversy between the A and B camps, and they explain the technical terms that are about to be used. Briefly, the two camps can be distinguished by saying the members of the A-camp believe McTaggart’s A-theory is the fundamental way to understand time; and they accept a majority of the following claims which members of the B-camp reject. Past events are always changing as they move farther into the past. This change is the only genuine, fundamental kind of change. The present or “now” is objectively real and the same for everyone. Time passes or flows. Ontologically, we should accept either presentism, the growing-past theory or the moving-spotlight theory because the present is somehow metaphysically privileged compared to the future. Predictions are not true or false at the time they are uttered. Being in the past is more fundamental than happening before some event. Tensed facts are ontologically fundamental, not untensed facts. The ontologically fundamental objects are 3-dimensional, not 4-dimensional. The word “fundamental” in these discussions is used either in the sense of “not derivable” or “not reducible.” It does not mean “most important.”
Members of the B-camp reject all or almost all of the claims of the A-camp. They believe McTaggart’s B-theory is the fundamental way to understand time; and they accept a majority of the following claims: events never undergo what A-theorists call genuine change; the present or now is not objectively real and neither is time’s flow; ontologically we should accept eternalism and the block-universe theory; predictions are true or false at the time they are uttered; happening before some event is more fundamental than an event’s being in the past; untensed facts are more fundamental than tensed facts; the fundamental objects are 4-dimensional, not 3-dimensional; and all A-predicates are reducible to B-predicates, or at least the truth conditions of sentences using A-predicates can be adequately explained in terms of the truth conditions of sentences using only B-predicates. Most B-theorists claim that they do not deny the reality of the human experiences that A-theorists are appealing to, but rather they believe those experiences can be best explained from the perspective of the B-theory.
To what extent is time understood? This is a difficult question, not simply because the word understood is notoriously vague. There have been a great many advances in understanding time over the last two thousand years, especially since 1905, as this article explains, so we can definitively say time is better understood than it once was; this is clear evidence that philosophy makes progress. Nevertheless, before we can say time is understood, there remain too many other questions whose answers the experts do not agree upon. We cannot even be sure whether time is a fundamental feature of reality or only an emergent one. Can we at least say only the relatively less important questions are left unanswered? No, not even that. Such is the state of our understanding of time, and it is certainly less than a reader might wish to have. Still, it is remarkable how much we do know about time that we once did not; and it is remarkable that we can be so clear about what it is that we do not know; and there is no good argument for why this still sought-after knowledge is beyond the reach of the human mind.
2. Physical Time, Biological Time, and Psychological Time
Physical time is public time, the time that clocks are designed to measure. Biological time is indicated by regular, periodic biological processes, and by signs of aging. The ticks of a human being’s biological clock are produced by heartbeats, the rhythm of breathing, cycles of sleeping and waking, and periodic menstruation, although there is no conscious counting of the cycles. Biological time is not another kind of time in addition to physical time, but rather is physical time measured with a biological process.
Psychological time is private time; it is also called subjective time and phenomenological time. Our psychological time can change its rate, compared to physical time, depending on whether we are bored or instead intensively involved. The position advocated by Husserl in On the Phenomenology of the Consciousness of Internal Time is that objective physical time is constituted intersubjectively by our internal time-consciousness, namely our phenomenological time. This is a controversial suggestion about how to understand time, but everyone in the debate would agree that the process of giving linguistic meaning to the word “time” and the process of acquiring the human idea of time is a product in part of internal time-consciousness.
The position advocated by many analytic philosophers is that this psychological time is better understood not as a kind of time but rather as awareness of physical time. Psychological time is usually what people are thinking of when they ask whether time is just a construct of the mind.
Some philosophers suggest time is a projection upon our experience in analogy to how the color green is a visual feeling we get by experiencing healthy grass that is not actually green or any other color. On this view of time, time could not have existed before the evolution of consciousness. The present article presumes the more mainstream position that time is not a projection upon our experience.
There is no experimental evidence that the behavior of a clock that measures physical time is affected in any way by the presence or absence of mental awareness, or by the presence or absence of any biological phenomenon. For that reason, physical time is often called objective time and scientific time. The scientific image of time is the product of science’s attempt to understand physical time.
When a physicist defines speed to be distance traveled divided by the amount of time during the travel, the term time in that definition refers to physical time. Physical time is more useful than psychological time for helping us understand our shared experiences in the world; but psychological time is vitally important for understanding many mental experiences, as is biological time for understanding biological phenomena.
Psychological time and biological time are discussed in more detail in Section 17 which explores our perception of time. Otherwise, this article focuses primarily on physical time.
Regarding the measurement of physical time, there is no objectively universal clock or absolute ticking of nature against which processes run slower or faster. Nevertheless, there is a conventionally chosen standard clock at the Royal Observatory in Greenwich, London, which is assumed to be stationary and whose reported time is treated by nearly all countries as the “correct time.” (Today, official civil time is coordinated internationally from an ensemble of atomic clocks.)
3. What is Time?
Time may not be what it seems. Clocks can tell you what time it is, but they cannot tell you what time is. “Time is succession,” Henri Bergson said in 1922, but that remark is frustratingly vague. So is the remark that time is like a line. And so is the remark that time is the quality of the world that allows change to exist. Yet all these remarks contain a grain of truth.
In a widely publicized and intensely discussed debate in 1922, the philosopher Henri Bergson debated the scientist Albert Einstein regarding the nature of time. Bergson argued that scientists were not properly appreciating that there are kinds of time. In particular, they were not appreciating the existence of “philosopher’s time” or “psychologist’s time” that we all experience. Einstein claimed there is no “philosopher’s time,” but only a manifest image of time or commonsense image of time that inaccurately describes time. For him, time is a single kind of entity that is described differently by different people, often incorrectly. Einstein’s position eventually became the mainstream position, especially within the scientific community.
When philosophers ask “What is time?,” they might or might not be intending to separate the question of what time is made of from the question of its function or its key properties or a short definition of the term, but usually they would like answers to all these questions.
Maybe we can decide what time is by considering what our world would be like if it did not contain time. Where do we proceed from here, though? We cannot turn off time and look at the result. Unfortunately, our imagining the world without time is not likely to be a reliable guide.
Should we say time is whatever plays this or that functional role such as accounting best for our temporal phenomenology? Unlike Einstein, Henri Bergson would have been sympathetic to this. He believed that Einstein’s so-called scientific time or physical time is derived by abstraction from the more fundamental time of our experience. However, Baron and Miller have argued that, if we knew that a demon plays the functional role of providing us with our temporal phenomenology, then we would not agree that time is a demon, so more constraints need to be placed on any functionalist account of time.
Sometimes, when we ask what time is, we are asking for the meaning of the noun “time.” It is the most frequently used noun in the English language. A first step in that direction might be to clarify its meaning and its reference. The term time has several meanings. It can mean the duration between events, as when we say the trip from home to the market took too much time because of all the traffic. It can mean, instead, the temporal location of an event, as when we say he arrived at the time they specified. It also can mean the temporal structure of the universe, as when we speak of investigating time rather than space. This article uses the term in all these senses.
Ordinary Language philosophers have carefully studied talk about time. This talk is what Ludwig Wittgenstein called the language game of discourse about time. Wittgenstein said in 1953, “For a large class of cases—though not for all—in which we employ the word ‘meaning’ it can be defined this way: the meaning of a word is its use in the language.” Perhaps an examination of all the uses of the word time would lead us to the meaning of the word. Someone, such as John Austin, following the lead of Wittgenstein, might also say that giving more careful attention to how we use words would then enable us to readily dissolve rather than answer most of our philosophical problems about time. Those problems would be shown to be pseudo-problems, and the concept of time would no longer be so mysterious. Wittgenstein once said work on philosophy is really more work on oneself.
That methodology of dissolving a problem was promoted by Austin in response to many philosophical questions. However, most philosophers of time in the twenty-first century are not interested in dissolving their problems about time, nor in changing themselves, nor in precisely defining the word time. They are interested in specifying time’s important characteristics and in resolving philosophical disputes about time that do not seem to turn on what the word means. When Isaac Newton discovered that both the falling of an apple and the circular orbit of the Moon were caused by gravity, this was not primarily a discovery about the meaning of the word gravity, but rather about what gravity is. Do we not want some advances like this for time? If so, then the goal is to discover the best concept of time to use in understanding the world and to develop a philosophical theory of time that addresses what science has discovered about time plus what should be said about the many philosophical issues that practicing scientists usually do not concern themselves with, such as the relationship between the manifest image of time and the scientific image.
There are deep disputes in metaphysics regarding the extent to which scientific theories can or should guide our metaphysical theories. This article does not assume science is the royal road to metaphysics, but it does assume there is much to learn about time from scientific theories, specifically the fundamental scientific theories. The exploration in sections ahead adopts a realist perspective on these scientific theories. That is, it interprets them to mean what they say, even in their highly theoretical aspects, while appreciating that there are such things as mathematical artifacts. Scientific realism does not treat scientific theories as useful fictions, nor as merely helpful instruments, nor merely operationally. It assumes that, in building a scientific theory, the goal is to achieve truth even though most theories achieve this goal only approximately; but what makes them approximately true is not their corresponding to some mysterious entity called approximate truth. Scientific realism not only says the aim of science is to say how the world really is, it also claims that science can successfully do this. The present article leaves open the question of the extent of the success of doing this.
Scientific realism has been seriously challenged in the philosophical literature. For one example, the influential continental philosopher Henri Bergson insisted that philosophers of physics should promote the primacy of intuition in place of the objects of mathematical physics. For a second, the constructive empiricist Bas van Fraassen said science should confine its aim to helping predict the results of experiments. For a third challenge, called the pessimistic meta-induction, the critic points out that looking back at scientific theories throughout history shows they have always been imperfect, so presumably future theories will be imperfect, too.
A specific time such as 3:00 is one part of time, but philosophers disagree about whether a time is a concrete object or an abstract object. A cow is a concrete object. It has causal powers. A number is an abstract object. It has no causal powers.
Everyone agrees that time apparently has something to do with change and that it provides a numerical measure of certain kinds of change: not, say, the kind of change that occurs in the colors across a rainbow, but definitely the kind that occurs in the locations of the stars. Presumably we can learn about the structure of time by studying change and the structure of the changes, and presumably clocks need to change properly in order to measure time. This article’s supplement of “Frequently Asked Questions” discusses what a clock is, and what it is for a clock to be accurate as opposed to precise, and why we trust some clocks more than others. The mainstream position is that there are no universal, objectively correct clocks, only clocks that are more or less useful for us human beings. Still, the relationship between time and change needs to be made more precise. For example, can the nature of time change from time to time?
Einstein said that physical time is what clocks measure. This is a remark about clocks and not a definition of time. Yet the remark is not as trivial as it might seem since it is a deep truth about our physical universe that it is capable of having clocks. We are lucky to live in a universe with so many different kinds of regular, periodic processes. We humans can use these cyclic processes for measuring time. However, some philosophers of physics claim that there is nothing more to time than whatever numbers are displayed on our clocks. The vast majority of philosophers of physics disagree with that claim. They say time is more than those numbers; it is what we intend to measure with those numbers.
In the anti-realist spirit of those who do say there is nothing more to time than whatever numbers are displayed by our clocks, the distinguished philosopher of science Henri Poincaré said in 1912, “The properties of time are…merely those of our clocks just as the properties of space are merely those of the measuring instruments.” He believed that to know time is to know what makes the equations of mechanics be the simplest and most convenient, and he cautioned that we should not confuse the term best convention with true or factual.
Is time a concept that every extraterrestrial civilization must use? Many philosophers speculate that the answer is yes, but we know very little about the needs of such civilizations, or if they exist, so it is understandable why there is no consensus on this issue.
a. History of Answers
What then is time really? This is still an open question with much more to be said about it in the coming sections of this article and its supplements. Let’s briefly stop to consider how this question has been answered in different ways throughout the centuries. Here we are interested in very short answers that give what the proponent considers to be the key idea about what time is.
Aristotle proposed what has come to be called the relational theory of time when he remarked, “there is no time apart from change….” (Physics, chapter 11). He clarified his remark by saying, “time is not change [itself]” because a change “may be faster or slower, but not time…” (Physics, chapter 10). For example, a leaf can fall faster or slower, but time itself cannot be faster or slower. Aristotle claimed that “time is the measure of change” (Physics, chapter 12) of things, but he never said space is the measure of anything. Elsewhere he remarked that time is the steps between before and after.
René Descartes, who is known for doubting many things, never doubted the existence of time. He answered the question, “What is time?” by claiming that a material body has the property of spatial extension but no inherent capacity for temporal endurance and that God by his continual action sustains (or re-creates) the body at each successive instant. Time is a kind of sustenance or re-creation (“Third Meditation” in Meditations on First Philosophy, published in 1641). Descartes’ worry is analogous to that of Buddhist logicians who say, “Something must explain how the separate elements of the process of becoming are holding together to produce the illusion of a stable material world.” The Buddhist answer: causality. Descartes’ answer: God’s actions.
In the late 17th century, Gottfried Leibniz, who was also a relationist as was Aristotle, said time is a series of moments, and each moment is a set of co-existing events in a network of relations of earlier-than and simultaneous-with. Isaac Newton, a contemporary of Leibniz, argued instead that time is independent of events. He claimed time is absolute in the sense that “true…time, in and of itself and of its own nature, without reference to anything external, flows uniformly…” (1687). This difference about how to understand time is also reflected in their disagreement about space. Newton thought of space as a thing, like a substance, while Leibniz disagreed and said it is not a thing but only a relationship among the things.
Both Newton and Leibniz assumed that time is the same for all of us in the sense that how long an event lasts is the same for everyone, no matter what they are doing. In the early 20th century, Albert Einstein would claim to have refuted their assumption.
In the 18th century, Immanuel Kant made some very influential remarks that suggested he believed time and space themselves are forms that the mind projects upon the things-in-themselves that are external to the mind, and that only by somehow stepping outside your mind could you see the external world as it really is. In the twenty-first century, this description of Kant’s remarks is believed to be a slight misinterpretation of his intentions, even though he did say things that would lead to this false interpretation. What he actually believed was that our representations of space and time have this character. So, Kant’s remarks that time is “the form of inner sense” and that time “is an a priori condition of all appearance whatsoever” are probably best understood as suggesting that we have no direct perception of time but only have the ability to experience individual things and events within time. The “we” here is human beings; Kant left open the possibility that the minds of non-humans perceive differently than we humans do. Also, he left open the possibility that the world-in-itself, that is, the world as it is independently of being perceived, may or may not be temporal. The much more popular theory of mind in the 21st century is anti-Kantian and implies conscious beings have unmediated access to the world; we can experience the external world and not merely experience internal representations of that world.
In the early 20th century, the philosophers Alfred North Whitehead and Martin Heidegger said time is essentially the form of becoming, with becoming being an objective feature of the external world, not merely a feature of our consciousness. This is an idea that excited a great many philosophers, but not many scientists, because the remark seems to give ontological priority to the manifest image of time over the scientific image.
Ever since Newton’s theory of mechanics in the 17th century, time has been taken to be a theoretical entity, a theory-laden entity, in the sense that we can tell much about time’s key features by looking at the role it plays in our confirmed, fundamental theories. One of those is the theory of relativity that was created in the early 20th century. According to relativity theory, time is not fundamental, but is a necessary feature of space-time, which itself is fundamental. Space-time is all the actual events in the past, present, and future. In 1908, Hermann Minkowski argued that the proper way to understand relativity theory is to say time is really a distinguished, non-spatial dimension of space-time, and time has no existence independent of space. Einstein agreed.
Another philosophically interesting feature of the relationship between time and space in relativity theory is that the division of space-time into space and time is not fixed; it is relative to a chosen frame of reference or coordinate system. We humans in our daily lives do not notice this dependence upon frame of reference because we move too slowly compared to the speed of light, and we never experience large differences in gravitational force, and we can easily agree with each other on the frame of reference to use. Someone fixing the origin of their reference frame to their spaceship that is speeding by us at 99.9% the speed of light would disagree with us about the times and durations of events, and Einstein’s theory surprisingly implies that we are both correct.
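The size of such frame-dependent disagreements can be made concrete with the standard time-dilation factor of special relativity. The short Python sketch below is only this article’s illustration; the helper name and the sample speeds are chosen for the example.

```python
import math

def lorentz_factor(v_fraction_of_c):
    """Time-dilation factor gamma = 1 / sqrt(1 - v^2/c^2),
    with the speed v given as a fraction of the speed of light."""
    return 1.0 / math.sqrt(1.0 - v_fraction_of_c ** 2)

# A clock on a ship passing Earth at 99.9% of light speed runs slow,
# as judged from Earth, by the factor gamma.
gamma = lorentz_factor(0.999)
print(round(gamma, 1))  # about 22.4: one ship-hour spans roughly 22.4 Earth-hours

# At everyday speeds the factor is indistinguishable from 1, which is why
# we never notice the frame dependence in daily life.
print(lorentz_factor(1.4 / 299_792_458))  # walking speed, as a fraction of c
```

Each observer applies the same formula to the other’s clock, which is why, as noted above, both are correct relative to their own frames.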
Whatever time is, one should consider whether time has causal powers. The musician Hector Berlioz said, “Time is a great teacher, but unfortunately it kills all its pupils.” Everyone knows not to take this joke literally because, when you are asleep and then your alarm clock rings at 7:00, it is not the time itself that wakes you. Nevertheless, there are more serious reasons to believe that time has causal powers. Princeton physicist John Wheeler said that according to relativity theory “space-time tells matter how to move….” The very geometry of space-time changes, both over time and from place to place. There is a scientific consensus on this point that the general theory of relativity does imply space and time are dynamic actors and not a passive stage where events occur, as Newton mistakenly believed.
In the early twenty-first century, the theoretical physicist Stephen Wolfram said every physical process is a natural computation, and time is really the inexorable progress of this computation. This progress is what other scientists have been calling “evolving in time according to the laws of nature.” The computation is a digital process. Wolfram believes the atoms of time have a duration of 10⁻¹⁰⁰ seconds. So, there is quite a bit of parallel processing going on throughout the universe. Not many others have accepted Wolfram’s idea, for the following reasons. No single instance of updating has been observed experimentally. The removal of the continuum from physics and the pervasive use of atoms of time and atoms of length violates relativity theory’s Lorentz invariance, which implies that the duration of an event and the distance between two places must vary with reference frames. In response, Wolfram said the promotion of the concept of space-time is the biggest wrong turn in twentieth century physics. One of Wolfram’s critics, the philosopher of physics Tim Maudlin, quipped that, “The physics determines the computational structure, not the other way around.”
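Wolfram’s picture of time as discrete computational updating can be illustrated with one of his far simpler elementary cellular automata; the choice of Rule 30, the wrap-around edges, and the function name here are this article’s illustrative assumptions, not Wolfram’s actual hypergraph models. Each application of the update rule counts as one tick of computational time.

```python
def step(cells, rule=30):
    """One discrete tick: every cell is updated from its three-cell
    neighborhood according to an elementary cellular-automaton rule,
    using Wolfram's rule-numbering scheme and wrap-around edges."""
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
            for i in range(n)]

row = [0] * 7 + [1] + [0] * 7  # start from a single 'on' cell
for _ in range(5):             # five ticks of computational time
    print("".join(".#"[c] for c in row))
    row = step(row)
```

In this toy model, what “evolving in time according to the laws of nature” amounts to is nothing more than repeated application of `step`.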
There is a rising suspicion among twenty-first century physicists that, for shorter and shorter durations below the Planck time scale of 10⁻⁴⁴ seconds, the notion of time itself becomes progressively less applicable to reality. That is why some experts say that the whole idea of time is just an approximation. Even if this were to be so, it would still be appropriate to say time is real, even if it is not fundamental, because human beings need to use the concept of time as the scale increases. Laplace’s Demon does not. The Demon has no limits on its computational capabilities and needs no simple models and coarse-graining and approximations. Counter to this “rising suspicion” among their fellow physicists about time being less applicable to reality at the smallest scales, physicists David Gross and Lee Smolin argue that time is not an approximation, and it will remain fundamental at all scales in all our future fundamental theories regardless of whether time turns out to be continuous or discrete.
The physicist-philosopher Emily Adlam has challenged what she calls the “time evolution model” of nature that has been in place in physics for many centuries. She advocates the “all-at-once model”:
Now, for much of the history of science, that intuitive picture of time as something that flows has been reflected in the way in which we do physics. Physicists have written down laws of nature in the form of time evolution equations which take a state at a given time and tell us how to evolve it forwards in time to arrive at states for later times. These laws invite us to think of the universe as something like a computer which takes in an initial state and evolves it forwards in time to produce the rest of history. But there’s an alternative way to think about the universe. Rather than starting at one end of time and evolving towards the other end, the laws of nature could apply all at once to the whole of history in an atemporal manner.
To visualize this, think about the rules of the game of Sudoku. The rules of Sudoku don’t tell you to start at the left and move towards the right. Rather, they apply all at once to the whole grid—simply judging a total solution as valid or invalid. And in principle the laws of nature could work in a similar way, constraining all of space-time at once and choosing a whole solution in one go rather than generating the course of history moment by moment as in the evolution picture.
Now, if this is correct, it would significantly change the way we think about time and our place within it. …Time isn’t a journey.…The future is already out there…and it is just as real and concrete as the past (“Time Isn’t Going Anywhere,” YouTube Tedx talk, January 2025).
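Adlam’s Sudoku analogy can be sketched in code: a constraint checker does not build the grid cell by cell in any temporal order; it judges a completed candidate all at once, accepting or rejecting the whole solution. The 4x4 miniature below is this article’s illustration, not Adlam’s own example.

```python
def valid_sudoku4(grid):
    """Judge a completed 4x4 Sudoku grid all at once: every row, every
    column, and every 2x2 box must contain exactly the digits 1-4.
    Nothing is generated step by step; the whole grid passes or fails."""
    digits = {1, 2, 3, 4}
    rows = grid
    cols = [[grid[r][c] for r in range(4)] for c in range(4)]
    boxes = [[grid[r][c] for r in (br, br + 1) for c in (bc, bc + 1)]
             for br in (0, 2) for bc in (0, 2)]
    return all(set(group) == digits for group in rows + cols + boxes)

solution = [[1, 2, 3, 4],
            [3, 4, 1, 2],
            [2, 1, 4, 3],
            [4, 3, 2, 1]]
print(valid_sudoku4(solution))                       # True
print(valid_sudoku4([[1, 1, 3, 4]] + solution[1:]))  # False: a row fails
```

On the all-at-once picture, the laws of nature would play a role analogous to `valid_sudoku4`, constraining an entire history rather than evolving an initial state forward.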
One minor comment on the use of the word “state” in Adlam’s quotation above. The word “state” or, equivalently, “state of affairs” of a physical system means a state at one time. For a physical system with multiple, fundamental parts, the state of the system at a time is also called its “configuration.” Every system, except the universe itself, divides the universe into the system itself and its environment that is external to the system.
Later sections of this article and the supplement “What Else Science Requires of Time (That Philosophers Should Know),” introduce other conjectures about how to answer the question, “What is time?”
b. Kinds of Physical Time
Are there different kinds of time? There are many ways to measure time—by pendulums, crystal watches, atomic clocks, and so forth—but our question is whether they are all measuring the same thing. There have been a few suggestions that they are not—because there are kinds of time.
One suggestion is that a pendulum could measure gravitational time while an atomic clock measures electromagnetic time, and the difference between the two times is small enough that their difference has not yet been noticed.
In the 1930s, the physicists Arthur Milne and Paul Dirac worried about this. Independently of each other, they suggested their fellow physicists should investigate whether there may be many correct, but differing, times and time scales. For example, Milne and Dirac worried that there could be the time of atomic processes and the time of nuclear processes and perhaps yet another time of gravitational processes. Perfectly-working clocks for any pair of these processes might drift out of synchrony after being initially synchronized without there being a reasonable explanation for why they do not stay synchronized. It would be a time mystery.
In 1967, physicists rejected the gravitational standard of measuring time and replaced it with the atomic standard because the observed deviation between periodic atomic processes in atomic clocks and periodic gravitational processes such as the Earth’s revolutions could be explained better by assuming that the atomic processes were more regular. Physicists still have no reason to believe a gravitational periodic process that is not affected by friction or impacts or other forces would ever mysteriously drift out of synchrony with a regular atomic process, yet this is the possibility that worried Milne and Dirac.
However, we may in the future be able to perform a much better check on the possibility that electromagnetic time differs from nuclear time. The best atomic clocks are based on the stable resonances of light emitted as electrons transition from one energy level to another in the atoms of the elements ytterbium and strontium. This electron activity is vulnerable to influence from all sorts of stray electric and magnetic fields; therefore, significant shielding is required. Future nuclear clocks will not have this vulnerability. They will be based on transitions of energy levels of neutrons inside the atom’s nucleus, for example, a thorium nucleus. Neutrons are better than protons because neutrons are electrically neutral and so are impervious to those stray fields that can easily affect atomic clocks. Neutrons are affected, though, in another way; they are affected by the strong nuclear force. This force is much stronger than the electromagnetic force, but only over extremely short distances on the order of less than a small atom’s width; and the likelihood of stray, strong nuclear forces that might affect a nuclear clock is expected to be minimal compared to the problems of ensuring the regularity or stability of an atomic clock.
The philosophical question that is relevant to this clock building is: What if the best atomic clocks were discovered to drift in comparison with the best nuclear clocks? Would that show that there are two distinct kinds of physical time? Maybe, but only if we have first eliminated the more mundane hypothesis that the drift is due to interference affecting the atomic clock (or perhaps the nuclear clock). If we could not account for the drift and were presented with a mystery, only then would we consider the more exotic hypothesis that Milne and Dirac entertained.
Another, very different, speculation about kinds of time is that, if two spatial regions differ in their density, then time is different for the two. The suggestion is that time passes more slowly in cosmic voids as opposed to cosmic filaments of clusters of galaxies. “Voids can be as much as 4 billion years older,” said a member of a team of researchers at the University of Canterbury in 2025. So, the time that has passed since the Big Bang differs depending on where you are located in the cosmos.
The term cosmic time is used by astronomers when they say the Big Bang event began at t = 0, and when they say space became transparent at t = 380,000 years. This cosmic time is not a kind of time, but rather is time measured in a special reference frame called the cosmic reference frame.
c. Structure and Topology
Physical time has an exotic structure. This section explores this structure, emphasizing what mathematicians call its topology. Time’s being structured like a one-dimensional spatial line is probably its best-known topological feature. But is time really like a dimension of space? Yes, if we are to trust Einstein, who said that it is the fourth dimension of space-time. But clearly there is a very great difference between time and space. Perhaps the greatest is that knowing what is happening in a room now gives you some helpful information about what is likely to happen there later, but it gives you no helpful information about what is happening now in other rooms. In that sense, now and then is not like here and there. The point being made with this example is that time plays a key role in predictability that space does not.
According to most persons, a significant difference between time and space is that people can move relatively freely in spatial directions, but can only move forward in the time direction. Some express this difference by saying time has a direction or arrow and space does not.
Other topological issues are whether time is like a circle or instead like a straight line. Is it like a fully-extended line that is infinite in both directions, or is it like a ray, or is it like a line segment, with time having both a beginning and an ending?
In a typical coordinate system we use a directed object for each axis of space and time, namely the mathematical line that uses real numbers ordered by the less-than relation to provide names of the line’s points. It could be useful to do this regardless of whether space itself and time itself are discrete or continuous or merely dense. It is very important to distinguish the representation from what is being represented. Moving forward in time may be very different from moving right along a spatial line. The notion of moving right as opposed to left along a coordinate axis is purely a useful convention, so when physicists say, “Let’s let the y axis represent time,” they would not agree that, to know about time’s topological nature, it is sufficient to investigate only the axis.
Is time a composite object, made up of its instants? Perhaps, but we’d like more details about the composition. Is it like a string of pearls along a necklace, with each pearl having a closest neighbor? Points along the mathematical line do not have this structure.
Physicists usually assume time is a composite object that is a continuum of instants, with each instant having a zero duration. It is a continuum in the same way that the real number line is a continuum of real numbers. To understand what that means, note first that physicists usually say time is a structure of instants linearly ordered by the relation “happens before.” To be linearly ordered or, what is the same thing, to be totally ordered means that, for any two non-identical instants, one of the two happens before the other. No instant fails to be on the one time line. In that sense, there may be unique events, but no unique instants. Noon is an instant that is no different than any other instant, but the assassination of the duke at noon is an event that could be a unique event different from all other events.
A mathematical line is a continuum of points. Time seems to be a continuum of instants. Relativity theory and quantum mechanics are the only two fundamental theories of physics in the 21st century, and they both imply time is a linear continuum of instants—it is one-dimensional and gap-free; between any two point instants there are many other instants (specifically a non-denumerably infinite number of instants); for any possible duration, there is a shorter one; and no instant has a next instant.
There are two technical amendments to these assumptions about continuous time. First, physicists want to leave it as an open, empirical question whether there is a first instant with no previous instant. Second, in order to make it be an empirical question whether time is topologically like a straight line rather than a circle, it is usually implicitly assumed not that all of time is a linear continuum but only that every neighborhood of an instant (that is, a short period before and after an instant) is a linear continuum (except for the neighborhood of the first instant, if there is one).
Relativity theory implies time began at a singularity, an infinitely small point of infinite density. It also implies there is a singularity within a black hole. Most physicists believe these implications reveal an error in relativity theory because they are inconsistent with the predictions of quantum mechanics.
Could there really be a time with no previous time? Yes. There are no convincing reasons against this if one does not blindly accept the theory of relativity’s implications. Here is an ancient Greek attempt at a reason for no first instant. Before Socrates, Parmenides declared: “Out of nothing, nothing comes,” so time has no beginning. But how does he know this a priori? Maybe out of nothing, something does come. He cannot rely on the fact that, just because he has not seen this happen in his own life, it is not possible.
Time is not quantized in either relativity theory or quantum mechanics, at least as they are usually interpreted. In 1916, Albert Einstein wrote privately that, even though he was assuming space is a continuum in his new theory of relativity, he believed that space is actually discrete with a shortest possible length. Presumably he believed the same for time, but there is no more information in the historical record. Werner Heisenberg and Niels Bohr, founders of quantum mechanics, also did not personally believe in a temporal and spatial continuum; but they did not promote their doubts because they could not work out the details. So, for the rest of the twentieth century it was accepted by nearly all experts that time is a continuum; and textbooks taught this to subsequent generations.
Yet during the first quarter of the 21st century, as deeper investigations searched for a way to combine the best parts of relativity theory and quantum mechanics, a great many experts began to suspect that both space and time are not continuous at the fundamental level (which is usually the smallest level). Their belief is that continuous time is real, but it is real only because its continuity emerges as the scale increases—in analogy to how an ice cube’s temperature is real but only emerges at higher spatial scales without any molecule of the ice cube having a temperature.
A significant minority of physicists in the 21st century have decided that in order to make sense of what is really going on in quantum mechanics, we must not assume there is a single time line and instead adopt a many-worlds perspective and assume time is a tree of branching time lines with each branch being isolated from the other branches. Our own personal time splits into multiple futures as we interact with our environment. If we could have walked left or walked right at one time, then there is a branch of time in which we did walk left and another branch in which we did walk right. Both persons have identical memories of what happened before they turned, but neither person can now communicate with the other. This odd feature of time is explored in the “What Else” supplement of this article in its sub-section on the Many-Worlds interpretation of quantum mechanics. The rest of this article assumes time has no branching, is one-dimensional and, for neighborhoods (as opposed to its whole length) of points except a possible first point, it is topologically like the real number line; this is the mainstream position.
4. Why There Is Time Instead of No Time
The existence of time is a brute fact according to the two fundamental theories of physics, namely relativity theory and quantum theory. The theories imply time exists at least relative to a chosen reference frame (that is, a formal viewpoint or coordinate system). The reference frame is an abstract, intangible tool that displays what part of space-time is its time part and what part is its space part. However, the two fundamental theories have nothing to say about why space-time exists or why time exists or why time is one-dimensional and not multi-dimensional.
Many physicists believe that the existence of time will be explained. They believe it will not be a brute fact in a new theory of quantum gravity that someday will be able to reconcile the inconsistencies between relativity theory and quantum theory. They believe that in this new theory time will emerge from something timeless. But many other physicists do not believe this about time. The issue remains open. The related issue of whether it is urgent to reconcile the inconsistency between the two fundamental theories, relativity theory and quantum theory, is open, too. The vast majority of researchers in the foundations of physics agree that the problem is urgent, but some have argued that the search for this so-called theory of quantum gravity is a waste of time, and we should live with the inconsistency.
Among physicists and philosophers of physics, there is no agreed-upon answer to why our universe contains time instead of no time, why it contains dynamical physical laws describing change over time, whether these fundamental physical laws will hold tomorrow, why the universe contains the fundamental laws that it does contain, and why there is a universe instead of no universe, although there have been interesting conjectures on all these issues.
There is little support for the negative claim that any of these unsolved problems are intractable, something too difficult for the human mind, in analogy to how even the most clever tuna fish will never learn the chemical composition of the water it swims in.
Here is one linguistic explanation for why time exists: Without time there would be no verbs. A more serious and interesting theological explanation for why time exists is that God wanted the world to be that way. The argument for this often uses the technique called “God of the gaps” which says that, if scientists have a gap in their knowledge, then the presence of God’s actions should be used to fill the gap. Here is an anthropic explanation for why time exists. If it were not to exist, we would not now be taking the time to ask why it does exist. Here is an intriguing non-theological and non-anthropic explanation. As steam cools, eventually it reaches a tipping point and undergoes a phase transition into liquid water. Many cosmologists agree with James Hartle’s and Stephen Hawking’s speculation that the universe should contain laws implying that, as the universe cools, a phase transition occurs during which four-dimensional space is eventually produced from infinite-dimensional space; then, after more cooling, another phase transition occurs during which one of the four dimensions of primeval space collapses to become a time dimension. The previous sentence is a bit misleading because of its grammar which might suggest that something was happening before time began, but that is a problem with the English language, not with this suggestion about the origin of time.
There is a multiverse answer to our question, “Why does time exist?” The reason why our universe exists with time instead of no time is that nearly every kind of universe exists throughout the inflationary multiverse; there are universes with time and universes without time. Like all universes in the multiverse, our particular universe with time came into existence by means of a random selection process without a conscious selector, a process in which every physically possible universe is overwhelmingly likely to arise as an actual universe given enough selections, in analogy to how continual re-shuffling a deck of cards makes it overwhelmingly likely that any specific ordering of the cards will eventually appear. Opponents complain that this multiverse explanation is shallow. To again use the metaphor of a card game, they wish to know why their poker opponent had four aces in the last three hands, and they are not satisfied with the shallow explanation that four aces are inevitable with enough deals or that it is just a random result. Nevertheless, perhaps there exists no better explanation to be found.
5. The Scientific Image of Time
This article has much to say about how the science of time relates to the philosophy of time. Time has been studied for 2,500 years, but only in the early twentieth century did time become one of the principal topics in professional journals of physics, and soon after in the journals of philosophy of science. The primary reason for this was the creation of the theory of relativity.
Any scientific theory can have its own implications about the nature of time, and time has been treated differently in different scientific theories over the centuries. But, as the philosopher Tim Maudlin has emphasized, when it comes to figuring out what a scientific theory implies about the nature of time, “It isn’t that the scientists have worked all this out, and we (philosophers) just need to go ask them.”
When this article speaks of the scientific image of time or what science requires of time, it means the time of the latest accepted theories that are fundamental in physics and so do not depend upon other theories. For example, Einstein’s theory of relativity is fundamental, but Newton’s theory of mechanics is not, nor is Newton’s theory of gravitation, nor is Maxwell’s theory of electromagnetism, nor is statistical mechanics with its Second Law of Thermodynamics. All these non-fundamental theories are useful and used regularly by scientists because they have their domains of applicability. Newton’s theory is useful only for applications where the speed is slow compared to light speed, where there are no extreme changes of gravitational forces, and where durations are very large compared to the Planck time, because in all these cases Newton’s theory agrees with Einstein’s. Newton’s theories are all that are needed to specify the trajectory of a spaceship sent from Earth to land safely on the Moon.
When scientists use their theories, they adopt positions that philosophers call philosophical. Consider, for example, the most famous scientist of the twentieth century, Albert Einstein. He advocated empiricism (knowledge is based on experience, not reason alone, and scientific laws are not logical necessities), determinism (the present plus the fundamental laws uniquely determine the future), realism (there is a mind-independent objective world in which we all live), and physicalism (everything is physical). Regarding physicalism, Einstein would agree with the spirit of W.V.O. Quine’s remark that, “Nothing happens in the world … without some redistribution of microphysical states.” This physicalist position can be re-expressed as the thesis that all the facts about any subject matter such as geology or psychology are fixed by or dependent upon the totality of microphysical facts about the universe involving the distribution of elementary particles. People and their minds are real, but they are not fundamental; they are emergent entities from more fundamental entities.
There have been many challenges to these philosophical assumptions. Philosophers and some scientists are especially interested in whether the human mind might be a special counterexample to physicalism. This is an open question in metaphysics, but so far no scientific experiments or observations have shown clearly that the answer to the question, “Does mind supervene upon physical material?” is negative. Nor do scientific observations ever seem to need us to control for what the observer is thinking. However, physicalism is a metaphysical position that needs to be argued for and not simply assumed.
Underlying all the philosophical issues about physicalism and, more generally, the relationship between science and philosophy, is the question of whether there is a First Philosophy, that is, whether there are metaphysical truths and metaphysical methods that are somehow prior to the Second Philosophy of, say, empiricism, physicalism and naturalism. McTaggart is a metaphysician who promoted First Philosophy. Second Philosophy challenges the claim that there exist effective methods of doing philosophy that can be distinguished from those of the empirical sciences.
In the manifest image, the universe is fundamentally made of objects rather than events. In the scientific image, the reverse holds; the universe is fundamentally made of events rather than objects. Physicists use the term “event” in two ways, and usually only the context suggests which sense is intended. In sense 1, an event is a happening at a place for a certain amount of time. In sense 2, an event is simply a location in space and time. Sense 2 is what Albert Einstein had in mind when he said the world of events forms a four-dimensional continuum in which time and space are not completely separate entities. In either of these two senses of “event,” it is assumed in fundamental scientific theories that longer events are composed of shorter sub-events and that events are composed of instantaneous events, called point-events. The presumption of there being instantaneous events has been controversial. That presupposition upset Alfred North Whitehead who said: “There is no nature apart from transition, and there is no transition apart from temporal duration. This is why an instant of time, conceived as a primary simple fact, is nonsense” (Whitehead 1938, p. 207).
Frames of reference are perspectives on the space or the space-time we are interested in. A coordinate system is what the analyst places on a reference frame to help specify locations quantitatively. But any fact that depends on coordinates is considered by physicists not to be an objective fact. Choosing a different reference frame or coordinate system on space-time is choosing a different way of slicing space-time into space and time. A coordinate system placed on a reference frame of space-time normally assigns four numbers as names of point events; it gives them temporal point-locations (called point-times) and spatial locations (point-places). The best numbers to assign are real numbers (that is, decimals) in order to allow for the applicability of calculus. Calculus implies that a duration of only a billionth of a second still contains a great many point-times, a nondenumerable infinity of them. Relativity theory also implies there are an infinite number of legitimate, different reference frames and coordinate systems. No one of them is distinguished or absolute in Isaac Newton’s sense of specifying what time it “really” is, and where you “really” are, independently of all other objects and events.
Coordinate systems are not objective features of the world. They vary in human choices made about the location of their origins, their scales, the orientation of their coordinate axes, and whether the coordinate system specifies locations by things other than axes, such as the angle between two axes. In relativity theory, reference frames are often called “observers” or “ideally correct observers,” but there is no requirement here that conscious beings be involved.
The two fundamental theories of physics with ontological implications are the general theory of relativity and quantum mechanics. They imply that everything that is fundamentally real and physical is made of quantum fields. The exception is gravity. Many believe gravity, too, is a quantum field, but no one has successfully treated gravity as a quantum field, despite this being the central problem of 21st century physics. Relativity theory and quantum theory disagree with each other in many ways. Relativity theory allows there to be a time when nothing happens, but quantum mechanics denies this. Most physicists are betting that quantum mechanics is correct on this point. Applications of relativity theory to a situation tell you what will happen next. Quantum theory does not; it yields only statistical predictions.
The theory of relativity is well understood philosophically, but quantum mechanics is not, although the mathematical implications of both theories are well understood by mathematicians and physicists. These two fundamental theories are not merely informed guesses. Each is a confirmed set of precise, teleology-free laws with recognized ways to apply the laws to physical reality. The theories have survived a great many experimental tests and observations, so the scientific community trusts their implications in cases in which they do not conflict with each other.
Below is the scientific image of time presented as a numbered list of its most significant implications about time, with emphasis upon relativity theory as standardly interpreted.
(1) When you look at a distant object, you see it as it was, not as it is.
Because seeing an object requires light to travel from the object to your eyes, and because the speed of light is not infinite, and because it takes time for the brain to process information that it receives from the eyes, the information your brain obtains by looking at an object is information about how it was, not how it is. The more distant the object that you see, the more outdated is the information. The sensory experience you have when you see Saturn in a telescope is in your present, but the Saturn that you see is how it was an hour and a half earlier.
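The scale of this delay can be checked with simple arithmetic. In the sketch below, the distances are illustrative round figures (the Earth–Saturn separation varies between roughly 8 and 11 astronomical units), not precise ephemeris values:

```python
# Light-travel delay: how out of date is your image of a distant object?
C = 299_792_458.0      # speed of light, m/s
AU = 1.495978707e11    # astronomical unit, m

def light_delay_minutes(distance_m: float) -> float:
    """Minutes that light needs to cross the given distance."""
    return distance_m / C / 60.0

# Earth-Saturn distance varies between roughly 8 and 11 AU:
print(round(light_delay_minutes(8 * AU)))    # ~67 minutes when nearest
print(round(light_delay_minutes(11 * AU)))   # ~91 minutes when farthest
```

So the Saturn you see through a telescope is roughly between one and one-and-a-half hours out of date, depending on where the two planets are in their orbits.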
(2) The duration of the past is at least 13.8 billion years.
Our universe is expanding, so it must have been smaller in the past. This remark summarizes the key feature of the Big Bang Theory of how the universe was billions of years ago. It is well confirmed (though not as well confirmed as relativity theory), and it requires the past of the universe to extend back at least 13.8 billion years, to when an explosion of space occurred, the so-called “big bang.” Because it is unknown whether anything happened before this big bang event, it is better to think of the big bang not as the beginning of time but rather as the beginning of what we understand about our distant past. A large majority of cosmologists believe the big bang’s expansion is an expansion of space but not of space-time and thus not of time. When cosmologists speak of space expanding, this remark is about increasing distances among clusters of galaxies. There is no implication that the space between two opposite walls of your bedroom is expanding.
(3) Time is one-dimensional.
The scientist Joseph Priestley in 1765 first suggested time is like a one-dimensional line. The idea quickly caught on, and now time is represented as one-dimensional in all the fundamental theories of physics. Two-dimensional time has been studied by mathematical physicists, but no theory implying that time has two dimensions in our actual universe has acquired a significant number of supporters. Such a theory is very difficult to make consistent with what else we know, and there is no motivation for doing so. Because of this one-dimensionality, time is represented in a coordinate system with a time line rather than a time area, and so its geometry is much simpler than that of space. However, neither the geometry of real, physical space nor the geometry of time is something that can be known a priori. Both Euclid and Kant mistakenly supposed otherwise.
(4) Time connects all events.
Given any two events that ever have existed or ever will, one of the events happens before the other or else they happen simultaneously. No exceptions.
(5) Time travel is possible, but you cannot change the past.
You can travel to the future—to meet your great, great grandchildren. Your traveling to the future (someone else’s future) has been experimentally well-confirmed many times. However, there is no consensus among scientists regarding whether you might someday be able to travel into your own past. If you were able to travel to the past, you could not change it because doing so would make something be both so and not so, and everyone believes that there can be contradictions in language but not in reality. Here is an example of a contradiction in language: “1+1 is 2, but also it isn’t.”
(6) Time is relative.
According to relativity theory, the amount of time an event lasts (the event’s duration) is relative to someone’s choice of a reference frame or coordinate system or vantage point. How long you slept last night can be eight hours as measured by a correct clock next to your bed or eight centuries as judged by a clock in a spaceship speeding away from you at close to the speed of light. If no reference frame has been pre-selected, then it is a violation of relativity theory to say one of those two durations for your sleeping is correct and the other is incorrect. Newton would have said both durations cannot be correct. In response, Einstein and his colleague Infeld said, “In classical physics it was always assumed that clocks in motion and at rest have the same rhythm…[but] if the relativity theory is valid, then we must sacrifice this assumption. It is difficult to get rid of deep-rooted prejudices, but there is no other way.”
Assuming duration is relative in this way, the conclusion is drawn that:
(7) Time is not an objectively real feature of the universe.
According to relativity theory, space-time is objectively real, but time is not (and neither is space). A state of the universe at a single time is also frame-relative. Places are frame-relative, and so are shapes. All this relativity is not noticeable in ordinary circumstances in our lives. To some philosophers, these claims cast doubt upon either the theory of relativity itself or the importance that scientists ascribe to frame-independence.
(8) Simultaneity is relative and the time-order of some pairs of events is not objective.
According to relativity theory, if two events, say A and B, happen far enough away from each other and close enough in time, then there is no fact of the matter as to whether A occurs before B or, instead, B occurs before A. Clearly, this is a violation of the manifest image. Also, two persons (who reason correctly using their own reference frame) will not agree on which pairs of distant events occurred simultaneously. Relativity theory implies there is no uniquely correct answer to the question, for some distant place, “What is happening now at that place?” The answer depends on what observer is answering the question, or more specifically what reference frame is being assumed.
(9) Within a single reference frame, coordinate time “fixes” (i) when each event occurs, (ii) what any event’s duration is, (iii) what other events occur simultaneously with it, and (iv) the time-order of any two events.
The term “coordinate time” refers to time measured along the time dimension in a chosen coordinate system.
(10) Speeding clocks run slower.
Two perfectly functioning and synchronized clocks can, over time, disagree about the time and thus be “out of sync.” As Sean Carroll has remarked, “In Newton’s universe, time was universal, and clocks wouldn’t go out of sync just because someone took a rapid jaunt through space. In Einstein’s universe (which presumably is our universe) time is personal, and your clocks reflect the particularity of your journey.” Your personal time is called your “proper time.” It is the time your perfect clock measures when you consider yourself to be stationary in your own reference frame. According to the special theory of relativity, from this perspective speeding clocks run slow, but yours does not. Their time is said to be “dilated” (that is, stretched or extended) compared to your stationary clock. The more speeding, the more dilation. Bergson denied this, and said time is the same for everyone, but the empirical evidence confirms Einstein’s position. When someone else’s time dilates, their space also contracts; coordination between time dilation and space contraction is required in order to satisfy Einstein’s requirement that nothing travels faster than light. (The requirement, said more technically and more accurately, is that nothing can ever go so fast that its space-time trajectory is space-like and escapes its forward light cone.)
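The size of this dilation is governed by the Lorentz factor γ = 1/√(1 − v²/c²): a moving clock’s ticks are stretched by γ compared to the stationary clock. A minimal numerical sketch (the speeds below are chosen for illustration):

```python
import math

C = 299_792_458.0  # speed of light, m/s

def gamma(v: float) -> float:
    """Lorentz factor: how much a clock moving at speed v is dilated."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

# A clock cruising past at 99% of light speed runs slow by a factor of ~7.09,
# so one of its hours corresponds to about seven of yours:
print(round(gamma(0.99 * C), 2))   # 7.09

# At everyday speeds (here 300 m/s, roughly an airliner) the factor is
# indistinguishable from 1, which is why we never notice the effect:
print(gamma(300.0))                # ≈ 1.0000000000005
```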
(11) Time depends on altitude; it runs slower downstairs than upstairs.
According to general relativity, initially synchronized clocks will get out of sync if they are affected differently by gravity. Take two of the world’s best clocks that are synchronized and sitting beside each other. If you lift one up a small fraction of an inch and set it back down, then their synchrony will be lost, and the clock you moved higher will be slightly ahead. This dilation due to gravity is a second kind of time dilation and is called “gravitational time dilation.” This effect of gravity holds for all physical processes, not just the ticking of clocks, so you will live longer on the first floor than on the tenth floor of your apartment building because the effect of Earth’s gravity is weaker up there.
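In the weak-field approximation near Earth’s surface, two clocks separated vertically by height h differ in rate by the fraction gh/c². The short calculation below is a rough illustration (the 33 cm height echoes the 2010 NIST optical-clock experiment; the figures are approximate):

```python
# Weak-field gravitational time dilation: a clock raised by height h runs
# fast relative to one below it by the fraction g*h/c^2.
G_EARTH = 9.81             # surface gravity, m/s^2
C = 299_792_458.0          # speed of light, m/s
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def fractional_rate_gain(h_m: float) -> float:
    """Fractional rate difference between clocks separated by h_m meters."""
    return G_EARTH * h_m / C**2

# Raising a clock ~33 cm:
frac = fractional_rate_gain(0.33)
print(frac)                             # ~3.6e-17
print(frac * SECONDS_PER_YEAR * 1e9)    # ~1.1 nanoseconds gained per year
```

The effect is tiny, which is why only the world’s best clocks can detect a fraction-of-an-inch height difference.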
(12) Time can warp or curve.
When time warps, clocks do not bend as if in a Salvador Dali painting. Instead, they undergo time dilation in which ideally good clocks undergoing different motions or different forces of gravity will disagree on how long an event lasts. In this sense, time is malleable. Newton’s theory and Einstein’s special theory of relativity do not allow the curvature of either space or time. Einstein’s general theory of relativity implies space-time can curve by bending, stretching, shaking and rippling (or doing all these together) just as a gelatin dessert can.
(13) Black holes slow down time.
A black hole is the place in our universe where time is the strangest. According to relativity theory, if you were in a spaceship and had passed through its outer boundary, called its event horizon and point of no return, then in order to escape back beyond the horizon, you would have to travel faster than light. If you were outside and your spaceship accelerated rapidly toward the horizon so that you quickly plunged inside (according to time as measured by you), viewers far away from the black hole would see your spaceship progressively slow its speed during its approach to the horizon. Reports sent by radio back toward Earth of the readings of your spaceship’s clock would become lower in frequency, and they would contain evidence that your clock’s ticking was slowing down (dilating) compared to Earth clocks. An outside viewer watching your spaceship as it plunges toward the horizon might never live long enough to see your spaceship actually reach the event horizon, although by your own clock on your spaceship you reached it in perhaps less than a minute.
(14) The notion of now is not objective for events happening far away.
The only reason that there is such a thing as THE correct time everywhere is that we accept the convention of trusting reports from just one clock, our standard clock or master clock at the Greenwich Observatory in London, England. But relativity theory allows other conventions, such as choosing the master clock to be a clock speeding by the Observatory at 99% the speed of light. Two observers using these two different conventions would disagree about which distant events are happening now. Technically the phrase “events happening far away” in item (14) refers to events that are spacelike separated from you, namely events occurring in the region that physicists call “your absolute elsewhere.”
(15) You have enough time left in your life to visit the edge of the Milky Way galaxy and return.
One philosophically interesting implication of time dilation in relativity theory is that in your lifetime, without using cryogenics, you have enough time to visit the far side of our Milky Way galaxy 100,000 light years away from Earth and then return to report on your adventure. The closer you get to the speed of light the closer you get to experiencing no time at all (as measured by your own clock). Assuming your clock and Earth-based clocks were initially synchronized and that you had a very speedy rocketship that could avoid obstacles, you might use only a few years of your personal time to get there and back, but the Earth-based clocks will show your roundtrip took you over 200,000 years.
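The arithmetic behind this claim is a straightforward application of time dilation: the traveler’s elapsed time is the Earth-frame travel time divided by the Lorentz factor γ. The sketch below ignores the acceleration and deceleration phases, a deliberate simplification:

```python
import math

def proper_time_years(distance_ly: float, beta: float) -> float:
    """Traveler's elapsed years for a one-way trip of distance_ly light-years
    at constant speed beta*c, ignoring acceleration phases."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    earth_time = distance_ly / beta   # years elapsed on Earth clocks
    return earth_time / gamma         # years elapsed on the traveler's clock

# Round trip to the far side of the galaxy, 100,000 light-years each way,
# at a speed chosen so that gamma = 100,000:
beta = math.sqrt(1.0 - 1.0 / 100_000**2)
print(round(2 * proper_time_years(100_000, beta), 3))   # ~2.0 traveler-years
```

Earth clocks record a bit over 200,000 years for the round trip, while the traveler ages only about two years, matching the article’s figures.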
(16) There are no instantaneous causal influences.
Relativity theory implies that there are no causal influences that move faster than the speed that light can travel in a vacuum. Newton’s theory of gravity allowed a gravitational influence from the Sun to the Earth to travel at infinite speed.
(17) Science does not require atoms of time.
Many physicists do suspect that there are atoms of time, but there is no convincing experimental evidence for this suspicion. The main difficulty with the idea of a smallest duration is that this would be a smallest duration for all reference frames, but general relativity implies duration can be different in different reference frames, so a major overhaul of the general theory of relativity would be required, and no one has satisfactorily shown how to do this.
(18) There is time-translation invariance except perhaps at the cosmic level.
What it means for the laws of nature to be time-translation invariant is that they do not depend on what time it is, and they do not change as time goes by. Your health might change as time goes by, but the basic laws underlying your health do not. This translation symmetry property implies the equivalence of all instants. It can be expressed using the language of coordinate systems by saying that replacing the time variable t everywhere by t + 4 in a fundamental law does not change what processes are allowed by the law. The choice of “4” was an arbitrary choice of a real number. Surprisingly, time-translation invariance is equivalent to energy conservation; if one fails so does the other.
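The connection between time-translation invariance and energy conservation is an instance of Noether’s theorem. The invariance itself can be checked numerically for a simple law. A minimal sketch, using the harmonic-oscillator law x″ = −x as a stand-in for “a fundamental law” (the helper names are ours): a solution shifted in time by 4 units still satisfies the same equation.

```python
import math

def residual(x, t, h=1e-4):
    """Finite-difference check of how badly x fails to satisfy
    x'' + x = 0 at time t (zero means x obeys the law there)."""
    x_second = (x(t + h) - 2 * x(t) + x(t - h)) / h**2
    return x_second + x(t)

solution = math.cos                        # cos(t) solves x'' = -x
shifted = lambda t: math.cos(t + 4)        # same motion, shifted by 4

r1 = abs(residual(solution, 1.0))          # ~0: obeys the law
r2 = abs(residual(shifted, 1.0))           # ~0: the shifted copy does too
```

Both residuals are numerically negligible: replacing t by t + 4 changes which process occurs, but not whether the process is allowed by the law.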
(19) All the important fundamental physical laws are invariant under time-reversal.
When you reverse time, causes become effects and effects become causes, but no important fundamental laws are violated. This point about the laws having invariance or, what is the same thing, symmetry under time-reversal can be expressed informally by saying that if you make a documentary film and show it in reverse, what you see may look very surprising, but actually nothing shown violates an important fundamental physical law. The reversal probably violates the Second Law of Thermodynamics, which demands that entropy increase on average over time, but that law is not fundamental. Another way time-reversal symmetry shows itself is in the fact that the form of the fundamental laws looks the same if you change the time variable “t” to its negation “-t”.
The reason for using the hedge word “important” above is that some rarely seen decays of certain mesons do violate time-reversal, but the more common and important processes in the universe do not; they all could possibly go the other way. So, one says “almost” all the fundamental laws of physics are time symmetric.
If almost all the fundamental laws are time-reversal symmetric, this raises the interesting question of why all the common physical processes are seen by us to go spontaneously in only one of the two possible directions in time, as if time has an arrow. This puzzling issue is examined further in the later section on the arrow of time.
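The “reversed film” point can be illustrated numerically for Newtonian free fall, which is time-reversal symmetric. In this sketch (the integrator and names are ours), evolving a falling object forward, flipping the sign of its velocity, and evolving forward again under the same law returns it to its starting state: the reversed motion is just as lawful as the original.

```python
def step(x, v, dt, a=-9.8):
    """One velocity-Verlet step for constant acceleration a (free fall)."""
    v_half = v + 0.5 * a * dt
    x_new = x + v_half * dt
    v_new = v_half + 0.5 * a * dt
    return x_new, v_new

def run(x, v, steps, dt):
    for _ in range(steps):
        x, v = step(x, v, dt)
    return x, v

x0, v0 = 0.0, 20.0                     # launch upward at 20 m/s
x1, v1 = run(x0, v0, 1000, 0.001)      # evolve forward one second
xr, vr = run(x1, -v1, 1000, 0.001)     # reverse velocity, evolve again
# (xr, -vr) recovers (x0, v0): the "film run backward" obeys the same law
```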
This ends our list of the most significant features of time according to the scientific image. It presumes the majority position on the theory of relativity, even though some of the points above would need changing if one of the minority positions takes over. We speak of a “majority” because science is controversial and much less “cut and dried” than it appears in student science textbooks. Many very intelligent scientists and philosophers assert minority positions on a remarkable number of issues.
The reason the above list of 19 statements has so little to say about quantum mechanics is that, even though there is considerable agreement among the experts that quantum mechanics might have deep implications about the nature of time, there is little consensus on what those implications are. The major problem is that quantum theory is inconsistent with the general theory of relativity. Discussion of this controversy continues in the supplement What Else Science Requires of Time (That Philosophers Should Know). For more about special relativity, see Special Relativity: Proper Times, Coordinate Systems, and Lorentz Transformations. For an extended defense of the claim that science might have very little to tell us about real time, see (Baron et al. 2022).
6. Time and Change (Relationism vs. Substantivalism)
Does physical time necessarily depend on change existing, or vice versa? Philosophers have been sharply divided on these issues, and any careful treatment of them requires clarifying the relevant terms being used.
Let’s focus on whether time necessarily involves change. If it does, then what sort of change is required? For example, would time exist in a universe that changes, yet does not change in a regular enough manner to have a clock? Those who answer “yes” are quick to point out that there is a difference between not being able to measure some entity and that entity not existing. Those who answer “no” have sometimes said that if an entity cannot be measured, then the very concept of it is meaningless—although not that it must be meaningless, as a Logical Positivist would declare, but only that it is as a matter of fact meaningless. The latter position is defended by Max Tegmark in (Tegmark 2017).
Classical relationists claim that time necessarily involves change, and classical substantivalists say it does not. Substantivalism (also called substantialism) implies that both space and time exist always and everywhere regardless of what else exists or changes. Substantivalists say space and time provide a large, invisible, inert container within which matter exists and moves independently of the container. The container provides an absolute rest frame, and motion relative to that frame is real motion, not merely relative motion. Relationism (also called relationalism and relationality) implies space and time are not like this. It implies there is no container, so, if you take away all of matter’s motions, you take away time, and if you also take away all the matter itself, you take away space.
Substantivalism is the thesis that space and time exist always and everywhere independently of physical material and its events.
Relationism is the thesis that space is only a set of relationships among existing physical material, and time is a set of relationships among the events of that physical material.
Relationism is inconsistent with substantivalism. Substantivalism implies there can be empty time, time without the existence of physical events. Relationism does not allow empty time. It is committed to the claim that time requires material change. That is, necessarily, if time exists, then change exists.
Everyone agrees that clocks do not function without change and that time cannot be measured without there being changes, but the present issue is whether time exists without changes. Can we solve this issue by testing? Could we, for instance, turn off all changes and then look to see whether time still exists? No, the issue has to be approached indirectly.
Relationists and substantivalists agree that, perhaps as a matter of fact, change is pervasive and so is time. What is contentious is whether time exists even if, perhaps contrary to fact, nothing is changing. This question of whether time requires change is not the question of whether change requires time, nor is it the question of whether time is fundamental.
To make progress, more clarity is needed regarding the word change. The meaning of the word is philosophically controversial. It is used here in the sense of what has been called ordinary change—an object changing its ordinary properties over time. For example, a leaf changes its location if it falls from a branch and lands on the ground. This ordinary change of location is very different from the following four extraordinary kinds of change. (1) The leaf changes by being no longer admired by Donald. (2) The leaf changes by moving farther into the past. (3) The leaf changes across space from being green at its base to brown at its tip, all at one time. (4) The leaf changes not only by having different locations at different times, but also by the coming into being of its new location and the ceasing to be of its old location. The second kind of change is accepted by all advocates of McTaggart’s A-theory, and the fourth is accepted by a great many of them. So, a reader needs always to be alert about whether an occurrence of the word change means ordinary change or one of the extraordinary kinds of change.
There is a fifth kind of change that also is extraordinary. Consider what the word properties means when we say an object changes its properties over time. When referring to ordinary change of properties, the word properties is intended to exclude what Nelson Goodman called grue-like properties. Let us define an object to be grue if and only if, during the time that it exists, it is green before the beginning of the year 1888 but is blue thereafter. With this definition, we can conclude that the world’s chlorophyll underwent a change from grue to non-grue in 1888. We naturally would react to drawing this conclusion by saying that this change in chlorophyll is very odd, not an ordinary change in the chlorophyll, surely nothing that would be helpful to the science of biology.
Classical substantival theories are also called absolute theories. The term absolute here implies existing without dependence on anything except perhaps God. The relationist, on the other hand, believes time’s existence depends upon material events.
Many centuries ago, the manifest image of time was relationist, but due to the influence of Isaac Newton upon the teaching of science in subsequent centuries and then this impact upon the average person who is not a scientist, the manifest image has become substantivalist.
a. History of the Debate from Aristotle to Kant
Aristotle had said, “neither does time exist without change” (Physics, Book IV, chapter 11, page 218b). This claim about time is often called Aristotle’s Principle. In this sense he was Leibniz’s predecessor, although Leibniz’s relationism contains not only Aristotle’s negative element that there is no changeless time but also a positive element that describes what time is. In opposition to Aristotle on this topic, Democritus spoke of there being an existing space within which matter’s atoms move, implying space is substance-like rather than relational. So, the ancient Greek atomists were predecessors to Newton on this topic.
The battle lines between substantivalism and relationism were drawn more clearly in the early 18th century when Leibniz argued for relationism and Newton argued against it. Leibniz claimed that space is a network of objects. It is nothing but the “order of co-existing things,” so without objects there is no space. “I hold space to be something merely relative, as time is; …I hold it to be an order of coexistences, as time is an order of successions.” Time is a relational order of successions of events, with events causing other events. The typical succession-relationships Leibniz is talking about here are that this event caused that event to occur two minutes later. If asked what a specific time is, a modern Leibnizian would be apt to say a single time is a set of simultaneous events.
Opposing Leibniz, Isaac Barrow and his student Isaac Newton returned to a Democritus-like view of space as existing independently of material things; and they similarly accepted a substantival theory of time, with time existing independently of all motions and other kinds of events. Newton’s actual equations of motion and his law of gravity are consistent with both relationism and substantivalism, although this point was not clear at the time to either Leibniz or Newton.
In 1670 in his Lectiones Geometricae, the English physicist Isaac Barrow rejected any necessary linkage between time and change. He said, “Whether things run or stand still, whether we sleep or wake, time flows in its even tenor.” Barrow also said time existed even before God created the matter in the universe. Newton agreed. In Newton’s unpublished manuscript De gravitatione, written while he was composing his Principia, he said, “we cannot think that space does not exist just as we cannot think there is no duration” (Newton 1962, p. 26). This suggests that he believed time exists necessarily, and this idea may have influenced Kant’s position that time is an a priori condition of all appearance whatsoever.
Newton believed time is not a primary substance, but is like a primary substance in not being dependent on anything except God. For Newton, God chose some instant of pre-existing time at which to create the physical world. From these initial conditions, including the forces acting on the material objects, the timeless scientific laws took over and guided the material objects, with God intervening only occasionally to perform miracles. If it were not for God’s intervention, the future would be a logical consequence of the present.
Leibniz objected. He was suspicious of Newton’s substantival time because it is undetectable, which, he supposed, made the concept incoherent. Leibniz argued that time should be understood not as an entity existing independently of actual, detectable events. He complained that Newton had under-emphasized the fact that time necessarily involves an ordering of events, the “successive order of things,” such as one event happening two minutes after another. This is why time needs events, so to speak. Leibniz added that this overall order is time.
It is clear that Leibniz and Newton had very different answers to the question, “Given some event, what does it mean to say it occurs at a specific time?” Newton would say events occur at some absolute time that is independent of what other events occur, but Leibniz would say we can properly speak only about events occurring before or after or simultaneous with other events. Leibniz and Newton had a similar disagreement about space. Newton believed objects had absolute locations that need no reference to other objects’ locations, but Leibniz believed objects can be located only via spatial relations to other material objects—by an object being located above or below or three feet from another object.
One of Leibniz’s criticisms of Newton’s theory is that it violates Leibniz’s Law of the Identity of Indiscernibles: If two things or situations cannot be discerned by their different properties, then they are really identical; they are just one and not two. Newton’s absolute theory violates this law, Leibniz said, because it implies that if God had shifted the entire world some distance east and its history some minutes earlier, yet changed no properties of the objects nor relationships among the objects, then this would have been a different world—what metaphysicians call an ontologically distinct state of affairs. Leibniz claimed there would be no difference because there would be no discernible difference in the two, so there would be just one world here, not two, and so Newton’s theory of absolute space and time is faulty. This argument is called “Leibniz’s shift argument.”
Regarding the shift argument, Newton suggested that, although Leibniz’s a priori Principle of the Identity of Indiscernibles is correct, God is able to discern differences in absolute time or space that mere mortals cannot.
Leibniz offered another criticism. Newton’s theory violates Leibniz’s a priori Principle of Sufficient Reason: that there is a sufficient reason why any aspect of the universe is the way it is and not some other way. Leibniz complained that, since everything happens for a reason, if God shifted the world in time or space but made no other changes, then He surely would have no reason to do so.
Newton responded that Leibniz is correct to accept the Principle of Sufficient Reason but is incorrect to suppose there is a sufficient reason knowable to humans. God might have had His own reason for creating the universe at a given absolute place and time even though mere mortals cannot comprehend His reason.
Newton later admitted to friends that his two-part theological response to Leibniz was weak. Historians of philosophy generally agree that if Newton had said no more, he would have lost the debate.
Newton, through correspondence from his friend Clarke to Leibniz, did criticize Leibniz by saying, “the order of things succeeding each other in time is not time itself, for they may succeed each other faster or slower in the same order of succession but not in the same time.” Leibniz probably should have paid more attention to just what this remark might imply. However, Newton soon found another clever and clearer argument, one that had a much greater impact at the time. He suggested a thought experiment in which a bucket’s handle is tied to a rope hanging down from a tree branch. Partially fill the bucket with water, grasp the bucket, and, without spilling any water, rotate it many times until the rope is twisted. Do not let go of the bucket. When everything quiets down, the water surface is flat and there is no relative motion between the bucket and its water. That is situation 1. Now let go of the bucket, and let it spin until there is once again no relative motion between the bucket and its water. At this time, the bucket is spinning, and there is a concave curvature of the water surface. That is situation 2.
How can a relational theory explain the difference in the shape of the water’s surface in the two situations? It cannot, said Newton. Here is his argument. If we ignore our hands, the rope, the tree, and the rest of the universe, says Newton, each situation is simply a bucket with still water; the situations appear to differ only in the shape of their water surface. A relationist such as Leibniz cannot account for the change in shape. Newton said that even though Leibniz’s theory could not be used to explain the difference in shape, his own theory could. He said that when the bucket is not spinning, there is no motion relative to space itself, that is, to absolute space; but, when it is spinning, there is motion relative to space itself, and so space itself must be exerting a force on the water to make the concave shape. This force pushing away from the center of the bucket is called centrifugal force, and its presence is a way to detect absolute space.
Because Leibniz and his supporters had no counter to this thought experiment, for over two centuries Newton’s absolute theory of space and time was generally accepted by European scientists and philosophers, with the notable exceptions of Locke in England and d’Alembert in France. In the 21st century, many would point out that Newton’s claim that the water surface remains concave after the spinning presupposes the influence of Earth’s gravitational field, and that we cannot trust our intuitions about what the shape would be when that field is removed.
One hundred years later, Kant entered the arena on the side of Newton. Consider two nearly identical gloves except that one is right-handed and the other is left-handed. In a world containing only a right-hand glove, said Kant, Leibniz’s theory could not account for its handedness because all the internal relationships among parts of the glove would be the same as in a world containing only a left-hand glove. However, intuitively we all know that there is a real difference between a right and a left glove, so this difference can only be due to the glove’s relationship to space itself. But if there is a space itself, then the absolute or substantival theory of space is better than the relational theory. This indirectly suggests that the absolute theory of time is better, too.
Newton’s theory of time was dominant in the 18th and 19th centuries, even though Christiaan Huygens (in the 17th century) and George Berkeley (in the 18th century) had argued in favor of Leibniz. See (Huggett 1999) and (Arthur 2014) for a clear, detailed discussion of the opposing positions of Leibniz and Newton on this issue.
b. History of the Debate after Kant
Leibniz’s criticisms of Newton’s substantivalism are clear enough, but the positive element of Leibniz’s relationism is vague. It lacked specifics: Leibniz assumed uncritically that his method for abstracting duration from change is unique, but he never defended this uniqueness assumption. That is, what exactly is it about the relationship of objects and their events that produces time and not something else? Nor did Leibniz address the issue of how to define the duration between two arbitrarily chosen events.
Newton and subsequent substantivalists hoped to find a new substance for defining absolute motion without having to appeal to the existence and location of ordinary material objects. In the late 19th century, the substantivalists discovered a candidate. It was James Clerk Maxwell’s luminiferous aether, the medium that waves when there is a light wave. Maxwell had discovered that light is an electromagnetic wave. Since all then-known waves required a medium to wave, all physicists and philosophers of science at the time believed Maxwell when he said the aether was needed as a medium for the propagation of electromagnetic waves and also when he said that it definitely did exist even if it had never been directly detected. Yet this was Maxwell’s intuition speaking; his own equations did not require a medium for the propagation.
The idea was that velocity relative to the ether was the “real velocity” or “absolute velocity” of an object as opposed to a velocity relative to some ordinary object like the Earth or a train.
Late in the 19th century, the physicist A. A. Michelson and his chemist colleague Edward Morley set out to detect this aether experimentally by detecting the aether wind as the Earth flew through it. Their assumption was that at different times of the year along the Earth’s path around the Sun, the Earth would move at different angles relative to the aether, and thus the speed of light measured on Earth would differ at different times of the year. Their interferometer experiment was at the time the most sensitive experiment in the history of physics, yet somehow it failed to detect an aether wind or any difference in the speed of light at different times of year. Some physicists, including Michelson himself, believed the failure to detect the wind showed only that a better experimental apparatus was needed. Other physicists believed that the aether was somehow corrupting the apparatus. Most others, however, believed the physicist A. J. Fresnel, who said the Earth drags a layer of the aether with it, so the Earth’s nearby aether moves in concert with the Earth itself. If so, this would make the aether undetectable by the Michelson-Morley apparatus, as long as the apparatus was used on Earth and not in outer space. That is, there is no aether wind detectable close to Earth because of the aether drag there. No significant physicist said there was no aether to be detected.
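The expected signal was a shift in the interference fringes as the apparatus was rotated. The standard classical prediction is a shift of about N ≈ 2Lv²/(λc²) fringes; with the 1887 experiment’s roughly 11-meter effective arm length, the Earth’s 30 km/s orbital speed, and visible light, this comes to about 0.4 of a fringe, yet no shift of that size was observed. The function below is our illustrative sketch of this back-of-the-envelope calculation:

```python
def expected_fringe_shift(arm_length_m, v_m_s, wavelength_m, c=3.0e8):
    """Fringe shift predicted by classical aether theory when the
    interferometer is rotated 90 degrees: N = 2 L v^2 / (lambda c^2)."""
    return 2 * arm_length_m * v_m_s**2 / (wavelength_m * c**2)

# ~11 m effective arm, 30 km/s orbital speed, 500 nm light
n = expected_fringe_shift(11.0, 3.0e4, 5.0e-7)   # about 0.4 of a fringe
```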
However, these ad hoc rescues of the aether hypothesis did not last long. In 1893, the physicist-philosopher Ernst Mach, who had a powerful influence on Albert Einstein, offered an original argument that attacked Newton’s bucket argument, promoted relationism, and did not assume the existence of absolute space (the aether) or absolute time. Absolute time, said Mach, “is an idle metaphysical conception.” Mach claimed Newton’s error was in not considering the presence or absence of stars or, more specifically, not considering the combined gravitational influence of all the matter in the universe beyond the bucket. That is what was curving the water surface in the bucket when the water was spinning. If that matter were not present, then the water would not climb up the side of the spinning bucket.
To explore Mach’s argument, consider a female ballet dancer who pirouettes in otherwise empty space. Our experience on Earth is that her arms always splay out from her body as she spins. Would her arms have to do so in this thought experiment if there were no Earth and stars? Newton would answer “yes.” Similarly, if we were to spin Newton’s bucket of water in otherwise empty space, would the presence of absolute space eventually cause the surface of the water to become concave? Newton would answer “yes.” Mach would say the question makes no sense because the very notion of spin must be spin relative to some object, such as the surrounding stars. If you take away all the surrounding objects of a spinning object, it makes no sense to call it a spinning object. Mach would add that, if the distant stars were retained in the thought experiment, then the question would make sense and there would be spin relative to the stars, and he would change his answer to “yes.” Newton believed the presence or absence of the distant stars is irrelevant to the situations with a spinning ballet dancer and a spinning bucket of water. Unfortunately, Mach did not provide any detailed specification of how the distant stars exerted their influence on the water in Newton’s bucket or on a ballet dancer, and he had no suggestion for an experiment to test his answer; and so nearly all physicists and philosophers of physics were not convinced by Mach’s reasoning. Thus, the prevailing orthodoxy was that, because of Maxwell’s aether, Newton’s substantivalism is correct.
A young Albert Einstein was very intrigued by Mach’s remarks. He at first thought Mach was correct, and even wrote him a letter saying so, but he eventually rejected Mach’s position and took an original, relationist position on the issue.
In 1905, he proposed his special theory of relativity, and in 1915 he proposed his general theory of relativity. Neither requires the existence of Newton’s absolute space or Maxwell’s aether. The theory of relativity was immediately understood by the leading physicists, and, when experimentally confirmed, it caused the physics and philosophy communities to abandon classical substantivalism.
To give just one implication of relativity theory, Einstein and colleagues noticed that there is a special asymmetry in the two bucket scenarios. In the first, the water is stationary; in the second, the water is rotating. Velocity is relative, but acceleration and rotation are not, according to the new theory of relativity. You can feel when your body is accelerated, but you might be sitting on a train that jiggles and not be able to tell whether your train’s speed relative to the train tracks is zero or not. Rotation, spin, and change of velocity are accelerations and so are absolute, according to relativity theory. The acceleration is what produces an outward centrifugal force on the water in the second bucket scenario but not the first. This difference is now considered to be the key to resolving the bucket puzzle. The resolution does not rely upon there being either distant stars or an aether.
The intellectual tide quickly turned against what Newton had said in his Principia, namely that “Time exists in and of itself and flows equably without reference to anything external.” Influenced by relativity theory, the philosopher Bertrand Russell became an articulate promoter of relationism in the early twentieth century.
Waxing philosophical in The New York Times newspaper in 1919, Einstein declared his general relativity theory to be a victory for relationism:
Till now it was believed that time and space existed by themselves, even if there was nothing—no Sun, no Earth, no stars—while now we know that time and space are not the vessel for the Universe, but could not exist at all if there were no contents, namely, no Sun, no Earth, and other celestial bodies.
Those remarks show Einstein believed in relationism at this time. However, in his Nobel Prize acceptance speech three years later in 1922, Einstein backtracked on this and took a more substantivalist position by saying time and space could continue to exist without celestial bodies. He claimed that, although relativity theory rules out Maxwell’s aether and Newton’s absolute space, it does not entail relationism, and it does not rule out some other underlying substance being pervasive. All that is required is that, if such an underlying substance exists, then it must obey the principles of the theory of relativity. Soon he was saying this substance is space-time itself—a field whose intrinsic curvature is what we call gravitational force. By adopting this position, he became a non-Newtonian, non-Maxwellian substantivalist. Rejecting classical substantivalism, Einstein said that space-time, “does not claim an existence of its own, but only as a structural quality of the [gravitational] field.”
This pro-substantivalism position has been subsequently strengthened by the 1998 experimental discovery of dark energy. This implies that space repels itself, and the rate at which new space is being created is increasing. It follows that the energy and pressure of empty space is not zero, as it was previously assumed to be. The discovery eventually was interpreted as indicating that space itself has inertia and is increasing in volume at an accelerating rate. Because space-time has these properties and also because it can have ripples (from gravitational waves) and changing curvature (change its shape), and because these properties are usually considered to be properties of substances, the pro-substantivalist position became the most popular position in the 21st century.
Quantum field theory provides another reason to accept substantivalism. This theory is the result of applying quantum mechanics to fields. The assumption of Leibniz and Newton that fundamentally there are particles in space and time buffeted about by forces was rejected due to the rise of quantum field theory in the late twentieth century. It became clear that fields are better candidates than particles for the fundamental entities of the universe. Physicists influenced by logical positivism once worried that perhaps Einstein’s gravitation field, and all other fields, are merely computational devices without independent reality. However, ever since the demise of logical positivism and the development and confirmation of quantum electrodynamics in the late twentieth century, fields have been considered to be real by both physicists and philosophers. What once were called “fundamental particles” still exist, but only as weakly emergent entities from fundamental fields. Because quantum field theory implies that a field does not go away even if the field’s values reach a minimum everywhere, the gravitational field is considered to be substance-like, but it is a substance that changes with the distribution of matter-energy throughout the universe. The philosophers John Earman and John Norton have called this position (of promoting the substance-like character of the gravitational field) manifold substantivalism. In response, the philosopher of physics Tim Maudlin said: “The question is: Why should any serious substantivalist settle on manifold substantivalism? What would recommend that view? Prima facie it seems like a peculiar position to hold” because the manifold has no spatiotemporal structure (Maudlin 1988, p. 87).
Since the late twentieth century, philosophers have continued to create new arguments for and against substantivalism, so the issue is still open. Nevertheless, many other scientists and philosophers have suggested that the rise of quantum field theory has so changed the concepts in the Newton-Leibniz debate that the old issue cannot be settled either way.
For additional discussion of substantivalism and relationism, see (Dainton 2010, chapter 21).
7. Is There a Beginning or End to Time?
This section surveys some of the principal, well-informed speculations about the beginning and end of time. The emphasis should be on “speculations” because there are a great many competing ideas about the beginning and end of time, and none of the ideas are necessary to explain any actual observations. There is no consensus on whether the universe can stop existing, nor whether there was a time when it started existing. There is no consensus about whether time is infinite in the future, and there is none about whether time is infinite in the past, although most cosmologists do believe there was a time before the big bang. For all we know, we may never know the answer to these questions, despite our being better informed on these issues than were our predecessors. Researchers sometimes speak of the creation of a universe from the physicists’ quantum vacuum and call this creation ex nihilo, but a quantum vacuum is not nothing in the sense used by many philosophers, namely complete absence of anything.
a. The Beginning
Many persons have argued that the way to show there must have been a first event is to show that time has a finite past. But this is a mistake. Here is a counterexample. Think of instants as being assigned to the positive real numbers in the interval (0,1] with the present being assigned “1.” The notation “(0,1]” stands for real numbers greater than zero but less than or equal to one. The total interval, that is, the total past duration, is finite, but there is no first instant because there is no smallest positive number.
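The structure of this counterexample can be sketched numerically (an illustration added here, not part of the original argument): for any candidate first instant t in (0,1], the instant t/2 is strictly earlier and still inside the interval, yet the whole interval has finite length 1.

```python
def earlier_instant(t):
    """Given an instant t in the interval (0, 1], return a strictly earlier
    instant that is still inside (0, 1]."""
    assert 0 < t <= 1
    return t / 2

# Start with the present, assigned the instant 1, and keep finding earlier instants.
candidate = 1.0
for _ in range(20):
    previous = candidate
    candidate = earlier_instant(candidate)
    # Each candidate is earlier than the last, yet never reaches 0,
    # so no candidate is a first instant.
    assert 0 < candidate < previous

# The total past duration is nevertheless finite: the length of (0, 1] is 1.
total_duration = 1 - 0
print(total_duration)
```

However far back the halving is carried, a still-earlier instant remains available, which is exactly why a finite past does not by itself guarantee a first event.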
Most theologians agree that there was a beginning to time, but there is no agreement among cosmologists about whether there was. Carroll remarked that “We don’t yet know the final laws of physics, but there’s nothing we know about how such laws work that suggests the universe needs any help to exist” (Carroll 2016, 201). Some metaphysicians would react to this comment by saying that you cannot rely only on science’s laws to answer questions about the universe’s origin. One needs to look beyond science to metaphysical first principles in order to find the answers.
Immanuel Kant argued that contradictions could be deduced from the claim that the universe had a beginning. But he also believed contradictions followed from the claim that it did not have a beginning. So much the worse for pure reason.
Relativity theory and quantum mechanics both allow time to be infinite in the future and in the past. Thus, any restrictions on time’s extent must come from other sources. Almost all cosmologists believe in the hot big bang theory that says the universe was very hot and very small 13.8 billion years ago; this was a highly unlikely environment of very low entropy. That ancient time is the t = 0 of cosmic time used by professional cosmologists and astronomers. The main controversy is whether t = 0 is really the beginning. Your not being able to imagine there not being a time before the big bang does not imply there is such an earlier time, although this style of argument based on what can be imagined might have been acceptable to the ancient Greek philosophers. The cosmologist Stephen Hawking once famously quipped that asking what happened before the big bang is like asking what is north of the North Pole. He later retracted that remark and said it is an open question whether there was time before the big bang.
There is no consensus on whether there was a first instant at the beginning of the big bang. Most believe not. If the universe began at a single first instant of the big bang, to some philosophers of physics this entails that something begins from nothing, that a state of being called “nothing” transformed into a state of something, and this would be a violation of the law of the conservation of energy because there would be a transition from no energy to some energy. What would make more sense instead, they say, is there simply being a first instant with no previous instant, with the total energy of the universe being zero at all times. But that in turn presents two problems: (1) Can we be so sure that the law of the conservation of energy holds? (2) Even if it does, how do we know the total value is zero? The universe clearly has many entities or situations with positive energy, so for the universe to have a total of zero energy, there must be many entities or situations with negative energy. Problem (2) is treated by producing a candidate for this negative energy: gravity, that is, the curvature of space itself.
Even if there were times before the big bang began, there would still be the question as to whether the extent of this prior time is finite or infinite.
The big bounce theory of cosmology says the small, expanding volume of the universe 13.8 billion years ago was the effect of a prior multi-billion-year compression that, when the universe became small enough, stopped its compression and began a rapid expansion that we have been calling the big bang. Perhaps there have been repetitions of compression followed by expansion, and perhaps these cycles have been occurring forever and will continue occurring forever. This is the cyclic universe theory of cosmology. This is a kind of multiverse with the multiple universes existing along one time line.
The Hawking-Hartle No Boundary Proposal suggests that the universe had no time, then suddenly one dimension of space converted to a dimension of time. An even stranger cosmological theory comes from Princeton cosmologist J. Richard Gott. He speculated that time had no first instant but rather began in an unusual process in which the universe produced itself by a process of backward causation. He assumes that backward causation is consistent with general relativity, which is controversial. At the beginning of the universe, he says, there was a closed time-like loop that lasted for 10⁻⁴⁴ seconds during which the universe caused its own existence. If so, then past time is finite and the loop was a beginning of time without there having been a first event. See (Gott 2002) for an informal presentation of the idea, which has not been promoted by many other cosmologists.
b. The End
The phrase “the end of time” is ambiguous in the literature. It can mean the end of time itself, or it can simply mean the end of all known physical structure or all physical objects. If it is meant in this latter sense, then the cosmologists’ most favored scenario for the end is the heat death scenario. Approaching the heat death, the last star will burn out in 10¹⁵ years. Eventually everything that can decay will decay, and all the remaining objects will get ever farther from each other, with no end to this dilution and cooling, while all the ripples of space-time become ever weaker. This scenario of everything in the entire universe approaching absolute zero an infinite time from now is called the heat death of the universe, the big chill, and also the big freeze. This scenario presumes that the dark energy is nearly a cosmological constant and does not vary radically. With a radical increase, we get the Big Rip scenario in which all structure as we know it is ripped apart.
Approaching nearly featureless space-time is the most likely future for our universe, but it is likely not to be the end of absolutely all structure forever because the laws of quantum mechanics imply there is always a small probability of a fluctuation of some structure from no-structure. Fluctuations usually last an extremely short amount of time, but they do not have to.
Here is a summary of some serious, competing suggestions by twenty-first-century cosmologists about our universe’s future. The list begins with the most popular one:
Heat Death—Big Chill (Expansion of space and decay of particles at an ever-increasing rate as the universe’s temperature asymptotically approaches an equilibrium temperature.) An infinite future.
Big Crunch (The expansion of space eventually stops somehow, perhaps because dark energy starts decreasing and gravitational attraction becomes dominant; and then the universe begins contracting to a final compressed state as if the big bang is running in reverse.) A finite future.
Big Bounce. (Eternal pattern of cycles of expansion, then compression, then expansion, then compression, and so forth. There are repeated returns to a microscopic volume with each being followed by a new big bang). An infinite future.
Cycles without Crunches (While the universe expands forever and never contracts, the observable part of the universe can oscillate between expansions and contractions with a big bounce separating a contraction from the next expansion. Each cycle is different from the previous one.) An infinite future.
Big Rip (Dark energy runs wild. The expansion rate of dark energy increases exponentially toward infinity. As this happens, every complex system that interacts gravitationally is eventually pushed apart—first galaxy clusters, then galaxies, later the planets, then all the molecules, and even the fabric of space itself.) A finite future for all known structure, but still an infinite future.
Big Snap (The fabric of space suddenly reveals a lethal granular nature when stretched too much, and it “snaps” like an overly stretched rubber band breaking.) A finite future for all known structure, but still an infinite future.
Death Bubble (Due to some high energy event, such as the creation of a tiny black hole with a mass of only a few ounces, smaller than any created before, our vacuum (which is really a metastable Higgs field) takes a quantum jump to a new value, a more stable true vacuum value. This is analogous to supercooled distilled water being disturbed and rapidly turning to ice. The energy of the vacuum decay that this collapse creates appears as a 3D bubble with no inside expanding at almost the speed of light, changing the masses of all fundamental particles in its path and destroying all currently known structures. Expected to occur 10¹⁰² to 10³⁵⁹ years from now.) An infinite future.
Mirror Universe. (Before the big bang, time runs in reverse. Both the big bang’s before-region and its after-region evolve from a tiny situation at cosmic time t = 0 in which the apexes of their two light cones meet. The two regions are almost mirror images of each other. On some versions, the light cones meet at a point; on other versions, they might meet in an infinitesimal region.) There are versions with a finite future and versions with an infinite future.
These theories have been described in detail with mathematical physics, but they are merely hypotheses in the sense that none are tied to any decisive experimental results, at least so far. The Big Crunch was the most popular theory among cosmologists until the 1960s. All the theories of the end of the universe on the above list presuppose that we are not Boltzmann brains and that we can trust that our supposed knowledge of science is in fact knowledge.
See (Mack 2020) and (Hossenfelder 2022, chapter two) for a presentation by two cosmologists of many of the competing theories about the beginning and the end of time in the various senses of these phrases.
c. Historical Answers
There has been much speculation over the centuries about the extent of the past and the future, although almost all remarks have contained serious ambiguities. For example, regarding the end of time, is this meant in the sense of (a) the end of humanity, or (b) the end of life, or (c) the end of the universe that was created by God, but not counting God, or (d) the end of all natural and supernatural change? Intimately related to these questions are two others: (1) Is it being assumed that time exists without change, and (2) what is meant by the term change? With these cautions in mind, below there is a brief summary of conjectures throughout the centuries about whether time has a beginning or an end.
Regarding the beginning of time, the Roman atomist Lucretius in about 50 B.C.E. said in his poem De Rerum Natura:
For surely the atoms did not hold council, assigning order to each, flexing their keen minds with questions of place and motion and who goes where.
But shuffled and jumbled in many ways, in the course of endless time they are buffeted, driven along chancing upon all motions, combinations.
At last they fall into such an arrangement as would create this universe.
The implication is that time has always existed, but that an organized universe began a finite time ago with a random fluctuation.
Plato and Aristotle, both of whom were opponents of the atomists, agreed with them that the past is infinite or eternal. Aristotle offered two reasons. Time had no beginning because, for any time, we always can imagine an earlier time. In addition, time had no beginning because everything in the world has a prior, efficient cause. In the fifth century, Augustine disagreed with Aristotle and said time itself came into existence by an act of God a finite time ago, but God, himself, does not exist in time. This is a cryptic answer because it is not based on a well-justified and detailed theory of who God is, how He caused the big bang, and how He can exist but not be in time. It is also difficult to understand St. Augustine’s remark that “time itself was made by God.” On the other hand, for a person of faith, belief in their God is usually stronger than belief in any scientific hypothesis, or in any desire for scientific justification of their remark about God, or in the importance of satisfying any philosopher’s demand for clarification.
Agreeing with Augustine against Aristotle, Martin Luther said the past is not infinite, and he estimated the world began in 4,000 B.C.E. In the early seventeenth century, the Calvinist James Ussher calculated from the Bible that the world began in 4,004 B.C.E. on October 23.
In about 1700, Isaac Newton claimed future time is infinite and that, although God created the material world some finite time ago, there was an infinite period of past time before that, as Lucretius and Aristotle had also claimed.
Nearly all twentieth and twenty-first century astronomers said the universe is at least as old as the big bang which began about 13.8 billion years ago. There is still no consensus on whether the past and future are infinite.
For more discussion of this issue of the extent of time, see the companion section Infinite Time.
8. Emergence of Time
Is physical time emergent? This question asks whether time, and thus space-time, is basic, or whether time is instead a higher-level entity dependent upon the collective behavior of lower-level, more basic, timeless features. Experts are not sure of the answer, although in earlier centuries it was assumed that time is basic.
In relativity theory and quantum mechanics, time is a designated dimension of space-time, so the issue before us is whether space-time is basic or emergent. If space-time is emergent, the most favored candidate for what it emerges from is the quantum wave function, and in particular from quantum entanglement.
The word emerge has been used in different ways in the philosophical literature. When we say an oak tree emerges from an acorn, this is a time-dependent sense of emergence that is not being considered here. Some persons define emergence as a whole emerging from, but being greater than, the sum of its parts. There are better, less vague definitions. The word “emerge” in this article is intended to indicate the time-independent or change-independent appearance of an objective feature of nature in the philosopher Mark Bedau’s sense of “weak emergence.” If the emergent entity were instead to obey laws that are not entailed by the lower and more fundamental laws, then the emergence is called “strong emergence.”
Consider the difference between weak and strong emergence. Weak emergence is about new features supervening upon more basic features but not existing at that more basic level. A supervenes on B if changes in A require there to be changes in B. Temperature supervenes on molecular motion because the temperature of an object cannot change without there being changes in the object’s molecular motions. Even though the low-scale laws entail the high-scale behavior and high-scale laws, it is rare that a higher-level concept is in practice explicitly defined in terms of a lower-level concept, even if it can be in principle. Emergence is about explainability, not definability.
Strong emergence, on the other hand, denies the supervenience and emphasizes the independence of the emergent concept from a lower level. Many philosophers have claimed that consciousness strongly emerges from the human body and that there can be a change in consciousness without any change in the configuration of the molecules in the body. The British Emergentists of the nineteenth century believed this.
Regarding emergence being about explainability, not definability, the philosopher Daniel Dennett helpfully recommended treating an emergent entity as a “real pattern” that has an explanatory and predictive role in the theory positing the entity, but it is a pattern at a higher or less-detailed level. Information is lost as one moves to higher levels, but the move to a higher level can reveal real patterns and promote understanding of nature that would never be noticed by focusing only on the fundamental level. Typically, the information at the higher level that is not thrown away involves what philosophers of science have called “observables” and “macro variables.”
Sean Carroll tried to clarify our sense of emergence as follows:
To say that something is emergent is to say that it’s part of an approximate description of reality that is valid at a certain (usually macroscopic) level, and is to be contrasted with “fundamental” things, which are part of an exact description at the microscopic level….Fundamental versus emergent is one distinction, and real versus not-real is a completely separate one (Carroll 2019, p. 235).
Believing space-time and thus time will be considered to be coarse-grained or weakly emergent in any future, successful theory of quantum gravity, he said, “Time is just an approximation….” Carlo Rovelli agreed:
Space-time is…an approximation. In the elementary grammar of the world, there is neither space nor time—only processes that transform physical quantities from one to another…. At the most fundamental level that we currently know of,…there is little that resembles time as we experience it. There is no special variable “time,” there is no difference between past and future, there is no space-time (Rovelli 2018, p. 195).
Turning from the issue of fundamental vs. emergent to the issue of real vs. not-real, eliminativism says emergent entities are unreal. So, if time is emergent, it is not real. Similarly, if pain is emergent, it is not real—and therefore no person has ever really felt a pain. The theory is also called strong emergentism. The more popular ontological position is weak emergence or anti-eliminativism. It implies that emergent entities are real patterns of fundamental entities despite being emergent and despite our not knowing how to reduce the higher level concept to the lower level one, even though Laplace’s Demon knows how to perform the reduction. The majority position is that emergence is an objective relationship between two theories, the micro one and the macro one, and it holds regardless of human inability to understand the relationship. The term “reduction” in philosophy has several other senses. In addition to theory reduction, there is straightforward ontological reduction or object reduction as when we say a wooden table reduces to its hydrocarbon molecules. There is also field reduction, as when we say chemistry reduces to physics.
An important philosophical issue is to decide which level is the fundamental one, the basic one. Philosophers often point out that being fundamental is relative to the speaker’s purpose. Biologists and physicists have different purposes. To a biologist, the hunger causing you to visit the supermarket emerges from the fundamental level of cellular activity. But to a physicist, the level of cellular activity is not fundamental but rather emerges from the more fundamental level of elementary particle activity which in turn emerges from the even more fundamental level of fluctuations in elementary quantum fields.
In another sense of emergence, the one in which we say a large oak tree emerged later from a small acorn, some physicists speculate that time emerged from space in the following manner. Very, very early in the big bang period there were an infinite number of dimensions of space and none of time. As the universe expanded and cooled, these eventually collapsed into four dimensions of space and still none of time. Then this collapsed so that one of the space dimensions disappeared as the time dimension emerged, leaving our current four-dimensional space-time. (This description—especially its use of the word “then”—seems to imply that there was time before time began, but that is a problem with the English language and not with what is intended by the description.)
Some philosophers such as Lawrence Sklar (in 1983) and some physicists such as David Gross (in 2004) have claimed that time is fundamental and not ontologically emergent. In 2004, after winning the Nobel Prize in physics, Gross expressed that viewpoint. In speaking about string theory, which is his favored theory for somehow reconciling the inconsistency between quantum mechanics and the general theory of relativity, he said:
Everyone in string theory is convinced…that space-time is doomed. But we don’t know what it’s replaced by. We have an enormous amount of evidence that space is doomed. We even have examples, mathematically well-defined examples, where space is an emergent concept…. But in my opinion the tough problem that has not yet been faced up to at all is, “How do we imagine a dynamical theory of physics in which time is emergent?” …All the examples we have do not have an emergent time. They have emergent space but not time. It is very hard for me to imagine a formulation of physics without time as a primary concept because physics is typically thought of as predicting the future given the past. We have unitary time evolution. How could we have a theory of physics where we start with something in which time is never mentioned?
By doomed, Gross means not-fundamental and ontologically emergent.
The physicist Carlo Rovelli has a suggestion for what the fundamental level is from which time emerges. It is a configuration of loops. He conjectured: “At the fundamental level, the world is a collection of events not ordered in time” (Rovelli 2018a, p. 155). Rovelli is re-imagining the relationship between time and change. For him, space-time emerges from a configuration of loops, analogous to the way a vest of chainmail emerges from a properly connected set of tiny circular chain links.
Nevertheless, he would say that above the Planck level, time does exist, although there is no sharp cutoff point between when time does and does not exist.
The English physicist Julian Barbour is an eliminativist and strong emergentist about time. He said the “universe is static. Nothing happens; there is being but no becoming. The flow of time and motion are illusions” (Barbour 2009, p. 1). He argued that, although there does exist objectively an infinity of instantaneous moments, nevertheless there is no objective happens-before ordering of them, no objective time order. There is just a vast, jumbled heap of moments. Each moment is an instantaneous configuration (relative to one reference frame) of all the objects in space. Like a photograph, a moment or configuration contains information about change, but it, itself, does not change. If the universe is as Barbour describes, then space (the relative spatial relationships within a configuration) is ontologically fundamental and a continuum, but time is neither. He says time is unreal or, at best, emerges as some general measure of the differences among the existing spatial configurations. For more on Barbour’s position, see (Smolin 2013, pp. 84-88).
Sean Carroll has a different idea about time. He is not an eliminativist, but is a weak emergentist who claims in (Carroll 2019) that time and everything else in the universe emerges from the universe’s wave function in a “gravitized quantum theory.” The only fundamental entity in the universe is the wave function. Everything else that is real emerges from the wave function that obeys Schrödinger’s equation. This gives a physical interpretation of the wave function. Carroll says neither time, space, nor even space-time is fundamental. These features emerge from the quantum wave function. So, space-time is merely an approximation to reality and does not exist at the most fundamental level.
Carroll points to a result by Juan Maldacena regarding two different versions of the very same theory of cosmology, but the two versions differ on the number of their space-time dimensions (Carroll 2010, 282). This suggests to Carroll that our own four-dimensional space-time probably emerges and is not fundamental.
Another proposal is that whether time is emergent may not have a unique answer. Perhaps time is relative to a characterization of nature. That is, perhaps there are alternative, but empirically adequate theoretical characterizations of nature, yet time is fundamental in one characterization but emergent in another. This idea is influenced by Quine’s ontological relativity.
For more description of the different, detailed speculations on whether time is among the fundamental constituents of reality, see (Merali 2013) and (Rovelli 2018b).
9. Conventions about Time
Which features of time are conventional and which are not? A convention is a widely agreed upon assumption, and it is not a hypothesis. The clearest way to specify the conventional elements in a theory would be by axiomatizing it, but there is no precise theory of time to be axiomatized.
The issue about convention is conventional vs. factual, not conventional vs. foolish nor conventional vs. impractical. Although the term convention is somewhat vague, conventions as used here are up to our civilization to freely adopt; they are not objective features of the external world that we are forced to accept if we seek the truth. Conventions are inventions, as opposed to being natural or mandatory or factual. It is a convention that the English word green means green, but it is not a convention that the color of most healthy leaves is green.
Conventions need not be arbitrary; they can be useful or have other pragmatic virtues. Nevertheless, if a feature is conventional, then there must in some sense be reasonable alternative conventions that could have been adopted. Also, conventions can be explicit or implicit. For one last caution, conventions can become recognized as having been facts all along. The assumption that matter is composed of atoms was a useful convention in late nineteenth century physics; but, after Einstein’s explanation of Brownian motion in terms of atoms, the convention was generally recognized by physicists as having been a fact all along.
When Westerners talk about past centuries, they agree to use both A.D. and B.C.E. A clock measuring B.C.E. periods would count toward lower numbers. The clock on today’s wall always counts up, but that is merely because it is agreed we intend to use it only in the A.D. era, so there is no need for the clock to count in B.C.E. time. The choice of the origin of the time coordinate is an uncontroversial convention, too. The choice might have been an event in Muhammad’s life or a Jesus event or a Temple event or the big bang event.
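One way to see that the era convention is merely a bookkeeping choice is astronomical year numbering (the convention used by astronomers and by ISO 8601), which replaces the two era labels with a single signed coordinate that always counts up: 1 B.C.E. becomes year 0, and n B.C.E. becomes year 1 − n. A minimal sketch, assuming the illustrative era labels “CE” and “BCE”:

```python
def to_astronomical(year, era):
    """Convert an era-labeled year to astronomical year numbering,
    a single signed coordinate that counts up through both eras:
    1 C.E. -> 1, 1 B.C.E. -> 0, 2 B.C.E. -> -1, and so on."""
    if era == "CE":
        return year        # C.E. years keep their number
    if era == "BCE":
        return 1 - year    # n B.C.E. becomes 1 - n (the era system has no year labeled 0)
    raise ValueError("era must be 'CE' or 'BCE'")

print(to_astronomical(2024, "CE"))  # 2024
print(to_astronomical(1, "BCE"))    # 0
print(to_astronomical(44, "BCE"))   # -43
```

On this signed coordinate, a clock never has to count down; the down-counting of B.C.E. dates is an artifact of the two-era labeling, not of time itself.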
The duration of the second is universally recognized to be a convention. Our society could have chosen it to be longer or shorter. It is a convention that there are sixty seconds in a minute rather than sixty-six, and that no week fails to contain a Tuesday. It is a convention that we choose time coordinates so that time goes forward as the coordinate numbers get larger rather than smaller.
Some say it is a convention as to which event here now is simultaneous with which events there then. Others disagree saying there is a fact of the matter here. The controversy is discussed later in this article and also at The Relativity of Simultaneity.
In a single reference frame, if event 1 happens before event 2, and event 2 happens before event 3, must event 1 also happen before event 3 as a matter of fact or as a matter of convention? Here is an argument for why it is not a convention. The transitivity of the happens-before relation in any single reference frame is a general feature of time: it is implied by relativity theory; it is helpful to believe; no one has ever seen evidence that this transitivity is violated; and there are no reputable theories implying that there should be such contrary evidence.
Time in physics is measured with real numbers (decimal numbers) rather than imaginary numbers (such as the square root of negative one). Does this reveal a deep feature of time? No, it is simply an uncontroversial convention.
How do we know the speed of light is the same in all directions? Is this a fact, or is it a convention? This is a controversial issue in the philosophy of physics. Einstein claimed it was a convention and untestable, but the philosophers B. Ellis and P. Bowman in 1967, and D. Malament in 1977, gave different reasons why Einstein is mistaken. For an introduction to this dispute, see The Conventionality of Simultaneity.
It is a useful convention that, in order to keep future midnights from occurring during the daylight, clocks are re-set by one hour as one moves across a time-zone on the Earth’s surface—and that is also why leap days and leap seconds are used. The minor adjustments with leap seconds are required because the Earth’s rotations are not exactly regular—mostly due to friction from ocean tides. Back in the time of dinosaurs, the rotation took only 23.5 hours. And the mass of the Earth increases continually as space dust lands. So, without conventions about re-setting clocks, one of these days the sun would be shining overhead at midnight.
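The zone convention can be illustrated with Python’s standard library: moving to a different zone changes the wall-clock label attached to an instant but not the instant itself (the five-hour offset below is an arbitrary illustrative choice, not a real named zone):

```python
from datetime import datetime, timezone, timedelta

# One and the same instant, expressed under two zone conventions.
utc_instant = datetime(2024, 6, 1, 23, 30, tzinfo=timezone.utc)

# A zone five hours west of Greenwich (a fixed illustrative offset).
west5 = timezone(timedelta(hours=-5))
local = utc_instant.astimezone(west5)

print(utc_instant.isoformat())  # 2024-06-01T23:30:00+00:00
print(local.isoformat())        # 2024-06-01T18:30:00-05:00

# The labels differ, but the underlying instant does not:
assert utc_instant == local
```

Re-labeling by zone is thus a pure convention: the comparison still reports equality because only the bookkeeping, not the event, has changed.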
Consider the ordinary way a clock is used to measure how long a nearby event lasts. We adopt the following metric, or method: Take the time at which the event ends, say 5:00, and subtract the time at which it starts, say the previous 3:00. The metric procedure says to take the absolute value of the difference between the two numbers; this method yields the answer of two hours. Is the use of this method merely a convention, or in some objective sense is it the only way that a clock could and should be used? That is, is there an objective metric, or is time metrically amorphous? Philosophers of physics do not agree on this. Adolf Grünbaum has argued that the method is conventional. Perhaps the duration between instants x and y could be:
|log(y/x)|
instead of the ordinary:
|y – x|.
A virtue of both metrics is that duration cannot be negative. The trouble with the log metric is that, unlike the standard metric, it is not translation invariant: it assigns different durations to intervals of equal coordinate length depending on where they occur. For example, it assigns the duration log 2 to the interval from t = 1 to t = 2, but only the smaller duration log(3/2) to the equally long interval from t = 2 to t = 3, and it is undefined for intervals containing the origin of the time coordinate. The philosophical issue is whether the metric must be translation invariant for any reason other than convenience.
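The difference between the two candidate metrics can be checked numerically. In this sketch (function names are illustrative), the standard metric assigns the same duration to any two intervals of equal coordinate length, while the log metric does not:

```python
import math

def standard_duration(x, y):
    """Ordinary metric: absolute difference of clock readings."""
    return abs(y - x)

def log_duration(x, y):
    """Alternative metric of the kind discussed by conventionalists
    such as Gruenbaum (requires positive clock readings)."""
    return abs(math.log(y / x))

# Two intervals of equal coordinate length, located at different times.
a = standard_duration(1, 2)  # 1
b = standard_duration(2, 3)  # 1      -> same duration wherever the interval lies
c = log_duration(1, 2)       # log 2   (about 0.693)
d = log_duration(2, 3)       # log 1.5 (about 0.405) -> depends on where the interval lies

print(a, b, c, d)
assert a == b   # the standard metric treats congruent intervals alike
assert c != d   # the log metric does not
```

Whether nature forces the standard choice on us, or whether we merely prefer it for its convenience, is exactly the point in dispute.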
It is an interesting fact and not a convention that our universe is even capable of having a standard clock. It is a fact and not a convention that our universe contains a wide variety of phenomena that are sufficiently regular in their ticking to serve as clocks. They are sufficiently regular because they tick in adequate synchrony with the standard clock. The word adequate here means successful for the purposes we have for using a clock.
Physicists regularly assume they may use the concept of a point of continuous time. They might say some event happened at the point of time that is the square root of three seconds after that singular, point event. Physicists usually uncritically accept a point of time as being real-valued, but philosophers of physics disagree with each other about whether this is merely a useful convention. Whitehead argued that it is not a convention but rather a false hypothesis because there are no points of time, only durations of time.
Our society’s standard clock tells everyone what time it really is. Can our standard clock be inaccurate? “Yes,” say the objectivists about the standard clock. “No,” say the conventionalists who claim the standard clock is accurate only by convention; if it acts strangely, then all other clocks must act equally strangely in order to stay in synchrony with the standard clock. For an example of strangeness, suppose our standard clock used the periodic rotations of the Earth relative to the background stars. In that case, if a comet struck Earth and affected the rotational speed of the Earth (as judged by, say, a pendulum clock), then we would be forced to say the rotation speed of the Earth did not really change but rather the other periodic clock-like phenomena such as swinging pendulums and quartz crystal oscillations all changed in unison because of the comet strike. The comet “broke” those clocks. That would be a strange conclusion to draw, and in fact, for just this reason, 21st century physicists have rejected any standard clock that is based on Earth rotations and have chosen a newer standard clock that is based on atomic phenomena. Atomic phenomena are unaffected by comet strikes.
A good choice of a standard clock makes the application of physics much simpler. A closely related philosophical question about the choice of the standard clock is whether, when we change our standard clock, we are merely adopting constitutive conventions for our convenience, or in some objective sense we can be making a choice that is closer to being correct. For more on this point, see this article’s Frequently Asked Questions.
The special theory of relativity is believed by most physicists to imply that the notion of now or the present is conventional because it depends on which person’s present is being referred to. Many philosophers, but not a majority, disagree and believe in an objective present. Here is a two-dimensional Minkowski diagram of space and time displaying the gray area where a range of possible conventions is allowed according to relativity theory:
The light cone of your future is the region above the gray area; the past light cone is the region below the gray area. The diagonal straight lines are the worldlines of light rays reaching and leaving you here now. The gray areas of this block universe represent all the events (in sense 1 of the term “event”) that could be classified either way, as your future events or as your past events; and this classification depends upon someone’s choice of what line within the gray area will be the line of your present. Events within the gray areas represent all the events that could neither cause, nor be caused by, your being here now. The more technical ways of saying this are that the gray area is all events that are space-like separated from your here and now, or that are in your here-and-now’s absolute elsewhere, or that constitute your extended present. Two events are time-like separated from each other if they could possibly have affected each other. If a pair of events is time-like separated, then they cannot also be space-like separated. Light cones are not frame-relative; they are absolute and objective. Also, this structure of space-time holds not just for you; every single point-event has its own unique pair of light cones.
The gray region of space-like events is called the “extended present” because, if you were defining an x-axis of this diagram in order to represent your present events, then you would have great latitude of choice. You could place the line that is to be the frame’s spatial axis anywhere in the gray area; but, in order to avoid ambiguity, once it is chosen it stays there for all uses of the coordinate system; it cannot change its angle. For example, suppose the two point-events represented as a and b in the above diagram both occur in the Andromeda Galaxy. That galaxy is 2,000,000 light-years away from you, assuming you are now on Earth. Even if event b were to occur a million years after a, you (or whoever is in charge of setting up the axes of the coordinate system you are using) are free to choose either event as happening now in that galaxy, and you also are free to choose any intermediate event there. But you are not free to choose an event in a white area because that would violate relativity theory’s requirements about causality. One implication of this argument is that relativity theory implies there is no fact of the matter as to what is happening at present in the Andromeda Galaxy. It is frame-relative.
To summarize, the time-order of space-like events is conventional by depending on our choice of reference frame. The time-order of time-like events is not conventional and stays the same with any new choice of reference frame. For more on these issues, see the discussion of the relativity of simultaneity. Additional conventional and non-conventional features of time are discussed in the supplement What Else Science Requires of Time (That Philosophers Should Know).
10. Arguments That Time Is Not Real
We can see a clock, but we cannot see time, so how do we know whether time is real—that it exists? Someone might think that time is real because it is what clocks are designed to measure, and because there certainly are clocks. The trouble with this reasoning is that it is just as bad as saying that unicorns are real because unicorn hunters intend to find unicorns, and because there certainly are unicorn hunters.
A principal argument that time is real is—as the metaphysician David Lewis would say—because the hypothesis is so serviceable. The concept provides theoretical unity and economy across multiple theories, especially our fundamental theories, and its acceptance does not violate Occam’s Razor nor seem to have hidden, unacceptable implications. For a similar reason, poems, dead presidents, and extinct languages are real, but ghosts are not.
But if, as most physicists say, to be real is to be frame-independent, then time is not real. This insight into the nature of time was first promoted by Hermann Minkowski soon after his student Albert Einstein created the special theory of relativity. Similarly, because energy, spatial distance, and mass are also different in different reference frames, they, too, are not real. The requirement that to be real is to be frame-independent is not a logical truth, nor a result of observation. It is a plausible metaphysical assumption that so far has the support of almost every physicist and, to a lesser extent, the philosophers of physics, and to an even lesser extent, philosophers in general. Physicists presume the reality of time, energy, distance, and mass because they implicitly assume that there is prior agreement on which reference frame is accepted. This assumption is usually made in all the discussions below.
Let’s consider some arguments against the reality of time that have appeared in the philosophical literature. The logical positivist Rudolf Carnap said, “The external questions of the reality of physical space and physical time are pseudo-questions” (“Empiricism, Semantics, and Ontology,” 1950). He meant these two questions are meaningless because there is no way to empirically verify their answers one way or the other. Subsequent philosophers have generally disagreed with Carnap and have taken these metaphysical questions seriously.
Here are other reasons for the unreality of time. Time is unreal because (i) it is emergent, or (ii) it is subjective, or (iii) it is merely conventional (such as being only a mathematical construct that does not correspond to something that exists in the real world), or (iv) it is defined inconsistently, or (v) its scientific image deviates too much from its commonsense image. The five are explored below, in order. For more discussion of the issue of whether time exists, see (Baron et al. 2022).
i. Because Time is Emergent
Time does not emerge from space-time, but suppose it does emerge from the quantum gravitational field, or something else. Does this imply time is not real? Most scientists and philosophers of time will answer “no” for the following reasons. Scientists were once surprised to learn that heat emerges from the motion of molecules. Yet a single molecule itself has no heat. Would it not have been a mistake to conclude from this that heat is unreal and nothing is warm? And when it became clear that a baseball is basically a collection of molecules, and so baseballs can be said to emerge from arrangements of molecules, would it not have been a mistake to say this implies baseballs no longer exist—that they are not real? It would be a mistake because baseballs and heat are real patterns of fundamental objects and events. Also, the concept of time has proven itself to be extremely useful from the ultra-microscopic scales of quarks to the large scale of the entire cosmos, so most experts argue that time is real at least at all those scales. There is some serious and popular speculation in the physics community that as one investigates nature at smaller and smaller scales below the Planck scale, the concept of time becomes less applicable to reality, but few physicists or philosophers draw the conclusion from this that time is not real at any scale. The compatibility of time’s not existing somewhere below, say, the Planck scale with its existing above that scale is somewhat analogous to free will’s not existing at the scale of an individual human cell while existing at the macroscopic scale of human activity.
ii. Because Time is Subjective
Psychological time is clearly subjective, but the focus now is on physical time. Any organism’s sense of time is subjective, but is the time that is sensed also subjective? Well, first what does subjective mean? This is a notoriously controversial term in philosophy. Here it means that a phenomenon is subjective if it is a mind-dependent phenomenon, something that depends upon being represented by a mind. A secondary quality such as being red is a subjective quality, but being capable of reflecting the light of a certain wavelength is not subjective. The same point can be made by asking whether time comes just from us or instead is wholly out there in the external world independent of us. Throughout history, philosophers of time have disagreed on the answer. Without minds, nothing in the world would be surprising or beautiful or interesting. Can we add that nothing would be in time? If so, time is not objective, and so is not objectively real.
Aristotle envisioned time to be a counting of motions (Physics, IV.ch11.219b2), but he also asked the question of whether the existence of time requires the existence of mind. He does not answer his own question because he says it depends on whether time is the conscious numbering of movement or instead is just the capability of movements to be numbered were consciousness to exist.
St. Augustine clearly adopted a subjectivist position regarding time, and said time is nothing in reality but exists only in the mind’s apprehension of that reality.
Several variants of idealism have implied that time is not real. Kant’s idealism implies objective time, the time of things-in-themselves, if there even are such things, is unknowable, and so is in that sense unreal. The post-Kantian German idealists (Fichte, Schelling, Hegel) argued that the problem isn’t that time is unknowable but that all reality is based wholly upon minds, so objective time is unreal. It cannot be a feature of, or part of, reality.
Here are some comments against the above arguments and for the reality of objective time. First, notice that a clock can tick in synchrony with other clocks even when no one is paying attention to the clocks. Second, notice how useful the concept of time is in making such good sense of our evidence involving change, persistence, and succession of events. Consider succession. This is the order of events in time. If judgments of time order were subjective in the way judgments of being interesting vs. not-interesting are subjective, then it would be too miraculous that everyone can so easily agree on the temporal ordering of so many pairs of events: birth before death, the acorn sprouts before the oak tree appears, houses are built before they are painted. W. V. O. Quine might add that the character of the objective world with all its patterns is a theoretical entity in a grand inference to the best explanation of the data of our experiences, and the result of this inference tells us that the world is an entity containing an objective time, a time that gets detected by us mentally as psychological time and gets detected by our clocks as physical time.
iii. Because Time is Merely Conventional or Only a Mathematical Construct
One might argue that time is not real because the concept of time is just a mathematical artifact in our fundamental theories of mathematical physics, one that merely plays an auxiliary mathematical role. Similarly, coordinate systems are mathematical constructs, and the infinite curvature of space at the center of a black hole is generally considered to be merely an artifact of the mathematics used by the general theory of relativity but not to exist in reality.
Or one might argue as follows. Philosophers generally agree that humans invented the concept of time, but some philosophers argue that time itself is invented. It was created as a useful convention, like when we decided to use certain coin-shaped metal objects as money. Money is culturally real but not objectively real because it would disappear if human culture were to disappear, even if the coin-shaped objects were not to disappear. Money and oxygen both exist, but money’s existence depends upon social relations and conventions that oxygen’s existence does not depend upon. Is time’s existence more like money than oxygen in that regard?
Although it would be inconvenient to do so, our society could eliminate money and return to barter transactions. Analogously, Callender asks us to consider the question, “Who Needs Time Anyway?”
Time is a way to describe the pace of motion or change, such as the speed of a light wave, how fast a heart beats, or how frequently a planet spins…but these processes could be related directly to one another without making reference to time. Earth: 108,000 beats per rotation. Light: 240,000 kilometers per beat. Thus, some physicists argue that time is a common currency, making the world easier to describe but having no independent existence (Callender 2010, p. 63).
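Callender’s point can be checked arithmetically: the two rates he quotes can be combined directly, with no time variable appearing, and they are mutually consistent with the familiar clock-based figures. The sketch below is illustrative only; the day length and light speed used for the consistency check are standard values not given in the quote.

```python
# A sketch of Callender's "common currency" point: rates of processes can be
# related directly to one another, with no independent time variable.

beats_per_rotation = 108_000   # heartbeats per Earth rotation (from the quote)
km_per_beat = 240_000          # km light travels per heartbeat (from the quote)

# Relate light directly to Earth's spin, with no mention of seconds:
km_per_rotation = km_per_beat * beats_per_rotation
print(km_per_rotation)  # 25,920,000,000 km of light travel per rotation

# Consistency check against clock-based values (assumed, not in the quote):
seconds_per_rotation = 86_400   # one day, in seconds
speed_of_light_km_s = 300_000   # approximate
assert km_per_beat * beats_per_rotation == speed_of_light_km_s * seconds_per_rotation
```

The assertion passes because both sides describe the same relationship; the “seconds” cancel out, which is just Callender’s claim that time could be treated as a bookkeeping convenience.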
In 1905, the French physicist Henri Poincaré argued that time is not a feature of reality to be discovered, but rather is something we have invented for our convenience. He said possible empirical tests cannot determine very much about time, so he recommended the convention of adopting whatever concept of time that makes for the simplest laws of physics. Nevertheless, he said, time is otherwise wholly conventional, not objective.
There are two primary reasons to believe time is not merely conventional: First, there are so many one-way processes in nature. For example, mixing cold milk into hot, black coffee produces lukewarm, brown coffee, but agitations of lukewarm, brown coffee have never turned it back into hot black coffee with cool milk. The process goes only one way in time.
Second, our universe has so many periodic processes whose periods are constant multiples of each other over time. That is, their periods keep the same constant ratio to each other. For example, the frequency of rotation of the Earth around its axis relative to the “fixed” stars is a constant multiple of the frequency of swings of a fixed-length pendulum, which in turn is a constant multiple of the half-life of a specific radioactive uranium isotope, which in turn is a constant multiple of the frequency of a vibrating quartz crystal, which in turn is a constant multiple of the frequency of a light beam emitted from a specific kind of atomic process used in an atomic clock, which in turn is a constant multiple of the period of a pulsar star’s flash. The relationships do not change as time goes by—at least not much and not for a long time, and when there is deviation we know how to predict it and compensate for it. The existence of these sorts of constant time relationships—which cannot be changed by convention—makes our system of physical laws much simpler than it otherwise would be, and it makes us more confident that there is some convention-free, natural kind of entity that we are referring to with the time-variable in those physical laws—despite the fact that time is very abstract and intangible and not something we can see, taste, or touch.
iv. Because Time is Defined Inconsistently
Bothered by the contradictions they claimed to find in our concept of time, Parmenides, Zeno, Spinoza, Hegel, and McTaggart said time is not real.
Plato’s classical interpretation of Zeno’s paradoxes is that they demonstrate the unreality of any motion or any other change. Assuming the existence of time requires the existence of change, then Zeno’s paradoxes also overturn Greek common sense that time exists.
The early 20th-century English philosopher J.M.E. McTaggart believed he had a convincing argument for why a single event can acquire the properties of being a future event, a present event, and also a past event, and that since these are contrary properties, our concept of time is inconsistent, and the inconsistency cannot be removed. It follows for McTaggart that time is not real. This argument has received a great deal of attention in the philosophy literature but hardly any in the physics literature.
The early 20th-century absolute-idealist philosopher F.H. Bradley claimed, “Time, like space, has most evidently proved not to be real, but a contradictory appearance…. The problem of change defies solution.”
Regarding the inconsistencies in our concept of time that Zeno, McTaggart, Bradley, and others claim to have revealed, most philosophers of time say that there is no inconsistency, and that the complaints can be handled by clarification or by revising the relevant concepts. For example, Zeno’s paradoxes were solved by requiring time to be a linear continuum like a segment of the real number line. This solution was very fruitful and not ad hoc. It would be unfair to call it a change of subject.
v. Because Scientific Time is Too Unlike Ordinary Time
If you believe that for time to exist it needs to have certain features of the commonsense image of time, but you believe that science implies time does not have those features, you might be tempted to conclude that science has really discovered that time does not exist. In the mid 20th century the logician Kurt Gödel argued for the unreality of time as described by contemporary physical science because the equations of the general theory of relativity allow for physically possible universes in which all events precede themselves. People can, “travel into any region of the past, present, and future and back again” (Gödel, 1959, pp. 560-1). It should not even be possible for time to be circular or symmetric like this, Gödel believed, so, he concluded that, if we suppose time is the time described by relativity theory, then time is not real.
Regarding the claim that the science of time does not treat our commonsense understanding of time fairly, there is no consensus about which particular features of commonsense time cannot be rejected, although not all of them can be rejected or else we would be changing the subject and not talking about time. But science has not required us to reject our belief that some events happen in time before other events, nor has science required us to reject our belief that some events last for a while. Gödel’s complaint about relativity theory’s allowing for circular time has been treated by the majority of physicists and philosophers of time by saying he should accept that time might possibly be circular even though as a contingent matter it is not circular in our universe, and he needs to revise his intuitions about what is essential to the concept.
vi. Conclusion
The upshot is that the previous five arguments against the reality of time do not succeed. So, time is real whether or not it is fundamental, provided it is understood as a feature of space-time in which durations are relative to a chosen reference frame.
11. Time Travel
Would you like to travel to the future and read about the history of your great-grandchildren? You can do it. Nothing in principle is stopping you. Would you like to travel, instead, to the past? You may have regrets and wish to make some changes. Unfortunately, travel to your own past is not as easy as travel to someone else’s future. It is much easier to visit your descendants than your ancestors, and doing the latter may be impossible, although not all experts agree about the impossibility.
The term time travel has now become a technical term. For starters, it means travel in physical time, not psychological time. You do not time travel if you merely dream of living in the past, although neuroscientists commonly do call this “mental time travel.” You do not time travel for five minutes simply by being alive for five minutes. You do not time travel by crossing a time zone, nor by having your body frozen and thawed later, even if this does extend your lifetime.
Time travel to the future presupposes the metaphysical theory of eternalism because, if you travel to the future, there must be a future that you travel to. Presentism and the growing-past theory deny the existence of this future. That is why the growing-past theory is called no-futurism and also possibilism.
In 1976, the Princeton University metaphysician David Lewis offered this technical definition of time travel:
In any case of physical time travel, the traveler’s journey as judged by a correct clock attached to the traveler takes a different amount of time than the journey does as judged by a correct clock of someone who does not take the journey.
The implication from this definition is that time travel occurs when correct clocks get out of synchronization. If you are the traveler, your personal time (technically called your proper time) is shown on the clock that travels with you. A person not taking the journey is said to be measuring external time. This external time could be their proper time, or it could be the proper time of our civilization’s standard clock.
Lewis’s definition is widely accepted, although it has been criticized occasionally in the philosophical literature. The definition has no implications about whether, if you travel forward in external time to the year 2376 or backward to 1776, you can suddenly pop into existence then as opposed to having traveled continuously during the intervening years. Continuity is required by scientific theory, but discontinuous travel is more popular in fictional books and films.
a. To the Future
Time travel to the future occurs very frequently, and it has been observed and carefully measured by scientists. Time travel to the past is much more controversial, and experts disagree with each other about whether it is even physically possible. Relativity theory implies there are two different kinds of time travel to the future: (1) two clocks becoming out of synchrony due to their moving relative to each other, and (2) two clocks becoming out of synchrony due to their encountering different gravitational forces.
When you travel to the future, you eventually arrive at some future event having taken less time on your clock than the non-travelers do on their clocks. You might travel to the future in the sense that you participate in an event ten years in their future, having taken only two years according to your own clock. That would be an eight-year leap forward in time. You could have been continuously observed from Earth’s telescopes during your voyage to that event. However, the astronomers on Earth would notice that you turned the pages in your monthly calendar very slowly. The rate of ticking of your clock would differ from that of their clock during the flight. Reversing your velocity and traveling back to the place you began the trip will not undo this effect.
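The eight-year leap described above is fixed by the standard special-relativistic time-dilation formula. The sketch below uses the paragraph’s own figures (ten Earth-years versus two traveler-years) and assumes, for simplicity, a constant cruising speed; it shows how fast the traveler would need to go.

```python
import math

def gamma(v_over_c):
    """Lorentz factor for a speed expressed as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - v_over_c**2)

# From the paragraph: 10 years pass on Earth while 2 years pass for the
# traveler, so the (average) Lorentz factor over the trip must be 10 / 2 = 5.
required_gamma = 10 / 2

# Inverting the gamma formula gives the required cruising speed
# (a constant speed is assumed here for simplicity):
v_over_c = math.sqrt(1.0 - 1.0 / required_gamma**2)
print(f"required speed: {v_over_c:.4f} c")  # about 0.9798 c
```

Note the asymmetry the paragraph describes: reversing the velocity and flying home does not undo the effect, because the dilation depends on the speed, not its direction.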
If you travel to the future this way, do you travel to your own future, or only to someone else’s future? You do both. If you do travel to the future, you never get biologically younger; you simply age more slowly than those who do not travel with you.
Any motion produces time travel to the future, relative to the clocks of those who do not move. That is why you can legitimately advertise any bicycle as being a time machine. The faster you go the sooner you get to the part of the future you desire but the more easily the dust and other particles in space will slice through your body during the trip.
The second kind of future time travel is due, not to a speed difference between two clocks, but to a difference in the strength of the gravitational field on two clocks. This is called gravitational time dilation, and it is most noticeable near a source of extreme gravitation such as near a black hole. If you were to leave Earth and orbit near a black hole, your friends back on Earth might view you continuously through their telescopes and, if so, would see you live in slow motion. When you returned, your clock would show that less time had expired on your clock than on their clock that remained on Earth. Similarly, in a tall building the lower-floor clocks tick more slowly than upper-floor clocks because the lower floor is in a stronger gravitational field, all other things being equal. There is no theoretical limit to how slow a clock can tick when it undergoes time dilation, but it would never tick in reverse.
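The tall-building example can be made quantitative with the standard Schwarzschild rate formula for a static clock at radius r from a mass M. The sketch below is a rough estimate only: it treats Earth as a non-rotating sphere and uses a 300-meter height difference as an assumed, illustrative figure for a tall building.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8     # speed of light, m/s
M = 5.972e24    # mass of Earth, kg
R = 6.371e6     # mean radius of Earth, m

def clock_rate(r):
    """Schwarzschild ticking rate of a static clock at radius r,
    relative to a clock far from the mass."""
    return math.sqrt(1.0 - 2.0 * G * M / (r * c**2))

# Ground-floor clock vs. a clock 300 m higher (an assumed building height):
lower, upper = clock_rate(R), clock_rate(R + 300.0)
fractional_lag = 1.0 - lower / upper
print(fractional_lag)  # ~3e-14: the lower clock loses roughly a microsecond per year
```

The effect is tiny near Earth but grows without bound near a black hole’s horizon, where `clock_rate` approaches zero, which is the paragraph’s point that there is no theoretical limit to how slowly a clock can tick.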
Travelers to the future can participate in that future, not just view it. They can influence the future and affect it. Saying travelers can change the future is a controversial comment; it is either true or false depending on what is meant by the term “change the future.” According to the metaphysician David Lewis (Lewis 1976, 150), changing the future is impossible. If it changed, then it was not really the future after all. He argued that no action changes the future, regardless of whether time travel is involved.
Suppose you were to encounter a man today who says that yesterday he lived next door to Isaac Newton in England in the year 1700, but now he has traveled to the future and met you. According to the theory of relativity, it is physically possible that he did this. Yet it is an extraordinary claim since you undoubtedly believe that sufficiently fast spaceships or access to extraordinarily high gravitational fields were not available to anyone in 1700. And it is unlikely that history books failed to mention this if it did occur. Epistemology tells us that extraordinary claims require extraordinarily good evidence, so the burden of proof is on the strange man to produce that evidence—such as a good explanation of how the secret of building spaceships was discovered but kept from the public in 1700 and from later historians. You also would like to be shown that his body today contains traces of the kind of atmosphere that existed back in 1700; that atmosphere is slightly different chemically from ours today. If he cannot or will not produce the evidence, then it is much more likely that he is deluded or is simply lying. Giving him a lie detector test will not be very helpful; you want to know what is true, not merely what he believes to be true.
b. To the Past
There are no known examples of travel to the past. But before we assess whether travel to the past is at least physically possible, let’s consider what we mean by travel to the past. A telescope is a window into the past. If we use it to look out into space to some 3D object a million light-years away, we are seeing the object as it was a million years ago. However, this is looking at the past, not being in the past.
Being in our own past is part of what is meant by traveling to the past. The word “our” is important here. At present, we are existing in the past of future people, but we are not traveling into their past. What optimists about past time travel hope is that it is possible to travel into our own past. This is impossible according to Newton’s physics and impossible according to Einstein’s special theory of relativity, but it may be possible according to Einstein’s general theory of relativity, although experts are not in agreement on this point despite much study of the issue. The present article tentatively assumes that eventually there will be agreement that it is physically possible.
One of the world’s experts on time travel, the Nobel-Prize winner Kip Thorne at the California Institute of Technology, made this helpful comment:
If it is possible to go backward in time, you can only do so by traveling outward in space and then returning to your starting point before you left. You cannot go backward in time at some fixed location while watching others go forward in time there.
Travel in time to the past was seriously discussed in the physics community only after 1949 when the logician Kurt Gödel published a solution to the equations of the general theory of relativity that he claimed allows travel to the past. He claimed that some very exotic distributions of matter and energy will curve space-time enough to form loops along which, as you continue to travel forward in your own proper time, you arrive back to your past events. These curves or time loops are technically called “closed time-like curves” by physicists. There is no requirement that anything moves past another object faster than the speed of light in this scenario. Einstein praised the paper, but said he hoped some new physical law would be discovered that would block Gödel’s solution to the relativity equations. Other physicists say Einstein should not have praised Gödel’s argument because it is not clear that it is successful.
During all this traveling, both growing up and then growing younger, your life events are required to stay within your own light cone. However, as Gödel claimed, general relativity theory seems to allow the light cone structure itself to “tip” in a space-time diagram as time progresses. Gödel argued that, in a universe having a non-zero total angular momentum, general relativity allows enough tipping that eventually the cone encloses an earlier event on your world line that is within your past light cone. If so, you’ve traveled in a time loop without having had to travel faster than the speed of light.
But there is still an entropy problem. Suppose an adult person does travel on a loop consistent with general relativity and arrives back when they were, say, three years old. Returning to the past as a three year old requires undergoing a transition back to the body of a three-year-old while losing memories of being an adult. Storing memories is a process that increases entropy, and so is growing up physically. So, a return to childhood apparently requires violating the second law of thermodynamics. The counter to this argument is that it assumes entropy is required to increase in the future, but that is an error. Entropy very probably will increase in the future, but it is not required to increase. Entropy can increase in some parts of a closed system and decrease in other parts so long as, on average over the total system, the entropy tends to increase.
Creating a wormhole might be another way to visit your past. Hawking says:
One can show that to create a wormhole one needs to warp space-time in the opposite way to that in which normal matter warps it. Ordinary matter curves space-time back on itself, like the surface of the Earth. However, to create a wormhole one needs matter that warps space-time in the opposite way, like the surface of a saddle. The same is true of any other way of warping space-time to allow travel to the past if the universe didn’t begin so warped that it allows time travel. What one would need would be matter with negative mass and negative energy density to make space-time warp in the way required (Hawking 2018, 134).
This matter is called “exotic matter.”
Here is a description of travel to the past via a wormhole:
For example, suppose an alien civilization, back in our year 1938, constructed a wormhole and sped up one of its mouths such that in a single year of the wormhole’s time, it aged one hundred years of our time. That mouth would “live” in 2038. The other mouth, at rest compared to the first one, having aged one year both in its time and ours, would “live” in 1939. Imagine, for the sake of argument, that both mouths were close enough to Earth that a terrestrial astronaut could access each of them and return to our planet in a reasonable time. Consequently, an intrepid voyager could enter the first mouth in the year 2038, pass through the wormhole’s throat, and venture back to 1939 via the second mouth (Paul Halpern, The Quantum Labyrinth).
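The bookkeeping in Halpern’s scenario can be checked directly. The sketch below uses only the figures from the quoted passage (built in 1938, one year of wormhole proper time, a dilation factor of 100) and, as an extra illustrative assumption, computes the constant speed the fast mouth would need for that dilation factor.

```python
import math

# Bookkeeping for the wormhole scenario in the quoted passage (illustrative).
built = 1938
wormhole_years = 1   # proper time elapsed at each mouth, as seen via the throat
dilation = 100       # external years per wormhole year for the sped-up mouth

fast_mouth_external = built + wormhole_years * dilation  # external year 2038
slow_mouth_external = built + wormhole_years             # external year 1939

# Entering the fast mouth in 2038 and exiting the slow mouth lands you in 1939:
assert fast_mouth_external == 2038 and slow_mouth_external == 1939

# The constant speed needed for a time-dilation factor of 100 (an assumption;
# the passage does not say how the mouth was accelerated):
v_over_c = math.sqrt(1.0 - 1.0 / dilation**2)
print(f"{v_over_c:.5f} c")  # about 0.99995 c
```

Note that even in this scenario the voyager cannot reach any year before 1938: as stated later in this section, a wormhole time machine permits no travel to a time before the machine was created.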
In 1992, Hawking proposed The Chronology Protection Hypothesis that Nature conspires somehow to block backward time travel on the macroscopic scale. Maybe time machines above the Planck scale will explode. One of the world’s experts on time travel, Kip Thorne, cautioned in 2024 that Hawking’s hypothesis is a guess, not a fact.
There is general agreement that science must obey logic, which implies that in a single world there is a consistent story of what has happened and will happen, despite the fact that novels about time travel frequently describe traveling back to remake the past, thereby producing a new version of reality that is inconsistent with the earlier version.
The equations of general relativity are simply too complicated to solve regarding past time travel, even for experts. Many of these experts (for example, Frank Wilczek) speculate that travel to the past is not allowed in any physically possible universe, and the closest one can come to time travel to the past is to travel to a new branch of the universe’s quantum wave function, which implies, for some experts, traveling to a parallel universe. All the experts agree that, even if the equations do allow some possible universe to contain travel to one’s own past via the creation of a time machine, they do not allow travel to a time before the creation of the first time machine in that universe.
Still others say that, even if relativity does allow travel to the past, the theory should be revised to prevent this. Other opponents of the possibility of time travel to the past hope that an ad hoc restriction is not needed and instead that relativity theory will be understood more clearly so it can be seen that it does rule out past time travel. And still other opponents of time travel to the past hope an as yet unknown physical law will be discovered that rules out travel to the past. However, defenders of time travel say we should bite the bullet, and accept that relativity does allow time travel in some kinds of universes that have special warped space-time.
Here is a pessimistic remark about time travel from J.J.C. Smart in The Journal of Philosophy in 1963:
Suppose it is agreed that I did not exist a hundred years ago. It is a contradiction to suppose that I can make a machine that will take me to a hundred years ago. Quite clearly no time machine can make it be that I both did and did not exist a hundred years ago.
Smart’s critics accuse him of the fallacy of begging the question. They wonder why he should demand that it be agreed that “I did not exist a hundred years ago.”
If general relativity does allow a universe that contains time travel to the past, this universe must contain a very special distribution of matter-energy. Regardless of how space is curved and what sort of time travel occurs, if any past time travel does occur, the traveler apparently is never able to erase facts or otherwise change the past. That is the point of saying, “whatever happened, happened.” But that metaphysical position has been challenged. It assumes there is only one past and that whatever was the case will always have been the case. These assumptions, though widely accepted, occasionally have been challenged in the philosophical literature. They were challenged in the 11th century by Peter Damian who said God could change the past.
Assuming Damian is mistaken, if you do go back, you would already have been back there. For this reason, some philosophers argue that for reasons of logical consistency, if you go back in time and try to kill your grandfather by shooting him before he conceived a child, you will fail no matter how hard you try. You will fail because you have failed. But nothing prevents your trying to kill him. There is no free will problem.
Or is there? The impossibility of killing your grandfather seems to some philosophers to raise a problem about free will. If you are free to shoot and kill people before you step into a time machine, then presumably you are free to shoot and kill people after you step out.
Assuming you cannot shoot and kill your grandfather because you did not, many philosophers argue that in this situation you do not really have freedom in the libertarian sense of that term. To resolve this puzzle, the metaphysician David Lewis said you can in one sense kill your grandfather but cannot in another sense. You can, relative to a set of facts that does not include the fact that your grandfather survived to have children. You cannot, relative to a set of facts that does include this fact. However, Lewis said there is no sense in which you both can and cannot. So, the meaning of the word can is sensitive to context. The metaphysician Donald C. Williams disagreed, and argued that we always need to make our can-statement relative to all the available facts. Lewis is saying you can and can’t, but in different senses, and you can but won’t. Williams is saying simply that you can’t, so you won’t.
If you step into a time machine that projects you into the past, then you can expect to step out into a new place because time travel apparently always involves motion. There is an ongoing philosophical dispute about whether, in a real closed time-like curve, a person would travel to exactly an earlier event or, instead, only to a nearby event. One suggested reason for restricting the time-like curve to only nearby events is that, if one went back to the same event, one would bump into oneself, and this would happen over and over again, and there would be too many copies of oneself existing in the same place. Many physicists consider this to be a faulty argument.
If it is logically inconsistent to build a new time machine to travel back to a time before the first time machine was invented, then there is no hope of creating the first time machine in order to visit the time of the dinosaurs. In 1988 in an influential physics journal, Kip Thorne and colleagues described the first example of how to build a time machine in a world that has never had one: “[I]f the laws of physics permit traversable wormholes, then they probably also permit such a wormhole to be transformed into a time machine….” (Morris 1988, p. 1446).
A wormhole is a second route between two places; perhaps it is a shortcut tunnel to a faraway place. It may behave as if the tunnel is a route through an extra spatial dimension. Just as two clocks get out of synchrony if one moves relative to the other, a clock near a rapidly moving mouth of a wormhole could get out of synch with a clock at the other, stationary mouth. In principle a person could plunge into one hole and come out at an earlier time. Wormholes were first conceived by Einstein and Rosen, and later were named wormholes by John Wheeler.
Experts opposed to traversable wormholes usually have less of a problem with there being wormholes than with them being traversable. Although Thorne himself believes that traversable wormholes probably do not exist naturally, he also believes they might in principle be created by a more advanced civilization. However, Thorne also believes the short tunnel or “throat” between the two mouths of the wormhole probably would quickly collapse before anything of macroscopic size could use the wormhole to travel back in time. There has been some speculation by physicists that an advanced civilization could manipulate negative gravitational energy with its positive pressure in order to keep the hole from collapsing long enough to create the universe’s first non-microscopic time machine. Perhaps it could be used to visit the age of the dinosaurs.
It is a very interesting philosophical project to decide whether wormhole time travel, or any other time travel to the past, produces paradoxes of identity. For example, can a person travel back and be born again?
To solve the paradoxes of personal identity due to time travel’s inconsistency with commonly held assumptions about personal identity, many philosophers recommend rejecting the endurance theory which implies a person exists wholly at a single instant. They recommend accepting the perdurance theory in which a person exists as a four-dimensional entity extending in time from birth to death. The person is their space-time “worm.” If the person were envisioned to be a point particle whose worm is a one-dimensional curve, then worms of this sort can live partly in wormholes and become closed time-like curves in space-time.
Let us elaborate on this radical scenario. A closed time-like curve has implications for causality. The curve would be a causal loop. Causal loops lead to backward causation in which an effect can occur before its cause. Causal loops occur when there is a continuous sequence of events e1, e2, e3, … in which each member is a cause of its immediate successor and, in addition, for some integer n, en causes e1. Some philosophers of time have cautioned that with a causal loop, “we would be clearly on the brink of magic.” Other philosophers of time are more willing to accept the possibility of causal loops, strange though they would be. These loops would be a fountain of youth. When you go around the loop, you travel back to a time when you were younger, or perhaps even to your birth.
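The structure just defined is exactly a cycle in a directed graph of causal links, and it can be illustrated with a standard cycle-detection sketch (the event names here are invented for the example):

```python
# A causal loop: events e1 -> e2 -> ... -> en with each event causing its
# successor and en causing e1.  This sketch detects whether a directed
# causal graph contains such a loop.

def has_causal_loop(causes):
    """causes maps each event to the list of events it directly causes."""
    visited, on_path = set(), set()

    def visit(event):
        if event in on_path:           # returned to an event on the current chain
            return True
        if event in visited:
            return False
        visited.add(event)
        on_path.add(event)
        if any(visit(nxt) for nxt in causes.get(event, [])):
            return True
        on_path.remove(event)
        return False

    return any(visit(e) for e in causes)

# An ordinary causal chain has no loop; a chain that feeds back into
# its own origin does.
chain = {"e1": ["e2"], "e2": ["e3"], "e3": []}
loop = {"e1": ["e2"], "e2": ["e3"], "e3": ["e1"]}
print(has_causal_loop(chain))  # False
print(has_causal_loop(loop))   # True
```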
Most time travel stories in literature involve contradictions, either logical contradictions or inconsistency with accepted laws of physics. The most famous one that appears not to is Robert Heinlein’s story “All You Zombies.” It shows how someone could be both their father and mother, provided relativity theory does allow backward time travel.
For a detailed review of the philosophical literature on backward time travel and the resulting paradoxes of causality and of personal identity, see (Wasserman, 2018, ch. 5) and (Fisher, 2015).
Richard Feynman suggested that a way to interpret the theory of quantum electrodynamics, the quantum theory of electromagnetic interactions, is that an antimatter particle is really a matter particle traveling backward in time. He agreed with John Wheeler’s claim that the positively charged positron moving forward in time is really a negatively charged electron moving backward in time.
[Image of a U.S. postage stamp showing Feynman diagrams. Credit: US Postal Museum.]
This phenomenon is pictured in the two diagrams on the left (only) of the above postage stamp, where time is portrayed as increasing upward. The positron e+ is moving down the stamp or backward in time. Every Feynman diagram pictures a short sequence of elementary interactions among particles.
Most physicists who work with quantum mechanics accept that this idea of going backward in time can provide a useful simplification of the required computations, but they are not convinced that any particle really does travel backwards in time. The principal reason for this caution is that all empirical searches attempting to detect a particle moving backward in time have failed. See (Muller 2016a, 241-6, 296-7) and (Arntzenius & Greaves 2009) for commentary on this.
Here are a variety of very brief philosophical arguments against all travel to the past:
1. If travel to the past were possible, you could go back in time and kill your grandfather before he met your grandmother, but then you would not be born and so could not go back in time and kill your grandfather. That is a logical contradiction. So, travel to the past is impossible.
2. Only the present is real. So time travel to the past is not real.
3. Time travel is impossible because, if it were possible, we should have seen many time travelers by now, but nobody has ever encountered any time travelers.
4. If past time travel were possible, then you could be in two different bodies at the same time, which is metaphysically impossible.
5. If you were to go back to the past, then you would have been fated to go back because you already did, and this rules out your freedom to go back or not. Yet you do have this freedom, so travel to the past is impossible.
6. If past time travel were possible, then you could die before you were born, which is biologically impossible.
7. If you were presently to go back in time, then your present events would cause past events, which violates our concept of causality.
8. If travel to the past were possible, then when time travelers go back and attempt to change history, they must always fail in their attempts to change anything, and it will appear to anyone watching them at the time as if Nature is conspiring against them. Since no one has ever witnessed this apparent conspiracy of Nature, there probably cannot be time travel.
9. Travel to the past is impossible because it allows the gaining of information for free. Here is a suggestive scenario. You in the 21st century buy a copy of Charles Darwin’s book The Origin of Species, which was published in 1859. You enter a time machine with it, go back to 1855 and give the book to Darwin himself. He could have used your copy in order to write his manuscript which he sent off to the publisher. If so, who first came up with the knowledge about evolution? You got the knowledge from Darwin, but Darwin got the knowledge from you. This is “free information.” Because this scenario contradicts what we know about where knowledge comes from, past-directed time travel is not possible.
10. Travel to the past allows you to return to have intercourse with one of your parents, causing your birth. You would have the same fingerprints as one of your parents. But this is biologically impossible.
11. If past time travel is possible, then it should be possible for a rocket ship to carry a time machine capable of launching a probe (perhaps a smaller rocket) into its recent past which might eventually reunite with the mother ship. The mother ship has been programmed to launch the probe at a certain time unless a safety switch is on at that time. Suppose the safety switch is programmed to be turned on if and only if the return or impending arrival of the probe is detected by a sensing device on the mother ship. Does the probe get launched? It seems to be launched if and only if it is not launched.
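The self-referential structure of the probe objection can be made explicit with a toy consistency check (a logical sketch only, not physics): the probe is launched if and only if the safety switch is off, and the switch is on if and only if the probe's return is detected, that is, if and only if it was launched. So the scenario demands that the probe is launched if and only if it is not launched, and no consistent outcome satisfies that demand:

```python
# The probe paradox as a consistency check.  The scenario requires:
# launched  iff  not launched.  Search both possible outcomes for one
# that satisfies the requirement.
consistent_outcomes = [
    launched
    for launched in (True, False)
    if launched == (not launched)   # the paradox's requirement
]
print(consistent_outcomes)  # [] -- neither outcome is self-consistent
```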
These objections to travel to the past are a mixture of arguments that past-directed time travel is not logically possible, not metaphysically possible, not physically possible, not biologically possible, not technologically possible, and not probable enough.
Counters to all of these arguments have been suggested by advocates of time travel. Item 1 is the grandfather paradox, and some supporters of backward time travel object to its interpretation of the modal term “could kill.” They prefer interpreting “could kill” as “could try to kill but not actually kill.”
Item 2 is the argument from presentism.
One response to item 3, the Enrico Fermi Paradox, is that perhaps we have seen no time travelers because we live in a boring era of little interest to time travelers. A better response is that perhaps the first time machine has never been built, and it is known that a time machine cannot be used to go back to a time before the first time machine exists.
Item 9, the Paradox of Free Information, has gotten considerable attention in the philosophical literature. In 1976, David Lewis said this:
But where did the information come from in the first place? Why did the whole affair happen? There is simply no answer. The parts of the loop are explicable, the whole of it is not. Strange! But not impossible, and not too different from inexplicabilities we are already inured to. Almost everyone agrees that God, or the Big Bang, or the entire infinite past of the Universe, or the decay of a tritium atom, is uncaused and inexplicable. Then if these are possible, why not also the inexplicable causal loops that arise in time travel?
Einstein and Rosen suggested that the laws of general relativity might allow traversable, macroscopic wormholes. A wormhole is a short tunnel connecting two distant regions of space that is very similar to two connected black holes, and the tunnel would be a shortcut between them. It is analogous to a path taken by a worm that has eaten its way to the opposite side of an apple’s surface without taking the longer path along the apple’s skin. That is why John Wheeler coined the name “wormhole.”
The wormhole is highly curved space-time, and from the outside it looks like a sphere in 3D-space. It is not quite a black hole because it has no event horizon. There is no consensus among theoretical physicists about whether general relativity permits the existence of a wormhole. Assuming it does, and assuming one of the spheres could be controlled and forced to move very fast back and forth, then with two connected spheres situated in separate galaxies, a particle or person could enter one at some time, then exit the other at an earlier time, having traveled, say, just a few meters through the tunnel. Because of this implication for time, some physicists argue that if these traversable wormholes are allowed by general relativity, then the theory needs to be revised to disallow them.
For more discussion of time travel by wormhole, see the supplement on relativity theory. For more about time travel in general, see this encyclopedia’s article “Time Travel.” For some arguments in the philosophy literature against the possibility of a person traveling back to a time at which the person previously existed, see (Horwich 1975), (Grey 1999), and (Sider 2001).
12. McTaggart’s A-Theory and B-Theory
In 1908, the English philosopher J. M. E. McTaggart launched the metaphysical debate between the tensed and tenseless theories of time by proposing two ways of linearly ordering all events in time. These are two ways of putting instantaneous events in order along a line. The two ways are different, but the two orderings themselves are the same. (Or so this article assumes, although some metaphysicians argue that the two ways are different and so are the two orderings. This issue is not pursued in the present article.)
McTaggart stated his A-theory in English using tenses to locate events in time relative to a present time, thereby making each event either past, present or future. For example, the following expression uses the future tense to force the lunch event to be future: “Our office lunch meeting will occur next Monday.” But many other languages do not use the linguistic tool of applying tenses as their means of locating events in time, yet they can describe time perfectly adequately, so the philosophical point of the A-theory is expressed better by saying that describing the world accurately requires some linguistic method of locating each event relative to now, the present. The B-theory does not require this reliance on the present. It makes do with dates and with phrases that tell us what happens before what.
Here is how McTaggart introduced his two series:
For the sake of brevity, I shall give the name of the A series to that series of positions which runs from the far past through the near past to the present, and then from the present through the near future to the far future, or conversely. The series of positions which runs from earlier to later, or conversely, I shall call the B series. (McTaggart 1927, 10)
When McTaggart uses the word series, he means what mathematicians call a sequence, but the literature in philosophy often follows McTaggart on this point. Below is a graphic representation of McTaggart’s ordering, in which point event c happens later than point events a and b:
McTaggart is making several assumptions here. First, he does not believe time is real, so his remark that the A-series and B-series mark out positions in time is only on the assumption that time is real, despite what he, himself, believes. Another assumption is that longer-lasting events are composed of their point events. Also, there are a great many other events that are located within the series at event a’s location, namely all the other events that are simultaneous with event a.
Using the standard time diagram with time increasing to the right along a horizontal line, event a in McTaggart’s B-series (see the previous picture above) is ordered to the left of event b because a happens before b. But when ordering the same two events into McTaggart’s A-series, event a is ordered to the left of b for a different reason—because event a is more in the past than event b, or, equivalently, has more pastness than b. The A-series locates each event relative to the present; the B-series is created with no attention paid to the present, but only to what occurs before what. Importantly, the B-relations of earlier-than and later-than and simultaneous-with do not change over time. If World War I occurs before World War II, then this never changes. The A-relation of the siren going off two hours ago can change over time from holding for the siren to not holding. Although the picture above of the A-series looks just like the picture above of the B-series, many metaphysicians prefer to say that the A-series changes and the B-series does not. Strictly speaking, they are talking about the properties that constitute the A-series vs. the properties that constitute the B-series.
Suppose that event c occurs in our present and after events a and b. Although the philosophical literature is not in agreement, it is usually said that the information that c occurs in the present is not contained within either the A-series or the B-series itself, but is used to create the A-series. That information of c’s being in the present tells us to place c to the right of b because all present events are without pastness; they are not in the past. Someone constructing the B-series places event c to the right of b for a different reason, namely that c happens after b.
To summarize, McTaggart’s A-series is a series of events ordered by tense or by implicit reference to our present, whereas his B-series is not. The B-series is created with dates or, more specifically, with the relations of before this event and after that event and simultaneous with those events.
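As a toy illustration of the two constructions (the events and dates below are invented for the example), the B-series orders events by the earlier-than relation alone, while the A-series locates the same events relative to a chosen present:

```python
# Toy illustration of McTaggart's two orderings (invented events and dates).
events = {"a": 1914, "b": 1939, "c": 2020}   # event -> year of occurrence

# B-series: order events solely by the earlier-than relation.
b_series = sorted(events, key=lambda e: events[e])
print(b_series)  # ['a', 'b', 'c']

# A-series: locate each event relative to a chosen present ("now").
# Negative offsets are degrees of pastness; zero marks the present.
now = 2020
a_series = [(e, events[e] - now) for e in b_series]
print(a_series)  # [('a', -106), ('b', -81), ('c', 0)]

# The two orderings coincide, but the A-series facts change as "now"
# moves forward, while the B-series facts (a before b before c) never do.
```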
One influential treatment of McTaggart’s idea is to say a future event will shed its intrinsic, non-relational A-property of futureness to eventually acquire presentness, then shed that property in favor of some pastness, then shed that, too, in favor of even greater pastness, and so forth. McTaggart himself did not accept this notion of shedding properties. He believed the A-series is paradoxical, but he also believed the A-properties (such as being past or being two weeks past) are essential to our concept of time. So, for this reason, he believed our current concept of time is incoherent. This reasoning is called McTaggart’s Paradox.
McTaggart is not an especially clear writer, so his remarks can be and have been interpreted in different ways, and the reader needs to work hard to make sense of them. Consider McTaggart’s Paradox as it applies to one specific event, say the event in which:
Socrates speaks to Plato for the first time.
This speaking to Plato is in the past, at least it is in our past, though not in the past of Egyptian King Tut, so the speaking is past in our present. Nevertheless, back in our past, there is a time when the event is present. From this, McTaggart concludes that the event is both past and present, from which he declares that the A-series is contradictory and so paradoxical. If that reasoning is correct (and it has been challenged by many), and if the A-series is essential to time, then time itself must be unreal.
When discussing the A-theory and the B-theory, metaphysicians often speak of:
A-series and B-series
A-theorist and B-theorist
A-facts and B-facts
A-terms and B-terms
A-properties and B-properties
A-predicates and B-predicates
A-propositions and B-propositions
A-sentences and B-sentences
A-camp and B-camp.
Here are some examples of using this terminology. Unlike the A-series terms, the B-series terms are relational terms because a B-term refers to a property that relates a pair of events. Some of these properties are: is earlier than, happens twenty-three minutes after, and is simultaneous with. An A-theory term, on the other hand, refers to a monadic property of a single event, not of a pair of events. Some of these properties are: in the near future, happened twenty-three minutes ago, and is present. The B-theory terms represent distinctively B-properties; the A-theory terms represent distinctively A-properties. So, the disagreement between advocates of the A-theory and those of the B-theory is about whether the fundamental temporal properties of events are monadic or relational.
The B-fact that event a occurs before event b does not itself change; it will always be a fact, but the A-fact that event a occurred about an hour ago will not be a fact for long. B-theorists do not like facts going in and out of existence, but this is acceptable to A-theorists. This crucial difference between the two theories is why philosophers tend to call the B-theory static and the A-theory dynamic.
Similarly, if we turn from fact-talk to statement-talk, the A-statement that event a occurred about an hour ago, if true, will soon become false. B-statements, by contrast, are eternal. For example, the statement “The snowfall occurred an hour before this act of utterance” will, if true, be true at all times, provided the phrases the snowfall and this act of utterance do not change their reference. Eternal facts are static; changing facts are dynamic.
The A-theory usually implies A-facts are the truthmakers of true A-statements and so A-facts are ontologically fundamental. The B-theorist, at least a B-theorist who believes in the existence of facts (and some B-theorists do not), appeals instead to B-facts. According to a classical B-theory, when the A-theorist correctly says, “It began snowing an hour ago,” what really makes it true is not that the snowing has an hour of pastness (so the fact is tensed) but that the event of uttering the sentence occurs an hour after the event of it beginning to snow. Notice that occurs an hour after is a B-term that is supposed to be logically tenseless and to be analogous to the mathematical term numerically less than even though when expressed in English it must use the present tense of the verb to occur.
When you like an event, say yesterday’s snowfall, then change your mind and dislike the event, what sort of change of the event is that? Well, this change in attitude is not a change that is intrinsic to the event itself. It is extrinsic. When your attitude changes, the snowfall itself undergoes no intrinsic change, only a change in its relationship to you. (A-theorists and B-theorists do not disagree about this.) This illustrates what is meant by intrinsic when A-theorists promote the intrinsic properties of an event, such as the snowfall having the intrinsic property of being in the past. B-theorists analyze the snowfall event differently, saying that more fundamentally the event is not in the past but is in the past relative to us. “Being in the past,” they say, is not intrinsic as A-theorists claim but rather is relational.
Members of the A-camp and B-camp recognize that ordinary speakers are not careful in their use of A and B terminology; but, when the terminology is used carefully, each believes their camp’s terminology can best explain ordinary speech involving time and also the terminology of the other camp. Each theorist believes their theory best explains what aspects of reality make true or false the various facts of temporal becoming (the event becoming more in the past) and temporal relations (this event occurring before that event), including facts about the arrow of time.
In the twentieth century, their debate was mostly about finding a linguistic reduction of tensed discourse and temporal concepts into tenseless ones. That is, the debate was mostly about which theorist could give the best translations of temporal sentences or could give the best definitions of temporal terms, but subsequently the debate has broadened to accounting for the truth conditions of sentences, that is, to identifying their truthmakers, namely the basis in reality for why a sentence is true or false.
McTaggart also spoke of a C-series and a C-theory. The A-theory says a correct description of reality requires there to be an objective present. The B-theory instead requires an objective relation of before and after for all pairs of events. The C-theory has neither of these requirements. It says moments can be ordered into a series without an appeal to temporal concepts at all. The ordering might instead appeal to entropy or some other objective physical property. To illustrate this C-theory, note the infinity of points on the shortest line connecting New York City and London, but those points have no intrinsic or unique order. They might, for example, be ordered from New York City to London, or they might be ordered from London to New York City, or they might be considered to be a set of disordered points.
Most A-theorists promote becoming. The term means a change in the A-series position of an event, such as a change in its degree of pastness. The B-theorist philosopher Adolf Grünbaum believed becoming is mind-dependent, and he pointed to the following initial quotation from J. J. C. Smart in opposition to the A-theory:
“If past, present, and future were real properties of events [i.e., properties possessed by physical events independently of being perceived], then it would require [non-trivial] explanation that an event which becomes present [i.e., qualifies as occurring now] in 1965 becomes present [now] at that date and not at some other (and this would have to be an explanation over and above the explanation of why an event of this sort occurred in 1965)” (says Smart). It would, of course, be a complete trivialization of the thesis of the mind-independence of becoming to reply that by definition an event occurring at a certain clock time t has the unanalyzable attribute of nowness at time t (Grünbaum 1971, p. 218).
Grünbaum is implying that it is appropriate to ask regarding the event of a house falling down in 1965, “Why now instead of some other date?” He believes that it would be an appropriate explanation to appeal to mind-independent soil conditions and weather patterns, but that it would be trivial and inadequate to say instead that the event occurs now because by definition it had at that time the unanalyzable attribute of nowness. And, more generally, says Grünbaum, temporal becoming has no appropriate place within a proper physical theory. In other words, the notion of becoming is unscientific.
Beginning with Bertrand Russell in 1903, many B-theorists have argued that there are no irreducible one-place A-qualities (such as the monadic property of being past) because the qualities can all be reduced to, and adequately explained in terms of, two-place B-relations. The A-theorist disagrees. For example, the claim that it is after midnight might be explained, says the B-theorist, by saying midnight occurs before the time of this assertion. Before is an asymmetric two-place relation. The A-theorist claims this is a faulty explanation because temporal sentences can be true even if they are never asserted.
Is the A-theory or is the B-theory the correct theory of reality? The A-theory has two especially central theses, each of which is contrary to the B-theory:
(1) Time is fundamentally constituted by an A-series in which any event’s being in the past (or in the present or in the future or twenty-three seconds in the past) is an intrinsic, objective, monadic property of the event itself.
(2) Events change.
In 1908, McTaggart described the special way that events change:
Take any event—the death of Queen Anne, for example—and consider what change can take place in its characteristics. That it is a death, that it is the death of Anne Stuart, that it has such causes, that it has such effects—every characteristic of this sort never changes…. But in one respect it does change. It began by being a future event. It became every moment an event in the nearer future. At last it was present. Then it became past, and will always remain so, though every moment it becomes further and further past.
This extraordinary change is usually called second-order change or McTaggartian change. For McTaggart, second-order change is genuine change, whereas a B-theorist such as Russell says this is not genuine change; genuine change, he would say, is ordinary intrinsic change. Just as there is no intrinsic change in a house due to your walking farther away from it, so there is no intrinsic change in an event as it supposedly “moves” farther into the past.
Based upon passages §313–315 of McTaggart’s book The Nature of Existence, one can imagine McTaggart saying to Russell: No, Russell, no. What you identify as “change” isn’t change at all. The B-series world you think is the real world is a world without becoming, a world in which nothing happens.
A world with becoming, on the other hand, is a world in which events change and time flows or passes. “It is difficult to see how we could construct the A series given only the B series, whereas given the former we can readily construct the latter,” says G.J. Whitrow in defense of the A theory.
The B-theory conflicts with two central theses of the A-theory. According to the B-theory,
(1′) Time is fundamentally constituted by a B-series, and the temporal properties of being in the past (or in the present or in the future) are fundamentally relational, not monadic.
(2′) Events do not change.
To re-examine this dispute, because there is much misunderstanding about what is in dispute, let us ask again what B-theorists mean by calling temporal properties relational. They mean that an event’s property of occurring twenty-three minutes in the past, say, is a relation between the event and us. When analyzed, it will be seen to make reference to our own perspective on the world. Queen Anne’s death has the property of occurring in the past because it occurs in our past. It is not in Aristotle’s past or King Tut’s past because it occurred in the 18th century. So, the labels “past,” “present,” and “future” are all about us and are not intrinsic properties of events. That is why there is no objective distinction among past, present, and future, say the proponents of the B-theory. For similar reasons the B-theorist says the property of being two days in the past is not an ‘authentic’ property because it is a second-order property. The property of being two days in our past, however, is a genuine property, says the B-theorist.
Their point that A-properties are relational when properly analyzed can also be made this way. The analogous terminology about space uses the terms here, there, far, and near. These terms are essentially about the speaker, says the B-theorist. “Here” for you is not necessarily “here” for me. Similarly, World War II is past for you but not for Aristotle.
The B-theorist also argues that the A-theory violates the theory of relativity because that theory implies an event can be present for one person but not for another person who is moving relative to the first person. So, being present is relative and not an intrinsic quality of the event. Being present is relative to a reference frame.
A-theorists are aware of these criticisms, and there are many counterarguments. Some influential A-theorists are A. N. Prior, E. J. Lowe, and Quentin Smith. Some influential B-theorists are Bertrand Russell, W. V. O. Quine, D. H. Mellor, and Nathan Oaklander. The A-theory is closely related to the commonsense image of time, and the B-theory is more closely related to the scientific image. Proponents of each theory shoulder a certain burden—explaining not just why the opponent’s theory is incorrect but also why it seems to be correct to the opponent.
The philosophical literature on the controversy between the A and B theories is vast. During a famous confrontation in 1922 with the philosopher and A-theorist Henri Bergson, Einstein defended his own B-theory of time and said “the time of the philosophers” is an illusion. This is an overstatement by Einstein. He meant to attack only Bergson and those philosophers who are A-theorists, who believe our consciousness of time is more fundamental than the time described in the theory of relativity, and who believe, like Bergson, that time is not relative to a reference frame.
Martin Heidegger said he wrote Being and Time in 1927 as a response to the conflict between the A-theory and the B-theory.
Other than the thesis that the present is metaphysically privileged, the other principal thesis of the A-theory that distinguishes it from the B-theory is that time flows. Let us turn to this feature of the A-theory.
13. The Passage or Flow of Time
Many philosophers agree with the ancient Greek philosopher Heraclitus that time passes or flows. This characteristic of time has also been called a flux, a transiency of the present, a moving now, and simply becoming. The philosopher G.J. Whitrow claimed “the passage of time…is the very essence of the concept.”
Advocates of this controversial philosophical position often point out that the present keeps vanishing. And they might offer a simile and say present events seem to flow into the past, like a boat that drifts past us on the riverbank and then recedes farther and farther downstream from us. In the converse simile, we ourselves flow into the future and leave past events ever farther behind us. Philosophers disagree with each other about how to explain the ground of these ideas. Philosopher X will say time passes or flows, but not in the sense used by philosopher Y. Philosopher Z will disagree with both of them and say time does not really flow at all; the flow is an illusion or some inevitable interpretation that minds like ours apply to our experiences.
There are various entangled issues regarding flow. Here are seven. (i) What does it mean for time to flow? (ii) Is the flow an objective feature of physical events that exists independently of our awareness of them? (iii) What is actually flowing? (iv) Are there different kinds of flow? (v) If time flows, do we experience the flow directly or only indirectly? (vi) What is its rate of flow, and can the rate change? (vii) If time does not flow, then why do so many people believe it does?
Regarding point (i) about what it means for time to flow, rivers do not flow, but their water does. Is time more like the river or more like the water? Or is this a misleading question?
There are two primary philosophical positions about time’s flow: (A) The flow is objectively real. (B) The flow is not objectively real; it is merely subjective. This B-theory is called the static theory, mostly by its opponents because of the negative connotation of the word “static.” The A-theory is called the dynamic theory because it implies time is constantly in flux. The A-theory implies that this fact of passage obtains independently of us; it is not subjective. The letters A and B are intended to suggest alliances with McTaggart’s A-theory and B-theory. One A-theorist describes the situation this way: “The sensation we are (perhaps wrongly) tempted to describe as the sensation of temporal motion is veridical: it somehow puts us in touch with an aspect of reality that is unrepresented in Russell’s theory of time [the original B-theory]” (van Inwagen 2015, 81).
In support of the A-theory, many have argued that time flows because past events keep receding from us and new events become part of our present. The death of Queen Anne is an event that an A-theorist says is continually changing by receding ever farther into the past.
B-theorists respond that this change is no more of an objectively real change than saying her death changed from being approved of by Queen Elizabeth to being disapproved of by her. This extrinsic change in approval is not intrinsic to her death and so does not count as an objectively real change in her death. Time’s flowing is merely an extrinsic and subjective feature of time, say B-theorists.
Another B-theorist charge is that the notion of flow is the product of a faulty metaphor. They say time exists, things change, and so we say time passes, but time itself does not change. It does not change by flowing or passing or elapsing or undergoing any motion. The present does not objectively flow because the present is not an objective feature of the world. We all experience this flow, but only in the sense that we all frequently misinterpret our experience. Mirages are not real and neither is time’s flow. It is not that the sentence “Time flows” is false; it just is not an objective truth.
Still another complaint from the B-theorists appeals to how a movie is projected at the cinema. If you slow the projector down, you eventually see that activity on the screen turns into a series of discrete frames, and no frame itself contains anything changing. But when the projector works at normal speed, the frames fly by and you see a continuous flow in the action on the screen. The point is that this flow of time is not there in the film strip; it is produced by your perceptual system. Similarly, time’s flow is produced by your mind; it does not exist objectively in nature.
One point J.J.C. Smart offered against the A-theory of flow was to ask about the rate at which time flows. It would be a rate of one second per second. But that is silly, he claimed. One second divided by one second is the number one, a unit-less number, and so not an allowable rate. And what would it be like for the rate to be two seconds per second, asks Huw Price, who adds, “We might just as well say that the ratio of the circumference of a circle to its diameter flows at pi seconds per second!” (Price 1996, p. 13).
Other philosophers of time, such as John Norton and Tim Maudlin, argue that the rate of one second per second is coherent and acceptable, despite these criticisms. Paul Churchland countered that the rate is meaningful but trivial, for what other rate could it be?
A great many philosophers agree that there seems to most ordinary persons to be a flow of time that they experience. If that is correct, then there surely is some objective feature of their brains that causes this. B-theorists say perhaps the belief is due not to time’s actually flowing but rather to the objective fact that we have different perceptions at different times and that anticipations of experiences always happen before memories of those experiences.
A-theorists who believe in flow have produced many dynamic theories that are closer to common sense on this topic. Here are six.
(1) The passage or flow is a matter of events changing from being future, to being present, to being past. Events change in their degree of futureness and degree of pastness. This kind of change is often called McTaggart’s second-order change to distinguish it from more ordinary, first-order change that occurs when, say, a falling leaf changes its altitude over time. McTaggart suggested that time passes because the present itself moves from moment to moment.
(2) A second type of dynamic theory implies time’s flow is the coming into existence of new facts, the actualization of new states of affairs. Reality grows by the addition of more facts. There need be no commitment to events changing intrinsically.
(3) A third dynamic theory implies that the flow is a matter of events changing from being indeterminate to becoming determinate in the present. Because time’s flow is believed to be due to events becoming determinate, these dynamic theorists speak of time’s flow as becoming.
(4) A fourth dynamic theory says, “The progression of time can be understood by assuming that the Hubble expansion takes place in four dimensions rather than in three. The flow of time consists of the continuous creation of new moments, new nows, that accompany the creation of new space…. Unlike the picture drawn in the classic Minkowski space-time diagram, the future does not yet exist; we are not moving into the future, but the future is being constantly created.” (Muller 2016b).
(5) A fifth dynamic theory suggests the flow is (or is reflected in) the change over time of truth-values of declarative sentences. For example, suppose the sentence, “It is now raining,” was true during the rain yesterday but has changed to false because it is sunny today. That is an indication that time flowed from yesterday to today, and these sorts of truth-value changes are at the root of the flow.
In response to the linguistic turn exemplified by theory (5), critics of the dynamic theory suggest that the temporal indexical sentence, “It is now raining,” has no truth-value because the reference of the word now is unspecified. If the sentence has no truth-value, it cannot change its truth-value. However, the sentence is related to a sentence that does have a truth-value, namely the associated complete sentence or eternal sentence, the sentence with its temporal indexical replaced by some date expression that refers to a specific time, and with the other indexicals replaced by names of whatever they refer to. Typical indexicals are the words: then, now, I, this, here, them. Supposing it is now midnight here on May 1, 2025, and the speaker is in San Francisco, California, then the indexical sentence, “It is now raining,” is intimately associated with the more complete or context-explicit sentence, “It is raining at midnight on May 1, 2025, in San Francisco, California.” Only these latter, non-indexical, non-context-dependent, so-called complete sentences have truth-values, and these truth-values do not change with time, so they do not underlie any flow of time, according to the critic of the fifth dynamic theory.
(6) A sixth dynamic theory adds to the block-universe a traveling present. The present is somehow metaphysically privileged, and there is a moving property of being now that spotlights a new slice of the present events of the block at every new, present moment. A slice is a set of events all of which are simultaneous in the block. So, a slice of events can temporarily possess a monadic property of being now, and then lose it as a newer slice becomes spotlighted. This theory is called the moving spotlight theory. Metaphysically, the moving spotlight theory has been interpreted in two different ways, one rejecting eternalism and the other accepting it. That is, one way suggests there are illuminated moments and unilluminated moments that are, respectively, real and unreal. A second and more common way suggests all times exist but that the present is the only actual time; the actual time exists but is privileged over the other times. Here is how Hermann Weyl described the spotlight theory as being subjective rather than objective:
The objective world simply is, it does not happen. Only to the gaze of my consciousness crawling along the lifeline of my body, does a section of the world come to life as a fleeting image in space which continuously changes in time.
The moving spotlight theory usually accepts eternalism as does the B-theory, but also accepts temporal passage with moments gaining and losing the objective, intrinsic properties of being present or past or future as the now-spotlight moves, which the typical B-theory eternalist does not.
Huw Price offers a short overview of various arguments against the passage of time (Price 1996, pp. 12-16). Tim Maudlin responds to these arguments in (Maudlin 2002).
14. The Past, Present, and Future
a. Presentism, the Growing-Past, Eternalism, and the Block-Universe
Have dinosaurs slipped out of existence? More generally, this question asks whether the past is part of reality. How about the future? Philosophers are divided on how to answer these two ontological questions. There are three leading theories. They are called presentism, the growing-past theory, and eternalism; but there are also other, less popular, theories.
(1) According to presentism, only present objects exist. Stated another way: if something is real, then it exists now. The past and the future are not real, so either the past tense sentence, “Dinosaurs existed” is false, or else it is true but its truth is grounded only in some present facts. A similar analysis is required for statements in the future tense. Perhaps they can be analyzed in terms of present anticipations. With that accomplished, then all the events can be linearly ordered as if the past ones occur before the present ones and the present ones occur before the future ones, when actually they do not because all real events occur only in the present. Most presentists would say the present is the same for all of us all over the universe. Einstein argued instead that the present is relative to a reference frame; he claimed this would be evident if one observer were traveling past another at close to the speed of light.
Heraclitus, Duns Scotus, Thomas Hobbes, Arthur Schopenhauer, A. N. Prior, and Lee Smolin are presentists. In the 17th century, Hobbes wrote, “The present only has a being in nature; things past have a being in the memory only, but things to come have no being at all, the future being but a fiction of the mind….” In 1969, Prior agreed and said of the present and the real:
They are one and the same concept, and the present simply is the real considered in relation to two particular species of unreality, namely the past and the future.
(2) Advocates of a growing-past agree with the presentists that the present is special ontologically, but they argue that, in addition to the present, the past is also real and is growing bigger all the time. Some have claimed there is a property of presentness that successively inheres in different times, as time flows on. The philosophers of science C. D. Broad, George Ellis, Richard Jeffrey, and Michael Tooley have defended the growing-past theory. William James famously remarked that the future is so unreal that even God cannot anticipate it. It is not clear whether Aristotle would have accepted the growing-past theory or accepted a form of presentism; see Hilary Putnam (1967, p. 244) for interesting commentary on this issue. The growing-past theory is also called by other names such as now-and-then-ism, the becoming theory, possibilism, and no-futurism. Members of McTaggart’s A-camp are divided on whether to accept presentism or, instead, the growing-past theory, but they agree on rejecting eternalism.
(3) Advocates of eternalism believe in the equal existence of all times, but not the equal human significance of them. That is, there are no objective ontological differences among the past, present, and future, just as there are no objective ontological differences between here and there. The differences are subjective, according to eternalism, and they depend upon whose experience is being implicitly referred to—yours or Napoleon’s or Aristotle’s. An eternalist will say Napoleon’s rise to power in France is not simply in the past, as the first two theories imply; instead, it is in the past for you, but in the future for Aristotle, and it is equally real for both of you. The past, the present, and the future exist but not at the same times. The eternalist is committed to saying the events of the present are not ontologically privileged. Eternalism is consistent with indeterminism (and determinism).
There is controversy about the exact wording of each of the three metaphysical theories, and about whether the true theory is metaphysically necessary or just contingently true. The three theories do not differ in their observational consequences as do competing scientific theories.
Presentism is the everyday view, and it has become ingrained within all the world’s languages. Eternalism is the only one of the three that permits time travel, so it is understandable that time travel was not seriously discussed in philosophy until the twentieth century when presentism began to be challenged. In the 20th century, Bertrand Russell, J.J.C. Smart, W.V.O. Quine, and David Lewis endorsed eternalism. Eternalism is also, though less frequently, called the tapestry theory of time.
Presentism was the implicitly accepted ontology early in human history as languages were being created, so it has influenced our current use of tenses and of the words “now” and “present.” It is very difficult to speak correctly about eternalism using natural language because all natural languages are infused with presumptions about presentism. Correct descriptions of personal identity are especially difficult for eternalists for this reason.
Here is how one philosopher of physics briefly defended eternalism:
I believe that the past is real: there are facts about what happened in the past that are independent of the present state of the world and independent of my knowledge or beliefs about the past. I similarly believe that there is (i.e., will be) a single unique future. I know what it would be to believe that the past is unreal (i.e., nothing ever happened, everything was just created ex nihilo) and to believe that the future is unreal (i.e., all will end, I will not exist tomorrow, I have no future). I do not believe these things, and would act very differently if I did. Insofar as belief in the reality of the past and the future constitutes a belief in a ‘block universe’, I believe in a block universe. But I also believe that time passes, and see no contradiction or tension between these views (Maudlin 2002, pp. 259-260).
The block is reality represented as one four-dimensional collection of all its events; it contains the world’s past, present, and future. Because the future has not happened yet, the block does not display detailed records of future events. A single reference frame is assumed.
Hermann Minkowski is the father of the block universe concept, but Bertrand Russell is noteworthy for promoting it within the philosophical community. After Einstein suggested that space-time could curve, block theorists said their block curves where space-time curves.
The block theory is promoted primarily by advocates of the B-theory who are eternalists. A and B theorists agree that it is correct to say, “The past does not exist” and to say, “Future events do not exist” if the verbs are being used in their tensed form, but they argue that there should be no implications here for ontology because this is merely an interesting feature of how some languages such as English use tensed verbs. When Tim Maudlin was asked, “Do you believe in a block universe?” he replied: “In a sense I absolutely do, and in a sense I absolutely do not. I do if it means past, present, and future are equally real. But not if it means, as (Huw) Price believes, there is no objective direction of time, or if it means, say, that Einstein is not really dead now.”
For a graphic presentation of the non-curving block, see the green four-dimensional space-time diagram in a supplement to this article. If time has an infinite future or infinite past, then the block is infinite in those directions in time. If space has an infinite extent, then the block is infinitely large along the spatial dimensions. If it were learned that space is nine-dimensional rather than three-dimensional, then block theorists would promote a ten-dimensional block rather than a four-dimensional block.
The block does not flow or otherwise move, so eternalism is often called a static theory because it promotes the block theory. Use of the derogatory label “static” suggests that the theory cannot successfully deal with change. Motion of a single object in the real world is dynamic, but its historical record, such as its worldline within the block, is static. The block theory has been accused by A-theorists of spatializing time and geometricizing time, which arguably it does. The philosophical debate is whether this is a mistake. Some B-theorists complain that labeling their view static mistakenly implies there is a further time dimension in which the block fails to change but should. The block describes change but does not itself change, say B-theorists. “Do you want the block to wiggle or shake?” they say. The A-theorist’s complaint, according to the B-theorist, is like complaining that a printed musical score is faulty because it is static, while real music is vibrant.
To get a better sense of why the block is philosophically controversial, note that in his book The Future, the Oxford philosopher John Lucas said,
The block universe gives a deeply inadequate view of time. It fails to account for the passage of time, the pre-eminence of the present, the directedness of time, and the difference between the future and the past.
G. J. Whitrow complains that “the theory of the block universe…implies that past (and future) events co-exist with those that are present.” This is a contradiction, he believes. Whitrow’s point can be made metaphorically this way: The mistake of the B-theorist is to envision the future as unfolding, as if it has been waiting in the wings for its cue to appear on the present stage—which is absurd.
A principal difficulty for presentism is to make sense of some distant event happening now. Relativity theory implies that which events are simultaneous with which other events depends upon the reference frame that is chosen to make the determination. That is, the concept of the present or now is frame-relative and so is not objectively real. For the eternalist and block-theorist, the block that is created using one reference frame is no more distinguished than the block that is created using another frame allowed by the laws of science. Any chosen reference frame will have its own definite past, present, and future. The majority of physicists accept this block theory, which could be called the mild block theory. Metaphysicians also argue over whether reality itself is a static block, rather than just being representable as a static block. These metaphysicians are promoting a strong block theory. Some theorists complain that this strong block theory is confusing the representation with what is represented. See (Smolin 2013, pp. 25-36) for an elaboration of this point.
Some proponents of the growing-past theory have adopted a growing-block theory. They say the block is ever-growing, and the present is the leading edge between reality and the unreal future. Some philosophers express that point by saying the present is the edge of all becoming. The advocates of the growing-block can agree with the eternalists that what makes the sentence, “Dinosaurs once existed,” be true is that there is a past region of the block in which dinosaurs do exist.
All three ontologies (namely, presentism, the growing-past, and eternalism) imply that, at the present moment, we only ever experience a part of the present and that we do not have direct access to the past or the future. They all agree that nothing exists now that is not present, and all three need to explain how and why there is an important difference between never existing (such as Santa Claus) and not existing now (such as Aristotle). Members of all three camps will understand an ordinary speaker who says, “There will be a storm tomorrow so it’s good that we fixed the roof last week,” but they will provide different treatments of this remark at a metaphysical level.
Most eternalists accept the B-theory of time. Presentists and advocates of the growing-past tend to accept the A-theory of time. Let us take a closer look at presentism.
One of the major issues for presentism is how to ground true propositions about the past. What makes it true that U.S. President Abraham Lincoln was assassinated in 1865? Speaking technically, we are asking for the truthmakers of the true sentences and the falsemakers of the false sentences. Many presentists say past-tensed truths lack truthmakers in the past but are nevertheless true because their truthmakers are in the present. They say what makes a tensed proposition true are only features of the present way that things are, perhaps traces of the past in pages of present books and in our memories. The eternalist disagrees. When someone says truly that Abraham Lincoln was assassinated, the eternalist and the growing-past theorist believe this is to say something true of a real Abraham Lincoln who is not present. The block theorist and the growing-block theorist might add that Lincoln is real but far away from us along the time dimension just as the Moon is real but far away from us along a spatial dimension. Because of this analogy, they say, “Why not treat these distant realities in the same manner?”
A related issue for the presentist is how to account for causation, for how April showers bring May flowers. Presentists believe in processes, but can they account for the process of a cause producing an effect without both the cause and the effect being real at different times?
Presentism and the growing-past theory need to account for the Theory of Relativity’s treatment of the present, or else criticize the theory. On its orthodox interpretation, relativity theory implies there is no common global present, but only different presents for each of us. Relativity theory allows event a to be simultaneous with event b in one reference frame, while allowing b to be simultaneous with event c in some other reference frame, even though a and c are not simultaneous in either frame. Nevertheless, if a is real, then is c not also real? But neither presentism nor the growing-past theory can allow c to be real. This argument against presentism and the growing-past theory presupposes the transitivity of co-existence.
Despite this criticism, (Stein 1991) says presentism can be retained by rejecting transitivity and saying what is present and thus real is different depending on your space-time location. The implication is that, for event a, the only events that are real are those with a zero space-time interval from a. Many of Stein’s opponents, including his fellow presentists, do not like this implication.
Eternalists very often adopt the block-universe theory. This implies our universe is the set of all the point-events with their actual properties. The block is representable with a Minkowski diagram in the regions where space-time does not curve and where nature obeys the laws of special relativity.
The presentist and the advocate of the growing-past theory usually will unite in opposition to eternalism for some or all of the following five reasons: (i) The present is so much more vivid than the future. (ii) Eternalism misses the special open and changeable character of the future. In the classical block-universe theory promoted by most eternalists, there is only one future, so this implies the future exists already and is hiding somewhere awaiting its time to emerge; but that is absurd. (iii) A present event moves in the sense that it is no longer present a moment later, having lost its property of presentness, but eternalism disallows this movement. (iv) Future events do not exist and so do not stand in relationships of before and after, but eternalism says they do stand in those relationships. (v) Future-tensed statements that are contingent, such as “There will be a sea battle tomorrow,” do not have existing truthmakers and so are neither true nor false, yet almost all eternalists mistakenly believe all these contingent statements do have truth values now.
Defenders of eternalism and the block-universe offer a variety of responses to these criticisms. For instance, regarding (i), they are likely to say the vividness of here does not imply the unreality of there, so why should the vividness of now imply the unreality of then? Regarding (ii) and the openness of the future, the block theory allows a closed future and the absence of libertarian free will, but it does not require this. Eventually, there will be one future, regardless of whether that future is now open or closed, and that is what constitutes the future portion of the block that has not happened yet.
“Do we all not fear impending doom?” an eternalist might ask. But if, as presentists and growing-block theorists evidently believe, the future doom does not exist, why should we have this fear? Implicitly accepting this argument in 1981, J.J.C. Smart, who is a proponent of the block-universe, asked us to:
conceive of a soldier in the twenty-first century…cold, miserable and suffering from dysentery, and being told that some twentieth-century philosophers and non-philosophers had held that the future was unreal. He might have some choice things to say.
All observation is of the past. If you look at the North Star, you see it as it was, not as it is, because the light takes about 434 years to reach your eyes. The North Star might have burned out several years ago. If so, then you are seeing something that does not exist, according to the presentist. That is puzzling. Eternalism with the block theory provides a way out of the puzzle: you are seeing an existing time-slice of the 4D block that is the North Star.
Determinism for a system is the thesis that specifying the state of the system at one time fixes how the system evolves forward in time. So, the present state determines each future state. Similarly, a single state at a past time determines the present. By “determines,” we mean determines by rules or laws. Determinism implies that no event is purely random. Most physicists reject determinism because quantum mechanics implies that two identical radioactive atoms in identical situations will take different times to decay. They decay randomly, although the rate at which a great many of them decay can be computed precisely, so they can make precise claims such as “The probability that atoms of kind K will decay within t seconds is p.”
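The kind of precise probabilistic claim mentioned at the end of the paragraph follows the standard exponential-decay law. A minimal sketch (the half-life figure below is the familiar one for carbon-14, chosen purely for illustration; any isotope works the same way):

```python
import math

def decay_probability(half_life, t):
    """Probability that a single atom decays within time t,
    given its half-life (both in the same time units)."""
    lam = math.log(2) / half_life  # decay constant
    return 1 - math.exp(-lam * t)

# After exactly one half-life, the decay probability is 1/2,
# even though no one can say which individual atoms will decay:
p = decay_probability(half_life=5730.0, t=5730.0)  # carbon-14, in years
print(round(p, 3))  # 0.5
```

This is the sense in which the decay of any single atom is random while the statistics of a great many atoms are precisely computable.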
Here is a commonly offered defense of the block-universe theory against the charge that it entails determinism:
The block universe is not necessarily a deterministic one. …Strictly speaking, to say that the occurrence of a relatively later event is determined vis à vis a set of relatively earlier events, is only to say that there is a functional connection or physical law linking the properties of the later event to those of the earlier events. …Now in the block universe we may have partial or even total indeterminacy—there may be no functional connection between earlier and later events (McCall 1966, p. 271).
One defense of the block theory against Bergson’s charge that it inappropriately spatializes time is to point out that when we graph the color of eggs sold against the location of the sales, no one complains that we are inappropriately spatializing egg color.
The issues of spatialization and determinism reflect a great philosophical divide between those who believe the geometrical features of space-time provide an explanation of physical phenomena and those who believe they provide only a representation or codification of those phenomena.
Challenging the claim that the block universe theory must improperly spatialize time, but appreciating the point made by Bergson that users of the block universe can make the mistake of spatializing time, the pragmatist and physicist Lee Smolin says,
By succumbing to the temptation to conflate the representation with the reality and [to] identify the graph of the records of the motion with the motion itself, these scientists have taken a big step toward the expulsion of time from our conception of nature.
The confusion worsens when we represent time as an axis on a graph…This can be called spatializing time.
And the mathematical conjunction of the representations of space and time, with each having its own axis, can be called space-time. The pragmatist will insist that this space-time is not the real world. It’s entirely a human invention, just another representation…. If we confuse space-time with reality, we are committing a fallacy, which can be called the fallacy of the spatialization of time. It is a consequence of forgetting the distinction between recording motion in time and time itself.
Once you commit this fallacy, you’re free to fantasize about the universe being timeless, and even being nothing but mathematics. But, the pragmatist says, timelessness and mathematics are properties of representations of records of motion—and only that.
For a survey of defenses for presentism and the growing-past theories, see (Putnam 1967), (Saunders 2002), (Markosian 2003), (Savitt 2008), and (Miller 2013, pp. 354-356).
b. The Present
Some philosophers believe the coherent use of tenses requires there to be an objective, global present. This present is what we are supposedly referring to when any of us uses the word “now.” Obviously the present is an objective feature of reality, many people say, because some events are really happening now and other events are not. The majority position among physicists is that these people are mistaken. The present is not an objective, global feature of reality. It is a mind-dependent or sociological feature that differs for two people who are moving relative to each other. Yes, there is a present, but everyone has their own present. There is no global present. The present is very often different for different people.
The everyday view is that the present is like an instantaneous snapshot or photograph of everything at one time. Each time has a present that is the same for everyone. Time itself is a continuum of these presents. After the rise of the theory of relativity, physicists discarded this view in favor of treating each point event as having its own light cone structure, made of two cones. Its forward light cone contains all point events in the future that could be causally connected to the present event. Its backward light cone contains all events in its past that could have causally affected the present event. Consequently, there is no objective snapshot of a present, and time is not a sequence of snapshots, despite this being the manifest image of time. Any global notion of a present requires adopting conventions, such as treating some event in Greenwich, England as the privileged event from which other presents are “constructed,” thereby determining which events outside the light cones count as being part of the present.
The notion of a global present is a feature of the manifest image and of classical Newtonian physics, but relativity theory gives up on a global and frame-independent instant of time called “now.” Relativity implies there is frame-dependence for the instant called “7:00 on November 1, 1887” and any other instant. In relativity, everyone has their own proper time, in which what happens at 7:00 on that day can be different for different people, depending on their circumstances, such as you standing still while another person flies by at just below the speed of light. The two of you must disagree about which events are part of the present globally. The details are discussed in the supplement of frequently asked questions. But to summarize, the notion of a sequence of snapshot presents is not an objective notion. The light cone structure of the universe is an objective notion, at least according to relativity theory.
A-theorists, unlike B-theorists, believe the present is an objective, metaphysically-privileged instant that is fundamental, spatially-extended, and global (applying to the entire cosmos). The A-theorists favor the claim that the present is objectively real; the B-theorists say it is subjective because everyone has their own present; and there is no fact of the matter as to which person’s present is the real present. In defending this position, the B-theorists typically refer to the requirements of the theory of relativity which, they say, implies what is happening now is relative to a chosen reference frame; and so the present is always different for two people moving toward or away from each other. That is why the present is subjective, they say.
Let us consider some arguments in favor of the objectivity of the present, the reality of now. One is that the now is so much more vivid to everyone than all other times. Past and future events are dim by comparison. Proponents of an objective present say that if scientific laws do not recognize this vividness and the objectivity of the present, then there is a defect within science. Einstein considered this argument and rejected it. The philosopher of science Tim Maudlin accepts it, and he hopes to find a way to revise relativity theory so it allows a universal present for each instant.
One counter to Einstein is that there is so much agreement among people about what is happening now and what is not. Is that not a sign that the now is objective, not subjective? This agreement is reflected within our natural languages where we find evidence that a belief in the now is ingrained in our language. It is unlikely that it would be so ingrained if it were not correct to believe it.
What have B-theorists said in response? Well, regarding vividness, we cannot now step outside our present experience and compare its vividness with the experience of past presents and future presents. Yet that is what needs to be done for a fair comparison. Instead, when we speak of the “vividness” of our present experience of, say, a leaf falling in front of us, all we can do is compare our present experience of the leaf with our dim memories of leaves falling, and with even dimmer expectations of leaves yet to fall. So, the comparison is unfair; the vividness of future events should be assessed, says the critic, by measuring those future events when they happen and not merely by measuring present expectations of those events before they happen.
In another attempt to undermine the vividness argument, the B-theorist points out that there are empirical studies by cognitive psychologists and neuroscientists showing that our judgment about what is vividly happening now is plastic and can be affected by our expectations and by what other experiences we are having at the time. For example, we see and hear a woman speaking to us from across the room; then we construct an artificial now, in which hearing her speak and seeing her speak happen at the same time. But they do not really happen at the same time, so we are playing a little trick on ourselves. The acoustic engineer assures us we are mistaken because the sound traveled much slower than the light. Proponents of the manifest image of time do not take travel time into account and mistakenly suppose there is a common global present and suppose that what is happening at present is everything that could in principle show up in a still photograph taken with light that arrives with infinite speed.
When you speak on the phone with someone two hundred miles away, the conversation is normal because the two of you seem to share a common now. But that normalcy is only apparent because the phone signal travels the two hundred miles so quickly. During a phone conversation with someone much farther away, say on the Moon, you would notice a strange 1.3 second time lag because the Moon is 1.3 light seconds away from Earth. Suppose you were to look at your correct clock on Earth and notice it is midnight. What time would it be on the Moon, according to your clock? This is not a good question. A more sensible question is, “What events on the Moon are simultaneous with midnight on Earth, according to my clock?” You cannot look and see immediately. You will have to wait 1.3 seconds at least because it takes any signal that long to reach you from the Moon. If an asteroid were to strike the Moon, and you were to see the explosion through your Earth telescope at 1.3 seconds after midnight, then you could compute later that the asteroid must have struck the Moon at midnight. If you want to know what is presently happening on the other side of the Milky Way, you will have a much longer wait. So, the moral is that the collection of events comprising your present is something you have to compute; you cannot directly perceive those events at once.
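The back-dating computation described above is simple arithmetic: subtract the light-travel time from the observed time. A minimal sketch, using the familiar round figure for the Earth-Moon distance:

```python
C = 299_792_458.0  # speed of light, m/s

def light_delay(distance_m):
    """Seconds for light to cross the given distance."""
    return distance_m / C

def emission_time(arrival_time_s, distance_m):
    """Back-date an observation: when the signal actually left its source."""
    return arrival_time_s - light_delay(distance_m)

moon = 3.844e8  # average Earth-Moon distance, meters
print(round(light_delay(moon), 1))  # 1.3 -- the lag mentioned in the text

# An explosion seen 1.3 s after midnight (taking midnight as t = 0)
# must have happened at about t = 0, that is, at midnight:
print(round(emission_time(1.3, moon), 1))  # 0.0
```

The same subtraction, with vastly larger distances, is what lets astronomers say the light from the North Star left it centuries ago.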
To continue advancing a pro-B-theory argument against an objective present, notice the difference in time between your clock which is stationary on Earth and the time of a pilot using a clock in a spaceship that is flying by you at high speed. Assume the spaceship flies very close to you and that the two clocks are synchronized and are working perfectly and they now show the time is midnight at the flyby. According to the special theory of relativity, the collection of events across the universe that you eventually compute and say occurs now at midnight, necessarily must be very different from the collection of events that the spaceship traveler computes and says occurs at midnight. You and the person on the spaceship probably will not notice much of a difference for an event at the end of your street or even for an event on another continent, but you will begin to notice the difference for an event on the Moon and even more so for an event somewhere across the Milky Way or, worse yet, for an event in the Andromeda galaxy.
When two people disagree about what events are present events because the two are in motion relative to each other, the direction of the motion makes a significant difference. If the spaceship is flying toward Andromeda and away from you, then the spaceship’s now (what it judges to be a present event) would include events on Andromeda that occurred thousands of years before you were born. If the spaceship is flying away from Andromeda, the spaceship’s now would include events on Andromeda that occur thousands of years in your future. Also, the difference in nows is more extreme the faster the spaceship’s speed as it flies by you. The implication, says the B theorist, is that there are a great many different nows and nobody’s now is the only correct one.
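The size of these disagreements can be estimated with the standard relativity-of-simultaneity relation, a shift of vd/c² for motion at speed v directly toward or away from an event at distance d. The sketch below uses the commonly cited distance to the Andromeda galaxy; the particular speeds are illustrative assumptions:

```python
C = 299_792_458.0   # speed of light, m/s
LY = 9.4607e15      # one light-year, meters
YEAR = 3.156e7      # seconds per year

def simultaneity_shift_years(v_m_per_s, distance_ly):
    """By how many years two observers' 'nows' at a distant place differ
    when one moves at speed v toward or away from it (shift = v*d/c^2)."""
    d = distance_ly * LY
    return v_m_per_s * d / C**2 / YEAR

andromeda = 2.5e6  # distance in light-years (commonly cited figure)

# Even at walking speed (~1.4 m/s), the shift is a matter of days:
print(round(simultaneity_shift_years(1.4, andromeda) * 365, 1))  # about 4.3 days

# At half the speed of light, the shift exceeds a million years:
print(round(simultaneity_shift_years(0.5 * C, andromeda)))
```

This is why the text says a fast-moving traveler's now at Andromeda can differ from yours by thousands of years, while neighbors on Earth never notice any discrepancy.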
To make a similar point in the language of mathematical physics, something appropriately called a now would be an equivalence class of events that occur at the same time. But because Einstein showed that time is relative to reference frame, there are different nows for different reference frames, so the notion of now is not frame-independent and thus is not objective, contra the philosophical position of the A-theorist.
When the B-theorist says there is no fact of the matter about whether a distant explosion has happened, the A-theorist will usually disagree and say, regardless of your limitations on what knowledge you have, the explosion has occurred now or it has not.
For ordinary discussions about events on Earth, a reference frame is customarily used in which the Earth is not moving. And since we all move at slow speeds relative to each other on Earth and do not experience very different gravitational forces and do not consider very distant phenomena, we can agree for practical purposes on Earth about what is simultaneous with what.
Opponents of an objective present frequently point out that none of the fundamental laws of physics pick out a present moment. Scientists frequently do apply some law of science while assigning, say, t0 to be the temporal coordinate of the present moment, then they go on to calculate this or that. This insertion of the fact that some value of the time variable t is the present time is an initial condition of the situation to which the law is being applied, and is not part of the law itself. The basic laws themselves treat all times equally. If science’s laws do not need the present, then it is not real, say the B theorists. The counterargument is that it is the mistake of scientism to suppose that if something is not in our current theories, then it must not be real. France is real, but it is not mentioned in any scientific law.
In any discussion about whether the now is objective, one needs to remember that the term objective has different senses. There is objective in the sense of not being relative to the reference frame, and there is objective in the sense of not being mind-dependent, and there is objective in the sense of not being anthropocentric. Proponents of the B-theory say the now is not objective in any of these senses.
There is considerable debate in the philosophical literature about whether the present moments are so special that the laws should somehow recognize them. It is pointed out that even Einstein said, “There is something essential about the Now which is just outside the realm of science.” In 1925, the influential philosopher of science Hans Reichenbach criticized the block theory’s treatment of the present:
In the condition of the world, a cross-section called the present is distinguished; the ‘now’ has objective significance. Even when no human being is alive any longer, there is a ‘now’….
This claim has met stiff resistance. Earlier, in 1915, Bertrand Russell had objected to giving the present any special ontological standing:
In a world in which there was no experience, there would be no past, present, or future, but there might well be earlier and later (Russell 1915, p. 212).
Later, Rudolf Carnap added that a belief in the present is a matter for psychology, not physics.
The B-camp says belief in a global now is a product of our falsely supposing that everything we see is happening now, when actually we are not factoring in the finite speed of light and sound. Proponents of the non-objectivity of the present frequently claim that a proper analysis of time talk should treat the phrases the present and now as indexical terms which refer to the time at which the phrases are uttered by the speaker, and so their relativity to us speakers shows the essential subjectivity of the present. A-theorists usually do not accept these criticisms.
Carlo Rovelli accepted the objectivity of the flow of time, as did Henri Bergson, but he argued against Bergson as follows:
The mistake is to take the complex experience of time that we have and to assume that all aspects of it are general, and must underpin nature in general. This is a serious mistake, and is the first mistake that Bergson makes. This is a very common mistake…, [and] it is the central source of confusion in the debate about time.
Rovelli would be apt to say that there is an objective difference between up and down, but it would be a mistake to generalize this and assume it applies to all people in all circumstances when in fact it is merely a product of the fact that we live on Earth’s surface and assume down is toward the center of Earth. In outer space, far from the influence of Earth or other significant masses, there is no clear distinction between up and down. One mistake too often made by overgeneralizing this way is to assume that your own notion of whether a distant event is happening now must apply generally to all other persons and events.
There are interesting issues about the now in the philosophy of religion. For one example, Norman Kretzmann has argued that if God is omniscient, then He knows what time it is, and to know this, says Kretzmann, God must always be changing because God’s knowledge keeps changing. Therefore, there is an incompatibility between God’s being omniscient and God’s being immutable.
Disagreement about the now is an ongoing feature of debate in the philosophy of time, and there are many subtle moves made by advocates on each side of the issue. (Baron 2018) provides a broad overview of the debate about whether relativistic physics disallows an objective present. For an extended defense of the claim that the now is not subjective and that there is temporal becoming, see (Arthur 2019).
There is philosophical debate about the duration of the present. Most presentists take the present to be an instantaneous slice of reality. But others speak of the specious present, the unit of time that humans perceive. This issue is discussed in section 17 on perceiving time.
c. Persistence, Four-Dimensionalism, and Temporal Parts
Eternalism differs from four-dimensionalism. Eternalism is the thesis that the present, past, and future are equally real, whereas four-dimensionalism says the ontologically basic objects are four-dimensional events and ordinary objects referred to in everyday discourse are three-dimensional slices of 4-d space-time. However, most four-dimensionalists do accept eternalism. Almost all eternalists and four-dimensionalists accept McTaggart’s B-theory of time.
Four-dimensionalism does not imply that time is a spatial dimension. When a four-dimensionalist represents time relative to a reference frame in a four-dimensional diagram, say, a Minkowski diagram, time is a special one of the four-dimensions of this mathematical space, not an arbitrary one. Using this representation technique does not imply that a four-dimensionalist is committed to the claim that real, physical space itself is four-dimensional, but only that space-time is.
Four-dimensionalists take a stand on the philosophical issue of endurance vs. perdurance. Some objects last longer than others, so we say they persist longer. But there is no philosophical consensus about how to understand persistence. Objects are traditionally said to persist by enduring over some time interval. At any time during the interval the whole of the object exists. Not so for perduring objects. Perduring objects are said, instead, to persist by perduring. They do not exist wholly at a single instant but rather exist over a stretch of time. These objects do not pass through time; they do not endure; instead, they extend through time. A football game does not wholly exist at one instant; it extends over an interval of time. The issue is whether we can or should say the same for electrons and people. Technically expressed, the controversial issue is whether or not persisting things are (or are best treated as) divisible into temporal parts.
The perduring object persists by being the sum or fusion of a series of its temporal parts (also called its temporal stages). Instantaneous temporal parts are called temporal slices and time slices. For example, a forty-year-old man might be treated as being a four-dimensional perduring object consisting of his three temporal stages that we call his childhood, his middle age, and his future old age. But his right arm is also a temporal part that has perdured for forty years.
Although the concept of temporal parts is more likely to be used by a four-dimensionalist, here is a definition of the concept from Judith Jarvis Thomson in terms of three-dimensional objects:
Let object O exist at least from time t0 to time t3. A temporal part P of O is an object that begins to exist at some time t1 no earlier than t0, goes out of existence at some time t2 no later than t3, and takes up some portion of the space that O takes up for all the time that P exists.
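The temporal half of Thomson's definition can be stated as a small predicate over intervals. A minimal sketch, treating each object simply as a (start, end) pair of times and deliberately ignoring the spatial-overlap clause, which would require a richer representation:

```python
def is_temporal_part_interval(part, whole):
    """Check Thomson's temporal conditions: the part's lifetime (t1, t2)
    must fall within the whole's lifetime (t0, t3), i.e. t0 <= t1 and t2 <= t3.
    Spatial overlap is assumed here, not modeled."""
    t1, t2 = part
    t0, t3 = whole
    return t0 <= t1 and t2 <= t3

# A man's childhood (years 0-18) as a temporal part of his life (years 0-40):
print(is_temporal_part_interval((0, 18), (0, 40)))   # True
# An interval starting before the man exists fails the condition:
print(is_temporal_part_interval((-5, 18), (0, 40)))  # False
```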
Four-dimensionalists, by contrast, think of physical objects as regions of space-time and as having temporal parts that extend along all four dimensions of the object. A more detailed presentation of these temporal parts should say whether four-dimensional objects have their spatiotemporal parts essentially.
David Lewis offers the following, fairly well-accepted definitions of perdurance and endurance:
Something perdures iff it persists by having different temporal parts, or stages, at different times, though no one part of it is wholly present at more than one time; whereas it endures iff it persists by being wholly present at more than one time.
The term “iff” stands for “if and only if.” Given a sequence of temporal parts, how do we know whether they compose a single perduring object? One answer, given by Hans Reichenbach, Ted Sider, and others, is that they compose a single object if the sequence falls under a causal law so that temporal parts of the perduring object cause other temporal parts of the object. Philosophers of time with a distaste for the concept of causality oppose this answer.
According to David Lewis in On the Plurality of Worlds, the primary argument for perdurantism is that it has an easier time solving what he calls the problem of temporary intrinsics, of which the Heraclitus Paradox is one example. The Heraclitus Paradox is the problem, first introduced by Heraclitus of ancient Greece, of explaining our not being able to step into the same river twice because the water is different the second time. The mereological essentialist agrees with Heraclitus, but our common sense says Heraclitus is mistaken because people often step into the same river twice. Who is really making the mistake?
The advocate of endurance has trouble showing that Heraclitus is mistaken, says Lewis. We do not step into two different rivers, do we? They are the same river. Yet the river has two different intrinsic properties, namely being a collection of water we stepped in the first time and a collection of water we stepped in the second time; but, by Leibniz’s Law of the Indiscernibility of Identicals, identical objects cannot have different intrinsic properties. So, the advocate of endurance has trouble escaping the Heraclitus Paradox. So does the mereological essentialist.
A 4-dimensionalist who advocates perdurance says the proper metaphysical analysis of the Heraclitus Paradox is that we can step into the same river twice by stepping into two different temporal parts of the same 4-dimensional river. Similarly, we cannot see a football game at a moment; we can see only a momentary temporal part of the 4D game.
For more examination of the issue with detailed arguments for and against perdurance and endurance, see (Wasserman, 2018), (Carroll and Markosian 2010, pp. 173-7), and especially the article “Persistence in Time” in this encyclopedia.
d. Truth-Values of Tensed Sentences
The above disputes about presentism, the growing-past theory, and the block theory have taken a linguistic turn by focusing upon a related question about language: “Are predictions true or false at the time they are uttered?” Those who believe in the block-universe (and thus in the determinate reality of the future) will answer “Yes,” while a “No” will be given by presentists and advocates of the growing-past.
The issue is whether contingent sentences uttered now about future events are true or false now rather than true or false only in the future at the time the predicted event is supposed to occur. For example, suppose someone says, “Tomorrow the admiral will start a sea battle.” And suppose that the next day the admiral does order a sneak attack on the enemy ships which starts a sea battle. The eternalist says that, if this is so, then the sentence token about the sea battle was true yesterday at the time it was uttered. Truth is eternal or fixed or unchangeable by current actions, eternalists say, and the predicate is true is a timeless or tenseless predicate, not one that merely means is true now. The contingent sentence spoken now has a truth-maker within the block at a future time, even though the event has not yet happened and so the speaker has no access to that truth-maker. These B-theory philosophers point favorably to the ancient Greek philosopher Chrysippus who was convinced that a contingent sentence about the future is simply true or false now, even if we do not know which.
Many other philosophers, usually in McTaggart’s A-camp, agree with Aristotle’s suggestion that the sentence about the future sea battle is not true (or false) until the battle occurs (or does not). Predictions fall into the truth-value gap. This position that contingent sentences have no classical truth-values when uttered is called the doctrine of the open future and also the Aristotelian position because many researchers throughout history have taken Aristotle to have been holding that position in chapter 9 of his On Interpretation—although today it is not so clear that Aristotle himself held the position.
One principal motive for adopting the Aristotelian position arises from the belief that, if sentences about future human actions are now true, then humans are determined to perform those actions, and so humans have no free will. To defend free will, we must deny truth-values to predictions.
This Aristotelian argument against predictions being true or false has been discussed as much as any in the history of philosophy, and it faces a series of challenges. First, if there really is no free will, or if free will is compatible with determinism, then the motivation to deny truth-values to predictions is undermined.
Second, according to many compatibilists, but not all, your choices do affect the world as the libertarians believe they must; but, if it is true that you will perform an action in the future, it does not follow that now you will not perform it freely, nor that you were not free to do otherwise if your intentions had been different back then, but only that you will not do otherwise. For more on this point about modal logic, see the discussion of it in Foreknowledge and Free Will.
A third challenge, from Quine and others, claims the Aristotelian position wreaks havoc with the logical system we use to reason and argue with predictions. For example, here is a deductively valid argument, presumably:
If there will be a sea battle tomorrow, then we should wake up the admiral.
There will be a sea battle tomorrow.
So, we should wake up the admiral.
Without both premises in this argument having truth-values, that is, being true or false, we cannot properly assess the argument using the usual standards of deductive validity because this standard is about the relationships among truth-values of the component sentences—that a valid argument cannot possibly have true premises and a false conclusion. Unfortunately, the Aristotelian position says that some of these component sentences are neither true nor false. So, deductive logic does not apply. Surely, then, the Aristotelian position is implausible.
In reaction to this third challenge, proponents of the Aristotelian argument say that if Quine would embrace tensed propositions and expand his classical logic to a tense logic, he could avoid those difficulties in assessing the validity of arguments that involve sentences having future tense.
Quine has claimed that the analysts of our talk involving time should in principle be able to eliminate the temporal indexical words such as now and tomorrow because their removal is needed for fixed truth and falsity of our sentences [fixed in the sense of being eternal or complete sentences whose truth-values are not relative to the situation and time of utterance because the indexicals and indicator words have been replaced by expressions for specific times, places and names, and whose verbs are treated as timeless and tenseless even if they have grammatical tenses], and having fixed truth-values is crucial for the logical system used to clarify science. “To formulate logical laws in such a way as not to depend thus upon the assumption of fixed truth and falsity would be decidedly awkward and complicated, and wholly unrewarding,” says Quine. For a criticism of Quine’s treatment of indexicals, see (Slater 2012, p. 72).
Philosophers are divided on all these issues.
e. Essentially-Tensed Facts
Using a tensed verb is a grammatical way of locating an event in time relative to the present. The English language distinguishes “Her death has happened” from “Her death will happen” by an implicit reference to now. However, English also expresses time in other ways: with the adverbial phrases such as soon and twenty-three days ago and frequently, with the adjective phrases now and twenty-three days old, and with the prepositions until and since.
Philosophers have asked what we are basically committed to when we use tense to locate an event in time. There are two principal answers: tenses are objective, and tenses are subjective. The two answers have given rise to two competing camps of philosophers of time.
The first answer is that tenses represent objective features of reality that are not captured by the B-theory, nor by eternalism, nor by the block-universe approach. This philosophical theory is said to “take tense seriously” and is called the tensed theory of time. The theory claims that, when we learn the truth-values of certain tensed sentences, we obtain knowledge of time which tenseless sentences do not and cannot provide, for example, that such and such a time is the present time. Tenses are almost the same as what is represented by positions in McTaggart‘s A-series, so the theory that takes tense seriously is commonly called the A-theory of tense, and its advocates are called tensers.
A second, contrary answer to the question of the significance of tenses is that they are merely subjective, artefacts of consciousness. Tensed terms do have an indexical feature which is specific to the subject doing the speaking, but this feature has no ontological significance. Saying the event happened rather than is happening indicates that the subject or speaker said this after the event happened rather than before or during the event. Tenses are about speakers, not about some other important ontological characteristic of time in the world. This theory is the B-theory of tense, and its advocates are called detensers. The detenser W.V.O. Quine expressed the position this way:
Our ordinary language shows a tiresome bias in its treatment of time. Relations of date are exalted grammatically…. This bias is of itself an inelegance, or breach of theoretical simplicity. Moreover, the form that it takes—that of requiring that every verb form show a tense—is peculiarly productive of needless complications, since it demands lip service to time even when time is farthest from our thoughts. Hence in fashioning canonical notations it is usual to drop tense distinctions (Word and Object §36).
The philosophical disagreement about tenses is not so much about tenses in the grammatical sense, but rather about the significance of the distinctions of past, present, and future which those tenses are used to mark.
The controversy is often presented as a controversy about whether tensed facts exist, with advocates of the tenseless theory objecting to tensed facts and advocates of the tensed theory promoting them as essential. The primary function of tensed facts is to make tensed sentences true, to be their truthmakers.
The B-theorist says tensed facts are not needed to account for why tensed sentences get the truth values they do.
Consider the tensed sentence, “Queen Anne of Great Britain died.” The A-theorist says the truthmaker is simply the tensed fact that the death has pastness. The B-theorist gives a more complicated answer by saying the truthmaker is the fact that the time of Queen Anne’s death is-less-than the time of uttering the above sentence. Notice that the B-answer does not use any words in the past tense. According to the classical B-theorist, the use of tense (and more importantly, any appeal to tensed facts) is an extraneous and eliminable feature of our language at the fundamental level, as are all other uses of the terminology of the A-series (except in trivial instances such as “The A-series is constructed using A-facts”).
This B-theory analysis is challenged by the tenser’s A-theory on the grounds that it can succeed only for utterances or readings or inscriptions, but the A-theorist points out that a proposition can be true even if never uttered, never read, and never inscribed.
There are other challenges to the B-theory. Roderick Chisholm and A.N. Prior claim that the word “is” in the sentence “It is now midnight” is essentially present-tensed because there is no adequate translation using only tenseless verbs. Trying to give a B-style analysis of it, such as, “There is a time t such that t = midnight,” is to miss the essential reference to the present in the original sentence because the original sentence is not always true, but the sentence “There is a time t such that t = midnight” is always true. So, the tenseless analysis fails. There is no escape from this criticism by adding “and t is now” because this last indexical phrase needs its own analysis, and we are starting a vicious regress. John Perry famously explored this argument in his 1979 article, “The Problem of the Essential Indexical.”
Prior, in (Prior 1959), supported the tensed A-theory by arguing that after experiencing a painful event,
one says, e.g., “Thank goodness that’s over,” and [this]…says something which it is impossible that any use of a tenseless copula with a date should convey. It certainly doesn’t mean the same as, e.g., “Thank goodness the date of the conclusion of that thing is Friday, June 15, 1954,” even if it be said then. (Nor, for that matter, does it mean “Thank goodness the conclusion of that thing is contemporaneous with this utterance.” Why should anyone thank goodness for that?).
Prior’s criticism of the B-theory involves the reasonableness of our saying of some painful, past event, “Thank goodness that is over.” The B-theorist cannot explain this reasonableness, says Prior. The only way to make sense of our saying “Thank goodness that is over” is to assume we are thankful for the A-fact that the pain event is in the past, that is, we are thankful for the pastness. But if so, then the A-theory is correct and the B-theory is incorrect.
One B-theorist response is simply to disagree with Prior that it is improper for a B-theorist to thank goodness that the end of their pain happens before their present utterance, even though this is an eternal B-fact. Still another response from the B-theorist comes from the 4-dimensionalist who says that as 4-dimensional beings it is proper for us to care more about our later time-slices than our earlier time-slices. If so, then it is reasonable to thank goodness that the time slice at the end of the pain occurs before the time slice in which we are saying, “Thank goodness that is over.” Admittedly this is caring about an eternal B-fact. So, Prior’s premise [that the only way to make sense of our saying “Thank goodness that is over” is to assume we are thankful for the A-fact that the pain event has pastness] is a faulty premise, and Prior’s argument for the A-theory is unsuccessful.
D.H. Mellor and J.J.C. Smart, both proponents of the B-theory, agree that tensed talk is important, and can be true, and even be essential for understanding how we think and speak; but Mellor and Smart claim that tensed talk is not essential for describing extra-linguistic reality and that the extra-linguistic reality does not contain tensed facts corresponding to true, tensed talk. These two philosophers, and many other philosophers who “do not take tense seriously,” advocate a newer tenseless B-theory by saying the truth conditions of any tensed, declarative sentence can be explained without tensed facts even if Chisholm and Prior and other A-theorists are correct that some tensed sentences in English cannot be adequately translated into tenseless ones.
The truth conditions of a sentence are the conditions which must be satisfied in the world in order for the sentence to be true. The sentence “Snow is white” is true on the condition that snow is white. More particularly, it is true if whatever is referred to by the term ‘snow’ satisfies the predicate ‘is white’. Regarding if-then sentences, the conditions under which the sentence “If it is snowing, then it is cold” is true are that it is not both true that it is snowing and false that it is cold. Other analyses are offered for the truth conditions of sentences that are more complex grammatically. Alfred Tarski has provided these analyses in his semantic theory of truth.
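The truth condition just given for the if-then sentence can be checked mechanically. Here is a minimal sketch in Python (the function name `material_conditional` is ours, not the article’s), treating the if-then as a material conditional:

```python
# Truth conditions for "If it is snowing, then it is cold", read as a
# material conditional: the sentence is true unless the antecedent is
# true and the consequent is false.
def material_conditional(snowing: bool, cold: bool) -> bool:
    return not (snowing and not cold)

# The full truth table: false only in the row (snowing=True, cold=False).
table = {(s, c): material_conditional(s, c)
         for s in (True, False) for c in (True, False)}
```

Enumerating the table confirms that exactly one of the four rows makes the conditional false.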
Mellor and Smart agree that truth conditions can adequately express the meaning of tensed sentences or all that is important about the meaning when it comes to describing objective reality. This is a philosophically controversial point, but Mellor and Smart accept it, and argue that therefore there is really no need for tensed facts and tensed properties. The untranslatability of some tensed sentences merely shows a fault with ordinary language‘s ability to characterize objective, tenseless reality. If the B-theory, in accounting for the truth conditions of an A-sentence, fails to account for the full meaning of the A-sentence, then this is because of a fault with the A-sentence, not the B-theory.
Let us make the same point in other words. According to the newer B-theory of Mellor and Smart, if I am speaking to you and say, “It is now midnight,” then this sentence admittedly cannot be translated into tenseless terminology without some loss of meaning, but the truth conditions can be explained fully with tenseless terminology. The truth conditions of “It is now midnight” are that my utterance occurs (in the tenseless sense of occurs) at very nearly the same time as your hearing the utterance, which in turn is the same time as when our standard clock declares the time to be midnight in our reference frame. In brief, it is true just in case it is uttered at midnight. Notice that no tensed facts are appealed to in this explanation of the truth conditions.
Similarly, an advocate of the new tenseless theory will say it is not the pastness of the painful event that explains why I say, “Thank goodness that’s over” after exiting the dentist’s chair. I say it because I believe that the time of the occurrence of that utterance is greater than the time of the occurrence of the painful event, and because I am glad about this; and even though it was true even last month that the one time occurred before the other, I am happy to learn this. Of course, I would be even gladder if there were no pain at any time. I may not be consciously thinking about the time of the utterance when I make it; nevertheless, that time is what helps explain what I am glad about. Being thankful for the pastness of the painful event provides a simpler explanation, actually a simplistic explanation, but not a better explanation.
In addition, it is claimed by Mellor and other new B-theorists that tenseless sentences can be used to explain the logical relations between tensed sentences; they can be used to explain why one tensed sentence implies another, is inconsistent with yet another, and so forth. According to this new theory of tenseless time, once it is established that the truth conditions of tensed sentences can be explained without utilizing tensed facts, then Ockham’s Razor is applied. If we can do without essentially-tensed facts, then we should say essentially-tensed facts do not exist.
To summarize, tensed facts were presumed by the A-theory to be needed to be the truthmakers for the truth of tensed talk; but proponents of the new B-theory claim their analysis shows that ordinary tenseless facts are adequate. The B-theory concludes that we should “not take tense seriously” in the sense of requiring tensed facts to account for the truth and falsity of sentences involving tenses because tensed facts are not actually needed.
Proponents of the tensed theory of time do not agree with this conclusion. They will insist there are irreducible A-properties and that what I am glad about when a painful event is over is that the event is earlier than now, that is, has pastness. Quentin Smith says, more generally, that the “new tenseless theory of time is faced with insurmountable problems, and that it ought to be abandoned in favor of the tensed theory.”
E.J. Lowe, an advocate of the A-theory, opposed the B-theory because it conflicts so much with the commonsense image of time:
I consider it to be a distinct merit of the tensed view of time that it delivers this verdict, for it surely coincides with the verdict of common sense (Lowe, 1998, p. 104).
Lowe argued that no genuine event can satisfy a tenseless predicate, and no truth can be made true by B-theory truth conditions because all statements of truth conditions are tensed.
So, the philosophical debate continues over whether tensed concepts have semantical priority over untensed concepts, and whether tensed facts have ontological priority over untensed facts.
15. The Arrow of Time
Unlike space, time appears to have an arrow, a direction or an orientation. Time’s arrow points from the past to the future. Arthur Eddington first used the term “time’s arrow” in 1927. Philosophers subsequently have disagreed about how to explain the arrow. They do not agree on why the future is not just like the past. If you are shown an ordinary movie and also shown the same movie running in reverse, you have no trouble telling which is the one in which time’s arrow is pointing in the wrong direction. It’s the movie with surprising processes in which people walk backwards up steps, apples spontaneously rise from the ground and attach to the apple tree, and an egg yolk along with scattered bits of eggshell collect into an unbroken egg. Yet the fundamental laws of physics permit these surprising processes, and these laws give no indication why the two directions should not be equally common among all the processes we experience in our lives. Reconciling our experience of processes going only one way with the fact that scientific laws permit them to go either direction in time is called “The Problem of the Arrow of Time.”
Those who accept time’s directedness, and not all do, divide into two main philosophical camps regarding this arrow of time. One claims there exists an intrinsic arrow of time itself that is due to its flow or to its passage or to more events becoming real. Those in the second camp claim there is only an extrinsic arrow due to the fact that so many of nature’s processes spontaneously go in only one direction. Those in the intrinsic camp often accuse those in the extrinsic camp of scientism. Those in the extrinsic camp often accuse those in the intrinsic camp of over-emphasizing the phenomenology of temporal awareness.
The presence of the arrow implies, among other things, that tomorrow will be different from today in many ways. Can this fact be explained, or is it simply a brute fact that we must accept? Some wish (a) to drill down to the most basic description of change and find a significant, fundamental, dynamic law of physics that requires one-way behavior in time, or (b) to assume a special feature at the origin of time that directs time to start out going in only one direction and keep going that way, or (c) to assume the arrow of time is an intrinsic but otherwise inexplicable feature of time itself.
Many researchers in the extrinsic camp claim that the presence of time’s arrow is basically a statistical issue involving increased disorder and randomization of the constituent molecules (the technical term is “increased entropy”) plus a special low-entropy configuration of nature early in the cosmic Big Bang, with the target of the arrow being thermodynamic equilibrium in the very distant future when the universe’s average temperature approaches absolute zero. These researchers point to the Second Law of thermodynamics as the law that gives a quantitative description of entropy increase and thus of time’s arrow. In short, there is an arrow because the universe is slowly becoming randomized.
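The statistical point can be illustrated with a toy model that is not part of the article: the Ehrenfest urn, in which particles hop at random between two halves of a box. Starting from a special low-entropy configuration (all particles on one side), the system almost always drifts toward the mixed equilibrium, even though each elementary hop is just as reversible as its opposite:

```python
import random

def ehrenfest(n_particles=100, steps=2000, seed=0):
    """Ehrenfest urn: all particles start on the left (low entropy);
    each step one randomly chosen particle switches sides."""
    random.seed(seed)
    left = n_particles  # special low-entropy initial condition
    history = [left]
    for _ in range(steps):
        # A uniformly chosen particle is on the left with probability
        # left/n_particles; it moves to the other side.
        if random.random() < left / n_particles:
            left -= 1
        else:
            left += 1
        history.append(left)
    return history

history = ehrenfest()
# The count drifts toward the equilibrium value ~n/2 and then
# fluctuates near it: a one-way trend built from reversible moves.
```

The model is only an analogy for the thermodynamic story, but it shows how a directional tendency can arise purely from statistics plus a special initial condition.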
Researchers in the intrinsic camp disagree with this kind of explanation of the arrow. They say the one-way character of time is not fundamentally a statistical issue involving processes but rather is intimately tied to the inherent nature of time itself such as its intrinsic and uninterrupted flow or passage. That is why they believe arrow reversal would imply time reversal, and vice versa, whereas those in the extrinsic camp would say there could be arrow reversal without time reversal.
There is a wide variety of special kinds of processes that those in the extrinsic camp call mini-arrows. Here are some especially important ones: The human mind can know the past more easily than the future (the knowledge arrow). Heat flows from hot to cold (the thermodynamic arrow). Causes precede their effects (the causal arrow). The universe expands and does not shrink (the cosmological arrow). Light rays expand away from a light bulb rather than converge into it (the electromagnetic arrow). We remember the past, not the future (the memory arrow). We can act to affect the future, but not the past (the action arrow). These mini-arrows are deep and interesting asymmetries of nature, and philosophers of physics would like to know how the mini-arrows are related to each other. This is called the taxonomy problem. There are many proposed solutions.
Some philosophers have even asked whether there could be distant regions of space and time where time’s arrow points in reverse compared to our arrow. If so, would adults there naturally walk backwards on the way to their infancy while they remember the future?
For more discussion of the arrow, see the article “The Arrow of Time”.
16. Temporal Logic
Temporal logic is the representation of reasoning about time and temporal information by using the methods of symbolic logic in order to formalize which statements imply which others. For example, in McTaggart’s B-series, the most important relation is the happens-before relation on events. Logicians have asked what principles this relation must obey in order to properly account for our reasoning about time and temporal information.
Here is one suggestion. Consider this informally valid reasoning:
Alice’s arrival at the train station happens before Bob’s. Therefore, Bob’s arrival at the station does not happen before Alice’s.
Let us translate this into classical predicate logic using a domain of instantaneous events, where the individual constant ‘a‘ denotes Alice’s arrival at the train station, and ‘b‘ denotes Bob’s arrival at the train station. Let the two-place or two-argument relation ‘Bxy‘ be interpreted as x happens before y. The direct translation of the above informal argument produces:
Bab
——-
~Bba
The symbol ‘~’ is the negation operator; some logicians prefer to use the symbols ‘¬’ and ‘–’. Unfortunately, this simple formal argument is invalid. To make the argument valid, we can add a semantic principle about the happens-before relation, namely, the premise that the B relation is asymmetric. That is, we can add this additional premise to the argument:
∀x∀y[Bxy → ~Byx]
The symbol ‘∀x‘ is the universal quantifier on the variable ‘x‘. Some logicians prefer to use ‘(x)‘ for the universal quantifier. The symbol ‘→‘ is the conditional operator or if-then operator; some logicians prefer to use the symbol ‘⊃‘ instead.
In other informally valid reasoning, we discover a need to make even more assumptions about the happens-before relation. For example, suppose Alice arrives at the train station before Bob, and suppose Bob arrives there before Carol. Is it valid reasoning to infer that Alice arrives before Carol? Yes, but if we translate directly into classical predicate logic we get this invalid argument:
Bab
Bbc
——
Bac
To make this argument valid, we can add the premise that says the happens-before relation is transitive, that is:
∀x∀y∀z [(Bxy & Byz) → Bxz]
The symbol ‘&’ represents the conjunction operation. Some logicians prefer to use either the symbol ‘·‘ or ‘∧’ for conjunction. The transitivity of B is a principle we may want to add to our temporal logic.
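The effect of adding the transitivity premise can be confirmed by brute force. The following sketch (ours, not the article’s) enumerates every possible interpretation of B on the three-event domain {a, b, c} and checks that the argument from Bab and Bbc to Bac has countermodels without the transitivity premise but none with it:

```python
from itertools import product

DOMAIN = ['a', 'b', 'c']
PAIRS = [(x, y) for x in DOMAIN for y in DOMAIN]

def all_relations():
    """Every possible interpretation of B on the three-event domain."""
    for bits in product([False, True], repeat=len(PAIRS)):
        yield {pair: bit for pair, bit in zip(PAIRS, bits)}

def transitive(B):
    return all((not (B[(x, y)] and B[(y, z)])) or B[(x, z)]
               for x in DOMAIN for y in DOMAIN for z in DOMAIN)

# Without the transitivity premise, the argument Bab, Bbc ∴ Bac is
# invalid: some interpretation makes both premises true and the
# conclusion false.
assert any(B[('a', 'b')] and B[('b', 'c')] and not B[('a', 'c')]
           for B in all_relations())

# With transitivity added as a premise, every interpretation satisfying
# all the premises also satisfies the conclusion: the argument is valid.
assert all(B[('a', 'c')]
           for B in all_relations()
           if transitive(B) and B[('a', 'b')] and B[('b', 'c')])
```

This is exactly the semantic notion of validity at work: an argument is valid when no interpretation makes the premises true and the conclusion false.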
What other constraints should be placed on the B relation (when it is to be interpreted as the happens-before relation)? Here are some of the many suggestions:
∀x∀y{Bxy → [t(x) < t(y)]}. If x happens before y, then the time coordinate of x is less than the time coordinate of y. ‘t‘ is a one-argument function symbol.
∀x~Bxx. An event cannot happen before itself.
∀x∀y{[t(x) ≠ t(y)] → [Bxy v Byx]}. Any two non-simultaneous events are connected by the B relation. That is, there are no temporally unrelated pairs of events. (In his Critique of Pure Reason, Kant says this is an a priori necessary requirement about time.)
∀x∃yBxy. Time is infinite in the future.
∀x∀y(Bxy → ∃z(Bxz & Bzy)). B is dense in the sense that there is a third point event between any pair of non-simultaneous point events. This prevents quantized time.
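All of the suggested constraints are satisfied by at least one concrete model: take the point events to be rational time coordinates and interpret B as numerical less-than. A sketch (the particular sample of events is ours):

```python
from fractions import Fraction

# A concrete model: point events are rational time coordinates,
# and B is numerical less-than.
events = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3)]

def B(x, y):
    return x < y

# ∀x~Bxx: no event happens before itself.
assert all(not B(x, x) for x in events)

# ∀x∀y[Bxy → ~Byx]: asymmetry.
assert all(not (B(x, y) and B(y, x)) for x in events for y in events)

# Any two distinct (here, non-simultaneous) events are B-related.
assert all(B(x, y) or B(y, x) for x in events for y in events if x != y)

# Density: between any B-related pair there is a third rational point.
def witness_between(x, y):
    return (x + y) / 2  # a rational strictly between x and y

assert all(B(x, witness_between(x, y)) and B(witness_between(x, y), y)
           for x in events for y in events if B(x, y))
```

Note that density holds here because the rationals are dense; a model built on the integers would satisfy the other constraints but not this one.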
To incorporate the ideas of the theory of relativity, we might want to make the happens-before relation be three-valued instead of two-valued by having it relate two events plus a reference frame.
When we formalized these principles of reasoning about the happens-before relation by translating them into predicate logic, we said we were creating temporal logic. However, strictly speaking, a temporal logic is just a theory of temporal sentences expressed in a formal logic. Calling it a logic, as is commonly done, is a bit of an exaggeration; it is analogous to calling the formalization of Peano’s axioms of arithmetic the development of number logic. Our axioms about B are not axioms of predicate logic, but only of a theory that uses predicate logic and that presumes the logic is interpreted on a domain of instantaneous events, and that presumes B is not open to re-interpretation as are the other predicate letters of predicate logic, but is always to be interpreted as happens-before.
The more classical approach to temporal logic, however, does not add premises to arguments formalized in classical predicate logic as we have just been doing. The classical approach is via tense logic, a formalism that adds tense operators on propositions of propositional logic or predicate logic. A. N. Prior was the pioneer in the late 1950s. Michael Dummett and E. J. Lemmon also made major, early contributions to tense logic. Prior created this new logic to describe our reasoning involving time phrases such as now, happens before, twenty-three minutes afterward, at all times, and sometimes. He hoped that a precise, formal treatment of these concepts could lead to the resolution of some of the controversial philosophical issues about time.
Prior begins with an important assumption: that a proposition such as “Custer dies in Montana” can be true at one time and false at another time. That assumption is challenged by some philosophers, such as W.V.O. Quine, who prefer to avoid the use of this sort of proposition and who recommend that temporal logics use only sentences that are timelessly true or timelessly false.
Prior’s main original idea was to appreciate that time concepts are similar in structure to modal concepts such as it is possible that and it is necessary that. He adapted modal propositional logic for his tense logic by re-interpreting its propositional operators. Or we can say he added four new propositional operators. Here they are with examples of their intended interpretations using an arbitrary present-tensed proposition p.
Pp
“It has at some time been the case that p“
Fp
“It will at some time be the case that p”
Hp
“It has always been the case that p”
Gp
“It will always be the case that p”
‘Pp‘ might be interpreted also as at some past time it was the case that, or it once was the case that, or it once was that, all these being equivalent English phrases for the purposes of applying tense logic to English. None of the tense operators are truth-functional.
One standard system of tense logic is a variant of the S4.3 system of modal logic. In this formal tense logic, if p represents the present-tensed proposition “Custer dies in Montana,” then Pp represents “It has at some time been the case that Custer dies in Montana” which is equivalent in English to simply “Custer died in Montana.” So, we properly call ‘P‘ the past-tense operator. It represents a phrase that attaches to a sentence and produces another that is in the past tense.
Metaphysicians who are presentists are especially interested in this tense logic because, if Prior can make do with the variable p ranging only over present-tensed propositions, then this logic, with an appropriate semantics, may show how to eliminate any ontological commitment to the past (and future) while preserving the truth of past tense propositions that appear in biology books such as “There were dinosaurs” and “There was a time when the Earth did not exist.”
One axiom of tense logic is P(p v q) ↔ (Pp v Pq). It says that, for any two propositions p and q, at some past time it was the case that p or q if and only if either at some past time it was the case that p or at some past time (perhaps a different past time) it was the case that q.
If p is the proposition “Custer dies in Montana” and q is “Sitting Bull dies in Montana,” then P(p v q) ↔ (Pp v Pq) represents the assertion that:
Custer or Sitting Bull died in Montana if and only if either Custer died in Montana or Sitting Bull died in Montana.
The S4.3 system’s key axiom is the following equivalence. For all propositions p and q,
(Pp & Pq) ↔ [P(p & q) v P(p & Pq) v P(q & Pp)].
This axiom, when interpreted in tense logic, captures part of our ordinary conception of time as a linear succession of states of the world.
Another axiom of tense logic relates the past-tense operator P to H, the operator It has always been the case that. The axiom is:
Pp ↔ ~H~p.
This axiom of tense logic is analogous to the modal logic axiom that p is possible if and only if it is not necessary that not-p.
A tense logic will need additional axioms in order to express q has been true for the past two weeks. Prior and others have suggested a wide variety of additional axioms for tense logic. It is controversial whether to add axioms that express the topology of time,
for example that it comes to an end or does not come to an end or that time is like a line instead of a circle; the reason usually given is that this is an empirical matter, not a matter for logic to settle.
Regarding a semantics for tense logic, Prior had the idea that the truth or falsehood of a tensed proposition could be expressed in terms of truth-at-a-time. For example, the proposition Pp (it was once the case that p) is true-at-a-time t if and only if p is true-at-a-time earlier than t. This suggestion has led to extensive development of the formal semantics for tense logic.
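Prior’s truth-at-a-time idea can be given a toy implementation. The sketch below (the finite, linearly ordered timeline is our assumption; real formal semantics is more general) defines all four operators by quantifying over earlier or later times, and then verifies the duality Pp ↔ ~H~p at every time and for every valuation of p:

```python
from itertools import product

# A minimal truth-at-a-time semantics for Prior's four operators over
# a finite, linearly ordered set of times. A proposition is modeled as
# a function from times to truth-values.
TIMES = range(5)

def P(prop, t):  # it has at some time been the case that prop
    return any(prop(u) for u in TIMES if u < t)

def F(prop, t):  # it will at some time be the case that prop
    return any(prop(u) for u in TIMES if u > t)

def H(prop, t):  # it has always been the case that prop
    return all(prop(u) for u in TIMES if u < t)

def G(prop, t):  # it will always be the case that prop
    return all(prop(u) for u in TIMES if u > t)

# Verify the duality Pp ↔ ~H~p at every time, for every possible
# valuation of the atomic proposition p on this timeline.
for bits in product([False, True], repeat=len(TIMES)):
    p = lambda u, bits=bits: bits[u]
    assert all(P(p, t) == (not H(lambda u: not p(u), t)) for t in TIMES)
```

On this semantics the duality axiom comes out valid, just as its modal analogue (p is possible iff it is not necessary that not-p) is valid in possible-worlds semantics.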
Prior himself did not take a stand on which formal logic and formal semantics are correct for dealing with temporal expressions.
The concept of being in the past is usually treated by metaphysicians as a predicate that assigns properties to events, for example, “The event of Queen Anne’s dying has the property of being in the past”; but, in the tense logic just presented, the concept is treated as an operator P upon propositions, “It has at some time in the past been the case that Queen Anne is dying,” and this difference in treatment is objectionable to some metaphysicians.
The other major approach to temporal logic does not use a tense logic. Instead, it formalizes temporal reasoning within a first-order logic without modal-like tense operators. One method for developing ideas about temporal logic is the method of temporal arguments, which adds a temporal argument to any predicate involving time in order to indicate how its satisfaction depends on time. Instead of translating the x is resting predicate as Px, where P is a one-argument predicate, it could be translated into temporal predicate logic as the two-argument predicate Rxt, and this would be interpreted as saying x is resting at time t. P has been changed to a two-argument predicate R by adding a place for a temporal argument. The time variable t is treated as a new sort of variable requiring new axioms to more carefully specify what can be assumed about the nature of time.
Occasionally the method of temporal arguments uses a special constant symbol, say n, to denote now, the present time. This helps with the translation of common temporal sentences. For example, let the individual constant s denote Socrates, and let Rst be interpreted as “Socrates is resting at t.” The false sentence that Socrates has always been resting would be expressed in this first-order temporal logic as:
∀t(Ltn → Rst)
Here L is the two-argument predicate for numerically less than that mathematicians usually write as <. And we see the usefulness of having the symbol n.
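The method of temporal arguments is easy to model on a finite timeline. In this sketch, times are integers, n denotes the present, and the particular timeline and valuation of R are our own illustrative choices:

```python
# A finite model for the method of temporal arguments.
TIMES = range(10)
n = 7  # the constant denoting 'now'

resting_times = {2, 3, 5}           # times at which Socrates is resting
def L(t1, t2): return t1 < t2       # the less-than predicate
def R(s, t):   return t in resting_times

# "Socrates has always been resting": ∀t(Ltn → Rst).
always_resting = all(R('socrates', t) for t in TIMES if L(t, n))
# On this model the sentence comes out false, since R fails at, say, t = 0.
```

The constant n does the work that the indexical “now” does in ordinary speech: it picks out a particular time so that the quantified sentence has a fixed truth-value.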
If tense logic is developed using a Kripke semantics of possible worlds, then it is common to alter the accessibility relation between any two possible worlds by relativizing it to a time. The point is to show that some old possibilities are no longer possible. For example, a world in which Hillary Clinton becomes the first female U.S. president in 2016 was possible relative to the actual world of 2015, but not relative to the actual world of 2017. There are other complexities. Within a single world, if we are talking about a domain of people containing, say, Socrates, then we want the domain to vary with time since we want Socrates to exist at some times but not at others. Another complexity is that in any world, what event is simultaneous with what other event should be relativized to a reference frame.
Some temporal logics have a semantics that allows sentences to be neither true nor false. The first person to give a clear presentation of the implications of treating declarative sentences as being neither true nor false was the Polish logician Jan Lukasiewicz in 1920. To carry out Aristotle’s suggestion that future contingent sentences do not yet have truth-values, he developed a three-valued symbolic logic, with each grammatical declarative sentence having just one of the three truth-values True, or False, or Indeterminate [T, F, or I]. Contingent sentences about the future, such as, “There will be a sea battle tomorrow,” are assigned an I value in order to indicate the indeterminacy of the future. Truth tables for the connectives of propositional logic are redefined to maintain logical consistency and to maximally preserve our intuitions about truth and falsehood. See (Haack 1974) for more details about this application of three-valued logic.
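The redefined truth tables are compact enough to write out. In one standard presentation (sketched here with the three values coded numerically), negation, conjunction, and disjunction become arithmetic on the values 1, 1/2, and 0:

```python
# Lukasiewicz's three truth-values, coded numerically:
# True = 1.0, Indeterminate = 0.5, False = 0.0.
T, I, F = 1.0, 0.5, 0.0

def neg(a):        return 1 - a
def conj(a, b):    return min(a, b)
def disj(a, b):    return max(a, b)
def implies(a, b): return min(1, 1 - a + b)

# "There will be a sea battle tomorrow" gets the value I,
# and so does its negation: neither is (yet) true or false.
sea_battle = I
assert neg(sea_battle) == I

# Unlike classical logic, p v ~p is not always True:
assert disj(sea_battle, neg(sea_battle)) == I
```

The last assertion shows the notable cost of the scheme: the law of excluded middle fails for future contingents, which is one of the standard objections discussed in the literature.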
For an introduction to temporal logics and their formal semantics, see (Øhrstrøm and Hasle 1995).
17. Perceiving Time
Do we perceive time? If so, how? An overarching issue involves the reliability of our experience because mistakes, illusions and delusions are an ever-present danger. The principal philosophical issue about time and mind is to specify how time is represented in the mind and why it is represented that way instead of some other way; and the principal scientific issue in cognitive neuroscience is to uncover the neurological basis of our temporal experience.
Most experts would assert that we perceive time because our perception seems to transcend what impinges upon our present senses. This perceptual experience reveals time to us in many ways: (1) We notice some objects changing over time and some other objects persisting unchanged. (2) We detect some events succeeding one another. (3) We notice that some similar events have different durations. (4) We seem to automatically classify events as present, past, or future, and we treat those events differently depending upon how they are classified. For example, we worry more about future pain than past pain.
Neuroscientists and cognitive scientists know that these ways of experiencing time exist, but not why they exist. Humans do not need to consciously learn these skills any more than they need to learn how to be conscious. It’s just something that grows or is acquired naturally. It is something that appears due to a human being’s innate biological nature coupled with the prerequisites of a normal human environment—such as an adequate air supply, warmth, food, and water. A tulip could be given the same prerequisites, but it would never develop anything like our time consciousness. But neuroscientists do not yet understand the details of how our pre-set genetic program produces time consciousness, although there is agreement that the genes themselves are not conscious in any way.
A minority of philosophers, the panpsychists, would disagree with these neurophysiologists and say genes have proto-mental properties and proto-consciousness and even proto-consciousness of time. Critics remark sarcastically that our genes must also have the proto-ability to pay taxes. The philosopher Colin McGinn, who is not a panpsychist, has some sympathies with the panpsychist position. He says that genes:
contain information which is such that if we were to know it we would know the solution to the mind-body problem. In a certain sense, then, the genes are the greatest of philosophers, the repositories of valuable pieces of philosophical information. (McGinn 1999, p. 227)
No time cell or master clock has been discovered so far in the human body, despite much searching, so many neuroscientists have come to believe there are no such things to be found. Instead, the neurological basis of our time sense probably involves coordinated changes in a network of neurons (and glial cells, especially astrocytes) that somehow encodes time information. Our neurons fire all at once, yet somehow they are organized to produce a single conscious story in perceived, linear time. Although the details are not well understood by neuroscientists, there is continual progress. One obstacle is complexity: the human central nervous system is the most complicated known structure in the universe.
Cognitive neuroscientists want to know the neural mechanisms that account for our awareness of change, for our ability to anticipate the future, for our sense of time’s flow, for our ability to place remembered events into the correct time order (temporal succession), for our understanding of tenses, for our ability to notice and often accurately estimate durations, and for our ability to keep track of durations across many different time scales, such as milliseconds for some events and years for others.
Our body surely is capable of detecting very different durations even when we are not conscious of doing so. When we notice that a sound came from our left, not our right, we do this by unconsciously detecting the very slight extra time the sound takes to reach our right ear, only an extra 0.0005 seconds after reaching our left ear. The unconscious way we detect this difference in time must be very different from the way we detect differences in years. Also, our neurological and psychological “clocks” very probably do not work by counting ticks and tocks, as do the clocks we build in order to measure physical time.
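That 0.0005-second figure can be checked with back-of-the-envelope arithmetic: the maximum interaural delay is roughly the ear-to-ear distance divided by the speed of sound. A minimal sketch, assuming a typical ear separation of about 0.17 meters and a sound speed of about 343 m/s (both figures are standard assumptions, not taken from this article):

```python
# Rough estimate of the maximum interaural time difference: the extra
# travel time for a sound arriving from directly to one side.
# Assumed figures: ear separation ~0.17 m, speed of sound ~343 m/s.
EAR_SEPARATION_M = 0.17
SPEED_OF_SOUND_M_S = 343.0

def max_interaural_delay(separation=EAR_SEPARATION_M,
                         speed=SPEED_OF_SOUND_M_S):
    """Extra seconds the sound takes to reach the far ear."""
    return separation / speed

print(f"{max_interaural_delay():.4f} s")  # roughly 0.0005 s, as the text states
```

The result agrees with the half-millisecond figure mentioned above.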
We are consciously aware of time passing by noticing changes either outside or inside our body. For example, we notice a leaf fall from a tree as it acquires a new location. If we close our eyes, we still can encounter time just by imagining a leaf falling. But scientists and philosophers want more details. How is this conscious encounter with time accomplished, and how does it differ from our unconscious awareness of time?
With the notable exception of Husserl, most philosophers say our ability to imagine other times is a necessary ingredient in our having any consciousness at all. Some say our consciousness is a device that stores information about the past in order to predict the future. Although some researchers believe consciousness is a hard problem to understand, some others have said, “Consciousness seems easy to me: it’s merely the thoughts we can remember.” We remember old perceptions, and we make use of our ability to imagine other times when we experience a difference between our present perceptions and our present memories of past perceptions. Somehow the difference between the two gets interpreted by us as evidence that the world we are experiencing is changing through time. John Locke said our train of ideas produces our idea that events succeed each other in time, but he offered no details on how this train does the producing. Surely memory is key. Memories need to be organized into the proper temporal order, in analogy to how a deck of cards, each card bearing a different integer, can be sorted into numerical order. There is a neurological basis to the mental process of time-stamping memories so they are not just a jumble when recalled or retrieved into consciousness. The human ability to organize memories far surpasses that of any other conscious being. We can decide to do next week what we planned last month because of what happened last year. This is a key part of what makes Homo sapiens sapient.
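The card-sorting analogy can be made concrete with a toy sketch: if each memory carries a time-stamp, recalling the memories in temporal order amounts to sorting on that stamp. The example data below is invented for illustration; nothing here is offered as a model of how brains actually do this.

```python
# Toy illustration of the card-sorting analogy: memories tagged with a
# time-stamp can be recalled in temporal order by sorting on the tag.
# The memories themselves are invented examples.
memories = [
    (2023, "planned next week's trip"),
    (2021, "first sailing lesson"),
    (2022, "bought the boat"),
]

# Sorting on the time-stamp (the first element of each pair) restores
# temporal succession, like ordering a shuffled deck of numbered cards.
recalled = [event for _, event in sorted(memories)]
print(recalled)
# → ['first sailing lesson', 'bought the boat', "planned next week's trip"]
```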
As emphasized, a major neurological problem is to explain the origin and character of our temporal experiences. How does a brain take the input from all its sense organs and produce true beliefs about the world’s temporal relationships? Philosophers and cognitive scientists continue to investigate this, but so far there is no consensus on either how we experience temporal phenomena or how we are conscious that we do. However, there is a growing consensus that consciousness itself is an emergent property of a central nervous system, and that dualism between mental properties and physical properties is not a fruitful supposition. The vast majority of neuroscientists are physicalists who treat brains as if they are just wet machines, and they believe consciousness does not transcend possible scientific understanding.
Neuroscientists agree that the brain takes a pro-active role in building a mental scenario of the external 3+1-dimensional world. As one piece of suggestive evidence, notice that if you look at yourself in the mirror and glance at your left eyeball, then glance at your right eyeball, and then glance back to the left, you can never see your own eyes move. Your brain always constructs a continuous story of non-moving eyes. However, a video camera taking pictures of your face easily records your eyeballs’ movements, proving that your brain has taken an active role in “doctoring” the scenario.
Researchers believe that at all times when we are conscious our mind is testing hypotheses regarding what is taking place beyond our brain. The brain continually receives visual, auditory, tactile, and other sensory signals arriving at different times from an event, then must produce a hypothesis about what the signals might mean. Do those signals mean there probably is a tiger rushing at us? The brain also continuously revises hypotheses and produces new ones in an attempt to have a coherent story about what is out there, what is happening before what, and what is causing what. Being good at unconsciously producing, testing, and revising these hypotheses has survival value.
Psychological time’s rate of passage is a fascinating phenomenon to study. The most obvious feature is that psychological time often gets out of sync with physical time. At the end of our viewing an engrossing television program, we often think, “Where did the time go? It sped by.” When we are hungry in the afternoon and have to wait until the end of the workday before we can have dinner, we think, “Why is everything taking so long?” When we are feeling pain and we look at a clock, the clock seems to be ticking slower than normal.
An interesting feature of the rate of passage of psychological time reveals itself when we compare the experiences of younger people to older people. When we are younger, we lay down richer memories because everything is new. When we are older, the memories we lay down are much less rich because we have “seen it all before.” That is why older people report that a decade goes by so much more quickly than it did when they were younger.
Do things seem to move more slowly when we are terrified? “Yes,” most people would say. “No,” says neuroscientist David Eagleman, “it’s a retrospective trick of memory.” The terrifying event does seem to you to move more slowly when you think about it later, but not at the time it is occurring. Because memories of the terrifying event are “laid down so much more densely,” Eagleman says, it seems to you, upon your remembering, that your terrifying event lasted longer than it really did.
The human being inherited most or perhaps all of its biological clocks from its ancestor species. Although the cerebral cortex is usually considered to be the base for our conscious experience, it is surprising that rats can distinguish a five-second interval from a forty-second interval even with their cerebral cortex removed. So, a rat’s means of sensing time is probably distributed throughout many places in its brain. Perhaps the human being’s time sense is similarly distributed. However, surely the fact that we know that we know about time is specific to our cerebral cortex. A rat does not know that it knows. It has competence without comprehension. A cerebral cortex apparently is required for this comprehension. Very probably no other primate has an appreciation of time as sophisticated as that of any normal human being.
Entomologists still do not know how the biological clock of a cicada enables these insects to hatch after 13 years living underground, and not after 12 years or 14 years. Progress on this issue might provide helpful clues for understanding the human being’s biological clocks.
We humans are very good at detecting the duration of silences. We need this ability to tell the difference between the spoken sentence, “He gave her cat-food,” and “He gave her cat food.” The hyphen is the linguistic tool for indicating that the duration between the spoken words “cat” and “food” is shorter than usual. This is a favorite example of the neuroscientist Dean Buonomano.
Do we have direct experience only of an instantaneous present event, or instead do we have direct experience only of the specious present, a present that lasts a short stretch of physical time? Informally, the issue is said to be whether the present is thin or thick. Plato, Aristotle, Thomas Reid, and Alexius Meinong believed in a thin present. Shadworth Hodgson, Mary Calkins, and William James believed in a thick present. The latter position is now the one more favored by experts in neuroscience and the philosophy of mind.
If it is thick, then how thick? Does the present last longer than the blink of an eye? Among those accepting the notion of a specious present, a good estimate of its duration varies from approximately eighty milliseconds to three seconds for human beings, although neuroscientists do not yet know why it is not two milliseconds or seven seconds.
Another issue is about overlapping specious presents. We do seem to have what William James called a unified “stream of consciousness,” but it is not known how our individual specious presents overlap to produce this unity.
When you open your eyes, can you see what is happening now? In 1630, René Descartes would have said yes, but nearly all philosophers in the twenty-first century say no. You see the North Star as it was over 300 years ago, not as it is now. Also, light arriving at your eye from an external object contains information about its color, motion, and form. The three kinds of signals arrive simultaneously, but it takes your brain different times to process that information. Color information is processed more quickly than motion information, which in turn is processed more quickly than form information or shape information. Only after the light has taken its time to arrive at your eye, and then you have processed all the information, can you construct a correct story that perhaps says, “A white tiger is leaping towards me.”
So, we all live in the past—in the sense that our belief about what is happening now is formed later than when the events really happened according to a clock. Our brain takes about eighty milliseconds or more to construct a story of what is happening based on the information coming in from our different sense organs. Because of its long neck, a giraffe’s specious present might last considerably longer than ours. However, it cannot last too much longer, or else the story would be so outdated that the giraffe risks becoming a predator’s lunch while the information processing is happening. Therefore, evolution has probably fine-tuned the duration of each kind of organism’s specious present.
In the early days of television broadcasting, engineers worried about the problem of keeping audio and video signals exactly synchronized. Then they accidentally discovered that they had about a tenth-of-a-second of “wiggle room.” As long as the signals arrive within this period, viewers’ brains automatically re-synchronize the signals. Outside that tenth-of-a-second period, it suddenly looks like a badly dubbed movie (Eagleman, 2009).
Watch a bouncing basketball. The light from the bounce arrives into our eyes before the sound arrives into our ears; then the brain builds a story in which the sight and sound of the bounce happen simultaneously. This sort of subjective synchronizing of visual and audio works for the bouncing ball so long as the ball is less than 100 feet away. Any farther and we begin to notice that the sound arrives later.
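The ~100-foot figure fits the tenth-of-a-second re-synchronization window mentioned above: at the speed of sound (about 1125 ft/s, with light’s travel time negligible at these distances), the audio lag crosses 0.1 seconds at roughly 112 feet. A hedged sketch of that arithmetic, using standard physics constants that are assumptions rather than figures from this article:

```python
# How far away can a bounce be before its sound lags its sight by more
# than the brain's ~0.1 s re-synchronization window (Eagleman 2009)?
# Assumed constant: speed of sound in air ~1125 ft/s; light is treated
# as arriving instantly at these distances.
SPEED_OF_SOUND_FT_S = 1125.0
RESYNC_WINDOW_S = 0.1

def audio_lag_s(distance_ft):
    """Seconds by which the sound trails the sight of the bounce."""
    return distance_ft / SPEED_OF_SOUND_FT_S

for d in (50, 100, 150):
    lag = audio_lag_s(d)
    status = "fused" if lag <= RESYNC_WINDOW_S else "noticeably out of sync"
    print(f"{d} ft: lag {lag:.3f} s -> {status}")
```

At 100 feet the lag is about 0.089 s, just inside the window; at 150 feet it is about 0.133 s, which is why the delay becomes noticeable.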
Some Eastern philosophies promote living in the present and dimming one’s awareness of the past and the future. Unfortunately, people who “live in the moment” tend to lead more dangerous and shorter lives. The cognitive scientist Lera Boroditsky says a crack addict is the best example of a person who lives in the moment.
Philosophers of time and psychologists who study time are interested in both how a person’s temporal experiences are affected by deficiencies in their imagination and their memory and how different interventions into a healthy person’s brain might affect that person’s temporal experience.
Some of neuroscientist David Eagleman’s experiments have shown clearly that under certain circumstances a person can be deceived into believing event A occurred before event B, when in fact the two occurred in the reverse order according to clock time. For more on these topics, see (Eagleman, 2011).
The time dilation effect in psychology occurs when an event involving an object coming toward you lasts longer in psychological time than an otherwise identical event in which the object is stationary. With repeated events lasting the same amount of clock time, presenting a brighter object will make its event seem to last longer. This is also true for louder sounds.
Suppose you live otherwise normally within a mine or tunnel and are temporarily closed off from communicating with the outside world. For a long while, simply with memory, you can keep track of how long you have been inside, but eventually you will lose track of the correct clock time. What determines how long the long while is, and how is it affected by the subject matter? And why are some persons better estimators than others? These are interesting, open questions.
Do we directly experience the present? This is controversial, and it is not the same question as whether at present we are having an experience. Those who answer “yes” tend to accept McTaggart’s A-theory of time. But notice how different such direct experience would have to be from our other direct experiences. We directly experience green color but can also directly experience other colors; we directly experience high-pitched notes but can also directly experience other notes. Can we say we directly experience the present time but also can directly experience other times? Definitely not. So, the direct experience of the present either is non-existent, or it is a strange sort of direct experience. Nevertheless, we probably do have some mental symbol for nowness in our mind that correlates with our having the concept of the present, but it does not follow from this that we directly experience the present any more than our having a concept of love implies that we directly experience love. For an argument that we do not experience the present, see chapter 9 of (Callender 2017).
If all organisms were to die, there would be events after those deaths. The stars would continue to shine, but would any of these star events be in the future? This is a philosophically controversial question because advocates of McTaggart’s A-theory will answer “yes,” whereas advocates of McTaggart’s B-theory will answer “no” and add “Whose future?”
The issue of whether time itself is subjective, a mind-dependent phenomenon such as a secondary quality, is explored elsewhere in this article.
According to René Descartes’ dualistic philosophy of mind, the mind is not in space, but it is in time. The current article accepts the more popular philosophy of mind that rejects dualism and claims that our mind is in both space and time due to the functioning of our brain. It takes no position, though, on the controversial issue of whether the process of conscious human understanding is a computation.
Neuroscientists and psychologists have investigated whether they can speed up our minds relative to a duration of physical time. If so, we might become mentally more productive, get more high-quality decision making done per fixed amount of physical time, and learn more per minute. Several avenues have been explored: using cocaine, amphetamines, and other drugs; undergoing extreme experiences such as jumping backwards off a ledge into a net; and trying different forms of meditation. These avenues definitely affect the ease with which pulses of neurotransmitters can be sent from one neuron to a neighboring neuron and thus affect our psychological time, but so far none of these avenues has succeeded in making us more productive.
For our final issue about time and mind, do we humans have an a priori awareness of time that can be used to give mathematics a firm foundation? In the early twentieth century, the mathematician and philosopher L.E.J. Brouwer believed so. Many mathematicians and philosophers at that time were suspicious that mathematics was not as certain as they hoped for, and they worried that contradictions might be uncovered within mathematics. Their suspicions were increased by the discovery of Russell’s Paradox and by the introduction into set theory of the controversial non-constructive axiom of choice. In response, Brouwer attempted to place mathematics on what he believed to be a firmer epistemological foundation by arguing that mathematical concepts are admissible only if they can be constructed from an ideal mathematician’s vivid, a priori awareness of time, what in Kantian terminology is called an intuition of inner time. Time, said Kant in his Critique of Pure Reason in 1781, is a structuring principle of all possible experience. As such time is not objective; it is not a feature of things-in-themselves, but rather is a feature of the phenomenal world.
Brouwer supported Kant’s claim that arithmetic is the pure form of temporal intuition. Brouwer tried to show how to construct higher-level mathematical concepts (for example, the mathematical line) from lower-level temporal intuitions; but unfortunately, he had to accept the consequence that his program required both rejecting Aristotle’s law of excluded middle in logic and rejecting some important theorems in mathematics, such as the theorem that every real number has a decimal expansion and Cantor’s theorem that there is an actual infinity, as opposed to a potential infinity, of points between any two points on the mathematical line. Unwilling to accept those departures from classical mathematics, most other mathematicians and philosophers instead rejected Brouwer’s idea of an intimate connection between mathematics and time.
For interesting video presentations about psychological time, see (Carroll 2012) and (Eagleman 2011). For the role of time in phenomenology, see the article “Phenomenology and Time-Consciousness.” According to the phenomenologist Edmund Husserl, “One cannot discover the least thing about objective time through phenomenological analysis” (Husserl, 1991, p. 6).
Consider the mind of an extraterrestrial. Could an extraterrestrial arrive here on Earth with no concept of time? Probably not. How about arriving with a very different concept of time from ours? Perhaps, but how different? Stephen Hawking’s colleague James Hartle tried to answer this question by speculating that we and the extraterrestrial will at least, “share concepts of past, present and future, and the idea of a flow of time.”
Arntzenius, Frank and H. Greaves. 2009. “Time Reversal in Classical Electromagnetism,” The British Journal for the Philosophy of Science, vol. 60 (3), pp. 557-584.
Challenges Feynman’s claim that anti-particles are nothing but particles propagating backwards in time.
Arthur, Richard T. 2014. Leibniz. Polity Press. Cambridge, U.K.
Comprehensive monograph on all things Leibniz, with a detailed examination of his views on time.
Arthur, Richard T. W. 2019. The Reality of Time Flow: Local Becoming in Physics, Springer.
Challenges the claim that the now is subjective in modern physics.
Azzouni, Jody. 2015. “Nominalism, the Nonexistence of Mathematical Objects,” in Mathematics, Substance and Surmise, edited by E. Davis and P.J. Davis, pp. 133-145.
Argues that mathematical objects referred to by mathematical physics do not exist despite Quine’s argument that they do exist. Azzouni also claims that a corporation does not exist.
Barbour, Julian. 1999. The End of Time, Weidenfeld and Nicolson, London, and Oxford University Press, New York.
A popular presentation of Barbour’s theory which implies that if we could see the universe as it is, we should see that it is static. It is static, he says, because his way of quantizing general relativity, namely quantum geometrodynamics with its Wheeler-DeWitt equation, implies a time-independent quantum state for the universe as a whole. Time is emergent and not fundamental. He then offers an exotic explanation of how time emerges and why time seems to us to exist.
Barbour, Julian. 2009. The Nature of Time, arXiv:0903.3489.
An application of Barbour’s ideas of strong emergentism to classical physics.
Baron, Sam. 2018. “Time, Physics, and Philosophy: It’s All Relative,” Philosophy Compass, Volume 13, Issue 1, January.
Reviews the conflict between the special theory of relativity and the dynamic theories of time.
Baron, S. and K. Miller. 2015. “Our Concept of Time,” in Philosophy and Psychology of Time, edited by B. Mölder, V. Arstila, and P. Ohrstrom. Springer, pp. 29-52.
Explores the issue of whether time is a functionalist concept.
Baron, S. and K. Miller. 2019. An Introduction to the Philosophy of Time. Polity Press, Medford, Massachusetts.
Explores the metaphysical issues involving time.
Baron, S., K. Miller and J. Tallant. 2022. Out of Time: A Philosophical Study of Timelessness. Oxford University Press.
Explores timelessness, specifically the possibility that theories of physics do not have any role for real time.
Bunge, Mario. 1968. “Physical Time: The Objective and Relational Theory,” Philosophy of Science, Vol. 35, No. 4, pp. 355-388.
Examines the dispute between relationism and substantivalism, sometimes acerbically.
Butterfield, Jeremy. 1984. “Seeing the Present,” Mind, 93, pp. 161-76.
Defends the B-camp position on the subjectivity of the present. Argues against there being a global present.
Callender, Craig, and Ralph Edney. 2001. Introducing Time, Totem Books, USA.
A delightful cartoon-style book covering many of the topics in this encyclopedia article in a more elementary way. Each page is two-thirds graphics and one-third text.
Callender, Craig and Carl Hoefer. 2002. “Philosophy of Space-Time Physics” in The Blackwell Guide to the Philosophy of Science, ed. by Peter Machamer and Michael Silberstein, Blackwell Publishers, pp. 173-98.
Discusses whether it is a fact or a convention that in a reference frame the speed of light going one direction is the same as the speed in the opposite direction.
Callender, Craig. 2010. “Is Time an Illusion?” Scientific American, June, pp. 58-65.
Explains how the belief that time is fundamental may be an illusion.
Callender, Craig. 2017. What Makes Time Special? Oxford University Press.
A comprehensive monograph on the relationship between the manifest image of time and its scientific image. The book makes a case for how, if information gathering and utilizing systems like us are immersed in an environment with the physical laws that do hold, then we will create the manifest image of time that we do. Not written at an introductory level.
Carnap, Rudolf. 1966. Philosophical Foundations of Physics: An Introduction to the Philosophy of Science. Basic Books, Inc. New York.
Chapter 8 “Time” is devoted to the issue of how to distinguish an accurate clock from an inaccurate one.
Carroll, John W. and Ned Markosian. 2010. An Introduction to Metaphysics. Cambridge University Press.
This introductory, undergraduate metaphysics textbook contains an excellent chapter introducing the metaphysical issues involving time, beginning with the McTaggart controversy.
Carroll, Sean. 2010. From Eternity to Here: The Quest for the Ultimate Theory of Time, Dutton/Penguin Group, New York.
Part Three “Entropy and Time’s Arrow” provides a very clear explanation of the details of the problems involved with time’s arrow. For an interesting answer to the question of what happens in an interaction between our part of the universe and a part in which the arrow of time goes in reverse, see endnote 137 for p. 164.
Carroll, Sean. 2011. “Ten Things Everyone Should Know About Time,” Discover Magazine, Cosmic Variance.
Contains the quotation about how the mind reconstructs its story of what is happening “now.”
Carroll, Sean. 2012. Mysteries of Modern Physics: Time. The Teaching Company, The Great Courses: Chantilly, Virginia.
A series of popular lectures about time by a renowned physicist with an interest in philosophical issues. Emphasizes the arrow of time.
Carroll, Sean. 2016. The Big Picture. Dutton/Penguin Random House. New York.
A physicist surveys the cosmos’ past and future, including the evolution of life.
Carroll, Sean. 2019. Something Deeply Hidden: Quantum Worlds and the Emergence of Spacetime, Dutton/Penguin Random House.
Pages 287-289 explain how time emerges in a quantum universe governed by the Wheeler-DeWitt equation, a timeless version of the Schrödinger equation. The chapter “Breathing in Empty Space” explains why the limits of time (whether it is infinite or finite) depend on the total amount of energy in the universe. His podcast Mindscape in August 13, 2018 “Why Is There Something Rather than Nothing?” discusses this topic in its final twenty minutes. His answer is that this may not be a sensible question to ask.
Carroll, Sean. 2022. The Biggest Ideas in the Universe: Space, Time, and Motion. Dutton/Penguin Random House.
A sophisticated survey of what relativity theory implies about space, time, and motion, with some emphasis on the philosophical issues. Introduces the relevant equations, but is aimed at a general audience and not physicists.
Crowther, Karen. 2019. “When Do We Stop Digging? Conditions on a Fundamental Theory of Physics,” in What is ‘Fundamental’?, edited by Anthony Aguirre, Brendan Foster, and Zeeya Merali, Springer International Publishing.
An exploration of what physicists do mean and should mean when they say a particular theory of physics is final or fundamental rather than merely more fundamental. She warns, “a theory formally being predictive to all high-energy scales, and thus apparently being the lowest brick in the tower [of theories] (or, at least, one of the bricks at the lowest level of the tower), is no guarantee that it is in fact a fundamental theory. …Yet, it is one constraint on a fundamental theory.” When we arrive at a fundamental theory, “the question shifts from ‘What if there’s something beyond?’ to ‘Why should we think there is something beyond?’ That is, the burden of justification is transferred.”
Damasio, Antonio R. 2006. “Remembering When,” Scientific American: Special Edition: A Matter of Time, vol. 287, no. 3, 2002; reprinted in Katzenstein, pp. 34-41.
A look at the brain structures involved in how our mind organizes our experiences into the proper temporal order. Includes a discussion of Benjamin Libet’s claim to have discovered in the 1970s that the brain events involved in initiating our free choice occur about a third of a second before we are aware of our making the choice. This claim has radical implications for the philosophical issue of free will.
Dainton, Barry. 2010. Time and Space, Second Edition, McGill-Queens University Press: Ithaca.
An easy-to-read, but technically correct, book. This is probably the best single book to read for someone desiring to understand in more depth the issues presented in this encyclopedia article.
Davies, Paul. 1995. About Time: Einstein’s Unfinished Revolution, Simon & Schuster.
An easy-to-read survey of the impact of the theory of relativity and other scientific advances on our understanding of time.
Davies, Paul. 2002. How to Build a Time Machine, Viking Penguin.
A popular exposition of the details behind the possibilities of time travel.
Deutsch, David and Michael Lockwood. 1994. “The Quantum Physics of Time Travel,” Scientific American, pp. 68-74. March.
An investigation of the puzzle of acquiring information for free by traveling in time.
Deutsch, David. 2013. “The Philosophy of Constructor Theory,” Synthese, Volume 190, Issue 18.
Challenges Laplace’s Paradigm that physics should be done by predicting what will happen from initial conditions and laws of motion. http://dx.doi.org/10.1007/s11229-013-0279-z.
Dowden, Bradley. 2009. The Metaphysics of Time: A Dialogue, Rowman & Littlefield Publishers, Inc.
An undergraduate textbook in dialogue form that covers many of the topics discussed in this encyclopedia article. Easy reading for newcomers to the philosophy of time.
Dummett, Michael. 2000. “Is Time a Continuum of Instants?” Philosophy, Cambridge University Press, pp. 497-515.
A constructivist model of time that challenges the idea that time is composed of durationless instants.
Eagleman David. 2009. “Brain Time.” In What’s Next? Dispatches on the Future of Science. Max Brockman, Ed., Penguin Random House.
A neuroscientist discusses the plasticity of time perception or temporal distortion.
Eagleman David. 2011. “David Eagleman on CHOICE,” Oct. 4, https://www.youtube.com/watch?v=MkANniH8XZE.
Commentary on research as of 2011 on subjective time.
Einstein, Albert. 1982. “Autobiographical Notes.” In P. A. Schilpp, ed. Albert Einstein: Philosopher-Scientist, vol. 1. LaSalle, Il. Open Court Publishing Company.
Describes his early confusion between the structure of the real number line and the structure of time itself.
Earman, John. 1972. “Implications of Causal Propagation Outside the Null-Cone,” Australasian Journal of Philosophy, 50, pp. 222-37.
Describes his rocket paradox that challenges time travel to the past.
Fisher, A. R. J. 2015. “David Lewis, Donald C. Williams, and the History of Metaphysics in the Twentieth Century.” Journal of the American Philosophical Association, volume 1, issue 1, Spring.
Discusses the disagreements between Lewis and Williams, who both are four-dimensionalists, about the nature of time travel.
Gödel, Kurt. 1959. “A Remark about the Relationship between Relativity Theory and Idealistic Philosophy,” in P. A. Schilpp, ed., Albert Einstein: Philosopher-Scientist, Harper & Row, New York.
Discussion of solutions to Einstein’s equations that allow closed causal chains, that is, traveling to your past.
Gott, J. Richard. 2002. Time Travel in Einstein’s Universe: The Physical Possibilities of Travel Through Time.
Presents an original theory of the origin of the universe involving backward causation and time travel.
Grant, Andrew. 2015. “Time’s Arrow,” Science News, July 25, pp. 15-18.
Popular description of why our early universe was so orderly even though nature should always have preferred the disorderly.
Greene, Brian. 2020. Until the End of Time: Mind, Matter, and Our Search for Meaning in an Evolving Universe, Alfred A. Knopf, New York.
Its chapter called “The Twilight of Time” explores a variety of futures of the universe. Many of the key arguments occur only in the lengthy footnotes.
Greene, Brian. 2011. The Hidden Reality: Parallel Universes and the Deep Laws of the Universe, Vintage Books, New York.
Describes nine versions of the Multiverse Theory, including the Ultimate multiverse theory described by the philosopher Robert Nozick. Confronts the issue of whether the claim that there is another universe in addition to ours is falsifiable.
Grey, W. 1999. “Troubles with Time Travel,” Philosophy 74: pp. 55-70.
Examines arguments against time travel.
Grünbaum, Adolf. 1950-51. “Relativity and the Atomicity of Becoming,” Review of Metaphysics, pp. 143-186.
An attack on the notion of time’s flow, and a defense of the treatment of time and space as being continua. Difficult reading.
Grünbaum, Adolf. 1971. “The Meaning of Time,” in Basic Issues in the Philosophy of Time, Eugene Freeman and Wilfrid Sellars, eds. LaSalle, pp. 195-228.
An analysis of the meaning of the term time in both the manifest image and scientific image, and a defense of the B-theory of time. Difficult reading.
Guth, Alan. 2014. “Infinite Phase Space and the Two-Headed Arrow of Time,” FQXi conference in Vieques, Puerto Rico. https://www.youtube.com/watch?v=AmamlnbDX9I.
Guth argues that an arrow of time could evolve naturally even though it had no special initial conditions on entropy, provided the universe has an infinite available phase space that the universe could spread out into. If so, its maximum possible entropy is infinite, and any other state in which the universe begins will have relatively low entropy.
Haack, Susan. 1974. Deviant Logic, Cambridge University Press.
Chapter 4 contains a clear account of Aristotle’s argument (in section 14d of the present article) for truth-value gaps, and its development in Lukasiewicz’s three-valued logic.
Hawking, Stephen. 2018. Brief Answers to the Big Questions. Bantam Books, New York.
Popular survey of science’s impact upon big questions such as “How did it all begin?”, “What is inside a black hole?”, “Is time travel possible?”, and “Will artificial intelligence outsmart us?”
Hawking, Stephen. 1992. “The Chronology Protection Hypothesis,” Physical Review. D 46, p. 603.
Nature conspires somehow to block backward time travel.
Hawking, Stephen. 1996. A Brief History of Time, Updated and Expanded Tenth Anniversary Edition, Bantam Books.
A leading theoretical physicist and cosmologist provides introductory chapters on space and time, black holes, the origin and fate of the universe, the arrow of time, and time travel. Hawking suggests that perhaps our universe originally had four space dimensions and no time dimension, and time came into existence when one of the space dimensions evolved into a time dimension. He called this special space dimension “imaginary time.”
Horwich, Paul. 1975. “On Some Alleged Paradoxes of Time Travel,” Journal of Philosophy, 72: pp. 432-44.
Examines some of the famous arguments against past time travel.
Horwich, Paul. 1987. Asymmetries in Time, The MIT Press.
A monograph that relates the central problems of time to other problems in metaphysics, philosophy of science, philosophy of language and philosophy of action. Horwich argues that time itself has no arrow.
Hossenfelder, Sabine. 2022. Existential Physics: A Scientist’s Guide to Life’s Biggest Questions, Viking/Penguin Random House LLC.
A theoretical physicist who specializes in the foundations of physics examines the debate between Leibniz and Newton on relational vs. absolute (substantival) time. Her Chapter Two on theories about the beginning and end of the universe is especially deep, revealing, and easy to understand.
Huggett, Nick. 1999. Space from Zeno to Einstein, MIT Press.
Clear discussion of the debate between Leibniz and Newton on relational vs. absolute (substantival) time.
Husserl, Edmund. 1991. On the Phenomenology of the Consciousness of Internal Time. Translated by J. B. Brough. Originally published 1893-1917. Dordrecht: Kluwer Academic Publishers.
The father of phenomenology discusses internal time consciousness.
Katzenstein, Larry. 2006. ed. Scientific American Special Edition: A Matter of Time, vol. 16, no. 1.
A collection of Scientific American articles about time.
Kirk, G.S. and Raven, J.E. 1957. The Presocratic Philosophers. New York: Cambridge University Press.
Krauss, Lawrence M. and Glenn D. Starkman, 2002. “The Fate of Life in the Universe,” Scientific American Special Edition: The Once and Future Cosmos, Dec. pp. 50-57.
Discusses the future of intelligent life and the extent to which life can adapt to and survive the increasing expansion of the universe.
Krauss, Lawrence M. 2012. A Universe from Nothing. Atria Paperback, New York.
Discusses on p. 170 why we live in a universe with time rather than with no time. The issue is pursued further in the afterword to the paperback edition, which is not included in the hardback edition. Krauss’ position on why there is something rather than nothing was challenged by the philosopher David Albert in his March 23, 2012 review of Krauss’ hardback book in The New York Times newspaper.
Kretzmann, Norman. 1966. “Omniscience and Immutability,” The Journal of Philosophy, July, pp. 409-421.
Raises the question: If God knows what time it is, does this demonstrate that God is not immutable?
Lasky, Ronald C. 2006. “Time and the Twin Paradox,” in Katzenstein, pp. 21-23.
A short analysis of the twin paradox, with helpful graphs showing how each twin would view his or her own clock plus the other twin’s clock.
Le Poidevin, Robin and Murray MacBeath, 1993. The Philosophy of Time, Oxford University Press.
A collection of twelve influential articles on the passage of time, subjective facts, the reality of the future, the unreality of time, time without change, causal theories of time, time travel, causation, empty time, topology, possible worlds, tense and modality, direction and possibility, and thought experiments about time. Difficult reading for undergraduates.
Le Poidevin, Robin. 2003. Travels in Four Dimensions: The Enigmas of Space and Time, Oxford University Press.
A philosophical introduction to conceptual questions involving space and time. Suitable for use as an undergraduate textbook without presupposing any other course in philosophy. There is a de-emphasis on teaching the scientific theories, and an emphasis on elementary introductions to the relationship of time to change, the implications that different structures for time have for our understanding of causation, difficulties with Zeno’s Paradoxes, whether time passes, the nature of the present, and why time has an arrow.
Lewis, David K. 1976. “The Paradoxes of Time Travel,” American Philosophical Quarterly, 13: pp. 145-52.
A classic argument against changing the past. Lewis assumes the B-theory of time.
Lockwood, Michael. 2005. The Labyrinth of Time: Introducing the Universe, Oxford University Press.
A philosopher of physics presents the implications of contemporary physics for our understanding of time. Chapter 15, “Schrödinger’s Time-Traveler,” presents the Oxford physicist David Deutsch’s quantum analysis of time travel.
Lowe, E. J. 1998. The Possibility of Metaphysics: Substance, Identity and Time, Oxford University Press.
This British metaphysician defends the A-theory’s tensed view of time in chapter 4, based on an ontology of substances rather than events.
Mack, Katie. 2020. The End of Everything (Astrophysically Speaking). Scribner, New York.
Exploration of alternative ways the universe might end.
Markosian, Ned. 2003. “A Defense of Presentism,” in Zimmerman, Dean (ed.), Oxford Studies in Metaphysics, Vol. 1, Oxford University Press.
Maudlin, Tim. 1988. “The Essence of Space-Time.” Proceedings of the Biennial Meeting of the Philosophy of Science Association, Volume Two: Symposia and Invited Papers (1988), pp. 82-91.
Maudlin discusses the hole argument, manifold substantivalism and metrical essentialism.
Maudlin, Tim. 2002. “Remarks on the Passing of Time,” Proceedings of the Aristotelian Society, New Series, Vol. 102, pp. 259-274. Oxford University Press. https://www.jstor.org/stable/4545373.
Defends eternalism, the block universe, and the passage of time.
Maudlin, Tim. 2007. The Metaphysics Within Physics, Oxford University Press.
Chapter 4, “On the Passing of Time,” defends the dynamic theory of time’s flow and argues that the passage of time is objective.
Maudlin, Tim. 2012. Philosophy of Physics: Space and Time, Princeton University Press.
An advanced introduction to the conceptual foundations of space-time theory.
McCall, Storrs. 1966. “II. Temporal Flux,” American Philosophical Quarterly, October.
An analysis of the block universe, the flow of time, and the difference between past and future.
McCoy, C. D. and Craig Callender. 2022. “Time in Cosmology” in Eleanor Knox and Alastair Wilson (eds.), The Routledge Companion to Philosophy of Physics, Routledge, London, pp. 707-718.
Section 2 surveys the controversy over whether space and time are united in an entity that relativity theory calls “space-time.”
McGinn, Colin. 1999. The Mysterious Flame: Conscious Minds in a Material World. Basic Books.
Claims that the mind-body problem always will be a mystery for your mind but not for your genes.
McTaggart, J. M. E. 1927. The Nature of Existence, Cambridge University Press.
Chapter 33 restates more clearly the arguments that McTaggart presented in 1908 for his A series and B series and how they should be understood to show that time is unreal. Difficult reading. The argument for the inconsistency that a single event has only one of the properties of being past, present, or future, but that any event also has all three of these properties is called “McTaggart’s Paradox.” The chapter is renamed “The Unreality of Time,” and is reprinted on pp. 23-59 of (Le Poidevin and MacBeath 1993).
Mellor, D. H. 1998. Real Time II, International Library of Philosophy.
This monograph presents a subjective theory of tenses. Mellor argues that the truth conditions of any tensed sentence can be explained without tensed facts.
Merali, Zeeya. 2013. “Theoretical Physics: The Origins of Space and Time,” Nature, 28 August, vol. 500, pp. 516-519.
Describes six theories that compete for providing an explanation of the basic substratum from which space and time emerge.
Miller, Kristie. 2013. “Presentism, Eternalism, and the Growing Block,” in A Companion to the Philosophy of Time. Ed. by Heather Dyke and Adrian Bardon, John Wiley & Sons, Inc., pp. 345-364.
Compares the pros and cons of competing ontologies of time.
Morris, Michael S., Kip S. Thorne and Ulvi Yurtsever. 1988. “Wormholes, Time Machines, and the Weak Energy Condition,” Physical Review Letters, vol. 61, no. 13, 26 September.
The first description of how to build a time machine using a wormhole.
Moskowitz, Clara. 2021. “In Every Bit of Nothing There is Something,” Scientific American, February.
Describes how the Heisenberg Uncertainty Principle requires there to be continual creation and annihilation of virtual particles. This process is likely to be the cause of dark energy and the accelerating expansion of space.
Mozersky, M. Joshua. 2013. “The B-Theory in the Twentieth Century,” in A Companion to the Philosophy of Time. Ed. by Heather Dyke and Adrian Bardon, John Wiley & Sons, Inc., pp. 167-182.
A detailed evaluation and defense of the B-Theory.
Muller, Richard A. 2016a. NOW: The Physics of Time. W. W. Norton & Company, New York.
An informal presentation of the nature of time by an experimental physicist at the University of California, Berkeley. Chapter 15 argues that the correlation between the arrow of time and the increase of entropy is not a causal connection. Chapter 16 discusses the competing arrows of time. Muller favors space expansion as the cause of time’s arrow, with entropy not being involved. And he recommends a big bang theory in which both space and time expand, not simply space. Because space and time are so intimately linked, he says, the expansion of space is propelling time forward, and this explains the flow of time. “The flow of time causes entropy to increase, not the other way around,” he says. (p. 9)
Muller, Richard. 2016b. “Now and the Flow of Time,” arXiv, https://arxiv.org/pdf/1606.07975.pdf.
Argues that the flow of time consists of the continual creation of new moments, new nows, that accompany the creation of new space.
Nadis, Steve. 2013. “Starting Point,” Discover, September, pp. 36-41.
Non-technical discussion of the argument by cosmologist Alexander Vilenkin that the past of the multiverse must be finite (there was a first bubble) but its future must be infinite (always more bubbles).
Argues that, “We don’t find passage in our present theories and we would like to preserve the vanity that our physical theories of time have captured all the important facts of time. So we protect our vanity by the stratagem of dismissing passage as an illusion.”
Novikov, Igor. 1998. The River of Time, Cambridge University Press.
Chapter 14 gives a very clear and elementary description of how to build a time machine using a wormhole.
Oaklander, L. Nathan. 2008. The Ontology of Time. Routledge.
An authoritative collection of articles on all the major issues. Written for an audience of professional researchers.
Øhrstrøm, P. and P. F. V. Hasle. 1995. Temporal Logic: from Ancient Ideas to Artificial Intelligence. Kluwer Academic Publishers.
An elementary introduction to the logic of temporal reasoning.
Penrose, Roger. 2004. The Road to Reality: A Complete Guide to the Laws of the Universe. Alfred A. Knopf.
A mathematical physicist discusses cosmology, general relativity, and the second law of thermodynamics, but not at an introductory level.
Perry, John. 1979. “The Problem of the Essential Indexical,” Noûs,13 (1), pp. 3-21.
Argues that indexicals are essential to what we want to say in natural language; they cannot all be explicated by, reduced to, or eliminated in favor of B-theory discourse.
Pinker, Steven. 2007. The Stuff of Thought: Language as a Window into Human Nature, Penguin Group.
Chapter 4 discusses how the conceptions of space and time are expressed in language in a way very different from that described by either Kant or Newton. Page 189 says that in only half the world’s languages is the ordering of events expressed in the form of grammatical tenses. Chinese has no tenses, in the sense of verb conjugations, but of course it expresses all sorts of concepts about time in other ways.
Plato. 1961. Parmenides. Trans. by F. Macdonald Cornford in The Collected Dialogues of Plato, ed. E. Hamilton and H. Cairns. Princeton, NJ: Princeton University Press.
Plato discusses time.
Pöppel, Ernst. 1988. Mindworks: Time and Conscious Experience. San Diego: Harcourt Brace Jovanovich.
A neuroscientist explores our experience of time.
Price, Huw. 1996. Time’s Arrow & Archimedes’ Point: New Directions for the Physics of Time. Oxford University Press.
Price believes the future can affect the past, the notion of direction of the flow cannot be established as an objective notion, and philosophers of physics need to adopt an Archimedean point of view outside of time in order to discuss time in an unbiased manner.
Argues that a tenseless or B-theory of time fails to account for our feeling of relief that painful past events are in the past rather than in the present.
Prior, A.N. 1967. Past, Present and Future, Oxford University Press.
Pioneering work in temporal logic, the symbolic logic of time, that permits propositions to be true at one time and false at another.
Prior, A.N. 1969. “Critical Notices: Richard Gale, The Language of Time,” Mind, 78, no. 311, 453-460.
Contains his attack on the attempt to define time in terms of causation.
Prior, A.N. 1970. “The Notion of the Present,” Studium Generale, volume 23, pp. 245-8.
A brief defense of presentism, the view that the past and the future are not real.
Putnam, Hilary. 1967. “Time and Physical Geometry,” The Journal of Philosophy, 64, pp. 240-246.
Comments on whether Aristotle is a presentist. Putnam believes that the manifest image of time is refuted by relativity theory.
Quine, W.V.O. 1981. Theories and Things. Cambridge, MA: Harvard University Press.
Quine argues for physicalism in metaphysics and naturalism in epistemology.
Rovelli, Carlo. 2017. Reality is Not What It Seems: The Journey to Quantum Gravity. Riverhead Books, New York.
An informal presentation of time in the theory of loop quantum gravity. Loop theory focuses on gravity; string theory is a theory of gravity plus all the forces and matter.
Rovelli, Carlo. 2018. The Order of Time. Riverhead Books, New York.
An informal discussion of the nature of time by a theoretical physicist. The book was originally published in Italian in 2017. Page 70 contains the graph of the absolute elsewhere that was the model for the one in this article.
Rovelli, Carlo. 2018. “Episode 2: Carlo Rovelli on Quantum Mechanics, space-time, and Reality” in Sean Carroll’s Mindscape Podcast at www.youtube.com/watch?v=3ZoeZ4Ozhb8. July 10.
Rovelli and Carroll discuss loop quantum gravity vs. string theory, and whether time is fundamental or emergent.
Russell, Bertrand. 1915. “On the Experience of Time,” Monist, 25, pp. 212-233.
The classical tenseless theory.
Russell, Bertrand. 1929. Our Knowledge of the External World. W. W. Norton and Co., New York, pp. 123-128.
Russell develops his formal theory of time that presupposes the relational theory of time.
Saunders, Simon. 2002. “How Relativity Contradicts Presentism,” in Time, Reality & Experience edited by Craig Callender, Cambridge University Press, pp. 277-292.
Reviews the arguments for and against the claim that, since the present in the theory of relativity is relative to reference frame, presentism must be incorrect.
Savitt, Steven F. 1995. Time’s Arrows Today: Recent Physical and Philosophical Work on the Direction of Time. Cambridge University Press.
A survey of research in this area, presupposing sophisticated knowledge of mathematics and physics.
Savitt, Steven F. “Being and Becoming in Modern Physics,” in E. N. Zalta (ed.), The Stanford Encyclopedia of Philosophy.
In surveying being and becoming, it suggests how the presentist and growing-past ontologies might respond to criticisms that appeal to relativity theory.
Sciama, Dennis. 1986. “Time ‘Paradoxes’ in Relativity,” in The Nature of Time edited by Raymond Flood and Michael Lockwood, Basil Blackwell, pp. 6-21.
A clear account of the twin paradox.
Shoemaker, Sydney. 1969. “Time without Change,” Journal of Philosophy, 66, pp. 363-381.
A thought experiment designed to show us circumstances in which the existence of changeless periods in the universe could be detected.
Sider, Ted. 2000. “The Stage View and Temporary Intrinsics,” The Philosophical Review, 106 (2), pp. 197-231.
Examines the problem of temporary intrinsics and the pros and cons of four-dimensionalism.
Sider, Ted. 2001. Four-Dimensionalism: An Ontology of Persistence. Oxford University Press, New York.
Defends the ontological primacy of four-dimensional events over three-dimensional objects. He freely adopts causation as a means of explaining how a sequence of temporal parts composes a single perduring object. This feature of the causal theory of time originated with Hans Reichenbach.
Sklar, Lawrence. 1976. Space, Time, and Spacetime, University of California Press.
Chapter III, Section E discusses general relativity and the problem of substantival space-time, where Sklar argues that Einstein’s theory does not support Mach’s views against Newton’s interpretations of his bucket experiment; that is, Mach’s argument against substantivalism fails.
Slater, Hartley. 2012. “Logic is Not Mathematical,” Polish Journal of Philosophy, Spring, pp. 69-86.
Discusses, among other things, why modern symbolic logic fails to give a proper treatment of indexicality.
Smith, Quentin. 1994. “Problems with the New Tenseless Theories of Time,” pp. 38-56 in Oaklander, L. Nathan and Smith, Quentin (eds.), The New Theory of Time, New Haven: Yale University Press.
Challenges the new B-theory of time promoted by D. H. Mellor and others.
Smolin, Lee. 2013. Time Reborn. Houghton, Mifflin, Harcourt Publishing Company, New York.
An extended argument by a leading theoretical physicist for why time is real. Smolin is a presentist. He believes the general theory of relativity is mistaken about the relativity of simultaneity; he believes every black hole is the seed of a new universe; and he believes nothing exists outside of time.
Sorabji, Richard. 1988. Matter, Space, & Motion: Theories in Antiquity and Their Sequel. Cornell University Press.
Chapter 10 discusses ancient and contemporary accounts of circular time.
Steinhardt, Paul J. 2011. “The Inflation Debate: Is the Theory at the Heart of Modern Cosmology Deeply Flawed?” Scientific American, April, pp. 36-43.
Argues that the big bang theory with inflation is incorrect and that we need a cyclic cosmology with an eternal series of big bangs and big crunches but with no inflation. The inflation theory of quantum cosmology implies the primeval creation of a very large universe in a very short time.
Tallant, Jonathan. 2013. “Time,” Analysis, Vol. 73, pp. 369-379.
Examines these issues: How do presentists ground true propositions about the past? How does time pass? How do we experience time’s passing?
Tegmark, Max. 2017. “Max Tegmark and the Nature of Time,” Closer to Truth, https://www.youtube.com/watch?v=rXJBbreLspA, July 10.
Speculates on the multiverse and why branching time is needed for a theory of quantum gravity.
Thorne, Kip. 2014. The Science of INTERSTELLAR. W. W. Norton & Company, New York, London.
This specialist on time travel describes the scientific implications of time machines, black holes, and the big bang.
Unruh, William. 1999. “Is Time Quantized? In Other Words, Is There a Fundamental Unit of Time That Could Not Be Divided into a Briefer Unit?” Scientific American, October 21. https://www.scientificamerican.com/article/is-time-quantized-in-othe/
Discusses whether time has the same structure as a mathematical continuum.
Van Fraassen, Bas C. 1985. An Introduction to the Philosophy of Time and Space, Columbia University Press.
An advanced undergraduate textbook by an important philosopher of science.
Van Inwagen, Peter. 2015. Metaphysics, Fourth Edition. Westview Press.
An introduction to metaphysics by a distinguished proponent of the A-theory of time.
Veneziano, Gabriele. 2004. “The Myth of the Beginning of Time,” Scientific American, May, pp. 54-65; reprinted in Katzenstein 2006, pp. 72-81.
An account of string theory’s impact on our understanding of time’s origin. Veneziano hypothesizes that our big bang was not the origin of time but simply the outcome of a pre-existing state.
Wallace, David. 2021. Philosophy of Physics: A Very Short Introduction. Oxford University Press.
An excellent introduction to the philosophical issues within physics and how different philosophers approach them.
Wasserman, Ryan. 2018. Paradoxes of Time Travel, Oxford University Press.
A detailed review of much of the philosophical literature about time travel. The book contains many simple, helpful diagrams.
Whitehead, A. N. 1938. Modes of Thought. Cambridge University Press.
Here Whitehead describes his “process philosophy” that emphasizes the philosophy of becoming rather than of being, for instance, traveling the road rather than the road traveled.
Whitrow, G. J. 1980. The Natural Philosophy of Time, Second Edition, Clarendon Press.
A broad survey of the topic of time and its role in physics, biology, and psychology. Pitched at a higher level than the Davies books.
Author Information
Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.
What Else Science Requires of Time (That Philosophers Should Know)
Science appears to have a great many other implications about the nature of time that are not discussed in the main Time article.
One of the principal philosophical assumptions in physics is that the world makes sense and we can understand it. The main tool for doing this is a theory. The term theory has many senses, even in physics. In the article “Time” and in these supplements, it is used primarily in a special, technical sense, not in the sense of an explanation as in the remark, “My theory is that the mouse stole the cheese,” nor in the sense of a prediction as in the remark, “My theory is that the mouse will steal the cheese.” The general theory of relativity is an example of our intended sense. The key feature is that the theory’s laws are quantitative and not vague. The laws describe physically possible patterns of events; if a law does not allow certain behavior, then the behavior is not physically possible even though it might be logically possible.
Ideally the confirmed theories of physics explain what we already know, predict what we don’t, and help us understand what we can. There are other important ideals. We would like the theories to increase our ability to manipulate and control nature for our benefit. We expect each theory to be consistent, and we expect our collection of theories either to be mutually consistent or else for there to be understandable reasons why they are not. Also, it is wonderful when a theory can explain a wide variety of phenomena that intuitively would be considered to be unrelated. Newton’s theory of gravity explains why apples fall from trees and why the Moon moves as it does through the night sky. When we say theories explain, we know theories themselves do not do the explaining; we humans use the theories in order to explain. However, the idiom is commonly used.
Whether to add the remark that ideally the fundamental theories are true, or at least approximately true, has caused considerable controversy among philosophers of science. The philosopher Hilary Putnam is noted for arguing that the success of precise theories in physics would be a miracle if they were not at least approximately true.
Regarding the remark that the laws of physics are not vague, this is to state an ideal, not a requirement. Laws can be vague and still be laws. It is presumed there is no vagueness in the objective world. Vagueness lies in our concepts and semantics.
The epistemological goal in theory building is not to prove a scientific theory in the sense of supporting the theory so well that in the future any claim that it might be false should be ignored. The goal is a consensus among the experts, all of whom should have the mindset of being open-minded about the possible occurrence of new, relevant evidence against any scientific theory.
Physicists hope their theories can have a minimum number of laws and a minimum number of assumptions. They would like to minimize assumptions specifying the specific values of numerical constants, the so-called free parameters that cannot be calculated and must be determined by measurement. An example is the mass of the electron. It has to be measured; it cannot be predicted. That the number of free parameters can be minimized and the number of independent laws can be minimized is only a hope. It is not an a priori truth. Nevertheless, averaging over the history of physics, more and more phenomena are being explained with fewer and fewer laws. This has led to the hope of finding a set of fundamental laws explaining all phenomena at least in principle, one in which it would be clear how the currently accepted fundamental laws of relativity theory and quantum theory and other scientific theories are approximately true. This hope is the hope for a successful theory of quantum gravity. That theory is sometimes called a “theory of everything.” The name is pretentious because having this theory does not automatically yield a cure for cancer or the date when Julius Caesar saw his eighty-fourth fig leaf.
Since the time of Newton, the laws created by physicists have placed limitations on how one configuration of the objects in a physical system of objects is related to another configuration at another time. There definitely should be limitations because the universe is not created anew each moment with its old configuration having nothing to do with its new one. The meta-assumption that the best laws are dynamic laws describing the time evolution of a system has historically dominated physics.
Ideally, our dynamic laws satisfy the Markov Assumption: the future depends only on the present, not on the present plus the past.
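The Markov Assumption can be sketched with a toy discrete-time system. The halving rule below is invented purely for illustration; the point is only that the update function takes the current state as its sole input:

```python
# Minimal sketch of the Markov Assumption: the update rule sees only the
# current state, never the history of earlier states. The dynamics here
# (halving at each step) are a made-up toy law, not any physical law.
def step(state):
    return 0.5 * state

def evolve(initial, n):
    """Iterate the law n times; each new state depends on the present alone."""
    state = initial
    history = [state]
    for _ in range(n):
        state = step(state)
        history.append(state)
    return history
```

Because `step` has no access to `history`, the future of the system is fixed by its present configuration alone, which is exactly what the Markov Assumption requires.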
But some philosophers of physics in the 21st century have suggested pursuing other kinds of laws, non-dynamic laws. For example, maybe the ideal laws would be like the laws of the game Sudoku. Those laws are not dynamic. They only allow you to check whether a completed sequence of moves made in the game is allowable; at no point during the game do they tell you the next move, as a dynamic law would.
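The contrast can be made concrete with a minimal sketch of a Sudoku-style law (the function names here are illustrative inventions). The law delivers a verdict on a whole completed configuration at once; it never generates a next move:

```python
# A sketch of a non-dynamic, Sudoku-style law: it checks whether a completed
# 9x9 grid is allowable, but it cannot tell you what the next move should be.
def boxes(grid):
    """The nine 3x3 boxes of the grid, each flattened to a list."""
    return [[grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)]
            for br in range(0, 9, 3) for bc in range(0, 9, 3)]

def is_allowable(grid):
    """True iff every row, column, and box contains the digits 1-9 exactly once."""
    units = [list(row) for row in grid] + [list(col) for col in zip(*grid)] + boxes(grid)
    return all(sorted(unit) == list(range(1, 10)) for unit in units)
```

Such a law constrains which total histories are permitted without ever describing time evolution, which is the feature these philosophers have in mind.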
Regarding the term “fundamental law,” if law A and law B can be explained by law C, then law C is considered to be more fundamental than A and B. This claim has two usually implicit presuppositions: (1) A, B, and C are mutually consistent, and (2) C is not simply equivalent to the conjunction “A and B.” The word “basic” is often used synonymously with “fundamental.”
Here is the opinion of the influential theoretical cosmologist Stephen Hawking about the nature of scientific laws:
I believe that the discovery of these laws has been humankind’s greatest achievement…. The laws of nature are a description of how things actually work in the past, present and future…. But what’s really important is that these physical laws, as well as being unchangeable, are universal [so they apply to everything everywhere all the time] (Brief Answers to the Big Questions, 2018).
We humans are lucky that we happen to live in a universe that is so explainable, predictable and understandable, and that is governed by so few laws.
Some theories are expressed fairly precisely, and some are expressed less precisely. All other things being equal, the more precise the better. If they have important simplifying assumptions but still give helpful explanations of interesting phenomena, then they are often said to be models. Very simple models are said to be toy models (“Let’s consider a cow to be a perfect cube, and assume 4.2 is ten”). However, physicists do not always use the terms this way. Very often they use the terms “theory” and “model” interchangeably. For example, the Standard Model of particle physics is a model, but more accurately it would be said to be a theory in the sense used in this section. All physicists recognize this, but, for continuity with historical usage of the term, physicists have never bothered to replace the word “model” with “theory.”
For our fundamental theories of physics, the standard philosophical presupposition is that any dynamical law describes how the system evolves from a state at one time into a state at another time. All the dynamical laws in our fundamental theories of relativity and quantum mechanics are differential equations (or inequalities). These have infinitely many solutions that describe infinitely many situations. The equations are meant to be solved for a specific situation that provides the initial values or “initial conditions” (aka “boundary conditions”) for the variables within the equations. A single solution to the equations can be used as a prediction of how those values will change.
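As a toy illustration (the decay law dx/dt = -k·x and all the numbers are invented for the example, not drawn from any physical theory): the equation by itself has infinitely many solutions, and supplying the initial condition x(0) = x0 selects the one that serves as the prediction:

```python
import math

# Toy dynamical law dx/dt = -k*x, integrated by the forward Euler method.
# The initial condition x0 selects one solution out of infinitely many.
def euler_solve(k, x0, dt, steps):
    x = x0
    for _ in range(steps):
        x = x + dt * (-k * x)   # advance the state by one small time step
    return x

# The exact solution picked out by the same initial condition is x0*e^(-k*t),
# so the numerical prediction should approach it as dt shrinks.
def exact(k, x0, t):
    return x0 * math.exp(-k * t)
```

Changing `x0` changes which solution of the equation is being traced, which is precisely the role the text assigns to initial conditions.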
For example, in 1916 Karl Schwarzschild found the first exact solution to Einstein’s equations of general relativity. The environment (and thus the set of initial conditions) that he chose was a large sphere of gas in an otherwise empty universe, and the solution showed it would collapse into what we now call a black hole. In 1965, Roger Penrose was the first person to argue convincingly that relativity theory implies that, under a wide variety of realistic conditions, large stars will always collapse to a black hole with an inner singularity after their nuclear fuel has been expended.
Most researchers say a theory ideally should tell us how the system being studied would behave under perturbations or small changes in the initial conditions, for example, if the initial density of Schwarzschild’s sphere were changed. Knowing how a system would behave under different conditions helps us understand the causal structure of the system, which is what philosophers also call the counterfactual structure. This structure, when understood, gives the physicist a “feel” for the equations and their solutions.
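Probing a system under small changes in its initial conditions can be sketched with a toy damped law (again an invention for illustration, not Schwarzschild’s problem): run the same law twice from nearby starting values and compare the outcomes. For this stable system the gap between the runs shrinks; a chaotic system would instead amplify it:

```python
# Sketch of probing a system's response to a perturbation: run a toy damped
# law from two slightly different initial conditions and compare outcomes.
def integrate(k, x0, dt, steps):
    x = x0
    for _ in range(steps):
        x = x + dt * (-k * x)   # forward Euler step of dx/dt = -k*x
    return x

base = integrate(0.5, 1.00, 0.1, 100)
perturbed = integrate(0.5, 1.01, 0.1, 100)
gap = abs(perturbed - base)
# For this stable law the initial 0.01 gap shrinks by a factor (1 - k*dt)
# each step, so the two counterfactual histories converge.
```

Comparing such pairs of runs is one simple way of mapping out what the text calls the counterfactual structure of a system.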
A theory of physics is, among other things, a set of laws and a set of standard ways to link its statements to the real, physical world. A theory might link the variable “t” to time as measured with a standard clock, and link the constant “M” to the known mass of the Earth. In general, the mathematics in mathematical physics is used to create mathematical representations of real entities and their states and behaviors. That is what makes physics an empirical science, unlike pure mathematics.
Do the laws of physics actually govern us? In Medieval Christian theology, the laws of nature were considered to be God’s commands, but today saying nature “obeys” scientific laws or that nature is “governed” by laws is considered by scientists to be a harmless metaphor. Scientific laws are called “laws” because they constrain what can happen; they imply this will happen and that will not. It was Pierre Laplace who first declared that the fundamental scientific laws should be hard and fast rules with no exceptions.
Are the basic laws of science just emergent patterns of the universe’s behavior that humans find useful, or do they somehow pre-exist the universe and act to bring the universe into existence? Philosophers’ positions on laws divide into two camps, Humean and anti-Humean. Anti-Humeans consider scientific laws to bring nature forward into existence, as if the laws were causal agents. Some anti-Humeans agree with Aristotle that whatever happens does so because parts of the world have essences and natures, and the laws describe these essences and natures. This position is commonly accepted in the manifest image. Humeans, on the other hand, consider scientific laws simply to be useful patterns among the mosaic of events that very probably will hold in the future. The patterns summarize the behavior of nature; they do not “lay down the law for what must be.” In response to the question of why these patterns and not other patterns, some Humeans say they are the patterns described with the most useful concepts for creatures with brains like ours (and other patterns might be more useful for extraterrestrials). More physicists are Humean than anti-Humean. More philosophers are anti-Humean than Humean.
All fundamental laws of relativity theory are time-reversible. Time-reversibility implies the fundamental laws do not notice any difference between the future direction and the past direction. The second law of thermodynamics does notice this difference because it says entropy tends to increase toward the future; so the theory of thermodynamics is not time-reversible, but it is also not a fundamental theory. Time-reversibility fails for quantum measurements, an issue discussed in more detail in the section below on quantum mechanics.
Time-translation invariance is a meta-law that implies all instants are equivalent, that is, indistinguishable. This is not implying that if you bought an ice cream cone yesterday, you will buy one tomorrow, but only that yesterday’s laws are tomorrow’s laws. All experts agree that the invariance holds over short distances and times. A translation in time to a first moment would be to a special moment with no earlier moment, so there is at least one exception to the claim that all moments are indistinguishable. A deeper question is whether any of the laws we have now might change in the future. The default answer is “no,” but this is just an educated guess. Any evidence that a fundamental law fails will be treated by some physicists as evidence that it was never a law to begin with, while it will be treated by others as proof that time-translation invariance fails. Hopefully a future consensus will be reached one way or the other.
Epistemologically, the laws of physics are hypotheses that are helpful to hold and that have not been refuted. However, some laws are believed less strongly than others, and so are more likely to be changed than others if future observations indicate a change is needed. The laws that are held most strongly in this sense are the Second Law of thermodynamics and the laws of general relativity and quantum mechanics.
Regarding the divide between science and pseudoscience, the leading answer is that:
what is really essential in order for a theory to be scientific is that some future information, such as observations or measurements, could plausibly cause a reasonable person to become either more or less confident of its validity. This is similar to Popper’s criteria of falsifiability, while being less restrictive and more flexible (Dan Hooper).
a. The Core Theory
Some physical theories are fundamental, and some are not. Fundamental theories are foundational in the sense that not all their laws can be derived from the laws of other physical theories even in principle. For example, the second law of thermodynamics is not fundamental, nor are the laws of plate tectonics in geophysics or the law of natural selection in biology despite their being critically important to their respective sciences. The following two theories are fundamental in physics: (i) the general theory of relativity, and (ii) quantum mechanics. Their amalgamation is what Frank Wilczek called the Core Theory, the theory of everything physical except gravity. More specifically, the quantum mechanics used here is a version of quantum field theory that includes the Standard Model of elementary particle physics that describes the behavior of fundamental particles and all their forces or interactions other than gravity.
Nearly all scientists believe this Core Theory holds not just in our solar system, but all across the universe, and it held yesterday and will hold tomorrow. Wilczek claimed:
[T]he Core has such a proven record of success over an enormous range of applications that I can’t imagine people will ever want to junk it. I’ll go further: I think the Core provides a complete foundation for biology, chemistry, and stellar astrophysics that will never require modification. (Well, “never” is a long time. Let’s say for a few billion years.)
This implies one could think of biology as applied quantum theory.
The Core Theory does not include the big bang theory, which is the most favored theory of cosmology. The Core Theory also does not use the terms time’s arrow or now. The concept of time in the Core Theory is primitive or “brute.” It is not definable.
What physicists do not yet understand is the collective behavior of the particles of the Core Theory—such as why some humans get cancer and others do not. But it is believed by nearly all physicists that however this collective behavior does get explained, doing so will not require any revision in the Core Theory, and its principles will underlie any such explanation. Reductionists in physics would say other theories of physics can be reduced to the Core Theory, but not vice versa.
The key claim is that the Core Theory can be used in principle to adequately explain the behavior of people, galaxies, leaves, and molecules. The hedge phrase “in principle” is important. One cannot replace it with “in practice” or “practically.” Practically there are many limitations on the use of the Core Theory. Here are some of the limitations. Leaves are too complicated. There are too many layers of emergence needed from the level of the Core Theory to the level of leaf behavior. Also, there is a margin of error in any measurement of anything. There is no way to acquire the leaf data precisely enough to deduce the exact path of a specific leaf falling from a certain tree 300 years ago. Even if this data were available, the complexity of the needed calculations would be prohibitive. Commenting on these various practical limitations for the study of galaxies rather than leaves, the cosmologist Andrew Pontzen said “Ultimately, galaxies are less like machines and more like animals—loosely understandable, rewarding to study, but only partially predictable.”
The Core has been tested in many extreme circumstances and with great sensitivity, so physicists have high confidence in it. There is no doubt that for the purposes of doing physics the Core Theory provides a demonstrably superior representation of reality to that provided by its alternatives.
But all physicists know the Core is not strictly true and complete, and they know that some features will need revision—revision in the sense of being modified or extended. Physicists are motivated to discover how to revise it because such a discovery can lead to great praise from the rest of the physics community. Nobel Prizes would be won. Wilczek says the Core will never need modification for understanding (in principle) the special sciences of biology, chemistry, stellar astrophysics, computer science and engineering, but he would agree that the Core needs revision in order to adequately explain why 95 percent of the universe consists of dark matter and dark energy, why the universe has more matter than antimatter, why neutrinos change their identity over time, and why the energy of empty space is as small as it is. One philosophical presupposition here is that the new Core Theory should be a single, logically consistent theory. The theoretical physicist Sabine Hossenfelder has challenged this assumption, saying it has the epistemological status of a hope.
The Core Theory presupposes that time exists, that it is a feature of spacetime, and that spacetime is more fundamental than time. Within the Core Theory, relativity theory allows space to curve, ripple, and expand; and this curving, rippling, and expanding can vary from one time to another and from one place to another. Space could even have the shape of a donut (a torus). Quantum theory does not allow any of these features, although a future revision of quantum theory within the Core Theory is expected to allow all of them.
In the Core Theory, the word time is a theoretical term, and time is treated somewhat like a single dimension of space. Informally, space is the set of all point-locations, and time is the set of all point-times, the instants. Spacetime is the set of all point-events. Spacetime is presumed to have a minimum of four dimensions and also to be a continuum, with time being represented as a distinguished, one-dimensional sub-space of spacetime. But time is not a spatial dimension. Because the time dimension is so different from a space dimension, physicists often say spacetime is (3+1)-dimensional rather than 4-dimensional.
Both relativity theory and quantum theory presuppose that three-dimensional space is isotropic (rotation symmetric) and homogeneous (spatial-translation symmetric) and time-translation symmetric (yesterday’s laws are tomorrow’s laws). However, some results in the 21st century in cosmology cast doubt on this latter symmetry; there may be exceptions. Regarding all these symmetries, the laws need to obey the symmetries, but specific physical systems do not. For example, your body is a physical system that could become very different if you walk across the road at noon on Tuesday instead of Friday, even though the Tuesday physical laws are also the Friday laws.
The Core Theory presupposes that all dynamical laws should have the form of describing how a state of a system at one time turns into a different state at another time. This implies that taking into account the entire history of past states is not required to make a claim about what will happen next.
The Core Theory does not presuppose or explicitly mention consciousness. The typical physicist believes consciousness is contingent; it happens to exist but it is not a necessary feature of the universe. That is, consciousness happened to evolve because of fortuitous circumstances, but it might not have. Many philosophers throughout history have disagreed with this treatment of consciousness, especially the idealist philosophers of the 19th century.
[For the experts: More technically, the Core Theory is the renormalized, effective quantum field theory that includes both the Standard Model of particle physics and the weak field limit of Einstein’s General Theory of Relativity in which gravity is very weak and spacetime is almost flat, and no assumption is made about the character or even the existence of space and time below the Planck length and Planck time.]
2. Relativity Theory
Albert Einstein
Of all the theories of science, relativity theory has had the greatest impact upon our understanding of the nature of time. According to this theory, time can curve and stretch. Time is also strange because it has no independent, objective existence apart from a more fundamental entity, four-dimensional spacetime. Einstein’s theory rejects Newton’s assumptions that there are global instants of time and that points of space persist through time.
When the term relativity theory is used, it usually refers to the general theory of relativity of 1915, but sometimes it refers to the special theory of relativity of 1905, and sometimes it refers to both, so one needs to be alert to what is being referred to. Both theories are theories of time, among other entities. Both have been well-tested; and they are almost universally accepted among physicists as applying correctly to those situations in reality in which their assumptions are true. Today’s physicists understand them better than Einstein himself did. “Einstein’s twentieth-century laws, which—in the realm of strong gravity—began as speculation, became an educated guess when observational data started rolling in, and by 1980, with ever-improving observations, evolved into truth” (Kip Thorne). Strong gravity, but not too strong. In the presence of extremely strong gravity, such as found within a black hole, general relativity is believed to break down.
Special relativity is not a specific theory but rather a general framework for theories. General relativity is a generalization of special relativity that removes its restrictions to uncurved spacetime and to there being no gravitational forces.
Overall, the main difference between the two is that, in general relativity, spacetime does not simply exist passively as a background arena for events. Instead, spacetime is dynamical in the sense that changes in the distribution of matter and energy in any region of spacetime are directly related to changes in the curvature of spacetime in that region. John Wheeler summarized this point by saying that spacetime tells matter how to move and matter tells spacetime how to curve.
Einstein’s key equations in his general theory imply that energy and mass distort the geometry of spacetime, and as the distribution of energy and mass changes, so does the geometry. Although the Einstein field equations in his general theory:
are exceedingly difficult to manipulate, they are conceptually fairly simple. At their heart, they relate two things: the distribution of energy in space, and the geometry of space and time. From either one of these two things, you can—at least in principle—work out what the other has to be. So, from the way that mass and other energy is distributed in space, one can use Einstein’s equations to determine the geometry of that space. And from that geometry, we can calculate how objects will move through it (Dan Hooper).
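In standard notation, the relation Hooper describes is Einstein’s field equation, whose left side encodes the geometry of spacetime and whose right side encodes the distribution of energy and momentum:

```latex
% Einstein's field equations: geometry (left side) determined by the
% distribution of energy and momentum (right side).
G_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^{4}}\, T_{\mu\nu}
```

Here G_{μν} is the Einstein curvature tensor, g_{μν} the metric, Λ the cosmological constant, G Newton’s gravitational constant, and T_{μν} the stress-energy tensor describing how mass and other energy are distributed.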
Laplace was the first person to realize that Newton’s theory of mechanics requires the conservation of information over time, which in turn implies determinism. Einstein’s theory also has this feature, but quantum theory does not (as it is commonly interpreted).
An important assumption of general relativity theory (GR) is the principle of equivalence: gravity is basically acceleration. That is, gravitational forces cannot be distinguished from forces produced by acceleration. That is why, when you are in an elevator with the doors closed and that is accelerating away from Earth at the right rate, it can feel exactly like you are in an elevator that is not moving.
GR has many other assumptions that are usually never mentioned explicitly. One is that gravity did not turn off for three seconds during the year 1777 in Australia. A more general one is that the theory’s fundamental laws are the same regardless of what time it is. This feature is called time-translation invariance.
The special theory is inconsistent with Newton’s law of gravity. Einstein’s theory of general relativity was his solution to that problem of inconsistency.
The relationship between the special and general theories is slightly complicated. Both theories are about the motion of objects and both approach agreement with Newton’s theory the slower the speed of those objects, and the weaker the gravitational forces involved, and the lower the energy of those objects. General relativity implies the truth of special relativity in all infinitesimal regions of spacetime. General relativity holds in all reference frames, but special relativity holds only for inertial reference frames, namely non-accelerating frames. The frame does not accelerate, but objects in the frame are allowed to accelerate. Special relativity implies the laws of physics are the same for all inertial observers, that is, observers who are moving at a constant velocity relative to each other. ‘Observers’ in this sense are also the frames of reference themselves, or they are persons of zero mass and volume making measurements from a stationary position in a coordinate system. These observers need not be conscious beings.
Special relativity allows objects to have mass but not gravity. Also, it always requires a flat geometry—that is, a three-dimensional Euclidean geometry for space and a Minkowskian four-dimensional non-Euclidean geometry for spacetime. General relativity does not have those restrictions on geometry. And whereas special relativity is a framework for specific theories, general relativity is a very specific theory of gravity, or it is a specific theory if we add in a specification of the distribution of matter-energy throughout the universe. Both the special and general theory imply that Newton’s two main laws of F = ma and F = GmM/r² hold only approximately, and they hold better for slower speeds and weaker gravitational strengths.
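To see the Newtonian limit at work, one can check that F = GmM/r² reproduces the familiar surface gravity of the Earth. A sketch with approximate values for the constants:

```python
# Newton's F = GmM/r^2 in its weak-field, low-speed domain of validity:
# the acceleration it predicts at Earth's surface, a = GM/r^2, is the
# familiar 9.8 m/s^2.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (approximate)
M_earth = 5.972e24   # mass of the Earth, kg (approximate)
r_earth = 6.371e6    # mean radius of the Earth, m (approximate)

a = G * M_earth / r_earth**2  # roughly 9.8 m/s^2
```

In this regime the relativistic corrections are far smaller than the measurement error of everyday instruments, which is why Newton’s laws served so well for two centuries.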
General relativity is geometric. What this means is that when an artillery shell flies through the air and takes a curved path in space relative to the ground because of a gravitational force acting upon it, what is really going on is that the artillery shell is taking a geodesic or the straightest path of least energy in spacetime, which is a curved path as viewed from a higher space dimension. That is why gravity or gravitational attraction is not really a force but rather is a curvature of spacetime.
The theory of relativity is generally considered to be based on causality. What this means is that:
One can take general relativity, and if you ask what in that sophisticated mathematics is it really asserting about the nature of space and time, what it is asserting about space and time is that the most fundamental relationships are relationships of causality. This is the modern way of understanding Einstein’s theory of general relativity….If you write down a list of all the causal relations between all the events in the universe, you describe the geometry of spacetime almost completely. There is still a little bit of information that you have to put in, which is counting, which is how many events take place…. Causality is the fundamental aspect of time. (Lee Smolin).
(An aside for the experts: The general theory of relativity requires spacetime to have at least four dimensions, not exactly four dimensions. Technically, any spacetime, no matter how many dimensions it has, is required to be a differentiable manifold with a metric tensor field defined on it that tells what geometry it has at each point. General relativistic spacetimes are manifolds built from charts involving open subsets of R4. General relativity does not consider a time to be a set of simultaneous events that do or could occur at that time; that is a Leibnizian conception. Instead, general relativity specifies a time in terms of the light cone structures at each place. A light cone at a spacetime point specifies what events could be causally related to that point, not just what events are causally related to it.)
Relativity theory implies time is a continuum of instantaneous times that is free of gaps just as the mathematical line is free of gaps between points. This continuity of time was first emphasized by the philosopher John Locke in the late seventeenth century, but it is meant here in a more detailed, technical sense that was developed for calculus only toward the end of the 19th century.
According to both relativity theory and quantum theory, time is not discrete or quantized or atomistic. Instead, the structure of point-times is a linear continuum with the same structure as the mathematical line or the real numbers in their natural order. For any point of time, there is no next time because the times are packed together so tightly. Time’s being a continuum implies that there is a non-denumerably infinite number of point-times between any two non-simultaneous point-times. Some philosophers of science have objected that this number is “too large,” and that we should use Aristotle’s notion of potential infinity and not the late 19th century notion of a completed infinity. Nevertheless, accepting the notion of an actual nondenumerable infinity is the key idea used to solve Zeno’s Paradoxes and to remove inconsistencies in calculus, so for these reasons the number of point-times is not considered to be “too large.”
The fundamental laws of physics assume the universe is a collection of point events that form a four-dimensional continuum, and the laws tell us what happens after something else happens or because it happens. These laws describe change but do not themselves change. At least that is what laws are in the first quarter of the twenty-first century, but one cannot know a priori that this is always how laws must be. Even though the continuum assumption is not absolutely necessary for describing what we observe, so far it has proved to be too cumbersome to revise our theories in order to remove the assumption while retaining consistency with all our experimental data. Calculus has proven its worth.
No experiment has directly revealed the continuum structure of time. No experiment is so fine-grained that it could show point-times to be infinitesimally close together, although there are possible experiments that could show the assumption to be false if it were false and if the graininess of time were to be large enough.
Not only is there much doubt about the correctness of relativity in the tiniest realms, there is also uncertainty about whether it works differently on cosmological scales than it does at the scale of atoms, houses, and solar systems, but so far there are no rival theories that have been confirmed.
A rival theory intended to incorporate into relativity theory what is correct about the quantum realm is often called a theory of quantum gravity. Einstein claimed in 1916 that his general theory of relativity needed to be replaced by a theory of quantum gravity. The physics community generally agrees with him, but that theory has not been found so far. A great many physicists of the 21st century believe a successful theory of quantum gravity will require quantizing time. But this is just an educated guess. The majority view is that a quantum of gravity, the graviton particle, does exist, but it just has not yet been detected.
Regardless of whether there are or are not gravitons, the question remains as to whether there are or are not atoms of time. If there are, then there can be a next instant and a previous instant. It is conjectured that, if time were discrete, then a good estimate for a shortest duration is 10⁻⁴⁴ seconds, the so-called Planck time. The Planck time is the time it takes light to traverse one Planck length. No physicist can yet suggest a practical experiment that is sensitive to this tiny scale. For more discussion, see (Tegmark 2017).
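The Planck time quoted above comes from combining three fundamental constants, t_P = √(ħG/c⁵). A quick check, using approximate values:

```python
import math

# Planck time t_P = sqrt(hbar * G / c^5), roughly 5.4e-44 seconds: the
# scale conjectured above as a candidate shortest duration if time is discrete.
hbar = 1.0546e-34  # reduced Planck constant, J s (approximate)
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (approximate)
c = 2.998e8        # speed of light in a vacuum, m/s (approximate)

t_planck = math.sqrt(hbar * G / c**5)
```

For comparison, the best atomic clocks resolve durations around 10⁻¹⁹ seconds, which is some twenty-five orders of magnitude too coarse to probe this scale, in line with the article’s remark that no practical experiment is yet sensitive to it.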
The special and general theories of relativity imply that to place a reference frame upon spacetime is to make a choice about which part of spacetime is the space part and which is the time part. No choice is objectively correct, although some choices are very much more convenient for some purposes. This relativity of time is one of the most significant philosophical implications of both the special and general theories of relativity.
Since the discovery of relativity theory, scientists have come to believe that any objective description of the world can be made only with statements that are invariant under changes in the reference frame. That is why saying, “It occurred at noon” does not have a truth value unless a specific reference frame is implied, such as one fixed to Earth with time being the time that is measured by our civilization’s standard clock. This relativity of time to reference frames is behind the remark that Einstein’s theories of relativity imply time itself is not objectively real whereas spacetime is.
Regarding relativity to frame, Newton would say that if you are seated in a vehicle moving along a road, then your speed relative to the vehicle is zero, but your speed relative to the road is not zero. Einstein would agree. However, he would surprise Newton by saying the length of your vehicle is slightly different in the two reference frames, the one in which the vehicle is stationary and the one in which the road is stationary. Equally surprising to Newton, the duration of the event of your drinking a cup of coffee while in the vehicle is slightly different in those two reference frames. These relativistic effects are called space contraction and time dilation, respectively. Both length and duration are frame dependent and, for that reason, say physicists, they are not objectively real characteristics of objects. Neither are the shapes of objects. Speeds also are relative to reference frame, with one exception. The speed of light in a vacuum has the same value c in all frames that are allowed by relativity theory. Space contraction and time dilation change in tandem so that the speed of light in a vacuum is always the same number. Convincing evidence for time dilation was first discovered in 1938 by Ives and Stilwell.
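The sizes of time dilation and space contraction are both set by the Lorentz factor γ = 1/√(1 − v²/c²): a moving clock’s ticks are dilated by γ, and a moving rod is contracted by 1/γ. A minimal sketch (the function name `gamma` is an illustrative choice):

```python
import math

# Lorentz factor gamma = 1 / sqrt(1 - v^2/c^2). Durations measured in a
# frame where the clock moves at speed v are longer by this factor, and
# lengths along the direction of motion are shorter by the same factor.
c = 2.998e8  # speed of light in a vacuum, m/s (approximate)

def gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

g_car = gamma(30.0)       # highway speed: gamma barely exceeds 1
g_fast = gamma(0.87 * c)  # 87% of c: gamma is roughly 2
```

This is why the effects went unnoticed before the twentieth century: at everyday speeds the vehicle and coffee-drinking differences mentioned above are real but immeasurably small, while for particles moving near c they are dramatic.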
Another surprise for Newton would be to learn that relativity theory implies he was mistaken to believe in the possibility of arbitrarily high velocities. According to relativity theory, nothing that once went slower than the speed c (the speed of light in a vacuum) can go faster than c, regardless of the reference frame. This is an interesting fact about time because speed is distance per unit of time.
The constant c in the equation E = mc² is called “the speed of light,” but it is not really about light only. According to relativity theory, it is the maximum speed of any causal influence, light or no light. It is not quite correct to say that, according to relativity theory, nothing can go faster than the speed of light. The remark needs some clarification, else it is incorrect. Here are four ways to go faster than the speed of light. (1) The medium needs to be specified. c is the speed of light in a vacuum. The speed of light in certain crystals can be much less than c, say 40 miles per hour, and if so, then a racehorse outside the crystal could outrun the light beam in the crystal. (2) The limit c applies only to objects within space moving relative to other objects within space. The general theory of relativity places no restrictions on how fast an object can go if it has always been going faster than c. (3) General relativity allows space itself to expand, so that two clusters of galaxies can have a relative speed of recession greater than c if the intervening space expands sufficiently rapidly. Astronomers have established that our space is expanding and is accelerating its expansion, and they have detected galaxy clusters receding from us faster than c. This expansion is of the relative distance between galaxy clusters. Our solar system is not expanding, nor is the Milky Way, at least not yet. But in 100 billion years, the last star beyond our own galaxy will blink out of even the best telescopes. The problem is not that we would need more sensitive telescopes to detect the light, but rather that there would be no light to detect. Then the expansion of smaller structures will become significant. (4) When we say nothing can go faster than light, we need a precise definition of “nothing.” A shadow can go faster than c. Is it nothing?
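Point (3) can be made roughly quantitative with Hubble’s law, v = H₀d: recession speed grows with distance, so beyond the “Hubble distance” c/H₀ galaxies recede faster than c. The value of H₀ used below (about 70 km/s per megaparsec) is an approximation, not from the article:

```python
# Hubble's law v = H0 * d. Beyond the Hubble distance d = c / H0, the
# recession speed due to the expansion of space exceeds c.
c = 2.998e5          # speed of light, km/s (approximate)
H0 = 70.0            # Hubble constant, km/s per megaparsec (approximate)
mpc_in_ly = 3.262e6  # light-years per megaparsec (approximate)

hubble_distance_mpc = c / H0
hubble_distance_gly = hubble_distance_mpc * mpc_in_ly / 1e9  # ~14 billion ly
```

No object crosses space faster than c here; the space between the clusters is what grows, which is why this is not a violation of the relativistic speed limit.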
Technically expressed, Einstein’s point above is that no event has causes or effects outside the event’s backward or forward light cones. That is, no event is space-like separated from its causes or effects.
Some physicists believe Einstein is mistaken about this, and they believe the assumption that nothing goes faster than light will eventually be shown to be false because, in order to make sense of Bell’s Theorem in quantum theory, two entangled particles must be able to affect each other faster than this, perhaps instantaneously. The majority of physicists are unconvinced.
Perhaps the most philosophically controversial feature of relativity theory is that it allows great latitude in selecting the classes of simultaneous events, as shown in this diagram. Because there is no single objectively-correct frame to use for specifying which events are present and which are past—but only more or less convenient ones—one philosophical implication of the relativity of time is that it seems to be easier to defend McTaggart’s B-theory of time and more difficult to defend McTaggart’s A-theory. The A-theory implies the temporal properties of events such as “is happening now” or “happened two weeks ago” are intrinsic to the events and are objective, frame-free properties of those events. So, the relativity to frame makes it difficult to defend absolute time and the A-theory.
Relativity theory challenges other ingredients of the manifest image of time. For two point-events A and B, common sense says they either are simultaneous or not, but according to relativity theory, if A and B are distant enough from each other and occur close enough in time to be within each other’s absolute elsewhere, then event A can occur before event B in one reference frame, but after B in another frame, and simultaneously with B in yet another frame. To make the same point in other terminology, for two events that are spacelike separated, there is no fact of the matter regarding which occurred before which. Their temporal ordering is indeterminate. In the language of McTaggart’s A and B theory, unlike for the A-series ordering of events, there are multiple B-series orderings of events, and no single one is correct. No person before Einstein ever imagined time is so strange. Not all temporal ordering is relative, though, only the temporal ordering of events that are spacelike separated, so neither of the two events could have caused the other, even partially.
The special and general theories of relativity provide accurate descriptions of the world when their assumptions are satisfied. Both have been carefully tested. One of the simplest tests of special relativity is to show that the characteristic half-life of a specific radioactive material is longer when it is moving faster.
The special theory does not mention gravity, and it assumes there is no curvature to spacetime, but the general theory requires curvature in the presence of mass and energy, and it requires the curvature to change as their distribution changes. The presence of gravity in the general theory has enabled the theory to be used to explain phenomena that cannot be explained with either special relativity or Newton’s theory of gravity or Maxwell’s theory of electromagnetism.
The equations of general relativity are much more complicated than are those of special relativity. To give one example of this, the special theory clearly implies there is no time travel to events in one’s own past. Experts do not agree on whether the general theory has this same implication because the equations involving the phenomena are too complex for them to solve directly. A slight majority of physicists do believe time travel to the past is allowed by general relativity. Because of the complexity of Einstein’s equations, all kinds of tricks of simplification and approximation are needed in order to use the laws of the theory on a computer for all but the simplest situations. Approximate solutions are a practical necessity.
Regarding curvature of time and of space, the presence of mass at a point implies intrinsic spacetime curvature at that point, but not all spacetime curvature implies the presence of mass. Empty spacetime can still have curvature, according to general relativity theory. This unintuitive point has been interpreted by many philosophers as a good reason to reject Leibniz’s classical relationism. That claim was first made by Arthur Eddington.
Two accurate, synchronized clocks do not stay synchronized if they undergo different gravitational influences. This is a second kind of time dilation, in addition to dilation due to speed. So, a clock’s time depends on the clock’s history of both speed and gravitational influence. Gravitational time dilation would be especially apparent if a clock were to approach a black hole. The rate of ticking of a clock approaching the black hole slows radically upon approach to the horizon of the hole as judged by the rate of a clock that remains safely back on Earth. This slowing is sometimes described as “time slowing down,” but this metaphor misleadingly suggests that time itself has a rate, which it does not. After a clock falls through the event horizon, it can no longer report its values to a distant Earth, and when it reaches the center of the hole not only does it stop ticking, but it also reaches the end of time, the end of its proper time.
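For a clock hovering outside a non-rotating black hole, the slowdown relative to a distant clock follows the Schwarzschild factor √(1 − r_s/r), where r_s is the radius of the event horizon. A small Python sketch of that factor (the function name is illustrative):

```python
import math

def gravitational_tick_rate(r, schwarzschild_radius):
    """Ticking rate of a clock hovering at radial coordinate r, relative
    to a far-away clock, from the Schwarzschild solution of general
    relativity: sqrt(1 - r_s/r). The rate approaches zero as the clock
    approaches the event horizon at r = r_s."""
    return math.sqrt(1.0 - schwarzschild_radius / r)

# The closer the hovering clock is to the horizon (r_s = 1 here, in
# arbitrary units), the slower it ticks as judged from far away.
for r in (10.0, 2.0, 1.1, 1.001):
    print(r, gravitational_tick_rate(r, 1.0))
```

The factor tends to zero at the horizon, which is why the distant observer judges the infalling clock’s ticking to slow radically.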
The general theory of relativity has additional implications for time. It implies that spacetime can curve or warp locally or cosmically, and it can vibrate or jiggle. Whether it curves into a real fourth spatial dimension is unknown, but it definitely curves as if it were curving into such an extra dimension. Here is a common representation of the situation which pictures our 3D space from the outside reference frame as being a 2D curved surface that ends infinitely deep at a point of infinite mass density, the hole’s singularity.
Representation of a 2D black hole
This picture is helpful in many ways, but it can also be misleading because space need not really be curved into this extra dimension that goes downward in the diagram, but the 2D space does really need to be compacted more and more as one approaches the black hole’s center. That is, the picture implies there are more dimensions than there really are.
Let’s explore the microstructure of time in more detail, beginning with the distinction between continuous and discrete time. In the mathematical physics that is used in both relativity theory and quantum theory, the ordering of instants by the happens-before relation of temporal precedence is complete in the sense that there are no gaps in the sequence of instants. Any interval of time is a continuum, so the points of time form a linear continuum. Unlike physical objects, physical time and physical space are believed to be infinitely divisible—that is, divisible in the sense of the actually infinite, not merely in Aristotle’s sense of potentially infinite. Regarding the density of instants, the ordered instants are so densely packed that between any two there is a third so that no instant has a very next instant. Regarding continuity, time’s being a linear continuum implies that there is a nondenumerable infinity of instants between any two non-simultaneous instants.
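The density claim (between any two instants there is a third, so no instant has a very next instant) can be illustrated with exact rational arithmetic. Note that this little Python sketch illustrates density only, not the stronger claim of nondenumerable continuity:

```python
from fractions import Fraction

# Density of instants, modeled with exact rational numbers: between any
# two distinct instants there is always a third, e.g. the midpoint, so
# no instant has an immediate successor. The halving never bottoms out.
a = Fraction(0)
b = Fraction(1, 10 ** 21)  # two instants one zeptosecond apart
midpoint = (a + b) / 2
print(a < midpoint < b)  # True, and this can be repeated without end
```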
The actual temporal structure of events can be embedded in the real numbers, at least locally, but how about the converse? That is, to what extent is it known that the real numbers can be adequately embedded into the structure of the instants, at least locally? This question is asking for the justification of saying time is not atomistic. The problem here is that the shortest duration ever measured is about 250 zeptoseconds. A zeptosecond is 10⁻²¹ second. For times shorter than about 10⁻⁴³ second, which is the physicists’ favored candidate for the duration of an atom of time, science has no experimental grounds for the claim that between any two events there is a third. Instead, the justification of saying the reals can be embedded into the structure of the instants is that (i) the assumption of continuity is very useful because it allows the mathematical methods of calculus to be used in the physics of time; (ii) there are no known inconsistencies due to making this assumption; and (iii) there are no better theories available. The qualification earlier in this paragraph about “at least locally” is there in case there is time travel to the past. A circle is continuous and one-dimensional, but it is like the real numbers only locally.
One can imagine two empirical tests that would reveal time’s discreteness if it were discrete—(1) being unable to measure a duration shorter than some experimental minimum despite repeated tries, yet expecting that a smaller duration should be detectable with current equipment if there really is a smaller duration, and (2) detecting a small breakdown of Lorentz invariance. But if any experimental result that purportedly shows discreteness is going to resist being treated as a mere anomaly, perhaps due to there somehow being an error in the measurement apparatus, then it should be backed up with a confirmed theory that implies the value for the duration of the atom of time. This situation is an instance of the kernel of truth in the physics joke that no observation is to be trusted until it is backed up by theory.
The General Theory of Relativity implies gravitational waves will be produced by any acceleration of matter. Drop a ball from the Leaning Tower of Pisa, and this will shake space-time near the tower and produce ripples that will emanate in all directions from the Tower. The existence of these ripples was confirmed in 2015 by the LIGO observatory (Laser Interferometer Gravitational-Wave Observatory) when it detected ripples caused by the merger of two black holes.
In addition to relativity theory, the other fundamental theory of physics is quantum mechanics. It was created in the late 1920s. At that time, it was applied to particles and not to fields. In the 1970s, it was successfully applied to quantum fields via the new version of the theory called “quantum field theory.” The term “quantum mechanics” is now used to mean either the classical theory of the 1920s or the improved quantum theory that includes quantum field theory with its Standard Model of particle physics. Context is usually needed in order to tell what the term “quantum mechanics” refers to.
What kind of world is quantum mechanics describing for us? What does it imply about time? There is considerable agreement among the experts that quantum mechanics has deep implications about the nature of time, but there is considerable disagreement among the experts regarding what those implications are.
Time is treated as being a continuum in mainstream quantum mechanics, just as it is in relativity theory and Newtonian theory, but change over time is treated in quantum mechanics very differently than in all previous theories—because of (1) quantum discreteness and (2) apparently instantaneous and discontinuous collapse of the wave function during any measurement, a collapse from many possibilities to a single, actual one. Although this is a disputed ontological claim, many experts say the wave function gives a description of a physical system at a single time. It is analogous to what other theories call the state of the system, and during a measurement, according to quantum mechanics’ main interpretation, the wave function collapses or updates from a superposition of multiple possible states of the system to a single state with the measured variable having a single value. For example, an electronic system is created to be in a state that is a superposition of four states having 2, 3, 4, or 5 volts in a wire of a certain electrical device, and during the measurement the superposition collapses to a single state in which the voltage is 3 volts. The measurer then says the voltage was measured to be 3 volts with a probability of one, whereas the measurer might only be able to say before the measurement that the voltage had a probability of 1/4 of being 3 volts. According to the standard interpretation of quantum mechanics, there is nothing that fixes in advance the outcome of the measurement, so nature is truly random. In non-quantum theories, probabilities are always due to some hidden information not being available that would enable a deduction of the measured value. In quantum mechanics as standardly interpreted this hidden information does not exist.
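The voltage example can be modeled as a toy simulation: a superposition of four states with equal amplitudes, and a measurement that collapses it to one state chosen with the squared-amplitude probabilities. This is a schematic sketch, not real quantum dynamics; all names are illustrative:

```python
import random

# Toy superposition: four possible voltage states, each with amplitude
# 1/2, so each has squared-amplitude (Born) probability 1/4.
amplitudes = {2: 0.5, 3: 0.5, 4: 0.5, 5: 0.5}

def measure(amplitudes):
    """Collapse the superposition: choose one state at random with
    probability equal to the squared magnitude of its amplitude, and
    return it along with the post-measurement (collapsed) state."""
    states = list(amplitudes)
    probabilities = [abs(a) ** 2 for a in amplitudes.values()]
    outcome = random.choices(states, weights=probabilities, k=1)[0]
    return outcome, {outcome: 1.0}  # after measurement: probability one

voltage, collapsed_state = measure(amplitudes)
print(voltage, collapsed_state)
```

Before the measurement, each voltage has probability 1/4; after it, the measured voltage has probability one, and nothing in the model fixes the outcome in advance.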
Quantum physicists have learned from the behavior of certain kinds of physical systems how to assign to a new system either an exact wave function or a limited range of wave functions. Then they apply the laws of quantum mechanics to that wave function or the range in order to calculate the probabilities that the system will evolve into other states at a later time. So, unlike relativity which, when applied to a physical system, will make a precise prediction about what will happen, quantum mechanics makes only statistical predictions.
The three strangest features of quantum mechanical phenomena are its randomness, superposition, and entanglement. We do not notice any of these strange phenomena in our ordinary lives outside a physics laboratory, but they underlie our reality. The following sections explore these phenomena and other aspects of quantum mechanics.
a. Quantum Leaps, Quantum Waves, and Duality
Quantum mechanics is often said to be our theory of small things, and relativity is our theory of large things. These are crude remarks since quantum mechanics applies to everything, but in their day-to-day work most physicists do not need to think about quantum mechanics if they are considering phenomena larger than a nanometer.
Quantum mechanics says many phenomena are discrete, but not all. This discreteness is not shown directly in the equations, but rather in two other ways.
(1) Quantum mechanics represents every physical system as a wave, even an atom or an airplane, but for any wave there is a smallest possible amplitude it can have, called a “quantum.” Smaller amplitudes simply do not occur. Classical physics does not have this intuitively odd feature. As Hawking quipped: “It is a bit like saying that you can’t buy sugar loose in the supermarket, it has to be in kilogram bags.”
(2) The possible solutions to some of the equations of quantum mechanics form a discrete set, not a continuous one. For example, certain variables, such as the energy of an electron bound within an atom, are allowed by the equations to take only certain discrete values, so a change from one allowed value to another happens in minimum discrete steps. Changing by a single step is sometimes called a “quantum jump” or “quantum leap.” When applying the quantum equation to a world containing only a single electron bound to a hydrogen atom, the solutions imply the electron can have -13.6 electron volts of energy or -3.4 electron volts of energy, but no value between those two. This illustrates how energy levels are quantized. However, in the equation, the time variable is continuous.
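The two hydrogen energy values quoted here come from the standard formula E_n = −13.6/n² electron volts for the allowed levels, where n is a positive whole number. A minimal sketch:

```python
def hydrogen_energy_level(n):
    """Allowed (quantized) energy, in electron volts, of the electron in
    a hydrogen atom, from the standard result E_n = -13.6 / n**2.
    Only these discrete values occur; nothing in between is allowed."""
    return -13.6 / n ** 2

print(hydrogen_energy_level(1))  # -13.6 eV, the ground state
print(hydrogen_energy_level(2))  # -3.4 eV, the first excited state
```

Because n can only be a whole number, the electron’s energy can jump from one of these values to another but can never take a value in between.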
The variety of phenomena that quantum mechanics can be used to successfully explain is remarkable. For four examples, it explains: (1) why you can see through a glass window but not a potato, (2) why the Sun has lived so long without burning out, (3) why atoms are stable so that the negatively-charged electrons of an atom do not spiral into the positively-charged nucleus, and (4) why the periodic table of elements has the structure it has. Without quantum mechanics, these four facts (and many others) must be taken to be brute facts of nature.
Quantum mechanics is our most successful theory in all of science. One especially important success is that the theory has been used to predict the measured value of the anomalous magnetic moment of the electron extremely precisely and accurately. This value is a measure of how much the electron wobbles when traveling through a magnetic field. The predicted value, expressed in terms of a certain number g, is the real number:
g/2 = 1.001 159 652 180 73…
The measured value agrees with this predicted value to this many decimal places. No similar feat of precision and accuracy can be accomplished by any other theory of science.
Under the right physical conditions such as when dealing with a large object (a macroscopic object), quantum theory gives the same results as classical Newtonian theory. In 1927, Paul Ehrenfest first figured out how to deduce Newton’s second law of mechanics, F = ma, from the corresponding equation in quantum mechanics, namely the Schrödinger equation. That is, he showed under what conditions you can get classical mechanics from quantum mechanics.
The ontology of a theory is what it says exists fundamentally. Regarding the effect of quantum theory on ontology, the majority viewpoint among philosophers of physics in the twenty-first century is that potatoes, galaxies, and brains are fairly stable patterns over time of interacting quantized fields. So, every entity except these fields is an emergent entity. Also, the multi-decade debate about whether an electron is a point object or instead an object with a small, finite width has been settled by quantum field theory. It is neither. An electron takes up all of space. It is a “bump” or “packet of waves” with a narrow peak that trails off to trivially lower and lower amplitude throughout the electron field; that field fills all of space. A sudden disturbance in a field will cause wave packets to form, thus permitting particle creation in the field. Until quantum field theory was created, particle creation was a mystery.
So, quantum mechanics provides a new description of a fundamental particle. A particle is understood in quantum field theory to be a wave packet of a wave that vibrates a million billion times every second and has a somewhat localized peak in amplitude but has nearly zero amplitude throughout the rest of space. If we use a definition that requires a fundamental particle to be an object with a precise, finite location, then quantum mechanics now implies there are no fundamental particles. For continuity with the past, particle physicists still do call themselves “particle physicists” and do say they study “particles”; but they know this is not what is really going on. The term is not intended to be taken literally, nor used in the informal sense of ordinary language. Similarly, the term “forces” is commonly-used by physicists even though they know the word does not mean what it means in ordinary discourse; a force is really an interaction involving complex particle exchange. Very often, though, the force-talk and particle-talk are useful because the terms are adequate for many scientific purposes and because their use avoids complexities that do not need to be made explicit at the moment.
Scientists sometimes say “Almost everything physical is made of quantum fields.” The hedge word “almost” is there because they mean everything physical except gravity. No treatment of gravity in terms of quantum fields has been successful.
The principal scientific problem about quantum mechanics is that it is consistent with special relativity but inconsistent with general relativity, which is our theory of gravity, yet physicists have a high degree of trust in all these theories. For one example of the inconsistency, general relativity theory implies black holes have point singularities, and quantum theory implies they do not. For another example, general relativity implies black holes are black, and quantum theory implies they shine with Hawking radiation.
Quantum mechanics is well tested and relatively well understood mathematically, yet it is not well understood intuitively or informally or philosophically or conceptually. Its requirements on reality are very difficult to visualize. This is what Richard Feynman, one of the founders of quantum field theory, meant when he said he did not really understand his own theory. Some commentators draw the conclusion from this that human language is not a very good descriptive tool for nature.
Surprisingly, because of competing interpretations, physicists still do not agree on the exact formulation of the theory and how it should be applied to the world. They do not agree on what quantum theory’s axioms would be if it were to be axiomatized. So, this failure stands in the way of solving problem 6 of David Hilbert’s list of 23 problems in mathematics that he said in 1900 need to be solved in coming centuries, namely the problem of formalizing all accepted scientific theories.
Quantum mechanics has many interpretations, but there is a problem. “New interpretations appear every year. None ever disappear,” joked physicist N. David Mermin, although the joke has a point. This article describes only four of the many different interpretations: the Copenhagen Interpretation, the Hidden Variables Interpretation, the Many-Worlds Interpretation, and the Objective Collapse Interpretation.
The Copenhagen Interpretation has a strong plurality of supporters, about one-third, but not a majority. It is the “classical” interpretation. The four interpretations are proposed answers to the question, “What is really going on?” Because these interpretations have different physical principles and can make different experimental predictions, they actually are competing theories. Each theory here is a theory in the philosopher’s sense of the term “theory,” but each is actually a family of specific theories of physics. That is, each is a sketch of how to build a more specific, precise theory. Some are sketchier than others.
During the 20th century, most physicists resisted the need to address the question “What is really going on in quantum mechanics?” Their mantra was “Shut up and calculate,” and do not explore the philosophical questions involving quantum mechanics. Abandon thoughts about ontology. For example, quantum mechanics has nothing to say about how to answer the question “What is time?” Discussion of the philosophical questions did not appear in college textbooks. Turning away from this head-in-the-sand approach, Andrei Linde said, “We [theoretical physicists] need to learn how to ask correct questions, and the very fact that we are forced right now to ask…questions that were previously considered to be metaphysical, I think, is to the great benefit of all of us.”
If we want to use quantum mechanics to describe the behavior of a system of particles over time, then we start with the system’s initial state such as its wave function Ψ(x,t) for each point x of space for the initial instant t and then compute the wave function for other places and times. When a measurement is made, we use the resulting value to update the wave function. It is an open question in ontology whether the wave function is a direct description of reality rather than merely a mathematical tool.
An individual electron can have its own wave function, and so can a trio of particles, and a device, and even the universe as a whole; it all depends on what one considers to be the system being studied.
Max Born, one of the fathers of quantum mechanics, first suggested interpreting these quantum waves not literally as waves in space and time but rather as waves of probability. Stephen Hawking explained it this way:
In quantum mechanics, particles don’t have well-defined positions and speeds. Instead, they are represented by what is called a wave function. This is a number at each point of space. The size of the wave function gives the probability that the particle will be found in that position. The rate at which the wave function varies from point to point gives the speed of the particle. One can have a wave function that is very strongly peaked in a small region. This will mean that the uncertainty in position is small. But the wave function will vary very rapidly near the peak, up on one side and down on the other. Thus the uncertainty in the speed will be large. Similarly, one can have wave functions where the uncertainty in the speed is small but the uncertainty in the position is large.
The wave function contains all that one can know of the particle, both its position and its speed. If you know the wave function at one time, then its values at other times are determined by what is called the Schrödinger equation. Thus one still has a kind of determinism, but it is not the sort that Laplace envisaged (Hawking 2018, 95-96).
Given a wave function at one time, we insert this into the Schrödinger wave equation that says how the wave function changes over time. That equation is the partial differential equation:

iħ ∂Ψ/∂t = HΨ

Here i is the square root of negative one, ħ (h-bar) is Planck’s constant divided by 2π, and H is the Hamiltonian operator on Ψ. This Schrödinger wave equation is the quantum version of Newton’s laws of motion, and it indicates the rate of change of the system and what it changes into. Knowing the Hamiltonian of the quantum mechanical system is analogous to knowing the forces involved in a system obeying Newtonian mechanics. The abstract space (or arrangement) of all possible wave functions is called Hilbert space.
The probabilities are computed by “squaring” the wave function. In our example, the state Ψ can be used to show the probability p(x,t) that a certain particle will be measured to be at place x at a future time t, if a measurement were to be made, where
p(x,t) = Ψ*(x,t)Ψ(x,t).
The values of psi are complex numbers. The asterisk designates the complex conjugate, so the “squaring” is an exotic squaring, but let’s not delve any more into the mathematical details. This equation is called the Born Rule. That rule connects the abstract wave function to actual probabilities of measurements of the system’s behavior. It connects theory to nature.
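The “exotic squaring” of the Born Rule can be shown in a couple of lines of Python, using Python’s built-in complex numbers (the function name is illustrative):

```python
def born_probability_density(psi_value):
    """Born Rule at a point: multiply the complex value of the wave
    function by its complex conjugate. The imaginary parts cancel, so
    the product is always a non-negative real number, as a probability
    density must be."""
    return (psi_value.conjugate() * psi_value).real

psi_value = complex(0.6, 0.8)  # a sample complex amplitude
print(born_probability_density(psi_value))  # 0.6**2 + 0.8**2, i.e. 1
```

Although psi itself is complex and so cannot be a probability, the conjugate product is real and non-negative, which is what lets it be interpreted as a probability density.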
Experimentally, the wave function can be sampled, but not measured overall. For that reason, the wave function used in practice is always an estimate. The formulation of the function Ψ has been improved since the days of Schrödinger because of advances in creating quantum field theory and its Standard Model of particle physics. But the Schrödinger equation itself has not changed.
An important feature of the quantum state Ψ is that you, the measurer, cannot measure it without disturbing it and altering its properties. “Without disturbing it” means “without collapsing the wave function.” Also, on most interpretations of quantum mechanics (but not on the Bohm interpretation), fundamental particles are considered to be waves, or, to speak more accurately, they are considered to be “wavicles,” namely entities that have both a wave and a particle nature, but which are never truly either because the two properties are mutually exclusive. This dual feature of nature is called “wave-particle duality.”
The electron that once was conceived to be a tiny particle orbiting an atomic nucleus is now better conceived as something larger and not precisely defined spatially; the electron is a cloud that completely surrounds the nucleus, a cloud of possible places where the electron is most likely to be found if it were to be measured. The electron or any other particle is no longer well-conceived as having a sharply defined trajectory. A wave cannot have a single, sharp, well-defined trajectory. The location and density distribution of the electron cloud around an atom is the product of two opposite tendencies: (1) the electron wave “wants” to spread out away from the nucleus just as a water wave wants to spread out away from the point where the stone fell into the pond, and (2) the electron-qua-particle is a negatively-charged particle that “wants” to reach the positive electric charge of the nucleus because opposite charges attract. However, it is generally very difficult for scientists to map the states of a system’s wave function.
b. The Copenhagen Interpretation and Complementarity
Niels Bohr (1885-1962)
The Copenhagen Interpretation has become the orthodox interpretation of quantum mechanics. Nevertheless, it is vague. It contains an assortment of beliefs about what physicists are supposed to do with the mathematical formalism of quantum mechanics and how they should conceptualize its features. This classical interpretation of quantum mechanics was created by Niels Bohr and Werner Heisenberg and their colleagues in the late 1920s. It is called the Copenhagen Interpretation because Bohr and Heisenberg taught at the University of Copenhagen. According to many of its advocates, it implies that time reversibility, determinism, the conservation of information, locality, and realism (the thesis that the world is real independently of its being observed) all fail. The creators of the Copenhagen Interpretation such as Born, Bohr, Heisenberg, von Neumann, Pauli, and Wigner did not agree with each other about what is really going on in a world described by quantum mechanics, but this historical dispute is not described below in any detail.
Let’s consider how a simple experiment might reveal why we philosophers and physicists should understand the world in this new way. Thomas Young’s double-slit experiment convinced physicists to believe that light is a wave and not, as Isaac Newton believed, a beam of particles. In the famous quantum version of Young’s double-slit experiment, electrons all having the same energy and being what we now call “coherent” are repeatedly shot toward and through two adjacent, parallel slits in an otherwise impenetrable metal plate. Here is a simplistic diagram of the experimental set up, giving an aerial view of the electrons (black dots) with straight lines showing the approximate paths taken:
The diagram is a bird’s eye view of electrons passing through two slits and then hitting an optical screen that is behind the screen with the two slits. The optical screen is shown twice, first on the right in an aerial view and then farther to the right in a full frontal view as if it were being viewed from the two slits. The full frontal view shows two jumbled rows on the right where the electrons have collided with the optical screen. The optical screen that displays the dots behind the plate is similar to a computer monitor that displays a pixel-dot when and where an electron collides with it. Think of it as a position measuring device. Bullets, pellets, and sand grains would produce a similar pattern.
What is especially interesting is that the electrons behave differently if someone observes which slits they passed through. When observed, the electrons create the pattern shown above, but when not observed they leave the pattern shown below:
When unobserved, the impacts build up over time into a pattern of many alternating dark and bright bands on the screen. This pattern is very similar to the pattern obtained by diffraction of light waves or water waves. This suggests the electron is behaving like a wave that went through both slits at once. The incoming wave exits the slits as two waves that interfere either constructively or destructively. Later when one wave’s trough interferes with another wave’s peak at the screen, no dot is produced. When two crests meet at the screen, there is constructive interference, and the result is a dot. There are multiple, parallel stripes of dots produced along the screen, but only five are shown in the diagram. Stripes farther from the center of the screen are dimmer. Waves have no problem going through two or more slits simultaneously, but classical particles cannot behave this way. Because the collective electron behavior over time looks so much like optical wave diffraction, this is considered to be definitive evidence of electrons behaving as waves. The same pattern of results occurs if neutrons or many other particles are used in place of electrons.
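The alternating bright and dark bands follow from adding the two slits’ wave amplitudes before squaring. A schematic Python sketch with unit amplitudes (names are illustrative; in a real experiment the slit geometry and wavelength fix the phase difference at each screen point):

```python
import cmath
import math

def two_slit_intensity(phase_difference):
    """Add the two slits' wave amplitudes, then square the magnitude:
    |1 + e^(i*phi)|**2, which works out to 4*cos(phi/2)**2. Crest
    meeting crest gives 4 (a bright band); crest meeting trough
    gives 0 (a dark band)."""
    total_amplitude = 1 + cmath.exp(1j * phase_difference)
    return abs(total_amplitude) ** 2

print(two_slit_intensity(0.0))      # in phase: constructive, bright
print(two_slit_intensity(math.pi))  # out of phase: destructive, dark
```

Because the phase difference varies smoothly across the screen, the intensity oscillates between these extremes, producing the stripes of dots that build up over many electron impacts.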
The other remarkable feature of this experiment is that the pattern of interference is produced even when the electrons are shot one at a time at the plate several seconds apart. Does an electron know what its earlier fellow electrons did so it can act to produce the proper pattern on the screen?
Philosophers of physics do not agree on what is really going on here. A minority view, sometimes called “quantum Bayesianism,” says quantum theory does not tell you what is happening in nature but only what you can know about nature. This position rejects the standard philosophical position that science describes or represents the physical world. The Copenhagen Interpretation’s most extreme position is that what is real are only measurement outcomes.
A less extreme and now highly favored explanation of the double-slit experiment says what is happening is that we are seeing evidence of “wave-particle duality,” namely that a single electron itself has both wave and particle properties. When an electron is unobserved, it is a wave that is an extended object in many places at once, but when it is observed it is an unextended particle having a single, specific location. This mix of two apparently incompatible properties (wave properties and particle properties) is called a “duality,” and the electron is said to behave as a “wavicle.”
Thinking more about the duality problem, Bohr envisioned nature as obeying a principle of complementarity in which there are complementary pairs of properties. When a measurement is made of one property of the pair, as this measurement becomes precise any measurement of the complementary property must become less precise. The main example was of position and velocity of an electron, but energy and time are also complementary properties. Heisenberg had proved theorems that are now called “Heisenberg’s Uncertainty Principles,” and Bohr said these expressed symbolically the complementarity feature of nature. The psychologist Jerome Bruner reported that Bohr discovered his complementarity principle when, after his son was caught stealing a pipe and confessed, Bohr realized the impossibility of simultaneously loving his son and wanting justice for the pipe owner. Brooding upon that problem, he thought of the gestalt switch required as one sees a black vase or instead a pair of white human faces in the famous figure-ground trick graphic below,
and then “the impossibility of thinking simultaneously about the position and the velocity of a particle occurred to him.” Generalizing upon this impossibility, Bohr produced his principle of complementarity. However, experiments in 1979 first showed that wave–particle duality allows a wavicle to have different ratios of being a particle to being a wave, depending on the situation.
In the first half of the twentieth century, influenced by Logical Positivism which was then dominant in analytic philosophy, some advocates of the Copenhagen interpretation said quantum mechanics shows that our belief that there is something a physical system is doing when it is not being observed is meaningless. In other words, a fully third-person perspective on nature is impossible. To explain the double-slit experiment, Niels Bohr adopted an anti-realist stance by saying there is no determinate, unfuzzy way the world is when it is not being observed. There is only a cloud of possible values for each property of the system that might be measured. So, there is no place where the electron is when it is not being observed; there are just electron “clouds.”
Nobel Prize winning physicist Eugene Wigner promoted the more extreme claim that there exists a determinate, unfuzzy reality only when a conscious being is observing it. This prompted some philosophers to ask for clarity about which beings count as being conscious and which do not. These claims of Wigner’s prompted an opponent of anti-realism, Albert Einstein, to ask a supporter of the Copenhagen Interpretation whether he really believed that the moon exists only when it is being looked at.
c. Superposition and Schrödinger’s Cat
In the two-slit experiment, the Copenhagen Interpretation implies that before a measurement is made the experiment’s comprehensive state is a simultaneous superposition or “sum” of two different comprehensive states, one in which the electron goes through the left slit and one in which it goes through the right slit. This is not at all like a tree having a state of being tall and of being green. Those are not comprehensive states of a physical system. What is really going on mathematically is that the Schrödinger equation is the kind of equation in which, if there are two solutions to the equation, then the sum of the two solutions is also a solution. The word “superposition” was invented to give an intuitive idea of this mathematical fact. It is very misleading to describe the unobserved electron as being in two places at once as it goes through the two slits, but unfortunately, many popular books do describe it this way.
The previous paragraph said a comprehensive state becomes a simultaneous superposition or “sum” of two different comprehensive states. The sum is actually not a simple sum. It is a sum of various terms in which each term represents a possible state that will occur, say after a measurement, and each term is weighted by (multiplied by) an amplitude whose squared magnitude is the probability that the state designated by the term will be the outcome of the measurement. This is the exotic way that theory is tied to observation.
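The linearity behind the word “superposition” can be checked directly. Below is a minimal numerical sketch, not from the article itself: the grid sizes, wave packets, and units (ħ = m = 1) are illustrative choices. One explicit time step of a discretized free-particle Schrödinger equation is applied to two wave functions and to their sum, confirming that the sum of two solutions evolves exactly as the sum of their separate evolutions.

```python
# Illustrative check that the Schrodinger equation is linear:
# if psi1 and psi2 are solutions, so is psi1 + psi2.
# Free-particle equation, i dpsi/dt = -(1/2) d^2 psi/dx^2, hbar = m = 1.
import cmath

N, dx, dt = 200, 0.1, 0.0005

def laplacian(psi):
    # second spatial derivative by central differences (ends held at 0)
    return [0j] + [(psi[i-1] - 2*psi[i] + psi[i+1]) / dx**2
                   for i in range(1, N-1)] + [0j]

def step(psi):
    # one explicit Euler step: dpsi/dt = (i/2) * laplacian(psi)
    lap = laplacian(psi)
    return [psi[i] + dt * 0.5j * lap[i] for i in range(N)]

x = [(i - N//2) * dx for i in range(N)]
psi1 = [cmath.exp(-xi**2) for xi in x]               # a Gaussian bump
psi2 = [cmath.exp(-(xi - 2)**2 + 1j*xi) for xi in x]  # a moving bump

lhs = step([a + b for a, b in zip(psi1, psi2)])          # evolve the sum
rhs = [a + b for a, b in zip(step(psi1), step(psi2))]    # sum the evolutions
err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(err < 1e-12)  # the superposition of solutions is again a solution
```

The same check works for any linear equation; it is exactly this mathematical fact that the word “superposition” was invented to express.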
A wave function whose representation over space is a stretched-out wave, like a sine wave, suddenly becomes spike-shaped as the measurement is made. Because of this global collapse, the physical system changes its state discontinuously and so momentarily violates the Schrödinger equation.
Sympathetic to the realist attitude of Einstein’s that there are no superpositions and no intrinsic need for mentioning consciousness in describing the two-slit experiment, Erwin Schrödinger created his thought experiment in 1935 about a cat in a windowless box. He believed it should convince people to oppose the Copenhagen Interpretation, especially its notion of superposition. A vial of poison gas is inserted into the box with an apparatus that gives the vial a 50% probability of being broken during the next minute depending on the result of a quantum event such as the decay (or not) of a radioactive uranium atom. If the vial is broken during the next minute, the cat is poisoned and dies. Otherwise it is not poisoned and lives. According to Wigner’s version of the Copenhagen Interpretation, argued Schrödinger, if the box is not observed by a conscious being at the end of the minute, the cat remains in a superposition of two states, the sum of the state with the cat being alive and the state with the cat being dead, and this situation can continue until some conscious being finally looks into the box. Schrödinger believed that this scenario is absurd, yet implied by the Copenhagen Interpretation, at least as it was promoted by Wigner, and he used this reasoning to argue that the Copenhagen Interpretation is mistaken in how it explains nature.
Einstein agreed. In 2025, physics Nobel-Prize winner Gerard ‘t Hooft also agreed, remarking that “Using superpositions, then, is just a trick that works at first but doesn’t get at the states we want to understand.” The states we want to understand are those that enable prediction of a measurement’s outcome. Einstein’s and ‘t Hooft’s goal is a return to determinism and to the conservation of quantum information.
The double-slit experiment and the Schrödinger’s cat thought experiment have caused philosophers of physics to disagree about what an object is, what it means for an object to have a location, how an object maintains its identity over time, and whether consciousness of the measurer is required in order to make reality become determinate and “not fuzzy” or “not blurry.” Eugene Wigner and John von Neumann were the most influential physicists to suggest that perhaps consciousness collapses the wave function. David Albert said he heard Wigner say, when asked where the borderline is between those creatures that can collapse the wave function and those that cannot, that it probably is between a dog and a mouse. Few professional physicists in 2025 believe in reality’s dependence upon consciousness.
d. Indeterminism
In a classical world, if you know where a particle is at some time, and what forces will be acting on it in the future, then Newton’s equations can be used to tell you where it will be at any later time. That’s what it means for Newton’s theory to be deterministic. Relativity theory is also deterministic. Quantum theory is not deterministic, at least it is not if we restrict it to what can be known in this observable universe. If you want to know what will happen there then, given what is happening here now, the quantum mechanical answer is always probabilistic, never definite.
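The Newtonian determinism just described can be illustrated with a toy integration. This is a hedged sketch, not anything from the article: the harmonic oscillator, unit mass, and step sizes are arbitrary choices. The point is only that identical initial conditions plus the same force law always yield the identical later state.

```python
# Illustrative sketch of classical determinism: same initial state,
# same force law, same outcome, every time.
def evolve(x, v, steps=1000, dt=0.01):
    # velocity-Verlet integration of a unit-mass oscillator with F = -x
    for _ in range(steps):
        a = -x
        x += v*dt + 0.5*a*dt*dt   # update position
        v += 0.5*(a + (-x))*dt    # update velocity with averaged force
    return x, v

run1 = evolve(1.0, 0.0)
run2 = evolve(1.0, 0.0)
print(run1 == run2)  # identical initial conditions give identical outcomes
```

In quantum mechanics, by contrast, even a perfectly specified initial state yields only probabilities for the outcomes of a later measurement.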
The scientific ideal ever since the beginning of modern science has been that information is conserved in any closed and isolated system, either a small system or a large system or even the universe as a whole. This ideal has an ontological version and an epistemological version. The ontological version says conservation of information implies present states fix both all later states and all earlier states. Because conservation of information is about both the future and the past, but determinism is strictly speaking only about the future, conservation of information implies determinism, but the converse does not hold.
The epistemological version says conservation of information implies the prediction of any past state or future state from one present state (using knowledge of the laws of nature) is theoretically possible—at least it is possible for Laplace’s Demon who has no limits on its computational abilities. Another way of expressing the point is to say that all possible available quantum information at an instant would be enough for the Demon to make all predictions or postdictions regarding the results of measurements.
Let’s explain this a bit more. Consider the difference between practical predictions and theoretically possible predictions. There are two main kinds of reasons why physicists cannot practically predict what will happen in the future: (1) It is too tedious of a job to acquire knowledge of the microstate of a system at a single time. In the pre-quantum era, a microstate is fixed, not by a wave function, but by the locations and momenta of each of its zillions of basic micro-particles at a single time. (2) The relevant equations that need to be used are too complicated to solve—even with the aid of computers and even if we were to completely know a state at a time.
With the rise of quantum mechanics, there are two additional obstacles to making predictions; and these are theoretical, not just practical: (1) The theory of quantum mechanics implies the wave function evolves deterministically except during measurements, when its change is indeterministic, so only the probabilities of the possible outcomes can be predicted. (2) Heisenberg’s Uncertainty Principle sets limits on the joint precision of the values of pairs of variables. For example, the more precise is the position of a particle the less precise is its velocity.
According to the Copenhagen Interpretation, which is the orthodox interpretation, given how a system is at some initial time, the Schrödinger equation describes not what will happen at later times, but only the probabilities of what will happen at later times. This gives up on the classic viewpoint that the role of physics is to describe nature completely and instead embraces a role in which physics is about describing actual and possible measurement outcomes, and these only with probability.
The inevitability of having these probabilities implies indeterminism. The probabilities are not a product of the practical limitations on the human being’s ability to gather all the information about the initial state. Niels Bohr, the founder of the Copenhagen Interpretation, embraced indeterminism because he believed it is needed in order to account for a human being’s free will. Many philosophers disagree with Bohr about this, but accept that he was correct that quantum mechanics implies indeterminism.
The presence of these irremovable probabilities shows there is a characteristic randomness at the heart of nature. In fact, quantum-induced randomness is the only source of true randomness in nature. Shuffling a deck of cards a thousand times will not produce a truly randomly-ordered deck. The probabilities of quantum phenomena rarely reveal themselves to us in our everyday, macroscopic experience because, at our scale, every value of the relevant probabilities is extremely close to one, and the deviation goes unnoticed. Nevertheless, everything fluctuates randomly, even brains and moons.
In quantum mechanics a state of a system is described very differently from all earlier theories of physics. It is described using the Schrödinger wave function. The wave is not a wave similar to the electromagnetic wave that exists in our physical space; the wave is a mathematical tool. The wave is represented as a vector in an infinite dimensional Hilbert space or, if not infinite, the number of dimensions is extremely large. Schrödinger’s wave function describes the state, and Schrödinger’s wave equation describes how the state changes deterministically from one time to another (except at the times that a measurement is made). The wave equation gives the rate of change of the wave function.
The theory of quantum mechanics is tied to physical reality by the Born Rule. This rule (from Max Born) says the square of the absolute value of the wave function’s amplitude is proportional to the probability density function. What this means is that the Born Rule specifies for a time and place not what exactly will happen there then but only the probabilities of this or that happening there then, such as it being 5% probable an electron will be detected in this spatial region when a certain electron-detecting measurement is made at a certain time. The probabilities in any situation always add to one. So, probability is apparently at the heart of quantum mechanics and thus of our universe. Max Born recommended thinking of the wave function as a wave of probabilities. Because of these probabilities, if you were to repeat a measurement, then the outcome the second time might be different even if the two initial states are exactly the same. So, the key principle of causal determinism, namely “same cause, same effect,” fails.
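As a hedged illustration of the Born Rule, consider a made-up three-outcome measurement; the complex amplitudes below are invented numbers, not from any particular experiment. Each outcome’s probability is the squared modulus of its amplitude, the probabilities sum to one, and a simulated measurement picks one outcome at random with those weights.

```python
# Illustrative Born Rule arithmetic with hypothetical amplitudes.
import random

amplitudes = [complex(1/2**0.5, 0), complex(0, 1/2), complex(1/2, 0)]

# Born Rule: probability = squared modulus of the amplitude.
probs = [abs(a)**2 for a in amplitudes]
print(probs)                              # approximately [0.5, 0.25, 0.25]
assert abs(sum(probs) - 1.0) < 1e-9       # probabilities always add to one

# A measurement realizes one outcome at random with these weights, so
# repeating the measurement on an identically prepared state can give
# a different result: "same cause, same effect" fails.
random.seed(0)
first = random.choices([0, 1, 2], weights=probs, k=1)[0]
second = random.choices([0, 1, 2], weights=probs, k=1)[0]
```

Note that the phases of the complex amplitudes drop out of the probabilities for a single measurement, though they matter when amplitudes are added in a superposition, as in the two-slit experiment.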
e. Hidden Variables
Einstein was unhappy with indeterminism. He was unhappy with there being a necessary role for consciousness in measurement. He was also unhappy with the fact that the Copenhagen Interpretation bifurcated nature into a measured part and an unmeasured part, limiting the scope of the laws of quantum mechanics and making them incomplete. He believed quantum theory tells how an ensemble or group of electrons behaves but fails to tell us how a single electron behaves. He was a reductionist who believed the whole cannot be greater than the sum of its parts. All these features of the Copenhagen Interpretation, he said, are a clear sign that quantum mechanics is not completely and correctly describing the universe.
Einstein proposed that there would be a future discovery of as yet unknown “hidden” variables. These extra variables are properties that, when taken into account by a revised Schrödinger wave function, would make quantum mechanics be deterministic and consciousness-free and representationally complete. Einstein believed you would not need probabilities if you had access to the precise values of all the variables affecting a system, including the variables that are currently hidden. Hidden variables are like a hidden instruction set telling nature how to behave in more detail than standard quantum theory provides. For example, one hidden variable might be the precise location of an electron, precise to an infinite number of significant digits.
Einstein believed the consequence of adopting the Hidden Variables Interpretation would be that determinism, time-reversibility, and information conservation would be restored, and there would be no need to speak of a discontinuous collapse of the wave function during measurement. Also, quantum probabilities would be epistemological; they would be caused by our lack of knowledge of the values of all the variables, just as they would be in a universe obeying Newton’s laws. Einstein’s universe would not have any imprecision nor any cases of indeterminism that would require Laplace’s Demon to use probabilities.
Einstein’s arguments in favor of the Hidden Variables Interpretation were philosophical, not mathematical. He wrote in a 1926 letter to Max Born:
Quantum mechanics is certainly imposing. But an inner voice tells me that it is not yet the real thing. The theory says a lot, but does not really bring us any closer to the secret of the “old one.” I, at any rate, am convinced that He does not throw dice.
Niels Bohr responded to another, similar remark by Einstein with: “But still, it cannot be for us to tell God how he is to run the world.”
In the 1950s, David Bohm agreed with Einstein and went some way in this direction by building a revision of quantum mechanics that has hidden variables and, unlike the Copenhagen Interpretation, has no instantaneous collapse of the wave function during measurement. However, his interpretation did not succeed in moving the needle of mainstream scientific opinion, largely because of the difficulty of extending it to quantum field theory.
Challenging the hidden variable proposal, John Bell showed that any local hidden variable assumption designed to make quantum mechanics deterministic would require that nature obey what are now called “Bell inequalities.” Later, carefully crafted experiments showed that the inequalities fail. So, Einstein’s proposal never gathered much support.
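A sketch of the arithmetic behind Bell’s result, under the standard textbook assumption that quantum mechanics predicts the correlation E(a, b) = -cos(a - b) for spin measurements at angles a and b on an entangled singlet pair: any local hidden variable theory requires the CHSH combination |S| ≤ 2 (one form of Bell inequality), while the quantum prediction reaches 2√2, which is what the experiments confirmed.

```python
# Illustrative CHSH calculation. Local hidden variables require
# |S| <= 2; the quantum-mechanical singlet correlation violates this.
import math

def E(a, b):
    # quantum correlation for the singlet state at angles a and b
    return -math.cos(a - b)

a, a2 = 0.0, math.pi/2          # Alice's two measurement angles
b, b2 = math.pi/4, -math.pi/4   # Bob's two measurement angles

S = E(a, b) + E(a, b2) + E(a2, b) - E(a2, b2)
print(abs(S))   # 2*sqrt(2), about 2.828, exceeding the bound of 2
```

The specific angles above are the standard choices that maximize the violation; other angle choices give smaller values of |S|.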
f. The Measurement Problem and Collapse
There is an apparent contradiction between the deterministic evolution of the wave function and its indeterministic change during a measurement (or an experiment). This is the “measurement problem.” Many persons want an explanation of what causes a quantum system to collapse its wave function and “choose” a definite outcome when there is a measurement. The Copenhagen Interpretation says that the wave function collapses because there is a measurement, but it does not explain details of how the collapse happens. It does not say which processes count as measurements and which do not. It does not define the term measurement nor does it clarify why or how consciousness is involved. It does presume that the measurement apparatus is real and correctly described by classical or non-quantum physics. Unfortunately, the Interpretation says nothing about where the line is drawn between the classical part of the world and the quantum part. Could the measurer be part of the system being measured? The Copenhagen Interpretation says no. Some other interpretations say the Copenhagen interpretation is incomplete and inadequate if it demands such a gulf between the measurer and the measured.
The philosophical background of the measurement problem began with the 18th century dispute between rationalists and empiricists. Speaking very loosely, the empiricist wanted to get the observer out of the system being measured, and the rationalist wanted the observer to be inextricably bound to the system being measured. So, quantum mechanics according to the anti-realist version of the Copenhagen Interpretation tilts toward rationalism. These claims have generated considerable debate in the fields of physics and philosophy of physics.
The quantum measurement problem is the unsolved problem of how to understand the process of measurement. It is quite a difficult problem, and it has occupied the very best minds among the community of physicists and philosophers of physics. There also has been controversy about whether it is merely a philosophical problem or also a scientific problem. Many philosophers of physics believe it is not merely a philosophical problem but rather a scientific problem requiring changes in the equations of the theory and new experiments to test those changes.
Loosely, one can think of the measurement problem this way: Would you like to know the mechanism that produced the measured value of 4 when your measurement could have had any of the outcomes 1, 2, 3, 4, or 5? Quantum theory cannot give you an answer; it can provide you only with the probabilities of your measurement procedures producing each single, possible outcome. Measurement is a random choice among the physically possible outcomes of the measurement, with (as Max Born first claimed) each possible outcome having its own probability of being chosen.
Classically, an ideal measurement need not disturb the system being measured. According to the Copenhagen Interpretation and many other interpretations of quantum mechanics, this classical ideal is unachievable in principle; experimenters always disturb the system they are measuring. This disturbance happens locally and instantaneously. Also, because of the apparent, associated information loss, there is a fundamental time asymmetry in the measurement process; so, reversing the process in time need not take you back to the situation before the measurement began.
A measurement is called a “collapse” because it quickly produces a simpler wave function. Experts do not agree on how narrow the wave function is immediately after the collapse, but the change does seem to happen almost instantaneously.
The notion of something happening instantaneously or almost instantaneously conflicts with the theory of relativity’s requirement that causal effects move no faster than the speed of light in a vacuum. Unfortunately, creating an experiment to confirm any claim about the speed of the collapse faces the obstacle that no practical measurement can detect such a short interval of time:
Yet what we do already know from experiments is that the apparent speed at which the collapse process sweeps through space, cleaning the fuzz away, is faster than light. This cuts against the grain of relativity in which light sets an absolute limit for speed (Andrew Pontzen).
Here is a simple, crude analogy that has been pedagogically helpful. Think of electrons as if they are spinning coins on a table top. They are neither heads up nor tails up until your hand pushes down on the coin, forcing it to have just one of the two possibilities. Your hand activity is the measurement process. Assume your hand pushes the coin down instantaneously.
Quantum theory on the Copenhagen interpretation cannot apply to everything because it necessarily must split the universe into a measured part and an unmeasured part, and it can describe only the measured part but not the process of measurement itself nor what is happening when there is no measurement. So, in that sense, quantum theory is apparently an incomplete theory of nature because there are some things it does not account for. Einstein was very dissatisfied with the Copenhagen Interpretation’s requirement that, during any measurement, the usual principles of quantum mechanics stop applying to the measurement apparatus. He wanted a quantum theory that describes the world without mentioning measuring instruments or the terms “measurement” or “collapse.” He wanted a more complete theory in this sense.
When a measurement occurs, it is almost correct to explain this as follows: At the beginning of the measurement, the system “could be in any one of various possibilities, we’re not sure which.” Strictly speaking, this is not quite correct according to the Copenhagen Interpretation. More precisely stated, before the measurement is made, the system is actually in a superposition of multiple states, one for each possible outcome of the measurement, with each outcome having a fixed probability of occurring as determined by the Born Rule in the formalism of quantum mechanics; and the measurement itself is a procedure that removes the superposition and realizes just one of those states. Informally, this is sometimes summarized in the remark that measurement turns the situation from fuzzy to definite. To account for the two-slit experiment the Copenhagen Interpretation needs both the notion of superposition and of its collapse in order to be consistent with experimental results, namely that a measurement produces an electron in one place at one time, not in two places at a time.
For an instant, a measurer of an electron can say it is there at this specific place, but immediately afterward, due to some new interaction, the electron becomes fuzzy again, and then there is no single truth about precisely where an electron is, but only a single truth about the probabilities for finding the electron in various places if certain additional measurements were to be made.
Following the lead of Einstein’s complaints in the 1930s, there has been growing dissatisfaction with the Copenhagen Interpretation’s requirement that, during a measurement of quantum properties, quantum mechanics fails to apply to the measurement situation because of a collapse. Many opponents of the Copenhagen Interpretation have reacted in this way:
In the wake of the Solvay Conference (in 1927), popular opinion within the physics community swung Bohr’s way, and the Copenhagen approach to quantum mechanics settled in as entrenched dogma. It’s proven to be an amazingly successful tool at making predictions for experiments and designing new technologies. But as a fundamental theory of the world, it falls woefully short (Sean Carroll).
George Ellis, co-author with Stephen Hawking of the influential book The Large-Scale Structure of Space-Time, identifies what he believes is a key difficulty with our understanding of collapse during measurement: “Usually, it is assumed that the measurement apparatus does not obey the rules of quantum mechanics, but this [assumption] contradicts the presupposition that all matter is at its foundation quantum mechanical in nature.”
Those who want to avoid having to bring consciousness of the measurer into quantum physics and who want to restore time-reversibility and determinism and conservation of quantum information typically recommend adopting a different interpretation of quantum mechanics that changes how measurement is treated.
Nevertheless, either the wave function actually does collapse, or else something is happening that makes it look very much as if the wave function collapses. What is this “something”?
g. The Many-Worlds Interpretation
According to the Many-Worlds Interpretation of quantum mechanics, anything that can happen at a moment according to the laws of quantum mechanics, given the state of our world, does happen in some world or other. For example, if at noon you could go to lunch or stay working in your office, but you choose lunch, then at noon your world branches or splits into two, one in which you go to lunch at noon, and one in which you stay working in your office at noon. The world in which you chose lunch becomes the actual world. The two worlds then evolve independently of each other, but the one noon is not the same noon as the other because time exists within a single world and not across worlds. When a branch is created, time branches, too. The worlds are created when the choice is made, when there is an apparent collapse of the wave function. They are not made when the superposition is created.
How frequently are new worlds created? According to Sean Carroll, “In a typical human body, about 5,000 atoms undergo radioactive decay every second. If every decay branches the wave function in two, that’s 2^5000 branches every second.” The tree of time lines acquires many new branches every second.
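A quick check of the scale involved in Carroll’s example (assuming, as his reasoning does, one binary branching per decay): 5,000 branchings per second multiply the number of branches by 2^5000 each second, a number with more than 1,500 decimal digits.

```python
# Python integers have arbitrary precision, so we can compute the
# branching multiplier from Carroll's example exactly.
multiplier = 2**5000          # 5,000 binary branchings per second
print(len(str(multiplier)))   # 1506 decimal digits
```

For comparison, the number of atoms in the observable universe is usually estimated at around 10^80, a number with a mere 81 digits.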
Clearly, the weirdness of the Copenhagen theory has been traded for a new kind of weirdness. This interpretation requires a revision in the meaning of the terms “world” and “you.” The Many-Worlds Interpretation is also called the Everettian Interpretation for its founder Hugh Everett III.
The Many-Worlds Interpretation does not imply that anything can happen. When it describes the two-slit experiment, it allows an electron to go left, and to go right, and to tunnel through the steel plate and crash into the experimenter’s cup of coffee, but it does not allow the electron to get a charge of +1 or -2 because those values for charge are inconsistent with the laws of quantum mechanics.
What is presented here is the maximalist version that says the many worlds are real worlds. This has been the most interesting version for philosophers, but there are minimalist versions which treat the situation merely as if there are many worlds, and make no ontological claims. The main reason offered for the maximalist version is that assuming and speaking about worlds can provide genuine physical insight about what is going on. The minimalists respond that this might be helpful, but no ontological conclusion should be drawn from this. Analogously, some minimalist philosophers say Kripke’s possible worlds should not be taken literally; they are merely calculation devices that are helpful for understanding modal terms such as “could happen at some time” and “must happen at all times,” and it is ontological overreach to conclude that the worlds are existent entities.
According to the Many-Worlds Interpretation, the behavior of the set of all the worlds is deterministic, and total information for the sum of all the worlds is always conserved during an interaction of any kind; but a single world itself is not deterministic nor is information conserved there. So, in a single world there is only an apparent collapse of the wave function. Laplace’s Demon, too, if restricted to information that is only about our particular universe, would be surprised by measurement results in our world. But, strictly speaking, probabilities are always a sign of ignorance in the Many-Worlds models. In these models, all the fundamental laws of physics apply to all worlds and are deterministic, time-reversal symmetric, and information-conserving.
What the Copenhagen Theory calls quantum fuzziness or a superposition of many states, the Many-Worlds Theory calls a superposition of many alternate, unfuzzy universes. The reason that there is no problem with energy conservation is that, if a world splits into seven new worlds, then each new world has one-seventh the energy of its parent world.
The Many-Worlds theory does not accept the Copenhagen version of measurement collapse. Instead, it implies only that, when a system in a superposition is measured, the system interacts with and becomes entangled with its environment thereby producing a single value for the measurement in each single world. There is a new world for every possible value that could have been produced during the measurement.
The multiple universes of the Many-Worlds Interpretation are different from those of the Multiverse Interpretation of cosmic chaotic inflation that is described below in the section about extending the Big Bang Theory. All the multiple universes produced by inflation do exist within a single background physical space and time, the same one that our universe exists within. When reading other literature, one needs to be alert to the fact that often the Many-Worlds Theory is called the Multiverse Theory and vice versa. And both theories are occasionally called two multiverse theories or two many-worlds theories.
The branches of the Many-Worlds Interpretation can interact but rarely do. It is very, very unlikely that you will ever find out what happened to your “twin” who lives in another world. For more on this point, see (Carroll 2019, 160-161).
A significant problem for the Many-Worlds Interpretation is to explain how the concept of a probability measure works across worlds. For example, it is unclear what it precisely means to say of the two-slit experiment that the electron went through the left slit in 50% of the worlds. Many opponents of the Interpretation say this problem is unsolvable, so the Interpretation is incorrect.
Another difficulty is to say how many worlds there are. If we were measuring a continuous quantity such as spatial position of an electron, there is no limit to the number of possible measured values, so there is no limit to the number of worlds. But many advocates of certain proposed theories of quantum gravity believe space is atomistic, not continuous, and so there is an upper limit to the number of worlds.
Also, experts do not agree on whether the quantum wave function is a representation of reality or only of our possible knowledge of reality. And there is no consensus on whether we currently possess the fundamental laws of quantum theory, as Everett believed, or instead only an incomplete version of the fundamental laws, as Einstein believed.
Quantum physicist John Preskill at the California Institute of Technology is an advocate of the Everettian interpretation. Speaking with him in 2025 about the role of physics, Brian Greene said:
To me, it doesn’t feel enough to be able to just have predictions confirmed by experiments. I want physics to lift the curtain and show me the behind the scenes drama. And so, if one does take a Many-Worlds point of view in thinking about quantum mechanics, does that mean that the behind the scenes drama is…a universal wave function governed by the Schrödinger equation living in some Hilbert space, and it’s just dynamically evolving forward with the Schrödinger equation? Is that the true reality that we inhabit?
In response, Preskill said, “Until we have experimental evidence to dissuade me, I am comfortable with provisionally accepting this is the way nature really works.”
h. Heisenberg’s Uncertainty Principle
In quantum mechanics, various Heisenberg Uncertainty Principles restrict the simultaneous measured values of some pairs of variables. One of his principles says a particle’s position and its momentum cannot both be precise at the same time. Another Heisenberg uncertainty principle places the same restriction on time and energy. During particle emission or particle absorption, a system cannot have both a precise energy and a precise time duration with that energy. The more precision for one variable, the less precision for the other.
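A hedged numerical sketch of the position-momentum principle, in illustrative units where ħ = 1 and for an arbitrarily chosen Gaussian wave packet: the Gaussian is the minimum-uncertainty state, so the computed product Δx·Δp should come out to ħ/2, the lower bound of Heisenberg’s inequality Δx·Δp ≥ ħ/2.

```python
# Illustrative check that a Gaussian wave packet saturates the
# Heisenberg bound delta_x * delta_p = hbar/2 (here hbar = 1).
# delta_p is computed from <p^2> = integral of (d psi/dx)^2, valid
# for a real wave function with zero mean momentum.
import math

dx = 0.01
xs = [i * dx for i in range(-2000, 2001)]   # grid from -20 to +20
sigma = 1.0
psi = [(2*math.pi*sigma**2)**-0.25 * math.exp(-x**2 / (4*sigma**2))
       for x in xs]

norm = sum(p*p for p in psi) * dx                   # should be close to 1
x2 = sum(x*x*p*p for x, p in zip(xs, psi)) * dx     # <x^2> (mean x is 0)
dpsi = [(psi[i+1] - psi[i-1]) / (2*dx)              # central differences
        for i in range(1, len(psi) - 1)]
p2 = sum(d*d for d in dpsi) * dx                    # <p^2> with hbar = 1

delta_x = math.sqrt(x2 / norm)
delta_p = math.sqrt(p2 / norm)
print(delta_x * delta_p)   # close to 0.5, that is, hbar/2
```

Non-Gaussian wave functions give a strictly larger product, which is why the Gaussian is called the minimum-uncertainty state.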
Philosophers of physics disagree about whether these restrictions are only on the values that can be measured and known, or are ontological restrictions on what can exist. We are referring to epistemological uncertainty when we say, “I am uncertain. I just don’t know.” We are referring to ontological uncertainty when we say, “Things are inherently fuzzy. They are not determinate.” Many advocates of the Copenhagen Interpretation favored epistemological uncertainty, but in the 21st century most theoretical physicists have come to believe the Heisenberg Uncertainty Principle is about ontological uncertainty.
Quantum uncertainties do not appear in a single measurement. They are detected over a collection of measurements because any single measurement has (in principle and not counting practical measurement error) a precise value and is not “fuzzy” or uncertain or indeterminate. Repeated measurements necessarily produce a spread of values that reveals the fuzzy, wavelike characteristics of the phenomenon being measured, and these measurements collectively obey the Heisenberg inequality. Heisenberg himself thought of his uncertainty principle as being about how the measurer necessarily disturbs the measurement and not about how nature itself does not have definite values, but many physicists believe both and others believe only the latter.
The Heisenberg Uncertainty Principle about energy is commonly said to be a loan agreement with nature in which borrowed energy must be paid back. There can be temporary violations in the classical law of the conservation of energy as the borrowing takes place. The classical law of conservation says the total energy of a closed and isolated system is always conserved and can only change its form but not disappear or increase. For example, a falling rock has kinetic energy of motion during its fall to the ground, but when it collides with the ground, the kinetic energy changes its form to extra heat in the ground, extra heat in the rock, and the sound energy of the collision. No energy is lost in the process. This classical law can be violated in two ways: (1) if the universe (or the isolated system being studied) expands in volume, and (2) by an amount ΔE for a time Δt, as described by Heisenberg’s Uncertainty Principle. The classical law is often violated for very short time intervals and is less likely to be violated as the time interval increases. Some philosophers of physics have described this violation as something coming from nothing and as something disappearing into nothing, which is misleading to people who use these terms in their informal sense instead of the sense intended by quantum field theory. The quantum “nothing” is not really what many philosophers call “nothing.” Many experts believe quantum field theory does contain a more sophisticated law of conservation of energy that has no violations and that accounts for the deviations from the classical conservation law, but other experts believe it best to say the law is violated, that it is not a universal law.
i. Virtual Particles, Quantum Foam, and Wormholes
Quantum theory and relativity theory treat the vacuum radically differently from each other. Quantum field theory implies the vacuum contains virtual particles. They are created out of the quantum vacuum via spontaneous, random quantum fluctuations—due to Heisenberg’s Uncertainty Principles. Because of this behavior, no quantum field can have a zero value at any place for very long.
Virtual particles are called “virtual” not because they are unreal but because they are unusual: they borrow energy from the vacuum and pay it back very quickly, so quickly that they cannot be detected with any currently existing instruments. What happens is that, when a pair of energetic virtual particles—say, an electron and anti-electron—form from “borrowed” energy in the vacuum, the two exist for a short time before being annihilated or reabsorbed, thereby giving back their borrowed energy. The greater the energy of the virtual pair, the shorter the probable duration of their existence before being reabsorbed. The more energy that is borrowed, the quicker it is paid back.
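The claim that more borrowed energy is paid back more quickly follows from the energy-time relation ΔE·Δt ~ ħ. Here is a rough order-of-magnitude sketch for an electron-positron pair; the use of the pair's rest energy as ΔE is an illustrative assumption, not a claim from the article.

```python
# Rough lifetime of a virtual electron-positron pair from delta_E * delta_t ~ hbar.
# Order-of-magnitude sketch; the borrowed energy is taken to be the pair's rest energy.
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
EV = 1.602176634e-19     # joules per electronvolt

pair_energy = 2 * 0.511e6 * EV   # rest energy of an e-/e+ pair, about 1.022 MeV
lifetime = HBAR / pair_energy    # about 6e-22 seconds
```

Doubling the borrowed energy halves the estimated lifetime, which is the tradeoff the paragraph describes.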
There are never any isolated particles. An elementary particle supposedly sitting alone in empty space is actually surrounded by a cloud of virtual particles. Many precise experiments can be explained only by assuming there is this cloud. Without assuming the existence of virtual particles, quantum theory would not be able to predict this precise value of the electron’s magnetic moment:
g/2 = 1.001 159 652 180 73…
That value agrees to this many significant digits with our most careful measurements. So, physicists are confident in the existence of virtual particles.
An electron is continually surrounded by virtual photons of temporarily borrowed energy. Some virtual photons exist long enough to produce electron-positron pairs, and these buffet the electron they came from. This buffeting produces the so-called “Lamb shift” of energy levels within an atom.
Virtual particles are not particles in quite the same sense as the ordinary particles of the quantum fields. Both are excitations of these fields, and both have gravitational effects and thus effects on time, but virtual particles are not equivalent to ordinary quantum particles, although the longer-lived ones are more like ordinary particle excitations than the shorter-lived ones.
Virtual particles are just a way to calculate the behavior of quantum fields, by pretending that ordinary particles are changing into weird particles with impossible energies, and tossing such particles back and forth between themselves. A real photon has exactly zero mass, but the mass of a virtual photon can be absolutely anything. What we mean by “virtual particles” are subtle distortions in the wave function of a collection of quantum fields…but everyone calls them particles [in order to keep their names simple] (Carroll 2019, p. 316).
Suppose a small region of empty space were to have zero energy. Then we would know the exact value of the energy at a given time. But that violates Heisenberg’s Uncertainty Principle. So, quantum physics needs to ascribe some energy to the vacuum, with a smaller region requiring a larger vacuum energy. If the region is sufficiently tiny, this energy will produce a microscopic black hole.
Based upon this reasoning, the physicist John Wheeler suggested that the ultra-microscopic structure of spacetime for periods on the order of the Planck time (about 5.4 × 10^-44 seconds) or less in regions about the size of the Planck length (about 1.6 × 10^-35 meters) is a quantum foam of rapidly changing curvature of spacetime, with micro-black-holes and virtual particle-pairs and perhaps wormholes rapidly forming and dissolving. Wormholes are very similar to two black holes connected by a narrow tunnel, but the wormhole is not enclosed within an event horizon as is a black hole.
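The Planck time and Planck length quoted above can be recovered from the fundamental constants via l_P = √(ħG/c³) and t_P = l_P/c. The following is a simple check of that arithmetic, using standard CODATA-style values for the constants.

```python
# Recovering the Planck length and Planck time from the fundamental constants:
# l_P = sqrt(hbar * G / c^3), t_P = l_P / c.
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
G = 6.67430e-11          # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m/s

planck_length = math.sqrt(HBAR * G / C**3)   # about 1.6e-35 meters
planck_time = planck_length / C              # about 5.4e-44 seconds
```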
Another remarkable, but speculative, implication about virtual particles is that it has seemed to many physicists that it is physically possible in principle to connect two black holes into a wormhole and then use it for time travel to the past. “Vacuum fluctuations can create negative mass and negative energy and a network of wormholes that is continually fluctuating in and out of existence…. The foam is probabilistic in the sense that, at any moment, there is a certain probability the foam has one form and also a probability that it has another form, and these probabilities are continually changing” (Kip Thorne). The foam process can create a negative energy density and thus create exotic matter whose gravity repels rather than attracts, which is the key ingredient needed to widen a wormhole and turn it into a time machine for backward time travel. A wormhole is a tunnel through space and time from one place to another in which your travel through the hole could allow you to reach a place before anyone moving at the speed of light or less, but not through the hole, had time to get there.
Without sufficient negative gravitational force in its neck connecting its two opening holes, the wormhole has a natural tendency to close its neck, that is, “pinch off” to a width with zero diameter. For a popular-level discussion of how to create this real time machine as opposed to a science fiction time machine, see the book The Warped Side of Our Universe: An Odyssey Through Black Holes, Wormholes, Time Travel, and Gravitational Waves by Kip Thorne and Lia Halloran, 2023. Thorne says: “One way to make a wormhole, where previously there was none, is to extract it from the quantum foam…, enlarge it to human size or larger, and thread it with exotic matter to hold it open.”
Another controversial implication about virtual particles is that there is a finite but vanishingly small probability that a short-lived potato or conscious brain will spontaneously fluctuate out of the vacuum tomorrow. If such an improbable event were to happen, many non-physicists would be apt to say that a miracle had happened, and God had temporarily intervened and suspended the laws of science.
Positive but indirect evidence for the existence of virtual particles and perhaps also for the quantum foam comes from careful measurements of the Casimir Effect between two mirrors or conducting plates, in which, as they get nearer to each other, there is a new force that appears and starts pushing them even closer.
Richard Muller reacts to these arguments for quantum foam by calling them “theory overreaching experiment…. All the theory written on these subjects may be nothing more than fanciful speculation.”
Another reason to believe that physicists’ understanding of the quantum foam must be incorrect is that they have attempted to add up all the energy contributions to the foam and have learned that, regardless of whether the number comes out to be positive or negative, it is at least 10^120 times larger or smaller than the dark energy of space that we observe. This is the biggest mismatch between theory and observation in the history of science. It is called the Cosmological Constant Problem.
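The size of this mismatch can be sketched by comparing the Planck-scale energy density (a natural quantum-gravity estimate for the vacuum) with the observed dark-energy density. The figure of roughly 6e-27 kg/m³ for the observed mass-equivalent density is an assumed round value used only for illustration.

```python
# Order-of-magnitude sketch of the Cosmological Constant Problem: ratio of the
# Planck energy density to the observed dark-energy density (assumed ~6e-27 kg/m^3).
import math

HBAR = 1.054571817e-34   # J*s
G = 6.67430e-11          # m^3 kg^-1 s^-2
C = 2.99792458e8         # m/s

planck_mass = math.sqrt(HBAR * C / G)            # about 2.2e-8 kg
planck_length = math.sqrt(HBAR * G / C**3)       # about 1.6e-35 m
planck_density = planck_mass / planck_length**3  # about 5e96 kg/m^3

observed_dark_energy_density = 6e-27             # kg/m^3, assumed round value
mismatch = planck_density / observed_dark_energy_density   # roughly 10^122-10^123
```

Depending on the exact cutoff assumed, quoted values of the mismatch range from about 10^120 upward, which is why the text says "at least."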
j. Entanglement, Decoherence, and Non-Locality
Entanglement
Schrödinger introduced the term “entanglement” in 1935 to describe what is perhaps the strangest feature of quantum mechanics. Entanglement is an exotic kind of correlation across distances and times that is stronger than classical correlations such as falling raindrops being correlated with moving windshield wipers. Entanglement is an experimentally well-confirmed feature of reality, say most physicists. But other physicists with expertise in this area do not agree. They say it is a well-confirmed feature of quantum mechanics, but not of reality.
Entanglement is the opposite of separability. Pre-quantum theories assume separability, that ideally Alice’s lab here can be completely and correctly described independently of Bob’s lab there. This is not so if the two labs are entangled. The manifest image assumes separability. So does relativity theory.
In quantum mechanics, what is entangled are particles or properties. When the two particles become entangled, they remain “tied together” even if they move a great distance away from each other or even if they exist at different times. This entanglement is a kind of correlation (or anti-correlation) across space or time or both. If two particles somehow become entangled, this does not mean that, if you move one of them, then the other one moves, too. Quantum entanglement is not that kind of entanglement. It is not about actions. Ontologically, the key idea about quantum entanglement is that if a particle becomes entangled with one or more other particles within the system, then it loses some of its individuality. The whole system becomes more than the sum of its individual parts. The parts do not have separability. The quantum state of an entangled group of particles is not determined by the sum of the quantum states of each separate particle.
The point is that the predictions of quantum mechanics are independent of the relative arrangement in space and time of the individual measurements: fully independent of their distance, independent of which is earlier or later, etc…. So quantum mechanics transgresses space and time in a very deep sense. We would be well advised to reconsider the foundations of space and time…. (Anton Zeilinger).
If Alice and Bob’s systems are entangled, they can share this entanglement with the systems of Charlie and Dale, which are not entangled and do not interact with each other. Suppose Alice’s particle is entangled with Bob’s particle. Now, suppose Alice’s particle becomes entangled with Charlie’s and suppose Bob’s particle becomes entangled with Dale’s. Then some of the entanglement between Alice’s and Bob’s particles can be passed on so that Charlie’s and Dale’s two particles become entangled even though Charlie’s and Dale’s systems never directly interact with each other.
Non-Locality
All human beings have the basic intuition of locality, which is the opposite of action at a distance. Voodoo and other kinds of magic violate locality. Sticking a needle in a doll representing your enemy supposedly causes an immediate stomach ache in your enemy. Isaac Newton’s theory of gravity implies that moving the Sun immediately affects the Earth gravitationally without affecting any intermediate objects. That is why Newton believed his theory needed improvement. The interactions in local processes are limited to immediate neighbors. Many interpretations of quantum mechanics suggest there are non-local processes. In reaction to this, Gerard ‘t Hooft asserted that locality is “an essential ingredient for any simple, ultimate law governing the universe.” He believed that if quantum mechanics implies non-locality, then quantum mechanics is failing to provide an ultimate description of how nature works. Einstein would have agreed. The majority of physicists do not agree with them because they believe in quantum mechanics and its implication that locality is often violated.
Locality in space implies an object is influenced directly only by its immediate surroundings. The distant Sun influences our skin on Earth, but not directly. There are two different kinds of spatial non-locality: (1) via direct physical action at a distance, and (2) via correlated knowledge between distant measurements so that the knowledge of a measurement taken at one place instantly gives the measurer knowledge of what a similar measurement would produce if it were to occur at the other place. Entanglement is what produces non-locality, but it is only of kind (2).
An example of kind (1) would be that, when the Sun burns out at time t0, then at the same time t0 the Earth is plunged into darkness. Another example would be if, when you apply a force and move particle A, its correlated particle B moves, too, at the same time. Non-locality of kind (1) could be exploited to send a message.
Let’s focus on non-locality of kind (2) since this is the only kind of non-locality allowed by quantum mechanics. Quantum entanglement comes in degrees. As it becomes weaker, the system starts to cross over into classical behavior, but let’s consider strong entanglement.
Suppose Alice and Bob are able to make similar measurements on two strongly entangled particles. Alice cannot use her particle to force Bob to measure her intended value for his particle because she has no control over the measured value that she herself will obtain by her own measurement. That is why she cannot use her particle to send a message faster than light speed to Bob. Nevertheless, whatever value she does obtain will instantly affect what value Bob will obtain even if he makes a measurement at the same time as she does.
That is the standard interpretation of quantum correlation and non-locality. Some opponents say this is the tip of the iceberg, and we just have not yet been clever enough to figure out how Alice can send Bob a message faster than the speed of light.
Decoherence
The term “decoherence” was coined by David Bohm in 1951 to describe destroyed coherence. The Many-Worlds theory treats decoherence not as collapse of the wave function but as the mechanism by which new worlds are created.
The majority position is that measurement of an entangled particle breaks its superposition instantly. The breaking is called collapsing the wave function. It is also called decohering, especially if the measurement is unintentional and we lose track of the details. Any interaction of a quantum system with its environment will collapse its superposition and serve as a measurement, though usually an unintentional measurement. Note that it is entanglement that collapses the wave function. No “observer” is required. That is why so many physicists believe the Copenhagen Interpretation was wrong about needing a human observer for a measurement. In the two-slit experiment, if we entangle anything with the incoming photon or electron so that it reveals which slit it goes through, we get the same loss of interference on the screen behind the slits.
The two-slit experiment is designed so that the incoming beam of electrons or photons is “coherent.” This means the electrons are coordinated; they form an entangled, collective, united, understandable whole. “Coherence is a measure of how well certain systems will maintain their relationships with each other and how well we are able to predict the evolution of those systems,” said Martin Holt. Initially the photons within any laser beam are coherent, but continual interacting with the environment leads to degradation of this quantum coherence, to loss of superposition, to collapse of the wave function.
Lost coherence is non-classical noise in systems intended to preserve coherence, such as in a quantum computer. Quantum states are inherently fragile, and most interactions are strong enough to produce decoherence; so, it takes very careful work to produce the kind of interaction that creates and then preserves coherence. Preserving coherence is the most difficult goal to achieve in the construction of a quantum computer, and cooling is one of the main techniques used to achieve the goal. Interactions that cause unintended decoherence lower the fault tolerance of a quantum computer.
According to most interpretations of quantum theory, the moon is there when it is not being looked at because the moon is always interacting with some particle or other and thereby decohering and, in that sense, getting measured. Decoherence is also why the moon’s unusual quantum properties are not visible to us at our macro-scale. Nevertheless, the moon is a quantum object (an object obeying the rules of quantum theory), as are all other objects. There is no sharp dividing line in which, as the size of an object increases, the line is crossed and its behavior switches from quantum behavior to classical behavior.
Entanglement apparently violates special relativity. The reason is that a quantum measurement by Alice of a certain property of one member of an entangled pair of particles will instantaneously or nearly instantaneously determine the value of that same property that would be found by Bob if he were to make a similar measurement on the other member of the pair, no matter how far away the two entangled particles are from each other and no matter the duration between the two acts of measuring. Alice and Bob’s measurement processes can be space-like separated from each other. This space-like separation feature is what bothered Einstein the most. In a letter to Max Born in 1947, Einstein referred to non-locality pejoratively as “spooky action at a distance.” Most physicists still use this phrase, but the phenomenon, though spooky, is not spooky action: it is a way of propagating definiteness, not of propagating action or information.
In 1935, bothered by this quantum non-locality, Erwin Schrödinger said:
Measurements on (spatially) separated systems cannot directly influence each other—that would be magic.
Einstein agreed. Yet the magic seems to exist. “I think we’re stuck with non-locality,” said John Bell.
Einstein was unhappy with many features of quantum mechanics but especially its non-locality. He was the first person to clearly see that quantum mechanics is local but incomplete or else complete but non-local. He hoped for the incompleteness, but the majority of physicists believe it is complete and so not local.
The philosopher David Albert has commented that “In order to make sense of this ‘instantaneity’ of the quantum correlation, it looks as if there is a danger that one may require an absolute notion of simultaneity of exactly the kind that the special theory of relativity denied.” Tim Maudlin agreed and has speculated that instantaneous signaling might simply not yet have been discovered. He commented that perhaps there can be some faster-than-light signaling which could be detected by somehow exploiting arrival times of the signals sent between Alice and Bob.
The philosopher Huw Price speculated in (Price 1996) that non-local processes are really backwards causal processes with effects occurring before their causes. Maudlin reacted to this by saying, “I think retro-causation is analytically impossible.” In other words, having causes occur after their effects is nonsense.
The physicist Juan Maldacena has conjectured that entanglement of two objects is really a wormhole connecting the two. Informally this means there is a “portal” connecting them. The physicist Leonard Susskind has emphasized that it is not just particles and properties that can become entangled. Parts of space can be entangled with each other, and he conjectured that “quantum entanglement is the glue holding space together. Without quantum entanglement, space would fall apart into an amorphous, unstructured, unrecognizable thing.” Many physicists believe entanglement is linked somehow to the emergence of space in the sense that if we were to know the degree of entanglement between two quantum particles, then we could derive the distance between them. Some others speculate that time itself might be the product of quantum entanglement.
k. Objective Collapse Interpretations
Objective collapse interpretations of quantum mechanics try to solve the measurement problem by somehow slightly modifying the Schrödinger equation that describes the evolution of quantum states. These interpretations are not deterministic because they introduce random elements into the collapse process. These elements serve to make a collapse natural, spontaneous, and more probable over time, a collapse that would happen during an interaction with or without a conscious observer. The collapse occurs either randomly or when there is a sufficient increase in, say, the complexity of the system.
A measurement is considered to be any interaction with anything external to the system that causes the system’s wave function to appear to collapse. A passing photon or even a virtual electron bubbling up out of the quantum vacuum can do this.
There is little agreement on specifically how to modify the Schrödinger equation, although the GRW model and the Penrose model are leading candidates. Objective collapse interpretations are also called “spontaneous collapse models.”
l. Quantum Tunneling
Quantum mechanics implies a kind of barrier penetration called quantum tunneling. It implies that a particle can penetrate a potential energy barrier that, according to classical theory, it does not have enough energy to cross. According to quantum mechanics, there is a chance that, if a rock is sitting quietly in a valley next to Mt. Everest, it will leave the valley spontaneously and tunnel through the mountain and appear intact on the other side. The probability is insignificant but not zero. It is an open question in physics as to how long it takes the object to do the tunneling. Some argue that the speed of the tunneling is faster than light speed. The existence of quantum tunneling is accepted because it seems to be needed to explain some radioactive decays, some chemical bonds, and how sunlight is produced by protons in our sun that overcome their mutual repulsion and instead fuse and eject photons.
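Why tunneling is routine for electrons yet absurdly improbable for rocks can be sketched with the textbook approximation T ≈ exp(-2κL), where κ = √(2m(V−E))/ħ for a rectangular barrier. The electron, 1 eV barrier, and 1 nanometer width below are assumed illustrative values, not figures from the article.

```python
# Tunneling probability through a rectangular barrier in the standard
# exponential approximation T ~ exp(-2*kappa*L), kappa = sqrt(2*m*(V-E))/hbar.
# Illustrative sketch with assumed values: electron, 1 eV barrier, 1 nm width.
import math

HBAR = 1.054571817e-34   # J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electronvolt

def tunneling_probability(barrier_height_ev, width_m, mass=M_E):
    """Approximate transmission probability through a rectangular barrier."""
    kappa = math.sqrt(2 * mass * barrier_height_ev * EV) / HBAR
    return math.exp(-2 * kappa * width_m)

p = tunneling_probability(1.0, 1e-9)   # small but strictly nonzero
```

Because the exponent scales with √mass and with width, replacing the electron with a rock and the nanometer with a mountain drives the probability to a value that is fantastically tiny yet still not zero, as the paragraph says.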
m. Approximate Solutions
Like the equations of the theory of relativity, the equations of quantum mechanics are very difficult to solve and thus to use except in very simple situations. The equations cannot be used directly in digital computers. There have been many Nobel-Prize winning advances in chemistry by finding methods of approximating quantum theory in order to simulate the results of chemical activity. For one example, Martin Karplus won the Nobel Prize for chemistry in 2013 for creating approximation methods for computer programs that describe the behavior of the retinal molecule in our eye’s retina. The molecule has almost 160 electrons, but he showed that, for describing how light strikes the molecule and begins the chain reaction that produces the electrical signals that our brain interprets during vision, chemists can successfully use an approximation; they need to pay attention only to the molecule’s outer electrons.
n. Emergent Time and Quantum Gravity
There has been much speculation about the role of time in a theory of quantum gravity, a theory of quantum mechanics that reconciles its differences with general relativity. Perhaps the new theory will need to make use of special solutions to the Schrödinger equation that normally are not discussed—solutions describing universes that don’t evolve at all. For these solutions, there is no time, and the quantum state is a superposition of many different classical possibilities:
In any one part of the state, it looks like one moment of time in a universe that is evolving. Every element in the quantum superposition looks like a classical universe that came from somewhere, and is going somewhere else. If there were people in that universe, at every part of the superposition they would all think that time was passing, exactly as we actually do think. That’s the sense in which time can be emergent in quantum mechanics…. This kind of scenario is exactly what was contemplated by physicists Stephen Hawking and James Hartle back in the early 1980s (Carroll 2016, 197-9).
It looks as if time exists, but fundamentally it doesn’t.
o. The Standard Model
The Standard Model of particle physics was proposed in the 1970s, and it has subsequently been revised and well tested. The Model is our best quantum field theory. It is designed to describe elementary particles and the physical laws that govern them. The Standard Model is really a loose collection of theories describing seventeen different particle fields; it omits any gravitational field. It is our civilization’s most precise and powerful theory of physics. It originally was called a model, but now has the status of a confirmed theory. Because of this it probably should not be called a “model,” since it does not contain simplifications as other models do, but its name has not changed over time. The Standard Model is expressed with symmetries and groups, but those are not explored in this article.
The Standard Model sets severe limits on what exists and what can possibly happen. It implies that a particle can be affected by some forces but not others. It implies that a photon cannot decay into two photons. It implies that protons attract electrons and never repel them. It also implies that every proton consists in part of two up quarks and one down quark that interact with each other by exchanging gluons. The gluons “glue” the quarks together via the strong nuclear force. Photons “glue” electrons to protons and vice versa via the electromagnetic force. Unlike how Isaac Newton envisioned forces, all forces are transmitted by particles. That is, all forces have carrier particles that “carry” the force from one place to another.
Actually, the old notion of “force” is explained in terms of interactions of fields, not interactions of particles. This concept of interaction is very exotic in the Standard Model. Whenever a particle interacts with another particle, the two particles exchange other particles, the so-called carriers of the interactions. When milk is spilled onto the floor, what is going on is that the particles of the milk and the particles in the floor and the particles in the surrounding air exchange a great many carrier particles with each other, and the exchange is what is called “spilling milk onto the floor.” Yet all these varied particles are just tiny fluctuations of fields. This scenario indicates one important way in which the scientific image has moved very far away from the manifest image.
Because the exchange of so many gluons within a single proton is needed to “glue” its constituent quarks together and keep them from escaping, more than 90% of the mass of the proton is not due to the mass of its three quarks. It is due to a combination of virtual quarks, virtual antiquarks and virtual gluons. Because these virtual particles exist over only very short time scales, they are too difficult to detect by any practical experiment, and so they are called “virtual.” However, this word “virtual” does not imply “not real.”
The properties of spacetime points that serve to distinguish any particle from any other are a spacetime point’s values for mass, angular momentum, and charge at that point. Nothing else. There are no other differences among what is at a point, according to the Standard Model, so in that sense fundamental physics is very simple. If we are talking about a point inside a pineapple, what about the value of its pineapple-ness? In principle, according to the Standard Model, the pineapple’s characteristics depend only on these other, more fundamental characteristics. Charge, though, is not simply electromagnetic charge. There are three kinds of color charge for the strong nuclear force, and two kinds of charge for the weak nuclear force. In the atom’s nucleus, the strong force holds protons together tightly enough that their positive electric charges are not capable of pushing the protons away from each other. The strong force also holds the three quarks together inside a proton. The weak force turns neutrons into protons and spits out electrons. It is the strangest of all the forces because it allows some rare exceptions to time-symmetry.
Except for gravity, the Standard Model describes all the universe’s forces. Strictly speaking however, these theories are about interactions rather than forces. A force is just one kind of interaction. Another kind of interaction does not involve forces but rather changes one kind of particle into another kind. The neutron, for example, changes its appearance depending on how it is probed. The weak interaction can transform a neutron into a proton. It is because of transformations like this that the concepts of something being made of something else and of one thing being a part of a whole become imprecise for very short durations and short distances. This is a problem for classical mereology—the formal study of parts and the wholes they form.
According to the Standard Model, but not according to general relativity theory, all particles must move at light speed c unless they interact with other fields. The particles when created do not speed up to c; they begin at that speed. All the particles in your body such as its protons and electrons would move at the speed c if they were not continually interacting with the Higgs Field. The Higgs Field can be thought of as being like a “sea of molasses” that slows down all protons and electrons and gives them the mass and inertia they have. That is what Richard Feynman meant when he said, “All mass is interaction.” Neutrinos are not affected by the Higgs Field, but they move slightly less than c because they are slightly affected by the field of the weak interaction. Of all the particles described by the Standard Model of particle physics, the Higgs boson is the strangest.
The Standard Model helps explain what is happening in an atomic clock when an electron in a cesium atom changes energy levels and radiates some light indicating the clock is properly tuned. The Standard Model “says” that the electron, being a localized vibration in the electron field, suddenly vibrates less, thereby loses energy, and the lost energy is transferred to the electromagnetic field, creating a localized vibration there that is a new photon.
As of the first quarter of the twenty-first century, the Standard Model is incomplete because it can account for neither gravity nor dark matter particles nor dark energy nor the fact that there is more matter than anti-matter. When a new version of the Standard Model does all this, then it will perhaps become the long-sought “theory of everything.”
For discussion of quantum mechanics at a more advanced level, see all the other articles on the subject in this encyclopedia.
4. The Big Bang
The Big Bang Theory is the standard model of cosmology, but it is not part of the Core Theory. Most of its versions treat time the way it is treated in the Core Theory, especially relativity theory. However, the version called the multiverse theory does have a radical implication for the topological structure of time.
The phrases “big bang” and “big bang theory” have several senses, which causes ambiguity in discussions. When the phrases are used simply in the sense of there having been a brief time about 13.8 billion years ago when the universe was very hot and very small before it expanded explosively, then the big bang theory is accepted by nearly all cosmologists, astronomers, astrophysicists, and philosophers of physics. Even so, it is still not as firmly accepted as is the Core Theory.
The standard version is the “hot big bang theory” that implies the universe once was extremely small, extremely dense, extremely hot, nearly uniform, at minimal entropy, and expanding; and it had extremely high energy density and severe curvature of its spacetime at all scales. Now the universe has lost all these properties except one. It is still expanding.
Later in this section there are more details about the controversy regarding how to define the term “big bang.” Notably, there is controversy about whether the big bang theory requires there to be a first instant of time, and about whether a period of exponential inflation occurred and, if it did, whether that period should be said to occur before or only during the big bang.
The first second of the big bang event is the universe’s most significant time period because without the contingent features that it had then, today’s universe would have been radically different. There might have been no Earth and no life of any kind.
As far as is known, the big bang explosion was a rapid expansion of space itself, not an expansion of something into a pre-existing void. There was never a time when the universe was empty. Think of the expansion as being due to the creation of new space everywhere very quickly. The universe’s space has no center around which it expanded. As it expanded from t = 0 to t = 10^-33 seconds, it diluted and became extremely cold. But then suddenly 10^88 particles were created (via E = mc^2) from some of the potential energy of space, and the universe re-heated.
During the big bang, the universe probably expanded in all directions almost evenly, and it probably did not produce any destruction of anything, though these are just guesses. There is some uncertainty about whether dark matter was created at the same time as was ordinary matter; some cosmologists believe the dark matter particles were created a month later.
There is an important distinction between the observable universe and the entire universe. The observable universe is the part of the universe that in principle could be observed by creatures on Earth or that could have interacted with us via actions that move at the speed of light or less. The observable universe is the contents of our past light-cone, so it contains nothing that exists in the absolute elsewhere. No one knows whether the observable universe could be rotating relative to the rest of the universe.
The unobservable universe may or may not have an edge, but the observable universe definitely does. Its diameter is about 93 billion light-years, and it is growing every day, but it will always be finite in volume. The observable universe is probably a sphere containing from 350 billion to one trillion large galaxies. It is also called “our Hubble Bubble” and “our pocket universe.” It is still producing new stars, but the production rate is ebbing. 95% of the stars that will ever exist in it have already been born.
Scientists have no well-confirmed idea about the universe as a whole; the universe might or might not be very similar to the observable universe, but the default assumption is that the unobservable universe is like the observable universe. It is unknown whether the unobservable universe’s volume is infinite.
As the universe expands, each day a few more stars become inaccessible to us here on Earth as they speed away from us and their red shift gets higher and higher. “Of the 2 trillion galaxies contained within our observable Universe, only 3% of them are presently reachable, even at the speed of light,” said Ethan Siegel. That percentage is expected to reduce to zero in the future. On average, the more distant galaxies speed away from us faster than the nearby galaxies.
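The observation that more distant galaxies recede faster on average is Hubble's law, v = H0 × d. The following minimal sketch shows the arithmetic; the value of H0 used here is an assumed round figure (published measurements range roughly from 67 to 74 km/s per megaparsec), not a value from this article.

```python
# Hedged sketch of Hubble's law, v = H0 * d: on average, recession speed
# grows linearly with distance. H0 is an assumed round value.

H0 = 70.0  # km/s per megaparsec (illustrative assumption)

def recession_speed(distance_mpc):
    """Average recession speed in km/s of a galaxy at distance_mpc megaparsecs."""
    return H0 * distance_mpc

for d in (100, 500, 1000):
    print(f"{d:5d} Mpc -> {recession_speed(d):9.0f} km/s")

# Beyond the "Hubble distance" c / H0, the formal recession speed exceeds
# the speed of light; this reflects the expansion of space itself, not
# motion of galaxies through space.
c = 299_792.458  # speed of light, km/s
hubble_distance = c / H0
print(f"Hubble distance ~ {hubble_distance:.0f} Mpc")
```

This linear relation is why, as the article notes, ever more galaxies pass beyond our reach as the expansion proceeds.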
The big bang model was revised in 1998 to say the universe’s expansion rate has been increasing slightly for the last five billion years due to the pervasive presence of a “dark energy.” Dark energy is whatever it is that speeds up the expansion of the universe at the cosmic level. It is not known whether the dark energy will continue to increase forever or will eventually weaken and cause the universe to decrease in volume, although most cosmologists believe it will increase forever.
Here is a picture that displays the evolution of the observable universe since the big bang:
Attribution: NASA/WMAP Science Team
Clicking on the picture will produce an expanded picture with more detail. The picture displays only two of our three spatial dimensions. Time is increasing to the right while space increases both up and down and in and out of the picture.
Let’s return to the question of how to define the term “big bang.” The term is used by experts in many ways that conflict with each other. The big bang is: (1) A first instant. (2) A short, very early period of the hot and dense universe’s expansion. (3) The entire history and future of the universe that began with this expansion. (4) What happened right after inflation ended. (5) Whatever happened before inflation began. This article usually uses the term in sense (2), and it calls (3) the big bang model. Definition (2) allows the greatest number of cosmologists to say there was a big bang.
About 10% of professional cosmologists as of 2025 believe the universe began at the big bang, and that there was a first instant of time. This beginning time is about 13.8 billion years ago. It is the famous cosmic time of t = 0. However, the classical, hot big bang theory in sense (2) does not imply anything about whether there was a time before t = 0. Most philosophers and physicists do expect to eventually answer the question, “What was the universe doing before it expanded?” The idea of the big bang being the universe’s first event is primarily a product of accepting the theory of relativity, which is known to fail in the limit as the universe’s currently expanding volume is reversed and shrinks to a point.
The most convincing evidence in favor of the big bang theory is the discovery of the cosmic microwave background radiation (CMB) that it predicts. Astronomers on Earth have detected microwave radiation arriving in all directions. The big bang theory implies it is a fossil record of the cooled-down heat from the big bang. More specifically, it is electromagnetic radiation produced about 380,000 years after the big bang began; at that time the universe suddenly turned transparent for the first time. By then the universe had cooled down to 3,000 degrees Kelvin, which was finally cool enough to form stable atoms and to allow photons for the first time to move freely without being immediately reabsorbed by neighboring particles. This primordial electromagnetic radiation has now reached Earth as the universe’s most ancient light. To give a sense of how ancient, Richard Muller suggests this helpful analogy. Suppose you are twenty years old and your twenty-year lifespan is analogous to the 13.8-billion-year period since the big bang. The 380,000 years until the first ancient light was released is analogous to when you were six hours old.
Because of space’s expansion during the CMB light’s travel to Earth, the ancient light has cooled and dimmed, and its wavelength has increased, becoming microwave radiation with a corresponding temperature of only 2.73 degrees above absolute zero. The microwaves’ wavelength is about two millimeters, which is small compared to the roughly 100-millimeter wavelength of the microwaves in kitchen ovens. Measuring this incoming Cosmic Microwave Background (CMB) radiation reveals it to be extremely uniform in all directions in the sky.
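The cooling just described follows a simple scaling: the radiation's temperature falls in inverse proportion to how much space has stretched since emission. A short check using the article's round figures (about 3,000 K when the universe turned transparent, 2.73 K today):

```python
# The CMB's temperature scales as T_now = T_emitted / (1 + z), where
# (1 + z) is the factor by which space has stretched since emission.
# Figures below are the article's round values.

T_emitted = 3000.0  # kelvin, when the universe turned transparent
T_now = 2.73        # kelvin, measured on Earth today

stretch = T_emitted / T_now  # expansion factor, i.e. 1 + z
print(f"Space has stretched by a factor of about {stretch:.0f}")  # ~1100

# Wavelengths stretch by the same factor, which is why light emitted by
# the hot early plasma arrives today as millimeter-scale microwaves.
```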
The ancient light is extremely uniform, but not perfectly uniform. It varies slightly with the angle from which it is viewed, by about one ten-thousandth of a degree of temperature. The principal assumption is that these small temperature fluctuations of the currently arriving microwave radiation were caused by quantum fluctuations in the density of the matter of the early plasma, and so are probably the origin of what later became today’s galaxies plus the dark voids between them. This is because the early regions of high matter density would contract under the pull of gravity, causing their matter to collapse into stars, galaxies, and clusters of galaxies; meanwhile, the low-density regions would become relatively less dense and become the expanding voids that now exist between the galaxy clusters.
If we trust relativity theory alone, then the big bang would have begun from a singularity, a point of zero volume. Immediately after this, quantum theory implies, lumps would have been created everywhere, and by the time the CMB radiation was created these lumps would have shown up as larger temperature differences than we in fact see on Earth today. This is the main reason not to believe the universe began with a singularity.
The universe’s expansion rate has been nearly constant for billions of years. This rate is now accelerating slightly and has been doing so for at least a few billion years because there is another source of expansion: the repulsion of dark energy. The influence of dark energy was insignificant for billions of years, but its key feature is that it does not significantly dilute as space expands. So, after about seven or eight billion years of expansion following the big bang, the dark energy became an influential factor and started to significantly accelerate the expansion. At the current rate, the diameter of today’s observable universe will double in about 10 billion years. This influence from dark energy is mentioned in the top right of the above diagram. Future curvature will be much greater. Most cosmologists believe this dark energy is the energy of space itself, so they call it “vacuum energy.” It is an open question whether it will stop increasing and begin decreasing, although the default position is that it will not stop increasing.
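The handover from decelerating to accelerating expansion can be estimated with one line of algebra: in a flat matter-plus-dark-energy model, acceleration begins once the diluting matter density drops to twice the constant dark-energy density. The density parameters below are assumed round values, not figures taken from this article.

```python
# Hedged sketch: estimate the redshift at which cosmic acceleration began.
# In a flat matter + dark-energy model, deceleration turns to acceleration
# when Omega_m * (1 + z)**3 = 2 * Omega_L (matter dilutes as (1+z)^3 while
# dark energy stays constant). Parameters are assumed round values.
Omega_m = 0.31  # matter (ordinary + dark)
Omega_L = 0.69  # dark energy

z_accel = (2 * Omega_L / Omega_m) ** (1.0 / 3.0) - 1.0
print(f"Acceleration began around redshift z ~ {z_accel:.2f}")
```

A redshift near 0.6 corresponds to a lookback time of several billion years, consistent with the paragraph's statement that the acceleration has been underway for at least a few billion years.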
The initial evidence for dark energy came from observations in 1998 of Doppler shifts, also called “redshifts,” of supernovas. These shifts to lower frequencies, and the fact that they are increasing, are best explained by the assumption that the average distance between supernovas is increasing at an accelerating rate. The influence of the expansion is not currently significant except at the level of galaxy clusters, but the influence is accelerating, and eventually it will rip apart all galaxy superclusters, then later the individual clusters, then galaxies, and someday all solar systems, and ultimately even all configurations of elementary particles, as the universe approaches its “heat death.”
Seen from a great distance, the collection of all the galaxy clusters looks somewhat like a spider web. But the voids between the web filaments are eating the spider web. Observations by astronomers indicate the dark voids are pushing the nearby normal matter away and are now beginning to rip apart the filaments in the web.
The Cosmological Principle is a principle of cosmology that says our large region of the observable universe is not a special region. It says that the current distribution of matter in the universe tends towards uniformity as the scale increases. More specifically, the Cosmological Principle says that, at scales of about 400 million light-years, the material in our space is homogeneous and isotropic. So, wherever in the observable universe you are located and whatever direction you are looking, you will see at these large distances about the same overall temperature, the same overall density, and the same lumpy structure of dense super-clustered galaxies separated by hollow voids.
Well, you will see this from Earth only if you take into account the Earth’s motion through the cosmos. The compensation is analogous to running down the street in the rain: we notice that more rain hits the front of our shirt than the back, but once we take our running speed into account we realize that the rain is falling straight down, not at an angle toward the front of our shirt. The Cosmological Principle is an approximation, an idealization, that so far has been helpful to assume. The reference frame where it holds best is the so-called “canonical frame of the big bang,” in which every galaxy is almost at rest (analogous to our standing still in the rain rather than running).
A minority of cosmologists do not accept the Cosmological Principle because they believe that at large scales the universe is not homogeneous or it is not isotropic, but they have not convinced the majority.
Occasionally, someone remarks that the big bang is like a time-reversed black hole. It is not. The big bang is not like this because the entropy in a black hole is extremely high, but the entropy of the big bang is extremely low. Also, black holes have event horizons, but our big bang apparently does not. Some cosmologists do call the edge of the observable universe an event horizon, but that is a different kind of event horizon.
Because the big bang happened about 13.8 billion years ago, you might believe that no observable object can be more than 13.8 billion light-years from Earth, but this would be a mistake that does not take into account the fact that the universe has been expanding all that time. The relative distance between galaxy clusters is increasing over time and accelerating over time. That is why astronomers can see about 45 billion light-years in any direction from Earth and not merely 13.8 billion light-years.
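The figure of about 45 billion light-years can be recovered by integrating how far light travels while the space beneath it stretches. The sketch below uses a flat matter-plus-dark-energy model; H0, the density parameters, and the substitution a = s² (which tames the integrand) are choices of this illustration, not claims made by the article, and it ignores the brief radiation-dominated era, so the answer comes out slightly on the high side.

```python
# Hedged sketch: radius of the observable universe (the comoving particle
# horizon) in a flat matter + dark-energy model, with assumed round values.
H0_km_s_Mpc = 67.7              # Hubble constant (assumed)
Omega_m, Omega_L = 0.31, 0.69   # density parameters (assumed)
C_KM_S = 299_792.458            # speed of light, km/s
MPC_PER_GLY = 1000.0 / 3.2616   # megaparsecs in one billion light-years

def comoving_horizon_gly(steps=100_000):
    """Comoving distance to the particle horizon, in billions of light-years."""
    # D = c * Integral_0^1 da / (a**2 * H(a)); substituting a = s**2 gives
    # D = (c/H0) * Integral_0^1 2 ds / sqrt(Omega_m + Omega_L * s**6),
    # a smooth integrand suitable for a simple midpoint rule.
    total = 0.0
    ds = 1.0 / steps
    for i in range(steps):
        s = (i + 0.5) * ds  # midpoint of each subinterval
        total += 2.0 / (Omega_m + Omega_L * s ** 6) ** 0.5 * ds
    hubble_distance_gly = C_KM_S / H0_km_s_Mpc / MPC_PER_GLY
    return total * hubble_distance_gly

print(f"Horizon radius ~ {comoving_horizon_gly():.0f} billion light-years")
```

The result lands in the mid-40s of billions of light-years, far beyond the naive 13.8 billion, because the space that the ancient light crossed long ago has continued to expand behind it.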
When contemporary physicists speak of the age of our universe, namely the time since our big bang, they are implicitly referring to cosmic time measured in the cosmological rest frame. This is time measured in a unique reference frame in which the average motion of all the galaxies is stationary and the Cosmic Microwave Background radiation is as close as possible to being the same in all directions. This frame is not one in which the Earth is stationary.
Cosmic time is time measured in the cosmic rest frame by a clock that would be sitting as still as possible while the universe expands around it. In cosmic time, the time of t = 0 years is when the big bang began, and t = 13.8 billion years is our present. If you were at rest at the spatial origin in this frame, then the Cosmic Microwave Background radiation on a very large scale would have about the same average temperature in any direction, and the Cosmological Principle provides its best approximation.
The cosmic rest frame is a unique, privileged reference frame for astronomical convenience, but there is no reason to suppose it is otherwise privileged. It is not the frame sought by the A-theorist who believes in a unique present, nor by Isaac Newton who believed in absolute rest, nor by James Clerk Maxwell who believed in an aether that waved whenever a light wave passed through.
The cosmic frame’s spatial origin point is described as follows:
In fact, it isn’t quite true that the cosmic background heat radiation is completely uniform across the sky. It is very slightly hotter (i.e., more intense) in the direction of the constellation of Leo than at right angles to it…. Although the view from Earth is of a slightly skewed cosmic heat bath, there must exist a motion, a frame of reference, which would make the bath appear exactly the same in every direction. It would in fact seem perfectly uniform from an imaginary spacecraft traveling at 350 km per second in a direction away from Leo (towards Pisces, as it happens)…. We can use this special clock to define a cosmic time…. Fortunately, the Earth is moving at only 350 km per second relative to this hypothetical special clock. This is about 0.1 percent of the speed of light, and the time-dilation factor is only about one part in a million. Thus to an excellent approximation, Earth’s historical time coincides with cosmic time, so we can recount the history of the universe contemporaneously with the history of the Earth, in spite of the relativity of time.
Similar hypothetical clocks could be located everywhere in the universe, in each case in a reference frame where the cosmic background heat radiation looks uniform. Notice I say “hypothetical”; we can imagine the clocks out there, and legions of sentient beings dutifully inspecting them. This set of imaginary observers will agree on a common time scale and a common set of dates for major events in the universe, even though they are moving relative to each other as a result of the general expansion of the universe…. So, cosmic time as measured by this special set of observers constitutes a type of universal time… (Davies 1995, pp. 128-9).
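The figures in the quoted passage (about 0.1 percent of light speed; a time-dilation factor of about one part in a million) can be checked directly from special relativity:

```python
import math

# Checking Davies's estimate: Earth moves at ~350 km/s relative to the
# frame in which the CMB looks uniform. The time-dilation factor is
# gamma = 1 / sqrt(1 - v**2 / c**2), and gamma - 1 ~ v**2 / (2 c**2)
# for speeds far below c.
v = 350.0        # km/s, Earth's speed relative to the CMB frame
c = 299_792.458  # km/s, speed of light

beta = v / c
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
print(f"v/c ~ {beta:.4%}")             # about 0.12% of light speed
print(f"gamma - 1 ~ {gamma - 1:.1e}")  # about 7e-7, one part in a million
```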
It is a convention that cosmologists agree to use the cosmic time of this special reference frame, but it is an interesting fact and not a convention that our universe is so organized that there is such a useful cosmic time available to be adopted by the cosmologists. Not all physically possible spacetimes obeying the laws of general relativity can have this sort of cosmic time.
The connection between entropy and the big bang is interesting. Let’s answer the question, “Why hasn’t the universe reached maximum entropy by now?” The favored answer goes like this. Suppose the universe were to have reached maximum entropy. Immediately this situation would change because the expansion of space creates new possible ways for the universe’s matter to fill the universe. So, the maximum possible entropy for the universe must continue to grow. Calculations show that the maximum possible value for the universe’s entropy grows faster than the actual value of the universe’s entropy.
History of the Theory
The big bang theory originated with several people, although Edwin Hubble’s very careful observations in 1929 of galaxy recession from Earth were the most influential pieces of evidence in its favor. Noticing that the more distant galaxies are redder than nearby ones, he showed that on average the farther a galaxy is from Earth, the faster it recedes from Earth. Cosmologists now agree that the early galaxies were not actually receding from each other; rather, space itself was expanding between the galaxies, and this is what causes the apparent recession, on average, of galaxies from other galaxies. But neither Hubble nor anyone else noticed until the end of the twentieth century that the apparent speed of galaxies receding from each other was accelerating.
In 1922, the Russian physicist Alexander Friedmann discovered that the general theory of relativity allows an expanding universe. Unfortunately, Einstein reacted to this discovery by saying this is a mere physical possibility and not a feature of the actual universe. He later retracted this claim, thanks in large part to the influence of Hubble’s data. The Belgian physicist Georges Lemaître is another father of the big bang theory. He suggested in 1927 that there is some evidence the universe is expanding, and he defended his claim using previously published measurements by Hubble and others of galaxy speeds plus his own inferences from Einstein’s equations. Lemaître published in French in a minor journal, and his prescient ideas were not appreciated until after Hubble’s discoveries.
The big bang theory was very controversial when it was created in the 1920s. At the time and until the 1960s, physicists were unsure whether proposals about cosmic origins were pseudoscientific and so should not be discussed in a well-respected astronomy journal. This attitude changed in the late 1960s, because Stephen Hawking and Roger Penrose convinced their fellow professional cosmologists that there must have been a big bang. The theory’s primary competitor from the 1920s to the 1960s was the steady state theory. That theory allows space to expand in volume but only if this expansion is compensated for by providing spontaneous creation of matter in order to keep the universe’s overall density constant over time.
At the end of the 20th century, this classical big bang theory was replaced by the now standard cosmological model that adds to the big bang the existence of dark matter particles and dark energy. As of 2025, the ontological makeup of the universe was considered to be 4.9% ordinary matter, 26.8% dark matter, and 68.3% dark energy.
a. Cosmic Inflation
The theory of cosmic inflation suggests that there was a special inflationary period before the standard, hot big bang began. According to this cosmic inflation theory, the universe was created from a quantum fluctuation in an inflaton field; then the field underwent a cosmological phase transition for some unknown reason, causing an exponentially accelerating expansion of space. This was an extra “bang” before the standard big bang. The universe rapidly cooled during the inflation. Then for some unknown reason the inflation stopped very soon after it began, and the universe was flooded with 10^88 particles that were created by the conversion of the potential energy of the inflaton field. During the creation of all these particles, the universe became very hot again, so this is called the period of “re-heating.”
By the time that inflation was over, every particle was left in isolation, surrounded by a vast expanse of empty space extending in every direction. And then—only a fraction of a fraction of an instant later—space was once again filled with matter and energy. Our universe got a new start and a second beginning. After a trillionth of a second, all four of the known forces were in place, and behaving much as they do in our world today. And although the temperature and density of our universe were both dropping rapidly during this era, they remained mind-bogglingly high—all of space was at a temperature of 10^15 degrees. Exotic particles like Higgs bosons and top quarks were as common as electrons and photons. Every last corner of space teemed with a dense plasma of quarks and gluons, alongside many other forms of matter and energy. After expanding for another millionth of a second, our universe had cooled down enough to enable quarks and gluons to bind together forming the first protons and neutrons (Dan Hooper, At the Edge of Time, p. 2).
Cosmic inflation is a framework for a theory that might explain a wide variety of otherwise inexplicable phenomena. Its epistemological status is that of an informed guess that is difficult to test because it is a framework and not a quantitatively specific theory. Its main difficulty is that there is no convincing explanation of why inflation started or why it stopped. Many cosmologists do not believe in cosmic inflation, and they hope there is another explanation of the phenomena that inflation theory explains. But that other explanation has not been found, so inflationary cosmology is the most favored explanation of our universe’s origin.
The virtue of the inflation theory is that it provides an explanation for the mysteries of (i) why the microwave radiation that arrives on Earth from all directions is so uniform (the cosmic horizon problem), (ii) why there is currently so little curvature of space on large scales (the flatness problem), (iii) why there are not point-like magnetic monopoles almost everywhere (the monopole problem), and (iv) why we have been unable to detect the proton decay that has been predicted (the proton decay problem). It is difficult to solve these mysteries in any other way than by assuming cosmic inflation.
According to the theory of inflation, for some unknown reason the epoch of inflation (the epoch of radically repulsive gravity) began at about t = 10^-36 seconds and lasted until about t = 10^-33 seconds, during which time the linear size of space increased by a factor of about 10^26, and any initial unevenness in the distribution of energy was almost all smoothed out, that is, smoothed out from the large-scale perspective, somewhat in analogy to how blowing up a balloon removes its initial folds and creases so that it looks flat when a small section of it is viewed close up. Thus, even if initially the big bang were to have exploded unevenly in different directions and places, the subsequent inflation would have smoothed out the unevenness, and today we should see a relatively homogeneous and isotropic universe at a large scale, as we do.
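The expansion factor during inflation is often restated as a number of "e-folds" (factors of e) or of doublings. A quick computation, under the assumption of a 10^26 expansion factor (the round figure commonly quoted for the linear stretch of space during inflation):

```python
import math

# Hedged sketch: convert inflation's assumed linear expansion factor of
# ~10^26 into e-folds and doublings. N = ln(factor) counts e-folds;
# log2(factor) counts doublings of size.
factor = 1e26  # assumed linear expansion factor during inflation

N = math.log(factor)          # number of e-folds
doublings = math.log2(factor) # number of doublings

print(f"e-folds N ~ {N:.1f}")          # about 60
print(f"doublings ~ {doublings:.0f}")  # about 86
```

Sixty e-folds in roughly 10^-33 seconds is why the inflationary epoch, despite its brevity, dominates the smoothing of the early universe.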
Although the universe at the beginning of the inflation was actually much smaller than the size of a proton, to help with understanding the rate of inflation you can think of the universe instead as having been the size of a marble. Then during the inflation period this marble-sized object expanded abruptly to a gigantic sphere whose radius is the distance that now would reach from Earth to the nearest supercluster of galaxies. This would be a spectacular change in volume of something marble-sized.
The speed of this inflationary expansion was much faster than light speed. However, this fast expansion speed does not violate Einstein’s general theory of relativity because that theory places no limits on the speed of expansion of space itself, but only limits on the speeds of objects within space.
At the end of that inflationary epoch at about t = 10^-33 seconds or so, the inflation stopped. The exploding material decayed for some unknown reason and left only normal matter with attractive gravity. Meanwhile, the universe continued to expand, but at a nearly constant rate. Regardless of any previous curvature in our universe, by the time the inflationary period ended, the overall structure of space on the largest scales was nearly flat in the sense that it had very little spatial curvature, and its space was extremely homogeneous. But at the very beginning of the inflationary period, there surely were some very tiny imperfections due to the earliest quantum fluctuations in the inflaton field. At the end of the inflationary period, these quantum imperfections had inflated into slightly bumpy macroscopic regions. Subsequently, the denser regions slowly attracted more material than the less dense regions, and these dense regions would eventually turn into our current galaxies. The less dense regions, meanwhile, evolved into the current dark voids between the galaxies. Evidence for this is that those early quantum fluctuations have now left their traces in hot and cold spots, namely in the very slight hundred-thousandth-of-a-degree differences in the temperature of the cosmic microwave background radiation at different angles as one now looks out into space from Earth with microwave telescopes. In this way, the inflation theory predicts the CMB values that astronomers on Earth see with their microwave telescopes, thereby solving the cosmic horizon problem, which is problem (i) in the list above.
Let’s re-describe the process of inflation. Before inflation began, for some as yet unknown reason the universe contained an unstable inflaton field or false vacuum field. For some other, as yet unknown reason, this energetic field expanded and cooled and underwent a spontaneous phase transition (somewhat analogous to what happens when water that is cooled spontaneously freezes into ice). The phase transition caused the highly repulsive primordial material to hyper-inflate exponentially in volume for a very short time. To re-describe this yet again, during the primeval inflationary epoch, the gravitational field’s stored, negative, repulsive, gravitational energy was rapidly released, and all space wildly expanded. At the end of this early inflationary epoch at about t = 10^-33 seconds, the highly repulsive material decayed for some as yet unknown reason into ordinary matter and energy, and the universe’s expansion rate stopped increasing exponentially. The expansion rate dropped precipitously and became nearly constant. During the inflationary epoch, the entropy continually increased, so the second law of thermodynamics was not violated, but the law of conservation of energy apparently was, though we saw back in section 2 how some cosmologists have argued that the law was not violated.
Alan Guth described the inflationary period this way:
There was a period of inflation driven by the repulsive gravity of a peculiar kind of material that filled the early universe. Sometimes I call this material a “false vacuum,” but, in any case, it was a material which in fact had a negative pressure, which is what allows it to behave this way. Negative pressure causes repulsive gravity. Our particle physics tells us that we expect states of negative pressure to exist at very high energies, so we hypothesize that at least a small patch of the early universe contained this peculiar repulsive gravity material which then drove exponential expansion. Eventually, at least locally where we live, that expansion stopped because this peculiar repulsive gravity material is unstable; and it decayed, becoming normal matter with normal attractive gravity. At that time, the dark energy was there, the experts think. It has always been there, but it’s not dominant. It’s a tiny, tiny fraction of the total energy density, so at that stage at the end of inflation the universe just starts coasting outward. It has a tremendous outward thrust from the inflation, which carries it on. So, the expansion continues, and as the expansion happens the ordinary matter thins out. The dark energy, we think, remains approximately constant. If it’s vacuum energy, it remains exactly constant. So, there comes a time later where the energy density of everything else drops to the level of the dark energy, and we think that happened about five or six billion years ago. After that, as the energy density of normal matter continues to thin out, the dark energy [density] remains constant [and] the dark energy starts to dominate; and that’s the phase we are in now. We think about seventy percent or so of the total energy of our universe is dark energy, and that number will continue to increase with time as the normal matter continues to thin out. 
(World Science U Live Session: Alan Guth, published November 30, 2016 at https://www.youtube.com/watch?v=IWL-sd6PVtM.)
Before about t = 10^-46 seconds, there was a single basic force rather than the four we have now. The four basic forces (or basic interactions) are: the force of gravity, the strong nuclear force, the weak force, and the electromagnetic force. At about t = 10^-46 seconds, the energy density of the primordial field was down to about 10^15 GeV, which allowed spontaneous symmetry breaking (analogous to the spontaneous phase change in which water cools enough to spontaneously change to ice); this phase change created the gravitational force as a separate basic force. The other three forces had not yet appeared as separate forces.
Later, at t = 10^-12 seconds, there was even more spontaneous symmetry breaking. First the strong nuclear force, then the weak nuclear force and finally the electromagnetic force became separate forces. For the first time, the universe now had exactly four separate forces. At t = 10^-10 seconds, the Higgs field turned on. This slowed down many kinds of particles by giving them mass so they no longer moved at light speed.
Much of the considerable energy left over at the end of the inflationary period was converted into matter, antimatter, and radiation, namely quarks, antiquarks, and photons. The universe’s temperature escalated with this new radiation; this period is called the period of cosmic reheating. Matter-antimatter pairs of particles combined and annihilated, removing from the universe all the antimatter and almost all the matter. At t = 10^-6 seconds, this matter and radiation had cooled enough that quarks combined together and created protons and neutrons. After t = 3 minutes, the universe had cooled sufficiently to allow these protons and neutrons to start combining strongly to produce hydrogen, deuterium, and helium nuclei. At about t = 379,000 years, the temperature was low enough (around 2,700 degrees C) for these nuclei to capture electrons and to form the initial hydrogen, deuterium, and helium atoms of the universe. With these first atoms coming into existence, the universe became transparent in the sense that short wavelength light (about a millionth of a meter) was now able to travel freely without always being absorbed very soon by surrounding particles. Due to the expansion of the universe since then, this early light’s wavelength expanded and is today invisible on Earth because it is at a much longer wavelength than it was 379,000 years ago. That CMB radiation is now detected on Earth as having a wavelength of 1.9 millimeters. That energy is continually arriving at the Earth’s surface from all directions. It is almost homogeneous and almost isotropic.
In the literature in both physics and philosophy, descriptions of the big bang often speak of it as if it were the first event, but the big bang theory does not require there to be a first event, an event that had no prior event. Any description mentioning the first event is a philosophical position, not something demanded by the scientific evidence. Physicists James Hartle and Stephen Hawking once suggested that looking back to the big bang is just like following the positive real numbers back to ever-smaller positive numbers without ever reaching the smallest positive one. There isn’t a smallest positive number. If Hartle and Hawking are correct that time is strictly analogous to this, then the big bang had no beginning point event, no initial time.
The classical big bang model sometimes is expressed by asking us to think of the current expansion of the universe being reversed so that everything in the universe was once at a point, called the singularity. The singularity is generally assumed to be an unrealistic artifact of relativity theory because it violates quantum theory. Relying upon quantum theory, physicists now agree that the projection to a smaller volume must become untrustworthy for any times less than the Planck time. The famous singularity theorems that proved the universe had a singularity are now considered to be inapplicable because their premises are no longer accepted. If a theory of quantum gravity ever gets confirmed, it is expected to provide more reliable information about the Planck epoch from t = 0 to the Planck time, and it may even allow physicists to answer definitively the questions, “What caused the big bang?” and “Did anything happen before then?”
History of the Theory
Like the big bang theory, inflation theory is a family of theories rather than one single, specific theory. The original theory of inflationary expansion (without eternal inflation and many universes) was created by Alan Guth, along with Andrei Linde, Paul Steinhardt, Alexei Starobinsky, and others in the period 1979 to 1982. Proponents say it saved the big bang theory from refutation because it explained many facts with which the classical big bang theory conflicts.
The theory of primordial cosmic strings has been the major competitor to the theory of cosmic inflation, but the above problems labeled (i), (ii), (iii), and (iv) are more difficult to solve with strings and without inflation.
One criticism of the theory is that it could easily be adjusted so that it accounts for almost any observation, making it unfalsifiable and unscientific. Princeton cosmologist Paul Steinhardt and Neil Turok of the Perimeter Institute are two of inflation’s noteworthy opponents, although Steinhardt once made important contributions to the creation of inflation theory. One of their major complaints is that, at the time of the big bang, a great many long-wavelength gravitational waves should have been created; we now have the technology that should have detected these waves, but no evidence of them has been found. Steinhardt recommends replacing inflation theory with a revised big bounce theory.
For a short lecture by Guth on these topics that is designed for students, see https://www.youtube.com/watch?v=ANCN7vr9FVk.
b. Eternal Inflation and the Multiverse
Many of the big bang inflationary theories are theories of eternal inflation, of the eternal creation of more big bangs and thus more universes. The theory is called the Multiverse Theory, the Theory of Chaotic Inflation, and the Theory of the Inflationary Multiverse. These universes are not the worlds of Hugh Everett’s Many-Worlds Theory. Time is structured in the Multiverse Theory very differently than in the Core Theory.
The key idea is that once inflation gets started it cannot easily be turned off. The inflaton field is the fuel of a big bang. Advocates of eternal inflation say that not all the inflaton fuel is used up in producing just one big bang, so the remaining fuel is available to create other big bangs, at an exponentially increasing rate because the inflaton fuel increases exponentially faster than it gets used. Presumably, there is no reason why this process should ever end, so time across the multiverse is eternal, and there will be a potentially infinite number of universes, each with its own time. Also, there is no good reason to suppose our own universe was the first one. In fact, the notion of an order of creation among the many universes has not been well defined.
A helpful mental image here is to think of the multiverse as a large, expanding space filled with bubbles of all sizes, most of which are growing. Each bubble is its own universe, and each might have its own physical constants, its own number of dimensions, even some laws of physics different from ours. In some of these universes, there may be no time at all. Regardless of whether a single bubble universe is inflating or no longer inflating, the space between the bubbles is inflating and more bubbles are being born at an exponentially increasing rate. Because the space between bubbles is inflating, nearby bubbles are quickly hurled apart. That implies there is a low probability that our bubble universe contains any empirical evidence of having interacted with a nearby bubble.
After any single big bang, eventually the hyper-inflation ends within that universe. We say its bit of inflaton fuel has been used up. However, after the hyper-inflation ends, the expansion within that universe does not end. Our own bubble was produced by our particular big bang 13.8 billion years ago, and it becomes larger every day. It is called the Hubble Bubble.
Even if our Hubble Bubble has a finite volume, unobservable space in our universe might be infinite, and if so then there probably are an infinite number of universes among all the bubbles.
The inflationary multiverse is not the quantum multiverse predicted by the many-worlds theory. The many-worlds theory says every possible outcome of a quantum measurement persists in a newly created world, a parallel universe. If you turn left when you could have turned right, then two universes are instantly created, one in which you turned left, and a different one in which you turned right. You exist in both. Unfortunately both theories are called the multiverse theory as well as the many-worlds theory, so a reader needs to be alert to the use of the term. The Everettian Theory is the theory of the quantum multiverse but not of the inflationary multiverse.
The theory of eternal inflation with new universes was created by Linde in 1983 by building on some influential work by Gott and Vilenkin. The multiplicity of universes of the inflationary multiverse also is called parallel worlds, many worlds, alternative universes, alternate worlds, and branching universes: many names denoting the same thing. Each universe of the multiverse normally is required to share some of the same physics (there is no agreement on how much) and all of the same mathematics. This restriction does not apply to a merely logically possible universe of the sort proposed by the philosopher David Lewis.
Normally, philosophers of science say that what makes a theory scientific is not that it can be falsified (as the philosopher Karl Popper proposed), but rather that there can be experimental evidence for it or against it. Because it is so difficult to design experiments that would provide evidence for or against the multiverse theories, many physicists complain that their fellow physicists who are developing these theories are doing technical metaphysical conjecture, not physics. However, the response from defenders of multiverse research is usually that they can imagine someday, perhaps in future centuries, running crucial experiments, and, besides, the term physics is best defined as being whatever physicists do professionally. But critics in turn respond that these predictions are based on too-flexible assumptions that can be revised to prevent any refutation of the multiverse theory.
Now that this section has come to a close, the reader can better appreciate the point that Stephen Toulmin was making when he said, “Those who think of metaphysics as the most unconstrained or speculative of disciplines are misinformed; compared with cosmology, metaphysics is pedestrian and unimaginative.”
5. Infinite Time
Is time infinitely divisible? Yes, because general relativity theory and quantum theory require time to be a continuum. But this answer will change to “no” if these theories are eventually replaced by a new Core Theory that quantizes time. “Although there have been suggestions by some of the best physicists that spacetime has a discrete structure,” Stephen Hawking said in 1996, “I see no reason to abandon the continuum theories that have been so successful.” Twenty-five years later, the physics community became much less sure that Hawking was correct.
Did time begin at the big bang, or was there a finite or infinite time period before our big bang? The answer is unknown. There are many theories that imply differing answers to the question, but the major obstacle in choosing among them is that the theories cannot be tested practically. The Borde–Guth–Vilenkin theorem in cosmology is often said to imply that there is a finite past. That remark is too simple. The theorem shows, in the context of the Multiverse theory, that our particular bubble had a finite past, but not that the universe itself did.
Will time exist infinitely many years from now? The most popular answer is “yes,” but physicists are not sure. What a future theory of quantum gravity will require is still unknown.
Stephen Hawking and James Hartle said the difficulty of knowing whether the past and future are infinite in duration turns on our ignorance of whether the universe’s positive energy is exactly canceled out by its negative energy. The energy of motion and the energy described by the equation E = mc² of a mass m are positive energy. All the energy of gravitation and of spacetime curvature is negative energy. Hawking said in 2018:
When the Big Bang produced a massive amount of positive energy, it simultaneously produced the same amount of negative energy. In this way, the positive and the negative add up to zero, always. It’s another law of nature. So, where is all this negative energy today? It’s … in space. This may sound odd, …space itself is a vast store of negative energy. Enough to ensure that everything adds up to zero.
A short answer to the question “Why is the energy of gravitation negative and not positive?” is that this negative energy is needed if the law of conservation of energy is going to be even approximately true, which it clearly is. A longer answer might ask us to consider a universe containing only the Earth plus a ball above its surface. The ball has gravitational potential energy because of its position in the Earth’s gravitational field: the higher the ball, the more energy. The quantitative value of this gravitational potential energy depends on where you choose to set the zero point of your coordinate system, that is, the point where the potential energy is zero. Customarily this is chosen to be at an infinite distance away from Earth (and away from any other objects, if they were to be added into our toy universe). Let go of the ball, and it will fall toward the Earth. As gravitational potential energy of position is converted to kinetic energy of motion during the fall of the ball toward Earth, the sum of the two energies remains constant. When the ball reaches Earth, its potential energy will be much less than zero, even more negative than when it was released. An analogous but more complicated argument applies to a larger system, such as all the objects of the universe. We would not want the zero point for potential energy to have anything to do with the Earth if we are making the calculations for the whole universe; hence the choice of zero at an infinite distance away from Earth. One philosophical assumption in this argument is that what is physically real is not the numerical value of energy but the value of energy differences.
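The bookkeeping in this toy universe can be made explicit. The sketch below (Python; the release height is a hypothetical choice, the physical constants are standard values) sets the zero of potential energy at infinite distance, so U(r) = -GMm/r is negative everywhere, and checks that the sum of kinetic and potential energy is conserved during the fall:

```python
# Toy universe: just the Earth and one ball. The zero of gravitational
# potential energy is set at infinite distance, so U(r) = -G*M*m/r is
# negative everywhere. The release height below is a hypothetical choice.
G = 6.674e-11    # gravitational constant (m^3 kg^-1 s^-2)
M = 5.972e24     # mass of the Earth (kg)
m = 1.0          # mass of the ball (kg)

def potential(r):
    """Gravitational potential energy, with zero chosen at r = infinity."""
    return -G * M * m / r

r1 = 7.0e6       # release distance from Earth's center (m), hypothetical
r2 = 6.4e6       # distance near Earth's surface (m)

# Released from rest, so total energy is 0 + U(r1). Conservation fixes the
# kinetic energy at r2: K2 = U(r1) - U(r2), which is positive.
K2 = potential(r1) - potential(r2)

assert K2 > 0                                    # the ball speeds up falling
assert potential(r2) < potential(r1) < 0         # U grows more negative
assert abs((0.0 + potential(r1)) - (K2 + potential(r2))) < 1e-3  # K+U constant
```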
If the total of the universe’s energy is either negative or positive (and if quantum mechanics is to be trusted, including its law of the conservation of energy), then time is infinite in the past and future. Here is the argument for this conclusion. The law of conservation of energy implies energy can change forms, but if the total were ever to be non-zero, then the total energy could never become exactly zero (nor ever have been exactly zero) because that would violate the law of conservation of energy. So, if the total of the universe’s energy is non-zero, then there always have been states whose total energy is non-zero, and there always will be states of non-zero energy. That implies there can be no first instant or last instant and thus that time is eternal.
There is no solid evidence that the total energy of the universe is non-zero, but a slim majority of the experts favor a non-zero total, although their confidence in this is not strong. Assuming there is a non-zero total, there is no favored theory of the universe’s past, but there is a favored theory of the future—the big chill theory. The big chill theory implies the universe just keeps getting chillier forever as space expands and gets more dilute, and so there always will be changes and thus new events produced from old events, and so time is potentially infinite in the future.
Here are more details of the big chill theory. 95% of all stars that ever will be born have already been born. The last star will burn out in 10¹⁵ years. Then all the stars and dust within each galaxy will fall into black holes. Then the material between galaxies will fall into black holes as well, and finally all the black holes will evaporate, leaving only a soup of elementary particles that gets less dense and therefore “chillier” as the universe’s expansion continues. The microwave background radiation will continue to red shift more and more into longer wavelengths. Future space will expand toward thermodynamic equilibrium. But because of vacuum energy, the temperature will only approach, but never quite reach, zero on the Kelvin scale. Thus the universe descends into a “big chill,” forever having the same amount of total energy it always has had.
Here is some final commentary:
In classical general relativity, the big bang is the beginning of spacetime; in quantum general relativity—whatever that may be, since nobody has a complete formulation of such a theory as yet—we don’t know whether the universe has a beginning or not.
There are two possibilities: one where the universe is eternal, one where it had a beginning. That’s because the Schrödinger equation of quantum mechanics turns out to have two very different kinds of solutions, corresponding to two different kinds of universe.
One possibility is that time is fundamental, and the universe changes as time passes. In that case, the Schrödinger equation is unequivocal: time is infinite. If the universe truly evolves, it always has been evolving and always will evolve. There is no starting and stopping. There may have been a moment that looks like our big bang, but it would have only been a temporary phase, and there would be more universe that was there even before the event.
The other possibility is that time is not truly fundamental, but rather emergent. Then, the universe can have a beginning. …And if that’s true, then there’s no problem at all with there being a first moment in time. The whole idea of “time” is just an approximation anyway (Carroll 2016, 197-8).
6. Time-Reversibility and Time-Reversal Symmetry
Understanding time-reversibility and time-reversal symmetry can be difficult because some persons say they are the same and others say they are not. Also, some of the relevant terms serve as misleading labels that continue to be used for historical reasons. Here is a list of the relevant terms:
Reversal of time’s arrow
Reversibility
State reversal
Time invariance
Time reversal
Time-reversal invariance
Time-reversal operation
Time-reversal symmetry
Time-reversibility
Time symmetry
Time-translation symmetry
Time-reversibility for a physical system is equivalent to the time-reversal symmetry of the system. Informally, this is the feature that says the laws do not distinguish between a movie clip of the system played forward and played backward. Either could be physically possible without violating the fundamental physical laws because each instantaneous state of the scene is a solution to the laws’ equations. This scene-reversal feature is controversial. To many physicists and philosophers, though not to others, it seems illegitimately to require more than mere symmetry under the operation T of reversing the time variable (that is, replacing t by -t in the equations). The universe is almost always time-reversal symmetric, but there are exceptions in the way some very rare atomic particles decay and perhaps with measurements in quantum mechanics.
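The movie-reversal idea can be illustrated with a toy computation. The sketch below (Python; the numbers are illustrative assumptions, not drawn from the article) evolves a particle under constant gravity, reverses its instantaneous velocity, and then lets the very same law act again; the system retraces its path back to its initial state:

```python
# Illustrative toy model: a particle in one dimension under constant gravity.
# Evolve forward, reverse the instantaneous velocity, evolve again under the
# same law, and the initial state is recovered. All numbers are assumptions
# for the sketch, not values from the article.
g = -9.8      # constant gravitational acceleration (m/s^2)
dt = 0.01     # time step (s)

def step(x, v):
    """One leapfrog step; this integrator is itself time-reversible."""
    v_half = v + 0.5 * g * dt
    x_new = x + v_half * dt
    v_new = v_half + 0.5 * g * dt
    return x_new, v_new

x, v = 0.0, 20.0               # initial position and upward velocity
for _ in range(500):           # the "movie played forward"
    x, v = step(x, v)

v = -v                         # reverse the instantaneous velocity
for _ in range(500):           # the same law, run again
    x, v = step(x, v)

assert abs(x - 0.0) < 1e-8     # back at the starting position
assert abs(v + 20.0) < 1e-8    # with the starting velocity reversed
```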
To indicate some of the related philosophical controversies, consider whether having a time-reversal symmetry has a bearing on time’s arrow. Both the intrinsic and extrinsic camps regarding time’s arrow agree that time-reversal symmetry is not about time’s stopping and then reversing, and both agree that you cannot run experiments backwards in time. But there the agreement ends. Many in the extrinsic camp say time’s arrow is completely independent of whether time has a time-reversal symmetry; most in the intrinsic camp would disagree with this. Maudlin pointed out that Isaac Newton was very aware that his own equations of what we now call classical mechanics were time-reversal symmetric, but he never doubted that time itself has an arrow.
Before exploring these and related issues, let’s clarify the term “symmetry.” Symmetry is about changes or operations or, equivalently, transformations that conserve something. Symmetry is a special kind of pattern of staying the same (staying invariant, not varying) in an important way despite the change. If a feature is preserved under a transformation, then it is said to be invariant under that transformation.
Is change in time asymmetric? That is, is the relation of happens-before asymmetric? Asymmetry is not merely lack of symmetry. The two are contraries of each other, not contradictories. For example, on the integers, the “=” relation is symmetric, the “<” relation is asymmetric, and the “less than or equal” relation is neither.
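The contrast between asymmetry and mere lack of symmetry can be checked mechanically. This small sketch (Python, over a finite range of integers, which suffices for the counterexamples) verifies that “=” is symmetric, “<” is asymmetric, and “less than or equal” is neither:

```python
# A small, self-contained illustration of the point above, checked on a finite
# range of integers: "=" is symmetric, "<" is asymmetric, and "<=" is neither,
# so asymmetry is a contrary of symmetry, not its mere absence.
pairs = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]

def symmetric(rel):
    """rel is symmetric if rel(a, b) always forces rel(b, a)."""
    return all(rel(b, a) for a, b in pairs if rel(a, b))

def asymmetric(rel):
    """rel is asymmetric if rel(a, b) always forbids rel(b, a)."""
    return all(not rel(b, a) for a, b in pairs if rel(a, b))

eq = lambda a, b: a == b
lt = lambda a, b: a < b
le = lambda a, b: a <= b

assert symmetric(eq) and not asymmetric(eq)      # "=" is symmetric
assert asymmetric(lt) and not symmetric(lt)      # "<" is asymmetric
assert not symmetric(le) and not asymmetric(le)  # "<=" is neither
```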
In this sub-section, it is presumed that an instant is a point of time with zero duration. Noon is an instant. It is often claimed that the set of all instants would not really be time if it were not asymmetric under the happens-before relation. It is asymmetric if, for any instant a that happens before instant b, then instant b cannot happen before instant a. This feature rules out closed loops of time. To prevent this and to make the question of whether time can circle back on itself be an empirical question, physicists do not require asymmetry for all of time’s instants, but only for smaller segments of time called “neighborhoods” of instants.
We won’t pause here to explore the many kinds of symmetry and asymmetry. Instead, this section focuses on the two most relevant symmetries for time: time-translation symmetry and time-reversal symmetry.
Time-translation symmetry implies time’s homogeneity, that all instants are equivalent. This feature of time can be expressed using the language of coordinate systems by saying that replacing the time variable t everywhere in a fundamental law by t + n does not change what processes are allowed by the law. The constant n can be any real number. This feature of time was first proposed by Isaac Newton. The General Theory of Relativity is completely time-translation symmetric. There is no special time in the universe according to general relativity.
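The t to t + n formulation can be illustrated numerically. In the sketch below (Python; the oscillator, frequency, and shift are illustrative choices, not drawn from the article), any solution of the harmonic-oscillator law x'' = -w²x remains a solution after the shift t to t + n, which is checked by finite differences:

```python
# A solution x(t) of the harmonic-oscillator law x'' = -w^2 x stays a
# solution when shifted in time by any constant n: that is the law's
# time-translation symmetry. Frequency, shift, and test times are
# illustrative assumptions.
import math

w, n, h = 2.0, 1.37, 1e-4   # frequency, arbitrary time shift, step size

def x(t):
    """One particular solution of x'' = -w^2 x."""
    return math.cos(w * t)

def residual(f, t):
    """How badly f violates the law at time t (finite differences)."""
    second = (f(t + h) - 2 * f(t) + f(t - h)) / h**2
    return abs(second + w**2 * f(t))

shifted = lambda t: x(t + n)    # the time-translated trajectory

for t in (0.0, 0.5, 1.0, 2.5):
    assert residual(x, t) < 1e-4         # x obeys the law...
    assert residual(shifted, t) < 1e-4   # ...and so does its translate
```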
Most members of the intrinsic camp say nature has this symmetry because time passes the same regardless of whether it is Sunday or Monday. Members of the extrinsic camp who advocate time-translation symmetry would say, instead, that the symmetry exists because the allowable fundamental physical processes stay the same regardless of whether it is Sunday or Monday. Members of both camps who say nature is time-translation symmetric agree that specific physical systems within space-time need not have this symmetry; a truck can change its location from Sunday to Monday. Truck-location need not be the same on Monday as it was on Sunday.
Some physicists believe time-translation symmetry fails in two ways. First, assuming the universe had a first moment, such as at the big bang, that first moment is distinguished from all later moments because it is the only moment without a predecessor. Second, there is no time-translation symmetry if some of the fundamental laws we have now were different in the past or will change in the future. However, there is no empirical evidence of this failure; it is merely an educated guess. In response to this guess, other physicists say that if a fundamental law were to change over time from true to false, then this would show instead that the fundamental law was not really a law after all. This response usually initiates a debate about what is required to be a law.
Another symmetry is time-reversal symmetry. Unsurprisingly, the two camps give different senses to the term “time-reversal symmetry.”
Being time-reversal symmetric is having a special symmetry called reversibility or, equivalently, time-reversibility. When there is time-reversibility for the universe or for a closed and isolated sub-system, the system is deterministic in the sense that the present state determines future states. Time-reversibility implies determinism, and it also implies more. It implies that, from complete knowledge of the present state of a closed and isolated system, Laplace’s Demon could retrodict its past with perfect accuracy. If the system is not closed and isolated, then we also need to know what is crossing the system’s boundary and exactly how all the external forces are acting on the system. If the system is time-reversible in this manner, then we can do all this predicting and retrodicting because of the conservation of information in the system. So, in the main sense of “time-reversibility,”
Time-reversibility is equivalent to conservation of information.
What this means, anthropomorphically-expressed, is that a closed and isolated system that conserves information remembers where it came from and knows where it is going. For example, conservation of information implies that, if your apartment burns down, then the information in the smoke, heat, light, and charred remains is sufficient for reconstructing your apartment exactly. The prior information about the apartment is preserved, despite the fire. It is not lost, although it is not accessible practically. Only Laplace’s Demon could access the information.
Even if the universe actually does conserve information and is time-reversible, as some physicists do believe though some do not, everyone agrees that we humans would never experience a reversal because all our experience is of one-way macroscopic processes. We will never experience an un-burning apartment. This is the arrow of time in action.
The big uncertainty about time-reversibility is whether the universe really is deterministic. If it is not, then information is not conserved and time is not reversible. Quantum theory given the Copenhagen interpretation implies that the universe is not deterministic. Quantum theory given the Many-Worlds interpretation implies that the universe is deterministic.
The definition of the state affects the theory’s definition of time reversal. The state of a physical system at a time is defined to be all the information about the system at that time that is needed to specify what can be known of its evolution to other times, given the laws of physics, regardless of whether determinism holds. It is crucially important to notice that different physical theories have their own definition of “state at a time.” In Newton’s classical mechanics, the information needed to specify the state and to predict the evolution of the system is the position and momentum of every component particle of the system. Information about acceleration is not needed, nor is information about what Napoleon Bonaparte had for breakfast the day after his twenty-first birthday. For other physical theories, other information is needed. In classical electrodynamics, information about magnetic and electric fields is also needed. In quantum mechanics we need to know about the wave function.
Some philosophers believe there is a clear, theory-independent definition of time-reversal. Many others disagree and say different theories have their own best definition of time-reversal. In Newton’s classical mechanics, time-reversal requires reversing both time and momenta. Maxwell’s classical electrodynamics requires reversing time, momenta, and the magnetic field but not the electric field. Time-reversal in quantum theory’s standard model of particle physics requires reversing time, parity, and charge all at once in order to get a reversed movie to show physically possible scenes. The key takeaway here is that time-reversal requires more than reversing time in the naive sense of changing t to -t.
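The claim that classical time-reversal requires reversing momenta as well as t can be seen in a toy free-fall example (Python; the constants are illustrative). Hamilton’s equations are dx/dt = p/m and dp/dt = F; the naive map taking x(t), p(t) to x(-t), p(-t) violates the first equation, while also flipping the sign of the momentum restores both:

```python
# Toy free-fall system with illustrative constants. Hamilton's equations:
# dx/dt = p/m and dp/dt = F. Reversing t alone is not a symmetry here;
# reversing t together with the momentum is.
m, F, h = 1.0, -9.8, 1e-5   # mass, constant force, finite-difference step

def x(t):                    # a free-fall solution with initial speed 20
    return 20.0 * t + 0.5 * (F / m) * t * t

def p(t):                    # its momentum
    return m * (20.0 + (F / m) * t)

def d(f, t):                 # numerical time derivative
    return (f(t + h) - f(t - h)) / (2 * h)

t = 1.3
naive_x = lambda s: x(-s)    # t -> -t only
naive_p = lambda s: p(-s)
good_x = lambda s: x(-s)     # t -> -t plus momentum reversal
good_p = lambda s: -p(-s)

assert abs(d(naive_x, t) - naive_p(t) / m) > 1.0   # naive reversal fails
assert abs(d(good_x, t) - good_p(t) / m) < 1e-6    # proper reversal holds
assert abs(d(good_p, t) - F) < 1e-6                # dp/dt = F still holds
```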
Although this claim, too, is controversial, most physicists claim that time-reversibility has two different requirements: (i) reversing any sequence of instantaneous states, and (ii) reversing each instantaneous state. In classical mechanics, for example, if you reverse a process of a woman walking up the steps, you must do two things: (i) start with her on her highest step and have her walk down backwards over each intervening step until she reaches her lowest step, and (ii) reverse her instantaneous velocity at each instant along the way. If she were looking at a vase of roses as she walked up, she continues to look at the vase as she backs down. This is so, even though her eyeball emits the light rays rather than absorbs them.
Component (i) implies that whenever a closed and isolated system changes from being in an instantaneous state 1 at time 1 to a new instantaneous state 2 at any other time 2 while obeying a theory’s fundamental laws, those same laws imply that, if the system were to have been in state 2 at time 2, then it must have been in state 1 at time 1.
Component (ii), on the other hand, implies for our example in classical physics of the woman walking up the stairs that any instantaneous configuration is turned into a configuration with all her particles in the same spatial positions but with reversed instantaneous velocities. These two different configurations would “look” the same if you could take an instantaneous photo of each one without any blurriness. If the two photos were blurry, you’d see upon comparing them that she is tending to move in opposite directions in the two photos. The blurriness is not due to any motion of the camera.
Some philosophers of physics say time reversal is only about (i) and not (ii). Albert (2000, 18) says, “What can it possibly mean for a single instantaneous physical situation to be happening ‘backward’?” His own answer is that this is “gibberish.” Callender says, “It does not make sense to time-reverse a truly instantaneous state of a system” (Callender 2000, p. 254). For an entry into the debate, see section 2.2 of (Roberts 2022).
One point that all the contenders seem to agree upon is that any transformation deserving of being called a time-reversal transformation must have the property that the transformation of the transformation takes you back to the state where you started.
But there is considerable disagreement about the sense of the relevant terms. Many members of the extrinsic camp believe:
Time-reversibility is not identical to time-reversal symmetry.
They say the first implies the second, but the second does not imply the first, if time-reversal symmetry is T symmetry, as it is customarily taken to be. The time-reversal symmetry operation T merely changes t to -t, but it does not reverse the processes allowed by all the laws. Those in the extrinsic camp say full time reversibility should reverse time plus all the processes over time. Carroll explains this:
There is always, in every deterministic set of dynamical laws, a transformation you can do that reverses the direction of time and is a symmetry of the system. The real important thing is information conservation and reversibility. When you have information conservation and reversibility, you will always have some time reversal symmetry. So for example, in electromagnetism, it turns out that if you do time reversal, the electric field is unchanged, but if you… Let’s put it this way, I can make up whatever transformation I want, and then I can ask, is it a symmetry? So if I do a symmetry where I just change t to minus t, time to minus time, that would not be a symmetry of electromagnetism, classical Maxwell’s electromagnetism. What I need to do is take the magnetic field and also reverse it, but not the electric field. So E goes to E, B goes to minus B, and T goes to minus T. That is a good time reversal symmetry. Indeed, in Hamiltonian mechanics, where you have positions and momenta, …the right time reversal invariance operator sends x to x and p to minus p, as well as t to minus t. It reverses momentum and time. So basically, you have to think about what would make a good time reversal operation. In the case of quantum field theory, the good time reversal operation that is a symmetry is called CPT. You change t to minus t, but then you also change the parity of the system and you charge conjugate it. You change particles to antiparticles.
Regarding CPT, individually, each one of C, P, and T is violated in nature, but collectively CPT is not violated. Carroll would probably say that it is the CPT transformation that reverses the phenomena, not T alone.
There is general agreement between the two camps in the dispute about time’s arrow that time-reversal symmetry requires at least time-reversal invariance under the instantaneous time-reversal operation T that changes t to -t.
The term “time-reversal” is ambiguous. The intrinsic theory of time’s arrow says T is reversing time. The extrinsic theory says T is only reversing the dynamics and not time itself in the sense of reversing every process that happens. That is why some complain that it is misleading to call T the “time-reversal operation.” Perhaps it would be better to call it time-derivative-reversal since that is all it can guarantee gets reversed. Nevertheless, everyone in both camps continues to call T the “time-reversal operation” because it has been customary to do so in the literature for so long. And it is known that T-reversal fails in the Standard Model for certain rare weak interactions.
For centuries after Isaac Newton created what is now called classical mechanics, time-reversal symmetry T had been considered by scientists to be an exact feature of the fundamental laws and to be a brute feature of nature. It was presumed to be a universal symmetry that would hold in all fundamental theories. Then in the 20th century, a direct failure of naive time-reversal symmetry was detected. Philosophers of physics immediately disagreed about the significance of this discovery. Some claimed the arrow of time had been found in nature. Members of the entropy camp claimed instead that whether time-reversal symmetry holds or does not hold is irrelevant to time’s arrow because this symmetry or lack of symmetry has nothing to do with entropy, which, they say, is what gives time its arrow.
Back to the main “Time” article for references and citations.
Author Information
Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.
Alan Gewirth (1912-2004)
Alan Gewirth was an American philosopher, famous for his argument that universal human rights can be rationally justified as the outcome of claims necessarily made by rational agents. According to this argument, first outlined in Reason and Morality (1978), all agents necessarily want to be successful in their actions, and since freedom and well-being are the generally necessary conditions of successful agency, every agent must claim rights to freedom and well-being. As the justifying reason for the agent’s rights-claim is the very fact that she is an agent with purposes that she wants to realize, she must also accept the universalized claim that all agents have rights to freedom and well-being. Gewirth calls these rights generic as they correspond to features generically necessary to successful agency. Hence, the supreme principle of morality is the Principle of Generic Consistency (PGC), stating that every agent should act in accord with the generic rights of the recipients of her actions as well as of herself. While freedom refers to one’s control of one’s behaviour in accordance with one’s own unforced choice, well-being refers to the general conditions needed for one to be able to act and to maintain and expand one’s capacity for successful agency.
The PGC applies not only to interpersonal actions but also to social and political morality as it justifies rules and institutions needed for the protection of the generic rights to freedom and well-being at the level of political communities. The minimal state, preventing violations of basic rights, as well as the democratic state, upholding the right to freedom at the political level, are justified in this way. In The Community of Rights (1996), Gewirth argues that the PGC also justifies a more supportive state, involving rights to economic democracy and to productive agency as further specifications of the generic rights to freedom and well-being.
In his last published work, Self-Fulfillment (1998), Gewirth outlines a normative theory of self-fulfilment based on a distinction between aspiration-fulfilment and capacity-fulfilment. In aspiration-fulfilment, one’s aim is to satisfy one’s deepest desires; in capacity-fulfilment it is to make the best of oneself. While one’s deepest desires might be for goals that are unrealistic or even immoral, trying to make the best of oneself requires that one’s goals and aspirations are consistent with the requirements of reason. Thus, capacity-fulfilment involves making the most effective possible use of one’s freedom and well-being within the limits set by the PGC as the rationally justified supreme principle of morality.
Beginning with an analysis of the normative claims involved in agency, Gewirth manages not only to justify a supreme moral principle, but also to derive implications of that principle for political and personal morality. His work is not only a major contribution to contemporary moral philosophy, but also an impressive example of how philosophy can make sense of our lives as agents who are rationally committed to morality.
Alan Gewirth was born in Union City, New Jersey, on November 28, 1912 as Isidore Gewirtz. His parents, Hyman Gewirtz and Rose Lees Gewirtz, were immigrants from what was then Tsarist Russia, where the antisemitic pogroms of the early twentieth century forced many people to cross the Atlantic in the hope of a new beginning and a better life for themselves. Gewirth later dedicated his 1982 book Human Rights “To the memory of my Mother and Father and to Aunt Rebecca and Cousin Libby who as young emigrants from Czarist Russia knew the importance of human rights.” At age eleven, after having been teased by playmates on the schoolyard as “Dizzy Izzy,” he announced to his parents that from now on his first name was to be Alan. The source of inspiration here was a character, Alan Breck Stewart, in Robert Louis Stevenson’s historical adventure novel Kidnapped. In the novel, Alan Breck Stewart was an eighteenth-century Scottish Jacobite whom the young boy Isidore Gewirtz admired as a fearless man of the people. Later, in 1942, he changed his last name from Gewirtz to Gewirth. At a time when antisemitism was also rife in the US, many Jewish Americans found it necessary to anglicize their names. In this way Isidore Gewirtz became Alan Gewirth.
His father, who once had entertained a dream of becoming a concert violinist, gave him violin lessons when Alan was just four or five years old, and later had him take professional lessons. At around age eleven or twelve, Alan himself started to give violin lessons to younger children in the family’s apartment. After entering Columbia University in 1930, he joined the Columbia University Orchestra as a violinist, becoming concertmaster in 1934.
At Columbia, Gewirth was encouraged to pursue philosophical studies by his teacher Richard McKeon. In 1937, he became McKeon’s research assistant at the University of Chicago. Gewirth served in the US Army 1942-46, moving up the ranks from private to captain, after which time he spent the 1946-47 academic year at Columbia on the GI Bill, completing his doctorate in philosophy with a dissertation on Marsilius of Padua and medieval political philosophy (published as a book in 1951). From 1947 onwards, he was a regular member of the faculty of the University of Chicago, from 1960 as a full professor of philosophy. Gewirth was elected a Fellow of the American Academy of Arts and Sciences in 1975, and served as President of the American Philosophical Association Western Division (1973-74) as well as President of the American Society for Political and Legal Philosophy (1983-84). He was the recipient of several prizes and awards, including the Gordon J. Laing Prize for Reason and Morality. He was appointed the Edward Carson Waller Distinguished Service Professor of Philosophy at the University of Chicago in 1975; he became an Emeritus Professor in 1992.
Gewirth continued to give lectures well into his eighties, teaching a course on the philosophical foundations of human rights within the newly constituted Human Rights Program at the University of Chicago as late as 1997-2000. His last public talk was given in August 2003 at the XXI World Congress of Philosophy in Istanbul, Turkey. Alan Gewirth died on May 9, 2004. He was married three times, between 1942 and 1954 to Janet Adams (1916-1997), from 1956 until her death to Marcella Tilton (1928-1992), and from 1996 to Jean Laves (1936-2018). In his first marriage, he was the father of James Gewirth and Susan Gewirth Kumar; in his second marriage, he was the father of Andrew Gewirth, Daniel Gewirth, and Letitia Rose Gewirth Naigles. His younger brother was the educational psychologist Nathaniel Gage (1917–2008).
2. Gewirth in the Context of Twentieth Century Moral Philosophy
The twentieth century was not a very hospitable age for philosophers trying to provide an objectivist foundation for moral principles. In the first half of the century the dominant mode of philosophical thinking about ethics was emotivist and non-cognitivist. Emotivism regarded moral statements about what is right and wrong as mere expressions of the speaker’s attitudes and of her desire to make us share these attitudes. Like other emotive statements, they could be neither true nor false, and there could be no way of proving them. Moral pronouncements came to be thought of as similar to claims made in advertising or in various forms of propaganda. Taking its point of departure in the works of philosophers such as A. J. Ayer in the U.K. and Charles Leslie Stevenson in the U.S., emotivism maintained a dominant presence in analytic philosophy throughout the Cold War years.
The later years of the twentieth century witnessed the rise of postmodernism and a return of pre-modernist cultural relativism. This created room for moral values in public debate, but these values were regarded as relative to various traditions or cultures. Once again, belief in objective and universally justified moral values, such as human rights, was rejected as a culturally produced superstition. However, there was also the additional suspicion that any talk of universal moral values was in reality a disguised attempt by the specific culture of Western Enlightenment to impersonate global reason. Hence, the discourse of universal human rights could either be dismissed as a form of cultural imperialism or find itself compared with a belief in witches and unicorns. The latter claim was made by Alasdair MacIntyre, one of the leading proponents of a communitarian form of relativism.
Against this rather hostile background, Alan Gewirth took it upon himself to prove that a rational foundation could indeed be given for normative ethics, a foundation that would be valid for all rational agents regardless of their subjective preferences or cultural context. Over a period of twenty years, Gewirth published four books and more than sixty journal articles, developing and defending his argument that we, as rational agents who want to realize our goals, logically must claim rights to freedom and well-being, since these rights are the necessary conditions of all successful action. His theory has been the topic of one large analytical monograph and three edited anthologies of philosophical comments and criticism (see References and Further Readings for details). It continues to receive the attention of human rights scholars, moral philosophers, sociologists, and political scientists, and is likely to have a lasting influence on normative analysis and debate in general.
3. Gewirth’s Moral Theory: Agency, Reason, and the Principle of Generic Consistency
In Reason and Morality (1978), Gewirth described his theory as “a modified naturalism” (Gewirth 1978, 363). He wanted to anchor morality in the empirical world of agents and in the canons of deductive and inductive logic, rather than relegating it to intuitions or emotive attitudes that could be arbitrarily accepted or rejected.
Agency is a natural context for a moral theory given that moral prescriptions are normally about what we should or should not do. Here it could be objected that at least one type of moral theory, virtue ethics, is about what kind of dispositions we should have and so focuses on what we should be like rather than on what we should do. However, virtues are at least implicitly related to agency, as the dispositions that they are meant to cultivate are manifested in the ways virtuous persons act. A prudent person is a person who acts prudently, a courageous person is one who acts courageously, a temperate person is one who acts temperately, and so on. Moreover, a person’s inculcating and developing virtues in herself involves agency – as Aristotle pointed out, one becomes prudent, courageous, and temperate by acting prudently, courageously, and temperately; virtues are acquired by practice and habituation.
a. The Normative Structure of Agency and the Necessary Goods of Agency
Any kind of naturalist normative ethics will face the formidable obstacle known as Hume’s Law, saying that we cannot derive an “ought” from an “is.” According to Hume’s Law, descriptive and prescriptive statements inhabit different logical domains. That is, we cannot derive moral conclusions directly from non-moral empirical premises. For instance, from the descriptive observation that, as a matter of fact, most people in a particular society are in favour of a criminalization of blasphemy, we cannot derive the prescriptive conclusion that blasphemy indeed should be criminalized (in this or in any other society).
However, Gewirth argues that agency provides a context in which it is indeed possible to escape the constraints of Hume’s Law. This is so, since agency in itself has a normative structure involving evaluative and normative judgements made by agents about the very conditions of their agency. Gewirth begins with certain evaluative judgements that necessarily must be made by all rational agents. He then moves dialectically from these judgements to a moral rights-claim that likewise must be embraced by all rational agents. This is to derive moral rights in accordance with the Dialectically Necessary Method. The claim that all agents have rights is therefore presented not as a moral conclusion derived from a non-moral empirical premise about what all agents need, but instead as what rationally consistent agents must accept, given their own necessary evaluative judgements about what it means to be an agent.
To begin with, every agent can be assumed to consider the purpose of her action as something good. Gewirth takes this as a conceptual truth of agency. We involve ourselves in agency for the sake of something that we want to achieve by our action, and in this sense our action reveals a positive evaluation of its purpose. However, for an agent to hold her purpose to be good is not necessarily for her to make a moral evaluation. In this context, “good” should be understood simply as “worth achieving,” according to whatever criteria of worth an agent may have. These criteria need not be moral or even prudent ones. For instance, a burglar might have as his purpose to break into people’s homes and steal their possessions, and a person of a hedonist persuasion might have as his purpose to eat and drink as much as he can, although he is aware that this is bad for his long-term health. Still, the burglar and the hedonist hold their respective purposes to be good in the minimal sense that they want to pursue them; this is why they move from non-action to action with the intention of realizing their purposes.
Hence, agency is purposive and for an agent to have a purpose involves that she holds it to be good. However, agency must also be voluntary, in that the agent must be able to control her behaviour by means of her own unforced choice. It must be her choice to act in a certain way; otherwise, the purposes for which she acts would not be her purposes.
Gewirth’s conceptual analysis thus results in an account of action as being controlled by the agent’s unforced and informed choice (voluntariness) and guided by her intention to realize some goal of hers that she judges to be good (purposiveness). Voluntariness and purposiveness are the generic features of agency, as they necessarily pertain to all actions.
Having argued that agency involves voluntariness and purposiveness, and that purposiveness involves the agent’s positive evaluation of her goals of action, Gewirth goes on to claim that all agents must hold that the capacities and conditions needed for generally successful agency constitute necessary goods, regardless of their many different particular goals of action. Of course, the capacities and conditions needed for climbing mountains are very different from the capacities and conditions needed for writing a doctoral dissertation in philosophy, not to speak of the capacities and conditions needed for being a good chess player or being good at growing tomatoes. However, common to all agents, including the mountain climber, the dissertation writer, the chess student, and the tomato grower, is the need for certain general capacities and conditions without which it is either impossible or at least unlikely that they will be successful in the realization of any of their purposes. As every agent necessarily wants to be successful in her actions – the very point of her agency being to achieve her purposes – every agent must accept that the capacities and conditions generally needed for any pursuit are necessary goods. They are goods, in the sense of being objects of positive value, and they are necessary, in the sense that no agent can do without them.
According to Gewirth, the necessary goods of agency are freedom and well-being. Freedom and well-being can also be conceptualized as the generic goods of agency, as they correspond to the generic features of agency, voluntariness and purposiveness. Freedom, corresponding to voluntariness, hence refers to the agent’s actual ability to control her behaviour in accordance with her informed and unforced choice. It requires that the agent is not subjected to violence, coercion, and deception in a way that negatively affects her capacity to control her agency. Moreover, it also requires that the agent should not suffer from any compulsive obsession that would interfere with her capacity for informed and unforced choice; nor should she be addicted to drugs that would negatively affect her capacity to control her behaviour. Without freedom, a person’s behaviour would not qualify as agency, as it would not reflect her choices and her will; instead, she would resemble a slave, being the tool of other people, or the powerless victim of uncontrollable impulses.
Well-being, corresponding to purposiveness, refers to the agent’s possession of abilities and resources necessary to her successful realization of her purposes, involving basic preconditions of agency as well as the conditions required for maintaining and developing capacities for agency. The well-being relevant to successful agency therefore has three levels. Basic well-being includes life, health, physical integrity, and mental equilibrium, as well as objects necessary to maintain life and health, such as food, shelter, and clothing. Nonsubtractive well-being includes whatever is necessary to an agent’s maintaining an undiminished capacity for agency, such as not being the victim of theft, broken promises, malicious slander, or generally unsafe conditions of life and work. Additive well-being, finally, includes whatever is necessary to expand an agent’s capacity for agency, such as having self-esteem and the virtues of prudence, temperance, and courage, as well as education, income, and wealth.
Since all agents necessarily want to be successful in their actions, and since freedom and well-being are necessary to all successful action, all rational agents must find it unacceptable to be deprived of or prevented from having freedom and well-being. Consequently, “[s]ince the agent regards as necessary goods the freedom and well-being that constitute the generic features of his successful action, he logically must also hold that he has rights to these generic features” (Gewirth 1978, 63). For an agent not to claim such rights would be for her to hold that it is acceptable that she is left without freedom and well-being. But she cannot hold this to be acceptable, since she, simply by being an agent, must view freedom and well-being as indispensable goods. Hence, any agent who were to deny that she has rights to freedom and well-being would thereby also involve herself in a contradiction, since she would both hold and at the same time deny that she must have freedom and well-being.
Now, so far, the agent has not made a moral rights-claim. She has only made a prudential rights-claim, that is, a rights-claim that is intended to protect the agent’s own interest in being a successful and efficient agent. However, since the sufficient ground for her rights-claim is simply the fact that she is an agent with purposes that she wants to fulfil, she must also recognize that the same rights-claim can and must be made by all other agents as well. Thus, every rational agent must accept the normative conclusion “All agents have rights to freedom and well-being.” Now this is a moral rights-claim, since it refers to the important interests not only of the individual agent, but of all agents. More precisely, it refers to all prospective purposive agents, since the claim applies not only to persons who are presently involved in agency, but also to persons who in the future can be expected to engage in agency.
Here it is once again important to note that Gewirth does not derive moral rights from facts about agency. His argument is not of the form “Because A is an agent, A has moral rights to freedom and well-being.” Instead he argues that each and every rational agent, from within her own perspective as such an agent, must claim rights to freedom and well-being. That is, from within her own perspective as an agent, A holds that (1) “My freedom and well-being are necessary goods.” Having accepted (1), A, who wants to achieve her goals of action and who is unable to achieve these goals without freedom and well-being, must, on pain of self-contradiction, embrace the evaluative judgement (2) “I must have freedom and well-being.” The “must” of (2) has implications for how A has to conceive of possible threats to her possession of freedom and well-being emanating from other persons. That is, she is logically compelled to hold that (3) “Other persons should not interfere with my having freedom and well-being.” And since (3) is equivalent to claiming a protected possession of freedom and well-being, it is also equivalent to a rights-claim: (4) “I have rights to freedom and well-being.” And since the sufficient condition of A’s rights-claim is that A is an agent, A must then also accept the generalized claim (5) “All agents have rights to freedom and well-being.” While (4) is a prudential rights-claim, (5) is a moral rights-claim, referring to rights had by all agents and not only by A.
Strictly speaking what Gewirth has proven is not that all agents have rights to freedom and well-being, but that all rational agents must hold that all agents have rights to freedom and well-being. However, this does not in any way diminish the practical relevance of Gewirth’s argument. As he himself observes, “what a rational agent ought to do … is what he is rationally justified in thinking he ought to do. But what he is rationally justified in thinking he ought to do is what he logically must accept that he ought to do” (Gewirth 1978, 153). Hence, if every agent must logically accept that all agents have rights to freedom and well-being, then every agent has as good reason as there could ever be to adhere to a moral principle prescribing universal respect for the equal rights of all agents to freedom and well-being.
Such a moral principle is also the outcome of Gewirth’s argument. This is the Principle of Generic Consistency (PGC): “Act in accord with the generic rights of your recipients as well as of yourself” (Gewirth 1978, 135). The term “recipients” refers to the people who are affected by an agent’s action, while “generic rights” denotes rights to freedom and well-being, the generic and necessary goods of agency. Therefore, without any loss of meaning, the PGC could also be stated as “Act in accord with the rights to freedom and well-being of your recipients as well as of yourself.”
The rights prescribed by the PGC are both negative and positive. Agents are required both not to interfere with their recipients’ possession of freedom and well-being, and to help them have freedom and well-being when they are unable to secure these necessary goods by their own efforts, and when help can be given at no comparable cost to the helping agent. To refuse to give such help would be tantamount to a practical denial of the equality of rights prescribed by the PGC.
b. Conflicting Rights
The rights of different agents and recipients may conflict with each other: an unlimited right to freedom for murderers and robbers would allow them to infringe the rights to well-being for their victims; to uphold the right to basic well-being for poor people might require a welfare state that taxes wealthier people and so interferes with their right to property; the additive right to well-being includes a right to have an income, but to have an income from selling drugs to children conflicts with the children’s rights to freedom and basic well-being, and so on. Now, according to Gewirth, conflicts between rights are to be resolved in accordance with three different criteria.
The first criterion is about Prevention or Removal of Inconsistency, according to which agents violating the generic rights of their recipients – aggressor agents, for short – can have their own generic rights justifiably interfered with. Agents who interfere with their recipients’ rights to freedom and well-being by, for instance, coercing them, threatening them, manipulating them, killing them, terrorizing them, assaulting them, or stealing from them are guilty of an inconsistency in that they deny rights to others that they must claim for themselves. Even if they do not make an explicit claim to be superior to other agents, their actions involve a practical rejection of the PGC.
In order to protect or restore the equality of rights prescribed by the PGC, aggressor agents must be either prevented from violating their recipients’ rights in the first place or, if they have already violated these rights, punished for their transgressions. The force used in preventing aggressor agents from violating their recipients’ rights should not exceed what is necessary to protect these rights. Likewise, the severity of the punishment meted out to aggressor agents should be proportionate to the seriousness of the violation of rights that they have inflicted on their victims. Preventing and punishing violations of rights will necessarily involve an interference with aggressor agents’ rights to freedom and well-being, but as it is needed to uphold the equality of rights prescribed by the PGC, such interference is also morally justified by that principle.
The criterion concerning Prevention or Removal of Inconsistency also points to the need for a legal system that can administer punishment in a fair manner. Such a legal system involves laws, courts, judges, prosecutors, defence lawyers, and police officers. By extension, the justification of such a legal system implies the justification of states that are capable of implementing and upholding rights-protecting laws within their borders.
The second criterion is about Degrees of Needfulness for Action, giving priority to the right whose object is most needed for successful agency. For instance, the right to life (basic well-being) is more important to successful agency than the right not to have one’s property interfered with (non-subtractive well-being). Hence, if I can save a drowning child only at the cost of ruining my clothes, I have a duty to do so. Likewise, at the political or societal level, taxing wealthy people for the sake of providing poor people with basic healthcare or providing their children with basic education is morally justified. It is not that wealthy people do not have a right to non-interference with their property, but instead that this right is overridden by the rights of poorer people to life, health, and education. This is so, not because the poor usually outnumber the rich, but because the specific rights of the poor that we are considering here are more important from the point of view of successful agency than the right of wealthier people to have their wealth left untouched.
While the criterion concerning Prevention or Removal of Inconsistency mainly deals with negative rights – rights not to have one’s freedom or well-being interfered with – the criterion concerning Degrees of Needfulness for Action has more to do with positive rights – rights to have one’s freedom and well-being effectively upheld and protected by other agents. The criterion of Degrees of Needfulness for Action also has a bearing on the justification of what is commonly known as the welfare state, that is, a state which upholds the basic and additive rights of well-being of all citizens by means of redistributive taxation.
The third criterion is about Institutional Requirements, allowing for interference with people’s freedom and well-being when doing so is necessary to uphold institutions and social rules that are themselves required by the PGC. Thus, when a judge sentences a murderer to lifetime imprisonment, she interferes with the murderer’s right to freedom, but she is not thereby doing anything morally wrong. On the contrary, she is representing a legal system upholding the generic rights to freedom and well-being as prescribed by the PGC. More specifically, she represents an institution – the state with its laws, courts, judges, police officers, and so on – designed to remove inconsistencies as described by the first criterion above. The judge is not acting as a private person, but as a representative of the law; therefore, although the convicted murderer has not interfered with the judge’s freedom, he cannot argue against the judge that she has violated the PGC by denying him his right to freedom. This is not a case of an individual agent acting on an individual recipient, but of a representative of the law upholding a moral (and legal) principle of justice.
Many rights are not absolute, as they can be overridden by other, more important, rights, as outlined by the criterion of Degrees of Needfulness for Action. Gewirth hence defends a consequentialist position, according to which the rightness of an action depends on how it affects the recipient’s rights, and, in cases of conflicts between rights, on how upholding one right affects other, and possibly more important, rights. This is what he calls a deontological consequentialism (Gewirth 1978, 216), focusing on protecting rights rather than on producing good results in general. Thus, it should be distinguished from a utilitarian consequentialism, according to which the right action or rule of action is the one that results in the greatest total quantity of happiness or preference satisfaction, regardless of how benefits and burdens, pleasures and pains are distributed among individuals.
According to Gewirth, there are indeed rights that are absolute, in the sense that they cannot be overridden. As an example, he presents the case of Abrams, a young lawyer and prominent member of the society, who is being blackmailed by a group of terrorists threatening to use nuclear weapons against the city in which Abrams lives, unless Abrams publicly tortures his mother to death. So – should Abrams give in to the terrorists’ threats and torture his mother to death for the sake of saving the lives of thousands of his fellow citizens?
Gewirth says no. He provides an argument based on agent responsibility that is capable of explaining why the right to basic well-being of Abrams’s mother in this case is indeed absolute. Central to his argument is the Principle of the Intervening Action. According to this principle, if there is a causal connection between one person A’s action or inaction X and some harm Z being inflicted on some other person C, then A’s moral responsibility for Z is removed if, between X and Z, there intervenes some person B’s action Y, and Y is what actually brings about Z (Gewirth 1982, 229).
In the case of Abrams and the terrorists, the Principle of the Intervening Action means that although there is a causal connection between Abrams’s refusing to torture his mother to death and the deaths of thousands of innocent people, Abrams is not morally responsible for their deaths. This is so because there is an action that intervenes between his refusal and the death of these people and which actually brings about their deaths, namely, the terrorists’ use of their nuclear weapons. It is not Abrams’s refusal by itself that kills thousands of innocent people, but the terrorists’ use of their weapons. No one is forcing the terrorists to kill anyone, least of all Abrams; they freely choose to detonate their bombs or whatever type of weapon they have at their disposal as a response to Abrams’s refusal to torture his mother to death. This is their decision and their action, no one else’s.
Thus, the terrorists are both causally and morally responsible for the deaths of these thousands of innocent people, not Abrams:
The important point is not that he lets these persons die rather than kills them, or that he does not harm them but only fails to help them, or that he intends their deaths only obliquely but not directly. The point is rather that it is only through the intervening lethal actions of the terrorists that his refusal eventuates in the many deaths. (Gewirth 1982, 230)
The conflict here is not between Abrams’s refusal to torture his mother to death and the survival of thousands of innocent townspeople, but between their survival and the terrorists’ intention to use nuclear weapons against them. Consequently, Abrams’s duty to respect his mother’s right to basic well-being is not affected by the terrorists’ threat. In this sense, Abrams’s mother has an absolute right not to be tortured to death by her son. By implication, all innocent persons (and not only mothers) have an absolute right not to be tortured to death by anyone (and not only by their sons).
4. The Community of Rights
The PGC applies not only to individual agents and their interaction with their recipients, but also to the collective level of political communities, institutions, and states. The minimal state with its laws against criminal transgressions such as murder, rape, robbery, fraud, enslavement, and so on is justified as instrumentally necessary to the protection of negative rights to freedom and well-being, that is, rights that are about not having one’s freedom and well-being interfered with. The right to freedom also justifies the democratic state, that is, a state that functions according to the method of consent, allowing the people to be a community of citizens, deciding about their own collective fate, and not just the subjects of an autocratic ruler. But the PGC also justifies a supportive state – in common parlance known as the welfare state – which is instrumentally necessary to the protection of positive rights to freedom and well-being, that is, rights that are about actually possessing the freedom and well-being needed for successful agency. Such positive rights imply that people who are unable to develop, or are prevented from developing, freedom and well-being for themselves should receive support from the state to overcome these obstacles to their successful agency. The support provided would involve access to education, health care, and employment whereby individuals can secure an income for themselves, but would also involve public goods such as clean air and water, safe roads, public libraries, and similar commodities that contribute to everyone’s actual possession of freedom and well-being. A supportive state that responds in this manner to its citizens’ positive rights is thereby also a community of rights. The Community of Rights is also the title of Gewirth’s 1996 sequel to Reason and Morality.
According to Gewirth, making the state a community of rights is not only justified in the sense of being permissible but is indeed necessitated by the PGC, as this principle justifies positive rights to freedom and well-being. To act in accord with one’s recipient’s rights to freedom and well-being is not only about not interfering with these goods, but also, when it is necessary, and when one can do so at no comparable cost to oneself, assisting one’s recipients in actually having freedom and well-being. To refuse to help a person who is unable to maintain her basic well-being when one can do so without jeopardizing any aspect of one’s own basic well-being would imply a practical denial of that person’s generic rights of agency. Now, sharing a political community – a sovereign territorial state governed by its citizens – transforms the relationship between agents and their recipients from one of individual persons directly interacting with each other to one of a collective of persons indirectly interacting with each other, by means of social rules and institutions that they or their elected political representatives have decided about. A community of rights aims to remove structural threats to the equality of rights, focusing on “situations where threats to freedom and well-being arise from social or institutional contexts, such as where economic or political conditions make for unemployment, homelessness, or persecution” (Gewirth 1996, 41).
In a political community characterized by social inequality, some people may enjoy prosperous lives with good salaries, wealth, and property, while vast numbers might be left in unemployment and poverty. These poor people may live in unhealthy homes in crime-ridden neighbourhoods, often lacking sufficient education and suffering from a hopelessness that inclines them to abuse drugs and alcohol and to neglect their responsibilities as spouses and parents. Often such inequality has an institutional dimension in that it is being maintained by laws or the absence of laws concerning social welfare, work hours, taxation, property, unionization, and so on. Therefore, providing impoverished and marginalized groups with effective rights to freedom and well-being involves creating a supportive legal and institutional framework that gives them access to education and employment. These goods allow them to make a living for themselves and develop their capacities for successful agency. This is what the community of rights is about.
In this context, it is the state that is the relevant agent of change, rather than individual charity workers, as it is a matter of changing a social condition affecting the political community at large, and as the required changes involve political decisions about laws and institutions. Moreover, as the state derives its moral justification from being instrumental to the upholding and maintenance of its citizens’ equal rights to freedom and well-being, it cannot without undermining its own moral foundation remain passive when confronted with the unfulfilled rights of vast numbers of its citizens.
Acting on behalf of all its citizens, the state can be seen as establishing a link between wealthy and poor groups of citizens, connecting the rights of the poor to the duties of the wealthy: “In so acting the state carries out the mutuality of human rights. For since each person has rights to freedom and well-being against all other persons who in turn have these rights against him or her, the state, as the persons’ representative, sees to it that these mutual rights are fulfilled. … So the state, in helping unemployed persons to obtain employment, enables its other members to fulfill positive duties that, in principle, are incumbent on all persons who can provide the needed help” (Gewirth 1996, 219).
However, here it is important to note that the poor have a right to be helped only when they are indeed unable to secure their rights to freedom and well-being for themselves. The positive right to help is based on necessity – that is, it applies only to cases in which one cannot have one’s rights to freedom and well-being realized without the help of other people or social institutions. Thus, “the positive rights require personal responsibility on the part of would-be recipients of help: the responsibility to provide for themselves if at all possible, prior to seeking the help of others” (Gewirth 1996, 42). Moreover, where there is a justified right to assistance, it is about securing goods needed for successful agency – food, housing, education, a job with a sufficient income, and so on; it is not about satisfying whatever particular need a person might have in the light of her personal interests and preferences.
Central among the positive rights outlined and discussed in The Community of Rights is the one to productive agency. According to Gewirth, unemployment, the lack of education, the lack of affordable health care, and so on, are morally problematic because they deprive persons of their capacity for successful agency. Hence, according to Gewirth, the way to deal with these societal shortcomings is not just to give poor people money but rather to help them develop for themselves the means of self-support, making them capable of standing on their own two feet rather than being reduced to permanent recipients of welfare cheques or charity. To be such a permanent recipient of welfare cheques would be detrimental both to one’s autonomy (central to freedom) and to one’s self-esteem (central to additive well-being); living on welfare is not the solution to the moral problem of poverty but rather a problem in its own right.
Instead of welfare cheques, the community of rights focuses on two mutually reinforcing strategies. At the level of individuals, it aims to develop their capacity for productive work by means of education. At the societal level, it aims to establish a system of full employment combined with a system of economic democracy to be applied to firms competing on market conditions. Gewirth’s ideas here are quite bold, combining ideas usually associated with socialism with a defence of market principles usually associated with liberalism. This comes out clearly in his discussion of economic democracy “in which products are put out to be sold in competitive markets and the workers themselves control the productive organization and process,” which in turn may involve “aspects of ownership either by the workers themselves or by the state” (Gewirth 1996, 260).
Gewirth’s argument that the state is morally obligated to guarantee employment to all citizens reflects his belief that there is a human right to employment. This specific right is derived from the more general right to well-being, as work in exchange for money is a morally justified way of satisfying both basic and additive well-being, enabling one not only to have food on the table and a roof over one’s head but also to buy various consumer goods whereby one can increase one’s quality of life. (Theft, robbery, selling drugs, blackmail, and fraud would exemplify morally unjustified ways of making a living that negatively interfere with others’ well-being.) By participating in the production of goods and services one recognizes the mutuality of human rights by offering goods and services to consumers that they value as components of their well-being, while they in turn pay for these goods and services, thereby contributing to one’s own well-being. By participating in such a mutual exchange, one can also derive a sense of justified self-respect and pride. One contributes something of value to others and earns a living from that contribution. Hence, one is a productive member of one’s community – a person who adds to its total wealth and to the well-being of its members.
The right to productive agency includes a right to education that prepares the individual for work life but also promotes “cultural, intellectual, aesthetic, and other values that contribute to additive well-being, including an effective sense of personal responsibility” (Gewirth 1996, 149–150). In this way, the individual is helped to avoid welfare dependence while at the same time being made aware of human rights and the cultural values of her own as well as other communities. Once again, it is the state, as the institutional representative of the community of rights, that should see to it that all members of the community receive this kind of education.
The right to productive agency implies a right to employment. It has both a negative version – the right not to be arbitrarily deprived of employment – and a positive version – the right to actually have employment in the first place. The state, representing a community of rights, “should seek to secure the negative right to employment by enforcing on private corporate employers the duty not to infringe this right through policies that disregard their severe impact on the employment security of workers.” As for the positive right to employment, it is also the state “that has the correlative duty to take the steps required to provide work for unemployed persons who are able and willing to work” (Gewirth 1996, 218–219). This latter duty can be effectuated by offering unemployed workers retraining to equip them for a new job, or by the state directly employing them in the public sector.
In addition to defending state interventionism in the job market, Gewirth offers an even more radical solution to the problem of how to protect workers’ rights, namely, economic democracy. Given both the background of universal rights to freedom and well-being and that workers are exposed to the power of their capitalist employers regarding decisions about wages, job security, and conditions of work, Gewirth argues that workers should own and control the companies and corporations that employ them. Still, the Gewirthian idea of economic democracy does not try to dispose of the market, nor does it try to eliminate capitalists as such. What it does is to separate the role of capitalists from the power of ownership. Firms owned and controlled by workers will still compete with each other on a market that is guided by the demands of consumers. These firms will also need capital for investment and development. Sometimes they might need to borrow money from banks or private investors. However, the banks or investors will not themselves be owners or shareholders of the firms in question: “In capitalist systems, capital hires workers; in economic democracy, workers hire capital” (Gewirth 1996, 261).
Firms owned by their workers and competing on a market may of course fail in this competition. In a capitalist system, this would entail that workers lose their jobs as their firm is trying to cover its losses. In a system of economic democracy, however, there will be a general organization of firms that intervenes to prevent such threats to workers’ well-being: “When firms are threatened with failure and consequent layoffs of workers, they are not simply permitted to fail. Instead, the general organization … helps them either to improve their efficiency or to convert to some other line of production in which they can successfully compete” (Gewirth 1996, 295).
Here it is also important to point out that Gewirthian economic democracy does not imply a justification of state socialism of the kind associated with the former Eastern Bloc. On the contrary, Gewirth holds that the introduction of a system of economic democracy must depend on such a system being freely accepted by the citizens in a political democracy. It “should not be imposed by fiat; it should be the result of a democratic process of discussion, deliberation, and negotiation in which arguments pro and con are carefully considered by the electorate of political democracy” (Gewirth 1996, 263). Although Gewirth favours a system of economic democracy, he is also aware of the complexity of the question. The argument for workers’ ownership of firms involves empirical assumptions about motivation, solidarity, and productivity that are not settled by the rational justification of the rights of all agents to freedom and well-being. While political democracy, based on the human right to freedom, and its method of consent are rationally justified and morally necessary, the question of how to best organize the economic structure of the community of rights admits of more than one answer. As Gewirth himself recognizes, “[w]hile there are rational arguments for the economic rights I have upheld, there may also be rational arguments for different and even opposed arrangements, so that the rights have a certain element of normative contingency, as against the generic rights [to freedom and well-being] themselves” (Gewirth 1996, 323).
5. The Good Life of Agents
In his last published book, Self-Fulfillment (1998), Gewirth demonstrated that the necessary goods of agency are central not only to interpersonal and political morality but also to personal morality and the quest for a fulfilling and meaningful life. Distinguishing between two varieties of self-fulfilment, aspiration-fulfilment and capacity-fulfilment, he outlines a normative theory of the good life that is also consistent with the prescriptions of the PGC. In aspiration-fulfilment, the aim is to satisfy one’s deepest desires; in capacity-fulfilment, it is to make the best of oneself. Although these two aims are not mutually exclusive – our deepest desire might be to make the best of ourselves – they are conceptually distinct. For instance, it might be the case that our deepest desires are for things other than making the best of ourselves.
Together, aspiration-fulfilment and capacity-fulfilment define us as persons and agents. In aspiration-fulfilment, we are guided by our actual deepest desires; in capacity-fulfilment, we are guided by an idea of what is best in us that might well go beyond our actual desires and even conflict with them. Here, we should also observe that many people do not have self-fulfilment (in either variety) as the direct and conscious goal of their actions. Instead, “their self-fulfillment and their awareness of it emerge as ‘by-products’ of their achieving the direct external objects of their aspirations, whether these consist in composing beautiful music or pursuing political objectives or whatever” (Gewirth 1998, 50).
Aspiration-fulfilment makes an important contribution to the good life as it provides the aspiring person with motivation and purposes to guide her; thus, “[t]he person who has aspirations has something to live for that is especially significant for him, something that gives meaning, zest, and focus to his life” (Gewirth 1998, 32). However, aspiration-fulfilment can also be problematic, from a prudential as well as from a moral point of view.
Prudentially speaking, one can be mistaken about the contents of one’s deepest desires. Hence, one might aim for targets that will leave one frustrated, either because one fails to understand what it takes to realize them, or because one has an exaggerated view of the satisfaction one will get from realizing them. For instance, one might aspire to become a famous novelist without realizing the effort required to achieve this goal, or one might have exaggerated expectations about the happiness one will get from actually becoming a famous novelist.
From a moral point of view, aspirations can be immoral, conflicting with the human rights to freedom and well-being prescribed by the PGC. One’s deepest desires might be about dominating other people, thereby violating their right to freedom. Or they might be about creating a racially, religiously, culturally or ideologically “pure” society in which people identified as having the “wrong” ethnicity, faith, sexual preferences, or political beliefs are killed, enslaved, imprisoned, or persecuted, thereby having their most basic rights to freedom and well-being violated.
In order to overcome the potential errors of aspirations and reconcile self-fulfilment with the requirements of morality and the PGC, Gewirth moves from aspiration-fulfilment to capacity-fulfilment. This form of self-fulfilment is, as we have already noted, about making the best of oneself. Making the best of oneself involves a move in the direction of objectivity and rationality and away from the subjectivity and arbitrariness of the desires of aspiration-fulfilment. This does not mean that capacity-fulfilment rejects desires. On the contrary: as in the case of aspiration-fulfilment, capacity-fulfilment involves agency, and as there can be no agency without desires for some outcome, “there is ultimately no disconnection between capacity-fulfillment and aspiration-fulfillment” (Gewirth 1998, 159). However, in capacity-fulfilment one’s desires have passed through a process of critical assessment, guided by the goal of making the best of oneself. We might describe this process as the move from aspiration-fulfilment’s question “What do I want for myself?” to capacity-fulfilment’s question “What should I want for myself?”
Now, according to Gewirth, “making the best of oneself” is about acting in accordance with the best of one’s capacities. Reason, as the capacity for ascertaining and preserving truth, would belong to this category of capacities, as it is needed for all rational deliberation, including deliberation about what to do with one’s life, what one should aspire to, how one might best realize one’s aspirations, how to handle conflicts between one’s own goals, or between one’s own and other people’s goals, and so on. In Gewirth’s terminology, reason should be understood as “the canons of deductive and inductive logic, including in the former the operations of conceptual analysis and in the latter the deliverances of sense perception” (Gewirth 1998, 72). In relation to self-fulfilment, we use reason to collect, analyse, and evaluate facts about ourselves and our capacities, as well as to make relevant inferences from these facts and apply them to our aspirations and goals.
One important aspect of ourselves that is ascertained by reason is that we are agents – indeed, the very idea of self-fulfilment implies agency, that is, that we can and should do something with our lives, whether it is about satisfying our deepest desires or making the best of ourselves. Now, as Gewirth has shown in his earlier work, being agents, we must claim rights to freedom and well-being, and we must recognize the claim of all other agents that they too have rights to freedom and well-being. Hence, as capacity-fulfilment involves the use of reason, and as reason justifies human rights to freedom and well-being as prescribed by the PGC, capacity-fulfilment involves recognizing a universalist morality of human rights. Accordingly, to make the best of oneself involves acting in accord with the rights to freedom and well-being of one’s recipients as well as of oneself.
Here we can see how capacity-fulfilment comes to modify aspiration-fulfilment – aspirations that are inconsistent with the human rights to freedom and well-being will be rejected by a rational agent as unjustified and impermissible. Therefore, “capacity-fulfillment can sit in reasoned judgment over aspiration-fulfillment” (Gewirth 1998, 101). Moreover, as agents guided by reason must conceive of freedom and well-being as necessary goods, these goods should also figure prominently in projects whereby agents try to realize a good life for themselves. According to Gewirth, “to fulfill oneself by way of capacity-fulfillment, one must make the most effective use one can of one’s freedom and well-being, within the limits set by the overriding authority of universalist morality” (Gewirth 1998, 110).
“Most effective” should not be understood in purely quantitative terms as the realization of as many goals of action as possible. Instead it is about using one’s critical and reflective understanding of one’s abilities and preferences to select some major goals, such as pursuing a certain career that one finds meaningful, or supporting a cause that one finds valuable. Different agents will make different choices here, depending on their varying abilities and preferences. Some agents may dedicate themselves to human rights activism or to the creation of art that widens the horizons of its audience. However, capacity-fulfilment can also be achieved within more ordinary types of occupations. Thus, people can make the best of themselves by becoming “a professional athlete or an electrician or an engineer or a philosopher or a journalist, and so forth” (Gewirth 1998, 131). Within these various occupations, professions, and callings there will be standards of excellence, and by trying to achieve in accordance with these standards the agent will be able to make the best of herself and so achieve capacity-fulfilment. (Here we should once again remind ourselves that the occupations and professions themselves must be consistent with the human rights to freedom and well-being; hence, an agent would not be able to achieve capacity-fulfilment by excelling as an SS officer in the service of Adolf Hitler or as an NKVD agent working for Joseph Stalin.)
It is important in this context to note that making the best of oneself might well imply different commitments and projects at different times in one’s life. Choosing a particular career for oneself might be an important aspect of one’s capacity-fulfilment when one is young; as one grows older, and all or most of one’s working life belongs to the past, it might be more relevant to have a plan about how to make the most of one’s retirement, possibly developing new skills in the process. Such a dynamic conception of capacity-fulfilment implies a realistic view of human life, according to which “one must accept oneself for what one is; in this way one can age gracefully, as against a neurotic longing for one’s past youth” (Gewirth 1998, 118).
An agent’s effectiveness involves her acquiring certain virtues that enhance her capacity for successful agency. Important among these virtues is prudence, which is the ability to “ascertain which of one’s possible ends are most worth pursuing in light of one’s overall capacities and aspirations,” including “both self-knowledge and knowledge of one’s natural and social environment, as well as the proximate ability and tendency to bring these to bear on one’s actions and projects” (Gewirth 1998, 126).
The development of knowledge about oneself and one’s environment can be promoted by education (including self-education) and culture in the form of art and literature that enlighten and enlarge the agent’s understanding of herself and of the world. As a result of such a widening of her horizons, the agent might find inspiration and motivation to develop her own skills in a way that works well for her, given her talents and abilities.
As the agent tries to achieve her ends, she might be confronted with obstacles in the form of fears, self-doubt, setbacks, frustrations, temptations, and disruptive urges. Thus, she will also need the virtues of courage and temperance, helping her to persevere in the face of adversity and to overcome unfounded fears as well as to control her appetites and inclinations so that they do not undermine her determination and ability to achieve her ends.
A good life for an agent often includes various social commitments. She might experience fulfilment in love relationships or family life, or by participating in various voluntary associations, or by patriotic dedication to her country and political community. By identifying with smaller or larger groups, she might provide her own life with meaning and significance, making the best of herself by being a loyal and supportive member of one or many of these groups. Moreover, “a sense of belonging, of being part of a larger nurturing whole, is a valuable component of additive well-being and self-fulfillment” (Gewirth 1998, 151) as it provides the individual with the identity and confidence needed to make the best of herself.
Social commitments typically involve preferential treatment of other members of one’s group. Lovers typically care for each other in a way that they do not care for others, parents typically support their own children in a way that goes beyond whatever support they offer children in general, citizens are typically willing to make sacrifices for their political community that they would not contemplate in relation to other nations or states, and so on. Such particularist allegiances are consistent with the requirements of reason and the principle of equal human rights justified by these requirements – that is, the PGC – as long as the preferential treatment involved does not result in a violation of innocent persons’ rights to freedom and well-being.
Indeed, certain preferential concerns are justified by the PGC. For instance, the right to freedom involves a more specific right to join others in voluntary associations, such as families. In these associations, members typically also acquire special responsibilities for one another’s well-being. Parents, for instance, are morally obligated to care for their children, as the very existence of these children depends on the parents’ exercise of their freedom to procreate. Likewise, the right to well-being justifies the existence of states as necessary to the protection of that right and makes support for such rights-protecting states a duty for their respective citizens. Therefore, citizens are not only justified but morally obligated to support their state – provided, that is, that the state in question is indeed protective of their human rights and does not unjustifiably threaten the rights of members of other political communities. Such support is effectuated by the citizens when they pay their taxes to maintain rights-protecting institutions or when they take part in the defence of their political community in a just war of defence.
Preferential treatment conflicts with reason and the principle of equal human rights to freedom and well-being only when it involves violations of these rights. A mother does not offend against the universality of human rights by choosing to prioritize the feeding of her own starving child in a situation in which there are many other starving children around. However, things would be different if she were to feed her child with food taken from someone else’s starving child; then she would violate the right to basic well-being of that child. Likewise, a citizen of a rights-respecting state does not offend against the universality of human rights by taking a particular interest in the flourishing of her own political community and by being willing to make sacrifices for that community that she would not make for any other political community. However, she would be guilty of contributing to violations of human rights if her patriotic loyalty were to extend to a support for her political community even as that community violates human rights, for instance, by perpetrating a genocidal attack on a religious or ethnic minority.
Gewirth’s justification of agency-based rights and of the PGC has received many critical comments from other philosophers. Among other things, it has been argued (in the 1984 anthology Gewirth’s Ethical Rationalism, edited by Edward J. Regis Jr.) that agents are not logically compelled to claim moral rights just because they want to be successful in achieving their own goals of action (R. M. Hare); that Gewirth might not have been successful in bridging the gap between the “is” of human agency and the “ought” of morality (W. D. Hudson); that while Gewirth might be capable of justifying negative rights, his theory is unable to justify positive rights and hence also unable to justify the supportive or welfare state (Jan Narveson). Gewirth has replied to these and many other objections (in the anthology mentioned above, as well as in many separate articles in various philosophy journals).
In addition to Gewirth’s own replies to his critics, his theory has been carefully and thoroughly defended by Deryck Beyleveld, who in his 1991 book The Dialectical Necessity of Morality listed 66 categories of objections to the justification of the PGC (often including more than one critic in each category) and went on to show how each and every objection either had already been convincingly dealt with by Gewirth himself or could be dealt with by means of a rational reconstruction of Gewirth’s argument.
In 1997 a conference dedicated to the exploration of Gewirth’s moral philosophy took place at Marymount University, Arlington, Virginia. The comments, presented by the participants in this conference together with a reply by Gewirth himself, were later published as a book with the title Gewirth, edited by Michael Boylan. Gewirth’s theory has continued to attract attention after his death in 2004, including a 2016 anthology entitled Gewirthian Perspectives on Human Rights (edited by the Swedish philosopher Per Bauhn); likewise, the PGC is frequently referred to in discussions relating to human rights and social justice.
One reason for the continued interest in Gewirth’s theory is, of course, that we live in troubled times. As Gewirth himself once pointed out, “[i]n a century when the evils that man can do to man have reached unparalleled extremes of barbarism and tragedy, the philosophical concern with rational justification in ethics is more than a quest for certainty” (Gewirth 1978, ix). Gewirth was referring to the twentieth century, with its two World Wars and the Holocaust, but these words have certainly not lost their relevance in the twenty-first century, when mankind is tormented by fanaticism and terrorism, as well as by widespread global inequalities between men and women, and between those who have and those who have not.
The need for a rationally justified morality is as great as ever before in human history, if not greater, given the facts of globalization. Different cultures and moralities are brought in ever closer contact with each other, thereby creating possibilities for conflict as well as for cooperation, while new technologies enable us to affect the lives of people across the globe. Thus, questions of agency, morality, and rights will be of the utmost importance for our deliberations about how to shape our individual and collective futures. It is in the context of such deliberations that Alan Gewirth’s carefully developed arguments have their place; his contributions to modern moral and political philosophy are of a significant and lasting kind.
6. References and Further Reading
a. Primary Works
i. Monographs
Marsilius of Padua and Medieval Political Philosophy (New York: Columbia University Press, 1951). Gewirth’s doctoral dissertation on the fourteenth-century philosopher who challenged the papal authority in political matters and defended the idea of popular sovereignty.
Reason and Morality (Chicago: The University of Chicago Press, 1978). Gewirth’s main work in moral philosophy, providing a detailed argument for the Principle of Generic Consistency (PGC) and its derivation from agents’ necessary evaluation of freedom and well-being as the necessary goods of successful agency.
Human Rights (Chicago: The University of Chicago Press, 1982). A collection of essays by Gewirth, dealing with the justification and application of the PGC.
The Community of Rights (Chicago: The University of Chicago Press, 1996). Gewirth’s main work in social and political philosophy, in which he provides an argument for positive rights to freedom and well-being, including rights to employment and to economic democracy, as well as for a welfare state that is also a community, based on values such as respect and care.
Self-Fulfillment (Princeton: Princeton University Press, 1998). In this work, Gewirth sets out to argue that self-fulfilment comes in two forms, as aspiration-fulfilment and as capacity-fulfilment, and that making the best of one’s life must include adherence to universal human rights, as defined by the PGC.
ii. Articles and Book Chapters
“Introduction”, in Gewirth, Alan (ed.) Political Philosophy (London: Collier-Macmillan, 1965), pp. 1–30. This introductory chapter provides valuable clues to Gewirth’s later thinking on political rights and social justice, as well as his early ideas on combining natural law with consequentialism.
“The Epistemology of Human Rights”, Social Philosophy & Policy 1 (2), 1984, 1–24. In this article Gewirth outlines the conceptual and logical structure of human rights in general and his dialectically necessary justification of the PGC in particular.
“Practical Philosophy, Civil Liberties, and Poverty”, The Monist 67 (4), 1984, 549–568. Here Gewirth outlines his ideas about how philosophy can be practical, exemplifying by discussing how the poor can be provided with effective access to the political process.
“Private Philanthropy and Positive Rights”, Social Philosophy & Policy 4 (2), 1987, 55–78. In this article, Gewirth argues that while private philanthropy might contribute to important human values, for reasons of justice and fairness, the primary responsibility for upholding citizens’ positive rights to basic well-being should rest with the state.
“Ethical Universalism and Particularism”, The Journal of Philosophy 85 (6), 1988, 283–302. Here Gewirth argues that certain particularist commitments, for instance to one’s family and country, are not only consistent with but are indeed justified by universalist morality and its supreme principle, the PGC.
“Is Cultural Pluralism Relevant to Moral Knowledge?” Social Philosophy & Policy 11 (1), 1994, 22–43. In this article Gewirth addresses the topic of multiculturalism, arguing that the norms and values of different cultures must themselves be assessed from the perspective of rational moral knowledge as embodied in the PGC.
“Duties to Fulfill the Human Rights of the Poor”, in Pogge, Thomas (ed.), Freedom from Poverty as a Human Right (Oxford: Oxford University Press, 2007), pp. 219–236. This book chapter is based on “Justice: Its Conditions and Contents”, Gewirth’s keynote address at the XXI World Congress of Philosophy in Istanbul, Turkey, delivered on August 17, 2003. Here Gewirth outlines a positive duty of wealthier nations to provide poorer nations with agency-empowering assistance. This argument can be seen as containing ideas for a book that Gewirth was working on at the time, entitled Human Rights and Global Justice; the book was left unfinished at the time of his death.
b. Secondary Works
Bauhn, Per (ed.). Gewirthian Perspectives on Human Rights (New York: Routledge, 2016). A collection of essays with new interpretations and applications of Gewirth’s theory, with a particular focus on human rights.
Beyleveld, Deryck. The Dialectical Necessity of Morality (Chicago: The University of Chicago Press, 1991). An extensive and detailed defence of Gewirth’s argument for the PGC, dealing with sixty-six distinct types of objections made by philosophers; foreword by Gewirth.
Boylan, Michael (ed.). Gewirth (Lanham: Rowman & Littlefield, 1999). A collection of essays commenting on Gewirth’s theory, how it relates to Kantianism, rationalism in ethics, altruism, and community; the book also contains Gewirth’s replies to the comments, as well as a chronological list of all his published writings up to 1998.
Regis Jr., Edward (ed.). Gewirth’s Ethical Rationalism (Chicago: The University of Chicago Press, 1984). A collection of critical essays dealing with various aspects of Gewirth’s theory, such as the “is–ought” problem, duties relating to positive rights, and marginal agents; Gewirth replies to his critics in the last chapter.
William Hazlitt is best known as a brilliant essayist and critic. His essays include criticism of art, poetry, fiction, and drama. He wrote social and political commentary, portraits of major writers and political figures of his age, and a biography of his great hero, Napoleon. He had intended to follow his father into the Unitarian ministry but became instead a painter of portraits before settling into a career as a writer. His earliest writing is philosophical, and his key ideas are incorporated into his later work as a critic and conversational essayist.
Hazlitt was acquainted with many of the leading figures of the period, including Wordsworth and Coleridge, Keats and Shelley, the philosopher William Godwin, and the essayists Leigh Hunt and Charles Lamb. Like other political radicals of the time, he was persecuted by the Tory press, being referred to disparagingly by one periodical as belonging, with Keats and Hunt, to the ‘Cockney School’. His most notorious work, Liber Amoris (1823), gave ammunition to his enemies by candidly recounting the story of his infatuation with Sarah Walker, the daughter of his landlady, for whom he divorced his wife only to be rejected. He died in 1830, at the age of 52.
Hazlitt was educated at New College, Hackney, a Dissenting academy, where he acquired a thorough grounding in philosophy and literature. He left prematurely, but not before he had begun developing the ideas that he later described as his ‘metaphysical discovery’ and that formed the core arguments of his first book, An Essay on the Principles of Human Action (1805). In this he argues against psychological egoism, materialism, associationism, and a Lockean account of personal identity. He argues for the formative power of the mind and the natural disinterestedness of human action regarding future benefits for oneself and others.
Hazlitt was born on April 10, 1778, in Maidstone, in the English county of Kent. His Irish father, also named William Hazlitt, was a Presbyterian minister, an author of theological and philosophical works, and a friend of leading Dissenting thinkers such as Joseph Priestley and Richard Price. His mother, Grace, was from an English Dissenting family. In 1780 the family moved to Bandon, County Cork, Ireland. Running into political difficulties with the local community, the Rev. Hazlitt moved the family again, this time to the United States, where he founded the first Unitarian church in Boston but failed to become established. The family returned to England in 1787, to the village of Wem, near Shrewsbury, Shropshire.
William started his formal education in the small school run by his father. He also had periods of schooling in Liverpool, from where he wrote home, precociously, about the injustice of the slave trade and of the Test and Corporation Acts. William intended to follow his father into the Presbyterian, specifically Unitarian, ministry. Because Unitarianism is a rational and politically liberal Dissenting tradition, the family welcomed the French Revolution in 1789. In July 1791 the Birmingham home, library, and laboratory of Joseph Priestley were destroyed by a mob. Young William penned a passionate letter in defence of Priestley, which was published in the Shrewsbury Chronicle.
In 1793 Hazlitt left Wem and the oversight of his father to begin his formal training for the ministry at New College, Hackney, just north of London. This was a Dissenting Academy for the education of lay and ministerial students. If its building rivalled Oxford and Cambridge colleges for grandeur, the curriculum exceeded them in its breadth and its pedagogy in its promotion of free enquiry and philosophical debate. Robert Southey wrote that students ‘came away believers in blind necessity and gross materialism – and nothing else’. Internal disputes and financial difficulties, as well as its reputation as a hotbed of sedition, were already starting to destabilise the college. Many of the students were restless, radicalised by works such as Godwin’s Enquiry Concerning Political Justice (1793) or distracted by the theatres and other pleasures afforded by its proximity to London. Hazlitt left prematurely, having lost his sense of a vocation, but the college had given him a solid grounding in philosophy and literature; and he may already have made the ‘metaphysical discovery’ that would form the basis of his first book.
Hazlitt chose to follow his brother John into a career as a portrait painter, and to live and train with him in London. He soon met leading radicals and thinkers, including William Godwin and Joseph Fawcett. The year 1798 was a landmark, as he recalls in his 1823 essay ‘My First Acquaintance with Poets’ (CW 12). When Samuel Taylor Coleridge came to Shrewsbury, to consider a vacancy as a Unitarian minister, Hazlitt went to hear him preach, and later at dinner in Wem, the poet ‘dilated in a very edifying manner on Mary Wollstonecraft and [James] Mackintosh’. Coleridge stayed the night; in the morning he received a letter from Thomas Wedgwood offering him £150 a year to relinquish his ministerial intentions and devote himself to poetry and philosophy, which he immediately accepted. Any disappointment Hazlitt felt was assuaged by the poet’s invitation to visit him at Nether Stowey in Somerset. Delighted, the nineteen-year-old William accompanied Coleridge back to Shrewsbury. On this walk, Hazlitt attempted, with difficulty, to outline the argument of his ‘metaphysical discovery’.
His stay later that year with Coleridge at Nether Stowey, and with Wordsworth at nearby Alfoxton, was another formative experience. The diarist Henry Crabb Robinson, who first met Hazlitt at this time, describes him as a shy and somewhat tongue-tied young man, but also as the cleverest person he knew. By now it was becoming clear that the tide was turning against the radicals and reformers. At a lecture, ‘On the Law of Nature and Nations’, Hazlitt heard James Mackintosh renounce his support for the Revolution and radicalism. Thereafter Hazlitt had nothing but contempt for apostasy of this kind.
Hazlitt painted portraits in Manchester, Liverpool, and Bury St Edmunds. A portrait of his father was exhibited at the Royal Academy in 1802. On a visit to Paris, he caught a glimpse of his hero, Napoleon, and spent hours copying works by Titian, Raphael, and Poussin in the Louvre. He told an Englishman who praised his work that rapid sketching was his forte and that, after the first hour or two, he generally made his pictures worse. He later wrote about his career as a painter in essays such as ‘On the Pleasure of Painting’ (CW 18).
When he was twenty-five, Hazlitt visited the Lake District in Northern England, where Coleridge and Wordsworth were living. The poets regarded him as a moody and easily enraged young man but possessed of real genius. However, the stay ended badly. In 1815, Wordsworth gave an account of the episode to Crabb Robinson as an explanation for his coolness towards Hazlitt. According to this (not necessarily reliable) account, Wordsworth had rescued Hazlitt from a ducking following Hazlitt’s assault on a local woman.
An important new friendship dates from October 1804. Charles Lamb was an old schoolmate and friend of Coleridge and was already a published poet and journalist. Hazlitt also saw a good deal of William Godwin. He was still attempting to get his ‘metaphysical discovery’ into decent order, and it is likely that Godwin advised him. Godwin certainly assisted practically by recommending the work to the publisher Joseph Johnson. The book was published in July 1805. Johnson clearly did not anticipate a huge demand for a work of metaphysics from an unknown author: the first edition (the only one in Hazlitt’s lifetime) consisted of just 250 copies, and yet when he was denigrated by the editor of the Quarterly Review, William Gifford, as a writer of third-rate books, Hazlitt responded: ‘For myself, there is no work of mine which I would rate so high, except one, which I dare say you never heard of – An Essay on the Principles of Human Action’ (CW 9: 51).
Mary Lamb, Charles Lamb’s sister, was attempting to interest Hazlitt in a relationship with her friend Sarah Stoddart, who lived in a cottage in Winterslow in Wiltshire. Hazlitt was often busy, so meetings were few and far between. He was working on a second publication, Free Thoughts on Public Affairs (1806). In 1807 he completed his abridgement of The Light of Nature Pursued by Abraham Tucker, and he worked on a series of letters, published in William Cobbett’s Political Register and subsequently as a book entitled A Reply to the Essay on Population, by the Rev. T. R. Malthus (CW 1). Then came his The Eloquence of the British Senate, an anthology of political speeches. Occupied with researching and writing, and still painting, he somehow found time to correspond with Miss Stoddart. She was, at 33, three years older than him. After their largely epistolary courtship, they married on May 1, 1808. The financial interference of Sarah’s brother, John Stoddart, a lawyer and future editor of The Times, rankled with Hazlitt. There were tensions from the start. Sarah liked tidiness and busyness, and despite his recent flurry of publications, she suspected Hazlitt was an idler. In fact, he was working on a book Godwin had commissioned: A New and Improved Grammar of the English Tongue (CW 2). Hazlitt and Sarah went to live in Winterslow. A child was born but lived only a few months. In September 1811, their only child to survive, another William, was born.
Hazlitt worked on a completion of the Memoirs of the Late Thomas Holcroft, which was not published until 1816, in part because Godwin, who had been a close friend of the playwright and novelist, objected to the way Hazlitt had made use of Holcroft’s diary. Earlier, a plan to write a History of English Philosophy had failed due to insufficient subscribers. The scheme was reinvented as a lecture series, with lectures planned on Hobbes, Locke, Berkeley, self-love and benevolence, Helvétius, Price and Priestley on free will and necessity, John Horne Tooke on language, and natural religion. The lectures were delivered at the Russell Institute in Bloomsbury from January 14, 1812. The first lecture was considered monotonous, but subsequently the delivery improved, and Crabb Robinson reported that the final lecture was ‘very well delivered and full of shrewd observation’. But another attempt to publish the series as a book again failed to attract sufficient subscriptions. Having rejected the ministry, and with only mediocre success as a portrait painter and as a philosopher, Hazlitt was ready for his true vocation: as a journalist, critic, and essayist.
In October 1812 Hazlitt was engaged as a Parliamentary reporter by James Perry, the proprietor of the Morning Chronicle. The four guineas a week he was paid enabled him to move his lodgings to a house in Westminster, one recently vacated by James Mill and owned by Jeremy Bentham, who was also a neighbour (although they may never have met). Within a few months, Hazlitt progressed to an appointment as drama critic, with the opportunity also to contribute political pieces. However, his public support for the fallen Napoleon caused him difficulty with editors. More sympathetic were Leigh Hunt and his brother John Hunt of the Examiner, who were steady in their commitment to political reform. Writing now on drama, painting, and poetry, Hazlitt contributed also to The Edinburgh Review, the leading liberal periodical. Napoleon’s return from Elba in March 1815 and his subsequent defeat at Waterloo represented ‘the utter extinction of human liberty from the earth’. A period of depression and heavy drinking followed. Godwin was one of the few friends who shared his anguish.
Hazlitt’s family situation was difficult. He and Sarah quarrelled, and his brother was now alcoholic and in decline. Hazlitt worked relentlessly to cover household expenses. In 1816 the Memoirs of the Late Thomas Holcroft (CW 3) was finally published, and in 1817 came his first essay collections: The Round Table and Characters of Shakespeare’s Plays (both CW 4). The Round Table shows the mastery of form that Hazlitt had already achieved as an essayist, with 41 titles, including ‘On the Love of Life’, ‘On Mr Kean’s Iago’, ‘On Hogarth’s Marriage a-la-Mode’, ‘On Milton’s Lycidas’, ‘On the Tendency of Sects’, ‘On Patriotism’, ‘On The Character of Rousseau’, ‘Why the Arts are Not Progressive’ and, perhaps most famously, ‘On Gusto’. (There were additional essays by Leigh Hunt in the original edition.)
Hazlitt met John Keats for the first time in 1817: Keats admired him and regarded him as a philosophical mentor. He had also met Percy Shelley, probably at Godwin’s. His relationship with the older Romantics was not good. When John Murray published Coleridge’s long-uncompleted poems ‘Christabel’ and ‘Kubla Khan’, Hazlitt’s reviews displayed the full extent of his frustration with his early mentor. He subsequently criticized Coleridge’s Biographia Literaria vigorously. These attacks upset their mutual friend Charles Lamb and Coleridge himself. Henceforth, in Duncan Wu’s words, Coleridge and Wordsworth ‘dedicated themselves to the dismantling of Hazlitt’s reputation, by fair means or foul’ (Wu, 2008: 191). Hazlitt’s attacks on the poets continued with a review of Robert Southey’s youthful dramatic poem Wat Tyler, which had recently been published against Southey’s wishes. Hazlitt compared its radical sentiments with the poet’s more recent ultra-Royalist articles in the Quarterly Review. But the renegade poets were not the only people he criticized: liberals and reformers were not sacrosanct.
In June 1817, Hazlitt became the drama critic of The Times, which sold 7000 copies a day. The work was exhilarating and exhausting as the two main theatres, Drury Lane and Covent Garden, changed their bills daily, and Hazlitt would often compose the review in his head as he hurried through the streets to dictate it to the printer. In 1818 his reviews were collected in A View of the English Stage. In the same year Lectures on the English Poets was published, based on a lecture series he had given at the Surrey Institution (both CW 5). His lectures had been applauded and cheered, despite some provocative political allusions. The reviews of the books were good, except, inevitably, those in the Tory periodicals. Hazlitt regarded William Gifford, the editor of the Quarterly Review, as ‘the Government’s Critic, the invisible link that connects literature with the police’.
In the summer of 1818 Hazlitt retired to Winterslow to stay by himself at an inn. He wanted to be close to his beloved son, but he was estranged from Sarah. He worked on a new series of lectures, on English comic writers from Ben Jonson to Henry Fielding and Laurence Sterne. By the end of that summer, however, he was rocked by a vituperative article in Blackwood’s Edinburgh Magazine entitled ‘Hazlitt Cross-Questioned’. The author, J. G. Lockhart, the magazine’s co-editor, had heard from Wordsworth about Hazlitt’s Lake District episode of 1803. The periodical also attacked Keats and Leigh Hunt, ridiculing the three of them as the ‘Cockney School’. It was a scurrilous political campaign, aimed at harming him professionally—and it partly succeeded, for Taylor and Hessey, who had published previous lecture series, withdrew an offer of £20 for the copyright of the Comic Writers series. Hazlitt took legal action and eventually settled out of court, winning £100 damages, plus costs.
Hazlitt now had yearly earnings of approximately £400, some of which he may have gifted to his elderly parents (now living, with his sister Peggy, in Devon). He was not inclined to save money. He was often in default of his rent, and Bentham eventually evicted him. Lectures on the English Comic Writers (CW 6) was published in 1819, followed in August by his Political Essays (CW 7), which included the major two-part essay ‘What is the People’. As usual, the reviews were partisan. William Gifford, in the Quarterly Review, returned to the attack, and the Anti-Jacobin Review called for Hazlitt’s arrest. The attacks, as A. C. Grayling notes (2000: 248), were purely personal. The Government’s repressive measures had raised the political temperature and public discontent. In August eleven people were killed and 600 injured when dragoons charged demonstrators at St Peter’s Field in Manchester.
In November 1819, Hazlitt began a series of lectures on Elizabethan dramatists other than Shakespeare. Lectures on the Dramatic Literature of the Age of Elizabeth was published in February 1820 (CW 6). He was at the height of his reputation, widely recognised as a great critic and prose stylist. In April 1820 he embarked on a series of essays that would further enhance his reputation. These were published as Table Talk, in two volumes in 1821 and 1822 (CW 8). Less polemical than the Round Table essays, they are longer and more reflective. One of the best known of the thirty-three essays is ‘The Indian Jugglers’; others include ‘On Genius and Common Sense’, ‘On the Ignorance of the Learned’, ‘Why Distant Objects Please’, and ‘On the Knowledge of Character’. News of his father’s death had reached Hazlitt belatedly at Winterslow and his feelings about his father are movingly expressed in the essay ‘On the Pleasure of Painting’. After a visit to his mother and sister, he returned to lodgings in London, once again in Southampton Buildings. It was about to become the scene of the most painful episode of his life.
Sarah Walker, the landlady’s daughter, was nineteen. A flirtatious relationship became, on Hazlitt’s side, a passionate infatuation. But Hazlitt was not the only lodger with whom Sarah flirted. When he overheard Sarah and her mother talking lewdly about other lodgers, he was shocked and frustrated. He decided to push Sarah towards a commitment by freeing himself to marry. By February 1822 he was in Scotland, where divorce was easier to obtain, arranging for his wife to discover him with a prostitute. A period of residence in Scotland was required, and he spent the time writing, also lecturing on Shakespeare. His wife’s attitude was pragmatic, and she took the opportunity to walk independently in the Highlands.
News from London concerning Sarah Walker’s behaviour caused Hazlitt additional agony; he rushed to see her only to be met with a cold reception. After a tormented week, he returned to Edinburgh to complete divorce proceedings, then back in London, newly single, he witnessed Sarah Walker walking with his main rival, John Tomkins, in a way that convinced him they were lovers. By now he regarded her as ‘a regular lodging-house decoy’. His friends witnessed the ‘insanity’ of his conflict of adoration and jealousy. Meeting him in the street, Mary Shelley was shocked by his changed appearance. As late as September 1823, visiting London from Winterslow, he spent hours watching Sarah Walker’s door. Hazlitt chose now to compile from his notes and letters a confessional account of the whole affair. This became Liber Amoris, or The New Pygmalion (CW 9), his most notorious book. It was published anonymously in 1823, but no one doubted its authorship. It caused an uproar and allowed his enemies further to impugn his morality and his judgment.
During his period of infatuation, both Keats and Shelley had died. Before his death, Shelley had collaborated with Leigh Hunt and Lord Byron to launch a new journal, The Liberal. One of Hazlitt’s contributions was ‘My First Acquaintance with Poets’, the essay in which he discusses his early meetings with Coleridge and Wordsworth, and which ends with an affectionate tribute to Lamb. If Hazlitt’s literary power was undiminished by those recent events, his financial situation certainly was, and in February 1823 he was arrested for debt. It was an unpleasant experience, but brief because his friends were able to supply ‘terms of accommodation’. He now resumed his regular contributions to periodicals, and he started work on the character portraits that would be republished in The Spirit of the Age (CW 11) in 1825. It sold well and is considered one of his finest achievements. The men portrayed include Bentham, Godwin, Coleridge, Sir Walter Scott, Lord Byron, Southey, Wordsworth, Mackintosh, Malthus, Cobbett, and Lamb.
Relief from his emotional struggles and financial crisis presented itself, conveniently, in the form of marriage to an independent woman, Isabella Bridgewater. She was intelligent and educated, a widow with £300 a year. Their love (or understanding) developed rapidly, and in early 1824 they were in Scotland (where his divorce was recognised) to get married. They then embarked on a continental tour, during which he contributed travel pieces to the Morning Chronicle, subsequently published (in 1826) as Notes of a Journey Through France and Italy (CW 10). In Paris they visited the Louvre. His hopes and commitments had not changed since his last visit in 1802. In Florence he visited Leigh Hunt and Walter Savage Landor. He liked Venice more than Rome but admired the Sistine Chapel. They returned via the Italian Lakes and Geneva, where he enjoyed scenes associated with Jean-Jacques Rousseau.
In 1826 Hazlitt finished preparing for publication the essays collected in his final major collection, The Plain Speaker (CW 12). These included some of his greatest essays, such as ‘On the Prose Style of Poets’, ‘On the Conversation of Authors’, ‘On Reason and Imagination’, ‘On Londoners and Country People’, ‘On Egotism’, ‘On the Reading of Old Books’, ‘On Personal Character’, and, perhaps best known, ‘On the Pleasure of Hating’. The book was published in May 1826, in the same month as Notes of a Journey. Settled in Down Street, Piccadilly, with Isabella, without financial worries, Hazlitt contemplated his most ambitious work: a biography of Napoleon. Researching this would require a prolonged stay in Paris. This, or perhaps young William’s unfriendly manner towards his step-mother, unsettled Isabella, and the marriage foundered. It was said that she had fallen in love with Hazlitt because of his writings and parted from him because of the boy.
By December 1827 Hazlitt was complaining of ill-health. In February 1828 he wrote his ‘Farewell to Essay Writing’, a powerful justification of his ‘unbendingness’: ‘What I have once made up my mind to, I abide by to the end of the chapter’ (CW 17: 319). He returned to Paris to pour everything into the completion of The Life of Napoleon Buonaparte (CW 13 – 15). Then, back in London, he continued a project he had started in autumn 1827: a series of ‘conversations’ with James Northcote (1746 – 1831), the artist and former pupil and biographer of Sir Joshua Reynolds. Crabb Robinson reported that there was ‘more shrewdness and originality’ in Northcote and Hazlitt himself than in Dr Johnson and James Boswell. The essays Hazlitt wrote in the last few months of his life showed no signs of decline. One, ‘The Sick Room’, describes his pleasure in reading when ‘bed-rid’: ‘If the stage shows us the masks of men and the pageant of the world, books let us into their souls and lay open to us the secrets of our own. They are the first and last, the most home-felt, the most heart-felt of our enjoyments’ (CW 17: 375 – 76).
Having been often disappointed by the course of European history, Hazlitt survived long enough to hear that the Bourbon monarchy had been overthrown. He wrote in his final essay, ‘On Personal Politics’, that should the monarchy be restored once more, liberty would live on because the hatred of oppression is ‘the unquenchable flame, the worm that dies not’ (CW 19: 334n). Hazlitt died on 18 September 1830, most likely from stomach cancer. He was 52. According to his son, his last words were, ‘Well, I’ve had a happy life’. He was buried in the churchyard of St Anne’s, Soho. The funeral was arranged by Charles Lamb, who had been with him when he died.
2. Early Philosophical Works
a. The ‘Metaphysical Discovery’
At the age of sixteen or seventeen Hazlitt made a ‘metaphysical discovery’. It would be another ten years before he could articulate this insight to his satisfaction and work out its wider implications. When the work containing these ideas, An Essay on the Principles of Human Action (CW 1), was finally published in 1805, it was largely ignored. From the late twentieth century on, it has received more attention than ever before. Hazlitt did not pursue a career as a metaphysician, but the ideas remained central to his thinking: he found ways better suited to his genius to infiltrate them into public consciousness.
The ‘metaphysical discovery’ occurred as Hazlitt was reading Baron d’Holbach’s arguments for self-love. He contemplated the possibility that we have a greater tendency to altruism than Hobbes, and most philosophers since, had allowed. Voluntary action concerns future consequences, so questions about egoism and altruism are ultimately about the individual’s relation to his future self. Psychological egoism suggests that even an apparently benevolent action has an underlying selfish motivation, and Hazlitt does not deny that this can be the case, but he wonders what accounts for it. Is the principle of self-love inherent in human nature, or is there a metaphysical case for questioning this dispiriting conclusion?
Hazlitt argues as follows. If I now regret an earlier generous action and, looking back, hold my past self accountable, I am presuming a continuity of personal identity between past and present—and with good reason, for my past self is causally connected to my present self through memory. There is some kind of ‘mechanical’ or psycho-physiological process that connects my past decisions to my present consciousness. But if I now anticipate a future benefit or injury to myself, resulting from a present decision, there can be no comparable connection, because the future has not occurred: it does not exist. Therefore, Hazlitt insists, a principle of self-interest cannot apply to my future self: at least, not one that posits an actual continuity or identity of self through time. But is there not some faculty of mind that connects me to it, and is this not as ‘personal’ to me, as exclusive, as memory and consciousness are? These faculties give me access to my past and present experience of a kind that I cannot have to anyone else’s: is not my anticipation of my future experience a directly parallel case?
Hazlitt argues that it is not. There is, currently, no future self. The faculty of mind that anticipates the future self is imagination and, yes, it allows me to anticipate my future, but only in the same way as it allows me to anticipate your future or anyone else’s. We are ‘thrown forwards’ into our futures but not in the intimate, exclusive way in which we connect through memory with our past or through consciousness with our present. The connection I have with my future self, through imagination, has the same degree of disinterestedness or impersonality as my relationship with another person’s future self. An action that might be described as motivated by self-love could equally be described as motivated by disinterested benevolence, for my future self has the metaphysical status of otherness.
This seems counterintuitive. It is true that I can anticipate another person’s pleasure or pain to some extent, but not with the same force or degree of interest as I do my own. Hazlitt knows that we do, as a matter of fact, have a bias towards our own future interests, and that this provides some sense of continuity. However, his supposition is that this bias is acquired: the selfishness that other philosophers argued was inherent is actually the result of nurture, of socialisation. The point is not that benevolence is inherent, but that humans are ‘naturally disinterested’, and therefore we could be educated to think and act differently.
For some commentators this reorientation of the argument concerning egoism and altruism is the Essay’s main point of interest, while for others it is the argument for the discontinuity of personal identity. John Locke had argued in An Essay Concerning Human Understanding (first published in 1689) that what makes someone the same person today as yesterday, or as in their distant childhood, is memory. His ‘Prince and Cobbler’ example, in which the memories of each switch bodies, was intended to show that psychological rather than physical continuity is what guarantees identity through time (Locke, 1975: 340). This was questioned by Thomas Reid in his Essays on the Intellectual Powers (1785). His ‘General and Ensign’ example suggested that personal identity could not be reduced to psychological continuity (Reid, 1983: 216 – 218). Renewed interest in the question from the 1960s onwards produced a range of fission-based thought experiments that led Derek Parfit, for example, to conclude that it is not personal identity that matters but some degree of psychological survival (Parfit, 1987: 245 – 302). In some scenarios it might not be me who persists, but someone qualitatively very much like me.
Hazlitt appears to have anticipated this distinction. He employs a multiple-fission example to show that the kind of connection we have with our future self cannot guarantee personal identity. What if the connection between past and present were non-causal: could that produce identity? If Person A’s consciousness were replicated (non-causally) in Person B, would A not feel ‘imposed upon’ by a false claim to identity? Anticipating twentieth-century examples involving multiple replicants, such as Parfit’s ‘Mars Tele-Transporter’ example, Hazlitt asks: if a Deity multiplied my self any number of times, would they all be equally myself? Where would my self-interest lie? He concludes: ‘Here then I saw an end to my speculations about absolute self-interest and personal identity’ (CW 12: 47). Hazlitt’s point is that if the concept of personal identity cannot be carried through with logical consistency to a future self which is one and the same as the present self who acts, neither can the idea of the necessity of self-interested action. In any case, although what I am now depends on what I have been, the chain of communication cannot run backwards from the future to the present. If the Deity multiplied me any number of times in the future, or destroyed me, it could not affect my present self.
My only interest in a future self comes from the psychological bias I have acquired from experience, including my upbringing, and if I have thereby acquired a sense of self and a capacity for sympathy with my future self, I have equally acquired a potential for sympathy (empathy) with others. My future self is in fact one of those others with whom I can empathize. Imagination enables me to project out of myself into the feelings of others. Moreover, ‘I could not love myself, if I were not capable of loving others’ (CW 1: 2). If this implies that I could not wish good things for my future self if I did not wish good things for the future selves of others, it prompts the question: what motivates the desire for good things to happen to anyone?
Hazlitt’s position suggests an account of child development that sees children as learning in stages to distinguish between self and others and to identify with their own current and anticipated longer-term interests. He knew that people sometimes fail to acquire a moral sense and that they can be driven by circumstances to evil, yet still, he argues, they must have, in choosing between alternative actions, some notion of good. So, although he rejects the hypothesis that we are naturally self-interested, he admits that there is something which is inherent, that ‘naturally excites desire or aversion’, and this is ‘the very idea of good or evil’. Regardless of what I think makes a future consequence a good one, ‘it cannot be indifferent to me whether I believe that any being will be made happy or miserable in consequence of my actions, whether this be myself or another’ (CW 1: 11 – 12).
It follows from this that both selfish and altruistic actions are in a sense impersonal, for it is the idea of good that motivates, rather than a rational calculation or allocation of benefits. Hazlitt may have had in mind a role for parenting and education in refining and extending the child’s understanding of good. The sensitivity of the faculty of imagination in differentiating degrees of good and evil is improvable. No doubt, in anticipating a future benefit or pleasure, imagination can stimulate an illusion of continuous identity, and the satisfaction gained from imagining one’s own future pleasure is especially forceful because one knows from experience what one’s future feelings might be; but this does not make the connection with the future self parallel with the connection with one’s past self: it is still a fiction. Imagination provides the freedom to think in a more expansive way, to project one’s love of good beyond self-interest to others one is close to, and beyond to others unknown. It is the freedom to aspire to universal benevolence.
Hazlitt’s disjunction between the self, as constituted by memory and consciousness, and the putative future self appears to have been an original observation. We have seen how Hazlitt has been said (for example, by Martin and Barresi, 1995) to anticipate Parfit on personal identity. A. C. Grayling (2000: 363 – 4) finds a parallel between Peter Strawson’s argument concerning other minds and Hazlitt’s ‘transcendental’ argument that being capable of having an interest in other people’s future is a condition for being capable of having an interest in one’s own. Just as, according to Strawson, one can ascribe states of consciousness to oneself only if one can ascribe them to others, which suggests that observable, bodily behaviours constitute logically adequate criteria for ascribing states of consciousness to others, so, according to Hazlitt, one’s relation to one’s future self has the same status as one’s relation to another person’s future self and this suggests that it must be a condition of acting benevolently towards oneself (self-interestedly) that one can act benevolently towards others.
b. Hartley and Helvétius
The Essay has a second part, Remarks on the Systems of Hartley and Helvétius. David Hartley (1705 – 1757) had presented a physiological and mechanical account of the impact of sensation on the brain. Ideas become associated through repetition, so that one sensation can cause, involuntarily, multiple ideas. For Hazlitt this form of associationism provides an insufficient account of the mind. A physiological chain of vibrations, or the ‘proximity of different impressions’, can no more produce consciousness than ‘by placing a number of persons together in a line we should produce in them an immediate consciousness and perfect knowledge of what was passing in each other’s mind’. Furthermore, the suggestion that different ideas have a definite location in the brain is simply absurd; nor can associationism account for the mingling of different experiences in one idea, as when one hears with joy the song of a thrush and imagines it coming beyond the hill from some warm wood shelter. Every beginning of a series of associations must derive from some act of the mind which does not depend on association. Association, where it does exist, is only a particular and accidental effect of some more general principle. Hartley’s account leaves no room for such voluntary mental activity as comparison of one idea with another, for abstraction, reasoning, imagination, judgment—in short, ‘nothing that is essential or honourable to the human mind would be left to it’.
Helvétius (1715 – 1771), Hazlitt’s other disputant in the second part of the Essay, had argued (in De l’esprit, 1758) for a materialist theory of mind and for self-interest as the sole motive of human action, a reduction of right and wrong to pleasure and pain: benevolent actions are an attempt to remove the uneasiness which pity creates in our own minds. Therefore, any disinterestedness hypothesis must be wrong, because only self-gratification provides the required causal mechanism. Hazlitt responds, firstly, that this is irrelevant to the issue. The relation of voluntary action to the future does not differ according to whether the principle impelling it is directed towards the self or towards others. It is no more mechanical in the former case than in the latter. Secondly, there is no reason to resolve feelings of compassion or benevolence into a principle of mechanical self-love. We are necessarily affected emotionally by our actions and their consequences: it would be ‘palpable nonsense’ to suggest that to feel for others we must in reality feel nothing. If all love were self-love, what would be the meaning of ‘self’? It must either point to a distinction in certain cases or be redundant. There must be clear limits to the meaning of the term ‘self-love’; but, in any case, purely as a matter of fact, Hazlitt thinks, it is incorrect to think that we have a mechanical disposition to seek our own good, or to think that, when we act benevolently, an accompanying pleasure sensation necessarily displaces the painful feeling occasioned by another’s distress.
The relevant distress is the other person’s, not my own; it is the relief of his or her distress that I will. To the argument that my love of others amounts to self-love because ‘the impression exciting my sympathy must exist in my mind and so be part of me’, Hazlitt responds that ‘this is using words without affixing any distinct meaning to them’. After all, any object of thought could be described as a part of ourselves: ‘the whole world is contained within us’. If any thought or feeling about or for another person is directed not to them but to me, then by the same token I might sometimes be said to be filled with self-hate: ‘For what is this man whom I think I see before me but an object existing in my mind, and therefore a part of myself?… If I am always necessarily the object of my own thoughts and actions, I must hate, love, serve, or stab myself as it happens’ (CW 1: 89 – 90).
Hazlitt concludes the Essay by affirming the common-sense view that compassion for another person’s injury is not a selfish feeling. When I am wounded, the pain is the effect of ‘physical sensibility’; when I see another person’s wound, my experience of pain is ‘an affair of imagination’. Benevolence ‘has the same necessary foundation in the human mind as the love of ourselves’ (CW 1: 91).
c. History of Philosophy
Hazlitt’s intention to write a history of English philosophy was first heard of in 1809 when an eight-page pamphlet was published advertising ‘Proposals for Publishing, in One Large Quarto… A History of English Philosophy: containing an Account of the Rise and Progress of modern Metaphysics, an Analysis of the Systems of the most celebrated Writers who have treated on the Subject, and an Examination of the principal Arguments by which they are supported. By the Author of An Essay on the Principles of Human Action, and An Abridgement of the Light of Nature Pursued’ (CW 2: 112). By 1810 Hazlitt had decided to turn the History into a series of essays, and in January 1812 these became lectures. After the lecture series had been successfully completed, the Proposals for a History of English Philosophy was republished, with a list of subscribers, but assured sales were too few to cover production costs and the book was never published. Most of the lectures were eventually published as essays in Literary Remains of the Late William Hazlitt in 1836.
The Proposals (entitled Prospectus in Howe’s Complete Works) outlines the positive claims on which Hazlitt’s critique of English philosophy would be based. These include the following: that the mind is not material; that the intellectual powers of the mind are distinct from sensation; that the power of abstraction is a necessary consequence of the limitation of the comprehending power of the mind; that reason is a source of knowledge distinct from, and above, experience; that the principle of association does not account for all our ideas, feelings and actions; that there is a principle of natural benevolence in the human mind; that the love of pleasure or happiness is not the only principle of action, but that there are others implied in the nature of man as an active and intelligent being; that moral obligation is not the strongest motive which could justify any action whatever; that the mind is not mechanical, but a rational and voluntary agent—it is free in as far as it is not the slave of external impressions, physical impulses, or blind senseless motives; and that the idea of power is inseparable from activity—we get it from the exertion of it in ourselves (CW 2: 116 – 119).
The lectures of 1812 included ‘On the Writings of Hobbes’, ‘On Locke’s Essay’, ‘On Self-Love’, and ‘On Liberty and Necessity’. In the first of these he argues that, contrary to popular opinion, Locke was not the founder of ‘the modern system of philosophy’. He sees Locke as a follower of Hobbes. Hazlitt’s argument in these essays takes forward the aim referred to in his Proposals, to oppose ‘the material, or modern, philosophy, as it has been called’, according to which ‘the mind is nothing, and external impressions everything. All thought is to be resolved into sensation, all morality into the love of pleasure, and all action into mechanical impulse’ (CW 2: 113 – 4). This theory, he writes, derives from a false interpretation of Francis Bacon’s use of the word ‘experience’, according to which the term applies to external things only and not to the mind. To apply the experimental methodology of natural philosophy to the mind is to assume an affinity based on ‘no better foundation than an unmeaning and palpable play of words’ (CW 2: 114).
In ‘On Liberty and Necessity’, Hazlitt largely agrees with Hobbes’s account of necessity as implying no more than a connection between cause and effect. Free will is not unmotivated: the motives which cause free actions originate in the mind. ‘The will may be said to be free when it has the power to obey the dictates of the understanding’ (CW 2: 255). Liberty is not an absence of obstruction or an uncertainty, it is ‘the concurrence of certain powers of an agent in the production of that event’. It is as real a thing ‘as the necessity to which it is thus opposed’ (CW 2: 258 – 9).
In the same year as the Proposals first appeared (1809), Hazlitt published A New and Improved Grammar of the English Tongue (CW 2). Hazlitt claims some originality for his theoretical and logical analysis of language. He rejects the assumption that grammatical distinctions and words of different kinds relate to different sorts of things or ideas rather than to our manner of relating to them. The same word can play many roles: what changes is the way things are reordered in relation to one another in our thoughts and discourse. A substantive, for example, is not the name of a substance or quality subsisting by itself but of something considered as subsisting by itself. It is an abstraction. Grammatical distinctions also mark changes in the orientation of the speaker to the hearer (‘the poisonous plant’ vs. ‘the plant is poisonous’). Verbs, like adjectives, express attributes and direct the hearer either to a familiar connection between things or to a new or unknown one. Verbs are not the only words that express ideas of being, doing, or suffering, but they have a certain eminence in that, without them, we cannot affirm or deny, ask for information or communicate a desire, express or understand an idea. Hazlitt appears to have grasped something of the pragmatics, in addition to the syntactical and semantic features, of language and communication.
Hazlitt acknowledged the importance and influence of John Horne Tooke’s The Diversions of Purley (1786), which had provided a general theory of language and the mind, but he disagreed with Tooke’s ideas concerning abstractions. He returned to this in ‘On Abstract Ideas’ and ‘On Tooke’s Diversions of Purley’, two of the lectures delivered in 1812. Tooke agrees with Hobbes, Hume, Berkeley, and others that there are no abstract or complex ideas. Hazlitt counters that, on the contrary, ‘we have no others’, for if all ideas were simple and individual we could not have an idea even of such things as a chair, a blade of grass, a grain of sand, each of which is a ‘certain configuration’ or assemblage of different things or qualities. Every idea of a simple object is ‘an imperfect and general notion of an aggregate’ (CW 2: 191). ‘Without the cementing power of the mind, all our ideas would be necessarily decomposed… We could indeed never carry on a chain of reasoning on any subject, for the very links of which this chain must consist, would be ground to powder’ (CW 2: 280).
Hazlitt alludes to an idea that was to stay central to his philosophical outlook: ‘The mind alone is formative, to borrow the expression of a celebrated German writer’ (CW 2: 280). How much did Hazlitt know of Immanuel Kant’s philosophy and to what extent was he an idealist? One reason for asking this is that the debate concerning politics and epistemology, brought to the fore by the empiricism of Burke’s Reflections on the Revolution in France (1790), enticed Romantic writers to explore, as Timothy Michael puts it, ‘the idea… that it is through rational activity that things like liberty and justice cease to be merely ideas’ (Michael, 2016: 1). Hazlitt was one writer who went some way towards idealism. It promised a potential alternative to Godwin’s utopianism, to Bentham’s felicific calculus, and to Burke’s arguments from experience and tradition.
d. Kant and Idealism
Hazlitt’s opinion that particulars are abstract ideas constructed by an abstract entity we call ‘the mind’ suggests that he had absorbed at least some of the ideas of idealism. He is opposed to any materialist epistemology that has no place for the active power of the mind, but he does not agree with Berkeley that there is no mind-independent world. We can, to an extent, experience an external reality but we cannot conceptualize, know, or understand it without the mind’s faculty of abstraction.
‘Abstraction,’ Hazlitt writes, ‘is a trick to supply the defect of comprehension’. This sentence occurs in the Preface to his 1807 abridgement of Abraham Tucker’s The Light of Nature Pursued. He goes on to argue that abstraction is only half of the understanding: common sense is also needed, and he sees Tucker’s ‘sound, practical, comprehensive good sense’ as the great merit of his (too-voluminous) work (CW 1: 125). There are only two sorts of philosophy: one ‘rests chiefly on the general notions and conscious perceptions of mankind, and endeavours to discover what the mind is, by looking into the mind itself; the other denies the existence of everything in the mind, of which it cannot find some rubbishly archetype, and visible image in its crucibles and furnaces, or in the distinct forms of verbal analysis’. The latter can be left to chemists and logicians, the former is ‘the only philosophy fit for men of sense’.
Hazlitt himself connects Tucker’s philosophy with Kant’s. Tucker ‘believed with professor Kant in the unity of consciousness, or “that the mind alone is formative,” that fundamental article of the transcendental creed’. It is not clear when Hazlitt first became acquainted with Kant’s philosophy. Before he had finished preparing the arguments of the Essay for publication, in 1805, he may have encountered Friedrich August Nitsch’s A General and Introductory View of Professor Kant’s Principles concerning Man, the World and the Deity, submitted to the Consideration of the Learned (1796) or John Richardson’s Principles of Critical Philosophy, selected from the works of Emmanuel Kant and expounded by J. S. Beck (1797) or the same writer’s later publications, but it is most likely that he had encountered Anthony Willich’s Elements of the Critical Philosophy (1798). We know that Coleridge possessed a copy of Willich’s Elements, and he may well have discussed Kant’s philosophy directly with Hazlitt, or with a mutual acquaintance such as Godwin or Crabb Robinson. By 1807, possibly by 1805, Hazlitt certainly knew something of Kant and appreciated him as a formidable opponent of ‘the empirical or mechanical philosophy’ and as a proponent of the doctrine of the creative and active power of the mind.
Hazlitt had definitely seen Willich’s translation by 1814. In his review of Madame de Staël’s Account of German Philosophy and Literature, he mentions Willich’s summary of the Critique of Pure Reason as including the proposition that ‘We are in possession of certain notions a priori which are absolutely independent of all experience, although the elements of experience correspond with them, and which are distinguished by necessity and strict universality’ (CW 20: 18). A footnote takes issue with this idea: ‘This, if the translation is correct… is, as it appears to me, the great stumbling block in Kant’s Philosophy. It is quite enough to shew, not that there are certain notions a priori or independent of sensation, but certain faculties independent of the senses or sensible objects, which are the intellect itself, and necessary, after the objects are given, to form ideas of them’. Having rejected Locke’s doctrine of the mind as a blank slate, Hazlitt was not keen to fill it with what he saw as innate ideas. Whether he or Willich is to blame for the misreading, it seems that, even in 1814, Hazlitt’s understanding of Kant’s philosophy was incomplete.
Did he appreciate Kant’s moral philosophy? There is, perhaps, some similarity between the categorical imperative and the role the mind plays in Hazlitt’s account of disinterested voluntary action. Certainly, for Hazlitt moral action is dissociated from a calculation of material advantage. It is not utilitarian or teleological, but neither is it exactly deontological in the sense of being based on universal rules or duties. There are parallels with Kant’s notion of a priori understanding in that moral action is conformity to a moral standard that is not derived from sensory experience, but for Hazlitt sensory experience is not all experience. Moral action is free of self-interest, but it is not free of selfhood, of the passions and habits and dispositions of the individual self. Hazlitt’s faculty of imagination lacks the purity of Kant’s idea of reason. Its recognition of good is based on experience and past preferences, so the active nature of the individual mind does not entail that reasonable choices converge.
3. Political Thought
a. Early Political Writing
One of the intellectual virtues that Hazlitt championed as a critic and essayist was openness, in the sense both of open-mindedness and of candour. This does not mean that he was flexible in his core political commitments. On the contrary, he valued a principled steadiness. True to his Dissenting roots, he was unshakeable in his commitment to civil and religious liberty and in his opposition to Toryism, the war with France, and the restoration of the Bourbon monarchy. Like many of his generation, he admired Godwin’s Enquiry Concerning Political Justice, but not uncritically, and if he espoused the politics of radical reform, he did so with a degree of skepticism concerning the modifiability of human nature. His open-mindedness and candour meant that he was prepared to criticize, and to antagonize, people who shared his commitments, and to praise those with whom he disagreed politically.
After the Essay was published, Hazlitt was quick to apply its fundamental insights to politics. In his 1806 pamphlet Free Thoughts on Public Affairs (CW 1), he denounces the ‘false patriotism’ of Tory policies, seeing it as a cover for militaristic nationalism, imperialism, and the erosion of constitutional rights. He blames the recently deceased William Pitt the Younger for diffusing ‘a spirit of passive obedience and non-resistance’ (CW 1: 112). He does not reject, as Godwin had done, the legitimacy of the state as an institution, but he insists that a radically reformed state and senate should reject selfish attachments in favour of disinterested policies that bring universal benefit. In this early statement of his political commitments, the ‘metaphysical discovery’ underpins his opposition to tyranny, capitalism, and imperialism.
In The Eloquence of the British Senate (CW 1), an anthology of political speeches (with commentary), published in 1807, Hazlitt makes clear his commitment to the British people’s right and constitutional duty ‘to resist the insidious encroachments of monarchical power’. He condemns the political corruption rampant in the parliamentary system and praises (pre-1789) Edmund Burke and Charles James Fox for their disinterested patriotism and benevolence. In the same year he composed five letters addressed to Thomas Malthus, three of which were published in William Cobbett’s Political Register. Malthus had originally written An Essay on the Principle of Population (1798) in response to Nicolas de Condorcet’s and Godwin’s optimism about the consequences of social improvement. Malthus’s argument that population growth would inevitably outstrip the potential for subsistence was proving influential even in Whig circles. Godwin’s major response would not be published until 1820; meanwhile, Hazlitt’s letters were a significant contribution to the defence of social progress and justice. Hazlitt condemns Malthus’s fatalism and advocacy of the principle of self-love. He also opposed Samuel Whitbread’s Malthusian Poor Bill, whose proposed national system of parochial schools would, in Hazlitt’s view, indoctrinate the poor and deepen their vassalage. Like Godwin, he saw any such state-sponsored system as undermining independence of thought and the principle of popular democracy.
b. Virtue and Liberty
As Hazlitt’s career as a journalist, critic, and essayist developed, he focused on the particular and the individual rather than on abstract principles, but his opposition to the unjust exercise of power was clear and consistent. His Preface to Political Essays (1819) emphasizes his commitment to autonomy, to candour, to opposing selfishness and corruption. He writes:
I am no politician, and still less can I be said to be a party-man: but I have a hatred for tyranny, and a contempt for its tools… I have no mind to have my person made a property of, nor my understanding made a dupe of. I deny that liberty and slavery are convertible terms, that right and wrong, truth and falsehood, plenty and famine, the comforts or wretchedness of a people, are matters of perfect indifference. (CW 7: 7)
Openness, integrity, and sincerity are the virtues Hazlitt opposes to the temptations of advancement through corruption or the allure of power. ‘The admiration of power in others,’ he writes, ‘is as common to man as the love of it in himself: the one makes him a tyrant, the other a slave’ (CW 7: 148). The willingness of a people to become the instruments of tyrants and oppressors allows power to claim legitimacy. It does not speak well for human nature if it can be seduced in this way. Once embedded in people’s minds, power is almost irremovable. This hatred of unjust power explains Hazlitt’s opposition to hereditary monarchy and the idea of divine right; and it explains also, and more controversially, his admiration for Napoleon, whom he came to see as the final bastion against the threat to liberty represented by European monarchies.
Though not a ‘party-man’, Hazlitt thinks like a modern politician when he concedes the need to make pragmatic and partisan concessions in the cause of liberty:
If we were engaged in a friendly contest, where integrity and fair dealing were the order of the day, our means might be as unimpeachable as our ends; but in a struggle with the passions, interests, and prejudices of men, right reason, pure intention, are hardly competent to carry through: we want another stimulus. The vices must be opposed to each other sometimes with advantage and propriety. (CW 17: 40)
Integrity sometimes permits one to speak truth to power in language that power understands.
Freedom of will and political freedom are linked in Hazlitt’s thought: the mind’s innate power, subject only to the laws of its own constitution, stands opposed to arbitrary political power, which tries to make us passive machines. Both kinds of power may be tyrannical, and we are too inclined to admire political power in others. Uttara Natarajan observes that in his conversational essays we see Hazlitt attempting to translate into practice the ideal of the innate power of the individual resisting arbitrary political power (1998: 116). Potentially the most powerful instrument in the cause of liberty, poetry is neutral, and the power of language can be put to use on either side. In a sense liberty and political power are unevenly matched, for the former is diffused and the latter concentrated, and liberty must contend also with ego, pride, and prejudice. As with the will of individuals, it is not inevitable that the will of the people will be directed to the common good, but at least it has the capacity to be so directed.
That provides at least a degree of hope; yet Hazlitt sometimes comes close to despair about the prospects for genuine change. In one of his many aphorisms, he states:
If reform were to gain the day, reform would become as vulgar as cant of any other kind. We only shew a spirit of independence and resistance to power, as long as power is against us. As soon as the cause of opposition prevails, its essence and character are gone out of it; and the most flagrant radicalism degenerates into the tamest servility. (CW 20: 333)
Nevertheless, it was important to sustain resistance, to exert freedom of the will, in order to retain whatever liberty remained.
c. The People
Like Winston Smith in George Orwell’s Nineteen Eighty-Four, Hazlitt looks to the proletariat. In essays depicting country people and townspeople, he characterizes both with frankness. A Cockney is someone who ‘sees and hears a vast number of things, and knows nothing’ (CW 12: 66). By his lack of servility, ‘Your true Cockney is your only true leveller’ (CW 12: 67). Whereas the country dweller is petty and parochial, the urban dweller benefits from his exposure to the mass of people. London is described as ‘a visible body-politic, a type and image of that great Leviathan’. The urban social experience is an emancipation from ‘petty interests and personal dependence’.
Hazlitt recognises that although differences of character, talent, and discrimination mean there is undeniable superiority in particular spheres of life, including art, poetry, and criticism, nevertheless, superiority ‘arises out of the presupposed ground of equality’ (CW 8: 208). The benefit that ordinary people gain from society, from ‘free communication and comparing of ideas’, is denied to people of rank, ‘where all is submission on one side, and condescension on the other’. He is astonished by the airs and graces some people give themselves when there is so ‘little difference… in mankind (either in body or mind)’. Individual achievement is grounded in the essential equality of the people: ‘I am proud up to the point of equality—every thing above or below that appears to me arrant impertinence or abject meanness’ (CW 20: 123). Differences are largely due to disparities of opportunity and esteem.
In two powerful essays entitled ‘What is the People?’ (CW 7), Hazlitt attacks efforts by Southey and others to associate parliamentary reform with insurrection. The Poet Laureate criticizes the maxim vox populi vox Dei—the voice of the people is the voice of God. Hazlitt’s answer to the question What is the people? is:
Millions of men, like you, with hearts beating in their bosoms, with thoughts stirring in their minds, with blood circulating in their veins, with wants and appetites, and passions and anxious cares, and busy purposes and affections for others and a respect for themselves, and a desire for happiness, and a right to freedom, and a will to be free. (CW 7: 259)
He launches into a ferocious attack on his antagonist, who would lay the mighty heart of the nation ‘bare and bleeding at the foot of despotism’, who would ‘make the throne every thing, and the people nothing’ and be himself a ‘cringing sycophant, a court favourite, a pander to Legitimacy’.
This notion, legitimacy, is nothing other than the old doctrine of Divine Right ‘new-vamped’. The purpose of the Government should be to benefit the governed; its interests should not be at variance with those of the people, which are common and equal rights, yet the Government, Hazlitt thinks, sees its interest as preserving its privileges and those of the great and powerful. The dog kennels of the great and powerful are ‘glutted with the food which would maintain the children of the poor’. The people obstruct their absolute power; therefore rulers will always try to root out ‘the germs of every popular right and liberal principle’. How can rulers such as these be expected to have sympathy with those whose loss of liberty is their gain? The wealth of the few is composed of ‘the tears, the sweat, and blood of millions’ (CW 7: 264 – 5).
If a corrupt, self-interested Government cannot be trusted to serve the people’s interest, what can? There is no better solution, Hazlitt insists, than a popular democracy: ‘Vox populi vox Dei is the rule of all good Government: for in that voice, truly collected and freely expressed… we have all the sincerity and all the wisdom of the community’. In fact, the closer we can get to a direct democracy, in which each individual’s consciousness of his or her own needs and desires is registered, the better. In the opposite extreme (hereditary despotism), the people are ‘an inert, torpid mass, without the power, scarcely with the will, to make its wants or wishes known’ (CW 7: 268).
Hazlitt does not appear to endorse Godwin’s anarchistic localism, for he thinks representation and universal suffrage are the closest to direct democracy that can be achieved, but there are Godwinian themes when he addresses, in the second part of the essay, the question ‘Where are we to find the intellect of the people?’. His answer is everywhere. Public opinion incorporates ‘all those minds that have ever devoted themselves to the love of truth and the good of mankind’ (CW 7: 269). Lord Bacon was a great man, but not because he was a lord; Burke received his pension from the King, but not his understanding or his eloquence. What have hereditary monarchs ever done for the people? What wisdom is there in the Established Church, in the slave trade, in error, inhumanity, corruption and intolerance, in Church-and-King mobs but not in petitions for parliamentary reform? According to Hazlitt,
Loyalty, patriotism, and religion, are regarded as the natural virtues and plain unerring instincts of the common people: the mixture of ignorance or prejudice is never objected to in these: it is only their love of liberty or hatred of oppression that are discovered… to be the proofs of a base and vulgar disposition. (CW 7: 271)
Vox populi is the voice of God because it is the cry raised against ‘intolerable oppression and the utmost extremity of human suffering’ (CW 7: 278). Freed from attempts to stifle it or give it a false bias, it must lead ‘to the gradual discovery and diffusion of knowledge in this as in all other departments of human knowledge’. Indoctrinated by the Church and State, the people have been denied a proper (non-catechistic) education, and therefore the opportunity to ‘take the management of our own affairs into our own hands, or to seek our fortunes in the world ourselves’. Liberty requires the people to want it more than they want power, and to recognise their oppression. He shares with Godwin the view that individual judgment will improve when people are allowed to exercise it.
To develop his attack on the hereditary principle, Hazlitt invents a fable or thought experiment. He supposes that the actor Edmund Kean takes out letters patent ‘to empower him and his heirs for ever… by the grace of God and the favour of the Prince Regent’ to take the lead in all future stage tragedies, regardless of actual talent, and despite the fact that his son is ‘a little crook-kneed, raven-voiced, disagreeable, mischievous, stupid urchin, with the faults of his father’s acting, and none of his fine qualities’ (CW 7: 274). Unless forced to attend at the point of a bayonet, the public would simply stay away. ‘Surely, if government were a thing requiring the utmost genius, wisdom, virtue, to carry it on, the office of King would never even have been dreamt of as hereditary, any more than that of poet, painter, or philosopher’ (CW 7: 274). Near idiots are supposed capable of ruling while the people are denigrated as ‘a swinish multitude’ and mocked for their lack of refinement and philosophy. When will the ‘long minority’ of the people expire? Despotic rulers, tenacious of power, should indeed fear the people’s fury, even if timely reform might prevent, or delay, a revolution.
‘What is the People?’ is Hazlitt at his most polemical. The pronouns ‘we’ and ‘our’ become prominent as he aligns himself with the people. His tendency to skepticism about enduring progress is suppressed in favour of a defiant tone that conveys the tensions in the period before the Peterloo Massacre of 1819. It is in stark contrast to the aloof voice of An Essay on the Principles of Human Action, though the militant is identical with the metaphysician.
d. The Press and Freedom of Speech
The idea of a disinterested state of mind, first developed in the Essay, grounds Hazlitt’s political thinking and, specifically, his commitment to freedom of speech and the liberty of the press. The faculties of mind, including imagination, are active and receptive, and they develop through exposure to ideas and beliefs that are encountered through conversation and, especially, through reading. We are naturally disposed to sympathize with the feelings of others, but our faculties need cultivation. We need to be challenged and to exercise judgment in the careful consideration of different points of view, and without prioritising our own interests or settled opinions.
The invention of printed books made knowledge more widely available, and the press is, potentially at least, ‘the great organ of intellectual improvement and civilisation’ (CW 13: 34). Hazlitt was, of course, aware that periodicals could equally be organs of Government propaganda. Editors, booksellers, and publishers were prosecuted under the Sedition and Libel acts more often than authors themselves. For example, in 1798 Joseph Johnson, the publisher of the Essay, and of Hazlitt’s father’s sermons, of Priestley, Wollstonecraft, Godwin and others, was tried and imprisoned for sedition. John Hunt, Hazlitt’s friend and the publisher of The Examiner and The Yellow Dwarf, was twice imprisoned. William Cobbett fled to the United States to avoid arrest. Hazlitt is often at his most splenetic (and least disinterested, perhaps) in considering the editors of the Tory press and their turncoat contributors. In the former category is William Gifford, the editor of the Quarterly Review. Hazlitt describes him as being so well qualified for the situation ‘by a happy combination of defects, natural and acquired’ that at his death ‘it will be difficult to provide him a suitable successor’ (CW 11: 114). Mercilessly denouncing Whigs and Radicals as ‘dunces and miscreants’, Gifford ‘entitles himself to the protection of the Church and State’ (CW 11: 117). People like this ‘poison the sources of public opinion’ (CW 11: 124). A puppet press manipulates public opinion, diverting it from truth, justice, and liberty. Without the opportunity to develop independent thinking, individuals cannot break free from prejudice and received opinion.
The discontinuity of personal identity is alluded to, ironically, in Hazlitt’s response to Southey’s attempt to suppress the unauthorised publication, in 1817, of his youthful dramatic poem Wat Tyler, which the Poet Laureate now regarded as seditious. The former Radical had become a strong opponent of parliamentary reform and of free speech. What could prompt such a turnaround? Hazlitt imputes it ‘rather to a defect of moral principle than to an accession of intellectual strength’ (CW 7: 180). Hazlitt had admired the Lake poets’ earlier work, but he insists on his right to criticize them in print. Integrity requires Hazlitt to speak truth, also, to those allies he feels sometimes undermine the cause: Godwin was too utopian, Percy Shelley too extreme, Robert Owen disingenuous in claiming originality. The underlying idea is still disinterestedness: a critical, candid, disinterested response to the spirit of the age, to the cacophony of its leading voices. What would be the point of independent, disinterested judgment if, from tact or pusillanimity, one preferred self-censorship to candid free speech? Freedom is the right to criticize and disagree.
4. The Essayist as Philosopher
a. Late Twentieth and Early Twenty-First Century Studies
Interest in Hazlitt, and especially in his philosophy, was largely dormant throughout Victorian times and most of the twentieth century. Herschel Baker (1962) and W. P. Albrecht (1965) both comment on the philosophy, in Baker’s case none too favourably, but it is since Roy Park’s 1971 study that interest has been sustained. Several biographies and critical studies have appeared that have attempted to do justice to Hazlitt the philosopher and political thinker, as well as to Hazlitt the critic and conversational essayist. These include biographies by A. C. Grayling (2000) and Duncan Wu (2008). Stanley Jones (1989) focuses on Hazlitt’s later life. David Bromwich’s intellectual biography Hazlitt: The Mind of a Critic (1983) is recognised as a major critical study of Hazlitt as a leading figure of Romanticism. John Kinnaird (1978) traces Hazlitt’s use of the term ‘power’ in both political and creative contexts. Tom Paulin (1998) emphasizes Hazlitt’s genius as a prose stylist and radical thinker. Stephen Burley (2014) places Hazlitt’s life and thought in the context of his Unitarian upbringing and education and focusses on his early philosophical work. Kevin Gilmartin (2015) puts politics at the centre of Hazlitt’s achievement as a critic and essayist. A major collection of essays by several authors, Metaphysical Hazlitt: Bicentenary Essays (2005), marked the bicentenary of Hazlitt’s Essay on the Principles of Human Action and explored its relevance to his other work.
These works testify to the modern interest in Hazlitt’s overall achievement. But it is worth taking a closer look at two distinctive interpretations of the philosophy, focused on different key concepts, in order to relate them to his work as a critic and essayist. Park’s Hazlitt and the Spirit of the Age (1971) focuses on what he sees as Hazlitt’s entirely original theory of abstract ideas; Uttara Natarajan’s Hazlitt and the Reach of Sense: Criticism, Morals, and the Metaphysics of Power (1998) focuses on Hazlitt’s insistence on the formative power of the mind. Both books investigate how Hazlitt’s philosophical commitments were integral to the style and content of the literary essays on which his reputation as a writer rests.
b. Abstraction and the Poetic
Roy Park argues that Hazlitt’s theory of abstraction explains the role that both painting and philosophy played in the formulation of his literary theory. He emphasizes how Hazlitt disguises the philosophy in his essays by focusing the reader’s attention on the concrete and particular. He sees partial parallels with the thought of Coleridge, Thomas Carlyle, and Matthew Arnold in the way Hazlitt represents abstraction as a threat to our experience of the poetic—which is to say, to civilized living. Literature is a response to life at its deepest level, to life experienced imaginatively rather than rationally. Contemporary philosophers, such as Bentham, Condorcet, even Godwin, represent humankind in materialistic terms. ‘Experience’ should not be restricted to material or physiological experience. The moral theories of egotism and utilitarianism, which make pleasure and pain the only criteria of right action, as well as Malthus’s theory of population, are the outcome of empirical epistemology. Park sees Hazlitt as attempting (like Thomas Reid) to harmonize the ‘material’ with the ‘intellectual’ or ‘imaginative’, and (like Francis Bacon) the abstract with the concrete, the individual with the universal, the scientific with the spiritual (1971: 20 – 21).
Hazlitt was not interested, Park writes, in converting the poetic into something other than itself. To ask What is the poetic? is an improper question, for the essence of poetry and life are lost when we attempt to explain them. It is not so much that thought is mysterious as that mystery is part of the thought. Hazlitt’s experiential response to the poetic and the existential was appreciated by Percy Shelley and John Keats: ‘Hazlitt initiated the response… but it is to Keats that we owe the classic formulation of the experiential standpoint in his characterization of negative capability as a condition in which “man is capable of being in uncertainties”’ (Park, 1971: 32).
Park sees Hazlitt’s objection to abstraction as being an objection ‘to all closed systems of thought in which the whole of human experience was interpreted in the light of the system’s initial premiss, empirical or metaphysical, with scant regard to the individuality, complexity and diversity of “the truth of things”’ (1971: 35). He quotes Hazlitt’s observation in the Lectures on English Philosophy:
They [system-makers] have in fact always a purpose… [which] takes away the tremulous sensibility to every slight and wandering impression which is necessary to complete the fine balance of the mind, and enable[s] us to follow all the infinite fluctuations of thought through their nicest distinctions. (CW 2: 269; Park, 1971: 37)
One of Hazlitt’s objections to Wordsworth’s ‘The Excursion’ is that it has ‘palpable designs upon us’.
Hazlitt’s Essay prepared the way for his rejection of abstraction by its rejection of mechanical associationism and of psychological egoism, and by its discovery of natural disinterestedness and the active power of the mind. Park thinks Hazlitt’s analysis of personal identity is not as significant as the argument against psychological egoism or the positioning of the imagination—the faculty of self-transcendence—as the moral faculty. It is the prerequisite for the openness characteristic of the existential stance. Furthermore, Park argues, it parallels Kant’s autonomously legislative will: practical reason and imagination are both essentially experiential.
But imagination’s role is not just as a mode of volitional consciousness: it is also the faculty of the poetic (1971: 49). Hazlitt connects imagination with the vital role played by sentiments in developing our habits and affections. The existential or sentimental relationship with the world around us is what Hazlitt calls ‘the poetry of life’—what makes life worth living. Imagination extends beyond volitional consciousness to include art and life in general. We are poetical animals because we love, hope, fear, hate, despair, wonder, admire. For Hazlitt, the spirit of poetry and the spirit of humanity are the same. The past becomes a fable, investing objects with value; objects become epiphanies. Childhood is important for its ‘symbolic fecundity and its subsequent symbolic richness’ (Park, 1971: 66). Poetry expresses this revelation of the significance of human life, modifying and enlivening the mystery of existence, the real and interior spirit of things. Value is quality, not, as in Bentham’s utilitarianism, quantity. The fine arts and poetry are self-authenticating; their value is never instrumental: ‘they toil not, neither do they spin’ (CW 18: 167). But that is not to say that they have no cultural implications for the individual or for society. Through literature and the arts, we are humanized: they enable us to become aware of our inter-relatedness with the rest of humanity.
Intellectual progress is not towards abstraction, as Locke and later philosophers had thought, but towards individuation. Objects of sense are complex; they are ‘seen’ with our understanding and our hearts: the more we observe, the more we see. Hazlitt, Park observes, had learned this as a painter. Detail is the essence of the poetic. Hazlitt’s critical vocabulary is full of terms derived from painting, terms related to the kind of ‘gusto’ appropriate to that branch of literature, and terms connected with particularity and individuation: ‘detail’, ‘distinction’, ‘tint’, ‘local’, ‘concrete’, ‘subtle’, and the contrasting terms such as ‘abstract’, ‘vague’, ‘universal’, ‘indefinite’, ‘theoretical’. It is out of particularity that the universal emerges. It is the precise and the vague that are in opposition, not the individual and the general. Park observes that Hazlitt’s attitude to abstraction helps us to understand his own view of his work as ‘the thoughts of a metaphysician expressed by a painter’.
Hazlitt’s use of this term ‘gusto’ epitomises his experiential view of the poetic. Derived from art criticism, it refers to the particular character of a work of art or literature: the quality which, as it were, differentiates one grain of sand from another. What the work expresses, in all its complexity, can only be expressed in the work itself. Our job is to submit to the artist’s or poet’s vision. In an essay that itself exemplifies gusto, ‘The Indian Jugglers’, Hazlitt refers to the poet’s ability to unravel ‘the mysterious web of thought and feeling’ through ‘the power of that trembling sensibility which is awake to every change and every modification of its ever-varying impressions’ (CW 8: 83).
Philosophers are too attached to their own forms of abstraction. Park quotes Ludwig Wittgenstein (1958: 18) on the ‘craving for generality’ and the tendency of philosophers to try to answer questions in the way that scientists do (Park, 1971: 210). Feeling is the most important factor in Hazlitt’s distrust of abstraction. Imagination is the power of carrying feeling into other situations, including into other people’s situations. Park uses the term ‘imaginative sincerity of feeling’ to refer to the power of imagination at work in art and in moral action. This and gusto and the distrust of abstraction are all facets, Park insists, of Hazlitt’s experiential view of the poetic: they give a unity to his criticism (1971: 169). The combination and balance between these facets serve to isolate the peculiar, original, and characteristic nature of a work in relation to the artist’s individual genius, and to discriminate kinds and degrees of poetic excellence.
Park argues that for Hazlitt hope lies in the nature of poetry itself and in the spirit of man. His disappointment with his own age is demonstrated in The Spirit of the Age (CW 11), published in 1825, a work Park describes as ‘a masterpiece of indirectness’, an ‘aggregate of well-founded particulars’ (1971: 213 – 214). It is like an historical painting of the age, starting with portraits of Bentham, Godwin, and Coleridge, and proceeding with writers and politicians that exemplify aspects of the times or who, like Coleridge, had capitulated to the spirit of the age. Abstraction is to blame for its political and aesthetic limitations: the principle of utility, for example, is characteristic.
c. Power and the Poetic
Uttara Natarajan thinks the attention given by Park to Hazlitt’s criticism of abstraction is ‘at the expense of the larger theoretical framework of his writing’ (Natarajan, 1998: 6). She calls for the recognition of Hazlitt as not only a great critic but also as a profound philosopher. Hazlitt’s criticism of art and literature, and his political and social criticism, is pervaded by the epistemology and metaphysics of the Essay, and also by its moral theory. Her main claim is that all Hazlitt’s subsequent thought follows from what she sees as the central idea of the Essay: the concept of power. Power is the independence of the mind from manipulation by the senses (or, equivalently, by external objects). The concept of power ‘is at the very core of Hazlitt’s celebration of all intellectual activity as the vindication of an innate self-directing principle with which the mind is endowed’ (Natarajan, 1998: 27).
The formative power of the mind is evidenced by the structure of language, and poetic language especially has a reach that extends beyond the mind to objective reality. In Natarajan’s view, therefore, Hazlitt’s linguistic philosophy is more important than has previously been recognised. Language is the means by which we can understand the self. Words affirm, Natarajan writes, ‘the relation between mind and nature: the moral goal, unity’ (1998: 146). Unity is always a function of the self-projecting attribute of the mind and its ability to create relations and perceive wholes. Language is not a limitation but a manifestation of the mind’s power.
Whereas ordinary language does not reveal externally ‘subsisting objects’, poetic or ‘inspired’ language is true to nature when it conveys the impression the object makes on the mind. Its role is to evoke things as they are felt by the speaker, their emotional significance. Hazlitt’s term ‘impression’, Natarajan observes, is imbued with the weight of feeling. The prime mover is not the object itself but the imagination: it is constructive, it assembles the whole. The power of human perception is embodied in the ‘excess of the imagination beyond the actual or ordinary impression of any object or feeling’ (1998: 27). Hazlitt ‘grants to the purely intellectual a degree of actuality equal to, if not greater than, the impressions of the sense’ (1998: 39). It is through the apprehension of imagination that, in Wordsworth’s words, we ‘see into the life of things’.
Natarajan reminds us that Hazlitt had been educated in Unitarianism, in which God is the power that unifies the order of creation. In Hazlitt’s work, the human intellectual faculty replaces the divine. ‘Nature’ and ‘truth’ are imagination embodied in words. Imagination brings about a process of association, whereby it projects itself into the order of nature to produce its own immaterial creation, its own unity. This active form of associationism opposes the mechanical, materialist, passive, and deterministic associationism of Hartley and Priestley. We never see the whole of an object by looking at it: we ‘read’ it with an associative power that allows it endlessly to accrue meaning. The mind is no blank slate; rather, nature is the blank that gains meaning from the mind, and grows with the mind’s growth. Looking into nature is looking into oneself, and vice-versa.
Hazlitt adopts a pluralistic view of truth: it ‘is not one but many; and an observation may be true in itself that contradicts another equally true, according to the point of view from which we contemplate the subject’ (CW 9: 228). Truth is an aggregation, a universal composed of particulars, a union of the abstract and the concrete, synthesized by the imagination. A point of view is true when it is authentic and produced, as in poetry and art, from well-founded particulars. Poetic truth is individualistic, an original insight made possible by the poet’s experience and circumstance and influenced by innate biases or predispositions. This suggests a paradox of determinism simultaneous with free volition. The agent’s dispositions influence choices, but the idea of freedom of the will refers to the choice between motives, which is an inner activity, not externally directed. So, although an agent, artist, or poet is inevitably constrained by the constitution of his or her own mind, it is precisely this that creates the individuality to which she or he must be true. Imaginative truth is an exclusive original insight: truth but not the truth. The artistic genius is compelled to communicate this truth to us, almost tyrannically, in a kind of ‘transfusion of mind’, and yet this is also a sublimation of self.
Hazlitt’s idea of the empowered mind, Natarajan observes, is where his linguistics, epistemology, and poetics converge. It provides, too, a model of the self ‘at once the origin and the product of power’ (Natarajan, 1998: 78). Self-love and benevolence are identically manifestations of innate imagination. By refuting the mechanistic and passive notion of the self, Hazlitt is able to emphasize the capacity of the mind to apprehend holistically both self and others. ‘Alterity’, Natarajan writes, ‘validates the moral nature of man… If that which is other to the self can be shown to constitute a real motive to action, then the self owns moral agency’ (1998: 121). However, there is a love of power in the mind independent of the love of good. Hazlitt emphasizes this especially in The Plain Speaker essays (CW 12), where empowered mind is shown as bigoted and exclusive. To reach a just determination, we must set aside the bias of the will, the mind’s dispositions. This reconciliation of wisdom and power is achievable because the self is the instrument for action, not the motive. Whereas ‘power’ expresses moral capacity, ‘good’ expresses moral purpose. The ‘metaphysical discovery’ shows the mind’s imaginative capacity, not its achievement, but it allows us to adjust our motives to suit this new understanding of the self as naturally disinterested and to expand the circle of our sympathy.
d. Conclusion
Natarajan and Park agree that a full appreciation of Hazlitt as a philosopher requires us to explore the philosophy in his criticism and conversational essays as well as in his earlier, more explicitly philosophical works. His role in the conversational essays is to be both an artist and a moralist, to create and to criticize, to entertain and to enlighten, and sometimes to enrage. As Natarajan points out, he constantly revisits favourite themes, qualifying, refining, contradicting, aggregating, and composing. Like Michel de Montaigne, whom he admired, Hazlitt was not afraid to turn the spotlight on himself and to explore his own contradictions. Despite its huge variety, there is a unity in his work, a continuity of interests and commitments that reaches right back into his young adulthood. His ideas and themes continue to deserve to be revisited. Interpretation and appreciation of his philosophy have flourished since Park’s and Natarajan’s books were published, and interesting approaches continue to be explored.
Charles Lamb’s review of the first volume of Table Talk provides a fine assessment of his friend’s achievement:
To an extraordinary power of original observation he adds an equal power of familiar and striking expression… In fact, he all along acts as his own interpreter, and is continually translating his thoughts out of their original metaphysical obscurity into the language of the senses and of common observation. (Lamb, 1980: 306 – 7)
5. References and Further Reading
Bibliographical note: references to Hazlitt’s works are generally to the 21-volume Complete Works, edited by P. P. Howe, and are indicated by CW + volume number.
Hazlitt’s Complete Works:
Howe, P. P. (ed.) 1930-34. The Complete Works of William Hazlitt. 21 vols. London: Dent.
This is the standard edition of Hazlitt’s writing. It contains texts of the full-length volumes, annotated. Not quite complete. It is available from the Internet Archive.
Selections, Letters and Memoirs:
Cook, Jon (ed.). 1998. William Hazlitt: Selected Writings. World’s Classics. Oxford: Oxford University Press.
A one-volume paperback selection with useful annotations and an introduction. It contains some of Hazlitt’s shorter philosophical essays and some of his aphorisms.
A collection of Hazlitt’s major metropolitan essays, with a critical introduction on his attitude to London life.
Dart, Gregory (ed.). 2008. Liber Amoris and Related Writings. Manchester: Carcanet Press.
Sets Liber Amoris, Hazlitt’s memoir of the main crisis of his life, in the context of his other writings from 1822-23, with notes and a critical introduction by the editor.
Hazlitt, William Carew. (ed.) 1867. Memoirs of William Hazlitt. 2 volumes. London: Richard Bentley.
Edited by Hazlitt’s grandson. Includes portions of his correspondence. Available from Internet Archive and Forgotten Books.
Keynes, Geoffrey (ed.). 1948. Selected Essays of William Hazlitt. London: Nonesuch Press.
Mee, Jon and Grande, James (eds.). 2021. The Spirit of Controversy and Other Essays. Oxford: Oxford University Press.
A useful, updated World’s Classics edition. The texts are drawn from the original periodical publications, rather than those subsequently prepared for book publication.
Paulin, Tom and Chandler, David (eds.). 2000. William Hazlitt: The Fight and Other Writings. London: Penguin
Substantial annotated Penguin Classics selection, with an introduction by Tom Paulin.
Sikes, Herschel Moreland, Bonner, William Hallam, and Lahey, Gerald (eds.). 1978. The Letters of William Hazlitt. New York: New York University Press.
Wu, Duncan (ed.). 1998. The Plain Speaker: The Key Essays. Oxford: Blackwell.
Paperback selection of some of Hazlitt’s best essays, plus a newly discovered essay, ‘A Half-Length’. Introduction by Tom Paulin.
Wu, Duncan (ed.). 1998. The Selected Writings of William Hazlitt. 9 vols. London: Pickering and Chatto.
A major selection in nine volumes, with updated versions of Howe’s texts and annotations. Introduction by Tom Paulin. It includes two previously unpublished essays and An Essay on the Principles of Human Action but excludes much of the philosophical writing to be found in CW 20.
Wu, Duncan (ed.). 2007. New Writings of William Hazlitt. 2 vols. Oxford: Oxford University Press.
A collection of 205 more recently discovered writings, including major essays on the poetry of Wordsworth and Coleridge and some late philosophical essays not previously recognised as Hazlitt’s.
Wu, Duncan. 2014. All That is Worth Remembering: Selected Essays of William Hazlitt. London: Notting Hill Editions.
Biographies:
Baker, Herschel 1962. William Hazlitt. Cambridge, Mass.: Harvard University Press.
Grayling, A. C. 2000. The Quarrel of the Age: The Life and Times of William Hazlitt. London: Weidenfeld and Nicholson.
A lively biography which appreciates the man and the philosophy.
Howe, P. P. 1947 (new edition). The Life of William Hazlitt. London: Hamish Hamilton.
A standard biography, which draws heavily on Crabb Robinson’s diary. Useful but dated.
Jones, Stanley. 1989. Hazlitt: A Life. From Winterslow to Frith Street. Oxford: Oxford University Press.
An important critical biography. Covers later part of Hazlitt’s life and work.
Wu, Duncan. 2008. William Hazlitt: the First Modern Man. Oxford: Oxford University Press.
An important and substantial biography by a leading Hazlitt scholar.
Critical and Historical Studies:
Albrecht, W. P. 1965. Hazlitt and the Creative Imagination. Lawrence, Kan.: University of Kansas Press.
A study of Hazlitt’s concept of imagination, his political thought and his literary judgements.
Barbalet, Jack. 2009. ‘Disinterestedness and Self-Formation: Principles of Action in William Hazlitt’. European Journal of Social Theory, 12 (2), 195 – 211.
Bromwich, David.1983. Hazlitt: The Mind of a Critic. Oxford: Oxford University Press.
A major modern study of Hazlitt’s philosophy, politics, criticism, and moral theory. Makes the case for Hazlitt as a major critic.
Bullitt, J. M. 1945. ‘Hazlitt and the Romantic Conception of the Imagination’. Philological Quarterly, 24.4, 343-61.
Burley, Stephen. 2014. Hazlitt the Dissenter: Religion, Philosophy, and Politics, 1766-1816. London: Palgrave Macmillan.
A major study with an emphasis on the Dissenting tradition’s influence on Hazlitt’s early philosophical and political writing.
Butler, Marilyn. 1981. Romantics, Rebels and Reactionaries: English Literature and its Background 1760-1830. Oxford: Oxford University Press.
A major work on the period, it includes a summary of Hazlitt’s career as a radical and characteristically English thinker. It puts the Romantic movement in its historical setting and emphasizes its contradictions.
Cook, Jon. 2023. ‘Hazlitt’s First acquaintance with Poets’. The Hazlitt Review, 16, 33 – 47.
Dart, Gregory. 2000. ‘Romantic Cockneyism: Hazlitt and the Periodical Press’. Romanticism 6.2, 143-62.
Eagleton, Terry. 1973. ‘William Hazlitt: An Empiricist Radical’. New Blackfriars, 54, 108 – 117.
Eagleton, Terry. 2009. ‘The Critic as Partisan: William Hazlitt’s Radical Imagination’. Harper’s Magazine, 318 (1907), 77 – 82.
Gilmartin, Kevin. 1996. Print Politics: The Press and Radical Opposition in Early Nineteenth Century England. Cambridge: Cambridge University Press.
Gilmartin, Kevin. 2015. William Hazlitt: Political Essayist. Oxford: Oxford University Press.
A major study which makes a case for the centrality of Hazlitt’s political thought to his achievement as an essayist.
Hunnekuhl, Philipp. 2017. ‘Hazlitt, Crabb Robinson, and Kant: 1806 and Beyond’. The Hazlitt Review, 10, 45 – 62.
Johnston, Freya. 2018. ‘Keeping to William Hazlitt’, in Thinking through Style: Non-Fiction Prose of the Long Nineteenth Century. Oxford: Oxford University Press.
Kinnaird, John. 1977. ‘Hazlitt, Keats, and the Poetics of Intersubjectivity’. Criticism, 19 (1), 1 – 16.
Kinnaird, John. 1978. William Hazlitt: Critic of Power. New York: Columbia University Press.
Takes ‘power’ to be the unifying theme of Hazlitt’s works, in both its political sense and in the sense of creative energy.
Lockridge, Laurence S. 1989. The Ethics of Romanticism. Cambridge: Cambridge University Press.
Chapter 7 is entitled ‘Hazlitt: Common Sense of a Dissenter.’
Martin, Raymond and Barresi, John. 1995. ‘Hazlitt on the Future of the Self’, Journal of the History of Ideas, 56 (3), 463 – 81.
Makes a strong claim for the originality of Hazlitt’s theory of personal identity.
Martin, Raymond and Barresi, John. 2003. ‘Self-Concern from Priestley to Hazlitt’. British Journal for the History of Philosophy,11(3), 499 – 507.
McFarland, Thomas. 1987. Romantic Cruxes: The English Essayists and the Spirit of the Age. Oxford: Clarendon Press.
Mee, Jon. 2011. Conversable Worlds: Literature, Contention, and Community 1762 – 1830. Oxford: Oxford University Press.
Michael, Timothy. 2024. ‘Hazlitt, Disinterestedness, and the Liberty of the Press’. The Review of English Studies, 75 (318), 57 – 74.
Milnes, Tim. 2000. ‘Seeing in the Dark: Hazlitt’s Immanent Idealism’. Studies in Romanticism, 39 (1), 3 – 25.
Milnes, Tim. 2003. Knowledge and Indifference in English Romantic Prose. Cambridge: Cambridge University Press.
Milnes, Tim. 2017. ‘“This Happy Nonentity”: Hazlitt, Hume, and the Essay’. The Hazlitt Review, 10, 63 – 72.
Milnes, Tim. 2019. The Testimony of Sense: Empiricism and the Essay from Hume to Hazlitt. Oxford: Oxford University Press.
Mulvihill, James. 1990. ‘Hazlitt and “First Principles”’. Studies in Romanticism, 29 (2), 241 – 255.
Natarajan, Uttara. 1996. ‘Abstracting Passion: Hazlitt’s Ideal of Power’. New Blackfriars, 77, 276 – 287.
Natarajan, Uttara. 1998. Hazlitt and the Reach of Sense: Criticism, Morals, and the Metaphysics of Power. Oxford: Clarendon Press.
A major study, focusing on the innate, independent activity of the mind. It makes a case for the importance of the philosophy for the essays and criticism.
Commemorates the bicentenary of Hazlitt’s Essay on the Principles of Human Action, with essays by important Hazlitt scholars and philosophers.
Noxon, James. 1963. ‘Hazlitt as Moral Philosopher’. Ethics, 73 (4), 279 – 283.
Park, Roy. 1971. Hazlitt and the Spirit of the Age: Abstraction and Critical Theory. Oxford: Clarendon Press.
An important study, relating Hazlitt’s literary works to his painting and to his philosophy, especially his concept of abstraction.
Paulin, Tom. 1998. The Day-Star of Liberty: William Hazlitt’s Radical Style. London: Faber.
A passionate argument for Hazlitt’s status as a prose artist and political radical.
Philp, Mark. 2020. Radical Conduct: Politics, Sociability and Equality in London 1789 – 1815. Cambridge: Cambridge University Press.
Postle, Martin. 2015. ‘“Boswell Redivivus”: Northcote, Hazlitt, and the British School’. The Hazlitt Review, 8, 5 – 20.
Rée, Jonathan. 2019. Witcraft: The Invention of Philosophy in English. London: Allen Lane.
As well as intellectual portraits of celebrated philosophers there is discussion of the philosophical work of literary authors. A section entitled ‘1801: Politics, Paradise and Personal Identity’ provides a lively narrative concerning Hazlitt’s family, influences, relationships and ideas.
Schneider, Elisabeth W. 1933. The Aesthetics of William Hazlitt: A Study of the Philosophical Basis of his Criticism. Philadelphia: University of Pennsylvania Press.
Tomalin, Marcus. 2012. Romanticism and Linguistic Theory: William Hazlitt, Language and Literature. Cambridge: Cambridge University Press.
Wakefield, James. 2021. ‘On Whether William Hazlitt Was A Philosophical Idealist (and Why It Matters’). The Hazlitt Review,14, 5 – 23.
Wellek, Rene. 1931. Immanuel Kant in England 1793 – 1838. Princeton, N.J: Princeton University Press.
Whale, John. 2000. ‘Hazlitt and the Limits of the Sympathetic Imagination’, in Imagination under Pressure, 1789-1832: Aesthetics, Politics and Utility. Cambridge: Cambridge University Press, 110-39.
Wu, Duncan. 2006. ‘Hazlitt’s Unpublished History of English Philosophy: The Larger Context’. The Library: The Transactions of the Bibliographical Society, 7(1), 25 -64.
Other Resources:
Keynes, Geoffrey. 1931. Bibliography of William Hazlitt. London: Nonesuch Press.
Lamb, Charles. 1980. Lamb as Critic. R. Park (ed.). London: Routledge and Kegan Paul.
Locke, John. 1975. An Essay Concerning Human Understanding. P. H. Nidditch (ed.). Oxford: Oxford University Press.
Martin, Raymond and Barresi, John. 2000. Naturalization of the Soul: Self and Personal Identity in the Eighteenth Century. London and New York: Routledge.
Michael, Timothy. 2016. British Romanticism and the Critique of Political Reason. Baltimore, Md: John Hopkins University Press.
Parfit, Derek. 1987. Reasons and Persons. Oxford: Oxford University Press.
Reid, Thomas.1983. Inquiry and Essays. R. E. Beanblossom and K. Lehrer (eds.). Indianapolis, Ind.: Hackett Publishing Company.
Strawson, Peter. 1964. Individuals: An Essay in Descriptive Metaphysics. London: Methuen University Paperback.
Wittgenstein, Ludwig. 1958. The Blue and Brown Books. Oxford: Basil Blackwell.
Author Information
Graham Nutbrown
Email: gn291@bath.ac.uk
University of Bath
United Kingdom
‘Probability’ is an ambiguous word. In the history of ideas it has been used in many different senses, giving rise to different concepts of probability. Associated with games of chance and gambling, death tolls and insurance policies, statistical inference and the chancy world of modern physics, probability has proved susceptible to different interpretations. These interpretations may locate probability in the objectivity of logic or in the subjectivity of a person’s beliefs and lack of knowledge, in the frequencies of observed data or in the real tendency of a system to yield an outcome. Commonly, but not always, they are considered to be interpretations of the mathematical concept of probability, which in and of itself has no empirical meaning.
The article attempts to present the different meanings of ‘probability’ and to provide an introductory topography of the conceptual landscape. Although it does not aim to provide a history of the idea, historical elements are taken into account. Since an exhaustive treatment would be impossible here, the focus is mainly on the discussion of induction and confirmation. The article is intended as a companion to the IEP entry on The Problem of Induction (Psillos and Stergiou, 2022); this explains why Hans Reichenbach’s major contribution to the interpretation of probability theory is not dealt with here.
Table of Contents
Elements of Probability Theory and its Interpretations
On Mathematical Probability
Interpretations of Probability
What is Probability?
The Classical Interpretation
Probability as a Measure of Ignorance
Probabilities as Frequencies
Are Propensities Probabilities?
Probability as the Logic of Induction
Keynes and The Logical Concept of Probability
The Principle of Indifference
Keynes on the Problem of Induction
On the Rule of Succession
Carnap’s Inductive Logic
Two Concepts of Probability
C-functions
The Continuum of Inductive Methods
Subjective Probability and Bayesianism
Probabilities as Degrees of Belief
Dutch Books
Bayesian Induction
Too Subjective?
Some Success Stories
Appendices
Lindenbaum algebra and probability in sentential logic
The Rule of Succession: a mathematical proof
The mathematics of Keynes’s account of Pure Induction
References and Further Reading
Elements of Probability Theory and its Interpretations
On Mathematical Probability
In the monograph Foundations of the Theory of Probability, first published in German in 1933, the Soviet mathematician A. N. Kolmogorov presented the definitive form of what is nowadays regarded as the standard axiomatization of mathematical probability.
The challenge of axiomatization had been set by D. Hilbert in the sixth of his famous twenty-three problems at the beginning of the twentieth century (1902):
…to treat in the same manner [as geometry], by means of axioms, those physical sciences in which mathematics plays an important part; in the first rank are the theory of probabilities and mechanics.
Kolmogorov, addressing the problem, developed a theory of probability as a mathematical discipline “from axioms in exactly the same way as Geometry and Algebra” (1933:1). In his axiomatization, probability and the other primary concepts, devoid of any empirical meaning, are defined implicitly in terms of consistent and independent axioms in a set-theoretic setting. Thus, modern mathematical probability theory grew within the branch of mathematics called measure theory.
Kolmogorov called the elementary theory of probability “that part of the theory in which we have to deal with probabilities of only a finite number of events” (ibid.). A random event is an element of an event space, the latter being formalized by the set-theoretic concept of a field, introduced by Hausdorff in Set Theory (1927). A field is a non-empty collection \(\mathcal{S}\) of subsets of a given non-empty set \(\Omega\) that has the following properties:
(a) for every pair of elements \(A, B\) of \(\mathcal{S}\), their union, \(A \cup B\), belongs to \(\mathcal{S}\);
(b) for every element \(A\) of \(\mathcal{S}\), its complement with respect to \(\Omega\), \(A^c = \Omega \setminus A\), is in \(\mathcal{S}\).
In probability theory, the set \(\Omega\) is called the sample space.
To understand the above formalization, consider the simple example of tossing a die. Let \(\Omega\) be the set of the six possible outcomes:
\(E_1, E_2, E_3, E_4, E_5, E_6\).
The collection \(\mathcal{S}\) of all \(2^{6} = 64\) subsets of \(\Omega\) satisfies conditions (a) and (b); hence \(\mathcal{S}\) is a field. The subsets of \(\Omega\) represent the different possibilities that can be realized in tossing a single die: the empty set, \(\emptyset\), is the random event that represents an impossible happening. The singletons, \(\{E_1\}, \{E_2\}, \dots, \{E_6\}\), are the elementary events, since any other random event (except \(\emptyset\)) is a disjunction of these events, expressed by taking the set-theoretic union of the respective singletons. Finally, \(\Omega = \{E_1, E_2, E_3, E_4, E_5, E_6\}\) is the event that represents the realization of any possibility.
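The construction above can be checked mechanically. The following sketch (the names are ours, purely for illustration) enumerates all \(2^6 = 64\) subsets of a six-element \(\Omega\) and verifies the field conditions (a) and (b):

```python
from itertools import combinations

# Sample space for a single die toss: the six elementary outcomes,
# represented here simply by the numbers 1..6.
omega = frozenset(range(1, 7))

# The collection S of all 2**6 = 64 subsets of omega.
S = {
    frozenset(c)
    for r in range(len(omega) + 1)
    for c in combinations(sorted(omega), r)
}

def is_field(collection, omega):
    """Check conditions (a) and (b): closure under union and complement."""
    closed_under_union = all(a | b in collection
                             for a in collection for b in collection)
    closed_under_complement = all(omega - a in collection for a in collection)
    return closed_under_union and closed_under_complement

print(len(S))              # 64
print(is_field(S, omega))  # True
```

For a finite \(\Omega\), the power set is always a field; the check is exhaustive only because the collection is small.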
A function from a field \(\mathcal{S}\) to the set of real numbers, \(\mathbb{R}\), \(p: \mathcal{S} \to \mathbb{R}\), is called a probability function on \(\mathcal{S}\), if it satisfies the following axioms:
i. \(p(A) \geq 0\), for \(A \in \mathcal{S}\);
ii. \(p(\Omega) = 1\);
iii. \(p(A \cup B) = p(A) + p(B)\), for \(A \cap B = \emptyset\).
In the simple example of tossing a die, a probability function \(p\) would assign a non-negative real number \(p(E)\) to each element \(E\) of \(\mathcal{S}\), according to axiom (i). Axiom (ii) requires that the random event which describes any possible outcome has probability 1, \(p(\Omega) = 1\). Axiom (iii), commonly called the finite additivity property, tells us how to calculate the probability value of any random event from the probability values of elementary events; for instance, \(p(\{E_1, E_3\}) = p(\{E_1\} \cup \{E_3\}) = p(\{E_1\}) + p(\{E_3\})\).
Notice that there are infinitely many admissible probability functions on the event space of the tossing of a die and that only one of them corresponds to a fair die, the one with \(p(\{E_i\}) = \frac{1}{6}\) for \(i = 1, \dots, 6\).
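On this representation, a probability function on the finite field is fixed by its values on the singletons. A minimal sketch, assuming the fair assignment \(p(\{E_i\}) = \frac{1}{6}\):

```python
from fractions import Fraction

# Fair die: each elementary event {E_i} gets probability 1/6.
elementary = {i: Fraction(1, 6) for i in range(1, 7)}

def p(event):
    """Probability of an event by finite additivity over its singletons."""
    return sum(elementary[i] for i in event)

print(p(frozenset(range(1, 7))))  # 1   (axiom ii: p(omega) = 1)
print(p(frozenset({2, 4, 6})))    # 1/2 (the event "an even number comes up")
```

Replacing the six values \(\frac{1}{6}\) by any other non-negative values summing to 1 yields another admissible probability function, which is why infinitely many exist.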
Problems concerning a countably infinite number of random events require an additional axiom and the formalization of the event space as a σ-field. A field \(\mathcal{S}\) is a σ-field if and only if it satisfies the following condition:
for every infinite sequence \(\{A_n\}_{n \in \mathbb{N}}\) of elements of \(\mathcal{S}\), the countably infinite union of these sets, \(\bigcup_{n=1}^\infty A_n\), belongs to \(\mathcal{S}\).
Every field \(\mathcal{S}\) of finite cardinality is a σ-field, since any infinite sequence in \(\mathcal{S}\) consists of a finite number of different subsets of \(\Omega\) and their union is always in \(\mathcal{S}\), according to (a). Yet this may not be the case if the field is constructed from a countably infinite set \(\Omega\). Imagine, for instance, a die with infinitely many faces, where the set \(\Omega\) of possible outcomes is:
\(E_1, E_2, E_3,\dots\)
Let the collection \(\mathcal{S}\) consist of subsets \(A\) of \(\Omega\) which are either of finite cardinality or their complement, \(A^c = \Omega \setminus A\), is of finite cardinality:
\(\mathcal{S} = \{A \subset \Omega: A \text{ is finite or } A^c \text{ is finite} \}\).
It is easy to show that \(\mathcal{S}\) is a field. Yet it is not a σ-field, since the set \(\bigcup_{n \in \mathbb{N}} \{E_{2n}\}\), the countably infinite union of the singletons \(\{E_{2n}\}\), \(n \in \mathbb{N}\), does not belong to \(\mathcal{S}\): neither this set nor its complement is of finite cardinality.
A probability function on a σ-field \(\mathcal{S}\), \(p: \mathcal{S} \to \mathbb{R}\), satisfies the following axioms:
i’. \(p(A) \geq 0\), for \(A \in \mathcal{S}\);
ii’. \(p(\Omega) = 1\);
iii’. \(p(\bigcup_{n=1}^\infty A_n) = p(A_1) + p(A_2) + \cdots = \sum_{n=1}^\infty p(A_n)\), for \(A_i \cap A_j = \emptyset\), \(i \neq j\).
It is evident that axiom (iii’), commonly called the countable additivity property of the probability function, extends finite additivity to the case of a countably infinite family of events. Originally, Kolmogorov suggested a different but equivalent axiom, the axiom of continuity (1933: 14):
iii”. For a monotone sequence of events \(\{A_n\}_{n \in \mathbb{N}}\), with \(A_n \supseteq A_{n+1}\), \(n \geq 1\) such that \(\bigcap_{n=1}^\infty A_n = \emptyset\), \(p(A_n) \to 0\) when \(n \to \infty\).
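The continuity axiom can be illustrated numerically. Suppose, as a purely illustrative assumption not drawn from the text, that the infinite-faced die is given the geometric probabilities \(p(\{E_k\}) = 2^{-k}\) (which sum to 1). For the decreasing tail events \(A_n = \{E_k : k \geq n\}\), whose intersection is empty, the geometric series gives \(p(A_n) = 2^{-(n-1)}\), which tends to 0 as axiom (iii’’) requires:

```python
from fractions import Fraction

# Illustrative assignment on the infinite die: p({E_k}) = 2**(-k), k = 1, 2, ...
# The tail events A_n = {E_k : k >= n} are monotonically decreasing and have
# empty intersection; summing the geometric series gives p(A_n) = 2**-(n-1).
def p_tail(n):
    return Fraction(1, 2 ** (n - 1))

for n in range(1, 8):
    print(p_tail(n))  # 1, 1/2, 1/4, ...: strictly decreasing toward 0
```

The exactness of `Fraction` makes the monotone decrease visible without floating-point noise.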
In what follows we will see that many interpretations of mathematical probabilities are actually interpretations of elementary probability theory, and that they face serious problems when applied to mathematical probability theory formulated in σ-fields.
A special probability function \(p(\cdot|A): \mathcal{S} \to \mathbb{R}\) can be defined on \(\mathcal{S}\), if we are given a function \(p\) on \(\mathcal{S}\) and a random event \(A \in \mathcal{S}\) such that \(p(A) \neq 0\):
\(p(B|A) = \frac{p(B \cap A)}{p(A)}\), for \(B \in \mathcal{S}\)
\(p(\cdot|A)\) determines the conditional probability \(p(B|A)\) of some event \(B \in \mathcal{S}\) given an event \(A\), while \(p(B)\) is the unconditional probability of \(B\).
The conditional probability given an event \(A \in \mathcal{S}\) of any random event \(B \in \mathcal{S}\), \(p(B|A)\), can be understood as unconditional probability of an event \(D\), \(p_A(D)\), determined by a probability function \(p_A\) on a reduced event space \(\mathcal{S}_A\) consisting of subsets of the event \(A \in \mathcal{S}\) we conditionalize on; namely, \(p_A: \mathcal{S}_A \to \mathbb{R}\), \(p_A(D) = p(B|A)\), where \(\mathcal{S}_A = \{D: D = B \cap A, \text{ for } B \in \mathcal{S}\}\).
In the tossing of a fair die example, the conditional probability of any outcome, event \(B = \{E_i\}\), \(i = 1, \dots, 6\), given that it is an even number, event \(A = \{E_2, E_4, E_6\}\), is provided by the conditional probability function \(p(\cdot|A)\), defined on the σ-field \(\mathcal{S}\).
Since the die is fair, \(p(\{E_i\}) = \frac{1}{6}\) for \(i = 1, \dots, 6\); also, \(p(B \cap A) = \frac{1}{6}\) for \(B = \{E_i\}\), \(i = 2, 4, 6\), while \(p(B \cap A) = 0\) otherwise. Using the finite additivity axiom, \(p(A) = p(\{E_2\}) + p(\{E_4\}) + p(\{E_6\}) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}\); so \(p(B|A) = \frac{1}{3}\) for \(B = \{E_2\}, \{E_4\}, \{E_6\}\), and \(p(B|A) = 0\) otherwise. Now, consider the reduced event space \(\mathcal{S}_A\) consisting of the subsets of \(\{E_2, E_4, E_6\}\), with \(p_A(D) = \frac{p(D)}{p(A)}\) for \(D \in \mathcal{S}_A\). Then \(p_A(\{E_i\}) = \frac{1/6}{1/2} = \frac{1}{3}\) for \(i = 2, 4, 6\), and \(p_A(\emptyset) = 0\), in agreement with the conditional probabilities just computed.
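The two routes to the conditional value \(\frac{1}{3}\) can be compared directly. A sketch under the same fair-die assumptions (function names are ours):

```python
from fractions import Fraction

# Fair die: p({E_i}) = 1/6, with outcomes represented by 1..6.
elementary = {i: Fraction(1, 6) for i in range(1, 7)}

def p(event):
    return sum(elementary[i] for i in event)

def p_given(B, A):
    """Conditional probability p(B|A) = p(B & A) / p(A), for p(A) != 0."""
    return p(B & A) / p(A)

A = frozenset({2, 4, 6})   # the conditioning event: "the outcome is even"

# Route 1: the ratio definition on the full event space.
route1 = p_given(frozenset({2}), A)

# Route 2: the reduced event space S_A, renormalized by p(A).
p_A = {i: elementary[i] / p(A) for i in A}
route2 = p_A[2]

print(route1, route2)  # 1/3 1/3
```

Both routes agree, as the definition of \(p_A\) in terms of \(p(\cdot|A)\) guarantees.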
Kolmogorov’s axiomatic account, the standard mathematical textbook account of probability theory, explicates the concepts of random event and event space in terms of set theory. Yet, Boole proposed
… another form under which all questions in the theory of probabilities may be viewed; and this form consists in substituting for events the propositions which assert that those events have occurred, or will occur; and viewing the element of numerical probability as having reference to the truth of those propositions, not to the occurrence of the events concerning which they make assertion. (1853:190)
This formulation of probability theory is very common in philosophical contexts, especially when discussing inductive inference. It typically concerns elementary probability theory, presented in the language of sentential logic. Elements of this account can be found in Appendix 6.a and the reader may also consult (Howson and Urbach 2006: Ch.2). Here, we present just a few propositions of elementary probability theory as formulated in this setting that will be found useful in what follows:
Probability 1 is assigned to tautologies and probability 0 to contradictions. All other sentences have probability values between 0 and 1.
The probability of the negation of a sentence \(a\) is \(1 - p(a)\).
The probability of the disjunction of two inconsistent sentences \(a, b\) is the sum of probabilities of the sentences:
\(p(a \vee b) = p(a) + p(b)\).
The conditional probability of a sentence \(a\) given the truth of a sentence \(b\), with \(p(b) \neq 0\), is: \(p(a|b) = \frac{p(a \wedge b)}{p(b)}\).
Bayes’s Theorem. The posterior probability of a hypothesis \(h\), that is, the probability of \(h\) conditional on evidence \(e\), is:
\(p(h|e) = \frac{p(e|h)p(h)}{p(e)}\), where \(p(h), p(e) > 0\). Here \(p(e|h)\) is called the likelihood of the hypothesis and expresses the probability of the evidence conditional on the hypothesis; \(p(h)\) is called the prior probability of the hypothesis; and \(p(e)\) is the probability of the evidence.
We conclude this brief introduction to mathematical probability with the following instructive application of Bayes’s theorem. A factory uses three engines \(A_1, A_2, A_3\) to produce a product. The first engine, \(A_1\), produces 1000 items, the second, \(A_2\), 2000 items and the third, \(A_3\), 3000 items, per day. Of these items, 4%, 2% and 4%, respectively, are faulty.
What is the probability that a faulty product was produced by a given engine in a day? Let \(h_i\) be the hypothesis “A product has been produced by engine \(A_i\) in a day”, for \(i = 1, 2, 3\), and \(e\): “A faulty product has been produced in a day”. Then the prior probabilities of the \(h_i\) are \(p(h_1) = \frac{1}{6}\), \(p(h_2) = \frac{1}{3}\), \(p(h_3) = \frac{1}{2}\), and the likelihoods are \(p(e|h_1) = 0.04\), \(p(e|h_2) = 0.02\), \(p(e|h_3) = 0.04\), respectively. Using the theorem of total probability (see Appendix 6.a), we can calculate \(p(e) = p(h_1) p(e|h_1) + p(h_2) p(e|h_2) + p(h_3) p(e|h_3) = \frac{1}{6} \cdot 0.04 + \frac{1}{3} \cdot 0.02 + \frac{1}{2} \cdot 0.04 = \frac{1}{30}\).
By applying Bayes’s theorem we obtain the posterior probability of each hypothesis: \(p(h_1|e) = 0.20\), \(p(h_2|e) = 0.20\), \(p(h_3|e) = 0.60\); that is, the probability that a faulty product was produced by the given engine in a day.
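The arithmetic of the factory example can be reproduced step by step. A minimal sketch using exact fractions:

```python
from fractions import Fraction

# Prior probabilities p(h_i): each engine's share of the 6000 daily items
# (1000 : 2000 : 3000).
priors = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
# Likelihoods p(e|h_i): the fault rates 4%, 2%, 4%.
likelihoods = [Fraction(4, 100), Fraction(2, 100), Fraction(4, 100)]

# Theorem of total probability: p(e) = sum_i p(h_i) * p(e|h_i).
p_e = sum(pr * lk for pr, lk in zip(priors, likelihoods))

# Bayes's theorem: p(h_i|e) = p(e|h_i) * p(h_i) / p(e).
posteriors = [pr * lk / p_e for pr, lk in zip(priors, likelihoods)]

print(p_e)  # 1/30
for post in posteriors:
    print(post)  # 1/5, 1/5, 3/5
```

The exact values \(\frac{1}{5}, \frac{1}{5}, \frac{3}{5}\) correspond to the decimals 0.20, 0.20, 0.60 in the text.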
Interpretations of Probability
Like any other part of mathematics, probability theory does not by itself have any empirical meaning, and it cannot be applied to games of chance, to the study of physical or biological systems, to risk evaluation or insurance policies and, in general, to empirical science and practical issues, unless we provide an interpretation of its axioms and theorems.
This is what Wesley Salmon (1966: 63) dubbed the philosophical problem of probability:
It is the problem of finding one or more interpretations of the probability calculus that yield a concept of probability, or several concepts of probability, which do justice to the important applications of probability in empirical science and in practical affairs. Such interpretations, whether one or several, would provide an explication of the familiar notion of probability.
Salmon suggested three criteria that it is desirable for an interpretation of probability to satisfy. The first is called admissibility, and it requires that the probability concepts satisfy the mathematical relations of the calculus of probability, i.e., the axioms of Kolmogorov. This is a minimal requirement for a concept of probability to count as an interpretation of mathematical probability, but not a trivial one, since countable additivity may be a problem for some interpretations of probability (see 2.a.i and 2.b), while in others Kolmogorov’s axioms are supposed to follow naturally from the practice of gambling (see 5.a and 5.b). The second criterion is ascertainability. It requires that there should be a method by which, in principle at least, we can ascertain the values of probabilities. If it is impossible to find out what the values of probability are, then the concept of probability is useless. Again, not all suggested interpretations satisfy this requirement; according to Salmon, Reichenbach’s frequency interpretation fails to meet it (1966: 89ff.). Finally, applicability is the third criterion: a concept of probability should be applicable, i.e., it should have practical predictive significance. The force of this criterion is manifested in everyday life and in science, as well as in the logical structure of science. The concept of scientific confirmation provides a venerable example of an application of probability theory.
Interpretations of probability theory may be classified under two general families: inductive and physical probability.
The classical, the logical and the subjective interpretations of probability are deemed inductive, while the frequency and the propensity interpretations yield physical probabilities. To illustrate the difference between inductive and physical probability, an example may be instructive (Maher, 2006). Think of a coin that you know is either two-headed or two-tailed, but you have no information about which it is.
What is the probability that it would land heads, if tossed? One possible answer would be that the probability is \(\frac{1}{2}\), since there are two possibilities and we have no evidence as to which one is going to be realized. Another answer would be that the probability is either 0, if the coin is two-tailed, or 1, if it is two-headed, but we do not know which. Maher suggests that if \(\frac{1}{2}\) strikes us as the natural answer, then we understand ‘probability’ in the sense of inductive probability, while the sense in which ‘0 or 1’ occurs as a natural answer is that of physical probability. What is the difference between the two meanings? Inductive probability is relative to the available evidence, and it does not depend on how the unknown part of the world is, i.e., on unknown facts of the matter. Thus, if in this example we come to know that the coin has a head on one side, we should revise the probability estimate in the light of the new evidence and claim that the inductive probability is now 1. On the other hand, physical probability is not relative to evidence, and it depends on facts that may be unknown. This is why the further piece of information we entertained does not alter the physical probability (it is still ‘0 or 1’).
What is Probability?
The Classical Interpretation
Pierre Simon Laplace proposed what has come to be known as the classical interpretation in his work The Analytical Theory of Probabilities (1812) and in the much shorter A Philosophical Essay on Probabilities (1814), a book based on a lecture on probabilities he delivered at the École Normale in 1795. His deterministic view of the universe, Laplacian determinism, is legendary. Not only did he believe that every aspect of the world, every event that takes place in the universe, is governed by the principle of sufficient reason, “…the evident principle that a thing cannot occur without a cause which produces it” (1814: 3), but also that “[w]e ought … to regard the present state of the universe as the effect of its anterior state and as the cause of the one which is to follow” (1814: 4). Moreover, he claimed that the universe is knowable, in principle, by a supreme intelligence which: could comprehend all the forces by which nature is animated and the respective situation of the beings who compose it—an intelligence sufficiently vast to submit these data to analysis—it would embrace in the same formula the movements of the greatest bodies of the universe and those of the lightest atom. (ibid)
However, human intelligence is weak. It cannot provide an adequate unified picture of the world and subsume the macroscopic and microscopic realm under the province of a single formula. Nor can it give the causes of all events that occur and render them predictable. Thus, ignorance emerges as an expression of human limitation. Laplace stressed that:
The curve described by a simple molecule of air or vapor is regulated in a manner just as certain as the planetary orbits; the only difference between them is that which comes from our ignorance. (1814: 6)
Due to ignorance of the true causes, he claimed, people believe in final causation, or they make chance (‘hazard’ in Laplacian terminology) an objective feature of the world. “[B]ut these imaginary causes” explains Laplace, “have gradually receded with the widening bounds of knowledge and disappear entirely before sound philosophy, which sees in them only the expression of our ignorance of the true causes.” (1814: 3)
Probability as a Measure of Ignorance
In this context, Laplace interpreted probability as a measure of our ignorance, making it dependent on the evidence one is aware of, or on the lack of such evidence:
Probability is relative, in part to this ignorance, in part to our knowledge. We know that of three or a greater number of events a single one ought to occur; but nothing induces us to believe that one of them will occur rather than the others. In this state of indecision, it is impossible for us to announce their occurrence with certainty. It is, however, probable that one of these events, chosen at will, will not occur because we see several cases equally possible which exclude its occurrence, while only a single one favors it. (1814: 6)
The measure of probability of an event is determined by considering equally probable cases that either favor or exclude its occurrence and the concept of probability is reduced to the notion of equally probable events:
The theory of chance consists in reducing all the events of the same kind to a certain number of cases equally possible, that is to say, to such as we may be equally undecided about in regard to their existence, and in determining the number of cases favorable to the event whose probability is sought. The ratio of this number to that of all the cases possible is the measure of this probability, which is thus simply a fraction whose numerator is the number of favorable cases and whose denominator is the number of all the cases possible. (1814: 6-7)
Laplace claims that the probability of an event is the ratio of the number of favorable cases to that of all possible cases. And this principle of the calculus of probability has for Laplace the status of a definition:
First Principle.—The first of these principles is the definition itself of probability, which, as has been seen, is the ratio of the number of favorable cases to that of all the cases possible. (1814: 11)
In the jargon of the mathematical theory of probability, one may consider a partition \(\{A_k\}_{k=1,\dots,n}\) of the sample space, i.e., a family of mutually exclusive subsets exhaustive of the sample space, \(A_i \cap A_j = \emptyset\) for \(i \neq j\) and \(\bigcup_{k=1}^n A_k = \Omega\), and assume equal probability for all the random events \(A_k\): \(p(A_i) = p(A_j)\), for every \(1 \leq i, j \leq n\).
Now, for every event \(E\) that is decomposable into a sub-family \(\{A_{k_l}\}_{l=1,\dots,m} \subseteq \{A_k\}_{k=1,\dots,n}\), i.e., \(E = \bigcup_{l=1}^m A_{k_l}\), the probability of \(E\) is
\(p(E) = \frac{m}{n} = \frac{\text{number of favorable cases for } E}{\text{number of possible cases}}\).
We can easily show that a function defined in this way satisfies the axioms of elementary probability theory: \(p(A) \geq 0\), for \(A \in \mathcal{S}\); \(p(\Omega) = 1\); \(p(A \cup B) = p(A) + p(B)\), for \(A \cap B = \emptyset\). Hence, Laplace’s first principle suggests an interpretation of the elementary theory that is admissible in Salmon’s sense.
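Laplace’s ratio can be realized concretely as a probability function in this sense. A minimal sketch (the function name is ours, not Laplace’s), taking the six faces of a die as the equipossible cases:

```python
from fractions import Fraction

def classical_probability(favorable, possible):
    """Laplace's first principle: favorable cases over all equipossible cases."""
    if not favorable <= possible:
        raise ValueError("favorable cases must be among the possible cases")
    return Fraction(len(favorable), len(possible))

cases = frozenset(range(1, 7))   # six equipossible faces of a fair die
even = frozenset({2, 4, 6})
odd = cases - even

print(classical_probability(even, cases))   # 1/2
print(classical_probability(cases, cases))  # 1   (axiom ii)
# Finite additivity (axiom iii) for the disjoint events `even` and `odd`:
print(classical_probability(even | odd, cases))  # 1
```

Since the value is always a ratio of two non-negative integers, the function also makes vivid the limitation discussed next: it can never yield an irrational probability.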
Countable additivity (axiom iii’), on the other hand, is not satisfied in an event space of countably infinite cardinality. To show this, consider a countably infinite partition \(\{A_k\}_{k=1}^\infty\) and assign the same probability \(c = p(A_k) \geq 0\) to every \(A_k\). Then, employing axioms i’ and ii’ along with the equal probability condition and countable additivity (axiom iii’), we are led to the following absurdity:
\(1 = p(\Omega) = p(\bigcup_{k=1}^\infty A_k) = \sum_{k=1}^\infty p(A_k) = \sum_{k=1}^\infty c\), which is \(\infty\) if \(c > 0\) and \(0\) if \(c = 0\), but in no case \(1\).
Hence, the classical interpretation is not an admissible interpretation of the mathematical theory of probability in general. It singles out only certain models of probability theory (the elementary theory) in which the cardinality of the event space is finite.
Another criticism raised against the classical interpretation (Hajek, 2019) is related to its applicability. The classical interpretation of probability allows only rational-valued probability functions, defined in terms of a ratio of integers. However, in many branches of science, theories (for instance, quantum mechanics) assign irrational probability values to events. In these cases, one cannot interpret the probability value as a ratio of the number of favorable cases to the total number of cases.
As we have already discussed, in the definition of probability, Laplace presupposes that all cases are equally probable. This fact gives rise to a well-known criticism, namely, that of the circularity of the definition of probability: if the relation of equiprobability of two events depends conceptually on what probability is, then the definition of probability is circular. To avoid this criticism, the Soviet mathematician Boris Gnedenko, a student of Kolmogorov, considered the notion of equal probability a primitive notion “which is … basic and is not subject to a formal definition.” (1978: 23)
Laplace, in several places, wrote about “equally possible” cases as if ‘possibility’ and ‘probability’ were terms that could be used interchangeably. To assume that is to commit a category mistake, as Hajek has pointed out, since possibilities do not come in degrees. Nevertheless, as we shall see in section 3.a.1, the connection between possibility and probability can be established in terms of Keynes’s principle of indifference. In the same section we will discuss the paradoxes of indifference that also undermine Laplace’s idea of probability.
Probabilities as Frequencies
The frequency interpretation of probability can be traced back to the work of R. L. Ellis and John Venn, in the middle of the nineteenth century, and it has been described as “a ‘British Empiricist’ reaction to the ‘Continental rationalism’ of Laplace” (Gillies 2000: 88). In Ellis’s article “On the Foundations of the Theory of Probability” (1842) we identify the rudiments of this interpretation:
If the probability of a given event be correctly determined, the event will, on a long run of trials, tend to recur with frequency proportional to this probability.

Venn presented his own account a few years later, in 1866, in The Logic of Chance:

we may define the probability or chance … of the event happening in that particular way as the numerical fraction which represents the proportion between the two different classes in the long run. (the quote is from the 3rd edition, 1888: 163)

The real boost for the frequency interpretation, however, came in the early twentieth century, with the advent of Logical Empiricism, from Richard von Mises, in Vienna, and Hans Reichenbach, in Berlin. The first, in his work Probability, Statistics and Truth, published in German in 1928, provides a thorough mathematical and operationalist account of probability theory as an empirical science, akin to empirical geometry and the science of mechanics. The account was presented more rigorously in von Mises’ posthumously published work, entitled Mathematical Theory of Probability and Statistics (1964). Reichenbach presented his mature views on probability in The Theory of Probability: An Inquiry into the Logical and Mathematical Foundations of the Calculus of Probability, originally published in German in 1935. In this work, Reichenbach attempted to establish a probability logic, based on the relation of probability implication, which is governed by four axioms.
Relative frequencies of sub-series of events in a larger series are interpreted as probabilities and they are shown to satisfy the axioms of probability logic. However, Reichenbach’s milestone contribution concerns the connection between probability theory and the problem of induction. In this section, we will focus, mainly, on the frequency interpretation of probability as suggested by von Mises while for Reichenbach’s views the reader may consult our IEP entry on The Problem of Induction (Psillos and Stergiou, 2022).
Von Mises claimed that the subject matter of probability theory comprises repetitive events – “same event that repeats itself again and again” – and mass phenomena – “a great number of uniform elements … [occurring] at the same time” (1928: 11).
Probability, according to von Mises, is defined in terms of a collective, a concept which “denotes a sequence of uniform events or processes which differ by certain observable attributes, say colors, numbers or anything else” (1928: 12). For example, take a plant coming from a given seed as a single instance of a collective which consists of a large number of plants coming from the given type of seed. All members of the collective differ from each other with respect to some attribute, say the color of the flower or the height of the plant. Respectively, in the case of tossing a die the collective consists of the long series of tosses and the attribute which distinguishes the instances is the number that appears on the face of the die. The mathematical representation of such finite empirical collectives is given in terms of their idealized counterpart, the infinite ordered sequences of events, which exhibit attributes that are subsets of the attribute space of the collective (which is no different from what we have called sample space).
Yet, to be an empirical collective, a sequence of events should satisfy two empirically well-confirmed laws that dictate the mathematical axioms of probability theory in the ideal case of the infinite sequences. The first law, dubbed by Keynes (1921: 336), Law of Stability of Statistical Frequencies, requires that:
the relative frequencies of certain attributes become more and more stable as the number of observations is increased. (von Mises 1928: 12)
Thus, if \(\Omega\) is the attribute space, \(A \subseteq \Omega\) is an attribute and \(m(A)\) is the number of manifestations of \(A\) in the first \(n\) members of the collective, the relative frequency, \(\frac{m(A)}{n}\), tends to a fixed number as the number \(n\) of observations increases. According to von Mises, the Law of Stability of Statistical Frequencies is confirmed by observations in all games of chance (dice, roulette, lotteries, etc.), in data from insurance companies, in biological statistics, and so on (von Mises 1928: 16-21). This empirical law gives rise to the axiom of convergence for infinite sequences of events:
for an arbitrary attribute \(A\) of a collective \(C\), \(\lim_{n \to \infty} \frac{m(A)}{n}\) exists.
This law can be traced back to the views of von Mises’s predecessors. For instance, Venn thought that probability is about “a large number or succession of objects, or, as we shall term it, series of them” (1888: 5). This series should be ‘indefinitely numerous’ and it should “combine individual irregularity with aggregate regularity” (1888: 4). All series, for Venn, initially exhibit irregularity, if one considers only their first elements, while, subsequently, a regularity may be attested. This regularity, however, can be unstable and can be destroyed in the long run, in the “ultimate stage” of the series. According to Venn, a series is of the fixed type if it preserves the uniformity, while it is of the fluctuating type if “the uniformity is found at last to fluctuate” (1888: 17). Probability is defined only for series of the fixed type; if a series is of the fluctuating type, it is not the subject of science (1888: 163). But what does it mean, in terms of relative frequencies, that a series is of the fixed type? “The one [fixed type] tends without any irregular variation towards a fixed numerical proportion in its uniformity” (ibid).
In more detail:
[a]s we keep on taking more terms of the series we shall find the proportion still fluctuating a little, but its fluctuations will grow less. The proportion, in fact, will gradually approach towards some fixed numerical value, what mathematicians term its limit. (1888: 164)
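The stabilization that Venn and von Mises describe can be illustrated with a short simulation. This is only a sketch: a pseudorandom number generator stands in for an empirical collective, and the attribute tracked is “the die shows six”:

```python
import random

random.seed(1)

# Track the relative frequency m(A)/n of the attribute "the die shows six"
# as the number n of simulated observations grows.
hits = 0
checkpoints = {}
for n in range(1, 100_001):
    hits += (random.randint(1, 6) == 6)
    if n in (10, 100, 1_000, 10_000, 100_000):
        checkpoints[n] = hits / n

for n, f in checkpoints.items():
    print(f"n = {n:>6}: m(A)/n = {f:.4f}")
# The printed ratios fluctuate at first and then settle near 1/6 ≈ 0.1667.
```

The early checkpoints exhibit Venn's "individual irregularity"; the later ones his "aggregate regularity".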
The second presupposition for a sequence to be a collective is an original contribution of von Mises. Apart from the existence of limiting relative frequencies in infinite sequences, he demanded the sequence to be random in the sense that there is no rule-governed selection of a subsequence of the original sequence that would yield a different relative frequency of the attribute in question from the one obtained in the original sequence. In von Mises’ (1957: 29) own words:
…these fixed limits are not affected by place selection. That is to say, if we calculate the relative frequency of some attribute not in the original sequence, but in a partial set, selected according to some fixed rule, then we require that the relative frequency so calculated should tend to the same limit as it does in the original set… The fulfilment of the condition … will be described as the Principle of Randomness or the Principle of Impossibility of a Gambling System.
In a more detailed account of how the subsequence is obtained by place selection, von Mises (1964: 9) explained that, in inspecting all elements of the original sequence, the decision to keep the \(n\)th element in, or to reject it from, the subsequence depends either on the ordinal number \(n\) of this element or on the attributes manifested in the \(n-1\) preceding elements. This decision does not depend on the attribute exhibited by the \(n\)th or by any subsequent element.
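Von Mises's requirement can be made concrete with a small experiment (our illustration, not his). Two admissible place selections are applied to a simulated coin sequence: keeping every third element (a rule depending only on the index \(n\)) and keeping the elements that follow a head (a rule depending only on the preceding attributes). For a random sequence, both subsequences should exhibit roughly the same relative frequency as the original:

```python
import random

random.seed(0)

# A simulated coin collective: 1 = heads, 0 = tails.
seq = [random.randint(0, 1) for _ in range(100_000)]

def freq(s):
    return sum(s) / len(s)

# Two admissible place selections: neither rule consults the attribute
# of the very element whose inclusion is being decided.
every_third = [x for i, x in enumerate(seq) if i % 3 == 0]              # index only
after_heads = [seq[i] for i in range(1, len(seq)) if seq[i - 1] == 1]   # preceding attributes only

print(round(freq(seq), 3), round(freq(every_third), 3), round(freq(after_heads), 3))
# All three relative frequencies lie close to 0.5: no gambling system here.
```

A rule that peeked at the \(n\)th attribute itself (say, "keep only the heads") would trivially change the frequency, which is exactly what the definition excludes.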
Von Mises suggested that we should understand the Principle of Impossibility of a Gambling System by analogy to the Principle of Conservation of Energy. As the energy principle is well-confirmed by empirical data about physical systems, so the principle of randomness is well-confirmed for random sequences manifested in games of chance and in data from insurance companies. Moreover, as the principle of conservation of energy prohibits the construction of a perpetual motion machine, the principle of impossibility of a gambling system prohibits the realization of a rule-governed strategy in games of chance that would yield perpetual wealth to the gambler:
We can characterize these two principles, as well as all far-reaching laws of nature, by saying that they are restrictions which we impose on the basis of our previous experience, upon our expectation of the further course of natural events. (1928: 26)
Having defined the concept of a collective that is appropriate for the theory of probability in terms of the two aforementioned laws, we may, now, define the ‘probability of an attribute \(A\) within a given collective \(C\)’ in terms of the limiting value of relative frequency of the given attribute in the collective:
\(p_C(A) = \lim_{n \to \infty} \frac{m(A)}{n}\).
Thus defined, probabilities are always conditional on a given collective. However, does this definition provide an admissible concept of probability, in compliance with Kolmogorov’s axioms?
It is straightforward that axioms (i) and (ii) are satisfied. Namely, since for every \(n \in \mathbb{N}\), \(0 \leq m(A)/n \leq 1\), it follows that \(0 \leq p_C(A) \leq 1\). And if the attribute examined consists in the entire attribute space \(\Omega\) then it will be satisfied by any member of the sequence, \(m(\Omega)/n = n/n = 1\), so, taking limits, \(p_C(\Omega) = 1\).
Regarding the axiom of finite additivity, (iii), we have that for any pair of mutually exclusive attributes, \(A, B\), the number of times that either \(A\) or \(B\) occurs is the sum of the occurrences of \(A\) and \(B\), since the two cannot occur together: \(m(A \cup B) = m(A) + m(B)\). Dividing by \(n\) and taking limits: \(p_C(A \cup B) = p_C(A) + p_C(B)\).
However, von Mises’ concept of probability does not satisfy the axiom of countable additivity (axiom iii’). To show this, consider an infinite attribute space \(\Omega = \{A_1, \dots, A_k, \dots\}\) and assume that each attribute \(A_k\) appears only once in the course of an infinite sequence of repetitions of the experiment. Then \(p_C(A_k) = \lim_{n \to \infty} \frac{1}{n} = 0\), for every \(k \in \mathbb{N}\). If the countable additivity condition were true, then \(p_C(\Omega) = p_C(A_1) + \cdots + p_C(A_k) + \cdots = 0\). However, this is absurd, since it violates the normalization condition \(p_C(\Omega) = 1\).
To provide a probability theory that satisfies all Kolmogorov axioms, von Mises further restricted the scope of a collective. In addition to the Law of Stability of Statistical Frequencies and the Principle of Randomness, in his Mathematical Theory of Probability he required a third, independent condition that a collective should satisfy (von Mises 1964: 12). Namely, that for a denumerable attribute space \(\Omega = \{A_1, \dots, A_k, \dots\}\) the limiting relative frequencies sum to unity: \(\sum_{k=1}^\infty p_C(A_k) = 1\).
To define conditional probability, we may begin with a given collective C and pick out all elements that exhibit some attribute B.
Assuming that they form a new collective \(C_B\), we calculate the limiting relative frequency of \(A\) in \(C_B\), \(p_{C_B}(A) = \lim_{n \to \infty} \frac{m(A)}{n}\), where \(m(A)\) now counts the occurrences of \(A\) among the first \(n\) members of \(C_B\). The conditional probability of \(A\) given \(B\) in the collective \(C\) is then:
\(p_C(A|B) = p_{C_B}(A)\).
In case attribute \(B\) is manifested only a finite number of times in \(C\), \(C_B\) is a set of finite cardinality; hence, it does not qualify as a collective and the conditional probability is not defined. To avoid this ill-defined case, Gillies suggested that we require that \(p_C(B) \neq 0\). Given this condition, he shows that all prerequisites for \(C_B\) to be a collective are satisfied and conditional probability can be defined (Gillies, 2000: 112).
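The construction of \(C_B\) can be sketched computationally. In the toy example below (the two-attribute setup is ours, purely for illustration), the conditional probability obtained by selecting the sub-collective coincides with the familiar ratio of counts \(m(A \cap B)/m(B)\) in the original collective:

```python
import random

random.seed(2)

# A finite surrogate for a collective; each element exhibits a color
# attribute (R, G, B) and a size attribute (S, L).
elements = [(random.choice("RGB"), random.choice("SL")) for _ in range(90_000)]

# Sub-collective C_B: the elements exhibiting attribute B = large size ("L").
c_b = [color for color, size in elements if size == "L"]

# p_C(A|B) as the relative frequency of A = red ("R") within C_B.
p_cond = c_b.count("R") / len(c_b)

# It coincides with the ratio m(A and B) / m(B) computed in C itself.
m_ab = sum(1 for c, s in elements if c == "R" and s == "L")
m_b  = sum(1 for _, s in elements if s == "L")
assert p_cond == m_ab / m_b

print(round(p_cond, 3))  # close to 1/3
```

Gillies' condition \(p_C(B) \neq 0\) shows up here as the requirement that `c_b` be non-empty (and, in the infinite case, infinite).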
Von Mises’s account of probability has been criticized as being too narrow with respect to the common use of the term ‘probability’: there are important situations in which we apply the term although we cannot define a collective. Take, for instance, von Mises’s question “Is there a probability of Germany being at some time in the future involved in a war with Liberia?” (1928: 9) Since we do not refer to repetitive or mass events, we cannot define a collective and, in the frequency interpretation, the question is meaningless, since ‘probability’ is meaningfully used only with reference to a collective. Hence, many common uses of ‘probability’ in ordinary language become illegitimate if we think in terms of the empirical science of probability as delineated by von Mises.
Some may think that this is not an objection at all: von Mises explicates probability in a way that legitimizes only some uses of the term as it occurs in ordinary language, and in this way he deals with the problem of single-case probabilities that burdens the frequency interpretation: associating probability with (limiting) relative frequency yields trivial certainty (probability equal to 1) for all unrepeated or unrepeatable events. The solution offered by von Mises is to exclude such events, by definition, from the domain of application of the concept of probability.
Of course, there are alternative ways to understand probability, not as relative frequency, that render its application to unrepeated or unrepeatable events legitimate. Take, for instance, the subjectivist account (see section 5), which considers probability a measure of degree of belief. On this conception, the question acquires meaning: it asks for the degree of belief an agent would assign to that proposition. In addition, to be on the safe side and avoid paradoxes, one may require coherence from the agent, i.e., that their degrees of belief satisfy Kolmogorov’s axioms of probability.
A criticism raised against von Mises’s account by de Finetti underlines that the theory fails to deal with the role of probability in induction and confirmation:

If an essential philosophical value is attributed to probability theory, it can only be by assigning to it the task of deepening, explaining or justifying the reasoning by induction. This is not done by von Mises… (De Finetti 1936)
In response to investigations on probability that aim to produce a theory of induction, von Mises claims that probability theory itself is an inductive science: it would be circular to try to justify inductive methodology by means of a science that applies it, let alone to use it to provide degrees of confirmation for any other branch of science:
According to the basic viewpoint of this book, the theory of probability in its application to reality is itself an inductive science; its results and formulas cannot serve to found the inductive process as such, much less to provide numerical values for the plausibility of any other branch of inductive science, say the general theory of relativity. (1928: vii)
However, it is not that the frequency interpretation, in general, does not contribute to the problem of induction. As we have examined elsewhere [IEP entry on The Problem of Induction (Psillos and Stergiou, 2022)], Reichenbach thought that the frequency interpretation of probability theory provides a new context for understanding the problem of induction.
Are Propensities Probabilities?
The propensity interpretations are a family of accounts of physical probability. They aim to provide an account of objective chance in terms of probability theory.
Originally, this interpretation was developed by Karl Popper (1959), but later David Miller, James Fetzer, Donald Gillies, and others developed their own accounts (see Gillies 2000). Paul Humphreys (1985) describes propensities as:
[I]ndeterministic dispositions possessed by systems in a particular environment, exemplified perhaps by such quite different phenomena as a radioactive atom’s propensity to decay and my neighbor’s propensity to shout at his wife on hot summer days.
The problems that led Popper to abandon the frequency interpretation of probability and to develop this new account had to do, on the one hand, with the interpretation of quantum theory and, on the other, with objective single-case probabilities.
To deal with the problem of single-case probabilities, Popper suggested that probabilities should be associated not with sequences of events but with the generating conditions of these sequences i.e., “the set of conditions whose repeated realisation produces the elements of the sequence” (1959). He claimed that “probability may … be said to be a property of the generating conditions” (ibid). This was not just an analysis of the meaning of the term ‘probability’. Popper claimed to have proposed, “a new physical hypothesis (or perhaps a metaphysical hypothesis) analogous to the hypothesis of Newtonian forces. It is the hypothesis that every experimental arrangement (and therefore every state of the system) generates physical propensities which can be tested by frequencies.” (ibid).
The propensity interpretation is supposed to avoid a number of problems faced by the frequency interpretation; for instance, it avoids the problem of inferring probabilities in the limit. But, especially in Popper’s version, it faces the problem of specifying the conditions on the basis of which propensities are calculated – the ascertainability requirement fails. Given that an event can be part of widely different conditions, its propensity will vary according to the conditions. Does it then make sense to talk about the true objective singular probability of an event?
Even if this problem is not taken seriously (after all, the advocate of propensities may well claim that propensities are the sort of thing that varies with the conditions), it has been argued on other grounds that probabilities cannot be identified with propensities. Namely, the so-called inverse probabilities, although they are mathematically well-defined, remain uninterpreted, since it does not make sense to talk about inverse propensities.

Suppose, for instance, that a factory produces red socks and blue socks and uses two machines (Red and Blue), one for each color. Suppose also that some socks are faulty and that each machine has a definite probability of producing a faulty sock; say, one out of ten socks produced by the Red machine is faulty. We can meaningfully say that the Red machine has a one-tenth propensity to produce faulty socks. But we can also ask the question: given an arbitrary faulty sock, what is the probability that it was produced by the Red machine? From a mathematical point of view, the question is well-posed and has a definite answer [for a detailed computation of probabilities in a similar example, see section 1a above]. But we cannot make sense of this answer under the propensity interpretation. We cannot meaningfully ask: what is the propensity of an arbitrary faulty sock to have been produced by the Red machine? Propensities, as dispositions, possess the asymmetry of the cause-and-effect relation that cannot be adequately expressed in terms of the symmetric conditional probabilities. Thus, there are well-defined mathematical probabilities that cannot be interpreted as propensities (see Humphreys 1985).
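For concreteness, here is how the mathematically well-defined inverse probability of the sock example comes out under Bayes' theorem. The text fixes only the Red machine's fault rate; the Blue machine's rate and the equal output shares below are assumptions made purely for illustration:

```python
from fractions import Fraction

# Hypothetical figures: the text fixes only the Red machine's fault rate;
# the Blue rate and the equal output shares are assumed for this sketch.
p_red, p_blue = Fraction(1, 2), Fraction(1, 2)   # share of production per machine
p_faulty_given_red  = Fraction(1, 10)
p_faulty_given_blue = Fraction(1, 20)

# Total probability of a faulty sock, then Bayes' theorem for the
# "inverse" probability that a faulty sock came from the Red machine.
p_faulty = p_red * p_faulty_given_red + p_blue * p_faulty_given_blue
p_red_given_faulty = p_red * p_faulty_given_red / p_faulty

print(p_red_given_faulty)  # 2/3
```

The computation is routine; the philosophical point is that its output, \(p(\text{Red}\mid\text{faulty})\), resists being read as a propensity of the sock to have been produced by the Red machine.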
Is this really a problem for the propensity interpretation? We would say ‘yes’ if a probability interpretation aspires to conform to Kolmogorov’s axioms (the admissibility requirement) and also claims to provide a complete interpretation of the probability calculus. But this condition is not universally accepted. One may suggest that probability interpretations are partial interpretations of the probability calculus, or even take the more radical position of abandoning the criterion of admissibility, as Humphreys suggested.
Probability as the Logic of Induction
Keynes and The Logical Concept of Probability
John Maynard Keynes presented his account of probability in the work titled A Treatise on Probability (1921). He attempted to provide a logical foundation for probability based on the concept of partial entailment. In deductive logic, entailment, considered semantically, expresses the validity of an inference and partial entailment is meant to be its extension to inductive logic. From a semantical point of view, partial entailment expresses a probability relation between the conclusion of an inference and its premises, i.e., that the conclusion is rendered likely true (or more likely to be true) given the truth of the premises. Here is how Keynes (1921: 52) understood this extension and its relation to probability:
Inasmuch as it is always assumed that we can sometimes judge directly that a conclusion follows from a premiss, it is no great extension of this assumption to suppose that we can sometimes recognise that a conclusion partially follows from, or stands in a relation of probability to a premiss.
And:
We are claiming, in fact, to cognise correctly a logical connection between one set of propositions which we call our evidence and which we suppose ourselves to know, and another set which we call our conclusions, and to which we attach more or less weight according to the grounds supplied by the first. It is not straining the use of words to speak of this as the relation of probability. (Keynes 1921: 5–6)
Thus, partial entailment rests on an analogy with deductive (full) entailment, and both concepts express logical relations, the former of deductive and the latter of inductive logic. Here is an example: the conjunction (p and q) deductively entails p; by analogy, it is said that, though proposition p does not (deductively) entail the conjunction (p and q), it entails it partially, since it entails one of its conjuncts (namely, p). The difference between the two kinds of entailment stems from the fact that the validity of an inference, expressed in deductive entailment, is a yes-or-no matter, while the probability relation, expressed in partial entailment, comes in degrees. Keynes (1921: 4) considered probability to be the degree of rational belief that a future occurrence of an event under specified circumstances is partially entailed by past evidence for the occurrence of similar events under similar circumstances:
Let our premises consist of any set of propositions \(h\), and our conclusion consist of any set of propositions \(a\), then, if a knowledge of \(h\) justifies a rational belief in \(a\) of degree \(\alpha\), we say that there is a probability-relation of degree \(\alpha\) between \(a\) and \(h\).
To say that the probability of a conclusion is high or low given a set of premises is not, for Keynes, a matter of subjective evaluation by the believer. It shares the objectivity of any other logical relation between propositions. That is why Keynes (1921: 4) talks about the degree of rational belief and not simply of a degree of belief:
… in the sense important to logic, probability is not subjective. It is not, that is to say, subject to human caprice. A proposition is not probable because we think it so. When once the facts are given which determine our knowledge, what is probable or improbable in these circumstances has been fixed objectively, and is independent of our opinion. The Theory of Probability is logical, therefore, because it is concerned with the degree of belief which it is rational to entertain in given conditions, and not merely with the actual beliefs of particular individuals, which may or may not be rational.
It should be noted that Keynes based his defense of the logical character of the probability relations on what he called “logical intuition”, viz., a certain capacity possessed by agents in virtue of which they can simply “see” the logical relation between the evidence and the hypothesis. It is in virtue of this shared intuition that different agents can have the same rational degree of belief in a certain hypothesis in light of certain evidence. This view was immediately challenged by Frank Ramsey, who, referring to Keynes’s “logical relations” between statements, noted: “I do not perceive them and if I am to be persuaded that they exist it must be by argument” (1926: 63).
It should be clear that for Keynes probability is not always quantitative. He believed that qualitative probabilities are meaningful as well and that the totality of probabilities, or of degrees of rational belief, may include both numerical and non-numerical elements. In the usual case of numerical probabilities, all probabilities lie within the unit interval and they are all comparable in terms of the relation ‘being greater than or equal to’ as defined on the real numbers. This relation induces a total ordering on the unit interval, which thus acquires the structure of a totally ordered set. Since for Keynes probabilities may not be numerical, a different interpretation of the relation ‘being more probable than or equally probable to’, expressing the comparability of probabilities, is required. In the class of probabilities, Keynes defines a relation of ‘betweenness’:
\(A\) is between \(B\) and \(C\), \((A, B, C)\)
where, for any three probabilities \(A, B, C\) the relation, if satisfied, is satisfied by a unique ordered triple \((A, B, C)\). He identifies two distinguished probabilities, impossibility, \(O\), and certainty, \(I\), between which all other probabilities lie. Finally, he used the relation of betweenness to compare probabilities:
If \(A\) is between \(O\) and \(B\), the probability \(B\) is said to be greater than the probability \(A\).
To illustrate these relations among probabilities, Keynes suggested the following diagram. In this diagram, all probabilities comparable in terms of the ‘greater than’ relation are connected with a continuous path:
In Keynes’s (1921: 39) words:
\(O\) represents impossibility, \(I\) certainty, and \(A\) a numerically measurable probability intermediate between \(O\) and \(I\); \(U, V, W, X, Y, Z\) are non-numerical probabilities, of which, however, \(V\) is less than the numerical probability \(A\), and is also less than \(W, X\), and \(Y\). \(X\) and \(Y\) are both greater than \(W\), and greater than \(V\), but are not comparable with one another, or with \(A\). \(V\) and \(Z\) are both less than \(W, X\), and \(Y\), but are not comparable with one another; \(U\) is not quantitatively comparable with any of the probabilities \(V, W, X, Y, Z\).

Probabilities which are numerically comparable will all belong to one series, and the path of this series, which we may call the numerical path or strand, will be represented by \(OAI\).
The Principle of Indifference
To have numerical probabilities between alternative cases, Keynes (1921: 41) believed that equiprobability of the alternatives is required:

In order that numerical measurement may be possible, we must be given a number of equally probable alternatives.

And:

It has always been agreed that a numerical measure can actually be obtained in those cases only in which a reduction to a set of exclusive and exhaustive equiprobable alternatives is practicable. (1921: 65)
In the terminology of the mathematical theory of probability, Keynes stipulates that a real number \(p(E|H)\) denotes the numerical probability of an event \(E\) given the truth of some hypotheses \(H\), assigned by a function \(p\) satisfying Kolmogorov’s axioms, only if \(p(E|H)\) can be deduced from, or reduced to, some initial numerical probabilities \(p(A_k|H)\) assigned to the members of a partition \(\{A_k\}_{k=1\dots n}\) of the sample space that satisfy the equiprobability condition:
\(p(A_k|H) = p(A_j|H)\), \(k, j = 1, \dots, n\).
What is the basis of equiprobability and how can it be justified?
Keynes (1921: 45) suggested that the justification of equiprobability follows from the Principle of Indifference which states that:
if there is no known reason for predicating of our subject one rather than another of several alternatives, then relatively to such knowledge the assertions of each of these alternatives have an equal probability. Thus, equal probabilities must be assigned to each of several arguments, if there is an absence of positive ground for assigning unequal ones.
The term ‘Principle of Indifference’ was coined by Keynes in the Treatise on Probability. According to Ian Hacking (1971), this principle can be traced back to Leibniz’s paper “De incerti aestimatione” (1678). In this paper, Leibniz, anticipating Laplace, claimed that:
Probability is the degree of possibility. Hope is the probability of having. Fear is the probability of losing.
Leibniz considered the above claim an axiom, something very similar to the Principle of Indifference:

Axiom. If players do similar things in such a way that no distinction can be drawn between them, with the sole exception of the outcome, there is the same proportion of hope to fear.
Moreover, he suggested that we understand this axiom as having its source in metaphysics, which seems to be an allusion to the Principle of Sufficient Reason and, in particular, to the claim that God does, or creates, nothing without a sufficient reason. Applying this metaphysical principle to the expectations of rational agents, i.e., ‘players’, we get the foregoing axiom, as Hacking suggested (1975: 126):

If several players engage in the same contest in such a way that no difference can be ascribed to them (except insofar as they win or lose) then each player has exactly the same ground for ‘fear or hope’.
Keynes, however, traces the principle of indifference to Jacques (James) Bernoulli’s Principle of Non-Sufficient Reason (1921: 41). Bernoulli, in his Ars Conjectandi, attempted to calculate the “degree of certainty, or probability, that the argument generates” (notice that by ‘argument’ he meant a piece of evidence), and he assumed that “all cases are equally possible, or can happen with equal ease.” There are examples, however, in which one case happens more ‘easily’ than the others. Then, according to Bernoulli (1713: 219), we need to make a correction:
For any case that happens more easily than the others as many more cases must be counted as it more easily happens. For example, in place of a case three times as easy I count three cases each of which may happen as easily as the rest.
Thus, Bernoulli suggested that to save equiprobability we should consider a finer partition of the sample space by subdividing the ill-behaved case into distinct cases.
Keynes was aware that the principle faces a number of difficulties which take the form of a paradox: it predicted contradictory evaluations of probabilities in specific cases. To resolve these paradoxes and avoid ill cases, he attempted to provide restrictions to the application of the principle of indifference.
The first paradox is known as the Book Paradox. Consider a book of unknown cover color. We have no reason to believe that its color is red rather than not red. Hence, by the principle of indifference, the probability of its being red is \(1/2\). In a similar vein, the probabilities of its being green, yellow or blue are all \(1/2\), which contradicts the theorem of probability that the sum of the probabilities of mutually exclusive events is at most 1.
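The arithmetic behind the paradox can be checked directly (a minimal sketch; the particular list of colours is just an illustration):

```python
from fractions import Fraction as F

# Book Paradox: indifference over 'red vs. not-red', 'green vs. not-green',
# etc. assigns each colour probability 1/2, but the colours are mutually
# exclusive, so their probabilities may sum to at most 1.
colours = ['red', 'green', 'yellow', 'blue']
p = {c: F(1, 2) for c in colours}  # each 1/2 by the principle of indifference

total = sum(p.values())
print(total)  # 2 -- exceeds 1, violating additivity
```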
The second paradox is the Specific Volume Paradox. Consider the specific volume \(v\) of a given liquid and assume that \(1 \leq v \leq 3\) in some system of
units. Given that there is no reason to assume that \(1 \leq v \leq 2\), rather than \(2 \leq v \leq 3\), by the principle of indifference it is equally likely for the specific volume to lie in each one of these intervals. Next,
consider the specific density \(d = 1/v\). Given our original assumption, we are justified to infer that \(1/3 \leq d \leq 1\). Similarly, the principle of indifference maintains that it is equally likely for the specific density to have a value, \(1/3 \leq d \leq 2/3\),
or to have a value, \(2/3 \leq d \leq 1\). Turning now to considerations about specific volume, we find that it is equally likely that \(1 \leq v \leq 3/2\) or \(3/2 \leq v \leq 3\). But we have already shown that it is as likely for \(v\) to lie between 1 and 2 as between 2 and 3.
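The conflict between the two applications of the principle can be made concrete with a few lines of arithmetic (a minimal sketch of the interval mapping just described):

```python
from fractions import Fraction as F

# Specific Volume Paradox: indifference applied to the volume v on [1, 3]
# and to the density d = 1/v on [1/3, 1] splits the v-range at different
# points, so the two probability assignments conflict.
v_lo, v_hi = F(1), F(3)
v_split = (v_lo + v_hi) / 2       # indifference over v splits at 2

d_lo, d_hi = 1 / v_hi, 1 / v_lo   # d = 1/v maps [1, 3] onto [1/3, 1]
d_split = (d_lo + d_hi) / 2       # indifference over d splits at 2/3...
v_split_via_d = 1 / d_split       # ...which corresponds to v = 3/2

print(v_split, v_split_via_d)     # 2 3/2 -- the two partitions disagree
```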
The third paradox that seems to challenge the principle of indifference is Bertrand’s paradox. Bertrand in his Calcul des Probabilités (1888) argues that the principle of indifference can be applied in more than one way in cases with infinitely many possibilities giving rise to contradictory outcomes regarding the evaluation of probabilities. In support of his argument he presented, among other examples, his famous paradox: We trace at random a chord in a circle. What is the probability that it would be longer than the side of the inscribed equilateral triangle? Here are some different ways to apply the principle of indifference to solve the problem, each leading to different probability values. The first solution assumes that one end of the requested chord is at a vertex of the triangle and the other lies on the circumference.
The circumference is divided in three equal arcs by the vertices of the triangle. From all possible chords traced from the given vertex, only those that lie in the arc which subtends the angle at that vertex are longer than the side of the equilateral triangle.
Therefore, the probability is \(1/3\). For the second solution, we assume that the chord is
parallel to a side of the triangle. Of these parallel chords, only the ones whose distance from the centre is less than one-half of the circle’s radius have a length greater than the side of the inscribed equilateral triangle. Thus, the requested probability is \(1/2\). Finally,
we obtain a third solution by assuming that the chord is determined by its midpoint. Then a chord is longer than the side of the triangle if its midpoint falls within the concentric circle of half the original radius, so the requested probability is the ratio of the two areas, \(1/4\). It has been argued that Bertrand’s Paradox can undermine the principle of indifference if and only if the problem at hand is a determinate problem with no unique solution. But there is no agreement on that!
Many believe that the problem is ambiguous or underspecified and, in this sense indeterminate. They claim that once we select the set of chords from which we draw one at random, the problem has a unique solution by applying the principle of indifference. [For an interesting discussion, see Shackel, 2007].
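The three chord-selection methods, and their divergent answers, can be checked by simulation (a sketch under one standard reading of each method; the sampling routines below are our way of making them precise):

```python
import math
import random

random.seed(0)
N = 100_000
R = 1.0                      # unit circle
SIDE = math.sqrt(3) * R      # side of the inscribed equilateral triangle

def long_fraction(sample_chord_length):
    """Fraction of sampled chords longer than the triangle's side."""
    return sum(sample_chord_length() > SIDE for _ in range(N)) / N

# Method 1: both endpoints uniform on the circumference.
def endpoints_method():
    a, b = random.uniform(0, 2 * math.pi), random.uniform(0, 2 * math.pi)
    return 2 * R * abs(math.sin((a - b) / 2))

# Method 2: the chord's distance from the centre uniform on [0, R].
def radial_method():
    d = random.uniform(0, R)
    return 2 * math.sqrt(R * R - d * d)

# Method 3: the chord's midpoint uniform in the disc (rejection sampling).
def midpoint_method():
    while True:
        x, y = random.uniform(-R, R), random.uniform(-R, R)
        if x * x + y * y <= R * R:
            return 2 * math.sqrt(R * R - x * x - y * y)

# The three estimates approach 1/3, 1/2 and 1/4 respectively.
print(long_fraction(endpoints_method))
print(long_fraction(radial_method))
print(long_fraction(midpoint_method))
```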
To address the Book and the Specific Volume Paradoxes, Keynes suggested that we should place a restriction on the application of the Principle of Indifference. We should require that, given our state of knowledge, the partition of the sample space, i.e., the number of alternative cases, is finite, and each alternative cannot be split further into a pair of mutually exclusive sub-alternatives each of which has a non-zero probability of occurring (see 1921: 60). Now it is obvious that the class of books with a non-red cover can be further subdivided into the class of books with a blue cover and those with a non-blue cover, and so on; thus the adequacy condition for the application of the principle is not satisfied. Similarly, in the case of the ranges of values of the specific volume and the specific density, the principle does not apply since there is no range of values which does not contain within itself two similar ranges. Finally, for Bertrand’s paradox, since areas, arcs and segments can be subdivided into non-overlapping parts without limit, the principle of indifference is not applicable (see 1921: 62). Yet, for the geometric example, Keynes suggested a solution. Instead of considering as an alternative a point in a continuous line, we may divide that line into a finite number \(m\) of segments, no matter how small, and take as the alternative the segment in which the point under consideration lies. Then we can apply the principle of indifference to the \(m\) alternatives, which we consider indivisible.
However, Keynes’s solution is not at all clear. The number \(m\) can be as great as one desires, on the condition that it remains finite. Who, then, decides the number of alternatives to which the principle of indifference is applied? If, on the other hand, we allow \(m\) to increase indefinitely, then we get the continuous case we sought to avoid (see Childers 2013: 126).
Keynes on the Problem of Induction
For Keynes, probability is the part of logic that deals with rational but inconclusive arguments; and since inductive reasoning is both inconclusive and rational, induction becomes inductive logic. The key question, of course, is the following: on what grounds is one justified in believing that induction is rational?
According to Keynes, though Hume’s skeptical claims are usually associated with causation, the real object of his attack is induction, i.e., the inference from past particulars to future generalizations (see 1921: 312).
Keynes’s argument is the following:
A constant conjunction between two events has been observed in the past. This is a fact. Hume does not challenge this at all.
What Hume challenges is whether we are justified to infer from a past constant conjunction between two events that it will also hold in
the future.
This kind of inference is called inductive.
So, Hume is concerned with the problem of induction.
To see Keynes’s reaction to the problem of induction, let’s first clarify what is for him an inductive argument: (1921: 251)
It will be useful to call arguments inductive which depend in any way on the methods of Analogy and Pure Induction.
Arguments from analogy are based on similarities among the objects of a collection, on their likeness, while Pure Induction is induction by enumeration. As Keynes (ibid) put it:
[w]e argue from … Pure Induction when we trust the number of the experiments.
Keynes criticized Hume for not taking into account the analogical dimension of an inductive argument, treating the observed instances which serve as premises as absolutely uniform (see 1921: 252). Instead, Keynes suggested that the basis of Pure Induction is the likeness of instances in certain respects (positive analogies) and their dissimilarity in others (negative analogies). Only after having verified such a likeness can we single out some features and predict the occurrence of other features or infer a generalization of the sort “all A is B”. Hence (1921: 253):
In an inductive argument, therefore, we start with a number of instances similar in some respects AB, dissimilar in others C. We pick out one or more respects A in which the instances are similar, and argue that some of the other respects B in which they are also similar are likely to be associated with the characteristics A in other unexamined cases.
So, assume that a finite number, \(n\), of instances exhibits a certain group of qualities, \(a_1, \dots, a_r\), and single out two subgroups: \(a_1, a_2, a_3\) and \(a_{r-1}, a_r\). An inductive argument, for Keynes, would conclude that in every instance of \(a_1, a_2, a_3\), the qualities \(a_{r-1}, a_r\) are also exhibited; or that \(a_{r-1}, a_r\) are “bound up” with the qualities \(a_1, a_2, a_3\) (1921: 290). This account of induction presupposes, claims Keynes (ibid), that qualities in objects are exhibited in groups and “a sub-class of each group [is] an infallible symptom of the coexistence of certain other members of it also.”
However, the world may not cooperate in the success of an inductive argument.
Keynes identifies three “open possibilities” that would compromise inductive generalization:
Some quality \(a_{r-1}\) or \(a_r\), may be independent of all other qualities of the instances, i.e., there are no groups of qualities that contain the said quality and at least some of the others.
There are no groups to which both \(a_1, a_2,\)
\(a_3\) and \(a_{r-1}, a_r\) belong.
\(a_1, a_2, a_3\) belong to groups that include \(a_{r-1}, a_r\) and to other groups that do not include them.
In any of the three cases, “All \(a_1, a_2,\)
\(a_3\)’ are \(a_{r-1}, a_r\)” fails. Hence induction fails.
Keynes (1921: 291) suggested an assumption of probabilistic nature that would save us from such ‘pathological’ cases and would lead to a successful induction; namely:
If we find two sets of qualities in coexistence there is a finite probability that they belong to the same group, and a finite probability also that the first set specifies this group uniquely.
If we grant this assumption, then inductive methodology aims to increase the prior probability and make it large, in the light of new
evidence. But to this point we will return later.
Keynes discusses the justificatory ground of this assumption and shows that it requires an a priori commitment to the claim that
qualitative variety in nature is limited. Although the individuals do differ qualitatively, “their characteristics, however numerous, cohere together in groups of invariable connection, which are finite in number”
(1921: 285).
This idea is incorporated in the Principle of Limited Variety of a finite system (PLV), which Keynes (1921: 286) stated thus:
the amount of variety in the universe is limited in such a way that
there is no one object so complex that its qualities fall into an infinite number of independent groups (i.e. groups which might exist independently as well as in conjunction); or rather that none of the objects about which we generalise are as complex as this; or at least that, though some objects may be infinitely complex, we sometimes have a finite probability that an object about which we seek to generalise is not infinitely complex.
The gist behind the role of PLV is this. Suppose that although a group of properties, say \(A\), has been invariably associated with a group
of properties, \(B\), in the past, there is an unlimited variety of groups of properties, \(B_1, \dots, B_n\), such that it is logically possible that future occurrences of A will be accompanied by any of the \(B_i\)’s, instead of \(B\). Then, and if we
let \(n\) (the variety index) tend to infinity, we cannot even start to say how likely it is that \(B\) will occur given \(A\), and the past association of
\(A\)s with \(B\)s. PLV excludes the possibility just envisaged.
As PLV stipulates, there are no infinitely complex objects;
alternatively, the qualities of an object cannot fall into an infinite number of independent groups. For Keynes, the qualities of an object are determined by a finite number of primitive qualities; the latter (and their possible combinations) can generate all apparent qualities of an object. Since the number of primitive qualities is finite, the number of groups they generate alone or by being combined is finite. Hence, for any two sets of apparent properties, Keynes (1921: 292) concludes, there is, “in the absence of evidence to the contrary, a finite probability
that the second set will belong to the group specified by the first set.”
In any case, Keynes takes it that a generalization of the form ‘All \(A\)s are \(B\)s’ should be read as ‘It is probable that any given \(A\) is \(B\)’ rather than as ‘It is probable that all \(A\)s are \(B\)s’. So, the issue is the next instance of the observed regularity and not whether it holds generally (1921: 287-288).
The absolute assertion of the finiteness of a system under consideration as expressed by the Principle of Limited Variety is called Inductive Hypothesis (IH) (1921: 299), and provides one of the premises of an inductive argument; namely, that the a priori
probability of our conclusion, \(p(C|IH)\), has a finite value. Keynes distinguished (IH) from Inductive Method (IM) which amounts to the process of increasing the a priori probability of the conclusion, \(p(C|IH)\), by taking into account the evidence \(e\):
\(p(C|e\&IH) > p(C|IH)\).
[For the mathematics of Keynes’s account of inductive method and the emergence of the need for the inductive hypothesis in order that new
evidence strengthen our belief in the truth of the conclusion of an inductive argument, the reader may consult Appendix 6.c]
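The inequality can be illustrated with a toy Bayesian computation (the numbers are invented for illustration; all that matters is that the prior \(p(C|IH)\) is finite and that the evidence is more likely if the generalization holds):

```python
from fractions import Fraction as F

# Toy illustration of p(C | e & IH) > p(C | IH).
prior    = F(1, 10)  # p(C | IH): finite, as the inductive hypothesis requires
like_C   = F(1, 1)   # p(e | C & IH): the generalization entails the instance
like_not = F(1, 2)   # p(e | ~C & IH): assumed to be less than 1

# Bayes' theorem: p(C | e & IH) = p(e | C & IH) p(C | IH) / p(e | IH).
evidence  = like_C * prior + like_not * (1 - prior)
posterior = like_C * prior / evidence

print(posterior, posterior > prior)  # 2/11 True -- the evidence raises p(C)
```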
Significantly, Keynes adds that the Inductive Method may be used to strengthen the Inductive Hypothesis itself. Since \(IH\) is a hypothesis and
since \(IM\) is indifferent to the content/status of the hypothesis it applies to, it can be applied to \(IH\) itself. In other words, \(IM\) brings some evidence to bear on the truth of \(IH\). What Keynes suggests is this:
\(p(IH|e’\&IH’) > p(IH|IH’)\),
where \(IH’\) is another general hypothesis, “more primitive and less far-reaching” than
\(IH\) such that \(p(IH|IH’)\) has a finite value, and \(e’\) other evidence. The argument is non-circular since the justification of the inductive hypothesis is not accomplished by the hypothesis itself but in terms of some other, more fundamental hypothesis, by means of the inductive method. Of course, the account runs the risk of exchanging circularity for infinite regress unless there exists some primitive inductive hypothesis.
But what would such a primitive inductive hypothesis be? We are left in the dark:
We need not lay aside the belief that this conviction gets its invincible certainty from some valid principle darkly present to our minds, even though it still eludes the peering eyes of philosophy. (1921: 304)
However, at the end of the day, Keynes simply argues that a non-zero (finite) a priori probability is assigned to the inductive hypothesis \(IH\) (which is equivalent to PLV). What would be the reason to assign an a priori non-zero probability to the inductive hypothesis \(IH\)? Keynes’s answer, honest to the bone, shows the limitations of all attempts to satisfy the inductive sceptic: “It is because there has been so much repetition and uniformity in our experience that we place great confidence in it.” (1921: 289-290)
It seems we cannot do better than relying on past experience. The Inductive Hypothesis that supports induction, PLV in Keynes’s case, is neither a self-evident logical axiom nor an object of direct acquaintance (1921: 304). But nevertheless, he insists that it is true of some factual systems. How do we know this? By past experience!
On the Rule of Succession
Before we leave Keynes, let us consider his critique of Laplace’s Rule of Succession, i.e., the theorem of mathematical probability which claims that if an event has occurred \(m\) times in succession, then the probability that it will occur again is \(\frac{m+1}{m+2}\).
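For reference, the standard Bayes–Laplace derivation of this value assumes a uniform prior over the unknown chance \(p\) of the event:

```latex
\[
P(\text{success on trial } m+1 \mid m \text{ successes})
  = \frac{\int_0^1 p \cdot p^{m}\, dp}{\int_0^1 p^{m}\, dp}
  = \frac{1/(m+2)}{1/(m+1)}
  = \frac{m+1}{m+2}.
\]
```

Setting \(m = 0\) gives \(1/2\), the value that figures in Keynes’s criticism below.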
As discussed elsewhere [see our entry in IEP on The Problem of
Induction (Psillos and Stergiou, 2022)] Venn had reasons not to “take such a rule as this seriously.”
(1888: 197), but Keynes’s criticism goes well beyond these reasons.
The crux of Keynes’s criticism is that the derivation of the rule of succession combines two different methods for determining the probability of an event, methods which yield different probability values. Thus, their combination is inconsistent and harbors a latent contradiction.
Consider several possible events \(E_1, E_2, \dots, E_n\) that are alternatives, i.e., they are mutually exclusive and exhaustive of the sample space, and choose any one of them,
\(E_i\).
The first method stipulates that “when we do not know anything about an alternative, we must consider all the possible values of the
probability of the alternative; these possible values can form in their turn a set of alternatives, and so on. But this method by itself can lead to no final conclusion.” (1921: 426) Let the probability of the alternative be \(p(E_i)\). The method stipulates that we should consider all probability values of \(E_i\) assigned by any admissible probability functions \(p\). These probability values for
\(E_i\) form another set of alternatives, say,
\(p_1(E_i), \dots, p_n(E_i),\dots\) And the same process may be repeated, again and again, involving us in an infinite regress. Thus, the first method is inconclusive.
The second method applies the principle of indifference stipulating that “when we know nothing about a set of alternatives, we suppose the
probabilities of each of them to be equal.” (ibid)
Thus, the second method concludes that, \(p(E_1) = \cdots = p(E_n)\).
Consider the event \(E_1\): “the sun will rise tomorrow” and its alternative \(E_2\): “the sun will not rise tomorrow”. If we apply the first method only, we reach no conclusion about the probability and we are involved in an infinite regress. Secondly, if we
apply the second method only, we obtain \(p(E_1) = p(E_2) = 1/2\).
Finally, in deriving the rule of succession, both methods are applied in succession. Namely, the probability of
\(E_1\) is unknown, and any probability value is possible according to the first method. Thus, we form a set of alternatives for the probability of \(E_1\) which, at a second stage are reduced to the equal probability case by applying the second method. This reasoning
is presupposed by the rule of succession.
The latent contradiction in the rule of succession is that its derivation assumes that the a priori probability of the event can be any number in the interval \([0,1]\), with all values equally probable, while, by application of the rule, the a priori probability, calculated in the absence of any observations (\(m=0\)), is \(1/2\).
In Keynes’s (1921: 430) own words:
The principle’s conclusion is inconsistent with its premises. We begin with the assumption that the a priori probability of an event,
about which we have no information and no experience, is unknown, and that all values between 0 and 1 are equally probable. We end with the conclusion that the a priori probability of such an
event is 1/2 … this contradiction was latent, as soon as the Principle of Indifference was superimposed on the principle of unknown probabilities.
Carnap’s Inductive Logic
Two Concepts of Probability
Carnap presented his views of probability and induction mainly in two books, Logical Foundations of Probability (1950) and The Continuum of Inductive Methods (1952), and in his papers “A basic system of inductive logic, I, II” (1971 and 1980, respectively) and “Replies and Systematic Expositions” (1963). For Carnap, the theory and principles of inductive reasoning, inductive logic, is the same as probability logic (1950: v), and the primary task toward an account of inductive logic is the explication of probability.
Explication, according to Carnap (1950: 3), is the transformation of an inexact, possibly prescientific concept, the explicandum, into a new exact concept, the explicatum,
that obeys explicitly stated rules for its use. By means of this transformation a concept of ordinary discourse or a metaphysical concept may be incorporated into a well-structured body of logico-mathematical or empirical concepts. Explication has a long history as a philosophical method that, in a wide sense, may be traced back even to Plato’s investigations on definitions. Strictly speaking, however, Carnap borrowed the term “Explikation” from Kant and Husserl while Frege may be considered his precursor in this method of philosophical analysis and
Goodman, Quine and Strawson among his prominent intellectual inheritors.
[For a general presentation of the notion explication, consult IEP’s entry on Explication, (Cordes and Siegwart 2019).]
Two concepts are distinguished as explicanda of probability according to Carnap: the logical or inductive probability, called
‘probability1’ and the statistical probability, called ‘probability2’.
Both concepts are important for science and lack of recognition of this fact, Carnap claimed, has fueled many futile controversies among philosophers. The meaning of probability2 is that of relative frequency of a kind of event in a long sequence of events, and in science it is applied to the description and statistical analysis of mass phenomena.
All sentences about statistical probability are factual, empirical.
The logical concept of probability, probability1, is the basis for all inductive reasoning. For Carnap (1950: 2), the problem of induction
is the problem of the logical relation between a hypothesis and some confirming evidence for it and
“inductive logic is the theory based upon what might be called the degree of inducibility, that is, the degree of confirmation.” Hence, by taking probability1 to mean “the degree of confirmation of a hypothesis
\(h\) with respect to an evidence statement \(e\), e.g., an observational report” (1950: 19) Carnap made it the basis of inductive logic. As for any logical sentence, the truth or falsity of sentences about probability1 is independent of extralinguistic facts.
In addition, logical probability is an objective concept, i.e., “if a certain probability1 value holds for a certain hypothesis with respect
to a certain evidence, then this value is entirely independent of what any person may happen to think about these sentences, just as the relation of logical consequence is independent in this respect.”(1950:
43) Carnap recognized the objectivity of probability1 in the views of Keynes and Jeffreys, who interpreted probability in terms of rational degrees of belief, as distinguished from the subjective, actual degrees of belief a person might have in the truth of a sentence given some evidence. Later, he (1963: 967) came to accept the interpretation of probability1 as “the degree to which [one]… is rationally entitled to believe in \(h\) on the basis of \(e\).”
C-functions
Carnap suggested three different concepts of confirmation. The first is the classificatory concept of confirmation, which expresses a logical relation between a piece of evidence \(e\) and a hypothesis \(h\) and, if satisfied, qualifies the former as a confirming instance of the latter. To signify the explicatum of this concept, Carnap used the symbol ‘\(\mathfrak{C}\)’, and \(\mathfrak{C}(h, e)\) corresponds to “\(h\) is confirmed (or, supported) by \(e\)”. The second concept of confirmation he employed is the comparative concept, which compares the strength with which a piece of evidence \(e_1\) confirms a hypothesis \(h_1\) with the corresponding strength with which \(e_2\) confirms \(h_2\). Thus, comparative confirmation requires the underlying classificatory confirmation and is, in general, a tetradic relation. Its explicatum is symbolized by ‘\(\mathfrak{MC}\)’, where \(\mathfrak{MC}(h_1, e_1, h_2, e_2)\) corresponds to the statement ‘\(h_1\) is confirmed by \(e_1\) at least as strongly (i.e., either more, or equally, strongly) as \(h_2\) by \(e_2\)’. Finally, there is a quantitative (or, metrical) concept of confirmation, the degree of confirmation, which assigns a numerical value to the degree to which a hypothesis \(h\) is supported by given observational evidence \(e\). The explicatum of this concept is symbolized by ‘c’, where ‘c(\(h\), \(e\)) = \(r\)’ is the statement “the degree of confirmation of \(h\) with respect to \(e\) is \(r\)”, where \(h\) and \(e\) are sentences and \(r\) is a real number in the unit interval.
In this context, Carnap points out that Keynes’s objective conception of probability is similar to the comparative concept of confirmation, and only in some special cases, when the principle of indifference is applicable, can it be interpreted quantitatively, in a way similar to his concept of degree of confirmation (1950: 45 & 205). Moreover, notice that all three conceptions of confirmation Carnap (1950: 19) suggested are semantical:
The concepts of confirmation to be dealt with in this book are
semantical, i.e., based upon meaning, and logical, i.e., independent of facts.
The inductive relation the three concepts of confirmation attempt to explicate is not determined by the form of the sentences, as Hempel required in his syntactic account of confirmation (1945), nor does it depend on the users of a language, as Goodman suggested in his pragmatic solution of the new riddle of induction (1955) (see also our other entry in IEP on The Problem of Induction (Psillos and Stergiou, 2022)). Rather:
[O]nce \(h\) and \(e\) are given, the question mentioned requires only that
we be able to understand them, that is, to grasp their meanings, and to establish certain relations which are based upon their meanings (1950:
20).
Carnap begins with the construction of the language(s) in which inductive logic is to be applied. He defines several language systems
each one characterized by the number of names (constants) it contains
(1950: 58). Each name refers to individuals in the corresponding universe of discourse, be they things, events, or the like. Thus, he considered an infinite language system \(\mathcal{L}_\infty\), having an infinite number of names and a sequence \(\mathcal{L}_1, \mathcal{L}_2, \dots, \mathcal{L}_N, \dots\) of language systems each one characterized by the index \(N\) that runs through all positive integers indicating the number of names the system includes. Hence, \(\mathcal{L}_1\) contains only ‘\(a_1\)’; \(\mathcal{L}_2\) contains ‘\(a_1\)’ and ‘\(a_2\)’; etc. Notice that any sentence of \(\mathcal{L}_\infty\) is contained in an infinite number of finite language systems of the hierarchy since if ‘\(a_N\)’ is the name with highest subscript that appears in that sentence, then this sentence will be represented in any language system \(\mathcal{L}_n\) with \(n \geq N\). Apart from names, \(\mathcal{L}_\infty\) contains a finite number of primitive (atomic) predicates of any degree (unary, binary etc.) designating properties and relations among individuals in the universe of discourse. Carnap considered only three connectives as primitive for his language systems: the negation ‘\(\sim\)’, the conjunction ‘\(\&\)’ and the inclusive disjunction ‘\(\vee\)’ – and he defined implication and biconditional in terms of these three. Each language system contains an infinite number of variables, \(x, y, z, x_1, x_2 \dots\), and two quantifiers, the existential ‘\((\exists x)\)’ and the universal one, ‘\((x)\)’. The sentence ‘\((x)Px\)’ is taken to be logically equivalent to ‘\(Pa_1 \& Pa_2 \dots \& Pa_N\)’ in a language \(\mathcal{L}_N\), according to the semantics adopted. The same is not true for the case of \(\mathcal{L}_\infty\) since in this case the conjunction of an infinite number of sentences is not a well-formed formula of the language. Apart from the atomic predicates, molecular predicates may be defined. 
They are formed by atomic or more basic molecular predicates with the help of connectives. For example, if \(P_1\),
\(P_2, P_3\) are atomic predicates, then
‘\(\sim P_1\)’ or ‘\(P_1 \& P_2\)’ or ‘\(P_1
\vee P_3\)’ are molecular predicates understood as follows: for any variable
\(x\), ‘\((\sim P_1)x\)’ stands for ‘\(\sim(P_1 x)\)’; ‘\((P_1 \& P_2)x\)’ for ‘\(P_1 x \& P_2 x\)’; and ‘\((P_1 \vee P_3)x\)’ for ‘\(P_1 x \vee P_3 x\)’. Finally,
language systems contain an equality symbol ‘\(=\)’ designating identity of individuals in the universe of discourse and a tautological sentence
‘\(t\)’. As any language, these language systems are equipped with some rules
for the formation of well-formed formulas (sentences) and some rules of truth, i.e., a semantics.
A state description \(\mathfrak{V}\) is an explication of the vague concept of a state of affairs relativized to a given language system \(\mathcal{L}\) (1950:
70ff). It purports to describe possible states of the universe of discourse of \(\mathcal{L}\). A state description describes for every individual designated by some name ‘\(a\)’ and for every property designated by an atomic predicate ‘\(P\)’ of \(\mathcal{L}\) whether or not this individual has that
property, and similarly for relations. Thus, a state description will
contain exactly one sentence from the pair ‘\(Pa\), \(\sim Pa\)’: either ‘\(Pa\)’ or ‘\(\sim Pa\)’ but not both, and no other element (similarly for relations). In the case of a finite language system \(\mathcal{L}_N\), a state description has the form of a conjunction of sentences of the aforementioned sort while in the case of an infinite language system \(\mathcal{L}_\infty\), a state description is a class of sentences that contains at most one sentence of the aforementioned sort. In both cases nothing more is included in a state description. The class of all state descriptions in a given system \(\mathcal{L}\) is designated by ‘\(V_{\mathfrak{V}}\)’ while the null class by ‘\(\Lambda_{\mathfrak{V}}\)’.
For example, consider a language system \(\mathcal{L}_3\) with names,
‘\(a\), \(b\) and \(c\)’ and a single atomic unary predicate symbol ‘\(P\)’. The complete set of state descriptions is the following:
\(\mathfrak{V}_1\) ‘\(Pa \& Pb \& Pc\)’
\(\mathfrak{V}_2\) ‘\(\sim Pa \& Pb \& Pc\)’
\(\mathfrak{V}_3\) ‘\(Pa \& \sim Pb \& Pc\)’
\(\mathfrak{V}_4\) ‘\(Pa \& Pb \& \sim Pc\)’
\(\mathfrak{V}_5\) ‘\(\sim Pa \& \sim Pb \& Pc\)’
\(\mathfrak{V}_6\) ‘\(\sim Pa \& Pb \& \sim Pc\)’
\(\mathfrak{V}_7\) ‘\(Pa \& \sim Pb \& \sim Pc\)’
\(\mathfrak{V}_8\) ‘\(\sim Pa \& \sim Pb \& \sim Pc\)’
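Carnap’s example can be reproduced mechanically; the following sketch enumerates the \(2^3 = 8\) state descriptions of \(\mathcal{L}_3\) (the string notation is ours, standing in for the object-language formulas):

```python
from itertools import product

# Enumerate the state descriptions of a language system with names
# a, b, c and a single unary atomic predicate P: each state description
# assigns, to every atomic sentence, either itself or its negation.
NAMES = ['a', 'b', 'c']

def state_descriptions():
    descs = []
    for values in product([True, False], repeat=len(NAMES)):
        conjuncts = [('P' if v else '~P') + n for n, v in zip(NAMES, values)]
        descs.append(' & '.join(conjuncts))
    return descs

for sd in state_descriptions():
    print(sd)   # from 'Pa & Pb & Pc' down to '~Pa & ~Pb & ~Pc'
```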
The adequacy of a language system \(\mathcal{L}\) for inductive logic requires compliance with two important conditions: the requirement of logical
independence and the requirement of completeness. The first condition aims at restricting the language system to bar contradictory state descriptions. The requirement of logical independence stipulates (i)
that atomic sentences (i.e., sentences that consist of an \(n\)-place predicate and \(n\) names) are logically independent, i.e., a class containing atomic sentences (e.g., sentences of the form \(Pa\) for a predicate ‘\(P\)’ and a name ‘\(a\)’) and the negations of other atomic sentences does not logically entail another atomic sentence or its negation; (ii) names in \(\mathcal{L}\) designate different and separate individuals;
(iii) atomic predicates are interpreted to designate logically independent attributes.
The requirement of completeness of language stipulates that the set of the atomic predicates of \(\mathcal{L}\) be sufficient for expressing every
qualitative attribute of the individuals in the universe of discourse of
\(\mathcal{L}\). This requirement seemed absolutely necessary for the Carnapian system, since the language systems affect the c-values in the theory of inductive logic. For the time being, all we need to stress is that this requirement implies that a language system \(\mathcal{L}\) mirrors its
universe of discourse.
Whatever there is in it can be exhaustively expressed within \(\mathcal{L}\). Here is Carnap’s example (1950: 75). Take a language system \(\mathcal{L}\) with only two predicates, ‘\(P_1\)’ and ‘\(P_2\)’ interpreted as Bright and Hot. Then, every individual in the universe of discourse of \(\mathcal{L}\) should
differ only with respect to these two attributes. If a new predicate
‘\(P_3\)’,
interpreted as Hard, were added, the c -values of hypotheses concerning individuals in
\(\mathcal{L}\) would change. Even if this simple scheme holds (or might hold) in a simple language, can it be adequate for the language of natural sciences? A similar requirement had been proposed by Keynes, in the form of the Principle of Limited Variety (see section 3c above).
Later on, Carnap abandoned this requirement and replaced it with the following: The value of the confirmation function c(h, e) remains
unchanged if further families of predicates are added to the language
(see 1963: 975). According to this requirement, the value of c(h, e)
depends only on the predicates occurring in h and e. Hence, the addition of new predicates to the language does
not affect the value of
c(h, e). This new idea amounts to what Lakatos (1968: 325) called the minimal language requirement, according to which the degree of confirmation of a proposition depends only on the minimal language in
which the proposition can be expressed.
Another important concept defined by Carnap is that of the range of a sentence or of a collection of sentences (1950: 78).
The range of a sentence \(i\), \(\mathfrak{R}(i)\), is the class of those state descriptions in which that sentence holds. A sentence of the form ‘\(Pa\)’ or ‘\(\sim Pa\)’, for an atomic predicate ‘\(P\)’ and a name ‘\(a\)’, holds in a state description \(\mathfrak{V}\) if it is either a conjunct in \(\mathfrak{V}\)’s defining conjunction or it belongs to the class of sentences that define \(\mathfrak{V}\). Analogously, if a sentence is a conjunction of sentences, then all components of the conjunction should hold in a state description, while if it is a disjunction, at least one disjunct should hold in a state description, for the state description to belong to the sentence’s range. Notice that a tautology holds in all state descriptions. For instance, in the previous example, the range of
\(Pa \& Pb\) is \(\mathfrak{R}(Pa \& Pb) = \{\mathfrak{V}_1, \mathfrak{V}_4\}\) while the range of \(Pa \vee Pb\) is
\(\mathfrak{R}(Pa \vee Pb) =\)
\{\(\mathfrak{V}_1, \mathfrak{V}_2, \mathfrak{V}_3, \mathfrak{V}_4, \mathfrak{V}_6, \mathfrak{V}_7\)\}. Finally, the range of a class of sentences is the class of state descriptions in which every sentence of the class holds.
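These definitions are easy to check computationally. The following minimal sketch (Python, with names of our own choosing; nothing here is Carnap's notation) enumerates the eight state descriptions of \(\mathcal{L}_3\) and computes the ranges of the example sentences:

```python
from itertools import product

# State descriptions of L_3: one predicate P and names a, b, c. Each state
# description is a truth-value assignment to the atomic sentences Pa, Pb, Pc.
names = ["a", "b", "c"]
state_descriptions = list(product([True, False], repeat=len(names)))  # 2^3 = 8

def rng(sentence):
    """The range R(i): the class of state descriptions in which i holds."""
    return [s for s in state_descriptions if sentence(dict(zip(names, s)))]

Pa_and_Pb = lambda v: v["a"] and v["b"]
Pa_or_Pb = lambda v: v["a"] or v["b"]
tautology = lambda v: v["a"] or not v["a"]

print(len(rng(Pa_and_Pb)))  # 2 state descriptions (V_1 and V_4 in the text)
print(len(rng(Pa_or_Pb)))   # 6 state descriptions
print(len(rng(tautology)))  # 8: a tautology holds in every state description
```

The counts agree with the ranges given above: \(Pa \& Pb\) holds in two state descriptions, \(Pa \vee Pb\) in six, and a tautology in all eight.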
As a final step before defining the c-function, we present Carnap’s account of logical concepts in a system \(\mathcal{L}\) in terms of state descriptions
and the concept of range: a sentence \(i\) is L-true in \(\mathcal{L}\) if and only if \(\mathfrak{R}(i)\) is \(V_{\mathfrak{V}}\) while it is L-false in \(\mathcal{L}\) if and only if \(\mathfrak{R}(i)\) is \(\Lambda_{\mathfrak{V}}\); a sentence \(i\) L-implies \(j\) in \(\mathcal{L}\) if and only if \(\mathfrak{R}(i) \subseteq \mathfrak{R}(j)\); \(i\) is L-equivalent to \(j\) in \(\mathcal{L}\) if and
only if \(\mathfrak{R}(i) = \mathfrak{R}(j)\); \(j_1, j_2, \dots, j_n\) (\(n \geq 2\)) are L-disjunct with one another in \(\mathcal{L}\) if and only if \(\mathfrak{R}(j_1) \cup \mathfrak{R}(j_2) \cup \dots
\cup \mathfrak{R}(j_n)\) is \(V_{\mathfrak{V}}\); \(i\) is L-exclusive of \(j\) in \(\mathcal{L}\)
if and only if \(\mathfrak{R}(i) \cap \mathfrak{R}(j)\) is \(\Lambda_{\mathfrak{V}}\); a class of sentences is L-exclusive in pairs if and only if every sentence of the class is L-exclusive of every other sentence of that class. L-truth is the explicatum for logical truth or analytical truth while L-falsity is the explicatum for contradiction. L-implication is the explicatum for logical entailment, while L-equivalence explicates mutual deducibility and is the same as mutual L-implication. L-disjunctness applied to a set of sentences explicates the idea that at least one of those sentences is true, and L-exclusion explicates logical incompatibility, that is, the logical impossibility of joint truth.
For the sake of simplicity, in this presentation we focus on finite
language systems.
Thus, \(\mathfrak{m}\) is a regular measure function (briefly, a regular \(\mathfrak{m}\)-function) for \(\mathfrak{V}\) in \(\mathcal{L}_N\) if and only if it fulfills the following two conditions: (a) for every \(\mathfrak{V}_i\) in \(\mathcal{L}_N\), \(\mathfrak{m}(\mathfrak{V}_i)\) is a positive real number; (b) the sum of the values of \(\mathfrak{m}\) for all \(\mathfrak{V}\) in \(\mathcal{L}_N\) is 1, \(\sum_{\mathfrak{V}_i} \mathfrak{m}(\mathfrak{V}_i) = 1\). The regular \(\mathfrak{m}\)-function for \(\mathfrak{V}\) can be extended to a regular \(\mathfrak{m}\)-function for the sentences in \(\mathcal{L}_N\) by requiring the following: (a) for any L-false sentence \(j\) in \(\mathcal{L}_N\), \(\mathfrak{m}(j) = 0\); (b) for any non-L-false sentence \(j\), \(\mathfrak{m}(j) = \sum_{\mathfrak{V} \in \mathfrak{R}(j)} \mathfrak{m}(\mathfrak{V})\) (Carnap 1950: 295).
In the example of the language system \(\mathcal{L}_3\) considered previously, a regular \(\mathfrak{m}\)- function for state descriptions is defined as follows:
\(\mathfrak{m}(\mathfrak{V}_i) = \frac{1}{12}\), for \(i = 1, 3, 4, 7\); \(\mathfrak{m}(\mathfrak{V}_i) = \frac{1}{6}\), for \(i = 2, 5, 6, 8\).

It is extended to a regular \(\mathfrak{m}\)-function for sentences that assigns numerical values to sentences, e.g., \(\mathfrak{m}(Pa \& Pb) = \mathfrak{m}(\mathfrak{V}_1) + \mathfrak{m}(\mathfrak{V}_4) = \frac{1}{12} + \frac{1}{12} = \frac{1}{6}\).
A regular confirmation function is defined as a two-argument function for sentences on the basis of a regular \(\mathfrak{m}\)-function for sentences in \(\mathcal{L}_N\). Namely, let \(\mathfrak{m}\) be a regular \(\mathfrak{m}\)-function for sentences in \(\mathcal{L}_N\), then c is a regular confirmation function (briefly, a regular c-function) for sentences in \(\mathcal{L}_N\) if and only if for any sentences \(e, h\) in \(\mathcal{L}_N\),
\(c(h, e) = \frac{\mathfrak{m}(e \& h)}{\mathfrak{m}(e)}\),
where \(\mathfrak{m}(e) \neq 0\); c(h, e) has no value when \(\mathfrak{m}(e) = 0\) (Carnap 1950:
295). In the aforementioned example, if \(e\) stands for the L-false
sentence ‘\(Pa \& \sim Pa\)’,
c(h, e) is not defined for any hypothesis h. L-false sentences cannot be evidence for or against any hypothesis. However, if an L-false sentence, e.g., ‘\(Pa \& \sim Pa\)’, is taken as hypothesis h, then c(h, e) =
0, for any admissible piece of evidence e. Consider an L-true sentence,
such as ‘\(Pa \vee \sim Pa\)’, as hypothesis h. Then c(h, e) = 1 no matter what the admissible evidence might be; no evidence can increase or decrease the degree of confirmation of a logical truth (obviously, e is not L-false). In other cases, e.g., for the hypothesis h, ‘\(Pa\)’, and the evidence e, ‘\(Pb\)’, c(Pa, Pb) = \(\frac{\mathfrak{m}(Pa \& Pb)}{\mathfrak{m}(Pb)} = \frac{1/6}{1/2} = \frac{1}{3}\).
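The running example can be replayed in a few lines of code. The following sketch (Python; variable names are ours, not Carnap's) encodes the example's regular \(\mathfrak{m}\)-function over the eight state descriptions and recovers c(Pa, Pb) = 1/3:

```python
from fractions import Fraction

# State descriptions V_1..V_8 of L_3 as truth-value assignments (Pa, Pb, Pc),
# indexed as in the text, together with the example's regular m-function.
V = {
    1: (True, True, True),   2: (False, True, True),
    3: (True, False, True),  4: (True, True, False),
    5: (False, False, True), 6: (False, True, False),
    7: (True, False, False), 8: (False, False, False),
}
m_state = {i: Fraction(1, 12) if i in (1, 3, 4, 7) else Fraction(1, 6) for i in V}
assert sum(m_state.values()) == 1  # condition (b): the weights sum to 1

def m(sentence):
    """Extend m to sentences: sum the weights over the sentence's range."""
    return sum((m_state[i] for i in V if sentence(*V[i])), Fraction(0))

def c(h, e):
    """Regular confirmation function: c(h, e) = m(e & h) / m(e)."""
    return m(lambda *v: h(*v) and e(*v)) / m(e)

h = lambda pa, pb, pc: pa   # the hypothesis 'Pa'
e = lambda pa, pb, pc: pb   # the evidence 'Pb'
print(c(h, e))              # 1/3, as computed in the text
```

Exact rational arithmetic (`Fraction`) keeps the computed values identical to Carnap's: \(\mathfrak{m}(Pa \& Pb) = 1/6\), \(\mathfrak{m}(Pb) = 1/2\), so c(Pa, Pb) = 1/3.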
A regular c-function is a conditional probability function in the common parlance of the mathematical theory of probability, since it satisfies Kolmogorov’s axioms. This was a desideratum for Carnap, who stipulated that an adequate concept of degree of confirmation should fulfill the following conditions (1950: 285):
L-equivalent evidences. If \(e\) and \(e’\) are L-equivalent,
then c(h, e) = c(h, e’).
L-equivalent hypotheses. If \(h\) and \(h’\) are L-equivalent,
then c(h, e) = c(h’, e).
General Multiplication Principle. \(c(h \& j, e) = c(h, e) \cdot c(j, e \& h)\).
Special Addition Principle. If \(e \& h \& j\) is L-false, then \(c(h \vee j, e) = c(h, e) + c(j, e)\).
Maximum Value. For any non-L-false \(e\), \(c(t, e) = 1\),
where \(h, h’, e, e’, j\) are any sentences in \(\mathcal{L}_N\) and \(t\) is a logical truth. The first two conditions demand that the explicatum of the degree of confirmation should respect logical equivalence. The General Multiplication Principle follows directly from the definition of conditional probability. The Special Addition Principle corresponds to the additivity axiom in Kolmogorov’s formulation, which gives rise to the finite additivity condition, and the Maximum Value condition corresponds to the fact that the probability of the sample space is 1.
To recover unconditional probability functions for sentences in \(\mathcal{L}_N\), Carnap suggested considering the probability of any sentence conditional on a tautology. Namely, if c is a regular confirmation function for \(\mathcal{L}_N\), then for every sentence \(j\) in \(\mathcal{L}_N\), the null confirmation \(c_0\) is \(c_0(j) = c(j, t)\). Moreover, he showed that \(c_0(j) = \mathfrak{m}(j)\). The null confirmation represents the prior probability of a sentence in the absence of any evidence (1950: 307-8).
In the example of the language system \(\mathcal{L}_3\) considered previously we suggested a regular \(\mathfrak{m}\)-function that assigns different real
numbers to different state descriptions, i.e., to different states in the universe of discourse. However, is there any reason to believe that these numbers should be unequal? Is there any reason to believe that one state description weighs more than any other? Rather, by application of the principle of indifference, it seems that we should demand equal distribution of weight to all state descriptions, \(\mathfrak{m}^+(\mathfrak{V}) = \frac{1}{\zeta}\), where \(\zeta\) is the number of the state descriptions in \(\mathcal{L}_N\) (Carnap, 1950: 564). Moreover, it is easy to show that for any given piece of evidence \(e\) and for every pair of state descriptions \(\mathfrak{V}_i, \mathfrak{V}_j\) compatible with \(e\), it holds: \(c^+(\mathfrak{V}_i, e) = c^+(\mathfrak{V}_j, e)\).
Of course, the principle of indifference entails equiprobability only for state descriptions and not for all sentences, in a way that Keynes would appreciate, since he was the first to suggest restricted application of the principle of indifference to possibilities that are mutually exclusive and exhaustive of the sample space, to avoid the Book paradox. Salmon (1966: 72) notes that Carnap’s “…explication of probability in these terms has been thought to preserve the ‘valid core’ of the traditional principle of indifference”.
Nevertheless, Carnap has shown that to suggest a regular \(\mathfrak{m}\)-function for \(\mathfrak{V}\) in \(\mathcal{L}\) that assigns equal weight to all state descriptions, although intuitively plausible, has deeply undesirable consequences: it inhibits learning from experience. To see why consider a language \(\mathcal{L}_{N+1}\), with a single unary atomic predicate \(P\). We want to calculate the degree of confirmation of the hypothesis that the (\(N + 1\))th individual will have the property \(P\), i.e., h: ‘\(Pa_{N+1}\)’, given the evidence that all individuals examined so far had the property \(P\), i.e., e: ‘\(Pa_N \& \dots \& Pa_1\)’. The number of state descriptions is
\(2^{N+1}\), hence, the \(\mathfrak{m}^+\) regular \(\mathfrak{m}\)-function assigns equal weight to all
state descriptions, \(\mathfrak{m}^+(\mathfrak{V}) = \frac{1}{2^{N+1}}\)
. First, notice that \(h \& e\) and \(\sim h \& e\) are state descriptions of \(\mathcal{L}_{N+1}\), so each receives the weight \(\frac{1}{2^{N+1}}\). Moreover, \(e\) and \((h \& e) \vee (\sim h \& e)\) are L-equivalent and \(\mathfrak{m}^+(e) = c_0^+(e) = c^+(e, t)\). By the L-equivalent-hypotheses condition, \(\mathfrak{m}^+(e) = c^+((h \& e) \vee (\sim h \& e), t)\); and by the Special Addition Principle, \(\mathfrak{m}^+(e) = c^+(h \& e, t) + c^+(\sim h \& e, t) = \mathfrak{m}^+(h \& e) + \mathfrak{m}^+(\sim h \& e) = \frac{1}{2^{N+1}} + \frac{1}{2^{N+1}} = \frac{2}{2^{N+1}} = \frac{1}{2^N}\). Hence,

\(c^+(h, e) = \frac{\mathfrak{m}^+(h \& e)}{\mathfrak{m}^+(e)} = \frac{1/2^{N+1}}{1/2^N} = \frac{1}{2}.\)
The last equality yields the desired conclusion: the degree of confirmation of a hypothesis is independent of the evidence collected in a given population. No matter how many positive instances of a given property one observes in a population, one’s guess regarding the appearance of the property in the next individual is no better justified than if no observations had been made; thus, learning does not come from experience (1950: 564-5).
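The no-learning result is easy to verify numerically. A small sketch (Python; the function name is ours) computes c+ by brute-force summation over all state descriptions for samples of increasing size, and finds the value stuck at 1/2:

```python
from fractions import Fraction
from itertools import product

def c_plus(N):
    """c+(Pa_{N+1}, Pa_1 & ... & Pa_N) for a language with one unary
    predicate and N+1 individuals, under the m+ function that gives all
    2^(N+1) state descriptions equal weight."""
    w = Fraction(1, 2 ** (N + 1))
    m_e = m_he = Fraction(0)
    for s in product([True, False], repeat=N + 1):
        if all(s[:N]):          # evidence: the first N individuals have P
            m_e += w
            if s[N]:            # hypothesis: the (N+1)th individual has P
                m_he += w
    return m_he / m_e

# However many positive instances are observed, the prediction stays at 1/2:
print([c_plus(N) for N in (1, 5, 10)])  # [Fraction(1, 2), Fraction(1, 2), Fraction(1, 2)]
```

The brute-force sum reproduces the derivation above: \(\mathfrak{m}^+(h \& e) = 1/2^{N+1}\), \(\mathfrak{m}^+(e) = 1/2^N\), so the ratio is always 1/2.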
To avoid this difficulty, Carnap suggested applying the principle of indifference in a different way. Instead of distinguishing states of affairs in terms of properties and relations instantiated by certain individuals, Carnap grouped all states of affairs instantiating the same properties and relations, independently of the individuals that instantiated them, and distinguished only among these classes. Hence, we should not focus anymore on state descriptions describing possible states of the universe of discourse for a language system but on classes of such state descriptions in which any two state descriptions are isomorphic to one another. Two sentences \(i\), \(j\) in \(\mathcal{L}_N\) are isomorphic if \(j\) is formed from \(i\) by replacing each individual constant occurring in \(i\) by its correlate with respect to a one-to-one relation among all individual constants in \(\mathcal{L}_N\). These classes are called structure descriptions, \(\mathfrak{Str}\). They describe the common structure attributed to the realm of individuals by a class of state descriptions.
For instance, a structure description may express the fact that there are exactly two individuals in the universe of discourse possessing a given property \(P\), or that none of the individuals bears the relation \(R\) to itself, or that the relation \(R\) is satisfied by pairs of individuals non-symmetrically – i.e., for all individual constants \(a\), \(b\), \(Rab\) and \(\sim Rba\) are both satisfied – etc. Now the principle of indifference applies in two stages: first, following the principle, we assign equal weight to all structure descriptions and, second, within each structure description we assign equal weight to all isomorphic state descriptions.
Thus, for a state description \(\mathfrak{V}_i\) in a language system \(\mathcal{L}_N\), if \(\tau\) is the number of structure descriptions \(\mathfrak{Str}\) and
\(\zeta_i\) the number of all state descriptions that are isomorphic to \(\mathfrak{V}_i\), we define (1950: 564) the regular \(\mathfrak{m}\)-function for
\(\mathfrak{V}\): \(\mathfrak{m}^*(\mathfrak{V}_i) = \frac{1}{\tau \cdot \zeta_i}\).
To illustrate the relation between state descriptions and structure descriptions and the difference between the values of \(\mathfrak{m}^+\),
\(\mathfrak{m}^*\) regular \(\mathfrak{m}\)-functions, consult the following table which represents the example of \(\mathcal{L}_3\) with a single predicate \(P\):
| STATE DESCRIPTIONS | WEIGHT | STRUCTURE DESCRIPTIONS | WEIGHT | \(\mathfrak{m}^+\) | \(\mathfrak{m}^*\) |
|---|---|---|---|---|---|
| \(Pa \& Pb \& Pc\) | 1/8 | All \(P\)s, no \(\sim P\)s | 1/4 | 1/8 | 1/4 |
| \(\sim Pa \& Pb \& Pc\) | 1/8 | 2 \(P\)s, 1 \(\sim P\) | 1/4 | 1/8 | 1/12 |
| \(Pa \& \sim Pb \& Pc\) | 1/8 | | | 1/8 | 1/12 |
| \(Pa \& Pb \& \sim Pc\) | 1/8 | | | 1/8 | 1/12 |
| \(\sim Pa \& \sim Pb \& Pc\) | 1/8 | 1 \(P\), 2 \(\sim P\)s | 1/4 | 1/8 | 1/12 |
| \(\sim Pa \& Pb \& \sim Pc\) | 1/8 | | | 1/8 | 1/12 |
| \(Pa \& \sim Pb \& \sim Pc\) | 1/8 | | | 1/8 | 1/12 |
| \(\sim Pa \& \sim Pb \& \sim Pc\) | 1/8 | No \(P\)s, all \(\sim P\)s | 1/4 | 1/8 | 1/4 |
Let’s now revisit the problem of determining the degree of confirmation of the hypothesis that the (\(N + 1\))th individual will have the property \(P\), i.e., h: ‘\(Pa_{N+1}\)’, given the evidence that all individuals examined so far had the property \(P\), i.e., e: ‘\(Pa_N \& \dots \& Pa_1\)’ in a language \(\mathcal{L}_{N+1}\) with a single unary predicate \(P\). Since our language contains \(N + 1\) individual constants, a structure description is determined by the number of instances of the property \(P\) we find in the
universe of discourse disregarding the identity of the individuals that instantiate the property. Thus, all state descriptions that are
isomorphic to ‘\(Pa_{N+1} \& Pa_N \& \dots \& Pa_1\)’ correspond to the same structure description characterized by \(N + 1\) property instances in the universe of discourse,
while all state descriptions that are isomorphic to
‘\(\sim Pa_{N+1} \& \sim Pa_N \& \dots \& \sim Pa_1\)’ correspond to the same structure description characterized by 0 property instances in the universe of discourse. Thus, we have different structure descriptions corresponding to \(0, 1, \dots, N + 1\) occurrences of \(P\), and the total number of structure descriptions is \(\tau = N + 2\). To calculate the number \(\zeta_k\) of state descriptions that are isomorphic to \(\mathfrak{V}_k\), let us take \(k\) to denote the number of occurrences of \(P\) in \(\mathfrak{V}_k\), i.e., \(k = 0, 1, \dots, N + 1\). Then \(\zeta_k\) is the number of ways of choosing \(k\) individuals out of \(N + 1\), i.e., \(\binom{N+1}{k} = \frac{(N+1)!}{k!\,(N+1-k)!}\). Thus, we find that
\(\mathfrak{m}^*(\mathfrak{V}_k) = \frac{1}{(N+2)\binom{N+1}{k}} = \frac{k!\,(N+1-k)!}{(N+2)!}\), for \(k = 0, 1, \dots, N + 1\). The degree of confirmation of the hypothesis h given evidence e is \(c^*(h, e) = \frac{\mathfrak{m}^*(h \& e)}{\mathfrak{m}^*(e)} = \frac{N+1}{N+2}\).
Using the same reasoning, we may calculate, more generally, the degree of confirmation of the hypothesis that the \((r + 1)\)th individual \(a_{r+1}\) will exhibit property \(P\), i.e., h: ‘\(Pa_{r+1}\)’, given the evidence that \(r\) individuals of the universe of discourse have exhibited so far the same property \(P\), i.e., e: ‘\(Pa_r \& \dots \& Pa_1\)’:

\(c^*(h, e) = \frac{\mathfrak{m}^*(h \& e)}{\mathfrak{m}^*(e)} = \frac{r + 1}{r + 2}\).
These results amount to the celebrated Laplace’s Rule of Succession,
which in Carnap’s theory of inductive logic has become a theorem.
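The \(\mathfrak{m}^*\)-based calculation can be checked by direct summation. In the sketch below (Python; function names are ours), the state descriptions compatible with the evidence are counted combinatorially, and Laplace's rule falls out independently of the size of the universe:

```python
from fractions import Fraction
from math import comb

def m_star(N, k):
    """m* weight of a state description with k positive individuals out of
    N+1: each of the N+2 structure descriptions gets weight 1/(N+2), split
    equally among its C(N+1, k) isomorphic state descriptions."""
    return Fraction(1, (N + 2) * comb(N + 1, k))

def c_star(r, N):
    """c*(Pa_{r+1}, Pa_1 & ... & Pa_r) in a language with N+1 individuals."""
    # m*(e): the r observed individuals are positive, the rest unconstrained.
    m_e = sum(comb(N + 1 - r, k - r) * m_star(N, k) for k in range(r, N + 2))
    # m*(h & e): r+1 individuals fixed as positive, the rest unconstrained.
    m_he = sum(comb(N - r, k - r - 1) * m_star(N, k) for k in range(r + 1, N + 2))
    return m_he / m_e

# Laplace's rule of succession, c* = (r+1)/(r+2), independent of N:
print(c_star(3, 9))    # 4/5
print(c_star(9, 9))    # 10/11
```

Note that `c_star(N, N)` reproduces the special case computed above, \((N+1)/(N+2)\), where the evidence covers all individuals except the predicted one.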
The Continuum of Inductive Methods
In the examples so far, we have examined three different regular c-functions: one determined by arbitrarily assigning weight to state descriptions in \(\mathcal{L}_3\); the other two,
c+, c*, determined by assigning equal weight to state and structure descriptions, respectively, on the basis of the principle of indifference. There are many alternative ways to assign such a weight to the different possibilities and each one of them results in a different regular c-function yielding a different degree of confirmation c(h, e) for a given hypothesis h and evidence e in a language system \(\mathcal{L}\). Thus, there are many different inductive methods, actually, a continuum of such possible methods (Carnap, 1952). For a given language system each inductive method is characterized by the value of a non-negative real parameter \(\lambda\). For a given \(\lambda\) the degree of confirmation c(h, e) is fixed for any hypothesis h and with respect to any evidence e and any two inductive methods have the same \(\lambda\) only if they agree on the value of c(h, e).
To understand how the degree of confirmation is defined in terms of the \(\lambda\)-parameter, we need first to explain the concept of the logical width of a property (1950: 126-127). Consider any language system \(\mathcal{L}_N\) having \(\pi\) unary atomic predicates. We may form molecular predicates by taking conjunctions of \(\pi\) conjuncts, each of which is either an atomic predicate or the negation of one. In this way we form \(\kappa = 2^{\pi}\) molecular predicates (Q-predicates). Then any property \(F\) expressible in \(\mathcal{L}_N\) is represented either by a Q-predicate or by a disjunction of two or more Q-predicates. Logical width characterizes the logical complexity of a property \(F\): the greater the logical width of a property, the greater the number of possible (non-contradictory) properties it admits. For example, the property \(P_1 \vee P_2\) is wider than \(P_1\), since the property \(\sim P_1 \& P_2\) is admitted by the first but excluded by the second. Thus, the logical width of a contradictory property is 0 while the logical width of a property represented by a Q-predicate is 1. Any property \(F\) that is expressed as a disjunction of Q-predicates has a logical width \(w\), \(1 < w \leq \kappa\), equal to the number of disjuncts.
Moreover, the relative width of \(F\) is the ratio \(w/\kappa\). Notice that the relative width varies from 0, for a contradictory property, through \(1/2\), for any property represented by an atomic predicate, to 1 for a logically necessary property.
Let \(e\) be the sentence expressing that, out of \(s\) individuals examined, \(s_F\) had the property \(F\), and let h be the hypothesis that a given individual different from those examined so far also has \(F\); then the degree of confirmation c(h, e) is \(c(h, e) = \frac{s_F + \lambda (w/\kappa)}{s + \lambda}\), where \(s_F/s\) is the relative frequency of observed instances of the property \(F\) and \(\lambda\) a non-negative real number (Burks, 1953). The relative frequency of observed instances, \(s_F/s\), is an empirical fact while the relative width of the property is a logical fact depending on the language system and the predicate that represents the property. Hence, the degree of confirmation is determined as a mixture of a logical factor and of an empirical factor (1952: 24):
\(c(h, e) = (1 - a)\frac{s_F}{s} + a\frac{w}{\kappa}\),

where \(a = \lambda/(s + \lambda)\). If no observation has taken place, i.e., s = 0,
then c(h, e) = \(w / \kappa\), and the degree of confirmation is determined on logical grounds. As the number of
observations increases, the relative frequency of observed instances acquires significance
and the degree of confirmation tends toward \(s_F / s\).
Exactly how fast we learn from experience, that is, how fast c(h, e) tends to \(s_F / s\), depends on \(\lambda\). In the following table we have summarized the degrees of confirmation that correspond to different characteristic values of \(\lambda\):
| \(\lambda\) | c(h, e) |
|---|---|
| \(0\) | \(s_F / s\) |
| \(\kappa\) | \((s_F + w)/(s + \kappa)\) |
| \(\lambda \to \infty\) | \(w / \kappa\) |
For \(\lambda = 0\), we have the straight rule which stipulates that the observed relative frequency is equal to the probability that an
unobserved individual has the property in question. Carnap says that the straight rule is problematic since it yields complete certainty (c = 1),
if all examined individuals are found to possess the relevant property (\(s_F = s\)) – a conclusion that may be accepted if the size s of the sample is quite large but not otherwise (1950: 227). The second row in our table (\(\lambda = \kappa\)) is better interpreted if we assume that our language system consists of one atomic unary predicate only. Then w = 1
and \(\kappa = 2\), and we get Laplace’s rule of succession,
c(h, e) = c^*(h, e). Finally, with the same assumptions about the language system, for \(\lambda \to \infty\) the logical factor reigns and c(h, e) = c^+(h, e) = 1/2, as calculated for equiprobable state descriptions.
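The \(\lambda\)-continuum formula is simple enough to tabulate directly. A sketch (Python; the function name is ours) shows the three characteristic regimes for a one-predicate language:

```python
from fractions import Fraction

def c_lam(s_F, s, w, kappa, lam):
    """Carnap's lambda-continuum: c(h, e) = (s_F + lam * w/kappa) / (s + lam)."""
    return (Fraction(s_F) + Fraction(lam) * Fraction(w, kappa)) / (s + lam)

# One unary predicate (w = 1, kappa = 2), 10 positive instances in 10 trials:
print(c_lam(10, 10, 1, 2, 0))       # 1: the straight rule, complete certainty
print(c_lam(10, 10, 1, 2, 2))       # 11/12: Laplace's rule (lam = kappa)
print(c_lam(10, 10, 1, 2, 10**9))   # ~1/2: the logical factor w/kappa dominates
```

As \(\lambda\) grows, the empirical factor \(s_F/s\) is progressively discounted in favor of the logical factor \(w/\kappa\), exactly as described above.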
How can we decide which of the uncountable infinity of inductive methods is the appropriate one? Carnap’s answer is based on two important elements: (a) adopting an inductive method is a matter of choice that we make; (b) this choice is made on a priori grounds. Carnap agreed with Burks’ suggestion to apply to induction the internal-external distinction concerning the adoption of frameworks (1963: 982).
Thus, while the degree of confirmation for a given hypothesis on given evidence is an internal question, it presupposes the adoption of a c-function, the choice of which is an external one; i.e., it is raised outside any inductive system and has to do with the choice of a framework similar to the choice of a language system.
Carnap counted the specification of c-functions among the semantical rules for languages. Choice of a language was a framework question, a practical choice that could be wise or foolish, and lucky or unlucky, but not true or false.
The pragmatic (i.e., non-cognitive) nature of the scientist’s choice of an inductive method becomes apparent in the passage below:
X may change this instrument [i.e., their inductive method] just as he changes a saw or an automobile, and for similar reasons. (Carnap 1952: 55)
It is up to the scientists to make up their minds and to choose among the inductive methods the one that they feel is the most appropriate for their purposes.
They can change them as they change their automobiles!
Assuming that a choice of an inductive method has been made and a particular c-function has been defined, any statement of the sort “c(h,
e) = p” for specified sentences h, e, is analytic, if true (and contradictory, if false), i.e., its truth or falsity
rests on definition and pure logic. This fact raises additional problems regarding the justification of the applicability of the inductive methods to practical issues: “The question is”, says Salmon
(1966:76), “How can statements that say nothing about any matters of fact serve as ‘a guide of life’?” The observation that non-trivial empirical content is introduced by the synthetic sentence e expressing evidence of past experience, does not improve things very much. For, one may further require a justification of considering past evidence and logico-mathematical facts about the degree of a confirmation as a guide to predictions and our future conduct. On what grounds do we deem such a practice rational? Nevertheless, these last
questions seem to take us outside the limits of any framework, since they are reformulations of the external question about the choice of a particular c-function, and can be answered neither by reason nor by experience.
Where does all this leave Carnap’s project? The project of specifying the inductive logic falls apart. There is no uniquely rational way to determine the relations between evidence and hypotheses. Instead, Carnap’s attitude seems to be captured by the following paraphrase of
Chairman Mao’s famous dictum: ‘Let a hundred inductive methods bloom’.
But even if we were to argue that we end up with a plurality of inductive methods, they would still fall short of being inductive logics. As we saw, the c-function depends on the parameter \(\lambda\). But, as Howson and Urbach (1989: 55) have stated, the very idea of an adjustable parameter \(\lambda\) “calls into question the fundamental role assigned to his systems of inductive logic by Carnap. If their adequacy is itself to be decided empirically, then the validity of whatever criterion we use to assess that adequacy is in need of justification, not something to be accepted uncritically”.
Subjective Probability and Bayesianism
Probabilities as Degrees of Belief
The subjective theory is a theory of inductive probability proposed by the Cambridge Apostle F. P. Ramsey in his paper “Truth and Probability”, written in 1926 and published in 1931, and, independently, by the Italian mathematician Bruno de Finetti, who proposed it somewhat later, in 1928, and published it in a series of papers in 1930. In this conception, probability is the degree of belief of an individual at a given time. The inductive nature of the account is reflected in de Finetti’s (1972: 21) remark that:
[t]he subjectivists … maintain that a probability evaluation, being but a measure of someone’s beliefs, is not susceptible of being proved or disproved by the facts …
A major assumption of the theory is that beliefs, commonly conceived as psychological states, are measurable; otherwise, as Ramsey put it, “all our inquiry will be vain” (1926: 166). Thus, one needs to specify a method of measuring belief for the sentence ‘the degree of belief of X, at time t, is p’ to be meaningful. Ramsey examined two such methods. The first one is based on the fact that the degree of belief is perceptible by its owner, since one ascribes different intensities of feelings of conviction to different beliefs that they hold. However, as Ramsey noted, we do not have strong feelings
for things we take for granted, actually, such things are practically accompanied by no feeling; thus, this way of measuring degree of belief seems inadequate. The second method rests on the supposition that the degree of belief is a causal property and:
the difference [in the degree of belief] seems to me to lie in how
far we should act on these beliefs (ibid: 170).
To measure beliefs as bases for actions Ramsey (ibid: 172)
suggested:
to propose a bet and see what are the lowest odds which… [the agent] will accept.
In a similar vein, de Finetti (1931) characterized probability as “the psychological sensation of an individual” and also suggested using bets to measure degrees of belief.
A bet on a hypothesis h, with betting quotient p, at stake S, bet(h, p, S), is defined by the following conditions:
if hypothesis h is true, the gambler wins (1 − p)S;
if hypothesis h is false, the gambler loses pS,
where p is any real number in the unit interval and S any sum of money.
We say that the odds in a bet on h at stake S are R: Q whenever the betting quotient
p = R/(R + Q).
| h | AGENT PAYS | AGENT RECEIVES | NET PAYOFF FOR THE AGENT |
|---|---|---|---|
| T | pS | S | (1 − p)S |
| F | pS | 0 | −pS |
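The payoff scheme above is straightforward to encode. A minimal sketch (Python; the function names are ours):

```python
def bet_payoff(h_true, p, S):
    """Net payoff for the agent of bet(h, p, S): win (1 - p)S if h is true,
    lose p*S if h is false."""
    return (1 - p) * S if h_true else -p * S

def odds_to_quotient(R, Q):
    """Odds of R:Q on h correspond to the betting quotient p = R/(R + Q)."""
    return R / (R + Q)

# A bet on h at odds 3:1 (betting quotient 0.75) with a $100 stake:
p = odds_to_quotient(3, 1)
print(bet_payoff(True, p, 100))   # 25.0: the agent wins (1 - 0.75) * 100
print(bet_payoff(False, p, 100))  # -75.0: the agent loses 0.75 * 100
```

Higher odds on h mean a betting quotient closer to 1: a smaller win if h turns out true, and a larger loss if it turns out false.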
The actions that measure an agent’s degree of belief in a hypothesis h are the buying and selling of a bet on h. In particular, the degree of belief of an individual X in a hypothesis h is a number p_0 which, expressed in monetary value, $p_0, is (i) the highest price at which X is willing to buy a bet that returns $1 if h is true, and $0 if h is false, and (ii) the lowest price at which X is willing to sell that same bet.
To better understand this definition, consider the set of all bets on h at stake $1. It can be characterized in terms of the betting quotients as follows: \(\{p \in \mathbb{R}: bet(h, p, \$1)\}\). To buy any bet from this collection the bettor should pay $p. But depending on h, the bettor is not willing to pay just any amount of money; on the contrary, they seek to pay as little as possible. The definition assumes that the amount of money the agent is willing to pay to buy the bet is bounded from above and its least upper bound is $p_0. Similarly, the money an agent could earn from selling the bet is bounded from below and the greatest lower bound is also $p_0. This number p_0 is the degree of belief of the agent in h.
On this view, the conditional degree of belief of an individual X in a hypothesis h given some statement e, \(b_X(h|e) = p_0\), is defined in terms of the following bet:

if \(h \& e\) is true, the bettor wins \((1 − p_0)\);

if \(\sim h \& e\) is true, the bettor loses \(p_0\);

if e is false, the bet is called off.

The idea for this bet is that it is called off in case e is false and the agent gets a refund of $p_0 (Jeffrey 2004: 12).
The degree of belief p_0 of an individual X in a hypothesis h is confined within the unit interval. To see this, assume, first, that p_0 < 0 and consider the agent selling a bet to the bookie that pays $1 if h is true, and $0 if h is false, for $p_0. Independently of the truth-value of h, this bet is a loss for the agent: the agent has a net gain of $(−1 + p_0) < 0 in case h is true and $p_0 < 0 in case h is false. In a similar vein, if
p_0 > 1, an agent buying a bet from the bookie that pays $1 if h is true, and $0 if h is false, for $p_0, gains $(1 −
p_0) < 0 if h is true, and $−p_0 < 0 if h is false, and the bet is, again, a loss for the agent. Hence, if an agent assigns to any of their beliefs degrees that are either negative or greater than 1, they are exposed to a betting situation with guaranteed loss independently of the truth or the falsity of that belief. Such an unwelcome bet or set of bets, which “will with certainty result in a loss” (de Finetti, 1974: 87) for the agent, is called a Dutch book. It is conjectured that the term can be traced back to the
introduction of the Lotto game in the Low Countries, at the beginning of the 16th century where in the so-called “Dutch Lotto”, the organizer had, in any event, a positive gain (de Finetti, 2008: 45). Hence, to avoid a Dutch book, one should confine degrees of belief within the interval [0,1].
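The argument that degrees of belief must lie in [0, 1] can be replayed mechanically. In the sketch below (Python; function names are ours), a degree below 0 or above 1 yields a sure loss on a single bet, whatever the truth-value of h:

```python
from fractions import Fraction

def seller_net(p0, h_true):
    """Agent sells, for $p0, a bet that pays the buyer $1 if h is true."""
    return p0 - (1 if h_true else 0)

def buyer_net(p0, h_true):
    """Agent buys, for $p0, a bet that pays the agent $1 if h is true."""
    return (1 if h_true else 0) - p0

# p0 < 0: selling the bet loses money whether h is true or false.
assert all(seller_net(Fraction(-1, 5), t) < 0 for t in (True, False))
# p0 > 1: buying the bet loses money whether h is true or false.
assert all(buyer_net(Fraction(13, 10), t) < 0 for t in (True, False))
print("degrees of belief outside [0, 1] guarantee a loss")
```

For any p0 in [0, 1], by contrast, each side of the bet wins in one of the two cases, so no single bet of this form is a guaranteed loss.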
A degree of belief function b_X is an assignment of degrees of belief to a person X’s beliefs as represented by propositions (or, in a language-dependent context, classes of logically equivalent sentences):

\(S_{\mathcal{L}} \ni h \mapsto b_X(h) \in [0,1]\).
For an agent X with an assignment of degrees of belief described by the function b_X, we may define the expected winnings of a bet(h, p, S) for X as a convex combination of the gains and losses of the agent on this bet, with coefficients determined by their degree of belief in h:

\(EW[bet(h, p, S), X] = b_X(h) \cdot V(h) + (1 - b_X(h)) \cdot V(\sim h)\),
where V(h) is the net payoff for the agent if h is true and V(~h) the net payoff if h is false. To understand this concept, think of V(h) and V(~h) as the possible states in which an agent whose belief function assigns 1 or 0 to h, respectively, expects to be found if the bet offered is accepted. Namely, an agent who is certain of the truth of h expects to gain V(h), while an agent who is certain of the falsity of h expects to gain V(~h) by accepting the bet. If the agent’s belief function assigns any other number in the unit interval to h, they will occupy an intermediate state. Geometrically, V(h) and V(~h) may be thought of as the extremities of a line segment and any other state as a point between these extremities. Next, assume that the agent is placed at the midpoint of the segment, equidistant from its extremities. Then the bet does not give any advantage beforehand to the truth or the falsity of the hypothesis for that particular agent, and it is fair. If the agent’s belief function places them closer to either of the extremities, V(h) or V(~h), then the bet gives an unfair advantage for or against h for this agent. Thus, for b_X(h) = p_0, the expected winnings of a bet(h, p, S) for X are:
(p_0 – p)S
and it measures how fair or unfair the bet is for that particular agent. In this understanding, no commitment to a probabilistic view of the belief function is required. It is sufficient to treat belief quantitatively, to consider the degree of belief in a hypothesis a number in the closed unit interval, and to interpret the values 0 and 1 in terms of belief in the falsity and truth of the hypothesis, respectively.
Accordingly, we may now give the following definitions:
We call bet(h, p, S) fair for X if and only if EW[bet(h, p, S), X] = 0.
We call bet(h, p, S) advantageous for X if and only if EW[bet(h, p, S), X] > 0.
We call bet(h, p, S) disadvantageous for X if and only if EW[bet(h, p, S), X] < 0.
Notice that the Dutch book to which we would be vulnerable were we to consider degrees of belief outside the unit interval is fair, since it is defined in terms of buying and selling bet(h, p_0, S) – a fact that makes its bite even worse.
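As a quick check of these definitions, here is a minimal Python sketch (the function name and the illustrative numbers are ours, not from the text) computing EW[bet(h, p, S), X] = b_X(h)V(h) + (1 − b_X(h))V(~h) and verifying that the bet is fair, advantageous, or disadvantageous exactly as the betting quotient p equals, falls below, or exceeds the agent’s degree of belief:

```python
from fractions import Fraction

def expected_winnings(b_h, p, S):
    """EW[bet(h, p, S), X] for an agent whose degree of belief in h is b_h.

    V(h) = (1 - p)*S is the net payoff if h is true (the bet pays S, its price was p*S);
    V(~h) = -p*S is the net payoff if h is false.
    """
    V_true = (1 - p) * S
    V_false = -p * S
    return b_h * V_true + (1 - b_h) * V_false

b = Fraction(1, 2)  # the agent's degree of belief b_X(h) = p_0
assert expected_winnings(b, Fraction(1, 2), 10) == 0   # p = p_0: fair
assert expected_winnings(b, Fraction(1, 3), 10) > 0    # p < p_0: advantageous
assert expected_winnings(b, Fraction(2, 3), 10) < 0    # p > p_0: disadvantageous
# In every case EW equals (p_0 - p)S:
assert expected_winnings(b, Fraction(1, 3), 10) == (b - Fraction(1, 3)) * 10
```

Exact rational arithmetic (fractions) is used so the fairness comparisons are not blurred by floating-point rounding.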
Dutch Books
Ramsey identified a connection between Dutch books and the laws of mathematical probability. In “Truth and Probability” we read that (1926: 182):
If anyone’s mental condition violated these laws [of probability] … [h]e could have a book made against him by a cunning bettor and would then stand to lose in any event.
And conversely,
Having degrees of belief obeying the laws of probability implies a further measure of consistency, namely such a consistency between the odds acceptable on different propositions as shall prevent a book being made against you (1926: 183).
Instead of Ramsey’s ‘consistency’, de Finetti (1974: 87) spoke of the ‘coherence’ of degrees of belief. The degrees an agent assigns to their beliefs are said to be coherent:
if among the combinations of bets which [y]ou have committed yourself to accepting there are none for which the gains are all uniformly negative.
Thus, if an agent is not vulnerable to a Dutch book with betting quotients equal to their degrees of belief, the agent is said to have coherent degrees of belief. In addition, an agent has coherent degrees of belief if and only if their degrees of belief satisfy the axioms of probability. This is the celebrated Ramsey – de Finetti or Dutch-Book theorem:
Let b_X: S_L \to \mathbb{R} be a degree of belief function of a person X. If b_X does not satisfy the axioms of probability, then there is a family of fair bets bet(h_i, p_i, S_i), with h_i \in S_L, p_i = b_X(h_i) and S_i \in \mathbb{R}, for i = 1, \dots, n (or \infty), which guarantees that the agent will suffer an overall loss, independently of the truth-values of the hypotheses h_i.
The converse of that theorem has also been shown:
Let b_X: S_L \to \mathbb{R} be a degree of belief function of a person X. If b_X satisfies the axioms of probability, then there is no family of fair bets bet(h_i, p_i, S_i), with h_i \in S_L, p_i = b_X(h_i) and S_i \in \mathbb{R}, for i = 1, \dots, n, which guarantees that the agent will suffer an overall loss, independently of the truth-values of the hypotheses h_i.
We have already discussed the application of the Ramsey-de Finetti theorem in the case of violation of the axiomatically imposed constraint that probability values lie within the unit interval. The next example illustrates how an agent will experience an overall loss if they hold degrees of belief that do not comply with the finite additivity axiom.
Consider the tossing of a die and assume that the degrees of belief assigned by a person X to the beliefs that they will obtain ‘6’ in a single toss, ‘3’ in a single toss, and either ‘6’ or ‘3’ are q, r, and k, respectively. Moreover, let k < r + q, so that the finite additivity axiom is violated. Then we may consider the following family of fair bets, offered to the agent:
bet(‘6’, q, 1), bet(‘3’, r, 1), bet(‘6’ or ‘3’, k, -1).
The agent buys from the bookie bet(‘6’, q, 1), which pays $1 if “‘6’ is obtained” is true and $0 if false, for $q. Next, the agent buys the second bet, bet(‘3’, r, 1), which pays $1 if “‘3’ is obtained” is true and $0 if false, for $r. Finally, in the third bet, the agent sells to the bookie bet(‘6’ or ‘3’, k, -1), which pays $1 if “‘6’ or ‘3’ is obtained” is true and $0 if false, for $k. The following table calculates the net gain for the agent in this betting sequence:
‘6’    ‘3’    ‘6’ or ‘3’    Net gain for the agent
T      F      T             k − q − r < 0
F      T      T             k − q − r < 0
F      F      F             k − q − r < 0
As we can see, this sequence of bets results in an overall loss for the agent. Thus, as the Ramsey-de Finetti theorem demands, an agent whose degree of belief function violates the axiom of finite additivity is exposed to a Dutch book.
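The table’s bookkeeping can be replayed mechanically. A minimal sketch, assuming the illustrative values q = r = 1/6 and k = 1/4 (so that k < q + r):

```python
from fractions import Fraction

# Illustrative degrees of belief violating finite additivity: k < q + r.
q, r, k = Fraction(1, 6), Fraction(1, 6), Fraction(1, 4)
assert k < q + r

def net_gain(six, three):
    """Agent's net gain: buys bet('6', q, 1) and bet('3', r, 1), sells bet('6' or '3', k, -1)."""
    gain = (1 if six else 0) - q               # payoff minus price of the first bet
    gain += (1 if three else 0) - r            # payoff minus price of the second bet
    gain += k - (1 if (six or three) else 0)   # price received minus payoff of the sold bet
    return gain

# The three possible outcome patterns of the table ('6' and '3' cannot both be true):
for outcome in [(True, False), (False, True), (False, False)]:
    assert net_gain(*outcome) == k - q - r < 0   # a sure loss: a Dutch book
```

Whatever the die shows, the agent loses exactly q + r − k, which is why the violation of finite additivity is exploitable.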
One could obtain a similar result for the violation of the countable additivity axiom. In this case one needs to employ a countably infinite family of bets. However, a criticism attending such an assumption is that it is unrealistic for any agent to be engaged in infinitely many bets (Jeffrey, 2004: 8).
There have been attempts to extend the requirement of coherence from the synchronic case, as expressed by the compliance of the degrees of belief with the axioms of probability theory, to diachronic coherence by stipulating rules for belief updating. Learning from experience requires that the agent should change their assignment of degree of belief (probability) on a given hypothesis in response to the result of experiment or observation. The simplest, and most common, rule for updating is the following:
In the light of new evidence, the agent should update their degrees of belief by conditionalizing on this evidence.
Thus, assume that the belief function of a person X before new evidence e is acquired is b_X_old, and that b_X_new is the belief function after the acquisition of the new evidence. The transition from the old degrees of belief to the new ones is governed by the rule:
b_X_new(h) = b_X_old(h|e)
where e is the total evidence, and b_X_old(h|e) is the posterior probability as determined by Bayes’s Theorem if we identify the degree of belief function with the probability function.
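A toy numerical sketch of strict conditionalization (the joint degrees of belief are illustrative assumptions, not from the text):

```python
from fractions import Fraction

# A toy assignment of old degrees of belief over the four conjunctions of h and e:
b_old = {('h', 'e'): Fraction(3, 10), ('h', 'not-e'): Fraction(2, 10),
         ('not-h', 'e'): Fraction(1, 10), ('not-h', 'not-e'): Fraction(4, 10)}

b_old_h = b_old[('h', 'e')] + b_old[('h', 'not-e')]   # old degree of belief in h
b_old_e = b_old[('h', 'e')] + b_old[('not-h', 'e')]   # old degree of belief in e

# Strict conditionalization: b_new(h) = b_old(h|e) = b_old(h & e) / b_old(e).
b_new_h = b_old[('h', 'e')] / b_old_e

assert b_new_h == Fraction(3, 4)
assert b_new_h > b_old_h   # in this example, learning e raises the belief in h
# After updating, e itself receives degree of belief 1:
assert b_old[('h', 'e')] / b_old_e + b_old[('not-h', 'e')] / b_old_e == 1
```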
This form of conditionalization is called strict conditionalization, and it takes the probability of the learned evidence to be unity, i.e., b_X_new(e) = 1. Jeffrey found certainty to be a very restrictive condition that does not conform to the uncertainties of real empirical research in science and everyday life. To show this, Jeffrey suggested the example of observing the color of a piece of cloth by candlelight. The agent gets the impression that the observed color is green, but they concede that it may be blue or, less probably, violet. The experience causes us to change our degrees of belief in propositions about the color of the object, but it does not cause us to change them to 1. Hence, strict conditionalization is inapplicable for updating our degrees of belief. Jeffrey suggested another form of conditionalization that tackles the problem, known as Jeffrey-conditionalization (or probability kinematics, as Jeffrey called it), which considers evidence as providing probabilities to a partition of our set of beliefs. In this case, the new degree of belief function is calculated in terms of the old one:

b_X_new(h) = \sum_i b_X_old(h|e_i) p_i,
where \{e_i\} is a partition of our set of beliefs consisting of mutually exclusive and jointly exhaustive propositions, and p_i = b_X_new(e_i), i = 1, \dots, n, are the probabilities assigned to the propositions e_i by the new evidence. As before, b_X_old(h|e_i) is calculated as the posterior probability in Bayes’s Theorem.
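A small sketch of Jeffrey conditionalization on the cloth example (all the numbers are illustrative assumptions, not Jeffrey’s):

```python
from fractions import Fraction

# Partition {e_green, e_blue, e_violet}: old conditional beliefs b_old(h|e_i)
# for some hypothesis h, and the probabilities p_i that the candlelight
# experience confers on the members of the partition.
b_old_given = {'green': Fraction(9, 10), 'blue': Fraction(1, 2), 'violet': Fraction(1, 10)}
p_new = {'green': Fraction(7, 10), 'blue': Fraction(1, 4), 'violet': Fraction(1, 20)}
assert sum(p_new.values()) == 1   # mutually exclusive, jointly exhaustive

# Jeffrey conditionalization: b_new(h) = sum_i b_old(h|e_i) * p_new(e_i).
b_new_h = sum(b_old_given[c] * p_new[c] for c in p_new)
assert b_new_h == Fraction(19, 25)

# Strict conditionalization on 'green' is the special case p_green = 1:
strict = sum(b_old_given[c] * p for c, p in [('green', 1), ('blue', 0), ('violet', 0)])
assert strict == Fraction(9, 10)
```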
One difficulty with Jeffrey’s conditionalization is that, while strict conditionalization provides an assurance of convergence to the truth, Jeffrey’s conditionalization generally does not. There is a family of theorems, known as convergence theorems, the best known being that of Gaifman and Snir (1982), which claim that, under reasonable assumptions, the probability of a hypothesis conditional on the available evidence converges to 1 in the limit of empirical research, if the hypothesis is true. These theorems provide a vindication of Bayesianism, showing that it is guaranteed to find the truth eventually by successively applying strict conditionalization.
Conditionalizing on the evidence is purely logical updating of degrees of belief. It is not ampliative. It does not introduce new content, nor does it modify the old one. It just assigns a new degree of belief to an old opinion. The justification for the requirement of conditionalization is supposed to be a diachronic version of the Dutch-book theorem. It is supposed to be a canon of rationality (certainly a necessary condition for it) that agents should update their degrees of belief by conditionalizing on evidence. The penalty for not doing this is liability to a Dutch-book strategy: the agent can be offered a set of bets over time such that a) each of them taken individually will seem fair to them at the time it is offered; but b) taken collectively, they lead them to suffer a net loss, come what may.
Bayesian Induction
In this context, induction rests on the degree of belief one assigns to a hypothesis given a body of confirmatory evidence and on the process of updating in the light of new evidence. Hence, the problem of justification of induction gives way to the problem of justifying conditionalization on the evidence. In general, Bayesian theories of confirmation maintain the following theses:
Belief is always a matter of degree; degrees of belief are probability values and degree of belief functions are probability functions.

Confirmation is a relation of positive relevance, viz., a piece of evidence confirms a hypothesis if it increases its probability: e confirms h iff p(h|e) > p(h), where p is a probability function. Similarly, we may define disconfirmation of a hypothesis by a piece of evidence in terms of negative relevance (p(h|e) < p(h)), as well as neutrality of a hypothesis with respect to a piece of evidence in terms of irrelevance (p(h|e) = p(h)).

The relation of confirmation is captured by Bayes’s theorem, which dictates the change of the degree of belief in a given hypothesis in the light of a piece of evidence.
The only factors relevant to the confirmation of a hypothesis are its prior probability p(h), the likelihood of the evidence given the hypothesis p(e|h), and the probability of the evidence p(e).

The specification of the prior probability of (a.k.a. prior degree of belief in) a hypothesis is a purely subjective matter.

The only (logical-rational) constraint on an assignment of prior probabilities to several hypotheses is that they obey the axioms of the probability calculus.

The reasonableness of a belief does not depend on its content; nor, ultimately, on whether the belief is made reasonable by the evidence.
Too Subjective?
In 1954, Savage discussed a criticism of subjective Bayesianism based on the idea that science or scientific method aims at finding out “what is probably true, by criteria on which all reasonable men agree” (1954: 67). By applying intersubjectively accepted criteria, scientific method is supposed to lead to an agreement between any two rational agents on the probability of the truth of a hypothesis given the same body of evidence. According to Savage, this demand for intersubjectivity has its source either in considering probabilistic entailment a generalization of logical entailment, or in considering probability an objective property of certain physical systems. Yet, the criticism goes, complete freedom in the choice of prior probabilities for a hypothesis by two agents may yield different posterior probabilities for that hypothesis given the same body of evidence. This compromises the desideratum of intersubjectivity of criteria, since it makes room for the intrusion of idiosyncratic elements, non-cognitive values, or any other source of subjective preferences, reflected in the agents’ disagreement in the choice of priors and, ultimately, in the value of the posterior probability of a hypothesis. Hence, what is “probably true” is not evaluated by “criteria on which all reasonable men agree”. In a nutshell, it is claimed that purely subjective prior probabilities fail to capture the all-important notion of rational or reasonable degrees of belief, and that subjective Bayesianism is too subjective to offer an adequate theory of confirmation.
In defense of subjective probability, Savage claims that, although this view incorporates all the universally acceptable criteria for reasonableness in judgement, these criteria “do not guarantee agreement on all questions among all honest and freely communicating people, even in principle” (ibid); and he does not consider such disagreement a distressing situation. Moreover, anticipating what later became known as convergence-to-certainty or merger-of-opinions theorems, he showed that:
…in certain contexts any two opinions, provided that neither is extreme in a technical sense, are almost sure to be brought very close to one another by a sufficiently large body of evidence. (1954: 68; see also 46f)
Yet, as Hesse (1975; see Earman 1992: 143) objected, Savage’s argument rests on assumptions that are valid for the coin-flipping case but are not typically valid in scientific inference. Gaifman and Snir (1982) proved important results which overcome the limitations of Savage’s account. They showed (Thm. 2.1) that, for an infinite sequence of empirical questions \phi_1, \dots, \phi_n, \dots, formulated in a given language that satisfies certain conditions:
Convergence-to-certainty: The limiting probability of a true sentence \psi in that language, given all the empirical evidence collected in our world w in response to the empirical questions stated, \phi^w_1, \dots, \phi^w_n, \dots, equals 1:

\lim_{n \to \infty} Pr(\psi | \&_{i \leq n} \phi^w_i) = 1.

For a false sentence, the respective limiting probability is 0.
Merger-of-opinions: The distance between any two probability functions that assign probability 0 to the same sentences, i.e., that are equally dogmatic, converges to 0 in the limit of empirical research, i.e., for every sentence \psi,

\lim_{n \to \infty} |Pr_1(\psi | \&_{i \leq n} \phi^w_i) − Pr_2(\psi | \&_{i \leq n} \phi^w_i)| = 0.
The merger-of-opinions theorem is supposed to mitigate the excessive subjectivity of Bayesianism in the choice of prior probabilities: the actual values assigned to prior probabilities do not matter much, since they ‘wash out’ in the long run.
Unfortunately, several criticisms of the theorem showed that the objection of subjectivism is not fully addressed. Let us briefly review some of these criticisms. The first objection concerns the asymptotic character of convergence and merging and the fact that the speed of convergence is unknown: the results do not apply to the divergences of opinion induced by small and medium-sized bodies of evidence, which have practical importance. The second objection concerns the language-dependent nature of the theorems, which restricts them to cases in which the predicates of the language are fixed; the theorems cannot guarantee the washing out of priors assigned by agents in different linguistic contexts, as before and after a scientific revolution.
An important criticism stems from the fact that convergence in the theorems is obtained almost everywhere, i.e., for all worlds w, the actual world included, which belong to some set of possible worlds with probability 1. In the authors’ own words:
… with probability 1, two persons holding mutually nondogmatic
initial views will, in the long run, judge similarly… Also the convergence is guaranteed with probability 1, where “probability” refers to the presupposed prior. (I) and (II) [referring to the two parts of the theorem] form an “inner justification” but they do not constitute a justification of the particular prior.
So, the theorem guarantees convergence to the truth and merging of opinions in every world except for some pathological cases that form sets of worlds of measure zero. But who decides which sets of worlds have measure zero? The Bayesian agents themselves, through their choice of priors, which compels them to assign probability zero to the ‘unpleasant’ scenarios. On these grounds, Earman claims that the “impressiveness of these results disappears in the light of their narcissistic character… ‘almost surely’ sometimes serves as a rug under which some unpleasant facts are swept” (1992: 147).
Extending this criticism, Belot (2013; 2017) has argued that in problems of convergence to the truth there are typical cases – their typicality being defined in a topological sense, without measure-theoretic presuppositions – in which convergence to the truth fails, a fact that a Bayesian agent is bound to ignore by assigning prior probability zero to such cases. Thus, Belot concludes, convergence and merger theorems “constitute a real liability for Bayesianism by forbidding a reasonable epistemological modesty” (2013).
Belot’s arguments have prompted a variety of responses: some philosophers were critical of Belot’s topological considerations as being irrelevant to probability theory (Cisewski et al. 2018; Huttegger 2015). Others focused on imprecise probabilities and finitely additive probabilities to escape the charge of immodesty (Weatherson 2015; Elga 2016; Nielsen and Stewart 2019). Huttegger (2021) has shown, using non-standard analysis, that “convergence to the truth fails with (non-infinitesimal) positive probability for certain hypotheses … [a fact] that creates a space for modesty within Bayesian epistemology.” As regards the countable additivity of the probability function, the convergence-to-certainty and merger-of-opinions theorems rely essentially on this axiom. Prominent subjective Bayesians, on the other hand, such as de Finetti and Savage, explicitly reject the countable additivity axiom despite its theoretical fecundity. Yet Savage, as mentioned above, explored the possibility of theorems that, despite their shortcomings, attempt to mitigate the extreme subjectivism of Bayesianism. Recently, Nielsen (2021) has shown that there are uncountably many merely finitely additive probabilities that converge to the truth almost surely and in probability. As a general comment, we would say that the area of convergence and merger theorems has many open problems left to capture the interest of researchers.
Some Success Stories
Bayesian theory has a record of successful justifications of some important common intuitions about confirmation – such as the belief that a theory is confirmed by its observational consequences or the belief that a theory is better confirmed if subject to strict tests – and it has provided a solution to the famous ‘raven paradox’.
It is straightforward to show that hypotheses are confirmed by their consequences. Assume that h \models e; then the likelihood of e given h is p(e|h) = 1, and according to Bayes’s theorem, p(h|e) = p(e|h) p(h) / p(e) = p(h) / p(e) > p(h), given that e is not trivially true (p(e) < 1); hence, e confirms h. This result justifies the inference of the truth of a hypothesis on the basis of its observational consequences, as the hypothetico-deductive method of confirmation suggests. Although the inference commits the formal fallacy of affirming the consequent, if considered inductively, through the lens of Bayes’s theorem, it is fully justified, and the confirmatory nature of the hypothetico-deductive method is explained. This is what Earman recognized as an important “success story” of the Bayesian approach (1992: 233).
Another common methodological intuition that may be justified on Bayesian grounds is related to the scientific practice of subjecting a hypothesis to severe tests on the basis of its improbable consequences. As Deborah Mayo (2018: 14), following Popper, suggested in her Strong Severity Principle:

We have evidence for a claim C just to the extent it survives a stringent scrutiny. If C passes a test that was highly capable of finding flaws or discrepancies from C, and yet none or few are found, the passing result, x, is evidence for C.
Now, as before, consider a logical consequence e of a hypothesis h, i.e., h \models e. A severe test of h would be one in which p(\sim e) is high and, consequently, p(e) is low. In this case, e would be evidence for h. Hence, a necessary condition for collecting evidence for a hypothesis, according to the aforementioned principle, would be to test its improbable consequences. Indeed, following Bayes’s theorem:
p(h|e) = p(e|h) p(h) / p(e) = p(h) / p(e).
Thus, the more improbable the consequence e is, the greater the degree of confirmation, as measured by the ratio p(h|e)/p(h).
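Both results – hypothetico-deductive confirmation and the premium on severe tests – turn on the same ratio p(h)/p(e). A minimal sketch with illustrative numbers:

```python
from fractions import Fraction

def posterior(p_h, p_e, likelihood=1):
    """Bayes's theorem, p(h|e) = p(e|h) p(h) / p(e); likelihood = 1 when h |= e."""
    return likelihood * p_h / p_e

p_h = Fraction(1, 5)

# HD confirmation: an entailed, non-trivial consequence raises the probability of h.
assert posterior(p_h, Fraction(1, 2)) == Fraction(2, 5) > p_h

# Severity: the less probable the entailed evidence, the larger the
# confirmation ratio p(h|e)/p(h) = 1/p(e).
ratios = [posterior(p_h, p_e) / p_h for p_e in (Fraction(1, 2), Fraction(1, 3), Fraction(1, 4))]
assert ratios == [2, 3, 4]   # the ratio grows as p(e) shrinks
```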
Another piece in the collection of trophies of the Bayesian account is the resolution of the ravens paradox. This is a paradox of confirmation, first noted by Carl Hempel, which took its name from the example Hempel used to illustrate it, viz., ‘all ravens are black’. The paradox emerges from the impossibility of jointly satisfying three intuitively compelling principles of confirmation. The first is Nicod’s principle [named after the French philosopher Jean Nicod]: a universal generalization is confirmed by its positive instances. So, that all ravens are black is confirmed by the observation
of black ravens. Second, the principle of logical equivalence:
if a piece of evidence confirms a hypothesis, it also confirms its logically equivalent hypotheses.
Third, the Principle of relevant empirical investigation:
hypotheses are confirmed by investigating empirically what they assert.
To set up the paradox, take the hypothesis h: All ravens are black. The hypothesis h’: All non-black things are non-ravens is logically equivalent to h. A positive instance of h’ is a white piece of chalk. Hence, by Nicod’s condition, the observation of the white piece of chalk confirms h’. By the principle of equivalence, it also confirms h, that is that all ravens are black. But then the principle of relevant empirical investigation is violated. For, the hypothesis that all ravens are black is confirmed not by examining the colour of ravens (or of any other birds) but by examining seemingly irrelevant objects (like pieces of chalk or red roses). So at least one of these three principles should be abandoned, if the paradox is to be avoided.
To resolve the ravens paradox, a Bayesian may show that there is no problem with accepting all three principles of confirmation, since the degree of confirmation conferred on the hypothesis h by an instance of a non-black non-raven is negligible in comparison with how much the hypothesis is confirmed by an instance of a black raven. [According to Howson and Urbach (2006: 100), a Bayesian analysis could also challenge the adequacy of Nicod’s criterion as a universal principle of confirmation.]
To see this, consider the hypotheses h: \forall x (Rx \to Bx) and h’: \forall x (\sim Bx \to \sim Rx), and the evidence e: Ra \& Ba and e’: \sim Ba \& \sim Ra, which are positive instances of h and h’ respectively. We calculate the ratio p(h|e)/p(h|e’) which, according to Bayes’s theorem and the easily verifiable equality of the likelihoods of e and e’ given h, p(e|h) = p(e’|h), is p(h|e)/p(h|e’) = p(e’)/p(e). But p(e’) >> p(e), because there are very many more things which are non-black non-ravens than black ravens. Hence, p(h|e) >> p(h|e’), i.e., e confirms h a lot more than e’ confirms h’.
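The effect can be made vivid with a toy universe (the counts are illustrative assumptions, not from the text):

```python
from fractions import Fraction

# Toy universe: 1000 objects sampled uniformly, among them 10 black ravens
# and 900 non-black non-ravens.
p_e = Fraction(10, 1000)         # p(e): observing a black raven
p_e_prime = Fraction(900, 1000)  # p(e'): observing a non-black non-raven

# With equal likelihoods p(e|h) = p(e'|h), Bayes's theorem gives
# p(h|e) / p(h|e') = p(e') / p(e):
ratio = p_e_prime / p_e
assert ratio == 90   # e confirms h ninety times as strongly as e' does
```

The white piece of chalk does confirm the hypothesis, but by a negligible amount, which is why all three principles can be retained.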
We close this presentation of subjective probability and Bayesian confirmation theory by referring to what has become known as the old evidence problem. The problem was first identified by Glymour (1980), and it underlines a potential conflict between Bayesianism and scientific practice. Suppose that a piece of evidence e is already known (i.e., it is an old piece of evidence relative to the hypothesis h under test). Its probability, then, is equal to unity, p(e) = 1. Given Bayes’s theorem, it turns out that this piece of evidence does not affect the posterior probability p(h|e) of the hypothesis at all; the posterior probability is equal to the prior probability, i.e., p(h|e) = p(h). This, it is argued, is clearly wrong, since scientists typically use known evidence to support their theories. This is demonstrated by the use of the anomalous precession of Mercury’s perihelion, discovered in the nineteenth century, as confirming evidence for Einstein’s General Theory of Relativity. Therefore, the critics conclude, there must be something wrong with Bayesian confirmation. Some Bayesians have replied by adopting a counterfactual account of the relation between theory and old evidence (Howson and Urbach 2006: 299). Suppose, they argue, that K is the relevant background knowledge and e is an old (known) piece of evidence—that is, e is actually part of K. In considering what kind of support e confers on a hypothesis h, we subtract counterfactually the known evidence e from the background knowledge K. We therefore presume that e is not known and ask: what would the probability of e given K \setminus \{e\} be? This will be less than one; hence, the evidence e can affect (that is, raise or lower) the posterior probability of the hypothesis.
Appendices
Lindenbaum algebra and probability in sentential logic.
In this appendix we show how one can assign probabilities, originally defined in a set-theoretic framework, to sentences in the language of sentential logic, L. We formulate Kolmogorov’s axioms of probability for sentences and state some important theorems.
In particular, consider the set of all well-formed formulas (wffs) of L and define for every wff \phi the equivalence class:
[\phi] = \{\psi: \models_L \phi \equiv \psi\}.
In the set of all equivalence classes S_L, we define set-theoretic operations that correspond to the sentential connectives of the language. Thus, for every two wffs
\phi, \psi:
[\phi] \cup [\psi] = [\phi \vee \psi]
[\phi] \cap [\psi] = [\phi \wedge \psi]
[\phi]^c = [\sim \phi]
[\bot] = \emptyset
[t] = \{wffs of L\}

where “\bot” designates a contradiction and “t” a tautology. Constructed this way, the set of all equivalence classes, S_L, is a field (and a Boolean algebra) (see section 1a), and it is called the Lindenbaum algebra (Hailperin 1986: 30ff.). However, since in the language of sentential logic infinitary operations, like \phi_1 \vee \dots \vee \phi_n \vee \dots, cannot be applied to wffs \phi_i to produce other wffs, we cannot define in S_L the countably infinite union of classes of wffs. As a consequence, S_L is not a \sigma-field, and the probability function that we are about to define does not satisfy countable additivity. So, this is an account of elementary probability theory. To discuss the full axiomatic apparatus of probability theory, one needs to work in richer languages, which for present purposes is not deemed necessary.
So, we can define a probability function p that satisfies Kolmogorov’s axioms (i)-(iii) on S_L, and assign to each singular sentence of the language L the probability value of its equivalence class. Thus, for any sentences a, b and a tautology t of L:
p(a) \geq 0;
p(t) = 1 ;
p(a \vee b) = p(a) + p(b), where a \models_L \sim b.
As for the conditional probability of a sentence a given the truth of a sentence b, we have:
p(a|b) = p(a \wedge b) / p(b), p(b) \neq 0.
It is obvious from the discussion above that logically equivalent sentences have equal probability values:
if \models_L a \equiv b, then p(a) = p(b).
We conclude this appendix with some useful theorems of the probability calculus, which we state in a sentence-based formalism, without proof:
The sum of the probabilities of a sentence and of its negation is 1:

p(\sim a) = 1 – p(a).
Contradictions (\bot) have zero probability:
p(\bot) = 0.
The probability function respects the entailment relation: if a \models_L b, then p(a) \leq p(b).
Probability values range between 0 and 1:
0 \leq p(a) \leq 1.
Finite Additivity Condition:

p(a_1 \vee \dots \vee a_N) = p(a_1) + \dots + p(a_N), where a_i \models_L \sim a_j, 1 \leq i < j \leq N.
Corollary:

If \models_L a_1 \vee \dots \vee a_N and a_i \models_L \sim a_j, 1 \leq i < j \leq N, then p(a_1) + \dots + p(a_N) = 1.
Theorem of total probability:
If p(a_1 \vee \dots \vee a_N) = 1, and a_i \models_L \sim a_j, i \neq j, then p(b) = p(b \wedge a_1) + \dots + p(b \wedge a_N), for any sentence b.
Or in terms of conditional probabilities: If p(a_1 \vee \dots \vee a_N) = 1, a_i \models_L \sim a_j, i \neq j, and p(a_i) > 0 then p(b) = p(b|a_1)p(a_1) + \dots + p(b|a_N)p(a_N), for any sentence b.
Corollary 1: If \models_L a_1 \vee \dots \vee a_N and a_i \models_L \sim a_j, i \neq j, then p(b) = p(b \wedge a_1) + \dots + p(b \wedge a_N).
Corollary 2: p(b) = p(b|c)p(c) + p(b|~c)p(~c), for any sentence c with 0 < p(c) < 1.
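The theorems above can be verified on a small finite model, such as the die example used earlier (the model and the function names are ours, not from the text):

```python
from fractions import Fraction

# A six-outcome model of the die: atomic sentences a_1, ..., a_6, each with p = 1/6.
p_atom = {i: Fraction(1, 6) for i in range(1, 7)}

def p(event):
    """Probability of the disjunction of the atoms in `event` (a set of outcomes)."""
    return sum(p_atom[i] for i in event)

b = {2, 4, 6}                         # 'the outcome is even'
partition = [{1, 2}, {3, 4}, {5, 6}]  # exhaustive, pairwise incompatible

# Theorem of total probability: p(b) = p(b & a_1) + ... + p(b & a_N):
assert p(b) == sum(p(b & a) for a in partition)
# Conditional form: p(b) = p(b|a_1)p(a_1) + ... + p(b|a_N)p(a_N):
assert p(b) == sum((p(b & a) / p(a)) * p(a) for a in partition)
# Finite additivity over the partition: p(a_1) + ... + p(a_N) = 1:
assert sum(p(a) for a in partition) == 1
```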
Bayes’s Theorem. The famous theorem that took its name from the eighteenth-century clergyman Thomas Bayes.

First form (Thomas Bayes):

p(h|e) = p(e|h)p(h) / p(e), where p(h), p(e) > 0,
where p(h|e) is called posterior probability and expresses the probability of the hypothesis h conditional on the evidence e;
p(e|h) is called likelihood of the hypothesis and expresses the probability of the evidence conditional on the hypothesis; p(h) is called prior probability of the hypothesis; and p(e) is the probability of the evidence.
Second form (Pierre Simon Laplace):
If p(h_1 ∨ … ∨ h_N) = 1 and h_i \models_L ~h_j, i ≠ j, and p(h_i), p(e) > 0, then

p(h_k|e) = p(e|h_k)p(h_k) / \sum_{j=1}^{N} p(e|h_j)p(h_j), k = 1, …, N.
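A minimal sketch applying this form to a hypothetical two-hypothesis partition (a fair versus a two-headed coin; the example is ours, not from the text):

```python
from fractions import Fraction

# Hypotheses: the coin is fair (h1) or two-headed (h2), with equal priors;
# the evidence e is 'heads on a single toss'.
priors = {'h1': Fraction(1, 2), 'h2': Fraction(1, 2)}
likelihoods = {'h1': Fraction(1, 2), 'h2': Fraction(1, 1)}   # p(e|h_i)

def posterior(k):
    """Laplace's form: p(h_k|e) = p(e|h_k)p(h_k) / sum_j p(e|h_j)p(h_j)."""
    denominator = sum(likelihoods[j] * priors[j] for j in priors)
    return likelihoods[k] * priors[k] / denominator

assert posterior('h1') == Fraction(1, 3)
assert posterior('h2') == Fraction(2, 3)
assert sum(posterior(k) for k in priors) == 1   # the posteriors sum to 1
```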
Third form:

p(h|e) = p(e|h)p(h) / [p(e|h)p(h) + p(e|\sim h)p(\sim h)], where 0 < p(h) < 1.
A sketch of proof for Laplace’s Rule of Succession
Assume that we want to calculate the probability that the sun will rise tomorrow given that it has risen for the past N days. We have observational data about the sunrise in the past N days, but the probability q of a sunrise is unknown. By application of the principle of indifference, we claim that it is equally likely that the probability of a sunrise be any number q ∈ [0,1]. Hence, the distribution of the probability values of a sunrise is uniform.
We take the sample space to consist of (N+2)-tuples of the following type:

< S, S, …, S, x, q >,

where S and F stand for ‘Success’ and ‘Failure’ of the sunrise respectively, x ∈ {S, F} is the outcome on the (N+1)-th day, and q denotes a possible value for the probability of the sun rising.
The subset of the sample space
E = {< S, …, S, x, q > |x ∈ {S, F} and q ∈ [0,1]},
is a random event consistent with the observations of the sun rising in the past N days, no matter what happens on the (N+1)-th day or what the probability q of a sunrise is.
Since the parameter q takes real values, we should not ask what the probability of a given value k of the parameter q is, but what the probability is that q is found within a given interval:
p(q ≤ k|E).
To calculate this probability, we first apply Bayes’ rule:
p(q ≤ k|E) = p(q ≤ k) ⋅ p(E|q ≤ k) / p(E)
Since all values of q in [0,1] are equiprobable:
p(q ≤ k) = k.
Since the sequence of past sunrises is a sequence of independent trials – whether the sun has risen on a given day does not influence its rising on subsequent days – we have p(E|q) = q^N for a fixed value of q. Averaging over the uniform prior gives:

p(E) = \int_0^1 q^N dq = 1 / (N + 1)

and

p(E|q ≤ k) = (1/k) \int_0^k q^N dq = k^N / (N + 1).

Hence:

p(q ≤ k|E) = k · [k^N / (N + 1)] / [1 / (N + 1)] = k^{N+1}.
From here, we can calculate the probability density function for q = k conditional on E: f(k) = (N + 1) k^N.
The probability that the sun rises on the (N+1)-th day, given that it has risen in the last N days, whatever the probability of a sunrise might be, is then given by the following integral:
∫_0^1 k f(k) dk = (N + 1) ∫_0^1 k^{N+1} dk = (N + 1) [k^{N+2} / (N+2)]_0^1 = (N + 1) / (N + 2)
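The closed form (N + 1)/(N + 2) can be cross-checked numerically (a sketch; the function names are ours):

```python
from fractions import Fraction

def rule_of_succession(N):
    """Laplace's rule: probability of a sunrise on day N+1 after N sunrises."""
    return Fraction(N + 1, N + 2)

# Cross-check the closed form against a midpoint Riemann sum for
# the integral of k * f(k) over [0, 1], with f(k) = (N + 1) k^N.
def integral_estimate(N, steps=100000):
    h = 1.0 / steps
    return sum((N + 1) * ((i + 0.5) * h) ** (N + 1) * h for i in range(steps))

assert rule_of_succession(10) == Fraction(11, 12)
assert abs(integral_estimate(10) - 11 / 12) < 1e-6
assert rule_of_succession(0) == Fraction(1, 2)   # with no data, indifference gives 1/2
```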
The Mathematics of Keynes’s Account of Pure Induction
Consider a generalization h: “all A is B” and n positive instances e_i: “this A is B”, i = 1, …, n, that follow logically from h, i.e., h ⊢ e_i.
Let p(h|K) be the prior probability of h, relative to background knowledge K, before any evidence. Background knowledge is understood as the body of evidence related to the truth of the hypothesis, with the exception of the evidence being considered explicitly. If n positive instances e_i: “this A is B”, i = 1, …, n, and no negative instances have been observed, the posterior probability of h is p(h|e_1& … &e_n&K).
To justify inductive inference, Keynes claims, we need to find the conditions under which the posterior probability increases with the accumulation of positive instances and the absence of negative instances, so that the inductive argument is strengthened and, in the limit of empirical investigation, hypothesis h can be inferred with certainty on the basis of empirical evidence:
lim_{n→∞} p(h|e_1& … &e_n&K) = 1.
From Bayes’s theorem we have:
p(h|e_1& … &e_n&K) = p(h|K) p(e_1& … &e_n|h&K) / p(e_1& … &e_n|K). (2)
Since h ⊢ e_i, the likelihood is p(e_1& … &e_n|h&K) = 1 and, by the theorem of total probability,
p(e_1& … &e_n|K) = p(h|K) p(e_1& … &e_n|h&K) + p(~h|K) p(e_1& … &e_n|~h&K). (3)
Hence, by (2) and (3):
p(h|e_1& … &e_n&K) = p(h|K) / (p(h|K) + p(e_1& … &e_n|~h&K) p(~h|K)).
If p(h|K) is not zero and lim_{n→∞} p(e_1& … &e_n|~h&K) = 0, the requested condition of asymptotic certainty, lim_{n→∞} p(h|e_1& … &e_n&K) = 1, is satisfied. Since p(h|K) is the prior probability of the hypothesis, which is independent of the evidence accumulated, it is a fixed number.
Hence, the antecedent of the aforementioned conditional can be split into the following two conditions:
p(h|K) ≠ 0 (4)
and lim_{n→∞} p(e_1& … &e_n|~h&K) = 0. (5)
Condition (5) can be analyzed in terms of the probability of a positive instance e_j given j − 1 positive instances for h,
e1& … &e_{j-1}, and that h is false:
p(e_j|e1& … &e_{j-1}&~h&K) = q_j, j = 2, … , n
p(e1|~h&K) = q1.
The probability of n positive instances and no negative instances given that h is false is:
p(e1& … &en|~h&K) = q1 · … · qn.
Let M_n = max{q_1, … , q_n} < 1; then p(e_1& … &e_n|~h&K) = q_1 ⋯ q_n ≤ M_n^n. The sequence {M_n} is bounded. If M = sup M_n and 0 < M < 1, then for every n ∈ ℕ,
p(e_1& … &e_n|~h&K) ≤ M_n^n ≤ M^n,
and (5) follows:
lim_{n→∞} p(e_1& … &e_n|~h&K) ≤ lim_{n→∞} M^n = 0.
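The effect of the bound M < 1 on the posterior can be illustrated numerically. A minimal sketch with invented numbers: the prior p(h|K) = 0.01 and the uniform bound M = 0.95 are assumptions, not values from the text.

```python
# Sketch of Keynes's convergence condition: if each likelihood
# q_j = p(e_j | e_1 & ... & e_{j-1} & ~h & K) stays below M < 1,
# the posterior p(h | e_1 & ... & e_n & K) tends to 1.
p_h = 0.01   # prior p(h|K); any nonzero value works
M = 0.95     # uniform bound on the q_j

def posterior(n):
    # p(e_1&...&e_n | ~h & K) <= M**n; since h entails each e_i,
    # the likelihood under h is 1.
    p_e_not_h = M**n
    return p_h / (p_h + p_e_not_h * (1 - p_h))

for n in (0, 10, 100, 500):
    print(n, round(posterior(n), 6))

assert posterior(500) > 0.999999  # asymptotic certainty
```

With no evidence the posterior equals the prior; as n grows, the bound M^n drives the denominator's second term to zero and the posterior to 1.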
By contraposition we infer that if condition (5) is not satisfied, {M_n} is not bounded by any number M with 0 < M < 1. Thus, for every such M there is an n_0 ∈ ℕ such that M_{n_0} > M. Since M_{n_0} = max{q_1, … , q_{n_0}}, there is a k ∈ ℕ, k ≤ n_0, such that:
1 > p(e_k|e_1& … &e_{k-1}&~h&K) = q_k > M.
Since this holds for every M < 1, the probabilities q_k come arbitrarily close to 1:
lim p(e_k|e_1& … &e_{k-1}&~h&K) = 1. (6)
Hence, if (5) is false, then (6) holds. But it is reasonable to demand that a negative instance of h, ~e_k, should have non-zero probability, no matter how many positive instances have been observed, given the falsity of h. Thus, Keynes (1921: 275) suggested that (6) is false:
[given that] the generalisation is false, a finite uncertainty as to
its conclusion being satisfied by the next hitherto unexamined instance which satisfies its premiss.
Or, as Russell commented, referring to condition (5), “[i]t is difficult to see how this condition can fail in empirical material” (1948: 455).
Keynes justified the second condition, (4), by applying the principle of limited independent variety and the principle of indifference (see sections 3.a.1, 3.a.2).
According to the principle of limited independent variety, qualities are classified into a finite number of groups so that two qualities that belong to the same group have the same extension, i.e., they are satisfied by the same individuals, and, in this sense, they are equivalent. More precisely, [A] is the set of all qualities equivalent to A; it includes all qualities B for which (∀x)(Ax ≡ Bx).
Thus, generalization h is entailed logically by the assumption that A and B are equivalent properties. Moreover, the principle of limited variety requires that the number of independent, inequivalent qualities be finite. Hence, if n is the number of independent qualities, by the principle of indifference we conclude that the probability that any two properties A, B belong to the same group is 1/n. Since h is a logical consequence of this fact, by a well-known theorem in probability theory (see section 1.a), p(h|K) ≥ 1/n, with n a fixed counting number. But this is exactly what the demand for finite prior probability, condition (4), requires.
Belot, G., (2017). “Objectivity and Bias”. Mind 126(503): pp.655-695.
Bernoulli, J., (1713 [2006]). The Art of Conjecturing. Baltimore: The Johns Hopkins University Press.
Boole, G., (1854). An Investigation of the Laws of Thought, on Which Are Founded the Mathematical Theories of Logic and Probabilities. London: Walton and Maberly.
Burks, A.W., (1953). “Book Review: The Continuum of Inductive Methods. Rudolf Carnap.” Journal of Philosophy 50 (24):731-734.
Carnap, R., (1950). Logical Foundations of Probability.
London: Routledge and Kegan Paul, Ltd.
Carnap, R., (1952). The Continuum of Inductive Methods.
Chicago: University of Chicago Press.
Carnap, R., (1963). “Replies and Systematic Expositions”. In
Schilpp, P.A., (ed.). The Philosophy of Rudolf Carnap. Library of Living Philosophers,
Volume XI. Illinois: Open Court Publishing Company, pp.859-999.
Carnap, R., (1971). “A basic system of inductive logic, I”. In
Jeffrey, R., and Carnap, R., (eds.). Studies in Inductive Logic and Probability. Los Angeles: University of California Press. pp.
34-165.
Carnap, R., (1980). “A basic system of inductive logic, II”. Jeffrey, R., (ed.). Studies in Inductive Logic and Probability. Berkeley: University of California Press. pp. 2-7.
Childers, T., (2013). Philosophy and Probability. Oxford: Oxford University Press.
Cisewski, J., Kadane, J. B., Schervish, M. J., Seidenfeld, T. and Stern, R., (2018). “Standards for Modest Bayesian Credences”. Philosophy of Science, 85(1): pp. 53-78.
de Finetti, B., (1931). “Probabilismo. Saggio critico sulla teoria delle probabilità e sul valore della scienza”. In: Logos. Napoli: F. Pezzella, pp.163-219. English translation in Erkenntnis 31 (1989): pp.169-223.
de Finetti, B., (1936). “Statistica e Probabilita nella concezione di R. von Mises”. Supplemento Statistico ai Nuovi Problemi di Politica, Storia ed Economia Anno II, Fasc.2-3, pp. 5-15.
de Finetti, B., (1972). Probability and Induction. The art of guessing. London: Wiley.
de Finetti, B., (1974). Theory of Probability: A Critical Introductory Treatment. Chichester: Wiley.
de Finetti, B., (2008). Philosophical Lectures on
Probability, collected edited and annotated by A. Mura.
Springer.
Earman, J., (1992). Bayes or Bust: A critical examination of Bayesian Confirmation Theory. Cambridge, Massachusetts – London,
England: The MIT Press.
Elga, A., (2016). “Bayesian Humility”. Philosophy of
Science, 83: pp. 305–23.
Ellis, R.L., (1842). “On the Foundations of the Theory of Probability”. In The Mathematical and Other Writings of Robert Leslie Ellis, 1862. Cambridge: Deighton, Bell, and Co. pp. 1-11.
Gaifman, H., and Snir, M., (1982). “Probabilities Over Rich Languages, Testing and Randomness”. The Journal of Symbolic Logic, 47(3), pp. 495-548.
Gillies, D., (2000). Philosophical Theories of
Probability. London and New York: Routledge.
Gnedenko, B.V., (1969 [1978]). The Theory of
Probability. Moscow: Mir Publishers.
Goodman, N., (1955 [1981]). Fact, Fiction and Forecast.
Cambridge, MA: Harvard University Press.
Hájek, A., (2019). “Interpretations of Probability”, The Stanford Encyclopedia of Philosophy (Fall 2019 Edition), Edward N. Zalta (ed.), URL = <https://plato.stanford.edu/archives/fall2019/entries/probability-interpret/>.
Hacking, I., (1971). “Equipossibility Theories of Probability”. The British Journal for the Philosophy of Science, 22 (4), pp. 339-355.
Hacking, I., (1975 [2006]). The Emergence of Probability: A philosophical study of early ideas about probability induction and statistical inference. Cambridge: Cambridge University Press.
Hailperin, T., (1986). Boole’s Logic and Probability.
Amsterdam: North-Holland.
Hausdorff, F., (1914 [1957]). Set theory. New York:
Chelsea Publishing Company.
Hempel, C.G., (1945). “Studies in the logic of confirmation, I”. Mind 54 (213), pp. 1-26.
Hempel, C.G., (1945). “Studies in the logic of confirmation, II”. Mind 54 (214), pp. 97-121.
Hesse, M., (1975). “Bayesian Methods and the Initial Probability of Theories”. In, Maxwell, G. and Anderson, R.M., (eds). Induction, Probability and Confirmation. Minnesota Studies in the Philosophy of
Science, vol.6. Minneapolis: University of Minnesota Press.
Hilbert, D., (1902). “Mathematical Problems”. Bull. Amer.
Math. Soc. 8 : pp. 437- 479
Howson, C. and Urbach, P., (1989/2006). Scientific Reasoning:
The Bayesian Approach. Chicago and La Salle, Illinois: Open Court.
Humphreys, P., (1985). “Why Propensities cannot be
Probabilities”. The Philosophical Review 94(4)
pp.557-570.
Huttegger, S. M. (2015). “Bayesian Convergence to the Truth and the Metaphysics of Possible Worlds”. Philosophy of Science, 82:
pp. 587–601.
Huttegger, S. M. (2021). “Rethinking Convergence to the Truth”. The Journal of Philosophy 119: pp. 380–403
Jeffrey, R., (1992). Probability and the Art of
Judgement. Cambridge: Cambridge University Press.
Jeffrey, R., (2004). Subjective Probability: The Real Thing. Cambridge: Cambridge University Press.
Kolmogorov, A. N. (1933 [1950]). Foundations of the Theory of Probability. New York: Chelsea Publishing Company
Keynes, J. M., (1921). A Treatise on Probability. London: Macmillan and Co., Limited.
Lakatos, I., (1968). “Changes in the problem of inductive logic”. In Lakatos, I., (ed.), The Problem of Inductive Logic: Proceedings of the International Colloquium in the Philosophy of Science, London, 1965, vol.2. Amsterdam: North Holland Pub. Co. pp.315-417.
Laplace, P. S., (1814 [1951]). A Philosophical Essay on Probabilities. New York: Dover Publications, Inc.
Leibniz, G. W., (1678 [2004]). “On Estimating the Uncertain”. The Leibniz Review 14.
Maher, P., (2006). “The Concept of Inductive Probability”. Erkenntnis 65, pp.185–206.
Nielsen, M., (2021). “Convergence to Truth without Countable Additivity”. Journal of Philosophical Logic, 50: pp. 395–414.
Nielsen, M. and Stewart, R.T., (2019). “Obligation, permission and Bayesian orgulity”. Ergo 6(3).
Popper, K., (1959). “The Propensity Interpretation of Probability”. The British Journal for the Philosophy of Science, 10, (37), pp. 25-42.
Ramsey, F. P., (1926). “Truth and Probability”. In The Foundations of Mathematics and other Logical Essays. London and New York: Routledge (1931), pp. 156-198.
Reichenbach, H., (1934 [1949]). The Theory of Probability: An
Inquiry into the Logical and Mathematical Foundations of the Calculus of Probability. Berkeley and Los Angeles: University of California
Press.
Russell, B., (1948 [1992]). Human Knowledge—Its Scope and
Limits. London: Routledge.
Salmon, W. C. (1966). The Foundations of Scientific
Inference. Pittsburgh: University of Pittsburgh Press.
Savage, L. J. (1954 [1972]). The Foundations of
Statistics. New York: Dover Publications. Inc.
Shackel, N. (2007), “Bertrand’s Paradox and the Principle of Indifference”. Philosophy of Science, 74 (2), pp. 150–175.
Venn, J., (1888). The Logic of Chance. London: Macmillan and Co
von Mises, R., (1928 [1981]). Probability, Statistics and
Truth. New York: Dover Publications, Inc.
von Mises, R., (1964). Mathematical Theory of Probability and
Statistics. London and New York: Academic Press.
Eternalism is a metaphysical view regarding the nature of time. It posits the equal existence of all times: the past, the present, and the future. Every event, from the big bang to the heat death of the universe, including our births and deaths, is equally real.
Under standard eternalism, temporal locations are somewhat akin to spatial locations. No place is exclusively real. When someone says that they stand ‘here’, it is clear that the term ‘here’ refers to their position; what lies ‘back’ and ‘in front’ exists as well. Eternalists stress that ‘now’ is indexical in a similar way: the present is no more and no less real than the past and the future. Events are classified as past, present, or future only from some perspective.
Eternalism is contrasted with presentism, which maintains that only present things exist and with the growing block view (also known as possibilism or no-futurism), which holds that past and present things exist but not future ones. The moving spotlight view suggests that all times exist but that the present is the only actual time. This view can be termed eternalist, but it preserves a non-perspectival difference between past, present, and future by treating tense as absolute. Additionally, the moving spotlight view retains some characteristics of presentism by maintaining that the ‘now’ is unique and privileged.
Broadly speaking, presentism is a common-sensical view, and so aligns with our manifest image of time. This view is, however, at odds with the scientific image of time. The primary motivation for eternalism arises from orthodox interpretations of the theories of relativity. According to them, simultaneity is relative, not absolute. This implies that there is no universal ‘now’ stretched out across the entire universe. One observer’s present can be another’s past or future. If the universe is a four-dimensional spacetime, then all events exist unconditionally.
The classical argument for eternalism was devised in the 1960s by Rietdijk in 1966 and Putnam in 1967, with subsequent follow-ups by Maxwell in 1985, Penrose in 1989, and Peterson and Silberstein in 2010. This argument and its ramifications remain the subject of ongoing debate. They shall be further explored in this article, including their relation to other issues in the metaphysics of time and the philosophy of physics.
It is usually thought that presentism, and perhaps the growing block view, is a better match with our common-sensical idea of time than eternalism. It is obvious that the present is different from the past and the future. As you are presently reading these sentences, your reading is real, at least in comparison to what you did a long time ago or will do a long time from now. There is, however, a problem. If the ‘now’ is exclusively real, that moment must be entirely universal. The ‘now’, the moment your eyes skim through these lines, is the same ‘now’ as the ‘now’ in other regions of the universe. Independently of where we are, the ‘now’ is the same. If this is true, simultaneity is absolute. Both presentism and the growing block view assume the absoluteness of simultaneity. There is a knife-edge present moment stretched throughout the entire universe. That universal now is the boundary of all that exists. According to presentism, what was before that moment and what lies ahead of that moment does not exist. According to no-futurism, only what lies ahead of that moment does not exist.
This common-sensical picture is in tension with the special theory of relativity. This theory underlies the two central pillars of contemporary physics: the general theory of relativity and quantum field theory. Whether we are dealing with gravitational effects or high-energy physics, time dilation is prevalent. This result is of central importance to eternalism, as time dilation goes hand in hand with the relativity of simultaneity. Simultaneity differs across frames of reference. If in some frame of reference the time difference between two events is zero, the events are simultaneous. In another frame, the time difference between those same events is not zero, so the events are successive. They might be successive in a different order in a third frame.
The classical argument for eternalism hinges on this result. Before delving into this argument more explicitly in Section 2, let’s consider some intuitions that arise from relative simultaneity. Imagine three observers witnessing two events in a room with one door and a window. The first observer stands still and sees the window and the door open simultaneously. In their frame, the two events are simultaneous, that is, happening now. The second observer moves toward the window and away from the door. For them, the window opens now, and the opening of the door is in their future. The third observer moves toward the door and away from the window. For them, the door opens now, and the opening of the window is in their future. Thus, the three observers disagree on what happens now. Someone’s ‘now’ can be someone else’s future, and it can also be that someone’s ‘now’ is someone else’s past.
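The three observers’ disagreement can be checked with the Lorentz transformation t′ = γ(t − vx/c²). The following sketch uses illustrative distances and a (deliberately extreme) speed of my own choosing; the article gives no figures.

```python
import math

c = 299_792_458.0  # speed of light, m/s

def t_prime(t, x, v):
    """Time coordinate of event (t, x) in a frame moving at velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c**2)

# In the standing observer's frame, the window (x = +5 m) and the
# door (x = -5 m) open at the same time t = 0.
window, door = (0.0, 5.0), (0.0, -5.0)

v = 0.5 * c  # observer moving in the window's direction
dt_window = t_prime(*window, v)
dt_door = t_prime(*door, v)
# For this observer the door's opening is assigned a later time
# than the window's: the two events are no longer simultaneous.
assert dt_window < 0 < dt_door

# The observer moving the other way gets the opposite temporal order.
assert t_prime(*door, -v) < 0 < t_prime(*window, -v)
```

The sign of v·x/c² flips with the direction of motion, which is exactly why the two moving observers disagree about which opening comes first.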
Another way to motivate eternalism is to imagine, for the sake of the argument, an alien standing still in a faraway galaxy, billions of light-years from us. I am now practically motionless as I am typing these sentences. Let’s imagine we could draw a now-slice that marks the simultaneity of the alien standing and me sitting. Provided we do not move, we share the same moment ‘now’. Then the alien starts to move, not too fast, away from me. The alien’s now-slice is now tilted toward my past. With a great enough distance between us, even a relatively slow motion would carve up spacetime so that the alien’s now-slice would no longer match mine: it could match Napoleon’s invasion of the Russian Empire. If the alien turns in the opposite direction, their now-slice would be tilted toward my future. Their ‘now’ would be – well, that we do not know. Perhaps a war with sentient robots on Earth?
According to the general theory of relativity, gravitational time dilation causes time in one location to differ from time in another. Clocks closer to massive sources of gravity tick slower than those farther away. This phenomenon is evident even in simple scenarios, such as two clocks in a room – one high on the wall and the other lower on a table – showing different times. According to presentism, there are no different times. There is one and only one time, the present time. However, the potential number of clocks is indefinite, leading to countless different times. This is clearly in tension with presentism. There are many different times, not just one unique time.
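The size of the effect for the two clocks in a room can be estimated with the weak-field formula Δf/f ≈ g·Δh/c². The two-metre height difference below is an illustrative assumption.

```python
# Order-of-magnitude estimate of gravitational time dilation
# between two clocks separated in height.
g = 9.81           # m/s², surface gravity
c = 299_792_458.0  # m/s, speed of light
dh = 2.0           # wall clock 2 m above the table clock

fractional_shift = g * dh / c**2
print(fractional_shift)  # ~2e-16: the higher clock runs faster

# Accumulated difference over a year: tiny but nonzero.
seconds_per_year = 365.25 * 24 * 3600
print(fractional_shift * seconds_per_year)  # a few nanoseconds
```

A shift of a few parts in 10^16 is below everyday notice, but modern optical clocks can resolve height differences of this order, so the "two clocks in a room" scenario is not merely hypothetical.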
At first sight, eternalism is backed up by empirical science; we are not dealing with a purely a priori argument. Time dilation is apparent in a plethora of experiments and applications that utilize relativity. These include, for example, the functioning of CERN’s Large Hadron Collider in Geneva, Switzerland, the detection of muons at the Earth’s surface, and GPS technology. The important point for motivating eternalism is that the empirical evidence for the existence of time dilation in nature is extensive and well-corroborated. Metaphysicians with a naturalist bent have reason to take the ramifications of relativity seriously.
b. Central Definitions and Positions
The different metaphysics of time can be clarified by introducing reality values. The assumption is that an event has either a reality value of 1 (to exist) or 0 (to not exist). An event does not hover between existing and non-existing (it is however possible to connect the truth values of future-tensed statements to probabilities as the future might be open and non-existent; more about this in Section 4.d).
Eternalism
Temporal location    Past    Present    Future
Reality value        1       1          1

Presentism
Temporal location    Past    Present    Future
Reality value        0       1          0

Growing block, possibilism, no-futurism
Temporal location    Past    Present    Future
Reality value        1       1          0
As the moving spotlight view differs in terms of actuality, it is appropriate to add an actuality value on its own row:
Moving spotlight
Temporal location    Past    Present    Future
Reality value        1       1          1
Actuality value      0       1          0
Presentism, the growing block view, and the moving spotlight view all treat tenses as absolute, whereas eternalism treats them as indexical. The former hold that the distinction between past, present, and future is absolute, whereas the latter takes it to be perspectival. The former are instances of the A-theory of time, whereas the latter is a B-theory or a C-theory of time, as shall be clarified in Section 3.a. This article focuses on eternalism, so not much will be said about the other metaphysics of time. To that end, see Section 14.a “Presentism, the Growing-Past, Eternalism, and the Block-Universe” of the Internet Encyclopedia of Philosophy article on Time, as well as the section on A-theory and B-theory.
2. The Classical Argument for Eternalism
a. History of the Concept
It is far from clear who was the first to use the concept of “eternalism”. It is not even clear where to look in the first place. Frege’s late 19th-century theory of propositions has been interpreted in eternalist terms, as propositions have an eternal character (O’Sullivan 2023). Hinton published the essay “What is the Fourth Dimension?” in the 1880s. This was perhaps a precursor to eternalism, as the idea of the fourth dimension is closely tied to the idea of four-dimensional spacetime. It has been suggested that Spinoza in the 17th century (Waller 2012) and Anselm around 1100 (Rogers 2007) argued for an eternalist ontology. Spinoza held that all temporal parts of bodies exist, thus anticipating a perdurantist account of persistence, a view that aligns well with the existence of all times. Anselm thought that God is a timeless, eternal being (a view also held by his predecessors Augustine and Boethius) and that all times and places are under his immediate perception. Past, present, and future are then not absolute distinctions but relative to a temporal perceiver. The history of the philosophy of time certainly stretches farther in time and place; it might very well be that eternalism, or a position close to it, was conceptualized by ancient philosophers in the West and the East.
Considering the contemporary notion of eternalism and the debates within current metaphysics of time, the special theory of relativity includes the essential ingredients for eternalism. Although the theory has forerunners, it is typically thought to originate in Einstein’s 1905 article “On the Electrodynamics of Moving Bodies” and in Minkowski’s 1908 Cologne lecture “Space and Time.” We can assume that the earliest relevant publications concerning eternalism that draw on relativity came out in the first quarter of the twentieth century.
The classical argument for eternalism was formulated in the 1960s by Rietdijk and Putnam, independently of each other, though neither used the notion of eternalism explicitly. In the 1980s, Maxwell and Penrose argued along the same lines, likewise without using the notion. Rietdijk’s and Putnam’s predecessors, like Williams (1951) and Smart (1949), did not invoke eternalism explicitly either. Surprisingly, not even Russell, who is known for his tenseless theory of time, mentions eternalism in his 1925 exposition of relativity, The ABC of Relativity. The last chapter of that book is “Philosophical consequences,” in which one would expect something to be said about the ontology of time.
It is also worth mentioning one famous quote of Einstein, drawn from a letter to the widow of his long-time friend Michele Besso, written shortly after Besso’s death (he died on March 15, 1955, and Einstein’s letter is dated March 21, 1955). The letter reads, “Now he has departed from this strange world a little ahead of me. That signifies nothing. For those of us who believe in physics, the distinction between past, present, and future is only a stubbornly persistent illusion.” (Mar 2017: 469). It is not difficult to find Internet forums in which this part of the letter is characterized as eternalist. To dub this eternalism is, however, to read a personal, poetic letter out of its context. Moreover, the letter is cryptic; it is by no means an explicit endorsement of a philosophical position, nor should one expect such an endorsement from a personal, moving letter.
There are quite a few 20th-century philosophers, physicists, and mathematicians, like Cassirer, Eddington, Einstein, Grünbaum, Gödel, Lewis, Minkowski, Quine, Smart, and Weyl, who have endorsed a view akin to eternalism (Thyssen 2023, 3). Yet the central argument and motivation for contemporary eternalism comes from the Rietdijk-Putnam-Maxwell-Penrose argument, which will be denoted the ‘classical argument’ below. The most recent extensive defense of this same idea comes from Peterson and Silberstein (2010).
b. The Argument
The two important notions required for the eternalist argument are reality value and reality relation. Reality values, or, better for logical notation, r-values, represent the ontological status of any event: 1 denotes a real event, 0 an unreal event. An ideal spacetime diagram that represented everything from the beginning to the end would record all events with a reality value of 1 and none with a reality value of 0. This starting point omits further values like “possibly real” or “potentially real in the future,” which become relevant in the discussion about the openness of the future in quantum physics (Section 4.d). The uniqueness criterion of reality means that an event has only one reality value; an event having two different reality values, 1 and 0, would be a contradiction. Reality relations, for their part, apply to events that share the same reality value. They can be translated into equally-real relations: when two events are equally real, they stand in a reality relation to each other (Peterson and Silberstein 2012, 212).
If events A, B, and C are equally real, then ArBrC. The reality relation is reflexive, symmetric, and transitive. Reflexivity stipulates that ArA, since A has a unique reality value. Symmetry stipulates that if ArB then BrA, since A and B have the same reality value. Transitivity stipulates that if ArB and BrC, then ArC, since A, B, and C all have the same reality value. The transitivity condition is the most controversial (see Section 4.a).
As noted already in the introduction, the relativity of simultaneity is of the utmost importance for eternalism. So is the idea of four-dimensional spacetime, and concepts related to spacetime diagrams, something originally introduced by Minkowski in 1908.
Figure 1. Minkowski light cones
Only events lying outside light cones, that is, spacelike separated events that are in each other’s absolute elsewhere, may be simultaneous. These events are not causally related, and no signals may traverse between them. If one can establish that two spatially separated events are connected with a hyperplane of simultaneity, they are simultaneous. Hyperplanes do not apply to lightlike (the edges of the cones) or timelike (the observer’s worldline) separated events. Events in the observer’s past or future light cones cannot be simultaneous. Different observers do not agree on the temporal order of spacelike separated events. Two or more events happen at the same time in different places, according to some observers, but not according to all.
Figure 2. The classical argument illustrated with spacetime diagrams.
In Figure 2, we have three events, A, B, and C, and two observers, that is, two inertial frames of reference: observer 1’s axes are marked in blue, observer 2’s in red. A and B are spacelike separated from each other, as are A and C. B and C are timelike separated in relation to each other. Provided that one may establish a hyperplane of simultaneity among spacelike separated events, A is simultaneous with B in observer 1’s frame, and A is simultaneous with C in observer 2’s frame. Using s to denote simultaneity, we may write: AsB for observer 1, and AsC for observer 2.
The classical argument assumes that events have unique r-values. Physical events exist independently of observers; they are located somewhere in spacetime. Whether a coordinate system is designated does not matter to the existence of an event. Moreover, separate, distinct events do not affect each other in any way. We may assume that distant events are equally real. If this assumption is correct, then simultaneous events are equally real: AsB should align with ArB, and AsC with ArC.
To spell out the eternalist argument:
Premise 1: AsB
Premise 2: AsC
Premise 3: AsB → ArB
Premise 4: AsC → ArC
From Premises 1–4: ArB ∧ ArC
Conclusion (by the symmetry and transitivity of r): BrC
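The argument’s appeal to the equivalence properties of r can be rendered as a toy formalization. The encoding below, with r modelled as sameness of reality value, is an illustrative assumption rather than anything in the literature.

```python
# Toy formalization of the classical argument: events carry unique
# r-values, and r holds between events with the same r-value.
reality = {"A": 1, "B": 1, "C": 1}  # each event has a unique r-value

def r(x, y):
    """Reality relation: x and y are equally real."""
    return reality[x] == reality[y]

# AsB holds in frame 1, AsC in frame 2; premises 3 and 4 say that
# simultaneity implies equal reality.
simultaneous = {("A", "B"), ("A", "C")}
for x, y in simultaneous:
    assert r(x, y)  # ArB and ArC

# r is reflexive, symmetric, and transitive because equality is,
# so BrC follows from ArB and ArC.
assert r("B", "A") and r("A", "C")
assert r("B", "C")
```

The derivation of BrC is automatic here precisely because sameness of a value is an equivalence relation; critics who reject transitivity (Section 4.a) are in effect rejecting this modelling of r.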
The presentist or growing blocker cannot accept the above conclusion. They accept the premises concerning simultaneity within the present, but add that B and C are not simultaneous and hence not equally real:
Premise 1: AsB
Premise 2: AsB → ArB
Premise 3: ¬BsC
Premise 4: ¬BsC → ¬BrC
Conclusion: ArB ∧ ¬BrC
The presentist and growing blocker both agree that the present is completely universal: what occurs now in any place of the universe occurs at the exact same moment as in any other place. Every existing thing is simultaneous with every other existing thing. Everything that exists exists now. The ‘now’ becomes redundant with such claims, as “happens now” simply means “happens”; happening outside of the present has no reality value. All existing events are simultaneous. If B occurs now, C does not occur, as it does not yet exist. From the eternalist viewpoint, B and C are equally real. If the classical argument for eternalism that draws on the relativity of simultaneity is sound, then presentism and the growing block view are committed to BrC ∧ ¬BrC. That would render these doctrines contradictory and absurd.
Penrose (1989) presents a similar eternalist argument. Imagine two observers, Bob and Alice, who pass each other at normal human walking speeds. Alice walks toward the Andromeda galaxy, Bob in the opposite direction. Andromeda is about 2×10^19 kilometers (roughly 2.5 million light-years) from Earth. Stipulating that the Earth and Andromeda are at rest with respect to each other (which they actually are not, as was also the case in the alien example discussed in Section 1.a), Alice’s and Bob’s planes of simultaneity intersect Andromeda’s worldline at events about five days apart. Imagine then that an Andromedan civilization initiates a war against Earth and decides to launch its attack at a time that falls between Alice’s and Bob’s planes of simultaneity. This means that the launch is in Alice’s past and in Bob’s future. The launch of the space fleet is then an unavoidable event. An event can be in one observer’s past (Alice’s), in another observer’s future (Bob’s), and in some observers’ present (for the Andromedans at home, the launch occurs now as the fleet takes off).
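Penrose’s figure can be checked with the back-of-envelope formula Δt = v·d/c² for the shift of a moving observer’s plane of simultaneity at distance d. The walking speed of 1.4 m/s below is an assumed value.

```python
# Back-of-envelope check of the Andromeda example.
c = 299_792_458.0  # m/s
d = 2.4e22         # Andromeda's distance in metres (~2.5 million ly)
v = 1.4            # m/s, ordinary walking speed (assumed)

dt_seconds = v * d / c**2
print(dt_seconds / 86_400)  # ≈ 4 days; a brisker pace gives ~5

# Alice and Bob walk in opposite directions, so their planes of
# simultaneity at Andromeda differ by roughly twice this amount.
```

Despite the tiny ratio v/c, the enormous distance makes the simultaneity shift span days, which is what gives the example its force.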
So far, we have focused on the reality values of events. The Minkowski coordinate system also enables us to assess tenses: the present at the origin, and the past and the future in the light cones. How one argues for the direction of time is a huge topic of its own; it will not be dealt with here. Here it is assumed that future light cones point toward later times and past light cones toward earlier times. We may stick with the same events, A and C, as in Figure 2. Say there is some observer at event A: for them, event A occurs now, and event C is in their future. For another observer at event C, the event C occurs now, while A is in their past.
Figure 3. On the left: Future event C for an observer at A when A occurs now for them. On the right: Past event A for an observer at C when C occurs now for them.
Let’s say there is yet another event, not previously mentioned: D. This event occurs now for an observer located at D. A is in their past, and C in their future.
Figure 4. For an observer at D, D occurs now,
A is in the past, and C in the future.
This brings us to the semantic argument for eternalism initiated by Putnam. He contrasts his position with Aristotle’s on future contingents. Putnam reads Aristotle as an indeterminist: statements about potential future events do not have truth values at the present time. Putnam maintains that Aristotle’s theory is obsolete, as it does not fit with relativity. The semantic argument can be clarified with the aid of Figure 4 above. When an observer at A utters the statement, “Event D will occur,” and an observer at C utters the statement, “Event D did occur,” one would expect both statements to have definite truth-values. Claims about future or past events are either true or false. Provided a physical event exists at some spacetime location, it does not matter in which spacetime location the observer who utters the existence claim is located. The occurrence of physical events is not a subjective matter. From the four-dimensional perspective, D’s occurrence has a definite truth-value grounded in its definite reality value.
Putnam’s argument can be bolstered by truthmaker semantics. For something to be true about the world, there has to be something on the side of the world, perhaps a fact, state of affairs, being, or process, that makes the statement, assertion, proposition, or theory about that aspect of the world true. In the case of physical events like D, that event itself would be the truthmaker for an existence claim like “Event D occurs at a given location in spacetime.” The truthmaker does not depend on the contingent spacetime location in which the existence claim is uttered. Even if past, present, and future are frame-relative, the physical event itself is not. Unlike tensed predicates (past, present, future), truthmakers (like an event) are not indexical. Armstrong (2004, chapter 11), for one, supports eternalism, or omnitemporalism, as he calls it, based on a truthmaker theory. Eternalism does not face some of the difficulties that presentism has about truthmaking, including postulating truthmakers in the present, finding them outside of time, or accepting non-existents as truthmakers.
3. Eternalism in Relation to Other Metaphysical Issues
a. Dynamic A-theory and Static B-theory
An exposition of the A-theory and the B-theory is provided in this Internet Encyclopedia of Philosophy article. In short, A-theorists think time is structured into past, present, and future. B-theorists think time is structured according to earlier than, simultaneous with, and later than relations. There is also the C-theory of time, which maintains that time is structured according to temporal in-betweenness relations. The A-theory is typically called dynamic; the B/C-theory is static. Presentists, growing blockers, and moving spotlighters are A-theorists. Eternalists are typically, but not always, B-theorists. A-theorists maintain that time passes, while B/C-theorists deny the passage of time.
Mellor (1998) is a B-theorist who denies that events could be absolutely past, present, or future. Under his theory, properties like being past, being present, and being future do not exist. When statements about the pastness, presentness, or futureness of events are made, they can be reformulated using the resources of the B-theory. That an event e happened in the past means that e is earlier than the time of utterance t. That an event e is happening now means that e is located at t. That an event e will happen means that e is later than t. Tensed sentences are switched into tenseless sentences.
Eternalism and the B-theory are typically categorized as the static theory of time. In this view, there is no passage of time in the sense that the future approaches, turns into the present, and then drifts off into the past. The four-dimensional world is thought to be changeless. This is based on the following consideration (Price 1996, 13). We are misled by imagining the universe as a three-dimensional static spatial block, with time treated as an external dimension. However, in the framework of four-dimensional spacetime, time is not extrinsic but intrinsic. Time is one dimension of spacetime. Each clock measures proper time, the segment of the clock’s own trajectory in spacetime. Along the observer’s timelike worldline, events are organized successively, perhaps according to earlier than and later than relations. Over and above this temporal order, there is no temporal passage.
The proponents of eternalism typically do not admit temporal flux, an objective change in A-properties, to be part of reality. Grünbaum (1973) vehemently criticized the idea of the passage of time. In his view, relativity does not permit postulating a transient now. The now denoting present time is an arbitrary zero, the origin of temporal coordinates at the tip of Minkowski light cones. Absolute future and absolute past are events that take place earlier than or later than the arbitrarily chosen origin, the present moment. Relativity allows events to exist and sustain earlier than t or later than t relations, not any kind of objective becoming. In his view, organisms are conscious of some events in spacetime. Organisms receive novel information about events; there is no coming into existence and then fading away.
b. Passage of Time
A great many philosophers (not only philosophers of physics like Grünbaum) are and have been against the idea of passage of time. Traditionally, logical a priori arguments have been laid against passage. These can be found in very early philosophical sources, like Parmenides’ fragments. There is an inconsistency involved in thinking about passage. If the future, which is nothing, becomes the now, which is something, then this existing now becomes the past, which is nothing. How can nothing become something and something become nothing? How can non-existing turn into existing, and existing disappear into non-existing?
At first sight, eternalism is inconsistent with passage. If all times, past, present, and future exist, then the future does not come to us, switch into the ‘now’, and then disappear into the past. No thing comes into existence; no thing goes out of existence. All entities simply be, tenselessly. How does eternalism deal with the passage of time? There are different strategies for answering this question. 1) Passage is an illusion. We might experience a passage, but this experience is mistaken. 2) We believe that we experience passage, but we are mistaken in that belief. 3) Although orthodox relativistic eternalism points toward the B-theory and the perspectivity and indexicality of tense, the moving spotlighters disagree. They maintain that passage is a genuine feature of reality. 4) There is a passage of time, but that passage is something different from change along the past–present–future. Deflationary theories treat passage as a succession of events. The first two options are anti-realist about passage, while the last two are realist.
i. The Illusion Argument
According to Parmenides, reality lacks time and change in general. Parmenidean monism suggests that the one and only world, our world, is timeless. Our experience of things changing in time is an illusion. All-encompassing antirealism about time is not currently popular. Yet the classic article for contemporary debates, McTaggart’s “The Unreality of Time” from 1908, is antirealist. Somewhat like Parmenides, McTaggart maintained that describing the world with tensed concepts is illogical. Past, present, and future are incompatible notions. An event should be either past, present, or future. It would be contradictory to claim that an event occupies more than one tensed location in time. But that is what would follow if time passed. Perhaps the distinction introduced by the A-series “is simply a constant illusion of our minds,” surmises McTaggart (1908, 458).
In addition to aprioristic reasoning, there are empirical cases to be made for the illusion argument. There are various motion illusions. The color phi phenomenon might be taken to lend support to the argument that passage of time is an illusion. In the case of color phi, we wrongly see a persistent colored blob moving back and forth. It appears to change its color. Our experience is not veridical. There is no one blob changing its color, but two blobs of different colors. There is no reciprocating motion to begin with. We somehow construct the dynamic scenario in our experience. Perhaps we also create the animation of the flow of time from the myriad of sensory inputs. Another motion illusion: Say someone spins rapidly multiple times. After they stop spinning, the environment seems to move around them. It does not; that is the illusion. This is caused by the inner ear’s fluid rotation. It could be that the flux of time is a similar kind of phenomenon.
ii. Error Theory
Temporal error theory is the following claim: Our belief in the experience of time passing from the future to the present and to the past is false. Temporal error theory can be challenged by considering the origin of our temporal beliefs. Where does our belief in the passage of time come from, if not from a genuine experience of passage, of a very real feeling of time passing by? Torrengo (2017, 175) puts it as follows: “It is part of the common-sense narrative about reality and our experience of it not only that time passes, but that we believe so because we feel that time passes.”
One common metaphor is the flowing river. It is not difficult to find inspirational quotes and bits of prose in which time is compared to the flowing of water (although it is difficult to authenticate such sources!):
“Time flows away like the water in the river.” – Confucius
“Everything flows and nothing abides; everything gives way and nothing stays fixed.” – Heraclitus
“Time is like a river made up of the events which happen, and a violent stream; for as soon as a thing has been seen, it is carried away, and another comes in its place, and this will be carried away too.” – Marcus Aurelius
“River is time in water; as it came, still so it flows, yet never is the same.” – Barten Holyday
“Time is a river without banks.” – Marc Chagall
“I wanted to change my fate, to force it down another road. I wanted to stand in the river of time and make it flow in a different direction, if just for a little while.” – April Genevieve Tucholke
Perhaps we absorb metaphors of a flowing river from fiction and from our ways of using language more generally. Miller, Holcombe, and Latham (2020) speculate that all languages are at least to some degree passage-friendly. That is how we come to mis-describe our phenomenology of time. This approach does not imply that we tacitly conceptualize the world as containing passage, and then come to describe our experience as including time’s passing. Instead, “we only come to tacitly conceptualize the world as containing passage—and hence to believe that it does—once we come to deploy passage-laden language,” Miller, Holcombe, and Latham (2020, 758) write. By this conceptualization, we not only believe time to be passing, but we also describe our temporal phenomenology in terms of time’s passing. From an error-theoretic point of view, this means that we mis-describe our temporal experience.
Note that the error theory should be separated from the illusionist thesis. In the case of illusions, we humans erroneously observe something to be what it is not. There are, for example, well-known optical illusions. Take the finger sausage illusion. Put two index fingers close to your eyes, and you see an individual “sausage” floating in the middle. There is no sausage floating in the air. The illusion is that you really see the floating sausage. We know the mechanism that produces the finger sausage illusion: the gaze directions of the two eyes are merged, and the brain corrects for this by suppressing one end of the finger. According to the error theory, passage of time is not an illusion because we do not experience time flowing in the first place. We rather falsely believe and describe our temporal phenomenology by using passage-friendly and passage-laden language.
iii. Moving Spotlight
Broad originally expressed (1923, 59) the idea of a moving spotlight: “We imagine the characteristic of presentness as moving, somewhat like the spot of light from a policeman’s bull’s-eye traversing the fronts of the houses in a street.” The illuminated part is the present moment, what was just illuminated is the past, and what so far has not been illuminated is the future. Broad remained critical of this kind of theory. He thought originally in eternalist terms, but his metaphysics of time changed to resemble the growing block view of temporal existence (Thomas 2019).
The moving spotlight theory is a form of eternalism. The past, the present, and the future all exist, yet there is objective becoming. Only one time is absolutely present. That present “glows with a special metaphysical status” (Skow 2009, 666). Cameron (2015, 2) maintains both privileged present and temporary presentness. The former is a thesis according to which there is a unique and privileged present time. The latter is a thesis according to which this objectively special time changes. In other words, for the moving spotlighter, temporal passage is a fundamental feature of reality. The moving spotlight view therefore connects the A-theory with eternalism.
iv. Deflationary Passage
The deflationary theory agrees with traditional anti-realism about passage. There is no unique global passage and direction of time across the entire universe. There is no A-theoretic, tensed passage. There are however local passages of time along observers’ timelike worldlines. Fazekas argues that special relativity supports the idea of “multiple invariant temporal orderings,” that is, multiple B-series of timelike related events. She calls this “the multiple B-series view.” Timelike related events are the only events that genuinely occur successively. “So,” in the view of Fazekas (2016, 216), “time passes in each of the multiple B-series, but there is no passage of time spanning across all events.”
Slavov (2022) argues that the passage of time is a relational, not substantial, feature of reality (see the debate between substantivalism and relationism). Over and above the succession of events, there is no time that independently flows. This should fit with the four-dimensional block view. It, Slavov (2022, 119) argues,
contains dynamicity. Time path belongs to spacetime. The succession of events along observers’ timelike worldlines is objectively, although not uniquely, ordered. One thing comes after another. The totality of what exists remains the same, but there is change between local parts of spacetime regions between an earlier and a later time.
Passage requires that temporally ordered events exist and that there is change from an earlier time to a later time. This is how Mozersky describes his deflationist view in an interview: “such a minimal account captures everything we need the concept of temporal passage to capture: growth, decay, aging, evolution, motion, and so forth.” Growing, decaying, aging, evolving, and moving are all related to change.
c. Persistence and Dimensions of the World
How do things survive change across time? At later times, an object is, however slightly, different from what it used to be at an earlier time. Yet that object is the same object. How is this possible? There are two major views about persistence: endurantism and perdurantism (for a much more detailed and nuanced analysis of persistence, see this article). The former maintains that objects are wholly present at each time of their existence. The objects have spatial parts, but they are not divisible temporally. The eternalists typically side with the latter. Perdurantism is the view that objects are made of spatial and temporal parts, or more specifically, spatiotemporal parts. Most humans are composed of legs, belly, and head in the same way as most humans are composed of childhood, middle age, and old age. Ordinary objects are so-called spacetime worms; they stretch out through time like earthworms stretch out through space (Hawley 2020).
Endurantism is a three-dimensional theory. An object endures in three-dimensional space. It just sits there, occupying a place in three-dimensional Cartesian space. Time is completely external and independent of the enduring object. Endurantism assumes that the three spatial dimensions are categorically different from the temporal dimension. Perdurantism, for its part, is a four-dimensional theory of persistence. Space and time cannot be completely separated. Bodies do not endure in time-independent space. Rather, objects are composed of spatiotemporal parts. Earlier and later parts exist, as they are parts of the same object. The perdurantist view aligns with eternalism and relativity (Hales and Johnson 2003; Balashov 2010).
Figure 5. Temporal parts of an object in spacetime.
The perdurantist explains change in terms of the qualitative difference among different parts. Change occurs in case an object has incompatible properties at different times. How are the temporal parts connected? What makes, for instance, a person the same person at different times? Lewis (1976a) mentions mental continuity and connectedness. The mental states of a person should have future successors. There should be a succession of continuing mental states. There is a bond of similarity and causal dependence between earlier and later states of the person.
Perdurantism fits nicely with eternalism. It posits the existence of all temporal parts and times, and it is consistent with the universe having four dimensions. Considering humans, every event in our lives, from our births to our deaths, is real.
d. Free Will and Agency
In his article “A Rigorous Proof of Determinism Derived from the Special Theory of Relativity,” Rietdijk (1966) argues that special relativity theory indicates fatalism and negates the existence of free will. Consider Figure 2. An observer at event B should be able to influence their future. So, they should be able to influence how C will unfold. However, for another observer at event A, event C is simultaneous with A, suggesting that event C is fixed and unalterable. Yet, C lies in the absolute future of the observer at B. C is predetermined. It is an inevitable event, akin to the Andromedan space fleet in Penrose’s example. This notion poses a threat to at least some conception of free will. If the future must be open and indeterminate for agents to choose between alternative possibilities, a relativistic block universe does not allow for such openness.
There are reasons to think that eternalism does not contradict free will. Let’s assume that the future is fixed. It exists one way only. Statements about future events are true or false. Suppose, Miller explains,
It is true that there will be a war with sentient robots. In a sense, we cannot do anything about that; whatever we in fact do, the war with the robots will come to pass. But that does not mean that what you or I choose to do makes no difference to the way the world turns out or that somehow our choices are constrained in a deleterious manner. It is consistent with it being the case that there will be a war with sentient robots, that the reason there is such a war is because of what you and I do now. Indeed, one would expect that the reason there is such a war is in part because we build such robots. We make certain choices, and these choices have a causal impact on the way the world is. These choices, in effect, bring it about that there is a war with the robots in the future. Moreover, it is consistent with the fact that there will be such a war, that had we all made different choices, there would have been no war, and the facts about the future would have been different. The future would equally have been fixed, but the fixed facts would have been other than they are. From the fact that whatever choices we in fact make, these lead to a war with the robots, it does not follow that had we made different choices, there would nevertheless have been a war with the robots (Miller 2013, 357–8).
The future condition of later local regions of the universe depends on the state of their earlier local regions. We, as human agents, have some degree of influence over how things will unfold. For instance, as I compose this article in the 2020s, our actions regarding climate will partly shape the climate conditions in the 2050s. We are not omnipotent, and our understanding of consequences, especially those far into the future, is somewhat uncertain. However, even if it is a fact that the future exists in only one definite way, this does not inherently exclude free will or the causal relationship between actions and their consequences. Subscribing to eternalism does not resolve the debate over free will; one can be a fatalist or affirm the reality of free will within an eternalist framework.
One weakness to note about Rietdijk and Penrose’s arguments, at least if they are used to deny freedom of the will, is their focus on spacelike separated events. These events exist beyond any conceivable causal influence. It is obvious that if a distant civilization, with which we have no communication or interaction, decides to attack us, we cannot influence that decision. Events occurring in regions of the universe beyond our reach remain indifferent to our capacity to make free choices. What truly pertains to freedom of choice are events lying within the future light cone of the observer, not those outside or within the past light cone. Norton provides an illustrative example with mini spacetimes:
Figure 6. Causal connectibility.
Drawing based on Norton (2015, 211).
An observer at the origin of their path in O may only influence events that lie within the multiplicity of the future light cones, along the line toward event E, as they are timelike separated from them. There is no way to affect anything toward the spacelike separated event F. Only future timelike or lightlike separated events can be affected, as only then does the influence stay within the cosmic speed limit, the speed of light in vacuum. Action from O to S would require the observer to surpass that maximum speed.
An important asymmetry within the eternalist framework is between perception and action. We may never perceive the future or affect the past. When perceiving something, we always perceive the past. An event, distant from the observer, occurs, and then there is a time-lag during which the information gets transmitted to and processed by the observer. The event causing the perception occurs before its perception. Actions, for their part, are always directed toward the future. All of this can be translated into B-language (Mellor 1998, 105). We may affect what happens after a time t but not what happens before it. We may perceive what happens before a time t but not what happens after it. Stated so abstractly, this characterization might be misleading. To clarify, imagine time t as the time when we have lunch. Our breakfast occurs before t, and our dinner happens after t. At breakfast, we can influence what we are going to have for lunch, but we cannot observe it yet. At dinner, we can observe what we had for lunch, but we can no longer influence it.
Traditionally, philosophies of time akin to eternalism employ an all-knowing being that can see all times. Around a thousand years ago, Anselm argued that God is timeless, and so the entire world is immediately present to Him. This indicates every place and every time is under his immediate perception. All times from the beginning to the end are real. This is a tenseless view of time, which treats past, present, and future as perspectives relative to a temporal perceiver (Rogers 2007, 3). A more modern, science fiction example could be a being who can intuit the four-dimensional spacetime. That kind of being could somehow see the past, the present, and the future equally. The movie Men in Black 3 portrays a being like this, an alien named Griffin. He sees a past baseball game and a future attack of the movie’s villain in the same way as the present.
There is, however, a notable difficulty when it comes to observing the future. Perceiving the future would require turning the temporal asymmetry of causation around. It is hard to understand how causation would function in perception if one could observe events occurring later in time. For example, observing something outdoors requires light originating from the Sun to be reflected towards the observer’s eyes. The photons that strike the retina are eventually transduced into electric charges, which then navigate their way through the brain, resulting in the creation of a visual experience. Reversing the temporal direction in this process would be extremely weird. It would imply that the charges in the brain are transformed into light particles, which are then emitted from the eyes towards the object on Earth, subsequently traveling back towards the Sun and initiating physical processes there.
e. The Possibility of Time Travel
At first sight, presentism cannot accommodate time travel because, according to presentism, there are no various times in which one could travel. There is only the present time, but no other times that we could, even in principle, access. Past objects and future objects do not exist, so we cannot go and meet them, just like we cannot meet fictional beings. Not everyone, however, thinks that presentism could not deal with time travel (see Monton 2003).
For its part, eternalism is, in principle, hospitable to the idea of time travel. If the entire universe exists unconditionally, all spacetime with its varying regions simply be. We could travel to different times because all times exist. Traveling to different spatial locations is made possible by the existence of all spatial locations and the path between them. Four-dimensionally, there is a timelike path between different spacetime locations.
Travel to the future is in some sense a trivial idea: we are going toward later times all the time. If writing this encyclopedia article takes x months of my time, then between starting and finishing it I have moved x months farther from my birth and x months closer to my death. Time dilation is consistent with future time travel in another way. An observer traveling close to the speed of light or situated close to a black hole will age more slowly than an observer staying on Earth. After such a space journey, when they get back home to Earth, the traveler will have traveled into the future.
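The slower aging of the traveler follows from the special-relativistic time dilation formula, τ = t·√(1 − v²/c²), where τ is the traveler’s proper time and t the elapsed Earth time. A minimal sketch, with an illustrative speed and trip duration that are not from the source:

```python
import math

def traveler_proper_time(earth_time, v_fraction_of_c):
    """Proper time elapsed for a traveler moving at v = v_fraction_of_c * c,
    per the time dilation formula tau = t * sqrt(1 - v^2/c^2)."""
    return earth_time * math.sqrt(1.0 - v_fraction_of_c ** 2)

# Illustrative numbers: a 10-year round trip (in the Earth frame) at 0.8c.
earth_years = 10.0
tau = traveler_proper_time(earth_years, 0.8)
print(f"Traveler ages {tau:.1f} years while Earth ages {earth_years:.1f} years.")
# Traveler ages 6.0 years: on return, they have in effect skipped
# 4 years ahead into Earth's future.
```

The closer v is to c, the smaller the factor √(1 − v²/c²), and the further into Earth’s future the traveler arrives upon return.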
The question about traveling into the past, like to regions of spacetime that precede our own births, is more controversial. If closed timelike curves are possible, then at least in principle time travel to times earlier than our births is possible. This raises interesting questions about what we could do in our pasts. Can we go on and kill our grandparents? Lewis published a famous article in the 1970s, in which he argued that travel to the past is possible, and there is nothing paradoxical about it.
4. Objections
Eternalism has faced numerous strands of criticism. Typical objections concern eternalism’s putative incapability of dealing with change and free will (Ingram 2024). The issue of changelessness was tackled in Section 3.a, and the issue of free will in Section 3.d. Below, five more objections are presented, including possible answers to them.
a. Conventionality of Simultaneity
The conventionality of simultaneity poses a challenge to the classical argument for eternalism (see Ben-Yami 2006; Dieks 2012; Rovelli 2019). The conventionality of simultaneity was something already noted by Poincaré in the late 1800s and by Einstein in the early 1900s. If we are dealing with two spatially separated places, or spacelike separated events, how can we know that these places share the same time and that these events happen at the same time? How do we know that clocks at different places show the same time? How do we know that the ’now’ here is the same ‘now’ in another place? How do we know that the present extends across space even within a designated inertial frame?
Here we have the problem of synchronization. Say we could construct two ideal clocks with the exact same constitution separated by a distance AB. According to Einstein’s proposal, we may send a ray of light from location A to B, which is then reflected from B to A. The time the signal leaves from A is tA, which is measured by an ideal clock at A. The time it gets to B and bounces back is tB, and that time is measured by an identical clock at B. The time of arrival back at A, measured by the clock at A, is tA’. The two clocks are in synchrony if tB – tA = tA’ – tB.
In his Philosophy of Space and Time (1958), Reichenbach went on to argue that simultaneity of two distant events is indeterminate. He adopted Einstein’s notation but added the synchronization parameter ε. The definition of synchrony becomes tB = tA + ε(tA’ – tA), 0 < ε < 1. If the speed of light is isotropic, that is, the same in all directions, then ε = 1/2. However, because the constancy of the one-way speed of light is a postulate based on a definition, not a brute fact about nature, the choice of the synchronization parameter is conventional (Jammer 2006, 176–8). Provided that simultaneity is a matter of convention, different choices of the synchronization parameter ε yield different simultaneity relations. If hyperplanes are arbitrary constructions, arguments relying on ontological simultaneity and co-reality relations become questionable. Conventionality implies that spacelike separated events are not even in a definite temporal order. If this is correct, the classical eternalist argument does not even get off the ground.
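Einstein’s criterion and Reichenbach’s ε-parametrized generalization can be put side by side in a short sketch. The clock readings below are hypothetical, chosen only to make the arithmetic transparent:

```python
def clock_b_reading(t_A, t_A_return, eps):
    """Reichenbach's convention: assign the clock at B the reading
    t_B = t_A + eps * (t_A' - t_A) for the reflection event,
    where 0 < eps < 1 is the conventional synchronization parameter."""
    assert 0.0 < eps < 1.0
    return t_A + eps * (t_A_return - t_A)

# Hypothetical readings: light leaves A at t_A = 0 and returns at t_A' = 2.
t_A, t_A_return = 0.0, 2.0

standard = clock_b_reading(t_A, t_A_return, 0.5)     # Einstein's choice, eps = 1/2
nonstandard = clock_b_reading(t_A, t_A_return, 0.3)  # an equally admissible choice

# Einstein's synchrony criterion t_B - t_A == t_A' - t_B holds only for eps = 1/2.
# (The example values are chosen so the float comparisons are exact.)
print(standard, standard - t_A == t_A_return - standard)           # 1.0 True
print(nonstandard, nonstandard - t_A == t_A_return - nonstandard)  # 0.6 False
```

Each admissible ε assigns a different event at A as “simultaneous with” the reflection at B, which is the sense in which distant simultaneity is conventional.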
The conventionality objection relates to the issue of transitivity. Some relations are clearly transitive. They form chains in which transitivity holds. If A is bigger than B and B bigger than C, then A is bigger than C. Yet if A is B’s friend and B is C’s friend, it does not necessarily follow that A is C’s friend. How about simultaneous events being equally real? Do the premises AsB → ArB and AsC → ArC hold in the first place? They should, if we are to truthfully infer that ArB ∧ ArC → BrC. If distant simultaneity is a matter of convention, there seems to be no room for implying that events happening at the same time share the same reality value. Moreover, per Reichenbach’s causal theory of time, only causally connectable events lying in the observer’s light cones are genuinely temporally related. Outside light cones, we are in the regions of superluminal signals. What lies outside light cones is in principle not temporally related; spacelike separated events are neither simultaneous (occurring now) nor past or future. There is no fact of the matter as to which order non-causally connectable events occur. This applies to a great many events in the universe. As the universe is expanding, there are regions that are not causally connected. The different regions have the Big Bang as their common cause, but they are not otherwise affecting each other. They are therefore not temporally related.
Although conventionalism can be laid against eternalism, it can also be taken to support eternalism. Presentism and the growing block view require that the present moment is universal. There should be a unique, completely universal present hyperplane that connects every physical event in the universe. In that case, it should be true that the present time for the observer on Earth is the same time as in any other part of the universe. This means that it also should be true that a time that is past for an observer on Earth is past for an observer at any other place in the universe. Likewise, a future time for an observer on Earth, which both the presentist and no-futurist think does not yet exist, does not exist for any observer anywhere. Presentism and the growing block view both should accept the following to be true: “The moment a nuclear reaction in Betelgeuse occurs, the year 2000 on Earth has either passed or not.” If we consider the relativity of simultaneity, in some frames the year 2000 has passed at the time the reaction occurs, but in some other frames it has not. If we consider the conventionality of simultaneity, the statement in question is not factual to begin with. For the non-eternalist, the statement must be true; for the eternalist, it is false (based on the relativity of simultaneity) or it is not truth-apt (based on the conventionality of simultaneity). In both cases, the statement in question is not true, as the passing of the year 2000 and the reaction occurring have no unique simultaneity relation.
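The frame dependence invoked here can be illustrated with the Lorentz transformation of the time coordinate, t′ = γ(t − vx/c²). In the sketch below (units where c = 1; the coordinates and velocity are illustrative, not from the source), two events that are simultaneous in one frame fail to be simultaneous in another:

```python
import math

def lorentz_t(t, x, v, c=1.0):
    """Time coordinate of event (t, x) in a frame moving at velocity v,
    per the Lorentz transformation t' = gamma * (t - v*x/c^2)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c ** 2)

# Two spacelike separated events, simultaneous (t = 0) in the rest frame:
event_here = (0.0, 0.0)     # e.g., an event on Earth
event_there = (0.0, 5.0)    # e.g., a distant event

# In a frame moving at 0.6c, they are no longer simultaneous:
t1 = lorentz_t(*event_here, v=0.6)
t2 = lorentz_t(*event_there, v=0.6)
print(t1, t2)  # t1 stays 0.0, while t2 is about -3.75: the distant
               # event now occurs earlier, not at the same time
```

Relativity of simultaneity thus already undermines a unique present hyperplane; conventionality of simultaneity goes further and denies that the simultaneity of spacelike separated events is factual at all.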
b. Neo-Lorentzianism
Historically, Lorentz provided an ether-based account of special relativity. His theory retains absolute simultaneity and is, in some circumstances, empirically equivalent to Einstein-Minkowski four-dimensional theory. Lorentz’s theory is not part of currently established science. It was abandoned quite a long time ago, as it did not fit with emerging general relativity and quantum physics (Acuña 2014).
There has, however, been an emerging interest in so-called Neo-Lorentzianism about relativity in the 2000s. Craig (2001) and Zimmerman (2007) have both argued, though not in exactly the same way, for the existence of absolute simultaneity. This interpretation of special relativity would, against the orthodox interpretation, back up presentism. Craig’s theological presentism leans on the existence of God. “For God in the ‘now’ of metaphysical time,” Craig explains (2001, 173), would “know which events in the universe are now being created by Him and are therefore absolutely simultaneous with each other and with His ‘now.’” According to this interpretation, there is a privileged frame of reference: God’s frame of reference. Zimmerman does not explicitly invoke Neo-Lorentzianism. In his view, there is nevertheless a privileged way to carve up spacetime into a present hyperplane:
My commitment to presentism stems from the difficulty I have in believing in the existence of such entities as Bucephalus (Alexander the Great’s horse) and the Peloponnesian War, my first grandchild, and the inauguration of the first female US president. It is past and future objects and events that stick in my craw. The four-dimensional manifold of space-time points, on the other hand, is a theoretical entity posited by a scientific theory; it is something we would not have believed in were it not for its role in this theory, and we should let the theory tell us what it needs to be like. As a presentist, I believe that only one slice of this manifold is filled with events and objects (Zimmerman 2007, 219).
These approaches maintain that the present moment is ontologically privileged, as there is a privileged frame of reference and a privileged foliation of spacetime. These views can be seen as revising science on theological and metaphysical grounds. Some, like Balashov and Janssen (2003), Wüthrich (2010, 2012), and Baron (2018), have criticized these strategies. Here we may also refer to Wilson’s (2020, 17) evidential asymmetry principle. To paint with a broad brush, physical theories are better corroborated than metaphysical theories. Physical theories, like special relativity, are supported by a vast, cross-cultural, and global consensus (not to mention the immense amount of empirical evidence and technology that requires the theory). Making our metaphysics match science is less controversial than making our science match intuitively appealing metaphysics. Many intuitions—only the present exists, time passes unidirectionally along past-present-future, a parent cannot be younger than their children—can be challenged on the basis of modern physics.
Alternative interpretations of relativity usually invoke something like the underdetermination thesis: there is more than one theory, or more than one version of the same theory, that corresponds to the empirical data. Hence, the empirical data alone do not determine what we should believe. This motivates the juxtaposition of the Einstein-Minkowski four-dimensional theory and the Lorentz ether theory. Even though there are historical cases of rival theories that at some point accounted for the data equally well, this does not mean that the two theories are equal contestants in contemporary science. The two theories might not be empirically equivalent based on current knowledge. Impetus/inertia, phlogiston/oxygen, and Lorentz/Einstein make interesting alternatives from a historical viewpoint; it would be a false balance to portray them as equally valid ways of understanding the natural world. Lorentz’s theory did not fit with the emerging general relativity and quantum field theories. These theories have made a lot of progress throughout the 20th and 21st centuries, and they are parts of established science, unlike the ether theory. Moreover, Einstein’s 1905 theory not only paved the way for subsequent physics. It also corrected the preceding Maxwellian-Hertzian electrodynamics by showing that electric fields are relative quantities; they do not require an ether in which the energy of the field is contained. Maxwell’s electrodynamics remains an important part of classical physics and electrical engineering without the assumption of a space-permeating ether.
c. General Relativity
Although special relativity, at least on its orthodox interpretation, does not lend support to presentism or the growing block view, things might be different in the case of general relativity. That theory admits the so-called Friedmann-Lemaître-Robertson-Walker (FLRW) metric. This enables one to argue for cosmic simultaneity, a unique hypersurface of cosmic time. The idea requires a fundamental observer, who can be construed as one at rest relative to the cosmic microwave background. In this sense, presentists or growing blockers may argue that although special relativity is at odds with their accounts of the nature of time, this is not so in the case of more advanced science. Swinburne (2008, 244), for one, claims “that there is absolute simultaneity in our homogeneous and isotropic universe of galaxies receding from each other with a metric described by the” FLRW solution. As pointed out by Callender (2017, 75–6) and Read and Qureshi-Hurst (2021), however, we are not fundamental observers, as we are in various relative states of motion. We move on our planet; our planet rotates around its axis; it orbits the Sun; it moves in relation to our galaxy, which in turn moves in relation to other galaxies (for a more precise description, see Pettini 2018, 2). This indicates that we do not have local access to cosmic time.
Black hole physics is also troublesome for presentism and the growing block view. It retains the frame-relativity of simultaneity, so not all observers agree on what is present (see Romero and Pérez 2014, section “Black holes and the present”). Baron and Le Bihan (2023) consider the idea of surface presentism based on general relativity. According to surface presentism, what exists is limited to a three-dimensional hypersurface. Although surface presentism allows that there is no preferred frame of reference in physics, it maintains that there is a preferred frame in the metaphysical sense. This anchors the one and only actual present moment. Consider the event horizon. Nothing, not even massless particles like photons, can escape from a black hole. The event horizon is the boundary between what goes into the black hole, never to leave, and the rest of the universe. The later times of the region beyond the event horizon, inside the black hole, are ones at which nothing that has entered will ever escape. What is relevant for the metaphysics of time is the ontological dependence between earlier and later times in black hole physics.
According to the argument of Baron and Le Bihan, there would be no event horizons if surface presentism were true. Surface presentism, like presentism in general and the growing block view, maintains that there is no future and no times later than the present time. Briefly put, the very existence of black holes, as evidenced by general relativity, tells against presentism. As nothing can escape the interior of a black hole after entering it, there is an ontological reference to a later time: no matter how long it takes, nothing can escape. To paraphrase and interpret Curiel (2019, 29), the location of the event horizon in spacetime requires the entire structure of spacetime, from the beginning to the end (and all the way to infinity). All spacetime exists.
d. Quantum Physics
Whereas relativity is a classical, determinist theory and thus fits well with predicating a fixed future, quantum physics is often interpreted in probabilistic terms. This may imply that the future is nonexistent and open, which would flatly contradict eternalism. There are good reasons for this conclusion. Take the famous double-slit experiment.
In this experiment, a gun steadily fires electrons or photons individually towards two slits. Behind the slits, a photographic plate registers the hits. Each particle leaves a mark on the plate, one at a time. When these particles are shot one by one, they land individually on the detector screen. However, when a large number of particles are shot, an interference pattern begins to emerge on the screen, indicating wave-like behavior.
At first glance, this is highly peculiar because particles and waves are fundamentally different: particles are located in specific spatial regions, while waves spread out across space.
What is important for debates concerning the nature of time is the probabilistic character of the experimental outcome. Determinist theories enable the experimenter, within a margin of error, to precisely predict the location of the particle in advance of the experiment.
In the double-slit experiment, one cannot even in principle know, before carrying out the experiment, where the individual particle will eventually hit the screen. To put it a bit more precisely, the quantum particle is associated with a probability density. A proportionality connects the wave and particle natures of matter and light: the probability density is proportional to the square of the amplitude of the associated wave (for light, the electromagnetic wave). This means we can assign probabilities for detecting the particle at locations on the screen. It is more likely to be observed at one location as opposed to another.
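The proportionality just described is the Born rule. As a minimal sketch in standard notation (the symbols ψ, P, and x are standard choices, not the article’s own):

```latex
% Born rule: the probability density for detecting the particle
% at position x on the screen is the squared modulus of the
% wave amplitude at x.
P(x) \propto |\psi(x)|^{2}
% Probabilities for regions of the screen follow by integration
% (with \psi normalized so that the total probability is 1):
\Pr(a \le x \le b) = \int_{a}^{b} |\psi(x)|^{2}\,dx
```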
This can be taken to imply that the future location of the particle is a matter of open possibility. Before the experiment is performed, there is no determinate tensed fact about where the particle will be. Yet eternalism cannot allow such open possibilities, because it treats every event as tenselessly existing. To give a more commonsensical example, eternalism indicates that next week’s winning lottery numbers exist. We were ignorant of those numbers before the lottery because we were not at the spacetime location at which we could see them. Yet the no-futurist, probabilistic interpretation of the situation suggests that next week’s lottery numbers do not yet exist: the machine will eventually randomly pick out a bunch of numbers.
Putnam’s classical semantic argument for eternalism assumes that statements concerning events have definite truth values, independently of whether the events are in the observer’s past or future. This is consistent with a determinist theory, but quantum theory might require future-tense statements with probabilistic truth values. Hence future events would have probabilistic reality values. Statements concerning what will occur do not have bivalent truth values but instead take values in the interval [0, 1]. Sudbery (2017) has developed a logic of the future from a quantum perspective. Sudbery (2017, 4451–4452) argues “that the statements any one of us can make, from his or her own perspective in the universe,” when they concern the future, “are to be identified with probabilities.” This account seems to fit well with quantum physics and no-futurist views like presentism or the growing block.
One way for the eternalist to answer this objection is to consider a determinist interpretation of quantum mechanics, like the many worlds interpretation, initiated by Everett in 1957. Greaves and Myrvold (2010, 264) encapsulate the underlying Everettian idea: “All outcomes with non-zero amplitude are actualized on different branches.” All quantum measurements correspond to multiple measurement results that genuinely occur in different branches of reality. Under Everettianism, one could think that there is no ‘now’ that is the same everywhere in all of physical reality, but different worlds/branches have their own times. Different worlds within the Everett multiverse or different branches within the single universe are causally isolated. This is not much different from relativistic spacelike separation: different locations in the universe are in each other’s absolute elsewhere, not connected by any privileged hyperplane of simultaneity. There is no unique present moment that cuts through everything that exists and defines all that exists at that instant.
A potential challenge to classical eternalist arguments that draw on the relativity of simultaneity comes from quantum entanglement. Building on the ideas of Bell, Aspect and his team were able in the 1980s to experimentally corroborate quantum non-locality. Two particles, separated by distance, turn out to have instantly correlated properties. This could not be the case if the particles had only locally defined physical states. Maudlin (2011, 2) explains that they:
appear to remain “connected” or “in communication” no matter how distantly separated they may become. The outcome of experiments performed on one member of the pair appears to depend not just on that member’s own intrinsic physical state but also on the result of experiments carried out on its twin.
Non-locality might introduce a privileged frame of reference. (For a thorough discussion of non-locality and relativity, see Maudlin (2011).) If so, the classical argument that relies on chains of simultaneity relations and their transitivity would perhaps be challenged. A question still remains about time. Gödel (1949) argued that objective passage requires a completely universal hyperplane, a global ‘now’ that constantly recreates itself. It is not clear whether the instantaneous correlation of distant particles in quantum entanglement introduces the kind of unique spacetime foliation required for an exclusive global passage of time.
The research programs on quantum gravity aim to weave together relativistic and quantum physics, considering both gravitational and quantum effects. This could potentially yield the most fundamental physical theory. Some approaches to quantum gravity indicate that spacetime is not fundamental. When one reaches the Planck scale, 1.62 × 10⁻³⁵ meters and 5.40 × 10⁻⁴⁴ seconds (Crowther 2016, 14), there might not be space and time as we know them. At first glance, this might challenge eternalism, as the classical argument for it leans on four-dimensional spacetime. Le Bihan (2020) analyzes string theory and loop quantum gravity, arguing that both align with an eternalist metaphysics of time. If, deep down, the world is in some sense timeless, the distinct parts of the natural world still exist unconditionally. This undermines presentism and no-futurism, since they rely on positing a global present by means of a unique, universal hyperplane and an absolute distinction between A-properties. Whether something is past, present, or future is not determined by fundamental physics.
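The Planck-scale figures quoted above are not arbitrary; they arise from combining the fundamental constants. As a standard sketch:

```latex
% Planck length and Planck time, built from the reduced Planck
% constant \hbar, Newton's gravitational constant G, and the
% speed of light c:
\ell_{P} = \sqrt{\frac{\hbar G}{c^{3}}} \approx 1.62 \times 10^{-35}\ \mathrm{m},
\qquad
t_{P} = \sqrt{\frac{\hbar G}{c^{5}}} = \frac{\ell_{P}}{c} \approx 5.40 \times 10^{-44}\ \mathrm{s}
```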
e. Triviality
A case has been made that the presentist/eternalist debate lacks substance. Dorato (2006, 51), for one, claims that the whole issue is ill-founded from an ontological viewpoint. Presentism and eternalism reflect our different practical attitudes toward the past, present, and future. Dorato notes that Putnam’s assertion “any future event X is already real” (Putnam 1967, 243) is problematic. The assertion seems implicitly to combine presentism and eternalism. By saying that “the future is already,” we are saying something like “the future is now.” This is contradictory: the present and the future are different times. They cannot exist at the same time.
To be more precise, Dorato (2006, 65) distinguishes between the tensed and tenseless senses of the verb “exist.” In the tensed sense, an event exists in the sense that it exists now. In the tenseless sense, an event exists in the sense that it existed, exists now, or will exist. These become trivial definitions: both presentists and eternalists accept them, and the whole debate turns verbal. Presentism maintains that past or future events do not exist now. Thus “presentism becomes a triviality” because “both presentists and eternalists must agree that whatever occurs” in the past or future “does not exist now!” Instead of being an ontological debate, the presentism/eternalism dialogue is a matter of differing existential attitudes toward the past, present, and future.
One way to answer the worry of triviality—and the charge that the whole debate is merely verbal—is to add that presentism and eternalism disagree on what exists unconditionally (the Latin phrase is simpliciter). Consider the following statement S:
S: “Cleopatra exists unconditionally.”
The presentist thinks S is false. All that exists for the presentist are the presently existing entities; when I am composing or you are reading this article, Cleopatra no longer exists. The eternalist disagrees: S is true for the eternalist while I am writing and you are reading. We are not at the same spacetime location as Cleopatra, but there is nevertheless a location that the living Cleopatra occupies (Curtis and Robson 2016, 94).
The sense in which presentism and eternalism ontologically agree/disagree can be clarified by specifying the domain of quantification. This clarification borrows from predicate logic. Presentism and eternalism both agree on restricted quantification. It is true, according to presentism and eternalism, that Cleopatra does not exist anymore while writing or reading this article. The two nevertheless respond differently to unrestricted quantification. When quantifying over the totality of what exists, the presentist maintains that the quantification is over the presently existing entities, while the eternalist maintains that the quantification is over the past, the present, and the future entities.
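The contrast between restricted and unrestricted quantification can be rendered in predicate-logic notation. A hedged illustration (the constant c and the predicate Present are my labels, not the article’s):

```latex
% Let c name Cleopatra and Present(x) mean "x exists now".
% Restricted quantification (domain limited to the present) --
% presentist and eternalist agree this is false today:
\exists x\,(\mathrm{Present}(x) \wedge x = c)
% Unrestricted quantification (domain: everything there is) --
% the presentist denies it, the eternalist affirms it:
\exists x\,(x = c)
```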
There are many other reasons to think that presentism and eternalism imply separate temporal ontologies. These aspects have already been discussed in this article: the dimensionality of the world, indexicality, and persistence. Presentism maintains the existence of a three-dimensional spatial totality plus the universal present moment that cuts through the whole universe. Eternalism denies that there is such a universal present moment and that existence is restricted to the present time. Instead, the entire block universe exists (Wüthrich 2010). Eternalism is very different from presentism because it predicates the existence of all events, irrespective of their contingent spacetime location. Presentists cannot accept the existence of all events from the beginning of the universe to its end but limit existence to the presently existing entities. Presentists may well treat the spatial location ‘here’ as indexical but hold the present to be absolute. Eternalists (apart from the moving spotlight view) treat both spatial and temporal tensed locations as indexicals. Presentists typically assume an endurantist account of persistence, the view that objects persist by being wholly present at each time at which they exist, whereas eternalists typically subscribe to perdurantism, on which objects persist by having temporal parts in spacetime.
5. References and Further Reading
Acuña, P. (2014) “On the Empirical Equivalence between Special Relativity and Lorentz’s Ether Theory.” Studies in History and Philosophy of Modern Physics 46: 283-302.
Armstrong, D. M. (2004) Truth and Truthmakers. Cambridge: Cambridge University Press.
Balashov, Y. and M. Janssen (2003) “Presentism and Relativity.” British Journal for the Philosophy of Science 54: 327-46.
Balashov, Y. (2010) Persistence and Spacetime. New York: Oxford University Press.
Baron, S. (2018). “Time, Physics, and Philosophy: It’s All Relative.” Philosophy Compass 13: 1-14.
Baron, S. and B. Le Bihan (2023) “Composing Spacetime.” The Journal of Philosophy.
Ben-Yami, H. (2006) “Causality and Temporal Order in Special Relativity.” The British Journal for the Philosophy of Science 57: 459-79.
Broad, C. D. (1923) Scientific Thought. London: Routledge and Kegan Paul.
Callender, C. (2017) What Makes Time Special? New York: Oxford University Press.
Cameron, R. (2015) The Moving Spotlight. New York: Oxford University Press.
Craig, W. L. (2001). Time and the Metaphysics of Relativity. Dordrecht: Springer.
Crowther, K. (2016) Effective Spacetime. Understanding Emergence in Effective Field Theory and Quantum Gravity. Cham: Springer.
Curiel, E. (2019) “The Many Definitions of a Black Hole.” Nature Astronomy 3: 27–34.
Curtis, L. and Robson, J. (2016) A Critical Introduction to the Metaphysics of Time. London: Bloomsbury Academic.
Dieks, D. (2012) “Time, Space, Spacetime.” Metascience 21: 617-9.
Dorato, M. (2006) “Putnam on Time and Special Relativity: A Long Journey from Ontology to Ethics”. European Journal of Analytic Philosophy 4: 51-70.
Dowden, B. (2024) “Time.” Internet Encyclopedia of Philosophy, https://iep.utm.edu/time/.
Einstein, A. (1905/1923) “On the Electrodynamics of Moving Bodies.” In Lorentz et al. (ed.) The Principle of Relativity, 35‒65. Trans. W. Perret and G.B. Jeffery. Dover Publications, Inc.
Everett, H. I. (1957) “Relative State Formulation of Quantum Mechanics.” Review of Modern Physics 29: 454-62.
Fazekas, K. (2016) “Special Relativity, Multiple B-Series, and the Passage of Time.” American Philosophical Quarterly 53: 215-29.
Greaves, H. and W. Myrvold (2010) “Everett and Evidence.” In Saunders, S., Barrett, J., Kent, A. and Wallace, D. (eds.), Many Worlds? Everett, Quantum Theory, & Reality, 264-306. New York: Oxford University Press.
Grünbaum, A. (1973) Philosophical Problems of Space and Time. Second, enlarged version. Dordrecht: Reidel.
Gödel, K. (1949) “A Remark about the Relationship between Relativity Theory and the Idealistic Philosophy.” In Schilpp, P. A. (ed.), Albert Einstein: Philosopher-Scientist, 555-62. La Salle, Illinois: Open Court.
Hales, S. and Johnson, T. (2003) “Endurantism, Perdurantism and Special Relativity.” Philosophical Quarterly 53: 524–539.
Hawley, K. (2020) “Temporal Parts.” The Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/temporal-parts/.
Hinton, C. (1904) The Fourth Dimension. London: Swan Sonnenschein & Co. Ltd.
Ingram, D. (2024) “Presentism and Eternalism.” In N. Emery (ed.), The Routledge Companion to Philosophy of Time. Routledge.
Jammer, M. (2006) Concepts of Simultaneity: From Antiquity to Einstein and Beyond. Baltimore: Johns Hopkins University Press.
Le Bihan, B. (2020) “String Theory, Loop Quantum Gravity, and Eternalism.” European Journal for Philosophy of Science 10: 1-22.
Lewis, D. (1976a) “Survival and Identity.” In A. O. Rorty (ed.), The Identities of Persons. Berkeley: University of California Press, 17-40.
Lewis, D. (1976b) “The Paradoxes of Time Travel.” American Philosophical Quarterly 13: 145-52.
Mar, G. (2017) “Gödel’s Ontological Dreams.” In Wuppuluri, S. and Ghirardi, G. (eds.), Space, Time, and the Limits of Human Understanding, 461-78. Cham: Springer.
Maudlin, T. (2011) Quantum Non-locality and Relativity. Third Edition. Wiley-Blackwell.
Maxwell, N. (1985) “Are Probabilism and Special Relativity Incompatible?” Philosophy of Science 52: 23–43.
McTaggart, J. M. E. (1908) “The Unreality of Time.” Mind 17: 457-74.
Mellor, H. (1998) Real Time II. London and New York: Routledge.
Miller, K. (2013). “Presentism, Eternalism, and the Growing Block.” In H. Dyke & A. Bardon, eds., A Companion to the Philosophy of Time. Malden, MA: Wiley-Blackwell, 345-64.
Miller, K. et al. (2020) “Temporal Phenomenology: Phenomenological Illusion versus Cognitive Error.” Synthese 197: 751-71.
Minkowski, H. (1923) “Space and Time.” In Lorentz et al. (eds.), The Principle of Relativity, 73-91. Translated by W. Perrett and G. B. Jeffery. Dover Publications.
Monton, B. (2003) “Presentists Can Believe in Closed Timelike Curves.” Analysis 63: 199-202.
Norton, J. (2015) “What Can We Learn about the Ontology of Space and Time from the Theory of Relativity.” In L. Sklar (ed.), Physical Theory: Method and Interpretation, Oxford University Press, 185-228.
O’Sullivan, L. (2023) “Frege and the Logic of Historical Propositions.” Journal of the Philosophy of History 18: 1-26.
Penrose, R. (1989) The Emperor’s New Mind: Concerning Computers, Minds, and Laws of Physics. New York and Oxford: Oxford University Press.
Peterson, D. and M. D. Silberstein (2010) “Relativity of Simultaneity and Eternalism.” In Petkov (ed.), Space, Time, and Spacetime, 209-37. Heidelberg: Springer.
Pettini, M. (2018) “Introduction to Cosmology – Lecture 1. Basic concepts.” Online lecture notes: https://people.ast.cam.ac.uk/~pettini/Intro%20Cosmology/Lecture01.pdf
Price, H. (2011) “The flow of time.” In C. Callender, ed., The Oxford Handbook of Philosophy of Time. Oxford: Oxford University Press, 276-311.
Putnam, H. (1967) “Time and Physical Geometry.” The Journal of Philosophy 64: 240-7.
Read, J. and E. Qureshi-Hurst (2021) “Getting Tense about Relativity.” Synthese 198: 8103–8125.
Reichenbach, H. (1958) The Philosophy of Space and Time. Translated by Maria Reichenbach and John Freund. New York: Dover Publications, Inc.
Rietdijk, C. W. (1966) “A Rigorous Proof of Determinism Derived from the Special Theory of Relativity.” Philosophy of Science 33: 341-4.
Rogers, K. A. (2007) “Anselmian Eternalism: The Presence of a Timeless God.” Faith and Philosophy 24: 3-27.
Romero, G. E. and D. Pérez (2014) “Presentism Meets Black Holes.” European Journal for Philosophy of Science 4: 293-308.
Rovelli, C. (2019) “Neither Presentism nor Eternalism.” Foundations of Physics 49: 1325-35.
Russell, B. (1925) The ABC of Relativity. New York and London: Harper and Brothers.
Russell, B. (1915) “On the Experience of Time.” The Monist 25: 212-23.
Skow, B. (2009) “Relativity and Moving Spotlight.” The Journal of Philosophy 106: 666–78.
Slavov, M. (2022) Relational Passage of Time. New York: Routledge.
Smart, J. (1949) “The river of time.” Mind 58: 483-94.
Sudbery, A. (2017) “The Logic of the Future in Quantum Theory.” Synthese 194: 4429-53.
Swinburne, R. (2008) “Cosmic Simultaneity.” In W. L. Craig & Q. Smith (Eds.), Einstein, Relativity and Absolute Simultaneity, 224–261. London: Routledge.
Thomas, E. (2019) “The Roots of C. D. Broad’s Growing Block Theory of Time.” Mind 128: 527-49.
Thyssen, P. (2023) “The Rietdijk–Putnam–Maxwell Argument.” https://philarchive.org/rec/THYTRA.
Torrengo, G. (2017) “Feeling the Passing of Time.” Journal of Philosophy 114: 165–188.
Waller, J. (2012) Persistence through Time in Spinoza. Lexington Books.
Williams, D. C. (1951) “The Myth of Passage.” The Journal of Philosophy 48: 457-72.
Wilson, A. (2020) The Nature of Contingency: Quantum Physics as Modal Realism. Oxford: Oxford University Press.
Wüthrich, C. (2010) “No Presentism in Quantum Gravity.” In V. Petkov (ed.), Space, Time, and Spacetime: Physical and Philosophical Implications of Minkowski’s Unification of Space and Time, 257-78. Heidelberg: Springer.
Wüthrich, C. (2012) “The Fate of Presentism in Modern Physics.” In Ciuni, Miller, and Torrengo (eds.), New Papers on the Present—Focus on Presentism, 92-133. München: Philosophia Verlag.
Zimmerman, D. (2007) “The Privileged Present: Defending an ‘A-Theory’ of Time.” In Sider, T. et al. (eds.), Contemporary Debates in Metaphysics, 211-25. Blackwell.
The Cognitive Foundations and Epistemology of Arithmetic and Geometry
How is knowledge of arithmetic and geometry developed and acquired? In the tradition established by Plato and often associated with Kant, the epistemology of mathematics has been focused on a priori approaches, which take mathematical knowledge and its study to be essentially independent of sensory experience. Within this tradition, there are two a priori approaches. In the epistemological a priori approach, mathematical knowledge is seen as being a priori in character. In the methodological a priori approach, the study of the nature of mathematical knowledge is seen primarily as an a priori philosophical pursuit. Historically, there have been philosophers, most notably Mill in 1843, who have challenged the epistemological a priori approach. By contrast, until the 21st century, the methodological a priori approach has remained unchallenged by philosophers.
In the first two decades of the 21st century, the methodological a priori approach has received serious challenges concerning both arithmetic and geometry, which are generally considered to be among the most fundamental areas of mathematics. Empirical results have emerged that suggest that human infants and many non-human animals have something akin to arithmetical and geometrical capacities. There has been a great deal of disagreement over the philosophical significance of such results. Some philosophers believe that these results are directly relevant to philosophical questions concerning mathematical knowledge, while others remain sceptical.
This article presents some key empirical findings from the cognitive sciences and shows how they have been applied to the epistemology of arithmetic and geometry. It is divided into two parts. The first part is focused on arithmetic. Results on early quantitative cognition are reviewed, and important conceptual and terminological distinctions are made, after which the importance of these empirical data for the epistemology of arithmetic is discussed. Two separate but connected problems are distinguished: the development of arithmetical knowledge on the level of individual ontogeny, and its development on the level of phylogeny and cultural history. The role of culture in the development of arithmetical knowledge is discussed, after which general epistemological considerations are provided. While at present the empirical data relevant to the development of arithmetic are stronger and more plentiful, there is also a growing body of data relevant to the development of geometry. In the second part, these data are used to give geometrical knowledge a treatment similar to that given to arithmetical knowledge.
Traditional pre-19th-century Western philosophy of mathematics is often associated with two specific views. The first is Plato’s (The Republic) notion that mathematics concerns mind-independent abstract objects. The second is Immanuel Kant’s (1787) view of mathematical knowledge as synthetic a priori. These views, while not necessarily connected, are compatible. By combining them, one obtains a standard Platonist view of mathematics: mathematical knowledge is acquired and justified through reason and recollection, and it concerns mind-independent abstract objects.
This standard view can be challenged from different directions. Conventionalists, for example, deny that mathematical knowledge concerns mind-independent objects. Hence, mathematics does not give us genuinely new knowledge about the world (in a broad sense), and should be considered analytically a priori (see, for example, Carnap, 1937). A less popular challenge claims that mathematical knowledge is not essentially independent of sensory experience, but a posteriori in character. The most famous such empiricist view was presented by John Stuart Mill (1843). By the late 20th century, empiricist epistemologies of mathematics had been supported by Philip Kitcher (1983) and Penelope Maddy (1990). Maddy connects empiricism to mathematical realism, while, according to Kitcher, mathematical knowledge concerns generalisations of operations that we undertake in our environment. A similar view is supported by George Lakoff and Rafael Núñez (2000), who focus on the use of conceptual metaphors in cognitive development.
While there are many similarities in the approaches of Kitcher and Lakoff and Núñez, there are also notable differences. Importantly, Lakoff and Núñez make more connections to the empirical literature on the development of mathematical cognition. From the 1990s on, authors began taking this approach even more seriously, making important use of empirical studies on early numerical and geometrical cognition in the epistemology of mathematics. Significantly, such authors are not necessarily empiricists concerning mathematical knowledge. Accordingly, it is important to distinguish between empiricist and empirically-informed epistemological theories of mathematics. While the former are likely to include the latter, the opposite is not necessarily the case. Since the late 20th century, many philosophers have proposed epistemological accounts of arithmetic and geometry that are based in a significant way on empirical research, but which do not support the view that mathematical knowledge is essentially empirical in character.
b. Empirical Research and the Philosophy of Mathematics
An important reason for the emergence of empirically-informed accounts is the extensive development of empirical research on numerical cognition in the 1990s. This empirical work was presented in two popular books: The Number Sense by Stanislas Dehaene (Dehaene 1997) and What Counts: How Every Brain is Hardwired for Math (Butterworth 1999), published in the United Kingdom as The Mathematical Brain. Within this research corpus, one of the most famous items is Karen Wynn’s paper “Addition and subtraction by human infants” (Wynn 1992). In it, Wynn presents her research on five-month-old infants, whom she interprets as possessing genuine number concepts and arithmetical abilities. Wynn’s experiment is widely discussed in the subsequent literature, and since it illuminates the different kinds of interpretations that can be made of empirical research, it is worth presenting in detail.
In the experiment, infants were shown dolls, and their reactions were observed to determine whether they had numerical abilities. In the first variation of the experiment, infants were shown two dolls placed, one by one, behind an opaque screen. In some trials, one of the dolls was removed without the infant seeing its removal, revealing only one doll when the screen was lifted. In others, both dolls were left behind the screen, revealing two dolls. In the second variation of the experiment, the infants were first shown two dolls and, after one was visibly removed, either one or two dolls were revealed.
Wynn’s experiment showed that infants reacted with surprise (measured through longer looking times) to the trials where the revealed quantity of dolls was arithmetically impossible (namely, when only one doll was revealed in the first variation, and when two dolls were revealed in the second variation). Wynn argued that this showed that “infants can calculate the results of simple arithmetical operations on small numbers of items. This indicates that infants possess true numerical concepts, and suggests that humans are innately endowed with arithmetical abilities.” (Wynn 1992, 749). Others were equally excited by the results. Dehaene, for example, motivated his book by asking: “How can a five-month-old baby know that 1 plus 1 equals 2?” (Dehaene 2011, xvii). Yet, a lot is assumed in these claims. Do infants really possess true numerical concepts? Are they innately endowed with arithmetical abilities? Their behaviour in the experiment notwithstanding, do they know that one plus one equals two? These questions warrant a detailed analysis before we say more about the arithmetical capacities of infants. But, clearly, empirical research of this type is highly philosophically relevant; after all, if Wynn’s and Dehaene’s interpretations are correct, some mathematical knowledge is already possessed by infants. This would pose a serious challenge to epistemological theories according to which mathematical knowledge is acquired solely through reason and recollection.
2. Numerical and Arithmetical Cognition
a. Arithmetic and Proto-Arithmetic
Neither Wynn nor Dehaene propose that infants possess arithmetical knowledge in the same sense in which arithmetically-educated adults do. Yet, as they interpret the empirical results, adult arithmetical knowledge is a later stage of a developmental trajectory that builds on innate arithmetical abilities. Standardly, two innate abilities are identified (see, for example, Dehaene, 2011; Feigenson et al., 2004; Hyde, 2011; Spelke, 2011). First is the ability to subitize, which was first reported by Kaufman and colleagues (1949). Subitizing is the ability to determine the quantity of objects in the field of vision without counting. The subitizing ability allows for the precise determination of quantities but standardly stops being applied for collections larger than four items. In one of the earliest results supporting the existence of an infant ability to discriminate quantities, Starkey and Cooper (1980) reported that 22-week-old infants subitize. Since then, it has been established that many non-human animals have the ability to subitize (see, for example, Dehaene, 2011 for an overview). The second ability is estimating the size of an observed collection of objects. Unlike subitizing, the estimating ability is not limited to small quantities. Yet, it becomes increasingly inaccurate as the estimated collections become larger. Indeed, the accuracy of the estimations decreases in a logarithmic manner, following the so-called Weber-Fechner law (also called Weber’s law in the literature): it is more difficult to distinguish between, say, 17 and 18 objects than it is between 7 and 8 (Dehaene 2011; Fechner 1948). Since the performance signatures of these abilities (for both human children and non-human animals) are different for small and large collections of objects, subitizing (being precise but limited) and estimating (being approximate but essentially unlimited) are standardly thought to be distinct abilities (Feigenson, Dehaene, and Spelke 2004).
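The logarithmic compression of the estimating ability can be stated compactly. The following is the standard textbook formulation of the Weber-Fechner law, not a formula from the works cited above; the constants k and S₀ are free parameters of the model:

```latex
% Weber-Fechner law: the perceived magnitude p of a collection grows
% logarithmically with its actual size S (k and S_0 are constants).
p = k \ln \frac{S}{S_0}
% As a consequence, discriminability depends on the ratio of the compared
% collections: sizes S_1 > S_2 are reliably distinguished roughly when
% (S_1 - S_2)/S_2 exceeds a constant Weber fraction w. Hence 7 vs. 8
% (ratio 8/7) is easier to tell apart than 17 vs. 18 (ratio 18/17).
```

This ratio-dependence is what the text means by estimation accuracy decreasing logarithmically: equal absolute differences matter less as collections grow.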
In addition to infants, the subitizing and estimating abilities have been detected in many non-human animals. Among them are animals that are generally considered to be intelligent, like primates and birds in the corvid family (for review, see Dehaene 2011; Knops 2020; Nieder 2019). More surprising have been empirical results showing that goldfish (DeLong et al. 2017), newborn chicks (Rugani et al. 2009) and honeybees (Howard et al. 2019) also seem to possess similar quantitative abilities. These data suggest that the abilities have early evolutionary origins, or evolved several times. In either case, their existence has been an important reason to reconsider the origins of numerical abilities.
Given that subitizing and estimating are innate abilities, it is commonplace among empirical researchers to attribute them to so-called core cognitive systems (Carey 2009; Spelke 2000). According to Susan Carey, core cognition refers to how human cognition begins with “highly structured innate mechanisms designed to build representations with specific content” (Carey 2009, 67). The core system responsible for the subitizing ability allows for the tracking of persisting objects in the field of vision, and it is usually called the object tracking system (OTS) (Knops 2020), but sometimes also the parallel individuation system (Carey 2009; Hyde 2011). Unlike the OTS, which has functions besides determining the quantity of objects, the core system responsible for the estimating ability is standardly thought to be quantity specific. It is usually called the approximate number system (ANS) (Spelke 2000), and it is what Dehaene called the number sense (Dehaene 1997), even though some research suggests that instead of being number-specific, the estimation system is common to space, time, and number (Walsh 2003).
As mentioned above, it is commonplace to think of the OTS and the ANS as two distinct core cognitive systems (see, for example, Hyde, 2011; Nieder, 2019). Recently, though, a mathematical model has been proposed according to which, under limited informational capacity, only one innate system is responsible for different performance signatures for small and large collections (Cheyette and Piantadosi 2020). From a philosophical standpoint, more important than the exact division between systems is how the core cognitive systems should be understood in terms of the development of arithmetical cognition. Based on Wynn’s (1992) report of her experiment, it is likely that infants are subitizing when determining the quantity of the dolls. She believes that they also practice arithmetic and possess true numerical concepts. This prompts the question: could the infants’ behaviour be understood in another way, one that does not assign genuine arithmetical ability or numerical concepts to them? Many researchers believe that subitizing, based on the OTS, works by means of the observed objects occupying mental object files (Carey 2009; Noles, Scholl, and Mitroff 2005). When two objects are observed, two object files are occupied. Under this explanation, the infants’ surprise during Wynn’s experiment is explained by their observations not matching the occupied object files. Importantly, the infants are not thought to observe “twoness”, representing the number of dolls in terms of numerical concepts.
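The object-file explanation of Wynn's result can be sketched computationally. The toy model below is our illustrative sketch, not an implementation from the cited literature (all names and the capacity constant are our assumptions): one object file is occupied per observed object, up to the OTS limit of about four, and "surprise" is simply a mismatch between occupied files and revealed objects, with no numerical concept represented anywhere.

```python
OTS_CAPACITY = 4  # approximate limit of the object tracking system

def occupy_object_files(observed_objects):
    """Open one mental object file per observed object, up to capacity."""
    files = []
    for obj in observed_objects:
        if len(files) < OTS_CAPACITY:
            files.append(obj)
    return files

def is_surprised(object_files, revealed_objects):
    """Surprise arises when the revealed objects cannot be matched
    one-to-one with the occupied object files. Note that no number
    concept ('twoness') figures in this explanation."""
    return len(object_files) != len(revealed_objects)

# Wynn's first variation: two dolls are placed behind the screen ...
files = occupy_object_files(["doll1", "doll2"])
# ... revealing one doll yields a mismatch (longer looking times),
# while revealing two dolls matches the occupied files.
print(is_surprised(files, ["doll1"]))           # True
print(is_surprised(files, ["doll1", "doll2"]))  # False
```

The point of the sketch is that the infants' behaviour is fully explained by file-matching, which is why the experiment does not by itself establish possession of numerical concepts.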
Based on such considerations, it is commonplace to distinguish between genuinely arithmetical ability and the innate quantitative abilities of subitizing and estimating. For this reason, Markus Pantsar (2014; 2019) has called the latter abilities proto-arithmetical, whereas Núñez (2017) calls them quantical. Arithmetic, under these distinctions, refers exclusively to the culturally developed human system of natural numbers and their operations. Arithmetic does not necessarily mean modern sophisticated formal systems, like the Dedekind-Peano axiomatization (Peano 1889). Instead, under the characterization of Pantsar (2018, 287), arithmetic refers to a “sufficiently rich discrete system of explicit number words or symbols with specified rules of operations.” What counts as “sufficiently rich” is left purposefully undefined; more important is the idea that arithmetic, in contrast to proto-arithmetic, needs to consistently and specifically discriminate between different cardinalities (unlike with the ANS), without there being a pre-set limit in size (unlike with the OTS). Put more precisely in mathematical terms, the system must “sufficiently follow the structure of the omega progression, that is, the standard ordering of the set of natural numbers” (ibid.).
A similar distinction has also been proposed when it comes to the subject matter of proto-arithmetical/quantical abilities. While some argue that they concern numbers (see, for example, Carey, 2009; Clarke & Beck, 2021), others insist that we need to distinguish them from numbers as objects of arithmetic. Hence it has been proposed that instead of numbers, proto-arithmetical abilities should be discussed as detecting numerosities (De Cruz and De Smedt 2010; Pantsar 2014). Under this distinction, numerosities refer to the quantity-specific content that can be processed using proto-arithmetical abilities. In what follows, we employ the double distinction between arithmetic and proto-arithmetic and between numbers and numerosities.
b. Acquisition of Number Concepts and Arithmetical Knowledge
Recall that Wynn interpreted the results of her experiment to imply that “infants possess true numerical concepts” (Wynn 1992, 749). With the distinction between numbers and numerosities in place, this conclusion seems dubious. The infants do appear to be able to process numerosities, but this ability is entirely proto-arithmetical, not arithmetical. Hence there is no reason to believe that number concepts are innate to humans (or non-human animals). This is supported by evidence from anumeric cultures such as the Pirahã and the Munduruku of the Amazon. These cultures have not developed arithmetic and their languages do not have numeral words, with the possible exceptions of words for one and two; their members show no arithmetical abilities (Gordon 2004; Pica et al. 2004). Yet experiments show that members of these cultures do possess proto-arithmetical abilities (Dehaene et al. 2008; Everett and Madora 2012; Frank et al. 2008). Therefore, it is likely that, while proto-arithmetical abilities are innate and universal to neurotypical humans, number concepts develop only in particular cultural contexts, in close connection with the development of numeral words (or symbols).
There are, however, disagreeing voices in the literature. Aside from Wynn, Rochel Gelman and C. Randy Gallistel are proponents of a nativist view according to which number concepts are innate and pre-verbal (Gallistel 2017; Gelman and Gallistel 2004). In addition, Dehaene and Brian Butterworth have both presented influential accounts which could be interpreted as forms of nativism. Butterworth (1999) has argued for the existence of an innate “number module”, while Dehaene (2011) has argued for an innate “mental number line.” Yet, it seems that nativist interpretations of these accounts result from misleading terminology. While both authors support innate numerical capacities, given our terminological distinctions, this amounts to the innateness of proto-arithmetical abilities.
If number concepts are not innate, they must be acquired during individual ontogeny. The first influential account along these lines was presented by Jean Piaget (1965). At the time, it was not known that children possess pre-verbal proto-arithmetical abilities, so Piaget endorsed a view according to which all numerical abilities arise from logical capacities and typically emerge only at around the age of five. Researchers have since shown that Piaget was wrong about the origins of numerical abilities, but he seems to have been right that number concepts are not innate. He did, however, get the age of this development wrong: the first number concepts typically emerge around the age of two.
Parents of young children may want to object at this point, since children can already count before the age of two. Yet it is important to distinguish between different types of counting. Paul Benacerraf (1965) distinguishes between intransitive and transitive counting. The former consists merely of repeating the numeral word sequence in order, as at the beginning of a game of hide and seek. In the latter, the numeral words are used to enumerate the items in a collection, as when counting grapes on a plate. In the empirical literature, these two types of counting are often referred to by different terms: transitive counting is often called enumeration, while the word “counting” is reserved for intransitive counting (for example, Rips et al., 2008).
When exploring number concept acquisition, it is imperative that we distinguish both types of counting from counting with number concepts, which is a further stage in cognitive development. It is known that intransitive counting precedes transitive counting, but also that transitive counting is not sufficient for possession of number concepts (see, for example, Davidson et al., 2012). The way this is usually established is through the “give-n” test developed by Wynn (1990). In the test, children are presented with a collection of objects and asked to give n of them. If they consistently give n objects, they are thought to possess the number concept of n. Yet, as pointed out by Davidson and colleagues, there is a stage at which children can transitively count objects but do not pass the give-n test.
At about two years of age, children start passing the test for n = 1, at which point they are called one-knowers. After that, they acquire the next three number concepts in ascending order, in stages that typically take 4-5 months each (Knops 2020). Subsequently, a qualitative change occurs in children’s cognitive processing. After becoming four-knowers, children do not simply continue this item-by-item trajectory; instead, they grasp something more general about numbers and start passing the give-n test not only for five but also for six, seven, and so on (Lee and Sarnecka 2010). At this point, having acquired the general understanding that the last word uttered in a count refers to the cardinality of the counted objects, they are called cardinality-principle-knowers (Lee and Sarnecka 2011).
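The logic of classifying children by give-n performance can be rendered as a simple procedure. The sketch below is our simplification of the scoring used in the literature (the function name, input format, and the cutoff at four are our assumptions): a child's knower-level is the highest n reliably passed, and success well beyond the subitizing range is credited as knowledge of the cardinality principle.

```python
def knower_level(give_n_results, max_n=6):
    """Classify a child's knower-level from give-n performance.

    give_n_results maps each requested quantity n to whether the
    child reliably gives exactly n objects. The knower-level is the
    highest n passed without an earlier failure."""
    level = 0
    for n in range(1, max_n + 1):
        if give_n_results.get(n, False):
            level = n
        else:
            break
    # Succeeding beyond the range of the OTS (about four) indicates
    # a general grasp of the cardinality principle.
    return "CP-knower" if level > 4 else f"{level}-knower"

print(knower_level({1: True, 2: True, 3: False}))    # 2-knower
print(knower_level({n: True for n in range(1, 7)}))  # CP-knower
```

The qualitative jump described in the text shows up here as the cutoff: passing give-5, give-6, and so on is not one more incremental stage but evidence of a general principle.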
What happens in children’s cognitive development when they become cardinality-principle-knowers? The theory of number concept acquisition that is currently most thoroughly developed in the literature is called bootstrapping. Originally presented by Susan Carey (2004), this theory has since been further developed and clarified by Carey and others (Beck 2017; Carey 2009; Pantsar 2021a). In a nutshell, the bootstrapping account ascribes a central role to the object tracking system in the process. After acquiring the counting list (that is, being able to intransitively count), children are thought to form mental models of different sizes of collections, based on computational constraints set by the object files of the OTS (Beck 2017, 116). When observing two objects, for example, two mental object files are occupied. Such instances are thought to form a representation of two objects in the long-term memory (Carey 2009, 477). Then, through counting games in which the counting list is repeated while pointing to objects, this representation is connected to a particular number word, like “two” (Beck 2017, 119). This explanation can only hold for up to four, though, which is the limit of the OTS. After this, in the last stage of the bootstrapping process, children are thought to grasp, through inductive and analogical reasoning, that the ascending, regular order of the first four number concepts can be extrapolated to the rest of the counting list (Beck 2017, 119). This is the stage at which children become cardinality-principle-knowers.
The bootstrapping theory has critics. Pantsar (2021a) has asked how children ever grasp that there are numerosities larger than four if the OTS is the sole proto-arithmetical system relevant to the bootstrapping process. Lance Rips and colleagues (2006) ask why children bootstrap a linear number system and not, say, a cyclical one. Why, after acquiring the number concept for twelve, for example, is thirteen the next step, instead of one? Rips and colleagues (2008) argue that there needs to be a mathematical schema concerning numbers already in place to prevent such cyclical systems or other “deviant interpretations”. For them, grasping the concept of natural number requires understanding general truths of the type “for all numbers, a + b = b + a”. This, however, is quite problematic as it implies that grasping natural numbers requires understanding something like the Dedekind-Peano axioms, which standardly only happens quite late in individual ontogeny (if at all). It is problematic to associate grasping number concepts with such sophisticated mathematical understanding given that children already possess considerable arithmetical knowledge and skills much earlier.
The conclusion of Rips and colleagues (2008, p. 640) is that the concept of natural number may be completely independent of proto-arithmetical abilities. While they believe that number concepts are learned in a “top-down” manner by grasping general principles, others have suggested different explanations for why we do not acquire deviant interpretations such as cyclical number systems. Margolis and Laurence (2008) have suggested that the deviant interpretation challenge supports a nativist view about number concepts. Yet others have looked for solutions to the challenge that do not entirely abandon the bootstrapping account. Paula Quinon (2021) has suggested that an innate sense for rhythm can explain the preference for linear, regular number systems. Pantsar (2021a) has argued that the approximate number system can influence the bootstrapping process so as to prevent cyclical number systems. Jacob Beck (2017), by contrast, rejects the overall importance of the problem, pointing out that it is just another instance of the general problem of inductive learning, as formulated by the likes of Kripke (1982) and Goodman (1955).
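The deviant-interpretation challenge can be made concrete with a toy example (ours, not from the cited papers): a linear and a cyclical successor function agree on everything the OTS-based bootstrapping evidence fixes, namely the first few numbers, yet diverge afterwards. The evidence available to the child therefore underdetermines which rule is being learned.

```python
def linear_successor(n):
    """The standard successor function: the count continues without bound."""
    return n + 1

def cyclical_successor(n, cycle=12):
    """A 'deviant' successor that wraps around after twelve,
    like the hours on a clock: after 12 comes 1 again."""
    return n + 1 if n < cycle else 1

# Both interpretations agree on the range fixed by the bootstrapping
# evidence (the first four number concepts, set by the OTS limit) ...
for n in range(1, 5):
    assert linear_successor(n) == cyclical_successor(n)

# ... but diverge beyond it: what comes after twelve?
print(linear_successor(12))    # 13
print(cyclical_successor(12))  # 1
```

This is the structure of Rips and colleagues' worry, and also why Beck can treat it as an instance of the general problem of inductive learning: finitely many agreed cases are compatible with both rules.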
So far, we have been discussing number concepts, but acquiring them is only the first step in developing arithmetical knowledge and skills. While acquiring number concepts would seem to be a necessary condition for having arithmetical knowledge, possessing number concepts alone does not ensure possession of arithmetical knowledge. In accounts like that of Lakoff & Núñez (2000), detailed schemas of the acquisition of arithmetical knowledge are presented. They see the application of conceptual metaphors as the foundation of mathematical knowledge. Addition, for example, is seen as the metaphorical counterpart of the physical task of putting collections of objects together (Lakoff and Núñez 2000, 55).
In other accounts, addition is understood as a more direct continuation of counting procedures, which are mastered in the process of acquiring number concepts. The psychologists Fuson and Secada (1986), for example, presented an account of grasping the addition operation based on a novel understanding of counting. In counting a + b, children typically first use the “counting-all” strategy. When presented with a objects, the children count to a. When presented with b additional objects and asked to count all (that is, a + b) objects, they re-count to a and then continue the count to a + b. Yet, at some stage, children switch to the “counting-on” strategy, starting the count directly from where they finished with a. That is, instead of counting, say, 5 + 3 by counting 1, 2, 3, 4, 5, 6, 7, 8, when applying the “counting-on” strategy, children simply count 6, 7, 8 in the second stage. This strategy has been shown to be helpful in understanding addition (Fuson and Secada 1986, 130).
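The two strategies described by Fuson and Secada can be rendered as simple procedures. The sketch below is our illustrative rendering, not code from the cited study; the function names are our own. Counting-all recounts the first collection from one, while counting-on takes its size as given and resumes the count from there.

```python
def counting_all(a, b):
    """Count every object from the beginning: 1, 2, ..., a, a+1, ..., a+b."""
    count = 0
    for _ in range(a):  # re-count the first collection
        count += 1
    for _ in range(b):  # then continue over the second collection
        count += 1
    return count

def counting_on(a, b):
    """Start directly from a and count on: a+1, a+2, ..., a+b."""
    count = a           # the first collection's size is taken as given
    for _ in range(b):
        count += 1
    return count

# Counting 5 + 3: counting-all utters eight numerals (1 through 8),
# counting-on utters only three (6, 7, 8); both reach the same sum.
print(counting_all(5, 3))  # 8
print(counting_on(5, 3))   # 8
```

The switch from the first procedure to the second is the "novel understanding of counting" at issue: treating a collection's cardinality as a starting point rather than something to be re-established by counting.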
This example shows how arithmetical operations can be grasped by applying knowledge that children already possess (that is, about counting). Addition is thus understood as a direct descendant of counting. Similarly, multiplication can be understood as a descendant of addition, and so on. Therefore, acquiring arithmetical knowledge and skills can build directly on number concept acquisition and the knowledge and skills that children learn in that process. One should be careful, though, not to make too many assumptions about ontogenetic trajectory. For empirically-informed approaches, it is important to be faithful to the actual empirical details regarding the development of children during ontogeny. While schemas like that presented by Lakoff and Núñez (2000) can be instructive, we should not confuse them with empirical evidence. Understanding ontogenetic trajectories in learning arithmetic is an important challenge that, like that of number concept acquisition, needs further empirical research.
c. Embodied Mind and Enculturation
One consequence of the traditional a priori emphasis in philosophy of mathematics has been that the embodied aspects of mathematics have often been dismissed as philosophically inconsequential. This was famously the case for Frege (1879), who distinguished between the context of discovery (how truths are learned) and the context of justification (how truths are established). For arithmetic, Frege (1884) deemed the context of discovery philosophically irrelevant, ridiculing Mill’s empiricist account as “pebble arithmetic”. However, as we have seen, many researchers currently see the ontogenetic aspects of arithmetic as important to a philosophical understanding of arithmetical knowledge. So far, we have been focused on the role of the proto-arithmetical abilities of subitizing and estimating, and the cognitive core systems associated with them. But in individual ontogeny, embodied aspects and external resources also play important roles. Therefore, it is important to expand epistemological accounts of arithmetic beyond proto-arithmetical abilities.
This is the case already for number concepts. Recognizing the role of embodied aspects of cognition and external resources within cognitive development does not, by itself, imply a preference for one account of number concepts over another. In Beck’s (2017) formulation of the bootstrapping account, number concepts are acquired with the help of two types of external resources: the counting list and counting games. But it is not the presence of external resources that distinguishes between bootstrapping accounts and nativist accounts. Nativists concerning number concepts also acknowledge the importance of education in grasping number concepts. The difference lies in what kind of influence the external resources have. Within nativist accounts, external resources are seen as facilitators that help innate number concepts to emerge. Within bootstrapping accounts, number concepts are shaped by a combination of innate cognitive capacities and external cultural influences.
While non-nativist accounts acknowledge the importance of external factors for number concept acquisition and the development of arithmetical cognition in general, it is not always clear how such accounts view interactions between learning subjects and their environment. Helen De Cruz (2008) was one of the first to tackle this problem systematically by adopting the extended mind perspective within the philosophy of arithmetical cognition. According to the extended mind thesis, cognitive processes are extended into the environment in the sense that they are constituted by both internal cognitive capacities and external resources (Clark and Chalmers 1998). De Cruz argued that this holds for arithmetical cognition; external media such as numeral words, body parts, and numeral symbols are constitutive of number concepts and arithmetical operations on them.
The embodiment of mind is often associated with the extended mind thesis. Arithmetical cognition seems to provide good examples of how our bodies shape our cognitive processes. The use of fingers, in particular, has been widely established across cultures to be significant in the early grasp of counting processes (Barrocas et al. 2020; Bender and Beller 2012; Fuson 1987). Furthermore, finger gnosis, the ability to differentiate between one’s own fingers without visual feedback, is a predictor of numerical and arithmetical ability levels (Noël 2005; Penner-Wilger and Anderson 2013; Wasner et al. 2016). The use of body parts has also been identified by philosophers as an important factor in developing arithmetical cognition (Fabry 2020). In addition to finger (and other body part) counting, there are many other important embodied processes involved in learning arithmetic. Manipulating physical objects like building blocks has been established as important in early arithmetic education (Verdine et al. 2014). The use of cognitive tools, such as pen and paper or abacus, becomes important in later stages (Fabry and Pantsar 2021). It is important to note, though, that this kind of support for the embodiment of mind does not require subscribing to the stronger extended mind thesis. Under a weaker understanding of the embodiment of mind, embodied processes shape the concepts and cognitive processes involved in arithmetic. According to the extended mind thesis, embodied processes are constitutive of concepts and cognitive processes.
External resources are important for the development of arithmetical cognition in many ways. In addition to body part counting and the use of physical cognitive tools, numeral word and numeral symbol systems also play an important role. The Mandarin numeral word system, for example, has shorter words than the English system and follows the recursive base of ten more closely. This has been suggested as an explanation of why native Mandarin speakers typically learn to count faster than native English speakers (Miller et al. 1995). For numeral symbol systems, similar data is not available, largely due to the widespread use of Indo-Arabic numerals. Yet, philosophers have worked on different numeral symbol systems and the way they can shape arithmetical cognition (see, for example, Schlimm 2021).
The importance of external resources for the development of arithmetical cognition appears to raise an age-old question: what is the role of nature and what is the role of nurture in this process? How much of arithmetical cognition comes from evolutionarily developed and innate proto-arithmetical abilities, and how much comes from culturally developed practices? From the above considerations, it seems clear that both play important roles, yet it is difficult to determine their relative impact. These types of considerations have led some researchers to abandon the crude nature vs. nurture framework. Influentially, Richard Menary (2015) has argued that arithmetical cognition is the result of enculturation. Enculturation refers to the transformative process through which interactions with the surrounding culture determine the way cognitive practices are acquired and developed (Fabry 2018; Jones 2020; Menary 2015; Pantsar 2020; Pantsar and Dutilh Novaes 2020). In the enculturation account, we must consider the development of cognitive abilities through such interactions. Menary focuses mainly on the brain and its capacity to adapt learning in accordance with the surrounding culture. Regina Fabry (2020) has extended this focus to the rest of the body, emphasizing the importance of embodied processes for the development of arithmetical cognition. The enculturation account has also proven to be fruitful for studying the general role of material representations, like symbols and diagrams, in the development of arithmetical (and other mathematical) cognition (Johansen and Misfeldt 2020; Vold and Schlimm 2020).
The enculturation account helps us understand how the brain changes as a result of learning cognitive practices involving cognitive tools, such as numeral symbol systems or tools for writing, like pen and paper. These changes typically involve the same areas of the brain across individuals. Menary (2015) has explained such predictable changes in the brain through the notion of learning driven plasticity, which includes both structural and functional changes. At present, there are two competing accounts for explaining learning driven plasticity. Menary follows Dehaene’s (2009) notion of neuronal recycling, according to which evolutionarily developed neural circuits are recycled for new, culturally specific purposes. In the case of arithmetic, that means that evolutionarily developed proto-arithmetical neural circuits are (partly) redeployed for culturally specific, arithmetical, purposes. There is empirical data in support of this, showing, for example, that the prefrontal and posterior parietal lobes, especially the intraparietal sulcus, activate when processing numerosities both symbolically and non-symbolically (Nieder and Dehaene 2009).
Michael Anderson (2010; 2015), by contrast, has argued that neural reuse is the basic organizational principle involved in enculturation. According to the neural reuse principle, neural circuits generally do not perform specific cognitive or behavioural functions. Instead, due to the flexibility of the brain, cognitive and behavioural functions can employ (and re-deploy) resources from many neural circuits across different brain areas. Brain regions do, though, have functional biases, which explain why, during enculturation, the brains of individuals typically go through similar structural and functional changes. Recently, both Fabry (2020) and Max Jones (2020) have argued for neural reuse as a better explanation of enculturation in the specific context of arithmetic (see Pantsar (2024a) for further discussion). In addition, due to the importance of the embodied aspects of arithmetical cognition, Fabry (2020) has emphasized the need for also explaining adaptability outside the brain, introducing the notion of learning driven bodily adaptability.
One further question that has emerged in the literature on the development of arithmetical cognition is that of representations. The standard accounts among empirical researchers imply that proto-arithmetical numerosities are represented in the mind either in object files (OTS) or a mental number line (ANS) (see, for example, Carey, 2009; Dehaene, 2011). However, this goes against radical enactivist views according to which basic, that is, unenculturated, minds do not have content or representations (Hutto and Myin 2013; 2017). Recently, radical enactivist accounts of mathematical cognition have been presented (Hutto 2019; Zahidi 2021; Zahidi and Myin 2018). These accounts try to explain the development of arithmetical cognition in ontogeny without invoking representational models of proto-arithmetical abilities. This topic can be expected to get more attention in the future, as our empirical understanding of proto-arithmetical abilities improves.
d. From Ontogeny to Phylogeny and Cultural History
The surrounding culture and its impact on learning play key roles in the enculturation account of arithmetical knowledge. This prompts the question: how can cultures develop arithmetical knowledge in the first place? The problem is already present at the level of number concepts. In the bootstrapping account, for example, counting lists and counting games are thought to be integral to the acquisition of number concepts. But how can such cultural practices develop if members of a culture do not already possess number concepts? In many cultures, this happened through cultural transmission from other cultures (Everett 2017). But this cannot be the case for all cultures. Most fundamentally, as Jean-Charles Pelland, among others, has observed, the question concerns the origin of numeral words. Is it possible that numeral words – and consequently counting lists – could have developed without there first being number concepts (Pelland 2018)?
In response to this question, the role of external representations of numbers has become an important topic in the literature (see, for example, Schlimm, 2018, 2021). To explain how external representations like numeral symbols and words can influence concept formation, many researchers have turned to the cultural evolution of practices and concepts. Particularly influential in this field is the theory of cumulative cultural evolution (Boyd and Richerson 1985; 2005; Henrich 2015; Heyes 2018; Tomasello 1999). The central idea of cumulative cultural evolution is that cultural developments frequently take place in small (trans-)generational increments. From this background, it has been argued that material engagement with our environment (Malafouris 2013) has been central to the cultural evolution of numeral words, numeral symbols, and number concepts (dos Santos 2021; Overmann 2023; Pantsar 2024a). The emergence of numeral symbols has been traced to around 8,000-4,500 B.C.E. in Elam and Mesopotamia, where clay tokens were used to represent quantities in accounting (Overmann 2018; Schmandt-Besserat 1996). At first, a corresponding array of tokens was sealed in a clay case to represent the quantity, but later the outside of the case was marked with a symbol to signal the contents (Ifrah 1998, xx). In the next phase, people bypassed the case and simply used the symbol (Nissen, Damerow, and Englund 1994).
For numeral words, similar connections to material practices have been identified. Not all languages have a system of numeral words. Hence, we must ask: how can numeral words be introduced into a language? Recently, César dos Santos (2021) has brought philosophical attention to the interesting case of the Hup language spoken in Amazonia, whose speakers seem to be in the process of developing a numeral word system. In the Hup language, the word for two means “eye quantity” and the word for three comes from a word for a three-chambered rubber plant seed (Epps 2006). Thus, in the Hup language, numeral words appear to be emerging from prototypic representations of quantities of natural phenomena. Lieven Decock (2008) has called such representations “canonical collections”. Canonical collections may also hold the key to understanding how number concepts have evolved. The linguist Heike Wiese (2007) has argued that numeral words and number concepts co-evolved. Within philosophy, this idea has been pursued by dos Santos (2021) and Pantsar (2024b). Canonical collections in natural phenomena, in addition to body parts like fingers, have provided references that proto-arithmetical numerosity representations can latch onto. The word for the rubber plant seed in the Hup language, for example, has gradually changed meaning to also concern a quantity. Such words can be connected to body part counting procedures, which makes them part of the counting procedure and, together with other similarly evolved words, part of the counting list. Through this process of co-evolution, the concept of number develops into the kind of exact notion of natural number that we have in arithmetic. Thus, the co-evolution of numeral words and number concepts is a proposed solution to the question presented by Pelland; numeral words can develop without there being prior number concepts because the concepts themselves can emerge as part of the same (or parallel) development.
Unfortunately, the emergence of numeral words and number concepts is, in most cases, impossible to trace. Similarly, the evolution of arithmetical operations is difficult to study due to a dearth of surviving material from the early stages. There are, however, important works that help improve our understanding of the evolution of numbers and arithmetic, including those by the mathematician Georges Ifrah (1998), the cognitive archaeologist Karenleigh Overmann (2023), and the anthropologist Caleb Everett (2017). History of mathematics also provides important insights concerning the development of arithmetic into the modern discipline with which we are familiar (for example, Merzbach and Boyer 2011). This kind of interdisciplinary research is highly philosophically relevant; it provides important material for recent epistemological accounts of arithmetic, like that of Pantsar (2024a). There are good reasons to expect this development to continue in the future, with the cultural development of arithmetic receiving more philosophical attention.
For the epistemology and ontology of arithmetic, both the ontogenetic proto-arithmetical foundations and the cultural development of arithmetical knowledge and skills are highly relevant. According to one view, mathematical objects like numbers are social constructs (Cole 2013; 2015; Feferman 2009). If this is the case, then an important question arises: what kind of social constructs are they? According to conventionalist philosophy of mathematics, which has found popularity in different forms ever since the early 20th century, mathematical truths are merely firmly entrenched conventions (Ayer 1970; Carnap 1937; Warren 2020; Wittgenstein 1978). Thus, one way of interpreting numbers as social constructs would be as parts of such conventions, which ultimately could be arbitrary. However, it has been argued that the proto-arithmetical origins of arithmetic (partly) determine its content, which means that arithmetical truths are (partly) based on evolutionarily developed cognitive architecture and not purely conventional (Pantsar 2021b).
e. Ordinal or Cardinal
Let us next consider whether knowledge of natural numbers is primarily ordinal or cardinal. In many languages, numeral symbols are connected to two types of numeral words, one cardinal (one, two, three, and so on) and the other ordinal (first, second, third, and so on). In the philosophy of mathematics, this is an old and much-discussed problem that traces back to Cantor (1883), who defined the cardinal numbers based on the ordinal numbers, which has been interpreted to imply the primacy of ordinals (Hallett 1988). Also, within set theory, ordinals are often seen as more fundamental, since the cardinal number of a set can be defined as the least ordinal number whose members can be put in a one-to-one correspondence with the members of the set (Assadian and Buijsman 2019, 565). Finally, in structuralist philosophy of mathematics, numbers are understood to be fundamentally places in the natural number structure. In Stewart Shapiro’s structuralist account, for example, natural numbers are explicitly defined in terms of their (ordinal) position in the natural number structure (Shapiro 1997, 72).
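The set-theoretic definition just mentioned can be written compactly. As a sketch (assuming the set is well-orderable, as it is under the axiom of choice, and writing $\approx$ for the existence of a one-to-one correspondence):

```latex
% The cardinal number of a set A, defined as the least
% ordinal whose members can be put in one-to-one
% correspondence with the members of A:
|A| \;=\; \min \{\, \alpha \in \mathrm{Ord} \;:\; \alpha \approx A \,\}
```

On this definition, every cardinal number simply is a particular ordinal, which illustrates the sense in which ordinals can be taken to be the more fundamental notion in set theory.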
Are the cognitive foundations of arithmetic relevant to whether our knowledge of natural numbers is primarily ordinal or cardinal? If the OTS and the ANS are indeed integral cognitive systems relevant to acquiring number concepts, this is at least plausible. Both subitizing and estimating detect the cardinality of the observed collection of objects, and there is nothing to suggest that the order of the occupied object files in the OTS, for example, influences numerosity determination. Therefore, our first numerosity representations are likely cardinal in nature. This also receives direct empirical support. Studies show that children who do not possess numeral words cannot make ordinal judgments while they are able to make (some) cardinal judgments (Brannon and Van de Walle 2001). It could be that the ordinal understanding of numerosities is something that only emerges through numeral words and the ordered counting lists they comprise.
Stefan Buijsman (2019) has suggested that the acquisition of the first number concepts depends on understanding claims of the logical form “there exists exactly one F”, which requires grasping the singular/plural distinction from syntactic clues. Since that distinction does not concern ordinality, this may likewise suggest that the first number concepts are cardinal in nature. Yet, Buijsman (2021) has also argued that acquisition of larger number concepts requires ordinal understanding, which fits well, for example, with the role of ordered counting lists in the bootstrapping process.
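The logical form Buijsman appeals to is the standard uniqueness claim of first-order logic, which can be spelled out using only quantifiers and identity:

```latex
% "There exists exactly one F": existence plus uniqueness.
\exists! x\, F(x)
\;\equiv\;
\exists x \bigl( F(x) \wedge \forall y ( F(y) \rightarrow y = x ) \bigr)
```

Since the right-hand side involves no notion of order, grasping claims of this form requires no ordinal resources, which is why it fits a cardinal-first reading of the earliest number concepts.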
f. Empirically-Informed Epistemology of Arithmetic
To conclude our discussion of arithmetic, we evaluate the philosophical significance of the work presented above. This is important because there is a potential counterargument to the claim that this work has any philosophical significance. Specifically, one can acknowledge that we conduct meaningful research on numerical cognition, yet insist, as Frege (1884) did, that such considerations only concern the context of discovery and not the philosophically important context of justification. From this perspective, arithmetical knowledge can be completely a priori and fit the rationalist paradigm; the empirical studies reviewed above merely concern developmental trajectories in acquiring the kind of conceptual and reasoning abilities required for arithmetical knowledge.
This potential counterargument should be taken seriously. After all, even within the kind of empirically-informed philosophy of arithmetic that this article presents, there are empirical dimensions that are not considered to be philosophically relevant. The fact that we need to have visual (or tactile) access to number symbols, for example, is generally not considered to make epistemology of arithmetic somehow empirical, even though it clearly connects arithmetical knowledge to sensory experience. The counterargument formulated above is essentially similar: what if most – perhaps even all – of the empirical data on numerical cognition is connected to arithmetical knowledge only within the context of discovery, and not relevant to the context of justification and the philosophically important characteristics of arithmetical knowledge?
This counterargument can be divided into two forms. First, one may agree that the empirical data on numerical cognition is epistemologically relevant but insist that there can be multiple routes to arithmetical knowledge; while people can acquire arithmetical knowledge based on proto-arithmetical abilities, this is not necessarily the case. According to this form of the counterargument, there is no path-dependency: arithmetical knowledge does not necessarily develop through proto-arithmetical abilities. Most advocates of empirically-informed philosophy of arithmetic accept this point. They are not claiming that arithmetical knowledge could not be acquired, at least in principle, through an essentially different ontogenetic path. What they maintain is that arithmetical knowledge standardly develops based on (one or more) proto-arithmetical abilities. Given that proto-arithmetical abilities are universal, this ontogenetic path is available to anyone who has access to the kind of enculturation required to move from proto-arithmetical abilities to proper arithmetic. But it is not a necessary path: we cannot rule out that somebody learns arithmetic entirely based on principles of equinumerosity and logic, as described by Frege (see also Linnebo 2018).
The second form of the counterargument is more serious. According to it, the empirical data on numerical cognition is not epistemologically relevant at all. This is similar to the way Frege (1884) dismissed the psychologistic theories of arithmetic of his contemporaries, though we should resist the temptation to speculate about what he would have thought of the kind of modern empirical research presented in this article. While this counterargument is rarely explicitly stated, many philosophers of mathematics seem implicitly to accept that the importance of empirical research is at best very limited, as demonstrated by the way the topic is ignored in most modern textbooks and encyclopaedia articles on the philosophy of mathematics (see, for example, Horsten 2023; Linnebo 2017).
While further work is needed to determine the epistemological consequences of empirical research on numerical cognition, some progress has already been made. Pantsar (2024a) has presented an epistemological account within which arithmetical knowledge is characterized as contextually a priori. According to this account, the experience of applying our proto-arithmetical abilities in ontogeny sets the context for developing arithmetical knowledge. While that context is thus constrained by our way of experiencing the world, within that context, arithmetical statements are thought to be knowable a priori. Therefore, arithmetical statements are neither refuted nor corroborated by observations. Distinguishing the account from that of Kitcher (1983), Pantsar does not claim that basic arithmetical truths (concerning finite natural numbers and their operations) are generalizations of operations in our environment. Instead, they are determined by our proto-arithmetical abilities. This also distinguishes the account from conventionalist views of mathematics. In his account, the reason for, say, 2 + 2 = 4 being an arithmetical truth is not that it is a firmly entrenched convention; instead, it is because our evolutionarily developed cognitive architecture—in this case the OTS—(partly) determines the domain of arithmetical truths.
3. Geometry
a. The Cognitive Foundations of Geometry
i. Proto-Geometrical Cognition
Historically, within the philosophy of mathematics, geometry has generally been seen in a similar light to arithmetic. In Ancient Greece, geometry was the paradigmatic field of mathematics and provided key content for Plato’s (The Republic) treatment of mathematics. Indeed, Euclid, who gathered the ancient Greek knowledge of geometry and arithmetic into his famous Elements (Euclid, 1956), treated arithmetic essentially as an extension of geometry, with numbers defined as lengths of line segments. Geometry and arithmetic are likewise treated similarly in the work of Kant (1787), for whom geometry is also a paradigmatic case of synthetic a priori knowledge. These philosophical views suggest the innateness of geometrical abilities. Recently, this suggestion has been supported by the psychologist Gallistel (1990), who takes both arithmetic and (Euclidean) geometry to be the result of innate cognitive mechanisms.
More recently still, though, nativist views concerning geometrical abilities have been contested. There have been philosophical accounts, like that of Ferreirós and García-Pérez (2020), that emphasize the cultural characteristics of Euclidean geometry. Yet philosophers have also pursued accounts, like the arithmetical ones discussed above, according to which geometry is based on proto-mathematical, evolutionarily developed, capacities. While the empirical research on proto-geometrical abilities has not been as extensive as that on proto-arithmetical abilities, there are important results that should be considered by philosophers. Potentially, these may provide at least a partial cognitive foundation for the ontogenetic, and perhaps also the phylogenetic and cultural, development of geometrical knowledge and skills. This kind of philosophical work has been discussed most extensively by Mateusz Hohol in his book Foundations of Geometric Cognition (Hohol 2019), but there are also articles focusing on similar approaches concerning the cognitive foundations of geometry (for example, Hohol & Miłkowski, 2019; Pantsar, 2022).
The state of the art in empirical research on proto-geometrical cognition is fundamentally similar to that on proto-arithmetical cognition: two proto-geometrical abilities have been identified in the literature. The first of these concerns shape recognition, the second concerns orientation. Similarly to the OTS and the ANS, it has been proposed that these two abilities are due to two different core cognitive systems (Spelke 2011). And just like in the case of arithmetic, both have been seen as forming (at least a partial) cognitive foundation for the development of geometry (Hohol 2019).
ii. Shape Recognition
Let us first focus on the ability to recognize geometric shapes, which Hohol (2019) calls object recognition (shape recognition is a more fitting term because it is not clear that all recognized shapes are treated cognitively as objects). For a long time, psychology was dominated by the views of Piaget (1960), according to which children are born with no conception of objects and consequently, no conception of shapes. This view started to be contested in the 1970s, as evidence of neonates recognizing geometrical shapes emerged (Schwartz, Day, and Cohen 1979). Since then, there have been various empirical reports of infants (Bomba and Siqueland 1983; Newcombe and Huttenlocher 2000), non-human animals (Spelke and Lee 2012), and members of isolated cultures (Dehaene et al. 2006) being sensitive to geometric shapes in their observations and behaviour. Importantly, these abilities are almost always reported in terms of Euclidean geometry. Véronique Izard and colleagues, for example, report that Munduruku adults and children show an “intuitive understanding [of] essential properties of Euclidean geometry” (Izard et al. 2011, 9782). This includes estimating the sum of the internal angles of a triangle to be roughly 180 degrees and judging that through a given point there is exactly one line parallel to a given line. Izard and Elizabeth Spelke (2009) have also described the different developmental stages in children’s learning of shape recognition in Euclidean terms.
Is there an innate ability, or at least a tendency, toward recognizing Euclidean geometric shapes? In addition to psychologists like Gallistel, some philosophers have supported strong nativist views of Euclidean representations (see, for example, Hatfield, 2003). However, as described by Izard and Spelke (2009), the shape recognition system does not have enough resources to fully represent Euclidean geometry. While preschool children can detect, for example, curved lines among straight lines, and right angles among different types of angles, there are many notions of Euclidean geometry that they are not sensitive to. These are typically higher-order properties, such as symmetry (Izard and Spelke 2009). Given the limitations of the preschoolers’ abilities, we need to ask whether it makes sense to call them geometrical in the first place. Mirroring the distinction between arithmetic and proto-arithmetic, it seems necessary to distinguish between proto-geometrical and properly geometrical abilities. This distinction takes proto-geometrical abilities to be evolutionarily developed and innate, while geometrical abilities are culturally developed. Unlike proto-geometrical abilities, proper geometrical abilities are not limited to specific characteristics of shapes.
This does not mean that geometrical ability requires knowledge of Euclidean geometry, or other axiomatic systems. Such a definition would classify most people as geometrically ignorant. While it is not possible to precisely define geometrical ability, we can characterize the difference between geometrical and proto-geometrical abilities. In terms of distinguishing between angles, for example, Izard and Spelke (2009) report preschool children as being able to distinguish a different angle regardless of whether it is presented among acute, straight, or obtuse angles. This kind of ability is far from the geometrical knowledge that, for example, the angles of a triangle are equal to two right angles. This latter kind of systematic knowledge about shapes and ways to understand them through precise notions, such as the size of an angle, should be considered geometrical. Under this characterization, it does not make sense to talk about proto-geometrical cognition being “Euclidean”. Only properly geometrical abilities can be Euclidean or non-Euclidean; proto-geometrical abilities are too imprecise to be so classified.
The core cognitive system for shape recognition should therefore be understood as proto-geometrical, but what kind of system is it, and what is the evidence for it? One key experiment reported that infants react to changes in angle size rather than to changes in the orientation of an angle (Cohen and Younger 1984). In this experiment, which has since been replicated by others (Lindskog et al. 2019), 6-week-old and 14-week-old infants were habituated to simple two-dimensional forms consisting of two lines that formed an angle. In the test trials, the angle (which was either 45 or 135 degrees) remained the same, but its orientation was changed. The eye movements of the infants showed that six-week-olds dishabituated (indicated by longer looking times) to a change in orientation, suggesting that they were surprised by the changing orientation. However, the fourteen-week-olds dishabituated to the angle size and not the orientation. If the angle stayed the same, they were not surprised by the next form being presented, but as soon as the angle size changed, their looking times became longer.
Cohen and Younger (1984) concluded that there has to be a developmental shift between the ages of six and fourteen weeks during which the infants start to recognize the geometric property of two lines being at a certain angle. Similar tests have been run on older children and it has been established that, starting from at least four years of age, children can consistently pick out a deviant geometric form, such as a different angle, from a collection of forms (Izard and Spelke 2009). Such results strongly imply that there is a core cognitive ability that enables shape recognition, and which develops with age without explicit understanding of geometry as a mathematical theory. This is also in line with data on members of the Amazonian Munduruku culture. Adults and children of at least four years of age have been reported to make similar shape discriminations as European and North American children (Izard et al. 2011). Hence there are good reasons to understand the core cognitive shape recognition ability as being proto-geometrical.
iii. Orientation
In addition to shape recognition, there are also extensive data on a proto-geometrical ability concerning orientation, which Hohol (2019) calls spatial navigation (orientation is a better term, given that spatial navigation also applies the shape recognition ability). In cognitive and comparative psychology, there is a long-standing idea that navigation in both humans and non-human animals is based on so-called “cognitive maps” (Tolman 1948). Cognitive maps are thought to be mental representations that animals form of new environments. In Tolman’s account, these representations are enduring, Euclidean, and independent of the location of the observer. Thus, the account can be seen as an early hypothesis suggesting a proto-geometrical orientation ability, even though under the present distinction between proto-geometrical and geometrical we should not call the representations Euclidean. Recently, these types of mental representations have been extensively criticized by the so-called radical enactivist philosophers (Hutto and Myin 2013), according to whom all representations are dependent on language. Among psychologists, though, the idea of cognitive maps has endured. Yet, according to the modern understanding, cognitive maps are not necessarily enduring, observer-free, or Euclidean (Spelke and Lee 2012). Instead, they can be momentary and tied to particular viewpoints, and contain “wormholes” that go against Euclidean characteristics (Rothman and Warren 2006; Wehner and Menzel 1990).
While the characteristics, and indeed the very existence, of cognitive maps are a topic of much debate, there is little doubt in the literature about the existence of the orientation ability with which they are connected. This ability is thought to represent distances and directions on large-scale surfaces and spaces. In one key early experiment reported by Ken Cheng (1986), rats were shown the location of buried food in a rectangular space. After being disoriented, they looked for the food almost equally at the right location and at the point rotated 180 degrees from it. This was remarkable, because the rectangular environment had distinct features in the corners, which the rats could have used to represent the space. It was only after training that rats started to use these features for navigation, suggesting that the orientation ability is the primary one used in spatial navigation. Similar behaviour has been reported for young children (Hermer and Spelke 1996; McGurk 1972), as well as non-human animals like ants (Wystrach and Beugnon 2009).
Moreover, there is evidence that the proto-geometrical ability for orientation is based on abstract representations that animals can use in a multi-modal manner. In one experiment, it was reported that rats navigate according to the shape of a chamber even in the dark (Quirk, Muller, and Kubie 1990). In addition, the orientation ability appears to override other changes in the environment. It has been reported that rats’ navigation in a chamber remains unchanged even with radical changes in the texture, material, and colour of the chamber (Lever et al. 2002). In the brain, the orienting ability has been strongly associated with the hippocampus (O’Keefe and Burgess 1996). Interestingly, some studies suggest that the spatial structure is mirrored in the location of neurons firing in the hippocampus. Colin Lever and colleagues, for example, report an experiment in which rats repeatedly exposed to two differently shaped environments develop different hippocampal place-cell representations (Lever et al. 2002). This is explained by there being “grid cells” located in the entorhinal cortex, which is the interface between the hippocampus and the neocortex. The grid cells are thought to be “activated whenever the animal’s position coincides with any vertex of a regular grid of equilateral triangles spanning the surface of the environment” (Hafting et al. 2005, 801). Such studies should not be seen as suggesting that spatial representations are generally expected to be mirrored in neural coding, but they do raise interesting questions about the way proto-geometrical representations are implemented in the brain. The connection between the hippocampus and spatial representations could also explain interesting phenomena in humans, including data showing that London taxi drivers – who presumably need extensive cognitive maps for orientation – have significantly larger posterior hippocampal volume than control subjects (Maguire et al. 2000).
Spatial navigation in small children and non-human animals is not conducted exclusively with the orientation ability, though. In navigation, animals also use cues from so-called “landmark objects,” applying the proto-geometrical ability for shape recognition (Izard et al. 2011). Object representations based on shape recognition are thought to differ in three ways from the spatial representations produced by the orientation system (Spelke and Lee 2012, 2789). First, the shape representations fail to capture absolute lengths and distances between parts. Second, they are “sense-invariant,” that is, they do not capture the difference between a shape and its mirror image. Third, the shape representations capture relationships between lengths and angles that allow distinguishing between shapes. The cognitive map of grid cells is thought to be anchored to landmarks as well, but it persists in their absence. This has been seen as evidence for the orientation ability being a distinct system from the shape recognition system (Hafting et al. 2005). Further support for this comes from there being distinct correlates both in neural (concerning brain areas) and cognitive (concerning learning strategies) terms for navigating based on landmarks (shape recognition) and extended surfaces (orientation), with data showing similar neural activity in rats and humans (Doeller, King, and Burgess 2008; Doeller and Burgess 2008; Spelke and Lee 2012).
b. The Development of Geometric Cognition
As in the case of arithmetic, the development of geometrical cognition needs to be divided into two questions, one concerning its ontogeny and the other its phylogeny and cultural history. Based on the kind of research summarized above, Hohol (2019) has argued that geometrical cognition has developed in a largely similar way to how arithmetical development was described in the enculturation account above. The two proto-geometrical core cognitive systems that we possess already in infancy and share with many non-human animals form a partial foundation of geometry, but the development of geometry would not be possible without human linguistic capacity, the ability to abstract, and the capacity to create and understand diagrams (Hohol 2019). Turning attention to these capacities—evidently exclusive to humans—it is apparent how they have shaped the development of geometry in phylogeny and cultural history, continuing to do so in ontogeny for every new generation.
It should be noted at this point that, just like in the case of arithmetic, the empirical literature on proto-geometrical cognition does not distinguish terminologically between proto-geometrical and geometrical abilities, or between proto-geometrical and geometrical representations. Virtually all articles report the abilities of young children and non-human animals as concerning “geometry.” As in the case of the epistemology of arithmetic, the conflation of two very different types of abilities and representations can be damaging for developing a proper philosophical understanding of the nature of geometrical knowledge.
To understand just how important that difference is, we must consider both the ontogeny and the phylogeny and cultural history. In terms of ontogeny, it is clear that acquiring proper geometrical knowledge requires a considerable amount of education even to grasp general geometrical notions like line and angle. Furthermore, many additional years of education must typically be completed before one can reach an understanding of geometry in the formal mathematical sense of a system of axioms and proofs based on them. In terms of phylogeny and cultural history, the matter is likely to be no less complex. If we accept that geometrical knowledge is based (partly) on proto-geometrical abilities, we are faced with the enormous challenge of explaining how these simple abilities for shape recognition and orientation have developed into axiomatic systems of geometry. This has led philosophers to criticize the notion of Spelke and colleagues (2010) that Euclidean geometry is “natural geometry.” Ferreirós and García-Pérez (2020) have argued that the gap between proto-geometrical abilities and Euclidean geometry is so wide that the latter cannot be called “natural” in any relevant sense. Instead, it is the product of a long line of cultural development in which cognitive artifacts (such as the ruler and the compass) and external representations (such as pictures and diagrams) have played crucial roles.
Ferreirós and García-Pérez (2020, 194) formulate a three-level model of the emergence of geometrical knowledge. The first level is called visuo-spatial cognition, which includes what has here been called proto-geometrical cognition. The second level they call “proto-geometry,” which, contrary to the use of this terminology in this article to refer to evolutionarily developed abilities, refers in their taxonomy to the development of basic concepts like circle and square. On this level, tools and external representations play a key role. Finally, on the third level we get actual geometry, which requires the systematic and specific development of the second level. While the conceptual distinctions made by Ferreirós and García-Pérez are partly different from the ones introduced in this article, their key content is compatible with the approach here. We should be careful not to confuse lower-level abilities with higher-level concepts. In terms of terminology, the most important consequence is that we cannot call the abilities of ants and rats “geometry,” as is often done in the literature (for example, Wystrach and Beugnon 2009). But the philosophically more important point is that geometry is so far removed from our evolutionarily developed cognitive abilities that any conception of Euclidean geometry as “natural” geometry is potentially problematic.
This is directly related to one key problem Ferreirós and García-Pérez see with the approach of Spelke and colleagues (2010), namely the way the latter describe Euclidean concepts as “extremely simple,” because “just five postulates, together with some axioms of logic, suffice to specify all the properties of points, lines, and forms” (Spelke, Lee, and Izard 2010, 2785). To challenge this idea, Ferreirós and García-Pérez (2020, 187) point out two ways in which Euclidean concepts are not as simple as they may seem. First, the logical foundation of a system of geometry must be much richer than that of Euclid’s five postulates (Manders 2008). Second, and more importantly for the present purposes, Euclidean geometry cannot be equated with modern formal Hilbert-type (Hilbert 1902) theories of geometry that focus on logical step-by-step proofs. Instead, Euclidean geometry is fundamentally based on proofs that use (lettered) diagrams (Manders 2008; Netz 1999). Hence, Ferreirós (2016) has argued that the Euclidean postulates cannot be considered similar to modern Hilbertian axioms. The upshot of this, Ferreirós and García-Pérez (2020, 188) argue, is that very different types of cognitive abilities are involved in the two ways of practicing geometry. While Hilbertian geometry can be seen as reasoning based on simple postulates, Euclidean geometry is possible only by means of artifacts (ruler and compass) that allow for the construction of lettered diagrams. Therefore, Euclidean concepts may not be so simple after all.
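Since the argument turns on just how much content Euclid’s five postulates carry, it may help to have them in view. The following is a standard modern paraphrase (based on the Heath translation cited as Euclid 1956), not a literal quotation:

```latex
% A standard modern paraphrase of Euclid's five postulates (not a literal translation).
\begin{enumerate}
  \item A straight line segment can be drawn joining any two points.
  \item A straight line segment can be extended indefinitely in a straight line.
  \item A circle can be drawn with any center and any radius.
  \item All right angles are equal to one another.
  \item If a straight line falling on two straight lines makes the interior angles
        on the same side less than two right angles, then the two lines, if extended
        indefinitely, meet on that side.
\end{enumerate}
```

The brevity of this list is what makes the postulates look “extremely simple”; the point pressed by Ferreirós and García-Pérez is that actual Euclidean proof practice, with its constructed and lettered diagrams, draws on far more than the list itself states.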
While this history of geometry is important to recognize, it does not imply that we cannot trace the development of Euclidean geometry from proto-geometrical abilities. Such an effort is made by Hohol (2019), who has identified embodiment, abstraction, and cognitive artifacts as key notions in explaining this development. Among other sources, he refers to the work of Lakoff and Núñez (2000) in explaining the process of abstraction. They describe abstraction as the creation of metaphorical counterparts of embodied processes in our environment. In the case of arithmetic, for example, addition is seen as the metaphorical counterpart of putting collections of physical objects together (p. 55). Geometrical concepts can plausibly be seen as based on this kind of abstraction: a line in geometry, for example, has no width and thus corresponds to no physical object, but it can be seen as an abstract counterpart of physically drawn lines.
In his analysis of cognitive artifacts, Hohol (2019) focuses on diagrams and formulae. As argued by Netz (1999), the introduction of lettered diagrams was central to the development of Greek geometry and its deductive method. While agreeing with this, Hohol and Miłkowski (2019) also emphasize the general role of linguistic formulae as cognitive artifacts in the development of geometry. It was only through such cognitive artifacts that ancient practitioners of geometry could communicate with each other and develop their knowledge intersubjectively.
Finally, the matter of non-Euclidean geometries should be discussed. While there are competing axiomatizations of arithmetic, their differences are not as fundamental as the difference between Euclidean and non-Euclidean geometries. In non-Euclidean geometries, the fifth postulate of Euclid (called the parallel postulate) is rejected. In its equivalent formulation known as Playfair’s axiom, the postulate states that for any line l and any point a not on l, there is exactly one line through a that does not intersect l. In hyperbolic geometry, there are infinitely many such non-intersecting lines; in elliptic geometry, all lines through a intersect l.
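The contrast between the three geometries can be stated compactly. The following restatement in modern notation is an illustration added here, not a quotation from the cited sources; it writes P(a, l) for the set of lines through a that do not intersect l:

```latex
% Playfair-style formulation of the parallel postulate and its variants.
% Let $l$ be a line, $a$ a point not on $l$, and
% $P(a,l) = \{\, m : m \text{ is a line through } a \text{ and } m \cap l = \emptyset \,\}$.
\begin{align*}
\text{Euclidean geometry:}  \quad & |P(a,l)| = 1      \\
\text{Hyperbolic geometry:} \quad & |P(a,l)| = \infty \\
\text{Elliptic geometry:}   \quad & |P(a,l)| = 0
\end{align*}
```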
From the perspective of cognitive foundations, how can we account for non-Euclidean geometries? This problem used to be mainly academic, but it gained importance when Einstein used non-Euclidean Riemannian geometry in his general theory of relativity. Currently, our best theory of macro-level physics applies a non-Euclidean geometry (even though the effects of general relativity tend to be detectable only in phenomena on a much larger scale than our everyday experiences). If, as argued by Spelke and others, our “natural” geometry is Euclidean, how is it possible that this natural geometry is not the geometry of the world, so to speak? Should this be seen as evidence against Euclidean geometry being natural in the sense of being based on our basic cognitive architecture, that is, our proto-geometrical abilities? While such connections may be tempting to make, it is important not to read too much into proto-geometrical origins. First, we must remember that Euclidean geometry is a distant development from proto-geometrical origins and, as such, already contains much that is not present in our basic cognitive architecture. Non-Euclidean geometries may simply be a further step in the cultural development of geometry. Second, it is possible that our basic proto-geometrical abilities are proto-Euclidean in the specific sense of agreeing with the intuitive content of the parallel postulate. The geometrical structure of the world, on the other hand, may differ from that intuitive content.
As in the case of arithmetic, it is important to note that this type of empirically-informed epistemology emerged only over the last decades of the 20th century. So far, relatively little has been written explicitly about the connection between the foundations of geometrical cognition and the nature of geometrical knowledge. Indeed, even less has been written about geometrical knowledge than about arithmetical knowledge. There are at least two reasons for this. First, the empirical data relevant to arithmetical cognition are currently stronger in both quantity and quality. Second, philosophers have been working for longer on the cognitive foundations of arithmetic. There is, however, no reason to believe that geometrical knowledge will not be given an empirically-informed epistemological treatment similar to that provided for arithmetical knowledge. In fact, many of the considerations relevant to arithmetic seem to be applicable, mutatis mutandis, to geometry. For example, the point made above about the path-dependency of arithmetical knowledge also applies to geometrical knowledge: while there is growing evidence that geometrical knowledge is at least partly based on evolutionarily developed proto-geometrical abilities, that does not imply that this is the only ontogenetic or cultural-historical trajectory leading to geometrical knowledge. It is at least in principle possible that geometrical knowledge can be acquired and developed independently of proto-geometrical abilities. Yet there are good reasons to think that, standardly, children apply their proto-geometrical abilities in learning geometry. Exactly how this happens is a question that demands much more work, and progress in this work will most likely make it increasingly philosophically relevant. It is to be expected that in coming years both fields will receive increasing attention from philosophers.
4. References and Further Reading
Anderson, Michael. 2015. After Phrenology: Neural Reuse and the Interactive Brain. Cambridge, MA: MIT Press.
Anderson, Michael. 2010. “Neural Reuse: A Fundamental Organizational Principle of the Brain.” Behavioral and Brain Sciences 33 (4): 245–66.
Assadian, Bahram, and Stefan Buijsman. 2019. “Are the Natural Numbers Fundamentally Ordinals?” Philosophy and Phenomenological Research 99 (3): 564–80. https://doi.org/10.1111/phpr.12499.
Ayer, Alfred Jules. 1970. Language, Truth and Logic. Unabridged republication of the 2nd (1946) ed. New York, NY: Dover Publications.
Barrocas, Roberta, Stephanie Roesch, Caterina Gawrilow, and Korbinian Moeller. 2020. “Putting a Finger on Numerical Development – Reviewing the Contributions of Kindergarten Finger Gnosis and Fine Motor Skills to Numerical Abilities.” Frontiers in Psychology 11:1012. https://doi.org/10.3389/fpsyg.2020.01012.
Benacerraf, Paul. 1965. “What Numbers Could Not Be.” The Philosophical Review 74 (1): 47–73. https://doi.org/10.2307/2183530.
Bender, Andrea, and Sieghard Beller. 2012. “Nature and Culture of Finger Counting: Diversity and Representational Effects of an Embodied Cognitive Tool.” Cognition 124 (2): 156–82. https://doi.org/10.1016/j.cognition.2012.05.005.
Bomba, Paul C., and Einar R. Siqueland. 1983. “The Nature and Structure of Infant Form Categories.” Journal of Experimental Child Psychology 35 (2): 294–328. https://doi.org/10.1016/0022-0965(83)90085-1.
Boyd, Robert and Peter J. Richerson. 1985. Culture and the Evolutionary Process. Chicago: University of Chicago Press.
Boyd, Robert and Peter J. Richerson 2005. Not by Genes Alone. Chicago: University of Chicago Press.
Brannon, Elizabeth M., and Gretchen A. Van de Walle. 2001. “The Development of Ordinal Numerical Competence in Young Children.” Cognitive Psychology 43 (1): 53–81. https://doi.org/10.1006/cogp.2001.0756.
Buijsman, Stefan. 2019. “Learning the Natural Numbers as a Child.” Noûs 53 (1): 3–22.
Buijsman, Stefan. 2021. “How Do We Semantically Individuate Natural Numbers?” Philosophia Mathematica 29 (2): 214–33. https://doi.org/10.1093/philmat/nkab001.
Butterworth, Brian. 1999. What Counts: How Every Brain Is Hardwired for Math. New York: The Free Press.
Carey, Susan. 2004. “Bootstrapping & the Origin of Concepts.” Daedalus 133 (1): 59–68.
Carey, Susan. 2009. The Origin of Concepts. Oxford: Oxford University Press.
Carnap, Rudolf. 1937. The Logical Syntax of Language. Open Court Classics. Chicago, Ill: Open Court.
Cheng, Ken. 1986. “A Purely Geometric Module in the Rat’s Spatial Representation.” Cognition 23 (2): 149–78. https://doi.org/10.1016/0010-0277(86)90041-7.
Cheyette, Samuel J., and Steven T. Piantadosi. 2020. “A Unified Account of Numerosity Perception.” Nature Human Behaviour 4 (12): 1265–72. https://doi.org/10.1038/s41562-020-00946-0.
Clark, Andy, and David Chalmers. 1998. “The Extended Mind.” Analysis 58 (1): 7–19.
Clarke, Sam, and Jacob Beck. 2021. “The Number Sense Represents (Rational) Numbers.” Behavioral and Brain Sciences, April, 1–57. https://doi.org/10.1017/S0140525X21000571.
Cohen, Leslie B., and Barbara A. Younger. 1984. “Infant Perception of Angular Relations.” Infant Behavior and Development 7: 37–47.
Cole, Julian C. 2013. “Towards an Institutional Account of the Objectivity, Necessity, and Atemporality of Mathematics.” Philosophia Mathematica 21 (1): 9–36. https://doi.org/10.1093/philmat/nks019.
Cole, Julian C. 2015. “Social Construction, Mathematics, and the Collective Imposition of Function onto Reality.” Erkenntnis 80 (6): 1101–24. https://doi.org/10.1007/s10670-014-9708-8.
Davidson, Kathryn, Kortney Eng, and David Barner. 2012. “Does Learning to Count Involve a Semantic Induction?” Cognition 123: 162–73.
De Cruz, Helen and Johan De Smedt. 2010. “The Innateness Hypothesis and Mathematical Concepts.” Topoi 29 (1): 3–13.
De Cruz, Helen. 2008. “An Extended Mind Perspective on Natural Number Representation.” Philosophical Psychology 21 (4): 475–90. https://doi.org/10.1080/09515080802285289.
Decock, Lieven. 2008. “The Conceptual Basis of Numerical Abilities: One-to-One Correspondence Versus the Successor Relation.” Philosophical Psychology 21 (4): 459–73. https://doi.org/10.1080/09515080802285255.
Dehaene, Stanislas. 1997. The Number Sense: How the Mind Creates Mathematics. 2nd ed. New York: Oxford University Press.
Dehaene, Stanislas. 2009. Reading in the Brain: The New Science of How We Read. London: Penguin.
Dehaene, Stanislas, Véronique Izard, Elizabeth Spelke, and Pierre Pica. 2008. “Log or Linear? Distinct Intuitions of the Number Scale in Western and Amazonian Indigene Cultures.” Science 320: 1217–20.
Dehaene, Stanislas. 2011. The Number Sense: How the Mind Creates Mathematics, Revised and Updated Edition. Revised, Updated ed. edition. New York: Oxford University Press.
Dehaene, Stanislas, Véronique Izard, Pierre Pica, and Elizabeth Spelke. 2006. “Core Knowledge of Geometry in an Amazonian Indigene Group.” Science 311 (5759): 381–84. https://doi.org/10.1126/science.1121739.
DeLong, Caroline M., Stephanie Barbato, Taylor O’Leary, and K. Tyler Wilcox. 2017. “Small and Large Number Discrimination in Goldfish (Carassius Auratus) with Extensive Training.” Behavioural Processes, The Cognition of Fish, 141 (August):172–83. https://doi.org/10.1016/j.beproc.2016.11.011.
Doeller, Christian F., and Neil Burgess. 2008. “Distinct Error-Correcting and Incidental Learning of Location Relative to Landmarks and Boundaries.” Proceedings of the National Academy of Sciences 105 (15): 5909–14. https://doi.org/10.1073/pnas.0711433105.
Doeller, Christian F., John A. King, and Neil Burgess. 2008. “Parallel Striatal and Hippocampal Systems for Landmarks and Boundaries in Spatial Memory.” Proceedings of the National Academy of Sciences 105 (15): 5915–20. https://doi.org/10.1073/pnas.0801489105.
Epps, Patience. 2006. “Growing a Numeral System: The Historical Development of Numerals in an Amazonian Language Family.” Diachronica 23 (2): 259–88. https://doi.org/10.1075/dia.23.2.03epp.
Euclid. 1956. The Thirteen Books of Euclid’s Elements. Vol. 1: Introduction and Books I, II. Second edition revised with additions. Vol. 1. New York: Dover Publications.
Everett, Caleb. 2017. Numbers and the Making of Us: Counting and the Course of Human Cultures. Harvard University Press.
Everett, Caleb, and Keren Madora. 2012. “Quantity Recognition Among Speakers of an Anumeric Language.” Cognitive Science 36 (1): 130–41. https://doi.org/10.1111/j.1551-6709.2011.01209.x.
Fabry, Regina E. 2018. “Betwixt and between: The Enculturated Predictive Processing Approach to Cognition.” Synthese 195 (6): 2483–2518.
Fabry, Regina E. 2020. “The Cerebral, Extra-Cerebral Bodily, and Socio-Cultural Dimensions of Enculturated Arithmetical Cognition.” Synthese 197:3685–3720.
Fabry, Regina E., and Markus Pantsar. 2021. “A Fresh Look at Research Strategies in Computational Cognitive Science: The Case of Enculturated Mathematical Problem Solving.” Synthese 198 (4): 3221–63. https://doi.org/10.1007/s11229-019-02276-9.
Fechner, Gustav Theodor. 1948. “Elements of Psychophysics, 1860.” In Readings in the History of Psychology, 206–13. Century Psychology Series. East Norwalk, CT, US: Appleton-Century-Crofts. https://doi.org/10.1037/11304-026.
Feferman, Solomon. 2009. “Conceptions of the Continuum.” Intellectica 51 (1): 169–89.
Feigenson, Lisa, Stanislas Dehaene, and Elizabeth Spelke. 2004. “Core Systems of Number.” Trends in Cognitive Sciences 8 (7): 307–14.
Ferreirós, José. 2016. Mathematical Knowledge and the Interplay of Practices. Princeton: Princeton University Press.
Ferreirós, José, and Manuel J. García-Pérez. 2020. “Beyond Natural Geometry: On the Nature of Proto-Geometry.” Philosophical Psychology 33 (2): 181–205.
Frank, Michael C., Daniel L. Everett, Evelina Fedorenko, and Edward Gibson. 2008. “Number as a Cognitive Technology: Evidence from Pirahã Language and Cognition.” Cognition 108 (3): 819–24. https://doi.org/10.1016/j.cognition.2008.04.007.
Frege, Gottlob. 1879. “Begriffsschrift.” In From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, edited by J. van Heijenoort, 1–82. Cambridge, MA: Harvard University Press.
Frege, Gottlob. 1884. The Foundations of Arithmetic. Oxford: Basil Blackwell.
Fuson, Karen C. 1987. Children’s Counting and Concepts of Number. New York: Springer.
Fuson, Karen C., and Walter G. Secada. 1986. “Teaching Children to Add by Counting-On with One-Handed Finger Patterns.” Cognition and Instruction 3 (3): 229–60.
Gallistel, Charles R. 1990. The Organization of Learning. Cambridge, MA: MIT Press.
Gallistel, Charles R. 2017. “Numbers and Brains.” Learning & Behaviour 45 (4): 327–28.
Gelman, Rochel and Charles R. Gallistel. 2004. “Language and the Origin of Numerical Concepts.” Science 306: 441–43.
Goodman, Nelson. 1955. Fact, Fiction, and Forecast. 2nd ed. Cambridge, MA: Harvard University Press.
Gordon, Peter. 2004. “Numerical Cognition without Words: Evidence from Amazonia.” Science 306 (5695): 496–99.
Hafting, Torkel, Marianne Fyhn, Sturla Molden, May-Britt Moser, and Edvard I. Moser. 2005. “Microstructure of a Spatial Map in the Entorhinal Cortex.” Nature 436 (7052): 801–6. https://doi.org/10.1038/nature03721.
Hallett, Michael. 1988. Cantorian Set Theory and Limitation of Size. Oxford Logic Guides 10. Oxford [England] : New York: Clarendon Press ; Oxford University Press.
Hatfield, Gary. 2003. The Natural and the Normative: Theories of Spatial Perception from Kant to Helmholtz. The MIT Press. https://doi.org/10.7551/mitpress/4852.001.0001.
Henrich, Joseph. 2015. The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter. Princeton University Press.
Hermer, Linda, and Elizabeth Spelke. 1996. “Modularity and Development: The Case of Spatial Reorientation.” Cognition 61 (3): 195–232. https://doi.org/10.1016/S0010-0277(96)00714-7.
Heyes, Cecilia. 2018. Cognitive Gadgets: The Cultural Evolution of Thinking. Cambridge: Harvard University Press.
Hilbert, David. 1902. The Foundations of Geometry. Open court publishing Company.
Hohol, Mateusz. 2019. Foundations of Geometric Cognition. New York: Routledge.
Hohol, Mateusz, and Marcin Miłkowski. 2019. “Cognitive Artifacts for Geometric Reasoning.” Foundations of Science 24 (4): 657–80. https://doi.org/10.1007/s10699-019-09603-w.
Horsten, Leon. 2023. “Philosophy of Mathematics.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta and Uri Nodelman, Winter 2023. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/win2023/entries/philosophy-mathematics/.
Howard, Scarlett R., Aurore Avarguès-Weber, Jair E. Garcia, Andrew D. Greentree, and Adrian G. Dyer. 2019. “Numerical Cognition in Honeybees Enables Addition and Subtraction.” Science Advances 5 (2): eaav0961. https://doi.org/10.1126/sciadv.aav0961.
Hutto, Daniel D. 2019. “Re-Doing the Math: Making Enactivism Add Up.” Philosophical Studies 176: 827–37.
Hutto, Daniel D., and Erik Myin. 2013. Radicalizing Enactivism. Basic Minds without Content. Cambridge, MA: MIT Press.
Hutto, Daniel D., and Erik Myin. 2017. Evolving enactivism. Basic minds meet content. Cambridge, MA: MIT Press.
Hyde, Daniel C. 2011. “Two Systems of Non-Symbolic Numerical Cognition.” Frontiers in Human Neuroscience 5: 150.
Ifrah, Georges. 1998. The Universal History of Numbers: From Prehistory to the Invention of the Computer. London: Harville Press.
Izard, Véronique, Pierre Pica, Elizabeth S. Spelke, and Stanislas Dehaene. 2011. “Flexible Intuitions of Euclidean Geometry in an Amazonian Indigene Group.” Proceedings of the National Academy of Sciences 108 (24): 9782–87.
Izard, Véronique, and Elizabeth S. Spelke. 2009. “Development of Sensitivity to Geometry in Visual Forms.” Human Evolution 23 (3): 213.
Izard, Véronique, Pierre Pica, Stanislas Dehaene, Danielle Hinchey, and Elizabeth Spelke. 2011. “Geometry as a Universal Mental Construction.” In Space, Time and Number in the Brain, edited by Stanislas Dehaene and Elizabeth M. Brannon, 319–32. San Diego: Academic Press. https://doi.org/10.1016/B978-0-12-385948-8.00019-0.
Johansen, Mikkel W., and Morten Misfeldt. 2020. “Material Representations in Mathematical Research Practice.” Synthese 197 (9): 3721–41. https://doi.org/10.1007/s11229-018-02033-4.
Jones, Max. 2020. “Numerals and Neural Reuse.” Synthese 197: 3657–81.
Kant, Immanuel. 1787. Critique of Pure Reason. Cambridge University Press.
Kaufman, Edna L., Miles W. Lord, Thomas W. Reese, and John Volkmann. 1949. “The Discrimination of Visual Number.” The American Journal of Psychology 62: 498–525. https://doi.org/10.2307/1418556.
Kitcher, Philip. 1983. The Nature of Mathematical Knowledge. New York: Oxford University Press.
Knops, Andre. 2020. Numerical Cognition. The Basics. New York: Routledge.
Kripke, Saul A. 1982. Wittgenstein on Rules and Private Language: An Elementary Exposition. Harvard University Press.
Lakoff, George, and Rafael Núñez. 2000. Where Mathematics Comes From. New York: Basic Books.
Lee, Michael D., and Barbara W. Sarnecka. 2010. “A Model of Knower‐level Behavior in Number Concept Development.” Cognitive Science 34 (1): 51–67.
Lee, Michael D., and Barbara W. Sarnecka. 2011. “Number-Knower Levels in Young Children: Insights from Bayesian Modeling.” Cognition 120 (3): 391–402.
Lever, Colin, Tom Wills, Francesca Cacucci, Neil Burgess, and John O’Keefe. 2002. “Long-Term Plasticity in Hippocampal Place-Cell Representation of Environmental Geometry.” Nature 416 (6876): 90.
Lindskog, Marcus, Maria Rogell, Gustaf Gredebäck and Ben Kenward. 2019. “Discrimination of Small Forms in a Deviant-Detection Paradigm by 10-Month-Old Infants.” Frontiers in Psychology 10: 1032.
Linnebo, Øystein. 2017. Philosophy of Mathematics. Princeton: Princeton University Press.
Linnebo, Øystein. 2018. Thin Objects. Oxford: Oxford University Press.
Maddy, Penelope. 1990. Realism in Mathematics. Oxford: Oxford University Press.
Maguire, Eleanor A., David G. Gadian, Ingrid S. Johnsrude, Catriona D. Good, John Ashburner, Richard S. J. Frackowiak, and Christopher D. Frith. 2000. “Navigation-Related Structural Change in the Hippocampi of Taxi Drivers.” Proceedings of the National Academy of Sciences 97 (8): 4398–4403. https://doi.org/10.1073/pnas.070039597.
Malafouris, Lambros. 2013. How Things Shape the Mind: A Theory of Material Engagement. Cambridge, MA, USA: MIT Press.
Manders, Kenneth. 2008. “The Euclidean Diagram.” In The Philosophy of Mathematical Practice, edited by P. Mancosu, 80–133. Oxford: Oxford University Press.
Margolis, Eric and Stephen Laurence. 2008. “How to Learn the Natural Numbers: Inductive Inference and the Acquisition of Number Concepts.” Cognition 106: 924–39.
McGurk, Harry. 1972. “Infant Discrimination of Orientation.” Journal of Experimental Child Psychology 14 (1): 151–64. https://doi.org/10.1016/0022-0965(72)90040-9.
Menary, Richard. 2015. Mathematical Cognition: A Case of Enculturation. Frankfurt am Main: Open MIND, MIND Group.
Merzbach, Uta C., and Carl B. Boyer. 2011. A History of Mathematics. 3rd ed. Hoboken, N.J: John Wiley.
Mill, John Stuart. 1843. “A System of Logic.” In Collected Works of John Stuart Mill, edited by J.M. Robson. Vol. vols. 7 & 8. Toronto: University of Toronto Press.
Miller, Kevin F., Catherine M. Smith, Jianjun Zhu, and Houcan Zhang. 1995. “Preschool Origins of Cross-National Differences in Mathematical Competence: The Role of Number-Naming Systems.” Psychological Science 6 (1): 56–60.
Netz, Reviel. 1999. The Shaping of Deduction in Greek Mathematics. Cambridge, UK: Cambridge University Press.
Newcombe, Nora S., and Janellen Huttenlocher. 2000. Making Space: The Development of Spatial Representation and Reasoning. 1st edition. Cambridge, Mass: Bradford Books.
Nieder, Andreas, and Stanislas Dehaene. 2009. “Representation of Number in the Brain.” Annual Review of Neuroscience 32: 185–208.
Nieder, Andreas. 2019. A Brain for Numbers: The Biology of the Number Instinct. Illustrated edition. Cambridge, Massachusetts: The MIT Press.
Nissen, Hans J., Peter Damerow, and Robert K. Englund. 1994. Archaic Bookkeeping: Early Writing and Techniques of Economic Administration in the Ancient Near East. Translated by Paul Larsen. 1st edition. Chicago, Ill: University of Chicago Press.
Noël, Marie-Pascale. 2005. “Finger Gnosia: A Predictor of Numerical Abilities in Children?” Child Neuropsychology 11 (5): 413–30. https://doi.org/10.1080/09297040590951550.
Noles, Nicholaus S., Brian J. Scholl, and Stephen R. Mitroff. 2005. “The Persistence of Object File Representations.” Perception & Psychophysics 67 (2): 324–34. https://doi.org/10.3758/BF03206495.
Núñez, Rafael E. 2017. “Is There Really an Evolved Capacity for Number?” Trends in Cognitive Science 21:409–24.
O’Keefe, John, and Neil Burgess. 1996. “Geometric Determinants of the Place Fields of Hippocampal Neurons.” Nature 381 (6581): 425–28. https://doi.org/10.1038/381425a0.
Overmann, Karenleigh A. 2018. “Constructing a Concept of Number.” Journal of Numerical Cognition 4 (2).
Overmann, Karenleigh A. 2023. The Materiality of Numbers: Emergence and Elaboration from Prehistory to Present. Cambridge ; New York, NY: Cambridge University Press.
Pantsar, Markus and Catarina Dutilh Novaes. 2020. “Synthese Special Issue: Mathematical Cognition and Enculturation.” Synthese 197. https://doi.org/10.1007/s11229-019-02478-1.
Pantsar, Markus. 2014. “An Empirically Feasible Approach to the Epistemology of Arithmetic.” Synthese 191 (17): 4201–29. https://doi.org/10.1007/s11229-014-0526-y.
Pantsar, Markus. 2018. “Early Numerical Cognition and Mathematical Processes.” THEORIA. Revista de Teoría, Historia y Fundamentos de La Ciencia 33 (2): 285–304.
Pantsar, Markus. 2019. “The Enculturated Move from Proto-Arithmetic to Arithmetic.” Frontiers in Psychology 10:1454.
Pantsar, Markus. 2020. “Mathematical Cognition and Enculturation: Introduction to the Synthese Special Issue.” Synthese 197 (9): 3647–55. https://doi.org/10.1007/s11229-019-02478-1.
Pantsar, Markus. 2021a. “Bootstrapping of Integer Concepts: The Stronger Deviant-Interpretation Challenge.” Synthese 199 (3–4): 5791–5814. https://doi.org/10.1007/s11229-021-03046-2.
Pantsar, Markus. 2021b. “Objectivity in Mathematics, Without Mathematical Objects.” Philosophia Mathematica 29 (3): 318–52. https://doi.org/10.1093/philmat/nkab010.
Pantsar, Markus. 2022. “On the Development of Geometric Cognition: Beyond Nature vs. Nurture.” Philosophical Psychology 35 (4): 595–616. https://doi.org/10.1080/09515089.2021.2014441.
Pantsar, Markus. 2024a. Numerical Cognition and the Epistemology of Arithmetic. Cambridge University Press.
Pantsar, Markus. 2024b. “Why Do Numbers Exist? A Psychologist Constructivist Account.” Inquiry 0 (0): 1–33. https://doi.org/10.1080/0020174X.2024.2305386.
Peano, Giuseppe. 1889. “The Principles of Arithmetic, Presented by a New Method.” In Selected Works of Giuseppe Peano, edited by H. Kennedy, 101–34. Toronto; Buffalo: University of Toronto Press.
Pelland, Jean-Charles. 2018. “Which Came First, the Number or the Numeral?” In Naturalizing Logico-Mathematical Knowledge: Approaches from Philosophy, Psychology and Cognitive Science, edited by S. Bangu, 179–94. New York and London: Routledge.
Penner-Wilger, Marcie, and Michael L. Anderson. 2013. “The Relation between Finger Gnosis and Mathematical Ability: Why Redeployment of Neural Circuits Best Explains the Finding.” Frontiers in Psychology 4 (December):877. https://doi.org/10.3389/fpsyg.2013.00877.
Piaget, Jean. 1960. The Child’s Conception of Geometry. New York: Basic Books.
Piaget, Jean. 1965. The Child’s Conception of Number. New York: W. W. Norton & Company.
Pica, Pierre, Cathy Lemer, Véronique Izard, and Stanislas Dehaene. 2004. “Exact and Approximate Arithmetic in an Amazonian Indigene Group.” Science 306 (5695): 499–503.
Plato. 1992. The Republic. Translated by G. M. A. Grube. 2nd ed. Indianapolis: Hackett Publishing Company.
Quinon, Paula. 2021. “Cognitive Structuralism: Explaining the Regularity of the Natural Numbers Progression.” Review of Philosophy and Psychology. Springer. https://link.springer.com/article/10.1007/s13164-021-00524-x.
Quirk, Gregory J., Robert U. Muller, and John L. Kubie. 1990. “The Firing of Hippocampal Place Cells in the Dark Depends on the Rat’s Recent Experience.” Journal of Neuroscience 10 (6): 2008–17.
Rips, Lance J., Amber Bloomfield, and Jennifer Asmuth. 2008. “From Numerical Concepts to Concepts of Number.” Behavioral and Brain Sciences 31 (6): 623–42. https://doi.org/10.1017/S0140525X08005566.
Rips, Lance J., Jennifer Asmuth, and Amber Bloomfield. 2006. “Giving the Boot to the Bootstrap: How Not to Learn the Natural Numbers.” Cognition 101 (3): 51–60.
Rothman, Daniel B., and William H. Warren. 2006. “Wormholes in Virtual Reality and the Geometry of Cognitive Maps.” Journal of Vision 6 (6): 143. https://doi.org/10.1167/6.6.143.
Rugani, Rosa, Laura Fontanari, Eleonora Simoni, Lucia Regolin, and Giorgio Vallortigara. 2009. “Arithmetic in Newborn Chicks.” Proceedings of the Royal Society B: Biological Sciences 276 (1666): 2451–60.
dos Santos, César Frederico. 2021. “Enculturation and the Historical Origins of Number Words and Concepts.” Synthese, June. https://doi.org/10.1007/s11229-021-03202-8.
Schlimm, Dirk. 2018. “Numbers Through Numerals: The Constitutive Role of External Representations.” In Naturalizing Logico-Mathematical Knowledge. Routledge.
Stanisław Leśniewski (1886-1939) was a Polish logician and philosopher, co-founder with his colleague Jan Łukasiewicz of one of the most active logic centers of the twentieth century: the Warsaw School of Logic. As an alternative to Whitehead’s and Russell’s Principia Mathematica, he developed his own program for the foundations of mathematics on the basis of three systems. The first, called ‘Protothetic’, is a quantified propositional logic. The second, called ‘Ontology’, is a modernized, higher-order version of term logic. The last and most famous one is a general theory of parts and wholes, called ‘Mereology’. His concern for rigor in analysis and formalization led him to a logical work remarkable in its generality and precision. As a nominalist, he developed one of the major attempts to provide nominalistically acceptable foundations of mathematics. Although his logical systems have not been widely adopted and remain on the margins of standard logic, many of his views and innovations have greatly influenced the progress of logic: his conception of higher-order quantification, his development of a free and plural logic, his outline of natural deduction, his concern for the distinctions between use and mention and between language and meta-language, his canons of good definition, his formalization of the theory of parts and wholes. All this makes him one of the key figures of twentieth-century logic.
Stanisław Leśniewski was born on March 30, 1886, to Polish parents in Serpukhov, a small Russian town near Moscow. His father, a railway engineer, moved the family from one construction site to another, and young Leśniewski came to attend the Gymnasium in the Siberian city of Irkutsk. Between 1904 and 1910, he pursued studies in mathematics and philosophy in St. Petersburg, as well as at German-speaking universities in Leipzig, Heidelberg, Zurich, and Munich. Moving in 1910 to the University of Lvov (then a Polish city in Austria-Hungary, later known as Lviv in Ukraine), he obtained his doctorate in two years under the supervision of Kazimierz Twardowski, with a dissertation on the analysis of existential propositions.
Like many Polish philosophers of his time, Leśniewski was deeply influenced by Twardowski. Though he would later diverge from his master’s philosophical views, the rigorous spirit and quest for the greatest linguistic precision instilled by Twardowski, inherited from Brentano, permeated Leśniewski’s entire body of work. A pivotal moment in Leśniewski’s intellectual development occurred in 1911, when he encountered symbolic logic and Russell’s paradox through Jan Łukasiewicz’s book, On the Principle of Contradiction in Aristotle. In the ensuing years, Leśniewski published several papers, mainly devoted to the analysis of existential propositions, to the excluded middle and to the principle of contradiction.
At the outbreak of World War I, Poland found itself in the midst of the conflict, prompting Leśniewski’s decision to return to Russia. There, he took up teaching positions in Polish schools located in Moscow. It was during this period that he published his initial analysis of Russell’s paradox (1914) and formulated the first version of his Mereology (1916). Leśniewski’s Mereology is a theory of parts and wholes. It introduces the notion of collective class, a concrete notion of class elaborated by Leśniewski directly against Cantor’s sets, Frege’s extensions of concepts and Russell’s and Whitehead’s classes as incomplete symbols. Constituting the initial phase of his work, all the papers from 1911 to 1916 were characterized by an informal style, almost devoid of symbolic notation.
With the advent of the Bolshevik Revolution, Leśniewski departed Russia and permanently settled in Poland. After an initial attempt to obtain his habilitation in Lvov, he eventually attained it in 1918 at the University of Warsaw. During the years 1919-1921, Leśniewski played a role as a code breaker in Poland’s efforts to thwart the Red Army’s advance on the newly independent nation. By the war’s end, Warsaw University had emerged as a significant center for mathematics. In 1919, Leśniewski accepted a chair especially established for him, dedicated to the foundations of mathematics. Together with his colleague Jan Łukasiewicz, he co-founded the Warsaw School of Logic, which was to be the most important center for symbolic logic during the interwar period. Leśniewski and Łukasiewicz attracted exceptionally talented students, including the young Alfred Tarski, who would be Leśniewski’s sole doctoral student throughout his career.
From 1919 until his passing in 1939, Leśniewski consistently taught and refined his principal logical achievements: the three systems known as ‘Protothetic’ (a generalized version of propositional logic, encompassing quantification), ‘Ontology’ (a modern version of term logic), and ‘Mereology’. However, Leśniewski’s perfectionism hindered him from promptly publishing his results, as he insisted on attaining the utmost precision. His decision to employ logical formal tools stemmed from the desire to express his philosophical intuitions with exceptional rigor. Dissatisfied with the prevailing logical works of his time, he found in particular Whitehead’s and Russell’s Principia Mathematica lacking the requisite precision and rigor. Frege’s work was closer to his methodological requirements, although he criticized his Platonist leanings and perceived his logic as overly influenced by mathematical objectives. Leśniewski endeavored to establish an organon wherein principles were not adopted to ensure a consistent account of mathematics, but rather to faithfully express our general logical intuitions. Despite his exacting standards, which often left him dissatisfied with his own output, he resumed publishing from 1927 onward. Notably, he authored a series of eleven papers titled On the Foundations of Mathematics. During this phase of publication, Leśniewski abandoned the informal style of his earlier writings in favor of a formal discussion of his three systems.
Tragically, he died of thyroid cancer on May 13, 1939, aged 53. He left behind a substantial collection of notes and manuscripts entrusted to his pupil Bolesław Sobociński. Regrettably, this material was lost during the Nazi destruction of Warsaw in 1944. Since its publication in 1991, scholars primarily access Leśniewski’s work through the English edition of his Collected Works. Furthermore, there is a volume of lecture notes from Leśniewski’s students compiled and published in English in 1988. Testimonies and reconstructions provided by members of the Warsaw School, notably Bolesław Sobociński, Jerzy Słupecki, Czesław Lejewski, and to a lesser extent, Alfred Tarski, shed light on the lost aspects of his oeuvre. Comprehensive presentations of Leśniewski’s work can be found in works by Luschei (1962), Miéville (1984), Urbaniak (2014), and in a series of six special issues of the Swiss journal Travaux de Logique (2001-2009). Additionally, significant articles on Leśniewski’s systems are featured in collections edited by Srzednicki and Rickey (1984), Miéville and Vernant (1996), and Srzednicki and Stachniak (1998). Rickey’s annotated bibliography, available online, offers a comprehensive reference guide to Leśniewski’s work and related topics.
2. Logical Systems
Becoming skeptical about expressing his research in natural language, as he had done in his early writings, Leśniewski was persuaded in the early 1920s to adopt a symbolic language, despite his reservations toward the symbolic logic works of his time. Consequently, he chose to present his Mereology using a new symbolic language that aligned with his linguistic intuitions. By 1916, Mereology had already been axiomatized, albeit using expressions like ‘A is B’, ‘A is a part of B’ or ‘If an object A is a, then there exists an object B which is a class of the objects a’—expressions which, as Leśniewski recognized, lack precision. He then embarked on constructing a logical calculus capable of incorporating the specific terms of Mereology, such as ‘class’ and ‘part’. This calculus was designed to make explicit his interpretation of the copula ‘is’. Initially, Leśniewski focused on the analysis of singular propositions of the form ‘a is b’, which he symbolized as ‘aεb’. This emphasis on the copula ‘is’ led Leśniewski to name his system ‘Ontology’. He believed that he could express all the intended meanings using only propositions of the form ‘aεb’, along with a general logical framework incorporating propositional logic and quantification theory. Ontology emerged then as a term logic grounded in a more fundamental calculus that Leśniewski called ‘Protothetic’ (literally, the theory of first theses). Protothetic is the most basic system, with Ontology and Mereology being subsequent expansions of it, even though Leśniewski created the three systems in reverse order, starting from the applied theory of Mereology, progressing to the purely logical system of Ontology, and finally arriving at Protothetic. This underscores his use of formalization as a tool for the accurate expression of his intuitions. He did not adhere to a formalist conception of logic. The formalist idea of a pure syntax, subject to various subsequent interpretations is completely foreign to Leśniewski. 
In his systems, all formulas are intended to be meaningful from the outset. The primitive constants do not get their value from axioms and rules; rather, it is the meaning of the primitive constants that makes the axioms true, and the rules correct.
As a nominalist, he rejected the existence of general entities, a stance that significantly influenced his conception of formal languages and systems. Leśniewski rejected abstract types of expressions and infinite sets of formulas provided from the outset by formal definitions. To him, a theorem within a system is the final entry in a tangible list of explicitly written inscriptions, with axioms preceding it and subsequent entries obtained through the application of explicit rules to previous inscriptions. This perspective views a logical system as a concrete complex of meaningful inscriptions, inherently situated within space and time. Each system is thus composed of a finite list of inscriptions, yet it remains open to the inscription of new theorems. One significant implication of this unusual conception of formal systems is the absence of a predetermined and definitive definition of what constitutes a well-formed formula. Leśniewski had to formulate his rules in a way that ensured both logical validity and grammatical conformity. This nominalist approach to formal syntax empowered Leśniewski to develop a logic where the characterization of the rules reached an extraordinary level of precision. However, it is worth emphasizing that adopting his logic does not require endorsing his nominalist convictions. His systems are equally suitable for reasoning about both concrete and abstract entities.
a. Protothetic (Propositional Logic)
The distinctiveness of Protothetic emerges when contrasted with a more usual deductive system for propositional logic, such as the following:
System L
(1) Formal language of L
Let A={p,q,r,…} ∪ {⊃,~} ∪ {( , )} be the set of symbols.
Let F be the set of formulas, defined as the smallest set E such that
(i) {p,q,r,…} ⊂ E
(ii) If α,β ∈ E, then ~α ∈ E and (α ⊃ β) ∈ E
(2) Axioms of L
AxL1: (p ⊃ (q ⊃ p))
AxL2: ((p ⊃ (q ⊃ r)) ⊃ ((p ⊃ q) ⊃ (p ⊃ r)))
AxL3: ((~p ⊃ ~q) ⊃ ((~p ⊃ q) ⊃ p))
(3) Rules of inference of L
Modus ponens
Substitution
Such a system is closed, in the sense that the sets of symbols and formulas are given once and for all, and the set of theorems is fully determined from the outset as the closure of the set of axioms under the rules of inference. It is known to be an adequate axiomatization of the classical bivalent propositional calculus, possessing essential properties such as soundness, consistency, and completeness. Moreover, its concise set of connectives (two symbols interpreted as negation and conditional) is adequate for the expression of all bivalent truth-functions. This last feature allows for the introduction of additional connectives through definition, for example:
Conjunction: α ∧ β ≝ ~(α ⊃ ~β)
Disjunction: α ∨ β ≝ ~α ⊃ β
Biconditional: α ≡ β ≝ (~α ⊃ β) ⊃ ~(α ⊃ ~β)
Although it seems that the sets of formulas and theorems of L can be extended to certain expressions containing the defined connectives, these new expressions are not official formulas of the system L. We can use them in our proofs and deductions, but only as convenient metalinguistic abbreviations of official formulas. For Leśniewski, a system involving only a few primitive constants, in which defined constants have no official status and can only be used in the metalanguage, was unacceptable. In his view, a complete system for propositional logic should make it possible to express any truth-functional meaning with an official constant of its own formal language. In the standard perspective, if we want a system with the defined constants as official ones, we must proceed to the construction of a suitable expansion L* of L. For that purpose, the sets of symbols and formulas must be enlarged, and the set of axioms must be completed with suitable additional axioms. This must be done while preserving soundness, consistency, and completeness in L*. In principle, our former definitions should be used to provide the expected additional axioms, but additional axioms must be formulas of the object language of L*. There are thus two obvious reasons why these definitions cannot be taken as axioms in their current form. First, they are schematic expressions, so the metavariables they contain must be replaced by object language variables. Second, they contain the special definition symbol ‘≝’, which is used to stipulate a logical equivalence between the definiendum and the definiens. Transforming the definitions into axioms requires the use of object language symbols able to express the same equivalence between the two constituents of the expressions. An obvious solution is the use of biconditional formulas. For instance, the additional axiom devoted to conjunction would then be:
(p ∧ q) ≡ ~(p ⊃ ~q)
However, for the introduction of biconditional itself, this solution would obviously be circular. Another solution for the construction of the expansion L* would be to replace the definition sign by the exclusive use of the primitive constants of the original system L. Instead of adding a single biconditional axiom, Tarski suggested adding a pair of conditional expressions. In the case of conjunction, we would then have to add this pair of axioms:
(p ∧ q) ⊃ ~(p ⊃ ~q)
~(p ⊃ ~q) ⊃ (p ∧ q)
This is perfectly suitable and would also be convenient for the introduction of biconditional. But Leśniewski was more demanding. For him, the most natural way to introduce a new constant was by a single biconditional expression. He then turned to the idea that an initial system for propositional logic must include biconditional among its primitive constants. Since he attached great importance to parsimony, he sought to elaborate an initial system for his future Protothetic containing biconditional as its only primitive propositional connective.
i. A Quantified Propositional Logic
In the early 1920s, it was already known that in classical logic the Sheffer stroke, like its dual connective, could serve as the unique primitive connective to express all the truth functions. Moreover, Leśniewski also knew another result which was published by Russell in his Principles of Mathematics (1903). Russell indeed showed in this early work that it is possible to conceive a complete system for propositional logic with conditional as the single primitive connective, provided that the propositional variables could be universally quantified. His definition of negation is the following: “not-p is equivalent to the assertion that p implies all propositions” (1903: 18). We can express this definition by the symbolic expression:
~p ≝ p ⊃ (∀r)r
Leśniewski knew that a similar solution holds for the definition of negation in terms of biconditional:
~p ≝ p ≡ (∀r)r
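In a bivalent reading this works out mechanically: ‘(∀r)r’ says that every proposition is true, which is false, so both definitions collapse to classical negation. A small sketch (the function names are mine):

```python
def imp(a, b):                     # material conditional '⊃'
    return (not a) or b

# '(∀r)r' evaluated bivalently: "every proposition is true" is false.
FORALL_R = all(r for r in (True, False))

def neg_conditional(p):            # Russell: ~p ≝ p ⊃ (∀r)r
    return imp(p, FORALL_R)

def neg_biconditional(p):          # variant: ~p ≝ p ≡ (∀r)r
    return p == FORALL_R

# Both definitions agree with classical negation on every value.
for p in (True, False):
    assert neg_conditional(p) == (not p)
    assert neg_biconditional(p) == (not p)
```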
However, with the biconditional solution, difficulties remain for the expression of other connectives. For example, conjunction and disjunction are not expressible by a simple combination of biconditional and negation. A brilliant solution to this issue has been discovered by the young Alfred Tarski. In 1923, he established in his PhD thesis, written under the supervision of Leśniewski, that a quantified system of propositional logic with biconditional as its single primitive connective allows the expression of all truth functions. In the introduction of this work, Tarski exposed the issue, and the way it is related to his adviser’s project is clear:
The problem of which I here offer a solution […] seems to me to be interesting for the following reason. We know that it is possible to construct the system of logistic by means of a single primitive term, employing for this purpose either the sign of implication [conditional], if we wish to follow the example of Russell, or by making use of the idea of Sheffer, who adopts as the primitive term the sign of incompatibility, especially introduced for this purpose. Now in order to really attain our goal, it is necessary to guard against the entry of any constant special term into the wording of the definitions involved, if this special term is at the same time distinct from the primitive term adopted, from terms previously defined, and from the term to be defined. The sign of equivalence [biconditional], if we employ it as our primitive term, presents from this standpoint the advantage that it permits to observe the above rule quite strictly and at the same time to give to our definitions a form as natural as it is convenient, that is to say the form of equivalences.
The theorem which is proved in §1 of this article,
constitutes a positive answer to the question raised above. In fact, it can serve as a definition of the symbol of logical product [conjunction] in terms of the equivalence symbol and the universal quantifier; and as soon as we are able to use the symbol of logical product, the definitions of other terms of logistic do not present any difficulty, […] (Tarski, 1923: pp. 1-2).
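Tarski’s theorem can be checked semantically. The sketch below verifies one formula of the required shape, defining conjunction from biconditional and a quantifier binding a variable for unary connectives; the formula used here is essentially Tarski’s, though his original bracketing may differ, and the encoding is mine. Bivalently, such a variable ranges over the four unary truth-functions:

```python
from itertools import product

# The four unary truth-functions: the bivalent range of a quantified
# variable 'f' for unary connectives.
UNARY = [lambda a: a, lambda a: not a, lambda a: True, lambda a: False]

def conj(p, q):
    # p ∧ q := (∀f)(p ≡ (f(p) ≡ f(q)))
    return all(p == (f(p) == f(q)) for f in UNARY)

# The definiens matches the classical truth table for conjunction.
for p, q in product((True, False), repeat=2):
    assert conj(p, q) == (p and q)
```

Once conjunction is available, the remaining connectives follow by the usual definitions, which is exactly the point Tarski makes above.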
This result by Tarski was a cornerstone of the future Protothetic. But it was not sufficient to overcome all the obstacles. The complete biconditional fragment of propositional logic was already known in the Warsaw School, but using Tarski’s solution required a version of this fragment allowing quantifiers to bind not only propositional variables, but also variables for propositional connectives. For the axiomatization of this extended biconditional fragment, Leśniewski’s idea was to work first on the basis of the universal closure of two axioms known to form a good basis for the unextended fragment:
AxP1: (∀pqr)(((p ≡ r) ≡ (q ≡ p)) ≡ (r ≡ q))
AxP2: (∀pqr)(((p ≡ q) ≡ r) ≡ (p ≡ (q ≡ r)))
As for the inference rules, they had to include a detachment rule for biconditional expressions (Det), as well as two rules for taking advantage of quantified expressions: a rule of substitution (Sub) and a rule distributing the universal quantifier over a biconditional (Dis):
Without going into a detailed and rigorous characterization of the system S1 based on the above described axioms and rules, let us consider as an illustration how to prove a few theorems:
These few examples show how the rules apply in S1. Sub and Dis are formulated so as to keep the quantifiers saturated, ensuring that there are no free variables in the theorems. With a bivalent interpretation of the variables, the standard truth table for the single connective, and the quantifier understood as expressing “whatever the values of …”, it is easy to show that S1 is sound. Moreover, the closure of any biconditional tautology is provable in this system. Nevertheless, S1 is not complete. For instance, the following formulas, which are obviously valid in the intended interpretation, remain unprovable:
(∀p)p ≡ (∀r)r (not provable in S1)
(∀p)((∀q)(p ≡ q) ≡ (∀r)(p ≡ r)) (not provable in S1)
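Semantic validity here can be confirmed by direct computation, reading ‘≡’ as Boolean equality and ‘(∀p)p’ as “every proposition is true” (a quick bivalent check; the variable names are mine):

```python
from itertools import product

B = (True, False)

# AxP1 and AxP2 are true under every bivalent assignment.
assert all((((p == r) == (q == p)) == (r == q)) for p, q, r in product(B, repeat=3))
assert all((((p == q) == r) == (p == (q == r))) for p, q, r in product(B, repeat=3))

# '(∀p)p ≡ (∀r)r' is valid: both sides are false, so the biconditional
# holds; yet the formula is unprovable in S1.
assert (all(p for p in B) == all(r for r in B)) == True
```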
Before examining this limitation, let us explore how Leśniewski reinforced the system by introducing a new rule for the introduction of definitions. The resulting system, S2, features a formal language that can be expanded step by step through the official admission of defined constants. Within such a system, the notions of formula and theorem are no longer absolute ones. They become relative to what will be called ‘the developments of the system’. A development is a finite ordered sequence of explicitly written expressions, which are the theorems of that development. The first theorems of a development are necessarily the axioms. Every further theorem must have been explicitly written, applying one of the inference rules on previously written theorems. As a result, each time we write a new theorem, we get a new development. The above written sequence of theorems of S1 is a development in S2 (say the development P6, using the label of its last theorem). We can now get new developments by writing for example the following additional theorems:
P7: (∀p)(((p ≡ p) ≡ (p ≡ p)) ≡ (p ≡ p)) [AxP1, Sub, q⁄p, r⁄p]
P8: (∀p)((p ≡ p) ≡ (p ≡ p)) ≡ (∀p)(p ≡ p) [P7, Dis, p]
P9: (∀p)((p ≡ p) ≡ (p ≡ p)) [P6, Sub, q⁄p]
P10: (∀p)(p ≡ p) [P8, P9, Det]
P11: (∀r)r ≡ (∀r)r [P10, Sub, p⁄(∀r)r]
At first glance, developments seem to be exactly like proofs, but there are important differences. First, there are only developments that have been explicitly written. Developments are indeed concrete objects. Moreover, every time a definition is stated, the language available in further developments is increased. As a result, it is possible that a theorem in a certain development is not even a well-formed formula in another one.
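This relativity can be made concrete with a toy model; the sketch below (the class and method names are mine, and the well-formedness check is deliberately crude) treats a development as an append-only list whose available vocabulary grows with each definition:

```python
class Development:
    """A toy model of a Leśniewski-style development: a concrete,
    append-only list of theorems plus the constants available so far."""

    def __init__(self, axioms, constants):
        self.theorems = list(axioms)      # explicitly written inscriptions
        self.constants = set(constants)   # symbols usable in this development

    def well_formed(self, expr, variables="pqrf"):
        # Crude check: every symbol must be an available constant, a
        # variable letter, or whitespace.
        return all(ch in self.constants or ch in variables or ch.isspace()
                   for ch in expr)

    def define(self, new_constant, theorem):
        # Writing a definition extends the language of later developments.
        self.constants.add(new_constant)
        self.theorems.append(theorem)

d = Development(axioms=["(∀pqr)(((p≡r)≡(q≡p))≡(r≡q))"],
                constants=set("≡()∀"))
assert not d.well_formed("~p")     # '~' has not been introduced yet
d.define("~", "a definition of negation")
assert d.well_formed("~p")         # well formed only after the definition
```

The same string is ill-formed in the earlier development and well-formed in the later one, which is the point made above.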
ii. Definition as a Rule
Now let us consider how a new theorem can be written in S2 by applying the additional rule for stating definitions:
Definition Rule (Def-S2):
In a given development Pn, an expression D can be written as theorem Pn+1 with the rule Def-S2 if and only if D is a closed biconditional formula of the form:
(∀v1…vk)(Dum ≡ Diens), or (Dum ≡ Diens) (when k = 0)
where:
1. v1, …, vk are different variables belonging to categories already available in the development Pn;
2. the expression Dum (the definiendum) is of the form c(v1,…,vk), or simply c (when k = 0), c being a new constant symbol (not already present in the development Pn);
3. the expression Diens (the definiens) is a formula well formed in accordance with categories and syntactic contexts available in the development Pn;
4. the expressions Dum and Diens have exactly the same free variables (if any).
This formulation lacks full rigor, as it refers to categories and syntactic contexts available in a certain development. It serves here as a suggestive presentation, summarizing the meticulous and long explanations Leśniewski provided in order to make fully precise the conditions under which an expression can be written as resulting from the application of the definition rule. Without going into this formal precision here, let us instead examine how new developments can be written in S2.
P12: Def-S2
P13: P10, Sub,
P14: P12, P13, Det
Theorem P12 is an example of definition in which the definiendum has no variable. It introduces the first propositional constant (constant of category S). As the definiens can be shown to be a theorem (P13), the new constant can be written as a theorem (P14) and can be understood as the constant true. Now consider how the constant false can also be introduced, using for its definiens an explosive expression (an expression from which every available formula would be derivable by Sub).
P15: Def-S2
P16: P10, Sub,
P17: Def-S2
P18: P17, Sub,
P19: P16, P18, Det
Using the newly defined constant false, P17 introduces classical negation by definition. It is worth noting that this definition also introduces for the first time the category of unary connectives (or the category labeled S/S, that is, the category of functors taking a single sentence as their argument and resulting in a sentence). P19 expresses that the negation of false is a theorem. Now come three definitions of binary connectives (category S/SS, that is, of functors which give a sentence from two sentences):
P20:
Def-S2
P21:
Def-S2
P22:
Def-S2
It is worth noting that none of these definitions could have been formulated without the prior definition of negation in P17. This is obvious with P20 and P22 which explicitly include negation in their definiens. In the case of P21 (Tarski’s definition of conjunction), negation is not specifically needed. However, the definiens of P21 involves bound variables for unary connectives. As a principle of constructing developments, the use of variables from a specific category (in this case, S/S) is permissible only if this category either is already included in the axioms or has been introduced through a preceding definition.
P23:
Def-S2
P24:
P23, Sub, f/≡
P25:
P6, P24, Det
Definition P23 still introduces a constant of a new category: S/(S/SS). Theorem P25 expresses that biconditional is a commutative binary connective.
Although the definitional machinery is powerful in S2, it still has a limitation that Leśniewski wanted to overcome. It is indeed impossible to define in S2 operations on connectives, such as “the dual of …” or “the composition of … and …”. All the categories that can be introduced in S2 give a result of category S. In order to reach more complex categories, the definition rule has to be reinforced. Let us call ‘S3’ the system in which the definition rule is modified as follows:
Definition Rule (Def-Proto):
The rule is like Def-S2, except for condition 2, which is replaced by the following one:
2′. the expression Dum is of the form c(v1,…,vi)…(vj,…,vk), with the variables distributed over one or more successive pairs of brackets, or simply c (in case k = 0), c being a new constant symbol (not already present in the development Pn).
The only difference in this new version is that the variables in Dum can be distributed in several successive pairs of brackets. Let us have a look at two examples:
P26:Def-Proto
P27:Def-Proto
Definition P26 introduces the operation which gives the dual of a binary connective. The new constant is of category (S/SS)/(S/SS). P27 introduces the composition or logical product of two binary connectives. The category of the defined constant is then (S/SS)/(S/SS)(S/SS). When the definiens has more than one pair of brackets, the result of the application of the new functor is again a functor. The numerator of its category index is itself a fraction, so that the introduced constant is a functor-forming functor (or a many-link functor), which was not possible to define with Def-S2.
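The category discipline at work in these definitions can be sketched with a small type-checker; the encoding of categories as nested pairs is mine:

```python
S = "S"   # the basic category of sentences

def functor(result, *args):
    # Category of a functor taking arguments of the given categories.
    return (result, tuple(args))

def apply_cat(fun_cat, arg_cats):
    # Applying a functor to arguments of matching categories yields the
    # functor's result category; anything else is ill-formed.
    if fun_cat == S:
        raise TypeError("a sentence cannot be applied to arguments")
    result, expected = fun_cat
    if tuple(arg_cats) != expected:
        raise TypeError("argument categories do not match")
    return result

NEG  = functor(S, S)                  # S/S, e.g. negation
CONJ = functor(S, S, S)               # S/SS, e.g. conjunction
DUAL = functor(CONJ, CONJ)            # (S/SS)/(S/SS), as in P26
COMP = functor(CONJ, CONJ, CONJ)      # (S/SS)/(S/SS)(S/SS), as in P27

assert apply_cat(NEG, [S]) == S
assert apply_cat(DUAL, [CONJ]) == CONJ            # result is again a functor
assert apply_cat(apply_cat(COMP, [CONJ, CONJ]), [S, S]) == S
```

The last line shows the many-link behavior: applying the composition functor yields a binary connective, which must then itself be applied to two sentences before a sentence results.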
These few examples sufficiently show how powerful the definition machinery can be in S2 and S3. Nevertheless, we must come back here to the limitations of S1. Let us remember that valid formulas such as the following were not provable in this system:
(∀p)p ≡ (∀r)r (not provable in S1)
Inevitably, these limitations also affect S2 and S3. Leśniewski understood early on that systems like S1–S3 suffered from too weak a characterization of quantification. In the early 1920s, he realized that this weakness could be overcome if the axiomatic basis explicitly enforced propositional bivalency and extensionality.
iii. Bivalency and Extensionality
In a quantified system of propositional logic, propositional bivalency and extensionality can be expressed by the following formulas:
Bivalency for category S:
(∀f)((∀p) f(p) ≡ (f(Ver) ∧ f(Fal)))
(Something holds for all propositions iff it holds for true and false)
Extensionality for category S:
(∀p)(∀q)((p ≡ q) ≡ (∀f)(f(p) ≡ f(q)))
(Two propositions are equivalent iff everything that holds for one, holds for the other)
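Both principles, and their agreement, can be verified in a bivalent model where a variable ‘f’ of category S/S ranges over the four unary truth-functions and the constants true and false are read as the two truth values (a sketch under these assumptions):

```python
from itertools import product

# The four unary truth-functions: the bivalent range of a variable 'f'.
UNARY = [lambda a: a, lambda a: not a, lambda a: True, lambda a: False]
VER, FAL = True, False   # the defined constants true and false

# Bivalency for S: quantifying over all propositions amounts to
# instantiating to the two constants.
for f in UNARY:
    assert all(f(p) for p in (True, False)) == (f(VER) and f(FAL))

# Extensionality for S: two propositions are equivalent iff they are
# interchangeable in every unary context.
for p, q in product((True, False), repeat=2):
    assert (p == q) == all(f(p) == f(q) for f in UNARY)
```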
Leśniewski wanted these formulas to be provable in his Protothetic. In 1922, he was able, with Tarski, to establish that in a system with all the usual laws of quantifiers, these two formulas were equivalent. Subsequently, Leśniewski found that in a system like S3, assuming only bivalency was sufficient to reinforce quantification adequately and thereby achieve extensionality for S. However, he could not simply adopt the formula for bivalency as an additional axiom. He had first to eliminate the defined terms in the formula. This could be done by applying the following transformations:
(bivalency, with change of letters)
(commutation)
(elimination of and )
(elimination of )
In order to avoid the introduction of an additional category in the axiomatic basis, he still had to transform the resulting formula, using variables for binary connectives instead of unary ones. He then reached the following third axiom, whose addition strengthened quantification and made it possible to derive both bivalency and extensionality for the category S of sentences:
AxP3: {
}
The system S4, based on the three axioms AxP1-AxP3 and the four rules Det, Sub, Dis, and Def-Proto is strong enough to reach at least a full classical calculus of all possible truth-functional unary and binary connectives. But Leśniewski still did not consider a system like S4 to be satisfactory. He wanted extensionality formulas to be provable not only for sentences, but also for all the categories that could potentially be introduced by definitions. In other words, he wanted his axiomatic basis to enforce extensionality for all the potentially definable functors (not only connectives or functors with propositional arguments, but also functors of which arguments are functors, like for instance those introduced by definitions P26 and P27). This goal could not be achieved by adding once again additional axioms. An infinity of axioms would have been necessary and each of them would have required specific categories for its formulation.
Leśniewski’s solution was to add a fifth rule of inference. In a given development Pn, the Rule of Extensionality (Ext) allows one to write a new theorem expressing extensionality for a category C, provided the development Pn already contains a definition of a constant of category C as well as one constant of category S/C. A general description of this rule would be too long here. It will only be illustrated by a couple of examples.
As a first example, definition P20 introduces a constant of category S/SS and P23 a constant of category S/(S/SS). The rule Ext allows then to write as a new theorem the following formula expressing extensionality for the category S/SS:
P28:
P20, P23, Ext
Definition P26 gives us our second example. It introduces a constant of category (S/SS)/(S/SS). However, in order to apply Ext for this category, we still need to introduce a definition of a constant of category S/((S/SS)/(S/SS)). The following definition would be adequate for that purpose:
P29:
Def-Proto
Now the conditions are satisfied to get by Ext an extensionality theorem for (S/SS)/(S/SS):
P30:
P26, P29, Ext
Leśniewski’s full Protothetic is the system based on the three axioms AxP1-AxP3 and the five inference rules Det, Sub, Dis, Def-Proto, and Ext. Leśniewski labeled this version of his Protothetic S5. To get an insight into the expressive power of S5, consider a few theorems expressing important properties of the category of unary connectives (these theorems are presented here without proof):
P31:
This theorem expresses extensionality for the category S/S.
P32:
This is known as the law of development for category S/S.
P33:
This theorem is known as the law of the number of functions for the category S/S, the four constants occurring in the formula (‘~’, ‘Ass’, ‘Fal’, and ‘Ver’) being the four non-equivalent constants that can be defined for the four unary truth functions.
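The count given by P33 can be checked mechanically. The following Python sketch is merely an illustrative model (the dictionary keys mirror the constants mentioned in the text; the set-of-truth-tables encoding is this article’s device, not Leśniewski’s notation):

```python
# The four unary truth functions on the truth values, under the names used
# in the text: '~' is written Neg here; Ass is the assertion (identity)
# function; Fal and Ver are the two constant functions.
unary = {
    "Neg": lambda p: not p,
    "Ass": lambda p: p,
    "Fal": lambda p: False,
    "Ver": lambda p: True,
}

# Two unary connectives are equivalent iff they agree on both truth values,
# so each function is summed up by its truth table.
def table(f):
    return tuple(bool(f(p)) for p in (True, False))

tables = {name: table(f) for name, f in unary.items()}

# P33's count: the four constants are pairwise non-equivalent, and together
# they exhaust the unary truth functions.
assert len(set(tables.values())) == 4
```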
Leśniewski showed that for every category to be introduced in the language, it is always possible to construct a development involving theorems analogous to P31-P33 and to determine precisely which and how many non-equivalent constants can be defined in this category. The main interest of that result is that it is always possible, for any category, to eliminate from expressions the quantifiers binding variables of that category. In the case of S, the theorem of bivalency expresses this fact. In the case of S/S, it is the following theorem, which could be called ‘quadrivalency of S/S’:
P34:
On the basis of P34 and analogous results for other categories (for example, theorems of 16-valency for S/SS, of 2^16-valency for S/(S/SS), and so on), it is always possible to make explicit the precise meaning of a quantified expression by a finite process. As Luschei wrote, “Protothetic is Leśniewski’s indefinitely extensible logic of propositions, connectors, connector-forming functors, higher-level functor-forming functors—indeed of constants and variables of any semantic category in the unbounded hierarchies constructible on the basis of propositional expressions” (1962: 143).
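These valencies amount to simple counting: a sentence-forming category whose arguments admit m possible meanings has 2^m non-equivalent constants. The Python sketch below (an illustrative model, not part of Leśniewski’s formalism) computes these numbers and then uses finite enumeration over the four unary functions to verify the law of development for S/S, which is exactly the kind of quantifier elimination the valency theorems license:

```python
from itertools import product

# Valency of a category = 2 ** (number of possible argument meanings).
val_S = 2                          # bivalency of S
val_S_S = 2 ** val_S               # 4: quadrivalency of S/S (P34)
val_S_SS = 2 ** (val_S * val_S)    # 16-valency of S/SS
val_S__S_SS = 2 ** val_S_SS        # 2^16-valency of S/(S/SS)

# Quantifier elimination in action: a claim quantified over all S/S
# functors reduces to a finite conjunction over the four functions.
all_unary = [lambda p, t=t: t[0] if p else t[1]
             for t in product([True, False], repeat=2)]

# Check the law of development for S/S (the content of P32) by enumeration:
# f(p) is equivalent to (p and f(True)) or (not p and f(False)).
assert all(f(p) == ((p and f(True)) or ((not p) and f(False)))
           for f in all_unary for p in (True, False))
```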
The question of the completeness of Protothetic has also been discussed by Leśniewski and his followers. Leśniewski considered full Protothetic to be strongly complete (meaning that if α is a closed well-formed formula of a given development, then either α or its negation is provable from that development), even though he did not have time to give a proof of that result. Słupecki (1953) gave a partial proof of the strong completeness of the large sub-system of Protothetic in which only functors of sentence-forming categories are available.
In 1926, Leśniewski discovered that his Protothetic could be based on a single biconditional axiom. Sobociński was able to improve Leśniewski’s result by working out the following single axiom, which is the shortest known one:
ShortAxP:
iv. Semantic Categories and Contextual Syntax
In the language of Protothetic, as in other interpreted systems of logic, symbols and expressions divide into mutually disjoint types or categories according to their syntactic role and the way they contribute to the meaning of the formulas in which they occur. When developing his notion of semantic category in the 1920s, Leśniewski drew his inspiration from the traditional grammatical theory of parts of speech and from Husserl’s notion of Bedeutungskategorie. In fact, Leśniewski never gave an explicit theory of semantic categories, being content to use the notion in his logical constructions. Later popularized in an explicit theoretical formulation by Ajdukiewicz (1935), the notion of category has also been applied to natural languages, opening the development of categorial grammars (Bar-Hillel, Montague, and Lambek being the most representative authors in this field). Ajdukiewicz introduced a convenient notation which indicates, through a simple index, everything that is characteristic of a given category. A single letter is used for the index of a basic category. In Leśniewski’s languages there are only two basic categories: the category of propositions (labeled S) and the category of names (labeled N). In the propositional language of Protothetic, only the former is used. The latter will be added in the language of further theories, namely Ontology and Mereology. Naturally, languages also contain categories for the various combining symbols or expressions, which are called ‘functors’: connectives, operators, relators, predicates, and so on. All these combining expressions belong to derived categories. The category of a functor is determined by three pieces of information: (1) the number of arguments it takes, (2) the respective categories of these arguments, and (3) the category of the whole generated by the application of the functor to its arguments. Ajdukiewicz’s notation gathers all this information in a single suggestive index.
For example, the category of the biconditional connective is labeled S/SS, since it builds a proposition when applied to two propositional arguments. The index (S/SS)/(S/S)(S/SS) would be that of the category of functors operating on a unary connective (S/S) and a binary connective (S/SS), and generating a binary connective (S/SS). In his 1935 paper, Ajdukiewicz developed a procedure for showing the grammatical well-formedness of expressions using this categorial notation and a rule of categorial simplification. However, Ajdukiewicz’s procedure requires the category of each sign in an expression to be known in advance. This was not possible in Protothetic. Due to its evolving nature, the system does not fit a conception of formal language in which all the required categorial information is determined from the outset. Obviously, no such language would have been rich enough for all the definitions that can be stated in the successive developments.
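The way a category index encodes argument categories and result category can be sketched in a few lines of Python. This is only an illustrative model of Ajdukiewicz-style cancellation (the representation of categories as nested pairs is an assumption of this sketch, not notation from the 1935 paper):

```python
# A category is either a basic one ('S', 'N') or a pair
# (result_category, (arg1_category, ..., argn_category)).
def apply_functor(functor_cat, arg_cats):
    """Ajdukiewicz-style cancellation: a functor of index
    result/(a1...an), applied to arguments of categories a1..an,
    yields an expression of the result category."""
    result, expected = functor_cat
    if tuple(arg_cats) != tuple(expected):
        raise TypeError("category mismatch")
    return result

S = "S"
S_S = (S, (S,))               # unary connective
S_SS = (S, (S, S))            # binary connective, e.g. the biconditional
HIGHER = (S_SS, (S_S, S_SS))  # (S/SS)/(S/S)(S/SS), as in the text

# A biconditional applied to two propositions yields a proposition:
assert apply_functor(S_SS, [S, S]) == S
# The higher functor applied to a unary and a binary connective
# yields a binary connective:
assert apply_functor(HIGHER, [S_S, S_SS]) == S_SS
```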
Leśniewski developed a new concept of formal syntax, often referred to as ‘inscriptional syntax’, though it is more aptly called ‘contextual syntax’. In Leśniewski’s syntax, the role and the category of a symbol are not indicated by its belonging to some previously established list of signs, but by the specific context in which it occurs. For such a syntax, Leśniewski needed a specific notation and an adequate way to guarantee that his formulas are well formed. Consider, for example, the first axiom of Protothetic given above (written in a standard notation):
AxP1: (in standard notation)
The usual analysis of the grammaticality of such a formula goes recursively from simple constituents to more complex expressions: starting from the letters, which are on the list of symbols for propositional variables, we can build the expression by successive applications of the biconditional formation rule and one application of the quantifier formation rule. In a contextual syntax, on the contrary, it is the form of the complete expression that determines the nature and categories of its constituents. In Leśniewski’s notation, AxP1 would be written as:
AxP1: (in Leśniewski-style notation)
As in Łukasiewicz’s well-known notation, Leśniewski’s notation is a prefixed one: every functor is followed by the list of its arguments, but contrary to Łukasiewicz’s notation, parentheses are not removed. So instead of , we get . As for the quantifier, it is always indicated by the use of specialized lower and upper square brackets: instead of , we get . Like every theorem, the whole formula of AxP1 is an expression of the basic category S. Its general form is that of a quantified expression. This implies that the expression within the upper corner brackets also belongs to category S. This last expression is of the general form . As both positions within the round brackets are again occupied by expressions of the form , this means that is here the symbol of a functor of the category S/SS. By carrying the analysis down to the last constituents of the formula, it can be determined that the letters p, q, r are here symbols of category S, for they occur in the argument positions of the context . The construction of the system’s developments adheres to the following principle: once a context has been introduced for a category, it must remain reserved, throughout all subsequent developments, for symbols or expressions of that category, to the exclusion of symbols or expressions of any other category. In other words, parentheses of the same shape, delimiting the same number of arguments, must not be associated with functors of different categories. Parentheses are then no longer used to delimit the scope of functors, but to indicate their categories. Let us examine three examples extracted from the previously given definitions (written here in Leśniewski-compliant versions):
P17:
P23: ⌊f⌋ ⌈≡ (Com⟨f⟩⌊pq⌋⌈≡(f(qp)f(pq))⌉) ⌉
P27:
Definition P17 is the first to introduce a unary functor of category S/S into the developments. It must therefore be associated with a new context. The analysis of the definiens shows that the letter p is here of category S, for it occurs in the first place of a context . Therefore, the new constant is of category S/S. The choice of for the new context is suitable and introduces no confusion, for it differs from by the number of argument places.
Definition P23 introduces another unary functor. In the definiens, its argument f is of category S/SS (for f occurs just before a context). The category of the defined functor Com is then S/(S/SS). This category is a new one in the development, so it must again be associated with a new context. This time round brackets are excluded, for such a choice would introduce ambiguity and make the categories S/S and S/(S/SS) indistinguishable.
Definition P27 is a more complex example. The defined constant applies to two arguments f and g of category S/SS (as is clear from the definiens). The result of this application is the expression . But this expression, once again, applies to a pair of arguments, leading to the expression . The use of the context for the second application indicates that the expression is of category S/SS. So, applying to two S/SS arguments, the defined constant gives an S/SS expression as a result. The category of the defined constant is thus (S/SS)/(S/SS)(S/SS), and its context is from now on .
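The context discipline at work in these three examples can be sketched as a small registry: the first use of a bracket shape with a given arity reserves it for one category, and any later use with a different category is rejected. This Python sketch is this article’s illustration of the principle, not a transcription of Leśniewski’s terminological explanations:

```python
# Sketch of the context discipline: a context is identified by its bracket
# shape together with its number of argument places; once introduced, it is
# reserved for a single category throughout the developments.
class ContextRegistry:
    def __init__(self):
        self._contexts = {}

    def use(self, shape, arity, category):
        key = (shape, arity)
        reserved = self._contexts.setdefault(key, category)
        if reserved != category:
            raise TypeError(f"context {key} already reserved for {reserved}")
        return category

reg = ContextRegistry()
reg.use("()", 2, "S/SS")     # round brackets with two argument places
reg.use("()", 1, "S/S")      # same shape, different arity: a distinct context
try:
    reg.use("()", 2, "S/(S/SS)")   # re-using a reserved context is rejected
except TypeError:
    pass
```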
In such a syntax, grammaticality does not depend on the choice of letters and symbols for this or that variable. Only the arrangement of the categories, indicated by the contexts formed by specific brackets, determines whether an expression is well formed. From a theoretical point of view, a definition such as P27 could perfectly well have been written using one of the following two alternative expressions:
The choice of the letters p and q for the propositional variables, and f and g for the connective variables in P27, has no other reason than to avoid offending the reader’s habits. As for brackets and symbols for constants, their choice is free at the time of their first occurrence. However, the choices must respect the differentiation of contexts and be maintained throughout the developments.
Finally, it is important to recognize that the differentiation between constants and variables is contextual as well. Since all the axioms and theorems are closed expressions, all the symbols which are bound by a quantifier are variables, whereas the other symbols—apart from brackets—are necessarily constants. This section gives only an outline of the principles of contextual syntax. Leśniewski provides a detailed and scrupulously complete description of them through what he called his ‘terminological explanations’. In this way, he demonstrates how contextuality and the application of the notion of semantic category make it possible to have a rigorous formal language which, like ordinary language and the usual notations of science, remains continuously open to enrichment and novelty. This part of Leśniewski’s work is a masterpiece in the philosophy of notation.
The inherent openness of Leśniewski’s systems requires a notation that unequivocally and contextually determines the categories of symbols and expressions. This aspect, combined with Leśniewski’s meticulous axiomatic presentations of his systems, makes them challenging for twenty-first-century logicians to apprehend. But this difficulty arises just as much with the original works of Frege, Peirce, or Hilbert. It is largely due to the age of the systems and to the evolution of logicians’ habits. However, it is known that Leśniewski used his contextual syntax only where ambiguities could arise. In his everyday practice, he also formulated his proofs using a form of natural deduction. This method was common among the members of the Warsaw School, though it was only codified later, by Jaśkowski. Surprisingly, this codification was not applied to Leśniewski’s systems. A natural deduction system for Protothetic, close to twenty-first century streamlined methods of logic, is available in Joray (2020).
b. Ontology (Term Logic)
Like standard predicate logic, which is built on the basis of a propositional calculus, Leśniewski’s system called ‘Ontology’ is an expansion of Protothetic. The aim of Ontology is mainly to enlarge propositional deductive logic to the analysis and expression of predication. In spite of these similarities, there are important differences between Ontology and standard predicate logic. Firstly, Ontology is not a theory of quantification: the system inherits quantification from Protothetic. Secondly, the language of Ontology makes no distinction of category between singular and conceptual terms. Instead of having one category for singular names and another for predicates, Ontology has only one wide category of names. In this respect, Ontology is closer to traditional term logic than to predicate logic. Ontology thus extends Protothetic by introducing a second basic category, the category of names, labeled N, and a copula as a new primitive constant. It is known as an extensional calculus of names, which constitutes a free and plural logic.
i. Names and Copula
Leśniewski’s notion of name is considerably broader than it is in the Russellian tradition. For him, not only are simple singular terms like ‘Socrates’ names, but so are complex referring expressions like ‘Plato’s master’ and terms or expressions that refer to more than one object, like ‘planet’ or ‘author of the Principia Mathematica’. Whether simple or composite, a name may be singular, plural, or even empty if there is no object to which it refers, as is the case with ‘unicorn’ or ‘square circle’. In a sentence like ‘Socrates is Greek’, there are two names according to Leśniewski: ‘Socrates’ is a singular one because it refers to one individual, and ‘Greek’ is a plural one because it refers to many individuals. It should be noticed that there is no way, in Leśniewski’s nominalist conception of names, to interpret plural names as denoting any single abstract totality (like a set, a class, or a collection) which would have the signified objects as members. A plural name simply refers directly to these objects. Like Protothetic, Ontology is an interpreted system; it is in no way a pure syntax waiting for interpretation, and it thus has an intuitive semantics from the beginning. All names in Ontology belong to the same category N. In the intended semantics, they can be of three sorts: singular, plural, or empty. In order to represent these three possibilities, Lejewski (1958) proposed to use suggestive diagrams. In the following figure I, the diagrams represent the three possibilities for a name ‘a’: singular (I.1), plural (I.2), or empty (I.3):
Figure II shows 16 diagrams representing the possible situations in which two names ‘a’ and ‘b’ can stand in relation to each other.
In (II.1) both names are singular, and they denote the same object. In (II.2), both are singular, but denote different objects. In (II.3), ‘a’ denotes one object which is among the objects denoted by the plural name ‘b’. From (II.9) to (II.13), both names are plural: they denote, for example, exactly the same objects in (II.9), and the objects denoted by ‘a’ are strictly among those denoted by ‘b’ in (II.10). The main interest of these diagrams is that they make it possible to explain in a rather precise way the meaning with which Leśniewski proposed to use the only primitive term of Ontology, namely the epsilon copula. This copula ‘’ applies to two names (two arguments of category N) and results in a proposition (category S). It is thus of category S/NN. Expressions of the form
(often written with the simplified form ‘’)
are called ‘elementary propositions’ of Ontology. Their truth conditions in the intended semantics can be explained in the following way: an elementary proposition ‘’ is true if and only if the two arguments ‘a’ and ‘b’ stand either in situation (II.1) or in situation (II.3) of Lejewski’s Figure II. In ordinary language, the meaning of such an elementary proposition can be approximated by “The object denoted by ‘a’ is the object, or one of the objects, denoted by ‘b’”. For convenience, it is often read “a is among the b’s” or even “a is a b”.
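These truth conditions can be modeled straightforwardly by reading each name as the set of objects it denotes. The following Python sketch is an extensional model of the copula for illustration only (Lejewski’s diagrams, not sets, are the article’s own device, and Leśniewski himself would not reify denotations as sets):

```python
# Names are modeled by their denotations: finite sets of objects.
# 'a ε b' is true iff 'a' denotes exactly one object and that object is
# among the objects denoted by 'b' (situations II.1 and II.3).
def eps(a, b):
    return len(a) == 1 and a <= b

socrates = {"Socrates"}
greek = {"Socrates", "Plato", "Aristotle"}
unicorn = set()

assert eps(socrates, greek)       # 'Socrates is a Greek'
assert not eps(greek, socrates)   # a plural subject makes the proposition false
assert not eps(unicorn, greek)    # an empty subject makes it false as well
assert eps(socrates, socrates)    # situation II.1: identity of singulars
```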
It should be stressed that, unlike syllogistic, where a singular term can never occur in a predicate position, there is no restriction in Ontology as to the sort of names that may serve as arguments in an elementary proposition. Any name, whether singular, plural or empty, may occur either in the first argument position (say as subject) or in the second position (say as predicate) in an elementary proposition. In all the sixteen situations of Lejewski’s Figure II, we would get a well-formed proposition. This proposition, of course, would only be true in cases (II.1) and (II.3). In all other situations, it would be false, but meaningful and perfectly well-formed.
ii. The Axiomatic System
Drawing on his intuitive semantics, Leśniewski formulated in 1920 a single axiom for Ontology. This axiom is presented here both in a Leśniewski-compliant form and in a more usual notation:
AxOnto:
By introducing the elementary proposition ‘‘ on the left-hand side of the biconditional expression, with ‘‘ as a new context, the axiom presents a formal similarity with a definition of Protothetic. However, the new symbol also occurs in the right-hand argument, and the axiom is thus like an implicit definition, introducing the new symbol ‘‘ with a formal characterization that fits the intended truth conditions of the elementary proposition:
For all a, b,
a is a b if and only if
1. at least one thing is an a and
2. at most one thing is an a and
3. all that is an a is also a b.
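In the set-denotation reading used above for the diagrams, the three clauses of the axiom can be verified to be jointly equivalent to the intended truth conditions of the copula by brute force over a small universe. This is only a model-theoretic illustration (the finite universe and the set encoding are assumptions of the sketch, not part of Ontology):

```python
from itertools import combinations

universe = ["x", "y", "z"]

def subsets(u):
    return [set(c) for r in range(len(u) + 1) for c in combinations(u, r)]

def eps(a, b):
    # 'a ε b': 'a' denotes exactly one object, which is among the b's.
    return len(a) == 1 and a <= b

def axiom_rhs(a, b):
    """Right-hand side of AxOnto: (1) at least one thing is an a,
    (2) at most one thing is an a, (3) all that is an a is also a b,
    with the inner 'is' read as ε again."""
    singletons = [{o} for o in universe]
    c1 = any(eps(x, a) for x in singletons)
    c2 = all(not (eps(x, a) and eps(y, a)) or x == y
             for x in singletons for y in singletons)
    c3 = all(not eps(x, a) or eps(x, b) for x in singletons)
    return c1 and c2 and c3

# The two sides agree for every pair of names over this universe.
assert all(eps(a, b) == axiom_rhs(a, b)
           for a in subsets(universe) for b in subsets(universe))
```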
This axiom reveals a striking analogy between Leśniewski’s analysis of singular propositions and Russell’s theory of definite descriptions. Despite the numerous disagreements between the two logicians, Leśniewski acknowledged that Ontology has certain similarities with Russell’s work, for example its formal proximity to simple type theory.
Concerning the rules of inference, Ontology inherits adapted forms of the rules of Protothetic. In addition, it also has new versions of the rules for definition and for extensionality. With only the protothetical rules, one can already establish significant theorems concerning the constant epsilon, notably the following:
T1:
(epsilon is transitive)
T2:
(one of the most characteristic properties of epsilon)
T3:
These three theorems sufficiently indicate that Leśniewski’s epsilon is formally very different from the epsilon of set theory. Theorem T3 rather shows similarities between Ontology and Aristotle’s syllogistic: in one direction, the biconditional expression is analogous to the Barbara syllogism, while in the other it bears formal resemblance to what Aristotle termed ‘ecthesis’. Building on results due to Tarski and Sobociński, Leśniewski showed that T3 can be adopted as a shorter single axiom of Ontology.
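The contrast with set theory’s epsilon is easy to see concretely. In a simple model where names are read as sets of their denotata (an illustration, not Leśniewski’s own semantics), the copula is transitive (T1) and left-reflexive whenever it holds at all, properties that set membership notoriously lacks:

```python
from itertools import combinations

universe = ["x", "y", "z"]
names = [set(c) for r in range(len(universe) + 1)
         for c in combinations(universe, r)]

def eps(a, b):
    return len(a) == 1 and a <= b

# T1: epsilon is transitive.
assert all(not (eps(a, b) and eps(b, c)) or eps(a, c)
           for a in names for b in names for c in names)

# A characteristic property of epsilon: 'a ε b' entails 'a ε a'
# (the subject of a true elementary proposition is always singular).
assert all(not eps(a, b) or eps(a, a) for a in names for b in names)
```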
Using only the protothetical rules, it is also possible to state some interesting definitions, such as those of relators of the same category as epsilon (S/NN):
T4:
Def-Proto (nominal inclusion)
T5:
Def-Proto (nominal co-extensionality)
T6:
Def-Proto (singular identity)
Other definitions are still possible with Def-Proto, for instance those of the three following S/N functors, expressing properties of names, and also a definition of a relation between such properties (category S/(S/N)(S/N)):
T7: Def-Proto
T8: Def-Proto
T9: Def-Proto
T10: Def-Proto (S/N-functors co-extensionality)
From AxOnto and definitions T4 and T8, it is easy to derive the following theorem:
T11:
The introduction of nominal co-extensionality in T5 gives the opportunity to define an interesting many-link functor:
T12: Def-Proto
From (‘a’ and ‘b’ denote the same objects), the definition abstracts the second argument ‘b’, resulting in ‘’, which belongs to the category S/N and expresses a property of names (that of denoting the same objects as ‘a’). The functor ‘Ext’ is then a many-link functor of category (S/N)/N. It is tempting to interpret ‘’ as denoting the extension of ‘a’. This is, however, merely a façon de parler, for ‘’ is not the name of an object, but rather the expression of a function. Nevertheless, it is worth noting that from T5, T10 and T12, a formal analogue of Frege’s famous Basic Law V can be derived:
T13:
Contrary to Frege’s law, this theorem is perfectly harmless in Ontology, for ‘‘ is a function, not an object that could be among the denotations of ‘a’.
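The point that ‘Ext’ yields a function rather than an object can be made vivid in the set-denotation model used for illustration throughout (all modeling choices here are this article’s assumptions): applied to a name, Ext returns a predicate on names, and two such predicates agree everywhere exactly when the underlying names are co-extensional, which is the harmless analogue of Basic Law V:

```python
# 'Ext' modeled as a many-link functor: applied to a name's denotation it
# yields an S/N predicate (a function on names), not an object.
def Ext(a):
    return lambda b: set(b) == set(a)

names = [set(), {"x"}, {"y"}, {"x", "y"}]

# Analogue of T13 / Basic Law V: Ext(a) and Ext(b) agree on every name
# iff 'a' and 'b' are co-extensional. Nothing here is an object that
# could fall among the denotations of 'a'.
for a in names:
    for b in names:
        agree = all(Ext(a)(c) == Ext(b)(c) for c in names)
        assert agree == (set(a) == set(b))
```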
In addition to nominal properties and relations, Boolean operations on names can also be introduced. However, these operations require an Ontology-specific definition rule. Instead of the following general form used in Protothetic (with a definiendum of category S):
Def-Proto
the new rule allows definitions of the following form, with a definiendum of category N:
Def-Onto
The formal conditions for a well-formed definition are much the same in the new version. The purpose of including the expression ‘’ on the right-hand side of the biconditional is to ensure that the name ‘a’ is singular. In practice, this addition is unnecessary when the definiens alone already ensures that ‘a’ denotes exactly one object. Here are some examples, including the elementary Boolean operations:
T14:Def-Onto (nominal union)
T15:Def-Onto (nominal intersection)
T16:Def-Onto (nominal difference)
T17:Def-Onto (empty name)
T18:
Def-Onto (universal name)
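In the extensional model where names are read as sets of denotata (again an illustrative assumption, not Leśniewski’s formalism), these nominal operations behave exactly like the corresponding set operations, and a Def-Onto-style clause can be checked pointwise through the copula:

```python
# Nominal Boolean operations, modeled on denotations as sets; the empty
# and universal names are the constants glossed in T17 and T18.
universe = {"x", "y", "z"}

def union(a, b): return a | b
def intersection(a, b): return a & b
def difference(a, b): return a - b

empty_name = set()
universal_name = set(universe)

def eps(a, b):
    return len(a) == 1 and a <= b

a, b = {"x", "y"}, {"y", "z"}
assert union(a, b) == {"x", "y", "z"}
assert intersection(a, b) == {"y"}
assert difference(a, b) == {"x"}

# A Def-Onto-style reading of nominal union: a singular name falls under
# 'a ∪ b' exactly when it falls under 'a' or under 'b'.
assert all(eps({o}, union(a, b)) == (eps({o}, a) or eps({o}, b))
           for o in universe)
```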
iii. Higher Orders
Like Protothetic, Ontology is a higher-order logic. Where the Principia Mathematica are based on an unbounded hierarchy of logical types, Ontology is based on a potentially infinite hierarchy of semantic categories. Each time a constant of a new category of this hierarchy is defined in a development, variables of that category, and quantifiers binding those variables, become available in the language of the development in question. At the root of the hierarchy, the category N includes names, whose semantic role is to designate extra-linguistic objects. Moving up the hierarchy, one first comes across categories that take names as arguments: N/N, N/NN, and so on (categories which make it possible to operate on names), and S/N, S/NN, and so on (categories which make it possible to make assertions about names). Operators and functors of these categories can in turn serve as arguments for new operators and functors from categories higher up in the hierarchy. This ascent can go on, step by step, without limit of complexity. This increase in complexity must nevertheless be put in perspective. Firstly, a specific extensionality rule enables the derivation of extensionality theorems for each of the new categories that can be reached in Ontology. These theorems are analogous to the following theorem, which expresses extensionality for the basic category N:
T19:
Secondly, there is a structural analogy within Ontology among the various levels of the category hierarchy. For example, from a semantic perspective, the way a name has meaning (by denoting one object, a plurality of objects, or no object at all) parallels the way an S/N functor conveys meaning (by being satisfied by one nominal meaning, by several, or by none at all). From a syntactic perspective, it is possible to define in each category an analogue of the primitive epsilon (called a ‘higher epsilon’) and then to derive a structural analogue of the axiom of Ontology. For example, in the category S/N, the following definition of a S/(S/N)(S/N)-epsilon is adequate for that purpose:
T20: (∀αβ)[αεβ ≡.
Def-Proto
Using this definition, Miéville (1984: 334-7) has given a proof of the following S/N-analogue of the axiom of Ontology:
T21:
Def-Proto
Ontology presents a systematic analogy between categories that is similar to the analogy between types in the Principia Mathematica. Using homonyms of different categories, it offers the possibility of speaking about functions or incomplete meanings (such as properties, relations, extensions, numbers, and so on) as if they were objects, but without any reification. On this point, Ontology attains the same ends as the Principia Mathematica. It constitutes a logic as powerful as type theory. But contrary to Whitehead and Russell, Leśniewski attains this result with means which are strictly regulated by his axiomatic system, without resorting either to any external convention of systematic ambiguity or to non-explicit definitions of logical fictions, like that of classes in the Principia Mathematica.
c. Mereology (Part-Whole Theory)
While Mereology is chronologically the first system elaborated by Leśniewski, it is theoretically the last one, as it is based on both Protothetic and Ontology. Unlike these two deductive systems, Mereology is not a purely logical system. Dealing with the relations between parts and wholes, it contains non-logical proper terms like ‘class’, ‘element of’, or ‘part of’. Although it was considered by Leśniewski as a class theory—a nominalistically acceptable alternative to set theory—it has come to be widely regarded as a powerful formal theory of parts, wholes and related concepts.
At the time he discovered formal logic, Leśniewski planned to provide an alternative way into the issue of the foundations of mathematics. In this matter, the main points of disagreement with his contemporaries concerned the ways to analyze and solve the paradoxes which appeared with the pre-theoretical notions of set, class, or extension, especially Russell’s paradox. For Leśniewski, the solutions available at that time (in particular those of Frege, Russell, and, a bit later, Zermelo) were only ad hoc devices. In his view, their only justification was to avoid contradictions in their systems. He did not find in these solutions a satisfactory analysis of the root causes of the contradictions, which were, according to him, based on a confusion between what he called the distributive and the collective conceptions of classes. From Cantor’s early approach, Leśniewski retained the basic idea that a set is literally made up of its elements. But this basic idea was for him incompatible with the existence of an empty set and with the distinction between an object and the set having this object as its unique element.
The difference between distributive and collective views of sets or classes can be easily grasped by means of a geometrical example such as the following figure:
It is common to conceive such a figure as a class of certain points. It can also be conceived as a class of certain line segments, or even as a class of certain triangles. All these possibilities have different theoretical advantages. But if the notion of class is understood in a distributive sense (for example, as a set in the standard sense), the same figure cannot be at the same time a class of points, a class of segments, and a class of triangles. A set or a distributive class of points has only points as elements, whereas a mereological or collective class of points may perfectly well have other objects as elements too. A single figure can thus be at the same time a mereological class of points, a class of segments, and a class of triangles. In the collective conception, the three classes are the same object, namely the figure in question.
i. Mereology and Russell’s Paradox
Leśniewski gave a famous analysis of Russell’s paradox. During his lifetime, he did not publish the final state of this analysis, which was reconstructed by his pupil Bolesław Sobociński in 1949. Taking up Russell’s approach, Leśniewski’s analysis begins with a proof that the principles of comprehension and extensionality of class theory are indeed incompatible. To this end, the following ‘Ontological’ definition of the concept ℜ is introduced (that is, the concept of an object which is not a class that falls under a concept of which it is the class):
Defℜ:
The term Cl(-), which occurs in the definiens, must be considered here as a primitive constant satisfying the two principles of comprehension and extensionality (here expressed in the language of Ontology):
Class Comprehension Principle (CCP):
(For all conceptual terms a, there is an object which is a class of the a’s.)
Class Extensionality Principle (CEP):
(For all conceptual terms a and b, if one and the same object is both a class of the a’s and a class of the b’s, then the a’s and the b’s are exactly the same objects.)
Taking these elements as granted, Leśniewski easily shows that we get the following contradiction:
(Nothing is a class of the ℜ’s)
(Something is a class of the ℜ’s)
For Leśniewski, a contradiction acquires the value of an antinomy only if it logically follows from principles in which we undoubtedly believe. In his view, this was not the case with the contradiction encountered here. In the intuitive conception of classes or sets, CCP is not free from doubt: what this principle expresses is at least doubtful in the case of empty concepts. If a is an empty conceptual term, there is no intuitive reason to decide whether or not there is an object which is a class of the a’s. The reasons for adopting CCP lie in the goals we set for set or class theory, not in our intuitive conceptions of sets or classes. In order to uncover the genuine antinomy behind Russell’s paradox, the analysis must not address the incompatibility between CEP and CCP. It must deal with what would happen to a theory that admits among its principles CEP and the doubt-free part of what is expressed by CCP, namely the following weakened principle:
Weak Class Comprehension Principle (WCCP):
(For every non-empty conceptual term a, there is an object which is a class of the a’s.)
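For readers who want the three principles in symbols, the English glosses determine them up to notation. The following is a possible reconstruction in a standard Ontological reading, with ε as the copula and Cl(a) read as ‘class of the a’s’ (the symbolization is this article’s, not a quotation of Leśniewski’s formulas):

```latex
\text{CCP:}\quad  \forall a\,\exists x\,\bigl[\,x \mathrel{\varepsilon} \mathrm{Cl}(a)\,\bigr]

\text{CEP:}\quad  \forall a\,b\,x\,\bigl[\,\bigl(x \mathrel{\varepsilon} \mathrm{Cl}(a) \wedge x \mathrel{\varepsilon} \mathrm{Cl}(b)\bigr) \rightarrow \forall y\,\bigl(y \mathrel{\varepsilon} a \equiv y \mathrel{\varepsilon} b\bigr)\,\bigr]

\text{WCCP:}\quad \forall a\,\bigl[\,\exists x\,(x \mathrel{\varepsilon} a) \rightarrow \exists x\,\bigl(x \mathrel{\varepsilon} \mathrm{Cl}(a)\bigr)\,\bigr]
```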
With the primitive term Cl(-) satisfying now CEP and the revised principle WCCP, clearly only the first horn of the previous contradiction remains:
(Nothing is a class of the ℜ’s)
But WCCP then imposes that ℜ is an empty term:
This means (by definition of ℜ) that every object is a class that falls under a concept of which it is the class:
Leśniewski then shows that this class is precisely the class of the term in question:
(Every object is a class of itself)
From these first results, he then draws an unexpected consequence:
(There is at most one object)
The demonstration can be informally sketched as follows:
Assume that a and b are individual objects.
Let d be the conceptual term under which exactly the objects a and b fall (the letter ‘d’ stands in here for Leśniewski’s original symbol).
The term d is not empty.
By WCCP, it follows that there is an object c which is a class of the d’s.
But, like every object, c is also a class of itself.
So c is both a class of the d’s and a class of the c’s.
By CEP, we then get that exactly the same objects fall under d and c.
Now c is a singular term, so d is also a singular term.
The only way for d to be a singular term is that a = b.
The joint adoption of CEP and WCCP results then in the existence of at most one object in the universe. Although such a statement could not be refuted on a purely logical basis, Leśniewski considered that no set or class theorist could tolerate it and that they should indisputably believe in its negation. This had for him the value of a genuine antinomy.
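Leśniewski’s conclusion can be checked mechanically on a naive set-reading of the two principles. This is a toy model only: Ontology is not set theory, and the reading below flattens conceptual terms into nonempty subsets of a finite universe and treats “c is a class of the t’s” as a relation pairing objects with terms. On that reading, CEP forbids one object from being the class of two terms with different extensions, and WCCP demands a class for every nonempty term. A brute-force search (all names are illustrative) confirms that the two principles are jointly satisfiable only on universes of at most one object:

```python
from itertools import product

def nonempty_subsets(n):
    """All nonempty subsets of {0, ..., n-1}, read as conceptual terms."""
    return [frozenset(x for x in range(n) if m >> x & 1)
            for m in range(1, 2 ** n)]

def has_model(n):
    """True iff CEP and WCCP are jointly satisfiable on a universe
    of n objects, under the naive set-reading of 'class of'."""
    terms = nonempty_subsets(n)
    # WCCP: every nonempty term must receive a class (an object);
    # enumerate every total assignment of classes to terms.
    for choice in product(range(n), repeat=len(terms)):
        # CEP: one object cannot be the class of two terms with
        # different extensions; since all terms here have distinct
        # extensions, the chosen classes must be pairwise distinct.
        if len(set(choice)) == len(terms):
            return True
    return False

assert has_model(0) and has_model(1)          # a one-object universe is fine
assert not has_model(2) and not has_model(3)  # two objects already clash
```

The search simply rediscovers the pigeonhole fact behind Leśniewski’s argument: n objects supply at most n classes, but n ≥ 2 objects generate more than n nonempty terms.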
Sobociński’s reconstruction of the analysis does not fully clarify why Leśniewski considered that the root causes of this antinomy lay in a confusion between what he called the distributive view and the collective view of classes. It could be argued that, for him, WCCP constituted an unquestionable belief only with a collective view, whereas an undoubted belief in CEP could only result from a distributive view. In any case, his solution to the antinomy was to expurgate from CEP what is only acceptable from a distributive view of classes. Introducing the notion of element with the following definition:
DefEl:
he showed that CEP is logically equivalent to the conjunction of the following two expressions:
CEP1:
(All the objects that fall under a concept are elements of the class of that concept.)
CEP2:
(All the elements of the class of a concept fall under that concept.)
Unlike CEP1, which is undoubtedly true under both intuitive understandings of ‘class’, CEP2 can only appear to be true to one who adopts a distributive view. For Leśniewski, whoever relies on Cantor’s opening idea that a set or a class is a real object that gathers into a whole the elements that literally constitute it faces an antinomy when confusedly admitting the truth of CEP2. His nominalist tendency led him to consider the mereological approach to classes as the only acceptable one, whereas any theoretical approach aiming at saving CEP2 led, according to him, to making classes either fictions (as in Russell’s no-class theory) or disputable abstract entities subject to ad hoc criteria of existence (as in Zermelo’s approach).
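Since the defining formulas themselves are not reproduced above, the following is only a conjectural rendering of their content, assembled from the English glosses: write Cl(b, a) for ‘b is a class of the a’s’, ε for Ontology’s ‘is (one of)’, and El(x, b) for ‘x is an element of b’. Neither the notation nor the exact quantifier discipline is Leśniewski’s own:

```latex
\mathrm{DefEl}\colon\quad \mathrm{El}(x,b) \;\leftrightarrow\; \exists a\,\bigl(\mathrm{Cl}(b,a)\wedge x\mathrel{\varepsilon} a\bigr)

\mathrm{CEP}_1\colon\quad \bigl(x\mathrel{\varepsilon} a \wedge \mathrm{Cl}(b,a)\bigr) \;\rightarrow\; \mathrm{El}(x,b)

\mathrm{CEP}_2\colon\quad \bigl(\mathrm{El}(x,b) \wedge \mathrm{Cl}(b,a)\bigr) \;\rightarrow\; x\mathrel{\varepsilon} a
```

On this rendering, CEP1 follows immediately from DefEl, while CEP2 carries the contentious extensional commitment; together the two give back the extensionality expressed by CEP.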
ii. The Axiomatic System
Between 1916 and 1921, Leśniewski developed four axiomatizations of his Mereology, each taking a different term as its unique primitive. The first two were based on the term ‘part’ (Leśniewski’s word for the somewhat more usual ‘proper part’), the third on ‘ingredient’ (a synonym for ‘element’) and the last on ‘exterior of’ (Leśniewski’s term for ‘disjoint’). When taking ‘element’ (or ‘ingredient’) as primitive, Leśniewski gives the following definitions of other important mereological terms:
(a is a part of b when it is a strict element of b, that is, an element of b different from b itself.)
(a is exterior to b when it has no element in common with b.)
A further definition introduces the notion of mereological class. In order to be the class of the b’s, a certain a must meet the three conditions stipulated in that definition:
1. a must be an object;
2. all the b’s must be elements of a;
3. every element of a must itself have at least one element in common with one of the b’s.
This is worth clarifying. Let us take as an illustration the mereological class of Swiss people. From condition 2, every Swiss person is an element of this class. But the class is intended to have many other elements. For example, sub-classes of Swiss people are also elements of that class (like the class of Swiss people living in Lugano, that of French-speaking Swiss people, or that of Swiss people who practice yodeling). There are also elements of the class of Swiss people that are neither Swiss persons nor sub-classes of them: for example, an element of a Swiss person (such as the nose of the President of the Confederation) or a class of such elements (such as the class of the feet of Swiss people who have climbed the Matterhorn). Condition 3, however, precisely limits this wealth of elements. It requires that each element itself has at least something (an element) in common with a Swiss person. The Swiss people sub-classes clearly meet this requirement and so do the President’s nose and the Swiss mountaineers’ feet class. In contrast, the class of the noses of all European leaders will not be retained as an element of the Swiss people class because there is at least one element of this class of noses (for example, the nose of the Italian president) which has nothing (no element) in common with a Swiss person.
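The three conditions can also be transcribed into a small executable model. Below, objects are the nonempty subsets of {1, 2, 3} and ‘x is an element of y’ is read as set inclusion (which makes element of reflexive, transitive, and antisymmetric, as the axiomatization requires). The model and all names are illustrative conveniences, not Leśniewski’s formalism:

```python
from itertools import combinations

# Toy universe: objects are the nonempty subsets of {1, 2, 3};
# "x is an element of y" is modelled as set inclusion.
OBJECTS = [frozenset(c) for r in (1, 2, 3)
           for c in combinations((1, 2, 3), r)]

def is_element(x, y):
    return x <= y  # x is an element of y: here, x is a subset of y

def is_class_of(a, bs):
    """Leśniewski's three conditions for 'a is a class of the b's'."""
    return (a in OBJECTS                                   # 1. a is an object
            and all(is_element(b, a) for b in bs)          # 2. every b is an element of a
            and all(any(is_element(z, x) and is_element(z, b)
                        for z in OBJECTS for b in bs)      # 3. every element of a shares
                    for x in OBJECTS if is_element(x, a))) #    an element with some b

bs = [frozenset({1}), frozenset({2})]
# The mereological class (sum) of {1} and {2} is {1, 2} ...
assert is_class_of(frozenset({1, 2}), bs)
# ... while {1, 2, 3} fails condition 3: its element {3} overlaps no b.
assert not is_class_of(frozenset({1, 2, 3}), bs)
# ... and the class is unique in this model.
assert [a for a in OBJECTS if is_class_of(a, bs)] == [frozenset({1, 2})]
```

The failing case mirrors the class of European leaders’ noses above: an object with an element having nothing in common with any b is excluded by condition 3.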
This illustration highlights that the mereological class corresponds to what is better known as the mereological sum. It also makes it clear that, for the three conditions of Leśniewski’s definition to actually reach what is expected from mereological classes, it is necessary for the term ‘element’ to be characterized in an appropriate way (for example, it is pretty clear that for condition 3 to be relevant, the relation element of must be transitive). This characterization is what Leśniewski achieves with his axiomatization of 1920, which uses ‘element’ (originally ‘ingredient’) as the sole primitive term.
The first two axioms are rather simple. AxM1 is the contraction of the following two formulas:
(only objects have elements)
(element of is an antisymmetric relation)
The second axiom is literally:
(element of is a transitive relation)
Furthermore, Leśniewski shows that AxM1 and AxM2 imply the following formula:
(element of is a reflexive relation on the object domain)
The first two axioms thus make element of a non-strict partial order relation on the object domain.
The last two axioms are more difficult to grasp. However, when examined in light of the definition of class, it becomes apparent that they respectively embody a principle of class uniqueness and a principle of class existence (the latter being nothing but WCCP):
(a class of a certain term is unique)
(There is a class for every nonempty term)
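Since the formulas of the 1920 axiomatization are not reproduced above, here is a conjectural modern rendering assembled from the glosses, with El(a, b) for ‘a is an element of b’ and Cl(a, c) for ‘a is a class of the c’s’. The labels AxM3 and AxM4 for the last two axioms, like the notation itself, are not Leśniewski’s own:

```latex
\mathrm{AxM1}\colon\quad \bigl(\mathrm{El}(a,b)\wedge\mathrm{El}(b,a)\bigr)\rightarrow a=b
  \quad\text{(with the proviso that only objects have elements)}

\mathrm{AxM2}\colon\quad \bigl(\mathrm{El}(a,b)\wedge\mathrm{El}(b,c)\bigr)\rightarrow \mathrm{El}(a,c)

\text{Derived}\colon\quad a\mathrel{\varepsilon}\mathrm{Obj} \rightarrow \mathrm{El}(a,a)

\mathrm{AxM3}\colon\quad \bigl(\mathrm{Cl}(a,c)\wedge\mathrm{Cl}(b,c)\bigr)\rightarrow a=b

\mathrm{AxM4}\colon\quad \exists x\,(x\mathrel{\varepsilon} c)\rightarrow\exists a\,\mathrm{Cl}(a,c)
```

The first three lines make element of a non-strict partial order on the object domain; the last two state the uniqueness and existence of classes (AxM4 being WCCP).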
Leśniewski and his followers devoted much effort to finding the shortest and most economical system of axioms for Mereology. But this could only be achieved at a certain expense of intuitive clarity. The axiomatization of 1920 is in an intermediate position in this respect. To one who is concerned neither with having a single primitive, nor with the independence of the axioms, the adoption of the two terms El(-) and Cl(-) as primitives, together with the six axioms (i)-(vi) plus the defining formula for ‘class’ taken as the seventh axiom, would constitute a rather clear axiomatization. The only remaining difficulty lies in the effort to grasp this definition of class (especially with regard to the third condition of the definiens, which gives the mereological class its specificity). This definition of class, elaborated by Leśniewski from the very beginning of his research, can certainly be considered as one of the touchstones of his mereology.
All of Leśniewski’s axiomatizations have been proved to be equivalent. The consistency of his Mereology has also been established. Clay gave in 1968 a model of Mereology in the arithmetic of real numbers, and later a proof of consistency relative to topology. But the most significant proof is that of Lejewski, who showed in 1969 that both Mereology and its underlying logic (namely Ontology) are consistent relative to an elementary sub-system of Protothetic.
3. Foundations of Mathematics
With Leśniewski’s untimely death in 1939, the picture he left of his program for the foundations of mathematics remains unfortunately unfinished and in some respects ambivalent. Developed in connection with his analysis of Russell’s paradox, his Mereology continues today to be widely studied and developed as a rich applied theory of the part-whole relation. However, Leśniewski intended his Mereology to be a nominalistically acceptable alternative to set theory. One of his purposes was to show that Mereology could be used to provide a foundation for mathematics that would not postulate the existence of questionable abstract entities. On the other hand, he did not deny the relevance of the distributive intuition about classes. He nevertheless considered that an expression like ‘the class of the b’s’ taken in the distributive sense could only be an apparent name that should be eliminated in favor of a language of pure logic. Thus, to assert in the distributive sense a sentence like ‘a is an element of the class of the b’s’ was only for him a façon de parler that amounts to asserting that a is one of the b’s, precisely what is expressed by ‘a ε b’ in his Ontology. This eliminativist conception of distributive classes could arguably have led Leśniewski to consider, as several of his followers did, that the core notions of arithmetic (in particular, that of cardinal number) should find their foundations not in mereology, but directly in the purely logical system of Ontology.
a. Mereology and Set Theory
In order to determine to what extent Mereology could be an alternative to set theory, Leśniewski set out to prove numerous theorems that he considered to be analogues of important results of set theory. Unfortunately, this task remained unfinished. Perhaps the most interesting proof he gave was the mereological analogue of Cantor’s theorem on the cardinality of power sets. The problem with mereological classes is that they do not generally carry a specific cardinality. As Frege already remarked about aggregates (his naive notion for mereological classes), firstly there is no aggregate that consists of zero objects, and secondly an aggregate can only have a cardinal number when it is conceived through a concept. To overcome this last difficulty, Leśniewski introduces two notions: the notion of mereological collection and the notion of discrete name. These two notions, defined by the formulas D4 and D5, can be characterized informally as follows.
For any non-empty term b (like ‘Swiss people’), we have seen that there is exactly one class of the b’s. With D4, collections of b’s are all those classes that are generated either by all the b’s (so the class of Swiss people is a collection of Swiss people), or by only certain b’s (the class of Swiss people living in Lugano is also a collection of Swiss people), or even by only one b (Roger Federer is then also a collection of Swiss people). At first glance, it seems there must be 2^n − 1 collections of b’s, where n is the number of objects that fall under the name ‘b’. But this is only correct provided the b’s are disconnected objects (objects that are all mereologically external to each other, that is, with no element in common). In other words, using D5, one can say that the result is correct only when ‘b’ is a discrete name. Leśniewski then proved what he considered to be the mereological analogue of Cantor’s theorem:
(If ‘b’ is a plural discrete name, then there are strictly more collections of b’s than there are b’s themselves.)
The analogy is quite clear. With this result, Leśniewski could have tried to establish other important analogous results (for example, the existence of an endless hierarchy of increasingly large infinite cardinals in a Mereology supplemented with an axiom of infinity). Unfortunately, he did not have time to develop his efforts in such a direction, and no subsequent work has been able to show that Mereology is strong enough to achieve such goals while respecting the nominalistic requirements of Leśniewski.
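The counting fact behind the analogue can be verified in a toy model where the bearers of a discrete name are modelled as disjoint singletons and a collection is the union of a nonempty group of them: n disconnected objects generate exactly 2^n − 1 collections, and 2^n − 1 > n as soon as the name is plural (n ≥ 2). A sketch under these assumptions:

```python
from itertools import combinations

def collections(atoms):
    """All mereological collections (sums) generated by pairwise
    disjoint atoms: one union per nonempty group of atoms."""
    sums = set()
    for r in range(1, len(atoms) + 1):
        for group in combinations(atoms, r):
            sums.add(frozenset().union(*group))
    return sums

# A discrete name with four bearers, modelled as disjoint singletons.
atoms = [frozenset({i}) for i in range(1, 5)]
colls = collections(atoms)
assert len(colls) == 2 ** len(atoms) - 1   # 15 distinct collections
assert len(colls) > len(atoms)             # the Cantor analogue for a plural name
```

If the atoms were not disjoint, distinct groups could generate the same sum and the count 2^n − 1 would fail, which is exactly why the theorem is restricted to discrete names.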
In the Leśniewskian perspective, however, an important difference with the set-theoretic approach must be emphasized. Whereas in Cantor’s theorem, the cardinalities to be compared are those of sets (a set and its power set), in Leśniewski’s analogue, it is not mereological classes that carry a cardinality, but names (a name ‘b’ on the one hand and the name ‘Coll(b)’ on the other hand). What is compared in the mereological analogue is not the number of the elements of different classes, but the number of objects falling under certain names. This observation leads to two remarks. First, even if the Leśniewskian approach does not introduce any ambiguity or confusion, it still mixes both the collective and distributive views on pluralities. Secondly, the notion of cardinality introduced in this context clearly belongs to the underlying logic (Ontology), and not specifically to Mereology. This provides a rather decisive reason to favor a foundational approach to arithmetic from Ontology and not from Mereology.
b. Ontology and Arithmetic
Although he did not explicitly give clear philosophical reasons against the idea of a logicist approach to arithmetic, Leśniewski made no attempt to reduce arithmetic to his Ontology. Instead, he merely developed in the language of Ontology an axiomatization of arithmetic that is more or less a translation of Peano’s second-order arithmetic. In view of the great wealth of his systems, this is a somewhat disappointing result, one that the Polish logician presumably regarded as no more than a stage. In any case, Canty showed in 1967 in his PhD thesis that the arithmetic of the Principia Mathematica could be fully reconstructed within Ontology. By exploiting several of Canty’s techniques, Gessler, Joray, and Degrange set up in 2005 a logicist program in which they show that second-order Peano arithmetic can be reduced to Ontology, using only a single non-logical axiom (an axiom of infinity) that can be stated as follows in the primitive language of Ontology:
(There is an object a for which there is a one-one relation such that every object is in its domain and every object except a is in its codomain.)
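Rendered from this gloss in ordinary notation (the symbolism is not the primitive language of Ontology, which quantifies over name and relation variables, and the reading of “every object except a” as also excluding a from the codomain is an interpretive assumption):

```latex
\mathrm{AxInf}\colon\quad \exists a\,\exists R\,\Bigl(R\ \text{is one-one}
  \;\wedge\; \forall x\,\exists y\,(x\,R\,y)
  \;\wedge\; \forall x\,\bigl(x\neq a\rightarrow\exists y\,(y\,R\,x)\bigr)
  \;\wedge\; \neg\exists y\,(y\,R\,a)\Bigr)
```

A one-one relation defined on every object whose codomain omits exactly one object can exist only in an infinite universe, which is the intended effect of the axiom.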
With suitable definitions of ‘one-one relation’, ‘domain’, and ‘codomain’, the logicist construction of Peano’s arithmetic can then be obtained, resting on a definition of nominal equinumerosity and an associated definition of the many-link functor ‘Card’.
This last definition immediately leads to the perfectly predicative Leśniewskian analogue of Frege’s renowned law, commonly referred to as Hume’s Principle.
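Although the formula is not reproduced here, its shape is fixed by the surrounding text: cardinality attaches to names, and two names receive the same cardinal exactly when a one-one relation matches their bearers. A conjectural rendering, with ≈ for nominal equinumerosity and ε for Ontology’s ‘is (one of)’:

```latex
a \approx b \;\leftrightarrow\; \exists R\,\Bigl(R\ \text{is one-one}
  \;\wedge\; \forall x\,\bigl(x\mathrel{\varepsilon} a \rightarrow \exists y\,(y\mathrel{\varepsilon} b \wedge x\,R\,y)\bigr)
  \;\wedge\; \forall y\,\bigl(y\mathrel{\varepsilon} b \rightarrow \exists x\,(x\mathrel{\varepsilon} a \wedge x\,R\,y)\bigr)\Bigr)

\mathrm{HP}\colon\quad \mathrm{Card}(a)=\mathrm{Card}(b) \;\leftrightarrow\; a\approx b
```

The predicativity of the Leśniewskian version lies in the fact that ‘Card’ is a functor on names, not an object-forming abstraction operator.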
From here on, cardinal numbers are introduced as those S/N-functors which satisfy a suitable defining condition, and natural numbers as those cardinal numbers which are inductive in the sense of Frege.
Completing this in a manner very similar to Frege’s (in particular by making explicit the definitions of ‘zero’, ‘successor’, and of inductivity), one obtains a development of infinite Ontology in which the axioms of Peano’s second-order arithmetic are provable. This construction simplifies and substantially improves that found in the Principia Mathematica. But, as with Whitehead and Russell, it leads to an arithmetic that has to be duplicated in higher orders if its numbers are to be applied not only to the counting of objects, but also to the counting of properties, functions, relations, and so on. This might explain why Leśniewski did not investigate in this direction and why he engaged in a foundational attempt based on his Mereology. Without textual evidence or decisive testimony on that subject, this unfortunately remains a matter of speculation.
4. References and Further Reading
Ajdukiewicz, K. “Die Syntaktische Konnexität.” Studia Philosophica 1 (1935): 1-27. (English translation in S. McCall, ed. Polish Logic 1920-1939. Oxford: Clarendon, 1967: 207-231.)
Canty, J. T. Leśniewski’s Ontology and Gödel’s Incompleteness Theorem. PhD Thesis of the University of Notre Dame, 1967.
Canty, J. T. “The Numerical Epsilon.” Notre Dame Journal of Formal Logic 10 (1969): 47-63.
Clay, R. E. “The Relation of Weakly Discrete to Set and Equinumerosity in Mereology.” Notre Dame Journal of Formal Logic 6 (1965): 325-340.
Clay, R. E. “The Consistency of Leśniewski’s Mereology Relative to the Real Numbers.” Journal of Symbolic Logic 33 (1968): 251-257.
Gessler, N., Joray, P., and Degrange, C. Le logicisme catégoriel. Travaux de Logique (Neuchâtel University) 16 (2005): 1-143.
Joray, P. “Logicism in Leśniewski’s Ontology.” Logica Trianguli 6 (2002): 3-20.
Joray, P. “A New Path to the Logicist Construction of Numbers.” Travaux de Logique (Neuchâtel University) 18 (2007): 147-165.
Joray, P. “Un système de déduction naturelle pour la Protothétique de Leśniewski.” Argumentum 18 (2020): 45-65.
Küng, G. “The Meaning of the Quantifiers in the Logic of Leśniewski.” Studia Logica 26 (1977): 309-322.
Lejewski, C. “Logic and Existence.” British Journal of the Philosophy of Science 5 (1954): 104-119.
Lejewski, C. “On Leśniewski’s Ontology.” Ratio 1 (1958): 150-176.
Lejewski, C. “Consistency of Leśniewski’s Mereology.” Journal of Symbolic Logic 34 (1969): 321-328.
Leśniewski, S. “Introductory Remarks to the Continuation of my Article ‘Grundzüge eines neuen Systems der Grundlagen der Mathematik’.” in S. McCall (ed.) Polish Logic 1920-1939. Oxford: Clarendon, 1967: 116-169.
Leśniewski, S. “On Definitions in the So-called Theory of Deduction.” in S. McCall (ed.) Polish Logic 1920-1939. Oxford: Clarendon, 1967: 170-187.
Leśniewski, S. “On the Foundations of Mathematics.” Topoi 2 (1983): 7-52.
Leśniewski, S. S. Leśniewski’s Lecture Notes in Logic, edited by J. T. J. Srzednicki and Z. Stachniak. Dordrecht: Kluwer, 1988.
Leśniewski, S. Collected Works, edited by S. J. Surma, J. T. J. Srzednicki, J. D. Barnett, and V. F. Rickey. 2 vols. Dordrecht: Kluwer / Warszawa: PWN Polish Scientific Publishers, 1992.
Łukasiewicz, J. The Principle of Contradiction in Aristotle. A Critical Study. (1910), English translation by H. R. Heine, Honolulu: Topos Books, 2021.
Luschei, E. C. The Logical Systems of Leśniewski. Amsterdam: North Holland, 1962.
Miéville, D. Un développement des systèmes logiques de Stanisław Leśniewski. Protothétique-Ontologie-Méréologie. Bern: Peter Lang, 1984.
Miéville, D. and Vernant, D. (eds.) Stanisław Leśniewski Aujourd’hui. Grenoble: Groupe de Recherche sur la Philosophie et le Langage, 1996.
Miéville, D., Gessler, D., and Peeters, M. Introduction à l’oeuvre de S. Leśniewski. Vols I-VI. Series of special issues of Travaux de Logique (Neuchâtel University), 2001-09.
Rickey, V. F. “Interpretations of Leśniewski’s Ontology.” Dialectica 39 (1985): 182-192.
Rickey, V. F. “An Annotated Leśniewski Bibliography.” First version 1972, last version 2019, available at https://lesniewski.info/.
Russell, B. Principles of Mathematics. London: Allen and Unwin, 1903.
Simons, P. “A Semantics for Ontology.” Dialectica 39 (1985): 193-216.
Simons, P. “Stanisław Leśniewski.” Stanford Encyclopedia of Philosophy (2015).
Słupecki, J. “St. Leśniewski’s Protothetics.” Studia Logica 1 (1953): 44-112.
Słupecki, J. “Leśniewski’s Calculus of Names.” Studia Logica 3 (1955): 7-72.
Sobociński, B. “L’analyse de l’antinomie russellienne par Leśniewski.” Methodos 1 (1949): 94-107, 220-228, 308-316 and Methodos 2 (1950): 237-257. (English Translation in Srzednicki, J. T. J. and Rickey, V. F. eds, 1984: 11-44.)
Srzednicki, J. T. J. and Rickey, V. F. (eds.) Leśniewski’s Systems: Ontology and Mereology. The Hague: Nijhoff / Wrocław: Ossolineum, 1984.
Srzednicki, J. T. J. and Stachniak, Z. (eds.) Leśniewski’s Systems: Protothetic. Dordrecht: Kluwer, 1998.
Stachniak, Z. Introduction to Model Theory for Leśniewski’s Ontology. Wrocław: Wydawnictwo Universitetu Wrocławskiego, 1981.
Tarski, A. “On the Primitive Term of Logistic.” (1923), in Logic, Semantics, Metamathematics. Papers from 1923-1938 by Alfred Tarski. Oxford: Clarendon, 1956, 1-23.
Urbaniak, R. Leśniewski’s Systems of Logic and Foundations of Mathematics. Cham: Springer, 2014.
Whitehead, A. N. and Russell, B. Principia Mathematica. 2nd ed. Cambridge University Press, 1927.
Authors of accepted articles should follow these guidelines before submitting their final draft. Here is a short list of the main ones. For more details on other topics and to understand why a copyeditor changed your article on one of these topics, see the longer list of topics that follows the short list.
Do not argue for an original thesis as you would in a journal article. You are providing a balanced presentation of the key positions held by professionals who write on your topic.
Use no footnotes or endnotes.
Before the table of contents, include an opening summary of 200-500 words that tells the reader what is coming. Do not mention this summary in the table of contents.
Quotations are not required to have references. It is OK to say Aristotle said so and so without saying where, but if a referee asks for a detailed reference, then you must supply it.
Avoid using a large number of references as you would if you were writing a definitive book on your topic. Your intended audience is not professional researchers on your topic. It is students majoring in philosophy, intellectually interested general readers, and professional philosophers who are reading outside their area of specialization.
Our policy for the opening summary and table of contents is to avoid placing detailed citation information there and to place it instead within the body of the article. Details in the opening summary are superfluous and distracting. The summary’s purpose is to summarize what follows and to encourage readers to continue reading.
Avoid the use of “I” and other self-referential terms. Revise the “royal we” that is sometimes used by a single author in a professional journal.
If your article is accepted for publication, it cannot be submitted as a pdf file or msg file, but only as a doc file or tex file. You are welcome to include a pdf file in order to show what you intend with the other file. Do not place references in a .bib file.
Do not use the words “recent” or “now” or “currently” because these likely will be inapplicable seventy-five years from now.
In submitting your article to be refereed, remove information that will identify you, but do give this information to your subject area editor, who will add it back into the final version after your article is accepted and ready to be copyedited. Microsoft Word is the preferred program to use in writing your article, but if the article contains a great many technical symbols, submission as a LaTeX document is acceptable. Turn off right justification; right justification will be added during our formatting process. Authors who are non-native English speakers have the responsibility to submit articles in standard English; many of our foreign authors make use of English-speaking proofreaders before submitting their articles.
a. Author Copyright and Licensing Agreement
The author should know that he or she enters into an agreement with the IEP such that:
The author grants an exclusive and perpetual license to the IEP to use and distribute the article through Internet media.
This license covers the present version/edition of the IEP, all future versions/editions of the IEP, and all derivations of the IEP in the present media format as well as other possible formats such as new IEP sites, printed works, and portable media.
The author receives no financial payment for his/her article from the IEP.
Authors retain copyright to their article and the right to publish the article elsewhere in a format that does not compete with the IEP in nature and scope (such as in another encyclopedia or dictionary). IEP articles cannot be posted elsewhere on the internet where there is public access; those websites should link to our articles instead. The IEP has an interest in preventing the dilution of this resource that can occur when IEP articles are copied and posted on external sites. The IEP also has an interest in assuring that only the most recently updated versions of its articles are on the internet, and external duplication compromises this interest. With some types of password-protected websites, reproducing complete articles may be allowable. With printed publications, authors must inform the IEP general editors of any secondary publishing opportunity and also inform the secondary publisher about the author’s prior agreement with the IEP. You, as the author, can use all or any part of your IEP article elsewhere in printed formats. However, you need to be careful that you do not accidentally sign over the copyrights of that material to your publisher whose standard contract may say that none of your material be published elsewhere. If this happens, then the IEP could have to remove your article. The best way to avoid that is by including an acknowledgment somewhere in your new article or book that says that the passages in question are revised versions of articles in the IEP and appear in your book by permission of the IEP.
The IEP reserves the right to grant permission to reprint articles at the request of third parties (such as universities or book publishers), particularly when the third-party publication is in keeping with the nonprofit and educational mission of the IEP. For-profit requests will be referred to the author.
The IEP routinely re-evaluates posted articles for their academic rigor, currentness, readability, length, intended audience, and placement within the overall vision of the IEP. The IEP reserves the right to remove or discontinue any posted article at its sole discretion. To assist in such re-evaluations, the editors may solicit post-publication peer reviews of articles and contact authors for needed revisions. In some cases, if an author is not able to revise an article as requested by the IEP editors, it may be removed and replaced with one on the same topic by a different author. In other cases, an article topic itself may fall outside the evolving vision of the IEP and may be removed without replacement. In either case, the licensing agreement with the author will terminate, and the author will regain full copyrights to the article, including the right to publish it elsewhere on the Internet.
b. Length
Most articles on the IEP are 8,000 – 15,000 words, not counting bibliographical material. Articles on minor topics should be at least 8,000 words in length. Those on the most important topics should be between 10,000 and 15,000 words, although there are no space restrictions, and authors are encouraged to err on the side of being too long rather than too short. Articles about major philosophers and major topics may require a series of articles to adequately cover the scope of their contributions; see for example the set of articles on Aristotle and the set on Hume. You might consider writing a new member of either series.
c. Article’s Purpose
The purpose of your article is to present information accepted by colleagues working in the subject area. An encyclopedia article is not the place to defend an original thesis. Your article should explain the topic in language the reader can understand, and it should present the important distinctions, the main results, and the main points of view on controversial issues. Your article should call attention to related topics and should provide some references in case the reader wishes to read further on your topic.
2. Content
a. Level of Sophistication
The Encyclopedia’s articles should be written with the intention that most of the article can be understood by advanced undergraduates majoring in philosophy and by other scholars who are not working in the field covered by that article. The IEP articles are written by experts but not for experts, in analogy to the way the Scientific American magazine is written by scientific experts but not primarily for scientific experts. For topics that are unavoidably technical, such as an article on Einstein’s General Theory of Relativity or on Ramsey sentences in logic, the goal should be to make the early parts easy-going so that the technically unsophisticated reader will still profit from reading much of the article.
b. Opening Summary
Preceding the table of contents, the beginning of each article should contain a summary of the article in 200 to 500 words. The purpose of the summary is to give readers a quick overview of your topic. See the article on Substance as an excellent example. You have three goals to achieve simultaneously: (1) to convey some understanding of your topic to those readers who will read the summary with no intention of reading the entire article, (2) to say something intriguing that will make readers want to continue reading beyond the summary, and (3) to give readers who do intend to read the full article some idea of the territory ahead. Think of the opening summary as functioning as a brief entry that might appear in a philosophy dictionary. The vast majority of our readers (one million per month) will read only your opening summary. Because of the opening summary’s critical role, the IEP staff may revise these either prior to or after publication for purposes of clarity or uniformity.
Here is what not to do in the opening summary. Do not focus primarily on saying your topic or your philosopher is influential and important. Do not spend time saying how your article is not comprehensive. Do not mention yourself. Instead, add more information about what philosophical contributions are made and how. Put yourself in the shoes of a reader who probably will not be reading your full article but only its opening summary and who wants to learn something about what issues are covered and in what manner. For an article on an individual philosopher, focus on what theses the philosopher defended, and on their style of doing philosophy—for example, that she is a post-Kantian German idealist.
In the opening summary, it would be preferable not to include any quotations nor any citations and to make your points in your own words. The more detailed quotations and citations can be included later in the main body of your article.
c. Structure of Articles on People
The body of the article should begin with a biography of the philosopher. The bulk of the article will consist of a discussion of the philosopher’s main philosophical contributions; in most cases, a topical presentation of this material is preferred over a strictly chronological presentation. The discussion might include influences on the philosopher, traditional criticisms of the philosopher, and the impact of the philosopher on later philosophers. See the article on Francis Bacon as an example. As in the Bacon article, you are encouraged to include a copyright-free picture with your own article.
d. Structure of Articles on Topics
The opening sentence of the summary paragraph should be a general definition of the term being used, such as, “The term ‘category’ means ultimate or fundamental division.” The opening of the article body should say something about the origin of the term, the context in which the term is used, or alternative definitions of the term. The article will typically consist of a chronological survey of philosophical theories relating to the term. Where appropriate, include traditional criticisms of the theories in question. Please see the article on Logical Paradoxes as an example.
3. Style
The IEP prefers American English, but other dialects such as British, South African, Australian, and Indian are acceptable. Just be consistent throughout the article. Regarding citations, the IEP prefers articles written according to the Chicago Manual of Style, but other coherent documentation styles are acceptable.
a. Formal Style
All articles should be written in a formal, yet simple style, such as that used in the Encyclopedia Britannica, the Routledge Encyclopedia of Philosophy, or the Encyclopedia of Philosophy. Avoid slang. Avoid reference to yourself, as in “I find this argument to be unconvincing,” or “As I’ve shown in the previous paragraph….”
b. Simple Style
Avoid technical jargon, especially technical terms that are unexplained. Articles should be written in a straightforward style that is accessible to intelligent but general readers. Although the IEP is regularly visited by professional philosophers, many users of the IEP are philosophy students or philosophically curious web surfers. To best serve the interests of these latter users, authors should minimize unnecessary technical vocabulary.
c. No Footnotes or Endnotes
Although footnotes and endnotes are common in journal articles and books, they are not allowed in our encyclopedia articles.
d. Minimize Citations and Quotations
Minimize the number of citations. You are not writing a treatise for researchers but rather a popularization. An in-line (in a paragraph) citation might include only an author’s name, or it might include the author plus a book or journal title. A fuller citation with page numbers, year, and publisher can be given in the final section called “References and Further Reading” or simply be omitted, so long as you are confident that you could give the citation to the referee if you were asked for it. The expectation is that authors will paraphrase and interpret the published material in their subject area; and for that reason, the number of citations can be minimized.
e. Other Stylistic Conventions
When starting a new paragraph, do not indent the first word; instead, skip to a new line.
Avoid using underlining or boldface; use italics instead. Underlines or boldface look too emphatic.
Minimize the use of in-text references such as “…as shown in section (4) second paragraph below and also in section (6b, 3rd paragraph).”
Avoid the use of most Latin abbreviations. Here are the English replacements:
cf. | compare
et al. | and others
etc. | and so forth
e.g. | for example
i.e. | that is
NB | note
viz. | namely
It is OK to use “ibid.” and “op. cit.” and to use Latin quotations and names of Latin books.
Refer to articles within the encyclopedia as articles, not entries.
Every article must end with the author information, and not be followed by any other text or diagram.
4. Format
Turn off right justification, and let words wrap naturally to the next line.
Do not number your pages or your items in the section on References and Further Reading.
a. Table of Contents and Section Headings
When your article has been accepted by the editor and is being formatted, the section headings in the body of the article will be generated automatically from your table of contents, not from whatever section headings you have within the body of your article, so be very careful about the accuracy of your table of contents.
Do not include page numbers within the table of contents. When creating your table of contents, do not use the Word macro that automatically produces a table of contents at References | Table of Contents. This creates a table of contents with extraneous page numbers and dots that cannot be edited or accepted by the IEP’s formatting program.
For typical articles, please use between five and twenty section headings. Fewer than five will make the article look too simple, and readers with specific topics in mind will find it difficult to locate their topic. More than twenty will make the contents list too complex and, in most cases, make some of the sections too short and visually awkward. Do not mention the opening summary in the table of contents. Use capitals and small case, not all capitals: capitalize verbs, nouns, and adjectives, but not articles, prepositions, or conjunctions. Use of subheadings is optional. That is, the table of contents can be either flat or hierarchical.
Use the following numbering and lettering conventions:
Title
Opening Summary
Table of Contents
Heading One
    Subheading One
        Subsubheading One
            Subsubsubheading One
            Subsubsubheading Two
        Subsubheading Two
            Subsubsubheading One
            Subsubsubheading Two
    Subheading Two
Heading Two
Heading Three
References and Further Reading
Although a hierarchical table of contents may have up to four levels, we encourage you to restrict yours to at most three levels for visual elegance. The final heading of any article must be called “References and Further Reading.”
b. References and Bibliographies
Every article’s final section must be called “References and Further Reading.” This is to be followed by the author information, namely the author’s name, email address, university affiliation, and country.
Bibliographies and lists of citations and references should be brief. You do not need a citation for every quotation, as you do in a journal article. Do not exceed 50 entries except in special cases with approval from the general editors. Whenever possible, the IEP prefers bibliographies that are annotated with a short sentence describing the nature of the work cited; see Fallibilism for an example. The rationale for these policies is that long, unannotated lists will confuse casual readers, and will not be needed by professional researchers who already will have more extensive lists available from elsewhere.
We suggest that you follow the Chicago Manual of Style regarding document style, but other formats such as MLA (Modern Language Association) or APA (American Psychological Association) or other coherent and consistent styles are acceptable. The heading for the bibliography must be “References and Further Reading,” but it may contain subheadings. Typical subheadings are “Primary Sources” and “Secondary Sources.” Italicize all names of books and journals; do not use underlining. Do not use dashes in place of author names, although this is commonly done in journals. The IEP does not allow references to forthcoming publications.
c. Hyperlinks
When appropriate, include hyperlinks in your article to other IEP articles. However, do not include any hyperlinks to non-IEP web sites unless these are stable sites that are very unlikely to be changed during the next forty years. Here is what we mean by “stable”: URLs of journals, professional organizations, and encyclopedias are fine, but an individual’s website or blog is not.
The IEP aims to be a self-contained resource, rather than a link list. Also, because external links require continual updating, we hope to avoid this time-consuming task. It is our policy not to include links to translations of your article.
Avoid hyperlinks in the opening summary; use them in the main body of the article. Minimize hyperlinks to other areas within your article such as “as shown below at A” and “as was shown at B.” Readers rarely click on these.
d. Graphics, Photos, and Charts
Authors are encouraged to include graphics in their articles, provided they do not infringe on copyrights. See the article on Francis Bacon as an example. The graphics (including photos, charts and tables) need to be submitted as separate files in either .jpg or .gif or .png format. The IEP staff cannot create graphics, but we can crop a graphic and make simple adjustments to a graphic’s size.
e. Foreign Language Terms
Authors should minimize the use of italics with foreign language terms. Regarding diacritical marks, use those that have html codes. Replace other diacritical marks with ordinary letters. Please see our helpsheets on Chinese and Indian languages.
f. Description of Author
When you send your article to be refereed, do not include information about yourself. However, when you are submitting your final draft after acceptance, then do include your name, email address, your university’s name, and your country. Do not include your title (for example, “associate professor”), your department name (for example “Department of Humanities and Philosophy”), your university’s street address, or your personal webpage address.
g. Acknowledgments
Do not include dedications and acknowledgments to colleagues who provided input on your article, nor to institutions that provided you with funding. Do all this privately.
5. Miscellaneous
a. Authors Who Are Non-Native English Speakers
Authors who do not speak English as their primary language must have their articles revised by a native English speaker, or equivalent, prior to submission.
b. Editing after Acceptance
After your article has been accepted for publication, send the final version to your subject area editor or to the general editors as an e-mail attachment. Use separate attachments for all the graphics, photos, and charts. The editors will then reformat the article to fit the standard IEP design and layout. If you submit a photograph or graphic with your article, indicate who owns the copyright, or indicate that it is in the public domain and thus no longer under copyright. More details of the acceptance process are given on the Submissions page.
After acceptance, the IEP staff copy edits the article before publication. Normally these are minor stylistic changes that are intended not to affect the article’s content. Unlike the case with journal articles, the author is not sent proofs. Instead, the changes are made, and the article is provisionally posted in the Encyclopedia; then in the early days of the publication (before Google and other search engines re-index the IEP site to include our new articles), the author is notified and encouraged to read the article and request changes. If an author wishes to see the final version before publication, it is the author’s responsibility to request this.
The general editors have the right to final acceptance of all articles. In carrying out a unified plan for the entire IEP, the general editors may solicit additional peer reviews of either submitted articles or already posted articles, and request that authors make additional changes including changes to content, style, and length. At any time the general editors may also alter article titles and revise the opening summary of articles.
c. Editing after Posting
After your article is published or tentatively published awaiting your approval, feel free to request further changes, including changes of content. You may make any minor content changes you wish. Major changes in content should be approved by the subject area editor.
Examples of major changes would be the elimination or addition of a section and its subheading in the table of contents, or the addition of a lengthy discussion of a new topic. A minor change is one that improves the presentation of a point.
When telling us what to change, do not send a revision of the Word document that you originally sent before the article was posted because doing so removes all the improvements made by the copy editor and by the formatter. Instead, do one of the following:
(1) If you have only a small number of changes, then send a numbered list of desired changes within an email or an attached Word document. Indicate where in the posted article each change is to be made by giving the section number and the paragraph number such as:
Section 4b, 3rd paragraph. Change its second sentence from “Kant said 1 + 1 = 3” to “Kant said 1 + 1 = 2.”
(2) Capture the content of your online article as a Word document (while (i) retaining the source formatting and (ii) turning on Track Changes). Indicate your desired changes in red font in the Word document, and send that. Feel free to add helpful, explanatory side comments where needed in the Word document.
d. Atypical Review and Publication Delays
Authors should contact their subject area editor if, after submission, there is an unusually long delay at a particular juncture in the process from the article’s initial submission to its final appearance on the IEP website. Each submission typically passes through five hands (subject area editor, reviewer, general editor, copy editor, formatter). While we strive to maintain an efficient production process, occasionally there are unexpected delays as a result of scheduling issues with our all-volunteer staff. In the rare event that this occurs, the author should send a follow-up email if there is a delay of more than one month at any given stage.
Abelard: Logic
This article describes and reconstructs the twelfth-century logic of Peter Abelard. Much of what he regarded as logic is now classified as ontology or philosophical semantics. The article concentrates on his treatment of the relation of consequence. Abelard’s most important logical innovations consist in two points:
The distinction between two kinds of negation. This is used to extend the traditional Square of Opposition to an octagon.
The introduction of a relevant implication. The aim is to avoid the paradoxes of strict implication and to safeguard the basic principles of connexive logic.
For the latter goal, Abelard rejected the traditional “locus ab oppositis” which says that if one of two opposite concepts is predicated of a certain subject, the other concept has to be denied of the same subject. We now know this approach failed. Alberic of Paris developed an “embarrassing argument” which showed that—in contradiction to Aristotle’s connexive theses—there exist propositions which logically imply their own negation. The conclusiveness of Alberic’s counterexample does not presuppose the validity of the “locus ab oppositis.” Other aspects of Abelard’s philosophy are treated in the main Abelard article.
Abelard’s philosophical works were first edited in 1836 by Victor Cousin. Besides the rather theological essay “Sic et non,” the Ouvrages inédits d’Abélard contain various logical works, namely, commentaries on Aristotle, Porphyry, and Boethius, and a preliminary version of the Dialectica. The next important edition of Abelard’s logical writings was achieved by Bernhard Geyer who, in the period from 1919 to 1933, published Peter Abaelards Philosophische Schriften. This collection contains in particular a Logica ‘Ingredientibus’ and a Logica ‘Nostrorum petitioni.’ The strange titles do not have any specific meaning; Geyer simply chose them according to the words with which the texts begin. In 1954, Mario Dal Pra edited Pietro Abelardo Scritti filosofici, which contain, in particular, Abelard’s so-called “children’s logic” (“logica parvulorum”). A complete version of Abelard’s most important logical work, the Dialectica, based on a manuscript in Paris, was edited in 1959 by Lambert Marie de Rijk. This volume forms the basis for the reception of Abelard’s logic which started in the last third of the 20th century.
In 1964, Maria Teresa Beonio-Brocchieri Fumagalli published the small volume La Logica de Abelardo, which in 1969 appeared in English as The Logic of Abelard. The title, however, is something of a misnomer because the book does not really deal with Abelard’s logic. The genuine innovations of Abelard’s logical theories were first uncovered by Christopher Martin in the 1980s, especially in his dissertation on Theories of Inference and Entailment in the Middle Ages, and in the papers Martin (1986), (1987), and (2006). Abelard’s theory of the modal operators is extensively discussed in Binini (2022).
King & Arlig (2018) maintain that:
Abelard […] devised a purely truth-functional logic […], and worked out a complete theory of entailment. […] An entailment is complete (perfecta) when it holds in virtue of the logical form (complexio) of the propositions involved. By this […] he means that the entailment holds under any uniform substitution in its terms […]. The traditional moods of the categorical syllogism […] are all instances of complete entailments, or as we should say, valid inference.
Abelard spends a great deal of effort to explore the complexities of the theory of topical inference […]. One of the surprising results of his investigation is that he denies that a correlate of the Deduction Theorem holds, maintaining that a valid argument need not correspond to an acceptable conditional […].
In the end, it seems that Abelard’s principles of topical inferences do not work, a fact that became evident with regard to the topic “from opposites.” Abelard’s principles lead to inconsistent results.
These claims have to be modified, corrected, and supplemented in several respects. First, for Abelard both entailment and disjunction are intensional or modal, not extensional, that is, merely truth-functional. Second, his theory of entailment distinguishes not only between perfect and imperfect inferences, but also between what is nowadays called strict implication and the even stronger conception of “relevant” implication. Third, in connection with traditional logic it makes little sense to speak of the Deduction Theorem, which says that if, in a logical calculus with certain axioms and rules of deduction, one may deduce a conclusion C from a set (or conjunction) of premises P1, …, Pn, then the implication (P1 ∧ … ∧ Pn → C) is provable in that calculus; medieval logic, however, was never developed in the form of an axiomatic calculus. Fourth, as regards Abelard’s principles of “topical inference,” it is not quite correct to maintain that they lead to “inconsistent results.” Rather, Abelard rejected the traditional topic from opposites in order to save “Aristotle’s Theses” from refutation, but his attempt turned out to be unsuccessful, since Alberic of Paris presented an ingenious counterexample to the connexive principles which does not make use of the topic from opposites.
2. Outlines of the Theory of the Syllogism
Abelard was well acquainted with the theory of the syllogism as it had been invented by Aristotle (ca. 384-322 BC) and elaborated by Boethius (ca. 480-525). This theory deals with the categorical forms, in which a subject term S is related to a predicate term P:
Universal affirmative proposition (UA) Every S is P
Universal negative proposition (UN) No S is P
Particular affirmative proposition (PA) Some S is P
Particular negative proposition (PN) Some S isn’t P.
Later medieval logicians referred to these forms by means of the vowels ‘a’, ‘e’, ‘i’, and ‘o.’ Although Abelard did not use such abbreviations, the forms are here symbolized as SaP, SeP, SiP, and SoP, respectively. The traditional doctrine of subalternation saying that the universal propositions entail their particular counterparts is then formalized as follows (where ‘⇒’ symbolizes a logical implication):
Sub 1 SaP ⇒ SiP
Sub 2 SeP ⇒ SoP.
According to the modern analysis of the categorical forms in terms of first order logic, these laws are not unrestrictedly valid but hold only under the assumption that the subject term S is not empty.
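This caveat can be checked mechanically. The following sketch (a modern illustration, not part of the traditional theory) brute-forces every interpretation of S and P over a three-element domain, using the standard Boolean reading of the categorical forms, and confirms that Sub 1 and Sub 2 hold exactly when S is non-empty:

```python
from itertools import combinations

DOMAIN = [0, 1, 2]

def subsets(xs):
    # every subset of the domain, as a set
    return [set(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

# Boolean (modern) readings of the four categorical forms
def a(S, P): return S <= P            # SaP: Every S is P
def e(S, P): return not (S & P)       # SeP: No S is P
def i(S, P): return bool(S & P)       # SiP: Some S is P
def o(S, P): return bool(S - P)       # SoP: Some S is not P

for S in subsets(DOMAIN):
    for P in subsets(DOMAIN):
        sub1 = (not a(S, P)) or i(S, P)   # Sub 1: SaP => SiP
        sub2 = (not e(S, P)) or o(S, P)   # Sub 2: SeP => SoP
        if S:
            assert sub1 and sub2          # valid for non-empty S
        else:
            assert not sub1 and not sub2  # an empty S refutes both
```

On the Boolean reading an empty S makes SaP vacuously true and SiP false, which is exactly the restriction described above.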
The theory of opposition says that the contradictory opposite, or negation, of the UA is the PN, and that the negation of the UN is the PA. If the negation operator is symbolized by ‘¬’, these laws take the form:
Opp 1 ¬SaP ⇔ SoP
Opp 2 ¬SeP ⇔ SiP.
Hence, ‘It is not the case that every S is P’ is equivalent to ‘Some S is not P’; and ‘It is not the case that no S is P’ means as much as ‘Some S is P’. From this it follows that there is a contrary opposition between the two universal propositions: SaP and SeP can never be true together, while it is possible that neither of them is true. Furthermore, the two particular forms are subcontrary: SiP and SoP can never be false together, while it is possible that both are true. The laws of subalternation and opposition are often summarized in the well-known “Square of Opposition.”
The traditional theory of conversion says that a PA and a UN may be converted “simpliciter,” that is, one may simply exchange the predicate and the subject:
Conv 1 SiP ⇒ PiS
Conv 2 SeP ⇒ PeS.
Clearly, if some S is P, then conversely some P is S; and if no S is P, then also no P is S. In contrast, the UA can only be converted “per accidens,” that is, the “quantity” of the proposition must be diminished from ‘universal’ to ‘particular’:
Conv 3 SaP ⇒ PiS.
The validity of Conv 3 follows from the law of subalternation, Sub 1, in conjunction with Conv 1: if every S is P, then, a fortiori, some S is P so that, conversely, some P is S. Similarly, one might state another law of conversion according to which the UN can also be converted “accidentally”:
Conv 4 SeP ⇒ PoS.
This follows from Conv 2 by means of Sub 2.
Finally, most medieval logicians accepted the principle of “conversion by contraposition,” according to which the subject and the predicate of a UA may be exchanged when the terms ‘S’ and ‘P’ are replaced by their negations. That is, if every S is P, then every Not-P is not-S. If the negation of a term is symbolized by ‘~’ (thus distinguishing it from the negation operator for propositions, ‘¬’), the law of contraposition takes the form:
Contra SaP ⇒ ~Pa~S.
According to the principle of “obversion”, the negative propositions UN and PN can equivalently be transformed into affirmative propositions (with a negated predicate):
Obv 1 SeP ⇔ Sa~P
Obv 2 SoP ⇔ Si~P.
Hence, ‘No S is P’ is equivalent to ‘Every S is not-P’, and ‘Some S isn’t P’ is equivalent to ‘Some S is not-P.’ As a corollary, it follows that conversely the affirmative propositions UA and PA can equivalently be expressed as negative propositions (with a negated predicate):
Obv 3 SaP ⇔ Se~P
Obv 4 SiP ⇔ So~P.
The mutual derivability of these principles presupposes the law of double negation:
Neg 1 ~~T = T.
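All of the laws stated so far can likewise be verified by brute force. The sketch below (a modern set-theoretic illustration, not part of Abelard’s text) interprets term negation ‘~’ as complementation relative to the domain; note that, just as with subalternation, the accidental conversions Conv 3 and Conv 4 need a non-empty S and a non-empty P, respectively:

```python
from itertools import combinations

DOMAIN = frozenset({0, 1, 2})

def subsets(xs):
    xs = sorted(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def a(S, P): return S <= P            # Every S is P
def e(S, P): return not (S & P)       # No S is P
def i(S, P): return bool(S & P)       # Some S is P
def o(S, P): return bool(S - P)       # Some S is not P

def comp(X):                          # term negation '~'
    return DOMAIN - X

for S in subsets(DOMAIN):
    for P in subsets(DOMAIN):
        assert (not a(S, P)) == o(S, P)              # Opp 1
        assert (not e(S, P)) == i(S, P)              # Opp 2
        assert (not i(S, P)) or i(P, S)              # Conv 1 (simpliciter)
        assert (not e(S, P)) or e(P, S)              # Conv 2 (simpliciter)
        assert (not a(S, P)) or a(comp(P), comp(S))  # Contra
        assert e(S, P) == a(S, comp(P))              # Obv 1
        assert o(S, P) == i(S, comp(P))              # Obv 2
        assert a(S, P) == e(S, comp(P))              # Obv 3
        assert i(S, P) == o(S, comp(P))              # Obv 4
        if S:
            assert (not a(S, P)) or i(P, S)          # Conv 3 (per accidens)
        if P:
            assert (not e(S, P)) or o(P, S)          # Conv 4 (per accidens)
```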
A proper syllogism is an inference from two premises P1, P2 to a conclusion C where (normally) all these propositions are categorical forms and the premises must have one term in common. The best-known examples are the four “perfect” syllogisms:
Barbara CaD, BaC ⇒ BaD
Celarent CeD, BaC ⇒ BeD
Darii CaD, BiC ⇒ BiD
Ferio CeD, BiC ⇒ BoD.
In Abelard’s logic these inferences are not presented in such an abstract form, however, but mainly by way of examples. For instance, Abelard illustrates the “sillogismi perfecti” as follows:
Every just is good; Every virtuous is just; therefore, Every virtuous is good
No good is bad; Every just is good; therefore, No just is bad
Every good is virtuous; Some just is good; therefore, Some just is virtuous
No good is bad; Some just is good; therefore, Some just is not bad. (Compare Dialectica, p. 236)
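The validity of the four perfect moods can also be confirmed by exhaustive search over a small domain. The following sketch (a modern illustration under the Boolean reading of the categorical forms) checks every assignment of the three terms:

```python
from itertools import combinations

DOMAIN = [0, 1, 2]

def subsets(xs):
    return [set(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def a(S, P): return S <= P            # Every S is P
def e(S, P): return not (S & P)       # No S is P
def i(S, P): return bool(S & P)       # Some S is P
def o(S, P): return bool(S - P)       # Some S is not P

# each mood maps (B, C, D) to 'premises imply conclusion'
MOODS = {
    "Barbara":  lambda B, C, D: not (a(C, D) and a(B, C)) or a(B, D),
    "Celarent": lambda B, C, D: not (e(C, D) and a(B, C)) or e(B, D),
    "Darii":    lambda B, C, D: not (a(C, D) and i(B, C)) or i(B, D),
    "Ferio":    lambda B, C, D: not (e(C, D) and i(B, C)) or o(B, D),
}

for name, holds in MOODS.items():
    assert all(holds(B, C, D)
               for B in subsets(DOMAIN)
               for C in subsets(DOMAIN)
               for D in subsets(DOMAIN)), name
```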
In some places, however, Abelard also mentions corresponding generalized rules such as, in the case of Barbara:
If something is predicated universally of something else, and another is subjected universally to the subject, then the same is also subjected in the same way, i.e., universally, to the predicate (Compare Dialectica, p. 237),
or, in the case of Ferio:
If something is removed universally from something else, and another is subjected particularly to the subject, then the first predicate is removed particularly from the second subject (ibid.)
Abelard largely endorsed the traditional theory of the syllogism including the laws of subalternation, opposition, conversion, and obversion. In particular, in Logica ‘Ingredientibus’ he painted a standard square of opposition in which the logical relation between the ‘a’ and the ‘o’ proposition as well as the relation between the ‘e’ and the ‘i’ proposition are characterized as “contradictorie,” while the ‘a’ and the ‘e’ proposition are opposed “contrarie” and the ‘i’ and the ‘o’ proposition “subcontrarie.” Finally, a “subalterne” relation is drawn between the ‘a’ and the ‘i’ and between the ‘e’ and the ‘o’ proposition. The only difference between Abelard’s square (p. 412) and the usual square consists in the fact that his example deals with the special case where the universe of discourse has only two elements. Thus, instead of ‘Every man is white,’ ‘No man is white.’ and ‘Some man is white’ Abelard has ‘Both men are white’ (“Uterque istorum est albus”), ‘Neither is white’ (“Neuter, i. e. nullus ipsorum est albus”) and ‘At least one of them is white’ (“Alter est albus”).
The next section shows, however, that in Dialectica Abelard eventually rejected the traditional laws of opposition. His distinction between so-called “destructive negation” and “separating negation” entails the consideration of two variants of each of the categorical forms, and the ensemble of eight propositions, which Abelard arranged into two squares of opposition, can be united into an octagon.
3. Abelard’s Theory of Negation
As was mentioned already in the introduction, one of Abelard’s major logical innovations consists in the introduction of two kinds of negation by means of which the traditional Square of Opposition is extended to an octagon.
a. “Extinctive” vs. “Separating” Negation
In Logica ‘Ingredientibus,’ Abelard explains:
Not only with respect to categorical propositions, but also with respect to hypothetical propositions one has to distinguish between separating negation and extinctive negation. A separating negation [“negatio separativa”] obtains when by the position of the negative particle the terms are separated from each other […]. But an extinctive negation [“negatio exstinctiva” or “negatio destructiva”] obtains when by the position of the negative particle in front of the entire proposition this proposition is destroyed (Geyer, p. 406).
The extinctive negation of a proposition α is just the ordinary negation, ¬α. It can always be formed by putting ‘not’ in front of α. Thus, with respect to the categorical forms, one obtains:
Not every S is P
Not no S is P
Not some S is P
Not some S isn’t P.
With respect to “hypothetical” propositions, one similarly gets:
Not: If α then β ¬(α → β).
The extinctive negation satisfies the law of double negation,
Neg 2 ¬¬α ⇔ α,
but for some rather obscure reason Abelard hesitated to accept this law.
A separating negation obtains whenever the expression ‘not’ is placed somewhere “within” a proposition α so that it separates the predicate of a categorical proposition from its subject:
Every S is not P
No S is not P
Some S is not P
Some S isn’t not P.
With respect to hypothetical propositions, a separating ‘not’ separates the antecedent from the consequent:
If α, then not β.
Propositions with an extinctive negation differ from their separating counterparts in so far as, for example, the extinctively negated UA, ‘Not every S is P’, doesn’t have the same meaning (or the same truth-condition) as the separating negation ‘Every S is not P’. In view of the laws of obversion, the latter proposition rather expresses the same as a UN! Similarly, ‘Not no S is P’ means as much as ‘Some S is P’ while, according to the principle of obversion, ‘No S is not P’ amounts to ‘Every S is P’. Similar remarks apply to the extinctive vs. separating negations of the PA and the PN. Yet this doesn’t mean that there exists a general logical or semantical difference between the two kinds of negation. Let us have a closer look at Abelard’s theory of negation as applied first to singular propositions and then to quantified propositions.
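A single two-element model suffices to see the two negations of the UA come apart. The sketch below (an illustration on the Boolean reading, not from the article) evaluates both on a model where some but not all S are P:

```python
# two individuals; S applies to both, P only to the first
DOMAIN = {0, 1}
S, P = {0, 1}, {0}

def a(X, Y):                  # 'Every X is Y'
    return X <= Y

def comp(X):                  # term negation '~'
    return DOMAIN - X

extinctive = not a(S, P)      # 'Not every S is P' (negation in front)
separating = a(S, comp(P))    # 'Every S is not-P' (negation inside)

assert extinctive and not separating   # the two negations differ here
```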
b. Negating Singular Propositions
Starting from a singular proposition such as:
S1 Socrates is just (Socrates est iustus),
one can consider besides the extinctive negation:
S2 Not: Socrates is just (Non Socrates est iustus),
two variants of a separating negation:
S3a Socrates is-not just (Socrates non est iustus)
S3b Socrates is not-just (Socrates est non iustus).
According to Abelard, the variants S3a and S3b are equivalent. Therefore, in what follows, they shall simply be referred to as ‘S3’. Furthermore, S3 can itself be negated as:
S4 Not: Socrates is not just (Non Socrates est non iustus).
According to Abelard, the separating negation ‘Socrates is not just’, is the contrary opposite of the affirmation ‘Socrates is just,’ because both propositions become false when the subject ‘Socrates’ doesn’t exist! More generally, Abelard accepts the following principle:
Exist For any singular term s and any predicate P: the proposition ‘s is P’ implies (or presupposes) that ‘s is’, that is, that ‘s exists’.
Hence, if s doesn’t exist, both ‘s is P’ and ‘s is not-P’ are false. Therefore, the two (affirmative) propositions ‘s is P’ and ‘s is not-P’ together with their extinctive negations form the following square of opposition:
S1: s is P            contrary            S3: s is ~P
      ⇓                                        ⇓
S4: ¬(s is ~P)      subcontrary       S2: ¬(s is P)
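On Abelard’s principle Exist, the truth of both affirmations depends on the existence of s, and the four relations of this square can be checked over all cases. The following sketch is a modern illustration, not Abelard’s own formulation:

```python
from itertools import product

# exists: does Socrates exist?  has_P: does the predicate P apply to him?
for exists, has_P in product([True, False], repeat=2):
    S1 = exists and has_P          # 's is P'      (affirmation)
    S3 = exists and not has_P      # 's is not-P'  (separating negation)
    S2 = not S1                    # extinctive negation of S1
    S4 = not S3                    # extinctive negation of S3
    assert not (S1 and S3)         # S1, S3 contrary: never both true
    assert S2 or S4                # S2, S4 subcontrary: never both false
    assert (not S1) or S4          # subalternation S1 => S4
    assert (not S3) or S2          # subalternation S3 => S2
    if not exists:
        assert not S1 and not S3   # without s, both affirmations are false
```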
c. Negating Quantified Propositions
Starting from a PA such as:
C1 Some man is white (Quidam homo est albus),
one can consider besides the extinctive negation:
C2 Not: Some man is white (Non quidam homo est albus)
two variants of a separative negation:
C3a Some man is-not white (Quidam homo non est albus)
C3b Some man is not-white (Quidam homo est non albus).
While C3a is a particular negative proposition, C3b is usually considered as a particular affirmative proposition with a negative predicate. Since, in accordance with the theory of obversion, Abelard considered these propositions as equivalent, C3a and C3b may simply be referred to as ‘C3’.
Similarly, starting from a UA such as:
C4 Every man is white (Omnis homo est albus)
one can consider besides the extinctive negation:
C5 Not: Every man is white (Non omnis homo est albus)
two variants of a separative negation:
C6a Every man is-not white (Omnis homo non est albus)
C6b Every man is not-white (Omnis homo est non albus).
In view of the flexible grammar of the Latin language, C6a might be understood as synonymous with C5. Abelard, however, apparently understands C6a as equivalent to C6b, and both variants express the same state of affairs as the UN ‘No man is white.’ Therefore, both variants may simply be referred to as ‘C6’.
On p. 407 of Logica ‘Ingredientibus,’ Abelard draws the following diagram:
Omnis homo est albus       [contrary]       Omnis homo non est albus
        [⇓]                                          [⇓]
Quidam homo est albus     [subcontrary]     Quidam homo non est albus
This appears to be a normal Square of Opposition formed by the propositions C4, C6, C1, and C3. Next, Abelard draws another diagram consisting of the “extinctive” negations of the previous propositions:
Non omnis homo est albus             Non omnis homo non est albus
        [⇑]                                    [⇑]
Non quidam homo est albus            Non quidam homo non est albus
This appears to be a mirrored version of a normal Square of Opposition formed by the propositions C5 and C2, together with:
C7 Not every man isn’t white (Non omnis homo non est albus)
C8 Not some man isn’t white (Non quidam homo non est albus).
A few pages later, Abelard presents variants of these diagrams. The first diagram is entitled ‘exstinctiva’ because all negations are “extinctive”:
Omnis homo est albus        contrarie        Non quidam homo est albus
     subalterne                                   subalterne
Quidam homo est albus      subcontrarie      Non omnis homo est albus
Abelard’s annotations ‘contrarie’, ‘contradictorie’, ‘subcontrarie’ and ‘subalterne’ suggest that the figure represents an ordinary square of opposition. This also appears to hold true for the next diagram which is entitled ‘separativa’ since each proposition is now paraphrased by means of a separating negation.
Omnis homo non est albus        […]        Non quidam homo [non] est albus
     […]                                        […]
Quidam homo non est albus       […]        Non omnis homo non est albus
Note that here again the structure of the ordinary square is mirrored: The UN stands at the place of the UA and vice versa.
d. Abelard’s Octagon of Opposition
However, both in Logica ‘Ingredientibus’ and in Dialectica Abelard insists that the traditional view of a contradictory opposition between the UA and the PN is mistaken. The contradictory opposite of SaP is ‘Not: Every S is P,’ but this proposition is not equivalent to SoP. Rather, ‘Some S is not P’ is contrary to ‘Every S is P’ because it is possible that both propositions are false. Thus, Abelard explains:
Also with respect to categorical propositions the only correct negation of an affirmation, sharing the truth-values with it, appears to be that proposition which destroys the sense of the sentence by placing the negation in front of it; thus the negation of ‘Every man is a man’ is ‘Not every man is a man’, but not ‘Some man is not a man’; the latter might be false together with the affirmation. For if it were the case that there are no men at all, then neither ‘Every man is a man’ nor ‘Some man is not a man’ would be true. (Compare Dialectica, p. 176)
Hence, according to Abelard, if the subject-term S is “empty,” then both ‘Some S is not S’ and ‘Every S is S’ are false. More generally, if S is “empty,” then, for every P, the UA ‘Every S is P’ is false; that is, this proposition has “existential import”: it entails that ∃xS(x). This consideration leads to the assumption of altogether eight propositions. On the one hand, we have the four “normal” categorical forms which can be formalized by means of the quantifiers ‘∃x’ (‘there exists at least one x’) and ‘∀x’ (‘for every x’) plus the symbols ‘∧’, ‘∨’ and ‘⊃’ for the propositional operators of conjunction, disjunction and (material) implication:
C8    UA    Not some S is not P    ¬∃x(S(x) ∧ ¬P(x))
C2    UN    Not some S is P        ¬∃x(S(x) ∧ P(x))
C1    PA    Some S is P            ∃x(S(x) ∧ P(x))
C3    PN    Some S is not P        ∃x(S(x) ∧ ¬P(x))
On the other hand, one obtains two “strong” versions of universal propositions with existential import:
C4
UA+
Every S is P
∃xS(x) ∧ ∀x(S(x) ⊃ P(x))
C6
UN+
Every S is not P
∃xS(x) ∧ ∀x(S(x) ⊃ ¬P(x))
Furthermore, the negations of these “strong” universal propositions yield “weak” interpretations of corresponding particular propositions:
C7
PA-
Not every S is not P
¬∃xS(x) ∨ ∃x(S(x) ∧ P(x))
C5
PN-
Not every S is P
¬∃xS(x) ∨ ∃x(S(x) ∧ ¬P(x))
The logical relations between these propositions are displayed in the subsequent “Octagon of Opposition” where horizontal dotted lines indicate contradictory oppositions, or negations, bold arrows stand for (unrestrictedly valid) logical implications, while thin arrows symbolize the traditional inferences of subalternation which hold only for “non-empty” subject terms:
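The relations encoded in the octagon can be checked mechanically over finite models. The following Python sketch (its function names and the restriction to a two-element domain are illustrative choices, not part of Abelard’s text) enumerates all interpretations of S and P and verifies the contradictory pairs, the unrestricted implications, and Abelard’s claim that the strong UA and the PN are merely contrary:

```python
from itertools import product

# All interpretations of the terms S and P over the two-element domain {0, 1}.
DOMAIN = [0, 1]

def models():
    for s_bits, p_bits in product(product([False, True], repeat=2), repeat=2):
        yield ({x for x, b in zip(DOMAIN, s_bits) if b},
               {x for x, b in zip(DOMAIN, p_bits) if b})

# The eight propositions, labelled as in the tables above.
ua  = lambda S, P: not any(x in S and x not in P for x in DOMAIN)  # C8: Not some S is not P
un  = lambda S, P: not any(x in S and x in P for x in DOMAIN)      # C2: Not some S is P
pa  = lambda S, P: any(x in S and x in P for x in DOMAIN)          # C1: Some S is P
pn  = lambda S, P: any(x in S and x not in P for x in DOMAIN)      # C3: Some S is not P
uap = lambda S, P: bool(S) and ua(S, P)                            # C4: UA+ (existential import)
unp = lambda S, P: bool(S) and un(S, P)                            # C6: UN+
pam = lambda S, P: not S or pa(S, P)                               # C7: PA-
pnm = lambda S, P: not S or pn(S, P)                               # C5: PN-

# Contradictory pairs C4/C5, C6/C7, C8/C3, C2/C1 differ in every model.
for f, g in [(uap, pnm), (unp, pam), (ua, pn), (un, pa)]:
    assert all(f(S, P) != g(S, P) for S, P in models())

# Abelard's point: UA+ and PN are merely contrary. They are never both true,
# yet with an empty subject term both come out false.
assert not any(uap(S, P) and pn(S, P) for S, P in models())
assert any(not uap(S, P) and not pn(S, P) for S, P in models())

# Unrestricted implications: UA+ entails UA, and UN+ entails UN.
assert all(ua(S, P) for S, P in models() if uap(S, P))
assert all(un(S, P) for S, P in models() if unp(S, P))

# Subalternation UA => PA holds for non-empty S but fails for empty S.
assert all(pa(S, P) for S, P in models() if ua(S, P) and S)
assert not all(pa(S, P) for S, P in models() if ua(S, P))

print("octagon relations verified")
```

A check over a two-element domain is of course no proof of validity in general, but for these monadic formulas it already exhibits the crucial countermodel, namely the empty subject term.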
4. Abelard’s Quantification of the Predicate
According to the standard historiography of logic (for example, Kneale 1962), the theory of the “quantification of the predicate” was developed only in the 19th century by William Hamilton and by Augustus de Morgan. However, preliminary versions of such a theory may have already been apparent in the 17th-century work of Leibniz (Compare Lenzen 2010), in the 16th-century work of Caramuel (Compare Lenzen 2017), and in the 14th-century work of Buridan (Compare Read 2012). Interestingly, Abelard had already dealt with this issue both in Logica ‘Ingredientibus’ and in Dialectica. He developed the theory in two steps, first, for propositions with a singular subject, second, for categorical forms with a quantified subject.
a. Singular Propositions with Quantified Predicate
On pp. 189-190 of Dialectica, Abelard considers the following propositions:
SQ1
Socrates est omnis homo
SQ2
Socrates non est aliquis homo
SQ3
Socrates est aliquis homo
SQ4
Socrates non est omnis homo
According to Abelard, SQ1 and SQ2 are contrary to each other. Furthermore, SQ3 follows from SQ1 by subalternation, and similarly SQ2 entails SQ4. Hence SQ3 and SQ4 are “opposed” as subcontraries, and one obtains the following Square of Opposition:
SQ1
Socr. is every man
Socr. is no man
SQ2
⇓
⇓
SQ3
Socr. is some man
Socr. is not every man
SQ4
Next, Abelard considers the following propositions:
SQ5
Omnis homo est Socrates
SQ6
Nullus homo est Socrates
SQ7
Aliquis homo est Socrates
SQ8
Non omnis homo est Socrates
According to Abelard, SQ5 is equivalent to SQ1, and SQ6 is equivalent to SQ2. Furthermore, although Abelard himself doesn’t explicitly say this, SQ7 is equivalent to SQ3, and SQ8 is equivalent to SQ4.
Within the framework of first-order logic, SQ5, ‘Every man is Socrates,’ is most naturally interpreted as: ‘For every x: If x is a man, then x is identical with Socrates,’ symbolically ∀x(M(x) ⊃ (x = s)). Similarly, SQ6, ‘No man is Socrates’ can be formalized as ¬∃x(M(x) ∧ (x = s)). By way of subalternation, SQ5 entails ∃x(M(x) ∧ (x = s)), and SQ6, or its equivalent ∀x(M(x) ⊃ (x ≠ s)), similarly entails ∃x(M(x) ∧ (x ≠ s)). All these relations can be represented by another Square of Opposition:
SQ1/SQ5
∀x(M(x) ⊃ (x = s))
∀x(M(x) ⊃ (x ≠ s))
SQ2/SQ6
⇓
⇓
SQ3/SQ7
∃x(M(x) ∧ (x = s))
∃x(M(x) ∧ (x ≠ s))
SQ4/SQ8
This reconstruction largely accords with two squares which Abelard himself presented in Logica Ingredientibus, p. 411 (for more details compare Lenzen (2021), ch. 10).
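The logical relations of this square can likewise be checked by brute force over a small domain. In the following Python sketch (names and domain size are illustrative), the number 0 plays the role of Socrates and M ranges over all possible extensions of ‘man’:

```python
from itertools import combinations

DOMAIN = [0, 1]
s = 0  # the individual named by 'Socrates'

sq5 = lambda M: all(x == s for x in M)   # SQ1/SQ5: forall x (M(x) -> x = s)
sq6 = lambda M: all(x != s for x in M)   # SQ2/SQ6: forall x (M(x) -> x != s)
sq7 = lambda M: any(x == s for x in M)   # SQ3/SQ7: exists x (M(x) & x = s)
sq8 = lambda M: any(x != s for x in M)   # SQ4/SQ8: exists x (M(x) & x != s)

# All subsets of the domain as candidate extensions of 'man'.
subsets = [set(c) for r in range(len(DOMAIN) + 1)
           for c in combinations(DOMAIN, r)]

for M in subsets:
    # Diagonals of the square are contradictories in every model.
    assert sq5(M) != sq8(M)
    assert sq6(M) != sq7(M)
    if M:  # the remaining relations require a non-empty subject term
        assert not (sq5(M) and sq6(M))   # SQ5, SQ6 contrary
        assert not sq5(M) or sq7(M)      # subalternation SQ5 => SQ7
        assert not sq6(M) or sq8(M)      # subalternation SQ6 => SQ8
        assert sq7(M) or sq8(M)          # SQ7, SQ8 subcontrary

print("square verified")
```

Note that with M empty, SQ5 and SQ6 both come out vacuously true, which is why the contrariety, like the subalternations, is restricted here to non-empty subject terms, in keeping with Abelard’s observations about “empty” terms above.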
b. Categorical Forms with Quantified Predicate
In a very condensed passage of Dialectica (p. 190), Abelard sketches how the theory of the quantification of the predicate can be transferred from singular propositions to categorical propositions. He starts with a generalisation of ‘Socrates est omne animal’ and ‘Socrates non est omne animal’:
CQ1 Every man is every animal (Omnis homo est omne animal).
CQ2 No man is every animal (Nullus homo est omne animal).
According to Abelard, these two propositions are doubly contrary (“dupliciter contrarie”) to each other. Next Abelard considers the subcontrary propositions:
CQ3 Some man is some animal (Quidam homo est aliquod animal).
CQ4 Some man is not every animal (Quidam homo non est omne animal).
He maintains that CQ3 follows from CQ1 by subalternation. Similarly, CQ4 follows from CQ2 by subalternation. Furthermore, Abelard maintains that another subalternation exists between:
CQ5 No man is some animal (Nullus homo est aliquod animal).
CQ6 Some man is every animal (Quidam homo est omne animal).
These propositions can be formalized as follows:
CQ1
Every man is every animal
∀x(Mx ⊃ ∀y(Ay ⊃ (x = y)))
CQ2
No man is every animal
¬∃x(Mx ∧ ∀y(Ay ⊃ (x = y)))
CQ3
Some man is some animal
∃x(Mx ∧ ∃y(Ay ∧ (x = y)))
CQ4
Some man is not every animal
∃x(Mx ∧ ¬∀y(Ay ⊃ (x = y)))
CQ5
No man is some animal
¬∃x(Mx ∧ ∃y(Ay ∧ (x = y)))
CQ6
Some man is every animal
∃x(Mx ∧ ∀y(Ay ⊃ (x = y)))
It is easy to see that Abelard’s theses concerning the contradictory opposition between CQ3 and CQ5 and between CQ4 and CQ1 are correct. Also, CQ1 and CQ2 are contrary to each other. Furthermore, as Abelard explains, CQ3 logically follows from CQ1 by way of subalternation. However, he failed to see that CQ3 follows from CQ1 so to speak by a double subalternation: CQ1 first entails:
CQ7
Every man is some animal
∀x(Mx ⊃ ∃y(Ay ∧ (x = y)))
And CQ7 in turn entails CQ3. Altogether the logical relations between the affirmative propositions CQ1, CQ3, CQ6, and CQ7 can be displayed as follows:
The logical relations of this diagram are reversed when one considers the negations of the four propositions. We know already that the negation of CQ1 is CQ4, that of CQ6 is CQ2, and that of CQ3 is CQ5. So, we only have to add the negation of CQ7 (‘Every man is some animal’) which amounts to:
CQ8
Some man is not some animal
∃x(Mx ∧ ¬∃y(Ay ∧ (x = y)))
Hence one obtains the following diagram for negative categorical propositions with quantified predicate:
On p. 411 of Logica Ingredientibus, Abelard presents two squares of opposition, one entitled “exstinctiva”, the other “separativa.” After correcting a minor mistake, these squares accord with our diagrams, and both squares can easily be combined into the following octagon:
Here again dotted lines indicate a contradictory opposition while the arrows symbolize logical implications.
Most likely Abelard understood propositions CQ3 (‘Some S is some P’) and CQ5 (‘No S is some P’) as alternative formulations of the ordinary PA and UN. Similarly, propositions CQ7 (‘Every S is some P’) and CQ8 (‘Some S is not some P’), which were “overlooked” by Abelard, may be interpreted as alternative formulations of the ordinary UA and PN. Therefore, the above octagon contains as a substructure the usual square of opposition:
The thin arrows again signalize that these inferences of subalternation only hold for non-empty terms.
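The double subalternation CQ1 ⇒ CQ7 ⇒ CQ3 noted above can also be confirmed over finite models. The following Python sketch (illustrative names, two-element domain) takes M and A as the extensions of ‘man’ and ‘animal’:

```python
from itertools import product

DOMAIN = [0, 1]

def models():
    # All pairs of extensions for 'man' (M) and 'animal' (A).
    for m_bits, a_bits in product(product([False, True], repeat=2), repeat=2):
        yield ({x for x, b in zip(DOMAIN, m_bits) if b},
               {x for x, b in zip(DOMAIN, a_bits) if b})

cq1 = lambda M, A: all(all(x == y for y in A) for x in M)  # Every man is every animal
cq7 = lambda M, A: all(any(x == y for y in A) for x in M)  # Every man is some animal
cq3 = lambda M, A: any(any(x == y for y in A) for x in M)  # Some man is some animal

# With non-empty terms the chain of subalternations CQ1 => CQ7 => CQ3 holds...
for M, A in models():
    if M and A:
        assert not cq1(M, A) or cq7(M, A)
        assert not cq7(M, A) or cq3(M, A)

# ...but with empty terms it fails: CQ1 is then vacuously true, CQ3 false.
assert cq1(set(), set()) and not cq3(set(), set())

print("double subalternation verified")
```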
5. Inferences and Implications
Like many other medieval logicians, Abelard fails to make a systematic distinction between inferences and implications. He refers to both equally as “inferentia”, “consequentia”, or “consecutio.” If the inference is a genuine syllogism, consisting of two categorical propositions as premisses and a further categorical proposition as conclusion, Abelard typically separates them by means of “ergo”, for instance:
Omnis homo est animal
Omne animal est animatum […]
Ergo omnis homo est animatus. (Dialectica, p. 254)
However, he has no qualms about expressing this inference equivalently by the conditional:
Si omnis homo est animal et omne animal est animatum, omnis homo est animatus (ibid.).
Also, he has no qualms about referring to the premise(s) of an inference as “antecedens,” to the conclusion as “consequens”, and to the entire inference as “argumentum.”
a. Perfect vs. Imperfect Inferences
Abelard defines an inference as perfect:
[…] when it holds in virtue of the logical form (complexio) of the propositions involved. By this […] he means that the entailment holds under any uniform substitution in its terms […]. The traditional moods of the categorical syllogism […] are all instances of complete entailments, or as we should say, valid inference. (King & Arlig 2018)
Somewhat surprisingly, Abelard was not willing to grant the attribute ‘perfect’ also to a tautology such as ‘si est animatum, est animatum’ (for a closer discussion compare Lenzen (2021), ch. 11).
Typical examples of imperfect inferences are enthymemes, that is, inferences which are not formally valid as they stand but which can be turned into formally valid ones by the addition of a further premise. Thus, Abelard mentions the example ‘si omnis homo est animal, omnis homo est animatus,’ which may be transformed into a perfect syllogism by the addition of ‘omne animal est animatum’. The latter proposition, which Abelard also paraphrases without quantifier as ‘si est animal, est animatum’, is necessarily true because of the “nature of things.” Nowadays we would call such propositions analytically true.
b. Strict vs. Relevant Implication
As was already mentioned above, some modern commentators express their amazement that, allegedly, “Abelard denied the deduction theorem” (Guilfoy 2008) or that he at least denied “that a correlate of the Deduction Theorem holds” (King & Arlig 2017). This point was first raised by Martin, who pointed out:
The deduction theorem has often been regarded as central in logic, and it has been felt that one hardly has a logic for entailment if validity of argument and so derivability are not connected in an appropriate way to the truth of a conditional. There is some connection for Abelard since, if a conditional is true, it satisfies condition C, and so the corresponding argument will certainly be valid in the sense of satisfying condition I. In general, however, entailment [as] expressed in true conditionals is not the converse of derivability or logical consequence as expressed in valid arguments. (Martin 1986, p. 569)
In a later essay, he similarly maintained that:
[…] one cannot conditionalize a valid argument to obtain a true conditional and so the Deduction Theorem does not hold for Abelard’s logic, a feature which shocked his student John of Salisbury. (Martin 2006, p. 182)
Actually, John of Salisbury expressed his “shock” as follows:
I am amazed that the Peripatetic of Pallet so narrowly laid down the law for hypotheticals that he judged only those to be accepted the consequent of which is included in the antecedent […] indeed while he freely accepted argumenta, he rejected hypotheticals unless forced by the most manifest necessity. (Translation from Martin 2006, p. 196)
Now, ever since Aristotle, an inference has been regarded as logically valid if and only if it is impossible that the premise(s) are all true and yet the conclusion false. A large majority of medieval logicians similarly considered a conditional A → C as true if and only if it cannot be the case that the antecedent A is true while the consequent C is false. This is the common definition of a strict implication (in distinction from a merely material implication) as it was re-invented in the 20th century by C. I. Lewis. Abelard thought it necessary to distinguish further between two kinds of the strictness or necessity of the consequence relation:
There seem to be two necessities of consequences, one in a larger sense, if namely that what is maintained in the antecedent cannot be the case without that what is maintained in the consequent; the other in a narrower sense, if namely not only the antecedent cannot be true without the consequent, but if also the antecedent by itself requires the consequent. (Cf. Dialectica, p. 283-4)
Abelard clearly saw that the former definition gives rise to what are nowadays called the “paradoxes” of strict implication, in particular the principle which later medieval logicians came to call “Ex impossibili quodlibet”:
EIQ If A is impossible, then the inference A ⇒ B (or the implication A → B) is valid (or true) for every proposition B.
Thus, Abelard considered the proposition ‘If Socrates is a stone, he is an ass’ which according to the first, liberal criterion counts as true because “it is impossible that Socrates should be a stone, and so impossible that he should be a stone without being an ass” (Kneale 1962, p. 217). For reasons discussed in section 6, Abelard did not want to accept this inference (or conditional) as sound. Therefore he suggested the stronger condition which Martin (2006: 181) explained as follows:
The antecedent is required to be relevant to the consequent in that its truth is genuinely sufficient for that of the consequent and this is guaranteed by the consequent being in some way contained in the antecedent.
As a standard example for a correct “relevant” implication Abelard mentions:
(i) If he is a man, he is an animal (“Si est homo, est animal”).
Here, the antecedent requires the consequent by itself (“ex se ipso”) since the notion of man contains the notion of animal. In contrast,
(ii) If he is a man, he is not a stone (“Si est homo, non est lapis”)
is not accepted by Abelard as a correct relevant implication, although, of course, it satisfies the weaker criterion of a strict implication. Abelard argues (Dialectica, p. 284) that the truth of (ii) rests only on our experience, which shows that the properties ‘man’ and ‘stone’ are disparate, that is to say, they do not simultaneously subsist in one and the same thing. Yet, as the Kneales explained, “the sense of the consequent […] is not contained in the sense of the antecedent” (Kneale 1962: p. 218).
In the wake of Abelard, many attempts have been made to elaborate the idea of a “relevant” implication and to develop a full-fledged logic of “containment.” To this day, no real agreement has been reached. Abelard contributed to this enterprise mainly by suggesting that, generalizing example (i), a “relevant” implication obtains whenever the antecedent refers to a certain species while the consequent refers to the corresponding genus. The correctness of such conditionals does not depend on whether the antecedent is true or false. Even impossible antecedents can support correct conditionals, for example:
If Socrates is a pearl, Socrates is a stone (cf. Logica ‘Ingredientibus’, 329)
If Socrates is an ass, Socrates is an animal (cf. Dialectica, 346).
6. Abelard’s Defence of the Principles of Connexive Logic
In the 1960s, Storrs McCall introduced the idea of a connexive implication which can be characterized by the requirement that the operator ‘→’ satisfies “Aristotle’s Thesis” and “Boethius’ Thesis.” The crucial passage from Prior Analytics 57b3-14 was interpreted by McCall as follows:
What Aristotle is trying to show here is that two implications of the form ‘If p then q’ and ‘If not-p then q’ cannot both be true. The first yields, by contraposition, ‘If not-q then not-p’, and this together with the second gives ‘If not-q then q’ by transitivity. But this, Aristotle says, is impossible: a proposition cannot be implied by its own negation. […] We shall henceforth refer to the principle that no proposition can be implied by its own negation, in symbols ‘~(~p → p)’, as Aristotle’s [first] thesis […] The [other] connexive principle ~[(p → q) & (~p → q)] will be referred to as Aristotle’s second thesis. (McCall 2012, p. 415)
If one replaces McCall’s symbols ‘~’ and ‘&’ for negation and conjunction by our symbols ‘¬’ and ‘∧’, one obtains:
Arist 1 ¬(¬p → p)
Arist 2 ¬((p → q) ∧ (¬p → q)).
The second principle can be paraphrased by saying that no proposition is implied by both of two contradictory propositions. Abelard similarly maintained that “one and the same consequent cannot follow from the affirmation and from the negation of the same proposition” (cf. Dialectica, p. 290). Like Aristotle, Abelard also argued that if Arist 2 did not hold, then Arist 1 would not hold either, which would be absurd since “the truth of one of two contradictory propositions not only does not require the truth of the other, but instead it expels and extinguishes it” (ibid.).
Moreover, Abelard pointed out that Aristotle’s Thesis (“regula aristotelica”) not only holds in the version where it is denied “that one and the same follows from the affirmation and from the negation of the same”, but also in the variant that “the affirmation and the negation of the same cannot be implied by one and the same proposition”, that is:
Abel 2 ¬((p → q) ∧ (p → ¬q)).
For example, the propositions ‘If x is a man, x is an animal’ and ‘If x is a man, x is not an animal’ cannot both be true, because otherwise one might derive the “inconveniency” ‘If x is a man, x is not a man.’ The corresponding generalization:
Abel 1 ¬(p → ¬p),
however, in Abelard’s opinion is “impossible.”
Principle Abel 2 is usually referred to as ‘Boethius’ Thesis’. Thus, McCall picked up a passage from De Syllogismo hypothetico where Boethius (ca. 480-524) maintained: “Si est A, cum sit B, est C; […] atqui cum sit B, non est C; non est igitur A.” McCall then “transliterated” this as the inference:
If p, then if q then r,
If q then not-r
Therefore, not-p.
The reasoning that led Boethius to assert the validity of this schema was presumably this. Since the two implications ‘If q then r’ and ‘If q then not-r’ are incompatible, the second premise contradicts the consequent of the first premise. Hence, by modus tollens, we get the negation of the antecedent of the first premise, namely ‘not-p’. […] The corresponding conditional, If q → r then ~(q → ~r) will be denoted Boethius’ thesis, and serves with the thesis ~(p → ~p) as the distinguishing mark of connexive logic (McCall 2012, p. 416).
As was argued in Lenzen (2020), Boethius’ term-logical principle primarily expresses the idea that if a UA of type ‘If x is A, then x is B’ is true, then the UN ‘If x is A, then x is not-B’ can’t be true as well, which is to say, the two universal propositions are contrary to each other. Yet it is probably correct to assume that Boethius would also have endorsed the propositional principle called ‘Boethius’ Thesis’, that is, Abel 2. On the other hand, Boethius nowhere put forward a term-logical counterpart of Abel 1. Therefore, it seems preferable to refer to these principles as Abelard’s Theses.
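A quick truth-table computation makes plain why these theses force the conditional to be non-classical: read as material implication, the connexive theses fail. The following Python sketch (a deliberately naive truth-functional reading, not Abelard’s own conditional) exhibits the counterexamples:

```python
# Material implication: the 'weak' truth-functional reading that Abelard
# rejects as a model of relevant consequence.
impl = lambda a, b: (not a) or b
BOOLS = [False, True]

# Abel 1, not-(p -> not-p), is no material tautology: it fails when p is
# false, since a false antecedent makes p -> not-p vacuously true.
assert not all(not impl(p, not p) for p in BOOLS)

# The same holds for Arist 1, not-(not-p -> p): it fails when p is true.
assert not all(not impl(not p, p) for p in BOOLS)

# And for Abel 2, not-((p -> q) & (p -> not-q)): when p is false, both
# conjuncts p -> q and p -> not-q are vacuously true.
assert not all(not (impl(p, q) and impl(p, not q))
               for p in BOOLS for q in BOOLS)

print("connexive theses fail under material implication")
```

This is exactly the situation of the “paradoxes” discussed above: an impossible (or simply false) antecedent validates too much, which is what motivates the stronger, containment-based conditional.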
a. The First “Embarrassing Argument”
Logicians from the 12th-century school of the “Montanae” developed an argument to show that the connexive principles do not hold without restriction. Martin (1986: 569-70) reconstructed their “embarrassing argument” as follows:
1. If Socrates is a man and a stone, Socrates is a man.
2. If Socrates is a man, Socrates is not a stone.
So 3. If Socrates is a man and a stone, Socrates is not a stone.
But 4. If Socrates is not a stone, Socrates is not a man and a stone.
So 5. If Socrates is a man and a stone, Socrates is not a man and a stone.
Conclusion (5) has the logical structure (p ∧ q) → ¬(p ∧ q); hence it contradicts Abel 1. Abelard, however, was not much worried by this counterexample because he considered step (2) to be invalid. This step consists of an application of the traditional “Locus ab oppositis.” In the invaluable collection Logica Modernorum edited by Lambert M. De Rijk, the rule is formulated as “si aliquid oppositorum predicatur de aliquo, aliud abnegatur ab illo” (De Rijk 1967, II/2, p. 62). This means:
Opp 3 If one of two opposite predicates is affirmed of a certain subject, then the other predicate must be denied (of the same subject).
The notion of opposite predicates here has to be understood as not only applying to contradictory concepts like ‘man’ and ‘not-man,’ but also to contrary concepts like ‘man’ and ‘horse’. Somewhat more exactly:
Opp 4 Two predicates (or concepts) P1, P2 are opposite to each other if and only if there can’t exist any x such that P1(x) and P2(x).
In particular, ‘man’ and ‘stone’ are opposite concepts: for any x, it is impossible that x is a man, M(x), and x is a stone, S(x). Hence, according to the definition given above, M(x) strictly implies ¬S(x). Yet, for Abelard, this is not a relevant or “natural” implication because:
Not being a stone does not follow in the appropriate way from being a man, even though it is inseparable from being a man. It does not follow in the appropriate way since it is no part of the nature of a man that he not be a stone. (Martin 1987, p. 392)
The plausibility of this view need not be discussed here because it soon turned out that there are other counter-examples to the connexive principles Abel 1, 2 which do not rely on the “locus ab oppositis.”
b. Alberic’s Argument
As was first reported in Martin (1986), Alberic of Paris put forward the subsequent “embarrassing argument”:
1. If Socrates is a man and is not an animal, Socrates is not an animal.
2. If Socrates is not an animal, Socrates is not a man.
3. If Socrates is not a man it is not the case that Socrates is a man and an animal.
C*. If Socrates is a man and not an animal, it is not the case that Socrates is a man and not an animal. (Martin 1987, pp. 394-5)
Since conclusion C* has the structure (p ∧ ¬q) → ¬(p ∧ ¬q), it constitutes another counterexample to Abel 1. Furthermore, the argument does not depend on the “locus ab oppositis” because Line 2 is obtained by applying the principle of contraposition to the unproblematic conditional ‘If Socrates is a man, Socrates is an animal.’ Since the proof makes use only of logical laws which Abelard regarded as indispensable, “[…] confronted with this argument Master Peter essentially threw up his hands and granted its necessity” (Martin 1987, p. 395).
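The sting of Alberic’s argument is that each step is unimpeachable on the weak, truth-functional reading. A Python sketch (with p for ‘Socrates is a man’, q for ‘Socrates is an animal’, and the background truth p → q; this encoding is an illustrative assumption) confirms the steps:

```python
impl = lambda a, b: (not a) or b  # material implication
BOOLS = [False, True]

for p in BOOLS:
    for q in BOOLS:
        if not impl(p, q):  # keep only valuations respecting 'man -> animal'
            continue
        assert impl(p and not q, not q)        # line 1
        assert impl(not q, not p)              # line 2: contraposition of p -> q
        assert impl(not p, not (p and not q))  # line 3, in the form needed for C*
        # C*: (p & ~q) -> ~(p & ~q) holds, because the antecedent is
        # unsatisfiable once the background truth p -> q is fixed.
        assert impl(p and not q, not (p and not q))

print("Alberic's steps hold on the truth-functional reading")
```

So C* comes out (vacuously) true on the strict reading precisely because its antecedent is impossible, which is the very feature Abelard’s relevance condition was meant to rule out.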
In the decades after Abelard, logicians from the schools of the Nominales, the Melidunenses and the Porretani tried to cope with the problems created by Alberic’s argument. As was argued in Lenzen (2023), however, their sophisticated arguments turned out to be inconclusive. As several brilliant logicians from the 13th to 15th century recognized, Aristotle’s and Abelard’s connexive theses have to be restricted to self-consistent antecedents and/or to non-necessary consequents. For example, Robert Kilwardby (1222-1277), who in his Notule libri Priorum desperately tried to defend Aristotle’s theses against counter-examples, eventually came to admit: “So it should be granted that from the impossible its opposite follows, and that the necessary follows from its opposite” (Thom & Scott 2015, p. 1145). Furthermore, in On the Purity of the Art of Logic, Walter Burley (ca. 1275-1345) proved that “every conditional is true in which an antecedent that includes opposites implies its contradictory. For example, it follows: ‘You know you are a stone; therefore, you do not know you are a stone’” (Spade 2000, p. 156). Burley concluded that Aristotle’s thesis is only restrictedly valid: “I say that the same consequent does not follow from the same antecedent affirmed and denied, unless the opposite of that consequent includes contradictories. And this is how Aristotle’s statement has to be understood” (Spade 2000, p. 160). In a similar way, John Buridan (ca. 1300-1360) rather incidentally noted that a “possible [!] proposition never entails its own contradictory” (Hughes 1982, p. 38). The editor of Buridan’s Sophismata remarked: “Note that the principle appealed to is not that no proposition whatsoever can entail its own contradictory, but only that no possible proposition can do so. This is a standard principle of modal logic” (Hughes 1982, p. 86).
In view of these interesting and important discoveries, the history of connexive logic, as it was sketched in McCall (2012), needs to be fundamentally corrected. This has been achieved in Lenzen (2022).
7. References and Further Reading
a. Editions of Abelard’s Logical Works
Victor Cousin (Ed.), Ouvrages inédits d’Abélard, Paris (Imprimerie Royale) 1836.
Mario Dal Pra (Ed.), Pietro Abelardo Scritti Filosofici: Editio super Porphyrium; Glossae in Categorias; Super Aristotelem De Interpretatione; De divisionibus; Super Topica Glossae, Roma (Bocca) 1954.
Lambert Marie De Rijk (Ed.), Petrus Abaelardus Dialectica – First Complete Edition of the Parisian Manuscript, Assen (van Gorcum) 1959. (The manuscript itself can be downloaded from the Bibliothèque Nationale de France under: https://gallica.bnf.fr/ark:/12148/btv1b6000788f?rk=321890;0)
Bernhard Geyer (Ed.), Peter Abaelards Philosophische Schriften. In: Beiträge zur Geschichte der Philosophie und Theologie des Mittelalters, vol. 21, issues 1-4, Münster (Aschendorf) 1919-1933.
b. Secondary Literature
Maria Teresa Beonio-Brocchieri Fumagalli (1964): La Logica di Abelardo, Firenze (Pubblicazione della Università degli Studi di Milano); engl. translation: The Logic of Abelard, Dordrecht (Reidel) 1969.
Irene Binini (2022): Possibility and Necessity in the Time of Peter Abelard, Leiden (Brill).
Lambert Marie De Rijk (Ed.) (1967), Logica Modernorum – A Contribution to the History of Early Terminist Logic. Assen (Van Gorcum).
Kevin Guilfoy (2008): “Peter Abelard”, in J. Fieser & B. Dowden (Ed.), Internet Encyclopedia of Philosophy: https://iep.utm.edu/abelard/
George E. Hughes (1982): John Buridan on Self-Reference, Cambridge (Cambridge University Press).
Peter King & Andrew Arlig (2018): “Peter Abelard”, in E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy, Fall 2018 edition: https://plato.stanford.edu/entries/abelard/
William and Martha Kneale (1962): The Development of Logic, Oxford (Clarendon).
Wolfgang Lenzen (2010): “The Quantification of the Predicate – Leibniz, Ploucquet and the (Double) Square of Opposition.” In J. A. Nicolás (Ed.), Leibniz und die Entstehung der Modernität, Stuttgart (Steiner), 179-191.
Wolfgang Lenzen (2017): “Caramuel’s Theory of Opposition.” In South American Journal of Logic 3, 361-384.
Wolfgang Lenzen (2020): “A Critical Examination of the Historical Origins of Connexive Logic.” In History and Philosophy of Logic 41, 16-35.
Wolfgang Lenzen (2021): Abaelards Logik, Paderborn (mentis/Brill).
Wolfgang Lenzen (2022): “Rewriting the History of Connexive Logic.” In Journal of Philosophical Logic 51, 523-553.
Wolfgang Lenzen (2023): “Abelard and the Development of Connexive Logic.” In: M. Blicha & I. Sedlár (eds.), Logica 2022 Yearbook, London (College Publications), 55-78.
Christopher Martin (1986): “William’s Machine.” In The Journal of Philosophy 83, 564-572.
Christopher Martin (1987): “Embarrassing Arguments and Surprising Conclusions in the Development of Theories of the Conditional in the Twelfth Century.” In J. Jolivet & A. de Libera (Ed.), Gilbert de Poitiers et ses contemporains: aux origines de la Logica Modernorum, Napoli (Bibliopolis), 377-400.
Christopher Martin (2006): “Logic.” In J. Brower & K. Guilfoy (Eds.), The Cambridge Companion to Abelard, Cambridge (Cambridge University Press), 158-199.
Storrs McCall (2012): “A History of Connexivity.” In D. M. Gabbay, F. J. Pelletier & J. Woods (Ed.), Handbook of the History of Logic, Vol. 11, Logic: A History of its Central Concepts, Elsevier, 415-449.
Jacques Paul Migne (Ed.) (1860): Anicii Manlii Severini Boethii Opera omnia, Paris.
Stephan Read (2012): “John Buridan’s Theory of Consequence and His Octagons of Opposition.” In J.-Y. Béziau & D. Jacquette (Ed.), Around and Beyond the Square of Opposition, Basel (Birkhäuser), 93-110.
Paul V. Spade (Ed.) (2000): Walter Burley – On the Purity of the Art of Logic – The Shorter and the Longer Treatise. New Haven & London (Yale University Press).
Paul Thom & John Scott (Eds.) (2015): Robert Kilwardby, Notule libri Priorum. Oxford (Oxford University Press).
Author Information
Wolfgang Lenzen
Email: lenzen@uos.de
University of Osnabrück
Germany
Knowledge-First Theories of Justification
Knowledge-first theories of justification are theories of justification that give knowledge priority when it comes to explaining when and why someone has justification for an attitude or an action. The emphasis of this article is on knowledge-first theories of justification for belief. As it turns out, there are a number of ways of giving knowledge priority when theorizing about justification, and what follows is a survey of more than a dozen existing options that have emerged since the publication in 2000 of Timothy Williamson’s Knowledge and Its Limits.
The present article first traces several of the general theoretical motivations that have been offered for putting knowledge first in the theory of justification. It then provides an examination of existing knowledge-first theories of justification and their objections. There are doubtless more ways to give knowledge priority in the theory of justified belief than are covered here, but the survey is instructive because it highlights potential shortcomings that would-be knowledge-first theorists may wish to avoid.
The history of the Gettier problem in epistemology is a long history of failed attempts to give a reductive account of knowledge in terms of justification and other conditions. In light of this, many have since judged the project of providing a reductive analysis of knowledge to be a degenerating research program. In putting knowledge first in the theory of justification, epistemologists are exploring whether we can more successfully reverse the order of explanation in epistemology by giving an account of justified belief in terms of knowledge. This article concludes with a reflection about the extent to which the short history of the many controversial attempts to secure an unproblematic knowledge-first account of justified belief has begun to resemble the older Gettier dialectic.
Knowledge-first theories of justified belief give knowledge priority when it comes to explaining when and why someone has a justified belief. As it turns out, there are a number of ways of giving knowledge priority when theorizing about justified belief, and what follows is a survey of several existing options.
Before examining specific knowledge-first theories of justification it is worth considering what might motivate such an approach to begin with. One kind of motivation involves the need for an extensionally adequate theory of justified belief. After all, there is some set of possible cases where agents have justified beliefs, and a knowledge-first theory of justified belief should pick out that set and offer us a knowledge-centric explanation for why that set has exactly the members that it has. Traditional epistemologists should note that progress has been made in this direction, and this provides at least some reason to think that some knowledge-centric account of justification is correct. But there is more to be observed when it comes to motivating knowledge-first accounts of justified belief.
Consider, first, conceptual relations between knowledge and justification. Sutton (2005; 2007) has argued that grasping the concept of epistemic justification depends on our prior understanding of knowledge:
We only understand what it is to be justified in the appropriate sense because we understand what it is to know, and can extend the notion of justification to non-knowledge only because they are would-be knowers. We grasp the circumstances—ordinary rather than extraordinary—in which the justified would know. Justification in the relevant sense is perhaps a disjunctive concept—it is knowledge or would-be knowledge (Sutton 2005: 361).
If our concept of epistemic justification depends on our concept of knowledge, then that surely provides at least some reason to think that knowledge might be more basic a kind than justified belief. At the very least it provides us with reason to explore that possibility.
Second, consider some plausible claims about the normativity of belief. As Williamson (2014: 5) reasons: “If justification is the fundamental epistemic norm of belief, and a belief ought to constitute knowledge, then justification should be understood in terms of knowledge too.” Here Williamson is connecting norms for good instances of a kind and norms for bringing about instances of that kind. So if one is justified in holding a belief only if it is a good belief, and a good belief is one that constitutes knowledge, then it seems to follow that a justified belief has to be understood in terms of knowledge (Kelp, et al. 2016; Simion 2019).
A third reason for putting knowledge first in the theory of justification stems from Williamson’s (2000) defense of the unanalyzability of knowledge together with the E=K thesis, which says that the evidence you possess is just what you know. Assuming we should understand justification in terms of having sufficient evidence, it seems to follow that we should understand justification in terms of knowledge. (For critical discussion of E=K see Silins (2005), Pritchard and Greenough (2009), Neta (2017), and Fratantonio (2019).)
A fourth reason stems from the way in which asymmetries of knowledge can explain certain asymmetries of justification. While much of the knowledge-first literature on lottery beliefs has focused on assertion (see the article Knowledge Norms), the points are easily extended to justified belief. One cannot have justification to believe that (L) one has a losing lottery ticket just on the basis of one’s statistical evidence. But one can have justification to believe (L) on the basis of a newspaper report. What can explain this asymmetry? Knowledge. For one cannot know (L) on the basis of merely statistical evidence, but one can know (L) on the basis of a newspaper report. Accordingly, knowledge can play a role in explaining the justificatory asymmetry involving (L) (Hawthorne 2004; Smithies 2012). A similar asymmetry and knowledge-first explanation can be drawn from the literature on pragmatic encroachment (Smithies 2012; DeRose 1996). See also Dutant and Littlejohn (2020) for further justificatory asymmetries that certain knowledge-first approaches to justified belief can explain.
Fifth, putting knowledge in the explanatory forefront can explain (broadly) Moorean absurdities. Consider, for instance, the absurdity involved in believing p while also believing that one does not know p. Some explanation for the irrationality of this combination of beliefs should fall out of a theory of justification that tells us when and why a belief is (or is not) justified. Theories of justification that explain justification in terms of knowledge have an easy time explaining this (Williamson 2000; 2009; 2014).
Lastly, putting knowledge in the explanatory forefront of justification can provide an explanation of the tight connection between justification and knowledge. For it is widely believed that knowing p or being in a position to know p entails that one has justification for believing p. The traditional explanation of this entailment relation involves the idea that knowledge is to be analyzed in terms of, and hence entails, justification. But another way of explaining this entailment is by saying that knowledge or being in a position to know is constitutively required for justification (Sylvan 2018).
2. The Token-Identity Theory
Perhaps the first knowledge-first theory of justified belief is the token-identity theory, according to which token instances of justified belief just are token instances of knowledge, which yields the following biconditional (Williamson 2009, 2014; Sutton 2005, 2007; Littlejohn 2017: 41-42):
(J=K) S’s belief that p is justified iff S knows that p.
The term ‘iff’ abbreviates “if and only if.” This is a theory of a justified state of believing (doxastic justification), not a theory of having justification to believe, whether or not one does in fact believe (propositional justification). But it is not hard to see how a (J=K) theorist might accommodate propositional justification (Silva 2018: 2926):
(PJ=PK) S has justification to believe p iff S is in a position to know p.
What does it take to be in a position to know p? One type of characterization takes being in a position to know as being in a position where all the non-doxastic demands on knowing are met (Smithies 2012; Neta 2017; Rosenkranz 2018; Lord 2018). The doxastic demands involve believing p in the right kind of way, that is, the kind of way required for knowing. The non-doxastic demands involve the truth of p and one’s standing in a suitably non-accidental relation to p such that, typically, were one to believe p in the right kind of way, one would know that p. (For further characterizations of being in a position to know see Williamson 2000: 95; Rosenkranz 2007: 70-71.)
One issue raised by characterizing being in a position to know in counterfactual terms concerns what we might call doxastic masks: features of one’s situation that are triggered by one’s act of coming to believe p at a time t+1 that would preclude one from knowing p despite all the non-doxastic requirements of knowledge being met at an earlier time t. For example, you might have all the evidence it could take for anyone to know p, but suppose Lewis’ (1997) sorcerer does not want you to know p. So, in all or most nearby worlds, when the sorcerer sees you beginning to form the belief in p, he dishes out some kind of defeater that prevents you from knowing p. So, on standard possible worlds analyses of counterfactuals, it is false that you have some way of coming to believe p such that were you to use it, you would know p (compare Whitcomb 2014). Alternatively, one might seek to characterize being in a position to know in terms of having the disposition to know, which is compatible with the existence of doxastic masks. Another alternative is to give up on the idea that being in a position to know is best understood in terms of worlds and situations nearby or close to one’s actual situation, thereby making the target characterization of being in a position to know a more idealized notion, one that is discussed below (compare Smithies 2012: 268, 2019: sect 10.4; Rosenkranz 2018; Chalmers 2012).
There are various problems with (J=K) and, by extension, (PJ=PK). First, (J=K) is incompatible with the fallibility of justification, that is, the possibility of having justified false beliefs. But any theory of justification that rules out justified false beliefs is widely seen to be implausible (Bird 2007; Comesaña and Kantin 2010; Madison 2010; Whitcomb 2014; Ichikawa 2014).
Second, (J=K) is incompatible with the possibility of having a justified true belief in the absence of knowledge. Gettier cases are typically cases of justified true belief that do not constitute knowledge. But (J=K) implies that there are no such cases because it implies that there can be no cases of justification without knowledge. This bucks against a history of strong intuitions to the contrary (Bird 2007; Comesaña and Kantin 2010; Madison 2010; Whitcomb 2014; Ichikawa 2014).
Third, (J=K) is incompatible with the new evil demon hypothesis. Consider someone who, unwittingly, has had their brain removed, placed in a vat, and is now being stimulated in such a way that the person’s life seems to go on as normal. According to the new evil demon hypothesis: if in normal circumstances S holds a justified belief that p, then S’s recently envatted brain-duplicate also holds a justified belief that p. It is beyond the scope of this article to defend the new evil demon hypothesis. But as Neta and Pritchard (2007) point out, it is a widely shared intuition in 21st century epistemology. This generates problems for (J=K). For one cannot know that one is looking at a hand (or that a hand is in the room) if one is a recently envatted brain who merely seems to be looking at a hand; so, according to (J=K), one cannot be justified in believing it either (Bird 2007; Ichikawa 2014). For further discussion see the article on The New Evil Demon Hypothesis. See also Meylan (2017).
3. Modal Theories
To avoid the problems with (J=K), some have sought to connect justified belief and knowledge in a less direct way, invoking some modal relation or other.
Here is Alexander Bird’s (2007) knowledge-first account of justified judgment, which can be transformed into a theory of justified belief (a belief being, arguably, the end state of a justified act of judging):
(JuJu) If in world w1 S has mental states M and then forms a judgment [or belief], that judgment [or belief] is justified iff there is some world w2 where, with the same mental states M, S forms a corresponding judgment and that judgment [or belief] yields knowledge.
(JuJu) counts as a knowledge-first theory because it explains one’s justified judgment/belief in terms of the knowledge of one’s mental state duplicates. It does a good deal better than (J=K) when it comes to accounting for intuitive characteristics of justified belief: namely, its fallibility, its compatibility with Gettier cases, and its compatibility with the new evil demon hypothesis.
Despite this, various problems have been pointed out concerning (JuJu). First, it seems that we can obtain justified false beliefs from inferences involving justified false-premise beliefs. For example, suppose S knew that:
(a) Hesperus is Venus.
But, due to some misleading evidence, S had the justified false belief that:
(b) Hesperus is not Phosphorus.
Putting these two together S could infer that:
(c) Phosphorus is not Venus.
As Ichikawa (2014: 191-192) argues, S could justifiably believe (c) on this inferential basis. But, according to (JuJu), S can justifiably believe (c) on the basis of an inference from (a) and (b) only if it is possible for a mental state duplicate of S’s to know (c) on this basis. But content externalism precludes such a possibility. For content externalism implies that any mental state duplicate of S’s who believes (c) on the basis of (a) and (b) is a thinker for whom the terms ‘Phosphorus’ and ‘Venus’ refer to the very same celestial body, thus making knowledge of (c) on the basis of (a) and (b) impossible. Because of this, (JuJu) implies that you cannot have justification to believe (c) on this inferential basis, contrary to what seems to be the case. This is not just a problem for (JuJu), but also a problem for (J=K).
Second, (JuJu) fails to survive the Williamsonian counterexamples to internalism. Williamson’s counterexamples, as McGlynn (2014: 44ff) observes, were not intended to undermine (JuJu) but they do so anyway. Here is one example:
Suppose that it looks and sounds to you as though you see and hear a barking dog; you believe that a dog is barking on the basis of the argument ‘That dog is barking; therefore, a dog is barking’. Unfortunately, you are the victim of an illusion, your demonstrative fails to refer, your premise sentence thereby fails to express a proposition, and your lack of a corresponding singular belief is a feature of your mental state, according to the content externalist. If you rationally believe that a dog is barking, then by [JuJu] someone could be in exactly the same mental state as you actually are and know that a dog is barking. But that person, too, would lack a singular belief to serve as the premise of the inference, and would therefore not know that a dog is barking. (Williamson 2000: 57-58).
McGlynn (2014: 44) draws attention to the fact that a “natural verdict is that one’s belief that a dog is barking is rational or justified” despite the fact that one cannot know this while having the same mental states. For any (non-factive) mental state duplicate will be one for whom the sentence ‘That dog is barking’ cannot be true, and hence cannot be known either. So we have another counterexample to (JuJu). Again, this is not just a problem for (JuJu), but also (J=K).
Since (JuJu)’s problems stem from its insistence on sameness of mental states, a natural response is to abandon that emphasis and focus on what a thinker and, say, her duplicate on Twin Earth can have in common. This is just what Ichikawa (2014: 189) attempts to do:
(JPK) S has a justified belief iff S has a possible counterpart, alike to S in all relevant intrinsic respects, whose corresponding belief is knowledge.
The target intrinsic respects are limited to the non-intentional properties that S and her Twin Earth duplicate can share. But they are not intended to include all such properties. Ichikawa wants to maintain that if, say, S unwittingly lost her body in an envattment procedure, she could still have a justified belief that she has a body even though the only counterparts of hers who could know this are ones who have a body. So, the target intrinsic respects are to be further restricted to what S and her envatted counterpart could share. In the end, this seems to amount to sameness of brain states or something close to that. This aspect of (JPK) goes a long way towards making it internalist-friendly and also helps (JPK) avoid the difficulties facing (JuJu) and (J=K). (See Ichikawa (2017) for his most recent work on knowledge-first approaches to justification.)
Nevertheless, (JPK) has problems of its own. Both problems stem from the attempt to reconcile (JPK) with the idea that justified belief is a type of creditable belief. Here is how Ichikawa (2014: 187) describes the first problem: Zagzebski (1996: 300-303) and many others have argued that it is plausible that S’s holding a justified belief entails that S is creditworthy (that is, praiseworthy) for believing as she does. Moreover, S is creditworthy because S holds a justified belief. That is, it is S’s particular act of believing that explains why S deserves credit. But (JPK) seems forced to explain S’s creditworthiness in terms of facts about S’s counterparts, since it is one’s counterparts that explain one’s doxastic justification. And this can seem odd: why should facts about a merely possible, distinct individual make me creditworthy for believing as I actually do (Silva 2017)? A more promising response involves noting that having a justified belief immediately grounds being creditworthy for believing, just as our intuition has it. And facts about one’s counterparts’ knowledge immediately ground having a justified belief. But immediate grounding is not transitive, so facts about knowledge do not immediately ground being creditworthy for believing. So the odd consequence does not follow. A consequence that does follow is that facts about knowledge mediately ground being creditworthy for believing (because there is a chain of immediate grounds connecting these). But here it is open to the knowledge-firster to say that our intuition really concerns only immediate grounding.
Ichikawa is clear that (JPK) is a theory of justified belief (doxastic justification) and that this is the notion of justification that is connected to a belief’s being creditworthy. But doxastic justification has a basing requirement, and this makes doxastic justification partly a historical matter. And epistemic credit and blame seem to depend on historical factors too (Greco 2014). Thus, Ichikawa’s defense of (JPK) is susceptible to cases like the following:
Bad Past: At t S comes to believe that there is a ceiling overhead. S believes this because she just took a pill which she knew would induce random changes in her intrinsic states. In advance of taking the pill, S knew it would very likely cause her to have many false perceptual beliefs. But as it happens, the pill induced a total re-organization of her intrinsic states such that at t S has a counterpart who knows a ceiling is overhead.
(JPK) implies that S has a justified belief in Bad Past because she happens to have a knowledgeable counterpart. And because she has a justified belief, she is also creditworthy. But this seems wrong. Rather, S seems positively blameworthy for believing as she does. See Silva (2017) for further discussion of (JuJu) and (JPK) and see Greco (2014) for further discussion of historical defeaters for doxastic justification.
An alternative solution to these problems would be to revise (JPK) so that it is only a theory about propositional justification:
(PJPK) S has justification to believe p iff S has a possible counterpart, alike to S in all relevant intrinsic respects, whose corresponding belief is knowledge.
One could then, arguably, concoct a knowledge-first theory of doxastic justification by adding some kind of historical condition that rules out cases like Bad Past.
It should be noted that (PJPK) has a strange result. For if your internal counterpart knows p, then your internal counterpart believes p. But if your internal counterpart believes p, then you also believe p—provided you and your counterpart are not in very different environments (for example, Earth vs. Twin Earth) that shift the content of the belief (compare Whitcomb 2014). So if (PJPK) is true, you only have propositional justification to believe p if you actually believe p. But it is usually assumed that it is possible to have propositional justification to believe p even if you do not believe p. To accommodate this, (PJPK) may need revision.
4. Reasons-First, Knowledge-First Theories
Sylvan (2018) and Lord (2018) each take a reasons-first approach to justification, on which a justified belief just is a belief that is held for sufficient reason:
(J=SR) S’s belief that p is justified iff (i) S possesses sufficient reason to believe p, and (ii) S believes that p for the right reasons.
While (J=SR) is not itself a knowledge-first view of justification, it becomes one when combined with a knowledge-first account of condition (i). Lord (2018: ch3) and Sylvan (2018: 212) both do this, taking reasons to be facts and arguing that one possesses a fact just in case one is in a position to know it:
(Pos=PK) S possesses the fact that p as a reason to respond in some way w iff S is in a position to know that p.
Others have argued for some kind of knowledge-first restriction on (Pos=PK). For example, Neta (2017) has argued that our evidence is the set of propositions we are in a position to know non-inferentially. Provided one’s evidence just is the set of reasons one has for belief, this view will fall into the reasons-first, knowledge-first camp. For objections to (Pos=PK) see Kiesewetter (2017: 200-201, 208-209) and Silva (2023).
Surprisingly, the category of reasons-first, knowledge-first views cross-cuts some of the other categories. For example, (J=K) theorists have tended to fall into this camp. Williamson (2009) and Littlejohn (2018) take one’s evidence to consist of the propositions that one knows. Again, provided one’s evidence just is the set of reasons one has for belief, this leads to a view on which one possesses p iff one knows p. This is a more restrictive knowledge-first view of possession, but together with (J=SR) and (J=K) it constitutes a kind of reasons-first, knowledge-first theory of justification. Since justified belief that p and knowledge that p never separate on this view, it can seem hardly worth mentioning as a reasons-first view. But there is more in need of epistemic justification than belief (though that will not be discussed here). There are other doxastic attitudes (for example, suspension, credence, acceptance, faith) as well as actions and feelings that are in need of epistemic justification. On knowledge-first, reasons-first views these states can only be justified by one’s knowledge.
As mentioned above, (J=K) is subject to a range of objections. What follows focuses on Lord and Sylvan’s incarnation of the knowledge-first program, which consists of (J=SR) and (Pos=PK). These two principles give us a knowledge-first theory of justification that avoids some of the main problems facing (J=K).
First, (J=SR) and (Pos=PK) are consistent with the existence of justified false beliefs. This is due to the fact that one’s reasons (the facts one is in a position to know) can provide one with sufficient, yet non-conclusive, reason to believe further propositions that may be false. The fact that a drunk has always lied about being sober can be a sufficient yet non-conclusive inductive reason to believe that he will lie about being sober in the future. Since it is non-conclusive, having justification for this belief is consistent with it turning out to be false. So this view can allow for justified yet false inferential beliefs. The possibility of justified false perceptual beliefs is discussed below in connection with the new evil demon hypothesis.
Second, (J=SR) and (Pos=PK) are consistent with the existence of unknown, justified true beliefs. Recall Gettier’s famous case of Smith and Jones: because Smith can have justified false beliefs in the way described above, he can have a justified false belief that Jones will get the job, based on the fact that the employer said so and the fact that this is a highly reliable indicator of who will get the job. Smith may also know that Jones has ten coins in his pocket based on perception. So, through an appropriate inferential process, Smith can come by a justified true inferential belief that the person who will get the job has ten coins in his pocket. This is a Gettier case, that is, an instance of a justified true belief without knowledge.
There are a few caveats. First, it is worth noting that the reasons-first, knowledge-first theory of justification only has this implication on the assumption that the justificatory support one derives from facts one is in a position to know is transitive, or can at least sometimes be transmitted across inferences from premises that one is not in a position to know. For, here, Smith’s false belief that Jones will get the job is justified by the reasons Smith is in a position to know, and we are assuming this justified false belief—which Smith is not in a position to know—can nevertheless facilitate Smith’s ability to acquire inferential justification for believing that the person who will get the job has ten coins in his pocket. For worries about the non-transitivity of the justification relation see Silins (2007) and Roche and Shogenji (2014).
Second, it is also worth noting that while Lord and Sylvan’s view is consistent with some intuitions about Gettier cases, it is not consistent with all such intuitions. After all, their view seems to be that we possess different reasons or evidence in the Gettier cases than we do in the good cases. This will seem counterintuitive to those who think that we have the same evidence in both cases.
Third, (J=SR) and (Pos=PK) are consistent with some intuitions about the new evil demon hypothesis. In the standard telling, the recently envatted brain has a non-veridical perceptual experience of p and believes p on the basis of that non-veridical experience. While the non-veridical experience does not give one access to the fact that p (if it is a fact), there is an inferential process that can give the envatted brain a justified belief according to (J=SR) and (Pos=PK). This is because mature thinkers who are recently envatted can know (or be in a position to know) that in the past their visual experiences have been a reliable guide to reality, and can sometimes know that they are now having an experience of p. Together, these are facts that can give one sufficient reason to believe p even if one is an unwittingly recently envatted brain.
Of course, the weakness here is that the envatted brain’s perceptual belief that p is not based on her inferential source of propositional justification to believe p. Rather, the envatted brain holds her belief in response to her perceptual experience. So, she is not doxastically justified, that is, her belief itself fails to be justified. So, there is some bullet to bite unless, perhaps, one can argue that knowledge of the fact that one is having an experience of p can itself be a reason to believe p even when one is an unwittingly envatted brain.
There are further problems that the reasons-first, knowledge-first view faces. They are along the lines of the problems for Bird’s (JuJu). For if reasons are facts, then one cannot obtain justified false beliefs from justified false-premise beliefs unless, as noted above, one’s justified false-premise beliefs are themselves inferentially justified and justificatory support carries over (see the discussion of (JuJu) above). Similarly, it is unclear whether one can gain justified beliefs from contentless beliefs. For contentless “premise” beliefs do not stand in inferential relations to their “conclusions,” and such relations seem essential to the ability of justificatory support to transmit across inferences.
For a further concern about this view, see Littlejohn’s (2019) “Being More Realistic About Reasons,” where he argues that the conjunction of (J=SR) and (Pos=PK) generates explanatory lacunas regarding how reasons should constrain our credences.
5. Perspectival Theories
Perspectival knowledge-first theories of justification put “knowledge first” by letting one’s point of view on whether one has knowledge determine whether one has justification. Smithies (2012), for example, argues that:
(PJ=PJK) S has justification to believe that p iff S has justification to believe that she is in a position to know that p.
Smithies (2012: 268) treats being in a position to know as a matter of being in a position where all the non-psychological conditions for knowing are met. Smithies is clear that this is only a theory of propositional justification (having justification to believe), not doxastic justification (having a justified belief). For as a theory of doxastic justification it would be too demanding: it would require an infinite hierarchy of beliefs, and it would require that one have epistemic concepts (e.g. KNOWS, JUSTIFIED, POSITION TO KNOW) if one is to have any justified beliefs at all. This would over-intellectualize justification, excluding agents incapable of epistemic reflection (for example, young children, people with handicaps, smart non-humans). Worse, if knowledge requires justification then this would also rob such beings of knowledge.
It is important to note that (PJ=PJK) is neutral on which side of the biconditional gets explanatory priority. To be a genuinely knowledge-first view it must be the condition on the right-hand side that explains why the condition on the left-hand side obtains. This is something that Smithies himself rejects. And there are good reasons for this, as there are objections to (PJ=PJK) that emerge only if we give the right-hand side explanatory priority. But there is also a general objection to this view that is independent of which side gets priority. This section starts with the general objection and then turns to the others.
A central worry to have about (PJ=PJK), irrespective of which side gets explanatory priority, is the extent to which Smithies’ purely non-psychological conception of propositional justification is a theoretically valuable conception of justification as opposed to a theoretically valuable conception of evidential support. For our evidence can support propositions in virtue of entailment and probabilistic relations, where these propositions can be so complex as to be well beyond our psychological abilities to grasp. For example, even before I had the concept of a Gettier case, my evidence supported the claim that I exist or I am in a Gettier case, simply because the proposition that I exist was already part of my evidence and entailed that disjunction. But since I did not have the concept GETTIER CASE, I could not have formed that belief.
So one general question concerns whether the motivations appealed to in support of (PJ=PJK) wrongly identify the following two epistemic notions:
Evidential Support: Having evidence, E, such that E entails or probabilistically supports p.
Justification: Having evidence, E, such that E gives one justification to believe p.
Certain evidentialists will like the idea of binding these notions together, thinking that strong evidential support is all there is to epistemic justification (Smithies 2019). Yet many have objected to the kind of evidentialism implicit in making evidential support necessary and sufficient for justification. The necessity direction has been objected to due to lottery problems, pragmatic encroachment, and the existence of justified beliefs not derived from evidence (so called “basic” or “immediate” or “foundational” justified beliefs). The sufficiency direction, while rarely challenged, is also objectionable (Conee 1987, 1994; Silva 2018). For example, some mental states are such that we are not in a position to know that we are in them even upon reflection (Williamson 2000). Suppose you knew that you just took a pill that ensured that you are in a mental state M iff you do not believe (A) that you are in M. A rational response to this knowledge would be to suspend belief in (A) due to your knowledge of this biconditional: for if you believe (A) then it is false, and if you disbelieve (A) then it is true. So suspension seems like the only rational response available to you. In at least some such cases where you consciously suspend belief in (A), you will also know that you have suspended belief (A). This is at least a metaphysical possibility, and certainly a logical possibility. Now, since you know the biconditional and since you know you have suspended belief in (A), your evidence entails that you are in M. But it is logically impossible for you to justifiably believe or know (A) on your evidence—and you can know this a priori. For believing (A) on your evidence entails that (A) is false. So connecting justification to evidential support in this way is inconsistent with the following plausible idea: S has justification to believe P on E only if it is logically possible for S to justifiably believe P on E. 
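The structure of the pill case can be displayed as a short derivation. (This formalization is my own gloss, not one given in the cited works; B abbreviates belief, and A is the proposition that one is in mental state M.)

```latex
% The pill guarantees the biconditional, where A := "I am in M":
\[
  M \leftrightarrow \lnot B(A)
\]
% Case 1: if one believes A, the right-hand side fails, so M is false,
% and hence A is false -- believing A makes A false:
\[
  B(A) \;\Rightarrow\; \lnot M \;\Rightarrow\; \lnot A
\]
% Case 2: if one suspends on A (and so does not believe it), the
% right-hand side holds, so M obtains and A is true:
\[
  \lnot B(A) \;\Rightarrow\; M \;\Rightarrow\; A
\]
```

So evidence that includes the biconditional plus the fact that one has suspended on (A) entails (A); yet believing (A) on that evidence would falsify (A), which is why justifiably believing (A) on that evidence is logically impossible even though the evidence entails it.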
For further discussion of these and related reasons to separate justification from evidential support see Silva (2018) and Silva and Tal (2020). For further objections to Smithies see Smith (2012). For further defense of Smithies’ theory see Smithies (2019: sect 9.4).
Further, as Smith (2012) points out, (PJ=PJK) implies that having justification to believe p requires having justification to believe an infinite hierarchy of meta-justificatory claims:
One thing that we can immediately observe is that [PJ=PJK]… is recursive, in that it can be reapplied to the results of previous applications. If one has justification to believe that p (Jp) then, by [PJ=PJK], one must have justification to believe that one is in a position to know that p (JKp). But if one has justification to believe that one is in a position to know that p (JKp) then, by [PJ=PJK], one must have justification to believe that one is in a position to know that one is in a position to know that p (JKKp) and so on… In general, we have it that Jp ⊃ JKn p for any positive integer n.
If one adds to this the priority claim that having justification to believe that one is in a position to know p is the source of one’s justification to believe p, one must either accept a skeptical result, due to grounding worries about the infinite hierarchy of meta-justificatory claims, or accept a knowledge-first form of infinitism. But even setting aside the standard general worries with infinitism, knowledge-first infinitism will be especially difficult to defend due to luminosity failures for KK. For example, in Williamson’s (2000: 229) unmarked clock case, one is argued to know a proposition p while also knowing that it is very improbable that one knows p. Intuitively, this is a case where one knows p, and so justifiably believes p, even though one lacks justification to believe one knows p. (For a discussion of the limits of the unmarked clock case see Horowitz 2014.)
The final issue with (PJ=PJK) is whether having justification to believe that one is in a position to know is the source of one’s propositional justification to believe p (which would make this a knowledge-first view) or whether it is a non-explanatory necessary and sufficient condition on having justification to believe p (Smithies’ view). To illustrate the difference, suppose there is an infallible record of people’s heights. It is certainly true that Paul is 5’11’’ at t if and only if the infallible record says that Paul is 5’11’’ at t. But the right-hand side of that biconditional is plausibly non-explanatory: the fact that there is an infallible record does not make or otherwise explain Paul’s height. Now, if the advocate of (PJ=PJK) holds that having justification to believe that one is in a position to know is the source of one’s justification, then having a doxastically justified belief will, according to tradition, require one to base one’s belief that p on that source of justification. But ordinarily we do not base our beliefs on further facts about knowing or being in a position to know. So if we are not to risk an unacceptable skepticism about doxastically justified belief (and hence knowledge), it seems we will either have to give up the tradition or treat the right-hand side of (PJ=PJK) as specifying a mere non-explanatory necessary and sufficient condition. However, if that is the case, it can seem puzzling why there should be such a modally robust connection between justification and one’s perspective on whether one knows.
A view much like (PJ=PJPK) that avoids all but this final problem is Dutant and Littlejohn’s (2020) thesis:
(Probable Knowledge) It is rational for S to believe p iff the probability that S is in a position to know p is sufficiently high.
Even after specifying the relevant notion of ‘in a position to know’ and the relevant notion of ‘probability’ (objective, subjective, epistemic, together with some specification of what counts as an agent’s evidence), provided we can and should distinguish between propositionally and doxastically rational belief, it seems that (Probable Knowledge) is either not going to be a genuinely knowledge-first view or one that does not allow for enough doxastically rational beliefs due to the basing worry described above in connection with Bad Past.
Reynolds (2013) offers a related view of doxastic justification on which justified belief is the appearance of knowledge: “I believe with justification that I am currently working on this paper if and only if there has been an appearance to me of my knowing that I am currently working on this paper.” Generalizing this we get:
(J=AK) S’s belief that p is justified if and only if S is appeared to as though S knows that p.
On his view, appearances are neither doxastic states nor conceptually demanding. As he explains the target notion:
Consider the following example: Walking in a park I notice an unfamiliar bird, and decide I would like to find out what it is. Fortunately, it doesn’t immediately fly away, so I observe it for two or three minutes. A few hours later, having returned home, I look up a web site, find a few photos, follow up by watching a video, and conclude confidently that I saw a Steller’s Jay. I think it is perfectly correct to say that the bird I saw had the appearance of a Steller’s Jay, even though I didn’t know that that’s what it was at the time. If it hadn’t had the appearance of a Steller’s Jay, I wouldn’t have been able to remember that appearance later and match it to the photos and video of Steller’s Jays. I didn’t have the concept of a Steller’s Jay, yet I had an appearance of a Steller’s Jay. (Reynolds 2013: 369)
(J=AK) has advantages with regard to (PJ=PJK). It does not lead to an infinite hierarchy of meta-justificatory claims and it is not hard to see how many of our occurrent beliefs might be based on such appearances, thereby avoiding some of the skeptical challenges that threatened (PJ=PJK). But there are problems.
One concern with (J=AK) is its self-reflective character. To have a justified belief you have to be (or have been) in a state in which it appears to you as though you have knowledge. This requires introspective abilities, which arguably some knowing creatures might lack. As Dretske (2009) put it: a dog can know where its bowl is, and a cat can know where the mouse ran. The correctness of these and other knowledge ascriptions does not seem to turn on whether or not dogs and cats have the capacity to access their own mental lives in such a way that they can appear to themselves to have knowledge.
Moreover, (J=AK) implies that every justified belief is a belief with such an appearance. But many of the justified beliefs we form and much of the knowledge we acquire is merely dispositional, that is, it involves dispositional beliefs that are never or only very briefly made occurrent. Do we, as a matter of psychological fact, also have the appearance of knowledge with regard to all such states? There is non-trivial empirical reason to doubt it. In the psychology of memory, it has been observed that our memory systems are not purely preservative; they are also constructive. For example, our sub-personal memory systems often lead us to forget very specific beliefs while forming new beliefs that are more general in character. Sometimes this leads to new knowledge and new justified beliefs (Grundmann and Bernecker 2019). But if the new belief is the product of sub-personal operations and the more general belief is itself unretrieved, then it is unclear how that more general unretrieved justified belief could appear to oneself as a case of knowing.
A final concern with (J=AK) involves its ability to handle undercutting defeat, together with the plausible idea that beliefs can cognitively penetrate appearances (see cognitive penetration). For suppose you have strong undefeated evidence that you are in fake-barn country, but you brazenly believe without justification that you are looking at the one real barn in all the country. Perhaps this is because you pathologically believe in your own good fortune. But pathology is not necessary to make the point, as it is often assumed that we can have unjustified beliefs that we believe to be justified. If either is your situation, your belief that you are looking at a real barn can appear to you to be knowledge given your normal visual experience and the fact that you (unjustifiably) believe your defeater to have been defeated. According to (J=AK) your belief is then justified. But that is the wrong result. Unjustified beliefs that enable the appearance of knowledge should not have the ability to neutralize defeaters.
Here is a final perspectival, knowledge-first theory of justification. It is mentioned by Smithies (2012) and explored by Rosenkranz (2018):
(J=¬K¬K): S has justification to believe p iff S is not in a position to know that S is not in a position to know that p.
Like Smithies, Rosenkranz relies on a conception of justification and being in a position to know that is psychologically undemanding. But unlike Smithies, Rosenkranz explicitly regards his view as being about justification for idealized agents and leaves open what relevance this notion has for ordinary, non-idealized agents like us.
There are at least two concerns with this view of justification. First, suppose we were to treat (J=¬K¬K) as a theory of justification for ordinary non-ideal agents and imposed (as many wish to) substantive psychological limits on what one has justification to believe. With such limits in place, (J=¬K¬K) would face not an over-intellectualization problem but an under-intellectualization problem. For agents who lack the concept KNOWLEDGE or the complicated concept POSITION TO KNOW could never be in a position to know that they are not in a position to know. So, such agents would be justified in believing anything.
But even once psychological limits are stripped away, and with them the under-intellectualization problem, another problem remains. Smithies (2012: 270) points out that, on this view, to lack justification one must be in a position to know that one is not in a position to know. Since being in a position to know is factive, this limits defeating information to factive defeating information. But it seems like misleading (non-factive) information can also defeat knowledge and justification. For example, suppose you are told that you are in fake-barn country. But in fact you are not, so you are not in a position to know that you are in fake-barn country. Still, the misleading testimony that you are in fake-barn country gives you justification to believe that you are in fake-barn country. Intuitively, this misleading testimony will defeat your justification to believe that there is a barn ahead; the misleading testimony ensures you should not believe that. But you are not in a position to know that you are not in a position to know that there is a barn ahead—recall the testimony you receive is misleading. So (J=¬K¬K) says you have justification when intuitively you do not.
In response, it seems open to advocates of (J=¬K¬K) to argue that while one might not be in a position to know the content of the misleading testimony (because it is false), the misleading testimony itself can defeat. In this case, for example, it is arguable that the misleading testimony that one is in circumstances that make one’s knowing that p improbable itself defeats one’s being in a position to know p, and so prevents one’s good visual contact with an actual nearby barn in normal conditions from putting one in position to know that a barn is nearby. However, recent arguments for the existence of “unreasonable knowledge”—that is, knowledge that p while knowing that it is improbable that one knows p—will challenge the integrity of this response in defense of (J=¬K¬K). For more on unreasonable knowledge see Lasonen-Aarnio (2010, 2014) and Benton and Baker-Hytch (2015).
We are not simply retainers of propositional knowledge. We are also able to acquire it. You are, for example, able to figure out whether your bathroom faucet is currently leaking, you are able to figure out whether your favorite sports team won more games this season than last season, you are able to figure out the sum of 294 and 3342, and so on. In normal circumstances when you exercise this ability you gain propositional knowledge. If you are able to figure out whether the faucet is leaking and you use that ability, the typical result is knowledge that the faucet is leaking (if it is leaking) or knowledge that the faucet is not leaking (if it is not leaking). The core idea behind knowledge-first virtue epistemology (KFVE) is that justified belief is belief that is somehow connected to exercises of an ability to know. Predictably, (KFVE)-theorists have had different things to say about how justified belief is connected to such abilities.
Some have argued that success is a general feature of exercises of abilities (Millar 2016). That is, one exercises an ability only if one does what the ability is an ability to do. It is widely thought that belief formation is a part of exercising an ability to know because knowing is constituted by believing. From which it follows in the special case of exercises of abilities to know that:
(Exercise Infallibilism) S’s belief is the product of an exercise of an ability to know only if S’s belief constitutes knowledge.
For example, Millar (2019) argues for a special instance of this in arguing that we cannot exercise an ability to know by perception without thereby acquiring perceptual knowledge.
If (Exercise Infallibilism) is true, and if justified beliefs just are beliefs that are products of abilities to know, then (J=K) follows. And so we have a virtue theoretic account of justified belief that faces all the same problems we saw above facing (J=K). Of note is the inability of such a view to accommodate the following desiderata:
Desideratum 1. Justification is non-factive, that is, one can have justified false beliefs.
Desideratum 2. One can have justified true beliefs that do not constitute knowledge, as in standard Gettier cases.
Desideratum 3. One can have justified perceptual beliefs even if one is in an environment where perceptual knowledge is impossible due to systematically misleading features of one’s perceptual environment. This can happen on a more global scale (as in the new evil demon case), and it can happen on a more local scale (as in beech-elm cases discussed below).
The central point of departure from Millar’s virtue theory and the remaining virtue theories is that they reject (Exercise Infallibilism). It is this rejection that makes the resulting theories resilient to the objections facing (J=K). On Miracchi’s (2015) preferred instance of (KFVE), exercises of abilities to know explain our justified beliefs but it is not mere abilities to know that have the potential yield justified beliefs. Rather, it is only proficient abilities to know (“competences”) that yield justified beliefs, and all abilities to know are proficient abilities to know. One has a proficient ability to know just in case an exercise of their ability to know ensures a sufficiently high objective probability of knowing. That is, the conditional objective probability that S knows p given that S exercised a relevant ability to know is sufficiently high. This is a kind of in situ reliability demand on justification.
We can summarize her view of justified belief, roughly, as follows:
(KFVE-Proficiency) S has a justified belief iff S’s belief is competent, where S’s belief is competent iff S’s belief is produced by an exercise of a proficient ability to know.
Central to her view is the idea that exercises of proficient abilities are fallible, that is, an agent can exercise an ability to know without succeeding in knowing. So (Exercise Infallibilism) is given up. This enables (KFVE-Proficiency) to accommodate justified false beliefs (that is, Desideratum 1) as well as justified true beliefs that do not constitute knowledge (that is, Desideratum 2). So (KFVE-Proficiency) avoids two of the main challenges to (J=K) and Millar’s (KFVE-Infallibilism).
However, by limiting justified beliefs to beliefs produced by proficient abilities, Miracchi’s view is, like (J=K) and Millar’s infallibilist view, unable to accommodate Desideratum 3, that is, the compatibility of justified beliefs formed in certain deceptive environments. The first case of this is just the familiar new evil demon case. For the recently envatted brain, as Kelp (2016; 2017; 2018) argues, retains the ability to know by perception that, say, they have hands by responding to visual appearances in normal circumstances. But because they are no longer in normal circumstances, they no longer possess a proficient ability to know. In other words, the recently envatted brain’s change of environment robs them of the proficiency needed to form justified beliefs.
Miracchi (2020) rejects, or is at least deeply suspicious of, the metaphysical possibility of the new evil demon hypothesis. But we need not rely on fantastical envatted brain scenarios to make this style of objection to (KFVE-Proficiency). Suppose you grew up in an environment with lots of beech trees and developed the ability to visually identify them and thus the ability to know that a beech tree is nearby by sight. Since exercises of abilities are fallible, you could exercise this beech-identification ability if you were to unwittingly end up in another environment where there are only elms (which, according to Putnam, look indistinguishable from beeches to the untrained). But this is not an environment where your ability to identify beeches amounts to a proficiency: conditional on your exercise of your ability to identify and come to know that beeches are nearby, it is objectively highly likely that you will fail to know. So the intuition that you can have justified perceptual beliefs about beeches being nearby in such a case appears inconsistent with (KFVE-Proficiency). While there may be some doubt about the metaphysical possibility of the new evil demon hypothesis, this is a perfectly possible scenario. See Kelp (2018: 92) for a similar objection for Miracchi.
One last concern with (KFVE-Proficiency) regards its ability to accommodate defeat. This is discussed in the section below.
Kelp (2016; 2017; 2018) and Simion (2019) offer versions of (KFVE) that do not tie justification so closely to in situ reliability and thereby avoid not only the problem of having justified false beliefs and the possibility of Gettier cases, but also problems arising from the new evil demon hypothesis and very local cases of deception (like the beech-elm case above). So Desiderata 1–3 are easily managed. This section first explains their distinctive views and then mentions some concerns they share.
On Kelp’s (2016; 2017; 2019) view, justified belief is competent belief, and competent beliefs are generated by exercises of an agent’s ability to know. Importantly, such exercises do not require proficiency in Miracchi’s sense. Kelp’s view, roughly, amounts to this:
(KFVE-Ability) S has a justified belief iff S’s belief is competent, where S’s belief is competent iff S’s belief is produced by an exercise of an ability to know.
On Simion’s (2019) view, in contrast, justified beliefs are beliefs that are generated by properly functioning cognitive processes that are aimed at yielding knowledge. Presumably, if an agent has properly functioning cognitive processes that are aimed at yielding knowledge, then such an agent has an ability to know as well. So it’s not too much of a taxonomic stretch to place Simion’s theory among the virtue theories. Like the exercise of abilities, cognitive processes can properly function without proficiency:
(KFVE-Functionalism) S’s belief is justified iff S’s belief is produced by a properly functioning cognitive process that has the etiological function of generating knowledge.
These statements of Kelp and Simion’s views are relatively coarse-grained and both Kelp and Simion defend more refined theses.
Kelp and Simion’s views are not unrelated to each other. For the ability to know is an ability one has in virtue of having certain belief-producing cognitive processes, and Kelp’s (2018) preferred account of how the ability to know is acquired is the same general kind of account that Simion (2019) relies on in arguing that the cognitive processes that constitute one’s ability to know are cognitive processes whose function is knowledge production. Nevertheless, the views are distinct in that (KFVE-Ability) grounds justification in agent abilities, while (KFVE-Functionalism) grounds them in cognitive processes. See Kelp (2019) for a discussion of the importance of this difference.
Central to their views is the idea that exercises of abilities to know are fallible, and given the fallibility of exercises of the ability to know, (KFVE-Ability) and (KFVE-Functionalism) allow for justified false beliefs and justified true beliefs that do not constitute knowledge. So, Desiderata 1 and 2 are easily accommodated.
Desiderata 3 is likewise easily accommodated. In Kelp’s (2018) telling, the recently envatted brain retains and exercises an ability to know when believing she has a hand upon having the visual experience as of a hand. According to Simion (2019), just as an envatted heart pumping orange juice counts as a properly functioning heart, a recently envatted brain counts as properly functioning when it comes to believe it has a hand upon having the visual experience as of a hand. And if justified belief can be had in cases of such systematic perceptual deception, then they can also be had in cases of localized perceptual deception as in the beech-elm scenario above.
So (KFVE-Ability) and (KFVE-Functionalism) can accommodate Desiderata 1–3. What about the desiderata that emerged in the objections to (JuJu), (JPK), and reasons-first, knowledge-first views? That is:
Desideratum 4. Justified beliefs can be based on inferences from justified false beliefs.
Desideratum 5. Justified beliefs can be based on inferential acts involving contentless beliefs.
Desideratum 6. Justified belief is a kind of creditable belief.
Desideratum 7. Justified belief has a historical dimension that is incompatible with situations like Bad Past.
If (KFVE-Ability) or (KFVE-Functionalism) imply that a recently envatted brain is able to have justified beliefs from an exercise of an ability to know or as a product of their cognitive competences which aim a knowledge, then it is easy to see how Desiderata 4 and 5 is satisfied by (KFVE-Ability) and (KFVE-Functionalism). For these seem like more local cases of deception. As for 6 and 7, the virtue-theoretic machinery here is key. For both can be explained by the demand that justified beliefs are beliefs that issue from an ability or a properly functioning cognitive process. But that was exactly what was lacking in the cases discussed above that motivated 6 and 7. See Silva (2017) for an extended discussion of how certain versions of (KFVE) can satisfy these desiderata.
There are some worries about these versions of (KFVE). Consider Schroeder’s (2015) discussion about defeater pairing. Any objective condition, d, which defeats knowledge that p is such that: if one justifiedly believes that d obtains then this justified belief will defeat one’s justification to believe p. For example, suppose you formed the belief that a wall is red from an ability to know this by perception and that you are in normal circumstances where the wall is in fact red. You will have a justified belief according to each of the fallibilist versions of (KFVE) above. But suppose you were given misleading yet apparently reliable undercutting information that the wall is illuminated by red lights and so might not actually be red. This is not true, but were it true it would defeat your knowledge; were it true you would be in a Gettier situation. Now the defeater pairing insight says that the fact that you justifiedly believe the wall is illuminated by red lights defeats your justification to believe the wall is red. But according to the fallibilist instances of (KFVE) discussed above, since you arrived at your belief that the wall is red through an exercise of your proficiency or ability or properly functioning cognitive process, you have a justified belief according to (KFVE-Proficiency), (KFVE-Competence), and (KFVE-Functionalism). But that is inconsistent with the intuition that the justification for your belief is defeated.
So this objection gives rise to a further potential demand on an adequate theory of justified belief:
Desideratum 8. Justified belief is susceptible to defeat by justified defeating information.
A possible response to this objection is to maintain that exercises of abilities, or the use of a reliable processes, always depends on the absence of credible defeating information. In which case, the versions of (KFVE) above may be able to accommodate Desideratum 8.
Another response is to resist Desideratum 8 and the supposed phenomenon of defeater pairing. For more on this see discussion of “unreasonable justified beliefs”. See Lasonen-Aarnio (2010, 2014) and Benton and Baker-Hytch (2015). For qualified opposition see Horowitz (2014).
The second concern to have about (KFVE-Ability) and (KFVE-Functionalism) is that there is a question about the extent to which abilities/cognitive processes are “in the head.” For example, consider the amputee gymnast. She lost her leg and so no longer has the ability to do a backflip. So her ability to do backflips is located, in part, in her ability to successfully interact with the physical world in some ways. In this case, it is located in her ability to control her body’s physical movements in certain ways. This does not conflate proficiency with mere ability, for even with both legs the gymnast might not have a proficiency because she’s in an inhospitable environment for performing backflips (high winds, buckling floors, and so forth). We might wonder, then, whether the envatted brain’s ability to know by perception is lost with the loss of her body and the body’s perceptual apparatus just as the gymnast’s ability to do backflips is lost with the loss of her leg. If so, then it is a mistake to think (KFVE-Ability) and (KFVE-Functionalism) are compatible with the new evil demon hypothesis, and hence with Desideratum 3. This threatens to make these views more revisionary than they initially appeared to be.
9. Know-How Theories and the No-Defeat Condition
Silva (2017) argues that justification is grounded in our practical knowledge (knowledge-how) concerning the acquisition of propositional knowledge (knowledge-that). The motivation for this incarnation of (KFVE) starts with the simple observation that we know how to acquire propositional knowledge. You, for example, know how to figure out whether your bathroom faucet is currently leaking, you know how to figure out whether your favorite sports team won more games this season than last season, you know how to figure out the sum of 294 and 3342, and so on. In normal circumstances when you exercise such know-how you typically gain propositional knowledge. If you know how to figure out whether the faucet is leaking and you use that know-how, the typical result is knowledge that the faucet is leaking (if it is leaking) or knowledge that the faucet is not leaking (if it is not leaking). One way of thinking about the grounds of justification is that it is crucially connected to this kind of know-how: justified belief is, roughly, belief produced by one’s knowledge how to acquire propositional knowledge.
Here is a characterization of Silva’s (2017) view:
(KFVE-KnowHow) S has a justified belief iff (i) S’s belief is produced by an exercise of S’s knowledge of how to gain propositional knowledge, and (ii) S is not justified in thinking she is not in a position to acquire propositional knowledge in her current circumstances.
One advantage of (KFVE-KnowHow) is that it is formulated in terms of know-how and so avoids worries about abilities not being “in the head.” For example, while the amputee gymnast discussed above lacks the ability to perform backflips, she still knows how to do them. Similarly, in thinking about the recently envatted brain, she still knows how to acquire propositional knowledge by perception even if she lacks the ability to do so because she has lost the necessary perceptual apparatus. So Desideratum 3 is, arguably, easier to accommodate with (KFVE-KnowHow).
Similarly, since exercises of know-how are fallible in situ (Hawley 2003), (KFVE-KnowHow) has no trouble explaining how exercises of one’s knowledge how to know could lead one to have a false belief (that is, Desideratum 1) or have true beliefs that do not constitute knowledge (that is, Desideratum 2). For similar reasons (KFVE-KnowHow) is able to satisfy Desiderata 4-7. See Silva (2017) for further discussion.
Lastly, condition (ii) is a kind of “no defeater” condition that makes (KFVE-KnowHow) compatible with Schroeder’s defeater-pairing thesis and standard intuitions about undercutting defeat. So it manages to accommodate Desideratum 8. So (KFVE-KnowHow) appears capable of satisfying all the desiderata that emerged above. Accordingly, to the extent that one finds some subset of Desiderata 1-8 objectionable one will have reason to object to (KFVE-KnowHow). For one way of developing this point see the next section.
10. Excused Belief vs. Justified Belief
The objections to knowledge-first views of justification above assumed, among other things, that justification has the following properties:
Desideratum 1. Justification is non-factive, that is, one can have justified false beliefs.
Desideratum 2. One can have justified true beliefs that do not constitute knowledge, as in standard Gettier cases.
Desideratum 3. One can have justified perceptual beliefs even if one is in an environment where perceptual knowledge is impossible due to systematically misleading features of one’s perceptual environment. This can happen on a more global scale (as in the new evil demon case), and it can happen on a more local scale (as in beech-elm cases discussed above).
Desideratum 4. Justified beliefs can be based on inferences from justified false beliefs.
Desideratum 5. Justified beliefs can be based on inferential activities involving contentless beliefs.
Desideratum 6. Justified belief is a kind of creditable belief.
Desideratum 7. Justified belief has a historical dimension that is incompatible with situations like Bad Past.
Desideratum 8. Justified belief is susceptible to defeat by justified defeating information.
Knowledge-first virtue epistemology has the easiest time accommodating these assumed properties of justification, with (KFVE-KnowHow) being able to accommodate all of them.
In defense of alternative knowledge-first views some might argue that Desiderata 1–8 (or some subset thereof) are not genuine properties of justification, but rather properties of a kindred notion, like excuse. Littlejohn (2012: ch. 6; 2020) and Williamson (2014: 5; 2020) have argued that the failure to properly distinguish justification from excuses undermines many of the arguments that object to there being a tight connection between knowledge and justification. An excuse renders you blameless in violating some norm, and it is easy to see how some might argue that 1–8 (or some subset thereof) indicate situations in which an agent is excusable, and so blameless, although her belief is not justified. For the locus classicus on the concept of excuse see Austin’s “A Plea for Excuses.” For critical discussion of the excuse maneuver in defense of knowledge-first theories (of assertion and justification) see Lackey (2007), Gerken (2011), Kvanvig (2011), Schechter (2017), Madison (2018), and Brown (2018).
Arguably, the most accommodating knowledge-first virtue theory, (KFVE-KnowHow), threatens to make the concept of an excuse nearly inapplicable in epistemology. For the situations indicated in 1-8 are so inclusive that it can be hard to see what work is left for excuses. If one thought there should be deep parallels between epistemology and moral theory, which leaves substantive work for excuses, then one might worry that any theory that can accommodate all of Desiderata 1-8 will in some way be guilty of conflating justification with excuse.
11. A Methodological Reflection on Gettier
The history of the Gettier problem is a long history of failed attempts to give a reductive account of knowledge in terms of justification and other conditions. In light of this, many have since judged the project of providing a reductive analysis of knowledge to be a degenerating research program. In putting knowledge first in the theory of justification, epistemologists are exploring whether we can more successfully reverse the order of explanation in epistemology by giving an account of justified belief in terms of knowledge. Attempts to put knowledge first in the theory of justification began during the early twenty-first century, reminiscent of the history of attempts to solve the Gettier problem: knowledge-first theories are proposed, counterexamples are given, new knowledge-first theories (or error theories) are developed, new counterexamples are given, and so on (Whitcomb 2014: sect. 6).
Perhaps this repeat of Gettierology merits a new approach. One such approach, advocated by Gerken (2018) is an ‘equilibristic epistemology’ according to which there is not a single epistemic phenomenon or concept that comes first in the project of the analysis of knowledge or justification. Rather, there are various basic epistemic phenomena that are not reductively analyzable. At most they may be co-elucidated in a non-reductive manner. Alternatively, perhaps we should return to the tradition from which knowledge-first epistemology sprung. That is, perhaps we should return to the prior project of providing a reductive analysis of knowledge in terms of other conditions. A manifestation of a return to the traditional approach involves drawing a distinction between knowledge and awareness, where the diagnosis of the failure of post-Gettier analyses of knowledge is, in part, taken to be a failure to appreciate the differences between knowledge and awareness (Silva 2023: ch.8-9).
12. References and Further Reading
Benton, M. and M. Baker-Hytch. 2015. ‘Defeatism Defeated.’ Philosophical Perspectives 29: 40-66.
Brown, J. 2018. Fallibilism. Oxford: Oxford University Press.
Chalmers, D. 2012. Constructing the World. Oxford: Oxford University Press.
Comesana, J. and H. Kantin. 2010. ‘Is Evidence Knowledge?’ Philosophy and Phenomenological Research 89: 447-455.
Conee, E. 1987. ‘Evident, but Rationally Unacceptable.’ Australasian Journal of Philosophy 65: 316-26.
Conee, E. 1994. ‘Against an Epistemic Dilemma.’ Australasian Journal of Philosophy 72: 475-81.
Dretske, F. 2009. Perception, Knowledge, Belief. Cambridge: Cambridge University Press.
Dutant, J. and C. Littlejohn. 2020. ‘Defeaters as Indicators of Ignorance.’ In J. Brown and M. Simion (eds.), Reasons, Justification, and Defeat. Oxford: Oxford University Press.
Fratantonio, G. 2019. ‘Armchair Access and Imagination.’ Dialectica 72(4): 525-547.
Gerken, M. 2011. ‘Warrant and Action.’ Synthese, 178(3): 529-47.
Gerken, M. 2018. ‘Against Knowledge-First Epistemology.’ In J. A. Carter, E. C. Gordon, and B. Jarvis (eds.), Knowledge First: Approaches in Epistemology and Mind. Oxford: Oxford University Press, pp. 46-71.
Greco, J. 2014. ‘Justification is not Internal.’ In M. Steup, J. Turri, and E. Sosa (eds.) Contemporary Debates in Epistemology. Oxford: Wiley Blackwell: 325-336.
Grundmann, T. and S. Bernecker. 2019. ‘Knowledge from Forgetting.’ Philosophy and Phenomenological Research 98: 525-539.
Hawley, K. 2003. ‘Success and Knowledge-How.’ American Philosophical Quarterly 40: 19-31.
Hawthorne, J. 2004. Knowledge and Lotteries. Oxford: Oxford University Press.
Horowitz, S. 2014. ‘Epistemic Akrasia.’ Nous 48(4): 718-744.
Ichikawa, J.J. 2014. ‘Justification is Potential Knowledge.’ Canadian Journal of Philosophy, 44: 184-206.
Ichikawa, J. and Steup, M. 2012. ‘The Analysis of Knowledge.’ Stanford Encyclopedia of Philosophy.
Ichikawa, J. and C.S.I. Jenkins. 2018. In Joseph Adam Carter, Emma C. Gordon & Benjamin Jarvis (eds.), Knowledge First: Approaches in Epistemology and Mind. Oxford University Press.
Kelp, C., M. Simion, H. Ghijsen. 2016. ‘Norms of Belief.’ Philosophical Issues 16: 374-92.
Kelp, C. 2016. ‘Justified Belief: Knowledge First-Style.’ Philosophy and Phenomenological Research 93: 79-100.
Kelp, C. 2017. ‘Knowledge First Virtue Epistemology.’ In Carter, A., Gordon, E. and Jarvis, B. (eds.) Knowledge First: Approaches in Epistemology and Mind. Oxford: Oxford University Press.
Kelp, C. 2018. Good Thinking: A Knowledge-First Virtue Epistemology. New York: Routledge.
Kelp, C. 2019. ‘How to Be a Reliabilist.’ Philosophy and Phenomenological Research 98: 346-74.
Kiesewetter, B. 2017. The Normativity of Rationality. Oxford: Oxford University Press.
Kvanvig, J. L. 2011. ‘Norms of Assertion.’ In Jessica Brown and Herman Cappelen (eds.), Assertion: New Philosophical Essays. Oxford: Oxford University Press.
Lackey, J. 2007. ‘Norms of Assertion.’ Nous 41: 594-626.
Lasonen-Aarnio, M. 2010. ‘Unreasonable knowledge.’ Philosophical Perspectives 24: 1-21.
Lasonen-Aarnio, M. 2014. ‘Higher-order evidence and the limits of defeat.’ Philosophy and Phenomenological Research 88: 314–345.
Lewis, D. 1997. ‘Finkish Dispositions.’ The Philosophical Quarterly 47: 143-58.
Littlejohn, C. 2017. ‘How and Why Knowledge is First.’ In A. Carter, E. Gordon & B. Jarvis (eds.), Knowledge First. Oxford: Oxford University Press.
Littlejohn, C. 2012. Justification and the Truth-Connection. Cambridge: Cambridge University Press.
Littlejohn, C. 2019. ‘Being More Realistic About Reasons: On Rationality and Reasons Perspectivism.’ Philosophy and Phenomenological Research 99(3): 605-627.
Littlejohn, C. 2020. ‘Plea for Epistemic Excuses.’ In F. Dorsch and J. Dutant (eds.), The New Evil Demon Problem. Oxford: Oxford University Press.
Madison, B. 2010. ‘Is Justification Knowledge?’ Journal of Philosophical Research 35:173-191.
Madison, B. 2018. ‘On Justifications and Excuses.’ Synthese 195 (10):4551-4562.
McGlynn, A. 2014. Knowledge First? Palgrave Macmillan.
Meylan, A. 2017. ‘In support of the knowledge-first conception of the normativity of justification.’ In Carter, A., Gordon, E. and Jarvis, B. (eds.) Knowledge First: Approaches in Epistemology and Mind. Oxford: Oxford University Press.
Millar, A. 2016. ‘Abilities, Competences, and Fallibility.’ In M. Á. Fernández (ed.), Performance Epistemology. Oxford: Oxford University Press.
Millar, A. 2019. Knowing by Perceiving. Oxford: Oxford University Press.
Miracchi, L. 2015. ‘Competence to Know.’ Philosophical Studies, 172: 29-56.
Miracchi, L. 2020. ‘Competent Perspectives and the New Evil Demon Problem.’ In J. Dutant and F. Dorsch, (eds.), The New Evil Demon. Oxford: Oxford University Press.
Neta, R. and D. Pritchard. 2007. ‘McDowell and the New Evil Genius.’ Philosophy and Phenomenological Research, 74: 381-396.
Neta, R. 2017. ‘Why Must Evidence Be True?’ in The Factive Turn in Epistemology, edited by Velislava Mitova. Cambridge: Cambridge University Press.
Pritchard, D. and Greenough, P. (eds.). 2009. Williamson on Knowledge. Oxford: Oxford University Press.
Reynolds, S. 2013. ‘Justification as the Appearance of Knowledge.’ Philosophical Studies, 163: 367-383.
Rosenkranz, S. 2007. ‘Agnosticism as a Third Stance.’ Mind 116: 55-104.
Rosenkranz, S. 2018. ‘The Structure of Justification.’ Mind 127: 309-338.
Roche, W. and T. Shogenji. 2014. ‘Confirmation, transitivity, and Moore: The Screening-off Approach.’ Philosophical Studies 168: 797-817.
Schechter, J. 2017. ‘No Need for Excuses.’ In J. Adam Carter, Emma Gordon & Benjamin Jarvis (eds.), Knowledge-First: Approaches in Epistemology and Mind. Oxford University Press. pp. 132-159.
Silins, N. 2005. ‘Deception and Evidence.’ Philosophical Perspectives 19: 375-404.
Silins, N. 2007. ‘Basic justification and the Moorean response to the skeptic.’ In T. Gendler & J. Hawthorne (Eds.), Oxford Studies in Epistemology (Vol. 2, pp. 108–140). Oxford: Oxford University Press.
Silva, P. 2017. ‘Knowing How to Put Knowledge First in the Theory of Justification.’ Episteme 14 (4): 393-412.
Silva P. & Tal, E. 2021. ‘Knowledge-First Evidentialism and the Dilemmas of Self-Impact.’ In Kevin McCain, Scott Stapleford & Matthias Steup (eds.), Epistemic Dilemmas. London: Routledge.
Silva, P. 2023. Awareness and the Substructure of Knowledge. Oxford: Oxford University Press.
Simion, M. 2019. ‘Knowledge‐first functionalism.’ Philosophical Issues 29 (1): 254-267.
Smith, M. 2012. ‘Some Thoughts on the JK-Rule.’ Nous 46(4): 791-802.
Smithies, D. 2012. ‘The Normative Role of Knowledge.’ Nous 46(2): 265-288.
Smithies, D. 2019. The Epistemic Role of Consciousness. Oxford: Oxford University Press.
Sutton, J. 2005. ‘Stick to What You Know.’ Nous 39(3): 359-396.
Sutton, J. 2007. Beyond Justification. Cambridge: MIT Press.
Sylvan, K. 2018. ‘Knowledge as a Non-Normative Relation.’ Philosophy and Phenomenological Research 97 (1): 190-222.
Whitcomb, D. 2014. ‘Can There Be a Knowledge-First Ethics of Belief?’ In Jonathan Matheson & Rico Vitz (eds.), The Ethics of Belief: Individual and Social. Oxford University Press.
Williamson, T. 2000. Knowledge and its Limits. Oxford: Oxford University Press.
Williamson, T. 2009. ‘Replies to Critics.’ In Duncan Pritchard & Patrick Greenough (eds.), Williamson on Knowledge. Oxford: Oxford University Press. pp. 279-384.
Williamson, T. 2014. ‘Knowledge First.’ In M. Steup, J. Turri, and E. Sosa (eds.), Contemporary Debates in Epistemology (Second Edition). Oxford: Wiley-Blackwell.
Williamson, T. 2020. ‘Justifications, Excuses, and Sceptical Scenarios.’ In J. Dutant and F. Dorsch (eds.), The New Evil Demon. Oxford: Oxford University Press.
Zagzebski, L. 1996. Virtues of the Mind: An Inquiry into the Nature of Virtue and the Ethical Foundations of Knowledge. Cambridge: Cambridge University Press.
‘British Empiricism’ is a name traditionally used to pick out a group of seventeenth- and eighteenth-century thinkers who prioritised knowledge via the senses over reason or the intellect and who denied the existence of innate ideas. The group includes, most notably, John Locke, George Berkeley, and David Hume. The counterpart to British Empiricism is traditionally considered to be Continental Rationalism, advocated by Descartes, Spinoza, and Leibniz, all of whom lived in Continental Europe beyond the British Isles and embraced innate ideas. This article characterizes empiricists more broadly as those thinkers who accept Locke’s Axiom that there is no idea in the mind that cannot be traced back to some particular experience. It covers British-Irish philosophy from the seventeenth, eighteenth, and nineteenth centuries. As well as exploring the traditional connections between empiricism, metaphysics, and epistemology, it examines how British empiricists dealt with issues in moral philosophy and with the existence and nature of God. The article identifies some challenges to the standard understanding of British Empiricism by including early modern thinkers from typically marginalised groups, especially women. Finally, in showing that there is nothing uniquely British about being an empiricist, it examines the case of the eighteenth-century philosopher Anton Wilhelm Amo, the first African to receive a doctorate in Europe.
This article is called ‘British Empiricism’, but it could just as accurately have been titled ‘British-Irish Philosophy from the seventeenth to the nineteenth century and the Lockean Axiom’. The article focuses on the commitment to the Lockean version of the Peripatetic axiom that is shared by many British and Irish thinkers in the seventeenth, eighteenth, and nineteenth centuries. Following John Locke (1632–1704), virtually all the empiricist thinkers considered in this article accept that “nothing is in the intellect that was not first in the senses” (De veritate q. 2 a. 3 arg. 19), to use Thomas Aquinas’ (1225–1274) phrasing of what is known as the Peripatetic Axiom (see Cranefield 1970 for more on the origin of the phrase).
While the shared acceptance of this axiom is a unifying feature for the thinkers considered in this article, it is worth starting off with some problematization of the term ‘British Empiricism’. The term ‘British’ is used here in a sense common in the early modern period, covering both what in the early twenty-first century is the United Kingdom of Great Britain and Northern Ireland and the Republic of Ireland – and thus including thinkers such as the Ardagh-born John Toland (1670–1722) and the Kilkenny-born George Berkeley (1685–1753). The term ‘British’ here also excludes the many British colonies, meaning that the scope of this article is not global but Western European. The sense of ‘empiricism’ considered here is neither exhaustive nor confined to ‘Britain’. In other words, this article does not discuss all British thinkers who are committed to the Peripatetic axiom. Nor do we claim that such a commitment only exists among British thinkers (see also section 6.1). We further problematize the term by discussing its historiography (section 1.1). This helps to explain why we chose to keep (and use) the term, and how the issues and thinkers considered in this article were selected. After all, it is important to be transparent about the fact that an article like this, which focuses on a philosophical tradition, tells a particular story. This inevitably involves choices by the authors, shaped by factors like their own introduction to that tradition, concerning both the protagonists and the content considered; we outline both below.
Section 2 considers the history of the Peripatetic axiom and Locke’s interpretation of it, which here is called the Lockean Axiom.
Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience.
Subsequent sections consider how this axiom, accepted in some form by all the thinkers below, was applied to a variety of questions. Section 3 discusses its application to our knowledge of the external world, focusing on George Berkeley (1685–1753), David Hume (1711–1776), and Mary Shepherd (1777–1847). Section 4 considers how the axiom influenced moral philosophy in the period, focusing on Hume, Francis Hutcheson (1694–1746), and Susanna Newcome (1685–1763). Section 5 examines the application of the axiom to our knowledge of God, focusing on Berkeley, Toland, and Anthony Collins (1676–1729). The final section (section 6) focuses on the limitations of the narrative developed here by considering the case of Anton Wilhelm Amo (c. 1703–1759). Amo is committed to a version of the Lockean Axiom, and there is thus a strong reason to consider him within the narrative developed here. However, including Amo comes at the price of challenging the moniker ‘British’ and thus another feature that determined the selection.
In other words, the purpose of including Amo is twofold. First, it highlights the limits of our narrative. Second, it points to the arbitrary nature of any historical narrative concerning ‘British Empiricism.’ This results from the fact, which we highlight in the next section, that (‘British’) ‘Empiricism’ is an external ascription applied by scholars to certain philosophers – and not an expression of a common identity these philosophers took themselves to share. In other words, it is an analyst’s category and not an actors’ one. As such, any narrative using this category is always, more or less explicitly, guided by the assumptions, interests, values, and goals of the scholar employing it. In an attempt to be as transparent as possible about these assumptions, as well as to bolster our claim of arbitrariness, we consider the historiography of the term ‘empiricism’ in the next section. This will also serve to shed further light on the nature and scope of the narrative we develop here, and the ways in which it deviates from the standard narrative.
a. Historiography
A crucial thing to note about both the term ‘British Empiricism’ and what is traditionally thought of as its counterpart ‘Continental Rationalism’ is that they are both anachronisms in the previously introduced sense of being analysts’, and not actors’, categories. To put it differently, none of the thinkers considered in this article, nor thinkers like René Descartes (1596–1650), Baruch Spinoza (1632–1677), or Gottfried Wilhelm Leibniz (1646-1716), who are usually thought of as ‘rationalists,’ used these terms to describe themselves. These thinkers did not think of themselves as working in unified traditions that were opposed to each other. Take the case of Berkeley for instance: while Berkeley critically reacts to Descartes (for example, Letter 44), he is even more critical of Locke. As a case in point, consider his rejection of the existence of abstract ideas in the Introduction to A Treatise Concerning the Principles of Human Knowledge. In fact, we know of no place in Berkeley’s work where he would clearly suggest that he sees himself working in some sort of tandem with Locke, against the likes of Descartes or Leibniz. Leibniz even writes about Berkeley’s Principles that “[t]here is much here that is correct and close to my own view” (AG 307). At the same time Leibniz defends the notion of innate ideas against Locke (see New Essays, G VI), but he also has a critical attitude towards Cartesianism on a variety of issues (see, for example, Anfray 2019 for a concise overview). In summary, the interrelations between these various actors (Berkeley, Locke, Descartes, and Leibniz in this instance) are complex; and it would be a stretch to suggest they saw themselves in two opposing camps.
The fact that it is highly doubtful that ‘empiricists’ (and ‘rationalists’) perceived themselves as such is important. It raises the question of why it is still often taken to be the case that there were two antagonistic philosophical traditions in early modern Europe, epitomized by Descartes, Leibniz, and Spinoza on the one hand, and Berkeley, Hume, and Locke on the other. What is more, there is evidence that the contrast between these traditions, as we know it today, was invented in the 1850s by the German historian Kuno Fischer (1824–1907) (see Mercer 2020, 73; for more on the rise of these labels see also Loeb 2010, Norton 1981, Vanzo 2016).
However, despite its complicated history, and further potential challenges which we discuss towards the end of this section, we believe retaining the label ‘British Empiricism’ is fruitful as long as one is fully aware of the fact that it is an analyst’s category. Importantly, there needs to be transparency about the criteria that are used to group certain thinkers together. In our case, the thinkers considered here are all, with one exception, British or Irish in the previously outlined sense and share a commitment to the Lockean Axiom, that ‘there is no idea in the mind that cannot be traced back to some particular experience’. This axiom was developed in response to the notion that humans possess innate ideas or innate knowledge (whether of mathematical/geometrical truths, or of God), which had been endorsed by Plato and was defended in the seventeenth century by thinkers like Descartes, later Cartesians such as Nicolas Malebranche (1638–1715), and Leibniz (for Locke’s metaphysics and epistemology, see, for example, Ayers 1991, Bennett 1971, Chappell 1992, Jolley 1999, Mackie 1976, Yolton 1956, Wilson 1999).
Locke, and subsequent thinkers who would go on to be characterised as empiricists, rejected this innatist notion. Indeed, it is standard to view responses to this question, of whether there are innate ideas in the human mind, as a central dividing line between empiricists and rationalists more generally. Thus, in an attempt to bridge the gap between the old standard narrative and new ways of speaking about the history of early modern philosophy, we keep this starting point, yet use it to tell a different story in terms of actors and issues considered. This we deem to be important because of the exclusionary tendencies of the traditional early modern canon. By this we mean the fact that the voices of women and other marginalized groups were often systematically excluded when the early modern canon was formed (not to mention that many of the philosophers who became part of the canon held problematic views on issues pertaining to sex, gender, class, race, or species) (see for example, O’Neill 1998; Conley 2006; Shapiro 2016; Hutton 2021; Lapointe and Heck 2023). Thus, it is crucial that any new narrative about ‘British Empiricism’ considers non-canonical (that is, traditionally underrepresented) thinkers as well. With that in mind, our decision to focus on the Lockean Axiom is significant because it allows us to integrate non-canonical thinkers such as Collins, Toland, Shepherd, or Newcome alongside the traditional ‘big three’ of Locke, Berkeley, and Hume. Additionally, focusing on this axiom enables us to consider a larger variety of issues compared to the standard narrative, which focuses primarily on our knowledge of the external world (covered in section 3). For, as will become evident in the subsequent sections, the interests of even Berkeley, Locke, and Hume go well beyond this epistemological issue and encompass, for example, theological and moral questions.
Yet, even if our narrative is more inclusive than the standard story, it is nonetheless important to note its limitations. In closing this section, we illustrate this point with the case of comparatively well-known British women philosophers from the early modern period who do not neatly fall into the category of ‘empiricism’ – either in our use of the term or in its more traditional sense.
It might seem obvious that an article focusing on the Lockean Axiom, as we have called it, does not discuss Margaret Cavendish (1623–1673). After all, Cavendish died over a decade before the Essay was published. However, a comprehensive account of philosophy in early modern Britain cannot afford to neglect such a prolific writer. Over her lifetime, Cavendish wrote numerous philosophical treatises, plays, and poems, as well as novels (perhaps most famously The Blazing World in 1668). Yet Cavendish, perhaps at this stage the most ‘canonical’ woman in early modern philosophy, does not fit neatly into either the ‘empiricist’ or ‘rationalist’ camp. She is critical of Descartes on several issues, including his views on the transfer of motion (which she rejects in favor of an account of self-motion as ubiquitous throughout nature) and his dualism (see her Observations upon Experimental Philosophy and Grounds of Natural Philosophy, both published in 1668; for discussion of Cavendish’s system of nature see Boyle 2017, Lascano 2023, Detlefsen 2006, Cunning 2016). But she is also committed to some (possibly weak) form of ‘innatism’ (discussed in section 2.2), whereby all parts of nature, including humans, have an innate knowledge of God’s existence. Note that (as discussed in section 2.1) there is a version of the story of ‘empiricism’ that can be told which brings Thomas Hobbes into the fold. Although Cavendish was contemporaneous with Hobbes, her metaphysical and epistemological commitments make it difficult to do the same with her. Thus, by framing the story of early modern British philosophy as one concerned with ‘empiricism’, there is a danger of excluding Cavendish. As recent scholars like Marcy Lascano (2023) have argued, this motivates developing alternative stories – ones that might focus on ‘vitalism’, for instance – alongside more traditional narratives, which feature Cavendish and other women as protagonists.
Another case in point is Mary Astell (1666-1731). One way of telling the story of ‘empiricism’ is as a tradition that formed in opposition to Cartesianism. But if an opposition to Cartesianism is over emphasized, then a thinker like Astell is likely to fall through the cracks. For even though Astell was writing during Locke’s lifetime and critically engages with him when developing her views on education, love, and theology (see for example, Proposal to the Ladies, Parts I and II. Wherein a Method is offer’d for the Improvement of their Minds from 1694 and 1697 or The Christian Religion, As Profess’d by a Daughter Of the Church of England from 1705), she is quite explicitly committed to a form of substance dualism that shares many features in common with that of Descartes (see Atherton 1993 and Broad 2015).
While it may be hard, as we have suggested, to incorporate Cavendish or Astell into a traditional ‘empiricist’ narrative, there are several thinkers who might more easily fit under that label. Take the case of Anne Conway (1631–1679), who is as critical of ‘rationalists’ like Descartes and Spinoza (along with other figures like Hobbes) in her Principles of the Most Ancient and Modern Philosophy (for example, chap. 7) as any of the ‘usual suspects’, such as Berkeley or Locke (for more on Conway’s philosophical system, see Hutton 2004; Thomas 2017; Lascano 2023). But since Conway is not focused on the Peripatetic axiom, and instead wants to offer a philosophical system that can explain the nature of mind and matter as well as how God and the creation are related, it is hard to place her in the narrative developed in this article. This also holds for someone like Damaris Masham (1658–1708) who – despite knowing Locke and corresponding with Leibniz and Astell – is not overly concerned with the Lockean Axiom. Rather, Masham focuses on moral issues as well as love and happiness (see for example, Discourse Concerning the Love of God in 1696 and her Occasional Thoughts in 1705), arguing for a notion of humans as social and rational beings (for more on Masham’s social philosophy, see Broad 2006 and 2019; Frankel 1989; Hutton 2014 and 2018; Myers 2013). Finally, our focus on the Lockean Axiom means that even someone like Mary Wollstonecraft is hard to incorporate into the narrative. While Wollstonecraft is deeply influenced by Locke’s views on education and love, which play an important role in the background of her A Vindication of the Rights of Woman from 1792, her focus is on women’s rights.
There is no obvious sense in which she is an ‘empiricist’ – on either a traditional conception of that term or the way we have conceived it in this article (that is, as committed to the Lockean Axiom) (see Bahar 2002; Bergès 2013; Bergès and Coffee 2016; Falco 1996; Sapiro 1992).
Wollstonecraft’s case is of particular interest because it illustrates that one can even be a Lockean of sorts and still not fit the bill, as it were. In turn, this emphasizes that any narrative that scholars develop will have to make tough choices about who to include, which is why it is so important to be transparent about the reasoning behind these choices. We strongly believe that this must be kept in mind when reading this article and engaging in both teaching and scholarship in the history of philosophy more generally.
In sum, we have strived to present here a narrative that does justice to the existing tradition while correcting some of its main flaws (in particular, its exclusionary tendencies) in terms of issues and thinkers considered. Nonetheless, it is important to be mindful of the fact that this narrative is just one of many stories that could be told about British philosophy from the seventeenth to the nineteenth century. After all, each narrative – no matter its vices and virtues – will have to deal with the fact that it is arbitrary in the sense of being the product of a particular analyst’s choices. It might well be the case that other scholars deem it better to forgo these labels altogether in research and teaching (see, for example, Gordon-Roth and Kendrick 2015).
2. The Origins of Empiricism
a. Precursors to Locke
As noted in the previous section, this article on ‘British Empiricism’ will focus on a particular narrative that takes Locke’s Essay Concerning Human Understanding as a starting point for the ‘British empiricist’ tradition. Inevitably, there is a degree of arbitrariness in this decision – as we suggested in the previous section, such is the case with any historical narrative that chooses some thinkers or ideas and not others. Nonetheless, we think that this particular narrative has the theoretical virtue of allowing us to expand the canon of ‘British empiricism’ and discuss a greater range of topics (covering moral philosophy and theology, for example, as well as epistemology and metaphysics).
Even if ‘empiricism’ is tied to an acceptance of some version of the ‘Peripatetic Axiom’ (as it is in this article), it is important to note that ‘empiricism’ is neither uniquely British nor a uniquely early modern phenomenon, and Locke was not the first early modern thinker to draw heavily on the ‘Peripatetic Axiom’ in his approach to knowledge. In this section, we briefly chart the history of the ‘Peripatetic Axiom’ prior to Locke before introducing Locke’s usage of it as espoused in the Essay. In a very general form, the axiom is as follows:
Peripatetic Axiom: there is nothing in the intellect not first in the senses.
The name comes from the axiom’s association with Aristotle (see Gasser-Wingate 2021), the ‘Peripatetic’ philosopher; so-called because he liked to philosophise while walking. We will argue that, in the hands of Locke, the Peripatetic Axiom, which has a long history, was turned into the Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience (which we discuss in greater detail in section 2.2).
Prior to Locke, the axiom can be found in the writings of medieval Aristotelian writers including Thomas Aquinas (1225–1274) and Roger Bacon, other early modern writers like Thomas Hobbes (1588–1679), and perhaps even in the work of Ancient Greek thinkers like Aristotle (ca. 384-322 BCE) and Heraclitus (ca. 500 BCE). Our contention is that, in Locke’s Essay, the Peripatetic Axiom took on a particular shape that would go on to be hugely influential in seventeenth- and eighteenth-century philosophy, especially in Britain. One reason for this is that Locke’s Essay was extremely widely read in Britain; for example, it was a standard set text for philosophy in British universities.
For the purposes of the discussion in this article, we take empiricists to be those thinkers who are committed, in some form or another, to the view that all knowledge (everything that is ‘in the mind’) can be traced back to some kind of experience. Often, ‘experience’ is construed in terms of sense-perception, although, as we will find, in Locke’s Essay, ‘experience’ covers both outward sense experience and inward, introspective experience of the operations and contents of one’s own mind – what Locke calls ‘reflection’ (Essay 2.1.2). Thus, Locke can be thought of as having expanded the scope of what can be ‘experienced’, compared to many of his early modern, medieval, and ancient predecessors.
There is some evidence of something close to a commitment to ‘empiricism’ – perhaps a kind of ‘proto-empiricism’ – in Pre-Socratic writers such as Heraclitus, Empedocles (ca. 495–435 BCE), or Xenophanes (ca. 570–475 BCE), although their writings make it hard to determine whether they are committed to a recognisable form of the Peripatetic Axiom or are simply resistant to thinkers like Parmenides (ca. 515–445 BCE), who argued that the senses are unreliable and that a priori reasoning is the only appropriate way to grasp the nature of reality. Aristotle, for his part, rejects his teacher Plato’s (427–347 BCE) account of knowledge as recollection and the theory of innate ideas that follows from it. Plato had argued that our knowledge of, for example, mathematical principles is in fact knowledge of the Forms (Republic 510c1–511b2). The Forms – perfect, idealised, abstract entities which inhabit a ‘Realm of Forms’ distinct from our own world of sense experience – can be accessed, according to Plato, by recollection or intuition. Aristotle rejects this account of knowledge as recollection (for example, APo. 100a) – a move that would later be repeated by Locke in his own discussion of innate ideas in Book I of the Essay. Instead, Aristotle claims that “to gain light on things imperceptible we must use the evidence of perceptible things” (EN 1104a13–14). Similarly, Aristotle rejects the idea, found in thinkers like Parmenides and Plato, that reality can be understood through a priori reasoning, claiming instead that “we should accept what is evident to the senses rather than reasoning” (GA 760b29–33). Like later thinkers who accept the Peripatetic axiom, such as Locke and Hume, Aristotle argues that – since inquiry is limited by what we are able to experience – when it comes to certain observable phenomena we may, at best, be able to arrive at possible causes (Meteor 344a5–7).
In medieval thought, we begin to find explicit formulations of the Peripatetic Axiom. Note that, despite being called ‘Peripatetic’, the axiom is more explicitly articulated by later followers of Aristotle. Perhaps the most famous follower of Aristotle in Western philosophy, Thomas Aquinas, claims that “without sense perception no one can either learn anything new, nor understand matters already learned” (In DA 3.13 [para. 791]). In other words, according to Aquinas, we only learn new things via sense-perception. Clearly, this implies that there is nothing (new) in the mind that is not first in the senses. Similarly, another medieval thinker who pre-empts some of the ideas that would go on to be central to Locke’s view, Roger Bacon (1215–1292), writes that “without experience nothing can be sufficiently known” (OM 6.1). This is not quite the same as the claim that there is no knowledge (at all) without experience, but is still an endorsement of the crucial, necessary role that experience plays in knowledge acquisition that is central to the empiricist tradition.
Perhaps the most significant immediate precursor to Locke – in the context of the history of the Peripatetic Axiom – is Thomas Hobbes. Hobbes commits himself to the Peripatetic Axiom when he writes, in Leviathan (1651), that “there is no conception in a man’s mind, which hath not at first, totally, or by parts, been begotten upon the organs of Sense” (Leviathan, 1.1). Indeed, arguably one could tell a somewhat different story of early modern (or even ‘British’) ‘empiricism’ that takes Hobbes as its starting point. As Peter Nidditch explains, Hobbes (along with the French philosopher Pierre Gassendi (1592–1655)) “first produced in the modern era, especially in his Leviathan and De Corpore, a philosophy of mind and cognition that built on empiricist principles” (Nidditch 1975, viii). Nidditch goes on to suggest, speculatively, that it is most likely Hobbes’ reputation – as a highly unorthodox thinker, at best, and a secret atheist, at worst – that prevented him, retrospectively, from being seen as the ‘father of empiricism’ in the standard narrative. Whatever the explanation, it is Locke rather than Hobbes who would go on to be widely read and highly influential in Britain, and elsewhere, in the seventeenth and eighteenth centuries. As Nidditch puts it: “The Essay gained for itself a unique standing as the most thorough and plausible formulation of empiricism – a viewpoint that it caused to become an enduring powerful force” (Nidditch 1975, vii). Due to the Essay’s widespread influence, we focus on the role that Locke, rather than Hobbes, played in the development of British thought during these centuries; a role which would go on to be seen as so important that it even becomes possible, in hindsight, to speak of a more or less unified group and label it ‘British empiricism’. As we have suggested, there is a story to be told about Hobbes and empiricism, but it is one that, for the most part, we do not tell here (see section 1).
b. Locke
As was noted in the introduction, the question of whether there are innate ideas in the human mind is often seen as a central dividing line between empiricism and rationalism as they are standardly construed. While we pointed out the various issues with this standard narrative, our narrative also makes use of the issue of innatism. Crucially, though, our focus is less on finding a dividing line and more on finding a common denominator in the views of mainly ‘British’ and ‘Irish’ philosophers (for more on issues concerning the ‘British’ moniker, see section 6). With that in mind, let us turn to the issue of innatism and the way Locke deals with it.
Locke characterises his innatist opponents’ position like so: “It is an established Opinion amongst some Men, That there are in the Understanding certain innate Principles; some primary Notions…as it were stamped upon the Mind of Man, which the Soul receives in its very first Being; and brings into the world with it,” (Essay, 1.2.5).
Whether or not this is a fair characterisation of his opponents’ views, as Locke sees it, the term ‘innate’ suggests that, on the innatist account, human beings are quite literally born with some in-built knowledge – some principles or propositions that the mind need not acquire but already possesses. In short, on this view, prior to any experience – that is, at the very first instant of its having come into existence – the human mind knows something. Locke develops two lines of argument against the innatist position, which will be referred to in what follows as (1) the Argument from Superfluousness and (2) the Argument from Universal Assent.
The Argument from Superfluousness proceeds as follows:
It would be sufficient to convince unprejudiced Readers of the falseness of this Supposition, if I should only shew (as I hope I shall in the following Parts of this Discourse) how Men, barely [that is, only] by the Use of their natural Faculties, may attain to all the Knowledge they have, without the help of any innate Impressions. (Essay, 1.2.1)
Locke’s point here is that all it takes to convince an ‘unprejudiced reader’ (that is, one who is willing to be swayed by reasonable argument) of the falseness of innatism is evidence that all knowledge can be traced back to instances in which our human “natural Faculties” – that is, our faculties of sense-perception and reflection – were in use. This argument thus depends upon the plausibility of Locke’s claim that all knowledge can be traced back to some kind of experience. We leave aside the Argument from Superfluousness for the moment since we discuss this claim in greater detail below.
In contrast, the Argument from Universal Assent is a standalone argument that does not depend upon any additional claims about the sources of human knowledge. Locke claims that if the human mind possessed certain principles innately then there would surely have to be certain spoken or written propositions that all human beings would assent to. In other words, if there were an innate principle X such that all human beings, regardless of their lives and experiences, knew X, then when confronted with a written or verbal statement of X (“X”), all human beings would agree that “X” is true. For example, let us assume for the moment that murder is wrong is a principle that is innately known to the human mind. Locke’s point is that, if presented with a written or verbal statement of “murder is wrong”, surely all human beings would assent to it.
And yet, Locke argues, this does not seem to be true of this or any other principle (evidenced, for example, by the fact that people do, in fact, commit murder). He writes: “[this] seems to me a Demonstration that there are none such [innate principles of knowledge]: Because there are none to which all Mankind gives an Universal assent” (Essay, 1.2.4). If by ‘Demonstration’, here, Locke means that it logically follows that, since there are no universally assented-to propositions, there must not be any innately known principles, he is not quite right. For there might be other reasons why certain propositions are not universally assented to – perhaps not everyone understands the statements they are being presented with, or perhaps they are lying (perhaps murderers know murder is wrong, but commit it nonetheless). At best, the Argument from Universal Assent provides a probable case against innatism, or places the burden of proof on the innatist to explain why there are no universally assented-to propositions, or else neutralises the converse view (which Locke thinks his opponents subscribe to; see Essay, 1.2.4) that the existence of innate principles can be proven by appealing to the existence of universally assented-to propositions. And, of course, Locke’s reasoning also depends upon the truth of the claim that there are, in fact, no universally assented-to propositions (perhaps people have just not had the chance to assent to them yet, because they have not yet been articulated).
Given all these mitigating factors, it seems most charitable to suggest that Locke is simply hoping to point out the implausibility, or even absurdity, of the innatist position – especially given an increasing awareness, in the seventeenth century, of cultural relativity across societies and religions outside Europe (Essay, 1.4.8), not to mention the fact that neither Plato nor Aristotle, nor any other pre-Christian thinkers, would have assented to propositions like ‘God exists’ or ‘God is to be worshipped’ which, Locke claims, are paradigm cases of so-called ‘innate principles’ (Essay, 1.4.8).
Having, to his own satisfaction at least, provided one argument against the innatist position, Locke develops an account of the sources of human knowledge that supports the Argument from Superfluousness – by showing how all human knowledge can be traced back to some kind of experience. In contrast to innatists, Locke maintains that at birth the human mind is a blank slate or ‘tabula rasa’. If we picture the mind as a “white Paper, void of all characters”, Locke asks, “How comes it to be furnished?” (Essay, 2.1.2). His response is: “I answer, in one word, From Experience: In that, all our Knowledge is founded; and from that ultimately derives itself” (Essay, 2.1.2).
Locke then divides experience into two subcategories with respective mental faculties: ‘sensation’ and ‘reflection’ (Essay, 2.1.2). Concerning sensation, he writes:
Our Senses, conversant about particular sensible Objects, do convey into the Mind, several distinct Perceptions of things, according to those various ways, wherein those Objects do affect them: And thus we come by those Ideas, we have of Yellow, White, Heat, Cold, Soft, Hard, Bitter, Sweet, and all those which we call sensible qualities. (Essay, 2.1.3)
Our ideas of sensation, Locke explains, are those which pertain to the qualities of things we perceive via the (five) external senses: the objects of vision, touch, smell, hearing, and taste. But of course, this does not exhaust the objects of the mind – we can also have ideas of things that are not perceived by the ‘outward’ senses. As Locke writes:
The other Fountain, from which experience furnisheth the Understanding with Ideas, is the Perception of the Operations of our own Minds within us, as it is employ’d about the Ideas it has got; which Operations, when the Soul comes to reflect on, and consider, do furnish the Understanding with another set of Ideas, which could not be had from the things without: and such are, Perception, Thinking, Doubting, Believing, Reasoning, Knowing, Willing, and all the different actings of our own Mind. (Essay 2.1.4)
In a sense, then, Locke’s point is this: While we standardly talk as though we ‘experience’ only those things that can be perceived by the senses, in actual fact we also experience the operations of our own mind as well as things external to it. We can, that is, observe ourselves thinking, doubting, believing, reasoning, and so on – and we can observe ourselves perceiving, too (this claim is contentious: Do we really observe ourselves perceiving, or are we simply aware of ourselves perceiving?).
Locke’s aim is to establish that no object of knowledge, no ‘idea’ (Essay, 1.1.8), can fail to be traced back to one of these two ‘fountains’ of knowledge. In doing so, Locke thus commits himself to a particular formulation of the ‘Peripatetic Axiom’ (discussed in section 2.1). While the ‘Peripatetic Axiom’ – found in medieval Aristotelians and in Hobbes – states that ‘there is nothing in the intellect not first in the senses,’ Locke’s claim, which is central to the way ‘empiricism’ is construed in this article, is:
Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience.
The Lockean Axiom would go on to be very influential in seventeenth- and eighteenth-century thought, especially in Britain.
3. Our Knowledge of the External World and Causation
This section focuses on the application of the Lockean Axiom (there is no idea in the mind that cannot be traced back to some particular experience) to our knowledge of the external world. In doing so it most closely resembles the standard narrative of ‘British empiricism’ because the focus rests on Berkeley’s rejection of materialism and Hume’s denial of necessary connection. However, in contrast to the standard narrative, we close this section by emphasising how Mary Shepherd, who is said to have read Locke’s Essay when she was eight years old (Jeckyl 1894, 217), rejects both positions. As will become evident, however, in doing so she draws not from the Lockean Axiom but from two causal principles.
a. Berkeley on the Nature of the External World
In A Treatise Concerning the Principles of Human Knowledge (1710/34) and Three Dialogues between Hylas and Philonous (1713/34), Berkeley defends the doctrine he is most famous for: Immaterialism. In a nutshell, Berkeley holds that everything that exists is either an immaterial mind or idea (for example, PHK §§ 25–27). Thus, his commitment to the notorious dictum esse est percipi aut percipere (“To be is to be perceived or to perceive”) (compare NB 429, 429a; PHK § 3).
Two key features of his argument for immaterialism are Berkeley’s claims that the “existence of an idea consists in being perceived” (PHK § 3) and that “an idea can be like nothing but an idea” (PHK § 8). Since Berkeley is convinced that sense perception works via resemblance (for example, Works II, 129; TVV § 39) (see Fasko and West 2020; Atherton 1990; West 2021) and because we know that (most) objects of human knowledge are ideas – either “imprinted on the senses” or “formed by help of memory and imagination” (PHK § 1) – he argues that we can infer that the objects in the external world also must be ideas or collections of ideas (PHK §§ 1–8). After all, according to Berkeley, when we say something like the table exists, we mean that it can be perceived. And what is perceived is, after all, an idea (PHK § 3) (Daniel 2021; Fields 2011; Jones 2021; Rickless 2013; Saporiti 2006).
It is important to note that, in developing this argument, Berkeley, implicitly, draws on the Lockean Axiom that there is no idea that cannot be traced back to some particular experience. For Berkeley’s point is that our experience of the external world and its objects clearly suggests that they only exist when they are perceived. That is, when we trace back our ideas of things in the external world to the experiences we have of them, we come to understand that these ‘things’ are also ideas.
Berkeley fortifies his case for immaterialism by rejecting what is, to his mind, the only viable alternative: Materialism. More specifically, Berkeley argues against the existence of a (Lockean) material substance. In doing so, he, again, draws from the Lockean Axiom – and, in that sense, uses Locke’s own claim against him – by raising the question of whether we even have an idea of material substance in the first place. Berkeley then claims that even materialists, like Locke on his reading, must accept that we do not; for, as they themselves admit, there is nothing we can say about it (DHP 261). The reason we do not have an idea of material substance, Berkeley contends, is that there is no such thing in the first place and, thus, no experience of such a thing (and where there is no experience, there can be no idea). In fact, Berkeley believes that the very notion of such a thing would be “repugnant” (DHP 232; PHK § 17). As he puts it:
I have no reason for believing the existence of matter. I have no immediate intuition thereof: neither can I mediately from my sensations, ideas, notions, actions or passions, infer an unthinking, unperceiving, inactive substance, either by probable deduction, or necessary consequence. (DHP 233)
Even worse, assuming the existence of a material substance leads to skepticism concerning the existence of the external world and ultimately also God’s existence (that is, it leads to atheism; compare also PHK § 92), because one thereby becomes “ignorant of the true nature of every thing” and “know[s] not whether any thing really exists, or whether there are any true natures at all” (DHP 229). When challenged by his imagined opponent with the argument that we also have no idea of God or other minds (see also section 4.3) – and thus no reason to assume they exist – Berkeley appeals to the (first personal) experience we can have of these entities (DHP 233). This is consistent with the Lockean Axiom which, while it does entail that every idea can be traced back to an experience, does not entail that every experience must lead to an idea.
In sum, in arguing for his immaterialism Berkeley makes implicit use of the Lockean Axiom inasmuch as he draws from it to establish that the external world and its objects must consist of ideas, because our experience of the external world and its objects is an experience of perceivable things. The Lockean Axiom also plays a role in Berkeley’s argument against the existence of material substance, in that the lack of experience of matter is taken to explain the lack of a corresponding idea – and an analysis of the notion shows its repugnancy.
b. Hume on the Nature of Causation
At least in the context of contemporary Western thought, Hume’s account of causation is perhaps one of the best known and most discussed theories to have come out of the early modern period (see, for example, Garrett 2015; Bell 2008; Beauchamp and Rosenberg 1981). In An Enquiry Concerning Human Understanding (1748), Hume sets out to demonstrate that causal relations – or what he calls ‘necessary connections’ – are not something that we experience in the world around us (see Noxon 1973 or Traiger 2006 for a discussion of the development of Hume’s thought and the relation between the Treatise and the EHU). Rather, Hume claims, we form the idea or concept of causation in our mind as a result of repeated experiences of ‘causes’ preceding ‘effects’, and the ‘sentiment’ that such repeated experiences generate in us (EHU 7). In other words, on Hume’s view, we feel as though certain events or objects (like smoke and fire) are necessarily connected, by a causal relation, because we see them occur in conjunction with one another repeatedly. But, strictly speaking, Hume argues, we do not experience any such causal relations and thus cannot know with certainty that the two things are necessarily connected – at best, we can have probable knowledge. What is important, for the concerns of this article, is that Hume’s reasoning for this view is premised upon a version of the Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience. In other words, it is Hume’s ‘empiricism’ (in the sense that we have used the term in this article) that leads him to arrive at his skeptical account of causation. For an ‘empiricist’, knowledge is dependent upon experience – and Hume’s point in the EHU is that we cannot experience causation. We run through Hume’s argument in more detail below.
Hume begins section 2 of the EHU (where his discussion of the origin of ideas takes place) by establishing what has come to be known as ‘the Copy Principle’ (for further discussion, see Coventry and Seppalainen 2012; Landy 2006 and 2012). The Copy Principle concerns the relation between what Hume calls ‘impressions’ and ‘ideas.’ The crucial thing for our purposes is that, for Hume, ‘impression’ refers (amongst other things) to any direct experience or sense-perception we have of an external object. When I look outside my window and see the sun, for instance, I am receiving an ‘impression’ of the sun. That is, the sun is ‘impressing’ itself upon my sense organs, similarly to a stamp that impresses an insignia upon wax. ‘Ideas,’ on the other hand, are what are left behind, in the mind, by such impressions; Hume’s use of the term ‘idea’ is thus slightly different to that of Locke or Berkeley, who both use ‘idea’ in a way that also encompasses Humean impressions. When I remember the sun, as I lie in bed at night, I am having an ‘idea’ of the sun. And, similarly, if I lie in bed and imagine tomorrow’s sun, I am also forming an ‘idea’ of it. In terms of our experiences of them, impressions and ideas are differentiated by their degrees of vividness and strength: my impression of the sun, for instance, will be stronger and more vivid (perhaps brighter) than my idea of the sun. As Hume puts it:
These faculties [of memory and imagination] may mimic or copy the perceptions of the senses; but they never can entirely reach the force and vivacity of the original sentiment. The utmost we say of them, even when they operate with greatest vigour, is, that they represent their object in so lively a manner, that we could almost say we feel or see it: But, except the mind be disordered by disease or madness, they never can arrive at such a pitch of vivacity, as to render these perceptions altogether undistinguishable. (EHU 2.1, 17)
An idea might somewhat resemble the strength or vividness of an impression but, Hume claims, an idea of the sun and an impression of the sun (unless one’s mind is ‘disordered’) will never be entirely indistinguishable.
The Copy Principle entails that every (simple) idea is a copy of an impression. Hume writes:
It seems a proposition, which will not admit of much dispute, that all our ideas are nothing but copies of our impressions, or, in other words that it is impossible for us to think of anything which we have not antecedently felt, either by our external or internal senses. (EHU 7.1.4, 62)
This principle is strongly empiricist in character and closely related to both the Lockean Axiom and the Peripatetic Axiom, which states that there is nothing in the intellect not first in the senses. Like the Lockean Axiom, the Copy Principle (as articulated in this passage) tells us that if I have an idea of X, then I must previously have had an experience, or ‘impression’, of X.
For Hume, all of this makes the issue of where we get our idea of causation extremely pressing. Hume denies that we do in fact have any impressions of causation or ‘necessary connections’ between things:
When we look about us to external objects…we are never able in a single instance, to discover any power of necessary connexion; any quality which binds the effect to the cause, and renders one an infallible consequence of the other. We only find that one does actually, in fact, follow the other. (EHU 7.1.6, 63)
Consider the case of a white billiard ball rolling along a table and knocking a red ball. Hume asks: can you in fact experience or perceive the ‘necessary connection’ (or causal relation) that makes it the case that when the white ball knocks the red ball the red ball moves away? His answer is no: what you experience, strictly speaking, is a white ball moving and then a red ball moving. But if we do not have an impression of causation, in such instances, why do we have an idea of causation?
Hume concludes that while we do not have an outward impression of causation, because we repeatedly experience uniform instances of, for example, smoke following fire, or red balls moving away from white balls, we come to feel a new impression which Hume calls a ‘sentiment’. That is, we feel as though we are experiencing causation – even though, in strict truth, we are not. This new feeling or sentiment is “a customary connexion in the thought or imagination between one object and its usual attendant; and this sentiment is the original of that idea which we seek for” (EHU 7.2.30, 78). In other words, while our idea of causation or necessary connection cannot be traced back to a specific impression, it can nonetheless be traced back to experience more generally. Repeated uniform experience, Hume claims, induces us to generate the idea of causation – and is the foundation of our ‘knowledge’ of cause-and-effect relations in the world around us. In line with the Lockean Axiom, then, Hume’s view is that we would have no idea of causation, were it not for our experience of certain events or objects (‘causes’) regularly preceding others (‘effects’).
c. Shepherd on Berkeley and Hume
The previous subsections have established that Berkeley and Hume both draw, in important ways, on the Lockean Axiom that there is no idea that cannot be traced back to some particular experience. Both thinkers draw on this principle inasmuch as they take the absence of particular experiences (of the external world or of causation) not only to entail that there is no corresponding idea but also that the things in question (material substance or necessary connections) do not exist. In this section we consider how Mary Shepherd rejects both Berkeley’s immaterialism and Hume’s skeptical account of causation. As will become evident, however, Shepherd does so not by drawing on the Lockean Axiom – which does not play any role in her account of the mind – but by using two causal principles that she introduces in her works. Shepherd is thus an example of the limits of the narrative developed here. For even though she conceives of Locke as her closest ‘philosophical ally’ (LoLordo 2020, 9), Shepherd concludes that, in order to refute Berkeley and Hume, one needs to consider the issue of causation first – and not issues concerning (mental) representation. For Shepherd believes that even (mental) representation and the mental content it allows for ought ultimately to be understood in causal terms.
Shepherd’s first causal principle, the Causal Principle (CP), holds that “nothing can ‘begin its own existence’” (for example, ERCE 94). Second, the Causal Likeness Principle (CLP) states that “like causes, must generate like Effects” (for example, ERCE 194). It is important to note that the CLP is a biconditional, as Shepherd claims in her second book Essays on the Perception of an External Universe (1827) that “like effects must have like causes” (EPEU 99).
Shepherd defends both principles in her first book, Essay on the Relation of Cause and Effect (1824). The main aim of this work is to refute a Humean account of causation as constant conjunction. In particular, Shepherd wants to establish, against Hume, that causes and effects are necessarily connected (ERCE 10). While the details of Shepherd’s argument can be put aside for now, the crucial thing to note is that she does not draw from the Peripatetic Axiom or the Lockean Axiom. Instead, Shepherd focuses on rejecting Hume’s theory of mental representation and his claim that the possibility of separating cause and effect in thought tells us something about their actual relation (Bolton 2010 and 2019; Landy 2020a and 2020b). Crucially, this rejection of Hume, in turn, fortifies her case for her two causal principles – both of which play a central role in arguing against Berkeley.
Meanwhile, in rejecting Berkeley’s version of immaterialism, Shepherd contends that we have sensations of solidity and extension (EPEU 218), and, drawing from the CP, we know that these must have a cause. While we know the mind to be one cause of sensations (for example, EPEU 14–15), there must also be another cause for these particular sensations. Thus, we can come to know that matter (which she also calls ‘body’) is the “continually exciting cause, for exhibition of the perception of extension and solidity on the mind in particular” (EPEU 155) and that matter is “unperceived extended impenetrability” (LMSM 697). In other words, the causal connection between our mental content and the external world allows Shepherd to draw inferences about its objects, which show them not to be ideational, that is, not to merely consist of ideas as Berkeley, for instance, would have it (while Shepherd thus clearly rejects a Berkeleyan brand of immaterialism (see Atherton 1996; Rickless 2018), it is not clear whether she is opposed to all kinds of immaterialism whatsoever; as Boyle (2020, 101) points out, ‘(im-)material’ seems to be a “label” for capacities and it is unclear whether more than capacities exist in Shepherd’s metaphysics).
In sum, Shepherd is a fitting end point for this part of the narrative not only because she closely engages with Berkeley and Hume (and their applications of the Lockean Axiom) but also because Locke is such a close philosophical ally for her – although scholars have noted that Shepherd sometimes advances an idiosyncratic reading of Locke (Boyle 2023; LoLordo 2022). Even more to the point, Shepherd suggests that her theory is a ‘modified Berkeleian theory’ (LMSM 698) and thus aligns herself explicitly with a key figure of the ‘standard’ narrative of British empiricism.
Thus, even though the Lockean Axiom plays no role in Shepherd’s argumentation – and it is, in fact, unclear what she thinks of it – there are good reasons to consider her within this narrative. For Shepherd’s philosophy focuses on key figures within this narrative, to the point where she aligns herself both implicitly and explicitly with at least two of them.
4. Morality
One of the most interesting upshots of the widespread acceptance of the Lockean Axiom, or what we might call Locke’s ‘empiricist’ philosophy, in Britain and Ireland during the eighteenth century is the effect it had on theorising about morality; specifically concerning the question of where we get our moral ideas (like good, bad, right, wrong, virtuous, and vicious) from. The Lockean Axiom dictates that there is no idea that cannot be traced back to some particular experience. While that might fit nicely with how we get our ideas of concepts like colour, sound, or touch (and any other ideas that can be traced to sense perception), ideas like justice/injustice, good/bad, or right/wrong, do not seem to be easily traceable to some particular experience. It does not seem controversial to suggest that ‘redness’ or ‘loudness’ are qualities we can experience in the world around us, but it is much less obvious that we experience qualities such as ‘goodness’, ‘badness’, ‘rightness’, or ‘wrongness’. For a start, while – barring cases of, for example, blindness, deafness, or any other sensory deficiency – there is likely to be agreement about an object’s colour or the volume of a sound, there is, generally speaking, considerable disagreement when it comes to the goodness/badness or rightness/wrongness of an action. The same applies in the case of beauty and other aesthetic qualities, and there is a great deal that could be said about ‘empiricist’ approaches to aesthetics (we do not discuss these issues here but for discussion of Hume’s aesthetics see, for example, Costello 2007, Gracyk 1994, Townsend 2001, and for discussion of Hutcheson’s aesthetics, see, for example, Shelley 2013, Michael 1984, Kivy 2003).
This section looks at three thinkers’ views on morality and examines the role that the Lockean Axiom played in their theorising. All three are important figures in the history of (Western) ethics. Francis Hutcheson was one of the first philosophers to apply the Lockean Axiom to questions of morality and, though he was Irish born, would go on to be known as a central figure in the so-called ‘Scottish Enlightenment’ (his parents were Scottish Presbyterians and he would spend most of his career in Scotland). David Hume pre-empts discussions of utility in ethical theorising that would come to the fore in the work of Mill and Bentham and develops the idea of a sense of ‘taste’ which allows us to perceive the moral characteristics of persons and actions. Meanwhile, Susanna Newcome (1685–1763) has recently been identified (Connolly 2021) as one of the earliest thinkers to defend what is recognisably a form of utilitarianism.
a. Hutcheson and the Moral Sense
In An Inquiry into the Original of Our Ideas of Beauty and Virtue (1725), Francis Hutcheson explicitly acknowledges the indebtedness of his discussion of morality (as well as beauty) to Locke (for example, Inquiry, 1.VII). He begins the Inquiry by defining sensations as “[t]hose Ideas which are rais’d in the Mind upon the presence of external Objects, and their acting upon our Bodys” and adds that “We find that the Mind in such Cases is passive, and has not Power directly to prevent the Perception or Idea” (Inquiry, 1.I). A little later, Hutcheson explains that “no Definition can raise any simple Idea which has not been before perceived by the Senses” (Inquiry, 1.IV). In making these claims, Hutcheson is committing himself to a version of the Lockean Axiom, the claim that there is no idea in the mind that cannot be traced to some particular experience – strictly speaking, this should read ‘simple idea’, since Hutcheson’s view is that all simple ideas must be traced back to some experience, while compound ideas might be the product of reason.
Hutcheson’s commitment to the Lockean Axiom leads him to conclude that humans have a “Moral Sense” (see Frankena 1955; Harris 2017) as well as external senses of seeing, hearing, touching, tasting, and smelling. In fact, in his Essay on the Nature and Conduct of the Passions and Affections (1742), Hutcheson claims we have a range of ‘internal’ senses including a “Publick Sense”, concerned with the happiness of others, a “Sense of Honour”, and a sense of “decency and dignity” (Essay, 5–30). This is understandable given that, for Hutcheson, a sensation is ‘an idea raised in the mind upon the presence of external objects’ – and it is external objects, or more often external people (and their actions), that raise in us ideas of right, wrong, good, bad, justice, or injustice.
In the Essay, Hutcheson lays out a line of reasoning which justifies this view: “If we may call every Determination of our Minds to receive Ideas Independently on our Will, and to have Perceptions of Pleasure and Pain, A SENSE, we shall find many other Senses besides those commonly explained” (Essay, 5). His point is this: a sense is a ‘determination’ or faculty of the mind by means of which it receives (passively) certain kinds of ideas. Our sense of vision, for instance, is where we get our visual ideas, for example, ideas of colour or brightness/darkness. Our olfactory sense is where we get our ideas of smell such as sourness, putridness, and so on. However, if we can identify ideas that cannot be traced back to one of the five external senses – vision, hearing, taste, touch, smell – Hutcheson argues, then there must be another sense, an internal sense, by means of which the mind has received that idea. Such is the case with our ideas of good, bad, right, wrong, and so on. Since these ideas cannot be traced to any of the five external senses – because we do not literally see, hear, taste, touch, or smell good or bad, or right or wrong – we can infer that there must be a moral sense by which the mind has received them. Hutcheson describes this moral sense as that by which “we perceive Virtue, or Vice in our selves, or others” (Essay, 20). That is, through our naturally built-in moral sense, humans can detect virtue and vice. Note that this view implies that virtue and vice, and related notions like good, bad, right, wrong, justice, and injustice, are qualities out there to be sensed. But what is it exactly that we are perceiving with our moral sense? And how does the human mind perceive virtue and vice in ourselves and other people?
For Hutcheson, the answer is that our ideas of virtue, vice, and other moral concepts are grounded in perceptions of pleasure and pain. Indeed, as the quotation above suggests, for Hutcheson, all perceptions are accompanied by a feeling of pleasure or pain. Some objects excite pleasure or pain in us, Hutcheson explains, even when we cannot see any “Advantage or Detriment the Use of such Objects might tend: Nor would the most accurate Knowledge of these things vary either the Pleasure or Pain of the Perception” (Inquiry, 1.VI). That is, some objects are naturally pleasurable or painful to sense – and such objects, according to Hutcheson, are beautiful or ugly, respectively. Similarly, the actions of some people generate pleasure or pain in us, and this is what determines whether we characterise those people as virtuous or vicious. Hutcheson maintains that it is a moral sense that generates our ideas of virtue or vice (just as it is an aesthetic sense that generates ideas of beauty or ugliness), rather than, say, a judgement or act of reason, because those ideas do “not arise from any Knowledge of Principles, Proportions, Causes, or of the Usefulness of the Object” (Inquiry, 1.XII). Instead, just as we are ‘struck’ with the colour of an object or the pitch of a sound, we are ‘struck’ by the rightness or wrongness, or virtuousness or viciousness, of a person or action.
In short, Hutcheson’s view is that we feel a kind of pleasure or displeasure in response to certain character traits or actions which determines whether we characterise them as virtuous or vicious. For example, one might feel pleasure witnessing an act of charity, or displeasure witnessing an act of cruelty. In the former case, an idea of virtue (or goodness, or rightness) is raised in our minds, while in the latter it is an idea of vice (or badness, or wrongness). In so doing, Hutcheson provides an empiricist account of the origins of ideas concerning moral concepts, that is, one that draws on the Lockean Axiom.
b. Hume on Taste and the Moral Sense
Like Hutcheson, Hume is interested in identifying the source of our ideas of moral concepts like virtue, vice, justice, injustice, right, and wrong. And, like Hutcheson, Hume arrives at the view that such ideas are derived from some kind of moral sense, which he calls ‘taste’ (see, for example, T 3.3.6; Shelley 1998). (Another similarity with Hutcheson is that many of Hume’s claims about our sense of morality are paralleled in his discussion of beauty—including the claim that we have a sense of beauty.) In An Enquiry Concerning the Principles of Morals (1751), Hume’s account of moral sense, or taste, is part of a wider discussion of whether it is reason or sentiment, that is, feeling, that gives us our principles of morality. Hume lays out the debate like so:
There has been a controversy started of late…concerning the general foundation of MORALS; whether they can be derived from REASON, or from SENTIMENT; whether we attain the knowledge of them by a chain of argument and induction, or by an immediate feeling and finer internal sense; whether, like all sound judgment of truth and falsehood, they should be the same to every rational intelligent being; or whether, like the perception of beauty and deformity, they be founded entirely on the particular fabric and constitution of the human species. (EPM 1.3, 170)
In other words, the question is: do we reach conclusions about what is right or wrong in the same way we reach the conclusion of a mathematical formula, or do we reach such conclusions in the way we arrive at judgements about what counts as beautiful? The question is significant because, Hume claims, if our moral principles are more like judgements of beauty, then they might not, strictly speaking, be objective. They might instead be grounded in specific human values, concerns, desires, and judgements. Whereas if they are more like conclusions arrived at using reasoning, such as mathematical conclusions, Hume claims, then they can be more appropriately described as objective.
Hume opts for a decidedly ‘empiricist’ approach, that is, he draws from the Lockean Axiom, in answering this question, which ultimately leads him to reject the claim that moral principles are the product of reason. He explains that in the sciences, or ‘natural philosophy’, thinkers “will hearken to no arguments but those which are derived from experience” (EPM 1.10, 172). The same ought to be true, he claims, in ethics. In line with the Lockean Axiom, Hume then suggests that we ought to “reject every system of ethics, however subtile [that is, subtle] or ingenious, which is not founded on fact and observation” (ibid.)—that is, the previously mentioned experience which underlies the arguments must be tied to the world we can perceive by our senses. Thus, like the natural philosophers of the Royal Society in London (the natural scientists he is referring to here), who rejected armchair theorising about nature in favour of going out and making observations, Hume’s aim is to arrive at an account of the origin of our moral principles that is based on observations of which traits or actions people do, in practice, deem virtuous or vicious—and why they do so.
What Hume claims to find is that traits like benevolence, humanity, friendship, gratitude, and public spirit—in short, all those which “proceed from a tender sympathy with others, and a generous concern for our kind and species” (EPM 2.1.5, 175)—receive the greatest approbation, that is, approval or praise. What all these character traits, which Hume calls the ‘social virtues’, have in common is their utility (see also Galvagni 2022 for more on Hume’s notion of virtue). This is no coincidence, Hume argues, for “the UTILITY, resulting from the social virtues, forms, at least, a part of their merit, and is one source of that approbation, and regard so universally paid to them” (EPM 2.2.3, 176). This leads Hume to develop the following line of reasoning: There is a set of traits, or ‘social virtues’, that are deemed the most praiseworthy by society. People who exhibit these traits are characterised as ‘virtuous.’ What these virtuous traits have in common is that they promote the interests of—that is, are useful to—society at large. Thus, Hume concludes, utility is at the heart of morality. This conclusion would go on to influence later thinkers like Jeremy Bentham and John Stuart Mill and is central to the normative ethical theory utilitarianism—which we also discuss in section 4.3 in relation to Susanna Newcome.
What role does the moral sense, or taste, play in Hume’s account of the origins of morality? His answer is that taste serves to motivate us to action, based on the pleasure that comes with approbation or the displeasure that comes with condemnation. He writes: “The hypothesis which we embrace is plain. It maintains that morality is determined by sentiment. It defines virtue to be whatever mental action or quality gives to a spectator the pleasing sentiment of approbation; and vice the contrary” (EPM, Appendix 1, I, 289). In other words, Hume’s point is that we enjoy responding to a person or action with approval and do not enjoy, and may even take displeasure from, responding to persons or actions with blame or condemnation. Thus, like Hutcheson, Hume thinks that ideas we receive via our moral sense are accompanied by feelings of pleasure or pain. This is a claim about human psychology and, again, an idea that would go on to play an important role in the utilitarian ethics of Bentham and Mill—especially the idea, known as ‘psychological hedonism’, that humans are driven by a desire for pleasure and to avoid pain. Hume himself seems to endorse a kind of psychological hedonism when he claims that if you ask someone “why he hates pain, it is impossible he can ever give any [answer]” (EPM, Appendix 1, V, 293).
In line with the Lockean Axiom, Hume concludes that it cannot be reason alone that is the source of our moral principles. Again, like Hutcheson, Hume thinks that we sense—immediately, prior to any rational judgement—rightness or wrongness, or virtue or vice, in certain persons or actions. What is more, he argues, reason alone is “cool and disengaged” and is thus “no motive to action”, whereas taste, or moral sense, is a motive for action since it involves a feeling of pleasure or pain (EPM, Appendix 1, V, 294). For that reason, Hume concludes that, in morality, taste “is the first spring or impulse to desire or volition” (ibid.).
c. Newcome on Pain, Pleasure, and Morality
There is no explicit commitment to the Lockean Axiom in the writings of Susanna Newcome. However, what we do find in Newcome is a development of the idea, also found in Hutcheson and Hume, that moral theorising is rooted in experiences of pleasure and pain—ideas which, as we found in sections 4.1 and 4.2, are themselves premised upon an acceptance of the Lockean Axiom. Thus, in Newcome (as in Shepherd), we find a thinker deeply influenced by the work of others who did adhere to the Lockean Axiom. In a sense, then, Newcome’s work is indirectly influenced by that Axiom. What we also find in Newcome is a bridge between the ‘empiricism’ of Locke, and thinkers like Hutcheson and Hume, who accept the Lockean Axiom, and the later utilitarianism of Jeremy Bentham and John Stuart Mill. For these reasons, Newcome’s ethical thinking merits inclusion in this article on ‘empiricism’ and the story of the development of the Lockean Axiom we have chosen to tell (see section 1).
In An Enquiry into the Evidence of the Christian Religion (1728/1732), Newcome provides the basis for a normative ethical theory that looks strikingly similar to the utilitarianism later, and more famously, defended by Jeremy Bentham and John Stuart Mill. For this reason, Connolly (2021) argues that Newcome—whose work pre-dates that of both Bentham and Mill—could plausibly be identified as the first utilitarian. Newcome bases her claims about ethics on claims about our experiences of pleasure and pain. What is also interesting, for our present purposes, is that Newcome identifies ‘rationality’ with acting in a way that maximises pleasure or happiness. Consequently, on Newcome’s view, we can work out what actions are rational by paying attention to which actions lead to experiences of pleasure or happiness—and the same applies to irrational actions, which lead to experiences of pain or unhappiness. In the remainder of this section, we outline Newcome’s views on pleasure and pain, happiness and unhappiness, and rational and irrational action.
Newcome begins her discussion of ethics by claiming that pleasure and pain cannot be defined (Enquiry, II.I). She explains that happiness and misery are not the same as pleasure and pain. Rather, she claims, happiness is “collected Pleasure, or a Sum Total of Pleasure” while misery is “collected Pain, or a Sum Total of Pain” (Enquiry, II.II–III). In other words, happiness is made up of feelings of pleasure, and misery is made up of feelings of pain. One is in a state of happiness when one is experiencing pleasure, and one is in a state of misery when one is experiencing pain. Newcome then goes on to commit herself to what has come to be known as ‘psychological hedonism’ (see section 4.2): the view that humans are naturally driven to pursue pleasure and avoid pain. As she puts it, “To all sensible Beings Pleasure is preferable to Pain” (Enquiry, III.I) and “If to all sensible Beings Pleasure is preferable to Pain, then all such Beings must will and desire Pleasure, and will an Avoidance of Pain” (Enquiry, III.II). Newcome then moves from these claims about what humans naturally pursue or avoid to a claim about what is most ‘fit’ for us. Like later utilitarians such as Bentham and Mill, Newcome thus bases her normative ethical theory—that is, her account of how we ought to act—on psychological hedonism, an account of how we naturally tend to act. She writes: “What sensible Beings must always prefer, will, and desire, is most fit for them” (Enquiry, III.III) and “What sensible Beings must always will contrary to, shun and avoid, is most unfit for them” (Enquiry, III.IV). She concludes: “Happiness is then in its own Nature most fit for sensible Beings” and “Misery is in its own Nature most unfit for them” (Enquiry, III.V–VI).
As we noted at the beginning of this section, Newcome does not explicitly commit herself to the Lockean Axiom that there is no idea in the mind that cannot be traced back to some particular experience. Nonetheless, it is true to say that Newcome arrives at her conception of how humans ought to act on the basis of claims about experience. As we saw, Newcome’s view is that pleasure and pain cannot be defined. Her view seems to be that we all just know what it is to feel pleasure and experience pain, through experience. In much the same way that one could not convey an accurate notion of light or darkness to someone blind from birth or loud and quiet to someone deaf from birth, Newcome’s view seems to be that the only way to know what pleasure and pain are is to have pleasurable and painful experiences. And it is on the basis of such experiences that Newcome, in turn, arrives at her conception of happiness, misery, and ‘fit’ or ‘unfit’ actions—that is, the kinds of actions that are ‘right’ or ‘wrong’, respectively, for us to perform. As we suggested above, Newcome’s moral philosophy is also noteworthy in that she identifies rational actions with those which are conducive to pleasure. She explains: “As Reason is that Power of the Mind by which it finds Truth, and the Fitness and Unfitness of Things, it follows, that whatever is True or Fit, is also Rational, Reasonable, or according to Reason” (Enquiry, IV).
And she adds that “all those Actions of Beings which are Means to their Happiness, are rational” (Enquiry, IV.V). In Newcome, then, we find not only a normative ethical theory but also an account of rational action that is grounded, ultimately, in our experience of things. Rational action is action conducive to happiness, and happiness is the accumulation of pleasure. We work out what actions are rational or irrational, then, by appealing to our experience of pleasure or pain.
5. God and Free-Thinking
This section focuses on the application of the Lockean Axiom to questions concerning the existence of God and the divine attributes of wisdom, goodness, and power—a crucial issue for philosophers during the early modern period, when theological issues were seen as just as important as other philosophical or scientific issues.
If you accept the Lockean Axiom, this seems to pose a problem for talk of God and his attributes (although it is worth noting that Locke does not seem to see it that way; rather, he thinks the idea of God is on the same epistemic footing as our idea of other minds (Essay 2.23.33–35)). As the ‘free-thinking’ philosopher Anthony Collins (1676–1729) argues, if all the ideas in our minds can be traced back to some particular experience, and if we cannot experience God directly (as orthodox Christian teachings, particularly in the Anglican tradition, would have it), then it seems impossible that we could have an idea of God. But if we cannot have an idea of the deity, one might worry, how can we know or learn anything about God? And what does that mean for the Bible, which is supposed to help us do just that? Thus, thinkers like Collins would argue, while you can have faith in God’s existence, whether this is a reasonable or justified belief is an entirely different question.
A potential rebuttal to Collins’ way of arguing, however, is to point to divine revelations in the form of miracles or other Christian mysteries. Perhaps miracles do constitute instances in which those present can, or could, experience God, or divine actions. This kind of response is attacked by another free-thinker, John Toland (1670–1722), who argues that religious mysteries cannot even be objects of belief because they are inconceivable. For example, the idea that God is father, son, and holy spirit all at once is something that seems both inconceivable and contrary to reason. Against these lines of reasoning, more orthodox thinkers like George Berkeley, who, crucially, accepts the Lockean Axiom as well (see section 3.1), argue that even though we cannot have an idea of God, we can nonetheless experience the deity through our experience of the divine creation, nature. We outline Collins, Toland, and Berkeley’s views on God, and their relation to ‘empiricism’, in the subsections below.
a. Anthony Collins
Anthony Collins had a close friendship with Locke, but he adopted the Lockean Axiom to advance his free-thinking agenda. Like Toland (see 5.2), Collins is concerned with defending the right to make “use of the understanding, in endeavouring to find out the meaning of any proposition whatsoever, in considering the nature of evidence for or against it, and in judging of it according to the seeming force or weakness of the evidence” (Discourse, 3).
Crucially, this process ought not to be interfered with by authority figures, particularly religious ones. Rather, everyone needs to be able to judge the evidence freely on their own (Discourse 3–21).
When it comes to applying the Lockean Axiom to questions concerning God’s existence and the divine attributes, Collins takes a concession by an orthodox cleric, Archbishop William King (1650–1729), as his starting point. King writes that “it is in effect agreed on all hands, that the Nature of God, as it is in it self, is incomprehensible by human Understanding; and not only his Nature, but likewise his Powers and Faculties” (Sermon § 3). While experience is not explicitly mentioned here, the underlying thought is that God is mysterious because we cannot experience God himself or the divine attributes. In other words, we do not have an idea of God because we cannot experience God. Thus, Collins argues that the word ‘God’ is empty (that is, does not denote anything in the world) and that when we say something like ‘God is wise,’ this is basically meaningless (Vindication, 12–13). In particular, Collins emphasizes that because of this lack of experience and the subsequent emptiness of the term, it becomes impossible to prove the existence of God against atheists. For the term cannot refer to more than a “general cause or effect” (Vindication, 13)—something that, he thinks, even atheists agree exists (Vindication, 14). They would only deny that this cause is wise, or would refuse the notion that this cause is immaterial, equating it instead with the “Material Universe” (Vindication, 14). To put it differently, Collins comes close to using the Lockean Axiom to advance atheism. At the very least, he makes it evident that accepting this axiom undermines fundamental theological commitments, because God and the divine attributes are generally held to be beyond the realm of creaturely experience and thus whatever idea we have of God must be empty (for discussion of Collins’ philosophy and the question of whether he is an atheist see O’Higgins 1970, Taranto 2000, and Agnesina 2018).
As we discuss in the next subsection, a similar way of arguing can also be found in John Toland’s Christianity not Mysterious (1696), which might have been an influence on Collins’ thinking. In contrast to Collins, though, Toland puts more emphasis on the connection between the Lockean Axiom and language, something that he also adopts from Locke.
b. John Toland
John Toland was an Irish-born deist who was raised as a Roman Catholic but converted to Anglicanism (the predominant denomination in Britain at the time) in his twenties. Throughout his writing career, Toland challenged figures in positions of authority. In Christianity not Mysterious, Toland takes aim at the Anglican clergy; this, ultimately, led to a public burning of several copies of the book by a hangman and Toland fleeing Dublin.
As mentioned in the previous section, Toland argues in a way similar to Collins in Christianity not Mysterious. This is no surprise if we consider that Toland highly esteemed Locke and accepts the Lockean Axiom. In fact, and, again, similarly to Collins, he implicitly draws from the axiom (or rather its contraposition) to argue against the religious mysteries of Christianity, such as the virgin birth of Jesus Christ or the latter’s resurrection from the dead. These events are mysterious in the sense that they cannot be explained without invoking a supernatural power because they conflict with the way things ‘naturally’ are. In line with such an understanding, Toland defines mysteries as “a thing of its own Nature inconceivable, and not to be judg’d by our ordinary Faculties and Ideas” (CNM, 93). The underlying idea is that mysteries are beyond the realm of our experience and that we cannot have an idea of any mystery because we cannot experience them—and so Toland says that “a Mystery expresses Nothing by Words that have no Ideas at all” (CNM, 84). In saying this, Toland intends to follow Locke in holding that every meaningful word must stand for an idea and as such can be traced to some experience. As Locke says: “He that hath Names without Ideas, wants meaning in his Words, and speaks only empty Sounds” (Essay 3.10.31). On this basis Toland argues that terms referring to mysteries are empty or meaningless because there can be no experiences of them. For instance, Toland criticises the doctrine of the Holy Trinity on this ground as well as arguing that it is supported neither by the Bible nor by any other form of divine revelation (CNM, § 3)—though he does not reject the existence of divine revelation outright (compare CNM, 12).
In keeping with his critical attitude towards (religious) authorities, Toland claims that the Holy Trinity and other mysteries are an invention of “priest-craft” (CNM, 100) and nothing but a tool for submission. This point ties into his overall emancipatory aim of arguing for the right of everyone to use their reason in order to interpret the Bible on their own, without interference by religious authorities (CNM, 5–14). For Toland believes that every reasonable person is capable of understanding the Bible because reason is God-given. As Toland puts this point when addressing a potential clerical reader: “The uncorrupted Doctrines of Christianity are not above their [that is, the lay people’s] Reach or Comprehension, but the Gibberish of your Divinity Schools they understand not” (CNM, 87).
In short, Toland makes use of Lockean insights to tackle what were difficult and important theological questions of the day. By implicitly drawing on the Lockean Axiom and a broadly Lockean understanding of meaning, he argues against an overreach of clerical authority and against the existence of religious mysteries. For Toland, if something is really part of Christianity, it must also be accessible to our God-given reason (see also Daniel 1984 and the essays in Toland 1997 for more on Toland’s position).
c. George Berkeley
Throughout his life, Berkeley was very concerned with battling atheism or ideas which he thought undermined Christian teachings. His Principles of Human Knowledge was dedicated to identifying and rejecting the “grounds” for “Atheism and Irreligion” (Works II, 20). He also defends the idea that vision (NTV1709 § 147) or nature (NTV1732 § 147) is a divine language in his New Theory of Vision. Yet his most elaborate defense of the idea that we can experience God through nature is found in the fourth dialogue of Alciphron; or the Minute Philosopher (1732/52), which is a set of philosophical dialogues. In a nutshell, Berkeley argues that we have no direct access to, or experience of, any other mind, including the minds of our fellow human beings or other rational agents. Nonetheless, most of us believe that other rational agents exist. The reason for this, Berkeley contends, is that these agents exhibit “signs” of their rationality which we can experience. Most notably, they communicate with us using language (AMP 4.5–7). Berkeley then argues (AMP 4.8–16) that nature—that is, everything we see, hear, smell, taste, and touch—literally forms a divine language (there are competing interpretations of how to best interpret this divine language; see, for example, Fasko 2021 and Pearce 2017). This language not only shows that God (as a rational agent) exists, but also displays the divine goodness by providing us with “a sort of foresight which enables us to regulate our actions for the benefit of life. And without this we should be eternally at a loss” (PHK § 31). For example, God ensures that where there is fire there is smoke and, in that way, ‘tells’ us there is fire nearby when we see smoke. In this way, Berkeley objects to the line of reasoning introduced at the beginning of this section—that we cannot have an idea of God because we cannot experience the deity—by showing that there is a sense in which we experience God, via the divine language that constitutes nature.
Thus, Berkeley not only accepts the Lockean Axiom, but also accepts Collins’s point that we do not immediately experience God. What he rejects is the notion that there are no mediate signs of God’s existence, because nature, as a divine language, is abundant with them.
While Alciphron provides evidence of God’s existence, Berkeley’s account of how we know (something) about God’s nature can be found in the Three Dialogues. There, he explains:
[T]aking the word ‘idea’ in a large sense, my soul may be said to furnish me with an idea, that is, an image or likeness of God, though indeed extremely inadequate. For all the notion I have of God is obtained by reflecting on my own soul, heightening its powers, and removing its imperfections. (DHP 231–32)
In other words, by reflecting on my own mind, heightening its powers and removing its imperfections, I can get a sense of what God’s mind must be like. Combined with the claims in Alciphron, Berkeley thus offers an account of knowledge of God’s existence and nature.
In the seventh dialogue of Alciphron, Berkeley tackles the challenge issued by Toland. Berkeley argues that it is not problematic that some words do not signify ideas and thus their meaning cannot be traced back to some experience. In fact, Berkeley argues, our everyday language is full of such words. These words still have a meaning because they serve a purpose:
[T]here may be another use of words besides that of marking and suggesting distinct ideas, to wit, the influencing our conduct and actions, which may be done either by forming rules for us to act by, or by raising certain passions, dispositions, or emotions in our minds (AMP 7.5).
Berkeley thus deems it irrelevant for the meaningfulness of a term whether it refers to ideas that are ultimately grounded in experience. Rather, its meaning needs to be judged by the function it serves. When it comes to the mysteries that Toland attacked, Berkeley argues that it is irrelevant that we cannot experience them: as long as talking about them serves the right function, such talk is still meaningful (AMP 7.14–31) (see Jakapi 2002; West 2018).
6. Anton Wilhelm Amo: A Case Study in the Limits of British Empiricism
We argued in the first section of this article that considering the Peripatetic axiom, or more precisely the Lockean Axiom, allows for a more inclusive and diverse alternative story than the standard narrative of ‘British Empiricism’ which solely focuses on Locke, Berkeley, and Hume. This was our motivation for moving away from the standard narrative and focusing on the Lockean Axiom. The advantages of the narrative presented here are that it can incorporate a wider variety of issues and thinkers. However, we also pointed out that the narrative told here is neither exclusive nor exhaustive. Rather than this being a fault specific to our chosen narrative, we think this is an inevitable consequence of developing narratives that include some figures or ideas and exclude others.
This final section’s aim is to further put this narrative into perspective—not least, to make it abundantly clear that we do not intend to replace the standard narrative with the ‘correct’ story of ‘British Empiricism’. Rather, our aim is to illustrate that we are forced to tell stories that involve difficult choices, choices which ought nonetheless to be deliberate (and transparent), and to show what kinds of stories can be told and what the limitations of narratives such as the one developed here are. In the following, we therefore first introduce a fringe case—that is, a thinker who could, on certain readings, be read as an ‘empiricist’—in the form of Anton Wilhelm Amo (1703–1756), the first African to receive a doctorate in Europe and a figure who is increasingly of interest to Early Modern scholars (for example, Wiredu 2004, Emma-Adamah 2015, Meyns 2019, Menn and Smith 2020, Smith 2015, Walsh 2019, West 2022).
The aim in doing so is to demonstrate that the Peripatetic Axiom transcended the boundaries of early modern Britain and that it was quite possible for thinkers on the continent to have just as much (if not more) in common with, for example, Locke as with Descartes (in turn, this indicates that the traditional story of ‘empiricism versus rationalism’ cannot simply be replaced with ‘Lockeanism versus Cartesianism’). The case of Amo also puts pressure on the cohesiveness of the concept of ‘British Empiricism’—in short, there is nothing uniquely British about being an ‘empiricist’ (that is, accepting the Peripatetic Axiom or the Lockean Axiom). We begin with a very brief overview of Amo’s philosophy before drawing out the tension between, on the one hand, Amo’s commitment to the Peripatetic Axiom and, on the other, the difficulty that arises if we try to place him in the ‘empiricist’ tradition. The case of Amo, we think, shows that there simply is not—in any realist sense—any fact of the matter about whether this or that philosopher is or is not an ‘empiricist.’
Anton Wilhelm Amo wrote four texts during his lifetime: the Inaugural Dissertation on the Impassivity of the Human Mind, the Philosophical Disputation Containing a Distinct Idea of those Things that Pertain either to the Mind or to Our Living and Organic Body (both written in 1734), a Treatise on the Art of Philosophising Soberly and Accurately, and On the Rights of Moors in Europe (his first text, published in 1729, which, sadly, is now lost). The three surviving texts outline Amo’s account of the mind-body relation, which is substance dualist, and his theory of knowledge. Specifically, the Inaugural Dissertation and the Philosophical Disputation both defend a roughly Cartesian account of the mind-body relation and mind-body interaction. Amo is critical of certain elements of Descartes’ view—in particular, the idea that the mind can ‘suffer’ with the body (that is, passively experience sensations) (ID, 179–81). Yet, while he is critical, Amo’s aim is not to dismiss but to fix these kinds of issues with Descartes’ dualism (Nwala 1978, 163; Smith 2015, 219). While it is not clear-cut, there is therefore a case to be made for thinking of Amo as a ‘Cartesian’—if, by ‘Cartesian’, we mean something like a thinker who sets out to augment Descartes’ worldview in order to defend or support it. He is certainly not an outright critic. At the very least, it would be difficult to place Amo in the ‘empiricist’ tradition—at least as it is typically construed—given the underlying Cartesian flavour of his philosophical system.
What makes Amo an interesting ‘fringe’ case for ‘empiricism’—and, indeed, Cartesianism too—is his explicit commitment to the Peripatetic Axiom (see, for example, Treatise, 139, 141, 146). Like Hobbes and Locke—as well as Aristotelian scholastics like Aquinas before them (see section 2.1)—Amo maintains that there is nothing in the intellect not first in the senses. Other Cartesians, like Antoine Arnauld for example, explicitly rejected the Peripatetic Axiom. As Arnauld puts it, “It is false…that all of our ideas come through our senses” (Arnauld 1970, 7). Now, it is worth noting that Amo is not a lone outlier. Other Cartesians, like Robert Desgabets or Pierre-Sylvain Régis, also accepted the Peripatetic Axiom—thus, there are further fringe cases. Nonetheless, Amo’s body of work is greatly suited to illustrate the limitations of our narrative and, in fact, any narrative that makes use of ‘empiricism’, or related notions like ‘Cartesianism’, as labels. For, on the one hand, Amo has in common with traditional ‘empiricists’, like Locke, a commitment to the Peripatetic Axiom. But on the other, he wants to defend and improve the philosophical system of someone (that is, Descartes) who has come to epitomize like no other what ‘rationalism’ is about.
One might demand to know: ‘Well, is Amo an empiricist or not?’ But what this discussion shows, we contend, is that when it comes to Amo, or others like Desgabets or Régis, there is no simple (or ‘right’) answer. The answer depends on what is meant by ‘empiricist’—and this, in turn, might depend upon the context in which that concept is being employed or the use to which it is being put.
In that sense, Amo’s body of work illustrates the very fundamental problem or ‘danger’ that the attribution of any “-ism”—that is, an analyst’s rather than an actor’s category—runs, particularly if these positions are taken to be dichotomous to others: such attributions risk obfuscating important similarities or differences between thinkers’ ideas, or simply omitting interesting thinkers or ideas just because they do not fit the story—and, crucially, not because they are undeserving of attention.
There are other reasons to think of someone like Amo as a particularly significant figure when it comes to examining, and revising, the historical canon—and categories like ‘empiricism’ in particular. In light of growing interest in and demand for non-Western figures, and thinkers from typically marginalised backgrounds, both in teaching and scholarship, Amo—the first African to receive a doctorate in Europe—has attracted considerable attention. But in what context should we teach or write about Amo? Continuing to think in terms of the standard narrative of ‘British Empiricism’ versus ‘Continental Rationalism’ will, as the above discussion showed, not make it easy to incorporate Amo’s work into syllabi or research—precisely because there is no objective fact of the matter about whether he is one or the other. And, as we have already suggested, Amo is not alone; this is true of many figures who, not coincidentally, have never quite found a place in the standard early modern canon. We think there are ways to incorporate figures like Amo into our familiar narratives—for instance, construing ‘empiricism’ in terms of an adherence to the Peripatetic Axiom does, in that sense, make Amo an ‘empiricist’—but such cases also provide reasons to think that we ought to take a serious look at what purpose those narratives serve and whether we, as scholars and educators, want them to continue to do so. New narratives are available and might better serve our aims, and correspond with our values, in teaching and scholarship going forward.
7. References and Further Reading
When citing primary sources, we have always aimed to use the canonical or most established forms. In the cases where there are no such forms, we used abbreviations that seemed sensible to us. Also, if the text is not originally written in English, we have utilized standard translations. Finally, we want to note that on each of these figures and issues there is far more high-quality scholarship than we were able to point to in this article. The references we provide are merely intended to be a starting point for anyone who wants to explore these figures and issues in more detail.
a. Primary Sources
Amo, Anton Wilhelm. Anton Wilhelm Amo’s Philosophical Dissertations on Mind and Body. Edited by Justin E. H. Smith and Stephen Menn. Oxford: Oxford University Press, 2020.
The first critical translation of Amo’s work espousing his philosophy of mind.
Amo, Anton Wilhelm. Treatise on the Art of Philosophising Soberly and Accurately (with commentaries). Edited by T. U. Nwala. William Amo Centre for African Philosophy, University of Nigeria, 1990.
Amo’s most systematic text in which he offers a guide to logic and fleshes out his account of the mind-body relation and philosophy of mind.
Aristotle. [APo.], Posterior Analytics, trans. Hugh Tredennick, in Aristotle: Posterior Analytics, Topica, Loeb Classical Library, Cambridge, MA; London: William Heinemann, 1964, pp. 2–261.
One of the most prominent English translations of Aristotle’s famous work on science.
Aristotle. [EN], The Nicomachean Ethics, trans. H. Rackham, Loeb Classical Library, London: William Heinemann, Cambridge, MA: Harvard University Press, 1947.
One of the most prominent English translations of Aristotle’s famous work on ethics.
Aristotle. [GA], De la génération des animaux, ed. Pierre Louis, Collection des Universités de France, Paris: Les Belles Lettres, 1961; trans. A. L. Peck, in Aristotle, Generation of Animals, Loeb Classical Library, London: William Heinemann, Cambridge, MA: Harvard University Press, 1953.
One of the most prominent English translations of Aristotle’s famous work on biology.
Aristotle. [Meteor.], Meteorologica, trans. H. D. P. Lee, Loeb Classical Library, London: William Heinemann, Cambridge, MA: Harvard University Press, 1962.
One of the most prominent English translations of Aristotle’s famous work on the elements.
Aristotle. The Complete Works of Aristotle. Edited by Jonathan Barnes. Princeton: Princeton University Press, 1984.
Standard English translation used by scholars of Aristotle’s complete works.
Arnauld, Antoine. La Logique, ou L’Art de penser. Flammarion, 1970.
Edition of Arnauld and Nicole’s logic textbook.
Arnauld, Antoine and Nicole, Pierre. Logic, or, The art of thinking in which, besides the common, are contain’d many excellent new rules, very profitable for directing of reason and acquiring of judgment in things as well relating to the instruction of for the excellency of the matter printed many times in French and Latin, and now for publick good translated into English by several hands. London: Printed by T.B. for H. Sawbridge, 1685.
Early English translation of this important text for the so-called Port-Royal Logic; an influential logic textbook.
Aquinas, Thomas. [DA] A Commentary on Aristotle’s De anima. Edited by Robert Pasnau. New Haven, CT: Yale University Press, 1999.
English translation of Aquinas’ commentary on Aristotle’s famous text on the soul.
Aquinas, Thomas. Truth. Translated by Robert W. Mulligan, James V. McGlynn, and Robert W. Schmidt. 3 volumes. Indianapolis: Hackett, 1994.
English translation of Aquinas’ disputed questions on truth (De veritate).
Astell, Mary. A Serious Proposal to the Ladies. Parts I and II. Edited by P. Springborg. Ontario: Broadview Literary Texts, 2002.
Argues for women’s education and offers a way for women to improve their critical thinking skills.
Astell, Mary. The Christian Religion, As Profess’d by a Daughter of the Church of England. In a Letter to the Right Honourable, T.L. C.I., London: R. Wilkin, 1705.
Introduces Astell’s religious and philosophical views and continues her feminist project.
Bacon, Francis. The Works. Edited by J. Spedding, R. L. Ellis, and D. D. Heath. 15 volumes. London: Houghton Mifflin, 1857–1900.
First edition of Bacon’s works, still in use by scholars.
Bacon, Roger. [OM] The ‘Opus Maius’ of Roger Bacon. Edited by Robert Belle Burke. 2 volumes. New York: Russell & Russell, 1928.
One of Bacon’s most important works, attempting to cover all aspects of natural science.
Berkeley, George. The Correspondence of George Berkeley. Edited by Marc A. Hight. Cambridge: Cambridge University Press, 2013.
Most comprehensive edition of Berkeley’s correspondence with friends, family, and contemporary thinkers.
Berkeley, George. The Works of George Berkeley, Bishop of Cloyne. Edited by A. A. Luce and T. E. Jessop. 9 volumes. London: Thomas Nelson and Sons, 1948–1957.
Currently the standard scholarly edition of Berkeley’s writings.
Cavendish, Margaret. Observations upon Experimental Philosophy, Edited by Eileen O’Neill. Cambridge: Cambridge University Press, 2001.
Cavendish’s critique of the experimental philosophy of the Royal Society in London, and a defence of her own philosophical system.
Cavendish, Margaret. Grounds of Natural Philosophy. Edited by Anne M. Thell. Peterborough, Canada: Broadview Press, 2020.
The most detailed articulation of Cavendish’s ‘vitalist’ philosophical system of nature.
Cavendish, Margaret. The Blazing World and Other Writings. London: Penguin Classics, 1994.
Cavendish’s fantasy novel, which critiques the Royal Society and was published alongside her Observations.
Collins, Anthony. A Discourse of Free-thinking: Occasion’d by the Rise and Growth of a Sect Call’d Free-thinkers. London, 1713.
A defence of the right to think for oneself on any question.
Collins, Anthony. A vindication of the divine attributes In some remarks on his grace the Archbishop of Dublin’s sermon, intituled, Divine predestination and foreknowledg consistent with the freedom of man’s will. H. Hills, and sold by the booksellers of London and Westminster, 1710.
A critique of Archbishop King’s sermon, arguing that King’s position is effectively no different from atheism.
Conway, Anne. The Principles of the Most Ancient and Modern Philosophy. Translated by J. C[lark]. London, 1692.
First English translation of Conway’s only known book introducing her metaphysics and system of nature.
Hobbes, Thomas. Leviathan, with selected variants from the Latin edition of 1668. Edited by Edwin Curley. Indianapolis: Hackett, 1994
Hobbes’ influential political treatise, in which he also defends materialism and an ‘empiricist’ theory of knowledge.
Hume, David. Enquiries concerning Human Understanding and concerning the Principles of Morals, edited by L. A. Selby-Bigge, 3rd ed. revised by P. H. Nidditch, Oxford: Clarendon Press, 1975.
Standard scholarly edition of Hume’s two Enquiries, in which he lays out his epistemology and his moral philosophy.
Hume, David. A Treatise of Human Nature. Edited by L. A. Selby-Bigge, 2nd edition revised by P. H. Nidditch. Oxford: Clarendon Press, 1975.
Standard scholarly edition of Hume’s famous work in which he lays out his account of human nature and begins to develop an account of the human mind.
Hutcheson, Francis. An Inquiry into the Original of Our Ideas of Beauty and Virtue. Edited by Wolfgang Leidhold. Indianapolis: Liberty Fund, 2004.
Hutcheson’s influential texts on ethics and aesthetics, in which he argues that we have both a moral sense and a sense of beauty.
Hutcheson, Francis. An Essay on the Nature and Conduct of the Passions, with Illustrations on the Moral Sense. Edited by Aaron Garrett. Indianapolis: Liberty Fund, 2002.
A text outlining Hutcheson’s moral philosophy.
King, William. Archbishop King’s Sermon on Predestination. Edited by David Berman and Andrew Carpenter. Cadenus Press: Dublin, 1976.
A sermon on predestination revolving around the issue of divine attributes and the way we can meaningfully talk about these attributes and God’s nature.
Leibniz, Gottfried Wilhelm. Die philosophischen Schriften. Edited by Carl Immanuel Gerhardt. 7 volumes. Weidmann: Berlin, 1875–90.
Standard scholarly edition of Leibniz’s philosophical writings.
Locke, John. An Essay concerning Human Understanding. Edited by Peter H. Nidditch. Oxford: Clarendon Press, 1975.
Standard scholarly edition of Locke’s most famous work, providing his description of the human mind.
Masham, Damaris. Occasional Thoughts in Reference to a Vertuous or Christian Life, London: A. and J. Churchil, 1705.
Masham’s second book develops the views of the Discourse in relation to practical morality.
Masham, Damaris. A Discourse Concerning the Love of God, London: Awnsham and John Churchill, 1696.
Argues that humans are social and rational as well as motivated by love of happiness.
Newcome, Susanna. An Enquiry into the Evidence of the Christian Religion. By a Lady [i.e., S. Newcome]. The second edition, with additions. London: William Innys, 1732.
Newcome’s book espousing her views on morality and a defence of the Christian religion.
Plato. Plato: Complete Works. Edited by John M. Cooper. Indianapolis: Hackett, 1997.
A standard English edition of Plato’s complete works.
Shepherd, Mary. Essays on the Perception of an External Universe, and Other Subjects connected with the Doctrine of Causation. London: John Hatchard and Son, 1827.
Shepherd’s second book introducing her metaphysics by establishing that there is an independently and continuously existing external world.
Shepherd, Mary. An Essay upon the Relation of Cause and Effect, controverting the Doctrine of Mr. Hume, concerning the Nature of the Relation; with Observations upon the Opinions of Dr. Brown and Mr. Lawrence, Connected with the Same Subject. London: printed for T. Hookham, Old Bond Street, 1824.
Shepherd’s first book introducing her notion of causation by way of rejecting a Humean notion of causation.
Toland, John. John Toland’s Christianity Not Mysterious: Text, Associated Works, and Critical Essays. Edited by Philip McGuinness, Alan Harrison, and Richard Kearney. Dublin: Liliput Press, 1997.
Critical edition of one of Toland’s most famous works, which argues that nothing that is part of Christianity is above or beyond reason.
Wollstonecraft, Mary. A Vindication of the Rights of Woman with Strictures on Political and Moral Subjects. Edited by Sylvana Tomaselli, in A Vindication of the Rights of Men with A Vindication of the Rights of Woman and Hints, Cambridge: Cambridge University Press, 1995.
Critical edition of Wollstonecraft’s groundbreaking work arguing for women’s rights.
b. Secondary Sources
Agnesina, Jacopo. The philosophy of Anthony Collins: free-thought and atheism. Paris: Honoré Champion, 2018.
Consideration of Collins’ philosophy with a focus on the question whether he is an atheist.
Anfray, Jean-Pascal. “Leibniz and Descartes.” In The Oxford Handbook of Descartes and Cartesianism, edited by Steven Nadler, Tad M. Schmaltz, and Delphine Antoine-Mahut, 721–37, Oxford: Oxford University Press, 2019.
Essay considering the complicated relationship between two rationalists.
Atherton, Margaret. “Lady Mary Shepherd’s Case Against George Berkeley.” British Journal for the History of Philosophy 4 (1996): 347–66. Doi: 10.1080/09608789608570945
First article to discuss and evaluate Shepherd’s criticism of Berkeley.
Atherton, Margaret, ed. Women philosophers of the early modern period. Indianapolis/Cambridge: Hackett, 1994.
Groundbreaking volume that presents excerpts from the works of various women philosophers, intended for their inclusion in the classroom.
Atherton, Margaret. “Cartesian reason and gendered reason.” In A mind of one’s own edited by Louise Antony and Charlotte Witt, 21-37, Boulder, CO: Westview Press, 1993.
Argues against first generation feminist critiques for the emancipatory potential that Cartesianism held for some female thinkers.
Atherton, Margaret. Berkeley’s Revolution in Vision. Ithaca: Cornell University Press, 1990.
The most comprehensive study of Berkeley’s theory of vision and philosophy of perception.
Ayers, Michael. Locke: Epistemology and Ontology. London: Routledge, 1991.
An in-depth discussion of Locke’s theory of knowledge and metaphysics.
Bahar, Saba. Mary Wollstonecraft’s Social and Aesthetic Philosophy: An Eve to Please Me. New York: Palgrave, 2002.
Sustained discussion of the way that aesthetic considerations (pertaining to the presentation of women) play a crucial role for Wollstonecraft’s feminist project.
Beauchamp, T.L. and A. Rosenberg. Hume and the Problem of Causation. Oxford: Oxford University Press, 1981.
Classical study of the Humean notion of causation and its problems.
Bell, Martin. “Hume on Causation.” In The Cambridge Companion to Hume’s Treatise, edited by Donald C. Ainslie, and Annemarie Butler, 147–76, Cambridge: Cambridge University Press, 2015.
Consideration of Hume’s view of causation, highlighting the centrality of this issue for understanding his philosophical system.
Bennett, Jonathan. Locke, Berkeley, Hume: Central Themes. Oxford: Oxford University Press, 1971.
Classic study of the three so-called empiricists, highlighting issues discussed by all of these thinkers.
Bergès, Sandrine, and Coffee, Alan. The Social and Political Philosophy of Mary Wollstonecraft. Oxford: Oxford University Press, 2016.
Essays that distinctively consider Wollstonecraft as a philosopher and relate her to her intellectual context as well as contemporary debates.
Bergès, Sandrine. The Routledge guidebook to Wollstonecraft’s A Vindication of the Rights of Woman. London: Routledge, 2013.
Contributions introducing readers to Wollstonecraft’s famous work on women’s rights and hence also to the origins of feminist thought.
Bolton, Martha Brandt. “Lady Mary Shepherd and David Hume on Cause and Effect.” Feminist History of Philosophy: The Recovery and Evaluation of Women’s Philosophical Thought edited by Eileen O’Neill & Marcy P. Lascano, 129–52, Cham: Springer, 2019.
Sustained discussion of the differing understandings of causation in Hume and Shepherd.
Bolton, Martha. “Causality and Causal Induction: The Necessitarian Theory of Lady Mary Shepherd.” In Causation and Modern Philosophy, edited by Keith Allen and Tom Stoneham, 242–61. New York: Routledge, 2010.
Classical article on Shepherd’s idiosyncratic notion of causation and the way she departs from Hume.
Boyle Deborah. Mary Shepherd: A Guide. Oxford: Oxford University Press, 2023.
First book length treatment of Shepherd’s metaphysics, discussing her core commitments and pointing to helpful secondary literature.
Boyle, Deborah. “Mary Shepherd on Mind, Soul, and Self.” Journal of the History of Philosophy 58, no. 1 (2020): 93–112. Doi: 10.1353/hph.2020.0005
First sustained discussion of Shepherd’s philosophy of mind.
Boyle, Deborah A. The well-ordered universe: The philosophy of Margaret Cavendish. Oxford: Oxford University Press, 2018.
In-depth discussion of Cavendish’s metaphysics.
Broad, Jacqueline. “Damaris Masham on Women and Liberty of Conscience.” Feminist History of Philosophy: The Recovery and Evaluation of Women’s Philosophical Thought edited by Eileen O’Neill & Marcy P. Lascano, 319–36, Cham: Springer, 2019.
One of the first considerations of Masham’s views on the role of women and the ethics of toleration.
Broad, Jacqueline. The philosophy of Mary Astell: An early modern theory of virtue. Oxford: Oxford University Press, 2015.
Argues that Astell’s ethical goals are at the center of her philosophical project and help to unite some of her seemingly diverging commitments.
Broad, Jacqueline. “A woman’s influence? John Locke and Damaris Masham on moral accountability.” Journal of the History of Ideas 67, no. 3 (2006): 489–510. JSTOR: https://www.jstor.org/stable/30141038
Considers the influence Masham had on Locke’s notion of moral accountability.
Chappell, Vere, ed. Essays on Early Modern Philosophy, John Locke—Theory of Knowledge. London: Garland Publishing, 1992.
Contributions on a broad variety of issues pertaining to Locke’s theory of knowledge, ranging from triangles to memory.
Conley, John J. “Suppressing Women Philosophers: The Case of the Early Modern Canon.” Early Modern Women: An Interdisciplinary Journal 1, no. 1 (2006): 99-114. Doi: 10.1086/EMW23541458
Consideration of the exclusion of women from the history of philosophy with a focus on the challenges of their reintegration.
Connolly, Patrick J. “Susanna Newcome and the Origins of Utilitarianism.” Utilitas 33, no. 4 (2021): 384–98. Doi: 10.1017/S0953820821000108
One of the few scholarly works on Newcome arguing that she occupies a noteworthy position at the dawn of utilitarianism.
Costelloe, Timothy M. Aesthetics and morals in the philosophy of David Hume. London: Routledge, 2013.
A broad discussion of Hume’s ethics and aesthetics.
Cranefield, Paul F. “On the Origin of the Phrase NIHIL EST IN INTELLECTU QUOD NON PRIUS FUERIT IN SENSU.” Journal of the history of medicine and allied sciences 25, no. 1 (1970): 77–80. Doi: 10.1093/jhmas/XXV.1.77
Early article looking into the origin of the Peripatetic axiom as found in Locke.
Cruz, Maité. “Shepherd’s Case for the Demonstrability of Causal Principles.” Ergo: An Open Access Journal of Philosophy (forthcoming).
Argues that Shepherd endorses a broadly Lockean or Aristotelian substance metaphysics.
An introduction to Cavendish’s life and philosophical contributions.
Daniel, Stephen H. George Berkeley and Early Modern Philosophy. Oxford: Oxford University Press, 2021.
Book-length treatment of Berkeley, relating his views to many other Early Modern figures and to Ramism.
Daniel, Stephen Hartley. John Toland: His methods, manners, and mind. Kingston/Montreal: McGill-Queen’s Press-MQUP, 1984.
One of only a few book-length studies of Toland and his philosophy.
Detlefsen, Karen. “Atomism, Monism, and Causation in the Natural Philosophy of Margaret Cavendish.” Oxford Studies in Early Modern Philosophy 3 (2006): 199–240. Doi: 10.1093/oso/9780199203949.003.0007
A paper covering Cavendish’s rejection of atomism and commitment to monism, and her theory of causation.
Emma-Adamah, Victor U. “Anton Wilhelm Amo (1703-1756) the African‐German philosopher of mind: an eighteenth-century intellectual history.” PhD diss., University of the Free State, 2015.
A doctoral dissertation on Amo’s account of the mind-body relation.
Falco, Maria J., ed. Feminist Interpretations of Mary Wollstonecraft. University Park PA: Penn State Press, 2010.
Includes contributions on the political and social impact of Wollstonecraft’s views.
Fasko, Manuel. Die Sprache Gottes: George Berkeleys Auffassung des Naturgeschehens. Basel/Berlin: Schwabe Verlag, 2021.
Detailed discussion of Berkeley’s divine language hypothesis arguing, contra Pearce, that only vision is the language of God.
Fasko, Manuel, and Peter West. “The Irish Context of Berkeley’s ‘Resemblance Thesis.’” Royal Institute of Philosophy Supplements 88 (2020): 7–31. Doi: 10.1017/S1358246120000089
Arguing for the importance of the notion that representation requires resemblance in Berkeley’s intellectual context.
Discussion of Berkeley’s immaterialism in context of Descartes’ notion of objective presence that requires causal explanations of the content of ideas .
Frankel, Lois. “Damaris Cudworth Masham: A seventeenth century feminist philosopher.” Hypatia 4, no. 1 (1989): 80–90. Doi: 10.1111/j.1527-2001.1989.tb00868.x
Early article showing that Masham is a philosopher in her own right by expounding her feminist views.
Classic article on Hutcheson’s notion that we have a moral sense (much like a sense for seeing).
Galvagni, Enrico. “Secret Sentiments: Hume on Pride, Decency, and Virtue.” Hume Studies 47, no. 1 (2022): 131–55. Doi: 10.1353/hms.2022.0007
Discusses Hume’s account of decency and argues that it challenges standard virtue ethical interpretations of Hume.
Garrett, Don. “Hume’s Theory of Causation.” In The Cambridge Companion to Hume’s Treatise, edited by Donald C. Ainslie, and Annemarie Butler, 69–100, Cambridge: Cambridge University Press, 2015.
An introductory overview of Hume’s controversial theory of causation.
Gasser-Wingate, Marc. Aristotle’s Empiricism. Oxford: Oxford University Press, 2021.
An in-depth discussion of Aristotle’s view that all knowledge comes from perception.
Gordon‐Roth, Jessica, and Nancy Kendrick. “Including Early Modern Women Writers in Survey Courses: A Call to Action.” Metaphilosophy 46, no. 3 (2015): 364–79. Doi: 10.1111/meta.12137
Argues for the importance of including women philosophers, not least because of the current underrepresentation of women in the discipline.
Gracyk, Theodore A. “Rethinking Hume’s standard of taste.” The Journal of Aesthetics and Art Criticism 52, no. 2 (1994): 169–82.
A novel reading of Hume’s account of our knowledge of beauty.
Harris, James A. “Shaftesbury, Hutcheson and the Moral Sense.” In The Cambridge History of Moral Philosophy, edited by Sacha Golob and Jens Timmermann, 325–37. Cambridge: Cambridge University Press, 2017. Doi: 10.1017/9781139519267.026
An introductory overview of Hutcheson’s account of the moral sense.
Hutton, Sarah. “Women, philosophy and the history of philosophy.” In Women Philosophers from the Renaissance to the Enlightenment, edited by Ruth Hagengruber and Sarah Hutton 12–29. New York: Routledge, 2021.
A discussion of why and how women are omitted from many histories of philosophy.
Hutton, Sarah. “Liberty of Mind: Women Philosophers and the Freedom to Philosophize.” In Women and liberty, 1600-1800: philosophical essays edited by Jacqueline Broad, and Karen Detlefsen,123–37. Oxford: Oxford University Press, 2017.
A paper arguing that women in early modern philosophy construed liberty as ‘freedom of the mind.’
Hutton, Sarah. “Religion and sociability in the correspondence of Damaris Masham (1658–1708).” In Religion and Women in Britain, c. 1660-1760, edited by Sarah Apetrei and Hannah Smith, 117–30. London: Routledge, 2016.
A discussion of Masham’s religious and social views, as espoused in her correspondences.
Hutton, Sarah. Anne Conway: A woman philosopher. Cambridge: Cambridge University Press, 2004.
Detailed discussion of Conway’s philosophy and her intellectual context.
Jakapi, Roomet. “Emotive meaning and Christian mysteries in Berkeley’s Alciphron.” British journal for the history of philosophy 10, no. 3 (2002): 401–11. Doi: https://doi.org/10.1080/09608780210143218
Discusses the notion that Berkeley has an emotive theory of meaning.
Jolley, Nicholas. Locke, His Philosophical Thought. Oxford: Oxford University Press, 1999.
A broad discussion of Locke’s philosophical project.
Jones, Tom. George Berkeley: A Philosophical Life. Princeton: Princeton University Press, 2021.
The most comprehensive study of Berkeley’s life and intellectual context.
Kivy, Peter. The Seventh Sense: Francis Hutcheson and Eighteenth-Century British Aesthetics. Oxford: Clarendon Press, 2003.
An in-depth discussion of Hutcheson’s account of the sense of beauty.
Landy, David. “Shepherd on Hume’s Argument for the Possibility of Uncaused Existence.” Journal of Modern Philosophy 2, no. 1 (2020). Doi: 10.32881/jomp.128
Discusses Shepherd’s criticism of Hume’s argument.
Landy, David. “A Defense of Shepherd’s Account of Cause and Effect as Synchronous.” Journal of Modern Philosophy 2, no. 1 (2020). Doi: 10.32881/jomp.46
Important discussion of Shepherd’s account of synchronicity, defending this account against Humean worries.
Landy, David. “Hume’s theory of mental representation.” Hume Studies 38, no. 1 (2012): 23–54. Doi: 10.1353/hms.2012.0001
A novel interpretation of Hume’s account of how the mind represents external objects.
A discussion of Hume’s account of the relation between impressions and ideas.
Lascano, Marcy P. The Metaphysics of Margaret Cavendish and Anne Conway: Monism, Vitalism, and Self-Motion. Oxford: Oxford University Press, 2023.
Comprehensive discussion and comparison of Cavendish and Conway on three major themes in their philosophy.
Loeb, Louis E. Reflection and the stability of belief: essays on Descartes, Hume, and Reid. Oxford: Oxford University Press, 2010.
A discussion of the connections between Descartes, Hume, and Reid’s philosophies.
LoLordo, Antonia. Mary Shepherd. Cambridge: Cambridge University Press, 2022.
A broad overview of Shepherd’s philosophy, suitable for beginners.
LoLordo, Antonia, ed. Mary Shepherd’s Essays on the Perception of an External Universe. Oxford: Oxford University Press, 2020.
First critical edition of Shepherd’s 1827 book and 1832 paper.
Mackie, J. L. Problems from Locke, Oxford: Clarendon Press, 1971.
A discussion of the philosophical problems, relevant even today, that arise in Locke’s writing.
Mercer, Christia. “Empowering Philosophy.” In Proceedings and Addresses of the APA, vol. 94 (2020): 68–96.
An attempt to use philosophy’s past to empower its present and to promote a public-facing attitude to philosophy.
Meyns, Chris. “Anton Wilhelm Amo’s philosophy of mind.” Philosophy Compass 14, no. 3 (2019): e12571. Doi: 10.1111/phc3.12571
The first paper to provide a reconstruction of Amo’s philosophy of mind, suitable for beginners.
Michael, Emily. “Francis Hutcheson on aesthetic perception and aesthetic pleasure.” The British Journal of Aesthetics 24, no. 3 (1984): 241–55. Doi: 10.1093/bjaesthetics/24.3.241
A discussion of the sense of beauty and the feeling of pleasure in Hutcheson.
Myers, Joanne E. “Enthusiastic Improvement: Mary Astell and Damaris Masham on Sociability.” Hypatia 28, no. 3 (2013): 533–50. Doi: 10.1111/j.1527-2001.2012.01294.x
A discussion of the social philosophy of two early modern women.
Nwala, T. Uzodinma. “Anthony William Amo of Ghana on The Mind-Body Problem.” Présence Africaine 4 (1978): 158–65. Doi: 10.3917/presa.108.0158
An early attempt to reconstruct Amo’s response to the mind-body problem.
Influential paper since it is one of the first to discuss the problems and limits of the standard narrative that contrasts empiricism and rationalism.
Noxon, J. Hume’s Philosophical Development. Oxford: Oxford University Press, 1973.
A discussion of the development and changes in Hume’s philosophy over his lifetime.
O’Higgins, James. Anthony Collins the Man and His Works. The Hague: Martinus Nijhoff, 1970.
Still one of the most detailed discussions of Collins’ philosophy and intellectual context in English.
O’Neill, Eileen. “Disappearing Ink: Early Modern Women Philosophers and Their Fate in History.” In Philosophy in a Feminist Voice: Critiques and Reconstructions, edited by Janet A. Kourany, 17–62. Princeton: Princeton University Press, 1998.
Groundbreaking paper demonstrating how women thinkers have been eradicated from the history of philosophy.
Pearce, Kenneth L. Language and the Structure of Berkeley’s World. Oxford: Oxford University Press, 2017.
Detailed consideration of Berkeley’s divine language hypothesis (that is, the notion that nature is the language of God).
Rickless, Samuel C. “Is Shepherd’s pen mightier than Berkeley’s word?” British Journal for the History of Philosophy 26, no. 2 (2018): 317–30. Doi: 10.1080/09608788.2017.1381584
Discussion of Shepherd’s criticism of Berkeley.
Rickless, Samuel C. Berkeley’s argument for idealism. Oxford: Oxford University Press, 2013.
Critically discusses Berkeley’s arguments for idealism.
Sapiro, Virginia. A vindication of political virtue: The political theory of Mary Wollstonecraft. Chicago: University of Chicago Press, 1992.
One of the first detailed discussions of Wollstonecraft’s’ political thought.
Saporiti, Katia. Die Wirklichkeit der Dinge. Frankfurt a. M.: Klostermann, 2006.
Critical examination of Berkeley’s metaphysics.
Seppalainen, Tom, and Angela Coventry. “Hume’s Empiricist Inner Epistemology: A Reassessment of the Copy Principle.” In The Continuum Companion to Hume, edited by Alan Bailey and Daniel Jayes O’Brien, 38–56, London: Continuum, 2012.
Looks at exactly how Hume’s ‘copy principle’ (the claim that all ideas are copies of impressions) works.
Shapiro, Lisa. “Revisiting the early modern philosophical canon.” Journal of the American Philosophical Association 2, no. 3 (2016): 365–83. Doi: 10.1017/apa.2016.27
Critical consideration of the standard narrative arguing for a more inclusive story in terms of figures and issues considered.
Shelley, James. “Empiricism: Hutcheson and Hume.” In The Routledge companion to aesthetics, edited by Berys Gaut and Dominic Lopes, 55–68. London: Routledge, 2005.
An overview of Hutcheson and Hume’s ‘empiricist’ approach to beauty and aesthetics.
Shelley, James R. “Hume and the Nature of Taste.” The Journal of Aesthetics and Art Criticism 56, no. 1 (1998): 29–38. Doi: 10.2307/431945
Focuses on the ‘normative force’ in Hume’s conception of taste.
Smith, Justin EH. Nature, human nature, and human difference: Race in early modern philosophy. Princeton: Princeton University Press, 2015.
Investigates the rise of the category of race in the Early Modern period.
Taranto, Pascal. Du déisme à l’athéisme: la libre-pensée d’Anthony Collins. Paris: Honoré Champion, 2000.
Discusses Collins’ writings and the question whether he is a (covert) atheist.
Thomas, Emily. “Time, Space, and Process in Anne Conway.” British Journal for the History of Philosophy 25, no 5 (2017): 990–1010. Doi: 10.1080/09608788.2017.1302408
Discussion of Conway’s views in relation to Leibniz, arguing that Conway is ultimately closer to Henry More.
A discussion of a problem facing Amo’s philosophy, about how the mind and body can be in unison if they are heterogeneous entities.
West, Peter. “Why Can An Idea Be Like Nothing But Another Idea? A Conceptual Interpretation of Berkeley’s Likeness Principle” Journal of the American Philosophical Association 7, no. 4 (2021): 530-548. Doi: doi:10.1017/apa.2020.34
An account of why Berkeley thinks an idea can be like nothing but another idea.
West, Peter. ‘‘Mind-Body Commerce: Occasional Causation and Mental Representation in Anton Wilhelm Amo” Philosophy Compass 17, no. 9 (2022). Doi: https://doi.org/10.1111/phc3.12872
An overview of secondary literature on Amo’s philosophy of mind so far, and a new reading of how his theory of mental representation works.
Susanne Langer was an American philosopher working across the analytic and continental divide in the fields of logic, aesthetics, and theory of mind. Her work in these fields connects in various ways to her two central concerns: feeling and meaning.
Feeling, in Langer’s philosophy, encompasses the qualitative, sensory, and emotional aspects of human experience. It is not limited to mere emotional states but includes the entire range of sensory and emotional qualities that humans perceive and experience. Langer argues that feeling is not separate from rationality but, rather, an integral part of human intelligence and creativity.
In contrast to the logical positivists with whom she is sometimes associated, Langer argues for an expanded field of meaning. In contrast to the early Wittgenstein, who argues for a very limited field of meaning bounded by strict usage of language, Langer argues that symbolisms other than language are capable of expressing thoughts that language cannot.
Langer’s theory of feeling is closely tied to her theory of art, where she argues that artworks express forms of feeling. Artists use various elements, such as colours, shapes, sounds, and rhythms, to formulate feeling in their work, with each artwork being an art symbol. According to Langer, the artist’s task is to formulate the quality or gestalt of a particular feeling in their chosen medium.
In her broader philosophy of mind, Langer suggests that feeling is a fundamental aspect of human consciousness. She contends that feeling is not limited to individual emotions but is the basis for all forms of human thought, perception, and expression. In this sense, feeling serves as the foundation for higher-level cognitive processes, including symbolic thought and language.
Langer’s legacy includes her influential books on logic, philosophy of art, and theory of mind. Her position, whilst subject to minor terminological changes during her career, remains overwhelmingly consistent over half a century, and the resulting vision is a bold and original contribution to philosophy. Her ideas in the philosophy of art have been engaged with by various philosophers, including Nelson Goodman, Malcolm Budd, Peter Kivy, Brian Massumi, and Jenefer Robinson. In neuroscience and psychology, her notion of feeling, and her conceptual framework of mind, have been made use of by figures including Antonio Damasio and Jaak Panksepp. Overall, Langer’s work has left a lasting impact on philosophy, with her insights into the role of feeling in human life continuing to resonate with contemporary scholars and researchers.
Langer’s inclusiveness and rigor have recommended her thought to the generations since her passing. In the arts and biosciences her ideas are becoming more widely known. Langer’s work is a model of synthetic conceptual thinking which is both broad and coherent.
Susanne K. Langer (née Knauth) grew up in the Upper West Side of Manhattan, New York. The second of five children born to affluent German immigrants, Langer spoke German at home and French at school, and later claimed she only felt fully comfortable with English by the time she reached high school. Formative experiences included family summer vacations to Lake George and family music recitals in which Langer played the cello.
She attended Radcliffe College from 1916 and was awarded her doctorate in 1926 (during this time Langer took the same classes as male students at Harvard, though the women were taught separately; Harvard would not award men and women degrees on an equal basis until 1975). During her time at Radcliffe, she notably studied logic under Henry Sheffer, who introduced her to the ideas of Russell and the early Wittgenstein, as well as under Alfred North Whitehead, attending the lecture series which would become Process and Reality (1929). Whitehead would also supervise Langer’s doctoral thesis and write the introduction to her first book, The Practice of Philosophy (1930). Sheffer published very little, and Langer’s second book, An Introduction to Symbolic Logic (1937), is presented as putting forward Sheffer’s approach to logic, something Sheffer himself never did.
In 1921, Langer married William Langer, who would go on to become a scholar of European history, and the two spent much of their first year of marriage in Vienna. Langer lived the rest of her life in America, though she returned to Europe with her family in the summer of 1933 for a European tour and to visit Edmund Husserl in Schluchsee, Germany. The couple had two children but divorced in 1941, with Langer never remarrying.
In addition to the intellectual influences of Whitehead, Sheffer, and Wittgenstein, Langer was strongly taken by the ideas of Ernst Cassirer; they met and corresponded, with Langer going on to translate Cassirer’s Language and Myth (1946) into English.
Langer’s third book, Philosophy in a New Key (1942), sold more than half a million copies. Arguing that there had been a shift in many fields towards recognition of the role of the symbolic in human life, ritual, art and language, the book brought together findings from many areas and offered a conceptual framework within which to understand, in particular, language and music.
After her divorce, Langer moved to New York City and stayed there for a decade as she wrote her theory of art, Feeling and Form (1953). Langer had part-time and temporary positions at various academic departments, including Radcliffe (1926-42) and Columbia (1945-50), but she did not have a full-time academic post until 1954, when she took up the chair of the philosophy department at Connecticut College for Women. From 1962, she was funded by a grant from the Edgar J. Kaufmann Foundation for her major work on theory of mind, at which point she retired to concentrate on her writing. After this, she split her time between Old Lyme, Connecticut and summers in a wood cabin in Ulster County, New York. Due to ill health and, in particular, failing eyesight, she published a curtailed version of her final, third volume of Mind in 1982. She died in 1985.
2. Feeling
Langer’s notion of feeling underpins all her other work. Feeling tells organisms how they are doing in various categories of need, both internal and external. As Langer puts it:
Feeling is the constant, systematic, but private display of what is going on in our own system, the index of much that goes on below the limen of sentience, and ultimately of the whole organic process, or life, that feeds and uses the sensory and cerebral system. (Langer, 1967)
Langer’s basic analytical unit of life is the act, which she considers in terms of phases. Langer repeatedly acknowledges the futility of drawing hard dividing lines in the natural sciences. Her preference instead is to find centres of activity which hold together because they are functional. An act is a functional unit, and can be considered on dramatically different scales, from cell to organ, to organism and ecosystem. Feeling is anything that can be felt, which is to say that it is a felt phase of an act. Feeling is the mark of at least primitive mentality or mentation, though not, at least in known non-human animals, mind. As for the relationship of feeling to logic, Langer argues for an expanded logical field of meaning which includes feeling, considered not as an irrational disturbance in an organism but as the origin of logic; Langer writes that only a highly emotional animal could have developed the methods of logic. Lastly, there are unconscious processes, but there is no unconscious feeling: whatever can be felt is felt consciously.
Langer describes how a phase is not a thing but a mode of appearance, explaining that when iron, for instance, is heated to become red-hot, redness is a phase, a mode of appearance, rather than being a new entity. When the iron is cooled the redness vanishes; Langer claims that, similarly, feeling is like this redness of the iron, mere appearance that has no independent existence. This is not to deny the importance of these appearances, however, since they are what the organism has to guide its negotiation with both its internal and external environment. Langer considers the notion of feelings to be a reification: the process of feeling does not result in ontologically distinct products.
To the extent that an organism is able to react to a stimulus, it is able to feel. There are processes that may appear animated, such as leaves blowing along a path, or water bubbling up from a geyser, but in these examples the processes are entirely dictated by the external environment rather than being active agents seeking to maintain certain balances. If the stimuli in these examples cease, the wind for the leaf and the heat for the geyser, the animation would cease too, and immediately.
Animals feel, they feel their internal and external environment, they feel their own responses to the environment, and they feel as the environment responds to their actions. On human feeling, Langer writes:
Pure sensation—now pain, now pleasure—would have no unity, and would change the receptivity of the body for future pains and pleasures only in rudimentary ways. It is sensation remembered and anticipated, feared or sought or even imagined and eschewed that is important in human life. It is perception molded by imagination that gives us the outward world we know. And it is the continuity of thought that systematizes our emotional reactions into attitudes with distinct feeling tones, and sets a certain scope for an individual’s passions. In other words: by virtue of our thought and imagination we have… a life of feeling. (Langer, 1953)
Langer’s ideas are distinguished from those of the Classical Associationists; feeling is far from being a passive or neutral process, as Langer here stresses the feedback loop of imagination and perception in giving us access to the world. In stressing the continuity of the life of feeling, Langer is stressing the continuity of consciousness—not entirely unbroken in human experience, but normatively present. Feeling, for Langer, is the driving force of consciousness, motivating, among other functions, imagining and seeking and remembering.
This view of feeling leads to a particular view of consciousness: not as an emergent property of complex organisms such as humans but as a continuum along which there are simpler and more complex consciousnesses; whatever being is capable of feeling has at least a primitive awareness of, at a minimum, its sensory environment. Langer therefore considers even these very simple organisms to be feeling, which is to say constantly attaining psychical phases of sensory acts, and holds that this constitutes mental activity. Langer describes this as mentation until it reaches the high development that it does in humans, which is the point at which this activity passes the threshold to be considered mind.
The clear question to come out of this is to ask what, if not consciousness, accounts for the gulf between animal mentation and the human mind. And, for Langer, the answer to this is symbolic thought.
Many animals are capable of reacting appropriately to signs (in later works Langer calls these signals), but, in known examples, only humans respond symbolically to the environment. A sign or signal, for Langer, is a symptom of an event; this can be natural, as in footprints signifying that a person or animal has walked a certain way, or artificial, as in a school bell signifying that the end of the school day has come. Symbols, by contrast, call attention primarily to concepts rather than objects; Langer writes that if someone says “Napoleon,” the correct response is not to look around for him but to ask “What about Napoleon?” The symbolic therefore allows people to imagine non-actual situations, including other times and places and the speculative.
Langer considers both emotion and logic to be high developments of feeling. Langer writes that logic is a device for leading people between intuitions, these intuitions being meaningful personal understandings (see the next section for a fuller discussion of Langer’s logic). Langer does not have a fully developed theory of emotion, though she refers to emotional situations in individual people and groups not infrequently. Her notion of feeling is certainly compatible with the use that is made of it by scientists such as Jaak Panksepp and Antonio Damasio, though it need not necessitate their ideas of emotion.
Langer’s notion of art concerns feeling as well: she argues that artworks present forms of feeling for contemplation. The purpose of art is to render pre-reflexive experience available to consciousness so that it can be reflected (rather than merely acted) upon. It is this knowledge of feeling that artworks are meant to help us with educationally, socially, and cross-culturally. We have access, in life and in art, to forms only, from which we extrapolate meaning. In life, the forms of feeling are too embedded in practical situations for us to contemplate them. When art is viewed as art, the experience of it is disinterested: the forms are isolated from practical situations.
Despite Langer’s emphasis on embodiment, she also clearly emphasises cognitive evaluations. As in many other areas, Langer’s work can be seen to bridge perspectives that are often considered incompatible: in this case, that emotion is either fundamentally embodied or fundamentally cognitive:
Certainly in our history, presumably for long ages – eons, lasting into present times – the human world has been filled more with creatures of fantasy than of flesh and blood. Every perceived object, scene, and especially every expectation is imbued with fantasy elements, and those phantasms really have a stronger tendency to form systematic patterns, largely of a dramatic character, than factual impressions. The result is that human experience is a constant dialectic of sensory and imaginative activity – a making of scenes, acts, beings, intentions and realizations such as I believe animals do not encounter. (Langer, 1972)
Langer here clearly believes cognitive evaluations matter—beliefs, whether about ghosts and monsters and gods or about why the bus is late and what might be done about it, and especially expectations, which determine to a surprising extent what is perceived. Langer also stresses here the dynamic real-time mixing of sensory and imaginative activity, disposing the holder of these expectations towards certain kinds of experience.
This emphasis on feeling in Langer has clear parallels to her contemporary John Dewey, who similarly focused on experience. These parallels have been drawn out most thoroughly by Robert Innis in his monograph on Langer.
3. Logic
Langer’s most distinctive contribution to the philosophy of logic is her controversial claim that there is a presentational logic that operates differently from, but is no less reasonable than, traditional logic. This presentational logic functions by association rather than by logical implication (as in traditional logic) or causality; nonetheless, Langer considers it also to be a logic because presentational forms contain relational patterns. Langer first put forward this idea in her 1926 doctoral dissertation, ‘A Logical Analysis of Meaning’, in which she investigated the meaning of meaning from the starting point that the dictionary definition of meaning seems to have little to do with meaning in art or religion.
Langer developed this idea further in her first book, The Practice of Philosophy (1930), in which she also situated philosophy in relation to science. Arguing that analysis is an indispensable part of any complex understanding, she distinguished between the empirical sciences which pursue facts and the rational sciences which instead pursue meanings—the latter exemplified by mathematics and logic. These rational sciences, Langer claimed, are the foundation of the ‘higher’ and more concrete subjects of ethics and metaphysics. Langer points out, for instance, that it was in studying numbers that philosophers gained the understanding they needed to approach more accurately the concept of infinity, and that Zeno’s paradox—that matter in its eternal motion is really at rest—is solved by a clear understanding of the continuum.
Aspects of Langer’s views here are heavily influenced by logical positivism, and this impression of these ideas is likely to be strengthened in the reader’s mind by Langer’s positive discussion of Bertrand Russell and of the early Wittgenstein of the Tractatus. One feature that Langer shares with logical positivism, for instance, is her view that philosophy is a critique of language. But even in this first book, published at approximately the peak of logical positivism’s popularity, Langer explicitly distinguishes her views from those of logical positivism. Already at this point, Langer is insisting on the importance of an interpretant in the meaning relation, reinserting the aspect of personal experience which logical positivism had carefully removed.
One of Langer’s contributions to the logic of signs and symbols is the claim that the semantic power of language is predicated on the lack of any rival interest in vocables. Langer uses the example of an actual peach to replace the spoken word ‘plenty’, and she argues that we are too interested in peaches for this to be effective: the peach would be both distracting and wasted. It is the irrelevance of vocables for any other purpose than language that leads to the transparency of spoken language, where meaning appears to flow through the words.
Langer’s textbook, An Introduction to Symbolic Logic (1937), was written expressly to take students to the point where they could tackle Russell and Whitehead’s Principia Mathematica (1910–13). This textbook contains not only instruction on the formal aspects of symbolic logic, Boolean as well as that of Principia Mathematica, but also extensive philosophical discussion on metaphor, exemplification, generalization and abstraction. As well as achieving this task, the book functions as an introduction to how Sheffer practiced logic, since he did not publish such a text.
Sheffer had followed Josiah Royce in considering logic to be a relational structure rather than dealing solely with inference. Langer takes this notion and follows it through its implications, paying special attention to the distinctions between types of logic and meaning.
From one perspective, Langer’s view is very radical, since expanding the notion of meaning to logical resemblance incorporates huge swathes of life which had been dismissed by many of the thinkers she cites most, such as Russell and the early Wittgenstein, as nonsense. However, this emphasis on the structure of relations can also be seen as a form of hylomorphism, connecting Langer’s views to a tradition which stretches back to Aristotle.
4. The ‘New Key’ in Philosophy
Langer’s next book, Philosophy in a New Key (1942), might be thought of as her central work, in that it serves as a summation and development of her previous work in logic and an expanded field of meaning, but also gives early formulation to her ideas in all the fields which would preoccupy her for the rest of her career, including art and theory of mind, but also touching on linguistics, myth, ritual, and ethnography.
In the book Langer claims that figures as diverse as Freud, Cassirer, Whitehead, Russell, and Wittgenstein are all engaged in a shared project to understand the nature of human symbolization. Along the way, Langer touches on a wide variety of subjects of philosophical interest. Her theory of depiction, for instance, is given, along with a speculative account of the early development of language, and the relation of fantasy to rational thought.
Langer justifies the exploration of all these different topics in a single text by relating them all to a single idea: that across a wide range of humanities subjects there had been, in the late 19th and early 20th centuries, a fundamental shift in the intellectual framework within which work was done in these disciplines and that this shift was related in every case to an expanded appreciation of the nature of human symbolization. Langer describes this shift using the musical metaphor of a key change—hence Philosophy in a New Key. In her introduction, Langer offers a brief account of previous shifts in philosophy such as, for instance, the Cartesian notion of looking at reality as a dichotomy of inner experience and outer world.
Langer refers to her theory of the symbolic as a semantic theory, which proved controversial, as her theory includes but is not limited to language. This is the expanded field of meaning that Langer sought to describe and provide conceptual scaffolding for. Where Wittgenstein’s Tractatus Logico-Philosophicus famously ends with the statement that “whereof one cannot speak, thereof one must be silent” (Wittgenstein, 1922), Langer argues that language is only one of many symbolisms, albeit one of particular importance, and that other symbolisms, including myth, ritual, and art, can form thoughts which language is incapable of. Whether or not Langer is correct depends not only on whether the semantic can be broadened in this way so that the semantic does not need a corresponding syntax, for instance, but also on whether there are thoughts which language is not capable of expressing.
Langer’s distinction between discursive and presentational symbolic forms in Philosophy in a New Key has received extensive discussion. Briefly, discursive forms are to be read and interpreted successively, whereas presentational forms are to be read and interpreted as a whole. Another important difference is that in discursive symbolisms the individual elements have independent meaning whereas in non-discursive symbolisms they do not; words have independent meaning even when isolated from a wider text or utterance, whilst lines, colours, and shapes isolated from an artwork do not have independent meaning.
Scientific language and mathematical proofs are straightforwardly discursive, whereas photographs and paintings are straightforwardly presentational. Some less intuitive but still important applications of this distinction exist, however, with novels and poems, for instance, being considered presentational forms by Langer; despite being formed with language, the artwork functions as a whole, and cannot be judged without considering the whole. On the other hand, graphs and charts function discursively, despite being visual.
Langer’s discussion of ritual is related to her careful reading of Ernst Cassirer, whom Langer met and corresponded with, and who considered Philosophy in a New Key to be the book on art which corresponded to his three-volume Philosophy of Symbolic Forms (1923–29). Langer would translate and write the introduction for the English-language edition of Cassirer’s Language and Myth (1946). Considering rain dances, for instance, Langer discusses them neither as a dishonest trick of tribal seniors nor as magic. Instead, the group activity is seen as symbolic:
Rain-making may well have begun in the celebration of an imminent shower after long drought; that the first harbinger clouds would be greeted with entreaty, excitement, and mimetic suggestion is obvious. The ritual evolves while a capricious heaven is making up its mind. Its successive acts mark the stages that bring the storm nearer. (Langer, 1942)
Langer notes, moreover, that participants do not try to make it snow in mid-summer, nor to ripen fruits entirely out of season. Instead, the elements are either present or imminent, and participants encourage them.
Langer’s treatment of music in the book is notable, defending critic Clive Bell’s famous phrase in Art (1914) that called art ‘significant form’. Langer argues that the sense in which this is true is that music is a symbolic form without fixed conventionally assigned meanings—she calls music an “unconsummated symbolism.” (Langer, 1942) Langer dismisses both the hedonic theory of art and the contagion theory, and she argues instead that music expresses the composer’s knowledge of feeling, an idea she attempts to elucidate and clarify but which she attributes to numerous European composers, critics, and philosophers, including Wagner, Liszt, and Johann Adam Hiller.
Philosophy in a New Key might also be thought to be central because Langer’s later theory of art is explicitly introduced on its cover as being derived from Philosophy in a New Key, and subsequently her Mind trilogy is introduced as having come out of her research on living form that informed her philosophy of art. Langer herself frequently refers back to Philosophy in a New Key in her later works, whereas The Practice of Philosophy never went beyond the first edition, with Langer in her later life turning down requests by the publisher to put it back in print.
The book was unexpectedly popular; despite its enthusiastic reception by the general public, however, it was largely neglected by the academic community at the time. The book’s success may partly explain Langer’s relative prominence within musical aesthetics compared to her relative neglect in the aesthetics of other artforms, since her treatment of music in Philosophy in a New Key is much fuller than her brief and scattered comments on other artforms. Langer was well aware of this, and indeed her subsequent work, Feeling and Form, gives separate and sustained attention to a wide variety of artforms.
5. Theory of Art
After Philosophy in a New Key’s popular success in giving an account of music, Langer generalised its account to a theory of all the arts. Feeling and Form (1953) is split into three major parts: the first deals with introductory matters; part two, by far the largest part of the book, gives separate and sustained attention to each of the artforms dealt with in turn, including painting, sculpture, architecture, music, dance, poetry, myth and legend, prose fiction, comedic drama and tragic drama (there is also a short appendix on film at the end of the book); then, in part three, Langer gives her general account.
Helpfully, in part three she compares her ideas in detail to those of R. G. Collingwood, whose Principles of Art (1938) had appeared just fifteen years before; this very much helps to locate Langer’s position. The final chapter of Feeling and Form considers art from the point of view of its public, considering the educational and social role of art, in a way that both ties Feeling and Form into the sections on ritual and myth in Philosophy in a New Key and anticipates some arguments Langer would make in Volumes 1 and 3 of Mind. The theory of art presented here is based primarily on Feeling and Form, but also includes elements and quotes from the two other later books in which Langer discusses art at length: Problems of Art (1957) and Mind: Volume 1 (1967).
Langer’s theory states that artworks present forms of feeling. This is possible because both feeling and artistic elements are experienced as qualitative gradients; the forms of each are congruent. Feeling may be complex or simple—more or fewer gradients can be experienced simultaneously; artworks, similarly, may present many gradients at once or very few. In either case, there is a unity to the feeling or artwork—an overall quality. It is this quality of feeling that an artist tries to express when creating a work, negotiating the artistic elements.
Artists work by weighing qualities in the forming artwork—a formulation that seems to capture practices as diverse as traditional easel painting or the selection of ready-mades, a composer writing a symphony or a rock band writing a song, or theatre directors giving feedback to actors on blocking or actors improvising a scene of street theatre. “Artistic forms,” Langer writes, “are more complex than any other symbolic forms we know. They are, indeed, not abstractable from the works that exhibit them. We may abstract a shape from an object that has that shape, by disregarding color, weight and texture, even size; but to the total effect that is an artistic form, the color matters, the thickness of lines matters, and the appearance of texture and weight.” (Langer, 1957) The value of art is intrinsic to the work, rather than being a communication medium, and it is the sensuous qualities of the work which give the viewer access to the meaning (literary work being experienced in the sensuous imagination).
Langer holds that artworks are each a symbol expressive of human feeling. By expression—to press out—Langer means projection; she uses the example of horns projected from the head of a reindeer. An art object is therefore a projection not of spontaneous feeling but of the artist’s knowledge of feeling. Langer’s Expressivism, moreover, does not insist on melodrama and high emotion. Whilst it could be argued that Langer’s concept of expression differs too significantly from others in the Expressivist tradition to be called such, Langer herself writes that she, Croce, and Collingwood, along with Bell, Fry, and Cassirer, are embarked on a shared project. (Langer, 1953) So long as it is remembered that Langer does not claim that artworks express emotion, the grouping seems fair; Langer’s account concerns expressive form articulating knowledge of feeling rather than a contagious and spontaneous outpouring.
Langer writes that artists try to express a unitary gestalt:
What any true artist – painter or poet, it does not matter – tries to “re-create” is not a yellow chair, a hay wain or a morally perplexed prince, as a “symbol of his emotion,” but that quality which he has once known, the emotional “value” that events, situations, sounds or sights in their passing have had for him. He need not represent those same items of his experience, though psychologically it is a natural thing to do if they were outstanding forms; the rhythm they let him see and feel may be projected in other sensible forms, perhaps even more purely. When he finds a theme that excites him it is because he thinks that in his rendering of it he can endow it with some such quality, which is really a way of feeling. (Langer, 1967)
Langer believes that people feel, that artists have developed a special sensitivity to feeling, and that, when working in an artistic mode, they seek to articulate what they have felt, so that the resulting artwork seems to possess the same quality as the feeling the artist has in mind (remembering that, for Langer, consciousness is a fundamentally embodied process: feeling raised above the “limen of sentience”). Langer stresses that the artist need not have experienced the feeling, but must be capable of imagining it.
Langer distinguishes between what she calls primary and secondary illusions. A primary illusion is what an artform stably presents—so painting, sculpture, and architecture must present virtual space whilst a piece of music must present virtual time. This is the contextual framework within which artistic elements exist. Further primary illusions include virtual powers (dance) and virtual memory (literature). Primary illusions do not come and go, and are not a presentation of gradients; because of this they are not the site of particular interest in most artworks—Langer, for instance, criticises the work of Russian artist Malevich for generating a sense of space in his “magic squares” but nothing else. Secondary illusions, by contrast, present gradients; artworks can function because of the congruence between gradients of feeling and gradients in artworks. Gradients are projected into artworks, and while there are no set rules for how this is done, it is possible to analyse an artwork to see how its effects have been achieved. By stressing the absence of rules of projection, what Langer means is that the results of these analyses cannot be generalised and reapplied—this is one major way in which art images are distinguished from models, which generally do have a single stable rule of projection; the salience of a gradient depends on the artwork. The relationship of secondary illusions to primary illusion is that of feeling to a life of feeling.
Feeling and Form did not find the success of its predecessor, yet it has been mentioned or taught in some aesthetics programmes in the UK and US; perhaps surprisingly, it also seems to have been featured in university aesthetics syllabuses in China and India. Feeling and Form has also been drawn on by philosophers seeking to put forward accounts of particular artforms. Robert Hopkins, for instance, has offered a limited defence of her ideas of virtual kinetic volume in sculpture as found in Feeling and Form.
Philosopher Paul Guyer has suggested, in his sub-chapter on Langer in his History of Modern Aesthetics, that the reason for the neglect of Feeling and Form may be timing; its publication in 1953 coincided with that of Wittgenstein’s Philosophical Investigations, which preoccupied philosophy departments for decades. Accounts of art such as Langer’s, which offered a single function that all artworks were meant to perform (expression, in Langer’s case), were not in keeping with the intellectual fashion for proceduralist theories, such as George Dickie’s institutional theory of art or Arthur Danto’s historical account.
Langer produced two other books in this phase of her career. Problems of Art (1957) is a transcribed and edited collection of Langer’s talks on art to different audiences, general, non-specialist, and technical, and her position on many points is made clearer by the different registers in which she addresses them. In the four years since the publication of Feeling and Form she had also been able to distil many of her ideas into a clearer form. The book also contains a reprint of her important and otherwise difficult-to-find essay from 1951 in honour of Henry Sheffer, ‘Abstraction in Science and Abstraction in Art’.
Secondly, Langer produced Reflections on Art: A Source Book of Writings by Artists, Critics, and Philosophers (1958). This latter book is a collection of writings on art which Langer considered to be both important and otherwise hard to find. Whilst invaluable in tracing influences on Langer’s ideas, Reflections on Art is not particularly helpful as an introductory text because of its focus on, in particular, metaphor and expression, at the expense of a wider survey of writings on art.
6. Theory of Mind
In the first volume of Mind (1967), Langer sets out the problem as she sees it: the mind-body problem resists efforts to solve it because it rests on the faulty premise that mind is metaphysically distinct from body, while behaviourism in psychology has led to an avoidance of the most pressing issues of the discipline. To tackle this, Langer puts forward the thesis, which she planned to substantiate over three volumes, that the whole of animal and human life, including law, the arts, and the sciences, is a development of feeling. The result is a biologically grounded theory of mind, a conceptual structure within which work in the life sciences can be integrated.
Furthermore, Langer claims that it is possible to know the character of the mind by studying the history of art, which shows the development and variety of feeling in its objectified forms. Langer proceeds, then, to first take issue with the ‘idols of the laboratory’—jargon, controlled experiment, and objectivity—claiming that each of these has its place but has held back progress in the life sciences. Claiming that each of these weaknesses is philosophical, Langer argues that scientific knowledge ultimately aims to explain phenomena, and that at a pre-scientific level work is motivated and guided by images which present the phenomenal character of a particular dynamic experience. Images are susceptible to analysis in a way that feeling itself is not. Here Langer calls art symbols a “systematic device whereby observations can be made, combined, recorded and judged, elements distinguished and imaginatively permuted, and, most important, data exhibited and shared, impressions corroborated.” (Langer, 1967) This is material art history seen as a data set, a treasure trove for psychological research.
Langer goes on to explore artistic projection, the artistic idea, and abstraction in art and science before considering living form—that functional artworks need a semblance of livingness, something Aristotle already remarked upon as the single most important characteristic of art. The image of mind that art provides can be used by those studying the mind to test the validity of their accounts.
This then sets up Langer’s discussion of acts and the growth and evolution of acts. Langer coins a new term, pressions, to name the class of relations which hold between acts and situations, such as impression, expression and suppression. Langer sees the evolution of life as fundamentally the evolution of acts, and sees both the dominance of mechanical models and the imputation of an agent such as God or Nature to situations as antithetical to a serious understanding of this process.
The second volume of Mind deals with a single section of her project, ‘The Great Shift’ from animal mentation to human mind. Starting with plankton, Langer considers progressively more complex examples, addressing topics including instinct and the growth of acts. Langer seeks neither to deny animal feeling nor to anthropomorphise animal feeling and behaviour. She draws on Jakob von Uexküll’s idea of the animal ambient—that differing sensory abilities lead different animals to live in different experiential spaces even when they share the same actual space.
Langer discusses the migration of birds and other animals, arguing that animal travels should be seen as round trips, and migration as an elaboration of the same: a round trip with a long stopover. Also discussed are the parent-young relations of dolphins and the alleged use of language by chimpanzees. Langer brings a large amount of empirical material to bear on these issues, before moving on to consider the specialisation of man. Against the argument that the species is a generalist, she contends that Homo sapiens has been successful because of specialisation. Langer considers the shape of the human foot, finding in it no evidence that humans ever lived entirely in trees. The role of the foot in facilitating bipedality, upright posture, and in turn a larger brain is discussed, as is the hand as a sense organ. Langer then stresses a hugely important feature of the human brain: it is able to finish impulses on a virtual level instead of needing to enact them in the actual world. This liberates the brain for conceptual thought.
Langer discusses dreaming and argues that the evidence suggests the brain requires constant activation, which is what has driven its increase in size and function. She then links the biologically grounded model of mentation she has drawn so far with the components of symbolization, discussing how mental abstraction is affected by memory, the origin of imagination, and the origins of speech in expression rather than communication.
Langer claims then that speech is necessary for social organisation and that all natural languages are socially adequate. Langer discusses the dangers of the imaginative capacity of humanity, and the feeling of reality, before discussing morality—a concern she notes is peculiar to man.
The final volume of Mind is not what Langer had planned: she had intended it to culminate in an epistemological theory and a metaphysics, but due to poor health and failing eyesight she left the final section of the book with only a brief outline.
What the third volume accomplishes, however, is to make connections between the model of man as the symbolic animal, which had been achieved by the end of the second volume, and various anthropological data relating to tribes, city states, and other societies. The focus of the third volume is considerably broadened to accommodate symbolic mind in society, and Langer by necessity only offers glimpses into this; Adrienne Dengerink Chaplin calls it a “holistic, biologically based, philosophical anthropology.” (Dengerink Chaplin, 2020)
Langer also offers a view of philosophy of religion, that “even as the power of symbolic thought creates the danger of letting the mind run wild, it also furnishes the saving counterbalance of cultural restraint, the orientating dictates of religion.” (Langer, 1953) A religious community and religious symbols keep a rein on individuation, strengthening social bonds; the loss of these religious frameworks in the modern world is a large part of the disorientation of modern life.
As the trajectory of her intellectual career intersected with Wittgenstein’s at several important junctures, it is of interest that she gives a brief verdict on his Philosophical Investigations: that it is a despairing resort to behaviourism.
7. Political Philosophy and Contribution to the ‘Modern Man’ Discourse
Langer’s contribution to political philosophy has received little attention, and her interest in it is certainly minor compared to her substantial interests in logic, the arts, and theory of mind. It consists of chapters on the structure of society in Philosophy in a New Key and the third volume of Mind, and, most notably, articles on the political danger of outdated symbolism governing societies in ‘The Lord of Creation’ (1944), and on what might be done to tackle the persistence of international war in ‘World Law and World Reform’ (1951).
‘The Lord of Creation’ essentially presents the arguments of Philosophy in a New Key through the lens of political philosophy and sociology. Symbolisation, Langer argues, is the source of the distinctiveness of human society—whilst animals, intelligent or not, live very realistic lives, humans are characteristically unrealistic: “magic and exorcism and holocausts—rites that have no connection with common-sense methods of self-preservation.” (Langer, 1944) This is because, Langer claims, people live lives in which there is a constant dialectic of sensory and imaginative activity, so that fantastic elements permeate our experience of reality: “The mind that can see past and future, the poles and the antipodes, and guess at obscure mechanisms of nature, is ever in danger of seeing what is not there, imagining false and fantastic causes, and courting death instead of life.” (Langer, 1944) This human condition has become a human crisis, according to Langer, because scientific progress has led to such upheavals in human living, especially in terms of the symbols which previously gave a shared context to human life.
Industrialisation, secularisation and globalisation have within two centuries, and in many places less, led to a poverty in the governing symbols available to humanity, according to Langer. People are now living together without overarching societal ties of religion or ethnicity, and are left with the vague notion of nationality to unite them, a concept Langer has little patience for, considering it to be a degraded tribalism:
At first glance it seems odd that the concept of nationality should reach its highest development just as all actual marks of national origins – language, dress, physiognomy, and religion – are becoming mixed and obliterated by our new mobility and cosmopolitan traffic. But it is just the loss of these things that inspires this hungry seeking for something like the old egocentric pattern in the vast and formless brotherhood of the whole earth. (Langer, 1944)
The problem is not merely industrial warfare, for Langer, but industrial warfare at a time when ‘modern man’ is simultaneously symbolically impoverished.
‘World Law and World Reform’ is a densely argued twelve pages; Langer argues that whilst civil war is a failure of institutions, and as such not ineradicable, international war is, by nature, institutional. What she means by this is that the power of nation states is backed up by the threat and use of force, and it is the display and use of this force which enables diplomacy. Langer dismisses the notion of popular demand for war, arguing that it is diplomats—here she lists kings, presidents, premiers, other leading personages and their cabinets—who prepare and make war: “The threat of violence is the accepted means of backing claims in the concert of nations, as suit and judgement are in civil life.” (Langer, 1951)
Langer argues that this situation is the result of an essentially tribal philosophy of government, which did relatively little damage in the past, but which has the potential to end human life on earth since the invention of atomic weapons. Her solution is the creation and empowerment of a world judiciary, which would be invested with the power to adjudicate and enforce its decisions. She acknowledges that the United Nations is the most notable international institution of her era and lists five reforms which would make it suitable to perform the role of this world judiciary: “1) Extend membership to all nations; 2) Make the General Assembly a legislative body with power to adopt a constitution; 3) Give the World Court the power of summons, and make its decisions binding; 4) Set up a high secretariat (or other executive) to administer world interests; 5) Internationalize all armed force, setting up a federal guard (not enlisted by national units) and allowing the several nations national police guards of their own, for domestic use.” (Langer, 1951)
Langer is not optimistic about these steps happening in short order, but she argues that historical parallels exist, and that the steps need not happen in one go and can be worked towards as a far-sighted goal. Her historical parallels are action to combat the Black Death and, later, to end legal child labour. In both of these situations, a pre-existing social malady became intolerable due to social changes which exacerbated it, and it was this which prompted social reform. Langer argues that, similarly, properly constituted world courts could bring an end to international war.
8. Legacy
Because of Langer’s many temporary academic positions, and her focus on research instead of teaching from the mid-1950s onwards, her legacy is mainly to be found in her publications, especially books, rather than in direct influence on students. Having said this, numerous individuals who would go on to be influential in their fields studied with Langer, including artist Eva Hesse and philosopher Arthur Danto. Danto would write the preface to the abridged version of Mind.
Langer herself is a subject of growing interest, with research being undertaken into her life, career, and index card system. The Susanne K. Langer Circle is an international group of scholars with interest in Langer’s work and life and is affiliated with Utrecht University. It hosted the first international conference on Langer’s work in 2022.
Langer’s textbook Introduction to Symbolic Logic was the first introductory book on the subject and made the methods of symbolic logic much more accessible. Randall Auxier has published an updated version of this with many more exercises and expanded discussion.
In philosophy of art, Langer’s ideas on expression have been engaged with by a range of prominent thinkers in the philosophy of music, including Malcolm Budd, Peter Kivy, and Jenefer Robinson. Nelson Goodman’s positions on many issues, in particular those he discusses in Languages of Art (1968), are influenced by Langer’s ideas, something Goodman half acknowledges in his introduction, though Goodman somewhat disingenuously cites Langer directly only as Cassirer’s translator.
Philosopher Brian Massumi has engaged with Langerean thought, particularly her work in Feeling and Form, discussing her ideas on motif and, especially, semblance, writing “Langer has probably gone further than any other aesthetic philosopher toward analyzing art-forms not as “media” but according to the type of experiential event they effect.” (Massumi, 2011) Massumi’s engagement has, in turn, been an influential reference point for younger philosophers encountering Langer’s thought.
Jaron Lanier, the ‘father of virtual reality’, attributes the term ‘virtual world’ to Langer, suggesting that computing and virtual reality pioneer Ivan Sutherland had Feeling and Form-era Langer in mind. The first reference to a virtual world in Langer comes in her discussion of architecture, in particular how one nomadic camp may be set up in the same geographical area where one from another culture used to be; the sense is extremely evocative when considered in light of virtual reality:
A place, in this non-geographical sense, is a created thing, an ethnic domain made visible, tangible, sensible. As such it is, of course, an illusion. Like any other plastic symbol, it is primarily an illusion of self-contained, self-sufficient, perceptual space. But the principle of organization is its own: for it is organized as a functional realm made visible—the center of a virtual world, the “ethnic domain,” and itself a geographical semblance. (Langer, 1953)
Lanier made the change from virtual world to virtual reality, but the fundamental notion is Langerean. Pioneering media theorist Marshall McLuhan similarly seems to have had Langer in mind, occasionally citing her, when considering how media reshape and reconstitute experience (again, the quotation above is suggestive when considered alongside McLuhan’s famous dictum “the medium is the message”).
In neuroscience, several notable figures have referred in print approvingly to Langer’s ideas on feeling, including Jaak Panksepp, Gerald Edelman, and Antonio Damasio. The latter refers to his notion of background feeling as being what Langer describes, though he arrived at it independently. In psychology, Fred Levin writes that Langer anticipated by decades the notion of feeling that the biosciences would adopt.
9. References and Further Reading
a. Primary Sources
A Logical Analysis of Meaning, doctoral thesis, Radcliffe College, 1926.
Unpublished thesis making the case for an expanded understanding of meaning which includes religion and the arts; it argues that philosophy is the clarification of concepts.
The Practice of Philosophy. New York: Henry Holt, 1930.
Explains Langer’s perspective on what it is to do philosophy, and its distinction from and relation to other fields, including science, mathematics and logic, and art.
An Introduction to Symbolic Logic. New York: Allen and Unwin, 1937. Second revised edition, New York, Dover, 1953.
Textbook aiming to take beginners to the point of being able to tackle Russell and Whitehead’s Principia Mathematica.
Philosophy in a New Key: A Study in the Symbolism of Reason, Rite and Art. Cambridge, MA: Harvard University Press, 1942.
Langer’s most influential book—drawing together researches in psychology, art, ritual, language and logic to claim that there had been a recent philosophical shift predicated on an expanded awareness of the symbolic.
‘The Lord of Creation’. Fortune Magazine (1944).
Popular treatment discussing how the power of symbolisation is both a strength and source of the precariousness of human society.
‘Abstraction in Science and Abstraction in Art’. In Structure, Method and Meaning: Essays in Honor of Henry M. Sheffer, edited by Paul Henle, Horace M. Kallen and Susanne K. Langer, 171–82. New York: Liberal Arts Press, 1951.
Defends the thesis that scientific abstraction concerns generalisation whilst artistic abstraction specifies unique objects which are forms of feeling.
‘World Law and World Reform’. The Antioch Review, Vol. 11, No. 4 (Winter, 1951).
Langer’s most sustained political philosophy defending the implementation of empowered world courts.
Feeling and Form: A Theory of Art Developed from Philosophy in a New Key. New York: Charles Scribner’s, 1953.
An Expressivist theory of art which discusses numerous artforms in detail before generalising the conclusions.
Problems of Art: Ten Philosophical Lectures. New York: Charles Scribner’s, 1957.
Accessible collection of lectures to different audiences on art topics.
Reflections on Art: A Source Book of Writings by Artists, Critics, and Philosophers. Editor. Baltimore, MD: Johns Hopkins University Press, 1958.
Langer’s choice of aesthetics readings with introduction.
b. Secondary Sources
Auxier, Randall E. ‘Susanne Langer on Symbols and Analogy: A Case of Misplaced Concreteness?’ Process Studies 26 (1998): 86–106.
Suggests a modification to Langer’s account of symbols and considers this part of her account in relation to that of both Whitehead and Cassirer.
Auxier, Randall E. Logic: From Images to Digits. Ronkonkoma: Linus Learning, 2021.
An accessible and updated version of Langer’s symbolic logic, separating it from the implied metaphysics of the original.
Browning, Margaret M. ‘The Import of Feeling in the Organization of Mind’ in Psychoanalytic Psychology, Vol. 33, No. 2 (2016), pp. 284–298.
Pursues and defends a Langerean view of feeling from a neuroscientific perspective.
Browning, Margaret M. ‘Our Symbolic Minds: What Are They Really?’ in The Psychoanalytic Quarterly, Vol. 88, No. 1 (2019), pp. 25–52.
Discusses intersubjectivity from a Langerean perspective.
Budd, Malcolm. Music and the Emotions. London: Routledge, 1985.
A serious critique of Langer’s musical aesthetics.
Dengerink Chaplin, Adrienne. The Philosophy of Susanne Langer: Embodied Meaning in Logic, Art, and Feeling, London: Bloomsbury Academic, 2019.
Monograph on Langer with a particular focus on the influence of Wittgenstein, Whitehead, Sheffer and Cassirer on the development of Langer’s thought.
Dryden, Donald. ‘The Philosopher as Prophet and Visionary: Susanne Langer’s Essay on Human Feeling in the Light of Subsequent Developments in the Sciences’. Journal of Speculative Philosophy 21, No. 1 (2007): 27–43.
A brief summary of some of the applications of Langer’s theory of mind with a view to defending the applicability of the Langerean view.
Dryden, Donald. ‘Susanne Langer and William James: Art and the Dynamics of the Stream of Consciousness’. Journal of Speculative Philosophy, New Series, 15, No. 4 (1 January 2001): 272–85.
Traces commonalities and distinctions in the ideas on thinking and feeling of James and Langer.
Dryden, Donald. ‘Whitehead’s Influence on Susanne Langer’s Conception of Living Form’, Process Studies 26, No. 1–2 (1997): 62–85.
A clear account of what Langer does and does not take from Whitehead particularly concerning act form.
Gaikis, L. (ed.) The Bloomsbury Handbook of Susanne K. Langer. London: Bloomsbury Academic, 2024.
Featuring an extensive collection of major scholars on Langer, this book elucidates her transdisciplinary connections and insights across philosophy, psychology, aesthetics, history, and the arts.
Ghosh, Ranjan K. Aesthetic Theory and Art: A Study in Susanne K. Langer. Delhi: Ajanta Books International, 1979.
Doctoral dissertation on Langer which takes the unusual step, in an appendix, of applying her theories to specific artworks.
Criticizes and offers a limited defence of Langer’s notion of virtual kinetic volume in sculpture.
Innis, Robert E. Susanne Langer in Focus: The Symbolic Mind. Bloomington: Indiana University Press, 2009.
The first English-language monograph on Langer; particularly helpful in locating Langer in relation to the pragmatist tradition.
Lachmann, Rolf. Susanne K. Langer: die lebendige Form menschlichen Fühlens und Verstehens. Munich: W. Fink, 2000.
The first monograph on Langer (German language).
Lachmann, Rolf. ‘From Metaphysics to Art and Back: The Relevance of Susanne K. Langer’s Philosophy for Process Metaphysics’. Process Studies, Vol. 26, No. 1–2 (Spring-Summer 1997): 107–25.
English-language summary by Lachmann of his above book.
Massumi, B. Semblance and Event: Activist Philosophy and the Occurrent Arts. Cambridge, MA: The MIT Press, 2011.
An aesthetics of interactive art, ephemeral art, performance art, and art intervention. The titular semblance is Langerean and the early part of the book features an extended discussion on and from ideas taken from Feeling and Form.
Nelson, Beatrice K. ‘Susanne K. Langer’s Conception of “Symbol” – Making Connections through Ambiguity’. Journal of Speculative Philosophy, New Series 8, No. 4 (1 January 1994): 277–96.
Considers what is involved and at stake in Langer’s synthetic project.
Reichling, Mary. ‘Susanne Langer’s Concept of Secondary Illusion in Music and Art’. Journal of Aesthetic Education 29, No. 4 (1 December 1995): 39–51.
Opening up of the philosophical discussion on secondary illusions with reference to specific works and art criticism.
Sargeant, Winthrop. ‘Philosopher in a New Key’. New Yorker, 3 December 1960.
New Yorker profile on Langer.
Saxena, Sushil. Hindustani Sangeet and a Philosopher of Art: Music, Rhythm, and Kathak Dance Vis-À-Vis Aesthetics of Susanne K. Langer. New Delhi: D. K. Printworld, 2001.
Applies Langerean aesthetics to a type of music Langer did not discuss.
Schultz, William. Cassirer and Langer on Myth: An Introduction. London: Routledge, 2000.
Discussion of literary myths in Cassirer and Langer, both commonalities and distinctions in their positions.
van der Tuin, Iris. ‘Bergson before Bergsonism: Traversing ‘Bergson’s Failing’ in Susanne K. Langer’s Philosophy of Art’. Journal of French and Francophone Philosophy 24, No. 2 (1 December 2016): 176–202.
Considers Feeling and Form in relation to the philosophy and reception of Henri Bergson.
For Aristotle, human life is marked by special varieties of knowledge and understanding. Where other animals can only know that things are so, humans are able to understand why they are so. Furthermore, humans are the only animals capable of deliberating in a way that is guided by a conception of a flourishing life. The highest types of human knowledge also differ in having an exceptional degree of reliability and stability over time. These special types of knowledge constitute excellences of the soul, and they allow us to engage in characteristic activities that are integral to a good human life, including the study of scientific theories and the construction of political communities.
Aristotle’s central interest in epistemology lies in these higher types of knowledge. Among them, Aristotle draws a sharp division between knowledge that aims at action and knowledge that aims at contemplation, valuing both immensely. He gives a theory of the former, that is, of practically oriented epistemic virtues, in the context of ethics (primarily in the sixth book of the Nicomachean Ethics [Nic. Eth.], which is shared with the Eudemian Ethics [Eud. Eth.]), and he gives a theory of the latter both there and in the Posterior Analytics [Post. An.], where the topic of epistemology is not sharply distinguished from the philosophy of science. Lower types of knowledge and other epistemically valuable states are treated piecemeal, as topics like perception, memory and experience arise in these texts as well as in psychological, biological, and other contexts.
Although Aristotle is interested in various forms of error and epistemic mistakes, his theory of knowledge is not primarily a response to the possibility that we are grossly deceived, or that the nature of reality is radically different from the way we apprehend it in our practical dealings and scientific theories. Instead, Aristotle takes it for granted that we, like other animals, enjoy various forms of knowledge, and sets out to enumerate their diverse standards, objects, purposes and relative value. He emphasizes the differences among mundane forms of knowledge such as perception and higher forms such as scientific theorizing, but he also presents an account on which the latter grows organically out of the former. His pluralism about knowledge and his sensitivity to the different roles various forms of knowledge play in our lives give his theory enduring relevance and interest.
Knowledge in a broad sense (gnōsis, from whose root the word “knowledge” derives; sometimes also eidenai) is enjoyed by all animals from an early stage in their individual development (Generation of Animals [Gen. An.] I 23, 731a30–4). In Aristotle’s usage, it includes everything from a worm’s capacity to discriminate hot and cold to the human ability to explain a lunar eclipse or contemplate the divine (for representative usages, see Post. An. I 1, 71a1–2; II 8, 93a22; II 19, 99b38–9). However, Aristotle shows comparatively little interest in knowledge in this broad sense. The Aristotelian corpus has no surviving treatise devoted to knowledge in all generality, and there is no evidence that Aristotle ever authored such a text. His main interest is in more specific kinds of knowledge. Nevertheless, a few features of Aristotle’s view regarding knowledge in general deserve comment.
First, it is relatively clear that he takes gnōsis to be at least factive (although this is disputed by Gail Fine). That is, if someone (or some animal) has gnōsis that something is the case, then that thing is true. Plausibly, Aristotle takes gnōsis to be not only true cognition, but cognition that is produced by a faculty like perception which reliably yields truths. This makes it tempting to compare Aristotle’s general view of knowledge with contemporary forms of reliabilism such as Ernest Sosa’s or John Greco’s, though the reliability of gnōsis is not a point Aristotle stresses.
Second, Aristotle also treats most kinds of knowledge as relatives (Physics [Phys.] VII 3, 247b1–3). A relative, in Aristotle’s metaphysical scheme, is an entity which is essentially of something else (Cat. 7, 6a37). One example is that of a double, since a double is essentially the double of something else (Cat. 7, 6a39–b1). Likewise, knowledge is essentially knowledge of something-or-other (Cat. 7, 6b5), be it an external particular (De Anima [De An.] II 5, 417b25–7), a universal within the soul (De An. II 5, 417b22–3; compare Phys. VII 3, 247b4–5, 17–18), or the human good (Nic. Eth. VI 5, 1140a25–8). It is fundamental to Aristotle’s way of thinking about knowledge that it is in this way object directed, where the notion of object is a broad one that includes facts, particulars, theories and ethical norms. Aristotle frequently characterizes different types of knowledge by the types of objects they are directed at.
Third, for Aristotle, knowledge generally builds upon itself. In many cases, learning amounts to reconceiving the knowledge we already have, or coming to understand it in a new way (Post. An. I 1, 71b5–8). Further, Aristotle notes that the knowledge we gain when we learn something is often closely connected to the knowledge we must already have in order to learn it. For instance, in order to gain a proper geometrical understanding of why a given triangle has internal angles that sum to 180 degrees, we must already know that triangles in general have this angle sum and know that this particular figure is a triangle; but then it may be asked: what is this, if not already knowing that the particular triangle has this angle sum (Post. An. I 1, 71a19–27; cf. Pr. An. II 21, 67a12–22)? Likewise, in order to arrive at knowledge of what something is, that is, of its definition, we must perform an inquiry that involves identifying and scrutinizing things of the relevant kind. That requires knowing that the relevant things exist; but how can we identify these things if we do not know what defines instances of that kind (Post. An. II 7, 92b4–11; II 8, 93a19–22)?
Aristotle identifies such questions with a famous puzzle raised in Plato’s Meno: how can we search for anything that we do not already know (Post. An. I 1, 71a29; compare Pr. An. II 21, 67a21–2)? Either we already know it, in which case we do not need to look for it, or we do not know it, in which case we do not know what we are seeking to learn and we will therefore not recognize it when we have found it (Meno 80e).
As David Bronstein and Gail Fine have shown, much of Aristotle’s epistemology is structured around this challenge. Aristotle is confident that we can distinguish the prior knowledge required for various types of learning from what we seek to learn; hence, for Aristotle, the puzzle in the Meno amounts to a challenge to articulate what prior knowledge various kinds of learning depend upon. The picture of learning and inquiry we get from Aristotle is, consequently, a thoroughly cumulative one. Typically, we learn by building on and combining what we already know rather than going from a state of complete ignorance to a state of knowledge. Aristotle is concerned to detail the various gradations in intellectual achievement that exist between mundane knowledge and full scientific or practical expertise.
This approach, however, raises a different worry. If we can only gain knowledge by building on knowledge we already have, then the question arises: where does our learning begin? Plato’s answer, at least as Aristotle understands it, is that we have innate latent beliefs in our souls which we can recollect and hence come to know (Post. An. II 19, 99b25–6). Aristotle rejects this view, taking it to require, implausibly, that we have more precise cognitive states in us than we are aware of (Post. An. II 19, 99b26–7). Instead, he adverts to perception as the type of knowledge from which higher cognitive states originate (Post. An. II 19, 99b34–5; cf. Met. I 1, 980a26–7). At least the most rudimentary types of perception allow us to gain knowledge without drawing on any prior knowledge. Thus, for Aristotle, everything learned (both for us and for other animals) starts with perception, such that any lack in perception must necessarily result in a corresponding lack in knowledge (Post. An. I 18, 81a38–9). Depending on the intellectual capabilities of a given animal, perception may be the highest type of knowledge available, or the animal may naturally learn from it, ascending to higher types of knowledge from which the animal can learn in turn (Post. An. II 19, 99a34–100a3; Met. I 1, 980a27–981b6).
2. Perception
For Aristotle, perception is a capacity to discriminate that is possessed by all human and non-human animals (Post. An. II 19, 99b36–7; De An. II 2, 413b2; Gen. An. I 23, 731a30–4), including insects and grubs (Met. I 1, 980a27–b24; De An. II 2, 413b19–22). Every animal possesses at least the sense of touch, even though some may lack other sensory modalities (De An. II 2, 413b8–10, 414a2–3). Each sense has a proper object which only that perceptual modality can detect as such (De An. II 6, 418a9–12): color for sight, sound for hearing, flavor for taste, odor for smell and various unspecified objects for touch (De An. II 6, 418a12–14). For Aristotle, perception is not, however, limited to the proper objects of the sensory faculties. He allows that we and other animals also perceive a range of other things: various common properties which can be registered by multiple senses, such as shape, size, motion and amount (De An. II 6, 418a17–18), incidental objects such as a pale thing or even the fact that the pale thing is the son of Diares (De An. II 6, 418a21), and possibly even general facts such as that fire is hot (Met. I 1, 981b13; but see below).
Aristotle holds that we are never, or at least only very rarely, in error about the proper objects of perception, like color, sound, flavor, and so on (De An. II 6, 418a12; De An. III 3, 428b18–19). We are, however, regularly mistaken about other types of perceptual objects (De An. III 3, 428b19–25). While I can be mistaken, for instance, about the identity of the red thing I am perceiving (Is it an ember? Is it a glowing insect? Is it just an artifact of the lighting?), I am usually not mistaken that I am seeing red. In Aristotle’s language, this is to say that I am more often in error regarding the incidental objects of perception (De An. III 3, 428b19–22). The common objects of perception are, in his view, even more prone to error (De An. III 3, 428b22–5); for example, I can easily misperceive the size of the red thing or the number of red things there are.
Aristotle gives an account of the way perception works which spans physiology, epistemology and philosophy of mind. In order for perception to occur, there must be an external object with some quality to be perceived and a perceptual organ capable of being affected in an appropriate way (De An. II 5, 417b20–1, 418a3–5). Aristotle posits that each sense organ is specialized and can only be affected in specific ways without being harmed. This explains both why different sensory modalities have different proper objects and why overwhelming stimuli can disable or damage these senses (De An. II 12, 424a28–34; III 2, 426a30–b3; III 13, 435b4–19). Perception takes place when the sensory organ is altered within its natural bounds, in such a way as to take on the sensible quality of the object perceived. In this way, the perceptual organ takes on the sensible form of the object without its matter (De An. II 12, 424a17–19). Much debate has revolved around whether Aristotle means that the organ literally takes on the sensible property (whether, for instance, the eye literally becomes red upon seeing red), or whether it does so only in some metaphorical or otherwise attenuated sense.
Some animals, Aristotle holds, have no other form of knowledge except perception. Such animals, in his view, only have knowledge when they are actually perceiving (Post. An. II 19, 99b38–9); they know only what is present to them when their perceptual capacities are in play. The same holds for human perceptual knowledge. If we can be said to have knowledge on account of our merely perceiving something, then this is knowledge we have only at the time when this perception is occurring (Pr. An. II 21, 67a39–67b1; compare Met. Ζ 15, 1039b27–30). A person has, for instance, perceptual knowledge that Socrates is sitting only when actually perceiving Socrates in a seated position. It follows that we cease to have this knowledge as soon as we cease to perceive the thing that we know by perception (Nic. Eth. VI 3, 1139b21–22; Topics [Top.] V 3, 131b21–22).
For Aristotle, this represents a shortcoming of perceptual knowledge. Perceptual knowledge is transitory or unstable in a way that knowledge ideally is not, since knowledge is supposed to be a cognitive state which we can rely upon (Categories [Cat.] 8, 8b27–30; Posterior Analytics [Post. An.] I 33, 89a5–10). Perception is also lacking as a form of knowledge in other ways. Higher types of knowledge confer a grasp of the reasons why something is so, but perception at best allows us to know that something is so (Metaphysics [Met.] I 1, 981b12–13; Post. An. I 31, 88a1–2). The content of perception is also tied to a particular location and time: what I perceive is that this thing here has this property now (Post. An. I 31, 87b28–30). Even if the content of my perception is a fact like the fact that fire is hot (rather than that this fire is hot), a perceptual experience cannot, according to Aristotle, tell me that fire is in general hot, since that would require me to understand why fire is hot (Met. I 1, 981b13).
Hence, while knowledge begins with perception, the types of knowledge which are most distinctively human are the exercise of cognitive abilities that far surpass perception (Gen. An. I 23, 731a34–731b5). In creatures like us, perception ignites a curiosity that prompts repeated observation of connected phenomena and leads us through a series of more demanding cognitive states that ideally culminate in scientific knowledge or craft (Met. I 1, 980a21–27; Post. An. I 31, 88a2–5; Post. An. II 19, 100a3–b5). The two most important of these intermediate states are memory and experience. Let us turn to these, before considering the types of knowledge that Aristotle considers to be virtues of the soul.
3. Memory
For Aristotle, perception provides the prior knowledge needed to form memories. The capacity to form memories allows us to continue to be aware of what we perceived in the past once the perceived object is no longer present, and thus to enjoy forms of knowledge that do not depend on the continued presence of their objects. Learning from our perceptions in order to form memories thus constitutes an important step in the ascent from perception to higher types of knowledge. With the formation of a memory, we gain epistemic access to the contents of our perceptions that transcends the present moment and place.
Aristotle distinguishes memory from recollection. Whereas recollection denotes an active, typically conscious “search” (On Memory and Recollection [De Mem.] 453a12), memory is a cognitive state that results passively from perception (De Mem. 453a15). In order to form a memory, the perceived object must leave an impression in the soul, like a stamp on a tablet (De Mem. 450a31–2). This requires the soul to be in an appropriate receptive condition, a condition which Aristotle holds to be absent or impaired in both the elderly and the very young (De Mem. 450a32–b7). Aristotle however denies that a memory is formed simultaneously with the impression of the perceived object, since we do not remember what we are currently perceiving; we have memories only of things in the past (De Mem. 449b24–26). He infers that there must be a lapse of time between the perceptual impression and the formation of a memory (De Mem. 449b28, 451a24–5, 29–30).
The fact that memory requires an impression raises a puzzle, as Aristotle notices: if perception is necessarily of a present object, but only an impression left by the object is present in our memory, do we really remember the same things that we perceive (De Mem. 450a25–37, 450b11–13)? His solution is to introduce a representational model of memory. The impression formed in us by a sensory object is a type of picture (De Mem. 450a29–30). Like any picture, it can be considered either as a present artifact or as a representation of something else (De Mem. 450b20–5). When we remember something, we access the representational content of this impression-picture. Memory thus requires a sensory impression, but it is not of the sensory impression; it is of the object this impression depicts (De Mem. 450b27–451a8).
While the capacity to form memories represents a cognitive advance over perception thanks to its cross-temporal character, memory is still a rudimentary form of knowledge, which Aristotle takes not to belong to the intellect strictly speaking (De An. I 4, 408b25–9; De Mem. 450a13–14). Memories need not possess any generality (although Aristotle does not seem to rule out the possibility of remembering generalizations), nor does memory as such tell us the reasons why things are so. A more venerable cognitive achievement than memory which, however, still falls short of full scientific knowledge or craft, is what Aristotle calls “experience” (empeiria).
4. Experience
Memories constitute the prior knowledge required for experience, which we gain by consciously or unconsciously grouping memories of the same thing (Post. An. II 19, 100a4–6; Met. I 1, 980b28–981a1). The type of knowledge we gain in experience confers practical success; in some cases, the practical efficacy (which is not to say the overall value) of this type of knowledge surpasses that of scientific knowledge (Met. I 1, 981a12–15). Aristotle emphasizes the pivotal role of experience in the acquisition of knowledge of scientific principles (Pr. An. I 30, 46a17–20; Post. An. II 19, 100a6), but he considers the proper grasp of scientific principles to be a strictly different (and more valuable) kind of knowledge.
Experience thus sits mid-way between the awareness of the past we enjoy by way of memory and the explanatory capacity we have in scientific knowledge. Aristotle’s characterization of the content of the knowledge we have in experience has given rise to divergent interpretations. He contrasts experience with the “art” that a scientifically informed doctor has as follows:
[T]o have a judgment that when Callias was ill of this disease this did him good, and similarly in the case of Socrates and in many individual cases, is a matter of experience; but to judge that it has done good to all persons of a certain constitution, marked off in one class, when they were ill of this disease, e.g. to phlegmatic or bilious people when burning with fever–this is a matter of art. (Met. I 1, 981a7–12, trans. Ross)
On one traditional reading, the contrast Aristotle wishes to draw here concerns the generality of what one knows in experience and in scientific knowledge respectively. A person with scientific knowledge knows a universal generalization (for example, “all phlegmatic people are helped by such-and-such a drug when burning with a fever”), whereas a person with experience knows only a string of particular cases which fall under this generalization (“Socrates was helped by such-and-such a drug when burning with a fever”, “Callias was helped by such-and-such a drug when burning with a fever”, and so on).
What distinguishes the experienced person from someone who merely remembers these things, however, is that the memories of the experienced person are grouped or connected (Post. An. II 19, 100a4–6; Met. I 1, 980b28–981a1). Precisely what this grouping or connection comes to is not made clear by the text, but one point suggested by the passage above is that it allows one to competently treat new cases by comparison with old ones. An experienced person would thus, in this example, be able to prescribe the correct drug if, for instance, Polus should arrive with a fever and be of the relevant constitution to benefit from it. The experienced person will do this, however, by comparing Polus with Socrates and Callias, not by means of an explicit grasp of the universal generalization that all phlegmatic people benefit from this drug when suffering from a fever (or even that most of them do). The person with experience thus has a capacity to generalize, but not yet any explicit grasp of the underlying generalization.
One problem for this reading is that outside of this passage Aristotle describes generalizations, even scientifically accurate ones, as things known by experience. In particular, Aristotle describes scientific explananda as things known by experience, where these are taken to be general facts like the fact that round wounds heal more slowly (Post. An. I 13, 79a14–16 with Met. I 1, 981a28–30; Historia Animalium [Hist. An.] VIII 24, 604b25–7; Pr. An. I 30, 46a17–27; and, possibly, Post. An. II 19, 100a3–8). According to Pieter Sjoerd Hasper and Joel Yurdin, the content of experience does not differ from that of scientific knowledge in being any less general or less accurate. Instead, what one has experience of is fully precise scientific facts, but what one lacks is a grasp of their causes. On this view, Aristotle’s point in the passage quoted is that someone with experience knows that a certain treatment is effective for all feverish patients who are phlegmatic, but the person does not know why. Experience thus gives one knowledge of scientific explananda; further inquiry or reflection is however needed to have properly scientific knowledge, which requires a grasp of the causes of what one knows by experience.
On either of these interpretations, experience can be seen to contribute a further dimension to the temporal reach of our knowledge. Where memory allows us to retain perceptual knowledge, and thus extends our knowledge into the past, experience extends our knowledge into the future. A person with experience has not only a retrospective grasp of what has cured certain patients; this person has learned from this knowledge what will cure (or is likely to cure) the next patient with the relevant malady, either by direct comparison with previous cases or by grasping the relevant generalization. Since experience presupposes memory, an experienced person has knowledge whose reach extends both backward and forward in time.
5. Knowledge as an Intellectual Virtue
A virtue, for Aristotle, is a particular respect in which a thing is excellent at being what it is or doing what it is meant to do. If, with Aristotle, we suppose that not only our characters but also our intellects can be in better or worse conditions, it makes sense to talk about virtues of intellect as well as virtues of character.
Unlike contemporary virtue epistemologists, who tend to identify knowledge as a type of success issuing from intellectual virtues, Aristotle directly identifies the most desirable types of knowledge with certain intellectual virtues. This has an important effect on his epistemology. A virtue is a kind of stable condition, something a person is qualified with over a period of time rather than (primarily) a thing a person may be said to have or lack on a given occasion. The identification of the highest types of knowledge with virtues thus leads Aristotle to think of these kinds of knowledge as abilities. The relevant abilities include not just practical ones (like building a house) but also purely intellectual abilities, most importantly the ability to contemplate.
Since intellectual virtues must be stable states of the intellect, the best types of knowledge are also those that are difficult to acquire and, conversely, cannot be easily lost or forgotten (Cat. 8, 8b26–9a10). This does not hold of memories (which are easily formed and routinely forgotten) and even less so of perceptual knowledge (which, as we have seen, is for Aristotle a type of knowledge we have just when we are actually perceiving). Only the type of knowledge that is the outcome of protracted instruction or research counts as knowledge in the sense of a virtue (Nic. Eth. VI 7, 1141a18–22). Further, Aristotle thinks we only have this type of knowledge of necessary generalizations which belong to an axiomatizable theory, on the one hand, and of practically pertinent generalizations together with particular facts about their implementation, on the other. His reasons for this view are connected with his division of the human soul.
a. The Division of the Soul
Aristotle takes the human soul to have distinct parts corresponding to our various capacities. He divides the soul first into a rational and a non-rational part. The non-rational part of the soul accounts for the capacities we share with other animals. This part of the soul is divided into a vegetative part, which represents capacities for growth and nutrition, and a part representing the capacities we share with other animals but not with plants.
The rational part of our soul accounts for those capacities by which we seek to grasp truth, capacities which Aristotle takes to be limited to humans and the divine. By “truth”, Aristotle means both the theoretical truth of things that hold independently of us and the practical “truth” of an action or intention that accords with our rational desires (Nic. Eth. VI 2, 1139a26–31). Accordingly, Aristotle divides the rational soul into a calculative and a scientific part corresponding to the different types of truth we seek to grasp (Nic. Eth. VI 1, 1139a6–15; compare Pol. VII 14, 1333a24–5). The calculative part of the rational soul is responsible for the cognitive component of our practical deliberation, while the scientific part of the soul is responsible for our grasp of what we seek to know for its own sake.
Each part of the soul can, for Aristotle, be in a better or a worse condition. Aristotle notes that this also holds of the nutritive part and the capacities for perception, but he shows little interest in the perfection of these capacities in normative contexts, since they are not distinctively human (Nic. Eth. I 7, 1097b33–5; I 13, 1102a32–b3). Perfecting the non-rational part of the soul is, for humans, to acquire virtues of character, such as courage, temperance and magnanimity. These are acquired, if at all, through a process of habituation beginning in childhood (Nic. Eth. II 1, 1103a25–6, b23–5). To perfect the rational part of the soul, on the other hand, is to acquire what Aristotle calls the “intellectual virtues” (Nic. Eth. I 13, 1103a3–5; VI 1, 1138b35–39a1). Such virtues are also acquired only gradually and over a long period of time, but “mostly as a result of instruction” (didaskalia, Nic. Eth. II 1, 1103a15) rather than habituation.
In addition to taking the calculative and the scientific parts of the soul to be concerned with practical and theoretical truth respectively, Aristotle also distinguishes them according to the modal statuses of the truths that they grasp. The virtues of the calculative part of the soul are excellences for grasping truth concerning what is contingent or can be otherwise (Nic. Eth. VI 1, 1139a8), whereas the virtues of the scientific part of the soul concern “things whose principles cannot be otherwise” (Nic. Eth. VI 1, 1139a7–8). This careful formulation leaves open the possibility that the scientific soul may grasp contingencies so long as the things about which it grasps these contingencies have principles which are necessary. There are, for instance, necessary principles which govern the eclipse of the moon, so that one can have scientific knowledge of the eclipse of the moon even though the moon is not always or necessarily eclipsed (Post. An. I 8, 75b33–6; compare I 31, 87b39–88a5; II 8, 93a35–93b3). Aristotle, however, tends to treat such cases as secondary, taking the primary objects of the scientific part of the soul to be strict and exceptionless necessities.
Aristotle takes there to be different intellectual capacities devoted to the grasp of truths of differing modal statuses for a variety of reasons. On the one hand, he thinks of action as the manipulation of truth. If I fashion some planks of wood into a table, I am making it true that these planks are a table (which, before I begin, is false). It follows that intellectual capacities that are directed towards action must have contingent truths as their objects, since if something cannot be otherwise, then a fortiori it cannot come to be otherwise by someone’s agency (Nic. Eth. VI 2, 1139a36–b11).
Conversely, Aristotle takes only necessary truths to be appropriate objects for the form of knowledge that pertains to the scientific part of the soul. The best condition for this part of the soul is one that allows someone to contemplate the truth freely and at will (De An. II 5, 417b23–5). This means that it ought not to need to monitor, intervene in or otherwise “check on” how things stand in the world with respect to what we know. Aristotle thinks that if one could have this sort of knowledge of a contingent state of affairs, then this state of affairs might change without our awareness, pulling the rug, as it were, out from under our knowledge (Nic. Eth. VI 3, 1139b21–2). For instance, if I could have scientific knowledge that Socrates is sitting, and Socrates gets up without me noticing, then I would suddenly no longer know that Socrates is sitting (since it would no longer be true that Socrates is sitting). Hence, if my knowledge is guaranteed to remain knowledge just by me having learned it in the appropriate way, then what I know must be a state of affairs that does not change, and this will be so if scientific knowledge is of necessities.
b. Scientific Virtues
i. Theoretical Wisdom
Wisdom (sophia) is Aristotle’s name for the best condition of the scientific part of the soul (Nic. Eth. VI 1, 1139a16; compare Met. I 2, 983a9–10) and the “most precise of the kinds of knowledge” (Nic. Eth. VI 7, 1141a17–8). This is the state that we are in when our soul grasps the best objects in the universe in the most intellectually admirable way, enabling us to contemplate these objects with total comprehension (Nic. Eth. VI 7, 1141a20–1, 1141b2–8; X 7, 1177a32–b24; Met. I 1, 981b25–982a3). In the best objects, Aristotle surely intends to include God (Met. I 2, 983a4–5) and possibly also the celestial bodies or other things studied in the books of the Metaphysics. He makes clear that humans and their polities are not among these most venerable things: we are in his view plainly not “the best thing there is in the universe” (Nic. Eth. VI 7, 1141a21). Humans and their goals may be the most fitting objects of practical knowledge (Nic. Eth. VI 7, 1141b4–15), but there are better things to contemplate.
To this extent, theoretical wisdom is a distinctively disinterested type of knowledge. It is the limiting case of the type of knowledge we seek when we want to understand something for its own sake rather than because it benefits us or has practical utility, and Aristotle associates it strongly with leisure (scholē) (Met. I 1, 982a14–16; Nic. Eth. VI 12; Politics [Pol.] VII 14, 1333a16–b5). This does not mean, however, that it is neutral with respect to its ethical value. On the contrary, Aristotle takes the person with superlative wisdom to be “superlatively happy” (Nic. Eth. X 8, 1179a31), and the pursuit of theoretical wisdom is undoubtedly a central component of the good life in his view.
Aristotle also holds that wisdom can be practically advantageous in more mundane ways. He recounts a story in which Thales put his philosophical knowledge to work to predict an excellent olive crop, then amassed a fortune by buying up all of the oil presses and loaning them out at a profit (Pol. I 11, 1259a5–23). Yet he stresses that sophia is neither for the sake of such practical advantages (Nic. Eth. VI 7, 1141b2–8) nor does it require its possessor to be practically wise (Nic. Eth. VI 7, 1141b20–1). He depicts Thales as amassing this wealth to show that “philosophers could easily become wealthy if they wished, but this is not their concern” (Pol. I 11, 1259a15–18).
The best kind of theoretical knowledge has, for Aristotle, the structure of an axiomatic science. One has the best theoretical orientation towards the world when one grasps how each fact of a science concerning the highest things follows from the principles of the highest things (Met. I 2, 982a14–16). Wisdom thus divides into two components, scientific knowledge of certain principles (nous) and the type of scientific knowledge that consists in grasping a scientific proof or “demonstration” issuing from these principles (Nic. Eth. VI 7, 1141a17–20). Someone with the virtue of wisdom understands why the basic principles of theology (or whatever science deals with the best things) are the basic principles of that science, and is also able to prove, in axiomatic fashion, every other theorem in that science on the basis of these principles.
While wisdom is for Aristotle the best kind of theoretical knowledge, he does not hold that this sort of knowledge ought to form a foundation for all other kinds of knowledge or even all other scientific knowledge. This is because he holds that each kind of thing is only properly understood when we understand it according to its own, specific principles (Post. An. I 2, 71b23–25, 72a6; I 6, 74b24–26; I 7; Met. I 3, 983a23–25; Phys. I 1, 184a1–15). Knowledge of the first principles of the highest science might give someone a general understanding of a range of other things (Met. I 2, 982a7–10, 23–24; Nic. Eth. VI 7, 1141a12–15)—it might explain, for instance, why animals move at all by saying that they move in imitation of divine motion—but this sort of general understanding is, for Aristotle, no substitute for the specific kind of understanding we have when we grasp, for example, the mechanics of a particular animal’s motion or the function of this motion in its peculiar form of life.
For this reason, Aristotle takes each scientifically explicable domain to be associated with its own dual virtues of demonstrative and non-demonstrative scientific knowledge. The virtues of demonstrative and non-demonstrative knowledge are, therefore, not characteristics which a person can be said to simply have or to lack in general. Instead, someone might possess the virtues of scientific knowledge with respect to, say, geometry and lack them with respect to, say, human physiology. While one type of scientific knowledge might assist in the acquisition of another, and perhaps even provide some of its principles (Post. An. I 7, 75b14–17; compare I 9, 76a16–25), Aristotle insists that there is a different virtuous state associated with each distinct scientific domain (Nic. Eth. VI 10, 1143a3–4). He does, however, take all such virtues to share a common axiomatic structure, which he lays out in the Posterior Analytics in the course of giving a theory of demonstration.
ii. Demonstrative Knowledge
A demonstration (apodeixis), for Aristotle, is a deductive argument whose grasp imparts scientific knowledge of its conclusion (Post. An. I 2, 71b18–19). Aristotle takes it for granted that we possess a distinctive kind of knowledge by way of deductive reasoning and asks what conditions a deductive argument must satisfy in order to confer scientific knowledge. His primary model for this type of knowledge is mathematics, which, alongside geometrical construction, included the practice of providing a deductive argument from basic principles to prove that the construction satisfies the stated problem. Aristotle however seeks to generalize and extend this model to broadly “mathematical” sciences like astronomy and optics, and, with some qualifications, to non-mathematical sciences like botany and meteorology. His theory of knowledge in the Posterior Analytics (especially the first book) investigates this ideal knowledge state by asking what conditions an argument must satisfy in order to be a demonstration.
Aristotle observes, to begin, that not all deductive arguments are demonstrations (Post. An. I 2, 71b24–26). In particular, an argument from false premises does not confer knowledge of its conclusion (Post. An. I 2, 71b26–27). The notion of demonstration is not, however, simply the notion of a sound deductive argument, since even sound arguments do not provide knowledge of their conclusions unless the premises are already known. Moreover, even sound arguments from known premises may not provide the best kind of knowledge of the conclusion. Aristotle holds that in order to impart the best kind of knowledge of a necessary truth, an argument must establish this truth on the basis of principles that properly pertain to the type of thing the demonstration concerns. A demonstration of some astronomical fact must, for instance, proceed from properly astronomical principles (Pr. An. I 30, 46a19–20). This rules out, on the one hand, arguments from accidental and “chance” features of an object (Post. An. I 6, especially 74b5–12, 75a28–37; I 30), and, on the other, arguments from the principles of a different science (Post. An. I 7, 75b37–40).
Two requirements for demonstration, then, are that the premises be true and that they be non-accidental facts belonging to the relevant science. Assuming that all principles of a science are true and non-accidental, this reduces to the condition that a demonstration be from principles belonging to the relevant science. This, however, is still not a sufficient condition for an argument to be a demonstration. Aristotle famously contrasts the following two arguments (Post. An. I 13, 78a30–7):
Argument One
Things that do not twinkle are near;
the planets do not twinkle;
therefore, the planets are near.
Argument Two
What is near does not twinkle;
the planets are near;
therefore, the planets do not twinkle.
Here by “twinkle” we should understand the specific astronomical phenomenon whereby a celestial body’s visual intensity modulates in the way that a distant star’s does on a clear night, and we should understand both of these arguments to quantify over visible celestial bodies. If we do, then both of these arguments are sound. The planets are near the earth (relative to most astronomical bodies), and they do not display the astronomical property of twinkling. It is also true that bodies which are relatively close to us, as compared to the stars, fail to display this effect, so the first premise of Argument Two is true. Further, only visible celestial bodies which are near to us fail to twinkle, so the first premise of Argument One is also true. All of the premises in these two arguments are also, in Aristotle’s view, properly astronomical facts. To this extent they both establish that their respective conclusions hold as a matter of astronomical science.
The latter argument is in Aristotle’s view superior, however, in that it establishes not only that the conclusion holds but also why it does. In a completed theory of astronomy, the non-twinkling of the planets might be explained by recourse to their nearness to us, for example by adding that other celestial bodies obstruct the light issuing from more distant ones. Argument Two conveys this explanation by presenting the immediate cause of the conclusion, nearness, as a middle term shared between the two premises. The two premises in the argument thus not only prove the conclusion; they jointly explain what makes the conclusion true.
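The contrast can be made vivid in a modern first-order rendering (an anachronistic sketch, not Aristotle's own syllogistic notation; as noted above, the quantifiers range over visible celestial bodies). Writing N for "is near", T for "twinkles", and P for "is a planet", both arguments are formally valid and differ only in which term occupies the middle position shared by the two premises:

```latex
\[
\begin{array}{l@{\qquad\qquad}l}
\textbf{Argument One} & \textbf{Argument Two}\\[4pt]
\forall x\,(\lnot T(x) \rightarrow N(x)) & \forall x\,(N(x) \rightarrow \lnot T(x))\\
\forall x\,(P(x) \rightarrow \lnot T(x)) & \forall x\,(P(x) \rightarrow N(x))\\
\hline
\forall x\,(P(x) \rightarrow N(x)) & \forall x\,(P(x) \rightarrow \lnot T(x))
\end{array}
\]
```

In Argument One the middle term is non-twinkling (¬T); in Argument Two it is nearness (N), which is, on Aristotle's view, the cause of the conclusion. Only the second argument therefore mirrors the order of explanation.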
On the other hand, while the facts that non-twinkling celestial bodies are near and that all planets are non-twinkling celestial bodies do provide perfectly legitimate grounds to infer that the planets are near, an argument from these premises provides little insight into why the planets are near. The fact that the planets are near might take significant work to establish (it might even be established using a chain of reasoning such as that in Argument One), but it would be a confusion, in Aristotle’s view, to think that the soundness of Argument One and the scientific character of its premises show that the non-twinkling of the planets explains their nearness. The order of explanation runs rather in the opposite direction: they do not twinkle because they are near. In a completed science of astronomy as Aristotle conceives it, where it is assumed that the gross distance of all celestial bodies from the earth is eternally fixed, the nearness of the planets would presumably be treated as a fundamental given from which other things may be explained, not as a fact requiring explanation.
Someone is in a better cognitive condition with respect to a given fact, Aristotle evidently holds, if that person not only knows that it is true but also grasps why it is true. Aristotle does not argue for this position, but it is not difficult to imagine what reasons he might give. We naturally desire not just to know but to understand; curiosity is sated by explanation rather than sheer fact. Further, understanding confers stability on what we know, and Aristotle takes stability to be a desirable quality of knowledge (Cat. 8, 8b28–30; Post. An. I 33, 89a5–10). If I understand why the planets must not twinkle (rather than knowing that this is so but having no idea why), then I will be less likely to give up this belief in light of an apparent observation to the contrary, since to do so would require me to also revise my beliefs about what I take to be the explanation. This is especially so if I understand how this fact is grounded, as Aristotle requires of demonstrative knowledge, in the first principles of a science, since renouncing that piece of knowledge would then require me to renounce the very principles of my scientific theory.
Hence, the type of deduction which places one in the best cognitive condition with respect to an object must be explanatory of its conclusion in addition to being a sound argument with premises drawn from the correct science. The notion of explanation Aristotle works with in laying down this condition is a resolutely objective one. Scientific explanations are not just arguments that someone, or some select class of people, find illuminating. They are the best or most appropriate kinds of explanations available for the fact stated in the conclusion because they argue from what is prior to the conclusion in the order of nature (Phys. I.1, 184a10–23). Further, the fact that a given set of premises explains their conclusion need not be obvious or immediately clear. Aristotle leaves open the possibility that it might be a significant cognitive achievement to see that the premises of a given demonstration explain its conclusion (Post. An. I 7, 76a25–30).
When someone does grasp demonstrative premises as explanatory of a given demonstrative conclusion, the argument is edifying because it tracks some objective fact about how things stand with the relevant kind (celestial bodies, triangles, and so on). Aristotle describes the way that scientific knowledge correctly tracks the order of things in terms of “priority” (Post. An. I 2, 71b35–72a6). Borrowing the terminology of contemporary metaphysics, we might gloss this by saying that demonstrations reveal facts about grounding in a way that not all deductive arguments do. The second syllogism is better than the first one because the fact that the planets are near, together with the relevant universal generalization about the optical behavior of nearness, grounds the fact that the planets do not twinkle. These premises are in an objective sense responsible for the fact that the planets do not twinkle. Given the assumption that grounding is asymmetric, the premises of the first syllogism cannot also ground its conclusion.
Aristotle’s account of the specific content and logical form of scientific principles is notoriously obscure. Key texts, which do not obviously stand in agreement, are Post. An. I 2, I 4, I 10, II 19 and Pr. An. I 30. Nevertheless, there are a few key ideas which Aristotle remains consistently committed to. First, a particularly important type of principle is one that states what something is, or its definition (Post. An. I 2, 72a22). The centrality accorded to this type of principle suggests a project of grounding all truths of a given science in facts about the essences of the kind or kinds that this science concerns. Aristotle however seems aware of problems with such a rigid view, and admits non-specific or “common” axioms into demonstrative sciences (Post. An. I 2, 72a17–19; I 10, 76b10–12). These include axioms like the principle of non-contradiction, which are in some sense assumed in every science (Post. An. I 2, 72a15–17; I 11, 77a30), as well as those like the axiom that equals taken from equals leave equals, which can be given both an arithmetical and a geometrical interpretation (Post. An. I 10, 76a41–b1; I 11, 77a30).
Aristotle also briefly discusses the profile of conviction (pistis) that someone with scientific knowledge ought to display. At least some of the principles of a demonstration, Aristotle holds, should be “better known” to the expert scientist than their conclusions, and the scientist should be “more convinced” of them (Post. An. I 2, 72a25–32). This is motivated in part by the idea that the principles of demonstrations are supposed to be the source or grounds for our knowledge of whatever we demonstrate in science. Aristotle also connects it with the requirement that someone who grasps a demonstration should be “incapable of being persuaded otherwise” (Post. An. I 2, 72b3). Someone with demonstrative knowledge in the fullest sense will never renounce their beliefs under dialectical pressure, and Aristotle thinks this requires someone to be supremely confident in the principles that found her demonstrations.
iii. Non-Demonstrative Scientific Knowledge
The claim that the principles of demonstrations must be better known than their conclusions generates a problem. If the best way to know something theoretically is by demonstration, and the premises of demonstrations must be at least as well known as their conclusions, then the premises of demonstrations will themselves need to be demonstrated. However, these demonstrations in turn will also have premises requiring demonstration, and so on. A regress looms.
Aristotle canvasses three possible responses to this problem. First, demonstrations might extend back infinitely: there might be an infinite number of intermediate demonstrations between a conclusion and its first principles (Post. An. I 3, 72b8–10). Aristotle dismisses this solution on the grounds that we can indeed have demonstrative knowledge (Post. An. I 1, 71a34–72b1), and that we could not have it if having it required surveying an infinite series of arguments (Post. An. I 3, 72b10–11).
Two other views, which Aristotle attributes to two unnamed groups of philosophers, are treated more seriously. Both assume that chains of demonstrations terminate. One group says that they terminate in principles which cannot be demonstrated and consequently cannot be known (or at least, not in the demanding way required by scientific knowledge (Post. An. I 3, 72b11–13)). The other holds that demonstrations “proceed in a circle or reciprocally” (Post. An. I 3, 72b17–18): the principles are demonstrated, but from premises which are in turn demonstrated (directly or indirectly) from them.
Aristotle rejects both of these views, since both possibilities run afoul of the requirement that the principles of demonstrations be better known than their conclusions. The first alternative maintains that the principles are not known, or at least not in any scientifically demanding way, while the second requires that the principles in turn be demonstrated from (and hence not better known than) other demonstrable facts.
Aristotle’s solution is to embrace the claim that the principles are indemonstrable but to deny that this implies they are not known or understood in a rigorous and demanding way. There is no good reason, Aristotle maintains, to hold that demonstrative knowledge is the only or even the best type of scientific knowledge (Post. An. I 3, 72b23–5); it is only the best kind of scientific knowledge regarding what can be demonstrated. There is a different kind of knowledge regarding scientific principles. Aristotle sometimes refers to this as “non-demonstrative scientific knowledge” (Post. An. I 3, 72b20; I 33, 88b36) and identifies or associates it strongly with nous (Post. An. I 33, 88b35; II 19, 100b12), which is translated variously as comprehension, intellection and insight.
Aristotle is therefore a foundationalist insofar as he takes all demonstrative knowledge to depend on a special sort of knowledge of indemonstrable truths. It should be stressed, however, that Aristotle’s form of foundationalism differs from the types that are now more common in epistemology.
First, Aristotle professes foundationalism only regarding demonstrative knowledge: he does not make any similar claim that perceptual knowledge has a foundation in, for instance, the perception of sense data, nor does he make one for practical knowledge or for our knowledge of scientific principles.
Second, as we have seen, Aristotle’s view is that scientific knowledge is domain-specific. Expert knowledge in one science does not automatically confer expert knowledge in any other, and Aristotle explicitly rejects the idea of a “super-science” containing the principles for all other sciences (Post. An. I 32). Hence, Aristotle defends what we might term a “local” foundationalism about scientific knowledge. Our knowledge of geometry will have one set of foundations, our knowledge of physics another, and so on (compare Post. An. I 7; I 9, 75b37–40; 76a13–16; Nic. Eth. VI 10, 1143a3–4).
Third, the faculty which provides our knowledge of the ultimate principles of demonstrative knowledge is, as we have seen, itself a rational faculty, albeit one which does not owe its knowledge to demonstration. Hence, Aristotle does not take the foundation of our demonstrative knowledge to be “brute” or “given”; his claim, more modestly, is that our knowledge of scientific principles must be of a different kind than demonstrative knowledge.
Finally, Aristotle’s foundationalism should not be taken to imply that we need to have knowledge of principles prior to discovering any other scientific facts. In at least some cases, Aristotle takes knowledge of scientific explananda to be acquired first (Post. An. II 2, 90a8–9), by perception or induction (Post. An. I 13, 78a34–5). Only later do we discover the principles which allow us to demonstrate them, and thus enjoy scientific knowledge of them.
Aristotle does not say much about the specific character of our knowledge of principles, and its nature has been the subject of much debate. As we have seen, Aristotle requires the principles to be “better known” (at least in part) than their demonstrative consequences, and he also refers to this type of knowledge as “more precise” (Post. An. II 19, 99b27) than demonstrative knowledge. Some scholars take Aristotle’s view to be that the principles are self-explanatory, while others take the principles to be inexplicable.
His views about the way we acquire knowledge of first principles have also been subject to varying interpretations. Traditionally, Aristotle’s view was taken to be that we learn the first principles by means of exercising nous, understood as a capacity for abstracting intelligible forms from the impressions left by perception. Subsequent scholars have pointed to the dearth of textual evidence for ascribing such a view to Aristotle in the Posterior Analytics, however. Aristotle calls the state we are in when we know first principles nous (Post. An. II 19, 100b12), but he does not claim that we learn first principles by means of exercising a capacity called nous.
A second possibility is that Aristotle thinks we obtain knowledge of scientific principles through some form of dialectic—a competitive argumentative practice outlined in the Topics that operates with different standards and procedures than scientific demonstration. Another view, defended by Marc Gasser-Wingate, is that our knowledge of the first principles is both justified and acquired by what Aristotle calls “induction” (epagōgē)—a non-deductive form of scientific argument in which we generalize from a string of observed cases or instances.
Some scholars also divide the question about how we first come to know the first principles from questions about what justifies this knowledge in the context of a science. One suggestion is that Aristotle takes the justification for nous to consist in a recognition of the principles as the best explanations for other scientific truths. On one version of this view, forcefully defended by David Charles, knowledge of first principles is not acquired prior to our knowledge of demonstrable truths; rather, we gain the two in lockstep as we engage in the process of scientific explanation. On other versions of this view, we come to know the first principles in some less demanding way before we come to appreciate their explanatory significance and thus have proper scientific knowledge of them. David Bronstein, who defends a version of the latter view, argues that Aristotle recommends a range of special methods for determining first principles, including, importantly, a rehabilitation of Plato’s method of division.
c. Practical Knowledge and the Calculative Virtues
Wisdom and scientific knowledge (demonstrative and non-demonstrative) are the excellences of the scientific part of our soul, that part of us devoted to the contemplation of unchanging realities. Aristotle takes the type of knowledge that we employ in our dealings with other people and our manipulation of our environment to be different in kind from these types of knowledge, and gives a separate account of their respective justification, acquisition, and purpose.
The goal of practical knowledge is to enable us to bring about changes in the world. Where attaining theoretical knowledge is a matter of bringing one’s intellect into conformity with unchanging structures in reality, practical knowledge involves a bidirectional relationship between one’s intellect and desires, on the one hand, and the world, on the other. As practical knowers we seek not only to conform our desires and intellects to facts about what is effective, ethical, and practically pertinent; we also seek to make the world conform to what we judge to be so. Hence, practical knowledge can have as its objects neither necessities (since no one can coherently decide to, for example, change the angle sum of a triangle (Nic. Eth. III 2, 1112a21–31)) nor what is in the past (someone might make a decision to sack Troy, but “no one decides to have sacked Troy” (Nic. Eth. VI 2, 1139b7)). Only present and future contingencies are, in Aristotle’s view, possible objects of practical knowledge.
Aristotle distinguishes two activities that are enabled by practical thinking: action (praxis) and production (poiēsis) (Nic. Eth. VI 2, 1139a31–b5). Production refers to those doings whose end lies outside of the action itself (Nic. Eth. VI 5, 1140b6–7), the paradigm of which is the fashioning of a craft object like a shoe or a house. Aristotle recognizes that not all of our doings fit this mold, however: making a friend, doing a courageous act, or other activities laden with ethical significance can be thought of on the model of manufacturing a product only with strain. Such activities do aim to bring about changes in reality, but their end is not separate from the action itself. In performing a courageous act, say, I am, in Aristotle’s view, simply aiming to exercise the virtue of courage appropriately. Praxis is Aristotle’s name for doings such as these. It refers to a distinctively human kind of action, one not shared by other animals (Nic. Eth. VI 2, 1139a20; III 3, 1112b32; III 5, 1113b18), involving deliberation and judgment that an action is the best way of fulfilling one’s goals (Nic. Eth. VI 2, 1139a31).
Both action and production require more than just knowledge in order to be performed well. In particular, the best kind of action also requires the doer to be virtuous, and this, for Aristotle, has a desiderative as well as an epistemic component. Someone is only virtuous if that person desires the right things, in the right way, to the right extent (Nic. Eth. II 3, 1104b3–13; II 9 1109b1–5; III 4 1113a31–3; IV 1 1120a26–7; X 1, 1172a20–3). Further, Aristotle does not take the starting points of our practical knowledge to be themselves objects of practical knowledge. It is not part of someone’s technical expertise to know that, for example, a sword is to be made or a patient to be healed; rather, a blacksmith or a doctor in her capacity as such takes for granted that these ends are to be pursued, and it is the job of her practical knowledge to determine actions which bring them about (Nic. Eth. III 3, 1112b11–16). The proper ends of actions, meanwhile, are given by virtue (Nic. Eth. VI 12, 1144a8, 20), and the virtues are habituated into us from childhood (Nic. Eth. II 2, 1103a17–18, 23–26, 1103b23–25). Nevertheless, Aristotle takes certain types of knowledge to be indispensable for engaging in action and production in the best possible ways. He identifies these types of knowledge with the intellectual virtues of practical wisdom (phronēsis) and craft (technē).
i. Craft
Craft (technē) is Aristotle’s name for the type of knowledge that perfects production (Nic. Eth. VI 4, 1140a1–10; Met. IX 2, 1046a36–b4). Aristotle mentions a treatise on craft which seems to have given a treatment of it roughly parallel to the treatment of scientific knowledge he gives in the Posterior Analytics (Nic. Eth. VI 4, 1140a2–3; compare VI 3, 1139b32), but this treatise is lost. Aristotle’s views on technē must be pieced together from scattered remarks and an outline of this treatise’s contents in the Nicomachean Ethics (VI 4).
As with scientific knowledge, Aristotle does not take craft to be a monolithic body of knowledge. He holds, sensibly enough, that a different type of technical knowledge is required for bringing into being a different kind of object. Aristotle’s stock example of a technē is the construction of houses (Nic. Eth. VI 4, 1140a4). In constructing a house, a craftsperson begins with the form of the house in mind together with a desire to bring one about, and practical knowledge is what enables this to lead to the actual presence of a house (Met. VII 7, 1032a32–1032b23; Phys. II 2, 194a24–27; II 3, 195b16–25).
In order for this to occur, a person with craft must know the “true prescription” pertaining to that practice (Nic. Eth. VI 4, 1140a21), that is, the general truths concerning how the relevant product is to be brought about. In the case of housebuilding, these might include the order in which various housebuilding activities need to be carried out, the right materials to use for various parts, and the correct methods for joining different types of materials. While a merely “experienced” housebuilder might manage to bring about a house without such prescriptions, they would not, in Aristotle’s view, bring about a house in the best or most felicitous way, and hence could not be said to operate according to the craft of housebuilding.
Aristotle indicates that these prescriptions fit together in a causal or explanatory way (Met. A 1, 981a1–3, 28–981b6; Post. An. II 19, 100a9). This view is plausible. Someone with the best kind of knowledge about how to bring about some product will presumably not only know what should be done but also understand why that is the correct thing to do. Such understanding, after all, has not only theoretical interest but also practical benefit. Suppose, for instance, that the craft of housebuilding prescribes that one should bind bricks using straw. Someone who understands why this is prescribed will be in a better position to know what else can be substituted should straw be unavailable, or even when it may be permissible to omit the binding agent. None of this is to say that a practitioner of a craft requires the same depth of understanding as someone with scientific knowledge, however. A technician of housebuilding does not need to know, for example, the chemical or physical principles which explain why and how binding agents work at a microscopic scale.
Given that it involves a kind of understanding, knowing the craft’s correct prescriptions in the way required by craft is a significant intellectual accomplishment. Nevertheless, this is not sufficient for having craft knowledge, according to Aristotle. Someone with craft knowledge must also have a “productive disposition” (Nic. Eth. VI 4, 1140a20–1), that is, a tendency to actually produce the goods according to these prescriptions when they have the desire to do so. Aristotle makes this disposition a part of craft knowledge itself, and not merely an extra condition required for practicing the craft, for at least three reasons.
First, someone does not count as having craft knowledge if that person has only a theoretical grasp of how houses are to be made, for example. Having craft knowledge requires knowledge of how to build houses, and Aristotle thinks that this sort of knowledge is only available to someone with a disposition to actually build them. Second, unlike mathematical generalizations, the prescriptions grasped in technē are not exceptionless necessities (Nic. Eth. VI 4, 1140a1–2; compare Nic. Eth. VI 2, 1139a6–8). Hence, simply knowing these prescriptions (even if one has every intention of fulfilling them) is not in itself sufficient for an ability to actually bring about the relevant product. One must have an ability to recognize when a rule of thumb about, say, the correct materials to use in building a house fails to apply. Aristotle thinks of this type of knowledge as existing in a disposition to apply the prescriptions correctly rather than as an auxiliary theoretical prescription.
A third, related reason is that the process of production requires one to make particular decisions that go beyond what is specified in the prescriptions given by that craft. Thus, even where the craft prescription instructs the builder to, for instance, separate cooking and sleeping quarters or to have a separate top floor for this kind of house, it may not specify the specific arrangement of these quarters or the precise elevation of the second floor. The ability to make such decisions in the context of practicing a craft is, for Aristotle, conferred by the productive disposition involved in craft knowledge rather than by the grasp of additional prescriptions.
ii. Practical Wisdom
Practical wisdom is the central virtue of the calculative part of the soul. This type of knowledge makes one excellent at deliberation (Nic. Eth. VI 9, 1142b31–3; VI 5, 1140b25). Since deliberation is Aristotle’s general term for reasoning well in practical circumstances, practical wisdom is also the type of knowledge that perfects action (praxis). More generally, practical wisdom is the intellectual virtue “concerned with things just and fine and good for a human being” (Nic. Eth. VI 12, 1143b21–22). It includes, or is closely allied with, a number of related types of practical knowledge that inform ethical behavior: good judgment (gnōmē), which Aristotle characterizes as a sensitivity to what is reasonable in a given situation (Nic. Eth. VI 12, 1143a19–24); comprehension (sunesis), an ability to discern whether a given action or statement accords with practical wisdom (Nic. Eth. VI 10, 1143a9–10); and practical intelligence (nous, related to, but distinct from the theoretical virtue discussed above), which allows one to spot or recognize practically pertinent particulars (Nic. Eth. VI 11, 1143b4–5).
Practical wisdom thus serves to render action rather than production excellent. One important difference between practical wisdom and craft immediately follows. Whereas in craft someone performs an action for the purpose of creating something “other” than the production itself, the end of practical wisdom is the perfection of the action itself (Nic. Eth. VI 5, 1140b6–7). Nevertheless, in many respects, Aristotle’s view of practical wisdom is modeled on his view of craft knowledge. Like craft knowledge, the goal of practical wisdom is to effect some good change rather than simply to register the facts as they stand. In addition, like craft, this type of knowledge involves both a grasp of general prescriptions governing the relevant domain and an ability to translate these generalities into concrete actions. In the case of practical wisdom, the domain is the good human life generally (Nic. Eth. VI 8, 1141b15; Nic. Eth. VI 12, 1143b21–2), and the actions which it enables are ethically good actions. Hence, the general prescriptions associated with practical wisdom concern the living of a flourishing human life, rather than any more particular sphere of action. Practical wisdom also, like craft, involves an ability to grasp the connections between facts, but in a way that is specifically oriented towards action (Nic. Eth. VI 2 1139a33–1139b5; Nic. Eth. VI 7, 1141b16–20).
Some of the complications involved in moving from general ethical prescriptions to concrete actions also mirror those regarding the movement from a craft prescription to the production of a craft object. For one, Aristotle holds that many or all general truths in ethics likewise hold only for the most part (Nic. Eth. I 3, 1094b12–27; 1098a25–34). The ethical prescription, for instance, to reciprocate generosity is a true ethical generalization, even if not an exceptionless one (Nic. Eth. IX 2, 1164b31). If ethical norms permit of exceptions, then knowing these norms will not always be sufficient for working out the ethical thing to do. A further epistemic capacity will be required in order to judge whether general ethical prescriptions apply in the concrete case at hand, and this is plausibly one function of phronēsis.
Aristotle also describes phronēsis as a capacity to work out what furthers good ends (Nic. Eth. VI 5, 1140a25–9; VI 9, 1142b31–3). He distinguishes it from the trait of cleverness, a form of means-end reasoning that is indifferent to the ethical quality of the ends in question (Nic. Eth. VI 12, 1144a23–5). Phronēsis is an ability to further good ends in particular, and to further them in the most appropriate ways. It has also been argued that phronēsis has the function of recognizing, not only the means to one’s virtuous ends, but also what would constitute the realization of those ends in the first place. For instance, I might have the intention to be generous, but it is another thing to work out what it means to be generous to this friend at this time under these circumstances. This is parallel to the way that, in seeking to construct a house, say, one needs to decide which particular type of house to construct given the constraints of location and resources.
One crucial difference between craft knowledge and practical wisdom is, however, the following. Whereas it suffices for craft knowledge to find a means to an end which is in accord with the goals of that craft, a practically wise person must find a way of realizing an ethical prescription which is in accord with all of the ethical virtues (Nic. Eth. VI 12, 1144a29–1144b1). This is a considerable practical ability in its own right, especially when the demands of different virtues come into conflict, as they might, for instance, when the just thing to do is not (or not obviously) the same as the kind or the generous thing to do. Practical wisdom thus requires, first, that one has all of the virtues so as to be sensitive to their various demands (Nic. Eth. VI 13, 1145a1–2). Over and above the possession of the virtues, practical wisdom calls for an ability to navigate their various requirements and arbitrate between them in concrete cases. In this way, it constitutes a far higher achievement than craft knowledge, since a person with practical wisdom grasps and succeeds in coordinating all of the goods constitutive of a human life rather than merely those directed towards the production of some particular kind of thing or the attainment of some specific goal.
6. References and Further Reading
Two good overviews of Aristotle’s views about knowledge, with complementary points of emphasis, are Taylor (1990) and Hetherington (2012). Bolton (2012) emphasizes Aristotle’s debt to Plato in epistemology. Fine (2021) is one of the few to treat Aristotle’s theory of knowledge in all generality at significant length, but readers should be aware that some of her central theses are not widely supported by other scholars. More advanced but nevertheless accessible pieces on Aristotle’s epistemology and philosophy of science may be found in Smith (2019), Anagnostopoulos (2009) and Barnes (1995).
The most in-depth study of Aristotle’s theory of scientific knowledge in the Posterior Analytics is Bronstein (2016), which focuses on the prior knowledge requirement and reads Aristotle’s views as a response to Meno’s paradox. See also Angioni (2016) on Aristotle’s definition of scientific knowledge in the Posterior Analytics. McKirahan (1992) and Barnes (1993) both provide useful commentary on the Posterior Analytics. See Barnes (1969), Burnyeat (1981), Lesher (2001) and Pasnau (2013) for views concerning whether Aristotle’s theory in the Posterior Analytics is best viewed as an epistemology, a philosophy of science, or something else. Sorabji (1980) also contains penetrating discussions of many specific issues in Aristotle’s epistemology and philosophy of science. For scholarly issues, Berti (1981) is still an excellent resource.
On Aristotle’s scientific method more generally, see Lennox (2021), Bolton (1987) and Charles (2002). For how we acquire knowledge of first principles, important contributions include Kahn (1981) and Bayer (1997) (who defend a view close to the traditional one), Irwin (1988) (who argues for the importance of a form of dialectic in coming to know first principles), and Gasser-Wingate (2016) (who argues for the role of induction and perception). Morison (2019) as well as Bronstein (2016) discuss at length the nature of knowledge of first principles and its relationship to nous in Aristotle.
Shields (2016) provides an excellent translation and up-to-date commentary on the De Anima. Kelsey (2022) gives a novel reading of De Anima as a response to Protagorean relativism. For Aristotle’s views on perception, see Modrak (1987) and Marmodoro (2014). Gasser-Wingate (2021) argues for an empiricist reading of Aristotle, against the rationalist reading of Frede (1996). On the more specific issue of whether Aristotle takes perception to involve a literal change in the sense organ, one can start with Caston (2004), Sorabji (1992) and Burnyeat (2002).
For the Nicomachean Ethics, Broadie and Rowe (2002) provide useful, if partisan, philosophical introduction and commentary, while Reeve (2014) provides extensive cross-references to other texts. For Aristotle’s views about practical wisdom, Russell (2014) and Reeve (2013) are useful starting points. Walker (2018) gives a prolonged treatment of Aristotle’s views about contemplation and its alleged “uselessness”, and Ward (2022) provides interesting background on the religious context of Aristotle’s views.
a. Bibliography
Anagnostopoulos, Georgios (ed.). 2009. A Companion to Aristotle. Sussex: Wiley-Blackwell.
Angioni, Lucas. 2016. Aristotle’s Definition of Scientific Knowledge. Logical Analysis and History of Philosophy 19: 140–66.
Barnes, Jonathan. 1969. Aristotle’s Theory of Demonstration. Phronesis 14: 123–52.
Barnes, Jonathan (ed.). 1995. The Cambridge Companion to Aristotle. Cambridge: Cambridge University Press.
Bayer, Greg. 1997. Coming to Know Principles in Posterior Analytics II.19. Apeiron 30: 109–42.
Berti, Enrico (ed.). 1981. Aristotle on Science: The Posterior Analytics. Proceedings of the Eighth Symposium Aristotelicum Held in Padua from September 7 to 15, 1978. Padua: Editrice Antenore.
Bolton, Robert. 1987. Definition and Scientific Method in Aristotle’s Posterior Analytics and Generation of Animals. In Philosophical Issues in Aristotle’s Biology. Cambridge: Cambridge University Press.
Bolton, Robert. 1997. Aristotle on Essence and Necessity. Proceedings of the Boston Area Colloquium in Ancient Philosophy (edited by John J. Cleary) 13:113–38. Leiden: Brill.
Bolton, Robert. 2012. Science and Scientific Inquiry in Aristotle: A Platonic Provenance. In The Oxford Handbook of Aristotle (edited by Christopher Shields), 46–59. Oxford: Oxford University Press.
Bolton, Robert. 2014. Intuition in Aristotle. In Rational Intuition: Philosophical Roots, Scientific Investigations, 39–54. Cambridge: Cambridge University Press.
Bolton, Robert. 2018. The Search for Principles in Aristotle. In Aristotle’s Generation of Animals: A Critical Guide (edited by Andrea Falcon and David Lefebvre), 227–48. Cambridge: Cambridge University Press.
Broadie, Sarah, and Christopher Rowe. 2002. Nicomachean Ethics. Philosophical Introduction and Commentary by Sarah Broadie (translated by Christopher Rowe). New York: Oxford University Press.
Bronstein, David. 2010. Meno’s Paradox in Posterior Analytics 1.1. Oxford Studies in Ancient Philosophy 38: 115–41.
Bronstein, David. 2012. The Origin and Aim of Posterior Analytics II.19. Phronesis 57(1): 29–62.
Bronstein, David. 2016. Aristotle on Knowledge and Learning: The Posterior Analytics. Oxford: Oxford University Press.
Bronstein, David. 2020. Aristotle’s Virtue Epistemology. In What the Ancients Offer to Contemporary Epistemology (edited by Stephen Hetherington and Nicholas Smith), 157–77. New York: Routledge.
Burnyeat, Myles. 1981. Aristotle on Understanding Knowledge. In Aristotle on Science: The Posterior Analytics (edited by Enrico Berti). Padua: Editrice Antenore.
Burnyeat, Myles. 2002. De Anima II 5. Phronesis 47(1): 28–90.
Burnyeat, Myles. 2011. Episteme. In Episteme, Etc. Essays in Honour of Jonathan Barnes (edited by Benjamin Morison and Katerina Ierodiakonou), 3–29. Oxford: Oxford University Press.
Byrne, Patrick. 1997. Analysis and Science in Aristotle. Albany: State University of New York Press.
Caston, Victor. 2004. The Spirit and the Letter: Aristotle on Perception. In Metaphysics, Soul and Ethics: Themes from the Work of Richard Sorabji (edited by Ricardo Salles), 245–320. Oxford: Oxford University Press.
Charles, David. 2002. Aristotle on Meaning and Essence. Oxford: Oxford University Press.
Fine, Gail. 2021. Aristotle on Knowledge. In Essays in Ancient Epistemology, 221–32. Oxford: Oxford University Press.
Frede, Michael. 1996. Aristotle’s Rationalism. In Rationality in Greek Thought (edited by Michael Frede and Gisela Striker), 157–73. Oxford: Oxford University Press.
Gasser-Wingate, Marc. 2016. Aristotle on Induction and First Principles. Philosopher’s Imprint 16(4): 1–20.
Gasser-Wingate, Marc. 2019. Aristotle on the Perception of Universals. British Journal for the History of Philosophy 27(3): 446–67.
Gasser-Wingate, Marc. 2021. Aristotle’s Empiricism. New York: Oxford University Press.
Goldin, Owen. 1996. Explaining an Eclipse: Aristotle’s Posterior Analytics 2.1–10. Ann Arbor: The University of Michigan Press.
Greco, John. 2010. Achieving Knowledge. Cambridge: Cambridge University Press.
Hasper, Pieter Sjoerd, and Joel Yurdin. 2014. Between Perception and Scientific Knowledge: Aristotle’s Account of Experience. In Oxford Studies in Ancient Philosophy (edited by Brad Inwood), 47:119–50.
Hetherington, Stephen (ed.). 2012. Aristotle on Knowledge. In Epistemology: The Key Thinkers, 50–71. London: Continuum.
Hintikka, Jaakko. 1967. Time, Truth and Knowledge in Ancient Greek Philosophy. American Philosophical Quarterly 4(1): 1–14.
Irwin, Terence. 1988. Aristotle’s First Principles. Oxford: Clarendon Press.
Kahn, Charles. 1981. The Role of Nous in the Cognition of First Principles in Posterior Analytics II 19. In Aristotle on Science: The Posterior Analytics. Proceedings of the Eighth Symposium Aristotelicum Held in Padua from September 7 to 15, 1978 (edited by Enrico Berti). Padua: Editrice Antenore.
Kelsey, Sean. 2022. Mind and World in Aristotle’s de Anima. Cambridge, UK: Cambridge University Press.
Kiefer, Thomas. 2007. Aristotle’s Theory of Knowledge. London: Continuum.
Kosman, Aryeh. 2013. Understanding, Explanation, and Insight in Aristotle’s Posterior Analytics. In Virtues of Thought, 7–26. Cambridge: Harvard University Press.
Lennox, James G. 2021. Aristotle on Inquiry. Cambridge: Cambridge University Press.
Lesher, James H. 2001. On Aristotelian Ἐπιστήμη as ‘Understanding’. Ancient Philosophy 21(1): 45–55.
Lorenz, Hendrik. 2014. Understanding, Knowledge and Inquiry in Aristotle. In The Routledge Companion to Ancient Philosophy, 290–303. New York: Routledge.
Malink, Marko. 2013. Aristotle on Circular Proof. Phronesis 58(3): 215–48.
Marmodoro, Anna. 2014. Aristotle on Perceiving Objects. New York: Oxford University Press.
McKirahan, Richard. 1992. Principles and Proofs. Princeton: Princeton University Press.
Modrak, Deborah K. W. 1987. Aristotle: The Power of Perception. Chicago: University of Chicago Press.
Morison, Benjamin. 2012. Colloquium 2: An Aristotelian Distinction Between Two Types of Knowledge. In Proceedings of the Boston Area Colloquium of Ancient Philosophy (edited by Gary Gurtler and William Wians), 27:29–63.
Morison, Benjamin. 2019. Theoretical Nous in the Posterior Analytics. Manuscrito 42(4): 1–43.
Pasnau, Robert. 2013. Epistemology Idealized. Mind 122: 987–1021.
Reeve, C. D. C. 2013. Aristotle on Practical Wisdom: Nicomachean Ethics VI. Cambridge: Harvard University Press.
Reeve, C. D. C. 2014. Aristotle: Nicomachean Ethics. Indianapolis: Hackett.
Russell, Daniel C. 2014. Phronesis and the Virtues (NE VI 12–13). In The Cambridge Companion to Aristotle’s Nicomachean Ethics (edited by Ronald Polansky), 203–20. New York: Cambridge University Press.
Shields, Christopher. 2016. Aristotle. De Anima. Oxford: Clarendon Press.
Smith, Nicholas D. (ed.). 2019. The Philosophy of Knowledge: A History (Vol. I: Knowledge in Ancient Philosophy). London: Bloomsbury Academic.
Sorabji, Richard. 1980. Necessity, Cause and Blame. Perspectives on Aristotle’s Theory. Ithaca: Cornell University Press.
Sorabji, Richard. 1992. Intentionality and Physiological Processes: Aristotle’s Theory of Sense-Perception. In Essays on Aristotle’s De Anima (edited by Martha C. Nussbaum and Amelie Oksenberg Rorty), 195–225. Oxford: Clarendon Press.
Sosa, Ernest. 2010. Knowing Full Well. Princeton: Princeton University Press.
Taylor, C. C. W. 1990. Aristotle’s Epistemology. In Epistemology (edited by Stephen Everson), 116–42. Cambridge: Cambridge University Press.
Walker, Matthew D. 2018. Aristotle on the Uses of Contemplation. Cambridge: Cambridge University Press.
Ward, Julie K. 2022. Searching for the Divine in Plato and Aristotle: Philosophical Theoria and Traditional Practice. Cambridge: Cambridge University Press.
Author Information
Joshua Mendelsohn
Email: jmendelsohn@luc.edu
Loyola University Chicago
U. S. A.
History of Utilitarianism
The term “utilitarianism” is most commonly used to refer to an ethical theory or a family of related ethical theories. It is taken to be a form of consequentialism, which is the view that the moral status of an action depends on the kinds of consequences the action produces. Stated this way, consequentialism is not committed to any view of what makes certain outcomes desirable. A consequentialist could claim (rather absurdly) that individuals have a moral obligation to cause as much suffering as possible. Similarly, a consequentialist could adopt an ethical egoist position, holding that individuals are morally required to promote their own interests. Utilitarians have their own position on these matters. They claim that it is utility (such as happiness, or well-being) which makes an outcome desirable, and that an outcome with greater utility is morally preferable to one with less. Contrary to the ethical egoist, the utilitarian is committed to everyone’s interests being regarded as equally morally important.
These features are fairly uncontroversial among utilitarians, but other features are the subject of considerable dispute. How “utility” should be understood is contested. The favoured ways of understanding utilitarianism have varied significantly since Jeremy Bentham—seen as the “father of utilitarianism”—produced the first systematic treatise on the view. There have also been proponents of views that resemble utilitarianism throughout history, dating back to the ancient world.
This article begins by examining some of the ancient forerunners to utilitarianism, identifying relevant similarities to the position that eventually became known as utilitarianism. It then explores the development of what has been called “classical utilitarianism”. Despite the name, “classical utilitarianism” emerged in the 18th and 19th centuries, and it is associated with Jeremy Bentham and John Stuart Mill. Once the main features of the view are explained, some common historical objections and responses are considered. Utilitarianism as a social movement, particularly influential in the 19th century, is then discussed, followed by a review of some of the modifications of utilitarianism in the 20th century. The article ends with a reflection on the influence of utilitarianism since then.
1. Precursors to Utilitarianism in the Ancient World
While utilitarianism became a refined philosophical theory (and the term “utilitarianism” was first used) in the 18th century, positions which bear strong similarities to utilitarianism have been defended throughout history. For example, similarities are sometimes drawn between utilitarianism and the teachings of Aristotle, the Buddha and Jesus Christ. In this section, two views from the ancient world are considered. The first is that of Mozi, who is sometimes described as the first utilitarian (though this is disputed). The second is that of Epicurus, whose hedonism was influential on the development of utilitarianism.
a. Mozi
Mozi (c. 400s-300s B.C.E.)—also known as Mo-Tzu, Mo Di and Mo Ti—led the Mohist school in Chinese philosophy, which, alongside the Confucian school, was one of the two major schools of thought during the Warring States period (403-221 B.C.E.). In this article, some salient similarities between his ethical outlook and utilitarianism will be observed. For a more detailed discussion of Mozi’s philosophy, including how appropriate it is to view him as a utilitarian, see the article devoted to his writings.
Utilitarians are explicit about the importance of impartiality, namely that the well-being of any one individual is no more important than the well-being of anyone else. This idea is also found in Mozi’s writings. The term jian’ai is often translated as “universal love”, but it is better understood as impartial care or concern. This notion is regarded as the cornerstone of Mohism. The Mohists saw excessive partiality as the central obstacle to good behaviour. The thief steals because they do not sufficiently care for the person they steal from, and rulers instigate wars because they care more for their own good than for the people whose countries they invade. Thus, Mozi implored his followers to “replace partiality with impartiality”.
His emphasis on the importance of impartiality bears striking similarities to arguments later made by Bentham and Sidgwick. Mozi’s impartiality is like the utilitarian’s in that it implies inclusivity and equality. Every person’s interests are morally important, and they are equally important.
A second clear similarity between Mohists and utilitarians is the focus on consequences when considering the justifications for actions or practices. Unlike the Confucians, who saw rituals and customs as having moral significance in themselves, Mozi held that they should be rejected unless they served some useful purpose. If a custom serves no useful purpose, it should be disposed of. For example, it was customary at the time to spend large quantities of resources on funeral rites, but Mozi criticised this practice on the grounds that it conferred no practical benefit. This scrutiny of the status quo, and this willingness to reform practices deemed unbeneficial, is something found repeatedly among utilitarians in the 18th century and beyond (see section 4).
A particularly interesting suggestion made by Mozi is that belief in ghosts and spirits should be encouraged. He claimed that historically, a belief in ghosts who would punish dishonesty or corrupt behaviour had motivated people to act well. Observing the scepticism about ghosts in his own time, Mozi thought people now felt free to act badly without fear of punishment: “If the ability of ghosts and spirits to reward the worthy and punish the wicked could be firmly established as fact, it would surely bring order to the state and great benefit to the people” (The Mozi, chapter 31).
Mozi approves of the belief in the existence of ghosts, whether or not they actually exist, because of the useful consequences of this belief. This suggestion that utility may count in favour of believing falsehoods is reminiscent of a claim by Henry Sidgwick (1838-1900). Sidgwick was a utilitarian, but he acknowledged that the general public may be happier if they did not believe utilitarianism was true. If that were the case, Sidgwick suggests that the truth of utilitarianism should be kept secret, and some other moral system that makes people happier be taught to society generally. This controversial implication, that it might be morally appropriate to mislead the general public when it is useful, is radical, but it is a reasonable inference from this type of moral view, which Mozi embraced.
A significant difference between Mozi and the utilitarians of the 18th century is the theory of the good he endorsed. Mozi sought to promote a range of goods, specifically order, wealth and a large population. Classical utilitarians, however, regarded happiness or pleasure as the only good. This view was presented shortly after Mozi, in Ancient Greece.
b. Epicureanism
The Epicureans, led by Epicurus (341-271 B.C.E.), were (alongside the Stoics and the Skeptics) one of the three major Hellenistic schools of philosophy. The Epicureans were hedonistic, which means that they saw pleasure as the only thing that was valuable in itself, and pain (or suffering) as the only ultimately bad thing.
This commitment is shared by later utilitarians, and it can be seen in slogans like “the greatest happiness of the greatest number”, which was later used by Francis Hutcheson and popularised by Bentham (though he later disliked it as too imprecise).
Though the Epicureans saw pleasure as the only good, the way they understood pleasure was somewhat different to the way one might imagine pleasure today. They realised that the most intense pleasures, perhaps through eating large amounts of tasty food or having sex, are short-lived. Eating too much will lead to pain further down the line, and appetites for sex dwindle. Even if appetites do not fade, becoming accustomed to intense pleasures may lead to sadness (a mental pain) further down the line if one’s desires cannot be satisfied. Thus, Epicurus endorsed finding pleasure in simple activities that could be reliably maintained for long periods of time. Rather than elaborate feasts and orgies, Epicurus recommended seeking joy in discussion with friends, developing tastes that could easily be satisfied and becoming self-sufficient.
A particular difference between the Epicurean view of pleasure and the view of later hedonists is that Epicurus regards a state of painlessness—being without any physical pains or mental disturbances—as one of pleasure. In particular, Epicurus thought we should aim towards a state of ataraxia, a state of tranquillity or serenity. For this reason, the Epicurean view is similar to a version of utilitarianism sometimes known as negative utilitarianism, which claims that morality requires agents to minimise suffering, as opposed to the emphasis typical utilitarians place on promoting happiness.
Epicurus also differed from utilitarians in terms of the scope of his teachings. His guidance was fairly insular, amounting to something like egoistic hedonism—one that encouraged everyone to promote their own personal pleasure. Epicurus encouraged his followers to find comfort with friends, and make their families and communities happy. This is a stark difference from the attitude of radical reform exhibited by Jeremy Bentham and his followers, who intended to increase the levels of happiness all over the world, rather than merely in the secluded garden that they happened to inhabit.
Epicurean teaching continued long after Epicurus’ death, with Epicurean communities flourishing throughout Greece. However, with the rise of Christianity, the influence of Epicureanism waned. There are several reasons that may explain this. The metaphysical picture of the world painted by Epicureans was one lacking in divine providence, which was seen as impious. Furthermore, the Epicurean attitude towards pleasure was often distorted, and portrayed as degrading and animalistic. This charge, albeit unfair, would go on to be a typical criticism of utilitarianism (see 3.d.ii). Due to these perceptions, Epicureanism was neglected in the Middle Ages.
By the 15th century, this trend had begun to reverse. The Italian Renaissance philosopher Lorenzo Valla (1407-1457) was influenced by Epicurus and the ancient Epicurean Lucretius (99-55 B.C.E.). Valla defended Epicurean ideas, particularly in his work, On Pleasure, and attempted to reconcile them with Christianity. Thomas More (1478-1535) continued the rehabilitation of hedonism. In Utopia (1516), More describes an idyllic society, where individuals are guided by the quest for pleasure. The Utopian citizens prioritised spiritual pleasures over animalistic ones, which may have made this view more amenable to More’s contemporaries. Later still, the French philosopher Pierre Gassendi (1592-1655) embraced significant portions of Epicurean thinking, including the commitment to ataraxia (tranquillity) as the highest pleasure. The Renaissance revival of Epicureanism paved the way for the development of utilitarianism.
2. The Development of Classical Utilitarianism
In the 17th and early 18th century, philosophical positions that are recognisably utilitarian gained prominence. None of the following thinkers labelled themselves “utilitarians” (the word had not yet been introduced), and whether some should properly be described in this way is a matter of some dispute, but each defended views with significant utilitarian features, and each has an important place in the intellectual history of utilitarianism.
a. Hutcheson
Francis Hutcheson (1694-1746) was a Scots-Irish philosopher sometimes seen as the first true utilitarian. Geoffrey Scarre (1996) suggests that Hutcheson deserves the title of “father of British utilitarianism” (though Bentham is more typically described in this kind of way). As with many attributions of this sort, this is heavily contested. Colin Heydt, for instance, suggests Hutcheson should not be classified as a utilitarian. Regardless, his contribution to the development of utilitarian thought is undisputed.
Hutcheson was a moral sense theorist. This means he thought that human beings have a special faculty for detecting the moral features of the world. The moral sense gives a person a feeling of pleasure when they observe pleasure in others. Further, the sense approves of actions which are benevolent. Benevolent actions are those that aim towards the general good.
One particular passage that had significant influence on utilitarians can be found in Hutcheson’s Inquiry Concerning the Original of Our Ideas of Virtue or Moral Good (1725):
In the same manner, the moral evil, or vice, is as the degree of misery, and number of sufferers; so that, that action is best, which procures the greatest happiness for the greatest numbers; and that, worst, which, in like manner, occasions misery.
The phrase, “greatest happiness for the greatest number(s)” became one of the major slogans of utilitarianism. This seems to be the first appearance of the phrase in English (though it was used decades previously by Leibniz). Because of this position, it is easy to see how Hutcheson can be interpreted as a utilitarian.
One important distinction between Hutcheson and utilitarians, however, is that he views the motives of individuals as what is valuable, rather than the state of affairs the action brings about. Whereas utilitarians view happiness itself as good, Hutcheson thinks it is the motives identified by our moral sense (which aim at happiness), which are good.
Hutcheson anticipates something similar to Mill’s higher/lower pleasures distinction (see 3.d.ii). In his posthumously published A System of Moral Philosophy, he says there are “a great variety of pleasures of different and sometimes inconsistent kinds, some of them also higher and more durable than others” (1755). Hutcheson associates dignity and virtuous action with the higher pleasures, and claims that “the exercise of virtue, for some short period, provided it is not succeeded by something vicious, is of incomparably greater value than the most lasting sensual pleasures”. These “higher” pleasures include social and intellectual activities, and are held to trump “lower” pleasures, like food and sex. Hutcheson is aware, however, that pleasures are “generally blended”. Lower pleasures may be accompanied by socialising, moral qualities, or friendship.
This appreciation for the variety and combinations of pleasure adds a rich texture to Hutcheson’s account. However, these intricacies may indicate a further difference between his view and utilitarianism. For the utilitarian, for a certain type of activity to be more valuable than another, this must be explained in terms of pleasure. Hutcheson, however, seems to determine which pleasures are higher and lower based on prior views he harbours about which are noble. He supposes that people who possess “diviner faculties and fuller knowledge” will be able to judge which pleasures are better, and thus which it is better to engage in and promote in others.
Hutcheson is further distinct from utilitarians in that it is unclear whether he is actually trying to provide a theory of right action. He notes that our moral sense can discern which actions are best and worst, but he does not explicitly link this to an account of what it is our duty to do, or what it would be wrong for us not to do. This could be viewed simply as something Hutcheson omitted, but alternatively could be interpreted as a version of scalar utilitarianism (see section 5.d).
b. Christian Utilitarianism
Utilitarianism today is usually seen as a secular doctrine. From Bentham onwards, utilitarians typically attempted to describe their worldview without referring to any theistic commitments. In the 18th century, however, there was a distinct branch of early utilitarians who gave theistic justifications for their position. Participants in this strand are sometimes referred to as “Anglican utilitarians”. Richard Cumberland (1631-1718) was an early example of this, and was later followed by John Gay (1699-1745), Soame Jenyns (1704-1787), Joseph Priestley (1733-1804), and William Paley (1743-1805). Paley’s Principles of Moral and Political Philosophy (1785) was the first to bring utilitarianism to a wider audience, and it remained the most discussed example of utilitarianism well into the 19th century.
Cumberland was a natural law theorist, which is to say he held that moral truths are determined by or can be derived from features of the world, including the nature of human beings. In Cumberland’s view, because human beings find pleasure good and pain bad, they can discern that God wills that they promote pleasure and diminish pain. In A Treatise of the Laws of Nature (1672), he writes: “Having duly pondered on these matters to the best of our ability, our minds will be able to bring forth certain general precepts for deciding what sort of human actions may best promote the common good of all beings, and especially of rational beings, in which the proper happiness of each is contained. In such precepts, provided they be true and necessary, is the law of nature contained.”
So, armed only with empirical facts about the world, like experiences of pleasure and pain, and our possessing the faculty of reason, Cumberland claimed that it was possible to ascertain that human beings have a God-given duty to promote the general happiness.
While secular versions of utilitarianism came to dominate the tradition, this type of argument for utilitarianism actually has some distinct advantages. Notably, this can provide simple answers to the question “Why be moral?”. Everyone may value their own happiness, so this provides everyone with a reason to act in ways that increase their own happiness. However, there are instances where promoting one’s own personal happiness seems to conflict with the common good. John Gay issued a challenge for secular versions of utilitarianism to explain why an agent in such a position has reason to sacrifice their own happiness to help others: “But how can the Good of Mankind be any Obligation to me, when perhaps in particular Cases, such as laying down my Life, or the like, it is contrary to my Happiness?” (Concerning the Fundamental Principle of Virtue or Morality, 1731).
For the Anglican utilitarian, this question is resolved easily. While it might appear that an individual’s happiness is best promoted by a selfish act contrary to the public good, this is only because rewards of the afterlife have not been taken into account. When someone recognises the infinite rewards for complying with God’s will (or infinite punishments for defying it), they will realise that acting in the interests of the common good (promoting the general happiness) is actually in their best interests. This kind of solution to the problem of moral motivation is not available for secular utilitarians.
Although theistically grounded versions of utilitarianism may stand on firmer ground when it comes to the problem of moral motivation, there are costs too. There are challenges to the existence of an all-powerful creator (see arguments for atheism). Even if those are avoided, the natural law reasoning championed by the Anglican utilitarians might not be persuasive. The inference from what kinds of things people enjoy to a specific divine purpose of human beings (for example, Priestley claims that we can discover that God “made us to be happy”) is one that might be scrutinised. Furthermore, the theistic utilitarian faces a version of the Euthyphro problem: is happiness good because God desires it, or does God desire happiness because it is good?
The Anglican utilitarians foresaw some of the problems that would become serious areas of discussion for later utilitarians. In Priestley, for instance, one can find a discussion of what would later be known as the “demandingness objection” (discussed in section 3.d.iii).
William Paley’s utilitarianism is of historical interest because he discussed several features of the view that have concerned utilitarians and their critics since. For example, he raised the question of whether certain types of action usually deemed to be evil, such as bribery or deceit, might be regarded as morally good if they lead to good consequences:
It may be useful to get possession of a place…or of a seat in parliament, by bribery or false swearing: as by means of them we may serve the public more effectually than in our private station. What then shall we say? Must we admit these actions to be right, which would be to justify assassination, plunder and perjury; or must we give up our principle, that the criterion of right is utility? (The Principles of Moral and Political Philosophy, 1785: 854).
In his answer to this question, Paley suggests a form of what would later be known as rule-utilitarianism (discussed further in section 5.c). He suggests that two types of consequences of an action can be distinguished—the general consequences and the particular consequences. The particular consequence is what follows from a specific action, that is, bribing someone on a given occasion. The general consequence is what follows from acting on that rule, and it is the general consequence Paley views as more important. Paley suggests that, in considering whether bribery to gain a political position is right, one should think about the consequences if everyone accepted a rule where bribery was allowed. Once this is taken into account, Paley argues, it will become apparent that bribery is not useful.
Like Epicurus, Paley is somewhat dismissive of animalistic pleasures, but his explanation for this differs. He makes a distinction between pleasures, which are fleeting, and happiness, which he seems to regard as possessed over longer periods of time:
Happiness does not consist in the pleasures of sense, in whatever profusion or variety they be enjoyed. By the pleasures of sense, I mean, as well the animal gratifications of eating, drinking, and that by which the species is continued, as the more refined pleasures of music, painting, architecture, gardening, splendid shows, theatric exhibitions; and the pleasures, lastly, of active sports, as of hunting, shooting, fishing, etc. (Principles of Moral and Political Philosophy, 35)
He claims these bodily pleasures do not contribute to happiness because they are too fleeting and “by repetition, lose their relish”. Rather, Paley sees happiness as consisting in social activities, the exercise of our faculties, and good health. Paley might then be seen as suggesting that happiness is something one does, rather than something one experiences. He also emphasises the importance of “prudent constitution of the habits” (which bears similarities to Aristotelian ethics). This distinguishes Paley somewhat from the classical utilitarians, who regarded pleasure as a mental state, and happiness consisting in pleasure as well as an absence of pain.
William Paley is also somewhat distinctive due to his conservative values. Unlike Bentham and his followers, who were radical reformers, Paley found the status quo satisfactory. There are a few reasons for this difference. One is that he thought happiness was relatively evenly distributed across society. He did not think, for instance, that the wealthy were significantly happier than the poor. This followed from his view of happiness—he thought the wealthy and the poor had fairly equal access to social activities, the exercise of their faculties, and good health.
In his discussions of what acts should be regarded as criminal and what the punishments should be, he does appeal to utility, but also regularly to scripture. As a consequence, Paley’s position on many social issues is one that would now be considered extremely regressive. For example, he favoured financial penalties for women guilty of adultery (but did not suggest the same for men) and argued that we should not pursue leisure activities (like playing cards or frequenting taverns) on the Sabbath. Like many of the later utilitarians, Paley did argue that slavery should be abolished, criticising it as an “odious institution”, but he was in favour of a “gradual” emancipation.
The Anglican utilitarians were extremely influential. Bentham was familiar with their work, citing Joseph Priestley in particular as a major inspiration. Many of the discussions that later became strongly associated with utilitarianism originated here (or were at least brought to a wider audience). An obvious difference between many of the Anglican utilitarians and the later (Benthamite) utilitarians is the conservatism of the former. (One notable exception is perhaps found in Priestley, who celebrated the French Revolution. This reaction was met with such animosity—his chapel was destroyed in a riot—that he emigrated to America.) The Anglican utilitarians were committed to the traditional role of the church and did not endorse anything like the kind of radical reform championed by Bentham and his followers.
c. French Utilitarianism
The development of utilitarianism is strongly associated with Britain. John Plamenatz described the doctrine as “essentially English”. However, a distinctly utilitarian movement also took place in 18th-century France. Of the French utilitarians, Claude Helvétius (1715-1771) and François-Jean de Chastellux (1734-1788) are of particular interest.
While the dominant form of utilitarianism in Britain in the 18th century was the Anglican utilitarianism of John Gay (see 2.b), the French utilitarians argued from no divine commitments. Helvétius’ De l’Esprit (1758) was ordered to be burned due to its apparently sacrilegious content. The secularism of the French utilitarians is historically noteworthy. As mentioned above (section 2.b), one advantage of theistically-grounded utilitarianism is that it solves the problem of moral motivation—one should promote the well-being of others because God desires it, and, even if one is fundamentally self-interested, it is in one’s interests to please God (because one’s happiness in the afterlife depends on God’s will). Without the appeal to God, giving an account of why anyone should promote the general happiness, rather than their own, becomes a serious challenge.
Helvétius proposes an answer to this challenge. He accepts that the general good is what we should promote but also, influenced by the Hobbesian or Mandevillian view of human nature, holds that people are generally self-interested. So, while people should promote the general good, human nature means that they will in fact pursue their individual goods. Helvétius takes this to show that we need to design our laws and policies so that private interest aligns with the general good. If everyone’s actions will be directed towards their own good, as a matter of human nature, “it is only by incorporating personal and general interest, that they can be rendered virtuous.” For this reason, he claims that morality is a frivolous science, “unless blended with policy and legislation”. Colin Heydt identifies this as the key insight that Bentham takes from Helvétius.
Taking this commitment seriously, Helvétius considered what it took to make a human life happy, and what circumstances would be most likely to bring this about. He approached this with a scientific attitude, suggesting “that ethics ought to be treated like all the other sciences. Its methods are those of experimental physics”. But this raises the question of how policy and legislation should be designed to make people happy.
Helvétius thought that to be happy, people needed to have their fundamental needs met. In addition to this, they needed to be occupied. Wealthy people may often find themselves bored, but the “man who is occupied is the happy man”. So, the legislator should seek to ensure that citizens’ fundamental needs are met, but also that they are not idle, because he viewed labour as an important component in the happy life. Helvétius treats the suggestion that labour is a negative feature of life with scorn, claiming:
“To regard the necessity of labour as the consequence of an original sin, and a punishment from God, is an absurdity. This necessity is, on the contrary, a favour from heaven” (A Treatise on Man: His Intellectual Faculties and Education, volume 2).
Furthermore, certain desires and dispositions are conducive to an individual’s happiness, so the legislator should encourage citizens to develop psychologically in a certain way. For instance, people should be persuaded that they do not need excessive wealth to be happy, and that, in fact, luxury does not enhance the happiness of the rich. Because of this, he proposed institutional restrictions on what powers, privileges, and property people could legally acquire. In addition, Helvétius suggested that education should serve to shape citizens’ beliefs about what they should even want, that is, people could be taught (or indoctrinated?) not to want anything that would not be conducive to the public good.
As poverty does negatively affect the happiness of the poor, Helvétius defended limited redistribution of wealth. Specifically, one suggestion he offered was to force families that have shrunk in size to relinquish some of their land to families which have grown. The best way to move from a state of misery (which he thought most people were in) to a state of happiness would vary from society to society, so his specific suggestions may have limited application. Helvétius nonetheless urged that this transformation should take place and that it might involve changing how people think.
In Chastellux’s work, the view that governments should act primarily to promote public happiness is explicit. In his De la Félicité publique (1774), he says: “It is an indisputable point, (or at least, there is room to think it, in this philosophical age, an acknowledged truth) that the first object of all governments, should be to render the people happy.”
Accepting this, Chastellux asked how this should be done. What is most noteworthy in Chastellux is that he pursued a historical methodology, examining which methods of government had been most successful in creating a happy populace, so that the more successful efforts might be emulated and developed. From his observations, Chastellux claimed that no society so far had discovered the best way to ensure the happiness of its citizens, but he does not find this disheartening. He notes that even if all governments had aimed at the happiness of their citizens, it would “be no matter of astonishment” that they had so far failed, because human civilisation is still in its infancy. He harbours optimism that the technological developments of the future could help improve the quality of life of the poorest in society.
While the historical methodology found in Chastellux may be questionable (Geoffrey Scarre describes it as “fanciful and impressionistic”), it showed a willingness to utilise empirical measures in determining what is most likely to promote the general happiness.
Of the French utilitarians, Helvétius had the greatest influence on later developments in Britain; he was regularly acknowledged by Jeremy Bentham, William Godwin, and John Stuart Mill. The conviction that good legislation and policy can secure happiness became the crucial desire of utilitarians in the political realm. In Helvétius, we can also see the optimism of the radical reformer utilitarians, in his hope that “wise laws would be able without doubt to bring about the miracle of a universal happiness”.
3. Classical Utilitarianism
While many thinkers were promoting recognisably utilitarian ideas long before him, it is Jeremy Bentham who is credited with providing the first systematic account of utilitarianism in his Introduction to the Principles of Morals and Legislation (1789).
a. Origin of the Term
The word “utilitarianism” is not used in Jeremy Bentham’s Introduction to the Principles of Morals and Legislation (IPML). There he introduces the ‘principle of utility’, that “principle which approves or disapproves of every action whatsoever, according to the tendency it appears to have to augment or diminish the happiness of the party whose interest is in question; or, what is the same thing in other words to promote or to oppose that happiness”. Bentham borrows the term “utility” from David Hume’s Treatise of Human Nature (1739-1740). There, Hume argues that for any character traits viewed as virtues, this can be explained by the propensity of those traits to cause happiness (‘utility’). Bentham later reported that upon reading this, he “felt as if scales had fallen from my eyes”.
The first recorded use of the word “utilitarianism” comes in a letter Bentham wrote in 1781. The term did not catch on immediately. In 1802, in another letter, Bentham was still resisting the label “Benthamite” and encouraging the use of “utilitarian” instead. While Bentham seems to have originated the term, this does not seem to have been common knowledge. John Stuart Mill, in Utilitarianism (1861), notes that he found the term in an 1821 John Galt novel. He was using it as early as 1822, when he formed the ‘Utilitarian Society’, a group of young men who met every two weeks for three and a half years. After this, the term entered common parlance.
b. Bentham
As well as providing what became the common name of the view, Jeremy Bentham (1748-1832) is credited with making utilitarianism a systematic ethical view. His utilitarian inclinations were sparked when he read Joseph Priestley’s Essay on Government (1768), and he claims that the “greatest happiness of the greatest number” is the measure of right and wrong in his Fragment on Government (1776). It is in IPML, however, where the ideas are presented most clearly and explicitly.
In IPML, Bentham defines utility as “that property in any object, whereby it tends to produce benefit, advantage, pleasure, good, or happiness”. In the opening of IPML, Bentham makes clear his view that utility (pleasure and pain) determines the rightness or wrongness of an action. He states:
Nature has placed mankind under the governance of two sovereign masters, pain and pleasure. It is for them alone to point out what we ought to do, as well as determine what we shall do. On the one hand the standard of right and wrong, on the other the chain of causes and effects, are fastened to their throne. They govern us in all we do, in all we say, in all we think: every effort we can make to throw off our subjection, will serve but to demonstrate and confirm it.
As well as emphasising hedonism as the standard of rightness (normative hedonism), Bentham seems here committed to a certain view about our motivation. He not only claims that the rightness or wrongness of an action is determined by pain/pleasure, but also that these notions determine what we will do. Specifically, following Hobbes, Bentham thought that everyone is, as a matter of fact, always motivated by their own happiness, a form of psychological egoism. If we accept the ought-implies-can principle, the idea that we can only be required to act in ways that it is actually possible for us to act, this is a difficult position to reconcile with the claim that we ought to promote the general happiness. If human beings are necessarily always motivated by their own self-interest, imploring them to promote the interests of others seems futile.
Bentham was aware of this sort of objection. One type of response he gives is to claim that we should ensure, where possible, that society is structured so that when individuals act in their own interests, this is conducive to the general happiness. This answer is reminiscent of the strategy deployed by Helvétius (section 2.c). When the incentive and punitive structures in society are arranged in this way, self-interested actions benefit the wider community. Second, he suggests that individuals do benefit from living in a community where the general good is promoted. This amounts to a denial that any self-interested action actually clashes with the general good. Many find this implausible, since any action that would be good for the general good but bad for the individual acting would disprove it. This move is rendered unnecessary if psychological egoism is abandoned, and given some of the arguments against that view, Bentham’s utilitarianism may be better off without the psychological claim.
One of the ideas Bentham is known for is the “hedonic calculus” or “felicific calculus” (though Bentham never himself used either of these terms). The crux of this is the thought that to determine the value of an action, one can use a kind of moral ledger. On one side of the ledger, the expected good effects of the action and how good they are can be added up. On the other side, the bad effects of the action can be added. The total value of the negative effects can then be subtracted from the value of the positive effects, giving the total value of the action (or policy). This idea was first introduced by Pierre Bayle (1647-1706), though Bentham adds considerable depth to the idea.
In considering how to value a quantity of pleasure (or pain), Bentham observed that we can evaluate it with regards to seven dimensions or elements. These are the pleasure’s:
(1) intensity (how strong the pleasure is)
(2) duration (how long the pleasure lasts)
(3) certainty/uncertainty (the probability it will occur)
(4) propinquity or remoteness (how soon the pleasure will occur)
(5) fecundity (how likely it is to be followed by further pleasures)
(6) purity (how likely it is to be followed or accompanied by pains)
(7) extent (the number of persons it extends to)
Bentham included a poem in the second edition of IPML, so that people could remember these dimensions:
Intense, long, certain, speedy, fruitful, pure –
Such marks in pleasures and in pains endure.
Such pleasures seek if private be thy end:
If it be public, wide let them extend
Such pains avoid, whichever be thy view:
If pains must come, let them extend to few.
On Bentham’s view, these are all the features we must know of a certain pleasure. Importantly, even a frivolous game, if it turns out to have the same intensity, duration, and so forth, is just as good as intellectual pursuits. He says this explicitly about the game push-pin (a children’s game where players try to hit each other’s pins on a table): “Prejudice apart, the game of push-pin is of equal value with the arts and sciences of music and poetry”. Notably, this view set him apart from those who claimed a difference in kind between types of pleasures, like John Stuart Mill (see section 3.d.ii).
While Bentham does suggest that this kind of happiness arithmetic would be successful in determining what actions are best, he does not suggest that we consider every factor of every possible action in advance of every given action. This would obviously be excessively time consuming, and could result in a failure to act, which would often be bad in terms of utility. Rather, we should use our experience as a guide to what will likely promote utility best.
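The ledger idea admits a simple schematic rendering. The Python sketch below is purely illustrative: the function names and all numeric scores are invented, and only four of Bentham’s seven elements (intensity, duration, certainty, and extent) are modelled, with propinquity, fecundity, and purity omitted for brevity.

```python
# Illustrative only: a toy version of Bentham's moral ledger.
# Pains are entered as negative intensities; all numbers are invented.

def hedonic_value(intensity, duration, certainty, extent=1):
    """Score one expected pleasure or pain (pain = negative intensity)."""
    return intensity * duration * certainty * extent

def action_value(effects):
    """Total the ledger: the good effects minus the bad effects."""
    return sum(hedonic_value(**e) for e in effects)

# A self-serving act: a small private gain against a wide public harm.
bribery = [
    {"intensity": 5, "duration": 2, "certainty": 0.9},                  # briber's gain
    {"intensity": -3, "duration": 4, "certainty": 0.8, "extent": 100},  # public harm
]
print(action_value(bribery))  # negative: the harms outweigh the gain
```

On these invented numbers the mild but widespread public harm swamps the briber’s private gain, which is the kind of verdict the ledger method is meant to deliver.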
Though the term “greatest happiness for the greatest number” has become strongly associated with utilitarianism and is used by Bentham in earlier works, he later distanced himself from it, because in it “lurks a source of misconception”. One interpretation of the expression suggests we should ascertain the largest number of people benefited by an action (the greatest number) and benefit those as much as possible, no matter what the effects are on the remainder. For instance, we could imagine a policy that enslaved 1% of the population for the benefit of the 99%, greatly benefiting that majority, but making the enslaved miserable. A policy like this, which ignores entirely the well-being of some, is certainly not what Bentham intended. He later speaks simply of the “greatest happiness principle”, the requirement to promote the greatest happiness across the whole community.
Bentham was an active reformer. He argued for radical political changes, including the right to vote for women, significant prison reforms, the abolition of slavery, the elimination of capital punishment, and sexual freedom. Each of these was argued for on grounds of utility. Bentham gained a number of intellectual followers. One of the most notable of these was James Mill (1783-1836), a major figure in 19th-century philosophy and economics. James Mill’s reputation was international, attracting attention from Karl Marx (1818-1883), and he is still seen as one of the most important figures in utilitarianism, though today he is overshadowed by his son. John Stuart Mill (1806-1873) met Bentham when he was two years old, and, under the influence of Bentham and his father, became one of utilitarianism’s fiercest champions. John Stuart Mill’s defence of utilitarianism is still the most widely read today (discussed in more depth in 3.d).
c. Features of Classical Utilitarianism
It is a matter of some dispute what features make a moral theory appropriate for the name utilitarianism. The core features mentioned here are those commonly associated with classical utilitarianism. It is not clear how many of those associated with utilitarianism, even in 19th century Britain, actually accepted classical utilitarianism, that is, who thought the correct moral theory possessed these six features. For instance, though John Stuart Mill is regarded as the man who did most to popularise the view, he rejected elements of this picture, explicitly denying the requirement to maximise utility (see Jacobson 2008 for a discussion of how Mill deviates from this orthodox picture). Regardless of how many actually held it, the view consisting of these claims has become the archetype of utilitarianism. The more a moral view departs from these, the less likely it is to be deemed a version of utilitarianism.
i. Consequentialism
Views are classed as consequentialist if they place particular emphasis on the outcomes of actions, rather than on features intrinsic to the actions themselves (for example, whether they involve killing, deception, kindness, or sympathy), as forms of deontology do, or on what the actions might reveal about the character of the agent performing them (as does virtue ethics).
Classical utilitarianism is uncontroversially consequentialist. Later variations, such as rule-utilitarianism (see section 5.c), which regard consequences as having an important role, are less easily categorised. Versions of utilitarianism that do not assess actions solely in terms of the utility they produce are sometimes referred to as indirect forms of utilitarianism.
ii. Hedonism
Following the Epicureans, classical utilitarianism regards pleasure as the only thing that is valuable in itself. Pleasure is the “utility” in classical utilitarianism. On this view, actions are morally better if they result in more pleasure, and worse if they result in less.
Hedonists differ on how they understand pleasure. The Epicureans, for instance, regarded a state of tranquillity (ataraxia) as a form of pleasure, and one that should be pursued because it is sustainable. Classical utilitarians typically regard pleasure as a mental state which the individual experiences as positive. Bentham evaluated pleasures across his seven elements, but importantly thought no pleasure was superior in kind to any other. For example, the pleasure from eating fast food is no less valuable than the pleasure one may attain from reading a great novel, though they may differ in terms of sustainability (one might become ill fairly quickly from eating fast food) or propinquity (pleasure from fast food may be quick, whereas it may take some time to come to appreciate complex prose). This parity of pleasures was something John Stuart Mill disagreed with, leading to a notable difference in their views (see 3.d.ii).
Many contemporary utilitarians, recognising issues with hedonism, have instead adopted welfarism, the weaker claim that the only thing that is intrinsically valuable is well-being, that is, whatever it is that makes a life go well. Well-being could be given a hedonistic analysis, as in classical utilitarianism, but alternatively a preference-satisfaction view (which states that one’s well-being consists in having one’s preferences satisfied) or an objective-list view (which states that lives go well or badly depending on how well they satisfy a set list of criteria) could be adopted.
iii. Aggregation
The utilitarian thinks that everyone’s individual pleasure is good, but they also think it makes sense to evaluate how good an outcome is by adding together all the respective quantities of pleasure (and pain) of the individuals affected. Imagine that we can assign a numerical value to how happy every person is (say 10 is as happy as you could be, zero is neither happy nor unhappy, and -10 is as unhappy as you could be). The aggregative claim holds that we can simply add these quantities together for each action to see which is the best.
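As a toy illustration of the aggregative claim (the scores below are invented, using the -10 to 10 scale just described), comparing outcomes reduces to comparing sums:

```python
# Invented happiness scores for four affected people under two alternative
# actions, on a scale from -10 (as unhappy as possible) to 10 (as happy as possible).
outcome_a = [4, 3, -2, 5]    # sums to 10
outcome_b = [9, 9, -8, -1]   # sums to 9

def aggregate(scores):
    """The aggregative claim: an outcome's value is the plain sum of the scores."""
    return sum(scores)

best = max([outcome_a, outcome_b], key=aggregate)  # outcome_a, by 10 to 9
```

Note that the sum discards all information about how happiness is distributed among the four individuals; that loss of distributional information is what the criticisms of aggregation target.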
One of the criticisms sometimes made of utilitarianism is that it ignores the separateness of persons. When we decide actions based on aggregated sums of happiness, we no longer think about individuals as individuals. Instead, they are treated more like happiness containers. A related complaint is that determining the best outcome by adding together the happiness scores of every individual can obscure extremes that might be morally relevant. This has implications that many find counterintuitive, such as that this method may judge an outcome where one person undergoes horrific torture to be a good outcome, so long as enough other people are happy.
iv. Optimific (‘Maximising’)
Hedonists believe pleasure is the only good. Aggregation commits utilitarians to the idea that the pleasures and pains of different people can be added to compare the value of outcomes. One could accept these claims without thinking that a moral agent must always do the best. Classical utilitarianism does hold that one is required to perform the best action. In other words, classical utilitarianism is a maximising doctrine (“maximising” is another word introduced into English by Jeremy Bentham).
Maximising views are controversial. One reason for this is that they eliminate the possibility of supererogatory actions, that is, actions that are beyond the call of duty. For example, we might think donating most of your income to charity would be a wonderful and admirable thing to do, but not something that is usually required. The maximiser claims that you must do the best action, and this is the case even if doing so is really difficult, or really costly, for the person acting.
Some of the most persistent criticisms of utilitarianism concern how much it demands. In response, some of the 20th-century revisions of the view sought to abandon this element, for example, satisficing versions and scalar views (5.d).
v. Impartiality
Utilitarians embrace a form of egalitarianism. No individual’s well-being is more important than any other’s. Because of this, utilitarians believe that it is just as important to help distant strangers as it is to help people nearby, including one’s friends or family. As Mill puts it, utilitarianism requires an agent “to be as strictly impartial as a disinterested and benevolent spectator”.
In fact, sometimes impartiality may require a person to help a stranger instead of a loved one. William Godwin (1756-1836) highlighted this in a famous example. He described a scenario where a fire broke out, and a bystander was able to save either Archbishop Fénelon (a famous thinker and author of the time) or a chambermaid. Godwin argued that because of Fénelon’s contributions to humanity, a bystander would be morally required to save him. Moreover, Godwin claimed, one would be required to save Fénelon even if the chambermaid was one’s mother.
This requirement for strict impartiality strikes many as uncomfortable, or even alienating. When challenged, Godwin defended his position, but insisted that scenarios where this kind of sacrifice is required would be rare. In most instances, he thought, people do happen to be more able to bring happiness to themselves or their loved ones, because of greater knowledge or increased proximity. In this way, some partial treatment, like paying more attention to one’s friends or family, can be defended impartially.
vi. Inclusivity
The classical utilitarian accepts the hedonist commitment that happiness is what is valuable. It is a separate question whose happiness should count. Utilitarians answer this with the most inclusive answer possible—everyone’s. Any subject that is capable of pleasure or pain should be taken into consideration.
This has some radical implications. As well as human beings, many animals can also experience pleasure or pain. On this topic, one passage from Bentham is regularly deployed by defenders of animal rights:
It may come one day to be recognized, that the number of legs, the villosity of the skin, or the termination of the os sacrum, are reasons equally insufficient for abandoning a sensitive being to the same fate. What else is it that should trace the insuperable line? Is it the faculty of reason, or perhaps, the faculty for discourse? …the question is not, Can they reason? nor, Can they talk? but, Can they suffer? (IPML, chapter XVII)
Reasoning of this sort extends the domain of morally relevant beings further than many were comfortable with. Bentham was not alone among utilitarians in suggesting that non-human life should be taken into moral consideration. In his Utilitarianism, Mill noted that lives full of happiness and free from pain should be “secured to all mankind; and not to them only, but, so far as the nature of things admits, to the whole sentient creation.” This emphasis on the importance of the well-being of animal life, as well as human life, has persisted into contemporary utilitarian thought.
d. Early Objections and Mill’s Utilitarianism
In the 19th century, knowledge of utilitarianism spread throughout society. This resulted in many criticisms of the view. Some of these were legitimate challenges to the view, which persist in some form today. Others, however, were based upon mistaken impressions.
In 1861, frustrated by what he saw as misunderstandings of the view, John Stuart Mill published a series of articles in Fraser’s Magazine, introducing the theory and addressing some common misconceptions. This was later published as a book, Utilitarianism (1863). Mill was somewhat dismissive of the importance of this work. In letters, he described it as a “little treatise”, and barely mentioned it in his Autobiography (unlike all his other major works). Despite this, it is the most widely consulted defence of utilitarianism.
Here are some of the early criticisms of utilitarianism, and Mill’s responses.
i. Dickens’ Gradgrindian Criticism
In the 19th century, utilitarianism was perceived by some of its detractors as cold, calculating, and unfeeling. In his 1854 novel, Hard Times, Charles Dickens portrays a caricature of a utilitarian in the character of Thomas Gradgrind. Gradgrind, who is explicitly identified as a utilitarian, is introduced as follows:
Thomas Gradgrind, sir. A man of realities. A man of facts and calculations. A man who proceeds upon the principle that two and two are four, and nothing over, and who is not to be talked into allowing for anything over. Thomas Gradgrind, sir—peremptorily Thomas—Thomas Gradgrind. With a rule and a pair of scales, and the multiplication table always in his pocket, sir, ready to weigh and measure any parcel of human nature, and tell you exactly what it comes to. It is a mere question of figures, a case of simple arithmetic. You might hope to get some other nonsensical belief into the head of George Gradgrind, or Augustus Gradgrind, or John Gradgrind, or Joseph Gradgrind (all supposititious, non-existent persons), but into the head of Thomas Gradgrind—no, sir!
The reputation of utilitarians for being joyless and overly fixated on precision was so established that John Stuart Mill addressed this misconception in Utilitarianism (1861). Mill complains that the opponents of utilitarianism have mistakenly supposed that the view opposes pleasure, a mistake he describes as an “ignorant blunder”. This view of the position may come, in part, from its name, and the focus on utility, or what is useful or functional—terms seldom associated with happiness.
Despite Mill’s frustrations with this criticism, the colloquial use of the word “utilitarian” continued to carry similar connotations long after his death. In an episode of the sitcom Seinfeld, for example, Elaine notes that while the female body is aesthetically appealing, “The male body is utilitarian — it’s for getting around. It’s like a Jeep” (1997). The implication is that utilitarian objects are functional rather than fun. This association may be unfortunate and unfair, as Mill argues, but it has been a persistent one.
This particular criticism may be unfortunate, but aspects of it—such as the focus on measurement and arithmetic—foreshadow some of utilitarianism’s later criticisms, like John Rawls’ (1921-2002) suggestion that it cannot appreciate the separateness of persons, or Bernard Williams’ (1923-2003) complaint that the view insists that people regard themselves as merely nodes in a utility calculus.
ii. The ‘Swine’ Objection and ‘Higher Pleasures’
Another criticism regularly levelled against utilitarianism was that it is unfit for humans, because its focus on pleasure would not allow for the pursuit of uniquely human goods. This criticism was also made (unfairly) of the Epicureans. It suggested that the hedonist would endorse a life consisting entirely of eating, sleeping, and having sex, devoid of more sophisticated activities like listening to music, playing card games, or enjoying poetry. The allegation is that the utilitarian proffers an ethics for swine, undignified for human beings. Consequently, the opponent suggests, the view must be rejected.
There are several ways a utilitarian could respond to this. They could make use of the Epicurean strategy, which is to suggest that the animalistic pleasures are just as good but not sustainable. If you try to spend all your time eating delicious food, your appetite will run out, and you may make yourself sick. Pleasures of the mind, however, can be pursued for longer. Someone able to take pleasure in listening to poetry or music may also find that pleasure more readily satisfied: indulging in pleasures of these sorts does not require scarce resources, and so is less vulnerable to contingent environmental factors. A bad harvest may ruin one’s ability to enjoy a certain food, but it would not tarnish one’s ability to enjoy a piece of music or think about philosophy. This is the type of response that would have satisfied Bentham. He thought that no type of pleasure was intrinsically better than another (that push-pin “is of equal value with the arts and sciences of music and poetry”).
Mill disagreed with Bentham on this matter, claiming instead that “some kinds of pleasure are more desirable and more valuable than others”. On his view, the pleasure gained from appreciating a sophisticated poem or an opera could be better than the pleasure from push-pin, even if both instances had the same duration, were equally intense, and had no additional relevant consequences.
This was a controversial aspect of Mill’s utilitarianism, and many found his justification for this unconvincing. He suggested that someone who had experienced two different kinds of pleasures would be able to discern which was the higher quality. Some people may not be able to appreciate some forms of pleasure, because of ignorance or a lack of intelligence, just as animals are not capable of enjoying a great novel. But, according to Mill, it is generally better to be the intelligent person than the fool, and better to be a human than a pig, even a happy one: “It is better to be a human being dissatisfied than a pig satisfied; better to be Socrates dissatisfied than a fool satisfied. And if the fool, or the pig, is of a different opinion, it is only because they only know their own side of the question” (Mill, Utilitarianism, chapter 2).
Mill’s suggestion, however, invites scrutiny. Many people do opt for “lower” pleasures, rather than “higher” ones, even when capable of enjoying both. One might also wonder whether a mixture of different kinds of pleasure would be preferable to restricting oneself to those Mill regards as superior (the pleasures more closely associated with the intellect and reasoning), or whether different people simply have different preferences among these kinds of pleasure, without that indicating any superiority or inferiority; Mill considers neither possibility. Mill’s proposal raises many questions, so a utilitarian may find that the simpler, Benthamite ‘quantitative hedonism’ is preferable to Mill’s ‘qualitative hedonism’ (see here for further discussion of this distinction).
While this aspect of Mill’s utilitarianism is contentious, a similar type of argument is still utilised to justify the claim that animals have a different moral status (see also the discussion of animals and ethics).
iii. Demandingness
Because of the classical utilitarian commitment to maximisation, utilitarianism is sometimes accused of being excessively demanding. Everyone is required, according to the classical utilitarian, to bring about the most happiness. If an individual can best serve the general utility by living an austere, self-sacrificial life, this is what the utilitarian calculus demands. However, this strikes many as counterintuitive. According to common-sense moral thinking, people can use their time in myriad ways without having morally failed, but the maximiser states that one must always do the very best. Morality then threatens to encroach on every decision.
Mill was aware of this criticism. He identified two particular ways this might be a concern.
First, utilitarianism may be seen to require that moral agents are always thinking about duty, that this must be the motive in every action a person performs. Thinking about morality must be central in all a person’s decisions. This, he claims, is a mistake. Mill argues that the business of ethics is people’s conduct, not whether they act because of a conscious desire to bring about the greatest utility. He provides an example to illustrate this. If a bystander notices someone drowning, what matters is that they save them, whatever their reasons might be:
He who saves a fellow creature from drowning does what is morally right, whether his motive be duty, or the hope of being paid for his trouble: he who betrays the friend that trusts him, is guilty of a crime, even if his object be to serve another friend to whom he is under greater obligations. (Utilitarianism, chapter 2)
Here, Mill makes a distinction between the moral worth of the action and the moral worth of an agent. As far as the action is concerned, the drowning person being rescued is what matters. Whether the person doing the saving is an admirable person might depend on whether they did it for noble reasons (like preventing suffering) or selfish reasons (like the hope of some reward), but utilitarianism is primarily concerned with what actions one should do. In other places, Mill does talk extensively about what makes a virtuous person, and this is strongly connected to his utilitarian commitments.
Second, Mill was aware of the worry that utilitarianism might dominate one’s life. If every action one performs must maximise utility, will this not condemn one to be constantly acting for the sake of others, to the neglect of the things that make one’s own life meaningful? Mill was dismissive of this worry, claiming that “the occasions on which any person (except one in a thousand) has it in his power to do this on an extended scale, in other words, to be a public benefactor, are but exceptional”. Sometimes, one might find oneself in a situation where one could save a drowning stranger, but such scenarios are rare. Most of the time, Mill thought, one individual does not have the ability to affect the happiness of others to any great degree, so they can focus on improving their own situation, or the situations of their friends or families.
In the 19th century, this response may have been more satisfactory, but today it seems wildly implausible. Due to the existence of effective charities, and the ability to send resources around the world instantly, an affluent person can make enormous differences to the lives of people halfway around the world. This could be in terms of providing food to countries experiencing famine, inoculations against debilitating illnesses or simply money to alleviate extreme poverty. In his time, perhaps Mill could not have been confident that small sums of money could prevent considerable suffering, but today’s middle classes have no such excuse.
Because of technological developments, for many people in affluent countries, maximising happiness may require living a very austere life, while giving most of their resources to the world’s poorest people. This appears implausible to many people, and this intuition forms the basis of one of the major objections to utilitarianism today. Some have responded to this by moving to rule, satisficing, or scalar forms of utilitarianism (see section 5).
iv. Decision Procedure
The utilitarian claims that the right action is that which maximises utility. When an agent acts, they should act in a way that maximises expected utility. But how do they determine this? One way is to consider every possible action one might do, and for each one, think about all the consequences one might expect (with appropriate weightings for how likely each consequence would be), come up with an expected happiness value for each action, and then pick the one with the highest score. However, this sounds like a very time-consuming process. This will often be impossible, as time is limited. Is this a problem for utilitarians? Does it make the view impractical?
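The procedure just described is, in effect, an expected-utility calculation. A minimal sketch of it in code might look like the following, where the actions, probabilities, and happiness scores are invented purely for illustration (they are not drawn from any utilitarian text):

```python
# Sketch of the decision procedure described above: for each available
# action, weight each possible outcome's happiness value by its
# probability, then choose the action with the highest expected utility.
# All names and numbers here are hypothetical, for illustration only.

def expected_utility(outcomes):
    """outcomes: list of (probability, happiness_value) pairs."""
    return sum(p * u for p, u in outcomes)

def best_action(actions):
    """actions: dict mapping an action name to its possible outcomes."""
    return max(actions, key=lambda a: expected_utility(actions[a]))

actions = {
    "donate to charity": [(0.9, 10), (0.1, 0)],  # probably helps a lot
    "buy a coffee":      [(1.0, 2)],             # small, certain pleasure
    "do nothing":        [(1.0, 0)],
}

print(best_action(actions))  # prints: donate to charity
```

The objection in the text is precisely that an agent cannot realistically enumerate every action and outcome this way before acting; the sketch only makes vivid what the maximiser is, in principle, being asked to compute.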
Mill was aware of this concern, that “there is not time, previous to action, for calculating and weighing the effects of any line of conduct on the general happiness.” However, Mill thinks this objection obscures relevant information gained throughout human history. As people have acted in all sorts of ways, with varying results, any person today can draw upon humanity’s wealth of knowledge of causes and effects, as well as from their own experiences. This background knowledge provides reasons to think that some actions are likely to be more conducive to happiness than others. Often, Mill thinks, an agent will not need to perform any calculations of utility to determine which actions best promote happiness; it will just be obvious.
Mill ridicules the suggestion that individuals would be completely ignorant of what actions they must do if they were to adopt utilitarianism. There would, of course, be no need to contemplate on each occasion whether theft or murder promote utility—and even if there were, he suggests that this would still not be particularly puzzling. Acknowledging this criticism with some derision, Mill notes that “there is no difficulty in proving any ethical standard whatever to work ill, if we suppose universal idiocy to be conjoined with it”.
However, this kind of objection relates to an interesting question. Should a utilitarian endorse reasoning like a utilitarian? Mill suggests that it is preferable on many occasions to make use of rules that have been previously accepted. But how does one determine when to use a rule and when to perform a utility calculation? Some of Mill’s remarks about how to use rules have prompted commentators to regard him as a rule-utilitarian (see section 5.c). Utilitarianism also seems to allow for the possibility that no one should believe that utilitarianism is true. If, for instance, it turns out that the world would be a happier place if everyone accepted a Kantian ethical theory, the utilitarian should, by their own lights, favour a world where everyone is a Kantian. Henry Sidgwick (1838-1900) took this seriously, and he defended the idea that perhaps only an “enlightened few” should know the truth about morality, and keep it hidden from the masses.
Utilitarians can say that the truth of their view does not depend on what the correct decision procedure is. Whether performing a utility calculus or simply acting on common-sense morality leads to most happiness, they can still say that the right actions are those that lead to happiness being maximised, that is, that utilitarianism is the correct theory. However, given that utilitarians do tend to care about how people should act, and want to change behaviours, the question of how one should decide what to do is pertinent. Exactly what the relationship between utilitarianism and practical reasoning is, or should be, according to utilitarians, is a persisting question.
4. The Utilitarian Movement
Today, utilitarianism is regarded primarily as a moral theory which can be used to determine the obligations of an individual in a situation. This focus on individual morality gives an inaccurate impression of the Utilitarian movement (‘Utilitarianism’ with a capital ‘U’ will be used to indicate the movement, as distinct from the moral theory) in the 18th and 19th centuries. The Utilitarians were keenly focused on social change. This took the form of revising social policy with the aim of improving the general happiness. Bentham is explicit on the first page of Introduction to the Principles of Morals and Legislation that the principle of utility applies not only to actions of private individuals, but also to “every measure of government”. Helvétius was similarly minded, emphasising the importance of laws that could make people happy, as well as ways to change people, so that they could be made happy more easily.
The Utilitarian project was an ambitious one. Every policy, every law, every custom was open to scrutiny. If it was deemed not conducive to general happiness, the Utilitarians suggested it should be disregarded or replaced. Because they were so willing to disregard customs—even those the general community placed high value on—the Utilitarians were a radical group. This section discusses some of the policies supported by Utilitarians.
A common plea from Utilitarians, deemed radical at the time, was for women’s suffrage. A notable example of this comes from Harriet Taylor (1807-1858). Taylor befriended and later married John Stuart Mill, and she is regarded as a prominent Utilitarian in her own right. She had a significant influence on Mill’s writing (exactly how much influence she had is a matter of dispute, though Mill said in his introduction to On Liberty, “Like all that” he had “written for many years, it belongs as much to her as to” him). In Taylor’s Enfranchisement of Women (1851), she argues that women should have equal political rights to men, including the right to vote and to serve in juries. In fact, Taylor’s arguments call for the equal access to all spheres of public life. In particular, she claimed women should be able to enter all professions, including running for political office.
In the same essay, Taylor condemned slavery. This was another point Utilitarians were largely united on. Bentham also criticised slavery on the grounds that it had negative effects on the general happiness, and when abolition was discussed in parliament, he actively opposed compensating slave-traders for their losses. John Stuart Mill was also vocal on the topic of slavery and the just treatment of former slaves. As a Member of Parliament, Mill chaired the Jamaica Committee, which aimed to prosecute Governor Eyre of Jamaica, who used excessive and deadly force in suppressing an uprising at Morant Bay in 1865. This pitted Mill against many prominent intellectuals, including his contemporary (and sometimes friend) Thomas Carlyle (1795-1881). Mill received assassination threats for his position, which was seen by many as overly sympathetic towards the Black Jamaicans.
Like his wife, John Stuart Mill also campaigned for the rights of women. He thought not only that society would benefit considerably from the liberation of women, but also that there would be an “unspeakable gain in private happiness to the liberated half of the species; the difference to them between a life of subjection to the will of others, and a life of rational freedom”. As well as making the case in his book The Subjection of Women (which drew heavily upon material from his wife’s previous work), Mill spoke passionately in favour of expanding suffrage in Parliament. This cause clearly moved Mill, who was reportedly arrested as a teenager for distributing information about contraception. Henry Sidgwick was also an active campaigner, particularly regarding education reform. He became one of the leading voices advocating for access to higher education for women and was one of the organisers of “Lectures for Ladies” at Cambridge, which, in 1871, led to the formation of Newnham College, an all-women’s college (at the time, women were not allowed to attend the university).
Jeremy Bentham, in the early 1800s, wrote essays defending sexual freedom. He was motivated by the harsh way that society treated homosexuals and thought there could be no utilitarian justification for this. While many members of the public may have been offended by these behaviours, they were not harmful, but the restrictions and punishments faced by the marginalised groups were.
Utilitarians were also vocal in defense of animal welfare. Bentham argued that the feature relevant for whether an entity has moral status is “is not, Can they reason? nor, Can they talk? but, Can they suffer?”. Mill, despite famously arguing that humans can appreciate “higher pleasures” than animals, is insistent that animal welfare is relevant. He thought it obvious that, for a utilitarian, any practice that led to more animal suffering than human pleasure was immoral, thus it seems likely he would have opposed factory farming practices.
Not all of the proposals endorsed by Utilitarians are looked on quite so favourably with a modern eye. While John Stuart Mill argued, from utilitarian principles, for a liberal democratic state, he suggested that those arguments did not apply to “barbarians” who were “unfit for representative government”. Infamously, Mill considered India unsuitable for democracy, and is seen by some as an apologist for the British Empire for defending this kind of view.
Another infamous proposal from the Utilitarians comes from Bentham in the domain of prison reform. Bentham suggested an innovative prison design known as the “panopticon” (1787). This was designed to be humane and efficient. A panopticon prison is circular with cells around the edges, and an inspector’s lodge in the middle, situated so that the guard can view each cell. From the inspection lodge each cell would be visible, but blinds to the inspector’s lodge would prevent the prisoners from seeing whether they were being watched, or even whether a guard was present, at any given time. The mere possibility that they were being watched at any time, Bentham thought, would suffice to ensure good behaviour. He also thought that this would prevent guards from mistreating prisoners, as that too would be widely visible. The panopticon was later popularised and criticised by Michel Foucault in Discipline and Punish. The panopticon is notorious for imposing psychological punishment on inmates. Never knowing whether one is being watched can be psychologically stressful. For better or worse, the panopticon anticipated many developments in surveillance present in early 21st-century society.
In each of these proposals, the Utilitarians insisted that policies, laws, or customs must be justified by their effects. If the effects were positive, they were good and could be maintained. If the effects were negative, they should be dispensed with. This attitude, and the radical political ambition, characterised Utilitarianism as a movement.
5. Utilitarianism in the 20th Century
Despite its many detractors, utilitarianism in one form or another continued to hold sway as one of the major moral approaches throughout the 20th century. Philippa Foot (1920-2010) claimed in 1985 that it “tends to haunt” even those who reject the view. That being said, during the 20th century, new criticisms of the view emerged, and previous objections were explored in considerably more depth. This resulted in additional complications to the view, novel defences, and variations on the classical view.
In this section, some of the major 20th-century developments for utilitarianism are discussed. Some advances that may have been described under the heading of “utilitarianism” previously have been omitted, because they veer too far from the core view. For example, G. E. Moore’s “ideal utilitarianism”, despite the name, departs significantly from the central utilitarian commitments, so is not included here (in the early 21st century, this was typically regarded as a non-utilitarian form of consequentialism—see this discussion for further details).
a. Hedonism and Welfarism
The hedonism embraced by classical utilitarianism is controversial. Some of the reasons for this have already been discussed, such as the suggestion that a view on which pleasure is all that matters is crude, a doctrine “worthy of swine”. An additional complaint is that hedonism offers an impoverished theory of the good, ignoring values such as achievement or authenticity. A thought experiment that exemplifies this is the “experience machine” given by Robert Nozick (1938-2002):
Suppose there were an experience machine that would give you any experience you desired. Superduper neuropsychologists could stimulate your brain so that you would think and feel you were writing a great novel, or making a friend, or reading an interesting book. All the time you would be floating in a tank, with electrodes attached to your brain. Should you plug into this machine for life, pre-programming your life’s experiences? (Nozick, Anarchy, State & Utopia, 1974)
Nozick supposes that many people would be reluctant to plug into the machine. Given that the machine could guarantee more pleasurable experiences than life outside it could, this suggests that people value something other than simply the pleasurable sensations. If some of the things that one would miss out on inside the machine (like forming relationships or changing the world in various ways) are valuable, this suggests that hedonism—the claim that only pleasure matters—is false.
In the 20th century, as a result of rejecting the hedonistic component, several utilitarians modified their view, such that utility could be understood differently. One way to change this is to suggest that the classical view is right that it is important that a person’s life goes well (their well-being), and also that this is the only thing that matters morally, but that it gets something wrong about what makes a person’s life go well. Rather than just a matter of how much pleasure a life contains, we might think well-being is best understood in another way. If a view holds that the well-being of individuals—however this is best understood—is the only moral value, it is welfarist.
One account of well-being regards preferences as especially important, such that a person’s life is made better by their preferences being satisfied. This view, which when joined to utilitarianism is known as preference utilitarianism, is able to evade the problems caused by the experience machine, because some of our preferences are not just to experience certain sensations, but to do things and to have relationships. These preferences would remain unsatisfied in an artificial reality, so the preference utilitarian could regard a person’s life as going less well as a result (even if they do not know it).
However, preference utilitarianism has problems of its own. For instance, some preferences simply do not seem that important. John Rawls (1921-2002) imagines a case of an intellectually gifted person, whose only desire is to count blades of grass. According to preference-satisfaction theories of well-being, if such a person is able to spend all their time grass-counting, their life is as good as it can be. Yet many have the intuition that this life is lacking some important features, like participating in social relationships or enjoying cultural pursuits. If there is some value lacking in the life of the grass-counter, this implies something wrong with the preference-satisfaction account of well-being.
Another objection against preference utilitarianism concerns preferences a person no longer has. If someone has a preference for something to happen, then forgets about it, never to find out whether it occurs, does this actually make their life go better? To take this to an extreme, does a person’s life improve if one of their preferences is satisfied after they die? Utilitarians who are more hedonistically inclined find this implausible. Peter Singer, one of utilitarianism’s most famous defenders, previously endorsed preference utilitarianism, but has since abandoned this in favour of hedonistic utilitarianism.
b. Anscombe and ‘Consequentialism’
G.E.M. Anscombe (1919-2001) was an influential figure in 20th century philosophy. She was not a utilitarian but was responsible for significant changes in how utilitarianism was discussed. In ‘Modern Moral Philosophy’ (1958), Anscombe expressed extremely critical views about the state of moral philosophy. She thought the notion of morality as laws or rules that one must follow made little sense in a secular world; that without a divine law-maker (God), injunctions to or prohibitions against acting in some way lacked authority. She was similarly critical of Kant, claiming that the idea that one could legislate for oneself was “absurd”. Among other things, her paper—and Anscombe’s general rejection of the major ethical theories of her day—sparked renewed interest in Aristotelian ethical thinking and the development of virtue ethics.
Anscombe also criticised utilitarianism as a “shallow philosophy” because it suggested that it was always able to give clear-cut answers. She claimed that in ethics borderline cases are ubiquitous. In these cases, there is not an obvious answer, and even if there is a correct answer, it might be something one should be conflicted about.
Anscombe’s criticisms of utilitarians since Sidgwick were particularly scathing. She claimed that they held a view of intention that meant everything that was foreseen was intended—a view she thought was “obviously incorrect”. Anscombe invented the term “consequentialism” as a name for the view she was critical of, distinguishing this from “old-fashioned Utilitarianism”. After Anscombe, “consequentialism” became a broader label than utilitarianism. As well as the classical view outlined above, “consequentialism” allowed for different conceptions of the good. For example, a view that thought that only consequences matter, but held that—as well as happiness or well-being—beauty is intrinsically valuable would be consequentialist, but not utilitarian (this is why G.E. Moore’s “ideal utilitarianism” has not been discussed in this article, as he makes claims of this sort). Today, the term “consequentialism” is used more often by philosophers than “utilitarianism”, though many of those identifying as consequentialists either embrace or sympathise with utilitarianism.
c. Act versus Rule
In the 20th century, a distinction that had been noted previously was scrutinised and given a name. This is the act/rule distinction. Versions of rule-utilitarianism had been given before the 20th century. The rule-utilitarian claims that, rather than examining the consequences of a particular action to determine its ethical status, one should consider whether it is compatible with a set of rules that would have good consequences if (roughly) most people accepted them.
The term “rule-utilitarian” was not in popular use until the second half of the 20th century, but the central claim—that the rules one is acting in accordance with determine the moral status of one’s actions—was much older. George Berkeley (1685-1753) is sometimes suggested to have offered the first formulation of rule-utilitarianism. He suggested that we should design rules that aim towards the well-being of humanity, that “The Rule is framed with respect to the Good of Mankind, but our Practice must be always shaped immediately by the Rule”.
Later in the 18th century, William Paley (1743-1804) also suggested something like rule-utilitarianism in response to the problem that his view would seemingly condone horrible behaviours, like lying one’s way to a powerful position, or murder, if the consequences were only good enough. Paley rejected this by claiming that the consequences of the rule should be considered. If one was willing to lie or cheat or steal in order to promote the good, Paley suggested this would license others to lie, cheat, or steal in other situations. If others did, from this precedent, decide that lying, cheating, and stealing were permissible, this would have bad consequences, particularly when people did these actions for nefarious reasons. Thus, Paley reasoned, these behaviours should be prohibited. Later still, in his Utilitarianism, John Stuart Mill proposed what some have interpreted as a form of rule-utilitarianism, though this is controversial (a discussion on this dispute can be found here).
While principles that can properly be regarded as rule-utilitarian were proposed before, it was in the 20th century that these views received the name “rule-utilitarianism” and were given extensive scrutiny.
Before considering some of the serious objections to rule-utilitarianism, it is worth noting that the view has some apparent advantages over classical act-utilitarianism. Act-utilitarians have a difficulty in making sense of prohibitions resulting from rights. Jeremy Bentham famously described the idea of natural rights as “nonsense upon stilts”, but this is a controversial position. It is often argued that we do have rights, and that these are unconditional and inalienable, such as the right to bodily autonomy. If one person has a right to bodily autonomy, this is understood as requiring that others do not use their body in certain ways, regardless of the consequences. However, basic act-utilitarianism cannot make sense of this. In a famous example, Judith Jarvis Thomson (1929-2020) imagines a surgeon who realises they could save the life of five patients by killing a healthy person who happens to be the right blood type. Assuming any special negative consequences of the killing could be avoided (perhaps the surgeon can make the death look like an accident, to prevent the public panicking about murderous surgeons), an act-utilitarian seems committed to the view that the surgeon should kill the one in order to save the five. The rule-utilitarian, however, has a neat response. They can suggest that a set of rules that gives people rights over their own bodies—rights that preclude surgeons killing them even if they have useful organs—leads to more happiness overall, perhaps because of the feeling of safety or self-respect that this might result in. So the rule-utilitarian can say such a killing was wrong, even if on this particular occasion it would have resulted in the best consequences.
Another potential advantage for rule-utilitarians is that they may have an easier time avoiding giving extremely demanding moral verdicts. For the act-utilitarian, one must always perform the action which has the best consequences, regardless of how burdensome this might be. Given the state of the world today, and how much people in affluent countries could improve the lives of those living in extreme poverty with small sums of money, act-utilitarianism seems to imply that affluent people in developed nations must donate the vast majority of their disposable income to those in extreme poverty. If buying a cup of coffee does not have expected consequences as good as donating the money to the Against Malaria Foundation to spend on mosquito nets, the act-utilitarian claims that buying the cup of coffee is morally wrong (because of the commitment to maximising). Rule-utilitarians can give a different answer. They consider what moral rule would be best for society. One of the reasons act-utilitarianism is so burdensome for a given individual is that the vast majority of people give nothing or very little. However, if every middle-class person in developed nations donated 10% of their income, this might be sufficient to eliminate extreme poverty. So perhaps that would be the rule a rule-utilitarian would endorse.
Despite some advantages, rule-utilitarianism does have many problems of its own. One issue pertains to the strength of the rules. Consider a rule prohibiting lying. This might seem like a good rule for a moral code. However, applying this rule in a case where a would-be murderer asks for the location of a would-be victim would seemingly have disastrous consequences (Kant is often ridiculed for his absolutist stance in this case). One response here would be to suggest that the rules could be more specific. Maybe “do not lie” is too broad, and instead the rule “do not lie, unless it saves a life” is better? But if all rules should be made more and more complicated when this leads to rules with better consequences, this defeats the purpose of the rules. As J. J. C. Smart (1920-2012) pointed out, the view then seems to collapse into a version of act-utilitarianism. In Smart’s words:
I conclude that in every case if there is a rule R the keeping of which is in general optimific, but such that in a special sort of circumstances the optimific behaviour is to break R, then in these circumstances we should break R…. But if we do come to the conclusion that we should break the rule…what reason remains for keeping the rule? (Smart, ‘Extreme and Restricted Utilitarianism’, 1956)
On the other hand, one might suggest that the rules stand, and that lying is wrong in this instance. However, this looks like an absurd position for a utilitarian to take, as they claim that what matters is promoting good consequences, yet they will be forced to endorse an action with disastrous consequences. If they suggest rule-following even when the consequences are terrible, this is difficult to reconcile with core consequentialist commitments, and looks like—in Smart’s terms—“superstitious rule worship”. Is it not incoherent to suggest that only the consequences matter, but also that sometimes one should not try to bring about the best consequences? The rule-utilitarian thus seems to face a dilemma. Of the two obvious responses available, one leads to a collapse into act-utilitarianism and the other leads to incoherence.
Richard Brandt (1910-1997) was the first to offer a rigorous defence of rule-utilitarianism. He offers one way of responding to the above criticism. He suggests that the rules should be of a fairly simple sort, like “do not lie”, “do not steal” and so on, but that in extreme scenarios these rules will be suspended. When a murderer arrives at the door asking for the location of one’s friends, this is an extreme example, so ordinary rules can be suspended so that disaster can be averted. A version of this strategy, where the correct set of rules includes an “avoid disaster” rule, is defended by contemporary rule-consequentialist Brad Hooker (Hooker’s own view is not strictly rule-utilitarian because his code includes a prioritarian caveat—he thinks there is some moral importance to prioritising the worst-off in society, over and above their contribution to overall well-being).
A second problem for rule-utilitarians concerns issues relating to partial compliance. If everyone always acted morally decently and followed the rules, this would mean that certain rules would not be required. For instance, there would be no rules needed for dealing with rule-breakers. But it is not realistic to think that everyone will always follow the rules. So, what degree of compliance should a rule-utilitarian cater for when devising their rules? Whatever answer is given to this is likely to look arbitrary. Some rule-utilitarians devise the rules not in terms of compliance, but acceptance or internalisation. Someone may have accepted the rules but, because of weakness of will or a misunderstanding, still break the rules. Formulating the view this way means that the resulting code will incorporate rules for rule-breakers.
A further dispute concerns whether rule-utilitarianism should really be classified as a form of utilitarianism at all. Because the rightness of an action is only connected to consequences indirectly (via whether or not the action accords to a rule and whether the rule relates to the consequences in the right way), it is sometimes argued that this should not count as a version of utilitarianism (or consequentialism) at all.
d. Satisficing and Scalar Views
A common objection to act-utilitarianism is that, by always requiring the best action, it demands too much. In ordinary life, people do not view each other as failing whenever they do something that does not maximise utility. One response to this is to reconstrue utilitarianism without the claim that an agent must always do the best. Two attempts at such a move will be considered here. One replaces the requirement to do the best with a requirement to do at least good enough. This is known as satisficing utilitarianism. A second adjustment removes obligation entirely. This is known as scalar utilitarianism.
Discussions of satisficing were introduced into moral philosophy by Michael Slote, who found maximising versions of utilitarianism unsatisfactory. Satisficing versions of utilitarianism hope to provide more intuitive verdicts. When someone does not give most of their money to an effective charity, which may be the best thing they could do, they might still do something good enough by giving some donation or helping the needy in other ways. According to the satisficing utilitarian, there is a standard which actions can be measured against. A big problem for satisficing views arises when they are challenged to say how this standard is arrived at—how do they figure out what makes an action good enough? Simple answers to the question have major issues. If, for instance, they propose that everyone should bring about consequences at least 90% as good as they possibly can, this implies that someone can always permissibly do only 90% of the best. But in some cases, doing what brings about 90% of the best outcome looks really bad. For example, if 10 people are drowning, and an observer can decide how many to save without any cost to themselves, picking 9—and allowing one to die needlessly—would be a monstrous decision. Many sophisticated versions of satisficing utilitarianism have been proposed, but none so far has escaped some counterintuitive implications.
The problem of where to set the bar is not one faced by the scalar utilitarians, as they deny that there is a bar. The scalar utilitarian acknowledges that what makes actions better or worse is their effects on people’s well-being but shuns the application of “rightness” and “wrongness”. This approach avoids problems of being overly or insufficiently demanding, because it makes no demands. The scalar view avoids deontic categories, like permissible, impermissible, required, and forbidden. Why might such a view seem appealing? For one thing, the categories of right and wrong are typically seen as binary—the act-utilitarian says actions are either right or wrong, a black-and-white matter. If the moral quality of actions is in fact richly textured, this might look unsatisfactory. Furthermore, using the blunt categories of “right” and “wrong”, someone confident that they have acted rightly may become morally complacent. Unless you are doing the very best, there is room for improvement, scope for doing better, which can be obscured by viewing acts as merely permissible or impermissible. While some utilitarians have found this model attractive, abandoning “right” and “wrong” is a radical move, and perhaps unhelpful. It might seem very useful, for instance, for some actions to be regarded as forbidden. Similarly, an account of morality which sets the boundaries of permissible action may be much more useful for regulating behaviour than one which treats morality merely as a matter of degree.
6. Utilitarianism in the Early 21st Century
In moral theory, discussions of utilitarianism have been partly subsumed under discussions of consequentialism. As typically classified, utilitarianism is simply a form of consequentialism, so any problems that a theory faces in virtue of being consequentialist are also faced by utilitarian views. Some consequentialists will also explicitly reject the label of “utilitarianism” because of its commitment to a hedonistic or welfarist account of the good. Brad Hooker, for example, endorses a rule-consequentialism where not only the total quantity of happiness matters (as the utilitarian would suggest), but where the distribution of happiness is also non-instrumentally important. This allows him to claim that a world with slightly less overall happiness, but where the poorest are happier, is all-things-considered better than a world with more total happiness, but where the worst-off are miserable.
While many of the discussions concern consequentialism more broadly, many of the arguments involved in these discussions still resemble those from the 19th century. The major objections levelled against consequentialism in the early 21st century—for example, whether it demands too much, whether it can account for rights or justice, or whether it allows partial treatment in a satisfactory way—target its utilitarian aspects.
The influence of utilitarian thinking and the Utilitarian movement is still observable. One place where Utilitarian thinking is particularly conspicuous is in the Effective Altruism movement. Like the 19th century Utilitarians, Effective Altruists ask what interventions in the world will actually make a difference and promote the behaviours that are the best. Groups such as Giving What We Can urge individuals to pledge a portion of their income to effective charities. What makes a charity effective is determined by rigorous scientific research to ascertain which interventions have the best prospects for improving people’s lives. Like the classical utilitarians and their predecessors, they answer the question of “what is good?” by asking “what is useful?”. In this respect, the spirit of utilitarianism lives on.
7. References and Further Reading
Ahern, Dennis M. (1976): ‘Is Mo Tzu a Utilitarian?’, Journal of Chinese Philosophy 3 (1976): 185-193.
A discussion about whether the utilitarian label is appropriate for Mozi.
Anscombe, G. E. M. (1958): ‘Modern Moral Philosophy’, Philosophy, 33(124), 1-19.
Influential paper where Anscombe criticises various forms of utilitarianism popular at the time she was writing, and also introduces the word “consequentialism”.
Bentham, Jeremy (1776): A Fragment on Government, F. C. Montague (ed.) Oxford: Clarendon Press (1891).
One of the first places utilitarian thinking can be seen in Bentham’s writings.
Bentham, Jeremy (1787): ‘Panopticon or The Inspection House’, in The Panopticon Writings, Miran Bozovic (ed.), London: Verso (1995), pp. 29-95.
This is where Bentham proposes his innovative prison model, the “panopticon”. It also includes lengthy discussions of how prisoners should be treated, as well as proposals for hospitals, “mad-houses” and schools.
Bentham, Jeremy (1789): An Introduction to the Principles of Morals and Legislation, Oxford: Clarendon Press, 1907.
Seen as the first rigorous account of utilitarianism. It begins by describing the principle of utility, and it continues by considering applications of the principle in morality and legal policy.
Brandt, R. B. (1959): Ethical Theory, Englewood-Cliffs, NJ: Prentice Hall.
This book offers a clear formulation of rule-utilitarianism, and it is one of the earliest resources that refers to the view explicitly as “rule-utilitarianism”.
Chastellux, François-Jean de (1774): De la Félicité publique, (“Essay on Public Happiness”), London: Cadell; facsimile reprint New York: Augustus Kelley, 1969.
This book is where Chastellux investigates the history of human societies in terms of their successes (and failures) in securing happiness for their citizens.
Cumberland, Richard (1672): A Treatise of the Laws of Nature (De Legibus Naturae), selection printed in British Moralists 1650-1800 (1991), D.D. Raphael (ed.), Hackett.
Here Cumberland discusses the nature of things, and introduces his natural law view, which leads to some utilitarian-like conclusions.
Article about a recent book discussing Bentham’s position on sexual ethics.
De Lazari-Radek, Katarzyna and Singer, Peter (2014): The Point of View of the Universe, Oxford University Press.
An exposition of Henry Sidgwick’s utilitarianism, considering his view in light of contemporary ethical discussions.
Dickens, Charles (1854): Hard Times, Bradbury & Evans.
Novel featuring Thomas Gradgrind—a caricature of a utilitarian.
Foot, Philippa (1985): ‘Utilitarianism and the Virtues’, Mind, 94(374), 196-209.
Foot—an opponent of utilitarianism—notes how utilitarianism has been extremely persistent. She suggests that one reason for this is that utilitarianism’s opponents have been willing to grant that it makes sense to think of objectively better and worse “states of affairs”, and she scrutinises this assumption.
Gay, John (1731): Concerning the Fundamental Principle of Virtue or Morality, selection printed in British Moralists 1650-1800 (1991), D.D. Raphael (ed.), Hackett.
This includes Gay’s challenge to secular versions of utilitarianism, to explain moral motivation.
Helvétius, Claude (1777): A Treatise on Man, His Intellectual Faculties, and His Education, 2 vols., London: B. Law and G. Robinson.
Published after Helvétius’ death, this work includes lengthy discussions of how society may be altered to better promote happiness.
Heydt, Colin (2014): ‘Utilitarianism before Bentham’, in The Cambridge Companion to Utilitarianism, Cambridge: Cambridge University Press, pp. 16-37. doi:10.1017/CCO9781139096737.002
This paper describes the intellectual development of utilitarianism, drawing attention to the non-utilitarian origins, as well as the distinct religious and secular variations of utilitarianism in Britain, and the French utilitarians.
Hooker, Brad (2000): Ideal Code, Real World: A Rule-consequentialist Theory of Morality. Oxford University Press.
This book offers a rigorous defence of rule-consequentialism. Hooker’s account is not rule-utilitarian (because he claims that some priority should be given to the worst-off in society), but he offers defences against all the major objections to rule-utilitarianism.
Hruschka, Joachim (1991): ‘The Greatest Happiness Principle and Other Early German Anticipations of Utilitarian Theory’, Utilitas, 3: 165-177.
Hruschka dispels some myths about the origins of the term “greatest happiness for the greatest number”, and he explores the history of the idea in Germany prior to the development of utilitarianism in Britain.
Hutcheson, Francis (1725): Inquiry Concerning the Original of Our Ideas of Virtue or Moral Good, treatise II of An Inquiry into the Original of our Ideas of Beauty and Virtue, selection printed in British Moralists 1650-1800 (1991), D.D. Raphael (ed.), Hackett.
This work provides a detailed account of Hutcheson’s moral and aesthetic theory.
Hutcheson, Francis (1755): A System of Moral Philosophy, three volumes, London.
Published after Hutcheson’s death, this book was written specifically for students. It further develops Hutcheson’s moral thinking, and it includes a discussion of different kinds of pleasures.
Jacobson, Daniel (2008): ‘Utilitarianism without Consequentialism: The Case of John Stuart Mill’, Philosophical Review, 117(2), 159-191.
This article makes a case for distinguishing the view of John Stuart Mill and his contemporaries from consequentialism, as the view is discussed today. This locates “Utilitarianism” within a certain socio-historical context and identifies ways in which its commitments differ from those of “consequentialism”.
MacAskill, William (2015): Doing Good Better: Effective Altruism and How You Can Make a Difference, Random House.
An introduction to the Effective Altruism movement, which can be seen as an intellectual descendent of the Utilitarians.
Mill, John Stuart (1861): Utilitarianism, London: Parker, Son, and Bourn.
This is an attempt from John Stuart Mill to demonstrate that utilitarianism is much more appealing than critics at the time implied. This is often seen today as the foundational text for utilitarianism, though Mill did not seem to regard it as highly as some of his other works, like On Liberty and Considerations on Representative Government.
A lengthy speech given by Mill as an MP arguing for suffrage for women.
Mozi (2010): The Mozi: A Complete Translation, Ian Johnston (trans.), The Chinese University Press.
A translated version of Mozi’s work, accompanied by commentary.
Nozick, Robert (1974): Anarchy, State & Utopia, New York: Basic Books.
In this book, as well as his general account of the requirements of justice, Nozick introduces the example of the “experience machine”, which is often thought to demonstrate a problem for hedonism.
O’Keefe, Tim (2009): Epicureanism, Acumen Publishing.
O’Keefe discusses the teachings of Epicurus. As well as Epicurean ethics, this includes large discussions of Epicurean thoughts on metaphysics and epistemology.
Paley, William (1785): Principles of Moral and Political Philosophy, Boston: Richardson and Lord (1821).
Paley’s Principles of Moral and Political Philosophy was the most influential work of utilitarianism for much of the 19th century. It also includes an early defence of what would later be termed rule-utilitarianism.
Priestley, Joseph (1768): Essay on the First Principles of Government, London.
In this work, Priestley claims that the greatest happiness for the greatest number is the measure of right and wrong. Bentham says this influenced him significantly.
Railton, Peter (1984): ‘Alienation, Consequentialism and the Demands of Morality’, Philosophy & Public Affairs, 13(2), 134-171.
Elaborates a complaint relating to utilitarian decision procedure, and how this may lead to alienation. Railton offers a distinction between “objective” and “subjective” versions of consequentialism, endorsing the former.
Rawls, John (1971): A Theory of Justice, Cambridge, MA: Harvard University Press.
When developing his influential theory of justice, Rawls criticises the inability of classical utilitarianism to properly appreciate the individual nature of persons.
Rosen, Frederick (2003): Classical Utilitarianism from Hume to Mill, London: Routledge.
This book traces the influence of the idea that utility is the basis of morality and justice, starting from Hume. It includes many of the figures discussed in this article in significantly more depth. It also devotes two chapters to considering the notion of utility as found in the works of Adam Smith.
This book provides a wonderful discussion of utilitarianism. The first few chapters of the book were extremely useful in the creation of this article.
Schultz, Bart and Varouxakis, Georgios (2005): Utilitarianism and Empire, Oxford: Lexington.
This book is a collection of essays that consider the relationship between Utilitarianism—particularly as a social movement—and the British Empire. It explores the criticisms that early Utilitarians, like Jeremy Bentham and John Stuart Mill, were racist, insufficiently critical of slavery, and served as apologists for the British Empire.
Slote, Michael (1984): ‘Satisficing Consequentialism’, Proceedings of the Aristotelian Society, 58, 139-163.
This article marks the introduction of satisficing views, which remove the feature of maximising from utilitarianism, instead claiming that it is (at least) sometimes permissible to perform actions which do not have the best consequences, but which are good enough.
Smart, J. J. C and Williams, Bernard (1973): Utilitarianism: For & Against, Cambridge University Press.
A pair of essays for and against utilitarianism. Williams’ part includes his objection that utilitarianism undermines the integrity of moral agents, which has been very influential.
Taylor, Harriet (1851): ‘The Enfranchisement of Women’, Westminster Review.
Harriet Taylor’s essay arguing for the legal equality of women.
Thomson, Judith Jarvis (1976): ‘Killing, Letting Die and The Trolley Problem’, The Monist, 59(2), 204-217.
This paper uses the case of a surgeon who must decide whether to kill one healthy person to save five, which has been used since to show problems utilitarianism has with making sense of rights. It also introduces the term “trolley problem” for a type of case that has become commonplace in moral philosophy.
It is a familiar thought that many of our beliefs are directly justified epistemically by perception. For example, someone sees what looks to them to be a cat on the mat, and from this they are justified in saying “There is a cat on the mat.” This article explores the idea that our moral beliefs can be justified empirically in a similar manner. More precisely, it focuses on canonical moral perception (CMP), which restricts perceptual experiences to sensory perceptual experiences, such as vision, touch, taste, smell, and sound. For ease of exposition, this article uses visual perceptual experiences as the sensory modality of choice.
We should be interested in the viability of such a thesis for several reasons. First, if CMP is a plausible epistemology of justification of moral beliefs, then it is uniform with a broader perceptual epistemology and therefore comes with ready-made responses to skeptical challenges to morality. Second, CMP avoids over-intellectualising moral epistemology, and it explains how it is that lay people have justified moral beliefs. Third, CMP, if true, has interesting implications for our methodology of investigating morality. In effect, CMP states that experience comes first, contrary to how some (but not all) rival views characterize moral epistemology as starting from the armchair.
First, the thesis of CMP is presented in detail. The following section considers prima facie arguments in favor of CMP, which are the considerations of epistemic uniformity and the role of experience in moral inquiry. Next, the article discusses prima facie arguments against CMP, which are the problems of counterfactual knowledge, the causal objection, and the ‘looks’ objection. Finally, the article presents arguments for CMP that draw from the philosophy of perception and the philosophy of mind, and it concludes that much of the debate surrounding CMP is continuous with debates in the general philosophy of perception and the philosophy of mind.
Suppose upon returning home one evening, someone encounters a stranger harming a senior citizen for entertainment. As they witness this act, they form the belief that what they are witnessing is morally wrong. Assuming that the belief is epistemically justified, it remains a question what the source of justification for this particular moral belief is. One answer is that perceptual states (such as sight and hearing) provide the justification. This thesis is called canonical moral perception:
CMP: Some moral beliefs are non-inferentially justified by sensory perceptual experiences.
To be clear, CMP claims that some moral beliefs are non-inferentially justified by sensory perceptual experiences. This leaves open the possibility of multiple sources for the justification of moral beliefs while showing that there is an interesting debate here regarding the possibility of CMP, since rivals of the view will deny that any moral beliefs are justified in such a way. For purposes of exposition, this article uses vision as the perceptual state of choice, but it should be kept in mind that this is not to convey that vision is the only source of perceptual justification for moral beliefs. Despite the fact that emotions are sometimes spoken of as if they are a kind of perception, this article does not consider emotional perception in any detail. Someone who endorses CMP may be called a ‘perceptualist.’
Fundamentally, the epistemic contribution of perception is to justify belief and play the role of a justificatory regress stopper. Given that justification for some beliefs bottoms out in perceptual experience, and that some moral beliefs are justified but not on the basis of other beliefs, CMP extends perceptual justification to the moral domain. CMP is a foundationalist theory of the justification of moral beliefs and this article treats it as such. Other foundationalist views, such as intuitionism and emotional perceptualism, will have their own ways of handling the regress problem that differ from Canonical Moral Perception. In particular, the perceptualist (at least) holds that what is essential to perception is its representational nature, the phenomenological character of perceptual experience, and its role as a non-inferential source of justification, and will offer a stopper to the regress problem based on those characteristics. Intuitionists and emotional perceptualists may agree that some of those characteristics are essential to their justificatory source as well, but the story for how their regress stoppers work will differ based on how emotions and intuitions differ from perception. For example, emotional perceptualists may say that what is special about emotional perceptual states is that they are valenced, and that this plays a special role in their justificatory story.
Furthermore, this article assumes on behalf of the perceptualist a phenomenal dogmatist account of foundationalism of the kind espoused by Jim Pryor, where someone is immediately, but defeasibly, justified by their perceptual experience (Pryor 2000). Phenomenal dogmatism is not a very strong foundationalism in that it does not require an infallibly known belief to ground all the remaining knowledge one may possess. Rather, what phenomenal dogmatism grants us is the claim that representational seeming states justify the beliefs based on them simply in virtue of one’s having those seeming states. Insofar as one may be concerned about challenges to a general foundationalist picture, the perceptualist will follow Pryor in responding to those objections.
Some of the philosophers mentioned in this article will talk about theories of perceptual moral knowledge, and most of what this article says will be compatible with those theories. A perceptually justified moral belief in the absence of defeaters is perceptual moral knowledge, after all.
2. The Prima Facie Case for Moral Perception
a. Moral Perception and Epistemic Uniformity
Considerations of uniformity and economy within epistemology might push one towards adopting CMP over its more traditional rivals, such as intuitionism. CMP entails that the methodology of obtaining justified moral beliefs does not differ in any significant or substantial way from other kinds of justification gained by perceptual experiences. That is, just as one forms the justified belief that there is a cat in the room by seeing that there is in fact a cat in the room, one forms the justified belief that some act is wrong by perceiving the wrongness of the act. This leads us to the considerations of uniformity. If there is no special methodology that differentiates justified moral beliefs from other justified beliefs in different domains, then the need for positing a special source of justification, such as the intellectual seemings of the intuitionist, is moot. Another advantage of CMP is that it gives us a foundationalist epistemology, thereby avoiding regress and circularity worries regarding justification. To be clear, the advantages mentioned are shared with some rival accounts of moral epistemology, so these are not unique advantages but rather considerations that keep it a live theory.
b. The Role of Experience in Moral Inquiry
CMP captures the role that experience seems to play in moral inquiry. If we consider how non-philosophers form most of their moral beliefs, it is unlikely that the sole basic source is a priori reasoning. Most people do not sit in an armchair and contemplate runaway trolleys, yet it seems that most individuals have justified basic moral beliefs. When an individual is asked to explain how they know that an action is wrong, a common answer among lay people is that they saw the wrongness of that action. CMP takes this statement at face value, and considering that moral philosophers are not different in kind from the typical human being, we might think that when engaging in a moral thought experiment the philosopher is making use of past moral observations.
If we are persuaded that experience plays a role in answering moral questions, then a natural thought is that particular moral beliefs are among the most epistemically basic; particular moral beliefs form part of our evidential bedrock. They are basic in the sense that, from justified particular moral beliefs we can infer additional justified moral beliefs, but we cannot make an inference in the opposite direction. For example, one basic justified particular moral belief for the perceptualist may be a very specific claim such as, ‘The instance of a father hugging his child I witnessed yesterday is morally good.’ From this particular experience of goodness, once we return to the armchair and ponder if fathers hugging their children is good, we might inductively infer a more general statement such as ‘It is usually good for fathers to hug their children.’ In short, we draw from experience to reach conclusions about more abstract moral questions. Sarah McGrath motivates CMP with these considerations in mind (2018, 2019). As McGrath explains:
[A] significant part of our most fundamental evidence for [moral] theorizing consists in singular moral judgments that we know to be true. But I also think that there is a fairly widespread tendency to neglect this fact, and to think that our evidence, or what we ultimately have to go on in our ethical theorizing, consists exclusively of judgments with more general content (2018).
To expand on this: it is a common self-conception of moral philosophers that the methodology of moral inquiry they perform is to consider cases or action types, form judgments about those cases and reach general moral principles (such as ‘It is usually good for fathers to hug their children’, or ‘All things being equal, it is wrong to intentionally cause harm’) that are broadly applicable. That is, judgments about very specific cases will be formed by way of considering the more general principles. As McGrath points out, when considering the morality of an action type, we often draw upon our past experiences of tokens of an action to make moral judgments. To illustrate this, we can imagine an agent who yesterday saw the goodness of a father hugging a child, and then the next day is presented with a thought experiment that asks the agent to consider a near identical scenario. Presumably, this agent will judge the hugging once again to be good, and this judgment will be based on the past observations they made the day before. Thus, CMP denies that intuitions about general moral beliefs reached in the armchair are always methodologically prior to experience in moral theorizing.
If intuitions about general moral principles are epistemically basic, then making use of particular moral judgements is epistemically mistaken. However, drawing on past observations to reach judgements on thought experiments about fathers hugging their children, or even the trolley problem, is not obviously an epistemic misstep. In fact, we often draw on past observations and experiences to give advice on problems that our friends and family experience. Rather than draw on general principles to advise a friend to end her relationship, we usually appeal to previous relationships we have been through to make such a judgment. These are the common and legitimate ways we form moral beliefs, and CMP is the most natural epistemic explanation of our practice of moral inquiry as we find it.
That said, we may worry about cases where we have background knowledge informing our experience of a situation; it may seem strange that we can have the kind of experientially justified moral beliefs CMP promises while at the same time recognizing that background knowledge changes what we may justifiably believe on the basis of our perceptual experiences. For example, we can imagine the father hugging his child, but now we have the background information that the father has a criminal record of incest. There are two ways for the perceptualist to handle cases where there is background knowledge informing the observation. The first is to stick with the kind of Pryor-style phenomenal dogmatism on which the perceptual seeming of goodness delivers prima facie justification for believing the hugging is morally good, but this is defeated by the additional knowledge of the father’s criminal record. The second option is to lean into the phenomenon of cognitive penetration, and answer that the background knowledge does change the perceptual experience of the father hugging the child from one of goodness to one of badness, since on this option our propositional attitudes would contour our perceptual experience. In sum, there are two possible ways for the perceptualist to answer this kind of concern, but adjudicating between the two options canvassed here is beyond the scope of this article.
3. The Prima Facie Case Against Moral Perception
a. Justification of Counterfactual Moral Beliefs
Although CMP provides a theory of justification in actual situations, situations in which you see a morally valenced act, we might wonder what the theory says about justification of moral beliefs gained via thought experiments or reading fiction. Call the kind of justification gained in these instances counterfactual justification. Both Hutton and Wodak challenge CMP to provide an account of how one can have counterfactual moral justification (Hutton 2021; Wodak 2019). The challenge takes the following general form: By hypothesis, CMP explains moral justification in localized, everyday cases. However, we do not receive justification for moral beliefs solely through sensory perception, since we can have counterfactual moral justification. So, CMP is an incomplete explanation of the sources of moral justification. Because CMP cannot capture cases where we receive justification through literature or thought experiments, an epistemological theory that can provide a unified explanation of both counterfactual justification and justification gained in everyday cases is preferable on the grounds of parsimony. The following two paragraphs present particular versions of this challenge.
Hutton asks us to consider a case of someone reading a book depicting the brutalities of slavery, stipulating that they have an emotional response to the scenarios depicted in the book. Here, no perception is present (other than of words on a page), but there is a strong emotional response, and plausibly, Hutton claims, the individual reading the book forms the justified moral belief that slavery is wrong. The upshot of Hutton’s argument is that CMP cannot explain the source of justification in the case of literature, while emotion can explain the source of justification both in moral beliefs formed from reading literature and in everyday cases.
Like Hutton, Wodak notes that much of our moral inquiry is a priori, and intuitionism is far better suited to capture instances where our justified moral beliefs come from imagining scenarios. When sitting in the armchair imagining a trolley scenario, when we form the justified moral belief that pulling the lever is the right action, we can ask what justifies the belief, and Wodak states “The intuitionist can explain this very easily: our intuitions can concern actual and hypothetical cases” (Wodak 2019). That is, the intuitionist’s story for justification stays the same between imagined cases and cases we encounter on the street. CMP cannot appeal to perceptual justification because in thought experiments there is no perception of the scenario. Because CMP lacks resources to explain the source of the justification, and intuitionism can explain the source of justification in both thought-experiments and everyday cases, Wodak concludes that intuitionism should be preferred on the grounds of parsimony.
While it is true that CMP by itself is unable to capture counterfactual justification, and while this provides some prima facie consideration against the view, this should not be cause for alarm on the part of the advocate of CMP. Recall that CMP states that some of our moral beliefs are perceptually justified, not that all moral beliefs are justified in such a way. The advocate of CMP has the option of making a disjunctive response to challenges from counterfactual justification such as those made by Wodak and Hutton. This response needs to be made with care; the advocate of CMP should avoid introducing an account of counterfactual justification that suffices to explain actual justification as well. Even though the challenge to provide a story for counterfactual justification has yet to be fully answered, there are other considerations in favor of adhering to CMP.
b. The Causal Objection
The causal objection argues that we cannot perceive moral properties because we cannot be put in a causal relation with them. That is, one might think that moral properties are causally inert, and for this reason we cannot perceive them. Put in the form of an argument, the causal objection appears as:
1. To perceive some property, one must be placed in the appropriate causal relation with that property.
2. One can never be put in the proper causal relation with moral properties.
3. One cannot perceive moral properties.
McBrayer responds to the causal objection by pointing out that on three of the most popular realist accounts of moral properties, premise two comes out false (McBrayer 2010). These three proposals are (i) treating moral properties as secondary properties, (ii) treating moral properties as natural properties, and (iii) treating moral properties as non-natural properties.
When moral properties are held to be secondary properties, that is, properties that are perceived as such under appropriate viewing conditions, premise two fails, as demonstrated by an analogy between colors and moral properties. We can imagine looking at a chair under midday light and perceiving it to be brown. What causally contributes to our perceptual experience is not the brownness of the chair (due to the nature of secondary properties), but the other properties of the chair. Nonetheless, perceiving the chair results in knowledge of the chair’s color, so we are still put in an appropriate causal relation with the property of brownness. In the case of moral properties, stipulated to be secondary properties, we will be placed in the same causal relation with them as we are with colors. Under ideal viewing circumstances, we will be placed in a causal relation with the base properties (such as a father hugging a child) and perceive the goodness of that action. In short, if we take moral properties to be secondary properties, the response to the causal objection is a common-cause style of explanation.
If one takes a reductionist naturalist account of moral properties, matters are even simpler. Because moral properties are identical to natural properties, the explanation of how we are able to be in the proper causal relation with them is the same as the explanation of how we are able to be in the proper causal relation with chairs, cars, and human actions.
Finally, according to McBrayer, non-naturalism about moral properties avoids the causal objection as well. What the proponent of the causal objection wants is a non-accidental connection between our perceptual beliefs and the moral facts, and an account that delivers such a non-accidental connection suffices to defuse the causal objection, even if the connection is not causal, strictly speaking. To see this, first note that we are stipulating the supervenience principle: the moral facts necessarily supervene on the natural facts, such that there is no change in the moral without a change in the natural. Assuming that we can see supervening properties, the accidentality is eliminated because whenever we see a supervening property we see the natural property that serves as its base, and that natural property stands in the proper causal relation, satisfying the causal constraint.
The causal objection is an instance of a general challenge to the perception of high-level properties, namely a challenge from explanatory superfluity. The challenge runs as follows: one might think that we cannot be put in a causal relation with high-level properties, and so we do not perceive them. There is no need to claim that we are in a causal relation with trees when being in a causal relation with the lower-level properties of trees is sufficient for justified tree belief; further causal contact would be an instance of overdetermination. To put the objection a slightly different way: if our perceptual states are in a causal relation with the property of being a pine tree, then the content of our perceptual experience of a pine tree would be causally overdetermined. There is no reason to think that our perceptual experiences are overdetermined, so our perceptual states are not in a causal relation with the property of being a pine tree. It is not clear how worried the defender of CMP should be by this objection. Because the causal objection shares strong features with the causal exclusion problem of mind-body interaction, which provides a framework for addressing these issues, the objection may not carry much weight (Kim 1993, Yablo 2003).
c. The ‘Looks’ Objection
If perception justifies some moral beliefs, then this is presumably because there is a phenomenological character, a what-it-is-likeness, when perceiving moral properties. The ‘looks objection’ claims that this is not the case: we do not have perceptual justification of moral beliefs because there is no phenomenological character for moral properties (Huemer 2005, Reiland 2021). The argument is commonly structured this way:
1. A moral belief is perceptually justified only if there is some way that a moral property looks.
2. Moral properties have no look.
3. No moral beliefs are perceptually justified.
We can deny the ‘looks’ objection by rejecting premise one or premise two, or by arguing that the conclusion does not follow. Because ‘looks’ is ambiguous in the argument, one strategy for denying the objection is to interpret ‘looks’ in various ways and see if the argument remains sound. McBrayer (2010a) tackles the ‘looks’ objection by considering several possible readings of ‘looks’ other than the phenomenal ‘looks’ mentioned above. The upshot of McBrayer’s strategy is that on all interpretations of ‘looks’ he considers, the objection fails. McBrayer settles on a possible reading of ‘looks’ which is supposed to provide the strongest version of the objection. This is the ‘normally looks’ reading, understood as the way a thing normally appears. If we substitute ‘normally looks’ in premise two, we get:
2′. Moral properties do not normally look like anything.
Even with ‘normally looks’, the objection fails, for the following reasons. When ‘normally looks’ is read as normally looking a certain way to multiple people, the argument fails because many non-moral properties, assuming they have a normal look, do not appear the same way to multiple people. For example, imagine a group of individuals looking at a car from different viewpoints; there is no single way the car appears to all of them. Yet, if a car has a normal look but can appear different ways to different individuals, then there is no principled reason to think that rightness cannot appear different ways yet have a normal look as well. Understood in this cross-person sense, 2′ comes out false. Similarly, when 2′ is read as the way a thing normally looks to an individual, the objection still fails. Even if 2′ is true, it is only true of low-level properties such as colors, since no matter what angle you view red from, it always looks the same. Many high-level properties, such as danger, do not have a way of normally looking to an individual. But, assuming we are perceptually justified in judgments of danger despite its disparate looks, such as a rattlesnake looking dangerous and a loaded gun looking dangerous, premise one does not hold. We may still be perceptually justified in a belief about a property even if there is no particular look for that property. Finally, if an opponent argues that there is a complex and ineffable way that high-level properties normally look, then this strategy is open to the defender of moral perception as well, so 2′ again comes out false. On all readings McBrayer considers, the ‘looks’ objection is unsound.
Proponents of the ‘looks’ objection may be unsatisfied with McBrayer’s response, however. The kind of ‘looks’ likely intended by opponents of CMP is the phenomenal ‘looks’. That is, the what-it-is-likeness of perceiving something, such as what it is like to perceive a car or a cat, is the intended meaning of ‘looks’ in the argument. Indeed, ‘looks’ was characterized as the phenomenal kind in the opening paragraph of this section. However, McBrayer omits this reading of ‘looks’, thereby missing the most plausible and strongest version of the objection. It remains up to contemporary defenders of CMP to provide an account of what the phenomenological ‘looks’ of moral properties are like. Until such an account is provided, the ‘looks’ objection remains a live challenge.
Whatever this account may be, it will also provide a general strategy for answering a general looks objection in the philosophy of perception. This objection is the same as the looks objection listed above, but with instances of ‘moral’ replaced with ‘high-level property’, and concludes that our high-level property beliefs are not perceptually justified (McGrath 2017). If an account is successful at articulating what the phenomenal looks of a higher-order property is, or motivating the belief that high-level properties have one, then this provides a framework for CMP to use in answering the moral looks objection.
4. Arguments from Philosophy of Perception
While the prima facie arguments provide initial motivation for CMP, the thesis is ultimately about the epistemic deliverances of our sensory faculty. Accordingly, much of the debate about the viability of CMP parallels debates in the general philosophy of perception. In this section, we will see the arguments for and against moral perception drawing from empirical perceptual psychology and general philosophy of perception.
a. High-Level Contents in Perception
A natural move for the moral perceptualist in defense of the claim that we are non-inferentially justified by perception is to argue that we see moral properties. The perceptualist here means this literally, in the same way that we see the yellow of a lemon or the shape of a motorcycle. If we do perceive moral properties, then a very straightforward epistemic story can be told. With this story, the perceptualist aims to show that perceptual moral justification works in the same way as perceptual justification for beliefs about ordinary objects. For example, the explanation for how someone knows there is a car before them is that they see a car and form the corresponding belief that there is a car. The story for justification of moral beliefs here will be that someone sees the wrongness of some action and forms the corresponding belief that the action is wrong (absent defeaters). The perceptualist will typically flesh out this move by assuming an additional epistemic requirement, called the Matching Content Constraint (MCC):
MCC: If your visual experience E gives you immediate justification to believe some external world proposition that P, then it’s a phenomenal content of E that P (Silins 2011).
The MCC states that one is non-inferentially justified only if there is a match in contents between a perceiver’s perceptual state and doxastic state (their belief). The reason perceptual contents matter to CMP is that if perceptual contents include moral properties, then one has a perceptual experience of those moral properties, and if one has an experience of those moral properties then a story for a non-inferential perceptual justification of moral beliefs is in hand, which is no different from our perceptual justification of other objects. On the other hand, if there is a mismatch between our perceptual contents and our moral beliefs, then we may find a non-inferentialist perceptual epistemology such as CMP to be implausible.
Given the MCC, the perceptualist needs it to be the case that perceptual experience includes high-level contents, such as being a car, being a pine tree, or being a cause of some effect. If perceptual experiences do contain high-level contents, then the inclusion of moral contents in perceptual experiences is a natural theoretical next step, barring a principled reason for exclusion. After all, if we commit to arguing that we perceive causation and carhood, extending the contents of perception to rightness (and wrongness) does not appear to require too large a stretch of the imagination. The extension of perceptual experiences to include moral contents meets the matching content constraint, and it clears the way for arguing for CMP. However, if the contents of our perceptual experiences are restricted to low-level contents, which are colors, shapes, depth, and motion (although what counts as a low-level content may vary between theorists), the defense of CMP becomes much trickier.
Holding onto CMP because one accepts a high-level theory of content comes with its own risk. If a thin view of contents turns out to be the correct account of perceptual content, such that what makes up the content of our perceptual states are color arrays, shapes, depth, and motion, then CMP appears to lose much of its motivation. It would be theoretically awkward to insist that moral contents show up if contents about cars, pine trees, and causation are incapable of doing so. And if moral properties do not appear in the contents of perceptual experience, then a simple story as to how we can have perceptual justification for moral beliefs is lost.
Even if perception does not have high-level contents, or moral contents in particular, this does not mean that CMP is a failed theory of moral epistemology. Sarah McGrath provides a story as to how we can have perceptual moral beliefs in the absence of high-level contents in perception (2018, 2019). This story is an externalist one; the source of the justification comes from a Bayesian account of the adjustment of priors (the probability that a belief is true) given non-moral observations, rather than from any experiential contents of morality itself. Through perceptual training and experience, our perceptual system is trained to detect morally relevant stimuli, such as the whimper of pain a dog may voice when kicked. On McGrath’s view, then, one is perceptually justified in a moral belief when the perceptual system reliably tracks the moral facts. The upshot for the defender of CMP is that there is much theorizing to be done about the compatibility between CMP and the thin-content view, and McGrath’s view shows one way to reconcile the two.
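The Bayesian adjustment of priors mentioned above can be illustrated with a toy calculation. This is a minimal sketch for illustration only, with made-up numbers; it is not part of McGrath’s own presentation. The idea is that a non-moral cue (a dog’s whimper) raises the probability of a morally relevant hypothesis (that the dog was kicked) via Bayes’ theorem:

```python
def bayes_update(prior, p_cue_given_h, p_cue_given_not_h):
    """Posterior probability of hypothesis h after observing a cue.

    prior:             P(h), e.g. prior probability that the dog was kicked
    p_cue_given_h:     P(cue | h), e.g. chance of a whimper given a kick
    p_cue_given_not_h: P(cue | not-h), chance of a whimper otherwise
    """
    numerator = p_cue_given_h * prior
    return numerator / (numerator + p_cue_given_not_h * (1 - prior))

# Illustrative (hypothetical) numbers: a whimper is far likelier given a kick,
# so observing it sharply raises the probability that a kick occurred.
posterior = bayes_update(prior=0.05, p_cue_given_h=0.9, p_cue_given_not_h=0.05)
```

On the externalist reading sketched here, what matters for justification is that this updating reliably tracks the moral facts, not that morality itself shows up in the experience.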
b. Phenomenal Contrast Arguments
An argument for thinking that we do perceive moral properties, as well as other high-level properties, is the argument from phenomenal contrast. Susanna Siegel develops a kind of phenomenal contrast argument as a general strategy for arguing that the contents of our perception are far richer than a thin view of contents would allow (2006, 2011, 2017). A phenomenal contrast argument works as follows. We are asked to imagine two scenarios, one in which a property is present and a contrast scenario in which the same property is absent. If the intuition about these cases is that the perceptual phenomenology differs for a perceiver across the scenarios, then one can argue that what explains the difference in experience is the presence or absence of the property, which makes a difference to what is perceptually experienced. The reason an advocate of CMP would want to use this strategy is that if there is a phenomenal contrast between two cases, then there is an explanatory gap that CMP fills; if there is a moral experience in one case but not in a similar case, CMP can explain the difference by saying that a moral property is perceived in one case but not in the other, thus explaining the phenomenal difference.
To better illustrate phenomenal contrast, here is a concrete example from Siegel arguing that causation appears in the contents of perception (2011). Imagine two cases, in both of which we are placed behind a shade and see the silhouettes of two objects. In the control case, we see one silhouette of an object bump into another object, and the second object begins to roll. In the contrast case, whenever one silhouette begins to move towards the other silhouette, the other silhouette begins to move as well, keeping a steady distance from the first silhouette. If we have the intuition that these two cases are phenomenally different for a perceiver, then Siegel argues that the best explanation for this difference is that causation is perceptually represented in the former case and not the latter, whereas competitors who deny that causation appears in the contents would have to find some alternative, and more complicated, explanation for the contrast.
The phenomenal contrast argument has been wielded to argue for moral contents specifically by Preston Werner (2014). Werner asks us to imagine two different individuals, a neurotypical individual and an emotionally empathetic dysfunctional individual (EEDI), coming across the same morally-valenced scenario. Let this scenario be a father hugging a child. When the neurotypical individual comes upon the scene of the father hugging his child, this individual is likely to be moved and have a variety of physiological and psychological responses (such as feeling the “warm fuzzies”). When the EEDI comes upon the scene of the father hugging his child, however, they will be left completely cold, lacking the physiological and psychological responses the neurotypical individual underwent. This version of the phenomenal contrast argument purports to show that what best accounts for the experiential difference between these two individuals is that the neurotypical individual is able to perceptually represent the moral goodness of the father hugging the child, thus explaining the emotional reaction, whereas the EEDI was left cold because of their inability to perceptually represent moral goodness. If this argument is successful, then we have reason to think that moral properties appear in the contents of experience.
One might object here that Werner is not following the general methodology that Siegel sets out for phenomenal contrast. Werner defends his case as a genuine phenomenal contrast by arguing, first, that making use of two different scenarios would be too controversial to be fruitful, because of the difference between learning to recognise morally valenced situations and having the recognitional disposition to recognise pine trees; and, second, that the two individuals in the scenario are sufficiently similar in that they both have generally functional psychologies, but interestingly different in that the EEDI lacks the ability to properly emotionally respond to situations. Still, we might wonder about the use of an EEDI in this phenomenal contrast case. Although the EEDI possesses much of the same cognitive architecture as the neurotypical individual, the EEDI is also different in significant respects. First, an immediate explanation of the difference might appeal to emotions, rather than perceptual experiences; the EEDI lacks the relevant emotions requisite for moral experiences. Second, the EEDI makes for a poor contrast if they lack the moral concepts needed to recognise moral properties in the first place. Third, the use of an EEDI as a contrast may prove problematic because the exact nature of an EEDI is unclear; claiming that the best explanation of the difference between the two individuals’ experiences is a representational difference may be premature in the face of numerous and conflicting theories about the pathology of an EEDI. For example, if an EEDI’s perceptual system is identical to that of the neurotypical individual, the EEDI may still perceptually represent moral properties but fail to respond to or recognise them for some other reason. If this hypothesis is correct, then the use of an EEDI is illegitimate because it does not capture the purported experiential difference.
c. Phenomenal Contrast and Parity Considerations
Even if CMP gets the right result, this does not rule out that other views can explain the phenomenology as well. For example, Pekka Väyrynen claims that inferentialism provides a better explanation of moral experiences, particularly regarding explanations of different experiences in phenomenal contrast scenarios (2018). To show this, Väyrynen first provides a rival hypothesis to a perceptualist account, which is as follows. When we see a father hugging his child, our experience of moral goodness is a representation that “results from an implicit habitual inference or some other type of transition in thought which can be reliably prompted by the non-moral perceptual inputs jointly with the relevant background moral beliefs” (Väyrynen 2018). This rival hypothesis aims to explain the phenomenological experiences targeted by phenomenal contrast arguments by stating that rather than moral properties appearing in our perceptual contents, what happens when we have a moral experience is that past moral learning, in conjunction with the non-moral perceptual inputs, forms a moral belief downstream from perception.
To see how this might work in a non-moral case, we can consider the following vignette, Fine Wine (Väyrynen 2018):
Greg, an experienced wine maker, reports that when he samples wine he perceives it as having various non-evaluative qualities which form his basis for classifying it as fine or not. Michael, a wine connoisseur, says that he can also taste fineness in wine.
Väyrynen asks if Michael has a perceptual experience of a property, in this case, fineness, that Greg cannot pick up on, and argues that there is no difference in perceptual experience. Granting that Greg and Michael’s experiences of the wine can differ, we need not appeal to Michael being able to perceive the property of fineness in order to explain this difference. What explains the difference in phenomenology, according to Väyrynen, is that Michael’s representations of fineness are “plausibly an upshot of inferences or some other reliable transitions in thought…” (Väyrynen 2018). Väyrynen’s hypothesis aims to reveal the phenomenal contrast argument as lacking the virtue of parsimony. That is, the perceptualist is using more theoretical machinery than needed to explain the difference in phenomenal experiences. The phenomenal contrast argument explains the difference in phenomenology between two individuals by claiming that moral properties appear in the contents of perception. Väyrynen’s rival hypothesis is supposed to be a simpler and more plausible alternative that explains why we may think high-level contents are in perception. First, it explains what appears to be a difference in perceptual experience as a difference in doxastic experience (a difference in beliefs). Second, because the difference is in doxastic experience, Väyrynen’s hypothesis does not commit to high-level contents in perception. Everyone who is party to this debate agrees on the existence of low-level perceptual contents and doxastic experiences, so to endorse high-level contents is to take on board an extra commitment. All things being equal, it is better to explain a phenomenon with fewer theoretical posits. In other words, Väyrynen’s hypothesis gets better explanatory mileage than the perceptualist’s phenomenal contrast argument.
Consider a moral counterpart of fine wine, where Greg and Michael witness a father hugging his child. Greg rarely engages in moral theorizing, but he classifies the action as morally good based on some of the non-moral features he perceives. Michael, on the other hand, is a world class moral philosopher who claims he can see the goodness or badness of actions. The perceptualist will say that the latter individual perceives goodness, but the former individual is perceptually lacking such that they cannot pick up on moral properties. The perceptualist who makes use of phenomenal contrast arguments is committed to saying here that Michael’s perceptual system has been trained to detect moral properties and has moral contents in perceptual experience, whereas Greg has to do extra cognitive work to make a moral judgment. Väyrynen’s rival hypothesis, on the other hand, need not claim we perceptually represent moral properties, but rather can explain the difference in phenomenology by appealing to the implicit inferences one may make in response to non-moral properties to which one has a trained sensitivity. According to Väyrynen’s hypothesis, Michael’s cognitive system is trained to make implicit inferences in response to certain non-moral properties, whereas Greg needs to do a bit more explicit cognitive work to make a moral judgment. What seems like a difference in perceptual experience is explained away as a difference in post-perceptual experience.
Väyrynen’s hypothesis also challenges Werner’s phenomenal contrast argument above, as it has an explanation for the EEDI’s different phenomenological experience. The neurotypical has the moral experience because of implicit inferences being made, but the EEDI fails to have the same experience because the EEDI lacks a sensitivity to the moral properties altogether, failing to draw the inferences the neurotypical is trained to make. In short, the difference in one’s phenomenological experience is explained by this rival hypothesis by differences in belief, rather than in perception.
d. Cognitive Penetration
It is already clear how low-level contents make it into perception: perceptual scientists are familiar with the rod and cone cells that make up the retina and process incoming light, as well as with how that information is used by the early visual system. It is less clear how high-level contents make their way into perceptual experience. If perception does contain high-level contents, then a mechanism is required to explain how such contents make it into perceptual experience. The mechanism of choice for philosophers of perception and cognitive scientists is cognitive penetration. Cognitive penetration is a psychological hypothesis claiming that at least some of an individual’s perceptual states are shaped by that individual’s propositional attitudes, such as beliefs, desires, and fears. Put another way, cognitive penetration is the claim that perceptual experience is theory-laden.
To understand how cognitive penetration is supposed to work, we should consider another phenomenal contrast case. Imagine that you are working at a nature conservation center and are unfamiliar with the plant known as Queen Anne’s lace. While working at the conservation center, you are told by your supervisor that plants that look a certain way are Queen Anne’s lace. After repeated exposure to the plant, that a plant is Queen Anne’s lace becomes visually salient to you. In other words, your perceptual experience of Queen Anne’s lace prior to learning to recognize it is different from the perceptual experience you have of the plant after you have learned to recognize it. Cognitive penetration explains this shift in perceptual experience by holding that your Queen Anne’s lace beliefs shape your perceptual experience, such that the property of ‘being Queen Anne’s lace’ makes it into the content of your perception. In other words, the difference in perceptual experiences is explained by the difference in perceptual contents, which in turn is explained by perceptual experiences being mediated by propositional attitudes. We should take care to separate this from a similar thesis which claims that there is no change in perceptual experience after learning to recognize Queen Anne’s lace, but that the shift in the phenomenology (the what-it-is-likeness) of looking at Queen Anne’s lace is explained by changes in post-perceptual experience, such as having new beliefs about the plant. Cognitive penetration claims that the phenomenological difference is between perceptual experiences, and that it is the beliefs about Queen Anne’s lace that change the perceptual experience.
Cognitive penetration is an attractive option for the perceptualist because it provides a mechanism to explain how moral properties make their way into the contents of perception. Consequently, the perceptualist’s story for how we see the rightness or wrongness of actions will be identical to the story about Queen Anne’s lace above: an individual learns about morality from their community and forms moral beliefs, which in turn prime the perceptual system to perceive moral properties. Once the perceptualist has cognitive penetration, they have a story for how moral properties enter the contents of perception, and with it an elegant epistemology of moral justification. This epistemology respects the matching content constraint, which states that in order for a belief to be justified by perception, the contents of the belief must match the contents of perception. The perceptualist may then say that we have foundational perceptual justification for our moral beliefs, in the same way that we have foundational perceptual justification for tree beliefs. Just as we see that there is a tree before us, we see that an action is wrong.
e. The Mediation Challenge
The perceptualist’s use of cognitive penetration has led to challenges on the grounds that cognitive penetration, the thesis that propositional attitudes influence perceptual experiences, leads to counterintuitive consequences. One of the most prominent challenges to the possibility of moral perception comes from Faraci, who argues that if cognitive penetration is true, then CMP must be false (Faraci 2015). To motivate the argument that no moral justification is grounded in perception, Faraci defends a principle he calls mediation:
If perceptions of X are grounded in experiences as of Y, then perceptions of X produce perceptual justification only if they are mediated by background knowledge of some relation between X and Y. (Faraci 2015)
What mediation states is that one can have perceptual justification of some high-level property only if the experience of that higher-level property is in some way informed by knowledge of its relation to the lower-level properties in which it is grounded. To motivate the plausibility of mediation, Faraci appeals to the non-moral example of seeing someone angry. If Norm sees that Vera is angry, presumably he knows that she is angry because he sees her furrowed brow and scowl, and he knows that a furrowed brow and scowl are the kind of behavior that indicates that someone is angry. In an analogous moral case, someone witnessing a father hugging his child knows that they are seeing a morally good action only if they have the relevant background knowledge, the relevant moral beliefs, connecting parental affection with goodness. If the witness did not possess the moral bridge principle that parental affection is good, then the witness would not know that they had seen a morally good action. The thrust of the argument is that if mediation is plausible in the non-moral case, then it is plausible in the moral case as well. And if mediation is plausible in the moral case, then CMP is an implausible account of moral epistemology, because it will need to appeal to background moral knowledge not gained in perceptual experience to explain how we see that the father hugging his child is a morally good action.
Faraci considers three possible ways for the defender of moral perception to avoid the appeal to background knowledge. The first option is to claim that the moral bridge principles are themselves known through perceptual experiences. In the case of the father hugging his child, then, we antecedently had a perceptual experience that justified the belief that parental affection is good. The problem with this response is that it leads to a regress, since we would need further background knowledge connecting parental affection and goodness (such as that parental affection causes pleasure, and pleasure is good), and experientially gained knowledge of each further bridge principle.
The second option is that one could already know some basic moral principles a priori. The problem with this response should be apparent: if we know some background principles a priori, then we have conceded Faraci’s point that none of our most basic moral knowledge is known through experience.
Finally, someone could try to argue that one comes to know a moral fact by witnessing an action multiple times and noting its correlation with perceived goodness. The problem with this is that if each individually viewed action is perceived as being good, then we already have background knowledge informing us of the goodness of that act. If so, then we have not answered the mediation challenge or shown that CMP is a plausible epistemology of morality.
One way to defend CMP in response to Faraci’s challenge is to follow Preston Werner’s claim that mediation is too strong and to offer a reliabilist account of justification that is compatible with a weaker reading of mediation. Werner considers a weak reading and a strong reading of Faraci’s mediation condition (Werner 2018). Werner rejects the strong reading of mediation on the grounds that while it may make Faraci’s argument against the plausibility of moral perception work, it overgeneralizes to cases of perception of ordinary objects. Werner points out that the strong reading of mediation requires that we be able to make explicit the background knowledge behind the perceptual judgments that we make; if we perceive a chair, the strong reading requires that we be able to articulate the ‘chair theory’ informing our perceptual experience, otherwise our perceptual judgment that there is a chair is unjustified. Because the vast majority of non-mereologists cannot make explicit a ‘chair theory’ informing their perception, the strong reading yields the verdict that the vast majority of us are unjustified in our perceptual judgment that there is a chair.
The weak reading that Werner offers is what he calls “thin-background knowledge”, which he characterizes as “subdoxastic information that can ground reliable transitions from perceptual information about some property Y to perceptual information as of some other property X” (Werner 2018). The upshot is that a pure perceptualist epistemology of morality is compatible with the thin-background knowledge reading of mediation: We do not need to have access to the subdoxastic states that ground our perceptual judgments in order for us to know that our perceptual judgments are justified. Werner’s response to Faraci, in summary, is that a pure perceptualist epistemology is plausible because thin-background knowledge gives us an explanation as to how our perceptual moral judgments are in good epistemic standing.
f. Moral Perception and Wider Debates in The Philosophy of Perception
A broader lesson from the mediation challenge, however, is that many of the issues facing CMP are the same arguments that appear in general debates regarding the epistemology and metaphysics of perception. In the case of Faraci (2015), the argument is a particular instance of a wider concern about cognitive penetration.
What the mediation challenge reflects is a general concern about epistemic dependence and epistemic downgrade in relation to cognitive penetration. In particular, the mediation principle is an instance of the general challenge of epistemic dependence:
A state or process, e, epistemically depends upon another state, d, with respect to content c if state or process e is justified or justification-conferring with respect to c only if (and partly because) d is justified or justification-conferring with respect to c. (Cowan 2014, 674)
The reason one might worry about epistemic dependence in connection with cognitive penetration is that the justification-conferring state in instances of cognitive penetration might be the belief states shaping the perceptual experiences, rather than the perceptual experiences doing the justificatory work. If this is true, then a perceptual epistemology of all high-level contents is doubtful, since what does the justificatory work in identifying pine trees will be either training or reflection on patterns shared between trees, neither of which lends itself to a perceptual story.
There is another general worry for cognitive penetration: epistemic downgrade. Assuming cognitive penetration is true, even if one were able to explain away epistemic dependence, one might still think that our perceptual justification is held hostage by the beliefs that shape our experiences. For illustration, suppose we believe that anyone wearing a hoodie is carrying a knife. If we see someone wearing a hoodie pull out a cellphone, our belief may shape our perceptual state such that our perceptual experience is of the person in the hoodie pulling a knife. We then believe that the person is pulling a knife. Another example of epistemic downgrade involves anger:
Before seeing Jack, Jill fears that Jack is angry at her. When she sees him, her fear causes her to have a visual experience in which he looks angry at her. She goes on to believe that he is angry (Siegel 2019, 67).
In both cases there appears to be an epistemic defect: a belief is shaping a perceptual experience, which in turn provides support for the very same belief that shaped that experience. It is an epistemically vicious feedback loop. The worry about epistemic downgrade and high-level contents should be clear. In the case of morality, our background beliefs may be false, which will in turn shape our moral perceptual experiences to be misrepresentative. This appears to provide a defeater for perceptual justification of morality, and it forces the perceptualist into a defense of the moral background beliefs, which may turn out to be an a priori exercise, defeating the a posteriori character of justified moral belief that the perceptualist wanted.
One way to avoid these epistemic worries is for the moral perceptualist to endorse some form of epistemic dogmatism, the claim that seemings (perceptual or doxastic) provide immediate, prima facie, defeasible justification for belief. The perceptualist who adopts this strategy can argue that the worry of epistemic dependence is misplaced because, although the presence of high-level content is causally dependent on the influence of background beliefs, on the dogmatist theory justification for a belief epistemically depends only on the perceptual experience itself. To see this, consider the following analogy: if one is wearing sunglasses, the perceptual experience one has will depend on the sunglasses one is wearing, but one’s perceptual beliefs are justified not by the sunglasses but by the perceptual experience itself (Pryor 2000). For concerns about epistemic downgrade, the perceptualist may give a similar response: one is defeasibly justified in a perceptual belief until one is made aware of a defeater, which in this case is the vicious feedback loop. To be clear, no moral perceptualist has made use of this response in print, as most opt for a kind of externalist account of perceptual justification. We should keep in mind that the dogmatist response is made in debates in general perceptual epistemology, and because debates about the epistemic effects of cognitive penetration in moral perception are instances of the general debate, the dogmatist strategy is available should the moral perceptualist wish to use it.
Apart from the epistemic difficulties cognitive penetration incurs, because cognitive penetration is a thesis about the structure of human cognitive architecture it must withstand scrutiny from cognitive science and empirical psychology. A central assumption of the cognitive science and psychology of perception is that the perceptual system is modular, or informationally encapsulated. Cognitive penetration assumes the opposite, because it claims that beliefs influence perceptual experience. Because cognitive penetration holds that the perceptual system is non-modular and receives input from the cognitive system, it falls upon advocates of the hypothesis to show that there is empirical support for the thesis. The problem is that most empirical tests purporting to demonstrate effects of cognitive penetration are questionable. The results have been debunked as explainable by other psychological effects, such as attention effects, or dismissed on the grounds of poor methodology and failures of replication (Firestone and Scholl 2016a). Furthermore, in the case of perceptual learning, cognitive penetration predicts changes in the neurophysiology of the cognitive system rather than the perceptual system, since new beliefs would explain learning to recognize an object. Research in perceptual neurophysiology shows the opposite: perceptual learning is accompanied by changes in the neurophysiology of the perceptual system (Connolly 2019). The viability of CMP, insofar as it depends on cognitive penetration for high-level contents, is subject not only to epistemic pressures but also to empirical fortune.
5. Summary: Looking Forward
For moral epistemologists, a foundationalist epistemology that provides responses to skeptical challenges is highly desirable. While a variety of theories of moral epistemology provide foundations, CMP grounds our justification in a perceptual faculty with which we are all familiar and provides a unified story for all perceptual justification.
The overall takeaway is that the arguments made by both defenders and challengers of CMP are instances of general issues in the philosophy of perception. The lesson to be drawn here for CMP is that the way forward is to pay close attention to the general philosophy of perception literature. Because the literature on CMP itself remains in very early development, paying attention to the general issues will prevent the advocate of CMP from falling into mistakes made in the general literature, as well as open potential pathways for developing CMP in interesting and novel ways.
6. References and Further Reading
Audi, Robert. 2013. Moral Perception. Princeton University Press.
A book length defense of CMP. A good example of the kind of epistemic ecumenicism a perceptualist may adopt.
Bergqvist, Anna, and Robert Cowan (eds.). 2018. Evaluative Perception. Oxford: Oxford University Press.
Collection of essays on the plausibility of CMP and emotional perception.
Church, Jennifer. 2013. “Moral Perception.” Possibilities of Perception (pp. 187-224). Oxford: Oxford University Press.
Presents a Kantian take on moral perception.
Crow, Daniel. 2016. “The Mystery of Moral Perception.” Journal of Moral Philosophy 13, 187-210.
Challenges moral perception with a reliability challenge.
Connolly, Kevin. 2019. Perceptual Learning: The Flexibility of the Senses. Oxford: Oxford University Press.
Discusses the findings of the neuroscience and psychology of perception in relation to theses in the philosophy of mind. Chapter 2 argues against cognitive penetration.
Cowan, Robert. 2014. “Cognitive Penetrability and Ethical Perception.” Review of Philosophy and Psychology 6, 665-682.
Discusses the epistemic challenges posed to moral perception by cognitive penetration. Focuses on epistemic dependence.
Cowan, Robert. 2015. “Perceptual Intuitionism.” Philosophy and Phenomenological Research 90, 164-193.
Defends the emotional perception of morality.
Cowan, Robert. 2016. “Epistemic perceptualism and neo-sentimentalist objections.” Canadian Journal of Philosophy 46, 59-81.
Defends the emotional perception of morality.
Faraci, David. 2015. “A hard look at moral perception.” Philosophical Studies 172, 2055-2072.
Faraci, David. 2019. “Moral Perception and the Reliability Challenge.” Journal of Moral Philosophy 16, 63-73.
Responds to Werner 2018. Argues that moral perception has a reliability challenge.
Firestone, Chaz, and Brian J. Scholl. 2016a. “Cognition Does Not Affect Perception: Evaluating the Evidence for ‘Top-down’ Effects.” Behavioral and Brain Sciences 39.
Challenges studies that purport to demonstrate the effects of cognitive penetration.
Firestone, Chaz, and Brian J. Scholl. 2016b. “‘Moral Perception’ Reflects Neither Morality Nor Perception.” Trends in Cognitive Sciences 20, 75-76.
Response to Gantman and Van Bavel 2015.
Fodor, Jerry. 1983. The Modularity of Mind. Cambridge, Massachusetts: MIT Press.
Argues for the informational encapsulation of the perceptual system.
Gantman, Ana P. and Jay J. Van Bavel. 2014. “The moral pop-out effect: Enhanced perceptual awareness of morally relevant stimuli.” Cognition 132, 22-29.
Argues that findings in perceptual psychology support moral perception.
Gantman, Ana P. and Jay J. Van Bavel. 2015. “Moral Perception.” Trends in Cognitive Sciences 19, 631-633.
An early formulation of CMP, discusses the epistemic motivations for the view.
McGrath, Sarah. 2018. “Moral Perception and its Rivals.” In Anna Bergqvist and Robert Cowan (eds.), Evaluative Perception (pp. 161-182). Oxford: Oxford University Press.
McGrath, Sarah. 2019. Moral Knowledge. Oxford: Oxford University Press.
Chapter 4 is a presentation of CMP that does not require high-level contents. Chapter 1 is a criticism of some views on the methodology of moral inquiry.
Pylyshyn, Zenon. 1999. “Is Vision Continuous with Cognition? The Case for Cognitive Impenetrability of Visual Perception.” Behavioral and Brain Sciences 22, 341-365.
Pryor, James. 2000. “The Skeptic and the Dogmatist.” Noûs 34, 517-549.
Early presentation of phenomenal dogmatism. Responds to epistemic concerns about the theory-ladenness of perception.
Reiland, Indrek. 2021. “On experiencing moral properties.” Synthese 198, 315-325.
Presents a version of the ‘looks’ objection.
Siegel, Susanna. 2006. “Which properties are represented in perception.” In Tamar S. Gendler and John Hawthorne (eds.), Perceptual Experience (pp. 481-503). Oxford: Oxford University Press.
Argues that perceptual experience includes high-level contents.
Siegel, Susanna. 2011. The Contents of Visual Experience. Oxford: Oxford University Press.
Book length defense of high-level contents in perceptual experience.
Siegel, Susanna. 2012. “Cognitive Penetrability and Perceptual Justification.” Noûs 46.
Discusses the issue of epistemic downgrade.
Siegel, Susanna. 2019. The Rationality of Perception. Oxford: Oxford University Press.
Chapter 4 is a discussion of epistemic downgrade and responds to criticisms of the problem.
Siegel, Susanna, and Alex Byrne. 2016. “Rich or thin?” In Bence Nanay (ed.), Current Controversies in Philosophy of Perception (pp. 59-80). New York: Routledge-Taylor & Francis.
Byrne and Siegel debate whether or not there are high-level perceptual contents.
Väyrynen, Pekka. 2018. “Doubts about Moral Perception.” In Anna Bergqvist and Robert Cowan (eds.), Evaluative Perception (pp. 109-128). Oxford: Oxford University Press.
Werner, Preston J. 2016. “Moral Perception and the Contents of Experience.” Journal of Moral Philosophy 13, 294-317.
Werner, Preston J. 2017. “A Posteriori Ethical Intuitionism and the Problem of Cognitive Penetrability.” European Journal of Philosophy 25, 1791-1809.
Argues that synchronic cognitive penetration is a problem for CMP, but diachronic cognitive penetration is epistemically harmless.
Werner, Preston J. 2018. “Moral Perception without (Prior) Moral Knowledge.” Journal of Moral Philosophy 15, 164-181.
Response to Faraci 2015.
Werner, Preston J. 2018. “An epistemic argument for liberalism about perceptual content.” Philosophical Psychology 32, 143-159.
Defends the claim that there are high-level contents in perception by arguing that it best explains some findings in perceptual psychology, such as facial recognition.
Werner, Preston J. 2020. “Which Moral Properties Are Eligible for Perceptual Awareness?” Journal of Moral Philosophy 17, 290-319.
Discusses which moral properties we can perceive, concludes that we perceive at least pro-tanto evaluative properties.
Yablo, Stephen. 2005. “Wide Causation.” In Stephen Yablo (ed.), Thoughts: Papers on Mind, Meaning, and Modality. Oxford: Oxford University Press.
Presents a solution to the causal exclusion problem. Argues that mental states are causally efficacious in a ‘wide’ sense in that they would still be explanatorily valuable even if the ‘thin’ causes, the physical states, were different.
Following the publication of An Enquiry Concerning Political Justice in 1793 and his most successful novel, Caleb Williams, in 1794, William Godwin was briefly celebrated as the most influential English thinker of the age. At the time of his marriage to the writer Mary Wollstonecraft in 1797, the achievements and influence of both writers, as well as their personal happiness together, seemed likely to extend into the new century. It was not to be. The war with revolutionary France and the rise of a new spirit of patriotic fervour turned opinion against reformers, and Godwin became a prime target. Following Wollstonecraft’s death in September 1797, a few days after the birth of a daughter, Mary, Godwin published a candid memoir of her that ignited a propaganda campaign against them both, one that became increasingly strident. He published a third edition of Political Justice and a second major novel, St. Leon, but the tide was clearly turning. And while he continued writing into old age, he never again achieved the success, nor the financial security, he had enjoyed in the 1790s. Today he is most often referenced as the husband of Mary Wollstonecraft, as the father of Mary Wollstonecraft Shelley (the author of Frankenstein and The Last Man), and as the founding father of philosophical anarchism. He also deserves to be remembered as a significant philosopher of education.
In An Enquiry Concerning Political Justice, Godwin argues that individuals have the power to free themselves from the intellectual and social restrictions imposed by government and state institutions. The argument starts with the very demanding requirement that we assess options impartially and rationally. We should act only according to a conviction that arises from a conscientious assessment of what would contribute most to the general good. Incorporated in the argument are principles of impartiality, utility, duty, benevolence, perfectionism, and, crucially, independent private judgment.
Godwin insists that we are not free, morally or rationally, to make whatever choices we like. He subscribes to a form of necessitarianism, but he also believes that choices are constrained by duty and that one’s duty is always to put the general good first. Duties precede rights; rights are simply claims we make on people who have duties towards us. Ultimately, it is the priority of the principle of independent private judgment that produces Godwin’s approach to education, to law and punishment, to government, and to property. Independent private judgment generates truth, and therefore virtue, benevolence, justice, and happiness. Anything that inhibits it, such as political institutions or modes of government, must be replaced by progressively improved social practices.
When Godwin first started An Enquiry Concerning Political Justice, he intended it to explore how government can best benefit humanity. He and the publisher George Robinson wanted to catch the wave of interest created by the French Revolution itself and by Edmund Burke’s Reflections on the Revolution in France, which so provoked British supporters of the revolution. Robinson agreed to support Godwin financially while he worked, with the understanding that he would send sections of the work as he completed them. This meant that the first chapters were printed before he had fully realised the implications of his arguments. The inconsistencies that resulted were addressed in subsequent editions. His philosophical ideas were further revised and developed in The Enquirer (1797), Thoughts Occasioned by a Perusal of Dr. Parr’s Spital Sermon (1801), Of Population (1820), and Thoughts on Man (1831), and in his novels. He also wrote several works of history and biography, and wrote or edited several texts for children, which were published by the Juvenile Library that he started with his second wife, Mary Jane Clairmont.
William Godwin was born in 1756 in Wisbech in Cambridgeshire, England, the seventh of thirteen children. His father was a Dissenting minister; his mother was the daughter of a successful shipowner. Godwin was fond of his lively mother, less so of his strictly Calvinist father. He was a pious and academically precocious boy, readily acquiring a close knowledge of the Old and New Testaments. After three years at a local school, where he read widely, learned some Latin and developed a passion for the classics, he moved at the age of 11 to Norwich to become the only pupil of the Reverend Samuel Newton. Newton was an adherent of Sandemanianism, a particularly strict form of Calvinism. Godwin found him pedantic and unjustly critical. The Calvinist doctrines of original sin and predestination weighed heavily. Calvinism left emotional scars, but it influenced his thinking. This was evidenced, Godwin later stated, in the errors of the first edition of Political Justice: its tendency to stoicism regarding pleasure and pain, and the inattention to feeling and private affections.
After a period as an assistant teacher of writing and arithmetic, Godwin began to develop his own ideas about education and to take an interest in contemporary politics. When Godwin’s father died in 1772, his mother paid for her clever son to attend the New College, a Dissenting Academy in Hoxton, north of the City of London. By then Godwin had become, somewhat awkwardly, a Tory, a supporter of the aristocratic ruling class. Dissenters generally supported the Whigs, not least because they opposed the Test Acts, which prohibited anyone who was not an Anglican communicant from holding a public office. At Hoxton Godwin received a more comprehensive higher education than he would have received at Oxford or Cambridge universities (from which Dissenters were effectively barred). The pedagogy was liberal, based on free enquiry, and the curriculum was wide-ranging, covering psychology, ethics, politics, theology, philosophy, science, and mathematics. Hoxton introduced Godwin to the rational dissenting creeds, Socinianism and Unitarianism, to which philosophers and political reformers such as Joseph Priestley and Richard Price subscribed.
Godwin seems to have graduated from Hoxton with both his Sandemanianism and Toryism in place. But the speeches of Edmund Burke and Charles James Fox, the leading liberal Whigs, impressed him and his political opinions began to change. After several attempts to become a Dissenting minister, he accepted that congregations simply did not take to him; and his religious views began a journey through deism to atheism. He was influenced by his reading of the French philosophes. He settled in London, aiming to make a living from writing, and had some early encouragement. Having already completed a biography of William Pitt, Earl of Chatham, he now contributed reviews to the English Review and published a collection of sermons. By 1784 he had published three minor novels, all quite favourably reviewed, and a satirical pamphlet entitled The Herald of Literature, a collection of spoof ‘extracts’ from works purporting to be by contemporary writers. He also contemplated a career in education, for in July 1783 he published a prospectus for a small school that he planned to open in Epsom, Surrey.
For the next several years Godwin was able to earn a modest living as a writer, thanks in part to his former teacher at Hoxton, Andrew Kippis, who commissioned him to write on British and Foreign History for the New Annual Register. The work built him a reputation as a competent political commentator and introduced him to a circle of liberal Whig politicians, publishers, actors, artists, and authors. Then, in 1789, events in France raised hopes for radical reform in Great Britain. On November 4 Godwin was present at a sermon delivered by Richard Price which, while primarily celebrating the Glorious Revolution of 1688, anticipated many of the themes of Political Justice: universal justice and benevolence; rationalism; and a war on ignorance, intolerance, persecution, and slavery. The special significance of the sermon is that it roused Edmund Burke to write Reflections on the Revolution in France, which was published in November 1790. Godwin had admired Burke, and he was disappointed by this furious attack on the Revolution and by its support for custom, tradition, and aristocracy.
He was not alone in his disappointment. Thomas Paine’s Rights of Man, and Mary Wollstonecraft’s A Vindication of the Rights of Men were early responses to Burke. Godwin proposed to his publisher, George Robinson, a treatise on political principles, and Robinson agreed to sponsor him while he wrote it. Godwin’s ideas veered over the sixteen months of writing towards the philosophical anarchism for which the work is best known.
Political Justice, as Godwin declared in the preface, was the child of the French Revolution. As he finished writing it in January 1793, the French Republic declared war on the Kingdom of Great Britain. It was not the safest time for an anti-monarchist, anti-aristocracy, anti-government treatise to appear. Prime Minister William Pitt thought the two volumes too expensive to attract a mass readership; otherwise, the Government might have prosecuted Godwin and Robinson for sedition. In fact, the book sold well and immediately boosted Godwin’s fame and reputation. It was enthusiastically reviewed in much of the press and keenly welcomed by radicals and Dissenters. Among his many new admirers were young writers with whom Godwin soon became acquainted: William Wordsworth, Robert Southey, Samuel Taylor Coleridge, and a very youthful William Hazlitt.
In 1794 Godwin wrote two works that were impressive and successful in different ways. The novel Things as They Are: or The Adventures of Caleb Williams stands out as an original exploration of human psychology and the wrongs of society. Cursory Strictures on the Charge delivered by Lord Chief Justice Eyre to the Grand Jury first appeared in the Morning Chronicle newspaper. Pitt’s administration had become increasingly repressive, charging supporters of British reform societies with sedition. On May 12, 1794, Thomas Hardy, the chair of the London Corresponding Society (LCS), was arrested and committed with six others to the Tower of London; then John Thelwall, a radical lecturer, and John Horne Tooke, a leading light in the Society for Constitutional Information (SCI), were arrested. The charge was High Treason, and the potential penalty was death. Habeas Corpus had been suspended, and the trials did not begin until October. Godwin had attended reform meetings and knew these men. He was especially close to Thomas Holcroft, the novelist and playwright. Godwin argued in Cursory Strictures that there was no evidence that the LCS and SCI were involved in any seditious plots, and he accused Lord Chief Justice Eyre of expanding the definition of treason to include mere criticism of the government. ‘This is the most important crisis in the history of English liberty,’ he concluded. Hardy was called to trial on October 25, and, after twelve days, the jury returned a verdict of not guilty. Subsequently, Horne Tooke and Thelwall were tried and acquitted, and others were dismissed. Godwin’s article was considered decisive in undermining the charge of sedition. In Hazlitt’s view, Godwin had saved the lives of twelve innocent men (Hazlitt, 2000: 290). The collapse of the Treason Trials caused a surge of hope for reform, but a division between middle-class intellectuals and the leaders of labouring class agitation hastened the decline of British Jacobinism. 
This did not, however, end the anti-Jacobin propaganda campaign, nor the satirical attacks on Godwin himself.
A series of essays, published as The Enquirer: Reflections on Education, Manners and Literature (1797), developed a position on education equally opposed to Jean-Jacques Rousseau’s progressivism (in Emile) and to traditional education. Other essays modified or developed ideas from Political Justice. One essay, ‘Of English Style’, describes clarity and propriety of style as the ‘transparent envelope’ of thoughts. Another essay, ‘Of Avarice and Profusion’, prompted the Rev. Thomas Malthus to respond with An Essay on the Principle of Population (1798).
At the lodgings of a mutual friend, the writer Mary Hays, Godwin became reacquainted with a woman he had first met in 1791 at one of the publisher Joseph Johnson’s regular dinners, when he had wanted to converse with Thomas Paine rather than with her. Since then, Mary Wollstonecraft had spent time in revolutionary Paris, fallen in love with an American businessman, Gilbert Imlay, and given birth to a daughter, Fanny. Imlay first left her, then sent her on a business mission to Scandinavia, which led to the publication of Letters Written During a Short Residence in Sweden, Norway and Denmark (1796). She had completed A Vindication of the Rights of Woman in 1792, a more substantial work than her earlier A Vindication of the Rights of Men. She had also recently survived a second attempt at suicide. Having previously published Mary: A Fiction in 1788, she was working on a second novel, The Wrongs of Woman: or, Maria. A friendship soon became a courtship. When Mary became pregnant, they chose to get married and to brave the inevitable ridicule, both previously having condemned the institution of marriage (in Godwin’s view it was ‘the worst of monopolies’). They were married on March 29, 1797. They worked apart during the daytime, Godwin in a rented room near their apartment in Somers Town, St. Pancras, north of central London, and came together in the evening.
Godwin enjoyed the dramatic change in his life: the unfamiliar affections and the semi-independent domesticity. Their daughter was born on August 30. The birth itself went well but the placenta had broken apart in the womb; a doctor was called to remove it, and an infection took hold. Mary died on September 10. At the end she said of Godwin that he was ‘the kindest, best man in the world’. Heartbroken, he wrote that he could see no prospect of future happiness: ‘I firmly believe that there does not exist her equal in the world. I know from experience we were formed to make each other happy’. He could not bring himself to attend the funeral in the churchyard of St. Pancras Church, where just a few months earlier they had married.
Godwin quickly threw himself into writing a memoir of Wollstonecraft’s life. Within a few weeks he had completed a work for which he was ridiculed at the time, and for which he has been criticised by historians who feel that it delayed the progress of women’s rights. The Memoirs of the Author of a Vindication of the Rights of Woman (1798) is a tender tribute, and a frank attempt to explore his own feelings, but Godwin’s commitment to complete candour meant that he underestimated, or was insensitive to, the likely consequence of revealing ‘disreputable’ details of Mary’s past, not least that Fanny had been born out of wedlock. It was a gift to moralists, humourists, and government propagandists.
Godwin was now a widower with a baby, Mary, and a toddler, Fanny, to care for. With help from a nursemaid and, subsequently, a housekeeper, he settled into the role of affectionate father and patient home educator. However, he retained a daily routine of writing, reading, and conversation. A new novel was to prove almost as successful as Caleb Williams. This was St. Leon: A Tale of the Sixteenth Century. It is the story of an ambitious nobleman disgraced by greed and an addiction to gambling, then alienated from society by the character-corrupting acquisition of alchemical secrets. It is also the story of the tragic loss of an exceptional wife and of domestic happiness: it has been seen as a tribute to Wollstonecraft and as a correction to the neglect of the affections in Political Justice.
The reaction against Godwin continued into the new century, with satirical attacks coming from all sides. It was not until he read a serious attack by his friend Dr. Samuel Parr that he was stung into a whole-hearted defence, engaging also with criticisms by James Mackintosh and Thomas Malthus. Thoughts Occasioned by the Perusal of Dr. Parr’s Spital Sermon was published in 1801. His replies to Mackintosh and Malthus were measured, but his response to Parr was more problematic, making concessions that could be seen as undermining the close connection between truth and justice that is crucial to the argument of Political Justice.
Since Mary Wollstonecraft’s death, Godwin had acquired several new friends, including Charles and Mary Lamb, but he clearly missed the domesticity he had enjoyed so briefly; and he needed a mother for the girls. The story goes that Godwin first encountered his second wife in May 1801, shortly before he started work on the reply to Dr. Parr. He was sitting reading on his balcony when he was hailed from next door: ‘Is it possible that I behold the immortal Godwin?’ Mary Jane Clairmont had two children, Charles and Jane, who were similar in age to Fanny and Mary. Godwin’s friends largely disapproved – they found Mary Jane bad-tempered and artificial – but Godwin married her, and their partnership endured until his death.
Godwin had a moderate success with a Life of Chaucer, failed badly as a dramatist, and completed another novel, Fleetwood, or the New Man of Feeling (1805), but he was not earning enough to provide for his family by his pen alone. He and Mary Jane conceived the idea of starting a children’s bookshop and publishing business. For several years the Juvenile Library supplied stationery and books of all sorts for children and schools, including history books and story collections written or edited by ‘Edward Baldwin’, Godwin’s own name being considered too notorious. Despite some publishing successes, such as Charles and Mary Lamb’s Tales from Shakespeare, the bookshop never really prospered. As he slipped into serious debt, Godwin felt he was slipping also into obscurity. In 1809 he wrote an Essay on Sepulchres: A Proposal for Erecting some Memorial of the Illustrious Dead in All Ages on the Spot where their Remains have been Interred. The Essay was generally well-received, but the proposal was ignored. With the Juvenile Library on the point of collapse, the family needed a benefactor who could bring them financial security.
Percy Bysshe Shelley was just twenty, recently expelled from Oxford University for atheism, and newly married and disinherited, when in January 1812 he wrote a fan letter to a philosopher he had not been sure was still living. His reading of Political Justice at school had ‘opened to my mind fresh & more extensive views’, he wrote. Shelley went off to Ireland to agitate for independence and distribute his pamphlet An Address to the Irish People. Godwin disapproved of the inflammatory tone, but invited Shelley and his wife, Harriet, to London. They eventually arrived in October and Shelley and Godwin thereafter maintained a friendly correspondence. Shelley’s first major poem, Queen Mab, with its Godwinian themes and references, was published at this time. During 1813, as he and Shelley continued to meet, Godwin saw a good deal of a new friend and admirer, Robert Owen, the reforming entrepreneur and philanthropist. Hazlitt commented that Owen’s ideas of Universal Benevolence, the Omnipotence of Truth and the Perfectibility of Human Nature were exactly those of Political Justice. Others thought Owen’s ‘socialism’ was Godwinianism by another name. As Godwin pleaded with friends and admirers for loans and deferrals to help keep the business afloat, the prospect of a major loan from Shelley was thwarted by Sir Timothy Shelley withholding his son’s inheritance when he turned twenty-one.
Godwin’s troubles took a different turn when Mary Godwin, aged sixteen, returned from a stay with friends in Scotland looking healthy and pretty. Harriet Shelley was in Bath with a baby. Shelley dined frequently with the Godwins and took walks with Mary and Jane. Soon he was dedicating an ode to ‘Mary Wollstonecraft Godwin’. On June 26th Mary declared her love as they lay together in St. Pancras Churchyard, beside her mother’s grave, Jane lingering nearby. By July Shelley had informed Harriet that he had only ever loved her as a brother. Godwin was appalled and remonstrated angrily, but early on the morning of July 28 he found a letter on his dressing table: Mary had eloped with Shelley, and they had taken Jane with them.
Godwin’s life over the next eight years, until Shelley’s tragic death in 1822, was far less dramatic or romantic than the lives of Mary and Shelley, or of Claire (as Jane now called herself). Their travels in Europe, the births and deaths of several children, including Claire’s daughter by Lord Byron, and the precocious literary achievements (Shelley’s poems and Mary’s novel Frankenstein) are well known. Meanwhile, in London, Mary Wollstonecraft’s daughter, Fanny, was left unhappily behind. The atmosphere at home was tense and gloomy. Godwin refused to meet Mary and her lover until they were married, although the estrangement did not stop him accepting money from Shelley. A protracted struggle ensued, with neither party appearing to live up to Godwinian standards of candour and disinterestedness. Then, in October 1816, Fanny left the family home, ostensibly to travel to Ireland to visit her aunts (Wollstonecraft’s sisters). In Swansea, she killed herself by taking an overdose of laudanum. She was buried in an unmarked pauper’s grave, Godwin being fearful of further scandal connected with himself and Wollstonecraft. Shortly after this, Harriet Shelley’s body was pulled from the Serpentine in London. Shelley and Mary could now marry, and before long they escaped to Italy, with Claire (Jane) still in tow.
Despite these troubles and the precarious position of the Juvenile Library, Godwin managed to complete another novel, Mandeville, A Tale of the Seventeenth Century in England (1817). He took pride in his daughter’s novel and in his son-in-law’s use of Godwinian ideas in his poems. At the end of 1817, Godwin began his fullest response to Malthus. It took him three years of difficult research to complete Of Population. Meanwhile, his financial difficulties had reached a crisis point. He besieged Shelley in Italy with desperate requests to fulfil his promised commitments, but Shelley had lost patience and refused. The money he had already given, he complained, ‘might as well have been thrown into the sea’. A brief reprieve allowed the Godwins to move, with the Juvenile Library, to better premises. Then came the tragedy of July 8th, 1822. Shelley drowned in rough seas in the Gulf of La Spezia. Mary Shelley returned to England in 1823 to live by her pen. In 1826 she published The Last Man, a work, set in the twenty-first century, in which an English monarch becomes a popular republican leader only to survive a world-wide pandemic as the last man left alive. Godwin’s influence is seen in the ambition and originality of her speculative fiction.
Godwin himself worked for the next five years on a four-volume History of the Commonwealth—the period between the execution in 1649 of Charles I and the restoration in 1660 of Charles II. He describes the liberty that Cromwell and the Parliamentarians represented as a means, not an end in itself; the end is the interests and happiness of the whole: ‘But, unfortunately, men in all ages are the creatures of passions, perpetually prompting them to defy the rein, and break loose from the dictates of sobriety and speculation.’
In 1825, Godwin was finally declared bankrupt, and he and Mary Jane were relieved of the burden of the Juvenile Library. They moved to cheaper accommodation. Godwin had the comfort of good relations with his daughter and grandson. He hoped for an academic position with University College, which Jeremy Bentham had recently helped to establish, but was disappointed. He worked on two further novels, Cloudesley and Deloraine. In 1831 came Thoughts on Man, a collection of essays in which he revisited familiar philosophical topics. In 1834, the last work to appear in his lifetime was published. Lives of the Necromancers is a history of superstition, magic, and credulity, in which Godwin laments that we make ourselves ‘passive and terrified slaves of the creatures of our imagination’. A collection of essays on religion, published posthumously, made similar points but commended a religious sense of awe and wonder in the presence of nature.
The 1832 Reform Bill’s extension of the male franchise pleased Godwin. In 1833, the Whig government awarded him a pension of £200 a year and a residence in New Palace Yard, within the Palace of Westminster parliamentary estate—an odd residence for an anarchist. When the Palace of Westminster was largely destroyed by fire, in October 1834, the new Tory Government renewed his pension, even though he had been responsible for fire safety at Westminster and the upkeep of the fire engine. He spent the last years of his life in relative security with Mary Jane, mourning the deaths of old friends and meeting a new generation of writers. He died at the age of eighty on April 7, 1836. He was buried in St. Pancras Churchyard, in the same grave as Mary Wollstonecraft. When Mary Shelley died in 1851, her son and his wife had Godwin’s and Wollstonecraft’s remains reburied with her in the graveyard of St. Peter’s Church in Bournemouth, on the south coast.
2. Godwin’s Philosophy: An Enquiry Concerning Political Justice
Note: references to An Enquiry Concerning Political Justice (PJ) give the volume number and page number of the two-volume 1798 third edition, which is the same as the 1946 University of Toronto Press facsimile edition, ed. F. E. L. Priestley. This is followed by the book and chapter number of the first edition (for example, PJ II: 497; Bk VIII, vi). Page numbers of other works are those of the first edition.
a. Summary of Principles
The first edition of An Enquiry Concerning Political Justice was published in 1793. A second edition was published in 1796 and a third in 1798. Despite the modifications in the later editions, Godwin considered that ‘the spirit and the great outlines of the work remain untouched’ (PJ I, xv; Preface to second edition). Arguably, he was underplaying the significance of the changes. They make clear that pleasure and pain are the only bases on which morality can rest, that feeling, rather than reason or judgment, is what motivates action, and that private affections have a legitimate place in our rational deliberations.
The modifications are incorporated in the ‘Summary of Principles’ (SP) that he added to the start of the third edition (PJ I, xxiii–xxvii). The eight principles are:
(1) ‘The true object of moral and political disquisition, is pleasure or happiness.’ Godwin divides pleasures between those of the senses and those that are ‘probably more exquisite’, such as the pleasures of intellectual feeling, sympathy, and self-approbation. The most desirable and civilized state is that in which we have access to all these diverse sources of pleasure and possess a happiness ‘the most varied and uninterrupted’.
(2) ‘The most desirable condition of the human species, is a state of society.’ Although government was intended to secure us from injustice and violence, in practice it embodies and perpetuates them, inciting passions and producing oppression, despotism, war, and conquest.
(3) ‘The immediate object of government is security.’ But, in practice, the means adopted by government restrict individual independence, limiting self-approbation and our ability to be wise, useful, or happy. Therefore, the best kind of society is one in which there is as little as possible encroachment by government upon individual independence.
(4) ‘The true standard of the conduct of one man to another is justice.’ Justice is universal: it requires us to aim to produce the greatest possible sum of pleasure and happiness and to be impartial.
(5) ‘Duty is the mode of proceeding, which constitutes the best application of the capacity of the individual, to the general advantage.’ Rights are claims which derive from duties; they include claims on the forbearance of others.
(6) ‘The voluntary actions of men are under the direction of their feelings.’ Reason is a controlling and balancing faculty; it does not cause actions but regulates them ‘according to the comparative worth it ascribes to different excitements’—therefore, it is the improvement of reason that will produce social improvements.
(7) ‘Reason depends for its clearness and strength upon the cultivation of knowledge.’ As improvement in knowledge is limitless, ‘human inventions, and modes of social existence, are susceptible of perpetual improvement’. Any institution that perpetuates particular modes of thinking or conditions of existence is pernicious.
(8) ‘The pleasures of intellectual feeling, and the pleasures of self-approbation, together with the right cultivation of all our pleasures, are connected with the soundness of understanding.’ Prejudices and falsehoods are incompatible with soundness of understanding, which is connected, rather, with free enquiry and free speech (subject only to the requirements of public security). It is also connected with simplicity of manners and leisure for intellectual self-improvement: consequently, an unequal distribution of property is not compatible with a just society.
b. From Private Judgment to Political Justice
Godwin claims there is a reciprocal relationship between the political character of a nation and its people’s experience. He rejects Montesquieu’s suggestion that political character is caused by external contingencies such as the country’s climate. Initially, Godwin seems prepared to argue that good government produces virtuous people. He wants to establish that the political and moral character of a nation is not static; rather, it is capable of progressive change. Subsequently, he makes clear that a society of progressively virtuous people requires progressively less governmental interference. He is contesting Burke’s arguments for tradition and stability, but readers who hoped that Godwin would go on to argue for a rapid, or violent, revolution were to be disappointed. There is even a Burkean strain in his view that sudden change can risk undoing political and social progress by breaking the interdependency between people’s intellectual and emotional worlds and the social and political worlds they inhabit. He wants a gradual march of opinions and ideas. The restlessness he argues for is intellectual, and it is encouraged in individuals by education.
Unlike Thomas Paine and Mary Wollstonecraft in their responses to Burke, Godwin rejects the language of rights. Obligations precede rights, and our fundamental obligation is to do what we can to benefit society as a whole. If we do that, we act justly; if we act with a view to benefit only ourselves or those closest to us, we act unjustly. A close family relationship is not a sufficient reason for a moral preference, nor is social rank. Individuals have moral value according to their potential utility. In a fire, your duty would be to rescue someone like Archbishop Fénelon, a benefactor to humankind, rather than, say, a member of your own family. (Fénelon’s 1699 didactic novel The Adventures of Telemachus, Son of Ulysses criticised European monarchies and advocated universal brotherhood and human rights; it influenced Rousseau’s philosophy of education.) It seems, then, that it is the consequences of one’s actions that make them right or wrong, and that Godwin’s moral philosophy is a form of utilitarianism. However, Mark Philp (1986) argues that Godwin’s position is more accurately characterised as a form of perfectionism: one’s intentions matter and these, crucially, are improvable.
What makes our intentions improvable is our capacity for private judgment. As Godwin has often been unfairly described, both in his own day and more recently, as a cold-hearted rationalist, it is important to clarify what he means by ‘judgment’. It involves a scrupulous process of weighing relevant considerations (beliefs, feelings, pleasures, alternative opinions, potential consequences) in order to reach a reasonable conclusion. In the third edition (SP 6–8), he implies that motivating force is not restricted to feelings (passions, desires), but includes preferences of all kinds. The reason/passion binary is resisted. An existing opinion or intellectual commitment might be described as a feeling, as something which pleases us and earns a place in the deliberative process. In his Reply to Parr, Godwin mentions that the choice of saving Fénelon could be viewed as motivated by the love of the man’s excellence or by an eagerness ‘to achieve and secure the welfare and improvement of millions’ (1801: 41). Furthermore, any kind of feeling that comes to mind thereby becomes ratiocinative or cognitive; the mind could not otherwise include it in the comparing and balancing process. Godwin rejects the reason/passion binary most explicitly in Book VIII of Political Justice, ‘On Property’. The word ‘passion’, he tells us, is mischievous, perpetually shifting its meaning. Intellectual processes that compare and balance preferences and other considerations are perfectible (improvable); the idea that passions cannot be corrected is absurd, he insists. The only alternative position would be that the deliberative process is epiphenomenal, something Godwin could not accept. (For the shifting meaning of ‘passion’ in this period, and its political significance, see Hewitt, 2017.)
Judgments are unavoidably individual in the sense that the combination of relevant considerations in a particular case is bound to be unique, and also in the sense that personal integrity and autonomy are built into the concept of judgment. If we have conscientiously weighed all the relevant considerations, we cannot be blamed for trusting our own judgment over that of others or the dictates of authority. Nothing—no person or institution, certainly not the government—can provide more perfect judgments. Only autonomous acts, Godwin insists, are moral acts, regardless of actual benefit. Individual judgments are fallible, but our capacity for good judgment is perfectible (SP 6). Although autonomous and impartial judgments might not produce an immediate consensus, conversations and a conscientious consideration of different points of view help us to refine our judgment and to converge on moral truths.
In the first edition of Political Justice, it is the mind’s predisposition for truth that motivates our judgments and actions; in later editions, when it is said to be feelings that motivate, justice still requires an exercise of impartiality, a divestment of our own predilections (SP 4). Any judgment that fails the impartiality test would not be virtuous because it would not be conducive to truth. Godwin is not distinguishing knowledge from mere belief by specifying truth and justified belief conditions; rather, he is specifying the conditions of virtuous judgments: they intentionally or consciously aim at truth and impartiality. A preference for the general good is the dominant motivating passion when judgments are good and actions virtuous. The inclusion in the deliberation process of all relevant feelings and preferences arises from the complexity involved in identifying the general good in particular circumstances. Impartiality demands that we consider different options conscientiously; it does not preclude sometimes judging it best to benefit our friends or family.
Is the development of human intellect a means to an end or an end in itself? Is it intrinsically good? Is it the means to achieving the good of humankind or is the good of humankind the development of intellect? If the means and the end are one and the same, then, as Mark Philp (1986) argues, Godwin cannot be counted, straightforwardly at least, a utilitarian, even though the principle of utility plays a major role in delineating moral actions. If actions and practices with the greatest possible utility are those which promote the development of human intellect, universal benevolence and happiness must consist in providing the conditions for intellectual enhancement and the widest possible diffusion of knowledge. The happiest and most just society would be the one that achieved this for all.
When the capacity for private judgment has been enhanced, and improvements in knowledge and understanding have been achieved, individuals will no longer require the various forms of coercion and constraint that government and law impose on them, and which currently inhibit intellectual autonomy (SP 3). In time, Godwin speculates, the mind could be so enhanced in its capacities that it might conquer physical processes such as sleep, even death. He was mocked at the time for such speculations, but their boldness is impressive: science and medicine have greatly prolonged the average lifespan; farm equipment (as he foretold) really can plough fields without human control; and research continues into the feasibility (and desirability) of immortality.
Anticipating the arguments of John Stuart Mill, Godwin argues that truth is generated by intellectual liberty and the duty to speak candidly and sincerely in robust dialogue with others whose judgments differ from one’s own. Ultimately, a process of mutual individual and societal improvement would evolve, including changes in opinion. Godwin’s anarchistic vision of future society anticipates the removal of the barriers to intellectual equality and justice and the widest possible access to education and to knowledge.
3. Educational Implications of Godwin’s Philosophy
a. Progressive Education
Godwin’s interest in progressive education was revealed as early as July 1783 when the Morning Herald published An Account of the Seminary. This was the prospectus for a school—‘For the Instruction of 12 Pupils in the Greek, Latin, French and English Languages’—that he planned to open in Epsom, Surrey. It is unusually philosophical for a school prospectus. It asserts, for example, that when children are born their minds are tabula rasa, blank sheets susceptible to impressions; that by nature we are equal; that freedom can be achieved by changing our modes of thinking; that moral dispositions and character derive from education and from ignorance. The school’s curriculum would focus on languages and history, but the ‘book of nature’ would be preferred to human compositions. The prospectus criticizes Rousseau’s system for its inflexibility and existing schools for failing to accommodate children’s pursuits to their capacities. Small group tuition would be preferred to Rousseauian solitary tutoring. Teachers would not be fearsome: ‘There is not in the world,’ Godwin writes, ‘a truer object of pity than a child terrified at every glance, and watching with anxious uncertainty the caprices of a pedagogue’. Although the school never opened because too few pupils were recruited, the episode reveals how central education was becoming to Godwin’s political and social thinking. In the Index to the third edition of Political Justice, there are references to topics such as education’s effects on the human mind, arguments for and against a national education system, and the danger of education being a producer of fixed opinions and a tool of national government. Discussions of epistemological, psychological, and political questions with implications for education are frequent. What follows aims to synthesize Godwin’s ideas about education and to draw out some implications.
Many of Godwin’s ideas about education are undoubtedly radical, but they are not easily assimilated into the child-centred progressivism that traces its origin back to Rousseau. Godwin, like Wollstonecraft, admired Rousseau’s work, but they both took issue with aspects of the model of education described in Emile, or On Education (1762). Rousseau believed a child’s capacity for rationality should be allowed to grow stage by stage, not be forced. Godwin sees the child as a rational soul from birth. The ability to make and to grasp inferences is essential to children’s nature, and social communication is essential to their flourishing. Children need to develop, and to refine, the communication and reasoning skills that will allow them to participate in conversations, to learn, and to start contributing to society’s progressive improvement. A collision of opinions in discussions refines judgment. This rules out a solitary education of the kind Emile experiences. Whatever intellectual advancement is achieved, diversity of opinion will always be a condition of social progress, and discussion, debate, disagreement (‘conversation’) will remain necessary in education.
Unlike Rousseau, Godwin does not appear to be especially concerned with stages of development, with limits to learning or reading at particular ages. He is not as concerned as Rousseau is about the danger of children being corrupted by what they encounter. We know that his own children read widely and were encouraged to write, to think critically, to be imaginative. They listened and learned from articulate visitors such as Coleridge. Godwin’s interest in children’s reading encouraged him to start the Juvenile Library. One publication was an English dictionary, to which Godwin prefixed A New Guide to the English Tongue. He hoped to inspire children with the inclination to ‘dissect’ their words, to be clear about the primary and secondary ideas they represent. The implication is that the development of linguistic judgment is closely connected with the development of epistemic judgment, with the capacity for conveying truths accurately and persuasively. The kind of interactive dialogue that he believes to be truth-conducive would require mutual trust and respect. There would be little point in discussion, in a collision of ideas, if one could not trust the other participants to exercise the same linguistic and epistemic virtues as oneself. Judgment might be private but education for Godwin is interpersonal.
A point on which Godwin and Rousseau agree is that children are not born in sin, nor do they have a propensity to evil. Godwin is explicit in connecting their development with the intellectual ethos of their early environment, the opinions that have had an impact on them when they were young. Some of these opinions are inevitably false and harmful, especially in societies in which a powerful hierarchy intends children to grow up taking inequalities for granted. As their opinions and thinking develop through early childhood to adulthood, it is important that individuals learn to think independently and critically in order to protect themselves from false and corrupt opinions.
Godwin does not advocate the kind of manipulative tutoring to which Rousseau’s Emile is subjected; nor does he distinguish between the capacities or needs of boys and girls in the way that Rousseau does in his discussion of the education appropriate to Emile’s future wife, Sophie. According to Rousseau, a woman is formed to please a man, to be subjected to him, and therefore requires an education appropriate to that role. Mary Wollstonecraft, in Chapter 3 of A Vindication of the Rights of Woman, had similarly rejected Rousseau’s differentiation. Another difference is that, whereas Rousseau intends education to produce citizens who will contribute to an improved system of government, Godwin intends education to produce individuals with the independence of mind to contribute to a society that requires only minimal governmental or institutional superintendence.
b. Education, Epistemology, and Language
Underlying Godwin’s educational thinking are important epistemological principles. In acquiring skills of communication, understanding, reasoning, discussion, and judgment, children acquire the virtue of complete sincerity or truthfulness. Learning is understanding, not memorisation. Understanding is the percipience of truth and requires sincere conviction. One cannot be said to have learned or to know or to have understood something, and one’s conduct cannot properly be guided by it, unless one has a sincere conviction of its truth. The connection between reason and conduct is crucial. Correct conduct is accessible to reason, to conscientious judgment. When they are given reasons for acting one way rather than another, children must be open to being convinced. This suggests that pedagogy should emphasise explanation and persuasion rather than monological direct instruction. Moral education is important in regard to conduct, but, as all education prepares individuals to contribute to the general good, all education is moral education.
Godwin gives an interesting analysis of the concept of truth, especially in the second and third editions of Political Justice. Children will need to learn that private judgment cannot guarantee truth. Not only are judgments clearly fallible, but—at least by the third edition—‘truth’ for Godwin does not indicate a transcendental idea, with an existence independent of human minds or propositions. ‘True’ propositions are always tentative, correctable on the basis of further evidence. The probability of a proposition being true can only be assessed by an active process of monitoring available evidence. Although Godwin frequently refers to truth, misleadingly perhaps, as ‘omnipotent’, he can only mean that the concept provides a standard, a degree of probability that precludes reasonable doubt. This suggests that ‘conviction’ is an epistemic judgment that there is sufficient probability to warrant avowal.
The reason why Godwin tends to emphasize truth rather than knowledge may be that we cannot transmit knowledge because we cannot transmit the rational conviction that would turn a reception of a truth into the epistemic achievement of knowing. Each recipient of truths must supply their own conviction via their own private judgment. Godwin insists that we should take no opinions on trust without independent thought and conviction. Judgments need to be refreshed to ensure that what was in the general interest previously still is. When we bind ourselves to the wisdom of our ancestors, to articles of faith or outdated teachings, we are inhibiting individual improvement and the general progress of knowledge. Conviction comes with a duty to bear witness, to pass on the truth clearly and candidly in ‘conversations’. The term ‘conversation’ implies a two-way, open-ended exchange, with at least the possibility of challenge. Integrity would not permit a proposition with an insufficient degree of probability to be conveyed without some indication of its lesser epistemic status, as with conjectures or hearsay. In modern terms, appreciating the difference in the epistemic commitments implicated by different speech acts, such as assertions, confessions, and speculations, would be important to the child’s acquisition of linguistic and epistemic skills or virtues.
c. Education, Volition, and Necessitarianism
Another aspect of Godwin’s philosophy that makes children’s education in reasoning and discussion important is his account of volition and voluntary choice. If a judgment produced no volition, it could be overruled by hidden or unconscious feelings or desires, and there would be no prospect of developing self-control. Disinterested deliberation would be a delusion and moral education would be powerless. Although Godwin made concessions concerning reason’s role in the motivation of judgments and actions, and in time developed doubts about the potential for improving the human capacity for impartiality, he did not alter the central point that it is thoughts that are present to mind, cognitive states with content, that play a role in motivation. Not all thoughts are inferences. By the time passions or desires, or any kind of preference, become objects of awareness, they are ratiocinative; the intellect is necessarily involved in emotion and desire. This ensures there is a point in developing critical thinking skills, in learning to compare and balance conscientiously whatever preferences and considerations are present to mind.
Godwin admits that some people are more able than others to conquer their appetites and desires; nevertheless, he thinks all humans share a common nature and can, potentially, achieve the same level of self-control, allowing judgment to dominate. This suggests that learning self-control should be an educational priority. Young people are capable of being improved, not by any form of manipulative training, coercion, or indoctrination, but by an education that promotes independence of mind through reflective reading and discussion. He is confident that a society freed from governmental institutions and power interests would strengthen individuals’ resistance to self-love and allow them to identify their own interests with the good of all. It would be through education that they would learn what constitutes the general good and, therefore, what their duties are. Although actions are virtuous that are motivated by a passion for the general good, they still require a foundation in knowledge and understanding.
The accusation that Godwin had too optimistic a view of the human capacity for disinterested rationality and self-control was one made by contemporaries, including Thomas Malthus. In later editions of Political Justice, reason is represented as a capacity for deliberative prudence, a capacity that can be developed and refined even to the extent of exercising control over sexual desire. Malthus doubted that most people would ever be capable of the kind of prudence and self-control that Godwin anticipated. Malthus's arguments pointed towards a refusal to extend benevolence to the poor and oppressed; Godwin's pointed towards generosity and equity.
The influence on Godwin’s perfectionism of the rational Dissenters, especially Richard Price and Joseph Priestley, is most apparent in the first edition of Political Justice. He took from them, and also from David Hartley and Jonathan Edwards, the doctrine of philosophical necessity, according to which a person’s life is part of a chain of causes extending through eternity ‘and through the whole period of his existence, in consequence of which it is impossible for him to act in any instance otherwise than he has acted’ (PJ I: 385; Bk IV, vi). Thoughts, and therefore judgments, are not exceptions: they succeed each other according to necessary laws. What stops us from being mere automatons is the fact that experience creates habits of mind which compose our moral and epistemic character, the degree of perfection in our weighing of preferences in pursuit of truth. The more rational, or perfect, our wills have become, the more they subordinate other considerations to truth. But the course of our lives, including our mental deliberations, is influenced by our desires and passions and by external intrusions, including by government, so to become autonomous we need to resist distortions and diversions. Experience and active participation in candid discussion help to develop our judgment and cognitive capacities, and as this process of improvement spreads through society, the need for government intervention and coercion reduces.
In revising this account of perfectionism and necessitarianism for the second and third editions of Political Justice, Godwin attempts to keep it compatible with the more positive role he then allows desire and passion. The language shifts towards a more Humean account of causation, whereby regularity and observed concurrences are all we are entitled to use in explanations and predictions, and patterns of feeling are more completely absorbed into our intellectual character. Godwin’s shift towards empiricism and scepticism is apparent, too, in the way truth loses much of its immutability and teleological attraction. This can be viewed as a reformulation rather than a diminution of reason, at least in so far as the changes do not diminish the importance of rational autonomy. We think and act autonomously, Godwin might say, when our judgments are in accordance with our character—that is, with our individual combination of moral and epistemic virtues and vices, which we maintain or improve by conscientiously monitoring and recalibrating our opinions and preferences. Autonomy requires that we do not escape the trajectory of our character but do try to improve it.
It is important to Godwin that we can make a conceptual distinction between voluntary and involuntary actions. He would not want young people to become fatalistic as a consequence of learning about scientific determinism, and yet he did not believe people should be blamed or made to suffer for their false opinions and bad actions: the complexity in the internal and environmental determinants of character is too great for that. Wordsworth, for one, accepted the compatibility of these positions. ‘Throw aside your books of chemistry,’ Hazlitt reports him saying to a student, ‘and read Godwin on Necessity’ (Hazlitt, 2000: 280).
d. Government, the State, and Education
For Godwin, progress towards the general good is delineated by progressive improvement in education and the development of private judgment. The general good is sometimes referred to by Godwin in utilitarian terms as ‘happiness’, although he avoids the Benthamite notion of the greatest happiness of the greatest number; and there is no question of pushpin being as good as poetry. A just society is a happy society for all, not just because individual people are contented but because they are contented for a particular reason: they enjoy a society, an egalitarian democracy, that allows them to use their education and intellectual development for the general good, including the good of future generations. A proper appreciation of the aims of education will be sufficient inspiration for children to want to learn; they will not require the extrinsic motivation of rewards and sanctions.
Godwin’s critique of forms of government, in Book V of Political Justice, is linked to their respective merits or demerits in relation to education. The best form of government is the one that ‘least impedes the activity and application of intellectual powers’ (PJ: II: 5; Bk V, i). A monarchy gives power to someone whose judgment and understanding have not been developed by vulnerability to the vicissitudes of fortune. All individuals need an education that provides not only access to books and conversation but also to experience of the diversity of minds and characters. The pampered, protected education of a prince inculcates epistemic vices such as intellectual arrogance and insouciance. He is likely to be misled by flatterers and be saved from rebellion only by the servility, credulity, and ignorance of the populace. No one person, not even an enlightened and virtuous despot, can match a deliberative assembly for breadth of knowledge and experience. A truly virtuous monarch, even an elected one, would immediately abolish the constitution that brought him to power. Any monarch is in the worst possible position to choose the best people for public office or to take responsibility for errors, and yet his subjects are expected to be guided by him rather than by justice and truth.
Similar arguments apply to aristocracies, to presidential systems, to any constitution that invests power in one person or class, that divides rulers from the people, including by a difference in access to education. Heredity cannot confer virtue or wisdom; only education, leisure and prosperity can explain differences of that kind. In a just society no one would be condemned to stupidity and vice. ‘The dissolution of aristocracy is equally in the interest of the oppressor and the oppressed. The one will be delivered from the listlessness of tyranny, and the other from the brutalising operation of servitude’ (PJ II: 99; Bk V, xi).
Godwin recognises that democracy, too, has weaknesses, especially representative democracy. Uneducated people are likely to misjudge characters, be deceived by meretricious attractions or dazzled by eloquence. The solution is not epistocracy but an education for all that allows people to trust their own judgment, to find their own voice. Representative assemblies might play a temporary role, but when the people as a whole are more confident and well-informed, a direct democracy would be preferable. Secret ballots encourage timidity and inconstancy, so decisions and elections should be settled by an open vote.
The close connection between Godwin’s ideas about education and his philosophical anarchism is clear. Had he been less sceptical about government involvement in education, he might have embraced more immediately implementable education policies. His optimism derives from a belief that the less interference there is by political institutions, the more likely people are to be persuaded by arguments and evidence to prefer virtue to vice, impartial justice to self-love. It is not the “whatever is, is right” optimism of Leibniz, Pope, Bolingbroke, Mandeville, and others; clearly, things can and should be better than they are. Complacency about the status quo benefits only the ruling elites. The state restricts reason by imposing false standards and self-interested values that limit the ordinary person’s sense of his or her potential mental capacities and contribution to society. Godwin’s recognition of a systemic denial of a voice to all but an elite suggests that his notion of political and educational injustice compares with what Miranda Fricker (2007) calls epistemic injustice. Social injustice for Godwin just is epistemic injustice in that social evils derive from ignorance, systemic prejudices, and inequalities of power; and epistemic injustice, ultimately, is educational injustice.
A major benefit of the future anarchistic society will be the reduction in drudgery and toil, and the increase in leisure time. Godwin recognises that the labouring classes especially are deprived of time in which to improve their minds. He welcomes technology such as printing, which helps to spread knowledge and literacy, but abhors such features of industrialisation as factories, the division of labour that makes single-purpose machines of men, women, and children, and a commercial system that keeps the masses in poverty and makes a few opulently wealthy. Increased leisure and longevity create time for education and help to build the stock of educated and enlightened thinkers. Social and cultural improvement results from this accretion. Freed from governmental interference, education will benefit from a free press and increased exposure to a diversity of opinion. Godwin expresses ‘the belief that once freed from the bonds of outmoded ideas and educational practices, there was no limit to human abilities, to what men could do and achieve’ (Simon, 1960: 50). It is a mistake, Godwin writes towards the end of Political Justice, to assume that inequality in the distribution of what conduces to the well-being of all, education included, is recognised only by the ‘lower orders’. The beneficiaries of educational inequality, once brought to an appreciation of what constitutes justice, will inevitably initiate change. The diffusion of education will be initiated by an educated elite, but local discussion and reading groups will play a role: the educated and the less educated bearing witness to their own knowledge, passing it on and learning from frank conversation.
Unlike Paine and Wollstonecraft, Godwin does not advocate a planned national or state system of mass education. Neither the state nor the church could be trusted to develop curricula and pedagogical styles that educate children in an unbiased way. He is wary of the possibility of a mass education system levelling down, of reducing children to a “naked and savage equality” that suits the interests of the ruling elite. Nor could we trust state-accredited teachers to be unbiased or to model open-mindedness and explorative discussion. He puts his faith, rather, in the practices of a just community, one in which a moral duty to educate all children is enacted without restraint. Presumably, each community would evolve its own practices and make progressive improvements. The education of its children, and of adults, would find a place within the community’s exploration of how to thrive without government regulation and coercion. Paine wanted governmental involvement in a mass literacy movement, and Wollstonecraft wanted a system of coeducational schools for younger children, but Godwin sees a danger in any proposal that systematizes education.
Godwin’s vision of society does not allow him to specify in any detail a particular curriculum. Again, to do so would come too close to institutionalising education, inhibiting local democratic choice and diversity. He does, however, advocate epistemic practices which have pedagogical implications. Children should be taught to venerate truth, to enquire, to present reasons for belief, to reject as prejudice beliefs unsupported by evidence, to examine objections. ‘Refer them to reading, to conversation, to meditation; but teach them neither creeds nor catechisms, neither moral nor political’ (PJ II: 300; Bk VI, viii). In The Enquirer he writes: ‘It is probable that there is no one thing that it is of eminent importance for a child to learn. The true object of juvenile education, is to provide, against the age of five and twenty, a mind well regulated, active, and prepared to learn’ (1797: 77-78).
In the essay ‘Of Public and Private Education’, Godwin considers the advantages and disadvantages of education by private tutor rather than by public schooling. He concludes by wondering whether there might be a middle way: ‘Perhaps an adventurous and undaunted philosophy would lead to the rejecting them altogether, and pursuing the investigation of a mode totally dissimilar’ (1797: 64). His criticisms of both are reinforced in his novel Mandeville, in which the main character is educated privately by an evangelical minister, and then sent, unhappily, to Winchester College; he experiences both modes as an imposition on his liberty and natural dispositions. Certainly, Godwin’s ideas rule out traditional schools, with set timetables and curricula, with authoritarian teachers, ‘the worst of slaves’, whose only mode of teaching is direct instruction, and deferential pupils who ‘learn their lessons after the manner of parrots’ (1797: 81). The first task of a teacher, Godwin suggests in the essay ‘Of the Communication of Knowledge’, is to provide pupils with an intrinsic motive to learn—that is, with ‘a perception of the value of the thing learned’ (1797: 78). This is easiest if the teacher follows the pupil’s interests and facilitates his or her enquiries. The teacher’s task then is to smooth the pupil’s path, to be a consultant and a participant in discussions and debates, modelling the epistemic and linguistic virtues required for learning with and from each other. The pupil and the ‘preceptor’ will be co-learners and the forerunners of individuals who, in successive generations, will develop increasingly wise and comprehensive views.
In Godwin’s view, there will never be a need for a national system of pay or accreditation, but there will be a need, in the short-term, for leadership by a bourgeois educated elite. It is interesting to compare this view with Coleridge’s idea of a ‘clerisy’, a permanent national intellectual elite, most fully developed by Coleridge in On the Constitution of the Church and State (1830). The term ‘clerisy’ refers to a state-sponsored group of intellectual and learned individuals who would diffuse indispensable knowledge to the nation, whose role would be to humanize, cultivate, and unify. Where Godwin anticipates an erosion of differences of rank and an equitable education for all, Coleridge wants education for the labouring classes to be limited, prudentially, to religion and civility, with a more extensive liberal education for the higher classes. The clerisy is a secular clergy, holding the balance between agricultural and landed interests on the one hand, and mercantile and professional interests on the other. Sages and scholars in the frontline of the physical and moral sciences would serve also as the instructors of a larger group whose role would be to disseminate knowledge and culture to every ‘parish’. Coleridge discussed the idea with Godwin, but very little in it could appeal to a philosopher who anticipated a withering away of the national state; nor could Godwin have agreed with the idea of a permanent intellectual class accredited and paid by the state, or with the idea of a society that depended for its unity on a permanently maintained intelligentsia. Coleridge’s idea put limits on the learning of the majority and denied them the freedom, and the capacity, to pursue their own enquiries and opinions—as did the national education system that developed in Britain in the nineteenth and twentieth centuries.
Godwin’s educational ideas have had little direct impact. They were not as well-known as those of Rousseau to later progressivist educational theorists and practitioners. He had, perhaps, an over-intellectualised conception of children’s development, and too utopian a vision of the kind of society in which his educational ideas could flourish. Nevertheless, it is interesting that his emphasis on autonomous thinking and critical discussion, on equality and justice in the distribution of knowledge and understanding, and his awareness of how powerful interests and dominant ideologies are insinuated through education, are among the key themes of modern educational discourse. The way in which his ideas about education are completely integral to his anarchist political philosophy is one reason why he deserves attention from philosophers of education, as well as from political theorists.
4. Godwin’s Philosophical Anarchism
a. Introduction
Godwin was the first to argue for anarchism from first principles. The examination of his ideas about education has introduced important aspects of his anarchism, including the preference for local community-based practices, rather than any national systems or institutions. His anarchism is both individualistic and socially oriented. He believes that the development of private judgment enables an improved access to truth, and truth enables progression towards a just society. Monarchical and aristocratic modes of government, together with any form of authority based on social rank or religion, are inconsistent with the development of private judgment. Godwin’s libertarianism in respect of freedom of thought and expression deserves recognition, but his commitment to sincerity and candour, to speech that presumes to assert as true only what is epistemically sound, means that not all speech is epistemically responsible. Nor is all listening responsible: free speech, like persuasive argument, requires a fair-minded and tolerant reception. To prepare individuals and society for the responsible exercise of freedom of thought and expression is a task for education.
Godwin was a philosophical anarchist. He did not specify ways in which like-minded people should organise or build a mass movement. Even in the 1790s, when the enthusiasm for the French Revolution was at its height, he was cautious about precipitating unrest. With regard to the practical politics of his day, he was a liberal Whig, never a revolutionary. But the final two Books of Political Justice take Godwin’s anarchism forward with arguments concerning crime and punishment (Book VII) and property (Book VIII). It is here that some of his most striking ideas are to be found, and where he engages with practical policy issues as well as with philosophical principles.
b. Punishment
Godwin sees punishment as inhumane and cruel. In keeping with his necessitarianism, he cannot accept that criminals make a genuinely free choice to commit a crime: ‘the assassin cannot help the murder he commits any more than the dagger’ (PJ II: 324; Bk VII, i). Human beings are not born into sin, but neither are they born virtuous. Crime is caused environmentally, by social circumstances, by ignorance, inequality, oppression. When the wealthy acknowledge this, they will recognise that if their circumstances and those of the poor were reversed, so, too, would be their crimes. Therefore, Godwin rejects the notions of desert and retributive justice. Only the future benefit that might result from punishment matters, and he finds no evidence that suffering is ever beneficial. Laws, like all prescriptions and prohibitions, condemn the mind to imbecility, alienating it from truth, inviting insincerity when obedience is coerced. Laws, and all the legal and penal apparatus of states, weaken us morally and intellectually by causing us to defer to authority and to ignore our responsibilities.
Godwin considers various potential justifications of punishment. It cannot be justified by the future deterrent effect on the same offender, for a mere suspicion of criminal conduct would justify it. It cannot be justified by its reformative effect, for patient persuasion would be more genuinely effective. It cannot be justified by its deterrent effect on non-offenders, for then the greatest possible suffering would be justified because that would have the greatest deterrent effect. Any argument from proportionality is absurd: how could a proportionate penalty be determined when there are so many variables of motivation, intention, provocation, and harm done? Laws and penal sentences are too inflexible to produce justice. Prisons are seminaries of vice, and hard labour, like slavery of any kind, is evil. Only for the purposes of temporary restraint should people ever be deprived of their liberty. A radical alternative to punishment is required.
The development of individuals’ capacities for reason and judgment will be accompanied by a gradual emancipation from law and punishment. The community will apply its new spirit of independence to advance the general good. Simpler, more humane and just practices will emerge. The development of private judgment will enable finer distinctions, and better understanding, to move society towards genuine justice. When people trust themselves and their communities to shoulder responsibility as individuals, they will learn to be ‘as perspicacious in distinguishing, as they are now indiscriminate in confounding, the merit of actions and characters’ (PJ II: 412; Bk VI, viii).
c. Property
Property, Godwin argues, is responsible for oppression, servility, fraud, malice, revenge, fear, selfishness, and suspicion. The abolition—or, at least, transformation—of property will be a key achievement of a just society. If I have a superfluity of loaves and one loaf would save a starving neighbour’s life, to whom does that loaf justly belong? Equity is determined by benefit or utility: ‘Every man has a right to that, the exclusive possession of which being awarded to him, a greater sum of benefit or pleasure will result, than could have arisen from its being otherwise appropriated’ (PJ II: 423; Bk VIII, i).
It is not just a question of subsistence, but of all indispensable means of improvement and happiness. It includes the distribution of education, skills, and knowledge. The poor are kept in ignorance while the rich are honoured and rewarded for being acquisitive, dissipated, and indolent. Leisure would be more evenly distributed if the rich man’s superfluities were removed, and this would allow more time for intellectual improvement. Godwin’s response to the objection that a superfluity of property generates excellence—culture, industry, employment, decoration, arts—is that all these would increase if leisure and intellectual cultivation were evenly distributed. Free from oppression and drudgery, people would discover new pleasures and capacities. They would see the benefit of their own exertions to the general good, ‘and all will be animated by the example of all’ (PJ II: 488; Bk VIII, iv).
Godwin addresses another objection to his egalitarianism in relation to property: the impossibility of its being rendered permanent. We might see equality as desirable but lack the capacity to sustain it; human nature will always reassert itself. Godwin’s response is that equality can be sustained if the members of the community are sufficiently convinced that it is just and that it generates happiness. Only the current ‘infestation of mind’ could watch inequality dissolve and happiness increase, and still be willing to sacrifice those gains. In time people will grow less vulnerable to greed, flattery, fame, and power, and more attracted to simplicity, frugality, and truth.
But if we choose to receive no more than our just share, why should we impose this restriction on others? Why should we impose on their moral independence? Godwin replies that moral error needs to be censured frankly and contested by argument and persuasion, but we should govern ourselves ‘through no medium but that of inclination and conviction’ (PJ II: 497; Bk VIII, vi). If a conflict between the principle of equality and the principle of independent judgment appears, priority should go to the latter. The proper way to respect other people’s independence of mind is to engage them in discussion and seek to persuade them. Conversation remains, for Godwin, the most fertile source of improvement. If people trust their own opinions and resist all challenges to them, they are serving the community, because the worst possible state of affairs would be a clockwork uniformity of opinion. This is why education should not seek to cast the minds of children in a particular mould.
In a society built on anarchist principles, property will no longer provide an excuse for the exploitation of other people’s time and labour; but it will still exist to the extent that each person retains items required for their welfare and day-to-day subsistence. They should not be selfish or jealous of them. If two people dispute an item, Godwin writes, let justice, not law, decide between them. All will serve on temporary juries for resolving disputes or agreeing on practices, and all will have the capacity to do so without fear or favour.
d. Response to Malthus
The final objection to his egalitarian strictures on property is addressed in the chapter of Political Justice entitled ‘Of the objection to this system from the principle of population’ (Book VIII: vii). The objection raises the possibility that an egalitarian world might become too populous to sustain human life. Godwin argues that if this were to threaten human existence, people would develop the strength of mind to overcome the urge to propagate. Combined with the banishment of disease and increased longevity—even perhaps the achievement of immortality—the nature of the world’s population would change. Long life, extended education, a progressive improvement in concentration, a reduced need for sleep, and other advances, would result in a rapid increase in wisdom and benevolence. People would find ways to keep the world’s population at a sustainable level.
This chapter, together with the essay ‘Of Avarice and Profusion’ (The Enquirer, 1797), contributed to Thomas Malthus’s decision to write An Essay on the Principle of Population, first published in 1798. Malthus argued that Godwin was too optimistic about social progress. They met and discussed the question amicably, and a response was included in Godwin’s Reply to Dr Parr, but his major response, Of Population, was not published until 1820, by which time Malthus’s Essay was into its fifth, greatly expanded, edition. Godwin argues against Malthus’s geometrical ratio for population increase and his arithmetical ratio for the increase in food production, drawing where possible on international census figures. He looks to mechanisation, to the untapped resources of the sea, to an increase in crop cultivation rather than meat production, and to chemistry’s potential for producing new foodstuffs. With regard to sexual passions, he repeats his opinion from Political Justice that men and women are capable of immense powers of restraint, and with regard to the Poor Laws, which Malthus wished to abolish, he argued that they were better for the poor than no support at all. Where Malthus argued for lower wages for the poor, Godwin argued for higher pay, to redistribute wealth and stimulate the economy.
When Malthus read Of Population, he rather sourly called it ‘the poorest and most old-womanish performance to have fallen from a writer of note’. The work shows that Godwin remained positive about the capacity of humankind to overcome misery and to achieve individual and social improvement. He knew that if Malthus was right, hopes for radical social progress, and even short-term relief for the poor and oppressed, were futile.
5. Godwin’s Fiction
a. Caleb Williams (1794)
Godwin wrote three minor novels before he wrote Political Justice. They had some success, but nothing like that of the two novels he completed in the 1790s. Caleb Williams and St. Leon were not only the most successful and intellectually ambitious of his novels but were also the two that relate most closely to his philosophical work of the 1790s. He wrote two more novels that were well received: Fleetwood, or The New Man of Feeling (1805) and Mandeville, a Tale of the Seventeenth Century in England (1817). His final two novels, Cloudesley (1830) and Deloraine (1833), were more romantic and less successful.
Things As They Are; or The Adventures of Caleb Williams is both a study of individual psychology and a continuation, or popularization, of Godwin’s critical analysis of English society in Political Justice. It explores how aristocracy insinuates authority and deference throughout society. One of the two main characters, Falkland, is a wealthy philanthropist whose tragic flaw is a desire to maintain at any cost his reputation as an honourable and benevolent gentleman. The other, Caleb, is his bright, self-educated servant with insatiable curiosity. Caleb admires Falkland, but he begins to suspect that it was his master who murdered the uncouth and boorish neighbouring squire, Barnabas Tyrrel. When the opportunity arises for him to search the contents of a mysterious chest in Falkland’s library, Caleb cannot resist. He is discovered by Falkland and learns the truth from him. Not only was Falkland the murderer, but he had allowed innocent people to die for the crime. He is driven to protect his reputation and honour at any cost. Caleb is chased across the country, and around Europe, by Falkland’s agents. He is resourceful and courageous in eluding them, but Falkland’s power and resources are able to wear him down and bring him to court, where Falkland and Caleb face each other. They are both emotionally, psychologically, and physically exhausted. In different ways, both have been persecuted and corrupted by the other, and yet theirs is almost a love story. The trial establishes the facts as far as they interest the law, but it is not the whole truth: not, from a moral perspective, in terms of true guilt and innocence, and not from a psychological perspective.
Caleb’s adventures during his pursuit across Britain and Europe allow us to see different aspects of human character and psyche, and of the state of society. Caleb recounts his adventures himself, allowing the reader to infer the degree to which he is reliably understanding and confessing his own moral and psychological decline. He espouses principles of sincerity and candour, but his narrative shows the difficulty of being truly honest with oneself. The emotional and mental effects of his persecution are amplified by his growing paranoia.
The novel was recognised as an attack on values and institutions embedded in English society, such as religion, law, prisons, inequality, social class, the abuse of power, and aristocratic notions of honour. One of the more didactic passages occurs when Caleb is visited in prison by Thomas, a fellow servant. Thomas looks at the conditions in which Caleb is kept—shackled and without even straw for a bed—and exclaims, ‘Zounds, I have been choused. They told me what a fine thing it was to be an Englishman, and about liberty and property, and all that there; and I find it is all flam’ (2009: 195). In another episode, Caleb encounters a group of bandits. Their leader, Raymond, justifies their activities to Caleb: ‘We undertake to counteract the partiality and iniquity of public institutions. We, who are thieves without a licence, are at open war with another set of men, who are thieves according to law… we act, not by choice, but only as our wise governors force us to act’ (2009: 209).
It is also a story of communication failure, of mutual distrust and resentment that could have been resolved by conversation. Caleb’s curiosity made him investigate the chest for himself, rather than openly confront Falkland with his suspicions. Both men have failed to exercise their private judgment independently of the values and expectations of their social situation. By the end of the novel, any hope of resolution has evaporated: a frank and rational discussion at the right time could have achieved it. It was, at least in part, the social environment—social inequality—that created their individual characters and the communication barrier.
As well as themes from Political Justice, there are echoes of the persecution and surveillance of British radicals at the time of writing and of the false values, as Godwin saw them, of Burke’s arguments in favour of tradition and aristocracy, of ‘things as they are’. It is not surprising that the novel was especially praised by readers with radical views. In his character sketch of Godwin (in The Spirit of the Age), Hazlitt wrote that ‘no one ever began Caleb Williams that did not read it through: no one that ever read it could possibly forget it, or speak of it after any length of time but with an impression as if the events and feelings had been personal to himself’ (Hazlitt, 2000: 288).
b. St. Leon: A Tale of the Sixteenth Century (1799)
Despite its historical setting, St. Leon is as concerned as Caleb Williams is with the condition of contemporary society and with themes from Political Justice. Gary Kelly (1976) has coupled St. Leon with Caleb Williams as an English Jacobin novel (together with works by Elizabeth Inchbald, Robert Bage, and Thomas Holcroft), and Pamela Clemit (1993) classes them as Rational or Godwinian novels (together with works by Mary Shelley and the American novelist Charles Brockden Brown). They are certainly philosophical novels. St. Leon is also an historical novel in that its setting in sixteenth century Europe is accurately depicted, and it is a Gothic novel in that it contains mystery, horror, arcane secrets, and dark dungeons. B. J. Tysdahl (1981) refers to its ‘recalcitrant Gothicism’. When Lord Byron asked why he did not write another novel, Godwin replied that it would kill him. ‘And what matter,’ Byron replied, ‘we should have another St. Leon’.
The central character and narrator, St. Leon, is as imbued with the values of his own country, class, and period as Falkland. At the start of the novel, he is a young French nobleman in thrall to chivalric values and anxious to create a great reputation as a knight. A high point of his youth is his attendance at the Field of the Cloth of Gold in 1520, when Francis I of France and Henry VIII of England met in awe-inspiring splendour, as if to mark the end of medievalism. A low point is when the French are defeated at the Battle of Pavia. St. Leon’s education had prepared him for a chivalric way of life; its passing leaves him unprepared for a world with more commercial values. His hopes of aristocratic glory are finally destroyed by an addiction to gambling. He loses his wealth and the respect of his son, Charles, and might have lost everything had he been married to a less extraordinary woman. Marguerite sees their financial ruin as a blessing in disguise, and for a period the family enjoys domestic contentment in a humble setting in Switzerland.
This changes when St. Leon encounters a stranger who has selected him to succeed to the possession of arcane knowledge. The alchemical secrets he is gifted—the philosopher’s stone and the elixir of life—restore his wealth and give him immortality. He seizes the opportunity to make amends to his family and to society by becoming the world’s benefactor. But the gift turns out to be a curse. His wife dies, his philanthropic schemes fail, and he becomes an outcast, mistrusted and alienated forever. Generations pass; St. Leon persists but sees himself as a monster undeserving of life. Only by unburdening himself of the alchemical knowledge, as the stranger had done, could he free himself to die. Otherwise, he must live forever a life of deceit and disguise. As the narrator, he cannot provide clues even to the recipients of his narration, in whatever age we might live. We pity him but we cannot entirely trust him. Even as a narrator he is suspected. As in Caleb Williams, the impossibility of candour and truthfulness is shown to be corrupting, and as in Mary Shelley’s Frankenstein, unique knowledge and a unique form of life are shown to bring desolation in the absence of affection, trust, and communication.
We can interpret St. Leon as a renewal of Godwin’s critique of Burke and of the British mixture of tradition and commercialism. We can see in Marguerite a tribute to Mary Wollstonecraft. Is there also, as Gary Kelly suggests (1976: 210), a parallel between the radical philosophers of the late eighteenth century—polymaths like Joseph Priestley and Richard Price, perhaps, or Godwin and Wollstonecraft themselves—and the alchemical adept whose knowledge and intentions society suspects and is unprepared for? Writing St. Leon so shortly after the death of Wollstonecraft, when he was enduring satirical attacks, Godwin must have felt himself in danger of becoming isolated and insufficiently appreciated. We can see the novel as pessimistic, reflecting Godwin’s doubts about the potential for radical change in his lifetime. But Godwin well knew that alchemy paved the way for chemical science, so perhaps the message is more optimistic: what seems like wishful thinking today will lead us to tomorrow’s accepted wisdom.
6. Conclusion
Godwin died on the cusp of the Victorian age, having played a part in the transition from the Enlightenment to Romanticism. His influence persisted as Political Justice reached a new, working-class readership through quotation in Owenite and Chartist pamphlets and a cheap edition published in 1842, and his ideas were discussed at labour movement meetings. His novels influenced Dickens, Poe, Hawthorne, Balzac, and others. According to Marshall (1984: 392), Marx knew of Godwin through Engels, but disagreed with his individualism and about which social class would be the agent of reform. Of the great anarchist thinkers who came after him, Bakunin does not refer to him; Tolstoy does, but may not have read him directly; Kropotkin, however, hailed him as the first to define the principles of anarchism.
Godwin’s political philosophy can appear utopian, and his view of the potential for human improvement naively optimistic, but his ideas still have resonance and relevance. As a moral philosopher, he has not received sufficient credit for his version of utilitarian principles, contemporaneous with Bentham’s, a version that anticipates John Stuart Mill’s. He was both intellectually courageous in sticking to his fundamental principles, and conscientious in admitting to errors. Unlike Malthus, he believed the conditions of the poor and oppressed could and should be improved. He was confident that an egalitarian democracy free of government interference would allow individuals to thrive. One of his most important contributions to social and political theory is his analysis of how educational injustice is a primary source of social injustice. The journey to political justice begins and ends with educational justice.
7. References and Further Reading
a. Works by William Godwin
i. Early Editions of An Enquiry Concerning Political Justice
1793. An Enquiry Concerning Political Justice, and Its Influence on General Virtue and Happiness. First edition. 2 vols. London: G.G. and J. Robinson.
1796. An Enquiry Concerning Political Justice, and Its Influence on General Virtue and Happiness. Second edition. 2 vols. London: G.G. and J. Robinson.
1798. An Enquiry Concerning Political Justice, and Its Influence on General Virtue and Happiness. Third edition. 2 vols. London: G.G. and J. Robinson.
ii. Other Editions of An Enquiry Concerning Political Justice
1946. An Enquiry Concerning Political Justice. F. E. L. Priestley (ed.). 3 vols. Toronto: University of Toronto Press.
This is a facsimile of the third edition. Volume 3 contains variants from the first and second editions.
2013. An Enquiry Concerning Political Justice. Mark Philp (ed.). Oxford World’s Classics. Oxford: Oxford University Press.
This is based on the text of the 1793 first edition. In addition to an introduction by Mark Philp, it includes a chronology of Godwin’s life and explanatory notes.
2015. Enquiry Concerning Political Justice: And Its Influence on Morals and Happiness. Isaac Kramnick (ed.). London: Penguin.
This is based on the text of the 1798 third edition. It includes the Summary of Principles, with an introduction and editor’s notes by Isaac Kramnick.
iii. Collected Editions of Godwin’s Works and Correspondence
1992. Collected Novels and Memoirs of William Godwin. 8 vols. Mark Philp (ed.). London: Pickering and Chatto Publishers, Ltd.
A scholarly series that includes Memoirs of the Author of a Vindication of the Rights of Woman as well as the text of all Godwin’s fiction and some unpublished pieces.
1993. Political and Philosophical Writings of William Godwin. 7 vols. Mark Philp (ed.). London: Pickering and Chatto Publishers Ltd.
A scholarly edition of Godwin’s principal political and philosophical works, including some previously unpublished pieces. Volume 1 includes a complete bibliography of Godwin’s works and political essays. Volume 2 contains the remaining political essays. Volume 3 contains the text of the first edition of Political Justice; volume 4 contains variants from the second and third editions. Volumes 5 and 6 contain educational and literary works, including The Enquirer essays. Volume 7 includes Godwin’s final (unfinished) work, published posthumously: The Genius of Christianity Unveiled.
2011, 2014. The Letters of William Godwin. Volume 1: 1778–1797, Volume 2: 1798–1805. Pamela Clemit (ed.). Oxford: Oxford University Press.
A projected six-volume series.
iv. First Editions of Other Works by Godwin
1783. An Account of the Seminary That Will Be Opened on Monday the Fourth Day of August at Epsom in Surrey. London: T. Cadell.
1784. The Herald of Literature, as a Review of the Most Considerable Publications That Will Be Made in the Course of the Ensuing Winter. London: J. Murray.
1794a. Cursory Strictures on the Charge Delivered by Lord Chief Justice Eyre to the Grand Jury. London: D. I. Eaton.
1794b. Things As They Are; or The Adventures of Caleb Williams. 3 vols. London: B. Crosby.
1797. The Enquirer: Reflections on Education, Manners and Literature. London: G.G. and J. Robinson.
1798. Memoirs of the Author of a Vindication of the Rights of Woman. London: J. Johnson.
1799. St. Leon, A Tale of the Sixteenth Century. 4 vols. London: G.G. and J. Robinson.
1801. Thoughts Occasioned by the Perusal of Dr. Parr’s Spital Sermon, Preached at Christ Church, April 15, 1800: Being a Reply to the Attacks of Dr. Parr, Mr. Mackintosh, the Author of an Essay on Population, and Others. London: G.G. and J. Robinson.
1805. Fleetwood; or, The New Man of Feeling. 3 vols. London: R. Phillips.
1817. Mandeville, a Tale of the Seventeenth Century in England. 3 vols. London: Longman, Hurst, Rees, Orme and Brown.
1820. Of Population. An Enquiry Concerning the Power of Increase in the Numbers of Mankind, Being an Answer to Mr. Malthus’s Essay on That Subject. London: Longman, Hurst, Rees, Orme and Brown.
1824. History of the Commonwealth of England from Its Commencement to Its Restoration. 4 vols. London: H. Colburn.
1831. Thoughts on Man, His Nature, Productions, and Discoveries. Interspersed with Some Particulars Respecting the Author. London: Effingham Wilson.
v. Online Resources
2010. The Diary of William Godwin. Victoria Myers, David O’Shaughnessy, and Mark Philp (eds.). Oxford: Oxford Digital Library. http://godwindiary.bodleian.ox.ac.uk/index2.html.
Godwin kept a diary from 1788 to 1836. It is held by the Bodleian Library, University of Oxford, as part of the Abinger Collection. In succinct notes, Godwin recorded meetings, topics of conversation, and his reading and writing.
vi. Other Editions of Selected Works by Godwin
1986. Romantic Rationalist: A William Godwin Reader. Peter Marshall (ed.). London: Freedom Press.
Contains selections from Godwin’s works, arranged by theme.
1988. Caleb Williams. Maurice Hindle (ed.). London: Penguin Books.
1994. St. Leon. Pamela Clemit (ed.). Oxford World’s Classics. Oxford: Oxford University Press.
2005. Godwin on Wollstonecraft: Memoirs of the Author of a Vindication of the Rights of Woman. Richard Holmes (ed.). London: Harper Perennial.
2009. Caleb Williams. Pamela Clemit (ed.). Oxford World’s Classics. Oxford: Oxford University Press.
2019. Mandeville: A Tale of the Seventeenth Century in England. Miami, Fl: Hard Press Books.
b. Biographies of Godwin
Brailsford, H N. 1951. Shelley, Godwin and Their Circle. Second edition. Home University Library of Modern Knowledge. Oxford: Oxford University Press.
Brown, Ford K. 1926. The Life of William Godwin. London: J. M. Dent and Sons.
Clemit, Pamela (ed.). 1999. Godwin. Lives of the Great Romantics III: Godwin, Wollstonecraft and Mary Shelley by their Contemporaries. Volume 1. London: Pickering and Chatto.
Goulbourne, Russell, Higgins, David (eds.). 2017. Jean-Jacques Rousseau and British Romanticism: Gender and Selfhood, Politics and Nation. London: Bloomsbury.
Hazlitt, William. 2000. ‘William Godwin’ in The Fight and Other Writings. Tom Paulin (ed.). London: Penguin.
Locke, Don. 1980. A Fantasy of Reason: The Life and Thought of William Godwin. London: Routledge and Kegan Paul.
This is described as a ‘philosophical biography’.
Marshall, Peter. 1984. William Godwin. New Haven: Yale University Press.
A new edition is entitled William Godwin: Philosopher, Novelist, Revolutionary. PM Press, 2017. The text appears the same. A standard biography.
Paul, Charles Kegan. 1876. William Godwin: His Friends and Contemporaries. 2 vols. London: H. S. King.
An early and thorough biography, with important manuscript material.
St Clair, William. 1989. The Godwins and the Shelleys: The Biography of a Family. London: Faber and Faber.
Thomas, Richard Gough. 2019. William Godwin: A Political Life. London: Pluto Press.
Woodcock, George. 1946. William Godwin: A Biographical Study. London: Porcupine Press.
c. Social and Historical Background
Butler, Marilyn. 1984. Burke, Paine, Godwin and the Revolution Controversy. Cambridge: Cambridge University Press.
Grayling, A. C. 2007. Towards the Light: The Story of the Struggles for Liberty and Rights. London: Bloomsbury.
Hay, Daisy. 2022. Dinner with Joseph Johnson: Books and Friendship in a Revolutionary Age. London: Chatto and Windus.
A study of the regular dinners held by the radical publisher, whose guests included Godwin, Wollstonecraft, Fuseli, Blake, and many other writers, artists, and radicals.
Hewitt, Rachel. 2017. A Revolution in Feeling: The Decade that Forged the Modern Mind. London: Granta.
Norman, Jesse. 2013. Edmund Burke: Philosopher, Politician, Prophet. London: William Collins.
Philp, Mark. 2020. Radical Conduct: Politics, Sociability and Equality in London 1789–1815. Cambridge UK: Cambridge University Press.
A study of the radical intellectual culture of the period and of Godwin’s position within it.
Simon, Brian. 1960. Studies in the History of Education, 1780–1870. London: Lawrence and Wishart.
Tomalin, Claire. 1974. The Life and Death of Mary Wollstonecraft. London: Weidenfeld and Nicolson.
Uglow, Jenny. 2014. In These Times: Living in Britain Through Napoleon’s Wars 1798–1815. London: Faber and Faber.
d. Other Secondary Sources in Philosophy, Education, Fiction, and Anarchism
Bottoms, Jane. 2004. ‘“Awakening the Mind”: The Educational Philosophy of William Godwin’. History of Education 33 (3): 267–82.
Claeys, Gregory. 1983. ‘The Concept of “Political Justice” in Godwin’s Political Justice.’ Political Theory 11 (4): 565–84.
Clark, John P. 1977. The Philosophical Anarchism of William Godwin. Princeton: Princeton University Press.
Clemit, Pamela. 1993. The Godwinian Novel. Oxford: Clarendon Press.
Crowder, George. 1991. Classical Anarchism: The Political Thought of Godwin, Proudhon, Bakunin and Kropotkin. Oxford: Oxford University Press.
Eltzbacher, Paul. 1960. Anarchism: Seven Exponents of the Anarchist Philosophy. London: Freedom Press.
Fleisher, David. 1951. William Godwin: A Study of Liberalism. London: Allen and Unwin.
Fricker, Miranda. 2007. Epistemic Injustice: Power and the Ethics of Knowing. Oxford: Oxford University Press.
Kelly, Gary. 1976. The English Jacobin Novel 1780–1805. Oxford: Clarendon Press.
Knights, B. 1978. The Idea of the Clerisy in the Nineteenth Century. Cambridge UK: Cambridge University Press.
Lamb, Robert. 2009. ‘Was William Godwin a Utilitarian?’ Journal of the History of Ideas 70 (1): 119–41.
Maniquis, Robert, Myers, Victoria (eds.). 2011. Godwinian Moments: From Enlightenment to Romanticism. Toronto: University of Toronto/Clark Library UCLA.
Marshall, Peter. 2010. Demanding the Impossible: A History of Anarchism. Oakland, CA: PM Press.
Mee, Jon. 2011. ‘The Use of Conversation: William Godwin’s Conversable World and Romantic Sociability’. Studies in Romanticism 50 (4): 567–90.
Monro, D.H. 1953. Godwin’s Moral Philosophy. Oxford: Oxford University Press.
O’Brien, Eliza, Stark, Helen, Turner, Beatrice (eds.). 2021. New Approaches to William Godwin: Forms, Fears, Futures. London: Palgrave Macmillan.
Philp, Mark. 1986. Godwin’s Political Justice. London: Duckworth.
A detailed analysis of Godwin’s major philosophical work.
Pollin, Burton R. 1962. Education and Enlightenment in the Works of William Godwin. New York: Las Americas Publishing Company.
Still the most thorough study of Godwin’s educational thought.
Scrivener, Michael. 1978. ‘Godwin’s Philosophy Re-evaluated’. Journal of the History of Ideas 39: 615–26.
Simon, Brian, (ed). 1972. The Radical Tradition in Education in Great Britain. London: Lawrence and Wishart.
Singer, Peter, Leslie Cannold, and Helga Kuhse. 1995. ‘William Godwin and the Defence of Impartialist Ethics’. Utilitas 7 (1): 67–86.
Suissa, Judith. 2010. Anarchism and Education: A Philosophical Perspective. Second edition. Oakland, CA: PM Press.
Tysdahl, B J. 1981. William Godwin as Novelist. London: Athlone Press.
Weston, Rowland. 2002. ‘Passion and the “Puritan Temper”: Godwin’s Critique of Enlightened Modernity’. Studies in Romanticism 41 (3): 445–70.
Weston, Rowland. 2013. ‘Radical Enlightenment and Antimodernism: The Apostasy of William Godwin (1756–1836)’. Journal for the Study of Radicalism. 7 (2): 1–30.
Author Information
Graham Nutbrown
Email: gn291@bath.ac.uk
University of Bath
United Kingdom
Noam Chomsky (1928 – )
Noam Chomsky is an American linguist who has had a profound impact on philosophy. Chomsky’s linguistic work has been motivated by the observation that nearly all adult human beings have the ability to effortlessly produce and understand a potentially infinite number of sentences. For instance, it is very likely that before now you have never encountered this very sentence you are reading, yet if you are a native English speaker, you easily understand it. While this ability often goes unnoticed, it is a remarkable fact that every developmentally normal person gains this kind of competence in their first few years, no matter their background or general intellectual ability. Chomsky’s explanation of these facts is that language is an innate and universal human property, a species-wide trait that develops as one matures in much the same manner as the organs of the body. A language is, according to Chomsky, a state obtained by a specific mental computational system that develops naturally and whose exact parameters are set by the linguistic environment that the individual is exposed to as a child. This definition, which is at odds with the common notion of a language as a public system of verbal signals shared by a group of speakers, has important implications for the nature of the mind.
Over decades of active research, Chomsky’s model of the human language faculty—the part of the mind responsible for the acquisition and use of language—has evolved from a complex system of rules for generating sentences to a more computationally elegant system that consists essentially of just constrained recursion (the ability of a function to apply itself repeatedly to its own output). What has remained constant is the view of language as a mental system that is based on a genetic endowment universal to all humans, an outlook that implies that all natural languages, from Latin to Kalaallisut, are variations on a Universal Grammar, differing only in relatively unimportant surface details. Chomsky’s research program has been revolutionary but contentious, and critics include prominent philosophers as well as linguists who argue that Chomsky discounts the diversity displayed by human languages.
Chomsky is also well known as a champion of liberal political causes and as a trenchant critic of United States foreign policy. However, this article focuses on the philosophical implications of his work on language. After a biographical sketch, it discusses Chomsky’s conception of linguistic science, which often departs sharply from other widespread ideas in this field. It then gives a thumbnail summary of the evolution of Chomsky’s research program, especially the points of interest to philosophers. This is followed by a discussion of some of Chomsky’s key ideas on the nature of language, language acquisition, and meaning. Finally, there is a section covering his influence on the philosophy of mind.
1. Biography
Avram Noam Chomsky was born in Philadelphia in 1928 to Jewish parents who had immigrated from Russia and Ukraine. He manifested an early interest in politics and, from his teenage years, frequented anarchist bookstores and political circles in New York City. Chomsky attended the University of Pennsylvania at the age of 16, but he initially found his studies unstimulating. After meeting the mathematical linguist Zellig Harris through political connections, Chomsky developed an interest in language, taking graduate courses with Harris and, on his advice, studying philosophy with Nelson Goodman. Chomsky’s 1951 undergraduate honors thesis, on Modern Hebrew, would form the basis of his MA thesis, also from the University of Pennsylvania. Although Chomsky would later have intellectual fallings out with both Harris and Goodman, they were major influences on him, particularly in their rigorous approach, informed by mathematics and logic, which would become a prominent feature of his own work.
After earning his MA, Chomsky spent the next four years with the Society of Fellows at Harvard, where he had applied largely because of his interest in the work of W.V.O. Quine, a Harvard professor and major figure in analytic philosophy. This would later prove to be somewhat ironic, as Chomsky’s work developed into the antithesis of Quine’s behaviorist approach to language and mind. In 1955, Chomsky was awarded his doctorate and became an assistant professor at the Massachusetts Institute of Technology, where he would continue to work as an emeritus professor even after his retirement in 2002. Throughout this long tenure at MIT, Chomsky produced an enormous volume of work in linguistics, beginning with the 1957 publication of Syntactic Structures. Although his work initially met with indifference or even hostility, including from his former mentors, it gradually altered the very nature of the field, and Chomsky grew to be widely recognized as one of the most important figures in the history of language science. Since 2017, he has been a laureate professor in the linguistics department at the University of Arizona.
Throughout his career, Chomsky has been at least as prolific in social, economic, and political criticism as in linguistics. Chomsky became publicly outspoken about his political views with the escalation of the Vietnam War, which he always referred to as an “invasion”. He was heavily involved in the anti-war movement, sometimes risking both his professional and personal security, and was arrested several times. He remained politically active and, among many other causes, was a vocal critic of US interventions in Latin America during the 1980s, the reaction to the September 2001 attacks, and the invasion of Iraq. Chomsky has opposed, since his early youth, the capitalist economic model and supported the Occupy movement of the early 2010s. He has also been an unwavering advocate of intellectual freedom and freedom of speech, a position that has at times pitted him against other left-leaning intellectuals and caused him to defend the rights of others who have very different views from his own. Despite the speculations of many biographers, Chomsky has always denied any connection between his work in language and politics, sometimes quipping that someone was allowed to have more than one interest.
In 1949, Chomsky married the linguist Carol Doris Chomsky (née Schatz), a childhood friend from Philadelphia. They had three children and remained married until her death in 2008. In 2014, he married Valeria Wasserman, a Brazilian professional translator.
2. Philosophy of Linguistics
Chomsky’s approach to linguistic science, indeed his entire vision of what the subject matter of the discipline consists of, is a sharp departure from the attitudes prevalent in the mid-20th century. To simplify, prior to Chomsky, language was studied as a type of communicative behavior, an approach that is still widespread among those who do not accept Chomsky’s ideas. In contrast, his focus is on language as a type of (often unconscious) knowledge. The study of language has, as Chomsky states, three aspects: determining what the system of knowledge a language user has consists of, how that knowledge is acquired, and how that knowledge is used. A number of points in Chomsky’s approach are of interest to the philosophy of linguistics and to the philosophy of science more generally, and some of these points are discussed below.
a. Behaviorism and Linguistics
When Chomsky was first entering academia in the 1950s, the mainstream school of linguistics for several decades had been what is known as structuralism. The structuralist approach, endorsed by Chomsky’s mentor Zellig Harris, among others, concentrated on analyzing corpora, or records of the actual use of a language, either spoken or written. The goal of the analysis was to identify patterns in the data that might be studied to yield, among other things, the grammatical rules of the language in question. Reflecting this focus on language as it is used, structuralists viewed language as a social phenomenon, a communicative tool shared by groups of speakers. Structuralist linguistics might well be described as consisting of the study of what happens between a speaker’s mouth and a listener’s ear; as one well-known structuralist put it, “the linguist deals only with the speech signal” (Bloomfield, 1933: 32). This is in marked contrast to Chomsky and his followers, who concentrate on what is going on in the mind of a speaker and who look there to identify grammatical rules.
Structuralist linguistics was itself symptomatic of behaviorism, a paradigm prominently championed in psychology by B.F. Skinner and in philosophy by W.V.O. Quine and which was dominant in the midcentury. Behaviorism held that science should restrict itself to observable phenomena. In psychology, this meant seeking explanations entirely in terms of external behavior without discussing minds, which are, by their very nature, unobservable. Language was to be studied in terms of subjects’ responses to stimuli and their resulting verbal output. Behaviorist theories were often formed on the basis of laboratory experiments in which animals were conditioned by being given food rewards or tortured with electric shock in order to shape their behavior. It was thought that human behavior could be similarly explained in terms of conditioning that shapes reactions to specific stimuli. This approach perhaps reached its zenith with the publication of Skinner’s Verbal Behavior (1957), which sought to reduce human language to conditioned responses. According to Skinner, speakers are conditioned as children, through training by adults, to respond to stimuli with an appropriate verbal response. For example, a child might realize that if they see a piece of candy (the stimulus) and respond by saying “candy”, they might be rewarded by adults with the desired sweet, reinforcing that particular response. For an adult speaker, the pattern of stimuli and response could be very complex, and what specific aspect of a situation is being responded to might be difficult to ascertain, but the underlying principle was held to be the same.
Chomsky’s scathing 1959 review of Verbal Behavior has actually become far better known than the original book. Although Chomsky conceded to Skinner that the only data available for the study of language consisted of what people say, he denied that meaningful explanations were to be found at that level. He argued that in order to explain a complex behavior, such as language use, exhibited by a complex organism such as a human being, it is necessary to inquire into the internal organization of the organism and how it processes information. In other words, it was necessary to make inferences about the language user’s mind. Elsewhere, Chomsky likened the procedure of studying language to what engineers would do if confronted with a hypothetical “black box”, a mysterious machine whose input and output were available for inspection but whose internal functioning was hidden. Merely detecting patterns in the output would not be accepted as real understanding; instead, that would come from inferring what internal processes might be at work.
Chomsky particularly criticized Skinner’s theory that utterances could be classified as responses to subtle properties of an object or event. The observation that human languages seem to exhibit stimulus-freedom goes back at least to Descartes in the 17th century, and at about the same time as Chomsky was reviewing Skinner, the linguist Charles Hockett (later one of Chomsky’s most determined critics) suggested that this is one of the features that distinguish human languages from most examples of animal communication. For instance, a vervet monkey will give a distinct alarm call any time she spots an eagle and at no other times. In contrast, a human being might say anything or nothing in response to any given stimulus. Viewing a painting, one might say, “Dutch…clashes with the wallpaper…. Tilted, hanging too low, beautiful, hideous, remember our camping trip last summer? or whatever else might come to our minds when looking at a picture” (Chomsky, 1959: 2). What aspect of an object, event, or environment triggers a particular response rather than another can only be explained in mental terms. The most relevant fact is what the speaker is thinking about, so a true explanation must take internal psychology into account.
Chomsky’s observation concerning speech was part of his more general criticism of the behaviorist approach. Chomsky held that attempts to explain behavior in terms of stimuli and responses “will be in general a vain pursuit. In all but the most elementary cases, what a person does depends in large measure on what he knows, believes, and anticipates” (Chomsky, 2006: xv). This was also meant to apply to the behaviorist and empiricist philosophy exemplified by Quine. Although Quine has remained important in other aspects of analytic philosophy, such as logic and ontology, his behaviorism is largely forgotten. Chomsky is widely regarded as having inaugurated the era of cognitive science as it is practiced today, that is, as a study of the mental.
b. The Galilean Method
Chomsky’s fundamental approach to doing science was and remains different from that of many other linguists, not only in his concentration on mentalistic explanation. One approach to studying any phenomenon, including language, is to amass a large amount of data, look for patterns, and then formulate theories to explain those patterns. This method, which might seem like the obvious approach to doing any type of science, was favored by structuralist linguists, who valued the study of extensive catalogs of actual speech in the world’s languages. The goal of the structuralists was to provide descriptions of a language at various levels, starting with the analysis of pronunciation and eventually building up to a grammar for the language that would be an adequate description of the regularities identifiable in the data.
In contrast, Chomsky’s method is to concentrate not on a comprehensive analysis but rather on certain crucial data, or data that is better explained by his theory than by its rivals. This sort of methodology is often called “Galilean”, since it takes as its model the work of Galileo and Newton. These physicists, judiciously, did not attempt to identify the laws of motion by recording and studying the trajectory of as many moving objects as possible. In the normal course of events, the exact paths traced by objects in motion are the results of the complex interactions of numerous phenomena such as air resistance, surface friction, human interference, and so on. As a result, it is difficult to clearly isolate the phenomena of interest. Instead, the early physicists concentrated on certain key cases, such as the behavior of masses in free fall or even idealized fictions such as objects gliding over frictionless planes, in order to identify the principles that, in turn, could explain the wider data. For similar reasons, Chomsky doubts that the study of actual speech—what he calls performance—will yield theoretically important insights. In a widely cited passage (Chomsky, 1962: 531), he noted that:
Actual discourse consists of interrupted fragments, false starts, lapses, slurring, and other phenomena that can only be understood as distortions of an underlying idealized pattern.
Like the ordinary movements of objects observable in nature, which Galileo largely ignored, actual speech performance is likely to be the product of a mass of interacting factors, such as the social conventions governing the speech exchange, the urgency of the message and the time available, the psychological states of the speakers (excited, panicked, drunk), and so on, of which purely linguistic phenomena will form only a small part. It is the idealized patterns concealed by these effects and the mental system that generates those patterns—the underlying competence possessed by language users—that Chomsky regards as the proper subject of linguistic study. (Although the terms competence and performance have been superseded by the I-Language/E-Language distinction, discussed in 4.c. below, these labels are fairly entrenched and still widely used.)
c. The Nature of the Evidence
Early in his career (1965), Chomsky specified three levels of adequacy that a theory of language should satisfy, and this has remained a feature of his work. The first level is observational, to determine what sentences are grammatically acceptable in a language. The second is descriptive, to provide an account of what the speaker of the language knows, and the third is explanatory, to give an explanation of how such knowledge is acquired. Only the observational level can be attained by studying what speakers actually say, which cannot provide much insight into what they know about language, much less how they came to have that knowledge. A source of information about the second and third levels, perhaps surprisingly, is what speakers do not say, and this has been a focus of Chomsky’s program. This negative data is drawn from the judgments of native speakers about what they feel they can’t say in their language. This is not, of course, in the sense of being unable to produce these strings of words or of being unable, with a little effort, to understand the intended message, but simply a gut feeling that “you can’t say that”. Chomsky himself calls these interpretable but unsayable sentences “perfectly fine thoughts”, while the philosopher Georges Rey gave them the pithier name “WhyNots”. Consider the following examples from Rey (2022) (the “*” symbol is used by linguists to mark a string that is ill-formed in that it violates some principle of grammar):
(1) * Who did John and kiss Mary? (Compared to John and who kissed Mary? and who-initial questions like Who did John kiss?)
(2) * Who did stories about terrify Mary? (Compared to stories about who terrified Mary?)
Or the following question/answer pairs:
(3) Which cheese did you recommend without tasting it? * I recommended the brie without tasting it. (Compared to… without tasting it.)
(4) Have you any wool? * Yes, I have any wool.
An introductory linguistics textbook provides two further examples (O’Grady et al. 2005):
(5) * I went to movie. (Compared to I went to school.)
(6) *Mary ate a cookie, and then Johnnie ate some cake, too. (Compared to Mary ate a cookie, and then Johnnie ate a cookie too/ate a snack too.)
The vast majority of English speakers would balk at these sentences, although they would generally find it difficult to say precisely what the issue is (the textbook challenges the reader to try to explain). Analogous “WhyNot” sentences exist in every language yet studied.
What Chomsky holds to be significant about this fact is that almost no one, aside from those who are well read in linguistics or philosophy of language, has ever been exposed to (1)–(6) or any sentences like them. Analysis of corpora shows that sentences constructed along these lines virtually never occur, even in the speech of young children. This makes it very difficult to accept the explanation, favored by behaviorists, that we recognize them to be unacceptable as the result of training and conditioning. Since children do not produce utterances like (1)–(6), parents never have a chance to explain what is wrong, to correct them, and to tell them that such sentences are not part of English. Further, since they are almost never spoken by anyone, it is vanishingly unlikely that a parent and child would overhear them so that the parent could point them out as ill-formed. Neither is this knowledge learned through formal instruction in school. Instruction in not saying sentences like (1)–(6) is not a part of any curriculum, and an English speaker who has never attended a day of school is as capable of recognizing the unacceptability of (1)–(6) as any college graduate.
Examples can be multiplied far beyond (1)–(6); there are indefinitely many strings of English words (or words of any language) that are comprehensible but unacceptable. If speakers are not trained to recognize them as ill-formed, how do they acquire this knowledge? Chomsky argues that this demonstrates that human beings possess an underlying competence capable of forming and identifying grammatical structures—words, phrases, clauses, and sentences—in a way that operates almost entirely outside of conscious awareness, computing over structural features of language that are not actually pronounced or written down but which are critical to the production and understanding of sentences. This competence and its acquisition are the proper subject matter for linguistic science, as Chomsky defines the field.
d. Linguistic Structures
An important part of Chomsky’s linguistic theory (although it is an idea that predates him by several decades and is also endorsed by some rival theories) is that it postulates structures that lie below the surface of language. The presence of such structures is supported by, among other evidence, considering cases of non-linear dependency between the words in a sentence, that is, cases where a word modifies another word that is some distance away in the linear order of the sentence as it is pronounced. For instance, in the sentence (from Berwick and Chomsky, 2017: 117):
(7) Instinctively, birds who fly swim.
we know that instinctively applies to swim rather than fly, indicating an unspoken connection that bypasses the three intervening words and which the language faculty of our mind somehow detects when parsing the sentence. Chomsky’s hypothesis of a dedicated language faculty—a part of the mind existing for the sole purpose of forming and interpreting linguistic structures, operating in isolation from other mental systems—is supported by the fact that nonlinguistic knowledge does not seem to be relied on to arrive at the correct interpretation of sentences such as (7). Try replacing swim with play chess. Although you know that birds instinctively fly and do not play chess, your language faculty provides the intended meaning without any difficulty. Chomsky’s theory would suggest that this is because that faculty parses the underlying structure of the sentence rather than relying on your knowledge about birds.
According to Chomsky, the dependence of human languages on these structures can also be observed in the way that certain types of sentences are produced from more basic ones. He frequently discusses the formation of questions from declarative sentences. For instance, any English speaker understands that the question form of (8) is (9), and not (10) (Chomsky, 1986: 45):
(8) The man who is here is tall.
(9) Is the man who is here tall?
(10) * Is the man who here is tall?
What rule does a child learning English have to grasp to know this? To a Martian linguist unfamiliar with the way that human languages work, a reasonable initial guess might be to move the fourth word of the sentence to the front, which is obviously incorrect. To see this, change (8) to:
(11) The man who was here yesterday was tall.
A more sophisticated hypothesis might be to move the second auxiliary verb in the sentence, is in the case of (8), to the front. But this is also not correct, as more complicated cases show:
(12) The woman who is in charge of deciding who is hired is ready to see him now.
(13) * Is the woman who is in charge of deciding who hired is ready to see him now?
In fact, in no human language do transformations from one type of sentence to another require taking the linear order of words into account, although there is no obvious reason why they shouldn’t. A language that works on a principle such as switch the first and second words of a sentence to indicate a question is certainly imaginable and would seem simple to learn, but no language yet cataloged operates in such a way.
The correct rule in the cases of (8) through (13) is that the question is formed by moving the auxiliary verb (is) occurring in the verb phrase of the main clause of the sentence, not the one in the relative clause (a clause modifying a noun, such as who is here). Thus, knowing that (9) is the correct question form of (8) or that (13) is wrong requires sensitivity to the way that the elements of a sentence are grouped together into phrases and clauses. This is something that is not apparent on the surface of either the spoken or written forms of (8) or (12), yet a speaker with no formal instruction grasps it without difficulty. It is the study of these underlying structures and the way that the mind processes them that is the core concern of Chomskyan linguistics, rather than the analysis of the strings of words actually articulated by speakers.
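The contrast between a linear rule and a structure-dependent one can be made concrete with a small sketch. The bracketing, function names, and flat word lists below are hypothetical simplifications for illustration only, not any formalism Chomsky actually uses:

```python
# Toy contrast: a linear question-formation rule vs. a structure-dependent one.
# All representations here are hypothetical simplifications.

sentence = ["the", "man", "who", "is", "here", "is", "tall"]

def fourth_word_rule(words):
    """The Martian's first guess: front the fourth word (a linear rule)."""
    return [words[3]] + words[:3] + words[4:]

# A crude constituent structure: the relative clause is grouped inside the
# subject noun phrase, as in [NP the man [who is here]] [VP is tall].
subject = ["the", "man", ["who", "is", "here"]]
verb_phrase = ["is", "tall"]

def main_clause_rule(np, vp):
    """Front the auxiliary of the MAIN verb phrase (structure-dependent)."""
    flat_np = [w for part in np
               for w in (part if isinstance(part, list) else [part])]
    return [vp[0]] + flat_np + vp[1:]

def as_question(words):
    return " ".join(words).capitalize() + "?"

print(as_question(fourth_word_rule(sentence)))              # the ill-formed (10)
print(as_question(main_clause_rule(subject, verb_phrase)))  # the well-formed (9)
```

Applied to sentence (8), the linear rule happens to move an auxiliary, but it moves the wrong one, yielding (10); the structure-dependent rule, which ignores linear position and looks only at the main verb phrase, yields (9).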
3. The Development of Chomsky’s Linguistic Theory
Chomsky’s research program, which has grown to involve the work of many other linguists, is closely associated with generative linguistics. This name refers to the project of identifying sets of rules—grammars—that will generate all and only the sentences of a language. Although explicit rules eventually drop out of the picture, replaced by more abstract “principles”, the goal remains to identify a system that can produce the potentially infinite number of sentences of a human language using the resources contained in the mind of a speaker, which are necessarily finite.
Chomsky’s work has implications for the study of language as a whole, but his concentration has been on syntax. This branch of linguistic science is concerned with the grammars that govern the production of sentences that are acceptable in a language and divide them from unacceptable strings of words, as opposed to semantics, the part of linguistics concerned with the meaning of words and sentences, and pragmatics, which studies the use of language in context.
Although the methodological principles have remained constant from the start, Chomsky’s theory has undergone major changes over the years, and various iterations may seem, at least on a first look, to have little obvious common ground. Critics present this as evidence that the program has been stumbling down one dead end after another, while Chomsky asserts in response that rapid evolution is characteristic of new fields of study and that changes in a program’s guiding theory are evidence of healthy intellectual progress. Five major stages of development might be identified, corresponding to the subsections below. Each stage, its proponents maintain, builds on the previous ones; superseded iterations should be regarded not as false but as having been replaced by a more complete explanation.
a. Logical Constructivism
Chomsky’s theory of language began to be codified in the 1950s, first set down in a massive manuscript that was later published as The Logical Structure of Linguistic Theory (1975) and then partially in the much shorter and more widely read Syntactic Structures (1957). These books differed significantly from later iterations of Chomsky’s work in that they were more of an attempt to show what an adequate theory of natural language would need to look like than to fully work out such a theory. The focus was on demonstrating how a small set of rules could operate over a finite vocabulary to generate an infinite number of sentences, as opposed to identifying a psychologically realistic account of the processes actually occurring in the mind of a speaker.
Even before Chomsky, since at least the 1930s, the structure of a sentence was thought to consist of a series of phrases, such as noun phrases or verb phrases. In Chomsky’s early theory, two sorts of rules governed the generation of such structures. Basic structures were given by rewrite rules, procedures that indicate the more basic constituents of structural components. For example,
S → NP VP
indicates that a noun phrase, NP, followed directly by a verb phrase, VP, constitute a sentence, S. “NP → N” indicates that a noun may constitute a noun phrase. Eventually, the application of these rewrite rules stops when every constituent of a structure has been replaced by a syntactic element, a lexical word such as Albert or meows. Transformation rules alter those basic structures in various ways to produce structures corresponding to complex sentences. Importantly, certain transformation rules allowed recursion. This is a concept central to computer science and mathematical logic, by which a rule could be applied to its own output an unlimited number of times (for instance, in mathematics, one can start with 0 and apply the recursive function add 1 repeatedly to yield the natural numbers 0, 1, 2, 3, and so forth). The presence of recursive rules allows the embedding of structures within other structures, such as placing Albert meows under Leisa thinks to get Leisa thinks Albert meows. This could then be placed under Casey says that to produce Casey says that Leisa thinks Albert meows, and so on. Embedding could be done as many times as desired, so that recursive rules could produce sentences of any length and complexity, an important requirement for a theory of natural language. Recursion has not only remained central to subsequent iterations of Chomsky’s work but, more recently, has come to be seen as the defining characteristic of human languages.
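The recursive application of rewrite rules described above can be sketched as a tiny program. The toy grammar and three-word lexicon here are illustrative inventions, not a grammar from Chomsky’s work:

```python
# A minimal sketch of rewrite rules with recursion, in the spirit of
# "S -> NP VP". The grammar and lexicon are illustrative only.

import random

rules = {
    "S":  [["NP", "VP"]],
    "NP": [["N"]],
    "VP": [["V"], ["V", "S"]],   # the option "V S" reintroduces S: recursion
    "N":  [["Albert"], ["Leisa"], ["Casey"]],
    "V":  [["meows"], ["thinks"], ["says"]],
}

def rewrite(symbol):
    """Expand a symbol by applying rewrite rules until only lexical words
    (symbols with no rule of their own) remain."""
    if symbol not in rules:              # a lexical word: stop rewriting
        return [symbol]
    expansion = random.choice(rules[symbol])
    return [word for part in expansion for word in rewrite(part)]

print(" ".join(rewrite("S")))  # one possible output: "Albert meows"
```

Because the rule VP → V S feeds S back into the derivation, the same five rules can yield Albert meows, Leisa thinks Albert meows, Casey says Leisa thinks Albert meows, and so on without bound, which is the point of the recursion requirement.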
Chomsky’s interest in rules that could be represented as operations over symbols reflected influence from philosophers inclined towards formal methods, such as Goodman and Quine. This is a central feature of Chomsky’s work to the present day, even though subsequent developments have also taken psychological realism into account. Some of Chomsky’s most impactful research from his early career (late 1950s and early 1960s) was the invention of formal language theory, a branch of mathematics dealing with languages consisting of an alphabet of symbols from which strings could be formed in accordance with a formal grammar, a set of specific rules. The Chomsky Hierarchy provides a method of classifying formal languages according to the complexity of the strings that could be generated by the language’s grammar (Chomsky, 1956). Chomsky was able to demonstrate that natural human languages could not be produced by the lowest level of grammar on the hierarchy, contrary to many linguistic theories popular at the time. Formal language theory and the Chomsky Hierarchy have continued to have applications both in linguistics and elsewhere, particularly in computer science.
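The kind of pattern at issue can be schematized by the formal language aⁿbⁿ (some number of a’s followed by exactly as many b’s): no grammar at the lowest, finite-state level of the hierarchy can generate it, because recognizing it requires keeping track of an unbounded count, while the two context-free rewrite rules S → a S b and S → ε suffice. A recognizer that mirrors those two rules is only a few lines (an illustrative sketch, not code from any source):

```python
def accepts_anbn(s):
    """Recognize the language a^n b^n by mirroring the context-free
    grammar  S -> a S b | (empty):  peel one 'a' off the front and one
    'b' off the back, then recurse on what is left."""
    if s == "":
        return True                    # S -> (empty)
    if s[0] == "a" and s[-1] == "b":
        return accepts_anbn(s[1:-1])   # S -> a S b
    return False

print(accepts_anbn("aaabbb"))  # True:  three a's matched by three b's
print(accepts_anbn("aabbb"))   # False: the counts do not match
```

The natural-language analogue is unbounded nested dependency, as in center-embedded constructions like if... then... patterns nested inside one another, which is why Chomsky argued human languages exceed finite-state power.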
b. The Standard Model
Chomsky’s 1965 landmark work, Aspects of the Theory of Syntax, which devoted much space to philosophical foundations, introduced what later became known as the “Standard Model”. While the theory itself was in many respects an extension of the ideas contained in Syntactic Structures, there was a shift in explanatory goals as Chomsky addressed what he calls “Plato’s Problem”, the mystery of how children can learn something as complex as the grammar of a natural language from the sparse evidence they are presented with. The sentences of a human language are infinite in number, and no child ever hears more than a tiny subset of them, yet they master the grammar that allows them to produce every sentence in their language. (“Plato’s Problem” is an allusion to Plato’s Meno, a discussion of similar puzzles surrounding geometry. Section 4.b provides a fuller discussion of the issue as well as more recent developments in Chomsky’s model of language acquisition.) This led Chomsky, inspired by early modern rationalist philosophers such as Descartes and Leibniz, to postulate innate mechanisms that would guide a child in this process. Every human child was held to be born with a mental system for language acquisition, operating largely subconsciously, preprogrammed to recognize the underlying structure of incoming linguistic signals, identify possible grammars that could generate those structures, and then to select the simplest such grammar. It was never fully worked out how, on this model, possible grammars were to be compared, and this early picture has subsequently been modified, but the idea of language acquisition as relying on innate knowledge remains at the heart of Chomsky’s work.
An important idea introduced in Aspects was the existence of two levels of linguistic structure: deep structure and surface structure. A deep structure contains structural information necessary for interpreting sentence meaning. Transformations on a deep structure —moving, deleting, and adding elements in accordance with the grammar of a language —yield a surface structure that determines the way that the sentence is pronounced. Chomsky explained (in a 1968 lecture) that,
If this approach is correct in general, then a person who knows a specific language has control of a grammar that generates the infinite set of potential deep structures, maps them onto associated surface structures, and determines the semantic and phonetic interpretations of these abstract objects (Chomsky, 2006: 46).
Note that, for Chomsky, the deep structure was a grammatical object that contains structural information related to meaning. This is very different from conceiving of a deep structure as a meaning itself, although a theory to that effect, generative semantics, was developed by some of Chomsky’s colleagues (initiating a debate acrimonious enough to sometimes be referred to as “the linguistics wars”). The names and exact roles of the two levels would evolve over time, and they were finally dropped altogether in the 1990s (although this is not always noticed, a matter that sometimes confuses the discussion of Chomsky’s theories).
Aspects was also notable for the introduction of the competence/performance distinction, or the distinction between the underlying mental systems that give a speaker mastery of her language (competence) and her actual use of the language (performance), which will seldom fully reflect that mastery. Although these terms have technically been superseded by E-language and I-language (see 4.c), they remain useful concepts in understanding Chomsky’s ideas, and the vocabulary is still frequently used.
c. The Extended Standard Model
Throughout the 1970s, a number of technical changes, aimed at simplification and consolidation, were made to the Standard Model set out in Aspects. These gradually led to what became known as the “Extended Standard Model”. The grammars of the Standard Model contained dozens of highly specific transformation rules that successively rearranged elements of a deep structure to produce a surface structure. Eventually, a much simpler and more empirically adequate theory was arrived at by postulating only a single operation that moved any element of a structure to any place in that structure. This operation, move α, was subject to many “constraints” that limited its applications and therefore restrained what could be generated. For instance, under certain conditions, parts of a structure form “islands” that block movement (as when who is blocked from moving from the conjunction in John and who had lunch? to give *Who did John and have lunch?). Importantly, the constraints seemed to be highly consistent across human languages.
Grammars were also simplified by cutting out information that seemed to be specified in the vocabulary of a language. For example, some verbs must be followed by nouns, while others must not. Compare I like coffee and She slept to * I like and * She slept a book. Knowing which of these strings are correct is part of knowing the words like and slept, and it seems that a speaker’s mind contains a sort of lexicon, or dictionary, that encodes this type of information for each word she knows. There is no need for a rule in the grammar to state that some verbs need an object and others do not, which would just be repeating information already in the lexicon. The properties of the lexical items are therefore said to “project” onto the grammar, constraining and shaping the structures available in a language. Projection remains a key aspect of the theory, so that lexicon and grammar are thought to be tightly integrated.
Chomsky has frequently described a language as a mapping from meaning to sound. Around the time of the Extended Standard Model, he introduced a schema whereby grammar forms a bridge between the Phonetic Form, or PF, the form of a sentence that would actually be pronounced, and the Logical Form, or LF, which contained the structural specification of a sentence necessary to determine meaning. To consider an example beloved by introductory logic teachers, Everyone loves someone might mean that each person loves some person (possibly a different person in each case), or it might mean that there is some one person that everyone loves. Although these two sentences have identical PFs, they have different LFs.
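The two readings correspond to different relative scopes of the quantifiers, which standard first-order notation makes explicit (this notation is used here only for illustration; an LF in Chomsky’s sense is a syntactic object, not a logical formula):

```latex
% Reading 1: each person loves some (possibly different) person
\forall x\, \exists y\; \mathit{Loves}(x, y)

% Reading 2: there is one person whom everyone loves
\exists y\, \forall x\; \mathit{Loves}(x, y)
```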
Linking the idea of LF and PF to that of deep structure and surface structure (now called D-structure and S-structure, and with somewhat altered roles) gives the “T-model” of language:
D-structure
|
transformations
|
PF – S-structure – LF
As the diagram indicates, the grammar generates the D-structure, which contains the basic structural relations of the sentence. The D-structure undergoes transformations to arrive at the S-structure, which differs from the PF in that it still contains unpronounced “traces” in places previously occupied by an element that was then moved elsewhere. The S-structure is then interpreted two ways: phonetically as the PF and semantically as the LF. The PF is passed from the language system to the cognitive system responsible for producing actual speech. The LF, which is not a meaning itself but contains structural information needed for semantic interpretation, is passed to the cognitive system responsible for semantics. This idea of syntactic structures and transformations over those structures as mediating between meaning and physical expression has been further developed and simplified, but the basic concept remains an important part of Chomsky’s theories.
d. Principles and Parameters
In the 1980s, the Extended Standard Model would develop into what is perhaps the best-known iteration of Chomskyan linguistics, what was first referred to as “Government and Binding”, after Chomsky’s book Lectures on Government and Binding (1981). Chomsky developed these ideas further in Barriers (1986), and the theory took on the more intuitive name “Principles and Parameters”. The fundamental idea was quite simple. As with previous versions, human beings have in their minds a computational system that generates the syntactic structures linking meanings to sounds. According to Principles and Parameters Theory, all of these systems share certain fixed settings (principles) for their core components, explaining the deep commonalities that Chomsky and his followers see between human languages. Other elements (parameters) are flexible and have values that are set during the language learning process, reflecting the variations observable across different languages. An analogy can be made with an early computer of the sort that was programmed by setting the position of switches on a control panel: the core, unchanging, circuitry of the computer is analogous to principles, the switches to parameters, and the program created by one of the possible arrangements of the switches to a language such as English, Japanese, or St’at’imcets (although this simple picture captures the essence of early Principles and Parameters, the details are a great deal more complicated, especially considering subsequent developments).
Principles are the core aspects of language, including the dependence on underlying structure and lexical projection, features that the theory predicts will be shared by all natural human languages. Parameters are aspects with binary settings that vary from language to language. Among the most widely discussed parameters, which might serve as convenient illustrations, are the Head and Pro-Drop parameters.
A head is the key element that gives a phrase its name, such as the noun in a noun phrase. The rest of the phrase is the complement. It can be observed that in English, the head comes before the complement, as in the noun phrase medicine for cats, where the noun medicine is before the complement for cats; in the verb phrase passed her the tea, the verb passed is first, and in the prepositional phrase in his pocket, the preposition in is first. But consider the following Japanese sentence (Cook and Newson, 1996: 14):
(14) E wa kabe ni kakatte imasu
picture [subject marker] wall on hanging is
“The picture is hanging on the wall.”
Notice that the head of the verb phrase, the verb kakatte imasu, is after its complement, kabe ni, and ni (on) is a postposition that occurs after its complement, kabe. English and Japanese thus represent different settings of a parameter, the Head, or Head Directionality, Parameter. Although this and other parameters are set during a child’s development by the language they hear around them, it seems that very little exposure is needed to fix the correct value. It is taken as evidence of this that mistakes with head positioning are vanishingly rare; English-speaking children almost never make mistakes like *The picture the wall on is at any point in their development.
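The economy of the idea is that one binary setting fixes head–complement order across every phrase type at once. A toy sketch (the function and its name are hypothetical illustrations, not real linguistic machinery):

```python
# Toy sketch of the Head (Directionality) Parameter: a single binary
# setting determines head-complement order for all phrase types.
# Hypothetical illustration only.

def linearize(head, complement, head_first):
    """Order a head and its complement according to the parameter value."""
    return f"{head} {complement}" if head_first else f"{complement} {head}"

# English is head-first:
print(linearize("in", "his pocket", head_first=True))           # in his pocket
print(linearize("passed", "her the tea", head_first=True))      # passed her the tea

# Japanese is head-last:
print(linearize("ni", "kabe", head_first=False))                # kabe ni
print(linearize("kakatte imasu", "kabe ni", head_first=False))  # kabe ni kakatte imasu
```

A child acquiring English or Japanese, on this picture, does not learn a separate ordering rule for noun phrases, verb phrases, and prepositional phrases; fixing the one parameter value covers them all.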
The Pro-Drop Parameter explains the fact that certain languages can leave the pronoun subjects of a sentence implied, or up to context. For instance, in Italian, a pro-drop language, the following sentences are permitted (Cook and Newson, 1996: 55).
(15) Sono il tricheco
be (1st-person-present) the walrus
“I am the walrus.”
(16) E’ pericoloso sporger-si
be (3rd-person-present) dangerous lean out- (reflexive)
“It is dangerous to lean out.” [a warning posted on trains]
On the other hand, the direct English translations * Am the walrus and * Is dangerous to lean out are ungrammatical, reflecting a different parameter setting, “non-prodrop”, which requires an explicit subject for sentences.
A number of other, often complex, differences beyond whether subjects must be included in all sentences were thought to come from the settings of Pro-Drop and the way it interacts with other parameters. For example, it has been observed that many pro-drop languages allow the normal order of subjects and verbs to be inverted; Cade la notte is acceptable in Italian, unlike its direct translation in English, *Falls the night. However, this feature is not universal among pro-drop languages, and it was theorized that whether it is present or not depends on the settings of other parameters.
Examples such as these reflect the general theme of Principles and Parameters, in which “rules” of the sort that had been postulated in Chomsky’s previous work are no longer needed. Instead of syntactical rules present in a speaker’s mental language faculty, the particular grammar of a language was hypothesized to be the result of the complex interaction of principles, the setting of parameters, and the projection properties of lexical items. As a relatively simple example, there is no need for an English-speaking child to learn a bundle of related rules such as noun first in a noun phrase, verb first in a verb phrase, and so on, or for a Japanese-speaking child to learn the opposite rules for each type of phrase; all of this is covered by the setting of the Head Parameter. As Chomsky (1995: 388) puts it,
A language is not, then, a system of rules but a set of specifications for parameters in an invariant system of principles of Universal Grammar. Languages have no rules at all in anything like the traditional sense.
This outlook represents an important shift in approach, which is often not fully appreciated by philosophers and other non-specialists. Many scholars assume that Chomsky and his followers still regard languages as particular sets of rules internally represented by speakers, as opposed to principles that are realized without being explicitly represented in the brain.
This outlook led many linguists, especially during the last two decades of the 20th century, to hope that the resemblances and differences between individual languages could be neatly explained by parameter settings. Language learning also seemed much less puzzling, since it was now thought to be a matter, not of learning complex sets of rules and constraints, but rather of setting each parameter, of which there were at one time believed to be about twenty, to the correct value for the local language, a process that has been compared to the children’s game of “twenty questions”. It was even speculated that a table could be established where languages could be arranged by their parameter settings, in analogy to the periodic table on which elements could be placed and their chemical properties predicted by their atomic structures.
Unfortunately, as the program developed, things did not prove so simple. Researchers failed to reach a consensus on what parameters there are, what values they can take, and how they interact, and there seemed to be vastly more of them than initially believed. Additionally, parameters often failed to have the explanatory power they were envisioned as having. For example, as discussed above, it was originally claimed that the Pro-Drop parameter explained a large number of differences between languages with opposite settings for that parameter. However, these predictions were made on the basis of an analysis of several related European languages and were not fully borne out when checked against a wider sample. Many linguists now see the parameters themselves as emerging from the interactions of “microparameters” that explain the differences between closely related dialects of the same language and which are often found in the properties of individual words projecting onto the syntax. There is ongoing debate as to the explanatory value of parameters as they were originally conceived.
During the Principles and Parameters era, Chomsky sharpened the notions of competence and performance into the dichotomy of I-languages and E-languages. The former is a state of the language system in the mind of an individual speaker, while the latter, which corresponds to the common notion of a language, is a publicly shared system such as “English”, “French”, or “Swahili”. Chomsky was sharply critical of the study of E-languages, deriding them as poorly defined entities that play no role in the serious study of linguistics, a controversial attitude, as E-languages are what many linguists regard as precisely the subject matter of their discipline. This remains an important point in his work and will be discussed more fully in 4.c. below.
e. The Minimalist Program
From the beginning, critics have argued that the rule systems Chomsky postulated were too complex to be plausibly grasped by a child learning a language, even if important parts of this knowledge were innate. Initially, the replacement of rules by a limited number of parameters in the Principles and Parameters paradigm seemed to offer a solution, as by this theory, instead of an unwieldy set of rules, the child needed only to grasp the setting of some parameters. But, while it was initially hoped that twenty or so parameters might be identified, the number has increased to the point where, although there is no exact consensus, it is too large to offer much hope of providing a simple explanation of language learning, and microparameters further complicate the picture.
The Minimalist Program was initiated in the mid-1990s partially to respond to such criticisms by continuing the trend towards simplicity that had begun with the Extended Standard Theory, with the goal of the greatest possible degree of elegance and parsimony. The minimalist approach is regarded by advocates not as a full theory of syntax but rather as a program of research working towards such a theory, building on the key features of Principles and Parameters.
In the Minimalist Program, syntactic structures corresponding to sentences are constructed using a single operation, Merge, that combines a head with a complement, for example, merging Albert with will meow to give Albert will meow. Importantly, Merge is recursive, so that it can be applied over and over to give sentences of any length. For instance, the sentence just discussed can be merged with thinks to give thinks Albert will meow and then again with Leisa to form the sentence Leisa thinks Albert will meow. Instead of elements within a structure moving from one place to another, a structure merges with an element already inside of it and then deletes the redundant copy; a question can be formed from Albert will meow by first merging will with the whole structure to produce will Albert will meow, and then deleting the lower will to give will Albert meow? In order to prevent the production of ungrammatical strings, Merge must be constrained in various ways. The main constraints are postulated to be lexical, coming from the syntactic features of the words in a language. These features control which elements can be merged together, which cannot, and when merging is obligatory, for instance, to provide an object for a transitive verb.
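Recursive Merge can be illustrated with a few lines of code. The sketch below is my own (the tuple representation is a programming convenience, not Chomsky’s notation); it shows how repeatedly applying one binary operation yields structures of unbounded depth:

```python
# A toy model of Merge: combine two syntactic objects into a new one.

def merge(a, b):
    return (a, b)

# Build "Albert will meow", then embed it twice.
vp = merge("will", "meow")
s1 = merge("Albert", vp)   # Albert will meow
s2 = merge("thinks", s1)   # thinks Albert will meow
s3 = merge("Leisa", s2)    # Leisa thinks Albert will meow

# Because Merge is recursive, nothing stops further embedding;
# the depth of nesting can grow without limit.
def depth(node):
    if isinstance(node, tuple):
        return 1 + max(depth(child) for child in node)
    return 0

print(depth(s3))  # 4
```

Each application of merge takes previously built structures as input, which is exactly what lets a finite operation generate sentences of any length.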
During the Minimalist Program era, Chomsky has worked on a more specific model for the architecture of the language faculty, which he divides into the Faculty of Language, Broad (FLB) and the Faculty of Language, Narrow (FLN). The FLN is the syntactic computational system that had been the subject of Chomsky’s work from the beginning, now envisioned as using a single operation, that of recursive Merge. The FLB is postulated to include the FLN, but additionally the perceptual-articulatory system that handles the reception and production of physical messages (spoken or signed words and sentences) and the conceptual-intentional system that handles interpreting the meaning of those messages. In a schema similar to a flattened version of the T-model, the FLN forms a bridge between the other systems of the FLB. Incoming messages are given a structural form by the FLN that is passed to the conceptual-intentional system to be interpreted, and the reverse process allows thoughts to be articulated as speech. The different structural levels, D-structure and S-structure, of the T-model are eliminated in favor of maximal simplicity (the upside-down T is now just a flat line). The FLN is held to have a single level on which structures are derived through Merge, and two interfaces connected to the other parts of the FLB.
One important implication of this proposed architecture is the special role of recursion. The perceptual-articulatory system and conceptual-intentional system have clear analogs in other species, many of which can obviously sense and produce signals and, in at least some cases, seem to be able to link meanings to them. Chomsky argues that, in contrast, recursion is uniquely human and that no system of communication among non-human animals allows users to limitlessly combine elements to produce a potential infinity of messages. In many ways, Chomsky is just restating what had been an important part of his theory from the beginning, which is that human language is unique in being productive or capable of expressing an infinity of different meanings, an insight he credits to Descartes. This makes recursion the characteristic aspect of human language that sets it apart from anything else in the natural world, and a central part of what it is to be human.
The status of recursion in Chomsky’s theory has been challenged in various ways, sometimes with the claim that some human language has been observed to be non-recursive (discussed below, in 4.a). That recursion is a uniquely human ability has also been called into question by experiments in which monkeys and corvids were apparently trained in recursive tasks under laboratory conditions. On the other hand, it has also been suggested that if the recursive FLN really does not have any counterpart among non-human species, it is unclear how such a mechanism might have evolved. This last point is only the latest version of a long-running objection that Chomsky’s ideas are difficult to reconcile with the theory of evolution since he postulates uniquely human traits for which, it is argued by critics, there is no plausible evolutionary history. Chomsky counters that it is not unlikely that the FLN appeared as a single mutation, one that would be selected due to the usefulness of recursion for general thought outside of communication. Providing evolutionary details and exploring the relationship between the language faculty and the physical brain have formed a large part of Chomsky’s most recent work.
The central place of recursion in the Minimalist Program also brought about an interesting change in Chomsky’s thoughts on hypothetical extraterrestrial languages. During the 1980s, he speculated that alien languages would be unlearnable by human beings since they would not share the same principles as human languages. As such, one could be studied as a natural phenomenon in the way that humans study physics or biology, but it would be impossible for researchers to truly learn the language in the way that field linguists master newly encountered human languages. More recently, however, Chomsky hypothesized that since recursion is apparently the core, universal property of human language and any extraterrestrial language will almost certainly be recursive as well, alien languages may not be that different from our own, after all.
4. Language and Languages
As a linguist, Chomsky has of course always been primarily concerned with language. His study of this phenomenon eventually led him not only to formulate theories that were very much at odds with those held at one time by the majority of linguists and philosophers, but also to have a fundamentally different view about the thing, language, that was being studied and theorized about. Chomsky’s views have been influential, but many of them remain controversial today. This section discusses some of Chomsky’s important ideas that will be of interest to philosophers, especially concerning the nature and acquisition of language, as well as meaning and analyticity, topics that are traditionally the central concerns of the philosophy of language.
a. Universal Grammar
Perhaps the single most salient feature of Chomsky’s theory is the idea of Universal Grammar (UG). This is the central aspect of language that he argues is shared by all human beings, a part of the organization of the mind. Since it is widely assumed that mental features correspond, at some level, to physical features of the brain, UG is ultimately a biological hypothesis that would be part of the genetic inheritance that all humans are born with.
In terms of Principles and Parameters Theory, UG consists of the principles common to all languages and which will not change as the speaker matures. UG also consists of parameters, but the values of the parameters are not part of UG. Instead, parameters may change from their initial setting as a child grows up, based on the language she hears spoken around her. For instance, an English-speaking child will learn that every sentence must have a subject, setting her Pro-Drop parameter to a certain value, the opposite of the value it would take for a Spanish-speaking child. While the Pro-Drop parameter is a part of UG, this particular setting of the parameter is a part of English and other languages where the subject must be overtly included in the sentence. All of the differences between human languages are then differences in vocabulary and in the settings of parameters, but they are all organized around a common core given by UG.
Chomsky has frequently stated that the important aspects of human languages are set by UG. From a sufficiently detached viewpoint, for instance, that of a hypothetical Martian linguist, there would only be minor regional variations of a single language spoken by all human beings. Further, the variations between languages are predictable from the architecture of UG and can only occur within narrowly constrained limits set by that structure. This was a dramatic departure from the assumption, largely unquestioned until the mid-20th century, that languages can vary virtually without limit and in unpredictable ways. This part of Chomsky’s theory has remained controversial, with some authorities on crosslinguistic work, such as the British psycholinguist Stephen Levinson (2016), arguing that it discounts real and important differences among languages. Other linguists argue the exact contrary: that data from the study of languages worldwide backs Chomsky’s claims. Because the debate ultimately concerns invisible mental features of human beings and how they relate to unpronounced linguistic structures, the interpretation of the evidence is not straightforward, and both sides claim that the available empirical data supports their position.
The theory of UG is an important aspect of Chomsky’s ideas for many reasons, among which is that it clearly sets his theories apart as different from paradigms that had previously been dominant in linguistics. This is because UG is not a theory about behavior or how people use language, but instead about the internal composition of the human mind. Indeed, for Chomsky and others working within the framework of his ideas, language is not something that is spoken, signed, or written but instead exists inside of us. What many people think of as language, externalized acts of communication, are merely products of that internal mental faculty. This in turn has further implications for theories of language acquisition (see 4.b) and how different languages should be identified (4.c).
An important implication of UG is that it makes Chomsky’s theories empirically testable. A common criticism of his work is that because it abstracts away from the study of actual language use to seek underlying idealized patterns, no evidence can ever count against it. Instead, apparent counterexamples can always be dismissed as artifacts of performance rather than the competence that Chomsky was concerned with. If correct, this would be problematic since it is widely agreed that a good scientific theory should be testable in some way. However, this criticism is often based on misunderstandings. A linguist dismissing an apparent failure of the theory as due to performance would need to provide evidence that performance factors really are involved, rather than a problem with the underlying theory of competence. Further, if a language were discovered to be organized around principles that contravened those of UG, then many of the core aspects of Chomsky’s theories would be falsified. Although candidate languages have been proposed, all of them are highly controversial, and none is anything close to universally accepted as falsifying UG.
In order to count as a counterexample to UG, a language must actually breach one of its principles; it is not enough that a principle merely not be displayed. As an example, one of the principles is what is known as structure dependence: when an element of a linguistic structure is moved to derive a different structure, that movement depends on the structure and its organization into phrases. For instance, to arrive at the correct question form of The cat who is on the desk is hungry, it is the is in the main clause, the one before hungry, that is moved to the front of the sentence, not the one in the relative clause (between who and on). However, in some languages, for instance Japanese, elements are not moved to form questions; instead, a question marker (ka) is added at the end of the sentence. This does not make Japanese a counterexample to the UG principle that movement is always structurally dependent. Japanese simply does not exercise this principle when forming questions, but neither is the principle violated. A counterexample to UG would be a language that moved elements but did so in a way that did not depend on structure, for instance, by always moving the third word to the front or inverting the word order to form a question.
A case that generated a great deal of recent controversy has been the claim that Pirahã, a language with a few hundred speakers in the Brazilian rain forest, lacks recursion (Everett 2005). This has been frequently presented as falsifying UG, since recursion is the most important principle, indeed the identifying feature, of human language, according to the Minimalist Program. This alleged counterexample received widespread and often incautious coverage in the popular press, at times being compared to the discovery of evidence that would disprove the theory of relativity.
This assertion that Pirahã has no recursion has itself been frequently challenged, and the status of the claim remains unclear. But there is also a lack of agreement on whether, if true, it would invalidate UG or whether it would just be a case similar to the one discussed above, the absence of movement in Japanese when forming questions, where a principle is not being exercised. Proponents of Chomsky’s ideas counter that UG is a theory of mental organization and underlying competence, a competence that may or may not be put fully to use. The fact that the Pirahã are capable of learning Portuguese (the majority language in Brazil) shows that they have the same UG present in their minds as anyone else. Chomsky points out that there are numerous cases of human beings choosing not to exercise some sort of biological capacity that they have. Chomsky’s own example is that although humans are biologically capable of swimming, many would drown if placed in water. It has been suggested by sympathetic scholars that this example is not particularly felicitous, as swimming is not an instinctive behavior for humans, and a better example might be monks who are sworn to celibacy. Debate has continued concerning this case, with some still arguing that if a language without recursion would not be accepted as evidence against UG, it is difficult to imagine what could.
b. Plato’s Problem and Language Acquisition
One of Chomsky’s major goals has always been to explain the way in which human children learn language. Since he sees language as a type of knowledge, it is important to understand how that knowledge is acquired. It seems inexplicable that children acquire something as complex as the grammar and vocabulary of a language, let alone the speed and accuracy with which they do so, at an age when they cannot yet learn how to tie their shoes or do basic arithmetic. The mystery is deepened by the difficulty that adults, who are usually much better learners than small children, have with acquiring a second language.
Chomsky addressed this puzzle in Knowledge of Language (1986), where he called it “Plato’s Problem”. This name is a reference to Plato’s Meno, a dialog in which Socrates guides a young boy, without a formal education, into producing a fairly complex geometric proof, apparently from the child’s own mental resources. Considering the difficult question of where this apparent knowledge of geometry came from, Plato, speaking through Socrates, concludes that it must have been present in the child already, although dormant until the right conditions were presented for it to be awakened. Chomsky would endorse largely the same explanation for language acquisition. He also cites Leibniz and Descartes as holding similar views concerning important areas of knowledge.
Chomsky’s theories regarding language acquisition are largely motivated by what has become known as the “Poverty of the Stimulus Argument,” the observation that the information about their native language that children are exposed to seems inadequate to explain the linguistic knowledge that they arrive at. Children only ever hear a small subset of the sentences that they can produce or understand. Furthermore, the language that they do hear is often “corrupt” in some way, such as the incomplete sentences frequently used in casual exchanges. Yet on this basis, children somehow master the complex grammars of their native languages.
Chomsky pointed out that the Poverty of the Stimulus makes it difficult to maintain that language is learned through the same general-purpose learning mechanisms that allow a human being to learn about other aspects of the world. There are many other factors that he and his followers cite to underline this point. All developmentally normal children worldwide are able to speak their native languages at roughly the same age, despite vast differences in their cultural and material circumstances or the educational levels of their families. Indeed, language learning seems to be independent of the child’s own cognitive abilities, as children with high IQs do not learn the grammar of their language faster, on average, than others. There is a notable lack of explicit instruction; analyses of speech corpora show that adult correction of children’s grammar is rare, and it is usually ineffective when it does occur. Considering these factors together, it seems that the way in which human children acquire language requires an explanation in a way that learning, say, table manners or putting shoes on does not.
The solution to this puzzle is, according to Chomsky, that language is not learned through experience but is innate. Children are born with Universal Grammar already in them, so the principles of language are present from birth. What remains is “merely” learning the particularities of the child’s native language. Because language is a part of the human mind, a part that each human being is born with, a child learning her native language is just undergoing the process of shaping that part of her mind into a particular form. In terms of the Principles and Parameters Theory, language learning is setting the value of the parameters. Although subsequent research has shown that things are more complicated than the simple setting of switches, the basic picture remains a part of Chomsky’s theory. The core principles of UG remain unchanged as the child grows, while peripheral elements are more plastic and are shaped by the linguistic environment of the child.
Chomsky has sometimes put the innateness of language in very strong terms and has stated that it is misleading to call language acquisition “language learning”. The language system of the mind is a mental organ, and its development is similar, Chomsky argues, to the growth of bodily organs such as the heart or lungs, an automatic process that is complete at some point in a child’s development. The language system also stabilizes at a certain point, after which changes will be relatively minor, such as the addition of new words to a speaker’s vocabulary. Even many of those who are firm adherents to Chomsky’s theories regard such statements as incautious. It is sometimes pointed out that while the growth of organs does not require having any particular experiences, proper language development requires being exposed to language within a certain critical period in early childhood. This requirement is evidenced by tragic cases of severely neglected children who were denied the needed input and, as a result, never learned to speak with full proficiency.
It has also been pointed out that even the rationalist philosophers whom Chomsky frequently cites did not seem to view innate and learned as mutually exclusive. Leibniz (1704), for instance, stated that arithmetical knowledge is “in us” but still learned, drawn out by demonstration and testing on examples. It has been suggested that some such view is necessary to explain language acquisition. Since humans are not born able to speak in the way that, for example, a horse is able to run within hours of birth, some learning seems to be involved, but those sympathetic to Chomsky regard the Poverty of the Stimulus as ruling out simply acquiring language completely from external sources. According to this view, we are born with language inside of us, but the proper experiences are required to draw that knowledge out and make it available.
The idea of innate language is not universally accepted. The behaviorist theory that language learning is a result of social conditioning, or training, is no longer considered viable. But it is a widely held view that general statistical learning mechanisms, the same mechanisms by which a child learns about other aspects of the world and human society, are responsible for language learning, with only the most general features of language being innate. These sorts of theories tend to have the most traction in schools of linguistic thought that reject the idea of Universal Grammar, maintaining that no deep commonalities hold across human languages. On such a view, there is little about language that can be said to be shared by all humans and therefore innate, so language would have to be acquired by children in the same way as other local customs. Advocates of Chomsky’s views counter that such theories cannot be upheld given the complexity of grammar and the Poverty of the Stimulus, and that the very fact that language acquisition occurs given these considerations is evidence for Universal Grammar. The degree to which language is innate remains a highly contested issue in both philosophy and science.
Although the application of statistical learning mechanisms to machine learning programs, such as OpenAI’s ChatGPT, has proven incredibly successful, Chomsky points out that the architecture of such programs is very different from that of the human mind: “A child’s operating system is completely different from that of a machine learning program” (Chomsky, Roberts, and Watumull, 2023). This difference, Chomskyans maintain, precludes drawing conclusions about the use or acquisition of language by humans on the basis of studying these models.
c. I vs. E languages
Perhaps the way in which Chomsky’s theories differ most sharply from those of other linguists and philosophers is in his understanding of what language is and how a language is to be identified. Almost from the beginning, he has been careful to distinguish speaker performance from underlying linguistic competence, which is the target of his inquiry. During the 1980s, this methodological point would be further developed into the I-language/E-language distinction.
A common concept of what an individual language is, explicitly endorsed by philosophers such as David Lewis (1969), Michael Dummett (1986), and Michael Devitt (2022), is a system of conventions shared between speakers to allow coordination. On this view, language is a public entity used for communication. It is something like this that most linguists and philosophers of language have in mind when they talk about “English” or “Hindi”. Chomsky calls this concept of language E-language, where the “E” stands for external and extensional. What is meant by “extensional” is somewhat technical and will be discussed later in this subsection. “External” refers to the idea just discussed, where language is a public system that exists externally to any of its speakers. Chomsky points out that such a notion is inherently vague, and it is difficult to point to any criteria of identity that would allow one to draw firm boundaries that could be used to tell one such language apart from another. It has been observed that people living near border areas often cannot be neatly categorized as speaking one language or the other; Germans living near the Dutch border are comprehensible to the nearby Dutch but not to many Germans from the southern part of Germany. Based on the position of the border, we say that they are speaking “German” rather than “Dutch” or some other E-language, but a border is a political entity with negligible linguistic significance. Chomsky (1997: 7) also called attention to what he calls “semi-grammatical sentences,” such as the following string of words:
(17) *The child seems sleeping.
Although (17) is clearly ill-formed, most “English” speakers will be able to assign some meaning to it. Given these conflicting facts, there seems to be no answer to whether (17) or similar strings are part of “English”.
Based on considerations like those just mentioned, Chomsky derides E-languages as indistinct entities that are of no interest to linguistic science. The real concept of interest is that of an I-language, where the “I” refers to intensional and internal. “Intensional” is in opposition to “extensional”, and will be discussed in a moment. “Internal” means contained in the mind of some individual human being. Chomsky defines language as a computational system contained in an individual mind, one that produces syntactic structures that are passed to the mental systems responsible for articulation and interpretation. A particular state of such a system, shaped by the linguistic environment it is exposed to, constitutes an I-language. Because all I-languages contain Universal Grammar, they will all resemble each other in their core aspects, and because more peripheral parts of language are set by the input received, the I-language of two members of the same linguistic community will resemble one another more closely. For Chomsky, for whom the study of language is ultimately the study of the mind, it is the I-language that is the proper topic of concern for linguists. When Chomsky speaks of “English” or “Swahili”, this is to be understood as shorthand for a cluster of characteristics that are typically displayed by the I-languages of people in a particular linguistic community.
This rejection of external languages as worthy of study is closely related to another point where Chomsky goes against a widely held belief in the philosophy of language, as he does not accept the common hypothesis that language is primarily a means of communication. The idea of external languages is largely motivated by the widespread theory that language is a means for interpersonal communication, something that evolved so that humans could come together, coordinate to solve problems, and share ideas. Chomsky responds that language serves many uses, including to speak silently to oneself for mental clarity, to aid in memorization, to solve problems, to plan, or to conduct other activities that are entirely internal to the individual, in addition to communication. There is no reason to emphasize one of these purposes over any other. Communication is one purpose of language—an important one, to be sure—but it is not the purpose.
Besides the internal/external dichotomy, there is the intensional/extensional distinction, referring to two different ways that sets might be specified. The extension of a set is what elements are in that set, while the intension is how the set is defined and the members are divided from non-members. For instance, the set {1, 2, 3} has as its extension the numbers 1, 2, and 3. The intension of the same set might be the first three positive integers, or the square roots of 1, 4, and 9, or the first three divisors of 6; indeed, an infinite number of intensions might generate the same set extension.
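The point that many intensions can pick out a single extension is easy to demonstrate computationally. In this sketch (my own illustration; the variable names are mine), three different defining conditions yield the same set:

```python
# Three intensions (defining conditions), one extension: {1, 2, 3}.

first_three_positive_ints = {n for n in range(1, 4)}
square_roots = {int(x ** 0.5) for x in (1, 4, 9)}
first_three_divisors_of_six = set(sorted(n for n in range(1, 7) if 6 % n == 0)[:3])

print(first_three_positive_ints)      # {1, 2, 3}
print(square_roots)                   # {1, 2, 3}
print(first_three_divisors_of_six)    # {1, 2, 3}

# Extensionally, the three sets are identical, though each was
# specified by a different condition (intension).
assert first_three_positive_ints == square_roots == first_three_divisors_of_six
```

The three comprehensions play the role of intensions; the resulting set is the shared extension.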
Applying this concept to languages, a language might be defined extensionally in terms of the sentences of the language or intensionally in terms of the grammar that generates all of those sentences but no others. While Chomsky favors the second approach, he attributes the first to two virtually opposite traditions. Structuralist linguists, who place great value on studying corpora, and other linguists and philosophers who focus on the actual use of language define a language in terms of the sentences attested in corpora and those that fit similar patterns. A very different tradition consists of philosophers of language who are known as “Platonists”, and who are exemplified by Jerrold Katz (1981, 1985) and Scott Soames (1984), former disciples of Chomsky. On this view, every possible language is a mathematical object, a set of possible sentences that really exist in the same abstract sense that sets of numbers do. Some of these sets happen to be the languages that humans speak.
Both of these extensional approaches are rejected by Chomsky, who maintains that language is an aspect of the human mind, so what is of interest is the organization of that part of the mind, the I-language. This is an intensional approach, since a particular I-language will constitute a grammar that will produce a certain set of sentences. Chomsky argues that both extensional approaches, the mathematical and the usage-based, are insufficiently focused on the mental to be of explanatory value. If a language is an abstract mathematical object, a set of sentences, it is unclear how humans are supposed to acquire knowledge of such a thing or to use it. The usage-based approach, as a theory of behavior, is insufficiently explanatory because any real explanation of how language is acquired and used must be in mental terms, which means looking at the organization of the underlying I-language.
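The contrast between a language-as-set and a grammar that generates it can be illustrated with a toy context-free grammar in Python. The grammar and every name below are invented for illustration only; this is not a fragment of any actual I-language or of Chomsky's formalism.

```python
import itertools

# A toy rewrite grammar: the *intensional* specification of a tiny
# invented language. Its *extension* is the set of sentences generated.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["cats"], ["dogs"]],
    "VP": [["sleep"], ["chase", "NP"]],
}

def generate(symbol="S", depth=3):
    """Yield all word sequences derivable from `symbol` within `depth` expansions."""
    if symbol not in GRAMMAR:          # terminal word
        yield [symbol]
        return
    if depth == 0:                     # cut off the (potentially unbounded) recursion
        return
    for production in GRAMMAR[symbol]:
        parts = [list(generate(s, depth - 1)) for s in production]
        for combo in itertools.product(*parts):
            yield [word for part in combo for word in part]

sentences = {" ".join(s) for s in generate()}
# "cats chase dogs" is generated; the string "chase cats dogs" is not.
```

A finite grammar like this determines its extension completely, which is the sense in which an intensional specification is more informative than a list of attested sentences.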
While many who study language accept the concept of the I-language and agree with its importance, Chomsky’s complete dismissal of E-languages as worthy of study has not been widely endorsed. E-languages, even if they are ultimately fiction, seem to be a necessary fiction for disciplines such as sociolinguistics or for the historical analysis of how languages have evolved over time. Further, having vague criteria of identity does not automatically disqualify a class of entities from being used in science. For example, the idea of species is open to many of the same criticisms concerning vagueness that Chomsky directs at E-languages, and its status as a real category has been debated, but the concept often plays a useful role in biology.
d. Meaning and Analyticity
It might be said that the main concern of the philosophy of language is the question of meaning. How is it that language corresponds to, and allows us to communicate about, states of affairs in the world or to describe possible states of affairs? A related question is whether there are such things as analytic truths, that is, sentences that are (as they were often traditionally characterized) necessarily true by virtue of meaning alone. It might seem like anyone who understands all the words in:
(18) If Albert is a cat, then Albert is an animal.
knows that it has to be true, just in virtue of knowing what it means. Appeals to such knowledge were frequently the basis for explaining our apparent a priori knowledge of logic and mathematics and for what came to be known as “analytic philosophy” in the 20th century. But the exact nature and scope of this sort of truth and knowledge are surprisingly hard to clarify, and many philosophers, notably Quine (1953) and Fodor (1998), argue that allegedly analytic statements are no different from any other belief that is widely held, such as:
(19) The world is more than a day old.
On this outlook, not only are apparently analytic truths open to revision just like any other belief, but the entire idea of determinate meanings becomes questionable.
As mentioned earlier, Chomsky’s focus has been not on meaning but instead on syntax, the grammatical rules that govern the production of well-formed sentences, considered largely independent of their meanings. Much of the critical data for his program has consisted of unacceptable sentences, the “WhyNots,” such as:
(20) * She’s as likely as he’s to get ill. (Rey 2022)
Sentences like (20), or (1)-(6) in 2.c above, are problematic, not because they have no meaning or have an anomalous meaning in some way, but because of often subtle problems in the underlying syntactic structure of the sentence. Chomsky frequently argued that syntax is independent of meaning, and that a theory of language should be able to explain the syntactic data without entering into questions of meaning. This idea, sometimes called “the autonomy of syntax”, is supported by, among other evidence, sentences such as:
(21) Colorless green ideas sleep furiously. (Chomsky 1965: 149)
which makes no sense if understood literally but is immediately recognizable as a grammatical sentence in English. Whether syntax is entirely independent of meaning and use has proven somewhat contentious, with some arguing that, on the contrary, questions of grammaticality cannot be separated from pragmatic and semantic issues. However, the distinction fits well with Chomsky’s conception of I-language, an internal computational device that produces syntactic structures that are then passed to other mental systems. These include the conceptual-intentional system responsible for assigning meaning to the structures, a system that interfaces with the language faculty but is not itself part of that faculty, strictly speaking.
Despite his focus on syntax, Chomsky does frequently discuss questions of meaning, at least from 1965 on. Chomsky regards the words (and other lexical items, such as prefixes and suffixes) that a speaker has stored in her lexicon as bundles of semantic, syntactic, and phonetic features, indicating information about meaning, grammatical role, and pronunciation. Some features that Chomsky classifies as syntactic may seem to be more related to meaning, such as being abstract. Folding these features into syntax seemed to be supported by the observation that, for example,
(22) * A very running person passed us.
is anomalous because very requires an abstract complement in such a context (a very interesting person is fine). In Aspects of the Theory of Syntax (1965), he also introduced the notion of “selectional rules” that identify sentences such as:
(23) Golf plays John (1965: 149)
as “deviant”. A particularly interesting example is:
(24) Both of John’s parents are married to aunts of mine. (1965: 77)
In 1965, (24) might have seemed to be analytically false, but in the 21st century, such a sentence may very well be true!
One popular theory of semantics is that the meaning of a sentence consists of its truth conditions, that is, the state of affairs that would make the sentence true. This idea, associated with the philosopher of language Donald Davidson (1967), might be said to be almost an orthodoxy in the study of semantics, and it certainly has an intuitive appeal. To know what The cat is on the mat means is to know that this sentence is true if and only if the cat is indeed on the mat. Starting in the late 1990s, Chomsky would challenge this picture of meaning as an oversimplification of the way that language works.
According to Chomsky’s view, also developed by Paul Pietroski (2005), among others, the sentences of a language do not, themselves, have truth conditions. Instead, sentences are tools that might be used, among other things, to make statements that have truth values relative to their context of use. Support for this position is drawn from the phenomenon of polysemy, where the same word might be used with different truth-conditional roles within a single sentence, such as in:
(25) The bank was destroyed by the fire and so moved across the street. (Chomsky 2000: 180)
where the word bank is used to refer to both a building and a financial institution. There is also open texture, a phenomenon by which the meaning of a word might be extended in multiple ways, many of which might have once been impossible to foresee (Waismann 1945). An oft-cited example is mother: in modern times, unlike in the past, it is possible that two women, the woman who produces the ovum and the woman who carries the fetus, may both be called mothers of the child. One might also consider the way that a computer, at one time a human being engaged in performing computations, was easily extended to cover electronic machines that are sometimes said to think, something that was also at one time reserved for humans.
Considering these phenomena, it seems that the traditional idea of words as having fixed “meanings” might be better replaced by the idea of words as “filters or lenses, providing ways of looking at things and thinking about the products of our minds” (Chomsky 2000, 36), or, as Pietroski (2005) puts it, as pointers in conceptual space. A speaker uses the structures made available by her I-language in order to arrange these “pointers” in such a way as to convey information, producing statements that might be assigned truth values given the context. But a speaker is hardly constrained to her I-language, which might be supplemented by resources such as gestures, common knowledge, shared cultural background, or sensitivity to the listener’s psychology and ability to fill in gaps. Consider a speaker nodding towards a picture of the Eiffel Tower and saying “been there”; to the right audience, under the right circumstances, this is a perfectly clear statement with a determinate truth value, even though the I-language, which produces structures corresponding to grammatical sentences, has been overridden in the interests of efficiency.
It has been suggested (Rey 2022) that this outlook on meaning offers a solution to the question of whether there are sentences that are analytically true and that are distinct from merely strongly held beliefs. Sentences such as If Albert is a cat, he is an animal may be analytic in the sense that, in the lexicon accessed by the I-language, [animal] is a feature of cat (as argued by Katz 1990). On the other hand, the I-language might be overruled in the face of future evidence, such as discovering that cats are really robots from another planet (as Putnam 1962 imagined). These two apparently opposing facts can be accommodated by the open texture of the word cat, which might come to be used in cases where it does not, at present, apply.
Chomsky, throughout his long career, seems to have frequently vacillated concerning the existence of analytic truths. Early on, as in Aspects (1965), he endorses analyticity, citing sentence (24) and similar examples. At other times, he seems to echo Quine, at one point (1975) stating that the study of meaning cannot be dissociated from systems of belief. More recently (1997), he explicitly allows for analytic truths, arguing that necessary connections occur between the concepts denoted by the lexicons of human languages. For example, “If John persuaded Bill to go to college, then Bill at some point decided or intended to go to college… this is a truth of meaning” (1997: 30). This is to say that it is an analytic truth based on a relation that obtains between the concepts persuade and intend. Ultimately, though, Chomsky regards analyticity as an empirical issue, not one to be settled by considering philosophical intuitions but rather through careful investigation of language acquisition, crosslinguistic comparison, and the relation of language to other cognitive systems, among other evidence. Currently, he holds that allowing for analytic truths based on relations between concepts seems more promising than alternative proposals, but this is an empirical question to be resolved through science.
Finally, mention should be made of the way that Chomsky connects considerations of meaning with “Plato’s Problem”, the question of how small children manage to do something as difficult as learning language. Chomsky notes that the acquisition of vocabulary poses this problem “in a very sharp form” (1997: 29). During the peak periods of language learning, children learn several words a day, often after hearing them a single time. Chomsky accounts for this rapid acquisition in the same way as the acquisition of grammar: what is being learned must already be in the child. The concepts themselves are innate, and what a child is doing is simply learning what sounds people in the local community use to label concepts she already possesses. Chomsky acknowledges that this idea has been criticized. Hilary Putnam (1988), for example, asks how evolution could have possibly had the foresight to equip humans with a concept such as carburetor. Chomsky’s response is simply that this certainly seems surprising, but that “the empirical facts appear to leave open a few other possibilities” (1997: 26). Conceptual relations, like those mentioned above between persuades and intends, or between chase and follow with the intent of staying on one’s path, are, Chomsky asserts, grasped by children on the basis of virtually no evidence. He concludes that this indicates that children approach language learning with an intuitive understanding of important concepts, such as intending, causing something to happen, having a goal, and so on.
Chomsky suggests a parallel to his theory of lexical acquisition in the Nobel Prize-winning work of the immunologist Niels Jerne. The number of antigens (substances that trigger the production of antibodies) in the world is so enormous, including man-made toxins, that it may seem absurd to propose that immune systems would have evolved to have an innate supply of specific antibodies. However, Jerne’s work upheld the theory that an animal could not be stimulated to make an antibody in response to a specific antigen unless it had already produced such an antibody before encountering the antigen. In fact, Jerne’s (1985) Nobel speech was entitled “The Generative Grammar of the Immune System”.
Chomsky’s theories of innate concepts fit with those of some philosophers, such as Jerry Fodor (1975). On the other hand, this approach has been challenged by other philosophers and by linguists such as Stephen Levinson and Nicholas Evans (2009), who argue that the concepts labeled by words in one language very seldom map neatly onto the vocabulary of another. This is sometimes true even of very basic terms, such as the English preposition “in”, which has no exact counterpart in, for example, Korean or Tzeltal, languages that instead have a range of words that more specifically identify the relation between the contained object and the container. This kind of evidence is understood by some linguists to cast doubt on the idea that childhood language acquisition is a matter of acquiring labels for preexisting universal concepts.
e. Kripkenstein and Rule Following
This subsection introduces the “Wittgensteinian Problem”, one of the most famous philosophical objections to Chomsky’s notion of an underlying linguistic competence. Chomsky himself stated that out of the various criticisms his theory had received over the years, “this seems to me to be the most interesting” (1986: 223). Inspired by Ludwig Wittgenstein’s cryptic statement that “no course of action could be determined by a rule, because every course of action could be made out to accord with the rule” (1953: §201), Saul Kripke (1982) developed a line of argument that entailed a deep skepticism about the nature of rule-following activities, including the use of language. Kripke is frequently regarded as having gone beyond what Wittgenstein might have intended, so his argument is often attributed to a composite figure, “Kripke’s Wittgenstein” or “Kripkenstein”. A full treatment of this fascinating, but lengthy and complex, argument is beyond the scope of this article (the interested reader might consult the article “Kripke’s Wittgenstein”). It can be summarized as asserting that, in a case where a person seems to be following a rule, there are no facts about her that determine which rule she is actually following. To take Kripke’s example, if someone seems to be adding numbers in accordance with the normal rules of addition but then gives a deviant answer, say 68 + 57 = 5, there is no way to establish that she was not actually performing an operation called quaddition instead, which is like addition except that it gives an answer of 5 for any equation involving numbers larger than 57. Kripke claims that any evidence, including her own introspection, that she was performing addition and made a bizarre mistake is equally compatible with the hypothesis that she was performing quaddition. Ultimately, he concludes, there is no way to settle such questions, even in theory; there is simply no fact of the matter about what rule is being followed.
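Kripke's example can be stated as a pair of functions (a Python sketch of his example; the cutoff at 57 follows his presentation, and the function names are the standard ones from that literature):

```python
# Ordinary addition.
def plus(x, y):
    return x + y

# Kripke's "quaddition": agrees with addition on all small cases,
# but returns 5 whenever either argument reaches 57.
def quus(x, y):
    return x + y if x < 57 and y < 57 else 5

# Every case a speaker has ever checked agrees, so her finite past
# usage cannot settle which function she "really" meant:
assert all(plus(x, y) == quus(x, y) for x in range(57) for y in range(57))
# The functions diverge only on the new, unchecked case:
assert plus(68, 57) == 125 and quus(68, 57) == 5
```

The skeptical point is that no finite body of evidence about past performance distinguishes a `plus`-follower from a `quus`-follower.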
The relevance of Kripke’s argument to Chomsky’s linguistic theory is that it directly confronts his notion of language as an internalized system of rules (or, in later iterations, a system of principles and parameters that gives rise to rules that are not themselves represented). According to Chomsky’s theory, a grammatical error is explained as a performance issue, for example, a mistake brought on by inattention or distraction that causes a deviation from the system of rules in the mind of the speaker. According to Kripke, calling this a deviation from those rules, rather than an indication that different rules (or no rules) are being followed, is like trying to decide between addition and quaddition: there is no fact of the matter in the linguistic case, he asserted, any more than in the arithmetical one. Therefore, “it would seem that the use of the ideas of rules and competence in linguistics needs serious reconsideration” (1982: 31).
An essential part of Chomsky’s response to Kripke’s criticism was that the question of what is going on inside a speaker is no different in principle from any other question investigated by the sciences. Given a language user, say Jones, “We then try… to construct a complete theory, the best one we can, of relevant aspects of how Jones is constructed” (1986: 237). Such a theory would involve specifying that Jones incorporates a particular language, consisting of fixed principles and the setting of parameters, and that he follows the rules that would emerge from the interactions of these factors. Any particular theory like this could be proven wrong (Chomsky notes that “This has frequently been the case”) and, therefore, such a theory is an empirically testable one that can be found to be correct or incorrect. That is, given a theory of the speaker’s underlying linguistic competence, whether she is making a mistake or the theory is wrong is “surely as ascertainable as any other fact about a complex system” (Rey 2020: 125). What would be required is an acceptable explanation of why a mistake was made. The issues here are very similar to those surrounding Chomsky’s adaptation of the “Galilean Method” (see 2.b above) and the testability, or lack thereof, of his theories in general (see 4.a).
5. Cognitive Science and Philosophy of Mind
Because Chomsky regards language as a part of the human mind, his work has inevitably overlapped with both cognitive science and philosophy of mind. Although Chomsky has not ventured far into general questions about mental architecture outside of the areas concerned with language, his impact has been enormous, especially concerning methodology. Prior to Chomsky, the dominant paradigm in both philosophy and cognitive science was behaviorism, the idea that only external behavior could be legitimately studied and that the mind was a scientifically dubious entity. In extreme cases, most notably Quine (1960), the mind was regarded as a fiction best dropped from serious philosophy. Chomsky began receiving widespread notice in the 1950s for challenging this orthodoxy, arguing that it was a totally inadequate framework for the study of language (see 2.a, above), and he is widely held to have dramatically altered the scientific landscape by reintroducing the mind as a legitimate object of study.
Chomsky has remained committed throughout his career to the view that the mind is an important target of inquiry. He cautions against what he calls “methodological dualism” (2000: 135), the view that the study of the human mind must somehow proceed differently than the study of other natural phenomena. Although Chomsky says that few contemporary philosophers or scientists would overtly admit to following such a principle, he suggests that in practice it is widespread.
Chomsky postulates that the human mind contains a language faculty, or module, a biological computer that operates largely independently of other mental systems to produce and parse linguistic structures. This theory is supported by the fact that we, as language users, apparently systematically perform highly complex operations, largely subconsciously, in order to derive appropriate structures that can be used to think and communicate our thoughts and to parse incoming structures underlying messages from other language users. These activities point to the presence of a mental computational device that carries them out. This has been interpreted by some as strong evidence for the computational theory of mind, essentially the idea that the entire mind is a biological computer. Chomsky himself cautions against such a conclusion, stating that the extension from the language module to the whole mind is as yet unwarranted.
In his work over the last two decades, Chomsky has dealt more with questions of how the language faculty relates to the mind more broadly, as well as the physical brain, questions that he had previously not addressed extensively. Most recently, he proposed a scheme by which the language faculty, narrowly defined, or FLN, consists only of a computational device responsible for constructing syntactic structures. This device provides a bridge between the two other systems that constitute the language faculty more broadly, one of which is responsible for providing conceptual interpretations for the structures of the FLN, the other for physical expression and reception. Thus, while, in this view, the actual language faculty plays a narrow role, it is a critical one that allows the communication of concepts. The FLN itself works with a single operation, merge, which combines two elements. This operation is recursive, allowing elements to be merged repeatedly. He suggests that the FLN, which is the only part of the system unique to humans, evolved due to the usefulness of recursion not only for communication but also for planning, navigation, and other types of complex thought. Because the FLN is thought to have no analog among other species, recursion is theorized to be an important characteristic of human thought, which gives it its unique nature.
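The recursive character of merge can be caricatured in a few lines of Python. This is a deliberately simplified sketch: Chomsky's merge forms unordered sets and interacts with labeling, both of which are omitted here, and the example phrase is ours.

```python
# A minimal sketch of "merge": one operation that combines two
# syntactic objects into a new object, which can itself be merged again.
def merge(a, b):
    return (a, b)   # an unordered set in Chomsky's theory; a tuple here

# Building a structure for "read the book" by repeated application:
dp = merge("the", "book")     # ('the', 'book')
vp = merge("read", dp)        # ('read', ('the', 'book'))

# Because merge accepts its own outputs as inputs, structures of
# unbounded depth can be built from a finite stock of lexical items.
```

The point of the sketch is only that a single binary operation, applied to its own outputs, suffices to generate hierarchical structures without bound.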
While the FLN interfaces with other mental systems, passing syntactic structures between them, the system itself is postulated to carry out its operations in isolation. This follows from Chomsky’s view of syntax as largely autonomous from questions of meaning and also from the way that linguistic knowledge seems to be specialized and independent of our general knowledge about the world. For instance, we can recognize a sentence such as:
(26) On later engines, fully floating gudgeon pins are fitted (Cook and Newsom 1998: 83).
as well-formed, despite the fact that most readers will not know what it means. This concept of a specialized language faculty, which has been a constant in Chomsky’s work almost from the start, represents a substantive commitment to the “modularity of mind”, a thesis that the mind consists, at least in part, of specialized and autonomous systems. There is debate among cognitive scientists and in the philosophy of psychology regarding the degree to which this picture is accurate, as opposed to the idea that mental processes result from the interaction of general faculties, such as memory and perception, which are not domain-specific in the way of the hypothesized language faculty.
It should be emphasized that the language faculty Chomsky hypothesizes is mental, not a specific physical organ in the brain, unlike, for example, the hippocampus. Asking where it is in the brain is something like asking where a certain program is in a computer; both emerge from the functioning of many physical processes that may be scattered in different locations throughout the entire physical device. At the same time, although Chomsky’s theory concerns mental systems and their operations, this is intended as a description, at a high level of abstraction, of computational processes instantiated in the physical brain. Opponents of Chomsky’s ideas frequently point out that there has been little progress in actually mapping these mental systems onto the brain. Chomsky acknowledges that “we do not really know how [language] is actually implemented in neural circuitry” (Berwick and Chomsky 2017: 157). However, he also holds that this is entirely unsurprising, given that neuroscience, like linguistics, is as of yet in its infancy as a serious science. Even in much simpler cases, such as insect navigation, where researchers carry out experiments and genetic manipulations that cannot be performed on humans, “we still do not know in detail how that computation is implemented” (2017: 157).
In his most recent publications, Chomsky has worked towards unifying his theories of language and mind with neuroscience and theories of the physical brain. He has at times expressed pessimism about the possibility of fully unifying these fields, which would require explaining linguistic and psychological phenomena completely in terms of physical events and structures in the brain. While he holds that this may be possible at some point in the distant future, it may require a fundamental conceptual shift in neuroscience. He cautions that it is also possible that such a unification may never be completely possible. Chomsky points to Descartes’ discussion of the “creative” nature of human thought and language, which is the observation that in ordinary circumstances the use of these abilities is “innovative without bounds, appropriate to circumstances but not caused by them” (Chomsky 2014: 1), as well as our apparent possession of free will. Chomsky suggests that it is possible that such phenomena may be beyond our inherent cognitive limitations and impossible for us to ever fully understand.
6. References and Further Reading
a. Primary Sources
Chomsky has been a highly prolific author who has written dozens of books explaining and promoting his theories. Although almost all of them are of great interest to anyone interested in language and mind, including philosophers, they vary greatly in the degree to which they are accessible to non-specialists. The following is a short list of some of the relatively non-technical works of philosophical importance:
Chomsky, N. 1956. “Three Models for the Description of Language”. IRE Transactions on Information Theory 2(3): 113–124.
The earliest presentation of the Chomsky Hierarchy.
Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton and Company.
Chomsky, N. 1959. “A Review of B.F. Skinner’s Verbal Behavior”. Language 35(1): 26–58.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
While many of the exact proposals about syntax are dated, this contains what is still one of the best summaries of Chomsky’s ideas concerning language acquisition and the connections he sees between his program and the work of the early modern rationalist philosophers.
Chomsky, N. 1975. The Logical Structure of Linguistic Theory. Berlin: Springer.
The earliest statement of Chomsky’s theory, now somewhat outdated, originally circulated as a typescript in 1956.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1986. Barriers. Boston: The MIT Press.
Chomsky, N. 1986. Knowledge of Language: Its Nature, Origin and Use. Westport, CT: Praeger.
Contains Chomsky’s response to “Kripkenstein”, as well as the first discussion of I-languages.
Chomsky, N. 1988. Language and Problems of Knowledge: The Managua Lectures. Cambridge, MA: MIT Press.
A series of lectures for a popular audience that introduces Chomsky’s linguistic work.
Chomsky, N. 1995. The Minimalist Program. Boston: MIT Press.
Chomsky, N. 1997. “Language and Problems of Knowledge”. Teorema 16(2): 5–33.
This is probably the best short introduction to Chomsky’s ideas on the nature and acquisition of language, especially the E-language/I-language distinction.
Chomsky, N. 2000. New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
It is philosophically interesting in that it contains a significant discussion of Chomsky’s views on contemporary trends in the philosophy of language, particularly his rejection of “externalist” theories of meaning.
Hauser, M.; Chomsky, N.; Fitch, T. 2002. “The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?”. Science 298: 1569–1579.
A good summary of directions in generative linguistics, including proposals about the structure of the language faculty in terms of FLN/FLB.
Chomsky, N. 2006. Language and Mind. Cambridge: Cambridge University Press.
Also contains valuable historical context.
Chomsky, N. 2014. “Science, Mind and Limits of Understanding”. The Science and Faith Foundation, https://chomsky.info/201401/. The Vatican.
Berwick, R. and Chomsky, N. 2016. Why Only Us: Language and Evolution. Boston: MIT Press.
It is valuable as a non-technical look at the current state of Chomsky’s theories as well as a discussion of the evolutionary development of language.
Chomsky discusses hypothetical extraterrestrial languages and the possibility of communicating with aliens.
Chomsky, N., Roberts, I., and Watumull, J. “Noam Chomsky: The False Promise of ChatGPT”. New York Times. March 8, 2023.
For someone interested in exploring Chomsky’s linguistic theories in depth, the following are a few key works tracing their development (along with Aspects, listed above).
b. Secondary Sources
There is a vast secondary literature surrounding Chomsky that seeks to explain, develop, and often criticize his theories. The following is a small sampling of works interesting to non-specialists. After a list of sources that cover Chomsky’s work in general, sources that are relevant to more specific aspects are listed by the section of this article they were referenced in or apply to.
General:
Cook, V. and Newsom, M. 1996. Chomsky’s Universal Grammar: An Introduction. Malden, MA: Blackwell.
Very clear introduction to Chomsky’s theories and their importance to linguistic science. The first three chapters are especially valuable to non-specialists.
Rey, G. 2020. Representation of Language: Philosophical Issues in a Chomskyan Linguistics. Oxford: Oxford University Press.
A useful and thorough overview of the philosophical implications of Chomsky’s theories, particularly regarding the philosophy of science and the philosophy of mind, as well as a summary of the core linguistic theory.
Scholz, B., Pelletier, F., Pullum, G., and Nefdt, R. 2022. “Philosophy of Linguistics”, The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.).
This article is an excellent critical comparison of Chomsky’s theories on language and linguistic science with the major rival approaches.
Life:
Rai, M. 1995. Chomsky’s Politics. London: Verso.
Cohen, J., and Rogers, J. 1991. “Knowledge, Morality and Hope: The Social Thought of Noam Chomsky.” New Left Review. I/187: 5–27.
Philosophy of Linguistics:
Bloomfield, L. 1933. Language. New York: Holt, Rinehart, and Winston.
Hockett, C. 1960. “The Origin of Speech”. Scientific American 203: 88–111.
Quine, W. 1960. Word and Object. Cambridge, MA: MIT University Press.
Skinner, B. 1957. Verbal Behavior. New York: Appleton-Century-Crofts.
The Development of Chomsky’s Linguistic Theory:
Baker, M. 2001. The Atoms of Language. New York: Basic Books.
Easily readable presentation of Principles and Parameters Theory.
Harris, R. 2021. The Linguistics Wars. Oxford: Oxford University Press.
Liao, D., et al. 2022. “Recursive Sequence Generation in Crows”. Science Advances. 8(44).
Summarizes recent challenges to Chomsky’s claim that recursion is uniquely human.
Tomalin, M. 2006. Linguistics and the Formal Sciences: The Origins of Generative Grammar. Cambridge, UK: Cambridge University Press.
Provides interesting historical background connecting Chomsky’s early work with contemporary developments in logic and mathematics.
Technical:
Generative Grammar:
Lasnik, H. 1999. Minimalist Analysis. Malden, MA: Blackwell.
Lasnik, H. 2000. Syntactic Structures Revisited. Cambridge, MA: MIT Press.
Lasnik, H. and Uriagereka, J. 1988. A Course in GB Syntax. Cambridge, MA: MIT Press.
Language and Languages:
Criticisms of Universal Grammar:
Evans, N. and Levinson, S. 2009. “The Myth of Language Universals: Language Diversity and its Importance for Cognitive Science”. Behavioral and Brain Sciences 32(5): 429–492.
Levinson, S. 2016. “Language and Mind: Let’s Get the Issues Straight!”. Making Sense of Language (Blum, S., ed.). Oxford: Oxford University Press: 68–80.
Relevant to the debate over the I-language/E-language distinction:
Devitt, M. 2022. Overlooking Conventions: The Trouble with Linguistic Pragmatism. Oxford: Oxford University Press.
Dummett, M. 1986. “‘A Nice Derangement of Epitaphs’: Some Comments on Davidson and Hacking”. Truth and Interpretation (Lepore, E., ed.). Oxford: Blackwell.
Katz, J. 1981. Language and Other Abstract Objects. Lanham, MD: Rowman and Littlefield.
Katz, J. 1985. The Philosophy of Linguistics. Oxford: Oxford University Press.
Lewis, D. 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
Soames, S. 1984. “Linguistics and Psychology”. Linguistics and Philosophy 7: 155–179.
Meaning and Analyticity:
Davidson, D. 1967. “Truth and Meaning”. Synthese 17(3): 304–323.
Fodor, J. 1998. Concepts: Where Cognitive Science Went Wrong. Cambridge, MA: MIT Press.
Katz, J. 1990. The Metaphysics of Meaning. Oxford: Oxford University Press.
Pietroski, P. 2005. “Meaning Before Truth”. Contextualism in Philosophy: Knowledge, Meaning and Truth. Oxford: Oxford University Press.
Putnam, H. 1962. “It Ain’t Necessarily So.” Journal of Philosophy LIX: 658–671.
Quine, W. 1953. “Two Dogmas of Empiricism”. From a Logical Point of View. Cambridge, MA: Harvard University Press.
Rey, G. 2022. “The Analytic/Synthetic Distinction”. The Stanford Encyclopedia of Philosophy (Spring 2023 Edition), Edward N. Zalta & Uri Nodelman (eds.).
See especially the supplement specifically on Chomsky and analyticity.
Waismann, F. 1945. “Verifiability”. Proceedings of the Aristotelian Society, Supplementary Volume 19.
Language Acquisition and the Theory of Innate Concepts:
Fodor, J. 1975. The Language of Thought. Scranton, PA: Crowell.
Jerne, N. 1985. “The Generative Grammar of the Immune System”. Science 229: 1057–1059.
Putnam, H. 1988. Representation and Reality. Cambridge, MA: MIT Press.
“Kripkenstein” and Rule-Following:
Kripke, S. 1982. Wittgenstein on Rules and Private Language. Cambridge, MA: Harvard University Press.
Wittgenstein, L. 1953. Philosophical Investigations (Anscombe, G. translator). Oxford: Blackwell.
On Pirahã:
Everett, D. 2005. “Cultural Constraints on Grammar and Cognition in Pirahã”. Current Anthropology 46(4): 621–646.
The original claim that a language without recursion had been identified, allegedly showing Universal Grammar to be false.
Hornstein, N. and Robinson, J. 2016. “100 Ways to Misrepresent Noam Chomsky”. Current Affairs.
Representative responses to Everett from those in Chomsky’s camp assert that even if his claims are correct, they would not represent a counterexample to Universal Grammar.
McWhorter, J. 2016. “The bonfire of Noam Chomsky: journalist Tom Wolfe targets the acclaimed linguist”. Vox.
Linguist John McWhorter provides a very understandable summary of the issues and assesses the often incautious way that the case has been handled in the popular press.
Nevins, A., Pesetsky, D., Rodrigues, C. 2009. “Pirahã Exceptionality: A Reassessment”. Language 85(2): 355–404.
Technical article criticizing Everett’s assessment of Pirahã syntax.
Other:
Lakoff, G. 1971. “On Generative Semantics”. Semantics (Steinberg, D. and Jakobovits, L., eds.). Cambridge, UK: Cambridge University Press.
An important work critical of Chomsky’s “autonomy of syntax”.
Cognitive Science and Philosophy of Mind:
Rey, G. 1997. Contemporary Philosophy of Mind. Hoboken: Wiley-Blackwell.
Covers Chomsky’s contributions in this area, particularly regarding the downfall of behaviorism and the development of the computational theory of mind.
Arguments from evil are arguments against Theism, which is broadly construed as the view that there is a supremely powerful, knowledgeable, and good creator of the universe. Arguments from evil attempt to show that there is a problem with Theism. Some try to show that Theism is known to be false; others try to show that Theism is probably false, or unreasonable to believe, or that there is strong evidence against it. Arguments from evil are part of the project of criticizing religions, and because religions offer comprehensive worldviews, arguments from evil are also part of the project of evaluating which comprehensive worldviews are true or false.
Humean arguments from evil take their argumentative strategy from Philo’s argument from evil in part XI of Hume’s Dialogues Concerning Natural Religion. Philo’s argumentative strategy is distinctive in that it is fundamentally explanatory in nature. Philo takes as his data for explanation the good and evil we know about. He asks which hypothesis about a creator best explains that data. He argues that the good and evil we know about is best explained not by Theism but by some rival hypothesis to Theism. In this way, the good and evil we know about provides a reason for rejecting Theism.
This article surveys Humean arguments from evil. It begins by explaining Philo’s original argument from evil as well as some potential drawbacks of that argument. Then it turns to more fully explaining the distinctive features of Humean arguments from evil in comparison to other arguments from evil. It highlights three features in particular: they appeal to facts about good and evil, they are comparative, and they are abductive. The remainder of the article articulates a modern, prototypical Humean argument inspired by the work of Paul Draper. It explains the idea that the good and evil we know about is better explained by a rival to Theism called the “Hypothesis of Indifference,” roughly, the hypothesis that there is no creator who cares about the world one way or the other. It then shows how to strengthen Humean arguments from evil by providing additional support for the rival hypothesis to Theism. Finally, it examines four prominent objections to Humean arguments.
This article focuses on Humean arguments that try to show that Theism is known to be false, or probably false, or unreasonable to believe. These kinds of Humean arguments are ambitious, as they try to draw an overall conclusion about Theism itself. But there can also be more modest Humean arguments that try to show that some evidence favors a rival to Theism without necessarily drawing any overall conclusions about Theism itself. This article focuses on ambitious Humean arguments rather than these modest Humean arguments mostly because ambitious Humean arguments are the ones contemporary philosophers have focused on. But it is important to keep in mind that Humean arguments from evil—like arguments from evil more generally—come in different shapes and sizes and may have different strengths and weaknesses.
1. Philo’s Argument from Evil

Natural theology is the attempt to provide arguments for the existence of God by appealing only to natural facts—that is, facts that are not (purportedly) revealed or otherwise supernatural. Three of the traditional arguments for the existence of God—the ontological argument, the cosmological argument, and the teleological argument—belong to the project of natural theology. Conversely, natural atheology is the attempt to provide arguments against the existence of God by appealing to natural (non-supernatural, non-revealed) facts.
Hume’s Dialogues Concerning Natural Religion is a classic work of natural atheology. In the dialogue, the interlocutors assume that there is a creator (or creators) of the world; they advance arguments about the nature or character of this creator. Most of the dialogue—parts II–VIII—discusses design arguments for the existence of God, whereas later parts—parts X–XI—discuss arguments from evil. Throughout, Philo offers a variety of critiques of prototypical theistic ideas. (Because it is controversial whether Philo speaks for Hume—and if so, where—this article attributes the reasoning to Philo.)
In part X, the interlocutors discuss what is called a “logical” or “incompatibility” argument from evil. They begin by describing various facts about good and evil they have observed. For instance, many people experience pleasure in life, but oftentimes they also experience great pain; the strong prey upon the weak; people use their imaginations not just for relaxation but to create new fears and anxieties; and so forth. They consider whether those facts are logically inconsistent with the existence of a creator with infinite power, wisdom, and goodness. Ultimately, Philo does not think it would be reasonable to infer the existence of such a creator from those facts, but he does concede that the facts are logically consistent with the existence of such a creator (X.35; XI.4, 12). But Philo’s concession is not his last word on the subject.
In part XI, Philo constructs a different argument from evil. He begins by articulating additional claims about good and evil he takes himself to know. Most of these claims concern causes of suffering that seem to be unnecessary—for example, variations in weather cause suffering yet seem to serve no purpose; pain teaches animals and people how to act, but it seems that pleasure would be just as effective at motivating them; and so forth. Given these claims, Philo considers what we can reasonably infer about the creator (or creators) of the universe. He considers four potential hypotheses:
The creator(s) of the universe are supremely good.
The creator(s) of the universe are supremely malicious.
The creator(s) of the universe have some mixture of both goodness and malice.
The creator(s) of the universe have neither goodness nor malice.
In evaluating these hypotheses, Philo uses a Humean principle of reasoning that “like effects have like causes.” In other words, the only kinds of features it is reasonable to infer from an effect to its cause(s) are features that would be similar between the two. (He uses this principle throughout the Dialogues; see also II.7, II.8, II.14, II.17, V.1, VI.1.) Using this principle, he argues that of these hypotheses the fourth is “by far the most probable” (XI.15). He rejects the first and the second because the proposed causes of the universe would be too dissimilar to the universe itself. The world is mixed, containing both good and evil. Thus, one cannot infer that the cause of the world contains no evil—as the first hypothesis suggests—or contains no good—as the second hypothesis suggests. Those causes are too dissimilar to their effect. He also rejects the third hypothesis. He assumes that if the creators of the universe had some mixture of goodness and malice, this would be because some of them were good and some malicious; and, he assumes, the universe would then be like a battlefield between them. But the regularity of the world suggests the universe is not a battlefield between dueling creators. Having ruled out the first three hypotheses, the most probable hypothesis must be the fourth. As Philo himself says of this hypothesis, using language that was graphic both then and now (XI.13):
The whole [of the universe] presents nothing but the idea of a blind nature, impregnated by a great vivifying principle, and pouring forth from her lap, without discernment or parental care, her maimed and abortive children.
Philo’s conclusion admits of both a strong and a weak interpretation. On the strong interpretation, Philo is concluding that we can reasonably believe something about the nature of the creator(s), namely, that they are indifferent. On the weak interpretation, Philo is concluding that of these four hypotheses, the fourth is the most probable—but it may not be sufficiently probable to reasonably believe. Either way, the most probable hypothesis is that the creator has neither goodness nor malice.
At first blush, it might not be obvious how Philo’s conclusion provides a reason for rejecting Theism. In fact, it might look like Philo is just concerned to undermine an argument from our knowledge of good and evil to Theism. And, one might point out, undermining an argument for a conclusion is not the same thing as providing a reason for rejecting that conclusion. To see how Philo’s conclusion provides a reason for rejecting Theism, notice two things. First, Philo is not merely claiming something purely negative, like that some argument for Theism fails. Rather, he is also claiming something positive, namely, that the fourth hypothesis—where the creator has neither goodness nor malice—is the most reasonable of the four considered, given our knowledge of good and evil. Second, that hypothesis is inconsistent with Theism, which maintains (at the very least) that God is supremely good. Since the most reasonable thing to believe, given that data, is inconsistent with Theism, that data provides a reason for rejecting Theism. In this way, Philo is not simply undermining an argument for Theism; he is also providing a reason for rejecting Theism.
Philo’s specific argument has received a mixed reaction, both historically and in the early 21st century. From a contemporary perspective, there are at least two drawbacks to it. First, Philo and his interlocutors assume that there is a creator (or creators) of the universe, and so they only consider hypotheses that imply as much. But many contemporary naturalists and atheists do not assume that there is any creator at all. From a contemporary perspective, it would be better to consider a wider range of hypotheses, including some that do not imply that there is a creator. Second, when evaluating hypotheses, Philo uses Hume’s principle of reasoning that “like effects have like causes.” But many contemporary philosophers reject such principles. Insofar as Philo’s reasoning assumes Hume’s own principles of reasoning, it will inherit the various problems philosophers have identified for those principles.
But even if Philo’s specific argument suffers from drawbacks, his argumentative strategy is both distinctive and significant. Thus, one might mount an argument that shares several of the distinctive features of his argumentative strategy without committing oneself to the specific details of Philo’s own argument. Toward the end of the 20th and beginning of the 21st century, Paul Draper did exactly that, constructing arguments against Theism that utilize Philo’s argumentative strategy while relying on a more modern epistemology. It is natural to call these arguments Humean arguments since their strategy originates in a dialogue written by Hume—even if modern defenses of them vary from Hume’s original epistemology. The next section describes in more detail several of the distinctive features of Philo’s argumentative strategy.
2. Distinctive Features of Humean Arguments
First, many arguments from evil focus exclusively on facts about evil. Some arguments focus on our inability to see reasons that would justify God’s permission of those evils (Martin (1978), Rowe (1979)). Other arguments focus on the horrific nature of such evils (Adams (1999)). By contrast, Humean arguments from evil focus on facts about both good and evil. The focus on both good and evil is appropriate and significant.
The focus on good and evil is appropriate because, if God exists, God cares not only about preventing evil but also about bringing about what is good. The focus is significant because it provides a richer set of data with which to reason about the existence of God. For it is conceivable that facts about evil provide some evidence against the existence of God while facts about good provide even stronger evidence for the existence of God, thereby offsetting that evidence. Or, alternatively, it is conceivable that facts about evil alone provide little to no evidence against the existence of God, but that facts about good and evil together provide strong evidence against it. By attending to both good and evil, Humean arguments thus have more data with which to reason about the moral character of a purported creator.
Second, Humean arguments compare Theism to some rival hypothesis that is inconsistent with Theism. Normally, the rival hypothesis is more specific than the denial of Theism. For instance, Philo’s argument considered rival hypotheses to Theism that are fairly specific. And we can distinguish between different Humean arguments on the basis of the different rival hypotheses they use.
There is an important advantage of using a specific rival hypothesis to Theism. The simplest rival to Theism is the denial of Theism. But consider all of the views that are inconsistent with Theism. That set includes various forms of naturalism, but also pantheism, panentheism, non-theistic idealisms, various forms of pagan religions, and perhaps others yet. So, the denial of Theism is logically equivalent to the disjunction of these various theories. But it is not at all obvious what a disjunction of these various theories will predict. By contrast, it is normally more obvious what a more specific, rival hypothesis to Theism predicts. Thus, by focusing on a more specific rival hypothesis to Theism, it is easier to compare Theism to that rival.
Third, Humean arguments are best understood abductively. They compare the degree to which a specific rival to Theism, as opposed to Theism itself, explains or otherwise predicts some data. Even Philo’s own argument can be understood abductively: the hypothesis that there is a supremely good creator does not explain the good and evil Philo observes, because the creator proposed by that hypothesis is not similar to the mixed good and evil he observes. To be clear, Humean arguments need not claim that the rival actually provides the best explanation of those facts. Rather, their claim is more modest, but with real bite: a rival to Theism does a better job of explaining some facts about good and evil.
Some Humean arguments may stop here with a comparison between Theism and a specific rival hypothesis. But many Humean arguments are more ambitious than that: they try to provide a reason for rejecting Theism. This feature of such Humean arguments deserves further clarification. Sometimes abductive reasoning is characterized as “inference to the best explanation.” In a specific inference to the best explanation, one infers that some hypothesis is true because it is part of the best explanation of some data. Such Humean arguments need not be understood as inference to the best explanation in this sense. Though it is not as catchy, some Humean arguments could be understood as “inference away from a worse explanation.” Some body of data gives us reason to reject Theism because some hypothesis other than Theism does a better job of explaining that data and that hypothesis is inconsistent with Theism. Notice that a specific rival to Theism can do a better job of explaining that data even if some other hypothesis does an even better job yet.
Lastly, Humean arguments are evidential arguments from evil, not logical arguments from evil. That is, they do not claim that some known facts are logically inconsistent with Theism; rather, they claim that some known facts are strong evidence against Theism. Logical arguments from evil have an important methodological feature: if some known fact is logically inconsistent with Theism, then it does not matter what evidence people muster for Theism—we already know that Theism is false. By contrast, evidential arguments may need to be evidentially shored up. Even if such arguments succeed in providing strong evidence against Theism, there may also be strong evidence in favor of Theism. This difference between evidential and logical arguments becomes relevant in section 4, which indicates how to strengthen Humean arguments.
3. Modern Humean Arguments
This section explains a modern, prototypical Humean argument. The author who has done the most to develop Humean arguments is Paul Draper, and the argument in this section is inspired by his work without being an interpretation of any specific argument he has given. Humean arguments compare Theism to some specific rival to Theism, and different Humean arguments may use different rivals. Consequently, it is important to begin by clarifying which specific rival is used to generate the argument that follows.
This article focuses on a rival called the Hypothesis of Indifference: the claim that it is not the case that the nature or condition of life on earth is the result of a creator (or creators) who cares positively or negatively about that life. The Hypothesis of Indifference is a natural hypothesis to focus on for several reasons. First, it is inconsistent with Theism, but is more specific than just the denial of Theism. Second, it does not imply that there is a creator. Third, it is consistent with metaphysical naturalism, the view that there are no supernatural facts. These last two reasons are important to a modern audience—many people believe that there is no creator of the universe, and many philosophers accept metaphysical naturalism.
The central claim of this Humean argument is this: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does. This article refers to this claim as Central Claim. Central Claim does not claim that the Hypothesis of Indifference perfectly predicts the good and evil we know about. It does not even claim that the Hypothesis of Indifference is the best explanation of the good and evil we know about. Rather, it claims that in comparison to Theism, the Hypothesis of Indifference does a much better job of predicting the good and evil we know about.
The comparison in Central Claim is an antecedent comparison. That is, it compares what the Hypothesis of Indifference and Theism predict about good and evil antecedently of our actual knowledge of that good and evil. We are to set aside, or bracket, our actual knowledge of good and evil and ask to what degree each hypothesis—the Hypothesis of Indifference, Theism—predicts what we in fact know.
This procedure of antecedent comparison is not unique to Humean arguments. It is frequently used in the sciences. A classic example of the same procedure concerns the retrograde motion of Mars. Viewed with the naked eye over many nights, Mars sometimes seems to move “backwards” through the sky. Some astronomers argued that the retrograde motion of Mars was better explained by heliocentrism than by geocentrism. But in making their arguments, they first set aside what they already knew about the retrograde motion of Mars and asked to what degree each hypothesis would predict that motion before considering whether Mars actually exhibits it.
There are different strategies one might use to defend Central Claim. One strategy appeals to what is normally called our background knowledge. This is knowledge we already have “in the background.” Such knowledge is frequently relied upon when we are evaluating claims about evidence, prediction, explanation, and the like. For instance, suppose I hear a loud repeating shrieking noise from my kitchen. I will immediately take that as evidence that there is smoke in my kitchen and go to investigate. However, when I take that noise as evidence of smoke in my kitchen, I rely upon a huge range of knowledge that is in the background, such as: loud repeating shrieking noises do not happen at random; that noise is not caused by a person or pet; there is a smoke detector in my kitchen; smoke detectors are designed to emit loud noises in the presence of smoke; and so on. I rely on this background knowledge—implicitly or explicitly—when I take that noise as evidence of smoke in my kitchen. For instance, if I lacked all of that background knowledge, it is very unlikely I would immediately take that noise as evidence of smoke in my kitchen.
One strategy for defending Central Claim relies upon our background knowledge. The basic strategy has four parts. First, one argues that our background knowledge supports certain kinds of predictions about good and evil. Second, one argues that those predictions are, to a certain degree, accurate. Third, one argues that the Hypothesis of Indifference does not interfere with or undermine those predictions. Finally, one argues that Theism interferes with or undermines those predictions, producing more inaccurate predictions. The end result, then, is that the combination of the Hypothesis of Indifference with our background knowledge does a better job of predicting the data of good and evil than the combination of Theism with our background knowledge.
This strategy can be implemented in various ways. One way of implementing it appeals to our background knowledge of the biological role or function of pleasure and pain (Draper (1989)). Specifically, our background knowledge predicts that pleasure and pain will play certain adaptive roles or functions for organisms. And when we consider the pleasure and pain we know about, we find that they frequently play those kinds of roles. For instance, warm sunlight on the skin is pleasant, and exposure to it also triggers the production of an important vitamin (vitamin D); rotten food normally produces an unpleasant odor; extreme temperatures that are bad for the body are also painful to experience for extended durations; and so forth. So, our background knowledge makes certain predictions about the biological role or function of pleasure and pain, and those predictions are fairly accurate.
The Hypothesis of Indifference does not interfere with, or undermine, those predictions, as it does not imply the existence of a creator who has moral reasons for deviating from the biological role of pleasure and pain. By contrast, Theism does interfere with, and undermine, those predictions. For pleasure is a good and pain an evil. Thus, given Theism, one might expect pleasure and pain to play moral or religious roles or functions. The exact nature of those roles might be open to debate, but they might include things like the righteous receiving happiness or good people getting the pleasure they deserve. Similarly, given Theism, one might expect pain not to play certain biological roles if it does not simultaneously play moral or religious roles. For instance, given Theism, one might not expect organisms that are not moral agents to undergo intense physical pain (regardless of whether that pain serves a biological role). In this way, Theism may interfere with the fairly accurate predictions from our background knowledge. Thus, the combination of the Hypothesis of Indifference and our background knowledge does a better job of predicting some of our knowledge of good and evil—namely, the distribution of pleasure and pain—than the combination of Theism and our background knowledge.
A second strategy for defending Central Claim utilizes a thought experiment (compare Hume’s Dialogues XI.4; Dougherty and Draper (2013); Morriston (2014)). Imagine two alien creatures who are of roughly human intelligence and skill. One of them accepts Theism, and the other accepts the Hypothesis of Indifference. But neither of them knows anything about the condition of life on earth. They first make predictions about the nature and quality of life on earth, and then they learn about the accuracy of their predictions. One might argue that the alien who accepts the Hypothesis of Indifference will do a much better job predicting the good and evil on earth than the alien who accepts Theism. But as it goes for the aliens, so it goes for us: the Hypothesis of Indifference does a much better job of predicting the good and evil we know about than Theism does.
The alien who accepts Theism might be surprised as it learns about the actual good and evil of life on earth. For the alien’s acceptance of Theism gives it reason to expect a better overall balance of good and evil than we know about. By contrast, the alien who accepts the Hypothesis of Indifference might not be surprised by the good and evil that we know about because the Hypothesis of Indifference does not imply the existence of a creator with a moral reason for influencing the good and evil the earth has. So the alien’s acceptance of the Hypothesis of Indifference does not give it a reason for anticipating any particular distribution of good and evil. Thus, the alien accepting the Hypothesis of Indifference might not be surprised to discover the specific good and evil it does in fact know about.
Recall that Central Claim involves an antecedent comparison—it compares to what degree two hypotheses predict some data antecedently of our actual knowledge of that data. This thought experiment models the idea of an antecedent comparison by having the aliens not actually know the relevant data of good and evil. Their ignorance of that good and evil models our “bracketing” of our own knowledge.
Having considered some defenses of Central Claim, we can now formulate some Humean arguments that use Central Claim as a premise. One Humean argument goes like this:
Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.
Therefore, the good and evil we know about is evidence favoring the Hypothesis of Indifference over Theism.
This argument is valid. But its inference is modest on two fronts. First, evidence comes in degrees, from weak evidence to overwhelming evidence. The conclusion merely states that the good and evil we know about is evidence favoring one hypothesis over another, without specifying the strength of that evidence. Second, the conclusion is consistent with a wide range of views about what it is reasonable for us to believe. It is consistent with views like: it is reasonable to believe Theism; it is reasonable to believe the Hypothesis of Indifference; it is not reasonable to believe or disbelieve either. To be sure, this argument still asserts Central Claim; and as section 5 shows, a number of authors have objected to Central Claim and arguments for it. But the conclusion drawn from Central Claim is quite modest. Perhaps for these reasons, defenders of Humean arguments from Philo to the present have tended to defend Humean arguments with more ambitious conclusions.
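The comparative notion of evidence in this conclusion is often glossed in Bayesian terms (Draper himself formulates the comparison probabilistically). As a rough sketch—writing T for Theism, HI for the Hypothesis of Indifference, and O for a statement of the good and evil we know about:

```latex
% Central Claim, stated probabilistically: O is far more
% probable given HI than given T.
P(O \mid HI) \gg P(O \mid T)

% By Bayes' theorem, learning O multiplies the odds between
% the two hypotheses by this likelihood ratio:
\frac{P(HI \mid O)}{P(T \mid O)}
  \;=\; \frac{P(HI)}{P(T)} \cdot \frac{P(O \mid HI)}{P(O \mid T)}
```

Since the likelihood ratio is much greater than 1, learning O raises the probability of the Hypothesis of Indifference relative to Theism whatever the prior odds were—and that is all the modest conclusion claims.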
Consider the following simple Humean argument against Theism:
Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.
Therefore, Theism is probably false.
This argument does not draw a conclusion comparing Theism to some rival. Rather, it draws a conclusion about Theism itself; in this way it is more ambitious than the argument just considered. What makes it a simple Humean argument is that it has only one premise—Central Claim. However, the argument is not valid as it stands, and there are several reasons for thinking it is not very strong. The next section explains those reasons and shows how to strengthen Humean arguments by adding premises to produce a better (and arguably valid) argument.
4. Strengthening Humean Arguments
Suppose that Central Claim is true. Then a rival hypothesis (Hypothesis of Indifference) to a hypothesis (Theism) does a much better job predicting some data (what we know about good and evil). However, that fact on its own might not make it reasonable to believe the rival hypothesis (Hypothesis of Indifference) or disbelieve the relevant hypothesis (Theism). For the rival hypothesis might have other problems such as being ad hoc or not predicting other data (compare Plantinga (1996)).
An analogy will be useful in explaining these points. Suppose I come home to find that one of the glass windows on the back door of my home has been broken. These facts are “data” that I want to explain. One hypothesis is that the kids next door were playing and accidentally broke the glass with a ball (Accident Hypothesis). A rival hypothesis is that a burglar broke into my home by breaking the glass (Burglar Hypothesis). Now the Burglar Hypothesis better predicts the data. If the burglar is going to break into my home, an effective way to do that is to break the glass on the door to thereby unlock the door. By contrast, the Accident Hypothesis does a worse job predicting the data. Even if the kids were playing, the ball might not hit my door. And even if the ball did hit the door, it might not hit the glass with enough force to break it. So, in this case, the rival hypothesis (Burglar Hypothesis) to a hypothesis (Accident Hypothesis) does a much better job predicting some data (the broken glass on my back door). Does it thereby follow that it is reasonable for me to believe the rival hypothesis (Burglar Hypothesis) or it is unreasonable for me to believe the hypothesis (Accident Hypothesis)?
No, or at least, not yet. First, the Burglar Hypothesis is much less simple than the Accident Hypothesis. I already know that there are kids next door who like to play outside. I do not already know that there is a burglar who wants to break into my home. So the Burglar Hypothesis is committed to the existence of more things than I already know about. That makes the Burglar Hypothesis less ontologically simple. Second, the Burglar Hypothesis might do a worse job predicting other data that I know. Suppose, for instance, there is a baseball rolling around inside my home, and nothing has been stolen. The Accident Hypothesis does a much better job predicting this data than the Burglar Hypothesis. So even if the Burglar Hypothesis better predicts some data, that on its own would not make it reasonable for me to believe the Burglar Hypothesis or to disbelieve the Accident Hypothesis.
Returning to Humean arguments, suppose Central Claim is true so that a rival to Theism, specifically the Hypothesis of Indifference, better predicts the good and evil we know about. It may not yet follow that it is reasonable to believe the Hypothesis of Indifference or disbelieve Theism. For it may be that the rival is much less simple than Theism. Or it may be that the rival to Theism does a much worse job predicting other data that we know about.
To strengthen Humean arguments, additional premises can be added (compare Dougherty and Draper (2013), Perrine and Wykstra (2014), Morriston (2014)). For instance, an additional premise might be Simplicity Claim: the Hypothesis of Indifference is just as simple, if not more so, than Theism. Another premise might be Not-Counterbalanced Claim: there is no body of data we know about that Theism does a much better job predicting than the Hypothesis of Indifference. The strengthened argument looks like this:
Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.
Simplicity Claim: the Hypothesis of Indifference is just as simple, if not more so, than Theism.
Not-Counterbalanced Claim: there is no body of data we know about that Theism does a much better job predicting than the Hypothesis of Indifference.
Therefore, Theism is probably false.
This argument is a stronger argument than the simple one-premise argument from the previous section. Arguably, it is valid. (Whether it is valid depends partly on the relationship between issues like simplicity and probability; but see Dougherty and Draper (2013: 69) for an argument that it is valid.)
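The remark about simplicity and probability can be made concrete with a Bayesian sketch. This formalization is not part of the argument as stated above, though Draper (1989) frames Central Claim in probabilistic terms; it is offered here only as one standard gloss:

```latex
% E  = the good and evil we know about
% T  = Theism;  HI = the Hypothesis of Indifference
% Central Claim, read probabilistically:
P(E \mid HI) \gg P(E \mid T)
% Odds form of Bayes' theorem:
\frac{P(HI \mid E)}{P(T \mid E)}
  \;=\; \underbrace{\frac{P(HI)}{P(T)}}_{\text{priors (simplicity)}}
  \;\times\;
  \underbrace{\frac{P(E \mid HI)}{P(E \mid T)}}_{\text{likelihoods (Central Claim)}}
% A large likelihood ratio lowers P(T|E) only if the prior ratio
% does not offset it, and only if other data do not pull the total
% likelihoods the other way.
```

On this gloss, Central Claim fixes the likelihood ratio, Simplicity Claim constrains the priors, and Not-Counterbalanced Claim rules out other bodies of data that would favor Theism, which is why the strengthened argument needs all three premises.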
Premises like Simplicity Claim and Not-Counterbalanced Claim are not always defended in discussion of arguments from evil. But they can be defended by pressing into service other work in the philosophy of religion. For instance, natural theologians try to provide evidence for the existence of God by appealing to facts we know about. Critics argue that such evidence does not support Theism or, perhaps, supports Theism only to a limited degree. These exchanges are relevant to evaluating Not-Counterbalanced Claim. To be sure, Humean arguments compare Theism to some rival. So other work in philosophy of religion might not straightforwardly apply if it does not consider a rival to Theism or considers a different rival than the one used in the relevant Humean argument.
These additional premises strengthen Humean arguments because Humean arguments are not logical or incompatibility arguments. That is, they do not claim that the good and evil we know about is logically inconsistent with Theism. Rather, they are abductive arguments. They claim that what we know about good and evil is evidence against Theism because some rival to Theism better predicts or explains it. But in evaluating how well a hypothesis explains some data, it is oftentimes important to also consider further facts about the hypothesis, such as how simple it is or whether it conflicts with other things we know.
Lastly, some might think that the relation between simple and strengthened Humean arguments is just a matter of whether we have considered some evidence against Theism or all the relevant evidence for and against Theism. After all, considering some evidence and considering all the evidence are two different tasks, and the first can be done without the second. However, the relation between simple and strengthened Humean arguments is a little more complex than that, for certain methodological reasons.
Each of the premises of a strengthened Humean argument involves a comparison of Theism with a specific rival to Theism. But the specific choice of the rival might make it easier to defend some of the comparisons while simultaneously making it harder to defend other comparisons. For instance, the Hypothesis of Indifference does not posit any entity that has the ability or desire to influence life on earth. Some defenders of Central Claim might use that feature to argue that the Hypothesis of Indifference has better predictive fit than Theism with regard to the good and evil we know about. But exactly because the Hypothesis of Indifference does not posit any entity that has the ability or desire to influence life on earth, it may have worse predictive fit when it comes to the fine-tuning of the universe, the existence of life at all, the existence of conscious organisms, the existence of moral agents, and other potential evidence. So picking the Hypothesis of Indifference might make it easier to defend some premises of a strengthened Humean argument (perhaps Central Claim) while also making it harder to defend other premises of a strengthened Humean argument (perhaps Not-Counterbalanced Claim).
As such, the relationship between a simple and strengthened Humean argument is more complex. It is not simply a matter of considering one potential pool of evidence and then considering a larger pool of evidence. Rather, the choice of a specific rival to Theism is relevant to an evaluation of both simple and strengthened Humean arguments. Some specific rivals might make it easier to defend a simple Humean argument while also making it harder to defend a strengthened Humean argument (or vice versa). Defenders of Humean arguments have to carefully choose a specific rival that balances simplicity and predictive strength to challenge Theism.
5. Criticisms of Humean Arguments
Like all philosophical arguments, Humean arguments have received their fair share of criticisms. This section describes a handful of criticisms and potential responses to them. These criticisms are all criticisms of Central Claim (or premises like it). Consequently, these objections could be lodged against simple Humean arguments and strengthened Humean arguments, as well as the “modest” Humean argument mentioned at the end of section 3. (For a discussion of historical responses to Hume’s writing on religion, see Pyle (2006: chapter 5).)
a. Objection 1: Limited Biological Roles
Some authors object to the biological role argument for Central Claim (Plantinga (1996), Dougherty and Draper (2013)). Consider the wide range of pleasure and pain we know about. For instance, I get pleasure out of reading a gripping novel, listening to a well-crafted musical album, or tasting the subtle flavors of a well-balanced curry. Likewise, consider the pain of self-sacrifice, the displeasure of a hard workout, or the frustration of seeing a coworker still fail to fill in standardized forms correctly. The objection is that pleasures and pains like these do not seem to serve any biological role.
Defenders of Humean arguments might respond in two ways. First, they might distinguish between the pleasure and pain of humans and of non-human animals. It might be that the pleasure and pain in non-human animals is much more likely to play a biological role than the pleasure and pain in humans. Thus, overall, pleasure and pain are more likely to play a biological role. Second, they might point out that Central Claim does not imply that the Hypothesis of Indifference does a good job explaining pleasure and pain. Rather, it implies that the Hypothesis of Indifference does a much better job than Theism. Thus, from the mere fact that some pleasures and pains do not seem to serve any biological roles it would not follow that Theism does a better job of predicting pleasure and pain than the Hypothesis of Indifference.
b. Objection 2: Naturalism and Normativity
Humean arguments maintain that what we know about good and evil is better predicted or explained by some rival to Theism than by Theism itself. On a simple understanding, what we know about good and evil includes claims like: it is bad that stray cats starve in the winter. However, some critics argue that the best explanation of the existence of good and evil is Theism itself. That is, they might argue that a purely naturalistic world, devoid of any supernatural reality, does a much worse job predicting the existence of good and evil than Theism does. The argument here is abductive: there might be no contradiction in claiming both that the world is purely naturalistic and that there is good and evil. Nonetheless, a purely naturalistic hypothesis does a much worse job predicting or explaining good and evil than Theism. Thus, these critics argue, premises like Central Claim are false, since Theism does a much better job of explaining the existence of good and evil than naturalistic alternatives to Theism (see Lauinger (2014) for an example of this criticism).
Note that this objection only applies to certain kinds of Humean arguments. Specifically, it only applies to Humean arguments that implicitly or explicitly assume a rival to Theism that is a purely naturalistic hypothesis. However, not all rivals to Theism need be a purely naturalistic hypothesis. For instance, some of the rivals that Philo considered are not purely naturalistic. Nonetheless, many contemporary authors do accept a purely naturalistic worldview and would compare that worldview with a Theistic one.
In response, defenders of Humean arguments might defend metaethical naturalism. According to metaethical naturalism, normative facts, including facts about good and evil, are natural facts. Defenders of Humean arguments might argue that given metaethical naturalism, a purely naturalistic worldview does predict, to a high degree, normative facts. Determining whether this response succeeds, though, would require a foray into complex issues in metaethics.
c. Objection 3: God’s Obligations
Many philosophers and ordinary people assume that if Theism is true, then God has certain obligations to us. For instance, God is obligated not to bring about evil for us for absolutely no reason at all. These obligations might be grounded in God’s nature or in some independent moral order. Either way, God is required to treat us in certain ways. The idea that if Theism is true, then God has certain obligations to us is a key idea in defending arguments from evil, including Humean arguments from evil. For instance, one of the defenses of Central Claim from above said that Theists might be surprised at the distribution of good and evil we know about. They might be surprised because they expect God to prevent that evil: God has an obligation to prevent it and, being all-powerful, God could prevent it. In this way, defenses of Central Claim (and premises like it) may implicitly assume that if Theism is true, then God has certain obligations to us.
However, some philosophers reject the claim that God has obligations to us (Adams (2013), Murphy (2017)). On these views, God might have justifying reasons to prevent evils and harms to us; but God does not have requiring reasons of the sort generated by obligations. There are different arguments for these views, and they are normally quite complex. But they typically articulate a conception of God on which God is not a moral agent in the same way an average human person is a moral agent. If God is not required to prevent evils and harms to us, God is closer to Hume’s “indifferent creator.” Just as an indifferent creator may, if they so desire, improve the lives of humans and animals, so too God may, if God so desires, improve the lives of humans and animals. But neither God nor the indifferent creator must do so.
Defenders of Humean arguments may respond to these arguments by simply criticizing these conceptions of God. Defenders of Humean arguments might argue that those conceptions are false or subtly incoherent. Alternatively, they might argue that those conceptions of God make it more difficult to challenge premises like Not-Counterbalanced Claim. For if God only has justifying reasons for treating us in certain ways, there might be a wide range of potential ways God would allow the world to be. But if there is a wide range of potential ways God would allow the world to be, then Theism does not make very specific predictions about how the world is. In this way, critics of Humean arguments may make it easier to challenge a premise like Central Claim but at the cost of making it harder to challenge a premise like Not-Counterbalanced Claim.
d. Objection 4: Skeptical Theism
Perhaps some of the most persistent critics of Humean arguments are skeptical theists (van Inwagen (1991), Bergmann (2009), Perrine and Wykstra (2014), Perrine (2019)). While there are many forms of skeptical theism, a unifying idea is that even if God were to exist, we should be skeptical of our ability to predict what the universe is like—including what the universe is like regarding good and evil. Skeptical theists develop and apply these ideas to a wide range of arguments against Theism, including Humean arguments.
Skeptical theistic critiques of Humean arguments can be quite complex. Here the critiques are simplified into two parts that form a simple modus tollens structure. The first part is to argue that there are certain claims that we cannot reasonably disbelieve or otherwise reasonably rule out. (In other words, we should be skeptical of their truth.) The second part is to argue that if we are reasonable in believing Central Claim (or something like it), then it is reasonable for us to disbelieve those claims. Since it is not reasonable for us to believe those claims, it follows that we are not reasonable in believing Central Claim (or something like it).
For the first part, consider a claim like this:
Limitation: God is unable to create a world with a better balance of good and evil without sacrificing other morally significant goods.
Skeptical theists argue that it is not reasonable for us to believe that Limitation is false; rather, we should be skeptical of its truth or falsity. One might argue that it is reasonable for us to believe that Limitation is false because it is hard for us to identify the relevant morally significant goods. But skeptical theists argue that this is a poor reason for disbelieving Limitation since God is likely to have created the world with many morally significant goods that are obscure to us. One might argue that it is reasonable for us to believe that Limitation is false because it is easy for us to imagine or conceive of a world in which it is false. But skeptical theists argue that this is a poor reason for disbelieving Limitation because conceivability is an unreliable guide to possibility when it comes to such complex claims like Limitation. In general, skeptical theists argue that our grasp of the goods and evils there are, as well as how they are connected, is too poor for us to reasonably disbelieve something like Limitation. In this way, they are skeptical of our access to all of the reasons God might have that are relevant to the permission of evil.
The second part of the skeptical theist’s critique is that if it is not reasonable for us to believe Limitation is false, then it is not reasonable for us to believe Central Claim is true. This part of the skeptical theist’s critique may seem surprising. Central Claim is a comparison between two hypotheses. Limitation is not comparative. Nonetheless, skeptical theists think they are importantly related. To see how they might relate, an analogy might be useful.
Suppose Keith is a caring doctor. How likely is it that Keith will cut a patient with a scalpel? At first blush, it might seem that it is extremely unlikely. Caring doctors do not cut people with scalpels! But on second thought, it is natural to think that whether Keith will cut a patient with a scalpel depends upon the kinds of reasons Keith has. If Keith has no compelling medical reason to do so, then given that Keith is a caring doctor, it is extremely unlikely Keith will cut a patient with a scalpel. But if Keith does have a compelling reason—he is performing surgery or a biopsy, for instance—then even if Keith is a caring doctor, it is extremely likely he will cut a patient with a scalpel. Now suppose someone claims that Keith will not cut a patient with a scalpel. That person is committed to a further claim: that Keith lacks a compelling medical reason to cut the patient with a scalpel. After all, even a caring doctor will cut a patient with a scalpel if there is a compelling medical reason to do so.
So, reconsider:
Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.
There are several arguments one can give for Central Claim. But most of them utilize a simple idea: if Theism is true, there is a God who has reason for preventing the suffering and evil we know about, but if the Hypothesis of Indifference is true, there is no creator with such reasons. But, skeptical theists claim, God might have reasons for permitting suffering and evil if by doing so God can achieve other morally significant goods. Thus, to claim that God would prevent the suffering and evil we know about assumes that God could create a world with a better balance of good and evil without sacrificing other morally significant goods. (Compare: to claim that Keith, the kindly doctor, would not cut a patient with a scalpel assumes that Keith lacks a compelling medical reason to cut the patient with a scalpel.) Thus, if it is reasonable for us to believe Central Claim, it must also be reasonable for us to disbelieve:
Limitation: God is unable to create a world with a better balance of good and evil without sacrificing other morally significant goods.
After all, God might create a world with this balance of good and evil if it were necessary for other morally significant goods. But at this point, the first part of the skeptical theistic critique is relevant. For the skeptical theist claims that it is not reasonable for us to disbelieve Limitation. To do that, we would have to have a better understanding of the relationship between goods and evils than we do. Since it is not reasonable for us to reject Limitation, it is not reasonable for us to accept Central Claim.
As indicated earlier, the skeptical theist’s critique is quite complex. Nonetheless, some defenders of Humean arguments think that the criticism fails because the reasons skeptical theists give for doubting Central Claim can be offset or cancelled out. The defenders of Humean arguments reason by parity here. Suppose that the skeptical theist is right and that, for all we know, God could not have created a better balance of good and evil without sacrificing other morally significant goods. And suppose that the skeptical theist is right that this gives us a reason for doubting Central Claim. Well, that skepticism cuts both ways. For all we know, God could have created a better balance of good and evil without sacrificing other morally significant goods. By parity, that gives us a reason for accepting Central Claim. Thus, the skepticism of skeptical theism gives us both a reason for doubting Central Claim and a reason for accepting it. These reasons offset or cancel each other out. But once we set aside these offsetting reasons, we are still left with strong reasons for accepting Central Claim, namely, the reasons given by the arguments of section 2. So, the skeptical theist’s critique does not ultimately succeed.
6. References and Further Reading
Adams, Marilyn McCord. (1999). Horrendous Evils and the Goodness of God. Cornell University Press.
Develops and responds to an argument from evil based on horrendous evils.
Adams, Marilyn McCord. (2013). “Ignorance, Instrumentality, Compensation, and the Problem of Evil.” Sophia. 52: 7-26.
Argues that God does not have obligations to us to prevent evil.
Bergmann, Michael. (2009). “Skeptical Theism and the Problem of Evil.” In Thomas Flint and Michael Rea, eds., The Oxford Handbook of Philosophical Theology. Oxford University Press.
A general introduction to skeptical theism that also briefly criticizes Humean arguments.
Hume, David. (1779). Dialogues Concerning Natural Religion, part XI.
The original presentation of a Humean argument.
Dougherty, Trent and Paul Draper. (2013). “Explanation and the Problem of Evil.” In Justin McBrayer and Daniel Howard-Snyder, eds., The Blackwell Companion to the Problem of Evil. Blackwell Publishing.
A debate on Humean arguments.
Draper, Paul. (1989). “Pain and Pleasure: An Evidential Problem for Theists.” Nous. 23: 331-350.
A classic modern presentation of a Humean argument.
Draper, Paul. (2013). “The Limitation of Pure Skeptical Theism.” Res Philosophica. 90.1: 97-111.
A defense of Humean arguments from skeptical theistic critiques.
Draper, Paul. (2017). “Evil and the God of Abraham, Anselm, and Murphy.” Religious Studies. 53: 564-72.
A defense of Humean arguments from the criticism that God lacks obligations to us.
Lauinger, William. (2014). “The Neutralization of Draper-Style Evidential Arguments from Evil.” Faith and Philosophy. 31.3: 303-324.
A critique of Humean arguments that good and evil better fit with Theism than naturalism.
Martin, Michael. (1978). “Is Evil Evidence Against the Existence of God?” Mind. 87.347: 429-432.
A brief argument that our inability to see God’s reasons for permitting suffering is evidence against Theism.
Morriston, Wes. (2014). “Skeptical Demonism: A Failed Response to a Humean Challenge.” In Trent Dougherty and Justin McBrayer, eds., Skeptical Theism. Oxford University Press.
A defense of a Humean argument from Skeptical Theism.
Murphy, Mark. (2017). God’s Own Ethics. Oxford University Press.
A criticism of Humean arguments from the claim that God lacks obligations to us.
O’Connor, David. (2001). Hume on Religion. Routledge Press, chapter 9.
A modern discussion of Philo’s argument from evil that discusses the weak and strong interpretations.
Perrine, Timothy and Stephen Wykstra. (2014). “Skeptical Theism, Abductive Atheology, and Theory Versioning.” In Trent Dougherty and Justin McBrayer, eds., Skeptical Theism. Oxford University Press.
A skeptical theistic critique of Humean arguments, focusing on the methodology of the arguments.
Perrine, Timothy. (2019). “Skeptical Theism and Morriston’s Humean Argument from Evil.” Sophia. 58: 115-135.
A skeptical theistic critique of Humean arguments that defends them from the offsetting objection.
Pitson, Tony. (2008). “The Miseries of Life: Hume and the Problem of Evil.” Hume Studies. 34.1: 89-114.
A historical discussion of Hume’s views on the relation between the problem of evil and natural theology and atheology.
Plantinga, Alvin. (1996). “On Being Evidentially Challenged.” In Daniel Howard-Snyder, ed., The Evidential Argument From Evil. Indiana University Press.
An argument that Humean arguments need to be strengthened to be cogent.
Pyle, Andrew. (2006). Hume’s Dialogues Concerning Natural Religion: A Reader’s Guide. Continuum.
A modern commentary on Hume’s Dialogues that provides a discussion of its historical place and reception.
Van Inwagen, Peter. (1991 [1996]). “The Problem of Evil, the Problem of Air, and the Problem of Silence.” Reprinted in Daniel Howard-Snyder, ed., The Evidential Argument From Evil. Indiana University Press.
An earlier skeptical theistic critique of Humean arguments.
This article is about nothing. It is not the case that there is no thing that the article is about; nevertheless, the article does indeed explore the absence of referents as well as referring to absence. Nothing is said to have many extraordinary properties, but in predicating anything of nothingness we risk contradicting ourselves. In trying to avoid such misleading descriptions, nothingness could be theorised as ineffable, though that theorisation is itself an attempt to describe it. Maybe nothingness is dialetheic, or maybe there are no things that are dialetheic, since contradictions are infamous for leading to absurdity. Contradictions and nothingness can explode very quickly into infinity, giving us everything out of nothing. So, perhaps nothing is something after all.
This article considers different metaphysical and logical understandings of nothingness via an analysis of the presence/absence distinction, by considering nothing first as the presence of absence, second as the absence of presence, third as both a presence and an absence, and fourth as neither a presence nor an absence. In short, it analyses nothingness as a noun, a quantifier, a verb, and a place, and it postulates nothingness as a presence, an absence, both, and neither.
To see the ambiguity at work, consider two readings of the first sentence of this article:
(i) This article is about no-thing (in that there is no thing that this article is about).
(ii) This article is about Nothing (in that there is something that this article is about).
The first reading (i) is a quantificational reading about the (lack of) quantity of things that this article is about. ‘Quantificational’ comes from ‘quantifier’, where a quantifier is a quantity term that ranges over entities of a certain kind. In (i), the quantity is none, and the entities that there are none of are things. This reading is referred to throughout the article as ‘no-thing’ (hyphenated, rather than the ambiguous ‘nothing’) to highlight this absence of things. The second reading (ii) is a noun phrase about the identity of the thing that this article is about. This reading is referred to throughout the article as ‘Nothing’ (capitalised, again avoiding the ambiguous ‘nothing’) to highlight the presence of a thing. In going from (i) to (ii), we have made a noun out of a quantity (a process we can call ‘nounification’). We have given a name to the absence, Nothing, giving it a presence. Sometimes this presence is referred to as ‘nothingness’, but that locution is avoided here since usually the ‘-ness’ suffix in other contexts indicates a quality or way of being, rather than a being itself (compare the redness of a thing to red as a thing, for example), and as such ‘nothingness’ is reserved for describing the nothing-y state of the presence Nothing and the absence no-thing.
It is important not to conflate these readings, and they cannot be reduced to one another. To demonstrate their distinctness, consider that (i) and (ii) have different truth values: (ii) is true whilst (i) is false. It is not the case that this article is not about anything (that is, it is not the case that for any x whatsoever there is no x that this article is about). Were (i) true, the article would be very short indeed (or even empty), bereft of a topic and perhaps bereft of meaning. I intend to do better than that. My intentional states are directed towards Nothing, hence the truth of (ii): there is indeed a topic of this article, and that topic, the subject or even object of it, is Nothing.
There has been much debate over whether it is legitimate to nounify the quantificational reading of no-thing. Those who are sceptical would say that the ambiguous ‘nothing’ is really not ambiguous at all and should only be understood as a (lack of) quantity, rather than a thing itself. They might further argue that it is just a slip of language that confuses us into taking Nothing to be a thing, and that some of the so-called paradoxes of nothingness arise from illegitimate nounification and, once that nounification is exposed, dissolve into mere linguistic confusions. The dialogues between characters in Lewis Carroll’s Alice in Wonderland and Through the Looking Glass are often cited as exemplars of such slippage and confusion. For instance [with my own commentary in square brackets]:
“‘I see nobody [that is, no-body as a quantifier] on the road’, said Alice.
‘I only wish I had such eyes’, the King remarked in a fretful tone.
‘To be able to see Nobody! [that is, Nobody as a noun] And at that distance too! Why, it’s as much as I can do to see real people [that is, somebodyness, rather than nobodyness, as states], by this light!’” (1871, p. 234)
Here, the term under consideration is ‘nobody’, and the same treatment applies to this as ‘nothing’ (in that we can disambiguate ‘nobody’ into the quantificational no-body and nounified Nobody). Alice intended to convey that there were no-bodies (an absence of presence) in quantitative terms. But the King then nounifies the quantifier, moving to a presence of absence, and applauds Alice on her apparent capacity to see Nobody.
Making this shift from things to bodies is helpful because bodies are less abstract than things (presumably you are reading this article using your body, your family members have bodies, animals have bodies, and so you have an intuitive understanding of what a body is). Once we have determined what is going on with no-body and every-body, we can apply it to no-thing and every-thing. So, consider now ‘everybody’. When understood as a quantifier, every-body is taken to mean all the bodies in the relevant domain of quantification (where a domain of quantification can be understood as the selection of entities that our quantifier terms range over). Do all those bodies, together, create the referent of Everybody as a noun? In other words, does Everybody as a noun refer to all the bodies within the quantitative every-body? One of the mistakes made by the likes of the King is to treat the referent of the noun as itself an instance of the type of entity the quantifier term is quantifying over. This is clear with respect to bodies, as Everybody is not the right sort of entity to be a body itself. All those bodies, together, do not themselves constitute a body (unless your understanding of what a body is can accommodate such a conglomerate monster). Likewise, Nobody, when understood alongside its quantifier reading of no-body as a lack of bodies, is not itself a body (as, by definition, it has no bodies). So the King, who is able to see only ‘real people’, makes a category mistake in taking Nobody to be, presumably, an ‘unreal person’. Nobody and Everybody are quite simply not the right category of entity to instantiate or exemplify people-hood or bodyness, or to be bodies themselves.
The lesson we have learnt from considering ‘nobody’ is that nounifying the quantifier (no-body) does not create an entity (Nobody) of the kind that is being quantified over (bodies). So, returning to the more general terms ‘nothing’ and ‘everything’, are they the right kind of entities to be things themselves? Do Nothing and Everything, as nouns, refer to things, the same category of thing that their quantifier readings of no-thing and every-thing quantify over? The level of generality we are working with when talking of things makes it more difficult to diagnose what is going on in these cases (by comparison with Nobody and Everybody, for example).
To help, we can apply the lessons learnt from Alfred Tarski (1944), in so far as when talking of these entities as things we are doing so within a higher order or level of language—a metalanguage—in order to avoid paradox. We can see how this works with the Liar Paradox. Consider the following sentence, call it ‘S’: ‘This sentence is false’. Now consider a second sentence, call it ‘S*’: ‘S is true’. If S (and thereby also S*) is true, then S says of itself that it is false (given that S literally states ‘This sentence is false’, which, if true, says that S is false). On the other hand, if S (and thereby also S*) is false, then S turns out to be true (again, given that S literally states ‘This sentence is false’, which, if false, says something true). Tarski’s trick is to say that S and S* are in different levels of language. By distinguishing the level of language that S is talking in when it says it ‘… is false’ from the level of language that S* is talking in when it says that S ‘is true’, we avoid the contradiction of having S be true and false at the same time within the same level. S is in the first level or order of language—the object language—and when we talk about S we ascend to a higher level or order of language—the metalanguage. As such, the truth and falsity appealed to in S are of the object language, and the truth and falsity appealed to in S* are of the metalanguage.
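Tarski’s stratification can be made vivid with a small toy model (my own construction for illustration, not Tarski’s formalism): each sentence is tagged with the level of language it belongs to, and a truth predicate indexed to level n may only be applied to sentences from a strictly lower level, so no sentence can assert its own falsity.

```python
# Toy model (illustrative only): sentences carry a language level, and
# truth predicates apply only to sentences below their own level.

class Sentence:
    def __init__(self, level: int, value: bool):
        self.level = level    # 0 = object language, 1 = metalanguage, ...
        self.value = value    # the truth value the model assigns to it

def true_at(meta_level: int, sentence: Sentence) -> bool:
    # Tarski's restriction: 'is true' at level n applies only below level n.
    if sentence.level >= meta_level:
        raise ValueError("a truth predicate cannot apply at its own level")
    return sentence.value

S = Sentence(level=0, value=False)   # an object-language sentence
print(true_at(1, S))                 # S*, 'S is true', said in the metalanguage: False

# The liar's self-application is blocked rather than contradictory:
# true_at(0, S) raises ValueError, since S cannot speak about its own level.
```

The point of the sketch is only that the hierarchy converts the liar’s contradiction into an ill-formed (here, rejected) application of the truth predicate.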
Applying Tarski’s trick to Nothing, perhaps Nothing cannot be considered a thing at the same level as the things it is not, just as Everything cannot be considered a thing at the same level as all the things it encapsulates. As quantifier terms, no-thing and every-thing quantify over things in the first level or order of the object language. As nouns, Nothing and Everything can only be considered things themselves in the higher level or order of the metalanguage, which speaks about the object language. The ‘things’ (or lack of) quantified over by every-thing and no-thing are of the object language, whereas the type of ‘thing’ that Everything and Nothing are are of the metalanguage. This avoids Nothing being a thing of the same type that there are no-things of.
Finally, then, with such terminology and distinctions in hand, we are now in a position to understand the difference between the presence of an absence (Nothing, noun), and the absence of a presence (no-thing, quantifier). Lumped into these two theoretical categories are the related positions of referring to a non-existing thing and the failure to refer to any thing at all (which whilst there are important variations, there are illuminating similarities that justify their shared treatment). Each of these approaches in turn are explored before describing other ways in which one can derive (and attempt to avoid deriving) the existence of some-thing from no-thing.
2. Nothing as Presence of Absence
When we sing that childhood song, ‘There’s a hole in my bucket, dear Liza’, the lyrics can be interpreted as straightforwardly meaning that there really is, there really exists, a hole in the bucket, and it is to that hole that the lyrics refer. Extrapolating existence in this sort of way from our language is a Quinean (inspired by the work of W. V. O. Quine) criterion for deriving ontological commitments, and specifically Quine argued that we should take to exist what our best scientific theories refer to. Much of our language is about things, and according to the principle of intentionality, so are our thoughts, in that they are directed towards or refer to things. (Of course, not all language and thought point to things: for example, in the lyrics above, the words ‘a’ and ‘in’ do not pick out entities in the way that ‘bucket’ and ‘Liza’ do. The question is whether ‘hole’ and ‘nothing’ function more like nonreferential ‘a’ and ‘in’ or referential ‘bucket’ and ‘Liza’.)
In our perceptual experiences and in our languages and theories we can find many examples of seeming references to nothingness, including to holes, gaps, lacks, losses, absences, silences, voids, vacancies, emptiness, and space. If we take such experiences, thoughts, and language at face value, then nothingness, in its various forms, is a genuine feature of reality. Jean-Paul Sartre is in this camp, and, in Being and Nothingness, he argues that absences can be the objects of judgements. Famously, Sartre described the situation in which he arrived late for his appointment with Pierre at a café, and ‘sees’ the absence of Pierre (because Pierre is who he is expecting to see, and the absence of Pierre frustrates that expectation and creates a presence of that absence—Sartre does not also ‘see’ the absence of me, because he was not expecting to see me). Relatedly, and perhaps more infamously, Alexius Meinong takes non-existent things to have some form of Being, such that they are to be included in our ontology, though Meinongians—those inspired by Meinong—disagree on what things specifically should be taken as non-existent.
So, what things should we take to exist? Consider the Eleatic principle, which states that only causes are real. Using this principle, Leucippus noted that voids have causal power, and generalised that non-beings are causally efficacious, and so just as real as atoms and beings in general. When we sing Henry’s complaints to dear Liza that the water is leaking from his bucket, the hole is blamed as the cause of this leakage, and from this we might deduce the hole’s existence (the presence of an absence with causal powers). Similarly, we might interpret Taoists as believing that a wide variety of absences can be causes (for example, by doing no-thing—or as little as possible to minimise disruption to the natural way of the Tao—which is considered the best course of ‘(in)action’), and as such are part of our reality. As James Legge has translated from the Tao Te Ching: “Vacancy, stillness, placidity, tastelessness, quietude, silence, and non-action, this is the level of heaven and earth, and the perfection of the Tao and its characteristics” (1891 p13).
Roy Sorensen (2022) has gone to great lengths to describe the ontological status of various nothings, and his book on ‘Nothing’ (aptly named Nothing) opens with an interesting case from when the Mona Lisa was stolen from the Louvre in Paris. Apparently, at the time, more Parisians visited the Louvre to ‘see’ the absence of the Mona Lisa than had come to see its presence, and the ‘wall of shame’ where the Mona Lisa once hung was kept vacant for weeks to accommodate demand. The Parisians regarded this presence of the absence of the Mona Lisa as something that could be photographed, and they aimed to get a good view of this presence of absence for such a photo, complaining that they could not ‘see’ if their view was obstructed. Applying the Eleatic principle, the principle of intentionality, a criterion for ontological commitment, or other such metaphysical tests to this scenario (as with Sartre’s scenario) may provide a theoretical basis for interpreting the ‘object’ of the Parisians’ hype (and the missing Pierre) as a presence of absence (of presence)—a thing, specifically, a Nothing.
Interpreting Nothing as a presence of absence requires us to understand Nothing as a noun that picks out such a presence of absence. If there is no such presence of this nothingness, and instead such a state is simply describing where something is not, then it is to be understood as an absence of presence via a quantificational reading of there being no-thing that there is. It can be argued that the burden of proof is on the latter position, which denies Nothing as a noun, to argue that there is only absence of a presence rather than a presence of absence. Therefore, in what follows, we pay close attention to this sceptical view to determine whether we can get away with nothingness as an absence, where there is no-thing, rather than there being a presence of Nothing as our language and experience seem to suggest.
3. No-thing as Absence of Presence
Returning to Liza and that leaking bucket, instead of there being a hole in the bucket, we could reinterpret the situation as the bucket having a certain perforated shape. Rather than there being a presence of a hole (where the hole is an absence), we could say that there is an absence of bucket (where the bucket is a presence) at the site of the leaking water. Such a strategy can be used not only to avoid the existence of holes as things themselves, but also to reinterpret other negative states in positive ways. For example, Aristotle, like Leucippus, argues from the Eleatic principle in saying that omissions can be causes, but to avoid the existence of omissions themselves this seeming causation-by-absence must be redescribed within the framework of Being. As such, negative nothings are just placeholders for positive somethings.
We can see a parallel move happen with Augustine, who treats Nothing as a linguistic confusion—where others took there to be negative things (presences of an absence), Augustine redescribed those negative things as mere lacks of positive things (absences of a presence). For example, Mani thought ‘evil’ names a substance, but Augustine says ‘evil’ names an absence of goodness just as ‘cold’ names the absence of heat. Saying that evil exists is as misleading as saying cold exists, as absences are mere privations, and privations of presences specifically. Adeodatus and his father argue similarly: Adeodatus says ‘nihil’ refers to what is not, and in response his father says that to refer to what is not is simply to fail to refer (see Sorensen 2022 p175). This interpretation of language is speculated to have been imported from Arab grammarians and to have been influenced by Indian languages, where negative statements such as ‘Ostriches do not fly’ are understood as metacognitive remarks warning us not to believe in ostrich flight, rather than as descriptions of the non-flight of ostriches (again see Sorensen 2022 p176 and p181).
Bertrand Russell attempted to generalise this interpretation of negative statements by reducing all negative truths to positive truths (1985). For example, he tried to paraphrase ‘the cat is not on the mat’ as ‘there is a state of affairs incompatible with the cat being on the mat’. But of course, this paraphrase still makes use of negation, since ‘incompatible’ simply means ‘not compatible’, and even when he tried to model ‘not p’ as an expression of ‘disbelief that p’, this too requires negation in the form of believing that something is not the case (or not believing that something is the case). This ineliminability of negation, and of the negative facts we find it in, meant that Russell eventually abandoned this project and (in a famous lecture at Harvard) conceded that irreducibly negative facts exist. Dorothy Wrinch (1918) jests at the self-refuting nature of such positions that try to eliminate the negative, by saying that it is “a little unwise to base a theory on such a disputable point as the non-existence of negative facts”. So can we eliminate Nothing in favour of no-thing? Can we, like Russell, try to avoid the presence of negative absences like Nothing, and instead only appeal to the absence of positive presences like no-thing? Can we escape commitment to the new thing created by nounifying no-thing into Nothing? That is, can no-thing do all the work that Nothing does? Consider various strategies.
a. Eliminating Negation
Despite Russell’s attempt, it seems we cannot eliminate negative facts from our natural language. But from the point of view of formal languages, like that of logic, negation is in fact dispensable. Take, for example, the pioneering work of Christine Ladd-Franklin. In 1883, her dissertation put forward an entire logical system based on exclusion, in which she coined the NAND operator, read ‘not … and not …’, or ‘neither … nor …’. This closely resembles the work of Henry Sheffer, who later, in 1913, demonstrated that all of the logical connectives can be defined in terms of the dual of disjunction, which he named NOR (short for NOT OR, ‘neither … nor …’), or the dual of conjunction, which was (confusingly) named NAND (short for NOT AND, ‘either not … or not …’) and has come to be known as the Sheffer stroke. The Sheffer stroke, like Ladd-Franklin’s earlier operator, does away with the need for a symbolic representation of negation. Another example of such a method is found in Alonzo Church’s formal language, in which the propositional constant f is stipulated to always be false (1956, §10), and negation can then be defined in terms of it: ~A =df A → f. If we can do away with formal negation, then perhaps this mirrors the possibility of doing away with informal negation, including Nothing.
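The dispensability of a negation symbol can be verified by brute truth-table check. The sketch below is my own construction (not Ladd-Franklin’s or Sheffer’s notation): it takes NAND as the sole primitive connective and recovers negation, conjunction, and disjunction from it, and it also replays Church’s definition of negation from the always-false constant f.

```python
# A minimal sketch: NAND as the sole primitive connective.
# (Python's own `not`/`and` appear only to implement the primitive's
# truth table; the defined connectives use nothing but `nand`.)

def nand(a: bool, b: bool) -> bool:
    return not (a and b)

def not_(a):                      # ~A      =  A NAND A
    return nand(a, a)

def and_(a, b):                   # A & B   =  ~(A NAND B)
    return not_(nand(a, b))

def or_(a, b):                    # A v B   =  ~A NAND ~B
    return nand(not_(a), not_(b))

# Church's route: a propositional constant f, stipulated always false,
# plus the conditional, defines negation as ~A =df A -> f.
f = False

def implies(a, b):
    return (not a) or b

def not_church(a):                # ~A  =df  A -> f
    return implies(a, f)

# Exhaustive check over all truth-value assignments:
for a in (True, False):
    assert not_(a) == (not a)
    assert not_church(a) == (not a)
    for b in (True, False):
        assert and_(a, b) == (a and b)
        assert or_(a, b) == (a or b)
print("every connective recovered without a primitive negation symbol")
```

Since the truth tables are finite, the loop at the end is a complete proof of adequacy for these definitions.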
An issue with using this general method of escaping negative reality concerns what are known as ‘true negative existentials’ (for example, ‘Pegasus does not exist’). Using Sheffer’s NAND, this becomes ‘Pegasus exists NAND Pegasus exists’, read ‘either it is not the case that Pegasus exists or it is not the case that Pegasus exists’, which we would want to be true. But in a classical setting where every name must denote, each side of the NAND (that is, ‘Pegasus exists’) comes out true, and so the NAND sentence will not be true. As we shall see, this is a persistent problem which has motivated many alternatives to the classical logic setup.
Another issue concerns whether the concept of negation has really been translated away in these cases, or whether negation has just become embedded elsewhere within the formal language, under the guise of some sort of falsehood, ever present in the interpretation. This questioning of the priority of the concept of negation was put forward by Martin Heidegger, who asks: “Is there Nothing only because there is ‘not’, i.e. negation? Or is it the other way round? Is there negation and ‘not’ only because there is Nothing?” (1929 p12) Heidegger’s answer is that “‘Nothing’ is prior to ‘not’ and negation” (ibid.), and so whilst ‘not’ and negation may be conceptually eliminable because they are not primitive, ‘Nothing’ cannot be so. Try as we might to rid ourselves of Nothing, we will fail, even if we succeed in ridding our formal language of ‘not’ and negation. We shall now turn to more of these eliminative methods.
b. Eliminating True Negative Existentials
The riddle, or paradox, of non-being describes the problem of true negative existentials, where propositions like ‘Pegasus does not exist’ are true but seem to bring with them some commitment to an entity ‘Pegasus’. As we learn from Plato’s Parmenides, “Non-being is… being something that is not, – if it’s going not to be” (1996 p81). It is thus self-defeating to say that something, like Pegasus, does not exist, and so it is impossible to speak of what there is not (but even this very argument negates itself). What do we do in such a predicament?
In the seminal paper ‘On What There Is’ (1948), Quine described this riddle of non-being as ‘Plato’s Beard’—overgrown, full of non-entities beyond necessity, to be shaved off with Ockham’s Razor. The problem arises because we bring a thing into existence in order to deny its existence. It is as if we are pointing towards something, and accusing what we are pointing at of not being there to be pointed at. This is reflected in the classical logic that Quine endorsed, where both ‘there is’ and ‘there exists’ are expressed by means of the ‘existential quantifier’ (∃), which is, consequently, interpreted as having ontological import. As a result, such formal systems render the statement ‘There is something that does not exist’ false, nonsensical, inexpressible, or contradictory. How can we get around this issue, in order to rescue the truth of negative existentials like ‘Pegasus does not exist’ without formalising it as ‘Pegasus—an existent thing—does not exist’?
This issue closely resembles the paradox of understanding Nothing—in referring to nothingness as if it were something. As Thales argues, thinking about nothing makes it something, so there can only truly be nothing if there is no one to contemplate it (see Frank Close 2009 p5). The very act of contemplation, or the very act of referring, brings something into existence, and turns no-thing into some-thing, which is self-defeating for the purposes of acknowledging an absence or denying existence. In his entry on ‘Nothingness’ in The Oxford Companion to the Mind, Oliver Sacks summarises the difficulty in the following way: “How can one describe nothingness, not-being, nonentity, when there is, literally, nothing to describe?” (1987 p564)
c. Eliminating Referring Terms
Bertrand Russell (1905) provides a way to ‘describe nothingness’ by removing the referent from definite descriptions. Russell analyses true negative existentials such as ‘The present King of France does not exist’ as ‘It is not the case that there is exactly one present King of France and all present Kings of France exist’. By transforming definite descriptions into quantitative terms, we do not end up referring to an entity in order to deny its existence—rather, the lack of an entity that meets the description ensures the truth of the negative existential. Quine (1948) takes this method a step further by rendering all names as disguised descriptions, and thereby analyses ‘Pegasus does not exist’ as more accurately reading ‘The thing that pegasizes does not exist’. Such paraphrasing away of referring devices removes the problem of pointing to an entity when asserting its nonexistence, thereby eliminating the problem of true negative existentials.
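Russell’s analysis can be checked mechanically. Below is a small sketch over a toy finite domain (the domain members and predicates are my own illustrative choices): ‘The F exists’ is rendered as ‘there is exactly one F’, so ‘The F does not exist’ is its negation, which comes out true when nothing satisfies the description, with no referent required.

```python
# Sketch of Russell's analysis of definite descriptions (illustrative domain).
domain = ["Scott", "Macron"]

def king_of_france(x):          # no thing satisfies this description
    return False

def author_of_waverley(x):      # exactly one thing satisfies this one
    return x == "Scott"

def the_F_exists(F):
    # ∃x(Fx & ∀y(Fy → y = x)): there is exactly one F in the domain
    return any(F(x) and all((not F(y)) or y == x for y in domain)
               for x in domain)

# 'The present King of France does not exist': true, without referring
# to any King of France in order to deny his existence.
print(not the_F_exists(king_of_france))      # True
# By contrast, 'The author of Waverley does not exist' comes out false.
print(not the_F_exists(author_of_waverley))  # False
```

The negative existential is made true by the lack of a satisfier, exactly as Russell’s paraphrase intends.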
However, such methods are not without criticism, with some claiming that their resolutions are worse than the problems they were initially trying to resolve. As Karel Lambert argues, they come with their own problems and place “undue weight both on Russell’s controversial theory of descriptions as the correct analysis of definite descriptions and on the validity of Quine’s elimination of grammatically proper names” (1967 p137). Lambert proposes that, instead of ridding language of singular terms via these questionable means, one could rid singular terms of their ontological import. He creates a system of ‘free logic’ whereby singular terms like names need not refer in order to be meaningful, and propositions containing such empty terms can indeed be true. Therefore, ‘Pegasus does not exist’ may be meaningful and true even whilst ‘Pegasus’ does not refer, without contradiction or fancy footwork via paraphrasing into definite descriptions and quantificational statements.
Lambert (1963) also insists that such a move to free logic is required in order to prevent getting something from nothing in classical logic, when we derive an existential claim from a corresponding universal claim where the predicate in use is not true of anything in the domain. This happens when we infer according to the rule of ‘Universal Instantiation’ whereby what is true of all things is true of some (or particular) things, for example:
∀x(Fx → Gx)
∴ ∃x(Fx & Gx)
If no thing in the domain is F, then hypothesizing that all Fs are Gs leads to inferring that some Fs are Gs, thereby deriving an x that is F and G from a domain that contained no F to start with. Rather than restricting, ad hoc, the validity of such inferences to domains that contain at least one F (or that are, more generally, simply not empty), Lambert instead proposes his system of free logic, where there need not be a thing in the domain for statements to be true.
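The trouble can be seen by brute evaluation. In the sketch below (the domain and the extensions of F and G are my own illustrative choices), nothing in the domain is F, so the universal premise is vacuously true while the existential conclusion is false: the inference would conjure an F out of a domain containing none.

```python
# Evaluating the two quantified formulas over a toy domain where the
# extension of F is empty (illustrative extensions).
domain = ["a", "b", "c"]
F = set()        # no thing in the domain is F
G = {"a"}        # the extension of G

# ∀x(Fx → Gx): vacuously true, since there are no Fs to check
premise = all((x not in F) or (x in G) for x in domain)

# ∃x(Fx & Gx): false, since no thing is both F and G
conclusion = any((x in F) and (x in G) for x in domain)

print(premise)      # True
print(conclusion)   # False
```

The gap between the vacuously true premise and the false conclusion is exactly the ‘something from nothing’ that Lambert’s free logic is designed to block.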
But what about Nothing? Is ‘Nothing’ a referring term? For Rudolf Carnap, asking such a question is “based on the mistake of employing the word ‘nothing’ as a noun, because in ordinary language it is customary to use it in this form in order to construct negative existential statements… [E]ven if it were admissible to use ‘nothing’ as a name or description of an entity, still the existence of this entity would be denied by its very definition” (1959 p70). Many have argued against the first part of Carnap’s argument, to show that there are occurrences of ‘Nothing’ as a noun which cannot be understood in quantificational terms or as the null object without at least some loss of meaning (see, for example, Casati and Fujikawa 2019). Nevertheless, many have agreed with the second part of Carnap’s argument that even as a noun ‘Nothing’ would fail to refer to an existent thing (see, for example, Oliver and Smiley 2013). But if Nothing does not refer to an existent thing, what then is this encyclopaedia article about?
As Maria Reicher (2022) states, “One of the difficulties of this solution, however, is to give an account of what makes such sentences true, i.e., of what their truthmakers are (given the principle that, for every true sentence, there is something in the world that makes it true, i.e., something that is the sentence’s truthmaker).” The truthmaker of my opening sentence ‘This article is about nothing’ might then be that Nothing is what this article is about, even when Nothing is the name for the nounified no-thing. The problematic situation we seem to find ourselves in is this: Without an entity that the statement is about, the statement lacks a truthmaker; but with an entity that the statement is about, the statement becomes self-refuting in denying that very entity’s existence. But there is another option. ‘Nothing’ may not refer to an existent thing, yet this need not entail the lack of a referent altogether, because instead perhaps ‘Nothing’ refers to a non-existent thing, as we shall now explore.
d. Eliminating Existentially Loaded Quantification
Meinong’s ‘Theory of Objects’ (1904) explains how we can speak meaningfully and truthfully about entities that do not exist. Meinongians believe that we can refer to non-existent things, and talk of them truthfully, due to quantifying over them and having them as members in our domains of quantification. When we speak of non-existent things, then, our talk refers to entities in the domain that are non-existent things. So it is not that our language can be true without referring at all (as in free logic), but rather that our language can be true without referring to an existent thing (where instead what is referred to is a non-existent thing, which acts as a truthmaker). This approach grants that flying horses do not exist, but this does not imply that there are no flying horses. According to the Meinongian, there are flying horses, and they (presumably) belong to the class of non-existent things, where Pegasus is one of them. This class of non-existent things might also include the present King of France, Santa Claus, the largest prime number, the square circle, and every/any-thing you could possibly imagine if taken to not exist—maybe even Nothing.
So, for the Meinongian, naïvely put, there are existents and non-existents. Both are types of ‘thing’, and the over-arching name for these things is that they have ‘being’. All existent things have being, but not all being things have existence. And perhaps in such an account, Nothing could have ‘being’ regardless of its non/existence. Since Meinongians quantify over both existent and non-existent things, their quantification over domains containing both such things must be ontologically neutral (namely, by not having existential import), and they can differentiate between the two types of things by employing a predicate for existence which existent things instantiate and non-existent things do not. The classical universal and existential quantifiers (∀ and ∃) can then be defined using the neutral universal and particular quantifiers (Λ and Σ) together with the existence predicate (E!) as such:
∀xA =df Λx(E!x → A)
∃xA =df Σx(E!x & A)
‘All existent things are F’ can be written as such:
∀x(Fx) =df Λx(E!x → Fx)
And ‘Some existent things are F’ can be written as such:
∃x(Fx) =df Σx(E!x & Fx)
Using these neutral quantifiers, we can then say, without contradiction, that some things do not exist, as such:
Σx(~E!x)
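These definitions can be sketched over a toy domain (the domain members and the extension of E! are my own illustrative choices): Λ and Σ range over everything in the domain, existent or not, and the classical quantifiers are recovered by restricting with the existence predicate.

```python
# Toy model of Meinongian neutral quantification (illustrative domain).
domain = ["Socrates", "Pegasus", "Santa Claus"]
existents = {"Socrates"}            # the extension of the predicate E!

def E(x):
    return x in existents

def neutral_all(pred):              # Λx(...): over all things, existent or not
    return all(pred(x) for x in domain)

def neutral_some(pred):             # Σx(...)
    return any(pred(x) for x in domain)

def classical_all(pred):            # ∀xA =df Λx(E!x → A)
    return neutral_all(lambda x: (not E(x)) or pred(x))

def classical_some(pred):           # ∃xA =df Σx(E!x & A)
    return neutral_some(lambda x: E(x) and pred(x))

# 'Some things do not exist', Σx(~E!x): true, and no contradiction arises,
print(neutral_some(lambda x: not E(x)))      # True
# since its existentially loaded counterpart ∃x(~E!x) is false by definition.
print(classical_some(lambda x: not E(x)))    # False
```

The two readings come apart only over the non-existent members of the domain, which is the Meinongian’s point.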
Despite these definitions, it would be erroneous to describe Meinongianism as “the way of the two quantifiers” (Peter van Inwagen 2003 p138). This is because the ontologically loaded quantifier ∃ can be considered as being restricted to existents, and so is different to Σ only by a matter of degree with respect to what is in the domain, that is, its range. Such a restriction of the domain can be understood as part and parcel of restricting what it is to count as a ‘thing’, where, for Quine, every-(and only)-thing(s) exists.
One need not be a Meinongian to treat the quantifiers as ontologically neutral, however. For example, Czeslaw Lejewski argues that the existentially non-committal ‘particular quantifier’ is “a nearer approximation to ordinary usage” and claims to “not see a contradiction in saying that something does not exist” (1954 p114). Another way to free the quantifiers of their ontological import is to demarcate ontological commitment from quantificational commitment, as in the work of Jody Azzouni (2004). Even the very basic idea of quantificational commitment leading to a commitment to an object in the domain of quantification can be challenged, by taking the quantifiers to be substitutional rather than objectual. In a substitutional interpretation, a quantificational claim is true not because there is an object in the domain that it is true of, but because there is a term in the language that it is true of (for an early pioneer of substitutional quantification, see Ruth Barcan-Marcus 1962).
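The objectual/substitutional contrast can also be sketched (the sentences, terms, and extensions below are my own illustrative choices): on the substitutional reading, a quantified claim is true when some term of the language yields a true substitution instance, with no object in the domain required.

```python
# Substitutional reading: truth is settled by which *sentences* of the
# language are true under substitution of terms (illustrative setup).
terms = ["Pegasus", "Socrates"]
true_sentences = {"Mythical(Pegasus)"}

def subst_exists(pred_name):
    return any(f"{pred_name}({t})" in true_sentences for t in terms)

# Objectual reading: truth requires an object in the domain that falls
# under the predicate's extension.
domain = {"Socrates"}
mythical_extension = set()          # no object in the domain is mythical

def obj_exists(extension):
    return any(x in extension for x in domain)

print(subst_exists("Mythical"))        # True: a term makes an instance true
print(obj_exists(mythical_extension))  # False: no object answers to it
```

Here ‘something is mythical’ comes out true substitutionally but false objectually, which is why the substitutional reading carries no ontological commitment.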
In contrast to these alternative systems, for Quine (1948), “to be is to be the value of a bound variable”, which simply means to be quantified over by a quantifier, which further simplified means to be in the domain of quantification. An ontology, then, can be read straight from the domain, which contains (only) the existent things, which happens to be all the ‘things’ that there are. As we have seen, this is problematic with respect to understanding nonexistence. But that is not all. Ladd-Franklin (1912 p653), for example, argues that domains are just ‘fields of thought’, and thus the domain of discourse may vary, and it cannot simply be assumed to contain all of (and only) the things that exist in our reality. Even when the field of thought is physics, or whatever our best science may be, the domain of quantification still leaves us none the wiser with respect to what there is in reality. As Mary Hesse argues, “it is precisely what this domain of values is that is often a matter of dispute within physics” (1962 p243). Indeed, she continues, the very act of axiomatizing a theory in order to answer the question ‘what are the values of its variables?’ implies the adoption of a certain interpretation, which in turn is equivalent to the decisions involved in answering the question ‘what are entities?’ Therefore, one cannot informatively answer ‘what is there?’ with ‘the values of the bound variables’. Extrapolating from the domain is thus no guide to reality: it can give us some-thing from no-thing, regardless of whether every-thing includes more than every (existent) thing. And we cannot infer the existence of Nothing from ‘Nothing’.
4. Beyond the Binary—Both Presence and Absence
As we shall now see, the supposed choice between the binary options of understanding ‘nothing’ as Nothing (a noun, presence of absence) or no-thing (a quantifier, absence of presence) can itself be challenged. To get to that point, firstly, we introduce the dialectical process of Becoming which Nothing participates in, and then we introduce dialetheic understandings of the contradictory nature of Nothing.
a. Dialectical Becoming
In G. W. F. Hegel’s dialectics, a particular pattern is followed when it comes to conceptual analysis. To start, a positive concept is introduced as the ‘thesis’. Then, that positive concept is negated to create the ‘antithesis’ which opposes the thesis. The magic happens when the positive concept and the negative concept are unified to create a third concept, the ‘synthesis’ of the thesis and antithesis. When Hegel applied this dialectic of thesis-antithesis-synthesis to the topic we are considering in this article, the resulting pattern is Being-Nothing-Becoming. To start, he took Being as the positive thesis, which he stated is ‘meant’ to be the concept of presence. Negating this thesis of Being, we get what he stated is ‘meant’ to be the concept of absence, namely, Nothing, as the antithesis.
It is important to note that for Hegel the difference between Being and Nothing is only “something merely meant” (1991 remark to §87) in that we do mean to be highlighting different things when we use the term ‘Nothing’ rather than ‘Being’ or vice versa, but in content they are actually the same. What is the content of Being and Nothing, then, that would equate them in this extensional manner? Well, as purely abstract concepts, Being and Nothing are said to have no further determination, in that Being asserts bare presence, and Nothing asserts bare absence. Given that both are bare, and thus undetermined, they have the same (lack of) properties or content. (Compare the situation with the morning star and evening star—these terms were employed to mean different things, but actually they both refer to Venus.)
There is a presence to Nothing in its asserting absence, and there is an absence to Being in its empty presence. As Julie Maybee (2020) has described, “Being’s lack of determination thus leads it to sublate itself and pass into the concept of Nothing”, and this movement goes both ways. In speculating on the bidirectional relationship between Being and Nothing, we enter the dialectical moment of synthesis that unifies and combines them into a state of Becoming. To Become is to go from Being to Nothing or from Nothing to Being, as we do when we consider their equally undefined content. But despite their extensional similarity (in what content they pick out), intensionally (in their intended definitional meaning) Being and Nothing are different. Any contradiction that may arise from their synthesis can thus be avoided by reference to this difference. But what if such contradictions provide a more accurate understanding of nothingness, to better reflect its paradoxical nature? This is the idea we will now take up.
b. Dialetheic Nothing
Heidegger pointed out that in speaking of Nothing we make it into something and thereby contradict ourselves. Much like in that dialectical moment of synthesis, we posit Nothing as a being—as a thing—even though by our quantificational understanding that is precisely what it is not (see Krell 1977 p98f). Where can we go from here? Does this mean it is impossible to speak of Nothing without instantaneous self-defeat, by turning Nothing into not-no-thing, namely, some-thing? To this, Graham Priest adds, “One cannot, therefore, say anything of nothing. To say anything, whether that it is something or other, or just that it is, or even to refer to it at all, is to treat it as an object, which it is not” (2002 p241, emphasis in original).
Of course, Priest did say something about Nothing, as did Heidegger, and as does this article. It therefore is not impossible to talk of it. Perhaps the lesson to learn is that any talk of it will be false because the very act of doing so turns it into what it is not. This would be a kind of error-theory of Nothing, that whatever theorising is done will be in error, by virtue of postulating an object to be theorised where there is no object. But this will not do once we consider statements that motivate such a theory, like ‘Nothing is not an object’, which the error-theorist would want to be true in order for all (other) statements about Nothing to be false. Can we not even say that we cannot say anything about Nothing, then? Nor say that?
These problems reflect issues of ineffability. To be ineffable is to not be able to be effed, where to be effed is to be described in some way. Start with the idea that Nothing is ineffable, because in trying to describe it (a no-thing) we end up turning it into some-thing (a thing) that it is not. But, to say that Nothing is ineffable is a self-refuting statement, since ‘Nothing is ineffable’ is to say something about Nothing, namely, that it is ineffable. Furthermore, if it is true that Nothing is ineffable, then it is not true that no-thing is ineffable, because Nothing is. So, to repeat, can the (in)effability of nothingness be effed? And what about effing that?
Ludwig Wittgenstein’s Tractatus is also an example of trying to eff the ineffable, via a self-conscious process of ‘showing’ rather than ‘saying’ what cannot be said, or else rendering it all meaningless. Wittgenstein’s work explores (among other things) the limits of our language in relation to the limits of our world, and the messy paths that philosophical reflection on our language can take us down. Applying this to Nothing, it might be that the contradictions that arise from attempts to express nothingness reflect contradictions in its very nature. And maybe when we get caught up in linguistic knots trying to understand Nothing it is because Nothing is knotty (which pleasingly rhymes with not-y). Perhaps then we need not try to find a way out of contradictions that stem from analysing nothingness if those contradictions are true. So, is it true that Nothing is both an object and not an object? Is it true that Nothing is both a thing and no-thing? Whilst this would not be Wittgenstein’s remedy, according to Priest, ‘yes’, we ought to bite this bullet and accept the paradoxical nature of Nothing at face value. To treat such a contradiction as true, one must endorse a dialetheic metaphysics, with a paraconsistent logic to match, where Nothing is a dialetheia.
5. Beyond the Binary—Neither Presence nor Absence
a. The Nothing Noths
As we have seen, when contemplating nothingness, we can quickly go from no-thing to Nothing, which is no longer a ‘nothing’ due to being some-thing. When we turn towards nothingness, it turns away from us by turning itself into something else. This makes nothingness rather active, or rather re-active, in a self-destructive sort of way. As Heidegger put it, “the nothing itself noths or nihilates” (1929 p90).
Carnap was vehemently against such metaphysical musings, claiming that they were meaningless (1959 p65-67). Indeed, Heidegger and the Vienna Circle (of which Carnap was a leading and central figure) were in opposition in many ways, not least with respect to Heidegger’s antisemitism and affiliation with the Nazis in contrast with the Vienna Circle’s large proportion of Jewish and socialist members (see David Edmonds 2020 for the relationship between the political and philosophical disputes).
Somewhat mediating on the logical side of things, Oliver and Smiley (2013) consider ‘the nothing noths’ as “merely a case of verbing a noun” and argue: “If ‘critiques’ is what a critique does, and ‘references’ is what a reference does, ‘nichtet’ is what das Nichts does. The upshot of all this is that ‘das Nichts nichtet’ [‘the nothing noths’] translates as ‘zilch is zilch’ or, in symbols, ‘O=O’. Far from being a metaphysical pseudo-statement, it is a straightforward logical truth” (p611). If verbing a noun is legitimate, what about nouning a quantifier? If ‘Criticisms’ is the name for all criticisms, and ‘References’ is the name for all references, then is not ‘Everything’ the name for every-thing, and likewise ‘Nothing’ the name for no-thing? Such an understanding would make the path to such entities quite trivial, a triviality that ‘straightforward logical truths’ share. But if we have learnt anything about Nothing so far, it is surely that it is a long way (at least 8,000 words away) from being trivial.
Heidegger avoids charges of triviality by clarifying that Nothing is “‘higher’ than or beyond all ‘positivity’ and ‘negativity’” (see Krummel 2017 p256 which cites Beiträge). This resonates with Eastern understandings of true nothingness as irreducible to and outside of binary oppositions, which is prominent in the views of Nishida Kitarō from the Kyoto School. What are they good for? ‘Absolute nothing’ (and more).
b. Absolute Nothing
When Edwin Starr sang that war was good for absolutely nothing (1970), the message being conveyed was that there was no-thing for which war was good. This was emphasised and made salient by the ‘absolutely’. When we are analysing nothingness, we might likewise want to emphasise that what we are analysing is absolutely nothing. But what would that emphasis do? In what way does our conception of nothingness change when we make its absoluteness salient?
For the Kyoto School, this ‘absolute’ means cutting off oppositional understandings, in a bid to go beyond relativity. The way we comprehend reality is very much bound up in such oppositions: life/death, yes/no, true/false, black/white, man/woman, good/bad, acid/alkaline, high/low, left/right, on/off, 0/1, even/odd, this/that, us/them, in/out, hot/cold… and challenging such binaries is an important part of engaging in critical analysis to better grasp the complexities of reality. But these binaries may very well include opposites we have been relying upon in our understanding of nothingness, namely, presence/absence, thing/no-thing, no-thing/Nothing, binary/nonbinary, relative/absolute, and so forth. It seems whatever concept or term or object we hold (like Hegel’s ‘thesis’), we can negate it (like Hegel’s ‘antithesis’), making a set of opposites. What then can be beyond such oppositional dialectic? Nothing. (Or is it no-thing?)
Zen Buddhism explains that true nothingness is absolute, not relative—beyond the realm of things. Our earlier attempts at elucidating Nothing and no-thing were very much conceptually related to things, and so to get a truer, more absolute nothingness, we must go beyond no-thing/thing and no-thing/Nothing. Only once detached from all contrasts do we have absolute nothingness.
Nishida says absolute negation (zettai hitei 絶対否定) is beyond the affirmative/negative itself, and so is a rejection of what it colloquially represents: true negation is thereby a negation of negation. This is not the double-negation of classical logic (whereby something being not not true is for that something to be true) and it is not the mealy-mouthed multiple-negation of conversation (whereby not disliking someone does not entail liking them but rather just finding them incredibly annoying, for example). Instead, this negation of negation leaves the realm of relativity behind, it goes beyond (or negates) that which can be negated to enter the absolute realm. No-thing can be absolute without being absolved of any defining opposition that would render it merely relative. And so Nothing can only be absolute when it goes beyond the binaries that attempt to define it in the world of being. This does not place the absolute nothingness in the realm of nonbeing; rather, absolute nothingness transcends the being/nonbeing distinction.
Without anything to define absolute nothingness in relation to, it is quite literally undefined. As such, Nothing cannot be made into a subject or object that could be judged, and so is completely undetermined. It would not make sense, then, to interpret ‘absolute nothing’ as a thing, because that would bring it into the purview of predication. Instead, Nishida (2000 467, 482) speaks of it as a place: “the place of absolute nothing” (zettai mu no basho) or “the place of true nothing” (shin no mu no basho). Within this place is every determination of all beings, and as such is infinitely determined. But this is in contradiction with its status as being completely undetermined, beyond the realm of relative definition. Is absolute nothingness really beyond the realm of relative definition if it is defined in contrast to relativity, namely, as absolute? It seems that we have stumbled upon contradictions and binaries again. (Ask yourself: Can we avoid them? Ought we avoid them?) Like the dialetheic understanding of Nothing, this absolute nothingness is effed as ineffable in terms of what it is and is not. And like the nothing-that-noths, this absolute nothingness is active, but rather than nihilating anything that comes in its path, it creates every-thing.
6. Conclusion
This article has analysed nothingness as a noun, a quantifier, a verb, and a place. It has postulated nothingness as a presence, an absence, both, and neither. Through an exploration of metaphysical and logical theories that crossed the analytic/continental and East/West divides, it started with nothing, got something, and ended up with everything. What other topic could be quite as encompassing? Without further ado, and after much ado about nothing, let us conclude the same way that Priest does in his article ‘Everything and Nothing’ (which hopefully you, the reader, will now be able to disambiguate):
“Everything is interesting; but perhaps nothing is more interesting than nothing” (Gabriel and Priest 2022 p38).
7. References and Further Reading
Jody Azzouni (2004) Deflating Existential Consequence: A Case for Nominalism, Oxford University Press.
Filippo Casati and Naoya Fujikawa (2019) ‘Nothingness, Meinongianism and Inconsistent Mereology’, Synthese, 196.9: 3739–3772.
Rudolf Carnap (1959) ‘The Elimination Of Metaphysics Through Logical Analysis of Language’, A. Pap (trans.) in A. J. Ayer (ed.) Logical Positivism, New York: Free Press, 60–81.
Lewis Carroll (1871) Through the Looking-Glass and What Alice Found There, in M. Gardner (ed.) The Annotated Alice: The Definitive Edition, Harmondsworth: Penguin, 2000.
Alonzo Church (1956) Introduction to Mathematical Logic, Princeton University Press.
Frank Close (2009) Nothing: A very short introduction, Oxford University Press.
David Edmonds (2020) The Murder of Professor Schlick: The Rise and Fall of the Vienna Circle, Princeton University Press.
Suki Finn (2018) ‘The Hole Truth’, Aeon.
Suki Finn (2021) ‘Nothing’, Philosophy Bites. https://podcasts.google.com/feed/aHR0cHM6Ly9waGlsb3NvcGh5Yml0ZXMubGlic3luLmNvbS9yc3M.
Suki Finn (2023) ‘Nothing To Speak Of’, Think, 22.63: 39–45.
Markus Gabriel and Graham Priest (2022) Everything and Nothing, Polity Press.
G. W. F. Hegel (1991) The Encyclopedia Logic: Part 1 of the Encyclopaedia of Philosophical Sciences, T. F. Geraets, W. A. Suchting, and H. S. Harris (trans.), Indianapolis: Hackett.
Martin Heidegger (1929) ‘What is Metaphysics?’, in (1949) Existence and Being, Henry Regnery Co.
Mary Hesse (1962) ‘On What There Is in Physics’, British Journal for the Philosophy of Science, 13.51: 234–244.
Peter van Inwagen (2003) ‘Existence, Ontological Commitment, and Fictional Entities’, in Michael Loux and Dean Zimmerman (eds.) The Oxford Handbook of Metaphysics, Oxford University Press, 131–157.
David Farrell Krell (ed.) (1977) Martin Heidegger: Basic Writings, New York: Harper & Row.
John W. M. Krummel (2017) ‘On (the) nothing: Heidegger and Nishida’, Continental Philosophy Review, 51.2: 239–268.
Christine Ladd-Franklin (1883) ‘The Algebra of Logic’, in Charles S. Peirce (ed.) Studies in Logic, Boston: Little, Brown & Co.
Christine Ladd-Franklin (1912) ‘Implication and Existence in Logic’, The Philosophical Review, 21.6: 641–665.
Karel Lambert (1963) ‘Existential Import Revisited’, Notre Dame Journal of Formal Logic, 4.4: 288–292.
Karel Lambert (1967) ‘Free Logic and the Concept of Existence’, Notre Dame Journal of Formal Logic 8.1-2: 133–144.
James Legge (1891) The Writings of Chuang Tzu, Oxford University Press.
Czeslaw Lejewski (1954) ‘Logic and Existence’, British Journal for the Philosophy of Science, 5: 104–19.
Julie E. Maybee (2020) ‘Hegel’s Dialectics’, The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.), <https://plato.stanford.edu/archives/win2020/entries/hegel-dialectics/>.
Alexius Meinong (1904) ‘Über Gegenstandstheorie’, in Alexius Meinong (ed.) Untersuchungen zur Gegenstandstheorie und Psychologie, Leipzig: J. A. Barth.
Kitarō Nishida (2000) Nishida Kitarō zenshū [Collected works of Nishida Kitarō], Tokyo: Iwanami.
Alex Oliver and Timothy Smiley (2013) ‘Zilch’, Analysis, 73.4: 601–613.
Plato (1996) Parmenides, A. K. Whitaker (trans.) Newburyport, MA: Focus Philosophical Library.
Graham Priest (2002) Beyond the Limits of Thought, Oxford University Press.
W.V.O. Quine (1948) ‘On What There Is’, The Review of Metaphysics, 2.5: 21–38.
Maria Reicher (2022) ‘Non-existent Objects’, The Stanford Encyclopedia of Philosophy, Edward N. Zalta and Uri Nodelman (eds.), <https://plato.stanford.edu/archives/win2022/entries/non-existent-objects/>.
Bertrand Russell (1905) ‘On Denoting’, Mind, 14: 479–493.
Bertrand Russell (1985) The Philosophy of Logical Atomism, La Salle, IL: Open Court.
Oliver Sacks (1987) ‘Nothingness’, in Richard L. Gregory (ed.) The Oxford Companion to the Mind, Oxford University Press.
Jean-Paul Sartre (1956) Being and Nothingness: An Essay on Phenomenological Ontology, Hazel E. Barnes (trans.), New York: Philosophical Library.
Henry Sheffer (1913) ‘A Set of Five Independent Postulates for Boolean Algebras, with Applications to Logical Constants’, Transactions of the American Mathematical Society, 14: 481–488.
Roy Sorensen (2022) Nothing: A Philosophical History, Oxford: Oxford University Press.
Edwin Starr (1970) War, Motown: Gordy Records.
Alfred Tarski (1944) ‘The Semantic Conception of Truth’, Philosophy and Phenomenological Research, 4.3: 341–376.
Ludwig Wittgenstein (1961) Tractatus Logico-Philosophicus, D. F. Pears and B. F. McGuinness (trans.), New York: Humanities Press.
Dorothy Wrinch (1918) ‘Recent Work In Mathematical Logic’, The Monist, 28.4: 620–623.
Author Information
Suki Finn
Email: suki.finn@rhul.ac.uk
Royal Holloway University of London
United Kingdom
Impossible Worlds
Actual facts abound, and actual propositions are true because there is a world, the actual world, that the propositions correctly describe. Possibilities abound as well. The actual world reveals what there is, but it is far from clear that it also reveals what there might be. Philosophers have been aware of this limitation and have introduced the notion of a possible world. Finally, impossibilities abound, because it turns out that possibilities do not exhaust the modal space as a whole. Besides the actual facts, and facts about the possible, there are facts about what is impossible. In order to explain this, philosophers have introduced the notion of an impossible world.
This article is about impossible worlds. First, there is a presentation of the motivations for postulating impossible worlds as a tool for analysing impossible phenomena. This apparatus seems to deliver great advances in modal logic and semantics, but at the same time it gives rise to metaphysical issues concerning the nature of impossible worlds. Discourse about impossible worlds is explained in Sections 2 and 3. Section 4 provides an overview of the theories in discussion in the academic literature, and Section 5 summarises the drawbacks of those theories. Section 6 takes a closer look at the logical structure of impossible worlds, and Section 7 discusses the connection between impossible worlds and hyperintensionality.
Modal notions are those such as ‘possibility’, ‘necessity’, and ‘impossibility’, and their analysis requires a different account than that of so-called indicative notions. To compare the two: indicative propositions are about this world, the world that obtains, and all the true indicative propositions together describe the world completely. Modal propositions are about the world as well, although in a different sense. They are about its modal features or, put otherwise, about alternatives to it. Philosophers call these alternatives possible worlds.
For a start, it is important to consider the distinction between pre-theoretical and theoretical terms. Pre-theoretical terms are terms we handle before we engage in philosophical theorizing. Theoretical terms, on the other hand, are introduced by philosophers via sets of definitions, usually framed in terms that we already understand in advance. The debate about possible worlds can be understood along similar lines. The word ‘world’ is here a theoretical notion that differs from the word as we use it in everyday life. In everyday use, the world is everything we live in and interact with. The philosophical ‘world’ is a representation of the world, one among many such representations, and the actual world is unique among them in representing the world correctly. ‘Actual world’, ‘possible world’, as well as ‘impossible world’ are thus theoretical terms.
An example will be helpful here. Consider the following proposition:
(1) Canberra is the capital of Australia.
Given the constitutional order of Australia, (1) is true because Canberra is the capital of Australia. In contrast, the proposition:
(2) Melbourne is the capital of Australia
is false, because it is not the case. So (1) and (2) are factual claims, because they describe the constitutional order in Australia. Consider, however, the following proposition:
(3) Melbourne could be the capital of Australia.
At first sight, (3) also appears to be about our world in some sense, yet it displays structurally different features from (1) and (2). Why is this so? Some philosophers dismiss this question by rejecting its coherence. Others propose a positive solution by means of other worlds. In the following two sections, I provide two arguments for the latter approach.
2. The First Argument for Impossible Worlds
In his Counterfactuals (1973), David Lewis states the following:
I believe, and so do you, that things could have been different in countless ways. But what does this mean? Ordinary language permits the paraphrase: there are many ways things could have been besides the way they actually are. I believe that things could have been different in countless ways; I believe permissible paraphrases of what I believe; taking the paraphrase at its face value, I therefore believe in the existence of entities that might be called ‘ways things could have been.’ I prefer to call them ‘possible worlds’. (Lewis 1973: 84)
Takashi Yagisawa builds on Lewis’s view as follows:
There are other ways of the world than the way the world actually is. Call them ‘possible worlds.’ That, we recall, was Lewis’ argument. There are other ways of the world than the ways the world could be. Call them ‘impossible worlds’. (Yagisawa 1988: 183)
These two quotations reflect a need for an analysis of modality in terms of worlds. While Lewis postulates possible worlds as the best tool for analysing modal propositions, Yagisawa extends the framework by adding impossible worlds. In other words, Lewis accepts:
(P) It is possible that P if and only if there is a possible world, w, such that at w, P.
and:
(I) It is impossible that P if and only if there is no possible world, i, such that at i, P.
as definitions of possibility and impossibility, respectively.
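The schemata (P) and (I) can be pictured with a small computational sketch in which a world is nothing more than the set of sentences true at it. The worlds and sentences below are invented purely for illustration and are not part of the source discussion:

```python
# Toy worlds semantics (illustrative only): a world is modelled as a
# frozenset of the sentences true at it.

worlds = [
    frozenset({"Canberra is the capital of Australia",
               "Melbourne is in Australia"}),
    frozenset({"Melbourne is the capital of Australia",
               "Melbourne is in Australia"}),
]

def possible(p, worlds):
    # (P): it is possible that P iff P holds at some possible world.
    return any(p in w for w in worlds)

def impossible(p, worlds):
    # (I): it is impossible that P iff P holds at no possible world.
    return not any(p in w for w in worlds)

print(possible("Melbourne is the capital of Australia", worlds))   # True
print(impossible("Melbourne is the capital of Australia", worlds)) # False
print(impossible("2 + 2 = 5", worlds))                             # True
```

On this picture, impossibility is simply the absence of any possible world verifying the proposition, which is exactly the gap that the impossible worlds theorist proposes to fill with further worlds.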
An alternative analysis of impossibility extends the space of worlds and, in addition to possible worlds, commits to impossible worlds. As a consequence, proponents of impossible worlds formulate a dilemma in the form of modus tollens and modus ponens respectively:
1. If we endorse arguments for the existence of possible worlds, then, with all needed changes made, we should endorse the same kind of argument for the existence of impossible worlds.
2. There are arguments that disqualify impossible worlds from being acceptable entities.
Therefore:
3. There are no possible worlds. (By modus tollens.)
Or:
1*. If we endorse arguments for the existence of possible worlds, then mutatis mutandis, we should endorse the same kind of argument for the existence of impossible worlds.
2*. There are arguments that establish possible worlds as acceptable entities.
Therefore:
3*. There are impossible worlds. (By modus ponens.)
The need for impossible worlds starts from the assumption that if the paraphrase argument justifies belief in worlds as ways things could have been, then the same argument justifies belief in worlds as ways things could not have been. The second reason is the applicability of impossible worlds. I will discuss some applications of impossible worlds in the next section.
3. Impossible Worlds and Their Applications
It is something of a platitude that the introduction of theoretical terms ought to be justified by their theoretical utility. Moreover, theoretical terms should not merely solve one particular problem. Instead, their applications should range over various philosophical phenomena and contribute systematically to their explanation.
The theoretical usefulness of possible worlds has been proven in the analysis of de re as well as de dicto modalities (see the article on Frege’s Problem: Referential Opacity, Section 2), as well as in the analysis of counterfactual conditionals, propositional attitudes, intensional entities, and relations between philosophical theories. Given their applicability, possible worlds have turned out to be a useful approach to longstanding philosophical problems.
To begin with, representing properties and propositions as sets of their instances (possible individuals and possible worlds, respectively) has offered many advantages in philosophy. In particular, impossible worlds provide a more nuanced explanation of modality in a way that an unadulterated possible worlds framework does not. Like possible worlds, impossible worlds are ‘localisers’, albeit ones where impossible things happen. Consider these two statements:
(4) 2 + 2 = 5
and
(5) Melbourne both is and is not in Australia.
(4), according to a possible worlds semantic treatment, does not hold in any possible world, because possible worlds are worlds at which only possible things happen. Likewise, there is no possible world at which Melbourne both is and is not in Australia. Given these two data points, and assuming the widely accepted, although disputable, view of propositions as sets of possible worlds, (4) and (5) are ontologically one and the same proposition: the empty set. However, (4) and (5) are about different subject matters, namely arithmetic and geography. In order not to conflate these two (impossible) subjects, one way out is presented by impossible worlds: there is an impossible world at which (4) is true and (5) is false, and vice versa.
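The collapse, and the impossible worlds remedy, can be made vivid with a toy model (the world inventories below are invented for illustration): over possible worlds alone, the propositions expressed by (4) and (5) are both the empty set, and hence identical; enlarging the space with impossible worlds separates them.

```python
# Propositions modelled as sets of worlds; each world is a frozenset of
# the sentences holding at it. All inventories are illustrative only.

four = "2 + 2 = 5"
five = "Melbourne both is and is not in Australia"

possible_worlds = [
    frozenset({"2 + 2 = 4", "Melbourne is in Australia"}),
    frozenset({"2 + 2 = 4", "Canberra is the capital of Australia"}),
]
impossible_worlds = [
    frozenset({four}),  # an arithmetically impossible world
    frozenset({five}),  # a geographically impossible world
]

def proposition(sentence, worlds):
    # The proposition a sentence expresses: the set of worlds where it holds.
    return frozenset(w for w in worlds if sentence in w)

# Over possible worlds alone, (4) and (5) collapse into the empty set:
print(proposition(four, possible_worlds) == proposition(five, possible_worlds))  # True

# Over the enlarged space, they come apart:
space = possible_worlds + impossible_worlds
print(proposition(four, space) == proposition(five, space))  # False
```

The design choice here mirrors the philosophical move: nothing about the set-theoretic machinery changes; only the domain of worlds over which propositions are carved out is enlarged.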
The well-known reductio ad absurdum mode of argument is another, although controversial, reason for taking impossible worlds seriously (for a more detailed exposition, see the article on Reductio ad Absurdum). Such arguments start with certain assumptions and then, via logically valid steps, lead to a contradiction. Although the conclusion is contradictory, the impossible assumption gives rise to a counterfactual string of mutually interconnected and meaningful premises. Some proponents of impossible worlds insist that unless we take such impossible assumptions seriously, reductio ad absurdum arguments could not play such a crucial role in philosophical reasoning. For the opposite view, according to which mathematical practice does not depend on using counterfactuals, see Williamson (2007, 2017). For a more substantive discussion of the reductio ad absurdum and impossible worlds, see also Berto and Jago (2019, especially Chapter XII).
Whatever the machinery behind the reductio ad absurdum argument is, there is nonetheless a strong reason to postulate impossible worlds for the analysis of a certain sort of counterfactual conditional. According to the most prevalent theory, a counterfactual ‘if it were the case that A, it would be the case that C’ is true if and only if some possible world in which both A and C are true is more similar to the actual world than any possible world in which A is true but C is not. Clearly, such an account falls short in analysing counterpossible conditionals (those with impossible antecedents) unless we either deny their possible worlds interpretation (Fine 2012), admit that they are trivially true (Lewis 1973, Williamson 2007), treat the putative triviality by other means (Vetter 2016), or simply accept impossible worlds. To demonstrate the problem, here is a pair of famous examples, originally from Nolan (1997):
(6) If Hobbes had (secretly) squared the circle, sick children in the mountains of South America at the time would have cared.
(7) If Hobbes had (secretly) squared the circle, sick children in the mountains of South America at the time would not have cared.
Although intuitions are usually controversial within philosophy, there is something intriguing about (7). Namely, although its antecedent is impossible, we seem to take (7) to be true. For, in fact, no sick children would have cared if the antecedent had been true, since this would have made no difference to sick children whatsoever. By the same reasoning, (6) is intuitively false; for again, no sick children would have cared if the antecedent had been true. Consequently, the occurrence of these distinct truth values requires a distinctive analysis, and an impossible worlds analysis is one candidate.
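One way to see how an impossible worlds analysis could deliver these verdicts is the following simplified sketch, assuming a ‘closest antecedent-worlds’ evaluation and a hand-picked similarity ranking (the worlds, ranks, and sentences are invented for the example, not drawn from Nolan’s discussion):

```python
# Simplified closest-worlds evaluation of 'if A were the case, C would be'.
# A world is a pair (facts, rank): the sentences holding there and its
# similarity rank (lower = more similar to actuality). Illustrative only.

def would(antecedent, consequent, worlds):
    # True iff every closest antecedent-world is a consequent-world;
    # vacuously true when no world satisfies the antecedent (the
    # Lewis/Williamson verdict on counterpossibles).
    a_worlds = [(facts, rank) for facts, rank in worlds if antecedent in facts]
    if not a_worlds:
        return True
    closest = min(rank for _, rank in a_worlds)
    return all(consequent in facts
               for facts, rank in a_worlds if rank == closest)

SQUARED = "Hobbes squared the circle"
CARED = "the sick children cared"
NOT_CARED = "the sick children did not care"

possible_worlds = [(frozenset({"Hobbes did geometry"}), 0)]  # the actual world
impossible_worlds = [(frozenset({SQUARED, NOT_CARED}), 1)]   # impossible, but orderly

# With possible worlds only, (6) and (7) are both vacuously true:
print(would(SQUARED, CARED, possible_worlds))      # True
print(would(SQUARED, NOT_CARED, possible_worlds))  # True

# Adding an impossible world delivers the intuitive verdicts:
space = possible_worlds + impossible_worlds
print(would(SQUARED, CARED, space))      # False -- (6)
print(would(SQUARED, NOT_CARED, space))  # True  -- (7)
```

The sketch shows the structural point only: once some world, however impossible, verifies the antecedent, the counterfactual is no longer vacuous, and (6) and (7) can receive different truth values.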
Disagreements in metaphysical disputes display another feature of impossibility. Metaphysicians argue with each other about many issues. For instance, they variously disagree about the nature of properties. Suppose that trope theory is the correct theory of properties and so is necessarily true (see the article on Universals). Then both the theory of properties as transcendent universals and the theory of properties as immanent universals are (a) impossible, and (b) distinct. But they are true in the same possible worlds (that is, none), and to distinguish these two views in terms of where they are true requires impossible worlds. Similarly, proponents of modal realism and modal ersatzism disagree about the nature of possible worlds (see the article on Modal Metaphysics). But they both agree that if either of these theories is true, it is true in all possible worlds; necessarily so. By this reasoning, one’s opponent’s claim is necessarily wrong; she defends an impossible hypothesis. For more details on this (and other issues) see Nolan (1997) and Miller (2017).
Although theories of fiction abound, its analyses in terms of possible worlds dominate. According to such analyses, what happens in a work of fiction happens at a set of possible worlds, full stop. However, the problem is that fiction fairly often hosts impossible events.
For instance, ‘Sylvan’s Box’ (Priest 1997) is a short story about an object which is inconsistent because it is both empty and non-empty. A usual treatment of such stories uses the terminology of worlds which realise what is stated in the story. However, Priest claims, any interpretation of the story in terms of sub-sets of internally consistent sets of possible worlds (see Lewis 1978) misrepresents the story.
Of course, these applications of impossible worlds are not exhaustive and, as we will see in Section 4, impossible worlds have limitations. Let us, however, suppose that the dilemma is irresistible, and that impossible worlds are, at least to some extent, as applicable as possible worlds are. If so, one must still consider the cost of such commitment. Since the theoretical application of any entity brings with it an ontological burden, an optimal trade-off between application and ontological commitments must be sought. Impossible worlds are an excellent example of such a trade-off. The next section overviews several metaphysical issues about impossible worlds.
4. The Metaphysics of Impossible Worlds
The introduction of theoretical entities requires a view about their metaphysical nature. The introduction of impossible worlds is not an exception, and requires an answer to the question of what impossible worlds are and, additionally, how impossible worlds differ from possible worlds. We can think of these as the identification question and the kind question, respectively.
The identification question concerns the nature of impossible worlds. Like proponents of possible worlds, proponents of impossible worlds disagree about their metaphysical nature and divide into several camps. To start with realism about worlds: these views share the common idea that, whatever worlds are, they exist. Probably the most prominent version of modal realism is genuine modal realism. While modal realism is the thesis that possible worlds exist, genuine modal realism claims, moreover, that possible worlds exist in the very same way as ‘we and our surroundings’; they are as concrete as we, buildings, animals, and cars are. What is more, every individual exists in one possible world only (for more on transworld identity, see the article on David Lewis). The actual world is a world which has temporal and spatial dimensions and, consequently, every possible world fulfils this requirement. According to genuine modal realism, then, possible worlds are concrete spatiotemporal entities.
Another version of modal realism with impossible worlds is presented by Kris McDaniel (2004). His strategy is to withdraw Lewis’s commitment to individuals existing in one possible world only. Instead, he allows an individual to exist in many worlds and to thus bear the exists at relation to more than one world. Such so-called modal realism with overlap is genuine realism, because it accepts concrete possible worlds and their inhabitants.
A modified version of modal realism is presented by Yagisawa (2010). Under the name of modal dimensionalism, Yagisawa postulates so-called metaphysical indices. These indices represent the spatial, temporal, and modal dimensions of the world. According to Yagisawa, the world has spatial, temporal, and additionally modal dimensions, in the same way that I have my own spatial, temporal, and modal dimensions. Namely, my temporal dimension includes, among other things, me as a child, me nine minutes ago, and me in the future. My spatial dimension is the space occupied by my hands, head, and the rest of my body. My modal dimension includes my possible stages of being a president, a football player, and so forth.
A more moderate version of modal realism is modal ersatzism. Like genuine modal realism, modal ersatzism takes possible worlds to be existent entities (see again the article on Modal Metaphysics), yet denies that they have spatiotemporal dimensions. Naturally, such a brand of realism attracts fans of less exotic ontology, because possible worlds are identified with surrogates built from already accepted entities rather than with otherwise unwelcome philosophical commitments: complete and consistent sets of propositions or sentences, complete and consistent properties, or complete and consistent states of affairs. Usually, these entities are non-concrete in nature and are parts of the actual world (the view is sometimes called actualism). For an excellent overview of various kinds of ersatzism, see Divers (2002).
Finally, there are views according to which worlds do not in fact exist. Under the name of modal anti-realism, such views reject modal realism for primarily epistemological reasons, although they deny neither the meaningfulness of modal talk nor the accuracy of its worlds semantics. Although modal anti-realism is not so widespread in the literature, several positive proposals have demonstrated its prospects. For instance, Rosen (1990) proposes a strategy of ‘fictionalising’ the realist’s positions in the shape of useful fictions. Although his primary target is genuine modal realism, it is easy to generalise the idea to other versions of modal realism.
The kind question asks whether possible and impossible worlds are of the same metaphysical category or fall under metaphysically distinct categories. To the extent that we identify possible worlds with a certain kind of entity (the identification question) and accept impossible worlds for one reason or another, our response to the kind question predetermines our views about the nature of impossible worlds.
A positive response to the kind question is put forward in Priest (1997). As he puts it, anyone who accepts a particular theory of possible worlds, be they concrete entities, abstract entities, or non-existent entities, has no cogent reason to posit an ontological difference between merely possible and impossible worlds (see Priest 1997: 580–581). The idea is expressed by the so-called parity thesis, which says that a theory of the nature of possible worlds should apply equally to impossible worlds.
Now, particular versions of modal realism together with the parity thesis lead to specific views of impossible worlds. To begin with genuine modal realism, extended genuine modal realism accepts concrete possible and impossible worlds. These worlds are spatiotemporal entities, and whatever is impossible holds in some concrete impossible world. For the idea of paraphrasing Lewis’s original argument from ways, see Naylor (1986) and Yagisawa (1988).
Modal dimensionalism as well as modal realism with overlap find their impossible alternatives relatively easily. In the former, I simply have impossible stages as well. In the latter, an individual can have mutually incompatible properties at two different worlds. For example, an individual, a, bears the exists at relation to a world at which a is round, and bears the exists at relation to another world at which a is square, thus representing the situation ‘a is round and square’. Since it is impossible to be both round and square, this is an impossible situation.
A moderate version of modal realism, modal ersatzism, combined with the parity thesis is, so to speak, in an easier position. Given that her metaphysical commitments, be they sets, sentences, propositions, or what have you, are already assumed to exist, it is only one step further to introduce impossible worlds as their incomplete and inconsistent counterparts without incurring any additional ontological commitments.
Proponents of the negative response to the kind question, on the other hand, deny the parity thesis. Impossible worlds, according to them, are a distinct kind of entity. Interestingly, such a metaphysical stance allows for a ‘recombination’ of philosophically competitive positions. For instance, hybrid genuine modal realism, indicated in Restall (1997) and Divers (2002) and further developed in Berto (2009), posits concrete possible worlds as the best representation of possible phenomena, but abstract impossible worlds as the ‘safest’ representation of impossible phenomena. In other words, what is possible happens in concrete possible worlds as genuine modal realism conceives them, while what is impossible is represented by more moderate ontological commitments. In particular, according to hybrid genuine modal realism, possible worlds are concrete and impossible worlds are sets of propositions modelled in accordance with genuine modal realism. Notably, hybrid genuine modal realism is only one of many options for opponents of the parity thesis. As mentioned earlier, the hybrid approach to modality allows us to interpret the possibility/impossibility pair in terms of distinct metaphysical categories and, depending on the choice of category, explicates the duality via the identification question (possible tropes/inconsistent sets, maximal properties/impossible fictions, or other alternatives). Given that this variety of versions remains an underdeveloped region of modal metaphysics in the early twenty-first century, it is a challenge for the future to fill in the gaps in the literature.
5. Troubles with Impossible Worlds
Undoubtedly, any introduction of suspicious entities into philosophy comes with problems, and impossible worlds are no exception. Besides the incredulous stares they attract, philosophical arguments against impossible worlds abound.
A general argument against impossible worlds concerns the analysis of modality. For insofar as the goal is to provide an account of modal concepts in more graspable notions, the introduction of impossible worlds puts the accuracy of the analysis at stake. Recall the initial impossibility schema (I):
(I) It is impossible that P if and only if there is no possible world, i, such that at i, P.
An impossible worlds reading substitutes the occurrence of ‘no possible world’ with ‘impossible world’ along the lines of (I*):
(I*) It is impossible that P if and only if there is an impossible world, i, such that at i, P.
(I*) mimics the structure of (P), and proponents of impossible worlds might be expected to be tempted by it. However, (I*) is only ‘superficially tempting’. For although (P) and (I*) are both biconditionals, it is hard to accept the right-to-left direction of (I*). For instance, although it is impossible that A & ~A, the conjuncts themselves may be contingent and so, by (P), true in some possible world; yet each conjunct also holds at the impossible world where A & ~A holds, so the right-to-left direction of (I*) would wrongly classify a contingent claim as impossible. This disanalogy between (P) and (I*) makes impossible worlds of little use in the theory of impossibility in the first place.
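The disanalogy can be made vivid with a toy model. The following sketch is purely illustrative (the worlds, the atom ‘A’, and all function names are assumptions of this example, not part of the article): worlds are modelled as sets of literals, and an impossible world may contain a literal together with its negation.

```python
# Toy model: a world is a set of literals; an impossible world may contain
# both a literal and its negation. All names here are illustrative.

possible_worlds = [{"A"}, {"~A"}]    # A is contingent: true at one world, false at another
impossible_worlds = [{"A", "~A"}]    # a world where the impossible A & ~A holds

def holds(p, world):
    """A proposition holds at a world if it is a member of that world."""
    return p in world

def impossible_by_I(p):
    """(I): it is impossible that P iff there is no possible world at which P."""
    return not any(holds(p, w) for w in possible_worlds)

def impossible_by_I_star_rtl(p):
    """Right-to-left direction of (I*): there is an impossible world at which P."""
    return any(holds(p, w) for w in impossible_worlds)

# 'A' is contingent, so (I) rightly does not classify it as impossible...
print(impossible_by_I("A"))           # False
# ...yet 'A' holds at an impossible world, so the right-to-left direction
# of (I*) would wrongly classify the contingent 'A' as impossible.
print(impossible_by_I_star_rtl("A"))  # True
```

The sketch illustrates only the structural point: since anything, including contingent claims, can hold at some impossible world, the right-to-left direction of (I*) overgenerates.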
Other problems concern particular theories of modality. Starting with extended modal realism, Lewis himself did not feel the need to dedicate much space to its rejection, for two reasons. The first is that to provide an extensional, non-modal analysis of modality and, at the same time, distinguish possible worlds from impossible worlds without making use of modal notions does not seem to be a viable project. The second is that a restricting modifier, like ‘in a world’, works by limiting domains of implicit and explicit quantification to a certain part of all that there is, and therefore has no effect on the truth-functional connectives (Lewis 1986: 7, fn. 3). By this, Lewis means that insofar as you admit an impossible thing in some impossible world, you thereby admit impossibility into reality. Since this is an unacceptable conclusion, Lewis rejects the extended version of his modal realism via a simple argument:
1. There is a concrete impossible world w at which (A & ~A).
2. At w, (A & ~A) if and only if (at w, A) & ~(at w, A).
3. The right-hand side of (2) is literally a true contradiction.
4. The Law of Non-Contradiction is an undisputable logical principle.
C. There are no concrete impossible worlds.
For Lewis, restricting modifiers work by limiting domains of implicit and explicit quantification to a certain part of all there is. Therefore, ‘On the mountain both P and Q’ is equivalent to ‘On the mountain P, and on the mountain Q’; likewise, ‘On the mountain not P’ is equivalent to ‘Not: on the mountain P’. As a result, ‘On the mountain both P and not P’ is equivalent to the overt contradiction ‘On the mountain P, and not: on the mountain P’. In other words, there is no difference between a contradiction within the scope of the modifier and a plain contradiction that has the modifier within it. See (Lewis 1986: 7 fn. 3) for a full exposition of this argument.
Modal dimensionalism is not without problems either. Jago (2013) argues that adding an impossible stage of ‘Martin’s being a philosopher and not a philosopher’ to my modal profile generates undesired consequences, for modal stages are subject to existential quantification in the same way that actual stages are. And since both actual and modal stages exist, they instantiate inconsistencies, full stop. For responses, see Yagisawa (2015) as well as Vacek (2017).
Modal realism with overlap has its problems too. A simple counterexample to it relies on the (usually) undisputed necessity of identity together with Leibniz’s law, the principle that no two distinct objects share all their properties. The argument goes as follows: it is impossible for Richard Routley not to be Richard Sylvan, because they are one and the same person (in 1983 Richard Routley adopted the last name ‘Sylvan’):
It is impossible that ∼ (Routley = Sylvan)
Therefore, there is an impossible world i where ∼ (Routley = Sylvan). Now, take the property ‘being a logician’. It is impossible for Routley but not Sylvan to be a logician which, by modal realism with overlap’s lights, means that Routley, but not Sylvan, bears the being a logician relation to a world i. Generalising the idea,
for some property P, in i Routley has P, but Sylvan does not.
However, by Leibniz’s law, it follows that ∼ (Routley = Sylvan). And that is absurd.
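The argument can be set out schematically (a reconstruction for illustration, writing $r$ for Routley, $s$ for Sylvan, and $L$ for the property of being a logician):

```latex
\begin{enumerate}
  \item $r = s$, so by the necessity of identity $\Box(r = s)$,
        that is, it is impossible that $\neg(r = s)$.
  \item Representing this impossibility requires an impossible world $i$
        at which $\neg(r = s)$.
  \item By modal realism with overlap, at $i$: $Lr \wedge \neg Ls$
        (Routley, but not Sylvan, bears the \emph{being a logician} relation to $i$).
  \item By Leibniz's law (the indiscernibility of identicals),
        $Lr \wedge \neg Ls$ entails $\neg(r = s)$ simpliciter.
  \item But this contradicts premise 1; absurd.
\end{enumerate}
```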
What about modal ersatzism? Recall that this alternative to (extended) modal realism takes possible worlds to be existent entities of a more modest kind. The move from ersatz possible worlds to impossible worlds, together with the parity thesis, leads to the inheritance of the various problems of ersatz theories. One such problem is the failure of the reductive analysis of modality. As Lewis argues, any ersatzist theory must at some point appeal to primitive modality and thus give up the project of analysing modality in non-modal terms. Another problem is that entities like states of affairs, properties and propositions are intensional in nature and thus do not contribute to a fully extensional analysis. For scepticism about intensional entities, see Quine (1956). For more problems with modal ersatzism, see Lewis (1986: ch. 3).
Modal fictionalism can be a way of avoiding the realist’s problems. For, if ‘according to the possible worlds fiction’ explains possibility, then ‘according to the possible and impossible worlds fiction’ offers a finer-grained analysis with no exotic ontological commitments. But again, such a relatively easy move from possibility to impossibility faces the threat of inheriting the problems of modal fictionalism. One such difficulty is that fictionalism is committed to weird abstract objects, to wit, ‘stories’. Another worry about (extended) modal fictionalism is the story operator itself. For, unless the operator is understood as primitive, it should receive an analysis in more basic terms. And the same applies to the ‘according to the possible and impossible worlds fiction’ operator.
Moreover, even if modal fictionalists provide us with an account of their fiction operator, it will probably face the same importation problem that the modal realist does. The argument goes as follows. First, suppose the fiction is governed by classical logic. Second, if something is true in a fiction, so are any of its classical consequences. Third, given the explosion principle (everything follows from a contradiction), an inconsistent fiction implies that every sentence is true in the fiction. Fourth, take an arbitrary sentence A and consider ‘according to the fiction, actually A’. Fifth, ‘according to the fiction, actually A’ is true (because an inconsistent fiction implies that all sentences are true within it). Sixth, since ‘actually’ exports truth from the fiction to the actual world, it follows that actually A. But it seems literally false to say that any arbitrary sentence is actually true. For more details, see Jago (2014).
The hybrid view has its limitations too. One limitation is that the view introduces two ontological categories and is, so to speak, ontologically less parsimonious than theories following the parity thesis. Moreover, as Vander Laan (1997, 600) points out, there does not seem to be any ontological principle which would justify two different ontological categories within one modal language, namely the language of possibility and impossibility.
Yet there are at least two responses available to the hybrid view. First, its proponents might simply claim that if the best theory of modality plays out that way, that is, if the theory which best systematises our intuitions about modality approves such a distinction, the objection is illegitimate. Second, even the ersatzer faces the same objection: the actual world admits of two different interpretations and, consequently, two different ontological categories, for it can be understood either as us and all our (concrete) surroundings, or as an abstract representation thereof.
Undoubtedly, there is much more to be said about the metaphysics of impossible worlds. Since they come in various versions, one might worry whether any systematic account of such entities is available. Be that as it may, the story does not end with metaphysics. Besides semantic applications of impossible worlds and their metaphysical interpretation, there are logical criteria which complicate their story even more. The next section therefore discusses the logical boundaries (if any) of impossible worlds.
6. The Logic of Impossible Worlds
One might wonder how far impossibility goes because, one might think, impossible worlds have no logical borders. One way to think of impossible worlds is as so-called ‘logic violators’. On this definition, impossible worlds are worlds where the laws of a logic fail. I use the indefinite article here because it is an open question what the correct logic is. Suppose we grant classical logic an exclusive status among logics. Then impossible worlds are worlds where the laws and principles of classical logic cease to hold, and the proper description of the logical behaviour of impossible worlds requires a different logic.
We might therefore wonder whether there is a logic under which impossible worlds are closed. One candidate is paraconsistent logic. Such logics are not explosive: it is not the case that anything follows from contradictory premises. Formally, paraconsistent logic denies the principle α, ~α |= β, and its proponents argue that there are (impossible) worlds at which inconsistent events happen. Given their denial of the explosion principle, paraconsistent logics would seem to be the right tool for an accurate and appropriate analysis of such phenomena. For an extensive discussion of paraconsistent logics, see Priest, Beall, and Armour-Garb (2004).
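To make the failure of explosion concrete, here is a small sketch of Priest's three-valued Logic of Paradox (LP), one well-known paraconsistent logic. The encoding and names are illustrative assumptions of this example, not drawn from the article.

```python
from itertools import product

# Sketch of Priest's Logic of Paradox (LP), a paraconsistent logic.
# Truth values: 'T' (true only), 'B' (both true and false), 'F' (false only).
# 'T' and 'B' are the designated (truth-like) values.
DESIGNATED = {"T", "B"}
NEG = {"T": "F", "B": "B", "F": "T"}  # negation maps 'B' to itself

def entails(premises, conclusion, atoms):
    """Valid iff every valuation making all premises designated
    also makes the conclusion designated."""
    for values in product("TBF", repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) in DESIGNATED for p in premises) and conclusion(v) not in DESIGNATED:
            return False  # counterexample: premises hold, conclusion fails
    return True

A = lambda v: v["a"]          # the atom a
notA = lambda v: NEG[v["a"]]  # its negation ~a
B_ = lambda v: v["b"]         # an arbitrary atom b

# Explosion fails in LP: a, ~a do not entail an arbitrary b
# (counterexample: a takes the value 'B', b takes the value 'F')
print(entails([A, notA], B_, ["a", "b"]))  # False
# Ordinary entailments still hold, e.g. a entails a
print(entails([A], A, ["a"]))              # True
```

Because the value 'B' is designated yet preserved under negation, both a and ~a can count as true without everything following, which is precisely the denial of α, ~α |= β.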
However, some examples show that even paraconsistent logics are not sufficient for describing the plenitude of the impossible. For example, paraconsistent logics usually preserve at least some principles of classical logic (see the article on Paraconsistent Logic) and thus cannot treat the impossibilities that consist in their violation. A solution would be to introduce a weaker alternative which violates those principles too. But even this manoeuvre seems not to be enough because, as Nolan (1997) puts it, there is a tension between the need for at least some logical principles on one side and the impossibility of their failure on the other. For, ‘if for any cherished logical principle there are logics available where that principle fails… if there is an impossible situation for every way things cannot be, there will be impossible situations where even the principles of (any) subclassical logics fail’ (Nolan 1997, 547). In other words, if we think of a weaker logic as validating fewer arguments, we easily end up with logical nihilism (Russell 2018). Another option is to admit a plurality of logics (Beall & Restall 2006) or, controversially, to accept the explosion principle and fall into trivialism: the view that every proposition is true (Kabay 2008).
7. Impossible Worlds and Hyperintensionality
Let me finish with the question of the place of impossibility in reality: is impossibility a matter of reality, or a matter of representing it? In other words, are impossible matters representational or non-representational? While the literature on impossible matters is inclined towards the latter option, some authors have located the failure of necessary equivalence, that is, the failure of substitutivity of extensionally as well as intensionally equivalent terms, within the world itself.
To be more precise, levels of analysis ascend from the extensional, through the intensional, to the hyperintensional level. Nolan (2014) suggests that a position in a sentence is extensional if expressions with the same extension can be substituted into that position without changing the truth-value of the sentence. An intensional position is then a non-extensional one such that expressions that are necessarily co-extensional are freely substitutable in it while preserving the sentence’s truth-value. Finally, a hyperintensional position is neither extensional nor intensional: substituting necessary equivalents into it may fail to preserve the truth-value of the sentence. The introduction of impossible worlds thus moves philosophical analyses to the hyperintensional level, since even when A and B are necessarily equivalent (be the necessity logical, mathematical, or metaphysical), substituting one for the other may result in a difference in truth-value. But if that is so, and if some hyperintensional phenomena are non-representational, then impossibility is part of reality itself.
There are several cases which both display worldly features and are hyperintensional. For instance, some counterfactual conditionals with impossible antecedents are non-representational (Nolan 2014). Also, Schaffer (2009) contrasts the supervenience relation with the grounding relation, and concludes that there are substantive grounding questions regarding mathematical entities and the relations between them, whereas the corresponding supervenience claims turn out to be vacuously true. Explanation as a hyperintensional phenomenon might be understood non-representationally as well, namely as an asymmetric relation between an explanans and its necessarily equivalent explanandum. Among other things, some dispositions (Jenkins & Nolan 2012), the notion of intrinsicality (Nolan 2014), the notion of essence (Fine 1994), and omissions (Bernstein 2016) might be understood in the same way. Indeed, all these examples are open to criticism, but the reader might at least feel some pressure to distinguish between ‘merely’ representational and non-representational hyperintensionality. For more details, see Nolan (2014) and Berto & Jago (2019) and, for an alternative approach to hyperintensionality, Duží, Jespersen, Kosterec, and Vacek (2023).
8. Conclusion
Impossible worlds have been with us, at least implicitly, since the introduction of possible worlds. The reason for this is the equivalence of the phrases ‘it is possible’ and ‘it is not impossible’, or ‘it is impossible’ and ‘it is not possible’. The controversies about impossible worlds can also be understood as a sequel to the controversies about possible worlds. In the beginning, possible worlds were hard to understand, and this produced some difficult philosophical debates. It is therefore no surprise that impossible worlds have come to follow the same philosophical path.
9. References and Further Readings
Beall, J. & Restall, G. (2006). Logical Pluralism, Oxford: Oxford University Press.
A developed account of a position according to which there is more than one (correct) logic.
Bernstein, S. (2016). Omission Impossible, Philosophical Studies, 173, pp. 2575–2589.
A view according to which omissions with impossible outcomes play an explanatory role.
Berto, F. (2008). Modal Meinongianism for Fictional Objects, Metaphysica 9, pp. 205–218.
A combination of Meinongian tradition and impossible worlds.
Berto, F. (2010). Impossible Worlds and Propositions: Against the Parity Thesis, Philosophical Quarterly 60, pp. 471–486.
A version of modal realism which distinguishes distinct impossible propositions, identifies impossible worlds as sets and avoids primitive modality.
Berto, F. & Jago, M. (2019). Impossible Worlds, Oxford: Oxford University Press.
A detailed overview of theories of impossible worlds.
Divers, J. (2002). Possible Worlds, London: Routledge.
Duží, M.; Jespersen, B.; Kosterec, M.; Vacek, D. (eds.). (2023). Transparent Intensional Logic, College Publications.
A detailed survey of the foundations of Transparent Intensional Logic.
Fine, K. (1994). Essence and Modality: The Second Philosophical Perspectives Lecture, Philosophical Perspectives 8, pp. 1–16.
An influential argument that essence cannot be captured in purely modal terms.
Fine, K. (2012). Counterfactuals Without Possible Worlds, Journal of Philosophy 109: 221–246.
The paper argues that counterfactuals raise a serious difficulty for possible worlds semantics.
Jago, M. (2013). Against Yagisawa’s Modal Realism, Analysis 73, pp. 10–17.
This paper attacks modal dimensionalism from both possibility and impossibility angles.
Jago, M. (2014). The Impossible: An Essay on Hyperintensionality, Oxford: Oxford University Press.
A detailed overview of the history, as well as the current state of impossible worlds discourse.
Jenkins, C. S. & Nolan, D. (2012). Disposition Impossible, Noûs 46, pp. 732–753.
An original account of impossible dispositions.
Kabay, P. D. (2008). A Defense of Trivialism, PhD thesis, University of Melbourne.
A defence of trivialism, on the basis that there are good reasons for thinking that trivialism is true.
Kiourti, I. (2010). Real Impossible Worlds: The Bounds of Possibility, Ph.D. thesis, University of St Andrews.
A defence of Lewisian impossible worlds. It provides two alternative extensions of modal realism by adding impossible worlds.
Lewis, D. (1973). Counterfactuals, Cambridge, MA: Harvard University Press.
One of the first explicit articulations of modal realism and its analysis of counterfactual conditionals.
Lewis, D. (1978). Truth in Fiction, American Philosophical Quarterly 15, pp. 37–46.
An approach which aims at dispensing with inconsistent fictions via the method of union or the method of intersection. According to Lewis, we can explain away an inconsistent story via maximally consistent fragments of it.
Lewis, D. (1986). On the Plurality of Worlds, Oxford: Blackwell.
A detailed defence of modal realism, including an overview of arguments against modal ersatzism.
McDaniel, K. (2004). Modal Realism with Overlap, Australasian Journal of Philosophy 82, pp. 137–152.
An approach according to which the worlds of modal realism overlap, resulting in transworld identity.
Miller, K. (2017). A Hyperintensional Account of Metaphysical Equivalence, Philosophical Quarterly 67: 772–793.
This paper presents an account of hyperintensional equivalency in terms of impossible worlds.
Naylor, M. (1986). A Note on David Lewis’ Realism about Possible Worlds, Analysis 46, pp. 28–29.
One of the first modus tollens arguments given in response to modal realism.
Nolan, D. (1997). Impossible Worlds: A Modest Approach, Notre Dame Journal of Formal Logic 38, pp. 535–572.
Besides giving an original account of counterpossible conditionals, this paper introduces the strangeness of impossibility condition: any possible world is more similar (nearer) to the actual world than any impossible world.
Nolan, D. (2014). Hyperintensional Metaphysics, Philosophical Studies 171, pp. 149–160.
A programmatic case that much of metaphysics requires hyperintensional distinctions.
Priest, G. (1997). Sylvan’s Box: A Short Story and Ten Morals, Notre Dame Journal of Formal Logic 38, pp. 573–582.
A short story which is internally inconsistent, yet perfectly intelligible.
Priest, G., Beall, J. C., & Armour-Garb, B. (eds.). (2004), The Law of Non-Contradiction, Oxford: Oxford University Press.
A collection of papers dedicated to the defence as well as the rejection of the law of non-contradiction.
Russell, G. (2018). Logical Nihilism: Could There Be No Logic?, Philosophical Issues 28: 308–324
A proposal according to which there is no logic at all.
Schaffer, J. (2009). On What Grounds What, in D. Chalmers, D. Manley, and R. Wasserman (eds.), Metametaphysics: New Essays on the Foundations of Ontology, Oxford: Oxford University Press, pp. 347–383.
A defence of the grounding relation as providing a philosophical explanation.
Quine, W. V. (1956). Quantifiers and Propositional Attitudes, Journal of Philosophy 53, pp. 177–187.
According to Quine, propositional attitude constructions are ambiguous, yet an intensional analysis of them does not work.
Restall, G. (1997). Ways Things Can’t Be, Notre Dame Journal of Formal Logic 38: 583–96.
In the paper, Restall identifies impossible worlds with sets of possible worlds.
Rosen, G. (1990). Modal Fictionalism, Mind 99, pp. 327–354.
An initial fictionalist account of modality, ‘parasiting’ on the advantages of modal realism, while avoiding its ontological commitments.
Vacek, M. (2017). Extended Modal Dimensionalism, Acta Analytica 32, pp. 13–28.
A defence of modal dimensionalism with impossible worlds.
Vander Laan, D. (1997). The Ontology of Impossible Worlds, Notre Dame Journal of Formal Logic 38, pp. 597–620.
A theory of impossible worlds as maximal inconsistent classes of propositions, as well as a critique of various alternative positions.
Vetter, B. (2016). Counterpossibles (not only) for Dispositionalists, Philosophical Studies 173: 2681–2700
A proposal according to which the non-vacuity of some counterpossibles does not require impossible worlds.
Williamson, T. (2017). Counterpossibles in Semantics and Metaphysics, Argumenta 2: 195–226.
A substantial contribution to the semantics of counterpossible conditionals.
Yagisawa, T. (1988). Beyond Possible Worlds, Philosophical Studies 53, pp. 175–204.
An influential work about the need for impossible worlds, especially with regard to modal realism.
Yagisawa, T. (2010). Worlds and Individuals, Possible and Otherwise, Oxford: Oxford University Press.
A detailed account of modal dimensionalism and its ontological, semantic and epistemological applications.
Yagisawa, T. (2015). Impossibilia and Modally Tensed Predication, Acta Analytica 30, pp. 317–323.
The paper provides responses to several arguments against modal dimensionalism.
Author Information
Martin Vacek
Email: martin.vacek@savba.sk
Institute of Philosophy at the Slovak Academy of Sciences
Slovakia
Boethius (480-524)
Boethius was a prolific Roman scholar of the sixth century AD who played an important role in transmitting Greek science and philosophy to the medieval Latin world. His most influential work is The Consolation of Philosophy. Boethius left a deep mark in Christian theology and provided the basis for the development of mathematics, music, logic, and dialectic in medieval Latin schools. He devoted his life to political affairs as the first minister of the Ostrogothic regime of Theodoric in Italy while looking for Greek wisdom in devout translations, commentaries, and treatises.
During the twentieth century, his academic modus operandi and his Christian faith became a matter of renewed discussion. There are many reasons to believe his academic work was not a servile translation of Greek sources.
The Contra Eutychen is the most original work by Boethius: original in its speculative solution and in its methodology of using hypothetical and categorical logic in the analysis of terms, propositions, and arguments. The Consolation of Philosophy is also original, though many authors restrict its originality to the methodology and the arrangement of its elements, not to its content, which would reflect the Neoplatonic school of Iamblichus, Syrianus, and Proclus. Boethius was primarily inspired by Plato, Aristotle, and Pythagoras. His scientific, mathematical, and logical works are not original, as he himself recognized.
Anicius Manlius Severinus Boethius (c. 480-524 AD), Boethius, was a prominent member of the gens Anicia, a family with a strong presence in republican and imperial Roman life. From the time of Constantine its members were converted and advocated the doctrine of the Christian church of Rome. The study of Latin epigraphy (compare Martindale 1980, p. 232) and some biographical details about his childhood given by Boethius himself (Consolation of Philosophy ii, 3, 5) suggest that his father was another Boethius, Narius Manlius Boethius, who was praetorian prefect, then prefect of Italy, and finally consul and patrician in 487 AD (compare Cameron 1981, pp. 181-183). It is not clear whether this Boethius is the one who was prefect of Alexandria in 457 AD, but Courcelle (1970, p. 299, n. 1) suggested as much to lend weight to his hypothesis that Boethius could have used his social position to go to Athens or Alexandria to learn Greek and deepen his study of philosophy and theology. What seems more likely is that Boethius’ grandfather was the same Boethius who was murdered by Valentinian III in 454 AD (compare Martindale 1980, p. 231).
After his father’s death, which occurred when Boethius was a child, he received the protection of Quintus Aurelius Symmachus Memmius, who belonged to a very influential family of the Roman nobility. Later, Boethius married Symmachus’s daughter, Rusticiana, sealing a family alliance that was disturbing to Theodoric, the Ostrogoth king, who was in Italy to impose authority and governance on the collapsed Western Empire, following the request of Flavius Zeno, the Eastern Roman Emperor. The political commitment of Boethius to Rome is attested not only by the public office of magister officiorum, the highest political rank that could be held under the reign of Theodoric, but also by the education and cursus honorum of his two sons, Symmachus and Boethius, who became senators (Consolation of Philosophy ii, 3, 8; 4, 7).
The prestige of Boethius in sixth-century Rome is attested not only by the honors granted him during his youth (some of which were denied to his older fellows; compare Consolation of Philosophy ii, 3), but also by the requests from friends and relatives for commentaries and treatises explaining difficult matters. In addition, Cassiodorus (Magnus Aurelius Cassiodorus), well known for founding the monastery of Vivarium in 529 AD, reports a scientific mission entrusted to Boethius by Theodoric: providing a horologium, a clock regulated by a measured flow of water, to the king of the Burgundians, Gundobad (compare Variae I, 45 and I, 10; Mommsen ed. 1894).
2. Time
Theodoric must have been an ominous character for the Romans, perhaps the lesser evil. The difficulties involved in moving from the pure ideal of Rome to Theodoric’s nascent eclectic culture must have shaped the context in which Boethius lived. By this time the unity of the Western Roman Empire was fragile, and political power was continuously disputed by various Germanic warlords, from Genseric, the Vandal king, in 455 AD until Theodoric, the Ostrogoth king, in 526 AD.
It was Theodoric who organized a more stable government and achieved greater political unity among the leaders of the two dominant ethnic groups, the Romans and the Ostrogoths. In 493, after defeating Odoacer there, Theodoric established Ravenna, in northern Italy, as the political and diplomatic capital of his government; the campaign against Odoacer had been planned by the Emperor Flavius Zeno in Constantinople as a punishment for not respecting the authority of the Eastern Empire.
Theodoric’s brief reign (he died in 526, two years after Boethius) kept the administrative structure of the Roman Empire and sustained a joint government between the two main ethnic and political groups. Theodoric was not an entirely uneducated man (though see Excerpta Valesiana II, 79; Moreau ed. 1968) and would have gained familiarity with Greek culture during his stay in Constantinople as a hostage from the age of eight; it is known that, whatever his motivation, he regularly respected the Roman civil institutions (but see the opinion of Anderson 1990, pp. 111-115). Boethius himself delivered a panegyric to Theodoric during the ceremony in which Boethius’ two sons were elected consuls (Consolation of Philosophy ii, 3, 30).
But the association of the two powers, the military power of Theodoric and the political power of Rome, had many reasons to turn adversarial. By this time, Boethius must have been not only the most influential Roman politician in the Ostrogoth government but also the most distinguished public figure of the Roman class. The personal and political opposition was, after all, deep and irreconcilable. The Arianism of Theodoric and the Catholicism of Boethius clashed after 518, when Justin was appointed Roman emperor of the East: he abolished the Henoticon, embarked on a policy of restoring the Catholic faith of the Council of Chalcedon, and began a rapprochement with Rome (Matthews 1981, p. 35). The most difficult years came as the aging Theodoric grew worried about the destiny of his non-Catholic Eastern allies and about his own stability in Italy. Around 524 AD, Boethius was accused of treason by Theodoric himself, without the right to be defended by the Roman Senate, which was likewise accused of treason (compare also Excerpta Valesiana II, 85-87; Mommsen ed. 1984). He was quickly imprisoned near Pavia, where he remained until his execution.
The detailed circumstances of the accusation have never been entirely clear to posterity, even if Boethius gives a summary of them in his Consolation of Philosophy i, 4. In essence, the charge was treason against the Ostrogoth government by seeking an alliance with Justin in Constantinople. The evidence adduced for this charge included Boethius’ intention of defending the Senate by protecting the senator Albinus (who had earlier been accused of the same charge), and the exhibition of letters sent to Justin that contained expressions hostile to Theodoric and his regime. Boethius calls these letters apocryphal (Consolation of Philosophy i, 4). Probably Albinus was not in secret negotiations with the Eastern empire, and Boethius, in wishing to defend the Senate, was innocent of treason and concealment. Nevertheless, he was accused and punished for this supposed conspiracy, at the mercy of a violent and despotic king who allowed him no proper defense and never proved the charge against him. The execution of Boethius came quickly, and the killing of his father-in-law, Symmachus, followed soon after, as did the mistreatment and death of Pope John I. During his imprisonment, Boethius wrote his masterpiece, The Consolation of Philosophy, which was not only a work of great influence in the Middle Ages and the Renaissance, but one of the most respected works of human creativity.
3. Writings
Boethius’ writings divide into three kinds: literary, theological, and scientific. The scientific writings are in turn divided into mathematical and logical works. The relationship between Boethius and his sources remains complex. His completed works were traditionally regarded as original, and the disorganized and incomplete shape of some of them, especially the scientific treatises, was explained by his imprisonment and death. However, many twentieth-century scholars came to believe that this classical description applies only to the theological treatises and partly to the Consolation of Philosophy, holding that elsewhere Boethius depends on his sources more than an original author would. Yet this opinion is itself a generalization from the situation of the scientific writings, and the truth lies somewhere in the middle.
a. Literary Writings
i. The Consolation of Philosophy
Boethius’ major philosophical work is his Consolatio Philosophiae, which combines stylistic refinement, through the alternation of prose and poetry, with philosophical ideas set within a conceptual framework based on a Neoplatonic interpretation of Aristotle, with some subtle touches of Christian philosophy (although this has been a matter of discussion). The unexpected visit of Lady Philosophy in his prison allows for a dialogue with a wonderful counterpoint between human opinion and the wisdom of Lady Philosophy, although Boethius says that Lady Philosophy is only the herald of the light of truth (IV, 1, 5). The themes raised by The Consolation of Philosophy, such as the nature of fortune, human happiness, the existence of God and of evil, and human freedom and divine providence, became the focus of attention for the Christian metaphysics of the Latin Middle Ages.
In Book I, Boethius briefly reviews his political life and the reasons for his accusation and imprisonment, showing that he is fully aware of who accused him. In Book II, he discusses the nature of fortune and the reasons why no one should trust in it. In Book III he argues (already in a different sense from what we might expect from the philosophy of Plato and Aristotle) that true happiness (beatitudo) is identical with divinity itself, whose nature is unique and simple. He identifies the highest good (perfectum bonum) with the father of all things (III, 10, 43), and maintains that it is not possible to possess happiness without first having access to the highest good. The difference between his theory of happiness and that of Aristotle and Plato is that Boethius places God as a sine qua non condition for the possession of happiness, implying that every man must trust in God’s provident wisdom in order to be happy. In Book IV, he addresses the issue of the existence of evil in the realm of one who knows and can do everything (IV, 1, 12: in regno scientis omnia potentis omnia). The allusion to the kingdom of God (regnum dei) is highly significant as evidence of the work’s implicit Christianity, mostly because he completes this allusion with the metaphor of the gold and clay vessels that the master of the house disposes, a symbol found in the Letters of Saint Paul (2 Timothy 2, 20; 2 Corinthians 4, 7; and Romans 9, 21) and cited extensively in the Patristic tradition. In Book V, Boethius examines one of the most complex problems of post-Aristotelian philosophy: the compatibility of human freedom and divine foreknowledge (divina praescientia). Boethius’ treatment proved to be of great theoretical value for later philosophy, and traces of his discussion can be seen in Thomas Aquinas, Valla, and Leibniz (compare Correia (2002a), pp. 175-186).
Neoplatonic influence has been discerned in the Consolation, especially that of Proclus (412-485 AD) and of Iamblichus. But this fact is not enough to show that Boethius in the Consolation follows only Neoplatonic authors. The issue is whether there is an implicitly Christian philosophy in this work. The absence of the name of Christ and of Christian authors has led some scholars to believe that the Consolation is not a work of Christian philosophy, and Boethius’ Christianity was even doubted on this account (compare Courcelle, 1967, pp. 7-8). Added to this is the thought that, if Boethius was a Christian, he would presumably have sought consolation in the Christian faith rather than in pagan philosophy. However, it must be remembered that the genre of philosophical consolation, as a form of logotherapy, was traditional in Greek philosophy. Crantor of Soli, Epicurus, Cicero, and Seneca had written consolations on the loss of life, exile, and the other ills that afflict the human spirit. Cicero in his Tusculan Disputations (3.76) even shows that the different philosophical schools were committed to the task of consoling the dejected, and he recognizes the various strategies applied by the different schools according to how they conceived the place of human beings in the universe. Boethius was surely aware of this tradition (Cicero wrote a consolation for himself), and if we take this for granted, Boethius’ Consolation of Philosophy fits within consolation as a universal genre, together with the themes of universal human grief (evil, destiny, fortune, unhappiness). At the same time, Boethius would be renovating this literary genre into a Christian one, since Lady Philosophy does not direct Boethius’ spirit towards pagan philosophy in general, but rather towards a new philosophy that should be called Christian.
We see this not only in the evocations of Saint Paul’s letters and the new theory of happiness but also when, in Book V, Boethius identifies God with the efficient principle (de operante principio) capable of creating from nothing (V, 1, 24-29). Hence, he adapts Aristotle’s definition of chance by incorporating the role of divine providence (providentia) in disposing all things in time and place (locis temporibusque disponit: V, 1, 53-58).
b. Theological Treatises
The Opuscula sacra or Theological treatises are original efforts to resolve some theological controversies of his time, which were dominated by Christological issues and by the Acacian schism (485-519). Basically, they are anti-heretical writings, in which a refined set of Aristotelian concepts and arguments is deployed in favor of the Chalcedonian formula on the unity of God and against both Nestorius’ dyophysitism and Eutyches’ monophysitism. Boethius claims to be seeking an explanation of these issues using Aristotle’s logic. This makes him a forerunner of those theologians who ground theological speculation in logic. The following five treatises are now accepted as authentic: (1) De Trinitate, (2) Whether the Father and the Son and the Holy Spirit are substantially predicated of divinity, (3) How substances are good in virtue of their existence without being substantially good, (4) Treatise against Eutyches and Nestorius, and (5) De Fide Catholica. The most original and influential of Boethius’ theological treatises is the Contra Eutychen.
Because of the absence of explicit Christian doctrines in the Consolation of Philosophy, the authenticity of the theological treatises was doubted by some scholars in the early modern era. But Alfred Holder discovered a fragment of Cassiodorus in a manuscript of Reichenau, later published by H. Usener (1877), in which the existence of these treatises and their attribution to Boethius is reported. Cassiodorus served as senator with Boethius and succeeded him in the office of magister officiorum in Theodoric’s government. Cassiodorus mentions that Boethius wrote “a book on the Trinity, some chapters of dogmatic teaching, and a book against Nestorius” (compare Anecdoton Holderi, p. 4, 12-19. Usener ed. 1877). This discovery not only confirmed the authenticity of Boethius’ theological treatises but also dispelled the doubts over whether Boethius was a Christian. The Treatise against Eutyches and Nestorius has been recognized as the most original of Boethius’ theological treatises (Mair, 1981, p. 208). By the year 518 Boethius had translated, commented on, and treated a large part of Aristotle’s Organon (compare De Rijk’s chronology, 1964). Thus, Boethius makes use of Aristotelian logic as an instrument. In Contra Eutychen, he uses all the resources that are relevant to the subject in question: division and definition, hypothetical syllogisms, the distinction of ambiguous meanings of terms, and the detection and resolution of the fallacies involved. This is accompanied by the idea that human intelligence can store arguments against or in favor of a certain thesis so as to possess a copia argumentorum (a stock of arguments; p. 100, 126-130), suggesting that there can be several arguments to demonstrate the same point under discussion, a matter reminiscent of Aristotle’s Topics.
Thus, Boethius gives a perfect order of exposition, rigorously declared at the beginning of the central discussion of the treatise: 1) define nature and person, and distinguish these two concepts by means of a specific difference; 2) identify the extreme errors of the positions of Nestorius and Eutyches; 3) present the middle path of the solution of the Catholic faith. Boethius’ solution is the Catholic solution to the theological problem of the two natures of Christ. On his view, Christ is one person and two natures, the divine and the human, which are perfect and united without confusion. He is thus consubstantial with humanity and consubstantial with God.
c. Scientific Treatises
Within the scientific writings, we find mathematical and logical works. Boethius left scientific writings on arithmetic, geometry, and music; no work on astronomy has survived, but Cassiodorus (Variae I. 45, 4) attributed to him one based on Ptolemy. Similarly, Cassiodorus attributes to him a work on geometry containing a translation of Euclid’s Elementa, but what we now count as Boethius’ writing on geometry does not correspond to Cassiodorus’ description. His logical works concern demonstrative and inventive logic. A treatise on division (De divisione) has also been credited to him (compare Magee 1998), but one on definition (De definitione) has been rejected as authentic and attributed to Marius Victorinus (Usener, 1877). Boethius devoted three types of writings to logic: translations, commentaries, and treatises.
Boethius uses the term quadrivium (De Institutione Arithmetica I, 1, 28) to refer to arithmetic, geometry, music, and astronomy, which reveals that he was engaged not only in the development of these sciences but also in their teaching. His works on logic, however, do not reveal whether this plan also covered the other disciplines of the trivium, grammar and rhetoric.
The scientific writings of Boethius occupied an important place in the education of Latin Christendom. The influence that these treatises had on the medieval quadrivium, and even on the early modern tradition, is such that only Newton’s physics, Descartes’ analytical geometry, and Newton’s and Leibniz’s infinitesimal calculus were able to prevail over the Boethian scientific tradition.
It is known that the way Boethius approaches arithmetic and music is speculative and mathematical. Arithmetic is conceived as the science of numbers and does not necessarily include calculation. And music is a theoretical doctrine of proportion and harmony, having nothing directly to do with music-making or performance technique. In De Institutione musica I, 2, 20-23, Boethius distinguishes three types of music: cosmic (mundana), human (humana), and instrumental. He distinguishes them according to their universality. The mundana is the most universal, since it corresponds to celestial harmony and the order of the stars: some stars rotate lower, others higher, but all form a set with one another. It is followed by human music, which is what we, as humans, experience and reproduce directly within ourselves: the song and the melodies that are created by poetry. It is responsible for our own harmony, especially the harmonious conjunction between the sensitive and the intellectual parts of our nature, just as bass and treble voices articulate in musical consonance. The third is instrumental music, generated by the tension of a string, by the breath of air, or by percussion.
At the beginning of his De Institutione Musica (I, 10, 3-6), following Nicomachus of Gerasa, Boethius adopts uncritically not only Pythagoras’ theory of music but also the supernatural context in which Pythagoras announces the origin of music through a divine revelation, given by the symmetric and proportional sounds coming from a blacksmith’s hammers. The marked Pythagorean tendency of this theory of music prevents Boethius from giving a richer account of music that would include the more empirical approach of Aristoxenus, whom Boethius criticizes just as he criticizes the Stoics in logic.
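The proportions underlying this speculative, mathematical conception of music are small whole-number ratios. The passage above does not list them, but on the standard Pythagorean account (an assumption here, not stated in the text) the octave is 2:1, the fifth 3:2, and the fourth 4:3, and intervals compose by multiplying their ratios, as this short sketch shows:

```python
from fractions import Fraction

# Standard Pythagorean consonances (assumed here; the article does not list them):
octave = Fraction(2, 1)   # diapason
fifth = Fraction(3, 2)    # diapente
fourth = Fraction(4, 3)   # diatessaron

# Intervals compose by multiplying their ratios:
# a fifth plus a fourth makes an octave...
assert fifth * fourth == octave

# ...and the difference between the fifth and the fourth
# is the whole tone, 9:8.
tone = fifth / fourth
assert tone == Fraction(9, 8)
```

On the Boethian conception, such calculations of proportion, not performance, are the proper business of the musicus.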
d. Logical Writings
Boethius has three kinds of works on logic: translations, commentaries, and treatises. Their content revolves mainly around Aristotle’s logical writings: Categories, De Interpretatione, Prior Analytics, Posterior Analytics, Topics and Sophistical Refutations, traditionally called the Organon. But even if Boethius wanted to devote works on each one, he did not complete the task.
i. Translations
As a translator, Boethius shows consummate artistry. His translations are literal and systematic: they do not lose the force of the Greek, and they never spoil the style of the Latin. His literal method of translation has been compared to that developed later by William of Moerbeke (who translated some works of Aristotle and other Greek commentators) for the use and study of Thomas Aquinas. Boethius’ translations from Greek are so systematic that scholars can often determine which Greek term lies behind a given Latin word. Boethius’ translations are edited in Aristoteles Latinus (1961-1975). Translations of every work of Aristotle’s Organon have been found. In addition to these works, Boethius translated the Isagoge of Porphyry, which is an introduction (Eisagogé is the Greek term for ‘introduction’) to Aristotle’s Categories.
In these translations, Boethius surpassed the art of Marius Victorinus, who had earlier translated into Latin Aristotle’s Categories and De Interpretatione, and Porphyry’s Isagoge. Boethius himself attributed certain errors and confusions to Marius Victorinus, and he informs us that Vetius Praetextatus’ translation of Aristotle’s Prior Analytics, rather than being a translation of Aristotle’s text, is a paraphrase of the paraphrase made by Themistius of this Aristotelian work (compare Boethius in Int 2, p. 3; Meiser ed. 1877-1880). The translation of Greek works into Latin was common. Apuleius of Madaura, a Latin writer of the second century AD, born and settled in North Africa, had translated the arithmetic of Nicomachus of Gerasa and written an abridgement of Aristotelian logic. In general, we can say that Boethius saw very clearly the importance of systematic translations of Greek philosophy and science into Latin as an educational service to the nascent Latin Christianity of Europe.
ii. Commentaries
Even if Boethius planned to comment on the complete Organon, he finished only the following:
On Porphyry’s Isagoge (In Isagogen Porphyrii, two editions).
On Aristotle’s Categories (In Aristotelis Categorias, two editions).
On Aristotle’s De Interpretatione (In Aristotelis Peri hermeneias, two editions).
On the Topics of Cicero (In Ciceronis Topica, one edition).
Though no commentary on the Posterior Analytics, Topics, or Sophistical Refutations exists, this does not mean that Boethius was unaware of them. In his Introductio ad syllogismos categoricos (p. 48, 2), when Boethius deals with singular propositions, he seems to follow explanations closely related to a commentary on the Sophistical Refutations. Even if his plan of producing a double commentary on every work is not original, he explained this modus operandi: the first edition contains everything that is simple to understand, and the second edition focuses on everything that is more subtle and requires deeper and longer explanation.
The influence of these commentaries on medieval education was enormous, as they contain key concepts that became central to both the logica vetus and medieval philosophy. In fact, his commentaries on Porphyry’s Isagoge contain the so-called problem of universals (Brandt ed. 1906, p. 24, 159), and his commentaries on De Interpretatione give the linguistic and semantic basis of the long tradition of logical analysis among medieval thinkers up to Peter Abelard. Additionally, his commentary on Cicero’s Topics was influential in the history of logic and the sciences by dividing logic into demonstrative and dialectical branches, underlining the distinction between Aristotle’s Analytics and Topics.
Boethius’ commentaries often proceed through long explanations, but they contain valuable information on the history of logic, as they build upon many doctrines of earlier commentators on Aristotle. The commentary on Aristotle’s logic had a long Greek tradition, and Boethius knew how to select those commentators and doctrines that improved the understanding of Aristotle’s text. In that tradition, earlier authors exercised an important influence over later ones. However, there is important evidence that Boethius is not following a continuous copy of any one of the earlier Greek commentators.
iii. Treatises
Boethius not only translated and commented on the works of Aristotle and Porphyry, but he wrote some monographs or logical treatises that are different from his commentaries, for they are not intended to provide the correct interpretation of Aristotle’s text, but to improve the theory itself. If we leave aside the De definitione, five treatises are recognized:
On Division (De divisione liber)
On Categorical syllogism (De syllogismo categorico)
Introduction to categorical syllogisms (Introductio ad syllogismos categoricos)
On Topical Differences (De Topicis differentiis)
On hypothetical syllogisms (De hypotheticis syllogismis).
1. On Division
Boethius’ De divisione transmitted the Aristotelian doctrine of division, that is, the doctrine that divides a genus into subordinate species. The aim of division is to define (compare Magee 1998). For example, the genus animal may be divided into rational and non-rational, and rational animal into mortal and immortal, so that man is defined as a rational, mortal animal.
In Aristotle’s works there are examples of divisions (for example, Politics 1290b21, De generatione et corruptione 330a1), which shows that Boethius accepted this method of definition regardless of the fact that its origin was Platonic. The logical procedure was also appreciated by the first Peripatetics; the proof is that, as Boethius reports at the beginning of this treatise, Andronicus of Rhodes published a book on division because of its considerable interest to Peripatetic philosophy (De divisione 875D; compare also Magee 1998, pp. xxxiv-xliii). Also, the Neoplatonic philosopher Plotinus studied Andronicus’ book, and Porphyry adapted its contents for commenting on Plato’s Sophist (De divisione 876D). The species of division recounted by Boethius are as follows: any division is either secundum se or secundum accidens. The first has three branches: (i) a genus into species (for example, animal is divided into rational and non-rational); (ii) a whole into its parts (for example, the parts of a house); and (iii) a term into its own meanings (for example, ‘dog’ means a quadruped capable of barking, a star in Orion, and an aquatic animal). The division secundum accidens is also triple: (i) a subject into its accidents (for example, men into black, white, and those of an intermediate color); (ii) an accident into subjects (for example, of the things that are sought, some belong to the soul and some belong to the body); and finally, (iii) accidents into accidents (for example, among white things some are liquid and some are solid).
It is worth noting that not all genus-species divisions are dichotomous, as they were for the Platonists, because Peripatetic philosophers also accepted that a genus can be divided into three or more species, since the general condition for a division to be correct is that it must never have fewer than two species and never infinitely many (De divisione 877C-D). This, it seems, is one of the differences between Aristotle and the Platonists. In fact, Aristotle criticizes the Platonists’ dependence on dichotomous divisions by arguing that, if all divisions were dichotomous, the number of animal species would have to be a power of two (Aristotle, Parts of Animals I, 3, 643a16-24).
2. On the Topics
Boethius’ idea of logic is complex and in no way reduces to formal demonstration alone. When he refers to logic as such (compare In Isagogen 138,4-143,7; De Top Diff 1173C; and In Ciceronis topica I 2.6-8), he distinguishes between demonstrative and dialectical syllogisms and criticizes the Stoics for leaving out the dialectical part of logic and maintaining a narrower conception of it. In fact, Boethius does not reduce logic to demonstration, but divides it into two parts: judgement and the discovery of arguments. Since he identifies the former with the Analytics and the latter with the Topics, the division serves to reconcile these two main procedures of logic. Logic comprises both demonstration and the justification of reasonable premises, since the syllogism can handle necessary as well as merely credible matters.
In In Ciceronis Topica Boethius comments on Cicero’s Topics. The objective of this work is to take up Ciceronian forensic cases and explain them within his understanding of the Peripatetic tradition of Aristotle’s Topics. Boethius’ notion of a topic is based on what seems to be the Theophrastean notion: a universal proposition, primitive and indemonstrable, and known in and of itself (Stump, 1988, pp. 210-211). A topic is confirmed through human experience, and its function is to serve as a premise within the argument sought. The topic may lie within or outside the argumentation. One example in the treatise (1185C) appears to be autobiographical: the question of whether it is better to be ruled by a king or by a consul. According to Boethius, one should argue thus: the rule of a king lasts longer than the government maintained by a consul; if we assume that both governments are good, it must be said that a good that lasts longer is better than one that lasts a shorter time; consequently, to be ruled by a king is better than to be governed by a consul. This argument clearly exhibits the topic or precept: goods that last longer are more valuable than those that last a shorter time. Within the argument it works as an indemonstrable proposition. Boethius regularly calls such a proposition a maximal proposition (propositio maxima).
Boethius called dialectic the discipline that studies this type of argumentation. A syllogism can be categorical or hypothetical, but it will be dialectical if the matter of its premises is only credible and not demonstrative. In De Top Diff 1180C, Boethius introduces a general classification of arguments in which demonstrative arguments can be non-evident to human opinion and nevertheless demonstratively true. In fact, our science has innumerable non-evident affirmations that are entirely demonstrable. Dialectical arguments, on the other hand, are evident to human opinion, but they may lack demonstration.
Boethius devotes the entire Book 5 of this commentary to discussing dialectical hypothetical syllogisms, and here, as in his treatise on hypothetical syllogisms, the role of belief (fides) is quite important in defining dialectical arguments in general, as will be explained further in the following section.
3. On Hypothetical Syllogisms
The De hypothetico syllogismo (DHS), perhaps originally titled by Boethius De hypotheticis syllogismis, as Brandt (1903, p. 38) suggested, was published in Venice in 1492 (1st ed.) and 1499 (2nd ed.). This double edition was the basis for the editions of Basel (1546 and 1570) and the subsequent publication of J.P. Migne in Patrologia Latina, vols. 63 and 64 (1st ed. 1847; 2nd ed. 1860), which appears to be a reprint of the Basel edition. The editions of 1492 and 1499 form the editio princeps, which was used regularly for the study of this work until the critical revision of the text by Obertello (1969). DHS is the most original and complete treatise on hypothetical logic of all those written in antiquity that have survived. It was not systematically studied during medieval times, but it enjoyed a renaissance in the twentieth century through the works of Dürr (1951), Maroth (1979), Obertello (1969), and others.
According to the conjecture of Brandt (1903, p. 38), it was written by Boethius between 510 and 523, but De Rijk (1964, p. 159) maintains that it was written between 516 and 522. In DHS Boethius does not follow any text of Aristotle but rather Peripatetic doctrines. This is because Aristotle wrote nothing about hypothetical syllogisms, although he was aware of the difference between categorical and hypothetical propositions. Thus, De Interpretatione 17a15-16 states that “A single statement-making sentence is either one that reveals a single thing or one that is single in virtue of a connective” (Ackrill’s translation, 1963), and later (17a20-22) Aristotle adds, “Of these the one is a simple statement, affirming or denying something of something, the other is compounded of simple statements and is a kind of composite sentence” (Ackrill’s translation, 1963). Even though Aristotle promised to explain how categorical and hypothetical syllogisms are related to each other (compare Prior Analytics 45b19-20 and 50a39-b1), he never did.
Aristotle developed a syllogistic logic only for simple or categorical propositions, that is, propositions saying something of something (for example, “Virtue is good”). The syllogism with conditional premises (for example, “The man is happy, if he is wise”) was covered by Aristotle’s first associates, Theophrastus and Eudemus (DHS I, 1,3). Boethius’ DHS contains the most complete surviving information about this Peripatetic development. The theory is divided into two parts: disjunctive and connective propositions. A connective (conditional) proposition has the form “If P, then Q”, where P and Q are simple propositions. A disjunctive proposition has the form “Either P or Q”. Boethius presents two indemonstrable syllogisms for each part. The first disjunctive syllogism: “It is P or it is Q; but it is not P; therefore, it is Q”. And the second: “It is P or it is Q; but it is not Q; therefore, it is P”. As to connectives, the first syllogism is “If it is P, then it is Q; but it is P; therefore, it is Q”. And the second is “If it is P, then it is Q; but it is not Q; therefore, it is not P”. Boethius accepts that “It is P or it is Q” is equivalent to “If it is not P, then it is Q”. Accordingly, Boethius leaves implicit the concordance between hypothetical and disjunctive syllogisms:
First disjunctive syllogism: It is P or it is Q; it is not P; therefore, it is Q.
First hypothetical syllogism: If it is not P, it is Q; it is not P; therefore, it is Q.
Second disjunctive syllogism: It is P or it is Q; it is not Q; therefore, it is P.
Second hypothetical syllogism: If it is not P, it is Q; it is not Q; therefore, it is P.
The theory also develops more complex syllogisms and classifies them into modes. For example, DHS II, 11, 7 says, correctly: “The eighth mode is what forms this proposition: ‘If it is not a, it is not b; and if it is not b, it is not c; but it is c; therefore, it must be a’”.
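These moods can be checked mechanically against modern truth tables. The sketch below is a modern reconstruction, not anything found in Boethius: it reads the conditional as the material conditional and the disjunction inclusively, both of which are interpretive assumptions about the ancient texts. Under those assumptions, the indemonstrables, Boethius’ equivalence of disjunction with the conditional, and the eighth mode all come out valid:

```python
from itertools import product

def implies(p, q):
    # Material conditional: false only when p is true and q is false.
    return (not p) or q

def valid(premises, conclusion, n_vars):
    # An argument is valid iff the conclusion holds in every
    # truth-table row in which all premises hold.
    return all(conclusion(*row)
               for row in product([True, False], repeat=n_vars)
               if all(prem(*row) for prem in premises))

# First hypothetical mood: If it is P, it is Q; it is P; therefore, it is Q.
assert valid([lambda p, q: implies(p, q), lambda p, q: p],
             lambda p, q: q, 2)

# Second hypothetical mood: If it is P, it is Q; it is not Q; therefore, it is not P.
assert valid([lambda p, q: implies(p, q), lambda p, q: not q],
             lambda p, q: not p, 2)

# First disjunctive mood: It is P or it is Q; it is not P; therefore, it is Q.
assert valid([lambda p, q: p or q, lambda p, q: not p],
             lambda p, q: q, 2)

# Boethius' equivalence of "P or Q" with "If it is not P, it is Q"
# (it holds on the inclusive, truth-functional reading of 'or').
assert all((p or q) == implies(not p, q)
           for p, q in product([True, False], repeat=2))

# The eighth mode (DHS II, 11, 7): if not a, not b; if not b, not c;
# but it is c; therefore, it is a.
assert valid([lambda a, b, c: implies(not a, not b),
              lambda a, b, c: implies(not b, not c),
              lambda a, b, c: c],
             lambda a, b, c: a, 3)
```

Note that the equivalence of “P or Q” with “If it is not P, it is Q” fails if the disjunction is read exclusively, so the sketch depends on the inclusive reading.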
Boethius’ development does not use conjunctions, and this must be an important difference between the Stoic theory and the original Peripatetic development. This fact leads Boethius to deny the hypothetical affirmation “If it is P, then it is Q” by attaching the negative particle to the consequent, thus: “If it is P, then it is not Q” (DHS I, 9,7). This is an internal negation, unlike the Stoic negation, which is external or propositional, since it applies the negative particle to the entire proposition. This explains why he does not consider the Stoic axioms based on conjunction in DHS, as he did in his In Ciceronis Topica, V.
The question of whether Boethius is right in believing that the theory comes from Theophrastus and other Peripatetics is still difficult to answer. Speca (2001, p. 71) raises the doubt that we cannot at present be certain of its Peripatetic provenance, because the sources go no further back than the end of the second century AD, and by then the hypothetical theory had already been conflated with Stoic terminology. He is right, if we look at Boethius’ examples such as ‘If it is day, it is light’, and so forth, which come from the Stoic school. On the other hand, Bobzien (2002 and 2002a) has supported the contrary view and is inclined to accept the historical accuracy of Boethius’ account.
Speca’s (2001) scrupulous view is methodologically safe, but it is worth noticing that there are at least three important differences between Boethius’ hypothetical syllogistic and Stoic logic. The first is negation: Peripatetic hypothetical negation follows the pattern of categorical negation; the negative particle must be placed before the most important part of the proposition, which in a conditional proposition is the consequent. Thus, as said, the negation of “If P, then Q” will be “If P, then not Q”. Stoic negation places the negative particle before the entire proposition, so that the negation will be “It is not the case that if P, then Q”.
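On a modern truth-functional reading of the conditional (an interpretive assumption; neither school read the conditional simply as the material conditional), these two negations are not equivalent, and tabulating them shows exactly where they come apart:

```python
from itertools import product

def implies(p, q):
    # Material conditional (a modern reading, assumed for the comparison).
    return (not p) or q

# Peripatetic-style internal negation: negate the consequent.
internal = lambda p, q: implies(p, not q)
# Stoic-style external negation: negate the whole conditional.
external = lambda p, q: not implies(p, q)

rows = list(product([True, False], repeat=2))
# The two negations disagree exactly when P is false: the internal
# negation is then vacuously true, the external one false.
diff = [(p, q) for p, q in rows if internal(p, q) != external(p, q)]
assert diff == [(False, True), (False, False)]
```

So the choice between internal and external negation is not a mere notational variant: on this reading it changes the truth conditions of the denied conditional.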
The second difference is that Boethius, in his DHS, distinguishes material and formal conclusions just as he does in his treatises on categorical logic (compare DHS I, iv, 1-2; 3; and I, ii, 1-7; II, ii, 7). In a hypothetical syllogism, to affirm the consequent is fallacious; but if the terms mutually exclude each other (as if they had an impossible matter) and the third hypothetical mood is given (“If it is not P, it is Q”), there will be a syllogism. Boethius gives the example “If it is not day, it is night. It is night. Therefore, it is not day”. But the conclusion does not obtain if ‘white’ and ‘black’ are substituted for P and Q. Thus, a syllogism, whether categorical or hypothetical, is logically valid if it does not depend on a specific matter of the propositions to be conclusive. By contrast, material syllogisms, whether categorical or hypothetical, are valid only under certain matters within a certain form; they are not logical conclusions, for they are not valid universally, that is, in every propositional matter. Accordingly, Boethius (DHS II, iv, 2) distinguishes between the nature of the relation (natura complexionis) and the nature of the terms (natura terminorum).
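The day/night example can be replayed in modern terms. In the sketch below (a reconstruction; the day/night matter is rendered as the constraint that P holds exactly when Q does not, which builds in both mutual exclusion and exhaustiveness), the mood fails formally but goes through materially:

```python
from itertools import product

def implies(p, q):
    # Material conditional (modern reading, assumed).
    return (not p) or q

rows = list(product([True, False], repeat=2))

# Formally, "If it is not P, it is Q; it is Q; therefore, it is not P"
# affirms the consequent and has a countermodel: P and Q both true.
counter = [(p, q) for p, q in rows
           if implies(not p, q) and q and p]
assert counter == [(True, True)]

# Restrict the matter as with 'day'/'night' (P holds exactly when Q does not):
# in every remaining row where the premises hold, not-P follows.
material_rows = [(p, q) for p, q in rows if p == (not q)]
assert all(not p for p, q in material_rows if implies(not p, q) and q)
```

The white/black substitution fails precisely because those terms exclude each other without being exhaustive, so the countermodel (neither white nor black) is not removed by the matter.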
The third difference lies in the function Boethius assigns to fides, belief (DHS I, 2,4; I, 2,5; II, 1,2). The role of fides is the crucial core of Boethius’ DHS. According to him, if someone argues through the first indemonstrable, or by any other hypothetical syllogism, he needs to confirm the minor premise, which is a belief. It is not the syllogism as such which is in doubt, but its conclusion, which is conditional on the truth of the categorical premise. Boethius’ reason is the originality and primitiveness of categorical syllogisms. He calls categorical syllogisms ‘simple’ and hypothetical syllogisms ‘non-simple’, because the latter resolve into the former (DHS I, 2,4: Non simplices vero dicuntur quoniam ex simplicibus constant, atque in eosdem ultimos resolvuntur). The role of belief in Boethius’ theory of hypothetical syllogisms is also emphasized in his In Ciceronis Topica, and if Stump (1988, pp. 210-211) is right in recognizing the activity of Theophrastus behind Boethius’ theory of Aristotle’s Topics, then Theophrastus and the first Peripatetics could well lie behind DHS as well.
iv. Treatises on Categorical Syllogisms
The De syllogismo categorico (DSC) and Introductio ad syllogismos categoricos (ISC) are two treatises on categorical syllogisms composed by the young Boethius. Their contents are similar and almost parallel, a fact that has prompted various explanations since the nineteenth century. They greatly influenced the teaching of logic in medieval Western thought, especially the former, which is the only one of the two that contains syllogistic logic.
1. The De Syllogismo Categorico
DSC was written by Boethius early in his life, perhaps around 505 or 506 AD (for the chronology of Boethius’ works on logic, compare De Rijk 1964). Despite its importance, it did not receive a critical edition until the work of Thomsen Thörnqvist (2008a). In the oldest codices (for example, Orleans 267, p. 57), DSC is entitled “Introductio in syllogi cathegoricos”, but this title changed to De syllogismo categorico after the editions by Martianus Rota (Venice, 1543) and Henrichus Loritus Glareanus (Basel, 1546). The edition of Migne (1891) is based on these two editions of the sixteenth century. During the twentieth century, most scholars corrected this title to De categoricis syllogismis, after Brandt (1903, p. 238, n. 4) argued for using the plural.
The source of DSC seems to be a certain introduction to categorical syllogistic logic that Porphyry had written to examine and approve the syllogistic theory of Theophrastus, whose principles are inspired by Aristotle’s Prior Analytics. This is suggested by what Boethius says at the end of the work (p. 101, 6-8): “When composing this on the introduction to the categorical syllogism as fully as the brevity of an introductory work would allow, I have followed Aristotle as my principal source and borrowed from Theophrastus and Porphyry occasionally” (Thomsen Thörnqvist transl.). The existence of a similar work by Theophrastus is confirmed by various ancient references; for example, Boethius attributes to him the work “On the affirmation and negation” (in Int 2, 9, 25; Meiser ed.; also Alexander of Aphrodisias in An Pr 367, 15 and so forth), and Alexander of Aphrodisias profusely cites Theophrastus’ own Prior Analytics (in An Pr 123, 19 and 388, 18; Wallies ed. On the works by Theophrastus, see Bochenski 1947 and Sharples 1992, pp. 114-123). Moreover, J. Bidez, in his study of the life and works of Porphyry (compare Bidez 1923, p. 198, and Bidez 1964, p. 66*), confirms the existence of a work entitled “Introduction to categorical syllogisms” written by Porphyry.
DSC is divided into two books. In the first, Boethius reviews the theory of simple propositions, in a way that recalls his commentaries on Aristotle’s De Interpretatione (ed. Meiser 1877-1880). However, DSC exceeds both the commentaries and what Aristotle teaches in his De Interpretatione. In fact, it includes some additional matters: (i) the law of subalternation when reviewing the logical relationships of the Square of Opposition; (ii) a broader explanation of conversion, which includes conversion by contraposition (which Aristotle developed only for universal affirmative propositions); (iii) conversion by accident for universal negative propositions (which Aristotle did not include); and (iv) the division of simple propositions.
The second book is a synopsis of the central part of Aristotle’s theory of syllogism (Prior Analytics I, 2-8) plus Theophrastus’ doctrine of indirect syllogistic moods. Theophrastus added five indirect moods to Aristotle’s four moods of the first figure. Medieval logicians knew these moods under the technical names Baralipton, Celantes, Dabitis, Fapesmo, and Frisesomorum. Moreover, the second book of DSC (69, 8-72, 11) contains a complete explanation of the definition of syllogism, which recalls Alexander of Aphrodisias’ teaching in his commentary on Aristotle’s Topics. Again, DSC is more technical and elaborate than Aristotle’s Prior Analytics. In addition, Boethius’ explanation of the reduction of the imperfect moods of the second and third syllogistic figures to the four perfect moods of the first figure (Barbara, Celarent, Darii, and Ferio) is more systematic than Aristotle’s own explanations.
A careful reading of the logical contents of DSC also makes clear that Boethius (DSC 17, 10) is following a division of categorical propositions to define the three main logical operations of Aristotelian logic: the opposition of propositions (contradiction, contrariety, and subcontrariety); the conversion of propositions (simple, by accident, and by contraposition); and syllogisms, with their figures, syllogistic moods, and the main extensions of the first figure. This division is not Boethius’ own. Already Alexander of Aphrodisias (in An Pr 45, 9) makes full use of it. There are remnants in Apuleius (PeriH 7, 9-14, p. 183) and Galen (Inst Log 6, 3), and it reappears in Boethius’ time in Ammonius (in An Pr 35, 26) and Philoponus (in An Pr 40, 31). It is also present in later authors.
After commenting on the definitions of the elements of simple propositions (name, verb, indefinite name and verb, and phrase), Boethius takes a pair of simple propositions and divides the possible cases as follows: a pair of simple propositions can or cannot have terms in common. If they do not have any term in common, then they do not have any logical relation. But if they have some term in common, there is an alternative: either both terms are in common or only one term is in common. If both terms are in common, they can or cannot have the same order. When they have the same order, the theory of Opposition is obtained. If the terms change their order, the theory of Conversion is defined. On the other hand, if the pair has only one term in common, the syllogistic theory appears.
2. The Introductio ad Syllogismos Categoricos
Boethius is the author of DSC and ISC, two treatises on categorical logic. They show a striking similarity, and they are parallel to some extent. This raises the question of why Boethius wrote two such treatises. The first modern explanation proposed a strong dependence between them. Prantl (1855, I, p. 682, n. 80) believed that the first book of DSC was an excerpt of ISC. But the presence of syllogistic logic in the second book of DSC and its total absence in ISC is enough to contradict Prantl’s explanation. Brandt (1903, p. 245) was right in refuting him. However, the reason why the treatises resemble each other so closely had not yet been found. Murari (1905) and McKinlay (1907) suggested that the second book of DSC (dedicated to syllogistic logic) was originally the second book of ISC, while the first book of DSC was not by Boethius but was attached to the codices later, in the Middle Ages. According to McKinlay’s later revision of his hypothesis (1938, p. 218), ISC must be identified with Boethius’ Institutio categorica, thought to be lost, which Boethius mentions in his treatise On Hypothetical Syllogisms (833B).
McKinlay’s hypothesis has lost support due to later works by De Rijk (1964, p. 39) and Magee (1998, pp. xvii-xix). In the early twenty-first century, in her critical edition of both treatises, Christina Thomsen Thörnqvist (2008a and 2008b) has given a new explanation. She thinks (2008a, p. xxxix) that ISC is a revision of the first book of DSC, and that Boethius intended to revise both books of DSC, but this original plan was not completed, for while Boethius was revising the first book, he realized that he had gone too far in what was supposed to be nothing more than an introduction to Aristotle’s syllogistic logic. In this conjecture she follows Marenbon (2003, p. 49).
In any case, ISC differs from DSC not only in its omission of syllogistic logic. ISC (15.2) incorporates the notion of strict and non-strict definitions of the elements of the categorical proposition (name, verb, and so on). It gives notable attention to proofs based on the matters of proposition (29.18). And it gives serious consideration to singular propositions, including material that was not in Boethius’ commentaries (48.2). Additionally, ISC contains a crucial difference: the logic of indefinite propositions. It states their opposition (51.9) and their equivalence (62.9), and it develops conversion by contraposition in more detail (69.1).
3. The Divisions of DSC and ISC
ISC cannot be the breviarium Boethius promised to write in his second commentary on Aristotle’s De Interpretatione (in Int 2, p. 251, 8-9), although Shiel (1958, p. 238) thinks the contrary. The reason is simply that ISC contains more than Boethius’ commentaries on De Interpretatione. The essence of ISC must instead be sought in its division.
After developing the linguistic section of Aristotle’s De Interpretatione, both ISC and DSC present their plans by establishing a division of a pair of categorical propositions. These divisions contain identical branches, but they also show important differences. On the one hand, the division of ISC is not as complete as that of DSC, because it does not incorporate the theory of syllogism; on the other hand, it is more specific than that of DSC in incorporating indefinite terms, on which DSC says nothing. The following description shows how both divisions overlap one another, and what the differences between them are:
On the one hand, if ISC were the first book of DSC, then the indefinite propositions (which only ISC develops) would play no part in the second book of DSC (which is only on syllogisms). Accordingly, their introduction would be purposeless. On the other hand, if the plan of ISC were a revision of DSC’s two books, then Boethius would have been obliged to develop a theory of syllogisms with indefinite premises, which is unlikely since ISC’s division does not contain syllogistic logic (despite ISC’s being an introduction to syllogistic). But even if one thinks that this could have been so, there are several doubts about whether Boethius’ sources had the logical capacity to do so, even though the issue was not unknown. Boethius indeed recounts (in Int 2, 12-26, p. 316) that Plato and others made conclusive syllogisms with negative premises, which is not allowed by Aristotle in his Prior Analytics (I, 4.41b7-9). According to Boethius, this is possible because Plato in the Theaetetus (186e3-4) knew that sometimes a negative categorical proposition can be replaced with the corresponding affirmation with an indefinite predicate term. Boethius (in Int 2, 9, p. 317) cites Alexander of Aphrodisias as one of the ancient authors who dealt with syllogisms with indefinite premises, which is certain because Alexander, in his commentary on Aristotle’s Prior Analytics, quotes another syllogism of this sort (in An Pr 397, 5-14). Even Aristotle’s De caelo (269b29-31) has another example. However, this does not seem sufficient to believe that Boethius in his ISC was able to introduce a theory of syllogistic logic with indefinite premises. (On this point, compare I. M. Bochenski (1948), pp. 35-37; Thomas (1949), pp. 145-160; and Álvarez & Correia (2012), pp. 297-306. Compare also Correia (2001), pp. 161-174.)
4. Influence of the Treatises
DSC and ISC were transmitted together and never considered separately. There are no signs that either treatise was studied by medieval logicians and philosophers before the eleventh century (compare Van de Vyver, 1929, pp. 425-452).
The first text in which the influence of their teaching is clear is the twelfth-century anonymous Abbreviatio Montana. The other is the Dialectica of Peter Abelard. We know this not only because the name of Boethius is cited as the main source, but also because the division of propositions we have seen above is accepted and maintained both by Abelard and by the anonymous author of the Abbreviatio.
Later on, the authority of these treatises is more evident. In the thirteenth century, Peter of Spain’s Summulae logicales adopted the indirect moods of the first figure and the doctrine of the matters of proposition (which can be traced back in the history of logic as far as Alexander of Aphrodisias and Apuleius), and it follows Boethius in the idea, originally found in Aristotle, of reducing the imperfect moods of the second and third syllogistic figures to the first four perfect moods of the first figure.
5. His Sources
The Contra Eutychen is the most original work by Boethius. It is original in its speculative solution and in its methodology of using hypothetical and categorical logic in the analysis of terms, propositions, and arguments. The Consolation of Philosophy is also original, though many authors restrict its originality to its methodology and its arrangement of the elements, not to its content, which would represent the Neoplatonic school of Iamblichus, Syrianus, and Proclus. As to his inspiring figures, Boethius gives his most respectful words to Plato and Aristotle, but the figure of Pythagoras is also venerated in De Institutione musica (DIM I, 10-20).
As to his scientific writings, his mathematical and logical works are not original, and Boethius himself recognizes it. When dealing with these scientific matters, Boethius relies on specific Greek sources: in the mathematical disciplines, he follows the middle-Platonist Nicomachus of Gerasa (compare Bower 1978, p. 5). However, not everything comes from him (Barbera 1991, pp. 1-3 and 48-49). In his De Institutione musica (IV, 1-2), he follows, with some changes (Barbera, ibid., pp. 38-60), the Sectio Canonis attributed to Euclid; and, in developing book V and part of book IV, he uses Ptolemy’s Harmonics (compare DIM V, 4, 25; V, 5, 5; V, 8, 13; V, 11, 1; V, 14, 21; V, 18, 24 et al.; also Redondo Reyes, 2002, p. cxv).
As to Aristotelian logic, he acknowledges his agreement with the Peripatetic doctrines as reviewed by the Neoplatonist Porphyry (compare Boethius in Int 2, 24-27, p. 17; Meiser ed., 1877-1880), but it is also true that not everything comes from Porphyry, for Boethius also names Syrianus, Proclus’ master.
As to the sources of his logical works, though the question is far from resolved, there is a basic agreement in rejecting the hypothesis proposed by Pierre Courcelle (1969, p. 28) that they depend on the work of Ammonius Hermias in Alexandria. This same rebuttal undermines the widespread belief (also due to Courcelle) that Boethius studied Greek in Alexandria. Indeed, Courcelle followed Bidez (1923, pp. 189-201), who some years before had shown that Boethius’ logical commentaries (not the treatises) owed almost everything to Porphyry. But Courcelle (1969) made a valuable observation about this: Boethius also refers to Syrianus, the teacher of Proclus, who taught later than Porphyry. Accordingly, Courcelle proposed that the occurrence of post-Porphyrian authors was due to Boethius’ reliance on the school of Ammonius in Alexandria, as Boethius’ logical works were written between 500 and 524, and by this time the school of Athens had fallen into decline after the death of Proclus in 485. Alexandria, on the other hand, where Ammonius taught from this date, had flourished as the center of philological, philosophical, and medical studies. Courcelle showed several parallels in the texts, but these, as he also saw, implied only a common source. However, he proposed that, in a passage of the second commentary on Aristotle’s De Interpretatione (in Int 2, 9, p. 361), the corrupt phrase sicut audivimus docet should be emended as follows: sicut Ammonius docet. Courcelle knew that the absence of the name of Ammonius in Boethius’ writings was the main objection to his hypothesis, but this emendation made it very convincing. He therefore rejected the emendation that Meiser had made earlier, in 1880, in the critical edition of Boethius’ commentaries on De Interpretatione (compare Praefatio, iv). Indeed, before Courcelle, Meiser had proposed emending the phrase to read: sicut Eudemus docet.
Subsequent studies showed that the emendation of Meiser was correct because the doctrine in question was given by Eudemus.
The merit of Courcelle, however, was to place the problem of the sources of Boethius’ logical writings into the correct focus. That is why Shiel (1958, pp. 217-244) offered a new explanation of this status quaestionis: he proposed that Boethius drew all his material, whether pre- or post-Porphyrian, from a Greek codex of Aristotle’s Organon bearing glosses and marginal notes, from which he translated all the comments and explanations. This singular hypothesis has seduced many scholars and has even been generalized into Boethius’ general modus operandi. Shiel’s hypothesis is plausible in some respects when applied to the works on logic, but it faces many problems when applied to other kinds of writing. Many scholars have accepted the existence of this manuscript in Boethius’ hands on the basis of his verbatim allusions (for example, in Int 2, 20-3, p. 250), although not all have accepted Shiel’s conclusions, which deny Boethius all originality by presenting him merely as a mechanical translator of these Greek glosses. And even though Shiel always referred to Boethius’ works on logic, it is easy to generalize the servile attitude in his scientific material to his other works; yet the poems, the philosophical synthesis of the Consolation, and the logical analysis of the Contra Eutychen have no parallel in earlier sources and are by themselves evidence of a lucid thinker.
According to Shiel (1990), Boethius’ logic comes from a copy of the commentary of Porphyry that was used in the school of Proclus in Athens. This copy was a codex containing Aristotle’s Organon with margins heavily annotated with comments and explanations. Magee has shown the difficulty of accepting the existence of this kind of codex before the ninth century AD (Magee, 1998, Introduction). On the other hand, some scholars find that Shiel’s hypothesis does not accurately apply to all the logical writings of Boethius, as Stump (1974, pp. 77-93) has argued in her defense of the commentaries on the Topics. Moreover, the absence of Proclus’ name in Boethius’ works on logic, even though Proclus made important contributions to logic, as in the case of the Canon of Proclus (compare Correia, 2002, pp. 71-84), raises new doubts about the accuracy of the formula given by Shiel.
6. References and Further Reading
Ackrill, J.L. 1963. Aristotle’s Categories and De Interpretatione. Translation with notes. Oxford: Oxford University Press.
Alvarez, E. & Correia, M. 2012. “Syllogistic with indefinite terms”. History and Philosophy of Logic, 33, 4, pp. 297-306.
Anderson, P. 1990. Transiciones de la antigüedad al feudalismo, Madrid: Siglo XXI.
Barbera, A. 1991. The Euclidean Division of the Canon. Greek and Latin Sources. Lincoln: The University of Nebraska Press, pp. 1-3 and 48-49.
Bidez, J. 1923. “Boèce et Porphyre”, in Revue Belge de Philologie et d’Histoire, 2, pp. 189-201.
Bidez, J. 1964. Vie de Porphyre. Le philosophe néoplatonicien, Hildesheim: G. Olms.
Bidez, J. 1984. “Boethius und Porphyrios”, in M. Fuhrmann and J. Gruber (eds.), Boethius. Wege der Forschung, Band 483, Darmstadt, pp. 133-145.
Bobzien, S. 2002. “The development of Modus Ponens in Antiquity: from Aristotle to the 2nd century AD”. Phronesis, vol. 47, 4, pp. 359-394.
Bobzien, S. 2002a. “A Greek Parallel to Boethius’ De Hypotheticis Syllogismis”, Mnemosyne 55, pp. 285-300.
Bochenski, I.M. 1947. La logique de Théophraste, 2nd ed., Fribourg: Libraire de L’Université.
Bochenski, I.M. 1948. “On the categorical syllogism”, in Dominican Studies, vol. I, 1, pp. 35-37.
Bower, C. 1978. “Boethius and Nicomachus: An essay concerning the sources of De Institutione Musica”, Vivarium, 6, 1, pp. 1-45.
Brandt, S. 1903. “Entstehungszeit und zeitliche Folge der Werke von Boethius”, in Philologus, 62, pp. 234-275.
Cameron, A. 1981. “Boethius’ father’s name”, in Zeitschrift für Papyrologie und Epigraphik, 44, pp. 181-183.
Chadwick, H. 1981. “Introduction”, in Gibson (1981), pp. 1-12.
Correia, M. 2002a. “Libertad humana y presciencia divina en Boecio”, Teología y Vida, XLIII, pp. 175-186.
Correia, M. 2001. “Boethius on syllogisms with negative premisses”, in Ancient Philosophy 21, pp. 161-174.
Correia, M. 2009. “The syllogistic theory of Boethius”, in Ancient Philosophy 29, pp. 391-405.
Correia, M. 2002. “El Canon de Proclo y la idea de lógica en Aristóteles”. Méthexis 15, pp. 71-84.
Courcelle, P. 1967. La Consolation de Philosophie dans la tradition littéraire. Antécédent et Postérité de Boèce. Etudes Augustiniennes. Paris: Editions du Centre National de la Recherche Scientifique. C.N.R.S.
Courcelle, P. 1969. Late Latin Writers and their Sources, Cambridge, Massachusetts: Harvard University Press (see: Les Lettres Grecques en Occident de Macrobe à Cassiodore, 2nd ed., Paris, 1948).
De Rijk, L. 1964. “On the chronology of Boethius’ works on logic (I and II)”, in Vivarium, vol. 2, parts 1 & 2, pp. 1-49 and 122-162.
Devereux, D. & Pellegrin, P. (eds.) 1990. Biologie, logique et métaphysique chez Aristote. Paris: Editions du Centre National de la Recherche Scientifique (C.N.R.S.).
Dürr, K. 1951. The propositional logic of Boethius. Amsterdam: North Holland Publishing. (Reprinted in 1980 by Greenwood Press, USA).
Friedlein, G. 1867. Anicii Manlii Torquati Severini Boetii De Institutione Arithmetica libri duo. De Institutione Musica libri quinque. Accedit Geometria quae fertur Boetii. G. Friedlein (ed.). Leipzig: Teubner.
Fuhrmann, M. & Gruber, J. (eds.) 1984. Boethius. Wege der Forschung, Band 483, Darmstadt.
Gibson, M. (ed.) 1981. Boethius, His Life, Thought and Influence. Oxford: Blackwell.
Isaac, I. 1953. Le Peri Hermeneias en Occident de Boèce à Saint Thomas. Histoire littéraire d’un traité d’Aristote, Paris.
Kaylor, N.H. & Phillips, P.E. 2012. A Companion to Boethius in the Middle Ages. Kaylor, N.H. & Phillips, P.E (eds.). Leiden/Boston : Brill.
Kretzmann, N. 1982. “Syncategoremata, exponibilia, sophismata”. The Cambridge History of Later Medieval Philosophy, pp. 211-214. Cambridge: Cambridge University Press.
Lloyd, G.E.R. 1990. “Aristotle’s Zoology and his Metaphysics: the status quaestionis. A critical review of some recent theories”, in Devereux & Pellegrin (1990), pp. 8-35.
Lukasiewicz, J. 1951. Aristotle’s Syllogistic. Oxford: Oxford University Press.
Magee, J. 1998. Anicii Manlii Severini Boethii De divisione liber. Critical edition, translation, prolegomena and commentary. Leiden/Boston/Koln: Brill.
Mair, J. 1981. “The text of the Opuscula Sacra”, pp. 206-213. In Gibson, M. (1981).
Marenbon, J. 2003. Boethius. Oxford: Oxford University Press.
Marenbon, J. 2009. The Cambridge Companion to Boethius. Cambridge: Cambridge University Press.
Maroth, M. 1979. “Die hypothetischen Syllogismen”, in Acta Antiqua 27, pp. 407-436.
Martindale, J. R. 1980. The Prosopography of the Later Roman Empire: A.D. 395-527. Cambridge: Cambridge University Press.
Matthews, J. 1981. In M. Gibson (ed.), Boethius. His Life, Thought and Influence, Oxford.
McKinlay, A. P. 1907. “Stylistic tests and the chronology of the works of Boethius”, in Harvard Studies in Classical Philology, XVIII, pp. 123-156.
McKinlay, A.P. 1938. “The De syllogismis categoricis and Introductio ad syllogismos categoricos of Boethius”, in Classical and Mediaeval Studies in Honor of E. K. Rand, pp. 209-219.
Meiser, C. 1877-1880. Anicii Manlii Severini Boetii Commentarii in Librum Aristotelis PERI ERMHNEIAS. Prima et secunda editio. C. Meiser (ed.), Leipzig.
Migne, J.-P. 1891. De Syllogismo Categorico, in Patrologia Latina, 64, vol. 2, J.-P. Migne (ed.), Paris.
Migne, J.-P. 1891. Introductio ad Syllogismos Categoricos, in Patrologia Latina, 64, vol. 2, J.-P. Migne (ed.), Paris.
Minio-Paluello, L. 1972. Opuscula. The Latin Aristotle, Amsterdam: A. Hakkert.
Minio-Paluello, L. 1965. Aristoteles Latinus, II, 1-2, L. Minio-Paluello (ed.), Paris: Desclée de Brouwer.
Rand, Stewart & Tester. 1990. Boethius, The Theological Tractates and The Consolation of Philosophy. Translated by H.F. Stewart, E.K. Rand and S.J. Tester. Cambridge, Massachusetts/London: The Loeb Classical Library, Harvard University Press.
Redondo Reyes, P. 2002. La Harmónica de Claudio Ptolomeo: edición crítica, introducción, traducción y comentario. PhD thesis, Murcia, Spain.
Sharples, R. 1992. Theophrastus of Eresus. Sources for his Life, Writings, Thought and Influence, vols. i-iii, W.W. Fortenbaugh, P.M. Huby, R.W. Sharples, D. Gutas (Eds.), together with A.D. Barker, J.J. Keaney, D.C. Mirhady, D. Sedley and M.G. Sollenberger. Leiden: Brill.
Shiel, J. 1990. “Boethius’ Commentaries on Aristotle”, in Sorabji (1990), pp. 349-372 (also: Medieval and Renaissance Studies 4, 1958, pp. 217-244).
Sorabji, R. 1990. Aristotle Transformed. The Ancient Commentators and their Influence. Sorabji, R. (ed.). London: Duckworth.
Spade, P.V. 1982. “The semantics of terms”, in The Cambridge History of Later Medieval Philosophy, Cambridge: Cambridge University Press, pp. 190-1.
Speca, A. 2001. Hypothetical syllogistic & Stoic logic. Leiden/Boston/Köln: Brill.
Stump, E. 1974. “Boethius’ Works on Topics”, in Vivarium, 12, 2, pp. 77-93.
Sullivan, M.W. 1967. Apuleian Logic. The Nature, Sources, and Influence of Apuleius’s Peri Hermeneias, in Studies in Logic and the Foundations of Mathematics, Amsterdam: North-Holland.
Usener, H. 1877. Anecdoton Holderi: ein beitrag zur Geschichte Roms in ostgotischer Zeit. Leipzig: Teubner.
Thomas, P. 1908. Apulei Opera quae Supersunt, vol. iii, Apulei Platonici Madaurensis De Philosophia Libri, liber PERI ERMHNEIAS, P. Thomas (ed.), pp. 176-194, Leipzig: Teubner.
Thomas, I., O.P. 1949. “CS(n): An Extension of CS”, in Dominican Studies, pp. 145-160.
Thomsen Thörnqvist, C. 2008a. Anicii Manlii Seuerini Boethii De syllogismo categorico. A critical edition with introduction, translation, notes and indexes. Studia Graeca et Latina Gothoburgensia LXVIII, University of Gothenburg: Acta Universitatis Gothoburgensis.
Thomsen Thörnqvist, C. 2008b. Anicii Manlii Severini Boethii Introductio ad syllogismos categoricos. A critical edition with introduction, commentary and indexes. Studia Graeca et Latina Gothoburgensia LXIX, University of Gothenburg: Acta Universitatis Gothoburgensis.
Van de Vyver, A. 1929. “Les étapes du développement philosophique du haut Moyen-Âge”, Revue Belge de Philologie et d’Histoire, viii, pp. 425-452. Brussels: Société pour Le Progrès des Études Philosophiques et Historiques.
Wallies, M. 1883. Alexandri in Aristotelis Analyticorum Priorum Librum I Commentarium, M. Wallies (ed.), in Commentaria in Aristotelem Graeca, vol. 2.1, Berlin: G. Reimerum.
Author Information
Manuel Correia
Email: mcorreia@uc.cl
Pontifical Catholic University of Chile
Chile
Enactivism
The term ‘enaction’ was first introduced in The Embodied Mind, co-authored by Varela, Thompson, and Rosch and published in 1991. That seminal work provides the first original contemporary formulation of enactivism. Its authors define cognition as enaction, which they in turn characterize as the ‘bringing forth’ of domains of significance through organismic activity that has been itself conditioned by a history of interactions between an organism and its environment.
To understand mentality, however complex and sophisticated it may be, it is necessary to appreciate how living beings dynamically interact with their environments. From an enactivist perspective, there is no prospect of understanding minds without reference to such interactions because interactions are taken to lie at the heart of mentality in all of its varied forms.
Since 1991, enactivism has attracted interest and attention from academics and practitioners in many fields, and it is a well-established framework for thinking about and investigating mind and cognition. It has been articulated into several recognizably distinct varieties distinguished by their specific commitments. Some versions of enactivism, such as those put forward by Thompson and Di Paolo and others, focus on expanding and developing the core ideas of the original formulation of enactivism advanced by Varela, Thompson, and Rosch. Other versions of enactivism, such as sensorimotor knowledge enactivism and radical enactivism, incorporate other ideas and influences in their articulation of enactivism, sometimes leaving aside and sometimes challenging the core assumptions of the original version of enactivism.
What unifies different articulations of enactivism is that, at their core, they all look to living systems to understand minds, and they conceive of cognition as embodied activity. In enactivist terms, perceiving, imagining, remembering, and even the most abstract forms of thinking are to be understood, first and foremost, as organismic activities that dynamically unfold across time and space.
Enactivists conceive of the embodied cognitive activity that they take to constitute cognition as fundamentally interactive in at least two ways. First, the manner and style of any given bout of cognitive activity are conditioned by the cognizer’s prior history of engagement with environments and the particularities of the current environment with which they are actively engaged. Second, cognizers shape their environments and are, in turn, shaped by them in a variety of ways across multiple timescales.
A cornerstone commitment of enactivism is that minds arise and take shape through the precarious self-creating, self-sustaining, adaptive activities of living creatures as they regulate themselves by interacting with features of their environments. To take a central case, an organism’s characteristic patterns of sensorimotor interaction are deemed to be shaped by its prior history of active engagement with aspects of its environment. Its past engagements reinforce and tend to perpetuate its sensorimotor habits and tendencies. Yet organisms are not wholly creatures of past habits. Living beings always remain flexibly open to adjusting their repertoires and ways of doing things in novel ways. Cognition, which takes the form of patterns of open-ended, flexible, extended spatio-temporal activity, is thus deemed ‘autonomous’ in the sense that it unfolds in ways that are viable for sustaining itself and that are not externally regulated or pre-programmed.
Enactivists regard an organism’s environment as a domain of significance populated with items of relevance, not as a neutral setting that can be adequately characterized in, say, purely physical terms. Importantly, in this regard, organisms are said to ‘enact’ or ‘bring forth’ their ‘worlds’. Organisms not only adapt to and are shaped by their environments; they also dynamically fashion, curate, and transform them. Through such activity and exchanges, both organisms and their environments are transformed and, in an important sense, brought into being. Enactivists often explicate this unprescribed bi-directional influence of organisms on their environments and vice versa, poetically, using the metaphor of “laying down a path in walking”.
Another signature enactivist idea is that qualitative, phenomenal aspects of lived experience—what it is like to experience something—are an achievement of organismic activity. To take a central case, perceptual experience arises and takes shape through an organism’s active exploration of aspects of its environment. It is through such engaged efforts and the specific ways they are carried out that organisms experience the world in particular ways. Accordingly, organismic activities of certain kinds are required to achieve phenomenal access to aspects of the world or for things to ‘show up’ or “to be present” phenomenally.
Minds, conceived in enactivist terms, operate in ways that are fundamentally unlike those of mechanisms that are driven entirely by externally sourced programs and algorithms. Enactivism thus sees itself as directly opposing the views of cognition that understand it as essentially computational and representational in nature. In its original formulation, enactivism strongly rejects the idea that minds are in the business of collecting, transforming, and representing information sourced from a pre-given world that is assumed to exist independently of and prior to organisms. Strikingly, to conceive of cognition in line with the original version of enactivism entails holding that when organisms actively engage with aspects of their worlds, they always do so in mentality-constituting ways. Yet, enactivists hold that such cognitive activity neither involves constructing representations of those worlds based on retrieved information nor does it depend on any kind of computational processing. So conceived, enactivism rejects the longstanding idea that the core business of cognition is to represent and compute, and, concomitantly, it rejects the familiar explanatory strategies of orthodox cognitive science.
Enactivism is a significant philosophical enterprise because, at least under standard interpretations, it offers a foundational challenge to cognitivist accounts of mind—those that conceive of mentality in representational and computational terms. Enactivists regard such conceptions of mind, which dominate much mainstream analytic philosophy and cognitive science, not only as resting on a mistaken theoretical foundation but as presenting a tempting picture of mentality that, practically, subverts efforts to develop a healthier and more accurate understanding of ourselves and our place in nature.
2. Contemporary Varieties of Enactivism
There are several, and importantly, different versions of enactivism occupying the contemporary philosophical landscape.
a. Original Enactivism
The Embodied Mind by Varela, Thompson, and Rosch, published in 1991, is the locus classicus of enactivism. That landmark work is variously described as initially formulating and advancing the most influential statement of enactivism in recent times. It is credited with being “the first and among the most profound” of the many and various enactivist offerings that have followed in its wake (Kabat-Zinn 2016, p. xiii).
Enactivism, as originally formulated, is not a neatly defined or finished theory. It is variously described in the literature as a broad, emerging ‘perspective’, ‘approach’, ‘paradigm’, or ‘framework’ for understanding mind and cognition (see, for instance, Varela, Thompson and Rosch 1991; Baerveldt and Verheggen 2012; Stewart and others 2010; Gallagher 2017). Enactivism is not a finished product; it continues to evolve as new versions of enactivism emerge which adjust, add to, or reject certain core and peripheral commitments of the original version.
Though the original version of enactivism resists definition in terms of a set of central theses, it does have distinctive features. There are three key and recurring themes emphasized in the original statement of enactivism. The first theme is that understanding organismic biological autonomy is the key to understanding minds. Original enactivism assumes that there is deep continuity between life and mind, such that understanding the biological autonomy of organisms sheds direct light on cognition. The second theme is that minds cannot be understood without coming to terms with subjective, lived experience, and consciousness. The third theme is that non-Western traditions, and in particular, Buddhist philosophy and its practices of meditation and mindfulness, should play a significant role in reforming and rethinking the future sciences of the mind, both theoretically and practically.
The original version of enactivism put forward in The Embodied Mind has been successively developed and expanded upon in later works, mainly by Thompson, Di Paolo, and their co-authors (principally Thompson 2007, Froese and Di Paolo 2011, McGann and others 2013, Di Paolo and others 2017, Di Paolo and others 2018, Di Paolo 2018, 2021). Some speak of these works, collectively, as constituting and contributing to a variety of autopoietic enactivism (Hutto and Myin 2013, 2017, Ward and others 2017, Stapleton 2022). The label, which now has some purchase, was chosen because the original version of enactivism and those versions that seek directly to expand on it are united in looking to biological autonomy to understand the fundamentals of mindedness. Crucially, all enactivists of this stripe embrace the notion of autopoiesis—the self-creating and self-sustaining activity of living systems—as a common theoretical starting point, having been inspired by “the work of the biologists Humberto Maturana and Francisco Varela” (Baerveldt and Verheggen 2012, p. 165; see Maturana and Varela 1980, 1987). Nevertheless, the label autopoietic enactivism is contested (see, for example, Thompson 2018, Netland 2022). It is thought to be misleading because, although these enactivists build upon the work of Varela and Maturana, they have added significant resources, expanding upon and modifying the initial conception of autopoiesis in their efforts to explicate key aspects of biological autonomy, namely, recognizing its teleological character (see, for instance, Thompson 2007, p. 127; Di Paolo 2009, p. 12; Di Paolo and others 2018, p. 37). As such, enactivists working on these topics deem autopoiesis, as originally conceived, to be, at most, necessary but insufficient for important world-involving forms of cognition (see Thompson 2007, pp. 149-150; see also p. 127). For these reasons, Barandiaran (2017) recommends the label autonomist enactivism instead.
However, given these nuances, it may be safer and more accurate to speak of these positions simply as variants of original enactivism.
The primary aim of the original version of enactivism was to address the problem of understanding how lived experience fits into the world as described by science, including cognitive science. On the face of it, the two appear irreconcilably different from one another. Thompson (2016) puts the apparent dilemma that motivated the first formulation of enactivism in terms of a hard choice: either “accept what science seems to be telling us and deny our experience… or hold fast to our experience and deny science” (p. xix).
The original version of enactivism was born from the aspiration of finding a way for cognitive science to give appropriate attention to lived experience. One of its key assumptions is that “we cannot begin to address… [the gap between science and experience] without relying on some kind of phenomenology, that is, on some kind of descriptive account of our experience in the everyday world” (Thompson 2016, p. xx).
Enactivism rejects mainstream conceptions of mind that strongly demarcate minds from bodies and environments. It holds that such conceptions are not justified and should be rethought. Enactivism aims to eradicate misleading dualisms that continue to dominate analytic philosophy of mind and much cognitive science. It aims to dissolve the mind-body problem by asking us to abandon our attachment to traditional dichotomies and to come to see that minds are not ultimately separate from bodies, environments, or others.
Original enactivism seeks to put the mind-body problem to rest once and for all. It also rejects the traditional input-output processing model of the mind, a model which pays homage, often explicitly, to the idea that minds are furnished by the senses by accepting that the senses supply minds with information about the external world. Original enactivism rejects this familiar characterization of mental activity, denying that minds ever pick up or process information from the environment. Concomitantly, original enactivism rejects the idea that minds are fundamentally information processing systems that manipulate informational content by categorizing, conceptualizing, and schematizing it by means of representational-computational processes. By also pressing us to radically rethink key notions—self, nature, and science—original enactivism aims to usher in “a new kind of cognitive science” (Rosch 2016, p. xxxv). So conceived, enactivism seeks to revolutionize and massively reform the sciences of the mind.
Embracing original enactivism entails rethinking foundational mainstream theoretical assumptions that are prevalent in much analytic philosophy of mind and cognitive science. Importantly, in this vein, original enactivists advocate not only for changes to our theoretical mindset but also for changes in existing practices and approaches we use in the cognitive sciences and cognate domains that study and engage with minds. Thus, the original version of enactivism envisions that future sciences of the mind will recognize and work with “another mode of knowing not based on an observer and observed” (Rosch 2016, p. x). Original enactivism, thus, issues a normative demand to create a space in which those working to understand and expand our lived experience can speak to and understand empirically focused scientists of the mind. In such a setting, there would be a dynamic and interactive ‘circulation’ and cross-fertilization of theory and practice (Thompson 2016, 2017).
This is the sense in which original enactivism seeks to provide “a framework for a far-reaching renewal of cognitive science as a whole” (Stewart, Gapenne, and Di Paolo 2010, p. viii).
It is an open question just how much of the ambition of original enactivism has been achieved, but it is undeniable that much has changed in the fields of philosophy and the sciences of the mind since its debut. Thompson (2016) summarizes the current state of the art.
The idea that there is a deep continuity in the principles of self-organization from the simplest living things to more complex cognitive beings — an idea central to Varela’s earlier work with neurobiologist Humberto Maturana — is now a mainstay of theoretical biology. Subjective experience and consciousness, once taboo subjects for cognitive science, are now important research topics, especially in cognitive neuroscience. Phenomenology now plays an active role in the philosophy of mind and experimental cognitive science. Meditation and mindfulness practices are increasingly used in clinical contexts and are a growing subject of investigation in behavioral psychology and cognitive neuroscience. And Buddhist philosophy is increasingly recognized as an important interlocutor in contemporary philosophy (p. xix).
Notably, there have been efforts to transform the way the science of intersubjectivity is itself conducted by getting researchers to participate, at once, both as subjects and objects of research. Details of this method, called PRISMA, are set out in De Jaegher and others (2017). Thompson (2017) praises this work for being “clearly animated by the full meaning of enaction as requiring not just a change in how we think but also in how we experience” (p. 43). For a related discussion of how cognitive science practice might change by giving due attention to dynamically evolving experience, see McGann (2022).
i. Biological Autonomy
All living systems—from simple cells to whole organisms, whether the latter are single-celled bacteria or human beings—actively individuate themselves from other aspects of their environments and maintain themselves by engaging in a constant “dynamical exchange of energy and matter that keeps the inside conditions just right for life to perpetuate itself” (Kabat-Zinn 2016, p. xiv). This is all part of the great game of life: staying far enough away from entropy, that is, thermodynamic equilibrium, to survive.
Enactivists emphasize the autopoietic character—the self-creating and self-individuating results—of the activity through which living systems adaptively produce and maintain vital boundaries and relationships between themselves and what lies beyond them (Varela and others, 1991; Thompson, 2007). Accordingly, “organisms actively and continuously produce a distinction between themselves and their environment where none existed before they appeared and where none will remain after they are gone” (Di Paolo and others 2018, p. 23).
What determines the boundaries of a given organism? Where does a given organism end and the environment begin? Enactivists seek to answer such questions by pointing to the fact that living systems are organizationally and operationally closed, which is to say that they are “constituted as a network of interdependent processes, where the behavior of the whole emerges from the interaction dynamics of its component parts” (Barandiaran 2017, p. 411, see also Di Paolo and Thompson 2014, Di Paolo and others 2018; Kirchhoff 2018a).
The basic idea of operational closure is that self-defining autopoietic processes can be picked out by the fact that they exist in mutually enabling networks of circular means-end activities, such that “all of the processes that make up the system are enabled by other processes in the system” (Di Paolo and others 2018, p. 25). Operational closure is evident in the self-sustaining autonomous activity of, for example, metabolic networks in living systems. External influences—such as, say, the effects of sunlight being absorbed by chlorophyll—are any influences that are not mutually enabled or produced by processes within such a closed system.
The exact boundaries of a self-producing, self-individuating living system can be flexible. In this regard, Di Paolo and others (2018) cite the capacity of some insects and spiders to breathe underwater for certain periods of time. They manage to do this by trapping air bubbles in the hair on their abdomens. In such cases, these environmental features become part of the self-individuating enabling conditions of the organism’s operationally closed network: “These bubbles function like external gills as the partial pressure of oxygen within the bubble, diminished by respiration, equilibrates with that of the water as the external oxygen flows in” (Di Paolo and others 2018, p. 28, see also Turner 2000).
When we consider concrete cases, it is evident that autopoietic processes of self-production and self-distinction require living systems to continuously adjust to features of their environment. This involves the “selective opening and selective rejection of material flows—in other words, an adaptive regulation of what goes in and what stays out” (Di Paolo and others 2018, p. 40).
Adaptive regulation requires flexibility. It requires simultaneous adjustments at multiple timescales and various levels, where each adjustment must be responsive to particular speeds and rhythms at the scale required to meet specific thresholds. This is why the business of being and staying alive is necessarily complex, forever unfinished, precarious, and restless (Di Paolo and others, 2017; 2018). Though there is room for error, minimally, organisms that survive and propagate must actively avoid engaging in behaviors that are overly maladaptive.
Enactivists hold that such adaptive activity is autonomous. Living systems establish their own unprescribed norms of operation—norms that arise naturally from the activity of staying alive and far from entropy. It is because organisms generate their own norms through their activities that enactivists speak of them as having an immanent teleology (Thompson 2007, Di Paolo and others 2018).
It is claimed that this notion of autonomy is the very core of enactivism (Thompson 2007, Barandiaran 2017, p. 409; Di Paolo and others, 2018, p. 23). It is regarded as a notion that, strictly speaking, goes “beyond autopoiesis” (Di Paolo and others 2018, p. 25).
Enactivists contend that the fact that living systems are autonomous in the precise sense just defined is what distinguishes them from wholly lifeless, heteronomous machines of the sort that are driven only by external, exogenous instructions. A core idea of enactivism is that “the living body is a self-organizing system”. To think of living bodies in this way “contrasts with viewing it as a machine that happens to be made of meat rather than silicon” (Rosch 2016, p. xxviii). In line with this understanding, enactivists hold that organismic processes “operate and self-organize historically rather than function” (Di Paolo and others 2018, p. 20). It is precisely because organisms must always be ready to adjust to new possibilities and circumstances that the self-organizing activity of living systems cannot be governed by instructions in a functionally pre-specified manner (see Barandiaran 2017, p. 411).
Enactivists hold that autonomous norm generation is a feature of all modes and styles of cognitive activity, not just of basic organismic self-production, self-individuation, and self-maintenance. Di Paolo and others (2018), for example, identify two important dimensions of autonomous self-regulation beyond the basic cycles of regulation that sustain living organisms: cycles of sensorimotor interaction involved in action, perception, and emotion, and cycles of intersubjectivity involved in social engagements with others (Di Paolo and others 2018, p. 22).
ii. Bringing Forth Domains of Significance
Connected with their understanding of biological autonomy, enactivists reject the idea that organisms simply adapt to features of a pre-existing, neutrally characterized physical world. Instead, they hold that organisms are attuned to features of environments or domains that are significant to them—environments that organisms themselves bring into being. It is on this basis that enactivists “conceive of mental life as the ongoing meaningful engagement between precariously constituted embodied agents and the worlds of significance they bring forth in their self-asserting activity” (Di Paolo and others 2018, p. 20). Hence, a central thesis of enactivism is that “cognition is not the grasping of an independent, outside world by a separate mind or self, but instead the bringing forth or enacting of a dependent world of relevance in and through embodied action” (Thompson 2016, p. xviii).
In this view, organisms and environments dynamically co-emerge. The autonomous adaptive activity of organisms “brings forth, in the same stroke, what counts as other, the organism’s world” (Thompson 2007, p. 153). The pre-existing world, as characterized by physics and chemistry, is not equivalent to an organism’s environment. The latter, which is effectively captured by von Uexküll’s (1957) notion of an Umwelt, is a subset of the physicochemical world that is relevant to the organism in question. This environmental domain of significance or relevance, enactivists hold, is brought into being through the activity of organisms themselves.
For example, sucrose serves as food for a bacterium only because it has certain physical and chemical properties. Yet without organisms that use it as a nutrient, sucrose, understood merely as something that exists as part of the physicochemical world, is not food. Hence, that it is food for bacteria depends not only, or even primarily, on the physicochemical properties of sucrose itself but chiefly on the existence and properties of bacteria—properties connected to the metabolic needs and processes that bacteria themselves bring into being. Although, taking the stance of scientists, we can and do speak of aspects of an organism’s environment in the language of physics and chemistry, describing them in organism-neutral terms, it is only by recognizing the significance that such worldly features have for the organism that we are able to pick out the aspects of the world that are relevant or important to it.
On the face of it, to suggest that organisms ‘bring forth’ or ‘enact’ their own environments may appear to be an extravagant thesis. Yet it finds support in the seminal work of biologists, principally Gould and Lewontin (1979), who question accounts of Darwinian adaptationism in two key respects. They reject construing natural selection as an external evolutionary force that separately targets and optimizes individuated organismic traits. They also reject the idea that natural selection fashions organisms to better compete against one another for the resources of the pre-existing physical world (for further details, see Godfrey-Smith 2001). In place of strong adaptationism, original enactivists propose to understand evolution in terms of natural drift, seeing it as a holistic, “ongoing process of satisfaction that triggers (but does not specify) change in the form of viable trajectories” (see a full summary in Varela and others 1991, pp. 196-197 and also Maturana and Mpodozis 2000).
A major focus of the critique of adaptationism is the rejection of the idea that a living creature’s environment is an external, “preexistent element of nature formed by autonomous forces, as a kind of theatrical stage on which the organisms play out their lives” (Lewontin and Levins 1997, p. 96, Lewontin 2000).
Putting pressure on the idea that organisms simply adapt to a neutrally characterized external world, Lewontin and Levins (1997) observe that not all worldly forces affect every organism equally. Some forces greatly affect certain organisms while mattering to other creatures hardly at all. The all-pervasive force of gravity provides a shining example. All middle-sized plants and animals must contend with it. Not only does gravity affect the musculoskeletal, respiratory, and circulatory systems of such organisms, it also affects their individual biological cells. Gravity influences cell size and processes such as mechanotransduction—processes by which cells electrochemically respond, at micro-timescales, to mechanical features and forces in the environment. Hence, even at the microlevel, gravity matters for such cognitively important activities as hearing, proprioception, touch, and balance. Other organisms, however, due to their size, must contend with, and are shaped in their activities by, other forces. For microorganisms, it is Brownian motion, not gravity, that matters most to their lives. It is reported that some microbes can survive the hypergravity of extraterrestrial, cosmic environments, which exert a gravitational force up to 400,000 times greater than that found on Earth (Deguchi and others 2011). This is one reason why bacteria “are ubiquitous, present in nearly every environment from the abyssal zone to the stratosphere at heights up to 60 km, from arctic ice to boiling volcanoes” (Sharma and Curtis 2022, p. 1).
These reminders support the enactivist claim that the relationship between organism and environment is dialectical —that the one cannot exist without the other. Maintaining that organisms and their environments reciprocally codetermine one another, defenders of this view of biological development hold that:
Environments are as much the product of organisms as organisms are of environments. There is no organism without an environment, but there is no environment without an organism. There is a physical world outside of organisms, and that world undergoes certain transformations that are autonomous. Volcanoes erupt, and the earth precesses on its axis of rotation. But the physical world is not an environment; only the circumstances from which environments can be made are (Lewontin and Levins 1997, p. 96).
Moreover, the relationship between organisms and their environments is not static; it coevolves dynamically over time: “As the species evolves in response to natural selection in its current environment, the world that it constructs around itself is actively changed” (Thompson 2007, p. 150). Lewontin and Levins (1997) provide a range of examples of how organisms relate to and actively construct their environments. These include organisms regulating ambient temperatures through the metabolic production of shells of warm, moist air around themselves and plant roots producing humic acids that alter the physiochemical structure of soil to help them absorb nutrients.
Looking to these foundations, Rolla and Figueiredo (2021) further explicate the evolutionary dynamics by which organisms can be said to, literally, bring forth their worlds. Drawing on the notion of niche construction, theirs is an effort to show that “enactivism is compatible with the idea of an independent reality without committing to the claim that organisms have cognitive access to a world composed of properties specified prior to any cognitive activity”. For more on the notion of niche construction, and why it is thought to be needed, see Laland and others (2014), Laland and others (2016), and Werner (2020).
iii. Phenomenological Connections
In line with its overarching aim, original enactivism seeks to give an account of “situated meaningful action that remains connected both to biology and to the hermeneutic and phenomenological studies of experience” (Baerveldt and Verheggen 2012, p. 165; see also Stapleton and Froese 2016, Netland 2022).
It owes a great deal to the European tradition of phenomenology in that its account of virtual milieus and vital norms is inspired by Merleau-Ponty’s The Structure of Behaviour and, especially, his notion of “the lived body” (Kabat-Zinn 2016, p. xiii). Virtual milieus and their properties are not something found ‘objectively’ in the world; rather, they are enacted or brought forth by organisms. Organisms not only enact their environments—in the sense that sucrose might become food for certain creatures—they also enact their qualitative, felt experiences of the world. In this vein, enactivists advance the view that “our perceived world… is constituted through complex and delicate patterns of sensorimotor activity” (Varela and others 1991, p. 164).
By appealing to arguments from biology, enactivists defend the view that organisms and their environments are bound together in ways that make it impossible to characterize one without reference to the other when it comes to understanding mental life. They apply the same thinking to qualitative, phenomenally conscious aspects of mind, holding, for example, that “we will not be able to explain colour if we seek to locate it in a world independent of our perceptual capacities” (Varela and others 1991, p. 164). This is not meant to be a rejection of mind-independent realism in favor of mind-dependent idealism. Defenders of the original version of enactivism offer this proposal explicitly as providing a ‘middle way’ between these familiar options. By their lights, “colors are not ‘out there’ independent of our perceptual and cognitive capacities…[but equally] colors are not ‘in here’ independent of our surrounding biological and cultural world” (Varela and others 1991, p. 172).
For enactivists, colors cannot be understood independently of the very particular ways that experiencing beings respond to specific kinds of worldly offerings. Accordingly, it is not possible to think about the nature of colors qua colors without also referencing those ways of interacting with environmental offerings. This claim rests on two assumptions. First, the way colors appear to organisms —the way they experience them —is essential to understanding the nature of colors as such. Second, such experiential properties are inescapably bound up with organismic ways of responding to aspects of their environments.
Importantly, though enactivists deny that colors are objective properties of the world independent of organisms that perceive them, they neither claim nor imply that colors are wholly mind-dependent properties in the sense associated with classical Berkeleyan idealism as it is standardly portrayed.
Furthermore, it is precisely because enactivists hold that an organism’s ways of responding to aspects of its environment are not inherently representational, or representationally mediated, that “color provides a paradigm of a cognitive domain that is neither pregiven nor represented but rather experiential and enacted” (Varela and others 1991, p. 171). This conclusion is meant to generalize, applying to all phenomenological structures and aspects of what is brought forth by organisms as domains of significance through their autonomous activity.
In this regard, in its original formulation, enactivism drew on “significant resources in the phenomenological tradition for rethinking the mind” (Gallagher 2017, p. 5). Apart from explicitly borrowing from Merleau-Ponty, Varela and others (1991) also aligned their project with other classic thinkers of the phenomenological tradition, such as Husserl and Sartre, to some extent.
For example, although the enactivists wished to steer clear of what Hubert Dreyfus interpreted as Husserl’s representationalist leanings, they acknowledge the prime importance of his efforts to “develop a specific procedure for examining the structure of intentionality, which [for him] was the structure of experience itself” (Varela and others 1991, p. 16). For this reason, and by contrast, they explicitly oppose and criticize the cognitivist conviction that there is “a fundamental distinction between consciousness and intentionality” (p. 56). By their lights, drawing such a distinction creates a mind-mind problem and disunifies our understanding of the cognizing subject.
Nevertheless, despite borrowing in key respects from the Western phenomenological tradition, when formulating their initial statement of enactivism, Varela and others (1991) also criticized that tradition for, allegedly, being overly theoretical in its preoccupations. According to their assessment at the time, phenomenology “had gotten bogged down in abstract, theoretical reflection and had lost touch with its original inspiration to examine lived experience in a rigorous way” (Thompson 2016, p. xx-xxi). This critical take on phenomenology motivated the original enactivists to “turn to the tradition of Buddhist philosophy and mindfulness-awareness meditation as a more promising phenomenological partner for cognitive science” (Thompson 2007, p. 413).
In time, Thompson and Varela themselves, in their analysis of the specious present and in their work with Natalie Depraz, came to revise original enactivism’s negative verdict concerning phenomenology’s limitations. In his later writings, Thompson admits that the authors of The Embodied Mind wrongly gave short shrift to phenomenology. Conceding that they had relied too heavily on second-hand sources and had not given careful attention to the primary texts, Thompson makes clear that the original enactivists mistakenly came to hold that Husserl sponsored an unwanted brand of representationalism (see Thompson 2007, appendix A; Thompson 2016).
Many contemporary enactivists, including Thompson, openly draw on and seek to renovate ideas from the phenomenological tradition, connecting them directly with current theorizing in the cognitive sciences (Gallagher 2005, Gallagher and Zahavi 2008/2021, Gallagher 2017). As Gallagher notes, for example, there has been new work in this vein on “Husserl’s concept of the ‘I can’ (the idea that I perceive things in my environment in terms of what I can do with them); Heidegger’s concept of the pragmatic ready-to-hand (Zuhanden) attitude (we experience the world primarily in terms of pre-reflective pragmatic, action-oriented use, rather than in reflective intellectual contemplation or scientific observation); and especially Merleau-Ponty’s focus on embodied practice” (Gallagher 2017, p. 5).
iv. Buddhist Connections
A major source of inspiration for original enactivists comes from Buddhist philosophy and practice. Thompson remarks in an interview that, to his knowledge, The Embodied Mind, “was the first book that related Buddhist philosophy to cognitive science, the scientific study of the mind, and the Western philosophy of mind” (Littlefair 2020).
Speaking on behalf of the authors of The Embodied Mind, Rosch (2016) reports that “We turned to Buddhism because, in our judgment, it provided what both Western psychology and phenomenology lacked, a disciplined and nonmanipulative method of allowing the mind to know itself—a method that we (in retrospect naively) simply called mindfulness” (p. xli). Despite having turned to Buddhist philosophy and psychology due to a mistaken assessment of what Western phenomenology has to offer, original enactivism continues to seek fruitful dialogues between Buddhist and Western traditions of philosophy of mind. Enactivism has helped to promote the recognition that phenomenological investigations need not be limited to work done in the European tradition.
There are potential gains to be had from conducting dialogues across traditions of thought for at least two reasons. Sometimes those working in a different tradition focus on phenomena unnoticed by other traditions. And sometimes those working in a different tradition offer novel observations about phenomena that are already of common interest. Recognizing the potential value of such dialogues, enactivists have a sustained interest in what Asian traditions of thought and practice have to offer when it comes to investigating and describing experience, and “in particular the various Buddhist and Hindu philosophical analyses of the nature of the mind and consciousness, based on contemplative mental training” (Thompson 2007, p. 474).
Inspired by these efforts at cross-fertilization, Varela initially formulated neurophenomenology, which was subsequently taken up by others (Varela 1996, 1999, Thompson 2007). Neurophenomenology was developed as a novel approach to the science of consciousness—one that incorporates empirical studies of mindful, meditative practice with the aim of getting beyond the hard problem of consciousness. Although, as a practical approach to the science of consciousness, neurophenomenology certainly breaks new ground, it has been criticized for failing to adequately address the theoretical roots of the hard problem of consciousness, which are grounded in particular metaphysical commitments (see, for example, Kirchhoff and Hutto 2016 and replies from commentators).
Another enactivist borrowing from Buddhist philosophy, of a more theoretical bent, is the claim that cognition and consciousness are absolutely groundless—that they are ultimately based only on empty co-dependent arising. Thompson (2016) reports that the original working title of The Embodied Mind was Worlds Without Grounds. That initial choice of title, though later changed, shows the centrality of the idea of groundlessness for the book’s authors. As Thompson explains, the notion of groundlessness in Buddhist philosophy is meant to capture the idea “that phenomena lack any inherent and independent being; they are said to be ‘empty’ of ‘own being’” (p. xviii).
The original enactivists saw a connection with the Buddhist notion of groundlessness and their view that cognition only arises through viable organismic activity and histories of interaction that are not predetermined. For them, the idea that cognition is groundless is supported by the conception of evolution as natural drift. Accordingly, they maintain that “our human embodiment and the world that is enacted by our history of coupling reflect only one of many possible evolutionary pathways. We are always constrained by the path we have laid down, but there is no ultimate ground to prescribe the steps that we take” (Varela and others 1991, p. 213). Or, as Thompson (2016) puts it, “Cognition as the enaction of a world means that cognition has no ground or foundation beyond its own history” (p. xviii).
Thompson (2021) has emphasized the apparently far-reaching consequences this view has for mainstream conceptions of science and nature. To take it fully on board is to hold that ultimate reality is ungraspable, that it is beyond conception, or that it is not ‘findable under analysis’. As such, he observes that, on the face of it, the traditional Mahāyāna Buddhist idea of ‘emptiness’ (śūnyatā—the lack of intrinsic reality) appears to be at odds with standard, realist, and objectivist conceptions of scientific naturalism. This tension raises a deep question of what taking these Buddhist ideas seriously might mean “for scientific thinking and practice” (Thompson 2021, p. 78). Others too have sought to work through the implications of taking enactivist ideas seriously when thinking about an overall philosophy of nature (Hutto and Satne 2015, 2018a, 2018b; Gallagher 2017, 2018b; Meyer and Brancazio 2022). These developments raise the interesting question: To what extent, and at what point, might enactivist revisions to our understanding and practice of science come into direct tension with and begin to undermine attempts to make the notion of autonomous agency credible by “providing a factual, biological justification for it” (Varela 1991, p. 79)?
v. Sense-Making
A foundational, signature idea associated with the original version of enactivism and its direct descendants is that the autonomous agency of living systems, and what it entails, amounts to a kind of sense-making. The notion of sense-making made its debut in the title of a presentation that Varela delivered in 1981, and the idea’s first published expression arrived with the publication of that presentation, as follows: “Order is order, relative to somebody or some being who takes such a stance towards it. In the world of the living, order is indeed inseparable from the ways in which living systems make sense, so that they can be said to have a world” (Varela 1984, p. 208; see Thompson 2011 for further discussion of the origins of the idea). The idea that living systems are sense-making systems has proved popular with many enactivists, although, interestingly, there is no explicit mention of sense-making in The Embodied Mind.
Sense-making is variously characterised in the literature. Sometimes it is characterised austerely, serving simply as another name for the autonomous activity of living systems. In other uses, it picks out, more contentiously, what is claimed to be directly entailed by the autonomous activity of living systems. In the latter uses, different authors attribute a variety of diverse properties to sense-making activity in their efforts to demonstrate how phenomenological aspects of mind derive directly from, or are otherwise somehow connected with, the autonomous agency of living systems. Making the case for links between life and mind can be seen, broadly, as a continuation of Varela’s project “to establish a direct entailment from autopoiesis to the emergence of a world of significance” (Di Paolo and others 2018, p. 32).
At its simplest, sense-making is used to denote the autonomous agency of living systems. For example, that is how the notion is used in the following passages:
Living is a process of sense-making, of bringing forth significance and value. In this way, the environment becomes a place of valence, of attraction and repulsion, approach or escape (Thompson 2007, p. 158).
Sense-making is the capacity of an autonomous system to adaptively regulate its operation and its relation to the environment depending on the virtual consequences for its own viability as a form of life (Di Paolo and others 2018, p. 33).
Such an identification is at play when it is said that “even the simplest organisms regulate their interactions with the world in such a way that they transform the world into a place of salience, meaning, and value—into an environment (Umwelt) in the proper biological sense of the term. This transformation of the world into an environment happens through the organism’s sense-making activity” (Thompson and Stapleton 2009, p. 25). However, Di Paolo and others (2017) go further, claiming that “it is possible to deduce from processes of precarious, material self-individuation the concept of sense-making” (p. 7).
Enactivists add to this basic explication of sense-making, claiming that the autonomous activity of living systems is equivalent to, invariably gives rise to, entails, or is naturally accompanied by a plethora of additional properties: having a perspective, intentionality, interpretation, making sense of the world, care, concern, affect, values, evaluation, and meaning.
Thompson (2007) explains that the self-individuating and identity-forging activity of living systems “establishes logically and operationally the reference point or perspective for sense-making and a domain of interactions” (p. 148). It is claimed that such autonomous sense-making activity establishes “a perspective from which interactions with the world acquire a normative status” (Di Paolo and others 2018, p. 32). Di Paolo and others (2017) appear to add something more to this explication when they take sense-making to be equivalent to an organism not only having a perspective on things but having “a perspective of meaning on the world invested with interest for the agent itself” (p. 7).
Thompson (2007) tells us that according to Varela, sense-making “is none other than intentionality in its minimal and original biological form” (Thompson 2007, p. 148; see Varela 1997a, Thompson 2004). This fits with the account of intentionality provided in The Embodied Mind, according to which “embodied action is always about or directed toward something that is missing… actions of the system are always directed toward situations that have yet to become actual” (Varela and others 1991, p. 205). In their classic statement of this view, the original enactivists held that intentionality “consists primarily in the directedness of action… to what the system takes its possibilities for action to be and to how the resulting situations fulfill or fail to fulfill these possibilities” (Varela and others 1991, pp. 205-206).
Talk of sense-making, despite the minimal operational definition provided above, is sometimes used interchangeably and synonymously with the notion that organisms make sense of their environments. This locution is at the heart of Varela’s initial presentation of the view in Varela (1984), and others retain the language. Thompson (2007) tells us that “an autopoietic system always has to make sense of the world so as to remain viable” (pp. 147-148). He also tells us, “Life is thus a self-affirming process that brings forth or enacts its own identity and makes sense of the world from the perspective of that identity” (Thompson 2007, p. 153). Rolla and Huffermann (2021) describe enactivists as committed to the claim that “organisms make sense of their environments through autopoiesis and sensorimotor autonomy, thereby establishing meaningful environmental encounters” (p. 345).
Enactivists also regard sense-making as the basis for values and evaluations, as these, they claim, appear even in the simplest and most basic forms of life (see, for example, Rosch 2016). This claim connects with the enactivist assumption that all living things have intrinsic purposiveness and an immanent teleology (Thompson 2007, Di Paolo and others 2018, see also Gambarotto and Mossio 2022).
Certain things are adaptive or maladaptive for organisms, and, as such, through their active sense-making, they tend to be attracted to the former and repelled by the latter (Thompson 2007, p. 154). Accordingly, it is claimed that organisms must evaluate whatever they encounter. For example, a sense-making system “… ‘evaluates’ the environmental situation as nutrient-rich or nutrient-poor” (Di Paolo and others 2018, p. 32). It is claimed that such evaluation is necessary given that the “organism’s ‘concern’… is to keep on going, to continue living” (Di Paolo and others 2018, p. 33). Moreover, it is held that the autonomous sense-making activity of organisms generates norms that “must somehow be accessible (situations must be accordingly discernible) by the organism itself” (Di Paolo and others 2018, p. 32). So conceived, we are told that “sense-making… lies at the core of every form of action, perception, emotion, and cognition, since in no instance of these is the basic structure of concern or caring ever absent. This is constitutively what distinguishes mental life from other material and relational processes” (Di Paolo and others 2018, p. 33).
Those who have sought to develop the idea of sense-making also maintain that “cognition is behaviour in relation to meaning… that the system itself enacts or brings forth on the basis of its autonomy” (Thompson 2007, p. 126). In this regard, Cappuccio and Froese (2014) speak of an organism’s “active constitution of a meaningful ‘world-environment’ (Umwelt)” (p. 5).
Importantly, Thompson (2007) emphasizes that sense-making activity not only generates its own meaning but also simultaneously responds to it. He tells us that “meaning is generated within the system for the system itself—that is, it is generated and at the same time consumed by the system” (p. 148). This idea comes to the fore when he explicates his account of emotional responding, telling us that “an endogenously generated response… creates and carries the meaning of the stimulus for the animal. This meaning reflects the individual organism’s history, state of expectancy, and environmental context” (Thompson 2007, p. 368). Similarly, in advancing her own account of enactive emotions, Colombetti (2010) also speaks of organismic “meaning generating” activity and describes the non-neural body as a “vehicle of meaning” (2010, p. 146; p. 147).
Di Paolo and his co-authors defend similar views, holding that “the concept of sense-making describes how living organisms relate to their world in terms of meaning” (Di Paolo and others 2017, p. 7); and that an organism’s engagements with features of the environment “are appreciated as meaningful by the organism” (Di Paolo and others 2018, p. 32).
Enactivists who defend these views about sense-making are keen to note that the kind of ‘meaning’ that they assume is brought forth and consumed by organisms is not to be understood in terms of semantic content, nor does it entail the latter. As such, the kind of meaning that they hold organisms bring forth is not in any way connected to or dependent upon mental representations as standardly understood. We are told that “if we wish to continue using the term representation, then we need to be aware of what sense this term can have for the enactive approach… Autonomous systems do not operate on the basis of internal representations; they enact an environment” (Thompson 2007, pp. 58-59). Indeed, in moving away from cognitivist assumptions, a major ambition of this variety of enactivism is to establish that “behavior… expresses meaning-constitution rather than information processing” (Thompson 2007, p. 71).
In sum, a main aspiration of original enactivism is to bring notions such as sense-making to bear to demonstrate how key observations about biological autonomy can ground phenomenological aspects of mindedness such as “concernful affect, caring attitudes, and meaningful engagements that underscore embodied experience” (Di Paolo and others 2018, p. 42). The sense-making interpretation of biological autonomy is meant to justify attributing basic structures of caring, concern, meaning, sense, and value to living systems quite generally (Di Paolo and others 2018, p. 22). Crucially and pivotally, it is claimed of the original version of enactivism that through its understanding of “precarious autonomy, adaptivity, and sense-making, the core aspect of mind is naturalized” (Di Paolo and others 2018, p. 33).
In pursuing its naturalizing ambition, the original version of enactivism faces a particular challenge. Simply put, the weaker (more austere and deflated) its account of sense-making, the more credible it will be for the purpose of explicating the natural origins of minds, but the less capable it will be of accounting for all aspects of mindedness. Contrariwise, the stronger (more fulsome and inflated) its account of sense-making, the more capable it will be of accounting for all aspects of mindedness, but the less credible it will be for the purpose of explicating the natural origins of minds.
For example, in their original statement of enactivism, Varela and others (1991) speak of the most primitive organisms enacting domains of ‘significance’ and ‘relevance’. They add that this implies that ‘some kind of interpretation’ is going on. Yet, they are also careful to emphasize that they use their terms advisedly and are at pains to highlight that “this interpretation is a far cry from the kinds of interpretation that depend on experience” (p. 156). More recently, Stapleton (2022) maintains that:
The autopoietic enactivist is, of course, not committed to viewing the bacterium as experiencing the value that things in its environment have for it. Nor, to viewing the bacterium as purposefully regulating its coupling with the environment, where ‘purposeful’ is understood in the terms we normally use it—as implying some kind of reflection on a goal state and striving to achieve that goal state by behaving in a way in which one could have done otherwise (p. 168).
Even if it is accepted that all cognition lies along a continuum, anyone who acknowledges that there are significantly different varieties of cognition that have additional properties not exhibited by the most basic forms must face up to the ‘scaling up’ challenge. As Froese and Di Paolo (2009) ask, “Is it a question of merely adding more complexity, that is, of just having more of the same kind of organizations and mechanisms? Then why is it seemingly impossible to properly address the hallmarks of human cognition with only these basic biological principles?” (p. 441). In this regard, Froese and Di Paolo (2009) admit that even if the notion of sense-making is thought to be appropriate for characterizing the activity of the simplest living creatures, it still “cries out for further specification that can distinguish between different modes of sense-making” (p. 446).
With the scaling up challenge in sight, several enactivists have been working to explicate how certain, seemingly distinctive high-end human forms of sense-making relate to those of the most basic, primitive forms of life (Froese and Di Paolo 2009; De Jaegher and Froese 2009; Froese, Woodward and Ikegami 2013, Kee 2018). Working in this vein, Cuffari and others (2015) and Di Paolo and others (2018) have broken new ground by providing a sense-making account of human language in their efforts to dissolve the scaling-up problem and demonstrate the full scope and power of key ideas from the original version of enactivism.
b. Sensorimotor Knowledge Enactivism
At a first pass, what is sometimes called simply sensorimotor enactivism holds that perceiving and perceptual experience “isn’t something that happens in us, it is something we do” (Noë 2004, p. 216). Accordingly, perceiving and experiencing are “realized in the active life of the skillful animal” (Noë 2004, p. 227). Its main proponent, Alva Noë (2021), tells us:
The core claim of the enactive approach, as I understand it, and as this was developed in Noë, 2004, and also O’Regan and Noë, 2001… [is that] the presence of the world, in thought and experience, is not something that happens to us but rather something that we achieve or enact (p. 958).
This version of enactivism travels under various names in the literature, including the enactive approach (Noë 2004, 2009, 2012, 2016, 2021); sensorimotor theory (O’Regan and Noë 2001; Myin and O’Regan 2002; Myin and Noë 2005; O’Regan 2011); ‘the dynamic sensorimotor approach’ (Hurley and Noë 2003), which also drew on Hurley (1998); and ‘actionism’ (Noë 2012, 2016). In Noë (2021), the new label ‘sensorimotor knowledge enactivism’ was introduced to underscore the key importance of the idea that perceiving and perceptual experiences are grounded in a special kind of knowledge. Hence, a fuller and more precise explication of the core view of this version of enactivism is that experience of the world comes in the form of an understanding that is achieved through an active exploration of the world, which is mediated by practical knowledge of its relevant sensorimotor contingencies.
The emphasis on sensorimotor understanding and knowledge is what makes this version of enactivism distinctive. Sensorimotor knowledge enactivism holds that in order “to perceive, you must have sensory stimulation that you understand” (Noë 2004, p. 183; see also p. 180, p. 3). In explicating this view, Noë (2012) is thus at pains to highlight “the central role understanding, knowledge, and skill play in opening up the world for experience… the world is blank and flat until we understand it” (Noë 2012, p. 2). Later in the same book, he underscores this crucial point yet again, saying that:
According to the actionist (or enactive) direct realism that I am developing here, there is no perceptual experience of an object that is not dependent on the exercise by the perceiver of a special kind of knowledge. Perceptual awareness of objects, for actionist-direct realism, is an achievement of sensorimotor understanding. (Noë 2012, p. 65).
These claims also echo the original statement of the view, which tells us that “the central idea of our new approach is that vision is a mode of exploration of the world that is mediated by knowledge of what we call sensorimotor contingencies” (O’Regan and Noë 2001, p. 940, see also Noë 2004, p. 228).
Putting this together, Noë (2004) holds that “all perception is intrinsically thoughtful” (2004, p. 3). Accordingly, canonical forms of perceiving and thinking really just lie at different points along the same spectrum: “perception is… a kind of thoughtful exploration of the world, and thought is… a kind of extended perception” (Noë 2012, pp. 104-105). Sensorimotor knowledge enactivism thus asks us to think of the distinction between thought and perception as “a distinction among different styles of access to what there is… thought and experience are different styles of exploring and achieving, or trying to achieve, access to the world” (Noë 2012, pp. 104-105).
The view is motivated by the longstanding observation that we cannot achieve an accurate phenomenology of experience if we only focus on the raw stimulation and perturbation of sensory modalities. A range of considerations support this general position. A proper phenomenology of experience requires an account of what it is to grasp the perceptual presence of objects in the environment. But this cannot be accounted for solely by focusing on raw sensations. The visual experience of, say, seeing a tomato is an experience of a three-dimensional object that takes up space voluminously. This cannot be explained simply by appealing to what is passively ‘given’ to or supplied by the senses. For what is, strictly, provided to the visual system is only, at most, a partial, two-dimensional take of the tomato.
Empirical findings also reveal the need to distinguish between mere sensing and experiencing. It has been shown that it is possible to be sensorially stimulated in normal ways without this resulting in the experience of features or aspects of the surrounding environment in genuinely perceptual ways—in ways that allow subjects to competently engage with worldly offerings or to make genuinely perceptual reports. This is the situation, for example, for those first learning to manipulate sensory substitution devices (O’Regan and Noë 2001, Noë 2004, Roberts 2010).
There are longstanding philosophical and empirical reasons, then, for thinking that something must be added to sensory stimulation to yield full-blown experience of worldly offerings and to enable organisms to engage with them successfully.
A familiar cognitivist answer is that the extra ingredient needed for perceiving comes in the form of inner images or mental representations. Sensorimotor knowledge enactivism rejects these proposals, denying that perceiving depends on mental representations, however rich and detailed. In this regard, sensorimotor knowledge enactivism also sets its face against the core assumption of the popular predictive processing accounts of cognition by holding that the world does not show up for us “as it does because we project or interpret or confabulate or hypothesize… in something like the way a scientist might posit the existence of an unobserved force” (Noë 2012, p. 5).
Sensorimotor knowledge enactivism, by contrast, holds that perceptual experience proper is grounded in the possession and use of implicit, practical knowledge such that, when such knowledge is deployed properly, it constitutes understanding and allows organisms to make successful contact with the world.
Successfully perceiving the world and enjoying perceptual experiences of it are mediated and made possible by the possession and skillful deployment of a special kind of practical knowledge of sensorimotor contingencies, namely, knowledge of the ways in which stimulation of sense modalities changes, contingent upon aspects of the environment and the organism’s own activities.
Having the sensation of softness consists in being aware that one can exercise certain practical skills with respect to the sponge: one can, for example, press it, and it will yield under the pressure. The experience of the softness of the sponge is characterized by a variety of such possible patterns of interaction with the sponge, and the laws that describe these sensorimotor interactions we call, following MacKay (1962), the laws of sensorimotor contingency (O’Regan and Noë, 2001). (O’Regan and others, 2005, p. 56, emphasis added).
Knowledge of this special sort is meant to account for the expectations that perceivers have concerning how things will appear in the light of possible actions. It amounts to knowing how things will manifest themselves if the environment is perceptually explored in certain ways. At some level, so the theory claims, successful perceivers must have implicit mastery of relevant laws concerning sensorimotor contingencies.
Echoing ideas first set out in the original version of enactivism, sensorimotor knowledge enactivism holds that the phenomenal properties of experience—what-it-is-like properties—are not to be identified with extra ingredients over and above the dynamic, interactive responses of organisms. As such, its advocates hold that “we enact our perceptual experience: we act it out” (Noë 2004, p. 1). In line with the position advanced by other enactivists, Noë (2004) claims that:
Different animals inhabit different perceptual worlds even though they inhabit the same physical world. The sights, sounds, odors, and so on that are available to humans may be unavailable to some creatures, and likewise, there is much we ourselves cannot perceive. We lack the sensorimotor tuning and the understanding to encounter those qualities. The qualities themselves are not subjective in the sense of being sensations. We don’t bring them into existence. But only a very special kind of creature has the biological capacity, as it were, to enact them (p. 156).
On their face, some of the statements Noë makes about phenomenal properties appear to be of a wholly realist bent. For example, he says, “There is a sense in which we move about in a sea of perspectival properties and are aware of them (usually without thought or notice) whenever we are perceptually conscious. Indeed, to be perceptually conscious is to be aware of them” (Noë 2004, p. 167). Yet, he also appears to endorse a middle-way position that recognizes that the world can be understood as a domain of perceptual activity just as much as it can be understood as a place consisting of or containing the properties and facts that interest us (Noë 2004, p. 167).
It is against that backdrop that Noë holds, “Colours are environmental phenomena, and colour experience depends not only on movement-dependent but also on object-dependent sensorimotor contingencies… colour experience is grounded in the complex tangle of our embodied existence” (Noë 2004, p. 158). In the end, sensorimotor knowledge enactivism offers the following answer to the problem of consciousness: “How the world shows up for us depends not only on our brains and nervous systems but also on our bodies, our skills, our environment, and the way we are placed in and at home in the world” (Noë 2012, pp. 132-133).
Ultimately, “perceptual experience presents the world as being this way or that; to have experience, therefore, one must be able to appreciate how the experience presents things as being” (Noë 2004, p. 180). This is not something that is automatically done for organisms; it is something that they sometimes achieve. Thus, “The world shows up for us thanks to what we can do… We make complicated adjustments to bring the world into focus… We achieve access to the world. We enact it by enabling it to show up for us… If I don’t have the relevant skills of literacy, for example, the words written on the wall do not show up for me” (Noë 2012, pp. 132-133).
So understood, sensorimotor knowledge enactivism resists standard representational accounts of perception, holding that “perceivings are not about the world; they are episodes of contact with the world” (Noë 2012, p. 64). It sponsors a form of enactive realism according to which the content of perceiving only becomes properly perceptual content that represents how things are when the skillful use of knowledge makes successful contact with the world. There is no guarantee of achieving that outcome. Hence, many attempts at perceiving might be groping, provisional efforts in which we only gain access to how things appear to be and not how they are.
On this view, “perception is an activity of learning about the world by exploring it. In that sense, then, perception is mediated by appearance” (Noë 2004, p. 166). Achieving access to the world via knowledgeable, skillful exploration is to discover the relevant patterns that reveal “how things are from how they appear” (Noë 2012, p. 164). Thus, “hearing, like sight and touch, is a way of learning about the world… Auditory experience, like visual experience, can represent how things are” (Noë 2004, p. 160).
Accordingly, Noë (2004) holds that the perceptual content of experience has a dual character: it presents the world as being a certain way and presents how things are experienced, capturing how things look, or sound, or feel from the vantage point of the perceiver. It is because Noë assumes perceptual content has both of these aspects that he is able to defend the view that perceptual experience is a “way of encountering how things are by making contact with how they appear to be” (Noë 2004, p. 164).
The key equation for how this is possible, according to sensorimotor knowledge enactivism, is as follows: “How [things] (merely) appear to be plus sensorimotor knowledge gives you how things are” (Noë 2004, p. 164). Put otherwise, “for perceptual sensation to constitute experience —that is, for it to have genuine representational content —the perceiver must possess and make use of sensorimotor knowledge” (Noë 2004, p. 17).
Even though knowledge and understanding lie at the heart of sensorimotor knowledge enactivism, Noë (2012) stresses that “your consciousness of… the larger world around you is not an intellectual feat” (Noë 2012, p. 6). He proposes to explain how to square these ideas by offering a putatively de-intellectualized account of knowledge and understanding, advancing a “practical, active, tool-like conception of concepts and the understanding” (Noë 2012, p. 105).
Sensorimotor knowledge enactivism bills itself as rejecting standard representationalism about cognition while also maintaining that perceptual experiences make claims or demands on how things are (Noë 2021). Since, to this extent, sensorimotor knowledge enactivism retains this traditional notion of representational content, at its core, Noë (2021) has come to regard the ‘real task’ for defenders of this view as “to rethink what representation, content, and the other notions are or could be” (p. 961).
It remains to be seen if sensorimotor knowledge enactivism can explicate its peculiar notions of implicit, practical understanding, and representational content in sufficiently novel and deflated ways that can do all the philosophical work asked of them without collapsing into or otherwise relying on standard cognitivist conceptions of such notions. This is the longstanding major challenge faced by this version of enactivism (Block 2005, Hutto 2005).
c. Radical Enactivism
Radical enactivism, also known as radical enactive cognition or REC, saw its debut in Hutto (2005) and was developed and supported in subsequent publications (Menary 2006, Hutto 2008, 2011a, 2011c, 2013a, 2013c, 2017, 2020, Hutto and Myin 2013, 2017, 2018a, 2018b, 2021). It was originally proposed as a critical adjustment to sensorimotor enactivists’ conservative tendencies, as set out in O’Regan and Noë (2001), tendencies which were deemed to be at odds with the professed anti-representationalism of the original version of enactivism. Radical enactivism proposes an account of enactive cognition that rejects characterizing or explaining the most basic forms of cognition in terms of mediating knowledge. This is because radical enactivists deem it unlikely that such notions can be non-vacuously explicated or accounted for naturalistically.
Importantly, radical enactivism never sought to advance a wholly original, new type or brand of enactivism. Instead, its goal was always to identify a minimal core set of tenable yet non-trivial enactivist theses and defend them through analysis and argument.
Much of the work of radical enactivists is subtractive: it adds by cutting away, operating on the assumption that often less is more. The adopted approach is explicated in greater detail in Evolving Enactivism, wherein several non-enactivist proposals about cognition are examined in an effort to assess whether they could be modified and allied with radical enactivism. This process, known as RECtification, is one “through which… target accounts of cognition are radicalized by analysis and argument, rendering them compatible with a Radically Enactive account of Cognition” (Hutto and Myin 2017, p. xviii).
In advancing this cause, Hutto and Myin (2013) restrict radical enactivism’s ambitions to only promoting strong versions of what they call the Embodiment Thesis and the Developmental-Explanatory Thesis.
The Embodiment Thesis conceives of basic cognition in terms of concrete, spatio-temporally extended patterns of dynamic interaction between organisms and their environments. These interactions are assumed to take the form of individuals engaging with aspects of their environments across time, often in complex ways and on multiple scales. Radical enactivists maintain that these dynamic interactions are loopy, not linear. Such sensitive interactions are assumed, constitutively, to involve aspects of the non-neural bodies and environments of organisms. Hence, they hold that cognitive activity is not restricted to what goes on in the brain. In conceiving of cognition in terms of relevant kinds of world-involving organismic activity, radical enactivists characterize it as essentially extensive, not merely extended, in contrast to what Clark and Chalmers (1998) famously argued (see Hutto and Myin 2013; Hutto, Kirchhoff and Myin 2014).
The Developmental-Explanatory Thesis holds that mentality-constituting interactions are grounded in, shaped by, and explained by nothing more than the history of an organism’s previous interactions and features of its current environment. Sentience and sapience emerge, in the main, through repeated processes of organismic engagement with environmental offerings. An organism’s prolonged history of engaged encounters is the basis of its current embodied tendencies, know-how, and skills.
Radical enactivism differs from other versions of enactivism precisely in rejecting their more extravagant claims. It seeks to get by without the assumption that basic cognition involves mediating knowledge and understanding. Similarly, radical enactivism seeks to get by without assuming that basic cognition involves sense-making. It challenges the grounds for thinking that basic forms of cognition have the full array of psychological and phenomenological attributes associated with sense-making by other enactivists. Radical enactivists, for example, resist the idea that basic cognition involves organisms somehow creating, carrying, and consuming meanings.
Additionally, radical enactivists do not assume that intentionality and phenomenality are constitutively or inseparably linked. Its supporters do not endorse the connection principle according to which intentionality and phenomenal consciousness are taken to be intrinsically related (see Searle 1992, Ch. 7; compare Varela and others, 1991, p. 22). Instead, radical enactivists maintain that there can be instances of world-directed cognition that are lacking in phenomenality, even though, in the most common human cases, acts of world-directed cognition possess a distinctive phenomenal character (Hutto 2000, p. 70).
Most pivotally, radical enactivism thoroughly rejects positing representational contents at the level of basic mentality. One of its most signature claims, and one in which it agrees with original enactivism, is that basic forms of mental activity neither involve nor are best explained by the manipulation of contentful representations. Its special contribution has been to advance novel arguments designed to support the idea that organismic activity, conceived of as engaging with features of their environments in specifiable ways, suffices for the most basic kinds of cognition.
To encourage acceptance of this view, radical enactivists articulated the hard problem of content (Hutto 2013c, Hutto and Myin 2013, Hutto and Myin 2018a, 2018b). This hard problem, posed as a challenge to the whole field, rests on the observation that information understood only in terms of covariance does not constitute any kind of content. Hutto and Myin (2013) elevate this observation into a principle and use it to reveal the hard choice dilemma that anyone seeking to give a naturalistic account of basic cognition must face. The first option is to rely only on the notion of information-as-covariance in securing the naturalistic credentials of its explanatory resources, at the cost of not having adequate resources to explain the natural origins of the content that basic forms of cognition are assumed to have. The second option is to presuppose an expanded or inflated notion of information, one which can adequately account for the content of basic forms of cognition, at the cost of having to surrender its naturalistic credentials. Either way, so the analysis goes, it is not possible to give a naturalized account of the content of basic forms of cognition.
Providing a straight solution to the hard problem of content requires “explaining how it is possible to get from non-semantic, non-contentful informational foundations to a theory of content using only the resources of a respectable explanatory naturalism” (Hutto 2018, pp. 24–5).
Hutto and Myin (2013) put existing naturalistic theories of content to the test, assessing their capacity to answer this challenge. As Salis (2022, p. 1) describes this work, they offer “an ensemble of reasons” for thinking naturalistic accounts of content will fail.
Radical enactivism wears the moniker ‘radical’ due to its interest in getting to the root of issues concerning cognition and its conviction that not all versions of enactivism have been properly steadfast in their commitment to anti-content, anti-representational views about the character of basic mindedness. For example, when first explicating their conception of the aboutness or intentionality of cognition as embodied action, the original enactivists note that the mainstream assumption is that “in general, intentionality has two sides: first, intentionality includes how the system construes the world to be (specified in terms of the semantic content of intentional states); second, intentionality includes how the world satisfies or fails to satisfy this construal (specified in terms of the conditions of satisfaction of intentional states)” (Varela and others 1991, p. 205). That mainstream notion of intentionality, which is tied to a particular notion of content, is precisely the kind of intentionality that radical enactivism claims does not feature in basic cognition. In providing compelling arguments against the assumption that basic cognition is contentful in that sense, radical enactivism’s primary ambition is to strengthen enactivism by securely radicalizing it.
Several researchers have since argued that the hard problem of content has already been solved, or, at least, that it can be answered in principle or otherwise avoided (Miłkowski 2015, Raleigh 2018, Lee 2019, Ramstead and others 2020, Buckner 2021, Piccinini 2022). See, however, Hutto and Myin (2017, 2018a, 2018b) and Segundo-Ortin and Hutto (2021) for assessments of these potential moves.
On the positive side of the ledger, radical enactivists contend that the kind of mindedness found at the roots of cognition can be fruitfully characterized as a kind of Ur-intentionality. It is a kind of intentionality that lacks the sort of content associated with truth or accuracy conditions (Hutto and Myin 2013, 2017, 2018a, Zahnoun 2020, 2021b, 2021c). Moreover, radical enactivists hold that we can adequately account for Ur-intentionality, naturalistically, using biosemiotics – a modified teleosemantics inspired, in the main, by Millikan (1984) but stripped of its problematic semantic ambitions. This proposed adjustment of Millikan’s theory was originally advanced in Hutto (1999) in the guise of a modest biosemantics that sought to explain forms of intentionality with only nonconceptual content. That version of the position was abandoned and later radicalized to become a content-free biosemiotics (see Hutto 2006, 2008, Ch. 3). The pros and cons of the Ur-intentionality proposal continue to be debated in the literature (Abramova and Villalobos 2015, De Jesus 2016, 2018, Schlicht and Starzak 2019, Legg 2021, Paolucci 2021, Zipoli Caiani 2022, Mann and Pain 2022).
Importantly, radical enactivists only put biosemiotics to the theoretical use of explicating the properties of non-contentful forms of world-involving cognition. Relatedly, they hold that when engaged in acts of basic cognition, organisms are often sensitive to covariant environmental information, even though it is a mere metaphor to say organisms process it. Although organisms are sensitive to relevant indicative, informational relationships, “these relationships were not lying about ready-made to be pressed into service for their purposes” (Hutto 2008, pp. 53–54). When it comes to understanding biological cognition, the existence of the relevant correspondences is not explained by appeals to ahistorical natural laws but by various selectionist forces.
As Thompson (2011b) notes, if radical enactivism’s account of biosemiotics is to find common ground with original enactivism and its direct descendants, it would have to put aside strong adaptationist views of evolution. In fact, although radical enactivism does place great explanatory weight on natural selection, it agrees with original enactivism at least to the extent that it does not hold that biological traits are individually optimized, selected for in isolation from one another, to make organisms maximally fit to deal with features of a neutral, pre-existing world.
Radical enactivists accept that content-involving cognition exists even though they hold that our basic ways of engaging with the world and others are contentless. In line with this position, they have sought to develop an account of The Natural Origins of Content, a project pursued in several publications by Hutto and Satne (2015, 2017a, 2017b) and Hutto and Myin (2017). In these works, the authors have proposed that capacities for contentful speech and thought emerge with the mastery of distinctive socio-cultural practices, specifically, varieties of discursive practices with their own special norms. These authors also hold that the mastery of such practices introduces kinks into the cognitive mix, such as the capacity for ratio-logical reasoning (see, for example, Rolla 2021). Nevertheless, defenders of radical enactivism maintain that these kinks do not constitute a gap or break in the natural or evolutionary order (see Myin and Van den Herik 2020 for a defense of this position and Moyal-Sharrock 2021b for its critique). Instead, radical enactivists argue that the content-involving practices that enable the development of distinctively kinky cognitive capacities can be best understood as a product of constructed environmental niches (Hutto and Kirchhoff 2015). Rolla and Huffermann (2021) propose that, in fleshing out this account, radical enactivism could combine with the work of Di Paolo and others (2018) on linguistic bodies to understand the cognitive basis of language mastery, characterizing it as a kind of norm-infused and acquired shared know-how.
3. Forerunners
In the opening pages of Sensorimotor Life, its authors describe their contribution to the enactive literature as that of adding a ‘tributary to the flow of ideas’ which found its first expression in Varela, Thompson and Rosch’s The Embodied Mind. Making use of that metaphor, they also astutely note the value of looking “upstream to discover ‘new’ predecessors,” namely precursors to enactivism that can only be identified in retrospect: those which might qualify as “enactivists avant la lettre” (Di Paolo and others 2017, p. 3).
Enactivism clearly has “roots that predate psychology in its modern academic form” (Baerveldt and Verheggen 2012, p. 165). For example, in challenging early modern Cartesian conceptions of the mind as a kind of mechanism, it reaches back to a more Aristotelian vision of the mind that emphasizes its biological basis and features shared with all living things. Baerveldt and Verheggen (2012) also see clear links between enactivism and “a particular ‘radical’ tradition in Western Enlightenment thinking that can be traced at least to Spinoza” (p. 165). Gallagher argues that Anaxagoras should be considered the first enactivist based on his claim that human hands are what make humans the most intelligent of animals.
In the domain of biological ecology, there are clear and explicit connections between enactivism and the work of the German biologist Jakob von Uexküll, who introduced the notion of Umwelt, which had great influence in cybernetics and robotics. Resonances with enactivism can also be found in the work of Helmuth Plessner, a German sociologist and philosopher who studied with Husserl and authored Levels of Organic Life and the Human.
Another philosopher, Hans Jonas, who studied with both Heidegger and Husserl, stands out in this regard. As Di Paolo and others (2017) note, “Varela read his work relatively late in his career and was impressed with the resonances with his own thinking” (p. 3). In a collection of his essays, The Phenomenon of Life, very much in the spirit of the original version of enactivism, Jonas defends the view that there exists a deep, existential continuity between life and mind.
Many key enactivist ideas have also been advanced by key figures in the American pragmatist tradition. As Gallagher (2017) observes, “many of the ideas of Peirce, Dewey, and Mead can be considered forerunners of enactivism” (p. 5). Gallagher and Lindgren (2015) go a step further, maintaining that the pioneers of enactivism “could have easily drawn on the work of John Dewey and other pragmatists. Indeed, long before Varela and others (1991), Dewey (1896) clearly characterized what has become known as enactivism” (p. 392). See also Gallagher (2014), Gallagher and Miyahara (2012), and Barrett (2019).
In advocating the so-called actional turn, enactivists touch on recurrent themes of central importance in Wittgenstein’s later philosophy, in particular his emphasis on the importance of our animal nature, forms of life, and the fundamental importance of action for understanding mind, knowledge, and language use. Contemporary enactivists characterize the nature of minds and how they fundamentally relate to the world in ways that not only echo but, in many ways, fully concur with the later Wittgenstein’s trademark philosophical remarks on the same topics. Indeed, Moyal-Sharrock (2021a) goes so far as to say that “Wittgenstein is—and should be recognized to be—at the root of the important contemporary philosophical movement called enactivism” (p. 8). The connections between Wittgenstein and enactivism are set out by many other authors (Hutto 2013d, 2015c, Boncompagni 2013, Loughlin 2014, 2021a, 2021b, Heras-Escribano and others 2015; see also Loughlin 2021c for a discussion of how some of Wittgenstein’s ideas might also challenge enactivist assumptions).
4. Debates
Enactivism bills itself as providing an antidote to accounts of cognition that “take representation as their central notion” (Varela and others 1991, p. 172). Most fundamentally, in proposing that minds, like all living systems, are distinguished from machines by their biological autonomy, it sees itself as opposed to and rejects computational theories and functionalist theories of mind, including extended functionalist theories of mind (Di Paolo and others 2017, Gallagher 2017). Enactivism thus looks to work in robotics in the tradition of Brooks (1991) and dynamical systems theory (Smith and Thelen 1994, Beer 1998, Juarrero 1999) for representation-free and model-free ways of characterizing and potentially explaining extensive cognitive activity (Kirchhoff and Meyer 2019, Meyer 2020a, 2020b).
In a series of publications, Villalobos and coauthors offer a sustained critique of enactivism for its commitment to biological autonomy on the grounds that its conception of mind is not sufficiently naturalistic. These critics deem enactivism’s commitment to teleology as the most problematic and seek to develop, in its place, an account of biological cognition built on a more austere interpretation of autopoietic theory (Villalobos 2013, Villalobos and Ward 2015, Abramova and Villalobos 2015, Villalobos and Ward 2016, Villalobos and Silverman 2018, Villalobos 2020, Villalobos and Razeto-Barry 2020, Villalobos and Palacios 2021).
An important topic in this body of work, taken up by Villalobos and Dewhurst (2017), is the proposal that enactivism may be compatible, despite its resistance to the idea, with a computational approach to cognitive mechanisms. This possibility seems plausible to some given the articulation of conceptions of computation that allow for computation without representation (see, for example, Piccinini 2008, 2015, 2020). For a critical response to the suggestion that enactivism is or should want to be compatible with a representation-free computationalism, see Hutto and others (2019) and Hutto and others (2020).
Several authors see great potential in allying enactivism and ecological psychology, a tradition in psychology initiated by James Gibson which places responsiveness to affordances at its center (Gibson 1979). In recent times, this possibility has become more attractive with the articulation of radical embodied cognitive science (Chemero 2009), which seeks to connect Gibsonian ideas with dynamical systems theory, without invoking mental representations.
A joint ecological-enactive approach to cognition has been proposed in the form of the skilled intentionality framework (Rietveld and Kiverstein 2014, Bruineberg and Rietveld 2014, Kiverstein and Rietveld 2015, 2018, Bruineberg and others 2016, Rietveld, Denys and Van Westen 2018, Bruineberg, Chemero and Rietveld 2019). It seeks to provide an integrated basis for understanding the situated and affective aspects of the embodied mind, emphasizing that organisms must always be sensitive to multiple affordances simultaneously in concrete situations.
The task of ironing out apparent disagreements between enactivism and ecological psychology to forge a tenable alliance of these two traditions has also been actively pursued by others (see Heras-Escribano 2016, Stapleton 2016, Segundo-Ortin and others 2019, Heras-Escribano 2019, Crippen 2020, Heft 2020, Myin 2020, Ryan and Gallagher 2020, Segundo-Ortin 2020, McGann and others 2020, Heras-Escribano 2021, Jurgens 2021, Rolla and Novaes 2022).
A longstanding sticking point that has impeded a fully-fledged enactivist-ecological psychology alliance is the apparent tension between enactivism’s wholesale rejection of the notion that cognition involves information processing and the tendency of those in the ecological psychology tradition to talk of perception as involving the ‘pickup’ of information ‘about’ environmental affordances (see Varela and others 1991, pp. 201–204; Hutto and Myin 2017, p. 86). See also Van Dijk and others (2015). The use of such language can make it appear as if the Gibsonian framework is committed to the view that perceiving is a matter of organisms attuning to the covariant structures of a pre-given world. Notably, Baggs and Chemero (2021) attempt to directly address this obstacle to uniting the two frameworks (see also de Carvalho and Rolla 2020).
There have been attempts to take enactivist ideas seriously by some versions of predictive processing theories of cognition. In several publications, Andy Clark (2013, 2015, 2016) has sought to develop a version of predictive processing accounts of cognition that is informed, to some extent, by the embodied, non-intellectualist, action-orientated vision of mind promoted by enactivists.
Yet most enactivist-friendly advocates of predictive processing accounts of cognition tend to baulk when it comes to giving up the idea that cognition is grounded in models and mental representations. Clark (2015) tells us that he can’t imagine how to get by without such constructs when he rhetorically asks himself, “Why not simply ditch the talk of inner models and internal representations and stay on the true path of enactivist virtue?” (Clark 2015, p. 4; see also Clark 2016, p. 293). Whether a tenable compromise is achievable or whether there is a way around this impasse is a recurring and now prominent theme in the literature on predictive processing (see, for example, Gärtner and Clowes 2017, Constant and others 2021, Venter 2021, Constant and others 2022, Gallagher and others 2022, Gallagher 2022b).
Several philosophers have argued that it is possible to develop entirely non-representationalist predictive processing accounts of cognition that could be fully compatible with enactivism (Bruineberg and Rietveld 2014; Bruineberg, Kiverstein, and Rietveld 2016; Bruineberg and others 2018; Bruineberg and Rietveld 2019). This promised union comes in the form of what Venter (2021) has called free energy enactivism. The Free Energy Principle articulated by Friston (2010, 2011) maintains that what unites all self-organizing systems (including non-living systems) is that they work to minimize free energy. Many have sought to build similar bridges between enactivism and free energy theorizing (Kirchhoff 2015, Kirchhoff and Froese 2017, Kirchhoff and Robertson 2018, Kirchhoff 2018a, 2018b, Kirchhoff and others 2018, Robertson and Kirchhoff 2019, Ramstead and others 2020a, Hesp and others 2019). However, Di Paolo, Thompson, and Beer (2022) identify what they take to be fundamental differences between the enactive approach and the free energy framework that appear to make such a union unlikely, if not impossible.
5. Applications and Influence
Enactivism’s novel framework for conceiving of minds and our place in nature has proved fertile and fecund. Enactivism serves as an attractive philosophical platform from which many researchers and practitioners are inspired to launch fresh investigations into a great variety of topics—investigations that have potentially deep and wide-ranging implications for theory and practice.
In the domain of philosophy of psychology, beyond breaking new ground in our thinking about the phenomenality and intentionality of perception and perceptual experience, enactivism has generated many fresh lines of research. Enactivists have contributed to new thinking about: the nature of habits and their intelligence (for example, Di Paolo and others 2017; Ramírez-Vizcaya and Froese 2019; Zarco and Egbert 2019; Hutto and Robertson 2020); emotions and, especially, the distinction in the affective sciences between basic and non-basic emotions (for example, Colombetti and Thompson 2008; Hutto 2012; Colombetti 2014; Hutto, Robertson, and Kirchhoff 2018); pretense (Rucińska 2016, 2019; Weichold and Rucińska 2021, 2022); imagination (for example, Thompson 2007; Medina 2013; Hutto 2015a; Roelofs 2018; Facchin 2021); memory (for example, Hutto and Peeters 2018; Michaelian and Sant’Anna 2021); mathematical cognition (for example, Zahidi and Myin 2016; Gallagher 2017, 2019; Hutto 2019; Zahidi 2021); and social cognition – in particular, advancing the proposal that the most basic forms of intersubjectivity take the form of direct, engaged interactions between agents, where this is variously understood in terms of unprincipled embodied engagements scaffolded by narrative practices (Hutto 2006, Gallagher and Hutto 2008 – see also Paolucci 2020, Hutto and Jurgens 2019), interaction theory (Gallagher 2005, 2017, 2020a), and participatory sense-making (De Jaegher and Di Paolo 2007; De Jaegher 2009).
In addition to stimulating new thinking about mind and cognition, enactive ideas have also influenced research on topics in many other domains, including: AI and technological development (Froese and Ziemke 2009; Froese and others 2012; Ihde and Malafouris 2019; Sato and McKinney 2022; Rolla and others 2022); art, music, and aesthetics (Noë 2015; Schiavio and De Jaegher 2017; Fingerhut 2018, Murphy 2019; Gallagher 2021; Høffding and Schiavio 2021); cognitive archaeology (Garofoli 2015, 2018, 2019; Garofoli and Iliopoulos 2018); cross-cultural philosophy (McKinney 2020, Janz 2022, Lai 2022); education and pedagogical design (Hutto and others 2015; Gallagher and Lindgren 2015; Abrahamson and others 2016; Hutto and Abrahamson 2022); epistemology (Vörös 2016; Venturinha 2016; Rolla 2018; De Jaegher 2021; Moyal-Sharrock 2021); ethics and values (Varela 1999a; Colombetti and Torrance 2009; Di Paolo and De Jeagher 2022); expertise and skilled performance (Hutto and Sánchez-García 2015; Miyahara and Segundo-Ortin 2022; Robertson and Hutto 2023); mental health, psychopathology, and psychiatry (Fuchs 2018; de Haan 2020; Jurgens and others 2020; Maiese 2022b, 2022c, 2022d); rationality (Rolla 2021).
6. Conclusion
There can be no doubt that enactivism is making waves in today’s philosophy, cognitive science, and beyond the boundaries of the academy. Although still a relatively young movement, enactivism has established itself as a force to be reckoned with in our thinking about mind, cognition, the world around us, and many other related topics. What remains to be seen is whether, and to what extent, different versions of enactivism will continue to develop productively, whether they will unite or diverge, whether they will find new partners, and, most crucially, whether enactivist ideas will continue to be actively taken up and widely influential. For now, this much is certain: The enactivist game is very much afoot.
7. References and Further Reading
Abrahamson, D., Shayan, S., Bakker, A., and Van der Schaaf, M. 2016. Eye-tracking Piaget: Capturing the Emergence of Attentional Anchors in the Coordination of Proportional Motor Action. Human Development, 58(4-5), 218–244.
Abramova, K. and Villalobos, M. 2015. The Apparent Ur-Intentionality of Living Beings and the Game of Content. Philosophia, 43(3), 651-668.
Baerveldt, C. and Verheggen, T. 2012. Enactivism. The Oxford Handbook of Culture and Psychology. Valsiner, J. (ed). Oxford. Oxford University Press. pp. 165–190.
Baggs, E. and Chemero, A. 2021. Radical Embodiment in Two Directions. Synthese, 198:S9, 2175–2190.
Barandiaran, X. E. 2017. Autonomy and Enactivism: Towards a Theory of Sensorimotor Autonomous Agency. Topoi, 36(3), 409–430.
Barandiaran, X. and Di Paolo, E. 2014. A Genealogical Map of the Concept of Habit. Frontiers in Human Neuroscience.
Beer, R. 1998. Framing the Debate between Computational and Dynamical Approaches to Cognitive Science. Behavioral and Brain Sciences, 21(5), 630-630.
Boncompagni, A. 2020. Enactivism and Normativity: The case of Aesthetic Gestures. JOLMA – The Journal for the Philosophy of Language, Mind, and the Arts, 2(1):177-194.
Boncompagni, A. 2013. Enactivism and the ‘Explanatory Trap’: A Wittgensteinian Perspective. Methode – Analytic Perspectives, 2, 27-49.
Brooks, R. 1991. Intelligence without Representation. Artificial Intelligence, 47, 139-159.
Bruineberg, J., Chemero, A., and Rietveld, E. 2019. General Ecological Information supports Engagement with Affordances for ‘Higher’ Cognition. Synthese, 196(12), 5231–5251.
Bruineberg, J., Kiverstein, J., and Rietveld, E. 2016. The Anticipating Brain is Not a Scientist: The Free-Energy Principle from an Ecological-Enactive Perspective. Synthese, 195(6), 2417-2444.
Bruineberg, J., and Rietveld, E. 2014. Self-organisation, Free Energy Minimization, and Optimal Grip on a Field of Affordances. Frontiers in Human Neuroscience 8(599), 1-14. doi.org/10.3389/fnhum.2014.00599.
Buckner, C. 2021. A Forward-Looking Theory of Content. Ergo. 8:37. 367-401.
Burnett, M. and Gallagher, S. 2020. 4E Cognition and the Spectrum of Aesthetic Experience. JOLMA – The Journal for the Philosophy of Language, Mind, and the Arts. 1: 2. 157–176.
Candiotto, L. 2022. Loving the Earth by Loving a Place: A Situated Approach to the Love of Nature, Constructivist Foundations, 17(3), 179–189.
Cappuccio, M. and Froese, T. 2014. Introduction. In Cappuccio, M. and Froese, T. (eds.), Enactive Cognition at the Edge of Sense-Making: Making Sense of Nonsense. Basingstoke: Palgrave Macmillan. pp. 1-33.
Chemero, A. 2009. Radical Embodied Cognitive Science. Cambridge, MA: MIT Press.
Clark, A. 2015. Predicting Peace: The End of the Representation Wars: Reply to Madary. In Open MIND: 7(R), ed. T. Metzinger and J. M. Windt. MIND Group. doi: 10.15502/9783958570979.
Clark, A. 2016. Surfing Uncertainty: Prediction, Action, and the Embodied Mind. New York: Oxford University Press.
Colombetti, G. 2014. The Feeling Body: Affective Science Meets the Enactive Mind. Cambridge, MA, MIT Press.
Colombetti, G. 2010. Enaction, Sense-Making and Emotion. In Stewart, J., Gapenne, O., and Di Paolo, E. (eds.). Enaction: Toward a New Paradigm for Cognitive Science, Cambridge MA: MIT Press, 145-164.
Colombetti, G. and Torrance, S. 2009. Emotion and Ethics: An Inter-(En)active Approach. Phenomenology and the Cognitive Sciences, 8 (4): 505-526.
Colombetti, G. and Thompson, E. 2008. The Feeling Body: Towards an Enactive Approach to Emotion. In W. F. Overton, U. Müller and J. L. Newman (eds.), Developmental Perspectives on Embodiment and Consciousness. Erlbaum. pp. 45-68.
Constant, A., Clark, A., and Friston, K. 2021. Representation Wars: Enacting an Armistice Through Active Inference. Frontiers in Psychology.
Constant, A., Clark, A., Kirchhoff, M., and Friston, K. 2022. Extended Active Inference: Constructing Predictive Cognition Beyond Skulls. Mind and Language, 37(3), 373-394.
Crippen, M. 2020. Enactive Pragmatism and Ecological Psychology. Frontiers in Psychology, 11. 203–204.
Cuffari, E.C., Di Paolo, E.A., and De Jaegher, H. 2015. From Participatory Sense-Making to Language: There and Back Again. Phenomenology and the Cognitive Sciences, 14 (4), 1089-1125.
de Carvalho, E., and Rolla, G. 2020. An Enactive‐Ecological Approach to Information and Uncertainty. Frontiers in Psychology, 11, 1–11.
de Haan, S. 2020. Enactive Psychiatry. Cambridge, UK: Cambridge University Press.
De Jaegher, H. 2021. Loving and Knowing: Reflections for an Engaged Epistemology. Phenomenology and the Cognitive Sciences 20, 847–870.
De Jaegher, H. 2015. How We Affect Each Other: Michel Henry’s ‘Pathos-With’ and the Enactive Approach to Intersubjectivity. Journal of Consciousness Studies, 22 (1-2), 112-132.
De Jaegher, H. 2013. Embodiment and Sense-Making in Autism. Frontiers in Integrative Neuroscience, 7, 15. doi:10.3389/fnint.2013.00015
De Jaegher, H. 2009. Social Understanding through Direct Perception? Yes, by Interacting. Consciousness and Cognition, 18 (2), 535-542.
De Jaegher, H. and Di Paolo, E.A. 2008. Making Sense in Participation: An Enactive Approach to Social Cognition. In Morganti, F. and others (eds.). Enacting Intersubjectivity. IOS Press.
De Jaegher, H. and Di Paolo, E. 2007. Participatory Sense-Making: An Enactive Approach to Social Cognition. Phenomenology and the Cognitive Sciences, 6(4), 485-507.
De Jaegher, H, Di Paolo, E.A, and Gallagher, S. 2010. Can Social Interaction Constitute Social Cognition? Trends in Cognitive Sciences, 14 (10), 441-447.
De Jaegher, H., and Froese, T. 2009. On the Role of Social Interaction in Individual Agency. Adaptive Behavior, 17(5), 444‐460.
De Jaegher, H., Pieper, B., Clénin, D., and Fuchs, T. 2017. Grasping Intersubjectivity: An Invitation to Embody Social Interaction Research. Phenomenology and the Cognitive Sciences, 16(3), 491–523.
De Jesus, P. 2018. Thinking through Enactive Agency: Sense‐Making, Bio‐semiosis and the Ontologies of Organismic Worlds. Phenomenology and the Cognitive Sciences, 17(5), 861–887.
De Jesus P. 2016. From Enactive Phenomenology to Biosemiotic Enactivism. Adaptive Behavior. 24(2):130-146.
Degenaar, J., and O’Regan, J. K. 2017. Sensorimotor Theory and Enactivism. Topoi, 36, 393–407.
Deguchi, S., Shimoshige, H., Tsudome, M., Mukai, S., Corkery, R.W., and Horikoshi, K. 2011. Microbial Growth at Hyperaccelerations up to 403,627 × g. PNAS, 108(19), 7997-8002.
Dewey, J. 1922. Human Nature and Conduct: An Introduction to Social Psychology, 1st edn. New York: Holt.
Di Paolo, E. A. 2021. Enactive Becoming. Phenomenology and the Cognitive Sciences, 20, 783–809.
Di Paolo, E. A. 2018. The Enactive Conception of Life. In A. Newen, De L. Bruin, and S. Gallagher (eds.). The Oxford Handbook of 4E Cognition (pp. 71–94). Oxford: Oxford University Press.
Di Paolo, E. A. 2009. Extended Life. Topoi, 28(1), 9–21.
Di Paolo, E. A. 2005. Autopoiesis, Adaptivity, Teleology, Agency. Phenomenology and the Cognitive Sciences, 4, 429–452.
Di Paolo, E., Buhrmann, T., and Barandiaran, X. E. 2017. Sensorimotor Life. Oxford: Oxford University Press.
Di Paolo, E. A., Cuffari, E. C., and De Jaegher, H. 2018. Linguistic Bodies. The Continuity Between Life and Language. Cambridge, MA: MIT Press.
Di Paolo, E.A. and De Jaegher, H. 2022. Enactive Ethics: Difference Becoming Participation. Topoi. 41, 241–256.
Di Paolo, E.A. and De Jaegher, H. 2012. The Interactive Brain Hypothesis. Frontiers in Human Neuroscience.
Di Paolo, E.A., Rohde, M. and De Jaegher, H. 2010. Horizons for the Enactive Mind: Values, Social Interaction, and Play. In Stewart, J., Gapenne, O., and Di Paolo, E.A. (eds). Enaction: Toward a New Paradigm for Cognitive Science. Cambridge, MA: MIT Press.
Di Paolo, E. A. and Thompson, E. 2014. The Enactive Approach. In L. Shapiro (Ed.), The Routledge Handbook of Embodied Cognition (pp. 68–78). London: Routledge.
Di Paolo, E., Thompson, E. and Beer, R. 2022. Laying Down a Forking Path: Tensions between Enaction and the Free Energy Principle. Philosophy and the Mind Sciences. 3.
Facchin, M. 2021. Is Radically Enactive Imagination Really Contentless? Phenomenology and the Cognitive Sciences. 21. 1089–1105.
Fingerhut, J. 2018. Enactive Aesthetics and Neuroaesthetics. Phenomenology and Mind, 14, 80–97.
Froese, T. and Di Paolo, E.A. 2011. The Enactive Approach: Theoretical Sketches from Cell to Society. Pragmatics and Cognition, 19 (1), 1-36.
Froese, T., and Di Paolo, E.A. 2009. Sociality and the Life‐Mind Continuity Thesis. Phenomenology and the Cognitive Sciences, 8(4), 439-463.
Froese, T., McGann, M., Bigge, W., Spiers, A., and Seth, A.K. 2012. The Enactive Torch: A New Tool for the Science of Perception. IEEE Transactions on Haptics, 5(4), 365-375.
Froese, T., Woodward, A. and Ikegami, T. 2013. Turing Instabilities in Biology, Culture, and Consciousness? On the Enactive Origins of Symbolic Material Culture. Adaptive Behavior, 21 (3), 199-214.
Froese, T., and Ziemke, T. 2009. Enactive Artificial Intelligence: Investigating the Systemic Organization of Life and Mind. Artificial Intelligence, 173(3–4), 466–500.
Fuchs, T. 2018. Ecology of the Brain: The Phenomenology and Biology of the Embodied Mind. New York: Oxford University Press.
Gallagher, S. 2022b. Surprise! Why Enactivism and Predictive Processing are Parting Ways: The Case of Improvisation. Possibility Studies and Society.
Gallagher, S. 2021. Performance/Art: The Venetian Lectures. Milan: Mimesis International Edizioni.
Gallagher, S. 2020a. Action and Interaction. Oxford: Oxford University Press.
Gallagher, S. 2020b. Enactivism, Causality, and Therapy. Philosophy, Psychiatry, and Psychology, 27 (1), 27-28.
Gallagher, S. 2018a. Educating the Right Stuff: Lessons in Enactivist Learning. Educational Theory. 68 (6): 625-641.
Gallagher, S. 2018b. Rethinking Nature: Phenomenology and a Non-Reductionist Cognitive Science. Australasian Philosophical Review. 2 (2): 125-137
Gallagher, S. 2017. Enactivist Interventions: Rethinking the Mind. Oxford: Oxford University Press.
Gallagher, S. 2014. Pragmatic Interventions into Enactive and Extended Conceptions of Cognition. Philosophical Issues, 24 (1), 110-126.
Gallagher, S. 2005. How the Body Shapes the Mind. New York: Oxford University Press.
Gallagher, S. and Bower, M. 2014. Making Enactivism Even More Embodied. Avant, 5 (2), 232-247.
Gallagher, S. and Hutto, D. 2008. Understanding Others through Primary Interaction and Narrative Practice. In Zlatev, J., Racine, T., Sinha, C. and Itkonen, E. (eds). The Shared Mind: Perspectives on Intersubjectivity. John Benjamins. 17-38.
Gallagher, S., Hutto, D. and Hipólito, I. 2022. Predictive Processing and Some Disillusions about Illusions. Review of Philosophy and Psychology. 13, 999–1017.
Gallagher, S. and Lindgren, R. 2015. Enactive Metaphors: Learning through Full-Body Engagement. Educational Psychological Review. 27: 391–404.
Gallagher, S. and Miyahara, K. 2012. Neo-Pragmatism and Enactive Intentionality. In: Schulkin, J. (eds) Action, Perception and the Brain. New Directions in Philosophy and Cognitive Science. Palgrave Macmillan, London.
Garofoli, D. 2019. Embodied Cognition and the Archaeology of Mind: A Radical Reassessment. In Prentiss, A. M. (ed). Handbook of Evolutionary Research in Archaeology. Springer. 379-405.
Garofoli, D. 2018. RECkoning with Representational Apriorism in Evolutionary Cognitive Archaeology. Phenomenology and the Cognitive Sciences.17, 973–995.
Garofoli, D. 2015. A Radical Embodied Approach to Lower Palaeolithic Spear-making. The Journal of Mind and Behavior. 1-25.
Garofoli, D. and Iliopoulos, A. 2018. Replacing Epiphenomenalism: A Pluralistic Enactive Take on the Metaplasticity of Early Body Ornamentation. Philosophy and Technology, 32, 215–242.
Gärtner, K. and Clowes, R. 2017. Enactivism, Radical Enactivism and Predictive Processing: What is Radical in Cognitive Science? Kairos. Journal of Philosophy and Science, 18(1). 54-83.
Gibson, J.J. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
Godfrey‐Smith, P. 2001. Three Kinds of Adaptationism. In Hecht Orzack, S. H. (ed). Adaptationism and Optimality (pp. 335–357). Cambridge: Cambridge University Press.
Gould, S. J., and Lewontin, R.C. 1979. The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptationist Programme. Proceedings of the Royal Society of London—Biological Sciences, 205(1161), 581–598.
Heft, H. 2020. Ecological Psychology and Enaction Theory: Divergent Groundings. Frontiers in Psychology.
Heras-Escribano, M. 2021. Pragmatism, Enactivism, and Ecological Psychology: Towards a Unified Approach to Post-Cognitivism. Synthese, 198 (1), 337-363.
Heras-Escribano, M. 2019. The Philosophy of Affordances. Basingstoke: Palgrave Macmillan.
Heras-Escribano, M. 2016. Embracing the Environment: Ecological Answers for Enactive Problems. Constructivist Foundations, 11 (2), 309-312.
Heras-Escribano, M., Noble, J., and De Pinedo, M. 2015. Enactivism, Action and Normativity: a Wittgensteinian Analysis. Adaptive Behavior, 23 (1), 20-33.
Hesp, C., Ramstead, M., Constant, A., Badcock, P., Kirchhoff, M., Friston, K. 2019. A Multi-scale View of the Emergent Complexity of Life: A Free-Energy Proposal. In: Georgiev, G., Smart, J., Flores Martinez, C., Price, M. (eds) Evolution, Development and Complexity. Springer Proceedings in Complexity. Springer, Cham.
Hipólito, I., Hutto, D.D., and Chown, N. 2020. Understanding Autistic Individuals: Cognitive Diversity not Theoretical Deficit. In Rosqvist, H., Chown, N., and Stenning, A. (eds). Neurodiversity Studies: A New Critical Paradigm, 193-209.
Høffding, S. and Schiavio, A. 2021. Exploratory Expertise and the Dual Intentionality of Music-Making. Phenomenology and the Cognitive Sciences, 20 (5), 811-829.
Hurley, S. L. 1998. Consciousness in Action. Cambridge, MA: Harvard University Press.
Hurley, S. and Noë, A. 2003. Neural Plasticity and Consciousness. Biology and Philosophy 18, 131–168.
Hutto, D. D. 2020. From Radical Enactivism to Folk Philosophy. The Philosophers’ Magazine. 88. 75-82.
Hutto, D.D. 2019. Re-doing the Math: Making Enactivism Add Up. Philosophical Studies. 176. 827–837.
Hutto, D.D. 2018. Getting into Predictive Processing’s Great Guessing Game: Bootstrap Heaven or Hell? Synthese, 195, 2445–2458.
Hutto, D.D. 2015a. Overly Enactive Imagination? Radically Re-imagining Imagining. The Southern Journal of Philosophy. 53. 68–89.
Hutto, D.D. 2015b. Contentless Perceiving: The Very Idea. In Wittgenstein and Perception. O’Sullivan, M. and Campbell, M. (eds). London: Routledge. 64-84.
Hutto, D.D. 2013a. Radically Enactive Cognition in our Grasp. In The Hand – An Organ of the Mind. Radman, Z. (ed). Cambridge, MA: MIT Press. 227-258.
Hutto, D.D. 2013b. Enactivism from a Wittgensteinian Point of View. American Philosophical Quarterly. 50(3). 281-302.
Hutto, D.D. 2013c. Psychology Unified: From Folk Psychology to Radical Enactivism. Review of General Psychology. 17(2). 174-178.
Hutto, D.D. 2011a. Philosophy of Mind’s New Lease on Life: Autopoietic Enactivism meets Teleosemiotics. Journal of Consciousness Studies. 18:5-6. 44-64.
Hutto, D.D. 2011c. Enactivism: Why be Radical? In Sehen und Handeln. Bredekamp, H. and Krois, J. M. (eds). Berlin: Akademie Verlag. 21-44.
Hutto, D.D. 2008. Folk Psychological Narratives: The Socio-Cultural Basis of Understanding Reasons. Cambridge, MA: The MIT Press.
Hutto, D.D. 2006. Unprincipled Engagement: Emotional Experience, Expression and Response. In Menary, R. (ed.), Radical Enactivism: Intentionality, Phenomenology and Narrative: Focus on the Philosophy of Daniel D. Hutto. Amsterdam: John Benjamins.
Hutto, D.D. 2005. Knowing What? Radical versus Conservative Enactivism. Phenomenology and the Cognitive Sciences. 4(4). 389-405.
Hutto, D.D. 2000. Beyond Physicalism. Philadelphia/Amsterdam: John Benjamins.
Hutto, D.D. 1999. The Presence of Mind. Philadelphia/Amsterdam: John Benjamins.
Hutto, D.D. and Abrahamson, D. 2022. Embodied, Enactive Education: Conservative versus Radical Approaches. In Movement Matters: How Embodied Cognition Informs Teaching and Learning. Macrine and Fugate (eds). Cambridge, MA: MIT Press.
Hutto, D., Gallagher, S., Ilundáin-Agurruza, J., and Hipólito, I. 2020. Culture in Mind – An Enactivist Account: Not Cognitive Penetration but Cultural Permeation. In Kirmayer, L. J., Kitayama, S., Worthman, C.M., Lemelson, R., and Cummings, C.A. (Eds.), Culture, Mind, and Brain: Emerging Concepts, Models, Applications. New York, NY: Cambridge University Press. pp. 163–187.
Hutto, D.D. and Jurgens, A. 2019. Exploring Enactive Empathy: Actively Responding to and Understanding Others. In Matravers, D. and Waldow, A. (eds). Philosophical Perspectives on Empathy: Theoretical Approaches and Emerging Challenges. London: Routledge. pp. 111-128.
Hutto, D.D. and Kirchhoff, M. 2015. Looking Beyond the Brain: Social Neuroscience meets Narrative Practice. Cognitive Systems Research, 34, 5-17.
Hutto, D.D., Kirchhoff, M.D., and Abrahamson, D. 2015. The Enactive Roots of STEM: Rethinking Educational Design in Mathematics. Educational Psychology Review, 27(3), 371-389.
Hutto, D.D., Kirchhoff, M. and Myin, E. 2014. Extensive Enactivism: Why Keep it All In? Frontiers in Human Neuroscience. doi: 10.3389/fnhum.2014.00706.
Hutto, D.D. and Myin, E. 2021. Re-affirming Experience, Presence, and the World: Setting the RECord Straight in Reply to Noë. Phenomenology and the Cognitive Sciences, 20, 971-989.
Hutto, D.D. and Myin, E. 2018a. Much Ado about Nothing? Why Going Non-semantic is Not Merely Semantics. Philosophical Explorations. 21(2). 187–203.
Hutto, D.D. and Myin, E. 2018b. Going Radical. In The Oxford Handbook of 4E Cognition. Newen, A. Gallagher, S. and de Bruin, L. (eds). Oxford: Oxford University Press. pp. 95-116.
Hutto, D.D. and Myin, E. 2017. Evolving Enactivism: Basic Minds meet Content. Cambridge, MA: The MIT Press.
Hutto, D.D. and Myin, E. 2013. Radicalizing Enactivism: Basic Minds without Content. Cambridge, MA: The MIT Press.
Hutto, D. D., Myin, E., Peeters, A, and Zahnoun, F. 2019. The Cognitive Basis of Computation: Putting Computation In its Place. In Colombo, M. and Sprevak, M. (eds). The Routledge Handbook of The Computational Mind. London: Routledge. 272-282.
Hutto, D.D. and Peeters, A. 2018. The Roots of Remembering. Extensively Enactive RECollection. In New Directions in the Philosophy of Memory. Michaelian, K. Debus, D. Perrin, D. (eds). London: Routledge. pp. 97-118.
Hutto, D.D. and Robertson, I. 2020. Clarifying the Character of Habits: Understanding What and How They Explain. In Habits: Pragmatist Approaches from Cognitive Science, Neuroscience, and Social Theory. Caruana, F. and Testa, I. (eds). Cambridge: Cambridge University Press. pp. 204-222.
Hutto, D.D., Robertson, I and Kirchhoff, M. 2018. A New, Better BET: Rescuing and Revising Basic Emotion Theory. Frontiers in Psychology, 9, 1217.
Hutto, D.D., Röhricht, F., Geuter, U., and S. Gallagher. 2014. Embodied Cognition and Body Psychotherapy: The Construction of New Therapeutic Environments. Sensoria: A Journal of Mind, Brain and Culture. 10(1).
Hutto, D.D. and Sánchez-García, R. 2015. Choking RECtified: Enactive Expertise Beyond Dreyfus. Phenomenology and the Cognitive Sciences. 14:2. 309-331.
Hutto, D.D., and Satne, G. 2018a. Naturalism in the Goldilocks Zone: Wittgenstein’s Delicate Balancing Act. In Raleigh, T. and Cahill, K. (eds). Wittgenstein and Naturalism. London: Routledge. 56-76.
Hutto, D.D. and Satne, G. 2018b. Wittgenstein’s Inspiring View of Nature: On Connecting Philosophy and Science Aright. Philosophical Investigations. 41:2. 141-160.
Hutto, D.D., and Satne, G. 2017a. Continuity Scepticism in Doubt: A Radically Enactive Take. In Embodiment, Enaction, and Culture. Durt, C, Fuchs, T and Tewes, C (eds). Cambridge, MA. MIT Press. 107-126.
Hutto, D.D. and Satne, G. 2017b. Davidson Demystified: Radical Interpretation meets Radical Enactivism. Argumenta. 3:1. 127-144.
Hutto. D.D. and Satne, G. 2015. The Natural Origins of Content. Philosophia. 43. 521–536.
Ihde, D., and Malafouris, L. 2019. Homo Faber Revisited: Postphenomenology and Material Engagement Theory. Philosophy and Technology, 32(2), 195–214.
Janz, B. 2022. African Philosophy and Enactivist Cognition: The Space of Thought. Imprint Bloomsbury Academic.
Jonas, H. 1966. The Phenomenon of Life. Evanston: Northwestern University Press.
Juarrero, A. 1999. Dynamics in Action: Intentional Behavior as a Complex System. Cambridge: The MIT Press.
Jurgens, A., Chown, N, Stenning, A. and Bertilsdotter-Rosqvist, H. 2020. Neurodiversity in a Neurotypical World: An Enactive Framework for Investigating Autism and Social Institutions. In Rosqvist, H., Chown, N., and Stenning, A. (eds). Neurodiversity Studies: A New Critical Paradigm, 73-88.
Jurgens, A. 2021. Re-conceptualizing the Role of Stimuli: An Enactive, Ecological Explanation of Spontaneous-Response Tasks. Phenomenology and the Cognitive Sciences, 20 (5), 915-934.
Kabat-Zinn, J. 2016. Foreword to the Revised Edition. In Varela, F. J., Thompson, E., and Rosch, E. The Embodied Mind: Cognitive Science and Human Experience. Revised Edition (6th ed.). Cambridge, MA: MIT Press.
Kee, H. 2018. Phenomenology and Naturalism in Autopoietic and Radical Enactivism: Exploring Sense-Making and Continuity from the Top Down. Synthese. pp. 2323–2343.
Kirchhoff, M. 2018a. Autopoiesis, Free Energy, and the Life–Mind Continuity Thesis, Synthese, 195 (6), 2519-2540.
Kirchhoff, M. 2018b. The Body in Action: Predictive Processing and the Embodiment Thesis. In The Oxford Handbook of 4E Cognition. Oxford: Oxford University Press. pp. 243-260.
Kirchhoff, M. 2015. Species of Realization and the Free Energy Principle. Australasian Journal of Philosophy. 93 (4), 706-723.
Kirchhoff, M and Froese, T. 2017. Where There is Life, There is Mind: In Support of a Strong Life-Mind Continuity Thesis. Entropy, 19 (4), 169.
Kirchhoff, M. and Hutto, D.D. 2016. Never Mind the Gap: Neurophenomenology, Radical Enactivism and the Hard Problem of Consciousness. Constructivist Foundations. 11 (2): 302–30.
Kirchhoff, M and Meyer, R. 2019. Breaking Explanatory Boundaries: Flexible Borders and Plastic Minds. Phenomenology and the Cognitive Sciences, 18 (1), 185-204.
Kirchhoff, M., Parr, T., Palacios, E., Friston, K., and Kiverstein, J. 2018. The Markov Blankets of Life: Autonomy, Active Inference and the Free Energy Principle. Journal of The Royal Society Interface, 15 (138), 20170792.
Kirchhoff, M. D., and Robertson, I. 2018. Enactivism and Predictive Processing: A Non‐Representational View. Philosophical Explorations, 21(2), 264–281.
Kiverstein, J. D., and Rietveld, E. 2018. Reconceiving Representation‐Hungry Cognition: An Ecological‐Enactive Proposal. Adaptive Behavior, 26(4), 147–163.
Kiverstein, J. and Rietveld, E. 2015. The Primacy of Skilled Intentionality: On Hutto and Satne’s The Natural Origins of Content. Philosophia, 43 (3). 701–721.
Lai, K.L. 2022. Models of Knowledge in the Zhuangzi: Knowing with Chisels and Sticks. In Lai, K.L. (ed). Knowers and Knowledge in East-West Philosophy: Epistemology Extended. Basingstoke: Palgrave Macmillan, 319-344.
Laland, K. N., Matthews, B., and Feldman, M.W. 2016. An Introduction to Niche Construction Theory. Evolutionary Ecology, 30(2), 191–202.
Laland, K., Uller, T., Feldman, M., Sterelny, K., Müller, G., Moczek, A., Jablonka, E., and Odling‐Smee, J. 2014. Does Evolutionary Theory Need a Rethink? Yes, Urgently. Nature, 514, 161–164.
Langland-Hassan, P. 2021. Why Pretense Poses a Problem for 4E Cognition (and how to Move Forward). Phenomenology and the Cognitive Sciences. 21. 1003 – 1021.
Langland-Hassan, P. 2022. Secret Charades: Reply to Hutto. Phenomenology and the Cognitive Sciences. 21. 1183 – 1187.
Lee, J. 2019. Structural Representation and the Two Problems of Content. Mind and Language. 34: 5. 606-626.
Legg, C. 2021. Discursive Habits: A Representationalist Re-Reading of Teleosemiotics. Synthese, 199(5), 14751-14768.
Lewontin, R. 2000. The Triple Helix: Gene, Organism, and Environment. Cambridge, MA: Harvard University Press.
Lewontin, R. and Levins, R. 1997. Organism and Environment. Capitalism Nature Socialism. 8: 2. 95-98.
Loughlin, V. 2014. Radical Enactivism, Wittgenstein and the Cognitive Gap. Adaptive Behavior. 22 (5): 350-359.
Loughlin, V. 2021a. 4E Cognitive Science and Wittgenstein. Basingstoke: Palgrave Macmillan
Loughlin, V. 2021b. Why Enactivists Should Care about Wittgenstein. Philosophia 49(11–12).
Loughlin, V. 2021c. Wittgenstein’s Challenge to Enactivism. Synthese, 198 (Suppl 1), 391–404.
Maiese, M. 2022a. Autonomy, Enactivism, and Mental Disorder: A Philosophical Account. London: Routledge.
Maiese, M. 2022b. White Supremacy as an Affective Milieu. Topoi, 41 (5): 905-915.
Maiese, M. 2022c. Mindshaping, Enactivism, and Ideological Oppression. Topoi, 41 (2): 341-354.
Maiese, M. 2022d. Neoliberalism and Mental Health Education. Journal of Philosophy of Education. 56 (1): 67-77.
Malafouris, L. 2013. How Things Shape the Mind: A Theory of Material Engagement. Cambridge, MA: MIT Press.
Mann, S. and Pain, R. 2022. Teleosemantics and the Hard Problem of Content, Philosophical Psychology, 35:1, 22-46.
Maturana, H. R., and Varela, F. J. 1980. Autopoiesis and Cognition: The Realization of the Living. Boston: D. Reidel.
Maturana, H., and Varela, F. 1987. The Tree of Knowledge: The Biological Roots of Human Understanding. New Science Library/Shambhala Publications.
Maturana, H., and Mpodozis, J. 2000. The Origin of Species by Means of Natural Drift. Revista Chilena De Historia Natural, 73(2), 261–310.
McGann, M. 2022. Connecting with the Subject of our Science: Course-of-Experience Research Supports Valid Theory Building in Cognitive Science. Adaptive Behavior. doi:10.1177/10597123221094360.
McGann, M. 2021. Enactive and Ecological Dynamics Approaches: Complementarity and Differences for Interventions in Physical Education Lessons. Physical Education and Sport Pedagogy, 27(3), 1-14.
McGann, M. 2007. Enactive Theorists Do it on Purpose: Toward An Enactive Account of Goals and Goal-directedness. Phenomenology and the Cognitive Sciences, 6, 463–483.
McGann, M, De Jaegher, H. and Di Paolo, E.A. 2013. Enaction and Psychology. Review of General Psychology, 17 (2), 203-209.
McGann, M., Di Paolo, E.A., Heras-Escribano, M., and Chemero, A. 2020. Enaction and Ecological Psychology: Convergences and Complementarities. Frontiers in Psychology, 11, 1982.
McKinney, J. 2020. Ecological-Enactivism Through the Lens of Japanese Philosophy. Frontiers Psychology. 11.
Medina, J. 2013. An Enactivist Approach to the Imagination: Embodied Enactments and ‘Fictional Emotions’. American Philosophical Quarterly 50.3: 317–335.
Merleau-Ponty, M. 1963. The Structure of Behavior. Pittsburgh: Duquesne University Press.
Meyer, R. 2020a. The Nonmechanistic Option: Defending Dynamical Explanations. The British Journal for the Philosophy of Science. 71 (3):959-985
Meyer, R. 2020b. Dynamical Causes. Biology and Philosophy, 35 (5), 1-21.
Meyer, R. and Brancazio, N. 2022. Putting Down the Revolt: Enactivism as a Philosophy of Nature.
Michaelian, K. and Sant’Anna, A. 2021. Memory without Content? Radical Enactivism and (Post)causal theories of Memory. Synthese, 198 (Suppl 1), 307–335.
Miłkowski, M. 2015. The Hard Problem of Content: Solved (Long Ago). Studies in Logic, 41(1): 73-88.
Miyahara, K. and Segundo-Ortin, M. 2022. Situated Self-Awareness in Expert Performance: A Situated Normativity Account of Riken no Ken. Synthese, 200, 192. https://doi.org/10.1007/s11229-022-03688-w
Moyal-Sharrock, D. 2016. The Animal in Epistemology: Wittgenstein’s Enactivist Solution to the Problem of Regress. International Journal for the Study of Skepticism, 6 (2-3), 97-119.
Moyal-Sharrock, D. 2021a. Certainty In Action: Wittgenstein on Language, Mind and Epistemology. London: Bloomsbury.
Moyal-Sharrock, D. 2021b. From Deed to Word: Gapless and Kink-free Enactivism. Synthese, 198 (Suppl 1), 405–425.
Murphy, M. 2019. Enacting Lecoq: Movement in Theatre, Cognition, and Life. Basingstoke: Palgrave Macmillan.
Myin, E. 2020. On the Importance of Correctly Locating Content: Why and How REC Can Afford Affordance Perception. Synthese, 198 (Suppl 1), 25-39.
Myin, E., and O’Regan, K. J. 2002. Perceptual Consciousness, Access to Modality, and Skill Theories: A Way to Naturalize Phenomenology? Journal of Consciousness Studies, 9, 27–46.
Myin, E and Van den Herik, J.C. 2020. A Twofold Tale of One Mind: Revisiting REC’s Multi-Storey Story. Synthese,198 (12): 12175-12193.
Netland, T. 2022. The Lived, Living, and Behavioral Sense of Perception. Phenomenology and the Cognitive Sciences. https://doi.org/10.1007/s11097-022-09858-y
Noë, A. 2021. The Enactive Approach: A Briefer Statement, with Some Remarks on ‘Radical Enactivism’. Phenomenology and the Cognitive Sciences, 20, 957–970.
Noë, A. 2015. Strange Tools: Art and Human Nature. New York: Hill and Wang.
Noë, A. 2012. Varieties of Presence. Cambridge, MA: Harvard University Press.
Noë, A. 2009. Out of Our Heads: Why You Are not Your Brain and Other Lessons from the Biology of Consciousness. New York: Hill and Wang.
Noë, A. 2004. Action in Perception. Cambridge, MA: MIT Press.
Øberg, G. K., Normann, B. and S Gallagher. 2015. Embodied-Enactive Clinical Reasoning in Physical Therapy. Physiotherapy Theory and Practice, 31 (4), 244-252.
O’Regan, J. K. 2011. Why Red Doesn’t Sound Like a Bell: Understanding the Feel of Consciousness. Oxford: Oxford University Press.
O’Regan, J. K., and Noë, A. 2001. A Sensorimotor Account of Vision and Visual Consciousness. Behavioral and Brain Sciences, 24, 883–917.
O’Regan, J. K., Myin, E., and Noë, A. 2005. Skill, Corporality and Alerting Capacity in an Account of Sensory Consciousness. Progress in Brain Research, 150, 55–68.
Paolucci, C. 2021. Cognitive Semiotics. Integrating Signs, Minds, Meaning and Cognition, Cham Switzerland: Springer.
Paolucci, C. 2020. A Radical Enactivist Approach to Social Cognition. In: Pennisi, A., Falzone, A. (eds). The Extended Theory of Cognitive Creativity. Perspectives in Pragmatics, Philosophy and Psychology. Cham: Springer.
Piccinini, G. 2022. Situated Neural Representations: Solving the Problems of Content. Frontiers in Neurorobotics, 16.
Piccinini, G. 2020. Neurocognitive Mechanisms: Explaining Biological Cognition. New York: Oxford University Press.
Piccinini, G. 2015. Physical Computation: A Mechanistic Account. New York: Oxford University Press.
Piccinini. G. 2008. Computation without Representation. Philosophical Studies. 137 (2), 205-241.
Ramírez-Vizcaya, S., and Froese, T. 2019. The Enactive Approach to Habits: New Concepts for the Cognitive Science of Bad Habits and Addiction. Frontiers in Psychology, 10, 301.
Ramstead, M.J.D., Kirchhoff, M., and Friston, K. 2020a. A Tale of Two Densities: Active Inference is Enactive Inference. Adaptive Behavior, 28(4), 225-239.
Ramstead, M.J.D., Friston, K., and Hipólito, I. 2020b. Is the Free-Energy Principle a Formal Theory of Semantics? From Variational Density Dynamics to Neural and Phenotypic Representations. Entropy, 22(8), 889.
Reid, D. 2014. The Coherence of Enactivism and Mathematics Education Research: A Case Study. Avant, 5 (2), 137-172.
Rietveld, E., Denys, D. and Van Westen, M. 2018. Ecological-Enactive Cognition as Engaging with a Field of Relevant Affordances: The Skilled Intentionality Framework (SIF). In A. Newen, L. L. de Bruin and S. Gallagher (Eds.). Oxford Handbook of 4E Cognition. Oxford: Oxford University Press, 41-70.
Rietveld, E. and Kiverstein, J. 2014. A Rich Landscape of Affordances. Ecological Psychology, 26(4), 325-352.
Robertson, I. and Hutto, D. D. 2023. Against Intellectualism about Skill. Synthese, 201(4), 143.
Robertson, I. and Kirchhoff, M. D. 2019. Anticipatory Action: Active Inference in Embodied Cognitive Activity. Journal of Consciousness Studies, 27(3-4), 38-68.
Roelofs, L. 2018. Why Imagining Requires Content: A Reply to A Reply to an Objection to Radical Enactive Cognition. Thought: A Journal of Philosophy. 7 (4):246-254.
Rolla, G. 2021. Reconceiving Rationality: Situating Rationality into Radically Enactive Cognition. Synthese, 198(Suppl 1), pp. 571–590.
Rolla, G. 2018. Radical Enactivism and Self-Knowledge. Kriterion, 141, pp. 732-743.
Rolla, G. and Figueiredo, N. 2021. Bringing Forth a World, Literally. Phenomenology and the Cognitive Sciences.
Rolla, G. and Huffermann, J. 2021. Converging Enactivisms: Radical Enactivism meets Linguistic Bodies. Adaptive Behavior. 30(4). 345-359.
Rolla, G., and Novaes, F. 2022. Ecological-Enactive Scientific Cognition: Modeling and Material Engagement. Phenomenology and the Cognitive Sciences, 21, pp. 625–643.
Rolla, G., Vasconcelos, G., and Figueiredo, N. 2022. Virtual Reality, Embodiment, and Allusion: An Ecological-Enactive Approach. Philosophy and Technology. 35: 95.
Rosch, E. 2016. Introduction to the Revised Edition. In The Embodied Mind: Cognitive Science and Human Experience. Revised Edition (6th ed.). Cambridge, MA: MIT Press.
Rucińska, Z. 2019. Social and Enactive Perspectives on Pretending. Avant. 10:3. 1-27.
Rucińska, Z. 2016. What Guides Pretence? Towards the Interactive and the Narrative Approaches. Phenomenology and the Cognitive Sciences. 15: 117–133.
Ryan Jr, K.J. and S. Gallagher. 2020. Between Ecological Psychology and Enactivism: Is There Resonance? Frontiers in Psychology, 11, 1147.
Salis, P. 2022. The Given and the Hard problem of Content. Phenomenology and the Cognitive Sciences.
Sato, M. and McKinney, J. 2022. The Enactive and Interactive Dimensions of AI: Ingenuity and Imagination Through the Lens of Art and Music. Artificial Life. 28 (3): 310–321.
Schiavio, A. and De Jaegher, H. 2017. Participatory Sense-Making in Joint Musical Practice. In Lesaffre, M., Maes, P-J., and Leman, M. (eds). The Routledge Companion to Embodied Music Interaction. London: Routledge, 31-39.
Schlicht, T. and Starzak, T. 2019. Prospects of Enactivist Approaches to Intentionality and Cognition. Synthese, 198 (Suppl 1): 89-113.
Searle, J. 1992. The Rediscovery of the Mind. Cambridge: The MIT Press.
Segundo-Ortin, M. 2020. Agency From a Radical Embodied Standpoint: An Ecological-Enactive Proposal. Frontiers in Psychology.
Segundo-Ortin, M., Heras-Escribano, M., and Raja, V. 2019. Ecological Psychology is Radical Enough. A Reply to Radical Enactivists. Philosophical Psychology, 32 (7), 1001-1023.
Segundo-Ortin, M. and Hutto, D.D. 2021. Similarity-based Cognition: Radical Enactivism meets Cognitive Neuroscience. Synthese, 198 (1), 5–23.
Seifert, L., Davids, K., Hauw, D., and McGann, M. 2020. Editorial: Radical Embodied Cognitive Science of Human Behavior: Skill Acquisition, Expertise and Talent Development. Frontiers in Psychology, 11.
Sharma, G. and Curtis, P.D. 2022. The Impacts of Microgravity on Bacterial Metabolism. Life (Basel). 12(6): 774.
Smith, L.B. and Thelen, E. 1994. A Dynamic Systems Approach to the Development of Cognition and Action. Cambridge, MA: MIT Press.
Stapleton, M. 2022. Enacting Environments: From Umwelts to Institutions. In Lai, K.L. (ed). Knowers and Knowledge in East-West Philosophy: Epistemology Extended. Basingstoke: Palgrave Macmillan. 159-190.
Stapleton, M. and Froese, T. 2016. The Enactive Philosophy of Embodiment: From Biological Foundations of Agency to the Phenomenology of Subjectivity. In M. García-Valdecasas, J. I. Murillo, and N. F. Barrett (Eds.), Biology and Subjectivity: Philosophical Contributions to non-Reductive Neuroscience. (pp. 113–129). Cham: Springer.
Stewart, J., Gapenne, O., and Di Paolo, E. A. (Eds.). 2010. Enaction: Toward a New Paradigm for Cognitive Science. Cambridge: The MIT Press.
Thompson, E. 2021. Buddhist Philosophy and Scientific Naturalism. Sophia, 1-16.
Thompson, E. 2017. Enaction without Hagiography. Constructivist Foundations, 13(1), 41–44.
Thompson, E. 2016. Introduction to the Revised Edition. In Varela, F. J., Thompson, E., and Rosch, E. The Embodied Mind: Cognitive Science and Human Experience. Revised Edition (6th ed.). Cambridge, MA: MIT Press.
Thompson, E. 2011a. Living Ways of Sense-Making, Philosophy Today: SPEP Supplement. 114-123.
Thompson, E. 2011b. Précis of Mind in Life. Journal of Consciousness Studies.18. 10-22.
Thompson, E. 2011c. Reply to Commentaries. Journal of Consciousness Studies.18. 176-223.
Thompson, E. 2007. Mind in Life: Biology, Phenomenology, and the Sciences of Mind. Cambridge, MA: Harvard University Press.
Thompson, E. 2005. Sensorimotor Subjectivity and the Enactive Approach to Experience. Phenomenology and the Cognitive Sciences, 4: 407-427.
Thompson, E. and Stapleton, M. 2009. Making Sense of Sense-Making: Reflections on Enactive and Extended Mind Theories. Topoi, 28: 23-30.
Turner, J. S. 2000. The Extended Organism: The Physiology of Animal-Built Structures. Cambridge, MA: Harvard University Press.
Van Dijk, L., Withagen, R.G., and Bongers, R.M. 2015. Information without Content: A Gibsonian Reply to Enactivists’ Worries. Cognition, 134, 210-214.
Varela, F. J. 1999a. Ethical Know-How: Action, Wisdom, and Cognition. Stanford, CA: Stanford University Press.
Varela, F.J. 1999b. The Specious Present: A Neurophenomenology of Time Consciousness. In J. Petitot, F. J. Varela, and B. R. M. Pachoud (Eds.), Naturalizing Phenomenology (pp. 266–314). Stanford: Stanford University Press.
Varela, F. J. 1996. Neurophenomenology: A Methodological Remedy for the Hard Problem. Journal of Consciousness Studies, 3(4), 330–349.
Varela, F. J. 1991. Organism: A Meshwork of Selfless Selves. In: Tauber A. I. (ed.), Organism and the Origins of Self. Dordrecht: Kluwer, 79–107.
Varela, F.J. 1984. Living Ways of Sense-Making: A Middle Path for Neuroscience. In: P. Livingstone (Ed.), Order and Disorder: Proceedings of the Stanford International Symposium, Anma Libri, Stanford, pp.208-224.
Varela, F. J. 1979. Principles of Biological Autonomy. New York: Elsevier.
Varela, F. J., Thompson, E., and Rosch, E. 1991. The Embodied Mind: Cognitive Science and Human Experience. Cambridge: MIT Press.
Venter, E. 2021. Toward an Embodied, Embedded Predictive Processing Account. Frontiers in Psychology. doi: 10.3389/fpsyg.2021.543076
Venturinha, N. 2016. Moral Epistemology, Interpersonal Indeterminacy and Enactivism. In Gálvez, J.P. (ed). Action, Decision-Making and Forms of Life. Berlin, Boston: De Gruyter, pp. 109-120.
Villalobos, M. 2020. Living Beings as Autopoietic Bodies. Adaptive Behavior, 28 (1), 51-58.
Villalobos, M. 2013. Enactive Cognitive Science: Revisionism or Revolution? Adaptive Behavior, 21 (3), 159-167.
Villalobos, M., and Dewhurst, J. 2017. Why Post‐cognitivism Does Not (Necessarily) Entail Anti‐Computationalism. Adaptive Behavior, 25(3), 117–128.
Villalobos, M. and Palacios, S. 2021. Autopoietic Theory, Enactivism, and Their Incommensurable Marks of the Cognitive. Synthese, 198 (Suppl 1), 71–87.
Villalobos, M. and Razeto-Barry, P. 2020. Are Living Beings Extended Autopoietic Systems? An Embodied Reply. Adaptive Behavior, 28 (1), 3-13.
Villalobos, M. and Silverman, D. 2018. Extended Functionalism, Radical Enactivism, and the Autopoietic Theory of Cognition: Prospects for a Full Revolution in Cognitive Science. Phenomenology and the Cognitive Sciences, 17 (4), 719-739.
Villalobos, M., and Ward, D. 2016. Lived Experience and Cognitive Science: Reappraising Enactivism’s Jonasian Turn. Constructivist Foundations, 11, 204–233
Villalobos, M. and Ward, D. 2015. Living Systems: Autopoiesis, Autonomy and Enaction. Philosophy and Technology, 28 (2), 225-239.
Vörös, S., Froese, T., and Riegler, A. 2016. Epistemological Odyssey: Introduction to Special Issue on the Diversity of Enactivism and Neurophenomenology. Constructivist Foundations, 11(2), 189–203.
Ward, D., Silverman, D., and Villalobos, M. 2017. Introduction: The Varieties of Enactivism. Topoi, 36(3), 365–375.
Weichold, M. and Rucińska, Z. 2021. Pretense as Alternative Sense-making: A Praxeological Enactivist Account. Phenomenology and the Cognitive Sciences. 21. 1131–1156.
Weichold, M. and Rucińska, Z. 2022 Praxeological Enactivism vs. Radical Enactivism: Reply to Hutto. Phenomenology and the Cognitive Sciences. 21. 1177-1182.
Werner, K. 2020. Enactment and Construction of the Cognitive Niche: Toward an Ontology of the Mind‐World Connection. Synthese, 197(3), 1313–1341.
Zahidi, K., and Myin, E. 2016. Radically Enactive Numerical Cognition. In G. Etzelmüller and C. Tewes (Eds.), Embodiment in Evolution and Culture (pp. 57–72). Tübingen, Germany: Mohr Siebeck.
Zahnoun F. 2021a. The Socio-Normative Nature of Representation. Adaptive Behavior. 29(4): 417-429.
Zahnoun, F. 2021b. Some Inaccuracies about Accuracy Conditions. Phenomenology and the Cognitive Sciences.
Zahnoun, F. 2021c. On Representation Hungry Cognition (and Why We Should Stop Feeding It). Synthese, 198 (Suppl 1), 267–284.
Zahnoun, F. 2020. Truth or Accuracy? Theoria 86 (5):643-650.
Zarco M., Egbert M. D. 2019. Different Forms of Random Motor Activity Scaffold the Formation of Different Habits in a Simulated Robot. In Fellermann H., Bacardit J., Goni-Moreno A., Fuchslin M. (Eds.) The 2019 Conference on Artificial Life. No. 31, 582-589
Zipoli Caiani, S. 2022. Intelligence Involves Intensionality: An Explanatory Issue for Radical Enactivism (Again). Synthese. 200: 132.
Author Information
Daniel D. Hutto
Email: ddhutto@uow.edu.au
University of Wollongong
Australia
Hunhu/Ubuntu in the Traditional Thought of Southern Africa
The term Ubuntu/Botho/Hunhu is a Zulu/Xhosa/Ndebele/Sesotho/Shona word referring to the moral attribute of a person, who is known in the Bantu languages as Munhu (among the Shona of Zimbabwe), Umuntu (among the Ndebele of Zimbabwe and the Zulu/Xhosa of South Africa), Muthu (among the Tswana of Botswana) and Omundu (among the Herero of Namibia), to name just a few of the Bantu tribal groupings. Though the term has a wider linguistic rendering in almost all the Bantu languages of Southern Africa, it has gained considerable philosophical attention in Zimbabwe and South Africa, especially in the early twenty-first century, for the simple reason that both countries needed home-grown philosophies to move forward following the political disturbances caused by the liberation war and apartheid respectively. Philosophically, the term Ubuntu emphasises the importance of a group or community, and it finds its clearest expression in the Nguni/Ndebele phrase umuntu ngumuntu ngabantu, rendered in Shona as munhu munhu muvanhu (a person is a person through other persons). This article critically reflects on hunhu/ubuntu as a traditional and/or indigenous philosophy by focussing particularly on its distinctive features, its components and how it is deployed in the public sphere.
The subject of Hunhu/Ubuntu has generated considerable debate in both public and private intellectual discussions, especially in South Africa and Zimbabwe, where the major focus has been on whether Hunhu/Ubuntu can compete with other philosophical worldviews and whether it can solve the socio-political challenges facing the two countries. Hunhu/Ubuntu is also a key theme in African philosophy, as it places a premium on the importance of group or communal existence as opposed to the West’s emphasis on individualism and individual human rights. Thus, Hunhu/Ubuntu, as an aspect of African philosophy, prides itself on the idea that the benefits and burdens of the community must be shared in such a way that no one is prejudiced and that everything is done to put the interests of the community ahead of the interests of the individual. To this end, the traditional philosophical meaning of the term Ubuntu/Botho/Hunhu is sought, and its importance in the academy is highlighted and explained. The article also looks at how the concept is deployed in the public sphere. It provides an elaborate analysis of the qualities/features of Hunhu/Ubuntu as exemplified by John S Pobee’s expression Cognatus ergo sum (“I am related by blood, therefore I exist”). Finally, the article outlines and thoroughly explains the components cognate to Hunhu/Ubuntu as an aspect of ethno-philosophy, namely: Hunhu/Ubuntu Metaphysics, Hunhu/Ubuntu Ethics and Hunhu/Ubuntu Epistemology.
2. About the Sources
Many scholars have written about Ubuntu, and it is only fair to limit our discussion to those who have had an interest in the philosophical meaning of the term in Southern African thought. In this category we have first-generation scholars of Ubuntu such as Mogobe Bernard Ramose (1999; 2014), credited with his definition of Ubuntu as humaneness; Stanlake Samkange and Tommie Marie Samkange (1980), who link Hunhu/Ubuntu with the idea of humanism; and Desmond Tutu (1999), who sees Ubuntu as a conflict-resolution philosophy. These three are regarded as first-generation scholars of Ubuntu because, historically, they are among the first black philosophers hailing from Africa to write about Hunhu/Ubuntu as a philosophy. They also started writing as early as the 1980s and early 1990s, and they regarded Ubuntu, as inspired by traditional southern African thought, as a human quality or an attribute of the soul.
We also have second-generation scholars of Ubuntu such as Michael Onyebuchi Eze (2010), credited with his critical historicisation of the term Ubuntu; Michael Battle (2009), credited with deep insights into the linguistic meaning of the term Ubuntu as well as his famous claim that Ubuntu is a gift to the Western world; Fainos Mangena (2012a; 2012b), credited with defining Hunhu/Ubuntu and extracting from it the idea of the Common Moral Position (CMP); Thaddeus Metz (2007), whose search for a basic principle that would define African ethics has attracted a lot of academic attention; and Christian BN Gade (2011; 2012; 2013), who has taken the discourse of Hunhu/Ubuntu to another level by looking at the historical development of discourses on Ubuntu as well as the meaning of Ubuntu among South Africans of African Descent (SAADs). Finally, we have Martin H Prozesky, who has outlined some of the distinctive qualities/features of Hunhu/Ubuntu philosophy that are important for this article.
3. Hunhu/Ubuntu and Ethno-Philosophy
In order to define Ubuntu and show its nexus with ethno-philosophy, it is important that we first define ethno-philosophy. To this end, Zeverin Emagalit defines ethno-philosophy as a system of thought that deals with the collective worldviews of diverse African peoples as a unified form of knowledge based on myths, folk wisdom and the proverbs of the people. From this definition we can pick out two important points: first, that ethno-philosophy is a “system of thought”; and second, that “the collective worldviews of diverse African people” constitute a unified form of knowledge. This means that the diversity that characterises African peoples, in terms of geographical location, history and ethnicity, does not take away the fact that Africans have “a unified form of knowledge” based on group identity or community. This is what qualifies Ubuntu as an important aspect of ethno-philosophy.
This section defines Ubuntu as well as tracing its historical roots in Southern African cultures. To begin with, the term Ubuntu comes from a group of sub-Saharan languages known as Bantu (Battle, 2009: 2). It is a term used to describe the quality or essence of being a person amongst many sub-Saharan tribes of the Bantu language family (Onyebuchi Eze, 2008: 107). While Battle does not make reference to the Shona equivalent of Ubuntu and links the words Ubuntu and Bantu through the common root –ntu (human), Ramose uses the Zulu/isiNdebele word Ubuntu concurrently with its Shona equivalent, hunhu, to denote the idea of existence. For Ramose, Hu– is ontological while –nhu is epistemological, and the same holds for Ubu– and –ntu (Ramose 1999: 50). Having lived in Africa and Zimbabwe, Ramose is able to know with some degree of certainty the ontological and epistemological status of the words hunhu and ubuntu; it sometimes takes an insider to correctly discern the meanings of such words.
Hunhu/ubuntu also says something about the character and conduct of a person (Samkange and Samkange 1980: 38). What this translates to is that hunhu/ubuntu is not only an ontological and epistemological concept; it is also an ethical concept. For Battle, Ubuntu is the interdependence of persons for the exercise, development and fulfilment of their potential to be both individuals and community. Desmond Tutu captures this aptly when he uses the Xhosa proverb ungamntu ngabanye abantu, whose Shona equivalent is munhu unoitwa munhu nevamwe vanhu (a person is made a person by other persons). Generally, this proverb, for Battle, means that each individual’s humanity is ideally expressed in relationship with others. This view was earlier expressed by Onyebuchi Eze (2008: 107), who put it thus:
More critical…is the understanding of a person as located in a community where being a person is to be in a dialogical relationship in this community. A person’s humanity is dependent on the appreciation, preservation and affirmation of other person’s humanity. To be a person is to recognize therefore that my subjectivity is in part constituted by other persons with whom I share the social world.
In regard to the proverbial character of Ubuntu, Ramose also remarks that, “Ubuntu is also consistent with the practices of African peoples as expressed in the proverbs and aphorisms of certain Nguni languages, specifically Zulu and Sotho” (Ramose quoted in van Niekerk 2013).
In his definition of Ubuntu, Metz (2007: 323) follows Tutu and Ramose when he equates Ubuntu to the idea of humanness and to the maxim a person is a person through other persons. This maxim, for Metz, “has descriptive senses to the effect that one’s identity as a human being causally and even metaphysically depends on a community.” With this submission, Metz agrees with Ramose, Samkange and Samkange, and Gade that Ubuntu is about the group/community more than it is about the self.
It may be important, at this juncture, to briefly consider the historical roots of the term Ubuntu in order to buttress the foregoing. In his attempt to trace the history of the idea of Ubuntu, Michael Onyebuchi Eze (2010: 90) remarks that when it comes to the idea of Ubuntu, “history adopts a new posture…where it is no longer a narrative of the past only but of the moment, the present and the future.” Other than asking a series of questions relating to “history as a narrative of the moment, present and future,” he does not adequately explain why this is so. Instead, he goes on to explain the view of “history as a narrative of the past.” As a narrative of the past, Onyebuchi Eze observes thus:
Ubuntu is projected to us in a rather hegemonic format; by way of an appeal to a unanimous past through which we may begin to understand the socio-cultural imaginary of the “African” people before the violence of colonialism; an imagination that must be rehabilitated in that percussive sense for its actual appeal for the contemporary African society (2010:93).
Onyebuchi Eze seems to be suggesting that there is too much romanticisation of the past when it comes to the conceptualisation and use of the term Ubuntu. He seems to question the idea of the unanimous character of Ubuntu before “the violence of colonialism” reducing this idea to some kind of imagination that should have no place in contemporary African society. We are compelled to agree with him to that extent. Thus, unlike many scholars of Ubuntu who have tended to gloss over the limitations of Ubuntu, Onyebuchi Eze is no doubt looking at the history of this concept with a critical eye. One of the key arguments he presents which is worthy of our attention in this article is that of the status of the individual and that of the community in the definition and conceptualisation of Ubuntu.
While many Ubuntu writers have tended to glorify community over and above the individual, Onyebuchi Eze (2008: 106) is of the view that, “the individual and the community are not radically opposed in the sense of priority but engaged in contemporaneous formation.” Thus, while we agree with Onyebuchi Eze that both the individual and the community put together define Ubuntu, we maintain that their relationship is not that of equals but that the individual is submerged within the community and the interests and aspirations of the community matter more than those of the individual. This, however, should not be interpreted to mean that the individual plays an ancillary role in the definition of Ubuntu. Below, we outline and explain the qualities/features of hunhu/ubuntu as an aspect of ethno-philosophy.
4. The Deployment of Hunhu/Ubuntu in the Public Sphere
Hunhu/Ubuntu has dominated public discourse especially in Zimbabwe and South Africa, where it has been used to deal with both political and social differences. In Zimbabwe, for instance, hunhu/ubuntu was used to bring together the Zimbabwe African National Union Patriotic Front (ZANU PF) and the Patriotic Front Zimbabwe African People’s Union (PF ZAPU) after political tensions that led to the Midlands and Matabeleland disturbances of the early 1980s, which saw about 20,000 people killed by the North Korea-trained Fifth Brigade. The 1987 Unity Accord was signed in the spirit of Ubuntu, with people putting aside their political differences to advance the cause of the nation.
The Global Political Agreement of 2008, which led to the formation of the Government of National Unity (GNU), also saw hunhu/ubuntu being deployed to deal with the political differences between ZANU PF and the Movement for Democratic Change (MDC) formations following the violent elections of June 2008. This violence had sown seeds of fear among the general Zimbabwean population, and it took hunhu/ubuntu to remove that fear and demonstrate the spirit of “I am because we are; since we are, therefore I am.” The point is that the two political parties needed each other in the interest of the development of the nation of Zimbabwe.
In South Africa, Desmond Tutu, who chaired the Truth and Reconciliation Commission (TRC) formed in the 1990s to investigate and deal with the atrocities of apartheid, demonstrated in his final report that it took Ubuntu for people to confess, forgive and forget. In his book No Future without Forgiveness (1999), Tutu writes that “the single main ingredient that made the achievements of the TRC possible was a uniquely African ingredient – Ubuntu.” Tutu maintains that what constrained so many to choose to forgive rather than to demand retribution, to be magnanimous and ready to forgive rather than to wreak revenge, was Ubuntu (Tutu quoted in Richardson, 2008: 67). As Onyebuchi Eze (2011: 12) would put it, “the TRC used Ubuntu as an ideology to achieve political ends.” As an ideology, Ubuntu has been used as a panacea for the socio-political problems affecting the continent of Africa, especially its southern part. This means that Ubuntu as a traditional thought has not been restricted to the academy alone but has also found its place in the public sphere, where it has been utilised to solve political conflicts and thereby bring about socio-political harmony. To underscore the importance of Ubuntu not only as an intellectual but also as a public good, Gabriel Setiloane (quoted in Vicencio, 2009: 115) remarks, “Ubuntu is a piece of home grown African wisdom that the world would do well to make its own.” This suggests the southern African roots of Ubuntu as a traditional thought.
5. The Distinctive Qualities/Features of Hunhu/Ubuntu
Martin H Prozesky (2003: 5-6) has identified ten qualities characteristic of hunhu/ubuntu. Although this article utilises only Prozesky’s ten qualities, it is important to note that the philosophy of hunhu/ubuntu has more qualities or characteristics than these. Our justification for using Prozesky’s ten qualities is that they aptly capture the essence of Ubuntu as an aspect of ethno-philosophy. This article begins by outlining Prozesky’s ten qualities before explaining four of them, namely humaneness, gentleness, hospitality and generosity. Prozesky’s ten qualities are as follows:
Humaneness
Gentleness
Hospitality
Empathy or taking trouble for others
Deep Kindness
Friendliness
Generosity
Vulnerability
Toughness
Compassion
Hunhu/ubuntu as an important aspect of ethno-philosophy is an embodiment of these qualities. While Ramose uses humaneness to define hunhu/ubuntu, Samkange and Samkange use humanism to define and characterise the same. The impression one gets is that the former is similar to the latter, but this is far from the truth. With regard to the dissimilarity between humaneness and humanism, Gade (2011: 308) observes:
I have located three texts from the 1970s in which Ubuntu is identified as ‘African humanism.’ The texts do not explain what African humanism is, so it is possible that their authors understood African humanism as something different from a human quality.
Granted that this may be the case, the question then is: What is the difference between humaneness and humanism, and between African humaneness and African humanism, as aspects of hunhu/ubuntu philosophy? While humaneness may refer to the essence of being human, including the character traits that define it (Dolamo, 2013: 2), humanism, on the other hand, is an ideology, an outlook or a thought system in which human interests and needs are given more value than the interests and needs of other beings (cf. Flexner, 1988: 645). Taken together, humaneness and humanism become definitive aspects of hunhu/ubuntu only if the prefix ‘African’ is added to them, yielding African humaneness and African humanism respectively. African humaneness would then entail that the qualities of selflessness and commitment to one’s group or community are more important than the selfish celebration of individual achievements and dispositions.
African humanism, on the other hand, would refer to an ideology, outlook or thought system that values peaceful co-existence and the valorisation of community. In other words, it is a philosophy that sees human needs, interests and dignity as of fundamental importance and concern (Gyekye 1997: 158). Gyekye maintains that African humanism “is quite different from the Western classical notion of humanism which places a premium on acquired individual skills and favours a social and political system that encourages individual freedom and civil rights” (1997: 158).
Thus, among the Shona people of Zimbabwe, the expression munhu munhu muvanhu, which translates in isiNdebele and Zulu to umuntu ngumuntu ngabantu, both of which have the English translation “a person is a person through other persons,” best expresses the idea of African humanism (cf. Mangena 2012a; Mangena 2012b; Shutte 2008; Tutu 1999).
Onyebuchi Eze (2011: 12) adds his voice to the definition and characterisation of African humanism when he observes that:
As a public discourse, Ubuntu/botho has gained recognition as a peculiar form of African humanism, encapsulated in the following Bantu aphorisms, like Motho ke motho ka batho babang; Umuntu ngumuntu ngabantu (a person is a person through other people). In other words, a human being achieves humanity through his or her relations with other human beings.
Whether one prefers humaneness or humanism, the bottom line is that the two are definitive aspects of the philosophy of hunhu/ubuntu, which places communal interests ahead of individual interests. Of course, this is a position which Onyebuchi Eze would not accept, given that in his view the community cannot be prioritised over the individual:
The relation with ‘other’ is one of subjective equality, where the mutual recognition of our different but equal humanity opens the door to unconditional tolerance and a deep appreciation of the ‘other’ as an embedded gift that enriches one’s humanity (2011: 12).
Some believe that what distinguishes an African of black extraction from a Westerner is that the former is a communal being while the latter prides himself or herself on selfhood or individualism. To these people, the moment we take the individual and the community as subjective equals [as Onyebuchi Eze does], we end up failing to draw the line between what is African and what is fundamentally Western. Having defined humaneness, this article will now define and characterise the quality of gentleness as understood through hunhu/ubuntu. Gentleness encompasses softness of heart and being able to sacrifice one’s time for others. Thus, being gentle means being tender-hearted and having the ability to spend time attending to other people’s problems. Gentleness is a quality of the traditional thought of hunhu/ubuntu in that it resonates with John S Mbiti’s dictum: “I am because we are, since we are therefore I am” (1969: 215). The point is that with gentleness, one’s humanity is inseparably bound to that of others. Eric K Yamamoto (1997: 52) puts it differently in reference to the altruistic character of Ubuntu philosophy when he remarks thus:
Ubuntu is the idea that no one can be healthy when the community is sick. Ubuntu says I am human only because you are human. If I undermine your humanity, I dehumanise myself.
Both the definition of gentleness provided above and Mbiti’s dictum are equivalent to Yamamoto’s understanding of gentleness in that they emphasise otherness rather than the self. The attribute of hospitality also defines hunhu/ubuntu philosophy. Hospitality generally means being able to take care of your visitors in such a way that they feel comfortable having you as their host, the relationship being non-commercial. By contrast, the Western conception of hospitality is such that the host goes out of his or her way to provide for the needs of his or her guests in return for some payment. This, however, should not be interpreted to mean that the Westerner is not hospitable outside of commerce. No doubt, Westerners can also be hospitable, but it is the magnitude of hospitality that differs.
In the Shona/Ndebele communities of Africa, where hospitality is given for free, as when one provides accommodation and food to a stranger at his or her home, the magnitude is high. Coming to the idea of hospitality in Africa, it is important to note that in traditional Shona/Ndebele society, when a person had travelled a long journey looking for some relative, they would often tire before reaching their relative’s home, and along the way it was common for them to be accommodated for a day or two before reaching their destination. During this short stay, they would be provided with food, accommodation and warm clothes (if they happened to travel during the winter season).
Among the Korekore-Nyombwe people of Northern Zimbabwe, strangers would be given water to drink before they asked for directions or for accommodation in transit. The thinking was that the stranger would have travelled a very long distance and was probably tired and thirsty, and so there was a need to give them water to quench their thirst. Besides, water (in Africa) symbolises life and welfare, and so by giving strangers water they were saying that life needed to be sustained and that, as Africans, we are “our brothers’ keepers.” Thus, hunhu/ubuntu hospitality derives its impetus from the understanding that the life and welfare of strangers is as important as our own.
Now, this is different from the idea of home and hospitality in Western cosmopolitan societies, where a home is a place of privacy. Most homes in the West have durawalls or high fences to maximise the privacy of the owner, and so a stranger cannot just walk in and be accommodated. This is quite understandable because in Western societies the individual is conceived of as the centre of human existence, and so there is a need to respect his or her right to privacy. In the West, the idea of a stranger walking into a private space is called trespassing, and one can be prosecuted for this act. And yet in African traditional thought in general, and in Shona/Ndebele society in particular, the idea of trespassing does not exist in that way.
In fact, in pre-colonial Shona/Ndebele society, the community was at the centre of human existence, and that is why the pre-colonial Shona/Ndebele people would easily accommodate strangers or visitors without asking many questions. Due to the colonisation of Africa, however, some Africans have adopted the Western style of individual privacy, but this is contrary to hunhu/ubuntu hospitality, which is still practised in most Shona/Ndebele rural communities today. The point is that philosophies of hospitality, identity and belonging are more clearly played out on the home front than in the public sphere.
The last attribute to be discussed in this section is generosity. Generally, generosity refers to freedom or liberality in giving (Flexner 1988: 550). The attribute of generosity in Southern African thought is best expressed proverbially. In Shona culture, for instance, there are proverbs that explain the generosity of the Shona people or vanhu. Some of these include: Muenzi haapedzi dura (A visitor does not finish food), Chipavhurire uchakodzwa (The one who gives too much will also receive too much), Chawawana idya nehama mutogwa unokangamwa (Share whatever you get with your relatives because strangers are very forgetful) and Ukama igasva hunazadziswa nekudya (Relations cannot be complete without sharing food).
These proverbs not only demonstrate that the Bantu are a generous people; they also say something about the hunhu/ubuntu strand that runs through the traditional thought of almost all the Bantu cultures of Southern Africa, whereby everything is done to promote the interests of the group or community. The proverbs show that the Bantu are a selfless people, as summarised by the Nguni proverb referred to earlier: Umuntu ngumuntu ngabantu (a person is a person through other persons), or, as they put it in Shona, Munhu munhu muvanhu. Without the attribute of generosity, it may be difficult to express one’s selflessness.
6. The Components of Hunhu/Ubuntu
This section outlines the components of hunhu/ubuntu traditional philosophy, showing how these differ from the branches of Western philosophy. The components are hunhu/ubuntu metaphysics, hunhu/ubuntu ethics and hunhu/ubuntu epistemology. The objective is to show that while Western philosophy is persona-centric and is summarised by Descartes’ famous phrase Cogito ergo sum, which translates to English as “I think therefore I am,” hunhu/ubuntu traditional philosophy is communo-centric and is summarised by Pobee’s famous dictum Cognatus ergo sum, which translates to English as “I am related by blood, therefore, I exist.” In simpler terms, while Western philosophy emphasises the self and selfhood through the promotion of individual rights and freedoms, hunhu/ubuntu traditional thought emphasises the importance of the group or community through the promotion of group or communal interests.
a. Hunhu/Ubuntu Metaphysics
Before defining and characterising hunhu/ubuntu metaphysics, it is important to begin by defining the term metaphysics itself. For lack of a better word in African cultures, the article will define metaphysics from the standpoint of Western philosophy. It will then show that this definition, though it gives us a head-start, can only partially be applied to non-Western cultures. To begin with, in the history of Western philosophy, metaphysics is widely regarded as the most ancient branch of philosophy, and it was originally called first philosophy (Steward and Blocker 1987: 95). The term metaphysics itself is but an accident of history, as it is thought to have resulted from an editor’s mistake as “he was sorting out Aristotle’s works in order to give them titles, several decades after Aristotle had died. It is thought that the editor came across a batch of Aristotle’s writings that followed The Physics and he called them Metaphysics, meaning After Physics” (1987: 96).
Metaphysics, then, is the branch of philosophy that deals with the nature of reality. It asks questions such as: What is reality? Is reality material, physical or an idea? As one tries to answer these questions, a world is opened up that enables one to identify, name and describe the kinds of beings that exist in the universe. Thus, two words define being, namely ontology and predication. While pre-Socratics such as Thales, Anaximander, Anaximenes, Heraclitus and Parmenides defined being in terms of appearance and reality as well as change and permanence, Classical philosophers such as Socrates/Plato and Aristotle defined change in terms of form and matter.
While for Socrates/Plato form was more real and existed in a different realm from that of matter, Aristotle argued that form and matter together constituted substance, which was reality. Although the idea of being has always been defined in individual terms in the history of Western philosophy, it was given its definitive character by the French philosopher René Descartes, who defined it in terms of what he called Cogito ergo sum, which translates to English as “I think therefore I am.” Thus, the individual character of Western philosophy was firmly established with the popularisation of Descartes’ Cogito. A question can be asked: Does this understanding of metaphysics also apply to non-Western cultures? The answer is yes and no. Yes, in the sense that in non-Western cultures being is also explained in terms of appearance and reality as well as change and permanence; no, in the sense that non-Western philosophies, especially the hunhu/ubuntu traditional philosophy of Southern Africa, have a communal rather than an individual character. Having said this, what, then, is hunhu/ubuntu metaphysics?
Hunhu/ubuntu metaphysics is a component of hunhu/ubuntu traditional philosophy that deals with the nature of being as understood by people from Southern Africa. As already intimated, in Southern African traditional thought, being is understood in a communal, physical and spiritual sense. Thus, a human being is always in communion with other human beings as well as with the spiritual world. Sekou Toure (1959) calls this “the communion of persons,” whereby being is a function of the “us” or “we,” as opposed to the “I” found in “the autonomy of the individuals” that is celebrated in the West and is especially evident in Descartes’ Cogito. Pobee (1979) defines the African being in terms of what he calls Cognatus ergo sum, which means “I am related by blood, therefore I exist.” What this suggests is that in Southern Africa, just like in the rest of Sub-Saharan Africa, the idea of being is relational.
Coming to the communion of human beings with the spiritual world, it is important to remark that the idea of being finds its full expression through participation. Just as Socrates/Plato’s matter partakes in the immutable forms, being in Shona/Ndebele society depends solely on its relationship with the spiritual world, which is populated by ancestral spirits, avenging spirits, alien spirits and the greatest spiritual being, called Musikavanhu/Nyadenga/Unkulunkulu (the God of Creation). The greatest being works with his lieutenants, the ancestors and other spirits, to protect the interests of the lesser beings, vanhu/abantu. In return, vanhu/abantu enact rituals of appeasement so that the interaction does not become one-way. It is, however, important to note that while Socratic/Platonic metaphysics is dualistic in character, hunhu/ubuntu metaphysics is onto-triadic or tripartite in character: it involves the Supreme Being (God), other lesser spirits (ancestral, alien and avenging) and human beings.
b. Hunhu/Ubuntu Ethics
Hunhu/ubuntu ethics refer to hunhu/ubuntu moral terms and phrases such as tsika dzakanaka kana kuti dzakaipa (good or bad behaviour), kuzvibata kana kuti kusazvibata (self-control or reckless behaviour), kukudza vakuru (respecting or disrespecting elders) and kuteerera vabereki (being obedient or disobedient to one’s immediate parents and the other elders of the community), among others. In Shona society they say: Mwana anorerwa nemusha kana kuti nedunhu (It takes a clan, village or community to raise a child). Having defined hunhu/ubuntu ethics, it is important to distinguish them from hunhu/ubuntu morality, which relates to the principles or rules that guide and regulate the behaviour of vanhu or abantu (human beings in the Shona/Ndebele sense of the word) within Bantu societies.
What distinguishes hunhu/ubuntu ethics from Western ethics is that the former are both upward-looking/transcendental and lateral, while the latter are only lateral. This section will briefly distinguish the upward-looking/transcendental kind of hunhu/ubuntu ethic from the lateral kind. By upward-looking/transcendental is meant that hunhu/ubuntu ethics are not confined to the interaction between humans; they also involve spiritual beings such as Mwari/Musikavanhu/Unkulunkulu (Creator God), Midzimu (ancestors) and Mashavi (alien spirits). Thus, hunhu/ubuntu ethics are spiritual, dialogical and consensual (cf. Nafukho 2006). By dialogical and consensual is meant that the principles that guide and regulate the behaviour of vanhu/abantu are products of the dialogue between spiritual beings and human beings and the consensus that they reach. By lateral is meant that these principles or rules are crafted solely to guide human interactions.
As Mangena (2012: 11) would put it, hunhu/ubuntu ethics proceed through what is called the Common Moral Position (CMP). The CMP is not a position established by one person, as is the case with Plato’s justice theory, Aristotle’s eudaimonism, Kant’s deontology or Bentham’s hedonism (2012: 11). With the CMP, the community is the source, author and custodian of moral standards, and personhood is defined in terms of conformity to these established moral standards, whose objective is to produce a person who is communo-centric rather than one who is individualistic. In Shona/Ndebele society, for instance, respect for elders is one of the ways in which personhood can be expressed, with the goal being to uphold communal values. It is within this context that respect for elders is a non-negotiable matter, since the elders are the custodians of these values and fountains of moral wisdom.
Thus, one is born and bred in a society that values respect for the elderly, and he or she has to conform. One important point to note is that the process of attaining the CMP is dialogical and spiritual in the sense that elders set moral standards in consultation with the spirit world, which, as intimated earlier, is made up of Mwari/Musikavanhu/Unkulunkulu (Creator God) and Midzimu (ancestors), and these moral standards are upheld by society (2012: 12). These moral standards, which make up the CMP, are not forced on society, as the elders (who represent society), Midzimu (who convey the message to Mwari) and Mwari (who gives a nod of approval) ensure that the standards are there to protect the interests of the community at large.
Communities are allowed to exercise their free will but remain responsible for the choices they make as well as their actions. For instance, if a community chooses to ignore the warnings of the spirit world regarding an impending danger such as a calamity resulting from failure by that community to enact an important ritual that will protect members of that community from say, flooding or famine; then the community will face the consequences.
c. Hunhu/Ubuntu Epistemology
What is epistemology? In the Western sense of the word, epistemology deals with the meaning, source and nature of knowledge. Western philosophers differ when it comes to the sources of knowledge, with some arguing that reason is the source of knowledge while others view experience, or the use of the senses, as the gateway to knowledge. This article will not delve much into these arguments, since they have found an audience elsewhere; instead, it focuses on hunhu/ubuntu epistemology. However, one cannot define and characterise hunhu/ubuntu traditional epistemology without first defining and demarcating the province of African epistemology as opposed to Western epistemology.
According to Michael Battle (2009: 135), “African epistemology begins with community and moves to individuality.” Thus, the idea of knowledge in Africa resides in the community and not in the individuals that make up the community. Inherent in the powerful wisdom of Africa is the ontological need of the individual to know self and community (2009: 135), and discourses on hunhu/ubuntu traditional epistemology stem from this wisdom. As Mogobe Ramose (1999) puts it, “the African tree of knowledge stems from ubuntu philosophy. Thus, ubuntu is a wellspring that flows within African notions of existence and epistemology in which the two constitute a wholeness and oneness.” Just like hunhu/ubuntu ontology, hunhu/ubuntu epistemology is experiential.
In Shona society, for instance, the idea of knowledge is expressed through listening to elders telling stories of their experiences as youths and how such experiences can be relevant to the lives of the youths of today. Sometimes, they use proverbs to express their epistemology. The proverb Rega zvipore akabva mukutsva (Experience is the best teacher) is a case in point. One comes to know that promiscuity is bad if he or she was once involved in it and contracted a Sexually Transmitted Infection (STI) or suffered other bad consequences. No doubt, this person will be able to tell others that promiscuity is bad because of his or her experiences. The point is that hunhu/ubuntu epistemology is a function of experience. In Shona, they also say: Takabva noko kumhunga hakuna ipwa (We passed through the millet field and we know that there are no sweet reeds there). The point is that one gets to know that there are no sweet reeds in a millet field because he or she passed through it. One has to use the senses to discern knowledge.
7. Conclusion
In this article, the traditional philosophy of hunhu/ubuntu was defined and characterised with a view to showing that Africa has a traditional philosophy and ethic which are distinctively communal and spiritual. This philosophy was also discussed with reference to how it has been deployed in the public sphere in both Zimbabwe and South Africa. The key distinctive qualities/features of this traditional philosophy were spelt out as humaneness, gentleness, hospitality and generosity. The philosophy was further discussed within the context of its three main components, namely: hunhu/ubuntu metaphysics, hunhu/ubuntu ethics and hunhu/ubuntu epistemology. In the final analysis, it was explained that these three components form the aspects of what is known today as traditional Southern African thought.
8. References and Further Reading
Appiah, K.A. 1992. In My Father’s House: Africa in the Philosophy of Culture. New York: Oxford University Press.
A thorough treatise of the idea of Africa in the philosophy of culture
Battle, M. 2009. Ubuntu: I in You and You in Me. New York: Seabury Publishing
A discussion of Ubuntu and how this idea has benefitted the Western world.
Dolamo, R. 2013. “Botho/Ubuntu: The Heart of African Ethics.” Scriptura, 112 (1), pp.1-10
A thorough treatise on the notion of Ubuntu and its origin in Africa
Eze, M.O. 2011. “I am Because You Are.” The UNESCO Courier, pp. 10-13
A Philosophical analysis of the idea of ubuntu
Eze, M.O. 2010. Intellectual History in Contemporary South Africa. New York: Palgrave Macmillan
A detailed outline of the definition and characterization of intellectual history in Contemporary Africa
Eze, M.O. 2008. “What is African Communitarianism? Against Consensus as a regulative ideal.” South African Journal of Philosophy. 27 (4), pp. 106-119
A philosophical discussion of the notions of community and individuality in African thought
Flexner, S et al. 1988. The Random House Dictionary. New York: Random House.
One of the best dictionaries used by academics
Gade, C.B.N. 2011. “The Historical Development of the Written Discourses on Ubuntu.” South African Journal of Philosophy, 30(3), pp. 303-330
A philosophical discussion of the historical development of the Ubuntu discourse in Southern Africa
Gade, C.B.N. 2012. “What is Ubuntu? Different Interpretations among South Africans of African Descent.” South African Journal of Philosophy, 31 (3), pp.484-503
A Case-study on how South Africans of African descent interpret ubuntu
Gade, C.B.N. 2013. “Restorative Justice and the South African Truth and Reconciliation Process.” South African Journal of Philosophy, 32(1), pp. 10-35
A philosophical discussion of the origins of the idea of Restorative Justice
Gyekye, K. 1997. Tradition and Modernity: Reflections on the African Experience. New York: Oxford University Press
A philosophical rendition of the concepts of tradition and modernity in Africa
Hurka, T. 1993. Perfectionism. New York: Oxford University Press
A discussion on the notion of perfectionism
Makinde, M.A. 1988. African Philosophy, Culture and Traditional Medicine. Athens: Ohio University Center for International Studies, Africa Series No. 53.
A thorough treatise on culture and philosophy in African thought
Mangena, F. 2012a. On Ubuntu and Retributive Justice in Korekore-Nyombwe Culture: Emerging Ethical Perspectives. Harare: Best Practices Books
A philosophical discussion of the place of Ubuntu and culture in the death penalty debate
Mangena, F. 2012b. “Towards a Hunhu/Ubuntu Dialogical Moral Theory.” Phronimon: Journal of the South African Society for Greek Philosophy and the Humanities, 13 (2), pp. 1-17
A philosophical discussion of the problems of applying Western ethical models in non-Western cultures
Mangena, F. 2014. “In Defense of Ethno-philosophy: A Brief Response to Kanu’s Eclecticism.” Filosofia Theoretica: A Journal of Philosophy, Culture and Religions, 3 (1), pp. 96-107
A reflection on the importance of ethno-philosophy in the African philosophy debate
Mangena, F. 2015. “Ethno-philosophy as Rational: A Reply to Two Famous Critics.” Thought and Practice: A Journal of the Philosophical Association of Kenya, 6 (2), pp. 24-38
A reaction to the Universalists regarding the place of ethno-philosophy in African thought
Mbiti, J.S. 1969. African Religions and Philosophy. London: Heinemann
A discussion of community in African philosophy
Metz, T. 2007. “Towards an African Moral Theory.” The Journal of Political Philosophy, 15(3), pp. 321-341
A philosophical outline of what Thaddeus Metz perceives as African ethics
Nafukho. F.M. 2006. “Ubuntu Worldview: A Traditional African View of Adult Learning in the Workplace.” Advances in Developing Human Resources, 8(3), pp.408-415
A thorough treatise on the three pillars of ubuntu
Pobee, J.S. 1979. Towards an African Theology. Nashville: Abingdon Press.
A theological discussion of the notions of community and individuality in African thought
Prozesky, M.H. 2003. Frontiers of Conscience: Exploring Ethics in a New Millennium. Cascades: Equinym Publishing
An outline of Ubuntu’s ten qualities
Ramose, M.B. 1999. African Philosophy through Ubuntu. Harare: Mond Books.
A thorough discussion on the nature and character of ubuntu
Ramose, M.B. 2007. “But Hans Kelsen was not born in Africa: A reply to Metz.” South African Journal of Philosophy, 26(4), pp. 347-355
Ramose’s response to Thaddeus Metz’s claim that African ethics lack a basic norm
Ramose, M.B. 2014b. “Ubuntu: Affirming Right and Seeking Remedies in South Africa.” In: L Praeg and S Magadla (Eds.). Ubuntu: Curating the Archive (pp. 121-1346). Scottsville: University of KwaZulu Natal Press
A discussion of Ubuntu as affirming right and wrong in South Africa
Samkange, S and Samkange, T.M. 1980. Hunhuism or Ubuntuism: A Zimbabwean Indigenous Political Philosophy. Salisbury: Graham Publishing
A philosophical handbook on notions of Hunhu/Ubuntu in Zimbabwe
Steward, D and Blocker H.G. 1987. Fundamentals of Philosophy. New York: Macmillan Publishing Company
A discussion of key topics in Western philosophy
Shutte, A. 2008. “African Ethics in a Globalizing World.” In: R Nicolson (Ed.). Persons in Community: African Ethics in a Global Culture (pp. 15-34). Scottsville: University of KwaZulu Natal Press
A philosophical discussion of African ethics and its place in the globe
Taylor, D.F.P. 2013. “Defining Ubuntu for Business Ethics: A Deontological Approach.” South African Journal of Philosophy, 33(3), pp.331-345
An attempt to apply Ubuntu in the field of Business in Africa
Tutu, D. 1999. No Future without Forgiveness. New York: Doubleday
A philosophical discussion of the findings of the Truth and Reconciliation Commission in South Africa
van Niekerk, J. 2013. “Ubuntu and Moral Value.” PhD Dissertation, Department of Philosophy, University of the Witwatersrand, Johannesburg
A philosophical rendition of the discourse of ubuntu and moral value.
Yamamoto, E.K. 1997. “Race Apologies.” Journal of Gender, Race and Justice, Vol. 1, pp. 47-88
A critical reflection on the nexus of Ubuntu, race, gender and justice
Villa-Vicencio, C. 2009. Walk with Us and Listen: Political Reconciliation in Africa. Cape Town: University of Cape Town Press
A philosophical discussion of political reconciliation in Africa.
Richardson, N. R. 2006. “Reflections on Reconciliation and Ubuntu.” In: R Nicolson (Ed.). Persons in Community: African Ethics in a Global Culture. Scottsville: University of KwaZulu Natal Press
A discussion on reconciliation in light of the Truth and Reconciliation Commission in South Africa.
This article traces the history of systematic African philosophy from the early 1920s to date. In Plato’s Theaetetus, Socrates suggests that philosophy begins with wonder. Aristotle agreed. However, recent research shows that wonder may have different subsets. If that is the case, which specific subset of wonder inspired the beginning of systematic African philosophy? In the history of Western philosophy, there is one subset called thaumazein, interpreted as ‘awe’, and another called miraculum, interpreted as ‘curiosity’. History shows that these two subsets manifest in the African place as well, even during the pre-systematic era. However, there is now an idea appearing in recent African philosophy literature called ọnụma, interpreted as ‘frustration’, which is regarded as the subset of wonder that jump-started systematic African philosophy. In the 1920s, a host of Africans who had gone to study in the West were just returning. They had experienced terrible racism and discrimination while in the West. They were referred to as descendants of slaves, as people from the slave colony, as sub-humans, and so on. On return to their native lands, they met the same maltreatment from the colonial officials. ‘Frustrated’ by colonialism and racialism as well as the legacies of slavery, they were jolted onto the path of philosophy—African philosophy—by what can be called ọnụma.
These ugly episodes of slavery, colonialism and racialism not only shaped the world’s perception of Africa; they also instigated a form of intellectual revolt from the African intelligentsia. The frustration with the colonial order eventually led to angry questions and reactions out of which African philosophy emerged, first in the form of nationalisms, and then in the form of ideological theorizations. But the frustration was born of the colonial caricature of Africa as culturally naïve, intellectually docile and rationally inept. This caricature was created by European scholars such as Kant, Hegel and, much later, Levy-Bruhl, to name just a few. It was the reaction to this caricature that led some African scholars returning from the West into the type of philosophizing one can describe as systematic, beginning with the identity of the African people, their place in history, and their contributions to civilization. To dethrone the colonially-built episteme became a ready attraction for African scholars’ vexed frustrations. Thus began the history of systematic African philosophy with the likes of J.B. Danquah, Meinrad Hebga, George James, S.K. Akesson, Aime Cesaire, Leopold Senghor, Kwame Nkrumah, Julius Nyerere, William Abraham and John Mbiti, as well as others such as Placide Tempels and Janheinz Jahn, to name a few.
African philosophy as a systematic study has a very short history. This history is also a very dense one since actors sought to do in a few decades what would have been better done in many centuries. As a result, they also did in later years what ought to have been done earlier and vice versa, thus making the early and the middle periods overlap considerably. The reason for this overtime endeavor is not far-fetched. Soon after colonialism, actors realized that Africa had been sucked into the global matrix unprepared. During colonial times, the identity of the African was European; his thought system, standard and even his perception of reality were structured by the colonial shadow which stood towering behind him. It was easy for the African to position himself within these Western cultural appurtenances even though they had no real connection with his being.
The vanity of this presupposition and the emptiness of colonial assurances manifested soon after the towering colonial shadow vanished. Now, in the global matrix, it became shameful for the African to continue to identify himself within the European colonialist milieu. For one, he had just rejected colonialism; for another, the deposed European colonialist made it clear that the identity of the African was no longer covered and insured by the European medium. So, actors realized suddenly that they had been disillusioned and had suffered severe self-deceit under the colonial temper. The question which trailed every African was, “Who are you?” Of course, the answers from the European perspective were savage, primitive, less than human, and so on. It was the urgent, sudden need to contradict these European positions that led some post-colonial Africans into the search for African identity. So, to discover or rediscover African identity, in order to initiate a non-colonial or original history for Africa in the global matrix and start a course of viable economic, political and social progress that is entirely African, became one of the focal points of African philosophy. Here, the likes of Cesaire, Nkrumah and Leon Damas began articulating the negritude movement.
While J.B. Danquah (1928, 1944) and S.K. Akesson (1965) rationally investigated topics in African politics, law and metaphysics, George James (1954) reconstructed African philosophical history and Meinrad Hebga (1958) probed topics in African logic. These represent some of the early African philosophers. Placide Tempels (1959), the European missionary, also elected to help and, in his controversial book Bantu Philosophy, sought to create Africa’s own philosophy as proof that Africa has its own peculiar identity and thought system. However, it was George James who attempted a much more ambitious project in his work Stolen Legacy. In this work, there were strong suggestions not only that Africa had philosophy but that so-called Western philosophy, the very bastion of European identity, was stolen from Africa. This claim was intended to make the proud European colonialists feel indebted to the humiliated Africans, but it was unsuccessful. That Greek philosophy had roots in Egypt does not imply, as some claim, that the Egyptians were high-melanated, nor that high-melanated Africans created Egyptian philosophy. The use of the term “Africans” in this work is in keeping with George James’ demarcation, which precludes the low-melanated people of North Africa and refers to the high-melanated people south of the Sahara.
Besides those mentioned above, other Africans contributed ideas. Aime Cesaire, John Mbiti, Odera Oruka, Julius Nyerere, Leopold Senghor, Nnamdi Azikiwe, Kwame Nkrumah, Obafemi Awolowo, Alexis Kagame, Uzodinma Nwala, Emmanuel Edeh, Innocent Onyewuenyi and Henry Olela, to name just a few, opened the doors of ideas. A few of the works produced sought to prove and establish the philosophical basis of a unique African identity in the history of humankind, while others sought to chart a course of Africa’s true identity through unique political and economic ideologies. It can be stated that much of these endeavors fall under the early period.
For its concerns, the middle period of African philosophy is characterized by the Great Debate. Those who sought to clarify and justify the position held in the early period and those who sought to criticize and deny the viability of such a position entangled themselves in a great debate. Some of the actors on this front include C.S. Momoh, Robin Horton, Henri Maurier, Lacinay Keita, Peter Bodunrin, Kwasi Wiredu, Kwame Gyekye, Richard Wright, Barry Hallen, Joseph Omoregbe, C.B. Okolo, Theophilus Okere, Paulin Hountondji, Gordon Hunnings, Odera Oruka and Sophie Oluwole, to name a few.
The middle period eventually gave way to the later period, which has as its focus the construction of an African episteme. Two camps rivaled each other, namely the Critical Reconstructionists, who are the evolved Universalists/Deconstructionists, and the Eclectics, who are the evolved Traditionalists/Excavators. The former seek to build an African episteme untainted by ethnophilosophy, whereas the latter seek to do the same by a delicate fusion of the relevant ideals of the two camps. In the end, Critical Reconstructionism ran into a brick wall when it became clear that whatever it produced could not truly be called African philosophy if it was all Western without African marks. The mere claim that it would be African philosophy simply because it was produced by Africans (Hountondji 1996; Oruka 1975) would collapse like a house of cards under any argument. For this great failure, the influence of Critical Reconstructionism in the later period was whittled down, and it was later absorbed by its rival, Eclecticism.
The works of the Eclectics heralded the emergence of the New Era in African philosophy. The focus became conversational philosophizing, in which the production of a philosophically rigorous and original African episteme, better than what the Eclectics produced, occupies the center stage.
Overall, the sum of what historians of African philosophy have done can be presented in two broad categorizations, to wit: the pre-systematic epoch and the systematic epoch. The former refers to Africa’s philosophical culture and the thoughts of anonymous African thinkers, and may include the problems of the Egyptian and Ethiopian legacies. The latter refers to the periods marking the return of Africa’s first eleven Western-trained philosophers, from the 1920s to date. This latter category can be further delineated into four periods:
Early period 1920s – 1960s
Middle period 1960s – 1980s
Later period 1980s – 1990s
New (Contemporary) period since 1990s
Note, of course, that this does not commit us to saying that, before the early period, people in Africa never philosophized—they did. But one fact that must not be denied is that much of their thought was not documented in writing; most of what may have been documented is either lost or destroyed, and, as such, scholars cannot attest to its systematicity or sources. In other words, what this periodization shows is that African philosophy as a system first began in the late 1920s. There are, of course, documented writings in ancient Egypt, medieval Ethiopia, and so on; the historian Cheikh Anta Diop (1974) has gazetted some of these ideas. Among the popularly cited figures are St Augustine, who was born in present-day Algeria but who, being a Catholic priest of the Roman Church, was trained in Western-style philosophy and is counted amongst the medieval philosophers; Anton Wilhelm Amo, who was born in Ghana, West Africa, was sold into slavery as a little boy, and was later educated in Western-style philosophy in Germany, where he also practised; and Zera Yacob and Walda Heywat, both Ethiopian philosophers with Arabic and European educational influences. The question is: are the ideas produced by these people indubitably worthy of the name ‘African philosophies’? Their authors may be Africans by birth, but how independent are their views from foreign influences? We observe from these questions that the best that can be expected is a heated controversy. It would be uncharitable to say to the European historian of philosophy that St Augustine or Amo was not one of their own. Similarly, it may be uncharitable to say to the African historian that Amo or Yacob was not an African. But does being an African translate to being an African philosopher? If we set sentiments aside, it is less difficult to see that all there is in those questions is a controversy.
Even if there were any substance beyond controversy, were those isolated and disconnected views (most of which were sociological, religious, ethnological and anthropological) from Egypt, Rome, Germany and Ethiopia adequate to form a coherent corpus of African philosophy? The conversationalists, a contemporary African philosophical movement, have provided a via media out of this controversy. Rather than discard this body of knowledge as non-African philosophy, or uncritically accept it as African philosophy as the likes of Obi Oguejiofor and Anke Graness do, the conversationalists urge that it be discussed as part of the pre-systematic epoch, which also includes those whom Innocent Asouzu (2004) describes as the “Anonymous Traditional African Philosophers”. These are the ancient African philosophers whose names were forgotten through the passage of time, and whose ideas were transmitted through orality.
Because there are credible objections among African philosophers with regard to its inclusion in the historical chart of African philosophy, the Egyptian question (the idea that the creators of ancient Egyptian civilization were high-melanated Africans from south of the Sahara) will be treated as part of the controversies of the pre-systematic epoch. The main objection is that even if the philosophers of the stolen legacy were able to prove a connection between Greece and Egypt, they could not prove in concrete terms that the Egyptians who created the philosophy stolen by the Greeks were high-melanated Africans, or that high-melanated Africans were Egyptians. The frustration and desperation that motivated such an ambitious effort in the ugly colonial era captured above are understandable, but any reasonable person, judging by the responses of time and events in the last few decades, would conclude that it is high time Africans abandoned that unproven legacy and let go of that now-helpless propaganda. If, however, some would want to retain it as part of African philosophy, it would fall within the pre-systematic era.
In this essay, the discussion will focus on the history of systematic African philosophy, touching prominently on the criteria, schools, movements and periods in African philosophy. As much as the philosophers of a given era may disagree, they are inevitably united by the problem of their epoch. That is to say, it is orthodoxy that each epoch is defined by a common focus or problem. Therefore, the study of the history of philosophy can be approached either through a personality periscope or through the periods, but whichever approach one chooses, one unavoidably runs into the person who had chosen the other. This is a sign of unity of focus. Thus, philosophers are those who seek to solve the problem of their time. In this presentation, the study of the history of African philosophy will be approached from the perspectives of criteria, periods, schools and movements. The personalities will be discussed within these purviews.
2. Criteria of African Philosophy
To start with, more than three decades of debate on the status of African philosophy ended with the affirmation that African philosophy exists. But what is it that makes a philosophy African? Answers to this question polarized actors into two main groups, namely the Traditionalists and the Universalists. Whereas the Traditionalists aver that the studies of the philosophical elements in the world-view of the people constitute African philosophy, the Universalists insist that it has to be a body of analytic and critical reflections of individual African philosophers. Further probing of the issue during the debate produced, by its end, two contrasting criteria of what makes a philosophy “African”. First, there is the racial criterion: a philosophy is African if it is produced by Africans. This is the view held by people like Paulin Hountondji, Odera Oruka (in part) and the early Peter Bodunrin, derived from the two constituting terms, “African” and “philosophy”. African philosophy, following this criterion, is philosophy done by Africans. This has been criticized as inadequate, incorrect and exclusivist. Second, there is the tradition criterion: a philosophy is “African” if it designates a non-racial-bound philosophical tradition in which the predicate “African” is treated as a solidarity term of no racial import and the approach derives inspiration from an African cultural background or system of thought. It does not matter whether the issues addressed are African or whether the philosophy is done by an African, insofar as it has universal applicability and emerges from the purview of an African system of thought. African philosophy would then be the rigorous discourse of African issues, or any issues whatsoever, from the critical eye of an African system of thought. Actors like Odera Oruka (in part), Meinrad Hebga, C.S. Momoh, Udo Etuk, Joseph Omoregbe, the later Peter Bodunrin and Jonathan Chimakonam can be grouped here.
This criterion has also been criticized as courting uncritical elements of the past when it makes reference to the controversial idea of an African logic tradition. Further discussion on this is well beyond the scope of this essay. What is, however, common to the two criteria is that African philosophy is a critical discourse, by African philosophers, on issues that may or may not affect Africa; the purview of this discourse remains unsettled. Recently, the issue of language has come to the fore as crucial in determining the Africanness of a philosophy. Inspired by the works of Euphrase Kezilahabi (1985), Ngugi wa Thiong’o (1986), AGA Bello (1987), and Francis Ogunmodede (1998), to name just a few, the ‘language challenge’ is now taken as an important element in the affirmation of African philosophy. Advocates ask: should authentic African philosophy be done in African languages or in a foreign language with wider reach? Godfrey Tangwa (2017), Chukwueloka Uduagwu (2022) and Enyimba Maduka (2022) are some contemporary Africans who investigate this question. Alena Rettova (2007) represents non-African philosophers who engage the question.
3. Methods of African Philosophy
a. The Communitarian Method
This method speaks to the idea of mutuality, togetherness or harmony, the type found in the classic expression of ubuntu: “a person is a person through other persons”, or the one credited to John Mbiti: “I am because we are, since we are, therefore I am”. Those who employ this method wish to demonstrate the mutual interdependence of variables or the relational analysis of variables. It is most prominent in the works of researchers working in the areas of ubuntu, personhood and communalism. Some of the scholars who employ this method include Ifeanyi Menkiti, Mogobe Ramose, Kwame Gyekye, Thaddeus Metz, Fainos Mangena, Leonhard Praeg, Bernard Matolino, Michael Eze, Olajumoke Akiode, Rianna Oelofsen, and so forth.
b. The Complementarity Method
This method was propounded by Innocent Asouzu, and it emphasizes the idea of the missing link. In it, no variable is useless. The system of reality is like a network in which each variable has an important role to play; that is, it complements and is, in return, complemented, because no variable is self-sufficient. Each variable is then seen as a ‘missing link’ of reality to other variables. Here, method is viewed as a disposition or a bridge-building mechanism. As a disposition, it says a lot about the orientation of the philosopher who employs it. The method of complementary reflection seeks to bring together seemingly opposed variables into a functional unity. Other scholars whose works have followed this method include Mesembe Edet, Ada Agada, Jonathan Chimakonam and a host of others.
c. The Conversational Method
This is a formal procedure for assessing the relationships of opposed variables, in which thoughts are shuffled through disjunctive and conjunctive modes to constantly recreate fresh theses and anti-theses, each time at a higher level of discourse, without the expectation of a synthesis. The three principal features of this method are: relationality, the idea that variables necessarily interrelate; contextuality, the idea that the relationship of variables is informed and shaped by contexts; and complementarity, the idea that seemingly opposed variables can complement rather than contradict each other. It is an encounter between philosophers of rival schools of thought and between different philosophical traditions or cultures, in which one party called nwa-nsa (the defender or proponent) holds a position and another party called nwa-nju (the doubter or opponent) doubts or questions the veracity and viability of the position. On the whole, this method points to the idea of relationships among interdependent, interrelated and interconnected realities existing in a network, whose peculiar truth conditions can more accurately and broadly be determined within specific contexts. This method was first proposed by Jonathan Chimakonam and endorsed by the Conversational School of Philosophy. Other scholars who now employ this method include Victor Nweke, Mesembe Edet, Fainos Mangena, Enyimba Maduka, Ada Agada, Pius Mosima, L. Uchenna Ogbonnaya, Aribiah Attoe, Leyla Tavernaro-Haidarian, Amara Chimakonam, Chukwueloka Uduagwu, Patrick Ben, and a host of others.
4. Schools of African Philosophy
a. Ethnophilosophy School
This is the foremost school in systematic African philosophy, which equated African philosophy with culture-bound systems of thought. For this, their enterprise was scornfully described as substandard, hence the term “ethnophilosophy.” The thoughts of members of the Excavationism movement like Placide Tempels and Alexis Kagame properly belong here, and their high point was in the early period of African philosophy.
b. Nationalist/Ideological School
The concern of this school was nationalist philosophical jingoism to combat colonialism and to create political philosophy and ideology for Africa from the indigenous traditional system as a project of decolonization. Thoughts of members of the Excavationism movement like Kwame Nkrumah, Leopold Sedar Senghor and Julius Nyerere in the early period can be brought under this school.
c. Philosophic Sagacity
There is also the philosophic sagacity school, whose main focus is to show that standard philosophical discourse existed and still exists in traditional Africa and can only be discovered through sage conversations. The chief proponent of this school was the brilliant Kenyan philosopher Odera Oruka, who took time to emphasize that Marcel Griaule’s similar programme is less sophisticated than his. Other adherents of this school include Gail Presbey, Anke Graness and the Cameroonian philosopher Pius Mosima. But since Oruka’s approach thrives on the method of oral interviews of presumed sages, whose authenticity can easily be challenged, what was produced may well be distanced from the sages and become the fruit of the interviewing philosopher. So, the sage connection and the tradition became questionable. Their enterprise falls within the movement of Critical Reconstructionism of the later period.
d. Hermeneutical School
Another prominent school is the hermeneutical school. Its position is that the best approach to studying African philosophy is through interpretations of oral traditions and emerging philosophical texts. Theophilus Okere, Okonda Okolo, Tsenay Serequeberhan and Ademola Fayemi Kazeem are some of the major proponents and members of this school. The confusion, however, is that they reject ethnophilosophy, whereas the oral tradition and most of the texts available for interpretation are ethnophilosophical in nature. The works of Okere and Okolo feasted on ethnophilosophy. This school exemplifies the movement called Afro-constructionism of the middle period.
e. Literary School
The literary school’s main concern is to make a philosophical presentation of African cultural values through literary/fictional means. Proponents like Chinua Achebe, Okot P’Bitek, Ngugi wa Thiong’o and Wole Soyinka, to name a few, have been outstanding. Yet critics have found it convenient to identify their discourse with ethnophilosophy from a literary angle, thereby denigrating it as sub-standard. Their enterprise marks the movement of Afro-constructionism of the middle period.
f. Professional School
Perhaps the most controversial is the one variously described as the professional, universalist or modernist school. It contends that all the other schools are engaged in one form of ethnophilosophy or the other; that standard African philosophy is critical, individual discourse; and that what qualifies as African philosophy must have universal merit and thrive on the method of critical analysis and individual discursive enterprise. It is not about talking; it is about doing. Some staunch, unrepentant members of this school include Kwasi Wiredu, Paulin Hountondji, Peter Bodunrin, Richard Wright and Henri Maurier, to name a few. They demolished all that had been built in African philosophy and built nothing as an alternative episteme. This school champions the movement of Afro-deconstructionism and the abortive Critical Reconstructionism of the middle and later periods, respectively.
Perhaps one of the deeper criticisms that can be leveled against the position of the professional school comes from C. S. Momoh’s scornful description of the school as African logical neo-positivism. Its members agitate (1) that there is nothing as yet in African traditional philosophy that qualifies as philosophy and (2) that critical analysis should be the focus of African philosophy; so, what then is there to be critically analyzed? Adherents of the professional school are said to forget, in their overt copying of European philosophy, that analysis is a recent development in European philosophy which attained maturation in the 19th century after over 2000 years of historical evolution, a fact that should temper their claims. Would they also grant that philosophy in Europe before the 19th century was not philosophy? The aim of this essay is not to offer criticisms of the schools but to present the historical journey of philosophy in the African tradition. It is in opposition to, and out of the need to fill the lacuna in, the enterprise of the professional school that the new school called the conversational school has emerged in African philosophy.
g. Conversational School
This new school thrives on fulfilling the yearning of the professional/modernist school for a robust individual discourse, as well as the conviction of the traditionalists that a thorough-going African philosophy has to be erected on the foundation of African thought systems. It makes the most of the criterion that presents African philosophy as a critical tradition that prioritizes engagements between philosophers and cultures, and projects individual discourses through the methodological lenses and thought system of Africa, featuring the principles of relationality, contextuality and complementarity. The school has an ideological structure consisting of four aspects: its working assumption that relationship and context are crucial to understanding reality; its main problem, called border lines, or the presentation of reality as binary opposites; its challenge, which is to trace the root cause of border lines; and its two main questions, which are: does difference amount to inferiority, and are opposites irreconcilable? Those whose writings fit into this school include Pantaleon Iroegbu, Innocent Asouzu, Chris Ijiomah, Godfrey Ozumba, Andrew Uduigwomen, Bruce Janz, Jennifer Vest, Jonathan Chimakonam, Fainos Mangena, Victor Nweke, Paul Dottin, Aribiah Attoe, Leyla Tavernaro-Haidarian, Maduka Enyimba, L. Uchenna Ogbonnaya, Isaiah Negedu, Christiana Idika, Ada Agada, Amara Chimakonam, Patrick Ben, Emmanuel Ofuasia, Umezurike Ezugwu, to name a few. Their projects promote partly the movement of Afro-eclecticism and fully the Conversationalism of the later and the new periods, respectively.
5. The Movements in African Philosophy
Four main movements can be identified in the history of African philosophy: Excavationism, Afro-constructionism/Afro-deconstructionism, Critical Reconstructionism/Afro-Eclecticism, and Conversationalism.
a. Excavationism
The Excavators are all those who sought to erect the edifice of African philosophy by systematizing the African cultural world-views. Some of them aimed at retrieving and reconstructing a presumably lost African identity from the raw materials of African culture, while others sought to develop compatible political ideologies for Africa from the native political systems of African peoples. Members of this movement have all been grouped under the schools known as the ethnophilosophy and nationalist/ideological schools, and they thrived in the early period of African philosophy. Their concern was to build and demonstrate a unique African identity in various forms. A few of them include JB Danquah, SK Akesson, Placide Tempels, Julius Nyerere, John Mbiti, Alexis Kagame, Leopold Senghor, Kwame Nkrumah and Aime Cesaire.
b. Afro-Constructionism/Afro-Deconstructionism
The Afro-deconstructionists, sometimes called the Modernists or the Universalists, are those who sought to demolish the edifice erected by the Excavators on the ground that its raw materials were substandard cultural paraphernalia. They are opposed to the idea of a unique African identity or culture-bound philosophy and prefer a philosophy that will integrate African identity with the identity of all other races. They never built this philosophy. Some members of this movement include Paulin Hountondji, Kwasi Wiredu, Peter Bodunrin, Marcien Towa, Fabien Eboussi Boulaga, Richard Wright and Henri Maurier, and partly Kwame Appiah. Their opponents are the Afro-constructionists, sometimes called the Traditionalists or Particularists, who sought to add rigor to, and promote, the works of the Excavators as true African philosophy. Some prominent actors in this movement include Ifeanyi Menkiti, Innocent Onyewuenyi, Henry Olela, Lansana Keita, C. S. Momoh, Joseph Omoregbe, Janheinz Jahn, Sophie Oluwole and, in some ways, Kwame Gyekye. Members of this twin-movement have variously been grouped under the ethnophilosophy, philosophic sagacity, professional, hermeneutical and literary schools, and they thrived in the middle period of African philosophy. This is also known as the period of the Great Debate.
c. Critical Reconstructionism/Afro-Eclecticism
A few Afro-deconstructionists of the middle period evolved into Critical Reconstructionists, hoping to reconstruct from scratch the edifice of an authentic African philosophy that would be critical, individualistic and universal. They hold that the edifice of ethnophilosophy, which they had demolished in the middle period, contained no critical rigor. Some of the members of this movement include Kwasi Wiredu, Olusegun Oladipo, Kwame Appiah, V. Y. Mudimbe, D. A. Masolo, Odera Oruka and, in some ways, Barry Hallen and J. O. Sodipo. Their opponents are the Afro-Eclectics, who evolved from the Afro-constructionism of the middle period. Unable to sustain their advocacy and the structure of ethnophilosophy they had constructed, they stepped down a little to say, “Maybe we can meaningfully combine some of the non-conflicting concerns of the Traditionalists and the Modernists.” They say (1) that it is a fact that African traditional philosophy is not rigorous enough, as claimed by the Modernists; (2) that it is also a fact that the deconstructionist program of the Modernists did not offer, and is incapable of offering, an alternative episteme; and (3) that maybe the rigor of the Modernists can be applied to the usable and relevant elements produced by the Traditionalists to produce the much elusive, authentic African philosophy. African philosophy, for this movement, therefore becomes a product of synthesis, resulting from the application of the tools of critical reasoning to the relevant traditions of the African life-world. A. F. Uduigwomen, Kwame Gyekye, Ifeanyi Menkiti, Kwame Appiah, Godwin Sogolo and Jay van Hook are some of the members of this movement. This movement played a vital reconciliatory role, the importance of which was not fully realized in African philosophy. Most importantly, they found a way out and laid the foundation for the emergence of Conversationalism. Members of this twin-movement thrived in the later period of African philosophy.
d. Conversationalism
The Conversationalists are those who seek to create an enduring corpus in African philosophy by engaging elements of tradition and individual thinkers in critical conversations. They emphasize originality, creativity, innovation, peer-criticism and cross-pollination of ideas in prescribing and evaluating their ideas. They hold that a new episteme in African philosophy can only be created by individual African philosophers who make use of the “usable past” and the depth of individual originality in finding solutions to contemporary demands. They do not lay emphasis on analysis alone but also on critical rigor and what is now called arumaristics—a creative reshuffling of theses and anti-theses that spins out new concepts and thoughts. Further, their methodological ambience features principles such as relationality, contextuality and complementarity. Members of this movement thrive in the contemporary period, and their school can be called the conversational school. Some of the philosophers that have demonstrated this trait include Pantaleon Iroegbu, Innocent Asouzu, Chris Ijiomah, Godfrey Ozumba, Andrew Uduigwomen, Bruce Janz, Jonathan Chimakonam, Fainos Mangena, Jennifer Lisa Vest, L. Uchenna Ogbonnaya, Maduka Enyimba, Leyla Tavernaro-Haidarian, Aribiah Attoe, and so forth.
6. Epochs in African Philosophy
Various historians of African philosophy have delineated the historiography of African philosophy differently. Most, like Obenga, Abanuka, Okoro, Oguejiofor, Graness, Fayemi, etc., have merely adapted the Western periodization model of ancient, medieval, modern and contemporary. But there is a strong objection to this model. Africa, for example, did not experience the medieval age as Europe did. The intellectual history of the ancient period of Europe shares little in common with that of ancient Africa. The same goes for the modern period. In other words, the names ancient, medieval and modern refer to actual historical periods in Europe with specific features in their intellectual heritage, which share very little in common with the same dates in Africa. It thus makes no historical, let alone philosophical, sense to adopt such a model for African philosophy. Here, we have a classic case of what Innocent Asouzu calls “copycat philosophy”, which must be rejected. The conversationalists, therefore, propose a different model, one that actually reflects the true state of things. In this model, there are two broad categorizations, to wit: the Pre-systematic epoch and the Systematic epoch. The latter is further divided into four periods: the early, middle, later and contemporary periods.
a. Pre-systematic Epoch
This refers to the era from the time of the first Homo sapiens to the 1900s. African philosophers studied here are those Innocent Asouzu describes as the “Anonymous Traditional African Philosophers”, whose names have been lost to history. They may also include the ancient Egyptians, Ethiopians and Africans who thrived in Europe in that era. The controversies surrounding the nativity of the philosophies of St Augustine, Anton Amo, the Egyptian question, etc., may also be included.
b. Systematic Epoch
This refers to the era from the 1920s to date, when systematicity, involving academic training, writing, publishing, engagements, etc., inspired by African conditions and geared towards addressing those conditions, became central to philosophical practice in Africa south of the Sahara. This epoch can further be delineated into four periods: the early, middle, later and contemporary periods.
7. Periods of African Philosophy
a. Early Period
The early period of African philosophy was the era of the movement called cultural/ideological excavation, aimed at retrieving and reconstructing African identity. The schools that emerged and thrived in this period were the ethnophilosophy and ideological/nationalist schools. Hegel wrote that the Sub-Saharan Africans had no high cultures and made no contributions to world history and “civilization” (1975: 190). Lucien Lévy-Bruhl also suggested that they were “pre-logical” (1947: 17). The summary of these two positions, which represent the colonial mindset, is that Africans have no dignified identity like their European counterparts. This could be deciphered in the British colonial system, which sought to erode the native thought system in the constitution of social systems in their colonies, and also in the French policy of assimilation. Assimilation is a concept credited to Chris Talbot (1837) that rests on the idea of expanding French culture to the colonies outside of France in the 19th and 20th centuries. According to Betts (2005: 8), the natives of these colonies were considered French citizens as long as “French culture” and customs were adopted to replace the indigenous system. The purpose of the theory of assimilation, for Michael Lambert, therefore, was to turn African natives into Frenchmen by educating them in the French language and culture (1993: 239-262).
During colonial times, the British, for example, educated their colonies in the British language and culture, strictly undermining the native languages and cultures. The products of this new social system were then given the impression that they were British, though second class; the king was their king, and the empire was also theirs. Suddenly, however, colonialism ended, and they found, to their chagrin, that they were treated as slave countries in the new post-colonial order. Their native identity had been destroyed, and their fake British identity had also been taken from them; what was left was amorphous and corrupt. It was in the heat of this confusion and frustration that African philosophers sought to retrieve and recreate the original African identity lost in the event of colonization. Ruch and Anyanwu, therefore, ask, “What is this debate about African identity concerned with and what led to it? In other words, why should Africans search for their identity?” Their response to the questions is as follows:
The simple answer to these questions is this: Africans of the first half of this (20th century) century have begun to search for their identity, because they had, rightly or wrongly, the feeling that they had lost it or that they were being deprived of it. The three main factors which led to this feeling were: slavery, colonialism and racialism. (1981: 184-85)
Racialism, as Ruch and Anyanwu believed, may have sparked it off, and slavery may have dealt the heaviest blow, but it was colonialism that entrenched it. Ironically, it was the same colonialism, at its conclusion, that opened the eyes of the Africans by stirring the hornet’s nest. An African can never be British or French, even with the colonially imposed language and culture. With this shock, the post-colonial African philosophers of the early period set out in search of Africa’s lost identity.
George G. M. James, in 1954, published his monumental work Stolen Legacy. In it, he attempted to prove that the Egyptians were the true authors of Western philosophy; that Pythagoras, Socrates, Plato and Aristotle plagiarized the Egyptians; that the authorship of the individual doctrines of the Greek philosophers is mere speculation perpetuated chiefly by Aristotle and executed by his school; and that the African continent gave the world its civilization, knowledge, arts and sciences, religion and philosophy, a fact that is destined to produce a change in the mentality both of the European and African peoples. In G. M. James’ words:
In this way, the Greeks stole the legacy of the African continent and called it their own. And as has already been pointed out, the result of this dishonesty had been the creation of an enormous world opinion; that the African continent has made no contribution to civilization, because her people are backward and low in intelligence and culture…This erroneous opinion about the Black people has seriously injured them through the centuries up to modern times in which it appears to have reached a climax in the history of human relations. (1954: 54)
These robust intellectual positions, supported by evidence and well-thought-out arguments, quickly heralded a shift in the intellectual culture of the world. However, there was one problem George James could not fix: he could not prove that the people of North Africa (Egyptians), who were the true authors of ancient art, sciences, religion and philosophy, were high-melanated Africans, as can be seen in his hopeful but inconsistent conclusions:
This is going to mean a tremendous change in world opinion, and attitude, for all people and races who accept the new philosophy of African redemption, i.e. the truth that the Greeks were not the authors of Greek philosophy; but the people of North Africa; would change their opinion from one of disrespect to one of respect for the black people throughout the world and treat them accordingly. (1954: 153)
It is unclear how the achievements of the North Africans (Egyptians) can redeem the black Africans. This is also the problem with Henri Olela’s article “The African Foundations of Greek Philosophy”.
However, in Onyewuenyi’s The African Origin of Greek Philosophy, an ambitious attempt emerges to fill this lacuna in the argument for a new philosophy of African redemption. In the first part of chapter two, he reduced Greek philosophy to Egyptian philosophy, and in the second part, he attempted to further reduce the Egyptians of the time to high-melanated Africans. There are, however, two holes he could not fill. First, Egypt is the world’s oldest standing country, one that tells its own story, in its own voice, in different forms; at no point did the Egyptians or other historians describe them as a wholly high-melanated people. Second, if the Egyptians were at one time wholly high-melanated, why are they now wholly low-melanated? Given the failure of this group of scholars to prove that high-melanated Africans were the authors of Egyptian philosophy, one must abandon the Egyptian legacy or discuss it as one of the precursor arguments to systematic African philosophy until more evidence emerges.
There are other scholars of the early period who tried more reliable ways to assert African identity by establishing a native African philosophical heritage. Some examples include JB Danquah, who produced a text, The Akan Doctrine of God (1944); Meinrad Hebga (1958), who wrote “Logic in Africa”; and SK Akesson, who published “The Akan Concept of Soul” (1965). Another is Tempels, who authored Bantu Philosophy (1959). They all proved that rationality was an important feature of traditional African culture. By systematizing Bantu philosophical ideas, Tempels confronted the racist orientation of the West, which depicted Africa as a continent of semi-humans. In fact, Tempels showed latent similarities in the spiritual inclinations of the Europeans and their African counterparts. In the opening passage of his work he observed that the European who has taken to atheism quickly returns to a Christian viewpoint when suffering or pain threatens his survival. In much the same way, he says, the Christian Bantu returns to the ways of his ancestors when confronted by suffering and death. So, spiritual orientation or thinking is not found only in Africa.
In his attempt to explain the Bantu understanding of being, Tempels admits that this might not be the same as the understanding of the European. Instead, he argues that the Bantu construction is as much rational as that of the European. In his words:
So, the criteriology of the Bantu rests upon external evidence, upon the authority and dominating life force of the ancestors. It rests at the same time upon the internal evidence of experience of nature and of living phenomena, observed from their point of view. No doubt, anyone can show the error of their reasoning; but it must none the less be admitted that their notions are based on reason, that their criteriology and their wisdom belong to rational knowledge. (1959: 51)
Tempels obviously believes that the Bantu, like the rest of the African peoples, possess rationality, which undergirds their philosophical enterprise. The error in their reasoning is only obvious in the light of European logic. But Tempels was mistaken in his supposition that the Bantu system is erroneous. The Bantu categories only differ from those of the Europeans in terms of logic, which is why a first-time European onlooker would misinterpret them as irrational or spiritual. Hebga demonstrates this and suggests the development of African logic. Thus, the racist assumption that Africans are less intelligent, which Tempels rejected with one hand, was smuggled in with the other. For this, and for his other errors, such as his depiction of Bantu ontology in terms of vital force and his arrogant claim that the Bantu could not write their philosophy, requiring the intervention of the European, some African philosophers, like Paulin Hountondji and Innocent Asouzu to name just a few, criticized Tempels. Asouzu, for one, speaks of what he calls the “Tempelsian Damage” in African philosophy to refer to the undue and erroneous influence which Bantu Philosophy has had on contemporary Africans. For example, Tempels makes a case for Africa’s true identity, which, for him, could be found in African religion, within which African philosophy (ontology) is subsumed. In his words, “being is force, force is being”. This went on to influence the next generation of African philosophers, like the Rwandan Alexis Kagame. Kagame’s work The Bantu-Rwandan Philosophy (1956) offers similar arguments, thus further strengthening the claims made by Tempels, especially from an African’s perspective. The major criticism against their industry remains the association of their thoughts with ethnophilosophy, where ethnophilosophy is a derogatory term. A much more pointed criticism was offered recently by Innocent Asouzu in his work Ibuanyidanda: New Complementary Ontology (2007).
His criticism was not directed at the validity of the thoughts they expressed, or at whether Africa could boast of a rational enterprise such as philosophy, but at the logical foundation of their thoughts. Asouzu seems to quarrel with Tempels for allowing his native Aristotelian orientation to influence his construction of African philosophy, and lambasts Kagame for following suit instead of correcting Tempels’ mistake. The principle of bivalence, evident in the Western thought system, lay in the background of their constructions.
Another important philosopher in this period is John Mbiti. His work African Religions and Philosophy (1969) avidly educated those who doubted that Africans possessed their own identities before the arrival of the European, by excavating and demonstrating the rationality in the religious and philosophical enterprises in African cultures. He boldly declared: “We shall use the singular, ‘philosophy’ to refer to the philosophical understanding of African peoples concerning different issues of life” (1969: 2). His presentation of time in African thought shows the pattern of excavation in his African philosophy. Although his studies focus primarily on the Kikamba and Gikuyu tribes of Africa, he observes that there are similarities in many African cultures, just as Tempels did earlier. He subsumes African philosophy in African religion on the assumption that African peoples do not know how to exist without religion. This idea is also shared by William Abraham in his book The Mind of Africa, as well as by Tempels’ Bantu Philosophy. African philosophy, from Mbiti’s treatment, could be likened to Tempels’ vital force, of which African religion is the outer cloak. The obvious focus of this book is on African views about God, political thought, the afterlife, culture or world-view, and creation; the philosophical aspects lie within these religious over-coats. Thus, Mbiti establishes that the true, and lost, identity of the African could be found within his religion. Another important observation Mbiti made was that this identity is communal and not individualistic. Hence, he states, “I am because we are and since we are therefore I am” (1969: 108). Therefore, the African has to re-enter his religion to find his philosophy and his community to find his identity.
But just like Kagame, Mbiti was unduly and erroneously influenced both by Tempels and the Judeo-Christian religion in accepting the vital force theory and in cloaking the African God with the attributes of the Judeo-Christian God.
This is a view shared by William Abraham. Abraham shares Tempels’ and Mbiti’s view that African peoples have many similarities in their cultures, though his own studies focus on the culture and political thought of the Akan of present-day Ghana. Another important aspect of Abraham’s work is that he subsumed African philosophical thought under African culture, taking, as Barry Hallen described, “an essentialist interpretation of African culture” (2002: 15). Thus for Abraham, as for Tempels and Mbiti, the lost African identity could be found in the seabed of African indigenous culture, in which religion features prominently.
On the other hand, there were those who sought to retrieve and re-establish Africa’s lost identity through economic and political means. Names discussed here include Kwame Nkrumah, Leopold Senghor and Julius Nyerere. These actors felt that the African could never be truly decolonized unless he found his own system of living and social organization; one cannot be African while living like the European. The question that guided their work therefore became: “What system of economic and social engineering will suit us and project our true identity?” Nkrumah advocated African socialism, which, according to Barry Hallen, is an original social, political and philosophical theory of African origin and orientation. This system is forged from the traditional, communal structure of African society, a view strongly projected by Mbiti. Like Amilcar Cabral and Julius Nyerere, Nkrumah suggests that a return to the African cultural system, with its astute moral values, communal ownership of land, and humanitarian social and political engineering, holds the key to Africa’s rediscovery of her lost identity. Systematizing this process will yield what he calls the African brand of socialism. In most of his books, he projects the idea that Africa’s lost identity is to be found in African native culture, within which lie African philosophical thought and an identity shaped by communal orientation. Some of his works include Neo-colonialism: The Last Stage of Imperialism (1965), I Speak of Freedom: A Statement of African Ideology (1961), Africa Must Unite (1970), and Consciencism (1965).
Leopold Sedar Senghor of Senegal charted a course similar to Nkrumah’s. In his works Negritude et Humanisme (1964) and Negritude and the Germans (1967), Senghor traced Africa’s philosophy of social engineering to African culture, which he said is communal and laden with brotherly emotion. This differs from the European system, which he says is individualistic, having been marshaled purely by reason. He opposed the French colonial principle of assimilation, which aimed at turning Africans into Frenchmen by eroding and replacing African culture with French culture. African culture and languages are the bastions of African identity, and it is in this culture that he found the pedestal for constructing a political ideology that would project African identity. Senghor agrees with Nkrumah, Mbiti, Abraham and Tempels in many ways, especially with regard to the basis of Africa’s true identity.
Julius Nyerere of Tanzania is another philosopher of note in the early period of African philosophy. In his books Uhuru na Ujamaa: Freedom and Socialism (1964) and Ujamaa: The Basis of African Socialism (1968), he sought to retrieve and establish Africa’s true identity through economic and political means. For him, Africans cannot regain their identity unless they are first free, and freedom (Uhuru) transcends independence: cultural imperialism has to be overcome. And what better way to achieve this than by developing a socio-political and economic ideology from the petals of African native culture and the traditional values of togetherness and brotherliness? Hence, Nyerere proposes Ujamaa, meaning familyhood: the “being-with” philosophy, or the “we” spirit instead of the “I” spirit (Okoro 2004: 96). In the words of Barry Hallen, “Nyerere argued that there was a form of life and system of values indigenous to the culture of pre-colonial Africa, Tanzania in particular, that was distinctive if not unique and that had survived the onslaughts of colonialism sufficiently intact to be regenerated as the basis for an African polity” (2002: 74). Thus for Nyerere, the basis of African identity is African culture, which is communal rather than individualistic. Nyerere was in agreement with the other actors of this period on the path to full recovery of Africa’s lost identity. Philosophers of this era not treated here include Aime Cesaire, Nnamdi Azikiwe, Obafemi Awolowo, Amilcar Cabral, and the two foreigners, Janheinz Jahn and Marcel Griaule.
b. Middle Period
The middle period of African philosophy was also the era of the twin movements called Afro-constructionism and Afro-deconstructionism, otherwise called the Great Debate, when two rival schools, the Traditionalists and the Universalists, clashed. While the Traditionalists sought to construct an African identity based on excavated African cultural elements, the Universalists sought to demolish that architectonic structure by associating it with ethnophilosophy. The schools that thrived in this era include the Philosophic Sagacity, Professional/Modernist/Universalist, Hermeneutical and Literary schools.
An important feature of the early period was that the thoughts on Africa’s true identity generated arguments that fostered the emergence of the middle period of African philosophy. These arguments resulted from questions that can be summarized as follows: (1) Is it proper to take for granted the sweeping assertion that all of Africa’s cultures share a few basic elements in common? It was this assumption that had necessitated the favorite phrase of the early period, “African philosophy,” rather than “African philosophies.” (2) Does Africa or African culture contain a philosophy in the strict sense of the term? (3) Can African philosophy emerge from the womb of African religion, world-view and culture? Answers, and objections to those answers, soon took the shape of a debate, characterizing the middle period as the era of the Great Debate in African philosophy.
This debate was between members of Africa’s new crop of intellectual radicals: on the one hand the demoters, and on the other the promoters, of the African philosophy established by the league of early-period intellectuals. The former sought to criticize this new philosophy of redemption, gave it the derogatory tag “ethnophilosophy,” and consequently denigrated the African identity founded on it as a savage and primitive identity. At the other end, the promoters sought to clarify and defend this philosophy and to justify the African identity rooted in it as true and original.
For clarity, the assessment of the debate era will begin from the middle instead of the beginning. In 1978, Odera Oruka, a Kenyan philosopher, presented a paper at the William Amo Symposium held in Accra, Ghana, titled “Four Trends in Current African Philosophy,” in which he grouped the voices on African philosophy into four schools, namely ethnophilosophy, philosophic sagacity, the nationalistic-ideological school and professional philosophy. In 1990 he wrote another work, Sage Philosophy: Indigenous Thinkers and the Modern Debate on African Philosophy, in which he added two further schools, bringing the number to six: the hermeneutic and the artistic/literary schools.
Those who uphold philosophy in African culture are the ethnophilosophers, and they include the actors treated as members of the early period of African philosophy and their followers or supporters in the Middle Period. Some would include C. S. Momoh, Joseph Omoregbe, Lansana Keita, Olusegun Oladipo, Gordon Hunnings, Kwame Gyekye, M. A. Makinde, Emmanuel Edeh, Uzodinma Nwala, K. C. Anyanwu and later E. A. Ruch, to name a few. The philosophic sagacity school, to which Oruka belongs, also accommodates C. S. Momoh, C. B. Nze, J. I. Omoregbe, C. B. Okolo and T. F. Mason. The nationalist-ideological school consists of those who sought to develop indigenous socio-political and economic ideologies for Africa. Prominent members include Julius Nyerere, Leopold Senghor, Kwame Nkrumah, Amilcar Cabral, Nnamdi Azikiwe and Obafemi Awolowo. The professional philosophy school insists that African philosophy must be done with professional philosophical methods such as analysis, critical reflection and logical argumentation, as it is in Western philosophy. Members of this school include: Paulin Hountondji, Henri Maurier, Richard Wright, Peter Bodunrin, Kwasi Wiredu, early E. A. Ruch, R. Horton, and later C. B. Okolo. The hermeneutic school recommends interpretation as a method of doing African philosophy. A few of its members include Theophilus Okere, Okonda Okolo, Tsenay Serequeberhan, Godwin Sogolo and partly J. Sodipo and B. Hallen. The Artistic/Literary school philosophically discusses the core of African norms in literary works, and includes Chinua Achebe, Okot P’Bitek, Ngugi wa Thiong’o, Wole Soyinka, Elechi Amadi and F. C. Ogbalu.
Also, in 1989, C. S. Momoh, in his edited volume The Substance of African Philosophy, outlined five schools, namely African logical neo-positivism, the colonial/missionary school of thought, the Egyptological school, the ideological school and the purist school. His article was titled “Nature, Issues and Substance of African Philosophy” and was reproduced in Jim Unah’s Metaphysics, Phenomenology and African Philosophy (1996).
In comparing Momoh’s delineations with Oruka’s, it can be said that the purist school encompasses Oruka’s ethnophilosophy, artistic/literary school and philosophic sagacity; African logical neo-positivism encompasses the professional philosophy and hermeneutical schools; and the ideological and colonial/missionary schools correspond to Oruka’s nationalistic-ideological school. The Egyptological school, therefore, has no counterpart in Oruka’s scheme. Momoh describes it as a school that regards African philosophy as synonymous with Egyptian philosophy or, at least, as originating from it. The view of Egyptian philosophy as a product of African philosophy is expressed in the writings of George James, I. C. Onyewuenyi and Henry Olela.
Welding all these divisions together are the perspectives of Peter Bodunrin and Kwasi Wiredu. In the introduction to his 1985 edited volume Philosophy in Africa: Trends and Perspectives, Bodunrin collapsed all the subdivisions in both Oruka and Momoh into two broad schools, namely the Traditionalist and Modernist schools. While the former includes Africa’s rich culture and past in the mainstream of African philosophy, the latter excludes them. Kwasi Wiredu made the same type of division, into Traditionalists and Modernists, in his paper “On Defining African Philosophy” in C. S. Momoh’s (1989) edited volume. A. F. Uduigwomen likewise created two broad schools, the Universalists and the Particularists, in his “Philosophy and the Place of African Philosophy” (1995); these can be equated to Bodunrin’s Modernist and Traditionalist schools, respectively. The significance of Uduigwomen’s contribution to the Great Debate rests on the new school he developed from a compromise between the Universalist and Particularist schools (1995/2009: 2-7). As Uduigwomen defines it, the Eclectic school accommodates discourses pertaining to African experiences, culture and world-view as parts of African philosophy, provided those discourses are critical, argumentative and rational. In other words, the so-called ethnophilosophy can comply with the analytic and argumentative standards that people like Bodunrin, Hountondji and Wiredu insist upon. Some later African philosophers revived Uduigwomen’s Eclectic school as a much more decisive approach to African philosophy (Kanu 2013: 275-87). It is the era dominated by Eclecticism and meta-philosophy that is tagged the ‘Later Period’ in the history of African philosophy. For perspicuity, therefore, the debate between these two broad schools shall be addressed as the perspectives of the Traditionalist or Particularist and the Modernist or Universalist.
The reader will by now have understood the perspectives from which the individual philosophers of the middle period debated. Hence, when Richard Wright published his critical essay “Investigating African Philosophy” and Henri Maurier published his “Do We Have an African Philosophy?”, denying the existence of African philosophy, at least as yet, the reader understands why Lansana Keita’s “The African Philosophical Tradition,” C. S. Momoh’s “African Philosophy… Does It Exist?” and J. I. Omoregbe’s “African Philosophy: Yesterday and Today” were offered as critical responses. Wright arrived at the conclusion that the problems surrounding the study of African philosophy were so great that they effectively prevented any worthwhile work until their resolution, while Maurier responded to the question “Do we have an African Philosophy?” with “No! Not Yet!” (1984: 25). One can therefore understand why Lansana Keita took it upon himself to provide concrete evidence that Africa had, and still has, a philosophical tradition. In his words:
It is the purpose of this paper to present evidence that a sufficiently firm literate philosophical tradition has existed in Africa since ancient times, and that this tradition is of sufficient intellectual sophistication to warrant serious analysis…it is rather…an attempt to offer a defensible idea of African philosophy. (1984: 58)
Keita went on in that paper to excavate intellectual resources to prove his case, but it was J. I. Omoregbe who tackled the demoters on every front. Of particular interest are his critical commentaries on the position of Kwasi Wiredu and others who share Wiredu’s opinion that what is called African philosophy is not philosophy but, at best, community thought. Omoregbe argues that the logic and method of African philosophy need not be the same as those of Western philosophy, to which the demoters cling. In his words:
It is not necessary to employ Aristotelian or the Russellian logic in this reflective activity before one can be deemed to be philosophizing. It is not necessary to carry out this reflective activity in the same way that the Western thinkers did. Ability to reason logically and coherently is an integral part of man’s rationality. The power of logical thinking is identical with the power of rationality. It is therefore false to say that people cannot think logically or reason coherently unless they employ Aristotle’s or Russell’s form of logic or even the Western-type argumentation. (1998: 4-5)
Omoregbe was addressing the position of most members of the Modernist school who believed that African philosophy must follow the pattern of Western philosophy if it were to exist. As he cautions:
Some people, trained in Western philosophy and its method, assert that there is no philosophy and no philosophizing outside the Western type of philosophy or the Western method of philosophizing (which they call “scientific” or “technical”). (1998: 5)
Philosophers like E. A. Ruch in some of his earlier writings, Peter Bodunrin, C. B. Okolo, and Robin Horton were direct recipients of Omoregbe’s criticism. Horton’s “African Traditional Thought and Western Science” is a two-part essay that sought, in the long run, to expose the rational ineptitude in African thought. On the question of logic in African philosophy, Horton’s “Traditional Thought and the Emerging African Philosophy Department: A Comment on the Current Debate” first stirred the hornet’s nest and was ably challenged by Gordon Hunnings’ “Logic, Language and Culture,” as well as by Omoregbe’s “African Philosophy: Yesterday and Today.” Earlier, Meinrad Hebga’s “Logic in Africa” had made insightful ground-clearing on the matter. Later, C. S. Momoh’s “The Logic Question in African Philosophy,” Udo Etuk’s “The Possibility of an African Logic” and Jonathan C. Okeke’s “Why Can’t There Be an African Logic?” made their own impressions. This logic question is gathering new momentum in African philosophical discourse: Jonathan O. Chimakonam (2020) has put together an edited collection that compiles some of the seminal essays in the logic question debate.
On the philosophical angle, Kwasi Wiredu’s “How Not to Compare African Traditional Thought with Western Thought” responded to Robin Horton’s lopsided earlier effort but ended up making its own criticisms of the status of African philosophy, which, for Wiredu, was yet to attain maturation. In his words, “[M]any traditional African institutions and cultural practices, such as the ones just mentioned, are based on superstition. By ‘superstition’ I mean a rationally unsupported belief in entities of any sort” (1976: 4-8 and 1995: 194). In his Philosophy and an African Culture, Wiredu was more pungent. He caricatured much of the discourse on African philosophy as community thought or folk thought unqualified to be called philosophy. For him, there had to be a practised distinction between “African philosophy as folk thought preserved in oral traditions and African philosophy as critical, individual reflection, using modern logical and conceptual techniques” (1980: 14). Olusegun Oladipo supports this in his Philosophy and the African Experience. As he puts it:
But this kind of attitude is mistaken. In Africa, we are engaged in the task of the improvement of “the condition of men”. There can be no successful execution of this task without a reasonable knowledge of, and control over, nature. But essential to the quest for knowledge of, and control over, nature are “logical, mathematical and analytical procedures” which are products of modern intellectual practices. The glorification of the “unanalytical cast of mind” which a conception of African philosophy as African folk thought encourages, would not avail us the opportunity of taking advantage of the theoretical and practical benefits offered by these intellectual procedures. It thus can only succeed in making the task of improving the condition of man in Africa a daunting one. (1996: 15)
Oladipo shares similar thoughts in his The Idea of African Philosophy. African philosophy, for some of the Modernists, is practised in a debased sense, a position the Traditionalists consider opinionated. The later E. A. Ruch and K. C. Anyanwu, in their African Philosophy: An Introduction to the Main Philosophical Trends in Contemporary Africa, attempt to excavate the philosophical elements in folklore and myth. C. S. Momoh’s “The Mythological Question in African Philosophy” and K. C. Anyanwu’s “Philosophical Significance of Myth and Symbol in Dogon World-View” further reinforced the position of the Traditionalists (cf. Momoh 1989 and Anyanwu 1989).
However, it took Paulin Hountondji, in his African Philosophy: Myth and Reality, to drive a long nail into the coffin. African philosophy, for him, must be done in the same frame as Western philosophy, with its principles, methodologies and all. K. C. Anyanwu, for his part, admitted that Western philosophy is one of the challenges facing African philosophy, but argued that this only calls for the systematization of African philosophy, not its decimation. He made these arguments in his paper “The Problem of Method in African Philosophy.”
Other arguments set Greek standards for authentic African philosophy, as can be found in Odera Oruka’s “The Fundamental Principles in the Question of ‘African Philosophy’ (I)” and Hountondji’s “African Wisdom and Modern Philosophy.” They readily met with Lansana Keita’s “African Philosophical Systems: A Rational Reconstruction,” J. Kinyongo’s “Philosophy in Africa: An Existence” and even P. K. Roy’s “African Theory of Knowledge.” For every step the Modernists took, the Traditionalists replied with two, a pattern that lingered till the early 1990s, when a certain disillusionment began to set in and quell the debate. Actors on both fronts had only then begun to reach a new consciousness, realizing that a step had to be taken beyond the debate. Even Kwasi Wiredu, who had earlier justified the debate with his insistence that “without argument and clarification, there is strictly no philosophy” (1980: 47), had to admit that it was time to do something else. For him, African philosophers had to go beyond talking about African philosophy and get down to actually doing it.
It was with this new orientation, which emerged from the disillusionment of the protracted debate, that the later period of African philosophy was born in the 1980s. As an Igbo proverb has it, the music makers almost unanimously were changing the rhythm, and the dancers had to change their dance steps. One of the high points of the disillusionment was the emergence of the Eclectic school in the next period, the Later Period of African philosophy.
c. Later Period
This period of African philosophy heralds the emergence of movements that can be called Critical Reconstructionism and Afro-Eclecticism. For the Deconstructionists of the middle period, the focus shifted from deconstruction to the reconstruction of an African episteme in a universally integrated way; for the Eclectics, finding a reconcilable middle path between traditional African philosophy and modern African philosophy was paramount. Thus they advocate a shift from entrenched ethnophilosophy and the universalist hue toward the reconstruction of an African episteme, one distinct both from imposed Westernism and from uncritical ethnophilosophy. Both the Critical Reconstructionists and the Eclectics, then, advocate one form of reconstruction or the other: the former desire a new episteme untainted by ethnophilosophy, while the latter press for a reconciliation of central and relevant ideals.
Not knowing how to proceed with this task was a telling problem for all advocates of critical reconstruction in African philosophy, such as V. Y. Mudimbe, Ebousi Boulaga, Olusegun Oladipo, Franz Crahey, Jay van Hook, Godwin Sogolo, and Marcien Towa, to name a few. At the dawn of the era, these African legionnaires pointed out, in different terms, that reconstructing the African episteme was imperative. But more urgent was the need first to analyse the haggard philosophical structure patched into existence with the cement of perverse dialogues. It appeared inexorable to these scholars and others of the time that none of this could succeed outside the shadow of Westernism. For whatever one writes, if it is effectively free from ethnophilosophy, then it is either contained in Western discourse or, at the very least, proceeds from its logic. And if it is already contained in the Western narrative or proceeds from its logic, what then makes it African? This became something of a dead-end for this illustrious group, whose positions struggled to evolve further.
Intuitively, almost every analyst knows that discussing what has been discussed in Western philosophy, or taking one’s cue from Western philosophy, does not absolutely negate or vitiate what is produced as African philosophy. But how is this to be effectively justified? This appears to be the Achilles heel of the Critical Reconstructionists of the later period in African philosophy. The massive failure of the Critical Reconstructionists to go beyond recommendation and actually engage in reconstruction delayed their emergence as a school of thought in African philosophy. The diversionary trend that occurred at this point ensured that the later period, which began with the two rival camps of Critical Reconstructionists and Eclectics, ended with only the Eclectics left standing. Thus, dying in embryo, Critical Reconstructionism was absorbed into Eclecticism.
The campaign for Afro-reconstructionism had first emerged in the late 1980s in the writings of Peter Bodunrin, Kwasi Wiredu, V. Y. Mudimbe and Lucius Outlaw, and much later in those of Godwin Sogolo, Olusegun Oladipo and Jay van Hook, even though principals like Marcien Towa and Franz Crahey had hinted at it much earlier. The insights of the latter two never rang bells beyond the earshot of identity reconstruction, which was the echo of their time. Wiredu’s cry for conceptual decolonization and Hountondji’s call for the abandonment of the ship of ethnophilosophy were in the spirit of Afro-reconstructionism of the episteme. Yet none of the Afro-reconstructionists except Wiredu was able truly to chart a course for reconstruction. His was linguistic, even though the significance of his campaign was never truly appreciated. His 1998 work “Toward Decolonizing African Philosophy and Religion” was a clearer recapitulation of his works of the preceding years.
Beyond this modest line, no other reconstructionist crusader of the time actually went beyond deconstruction and problem identification. Almost spontaneously, Afro-reconstructionism evolved into Afro-eclecticism in the early 1990s, when the emerging Critical Reconstructionism ran into a brick wall of inactivity. The argument seems to say: ‘If it is not philosophically permissible to employ a logic or methods different from those of the West, perhaps we can make do with a merger of the approaches we have identified in African philosophy following the deconstructions.’ These approaches are the various schools of thought, from ethnophilosophy, philosophic sagacity, the ideological school and the universalist school to the literary and hermeneutic schools, which were deconstructed into two broad approaches, namely the traditionalist and the modernist schools, also called the particularist and the universalist schools.
Eclectics, therefore, are those who think that the effective integration or complementation of the African native system and the Western system could produce a viable synthesis that is first African and then modern. Andrew Uduigwomen, the Nigerian philosopher, could be regarded as the founder of this school in African philosophy. In his 1995 work “Philosophy and the Place of African Philosophy,” he gave official birth to Afro-eclecticism. Identifying the Traditionalist and Modernist schools as the Particularist and Universalist schools, he created the eclectic school by carefully unifying their goals from the ruins of the deconstructed past.
Uduigwomen states that the eclectic school holds that an intellectual romance between the Universalist conception and the Particularist conception will give rise to an authentic African philosophy. The Universalist approach will provide the necessary analytic and conceptual framework for the Particularist school. Since, according to Uduigwomen, this framework cannot thrive in a vacuum, the Particularist approach will, in turn, supply the raw materials or indigenous data needed by the Universalist approach. From the submission of Uduigwomen above, one easily detects that eclecticism for him entails employing Western methods in analyzing African cultural paraphernalia.
However, Afro-Eclecticism is not without problems. The first is that Uduigwomen did not supply a yardstick for determining what is to be admitted into, and what must be left out of, the corpus of African tradition. Not everything can meet the standard of genuine philosophy, nor should the philosophical selection be arbitrary. Hountondji, a persistent critic of traditionalist efforts, once called Tempels’ Bantu philosophy a sham: for him, it was not African or Bantu philosophy but Tempels’ philosophy dressed in African paraphernalia. This criticism could be extended to the vision of Afro-eclecticism. On the contrary, it could be argued that if Hountondji agrees that the synthesis contains even a little African paraphernalia, then it is something new and, in this respect, can claim the tag of African philosophy. It remains to be proven, however, how philosophical that little African paraphernalia is.
Other notable eclectics include Batholomew Abanuka, Udobata Onunwa, C. C. Ekwealor and, much later, Chris Ijiomah. Abanuka posits in his 1994 work that a veritable way to do authentic African philosophy would be to recognize the unity of individual things and, by extension, of theories in ontology, epistemology and ethics. There is a basic identity among these because they are connected and can be unified. Following C. S. Momoh (1985: 12), Abanuka went on in A History of African Philosophy to argue that synthesis should be the ultimate approach to doing African philosophy. This position is shared by Onunwa on a micro level: he says that realities in the African world-view are inter-connected and inter-dependent (1991: 66-71). Ekwealor and Ijiomah also believe in synthesis, noting that these realities are broadly dualistic, being physical and spiritual (cf. Ekwealor 1990: 30 and Ijiomah 2005: 76 and 84). It would therefore be an anomaly to think of African philosophy as chiefly an exercise in analysis rather than synthesis. The ultimate methodological approach to African philosophy has to reflect a unity of methods above all else.
Eclecticism survived into the contemporary period of African philosophy in conversational forms. Godfrey Ozumba and Jonathan Chimakonam on Njikoka philosophy, E. G. Ekwuru and later Innocent Egwutuorah on Afrizealotism, and even Innocent Asouzu on Ibuanyidanda ontology are all, in a small way, various forms of eclectic thinking. However, these theories are grouped in the New Era, owing both to the time of their emergence and to the robust conversational structure they have.
The purest development of eclectic thinking in the later period could be found in Pantaleon Iroegbu’s Uwa Ontology. He posits uwa (worlds) as an abstract generic concept with fifteen connotations and six zones. Everything is uwa, is in uwa and can be known through uwa. For him, while the fifteen connotations are the different senses and aspects which the concept of uwa carries in African thought, the six zones are the spatio-temporal locations of the worlds in terms of their inhabitants. He adds that these six zones are dualistic, comprising the earthly and the spiritual; they are also dynamic and mutually related. Thus, Iroegbu suggests that the approach to authentic African philosophy could consist in the conglomeration of uwa. This demonstrates a veritable eclectic method in African philosophy.
One of the major hindrances of the eclecticism of the later period is that it leads straight to applied philosophy. Following this approach almost makes it impossible for later thinkers to do original and abstract philosophizing for its own sake. Eclectic theories and methods confine one to their internal dynamics, on the belief that for a work to be regarded as authentic African philosophy it must follow the rules of Eclecticism. The wider implication is that while creativity might blossom, innovation and originality are stifled. Because of pertinent problems such as these, further evolutions in African philosophy became inevitable. The Kenyan philosopher Odera Oruka had magnified the case for individual rather than group philosophizing, a case variously made earlier by Peter Bodunrin, Paulin Hountondji and Kwasi Wiredu, who further admonished African philosophers to stop talking about and start doing African philosophy. V. Y. Mudimbe, in his The Invention of Africa…, suggested the development of an African conversational philosophy and the reinvention of Africa by its philosophers, to undermine the Africa that Europe invented. The content of Lewis Gordon’s essay “African Philosophy’s Search for Identity: Existential Considerations of a Recent Effort,” together with the works of Outlaw and Sogolo, suggests a craving for a new line of development for African philosophy: an approach that is critical, engaging and universal while still being African. This, in particular, is the spirit of the conversational thinking that was beginning to grip African philosophers in the late 1990s when Gordon wrote his paper. By the turn of the millennium, influences from these thoughts crystallized into a new mode of thinking, which then metamorphosed into conversational philosophy. The New Era in African philosophy was thus heralded; its focus and orientation became conversational philosophy.
d. New Era
This period of African philosophy began in the late 1990s and took shape by the turn of the millennium. The orientation of this period is conversational philosophy, and conversationalism is the movement that thrives in it. The University of Calabar has emerged as the international headquarters of this new movement, hosting various workshops, colloquia and conferences in African philosophy under the auspices of a revolutionary forum called The Conversational/Calabar School of Philosophy. This forum can fairly be described as revolutionary for the radical way it turned the fortunes of African philosophy around. While different schools and actors were still groping about, the new school provided a completely new and authentically African approach to doing philosophy. Hinged on the triple principles of relationality (that variables necessarily interrelate), contextuality (that the relationships of variables occur in contexts) and complementarity (that seemingly opposed variables can complement rather than merely contradict each other), its members formulated new methodologies (complementary reflection and the conversational method) and developed original systems to inaugurate a new era in the history of African philosophy.
The Calabar School begins its philosophical inquiry with the assumptions that (a) relationships are central to understanding the nature of reality, and (b) each of these relationships must be contextualized and studied as such. Its members also identify border lines as the main problem of the 21st century. By border lines, they mean the divisive lines we draw between realities in order to establish them as binary opposites. These lines lead to marginal problems such as racism, sexism, classism, creedoism, and so on. To address these problems, they raise two questions: Does difference amount to inferiority? And are opposites irreconcilable? Within the Calabar School of Philosophy, some prominent theories have emerged in response to the border lines problem and the two questions that trail it. These theoretic contributions include uwa ontology (Pantaleon Iroegbu), ibuanyidanda or complementary philosophy (Innocent Asouzu), harmonious monism (Chris Ijiomah), Njikoka philosophy (Godfrey Ozumba), conceptual mandelanism (Mesembe Edet), conversational thinking (Jonathan Chimakonam), consolation philosophy (Ada Agada), predeterministic historicity (Aribiah Attoe), and the personhood-based theory of right action (Amara Chimakonam). All these theories speak to the method of conversational philosophy. Conversational philosophy is defined by its focus on studying the relationships existing between variables, and by the active engagement of individual African philosophers in creating critical narratives from them: by engaging the elements of tradition, by producing new thoughts outright, or by engaging other individual thinkers. It thrives on incessant questioning geared toward producing new concepts, opening up new vistas and sustaining the conversation.
Some of the African philosophers whose works follow this trajectory have, ironically, emerged in the Western world, notably in America. The American philosopher Jennifer Lisa Vest is one of them; another is Bruce Janz. These two, to name a few, suggest that the highest purification of African philosophy is to be realized in conversational-styled philosophizing. However, it was the Nigerian philosopher Innocent Asouzu who went beyond the earlier botched attempt of Leopold Senghor and transcended the foundations of Pantaleon Iroegbu and C. S. Momoh to erect a new model of African philosophy that is conversational. The New Era, therefore, is the beginning of conversational philosophy.
Iroegbu, in his Metaphysics: The Kpim of Philosophy, inaugurated the reconstructive and conversational approach in African philosophy. He studied the relationships between the zones and connotations of uwa and, in doing so, engaged previous writers in a critical conversation out of which he produced his own thought (Uwa ontology), bearing the stamp of African tradition and thought systems but remarkably different from the approach and method of ethnophilosophy. Frantz Fanon had highlighted the importance of sourcing African philosophical paraphernalia from African indigenous culture. This is corroborated in a way by Lucius Outlaw in his “African ‘Philosophy’? Deconstructive and Reconstructive Challenges”, in which Outlaw advocates the deconstruction of the European-invented Africa, to be replaced by a reconstruction done by conscientious Africans free from the grip of colonial mentality (1996: 11). Whereas Wiredu’s crusade sought to deconstruct the invented Africa, actors in the New Era of African philosophy seek to reconstruct it through the conversational approach.
Iroegbu and Momoh inaugurated this drive, but it was Asouzu who has made the most of it. His theory of Ibuanyidanda ontology, or complementary reflection, maintains that “to be” simply means to be in a mutual, complementary relationship (2007: 251-55). Every being, therefore, is a variable with the capacity to join a mutual interaction. In this capacity, every being alone is seen as a missing link, serving as a missing link of reality in the network of realities. One immediately suspects the apparent contradiction that might arise from the fusion of two opposed variables when considered logically. But the logic of this theory is not the two-valued classical logic; it is the three-valued system of logic developed in Africa (cf. Asouzu 2004, 2013; Ijiomah 2006, 2014, 2020; Chimakonam 2012, 2013, 2014a, 2017, 2018, 2019, 2020). In this system, the two standard values are sub-contraries rather than contradictories, thereby facilitating effective complementation of variables. The possibility of the two standard values merging to form the third value in the complementary mode is what makes Ezumezu logic, one of the systems developed in the Calabar School, a powerful tool of thought.
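The contrast drawn here can be put schematically. What follows is only an illustrative sketch of the description above, not Asouzu’s or Chimakonam’s own formal apparatus: where classical logic admits two mutually exclusive values, the trivalent complementary scheme treats the two standard values as sub-contraries that may merge into a third value.

```latex
% Classical bivalence: a proposition takes exactly one of two values,
% related as contradictories.
v(p) \in \{T, F\}
% Trivalent complementary scheme (an illustrative reading, not the
% official Ezumezu tables): the two standard values are sub-contraries,
% and in the complementary mode they may merge into a third value C.
v(p) \in \{T, F, C\}, \qquad T \oplus F = C
```

On this reading, the apparent contradiction in fusing two opposed variables dissolves, since in the complementary mode their merger yields the third value rather than absurdity.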
A good number of African philosophers are tuning their works to the conversational pattern. Elsewhere in Africa, Michael Eze, Fainos Mangena, Bernard Matolino, Motsamai Molefe, Anthony Oyowe, Thaddeus Metz and Leonhard Praeg do this when they engage with the idea of ubuntu ethics and ontology, except that they fall short of studying relationships. Like all these scholars, the champions of the new conversational orientation are building a new edifice by reconstructing the domain of thought deconstructed in the later period of African philosophy. The central approach is conversation as a relational methodology. By studying relationships and engaging other African philosophers, entities or traditions in creative struggle, they hope to reconstruct the deconstructed edifice of African philosophy. Hence, the New Era of African philosophy is safe from the retrogressive, perverse dialogues that characterized the early and middle periods.
Also, with the critical deconstruction that occurred in the latter part of the middle period and the attendant eclecticism that emerged in the later period, the stage was set for the formidable reconstructions and conversational encounters that marked the arrival of the New Era of African philosophy.
8. Conclusion
The development of African philosophy through these periods yields two vital conceptions of African philosophy: on the one hand, African philosophy is a critical engagement of tradition and individual thinkers; on the other, it is a critical construction of futurity. When individual African philosophers engage tradition critically in order to ascertain its logical coherence and universal validity, they are doing African philosophy. And when they employ the tools of African logic in doing this, they are doing African philosophy. On the second conception, when African philosophers study relationships, engage in critical conversations with one another, and construct new thoughts on matters that concern Africa but are nonetheless universal and projected from African native thought systems, they are doing African philosophy. So authentic African philosophy is not just a future project; it can also continue from the past.
On the whole, this essay discussed the journey of African philosophy from its beginning, focusing on the criteria, schools and movements of the African philosophical tradition. The historical account began with the early period and proceeded through the middle and later periods to the new period, taking particular interest in robust individual contributions. Some questions still trail the development of African philosophy, among them: Must African philosophy be tailored to the pattern of Western philosophy, even in less definitive issues? If African philosophy is found to be different in approach from Western philosophy, so what? Are logical issues likely to play any major roles in the structure and future of African philosophy? What is the future direction of African philosophy? Is the problem of the language of African philosophy a pregnant one? Would conversations in contemporary African philosophy totally eschew perverse dialogue? What shall be the rules of engagement in African philosophy? These questions are likely to shape the next lines of thought in African philosophy.
9. References and Further Reading
Abanuka, Bartholomew. A History of African Philosophy. Enugu: Snaap Press, 2011.
An epochal discussion of African philosophy.
Abraham, William. The Mind of Africa. Chicago: University of Chicago Press, 1962.
A philosophical discussion of culture, African thought and colonial times.
Achebe, Chinua. Morning yet on Creation Day. London: Heinemann, 1975.
A philosophical treatment of African tradition and colonial burden.
Anyanwu, K. C. “Philosophical Significance of Myth and Symbol in Dogon World-view”. C. S. Momoh ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
A discussion of the philosophical elements in an African culture.
Akesson, Sam. K. “The Akan Concept of Soul”. African Affairs: The Journal of the Royal African Society, 64(257), 280-291.
A discourse on African metaphysics and philosophy of religion.
Akiode, Olajumoke. “African philosophy, its questions, the place and the role of women and its disconnect with its world”. African Philosophy and the Epistemic Marginalization of Women; edited by Jonathan O. Chimakonam and Louise du Toit. Routledge, 2018.
A critical and Afro-feminist discussion of the communalist orientation in African philosophy.
Aristotle. Metaphysica. Translated into English under the editorship of W. D. Ross. Vol. VIII, 2nd ed. Oxford: Clarendon Press, 1926. Online edition, 982b.
A translation of Aristotle’s treatise on metaphysics.
Asouzu, Innocent I. Ibuanyidanda: New Complementary Ontology Beyond World-Immanentism, Ethnocentric Reduction and Impositions. Münster: Litverlag, 2007.
An African perspectival treatment of metaphysics or the theory of complementarity of beings.
Asouzu, Innocent I. The Method and Principles of Complementary Reflection. Calabar: University of Calabar Press, 2004.
A formulation of the method and theory of Complementary Reflection.
Asouzu, Innocent I. Ibuanyidanda (Complementary Reflection) and Some Basic Philosophical Problems in Africa Today: Sense Experience, “ihe mkpuchi anya” and the Super-maxim. Münster: Litverlag, 2013.
A further discussion of the theory, method and logic of Complementary Reflection.
Attoe, Aribiah David. “Examining the Method and Praxis of Conversationalism,” in Chimakonam Jonathan O., E Etieyibo, and I Odimegwu (eds). Essays on Contemporary Issues in African Philosophy. Cham: Springer, 2022.
A broad examination of the method of conversational thinking.
Babalola, Yai. “Theory and Practice in African Philosophy: The Poverty of Speculative Philosophy. A Review of the Work of P. Hountondji, M. Towa, et al.” Second Order, 2. 2. 1977.
A Critical review of Hountondji and Towa.
Bello, A. G. A. “Philosophy and African Language”. Quest: Philosophical Discussions: An International African Journal of Philosophy, Vol. 1, No. 1, 1987, pp. 5-12.
A critical engagement on the subject of language of philosophy.
Betts, Raymond. Assimilation and Association in French Colonial Territory 1890 to 1915. (First ed. 1961), Reprinted. Nebraska: University of Nebraska Press, 2005
A discourse on French colonial policies.
Bodunrin, Peter. “The Question of African Philosophy”. Richard Wright (ed) African Philosophy: An Introduction 3rd ed. Lanham: UPA, 1984.
A discourse on the nature and universal conception of African philosophy.
Cesaire Aime. Return to My Native Land. London: Penguin Books, 1969.
A presentation of colonial impact on the mind of the colonized.
Chimakonam, Jonathan O. “On the System of Conversational Thinking: An Overview”. Arụmarụka: Journal of Conversational Thinking, 1(1), 2021, pp. 1-45.
A detailed discussion of the main components of Conversational Thinking.
Chimakonam Jonathan O. Ed. Logic and African Philosophy: Seminal Essays in African Systems of Thought. Delaware: Vernon Press, 2020.
A collection of selected seminal papers on the African logic debate.
Chimakonam, Jonathan O. Ezumezu: A System of Logic for African Philosophy and Studies. Cham: Springer Nature, 2019.
A theoretic formulation of the system of Ezumezu logic.
Chimakonam, Jonathan O. “The ‘Demise’ of Philosophical Universalism and the Rise of Conversational Thinking in Contemporary African Philosophy”. Method, Substance, and the Future of African Philosophy, ed. Edwin Etieyibo. Cham: Springer Nature, 2018, pp. 135-160.
A critique of philosophical universalism.
Chimakonam, Jonathan O. “Conversationalism as an Emerging Method of Thinking in and Beyond African Philosophy”. Acta Academica, Vol. 2, 2017a, pp. 11-33.
A methodological presentation of Conversational thinking.
Chimakonam, Jonathan O. “What is Conversational Philosophy? A Prescription of a New Theory and Method of Philosophising in and Beyond African Philosophy”. Phronimon, Vol. 18, 2017b, pp. 115-130.
An intercultural formulation of the Conversational method.
Chimakonam, Jonathan O. “The Criteria Question in African Philosophy: Escape from the Horns of Jingoism and Afrocentrism”. Atuolu Omalu: Some Unanswered Questions in Contemporary African Philosophy, ed. Jonathan O. Chimakonam. Lanham: University Press of America, 2015a, pp. 101-123.
A discussion of the Criteria of African philosophy.
Chimakonam, Jonathan O. “Addressing Uduma’s Africanness of a Philosophy Question and Shifting the Paradigm from Metaphilosophy to Conversational Philosophy”. Filosofia Theoretica: Journal of African Philosophy, Culture and Religions, Vol. 4, No. 1, 2015b, pp. 33-50.
An engagement with Uduma on his Africanness of philosophy question from a conversational viewpoint.
Chimakonam, Jonathan O. “Conversational Philosophy as a New School of Thought in African Philosophy: A Conversation with Bruce Janz on the Concept of Philosophical Space”. Confluence: Online Journal of World Philosophies, 2015c, pp. 9-40.
A rejoinder to Bruce Janz on the concept of philosophical space.
Chimakonam, Jonathan O. “Transforming the African Philosophical Place through Conversations: An Inquiry into the Global Expansion of Thought (GET)”. South African Journal of Philosophy, Vol. 34, No. 4, 2015d, pp. 462-479.
A formulation of some basic principles of conversational thinking.
Chimakonam, Jonathan O. “Ezumezu: A Variant of Three-valued Logic—Insights and Controversies”. Paper presented at the Annual Conference of the Philosophical Society of Southern Africa, Free State University, Bloemfontein, South Africa, 20-22 January, 2014.
An articulation of the structure of Ezumezu/African logic tradition.
Chimakonam, Jonathan O. “Principles of Indigenous African Logic: Toward Africa’s Development and Restoration of African Identity”. Paper presented at the 19th Annual Conference of the International Society for African Philosophy and Studies (ISAPS), ‘50 Years of OAU/AU: Revisiting the Questions of African Unity, Identity and Development’, Department of Philosophy, Nnamdi Azikiwe University, Awka, 27-29 May, 2013.
A presentation of the principles of Ezumezu/African logic tradition.
Chimakonam, Jonathan O. “Integrative Humanism: Extensions and Clarifications”. Integrative Humanism Journal, 3.1, 2013.
Further discussions on the theory of integrative humanism.
Chimakonam, Jonathan O. and Uti Ojah Egbai. “The Value of Conversational Thinking in Building a Decent World: The Perspective of Postcolonial Sub-Saharan Africa”. Dialogue and Universalism, Vol. XXVI, No. 4, 2016, pp. 105-117.
Danquah, J. B. Gold Coast: Akan Laws and Customs and the Akim Abuakwa Constitution. London: G. Routledge & Sons, 1928.
A discourse on African philosophy of law.
Danquah, J. B. The Akan Doctrine of God: A Fragment of Gold Coast Ethics and Religion. London: Cass, 1944.
A discourse on African metaphysics, ethics and philosophy of religion.
Diop, Cheikh Anta. The African Origin of Civilization: Myth or Reality. Mercer Cook Transl. New York: Lawrence Hill & Company, 1974.
Du Bois, W. E. B. The Souls of Black Folk. (1903). New York: Bantam Classic edition, 1989.
A discourse on race and cultural imperialism.
Edeh, Emmanuel. Igbo Metaphysics. Chicago: Loyola University Press, 1985.
An Igbo-African discourse on the nature of being.
Egbai, Uti Ojah & Jonathan O. Chimakonam. Why Conversational Thinking Could be an Alternative Method for Intercultural Philosophy, Journal of Intercultural Studies, 40:2, 2019. 172-189.
A discussion of conversational thinking as a method of intercultural philosophy.
Enyimba, Maduka. “On How to do African Philosophy in African Language: Some Objections and Extensions”. Philosophy Today, 66. 1, 2022, pp. 25-37.
A discussion on how to do African philosophy using an African language.
Ekwealor, C. “The Igbo World-View: A General Survey”. The Humanities and All of Us. Emeka Oguegbu (ed) Onitsha: Watchword, 1990.
A philosophical presentation of Igbo life-world.
Etuk, Udo. “The Possibility of African logic”. The Third Way in African Philosophy, Olusegun Oladipo (ed). Ibadan: Hope Publications, 2002.
A discussion of the nature and possibility of African logic.
Fayemi, Ademola K. “African Philosophy in Search of Historiography”. Nokoko: Journal of Institute of African Studies. 6. 2017. 297-316.
A historiographical discussion of African philosophy.
Fanon, Frantz. The Wretched of the Earth. London: The Chaucer Press, 1965.
A critical discourse on race and colonialism.
Graness, Anke. “Writing the History of Philosophy in Africa: Where to Begin?”. Journal of African Cultural Studies. 28. 2. 2015. 132-146.
A Eurocentric historicization of African philosophy.
Graness, A., & Kresse, K. eds., Sagacious Reasoning: H. Odera Oruka in memoriam, Frankfurt: Peter Lang, 1997.
A collection of articles on Oruka’s Sage philosophy.
Griaule, Marcel. Conversations with Ogotemmêli. London: Oxford University Press for the International African Institute, 1965.
An interlocutory presentation of African philosophy.
Gyekye, Kwame. An Essay in African Philosophical Thought: The Akan Conceptual Scheme. Cambridge: Cambridge University Press, 1987.
A discussion of philosophy from an African cultural viewpoint.
Hallen, Barry. A Short History of African Philosophy. Bloomington: Indiana University Press, 2002.
A presentation of the history of African philosophy from thematic and personality perspectives.
Hallen, B. and J. O. Sodipo. Knowledge, Belief and Witchcraft: Analytic Experiments in African Philosophy. Palo Alto, CA: Stanford University Press, 1997.
An analytic discourse of the universal nature of themes and terms in African philosophy.
Hebga, Meinrad. “Logic in Africa”. Philosophy Today, Vol. 11, No. 4/4, 1958.
A discourse on the structure of African logical tradition.
Hegel, Georg. Lectures on the Philosophy of World History. Cambridge: Cambridge University Press, reprint 1975.
Hegel’s discussion of his philosophy of world history.
Horton, Robin. “African Traditional Religion and Western Science” in Africa 37: 1 and 2, 1967.
A comparison of African and Western thought.
Horton, Robin. “Traditional Thought and the Emerging African Philosophy Department: A Comment on the Current Debate” in Second Order: An African Journal of Philosophy vol. III No. 1, 1977.
A logical critique of the idea of African philosophy.
Hountondji, Paulin. African Philosophy: Myth and Reality. 2nd rev. ed. Bloomington: Indiana University Press, 1996.
A critique of ethnophilosophy and an affirmation of African philosophy as a universal discourse.
Hunnings, Gordon. “Logic, Language and Culture”. Second Order: An African Journal of Philosophy, Vol.4, No.1. (1975).
A critique of classical logic and its laws in African thought and a suggestion of African logical tradition.
Ijiomah, Chris. “An Excavation of a Logic in African World-view”. African Journal of Religion, Culture and Society, 1. 1, August 2006, pp. 29-35.
An extrapolation on a possible African logic tradition.
Iroegbu, Pantaleon. Metaphysics: The Kpim of Philosophy. Owerri: International Universities Press, 1995.
A conversational presentation of theory of being in African philosophy.
Jacques, Tomaz. “Philosophy in Black: African Philosophy as a Negritude”. Discursos Postcoloniales Entorno Africa. CIEA7, No. 17, 7th Congress of African Studies.
A critique of the rigor of African philosophy as a discipline.
James, George. Stolen Legacy: Greek Philosophy is Stolen Egyptian Philosophy. New York: Philosophical Library, 1954.
A philosophical discourse on race, culture, imperialism and colonial deceit.
Jahn, Janheinz. Muntu: An Outline of Neo-African Culture. New York: Grove Press, 1961.
A presentation of a new African culture as a synthesis and as philosophical relevant and rational.
Jewsiewicki, Bogumil. “African Historical Studies: Academic Knowledge as ‘usable past’ and Radical Scholarship”. The African Studies Review. Vol. 32. No. 3, December, 1989.
A discourse on the value of African tradition to modern scholarship.
Kanu, Ikechukwu. “Trends in African Philosophy: A Case for Eclectism”. Filosofia Theoretica: A Journal of African Philosophy, Culture and Religion, 2(1), 2013, pp. 275-287.
A survey of the trends in African philosophy with a focus on Eclectism.
Keita, Lansana. “The African Philosophical Tradition”. Wright, Richard A., ed. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
An examination of African philosophical heritage.
Keita, Lansana. “Contemporary African Philosophy: The Search for a Method”. Tsanay Serequeberhan (ed) African Philosophy: The Essential Readings. New York: Paragon House, 1991.
An analysis of methodological issues in and basis of African philosophy.
Kezilahabi, Euphrase. African Philosophy and the Problem of Literary Interpretation. Unpublished Ph.D Dissertation. University of Wisconsin, Madison, 1985.
A doctoral dissertation on the problem of literary interpretation in African philosophy.
Lambert, Michael. “From Citizenship to Négritude: Making a Difference in Elite Ideologies of Colonized Francophone West Africa”. Comparative Studies in Society and History, Vol. 35, No. 2. (Apr., 1993), pp. 239–262.
A discourse on the problems of colonial policies in Francophone West Africa.
Gordon, Lewis. “African Philosophy’s Search for Identity: Existential Considerations of a Recent Effort”. The CLR James Journal, Winter 1997, pp. 98-117.
A survey of the identity crisis of African philosophical tradition.
Apostel, Leo. African Philosophy. Belgium: Scientific Publishers, 1981.
An Afrocentrist presentation of African philosophy.
Levy-Bruhl, Lucien. Primitive Mentality. Paris: University of France Press, 1947.
A Eurocentrist presentation of non-European world.
Makinde, M. A. “Philosophy in Africa”. The Substance of African Philosophy. C. S. Momoh, ed. Auchi: African Philosophy Projects’ Publications, 2000.
A discourse on the practice and relevance of philosophy in Africa.
Mangena, Fainos. “The Fallacy of Exclusion and the Promise of Conversational Philosophy in Africa”, in Chimakonam Jonathan O., E Etieyibo, and I Odimegwu (eds). Essays on Contemporary Issues in African Philosophy. Cham: Springer, 2022.
A discourse on the significance of conversational thinking.
Masolo, D. A. African Philosophy in Search of Identity. Bloomington: Indiana University Press, 1994.
An individual-based presentation of the history of African philosophy.
Maurier, Henri. “Do We have an African Philosophy?”. Wright, Richard A., ed. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
A critique of Ethnophilosophy as authentic African philosophy.
Mbiti, John. African Religions and Philosophy. London: Heinemann, 1969.
A discourse on African philosophical culture.
Momoh, Campbell. “Canons of African Philosophy”. Paper presented at the 6th Congress of the Nigerian Philosophical Association. University of Ife, July 31- August 3, 1985.
A presentation of the major schools of thought in African philosophy.
Momoh, Campbell, ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
A collection of essays on different issues in African philosophy.
Momoh, Campbell. “The Logic Question in African Philosophy”. C. S. Momoh ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
A defense of the thesis of a possible African logic tradition.
Mosima, P. M. Philosophic Sagacity and Intercultural Philosophy: Beyond Henry Odera Oruka, Leiden/Tilburg: African Studies Collection 62/Tilburg University, 2016.
Mudimbe, V. Y. The Invention of Africa: Gnosis, Philosophy and the Order of Knowledge (African Systems of Thought). Bloomington: Indiana University Press, 1988.
A discourse on culture, race, Eurocentrism and modern Africa as an invention of Western scholarship.
Nkrumah, Kwame. I Speak of Freedom: A Statement of African Ideology. London: Mercury Books, 1961.
A discourse on political ideology for Africa.
Nkrumah, Kwame. Towards Colonial Freedom. London: Heinemann. (First published in 1945), 1962.
A discussion of colonialism and its negative impact on Africa.
Nwala, Uzodinma. Igbo Philosophy. London: Lantern Books, 1985.
An Afrocentrist presentation of Igbo-African philosophical culture.
Nyerere, Julius. Freedom and Unity. Dar es Salaam: Oxford University Press, 1986.
A discussion of a postcolonial Africa that should thrive on freedom and unity.
Nyerere, Julius. Freedom and Socialism. Dar es Salaam: Oxford University Press, 1986.
A discourse on the fundamental traits of African socialism.
Nyerere, Julius. Ujamaa—Essays on Socialism. Dar-es-Salaam, Tanzania: Oxford University Press, 1986.
A collection of essays detailing the characteristics of African brand of socialism.
Obenga, Theophile. “Egypt: Ancient History of African Philosophy”. A Companion to African Philosophy. Ed. Kwasi Wiredu. Malden: Blackwell Publishing, 2004.
An Afrocentrist historicization of African philosophy.
Oelofsen, Rianna. “Women and ubuntu: Does ubuntu condone the subordination of women?” African Philosophy and the Epistemic Marginalization of Women; edited by Jonathan O. Chimakonam and Louise du Toit. Routledge, 2018.
A feminist discourse on ubuntu.
Ogbalu, F.C. Ilu Igbo: The Book of Igbo Proverbs. Onitsha: University Publishing Company, 1965.
A philosophical presentation of Igbo-African proverbs.
Ogbonnaya, L. Uchenna. “How Conversational Philosophy Profits from the Particularist and the Universalist Agenda”, in Chimakonam Jonathan O., E Etieyibo, and I Odimegwu (eds). Essays on Contemporary Issues in African Philosophy. Cham: Springer, 2023.
A conversational perspective on particularism and universalism.
Oguejiofor, J. Obi. “African Philosophy: The State of its Historiography”. Diogenes. 59. 3-4, 2014. 139-148.
A Euro-historical adaptation of African philosophy.
Ogunmodede, Francis. “African philosophy in African language”. West African Journal of Philosophical Studies, Vol. 1, 1998, pp. 3-26.
A discourse on doing African philosophy in African languages.
Okeke, J. Chimakonam. “Why Can’t There be an African logic?”. Journal of Integrative Humanism. 1. 2. (2011). 141-152.
A defense of a possible African logic tradition and a critique of critics.
Okere, Theophilus. “The Relation between Culture and Philosophy”. Uche, 2, 1976.
A discourse on the differences and similarities between culture and philosophy.
Okere, Theophilus. African Philosophy: A Historico-Hermeneutical Investigation of the Conditions of Its Possibility. Lanham, Md.: University Press of America, 1983.
A hermeneutical discourse on the basis of African philosophy.
Okolo, Chukwudum B. Problems of African Philosophy. Enugu: Cecta Nigeria Press, 1990.
An x-ray of the major hindrances facing African philosophy as a discipline.
Okoro, C. M. African Philosophy: Question and Debate, A Historical Study. Enugu: Paqon Press, 2004.
A historical presentation of the great debate in African philosophy.
Oladipo, Olusegun. (ed) The Third Way in African Philosophy. Ibadan: Hope, 2002.
A collection of essays on the topical issues in African philosophy of the time.
Olela, Henry. “The African Foundations of Greek Philosophy”. Wright, Richard A., ed. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
An Afrocentrist presentation of African philosophy as the source of Greek philosophy.
Oluwole, Sophie. Philosophy and Oral Tradition. Lagos: Ark Publications, 1999.
A cultural excavationist programme in African philosophy.
Omoregbe, Joseph. “African Philosophy: Yesterday and Today”. African Philosophy: An Anthology. Emmanuel Eze (ed.), Massachusetts: Blackwell, 1998.
A survey of major issues in the debate and a critique of the Universalist school.
Onunwa, Udobata. “Humanism: The Bedrock of African Traditional Religion and Culture”. Religious Humanism. Vol. XXV, No. 2, Spring 1991, Pp 66 – 71.
A presentation of Humanism as the basis for African religion and culture.
Onyewuenyi, Innocent. African Origin of Greek Philosophy: An Exercise in Afrocentrism. Enugu: SNAAP Press, 1993.
An Afrocentrist presentation of philosophy as a child of African thought.
Oruka, H. Odera. “The Fundamental Principles in the Question of ‘African Philosophy,’ I.” Second Order 4, no. 1: 44–55, 1975.
A discussion of the main issues in the debate on African philosophy.
Oruka, H. Odera. “Four Trends in African Philosophy”. In Philosophy in the Present Situation of Africa, edited by Alwin Diemer. Weisbaden, Germany: Franz Steiner Verlag, 1981. (First published in 1978.)
A breakdown of the major schools of thought in the debate on African philosophy.
Oruka, H. Odera. Sage Philosophy: Indigenous Thinkers and the Modern Debate on African Philosophy. Leiden: E. J. Brill. 1990.
A survey of the journey so far in African philosophy and the identification of two additional schools of thought.
Osuagwu, I. Maduakonam. African Historical Reconsideration: A Methodological Option for African Studies, the North African Case of the Ancient History of Philosophy; Amamihe Lecture 1. Owerri: Amamihe, 1999.
A Euro-historical adaptation of African philosophy.
Outlaw, Lucius. “African ‘Philosophy’? Deconstructive and Reconstructive Challenges.” In his On Race and Philosophy. New York and London: Routledge. 1996.
A presentation of African philosophy as a tool for cultural renaissance.
Plato. Theaetetus, 155d, p. 37.
Contains Plato’s theory of knowledge.
Presbey, G. M. “Who Counts as a Sage? Problems in the Future Implementation of Sage Philosophy”. Quest: Philosophical Discussions: An International African Journal of Philosophy / Revue Africaine Internationale de Philosophie, Vol. XI, No. 1, 1997, pp. 52-66.
A Eurocentric discussion on Afrophone philosophies.
Ruch, E. A. and Anyanwu, K. C. African Philosophy: An Introduction to the Main Philosophical Trends in Contemporary Africa. Rome: Catholic Book Agency, 1981.
A discussion on racialism, slavery, colonialism and their influence on the emergence of African philosophy, in addition to current issues in the discipline.
Sogolo, Godwin. Foundations of African Philosophy. Ibadan: Ibadan University Press, 1993.
A discussion of the logical, epistemological and metaphysical grounds for African philosophy project.
A critical and eclectic proposal in African philosophy.
Tangwa, Godfrey. ‘Revisiting the Language Question in African Philosophy’. The Palgrave Handbook of African Philosophy. Eds. Adesinya Afolayan and Toyin Falola. Pp 129-140. New York: Springer Nature, 2017.
A discourse on the language problem in African philosophy.
Tavernaro-Haidarian, Leyla. “Deliberative Epistemology: Towards an Ubuntu-based Epistemology that Accounts for a Prior Knowledge and Objective Truth,” South African Journal of Philosophy. 37(2), 229-242, 2018.
A conversational perspective on ubuntu-based epistemology.
Towa, Marcien. “Conditions for the Affirmation of a Modern African Philosophical Thought”. Tsanay Serequeberhan (ed) African Philosophy: The Essential Readings. New York: Paragon House, 1991.
A presentation of important factors required for the emergence of African philosophy as a discipline.
Uduagwu, Chukwueloka. “Doing Philosophy in the African Place: A Perspective on the Language Challenge”. Jonathan Chimakonam et al (eds), Essays on Contemporary Issues in African Philosophy. Cham, Springer, 2023.
A discourse on the language problem in African philosophy.
Uduigwomen, F. Andrew. “Philosophy and the Place of African Philosophy”. A. F. Uduigwomen ed. From Footmarks to Landmarks on African Philosophy. 1995, 2nd Ed. 1995/2009. Lagos: O. O. P. 2009.
A collection of essays on different issues in African philosophy.
Uduma Orji. “Can there be an African Logic” in A. F. Uduigwomen(ed.) From Footmarks to Landmarks on African Philosophy Lagos: O. O. P. Ltd, 2009.
A critique of a culture-bound logic in African thought.
Uduma Orji. “Between Universalism and Cultural Identity: Revisiting the Motivation for an African Logic”. A Paper delivered at an International Conference of the Council for Research in Values and Philosophy Washington D.C., USA at University of Cape Coast, Cape Coast Ghana 3–5 February, 2010.
A critique of a culture-bound logic in African thought and a presentation of logic as universal.
Van Hook, Jay M. “African Philosophy and the Universalist Thesis”. Metaphilosophy. 28. 4: 385-396, 1997.
A critique of the universalist thesis in African philosophy.
Van Hook, Jay M. The Universalist Thesis Revisited: What Direction for African Philosophy in the New Millennium? In Thought and Practice in African Philosophy, ed. G. Presbey, D. Smith, P. Abuya and O. Nyarwath, 87-93. Nairobi: Konrad Adenauer Stiftung, 2002.
A further critique of the universalist thesis in African philosophy.
Vest, J. L. 2009. ‘Perverse and necessary dialogues in African philosophy’, in: Thought and practice: a journal of the philosophical association of Kenya. New series, Vol.1 No.2, December, pp. 1-23.
An discussion of the proper direction and focus of African philosophy in the new Age.
Wamba-ia Wamba, E. “Philosophy in Africa: Challenges of the African Philosopher,” in African Philosophy: The Essential Readings. New York: Paragon House, 1991.
A discussions of the technical problems of African philosophy as a discipline.
wa Thiong’o, Ngugi. Decolonizing the Mind: The Politics of Language in African Literature. London: J. Curry and Portsmouth, N. H: Heinemann, 1986.
A discourse on Eurocentrism, Africa’s decolonization and cultural imperialism.
Winch, Peter. “Understanding a Primitive Society”. American Philosophical Quarterly. No. 1, 1964.
A discussion and a defense of the rationality of primitive people.
Wiredu, Kwasi. Philosophy and an African Culture. Cambridge and New York: Cambridge University Press, 1980.
A discussion of the philosophical elements in an African culture and a call for a universalizable episteme for African philosophy.
Wiredu, Kwasi. “How Not to Compare African Thought with Western Thought.”Ch’Indaba no. 2 ( July–December): 1976. 4–8. Reprinted in African Philosophy: An Introduction, edited by R. Wright. Washington, D.C.: University Press of America, 1977; and in African Philosophy: Selected Readings, edited by Albert G. Mosley. Englewood Cliffs, N.J.: Prentice Hall, 1995.
A critique of Robin Horton’s comparison of African and Western thought.
Wiredu, Kwasi.“Our Problem of Knowledge: Brief Reflections on Knowledge and Development in Africa”. African Philosophy as Cultural Inquiry. Ivan Karp and D. A. Masolo (ed). Bloomington, Indiana: Indiana University Press, 2000.
A discussion on the role of knowledge in the development of Africa.
Wiredu, Kwasi. Cultural Universals and Particulars: An African Perspective. Bloomington: Indiana University Press, 1996.
A collection of essays on sundry philosophical issues pertaining to comparative and cross-cultural philosophy.
A discussion of the importance and relevance of the theory of conceptual decolonization in African philosophy.
Wiredu, Kwasi. “On Defining African Philosophy”. C. S. Momoh ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
A discourse on the parameters of the discipline of African philosophy.
Wright, Richard A., ed. “Investigating African Philosophy”. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
A critique of the existence of African philosophy as a discipline.
Baruch Spinoza (1632-1677) was a Dutch Jewish rationalist philosopher who is most famous for his Ethics and Theological-Political Treatise. Although influenced by Stoicism, Maimonides, Machiavelli, Descartes, and Hobbes, among others, he developed distinct and innovative positions on a number of issues in metaphysics, epistemology, ethics, politics, biblical hermeneutics, and theology. He is also known as a pivotal figure in the development of Enlightenment thinking. Some of his most notorious claims and most radical views surround issues concerning determinism and free will. Spinoza was an adamant determinist, and he denied the existence of free will. This led to much controversy concerning his philosophy in subsequent centuries. He was, in fact, one of the first modern philosophers to both defend determinism and deny free will. Nevertheless, his philosophy champions freedom, both ethically and politically. It provides an ethics without free will but one that leads to freedom, virtue, and happiness. Prima facie, such an ethical project might seem paradoxical, but Spinoza distinguished between free will, which is an illusion, and freedom, which can be achieved. A thorough familiarity with Spinoza’s views on determinism, free will, freedom, and moral responsibility resolves this apparent paradox of an ethics without free will.
1. Spinoza's Determinism
Contrary to many of his predecessors and contemporaries, Spinoza is an adamant and notorious determinist. For him, nature is thoroughly determined. While there are many different varieties of determinism, Spinoza is committed to causal determinism, or what is sometimes called nomological determinism. Some commentators argue that Spinoza is also a necessitarian, holding that the actual world is the only possible one (see IP33; for an overview, see Garrett 1991). In any case, as a causal determinist, Spinoza certainly argues that events are determined by previous events or causes (which are in turn determined by still earlier events or causes, and so on) following the laws of nature. Spinoza clearly expresses that all events are determined by previous causes:
Every singular thing, or anything which is finite and has a determinate existence, can neither exist nor be determined to produce an effect unless it is determined to exist and produce an effect by another cause, which is also finite and has a determinate existence; and again, this cause can neither exist nor be determined to produce an effect unless it is determined to exist and produce an effect by another, which is also finite and has a determinate existence, and so on, to infinity. (IP28)
Here, Spinoza is arguing for an infinite chain of finite causes for any given effect, or, as he puts it, any singular thing which exists. Spinoza demonstrates the above proposition in his (in)famous geometrical method, which requires starting with definitions and axioms, demonstrating propositions from them, and building upon previous demonstrations. His commitment to causal determinism is already displayed in Axiom 3 of Part I: “From a given determinate cause, the effect follows necessarily; and conversely, if there is no determinate cause, it is impossible for an effect to follow.” Surprisingly, Spinoza uses only this axiom to demonstrate the previous proposition, IP27: “a thing which has been determined by God to produce an effect cannot render itself undetermined.” His demonstration simply refers to Axiom 3: “This proposition is evident from A3.” So, it is clear that Spinoza thinks that every effect has a cause, but why he holds this view is not yet clear.
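The structure of causal determinism described here can be pictured with a small toy model. This is purely an illustration of the modern notion of causal determinism, not a rendering of Spinoza's geometrical method; the rule chosen for the "law" is arbitrary.

```python
# Toy illustration of causal determinism (not Spinoza's own formalism):
# each event is fully determined by its predecessor plus a fixed "law".

def law(state: int) -> int:
    """A fixed, deterministic law of nature (arbitrary example rule)."""
    return (3 * state + 1) % 17

def causal_chain(first_cause: int, length: int) -> list[int]:
    """Generate a finite stretch of the (in principle infinite) causal chain."""
    chain = [first_cause]
    for _ in range(length - 1):
        chain.append(law(chain[-1]))
    return chain

# Axiom 3, toy version: from the same determinate cause,
# the same effect follows necessarily.
run_a = causal_chain(first_cause=5, length=10)
run_b = causal_chain(first_cause=5, length=10)
assert run_a == run_b  # identical causes, identical effects

# Every event after the first has a determinate cause within the chain.
assert all(run_a[i + 1] == law(run_a[i]) for i in range(len(run_a) - 1))
```

The point of the sketch is only that, given a fixed law and a given cause, nothing about the subsequent chain is left open, which is the sense of "determined" at issue in IP28.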
Understanding why Spinoza is committed to causal determinism requires an examination of his larger philosophical commitments. First, Spinoza is a rationalist, and as a rationalist, he holds that everything is, in principle, explainable or intelligible. This is to say that everything that exists and everything that occurs have a reason to be or to happen, and that this reason can be known and understood. This is known as the principle of sufficient reason, after Leibniz’s formulation. Second, Spinoza is committed to naturalism, at least a kind of naturalism which holds that there are no explanations or causes outside of nature. This is to say, there are no supernatural causes, and all events can be explained naturally with respect to nature and its laws. Spinoza’s rationalism and naturalism are both in evidence when he argues for the necessary existence of the one infinite substance (IP11), God or Nature (Deus sive Natura), which is the immanent (IP18) and efficient cause (IP25) of all things.
The existence of everything cannot be a brute fact for Spinoza, nor does it make sense to him to postpone the reason for existence by referring to a personal God as the creator of all. Rather, he argues that the one substance (“God” or “Nature” in Spinoza’s terminology, but in the following just “God,” with the caveat that Nature is implied) is the cause of itself and necessarily exists: “God, or a substance consisting of infinite attributes, each of which expresses eternal and infinite essence, necessarily exists” (IP11). In his alternate demonstration of this proposition, he explicitly uses the principle of sufficient reason: “for each thing there must be assigned a cause, or reason, both for its existence and for its nonexistence” (417). The one substance, or God, is the cause of itself, or, as he defines it, “that whose essence involves existence, or that whose nature cannot be conceived except as existing” (ID1).
This necessary existence of God entails the necessity by which every individual thing is determined. This is because Spinoza is committed to substance monism, or the position that there is only one substance. This is markedly different from his rationalist predecessor, Descartes, who, though also arguing that only God is properly speaking an independent substance (Principles I, 51), held that there were indefinitely many substances of two kinds: bodies, or res extensa, and thoughts, or res cogitantes (Principles I, 52). Spinoza, though, defines God as one substance consisting of infinite attributes. An attribute is “what the intellect perceives of a substance as constituting its essence” (ID4). By “infinite” here, Spinoza refers primarily to a totality rather than a numerical infinity, so that the one substance has all possible attributes. Spinoza goes on to indicate that the human intellect knows two attributes, namely extension and thought (IIA5). Besides the one substance and its attributes, Spinoza’s ontology includes what he calls modes. Modes are defined as “affections of a substance or that which is in another thing through which it is also conceived” (ID5). Furthermore, Spinoza distinguishes between infinite modes (IP23) and finite modes, the latter generally taken to be all the singular finite things, such as apples, books, or dogs, as well as ideas of these things, thus also the human body and its mind.
There is much scholarly controversy about the question of how substance, attributes, and infinite and finite modes all relate to each other. Of particular contention is the relation between the finite modes and the one infinite substance. A more traditional interpretation of Spinoza’s substance monism takes finite modes to be parts of God, such that they are properties which inhere in the one substance, with the implication of some variety of pantheism, or the doctrine that everything is God. Edwin Curley, however, influentially argues that finite modes should be taken merely as causally and logically dependent on the one infinite substance, that is, God, which itself is causally independent, following Spinoza’s argument of substance as cause of itself or involving necessary existence (IP1-IP11). According to this interpretation, God is identified with its attributes (extension and thought) as the most general structural features of the universe with infinite modes, following necessarily from the attributes and expressing the necessary general laws of nature (for instance, Spinoza identifies the immediate infinite mode of the attribute of extension with motion and rest in Letter 64, 439). On this causal-nomological interpretation of substance, God is the cause of all things but should only be identified with the most general features of the universe rather than with everything existing, for instance the finite modes (Curley 1969, esp. 44-81).
There is, however, resistance to this causal interpretation of the relation between substance and finite modes (see Bennett 1984, 92-110; 1991; Nadler 2008). Jonathan Bennett argues against Curley’s interpretation—returning to the more traditional relation of modes as properties that inhere in a substance—by taking Spinoza’s proposition IP15 more literally: “Whatever is, is in God, and nothing can be, or be conceived without God.” Bennett identifies the finite modes as ways in which the attributes are expressed adjectivally (that is, this region of extension is muddy), keeping closer to Spinoza’s use of “mode” as “affections of God’s attributes… by which God’s attributes are expressed in a specific and determinate way” (IP25C). But as Curley points out, Bennett’s interpretation has some difficulty explaining the precise relation of finite modes to infinite modes and attributes, the latter having an immediate causal relation to God (Curley 1991, 49). Leaving aside the larger interpretive controversies, the issue here is that God and its attributes, being infinite and eternal, cannot be the direct or proximate cause of finite modes, though God is the cause of everything, including finite modes. Spinoza writes: “From the necessity of the divine nature there must follow infinitely many things in infinitely many modes (that is, everything that can fall under an infinite intellect)” (IP16). For this reason, Spinoza’s argument for determinism seems to recognize both an infinite chain of finite causes and a finite chain of infinite causes. The former has already been referred to when Spinoza argues in IP28 that any particular finite thing is determined to exist or produce an effect by another finite cause “and so on, ad infinitum.” Indeed, in his demonstration, Spinoza states that God, being infinite and eternal, could not be the proximate cause of finite things.
Further, in the Scholium to this proposition, Spinoza explains that God is the proximate cause of only those things produced immediately by him, which in turn are infinite and eternal (eternal here indicating necessity, as in IP10S, 416). That is, Spinoza does indeed argue, in IP21-P23, that whatever follows from the absolute nature of any of God’s attributes must likewise be infinite and eternal.
Some commentators interpret God as being the proximate cause (through its attributes) of the infinite modes, which are understood as part of the finite chain of infinite causes associated with the most basic laws of nature. While Spinoza does not write directly of the “laws of nature” in this discussion in the Ethics, he does so in the Theological-Political Treatise (TTP), in his discussion of miracles. There Spinoza argues that nothing happens outside of the universal laws of nature, which for him are the same as God’s will and decree. Spinoza writes: “But since nothing is necessarily true except by the divine decree alone, it follows quite clearly that the universal laws of nature are nothing but decrees of God, which follow from the necessity and perfection of the divine nature” (TTP VI.9). He goes on to argue that if a miracle were conceived as an occurrence contrary to the universal laws of nature, it would be contradictory in itself and would mean that God was acting contrary to his own nature. From this passage, it is clear that Spinoza equates what follows from God’s nature with the universal laws of nature, which are eternal and immutable. For this reason, God’s attributes and the infinite modes are often identified with the most general features of the universe, expressing the laws of nature.
We tend to use “laws of nature” when referring to physical laws. Spinoza, however, holds that God can be understood under the attribute of extension or the attribute of thought, that is, God is both extended (IIP2) and thinking (IIP1). For this reason, laws of nature exist not only in the attribute of extension but also in that of thought. Bodies and ideas both follow the laws of nature. Bodies are finite modes of extension, while ideas are finite modes of thought. Accordingly, he argues that “the order and connection of ideas are the same as the order and connection of things” (IIP7). This is Spinoza’s famous “parallelism,” though he never uses this term. While there is much controversy concerning how to interpret this identity, Spinoza indicates that the extended thing and the thinking thing are one and the same thing expressed under two different attributes or conceived from two different perspectives (IIP7S). For this reason, a body, or an extended mode, and its correlating idea, or a thinking mode, are one and the same thing conceived from different perspectives, namely through the attributes of extension or thought.
This claim has two significant consequences. First, when Spinoza indicates that each singular finite thing is determined to exist and to produce an effect by another singular finite thing ad infinitum, this applies to ideas as well as bodies. For this reason, just as bodies and their motion or rest are the cause of other bodies and their motion or rest—in accordance with universal laws of nature, namely the laws of physics—ideas are the cause of other ideas (IIP9), in accordance with universal laws of nature, presumably psychological laws. Second, being one and the same thing, bodies and ideas do not interact causally. That is to say, the order and connection of ideas are one and the same as the order and connection of bodies, but ideas cannot bring about the motion or rest of bodies, nor can bodies bring about the thinking of ideas. Spinoza writes: “The body cannot determine the mind to thinking, and the mind cannot determine the body to motion, to rest, or to anything else (if there is anything else)” (IIIP2). It is clear, then, that both bodies and ideas are causally determined within their respective attributes and that there is no interaction between them. This will have a significant consequence for Spinoza’s understanding of free will versus freedom.
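The combination of the two consequences just described—one order of causes expressed under two attributes, with no causation across them—can be sketched in a toy model. This is only an illustrative picture of the structure, under the assumption (my own simplification, not Spinoza's) that the "one underlying order" can be represented as a single deterministic sequence viewed in two ways.

```python
# Toy picture of "parallelism" (IIP7): one underlying causal order of
# modes, expressed under two attributes. Illustrative only; the rules
# and names here are invented for the sketch.

def next_mode(mode: int) -> int:
    """The one underlying causal order (arbitrary deterministic rule)."""
    return (mode + 7) % 12

def as_extension(mode: int) -> str:
    """The mode conceived under the attribute of extension."""
    return f"body-{mode}"

def as_thought(mode: int) -> str:
    """The very same mode conceived under the attribute of thought."""
    return f"idea-{mode}"

# Unfold one causal order of modes...
modes = [0]
for _ in range(5):
    modes.append(next_mode(modes[-1]))

bodies = [as_extension(m) for m in modes]  # the order of things
ideas = [as_thought(m) for m in modes]     # the order of ideas

# The order and connection are the same, index for index. Crucially,
# neither series is computed from the other: bodies never cause ideas,
# ideas never cause bodies; both only express the one order of modes.
assert [b.split("-")[1] for b in bodies] == [i.split("-")[1] for i in ideas]
```

The design point of the sketch is that removing either "view" leaves the other untouched, which mirrors the denial of mind-body interaction in IIIP2.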
Spinoza’s most challenging consequence from these positions is his blunt denial of contingency in IP29, where he states: “In nature there is nothing contingent, but all things have been determined from the necessity of the divine nature to exist and produce an effect in a certain way.” To recall, finite modes of the one infinite substance (in the case of the attributes of extension or thought, bodies and ideas) are determined to exist by a finite cause (that is, another body or idea), which is further determined to exist by another cause, and so on to infinity. Furthermore, though the connection between singular things and God (conceived as the one eternal, infinite substance) is complex, ultimately, God is the cause of everything that exists, and everything is determined according to the universal and necessary laws of nature expressed by the infinite modes and the other fundamental features of the attributes of God, as mentioned above. In other words, for Spinoza, every event is necessitated by previous causes and the laws of nature.
2. Spinoza on Free Will
Because he is a determinist, Spinoza denies the existence of free will; in contemporary terms, this makes him an incompatibilist as well as a determinist. In contemporary discussions of free will, the major concern centers mostly on the question of whether free will, and thereby moral responsibility, are compatible with determinism. There are two dominant solutions to this problem. Incompatibilism claims that free will and/or moral responsibility are incompatible with determinism because the latter prohibits free choice and thus accountability. Some incompatibilists, namely libertarians, even claim that—because human beings do have free will and we hold each other accountable for our actions—the world is not thoroughly determined. Other incompatibilists argue that if the world is determined, then there is no free will, though they may be agnostic about whether the world is in fact determined. The opposite camp, compatibilism, claims that free will and/or moral responsibility are compatible with determinism, though compatibilists too can be agnostic about whether the world is determined.
Spinoza’s position cannot easily be sorted into this scheme because he distinguishes between free will (libera voluntas) and freedom (libertas). It is very clear that he denies free will because of his determinism: “In the mind there is no absolute, or free, will, but the mind is determined to will this or that by a cause which is also determined by another, and this again by another, and so to infinity” (IIP48). This denial is also, however, a consequence of Spinoza’s conception of the will. In the Scholium to IIP48, Spinoza explains that by “will” he means “a faculty of affirming or denying and not desire” (IIP48S, 484). That is to say, Spinoza, here, wants to emphasize the will as a cognitive power rather than a conative one. In this respect, he seems to be following Descartes, who also understands the will as a faculty of affirming and denying, which, coupled with the understanding, produces judgments. However, against Descartes, Spinoza quickly adds that the will is not, in fact, a faculty at all, but a universal notion abstracted from singular volitions: “we have demonstrated that these faculties are universal notions which are distinguished from the singulars from which we form them” (IIP48S, 484). Spinoza is here referring to his earlier explanation in the Ethics of the origin of “those notions called universals, like man, horse, dog, and the like” (IIP40S, 477). For Spinoza, these universal notions are imaginary, or fictions, that are formed “because so many images are formed at one time in the human body that they surpass the power of imagining.” The resulting universal notion combines what all of the singulars agree on and ignores distinctions.
Spinoza is making two bold and related claims here. First, there is no real faculty of will, that is, no faculty of affirming and denying. Rather, the will is a created fiction, a universal that adds to the illusion of free will. Second, the will is simply constituted by the individual volitions—our affirmations and denials—and these volitions are simply the very ideas themselves. For this reason, Spinoza claims that the will is the same as the intellect (or mind) (IIP49C). Therefore, it is not an ability to choose this or that, as in the traditional understanding, and certainly not an ability to choose between alternative courses of action arbitrarily. It is not even an ability to affirm or deny, as Descartes claimed. Descartes, in explaining error in judgment, distinguishes the intellect from the will. Thus, with his claim that the will is the same as the intellect, Spinoza is directly criticizing the Cartesian view of free will. We will return to this criticism after examining Spinoza’s view of the human mind.
For Spinoza, the human mind is the idea of an actually existing singular thing (IIP11), namely the body (IIP13). So, for instance, my mind is the idea of my body. As mentioned above, Spinoza holds that the order and connection of ideas are the same as the order and connection of things (IIP7) insofar as God is understood through both the attribute of extension and the attribute of thought. This entails that for every body, there is an idea that has that body as its object, and this idea is one and the same as that body, although conceived under a different attribute. On the other hand, Spinoza also characterizes the human mind as a part of the infinite intellect of God (IIP11C), understood as the totality of ideas. For this reason, Spinoza explains that when the human mind perceives something, God has this idea “not insofar as he is infinite, but insofar as he is explained through the nature of the human mind, or insofar as he constitutes the essence of the human mind,” that is, as an affection or finite mode of the attribute of thought.
While Spinoza says the mind is the idea of the body, he also recognizes that the human body is an individual composed of multiple other bodies, which form one individual body through the preservation of a ratio of motion and rest (II Physical Interlude, P1 and L5). Accordingly, every body that composes the individual’s body also has a correlative idea. Therefore, the mind is made up of a multitude of ideas, just as the body is made up of a multitude of bodies (IIP15). Furthermore, when the human body interacts with the bodies external to it, or has what Spinoza calls affections, ideas of these affections (the affections caused by external bodies in the individual human body) become part of the mind, and the mind regards the external body as present (IIP16 and IIP17). These ideas of the affections, however, involve both the nature of the human body and that of the external body. Spinoza calls those “affections of the human body whose ideas present external bodies as present to us” images, and he continues that “when the mind regards bodies in this way, we shall say that it imagines” (IIP17S, 465). Note here that Spinoza avers that images are the affections of the body caused by other bodies, and although they do not always “reproduce the figures of things,” he calls having the ideas of these affections of the body imagining.
As we can see, for Spinoza, the mind is a composite idea that is composed of ideas of the body and ideas of the body’s affections, which involve both the human body and the external body (and ideas of these ideas as well (IIP20)). Without these ideas of the affections of our body, “the human mind does not know the human body, nor does it know that it exists, except through ideas of the affections by which the body is affected” (IIP19). At the same time, Spinoza explains that whenever the human mind perceives something, God has the idea of this thing together with the human mind (IIP11C); but God has the idea which constitutes the human mind only “insofar as he is considered to be affected by the idea of another singular thing” (IIP19D). That is, on the one hand, as explained in IP28, finite singular things come into existence or produce an effect through other finite singular things; on the other hand, to the extent that all things are modes of the one substance, each effect is at the same time caused by God. Though most of our knowledge of the body and the external world comes from ideas of affections, Spinoza claims that these ideas of the body and its affections are for the most part inadequate, that is, incomplete, partial, or mutilated, and therefore not clear and distinct. Spinoza writes: “Insofar as he [God] also has the idea of another thing together with the human mind, we say that the human mind perceives the thing only partially, or inadequately” (IIP11C).
Spinoza argues that for the most part we have only inadequate knowledge (cognitio) of the state of our body, of the external bodies that affect our body, and of our own mind (as ideas of ideas of our body) (IIP26C, IIP27, and IIP28). Our knowledge concerning our body and its affections, the external bodies affecting our body, and our own mind is, therefore, limited in its distinctness. While it is not always entirely clear what Spinoza means by inadequate knowledge or an inadequate idea, he defines an adequate idea as “an idea which, insofar as it is considered in itself, without relation to an object, has all the denominations of a true idea” (IID4). Avoiding the epistemic problems of a correspondence theory of truth, Spinoza argues that we can form adequate ideas insofar as “every idea which in us is absolute, or adequate and perfect, is true” (IIP34). An inadequate idea is an incomplete, partial, or mutilated idea, and Spinoza argues that “falsity consists in the privation of knowledge which inadequate, or mutilated and confused, ideas involve” (IIP35).
Returning to Spinoza’s claim that the will is the same as the intellect: the mind is just constituted by all the individual ideas. To say that the will is the same as the intellect means that, for Spinoza, the will as the sum of individual volitions is just the sum of these individual ideas which compose the mind. What Spinoza has in mind is that our ideas, which constitute our mind, already involve affirmations and negations; no special faculty is needed. To give a simple example, while sitting in a café, I see my friend walk in, order a coffee, and sit down. Perceiving all this is to say that my mind has ideas of the affections of my body caused by external bodies (which is also to say that there is in God the idea of my mind together with the ideas of other things). All these ideas are inadequate, incomplete, or partial. Because I perceive my friend, the idea of the affection of my body affirms that she is present in the café, drinking coffee, sitting over yonder. I am not choosing to affirm these ideas, according to Spinoza; the very ideas already involve affirmations. As I am distracted by other concerns, such as reading a book, these ideas continue to involve the affirmation of her being present in the café, whether or not that is still the case. If I look up and see her again, this new idea reaffirms her presence. But if I look up and she has gone, the new idea negates the previous one.
Spinoza seems to hold that ideas involve beliefs. This is what Spinoza means when he says that the ideas themselves involve affirmations and negations. Rather than the will choosing to assent to or deny things, the will is only the individual volitions, which are in fact the individual ideas, and these always already involve affirmation and/or negation. To be sure, even knowledge as simple as my friend’s presence will involve a complex of indefinitely many affirmations and negations, everything from the general laws of nature to mundane facts about daily life. A consequence of ideas involving affirmation and negation is that error does not result from affirming judgments that are false but rather is a consequence of inadequate knowledge (IIP49SI, 485). Unfortunately, most of our ideas are inadequate. In the above example, it can easily be the case that I continue to have the idea of my friend’s presence when she is no longer in the café, because I will have this idea as long as no other idea negates it (IIP17C).
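The café example has a simple structure that can be sketched as a toy model: each idea already carries its affirmation, and an affirmation persists until another idea negates it. The sketch is mine, and the class and method names are invented for illustration, not Spinoza's terminology.

```python
# Toy model of two Spinozist claims: ideas themselves involve affirmations
# (IIP49), and an affirmation persists until another idea excludes it
# (IIP17C). Illustrative only.

class Mind:
    """A mind as nothing over and above its ideas: no separate 'will'
    assents to them; each idea already carries its affirmation."""

    def __init__(self) -> None:
        self.affirmations: dict[str, bool] = {}

    def receive_idea(self, content: str, affirms: bool) -> None:
        # A new idea of an affection of the body: it is itself an
        # affirmation (or negation), not a mute picture awaiting assent.
        self.affirmations[content] = affirms

    def regards_as_present(self, content: str) -> bool:
        # An affirmation persists until some other idea negates it.
        return self.affirmations.get(content, False)

mind = Mind()
mind.receive_idea("friend is in the cafe", affirms=True)
assert mind.regards_as_present("friend is in the cafe")

# Even after she leaves, absent any negating idea, the affirmation persists...
assert mind.regards_as_present("friend is in the cafe")

# ...until a new perception (the empty chair) negates the earlier idea.
mind.receive_idea("friend is in the cafe", affirms=False)
assert not mind.regards_as_present("friend is in the cafe")
```

Note that `Mind` deliberately has no method for "choosing" to affirm; the only way an affirmation changes is through the arrival of another idea, which is the point of the doctrine.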
For Spinoza, therefore, the will is not free and is the same as the intellect. He is aware that this is a strange teaching, explicitly pointing out that most people do not recognize its truth. The reason for this failure to recognize the doctrine that the will is not free can, however, be understood both as an epistemic confusion and as a global illusion born of ignorance. Epistemically, most people do not understand that an idea involves an affirmation or negation, but instead believe the will is free to affirm or deny ideas. According to Spinoza, “because many people either completely confuse these three – ideas, images, and words – or do not distinguish them accurately, or carefully enough, they have been completely ignorant of this doctrine concerning the will” (IIP49SII, 485-86). First, some people confuse ideas with images “which are formed in us from encounters with bodies.” Images, for Spinoza, are physical and extended, and are, therefore, not ideas. But these people take the ideas to be formed by the direct relation between the mind and body. This has two results: a) ideas of things of which no image can be formed are taken to be “only fictions which we feign from free choice of the will”. In other words, some ideas are not understood as ideas (which involve affirmation and negation) caused by other ideas but as choices of the free will; b) these people “look on ideas, therefore, as mute pictures on a panel,” which do not involve affirmation or negation but are affirmed and denied by the will. Second, some people confuse words with ideas or with the affirmation involved in the ideas. Here they confuse affirmations and negations with willfully affirming or denying in words. Spinoza points out that they cannot affirm or deny something contrary to what the very idea in the mind affirms or negates. They can only affirm or deny in words what is contrary to an idea.
In the above example, I can deny in words that my friend is in the café, but these words will not be a negation of the idea which I had while perceiving her as being in the café. For Spinoza, images and words are both extended things and not ideas. This confusion, however, has hindered people from realizing that ideas in themselves already involve affirmations and negations.
Spinoza further explains these confusions and defends his view against possible objections. It is here that Spinoza launches his attack on the Cartesian defense of free will and its involvement in error. Before turning to these possible objections and Spinoza’s replies, a brief overview of Descartes’ view of the will is helpful. In the Fourth Meditation, Descartes explains error through the different scopes of the intellect and the will. The former is limited, since we have only limited knowledge, that is, clear and distinct ideas, while our will extends in its application to virtually everything and is thus infinite. Descartes writes, “This is because the will simply consists in our ability to do or not do something (that is, to affirm or deny, to pursue or avoid), or rather, it consists simply in the fact that when the intellect puts something forward for affirmation or denial, for pursuit or avoidance, our inclinations are such that we do not feel we are determined by any external force” (57). Descartes continues, however, that freedom of the will does not consist in indifference. The more the will is inclined toward the truth and goodness of what the intellect presents to it, the freer it is. Descartes’ remedy against error is the suspension of judgment whenever the intellect cannot perceive the truth or goodness clearly and distinctly. Descartes, therefore, understands the will as a faculty of choice, which can affirm or deny freely to make judgments upon ideas presented by the intellect. Though the will is freer when it is based on clear and distinct ideas, it still has an absolute power of free choice in its ability to affirm or deny.
Turning to the possible objections to Spinoza’s view of the will brought up in IIP49S, the first common objection concerns the alleged difference in scope between the intellect and the will. Spinoza disagrees that the “faculty of the will” has a greater scope than the “faculty of perception”. Spinoza argues that this only seems to be the case because: 1) if the intellect is taken to involve only clear and distinct ideas, then it will necessarily be more limited; and 2) the “faculty of the will” is itself a universal notion “by which we explain all the singular volitions, that is, it is what is common to them all” (488). Under this view of the will, the power of assenting seems infinite because it employs a universal idea of affirmation that seems applicable to everything. Nevertheless, this view of the will is a fiction. Against the second common objection, that we know from experience that we can suspend judgment, Spinoza denies that we have the power to do so. What actually happens when we seem to hold back our judgment is nothing but an awareness that we lack adequate ideas. Therefore, suspension of judgment is nothing more than perception and not an act of free volition. Spinoza provides examples to illustrate his argument, among them that of a child who imagines a winged horse. The child will not doubt the existence of the winged horse, unlike an adult who has ideas that exclude the existence of winged horses, until he learns the inadequacy of such an idea. Spinoza is careful to note that perceptions themselves are not deceptive. But they do already involve affirmation independently of their adequacy. For this reason, if nothing negates the affirmation of a perception, the perceiver necessarily affirms the existence of what is perceived.
The third objection is that, since it seems equally possible to affirm something which is true as to affirm something which is false, the affirmation cannot spring from knowledge but from the will. Therefore, the will must be distinct from the intellect. In reply to this, Spinoza reminds us that the will is something universal, which is ascribed to all ideas because all ideas affirm something. As soon as we turn to particular cases, the affirmation involved in the ideas is different. Moreover, Spinoza “denies absolutely” that we need as much power of thinking to affirm as true what is true as to affirm as true what is false. An adequate or true idea is perfect and has more reality than an inadequate idea, and therefore the affirmation involved in an adequate idea is different from that of an inadequate idea. Finally, the fourth objection refers to the famous Buridan’s ass, caught equidistant between two piles of feed. A human being in such an equilibrium, if he had no free will, would necessarily die. Spinoza, rather humorously, responds, “I say that I grant entirely that a man placed in such an equilibrium (namely, who perceives nothing but thirst and hunger and such food and drink as are equally distant from him) will perish of hunger and thirst. If they ask me whether such a man should be thought an ass rather than a man, I say that I do not know – just as I also do not know how highly we should esteem one who hangs himself, or children, fools, and madmen, and so on” (IIP49S, 490).
Besides answering the common objections to his identification of the will with the intellect, Spinoza also provides an explanation for the necessary origin of our illusory belief that the will is free (see Melamed 2017). Spinoza alludes to this illusion a number of times. In the Ethics, it first occurs in the Appendix to Part 1, when he argues against natural teleology. He writes that,
All men are born ignorant of the causes of things, and that they all want to seek their own advantage and are conscious of this appetite. From these it follows, first, that men think themselves free, because they are conscious of their volitions and their appetites, and do not think, even in their dreams, of the causes by which they are disposed to wanting and willing, because they are ignorant of those causes. (440)
That is, because human beings are 1) ignorant of the causes of their volitions but 2) conscious of their desires, they necessarily believe themselves to be free. Hence, free will is an illusion born of ignorance. In correspondence with Schuller, Spinoza provides a vivid image of the illusion of free will, writing that a stone, when put into motion, if it could judge, would believe itself free to move, though it is determined by external forces. This is exactly the same for human beings’ belief in free will. Spinoza even writes that “because this prejudice is innate in all men, they are not so easily freed from it” (Letter 58, 428).
Spinoza has another extensive discussion of free will as a result of ignorance in the scholium of IIIP2 in the Ethics. The proposition states, “The body cannot determine the mind to thinking, and the mind cannot determine the body to motion, to rest, or to anything else (if there is anything else)” (IIIP2). Spinoza’s parallelism holds that the mind and the body are one and the same thing conceived through different attributes, so there is no causal interaction between the attributes. The order and connection of ideas are the same as the order and connection of bodies, but it is not possible to explain the movement of bodies in terms of the attribute of thought, nor is it possible to explain the thinking of ideas through the attribute of extension. Spinoza is well aware that this will be unacceptable to most people who believe their will is free and that it is the mind which causes the body to move: “They are so firmly persuaded that the body now moves, now is at rest, solely from the mind’s command, and that it does a great many things which depend only on the mind’s will and its art of thinking” (IIIP2S, 494-95).
Against this prejudice, Spinoza defends his position by pointing out 1) that human beings are so far quite ignorant of the mechanics of the human body and its workings (for instance, the brain) and 2) that human beings cannot explain how the mind can interact with the body. He further elucidates these points by responding to two objections taken from experience.
But they will say [i] that – whether or not they know by what means the mind moves the body – they still know by experience that unless the human mind were capable of thinking, the body would be inactive. And then [ii], they know by experience, that it is in the mind’s power alone both to speak and to be silent, and to do many other things, which they therefore believe to depend on the mind’s decision. (495)
In response to the first objection, Spinoza argues that while it is true that the body cannot move if the mind is not thinking, the contrary, that the mind cannot think if the body is inactive, is equally true, for they are, after all, one and the same thing conceived through different attributes. Against the widespread disbelief that “the causes of buildings, of painting, and of things of this kind, which are made only of human skill, should be able to be deduced from the laws of Nature alone, insofar as it is considered corporeal” (496), Spinoza responds by reaffirming that humans are not yet aware of what the human body can do according to its own laws. He gives an interesting example of sleepwalkers doing all kinds of actions, none of which they recall when they are awake.
Concerning the second objection, that humans apparently speak (a physical action) from the free power of the mind, which would indicate that the mind controls the body, Spinoza states that humans have just as much control over their words as over their appetites. He points out that they can hold their tongue only in cases of a weak inclination to speak, just as they can resist indulging a weak inclination to certain pleasures. But when it comes to stronger inclinations, humans often suffer from akrasia, or weakness of will. Again, they believe themselves to be free when, in fact, they are driven by causes they do not know. He points to:
[The infant believing] he freely wants the milk; the angry child that he wants vengeance; and the timid, flight. So, the drunk believes it is from a free decision of the mind that he speaks the things he later, when sober, wishes he had not said. So, the madman, the chatterbox, the child, and a great many people of this kind believe they speak from a free decision of the mind, when really they cannot contain their impulse to speak. (496)
Here again, Spinoza argues that humans believe themselves free because they are conscious of their own desires but ignorant of their causes. Relating the will to the body, he then states that, since bodies and minds are one and the same, decisions of the mind are identical with the appetites and determinations of the body, understood under different attributes.
Finally, Spinoza points out that humans could not even speak unless they recollected words, though recollecting or forgetting is itself not at will, that is, not by the free power of the mind. The objector might then retreat to the claim that the power of the mind consists only in deciding to speak or not to speak. However, Spinoza counters that humans often dream they are speaking and in their dreams believe that they do this freely, though they are not in fact speaking. In general, when humans are dreaming, they believe they are freely making many decisions, but in fact they are doing nothing. Spinoza asks pointedly:
So, I should very much like to know whether there are in the mind two kinds of decisions – those belonging to our fantasies and those that are free? And if we do not want to go that far in our madness, it must be granted that this decision of the mind, which is believed to be free, is not distinguished from the imagination itself, or the memory, nor is it anything beyond that affirmation which the idea, insofar as it is an idea, necessarily involves. And so the decisions of the mind arise by the same necessity as the ideas of things which actually exist. Those, therefore, who believe that they speak or are silent or do anything from a free decision of the mind, dream with open eyes. (497)
One final point concerning the illusion of free will: Spinoza uses belief in free will as one of his examples of error in IIP35S. IIP35 states that “falsity consists in the privation of knowledge which inadequate, or mutilated and confused ideas, involve.” In the Scholium, he reiterates the now familiar cause of the belief in free will, namely, that humans are conscious of their volitions but ignorant of the causes which determine their volitions. However, Spinoza here is not just claiming that we have an inadequate knowledge of the causes of our volitions leading us to err in thinking the will is free. He makes the stronger claim that because our knowledge of the will is inadequate, we cannot help but imagine that our will is free, that is, we cannot help but experience our will as free in some way, even if we know that it is not.
This can be seen from the second example of error that he uses. When looking at the sun, we imagine that it is rather close. But, Spinoza argues, the problem is not just the error of judging the distance to be much smaller than it is. The problem is that we imagine (that is, we have an idea of the affection of our body affected by the sun) or experience the sun as being two hundred feet away regardless of whether we adequately know the true distance. Even knowing the sun’s true distance from our body, we will always experience it as being about two hundred feet away. Similarly, even if we adequately understand that our will is not free but that each of our volitions is determined, we will still experience it as free. The reason for this is explained in IIP48S, where Spinoza argues that the will, understood as an absolute faculty, is a “complete fiction or metaphysical being, or universal” which we form, however, necessarily. As mentioned above, universals are formed when the power of imagining is surpassed, the body being overloaded with images through its affections, and a notion is formed by focusing on similarities and ignoring a great many of the differences among the ideas. Spinoza’s point here, in emphasizing the inevitability of error due to the prevalence of imagination and the limited scope of our reason, is that humans cannot escape the illusion of free will.
3. Spinoza on Human Freedom
While Spinoza denies that the will is free, he does consider human freedom (libertas humana) possible. Given the caveat just described, this freedom must be understood as limited. For Spinoza, freedom is the end of human striving. He often equates freedom with virtue, happiness, and blessedness (beatitudo), the more familiar ends of human activity (for an overview, see Youpa 2010). Spinoza does not understand freedom as a capacity for choice, that is, as liberum arbitrium (free choice), but rather as consisting in acting as opposed to being acted upon. For Spinoza, freedom is ultimately constituted by activity. In Part I of the Ethics, Spinoza defines the free thing as follows: “that thing is called free which exists from the necessity of its nature alone, and is determined to act by itself alone. But a thing is called necessary, or rather compelled, which is determined by another thing to exist and produce an effect in a certain and determinate manner” (ID7). According to this definition, only God, properly speaking, is absolutely free, because only God exists from the necessity of his nature and is determined to act from his nature alone (IP17 and IP17C2). Nevertheless, Spinoza argues that freedom is possible for human beings insofar as they act: “I say we act when something happens, in us or outside of us, of which we are the adequate cause, that is, (by D1), when something in us or outside of us follows from our nature, which can be clearly and distinctly understood through it alone” (IIID2). IIID1 gives the definition of adequate cause: “I call that cause adequate whose effect can be clearly and distinctly perceived through it.” From these definitions, we can see that if human freedom is constituted by activity, then freedom will be constituted by having clear and distinct ideas or adequate knowledge.
Above, it was seen that for Spinoza, will and intellect are one and the same. The will is nothing but singular volitions, which are ideas. These ideas already involve affirmation and negation (commonly ascribed to the faculty of will). In Part II, when arguing against the Cartesian view of the will, Spinoza emphasizes the will as a supposed “faculty of affirming and denying” in order to dispel the universal notion of a free will. In Part III, in his discussion of affects, he provides a fuller description of the will and the affective nature of ideas, providing the tools for his discussion of human freedom. By “affect,” Spinoza understands “the affections of the body by which the body’s power of acting is increased or diminished, aided or restrained, and at the same time, the ideas of these affections” (IIID3). Accordingly, he concludes that “if we can be the adequate cause of any of these affections, I understand by the affect an action; otherwise, a passion.” There is thus a close connection between activity and adequate ideas, as well as between passions and inadequate ideas (IIIP3).
Since most of our knowledge involves ideas of affections of the body, which are inadequate ideas, human beings undergo many things, and the mind suffers many passions until the human body is ultimately destroyed. Nevertheless, Spinoza argues that “each thing, as far as it can by its own power, strives to persevere in its own being” (IIIP6). This is Spinoza’s famous conatus principle, by which each individual strives to preserve its being or maintain what might be called homeostasis. In fact, Spinoza argues that the conatus, or striving, is the very essence of each thing (IIIP7). Furthermore, this striving is the primary affect, appetite, or desire. The conatus, or striving, when related solely to the mind, is understood as the will. When the conatus is conceived as related to both mind and body, Spinoza calls it appetite, and when humans are conscious of their appetite, he calls it desire (IIIP9S). Hence, Spinoza defines desire thus: “desire is man’s very essence, insofar as it is conceived to be determined, from any given affection of it, to do something” (Def. Aff. I).
The conatus is central to Spinoza’s entire moral psychology, from which he derives his theory of affects, his theory of freedom, and his ethical and political theories. In arguing that any human individual is fundamentally striving (conatus) to persevere in being, Spinoza follows Hobbes’ moral psychology. In the Leviathan, Hobbes introduces his concept of conatus in its English rendering: “the small beginnings of motion within the body of man, before they appear in walking, speaking, and other visible action, are commonly called endeavor [conatus]. This endeavor, when it is toward something which causes it, is called appetite or desire” (Leviathan VI.1-2). Such desire or voluntary motion does not spring from a free will, Hobbes argues, but has its origins in the motion of external bodies imparting their motion to the human body, producing sensation. Thus, Hobbes already equates the conatus with the will. Hobbes also already derives a taxonomy of passions from the conatus, albeit one that is far less sophisticated and complex than Spinoza’s. Furthermore, Hobbes holds that the entire life of human beings consists of an endless desire for power, by which he understands “the present means to attain some future apparent good” (Leviathan X.1). This desire for power ends only with the eventual death of the individual (Leviathan XI.2). For Hobbes, humans are, for the most part, led by their passions, as, for instance, in the construction of a commonwealth from the state of nature, in which they are led by the fear of death and hope for a better life (Leviathan XIII.14), though, of course, reason provides the means by which the construction of the state is possible. While there are many parallels between Hobbes’ and Spinoza’s psychology, Hobbes understands the conatus entirely as physical, explained by a materialistic mechanical philosophy. In contrast, for Spinoza, the conatus is both physical and psychological, according to his parallelism.
Notwithstanding his focus on ethics, Spinoza’s account of the affects often emphasizes psychological explanations.
From desire, that is, the conscious appetite of striving, Spinoza derives two other primary affects, namely joy and sadness. Spinoza describes joy as the passage or transition of the mind from a lesser to a greater perfection or reality, and sadness as the opposite, the passage of the mind from a greater to a lesser perfection or reality (IIIP11S). The affect of joy as related to both mind and body he calls “pleasure or cheerfulness,” and that of sadness “pain or melancholy.” IIIP11 underlines Spinoza’s parallelism with respect to his theory of affects: “the idea of anything that increases or diminishes, aids or restrains, our body’s power of acting, increases or diminishes our power of thinking” (IIIP11). In these basic definitions, Spinoza employs the concept of perfection or reality (equated in IID6). What he means by this can be grasped rather intuitively. The more perfection or reality an individual has, the more power it has to persevere in being, or the more the individual is capable of acting and thinking. When this power increases through a transition to greater perfection, the individual experiences joy. But if it decreases to lesser perfection, it experiences sadness.
Spinoza holds that from these three main affects all others, in principle, can be deduced or explained. However, the variety of affects depends not only on the individual but also on all the external circumstances under which they strive. Still, Spinoza provides explanations of the major human affects and their origin from other affects. The first affects he deduces from joy and sadness are love and hate. Whatever an individual imagines to increase their power and cause joy, they love; and whatever they imagine to decrease their power and cause sadness, they hate: “Love is nothing but joy and the accompanying idea of an external cause, and hate is nothing but sadness with the accompanying idea of an external cause” (IIIP13S). Accordingly, human beings strive to imagine those things (that is, have ideas of the affections of their body caused by those things) that increase their power of acting and thinking (IIIP12), causing joy, while avoiding imagining things that decrease their power of acting and thinking, causing sadness. Like Hobbes, Spinoza holds that human beings strive to increase their power; Spinoza, though, understands this specifically as a power to act and, indeed, to think.
Furthermore, because “the human body can be affected in many ways in which its power of acting is increased or diminished, and also in others which render its power of acting neither greater nor less” (III Post.I), there are many things which become the accidental cause of joy or sadness. In other words, it can happen that an individual loves or hates something not according to what actually causes joy (or an increase in power) or sadness (or a decrease in power), but rather according to what merely appears to bring joy or sadness. This is possible because human beings are usually affected by two or more things at once, one or more of which may increase or decrease their power, causing joy or sadness, while others have no effect. Moreover, an individual, remembering experiences of joy or sadness accidentally related to certain external causes, can come to love and hate many things by association (IIIP14). Indeed, Spinoza holds that there are as many kinds of joy, sadness, and desire as there are objects that can affect us (IIIP56), noting the well-known excessive desires of gluttony, drunkenness, lust, greed, and ambition.
Spinoza ultimately develops a rich taxonomy of passions and their mixtures, including the more common anger, envy, hope, fear, and pride, but also gratitude, benevolence, remorse, and wonder, to name a few. Not only does he define these passions, but he also gives an account of their logic, which is paramount for understanding the origin of these passions and thereby ultimately overcoming them. True to his promise in the preface to the third part, Spinoza treats the affects “just as if it were a question of lines, planes, and bodies” (492). Initially and broadly, Spinoza discusses those affects that are passions because we experience them when we are acted upon. Human beings are passive in their striving to persevere in their being due to their inadequate ideas about themselves, their needs, and external things. Therefore, their striving to imagine what increases their power and to avoid imagining what decreases it often fails, leading to a variety of affects of sadness. In contrast to traditional complaints about the weakness of humans with respect to their affects, however, Spinoza argues that “apart from the joy and desire which are passions, there are other affects of joy and desire which are related to us insofar as we act” (IIIP58) and that all such affects related to humans insofar as they act are ones of joy or desire and not sadness. Of course, this makes sense, as sadness is the transition from greater to lesser perfection and a decrease in the power of acting or thinking.
Spinoza’s theory of affects provides the foundation for his theory of human freedom, because ultimately freedom involves maximizing acting and minimizing being acted upon, that is, having active affects and not suffering passions. Recall that for Spinoza only God is absolutely free, because only God is independent as a self-caused substance and acts according to the necessity of his own nature, and because Spinoza defines a free thing as that which “exists from the necessity of its nature alone, and is determined to act by itself alone.” Human beings cannot be absolutely free. But insofar as they act, they are the adequate cause of their actions. This is to say that the action “follows from their nature, which can be clearly and distinctly understood through it alone” (IIID2). Therefore, when human beings act, they are free. This is opposed to being acted upon, or having passions, in which case humans are only the inadequate or partial cause and are not acting according to their nature alone but are determined by something outside of themselves (see Kisner 2021). Therefore, the more human beings act, the freer they are; the more they suffer from passions, the less free they are.
Thus, Spinoza understands freedom in terms of activity as opposed to passivity, acting as opposed to being acted upon, or being the adequate cause of something as opposed to the inadequate cause of something: “I call that cause adequate whose effect can be clearly and distinctly perceived through it. But I call it partial or inadequate if its effect cannot be understood through it alone” (IIID1). From the perspective of the attribute of thought, being the adequate cause of an action is a function of having adequate ideas or true knowledge. He writes, “Our mind does certain things [acts] and undergoes other things, namely, insofar as it has adequate ideas, it necessarily does certain things, and insofar as it has inadequate ideas, it necessarily undergoes other things” (IIIP1). Spinoza’s reasoning here is that when the mind has an adequate idea, this idea is adequate in God insofar as God constitutes the mind through the adequate idea. Thus, the mind is the adequate cause of the effect because the effect can be understood through the mind alone (by the adequate idea) and not through something outside of the mind. But in the case of inadequate ideas, the mind is not the adequate cause of something, and thus the inadequate idea is, in God, the composite of the idea of the human mind together with the idea of something else. For this reason, the effect cannot be understood as being caused by the mind alone. Thus, it is the inadequate or partial cause. While this is Spinoza’s explanation of how being an adequate cause involves having adequate knowledge, there is some controversy among scholars about the status of humans having adequate ideas and true knowledge.
In Part II of the Ethics, in IIP40S2, Spinoza differentiates three kinds of knowledge, which he calls imagination, reason, and intuitive knowledge. The first kind, imagination, mentioned above, has its sources in bodies affecting the human body and the ideas of these affections, or perception and sensation. It also includes associations with these things by signs or language. This kind of knowledge is entirely inadequate or incomplete, and Spinoza often writes that it has “no order for the intellect” or follows from the “common order of Nature,” that is, it is random and based on association. Passions, or passive affects, fall within the realm of imagination because imaginations are quite literally the result of the body being acted upon by other things, or, what is the same, ideas of these affections. The other two kinds of knowledge are adequate. Reason is knowledge that is derived from the knowledge of the common properties of all things, what Spinoza calls “common notions”. His thinking here is that there are certain properties shared by all things and that, being in the part and in the whole, these properties can only be conceived adequately (IIP38 and IIP39). The ideas of these common properties cannot but be adequate in God when God is thinking the idea that constitutes the human mind together with the ideas of other things. Also, ideas that are deduced from adequate ideas are themselves adequate (IIP40). The common notions, therefore, are the foundation of reasoning.
Some commentators, however, have pointed out that it seems impossible for humans to have adequate ideas. Michael Della Rocca, for instance, argues that having an adequate idea seems to involve knowledge of the entire causal history of a particular thing, which is not possible (Della Rocca 2001, 183, n. 29). This is because of Spinoza’s axiom that “the knowledge of an effect depends on and involves the knowledge of its cause” (IA4), and, as we have seen, finite singular things are determined to exist and produce an effect by another finite singular thing, and so on ad infinitum. Thus, adequate knowledge of anything would require adequate knowledge of all the finite causes in the infinite series. Eugene Marshall avoids this problem by arguing that it is possible to have adequate knowledge of the infinite modes (Marshall 2011, 31-36), which some commentators take, for Spinoza, to be the concern of the common notions (Curley 1988, 45fn; Bennett 1984, 107). Indeed, Spinoza argues that humans have adequate knowledge of God’s eternal and infinite essence (IIP45-P47), which would include knowledge of the attributes and infinite modes. Intuitive knowledge is also adequate, though it is less clear what specifically it entails. Spinoza defines it as “a kind of knowing [that] proceeds from an adequate idea of the formal essence of certain attributes of God to the adequate knowledge of the formal essence of things” (IIP40S2, 478). Here, Spinoza does indicate knowledge of the essences of singular things, returning to the above problem, though Marshall, for instance, points out that Spinoza does not mean the essences of finite modes existing in duration (existing in time), which would require knowledge of the causal history of a finite mode. Rather, he suggests that Spinoza here speaks of the idea of the essence of things sub specie aeternitatis, that is, things considered as existing in the eternal attributes of God (Marshall 2011, 41-50).
Furthermore, rational knowledge and intuitive knowledge are related (Spinoza argues that rational knowledge encourages intuitive knowledge) yet distinct (VP28).
Rational knowledge and intuitive knowledge, because they involve adequate ideas, are necessary for human freedom. Again, this is because human freedom is constituted by activity, and humans act when they are the adequate cause of something that follows from their nature (IIID2). Moreover, humans can be the adequate cause, in part, when the mind acts or has adequate ideas (IIIP1). This is how Spinoza explains the possibility of human freedom metaphysically. However, human freedom, which Spinoza equates with virtue and blessedness, is the end of human striving, that is, the ongoing project of human existence. The essence of a human being, the conatus, is the striving to persevere in being and consequently to increase the power of acting and thinking, and this increase brings about the affect of joy. This increase in the power of acting and thinking can occur passively (the passion of joy) when human beings strive from inadequate ideas, or it can occur actively when human beings strive from adequate ideas, that is, from reason and intuitive knowledge. The more human beings strive for adequate ideas or act rationally in accordance with their own nature, the freer they are and the greater is their power of acting and thinking and the consequent joy. Therefore, reason and intuitive knowledge are paramount for freedom, virtue, and blessedness (VP36S) (see Soyarslan 2021).
For Spinoza, human freedom is very different from free will as ordinarily understood. It is not a faculty or ability apart from the intellect. Rather, it is a striving for a specific way of life defined by activity, reason, and knowledge instead of passivity and ignorance. Determinism is not opposed to this view of freedom, since freedom is understood as acting according to one’s own nature and not being compelled by external forces, especially the passions. In this respect, it bears similarities, in different ways, to the views of freedom held by Hobbes and by the Stoa. For Hobbes, a materialist, freedom properly applies only to bodies and concerns the absence of external impediments to the motion of a body. Accordingly, to call a human being free indicates that he, “in those things which by his own strength and wit he is able to do, is not hindered to do what he has a will to” (Leviathan XXI.1-2). However, Spinoza’s view of freedom differs substantially from Hobbes’s in that he has a more extensive view of what it means to be impeded by external forces, recognizing that the order of ideas and the order of bodies are one and the same. For the Stoa, generally speaking, freedom consists in living a rational life according to nature. If one lives according to nature, which is rational, one can be free despite the fact that nature is determined, because one conforms one’s desires to the order of nature through virtue. A famous illustration of such an understanding of freedom is that of a dog tied to a cart. If the dog willingly follows the cart that is pulling it, it acts freely; if it resists the motion of the cart, being pulled along nonetheless, it lacks freedom (Long 1987, 386). For Spinoza, likewise, freedom does not conflict with determinism, so long as human beings are active and not passive. Accordingly, the greatest impediments to freedom are the passions, which can so overcome the power of an individual that the individual is in bondage, a slave to them.
Spinoza famously writes: “Man’s lack of power to moderate and restrain the affects I call bondage. For the man who is subject to affects is under the control, not of himself, but of fortune, in whose power he so greatly is that often, though he sees the better for himself, he is still forced to follow the worse” (IV Preface, 543). In these lines, Spinoza not only presents the problem that the passions pose to human thriving but also situates this problem within the context of the classic enigma of akrasia, or weakness of will.
In the first 18 propositions of Part IV of the Ethics, entitled “Of Human Bondage, or the Power of the Affects,” Spinoza aims to explain “the causes of man’s lack of power and inconstancy, and why men do not observe the precepts of reason” (IVP18S, 555). First, he sets up the general condition that human beings, being a part of nature, are necessarily acted upon by other things (IVP2). Their power in striving to persevere in being is limited and surpassed by the power of other things in nature (IVP3). Therefore, it is impossible for them to be completely free or to act only in accordance with their own nature (IVP4). Accordingly, Spinoza admits, “from this it follows that man is necessarily always subject to passions, that he follows and obeys the common order of Nature, and accommodates himself to it as much as the nature of things requires” (IVP4C). This, of course, is the reason that human freedom is always limited and requires constant striving. Human beings are constantly beset by passions, but what is worse, the power of a passion is defined by the power of external causes in relation to an individual’s power (IVP5). That is to say, human beings can be overwhelmed by the power of external causes in such a way that “the force of any passion or affect can surpass the other actions, or powers of a man, so that the affect stubbornly clings to the man” (IVP6). This can be easily understood from the universal human experiences of grief and loss, envy and ambition, great love and hatred, as well as from any form of addiction and excessive desire for pleasures. Such passions, and even lesser ones, are hard to regulate and can interrupt our striving for a good life or even the completion of the simple tasks of daily life.
In IVP7, Spinoza touches on the main issue in akrasia, writing that “an affect cannot be restrained or taken away except by an affect opposite to and stronger than the affect to be restrained”. Here we can see why merely knowing what is good or best does not restrain an affect, and why humans often see the better course of action but pursue the worse. The issue is that, for Spinoza, a true or adequate idea does not restrain a passion unless it is also an affect that increases the individual’s power of action (IVP14). Furthermore, an affect’s power is compounded by its temporal and modal relationship to the individual. For instance, temporally, an affect whose cause is imagined to be present is stronger than one whose cause is not (IVP9), as it is if the cause is imagined to be imminent rather than far in the future, or recent rather than in distant memory (IVP10). Likewise, modally, an affect toward something humans view as necessary is more intense than if they view it as possible or contingent (IVP11).
Because the power of the affects is temporally and modally conditioned, and because an affect can be restrained only by an opposite and more powerful affect, it often happens that a desire that does come from true knowledge or adequate ideas is still overcome by passions (IVP15). This can easily be seen in a desire for some future good that is overcome by the longing for the pleasures of the moment (IVP16), as is so often the case. However, “a desire that arises from joy is stronger, all things being equal, than one which arises from sadness” (IVP18). That joy is more powerful than sadness is prima facie a good thing, except that in order to overcome the passions and achieve the good life, true knowledge of good and evil in the affects is necessary. Spinoza’s conception of the good life, or what he calls blessedness, is in essence overcoming this domination of the passions and providing the tools for living a life of the mind, which is the life of freedom (see James 2009). Thus, Spinoza provides guidance for how such a good life can be achieved in Parts IV and V of the Ethics, namely in the ideal exemplar of the free man and the so-called remedies for the passions.
4. The Free Man and the Way to Freedom
In the preface to Part IV of the Ethics, Spinoza introduces the idea of the model of human nature, or the “free man”. The free man is understood as an exemplar to which humans can look to decide whether an action is good or evil (there is some controversy over the status of the free man, for instance, see Kisner 2011, 162-78; Nadler 2015; Homan 2015). Spinoza is often interpreted as a moral anti-realist because of some of his claims about moral values. For instance, he writes “We neither strive for, nor will, neither want, nor desire anything because we judge it to be good; on the contrary, we judge it to be good because we strive for it, will it, want it, and desire it” (IIIP9S). And by “good here I understand every kind of joy, and whatever leads to it, and especially what satisfies any kind of longing, whatever that may be. And by evil, every kind of sadness, and especially what frustrates longing” (IIIP39S, 516). However, as anything can be the accidental cause of joy or sadness (IIIP15), it would seem that good and evil, or some goods and evils, are relative to the individual, as is the case for Hobbes. Moreover, Spinoza indicates that in nature there is nothing good or evil in itself. He writes “As far as good and evil are concerned, they also indicate nothing positive in things, considered in themselves, nor are they anything other than modes of thinking or notions we form because we compare things to one another” (IV Preface, 545) (for an overview of Spinoza’s meta-ethics, see Marshall 2017).
Nevertheless, in Part IV of the Ethics, Spinoza redefines good and evil. Good is now understood as what is certainly known to be useful to us, and evil as what is certainly known to prevent the attainment of some good (IVD1 and IVD2). What does Spinoza mean here by “useful”? What is useful to a human individual is what will allow them to persevere in being and increase their power of acting and thinking, especially according to their own nature, or “what will really lead a man to greater perfection” (IVP18S, 555). This new definition of good as what is really useful is distinguished from mere joy or pleasure, which, insofar as it prevents us from attaining some other good, can be an evil. For Spinoza, the most useful thing for humans is virtue (IVP18S), by which they can attain greater perfection, or greater power of acting and thinking. In order to understand what is really useful and good, Spinoza proposes the idea of the free man “as a model of human nature which we may look to”. For this reason, he also defines good relative to this model, writing, “I shall understand by good, what we certainly know is a means by which we may approach nearer and nearer to the model of human nature we set before ourselves” (IV Preface, 545).
With this model of human nature in mind, Spinoza then goes on to give an analysis of what is good and evil in the affects. Generally speaking, all passions that involve sadness, that is, affects that decrease the perfection or reality of an individual and consequently the ability of the mind to think and the body to act, are evil (IVP41). For instance, hate towards other humans is never good (IVP45), and all species of such hate, such as envy, disdain, and anger, are evil (IVP45C2). Also, any affects that are mixed with sadness, such as pity (IVP50), or that are vacillations of the mind, like hope and fear (IVP47), are not good in themselves. In contrast, all affects that are joyful, that is, which increase the reality or perfection of an individual and consequently the ability of the mind to think and the body to act, are directly good. Spinoza qualifies this, however: the net increase or decrease in the power of the individual has to be taken as a whole, with its particular conditions, and over time. For instance, the passions of joy and pleasure might be excessive (IVP43) or relate to only one part of an individual (IVP60), and the power of passions, being defined by the power of external causes, can easily overcome our power of acting and thinking as a whole and thus lead to greater sadness. Likewise, some sadness and pain might be good to the extent that they prevent a greater sadness or pain by restraining excessive desires (IVP43). It can easily be seen that love, which is a species of joy, can be evil if excessive. Spinoza writes:
Sickness of the mind and misfortunes take their origin, especially, from too much love towards a thing which is liable to many variations and which we can never fully possess. For no one is disturbed or anxious concerning anything unless he loves it, nor do wrongs, suspicions, and enmities arise except from love for a thing which no one can really fully possess. (VP20S, 606)
Here again, it can be seen that, though joy in itself is directly good, it is often problematic as a passion and sometimes leads to sadness. Nevertheless, there is an interesting asymmetry here. While human beings’ passivity often leads them to experience passions that are varieties of sadness, there are certain passions of joy that can, all things being equal, increase the power of an individual. This asymmetry explains how human beings can increase their power of thinking and acting even before they can act on adequate ideas. Therefore, it is important to note that joyful passions, despite being passions, can be good and increase activity, and insofar as they increase our power of acting, they add to our freedom (see Goldenbaum 2004; Kisner 2011, 168-69). In this respect, the view of the passions developed by Spinoza, undoubtedly influenced by Stoicism, differs from the general Stoic view. For the Stoa, virtue is living according to reason. The goal of the Stoic sage is to reach ataraxia, a state of mental tranquility, through apatheia, a state in which one is not affected by the passions (pathē), which by definition are bad. By contrast, Spinoza explicitly understands passions of joy, all things being equal, as good.
Moreover, Spinoza also emphasizes that there are many things external to the human individual that are useful and therefore good, including all the things that preserve the body (IVP39) and allow it to interact optimally with the world (IVP40): “It is the part of a wise man, I say, to refresh and restore himself in moderation with pleasant food and drink, with scents, with the beauty of green plants, with decorations, music, sport, the theater, and other things of this kind, which anyone can use without injury to another” (IVP45S, 572). Most significant in the category of external goods are other human beings. While other humans can be one of the greatest sources of conflict and turmoil insofar as they are subject to passions (IVP32-34), Spinoza also thinks that “there is no singular thing in Nature which is more useful to man than a man who lives according to the guidance of reason” (IVP35C). For this reason, Spinoza recognizes, like Aristotle, that good political organization and friendship are foundational to the good life of freedom, virtue, and blessedness (IVP73, for instance).
Leaving aside the many things in nature that are useful and good for human freedom, despite being external to the individual, what is ultimately constitutive of human freedom is active affects, or, what is the same, rational activity, that is, striving to persevere in being through the guidance of reason and understanding. Actions are affects that are related to the mind because it understands them, and all such affects are joyful (IIIP59). Nor can desires arising from reason ever be excessive (IVP61). Thus, active joy and desire are always good. Spinoza equates the human striving to persevere in being through the guidance of reason with virtue, which he understands as power, following Machiavelli’s virtù, though for Spinoza this power consists in acting from reason and understanding. It can be seen that the conatus is intimately related to virtue, and it is indeed the foundation of virtue. Spinoza writes: “The striving to preserve oneself is the first and only foundation of virtue” (IVP22C). When we strive to persevere in being, we seek our own advantage, pursuing what is useful, and therefore good (IVP19), for increasing our power of acting and thinking. The more we pursue our own true advantage, the more virtue we have (IVP20).
Initially, this apparent egoism may seem like an odd foundation for virtue. However, virtue is the human power to persevere in being, and Spinoza qualifies: “A man cannot be said absolutely to act from virtue insofar as he is determined to do something because he has inadequate ideas, but only insofar as he is determined because he understands” (IVP23). So, virtue, properly speaking, is seeking one’s advantage according to knowledge and striving to persevere in being through the guidance of reason (IVP34). Furthermore, Spinoza argues that what we desire from reason is understanding (IVP26), and the only things that we know to be certainly good or evil are those things which lead us to understanding or prevent it (IVP27). Virtue, therefore, is a rational activity, or active affect, by which we strive to persevere in our being, increasing our power of acting and thinking, through the guidance of reason. Spinoza calls this virtue specifically fortitudo, or “strength of character”. He further divides strength of character into animositas, or “tenacity”, and generositas, or “nobility”. Tenacity is the desire to preserve one’s being through the dictates of reason alone. Nobility, likewise, is the desire to aid others and join them in friendship through the dictates of reason alone (IIIP59S). These two general virtues are both defined as a “desire to strive” to live according to the dictates of reason, that is, to live a rational life of understanding and of pursuing what is really to the advantage of the individual.
Though Spinoza does not give a systematic taxonomy of the two sets of virtues, certain specific virtues (and vices) can be found throughout the Ethics (for more, see Kisner 2011, 197-214). Neither does he give an exhaustive list of the “dictates of reason,” though many of these too can be gleaned from the text (see LeBuffe 2010, 177-179). For instance, he states, “He who lives according to the guidance of reason strives, as far as he can, to repay the other’s hate, anger, and disdain towards him with love and nobility” (IVP46). However, since there is nothing good or evil in nature in itself, the exemplar of the free man is used to consider, in any particular case, what is good and evil from the perspective of the life of freedom and blessedness or happiness. Similar to Aristotle’s phronimos, who is the model of phronesis for discerning virtue in practice, Spinoza’s “free man” can be interpreted as an exemplar to whom an individual can look in order to discern what is truly useful for persevering in being and what is detrimental to leading a good life defined by rational activity and freedom. In IVP67-IVP73, the so-called “free man propositions”, Spinoza provides an outline of some dictates of reason derived from the exemplar of the free man. Striving to emulate the free man, an individual should not fear death (IVP67), use virtue to avoid danger (IVP69), avoid the favors of the ignorant (IVP70), be grateful (IVP71), always be honest (IVP72), and live a life in community rather than in solitude (IVP73). Ultimately, the exemplar of the free man is meant to provide a model for living a free life, avoiding negative passions by striving to live according to the dictates of reason. However, Spinoza is well aware, as some commentators have pointed out, that the state of the free man, as one who acts entirely from the dictates of reason, may not be entirely attainable for human individuals.
In paragraph XXXII of the Appendix to Part IV, he writes “But human power is very limited and infinitely surpassed by the power of external causes. So we do not have the absolute power to adapt things outside us to our use. Nevertheless, we shall bear calmly those things which happen to us contrary to what the principles of our advantage demand, if we are conscious that we have done our duty, that the power we have could not have extended itself to the point where we could have avoided those things, and that we are a part of the whole of nature, whose order we follow.”
In the final part of the Ethics, Spinoza proposes certain remedies to the passions, which he understands as the tools available to reason to overcome them, “the means, or way, leading to freedom.” In general, Spinoza thinks that the more an individual’s mind is made up of adequate ideas, the more active and free the individual is, and the less they will be subject to passions. For this reason, the remedies against the passions focus on activity and understanding. Spinoza outlines five general remedies for the passions:
I. In the knowledge itself of the affects;
II. In the fact that it [the mind] separates the affects from the thought of an external cause, which we imagine confusedly;
III. In the time by which the affections related to things we understand surpass those related to things we conceive confusedly or in a mutilated way;
IV. In the multiplicity of causes by which affections related to common properties or to God are encouraged;
V. Finally, in the order by which the mind can order its affects and connect them to one another. (VP20S, 605)
The suggested techniques rely on Spinoza’s parallelism, stated in IIP7, that the order of ideas is the same as the order of things. For this reason, Spinoza argues that “in just the same way as thoughts and ideas of things are ordered and connected in the mind, so the affections of the body, or images of things, are ordered and connected in the body” (VP1). Therefore, all the techniques suggested by Spinoza involve ordering the ideas according to adequate knowledge, through reason and intuitive knowledge. In this way, the individual becomes more active, and therefore freer, in being a necessary part of nature.
Spinoza’s first and foundational remedy involves an individual fully understanding their affects so as to obtain self-knowledge. Passive affects, or passions, are, after all, based on inadequate knowledge. Spinoza’s suggestion here is to move from inadequate knowledge to adequate knowledge by attempting to fully understand a passion, that is, to understand its cause. This is possible because, just as the mind is the idea of the body and has ideas of the affections of the body, it can also think ideas of the ideas of the mind (IIP20). These ideas are connected to the mind in the same way as the mind is connected to the body (IIP21). Understanding a passion, then, is thinking the ideas of the ideas of the affections of the body. Attempting to understand a passion has two main effects. First, in the very act of thinking about the passion, the individual is already more active. Second, by fully understanding the affect, an individual can change it from a passion into an action because “an affect which is a passion ceases to be a passion as soon as we form a clear and distinct idea of it” (VP3).
Spinoza’s argument for the possibility of this relies on the fact that all ideas of the affections of the body involve some ideas that we can form adequately, that is, there are common properties of all things—the common notions or reason (VP4). So, by understanding affects, thinking the ideas of the ideas of the affections of the body, and particularly thinking of the causes of the affections of the body, we can form adequate ideas (which follow from our nature) and strive to transform passions into active affects. Spinoza does qualify that we can form only some adequate ideas of the affections of the body, underlining that such understanding of the passions is limited, but he also writes that “each of us has—in part, at least, if not absolutely—the power to understand himself and his affects, and consequently, the power to bring it about that he is less acted on by them” (VP4S, 598). Since “the appetite by which a man is said to act, and that by which he is said to be acted on are one and the same” (VP4S, 598), anything an individual does from a desire that is a passion can also be done from a rational affect.
Interconnected with the first remedy, Spinoza’s second remedy recommends separating the affect from the idea of its external cause. VP2 reads: “If we separate emotions, or affects, from the thought of an external cause and join them to other thoughts, then the love, or hate, towards the external cause is destroyed, as are the vacillations of the mind arising from these affects.” For Spinoza, love and hate are joy and sadness accompanied by the idea of an external cause. Here he indicates that by separating the affect from the thought of an external cause that we understand inadequately, and by understanding the affect, as mentioned above, through forming some adequate ideas about it, we destroy the love or hate toward the external cause. As mentioned earlier, anything can be the accidental cause of joy and sorrow (IIIP15), and therefore of love and hate. Furthermore, the strength of an affect is defined by the power of the external cause in relation to our own power (IVP5). Separating the passion from the external cause allows for understanding the affect in relation to the ideas of the mind alone. It might be difficult to grasp in the abstract what Spinoza means by separating the affect from the external cause, but consider the example of the jealous lover. Spinoza defines jealousy as “a vacillation of the mind born of love and hatred together, accompanied by the idea of another who is envied” (IIIP35S). The external causes accompanying the joy and sadness are the beloved and the (imagined) new lover who is envied. By separating the affect from the idea of the external cause, Spinoza is suggesting that a jealous lover could come to terms with the jealousy and form some clear and distinct ideas about it, that is, form some adequate ideas that reduce the power of the passion.
Spinoza’s third remedy involves the fact that “affects aroused by reason are, if we take account of time, more powerful than those related to singular things we regard as absent” (VP7). Simply put, “time heals all wounds,” but Spinoza gives an account of why this is so. Whereas passions are inadequate ideas that diminish with the absence of the external cause (we have other ideas that exclude the imagining of the external object), an affect related to reason involves the common properties of things, “which we always regard as present” (VP7D). Therefore, over time, rational affects are more powerful than passions. The mechanism of this remedy is readily seen in a variety of passions, from heartbreak to addiction.
Spinoza’s fourth and fifth remedies are more concerned with preventing the mind from being adversely affected by passions than with overcoming a specific passion that already exists. The fourth remedy involves relating an affect to a multitude of causes, because “if an affect is related to more and different causes, which the mind considers together with the affect itself, it is less harmful, we are less acted on by it, and we are affected less toward each cause than is the case with another equally great affect, which is related only to one cause or to fewer causes” (VP9). This is the case because, when considering that affect, the mind is engaged in thinking a multitude of different ideas; that is, its power of thinking is increased, and it is more free. Again, this remedy is, in large part, related to the first, foundational one. In understanding our affects, we form some adequate ideas and understand the cause of the affect, in part, from these ideas. Insofar as these adequate ideas are common notions concerning the common properties of things, we relate the affects to many things that can engage the mind. Spinoza ultimately claims that “the mind can bring it about that all the body’s affections, or images of things, are related to the idea of God” (VP14), for the mind has an adequate idea of the essence of God (IIP47). Because these affections are related to adequate ideas and follow from our own nature, they are affects of joy accompanied by the idea of God. In other words, all affections of the body can encourage an intellectual love of God. For Spinoza, “he who understands himself and his affects clearly and distinctly loves God, and does so the more, the more he understands himself and his affects” (VP15). This is a large part of how Spinoza conceives of the joyful life of reason and understanding that he calls blessedness.
Finally, the fifth remedy involves the fact that, as Spinoza argues, “so long as we are not torn by affects contrary to our nature, we have the power of ordering and connecting the affections of the body according to the order of the intellect” (VP10). What this amounts to is that the more adequate ideas the mind has, the less it will be affected by negative passions, and the more it will order its ideas according to reason instead of the common order of nature. Spinoza’s suggestion is to “conceive of right principles of living, or sure maxims of life,” which we can constantly look to when confronted by the common occurrences and emotional disturbances of life. For instance, Spinoza gives the example of how to avoid being suddenly overwhelmed by hatred: one prepares oneself by meditating “frequently on the common wrongs of men, and how they may be warded off best by nobility” (VP10S). This provides the practical mechanism by which we can use the virtues of tenacity and nobility to live a free life (see Steinberg 2014). All the remedies Spinoza mentions allow an individual to be rationally responsive to their environment rather than simply led by their emotions, and insofar as they are led by reason and adequate knowledge, they are free.
5. Spinoza on Moral Responsibility
The discussion of free will and freedom is often concerned with moral responsibility because free will is generally considered a necessary condition for moral responsibility. Moral responsibility is taken to be the condition under which an individual can be praised and blamed, rewarded and punished for their actions. Spinoza’s view of responsibility is complex and little commented upon. He indeed avers that praise and blame are only a result of the illusion of free will: “Because they think themselves free, those notions have arisen: praise and blame, sin and merit” (I Appendix, 444). Though Spinoza does not speak directly of moral responsibility, he does not completely disavow the idea of responsibility because of his denial of free will. In his correspondence with Oldenburg, he makes clear that he does think individuals are responsible for their actions despite lacking free will, though his sense of responsibility is unconventional. Oldenburg asks Spinoza to explain some passages in the Theological-Political Treatise that seem, by equating God with Nature, to imply the elimination of divine providence, free will, and thereby moral responsibility. Spinoza indeed denies the traditional view of divine providence as one of free choice by God. For Spinoza, absolute freedom is acting from the necessity of one’s nature (ID7), and God is free precisely in that everything follows from the necessity of the divine nature; God does not arbitrarily choose to create the cosmos, as is traditionally argued.
In Letter 74, Oldenburg writes “I shall say what most distresses them. You seem to build on a fatal necessity of all things and actions. But, once that has been asserted and granted, they say the sinews of all laws, of all virtue and religion, are cut, and all rewards and punishments are useless. They think that whatever compels or implies necessity excuses. Therefore, they think no one will be inexcusable in the sight of God” (469). Oldenburg points out the classic argument against determinism, namely that it makes reward and punishment futile: if human beings have no free will, then they seem to have no control over their lives, and if they have no control over their lives, then there is no justification for punishment or reward. All actions become excusable if they are outside the control of individuals. However, in his response to Oldenburg, Spinoza maintains the significance of reward and punishment even within a deterministic framework. He states,
This inevitable necessity of things does not destroy either divine or human laws. For whether or not the moral teachings themselves receive the form of law or legislation from God himself, they are still divine and salutary. The good which follows from virtue and the love of God will be just as desirable whether we receive it from God as a judge or as something emanating from the necessity of the divine nature. Nor will the bad things which follow from evil actions and affects be any less to be feared because they follow from them necessarily. Finally, whether we do what we do necessarily or contingently, we are still led by hope and fear. (Letter 75, 471)
Spinoza has two points here. The first is that all reward and punishment are natural consequences of actions. Even if everything is determined, actions have good and evil consequences, and these are the natural results of actions. Determinism does not eliminate reward and punishment because there are determined consequences that are part of the natural order. The second point is that these consequences can regulate human behavior because human beings are led by the hope for some good and the fear of some evil. Determinism does not destroy the law but rather gives it a framework for being effective. Spinoza here seems to be advocating something like a consequentialist theory of responsibility. What matters is that reward and punishment can act as a deterrent to bad behavior or a motivation for desired behavior. Traditional views on responsibility are tied to free will, but in this passage, Spinoza is indicating that reward and punishment are still justified from a social and political standpoint, by the power or right of nature (see Kluz 2015).
To understand Spinoza’s points better, we have to examine his view of law. Spinoza thinks that law is either dependent on natural necessity, that is, laws of nature, or human will. However, because human beings are a part of nature, human law will also be a part of natural law. Moreover, he also thinks that the term “law” is more commonly applied to human experience. He writes, “Commonly nothing is understood by law but a command which men can either carry out or neglect—since law confines human power under certain limits, beyond which that power extends, and does not command anything beyond human powers.” For this reason, Spinoza qualifies, “Law seems to need to be defined more particularly: that it is a principle of living man prescribes to himself or to others for some end” (TTP IV.5). Spinoza further divides law into human and divine law. By “human law,” Spinoza specifically means “a principle of living which serves only to protect life and the republic” (TTP IV.9), or what we might call “political” or “civil” law. By “divine law,” he specifically means “that which aims only at the supreme good, that is, the true knowledge and love of God” (TTP IV.9), or what we might call “religious” and “moral” law. The different ends of the law are what distinguish human law from divine law. The first concerns providing security and stability in social life; the second concerns providing happiness and blessedness, which are defined by virtue and freedom. For this reason, “divine law” in Spinoza’s sense concerns what leads to the supreme good for human beings, that is, the rule of conduct that allows humans to achieve freedom, virtue, and happiness. This law Spinoza propounds as moral precepts in the Ethics mentioned above.
These laws follow from human nature, that is, they describe what is, in fact, good for human individuals in their striving to persevere in their being, based upon rational knowledge of human beings and nature in general, with the free man as the exemplar toward which they strive.
However, it is not the case that all individuals can access and follow the “divine law” through reason alone, and, therefore, traditionally, divine law took the form of divine commandments ensconced within a system of reward and punishment (while still including, more or less, what Spinoza indicates by “divine law”). For Spinoza, what is true in Holy Scripture and “divine law” can also be gained by adequate knowledge because “divine law” is a rule of conduct men lay down for themselves that “aims only at the supreme good, that is, the true knowledge and love of God” (TTP IV.9). That is to say, “divine law” follows from human nature, which is a part of Nature, but while the free man follows these moral precepts because he rationally knows what is, in fact, advantageous for him, other individuals follow moral precepts because they are led by their passions, namely the hope for some good or the fear of some evil, that is, reward and punishment. Though reward and punishment are, ultimately, the same for the free man and other individuals, the free man is led by reason while other individuals are led by imagination, or inadequate ideas or passions. Likewise, human law, that is, political law, uses a system of reward and punishment to regulate human behavior through hope and fear. Human law provides security and stability for the state in which human individuals co-exist and punishes those who transgress the laws. Moreover, just as in the case of “divine law,” the free man follows human law because he rationally knows his advantage, while other individuals are more led by their passions. Returning to Spinoza’s response, determinism does not do away with law, moral or political, because the utility of the law, that is, the great advantages that following the law provides for the individual and the community and the disadvantages that result from transgressing the law, are retained whether or not human beings have free will.
Ultimately, for Spinoza, moral precepts and the law are ensconced in a system of reward and punishment that is necessary for regulating human behavior even without free will.
6. References and Further Reading
All translations are from The Collected Works of Spinoza, Vol. I and II, ed. and trans. Edwin Curley.
a. Primary Sources
Descartes, Rene. The Philosophical Writings of Descartes, Vol. I and II, trans. John Cottingham et al. (Cambridge: Cambridge University Press, 1985).
Hobbes, Thomas. The Leviathan with Selected Variants from the Latin Edition of 1668, ed. Edwin Curley. (Indianapolis: Hackett Publishing Company, 1994).
Long, A. A., and D. N. Sedley, trans., The Hellenistic Philosophers, Vol. 1: Translations of the Principal Sources, with Philosophical Commentary. (Cambridge: Cambridge University Press, 1987).
Spinoza, Baruch. The Collected Works of Spinoza, Vol. I and II, ed. and trans. Edwin Curley. (Princeton: Princeton University Press, 1985).
b. Secondary Sources
Bennett, Jonathan. A Study of Spinoza’s Ethics. (Indianapolis: Hackett, 1984).
Bennett, Jonathan. “Spinoza’s Monism: A Reply to Curley”, in God and Nature: Spinoza’s Metaphysics, ed. Yirmiyahu Yovel. (Leiden: E.J. Brill, 1991), 53-59.
Curley, Edwin. Spinoza’s Metaphysics: An Essay in Interpretation. (Cambridge: Harvard University Press, 1969).
Curley, Edwin. Behind the Geometrical Method. (Princeton: Princeton University Press, 1985).
Curley, Edwin. “On Bennett’s Interpretation of Spinoza’s Monism”, in God and Nature: Spinoza’s Metaphysics, ed. Yirmiyahu Yovel. (Leiden: E.J. Brill, 1991), 35-52.
De Dijn, Herman. Spinoza: The Way to Wisdom. (West Lafayette, IN: Purdue University Press, 1996).
Della Rocca, Michael. Representation and the Mind-Body Problem in Spinoza. (Oxford: Oxford University Press, 1996).
Gatens, Moira. “Spinoza, Law and Responsibility”, in Spinoza: Critical Assessments of Leading Philosophers Vol.III, ed. by Genevieve Lloyd. (London: Routledge, 2001), 225-242.
Garrett, Don. “Spinoza’s Necessitarianism”, in God and Nature: Spinoza’s Metaphysics, ed. Yirmiyahu Yovel. (Leiden: E.J. Brill, 1991), 197-218.
Goldenbaum, Ursula. “The Affects as a Condition of Human Freedom in Spinoza’s Ethics”, in Spinoza on Reason and the “Free Man”, edited by Yirmiyahu Yovel. (New York: Little Room Press, 2004), 149-65.
Goldenbaum, Ursula, and Christopher Kluz, eds. Doing without Free Will: Spinoza and Contemporary Moral Problems. (New York: Lexington, 2015).
Hübner, Karolina. “Spinoza on Being Human and Human Perfection”, in Essays on Spinoza’s Ethical Theory, eds. Mathew Kisner and Andrew Youpa. (Oxford: Oxford University Press, 2014), 124-142.
Homan, Matthew. “Rehumanizing Spinoza’s Free Man”, in Doing without Free Will: Spinoza and Contemporary Moral Problems, eds. Ursula Goldenbaum and Christopher Kluz (New York: Lexington, 2015), 75-96.
James, Susan. “Freedom, Slavery, and the Passions”, in The Cambridge Companion to Spinoza’s Ethics, ed. by Olli Koistinen. (Cambridge: Cambridge University Press, 2009), 223-41.
Kisner, Mathew. Spinoza on Human Freedom: Reason, Autonomy and the Good Life. (Cambridge: Cambridge University Press, 2011).
Kisner, Mathew, and Andrew Youpa eds. Essays on Spinoza’s Ethical Theory. (Oxford: Oxford University Press, 2014).
Kisner, Matthew. “Spinoza’s Activities: Freedom without Independence”, in Freedom, Action, and Motivation in Spinoza’s “Ethics”, ed. Noa Naaman-Zauderer. (London: Routledge, 2021), 37-61.
Kluz, Christopher. “Moral Responsibility without Free Will: Spinoza’s Social Approach”, in Doing without Free Will: Spinoza and Contemporary Moral Problems, eds. Ursula Goldenbaum and Christopher Kluz (New York: Lexington, 2015), 1-26.
LeBuffe, Michael. From Bondage to Freedom: Spinoza on Human Excellence. (Oxford: Oxford University Press, 2010).
Marshall, Colin. “Moral Realism in Spinoza’s Ethics”, in Cambridge Critical Guide to Spinoza’s Ethics, ed. Yitzhak Melamed. (Cambridge: Cambridge University Press, 2017), 248-265.
Marshall, Eugene. The Spiritual Automaton: Spinoza’s Science of the Mind. (Oxford: Oxford University Press, 2014).
Melamed, Yitzhak. “The Causes of our Belief in Free Will: Spinoza on Necessary, ‘Innate,’ yet False Cognition”, in Cambridge Critical Guide to Spinoza’s Ethics, ed. Yitzhak Melamed. (Cambridge: Cambridge University Press, 2017).
Naaman-Zauderer, Noa, ed. Freedom, Action, and Motivation in Spinoza’s “Ethics”. (London: Routledge, 2021).
Nadler, Steven. “Whatever is, is in God: substance and things in Spinoza’s metaphysics”, in Interpreting Spinoza: Critical Essays, ed. Charles Huenemann. (Cambridge: Cambridge University Press, 2008), 53-70.
Nadler, Steven. “On Spinoza’s Free Man”, Journal of the American Philosophical Association, Volume 1, Issue 1, Spring 2015, 103-120.
Rutherford, Donald. “Deciding What to Do: The Relation of Affect and Reason in Spinoza’s Ethics”, in Freedom, Action, and Motivation in Spinoza’s “Ethics”, ed. Noa Naaman-Zauderer. (London: Routledge, 2021), 133-151.
Soyarslan, Sanem. “From Ordinary Life to Blessedness: The Power of Intuitive Knowledge in Spinoza’s Ethics”, in Essays on Spinoza’s Ethical Theory eds. Mathew Kisner and Andrew Youpa. (Oxford: Oxford University Press, 2014), 236-257.
Steinberg, Justin. “Following a Recta Ratio Vivendi: The Practical Utility of Spinoza’s Dictates of Reason”, in Essays on Spinoza’s Ethical Theory, eds. Mathew Kisner and Andrew Youpa. (Oxford: Oxford University Press, 2014), 178-196.
Youpa, Andrew. “Spinoza’s Theory of the Good”, in The Cambridge Companion to Spinoza’s Ethics, ed. Olli Koistinen. (Cambridge: Cambridge University Press, 2010), 242-257.
Youpa, Andrew. The Ethics of Joy: Spinoza on the Empowered Life. (Oxford: Oxford University Press, 2019).
Yovel, Yirmiyahu, ed. Spinoza on Reason and the “Free Man”. (New York: Little Room Press, 2004).
Gottfried Wilhelm Leibniz (1646-1716) served as the natural end of the rationalist tradition on the European continent, which included Descartes, Spinoza, and Malebranche. His philosophy was one of the major influences on Kant. Although Leibniz had many philosophical and intellectual interests, he was arguably most concerned with reconciling the freedom required for moral responsibility and the determinism that seemed to be entailed by the new sciences being developed at the time. In fact, in several important writings, including the Theodicy, Leibniz refers to “the free and the necessary and their production as it relates to the origin of evil” as one of the “famous labyrinths where our reason very often goes astray.”
To address this labyrinth, Leibniz developed one of the most sophisticated accounts of compatibilism in the early modern period. Compatibilism is the view that freedom and determinism are compatible and not mutually exclusive. Free actions are fully determined, and yet not necessary—they could have been otherwise, were God to have created another possible world instead. According to Leibniz, free actions, whether divine or human, are those that are intelligent, spontaneous, and contingent. He developed a framework of possible worlds that is most helpful in understanding the third and most complex criterion, contingency.
Leibniz’s theory of possible worlds went on to influence some of the standard ways in which modal metaphysics is analyzed in contemporary Anglo-American analytic philosophy. The theory of possible worlds that he developed and utilized in his philosophy was extremely nuanced and had implications for many different areas of his thought, including, but not limited to, his metaphysics, epistemology, jurisprudence, and philosophy of religion. Although Leibniz’s Metaphysics is treated in a separate article, this article is primarily concerned with Leibniz’s modal metaphysics, that is, with his understanding of the modal notions of necessity, contingency, and possibility, and their relation to human and divine freedom. For more specific details on Leibniz’s logic and possible worlds semantics, especially as it relates to the New Essays Concerning Human Understanding and to the Theodicy, please refer to “Leibniz’s Logic.”
Necessitarianism is the view according to which everything that is possible is actual, or, to put this in the language of possible worlds, there is only one possible world and it is the actual world. Not only is everything determined, but it is also metaphysically impossible that anything could be otherwise. In the seventeenth century, Baruch Spinoza was the paradigmatic necessitarian. According to Spinoza, insofar as everything follows from the nature of God with conceptual necessity, things could not possibly be other than they are. For Spinoza, necessitarianism had ethical implications—given that it is only possible for the universe to unfold in one way, we ought to learn to accept the way that the world is so that we can live happily. Happiness, Spinoza thought, is partly and importantly understood to be the rational acceptance of the fully determined nature of existence.
Spinoza’s necessitarianism follows directly from his conception of God and his commitment to the principle of sufficient reason, the thesis that there is a cause or reason why everything is the way it is rather than otherwise. In rejecting the anthropomorphic conception of God, he held instead that God is identical with Nature and that all things are, in some sense, in God. While Leibniz rejected the pantheistic/panentheistic understanding of God that Spinoza held, Leibniz’s view of God nevertheless compelled him to necessitarianism, at least in his early years. This article later reconsiders whether Leibniz’s mature views also commit him to necessitarianism. Consider the following letter that he wrote to Magnus Wedderkopf in 1671. Leibniz writes:
Since God is the most perfect mind, however, it is impossible for him not to be affected by the most perfect harmony, and thus to be necessitated to the best by the very ideality of things…Hence it follows that whatever has happened, is happening, or will happen is best and therefore necessary, but…with a necessity that takes nothing away from freedom because it takes nothing away from the will and the use of reason (A. II. I, 117; L 146).
In this early correspondence, Leibniz reasons that since God’s nature is essentially good, he must, by necessity, only do that which is best. It is impossible for God to do less than the best. After his meeting with Spinoza in 1676, Leibniz’s views related to modality began to shift and became much more nuanced. He went on to develop several strategies for addressing contingency to reject this early necessitarian position. In his mature metaphysics, Leibniz maintained that God acts for the best, but rejected the claim that God acts for the best by necessity. How, then, did he attempt to reconcile these positions?
2. Strategies for Contingency
a. Compossibility
Leibniz’s first and arguably most important strategy for maintaining contingency is to argue that worlds are not possible with respect to God’s will; rather, worlds are intrinsically possible or impossible. If they were possible only with respect to God’s will, the argument from the letter to Wedderkopf would still be applicable—since God is committed to the best by his own essential nature, there is only one possible world, the actual world which is best. Instead, Leibniz maintains that worlds by their very nature are either possible or impossible. He writes in a piece dated from 1680 to 1682 called On Freedom and Possibility:
Rather, we must say that God wills the best through his nature. “Therefore,” you will say “he wills by necessity.” I will say, with St. Augustine, that such necessity is blessed. “But surely it follows from this that things exist by necessity.” How so? Since the nonexistence of what God wills to exist implies a contradiction? I deny that this proposition is absolutely true, for otherwise that which God does not will would not be possible. For things remain possible, even if God does not choose them. Indeed, even if God does not will something to exist, it is possible for it to exist, since, by its nature, it could exist if God were to will it to exist. “But God cannot will it to exist.” I concede this, yet, such a thing remains possible in its nature, even if it is not possible with respect to the divine will, since we have defined as in its nature possible anything that, in itself, implies no contradiction, even though its coexistence with God can in some way be said to imply a contradiction (Grua 289; AG 20-21).
According to Leibniz, worlds are possible just in case they are compossible. An individual thing is possible when its properties are logically consistent. For example, winged horses are possible because there is nothing self-contradictory about a horse with wings. But a winged wingless horse would be internally incoherent. By contrast, compossibility is a feature of sets of things, like worlds, rather than individual things. So, when Leibniz insists that worlds are possible by their own nature, he means that the things in that world do not conflict with one another. For example, there is nothing self-contradictory about an unstoppable force or an immovable object. But those objects could not exist in the same world together because their natures would be inconsistent with one another—they rule each other out. So, while there is a possible world with an unstoppable force and a possible world with an immovable object, there is no possible world with both an unstoppable force and an immovable object.
Although Leibniz often analyzes compossibility as a logical relation holding between the created essences of any given world, he sometimes treats it as a relation between the created essences and the laws of nature which God has decreed in each world. He writes in his correspondence to Arnauld:
I think there is an infinity of possible ways in which to create the world, according to the different designs which God could form, and that each possible world depends on certain principal designs or purposes of God which are distinctive of it, that is, certain primary free decrees (conceived sub ratione possibilitatis) or certain laws of the general order of this possible universe with which they are in accord and whose concept they determine, as they do also the concepts of all the individual substances which must enter into this same universe (G. II, 51; L 333).
Passages like this suggest that even logically inconsistent sets of objects like the unstoppable force and the immovable object could exist in a world together, so long as there is one set of laws governing them.
Although there are several different ways to analyze Leibniz’s notion of compossibility, there is good reason to think that he believed that preserving the intrinsic nature of the possibility of worlds was crucial to salvaging contingency. At one point he even suggests that contingency would be destroyed without such an account. He writes to Arnauld:
I agree there is no other reality in pure possibles than the reality they have in the divine understanding…For when speaking of possibilities, I am satisfied that we can form true propositions about them. For example, even if there were no perfect square in the world, we would still see that it does not imply a contradiction. And if we wished absolutely to reject pure possibles, contingency would be destroyed; for, if nothing were possible except what God actually created, then what God created would be necessary, in the case he resolved to create anything (G. II, 45; AG 75).
Importantly, the possibility of worlds is outside the scope of God’s will. God does not determine what is possible, any more than he determines mathematical, logical, or moral truths.
b. Infinite Analysis
Another strategy for understanding necessity and contingency is through Leibniz’s theory of infinite analysis. According to Leibniz, necessity and contingency are not defined in terms of possible worlds in the way that is common in contemporary metaphysics. According to the standard understanding in contemporary metaphysics, a proposition is possible just in case it is true in some possible world, and a proposition is necessary just in case it is true in every possible world. But for Leibniz, a proposition is necessary if and only if it can be reduced to an identity statement in a finite number of steps. Propositions are contingent just in case it would take an infinite number of steps to reduce the statement to an identity statement. He writes in a piece from 1686 called On Contingency:
Necessary truths are those that can be demonstrated through an analysis of terms, so that in the end they become identities, just as in algebra an equation expressing an identity ultimately results from the substitution of values. That is, necessary truths depend upon the principle of contradiction. Contingent truths cannot be reduced to the principle of contradiction; otherwise everything would be necessary and nothing would be possible other than that which actually attains existence (Grua 303; AG 28).
To see how the theory of infinite analysis works, recall that Leibniz holds that every truth is an analytic truth. Every true proposition is one where the concept of the predicate is contained in the concept of the subject. One way to understand this reduction is to ask, “Why is this proposition true?” Since every truth is an analytic truth, every truth is like, “A bachelor is an unmarried male.” So why is it true that a bachelor is an unmarried male? It is true because it is bound up in the essence of the concept of unmarried male that he is identical with a bachelor. A bachelor just is an unmarried male.
How would the theory of infinite analysis work for explaining contingency though? Consider the following propositions:
1+1=2
Judas is the betrayer of Christ.
The first proposition is a simple mathematical truth that almost everyone in the 17th and 18th centuries would consider to be a necessary truth. For Leibniz, it is a necessary truth because it can be reduced to an identity statement in a finite number of steps; that is, we could move from 1+1=2 to 1+1=1+1 in a straightforward manner. We could make a similar move for other mathematical and logical truths that are even more straightforward. The law of identity, that “A is identical to A,” for example, is another example that would take a finite number of steps to reduce to an identity.
The second proposition is an example of a contingent truth because the reduction would take an infinite number of steps to reach an identity statement. To understand how this analysis occurs, consider why it is true that Judas is the betrayer of Christ. This analysis would require reasons for Judas’s nature and his existence. Judas exists because God understood in his infinite wisdom that the best possible world would be one where Judas betrays Christ and Christ suffers. And why is Judas part of the best possible world? The only way to answer that question would be for God to compare the actual world with the infinite plurality of other possible worlds—an analysis that would take an infinite number of steps, even for God. Put simply, the sufficient reason for Judas’s contingent existence is that it is deemed to be best by God.
Importantly, Leibniz holds that not even God could complete the infinite analysis discursively; instead, God completes the analysis intuitively, in one feat of the mind. He writes in On Contingency:
For in necessary propositions, when the analysis is continued indefinitely, it arrives at an equation that is an identity; that is what it is to demonstrate a truth with geometrical rigor. But in contingent propositions one continues the analysis to infinity through reasons for reasons, so that one never has a complete demonstration, though there is always, underneath, a reason for the truth, but the reason is understood completely only by God, who alone traverses the infinite series in one stroke of the mind (Grua 303; AG 28).
c. God’s Choice and Metaphysical and Moral Necessity
Another strategy for salvaging contingency is not at the level of worlds, nor in God’s will, but at the level of God’s wisdom; that is, in the choice to actualize certain substances instead of others. Leibniz holds that we must take the reality of God’s choice seriously. As he writes in the Theodicy, “The nature of things, if taken as without intelligence and without choice, has in it nothing sufficiently determinant” (G. VI, 322; H 350).
Even if the plurality of worlds remain possible in themselves as the first strategy holds, or propositions are contingent because of the infinite analysis theory as the second strategy holds, God’s choice still plays an important role in the causal and explanatory chain of events leading to the actualization of a world. In this way, Leibniz’s modal metaphysics stands again in stark contrast to Spinoza’s. For Spinoza, the world just is God, and in some sense, all things are in God. For Leibniz, by contrast, the creation and actualization of a world is a product of God’s will, and his will is fully determined by his perfect intellect. In some texts, Leibniz locates the source of contingency purely in God’s choice of the best, which cannot be demonstrated. And since the choice of the best cannot be demonstrated, God’s choice is contingent. He writes in On Contingency:
Assuming that the proposition “the proposition that has the greater reason for existing [that is, being true] exists [that is, is true]” is necessary, we must see whether it then follows that the proposition that has the greater reason for existing [that is, being true] is necessary. But it is justifiable to deny the consequence. For, if by definition a necessary proposition is one whose truth can be demonstrated with geometrical rigor, then indeed it could be the case that this proposition is demonstrable: “every truth and only a truth has greater reason,” or this: “God always acts with the highest wisdom.” But from this one cannot demonstrate the proposition “contingent proposition A has greater reason [for being true]” or “contingent proposition A is in conformity with divine wisdom.” And therefore it does not follow that contingent proposition A is necessary. So, although one can concede that it is necessary for God to choose the best, or that the best is necessary, it does not follow that what is chosen is necessary, since there is no demonstration that it is the best (Grua 305; AG 30).
Related to God’s choice is the distinction between moral and metaphysical necessity. Moral necessity is used by Leibniz in several different writings, beginning with his earliest jurisprudential writings up to and including his Theodicy. In the 17th century, moral necessity was very often understood in terms of the legal use of “obligation,” a term which Leibniz also applied to God. He writes in the Nova Methodus from 1667:
Morality, that is, the justice or injustice of an act, derives however from the quality of the acting person in relation to the action springing from previous actions, which is described as moral quality. But just as the real quality is twofold in relation to action: the power of acting (potential agendi), and the necessity of acting (necessitas agendi); so also the moral power is called right (jus), the moral necessity is called obligation (obligatio) (A. VI. i. 301).
Leibniz echoes this sentiment into the 1690s in other jurisprudential writings. In the Codex Juris from 1693, Leibniz insists that “Right is a kind of moral power, and obligation is a moral necessity” (G. III. 386; L 421). In short, Leibniz remained remarkably consistent throughout his career in holding that “right” and “obligation” are moral qualities that provide the capacity to do what is just.
Importantly, right and obligation are not just related notions—they have force on each other. As Leibniz writes in the Nova Methodus, “The causes of right in one person are a kind of loss of right in another and it concerns the process of acquiring an obligation. Conversely, the ways of losing an obligation are causes of recovering a right, and can be defined as liberation” (A. VI. vi, 305-306). The point that a right imposes an obligation cannot be overstated. It is precisely for this reason that we can undergo the theodicean project in the first place. We have proper standing to ask for an explanation for God’s permission of suffering because we have a right to the explanation. And we have a right to the explanation because God is morally necessitated or obligated to create. For a point of comparison, contrast this with God’s response to Job when he demands an explanation for his own suffering. God responds, “Who has a claim against me that I must pay? Everything under heaven belongs to me” (Job 41:11). God does not provide an explanation for Job’s suffering because Job does not have proper standing to request such an explanation.
Leibniz contrasts moral necessity with metaphysical necessity. In the Theodicy, he describes “metaphysical necessity, which leaves no place for any choice, presenting only one possible object, and moral necessity, which obliges the wisest to choose the best” (G. VI, 333; H 367). This distinction becomes important for Leibniz because it allows him to say that God’s choice to create the best of all possible worlds is morally necessary, but not metaphysically necessary. God is morally bound to create the best world due to his divine nature, but since there are other worlds which are possible in themselves, his choice is not metaphysically necessary. Leibniz writes again in the Theodicy, “God chose between different courses all possible: thus, metaphysically speaking, he could have chosen or done what was not the best; but he could not morally speaking have done so” (G. VI, 256; H 271).
Some commentators insist that the dichotomy between metaphysical and moral necessity is illusory. Either it is necessary that God must create the best of all possible worlds, or it is not necessary that God must create the best of all possible worlds. Nevertheless, Leibniz took moral necessity to do both logical and theological work. Only with moral necessity could he preserve both the goodness and wisdom of God. If moral necessity is vacuous, then Leibniz would seem to be committed to necessitarianism.
d. Absolute and Hypothetical Necessity
One final strategy for understanding contingency is to make use of a well-known distinction between absolute and hypothetical necessity. This strategy was most fully utilized in Leibniz’s correspondence with Arnauld in the mid-1680s. Arnauld was deeply concerned about the implications of the theory of complete individual concepts for freedom. Since Leibniz held that every individual contains within itself complete truths about the universe, past, present, and future, it seems that there can be no room for freedom. If it is included in Judas’s concept from the moment the universe was created that he would ultimately betray Christ, then it seems as if it was necessary that he do so; Judas could not have done otherwise. Leibniz’s response draws on the distinction between absolute and hypothetical necessity. Consider the following propositions:
Necessarily, Caesar crosses the Rubicon.
Necessarily, if Caesar exists, then he crosses the Rubicon.
Leibniz would deny the first proposition but readily accept the second. He denies the first because it is not a necessary truth that Caesar crosses the Rubicon; the proposition is not comparable to the necessary truths of mathematics and logic, which reduce to identity statements and whose contraries are self-contradictory. The second proposition he accepts: although it is bound up in Caesar’s essence that he crosses the Rubicon, it does not follow that he does so with absolute necessity. It is only necessary that Caesar crosses the Rubicon on the hypothesis that Caesar exists. And, of course, Caesar might not have existed at all; God might have actualized a world without Caesar, since such worlds are possible in themselves. This is what Leibniz means when he claims that contingent truths are certain, but not necessary. To use a simple analogy, once God pushes over the first domino, it is certain that the chain of dominoes will fall, but God might have pushed over a completely different set of dominoes instead. Once a series is actualized, the laws of the series govern it with certainty. And yet the series is not metaphysically necessary, since there are other series that God could have actualized instead were it not for his divine benevolence. Leibniz writes in the Discourse on Metaphysics from 1686:
And it is true that we are maintaining that everything that must happen to a person is already contained virtually in his nature or notion, just as the properties of a circle are contained in its definition; thus the difficulty still remains. To address it firmly, I assert that connection or following is of two kinds. The one whose contrary implies a contradiction is absolutely necessary; this deduction occurs in eternal truths, for example, the truths of geometry. The other is necessary only ex hypothesi and, so to speak, accidentally, but it is contingent in itself, since its contrary does not imply a contradiction. And this connection is based not purely on ideas of God’s simple understanding, but on his free decrees and on the sequence of the universe (A. VI. iv, 1546-1547; AG 45).
Absolute necessity, then, applies to necessary truths that are outside the scope of God’s free decrees, and hypothetical necessity applies to contingent truths that are within the scope of God’s free decrees.
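The distinction is often glossed as a modal scope distinction. The notation below is a modern reconstruction, not anything Leibniz himself used, with E(c) for “Caesar exists” and R(c) for “Caesar crosses the Rubicon”:

\[
\underbrace{\Box\big(E(c)\rightarrow R(c)\big)}_{\text{hypothetical necessity: accepted}}
\qquad\not\Rightarrow\qquad
\underbrace{\Box\, R(c)}_{\text{absolute necessity: denied}}
\]

On this reading, the necessity attaches to the whole conditional rather than to the consequent on its own; only the latter, stronger claim would threaten contingency.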
3. Complete Individual Concepts
According to Leibniz, one of the basic features of a substance is that every substance has a “complete individual concept” (CIC, hereafter). The CIC is an exhaustive account of every single property of each substance. He writes in the Discourse on Metaphysics, “the nature of an individual substance or of a complete being is to have a notion so complete that it is sufficient to contain and to allow us to deduce from it all the predicates of the subject to which this notion is attributed” (A. VI. iv, 1540; AG 41). From this logical conception of substance, Leibniz argues that the properties included in the CIC are those of the past, present, and future. The CIC informs what is sometimes referred to as Leibniz’s doctrine of marks and traces. He illustrates this thesis using the example of Alexander the Great in the Discourse, writing:
Thus, when we consider carefully the connection of things, we can say that from all time in Alexander’s soul there are vestiges of everything that has happened to him and marks of everything that will happen to him and even traces of everything that happens in the universe, even though God alone could recognize them all (A. VI. iv, 1541; AG 41).
According to Leibniz, then, in analyzing any single substance, God would be able to understand every other substance in the universe, since every substance is conceptually connected to every other substance. For example, in analyzing the concept of Jesus, God would also be able to understand the concept of Judas. Because it is part of Jesus’s CIC that he was betrayed by Judas, it is also part of Judas’s CIC that he will betray Jesus. Every truth about the universe could be deduced this way as well. If a pebble were to fall off a cliff on Neptune in the year 2050, that too would be included in Jesus’s CIC. To use one image of which Leibniz is quite fond, every drop in the ocean is connected to every other drop in the ocean, even though the ripples from those drops could only be understood by God. He writes in the Theodicy:
For it must be known that all things are connected in each one of the possible worlds: the universe, whatever it may be, is all of one piece, like an ocean: the least movement extends its effect there to any distance whatsoever, even though this effect become less perceptible in proportion to the distance. Therein God has ordered all things beforehand once for all, having foreseen prayers, good and bad actions, and all the rest; and each thing as an idea has contributed, before its existence, to the resolution that has been made upon the existence of all things; so that nothing can be changed in the universe (any more than in a number) save its essence or, if you will, save its numerical individuality. Thus, if the smallest evil that comes to pass in the world were missing in it, it would no longer be this world; which nothing omitted and all allowance made, was found the best by the Creator who chose it (G. VI. 107-108; H 128).
In addition to describing substances as possessing a CIC, Leibniz also refers to the essential features of a substance as perception and appetition. These features are explained in more detail in an article on Leibniz’s Philosophy of Mind. In short though, Leibniz held that every single representation of each substance is already contained within itself from the moment it is created, such that the change from one representation to another is brought about by its own conatus. The conatus, or internal striving, is what Leibniz refers to as the appetitions of a substance. Leibniz writes in the late Principles of Nature and Grace:
A monad, in itself, at a moment, can be distinguished from another only by its internal qualities and actions, which can be nothing but its perceptions (that is, the representation of the composite, or what is external, in the simple) and its appetitions (that is, its tendencies to go from one perception to another) which are the principles of change (G. VI. 598; AG 207).
Because every perception of the entire universe is contained within each substance, the entire history of the world is already fully determined. This is the case not just for the actual world after the act of creation, but it is true for every possible world. In fact, the fully determined nature of every possible world is what allows God in his infinite wisdom to actualize the best world. God can assess the value of every world precisely because the entire causal history, past, present, and future is already set.
4. The Containment Theory of Truth and Essentialism
The main article on Leibniz describes his epistemological account in more general terms, but Leibniz’s theory of truth has implications for freedom, so some brief comments are in order. According to Leibniz, propositions are true not because they correspond to the world, but in virtue of the relationship between the subject and the predicate. The “predicate in notion principle” (PIN, hereafter), as he describes it to Arnauld, is the view according to which “In every true affirmative proposition, whether necessary or contingent, universal or particular, the notion of the predicate is in some way included in that of the subject. Praedicatum inest subjecto; otherwise I do not know what truth is” (G. II, 56; L 337). For example, “Judas is the betrayer of Christ” is true not because there is a Judas who betrays Christ in the actual world, but because the predicate “betrayer of Christ” is contained in the concept of the subject, Judas. Judas’s essence, his thisness, or haecceity, to use the medieval terminology, is partly defined by his betrayal of Christ.
The PIN theory of truth poses significant problems for freedom, though. After all, if it is part of Judas’s essence that he is the betrayer of Christ, then it seems that Judas must betray Christ. And if Judas must betray Christ, then it seems that he cannot do otherwise. And if he cannot do otherwise, then Judas cannot be morally responsible for his actions: he cannot be blameworthy for the betrayal of Christ if the betrayal was part of his very essence. And yet, despite this difficulty, Leibniz maintained a compatibilist theory of freedom, on which Judas’s actions were certain, but not necessary.
Since Leibniz holds that every essence can be represented by God as having a complete concept and that every true proposition is analytic, he maintains that every property is essential to a substance’s being. Leibniz, therefore, straightforwardly adopts an essentialist position. Essentialism is the metaphysical view according to which some properties of a thing are essential to it, such that if it were to lose such a property, the thing would cease to exist. Leibniz’s essentialism has been a contested issue in the secondary literature during the first few decades of the twenty-first century. The next section of this article highlights three of the more dominant and interesting interpretations of the essentialism of Leibniz’s mature philosophy: superessentialism, moderate essentialism, and superintrinsicalness.
a. Superessentialism
The most straightforward way of interpreting Leibniz’s mature ontology is that he agrees with the thesis of superessentialism. According to superessentialism, every property is essential to an individual substance’s CIC such that if the substance were to lack any property at all, then the substance would not exist. Leibniz often explains his superessentialist position in the context of explaining God’s actions. For example, in one passage he writes, “You will object that it is possible for you to ask why God did not give you more strength than he has. I answer: if he had done that, you would not exist, for he would have produced not you but another creature” (Grua 327).
In his correspondence with Arnauld, Leibniz makes use of the notion of “possible Adams” to explain what looks very much like superessentialism. In describing another possible Adam, Leibniz stresses to Arnauld the importance of taking every property to be part of a substance, or else we would only have an indeterminate notion, not a complete and perfect representation of him. This fully determinate notion is the way in which God conceives of Adam when evaluating which set of individuals to create when a world is actualized. Leibniz describes this perfect representation to Arnauld, “For by the individual concept of Adam I mean, to be sure, a perfect representation of a particular Adam who has particular individual conditions and who is thereby distinguished from an infinite number of other possible persons who are very similar but yet different from him…” (G. II, 20; LA 15). The most natural way to interpret this passage is along the superessentialist reading such that if there were a property that were not essential to Adam, then we would have a “vague Adam.” Leibniz even says as much to Arnauld. He writes:
We must not conceive of a vague Adam, that is, a person to whom certain attributes of Adam belong, when we are concerned with determining whether all human events follow from positing his existence; rather we must attribute to him a notion so complete that everything that can be attributed to him can be deduced from it (G. II, 42; AG 73).
The notion of “vague Adams” is further described in a famous passage from the Theodicy. Leibniz describes the existence of other counterparts of Sextus in other possible worlds, that, though complete concepts in their own way, are nevertheless different from the CIC of Sextus in the actual world. Leibniz writes:
I will show you some, wherein shall be found, not absolutely the same Sextus as you have seen (that is not possible, he carries with him always that which he shall be) but several Sextuses resembling him, possessing all that you know already of the true Sextus, but not all that is already in him imperceptibly, nor in consequence all that shall yet happen to him. You will find in one world a very happy and noble Sextus, in another a Sextus content with a mediocre state, a Sextus, indeed, of every kind and endless diversity of forms (G. VI, 363; H 371).
These passages describing other possible Adams and other possible Sextuses suggest that Leibniz was committed to the very strong thesis of superessentialism. Because every property is essential to an individual’s being, every substance is world-bound; that is, each substance only exists in its own world. If any property of an individual were different, then the individual would cease to exist, but there are also an infinite number of other individuals that vary in different degrees, which occupy different worlds. For example, a Judas who was more loyal and did not ultimately betray Christ would not be the Judas of the actual world. Importantly, one small change would also ripple across and affect every other substance in the universe as well. After all, a loyal Judas who does not betray Christ would also mean that Christ was not betrayed, so it would affect his complete concept and essence as well. Put simply, on the superessentialist interpretation of Leibniz’s metaphysics, due to the complete interconnectedness of all things, if any single property of an individual in the world were different than it is, then every substance in the world would be different as well.
The most important worry that Arnauld had about Leibniz’s philosophy was the way in which essentialism threatens freedom. Arnauld thought that human freedom must entail the ability to do otherwise. In the language of possible worlds, this means that an individual is free only if there is another possible world in which that very individual does otherwise. Such a view, of course, requires the very same individual to exist in another possible world. According to Arnauld, Judas was free in his betrayal of Christ because there is another possible world where Judas does not betray Christ; freedom requires the actual ability to do otherwise. But Arnauld worried that, according to Leibniz’s superessentialism, since it really was not Judas in another possible world who did not betray Christ but instead a counterpart, an individual very similar to him in another possible world, we cannot really say that Judas’s action was truly free. Leibniz anticipates this sort of objection in the Discourse, writing, “But someone will say, why is it that this man will assuredly commit this sin? The reply is easy: otherwise it would not be this man” (A. VI. iv, 1576; AG 61). Leibniz, like most classical compatibilists, argues that the actual ability to do otherwise is not a necessary condition for freedom; all that is required is the hypothetical ability to do otherwise. A compatibilist like Leibniz would insist that Judas’s action is nevertheless free even though he cannot do otherwise: if Judas’s past or the laws of nature had been different, then he might not have betrayed Christ. Framing freedom in these hypothetical terms is what allows Leibniz to say that the world is certain, but not necessary.
Leibniz’s motivation for superessentialism is driven partly by theodicean concerns. The basic issue in the classical problem of evil is the apparent incompatibility between a perfectly loving, powerful, and wise God on the one hand and cases of suffering on the other. Why would God permit Jesus to suffer? Leibniz’s answer here, as it relates to superessentialism, is twofold. First, while Jesus’s suffering is indeed tragic, Leibniz contends that it is better for Jesus to exist and suffer than not to exist at all. Second, because of the complete interconnectedness of all things, without Jesus’s suffering the entire history of the world would be different. Jesus’s suffering is very much part of the calculus when God is discerning which world is the best. And importantly, God did not choose that Jesus suffer, but only chose a world in which Jesus suffers. He writes in the Primary Truths from 1689:
Properly speaking, he did not decide that Peter sin or that Judas be damned, but only that Peter who would sin with certainty, though not with necessity, but freely, and Judas who would suffer damnation would attain existence rather than other possible things; that is, he decreed that the possible notion become actual (A. VI. iv, 1646; AG 32).
b. Moderate Essentialism
Despite the evidence for interpreting Leibniz as a superessentialist, there is also textual support for the view that superessentialism is simply too strong a thesis. One reason to adopt a weaker version of essentialism is to remain logically consistent with transworld identity, the thesis that individuals can exist across possible worlds. Some commentators, like Cover and O’Leary-Hawthorne, argue for the weaker essentialist position on the grounds that superessentialism cannot make use of the scholastic distinction between essential and accidental properties, of which Leibniz sometimes avails himself. According to moderate essentialism, Leibniz holds that properties attributable to the species are essential in one way and properties attributed to individuals are essential in a different way.
The weaker thesis of moderate essentialism is the view that only monadic properties are essential to an individual substance, and relational or extrinsic properties are reducible to monadic properties. The result of this view is that an individual is not “world-bound”; that is, a counterpart of that individual might exist in another possible world, and the essential properties of that individual are what designate it across possible worlds. It follows, then, that Jesus, for example, could be said to be free in giving himself up in the Garden of Gethsemane because in another possible world a counterpart of Jesus did not give himself up. Problematically, though, Leibniz explicitly mentions in one of the letters to Arnauld that the laws of nature are indeed a part of an individual’s CIC. Leibniz writes to Arnauld, “As there exist an infinite number of possible worlds, there exists also an infinite number of laws, some peculiar to one world, some to another, and each possible individual contains in the concept of him the laws of his world” (G. II, 40; LA 43).
To reconcile the passages where Leibniz suggests that individuals are world-bound, some commentators argue that it is logically consistent to hold that only the perception or expression of the other substance must exist, but not the substance itself. And since monads are “windowless,” that is, causally isolated, the other substance need not exist at all. In his late correspondence with Des Bosses, Leibniz suggests this very thing, namely, that God could create one monad without the rest of the monads in that world. Leibniz writes:
My reply is easy and has already been given. He can do it absolutely; he cannot do it hypothetically, because he has decreed that all things should function most wisely and harmoniously. There would be no deception of rational creatures, however, even if everything outside of them did not correspond exactly to their experiences, or indeed if nothing did, just as if there were only one mind… (G. II, 496; L 611).
The letter to Des Bosses is compelling evidence for moderate essentialism, but it does not entail it. In fact, conceiving of God’s ability to create only one monad in the actual world with only the expressions of every other substance is perfectly consistent with the superessentialist interpretation. The substances need not actually exist in order to support the claim that every property of a CIC is necessary for that substance. Put differently, if it were part of Peter’s CIC that he denied Christ three times, it need not follow that Christ actually existed for this property to hold, so long as the perceptions of Christ follow from the states of Peter’s substance.
c. Superintrinsicalness
One final variation of essentialism which we might attribute to Leibniz is called superintrinsicalness. This thesis, defended primarily by Sleigh, states that every individual substance has all its properties intrinsically. This view is distinct from moderate essentialism in a very important way. According to superintrinsicalness, both monadic and extrinsic properties are essential to an individual’s CIC. But, contrary to the superessentialist thesis, the properties that compose an individual’s CIC could be different; that is, some components of a substance’s CIC are necessary, and some are contingent. Leibniz writes in the Discourse:
For it will be found that the demonstration of this predicate of Caesar is not as absolute as those of numbers or of geometry, but that it supposes the sequence of things that God has freely chosen, a sequence based on God’s first free decree always to do what is most perfect and on God’s decree with respect to human nature, following out of the first decree, that man will always do (although freely) that which appears to be best. But every truth based on these kinds of decrees is contingent, even though it is certain; for these decrees do not change the possibility of things, and, as I have already said, even though it is certain that God always chooses the best, this does not prevent something less perfect from being and remaining possible in itself, even though it will not happen, since it is not its impossibility but its imperfection which causes it to be rejected. And nothing is necessary whose contrary is possible (A. VI. iv, 1548; AG 46).
One of the consequences of this view is that a substance’s CIC is contingent on the will of God. For example, on this view, it is a logical possibility that Adam could have had a completely different set of properties altogether. And since a substance could have a completely different CIC and relational properties are part of that CIC, then superintrinsicalness would deny that substances are world-bound. Since Leibniz denies world-bound individuals on this interpretation, he would not need any sort of counterpart theory that comes along with the superessentialist reading. After all, Leibniz’s depiction of counterparts states that there are individuals in other possible worlds that, though they are very similar, are numerically distinct from each other. But on the superintrinsicalness thesis, it may be the case that an individual in another possible world is identical to an individual in the actual world.
There is some textual evidence supporting superintrinsicalness as well. Leibniz writes to Arnauld, “Thus, all human events could not fail to occur as in fact they did occur, once the choice of Adam is assumed; but not so much because of the individual concept of Adam, although this concept includes them, but because of God’s plans, which also enter into the individual concept of Adam” (G. II, 51; LA 57). And yet, if a substance could have had a different CIC, then the notion of a haecceity becomes meaningless. The haecceity serves to individuate substances across possible worlds; if the haecceity could be different than it is, then the concept loses its purpose. We could not pick out the Caesar of this world and of another possible world if the very thing that makes Caesar who he is could change.
And yet, if Leibniz accepted superintrinsicalness, then he would have had an easy response to Arnauld’s worry that the complete concept doctrine diminishes the possibility of freedom. Leibniz could have simply responded to Arnauld that Judas freely betrayed Christ because, in another possible world, he did not betray Christ; although his haecceity in the actual world determined that he would betray Christ, the haecceity in another possible world may be different such that he did not betray Christ. But this is not the response that Leibniz gives. Instead, in defending a compatibilist view of freedom, he draws on some of the strategies for contingency discussed earlier.
5. Leibnizian Optimism and the “Best” Possible World
To paraphrase Ivan in The Brothers Karamazov, “The crust of the earth is soaked by the tears of the suffering.” Events like the Thirty Years War deeply affected Leibniz. His theodicean project was an attempt to explain and justify God’s permission of such suffering. Why would a perfectly wise, powerful, and good God permit suffering? And even if we were to grant that God must permit suffering to allow for greater goods such as compassion and empathy, why must there be so much of it? Would the world not have been better with less suffering? The crux of Leibniz’s philosophical optimism was that creating this world was the best that God could do; it was metaphysically impossible for the world to be better than it is. And so, God is absolved of responsibility for not creating something better. But how could Leibniz maintain a position so contrary to our intuition that the world could have been better with less suffering?
Arguably the most famous part of Leibniz’s philosophy is his solution to the problem of evil. The problem of evil is the most significant objection to classical theism, and it is one that Leibniz developed an entire system of possible worlds to address. He argues that God freely created the best of all possible worlds from amongst an infinite plurality of alternatives. Voltaire mocked such optimism in his Candide, suggesting that, at best, if this is really the finest world God could create, then God is hardly worth much reverence, and, at worst, that God does not exist at all. But what exactly did Leibniz mean by the “best” possible world? And was Voltaire’s criticism warranted? Leibniz has several responses to the problem of evil which draw on his complex theory of possible worlds.
First, Voltaire’s misinterpretation is grounded in the false assumption that the actual world is morally best. Instead, Leibniz contends that the world is metaphysically best. But how are these “moral” and “metaphysical” qualifications related to one another? After all, Leibniz sometimes remarks, as he does in the Discourse, that “God is the monarch of the most perfect republic, composed of all minds, and the happiness of this city of God is his principal purpose” (A. VI. iv, 1586; AG 67). And yet at other times, as in the Theodicy, he contends that “The happiness of rational creatures is one of the aims God has in view; but it is not his whole aim, nor even his ultimate aim” (G. VI, 169-170; H 189). It seems, then, that Leibniz is, at least on the face of it, unsure how much God is concerned with the happiness of creation: happiness is a “principal” purpose of God, and yet not an “ultimate aim.”
One way to reconcile these apparently disparate positions is to be clearer about what Leibniz means by happiness. Leibniz often reminds the reader that the actual world is not the best because it guarantees every substance the most pleasurable existence. Rather, he holds, as he does in the Confessio, that “Happiness is the state of mind most agreeable to it, and nothing is agreeable to a mind outside of harmony” (A. VI. iii, 116; CP 29). Put differently, the best of all possible worlds is metaphysically best because it is the world where rational minds can contemplate the harmonious nature of creation. Leibniz goes into more detail in The Principles of Nature and Grace, writing:
It follows from the supreme perfection of God that in producing the universe he chose the best possible plan, containing the greatest variety together with the greatest order; the best arranged situation, place and time; the greatest effect produced by the simplest means; the most power, the most knowledge, the most happiness and goodness in created things of which the universe admitted (G. VI, 603).
In short, Leibniz holds that while there is concern with the happiness of minds during the act of creation, the kind of happiness that God wishes to guarantee is not physical pleasure or the absence of physical pain, but instead the rational recognition that the actual world is the most harmonious.
Second, Leibniz contends that “best” does not mean “perfect” or even “very good.” While it is true that we oftentimes have no idea why bad things sometimes happen to good people and why good things sometimes happen to bad people, what we can be sure of is that God, as an ens perfectissimum, a most perfect being, chose this world because it was the best. And it is the best because it contains the most variety and plurality of substances governed by the fewest laws of nature. He writes in the Discourse:
One can say, in whatever manner God might have created the world, it would always have been regular and in accordance with a certain general order. But God has chosen the most perfect world, that is, the one which is at the same time the simplest in hypotheses and richest in phenomena (A. VI. iv, 1538; AG 39).
Even if we were to grant that Leibniz means something particular by “best,” how should we understand the criteria that the “best” world is the one that is richest in phenomena and governed by the simplest laws?
It is critical that Leibniz has more than one criterion for the best possible world. If there were only one criterion, like the concern for the happiness of creatures, for example, then there is a problem of maximization: whatever world God created, he could have created another world with more happiness. And since God could always create a better world, he could never act for the best, for there is no best. But since there is a world, it would follow either that this is not the best of all possible worlds or that there is no maximally perfect being. Malebranche (and Aquinas) held that there was no best world, and Leibniz wished to distance himself from their views. He writes in the Discourse, “They [the moderns like Malebranche] imagine that nothing is so perfect that there is not something more perfect—this is an error” (A. VI. iv, 1534; AG 37).
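The maximization problem can be put schematically. The notation below is a modern reconstruction, not Leibniz’s own: suppose worlds were ranked by a single value function V (say, total creaturely happiness) that is unbounded above. Then

\[
\forall w \,\exists w' \; V(w') > V(w)
\quad\Longrightarrow\quad
\neg\,\exists w^{*}\,\forall w\; V(w^{*}) \ge V(w),
\]

that is, if every world is surpassed by some better world, no best world exists under that single criterion. Leibniz’s appeal to several jointly constrained criteria, richness of phenomena balanced against simplicity of laws, is meant to make a well-defined optimum possible.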
Rather than maximizing one feature of a world, which would be impossible, Leibniz reasons that God must optimize the competing criteria of richness of phenomena, simplicity of laws, and abundance of creatures. He writes in the Discourse:
As for the simplicity of the ways of God, this holds properly with respect to his means, as opposed to the variety, richness, and abundance, which holds with respect to his ends or effects. And the one must be in balance with the other, as are the costs of a building and the size and beauty one demands of it (A. VI. iv, 1537; AG 39).
God, like an architect with unlimited resources, must nevertheless weigh competing variables to optimize the best creation.
Even if we grant the claim that God considers competing variables in creating the best world, we might still wonder why those particular variables are the ones of concern. Although it is unclear why Leibniz chose variety, richness, and abundance as the criteria, he points to simplicity as a possible overarching principle. Unfortunately, simplicity alone will not do, for it would be simpler to have only one substance rather than an abundance of substances. It seems, then, that the underlying criteria for the best of all possible worlds are simplicity together with a world worthy of the majesty of God.
The notion of simplicity is critical for Leibniz’s theodicean account. In fact, simplicity is the key concept that sets Leibniz’s account of God’s justice directly in line with his contemporary, Nicolas Malebranche. Leibniz remarks at one point that Malebranche’s theodicean account reduces in most substantial ways to his own. He writes in the Theodicy, “One may, indeed, reduce these two conditions, simplicity and productivity, to a single advantage, which is to produce as much perfection as is possible: thus Father Malebranche’s system in this point amounts to the same as mine” (G. VI, 241; H 257). The similarities of their accounts are readily apparent. Consider Malebranche’s remark that “God, discovering in the infinite treasures of his wisdom an infinity of possible worlds…, determines himself to create that world…that ought to be the most perfect, with respect to the simplicity of the ways necessary to its production or to its conservation” (OCM. V, 28).
Third, Leibniz appeals to intellectual humility and insists that our intuition that this is not the best possible world is simply mistaken. If we had God’s wisdom, then we would understand that this is the best possible world. Part of the appeal to intellectual humility is also the recognition that God evaluates the value of each world in its totality. In just the same way that it would be unfair to judge the quality of a film by looking at a single frame of the reel, Leibniz reasons that it is also unfair to judge the quality of the world by any singular instance of suffering. And given our relatively small existence in the enormous history of the universe, even long periods of suffering should be judged with proper context. World wars, global pandemics, natural disasters, famine, genocide, slavery, and total climate catastrophe are immense tragedies to be sure, but they mean relatively little in the context of the history of the universe.
The recognition that these cases of suffering mean little should not be interpreted to imply that they mean nothing. A perfectly benevolent God cares about the suffering of every part of creation, and yet, God must also weigh that suffering against the happiness and flourishing of the entirety of the universe, past, present, and future. And moreover, Leibniz reasons that every bit of suffering will ultimately lead to a greater good that redeems or justifies the suffering. To use the language in the contemporary literature in philosophy of religion, there is no “gratuitous evil.” Every case of evil ultimately helps improve the value of the entire universe. In a mature piece called the Dialogue on Human Freedom and the Origin of Evil, Leibniz writes:
I believe that God did create things in ultimate perfection, though it does not seem so to us considering the parts of the universe. It’s a bit like what happens in music and painting, for shadows and dissonances truly enhance the other parts, and the wise author of such works derives such a great benefit for the total perfection of the work from these particular imperfections that it is much better to make a place for them than to attempt to do without them. Thus, we must believe that God would not have allowed sin nor would he have created things he knows will sin, if he could not derive from them a good incomparably greater than the resulting evil (Grua 365-366; AG 115).
6. Compatibilist Freedom
a. Human Freedom
Leibniz was deeply concerned with how properly to understand freedom. In one sense, though, his hands were tied: given his fundamental commitment to the principle of sufficient reason as one of the “great principles of human reason” (G. VI, 602), Leibniz was straightforwardly compelled to determinism. Since the principle of sufficient reason rules out causes isolated from the causal series (one of the paradigmatic marks of thoroughgoing libertarian accounts of free will), the most that Leibniz could hope for was a kind of compatibilist account of freedom. And indeed, Leibniz, like most of his contemporaries, openly embraced the view that freedom and determinism are compatible.
According to the account of freedom developed in his Theodicy, free actions are those that satisfy three individually necessary and jointly sufficient conditions—they must be intelligent, spontaneous, and contingent. He writes in the Theodicy:
I have shown that freedom according to the definition required in the schools of theology, consists in intelligence, which involves a clear knowledge of the object of deliberation, in spontaneity, whereby we determine, and in contingency, that is, in the exclusion of logical or metaphysical necessity (G. VI, 288; H 288).
Leibniz derives the intelligence and spontaneity conditions from Aristotle, but adds contingency as a separate requirement. For an action to be free, Leibniz contends that the agent must have “distinct knowledge of the object of deliberation” (G. VI, 288; H 288), meaning that the agent must have knowledge of their action and also of alternative courses of action. For an action to be spontaneous, the agent’s actions must derive from an internal source and not be externally caused. There is a sense in which every action is spontaneous in that each substance is causally isolated and windowless from every other substance. And finally, actions must be contingent; that is, they must exclude logical or metaphysical necessity.
b. Divine Freedom
It was not just human freedom, though, that Leibniz treated as intelligent, spontaneous, and contingent. In fact, one of the most remarkably consistent elements of Leibniz’s thought, from his jurisprudential writings of the 1660s through to his mature views on metaphysics and philosophical theology, is that the gap between humans and God is a difference of degree and not of kind. There is nothing substantively different between humans and God. It is for precisely this reason that he insists in his natural law theory that we can discern the nature of justice and try to implement it in worldly affairs. Justice for humans ought to mirror the justice of God.
The implication of this theological view is that God is free in the same way that humans are free; God is perfectly free because his actions are also intelligent, spontaneous, and contingent. Since God is omniscient, he has perfect perceptions of the entire universe, past, present, and future. Since God determines his own actions without any external coercion, he is perfectly spontaneous. And since there is an infinite plurality of worlds, possible in themselves, which God could choose, his actions are contingent. Leibniz reasons that since God meets each of these conditions in the highest sense, God is perfectly free. And even though God is invariably led toward the Good, this is in no way an infringement on his freedom. He writes in the Theodicy:
…It is true freedom, and the most perfect, to be able to make the best use of one’s free will, and always to exercise this power, without being turned aside either by outward force or by inward passions, whereof the one enslaves our bodies and the other our souls. There is nothing less servile and more befitting the highest degree of freedom than to be always led towards the good, and always by one’s own inclination, without any constraint and without any displeasure. And to object that God therefore had need of external things is only a sophism (G. VI. 385; H 386).
Even with this mature account of freedom in place, Leibniz may still face the very same problem that concerned him prior to his meeting with Spinoza in 1676. If God’s nature requires him to do only the best, and assuming that there is only one uniquely best world, then it follows that the only possible world is the actual world. God’s essential nature, together with the fact of a uniquely best world, entails that God must create the best. And so, we may end up back in the necessitarian position after all, albeit in a somewhat different way than Spinoza. Although Leibniz endorses the anthropomorphic conception of God that Spinoza denies, both philosophers hold that God’s nature necessitates, in some way, that there is only one possible world, the actual world. Ultimately, it is up to us to decide whether the strategies for contingency and the account of human and divine freedom that Leibniz develops over the course of his long and illustrious career are successful enough to avoid the necessitarian threat that so concerned him.
7. References and Further Reading
a. Primary Sources
[A] Sämtliche Schriften und Briefe. Ed. Deutsche Akademie der Wissenschaften. Darmstadt, Leipzig, Berlin: Akademie Verlag, 1923. Cited by series, volume, page.
[AG] Philosophical Essays. Translated and edited by Roger Ariew and Dan Garber. Indianapolis: Hackett, 1989.
[CP] Confessio Philosophi: Papers Concerning the Problem of Evil, 1671–1678. Translated and edited by Robert C. Sleigh, Jr. New Haven, CT: Yale University Press, 2005.
[G] Die Philosophischen Schriften von Gottfried Wilhelm Leibniz. Edited by C.I. Gerhardt. Berlin: Weidmann, 1875-1890. Reprint, Hildesheim: Georg Olms, 1978. Cited by volume, page.
[Grua] Textes inédits d’après les manuscrits de la bibliothèque provinciale de Hanovre. Edited by Gaston Grua. Paris: Presses Universitaires, 1948. Reprint, New York and London: Garland Publishing, 1985.
[H] Theodicy: Essays on the Goodness of God, the Freedom of Man and the Origin of Evil. Translated by E.M. Huggard. La Salle, IL: Open Court, 1985.
[L] Philosophical Papers and Letters. Edited and translated by Leroy E. Loemker. 2nd edition. Dordrecht: D. Reidel, 1969.
[LA] The Leibniz-Arnauld Correspondence. Edited by H.T. Mason. Manchester: Manchester University Press, 1967.
[OCM] Œuvres complètes de Malebranche (20 volumes). Edited by A. Robinet. Paris: J. Vrin, 1958–84.
b. Secondary Sources
Adams, Robert Merrihew. Leibniz: Determinist, Theist, Idealist. New York: Oxford University Press, 1994.
Bennett, Jonathan. Learning from Six Philosophers Vol. 1. New York: Oxford University Press, 2001.
Blumenfeld, David. “Is the Best Possible World Possible?” Philosophical Review 84, No. 2, April 1975.
Blumenfeld, David. “Perfection and Happiness in the Best Possible World.” In Cambridge Companion to Leibniz. Edited by Nicholas Jolley. Cambridge: Cambridge University Press, 1994.
Broad, C.D. Leibniz: An Introduction. Cambridge: Cambridge University Press, 1975.
Brown, Gregory and Yual Chiek. Leibniz on Compossibility and Possible Worlds. Cham, Switzerland: Springer, 2016.
Brown, Gregory. “Compossibility, Harmony, and Perfection in Leibniz.” The Philosophical Review 96, No. 2, April 1987.
Cover, J.A. and John O’Leary-Hawthorne. Substance and Individuation in Leibniz. Cambridge: Cambridge University Press, 1999.
Curley, Edwin. “Root of Contingency.” In Leibniz: A Collection of Critical Essays. Edited by Harry Frankfurt. New York: Doubleday, 1974.
D’Agostino, Fred. “Leibniz on Compossibility and Relational Predicates.” The Philosophical Quarterly 26, No. 103, April 1976.
Hacking, Ian. “A Leibnizian Theory of Truth.” In Leibniz: Critical and Interpretative Essays. Edited by Michael Hooker. Minneapolis: University of Minnesota Press, 1982.
Horn, Charles Joshua. “Leibniz and Impossible Ideas in the Divine Intellect.” In Internationaler Leibniz-Kongress X Vorträge IV. Edited by Wenchao Li. Hannover: Olms, 2016.
Horn, Charles Joshua. “Leibniz and the Labyrinth of Divine Freedom.” In The Labyrinths of Leibniz’s Philosophy. Edited by Aleksandra Horowska. Peter Lang Verlag, 2022.
Koistinen, Olli, and Arto Repo. “Compossibility and Being in the Same World in Leibniz’s Metaphysic.” Studia Leibnitiana 31, 2021.
Look, Brandon. “Leibniz and the Shelf of Essence.” The Leibniz Review 15, 2005.
Maher, Patrick. “Leibniz on Contingency.” Studia Leibnitiana 12, 1980.
Mates, Benson. “Individuals and Modality in the Philosophy of Leibniz.” Studia Leibnitiana 4, 1972.
Mates, Benson. “Leibniz on Possible Worlds.” In Leibniz: A Collection of Critical Essays. Edited by Harry Frankfurt, 335-365. Notre Dame: University of Notre Dame Press, 1976.
Mates, Benson. The Philosophy of Leibniz: Metaphysics and Language. New York: Oxford University Press, 1986.
McDonough, Jeffrey. “Freedom and Contingency.” In The Oxford Handbook of Leibniz. New York: Oxford University Press, 2018.
McDonough, Jeffrey. “The Puzzle of Compossibility: The Packing Strategy.” Philosophical Review 119, No. 2, 2010.
Messina, James and Donald Rutherford. “Leibniz on Compossibility.” Philosophy Compass 4, No. 6, 2009.
Mondadori, Fabrizio. “Leibniz and the Doctrine of Inter-World Identity.” Studia Leibnitiana 7, 1975.
Mondadori, Fabrizio. “Reference, Essentialism, and Modality in Leibniz’s Metaphysics.” Studia Leibnitiana 5, 1973.
Rescher, Nicholas. Leibniz: An Introduction to His Philosophy. Totowa, New Jersey: Rowman and Littlefield, 1979.
Rescher, Nicholas. Leibniz’s Metaphysics of Nature. Dordrecht, 1981.
Rescher, Nicholas. The Philosophy of Leibniz. Englewood Cliffs, NJ: Prentice Hall, 1967.
Rowe, William. Can God Be Free? New York: Oxford University Press, 2006.
Russell, Bertrand. A Critical Exposition of the Philosophy of Leibniz, 2nd ed. London: George Allen and Unwin, 1937. Reprint, London: Routledge, 1997.
Rutherford, Donald. Leibniz and the Rational Order of Nature. Cambridge: Cambridge University Press, 1995.
Rutherford, Donald. “The Actual World.” In The Oxford Handbook of Leibniz. New York: Oxford University Press, 2018.
Sleigh, Robert C., Jr. Leibniz and Arnauld: A Commentary on Their Correspondence. New Haven: Yale University Press, 1990.
Wilson, Margaret D. “Compossibility and Law.” In Causation in Early Modern Philosophy: Cartesianism, Occasionalism, and Pre-Established Harmony. Edited by Steven Nadler. University Park, Pennsylvania: Pennsylvania State University Press, 1993.
Wilson, Margaret D. “Possible Gods.” Review of Metaphysics 32, 1978/79.
Author Information
Charles Joshua Horn
Email: jhorn@uwsp.edu
University of Wisconsin Stevens Point
U. S. A.
Faith: Contemporary Perspectives
Faith is a trusting commitment to someone or something. Faith helps us meet our goals, keeps our relationships secure, and enables us to retain our commitments over time. Faith is thus a central part of a flourishing life.
This article is about the philosophy of faith. There are many philosophical questions about faith, such as: What is faith? What are its main components or features? What are the different kinds of faith? What is the relationship between faith and other similar states, such as belief, trust, knowledge, desire, doubt, and hope? Can faith be epistemically rational? Practically rational? Morally permissible?
This article addresses these questions. It is divided into three main parts. The first is about the nature of faith. This includes different kinds of faith and various features of faith. The second discusses the way that faith relates to other states. For example, what is the difference between faith and hope? Can someone have faith that something is true even if they do not believe it is true? The third discusses three ways we might evaluate faith: epistemically, practically, and morally. While faith is not always rational or permissible, this section covers when and how it can be. The idea of faith as a virtue is also discussed.
This article focuses on contemporary work on faith, largely since the twentieth century. Historical accounts of faith are also significant and influential; for an overview of those, see the article “Faith: Historical Perspectives.”
1. The Nature of Faith
As we saw above, faith is a trusting commitment to someone or something. While this definition is a good start, it leaves many questions unanswered. This section is on the nature of faith and is divided into two subsections. The first covers distinctions among different kinds of faith and the second explores features of faith.
a. Types of Faith
This subsection outlines distinctions among different kinds of faith. It focuses on four distinctions: attitude-focused faith vs. act-focused faith, faith-that vs. faith-in, religious vs. non-religious faith, and important vs. mundane faith.
i. Attitude-Focused vs. Act-Focused
One of the most important distinctions is faith as an attitude compared to faith as an action. Faith, understood as an attitude, is similar to attitudes like beliefs or desires. In the same way that you might believe that God exists, you might have faith that God exists. Both are attitudes (things in your head), rather than actions (things you do). Call this attitude-focused faith.
Attitude-focused faith is thought to involve at least two components (Audi 2011: 79). The first is a belief-like, or cognitive, component. This could simply be a belief. While some contend that faith always involves belief, others argue that faith can involve something weaker, but still belief-like: some confidence that the object of faith is true, thinking it is likely to be true, supported by the evidence, or the most likely of the options under consideration. Either way, attitude-focused faith involves something belief-like. For example, if you have faith that your friend will win their upcoming basketball game, you will think there is at least a decent chance they win. It does not make sense to have faith that your friend’s team will win if you are convinced that they are going to get crushed. Later, this article returns to questions about the exact connection between faith and belief, but it is relatively uncontroversial that attitude-focused faith involves a belief-like component.
The second component of attitude-focused faith is a desire-like, or conative, component. Attitude-focused faith involves a desire for, or a positive evaluation of, its object. Returning to our example, if you have faith that your friend will win their upcoming game, then you want them to win the game. You do not have faith that they will win if you are cheering for the other team or if you want them to lose. This example illustrates why plausibly, attitude-focused faith involves desire; this article returns to this later as well.
A second kind of faith is not in your head, but an action. This kind of faith is similar to taking a “leap of faith”—an act of trust in someone or something. For example, if your friend promises to pick you up at the airport, waiting for them rather than calling a taxi demonstrates faith that they will pick you up. Walking across a rickety bridge demonstrates faith that the bridge will hold you. Doing a trust fall demonstrates faith that someone will catch you. Call this type of faith an act of faith, or action-focused faith.
On some views, such as Kvanvig’s (2013), faith is a disposition. In the same way that glass is disposed to shatter (even if it never actually shatters), on dispositional views of faith, having faith is a matter of being disposed to do certain things (even if the faithful never actually do them). The view that faith is a disposition could be either attitude-focused or action-focused. Faith might be a disposition to act in certain ways, maybe ways that demonstrate trust or involve risk. This type of faith would be action-focused (see Kvanvig 2013). Faith might instead be a disposition to have certain attitudes: like to believe, be confident in, and/or desire certain propositions to be true. This type of faith would be attitude-focused (see Byerly 2012).
What is the relationship between attitude-focused faith and action-focused faith? They are distinct states, but does one always lead to the other? One might think that, in the same way that beliefs and desires cause actions (for example, your belief that there is food in the fridge and your desire for food leads you to open the fridge), attitude-focused faith will cause (or dispose you toward) action-focused faith, as attitude-focused faith is made up of belief- and desire-like states (see Jackson 2021). On the other hand, we may not always act on our beliefs and our desires. So one question is: could you have attitude-focused faith without action-focused faith?
A related question is whether you could have action-focused faith without attitude-focused faith. Could you take a leap of faith without having the belief- and desire-like components of attitude-focused faith? Speak (2007: 232) provides an example that suggests that you could take a leap of faith without a corresponding belief. Suppose Thomas was raised in circumstances that instilled a deep distrust of the police. Thomas finds himself in an unsafe situation and a police officer is attempting to save him; Thomas needs to jump from a dangerous spot so the officer can catch him. While the officer has provided Thomas with evidence that he is reliable, Thomas cannot shake the belief instilled from his upbringing that the police are not trustworthy. Nonetheless, Thomas jumps. Intuitively, Thomas put his faith in the officer, even without believing that the officer is trustworthy.
Generally, you can act on something, even rationally, if you have a lot to gain if it is true, even if you do not believe that it is true. Whether this counts as action-focused faith without attitude-focused faith, however, will depend on the relationship between faith and belief, a question addressed in a later section.
ii. Faith-That vs. Faith-In
A second distinction is between faith-that and faith-in. Faith-that is faith that a certain proposition is true. Propositions are true or false statements, expressed by declarative sentences. So 1+1=2, all apples are red, and God exists are all propositions. In the case of faith, you might have faith that a bridge will hold you, faith that your friend will pick you up from the airport, or faith that God exists. Faith-that is similar to other propositional attitudes, like belief and knowledge. This suggests that attitude-focused faith is a species of faith-that, since the attitudes closely associated with faith, like belief and hope, are propositional attitudes.
There’s also faith-in. Faith-in is not faith toward propositions, but faith toward persons or ideals. For example, you might have faith in yourself, faith in democracy, faith in your spouse, faith in a political party, or faith in recycling.
Some instances of faith can be expressed as both faith-that and faith-in. For example, theistic faith might be described as faith-that God exists or faith-in God. You might also have faith-that your spouse is a good person or faith-in your spouse. There are questions about the relationship between faith-that and faith-in. For example, is one more fundamental? Do all instances of faith-that reduce to faith-in, or vice versa? Or are they somewhat independent? Is there a significant difference between faith-in X, and faith-that a proposition about X is true?
iii. Religious vs. Non-Religious
A third distinction is between religious faith and secular faith. The paradigm example of religious faith is faith in God or gods, but religious faith can also include: faith that certain religious doctrines are true, faith in the testimony of a religious leader, faith in a Scripture or holy book, or faith in the church or in a religious group. In fact, according to one view that may be popular in certain religious circles, “faith” is simply belief in religious propositions (see Swindal 2021).
However, faith is not merely religious—there are ample examples of non-religious faith. This includes the faith that humans have in each other, faith in secular goals or ideals, and faith in ourselves. It is a mistake to think that faith is entirely a religious thing or reserved only for the religious. Faith is a trusting commitment—and this can involve many kinds of commitments. This includes religious commitment, but also includes interpersonal commitments like friendship or marriage, intrapersonal commitments we have to ourselves or our goals, and non-personal commitments we have to ideals or values.
One reason this distinction is important is that some projects have good reason to focus on one or the other. For example, on some religious traditions, like the Christian tradition, faith is a condition for salvation. But presumably, not any kind of faith will do—religious faith is required. One project in Christian philosophical theology provides an analysis of the religious faith that is closely connected to salvation (see Bates 2017). Projects like these have good reason to set secular faith aside. Others may have a special interest in secular faith and thus set religious faith aside.
This article considers both religious and non-religious faith. While they are different in key ways, they both involve trusting commitments, and many contemporary accounts of faith apply to both.
iv. Important vs. Mundane
A final distinction is between important faith and mundane faith. Important faith involves people, ideals, or values that are central to your life goals, projects, and commitments. Examples of important faith include religious faith, faith in your spouse, or faith in your political or ethical values. In most cases, important faith is essential to your life commitments and often marks values or people that you build your life around.
But not all faith is so important. You might have faith that your office chair will hold you, faith that your picnic will not be rained out, or faith that your spouse’s favorite football team will win their game this weekend. These are examples of mundane faith. While mundane faith still plausibly involves some kind of trusting commitment, this commitment is less important and more easily given up. You may have a weak commitment to your office chair. But—given it is not a family heirloom—if the chair started falling apart, you would quickly get rid of it and buy a new one. So important faith is associated with your central, life-shaping commitments, and mundane faith is associated with casual commitments that are more easily given up.
One might distinguish between objectively important faith—faith held to objectively valuable objects—and subjectively important faith—faith held to objects that are important to a particular individual but may or may not be objectively valuable. For example, some critics of religion might argue that while religious faith might be subjectively important to some, it is nonetheless not objectively important.
While this article focuses mostly on important faith, some of what is discussed also applies to mundane faith, but it may apply to a lesser degree. For example, if faith involves a desire, then the desires associated with mundane faith may be weaker. Now, consider features of faith.
b. Features of Faith
This subsection discusses four key features of faith: trust, risk, resilience, and going beyond the evidence. These four features are often associated with faith. They are not necessarily synonymous with faith, and not all accounts of faith give all four a starring role. Nonetheless, they play a role in understanding faith and its effects. Along the way, this article considers specific accounts that closely associate faith with each feature.
i. Trust
The first feature of faith is trust. As we have noted, faith is a trusting commitment. Trust involves reliance on another person. This can include, for example, believing what they say, depending on them, or being willing to take risks that hinge on them coming through for you. Faith and trust are closely connected, and some even use faith and trust as synonymous (Bishop 2016).
The close association between faith and trust lends itself nicely to a certain view of faith: faith is believing another’s testimony. Testimony is another person’s report that something is true. Accounts that connect faith and testimony are historically significant, tracing back to Augustine, Locke, and Aquinas. Recent accounts of faith as believing another’s testimony include Anscombe (2008) and Zagzebski (2012). Anscombe, for example, says to have faith that p is to believe someone that p. Religious faith might be believing God’s testimony or the testimony of religious leaders. Interpersonal faith might be believing the testimony of your friends or family.
Plausibly, trust is a key feature—likely the key feature—of interpersonal faith. Faith in others involves trusting another person: this includes faith in God or gods, but also faith in other people and faith in ourselves. It is plausible that even propositional faith can be understood in terms of trust. For example, propositional faith that your friend will pick you up from the airport involves trusting your friend. Even in mundane cases propositional faith could be understood as trust: if you have faith it will be sunny tomorrow, you trust it will be sunny tomorrow.
ii. Risk
Faith is also closely related to risk. William James (1896/2011) discusses a hiker who gets lost. She finally finds her way back to civilization, but as she is walking, she encounters a deep and wide crevice on the only path home. Suppose that, to survive, she must jump this crevice, and it is not obvious that she can make the jump. She estimates that she has about a 50/50 chance. She has two choices: she can give up and likely die in the wilderness. Or she can take a (literal) leap of faith and do her best to make it across the crevice. This decision to jump involves a risk: she might fail to make it to the other side and fall to her death.
Risk involves making a decision in a situation where some bad outcome is possible but uncertain. Jumping a wide crevice involves the possible bad outcome of falling in. Gambling involves the possible bad outcome of losing money. Buying a stock involves the possible bad outcome of its value tanking.
If faith is connected to risk, this suggests two things about faith. First, faith is associated with a degree of uncertainty. For example, if one has faith that something is true, then one is uncertain regarding its truth or falsity. Second, faith is exercised in cases where there is a potentially bad outcome. The outcome might involve the object of faith’s being false, unreliable, or negative in some other way. For example, if you have faith that someone will pick you up at the airport, there is the possibility that they do not show up. If you have faith in a potential business partner, there is the possibility that they end up being dishonest or difficult to work with.
These examples illustrate the connection between risk and action-focused faith. When we act in faith, there is usually some degree of uncertainty involved and a potentially bad outcome. If you have action-focused faith that your spouse will pick you up, you wait for them rather than calling a taxi, and you risk waiting at the airport for a long time, perhaps even missing an important appointment, if your spouse does not show. If you have action-focused faith that someone is a good business partner, you dedicate time, money, and energy to your shared business, and you risk wasting all those resources if they turn out to be dishonest or impossible to work with. Or you might have action-focused faith that God exists and dedicate your life to God, which risks wasting your life if God does not exist.
Attitude-focused faith may also involve risk: some kind of mental risk. William James (1896/2011) discusses our two epistemic goals: believe truth and avoid error. We want to have true beliefs, but if that were all we cared about, we would believe everything. We want to avoid false beliefs, but if that were all we cared about, we would believe nothing. Much of the ethics of belief is about balancing these two goals, and this balance can involve a degree of mental risk. For example, suppose you have some evidence that God exists, but your evidence is not decisive, and you also recognize that there are some good arguments that God does not exist. While it is safer to withhold judgment on whether God exists, you also could miss out on a true belief. Instead, you might take a “mental” risk, and go ahead and believe that God exists. While you are not certain that God exists, and believing risks getting it wrong, you also face a bad outcome if you withhold judgment: missing out on a true belief. By believing that God exists in the face of indecisive evidence, you take a “mental” or “attitude” risk. James argues that this kind of mental risk can be rational (“lawful”) when “reason does not decide”—our evidence does not make it obvious that the statement believed is true or false—and we face a “forced choice”—we have to commit either way.
The view that faith involves an attitude-risk closely resembles John Bishop’s account of faith, which is inspired by insights from James. Bishop (2007) argues that faith is a “doxastic venture” (doxastic meaning belief-like). Bishop’s view is that faith involves believing beyond the evidence. Bishop argues that certain propositions (including what he calls “framework principles”) are evidentially undecidable, meaning our evidence cannot determine whether the claim is true or false. In these cases, you can form beliefs for non-evidential reasons—for example, beliefs can be caused by desires, emotions, affections, and so forth. This non-evidential believing enables you to believe beyond the evidence (see also Ali 2013).
iii. Resilience
A third feature of faith is resilience. Faith’s resilience stems from the connection between faith and commitment. Consider some examples. If you have faith that your favorite team will win their upcoming game, you have some kind of commitment to that team. If you have faith that God exists, this involves a religious commitment. You might commit to finishing a degree, picking up a new instrument, a marriage, or a religion. These commitments can be difficult to keep—you get discouraged, doubt yourself or others, your desires and passions fade, and/or you get counterevidence that makes you wonder if you should have committed in the first place. Faith’s resilience helps you overcome these obstacles and keep your commitments.
Lara Buchak’s (2012) risky commitment view of faith brings risk and commitment together. On Buchak’s view, faith involves stopping one’s search for evidence and making a commitment. Once this commitment is made, you will maintain that commitment, even in the face of new counterevidence. For example, suppose you are considering making a religious commitment. For Buchak, religious faith involves stopping your search for evidence regarding whether God exists and taking action: making the commitment. Of course, this does not mean that you can no longer consider the evidence or have to stop reading philosophy of religion, but you are not looking for new evidence to decide whether to make (or keep) the commitment. Once you have made this religious commitment, you will continue in that commitment even if you receive evidence against the existence of God—at least, to a degree.
The literature on grit is also relevant to faith’s resilience. Grit, a phenomenon discussed by both philosophers and psychologists, is the ability to persevere to achieve long-term, difficult goals (Morton and Paul 2019). It takes grit to train for a marathon, survive a serious illness, or remain married for decades. Matheson (2018) argues that faith is gritty, and this helps explain how faith can be both rational and voluntary. Malcolm and Scott (2021) argue that faith’s grit helps the faithful be resilient to a variety of challenges. Along similar lines, Jackson (2021) argues that the belief- and desire-like components of faith explain how faith can help us keep our long-term commitments, in light of both epistemic and affective obstacles.
iv. Going Beyond the Evidence
A final feature of faith is that it goes beyond the evidence. This component is related to faith’s resilience. Faith helps you maintain your commitments because it goes beyond the evidence. You might receive counterevidence that makes you question whether you should have committed in the first place. For example, you might commit to a certain major, but a few months in, realize the required classes are quite difficult and demanding. You might wonder whether you are cut out for that field of study. Or you might have a religious commitment, but then encounter evidence that an all-good, all-loving God does not exist—such as the world’s serious and terrible evils. In either case, faith helps you continue in your commitment in light of this counterevidence. And if the evidence is misleading—so you are cut out for the major, or God does exist—then this is a very good thing.
The idea that faith goes beyond the evidence raises questions about rationality. How can faith go beyond the evidence but still be rational? Is it not irrational to disrespect or ignore evidence? This article returns to this question later, but for now, note that there is a difference between going beyond the evidence and going against the evidence. Going beyond the evidence might look like believing or acting when the evidence is decent but imperfect. Bishop’s account, for example, is a way that faith might “venture” beyond the evidence (2007). However, this does not mean faith goes against the evidence, requiring you to believe something that you have overwhelming evidence is false.
Some do argue that faith goes against the evidence. They fall into two main camps. The first camp thinks that faith goes against the evidence, and this is a bad thing; faith is harmful, and we should avoid having faith at all costs. The New Atheists, such as Richard Dawkins and Sam Harris, have a view like this (but see Jackson 2020). The second camp thinks that faith goes against the evidence but that is actually a good thing. This view is known as fideism. Kierkegaard argued for fideism, and he thought that faith is valuable because it is absurd: “The Absurd, or to act by virtue of the absurd, is to act upon faith” (Journals, 1849). Nonetheless, Kierkegaard thought having faith is one of the highest ideals to which one can aspire. This article returns to the idea that faith “goes beyond the evidence” in Section 3.
2. Faith and Other States
This section is about the relationship between faith and related attitudes, states, or actions: belief, doubt, desire, hope, and acceptance. Unlike the features just discussed, these states are normally not part of the definition or essence of faith. Nonetheless, these states are closely associated with faith. Appreciating the ways that faith is similar to, but also different from, these states provides a deeper understanding of the nature of faith.
a. Faith and Belief
When it comes to attitudes associated with faith, many first think of belief. Believing something is taking it to be the case or regarding it as true. Belief is a propositional attitude: an attitude taken toward a statement that is either true or false.
What is the relationship between faith and belief? Since belief is propositional, it is also natural to focus on propositional faith; so what is the relationship between belief that p and faith that p? More specifically: does belief that p entail faith that p? And: does faith that p entail belief that p? The answer to the first question is no; belief does not entail propositional faith. This is because propositional faith involves a desire-like or affective component; belief does not. You might believe that there is a global pandemic or believe that your picnic was rained out. However, you do not have faith that those things are true, because you do not desire them to be true.
The second question—whether propositional faith entails belief—is significantly more controversial. Does faith that p entail belief that p? Answers to this question divide into three main views. Those who say yes normally argue that faith is a kind of belief. The no camp divides into two groups. The first group argues that faith does not have to involve belief, but it involves something belief-like. A final group argues that faith is something totally different from belief. This article considers each view in turn. (See Buchak 2017 for a very helpful, more detailed taxonomy of various views of faith and belief.)
i. Faith as a Belief
On some views, faith is a belief. Call these “doxastic” views of faith. We have discussed two doxastic views already. The first is the view that faith is simply belief in a religious proposition; it was noted that, if intended as a general theory of faith, this seems narrow, as one can have non-religious faith. (But it may be more promising as an account of religious faith.) The second view is Anscombe’s (2008) and Zagzebski’s (2012) view that faith is a belief based on testimony, discussed in the previous section on trust. A third view traces back to Augustine and Calvin, and is more recently defended by Plantinga (2000). On this view, faith is a belief that is formed through a special mental faculty known as the sensus divinitatis, or the “sense of the divine.” For example, you might watch a beautiful sunset and form the belief that there is a Creator; you might be in danger and instinctively cry out to God for help. (Although Plantinga also is sympathetic to views that connect faith and testimony; see Plantinga 2000: ch. 9.)
Note two things about doxastic views. First, most doxastic views add other conditions in addition to belief. For instance, as we have discussed, it is widely accepted that faith has an affective, desire-like component. So on one doxastic view, faith involves a belief that p and a desire for p. You could also add other conditions: for example, faith is associated with dispositions to act in certain ways, take certain risks, or trust certain people. What unites doxastic views is that faith is a kind of belief; faith is belief-plus.
Second, the view that faith entails belief does not require you to accept that faith is a belief. You could have a view on which faith is not a belief, but every time you have faith that a statement is true, you also believe it—faith and belief “march in step” (analogy: just because every animal with a heart also has a kidney does not mean hearts are kidneys). So another view in the family of doxastic views is that faith is not a belief, but always goes along with belief.
ii. Faith as Belief-like
Some resist the idea that faith entails belief. Daniel Howard-Snyder (2013) provides several arguments against doxastic views of faith. First, Howard-Snyder argues that if one can have faith without belief, this makes sense of the idea that faith is compatible with doubt. Doubting might cause you to give up a belief, but Howard-Snyder argues that you can maintain your faith even in the face of serious doubts. Second, other belief-like attitudes can play belief’s role: for example, you could think p is likely, be confident in p, think p is more likely than not, and so forth. If you do not flat-out believe that God exists, but are confident enough that God exists, Howard-Snyder argues that you can still have faith that God exists. A final argument that you can have faith without belief involves real-life examples of faith without belief. Consider the case of Mother Teresa. Mother Teresa went through a “dark night of the soul” in her later life. During this dark time, in her journals, she confessed that her doubts were so serious that at times, she did not believe that God existed. Nonetheless, she maintained her commitment and dedication to God. Many would not merely say she had faith; Mother Teresa was a paradigm example of a person of faith. This again supports the idea that you can have faith without belief. In general, proponents of non-doxastic views do not want to exclude those who experience severe, belief-prohibiting doubts from having religious faith. In fact, one of the functions of faith is to help you keep your commitments in the face of such doubts.
Howard-Snyder’s positive view is that faith is “weakly doxastic.” Faith does not require belief but requires a belief-like attitude, such as confidence, thinking likely, and so forth. He adds other conditions as well; in addition to a belief-like attitude, he thinks that faith that p requires a positive view of p, a positive desire-like attitude toward p, and resilience to new counterevidence against the truth of p.
In response to Howard-Snyder, Malcolm and Scott (2017) defend the view that faith entails belief. While they agree with Howard-Snyder that faith is compatible with doubt, they point out that belief is also compatible with doubt. It is not uncommon or odd to say things like “I believe my meeting is at 3 pm, but I’m not sure,” or “I believe that God exists, but I have some doubts about it.” Malcolm and Scott go on to argue that faith without belief, especially religious faith without belief, is a form of religious fictionalism. Fictionalists speak about and act on something for pragmatic reasons, but they do not believe the claims that they are acting on and speaking about. For example, you might go to church, pray, or recite a creed, but you do not believe that God exists or what the creed says—you merely do those things for practical reasons. Malcolm and Scott argue that there is something suspicious about this, and there is reason to think that fictionalists do not have genuine faith. They conclude that faith entails belief, and more specifically, religious faith requires the belief that God exists.
This debate will not be settled here, but note that there are various responses that the defender of the weakly-doxastic view of faith could provide. Concerning the point about doubt, a proponent of weak doxasticism might argue that faith is compatible with more doubt than belief. Even if belief is compatible with some doubt—as it seems fine to say, “I believe p but there’s a chance I’m wrong”—it seems like faith is compatible with even more doubt—more counterevidence or lower probabilities. On fictionalism, Howard-Snyder (2018) responds that religious fictionalism is a problem only if the fictionalist actively believes that the claims they are acting on are false. However, if they are in doubt but moderately confident, or think the claims are likely, even if they do not believe the claims, it is more plausible that fictionalists can have faith. You might also respond by appealing to some of the distinctions discussed above: for example, perhaps religious faith entails belief, but non-religious faith does not.
iii. Faith as Totally Different from Belief
A third view pulls faith even further away from belief. On this view, faith does not entail belief, nor does faith entail something belief-like, but instead, faith is totally different from belief. This view is often known as the pragmatist view of faith.
This article returns to these views later, but here is a summary. Some authors argue that faith only involves accepting, or acting as if, something is true (Swinburne 1981; Alston 1996). Others argue that faith is a disposition to act in service of an ideal (Dewey 1934; Kvanvig 2013), or that faith involves pursuing a relationship with God (Buckareff 2005). Some even argue that faith is incompatible with belief; for example, Pojman (1986) argues that faith is profound hope, and Schellenberg (2005) argues that faith is imaginative assent. Both argue that one cannot have faith that p if one believes that p.
Pragmatist views depart drastically from both doxastic and weakly doxastic accounts of faith. Faith does not even resemble belief, but is something totally unlike belief, and more closely related to action, commitment, or a disposition to act.
There are two ways to view the debate between doxastic, weakly doxastic, and pragmatic views of faith. One possibility is that there is a single thing, “faith,” and there are various views about what exactly faith amounts to: is faith a belief, similar to a belief, or not at all like belief? Another possibility, however, is that there are actually different kinds of faith. Plausibly, both doxastic and weakly doxastic views are describing attitude-focused faith, and pragmatic views of faith are describing action-focused faith. This second possibility does not mean there are not any interesting debates regarding faith. It still leaves open whether attitude-focused faith requires belief, or merely something belief-like, and if the latter, what those belief-like attitudes can be, and how weak they can be. It also leaves open which view of action-focused faith is correct. However, you may not have to choose between pragmatist views on the one hand, and doxastic or weakly doxastic views on the other; each view may simply be describing a different strand of faith.
b. Faith and Doubt
One might initially think that faith and doubt are opposed to each other. That is, those with faith will never doubt, or if they do doubt, their faith is weak. However, if you agree with the points made in the previous section—Howard-Snyder’s argument that faith is compatible with doubt; and Malcolm and Scott’s point that belief is also compatible with doubt—there is reason to reject the view that faith and doubt are completely opposed to each other.
Howard-Snyder (2013: 359) distinguishes between two ways of doubting. First, you might simply doubt p. Howard-Snyder says that this involves an inclination to disbelieve p. If you doubt that it will rain tomorrow, you will tend to disbelieve that it will rain tomorrow. This type of doubt—doubting p—might be in tension with, or even inconsistent with faith. Even those who deny that faith entails belief nonetheless think that faith is not consistent with disbelief; you cannot have faith that p if you think p is false (but see Whitaker 2019 and Lebens 2021).
However, not all doubt is closely associated with disbelief. You might instead be in doubt about p, or have some doubts about p. Moon (2018) argues that this type of doubt involves (roughly) thinking you might be wrong. In these cases, you are pulled in two directions—maybe you believe something, but then receive some counterevidence. Moon argues that this second kind of doubt is compatible with belief (2018: 1831), and Howard-Snyder argues that it is compatible with faith. Howard-Snyder says, “Being in doubt is no impediment to faith. Doubt is not faith’s enemy; rather, the enemies of faith are misevaluation, indifference or hostility, and faintheartedness” (2013: 370).
Thus, there is good reason to think that having doubts is consistent with faith. Those who deny that faith entails belief might argue that faith is compatible with more doubts than belief. What is more, faith may be a tool that helps us maintain our commitments in light of doubts. For example, Jackson (2019) argues that evidence can move our confidence levels around, but it does not always change our beliefs. Suppose John is happily engaged and will be married soon, and based on the sincerity and commitment of both partners, he has faith that they will not get divorced. Then, he learns that half of all marriages end in divorce. Learning this should lower his confidence that they will remain committed, causing him to have doubts that his marriage will last. However, this counterevidence does not mean he should give up his faith or the commitment. His faith in himself and his partner can help him maintain the commitment, even in light of the counterevidence and resulting doubts.
c. Faith and Desire
Recall that attitude-focused faith involves a desire for, or a positive evaluation of, the object of faith. If you have faith that your friend will win her upcoming race, then you want her to win; it does not make sense to claim to have faith she will win if you hope she will lose. Similarly, you would not have faith that your best friend has cancer, or that your father will continue smoking. A large majority of the authors writing on the philosophy of faith maintain that faith involves a positive evaluation of its object (Audi 2011: 67; Howard-Snyder 2013: 362–3). Even action-focused faith may involve desire. While it is more closely identified with actions, rather than attitudes, it could still involve or be associated with desires or pro-attitudes.
Malcolm and Scott (2021) challenge the orthodox view that faith entails desire or positivity. They argue that, while faith might often involve desire, the connection is not seamless. For example, you might have faith that the devil exists or faith that hell is populated—not because you want these to be true, but because these doctrines are a part of your religious commitment. You might find these doctrines confusing and difficult to swallow, and even hope that they are false, but you trust that God has a plan or reason to allow these to be true. Malcolm and Scott argue that faith in such cases does not involve positivity toward its object—and in fact, it may involve negativity.
Furthermore, crises of faith can involve the loss of desire for the object of faith. There has been much talk about how faith that p can be resilient in light of counterevidence: evidence that p is false. But what about evidence that p would be a bad thing? One might question their religious commitment, say, not because they doubt God’s existence, but because they doubt that God’s existence would be a good thing, or that God is worth committing to (see Jackson 2021). Malcolm and Scott argue that if one can maintain faith through a crisis of faith, this provides another reason to think that faith may not always involve positivity.
Note that more attention has been paid to the specifics of faith’s belief-like component than faith’s desire-like component. Many authors mention the positivity of faith, motivate it with a few examples, and then move on to other topics. But many similar questions that arise regarding faith and belief could also be raised regarding faith and desire. For example: does faith that p entail a desire for p? What if someone has something weaker than a desire, such as a second-order desire (a desire to desire p)? Or some desire for p, but also some desire for not-p? Could these people have faith? Can other attitudes play the role of desire in faith, such as a belief that p is good?
If you are willing to weaken the relationship between faith and desire, you could agree with Malcolm and Scott that the idea that faith entails desire is too strong, but nonetheless accept that a version of the positivity view is correct. Similar to a weakly doxastic account of faith, you could have a weakly positive account of faith and desire: faith’s desire-like condition could include things like second-order desires, conflicting desires, pro-attitudes, or beliefs about the good. In a crisis of faith, the faithful may have second-order desires or some weaker desire-like attitude. The prospect of weakly positive accounts of faith should be further explored. And in general, more attention should be paid to the relationship between faith and desire. In the religious case, this connection is related to the axiology of theism, the question of whether we should want God to exist (see The Axiology of Theism).
d. Faith and Hope
Faith and hope are often considered alongside each other, and for good reason. Like faith, hope also has a desire-like component and a belief-like component. The desire-like component in both attitudes is similar—whether you have faith that your friend will win their game or hope that they will win their game, you want them to win the game.
However, hope’s belief-like component is arguably weaker than faith’s. Hope that a statement is true merely requires thinking that statement is possibly true; it can be extremely unlikely. Even if there is a 95% chance of rain tomorrow, you can still hope your picnic will not be rained out. Hope’s belief-like component could be one of two things: a belief that p is possible, or a non-zero credence in p. (Credence is a measure of subjective probability—the confidence you have in the truth of some proposition. Credences are measured on a scale from 0 to 1, where 0 represents certainty that a proposition is false, and 1 represents certainty that it is true.) So if you hope that p, you cannot believe p is impossible or have a credence of 0 in p (certainty that p is false). At the same time, it seems odd to hope for things of which you are certain. You do not hope that 1+1=2 or hope that you exist, even if you desire those to be true. Thus, as Martin (2013: 69) notes, hope that p may be consistent with any credence in p between, but excluding, 0 and 1.
Thus, on the standard view of hope, hope consists of two things: a desire for p to be true and a belief that p is possible (or non-zero credence). (See Milona 2019 for a recent defense of the standard view. Some argue that hope has additional components; for details of recent accounts of hope, see Rioux 2021.) Contrast this with faith. Unlike hope, faith that a statement is true is not compatible with thinking the statement is extremely unlikely or almost definitely false. If there is a 95% chance of rain tomorrow, you should not—and most would not—have faith that it will be sunny tomorrow. The chance of rain is just too high. But this does not preclude hoping that it will be sunny. Thus, you can hope that something is true when it is so unlikely that you cannot have faith.
This carves out a unique role for hope. Sometimes, after you make a commitment, you get lots of counterevidence challenging your basis for that commitment—counterevidence so strong that you must give up your faith. However, simply because you have to give up your faith does not mean you have to give up hope. You might hope your missing sibling is alive, even in light of evidence that they are dead, or hope that you will survive a concentration camp, or hope that you can endure a risky treatment for a serious illness. And resorting to hope does not always mean you should give up your commitment. Hope can, in general, underlie our commitments when we do not have enough evidence to have faith (see Jackson 2021).
While faith and hope are distinct in certain ways, Pojman (1986) argues that faith is a certain type of hope: profound hope. Pojman is not interested in casual hope—like the hope that your distant cousin will get the job he applied for—but is focused on the hope that is deep and central to our life projects. In addition to the two components of hope discussed above, profound hope also involves a disposition to act on p, an especially strong desire for p to be true, and a willingness to take great risks to bring p about. Pojman’s view draws on a connection between attitude-focused faith and action-focused faith, as Pojman’s account gives a central role to risky action. Those convinced by the idea that faith requires a bit more evidence than hope may also want to add a condition to Pojman’s view: the belief-like component of faith-as-hope must be sufficiently strong, as faith might require more than merely taking something to be possible.
e. Faith and Acceptance
Accepting that p is acting as if p. When you accept a proposition, you treat it as true in your practical reasoning, and when you make decisions, act as if p were true. According to Jonathan Cohen (1992: 4), when one accepts a proposition, one “includes that proposition… among one’s premises for deciding what to do or think in a particular context.” Often, we accept what we believe and believe what we accept. You believe coffee will wake you up, so you drink it when you are tired in the morning. You believe your car is parked north of campus, so you walk that way when you leave the office.
Sometimes, however, you act as if something is true even though you do not believe it. Say you are a judge in a court case, and the evidence is enough to legally establish that a particular suspect did it “beyond a reasonable doubt.” Suppose, though, you have other evidence that they are innocent, but it is personal, such that it cannot legally be used in a court of law. You may not be justified in believing they are guilty, but for legal reasons, you must accept that they are guilty and issue the “guilty” verdict. In other cases, you believe something, but do not act as if it is true. Suppose you are visiting a frozen lake with your young children, and they want to go play on the ice. You may rationally believe the ice is thick and safe, but refuse to let your children play, accepting that the ice will break, because of how bad it would be if they fell in.
Several authors have argued that faith and acceptance are closely connected. Alston (1996) argues that acceptance, rather than belief, is one of the primary components of faith. That is, those with faith may or may not believe the propositions of faith, but they act as if they are true. A similar view is Swinburne’s pragmatist faith. On Swinburne’s (1981) view, faith is acting on the assumption that p. Like Alston, Swinburne also maintains that faith does not require belief. Schellenberg’s (2005) view also gives acceptance a prominent place in faith. On Schellenberg’s view, faith is imaginative assent. If you have faith that p, you deliberately imagine p to be true, and, guided by this imaginative picture, you act on the truth of p. So Schellenberg’s picture of faith is imaginative assent plus acceptance. While these authors argue that acceptance is necessary for faith, most do not think it is sufficient; the faithful fulfill other conditions, including a pro-attitude towards the object of faith.
A final view is that faith involves a kind of allegiance. Allegiance is an action-oriented submission to a person or ideal. Dewey (1934) and Kvanvig (2013) defend the allegiance view of faith, on which the faithful are more characterized by their actions than their attitudes. The faithful are marked by their loyalty and committed action to the object of faith; in many cases, this could look like accepting certain propositions of faith, even if one does not believe them. Bates (2017) also proposes a model of Christian faith as allegiance, but for Bates, faith requires both a kind of intellectual assent (something belief-like) and allegiance, or enacted loyalty and obedience to God.
Whether these views that give acceptance or action a central role in faith are weakly doxastic or pragmatic depends on one’s view of acceptance: is acceptance a belief-like state or an action-like state? Since acceptance is acting as if something is true, and you can accept a proposition even if you think it is quite unlikely, these views are arguably better characterized as pragmatic. However, some acceptance views, like Bates’, that involve both acceptance and something belief-like, may be doxastic or weakly doxastic.
3. Evaluating Faith
Thus far, this article has focused on the nature of faith. Section 1 covered types of faith and features of faith. Section 2 covered the way faith compares and contrasts with other related attitudes and actions. This final section is about evaluating faith. This section discusses three modes of evaluation: epistemic, practical, and moral.
Note that, like other attitudes and actions, faith is sometimes rational and sometimes irrational, sometimes permissible and sometimes impermissible. In the same way that beliefs can be rational or irrational, faith can be rational or irrational. Not all faith should be evaluated in the same way. The rationality of faith depends on several factors, including the nature of faith and the object of faith. Drawing on some of the above accounts of the nature of faith, this article discusses various answers to the question of why and when faith could be rational, and why and when faith could be irrational.
a. Faith’s Epistemic Rationality
Our first question is whether faith can be epistemically rational, and if so, when and how. Epistemic rationality is rationality that is aimed at getting at the truth and avoiding error, and it is associated with justified belief and knowledge. An epistemically rational belief has characteristics like being based on evidence, being reliably formed, being a candidate for knowledge, and being the result of a dependable process of inquiry. Paradigm examples of beliefs that are not epistemically rational are ones based on wishful thinking, hasty generalizations, or emotional attachment.
Epistemic rationality is normally applied to attitudes, like beliefs, so faith’s epistemic rationality primarily concerns faith as a mental state. This article also focuses on propositional faith, and it divides the discussion of faith’s epistemic rationality into two parts: evidence and knowledge.
i. Faith and Evidence
Before discussing faith, it might help to discuss the relationship between evidence and epistemic rationality. It is widely thought that epistemically rational people follow the evidence. While the exact relationship between evidence and epistemic rationality is controversial, many endorse what is called evidentialism, the view that you are epistemically rational if and only if you proportion your beliefs to the evidence.
We have seen that faith is resilient: it helps us keep our commitments in the face of counterevidence. Given faith’s resilience, it is natural to think that faith goes beyond the evidence (or involves a disposition to go beyond the evidence). But does having faith then violate evidentialism? Can faith be perfectly proportioned to the evidence and yet also go beyond it? Answers to these questions fall into three main camps, taking different perspectives on faith, evidence, and evidentialism.
The first camp, mentioned previously, maintains that faith violates evidentialism because it goes beyond the evidence; but evidentialism is a requirement of rationality; thus, faith is irrational. Fideists and the New Atheists may represent such a view. However, you might think that the idea that all faith is always irrational is too strong, and that, instead, faith is more like belief: sometimes rational and sometimes irrational. Those who think faith can be rational fall into the remaining two camps.
The second camp holds that rational faith does not violate evidentialism and that there are ways to capture faith’s resilience that respect evidentialism. For example, consider Anscombe’s and Zagzebski’s view that faith is believing another’s testimony. On this view, faith is based on evidence, and rational faith is proportioned to the evidence: testimonial evidence. Of course, this assumes that testimony is evidence, but this is highly plausible: many of our geographical, scientific, and even everyday beliefs are based on testimony. Most of our scientific beliefs are not based on experiments we did ourselves—they are based on results reported by scientists. We trust their testimony. We believe geographical facts about the shape of the globe and things about other countries even though we have never traveled there ourselves—again, based on testimony. We ask people for directions on the street and believe our family and friends when they report things to us. Testimony is an extremely important source of evidence, and without it, we would be in the dark about a lot of things.
In what sense does faith go beyond the evidence, on this view? Well, sometimes, we have only testimony to go on. We may lack the time or ability to verify what someone tells us with outside sources, and we may be torn about whether to trust someone. In choosing to take someone’s word for something, we go beyond the evidence. At the very least, we go beyond certain kinds of evidence, in that we do not require outside verifying evidence. One worry for this view, however, is that it makes faith straightforwardly based on evidence, and thus it cannot sufficiently explain faith’s resilience, or the sense in which faith goes beyond the evidence.
A second view on which rational faith goes beyond the evidence without violating evidentialism draws on a view in epistemology known as epistemic permissivism: the view that sometimes, the evidence allows for multiple different rational attitudes toward a proposition. In permissive cases, where your evidence does not point you one way or another, there is an evidential tie between two attitudes. You can then choose to hold the faithful attitude, consistent with, but not required by, your evidence. This does not violate evidentialism, as the faithful attitude is permitted by, and in that sense fits, your evidence. At the same time, faith goes beyond the evidence in the sense that the faithful attitude is not strictly required by your evidence.
Consider two concrete examples. First, suppose your brother is accused of a serious crime. Suppose that there are several good, competing explanations of what happened. It might be rational for you to withhold belief, or even believe your brother is guilty, but you could instead choose the explanation of the evidence that supports your brother’s innocence. This demonstrates faith that your brother is innocent without violating the evidence, since believing that he is innocent is a rational response to the data.
Or suppose you are trying to decide whether God exists. The evidence for (a)theism is complicated and difficult to assess, and there are good arguments on both sides. Suppose, because the evidence is complicated in this way, you could be rational as a theist (who believes God exists), atheist (who believes God does not exist), or agnostic (who is undecided on whether God exists). Say you go out on a limb and decide to have faith that God exists. You are going beyond the evidence, but you are also not irrational, since your evidence rationally permits you to be a theist. Again, this is a case where rational faith respects evidentialism, but also goes beyond the evidence. (Note that, depending on how evidentialism is defined, this response may better fit under the third view, discussed next. Some strong versions of evidentialism are inconsistent with permissivism, and on some versions of the permissivist theory of faith, non-evidential factors can break evidential ties, so things besides evidence affect rational belief.) Attempts to reconcile faith’s resilience with evidentialism include, for example, Jackson (2019) and Dormandy (2021).
The third and final camp holds the view that faith, in going beyond the evidence, violates evidentialism, but this does not mean that faith is irrational. (James 1896/2011 and Bishop 2007 may well be characterized as proponents of this view, as they explicitly reject Clifford’s evidentialism). For example, you might maintain that evidentialism applies to belief, but not faith. After all, it is natural to think that faith goes beyond the evidence in a way that belief does not. To maintain evidentialism about belief, proponents of this view would need to say that rational faith is inconsistent with belief. Then, faith might be subject to different, non-evidentialist norms, but could still be rational and go beyond the evidence.
A second family of views rejects evidentialism but maintains faith’s rationality: externalist views. Externalists maintain that epistemic justification depends on factors that are external to the person—for example, your belief that there is a cup on the desk can be rational if it is formed by a reliable perceptual process, whether or not you have evidence that there is a cup. Plantinga in particular is an externalist who thinks epistemic justification (or “warrant”) is a matter of functioning properly. Plantinga (2000) argues that religious beliefs can be properly basic: rational even if not based on an argument. Plantinga’s view involves the sensus divinitatis: a sense of the divine that, when functioning properly, causes people to form beliefs about God (for example, “There is a Creator”; “God exists”; “God can help me”), especially in particular circumstances (for example, in nature, when in need of help, and so forth). These beliefs can be rational, even if not based on argument, and may be rational without any evidence at all.
That said, the view that religious belief can be properly basic does not, by itself, conflict with evidentialism. If a religious belief is based on experiential evidence, but not arguments, it can still be rational according to an evidentialist. Externalist views that deny evidentialism make a stronger claim: that religious belief can be rational without argument or evidence (see Plantinga 2000: 178).
Externalist views—at least ones that reject evidentialism—may be able to explain how rational faith goes beyond the evidence; evidence is not required for faith (or belief) to be epistemically rational. Even so, most externalist views include a no-defeater condition: if you get evidence that a belief is false (a defeater), that can affect, or even preclude, your epistemic justification. For example, you might form a warranted belief in God based on the sensus divinitatis but then begin to question why a loving, powerful God would allow the world’s serious and seemingly pointless evils; this counterevidence could remove the warrant for your belief in God. Generally, externalist views may need a story about how faith can be resilient in the face of counterevidence to fully capture the idea that faith goes beyond the evidence.
We have seen three views about the relationship between faith, evidence, and evidentialism. On the first view, evidentialism is true, and faith does not respect evidentialism, so faith is irrational. On the second, evidentialism is true, and rational faith goes beyond the evidence in a way that respects evidentialism. On the final view, evidentialism is false, so faith does not have to be based on evidence; this makes space for rational faith to go beyond the evidence. Now, we turn to a second topic concerning the epistemology of faith: faith and knowledge.
ii. Faith and Knowledge
Epistemology is the study of knowledge. Epistemologists mostly focus on propositional knowledge: knowledge that a proposition is true. For example, you might know that 1+1=2 or that it is cold today. Knowledge involves at least three components: justification, truth, and belief. If you know that it is cold today, you believe that it is cold today, it is indeed cold today, and your belief that it is cold today is epistemically justified. (While these three components are necessary for knowledge, many think they are not sufficient, due to Gettier’s (1963) famous counterexamples to the justified true belief account of knowledge.) Note that knowledge is a high epistemic ideal. When a belief amounts to knowledge, it is not merely justified, but it is also true. Many epistemologists also think that knowledge requires a high degree of justification, for example, quite good evidence.
There are three main views about the relationship between faith and knowledge. The first is that propositional faith is a kind of knowledge. Plantinga’s view lends itself to a view of faith along these lines, as Plantinga’s story about proper function is ultimately an account of knowledge. Plantinga’s view is inspired by Calvin, who defines faith as a “firm and certain” knowledge of God (Institutes III, ii, 7:551). If Plantinga is right that (undefeated) theistic beliefs, formed reliably by properly functioning faculties in the right conditions, amount to knowledge, then Plantinga’s view might be rightfully characterized as one on which faith is (closely tied to) knowledge. Relatedly, Aquinas discusses a kind of faith that resembles knowledge, but is ultimately “midway between knowledge and opinion” (Summa Theologiae 2a2ae 1:2).
On a second view, propositional faith is not a kind of knowledge, but can amount to knowledge in certain circumstances. For example, one might hold that faith may be consistent with less evidence or justification than is required for knowledge, or that faith does not require belief. Thus, one could have faith that p—even rationally—even if one does not know that p. Keep in mind that knowledge is a high epistemic bar, so meeting this bar for knowledge may not be required for faith to be rational—faith that p might be rational even if, for example, p is false, so p is not known. However, faith that p may amount to knowledge when it meets the conditions for knowledge: p is justifiedly believed, true, and not Gettiered.
On a final view, faith that p is inconsistent with knowing p. For example, Howard-Snyder (2013: 370) suggests that for faith, one’s evidence is often “sub-optimal.” Along similar lines, Alston (1996: 12) notes that “[F]aith-that has at least a strong suggestion of a weak epistemic position vis-a-vis the proposition in question.” Since knowledge sets a high epistemic bar (the proposition in question must enjoy a high degree of justification, be true, and so forth), faith may play a role when your epistemic position is too poor to know. And if you know p, faith that p is not needed. This fits well with Kant’s famous remarks: “I have… found it necessary to deny knowledge, in order to make room for faith” (Preface to the Second Edition of the Critique of Pure Reason, 1787/1933: 29). On this third view, then, if you have faith that p, you do not know p, and if you know p, faith that p is unnecessary.
As noted, many epistemologists focus on knowledge-that: knowing that a proposition is true. However, there are other kinds of knowledge: knowledge-how, or knowing how to perform some action, such as riding a bike, and knowledge-who, or knowing someone personally. There has been some interesting work on non-propositional knowledge and faith: see Sliwa (2018) for knowledge-how, and Benton (2018) for knowledge-who. Note that non-propositional knowledge might better fit with non-propositional faith, such as faith-in. This raises several interesting questions, such as: does faith in God require interpersonal knowledge of God? And how does this relate to the belief that God exists? The relationship between non-propositional knowledge and faith merits further exploration.
b. Faith’s Practical Rationality
A second question is whether faith can be practically rational, and if so, when and how. Practical rationality, unlike epistemic rationality, is associated with what is good for you: what fulfills your desires and leads to your flourishing. Examples of practically rational actions include brushing your teeth, saving for retirement, pursuing your dream job, and other things conducive to meeting your goals and improving your life (although see Ballard 2017 for an argument that faith’s practical and epistemic rationality are importantly connected).
Practical rationality is normally applied to actions. Thus, it makes the most sense to evaluate action-focused faith for practical rationality. In particular, acceptance, or acting as if a proposition is true, is often associated with action-focused faith. Thus, this article focuses on what makes accepting a proposition of faith practically rational, and whether leaps of faith can be practically rational but go beyond the evidence.
Elizabeth Jackson’s (2021) view of faith focuses on how acceptance-based faith can be practically rational in light of counterevidence. Jackson notes that, on two major theories of rational action (the belief-desire view and the decision-theory view), rational action is caused by two things: beliefs and desires. If it is rational for you to go to the fridge, this is because you want food (a desire) and you believe there is food in the fridge (a belief). But you can believe and desire things to a greater or lesser degree; you might rationally act on something because you have a strong desire for it, even though you consider it unlikely. Suppose your brother goes missing. He has been missing for a long time, and there is a lot of evidence he is dead, but you think there is some chance he might be alive. Because it would be so good if he were alive and you found him, you have action-focused faith that he is alive: you put up missing posters, spend lots of time searching for him, and so forth. The goodness of finding him again makes this rational, despite your counterevidence. Or consider another example: you might rationally accept that God exists, by practicing a religion, participating in prayer and liturgy, and joining a spiritual community, even if you have strong evidence against theism. This is because you stand to gain a great deal if you accept that God exists and God does exist, and you have little to lose if God does not exist.
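The decision-theoretic reasoning here can be sketched with a toy calculation (an illustrative model with made-up numbers, not Jackson’s own formalism): an act of faith is practically rational when its expected utility beats the alternative, and a large enough payoff can compensate for a low probability.

```python
def expected_utility(prob: float, payoff_if_true: float, payoff_if_false: float) -> float:
    """Expected utility of acting as if a proposition is true."""
    return prob * payoff_if_true + (1 - prob) * payoff_if_false

# Hypothetical numbers: you think there is only a 10% chance your
# missing brother is alive, but finding him would be enormously good,
# while searching in vain costs relatively little.
search = expected_utility(0.10, payoff_if_true=1000, payoff_if_false=-10)
give_up = 0.0  # doing nothing: no gain, no cost

print(search)            # 91.0
print(search > give_up)  # True: searching is practically rational
```

On these stipulated values, strong desire (the large payoff) outweighs weak evidence (the low probability), which is the sense in which what is lost in evidence can be made up for in desire.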
Arguably, then, it is even easier for practically rational faith to go beyond the evidence than it is for epistemically rational faith. Taking an act of faith might be practically rational even if one has little evidence for the proposition one accepts. Practically rational action depends both on your evidence and on what is at stake, and it can be rational to act as if something is true even if your evidence points the other way. In this way, practically rational faith can be resilient in light of counterevidence: what you lose in evidence can be made up for in desire.
Of course, this does not mean that faith is always practically rational. Both your beliefs/evidence and your desires/what is good for you can render faith practically irrational. For example, if you became certain your brother was dead (perhaps his body was found), then acting as if your brother is still alive would be practically irrational. Similarly, faith could be practically irrational if its object is not good for your flourishing: for example, faith that you will get back together with an abusive partner.
However, since it can be rational to accept that something is true even if you have overwhelming evidence that it is false, practically rational acts of faith go beyond (and even against) the evidence. For other related decision-theoretic accounts of how practically rational faith can go beyond the evidence, see Buchak (2012) and McKaughan (2013).
c. Faith and Morality/Virtue
The third and final way to evaluate faith is from a moral perspective. There is a family of questions regarding the ethics of faith: when, if ever, is faith morally permissible? Is faith ever morally obligatory? Is it appropriate to regard faith as a virtue? Can faith be immoral?
We normally ask what actions, rather than what mental states, are obligatory/permissible/wrong. While virtues are not themselves actions, they are (or lead to) dispositions to act. In either case, it makes sense to morally evaluate action-focused faith. (Although some argue for doxastic wronging, that is, the view that beliefs can morally wrong others. If they can, this suggests beliefs—and perhaps other mental states—can be morally evaluated. This may open up space to morally evaluate attitude-focused faith as well.)
As with the epistemic and practical cases, it would be wrong to think that all cases of faith fit into one moral category. Faith is not always moral: faith in an evil cause or evil person can be immoral. But faith is not always immoral, and may sometimes be morally good: faith in one’s close friends or family members, or faith in causes like world peace or ending world hunger, seems morally permissible, if not morally obligatory.
One of the most widely discussed topics on the ethics of faith is faith as a virtue (see Aquinas, Summa Theologiae II-II, q. 1-16). Faith is often taken to be both a virtue in general and a theological virtue (in the Christian tradition, along with hope and charity). For reasons just discussed, the idea that faith is a virtue by definition seems incorrect. Faith is not always morally good—it is possible to have faith in morally bad people or causes, and to have faith with morally bad effects. (This is why the discussion of faith as a virtue belongs in this section, rather than in the previous sections on the nature of faith.)
This raises the question: Can faith satisfy the conditions for virtue? According to Aristotle, a virtue is a positive character trait that is demonstrated consistently, across situations and across time. Virtues are acquired freely and deliberately and bring benefits to both the virtuous person and to their community. For example, if you have the virtue of honesty, you will be honest in various situations and also over time; you will have acquired honesty freely and deliberately (not by accident), and your honesty will bring benefits both to yourself and those in your community. Thus, assuming this orthodox Aristotelian definition of virtue, when faith is a virtue, it is a stable character trait, acquired freely and deliberately, that brings benefits to both the faithful person and their community.
There have been several discussions of the virtue of faith in the literature. Anne Jeffrey (2017-a) argues that there is a tension between common assumptions about faith and Aristotelian virtue ethics. Specifically, some have argued that part of faith’s function depends on a limitation or an imperfection in the faithful person (for example, keeping us steadfast and committed in light of doubts or misguided affections). However, according to the Aristotelian view, virtues are traits held by fully virtuous people who have perfect practical knowledge and always choose the virtuous action. Taken together, these two views create a challenge for the idea that faith is a virtue, as faith seems to require imperfections or limitations incompatible with virtue. While this tension could be resolved by challenging the idea that faith’s role necessarily involves a limitation, Jeffrey instead argues that we should re-conceive Aristotelian virtue ethics and embrace the idea that even people with limitations can possess and exercise virtues. In another paper, Jeffrey (2017-b) argues that we can secure the practical rationality and moral permissibility of religious faith—which seems necessary if faith is a virtue—by appealing to the idea that faith is accompanied by another virtue, hope.
There is a second reason to think that the theological virtues—faith, hope, and charity—may not perfectly fit into the Aristotelian mold. While Aristotelian virtues are freely acquired by habituation, some thinkers suggest that theological virtues are infused immediately by God, rather than acquired over time (Aquinas, Summa Theologiae II-II, q. 6). While some may conclude from this that faith, along with the other theological virtues, is not a true virtue, this may further support Jeffrey’s suggestion that Aristotle’s criteria for virtue may need to be altered or reconceived. Or perhaps there are two kinds of virtues: Aristotelian acquired virtues and theological infused virtues, each with its own characteristics.
A final topic that has been explored is the question of how virtuous faith interacts with other virtues. The relationship between faith and humility is widely discussed. Several authors have noted that prima facie, faith seems to be in tension with humility: faith involves taking various risks (both epistemic and action-focused risks), but in some cases, those risks may be a sign of overconfidence, which can be in tension with exhibiting humility (intellectual or otherwise). In response to this, both Kvanvig (2018) and Malcolm (2021) argue that faith and humility are two virtues that balance each other out. Kvanvig argues that humility is a matter of where your attention is directed (say, not at yourself), and this appropriately directed attention can guide faithful action. Malcolm argues that religious faith can be understood as a kind of trust in God—specifically, a reliance on God’s testimony, which, when virtuous, exhibits a kind of intellectual humility.
4. Conclusion
Faith is a trusting commitment to someone or something. There are at least four distinctions among kinds of faith: attitude-focused faith vs. act-focused faith, faith-that vs. faith-in, religious vs. non-religious faith, and important vs. mundane faith (Section 1.a). Trust, risk, resilience, and going beyond the evidence are all closely associated with faith (Section 1.b). Considering faith’s relationship to attitudes, states, or actions—belief, doubt, desire, hope, and acceptance—sheds further light on the nature of faith (Section 2). There are three main ways we might evaluate faith: epistemically, practically, and morally. While faith is not always epistemically rational, practically rational, or morally permissible, we have seen reason to think that faith can be positively evaluated in many cases (Section 3).
5. References and Further Reading
Ali, Zain. (2013). Faith, Philosophy, and the Reflective Muslim. London, UK: Palgrave Macmillan.
Alston, William. (1996). “Belief, Acceptance, and Religious Faith.” In J. Jordan and D. Howard-Snyder (eds.), Faith, Freedom, and Rationality pp. 3–27. Lanham, MD: Rowman and Littlefield.
Anscombe, G. E. M. (2008). “Faith.” In M. Geach and L. Gormally (eds.), Faith in a Hard Ground. Exeter: Imprint Academic, 11–19.
Audi, Robert. (2011). Rationality and Religious Commitment. New York: Oxford University Press.
Ballard, Brian. (2017). “The Rationality of Faith and the Benefits of Religion.” International Journal for the Philosophy of Religion 81: 213–227.
Bates, Matthew. (2017). Salvation by Allegiance Alone. Grand Rapids: Baker Academic.
Benton, Matthew. (2018). “God and Interpersonal Knowledge.” Res Philosophica 95(3): 421–447.
Bishop, John. (2007). Believing by Faith: An Essay in the Epistemology and Ethics of Religious Belief. Oxford: OUP.
Buchak, Lara. (2012). “Can it Be Rational to Have Faith?” In Jake Chandler & Victoria Harrison (eds.), Probability in the Philosophy of Religion, pp. 225–247. Oxford: Oxford University Press.
Buchak, Lara. (2017). “Reason and Faith.” In The Oxford Handbook of the Epistemology of Theology (edited by William J. Abraham and Frederick D. Aquino), pp. 46–63. Oxford: OUP.
Buckareff, Andrei A. (2005). “Can Faith Be a Doxastic Venture?” Religious Studies 41: 435–445.
Byerly, T. R. (2012). “Faith as an Epistemic Disposition.” European Journal for Philosophy of Religion, 4(1): 109–128.
Cohen, Jonathan. (1992). An Essay on Belief and Acceptance. New York: Clarendon Press.
Dewey, John (1934). A Common Faith. New Haven, CT: Yale University Press.
Dormandy, Katherine. (2021). “True Faith: Against Doxastic Partiality about Faith (in God and Religious Communities) and in Defense of Evidentialism.” Australasian Philosophical Review 5(1): 4–28.
Howard-Snyder, Daniel. (2013). “Propositional Faith: What it is and What it is Not.” American Philosophical Quarterly 50(4): 357–372.
Howard-Snyder, Daniel. (2018). “Can Fictionalists Have Faith? It All Depends.” Religious Studies 55: 1–22.
Jackson, Elizabeth. (2019). “Belief, Credence, and Faith.” Religious Studies 55(2): 153–168.
Jackson, Elizabeth. (2020). “The Nature and Rationality of Faith.” A New Theist Response to the New Atheists (Joshua Rasmussen and Kevin Vallier, eds.), pp. 77–92. New York: Routledge.
Jackson, Elizabeth. (2021). “Belief, Faith, and Hope: On the Rationality of Long-Term Commitment.” Mind. 130(517): 35–57.
Jeffrey, Anne. (2017-a). “How Aristotelians Can Make Faith a Virtue.” Ethical Theory and Moral Practice 20(2): 393–409.
Jeffrey, Anne. (2017-b). “Does Hope Morally Vindicate Faith?” International Journal for Philosophy of Religion 81(1-2): 193–211.
James, William. (1896/2011). “The Will to Believe.” In J. Shook (ed.) The Essential William James, pp. 157–178. New York: Prometheus Books.
Kvanvig, Jonathan. (2013). “Affective Theism and People of Faith.” Midwest Studies in Philosophy 37: 109–128.
Kvanvig, Jonathan. (2018). Faith and Humility. Oxford: OUP.
Lebens, S. (2021). “Will I Get a Job? Contextualism, Belief, and Faith.” Synthese 199(3-4): 5769–5790.
Malcolm, Finlay. (2021). “Testimony, Faith, and Humility.” Religious Studies 57(3): 466–483.
Malcolm, Finlay and Michael Scott. (2017). “Faith, Belief, and Fictionalism.” Pacific Philosophical Quarterly 98(1): 257–274.
Malcolm, Finlay and Michael Scott. (2021). “True Grit and the Positivity of Faith.” European Journal of Analytic Philosophy 17(1): 5–32.
Martin, Adrienne M. (2013). How We Hope: A Moral Psychology. Princeton: Princeton University Press.
Schellenberg, J.L. (2005). Prolegomena to a Philosophy of Religion. Ithaca: Cornell University Press.
Sliwa, Paulina. (2018). “Know-How and Acts of Faith.” In Matthew A. Benton, John Hawthorne & Dani Rabinowitz (eds.), Knowledge, Belief, and God: New Insights in Religious Epistemology, pp. 246–263. Oxford: Oxford University Press.
Speak, Daniel. (2007). “Salvation Without Belief.” Religious Studies 43(2): 229–236.
Swinburne, Richard. (1981). “The Nature of Faith.” In R. Swinburne, Faith and Reason, pp. 104–124. Oxford: Clarendon Press.
Swindal, James. (2021). “Faith: Historical Perspectives.” Internet Encyclopedia of Philosophy. https://iep.utm.edu/faith-re/
Whitaker, Robert K. (2019). “Faith and Disbelief.” International Journal for Philosophy of Religion 85: 149–172.
Zagzebski, Linda Trinkaus (2012). “Religious Authority.” In L. T. Zagzebski, Epistemic Authority: A Theory of Trust, Authority, and Autonomy in Belief. Oxford: Oxford University Press, 181–203.
Working with the infinite is tricky business. Zeno first alerted Western philosophers to this around 450 B.C.E. with his paradoxes, arguing that a fast runner such as Achilles has an infinite number of places to reach during his pursuit of a slower runner. Since then, there has been a struggle to understand how to use the notion of infinity in a coherent manner. This article concerns the significant and controversial role that the concepts of infinity and the infinite play in the disciplines of philosophy, physical science, and mathematics.
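One standard modern gloss on the runner paradox (offered here as an illustration, not a full resolution of Zeno’s argument) is that infinitely many ever-shrinking distances can sum to a finite total: the partial sums of 1/2 + 1/4 + 1/8 + … approach 1 without ever exceeding it.

```python
# Partial sums of the halving series 1/2 + 1/4 + 1/8 + ...
# Each term stands for one of the infinitely many stretches the
# runner must cover; the total nevertheless stays finite.
def partial_sum(n_terms: int) -> float:
    return sum(0.5 ** k for k in range(1, n_terms + 1))

print(partial_sum(10))  # 0.9990234375
print(partial_sum(60))  # effectively 1.0
```

However many terms are added, the running total never reaches 1, yet it comes arbitrarily close; this is the sense in which an infinite process can have a finite limit.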
Philosophers want to know whether there is more than one coherent concept of infinity; which entities and properties are infinitely large, infinitely small, infinitely divisible, and infinitely numerous; and what arguments can justify answers one way or the other.
Here are some examples of these four different ways to claim to be infinite. The density of matter at the center of a black hole is infinitely large. An electron is infinitely small. An hour is infinitely divisible. The integers are infinitely numerous. These four claims are ordered from most to least controversial, although all four have been challenged in the philosophical literature.
This article also explores a variety of other questions about the infinite. Is the infinite something indefinite and incomplete, or is it complete and definite? What did Thomas Aquinas mean when he said God is infinitely powerful? Was Gauss, who was one of the greatest mathematicians of all time, correct when he made the controversial remark that scientific theories involve infinities merely as idealizations and merely in order to make for easy applications of those theories, when in fact all physically real entities are finite? How did the invention of set theory change the meaning of the term “infinite”? What did Cantor mean when he said some infinities are smaller than others? Quine said the first three sizes of Cantor’s infinities are the only ones we have reason to believe in. Mathematical Platonists disagree with Quine. Who is correct? We shall see that there are deep connections among all these questions.
The term “the infinite” refers to whatever it is that the word “infinity” correctly applies to. For example, the infinite integers exist just in case there is an infinity of integers. We also speak of infinite quantities, but what does it mean to say a quantity is infinite? In 1851, Bernard Bolzano argued in The Paradoxes of the Infinite that, if a quantity is to be infinite, then the measure of that quantity also must be infinite. Bolzano’s point is that we need a clear concept of infinite number in order to have a clear concept of infinite quantity. This idea of Bolzano’s has led to a new way of speaking about infinity, as we shall see.
The term “infinite” can be used for many purposes. The logician Alfred Tarski used it for dramatic purposes when he spoke about trying to contact his wife in Nazi-occupied Poland in the early 1940s. He complained, “We have been sending each other an infinite number of letters. They all disappear somewhere on the way. As far as I know, my wife has received only one letter” (Feferman 2004, p. 137). Although the meaning of a term is intimately tied to its use, we can tell only a very little about the meaning of the term from Tarski’s use of it to exaggerate for dramatic effect.
Looking back over the last 2,500 years of use of the term “infinite,” three distinct senses stand out: actually infinite, potentially infinite, and transcendentally infinite. These will be discussed in more detail below, but briefly, the concept of potential infinity treats infinity as an unbounded or non-terminating process developing over time. By contrast, the concept of actual infinity treats the infinite as timeless and complete. Transcendental infinity is the least precise of the three concepts and is more commonly used in discussions of metaphysics and theology to suggest transcendence of human understanding or human capability.
To give some examples, the set of integers is actually infinite, and so is the number of locations (points of space) between London and Moscow. The maximum length of grammatical sentences in English is potentially infinite, and so is the total amount of memory in a Turing machine, an ideal computer. An omnipotent being’s power is transcendentally infinite.
For purposes of doing mathematics and science, the actual infinite has turned out to be the most useful of the three concepts. Using the idea proposed by Bolzano that was mentioned above, the concept of the actual infinite was precisely defined in 1888 when Richard Dedekind redefined the term “infinity” for use in set theory and Georg Cantor made the infinite, in the sense of infinite set, an object of mathematical study. Before this turning point, the philosophical community generally believed Aristotle’s concept of potential infinity should be the concept used in mathematics and science.
a. Actual, Potential, and Transcendental Infinity
The Ancient Greeks conceived of the infinite as formless, characterless, indefinite, indeterminate, chaotic, and unintelligible. The term had negative connotations and was especially vague, having no clear criteria for distinguishing the finite from the infinite. In his treatment of Zeno’s paradoxes about infinite divisibility, Aristotle (384-322 B.C.E.) made a positive step toward clarification by distinguishing two different concepts of infinity: potential infinity and actual infinity. The latter is also called complete infinity and completed infinity. The actual infinite is not a process in time; it is an infinity that exists wholly at one time. By contrast, Aristotle spoke of the potentially infinite as a never-ending process over time, one that is finite at any specific time.
The word “potential” is being used in a technical sense. A potential swimmer can learn to become an actual swimmer, but a potential infinity cannot become an actual infinity. Aristotle argued that all the problems arising from reasoning with infinity are really problems of improperly applying the incoherent concept of actual infinity instead of properly applying the coherent concept of potential infinity. (See Aristotle’s Physics, Book III, for his account of infinity.)
For its day, this was a successful way to treat some of Zeno’s paradoxes because, if Zeno had confined himself to using only potential infinity, he would not have been able to develop his paradoxical argument. Here is why. Zeno said that to go from the start to the finish, the runner must reach the place that is halfway-there, then after arriving at this place he still must reach the place that is half of that remaining distance, and after arriving there he again must reach the new place that is now halfway to the goal, and so on. These are too many places to reach because there is no end to these places since for any one there is another. Zeno made the mistake, according to Aristotle, of supposing that this infinite process needs completing when it really does not; the finitely long path from start to finish exists undivided for the runner, and it is Zeno the mathematician who is demanding the completion of such a process. Without that concept of a completed infinite process there is no paradox.
Although today’s standard treatment of the Achilles paradox disagrees with Aristotle and says Zeno was correct to use the concept of a completed infinity and to imply the runner must go to an actual infinity of places in a finite time, Aristotle had so many other intellectual successes that his ideas about infinity dominated the Western world for the next two thousand years.
Even though Aristotle promoted the belief that “the idea of the actual infinite−of that whose infinitude presents itself all at once−was close to a contradiction in terms…,” (Moore 2001, 40) during those two thousand years, some distinguished persons did not treat it as a contradiction in terms. Archimedes, Duns Scotus, William of Ockham, Gregory of Rimini, and Leibniz made use of it. Archimedes used it, but had doubts about its legitimacy. Leibniz used it but had doubts about whether it was needed.
Here is an example of how Gregory of Rimini argued in the fourteenth century for the coherence of the concept of actual infinity:
If God can endlessly add a cubic foot to a stone—which He can—then He can create an infinitely big stone. For He need only add one cubic foot at some time, another [cubic foot] half an hour later, another a quarter of an hour later than that, and so on ad infinitum. He would then have before Him an infinite stone at the end of the hour. (Moore 2001, 53)
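The timing in Gregory's argument relies on a convergent geometric series, and it can be checked numerically. Below is a minimal sketch (the function name elapsed_time is ours, not Gregory's) that sums the waiting times:

```python
# Gregory of Rimini's supertask: one cubic foot is added at the start,
# the next a half hour later, the next a quarter hour after that, and
# so on. The waiting times 1/2 + 1/4 + 1/8 + ... form a geometric
# series converging to 1, so all infinitely many additions occur
# within a single hour.

def elapsed_time(n):
    """Hours elapsed when the n-th additional cubic foot is added."""
    return sum(1 / 2**k for k in range(1, n + 1))

for n in (1, 2, 10, 50):
    print(n, elapsed_time(n))
# The partial sums approach 1 but never reach it: every one of the
# infinitely many additions happens strictly before the hour is up.
```

The sketch only illustrates the convergence; whether a completed supertask is metaphysically possible is, of course, exactly what is in dispute.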
Leibniz envisioned the world as being an actual infinity of mind-like monads, and in (Leibniz 1702) he freely used the concept of being infinitesimally small in his development of the calculus in mathematics.
The term “infinity” that is used in contemporary mathematics and science is based on a technical development of this earlier, informal concept of actual infinity. This technical concept was not created until late in the 19th century.
b. The Rise of the Technical Terms
In the centuries after the decline of ancient Greece, the word “infinite” slowly changed its meaning in Medieval Europe. Theologians promoted the idea that God is infinite because He is limitless, and this at least caused the word “infinity” to lose its negative connotations. By the end of the Medieval Period, the word had come to mean endless, unlimited, and immeasurable, but not necessarily chaotic. The question of its intelligibility and conceivability by humans was disputed.
The term “actual infinity” is now very different. There are actual infinities in the technical, post-1880s sense, which are neither endless, unlimited, nor immeasurable. A line segment one meter long is a good example. It is not endless because it is finitely long, and it is not a process because it is timeless. It is not unlimited because it is limited by both zero and one, its bounds. It is not immeasurable because its length measure is one meter. Nevertheless, the one-meter line is infinite in the technical sense because it has an actual infinity of sub-segments, and it has an actual infinity of distinct points. So, there definitely has been a conceptual revolution.
This can be very shocking to those people who are first introduced to the technical term “actual infinity.” It seems not to be the kind of infinity they are thinking about. The crux of the problem is that these people really are using a different concept of infinity. The sense of infinity in ordinary discourse these days is either the Aristotelian one of potential infinity or the medieval one that requires infinity to be endless, immeasurable, and perhaps to have connotations of perfection or inconceivability. This article uses the name transcendental infinity for the medieval concept although there is no generally accepted name for the concept. A transcendental infinity transcends human limits and detailed knowledge; it might be incapable of being described by a precise theory. It might also be a cluster of concepts rather than a single one.
Those people who are surprised when first introduced to the technical term “actual infinity” are probably thinking of either potential infinity or transcendental infinity, and that is why, in any discussion of infinity, some philosophers will say that an appeal to the technical term “actual infinity” is changing the subject. Another reason why there is opposition to actual infinities is that they have so many counter-intuitive properties. For example, consider a continuous line that has an actual infinity of points. A single point on this line has no next point, which is counter-intuitive. Also counter-intuitive is the fact that mathematicians have shown how a one-dimensional continuous curve can fill a two-dimensional area. Equally counterintuitive is the fact that some actually infinite numbers are smaller than other actually infinite numbers. Looked at more optimistically, though, most other philosophers will say the rise of this technical term is yet another example of how the discovery of a new concept has propelled civilization forward.
Resistance to the claim that there are actual infinities has had two other sources besides counter-intuitiveness. One is the belief that actual infinities cannot be experienced. The other is the belief that use of the concept of actual infinity leads to paradoxes, such as Zeno’s. The standard solution to Zeno’s paradoxes makes use of calculus. The birth of the new technical definition of actual infinity is intimately tied to the development of calculus and thus to properly defining the mathematician’s real line, the linear continuum. The set of real numbers in their standard order was given the name “the continuum” or “the linear continuum” because it was believed that the real numbers fill up the entire number line continuously without leaving gaps. The integers have gaps, and so do the fractions.
Briefly, the argument for actual infinities is that science needs calculus; calculus needs the continuum; the continuum needs a very careful definition; and the best definition requires there to be actual infinities (not merely potential infinities) in the continuum.
Defining the continuum involves defining real numbers because the linear continuum is the intended model of the theory of real numbers just as the plane is the intended model of the theory of ordinary two-dimensional geometry. It was eventually realized by mathematicians that giving a careful definition to the continuum and to real numbers requires formulating their definitions within set theory. As part of that formulation, mathematicians found a good way to define a rational number in the language of set theory; then they defined a real number to be a certain pair of actually infinite sets of rational numbers. The continuum’s eventual definition required it to be an actually infinite collection whose elements are themselves infinite sets. The details are too complex to be presented here, but the curious reader can check any textbook in classical real analysis. The intuitive picture is that any interval or segment of the continuum is a continuum, and any continuum is a very special infinite set of points that are packed so closely together that there are no gaps. A continuum is perfectly smooth. This smoothness is reflected in there being a very great many real numbers between any two real numbers (technically a nondenumerable infinity between them).
Calculus is the area of mathematics that is more applicable to science than any other area. It can be thought of as a technique for treating a continuous change as an actually infinite number of infinitesimal changes. When calculus is applied to physical properties capable of change, such as spatial location, ocean salinity, or an electrical circuit’s voltage, these properties are represented with continuous variables that have real numbers for their values. These values are specific real numbers, not ranges of real numbers and not just rational numbers. Achilles’ location along the path to his goal is such a property.
It took many centuries to rigorously develop the calculus. A very significant step in this direction occurred in 1888 when Richard Dedekind re-defined the term “infinity” and when Georg Cantor used that definition to create the first set theory, a theory that eventually was developed to the point where it could be used for embedding all classical mathematical theories. See the example in the Zeno’s Paradoxes article of how Dedekind used set theory and his new idea of “cuts” to define the real numbers in terms of infinite sets of rational numbers. In this way, additional rigor was given to the concepts of mathematics, and it encouraged more mathematicians to accept the notion of actually infinite sets. What this embedding requires is first defining the terms of any mathematical theory in the language of set theory, then translating the axioms and theorems of the mathematical theory into sentences of set theory, and then showing that these theorems follow logically from the axioms. (The axioms of any theory, such as set theory, are the special sentences of the theory that can always be assumed during the process of deducing the other theorems of the theory.)
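The idea of identifying a single real number with an infinite set of rationals can be made concrete. The following is an illustrative sketch only (the function name in_lower_cut_sqrt2 is our invention); it models the lower half of the Dedekind cut for the square root of 2 as a membership test, since a program can never hold the actually infinite set itself:

```python
from fractions import Fraction

# A Dedekind cut represents a real number by an actually infinite set
# of rationals. For the square root of 2, the lower set contains every
# rational q with q < 0 or q*q < 2. A program can only test membership
# or sample finitely many elements; the set itself is infinite.

def in_lower_cut_sqrt2(q):
    """Is the rational q in the lower set of the cut for sqrt(2)?"""
    return q < 0 or q * q < 2

# The lower set has no largest member. Babylonian (Newton) iterates
# approach sqrt(2) from above after the first step, so they land in
# the upper set, while rationals slightly below them land in the lower.
q = Fraction(1)
for _ in range(5):
    q = (q + 2 / q) / 2

print(in_lower_cut_sqrt2(q))                        # False
print(in_lower_cut_sqrt2(q - Fraction(1, 10**6)))   # True
```

The point of the exercise is that no rational ever sits exactly at the boundary: the cut itself, an infinite set, plays the role of the irrational number.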
The new technical treatment of infinity that originated with Dedekind in 1888 and was adopted by Cantor in his new set theory provided a definition of “infinite set” rather than simply “infinite.” Dedekind says an infinite set is a set that is not finite. The notion of a finite set can be defined in various ways. We might define it numerically as a set having n members, where n is some non-negative integer. Surprisingly, Dedekind found an essentially equivalent definition of finite set (assuming the axiom of choice, which will be discussed later) that does not require mentioning numbers:
A (Dedekind) finite set is a set for which there exists no one-to-one correspondence between it and one of its proper subsets.
By placing the finger-tips of your left hand on the corresponding fingertips of your right hand, you establish a one-to-one correspondence between the set of fingers of each hand; and in that way, you establish that there is the same number of fingers on each of your hands, without your needing to count the fingers. More generally, there is a one-to-one correspondence between two sets when each member of one set can be paired off with a unique member of the other set, so that neither set has an unpaired member.
Here is a one-to-one correspondence between the natural numbers and its proper subset of even numbers, demonstrating that the natural numbers are infinite:
1
2
3
4
…
↕
↕
↕
↕
2
4
6
8
…
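The correspondence above is the rule n ↔ 2n. A small, illustrative sketch (the names pair and unpair are ours) checks its defining properties on a finite sample; a computer can only sample the pairing, but the rule itself is defined for every natural number at once:

```python
# The pairing n <-> 2n matches each natural number with a unique even
# number, and vice versa, leaving no member of either set unpaired.

def pair(n):
    """Send a natural number to its partner among the evens."""
    return 2 * n

def unpair(m):
    """Send an even number back to its partner among the naturals."""
    return m // 2

sample = range(1, 1000)
assert all(unpair(pair(n)) == n for n in sample)      # inverses
assert all(pair(n) % 2 == 0 for n in sample)          # lands in the evens
assert len({pair(n) for n in sample}) == len(sample)  # no collisions
print("pairing checked on", len(sample), "naturals")
```

Since the evens are a proper subset of the naturals, the existence of such a correspondence is exactly what Dedekind's definition requires of an infinite set.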
Informally expressed, any infinite set can be matched up to a part of itself; so the whole is equivalent to a part. This is a surprising definition because, before this definition was adopted, the idea that actually infinite wholes are equinumerous with some of their parts was taken as clear evidence that the concept of actual infinity is inherently paradoxical. For a systematic presentation of the many alternative ways to successfully define “infinite set” non-numerically, see (Tarski 1924).
Dedekind’s new definition of “infinite” defines an actually infinite set, not a potentially infinite set, because it appeals to no continuing operation over time. The concept of a potentially infinite set is then given a new technical definition by saying a potentially infinite set is a growing, finite subset of an actually infinite set. Cantor expressed the point this way:
In order for there to be a variable quantity in some mathematical study, the “domain” of its variability must strictly speaking be known beforehand through a definition. However, this domain cannot itself be something variable…. Thus this “domain” is a definite, actually infinite set of values. Thus each potential infinite…presupposes an actual infinite (Cantor 1887).
The new idea is that the potentially infinite set presupposes an actually infinite one. If this is correct, then Aristotle’s two notions of the potential infinite and actual infinite have been redefined and clarified.
Two sets are the same if any member of one is a member of the other, and vice versa. Order of the members is irrelevant to the identity of the set, and to the size of the set. Two sets are the same size if there exists a one-to-one correspondence between them. This definition of same size was recommended by both Cantor and Frege. Cantor defined “finite” by saying a set is finite if there is a one-to-one correspondence with the set {1, 2, 3, …, n} for some positive integer n; and he said a set is infinite if it is not finite.
Cardinal numbers are measures of the sizes of sets. There are many definitions of what a cardinal number is, but what is essential for cardinal numbers is that two sets have the same cardinal just in case there is a one-to-one correspondence between them. Set A has a smaller cardinal number than a set B (and so set A has fewer members than B) provided there is a one-to-one correspondence between A and a subset of B, but B is not the same size as A. In this sense, the set of even integers does not have fewer members than the set of all integers, although intuitively you might think it does.
How big is infinity? This question does not make sense for either potential infinity or transcendental infinity, but it does for actual infinity. Finite cardinal numbers such as 0, 1, 2, and 3 are measures of the sizes of finite sets, and transfinite cardinal numbers are measures of the sizes of actually infinite sets. The transfinite cardinals are aleph-null, aleph-one, aleph-two, and so on; we represent them with the numerals ℵ0, ℵ1, ℵ2, …. The smallest infinite size is ℵ0 which is the size of the set of natural numbers, and it is said to be countably infinite (or denumerably infinite or enumerably infinite). The other alephs are measures of the uncountable infinities. However, calling a set of size ℵ0 “countably infinite” is somewhat misleading since no process of counting is involved. Nobody would have the time to count from 0 to ℵ0.
The set of even integers, the set of natural numbers and the set of rational numbers all can be shown to have the same size, but surprisingly they all are smaller than the set of real numbers. The set of points in the continuum and in any interval of the continuum turns out to be larger than ℵ0, although how much larger is still an open problem, called “the continuum problem.” A popular but controversial suggestion is that a continuum is of size ℵ1.
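The claim that the reals outrun every countable list rests on Cantor's diagonal construction. Here is a hedged sketch of the idea (the toy listing below is our own invention, standing in for any proposed enumeration): given any listing of infinite 0/1 sequences, one can build a sequence that appears nowhere in it.

```python
# Cantor's diagonal argument: given ANY proposed enumeration of
# infinite 0/1 sequences (stand-ins for real numbers), build a new
# sequence that differs from the n-th listed sequence at position n.
# Each listed "sequence" is modeled as a function from position to
# digit, and we inspect only finite prefixes.

def diagonal(listing, length):
    """Prefix of the diagonal sequence that escapes `listing`."""
    return [1 - listing(n)(n) for n in range(length)]

# A hypothetical listing: the n-th sequence has a 1 at position n only.
listing = lambda n: (lambda k: 1 if k == n else 0)

d = diagonal(listing, 8)
print(d)  # [0, 0, 0, 0, 0, 0, 0, 0]
for n in range(8):
    assert d[n] != listing(n)(n)   # differs from sequence n at place n
```

Because the construction works for any listing whatsoever, no enumeration by the natural numbers can exhaust the sequences, which is why the continuum is strictly larger than ℵ0.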
When creating set theory, mathematicians did not begin with the belief that there would be so many points between any two points in the continuum nor with the belief that for any infinite cardinal there is a larger cardinal. These were surprising consequences discovered by Cantor. To many philosophers, this surprise is evidence that what is going on is not invention but rather is discovery about mind-independent reality.
The intellectual community has always been wary of actually infinite sets. Before the discovery of how to embed calculus within set theory (a process that is also called giving calculus a basis in set theory), it could have been more easily argued that science does not need actual infinities. The burden of proof has now shifted, and the default position is that actual infinities are indispensable in mathematics and science, and anyone who wants to do without them must show that removing them does not do too much damage and has additional benefits. There are no known successful attempts to reconstruct the theories of mathematical physics without basing them on mathematical objects such as numbers and sets, but for one attempt to do so using second-order logic, see (Field 1980).
Here is why some mathematicians believe the set-theoretic basis is so important:
Just as chemistry was unified and simplified when it was realized that every chemical compound is made of atoms, mathematics was dramatically unified when it was realized that every object of mathematics can be taken to be the same kind of thing [namely, a set]. There are now other ways than set theory to unify mathematics, but before set theory there was no such unifying concept. Indeed, in the Renaissance, mathematicians hesitated to add x2 to x3, since the one was an area and the other a volume. Since the advent of set theory, one can correctly say that all mathematicians are exploring the same mental universe. (Rucker 1982, p. 64)
But the significance of this basis can be exaggerated. The existence of the basis does not imply that mathematics is set theory.
Paradoxes soon were revealed within set theory—by Cantor himself and then others—so the quest for a more rigorous definition of the mathematical continuum continued. Cantor’s own paradox surfaced in 1895 when he asked whether the set of all cardinal numbers has a cardinal number. Cantor showed that, if it does, then it doesn’t. Surely the set of all sets would have the greatest cardinal number, but Cantor showed that for any cardinal number there is a greater cardinal number. [For more details about this and the other paradoxes, see (Suppes 1960).] The most famous paradox of set theory is Russell’s Paradox of 1901. He showed that the set of all sets that are not members of themselves is both a member of itself and not a member of itself. Russell wrote that the paradox “put an end to the logical honeymoon that I had been enjoying.”
These and other paradoxes were eventually resolved satisfactorily by finding revised axioms of set theory that permit the existence of enough well-behaved sets so that set theory is not crippled [that is, made incapable of providing a basis for mathematical theories], yet do not permit the existence of too many sets, namely the ill-behaved sets such as Cantor’s set of all cardinals and Russell’s set of all sets that are not members of themselves. By the mid-20th century it had become clear that, despite the existence of competing set theories, Zermelo-Fraenkel set theory (ZF) was the best, or least radical, way to revise set theory so as to avoid all the known paradoxes and problems while preserving enough of our intuitive ideas about sets that it deserved to be called a set theory. By that time, most mathematicians would have agreed that the continuum had been given a proper basis in ZF. See (Kleene 1967, pp. 189-191) for comments on this agreement about ZF’s success, for a list of the ZF axioms, and for a detailed explanation of why each axiom deserves to be an axiom.
Because of this success, and because it was clear enough that the concept of infinity used in ZF does not lead to contradictions, and because it seemed so evident how to use the concept in other areas of mathematics and science where the term “infinity” was being used, the definition of the concept of “infinite set” within ZF was claimed by many philosophers to be the paradigm example of how to provide a precise and fruitful definition of a philosophically significant concept. Much less attention was then paid to critics who had complained that we can never use the word “infinity” coherently because infinity is ineffable or inherently paradoxical.
Nevertheless, there was, and still is, serious philosophical opposition to actually infinite sets and to ZF’s treatment of the continuum, and this has spawned the programs of constructivism, intuitionism, finitism, and ultrafinitism, all of whose advocates have philosophical objections to actual infinities. Even though there is much to be said in favor of replacing a murky concept with a clearer, technical concept, there is always the worry that the replacement is a change of subject that has not really solved the problems it was designed for. More discussion of the role of infinity in mathematics and science continues in later sections of this article.
2. Infinity and the Mind
Can humans grasp the concept of the infinite? This seems to be a profound question. Ever since Zeno, intellectuals have realized that careless reasoning about infinity can lead to paradox and perhaps “defeat” the human mind.
Some critics of infinity argue not just that paradox can occur but that paradox is essential to, or inherent in, the use of the concept of infinity, so the infinite is beyond the grasp of the human mind. However, this criticism applies more properly to some forms of transcendental infinity rather than to either actual infinity or potential infinity. This is a consequence of the development of set theory as we shall see in a later section.
A second reason to believe humans cannot grasp infinity is that the concept must contain an infinite number of sub-concepts, which is too many for our finite minds. A counter to this reason is the psychological claim that a person who succeeds in thinking about infinity need not have an actually infinite number of ideas in mind at one time.
A third reason to believe the concept of infinity is beyond human understanding is that to have the concept one must have some accurate mental picture of infinity. Thomas Hobbes, who believed that all thinking is based on imagination, might remark that nobody could picture an infinite number of grains of sand at once. However, most contemporary philosophers of psychology believe mental pictures are not essential to have a concept. Regarding the concept of dog, you might have a picture of a brown dog in your mind, and I might have a picture of a black dog in mine, but I can still understand you perfectly well when you say dogs frequently chase cats.
The main issue here is whether we can coherently think about infinity to the extent of being said to have the concept. Here is a simple argument that we can: If we understand negation and have the concept of finite, then the concept of infinite is merely the concept of not-finite. A second argument says the apparent consistency of set theory indicates that infinity in the technical sense of actual infinity is well within our grasp. And since potential infinity is definable in terms of actual infinity, it, too, is within our grasp.
Assuming that infinity is within our grasp, what is it that we are grasping? Philosophers disagree on the answer. In 1883, the father of set theory, Georg Cantor, created a formal theory of infinite sets as a way of clarifying the infinite. This was a significant advance, but the notion of set can be puzzling. If you understand that a pencil is on my desk, must you implicitly understand that a set containing a pencil is on my desk? Plus a set containing that set? And another set containing the set containing the set with the pencil, and so forth to infinity?
In regard to mentally grasping an infinite set or any other set, Cantor said:
A set is a Many which allows itself to be thought of as a One.
Notice the dependence of a set upon thought. Cantor eventually clarified what he meant and was clear that he did not want set existence to depend on mental capability. What he really believed is that a set is a collection of well-defined and distinct objects that exist independently of being thought of, but that might be thought of by a powerful enough mind.
3. Infinity in Metaphysics
There is a concept which corrupts and upsets all others. I refer not to Evil, whose limited realm is that of ethics; I refer to the infinite. —Jorge Luis Borges.
Shakespeare declared, “The will is infinite.” Is he correct or just exaggerating? Critics, taking Shakespeare literally, might argue that the will is basically a product of different brain states. Because a person’s brain contains approximately 10²⁷ atoms, these have only a finite number of configurations or states, and so, regardless of whether we interpret Shakespeare’s remark as implying that the will is unbounded (is potentially infinite) or that the will produces an infinite number of brain states (is actually infinite), the will is not infinite. But perhaps Shakespeare was speaking metaphorically and did not intend to be taken literally, or perhaps he meant to use some version of transcendental infinity that makes infinity somehow beyond human comprehension.
Contemporary Continental philosophers often speak that way. Emmanuel Levinas says the infinite is another name for the Other, for the existence of other conscious beings besides ourselves whom we are ethically responsible for. We “face the infinite” in the sense of facing a practically incomprehensible and unlimited number of possibilities upon encountering another conscious being. (See Levinas 1961.) If we ask what sense of “infinite” is being used by Levinas, it may be yet another concept of infinity, or it may be some kind of transcendental infinity. Another interpretation is that he is exaggerating about the number of possibilities and should say instead that there are too many possibilities to be faced when we encounter another conscious being and that the possibilities are not readily predictable because other conscious beings make free choices, the causes of which often are not known even to the person making the choice.
Leibniz was one of the few persons in earlier centuries who believed in actually infinite sets, but he did not believe in infinite numbers. Cantor did. Referring to his own discovery of the transfinite cardinals ℵ0, ℵ1, ℵ2, … and their properties, Cantor claimed his work was revealing God’s existence and that these mathematical objects were in the mind of God. He claimed God gave humans the concept of the infinite so that they could reflect on His perfection. Influential German neo-Thomists such as Constantin Gutberlet agreed with Cantor. Some Jesuit mathematics instructors have claimed that, by taking calculus and set theory courses and coming to understand infinity, students draw closer to God. Their critics complain that these mystical ideas about infinity and God are too speculative.
When metaphysicians speak of infinity they use all three concepts: potential infinity, actual infinity, and transcendental infinity. But when they speak about God being infinite, they are usually interested in implying that God is beyond human understanding or that there is a lack of a limit on particular properties of God, such as God’s goodness and knowledge and power.
The connection between infinity and God exists in nearly all of the world’s religions. It is prominent in Hindu, Muslim, Jewish, and Christian literature. For example, in chapter 11 of the Bhagavad Gita of Hindu scripture, Krishna says, “O Lord of the universe, I see You everywhere with infinite form….”
Plato did not envision God (the Demi-urge) as infinite because he viewed God as perfect, and he believed anything perfect must be limited and thus not infinite because the infinite was defined as an unlimited, unbounded, indefinite, unintelligible chaos.
But the meaning of the term “infinite” slowly began to change. Over six hundred years later, the Neo-Platonist philosopher Plotinus was one of the first important Greek philosophers to equate God with the infinite, although he did not do so explicitly. He said instead that any idea abstracted from our finite experience is not applicable to God. He probably believed that if God were finite in some aspect, then there could be something beyond God and therefore God would not be “the One.” Plotinus was influential in helping remove the negative connotations that had accompanied the concept of the infinite. One difficulty here, though, is that it is unclear whether metaphysicians have discovered that God is identical with the transcendentally infinite or whether they are simply defining “God” to be that way. A more severe criticism is that perhaps they are just defining “infinite” (in the transcendental sense) as whatever God is.
Augustine, who merged Platonic philosophy with the Christian religion, spoke of God “whose understanding is infinite” for “what are we mean wretches that dare presume to limit His knowledge?” Augustine wrote that the reason God can understand the infinite is that “…every infinity is, in a way we cannot express, made finite to God….” [City of God, Book XII, ch. 18] This is an interesting perspective. Medieval philosophers debated whether God could understand infinite concepts other than Himself, not because God had limited understanding, but because there was no such thing as infinity anywhere except in God.
The medieval philosopher Thomas Aquinas, too, said God has infinite knowledge. He definitely did not mean potentially infinite knowledge. The technical definition of actual infinity might be useful here. If God is infinitely knowledgeable, this can be understood perhaps as meaning that God knows the truth values of all declarative sentences and that the set of these sentences is actually infinite.
Aquinas argued in his Summa Theologiae that, although God created everything, nothing created by God can be actually infinite. His main reason was that anything created can be counted, yet if an infinity were created, then the count would be infinite; but no infinite numbers exist to do the counting (as Aristotle had also said). This argument was more persuasive in his day than in ours, because in the late 19th century Cantor created (or discovered) the infinite numbers.
René Descartes believed God was actually infinite, and he remarked that the concept of actual infinity is so awesome that no human could have created it or deduced it from other concepts, so any idea of infinity that humans have must have come from God directly. Thus God exists. Descartes is using the concept of infinity to produce a new ontological argument for God’s existence.
David Hume, and many other philosophers, raised the problem that if God has infinite power then there need not be evil in the world, and if God has infinite goodness, then there should not be any evil in the world. This problem is often referred to as “The Problem of Evil” and has been a long-standing point of contention for theologians.
Spinoza and Hegel envisioned God, or the Absolute, pantheistically. If they are correct, then to call God infinite is to call the world itself infinite. Hegel denigrated Aristotle’s advocacy of potential infinity and claimed the world is actually infinite. Traditional Christian, Muslim, and Jewish metaphysicians do not accept the pantheistic notion that God is at one with the world. Instead, they say God transcends the world. Since God is outside space and time, the space and time that he created may or may not be infinite, depending on God’s choice, but surely everything else he created is finite, they say.
The multiverse theories of cosmology in the early 21st century allow there to be an uncountable infinity of universes within a background space whose volume is actually infinite. The universe created by our Big Bang is just one of these many universes. Christian theologians often balk at the notion of God choosing to create this multiverse because of the theory’s implication that, although there are many universes radically different from ours, there is also an actually infinite number of universes just like ours. This implies there is an infinite number of indistinguishable copies of Jesus, each of whom has been crucified on the cross. This removal of the uniqueness of Jesus is apparently a removal of his dignity. Augustine had this worry about uniqueness when considering infinite universes, and he responded that “Christ died once for sinners….”
There are many other entities and properties that some metaphysician or other has claimed are infinite: places, possibilities, propositions, properties, particulars, partial orderings, pi’s decimal expansion, predicates, proofs, Plato’s forms, principles, power sets, probabilities, positions, and possible worlds. That is just for the letter p. Some of these are considered to be abstract objects, objects outside of space and time, and others are considered to be concrete objects, objects within, or part of, space and time.
For helpful surveys of the history of infinity in theology and metaphysics, see (Owen 1967) and (Moore 2001).
4. Infinity in Physical Science
From a metaphysical perspective, the theories of mathematical physics seem to be ontologically committed to objects and their properties. If any of those objects or properties are infinite, then physics is committed to there being infinity within the physical world.
Here are four suggested examples where infinity occurs within physical science. (1) Standard cosmology based on Einstein’s general theory of relativity implies the density of the mass at the center of a spherical black hole is infinitely large (even though the black hole’s total mass is finite). (2) The Standard Model of particle physics implies the size of an electron is infinitely small. (3) General relativity implies that every path in space is infinitely divisible. (4) Classical quantum theory implies the values of the kinetic energy of an accelerating, free electron are infinitely numerous. These four kinds of infinities—infinitely large, infinitely small, infinitely divisible, and infinitely numerous—are implied by theory and argumentation, and are not something that could be measured directly.
Objecting to taking scientific theories at face value, the 18th-century British empiricists George Berkeley and David Hume denied the physical reality of even potential infinities on the empiricist grounds that such infinities are not detectable by our sense organs. Most philosophers of the 21st century would say that Berkeley’s and Hume’s empirical standards are too rigid because they are based on the mistaken assumption that our knowledge of reality must be a complex built up from simple impressions gained from our sense organs.
But in the spirit of Berkeley’s and Hume’s empiricism, instrumentalists also challenge any claim that science tells us the truth about physical infinities. The instrumentalists say that all theories of science are merely effective “instruments” designed for explanatory and predictive success. A scientific theory’s claims are neither true nor false. By analogy, a shovel is an effective instrument for digging, but a shovel is neither true nor false. The instrumentalist would say our theories of mathematical physics imply only that reality looks “as if” there are physical infinities. Some realists on this issue respond that to declare it to be merely a useful mathematical fiction that there are physical infinities is just as misleading as to say it is mere fiction that moving planets actually have inertia or petunias actually contain electrons. We have no other tool than theory-building for accessing the existing features of reality that are not directly perceptible. If our best theories—those that have been well tested and are empirically successful and make novel predictions—use theoretical terms that refer to infinities, then infinities must be accepted. See (Leplin 2000) for more details about anti-realist arguments, such as those of instrumentalism and constructive empiricism.
Some other philosophers are more agnostic and say we just do not know whether the universe can or does contain physical infinities.
a. Infinitely Small and Infinitely Divisible
Consider the size of electrons and quarks, the two main components of atoms. All scientific experiments so far have been consistent with electrons and quarks having no internal structure (components), as our best scientific theories imply, so the “simple conclusion” is that electrons are infinitely small, or infinitesimal, and zero-dimensional. Is this “simple conclusion” too simple? Some physicists speculate that there are no physical particles this small and that, in each subsequent century, physicists will discover that all the particles of the previous century have a finite size due to some inner structure. However, most physicists withhold judgment on this point about the future of physics.
A second reason to question whether the “simple conclusion” is too simple is that electrons, quarks, and all other elementary particles behave in a quantum mechanical way. They have a wave nature as well as a particle nature, and they have these simultaneously. When probing an electron’s particle nature it is found to have no limit to how small it can be, but when probing the electron’s wave nature, the electron is found to be spread out through all of space, although it is more probably in some places than others. Also, quantum theory is about groups of objects, not a single object. The theory does not imply a definite result for a single observation but only for averages over many observations, so this is why quantum theory introduces inescapable randomness or unpredictability into claims about single objects and single experimental results. The more accurate theory of quantum electrodynamics (QED) that incorporates special relativity and improves on classical quantum theory for the smallest regions, also implies electrons are infinitesimal particles when viewed as particles, while they are wavelike or spread out when viewed as waves. When considering the electron’s particle nature, QED’s prediction of zero volume has been experimentally verified down to the limits of measurement technology. The measurement process is limited by the fact that light or other electromagnetic radiation must be used to locate the electron, and this light cannot be used to determine the position of the electron more accurately than the distance between the wave crests of the light wave used to bombard the electron. So, all this is why the “simple conclusion” mentioned at the beginning of this paragraph may be too simple. For more discussion, see the chapter “The Uncertainty Principle” in (Hawking 2001) or (Greene 1999, pp. 121-2).
If a scientific theory implies space is a continuum, with the structure of a mathematical continuum, then if that theory is taken at face value, space is infinitely divisible and composed of infinitely small entities, the so-called points of space. But should it be taken at face value? The mathematician David Hilbert declared in 1925, “A homogeneous continuum which admits of the sort of divisibility needed to realize the infinitely small is nowhere to be found in reality. The infinite divisibility of a continuum is an operation which exists only in thought.” Hilbert said actual, completed infinities are real in mathematics, but not in physics. Many physicists agree with Hilbert. Many other physicists and philosophers argue that, although Hilbert is correct that ordinary entities such as strawberries and cream are not continuous, he is ultimately incorrect, for the following reasons.
First, the Standard Model of particles and forces is one of the best tested and most successful theories in all the history of physics. So are the theories of relativity and quantum mechanics. All these theories imply or assume that, using Cantor’s technical sense of actual infinity, there are infinitely many infinitesimal instants in any non-zero duration, and there are infinitely many point places along any spatial path. In short, contra Hilbert, time is a continuum, and space is a continuum.
The second challenge to Hilbert’s position is that quantum theory, in agreement with relativity theory, implies that for any possible kinetic energy of a free electron there is half that energy—insofar as an electron can be said to have a value of energy independently of being measured to have it. Although the energy of an electron bound within an atom is quantized, the energy of an unbound or free electron is not. If it accelerates in its reference frame from zero to nearly the speed of light, its energy changes and takes on all intermediate real-numbered values from its rest energy to its total energy. But mass is just a form of energy, as Einstein showed in his famous equation E = mc², so in this sense mass is a continuum as well as energy.
How about non-classical quantum mechanics, the proposed theories of quantum gravity that are designed to remove the disagreements between quantum mechanics and relativity theory? Do these non-classical theories quantize all these continua we’ve been talking about? If so, Hilbert was correct after all. One such theory, the theory of loop quantum gravity, implies space consists of discrete units called loops. But string theory, which is the more popular of the theories of quantum gravity in the early 21st century, does not imply space is discontinuous. [See (Greene 2004) for more details.] Speaking about this question of continuity, the theoretical physicist Brian Greene says that, although string theory is developed against a background of continuous spacetime, his own insight is that
[T]he increasingly intense quantum jitters that arise on decreasing scales suggest that the notion of being able to divide distances or durations into ever smaller units likely comes to an end at around the Planck length (10⁻³³ centimeters) and Planck time (10⁻⁴³ seconds). …There is something lurking in the microdepths—something that might be called the bare-bones substrate of spacetime—the entity to which the familiar notion of spacetime alludes. We expect that this ur-ingredient, this most elemental spacetime stuff, does not allow dissection into ever smaller pieces because of the violent fluctuations that would ultimately be encountered…. [If] familiar spacetime is but a large-scale manifestation of some more fundamental entity, what is that entity and what are its essential properties? As of today, no one knows. (Greene 2004, pp. 473, 474, 477)
Disagreeing, the theoretical physicist Roger Penrose speaks about both loop quantum gravity and string theory and says:
…in the early days of quantum mechanics, there was a great hope, not realized by future developments, that quantum theory was leading physics to a picture of the world in which there is actually discreteness at the tiniest levels. In the successful theories of our present day, as things have turned out, we take spacetime as a continuum even when quantum concepts are involved, and ideas that involve small-scale spacetime discreteness must be regarded as ‘unconventional.’ The continuum still features in an essential way even in those theories which attempt to apply the ideas of quantum mechanics to the very structure of space and time…. Thus it appears, for the time being at least, that we need to take the use of the infinite seriously, particularly in its role in the mathematical description of the physical continuum. (Penrose 2005, 363)
b. Singularities
There is a good reason why scientists fear the infinite more than mathematicians do. Scientists have to worry that someday we will have a dangerous encounter with a singularity, with something that is, say, infinitely hot or infinitely dense. For example, we might encounter a singularity by being sucked into a black hole. According to Schwarzschild’s solution to the equations of general relativity, a simple, non-rotating black hole is infinitely dense at its center. Nevertheless, these predicted singularities are not generally accepted by physicists, because singularities are not allowed by quantum mechanics, and physicists trust quantum theory on this topic more than relativity theory.
Some philosophers will ask: Is it not proper to appeal to our best physical theories in order to learn what is physically possible? Usually, but not in this case, say many scientists, including Albert Einstein. He believed that, if a theory implies that some physical properties might have or, worse yet, do have actually infinite values (the so-called singularities), then this is a sure sign of error in the theory. It is an error primarily because the theory will be unable to predict the behavior of the infinite entity, and so the theory will fail. For example, consider a second, popular candidate for a physical singularity, the universe at the start of the Big Bang. If the Big Bang were considered to be an actual singularity, then knowledge of the state of the universe at the Big Bang could not be used to predict events after the Big Bang. This failure to imply the character of later states of the universe is what Einstein’s collaborator Peter Bergmann meant when he said, “A theory that involves singularities…carries within itself the seeds of its own destruction.” The majority of physicists probably would agree with Einstein and Bergmann about this, but the critics of these scientists say their belief that we need to remove singularities everywhere is merely a hope that has been turned into a metaphysical assumption.
Relativity theory allows singularities, but quantum theory does not; it allows only arbitrarily large, finite values of properties such as temperature and mass-energy density. So which theory, relativity theory or quantum theory, should we trust to tell us whether the center of a black hole is or isn’t a singularity? The best answer is, “Neither, because we should get our answer from a theory of quantum gravity.” A principal attraction of string theory, a leading proposal for a theory of quantum gravity to replace both relativity theory and quantum theory, is that it eliminates the many singularities that appear in previously accepted physical theories such as relativity theory. In string theory, the electrons and quarks are not point particles but are small, finite loops of fundamental string. That finiteness in the loop is what eliminates the singularities.
Unfortunately, string theory has its own problems with infinity. It implies an infinity of kinds of particles. If a particle is a string, then the energy of the particle should be the energy of its vibrating string. Strings have an infinite number of possible vibrational patterns, each corresponding to a particle that should exist if we take the theory literally. One response string theorists make to this problem of too many particles is that perhaps the infinity of particles did exist at the time of the Big Bang but have since all disintegrated into a shower of simpler particles and so do not exist today. Another response favored by string theorists is that perhaps there never was an infinity of particles nor a Big Bang singularity in the first place. Instead, the Big Bang was a Big Bounce, a quick expansion from a pre-existing, shrinking universe whose size stopped shrinking when it got below the critical Planck length of about 10⁻³⁵ meters and instead began expanding explosively.
c. Idealization and Approximation
Scientific theories use idealization and approximation; they are “lies that help us to see the truth,” to use a phrase from the painter Pablo Picasso (who was speaking about art, not science). In our scientific theories, there are ideal gases, perfectly elliptical orbits, and economic consumers motivated only by profit. Everybody knows these are not intended to be real objects. Yet, it is clear that idealizations and approximations are actually needed in science in order to promote genuine explanation of many phenomena. We need to reduce the noise of the details in order to see what is important. In short, approximations and idealizations can be explanatory. But what about approximations and idealizations that involve the infinite?
Although the terms “idealization” and “approximation” are often used interchangeably, John Norton (Norton 2012) recommends paying more attention to their difference. When there is some aspect of the world, some target system, that we are trying to understand scientifically, approximations should be considered to be inexact descriptions of the target system, whereas idealizations should be considered to be new systems, or parts of new systems, that also approximate the target system but that contain reference to some novel object or property. For example, elliptical orbits are approximations to actual orbits of planets, but ideal gases are idealizations because they contain novel objects, such as point-sized gas particles, that are part of a new system useful for approximating the target system of actual gases.
Philosophers of science disagree about whether all appeals to infinity can be known a priori to be mere idealizations or approximations. Our theory of the solar system justifies our belief that the Earth is orbited by a moon, not just an approximate moon. The speed of light in a vacuum really is constant, not just approximately constant. Why then should it be assumed, as it often is, that all appeals to infinity in scientific theory are approximations or idealizations? Must the infinity be an artifact of the model rather than a feature of actual physical reality? Philosophers of science disagree on this issue. See (Mundy, 1990, p. 290).
There is an argument for believing some appeals to infinity definitely are neither approximations nor idealizations. The argument presupposes a realist rather than an antirealist understanding of science, and it begins with a description of the opponents’ position. Carl Friedrich Gauss (1777-1855) was one of the greatest mathematicians of all time. He said scientific theories involve infinities merely as approximations or idealizations and merely in order to make for easy applications of those theories, when in fact all real entities are finite. At the time, nearly everyone would have agreed with Gauss. Roger Penrose argues against Gauss’ position:
Nevertheless, as tried and tested physical theory stands today—as it has for the past 24 centuries—real numbers still form a fundamental ingredient of our understanding of the physical world. (Penrose 2005, 62)
Penrose’s point is that this appeal to real numbers implies there are actual infinities. Gauss’s position could be buttressed if there were useful alternatives to our physical theories that do not use infinities. There actually are alternative mathematical theories of analysis that do not use real numbers and do not use infinite sets and do not require the line to be dense. See (Ahmavaara 1965) for an example. The alternative theories of analysis require enormous but finite numbers. Penrose complains, “To my mind, a physical theory which depends fundamentally upon some absurdly enormous…number would be a far more complicated (and improbable) theory than one that is able to depend upon a simple notion of infinity” (Penrose 2005, 359). David Deutsch agrees. But couldn’t there be an alternative version of analysis that uses finite but smaller numbers? To this, Deutsch says, “Versions of number theory that confined themselves to ‘small natural numbers’ would have to be so full of arbitrary qualifiers, workarounds and unanswered questions, that they would be very bad explanations until they were generalized to the case that makes sense without such ad-hoc restrictions: the infinite case.” (Deutsch 2011, pp. 118-9) And surely a successful explanation is the surest route to understanding reality.
In opposition to this position of Penrose and Deutsch, and in support of Gauss’ position, the physicist Erwin Schrödinger remarks, “The idea of a continuous range, so familiar to mathematicians in our days, is something quite exorbitant, an enormous extrapolation of what is accessible to us.” Emphasizing this point about being “accessible to us,” some metaphysicians attack the applicability of the mathematical continuum to physical reality on the grounds that a continuous human perception over time is not mathematically continuous. Wesley Salmon responds to this complaint from Schrödinger:
…The perceptual continuum and perceived becoming [that is, the evidence from our sense organs that the world changes from time to time] exhibit a structure radically different from that of the mathematical continuum. Experience does seem, as James and Whitehead emphasize, to have an atomistic character. If physical change could be understood only in terms of the structure of the perceptual continuum, then the mathematical continuum would be incapable of providing an adequate description of physical processes. In particular, if we set the epistemological requirement that physical continuity must be constructed from physical points which are explicitly definable in terms of observables, then it will be impossible to endow the physical continuum with the properties of the mathematical continuum. In our discussion…, we shall see, however, that no such rigid requirement needs to be imposed. (Salmon 1970, 20)
Salmon continues by making the point that calculus provides better explanations of physical change than explanations which accept the “rigid requirement” of understanding physical change in terms of the structure of the perceptual continuum, so he recommends that we apply Ockham’s Razor, eliminate that rigid requirement, and embrace actual infinities. But the issue is not settled.
d. Infinity in Cosmology
Let’s review some of the history regarding the volume of spacetime. Aristotle said the past is infinite because for any past time we can imagine an earlier time. It is difficult to make sense of his belief since he means the past is potentially infinite. After all, the past has an end, namely the present, so its infinity has been completed and therefore is not a potential infinity. This problem with Aristotle’s reasoning was first raised in the 13th century by Richard Rufus of Cornwall. It was not given the attention it deserved because of the assumption for so many centuries that Aristotle could not have been wrong about time, especially since his position was consistent with Christian, Jewish, and Muslim theology which implies the physical world became coherent or well-formed only a finite time ago (even if past time itself is potentially infinite). However, Aquinas argued against Aristotle’s view that the past is infinite; Aquinas’ grounds were that Holy Scripture implies God created the world (and thus time itself) a finite time ago and that Aristotle was wrong to put so much trust in what we can imagine.
Unlike time, Aristotle claimed space is finite. He said the volume of physical space is finite because it is enclosed within a finite, spherical shell of visible, fixed stars with the Earth at its center. On this topic of space not being infinite, Aristotle’s influence was authoritative to most scholars for the next eighteen hundred years.
The debate about whether the volume of space is infinite was rekindled in Renaissance Europe. The English astronomer and defender of Copernicus, Thomas Digges (1546–1595) was the first scientist to reject the ancient idea of an outer spherical shell and to declare that physical space is actually infinite in volume and filled with stars. The physicist Isaac Newton (1642–1727) at first believed the universe’s material is confined to only a finite region while it is surrounded by infinite empty space, but in 1691 he realized that if there were a finite number of stars in a finite region, then gravity would require all the stars to fall in together at some central point. To avoid this result, he later speculated that the universe contains an infinite number of stars in an infinite volume. We now know that Newton’s speculation about the stability of an infinity of stars in an infinite universe is incorrect. There would still be clumping so long as the universe did not expand. (Hawking 2001, p. 9)
Immanuel Kant (1724–1804) declared that space and time are both potentially infinite in extent because this is imposed by our own minds. Space and time are not features of “things in themselves” but are an aspect of the very form of any possible human experience, he said. We can know a priori even more about space than about time, he believed; and he declared that the geometry of space must be Euclidean. Kant’s approach to space and time as something knowable a priori went out of fashion in the early 20th century. It was undermined in large part by the discovery of non-Euclidean geometries in the 19th century, then by Beltrami’s and Klein’s proofs that these geometries are as logically consistent as Euclidean geometry, and finally by Einstein’s successful application to physical space of non-Euclidean geometry within his general theory of relativity.
Assuming space is all the places that have been created since the Big Bang, then the volume of space is definitely finite at present, though it is huge and growing ever larger over time. Assuming this expansion will never stop, it follows that the volume of spacetime is potentially infinite but not actually infinite. For more discussion of the issue of the volume of spacetime, see (Greene 2011).
Einstein’s theory of relativity implies that all physical objects must travel at less than light speed (in a vacuum). Nevertheless, by exploiting the time dilation and length contraction of his special theory of relativity, the time limits on human exploration of the universe can be removed. Assuming you can travel safely at any speed below light speed, then as your spaceship approaches light speed, judged by your own clock the trip’s distance and travel time become arbitrarily short. In principle, you have time on your own clock to cross the Milky Way galaxy, a trip that takes light itself 100,000 years as measured on an Earth clock.
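The arithmetic behind this claim is the Lorentz factor γ = 1/√(1 − v²/c²): the traveler’s proper time for a journey of Earth-frame distance d at speed v is (d/v)/γ. A rough numerical sketch (the function name and chosen speeds are our own illustration; only the 100,000 light-year figure comes from the text):

```python
import math

def proper_time_years(distance_ly, v_over_c):
    """Traveler's clock time, in years, to cross distance_ly light-years
    at the given fraction of light speed, by special relativity."""
    gamma = 1.0 / math.sqrt(1.0 - v_over_c ** 2)   # Lorentz factor
    earth_frame_years = distance_ly / v_over_c     # trip time on an Earth clock
    return earth_frame_years / gamma               # trip time on the ship's clock

# Crossing the Milky Way, about 100,000 light-years in the Earth frame:
for v in (0.99, 0.9999, 0.999999):
    print(f"v = {v}c: about {proper_time_years(100_000, v):,.0f} years on board")
```

As v approaches 1, γ grows without bound and the on-board trip time shrinks toward zero, which is the sense in which the time limit on exploration is removed.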
5. Infinity in Mathematics
The previous sections of this article have introduced the concepts of actual infinity and potential infinity and explored the development of calculus and set theory, but this section probes deeper into the role of infinity in mathematics. Mathematicians have always been aware of the special difficulty in dealing with the concept of infinity in a coherent manner. Intuitively, it seems reasonable that if we have two infinities of things, then we still have an infinity of them. So, we might represent this intuition mathematically by the equation 2 × ∞ = 1 × ∞. Dividing both sides by ∞ would then prove that 2 = 1, which is a good sign we were not using infinity in a coherent manner.
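The same incoherence shows up in IEEE floating-point arithmetic, which deliberately leaves ∞/∞ undefined rather than license such proofs. A minimal sketch in Python (our own illustration, not part of the article’s argument):

```python
import math

inf = float('inf')  # IEEE 754 positive infinity

# The intuition "two infinities is still an infinity" holds:
assert 2 * inf == 1 * inf   # both sides evaluate to infinity

# But "dividing both sides by infinity" is not a valid step:
# infinity divided by infinity is indeterminate, not 2 and not 1.
print(inf / inf)            # nan, "not a number"
print(math.isnan(inf / inf))
```

The floating-point designers avoid the paradox exactly as Cantor’s arithmetic does: by refusing to let division by an infinite quantity cancel as it would for finite numbers.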
In recommending how to use the concept of infinity coherently, Bertrand Russell said pejoratively:
The whole difficulty of the subject lies in the necessity of thinking in an unfamiliar way, and in realising that many properties which we have thought inherent in number are in fact peculiar to finite numbers. If this is remembered, the positive theory of infinity…will not be found so difficult as it is to those who cling obstinately to the prejudices instilled by the arithmetic which is learnt in childhood (Salmon 1970, 58).
That positive theory of infinity that Russell is talking about is set theory, and the new arithmetic is the result of Cantor’s generalizing the notions of order and of size of sets into the infinite, that is, to the infinite ordinals and infinite cardinals. These numbers are also called transfinite ordinals and transfinite cardinals. The following sections briefly explore set theory and the role of infinity within mathematics. The main idea, though, is that the basic theories of mathematical physics are properly expressed using the differential calculus with real-number variables, and these concepts are well-defined in terms of set theory which, in turn, requires using actual infinities or transfinite infinities of various kinds.
a. Infinite Sums
In the 17th century, when Newton and Leibniz invented calculus, they wondered what the value is of this so-called “infinite sum”:
1/1 + 1/2 + 1/4 + 1/8 + ….
They believed the sum is 2. Knowing about the dangers of talking about infinity, most later mathematicians hoped to find a technique to avoid using the phrase “infinite sum.” Cauchy and Weierstrass eventually provided this technique two centuries later. They removed any mention of “infinite sum” by using the formal idea of a limit. Informally, the Cauchy-Weierstrass idea is that instead of overtly saying the infinite sum x1 + x2 + x3 + … is some number S, as Newton and Leibniz were saying, one should say that the sequence of partial sums converges to S just in case the numerical difference between S and any partial sum is as small as one desires, provided that the final term of that partial sum occurs sufficiently far out in the sequence of partial sums. More formally it is expressed this way:
If an infinite series of real numbers is x1 + x2 + x3 + …, and if the infinite sequence of its partial sums is s1, s2, s3, …, then the series converges to S if and only if for every positive number ε there exists an integer n such that, for all integers k > n, |sk – S| < ε.
This technique of talking about limits was due to Cauchy in 1821 and Weierstrass in the period from 1850 to 1871. The two drawbacks to this technique are that (1) it is unintuitive and more complicated than Newton and Leibniz’s intuitive approach that did mention infinite sums, and (2) it is not needed because infinite sums were eventually legitimized by being given a set-theoretic foundation.
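The ε-n definition can be checked numerically for Newton and Leibniz’s series: the partial sums of 1 + 1/2 + 1/4 + … come within any chosen ε of 2 and stay there. A small illustrative sketch (the function and variable names are our own):

```python
def partial_sums(n_terms):
    """Yield the partial sums s1, s2, ... of 1 + 1/2 + 1/4 + ..."""
    s = 0.0
    for k in range(n_terms):
        s += 1.0 / (2 ** k)   # terms 1/2^0, 1/2^1, 1/2^2, ...
        yield s

S = 2.0           # the claimed limit
epsilon = 1e-9    # any positive tolerance will do

# Find an n such that every later partial sum is within epsilon of S.
# (Here the partial sums increase monotonically toward 2, so once a
# partial sum is within epsilon, all later ones are too.)
for n, s in enumerate(partial_sums(60), start=1):
    if abs(s - S) < epsilon:
        print(f"|s_n - 2| < {epsilon} for all n >= {n}")
        break
```

Since the nth partial sum is exactly 2 − 2^(1−n), the gap halves with each term, so an n exists for every ε; that is precisely what the formal definition demands.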
b. Infinitesimals and Hyperreals
There has been considerable controversy throughout history about how to understand infinitesimal objects and infinitesimal changes in the properties of objects. Intuitively, an infinitesimal object is as small as you please but not quite nothing. Infinitesimal objects and infinitesimal methods were first used by Archimedes in ancient Greece, but he did not mention them in any publication intended for the public because he did not consider his use of them to be rigorous. Infinitesimals became better known when Leibniz used them in his differential and integral calculus. The differential calculus can be considered to be a technique for treating continuous motion as being composed of an infinite number of infinitesimal steps. The calculus’ use of infinitesimals led to the so-called “golden age of nothing” in which infinitesimals were used freely in mathematics and science. During this period, Leibniz, Euler, and the Bernoullis applied the concept. Euler applied it cavalierly (although his intuition was so good that he rarely if ever made mistakes), but Leibniz and the Bernoullis were concerned with the general question of when we could, and when we could not, consider an infinitesimal to be zero. They were aware of apparent problems with these practices in large part because they had been exposed by George Berkeley in 1734.
Berkeley attacked the concept of infinitesimal as being ill-defined and incoherent because there were no definite rules for when the infinitesimal should be and should not be considered to be zero. Berkeley, like Leibniz, was thinking of infinitesimals as objects with a constant value (as genuinely infinitesimally small magnitudes), whereas Newton thought of them as variables that could arbitrarily approach zero. Either way, there were coherence problems. The scientists and results-oriented mathematicians of the golden age of nothing had no good answer to the coherence problem. As standards of rigorous reasoning increased over the centuries, mathematicians became more worried about infinitesimals. They were delighted when Cauchy in 1821 and Weierstrass in the period from 1850 to 1875 developed a way to use calculus without infinitesimals.
Here is how Cauchy and Weierstrass eliminated infinitesimals with their concept of limit. Suppose we have a function f, and we are interested in the Cartesian graph of the curve y = f(x) at some point a along the x-axis. What is the rate of change of f at a? This is the slope of the tangent line at a, and it is called the derivative f' at a. This derivative was defined by Leibniz to be

f'(a) = [f(a + h) − f(a)] / h,

where h is an infinitesimal. Because of suspicions about infinitesimals and dividing by zero, Cauchy and Weierstrass suggested replacing Leibniz’s definition of the derivative with

f'(a) = lim(x → a) [f(x) − f(a)] / (x − a).
That is, f'(a) is the limit, as x approaches a, of the above ratio. The limit idea was rigorously defined using Cauchy’s well-known epsilon and delta method. Soon after the Cauchy-Weierstrass definition of derivative was formulated, mathematicians stopped using infinitesimals.
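The limit idea can be seen numerically. The sketch below (an illustration of our own, not part of the historical story) evaluates the ratio [f(x) − f(a)]/(x − a) for x closer and closer to a, showing the values approaching the derivative without ever dividing by an infinitesimal.

```python
# The Cauchy-Weierstrass idea in numerical form: the ratio
# [f(x) - f(a)] / (x - a) approaches f'(a) as x approaches a.
# Here f(x) = x*x at a = 3, so the true derivative is 6.
# (The function and the point are our own illustrative choices.)

def f(x):
    return x * x

def ratio(f, a, x):
    return (f(x) - f(a)) / (x - a)

a = 3.0
for x in (3.1, 3.01, 3.0001, 3.000001):
    print(x, ratio(f, a, x))   # the values drift toward 6
```

Every x used is an ordinary real number a finite distance from a; the infinitesimal has been replaced by the limit of this process.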
The scientists did not follow the lead of the mathematicians. Despite the lack of a coherent theory of infinitesimals, scientists continued to reason with infinitesimals because infinitesimal methods were so much more intuitively appealing than the mathematicians’ epsilon-delta methods. Although students in calculus classes in the early 21st century are still taught the unintuitive epsilon-delta methods, Abraham Robinson (Robinson 1966) created a rigorous alternative to standard Weierstrassian analysis by using the methods of model theory to define infinitesimals.
Here is Robinson’s idea. Think of the rational numbers in their natural order as being gappy with real numbers filling the gaps between them. Then think of the real numbers as being gappy with hyperreals filling the gaps between them. There is a cloud or region of hyperreals surrounding each real number (that is, surrounding each real number described nonstandardly). To develop these ideas more rigorously, Robinson used this simple definition of an infinitesimal:
h is infinitesimal if and only if 0 < |h| < 1/n, for every positive integer n.
|h| is the absolute value of h.
Robinson did not actually define an infinitesimal as a number on the real line. The infinitesimals were defined on a new number line, the hyperreal line, that contains within it the structure of the standard real numbers from classical analysis. In this sense, the hyperreal line is the extension of the reals to the hyperreals. The development of analysis via infinitesimals creates a nonstandard analysis with a hyperreal line and a set of hyperreal numbers that include real numbers. In this nonstandard analysis, 78 + 2h is a hyperreal that is infinitesimally close to the real number 78. Sums and products of infinitesimals are infinitesimal.
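No computer can store a genuine hyperreal, but a rough flavor of calculating with an “infinitesimal part” can be given with dual numbers, numbers of the form a + b·eps where eps·eps is set to zero. This is only a loose analogy of our own, not Robinson’s construction: a hyperreal infinitesimal h has h·h nonzero, whereas here the product of two infinitesimal parts is discarded.

```python
# Dual numbers a + b*eps with eps*eps = 0: a loose analogy (NOT
# Robinson's hyperreals) for carrying an "infinitesimal part"
# alongside each real part.

class Dual:
    def __init__(self, real, inf=0.0):
        self.real = real   # standard part
        self.inf = inf     # coefficient of the infinitesimal eps

    def __add__(self, other):
        return Dual(self.real + other.real, self.inf + other.inf)

    def __mul__(self, other):
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, discarding eps*eps
        return Dual(self.real * other.real,
                    self.real * other.inf + self.inf * other.real)

    def __repr__(self):
        return f"{self.real} + {self.inf}*eps"

x = Dual(78.0, 2.0)     # "78 + 2h": infinitesimally close to 78
print(x + Dual(1.0))    # 79.0 + 2.0*eps
print(x * Dual(2.0))    # 156.0 + 4.0*eps
```

The standard part of each result is what classical analysis would report; the eps-coefficient travels along with it, much as 78 + 2h stays infinitesimally close to 78.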
Because of the rigor of the extension, all the arguments for and against Cantor’s infinities apply equally to the infinitesimals. Sentences about the standardly-described reals are true if and only if they are true in this extension to the hyperreals. Nonstandard analysis allows proofs of all the classical theorems of standard analysis, but it very often provides shorter, more direct, and more elegant proofs than those that were originally proved by using standard analysis with epsilons and deltas. Objections by practicing mathematicians to infinitesimals subsided after this was appreciated. With a good definition of “infinitesimal” they could then use it to explain related concepts such as in the sentence, “That curve approaches infinitesimally close to that line.” See (Wolf 2005, chapter 7) for more about infinitesimals and hyperreals.
c. Mathematical Existence
Mathematics is apparently about mathematical objects, so it is apparently about infinitely large objects, infinitely small objects, and infinitely many objects. Mathematicians often remark that there are infinite-dimensional spaces, the continuum, continuous functions, an infinity of functions, and this or that infinite structure. Despite these remarks, philosophers wish to know whether these infinities really exist. The philosophical literature is filled with arguments pro and con and with fine points about senses of existence.
When axiomatizing geometry, Euclid said that between any two points one could choose to construct a line. Opposed to Euclid’s constructivist stance, many modern axiomatizers take a realist philosophical stance by declaring simply that there exists a line between any two points, so the line pre-exists any construction process. In mathematics, the constructivist will recognize the existence of a mathematical object only if there is at present an algorithm (that is, a step by step “mechanical” procedure operating on symbols that is finitely describable, that requires no ingenuity and that uses only finitely many steps) for constructing or finding such an object. Assertions require proofs. The constructivist believes that to justifiably assert the negation of a sentence S is to prove that the assumption of S leads to a contradiction. So, legitimate mathematical objects must be shown to be constructible in principle by some mental activity and cannot be assumed to pre-exist any such construction process nor to exist simply because their non-existence would be contradictory. A constructivist, unlike a realist, is a kind of conceptualist, one who believes that an unknowable mathematical object is impossible. Most constructivists complain that, although potential infinities can be constructed, actual infinities cannot be.
There are many different schools of constructivism. The first systematic one, and perhaps the best-known and most radical, is due to L.E.J. Brouwer. He is not a finitist, but his intuitionist school demands that all legitimate mathematics be constructible from a basis of mental processes that he called “intuitions.” These intuitions might be more accurately called “clear mental procedures.” If there were no minds capable of having these intuitions, then there would be no mathematical objects just as there would be no songs without ideas in the minds of composers. Numbers are human creations according to constructivists. The number pi is intuitionistically legitimate because we have an algorithm for computing all its decimal digits, but the following number g is not legitimate. It is the number whose nth digit is either 0 or 1, and it is 1 if and only if there are n consecutive 7s in the decimal expansion of pi. No person yet knows how to construct the decimal digits of g. Brouwer argued that the actually infinite set of natural numbers cannot be constructed (using intuitions) and so does not exist. The best we can do is to have a rule for adding more members to a set. So, his concept of an acceptable infinity is closer to that of potential infinity than actual infinity. Hermann Weyl emphasizes the merely potential character of these infinities:
Brouwer made it clear, as I think beyond any doubt, that there is no evidence supporting the belief in the existential character of the totality of all natural numbers…. The sequence of numbers which grows beyond any stage already reached by passing to the next number, is a manifold of possibilities open towards infinity; it remains forever in the status of creation, but is not a closed realm of things existing in themselves (Weyl is quoted in (Kleene 1967, p. 195)).
It is not legitimate for platonic realists, said Brouwer, to bring all the sets into existence at once by declaring they are whatever objects satisfy all the axioms of set theory. Brouwer believed realists accept too many sets because they are too willing to accept sets merely by playing coherently with the finite symbols for them when sets instead should be tied to our experience. For Brouwer, this experience is our experience of time. He believed we should arrive at our concept of the infinite by noticing that our experience of a duration can be divided into parts and then these parts can be further divided, and so on. This infinity is a potential infinity, not an actual infinity. For the intuitionist, there is no determinate, mind-independent mathematical reality that provides the facts to make mathematical sentences true or false. This metaphysical position is reflected in the principles of logic that are acceptable to an intuitionist. For the intuitionist, the sentence “For all x, x has property F” is true only if we have already proved constructively that each x has property F. And it is false only if we have proved that some x does not have property F. Otherwise, it is neither true nor false. The intuitionist does not accept the principle of excluded middle, namely, for any sentence S, either S or the negation of S. Outraged by this intuitionist position, David Hilbert famously responded by saying, “To take the law of the excluded middle away from the mathematician would be like denying the astronomer the telescope or the boxer the use of his fists.” (Quoted from Kleene 1967, p. 197.) For a presentation of intuitionism with philosophical emphasis, see (Posy 2005) and (Dummett 1977).
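The constructivist’s complaint about the number g can be made vivid by computation. Any program can inspect only finitely many digits of pi, so it can certify some initial digits of g to be 1 (once a long enough run of 7s has been found) but can never certify any digit to be 0. The sketch below streams digits of pi with a published unbounded spigot algorithm (Gibbons 2006); the particular algorithm and the prefix length of 1000 are our own choices for illustration.

```python
# Digits of Brouwer's number g, as far as a finite search can settle them.

from itertools import islice

def pi_digits():
    """Yield decimal digits of pi (3, 1, 4, 1, 5, 9, ...) using
    Gibbons' unbounded spigot algorithm."""
    q, r, t, j = 1, 180, 60, 2
    while True:
        u, y = 3*(3*j + 1)*(3*j + 2), (q*(27*j - 12) + 5*r) // (5*t)
        yield y
        q, r, t, j = 10*q*j*(2*j - 1), 10*u*(q*(5*j - 2) + r - y*t), t*u, j + 1

prefix = list(islice(pi_digits(), 1000))   # we can only ever inspect a finite prefix

longest = run = 0
for d in prefix:
    run = run + 1 if d == 7 else 0
    longest = max(longest, run)

# Digits 1 through `longest` of g are settled: they are 1.  For every
# later digit the finite search was inconclusive, so the constructivist
# refuses to assert "this digit is 0 or 1."
print("longest run of 7s found in the first 1000 digits:", longest)
```

However far the search is extended, it decides only finitely many digits of g, which is why the intuitionist withholds the law of excluded middle from statements about g’s undetermined digits.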
Finitists, even those who are not constructivists, also argue that the actually infinite set of natural numbers does not exist. They say there is a finite rule for generating each numeral from the previous one, but the rule does not produce an actual infinity of either numerals or numbers. The ultrafinitist considers the classical finitist to be too liberal because finite numbers such as 2^100 and 2^1000 can never be accessed by a human mind in a reasonable amount of time. Only the numerals or symbols for those smaller numbers can be coherently manipulated. One challenge to ultrafinitists is that they should explain where the cutoff point is between numbers that can be accessed and numbers that cannot be. Ultrafinitists have risen to this challenge. The mathematician Harvey Friedman says:
I raised just this objection [about a cutoff] with the (extreme) ultrafinitist Yessenin-Volpin during a lecture of his. He asked me to be more specific. I then proceeded to start with 2^1 and asked him whether this is “real” or something to that effect. He virtually immediately said yes. Then I asked about 2^2, and he again said yes, but with a perceptible delay. Then 2^3, and yes, but with more delay. This continued for a couple of more times, till it was obvious how he was handling this objection. Sure, he was prepared to always answer yes, but he was going to take 2^100 times as long to answer yes to 2^100 than he would to answering 2^1. There is no way that I could get very far with this (Elwes 2010, 317).
This battle among competing philosophies of mathematics is not explored in-depth in this article, but this section does offer a few more points about mathematical existence.
Hilbert argued that “If the arbitrarily given axioms do not contradict one another, then they are true and the things defined by the axioms exist.” But (Chihara 2008, 141) points out that Hilbert seems to be confusing truth with truth in a model. If a set of axioms is consistent, and so is its corresponding axiomatic theory, then the theory defines a class of models, and each axiom is true in any such model, but it does not follow that the axioms are really true. To give a crude, nonmathematical example, consider this set of two axioms {All horses are blue, all cows are green.}. The formal theory using these axioms is consistent and has a model, but it does not follow that either axiom is really true.
Quine objected to Hilbert’s criterion for existence as being too liberal. Quine’s argument for infinity in mathematics begins by noting that our fundamental scientific theories are our best tools for helping us understand reality and doing ontology. Mathematical theories that imply the existence of some actually infinite sets are indispensable to all these scientific theories, and their referring to these infinities cannot be paraphrased away. All this success is a good reason to believe in some actual infinite sets and to say the sentences of both the mathematical theories and the scientific theories are true or approximately true since their success would otherwise be a miracle. But, he continues, it is no miracle. See (Quine 1960 chapter 7).
Quine believed that infinite sets exist only if they are indispensable in successful applications of mathematics to science; but he believed science so far needs only the first three alephs: ℵ0 for the integers, ℵ1 for the set of point places in space, and ℵ2 for the number of possible lines in space (including lines that are not continuous). The rest of Cantor’s heaven of transfinite numbers is unreal, Quine said, and the mathematics of the extra transfinite numbers is merely “recreational mathematics.” But Quine showed intellectual flexibility by saying that if he were to be convinced more transfinite sets were needed in science, then he would change his mind about which alephs are real.
To summarize Quine’s position, his indispensability argument treats mathematical entities on a par with all other theoretical entities in science and says mathematical statements can be (approximately) true. Quine points out that reference to mathematical entities is vital to science, and there is no way of separating out the evidence for the mathematics from the evidence for the science. This famous indispensability argument has been attacked in many ways. Critics charge, “Quite aside from the intrinsic logical defects of set theory as a deductive theory, this is disturbing because sets are so very different from physical objects as ordinarily conceived, and because the axioms of set theory are so very far removed from any kind of empirical support or empirical testability…. Not even set theory itself can tell us how the existence of a set (e.g. a power set) is empirically manifested” (Mundy 1990, pp. 289-90). See (Parsons 1980) for more details about Quine’s and other philosophers’ arguments about the existence of mathematical objects.
d. Zermelo-Fraenkel Set Theory
Cantor initially thought of a set as being a collection of objects that can be counted, but this notion eventually gave way to a set being a collection that has a clear membership condition. Over several decades, Cantor’s naive set theory evolved into ZF, Zermelo-Fraenkel set theory, and ZF was accepted by most mid-20th century mathematicians as the correct tool to use for deciding which mathematical objects exist. The acceptance was based on three reasons. (1) ZF is precise and rigorous. (2) ZF is useful for defining or representing other mathematical concepts and methods. Mathematics can be modeled in set theory; it can be given a basis in set theory. (3) No inconsistency has been uncovered despite heavy usage.
Notice that one of the three reasons is not that set theory provides a foundation for mathematics in the sense of justifying the doing of mathematics or in the sense of showing its sentences are certain or necessary. Instead, set theory provides a basis for theories only in the sense that it helps to organize them, to reveal their interrelationships, and to provide a means to precisely define their concepts. The first program for providing this basis began in the late 19th century. Peano had given an axiomatization of the natural numbers. It can be expressed in set theory using standard devices for treating natural numbers and relations and functions and so forth as being sets. (For example, zero is the empty set, and a relation is a set of ordered pairs.) Then came the arithmetization of analysis which involved using set theory to construct from the natural numbers all the negative numbers and the fractions and real numbers and complex numbers. Along with this, the principles of these numbers became sentences of set theory. In this way, the assumptions used in informal reasoning in arithmetic are explicitly stated in the formalism, and proofs in informal arithmetic can be rewritten as formal proofs so that no creativity is required for checking the correctness of the proofs. Once a mathematical theory is given a set theoretic basis in this manner, it follows that if we have any philosophical concerns about the higher level mathematical theory, those concerns will also be concerns about the lower level set theory in the basis.
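The standard devices mentioned above can be exhibited concretely. In the usual von Neumann coding, zero is the empty set and the successor of n is n ∪ {n}. The sketch below (an illustration of that coding, using Python’s frozenset for pure, hashable sets) shows that the numeral for n has exactly n elements.

```python
# Von Neumann coding of the natural numbers as pure sets:
# 0 is the empty set, and n + 1 is n ∪ {n}.

zero = frozenset()

def succ(n):
    return n | frozenset({n})

one = succ(zero)       # the set {0}
two = succ(one)        # the set {0, 1}
three = succ(two)      # the set {0, 1, 2}

# The numeral for n has exactly n elements, and m < n just in case m ∈ n.
print(len(three))      # 3
print(two in three)    # True
print(three in two)    # False
```

On this coding the order relation falls out of membership, which is one reason the construction gives set theory its organizing power over arithmetic.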
In addition to Dedekind’s definition, there are other acceptable definitions of “infinite set” and “finite set” using set theory. One popular one is to define a finite set as a set onto which a one-to-one function maps the set of all natural numbers that are less than some natural number n. That finite set contains n elements. An infinite set is then defined as one that is not finite. Dedekind, himself, used another definition; he defined an infinite set also as one that is not finite, but defined a finite set as any set in which there exists no one-to-one mapping of the set into a proper subset of itself. The philosopher C. S. Peirce suggested essentially the same approach as Dedekind at approximately the same time, but he received little notice from the professional community. For more discussion of the details, see (Wilder 1965, p. 66f) and (Suppes 1960, p. 99n).
Set theory implies quite a bit about infinity. First, infinity in ZF has some very unsurprising features. If a set A is infinite and is the same size as set B, then B also is infinite. If A is infinite and is a subset of B, then B also is infinite. Using the axiom of choice, it follows that a set is infinite just in case for every natural number n, there is some subset whose size is n.
ZF’s axiom of infinity declares that there is at least one infinite set, a so-called inductive set containing zero and the successor of each of its members—such as {0, 1, 2, 3, …}. The power set axiom (which says every set has a power set, namely a set of all its subsets) then generates many more infinite sets of larger cardinality, a surprising result that Cantor first discovered in 1874.
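Cantor’s theorem behind this result, that no function from a set onto its power set exists, can be verified exhaustively for a small finite set. The diagonal construction below is Cantor’s own, applied to a three-element set of our choosing.

```python
# Cantor's diagonal argument checked by brute force: for every function
# f from S into its power set, the set D = {x in S : x not in f(x)}
# is missed by f, so no f is onto.

from itertools import chain, combinations, product

S = [0, 1, 2]
subsets = [frozenset(c) for c in chain.from_iterable(
    combinations(S, r) for r in range(len(S) + 1))]   # all 8 subsets of S

for images in product(subsets, repeat=len(S)):        # every f: S -> P(S)
    f = dict(zip(S, images))
    D = frozenset(x for x in S if x not in f[x])      # the diagonal set
    assert all(f[x] != D for x in S)                  # f never hits D

print("checked", len(subsets) ** len(S), "functions")   # checked 512 functions
```

The same diagonal set works for infinite S, which is how the power set axiom generates ever larger infinite cardinalities.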
In ZF, there is no set with maximum cardinality, nor a set of all sets, nor an infinitely descending sequence of sets x0, x1, x2, … in which x1 is in x0, and x2 is in x1, and so forth. There is, however, an infinitely ascending sequence of sets x0, x1, x2, … in which x0 is in x1, and x1 is in x2, and so forth. In ZF, a set exists if it is implied by the axioms; there is no requirement that there be some property P such that the set is the extension of P. That is, there is no requirement that the set be defined as {x | P(x)} for some property P. One especially important feature of ZF is that for any condition or property, there is only one set of objects having that property, but it cannot be assumed that for any property, there is a set of all those objects that have that property. For example, it cannot be assumed that, for the property of being a set, there is a set of all objects having that property.
In ZF, all sets are pure. A set is pure if it is empty or its members are sets, and its members’ members are sets, and so forth. In informal set theory, a set can contain cows and electrons and other non-sets.
In the early years of set theory, the terms “set” and “class” and “collection” were used interchangeably, but in von Neumann–Bernays–Gödel set theory (NBG) a set is defined to be a class that is an element of some other class. NBG is designed to have proper classes, classes that are not sets, even though they can have members which are sets. The intuitive idea is that a proper class is a collection that is too big to be a set. There can be a proper class of all sets, but neither a set of all sets nor a class of all classes. A nice feature of NBG is that a sentence in the language of ZFC is provable in NBG only if it is provable in ZFC.
Are philosophers justified in saying there is more to know about sets than is contained within ZF set theory? If V is the collection or class of all sets, do mathematicians have any access to V independently of the axioms? This is an open question that arose concerning the axiom of choice and the continuum hypothesis.
e. The Axiom of Choice and the Continuum Hypothesis
Consider whether to believe in the axiom of choice. The axiom of choice is the assertion that, given any collection of non-empty and non-overlapping sets, there exists a ‘choice set’ which is composed of one element chosen from each set in the collection. However, the axiom does not say how to do the choosing. For some sets, there might not be a precise rule of choice. If the collection is infinite and its sets are not well-ordered in any way that has been specified, then there is, in general, no way to define the choice set. The axiom is implicitly used throughout the field of mathematics, and several important theorems cannot be proved without it. Mathematical Platonists tend to like the axiom, but those who want explicit definitions or constructions for sets do not like it. Nor do others who note that mathematics’ most unintuitive theorem, the Banach-Tarski Theorem, requires the axiom of choice. The dispute can get quite intense with advocates of the axiom of choice saying that their opponents are throwing out invaluable mathematics, while these opponents consider themselves to be removing tainted mathematics. See (Wagon 1985) for more on the Banach-Tarski Theorem; see (Wolf 2005, pp. 226-8) for more discussion of which theorems require the axiom.
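For a finite collection, no axiom is needed: a choice set can be written down by an explicit rule. The sketch below uses min() as such a rule on sets of numbers (an illustrative choice of our own); the axiom earns its keep precisely where no such rule is available, as with infinitely many arbitrary, unordered sets.

```python
# A choice set for a finite collection of non-empty, non-overlapping
# sets of numbers, built with an explicit rule (take the least element).
# No appeal to the axiom of choice is needed in this finite case.

def choice_set(collection):
    return {min(s) for s in collection}

family = [{3, 1, 4}, {5, 8}, {9, 2, 6}]   # non-empty and pairwise disjoint
print(choice_set(family))                 # one element drawn from each set
```

The rule works here because the members are integers, which come with a ready-made ordering; strip away any specifiable ordering or any finiteness and the constructivist can no longer exhibit the choice set, which is exactly the axiom’s point of application.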
A set is always smaller than its power set. How much bigger is the power set? Cantor’s controversial continuum hypothesis says that the cardinality of the power set of a set of size ℵ0 is ℵ1, the next larger cardinal number, and not some higher cardinal. The generalized continuum hypothesis is more general; it says that, given an infinite set of any cardinality, the cardinality of its power set is the next larger cardinal and not some even higher cardinal. Cantor believed the continuum hypothesis, but he was frustrated that he could not prove it. The philosophical issue is whether we should alter the axioms to enable the hypotheses to be proved.
If ZF is formalized as a first-order theory of deductive logic, then both Cantor’s generalized continuum hypothesis and the axiom of choice are consistent with Zermelo-Fraenkel set theory but cannot be proved or disproved from it, assuming that ZF is not inconsistent. In this sense, both the continuum hypothesis and the axiom of choice are independent of ZF. Gödel in 1940 and Cohen in 1964 contributed to the proof of this independence result.
So, how do we decide whether to believe the axiom of choice and continuum hypothesis, and how do we decide whether to add them to the principles of ZF or any other set theory? Most mathematicians do believe the axiom of choice is true, but there is more uncertainty about the continuum hypothesis. The independence does not rule out our someday finding a convincing argument that the hypothesis is true or a convincing argument that it is false, but the argument will need more premises than just the principles of ZF. At this point, the philosophers of mathematics divide into two camps. The realists, who think there is a unique universe of sets to be discovered, believe that if ZF does not fix the truth values of the continuum hypothesis and the axiom of choice, then this is a defect within ZF and we need to explore our intuitions about infinity in order to uncover a missing axiom or two for ZF that will settle the truth values. These persons prefer to think that there is a single system of mathematics to which set theory is providing a foundation, but they would prefer not simply to add the continuum hypothesis itself as an axiom because the hope is to make the axioms “readily believable,” yet it is not clear enough that the axiom itself is readily believable. The second camp of philosophers of mathematics disagree and say the concept of infinite set is so vague that we simply do not have any intuitions that will or should settle the truth values. According to this second camp, there are set theories with and without axioms that fix the truth values of the axiom of choice and the continuum hypothesis, and set theory should no more be a unique theory of sets than Euclidean geometry should be the unique theory of geometry.
Believing that ZFC’s infinities are merely the above-surface part of the great iceberg of infinite sets, many set theorists are actively exploring new axioms that imply the existence of sets that could not be proved to exist within ZFC. So far there is no agreement among researchers about the acceptability of any of the new axioms. See (Wolf 2005, pp. 226-8) and (Rucker 1982, pp. 252-3) for more discussion of the search for these new axioms.
6. Infinity in Deductive Logic
The infinite appears in many interesting ways in formal deductive logic, and this section presents an introduction to a few of those ways. Among all the various kinds of formal deductive logics, first-order logic (the usual predicate logic) stands out as especially important, in part because of the accuracy and detail with which it can mirror mathematical deductions. First-order logic also stands out because it is the strongest logic that has proofs for every one of its infinitely numerous logically true sentences, and that is compact in the sense that if an infinite set of its sentences is inconsistent, then so is some finite subset. But first-order logic has expressive limitations:
[M]any central concepts—such as finitude, countability, minimal closure, wellfoundedness, and well-order, cannot be captured in a first-order language. The Lowenheim-Skolem theorems entail that no infinite structure can be characterized up to isomorphism in a first-order language. …The aforementioned mathematical notions that lack first-order characterizations all have adequate characterizations in second-order languages.
Stewart Shapiro, Handbook of Philosophical Logic, p. 131.
Let’s be clearer about just what first-order logic is. To answer this and other questions, it is helpful to introduce some technical terminology. Here is a chart of what is ahead:

First-order language: a formal language with quantifiers over objects but not over sets of objects.

First-order theory: a set of sentences expressed in a first-order language.

First-order formal system: a first-order theory plus its method for building proofs.

First-order logic: a first-order language with its method for building proofs.
A first-order theory is a set of sentences expressed in a first-order language (which will be defined below). A first-order formal system is a first-order theory plus its deductive structure (method of building proofs). Intuitively and informally, any formal system is a system of symbols that are manipulated by the logician in game-like fashion for the purpose of more deeply understanding the properties of the structure that is represented by the formal system. The symbols denote elements or features of the structure the formal system is being used to represent.
The term “first-order logic” is ambiguous. It can mean a first-order language with its deductive structure, or a first-order language with its semantics, or the academic discipline that studies first-order languages and theories.
Classical first-order logic is classical predicate logic with its core of classical propositional logic. This logic is distinguished by its satisfying certain classically-accepted assumptions: it has only two truth values (some non-classical logics have an infinite number of truth values); every sentence (that is, proposition) gets exactly one of the two truth values; no sentence can contain an infinite number of symbols; a valid deduction cannot be made from true sentences to a false one; deductions cannot be infinitely long; the domain of an interpretation cannot be empty but can have any infinite cardinality; an individual constant (name) must name something in the domain; and so forth.
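The two-valued assumption is what makes truth-table methods work: with exactly two truth values and finitely many sentence letters, tautologyhood is decidable by checking every valuation. A small sketch (the helper names are our own):

```python
# With exactly two truth values, a propositional sentence can be tested
# for tautologyhood by brute force over all valuations.  The law of
# excluded middle, "S or not S", passes the test.

from itertools import product

def is_tautology(formula, variables):
    """formula maps a valuation (a dict of variable -> bool) to a bool."""
    return all(formula(dict(zip(variables, values)))
               for values in product([True, False], repeat=len(variables)))

excluded_middle = lambda v: v["S"] or not v["S"]
print(is_tautology(excluded_middle, ["S"]))      # True
print(is_tautology(lambda v: v["S"], ["S"]))     # False: "S" alone is no tautology
```

In a logic with more than two truth values, or under intuitionist semantics, this exhaustive two-valued check is no longer available, which is one face of the disagreement over excluded middle discussed earlier.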
A formal language specifies the language’s vocabulary symbols and its syntax, primarily what counts as being a term or name and what are its well-formed formulas (wffs). A first-order language is a formal language whose symbols are the quantifiers (for example, ∃), connectives (↔), individual constants (a), individual variables (x), predicates or relations (R), and perhaps functions (f) and equality (=). It has a denumerable list of variables. (A set is denumerable or countably infinite if it has size ℵ0.) A first-order language has a finite or countably infinite number of predicate symbols and function symbols, but it must have at least one of them. First-order languages differ from each other only in their predicate symbols or function symbols or constant symbols or in having or not having the equality symbol. See (Wolf 2005, p. 23) for more details. There are denumerably many terms, formulas, and sentences. Also, because there are uncountably many real numbers, a theory of real numbers in a first-order language does not have enough names for all the real numbers.
To carry out proofs or deductions in a first-order language, the language needs to be given a deductive structure. There are several different ways to do this (via axioms, natural deduction, sequent calculus), but the ways are all independent of which first-order language is being used, and they all require specifying rules such as modus ponens for how to deduce wffs from finitely many previous wffs in the deduction.
To give some semantics or meaning to its symbols, the first-order language needs a definition of valuation and of truth in a valuation and of validity of an argument. In a propositional logic, the valuation assigns to each sentence letter its own single truth value; in predicate logic each term is given its own denotation (its extension), and each predicate is given a set of objects (its extension) in the domain that satisfy the predicate. The valuation rules then determine the truth values of all the wffs. The valuation’s domain is a set containing all the objects that the terms might denote and that the variables range over. The domain may be of any finite or transfinite size, but the variables can range only over objects in this domain, not over sets of those objects.
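A valuation of this kind is easy to exhibit for a toy language: fix a finite domain, assign the predicate an extension, and let the quantifiers range over the domain’s objects, and only over objects, never over sets of them. A sketch with names of our own choosing:

```python
# A valuation for a tiny first-order language: a finite domain plus an
# extension for the single predicate F.  The quantifiers range over the
# objects in the domain.

domain = {"a", "b", "c"}
F = {"a", "b"}   # the extension of F: the objects that satisfy F

forall_x_Fx = all(x in F for x in domain)   # "for all x, Fx"
exists_x_Fx = any(x in F for x in domain)   # "there is an x such that Fx"
print(forall_x_Fx, exists_x_Fx)             # False True
```

The valuation rules fix a truth value for each sentence once the domain and the extensions are given; a different choice of domain or of extension for F yields a different valuation of the same language.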
Tarski, who was influential in giving an appropriate, rigorous definition to first-order language, was always bothered by the tension between his nominalist view of language as the product of human activity, which is finite, and his view that intellectual progress in logic and mathematics requires treating a formal language as having infinite features such as an infinity of sentences. This article does not explore how this tension can be eased, or whether it should be.
Because a first-order language cannot successfully express sentences that generalize over sets (or properties or classes or relations) of the objects in the domain, it cannot, for example, adequately express Leibniz’s Law that, “If objects a and b are identical, then they have the same properties.” A second-order language can do this. A language is second-order if in addition to quantifiers on variables that range over objects in the domain it also has quantifiers (such as the universal quantifier ∀P) on a second kind of variable P that ranges over properties (or classes or relations) of these objects. Here is one way to express Leibniz’s Law in second-order logic:
(a = b) → ∀P(Pa ↔ Pb)
P is called a predicate variable or property variable. Every valid deduction in first-order logic is also valid in second-order logic. A language is third-order if it has quantifiers on variables that range over properties of properties of objects (or over sets of sets of objects), and so forth. A language is called higher-order if its order is second-order or higher.
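Over a finite domain, a second-order quantifier such as ∀P can be simulated by brute force, because a property of domain objects can be identified with a subset of the domain, and a finite domain has only finitely many subsets. Here is a sketch (the function names are invented for illustration) that checks the right-hand side of Leibniz’s Law this way:

```python
from itertools import chain, combinations

# Over a finite domain, ∀P ranges over the powerset of the domain:
# every property P is identified with the subset of objects having it.

def powerset(domain):
    s = list(domain)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def indiscernible(a, b, domain):
    """True iff ∀P(Pa ↔ Pb), with P ranging over all subsets of domain."""
    return all((a in P) == (b in P) for P in map(set, powerset(domain)))

domain = {1, 2, 3}
print(indiscernible(2, 2, domain))  # True: identical objects share all properties
print(indiscernible(1, 2, domain))  # False: the property {1} separates them
```

This brute-force trick is unavailable for infinite domains, which is one way to see why second-order quantification genuinely outruns first-order resources there.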
The definition of first-order theory given earlier in this section was that it is any set of wffs in a first-order language. A more ordinary definition adds that it is closed under deduction. This additional requirement implies that every deductive consequence of some sentences of the theory also is in the theory. Since the consequences are countably infinite, all ordinary first-order theories are countably infinite.
If the language is not explicitly mentioned for a first-order theory, then it is generally assumed that the language is the smallest first-order language that contains all the sentences of the theory. Valuations of the language in which all the sentences of the theory are true are said to be models of the theory.
If the theory is axiomatized, then in addition to the logical axioms there are proper axioms (also called non-logical axioms); these axioms are specific to the theory (and so usually do not hold in other first-order theories). For example, Peano’s axioms when expressed in a first-order language are proper axioms for the formal theory of arithmetic, but they are not logical axioms or logical truths. See (Wolf, 2005, pp. 32-3) for specific proper axioms of Peano Arithmetic and for proofs of some of its important theorems.
Besides the above problem about Leibniz’s Law, there is a related problem about infinity that occurs when Peano Arithmetic is expressed as a first-order theory. Gödel’s First Incompleteness Theorem proves that there are some bizarre truths which are independent of first-order Peano Arithmetic (PA), and so cannot be deduced within PA. None of these truths so far are known to lie in mainstream mathematics. But they might. And there is another reason to worry about the limitations of PA. Because the set of sentences of PA is only countable, whereas there are uncountably many sets of numbers in informal arithmetic, it might be that PA is inadequate for expressing and proving some important theorems about sets of numbers. See (Wolf 2005, pp. 33-4, 225).
It seems that all the important theorems of arithmetic and the rest of mathematics can be expressed and proved in another first-order theory, Zermelo-Fraenkel set theory with the axiom of choice (ZFC). Unlike first-order Peano Arithmetic, ZFC needs only a very simple first-order language that surprisingly has no undefined predicate symbol, equality symbol, or function symbol, other than a single two-place relation symbol intended to represent set membership. The domain is intended to be composed only of sets, but since mathematical objects can be defined to be sets, the domain contains these mathematical objects.
a. Finite and Infinite Axiomatizability
In the process of axiomatizing a theory, any sentence of the theory can be called an axiom. When axiomatizing a theory, there is no problem with having an infinite number of axioms so long as the set of axioms is decidable, that is, so long as there is a finitely long computation or mechanical procedure for deciding, for any sentence, whether it is an axiom.
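The notion of a decidable axiom set can be made concrete with a toy example. The scheme below ("n + 0 = n" for every numeral, written with S for successor) is an invented illustration, not an official axiomatization: it has infinitely many instances, yet a finite mechanical check settles whether any given sentence is one of them.

```python
import re

# A toy axiom scheme with infinitely many instances: "S...S0 + 0 = S...S0",
# one axiom per numeral. The set of axioms is infinite, but membership in
# it is decidable by a finite computation, as the definition requires.

AXIOM = re.compile(r'^(S*0) \+ 0 = (S*0)$')

def is_axiom(sentence: str) -> bool:
    """Decide, in finitely many steps, whether sentence is an axiom."""
    m = AXIOM.match(sentence)
    return bool(m) and m.group(1) == m.group(2)

print(is_axiom("SS0 + 0 = SS0"))  # True: an instance of the scheme
print(is_axiom("SS0 + 0 = S0"))   # False: not an instance
```

The point is that what matters for axiomatizability is not the size of the axiom set but the existence of such a decision procedure.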
Logicians are curious as to which formal theories can be finitely axiomatized in a given formal system and which can only be infinitely axiomatized. Group theory is finitely axiomatizable in classical first-order logic, but Peano Arithmetic and ZFC are not. Peano Arithmetic is not finitely axiomatizable because it requires an axiom scheme for induction. An axiom scheme is a countably infinite number of axioms of similar form, and an axiom scheme for induction would be an infinite number of axioms of the following form (expressed here informally): “If property P of natural numbers holds for zero, and also holds for n+1 whenever it holds for natural number n, then P holds for all natural numbers.” There needs to be a separate axiom for every property P, and there is a countably infinite number of these properties expressible in a first-order language of elementary arithmetic.
Assuming ZF is consistent, ZFC is not finitely axiomatizable in first-order logic, as Richard Montague discovered. Nevertheless, ZFC is a subset of von Neumann–Bernays–Gödel set theory (NBG), and the latter is finitely axiomatizable, as Paul Bernays discovered. The first-order theory of Euclidean geometry is not finitely axiomatizable, and the second-order logic used in (Field 1980) to reconstruct mathematical physics without quantifying over numbers also is not finitely axiomatizable. See (Mendelson 1997) for more discussion of finite axiomatizability.
b. Infinitely Long Formulas
An infinitary logic is a logic that makes one of classical logic’s necessarily finite features be infinite. In the languages of classical first-order logic, every formula is required to be only finitely long, but an infinitary logic might relax this. The original, intuitive idea behind requiring finitely long sentences in classical logic was that logic should reflect the finitude of the human mind. But with increasing opposition to psychologism in logic, that is, to making logic somehow dependent on human psychology, researchers began to ignore the finitude restrictions. Löwenheim in about 1915 was perhaps the pioneer here. In 1957, Alfred Tarski and Dana Scott explored permitting the operations of conjunction and disjunction to link infinitely many formulas into an infinitely long formula. Tarski also suggested allowing formulas to have a sequence of quantifiers of any transfinite length. William Hanf proved in 1964 that, unlike classical logics, these infinitary logics fail to be compact. See (Barwise 1975) for more discussion of these developments.
c. Infinitely Long Proofs
Classical formal logic requires any proof to contain a finite number of steps. In the mid-20th century with the disappearance of psychologism in logic, researchers began to investigate logics with infinitely long proofs as an aid to simplifying consistency proofs. See (Barwise 1975).
d. Infinitely Many Truth Values
One reason for permitting an infinite number of truth-values is to represent the idea that truth is a matter of degree. The intuitive idea is that, say, depending on the temperature, the truth of “This water is warm” might be definitely true, less true, even less true, and so forth.
One of the simplest infinite-valued semantics uses a continuum of truth values. Its valuations assign to each basic sentence (a formal sentence that contains no connectives or quantifiers) a truth value that is a specific number in the closed interval of real numbers from 0 to 1. The truth-value of the vague sentence “This water is warm” is understood to be definitely true if it has the truth value 1 and definitely false if it has the truth value 0. To a sentence P having main connectives, the valuation assigns to the negation ~P the truth value of one minus the truth value assigned to P. It assigns to the conjunction P & Q the minimum of the truth values of P and of Q. It assigns to the disjunction P v Q the maximum of the truth values of P and of Q, and so forth.
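These valuation rules are simple enough to state as code. The following sketch implements exactly the clauses just given (the degree values 0.7 and 0.1 are illustrative assumptions):

```python
# The infinite-valued semantics from the text: truth values are reals in
# [0, 1]; negation is 1 minus the value, conjunction takes the minimum,
# and disjunction takes the maximum.

def v_not(p):    return 1 - p
def v_and(p, q): return min(p, q)
def v_or(p, q):  return max(p, q)

warm = 0.7     # "This water is warm" is fairly true (assumed degree)
boiling = 0.1  # "This water is boiling" is nearly false (assumed degree)

print(v_and(warm, v_not(boiling)))  # min(0.7, 0.9) = 0.7
print(v_or(warm, boiling))          # max(0.7, 0.1) = 0.7
```

When every basic sentence happens to have value 0 or 1, these rules collapse back into the classical two-valued truth tables, so the infinite-valued semantics is a genuine generalization rather than a replacement.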
One advantage of using an infinite-valued semantics is that by permitting modus ponens to produce a conclusion that is slightly less true than either premise, we can create a solution to the paradox of the heap, the sorites paradox. One disadvantage is that there is no well-motivated choice for the specific real number that is the truth value of a vague statement. What is the truth value appropriate to “This water is warm” when the temperature is 100 degrees Fahrenheit and you are interested in cooking pasta in it? Is the truth value 0.635? This latter problem of assigning truth values to specific sentences without being arbitrary has led to the development of fuzzy logics in place of the simpler infinite-valued semantics we have been considering. Lotfi Zadeh suggested that instead of vague sentences having any of a continuum of precise truth values we should make the continuum of truth values themselves imprecise. His suggestion was to assign a sentence a truth value that is a fuzzy set of numerical values, a set for which membership is a matter of degree. For more details, see (Nolt 1997, pp. 420-7).
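The sorites-dissolving idea can be sketched numerically. One simple way to model a modus ponens step that loses a little truth is to multiply degrees of truth; both the multiplicative rule and the 0.999 figure below are illustrative assumptions, not the only choices:

```python
# Sorites sketch: each conditional "if n grains are not a heap, then
# neither are n+1 grains" is almost, but not perfectly, true. If each
# application of modus ponens can lose a little truth, the conclusion's
# degree of truth drains away over many steps instead of staying at 1.

premise = 1.0        # "1 grain is not a heap": definitely true
conditional = 0.999  # each inductive step: almost true (assumed degree)

degree = premise
for _ in range(10_000):  # ten thousand applications of modus ponens
    degree *= conditional

print(degree < 0.001)  # True: "10,001 grains is not a heap" is near-false
```

So the long chain of nearly-true steps no longer forces the absurd conclusion to be fully true, which is the proposed resolution of the paradox.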
e. Infinite Models
A countable language is a language with countably many symbols. The Löwenheim-Skolem Theorem says:
If a first-order theory in a countable language has an infinite model, then it has a countably infinite model.
This is a surprising result about infinity. Would you want your theory of real numbers to have a countable model? Strictly speaking, it is a puzzle and not a paradox, because being countably infinite is a property the model has only when viewed from outside the object language, not from within it. The theorem does not imply that first-order theories of real numbers must have no more real numbers than there are natural numbers.
The Löwenheim-Skolem Theorem can be extended to say that if a theory in a countable language has a model of some infinite size, then it also has models of any infinite size. This is a limitation on first-order theories; they do not permit having a categorical theory of an infinite structure. A formal theory is said to be categorical if any two models satisfying the theory are isomorphic. The two models are isomorphic if they have the same structure, and they can’t be isomorphic if they have different sizes. So, if you create a first-order theory intended to describe a single infinite structure of a certain size, the theory will end up having, for any infinite size, a model of that size. This frustrates the hopes of anyone who would like to have a first-order theory of arithmetic that has models only of size ℵ0, and to have a first-order theory of real numbers that has models only of size 2ℵ0. See (Enderton 1972, pp. 142-3) for more discussion of this limitation.
Because of this limitation, many logicians have turned to second-order logics. There are second-order categorical theories for the natural numbers and for the real numbers. Unfortunately, there is no sound and complete deductive structure for any second-order logic having a decidable set of axioms. This is a major negative feature of second-order logics.
To illustrate one more surprise regarding infinity in formal logic, notice that the quantifiers are defined in terms of their domain, the domain of discourse. In a first-order set theory, the expression ∃xPx says there exists some set x in the infinite domain of all the sets such that x has property P. Unfortunately, in ZF there is no set of all sets to serve as this domain. So, it is oddly unclear what the expression ∃xPx means when we intend to use it to speak about sets.
f. Infinity and Truth
According to Alfred Tarski’s Undefinability Theorem, in any first-order language rich enough to express arithmetic, a global truth predicate is not definable. A global truth predicate is a predicate that is satisfied by all and only the names (via, say, Gödel numbering) of all the true sentences of the formal language. According to Tarski, since no single language has a global truth predicate, the best approach to expressing truth formally within the language is to expand the language into an infinite hierarchy of languages, with each higher language (the metalanguage) containing a truth predicate that can apply to all and only the true sentences of languages lower in the hierarchy. This process is iterated into the transfinite to obtain Tarski’s hierarchy of metalanguages. Some philosophers have suggested that this infinite hierarchy is implicit within natural languages such as English, but other philosophers, including Tarski himself, believe an informal language does not contain within it a formal language.
To better handle the concept of truth formally, Saul Kripke rejects the infinite hierarchy of metalanguages in favor of an infinite hierarchy of interpretations (that is, valuations) of a single language, such as a first-order predicate calculus with enough apparatus to discuss its own syntax. This language’s intended truth predicate T is the only basic (atomic) predicate that is ever partially interpreted at any stage of the hierarchy. At the first step in the hierarchy, all predicates but the single one-place predicate T(x) are interpreted. T(x) is completely uninterpreted at this level. As we go up the hierarchy, the interpretations of the other basic predicates are unchanged, but T is satisfied by the names of sentences that were true at lower levels. For example, at the second level, T is satisfied by the name of the sentence ∀x(Fx v ~Fx). At each step in the hierarchy, more sentences get truth values, but any sentence that has a truth value at one level has that same truth value at all higher levels. T almost becomes a global truth predicate when the inductive interpretation-building reaches the first so-called fixed point level. At this countably infinite level, although T is a truth predicate for all those sentences having one of the two classical truth values, the predicate is not quite satisfied by the names of every true sentence because it is not satisfied by the names of some of the true sentences containing T. At this fixed point level, the Liar sentence (of the Liar Paradox) is still neither true nor false. For this reason, the Liar sentence is said to fall into a “truth gap” in Kripke’s theory of truth. See (Kripke, 1975).
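The flavor of this construction can be captured in a toy model. The four sentences and their encoding below are invented for illustration; the point is that repeatedly re-evaluating with a partial truth predicate lets grounded sentences settle into classical values while the Liar stays gapped at the fixed point:

```python
# A toy Kripke-style fixed-point construction over four sentences.
# T('s') means "sentence s is true"; it returns None (a gap) for any
# sentence T is still silent about. We re-evaluate everything until
# nothing changes, which is the fixed point.

SENTENCES = {
    'snow': ('atom', True),          # an ordinary true sentence
    's1':   ('T', 'snow'),           # "'snow' is true"
    's2':   ('T', 's1'),             # "'s1' is true"
    'liar': ('not', ('T', 'liar')),  # "this sentence is not true"
}

def evaluate(formula, truth):
    """Evaluate with gaps: truth maps names to True, False, or None."""
    op = formula[0]
    if op == 'atom':
        return formula[1]
    if op == 'T':
        return truth[formula[1]]     # None while T is silent about it
    if op == 'not':
        v = evaluate(formula[1], truth)
        return None if v is None else not v

truth = {name: None for name in SENTENCES}  # T starts fully uninterpreted
while True:
    new = {name: evaluate(f, truth) for name, f in SENTENCES.items()}
    if new == truth:
        break                                # fixed point reached
    truth = new

print(truth)
# {'snow': True, 's1': True, 's2': True, 'liar': None}
```

Sentences grounded (directly or via a finite chain) in non-semantic facts acquire classical values level by level, while the Liar, which is never grounded, falls into the truth gap.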
(Yablo 1993) produced a semantic paradox somewhat like the Liar Paradox. It is commonly called the Yablo Paradox. Yablo claimed there is no way to coherently assign a truth value to any of the sentences in the countably infinite sequence of sentences of the form, “None of the subsequent sentences are true.” Ask yourself whether the first sentence in the sequence could be true. Notice that no sentence overtly refers to itself. There is controversy in the literature about whether the paradox actually contains a hidden appeal to self-reference, and there has been some investigation of the parallel paradox in which “true” is replaced by “provable.” See (Beall 2001).
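A brute-force check of finite truncations of Yablo’s list makes the puzzle vivid. Every finite truncation turns out to have exactly one consistent assignment, with only the last sentence true, yet that pattern cannot survive extending the list, which hints at why the infinite list admits no consistent assignment at all:

```python
from itertools import product

# Yablo's list, truncated to n sentences: sentence k says "none of the
# later sentences in the list are true." We brute-force every possible
# True/False assignment and keep the consistent ones.

def consistent(assignment):
    """Each sentence's value must equal 'no later sentence is true'."""
    n = len(assignment)
    return all(assignment[k] == (not any(assignment[k + 1:]))
               for k in range(n))

n = 4
solutions = [a for a in product([True, False], repeat=n) if consistent(a)]
print(solutions)  # [(False, False, False, True)]: only the last is true
```

In the infinite list there is no last sentence to play this role, so the finite escape route disappears.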
7. Conclusion
There are many aspects of the infinite that this article does not cover. Here are some of them: an upper limit on the amount of information in the universe, renormalization in quantum field theory, supertasks and infinity machines, categorematic and syncategorematic uses of the word “infinity,” mereology, ordinal and cardinal arithmetic in ZF, the various non-ZF set theories, non-standard solutions to Zeno’s Paradoxes, Cantor’s arguments for the Absolute, Kant’s views on the infinite, quantifiers that assert the existence of uncountably many objects, and the detailed arguments for and against constructivism, intuitionism, and finitism. For more discussion of these latter three programs, see (Maddy 1992).
8. References and Further Reading
Ahmavaara, Y. (1965). “The Structure of Space and the Formalism of Relativistic Quantum Theory,” Journal of Mathematical Physics, 6, 87-93.
Uses finite arithmetic in mathematical physics, and argues that this is the correct arithmetic for science.
Barrow, John D. (2005). The Infinite Book: A Short Guide to the Boundless, Timeless and Endless. Pantheon Books, New York.
An informal and easy-to-understand survey of the infinite in philosophy, theology, science, and mathematics. Says which Western philosopher throughout the centuries said what about infinity.
Barwise, Jon. (1975) “Infinitary Logics,” in Modern Logic: A Survey, E. Agazzi (ed.), Reidel, Dordrecht, pp. 93-112.
An introduction to infinitary logics that emphasizes historical development.
Beall, J. C. (2001). “Is Yablo’s Paradox Non-Circular?” Analysis, Vol. 61, pp. 176-187.
Discusses the controversy over whether the Yablo Paradox is or isn’t indirectly circular.
Cantor, Georg. (1887). “Über die verschiedenen Ansichten in Bezug auf die actualunendlichen Zahlen.” Bihang till Kongl. Svenska Vetenskaps-Akademien Handlingar, Bd. 11 (1886-7), article 19. P. A. Norstedt & Söner: Stockholm.
A very early description of set theory and its relationship to old ideas about infinity.
Chihara, Charles. (1973). Ontology and the Vicious-Circle Principle. Ithaca: Cornell University Press.
Pages 63-65 give Chihara’s reasons for why the Gödel-Cohen independence results are evidence against mathematical Platonism.
Chihara, Charles. (2008). “The Existence of Mathematical Objects,” in Proof & Other Dilemmas: Mathematics and Philosophy, Bonnie Gold & Roger A. Simons, eds., The Mathematical Association of America.
In chapter 7, Chihara provides a fine survey of the ontological issues in mathematics.
Deutsch, David. (2011). The Beginning of Infinity: Explanations that Transform the World. Penguin Books, New York City.
Emphasizes the importance of successful explanation in understanding the world, and provides new ideas on the nature and evolution of our knowledge.
Descartes, René. (1641). Meditations on First Philosophy.
The third meditation says, “But these properties [of God] are so great and excellent, that the more attentively I consider them the less I feel persuaded that the idea I have of them owes its origin to myself alone. And thus it is absolutely necessary to conclude, from all that I have before said, that God exists….”
Dummett, Michael. (1977). Elements of Intuitionism. Oxford University Press, Oxford.
A philosophically rich presentation of intuitionism in logic and mathematics.
Elwes, Richard. (2010). Mathematics 1001: Absolutely Everything That Matters About Mathematics in 1001 Bite-Sized Explanations, Firefly Books, Richmond Hill, Ontario.
Contains the quoted debate between Harvey Friedman and a leading ultrafinitist.
Enderton, Herbert B. (1972). A Mathematical Introduction to Logic. Academic Press: New York.
An introduction to deductive logic that presupposes the mathematical sophistication of an advanced undergraduate mathematics major. The corollary proved on p. 142 says that if a theory in a countable language has a model of some infinite size, then it also has models of any infinite size.
Feferman, Anita Burdman, and Solomon. (2004) Alfred Tarski: Life and Logic, Cambridge University Press, New York.
A biography of Alfred Tarski, the 20th century Polish and American logician.
Field, Hartry. (1980). Science Without Numbers: A Defense of Nominalism. Princeton: Princeton University Press.
Field’s program is to oppose the Quine-Putnam Indispensability argument which apparently implies that mathematical physics requires the existence of mathematical objects such as numbers and sets. Field tries to reformulate scientific theories so, when they are formalized in second-order logic, their quantifiers do not range over abstract mathematical entities. Field’s theory uses quantifiers that range over spacetime points. However, because it uses a second-order logic, the theory is also committed to quantifiers that range over sets of spacetime points, and sets are normally considered to be mathematical objects.
Gödel, Kurt. (1947/1983). “What is Cantor’s Continuum Problem?” American Mathematical Monthly 54, 515-525. Revised and reprinted in Philosophy of Mathematics: Selected Readings, Paul Benacerraf and Hilary Putnam (eds.), Prentice-Hall, Inc. Englewood Cliffs, 1964.
Gödel argues that the failure of ZF to provide a truth value for Cantor’s continuum hypothesis implies a failure of ZF to correctly describe the Platonic world of sets.
Greene, Brian. (2004). The Fabric of the Cosmos: Space, Time, and the Texture of Reality. Random House, Inc., New York.
Promotes the virtues of string theory.
Greene, Brian (1999). The Elegant Universe. Vintage Books, New York.
The quantum field theory called quantum electrodynamics (QED) is discussed on pp. 121-2.
Greene, Brian. (2011). The Hidden Reality: Parallel Universes and the Deep Laws of the Cosmos. Vintage Books, New York.
A popular survey of cosmology with an emphasis on string theory.
Hawking, Stephen. (2001). The Illustrated A Brief History of Time: Updated and Expanded Edition. Bantam Dell. New York.
Chapter 4 of Brief History contains an elementary and non-mathematical introduction to quantum mechanics and Heisenberg’s uncertainty principle.
Hilbert, David. (1925). “On the Infinite,” in Philosophy of Mathematics: Selected Readings, Paul Benacerraf and Hilary Putnam (eds.), Prentice-Hall, Inc. Englewood Cliffs, 1964. 134-151.
Hilbert promotes what is now called the Hilbert Program for solving the problem of the infinite by requiring a finite basis for all acceptable assertions about the infinite.
Kleene, Stephen Cole. (1967). Mathematical Logic. John Wiley & Sons: New York.
An advanced textbook in mathematical logic.
Kripke, Saul. (1975). “Outline of a Theory of Truth,” Journal of Philosophy 72, pp. 690–716.
Describes how to create a truth predicate within a formal language that avoids assigning a truth value to the Liar Sentence.
Leibniz, Gottfried. (1702). “Letter to Varignon, with a Note on the ‘Justification of the Infinitesimal Calculus by that of Ordinary Algebra,’” pp. 542-6. In Leibniz Philosophical Papers and Letters, translated by Leroy E. Loemker (ed.). D. Reidel Publishing Company, Dordrecht, 1969.
Leibniz defends the actual infinite in calculus.
Levinas, Emmanuel. (1961). Totalité et Infini. The Hague: Martinus Nijhoff.
In Totality and Infinity, the Continental philosopher Levinas describes infinity in terms of the possibilities a person confronts upon encountering other conscious beings.
Maddy, Penelope. (1992). Realism in Mathematics. Oxford: Oxford University Press.
A discussion of the varieties of realism in mathematics and the defenses that have been, and could be, offered for them. The book is an extended argument for realism about mathematical objects. She offers a set-theoretic monism in which all physical objects are sets.
Maor, E. (1991). To Infinity and Beyond: A Cultural History of the Infinite. Princeton: Princeton University Press.
A survey of many of the issues discussed in this encyclopedia article.
Mendelson, Elliott. (1997). An Introduction to Mathematical Logic, 4th ed. London: Chapman & Hall.
Pp. 225–86 discuss NBG set theory.
Mill, John Stuart. (1843). A System of Logic: Ratiocinative and Inductive. Reprinted in J. M. Robson, ed., Collected Works, volumes 7 and 8. Toronto: University of Toronto Press, 1973.
Mill argues for empiricism and against accepting the references of theoretical terms in scientific theories if the terms can be justified only by the explanatory success of those theories.
Moore, A. W. (2001). The Infinite. Second edition, Routledge, New York.
A popular survey of the infinite in metaphysics, mathematics, and science.
Mundy, Brent. (1990). “Mathematical Physics and Elementary Logic,” Proceedings of the Biennial Meeting of the Philosophy of Science Association. Vol. 1990, Volume 1. Contributed Papers (1990), pp. 289-301.
Discusses the relationships among set theory, logic and physics.
Nolt, John. (1997). Logics. Belmont, CA: Wadsworth Publishing Company.
An undergraduate logic textbook containing in later chapters a brief introduction to non-standard logics such as those with infinite-valued semantics.
Norton, John. (2012). “Approximation and Idealization: Why the Difference Matters,” Philosophy of Science, 79, pp. 207-232.
Recommends being careful about the distinction between approximation and idealization in science.
Owen, H. P. (1967). “Infinity in Theology and Metaphysics.” In Paul Edwards (Ed.) The Encyclopedia of Philosophy, volume 4, pp. 190-3.
This survey of the topic is still reliable.
Parsons, Charles. (1980). “Quine on the Philosophy of Mathematics.” In L. Hahn and P. Schilpp (Eds.) The Philosophy of W. V. Quine, pp. 396-403. La Salle IL: Open Court.
Argues against Quine’s position that whether a mathematical entity exists depends on the indispensability of the mathematical term denoting that entity in a true scientific theory.
Penrose, Roger. (2005). The Road to Reality: A Complete Guide to the Laws of the Universe. New York: Alfred A. Knopf. Originally published in London, Great Britain in 2004.
A fascinating book about the relationship between mathematics and physics. Many of its chapters assume sophistication in advanced mathematics.
Posy, Carl. (2005). “Intuitionism and Philosophy.” In Stewart Shapiro. Ed. (2005). The Oxford Handbook of Philosophy of Mathematics and Logic. Oxford: Oxford University Press.
The history of the intuitionism of Brouwer, Heyting and Dummett. Pages 330-1 explain how Brouwer uses choice sequences to develop “even the infinity needed to produce a continuum” non-empirically.
Quine, W. V. (1960). Word and Object. Cambridge: MIT Press.
Chapter 7 introduces Quine’s viewpoint that set-theoretic objects exist because they are needed on the basis of our best scientific theories.
Quine, W. V. (1986). The Philosophy of W. V. Quine. Editors: Lewis Edwin Hahn and Paul Arthur Schilpp, Open Court, LaSalle, Illinois.
Contains the quotation saying infinite sets exist only insofar as they are needed for scientific theory.
Robinson, Abraham. (1966). Non-Standard Analysis. Amsterdam: North-Holland Publishing Company.
Robinson’s original theory of the infinitesimal and its use in real analysis to replace the Cauchy-Weierstrass methods that use epsilons and deltas.
Rucker, Rudy. (1982). Infinity and the Mind: The Science and Philosophy of the Infinite. Birkhäuser: Boston.
A survey of set theory with much speculation about its metaphysical implications.
Russell, Bertrand. (1914). Our Knowledge of the External World as a Field for Scientific Method in Philosophy. Open Court Publishing Co.: Chicago.
Russell champions the use of contemporary real analysis and physics in resolving Zeno’s paradoxes. Chapter 6 is “The Problem of Infinity Considered Historically,” and that chapter is reproduced in (Salmon, 1970).
Salmon, Wesley C., ed. (1970). Zeno’s Paradoxes. Indianapolis: The Bobbs-Merrill Company, Inc.
A collection of the important articles on Zeno’s Paradoxes plus a helpful and easy-to-read preface providing an overview of the issues.
Shapiro, Stewart. (2001). “Systems between First-Order and Second-Order Logics,” in D. M. Gabbay and F. Guenthner (eds.), Handbook of Philosophical Logic, 2nd Edition, Volume I, Kluwer Academic Publishers, pp. 131-187.
Surveys first-order logics, second-order logics, and systems between them.
Smullyan, Raymond. (1967). “Continuum Problem,” in Paul Edwards (ed.), The Encyclopedia of Philosophy, Macmillan Publishing Co. & The Free Press: New York.
Discusses the variety of philosophical reactions to the discovery of the independence of the continuum hypotheses from ZF set theory.
Suppes, Patrick. (1960). Axiomatic Set Theory. D. Van Nostrand Company, Inc.: Princeton.
An undergraduate-level introduction to set theory.
Tarski, Alfred. (1924). “Sur les Ensembles Finis,” Fundamenta Mathematicae, Vol. 6, pp. 45-95.
Surveys and evaluates alternative definitions of finitude and infinitude proposed by Zermelo, Russell, Sierpinski, Kuratowski, Tarski, and others.
Wagon, Stan. (1985). The Banach-Tarski Paradox. Cambridge University Press: Cambridge.
The unintuitive Banach-Tarski Theorem says a solid sphere can be decomposed into a finite number of parts and then reassembled into two solid spheres of the same radius as the original sphere. Unfortunately, you cannot double your sphere of solid gold this way.
Wilder, Raymond L. (1965) Introduction to the Foundations of Mathematics, 2nd ed., John Wiley & Sons, Inc.: New York.
An undergraduate-level introduction to the foundation of mathematics.
Wolf, Robert S. (2005). A Tour through Mathematical Logic. The Mathematical Association of America: Washington, D.C.
Chapters 2 and 6 describe set theory and its historical development. Both the history of the infinitesimal and the development of Robinson’s nonstandard model of analysis are described clearly on pages 280-316.
Yablo, Stephen. (1993). “Paradox without Self-Reference.” Analysis 53: 251-52.
Yablo presents a Liar-like paradox involving an infinite sequence of sentences that, the author claims, is “not in any way circular,” unlike with the traditional Liar Paradox.
Author Information
Bradley Dowden
Email: dowden@csus.edu
California State University Sacramento
U. S. A.
Copyediting Guidelines
It is not the responsibility of a copy editor to turn a poorly written article into a well written article, but only to ensure clarity, revise obvious errors, and enforce the Encyclopedia’s specific style guidelines. Be a minimalist, and do not correct all instances of inelegance.
Here is an overview of the copyediting process from the volunteer’s viewpoint. The General Editor (probably Brad Dowden) will send you an original article that has been approved by the professional referees. Normally this will be a Microsoft Word file. Your goal is to produce a copyedited version with a new filename and send it back in under four weeks. There is normally no need to defend your changes.
When you are ready to begin copyediting, open the copy you have made of your article in Microsoft Word and click on Review | TrackChanges | All Markup. Turn off right justification. Then revise (that is, mark up) the document to improve it. Normally you do not need to defend your changes; but if you want feedback on your change, use the Insert | Comment feature of Word. Save your work frequently.
Do one last review of your changes by choosing Review | Tracking | Final | No Markup (without accepting the changes) so you can see for yourself how the article will look if all your changes were to be accepted. That last step often helps to reveal new imperfections that you will need to fix. Do not accept your own changes; otherwise, the changes will be made, but the evidence that they were made will be deleted. Then look over the article once more to see whether any new errors remain to be fixed.
After your marked-up article is received by the General Editor, it will then be read and approved, perhaps with a revision or two, and sent on to a formatter to produce a formatted version that is temporarily posted within the Encyclopedia. The General Editor will acknowledge receipt of your work, but if you do not get an acknowledgement within a week, send a query.
You will receive another notification when the article is posted. This notification will normally be a cc of an email sent to the area editor indicating that he or she can pass along the good news to the author that the article is posted.
Your identity is never revealed to the author.
When you are available to copyedit another article, let the General Editor know.
2. Existing Articles
For articles that already have been published, but for some reason were not properly copyedited, the General Editor will advise you on the proper procedure for copyediting. Normally this involves using an HTML editor rather than a word processor.
3. Depth or Extent of Copyediting
Unfortunately, academic referees normally confine their remarks to accuracy of the author’s claims and not to grammar, clarity or elegance, so you are our line of defense on those issues.
When you notice obvious grammar errors and semantic errors, go ahead and fix them on your new copy of the article.
Here is one example. The author writes:
Certainty is the key criterion here, as for al-Ghazalī there can be no room for doubt in a genuine item of knowledge (for an in depth analysis of al-Ghazālī’s conception of absolute certainty see Hadisi 2022).
The sentence above has about nine errors. You might rewrite it as:
Certainty is the key criterion here. As for al-Ghazālī, there can be no room for doubt in a genuine item of knowledge. (For an in-depth analysis of al-Ghazālī’s conception of absolute certainty, see (Hadisi 2022).)
Changes should be made on your own new copy of the article. Do not bother to add any side comments describing and defending what you did. If you miss some of these errors, do not worry, because they are very minor.
Here is the list of rules that all good copy editors must follow:
1. Be more or less specific.
2. Use not bad grammars.
3. Proofread carefully to see if you any words out.
4. Don’t use no double negatives.
5. Avoid tumbling off the cliff of triteness into the dark abyss of overused metaphors.
6. Take care that your verb and your subject is in agreement.
7. No sentence fragments.
8. Placing a comma between subject and predicate, is not correct.
9. Who needs rhetorical questions?
10. Use the apostrophe in it’s proper place.
11. Avoid colloquial stuff, like totally.
12. Avoid those run-on sentences you know the ones they stop and then start again they must be separated with semicolons.
13. The passive voice should be used infrequently.
14. And don’t start sentences with a conjunction.
15. Excessive use of exclamation points can be disastrous!!!!
16. Exaggeration is a million times worse than understatement.
17. Stamp out and eliminate redundancy because, if you reread your work, you will find on rereading that a great deal of repetition can be avoided by rereading and editing, so reread your work and improve it by editing out the repetition you noticed during the rereading.
18. Tis incumbent upon one to employ the vernacular and eschew archaisms.
19. It’s not O.K. to use ampersands & abbreviations.
20. Parenthetical remarks (however relevant) are usually (but not always) an obstacle for readers (and make it harder on readers even if you’re being careful) who have the task of understanding your work (article, passage, document, and so forth) as they read.
This funny list is not original with the IEP.
4. Avoid Awkward Phrases
Confusing, ambiguous, or otherwise awkward phrases should be rewritten for clarity. If you notice that rewriting is needed but are unsure how to do the rewriting yourself, then indicate the issue in a side comment in your Word document.
Example:
Noting these after-affects, one has to wonder how human experience can be anything but an ineffectual, spectatorial undergoing.
The last two words are confusing. They can be rewritten this way:
…spectator process.
Maybe you also noticed that the term after-affects is misspelled; it should be after-effects.
5. Style Preferences
a. Author Guidelines
Become familiar with the author guidelines so that you have a good sense of what we expect from our authors. You are the enforcer of those guidelines. The IEP prefers the Chicago Manual of Style for its documentation style, but if your author has already written the article in APA (American Psychological Association) style or MLA (Modern Language Association) style or some other coherent style, then you can use that style. And if the author has written the article in New Zealand English or some other non-American English, then do not convert it into American English.
b. Opening Summary
All articles must begin with a summary of 200 to 500 words. If the summary is absent or outside this range, then add a comment. The summary must not be mentioned in the table of contents, nor may it have its own heading.
It is OK for the summary to be broken into paragraphs.
Quotations used in the summary should not be given detailed citations; just say who said it. In this sense, the IEP is more like other encyclopedias or Scientific American Magazine than like philosophy journal articles. For example, edit as follows:
Some versions of enactivism – such as those put forward by Thompson (2005, 2007, 2011a, 2011b, 2016) and Di Paolo and others (Di Paolo 2005, Di Paolo et al. 2017, 2018) focus on expanding and developing core ideas of original formulation of enactivism advanced by Varela, Thompson and Rosch. Other versions of enactivism, such as sensorimotor knowledge enactivism (O’Regan & Noë, 2001, Myin & O’Regan 2002, Noë 2004, 2009, 2012; Degenaar & O’Regan 2017, Noë 2021) and radical enactivism (Hutto 2005, Menary 2006, Hutto & Myin 2013, 2017, 2021), incorporate other ideas and influences in their articulation of enactivism, sometimes leaving aside and sometimes challenging core assumptions of the original version of enactivism.
The IEP style is to refer to one of its own articles not as an entry but as an article.
If the author uses the future tense to talk about what is coming in later paragraphs, change it to the present tense. For example, the summary should state that section 3 covers Einstein’s position rather than will cover Einstein’s position.
c. Table of Contents
The opening summary must be followed by a table of contents that indicates the section headings and sub-headings of the article. If this is not the case in the article you are copyediting, go ahead and create the table of contents yourself. Follow the author guidelines, and allow no more depth of sub-headings than those shown in the author guidelines.
Make sure the headings in the table of contents match the headings within the article. One of the most common errors made by authors is to begin their article with a table of contents after their opening summary, then to change the heading of some section during composition of the article and then to forget to go back and also revise the table of contents.
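For long articles this cross-check is tedious to do by eye, and it can be partly mechanized. The sketch below is only an illustration (the function name and the idea of comparing exact heading strings are assumptions of this sketch, not an IEP tool):

```python
def toc_mismatches(toc_entries, headings):
    """Compare table-of-contents text against the headings in the body.

    Returns two sorted lists: TOC entries with no matching heading,
    and body headings missing from the TOC.
    """
    toc, heads = set(toc_entries), set(headings)
    return sorted(toc - heads), sorted(heads - toc)
```

The script only finds exact-text mismatches; fixing each one remains a judgment call for the copy editor.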
The table of contents can either be flat (for example in the IEP article Solipsism and the Problem of Other Minds) or hierarchical (indented), with main sections and then sub-sections within them (for example in Aztec Philosophy). In either case, the table of contents must use the following structure and labeling convention:
1. Heading
a. Subheading
b. Subheading
i. Subsubheading
ii. Subsubheading
iii. Subsubheading
c. Subheading
2. Heading
a. Subheading
b. Subheading
3. References and Further Reading
d. Article Title
Capitalize an IEP article title as if the title were the name of a book. Here is a helpful, free program that capitalizes for you: https://www.prospercircle.org/tools/title-capitalization, although it does sometimes make a mistake. The program capitalizes all the important words (adjectives, adverbs, nouns, verbs) but not prepositions (to, below), articles (a, the), or conjunctions (or, and). Capitalize the first word even if it is a preposition, and capitalize words that immediately follow a hyphen. Italicize appropriately. Here are three examples:
To Him and Then to Her
Space-Time in Kant’s Prolegomena
Non-Locative Theories of Persistence
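The capitalization rule just described is mechanical enough to sketch in code. The minor-word list and function below are illustrative assumptions; the linked program’s actual word lists may differ:

```python
# Words kept lowercase unless first in the title (illustrative, not exhaustive).
MINOR = {"a", "an", "the", "and", "or", "but", "nor",
         "to", "of", "in", "on", "at", "by", "for", "below"}

def cap_word(word, force):
    # Capitalize each hyphen-separated part: "non-locative" -> "Non-Locative".
    parts = word.split("-")
    out = []
    for i, part in enumerate(parts):
        if part and (force or i > 0 or part.lower() not in MINOR):
            out.append(part[0].upper() + part[1:])
        else:
            out.append(part.lower())
    return "-".join(out)

def title_case(title):
    # The first word is always capitalized, even a preposition or article.
    return " ".join(cap_word(w, force=(i == 0))
                    for i, w in enumerate(title.split()))
```

Such a helper is only a first pass; italics and edge cases still call for the editor’s eye.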
For articles on an individual philosopher, the article title must contain the philosopher’s first and last name and birth date. If the philosopher has died, include the death date; if the philosopher is still alive, leave the death date blank. If both the birth date and the death date are guesses, then use two circas, as in Anaxarchus (c.380–c.320 B.C.E.), or use question marks, as in Anaxarchus (380?–320? B.C.E.). Check that an en dash (–), not a hyphen (-) or an em dash (—), is used between the two dates. Blank spaces around the en dash should be removed.
In both the article title and the body of the article, check that periods are used in B.C.E. and C.E. without blank spaces. C.E. is only allowed if the year’s number is less than 500. Circa, the Latin word for about, must be abbreviated as c. rather than ca. or CA. There should be no blank space between c. and the date, either in a title or in the body of the article. For example, in a title say Anaxagoras (c.500–428 B.C.E.) rather than Anaxagoras (c. 500–428 B.C.E.).
The word century should be abbreviated as cn. and not c. and be separated from the date number with a blank space, for example Alexander Polyhistor (1st cn. B.C.E.). Outside of article titles, the full word century is fine to use also.
If a philosopher is known under two names, then insert the second spelling in parentheses as follows: Zhuangzi (Chuang-Tzu, 369–298 B.C.E.). Do not allow Zhuangzi (Chuang-Tzu) 369–298 B.C.E.
e. Headings and Sub-Headings
Make sure section headings and sub-headings in the article match those in the table of contents. Use your own judgment to fix a disagreement.
Capitalize headings and sub-headings as if they were titles of books; see Article Title above for more on this. All article titles, headings, and sub-headings must be in caps and smalls. Many authors will use full caps; this mistake must be fixed by the copy editor. The phrase “non-religious faith” can occur in the body of the article, but when it occurs in a heading, capitalize the “r.”
f. Author Information
Remove all titles from the author’s name such as Dr. or Professor. Delete the department name and the university’s street address and city. Do not abbreviate country names except for U. S. A. Note the blank spaces within the abbreviation.
For example, change:
Sir Michael Dummett, professor
email: dummett@csus.edu
Dept. of Philosophy
College of Arts & Sciences
Université de Genève
2010 La Mer Avenue
Geneva, Switzerland
to:
Michael Dummett
Email: dummett@csus.edu
University of Geneva
Switzerland
Use the English translation of university names.
g. American vs. British English
The IEP prefers American English, but other dialects such as British, South African, Australian, and Indian are acceptable. Just be consistent throughout the article.
British spelling examples: Aristotelean, favourite, defence, sceptic, behaviour, realisation, travelling.
American spelling examples: Aristotelian, favorite, defense, skeptic, behavior, realization, traveling.
h. References
Citations within the body of the article can use either of these two styles, with or without page numbers, so long as the author is consistent throughout the article:
…as Alston argued (Alston 2009).
…as Alston argued (2009).
Page numbers are optional.
The last main section of every IEP article (and thus, every table of contents) must be titled “References and Further Reading.” Authors often will use the terms Sources, Bibliography, References, Readings, or Notes, but change these to References and Further Reading. In the References and Further Reading section, alphabetize all the references by author’s last name. Do not number the list.
The IEP recommends that authors use the MLA style of references and in-line citations; however, the IEP does not enforce this recommendation and allows almost any style that is coherent in the References and Further Reading section.
References and Further Reading sections in IEP articles can have sub-headings such as Original Sources, More Advanced Studies, and so forth.
When there is more than one entry for the same author, repeat the author’s name in each subsequent entry. The MLA and many authors use a long dash in place of the author’s name when it occurs a second time; replace all these dashes with the author’s name.
If a citation contains an item with a list of multiple authors, change any occurrence of “&” in the list to “and.”
Do not permit authors to cite a forthcoming article; highlight this with a comment in your Word version of the article, and the General Editor will decide what to do about it.
i. Hyperlinks (Links, Web Address, URLs)
Hyperlinks, links, web addresses, and URLs are the same thing. Hyperlinks to other articles within the Encyclopedia are always encouraged. Some authors are overly eager to include hyperlinks and will ask for a hyperlink for every occurrence of a term, but you must check that there is normally only one link per term, usually at the term’s first occurrence in the body of the article. An exception would be when two links go to the same place but for different reasons, as when a link occurs to explain what the technical word field means, and then later there is a link to the same place to tell the reader that this is where to look for a discussion of the controversy about the relationship between spacetime and its fields.
We have a strict policy about linking to articles outside of the IEP. We use hyperlinks only for stable resources. The Stanford Encyclopedia of Philosophy and a professional society’s collection of Wittgenstein’s original correspondence are two examples of sites with stable URLs, but a link to an article on someone’s personal Webpage would not be stable because it would be too likely to become broken over the next fifty years.
Any unstable URLs must be removed. If you are unsure whether a URL is unstable, indicate the issue in a comment in your Word document.
For URLs, do not let authors add a date accessed.
j. Footnotes and Endnotes
Articles must not contain footnotes or endnotes. If you notice that an IEP article contains them, and you haven’t been told explicitly to include them, indicate this issue in a comment in your Word document.
k. Acknowledgments and Thank Yous
The copy editor must remove all the acknowledgments and thank yous. If one is found, include a comment in your Word document so the General Editor can notify the author and the area editor that it was removed.
l. Latin Abbreviations
Replace Latin abbreviations. Here are the replacements:
cf. | compare
e.g. | for example (or for instance)
et al. | and others
etc. | and so forth
i.e. | that is
infra | see below
fl. | flourished
NB | note
pace | with all due respect to
per se | intrinsically (or as such)
viz. | namely
Most of these Latin abbreviations were created by the secretary of the Roman statesman Cicero.
Exceptions: It is OK to use a priori, circa, vice versa, vs., ibid., and op. cit.
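A first-pass version of this substitution can be scripted. The table below mirrors the replacement list above (taking the first alternative where two replacements are offered); the word-boundary patterns and the choice to leave ambiguous cases to a human are assumptions of this sketch:

```python
import re

# Replacement table drawn from the list above. A human editor still
# reviews the result, since mechanical substitution can misfire.
LATIN = {
    r"\bcf\.": "compare",
    r"\be\.g\.": "for example",
    r"\bet al\.": "and others",
    r"\betc\.": "and so forth",
    r"\bi\.e\.": "that is",
    r"\bfl\.": "flourished",
    r"\bper se\b": "intrinsically",
    r"\bviz\.": "namely",
}

def replace_latin(text):
    for pattern, replacement in LATIN.items():
        text = re.sub(pattern, replacement, text)
    return text
```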
m. Self-Reference
Revise these sorts of unnecessary self-referential terms by using the phrasing that follows in brackets:
Consider my opening sentence [Consider the opening sentence].
This concludes what I take to be the major influences on Mitchell’s thought. [This concludes the major influences on Mitchell’s thought.]
My book about him says… [Jones (2005) says…]
I believe Passmore’s description is right. [Passmore’s description is generally accepted.]
We will show in the next section… [The next section shows…]
n. Pejorative Terms
Do not permit your authors to use pejorative phrases.
Examples: “which certain irrational philosophers still believe”; “the idealist curse.”
Personal attacks and snide remarks must be removed.
o. Italics
Use italics for emphasis, never quotation marks, boldface, or underlining.
Use italics to name something, unless using quotation marks would be clearer. Here is an example using italics: Philosophers have named them entities, things, and objects. Here is an example where quotation marks are clearer. Replace:
For any sentence S, if S is true, then S.
with
For any sentence S, if “S” is true, then S.
p. Dates
For a date range, such as 62–113 C.E., use an en dash everywhere (within an article title and within the body of the article and within its table of contents).
Also, the IEP is multicultural, so we do not want to place all events on a Christian timeline that uses AD and BC. Therefore, change B.C. and BC and BCE to B.C.E. (which means Before the Common Era). Do not use blank spaces within B.C.E., but do insert a blank space between the year and the B.C.E. For example, change Pliny (62–113C.E.) to Pliny (62–113 C.E.). Note: the IEP does not normally use C.E. for any date after 500 C.E.
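These two mechanical fixes (an en dash between years, a space before the era abbreviation) can be sketched as a normalization pass. The regexes below assume simple numeric year ranges and are an illustration, not a substitute for reading the article:

```python
import re

def fix_date_range(text):
    # Use an en dash between years: "62-113" or "62—113" becomes "62–113".
    text = re.sub(r"(\d)\s*[-—]\s*(\d)", r"\1–\2", text)
    # Insert the missing space before the era: "113C.E." becomes "113 C.E.".
    text = re.sub(r"(\d)(B\.C\.E\.|C\.E\.)", r"\1 \2", text)
    return text
```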
If you encounter ca. or circa in birth/death dates, change these to c. and use no space between it and the number.
Regarding the word century in birth/death dates, authors may use fourth century, fourth cn., 4th century, or 4th cn., provided the article’s style is consistent.
Regarding approximate dates, our style is to use either, for example, Ramanuja (c.1017–c.1137) or Ramanuja (1017?–1137?). Use an en dash between the dates in article titles as well as in the body of the article.
For disputed dates within a definite range, do not use circa, ca, or question marks, but instead use this style: Ramanuja (1017/21–1137). If only a death date is known, then use Ramanuja (d. 1137) with a blank space after the period.
q. Initials
You can use either J.M.E. McTaggart or J. M. E. McTaggart, but the key is to be consistent throughout the article. If referring to a person by using only their initials, then use no spaces and no periods, as in FDR.
r. Using “Now” and “Recent”
Do not allow these temporal terms: now, currently, at present, recent, recently, presently, a few years ago, to date, as of this writing, does not yet exist. For example, if your author writes, “Recently this topic has become attractive to philosophers of biology …,” you might change this to, “In the first two decades of the twenty-first century, this topic became attractive to philosophers of biology.” If your author says, “As of this writing, Venus is believed to be hot,” change it to “As of 2024, Venus is believed to be hot,” because the IEP articles do not have publication dates and we want the claim to remain true when it is read 85 years from now.
Always retain the above temporal terms within direct quotations.
s. Quotations
IEP articles use two types of quotation formatting. For long ones, indent into a block of text and do not surround it with quotation marks. For short ones, use quotation marks within the text.
Long Quotations: Quotations longer than three lines must be introduced with a colon in the previous text, and formatted as a block of text that is indented. Capitalize the first letter of the quotation even if it is not capitalized in the original source. Longer quotations can use multiple paragraphs. Do not enclose the block itself within quotation marks.
Short Quotations: Quotations of three or fewer lines should be inserted inline. In these cases, the quotations should be enclosed within quotation marks. When using a signal phrase to introduce a quotation, do not add a comma at the end of the signal. For example, say:
Andrew Mclaughlin suggests “Go there.”
not
Andrew Mclaughlin suggests, “Go there.”
Unlike in journal articles, full citations for quotations are not required in IEP articles. Only the author’s name is required.
For an inline quotation it is better to cite at the end of the sentence, not the end of the quoted sentence. So, say:
When he said “I know that I have hands, but I might be a handless brain in a vat,” the remark appears to be inconsistent (Smith 2022).
and not
When he said “I might be a handless brain in a vat,” (Smith 2022) the remark appears to be inconsistent.
The indentation of non-quotations of any length is fine; it is a tool for emphasizing something.
The ending period or other punctuation mark goes after the citation in indented quotations. So, Number 1 is correct; not 2, below:
1. Popper concludes that, although Marxism had originally been a scientific theory:
It broke the methodological rule that we must accept falsification, and it immunized itself against the most blatant refutations of its predictions. Can it be described as non-science—as a metaphysical dream married to a cruel reality (1974, 985)?
vs.
2. Popper concludes that, although Marxism had originally been a scientific theory:
It broke the methodological rule that we must accept falsification, and it immunized itself against the most blatant refutations of its predictions. Can it be described as non-science—as a metaphysical dream married to a cruel reality? (1974, 985)
t. Punctuation
i. Quotation Marks
The American style uses double quotation marks. The British style uses single quotation marks. The same goes for scare quotes.
ii. Dashes and Hyphens
Replace a double hyphen (--) with an em dash (—). For ranges of dates or pages, use an en dash (–).
iii. Colons and Semicolons
For quotations, place the colon and semicolon outside the quotation:
Correct: Kant wrote, “There is no fact of the matter here”; Kripke disagreed with him.
Incorrect: Kant wrote, “There is no fact of the matter here;” Kripke disagreed with him.
Listed items may be made inline or, instead, indented as new paragraphs. List short items with commas, and list long items with semicolons either inline or indented. Both cases must be introduced with a colon. Here are two examples:
Traditionally the arguments for God’s existence have fallen into several families: arguments, miracles, and prudential justifications.
A person is justified in believing that X does not exist if:
(a) all the available evidence used to support the view that X exists is shown to be inadequate;
(b) X is the sort of entity that, if X exists, then there is a presumption that would be evidence adequate to support the view that X exists;
(c) this presumption has not been defeated although serious efforts have been made to do so.
iv. Contractions
Contractions not within quotations must be unpacked. For example, change don’t to do not.
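Because ’s can also mark a possessive, a script should flag contractions for review rather than replace them blindly. This sketch (the suffix list and apostrophe handling are assumptions) does just that:

```python
import re

# Common contraction endings; match straight (') or curly (’) apostrophes.
CONTRACTION = re.compile(r"[A-Za-z]['\u2019](?:t|d|s|re|ve|ll)\b")

def flag_contractions(lines):
    """Return (line_number, line) pairs that may contain a contraction."""
    return [(n, line) for n, line in enumerate(lines, start=1)
            if CONTRACTION.search(line)]
```

The editor then decides, line by line, whether to unpack the match or leave a possessive alone.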
v. Future Tense
Sentences referring to what is included in the article must use the present tense rather than the future tense.
Correct: Both act and rule utilitarianism are discussed in the next section.
Incorrect: Both act and rule utilitarianism will be discussed in the next section.
vi. Non-English Words
Italicize non-English words.
These movements are designed to channel the flow of qi (energy).
In subsequent uses of the term, drop the associated English meaning that was in parentheses.
u. Lists
Authors are allowed to choose their own way of producing lists. They might use Roman numerals, numbers, letters, or some other scheme. However, always add a left parenthesis if they use only a right one. That is, change:
1)
2)
to
(1)
(2)
Indent lists one standard tab.
6. Last Step: Checkoff List
Most (but not all) of the recommendations described above have been converted into the following task list.
Work on a Copy. Keep the original version that was sent to you, and do your work on a copy to which you give a slightly different name.
Save Changes. Turn on Review | Track Changes | All Markup. Save your work early and often.
Opening Summary. Check that the article has an opening summary, that its length is in the range of 200–500 words, and that it is not mentioned in the table of contents. Make a Word comment if there is an error here, or fix things yourself.
Table of Contents. Ensure that the text in the table of contents matches the corresponding heading text in the body of the article. If there is a mismatch, and it is not obvious how to fix it, then describe the problem in a comment.
Hyperlinks. Check that all hyperlinks point to the correct location and that the hyperlink is not repeated later in the article.
Contractions. Search for and unpack contractions that are not part of an externally sourced quotation. Rather than searching for every apostrophe, it is usually easier to perform individual searches for: ’t, ’d, ’s, ’re, ’ve, ’ll.
Quotation Marks. Ensure that the style of quotation marks is consistent throughout the article.
Emphasis. Use italics for emphasis and not quotation marks or boldface.
Final Heading. Ensure that the article ends with References and Further Reading. It is more important that this section’s style be internally consistent than that it conform to one of the famous styles such as the one in The Chicago Manual of Style.
Periods and Blanks. Check for missing periods and double periods at the ends of sentences. Also search for double blanks.
Dates. With any date range such as “101 B.C.E.–88 B.C.E.”, check that an en dash (–), not a hyphen or an em dash, is used in the article’s title, its section sub-titles, and the body of the article.
The demarcation problem in philosophy of science refers to the question of how to meaningfully and reliably separate science from pseudoscience. Both the terms “science” and “pseudoscience” are notoriously difficult to define precisely, except in terms of family resemblance. The demarcation problem has a long history, tracing back at least to a speech given by Socrates in Plato’s Charmides, as well as to Cicero’s critique of Stoic ideas on divination. Karl Popper was the most influential modern philosopher to write on demarcation, proposing his criterion of falsifiability to sharply distinguish science from pseudoscience. Most contemporary practitioners, however, agree that Popper’s suggestion does not work. In fact, Larry Laudan suggested that the demarcation problem is insoluble and that philosophers would be better off focusing their efforts on something else. This led to a series of responses to Laudan and new proposals on how to move forward, collected in a landmark edited volume on the philosophy of pseudoscience. After the publication of this volume, the field saw a renaissance characterized by a number of innovative approaches. Two such approaches are particularly highlighted in this article: treating pseudoscience and pseudophilosophy as BS, that is, “bullshit” in Harry Frankfurt’s sense of the term, and applying virtue epistemology to the demarcation problem. This article also looks at the grassroots movement often referred to as scientific skepticism and to its philosophical bases.
In the Charmides (West and West translation, 1986), Plato has Socrates tackle what contemporary philosophers of science refer to as the demarcation problem, the separation between science and pseudoscience. In that dialogue, Socrates is referring to a specific but very practical demarcation issue: how to tell the difference between medicine and quackery. Here is the most relevant excerpt:
SOCRATES: Let us consider the matter in this way. If the wise man or any other man wants to distinguish the true physician from the false, how will he proceed? . . . He who would inquire into the nature of medicine must test it in health and disease, which are the sphere of medicine, and not in what is extraneous and is not its sphere?
CRITIAS: True.
SOCRATES: And he who wishes to make a fair test of the physician as a physician will test him in what relates to these?
CRITIAS: He will.
SOCRATES: He will consider whether what he says is true, and whether what he does is right, in relation to health and disease?
CRITIAS: He will.
SOCRATES: But can anyone pursue the inquiry into either, unless he has a knowledge of medicine?
CRITIAS: He cannot.
SOCRATES: No one at all, it would seem, except the physician can have this knowledge—and therefore not the wise man. He would have to be a physician as well as a wise man.
CRITIAS: Very true. (170e–171c)
The conclusion at which Socrates arrives, therefore, is that the wise person would have to develop expertise in medicine, as that is the only way to distinguish an actual doctor from a quack. Setting aside that such a solution is not practical for most people in most settings, the underlying question remains: how do we decide whom to pick as our instructor? What if we mistake a school of quackery for a medical one? Do quacks not also claim to be experts? Is this not a hopelessly circular conundrum?
A few centuries later, the Roman orator, statesman, and philosopher Marcus Tullius Cicero published a comprehensive attack on the notion of divination, essentially treating it as what we would today call a pseudoscience, and anticipating a number of arguments that have been developed by philosophers of science in modern times. As Fernandez-Beanato (2020a) points out, Cicero uses the Latin word “scientia” to refer to a broader set of disciplines than the English “science.” His meaning is closer to the German word “Wissenschaft,” which means that his treatment of demarcation potentially extends to what we would today call the humanities, such as history and philosophy.
Being a member of the New Academy, and therefore a moderate epistemic skeptic, Cicero writes: “As I fear to hastily give my assent to something false or insufficiently substantiated, it seems that I should make a careful comparison of arguments […]. For to hasten to give assent to something erroneous is shameful in all things” (De Divinatione, I.7 / Falconer translation, 2014). He thus frames the debate on unsubstantiated claims, and divination in particular, as a moral one.
Fernandez-Beanato identifies five modern criteria that often come up in discussions of demarcation and that are either explicitly or implicitly advocated by Cicero: internal logical consistency of whatever notion is under scrutiny; degree of empirical confirmation of the predictions made by a given hypothesis; degree of specificity of the proposed mechanisms underlying a certain phenomenon; degree of arbitrariness in the application of an idea; and degree of selectivity of the data presented by the practitioners of a particular approach. Divination fails, according to Cicero, because it is logically inconsistent, it lacks empirical confirmation, its practitioners have not proposed a suitable mechanism, said practitioners apply the notion arbitrarily, and they are highly selective in what they consider to be successes of their practice.
Jumping ahead to more recent times, arguably the first modern instance of a scientific investigation into allegedly pseudoscientific claims is the case of the famous Royal Commissions on Animal Magnetism appointed by King Louis XVI in 1784. One of them, the so-called Society Commission, was composed of five physicians from the Royal Society of Medicine; the other, the so-called Franklin Commission, comprised four physicians from the Paris Faculty of Medicine, as well as Benjamin Franklin. The goal of both commissions was to investigate claims of “mesmerism,” or animal magnetism, being made by Franz Mesmer and some of his students (Salas and Salas 1996; Armando and Belhoste 2018).
Mesmer was a medical doctor who began his career with a questionable study entitled “A Physico-Medical Dissertation on the Influence of the Planets.” Later, he developed a theory according to which all living organisms are permeated by a vital force that can, with particular techniques, be harnessed for therapeutic purposes. While mesmerism became popular and influential for decades between the end of the 18th century and the full span of the 19th century, it is now considered a pseudoscience, in large part because of the failure to empirically replicate its claims and because vitalism in general has been abandoned as a theoretical notion in the biological sciences. Interestingly, though, Mesmer clearly thought he was doing good science within a physicalist paradigm and distanced himself from the more obviously supernatural practices of some of his contemporaries, such as the exorcist Johann Joseph Gassner.
For the purposes of this article, we need to stress the importance of the Franklin Commission in particular, since it represented arguably the first attempt in history to carry out controlled experiments. These were largely designed by Antoine Lavoisier, complete with a double-blind protocol in which neither subjects nor investigators knew which treatment they were dealing with at any particular time, the allegedly genuine one or a sham control. As Stephen Jay Gould (1989) put it:
The report of the Royal Commission of 1784 is a masterpiece of the genre, an enduring testimony to the power and beauty of reason. … The Report is a key document in the history of human reason. It should be rescued from its current obscurity, translated into all languages, and reprinted by organizations dedicated to the unmasking of quackery and the defense of rational thought.
Not surprisingly, neither Commission found any evidence supporting Mesmer’s claims. The Franklin report was printed in 20,000 copies and widely circulated in France and abroad, but this did not stop mesmerism from becoming widespread, with hundreds of books published on the subject in the period 1766–1925.
Arriving now to modern times, the philosopher who started the discussion on demarcation is Karl Popper (1959), who thought he had formulated a neat solution: falsifiability (Shea no date). He reckoned that—contra popular understanding—science does not make progress by proving its theories correct, since it is far too easy to selectively accumulate data that are favorable to one’s pre-established views. Rather, for Popper, science progresses by eliminating one bad theory after another, because once a notion has been proven to be false, it will stay that way. He concluded that what distinguishes science from pseudoscience is the (potential) falsifiability of scientific hypotheses, and the inability of pseudoscientific notions to be subjected to the falsifiability test.
For instance, Einstein’s theory of general relativity survived a crucial test in 1919, when one of its most extraordinary predictions—that light is bent by the presence of gravitational masses—was spectacularly confirmed during a total eclipse of the sun (Kennefick 2019). This did not prove that the theory was true, but it showed that it was falsifiable and, therefore, good science. Moreover, Einstein’s prediction was unusual and very specific, and hence very risky for the theory. This, for Popper, is a good feature of a scientific theory, as it is too easy to survive attempts at falsification when predictions based on the theory are mundane or common to multiple theories.
In contrast with the example of the 1919 eclipse, Popper thought that Freudian and Adlerian psychoanalysis, as well as Marxist theories of history, are unfalsifiable in principle; they are so vague that no empirical test could ever show them to be incorrect, if they are incorrect. The point is subtle but crucial. Popper did not argue that those theories are, in fact, wrong, only that one could not possibly know if they were, and they should not, therefore, be classed as good science.
Popper became interested in demarcation because he wanted to free science from a serious issue raised by David Hume (1748), the so-called problem of induction. Scientific reasoning is based on induction, a process by which we generalize from a set of observed events to all observable events. For instance, we “know” that the sun will rise again tomorrow because we have observed the sun rising countless times in the past. More importantly, we attribute causation to phenomena on the basis of inductive reasoning: since event X is always followed by event Y, we infer that X causes Y.
The problem as identified by Hume is twofold. First, unlike deduction (as used in logic and mathematics), induction does not guarantee a given conclusion; it only makes that conclusion probable as a function of the available empirical evidence. Second, there is no way to logically justify the inference of a causal connection. The human mind does so automatically, says Hume, as a leap of imagination.
Popper was not satisfied with the notion that science is, ultimately, based on a logically unsubstantiated step. He reckoned that if we were able to reframe scientific progress in terms of deductive, not inductive logic, Hume’s problem would be circumvented. Hence falsificationism, which is, essentially, an application of modus tollens (Hausman et al. 2021) to scientific hypotheses:
If P, then Q
Not Q
Therefore, not P
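The schema above is the classical modus tollens inference, and its deductive validity is exactly what Popper relied on. As an illustrative aside, the inference can be verified formally; here is a minimal rendering in Lean, where negation ¬P is defined as P → False:

```lean
-- Modus tollens: from (P → Q) and ¬Q, infer ¬P.
-- Given a proof hp : P, h hp : Q contradicts hq : ¬Q.
theorem modus_tollens (P Q : Prop) (h : P → Q) (hq : ¬Q) : ¬P :=
  fun hp => hq (h hp)
```

The point of the formalization is simply that, unlike induction, this step carries no probabilistic residue: if the premises hold, the conclusion must.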
For instance, if General Relativity is true then we should observe a certain deviation of light coming from the stars when their rays pass near the sun (during a total eclipse or under similarly favorable circumstances). We do observe the predicted deviation. Therefore, we have (currently) no reason to reject General Relativity. However, had the observations carried out during the 1919 eclipse not aligned with the prediction, then there would have been sufficient reason, according to Popper, to reject General Relativity based on the above syllogism.
Science, on this view, does not make progress one induction, or confirmation, after the other, but one discarded theory after the other. And as a bonus, thought Popper, this looks like a neat criterion to demarcate science from pseudoscience.
In fact, it is a bit too neat, unfortunately. Plenty of philosophers after Popper (for example, Laudan 1983) have pointed out that a number of pseudoscientific notions are eminently falsifiable and have been shown to be false—astrology, for instance (Carlson 1985). Conversely, some notions currently considered to be scientific are also—at least temporarily—unfalsifiable (for example, string theory in physics: Hossenfelder 2018).
A related issue with falsificationism is presented by the so-called Duhem-Quine theses (Curd and Cover 2012), two allied propositions about the nature of knowledge, scientific or otherwise, advanced independently by physicist Pierre Duhem and philosopher Willard Van Orman Quine.
Duhem pointed out that when scientists think they are testing a given hypothesis, as in the case of the 1919 eclipse test of General Relativity, they are, in reality, testing a broad set of propositions constituted by the central hypothesis plus a number of ancillary assumptions. For instance, while the attention of astronomers in 1919 was on Einstein’s theory and its implications for the laws of optics, they also simultaneously “tested” the reliability of their telescopes and camera, among a number of more or less implicit additional hypotheses. Had something gone wrong, their likely first instinct, rightly, would have been to check that their equipment was functioning properly before taking the bold step of declaring General Relativity dead.
Quine, later on, articulated a broader account of human knowledge conceived as a web of beliefs. Part of this account is the notion that scientific theories are always underdetermined by the empirical evidence (Bonk 2008), meaning that different theories will be compatible with the same evidence at any given point in time. Indeed, for Quine it is not just that we test specific theories and their ancillary hypotheses. We literally test the entire web of human understanding. Certainly, if a test does not yield the predicted results we will first look at localized assumptions. But occasionally we may be forced to revise our notions at larger scales, up to and including mathematics and logic themselves.
The history of science does present good examples of how the Duhem-Quine theses undermine falsificationism. The twin tales of the spectacular discovery of a new planet and the equally spectacular failure to discover an additional one during the 19th century are classic examples.
Astronomers had uncovered anomalies in the orbit of Uranus, at that time the outermost known planet in the solar system. These anomalies did not appear, at first, to be explainable by standard Newtonian mechanics, and yet nobody thought even for a moment to reject that theory on the basis of the newly available empirical evidence. Instead, mathematician Urbain Le Verrier postulated that the anomalies were the result of the gravitational interference of an as yet unknown planet, situated outside of Uranus’ orbit. The new planet, Neptune, was in fact discovered on the night of 23-24 September 1846, thanks to the precise calculations of Le Verrier (Grosser 1962).
The situation repeated itself shortly thereafter, this time with anomalies discovered in the orbit of the innermost planet of our system, Mercury. Again, Le Verrier hypothesized the existence of a hitherto undiscovered planet, which he named Vulcan. But Vulcan never materialized. Eventually astronomers really did have to jettison Newtonian mechanics and deploy the more sophisticated tools provided by General Relativity, which accounted for the distortion of Mercury’s orbit in terms of gravitational effects originating with the Sun (Baum and Sheehan 1997).
What prompted astronomers to react so differently to two seemingly identical situations? Popper would have recognized the two similar hypotheses put forth by Le Verrier as being ad hoc and yet somewhat justified given the alternative, the rejection of Newtonian mechanics. But falsificationism has no tools capable of explaining why it is that sometimes ad hoc hypotheses are acceptable and at other times they are not. Nor, therefore, is it in a position to provide us with sure guidance in cases like those faced by Le Verrier and colleagues. This failure, together with wider criticism of Popper’s philosophy of science by the likes of Thomas Kuhn (1962), Imre Lakatos (1978), and Paul Feyerabend (1975), paved the way for a crisis of sorts for the whole project of demarcation in philosophy of science.
2. The Demise of Demarcation: The Laudan Paper
A landmark paper in the philosophy of demarcation was published by Larry Laudan in 1983. Provocatively entitled “The Demise of the Demarcation Problem,” it sought to dispatch the whole field of inquiry in one fell swoop. As the next section shows, the outcome was quite the opposite, as a number of philosophers responded to Laudan and reinvigorated the whole debate on demarcation. Nevertheless, it is instructive to look at Laudan’s paper and at some of his motivations for writing it.
Laudan was disturbed by the events that transpired during one of the classic legal cases concerning pseudoscience, specifically the teaching of so-called creation science in American classrooms. The case, McLean v. Arkansas Board of Education, was debated in 1982. Some of the fundamental questions that the presiding judge, William R. Overton, asked expert witnesses to address were whether Darwinian evolution is a science, whether creationism is also a science, and what criteria are typically used by the pertinent epistemic communities (that is, scientists and philosophers) to arrive at such assessments (LaFollette 1983).
One of the key witnesses on the evolution side was philosopher Michael Ruse, who presented Overton with a number of demarcation criteria, one of which was Popper’s falsificationism. According to Ruse’s testimony, creationism is not a science because, among other reasons, its claims cannot be falsified. In a famous and very public exchange with Ruse, Laudan (1988) objected to the use of falsificationism during the trial, on the grounds that Ruse must have known that that particular criterion had by then been rejected, or at least seriously questioned, by the majority of philosophers of science.
It was this episode that prompted Laudan to publish his landmark paper aimed at getting rid of the entire demarcation debate once and for all. One argument advanced by Laudan is that philosophers have been unable to agree on demarcation criteria since Aristotle and that it is therefore time to give up this particular quixotic quest. This is a rather questionable conclusion. Arguably, philosophy does not make progress by resolving debates, but by discovering and exploring alternative positions in the conceptual spaces defined by a particular philosophical question (Pigliucci 2017). Seen this way, falsificationism and modern debates on demarcation are a standard example of progress in philosophy of science, and there is no reason to abandon a fruitful line of inquiry so long as it keeps being fruitful.
Laudan then argues that the advent of fallibilism in epistemology (Feldman 1981) during the 19th century spelled the end of the demarcation problem, as epistemologists now recognize no meaningful distinction between opinion and knowledge. Setting aside that the notion of fallibilism far predates the 19th century and goes back at least to the New Academy of ancient Greece, it may be the case, as Laudan maintains, that many modern epistemologists do not endorse the notion of an absolute and universal truth, but such a notion is not needed for any serious project of science-pseudoscience demarcation. All one needs is that some “opinions” are far better established, by way of argument and evidence, than others and that scientific opinions tend to be dramatically better established than pseudoscientific ones.
It is certainly true, as Laudan maintains, that modern philosophers of science see science as a set of methods and procedures, not as a particular body of knowledge. But the two are tightly linked: the process of science yields reliable (if tentative) knowledge of the world. Conversely, the processes of pseudoscience, such as they are, do not yield any knowledge of the world. The distinction between science as a body of knowledge and science as a set of methods and procedures, therefore, does nothing to undermine the need for demarcation.
After a by now de rigueur criticism of the failure of positivism, Laudan attempts to undermine Popper’s falsificationism. But even Laudan himself seems to realize that the limits of falsificationism do not deal a death blow to the notion that there are recognizable sciences and pseudosciences: “One might respond to such criticisms [of falsificationism] by saying that scientific status is a matter of degree rather than kind” (Laudan 1983, 121). Indeed, that seems to be the currently dominant position of philosophers who are active in the area of demarcation.
The rest of Laudan’s critique boils down to the argument that no demarcation criterion proposed so far can provide a set of necessary and sufficient conditions to define an activity as scientific, and that the “epistemic heterogeneity of the activities and beliefs customarily regarded as scientific” (1983, 124) means that demarcation is a futile quest. This article now briefly examines each of these two claims.
Ever since Wittgenstein (1958), philosophers have recognized that any sufficiently complex concept will not likely be definable in terms of a small number of necessary and jointly sufficient conditions. That approach may work in basic math, geometry, and logic (for example, definitions of triangles and other geometric figures), but not for anything as complex as “science” or “pseudoscience.” This implies that single-criterion attempts like Popper’s should indeed finally be set aside, but it does not imply that multi-criterial or “fuzzy” approaches will not be useful. Again, rather than a failure, this shift should be regarded as evidence of progress in this particular philosophical debate.
Regarding Laudan’s second claim from above, that science is a fundamentally heterogeneous activity, this may or may not be the case; the jury is still very much out. Some philosophers of science have indeed suggested that there is a fundamental disunity to the sciences (Dupré 1993), but this is far from being a consensus position. Even if true, a heterogeneity of “science” does not preclude thinking of the sciences as a family resemblance set, perhaps with distinctly identifiable sub-sets, similar to the Wittgensteinian description of “games” and their subdivision into fuzzy sets including board games, ball games, and so forth. Indeed, some of the authors discussed later in this article have made this very same proposal regarding pseudoscience: there may be no fundamental unity grouping, say, astrology, creationism, and anti-vaccination conspiracy theories, but they nevertheless share enough Wittgensteinian threads to make it useful for us to talk of all three as examples of broadly defined pseudosciences.
3. The Return of Demarcation: The University of Chicago Press Volume
Laudan’s 1983 paper had the desired effect of convincing a number of philosophers of science that it was not worth engaging with demarcation issues. Yet, in the meantime pseudoscience remained a conspicuous social phenomenon, one that was having increasingly pernicious effects, for instance in the case of HIV, vaccine, and climate change denialism (Smith and Novella 2007; Navin 2013; Brulle 2020). It was probably inevitable, therefore, that philosophers of science who felt that their discipline ought to make positive contributions to society would, sooner or later, go back to the problem of demarcation.
The turning point was an edited volume entitled The Philosophy of Pseudoscience: Reconsidering the Demarcation Problem, published in 2013 by the University of Chicago Press (Pigliucci and Boudry 2013). The editors and contributors consciously and explicitly set out to respond to Laudan and to begin the work necessary to make progress (in something like the sense highlighted above) on the issue.
The first five chapters of The Philosophy of Pseudoscience take the form of various responses to Laudan, several of which hinge on the rejection of the strict requirement for a small set of necessary and jointly sufficient conditions to define science or pseudoscience. Contemporary philosophers of science, it seems, have no trouble with inherently fuzzy concepts. As for Laudan’s contention that the term “pseudoscience” does only negative, potentially inflammatory work, this is true and yet no different from, say, the use of “unethical” in moral philosophy, which few if any have thought of challenging.
The contributors to The Philosophy of Pseudoscience also readily admit that science is best considered as a family of related activities, with no fundamental essence to define it. Indeed, the same goes for pseudoscience as, for instance, vaccine denialism is very different from astrology, and both differ markedly from creationism. Nevertheless, there are common threads in both cases, and the existence of such threads justifies, in part, philosophical interest in demarcation. The same authors argue that we should focus on the borderline cases, precisely because there it is not easy to neatly separate activities into scientific and pseudoscientific. There is no controversy, for instance, in classifying fundamental physics and evolutionary biology as sciences, and there is no serious doubt that astrology and homeopathy are pseudosciences. But what are we to make of some research into the paranormal carried out by academic psychologists (Jeffers 2007)? Or of the epistemically questionable claims often, but not always, made by evolutionary psychologists (Kaplan 2006)?
The 2013 volume sought a consciously multidisciplinary approach to demarcation. Contributors include philosophers of science, but also sociologists, historians, and professional skeptics (meaning people who directly work on the examination of extraordinary claims). The group saw two fundamental reasons to continue scholarship on demarcation. On the one hand, science has acquired a high social status and commands large amounts of resources in modern society. This means that we ought to examine and understand its nature in order to make sound decisions about just how much trust to put into scientific institutions and proceedings, as well as how much money to pump into the social structure that is modern science. On the other hand, as noted above, pseudoscience is not a harmless pastime. It has negative effects on both individuals and societies. This means that an understanding of its nature, and of how it differs from science, has very practical consequences.
The Philosophy of Pseudoscience also tackles issues of history and sociology of the field. It contains a comprehensive history of the demarcation problem followed by a historical analysis of pseudoscience, which tracks down the coinage and currency of the term and explains its shifting meaning in tandem with the emerging historical identity of science. A contribution by a sociologist then provides an analysis of paranormalism as a “deviant discipline” violating the consensus of established science, and one chapter draws attention to the characteristic social organization of pseudosciences as a means of highlighting the corresponding sociological dimension of the scientific endeavor.
The volume explores the borderlands between science and pseudoscience, for instance by deploying the idea of causal asymmetries in evidential reasoning to differentiate between what are sometimes referred to as “hard” and “soft” sciences, arguing that misconceptions about this difference explain the higher incidence of pseudoscience and anti-science connected to the non-experimental sciences. One contribution looks at the demographics of pseudoscientific belief and examines how the demarcation problem is treated in legal cases. One chapter recounts the story of how at one time the pre-Darwinian concept of evolution was treated as pseudoscience in the same guise as mesmerism, before eventually becoming the professional science we are familiar with, thus challenging a conception of demarcation in terms of timeless and purely formal principles.
A discussion focusing on science and the supernatural includes the provocative suggestion that, contrary to recent philosophical trends, the appeal to the supernatural should not be ruled out from science on methodological grounds, as it is often done, but rather because the very notion of supernatural intervention suffers from fatal flaws. Meanwhile, David Hume is enlisted to help navigate the treacherous territory between science and religious pseudoscience and to assess the epistemic credentials of supernaturalism.
The Philosophy of Pseudoscience includes an analysis of the tactics deployed by “true believers” in pseudoscience, beginning with a discussion of the ethics of argumentation about pseudoscience, followed by the suggestion that alternative medicine can be evaluated scientifically despite the immunizing strategies deployed by some of its most vocal supporters. One entry summarizes misgivings about Freudian psychoanalysis, arguing that we should move beyond assessments of the testability and other logical properties of a theory, shifting our attention instead to the spurious claims of validation and other recurrent misdemeanors on the part of pseudoscientists. It also includes a description of the different strategies used by climate change “skeptics” and other denialists, outlining the links between new and “traditional” pseudosciences.
The volume includes a section examining the complex cognitive roots of pseudoscience. Some of the contributors ask whether we actually evolved to be irrational, describing a number of heuristics that are rational in domains ecologically relevant to ancient Homo sapiens, but that lead us astray in modern contexts. One of the chapters explores the non-cognitive functions of super-empirical beliefs, analyzing the different attitudes of science and pseudoscience toward intuition. An additional entry distinguishes between two mindsets about science and explores the cognitive styles relating to authority and tradition in both science and pseudoscience. This is followed by an essay proposing that belief in pseudoscience may be partly explained by theories about the ethics of belief. There is also a chapter on pseudo-hermeneutics and the illusion of understanding, drawing inspiration from the cognitive psychology and philosophy of intentional thinking.
A simple search of online databases of philosophical peer reviewed papers clearly shows that the 2013 volume has succeeded in countering Laudan’s 1983 paper, yielding a flourishing of new entries in the demarcation literature in particular, and in the newly established subfield of the philosophy of pseudoscience more generally. This article now turns to a brief survey of some of the prominent themes that have so far characterized this Renaissance of the field of demarcation.
4. The Renaissance of the Demarcation Problem
After the publication of The Philosophy of Pseudoscience collection, an increasing number of papers have been published on the demarcation problem and related issues in philosophy of science and epistemology. It is not possible to discuss all the major contributions in detail, so what follows is intended as a representative set of highlights and a brief guide to the primary literature.
Sven Ove Hansson (2017) proposed that science denialism, often considered a different issue from pseudoscience, is actually one form of the latter, the other form being what he terms pseudotheory promotion. Hansson examines in detail three case studies: relativity theory denialism, evolution denialism, and climate change denialism. The analysis is couched in terms of three criteria for the identification of pseudoscientific statements, previously laid out by Hansson (2013). A statement is pseudoscientific if it satisfies the following:
It pertains to an issue within the domains of science in the broad sense (the criterion of scientific domain).
It suffers from such a severe lack of reliability that it cannot at all be trusted (the criterion of unreliability).
It is part of a doctrine whose major proponents try to create the impression that it represents the most reliable knowledge on its subject matter (the criterion of deviant doctrine).
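Hansson’s definition treats the three criteria as jointly necessary: a statement is flagged as pseudoscientific only when all of them hold. As an illustrative sketch only (the field and function names here are hypothetical, not Hansson’s own formalization), the conjunctive structure can be made explicit:

```python
from dataclasses import dataclass

@dataclass
class Statement:
    """Hypothetical encoding of Hansson's (2013) three criteria as booleans."""
    in_scientific_domain: bool  # criterion of scientific domain
    unreliable: bool            # criterion of unreliability
    deviant_doctrine: bool      # criterion of deviant doctrine

def is_pseudoscientific(s: Statement) -> bool:
    # All three criteria must hold jointly; failing any one of them
    # blocks the classification.
    return s.in_scientific_domain and s.unreliable and s.deviant_doctrine

# A homeopathic efficacy claim plausibly satisfies all three criteria:
print(is_pseudoscientific(Statement(True, True, True)))   # True
# An honestly mistaken scientific claim fails the deviant-doctrine criterion:
print(is_pseudoscientific(Statement(True, True, False)))  # False
```

The sketch also shows why the third criterion does real work: mere unreliability within a scientific domain (an ordinary error) is not enough to make a statement pseudoscientific on Hansson’s account.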
On these bases, Hansson concludes that, for example, “The misrepresentations of history presented by Holocaust deniers and other pseudo-historians are very similar in nature to the misrepresentations of natural science promoted by creationists and homeopaths” (2017, 40). In general, Hansson proposes that there is a continuum between science denialism at one end (for example, regarding climate change, the Holocaust, the general theory of relativity, etc.) and pseudotheory promotion at the other end (for example, astrology, homeopathy, iridology). He identifies four epistemological characteristics that account for the failure of science denialism to provide genuine knowledge:
Cherry picking. One example is Conservapedia’s entry listing alleged counterexamples to the general theory of relativity. Never mind that, of course, even a cursory inspection of such “anomalies” turns up only mistakes or misunderstandings.
Neglect of refuting information. Again concerning general relativity denialism, the proponents of the idea point to a theory advanced by the Swiss physicist Georges-Louis Le Sage that gravitational forces result from pressure exerted on physical bodies by a large number of small invisible particles. That idea might have been reasonably entertained when it was proposed, in the 18th century, but not after the devastating criticism it received in the 19th century—let alone the 21st.
Fabrication of fake controversies. Perhaps the most obvious example here is the “teach both theories” mantra so often repeated by creationists, which was adopted by Ronald Reagan during his 1980 presidential campaign. The fact is, there is no controversy about evolution within the pertinent epistemic community.
Deviant criteria of assent. For instance, in the 1920s and ‘30s, special relativity was accused of not being sufficiently transpicuous, and its opponents went so far as to attempt to create a new “German physics” that would not use difficult mathematics and would, therefore, be accessible to everyone. Both Einstein and Planck ridiculed the whole notion that science ought to be transpicuous in the first place. The point is that part of the denialist’s strategy is to ask for impossible standards in science and then use the fact that such demands are not met (because they cannot be) as “evidence” against a given scientific notion. This is known as the unobtainable perfection fallacy (Gauch, 2012).
Hansson lists ten sociological characteristics of denialism: that the focal theory (say, evolution) threatens the denialist’s worldview (for instance, a fundamentalist understanding of Christianity); complaints that the focal theory is too difficult to understand; a lack of expertise among denialists; a strong predominance of men among the denialists (that is, lack of diversity); an inability to publish in peer-reviewed journals; a tendency to embrace conspiracy theories; appeals directly to the public; the pretense of having support among scientists; a pattern of attacks against legitimate scientists; and strong political overtones.
Dawes (2018) acknowledges, with Laudan (1983), that there is a general consensus that no single criterion (or even small set of necessary and jointly sufficient criteria) is capable of discerning science from pseudoscience. However, he correctly maintains that this does not imply that there is no multifactorial account of demarcation, situating different kinds of science and pseudoscience along a continuum. One such criterion is that science is a social process, which entails that a theory is considered scientific because it is part of a research tradition that is pursued by the scientific community.
Dawes is careful in rejecting the sort of social constructionism endorsed by some sociologists of science (Bloor 1976) on the grounds that the sociological component is just one of the criteria that separate science from pseudoscience. Two additional criteria have been studied by philosophers of science for a long time: the evidential and the structural. The first refers to the connection between a given scientific theory and the empirical evidence that provides epistemic warrant for that theory. The second is concerned with the internal structure and coherence of a scientific theory.
Science, according to Dawes, is a cluster concept grouping a set of related, yet somewhat differentiated, kinds of activities. In this sense, his paper reinforces an increasingly widespread understanding of science in the philosophical community (see also Dupré 1993; Pigliucci 2013). Pseudoscience, then, is also a cluster concept, similarly grouping a number of related, yet varied, activities that attempt to mimic science but do so within the confines of an epistemically inert community.
The question, therefore, becomes, in part, one of distinguishing scientific from pseudoscientific communities, especially when the latter closely mimic the former. Take, for instance, homeopathy. While it is clearly a pseudoscience, the relevant community is made up of self-professed “experts” who even publish a “peer-reviewed” journal, Homeopathy, put out by a major academic publisher, Elsevier. Here, Dawes builds on an account of scientific communities advanced by Robert Merton (1973). According to Merton, scientific communities are characterized by four norms, all of which are lacking in pseudoscientific communities: universalism, the notion that class, gender, ethnicity, and so forth are (ideally, at least) treated as irrelevant in the context of scientific discussions; communality, in the sense that the results of scientific inquiry belong (again, ideally) to everyone; disinterestedness, not because individual scientists are unbiased, but because community-level mechanisms counter individual biases; and organized skepticism, whereby no idea is exempt from critical scrutiny.
In the end, Dawes’s suggestion is that “We will have a pro tanto reason to regard a theory as pseudoscientific when it has been either refused admission to, or excluded from, a scientific research tradition that addresses the relevant problems” (2018, 293). Crucially, however, what is or is not recognized as a viable research tradition by the scientific community changes over time, so that the demarcation between science and pseudoscience is itself liable to shift as time passes.
One author who departs significantly from what otherwise seems to be an emerging consensus on demarcation is Angelo Fasce (2019). He rejects the notion that there is any meaningful continuum between science and pseudoscience, or that either concept can fruitfully be understood in terms of family resemblance, going so far as accusing some of his colleagues of “still engag[ing] in time-consuming, unproductive discussions on already discarded demarcation criteria, such as falsifiability” (2019, 155).
Fasce’s criticism hinges, in part, on the notion that gradualist criteria may create problems in policy decision making: just how close does an activity have to be to the pseudoscientific end of the spectrum before, say, a granting agency should raise issues? The answer is that there is no sharp demarcation because there cannot be, regardless of how much we would wish otherwise. In many cases, said granting agency should have no trouble classifying good science (for example, fundamental physics or evolutionary biology) as well as obvious pseudoscience (for example, astrology or homeopathy). But there will be some borderline cases (for instance, parapsychology? SETI?) where one will just have to exercise one’s best judgment based on what is known at the moment and deal with the possibility that one might make a mistake.
Fasce also argues that “Contradictory conceptions and decisions can be consistently and justifiably derived from [a given demarcation criterion]—i.e. mutually contradictory propositions could be legitimately derived from the same criterion because that criterion allows, or is based on, ‘subjective’ assessment” (2019, 159). Again, this is probably true, but it is also likely an inevitable feature of the nature of the problem, not a reflection of the failure of philosophers to adequately tackle it.
Fasce (2019, 62) states that there is no historical case of a pseudoscience turning into a legitimate science, which he takes as evidence that there is no meaningful continuum between the two classes of activities. But this does not take into account the case of pre-Darwinian evolutionary theories mentioned earlier, nor the many instances of the reverse transition, in which an activity initially considered scientific has, in fact, gradually turned into a pseudoscience, including alchemy (although its relationship with chemistry is actually historically complicated), astrology, phrenology, and, more recently, cold fusion—with the caveat that whether the latter notion ever reached scientific status is still being debated by historians and philosophers of science. These occurrences would seem to point to the existence of a continuum between the two categories of science and pseudoscience.
One interesting objection raised by Fasce is that philosophers who favor a cluster concept approach do not seem to be bothered by the fact that such a Wittgensteinian take has led some authors, like Richard Rorty, all the way down the path of radical relativism, a position that many philosophers of science reject. Then again, Fasce himself acknowledges that “Perhaps the authors who seek to carry out the demarcation of pseudoscience by means of family resemblance definitions do not follow Wittgenstein in all his philosophical commitments” (2019, 64).
Because of his dissatisfaction with gradualist interpretations of the science-pseudoscience landscape, Fasce (2019, 67) proposes what he calls a “metacriterion” to aid in the demarcation project. This is actually a set of four criteria, two of which he labels “procedural requirements” and two “criterion requirements.” The latter two are mandatory for demarcation, while the first two are not necessary, although they provide conditions of plausibility. The procedural requirements are: (i) that demarcation criteria should entail a minimum number of philosophical commitments; and (ii) that demarcation criteria should explain current consensus about what counts as science or pseudoscience. The criterion requirements are: (iii) that mimicry of science is a necessary condition for something to count as pseudoscience; and (iv) that all items of demarcation criteria be discriminant with respect to science.
Fasce (2018) has used his metacriterion to develop a demarcation criterion according to which pseudoscience: (1) refers to entities and/or processes outside the domain of science; (2) makes use of a deficient methodology; (3) is not supported by evidence; and (4) is presented as scientific knowledge. This turns out to be similar to a previous proposal by Hansson (2009). Fasce and Picó (2019) have also developed a scale of pseudoscientific belief based on the work discussed above.
Another author pushing a multicriterial approach to demarcation is Damian Fernandez‐Beanato (2020b), whom this article already mentioned when discussing Cicero’s early debunking of divination. He provides a useful summary of previous mono-criterial proposals, as well as of two multicriterial ones advanced by Hempel (1951) and Kuhn (1962). The failure of these attempts is what in part led to the above-mentioned rejection of the entire demarcation project by Laudan (1983).
Fernandez‐Beanato suggests improvements on a multicriterial approach originally put forth by Mahner (2007), consisting of a broad list of accepted characteristics or properties of science. The project, however, runs into significant difficulties for a number of reasons. First, like Fasce (2019), Fernandez-Beanato wishes for more precision than is likely possible, in his case aiming at a quantitative “cut value” on a multicriterial scale that would make it possible to distinguish science from non-science or pseudoscience in a way that is compatible with classical logic. It is hard to imagine how such quantitative estimates of “scientificity” may be obtained and operationalized. Second, the approach assumes a unity of science that is at odds with the above-mentioned emerging consensus in philosophy of science that “science” (and, similarly, “pseudoscience”) actually picks out a family of related activities, not a single epistemic practice. Third, Fernandez-Beanato rejects Hansson’s (and other authors’) notion that any demarcation criterion is, by necessity, temporally limited because what constitutes science or pseudoscience changes with our understanding of phenomena. But it seems hard to justify Fernandez-Beanato’s assumption that “Science … is currently, in general, mature enough for properties related to method to be included into a general and timeless definition of science” (2019, 384).
Kåre Letrud (2019), like Fasce (2019), seeks to improve on Hansson’s (2009) approach to demarcation, but from a very different perspective. He points out that Hansson’s original answer to the demarcation problem focuses on pseudoscientific statements, not disciplines. The problem with this, according to Letrud, is that Hansson’s approach does not take into sufficient account the sociological aspect of the science-pseudoscience divide. Moreover, following Hansson—again according to Letrud—one would get trapped into a never-ending debunking of individual (as distinct from systemic) pseudoscientific claims. Here Letrud invokes the “Bullshit Asymmetry Principle,” also known as “Brandolini’s Law” (named after the Italian programmer Alberto Brandolini, to whom it is attributed): “The amount of energy needed to refute BS is an order of magnitude bigger than to produce it.” Going pseudoscientific statement by pseudoscientific statement, then, is a losing proposition.
Letrud notes that Hansson (2009) adopts a broad definition of “science,” along the lines of the German Wissenschaft, which includes the social sciences and the humanities. While Fasce (2019) thinks this is problematically too broad, Letrud (2019) points out that a broader view of science implies a broader view of pseudoscience, which allows Hansson to include in the latter not just standard examples like astrology and homeopathy, but also Holocaust denialism, Bible “codes,” and so forth.
According to Letrud, however, Hansson’s original proposal does not do a good job differentiating between bad science and pseudoscience, which is important because we do not want to equate the two. Letrud suggests that bad science is characterized by discrete episodes of epistemic failure, which can occur even within established sciences. Pseudoscience, by contrast, features systemic epistemic failure. Bad science can even give rise to what Letrud calls “scientific myth propagation,” as in the case of the long-discredited notion that there are such things as learning styles in pedagogy. It can take time, even decades, to correct examples of bad science, but that does not ipso facto make them instances of pseudoscience.
Letrud applies Lakatos’s (1978) distinction of core vs. auxiliary statements for research programs to core vs. auxiliary statements typical of pseudosciences like astrology or homeopathy, thus bridging the gap between Hansson’s focus on individual statements and Letrud’s preferred focus on disciplines. For instance: “One can be an astrologist while believing that Virgos are loud, outgoing people (apparently, they are not). But one cannot hold that the positions of the stars and the character and behavior of people are unrelated” (Letrud 2019, 8). The first statement is auxiliary, the second, core.
To take homeopathy as an example, a skeptic could decide to spend an inordinate amount of time (according to Brandolini’s Law) debunking individual statements made by homeopaths. Or, more efficiently, the skeptic could target the two core principles of the discipline, namely potentization theory (that is, the notion that more diluted solutions are more effective) and the hypothesis that water holds a “memory” of substances once present in it. Letrud’s approach, then, retains the power of Hansson’s, but zeros in on the more foundational weakness of pseudoscience—its core claims—while at the same time satisfactorily separating pseudoscience from regular bad science. The debate, however, is not over, as more recently Hansson (2020) has replied to Letrud emphasizing that pseudosciences are doctrines, and that the reason they are so pernicious is precisely their doctrinal resistance to correction.
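The implausibility of the potentization core claim can be made vivid with a back-of-the-envelope calculation (a sketch added here for illustration, not part of Letrud’s argument): a “30C” remedy is the result of thirty successive 1:100 dilutions, so even starting from a full mole of the active substance, the expected number of molecules remaining in a dose is effectively zero.

```python
# Back-of-the-envelope: expected molecules of active substance in a "30C" remedy.
AVOGADRO = 6.022e23                       # molecules per mole
dilution_steps = 30                       # "30C" = thirty successive 1:100 dilutions
dilution_factor = 100 ** dilution_steps   # total dilution = 1e60

# Generously assume the starting solution held one whole mole of the substance:
expected_molecules = AVOGADRO / dilution_factor
print(expected_molecules)                 # ~6e-37: essentially zero molecules per dose
```

The point of the exercise is Letrud’s: rather than debunking individual homeopathic claims one by one, a single consideration aimed at the core principle suffices.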
5. Pseudoscience as BS
One of the most intriguing papers on demarcation to appear in the course of what this article calls the Renaissance of scholarship on the issue of pseudoscience is entitled “Bullshit, Pseudoscience and Pseudophilosophy,” authored by Victor Moberger (2020). Moberger has found a neat (and somewhat provocative) way to describe the profound similarity between pseudoscience and pseudophilosophy: in a technical philosophical sense, it is all BS.
Moberger takes his inspiration from the famous essay by Harry Frankfurt (2005), On Bullshit. As Frankfurt puts it: “One of the most salient features of our culture is that there is so much bullshit.” (2005, 1) Crucially, Frankfurt goes on to differentiate the BSer from the liar:
It is impossible for someone to lie unless he thinks he knows the truth. … A person who lies is thereby responding to the truth, and he is to that extent respectful of it. When an honest man speaks, he says only what he believes to be true; and for the liar, it is correspondingly indispensable that he consider his statements to be false. For the bullshitter, however, all these bets are off: he is neither on the side of the true nor on the side of the false. His eye is not on the facts at all, as the eyes of the honest man and of the liar are. … He does not care whether the things he says describe reality correctly. (2005, 55-56)
So, while both the honest person and the liar are concerned with the truth—though in opposite manners—the BSer is defined by his lack of concern for it. This lack of concern is of the culpable variety, so that it can be distinguished from other activities that involve not telling the truth, like acting. This means two important things: (i) BS is a normative concept, meaning that it is about how one ought to behave or not to behave; and (ii) the specific type of culpability that can be attributed to the BSer is epistemic culpability. As Moberger puts it, “the bullshitter is assumed to be capable of responding to reasons and argument, but fails to do so” (2020, 598) because he does not care enough.
Moberger does not make the connection in his paper, but since he focuses on BSing as an activity carried out by particular agents, and not as a body of statements that may be true or false, his treatment falls squarely into the realm of virtue epistemology (see below). We can all arrive at the wrong conclusion on a specific subject matter, or unwittingly defend incorrect notions. And indeed, to some extent we may all, more or less, be culpable of some degree of epistemic misconduct, because few if any people are the epistemological equivalent of sages, ideally virtuous individuals. But the BSer is pathologically epistemically culpable. He incurs epistemic vices and he does not care about it, so long as he gets whatever he wants out of the deal, be that to be “right” in a discussion, or to further his favorite a priori ideological position no matter what.
Accordingly, the charge of BSing—in the technical sense—has to be substantiated by serious philosophical analysis. The term cannot simply be thrown out there as an insult or an easy dismissal. For instance, when Kant famously disagreed with Hume on the role of reason (primary for Kant, subordinate to emotions for Hume) he could not just have labelled Hume’s position as BS and moved on, because Hume had articulated cogent arguments in defense of his take on the subject.
On the basis of Frankfurt’s notion of BSing, Moberger carries out a general analysis of pseudoscience and even pseudophilosophy. He uses the term pseudoscience to refer to well-known examples of epistemic malpractice, like astrology, creationism, homeopathy, ufology, and so on. According to Moberger, the term pseudophilosophy, by contrast, picks out two distinct classes of behaviors. The first is what he refers to as “a seemingly profound type of academic discourse that is pursued primarily within the humanities and social sciences” (2020, 600), which he calls obscurantist pseudophilosophy. The second, a “less familiar kind of pseudophilosophy is usually found in popular scientific contexts, where writers, typically with a background in the natural sciences, tend to wander into philosophical territory without realizing it, and again without awareness of relevant distinctions and arguments” (2020, 601). He calls this scientistic (Boudry and Pigliucci 2017) pseudophilosophy.
The bottom line is that pseudoscience is BS with scientific pretensions, while pseudophilosophy is BS with philosophical pretensions. What pseudoscience and pseudophilosophy have in common, then, is BS. While both pseudoscience and pseudophilosophy suffer from a lack of epistemic conscientiousness, this lack manifests itself differently, according to Moberger. In the case of pseudoscience, we tend to see a number of classical logical fallacies and other reasoning errors at play. In the case of pseudophilosophy, instead, we see “equivocation due to conceptual impressionism, whereby plausible but trivial propositions lend apparent credibility to interesting but implausible ones.”
Moberger’s analysis provides a unified explanatory framework for otherwise seemingly disparate phenomena, such as pseudoscience and pseudophilosophy. And it does so in terms of a single, more fundamental, epistemic problem: BSing. He then proceeds by fleshing out the concept—for instance, differentiating pseudoscience from scientific fraud—and by responding to a range of possible objections to his thesis, for example that the demarcation of concepts like pseudoscience, pseudophilosophy, and even BS is vague and imprecise. It is so by nature, Moberger responds, adopting the already encountered Wittgensteinian view that complex concepts are inherently fuzzy.
Importantly, Moberger reiterates a point made by other authors before, and yet very much worth reiterating: any demarcation in terms of content between science and pseudoscience (or philosophy and pseudophilosophy), cannot be timeless. Alchemy was once a science, but it is now a pseudoscience. What is timeless is the activity underlying both pseudoscience and pseudophilosophy: BSing.
There are several consequences of Moberger’s analysis. First, that it is a mistake to focus exclusively, sometimes obsessively, on the specific claims made by proponents of pseudoscience as so many skeptics do. That is because sometimes even pseudoscientific practitioners get things right, and because there simply are too many such claims to be successfully challenged (again, Brandolini’s Law). The focus should instead be on pseudoscientific practitioners’ epistemic malpractice: content vs. activity.
Second, what is bad about pseudoscience and pseudophilosophy is not that they are unscientific, because plenty of human activities are not scientific and yet are not objectionable (literature, for instance). Science is not the ultimate arbiter of what has or does not have value. While this point is hardly controversial, it is worth reiterating, considering that a number of prominent science popularizers have engaged in this mistake.
Third, pseudoscience does not lack empirical content. Astrology, for one, has plenty of it. But that content does not stand up to critical scrutiny. Astrology is a pseudoscience because its practitioners do not seem to be bothered by the fact that their statements about the world do not appear to be true.
One thing that is missing from Moberger’s paper, perhaps, is a warning that even practitioners of legitimate science and philosophy may be guilty of gross epistemic malpractice when they criticize their pseudo counterparts. Too often so-called skeptics reject unusual or unorthodox claims a priori, without critical analysis or investigation, for example in the notorious case of the so-called Campeche UFOs (Pigliucci, 2018, 97-98). From a virtue epistemological perspective, it comes down to the character of the agents. We all need to push ourselves to do the right thing, which includes mounting criticisms of others only when we have done our due diligence to actually understand what is going on. Therefore, a small digression into how virtue epistemology is relevant to the demarcation problem now seems to be in order.
6. Virtue Epistemology and Demarcation
Just like there are different ways to approach virtue ethics (for example, Aristotle, the Stoics), so there are different ways to approach virtue epistemology. What these various approaches have in common is the assumption that epistemology is a normative (that is, not merely descriptive) discipline, and that intellectual agents (and their communities) are the sources of epistemic evaluation.
The assumption of normativity very much sets virtue epistemology as a field at odds with W.V.O. Quine’s famous suggestion that epistemology should become a branch of psychology (see Naturalistic Epistemology): that is, a descriptive, not prescriptive discipline. That said, however, virtue epistemologists are sensitive to input from the empirical sciences, first and foremost psychology, as any sensible philosophical position ought to be.
A virtue epistemological approach—just like its counterpart in ethics—shifts the focus away from a “point of view from nowhere” and onto specific individuals (and their communities), who are treated as epistemic agents. In virtue ethics, the actions of a given agent are explained in terms of the moral virtues (or vices) of that agent, like courage or cowardice. Analogously, in virtue epistemology the judgments of a given agent are explained in terms of the epistemic virtues of that agent, such as conscientiousness, or gullibility.
Just like virtue ethics has its roots in ancient Greece and Rome, so too can virtue epistemologists claim a long philosophical pedigree, including but not limited to Plato, Aristotle, the Stoics, Thomas Aquinas, Descartes, Hume, and Bertrand Russell.
But what exactly is a virtue, in this context? Again, the analogy with ethics is illuminating. In virtue ethics, a virtue is a character trait that makes the agent an excellent, meaning ethical, human being. Similarly, in virtue epistemology a virtue is a character trait that makes the agent an excellent cognizer. Here is a partial list of epistemological virtues and vices to keep handy:
Epistemic virtues: attentiveness; benevolence (that is, the principle of charity); conscientiousness; creativity; curiosity; discernment; honesty; humility; objectivity; parsimony; studiousness; understanding; warrant; wisdom.

Epistemic vices: close-mindedness; dishonesty; dogmatism; gullibility; naïveté; obtuseness; self-deception; superficiality; wishful thinking.
Linda Zagzebski (1996) has proposed a unified account of epistemic and moral virtues that would cast the entire science-pseudoscience debate in more than just epistemic terms. The idea is to explicitly bring to epistemology the same inverse approach that virtue ethics brings to moral philosophy: analyzing right actions (or right beliefs) in terms of virtuous character, instead of the other way around.
For Zagzebski, intellectual virtues are actually to be thought of as a subset of moral virtues, which would make epistemology a branch of ethics. The notion is certainly intriguing: consider a standard moral virtue, like courage. It is typically understood as being rooted in the agent’s motivation to do good despite the risk of personal danger. Analogously, the virtuous epistemic agent is motivated by wanting to acquire knowledge, in pursuit of which goal she cultivates the appropriate virtues, like open-mindedness.
In the real world, sometimes virtues come in conflict with each other, for instance in cases where the intellectually bold course of action is also not the most humble, thus pitting courage and humility against each other. The virtuous moral or epistemic agent navigates a complex moral or epistemic problem by adopting an all-things-considered approach with as much wisdom as she can muster. Knowledge itself is then recast as a state of belief generated by acts of intellectual virtue.
Reconnecting all of this more explicitly with the issue of science-pseudoscience demarcation, it should now be clearer why Moberger’s focus on BS is essentially based on a virtue ethical framework. The BSer is obviously not acting virtuously from an epistemic perspective, and indeed, if Zagzebski is right, also from a moral perspective. This is particularly obvious in the cases of pseudoscientific claims made by, among others, anti-vaxxers and climate change denialists. It is not just the case that these people are not being epistemically conscientious. They are also acting unethically because their ideological stances are likely to hurt others.
A virtue epistemological approach to the demarcation problem is explicitly adopted in a paper by Sindhuja Bhakthavatsalam and Weimin Sun (2021), who both provide a general outline of how virtue epistemology may be helpful concerning science-pseudoscience demarcation. The authors also explore in detail the specific example of the Chinese practice of Feng Shui, a type of pseudoscience employed in some parts of the world to direct architects to build in ways that maximize positive “qi” energy.
Bhakthavatsalam and Sun argue that discussions of demarcation do not aim solely at separating the usually epistemically reliable products of science from the typically epistemically unreliable ones that come out of pseudoscience. What we want is also to teach people, particularly the general public, to improve their epistemic judgments so that they do not fall prey to pseudoscientific claims. That is precisely where virtue epistemology comes in.
Bhakthavatsalam and Sun build on work by Anthony Derksen (1993), who arrived at what he called an epistemic-social-psychological profile of a pseudoscientist, which in turn led him to a list of epistemic “sins” that pseudoscientists regularly engage in: lack of reliable evidence for their claims; arbitrary “immunization” from empirically based criticism (Boudry and Braeckman 2011); assigning outsized significance to coincidences; adopting magical thinking; claiming special insight into the truth; a tendency to produce all-encompassing theories; and uncritical pretension in the claims put forth.
Conversely, one can arrive at a virtue epistemological understanding of science and other truth-conducive epistemic activities. As Bhakthavatsalam and Sun (2021, 6) remind us: “Virtue epistemologists contend that knowledge is non‐accidentally true belief. Specifically, it consists in belief of truth stemming from epistemic virtues rather than by luck. This idea is captured well by Wayne Riggs (2009): knowledge is an ‘achievement for which the knower deserves credit.’”
Bhakthavatsalam and Sun discuss two distinct yet, in their mind, complementary (especially with regard to demarcation) approaches to virtue ethics: virtue reliabilism and virtue responsibilism. Briefly, virtue reliabilism (Sosa 1980, 2011) considers epistemic virtues to be stable behavioral dispositions, or competences, of epistemic agents. In the case of science, for instance, such virtues might include basic logical thinking skills, the ability to properly collect data, the ability to properly analyze data, and even the practical know-how necessary to use laboratory or field equipment. Clearly, these are precisely the sort of competences that are not found among practitioners of pseudoscience. But why not? This is where the other approach to virtue epistemology, virtue responsibilism, comes into play.
Responsibilism is about identifying and practicing epistemic virtues, as well as identifying and staying away from epistemic vices. The virtues and vices in question are along the lines of those listed in the table above. Of course, we all (including scientists and philosophers) engage in occasionally vicious, or simply sloppy, epistemological practices. But what distinguishes pseudoscientists is that they systematically tend toward the vicious end of the epistemic spectrum, while what characterizes the scientific community is a tendency to hone epistemic virtues, both by way of expressly designed training and by peer pressure internal to the community. Part of the advantage of thinking in terms of epistemic vices and virtues is that one then puts the responsibility squarely on the shoulders of the epistemic agent, who becomes praiseworthy or blameworthy, as the case may be.
Moreover, a virtue epistemological approach immediately provides at least a first-level explanation for why the scientific community is conducive to the truth while the pseudoscientific one is not. In the latter case, comments Cassam:
The fact that this is how [the pseudoscientist] goes about his business is a reflection of his intellectual character. He ignores critical evidence because he is grossly negligent, he relies on untrustworthy sources because he is gullible, he jumps to conclusions because he is lazy and careless. He is neither a responsible nor an effective inquirer, and it is the influence of his intellectual character traits which is responsible for this. (2016, 165)
In the end, Bhakthavatsalam and Sun arrive, by way of their virtue epistemological approach, at the same conclusion that we have seen other authors reach: both science and pseudoscience are Wittgensteinian-type cluster concepts. But virtue epistemology provides more than just a different point of view on demarcation. First, it identifies specific behavioral tendencies (virtues and vices) the cultivation (or elimination) of which yields epistemically reliable outcomes. Second, it shifts the responsibility to the agents as well as to the communal practices within which such agents operate. Third, it makes it possible to understand cases of bad science as being the result of scientists who have not sufficiently cultivated or heeded their virtues, which in turn explains why we find the occasional legitimate scientist who endorses pseudoscientific notions.
How do we put all this into practice, involving philosophers and scientists in the sort of educational efforts that may help curb the problem of pseudoscience? Bhakthavatsalam and Sun articulate a call for action at both the personal and the systemic levels. At the personal level, we can virtuously engage with both purveyors of pseudoscience and, likely more effectively, with quasi-neutral bystanders who may be attracted to, but have not yet bought into, pseudoscientific notions. At the systemic level, we need to create the sort of educational and social environment that is conducive to the cultivation of epistemic virtues and the eradication of epistemic vices.
Bhakthavatsalam and Sun are aware of the perils of engaging defenders of pseudoscience directly, especially from the point of view of virtue epistemology. It is far too tempting to label them as “vicious,” lacking in critical thinking, gullible, and so forth and be done with it. But basic psychology tells us that this sort of direct character attack is not only unlikely to work, but nearly guaranteed to backfire. Bhakthavatsalam and Sun claim that we can “charge without blame” since our goal is “amelioration rather than blame” (2021, 15). But it is difficult to imagine how someone could be charged with the epistemic vice of dogmatism and not take that personally.
Far more promising are two different avenues: the systemic one, briefly discussed by Bhakthavatsalam and Sun, and the personal not in the sense of blaming others, but rather in the sense of modeling virtuous behavior ourselves.
In terms of systemic approaches, Bhakthavatsalam and Sun are correct that we need to reform both social and educational structures so that we reduce the chances of generating epistemically vicious agents and maximize the chances of producing epistemically virtuous ones. School reforms certainly come to mind, but also regulation of epistemically toxic environments like social media.
As for modeling good behavior, we can take a hint from the ancient Stoics, who focused not on blaming others, but on ethical self-improvement:
If a man is mistaken, instruct him kindly and show him his error. But if you are not able, blame yourself, or not even yourself. (Marcus Aurelius, Meditations, X.4)
A good starting point may be offered by the following checklist, which—in agreement with the notion that good epistemology begins with ourselves—is aimed at our own potential vices. The next time you engage someone, in person or especially on social media, ask yourself the following questions:
Did I carefully consider the other person’s arguments without dismissing them out of hand?
Did I interpret what they said in a charitable way before mounting a response?
Did I seriously entertain the possibility that I may be wrong? Or am I too blinded by my own preconceptions?
Am I an expert on this matter? If not, did I consult experts, or did I just conjure my own unfounded opinion?
Did I check the reliability of my sources, or just google whatever was convenient to throw at my interlocutor?
After having done my research, do I actually know what I’m talking about, or am I simply repeating someone else’s opinion?
After all, as Aristotle said: “Piety requires us to honor truth above our friends” (Nicomachean Ethics, book I), though some scholars suggested that this was a rather unvirtuous comment aimed at his former mentor, Plato.
7. The Scientific Skepticism Movement
One of the interesting characteristics of the debate about science-pseudoscience demarcation is that it is an obvious example where philosophy of science and epistemology become directly useful in terms of public welfare. This, in other words, is not just an exercise in armchair philosophizing; it has the potential to affect lives and make society better. This is why we need to take a brief look at what is sometimes referred to as the skeptic movement—people and organizations who have devoted time and energy to debunking and fighting pseudoscience. Such efforts could benefit from a more sophisticated philosophical grounding, and in turn philosophers interested in demarcation would find their work to be immediately practically useful if they participated in organized skepticism.
That said, it was in fact a philosopher, Paul Kurtz, who played a major role in the development of the skeptical movement in the United States. Kurtz, together with Marcello Truzzi, founded the Committee for the Scientific Investigation of Claims of the Paranormal (CSICOP), in Amherst, New York in 1976. The organization changed its name to the Committee for Skeptical Inquiry (CSI) in November 2006 and has long been publishing the world’s premier magazine on scientific skepticism, Skeptical Inquirer. These groups, however, were preceded by a long history of skeptic organizations outside the US. The oldest skeptic organization on record is the Dutch Vereniging tegen de Kwakzalverij (VtdK), established in 1881. This was followed by the Belgian Comité Para in 1949, started in response to a large predatory industry of psychics exploiting the grief of people who had lost relatives during World War II.
In the United States, Michael Shermer, founder and editor of Skeptic Magazine, traced the origin of anti-pseudoscience skepticism to the publication of Martin Gardner’s Fads and Fallacies in the Name of Science in 1952. The French Association for Scientific Information (AFIS) was founded in 1968, and a series of groups got started worldwide between 1980 and 1990, including Australian Skeptics, Stichting Skepsis in the Netherlands, and CICAP in Italy. In 1996, the magician James Randi founded the James Randi Educational Foundation, which established a one-million-dollar prize to be given to anyone who could reproduce a paranormal phenomenon under controlled conditions. The prize was never claimed.
After the fall of the Berlin Wall, a series of groups began operating in Russia and its former satellites in response to yet another wave of pseudoscientific claims. This led to skeptic organizations in the Czech Republic, Hungary, and Poland, among others. The European Skeptic Congress was founded in 1989, and a number of World Skeptic Congresses have been held in the United States, Australia, and Europe.
Kurtz (1992) characterized scientific skepticism in the following manner: “Briefly stated, a skeptic is one who is willing to question any claim to truth, asking for clarity in definition, consistency in logic, and adequacy of evidence.” This differentiates scientific skepticism from ancient Pyrrhonian Skepticism, which famously made no claim to any opinion at all, but it makes it the intellectual descendant of the Skepticism of the New Academy as embodied especially by Carneades and Cicero (Machuca and Reed 2018).
One of the most famous slogans of scientific skepticism, “Extraordinary claims require extraordinary evidence,” was first introduced by Truzzi. It can easily be seen as a modernized version of David Hume’s dictum (1748, Section X, “Of Miracles,” Part I) that a wise person proportions his beliefs to the evidence, and it has been interpreted as an example of Bayesian thinking (McGrayne 2011).
According to another major, early exponent of scientific skepticism, astronomer Carl Sagan: “The question is not whether we like the conclusion that emerges out of a train of reasoning, but whether the conclusion follows from the premises or starting point and whether that premise is true” (1995).
Modern scientific skeptics take full advantage of the new electronic tools of communication. Two examples in particular are the Skeptics’ Guide to the Universe podcast published by Steve Novella and collaborators, which regularly reaches a large audience and features interviews with scientists, philosophers, and skeptic activists; and the “Guerrilla Skepticism” initiative coordinated by Susan Gerbic, which is devoted to the systematic improvement of skeptic-related content on Wikipedia.
Despite having deep philosophical roots, and despite the fact that some of its major exponents have been philosophers, scientific skepticism has an unfortunate tendency to be far more comfortable with science than with philosophy. Indeed, some major skeptics, such as author Sam Harris and science popularizers Richard Dawkins and Neil deGrasse Tyson, have been openly contemptuous of philosophy, giving the movement a somewhat scientistic bent. This is partly balanced by the interest in scientific skepticism shown by a number of philosophers (for instance, Maarten Boudry, Lee McIntyre) as well as by scientists who recognize the relevance of philosophy (for instance, Carl Sagan, Steve Novella).
Given the intertwining of not just scientific skepticism and philosophy of science, but also of social and natural science, the theoretical and practical study of the science-pseudoscience demarcation problem should be regarded as an extremely fruitful area of interdisciplinary endeavor—an endeavor in which philosophers can make significant contributions that go well beyond relatively narrow academic interests and actually have an impact on people’s quality of life and understanding of the world.
8. References and Further Readings
Armando, D. and Belhoste, B. (2018) Mesmerism Between the End of the Old Regime and the Revolution: Social Dynamics and Political Issues. Annales historiques de la Révolution française 391(1):3-26.
Baum, R. and Sheehan, W. (1997) In Search of Planet Vulcan: The Ghost in Newton’s Clockwork Universe. Plenum.
Bhakthavatsalam, S. and Sun, W. (2021) A Virtue Epistemological Approach to the Demarcation Problem: Implications for Teaching About Feng Shui in Science Education. Science & Education 30:1421-1452. https://doi.org/10.1007/s11191-021-00256-5.
Bloor, D. (1976) Knowledge and Social Imagery. Routledge & Kegan Paul.
Bonk, T. (2008) Underdetermination: An Essay on Evidence and the Limits of Natural Knowledge. Springer.
Boudry, M. and Braeckman, J. (2011) Immunizing Strategies and Epistemic Defense Mechanisms. Philosophia 39(1):145-161.
Boudry, M. and Pigliucci, M. (2017) Science Unlimited? The Challenges of Scientism. University of Chicago Press.
Brulle, R.J. (2020) Denialism: Organized Opposition to Climate Change Action in the United States, in: D.M. Konisky (ed.) Handbook of U.S. Environmental Policy, Edward Elgar, chapter 24.
Carlson, S. (1985) A Double-Blind Test of Astrology. Nature 318:419-25.
Cassam, Q. (2016) Vice Epistemology. The Monist 99(2):159-180.
Cicero (2014) On Divination, in: Cicero—Complete Works, translated by W.A. Falconer, Delphi.
Curd, M. and Cover, J.A. (eds.) (2012) The Duhem-Quine Thesis and Underdetermination, in: Philosophy of Science: The Central Issues. Norton, pp. 225-333.
Dawes, G.W. (2018) Identifying Pseudoscience: A Social Process Criterion. Journal for General Philosophy of Science 49:283-298.
Derksen, A.A. (1993) The Seven Sins of Demarcation. Journal for General Philosophy of Science 24:17-42.
Dupré, J. (1993) The Disorder of Things: Metaphysical Foundations of the Disunity of Science. Harvard University Press.
Fasce, A. (2018) What Do We Mean When We Speak of Pseudoscience? The Development of a Demarcation Criterion Based on the Analysis of Twenty-One Previous Attempts. Disputatio 6(7):459-488.
Fasce, A. (2019) Are Pseudosciences Like Seagulls? A Discriminant Metacriterion Facilitates the Solution of the Demarcation Problem. International Studies in the Philosophy of Science 32(3-4):155-175.
Fasce, A. and Picó, A. (2019) Conceptual Foundations and Validation of the Pseudoscientific Belief Scale. Applied Cognitive Psychology 33(4):617-628.
Feldman, R. (1981) Fallibilism and Knowing that One Knows. The Philosophical Review 90:266-282.
Fernandez-Beanato, D. (2020a) Cicero’s Demarcation of Science: A Report of Shared Criteria. Studies in History and Philosophy of Science Part A 83:97-102.
Fernandez-Beanato, D. (2020b) The Multicriterial Approach to the Problem of Demarcation. Journal for General Philosophy of Science 51:375-390.
Feyerabend, P. (1975) Against Method: Outline of an Anarchistic Theory of Knowledge. New Left Books.
Frankfurt, H. (2005) On Bullshit. Princeton University Press.
Gardner, M. (1952) Fads and Fallacies in the Name of Science. Dover.
Gauch, H.G. (2012) Scientific Method in Brief. Cambridge University Press.
Gould, S.J. (1989) The Chain of Reason vs. the Chain of Thumbs. Natural History 89(7):16.
Grosser, M. (1962) The Discovery of Neptune. Harvard University Press.
Hansson, S.O. (2009) Cutting the Gordian Knot of Demarcation. International Studies in the Philosophy of Science 23(3):237-243.
Hansson, S.O. (2013) Defining Pseudoscience—and Science, in: M. Pigliucci and M. Boudry (eds.), The Philosophy of Pseudoscience. University of Chicago Press, pp. 61-77.
Hansson, S.O. (2017) Science Denial as a Form of Pseudoscience. Studies in History and Philosophy of Science 63:39-47.
Hansson, S.O. (2020) Disciplines, Doctrines, and Deviant Science. International Studies in the Philosophy of Science 33(1):43-52.
Hausman, A., Boardman, F., and Kahane, H. (2021) Logic and Philosophy: A Modern Introduction. Hackett.
Hempel, C.G. (1951) The Concept of Cognitive Significance: A Reconsideration. Proceedings of the American Academy of Arts and Sciences 80:61–77.
Hossenfelder, S. (2018) Lost in Math: How Beauty Leads Physics Astray. Basic Books.
Kaplan, J.M. (2006) More Misuses of Evolutionary Psychology. Metascience 15(1):177-181.
Kennefick, D. (2019) No Shadow of a Doubt: The 1919 Eclipse That Confirmed Einstein’s Theory of Relativity. Princeton University Press.
Kuhn, T. (1962) The Structure of Scientific Revolutions. University of Chicago Press.
Kurtz, P. (1992) The New Skepticism. Prometheus.
LaFollette, M. (1983) Creationism, Science and the Law. MIT Press.
Lakatos, I. (1978) The Methodology of Scientific Research Programmes. Cambridge University Press.
Laudan, L. (1983) The Demise of the Demarcation Problem, in: R.S. Cohen and L. Laudan (eds.), Physics, Philosophy and Psychoanalysis. D. Reidel, pp. 111–127.
Laudan, L. (1988) Science at the Bar—Causes for Concern. In M. Ruse (ed.), But Is It Science? Prometheus.
Letrud, K. (2019) The Gordian Knot of Demarcation: Tying Up Some Loose Ends. International Studies in the Philosophy of Science 32(1):3-11.
Machuca, D.E. and Reed, B. (2018) Skepticism: From Antiquity to the Present. Bloomsbury Academic.
Mahner, M. (2007) Demarcating Science from Non-Science, in: T. Kuipers (ed.), Handbook of the Philosophy of Science: General Philosophy of Science—Focal Issues. Elsevier, pp. 515-575.
McGrayne, S.B. (2011) The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy. Yale University Press.
Merton, R.K. (1973) The Normative Structure of Science, in: N.W. Storer (ed.), The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press, pp. 267-278.
Moberger, V. (2020) Bullshit, Pseudoscience and Pseudophilosophy. Theoria 86(5):595-611.
Navin, M. (2013) Competing Epistemic Spaces. How Social Epistemology Helps Explain and Evaluate Vaccine Denialism. Social Theory and Practice 39(2):241-264.
Pigliucci, M. (2013) The Demarcation Problem: A (Belated) Response to Laudan, in: M. Pigliucci and M. Boudry (eds.), The Philosophy of Pseudoscience. University of Chicago Press, pp. 9-28.
Pigliucci, M. (2017) Philosophy as the Evocation of Conceptual Landscapes, in: R. Blackford and D. Broderick (eds.), Philosophy’s Future: The Problem of Philosophical Progress. John Wiley & Sons, pp. 75-90.
Pigliucci, M. (2018) Nonsense on Stilts, 2nd edition. University of Chicago Press, pp. 97-98.
Pigliucci, M. and Boudry, M. (eds.) (2013) The Philosophy of Pseudoscience: Reconsidering the Demarcation Problem. University of Chicago Press.
Plato (1986) Charmides. Translated by T.G. West and G.S. West, Hackett Classics.
Popper, K. (1959) The Logic of Scientific Discovery. Hutchinson.
Riggs, W. (2009) Two Problems of Easy Credit. Synthese 169(1):201-216.
Sagan, C. (1995) The Demon Haunted World. Ballantine.
Salas, D. and Salas, D. (translators) (1996) The First Scientific Investigation of the Paranormal Ever Conducted, Commissioned by King Louis XVI. Designed, conducted, & written by Benjamin Franklin, Antoine Lavoisier, & Others. Skeptic (Fall), pp. 68-83.
Shea, B. (no date) Karl Popper: Philosophy of Science. Internet Encyclopedia of Philosophy. https://iep.utm.edu/pop-sci/
The term “substance” has two main uses in philosophy. Both originate in what is arguably the most influential work of philosophy ever written, Aristotle’s Categories. In its first sense, “substance” refers to those things that are object-like, rather than property-like. For example, an elephant is a substance in this sense, whereas the height or colour of the elephant is not. In its second sense, “substance” refers to the fundamental building blocks of reality. An elephant might count as a substance in this sense. However, this depends on whether we accept the kind of metaphysical theory that treats biological organisms as fundamental. Alternatively, we might judge that the properties of the elephant, or the physical particles that compose it, or entities of some other kind better qualify as substances in this second sense. Since the seventeenth century, a third use of “substance” has gained currency. According to this third use, a substance is something that underlies the properties of an ordinary object and that must be combined with these properties for the object to exist. To avoid confusion, philosophers often substitute the word “substratum” for “substance” when it is used in this third sense. The elephant’s substratum is what remains when you set aside its shape, size, colour, and all its other properties. These philosophical uses of “substance” differ from the everyday use of “substance” as a synonym for “stuff” or “material”. This is not a case of philosophers putting an ordinary word to eccentric use. Rather, “substance” entered modern languages as a philosophical term, and it is the everyday use that has drifted from the philosophical uses.
1. Substance in Ancient Greek and Roman Philosophy
The idea of substance enters philosophy at the start of Aristotle’s collected works, in the Categories 1a. It is further developed by Aristotle in other works, especially the Physics and the Metaphysics. Aristotle’s concept of substance was quickly taken up by other philosophers in the Aristotelian and Platonic schools. By late antiquity, the Categories, along with an introduction by Porphyry, was the first text standardly taught to philosophy students throughout the Roman world, a tradition that persisted in one form or another for more than a thousand years. As a result, Aristotle’s concept of substance can be found in works by philosophers across a tremendous range of times and places. Uptake of Aristotle’s concept of substance in Hellenistic and Roman philosophy was typically uncritical, however, and it is necessary to look to other traditions for influential challenges to and/or revisions of the Aristotelian concept.
a. Substance in Aristotle
The Categories centres on two ways of dividing up the kinds of things that exist (or, on some interpretations, the kinds of words or concepts for things that exist). Aristotle starts with a simple four-fold division. He then introduces a more complicated ten-fold division. Both give pride of place to the category of substances.
Aristotle draws the four-fold division in terms of two relations: that of existing in a subject in the way that the colour grey is in an elephant, and that of being said of a subject in the way that “animal” or “four-footed” is said of an elephant. Commentators often refer to these relations as inherence and predication, respectively.
Some things, Aristotle says, exist in a subject, and some are said of a subject. Some both exist in and are said of a subject. But members of a fourth group, substances, neither exist in nor are said of a subject:
A substance—that which is called a substance most strictly, primarily, and most of all—is that which is neither said of a subject nor in a subject, e.g. the individual man or the individual horse. (Categories, 2a11)
In other words, substances are those things that are neither inherent in, nor predicated of, anything else. A problem for understanding what this means is that Aristotle does not define the said of (predication) and in (inherence) relations. Aristotle (Categories, 2b5–6) does make it clear, however, that whatever is said of or in a subject, in the sense he has in mind, depends for its existence on that subject. The colour grey and the genus animal, for example, can exist only as the colour or genus of some subject—such as an elephant. Substances, according to Aristotle, do not depend on other things for their existence in this way: the elephant need not belong to some further thing in order to exist in the way that the colour grey and the genus animal (arguably) must. In this respect, Aristotle’s distinction between substances and non-substances approximates the everyday distinction between objects and properties.
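On the standard scholarly reading, the two relations generate a two-by-two grid of cases. The following table is a reconstruction of that reading; the examples are the usual illustrative glosses rather than Aristotle’s own table:

```latex
% Four-fold division of Categories 1a-2a as a 2x2 grid.
% Columns: said of a subject (predication); rows: in a subject (inherence).
\begin{tabular}{l|l|l}
                 & said of a subject        & not said of a subject \\ \hline
in a subject     & universal non-substances & particular non-substances \\
                 & (e.g. colour in general) & (e.g. this elephant's grey) \\ \hline
not in a subject & secondary substances     & primary substances \\
                 & (e.g. man, animal)       & (e.g. the individual elephant) \\
\end{tabular}
```

The bottom-right cell, things neither in nor said of a subject, is the class Aristotle calls substances “most strictly, primarily, and most of all”.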
Scholars tend to agree that Aristotle treats the things that are said of a subject as universals and other things as particulars. If so, Aristotle’s substances are particulars: unlike the genus animal, an individual elephant cannot have multiple instances. Scholars also tend to agree that Aristotle treats the things that exist in a subject as accidental and the other things as non-accidental. If so, substances are non-accidental. However, the term “accidental” usually signifies the relationship between a property and its bearer. For example, the colour grey is an accident of the elephant because it is not part of its essence, whereas the genus animal is not an accident of the elephant but is part of its essence. The claim that an object-like thing, such as a man, a horse, or an elephant, is non-accidental therefore seems trivially true.
Unlike the four-fold division, Aristotle’s ten-fold division does not arise out of the systematic combination of two or more characteristics such as being said of or existing in a subject. It is presented simply as a list consisting of substance, quantity, qualification, relative, where, when, being-in-a-position, having, doing, and being-affected. Scholars have long debated whether Aristotle had a system for arriving at this list of categories or whether he “merely picked them up as they occurred to him”, as Kant suggests (Critique of Pure Reason, Pt.2, Div.1, I.1, §3, 10).
Despite our ignorance about how he arrived at it, Aristotle’s ten-fold division helps clarify his concept of substance by providing a range of contrast cases: substances are not quantities, qualifications, relatives and so on, all of which depend on substances for their existence.
Having introduced the ten-fold division, Aristotle also highlights some characteristics that make substances stand out (Categories, 3b–8b): a substance is individual and numerically one, has no contrary (nothing stands to an elephant as knowledge stands to ignorance or justice to injustice), does not admit of more or less (no substance is more or less a substance than another substance, no elephant is more or less an elephant than another elephant), is not said in relation to anything else (one can know what an elephant is without knowing anything else to which it stands in some relation), and is able to receive contraries (an elephant can be hot at one time, cold at another). Aristotle emphasises that whereas substances share some of these characteristics with some non-substances, the ability to receive contraries while being numerically one is unique to substances (Categories, 4a10–13).
The core idea of a substance in the Categories applies to those object-like particulars that, uniquely, do not depend for their existence on some subject in which they must exist or of which they must be said, and that are capable of receiving contraries when they undergo change. That, at any rate, is how the Categories characterises those things that are “most strictly, primarily, and most of all” called “substances”. One complication must be noted. Aristotle adds that:
The species in which the things primarily called substances are, are called secondary substances, as also are the genera of these species. For example, the individual man belongs in a species, man, and animal is a genus of the species; so these—both man and animal—are called secondary substances. (Categories, 2a13)
Strictly, then, the Categories characterises two kinds of substances: primary substances, which have the characteristics we have looked at, and secondary substances, which are the species and genera to which primary substances belong. However, Aristotle’s decision to call the species and genera to which primary substances belong “secondary substances” is not typically adopted by later thinkers. When people talk about substances in philosophy, they almost always have in mind a sense of the term derived from Aristotle’s discussion of primary substances. Except where otherwise specified, the same is true of this article.
In singling out object-like particulars such as elephants as those things that are “most strictly, primarily and most of all” called “substance”, Aristotle implies that the term “substance” is no mere label, but that it signifies a special status. A clue as to what Aristotle has in mind here can be found in his choice of terminology. The Greek term translated “substance” is ousia, an abstract noun derived from the participle ousa of the Greek verb eimi, meaning—and cognate with—I am. Unlike the English “substance”, ousia carries no connotation of standing under or holding up. Rather, ousia suggests something close to what we mean by the word “being” when we use it as a noun. Presumably, therefore, Aristotle regards substances as those things that are most strictly and primarily counted as beings, as things that exist.
Aristotle sometimes refers to substances as hypokeimena, a term that does carry the connotation of standing under (or rather, lying under), and that is often translated with the term “subject”. Early translators of Aristotle into Latin frequently used a Latin rendering of hypokeimenon—namely, substantia—to translate both terms. This is how we have ended up with the English term “substance”. It is possible that this has contributed to some of the confusions that have emerged in later discussions, which have placed too much weight on the connotations of the English term (see section 5.c).
Aristotle also discusses the concept of substance in a number of other works. If these have not had the same degree of influence as the Categories, their impact has nonetheless been considerable, especially on scholastic Aristotelianism. Moreover, these works add much to what Aristotle says about substance in the Categories, in some places even seeming to contradict it.
The most important development of Aristotle’s concept of substance outside the Categories is his analysis of material substances into matter (hyle) and form (morphe)—an analysis that has come to be known as hylomorphism (though only since the late nineteenth century). This analysis is developed in the Physics, a text dedicated to things that undergo change, and which, unsurprisingly therefore, also has to do with substances. Given the distinctions drawn in the Categories, one might expect Aristotle’s account of change to simply say that change occurs when a substance gains or loses one of the things that is said of or that exists in it—before its bath, the elephant is hot and grey, but afterwards, it is cool and mud-coloured. However, Aristotle also has the task of accounting for substantial change. That is, the coming to be or ceasing to exist of a substance. An old tradition in Greek philosophy, beginning with Parmenides, suggests that substantial change should be impossible, since it involves something coming from nothing or vanishing into nothing. In the Physics, Aristotle addresses this issue by analysing material substances into the matter they are made of and the form that organises that matter. This allows him to explain substantial change. For example, when a vase comes into existence, the pre-existing clay acquires the form of a vase, and when it is destroyed, the clay loses the form of a vase. Neither process involves something coming from or vanishing into nothing. Likewise, when an elephant comes into existence, pre-existing matter acquires the form of an elephant. When an elephant ceases to exist, the matter loses the form of an elephant, becoming (mere) flesh and bones.
Aristotle returns to the topic of substance at length in the Metaphysics. Here, much to the confusion of readers, Aristotle raises the question of what is most properly called a “substance” afresh and considers three options: the matter of which something is made, the form that organises that matter, or the compound of matter and form. Contrary to what was said in the Categories and the Physics, Aristotle seems to say that the term “substance” applies most properly not to a compound of matter and form such as an elephant or a vase, but to the form that makes that compound the kind of thing it is. (The form that makes a hylomorphic compound the kind of thing it is, such as the form of an elephant or the form of a vase, is referred to as a substantial form, to distinguish it from accidental forms such as size or colour). Scholars do not agree on how to reconcile this position with that of Aristotle’s other works. In any case, it should be noted that it is Aristotle’s identification of substances with object-like particulars such as elephants and vases that has guided most of later discussions of substance.
One explanation for Aristotle’s claim in the Metaphysics that it is the substantial form that most merits the title of “substance” concerns material change. In the Categories, Aristotle emphasises that substances are distinguished by their ability to survive through change. Living things, such as elephants, however, do not just change with respect to accidental forms such as temperature and colour. They also change with respect to the matter they are made of. As a result, it seems that if the elephant remains the same elephant over time, this must be in virtue of its having the same substantial form.
In the Metaphysics, Aristotle rejects the thesis that the term “substance” applies to matter. In discussing this thesis, he anticipates a usage that becomes popular from the seventeenth century onwards. On this usage, “substance” does not refer to object-like particulars such as elephants or vases; rather, it refers to an underlying thing that must be combined with properties to yield an object-like particular. This underlying thing is typically conceived as having no properties in itself, but as standing under or supporting the properties with which it must be combined. The application of the term “substance” to this underlying thing is confusing, and the common practice of favouring the word “substratum” in this context is followed here. The idea of a substratum that must be combined with properties to yield a substance in the ordinary sense is close to Aristotle’s idea of matter that must be combined with form. It is closer still to the concept of prime matter, which is traditionally (albeit controversially) attributed to Aristotle and which, unlike flesh or clay, is conceived as having no properties in its own right, except perhaps spatial extension. Though the concept of a substratum is not the same as the concept of substance in its original sense, it also plays an extremely important role in the history of philosophy, and one that has antecedents earlier than Aristotle in the Presocratics and in classical Indian philosophy, a topic discussed in section 2.b.
b. Substance in Hellenistic and Roman Philosophy
As noted in the previous section, in the Categories, Aristotle distinguishes two kinds of non-substance: those that exist in a subject and those that are said of a subject. He goes on to divide these further into the ten categories from which the work takes its name: quantity, qualification, relative, where, when, being-in-a-position, having, doing, being-affected, and secondary substance (which we can count as non-substances for the reasons explained in section 1.a).
Although an enormous number of subsequent thinkers adopt the basic distinction between substances and non-substances, many omit the distinction between predication and inherence. That is, between non-substances that are said of a subject and non-substances that exist in a subject. Moreover, many compact the list of non-substances. For example, the late Neoplatonist Simplicius (480–560 C.E.) records that the second head of the Academy after Plato, Xenocrates (395/96–313/14 B.C.E.), as well as the eleventh head of the Peripatetic school, Andronicus of Rhodes (ca.60 B.C.E.), reduced Aristotle’s ten categories to two: things that exist in themselves, meaning substances, and things that exist in relation to something else, meaning non-substances.
In adopting the language of things that exist in themselves and those that exist in relation to something else, philosophers such as Xenocrates and Andronicus of Rhodes appear to have been recasting Aristotle’s distinction between substances and non-substances in a terminology that approximates that of Plato’s Sophist (255c). It can therefore be argued that the distinction between substances and non-substances that later thinkers inherit from Aristotle also has a line of descent from Plato, even if Plato devotes much less attention to the distinction.
The definition of substances as things that exist in themselves (kath’ auta or per se) is commonplace in the history of philosophy after Aristotle. The expression is, however, regrettably imprecise, both in the original Greek and in the various translations that have followed. For it is not clear what the preposition “in” is supposed to signify here. Clearly, it does not signify containment, as when water exists in a vase or a brick in a wall. It is plausible that the widespread currency of this vague phrase is responsible for the failure of the most influential philosophers from antiquity onwards to state explicit necessary and sufficient conditions for substancehood.
The simplification of the category of non-substances and the introduction of the Platonic in itself terminology are the main philosophical innovations respecting the concept of substance in Hellenistic and Roman philosophy. The concept would also be given a historic theological application when the Nicene Creed (ca.325 C.E.) defined the Father and Son of the Holy Trinity as consubstantial (homoousion) or of one substance. As a result, the philosophical concept of substance would play a central role in the Arian controversy that shaped early Christian theology.
Although Hellenistic and Roman discussions of substance tend to be uncritical, an exception can be found in the Pyrrhonist tradition. Sextus Empiricus records a Pyrrhonist argument against the distinction between substance and non-substance, which says, in effect, that:
If things that exist in themselves do not differ from things that exist in relation to something else, then they too exist in relation to something else.
If things that exist in themselves do differ from things that exist in relation to something else, then they too exist in relation to something else (for to differ from something is to stand in relation to it).
Therefore, the idea of something that exists in itself is incoherent (see McEvilley 2002, 469).
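Schematically, the argument is a constructive dilemma. The following reconstruction, including the letters D and R, is a gloss introduced here, not notation from Sextus:

```latex
% D: things that exist in themselves differ from things that
%    exist in relation to something else.
% R: things that exist in themselves exist in relation to
%    something else.
\begin{align*}
&\neg D \rightarrow R && \text{(premise 1: without a difference, no separate category)}\\
&D \rightarrow R      && \text{(premise 2: to differ is itself to stand in a relation)}\\
&D \lor \neg D        && \text{(excluded middle)}\\
&\therefore\ R        && \text{(constructive dilemma)}
\end{align*}
```

Since R contradicts the definition of something that exists in itself, the Pyrrhonist concludes that the category is incoherent.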
While arguing against the existence of substances is not a central preoccupation of Pyrrhonist philosophy, it is a central concern of the remarkably similar Buddhist Madhyamaka tradition, and there is a possibility of influence in one direction or the other.
2. Substance in Classical Indian Philosophy
The concept of substance in Western philosophy derives from Aristotle via the ancient and medieval philosophical traditions of Europe, the Middle East and North Africa. Either the same or a similar concept is central to the Indian Vaisheshika and Jain schools, to the Nyaya school with which Vaisheshika merged and, as an object of criticism, to various Buddhist schools. This appears to have been the first time that the concept of substance was subjected to sustained philosophical criticism, anticipating and possibly influencing the well-known criticisms of the idea of substance advanced by early modern Western thinkers.
a. Nyaya-Vaisheshika and Jain Substances
There exist six orthodox schools of Indian philosophy (those that acknowledge the authority of the Vedas—the principal Hindu scriptures) and four major unorthodox schools. The orthodox schools include Vaisheshika and Nyaya, which appear to have begun as separate traditions but which merged some time before the eleventh century. The founding text of the Vaisheshika school, the Vaisheshikasutra, is attributed to a philosopher named Kaṇāda and was composed sometime between the fifth and the second century B.C.E. Like Aristotle’s Categories, the focus of the Vaisheshikasutra is on how we should divide up the kinds of things that exist. The Vaisheshikasutra presents a three-fold division into substance (dravya), quality (guna), and motion (karman). The substances are divided, in turn, into nine kinds. These are the five elements—earth, water, fire, air, and aether—with the addition of time, space, soul, and mind.
The early Vaisheshika commentators, Praśastapāda (ca. 6th century) and Candrānanda (ca. 8th century), expand the Vaisheshikasutra’s three-category division into what has become a canonical list of six categories. The additional categories are universal (samanya), particularity (vishesha), and inherence (samavaya), concepts which are also mentioned in the Vaisheshikasutra, but which are not, in that text, given the same prominence as substance, quality and motion (excepting one passage of a late edition which is of questionable authenticity).
The Sanskrit term translated as “substance”, dravya, comes from drú, meaning wood or tree, and therefore has an etymology parallel to that of Aristotle’s term for matter, hyle, which means wood in non-philosophical contexts. Nonetheless, it is widely recognised that the meaning of dravya is close to the meaning of Aristotle’s ousia: like Aristotle’s ousiai, dravyas are contrasted with quality and motion, they are distinguished by their ability to undergo change and by the fact that other things depend on them for their existence. McEvilley (2002, 526–7) lists further parallels.
At the same time, there exist important differences between the Vaisheshika approach to substance and that of Aristotle. One difference concerns the paradigmatic examples. Aristotle’s favourite examples of substances are individual objects, and it is not clear that he would count the five classical elements, soul, or mind, as substances. (Aristotle’s statements on these themes are ambiguous and interpretations differ.) Moreover, Aristotle would not class space or time as substances. This, however, need not be taken to show that the Vaisheshika and Aristotelian concepts of substance are themselves fundamentally different. For philosophers who inherit Aristotle’s concept of substance often disagree with Aristotle about its extension in ways similar to those in which the Vaisheshika philosophers do.
A second difference between the Vaisheshika approach to substance and Aristotle’s is that according to Vaisheshika philosophers, composite substances (anityadravya, that is, non-eternal substances), though they genuinely exist, do not persist through change. An individual atom of earth or water exists forever, but as soon as you remove a part of a tree, you have a new tree (Halbfass 1992, 96). A possible explanation for both differences between Vaisheshika and Aristotelian substances is that the former are not understood as compounds of matter and form but play rather a role somewhere between that of Aristotelian substances and Aristotelian matter.
Something closer to Aristotle’s position on this point is found in Jain discussions of substance, which appear to be indebted to the Vaisheshika notion, but which combine it with the idea of a vertical universal (urdhvatasamanya). The vertical universal plays a similar role to Aristotle’s substantial form, in that it accompanies an individual substance through nonessential modifications and can therefore account for its identity through material change.
The earliest parts of the Vaisheshikasutra are believed to have been authored between the fifth and second centuries B.C.E., with most parts being in place by the second century C.E. (Moise and Thite 2022, 46). This interval included a period of intense cultural exchange between Greece and India, beginning in the final quarter of the fourth century B.C.E. In view of the close parallels between the philosophy of Aristotle and that of the proponents of Vaisheshika, and of the interaction between the two cultures going on at this time, Thomas McEvilley (2002, 535) states that “it is possible to imagine stimulus diffusion channels” whereby elements of Vaisheshika’s thought “could reflect Greek, and specifically Peripatetic, influence”, including Aristotelian ideas about substance. However, it is also possible that the Vaisheshika and Aristotelian concepts of substance developed independently, despite their similarity.
b. Upanishadic Substrata
The paradigmatic examples of substances identified by Vaisheshika thinkers, like those identified by Aristotelians, are ordinary propertied things such as earth, water, humans and horses. Section 1.a noted that since the seventeenth century, the term “substance” has acquired another usage, according to which “substance” does not apply to ordinary propertied things, but to a putative underlying entity that is supposed to lack properties in itself but to combine with properties to yield substances of the ordinary sort. The underlying entity is often referred to as a substratum to distinguish it from substances in the traditional sense of the term. Although the application of the term “substance” to substrata only became well-established in the twentieth century, the idea that substances can be analysed into properties and an underlying substratum is very old and merits attention here.
As already mentioned, the idea of a substratum is exemplified by the idea of prime matter traditionally attributed to Aristotle. An earlier precursor of this idea is the Presocratic Anaximander, according to whom the apeiron underlies everything that exists. Apeiron is usually translated “infinite”; however, in this context, a more illuminating (albeit etymologically parallel) translation would be “unlimited” or “indefinite”. Anaximander’s apeiron is a thing conceived of in abstraction from any characteristics that limit or define its nature: it is a propertyless substratum. It is reasonable, moreover, to attribute essentially the same idea to Anaximander’s teacher, Thales. For although Thales identified the thing underlying all reality as water, and not as the apeiron, once it is recognised that “water” here is used as a label for something that need not possess any of the distinctive properties of water, the two ideas turn out to be more or less the same.
Thales was the first of the Presocratics and, therefore, the earliest Western philosopher to whom the idea of a substratum can be attributed. Thomas McEvilley (2002) argues that it is possible to trace the idea of a substratum still further back to the Indian tradition. First, McEvilley proposes that Thales’ claim that everything is water resembles a claim advanced by Sanatkumara in the Chandogya Upanishad (ca. 8th–6th century B.C.E.), which may well predate Thales. Moreover, just as we can recognise an approximation of the idea of a propertyless substratum in Thales’ claim, the same goes for Sanatkumara’s. McEvilley adds that even closer parallels can be found between Anaximander’s idea of the apeiron and numerous Upanishadic descriptions of brahman as that which underlies all beings, descriptions which, in this case, certainly appear much earlier.
The idea of substance in the sense of an underlying substratum can, therefore, be traced back as far as the Upanishads, and it is possible that the Upanishads influenced the Presocratic notion and, in turn, Aristotle. For there was significant Greek-Indian interchange in the Presocratic period, mediated by the Persian empire, and there is persuasive evidence that Presocratic thinkers had some knowledge of Upanishadic texts or of some unknown source that influenced both (McEvilley 2002, 28–44).
c. Buddhist Objections to Substance
The earliest sustained critiques of the notion of substance appear in Buddhist philosophy, beginning with objections to the idea of a substantial soul or atman. Early objections to the idea of a substantial soul are extended to substances in general by Nagarjuna, the founder of the Madhyamaka school, in around the second or third century C.E. As a result, discussions about substances would end up being central to the philosophical traditions across Eurasia in the succeeding centuries.
The earliest Buddhist philosophical texts are the discourses attributed to the Buddha himself and to his immediate disciples, collected in the Sutra Piṭaka. These are followed by the more technical and systematic Abhidharma writings collected in the Abhidhamma Piṭaka. The Sutra Piṭaka and the Abhidhamma Piṭaka are two of the three components of the Buddhist canon, the third being the collection of texts about monastic living known as the Vinaya Piṭaka. (The precise content of these collections differs in different Buddhist traditions, the Abhidhamma Piṭaka especially.)
The Sutra Piṭaka and the Abhidhamma Piṭaka both contain texts arguing against the idea of a substantial soul. According to the authors of these texts, the term atman is applied by convention to what is in fact a mere collection of mental and physical events. The Samyutta Nikaya, a subdivision of the Sutra Piṭaka, attributes a classic expression of this view to the Buddhist nun, Vajira. Bhikkhu Bodhi (2000, 230) translates the relevant passage as follows:
Why now do you assume ‘a being’?
Mara, is that your speculative view?
This is a heap of sheer formations:
Here no being is found.
Just as, with an assemblage of parts,
The word ‘chariot’ is used,
So, when the aggregates exist,
There is the convention ‘a being’.
Although they oppose the idea of a substantial self, the texts collected in the Sutra Piṭaka and the Abhidhamma Piṭaka do not argue against the existence of substances generally. Indeed, Abhidharma philosophers analysed experiential reality into elements referred to as dharmas, which are often described in terms suggesting that they are substances (all the more so in later, noncanonical texts in the Abhidharma tradition).
The Madhyamaka school arose in response to Abhidharma philosophy as well as non-Buddhist schools such as Nyaya-Vaisheshika. In contrast to earlier Buddhist thought, its central preoccupation is the rejection of substances generally.
Madhyamaka means middle way. The school takes this name from its principal doctrine, which aims to establish a middle way between two opposing metaphysical views: realism (broadly the view that some things are ultimately real) and nihilism (the view that ultimately, nothing exists). Nagarjuna expresses the third alternative as the view that everything is characterised by emptiness (sunyata), which he explicates as the absence of svabhava. While svabhava has various interconnected meanings in Nagarjuna’s thought, it is mainly used to express the idea of substance understood as “any object that exists objectively, the existence and qualities of which are independent of other objects, human concepts, or interests” (Westerhoff 2009, 199).
Westerhoff (2009, 200–212) summarises several arguments against substance that can be attributed to Nagarjuna. These include an argument that substances could not stand in causal relations, an argument that substance could not undergo change, and an argument that there exists no satisfactory account of the relation between a substance and its properties. The first two appear to rule out substances only on the assumption that substances, if they exist at all, must stand in causal relations and undergo change, something that most, but not all, proponents of substances would hold. Regarding the self or soul, Nagarjuna joins with other Buddhist schools in arguing that what we habitually think of as a substantial self is in fact a collection of causally interconnected psychological and physical events.
The principal targets of Nagarjuna’s attacks on the concept of substance are Abhidharma and Nyaya-Vaisheshika philosophies. A central preoccupation of the Nyaya school is to respond to Buddhist arguments, including those against substance. It is possible that a secondary target is the concept of substance in Greek philosophy. As noted above, there is some evidence of influence between the Greek and Indian philosophical traditions in one or both directions. Greeks in India took a significant interest in Buddhism, with Greek converts contributing to Buddhist culture. The best known of these, Menander, a second century B.C.E. king of Bactria, is one of the two principal interlocutors in the Milindasutra, a Buddhist philosophical dialogue that includes a famous presentation of Vajira’s chariot analogy.
There also exist striking parallels between the arguments of the Pyrrhonists, as recorded by Sextus Empiricus in around 200 C.E., and those of the Madhyamaka school founded by Nagarjuna at about the same time (McEvilley 2002; Neale 2014). Diogenes Laertius records that Pyrrho himself visited India with Alexander the Great’s army, spending time in Taxila, which would become a centre of Buddhist philosophy. Roman historians record flourishing trade between the Roman empire and India. There was, therefore, considerable opportunity for philosophical interchange during the period in question. Nonetheless, arguing against the idea of substance does not seem to have been such a predominant preoccupation for the Pyrrhonists as it was for the Madhyamaka philosophers.
3. Substance in Medieval Arabic and Islamic Philosophy
Late antiquity and the Middle Ages saw a decline in the influence of Greco-Roman culture in and beyond Europe, hastened by the rise of Islam. Nonetheless, the tradition of beginning philosophical education with Aristotle’s logical works, starting with the Categories, retained an enormous influence in Middle Eastern intellectual culture. (Aristotle’s work was read not only in Greek but also in Syriac and Arabic translations from the sixth and ninth centuries respectively). The translation of Greek philosophical works into Arabic was accompanied by a renaissance in Aristotelian philosophy beginning with al-Kindi in the ninth century. Inevitably, this included discussions of the concept of substance, which is present throughout the philosophy of this period. Special attention is due to al-Farabi for an early detailed treatment of the topic and to Avicebron (Solomon ibn Gabirol) for his influential defence of the thesis that all substances must be material. Honourable mention is also due to Avicenna’s (Ibn Sina) floating-man argument, which is widely seen as anticipating Descartes’ (in)famous disembodiment argument for the thesis that the mind is an immaterial substance.
a. Al-Farabi
The resurgence of Aristotelian philosophy in the Arabic and Islamic world is usually traced back to al-Kindi. Al-Kindi’s works on logic (the subject area to which the Categories is traditionally assigned) have however been lost, and with them any treatment of substance they might have contained. Thérèse-Anne Druart (1987) identifies al-Farabi’s discussion of djawhar, in his Book of Letters, as the first serious Arabic study of substance. There, al-Farabi distinguishes between the literal use of djawhar (meaning gem or ore), metaphorical uses to refer to something valuable or to the material of which something is constituted, and three philosophical uses as a term for substance or essence.
The first two philosophical uses of djawhar identified by al-Farabi approximate Aristotle’s primary and secondary substances. That is, in the first philosophical usage, djawhar refers to a particular that is not said of and does not exist in a subject. For example, an elephant. In the second philosophical usage, it refers to the essence of a substance in the first sense. For example, the species elephant. Al-Farabi adds a third use of djawhar, in which it refers to the essence of a non-substance. For example, to colour, the essence of the non-substance grey.
Al-Farabi says that the other categories depend on those of first and second substances and that this makes the categories of first and second substances more perfect than the others. He reviews alternative candidates for the status of djawhar put forward by unnamed philosophers. These include universals, indivisible atoms, spatial dimensions, mathematical points, and matter. The idea appears to be that these could turn out to be superior candidates for substances because they are more perfect. However, with one exception, al-Farabi does not discover anything more perfect than primary and secondary substances.
The exception is as follows. Al-Farabi claims that it can be proved that there exists a being that is neither in nor predicated of a subject and that is not a subject for anything else either. This being, al-Farabi claims, is more worthy of the term djawhar than the object-like primary substances, insofar as it is still more perfect. Although al-Farabi indicates that it would be reasonable to extend the philosophical usage of djawhar in this way, he does not propose to break with the established use. Insofar as “more perfect” means “more fundamental”, we see here the tension mentioned at the beginning of this article between the use of the term “substance” for object-like things and its use for whatever is most fundamental.
b. Avicebron (Solomon ibn Gabirol)
Avicebron was an eleventh century Iberian Jewish Neoplatonist. In addition to a large corpus of poetry, he wrote a philosophical dialogue, known by its Latin name, Fons Vitae (Fountain of Life), which would have a great influence on Christian scholastic philosophy in the twelfth and thirteenth centuries.
Avicebron’s principal contribution to the topic of substance is his presentation of the position known as universal hylomorphism. As explained in section 1, Aristotle defends hylomorphism, the view that material substances are composed of matter (hyle) and form (morphe). However, Aristotle does not extend this claim to all substances. He leaves room for the view that there exist many substances, including human intellects, that are immaterial. By late antiquity, a standard interpretation of Aristotle emerged, according to which such immaterial substances do in fact exist. By contrast, in the Fons Vitae, Avicebron defends the thesis that all substances, with the only exception of God, are composed of matter and form.
There is a sense in which Avicebron’s universal hylomorphism is a kind of materialism: he holds that created reality consists solely of material substances. It is however important not to be misled by this fact. For although they argue that all substances, barring God, are composed of matter and form, Avicebron and other universal hylomorphists draw a distinction between the ordinary matter that composes corporeal substances and the spiritual matter that composes spiritual substances. Spiritual matter plays the same role as ordinary matter in that it combines with a form to yield a substance. However, the resulting substances do not have the characteristics traditionally associated with material entities. They are not visible objects that take up space. Hence, universal hylomorphism would not satisfy traditional materialists such as Epicurus or Hobbes, who defend their position on the basis that everything that exists must take up space.
Scholars do not agree on what the case for universal hylomorphism is supposed to be. Paul Vincent Spade (2008) suggests that it results from two assumptions: that only God is metaphysically simple in all respects, and that anything that is not metaphysically simple in all respects is a composite of matter and form. However, Avicebron does not explicitly defend this argument, and it is not obvious why something could not qualify as non-simple in virtue of being complex in some way other than involving matter and form.
4. Substance in Medieval Scholastic Philosophy
In the early sixth century, Boethius set out to translate the works of Plato and Aristotle into Latin. This project was cut short when he was executed by Theodoric the Great, but Boethius still did manage to translate Aristotle’s Categories and De Interpretatione. A century later, Isidore of Seville summarised Aristotle’s account of substance in the Categories in his Etymologiae, perhaps the most influential book of the Middle Ages, after the Bible. As a result, the concept of substance introduced in Aristotle’s Categories remained familiar to philosophers after the fall of the Western Roman Empire. Nonetheless, prior to the twelfth century, philosophy in the Latin West consisted principally in elaborating on traditional views, inherited from the Church Fathers and other familiar authorities. It is only in the twelfth century that philosophers made novel contributions to the topic of substance, influenced by Arabic-Islamic philosophy and by the recovery of ancient works by Aristotle and others. The most important contributions are those of Thomas Aquinas and John Duns Scotus.
a. Thomas Aquinas
All the leading philosophers of this period adopted a version of Aristotle’s concept of substance. Many, and in particular those in the Franciscan order, such as Bonaventure, followed Avicebron in accepting universal hylomorphism. Aquinas’s main contribution to the topic of substance is his opposition to Avicebron’s position.
Aquinas endorses Aristotle’s definition of a substance as something that neither is said of, nor exists in, a subject, and he follows Aristotle in analysing material substances as composites of matter and form. However, Aquinas recognised a problem about how to square these views with his belief that some substances, including human souls, are immaterial.
Aquinas was committed to the view that, unlike God, created substances are characterised by potentiality. For example, before its bath, the elephant is actually hot but potentially cool. Aquinas takes the view that in material substances, it is matter that contributes potentiality. For matter is capable of receiving different forms. Since immaterial substances lack matter, it seems to follow that they also lack potentiality. Aquinas is happy to accept this conclusion respecting God whom he regards as pure act. He is however not willing to say the same of other immaterial substances, such as angels and human souls, which he takes to be characterised by potentiality no less than material substances.
One solution would be to adopt the universal hylomorphism of Avicebron, but Aquinas rejects this position on the basis that the potentiality of matter, as usually understood, consists ultimately in its ability to move through space. If so, it seems that matter can only belong to spatial, and hence corporeal, beings (Questiones Disputate de Anima, 24.1.49.142–164).
Instead, Aquinas argues that although immaterial substances are not composed of matter and form, they are composed of essence and existence. In immaterial substances, it is their essence that contributes potentiality. This account of immaterial substances presupposes that existence and essence are distinct, an idea that had been anticipated by Avicenna as a corollary of his proof of God’s existence. Aquinas defends the distinction between existence and essence in De Ente et Essentia, though scholars disagree about how exactly the argument should be understood (see Gavin Kerr’s article on Aquinas’s Metaphysics).
Aquinas recognises that one might be inclined to refer to incorporeal potentiality as matter simply on the basis that it takes on, in spiritual substances, the role that matter plays in corporeal substances. However, he takes the view that this use of the term “matter” would be equivocal and potentially misleading.
A related, but more specific, contribution by Aquinas concerns the issue of how a human soul, if it is the form of a hylomorphic compound, can nonetheless be an immaterial substance in its own right, capable of existing without the body after its death. Aquinas compares the propensity of the soul to be embodied to the propensity of lighter objects to rise, observing that in both cases, the propensity can be obstructed while the object remains in existence. For more on this issue, see Christopher Brown’s article on Thomas Aquinas.
b. Duns Scotus
Like Aquinas, Scotus adopts the Categories’ account of substance. In contrast to earlier Franciscans, he agrees with Aquinas’s rejection of universal hylomorphism. Indeed, Scotus goes even further, claiming not only that form can exist without matter, but also that prime matter can exist without form. As a result, Scotus is committed to the view that matter has a kind of formless actuality, something that, in Aquinas’s system, looks like a contradiction.
Although he drops the doctrine of universal hylomorphism, Scotus maintained, against Aquinas, a second thesis concerning substances associated with Franciscan philosophers and often paired with universal hylomorphism: the view that a single substance can have multiple substantial forms (Ordinatio, 4).
According to Aquinas, a substance has only one substantial form. For example, the substantial form of an elephant is the species elephant. The parts of the elephant, such as its organs, do not have their own substantial forms. Because substantial forms are responsible for the identity of substances over time, this view has the counterintuitive consequence that when, for example, an organ transplant takes place, the organ acquired by the recipient is not the one that was possessed by the donor.
According to Scotus, by contrast, one substance can have multiple substantial forms. For example, the parts of the elephant, such as its organs, may each have their own substantial form. This allows followers of Scotus to take the intuitive view that when an organ transplant takes place, the organ acquired by the recipient is one and the same as the organ that the donor possessed, and not a new entity that has come into existence after the donor’s death. (Aristotle seems to endorse the position of Scotus in the Categories, and that of Aquinas in the Metaphysics.)
Scotus is also known for introducing the idea that every substance has a haecceity (thisness), that is, a property that makes it the particular thing that it is. In this, he echoes the earlier Vaisheshika idea of a vishesha (usually translated “particularity”) which plays approximately the same role (Kaipayil 2008, 79).
5. Substance in Early Modern Philosophy
Prior to the early modern period, Western philosophers tend to adopt both Aristotle’s definition of substance in the Categories and his analysis of material substances into matter and form. In the early modern period, this practice begins to change, with many philosophers offering new characterisations of substance, or rejecting the notion of substance entirely. The most influential contribution from this period is Descartes’ independence definition of substance. Although many earlier philosophers have been interpreted as saying that substances are things that have independent existence, Descartes appears to be the first prominent thinker to say this explicitly. Descartes’ influence, respecting this and other topics, was reinforced by Antoine Arnauld and Pierre Nicole’s Port-Royal Logic, which, towards the end of the seventeenth century, took the place of Aristotle’s Categories as the leading introduction to philosophy. Important contributions to the idea of substance in this period are also made by Spinoza, Leibniz, Locke and Hume, all of whom are known for resisting some aspect of Descartes’ account of substance.
a. Descartes
Substance is one of the central concepts of Descartes’ philosophy, and he returns to it on multiple occasions. In the second set of Objections and Replies to the Meditations on First Philosophy, Descartes advances a definition of substance that resembles Aristotle’s definition of substance in the Categories. This is not surprising given that Descartes underwent formal training in Aristotelian philosophy at the Royal College of La Flèche, France. In a number of other locations, however, Descartes offers what has been called the independence definition of substance. According to the independence definition, a substance is anything that could exist by itself or, equivalently, anything that does not depend on anything else for its existence (Oeuvres, vol. 7, 44, 226; vol. 3, 429; vol. 8a, 24).
Scholars disagree about how exactly we should understand Descartes’ independence definition. Some have argued that Descartes’ view is that substances must be causally independent, in the sense that they do not require anything else to cause them to exist. Another, perhaps more popular, view is that, for Descartes, substances are modally independent, meaning that the existence of a substance does not necessitate the existence of any other entity. This interpretation itself has several variants (see Weir 2021, 281–7).
In addition to offering a new definition of substance, Descartes draws a distinction between a strict and a more permissive sense of the term. A substance in the strict sense satisfies the independence definition without qualification. Descartes claims that there is only one such substance: God. For everything else depends on God for its existence. Descartes adds, however, that we can count as created substances those things that depend only on God for their existence. Descartes claims that finite minds and bodies qualify as created substances in this sense, whereas their properties (attributes, qualities and modes in his terminology) do not.
It is possible to view Descartes’ independence definition of substance as a disambiguation of Aristotle’s definition of substance in the Categories. Aristotle says that substances do not depend, for their existence, on any other being of which they must be predicated or in which they must inhere. He does not however say explicitly whether substances depend in some other way on other things for their existence. Descartes clarifies that they do not. This is consistent with, and may even be implied by, what Aristotle says in the Categories.
In another respect, Descartes’ understanding of substance departs dramatically from the Aristotelian orthodoxy of his day. For example, while Descartes accepts Aristotle’s claim that in the case of a living human, the soul serves as the form of the body, he exhibits little or no sympathy for hylomorphism beyond this. Rather than analysing material substances into matter and form like Aristotle, or substances in general into potency and act like Aquinas, Descartes proposes that every substance has, as its principal attribute, one of two properties—namely, extension or thought—and that all accidental properties of substances are modes of their principal attribute. For example, being elephant-shaped is a mode of extension, and seeing sunlight glimmer on a lake is a mode of thought. In contrast to the scholastic theory of real accidents, Descartes holds that these modes are only conceptually distinct from, and cannot exist without, the substances to which they belong.
One consequence is that Descartes appears to accept what has come to be known as the bundle view of substances: the thesis that, in his words, “the attributes all taken together are the same as the substance” (Conversation with Burman, 7). To put it another way, once we have the principal attribute of the elephant—extension—and all of the accidental attributes, such as its size, shape, texture and so on, we have everything that this substance comprises. These attributes do not need to be combined with a propertyless substratum. (The bundle view, in the relevant sense, contrasts with the substratum view, according to which a substance is composed of properties and a substratum. Sometimes, the term “bundle view” is used in a stronger sense, to imply that the properties that make up a substance could exist separately, but Descartes does not endorse the bundle view in this stronger sense.)
A further consequence is that Descartes could not accept the standard transubstantiation account of the eucharist, which depended on the theory of real accidents, and was obliged to offer a competing account.
In the late seventeenth century, two followers of Descartes, Antoine Arnauld and Pierre Nicole, set out to author a modern introduction to logic that could serve in place of the texts of Aristotle’s Organon, including the Categories. (The word “logic” is used here in a traditional sense that is significantly broader than the sense that philosophers of the beginning of the twenty-first century would attribute to it, including much of what these philosophers would recognize as metaphysics.) The result was La logique ou l’art de penser, better known as the Port-Royal Logic, a work that had an enormous influence on the next two centuries of philosophy. The Port-Royal Logic offers the following definition of substance:
I call whatever is conceived as subsisting by itself and as the subject of everything conceived about it, a thing. It is otherwise called a substance. […] This will be made clearer by some examples. When I think of a body, my idea of it represents a thing or a substance, because I consider it as a thing subsisting by itself and needing no other subject to exist. (30–31)
This definition combines Aristotle’s idea that a substance is the subject of other categories with Descartes’ claim that a substance does not need other things to exist. It is interesting to note here a shift in focus from what substances are to how they are conceived or considered. This reflects the general shift in focus from metaphysics to epistemology that characterised philosophy after Descartes.
b. Spinoza
Influential philosophers writing after Descartes tend to use Descartes’ views as a starting point, criticising or accepting them as they deem reasonable. Hence, a number of responses to Descartes’ account of substance appear in the early modern period.
In the only book published under his name in his lifetime, the 1663 Principles of Cartesian Philosophy, Spinoza endorses both Descartes’ definition of substance in the Second Replies (which is essentially Aristotle’s definition in the Categories) and the independence definition introduced in the Principles of Philosophy and elsewhere. Spinoza also endorses Descartes’ distinction between created and uncreated substances, his rejection of substantial forms and real accidents, and his division of substances into extended substances and thinking substances.
In the Ethics, published posthumously in 1677, Spinoza develops his own approach to these issues. Spinoza opens the Ethics by stating that “by substance I understand what is in itself and is conceived through itself”. Shortly after this, in the first of his axioms, he adds that “Whatever is, is either in itself or in another”. Spinoza’s contrast between substance, understood as those things that are in themselves, and non-substances, understood as those things that are in another, reflects the distinction introduced by Plato in the Sophist and taken up by countless later thinkers from antiquity onwards. As in the Port-Royal Logic, Spinoza’s initial definition of substance in terms of how it is conceived reflects the preoccupation of early modern philosophy with epistemology.
Spinoza clarifies the claim that a substance is conceived through itself by saying that it means that “the conception of which does not require for its formation the conception of anything else”. This might mean that something is a substance if and only if it is possible to conceive of its existing by itself. If so, then Spinoza’s definition might be interpreted as an epistemological rewriting of Descartes’ independence definition.
Spinoza purports to show, on the basis of various definitions and axioms, that there can only be one substance, and that this substance is to be identified with God. What Descartes calls created substances are really modes of God. This conclusion is sometimes represented as a radical departure from Descartes. This is misleading, however. For Descartes also holds that only God qualifies as a substance in the strict sense of the word “substance”. To this extent, Spinoza is no more monistic than Descartes.
Spinoza’s Ethics does however depart from Descartes in (i) not making use of a category of created substances, and (ii) emphasizing that those things that Descartes would class as created substances are modes of God. Despite this, Spinoza’s theory is not obviously incompatible with the existence of created substances in Descartes’ sense of the term, even if he does not make use of the category himself. It is plausibly a consequence of Descartes’ position that created substances are, strictly speaking, modes of God, even if Descartes does not state this explicitly.
c. Leibniz
In his Critical Thoughts on Descartes’ Principles of Philosophy, Leibniz raises the following objection to Descartes’ definition of created substances as things that depend only on God for their existence:
I do not know whether the definition of substance as that which needs for its existence only the concurrence of God fits any created substance known to us. […] For not only do we need other substances; we need our own accidents even much more. (389)
Leibniz does not explicitly explain here why substances should need other substances, setting aside God, for their existence. Still, his claim that substances need their own accidents is an early example of an objection that has been influential in the twentieth- and twenty-first-century literature on substance. According to this objection, nothing could satisfy Descartes’ independence definition of substance because every candidate substance (an elephant or a soul, for example) depends for its existence on its own properties. This objection is further discussed in section 6.
In the Discourse on Metaphysics, Leibniz does provide a reason for thinking that created substances need other substances to exist. There, he begins by accepting something close to Aristotle’s definition of substance in the Categories: a substance is something of which other things are predicated, but which is not itself predicated of anything else. However, Leibniz claims that this characterisation is insufficient, and sets out a novel theory of substance, according to which the haecceity of a substance includes everything true of it (see section 4.b for the notion of haecceity). Accordingly, Leibniz holds that from a perfect grasp of the concept of a particular substance, one could derive all other truths.
It is not obvious how Leibniz arrives at this unusual conception of substance, but it is clear that if the haecceity of one substance includes everything that is true of it, this will include the relationships in which it stands to every other substance. Hence, on Leibniz’s view, every substance turns out to necessitate, and so to depend modally on, every other for its existence, a conclusion that contrasts starkly with Descartes’ position.
Leibniz’s view illustrates the fact that it is possible to accept Aristotle’s definition of substance in the Categories while rejecting Descartes’ independence definition. Leibniz clearly agrees with Aristotle that a substance does not have to be said of or to exist in something in the way that properties do. However, he holds that substances depend for their existence on other things in a way that contradicts Descartes’ independence definition.
Leibniz’s enormous corpus makes a number of other distinctive claims about substances. The most important of these are the characterisation of substances as unities and as things that act, both of which can be found in his New Essays on Human Understanding. These ideas have precursors as far back as Aristotle, but they receive special emphasis in Leibniz’s work.
d. British Empiricism
Section 1 mentions that, since the seventeenth century, a new usage of the term “substance” has been prevalent, on which it refers not to an object-like thing, such as an elephant, but to an underlying substratum that must be combined with properties to yield an object-like thing. On this usage, an elephant is a combination of properties such as its shape, size and colour, and the underlying substance in which these properties inhere. The substance in this sense is often described as having no properties in itself, and therefore resembles Aristotelian prime matter more than the objects that serve as examples of substances in earlier traditions.
This new usage of “substance” is standardly traced back to Locke’s Essay Concerning Human Understanding, where he states that:
Substance [is] nothing, but the supposed, but unknown support of those qualities, we find existing, which we imagine cannot subsist, sine re substante, without something to support them, we call that support substantia; which, according to the true import of the word, is in plain English, standing under, or upholding. (II.23.2)
This and similar statements in Locke’s Essay initiated a longstanding tradition in which British empiricists, including Berkeley, Hume, and Russell, took for granted that the term “substance” typically refers to a propertyless substratum and criticised the concept on that basis.
Scholars debate whether Locke actually intended to identify substances with propertyless substrata. There are two main interpretations. On the traditional interpretation, associated with Leibniz and defended by Jonathan Bennett (1987), Locke uses the word “substance” to refer to a propertyless substratum that we posit to explain what supports the collections of properties that we observe, although Locke is sceptical of the value of this idea, since it stands for something whose nature we are entirely ignorant of. (Those who believe that Locke intended to identify substances with propertyless substrata disagree regarding the further issue of whether Locke reluctantly accepts or ultimately rejects such entities.)
The alternative interpretation, defended by Michael Ayers (1977), agrees that Locke identifies substance with an unknown substratum that underlies the collections of properties we observe. However, on this view, Locke does not regard the substratum as having no properties in itself. Rather, he holds that these properties are unknown to us, belonging as they do to the imperceptible microstructure of their bearer. This microstructure is posited to explain why a given cluster of properties should regularly appear together. On this reading, Locke’s substrata play a similar role to Aristotle’s secondary substances or Jain vertical universals in that they are the essences that explain the perceptible properties of objects. The principal advantage of this interpretation is that it explains how Locke can endorse the idea of a substratum while recognising the (apparent) incoherence of the idea of something having no properties in itself. The principal disadvantages of this interpretation include the meagre textual evidence in its favour and its difficulty accounting for Locke’s disparaging comments about the idea of a substratum.
Forrai (2010) suggests that the two interpretations of Locke’s approach to substances can be reconciled if we suppose that Locke takes our actual idea of substance to be that of a propertyless substratum while holding that we only think of that substratum as propertyless because we are ignorant of its nature, which is in fact that of an invisible microstructure.
In the passages traditionally interpreted as discussing the idea of a propertyless substratum, Locke refers to it as the “idea of substance in general”. In other passages, Locke discusses our ideas of “particular sorts of substances”. Locke’s particular sorts of substances resemble the things referred to as substances in earlier traditions. His examples include humans, horses, gold, and water. These, Locke claims, are in fact just collections of simple ideas that regularly appear together:
We come to have the Ideas of particular sorts of Substances, by collecting such Combinations of simple Ideas, as are by Experience and Observation of Men’s Senses taken notice of to exist together, and are therefore supposed to flow from the particular internal Constitution, or unknown Essence of that Substance. (Essay, II.23.3)
The idea of an elephant, on this view, is really just a collection comprising the ideas of a certain colour, a certain shape, and so on. Locke seems to take the view that the distinctions we draw between different sorts of substances are somewhat arbitrary and conventional. That is, the word “elephant” may not refer to what the philosophers of the twentieth and twenty-first centuries would consider a natural kind. Hence, substances in the traditional sense turn out to be subject-dependent in the sense that the identification of some collection of ideas as a substance is not an objective, mind-independent fact, but depends on arbitrary choices and conventions.
Locke’s comments about substance, particularly those traditionally regarded as identifying substances with propertyless substrata, had a great influence on Berkeley and Hume, both of whom followed Locke in treating substances as substrata and in criticising the notion on this basis, while granting the existence of substances in the deflationary sense of subject-dependent collections of ideas.
Berkeley’s position is distinctive in that he affirms an asymmetry between perceptible substances, such as elephants and vases, and spiritual substances, such as human and divine minds. Berkeley agrees with Locke that our ideas of perceptible substances are really just collections of ideas and that we are tempted to posit a substratum in which these ideas exist. Unlike Locke, Berkeley explicitly says that in the case of perceptible objects at least, we should posit no such thing:
If substance be taken in the vulgar sense for a combination of qualities such as extension, solidity, weight, and the like […] this we cannot be accused of taking away; but if it be taken in the philosophic sense for the support of accidents or quantities without the mind, then I acknowledge that we take it away. (Principles, 1.37)
Berkeley’s rejection of substrata in the case of material objects is not necessarily due to his rejection of the idea of substrata in general, however. It may be that Berkeley rejects substrata for material substances only, and does so solely on the basis that, according to his idealist metaphysics, those properties that make up perceptible objects really inhere in the minds of the perceivers.
Whether or not Berkeley thinks that spiritual substances involve propertyless substrata is hard to judge and it is not clear that Berkeley maintains a consistent view on this issue. On the one hand, Berkeley’s published criticisms of the idea of a substratum tend to focus exclusively on material objects, suggesting that he is not opposed to the existence of a substratum in the case of minds. On the other hand, several passages in Berkeley’s notebooks assert that there is nothing more to minds than the perceptions they undergo, suggesting that Berkeley rejects the idea of substrata in the case of minds as well (see in particular his Notebooks, 577 and 580). The task of interpreting Berkeley on this point is complicated by the fact that the relevant passages are marked with a “+”, which some but not all scholars interpret as indicating Berkeley’s dissatisfaction with them.
Hume’s Treatise of Human Nature echoes Locke’s claim that we have no idea of what a substance is and that we have only a confused idea of what a substance does. Although Hume does not explicitly state that these criticisms are intended to apply to the idea of substances as propertyless substrata, commentators tend to agree that this is his intention (see for example Baxter 2015). Hume seems to agree with Locke (as traditionally interpreted) that we introduce the idea of a propertyless substratum in order to make sense of the unity that we habitually attribute to what are in fact mere collections of properties that regularly appear together. Hume holds that we can have no idea of this substratum because any such idea would have to come from some sensory or affective impression while, in fact, ideas derived from sensory and affective impressions are always of accidents—that is, of properties.
Hume grants that we do have a clear idea of substances understood as Descartes defines them, that is, as things that can exist by themselves. However, Hume asserts that this definition applies to anything that we can think of, and hence, that to call something a substance in this sense is not to distinguish it from anything else.
Hume further argues that we can make no sense of the idea of the inherence relation that is supposed to exist between properties and the substances to which they belong. For the inherence relation is taken to be the relation that holds between an accident and something without which it could not exist (as per Aristotle’s description of inherence in the Categories, for example). According to Hume, however, nothing stands in any such relation to anything else. For he makes it an axiom that anything that we can distinguish in thought can exist separately in reality. It follows that not only do we have no idea of a substratum, but no such thing can exist, either in the case of perceptible objects or in the case of minds. For a substratum is supposed to be that in which properties inhere. It is natural to see Hume’s arguments on this topic as the culmination of Locke’s more circumspect criticisms of substrata.
It follows from Hume’s arguments that the entities that earlier philosophers regarded as substances, such as elephants and vases, are in fact just collections of ideas, each member of which could exist by itself. Hume emphasises that, as a consequence, the mind really consists in successive collections of ideas. Hence, Hume adopts a bundle view of the mind and other putative substances not only in the moderate sense that he denies that minds involve a propertyless substratum, but in the extreme sense that he holds that they are really swarms of independent entities.
There exists a close resemblance between Hume’s rejection of the existence of complex substances and his emphasis on the nonexistence of a substantial mind in particular, and the criticisms of substance advanced by Buddhist philosophers and described in section 2. It is possible that Hume was influenced by Buddhist thought on this and other topics during his stay at the Jesuit College of La Flèche, France, in 1735–37, through the Jesuit missionary Charles François Dolu (Gopnik 2009).
Although not himself a British empiricist (though see Stephen Priest’s (2007, 262 fn. 40) protest on this point), Kant developed an approach to substance in the tradition of Locke, Berkeley and Hume, with a characteristically Kantian twist. Kant endorses a traditional account of substance, according to which substances are subjects of predication and are distinguished by their capacity to persist through change. However, Kant adds that the category of substance is something that the understanding imposes upon experience, rather than something derived from our knowledge of things in themselves. For Kant, the category of substance is, therefore, a necessary feature of experience, and to that extent, it has a kind of objectivity. Kant nonetheless agrees with Locke, Berkeley (respecting material substances) and Hume that substances are subject-dependent. (See Messina (2021) for a complication concerning whether we might nonetheless be warranted in applying this category to things in themselves.)
While earlier thinkers beginning with Aristotle asserted that substances can persist through change, Kant goes further, claiming that substances exist permanently and that their doing so is a necessary condition for the unity of time. It seems to follow that for Kant, composites such as elephants or vases cannot be substances, since they come into and go out of existence. Given that Kant also rejects the existence of indivisible atoms in his discussion of the second antinomy, the only remaining candidate for a material substance in Kant appears to be matter taken as a whole. For an influential exposition, see Strawson (1997).
6. Substance in Twentieth-Century and Early-Twenty-First-Century Philosophy
The concept of substance lost its central place in philosophy after the early modern period, partly as a result of the criticisms of the British empiricists. In the twentieth and early twenty-first centuries, however, interest in the idea has revived, with several philosophers arguing that we need the concept of substance to account for the difference between object-like and property-like things, or to account for which entities are fundamental, or to address a range of neighbouring metaphysical issues. Discussions have centred on two main themes: the criteria for being a substance, and the structure of substances. O’Conaill (2022) provides a detailed overview of both. Moreover, in the late twentieth century the concept of substance gained an important role in the philosophy of mind, where it has been used to mark the difference between two kinds of mind-body dualism: substance dualism and property dualism.
a. Criteria for Being a Substance
As noted at the beginning of this article, the term “substance” has two main uses in philosophy. Some philosophers use this word to pick out those things that are object-like in contrast to things that are property-like (or, for some philosophers, event-like or stuff-like). Others use it to pick out those things that are fundamental, in contrast to things that are non-fundamental. Both uses derive from Aristotle’s Categories, which posits that the object-like things are the fundamental things. For some thinkers, however, object-like-ness and fundamentality come apart. When philosophers attempt to give precise criteria for being a substance, they tend to have one of two targets in mind. Some have in mind the task of stating what exactly makes something object-like, while others have in mind the task of stating what exactly makes something fundamental. Koslicki (2018, 164–7) describes the two approaches in detail. Naturally, this makes a difference to which criteria for being a substance seem reasonable, and occasionally this has resulted in philosophers talking past one another. Nonetheless, the hypothesis that the object-like things are the fundamental things is either sufficiently attractive, or sufficiently embedded in philosophical discourse, that there exists considerable overlap between the two approaches.
The most prominent criterion for being a substance in the philosophy of the beginning of the twenty-first century is independence. Many philosophers defend, and even more take as a starting point, the idea that what makes something a substance is the fact that it does not depend on other things. Philosophers differ, however, on what kind of independence is relevant here, and some have argued that independence criteria are unsatisfactory and that some other criterion for being a substance is needed.
The most common independence criteria for being a substance characterise substances in terms of modal (or metaphysical) independence. One thing a is modally independent of another thing b if and only if a could exist in the absence of b. The idea that substances are modally independent is attractive for two reasons. First, it seems that properties, such as shape, size or colour, could not exist without something they belong to—something they are the shape, size or colour of. In other words, property-like things seem to be modally dependent entities. By contrast, object-like things, such as elephants or vases, do not seem to depend on other things in this way. An elephant need not be the elephant of some elephant-having being. Therefore, one could argue for the claim that object-like things differ from property-like things by saying that the former are not modally dependent on other entities, while the latter are. Secondly, modally independent entities are arguably more fundamental than modally dependent entities. For example, it is tempting to say that modally independent entities are the basic elements that make up reality, whereas modally dependent entities are derivative aspects or ways of being that are abstracted from the modally independent entities.
Though attractive, the idea that substances are modally independent faces some objections. The most influential objection says that nothing is modally independent because nothing can exist without its own parts and/or properties (see Weir (2021, 287–291) for several examples). For example, an elephant might not have to be the elephant of some further, elephant-having being, but an elephant must have a size and shape, and countless material parts. An elephant cannot exist without a size, a shape and material parts, and so there is a sense in which an elephant is not modally independent of these things.
Several responses have been suggested. First, one might respond by drawing a distinction between different kinds of modal dependence (see, for example, Lowe 1998, 141; Koslicki 2018, 142–44). For instance, we might say that a is rigidly dependent on b if and only if a cannot exist without b, whereas a is generically dependent on entities of kind F if and only if a cannot exist without some entity of kind F. This allows us to distinguish between something that is weakly modally independent, in that there is no entity upon which it is rigidly dependent, and something that is strongly modally independent, in that there is no kind of entity on which it is generically dependent. It might then be argued that substances need only be weakly modally independent. Hence, the fact that an elephant cannot exist without having properties and parts of certain kinds will not disqualify it as a substance, so long as there is no particular, individual part or property that it must have. It is acceptable, for example, that an elephant must have countless carbon atoms as parts, so long as it can do without any given carbon atom (which, presumably, it can).
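The distinctions drawn in this response can be put more compactly. The following is a sketch in standard modal notation; the predicate abbreviations (RigDep, GenDep, and so on) and the existence predicate E are introduced here purely for illustration and are not drawn from the authors cited:

```latex
% Notation introduced for illustration: E a abbreviates ``a exists''.
\begin{align*}
\mathrm{RigDep}(a,b) &\;\equiv\; \Box(\mathrm{E}\,a \rightarrow \mathrm{E}\,b)
  && \text{$a$ is rigidly dependent on $b$}\\
\mathrm{GenDep}(a,F) &\;\equiv\; \Box(\mathrm{E}\,a \rightarrow \exists x\, Fx)
  && \text{$a$ is generically dependent on $F$s}\\
\mathrm{WInd}(a) &\;\equiv\; \lnot\exists b\,\bigl(b \neq a \land \mathrm{RigDep}(a,b)\bigr)
  && \text{weak modal independence}\\
\mathrm{SInd}(a) &\;\equiv\; \lnot\exists F\,\mathrm{GenDep}(a,F)
  && \text{strong modal independence}
\end{align*}
```

On this rendering, the response amounts to the claim that substances need only satisfy weak modal independence: an elephant’s generic dependence on kinds such as carbon atom is no bar to its being a substance, so long as there is no particular entity (other than itself) on which it is rigidly dependent.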
The problem with this response is that many putative examples of substances seem to have necessary parts or properties upon which they rigidly depend. For example, it is plausible that King Dutagamuna’s renowned elephant, Kandula, could have existed without some of his properties, such as that of exhibiting heroism at the siege of Vijitanagara. It is not plausible, however, that Kandula could have existed without others of his properties, such as that of being the unique member of the singleton set {Kandula}. This does not seem like the kind of fact that should undermine Kandula’s claim to be a substance. Likewise, it is plausible that a given H2O molecule could not exist without the particular hydrogen atom it contains, and yet most philosophers would hesitate to conclude on this basis that an H2O molecule is not a substance.
A second kind of response to the dependence of substances on their properties and parts replaces modal independence with some other variety. One strategy of this kind appeals to the idea of a non-modal essence (see Fine 1994, 1995). Proponents of non-modal essences claim that things have essences that are narrower—that is, include less—than their necessary parts and properties. For example, it can be argued that although Kandula necessarily belongs to the set {Kandula}, this is not part of Kandula’s essence. After all, it is plausible that one could grasp what it is for Kandula to exist without ever thinking about the fact that Kandula belongs to the set {Kandula}. The fact that Kandula belongs to {Kandula} seems more like a side-effect of Kandula’s nature than a part of his nature. If we accept that things have non-modal essences, then it will be possible to propose that something is a substance if and only if it does not essentially depend on other entities—that is, if and only if no other entity is part of its non-modal essence.
The proposal that substances are essentially independent, in the sense specified, promises to get around the concern that Kandula fails to qualify as a substance because Kandula necessarily belongs to the set {Kandula}. However, other problems remain. For it is plausible that some entities of the sort that intuitively count as substances have some particular properties or parts essentially, and not merely necessarily. It is plausible, for example, that the particular hydrogen atom in a given H2O molecule is not only necessary to it but is also a part of its non-modal essence: a part of what it is for this H2O molecule to exist rather than some other H2O molecule is that it should contain this particular hydrogen atom. Yet, it is not obvious that this should disqualify the H2O molecule’s claim to be a substance.
Other responses that replace modal independence with some other variety include E. J. Lowe’s (1998; 2005) identity-independence and Benjamin Schnieder’s (2006) conceptual-independence criteria for being a substance. Like the essential-independence criterion, these get around at least some of the problems facing the simple modal independence criterion.
A more complex strategy is taken up by Joshua Hoffman & Gary Rosenkrantz (1997). Hoffman and Rosenkrantz introduce a hierarchy of categories with entity at level A, abstract and concrete at level B, and so on. After a lengthy discussion, they formulate the following definition:
x is a substance = df. x instantiates a level C category, C1, such that: (i) C1 could have a single instance throughout an interval of time, and (ii) C1’s instantiation does not entail the instantiation of another level C category which could have a single instance throughout an interval of time, and (iii) it is impossible for C1 to have an instance which has as a part an entity which instantiates another level C category, other than Concrete Proper Part, and other than Abstract Proper Part. (65)
For a full understanding of their approach, it is necessary to refer to Hoffman and Rosenkrantz’s text. However, the definition quoted is enough to illustrate how their strategy addresses the dependence of substances on their properties and parts. In short, Hoffman and Rosenkrantz retain a criterion of independence but qualify that criterion in two ways. First, on their definition, it is only necessary that some substances should satisfy the independence criterion. Substances that do not satisfy the criterion count as substances in virtue of being, in some other respect, the same kinds of entities as those that do. Secondly, even those substances that satisfy the independence criterion need only to be able to exist without a carefully specified class of entities, namely those belonging to a “level C category which could have a single instance throughout an interval of time”.
Hoffman and Rosenkrantz’s definition of substance is carefully tailored to avoid the objection that substances do depend on their properties and parts, as well as a number of other objections. A drawback is that they leave it unclear what it is that unifies the category of substances, given that they only require that some substances should satisfy their qualified independence criterion.
Perhaps the simplest response to the dependence of substances on their properties and parts maintains that while a substance must be independent of all other entities, “other entities” should be taken to refer to things that are not included in the substance. This approach is proposed by Michael Gorman (2006, 151) and defended at length by Weir (2021). According to this response, while it is true that an elephant cannot exist without a shape, a size and countless material parts, this does not mean that the elephant cannot exist by itself or without anything else in the sense required for it to be a substance. For the elephant’s shape, size, and material parts are included in it. By contrast, the reason why property-like things, such as the shape of the elephant, do not count as substances is that they are incapable of existing without something that is not included in them. The shape of the elephant, for example, can only exist by being the shape of something that includes more than just the shape. Weir (2021, 296) suggests that the fact that the elephant includes the shape and not vice versa can be seen from the fact that it is possible to start with the whole elephant and subtract elements such as its colour, weight and so on, until one is left with just the shape, whereas it is not possible to start with just the shape and, by subtracting elements, arrive at the whole elephant.
Several other objections to independence criteria deserve mention. First, if there exist necessary beings, such as numbers or God, then trivially, no candidate substance will be able to exist without them. Secondly, if particulars necessarily instantiate abstract universals (if, for example, an elephant necessarily instantiates universals, such as grey, concretum, or animal), then no candidate substance will be able to exist without abstract universals. Thirdly, if space and time are something over and above their occupants (as they are on substantivalist theories of space and time), then no spatial or temporal substance will be able to exist without these. Some of the strategies for dealing with the dependence of substances on their properties and parts can be transferred to these issues. Other strategies have also been proposed. There exists no consensus on whether one or more independence criteria can satisfactorily be defended against such objections.
Those who deny that an independence criterion is necessary for being a substance, or who hold that such a criterion needs to be supplemented, have proposed alternative criteria. Two popular options have been subjecthood and unity.
In the Categories, Aristotle introduces substances as those things that are subjects of predication and inherence and are neither predicated of nor inherent in anything else. Since he characterises predication and inherence as dependence relations, many readers have inferred that substances are to be distinguished by their independence. However, philosophers who are hesitant about relying on independence criteria often focus on the initial claim that substances are subjects of predication and inherence that are not predicated of, nor inherent in, other things; or, as it is often put, substances are property bearers that are not themselves properties (see, for example, Heil 2012, 12–17).
One difficulty for the subjecthood or property-bearer criterion for being a substance is that it is vulnerable to the objection that the distinctions we draw between properties and subjects of properties are arbitrary. For example, instead of saying that there is an elephant in the room, we might say that the room is elephant-ish. If we do so, it will no longer be true that elephants are subjects of predication that are not predicated of other things. A proponent of the independence criterion is in a position to assert that our ordinary linguistic practices reflect a deeper metaphysical fact: the reason why we do not say that the room is elephant-ish is that the elephant does not depend for its existence on the room in the way that properties depend on their bearers. Those who rely on the subjecthood criterion by itself cannot reply in this way.
Since Leibniz, many philosophers have proposed that substances are distinguished, either partly or solely, by their high degree of unity. In its extreme form, the criterion of unity says that substances must be simples in the sense that they have no detachable parts. Heil (2012, 21) argues that the simplicity criterion follows from the assumption that substances are property-bearers. For according to Heil, no composite can genuinely bear a property. Schaffer (2010) argues for the simplicity of substances on the basis of parsimony. He proposes that duplicating all the simple entities and their relations to one another would be sufficient to duplicate the entire cosmos, and that if this is so, then there is no good reason to posit further entities beyond the simple entities. Schaffer also argues that the fundamental entities that we posit should be “freely recombinable”, in the sense that the intrinsic properties of one such entity do not constrain the intrinsic properties of another, and that this will only be so if the fundamental entities are simples.
It is widely agreed that even if substances need not be simples, they must nonetheless satisfy some criterion of unity that prevents mere groups or aggregates from counting as substances. (Schneider (2006) and Weir (2021) are liberal about counting aggregates as substances, however.) For example, Kathrin Koslicki (2018) defends a neo-Aristotelian view that, rather than employing an independence criterion as many Aristotelians do, accords to hylomorphic compounds the status of being substances on the basis of their exhibiting a special kind of unity. On Koslicki’s account of the relevant kind of unity, a structured whole is unified to the extent that its parts interact in such a way as to allow it to manifest team-work-requiring capacities, such as the way in which the interaction of the eye with the brain and other parts of an organism gives it a capacity for visual perception.
b. The Structure of Substances
A second theme that has regained prominence in the twentieth century and the first two decades of the twenty-first century concerns the structure of substances. Increasing attention has been given to the question of whether substances should be regarded as comprising two components: properties and a substratum. At the same time, many philosophers have revived elements of Aristotle’s analysis of material substances into form and matter. (Hylomorphism can be thought of as one particularly important version of the analysis into properties and substratum, or as a distinct but somewhat similar position.)
As noted in section 5.d, Locke (perhaps inadvertently) popularised the idea that the word “substance” refers to a propertyless substratum and that we should be sceptical about the coherence or use of the idea of substances so understood. This idea persisted into the twentieth century in the works of thinkers such as Bertrand Russell (1945, 211) and J. L. Mackie (1976, 77) and is partly responsible for a widespread hostility to substances in this period. Justin Broackes (2006) reviews this development and attempts to rescue the traditional idea of substance from its association with a propertyless substratum.
At the same time, a number of thinkers have come to the defence of the idea that substances can be analysed into properties and substratum. As a result, by the dawn of the twenty-first century, it had become commonplace to speak of two main views about the structure of substances: the bundle view and the substratum view. (As explained in section 5, the bundle view here is simply the view that a substance consists of properties with no substratum. It need not entail the more extreme claim that the properties of a substance can exist separately.)
A prominent argument for the substratum view says that something resembling a propertyless substratum is needed to contribute particularity to a substance. On the standard version of this view, universal properties must be instantiated in a bare particular. An early defence of bare particulars is advanced by Gustav Bergmann (1947), and the view is then developed in works by, for example, David Armstrong (1978, 1997), Theodore Sider (2006) and Andrew Bailey (2012). In this context, Armstrong draws a contrast between what he terms a thick particular which is “a thing taken along with all its properties” and a thin particular which is “a thing taken in abstraction from all its properties”. These correspond to the traditional idea of a substance and to that of a substratum of the bare-particular variety, respectively.
Although the idea of a bare particular can be seen as a version of Locke’s idea of a propertyless substratum, bare particulars are not typically introduced to play the role that Locke assigns to substrata—that of supporting properties. Rather, for Bergmann and others, the principal role of the bare particular is to account for the particularity of a substance whose other components, its properties, are all universals (things that can exist in multiple places at once). In this respect, the bare particular resembles the Vaisheshika vishesha and the Scotist haecceity.
A different line of argument for positing a substratum, advanced by C. B. Martin (1980), says that without a substratum to bind them together, we should expect the properties of an object to be capable of existing separately, like its parts, something that most philosophers believe that properties cannot do. Unlike the argument that a substratum is needed to particularise a substance, this line of argument may have some attraction for those who hold that properties are particulars rather than universals. One objection to Martin’s argument says that the properties in a bundle might depend on one another without depending on some further substratum (Denkel 1992).
While much of the discussion concerning the structure of substances has focused on the choice between the bundle view and the substratum view, some philosophers have also shown a revival of interest in Aristotle’s analysis of material substance into form and matter, including the prominent role he gives to substantial forms in determining the kinds to which substances belong.
The latter idea is given new life in Peter Geach’s (1962) and David Wiggins’ (2001) defences of the sortal-dependence of identity. A sortal is a term or concept that classifies an entity as belonging to a certain kind and that hence provides, like Aristotle’s substantial forms, an answer to the question “what is x?”. The claim that identity is sortal-dependent amounts to the claim that if some entity a at an earlier time is identical to an entity b at a later time, then there must be some sortal F such that a and b are the same F—the same elephant, for example, or the same molecule. As a result, the conditions under which a and b count as identical will depend on what sortal F is: the criteria for being the same elephant have to do with the kind of things elephants are; the criteria for being the same molecule have to do with the kind of things molecules are. Geach goes further than Wiggins in arguing that identity is not just sortal-dependent but also sortal-relative, so that a might be the same F as b but not the same G as b. Wiggins argues that the sortal-relativity of identity must be rejected, given Leibniz’s law of the indiscernibility of identicals.
The claim that identity is sortal-dependent implies that there is a degree of objectivity to the kinds under which we sort entities. It contrasts with the Lockean claims that the kinds that we employ are arbitrary and, as Leszek Kołakowski expresses it, that:
Nothing prevents us from dissecting surrounding material into fragments constructed in a manner completely different from what we are used to. Thus, speaking more simply, we could build a world where there would be no such objects as “horse”, “leaf”, “star”, and others allegedly devised by nature. Instead, there might be, for example, such objects as “half a horse and a piece of river”, “my ear and the moon”, and other similar products of a surrealist imagination. (1968, 47–8)
Insofar as Geach and Wiggins’ sortals play the role of Aristotle’s substantial forms, their claims about sortal-dependence can be seen as reviving elements of Aristotle’s hylomorphism in spirit if not in letter. Numerous works go further, in explicitly defending the analysis of material substances into matter and form. Examples include Johnston (2006), Jaworski (2011, 2012), Rea (2011), Koslicki (2018) and many others.
Early-twenty-first-century hylomorphists vary widely on the nature they attribute to forms, especially with respect to whether forms should be regarded as universals or particulars. Most, however, regard the form as the source of an object’s structure, unity, activity, and the kind to which it belongs. Motivations for reviving hylomorphic structure include its (putative) ability to differentiate between those composites that really exist and those that are mere aggregates, to account for change, and to make sense of the relationship between properties, their bearers, and resemblances between numerically distinct bearers (see, for example, Koslicki 2018, § 1.5). For these hylomorphists, as for Aristotle, the matter that the form organises need not be in itself propertyless, and thus, although hylomorphism can be viewed as one version of the substratum theory of substances, it can avoid the objection that the idea of an entity that is in itself propertyless is incoherent.
Critics of this sort of hylomorphism, such as Howard Robinson (2021), have questioned whether it can do this work while remaining consistent with the thesis that all events can be accounted for by physical forces (that is, the completeness of physics thesis). Robinson argues that if physics is complete, then forms cannot play any explanatory role.
c. Substance and the Mind-Body Problem
Philosophical work on the idea of substance typically arises as part of the project of describing reality in general. Yet, a more specific source of interest in substances has arisen in the context of philosophy of mind, where the distinction between substances and properties is used to distinguish between two kinds of dualism: substance dualism and property dualism.
The terms “substance dualism” and “property dualism” were hardly used before the 1970s (Michel et al. 2011). They appear to have gained prominence as a result of a desire among philosophers arguing for a revival of mind-body dualism to distinguish their position from the traditional forms of dualism endorsed by philosophers such as Plato and Descartes. Traditional dualists affirm that the mind is a nonphysical substance, something object-like that can exist separately from the body. By contrast, many twentieth-century and early-twenty-first-century proponents of dualism, beginning with Frank Jackson (1982), limit themselves to the claim that the mind involves nonphysical properties.
One advantage of positing nonphysical properties only is that this has allowed proponents of property dualism to represent their position as one that departs only slightly from popular physicalist theories and to distance themselves from the unfashionable idea that a person exists or might exist as a disembodied mind. At the same time, however, several philosophers have questioned whether it makes sense to posit nonphysical properties only, without nonphysical substances (for example, Searle 2002; Zimmerman 2010; Schneider 2012; Weir 2023). Several works, such as those collected in Loose et al. (2018), argue that substance dualism may have advantages over property dualism.
These discussions are complicated by the fact that at the beginning of the third decade of the twenty-first century, there still exists no consensus on how to define the notion of substance, or on what the distinction between substances and properties consists in. Hence, it is not always obvious what property-dualists take themselves to reject when they eschew nonphysical substances.
7. References and Further Reading
Aristotle. Categories and De Interpretatione. Edited and translated by J. L. Ackrill (1963). Oxford: Clarendon.
Contains Aristotle’s classic introduction of the concept of substance.
Aristotle. Aristotle’s Metaphysics. Edited and translated by W. D. Ross (1924). Oxford: Oxford University Press.
Develops and revises Aristotle’s account of the nature of substances.
Aristotle. Physics. Edited and translated by C. D. C. Reeve (2018). Indianapolis, IN: Hackett.
Explains change by analysing material substances into matter and form.
Armstrong, David. (1978). Universals and Scientific Realism. Cambridge: Cambridge University Press.
Contains a classic discussion of the bundle theory and the substratum theory.
Armstrong, David. (1997). A World of States of Affairs. Cambridge: Cambridge University Press.
Contains an influential discussion of thin (that is, bare) particulars.
Arnauld, Antoine and Pierre Nicole. (1662). Logic or the Art of Thinking. Edited and translated by J. V. Buroker (1996). Cambridge: Cambridge University Press.
A highly influential Cartesian substitute for Aristotle’s logical works, covering the concept of substance.
Aquinas, Thomas. De Ente et Essentia. Leonine Commission (Ed.), 1976. Rome: Vatican Polyglot Press.
Ayers, M. R. (1977). The Ideas of Power and Substance in Locke’s Philosophy (revised edition of a 1975 paper). In I. C. Tipton (Ed.), Locke on Human Understanding (pp. 77–104). Oxford: Oxford University Press.
Defends an influential interpretation of Locke on substances.
Bailey, A. M. (2012). No Bare Particulars. Philosophical Studies, 158, 31–41.
Rejects bare particulars.
Barney, S. A., W. J. Lewis, J. A. Beach and Oliver Berghoff. (2006). Introduction. In S. A. Barney, W. J. Lewis, J. A. Beach and O. Berghoff (Eds. & Trans.), The Etymologies of Isidore of Seville (pp. 1–2). Cambridge: Cambridge University Press.
Introduces Isidore of Seville’s Etymologies.
Baxter, Donald. (2015). Hume on Substance: A Critique of Locke. In P. Lodge & T. Stoneham (Eds.), Locke and Leibniz on Substance (pp. 45–62). New York, NY: Routledge.
An exposition of Hume on substance.
Bennett, Jonathan. (1987). Substratum. History of Philosophy Quarterly, 4(2), 197–215.
Defends the traditional interpretation of Locke on substance.
Bergmann, Gustav. (1947). Russell on Particulars. The Philosophical Review, 56(1), 59–72.
Defends bare particulars against Russell.
Berkeley, George. The Works of George Berkeley, Bishop of Cloyne. A. A. Luce and T. E. Jessop (Eds.), 1948–1957. London: Thomas Nelson and Sons.
Bhikku Bodhi (Trans.). (2000). The Connected Discourses of the Buddha: A New Translation of the Samyutta Nikaya. Somerville, MA: Wisdom Publications.
Contains the version of the chariot argument against substance attributed to the ancient Buddhist nun, Vajira.
Broackes, Justin. (2006). Substance. Proceedings of the Aristotelian Society, 106, 133–68.
Traces the historical confusion between substance and substratum and defends the former concept.
Descartes, René. The Philosophical Writings of Descartes (3 vols.). Edited and translated by J. Cottingham, R. Stoothoff, D. Murdoch, and A. Kenny (1984–1991). Cambridge: Cambridge University Press.
Contains Descartes’ influential claims about substance, including his independence definition.
Descartes, René. Conversation with Burman. Translated by J. Bennett (2017). https://earlymoderntexts.com/assets/pdfs/descartes1648.pdf
Contains Descartes’ identification of the substance with its attributes.
Denkel, Arda. (1992). Substance Without Substratum. Philosophy and Phenomenological Research, 52(3), 705–711.
Argues that we can retain the concept of substance while rejecting that of a substratum.
Druart, Thérèse-Anne. (1987). Substance in Arabic Philosophy: Al-Farabi’s Discussion. Proceedings of the American Catholic Philosophical Association, 61, 88–97.
An exposition of al-Farabi on substance.
Fine, Kit. (1994). Essence and Modality. Philosophical Perspectives, 8, 1–16.
Defends the idea of non-modal essences.
Fine, Kit. (1995). Ontological Dependence. Proceedings of the Aristotelian Society, 95, 269–90.
Defends the idea of essential dependence.
Forrai, Gabor. (2010). Locke on Substance in General. Locke Studies, 10, 27–59.
Attempts to synthesise Bennett’s traditional and Ayers’ novel interpretations of Locke on substance.
Geach, Peter. (1962). Reference and Generality. Ithaca: Cornell University Press.
Defends the sortal-dependence and sortal-relativity of identity.
Gopnik, Alison. (2009). Could David Hume Have Known about Buddhism? Charles François Dolu, the Royal College of La Flèche, and the Global Jesuit Intellectual Network. Hume Studies, 35(1-2), 5–28.
Argues that Hume’s criticism of the idea of a substantial self may have been influenced by Buddhist philosophy.
Gorman, Michael. (2006). Independence and Substance. International Philosophical Quarterly, 46, 147–159.
Defends a definition of substances as things that do not inhere in anything.
Halbfass, Wilhelm. (1992). On Being and What There Is: Classical Vaisesika and the History of Indian Ontology. New York: SUNY Press.
Contains a very useful introduction to the concept of substance in classical Indian philosophy.
Hoffman, Joshua and Gary Rosenkrantz. (1996). Substance: Its Nature and Existence. London: Routledge.
A sustained examination and defence of a novel characterisation of substance.
Hume, David. A Treatise of Human Nature. Edited by D. F. Norton and M. J. Norton (2007). Oxford: Clarendon Press.
Contains Hume’s influential objections to the idea of substance.
Isidore of Seville. Etymologies. Edited and translated by S. A. Barney, W. J. Lewis, J. A. Beach and O. Berghoff (2006). Cambridge: Cambridge University Press.
Played an important role in transmitting Aristotle’s characterisation of substance to medieval philosophers in the Latin West.
Kaipayil, Joseph. (2008). An Essay on Ontology. Kochi: Karunikan.
Contains a discussion of the idea of substance in both Western and Indian philosophy.
Kant, Immanuel. (1787). Critique of Pure Reason. Edited and translated by N. K. Smith (2nd ed., 2007). Basingstoke: Palgrave Macmillan.
Contains Kant’s approach to the idea of substance and his comments on Aristotle’s Categories.
Kołakowski, Leszek. (1968). Towards a Marxist Humanism. New York: Grove Press.
Claims, contra Geach and Wiggins, that the kinds we divide the world into are arbitrary.
Koslicki, Kathrin. (2018). Form, Matter and Substance. Oxford: Oxford University Press.
Defends a unity criterion that attributes substancehood to hylomorphic compounds.
Leibniz, G. W. Critical Thoughts on the General Part of the Principles of Descartes. In L. Loemker (Ed. & Trans.), Gottfried Leibniz: Philosophical Papers and Letters (2nd ed., 1989). Alphen aan den Rijn: Kluwer.
Contains a criticism of Descartes’ independence definition of substance.
Leibniz, G. W. Discourse on Metaphysics. Edited and translated by G. Rodriguez-Pereyra (2020). Oxford: Oxford University Press.
Presents Leibniz’s idiosyncratic conception of substance.
Locke, John. An Essay Concerning Human Understanding. Edited by P. H. Nidditch (1975). Oxford: Oxford University Press.
Contains Locke’s critical discussion of substance and substratum.
Loose, Jonathan, Angus Menuge, and J. P. Moreland (Eds.). (2018). The Blackwell Companion to Substance Dualism. Oxford: Blackwell.
Collects works bearing on substance dualism.
Lowe, E. J. (1998). The Possibility of Metaphysics: Substance, Identity and Time. Oxford: Clarendon Press.
Discusses substance and defends Lowe’s identity-independence criterion.
Lowe, E. J. (2005). The Four-Category Ontology: A Metaphysical Foundation for Natural Science. Oxford: Clarendon Press.
Further develops Lowe’s account of substance.
Martin, C. B. (1980). Substance Substantiated. Australasian Journal of Philosophy, 58(1), 3–10.
Argues that we should posit a substratum to explain why the properties of a substance cannot exist separately.
McEvilley, Thomas. (2002). The Shape of Ancient Thought: Comparative Studies in Greek and Indian Philosophies. London: Simon & Schuster.
Compares ancient Greek and classical Indian philosophy on many issues including the nature of substances.
Messina, James. (2021). The Content of Kant’s Pure Category of Substance and its Use on Phenomena and Noumena. Philosophers’ Imprint, 21(29), 1–22.
An exposition of Kant on substance.
Michel, Jean-Baptiste, et al. (2011). Quantitative Analysis of Culture Using Millions of Digitized Books. Science, 331(6014), 176–182.
Records development of Google’s Ngram which provides data on the appearance of the terms “substance dualism” and “property dualism”.
Moise, Ionut and G. U. Thite. (2022). Vaiśeṣikasūtra: A Translation. London: Routledge.
The founding text of the Vaisheshika school.
Neale, Matthew. (2014). Madhyamaka and Pyrrhonism: Doctrinal, Linguistic and Historical Parallels and Interactions Between Madhyamaka Buddhism and Hellenic Pyrrhonism. Ph.D. Thesis, University of Oxford.
Discusses the relationship between Madhyamaka and Pyrrhonism.
O’Conaill, Donnchadh. (2022). Substance. Cambridge: Cambridge University Press.
A detailed overview of philosophical work on substance.
Plato. Sophist. Edited and translated by N. White (1993). Indianapolis, IN: Hackett.
Contains Plato’s distinction between things that exist in themselves and those that exist in relation to something else.
Priest, Stephen. (2007). The British Empiricists (2nd ed.). London: Routledge.
An exposition of the ideas of the British Empiricists on topics including that of substance.
Robinson, Howard. (2021). Aristotelian Dualism, Good; Aristotelian Hylomorphism, Bad. In P. Gregoric and J. L. Fink (Eds.), Encounters with Aristotelian Philosophy of Mind (pp. 283-306). London: Routledge.
Criticises hylomorphism.
Russell, Bertrand. (1945). History of Western Philosophy. London: George Allen and Unwin.
Rejects the idea of substances understood as substrata.
Schneider, Benjamin. (2006). A Certain Kind of Trinity: Dependence, Substance, Explanation. Philosophical Studies, 129, 393–419.
Defends a conceptual-independence criterion for substancehood.
Schneider, Susan. (2012). Why Property Dualists Must Reject Substance Physicalism. Philosophical Studies, 157(1), 61–76.
Argues that property dualists cannot consistently remain substance physicalists.
Scotus, John Duns. Opera Omnia. Edited by C. Balic et al. (1950-2013). Rome: Vatican Polyglot Press.
Contains Scotus’s influential discussions of substance.
Searle, John. (2002). Why I am Not a Property Dualist. Journal of Consciousness Studies, 9(12), 57–64.
Argues that mind-body dualists must be substance dualists.
Sider, Ted. (2006). Bare Particulars. Philosophical Perspectives, 20, 387–97.
Defends substrata understood as bare particulars.
Solomon ibn Gabirol. The Fount of Life (Fons Vitae). Translated by J. Laumakis (2014). Milwaukee, WI: Marquette University Press.
Presents Avicebron’s (Solomon ibn Gabirol’s) universal hylomorphism.
Spade, P. V. (2008). Binarium Famosissimum. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Fall 2008 Edition). <https://plato.stanford.edu/archives/fall2008/entries/binarium/>
Discusses the medieval case for universal hylomorphism.
Spinoza, Baruch. Principles of Cartesian Philosophy. Edited and translated by H. E. Wedeck (2014). New York: Open Road Integrated Media.
Contains Spinoza’s presentation of Descartes’ account of substance.
Spinoza, Baruch. Ethics: Proved in Geometrical Order. Edited by M. J. Kisner and translated by M. Silverthorne and M. J. Kisner (2018). Cambridge: Cambridge University Press.
Contains Spinoza’s account of substance and argument for substance monism.
Strawson, P. F. (1997). Kant on Substance. In P. F. Strawson, Entity and Identity and Other Essays (pp. 268–79). Oxford: Oxford University Press.
An exposition of Kant on substance.
Weir, R. S. (2021). Bring Back Substances! The Review of Metaphysics, 75(2), 265–308.
Defends the idea of substances as things that can exist by themselves.
Weir, R. S. (2023). The Mind-Body Problem and Metaphysics: An Argument from Consciousness to Mental Substance. London: Routledge.
Argues that those who posit nonphysical properties to solve the mind-body problem must also posit nonphysical substances.
Westerhoff, Jan. (2009). Nagarjuna’s Madhyamaka: A Philosophical Introduction. Oxford: Oxford University Press.
An introduction to Nagarjuna’s philosophy.
Wiggins, David. (2001). Sameness and Substance Renewed. Cambridge: Cambridge University Press.
Defends the sortal-dependence of identity, but rejects the sortal-relativity of identity.
Zimmerman, Dean. (2010). From Property Dualism to Substance Dualism. Aristotelian Society Supplementary Volume, 84(1), 119–150.
Argues that mind-body dualists must be substance dualists.