Metaphysics of Quantum Gravity

The metaphysics of quantum gravity explores metaphysical issues related to research programs in theoretical physics clustered under the term quantum gravity. These research programs aim at the formulation of a theory that reconciles the theory of general relativity with quantum theory. The goal is not necessarily to come up with a unified single theory but, more pragmatically, to describe phenomena with a dual nature, embodying both quantum and relativistic features—such as black holes and the early universe.

Approaches to quantum gravity are not yet fully worked-out theories. Nevertheless, they already provide a certain partial understanding of physical reality in different ways. Remarkably, they do so with a striking similarity: they virtually all deny the existence of some features usually regarded as essential to the existence of spacetime (or space and/or time) such as its four-dimensionality, the existence of distances and durations between events, or even the very partial ordering of events.

This observation is particularly noteworthy, considering the pervasive influence of spatial and temporal organisation on the human mind across various facets of daily life and theoretical thinking, ranging from most ancient religions to contemporary scientific worldviews. The metaphysics of quantum gravity takes as its starting point the puzzling observation that physics could teach us that space and time are not fundamental. It draws on resources from traditional metaphysics to tackle a set of issues related to the possible non-fundamentality of spacetime, and it investigates its potential implications for venerable traditional issues in metaphysics.

The metaphysics of quantum gravity is a relatively small and new research field, and thus as of now, its focus has been on explaining how spacetime could emerge from a more fundamental and non-spatiotemporal ontology. Consequently, this article is equally focused on questions regarding the status of spacetime and the emergence of spacetime. Section 1 situates the field within metaphysics of science more broadly. Sections 2 and 3 investigate, respectively, the status of spacetime in different approaches to quantum gravity and a number of potential issues with its lack of fundamentality. The article then covers the nature of the emerging spatiotemporal ontology (Section 4) and the building relation that relates it to the underlying non-spatiotemporal ontology (Section 5). Section 6 surveys various potential applications of spacetime emergence to a number of debates in metaphysics.

Table of Contents

  1. A New Domain
  2. Quantum Gravity
    a. String Theory
    b. Loop Quantum Gravity
    c. Causal Set Theory
    d. Is Spacetime Non-Fundamental?
  3. Problems with the Non-Fundamentality of Spacetime
    a. The Scientific Problem
    b. The Problem of Empirical Coherence
    c. The Ontological Problem
    d. The Conceptual Problem
  4. What is Spacetime?
    a. Theoretical Spacetime
    b. Phenomenal Space and Time
  5. Bridging the Gap
    a. Functional Realisation
    b. Grounding
    c. Mereological Composition
    d. Eliminativism
  6. Implications
    a. Philosophy of Time
    b. Modality, Laws of Nature, Causation
    c. Other Topics
  7. References and Further Reading

1. A New Domain

The metaphysics of quantum gravity belongs both to the more general philosophy of quantum gravity, which also encompasses epistemological and technical issues, and to metaphysics. This section situates the metaphysics of quantum gravity in this more general context.

Metaphysics, as traditionally conceived, aims to ascertain the most abstract structure of reality. Some questions metaphysicians are typically concerned with are: What is time? How do objects relate to the spatial regions they occupy? What kind of relation is the one that relates fundamental to non-fundamental entities? Using results from the development of a number of approaches to quantum gravity, the metaphysics of quantum gravity thus pursues the traditional tasks of metaphysics while shifting its perspective in two ways.

First, quantum gravity raises new metaphysical questions—in particular that of how to categorize the nature of spacetime if it turns out not to be fundamental according to fundamental physics. However, as novel as the problems we are confronted with might be, they might lead us to support some philosophical claims that have already been argued for on completely different grounds. Here is one example of such views: we might be led to conclude that spacetime simply does not exist (see Section 5d). Arguments to the effect that space and/or time are unreal have been put forward independently of considerations about quantum gravity (famously so by McTaggart in 1908). Second, quantum gravity might also suggest novel answers to a number of preexisting metaphysical questions (see Section 6). On the one hand, this concerns further metaphysical questions about the nature of space, time, and spacetime, beyond the question of their relation to fundamental reality. On the other hand, an overwhelming number of metaphysical concepts have received analyses that rely on the existence of space and time, or of spacetime. If the fundamentality of spacetime is challenged by quantum gravity, then these analyses are equally called into question—at least in so far as they are supposed to apply to the quantum gravity level.

By taking the preliminary results of quantum gravity research as a basis for philosophical investigation, the metaphysics of quantum gravity belongs to the metaphysics of science: an approach to metaphysics according to which metaphysical arguments, claims and theories should be informed by our best science. The metaphysics of quantum gravity shares many methodological and conceptual resources with other areas of physics-oriented metaphysics. For example, problems of emergence have also been discussed in the context of non-relativistic quantum mechanics regarding the status of space (but, importantly, not of spacetime): if one accepts configuration space realism (Albert 1996, Ney 2012 and Ney 2021b), according to which the fundamental physical space is a physical counterpart of the high-dimensional mathematical configuration space in which the wavefunction is defined, then the question arises whether and how three-dimensional space emerges from this underlying structure. In contrast to other areas of the metaphysics of science, however, metaphysicians of quantum gravity do not reflect on empirically established scientific theories but on approaches to quantum gravity that are currently under construction.

The need for a theory of quantum gravity arises from the fact that general relativity and quantum physics can hardly be both entirely correct. General relativity and quantum field theories are our best theories in their respective domains of description. As such, they provide excellent descriptions of the world. However, their predictive and theoretical capacities are effectively limited to these respective domains. General relativity, on the one hand, only produces good results in situations where we can neglect the quantum behaviour of matter. The Standard Model of particle physics, on the other hand, offers an excellent description of quantum matter to the extent to which gravitational phenomena involving high energy can be neglected. We thus lack a theory to fully describe phenomena with both quantum and relativistic features, such as black holes and the early universe. Overall, theoretical physics presents us with a situation involving two distinct frameworks with different physical ideologies or philosophies (here understood as sets of ideas suggestive of an ontology), and in which we have no satisfactory reason to privilege one or other of the ideologies to guide us towards the ontology of the physical world.

These two theoretical approaches cannot be easily unified beyond their respective domains. The most conservative attempt is called semi-classical gravity. It tries to conserve elements of each framework by combining them without drastic modifications. This approach is a conceptually hybrid creature akin to a computational tool, which appears to wear no clear and complete ontological commitment on its sleeves. So, it is natural to regard semi-classical gravity as a mere step on the path to quantum gravity.

The highly speculative character of quantum gravity research might raise doubts about the feasibility and relevance of pursuing a metaphysics of quantum gravity. What if none of the approaches to quantum gravity on the market turns out to be correct? Even worse, what if the successful theory of quantum gravity is so different from past approaches that it shares none of the features that those approaches display and that are deemed metaphysically important?

To react to this challenge, it is useful to distinguish two different strategies one might pursue in doing metaphysics of quantum gravity: one that deals with abstract issues across the board of different approaches to quantum gravity, and one that focuses on specific approaches to quantum gravity. A predominant view is that the work can be divided in this way for at least two reasons. First, pragmatically, working out general issues with spacetime emergence can be helpful to then solve more specific issues. Second, spacetime emergence may be related to general issues from the metaphysics literature with far-reaching implications. Note that the general strategy does not necessarily require investigating all approaches to quantum gravity. It can also focus on a limited set of approaches or aim at formulating results based on pre-theoretic constraints sufficiently disconnected from the theory, in the guise of what has been dubbed experimental metaphysics by Abner Shimony (Cohen et al., 1997). A notable exception to the division of labour into these two equally legitimate strategies is Jaksland and Salimkhani (2023), who argue that the only valid metaphysics of quantum gravity should focus on specific approaches. This article, by contrast, follows the standard distinction between general and specific issues.

Now, let us return to the challenge formulated above. The first formulation of the challenge, which doubted that any of the existing approaches to quantum gravity will turn out to be correct, need not affect the more general strategy: (some of) the general features investigated on that strategy, and their metaphysical consequences, might survive in the correct theory. The second version of the challenge affects both strategies alike. Of course, most metaphysicians of quantum gravity will reject the very sceptical attitude expressed in this challenge. However, even if current research in quantum gravity is as fundamentally misguided as the challenge suggests, this need not render research into the metaphysical consequences of existing quantum gravity approaches futile: maybe such metaphysical considerations can help to open up conceptual possibilities needed to develop the unheard-of correct theory of quantum gravity.

2. Quantum Gravity

As of December 2025, there is no consensus on what is the most promising approach to formulating a theory of quantum gravity. Beyond semi-classical gravity, different approaches have been advanced and are under constant development. String theory, loop quantum gravity and causal set theory, to name some candidates, are at different stages of elaboration. Some of them, like string theory, are mature research programs involving thousands of researchers. Others, like causal set theory, are still at an earlier phase of development and involve only dozens of researchers. No empirical test has been able to give the edge to one of these approaches over the others, although experimental procedures are currently being developed (Huggett, Linnemann, and Schneider, 2023).

Approaches to quantum gravity each come with their specific issues, including problems of spacetime emergence. The present section briefly introduces three approaches to quantum gravity, demonstrates how spacetime could fail to be fundamental in these approaches, and discusses the prospects for spacetime to remain fundamental in quantum gravity. The approaches discussed here are by no means exhaustive, and their selection simply reflects the knowledge of the authors of this article. Other popular approaches that are philosophically fruitful include but are not limited to: canonical quantum gravity, group field theory, shape dynamics, asymptotic safety, Penrose’s gravitationally-induced collapse approach, non-commutative geometry and causal dynamical triangulation.

a. String Theory

String theory is the most popular research programme in quantum gravity. (For easy-going presentations, see Greene 1999; Dawid 2013; Zimmerman Jones and Sfondrini 2022; for textbooks see Zwiebach 2009; Blumenhagen et al. 2013; Tomasiello 2022.) According to a rough understanding of the formalism, reality is constituted by one-dimensional strings, and other higher-dimensional entities called “branes”. Those entities have various properties, such as vibrations, size and topology. A number of states of this underlying ontology correspond to the particles of the Standard Model of particle physics. Some states of closed strings correspond to the graviton, the particle posited to mediate the gravitational interaction. There are not one but five string theories, and they are usually regarded as approximating an even more fundamental theory. (For an introduction to string theory aimed at philosophers, see Le Bihan 2023.)

String theory jeopardises the fundamentality of spacetime in at least three different ways.

First, for reasons of mathematical consistency, the background spacetime has not four but rather ten dimensions: nine spatial dimensions and one temporal dimension. The dimensionality of spacetime thus becomes problematic, and a story about the emergence of the four-dimensional spacetime from a ten-dimensional spacetime is required. To make things even worse, the five ten-dimensional theories are conjectured to approximate a more fundamental, non-approximative, eleven-dimensional theory called M-theory, involving ten spatial dimensions and one temporal dimension, or perhaps a twelve-dimensional, non-approximative, theory named F-theory, postulating ten spatial dimensions and two temporal dimensions.

Second, the five string theories can be described as quantum field theories on two-dimensional worldsheets that one can visualise, at least to some good approximation, as the extension of one-dimensional strings in an external temporal direction, just as we can view the trajectory of a particle in time as a spacetime line, a one-dimensional worldline, in the more familiar relativistic context (Le Bihan, 2020, Section 3). This worldsheet perspective presents us with a picture of quantum fields fluctuating on a two-dimensional manifold, and strings and branes do not exist qua objects. The manifold’s metric is conformally invariant, strongly suggesting that there is no matter of fact about distances and durations between elements of the manifold. If the worldsheet approach has ontological teeth, then we need to understand the emergence of the relativistic four-dimensional spacetime of general relativity from a two-dimensional surface lacking meaningful notions of distance and durations between its elements.

Third, the five string theories have a surprising feature. They have been shown to be empirically equivalent in a remarkable way, casting doubt on the very existence of the spacetime in relation to which they are defined. They are not merely empirically equivalent but also physically equivalent in a stronger sense. The empirical equivalence of two theories can be defined as the existence of a systematic correspondence between the values of all possible measurable quantities, such that empirical evidence cannot decide in favour of one of the theories over the other. Physical equivalence is a stricter condition insofar as there is also a systematic correspondence between the unobservable quantities of the two theories, thus generating inter-theoretical “giant symmetries” (De Haro and Butterfield, 2021, p. 2974). These correspondences are called ‘duality relations’, and they hold between duality-related models, theories and quantities. The philosophy of duality is usually approached in a very mathematical and non-metaphysical way (but see Le Bihan and Read (2018); Le Bihan (2023) for an introduction and discussion of the ontology of duality aimed at philosophers). Duality has been used to argue against the reality of the ten-dimensional spacetime, since duality-related models of the two theories will not share the spacetime metric (T-duality) and sometimes not even the same topology (mirror symmetry) (Huggett 2017; Matsubara and Johansson 2018). There is no general agreement on the exact ontology of string theory, but there are strong reasons to doubt that the structure we refer to as spacetime in relativistic physics remains present at the more fundamental level described by string theory.
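The way T-duality relates models with different spacetime metrics can be illustrated with a minimal sketch. It is a toy computation and not part of the article's argument. It assumes the standard textbook mass formula for a closed bosonic string with one dimension curled into a circle of radius R, units in which alpha' = 1, and one common sign convention for the level-matching condition. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ALPHA_PRIME = Fraction(1)  # assumption: units chosen so that alpha' = 1


def mass_squared(n, w, N, N_tilde, R):
    """M^2 of a closed bosonic string on a circle of radius R.

    n is the momentum number, w the winding number, and N, N_tilde are the
    left- and right-moving oscillator levels (textbook formula).
    """
    momentum_term = (Fraction(n) / R) ** 2
    winding_term = (Fraction(w) * R / ALPHA_PRIME) ** 2
    oscillator_term = 2 * (N + N_tilde - 2) / ALPHA_PRIME
    return momentum_term + winding_term + oscillator_term


def spectrum(R, cutoff=3, max_level=2):
    """Sorted M^2 values for |n|, |w| <= cutoff that obey level matching."""
    numbers = range(-cutoff, cutoff + 1)
    levels = range(max_level + 1)
    values = []
    for n, w, N, N_tilde in product(numbers, numbers, levels, levels):
        # Level matching; the sign convention here is an assumption.
        if N - N_tilde == n * w:
            values.append(mass_squared(n, w, N, N_tilde, R))
    return sorted(values)


radius = Fraction(3, 2)
dual_radius = ALPHA_PRIME / radius  # T-dual radius, here 2/3

# The two radii differ, yet the truncated mass spectra coincide exactly.
print(spectrum(radius) == spectrum(dual_radius))
```

Exchanging R with alpha'/R while swapping momentum and winding numbers leaves every mass value unchanged, so within this toy model no mass measurement distinguishes the large circle from the small one. Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding.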

Aside from questions over the fundamentality of spacetime, string theory leads to questions regarding the reality, fundamentality and ontological categorisation of strings and branes: are strings genuinely fundamental objects according to the general framework of string theory, or are the branes the only fundamental entities of the approach (Le Bihan, 2023)? That strings should be eliminated from the ontology of string theory in favour of branes can be motivated by the fact that M-theory appears to merely include branes and not one-dimensional strings. The opposite view that strings are more fundamental than branes has also been defended, but in the context of the five string theories (Vistarini, 2019).

b. Loop Quantum Gravity

Unlike string theory, which starts with a modification of the Standard Model of particle physics and tries to recover gravity, loop quantum gravity (LQG) is a general-relativity-first approach. Similarly, it is a geometry-first approach in that it focuses on the construction of spacetime and the gravitational field, without taking into account the quantum physics of matter. This section presents an extremely condensed and superficial version of their Chapter 6 focused on the emergence of space and time in LQG.

LQG refers to two distinct approaches: canonical loop quantum gravity and covariant loop quantum gravity (Rovelli, 2004; Rovelli and Vidotto, 2014).

Canonical LQG is built on a Hamiltonian reformulation of general relativity that is easier to quantise than the standard formulation. This formulation of general relativity goes against the original spirit of general relativity by forcing a foliation of the spacetime into an objective ordering of three-dimensional spaces and a universal time, thereby ruling out solutions of the theory that cannot be foliated this way (the non-globally-hyperbolic solutions). Then, to move from classical spacetime to a quantum structure, those classical three-dimensional spaces are transformed into quantum states via a technical procedure called canonical quantisation, and those are supposed to be in states of quantum superpositions, like quantum matter in textbook quantum mechanics. These three-dimensional quantum states are elements of a Hilbert space (the mathematical space that describes the possible states of the system at hand). They are spin network states, which can be described by combinatorial graphs of links and nodes, and by numbers associated with both links and nodes.

Prima facie, the naive ontology of LQG appears to be one of a quantum superposition of discrete elements (the links and nodes) and one might be tempted to argue that spacetime just is this quantum structure. However, four reasons at least can be provided for why this structure differs significantly from spacetime.

First, according to a number of interpretations of quantum mechanics, its ontology is metaphysically indeterminate (one popular exception being the many-worlds interpretation, see Glick and Le Bihan 2024). By being quantum, the fundamental LQG structure could thus be metaphysically indeterminate as well. Whether geometry needs to be well-determined to lay claim to the status of spacetime remains debated and might relate closely to the question of whether the world in general harbours metaphysical indeterminacy.

Second, the spatial status of the spin networks can be questioned because of disordered locality. Many models of LQG have adjacency relations between their elements that diverge from the adjacency relations existing between the corresponding elements in the general relativity description, taken to approximate the underlying LQG ontology. The well-defined ordering of events around us could thus turn out to be a statistical approximation, such that when zooming in on the deep fabric of spacetime, we would find anomalies such as adjacency relations that correspond to long spacetime intervals.

Third, there is a problem of frozen dynamics called the problem of time, because the time variable appears to be missing from the equations supposed to describe the evolution of spin networks. Thus, both time and change appear to be at best perspectival or relational, describing relations between specific sub-systems in the universe. But there no longer seems to be any strong sense of a physical system evolving with respect to the rest of the universe.

Fourth, not all spin network states of the underlying structure are expected to give rise to an effectively spatiotemporal geometry. Thus, at best, spacetime could be identical to a spacetime state or property of the underlying ontology, but not to the bearer of the state or property itself.

The second version of LQG is covariant LQG. It describes a four-dimensional extension of spin networks called “spinfoams.” Because of technical difficulties with the canonical approach, most efforts in the LQG community focus on developing the spinfoam framework these days. It exploits a path integral approach to dynamical evolution. A path integral formulation computes the evolution of a physical state from an initial state to a final state by weighting all the possible paths between the initial state and the final state. Moving to LQG, the paths are identified with spinfoam trajectories, roughly understood as spin network “evolutions.”
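The general idea of weighting all possible paths between an initial and a final state can be made concrete with a toy sketch. It is not a spinfoam calculation: it assumes a hypothetical system with finitely many states and a made-up table U of one-step amplitudes, and all function names are illustrative. The sketch sums the weight of every path explicitly and checks that this agrees with stepwise evolution.

```python
from itertools import product


def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def amplitude_by_path_sum(U, initial, final, steps):
    """Sum, over every path initial -> ... -> final, the product of step amplitudes."""
    n = len(U)
    total = 0
    for middle in product(range(n), repeat=steps - 1):
        path = (initial, *middle, final)
        weight = 1
        for a, b in zip(path, path[1:]):
            weight *= U[b][a]  # amplitude for the single step a -> b
        total += weight
    return total


def amplitude_by_evolution(U, initial, final, steps):
    """Same amplitude, obtained by applying U step by step (matrix power)."""
    n = len(U)
    M = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        M = matmul(U, M)
    return M[final][initial]


# Hypothetical two-state system with (unnormalised) one-step amplitudes.
U = [[1, 1j],
     [1j, 1]]

print(amplitude_by_path_sum(U, 0, 0, 3) == amplitude_by_evolution(U, 0, 0, 3))
```

With these amplitudes, the two paths that return to state 0 after two steps cancel each other, which illustrates how a path sum encodes interference between alternative histories.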

The covariant approach performs well for local descriptions of spacetime regions limited to the astrophysical scale. This contrasts with the canonical approach, which can be deployed to produce simplified toy models of the entire cosmos, resulting in loop quantum cosmology, not unlike the Λ-CDM model, the standard model of cosmology, which exploits highly simplified models of general relativity to deliver cosmological models of the whole universe (see Bojowald 2011 in physics, and Huggett and Wüthrich 2018, Section 3 for a philosophy of physics perspective). By contrast, covariant LQG requires feeding the equations with a lot of information about the boundary conditions, both at the beginning and at the end, but also on the spatial edges of the spacetime region being described, and thus operates better on the astrophysical than on the cosmological scale.

Among the four reasons to deny the fundamental status of spacetime in canonical LQG, three remain in covariant LQG. Metaphysical indeterminacy (with the possible exception of the many-worlds interpretation) and disordered locality both remain. However, there does not seem to be a problem of time, since the dynamics is understood as the solution of a set of constraints between the initial and the final state (and not as the evolution of a state according to equations that do not feature a time parameter, as in the Hamiltonian formulation of the canonical approach). As for the fourth reason, namely that certain states of the underlying ontology do not embed a spacetime profile, it continues to hold in the covariant approach: not all states give rise to a spacetime geometry, and even those that do are quantum superpositions of more fine-grained states, some of which fail to be spacetime-like.

c. Causal Set Theory

Causal set theory (CST) aims to rebuild the seemingly continuous spacetime world from a discrete structure of elements and partial ordering relations between these elements (Bombelli et al., 1987; Rideout and Sorkin, 1999; Dowker, 2006, 2013; Major et al., 2009; Rideout and Wallden, 2009). Unlike string theory, which seeks to push the Standard Model further, and loop quantum gravity, which strives to generalize general relativity to the quantum realm, CST sets out to reconstruct familiar physics from scratch from a new paradigm. The structures of partially ordered elements, causal sets, are expected to collectively give rise to spacetime and its material content, as described by, at least up to a good approximation, the general theory of relativity. The approach is premised on a theorem from general relativity (Malament, 1977), which states that the metric structure of a spacetime region can be derived from its causal structure, up to a conformal factor. This technical result has been taken to suggest that almost all the structure of a spacetime region could be built from scratch from the causal structure of said region. Causal set theory attempts this construction.

Causal sets are usually described as evolving through the operation of a dynamical law that adds elements one-by-one, connecting them to the pre-existing stage of the causal set (although this description might be misleading as will be explained below). The approach tries to develop a number of different dynamics in order to reproduce, up to some approximation, a class of models of general relativity consistent with the actual world.
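The element-by-element growth described above can be sketched with a toy model. The sketch assumes the simplest classical growth rule, usually called transitive percolation, in which each newborn element is related to each earlier element with a fixed probability p and the result is closed under transitivity. The function names, the probability value and the seed are illustrative assumptions, and nothing here is a realistic dynamics.

```python
import random


def grow_causal_set(num_elements, p=0.3, seed=0):
    """Grow a toy causal set by transitive percolation.

    Returns a list `past` where past[k] is the set of elements preceding k.
    Elements are labelled 0, 1, 2, ... in order of birth; the labels are
    bookkeeping only.
    """
    rng = random.Random(seed)
    past = []
    for new in range(num_elements):
        ancestors = set()
        for old in range(new):
            if rng.random() < p:
                # Relate `old` to `new`, then inherit everything before `old`.
                ancestors.add(old)
                ancestors |= past[old]
        past.append(ancestors)
    return past


def is_strict_partial_order(past):
    """Check irreflexivity and transitivity (together they imply antisymmetry)."""
    n = len(past)
    irreflexive = all(k not in past[k] for k in range(n))
    transitive = all(past[x] <= past[y] for y in range(n) for x in past[y])
    return irreflexive and transitive


def incomparable_pairs(past):
    """Pairs of elements with no ordering relation in either direction."""
    n = len(past)
    return [(x, y) for x in range(n) for y in range(x + 1, n)
            if x not in past[y] and y not in past[x]]


causet = grow_causal_set(20, p=0.2, seed=1)
print(is_strict_partial_order(causet), len(incomparable_pairs(causet)))
```

The incomparable pairs are the point of interest: for such pairs the structure contains no fact about which element precedes the other, which is what makes the order partial rather than total.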

CST offers a clear framework for discussing the possible emergence of continuous from discrete structures. Whether causal set theory should be interpreted as a case of spacetime emergence is still debated, as two main philosophical interpretations of the ontology of CST are in competition. According to the growing block approach, CST suggests an interestingly novel and radical form of the growing block theory of time, which is standardly defined as the view that the past and the present exist, but the future does not. The flow of time is then identified with the coming into existence of new slices of existence, at the border of the past-present block, and the future. With CST, there is no longer a sharp differentiation between three regions of reality, the past, the present and the future, as the coming into existence of being takes the form of single elements (single building blocks of a local time) instead of three-dimensional hypersurfaces (global times). The growing block interpretation of CST has been defended by physicists (Dowker, 2006, 2014) and assessed by philosophers (Earman 2008, Wüthrich and Callender 2016).

The growing block approach to CST differs from the traditional growing block theory in a number of ways. First, the growth is only local and there is often no definite matter of fact regarding which element from a pair of elements comes into existence before the other. The growing of causal sets can be thought of as happening in various directions, a visual simplification to emphasise that the ordering between the elements is merely partial. This local growth is better visualised as a growing octopus, or n-pus, than as an expanding block (Le Bihan, 2020). Another implication of the model is a commitment to metaphysical indeterminacy of a new kind (Wüthrich and Callender, 2016). This metaphysical indeterminacy is new by applying not only to the future but also to the past of the growing block structure. Indeed, due to a property of the dynamics called “discrete general covariance,” it is (seemingly metaphysically) indeterminate which past configuration of the set, among a number of distinct possible configurations, led to any particular configuration in the growth of the octopus. This has been argued to lead to contradictions (blinded), motivating an alternative interpretation of the ontology of CST.

According to a natural alternative ontological interpretation, the growth description is merely heuristic, and we should really think of the maximal set produced at the limit of the dynamics (intuitively, when the process of elements coming into existence is infinitely complete) as a perspicuous representation of reality (Huggett, 2014). This approach is thus in the spirit of the block universe interpretation of general relativity. The view still includes metaphysical indeterminacy but does not appear to generate problematic contradictions.

Other metaphysical issues discussed in relation to CST include: Is CST committed to a form of realism about causation (Wüthrich and Huggett 2020, blinded)? Could a fundamental ontology of causal relations ground or compose a derivative ontology of spacetime relations (Baron and Le Bihan, 2022a, 2024)?

d. Is Spacetime Non-Fundamental?

Is spacetime really non-fundamental according to most approaches to quantum gravity? Negative answers have been articulated or voiced by a number of scholars. They can be categorised in two classes: a priori and empirical.

The first category of objections to non-fundamental spacetime is a priori in that it is not grounded in the analysis of specific approaches to quantum gravity, but in a priori motivations that have nothing to do with the content of theoretical physics. Such objections can be made based on the claim that the non-fundamentality of spacetime stands in the way of our conception of concrete physical entities (Lam and Esfeld, 2013, p. 287), or that the concept of spacetime involves the property of fundamentality (Baker, 2021, Section 6). More generally, there is a long tradition in metaphysics to associate physicality to spatiotemporality, and the notion of a fundamental spacetime plays indeed a central role in many metaphysical views. Whether those views can be amended to account for fundamentally non-spatiotemporal reality constitutes another direction of research in the metaphysics of quantum gravity (which we turn to in Section 6), and one can see the conservative pressure coming from analytic metaphysics to preserve the rock-bottom fundamental status of spacetime.

A second type of reason to doubt that spacetime is not fundamental is empirical (see, for example, Esfeld 2021). Two lines of reasoning in this direction are possible. First, it could be that the correct theory of quantum gravity will be one that does not question the fundamentality of spacetime. This could potentially happen with a number of approaches to quantum gravity, as working out their ontology remains a vast project. For example, although the received view of canonical quantum gravity is that time disappears in a problematic way (Huggett and Thébault, 2023), the claim remains disputed (Chua and Callender, 2021). Turning to another example and as mentioned before, according to a certain version of causal set theory endorsed by Dowker (2014), spacetime fundamentally exists, although in a peculiar way. Finally, another possible view in that direction is Bohmian quantum gravity, which would extend the Bohmian interpretation of non-relativistic quantum mechanics not only to quantum field theories, but also to quantum gravity (Vassallo and Esfeld 2014). Second, it could be that one of the theories of quantum gravity generally considered to deny crucial features of spacetime actually turns out not to deny them and could be reinterpreted differently in the future.

3. Problems with the Non-Fundamentality of Spacetime

This section surveys a number of philosophical problems that emerge if spacetime does not exist fundamentally according to quantum gravity. Intuitively, that physical reality could fail to be fundamentally spatiotemporal appears troublesome: it clashes drastically with the way we usually conceive of the world as being fundamentally spatial and temporal, and with the scientific method, which seems to be based on collecting observations localised in space and time. Sections 3a to 3d review a number of problems resulting from different ways of making the nature of the clash precise and discuss what has or needs to be done to address them. The order in which the problems are presented reflects the extent to which their solution can be expected to follow from physics alone.

a. The Scientific Problem

The scientific problem is the problem of providing a theoretical derivation of spacetime physics from a non-spatiotemporal physics, namely a derivation of our best physics from a theory of quantum gravity. This should be done for two frameworks: general relativity and quantum physics. This problem is scientific in that it is an actual problem that quantum gravity physicists are facing. Indeed, absent new and independent empirical evidence, the most reliable guiding principle in the formulation of a theory of quantum gravity is the ability of the latter to derive our currently best, empirically confirmed theoretical frameworks in physics, namely general relativity and the Standard Model of particle physics based on a family of quantum field theories. Applied to spacetime, the problem amounts to the possibility of deriving, at least as a mathematical approximation and with bridge principles between the primitive notions of the two theories, the piece of apparatus that plays the spacetime role in general relativity and quantum field theories from a non-spatiotemporal theory of quantum gravity, along the lines of a heterogeneous Nagelian reduction (Nagel, 1979).

The problem has a different form depending on whether the focus is on general relativity or quantum physics. General relativity is a very successful theory in the low-energy regime of description that came with its share of conceptual revolutions—for example, with the possibility of intrinsically curved and expanding geometries. If spacetime is not fundamental, then how can this success be accounted for, and how should we rethink the conceptual revolutions mentioned above? A possible answer is that deriving general relativity as an approximation from the theory of quantum gravity would suffice to explain its predictive power—we should then look for conceptual revolutions in the new theory. The lessons learned from general relativity could then perish or survive the move to the new theory. The issue thus becomes the task of building one model (that is, solution) of general relativity consistent with the distribution of matter in the actual world from a solution to a theory of quantum gravity. This has been done to some degree, for example in string theory where the metric field—standardly regarded as representing spacetime in general relativity—can be described as a coherent state of an underlying ontology of strings and branes (Huggett and Vistarini, 2015).

When it comes to quantum physics, quantum gravity physicists do not focus on non-relativistic quantum mechanics, as is often the case in the metaphysics of quantum physics, but instead on quantum field theories, a class of which corresponds to the Standard Model of particle physics. The Standard Model describes fundamental particles and fields, such as electrons and quarks, groups them into families, and provides a catalogue of their possible interactions. The Standard Model incorporates the relativistic effects described by special relativity, yet it does not account for the gravitational aspects of general relativity. As of 2025, a comprehensive quantum field theory approach to gravity is missing. A low-energy approach to quantum gravity has been developed, but it cannot be extended to high-energy interactions (Wallace 2022). The Standard Model should be derived via approximation procedures from a more comprehensive theory of quantum gravity.

The scientific problem may take different guises, depending on the approach to quantum gravity under study. For instance, in the context of string theory, a number of quantum field theories have been derived from string theory solutions. However, none of those quantum field theories are the right ones, namely the ones involved in the Standard Model of particle physics. The number of solutions to string theory is incredibly large, making it, apparently, virtually impossible to get our hands on solutions corresponding to the Standard Model. This is the infamous landscape problem (Read and Le Bihan 2021).

Overall, the scientific problem is a problem for scientists, one that should be carefully distinguished from more philosophical issues introduced below.

b. The Problem of Empirical Coherence

The problem of empirical coherence for spacetime emergence arises when considering physical theories positing that spacetime does not exist fundamentally, while being simultaneously established on empirical evidence manifestly localised in space and time. A similar problem was first formulated in the context of quantum mechanics by Barrett (1996), who pointed out the tension between an ontology of so-called local beables and realism about the wave function. It was then discussed by Healey (2002) for canonical quantum gravity, and systematically studied for a wide range of quantum gravity approaches by Huggett and Wüthrich (2013).

The formulation of the problem of empirical coherence usually employs the now standard concept of local beable introduced by Bell (1987). The beables of a theory are the things the theory postulates as being physically real. They are deemed “be-able” because they manifest as degrees of freedom, that is, determinable properties that can take on various determinate values. Beables are local if they have a location in space and time. Local beables are regarded as crucial for the possibility of observation, and hence for the empirical justification of theories in physics. Thus, the problem goes, a theory that would deny the fundamental existence of spacetime would appear to be empirically incoherent: the truth of the theory would erode the reasons that initially motivated endorsement of the theory.

Local beables already appear to be lacking in wave-function ontologies, prompting many questions on how to interpret the fundamental ontology of non-relativistic quantum mechanics (Albert, 1996; Ney, 2021b). According to one such interpretation, configuration space realism (also called wave function realism or wave function fundamentalism), the fundamental ontology of the theory is a distribution of quantitative properties, a physical counterpart of the mathematical wave function. Importantly, those properties are not localised within the ordinary three-dimensional space but in a modal space whose regions correspond to possible configurations of physical systems in the three-dimensional space, the so-called configuration space. This configuration space is usually treated as a mathematical tool designed to facilitate calculation. Its dimensionality is fixed by the number of apparent particles in the physical system being described: for N particles in three-dimensional space, it has 3N dimensions. Configuration space realism goes beyond regarding the mathematical space as a mere calculation convenience. It states that the mathematical configuration space reflects the existence of an actual, physical configuration space. Consequently, the fundamental arena of reality would be this configuration space, not the three-dimensional space.
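
As a purely illustrative sketch of the formal point at issue, the wave function of a system of N particles can be written as a function on configuration space rather than on ordinary space. The symbols used here ($\psi$, $q_i$, $N$, $t$) are notational choices made for illustration and are not taken from the works cited above.

```latex
% Wave function of N spinless particles: a complex-valued field on
% 3N-dimensional configuration space, evolving in an external time t.
\psi : \mathbb{R}^{3N} \times \mathbb{R} \to \mathbb{C},
\qquad
(q_1, \dots, q_N, t) \mapsto \psi(q_1, \dots, q_N, t),
\qquad q_i \in \mathbb{R}^{3}

% A single point of configuration space fixes the positions of all N
% particles in ordinary three-dimensional space at once.
```

On this way of writing things, a value of the wave function is assigned to a whole configuration, not to a location in three-dimensional space, which is why local beables do not appear at the fundamental level.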

An especially difficult issue is then to understand the relation between this fundamental configuration space wherein the wave function is defined, on the one hand, and the emergent, ordinary three-dimensional space, on the other hand. Configuration space realism faces the charge of being incoherent, since the ordinary space and its local beables are not part of the fundamental ontology of the theory (Maudlin, 2007). However, it has been noticed that local beables are not logically necessary for the possibility of observation (Ney, 2015). Inter-subjective accessibility to evidence localised in configuration space might turn out to be just as effective as the more intuitive accessibility to three-dimensional objects localised in space and time.

An important difference between non-relativistic quantum mechanics and quantum gravity is the status of time. In non-relativistic quantum mechanics, time is regarded as a fixed external parameter. What is at stake for configuration space realism, then, is the possible emergence of the ordinary space from the more fundamental configuration space, both located inside a non-relativistic external time. Some proponents of configuration space realism in the context of quantum mechanics take time to be necessary for observation, as temporality is an explicit part in virtually all formal theories of empirical confirmation (Ney, 2015). The problem of empirical coherence thus appears to be much more difficult in the context of quantum gravity, when it is spacetime and not only space that comes under attack.

A somewhat natural thought is that the empirical coherence of a quantum gravity theory can be straightforwardly achieved by asserting that spacetime exists, yet not fundamentally, thus divorcing existence from fundamentality (Huggett and Wüthrich, 2013; Wüthrich, 2017). That is, we can establish the empirical coherence of theories of quantum gravity by formally deriving the general-relativistic spacetime from the more fundamental theory of quantum gravity. But such a formal derivation will not suffice to establish the reality of spacetime. Additionally, the mathematical derivation needs to be “physically salient”: it cannot be a mere mathematical curiosity. That spacetime exists in a non-fundamental way can thus be understood as the claim that it is physically salient, yet not fundamental. Spacetime would exist over and above the fundamental ontology of quantum gravity, for instance as a structure permeating the ontology of quantum gravity. Whether physical salience can be freed from fundamental spatiotemporality remains debated. Huggett and Wüthrich (2013) argue that it can; Maudlin (2007) argues that it cannot. The problem of empirical coherence thus intersects with the problem of whether spacetime really exists, and if so how exactly. Positing the existence of a non-fundamental spacetime could be key to solving the problem of empirical coherence. Other solutions might merely require positing the existence of non-local beables that do not require the existence of a non-fundamental spacetime. More work remains to be done to review issues of empirical coherence related to the non-fundamentality of time beyond the non-fundamentality of space.

c. The Ontological Problem

The ontological problem is the related problem of the status of spacetime. Is spacetime real? And if so, what does it mean that it is not fundamental? If not, how can we make sense of the world around us, which definitely seems to be spatial and temporal? Answers to these questions belong to one of the three following strands: eliminativism, reductionism and dualism. Eliminativism about spacetime claims that spacetime is not fundamental because it does not exist at all. Spacetime would be a sort of theoretical artefact, and even space and time might turn out to be akin to perceptual illusions. A second option, reductionism, is to maintain the existence of spacetime and to identify it with (parts of) the non-spatiotemporal structure. Finally, according to dualism, spacetime exists and is distinct from the fundamental structure. To spell out such a dualist account, one should specify how the non-spatiotemporal structure relates to the spatiotemporal one. See Section 5 for an overview of different candidate relations.

It is an intricate issue whether a specific solution to the problem of empirical coherence implies a certain answer to the question of the ontological status of spacetime. As shown below, this might be so for some, but not necessarily all proposed solutions to the problem of empirical coherence. Also, someone who denies that there is any problem of empirical coherence can agree that the question of the ontological status of spacetime should receive some answer.

d. The Conceptual Problem

The claim that space and time do not exist fundamentally might at first be met with scepticism. For how could it possibly be the case that the physical world is not spatial and not temporal? The claim raises concerns as it goes against the deep belief in the fundamentality of space and time. Does the non-fundamentality of spacetime present a conceptual problem in light of those beliefs? Unlike the scientific and empirical coherence issues, this conceptual problem calls into question the coherence and metaphysical plausibility of the view that spacetime could fail to be fundamental.

The problem can be elaborated more precisely in the following way. It relies on an experience of discrepancy between non-spatiotemporal and spatiotemporal concepts that cannot be fully addressed by gesturing at a formal reduction of a theory involving the first set of concepts to another theory involving the other concepts. Or, at the very least, more needs to be said on how to relate primitive spatiotemporal concepts to primitive non-spatiotemporal concepts, beyond a simple analysis in terms of bridge principles relating them, as in heterogeneous Nagelian reductions (Nagel, 1979). One wants to know if there is something so specific in spatiotemporal concepts that they could not possibly be explained away in terms of non-spatiotemporal concepts.

Whether the emergence of spacetime really poses a conceptual problem is controversial. It has been disputed to what extent an analogy with the hard conceptual problem of consciousness, which is supposed to illustrate what the conceptual problem of spacetime emergence consists in, can be carried through. The idea behind the analogy is that, just as there might be a hard problem for explaining the relations between physical and mental entities, one could ask whether there is something akin to spacetime qualia or spacetime qualities (those are not supposed to be mental in this context), analogous to qualia in the philosophy of mind (Le Bihan, 2021). Qualia in the philosophy of mind are potential “what is it like to be conscious” properties, especially difficult to reduce to purely physical entities. Likewise, one might wonder if there are “what it is to be spacetime” properties, especially difficult to reduce to purely non-spatiotemporal entities.

The existence of spacetime qualia has been denied by Knox (2014) and Lam and Wüthrich (2018). Le Bihan (2021) argues that the concept at least should be taken seriously, as the existence of a conceptual discrepancy associated with irreducible spacetime qualities might ground the intuition shared by a number of scholars that spacetime cannot possibly fail to be fundamental. The conceptual problem of spacetime would thus be a hard problem of spacetime, similar to the hard problem of consciousness. Overall, realising that there is no hard problem could alleviate the worry that spacetime emergence is logically or physically incoherent, by showing that resistance to the logical or physical possibility of spacetime emergence originates in deceptive cognitive, pre-theoretical intuitions.

Another formulation of the hard problem can be expressed as the concern that it would be impossible to understand a non-spatiotemporal theory. However, by dissociating understanding from conceivability, it could be granted that we cannot imagine a world without spacetime, and yet still be able to understand it in a more theoretical way. This would require a conception of understanding on which visual imagination is not a prerequisite. Rather, for instance, understanding might require the ability to use the theory in certain ways (De Haro and de Regt, 2020).

4. What is Spacetime?

Investigating the status of spacetime in quantum gravity requires agreeing beforehand on the defining features of spacetime. What is this phenomenon or theoretical entity that is supposed to emerge from the non-spatiotemporal ontology? Spacetime is a generic term that can be associated with a number of more precise concepts. These concepts can be classified into two broad families. First, conceptions of theoretical spacetime are built on notions found in theoretical physics, and especially in special and general relativity as these are our standard theories of spacetime. Second, conceptions of phenomenal spacetime build on the phenomenology of spatial and temporal phenomena, rooted in our perceptual experience of the world. For instance, space, time, motion, repetitions, local beables (localised objects) and, more generally, any notion essentially tied, at least to some degree, to our concepts of space and time, altogether constitute this broad class of spatial and temporal phenomena.

This section surveys various conceptions of theoretical spacetime and phenomenal spacetime, and how they constitute reasonable targets for the recovery of spacetime in the context of quantum gravity.

a. Theoretical Spacetime

The most obvious concept of spacetime to be recovered from a non-spatiotemporal ontology is the one appearing in theoretical physics. However, an immediate challenge for this project is that there might be more than one concept of spacetime in theoretical physics. First, theoretical physics is not a monolithic block. It is made of a number of distinct theoretical frameworks, and spacetime is not conceptualised in the same way in all of these approaches. Second, even in general relativity, arguably our most solid and advanced theory of spacetime, there is no universal consensus on the nature of spacetime. Let us review the two issues in turn.

Spacetime seems to enjoy a special affiliation with special relativity and, by extension, general relativity. The first scientific concept of spacetime was put forward by Hermann Minkowski in 1908, providing a beautiful and compelling formulation of special relativity. Both Minkowski’s flat spacetime and the curved pseudo-Riemannian spacetime of general relativity appear to be prime candidates for spacetime recovery. And as the Minkowskian spacetime of special relativity appears to be a local approximation of the spacetime of general relativity when curvature is negligible, or can be neglected for various purposes, the pseudo-Riemannian concept of general relativity would seem to be the most suitable target for a definition of spacetime.

However, it seems at least logically possible to temper this demand for a special relation with relativistic physics and envision spacetime as a more autonomous notion, which, although born from special relativity, could feature in other, potentially non-relativistic, theories. For consider Newtonian physics. It may be reformulated and generalised using a four-dimensional ideology, resulting in the Newton-Cartan theory (Cartan, 1923). Thus, whether the only viable concept of theoretical spacetime is the one found in general relativity is a legitimate concern (Baron and Le Bihan, 2022c). However, there is no doubt that the concept of spacetime found in general relativity is of paramount importance for analysing the emergence of spacetime. Hence, setting aside other possible targets for the theoretical concept of spacetime, we now focus on the spacetime concepts from special and general relativity.

The geometric approach is the standard interpretation of special and general relativity. In fact, for many it is not even an “interpretation” of the theory; it is an essential feature of the theory itself. Consider first special relativity. Its geometric interpretation states that special-relativistic effects, including time dilation and length contraction, manifest the geometric structure of the four-dimensional Minkowski spacetime. This geometric structure exists in itself and is metaphysically independent of the rest of the world. This interpretation, pioneered by Minkowski, became the standard reading of special relativity, eventually gaining acceptance from Einstein himself despite his initial reservations. This spacetime structure is a four-dimensional manifold equipped with a metric field, describing how things can and cannot move when acted upon by other material systems. Importantly, the structure delineates the respective perimeters of inertial and non-inertial motion. Both non-massive and massive bodies, when not acted upon by other bodies, follow straight lines in spacetime.
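
As a brief illustration of how the geometric reading ties relativistic effects to metric structure, the Minkowski line element and the resulting time dilation can be written as follows. The sign convention $(-,+,+,+)$, the inertial coordinates $(t, x, y, z)$, and the symbols $\tau$ (proper time) and $v$ (relative speed) are choices made here for illustration, not notation taken from the works cited above.

```latex
% Minkowski line element in inertial coordinates, signature (-,+,+,+) assumed
ds^2 = -c^2\,dt^2 + dx^2 + dy^2 + dz^2

% Proper time along a timelike worldline traversed at speed v
d\tau^2 = -\frac{ds^2}{c^2} = dt^2\left(1 - \frac{v^2}{c^2}\right)

% Time dilation for a clock moving uniformly at speed v
\Delta t = \frac{\Delta\tau}{\sqrt{1 - v^2/c^2}}
```

On the geometric reading, the dilation factor is read off from the interval itself, rather than from any assumption about the internal workings of clocks.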

However, Einstein was not completely satisfied with the geometric approach to special relativity, pointing out that this spacetime is acting upon matter but cannot be acted upon (Brown and Pooley, 2006). That goes against a deeply-wired principle of action/reaction typical of substances, that is, of real entities. Interestingly, the action/reaction principle comes back with general relativity. The geometric spacetime of general relativity also reacts to the presence of massive bodies. Indeed, massive bodies curve spacetime around them, notably explaining the presence of what we effectively perceive as a force of gravitation pulling things towards massive objects. Einstein’s initial reservations about the geometric approach thereby disappear when factoring in the dynamic backlash of matter on spacetime, in the context of general relativity. The geometrical approach thus remains the standard view; spacetime is a structure existing on its own, partially responsible (together with the dynamical laws) for the motion of material systems. Ignoring the vivid debates about the status of the relation between the metric field and the manifold on the one hand, and between the metric field and matter fields on the other, this metric field constitutes the target of the recovery of spacetime when one subscribes to the geometrical approach.
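
The two-way relation between matter and geometry can be sketched, in standard textbook form, by pairing the Einstein field equations with the geodesic equation. The cosmological constant is omitted for simplicity, and the index and symbol conventions below are assumptions made for illustration.

```latex
% Matter acts on spacetime: the stress-energy tensor T_{\mu\nu}
% sources the curvature encoded in the Einstein tensor G_{\mu\nu}.
G_{\mu\nu} \equiv R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu}
  = \frac{8\pi G}{c^4}\, T_{\mu\nu}

% Spacetime acts on matter: freely falling bodies follow geodesics
% of the metric g_{\mu\nu}, with \Gamma its Christoffel symbols.
\frac{d^2 x^\mu}{d\tau^2}
  + \Gamma^{\mu}_{\alpha\beta}\,
    \frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau} = 0
```

Read together, the first equation expresses the reaction of the metric field to matter, and the second the action of the metric field on matter, which is the reciprocity missing from the fixed Minkowski background.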

Consider now the competing dynamical approach, which was championed by Harvey Brown and developed in detail by the Oxford philosophy of physics group (Brown 2005, Brown and Pooley 2001, Brown and Pooley 2006, Read et al. 2018). It demotes the Minkowski spacetime from its fundamental status by analysing relativistic effects as properties of the dynamics of material bodies (more precisely of the symmetries of the dynamical laws). The dynamical approach relocates the origin of special-relativistic effects from the ontological category of spacetime to the ontological category of laws. It is thereby better suited to special relativity than to general relativity. Indeed, if relativistic effects are the manifestations of symmetries of the dynamical laws, and not of spacetime, then there is a bit of mystery as to why the symmetries of the metric field coincide with the symmetries of the matter fields. An immediate reply is that one could be realist about the metric field without identifying it with the spacetime geometry. The metric field should thus be rethought not as a representation of an independent spacetime, but rather of another material field. Accordingly, Brown expresses sympathies for Rovelli’s view that the metric field is another material field, the world being composed of fields on top of fields (Brown, 2005, pp. 159-160). Overall, the prospects for applying the dynamical approach to general relativity remain highly debated.

The question then arises as to whether the general relativistic concept of spacetime to be derived from the physics of quantum gravity should be that of the relationist-in-spirit dynamical approach or that of the substantivalist-in-spirit geometric approach. The dynamical approach (by already unreifying spacetime to a great extent) might be easier to identify with an emerging structure. So, if the dynamical and geometrical approaches turn out to be empirically equivalent, considering general relativity in the dynamical apparatus might be the right kind of re-conceptualisation to narrow the explanatory gap between the general theory of relativity and a non-spatiotemporal theory of quantum gravity. It has been argued, along these lines, that it might be easier to relate a dynamical reading of spacetime to a non-spatiotemporal theory of quantum gravity, since the very existence of spacetime (in technical parlance, the chronogeometricity of the metric field) turns out to be contingent by depending on the actual coupling of the metric field with the matter fields (Le Bihan and Linnemann, 2019).

Since the geometrical and dynamical approaches are regarded as interpretations of the formalism of general relativity, it is reasonable to expect the two approaches to be empirically equivalent. This justifies taking a step back and asking whether spacetime should not rather be understood in a more abstract way, by what it does. What is more, the dynamical approach being more difficult to square with general relativity than with special relativity, it has been argued that the dynamical approach should culminate in a functionalist rewriting or adjustment of Brown’s original project (Knox, 2019).

Spacetime functionalism is a wide-ranging family of views that attempt either to understand the concept of spacetime in functionalist terms in relativistic physics (Knox, 2011, 2014, 2019), or to analyse the relation of spacetime emergence in the context of quantum gravity (Lam and Wüthrich, 2018, 2021; Yates, 2021; Chalmers, 2021). We focus here on the functionalist concept of spacetime; the functionalist approach to the relation of emergence will be discussed in Section 5.1.

According to a broad definition of spacetime functionalism, spacetime is the theoretical concept that appears in general relativity (or possibly, as mentioned above, any other relevant spacetime theory in physics). For consider the Ramsey sentence for general relativity. This sentence is a definition of spacetime in relation to all its relevant predicates in the context of general relativity. Spacetime is the entity selected by the variable in the sentence, namely the entity that plays all the spacetime roles described by the Ramsey sentence. Hence the slogan that spacetime is as spacetime does. What this spacetime role or roles are, exactly, remains highly debated. According to a popular account by Knox (2019), spacetime is associated with inertial motion.
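
Schematically, and only as a sketch, the construction can be written as follows. Here $T$ stands for general relativity regimented as a single sentence, $s$ for its spacetime term, and $o_1, \dots, o_n$ for the remaining, antecedently understood terms; all of these symbols are introduced for illustration and are not drawn from the works cited above.

```latex
% The theory, stated with its spacetime term s made explicit
T(s;\, o_1, \dots, o_n)

% Its Ramsey sentence: replace s by a variable and quantify existentially
\exists x \; T(x;\, o_1, \dots, o_n)

% Functional definition: spacetime is whatever uniquely occupies the role
% (\iota is the definite-description operator, "the x such that ...")
s \;=\; \iota x \; T(x;\, o_1, \dots, o_n)
```

The disagreement between functionalists can then be located in which clauses of $T$ are taken to belong to the spacetime role, for instance those governing inertial motion.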

Two other views, similar to but distinct from spacetime functionalism, can be articulated. One is spacetime operationalism (see, for example, Le Bihan and Linnemann 2019; Menon 2021). It states that spacetime is the entity recorded by rods and clocks, concrete tools used for probing the structure of spacetime. It bears similarity with spacetime functionalism, by sharing the slogan that spacetime is as spacetime does (in this case, what it does on probes made of matter fields, namely the rods and clocks). However, there is an important difference between spacetime functionalism and spacetime operationalism. While the first position identifies spatiotemporal roles within the physical theory, the second associates them with experimental practice, as the structure that explains the nature of the data collected (the movement and direction of the rods, the durations measured by the clocks).

In light of the (too) many conceptions of spacetime, another option is spacetime quietism (Baron and Le Bihan, 2022c). Because of the plurality of views on the proper theoretical conception of spacetime, future agreement among the different participants in the discussion on the right analysis of spacetime indeed appears unlikely. Spacetime quietism is the view that it is not necessary to agree on the theoretical nature of spacetime to make progress with the problems of spacetime, motivating a shift towards more phenomenal concepts of space and time. Before moving to the phenomenal notions, it should be noted, however, that the scientific problem (Section 3a) makes it necessary to attribute a special status to the derivation of general relativity from a theory of quantum gravity (since it is one of the very ingredients in the development of any theory of quantum gravity). It will thus be necessary to derive at least one particular conception of theoretical spacetime consistent with general relativity to address the scientific problem. In the next section, we turn to phenomenal conceptions of space and time as an alternative potential target of metaphysical recovery.

b. Phenomenal Space and Time

The shift to the way things appear to us might justify abandoning the notion of theoretical spacetime in favour of the two distinct notions of space and time, or finding a way to combine the two. One way to ascribe an important function to both notions can be found for instance in Chalmers (2018) as he argues that spacetime can be functionally individuated by its role in triggering phenomenal space and time. But one could alternatively insist that only phenomenal space and time exist, unlike the more theoretical notion of spacetime. Indeed, it can be argued that in the way things appear to us, space and time are not primarily intertwined in a spatiotemporal unity. In fact, the notions of phenomenal space and time may themselves prove too coarse. More refined notions associated with phenomenal space could be notions of local and non-local beables, localized observations, spatial localization, etc. Similarly, more refined notions associated with phenomenal time could be notions of local change, series of experiments, repetition, duration, statistical data, etc. The retreat from theoretical spacetime could thus be more or less profound, depending on whether one wishes to preserve monolithic notions of phenomenal space and time beyond the diversity of spatial and temporal features of the manifest world.

This retreat might be more or less appealing depending on one’s allegiance to the primacy of the external world over phenomenological content, or the other way around. This debate revives to some degree the one that once took place in the Vienna Circle between Neurath (1931) on the one hand, and Schlick (1934) on the other.

According to Neurath’s physicalism, observational statements derive their truth from physical states in the world. They are therefore based on the existence of intersubjective invariants that transcend the private sphere of each individual’s experiences. These invariants take the form of objects located in space and time and instantiating properties. Observational statements are therefore fallible, but objective, by positing the existence of a mind-independent grid enabling the coordination of cognitive experiences and guaranteeing the intersubjective validity of observations made by different observers at different locations in spacetime. If this mind-independent spacetime can be characterized by empirical science—as we are entitled to assume, given the immense success of general relativity—then it is none other than the theoretical spacetime discussed in the previous section.

For Schlick’s psychologism, on the contrary, observational statements derive their truth from mental states. They have the form “here, now, this and that”, but these spatial and temporal notions are linked to the way things appear to us (and therefore cannot be questioned), and not to an external objective, mind-independent spatiotemporal arena of reality. It is the private experiences of individuals producing observational statements that provide the infallible, subjective justification for scientific knowledge. Infallible as it may be, this sort of justification at least leaves open the question of whether, beyond the phenomenal notions of space and time found in ordinary life and scientific practice, there exists a spatiotemporal structure.

Schlick was naturally criticised for opening a Pandora’s box, the subjective tenor of his approach seemingly leading to an unpleasant form of solipsism. This difficulty arises just as much in the case of the emergence of spacetime: if there is no spacetime but only phenomenal notions of space and time, how can we ever salvage the intersubjective validity of science, the fact that different observers can compare notes taken from different standpoints and collectively assemble an ontology of the world? One promising answer is to recognize the existence of a fundamental ontological grid which, although not spatiotemporal for various reasons yet to be made explicit, nevertheless makes it possible to coordinate the experiences of observers. One such approach is the causal theory of spacetime of Baron and Le Bihan (2024), on which spacetime emerges from a more fundamental causal network. The fundamental ontology of causal relations, even though not spatiotemporal, could thus act as the coordination grid allowing intersubjective agreement between observers.

We can thus see the resemblance, at least a partial one, between the debate of Neurath and Schlick on how best to conceive the epistemological foundations of the empirical sciences and the recent discussions on whether space and time belong to the external world in the guise of theoretical spacetime, or to the fundamental conceptual categories that sentient beings project onto the world in order to experience it and interact with it.

The distinction between theoretical spacetime and phenomenal space and time opens up a new line of thought: if a theoretical notion of spacetime may not be found in contemporary physics, then perhaps we should bid adieu to the concept. Perhaps the lesson to be gained from the emergence of spacetime is that spacetime does not exist, and that the only useful concepts for understanding the nature of reality are the notions of phenomenal space and time, associated with the way sentient beings experience the world. This echoes phenomenological approaches in a broad sense that we can trace back, for instance, to Immanuel Kant’s transcendental philosophy, which envisions space and time, along with many other fundamental categories of sentient experience, as a priori categories necessary to shape our sensory experience.

Moving from theoretical spacetime to phenomenal space and time will have a number of implications for the problems of spacetime emergence. Consider first the ontological problem. Space and time phenomenalism appears to lead to spacetime eliminativism, the view that spacetime does not exist (Ismael, 2021; Baron, 2023; Miller, 2024). If there is no theoretical spacetime, and the only spacetime there is is simply the conjunction of space and time, then an obvious terminological choice for this approach is to say that spacetime is not fundamental because it does not exist. What about space and time? The elimination of spacetime that follows from space and time phenomenalism opens up two theoretical options. Either space and time do not exist (space and time eliminativism), or they do exist (space and time realism). This might end up being a purely conventional choice depending on what one takes the defining features of the concepts of space and time to be (Le Bihan, 2015).

5. Bridging the Gap

Section 3 has presented a variety of problems that the gap, or discrepancy, between the fundamental and the spatiotemporal levels engenders. Solutions to these problems attempt to bridge this gap. As such, they take mostly the form of philosophical articulations of the emergence relation that is supposed to connect the non-spatiotemporal ontology to the spatiotemporal ontology. “Emergence” is here intended as an umbrella term, or placeholder, that can be filled in by the relations we consider in more detail: primitive emergence, functional realisation, grounding, and mereological composition. This is not to deny that “emergence” could also denote a specific, primitive relation from the philosophy toolbox, or that analyses of the notion of emergence — such as the distinction between a weak and a strong form of emergence (as in Wilson 2021b) — could prove fruitful in application to the spacetime case. These issues are mostly open for future research.

In the final part of this section, we examine a different way of trying to resolve the problems surrounding the non-fundamentality of spacetime, which consists in denying that spacetime exists at all. On this conception, there is thus no gap to be bridged.

a. Functional Realisation

We have already encountered functionalism in Section 4, as one option for specifying what needs to be recovered to recover spacetime. But spacetime functionalism can also serve as an analysis of the emergence relation.

Spacetime functionalism in quantum gravity is inspired by functionalist projects from other areas—notably, mental states in the philosophy of mind, and space in non-relativistic quantum mechanics. It differs from these projects in a number of ways.

First, in the philosophy of mind, functional realisation is often understood causally: the functions with which the emergent entity becomes identified are spelled out in terms of this entity’s causal interactions with other things. If spacetime is not fundamental, then the status of causation is equally questionable. In particular, it is questionable whether what is present at the fundamental level could stand in causal relations. It is thus important for spacetime functionalism to be successful that the notion of functional reduction is broad enough to ensure reduction need not be causal.

Another distinguishing feature of spacetime functionalism in quantum gravity concerns the epistemic status of the entities related by functional realisation. Standardly, the realised entities are the ones that are conceptually problematic. This is reversed in the case of spacetime emergence in quantum gravity: a successful functional realisation of spacetime is supposed to help us understand the possible emergence of spacetime from a puzzling non-spatiotemporal ontology (Huggett and Wüthrich, 2020).

Which problems from Section 3, then, does functionalism address? Spacetime functionalism was specifically designed as a solution to the problem of empirical coherence (Huggett and Wüthrich, 2013). However, proponents of a deflationary take on the problem of empirical coherence have denied that the resources of functionalism are needed to address the problem (Linnemann, 2020). As for the ontological problem of spacetime emergence, Lam and Wüthrich maintain, on the one hand, that functionalism amounts “to the denial that there is a ‘hard problem’ beyond the ‘easy problem’ of the emergence of spacetime” (Lam and Wüthrich, 2018, p. 44), and, on the other hand, that functionalism is orthogonal to the ontological question (Lam and Wüthrich, 2018, p. 40). Distinguishing between different sorts of functionalism, Le Bihan (2021) argues that there is a tension in this pair of claims. Indeed, functionalism comes in a number of versions with different answers to the ontological problem. For instance, if there is no spacetime, then there is no hard problem. This amounts to dissolving the hard problem by endorsing a particular solution to the ontological problem, based on a particular sort of functionalism, namely eliminativist functionalism. Thus, denying that there is a hard problem because there is no ontological problem might rely on a specific approach to functionalism which already presupposes a particular answer to the ontological problem.

In brief, the functionalist machinery might not be that independent from the hard and ontological problems. Introducing terminology familiar from the philosophy of mind, different sorts of functionalism can be distinguished along two parameters: the first distinguishes role from realiser functionalism; the second distinguishes ontic from linguistic functionalism. The various sorts of ontic functionalism (realiser functionalism, role functionalism, and eliminativist functionalism) more or less implicitly entail an answer to the ontological question. According to realiser functionalism, spacetime is identical to what fulfils the spacetime role on the quantum gravity level. Role functionalism entails a dualist view on which spacetime is derivative. And according to eliminativist functionalism, there is no spacetime at all, but only linguistic roles that we wrongly reify beyond the language. Linguistic functionalism, on the other hand, is a thesis only about the meanings of certain concepts: namely, that their meaning and reference should be functionally analysed. It thus remains ontologically neutral.

A functionalist solution to the problem of empirical coherence can thus be orthogonal to, that is, independent of, a solution to the ontological problem. Butterfield and Gomes (2020), however, argue that the right way to understand spacetime functionalism is as a species of reduction (and hence not as neutral with respect to the ontological question). They take this to be the lesson from Lewis (1972), who argues that if one accepts that two entities fulfil the same role, then one is committed, by logic and meaning alone (that is, without needing to posit any additional bridge laws), to their identity. Pace Butterfield and Gomes, Knox and Wallace (2023) present an argument against reductive functionalism in the spacetime context. The argument points out that functional identifications in physics typically rely heavily on approximation procedures. However, the argument merely targets versions of reductive functionalism relying on strict identity. As both Lewis (1972) and Butterfield and Gomes (2020) acknowledge, reductive functionalism needs to, and can, accommodate approximations.

b. Grounding

The notion of grounding was developed to capture metaphysical relations of non-causal dependence possibly involved in non-causal explanations. Typically cited examples of such dependence relations include: the relation between a set and its members; the relation between a conjunction and its conjuncts; or the relation between the fact that a flower is coloured and the fact that the same flower is red. In all these cases, the first relatum can be described as grounded in, and non-causally explained by, the second. Such non-causal explanations appeal to metaphysical principles, such as the principle that colours are determinables, which must have determinate instances, or the logical structure of conjunction.

How promising are grounding-based approaches to spacetime emergence? Wilson (2021a) provides a modal argument against a grounding-based approach to the emergence of spacetime. (The argument is formulated in terms of constitution rather than grounding, a difference we neglect for ease of exposition.) According to him, the modal status that is commonly ascribed to grounding claims (as necessary), and the modal status commonly ascribed to the existence of spacetime (as contingent), are incompatible with a grounding account of spacetime emergence. More precisely, working with the example of loop quantum gravity, the following four claims cannot be true together:

(1) Spacetime is grounded in a superposition of spinfoams.
(2) The grounding of spacetime is metaphysically non-contingent.
(3) Newtonian spacetime is metaphysically possible.
(4) Newtonian spacetime is not grounded in a superposition of spinfoams. (Wilson, 2021a, p. 189; adapted terminology)

Proponents of a grounding approach to spacetime emergence will have to reject (2), (3) or (4), none of which is a palatable option according to Wilson.
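The joint inconsistency of claims (1) to (4) can be made explicit with a small possible-worlds regimentation. What follows is only a sketch of one way to read the argument, not Wilson's own formalisation: the predicates S, G and N, the symbol @ for the actual world, the reading of (2) as a conditional over worlds, and the auxiliary premise (0) that Newtonian spacetime is a kind of spacetime are all assumptions introduced here for illustration.

```latex
% Hypothetical notation (not in the source):
%   S(w): spacetime exists at world w
%   G(w): spacetime at w is grounded in a superposition of spinfoams
%   N(w): spacetime at w is Newtonian
%   @   : the actual world
% Requires the amsmath package.
\begin{align*}
&(0)\quad \forall w\,\bigl(N(w) \rightarrow S(w)\bigr) && \text{auxiliary assumption}\\
&(1)\quad S(@) \wedge G(@) \\
&(2)\quad \exists w\,\bigl(S(w) \wedge G(w)\bigr) \rightarrow \forall w\,\bigl(S(w) \rightarrow G(w)\bigr) \\
&(3)\quad \exists w\, N(w) \\
&(4)\quad \forall w\,\bigl(N(w) \rightarrow \neg G(w)\bigr) \\[4pt]
&(5)\quad \forall w\,\bigl(S(w) \rightarrow G(w)\bigr) && \text{from (1), (2)}\\
&(6)\quad N(w^{*}) \text{ for some world } w^{*} && \text{from (3)}\\
&(7)\quad G(w^{*}) && \text{from (0), (5), (6)}\\
&(8)\quad \neg G(w^{*}) && \text{from (4), (6)}\\
&(9)\quad \bot && \text{from (7), (8)}
\end{align*}
```

On this reading, a proponent of the grounding approach who keeps (1) must give up one of (2), (3) or (4) to block the step to (9), which matches the options listed above.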

Let us now examine some consequences of the non-fundamentality of spacetime for the very understanding of the notion of grounding itself. Grounding is often characterised as being a metaphysical analogue of causation, or more rarely even as a kind of metaphysical causation (Schaffer 2016, Wilson 2018; for a dissenting view, see Bernstein 2016). Typically, a criterion for distinguishing the two notions makes reference to time: causation happens over time, whereas grounding is synchronic (if what stands in the grounding relation is temporal at all). While this simple temporal criterion arguably needs some refinement irrespective of quantum gravity (Baron et al., 2020), no version of the temporal criterion can apply at the quantum gravity level if time is not present there. One can draw one of the three following consequences from this. It could be that: (i) there is no causation or no grounding at the quantum gravity level; or (ii) causation and grounding are indistinguishable at the quantum gravity level; or (iii) a non-temporal criterion distinguishes causation from grounding at the quantum gravity level. Which of these consequences is drawn affects the outlook of a grounding-based account of spacetime emergence. Wilson (2021a) opts for a novel criterion: he distinguishes grounding from causation through the kind of law by which each is governed. On this account, causal relations are those that are governed by laws of nature, and grounding relations are those governed by constitutive principles, that is, by principles that tell us what it is to be a certain kind of thing.

c. Mereological Composition

Objects we encounter in daily life, such as chairs or tables, do not figure in theoretical physics. But we have a relatively straightforward explanation for how chairs and tables emerge from the entities posited by theoretical physics: they are mereologically composed from these entities, whatever these turn out to be (e.g., particles or quantum fields). Of course, there is still a bit of mystery, at least according to many, about how properties of a whole can emerge from parts that lack such properties, but those kinds of potential explanatory gaps are ubiquitous. The mereological approach to spacetime emergence suggests using the same compositional approach to explain the emergence of spacetime. Spacetime would emerge from more fundamental ingredients roughly as chairs and tables emerge from more fundamental entities (Le Bihan, 2018a,b). This means that spacetime would be composed of non-spatiotemporal parts. In what follows, we focus on approaches that try to give a mereological account of spacetime regions (rather than, e.g., distance relations).

The comparison between the composition of ordinary objects and the emergence of spacetime faces the following difficulty. Parthood is typically associated with a number of formal properties—for example, it is typically assumed to be a partial order and to obey certain decomposition principles. Although virtually all such properties have been confronted with putative counterexamples, there is a widespread agreement about certain core characteristics of the parthood relation. If the relation at work in the supposedly mereological composition of spacetime departs too much from these characteristics, then it becomes questionable whether this relation is really the same, or at least from the same family, as the one familiar from the composition of chairs and tables.

One such characteristic typically attributed to parthood that might be missing in mereological models of spacetime emergence concerns the linkage between parthood and location (Baron, 2020). Chairs and their parts are located in spacetime, and the relation between them seems to be mirrored by the relation between their respective locations: just as the chair back is part of the chair, the region of spacetime filled by the chair back is a subregion of the region filled by the chair. Such intuitions have been captured more rigorously by a number of so-called harmony principles, one of which is the following:

x is a part of y iff x’s location is a subregion of y’s location. (Saucedo, 2011, p. 227)

Whether principles such as the above can be maintained in mereological approaches to spacetime emergence depends on a number of choices that need to be made in spelling out such a mereological approach and how it interacts with a theory of location. For example, it needs to be specified how subregionhood relates to parthood (a popular option is to define subregionhood as parthood between regions); whether locations are themselves located somewhere (if so, then most plausibly they are located at themselves); and one needs to decide whether entities at the non-spatiotemporal level can still be attributed a location, albeit a non-spatiotemporal one (Le Bihan, 2018a). In the case of causal set theory, for example, one could make sense of non-spatiotemporal location in terms of location within the causal set structure.

To illustrate the point, let us have a look at a simple toy model in which the harmony principle stated above fails (see Figure 1). In this toy model, there are just two objects (o1 and o2) at the fundamental non-spatiotemporal level, each having a non-spatiotemporal location (l1 and l2, respectively); we thus assume that there is a meaningful notion of non-spatiotemporal location available. The two non-spatiotemporal objects o1 and o2 compose the only entity existing at the spatiotemporal level, region r. We let parthood be reflexive, so everything is a part of itself. We further stipulate that a location is a subregion of another just in case it is a part of it, and that every (spatiotemporal or non-spatiotemporal) location is located at itself. Then the harmony principle is violated, since o1 is part of r, but o1’s location (l1) is not a subregion of r’s location (which is just r itself). The location l1 is not a subregion of r because we did not assume l1 to be a part of r. Indeed, it seems unclear how non-spatiotemporal locations could be parts of spatiotemporal ones, as they are not located in a common spatial framework.

Figure 1: A mereological model of spacetime emergence violating harmony principles. Thick lines represent parthood (going upwards).
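The toy model can also be checked mechanically. The following Python sketch encodes the stipulations of the toy model (reflexive parthood, o1 and o2 as parts of r, subregionhood defined as parthood between locations, and every location located at itself) and searches for pairs that violate the left-to-right direction of the harmony principle, which is the direction discussed in the text. The names `entities`, `parthood`, `location`, `is_part`, `is_subregion` and `harmony_violations` are hypothetical labels chosen for illustration, and the encoding is one possible reading of the model rather than a canonical formalisation.

```python
from itertools import product

# Entities of the toy model: two non-spatiotemporal objects, their
# non-spatiotemporal locations, and the single spacetime region r.
entities = ["o1", "o2", "l1", "l2", "r"]

# Parthood is reflexive, and o1 and o2 compose r (assumed encoding).
proper_parts = {("o1", "r"), ("o2", "r")}
parthood = proper_parts | {(x, x) for x in entities}

# Each object has a location; every location is located at itself.
location = {"o1": "l1", "o2": "l2", "l1": "l1", "l2": "l2", "r": "r"}


def is_part(x, y):
    """True if x is a part of y in the toy model."""
    return (x, y) in parthood


def is_subregion(a, b):
    """Subregionhood is stipulated to be parthood between locations."""
    return is_part(a, b)


def harmony_violations():
    """Pairs (x, y) where x is part of y but x's location is not a
    subregion of y's location (left-to-right direction only)."""
    return [
        (x, y)
        for x, y in product(entities, repeat=2)
        if is_part(x, y) and not is_subregion(location[x], location[y])
    ]


if __name__ == "__main__":
    print(harmony_violations())
```

Under these stipulations the only violating pairs are those in which a non-spatiotemporal object is paired with the region it composes, because l1 and l2 were never assumed to be parts of r. Adding `("l1", "r")` and `("l2", "r")` to `proper_parts` would remove the violations, at the cost of treating non-spatiotemporal locations as parts of a spatiotemporal one.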

Harmony principles could either be used as guiding principles for the development of mereological approaches to spacetime emergence, or one could endorse a mereological approach violating these principles. The latter strategy could be justified by urging that novel insights from scientific enquiry might trump intuitions about harmony principles (Le Bihan 2018a and Baron and Le Bihan 2022a). So, advances in fundamental physics can call into question location principles and principles about mereological composition. Discussions over locality based on common sense intuitions are then dismissed as irrelevant for discussions over the composition or decomposition of spacetime.

A different challenge for a mereological approach to spacetime emergence might lie in the use working physicists actually make of decomposition. Physicists use decomposition techniques in a highly pragmatic way, which is arguably not suitable for disclosing a hierarchical structure of reality (Healey, 2013). For example, how physicists decompose light (into particles, electromagnetic waves, or a quantum mixture of states of electromagnetic fields) might depend on the intended application of the decomposition; such decompositions are not necessarily supposed to reveal the fundamental mereological structure of light. Furthermore, superposition and mixture, the composition relations invoked in quantum physics, seem to have formal properties other than those of parthood. On this view, then, the viability of the mereological approach becomes a question of usefulness; what matters is whether such a decomposition is useful for the working physicist, not whether it solves the philosophical problems from Section 3.

d. Eliminativism

Spacetime eliminativism rejects the assumption that spacetime really emerges from the non-spatiotemporal fundamental structure: spacetime, on this view, simply does not exist (Baron, 2023). The view has been defended by Miller (2024) under the name of spacetime projectivism: spatiotemporal properties would be projected onto a world which lacks such properties. The problem of empirical coherence seems especially thorny for this approach. If there is no spacetime at all, not even derivative, then how are we to make sense of evidence seemingly localised in spacetime?

To solve the problem of empirical coherence, it must be shown how a non-spatiotemporal theory could be observationally justified. A spacetime eliminativist thus needs to dissociate observation from spacetime (Baron and Le Bihan, 2022c). Spacetime eliminativism comes in different versions, depending on the sort of entities that are invoked to replace spacetime in order to account for the problem of empirical coherence. According to a first version, the local beables of experimental physics still exist, but not in a way which also requires spacetime to be real (Baron, 2023). According to the second, more radical version, even local beables turn out not to exist.

The moderate version faces two difficulties. Firstly, it is not immediately clear how to understand the notion of local beables without reference to spacetime, or space and time. Questions that need to be addressed are as follows: in which sense is a local beable local, if it is not in a spacetime sense? How can we run statistical analyses of runs of experiments involving local beables, if there is no time to organise the data?

Secondly, and more importantly, dissociating the notion of local beables from the notion of spacetime might only shift the problem from spacetime emergence to the emergence of local beables, which now has to be accounted for independently. The attractiveness of spacetime eliminativism seems to depend on whether this new problem turns out to be easier to solve, or less salient than the problem we started out with.

The second version of spacetime eliminativism is more radical in that it dispenses even with local beables. To solve the problem of empirical coherence, a defender of this version of eliminativism can argue that what needs to be recovered, strictly speaking, is not physical space, but the spatiotemporality of human perception (Ismael, 2021). And it seems at least possible that this phenomenology does not transparently describe the physical world as it is, which might in fact have a non-spatiotemporal physical structure. Moving to such a phenomenal or phenomenological approach, one can thus maintain a form of realism about physics without realism about spacetime (see Section 4b). More work is needed to assess if and how it could be possible to articulate a non-spatiotemporal account of the physical processes involved in (apparently spatiotemporal) human perception.

6. Implications

We have mentioned on several occasions that the spatiotemporality of reality plays a crucial role in many philosophical outlooks, and that denying spacetime a fundamental status will thus have important implications for a broad range of philosophical questions. We have already come across some potential candidates; this section presents further such consequences in a bit more detail.

a. Philosophy of Time

What is the fate of classical debates in the philosophy of time in light of quantum gravity? This will of course depend greatly on the approach to quantum gravity investigated (for a survey, see Huggett et al. 2013). Consider for instance the dispute between A- and B-theorists over whether time passes, or the one between presentists (only the present exists), growing-block theorists (only the entities we regard as past and present exist) and eternalists (entities categorised as past, present or future equally exist) over the domain of existence in time, or again the debate between relationalism and substantivalism as to whether spacetime should be conceived as a relational structure between material entities or as a substance with an existence of its own. Arguably, considerations from quantum gravity will have major repercussions on these views.

Let us focus on presentism and the objective foliation it requires, setting aside for now the possible emergence of spacetime, and ask the following question: could we find one unique objective foliation of spacetime in quantum gravity? The predominant answer appears to be negative, since quantum gravity should not resuscitate a non-relativistic world by imposing an objective, unique foliation onto the fundamental ontology (Callender 2000, Belot and Earman 2001, p. 241). However, it has been argued that, on the contrary, quantum gravity could provide a hospitable home to such a foliation, and hence to presentism (Monton, 2006). Although this is certainly a logical possibility, fixed-foliation approaches to quantum gravity encounter a number of issues. Among a number of technical objections, the most devastating one, raised by Wüthrich (2010, 2013), is that even if it turned out that there was a genuine single foliation of the fundamental structure, there would be no reason to expect that our presentist intuitions could hook onto it. The situation is very similar to the now-now objection against the growing block theory (Braddon-Mitchell, 2004): if the present is really the edge of the past-present block, how do you know that your present, from your own perspective, corresponds to the objective boundary of being, to the real objective present, and that you are not lost in the past of the block?

Now, if the fundamental structure is genuinely non-spatiotemporal, then the situation appears even grimmer for the presentist (and the growing block theorist). Since both views require the existence and fundamentality of time, spacetime emergence supports either standard eternalism or a new form of eternalism, atemporal eternalism, which states that all proper parts of the natural world co-exist simpliciter, even though the natural world is not temporal (Le Bihan, 2020).

It has also been argued that some cosmological models based on quantum gravity might suggest not that there is no time, but on the contrary that we need two times (Wüthrich, 2022), a claim also found in one particular approach to string theory, namely F-theory (Le Bihan, 2023; Cinti and Sanchioni, 2023). Whether the denial of the uniqueness of time, and thus of the existence of a single fundamental time, is regarded as a genuine expression of the non-fundamentality of time is, of course, a matter of convention. But it could have important repercussions on debates in the metaphysics of time concerning the plausibility of the hypertime hypothesis, especially since the view has been described as “just insane” (Skow, 2015, p. 47). This is the view that reality could encompass a second-order time allowing for the possibility of variations of the first-order time with respect to a second-order time, and thus of a veritable flow of time, the first-order present “moving” with respect to the second-order time (Smith, 2011). If hypertime were to gain justification from quantum gravity, it might thereby offer a route to a certain class of dynamical A-theories, contrary to what is generally considered to be the lessons of quantum gravity for our understanding of time.

Furthermore, it is interesting to note that a large part of the argument in metaphysics against this hypertime hypothesis builds on the belief that the two times must share a similar structure, an assumption that is questionable at best, as demonstrated by Baron and Lin (2022). Arguably, the approaches from quantum gravity and cosmology underwriting a two-times approach could provide a concrete blueprint for evaluating the discussion in more detail. Virtually all the work remains to be done to connect the philosophy of quantum gravity to the metaphysics of hypertime literature.

Another debate in the philosophy of time concerns the possibility of time travel and closed timelike curves. A certain category of time travel seems to be possible according to general relativity, as it allows for closed timelike curves, that is, closed spacetime trajectories that would permit a forward time traveller to return to their own past (Earman et al., 2009). One might wonder whether this result is expected to carry over to the prospective theory of quantum gravity. At this stage, there is no clear answer to this question, as shown by Wüthrich (2021). But one can already articulate possibilities and debate whether closed timelike curves could survive the absence of closed curves in the fundamental ontology. Interestingly, according to a certain metaphysical interpretation of a speculative cosmological model based on quantum gravity ideas and developed by Penrose, closed timelike curves might turn out to be the rule, and not the exception, within spacetime. His conformal cyclic cosmology could indeed be teaching us that the world is a gigantic cosmic loop, the whole universe being closed on itself in all timelike directions that do not terminate in black holes (Le Bihan, 2024).

b. Modality, Laws of Nature, Causation

Our next stop is modality, laws of nature, and causation. Accounts of these three notions can come in certain package deals, of which David Lewis’s is a particularly influential one (Lewis, 1986). Lewis gives reductive accounts of causation and laws of nature, and crucial to these reductions is his modal realism: the view that all ways the world could be exist concretely as possible worlds. To individuate the possible worlds within modal space, some kind of “world-making relation” is needed, and Lewis identifies spatiotemporal relations as these world-making relations. This will not do if, as quantum gravity suggests, spacetime is not fundamental. As Wüthrich (2019) argues, if a quantum gravity programme such as causal set theory turns out to be true of our world, it will be doubtful whether we can find any relation holding at the fundamental level that can fulfil the role of the world-making relation. Naturally, if it proved impossible to find any other non-spatiotemporal world-making relation, Lewis’s theory of modality and the conceptions of the laws of nature and causality that it underpins would fail in unison. One option could be to use entanglement relations instead of spacetime relations as building relations (Jaksland, 2021; Ney, 2021a; Cinti et al., 2022; Cinti and Sanchioni, 2021).

However, the problem is by no means unique to Lewis’s account of laws of nature. As Lam and Wüthrich (2023) demonstrate, most of the popular accounts of law have bad prospects of surviving the shift to non-spatiotemporal fundamental physics. The minimal primitivist account developed by Chen and Goldstein (2022) might be an exception, as it aims to give an explanation of how laws govern that does not necessitate a dynamical evolution from earlier states to later states.

In the case of causation, the situation is slightly different: we need not demand of an account of causation that it apply to non-spatiotemporal settings. This is because, contrary to the case of laws of nature which should arguably be present at the quantum gravity level, it seems a viable option that causation emerges together with spacetime, and many take causation not to be a part of physics anyway. Accounts of causation that presuppose spacetime are thus not necessarily ruled out, if spacetime is not fundamental, but will plausibly relegate causation to an equally non-fundamental status.

However, one could ask the further question as to whether causation could be a fundamental feature of reality even if spacetime is not. On the one hand, time seems essential to differentiate causes from effects, since (disregarding the possibility of backwards causation) causes precede their effects. On the other hand, not everyone agrees that spacetime is essential to, or more fundamental than, causation. First, interventionist accounts of causation seem in principle applicable to non-spatiotemporal settings (Baron et al., 2010; Baron and Miller, 2014). Second, taking causation as more fundamental than time, and reducing spatiotemporal relations to causal relations, has some philosophical precedent that could be brought to bear on quantum gravity settings: it was already defended by Leibniz and Kant, and in the 20th century, causal theories of spacetime were prominently advocated by Reichenbach (1956), Grünbaum (1973), and van Fraassen (1970) in the context of relativistic physics, before being abandoned in the late seventies and put back on the philosophical agenda recently in the context of quantum gravity (Baron and Le Bihan, 2024).

c. Other Topics

Further metaphysical positions incompatible with the non-fundamentality of spacetime might include Armstrong’s definition of naturalism, at least under a certain interpretation. According to his naturalism, the spacetime world is all that exists (Armstrong, 2004, p. 101). The claim that the spacetime world is all there is could collide with the view that spacetime is not fundamental. Indeed, a plausible position seems to be that there is more in the fundamental world than in the emergent world; and in the context of spacetime emergence, this excess of structure at the fundamental level nonetheless appears to be physical, and should thus be accommodated by any suitable definition of naturalism.

Another example might be Schaffer’s mix of priority monism and supersubstantivalism (Schaffer, 2009, 2010). Priority monism is the view that the cosmos is more fundamental than any of its proper parts. Supersubstantivalism is the identification of the cosmos with a spacetime structure directly instantiating natural properties, without the mediation of objects. Supersubstantivalism appears to be incompatible with the view that the whole cosmos should be identified with a non-spatiotemporal substance. There is thus a question as to whether priority monism can be developed absent supersubstantivalism in the context of quantum gravity. Such a view has been proposed by Le Bihan (2018b): spacetime entities and spacetime itself are regarded as identical to (non-spatiotemporal) proper parts of the whole cosmos.

Another discussion concerns the existence of extended simples, that is, entities that are both extended and deprived of proper parts. These could be constituent parts of material objects, spacetime or any other relevant aspect of the natural world. It has been argued that string theory and loop quantum gravity lead to a conception of discrete spacetime, and that this fact provides a justification for the existence of extended simples (Rettler, 2018, p. 851). However, there is not much evidence to support the claim that the ontology of string theory is discrete (Baker, 2016). And even in the more suggestive case of loop quantum gravity, such a stance presupposes that the discrete entities associated with the ultimate nature of spacetime are not only simple, but also extended. But if the fundamental structure is not spatiotemporal, a dilemma then arises: either the discrete entities are not extended, or they are not spatiotemporally extended, which then calls for a non-spatiotemporal notion of extension (Baron and Le Bihan, 2022b).

Finally, note that the philosophical consequences of the non-fundamentality of spacetime might go beyond what is sometimes perhaps narrowly conceived of as the domain of metaphysics, to cover the philosophy of mind and language. Braddon-Mitchell and Miller (2019), for example, argue that the non-fundamentality of (space)time stands in the way of naturalistic theories of representation. In a nutshell, if representation requires causation, and causation requires time, then timelessness could undermine these naturalistic theories of representation. One can thereby appreciate how considerations originating in the metaphysics of quantum gravity might have far-reaching implications, branching out into a number of distinct philosophical debates.

7. References and Further Reading

  • Albert, D. Z. Elementary quantum mechanics. In J. T. Cushing, A. Fine, and S. Goldstein, editors, Bohmian mechanics and quantum theory: An appraisal. Dordrecht: Kluwer, 1996.
  • Armstrong, David Malet. Truth and Truthmakers. Cambridge University Press, Cambridge, 2004.
  • Baker, David John. Does string theory posit extended simples? Philosopher’s Imprint, 16 (18):1–15, 2016.
  • Baker, David John. Knox’s inertial spacetime functionalism (and a better alternative). Synthese, 199:277–298, 2021.
  • Baron, Sam. The curious case of spacetime emergence. Philosophical Studies, 177:2207–2226, 2020.
  • Baron, Sam. Eliminating spacetime. Erkenntnis, 88:1289–1308, 2023.
  • Baron, Sam and Baptiste Le Bihan. Composing spacetime. Journal of Philosophy, 119(1):33–54, 2022a.
  • Baron, Sam and Baptiste Le Bihan. Quantum gravity and mereology: Not so simple. Philosophical Quarterly, 72(1):19–41, 2022b.
  • Baron, Sam and Baptiste Le Bihan. Spacetime quietism in quantum gravity. In Antonio Vassallo, editor, The Foundations of Spacetime Physics: Philosophical Perspectives. Routledge, New York, 2022c.
  • Baron, Sam and Baptiste Le Bihan. Causal theories of spacetime. Noûs, 58(1):202–224, 2024.
  • Baron, Sam and Yi-Cheng Lin. Time, and time again. The Philosophical Quarterly, 72(2):259–282, 2022.
  • Baron, Sam and Kristie Miller. Causation in a timeless world. Synthese, 191:2867–2886, 2014.
  • Baron, Sam, Peter Evans, and Kristie Miller. From timeless physical theory to timelessness. Humana Mente, 13:35–59, 2010.
  • Baron, Sam, Kristie Miller, and Jonathan Tallant. Grounding at a distance. Philosophical Studies, 177:3373–3390, 2020.
  • Barrett, Jeffrey A. Empirical adequacy and the availability of reliable records in quantum mechanics. Philosophy of Science, 63(1):49–64, 1996.
  • Bell, John S. Speakable and Unspeakable in Quantum Mechanics. Cambridge University Press, Cambridge, 1987.
  • Belot, Gordon and John Earman. Pre-Socratic quantum gravity. In Craig Callender and Nick Huggett, editors, Physics meets philosophy at the Planck scale, pages 213–255. Cambridge University Press, Cambridge, 2001.
  • Bernstein, Sara. Grounding is not causation. Philosophical Perspectives, 30:21–38, 2016.
  • Blumenhagen, Ralph, Dieter Lüst, and Stefan Theisen. Basic Concepts of String Theory, volume 17. Springer, 2013.
  • Bojowald, Martin. Quantum Cosmology: A Fundamental Description of the Universe, volume 835. Springer Science & Business Media, 2011.
  • Bombelli, Luca, Joohan Lee, David Meyer, and Rafael D Sorkin. Space-time as a causal set. Physical Review Letters, 59(5):521, 1987.
  • Braddon-Mitchell, David. How do we know it is now now? Analysis, 64(3):199–203, 2004.
  • Braddon-Mitchell, David and Kristie Miller. Quantum gravity, timelessness, and the contents of thought. Philosophical Studies, 176(7):1807–1829, 2019.
  • Brown, H. R. and O. Pooley. The origins of the spacetime metric: Bell’s Lorentzian pedagogy and its significance in general relativity. In C. Callender and N. Huggett, editors, Physics Meets Philosophy at the Planck Scale. Cambridge University Press, Cambridge, 2001.
  • Brown, H. R. and O. Pooley. Minkowski space-time: A glorious non-entity. In D. Dieks, editor, The Ontology of Spacetime. Elsevier, Amsterdam, 2006.
  • Brown, Harvey R. Physical Relativity: Space-time Structure from a Dynamical Perspective. Oxford University Press, Oxford, 2005.
  • Butterfield, Jeremy and Henrique Gomes. Functionalism as a species of reduction. arXiv preprint, arXiv:2008.13366, 2020.
  • Callender, Craig. Shedding light on time. Philosophy of Science, 67(S3):S587–S599, 2000.
  • Cartan, Élie. Sur les variétés à connexion affine et la théorie de la relativité généralisée (première partie). In Annales scientifiques de l’École normale supérieure, volume 40, pages 325–412, 1923.
  • Chalmers, David. Finding space in a non-spatial world. In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime, pages 154–181. Oxford University Press, Oxford, 2021.
  • Chalmers, David J. The meta-problem of consciousness. Journal of Consciousness Studies, 25(9-10):6–61, 2018.
  • Chen, Eddy Keming and Sheldon Goldstein. Governing without a fundamental direction of time: Minimal primitivism about laws of nature. In Jerusalem Studies in Philosophy and History of Science, pages 21–64. Springer International Publishing, 2022.
  • Chua, Eugene Y. S. and Craig Callender. No time for time from no-time. Philosophy of Science, 88(5):1172–1184, 2021.
  • Cinti, Enrico and Marco Sanchioni. Humeanism in light of quantum gravity. Synthese, 199(3):10839–10863, 2021.
  • Cinti, Enrico and Marco Sanchioni. Time, spacetime, and f-theory. manuscript, 2023.
  • Cinti, Enrico, Alberto Corti, and Marco Sanchioni. On entanglement as a relation. European Journal for Philosophy of Science, 12(1):10, 2022.
  • Cohen, Robert Sonné, Michael Horne, and John J Stachel. Experimental metaphysics: Quantum mechanical studies for Abner Shimony, volume one. Springer, 1997.
  • Dawid, Richard. String Theory and the Scientific Method. Cambridge University Press, Cambridge, 2013.
  • De Haro, Sebastian and Jeremy Butterfield. On symmetry and duality. Synthese, 198(4):2973–3013, 2021.
  • De Haro, Sebastian and Henk W. de Regt. A precipice below which lies absurdity? Theories without a spacetime and scientific understanding. Synthese, 197:3121–3149, 2020.
  • Dowker, Fay. Causal sets as discrete spacetime. Contemporary Physics, 47(1):1–9, 2006.
  • Dowker, Fay. Introduction to causal sets and their phenomenology. General Relativity and Gravitation, 45:1651–1667, 2013.
  • Dowker, Fay. The birth of spacetime atoms as the passage of time. Annals of the New York Academy of Sciences, 1326(1):18–25, 2014.
  • Earman, John. Reassessing the prospects for a growing block model of the universe. International Studies in the Philosophy of Science, 22(2):135–164, 2008.
  • Earman, John, Christopher Smeenk, and Christian Wüthrich. Do the laws of physics forbid the operation of time machines? Synthese, 169:91–124, 2009.
  • Esfeld, Michael. Against the disappearance of spacetime in quantum gravity. Synthese, 199(Suppl 2):355–369, 2021.
  • Glick, David and Baptiste Le Bihan. Metaphysical indeterminacy in Everettian quantum mechanics. European Journal for Philosophy of Science, 14(1):3, 2024.
  • Greene, Brian. The Elegant Universe. Vintage Books, New York, 1999.
  • Grünbaum, Adolf. The causal theory of time. In Robert S. Cohen and Marx W. Wartofsky, editors, Philosophical Problems of Space and Time, volume 64, pages 179–208. Dordrecht: Springer Netherlands, 1973.
  • Healey, Richard. Can physics coherently deny the reality of time? Royal Institute of Philosophy Supplements, 50:293–316, 2002.
  • Healey, Richard. Physical composition. Studies in History and Philosophy of Modern Physics, 44:148–62, 2013.
  • Huggett, Nick. Skeptical notes on a physics of passage. Annals of the New York Academy of Sciences, 1326:9–17, 2014.
  • Huggett, Nick. Target space ≠ space. Studies in History and Philosophy of Modern Physics, 59:81–88, 2017.
  • Huggett, Nick and Karim PY Thébault. Finding time for Wheeler-DeWitt cosmology. arXiv preprint arXiv:2310.11072, 2023.
  • Huggett, Nick and Tiziana Vistarini. Deriving general relativity from string theory. Philosophy of Science, 82(5):1163–1174, 2015.
  • Huggett, Nick and Christian Wüthrich. Emergent spacetime and empirical (in)coherence. Studies in History and Philosophy of Modern Physics, 44(3):276–285, 2013.
  • Huggett, Nick and Christian Wüthrich. The (a)temporal emergence of spacetime. Philosophy of Science, 85(5):1190–1203, 2018.
  • Huggett, Nick, Tiziana Vistarini, and Christian Wüthrich. Time in quantum gravity. A Companion to the Philosophy of Time, pages 242–261, 2013.
  • Huggett, Nick, Niels Linnemann, and Mike D. Schneider. Quantum Gravity in a Laboratory? Elements in the Foundations of Contemporary Physics. Cambridge University Press, 2023.
  • Ismael, Jenann. Do you see space? How to recover the visible and tangible reality of space (without space). In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime. Oxford University Press, Oxford, 2021.
  • Jaksland, Rasmus. Entanglement as the world-making relation: Distance from entanglement. Synthese, 198(10):9661–9693, 2021.
  • Jaksland, Rasmus and Kian Salimkhani. The many problems of spacetime emergence. The British Journal for the Philosophy of Science, 2023. doi: 10.1086/727052.
  • Knox, Eleanor. Newton-Cartan theory and teleparallel gravity: The force of a formulation. Studies in History and Philosophy of Modern Physics, 42(4):264–275, 2011.
  • Knox, Eleanor. Spacetime structuralism or spacetime functionalism, 2014. Manuscript.
  • Knox, Eleanor. Physical relativity from a functionalist perspective. Studies in History and Philosophy of Modern Physics, 67:118–124, 2019.
  • Knox, Eleanor and David Wallace. Functionalism fit for physics. Unpublished manuscript, 2023.
  • Lam, Vincent and Michael Esfeld. A dilemma for the emergence of spacetime in canonical quantum gravity. Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics, 44(3):286–293, 2013.
  • Lam, Vincent and Christian Wüthrich. Spacetime is as spacetime does. Studies in History and Philosophy of Modern Physics, 64:39–51, 2018.
  • Lam, Vincent and Christian Wüthrich. Spacetime functionalism from a realist perspective. Synthese, 199:335–353, 2021.
  • Lam, Vincent and Christian Wüthrich. Laws beyond spacetime. Synthese, 202(71), 2023.
  • Le Bihan, Baptiste. The unrealities of time. Dialogue, 54(1):25–44, 2015.
  • Le Bihan, Baptiste. Space emergence in contemporary physics: Why we do not need fundamentality, layers of reality and emergence. Disputatio, 10(49):71–95, 2018a.
  • Le Bihan, Baptiste. Priority monism beyond spacetime. Metaphysica, 19(1):95–111, 2018b.
  • Le Bihan, Baptiste. String theory, loop quantum gravity and eternalism. European Journal for Philosophy of Science, 10(2):1–22, 2020. doi: 10.1007/s13194-020-0275-3.
  • Le Bihan, Baptiste. Spacetime emergence in quantum gravity: Functionalism and the hard problem. Synthese, 199(2):371–393, 2021.
  • Le Bihan, Baptiste. String theory for metaphysicians. Unpublished manuscript, 2023.
  • Le Bihan, Baptiste. The great loop: From conformal cyclic cosmology to aeon monism. Journal for General Philosophy of Science / Zeitschrift für Allgemeine Wissenschaftstheorie, 2024.
  • Le Bihan, Baptiste and Niels Linnemann. Have we lost spacetime on the way? Narrowing the gap between general relativity and quantum gravity. Studies in History and Philosophy of Modern Physics, 65:112–121, 2019.
  • Le Bihan, Baptiste and James Read. Duality and ontology. Philosophy Compass, 13(12):e12555, 2018.
  • Lewis, David. Psychophysical and theoretical identifications. Australasian Journal of Philosophy, 50(3):249–258, 1972.
  • Lewis, David. On the Plurality of Worlds. Blackwell Publishers, 1986.
  • Linnemann, Niels. On the empirical coherence and the spatiotemporal gap problem in quantum gravity: And why functionalism does not (have to) help. Synthese, 199(2):395–412, 2020.
  • Major, Seth, David Rideout, and Sumati Surya. Stable homology as an indicator of manifoldlikeness in causal set theory. Classical and Quantum Gravity, 16(17):175008, 2009.
  • Malament, David B. The class of continuous timelike curves determines the topology of spacetime. Journal of Mathematical Physics, 18(7):1399–1404, 1977.
  • Matsubara, Keizo and Lars-Göran Johansson. Spacetime in string theory: A conceptual clarification. Journal for General Philosophy of Science, 49(3):333–353, 2018.
  • Maudlin, Tim. Completeness, supervenience and ontology. Journal of Physics A: Mathematical and Theoretical, 40(12):3151–3171, 2007.
  • McTaggart, John E. M. The unreality of time. Mind, 17:457–484, 1908.
  • Menon, Tushar. Taking Up Superspace: The Spacetime Setting for Supersymmetric Field Theory. In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime: Implications from Quantum Gravity. Oxford University Press, Oxford, 2021.
  • Miller, Kristie. Spatiotemporal projectivism. In Stephen Hetherington, editor, Extreme Philosophy, pages 47–61. Routledge, 2024.
  • Monton, Bradley. Presentism and quantum gravity. In Dennis Dieks, editor, The Ontology of Spacetime, volume 1 of Philosophy and Foundations of Physics, pages 263–280. Elsevier, 2006.
  • Nagel, Ernest. The Structure of Science. Routledge, London, 1979.
  • Neurath, Otto. Soziologie im Physikalismus. Erkenntnis, 2:393–431, 1931.
  • Ney, Alyssa. The status of our ordinary three dimensions in a quantum universe. Noûs, 46(3):525–560, 2012.
  • Ney, Alyssa. Fundamental physical ontologies and the constraint of empirical coherence: A defense of wave function realism. Synthese, 192(10):3105–3124, 2015.
  • Ney, Alyssa. From quantum entanglement to spatiotemporal distance. In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime, pages 78–102. Oxford University Press, Oxford, 2021a.
  • Ney, Alyssa. The World in the Wave Function: A Metaphysics for Quantum Physics. Oxford University Press, New York, 2021b.
  • Read, James and Baptiste Le Bihan. The landscape and the multiverse: What’s the problem? Synthese, 199(3):7749–7771, 2021.
  • Read, James, Harvey R Brown, and Dennis Lehmkuhl. Two miracles of general relativity. Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics, 64:14–25, 2018.
  • Reichenbach, Hans. The Direction of Time. Dover Publications, 1956.
  • Rettler, Bradley. Mereological nihilism and puzzles about material objects. Pacific Philosophical Quarterly, 99(4):842–868, 2018.
  • Rideout, David and Petros Wallden. Emergence of spatial structure from causal sets. In Journal of Physics: Conference Series, volume 174, page 012017. IOP Publishing, 2009.
  • Rideout, David P and Rafael D Sorkin. Classical sequential growth dynamics for causal sets. Physical Review D, 61(2):024002, 1999.
  • Rovelli, Carlo. Quantum Gravity. Cambridge University Press, Cambridge, 2004.
  • Rovelli, Carlo and Francesca Vidotto. Covariant Loop Quantum Gravity: An Elementary Introduction to Quantum Gravity and Spinfoam Theory. Cambridge University Press, Cambridge, 2014.
  • Saucedo, Raul. Parthood and location. In Dean W. Zimmerman and Karen Bennett, editors, Oxford Studies in Metaphysics Vol. 5, pages 225–286. Oxford University Press, Oxford, 2011.
  • Schaffer, Jonathan. Spacetime the one substance. Philosophical Studies, 145(1):131–148, 2009.
  • Schaffer, Jonathan. Monism: The priority of the whole. Philosophical Review, 119(1):31–76, 2010.
  • Schaffer, Jonathan. Grounding in the image of causation. Philosophical Studies, 173(1):49–100, 2016.
  • Schlick, Moritz. Über das Fundament der Erkenntnis. Erkenntnis, 4:79–99, 1934.
  • Skow, Bradford. Objective becoming. Oxford University Press, 2015.
  • Smith, Nicholas JJ. Inconsistency in the A-theory. Philosophical Studies, 156:231–247, 2011.
  • Tomasiello, Alessandro. Geometry of String Theory Compactifications. Cambridge University Press, Cambridge, 2022.
  • van Fraassen, Bas. An Introduction to the Philosophy of Time and Space. Columbia University Press, 1970.
  • Vassallo, Antonio and Michael Esfeld. A proposal for a Bohmian ontology of quantum gravity. Foundations of Physics, 44:1–18, 2014.
  • Vistarini, Tiziana. The Emergence of Spacetime in String Theory. Routledge, New York, 2019.
  • Wallace, David. Quantum gravity at low energies. Studies in History and Philosophy of Science Part A, 94(C):31–46, 2022. doi: 10.1016/j.shpsa.2022.04.003.
  • Wilson, Alastair. Metaphysical causation. Noûs, 52(2):723–751, 2018.
  • Wilson, Alastair. Explanations of and in time. In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime: Implications from Quantum Gravity, pages 182–198. Oxford University Press, Oxford, 2021a.
  • Wilson, Jessica. Metaphysical Emergence. Oxford University Press, Oxford, 2021b.
  • Wüthrich, Christian. No presentism in quantum gravity. In Vesselin Petkov, editor, Space, Time, and Spacetime, pages 257–278. Springer, 2010.
  • Wüthrich, Christian. The fate of presentism in modern physics. In Robert Ciuni, Kristie Miller, and Giuliano Torrengo, editors, New Papers On The Present–Focus On Presentism, pages 91–131. Philosophia Verlag, Munich, 2013.
  • Wüthrich, Christian. Raiders of the lost spacetime. In D. Lehmkuhl, G. Schiemann, and E. Scholz, editors, Towards a Theory of Spacetime Theories, pages 297–335. Birkhäuser, Basel, 2017.
  • Wüthrich, Christian. Time travelling in emergent spacetime. In Judit Madarász and Gergely Székely, editors, Hajnal Andréka and István Németi on Unity of Science: From Computing to Relativity Theory Through Algebraic Logic, pages 453–474. Springer, 2021.
  • Wüthrich, Christian and Craig Callender. What becomes of a causal set? British Journal for the Philosophy of Science, 68(3):907–925, 2016.
  • Wüthrich, Christian. When the actual world is not even possible. In George Darby, David Glick, and Anna Marmodoro, editors, The Foundation of Reality: Fundamentality, Space and Time. Oxford University Press, Oxford, 2019.
  • Wüthrich, Christian. One time, two times, or no time? In Alessandra Campo and Simone Gozzano, editors, Einstein vs. Bergson: An Enduring Quarrel on Time, pages 209–230. de Gruyter, Berlin, 2022.
  • Yates, David. Thinking about spacetime. In Christian Wüthrich, Baptiste Le Bihan, and Nick Huggett, editors, Philosophy Beyond Spacetime, pages 129–153. Oxford University Press, 2021.
  • Zimmerman Jones, Andrew and Alessandro Sfondrini. String Theory for Dummies. John Wiley & Sons, 2022. 2nd edition.
  • Zwiebach, Barton. A First Course in String Theory. Cambridge University Press, Cambridge, 2009.

 

Author Information

Baptiste Le Bihan
Email: baptiste.lebihan@unige.ch
University of Geneva
Switzerland

and

Annica Vieser
Email: annica.vieser@unige.ch
University of Geneva
Switzerland

Being

From virtually the beginning of the Western tradition, philosophers have at least sporadically recognized that being is the primordial issue in philosophy. It is such because every theoretical sentence is implicitly or explicitly governed by a theoretical operator including a conjugated form of the verb “to be”; hence, everything we think or talk about is either being itself or an instance or aspect of being. In the language of concepts, the concept being presupposes no other concepts, but is itself presupposed by all other concepts. According to the Eleatic Visitor in Plato’s Sophist, being [ousia] was of sufficient philosophical importance to his predecessors to have instigated “something like a battle of gods and giants among them.” Shortly thereafter, being qua being [to on he on] (or, more accurately, the entity as entity, or better yet, the be-er as be-er) is identified in Aristotle’s Metaphysics, among the most influential books in the history of philosophy, as the issue at the heart of first philosophy. Nevertheless, there are central issues concerning being that are not recognized by any Greek philosopher, indeed not identified until the thirteenth century C.E., in some works by Thomas Aquinas. But these issues are not adequately treated by Aquinas, and, after Aquinas, they are recognized only quite rarely.

This article presents these issues about being by relying upon the present-continuous tense of English, as used in the sentence “It’s being” and the sentence operator “It’s being such that.” These formulations make possible the articulation of the primordiality, universality, and uniqueness of being. More specifically, this article presents the issues through the lens of the structural-systematic philosophy (SSP); the issues’ importance is indicated by the explicit inclusion of the word “being” in, say, the titles of the books Structure and Being, Being and God, and Being and Nothing.

Table of Contents

  1. Articulating Being
    1. Refinements of Vocabulary
    2. A Refinement of Semantically Significant Sentence-Structures
  2. Theories of Being and Theories of Be-ers
  3. Central Aspects of the SSP’s Theory of Being: The Primordiality, Ubiquity, Uniqueness, and Universal Intelligibility of Being
  4. Being and Whatness
  5. Neglectfulness of Being
    1. Examples of Neglectfulness of Being
      1. Paired Philosophical Examples
      2. An Additional Philosophical Example
      3. An Example from Physics
  6. Being and Existing
  7. Dimensions of Being
  8. Being and God
    1. The Relation between the Contingent Dimension of Being and the Absolutely Necessary Dimension of Being
    2. God
    3. The Principle of Rank within Being and Evolution
    4. The SSP and Christianity
  9. References and Further Reading

1. Articulating Being

One centrally important reason for lack of sufficient clarity in philosophical treatments of being is that at least most of the ordinary languages that philosophers have relied on throughout the history of philosophy, emphatically including English, articulate being in a variety of inadequate ways. That they do so adds needless and often misleading complications to being’s articulation. A first source of complications in ordinary English is the vocabulary available for the articulation of being; a second source is the structures of the sentences it provides for articulating being. Each of the following two subsections first identifies specific problems with ordinary-English ways of articulating being, and then introduces refinements to the SSP’s language that enable it to avoid these problems. This article’s version of the SSP differs from Lorenz B. Puntel’s version, particularly in the ways it articulates being and hence in its theory of being.

a. Refinements of Vocabulary

Three peculiarities of the words used by ordinary English to articulate being are of particular importance as far as philosophical articulations of being are concerned. The first peculiarity is that the word “being” has (most relevantly) the following distinct senses: (1) a nominal sense, in which “being” is roughly synonymous with “entity,” and (2) a verbal sense, in which “being” is roughly synonymous with “existing,” in the sense articulated in the Oxford English Dictionary (henceforth, OED) as “the fact of belonging to the universe of things material or immaterial.” Because the two senses are available, one can say both “To be is to be a being” and “To be is to be being,” or “I am a being” and “I am being.” Philosophical uses of the word “being” that do not clearly distinguish these senses, or that do not make clear, in all relevant cases, which sense is intended, require clarification.

The second important peculiarity of words used in ordinary English to articulate being is that, because the word “is” is so often used as copula (or, on an alternative interpretation, as a component of predicates, as in “is red” or “is human”), such sentences as “Fred is” can appear to be incomplete; hearing the sentence “Fred is,” one might well wonder, “Fred is what?”. Similarly, the question “Does God exist?” is more readily intelligible than is the question, “Is God?”.

Presumably because sentences ending with “is” so easily appear incomplete, “is” is rarely used as the final word in sentences situating the referents of their subject terms within being. Instead, for this sense, the “is” usually follows the word “there,” in the phrase “there is.” In this phrase (as in “there are”), the “there” does not perform its usual role of indicating a location that is specified later in the sentence (as in, “There is a pizza restaurant on the corner”). It instead signals that the “is” situates the referent of its subject term within being, rather than functioning as copulative or predicative. That English can express this sense of being by “is,” by “exists,” and by “there is,” with the first of these being the most problematic, introduces avoidable confusion.

The third important peculiarity in ordinary-English words used to articulate being is that several conjugated forms of the verb “to be” have roots different both from that of the infinitive and from those of one another; these include, among others, “am,” “is,” and “are.” Consequently, although any sentence using any one of these words at least co-articulates being, the words themselves do not make that fully explicit.

One way to improve talk about being, using a slightly modified version of ordinary English, is to introduce a capitalized version of the word (as is done in Being and God and Being and Nothing), and to explicitly link that version to the use of “being” in sentences that situate the referents of their subject terms within being. That remedy, however, avoids only the first of the three peculiarities just identified. Therefore, this article, following White’s Toward a Philosophical Theory of Being (henceforth, TAPTOE) and “Rearticulating Being,” proceeds differently. First, instead of using the word “being” in a nominal sense, in which it would be roughly synonymous with the word “entity,” it introduces for that sense the technical term “be-er,” a word similar to such ordinary-language terms as “runner,” “swimmer,” “writer,” and “philosopher.” Just as running is not a runner and does not run, being is not a be-er and does not be; instead, runners run, and be-ers be. Second, the SSP often uses “be” when ordinary English would require “am,” “is,” or “are,” that is, as the sole form of its verb “to be” in the simple present, and as a component of present-continuous verbs. A native speaker of Jamaican English has confirmed to me that that language indeed uses the sentence “We be jamming,” and that sentence is intelligible to speakers of other versions of English, as are (for example) “I be talking,” “You be reading,” and “We be philosophizing.” In the technical language of the SSP, there be human be-ers; human be-ers be the be-ers that be human. Their unavoidable mode of being be being human.

Grammatically, these variants of parts of the verb “to be” make that verb much more regular than its counterpart in ordinary English. Philosophically, they enable the SSP to directly and explicitly articulate being and its ubiquity. Thereby, it is hoped, this modified, technical English presents a more powerful obstacle than does ordinary English to the tendency that Heidegger calls forgetfulness or oblivion of being. In other words, these changes are meant to make it harder for us to fail to notice the ubiquity of being—to notice that whenever we are speaking or thinking, we are speaking or thinking either (rarely) of being itself, or (usually) of instances or aspects of being.

b. A Refinement of Semantically Significant Sentence-Structures

Drawing on several works by Étienne Gilson, this subsection shows how the vocabulary introduced above can make possible the direct and explicit articulation of being. Such articulation is the strongest obstacle to the oblivion of being (of which more below) only if it is accompanied by a refinement of semantically significant sentence-structures. A first step is taken with clarification of the ubiquity of being as articulable in theories.

Theories are articulated as collections of indicative sentences and, as Gilson 1952 (197) points out (using the term “affirmations” rather than “indicative sentences”),

the principal function of the verb is to affirm, and since affirmation remains the same whatever may happen to be affirmed, a single verb should suffice for all affirmations. In point of fact, there is such a verb, and it is “to be.” If only spoken usage allowed it, we would never use any other one…. Not I live, or I sit, but I am living, I am sitting and likewise in all other cases.

If the “ams” in “am living” and “am sitting” are understood as components of present-continuous verbs rather than as being copulative or predicative uses of “am,” then of course they are not simply forms of the verb “to be,” but even then, the present-continuous verbs that include them co-articulate being. Moreover, additional distinctions are necessary, for example between the likes of “She be running” and “She be a runner” or, more expansively, “She be a be-er who also (more specifically) be running (right now)” and “She be a be-er who also (more specifically) be a runner (even if not running right now).”

A consequence of the possibility of such reformulations is that every theoretical sentence can be made to co-articulate being. Yet, for reasons given in Gilson 1948 (284-5), co-articulation of being has not sufficed, historically, to counter the oblivion of being. One reason for this is that the sentences considered in that text, like most sentences in English and in most and perhaps all other languages that have been used by philosophers, include semantically significant grammatical subjects.

According to Gilson, being is most directly articulated in sentences of the form “S [a semantically significant subject] be”; any such sentence articulates “the composition of the subject with its act of being, it unites them in thought as they are already united in reality.” [original text altered slightly to incorporate this article’s language of being]. Yet although any such sentence unites them, the text tells us that the human intellect (284) tends to focus on the subject—the be-er—and thereby to neglect the “act of being.” In any such sentence, being is articulated only “as included in the [be-er].” That it is only so articulated:

is often serious, to the point of sometimes being catastrophic, because, as history has made us see, the spontaneous conceptualism of ordinary thought tends constantly to reinforce the essence of the [be-er] to the detriment of its act of [being]. Let us also add that this fact is easily explained because the [be-er] has more than its [being], that is, it has its concept… (Gilson 1948, 284-5).

Gilson 1948 recognizes, as the only sentence-forms that articulate being, “S is p” and “S is.” The SSP, however, links its semantics and ontology not to sentence-forms including semantically significant subject-terms, but instead to sentences of the form “It’s such-and-suching.” This form makes it easy to explicitly and exclusively articulate being itself, not being as included in any be-er. This is done with the sentences “It’s being” and “It be being”, whose only semantically significant terms are their present-continuous verbs (the “It” in such sentences is considered below). The decisive contributions that these formulations make to the SSP’s theory of being are explained in greater detail in what follows, but one that links to (Gilson 1948) is appropriately included here. According to that text (248), if a sentence of the form “S is” is understood as articulating being, it says “not that the subject is itself, which is always true of everything, but [instead] that it is, which is not true, and moreover not always, except for some.”

As is more fully explained in what follows, the SSP’s sentence “It be being” is always true; this is a first indication of the primordiality of being.

2. Theories of Being and Theories of Be-ers

Throughout the history of philosophy, most theories that theoretical frameworks relying on ordinary English would classify as theories of being are, according to the SSP, instead theories of be-ers. Moreover, in this terminology (and, more broadly, in that of the SSP), theories of be-ers are ontologies. Theories of be-ers are concisely summarized by sentences of the form “To be a be-er is to be x,” with x generally replaced by one or more nouns, with appropriate article(s). Most ontologies, both historically and at present, hold that to be a be-er is to be either a thing (or object or substance), or an attribute of a thing (a property or, in some variants, a relation); according to such ontologies, the apple that is red and is on the table is (or be), and so are (or be) the redness of that apple and the table that the apple is on. The SSP’s ontology holds instead, as discussed in various contexts in TAPTOE, that to be a be-er is to be a fact (or, in TAPTOE’s technical term, a facting).

Why is there need for a theory of being? To answer this question, compare running: one could assemble a list of runners, but unless one had a theory of running, one would be unable to explain why the list contained the items that it did. Similarly, the one thing that all be-ers have in common is that they be, so in the absence of a theory of being, one is unable to explain why the list of be-ers contains the items that it does.

In what follows, clarity is served by speaking of frameworks for theories of be-ers that include thing- or substance- or object-ontologies as whatness-frameworks, because at least the most prominent members of that family of frameworks include identifiable versions of the thesis of the primacy of whatness, that is, the thesis that every be-er is, primarily, its whatness. In Aristotelian frameworks, the problem of being is not recognized, so the primacy of whatness is the primacy of substance over at least accidental attributes; according to such frameworks, Alan White is, primarily, either a human be-er (essence) or the specific human be-er that he is (individual), but he must be that whatness in order to be, at some specific time, doing anything else, for example, sitting rather than standing.

Far later than Aristotle, Thomas Aquinas takes an important step in recognizing, at least occasionally, the primacy of being over whatness but, as Gilson 2002 indicates (163), he has important predecessors:

Other philosophers had preceded Thomas along this path, and all of them helped him to follow it through to the end, particularly those among them who clearly raised the problem of [being]. Alfarabi, Algazel, Avicenna among the Arabs, Moses Maimonides among the Jews, had already noted the truly exceptional position that [being] occupies in relation to essence…. What seems to have especially intrigued these philosophers is that, however far you push the analysis of essence, [being] must be added to it in some way from outside, as an extrinsic determination conferring on it the act of [being]…. These philosophers started from essence, and using analysis they sought to discover [being] within it, but they could not find it there. Hence their conclusion: [being] was extraneous to essence as such…. So Alfarabi concludes: “[Being] is not a constitutive factor; it is only an accessory accident.”

As this passage makes clear, these predecessors move beyond Aristotle in positing the primacy of whatness not only over attributes, but also over being. Yet if what is called being is extraneous to essence, then essence is extraneous to what is called being. If, however, essence is, and is extraneous to what is called being, then what is called being is not being in its ubiquity. Additional steps must be taken.

As indicated above and considered in somewhat more detail below, some of those steps are detectable in various works by Thomas Aquinas although, as shown in Being and God (1.3.2), those works contain no theory of being. Moreover, those works’ articulations of being have not been widely influential. Indeed, as noted in Gilson 1952 (118), “the genuine meaning of the Thomistic notion of being is, around 1729, completely and absolutely forgotten,” thanks chiefly to the dominant influence of Suarez. A revival of so-called existential Thomism develops in the 1930s, and is sufficiently developed by the 1950s that Clarke 1955 includes (61-2) the following announcement:

What is now widely known as the existential interpretation of Thomistic metaphysics has definitely come of age. (By existential I mean that interpretation which sees in the act of [being] the source of all perfection and intelligibility, hence the center of gravity of St. Thomas’[s] whole philosophy)…. As speculation and text work proceed hand in hand, each illuminating the other, it is becoming more and more evident that this perspective is by no means some short-lived fad borrowed from the contemporary Existentialist movements and superimposed extrinsically on St. Thomas’[s] own thought, but rather that it is that one luminous center … in the light of which alone the total body of St. Thomas’[s] texts takes on full intelligibility and coherence.

Fad or not, the revival appears to have been relatively short-lived, and certainly had no influence on the mainstream analytic philosophy that has been dominant since around the time of the revival. Worse yet, as the SSP shows, “the total body of St. Thomas’[s] texts” cannot “[take] on full intelligibility and coherence,” not only because the heart of the substance ontology relied on by those texts is unintelligible (see TAPTOE 2.5), but also (and more importantly, for this article) because although some of those texts recognize the primacy of being, none adequately articulates the ubiquity of being because in them, essence remains somehow distinct from being (see also Being and God 1.3.2.2). The remainder of this section, however, focuses on semantic and ontological problems that would remain even if a variant were developed according to which essence was fully intrinsic to being.

The semantic and ontological problems endemic to Thomistic being-frameworks are linked to the one identified in a passage from Gilson 1948 considered above. They arise from the at least tacit reliance of Thomistic frameworks on compositional semantics—according to which, roughly speaking, the meanings or semantic values of sentences are functions of the meanings or semantic values of their sub-sentential components—and on ontologies strongly linked to semantically significant grammatical subjects and predicates. Within these frameworks, the semantic counterparts to both subjects and predicates are concepts, such that, for example, the sentence “Unicorns are mythological” links the word “unicorns” to the concept unicorn and the word “mythological” to the concept mythological. The semantic status of being, within such frameworks, is problematic because there is no concept that can be linked to the word “is” (or “are”) in any manner comparable to that in which the concept unicorn is linked to the word “unicorn,” and the concept mythological to the word “mythological.” This makes both the semantic and the ontological status of being obscure.

This point is worth emphasizing. Asked to clarify “unicorns,” the utterer of the sentence would easily be able to say that unicorns are animals that are like horses except that they have single sharp horns growing from their foreheads, and asked about “mythological,” the utterer could easily say that “mythological” beings are ones that occur only in stories told by human be-ers but not, unlike for example horses, cats, and dogs, in reality. But asked about the “are,” the utterer would have no comparably available explanation.

Gilson 2002 (172) addresses this problem by noting initially that “Being is the first of all concepts,” because being is co-articulated in every sentence of the form ‘S is’ or ‘S is p’ and, as shown above, any indicative sentence can be rewritten into a sentence with such a form. Being is also, however, “the most universal and abstract [concept], the richest in extension and the poorest in comprehension.” It is “richest in extension” because any be-er can be articulated in a sentence whose verb is “is,” but “poorest in comprehension” because sentences of the form “S is” can appear to say nothing specific about the be-er. Because of this poorness in comprehension, Gilson 2002 voices the suspicion that we would need either “an intuition of [being]” or an “intellectual intuition of being as being” in order to comprehend it, yet such an intuition would be inarticulable and thus would not make being conceivable. “But,” the book continues (174), “reason dislikes what is inconceivable, and because this is true of [being], philosophy does all it can do to avoid it.”

Concerning the articulation of being, then, Thomistic frameworks face first the at least partially avoidable problem that, because their sentences at best co-articulate being, they tend to place more emphasis on be-ers than on being. This problem is avoided by those who emphasize being, but those who do, even if they also were to recognize the ubiquity of being, would face the insuperable obstacles to clarification of being posed by those frameworks’ semantics and ontologies. What is needed, then, if philosophy is to cease to avoid being, is not a new Thomistic theory of being, but instead a different theory of being.

3. Central Aspects of the SSP’s Theory of Being: The Primordiality, Ubiquity, Uniqueness, and Universal Intelligibility of Being

Given that a theory of being (or of anything else) must be a collection of meaningful, or semantically significant, sentences, any adequate theory of being must explain its semantics, that is, how its sentences are meaningful.

As indicated in various contexts in TAPTOE, the SSP rejects both compositional semantics and any ontology or semantics strongly linked to the structural components of subject-predicate sentences. It links its ontology and semantics instead to sentences with the structure “It’s such-and-suching.” Because such sentences (technically, sentencings) have no semantically significant subsentential components, their semantics cannot be compositional; they are instead contextual, that is, ones according to which words have determinate meanings or semantic values only within the contexts of sentences. The semantic contents of sentences are propositions (technically, propositionings), and these semantic contents relate to the SSP’s ontology such that every propositioning is identical to a facting. The absolutely comprehensive facting IT’S BEING (or IT BE BEING) is identical to the true propositioning It’s being (or It be being), which is expressible by the true sentencing “It’s being” (or “It be being”).

Being as articulated in the sentence “It be being” may appear to be, in Gilson 2002’s terms, “universal and abstract,” but it is straightforwardly and transparently concretized by means of that sentence’s expansion into the operator “It be being such that,” which can govern any and every sentencing that expresses a propositioning. An example: “It be being such that It be Alan Whiting such that It be revising It’s TAPTOEing,” a sentencing true at the time of its initial composition. Hence, although being is of problematic intelligibility within the theoretical frameworks considered and relied on in Gilson’s works, because their semantic focus is on concepts and being has no clear conceptual status within them, within the framework of the SSP, being is directly articulated by a true sentencing that expresses a true propositioning identical to an actual facting.

The uniqueness of being mentioned in this subsection’s title is clarified by more detailed comparison of being with running (running being simply one of a vast number of possible comparative items). Most human be-ers are capable of running or (in slightly different terms) have the capacity to run. The human be-er who runs is activating that capacity (that human be-er be at work running, be engaged in running), whereas the human be-er who sits generally retains the capacity to run while not activating it, while not engaged in running. In contrast, every human be-er who actually be cannot avoid engaging in being, cannot avoid being at work being; any human be-er not engaging in being would be a merely possible human be-er, like a future grandchild or perhaps a human be-er who, although having been, be no longer. Thus, whereas running is an ontological capacity because human be-ers (along with be-ers of many other kinds) can but need not be engaged in running, being is not a capacity, because be-ers, of whatsoever sort, have any capacities, activated or not, only if and when they be. This, then, is the most central way that the being of be-ers differs from all of their other engagements (or, more accurately, modes of being): being is the engagement or being-at-work (or, again, mode of being) that is not an ontological capacity. In this way, it is absolutely unique.

The uniqueness of being among the engagements of be-ers is further illuminated by the phenomenon of cryopreservation. Some organisms—including human embryos and adult members of a few species of vertebrates (chiefly amphibians)—can continue to be—and to be the organisms that they are—while they are frozen. While they are frozen, all of their metabolic processes cease. Hence, when frozen, they do not activate their capacities for aging or even for living, in anything like the usual sense of living, although they are not dead. They are not dead because they retain the capacity to live; that capacity is reactivated when they cease to be frozen. Even as frozen, then, they continue to be, to engage in being.

An additional step leads, in a manner different from those introduced above, from the being of be-ers to being itself. No organism has the capacity to bring itself into being, because before the organism is, the organism has no capacities. And yet, the coming into being of the organism reveals that it was possible that the organism come into being. The coming into being of the organism therefore reveals the capacity of being to be manifest, to manifest itself, as that organism. The birth of the organism is being’s reconfiguration of itself so as to include that organism; it is the emergence of the organism into and hence within being. For this reason, the gestation of the organism that grows into a salamander is also articulable as being engaging in salamandering: It be being such that It be salamandering.

“Salamandering” is of course a peculiar word, but one whose inclusion in this article should not be surprising, given that one way to be, according to the SSP’s ontology, is to be an actual facting identical to a true propositioning expressible by a true sentencing such as, for example, “It’s salamandering.” A fuller consideration of the “It” in this sentencing further clarifies the involvement of being in salamandering (on this topic, see also Being and God 3.2.1.1).

According to the OED, the “it” of “It’s raining” (and that of “It’s salamandering”) is “the subject of an impersonal verb or impersonal statement, expressing action or a condition of things simply, without reference to any agent.” The OED includes the following examples: “It has fared badly with the soldiers; How is it in the city? It will soon come to a rupture between them; It is all over with poor Jack; It is very pleasant here.” Resituated within the theoretical framework of the SSP, none of these sentences can articulate any condition of things, but each can articulate, intelligibly and coherently, a configuration of being. That each can indeed do so is revealed by the fact that each remains intelligible (although, of course, it also becomes peculiar) if its “it” is replaced with “being”: “Being has fared badly with the soldiers; How is being in the city? Being will soon come to a rupture between them; Being is all over with poor Jack; Being is very pleasant here.” Given its rejection of the semantics linked to subject-predicate sentences, the SSP cannot of course identify being, in any of these cases, as the referent of the “it” that the word “being” replaces. Instead, it takes each “it” to indicate being, a configuration of which is articulated by the words that complete the sentence. Hence, an alternative formulation of one of the sentences introduced above is, “It’s being such that It’s faring badly with the soldiers.”

A further step is taken following the introduction of a second instance of the impersonal “it,” one that is of central importance to the SSP. This is the “it” of the theoretical operator, which has the forms “It is (it be) the case that” and “It is (it be) true that.” As explained more fully in TAPTOE (see 3.4-5, 6.3, and 6.3.1.), prefixing the theoretical operator to any indicative (hence, theoretical) sentence can make explicit the semantic status of that sentence. Hence, for example, the semantic status of “It’s raining,” as asserted, is explicitly articulated in the sentence “It is the case that it’s raining.”

In terms of indicative function, the “it” of any sentencing, such as “It’s raining,” and the “it” of the theoretical operator are not simply identical. The “it” of “It’s raining” usually indicates a configuration of being at a specific spatio-temporal location. Because, however, the theoretical operator makes explicit the semantic status of every theoretical sentence, its scope is absolutely unrestricted: it thus indicates being as a whole. The example sentence is thus understood as follows: being as a whole is configured such that being here-and-now is configured such that raining is ongoing. Or: It be being as a whole such that It be being here-and-now such that It be raining. Or: It be being as a whole such that It always be being such that It be 2+2=4ing.

In closing this section, note that a thesis central to the SSP’s theory of being as such is that being is universally intelligible.

4. Being and Whatness

A central advantage of being-frameworks (over essence- or whatness-frameworks) is that they can make the essence or whatness of any be-er intrinsic to the being of that be-er. By contrast, even in whatness-frameworks that recognize being, being is generally taken to be extrinsic to essence or whatness. This section shows that and how, according to the SSP, (1) being is prior to the whatness of any be-er and (2) whatness is intrinsic to the being of any be-er. To this end, this article, following TAPTOE, introduces two examples, one of a biological be-er’s coming- and ceasing-to-be, and one of an artifactual be-er’s coming- and ceasing-to-be.

The biological example is a possible case of in-vitro fertilization (in-vitro so as not to add a woman’s womb as a complicating factor). Prior to fertilization, there are, of central importance to this example, two be-ers, thus, two be-ers that be, that be engaged in being: these be a sperm cell and an egg cell. If fertilization occurs, egg and sperm will have ceased to be, and a zygote will have come to be. Much about specifically what that zygote will be, if it comes to be, is presumably determined by the genetic make-ups of the sperm and egg cells. If a zygote comes to be, it may be the only zygote that could have come to be, in this situation, but it will be that zygote only when it comes to be. There be one cell at work being a sperm cell, and one cell at work being an egg cell; if fertilization occurs, it will be because egg and sperm have jointly reconfigured themselves into constituents of a new be-er, the zygote.

Considered somewhat differently: sperm and egg are both specific configurations of being, of being because they be, and specific because each has specific capacities, and lacks capacities had by other kinds of be-ers; each has the capacity to be for some time, to unite with the other, and, in so uniting, to be reconfigured, or to reconfigure itself, such that some of what had been its constituents become constituents of a zygote. They lack the capacity to be jointly reconfigured, or to jointly reconfigure themselves, into anything other than a zygote. If they unite, the zygote will come to be as itself a specific or restricted configuration of being.

Even if the zygote as a new organism comes to be, it will not, of course, continue forever to be. At some point, it will die—possibly quite abruptly. The physical changes in the organism, when it dies, can appear to be relatively slight, but the ontological change could not be more profound. Following death, many organs that had been components of the organism’s body may continue to be, and to be at work continuing to be, for some time, but they will no longer be at work as organs, because there will no longer be an organism.

The artifactual example is the following: contemplating what to cook for breakfast, I may narrow my choices to oatmeal or an omelet. If I choose to make the omelet, I have—in one terminology—determined the essence of the be-er that will come to be if I succeed in making it. I can have before me all of the ingredients I will use—eggs and, say, sausage, onion, cheese, and the butter with which I will coat the frying pan—and if I proceed, what omelet will come to be, if an omelet comes to be, is highly determinate. But there is not yet an omelet, and nothing about the possible omelet’s essence or whatness determines whether or not it will come to be. What have come to be, already, are the ingredients, with their constituents jointly at work enabling them to continue to be, and I myself, the potential chef. If I opt for the omelet rather than for oatmeal, and if I successfully follow the requisite procedures, the ingredients will begin to work together in such a way that, soon thereafter, an omelet will begin to be. But the beginning to be is the beginning of the omelet—there be no omelet until the omelet be; until the omelet be, the ingredients be, but the omelet does not. After I have eaten the omelet, of course, the omelet no longer be, but its constituents continue to be, as they—or at least some of them—are reconfigured, temporarily, into constituents of my body.

5. Neglectfulness of Being

This subsection is metasystematic because it treats texts external to the SSP, but it is included to further clarify the central importance of being as a topic for systematic philosophy. Its phrase “neglectfulness of being” is a formulation more accurate than the Heideggerian “forgetfulness” or “oblivion” of being, mentioned above. The reason for introducing this phrase is suggested by what is said in subsection 3 about the theoretical operator. Because the theoretical operator indicates and indeed discloses being as a whole, being cannot be wholly absent from any theoretical framework. In many (indeed, presumably, in the overwhelming majority) it is tacitly presupposed, and nowhere denied. This is generally wholly non-problematic, although it would be a fatal flaw in any systematic philosophy, because any systematic philosophy not including a theory of being would be incomplete. Also fatally flawed, however, are theories that appear to deny being, despite (unavoidably) presupposing being. This is clarified by examples.

a. Examples of Neglectfulness of Being

i. Paired Philosophical Examples

van Inwagen 1996 includes (96) the following:

If the notion of an abstract object makes sense at all, it seems evident that if everything were an abstract object, if the only objects were abstract objects, there is an obvious and perfectly good sense in which there would be nothing at all, for there would be no physical things, no stuffs, no events, no space, no time, no Cartesian egos, no God…. When people want to know why there is anything at all, they want to know why that bleak state of affairs does not obtain.

Worth noting in passing is that speaking of states of affairs as “obtaining” or not “obtaining” (rather than as being or not being) is common in analytic philosophy, and is an evasion or neglect of being; what other than being could “obtaining” be? Be that as it may, by “there would be nothing at all,” van Inwagen 1996 explicitly means that there would be no non-abstract objects; there would however be, in his scenario, abstract objects, hence not the utter absence of being, which, as shown below, cannot be.

Lowe 1996, in response to van Inwagen 1996, includes (115) the following:

Suppose we could show that there couldn’t be a world containing only abstract objects, perhaps by arguing that abstract objects necessarily depend for their existence upon concrete objects: what would follow? Clearly, it would follow that van Inwagen’s ‘bleak’ state of affairs couldn’t obtain. And yet, in a perfectly clear sense, this wouldn’t suffice to show that it was necessary for something concrete to exist: for we wouldn’t have foreclosed the possibility that nothing at all—nothing either concrete or abstract—might have existed. To foreclose that possibility, it seems, we would need also to show that at least some objects, abstract or concrete, exist in every possible world.

For Lowe 1996, the possibility that “nothing at all—nothing either concrete or abstract—might have existed” is open if there is a possible world containing no concrete or abstract objects (see 111-12). That possible world would however have to be a possible world that would not only itself be but would be distinct from other possible worlds, including the actual world. It would, that is, be situated within being, and would not be—impossibly—the utter absence of being.

ii. An Additional Philosophical Example

Other works by Peter van Inwagen are among the relatively few by analytic philosophers that recognize that there even might be a significant distinction between what the SSP terms being and be-ers, and hence a need for theories of being. van Inwagen 2008b includes (278) a conversation in which a fictional Alice argues that “being is a feature of everything,” asking, “who could deny that everything there is is?” The conversation leads to “the identification of being with self-identity” (287). The text recognizes as a possible alternative (attributed to Sartre, among unnamed others) that being is “an activity that things engage in, the most general activity that they engage in.” van Inwagen 2009’s treatment of this alternative includes (477) the following (quoted in part in Being and God, 196):

If there is a most general activity that a human be-er (or anything else that engages in activities) engages in—presumably it would be something like ‘living’ or ‘getting older’ [the phenomenon of cryopreservation, introduced above, reveals that this is not the case]—it is simply wrong to call it ‘being’. And it is equally wrong to apply to it any word containing a root related to ‘être’ or ‘esse’ or ‘existere’ or ‘to on’ or ‘einai’ or ‘Sein’ or ‘be’ or ‘am’ or ‘is.’ One cannot, of course, engage in this most general activity (supposing there to be such an activity) unless one is, but this obvious truth is simply a consequence of the fact that one can’t engage in any activity unless one is: if an activity is being engaged in, there has to be something to engage in it.

As Being and God notes, this passage fails to clarify or even to recognize being because it makes no attempt to explain the “is” of “unless one is,” or the “be” of “there has to be something.” According to the SSP, if one actually is, then one be being, and actually to be something is to be being something.

Perhaps also worth noting is that van Inwagen 2009 attempts to show that being is somehow superfluous or avoidable by introducing (478) a fictional Martian language with the following characteristics:

There are in Martian no substantives in any way semantically related to ‘être’ or ‘esse’ or ‘existere’ or ‘to on’ or ‘einai’ or ‘Sein’ or ‘be’ or ‘am’ or ‘is.’ (In particular, Martian lacks the nouns ‘being’ and ‘existence’….) There is, moreover, no such verb in Martian as ‘to exist’ and no adjectives like ‘existent’ or ‘extant’. Finally, the Martians do not even have the phrases ‘there is’ and ‘there are’.

van Inwagen’s Martian language does, however, include the following sentences (478-79, emphases added):

Everything is not a dragon.

It is not the case that everything is not (a) God.

I think, therefore not everything is not I.

It makes me strangely uneasy to contemplate the fact that it might have been the case that everything was not always I.

It makes me strangely uneasy to contemplate the fact that everything is not (identical with) anything.

It is a great mystery why it is not the case that everything is not (identical with) anything.

As the italicizations clearly show, each of these sentences includes a form of the verb “to be.” Being is thus neither superfluous nor avoided in Martian, and it would be open to Martian philosophers to introduce counterparts to “being,” “be-er,” “It be being,” and “It be being such that” into their philosophical languages.

iii. An Example from Physics

A Universe from Nothing (Krauss 2012) presents itself (xiii) as responding to the question “Why is there something rather than nothing?” That it exhibits neglectfulness of being is evident from its assertion (xiv) that “‘nothing’ is every bit as physical as ‘something,’ especially if it is to be defined as the ‘absence of something.’” Any “nothing” that is physical is not, obviously, utter non-being. Nevertheless, additional details are worth noting.

According to Krauss 2012 (xvii), “perhaps the most surprising discovery in physics in the past century … has produced remarkable new support for the idea that our universe arose from precisely nothing.” The text later (58) clarifies “precisely nothing” as follows: “By nothing, I do not mean nothing, but rather nothing—in this case, the nothingness of what we normally call empty space.” Yet later (98), this “precisely nothing” is supplemented by several other factors, and becomes “essentially nothing”: “if inflation indeed is responsible for all the small fluctuations in the density of matter and radiation that would later result in the gravitational collapse of matter into galaxies and stars and planets and people, then it can be truly said that we are all here today because of quantum fluctuations in what is essentially nothing.” This passage clearly presupposes that matter, radiation, and quantum fluctuations be. Moreover, what is first described as empty space is later (104) said to be endowed with energy. As so endowed, it is “Nothing,” and (152) it “can effectively create everything we see, along with an unbelievably large and flat universe.” And yet,

it would be disingenuous to suggest that empty space, which drives inflation, is really nothing. In this picture one must assume that space exists and can store energy, and one uses the laws of physics like general relativity to calculate the consequences. So if we stopped here, one might be justified in claiming that modern science is a long way from really addressing how to get something from nothing. This is just the first step, however. As we expand our understanding, we will next see that inflation can represent simply the tip of a cosmic iceberg of nothingness.

So: a universe “created” by empty space endowed with energy is not a universe from nothing, despite earlier contentions to the contrary, and nothingness as a whole is “a cosmic iceberg.”

Krauss 2012’s cosmic-iceberg sense of nothing/nothingness is (170) “the absence of space and time” but the presence of quantum gravity, and although the text asserts at the outset (xiv) that all of its uses of “nothing” will be “scientific,” the following (174) passage indicates that rather than being required by any scientific theory, these uses are what “work” for the text’s author:

When I have thus far described how something almost always can come from “nothing,” I have focused on either the creation of something from preexisting empty space or the creation of empty space from no space at all. Both initial conditions work for me when I think of the “absence of being” and therefore are possible candidates for nothingness.

The continuation of this passage indicates that neither of these candidates adequately explains the universe as originating from nothing: “I have not addressed directly, however, the issues of what might have existed, if anything, before such creation, what laws governed the creation, or, put more generally, I have not discussed what some may view as the question of First Cause.”

Krauss’ suggested answer to this question is the multiverse (175), although Krauss 2012 nowhere asserts that the multiverse is nothing. Instead, it says (177) that “In a multiverse of any of the types that have been discussed, there could be an infinite number of regions, potentially infinitely big or infinitesimally small, in which there is simply ‘nothing,’ and there could be regions where there is ‘something.’” The empty regions would, of course, be regions. Moreover, Krauss includes the just-quoted contention about regions in which there is “simply ‘nothing’” despite having acknowledged (176) that “we don’t currently have a fundamental theory that explains the detailed character of the landscape of a multiverse … (… we generally assume that certain properties, like quantum mechanics, permeate all possibilities…).” In order to permeate all possibilities, the “property” quantum mechanics must of course somehow be.

So, Krauss 2012 in fact does not argue that the universe is created from nothing, even if “create” and “nothing” are understood in the idiosyncratic ways in which the book explains them. Each of its senses of “nothing” is an absence of be-ers of some kinds or other; yet each presupposes being.

6. Being and Existing

“Being” and “existing,” and “to be” and “to exist,” are synonymous within some philosophical frameworks. As indicated above, in the SSP they are not. In the SSP, existence is the mode of being only of factings within the contingently actual dimension of being (this term is fully explained below). Thus, in the SSP’s terminology, merely possible worlds and the entities within them are (or be), but do not exist.

Some works by Peter van Inwagen follow works of Quine in equating being and existing. The difficulties that ensue, particularly in van Inwagen 2008b, are instructive. The following passage (283) provides a fruitful starting point:

if one says of some woman that she doesn’t exist, one has to be wrong. If the woman in question is “there” to have something said about her, then she exists.

What, one might wonder, if the woman is “there” in a work of fiction? Of Sherlock Holmes, the text asserts (295) the following:

There does exist such a fictional character as Sherlock Holmes. He is as much a part of the World as is any of the short stories and novels in which he “occurs.”

This is problematic at best, because whereas one can buy copies of stories and novels wherein Sherlock Holmes is a character, one cannot acquire the services of Sherlock Holmes; this is an enormous ontological difference. Moreover, van Inwagen 2008b also asserts (111) the following:

Words like ‘dragon’ and ‘unicorn’ are not names for kinds of non-existent things. Rather, they are not names for anything of any sort, for there are no dragons for them to name.

This introduces an inconsistency: if Sherlock Holmes is “as much a part of the World as is any of the short stories and novels in which he ‘occurs,’” then the dragon Smaug is as much a part of the World as is J.R.R. Tolkien’s novel The Hobbit. This inconsistency might plausibly result from a failure to adequately revise, given that the passage about Sherlock Holmes appears in the Coda found only in the third edition of van Inwagen’s Metaphysics, whereas the passage about dragons also appears in the earlier editions. But a comparable inconsistency emerges within the Coda itself. That text denies (296) that “the maps that accompany copies of The Lord of the Rings must be maps of something,” but again, if Sherlock Holmes is a part of what van Inwagen 2008b calls the World because he appears in short stories and novels, then Middle Earth is a part of the World because it appears in novels, and Middle Earth is precisely what the maps accompanying copies of The Lord of the Rings are maps of. One might also ask the following: how could the maps in The Lord of the Rings be maps—rather than mere drawings—if they were not maps of anything?

Distinguishing between being and existing facilitates avoidance of problems of the sorts just identified in van Inwagen 2008b. According to the SSP, Sherlock Holmes, Smaug, and Middle Earth do not exist, but each is (or be), within the non-actual world within which it appears in fictional accounts.

7. Dimensions of Being

Everywhere, there be being, because all be-ers be, or engage in being. But qualification is necessary, because only actual be-ers be; possible but non-actual be-ers do not. This point may also be put as follows: every actual be-er is actively being, is engaged in being. In terms closer to Aristotle’s, to be an actual be-er is to be at work being that be-er. For organisms, as indicated above, to die is to cease to be at work being organisms.

Because possible but non-actual be-ers are not at work being themselves, their mode of being is derivative (see Structure and Being 463, 471). There is, then, no be-er at work being Sherlock Holmes. Sherlock Holmes’s being is derivative in the first instance from the being-at-work of Arthur Conan Doyle, and in additional instances from the being-at-work of those who read or recall Conan Doyle’s novels and stories, those who present versions of Sherlock Holmes in films, other works of literature, and so forth, and those who assimilate or recall such versions. A volume of Holmes stories on a library shelf is at work preserving those stories, and it retains the capacity to present them; that capacity is activated when anyone reads the stories.

These modalities of being, contingently actual being (for example, of the volume of Holmes stories) and contingently non-actual being (for example, of Holmes), require both explanation and supplementation. According to the SSP, there are three distinct modalities of being. Most broadly, there are the absolutely necessary dimension of being and the contingent dimension of being, which can also be termed the dimension of contingent be-ers. The contingent dimension of being includes the dimension of contingently actual be-ers and the dimension of contingently non-actual be-ers. In some other philosophical frameworks, the SSP’s dimension of contingently actual be-ers is termed the actual world, and its dimension of contingently non-actual be-ers the realm of merely possible worlds; for convenience, this article occasionally uses this terminology.

(To clarify: throughout the discussion that follows, the reader may substitute “propositioning” for “proposition.” The former is a technical term in the SSP’s philosophical language, while the latter is taken from ordinary English. The same holds for “sentencing” and “sentence” and “facting” and “fact.”)

According to the SSP, because modalities qualify or determine true propositions expressible by true sentences, and because true propositions expressible by true sentences are identical to actual facts, modalities qualify actual facts. They are, therefore, being’s own modalities. These modalities of being can be made explicit by means of a number of sentence operators, all of which articulate modalities of being. These operators include the following (with examples of arguments included).

    1. It is absolutely necessarily the case that it’s being.

This is considered below.

    2. It is contingently actually the case that there are parents.

There currently are parents, but there is no necessity that there be parents: there were no parents shortly after the Big Bang, and the time may come when there are no longer any parents.

    3. It is conditionally necessarily the case that every parent has at least one child.

Because it is only contingently actually the case that there are parents, it is not absolutely necessarily the case that every parent has at least one child. Because, however, to be a parent is to have at least one child, the modality of the relationship is nevertheless one of necessity. Differently put: if it is contingently actually the case that there are parents, then it is necessarily the case that every parent has at least one child.

    4. It is contingently non-actually the case that Sherlock Holmes is a detective (or: It is the case in the contingently non-actual worlds presented in various stories, novels, and films that Sherlock Holmes is a detective).
    5. It is necessarily not the case that Fred drew a round square.
    6. It is necessarily not the case that there is nothing (or: that nothing is).

Concerning the dimensions of being, the most important of these sentences are (1) and (6). Both Structure and Being and Being and God include arguments from the truth of versions of (6) to the truth of versions of (1). A variant is the following: By definition, it is possible for any contingent be-er not to be: for each contingent be-er, it is possible that it be, and possible that it not be, so its non-being is possible. Similarly, if contingent being were exhaustive of being—if all being were contingent being—then it would be possible that being not be. But being’s not being would be possible only if it were possible for non-being to be, and that is not possible. Therefore, being is not exhausted by—is not exhaustively—contingent being, and so must include necessary being as well.
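
The argument just given has the shape of a modus tollens, and a schematic rendering may make its structure easier to survey. The following is a minimal sketch under stated assumptions, not the SSP’s own formalism: the letters A and N and the use of the standard possibility operator are introduced here purely for illustration, and the SSP’s modal operators are not claimed to coincide with those of any standard modal logic.

```latex
% Assumed abbreviations (introduced for illustration; not from the SSP's texts):
%   A : all being is contingent being
%   N : there is nothing (being is not)
% Requires the amsmath (align*) and amssymb (\Diamond) packages.
\begin{align*}
\text{(P1)}\quad & A \rightarrow \Diamond N
  && \text{if all being were contingent, being's not being would be possible} \\
\text{(P2)}\quad & \neg \Diamond N
  && \text{it is necessarily not the case that there is nothing} \\
\text{(C)}\quad  & \neg A
  && \text{from (P1) and (P2) by modus tollens}
\end{align*}
```

On this reading, the step from (C) to the inclusion of necessary being relies on the further assumption, implicit in the text, that whatever being is not contingent is necessary.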

Differently put: it would be possible for all being to be contingent being only if either “It be being such that it be absolutely non-being” or “It be being such that it be absolute-nothinging” expressed a proposition, because if either of these sentences expressed a proposition, that proposition would be identical to a fact at least in some possible world, and possibly (at some point) in the actual world. But these sentences, like the sentence “Fred drew a round square,” do not express propositions. According to the SSP, they express pseudo-propositions, and pseudo-propositions are not identical to facts in any world, actual or possible. Such sentences are therefore necessarily false. As indicated in Structure and Being (239n48), the sentence “Fred drew a round square” can be analyzed into the sentences “What Fred drew was round” and “What Fred drew was a square.” Each of these sentences expresses a proposition, but the conjunction “What Fred drew was round and was a square,” although grammatically correct, does not.

The sentence “It be being such that it be absolutely non-being” is similar, but somewhat more complicated. Its status is clarified by consideration of the more ordinary-sounding “There is nothing,” understood as expressing the pseudo-proposition There is absolute nothingness or There is absolute nonbeing. What makes these items pseudo-propositions is the fact that sentences of the form “There is such-and-such” express propositions only if the such-and-such somehow is. Any such-and-such, however, that in any way is, is not absolute nothingness, not absolute non-being. But if it is not even possibly the case that there be nothing, then it is absolutely necessarily the case that there be being.

Because there are contingent be-ers, and hence a contingent dimension of being or dimension of contingent be-ers, being is two-dimensional, including both the contingent dimension of being and the absolutely necessary dimension of being. And because it be possible that the entire contingent dimension of being not be, the primacy of being is, more specifically, the primacy of the absolutely necessary dimension of being.

The line of thought developed in the preceding paragraph can be put more technically as follows. The theoretical operator formulable as “It be being the case that,” which (as explained above) implicitly or explicitly governs every indicative sentence that expresses a proposition, and each of its modal variants (“It be being absolutely necessarily the case that,” “It be being contingently actually the case that,” and “It be being contingently not actually the case that”) situates its arguments within being. All propositions or propositionings are arguments of such operators, hence so too are the sentences or sentencings expressing them. Pseudo-propositions, however, are not arguments of these operators, but sentences and sentencings can express pseudo-propositions; those that do are necessarily false. The sentencing “It be being such that it be absolute-nothinging” expresses a pseudo-proposition because the “It be being such that” applicable (in one of its forms) to every sentence and sentencing expressing a proposition situates that proposition within being, and absolute-nothinging can in no way be, hence cannot be situated within being.

Perhaps worth noting at least in passing is that the word “nothing” is non-problematically and indeed often helpfully included within everyday theoretical frameworks relying on ordinary English. For example, “Nothing” can answer the question “What are you doing?”, but this would not mean that the respondent was not breathing, metabolizing, holding their body in some position or other, and so forth, but would instead mean that the respondent was not doing anything that would prevent them from doing something with the questioner. As another example, “There’s nothing in the refrigerator” would not mean that the refrigerator contained no shelves, air, and so forth, but instead that the refrigerator contained nothing that the utterer of the sentence wanted to eat or to drink.

A final word on this topic may be in order. Such sentences as “Nothing might exist” and “There might someday be nothing” are, of course, grammatically non-problematic. From that it does not follow that they are semantically non-problematic. Again, the same holds for “Fred drew a round square.”

8. Being and God

This section is far shorter than the book with which it shares its title, so a reasonable beginning for it is an explanation of the major differences between the two accounts. A thesis central to both accounts is put as follows in Being and God (1): “Any conception of ‘God’ that is not situated within an explicitly presented or implicitly presupposed theory of being as such and as a whole—and hence, obviously, any such conception presented in conjunction with the rejection of such theories—can only be a conception of something or other, an X, that putatively does or does not ‘exist’ beyond the world familiar to us and somehow separately from it, but that cannot ultimately be made either intelligible or reasonable.” Chapter 1 of Being and God criticizes various historical and contemporary approaches to the issue of God as inadequate because they are not situated within theories of being as such and as a whole; this article includes no such critiques. Chapter 2 of Being and God turns to Heidegger, at the heart of whose thought is the question of being, and it argues at length that Heidegger utterly fails to respond to that question in a philosophically defensible manner; this article does not repeat that critique. Chapter 3 of Being and God develops the SSP’s theory of absolute being to the point at which coherence and intelligibility are increased by the introduction of the term “God”; this article presents a version of this theory (with minor alterations, and in its different terminology). Chapter 4 of Being and God, finally, criticizes Emmanuel Levinas and Jean-Luc Marion, the most important and influential of those thinkers who attempt, in the language of the central thesis introduced above, to produce conceptions of “God” in conjunction with rejections of theories of being. This article does not consider either Levinas or Marion.

As indicated in the preceding paragraph, the most important way the SSP’s treatment of the issue of God diverges from other treatments of that topic is by situating it within a theory of being as such and as a whole. A second divergence is also worth noting at this point. In contemporary philosophy, the issue “God” is generally treated within what is called the philosophy of religion. According to the SSP, this begs various questions and introduces various unnecessary complications. As is clear from (for example) Plato’s Euthyphro and Aristotle’s Metaphysics, the issue of God (or gods) is, no matter what else it may be, one that can be treated purely theoretically. That is how the SSP treats it. Consequently, the question addressed in this section is the following: does the inclusion of a facting appropriately designated as God increase the coherence and intelligibility of the SSP?

a. The Relation between the Contingent Dimension of Being and the Absolutely Necessary Dimension of Being

Given the preceding clarifications of the modalities of being and the status of absolute nothingness, the SSP’s alternative to the famous but (for reasons just given) incoherent question “Why is there something rather than nothing?” is easily formulated and explained. The SSP’s question is the following: How is the inclusion within being of a contingently actual dimension best explained? There are in principle only three paths for exploration, and two of those paths are merely apparent. The first merely apparent path does not move beyond the contingently actual dimension of being, and thus leads (if it can even be said to lead) only to such superficial responses as “Well, there just are contingent be-ers.” The “path” that “leads” to such responses is merely apparent because no such response provides an explanation. The second merely apparent path would lead to the contingently non-actual dimension of being. That is indeed a distinct dimension of being, but it is one that, as non-actual, has no resources that could explain the inclusion within being of a contingently actual dimension and that, as derivative, cannot in any way be the source of any dimension from which it derives. The exclusion of these two merely apparent paths leaves open only the path to the absolutely necessary dimension of being. Because this is the only path, the questions to be asked are the following: how is that path followed, and where does it lead?

The first step along this path consists in determining the relation of the contingent dimension of being to the absolutely necessary dimension of being. According to Structure and Being (454-5, 458), Being and God (234-5), and TAPTOE (171), that relation is one of total dependence. Why? First, to say that the contingently actual dimension of being is independent of the absolutely necessary dimension of being would be to take the first of the two merely apparent paths rejected in the preceding paragraph. What then if the contingently actual dimension of being were said to be partially dependent on the absolutely necessary dimension of being? Such partial dependence is perhaps posited by some accounts of a deus absconditus, according to which God (or, one might say, the absolutely necessary dimension of being) brought the contingently actual dimension of being into being, and then severed relations with it. The problem is that no such account could explain the continuation in being of the contingently actual dimension of being; none, that is, could explain why that dimension of being does not cease to be. The thesis that the contingently actual dimension of being is totally dependent on the absolutely necessary dimension of being, however, does explain the continuing being of the contingently actual dimension of being: it is sustained in being by the absolutely necessary dimension of being.

The point made in the preceding paragraph can also be put as follows: being veridically manifests itself, according to the SSP, such that it includes both an absolutely necessary dimension and a contingently actual dimension, and such that the latter dimension is totally dependent, for its initial and continuing being, on the former dimension. Challenges to these theses could be only of two sorts. First, it could in principle be argued that the SSP’s theoretical framework would be concretized with greater intelligibility and coherence if one or both of these theses were rejected or altered. Arguments given above in this section at least weigh heavily against any such course of argumentation, and perhaps even show that no such course of argumentation could be viable. Second, an alternative theory of being, lacking any version of the theses introduced at the beginning of this paragraph, could develop within an alternative theoretical framework. Were this to happen, that framework could be evaluated at a meta-systematic level of the SSP. In the absence of such an alternative theory, objections to the SSP’s theory along the lines of “Well, even if it’s the best explanation you can come up with, it might not be true” are vacuous. The SSP’s explanation is true, within its theoretical framework, and as true, it articulates factings that are constituents of being.

The next question is, does the total dependence of the contingently actual dimension of being on the absolutely necessary dimension of being make possible the further explication of the absolutely necessary dimension of being? Important to addressing this question is noting the inclusion within the contingently actual dimension of being of human be-ers as be-ers who are, both as thinking and as freely willing, intentionally coextensive with being as such and as a whole, and hence with the absolutely necessary dimension of being. The total dependence of such be-ers on the absolutely necessary dimension of being is however intelligible only if the absolutely necessary dimension of being likewise thinks and freely wills and is thereby intentionally coextensive with being as such and as a whole. Otherwise, what is intelligible to human be-ers would not be intelligible to the absolutely necessary dimension of being. The total dependence of human be-ers, in their being, cannot be explained as a relation to a dimension that is in no way cognizant of them or is in any way inferior to them.

The previous paragraph argues that a non-minded absolutely necessary dimension of being is not intelligible as that upon which the contingently actual dimension of being is totally dependent. What, then, of a minded absolutely necessary dimension of being? Such a dimension would not only be cognizant of the contingently actual dimension of being, but would also, as freely willing, be intelligible as that upon which the contingently actual dimension of being would be fully dependent: that there is within being a contingently actual dimension is explained by the free willing, by the absolutely necessary dimension of being, that it be.

In part because the contingently actual dimension of being includes human be-ers who make free decisions, the total dependence of that dimension on the free willing of the absolutely necessary dimension of being cannot be one of being determined in all respects. Instead, according to the SSP, what is freely willed by the absolutely necessary dimension of being is the being, as a whole, of the contingently actual dimension of being. This explains the inclusion within the dimension of being as a whole of the contingently actual dimension of being. Specific phenomena within the contingently actual dimension of being, on the other hand, are at least in the overwhelming majority of cases explained by other phenomena within that dimension.

At this point, the following question might be raised: even granting that the only way the inclusion within being of a contingently actual dimension can be explained is by its being freely willed by the absolutely necessary dimension of being, might this explanation nonetheless be false? The first thing to be said in response to this possible objection is that within the theoretical framework of the SSP, the explanation emerges as true. Because it does, it is the case that this is one of the ways in which being veridically manifests itself according to the SSP. The thesis that the inclusion within being of a contingently actual dimension is unintelligible and hence inexplicable cannot be situated within the SSP’s theoretical framework given the centrality, to that framework, of the thesis that being is universally intelligible. This of course does not rule out the possibility of theoretical frameworks within which some such thesis could be included, but if some such framework were to be developed and presented, then it could be assessed in comparison with that of the SSP. Only if it proved superior would the SSP give way to it.

b. God

Once the absolutely necessary dimension of being has been determined to have freely willed the being of the contingently actual dimension of being, and it has been determined, as for example in TAPTOE 5.2, that for be-ers within the contingently actual dimension of being it is good to be, it is appropriate to designate the absolutely necessary dimension of being as God.

To further explain this designation of the absolutely necessary dimension of being as God, it is helpful to introduce the principle of rank within being. This principle is the following:

(PRWB) No facting can arise exclusively from or be explained exclusively by any facting of a lower rank within being.

The rank within being of a given facting is determined by the extent of its sphere of influence, the latter understood as including both what the facting can influence, and what can influence the facting. Given this criterion, rocks have a relatively low rank within being, because (for example) they cannot be influenced by threats from animals or from human be-ers. Because of the ways they interact with other animals and with human be-ers, animals have considerably higher ranks within being than do rocks, but because they cannot be influenced by such things as arguments, they rank well below human be-ers. The sphere of influence of human be-ers has no limits, in that—given that human be-ers are intentionally coextensive with being as such and as a whole—humans can in principle be influenced by any constituent of being, precisely by thinking about it.

From the PRWB and the total dependence of the contingently actual dimension of being on the absolutely necessary dimension of being, it follows that the absolutely necessary dimension of being must be intentionally coextensive with being as such and as a whole, and must be free, because if it were not, then it would be of a lower rank within being than the human be-er.
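
The inference in the preceding paragraph can be laid out schematically. This is an illustrative sketch only, not a formalization found in the SSP’s texts: the rank function r, the constants n (the absolutely necessary dimension of being) and h (a human be-er), the relation Dep (read here as “arises exclusively from or is explained exclusively by,” which is how total dependence is being interpreted), and the predicate F (“is intentionally coextensive with being as such and as a whole, and is free”) are all notational assumptions, and treating ranks as linearly comparable is a simplification.

```latex
% All symbols are assumptions introduced for illustration.
% Requires the amsmath package (align*).
\begin{align*}
\text{(1)}\quad & \forall x\,\forall y\,\bigl(\mathrm{Dep}(x,y) \rightarrow r(y) \geq r(x)\bigr)
  && \text{PRWB} \\
\text{(2)}\quad & \mathrm{Dep}(h,n)
  && \text{total dependence, applied to human be-ers} \\
\text{(3)}\quad & r(n) \geq r(h)
  && \text{from (1) and (2)} \\
\text{(4)}\quad & \neg F(n) \rightarrow r(n) < r(h)
  && \text{a non-minded or unfree } n \text{ would rank below } h \\
\text{(5)}\quad & F(n)
  && \text{from (3) and (4)}
\end{align*}
```

Step (4) carries the substantive claim of the paragraph; steps (3) and (5) are routine once (1), (2), and (4) are granted.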

Once the absolutely necessary dimension of being has been determined to be absolutely freely sustaining the being of the contingently actual dimension of being and to be appropriately designated as God, two additional lines of inquiry open. Following the first would involve confronting the many problems that arise following the introduction of God into the SSP; prominent among these is the problem of evil.

The second line of inquiry would require the crossing of a methodological watershed. The reason is that additional determination of the absolutely necessary dimension of being, or of God, may become possible through investigation of the contingently actual dimension of being as wholly dependent on the freedom of God. The question is, does the course of history provide evidence of God’s self-revelation within it such that the interpretive examination of history will make possible further determination of God (possibly as trinity, and possibly as having been incarnate)? Both Structure and Being (459-60) and Being and God (3.7.4.1) identify this interpretive examination of history, which could include interpretive examination of such historical texts as the Bible, as a task for the SSP, but neither pursues this task. Neither does TAPTOE, and neither does this article.

c. The Principle of Rank within Being and Evolution

Biology, relying on its specific theoretical framework, treats specific empirical questions with specific concepts, assumptions, procedures, and so forth. Essential is that it establishes that there has been development within the domain of animals and that among the many stages of this development there are similarities and dissimilarities. From this it concludes that there are specific connections among these stages. Finally, it interprets these connections as constituting a history of descent (particularly: human be-ers descended from some ancestor of the currently extant apes). All of this is correct if it is governed by the qualifier “according to the theoretical framework of biology.” What that means is, among other things, the following: within that framework, only certain questions are addressed; other questions have no place therein. Among these are the following: How is it possible that such an ascending development can have taken place? How is this ascending development ultimately to be explained, particularly given that within it there are be-ers with enormously different ranks within being?

The first and most central thesis that emerges in the SSP’s response to these questions is the following: If a development to higher ranks within being has taken place, then it was possible for it to have taken place. How is this possibility to be explained? First, this possibility was always a genuine ontological factor included among the be-ers within the contingently actual dimension of being, where evolution occurs. Already in the earliest and lowest (the purely physical) stages of the cosmos, the possibility for developments to all possible forms and stages, including that for the development of ontologically higher forms, is contained as an immanent factor in the be-ers found at those stages. If this were not the case, then it would be a miracle that these more highly-ranked entities developed as they in fact developed. But how is the immanent ontological status of this possibility of development to be clarified?

The SSP clarifies it as follows: First, comparison of any evolutionarily pre-human organism, with its sphere of influence, with any normal adult human be-er, with its sphere of influence, indeed reveals that the human’s sphere of influence is greater, and thereby that the human be-er is of a higher ontological rank (that is, of a higher rank among be-ers). But human be-ers, prior to their emergence in the course of evolution, are not simply absent from the contingently actual dimension of being; they are instead ontologically included within this dimension of being as possibilities, in that if and when the requisite complex configuration of non-human factings emerges, that configuration will be a human be-er. The emergence of human be-ers in the course of evolution is thus nothing like a teleportation from the contingently non-actual dimension of being (or from some merely possible world) into the contingently actual dimension of being. Instead, prior to the emergence of human be-ers in the course of evolution, there be non-human be-ers that have the capacity, in conjunction, to reconfigure themselves such that they cease to be when be-ers of higher ontological ranks, and eventually human be-ers, come to be (this is wholly comparable to the reconfiguration of sperm and egg cells considered above). The span of time, whatever its extent, that precedes the emergence of human be-ers within the contingently actual dimension of being is thus a gestation period for human be-ers. The same holds for organisms of all other kinds.

d. The SSP and Christianity

According to Structure and Being (332), “within the philosophical perspective developed here, Christianity is the incomparably superior religion.” The SSP includes this thesis because Christianity satisfies the following explicitly identified criterion (443): “only Christianity has developed a genuine theology: one that satisfies the highest demands and challenges of theoreticity.” The Christian religion thus provides the theoretician working within the framework of the SSP with a potentially valuable starting point in that Christian theology provides the theoretician with data potentially incorporable into the SSP’s theory of God. That no other religion provides such data is an empirical thesis. If it were shown to be false, or if it were to become false in the future (that is, if a genuinely theoretical theology linked to any other religion were developed, identified, or discovered), then that theology, too, would provide data potentially incorporable into the SSP, and Christianity would, according to the SSP, cease to be the incomparably superior religion.

In part because Christian theology provides data for potential incorporation into the SSP, Being and God envisages, as the first central question to be addressed as the SSP seeks to further develop its theory of God by examining the history of the contingently actual dimension of being, the question of the degree to which God as articulated by that theory can be identified as the adequately articulated biblical-Christian God (see 252-3). It also, however, explicitly recognizes (271-2) the possibility that that degree would be insignificant. In addition, theoreticians working to further develop the SSP’s theory of God could focus on religions other than Christianity. Whether historical investigation will make possible further determination of God as articulated by the SSP and, if it does, how closely God, as further determined within the SSP, will resemble the God of any religion, are at this point open questions.

A final remark is in order. It concerns the relation between engaging in philosophy and being of religious faith. The philosopher who as a philosopher engages in theorization about God may or may not also be of religious faith, Christian or otherwise, and the Christian or person of other religious faith may or may not engage in philosophy. The philosopher who is not of religious faith may or may not be led by theoretical engagement with the issue of God to become of religious faith, Christian or otherwise, and the philosopher who is a Christian or of other religious faith may or may not be led by their theoretical engagement to alter or abandon that faith.

 

9. References and Further Reading

  • Gilson, Étienne. (1948) L’Être et L’Essence. Paris: Librairie Philosophique J. Vrin.
  • Gilson, Étienne. (1952) Being and Some Philosophers (2nd edition). Toronto: Pontifical Institute of Mediaeval Studies.
  • Gilson, Étienne. (2002) Thomism. The Philosophy of Thomas Aquinas. A Translation of Le Thomisme (6th and final edition)
  • Krauss, Lawrence. (2012) A Universe from Nothing. Why There Is Something Rather Than Nothing. New York: Free Press.
  • Lowe, E. J. (1996), “Why is there anything at all?” Proceedings of the Aristotelian Society. Supplementary Volumes. Volume 70: 111-29.
  • Puntel, Lorenz B. (2008), Structure and Being. A Theoretical Framework for a Systematic Philosophy. Translated by and in collaboration with Alan White. University Park, PA: Penn State University Press, 2008.
  • Puntel, Lorenz B. (2011), Being and God. A Systematic Approach in Confrontation with Martin Heidegger, Emmanuel Levinas, and Jean-Luc Marion. Translated by and in collaboration with Alan White. Evanston, IL: Northwestern University Press, 2011.
  • van Inwagen, Peter. (1996), “Why is there anything at all?” Proceedings of the Aristotelian Society. Supplementary Volumes. Volume 70: 95-110.
  • van Inwagen, Peter. (2008b) Metaphysics (3rd edition). Boulder, CO: Westview Press.
  • van Inwagen, Peter. (2009) “Being, Existence, and Ontological Commitment.” In Metametaphysics: New Essays on the Foundations of Ontology, edited by David J. Chalmers, David Manley, and Ryan Wasserman. Oxford: Clarendon, 2009, 472-506.
  • White, Alan. (2014) Toward a Philosophical Theory of Everything. Bloomsbury Press.
  • White, Alan. (2015) “Rearticulating Being.” Review of Metaphysics. Volume 69 no. 1: 3-24.

 

Author Information

Alan White
Email: awhite@williams.edu
Williams College
U. S. A.

Franz Brentano (1838-1917)

Franz Brentano was a major philosopher of the second half of the 19th century who had a strong impact on the development of early phenomenology and analytic philosophy of mind. Brentano’s influence on students such as Carl Stumpf and Edmund Husserl was extensive, but Sigmund Freud was also much inspired by Brentano’s teaching and personality. Along with Bernard Bolzano, Brentano is acknowledged today as the co-founder of the Austrian tradition of philosophy.

Two of his theses have been the focus of important debates in 20th century philosophy: the thesis of the intentional nature of mental phenomena, and the thesis that all mental phenomena have a self-directed structure which makes them objects of inner perception. The first thesis has been taken up by proponents of the representational theory of mind, while the second thesis continues to inspire philosophers who advocate a self-representational theory of consciousness.

Brentano’s interests, however, were not limited to the philosophy of mind. His ambition was greater: to make the study of mental phenomena the basis for renewing philosophy altogether. This renewal would encompass all philosophical disciplines, but especially logic, ethics, and aesthetics. Moreover, Brentano was a committed metaphysician, much in contrast to Kant’s transcendental idealism and its further developments in German philosophy. Brentano advocated a scientific method to rival Kantianism that combined Aristotelian ideas with Cartesian rationalism and English empiricism. He was a firm believer in philosophical progress backed up by a theistic worldview.

Table of Contents

  1. Biography
    1. Life
    2. Works
  2. Philosophy of Mind
    1. Philosophy and Psychology
    2. Inner Perception
    3. Intentionality
    4. Descriptive Psychology
  3. The Triad of Truth, Goodness and Beauty
    1. A Philosophical System
    2. Judgement and Truth
    3. Interest and the Good
    4. Presentation and Beauty
  4. Epistemology and Metaphysics
    1. Kinds of Knowledge
    2. A World of Things
    3. Substance and Accidents
    4. Dualism, Immortality, God
  5. History of Philosophy and Metaphilosophy
    1. How to do History of Philosophy
    2. Aristotle’s Worldview
    3. Positivism and the Renewal of Philosophy
    4. Philosophical Optimism
  6. References and Further Reading
    1. Monographs Published by Brentano
    2. Other Philosophical Works Published by Brentano
    3. Selected Works Published Posthumously from Brentano’s Nachlass
    4. Secondary Sources

1. Biography

Franz Brentano was born into a distinguished German family of Italian descent whose influence on Germany’s cultural and academic life was considerable. His uncle Clemens Brentano (1778-1842) and his aunt Bettina von Arnim (1785-1859) were major figures of German Romanticism, and his brother Lujo Brentano (1844-1931) became an eminent German economist and social reformer. The Brentano brothers described their family as “zealously Catholic” (Franz Brentano) and “highly conservative” (Lujo Brentano). Brentano’s early association with the Catholic Church complicated his life and affected his academic career in Würzburg and Vienna.

a. Life

Brentano was born on January 16, 1838, in Marienberg on the Rhine. He was one of five children who reached adulthood. He studied philosophy and theology in Munich, Würzburg, Berlin, and Münster. Among his teachers were the philologist Ernst von Lasaulx (1805-1861), the Aristotle scholar Friedrich Trendelenburg (1802-1872), and the Catholic philosopher Franz Jacob Clemens (1815-1862). Under Clemens’s supervision, Brentano first began a dissertation on Francisco Suárez, but then took his doctorate at Tübingen in 1862 with a thesis on the concept of being in Aristotle. He then enrolled in theology and was ordained a priest in 1864. After his habilitation in philosophy in 1866 in Würzburg, Brentano began his teaching career there, first as a Privatdozent, and from 1872 as an Extraordinarius. Among his students in Würzburg were Anton Marty and Carl Stumpf. The fact that he left the University of Würzburg shortly thereafter was due to a falling out with the Catholic Church. In the fight between liberal and conservative groups, which took place both inside and outside the Catholic Church at the time, Brentano wrote in a letter to a Benedictine abbot that he felt himself “caught between a rock and a hard place” (quoted in Binder 2019, p. 79). A key role was played by a document he had written for the bishop of Mainz in which he was critical of the dogma of the infallibility of the Pope proclaimed by the First Vatican Council in 1870.

Despite this rift, Brentano was able to continue his academic career. After he convinced the responsible authorities in Vienna that he was neither anti-clerical nor atheistic, he was appointed to a professorship of philosophy at the University of Vienna in 1874, supported among others by Hermann Lotze (1817-1881). As in Würzburg, Brentano quickly found popularity among the students in Vienna. These included Edmund Husserl, Alexius Meinong, Alois Höfler, Kasimir Twardowski, Thomas Masaryk, Christian Ehrenfels and Sigmund Freud. Privately, Brentano found connections to the society of the Viennese bourgeoisie. He met Ida von Lieben, daughter of a wealthy Jewish family, finally left the Catholic Church, and married her in Leipzig in 1880. This union was followed by a protracted legal dispute in which Brentano tried in vain to regain his professorship, which he had to give up as a married former priest in Austria. In 1894, after the unexpected death of his wife Ida, Brentano was left alone with their six-year-old son Johannes (also called “Giovanni” or “Gio”). One year later, he decided to end his teaching career, now as a Privatdozent, and left Austria first for Switzerland, then for Italy.

Although Brentano was offered several professorships in both countries, the 57-year-old decided to live as a private scholar from then on. In 1896, he acquired Italian citizenship and lived in Florence and Palermo. He continued to spend summers in Austria at his vacation home in Schönbühel on the Danube. In 1897 Brentano married his second wife, the Austrian Emilie Rüprecht. She not only took care of the household and his son, but also increasingly supported Brentano in his scientific work, since his vision had started to deteriorate around 1903. Due to the outbreak of World War I and the entry of Italy into the war, Brentano moved with his family to Zurich in 1915. During his time as a private scholar, he not only kept in touch with a small circle of students who had meanwhile made careers in Germany and Austria, but was also in lively exchange with philosophers and intellectuals in Europe. Inspired by these contacts, Brentano was highly active intellectually until his death on March 17, 1917.

b. Works

Brentano’s philosophical work consists of his published writings and an extensive Nachlass, which includes a large number of lecture notes, manuscripts, and dictations from his later years. The bequest also contains a wealth of correspondence that Brentano exchanged with his former students (notably Anton Marty) and students of his students (notably Oskar Kraus). Other correspondents of Brentano were eminent scientists and philosophers such as Ludwig Boltzmann, Gustav Theodor Fechner, Ernst Mach, John Stuart Mill and Herbert Spencer.

As Brentano was a prolific writer, but reluctant to prepare his works for publication, it was mostly left to his students and later generations of editors to prepare publications from his Nachlass manuscripts. In doing so, the editors often took the liberty of adapting and modifying Brentano’s original text, frequently without marking the changes as such. As a result, only the works published by Brentano himself are a truly reliable source, while the volumes edited after Brentano’s death vary widely in editorial quality (see the warnings in References and Further Reading, section 3).

2. Philosophy of Mind

Brentano believed that psychology should follow the example of physiology and become a full-fledged science, while continuing to play a special role within philosophy. To meet this goal, he conceived of psychology as a standalone discipline with its own subject matter, namely mental phenomena. The basic principles that inform us about these phenomena are emblematic of his conception of the mind and will be discussed in more detail below: his thesis of the intentional nature of mental phenomena; his thesis of inner perception as a secondary consciousness; and his classification of mental acts into presentations, judgements, and phenomena of love and hate.

In later years, Brentano drew an important distinction between descriptive and explanatory (“genetic”) psychology. These sub-disciplines of psychology differ both in their task and the methods they need to accomplish that task. According to Brentano, an analytical method is needed to describe psychological phenomena. We must analyze the experiences, which often only appear indistinct to us in inner perception, in order to precisely determine their characteristics. Genetic psychology, on the other hand, requires methods of explanation; it must be able to explain how the experiences that we perceive come about.

Brentano prepares the ground for that distinction in his seminal work Psychology from an Empirical Standpoint, first published in 1874, and fleshes it out fully in the second edition of the Classification of Mental Phenomena in 1911. The second edition contains several new appendices, but it is still far from completing the book project as Brentano originally conceived it. According to this plan, Brentano wanted to add four more books that give a full treatment of the three main classes of mental phenomena (presentations, judgements, and acts of the will together with emotional phenomena), as well as a treatment of the mind-body problem, which shows that some notion of immortality is compatible with our scientific knowledge of the mind (see section 4d).

While the Psychology from an Empirical Standpoint secured Brentano’s place in the history of philosophy of mind, it is not an isolated piece in Brentano’s oeuvre. It stands somewhat in the middle between his earlier work on The Psychology of Aristotle (1867) and his lectures on Descriptive Psychology (1884-1889). If we add to this Brentano’s lectures on psychology in Würzburg (1871-1873) and the works on sensory psychology in his later years (1892-1907), we see a continuous preoccupation with questions of psychology over more than 40 years.

a. Philosophy and Psychology

Brentano’s interest in the philosophy of mind was driven by the question of how psychology can claim for itself the status of a proper science. Taking his inspiration from Aristotle’s De Anima, Brentano holds that progress in psychology depends on progress in philosophy, but he takes this dependence to go both ways. This means that we can ascribe to Brentano two programmatic ideas:

  • Philosophy helps psychology to clarify its empirical basis as well as to determine its object of research
  • Conversely, psychology contributes in various ways to many areas of philosophy, especially epistemology, logic, ethics, aesthetics, and metaphysics

Implementing the first idea involved Brentano in addressing thorny questions of methodology and classification: What is the proper method for studying mental phenomena? How do mental phenomena differ from non-mental phenomena? Is consciousness a characteristic of the mental? How does consciousness of outer objects differ from what Brentano calls “inner consciousness”? How can we classify mental phenomena? The first two books of the Psychology from an Empirical Standpoint provide ample answers to these questions.

During his years in Vienna (1874-1895), Brentano’s interest shifted more and more towards his second programmatic idea. In this context, Brentano found it necessary to distinguish more sharply between a “descriptive” and a “genetic” psychology. With this distinction in hand, he tried to show what psychology might contribute to various parts of philosophy. While philosophy may be autonomous from genetic psychology, it builds on the resources of descriptive psychology. How this shift towards descriptive psychology gradually took hold of Brentano’s thinking can be seen from the titles of his Vienna lectures: “Psychology” (1874-1880), “Selected questions of psychology and aesthetics” (1883-1886), “Descriptive psychology” and “Psychognosy” (1887-1891).

It was only in later years that Brentano returned to questions of genetic psychology. Texts from the last decade of his life were published posthumously in a volume entitled On Sensory and Noetic Consciousness. Psychology from an Empirical Standpoint III (1928, English translation 1981). The subtitle is misleading because this volume is not a continuation of Brentano’s earlier book. How Brentano planned to continue his Psychology has been succinctly described in Rollinger (2012).

b. Inner Perception

In studying mental phenomena, philosophers often emphasize a crucial distinction between two questions:

    1. How can we obtain knowledge of our own mental acts?
    2. How can we obtain knowledge of mental acts in other subjects?

In dealing with these questions, Brentano rejects the idea that separating the two questions requires postulating an “inner” sense in addition to the outer senses. This old empiricist idea, inherited from Locke, was still popular among philosophers and psychologists of Brentano’s time. As an alternative to this old view, Brentano suggests that we access our own mental phenomena simply by having them, without the need for any extra activity such as introspection or reflection.

To appreciate the large step that Brentano takes here, one must address several much-contested questions about the nature of our self-knowledge (see Soldati 2017). To begin with, how does Brentano distinguish between “inner perception” and what he calls “inner observation”? One way to do so is to consider the role that attention and memory play in accessing our own mental states. Like John Stuart Mill, Brentano argues that attending to one’s current experiences involves the possibility of changing those very experiences. Brentano therefore suggests that inner perception involves no act of attention at all, while inner observation requires attending to past experiences as we remember them (see Brentano 1973, p. 35).

But Brentano goes further than that. According to him, we must not think of inner perception as a separate mental act that accompanies a current experience. Instead, we should think of an experience as a complex mental phenomenon that includes a self-directed perception as a necessary part. Some scholars have taken this view to imply that inner perception is an infallible source of knowledge for Brentano. But this conclusion needs to be drawn with care. Take, for instance, a sailor who mistakenly thinks he sees a piece of land on the horizon. Inner perception is telling him with self-evidence that he is having a visual experience of a piece of land. And yet he is mistaken about what he sees, if he mistakes a cloud, let’s say, for a piece of land. Still, Brentano would say that it was not inner perception that misled the sailor, but rather his interpretation of the visual content as of a piece of land. Such misinterpretations are errors of judgements, or attentional deficits that happen in observing or attending to the content of our experience. In the end, Brentano seems committed to the view that unlike observation, inner perception has no proper content and therefore has nothing it could be wrong about.

How can Brentano defend this commitment? One way to do so would be to appeal to the authority of Aristotle and his view that we see objects while at the same time experiencing our seeing them. While Brentano is always happy to follow Aristotle, he also bolsters his view with new arguments. In the present case he does so by drawing on the ontology of parts and wholes. A common understanding of parts and wholes has it that a whole is constituted by detachable parts, as a heap of corn is constituted by individual grains as detachable parts. But this model does not apply to inner perception, says Brentano. A conscious experience is not constituted like a heap of corn: you can’t detach the person who sees something from the subject who innerly perceives the seeing. This leads Brentano to the key insight that what appears to us in inner perception is nothing other than the entire mental act that presents itself to us.

How then does Brentano explain the immediate knowledge we have of our own mental phenomena? Having removed any appeal to a faculty of inner sense, Brentano ends up with a form of conceptual insight that could be summarized in the following way: Immediate knowledge of our present mental states comes with the insight that we can only conceptually separate what we see, feel, or think from the act of seeing, feeling or thinking. Such immediate knowledge becomes impossible when we consider the life of other people. In this case we are restricted to a form of “indirect knowledge” of their feelings and thoughts by listening to what they say or observing their behavior. This indirectness implies that we may only be certain that the other person feels or thinks something, without knowing what exactly it is that they feel or think. Here we are not just dealing with conceptual differences, but with a real difference between our own experiences and the mental phenomena we discover in the minds of others.

But are we able to “read” the minds of other people? In book II of his Psychology from an Empirical Standpoint, Brentano argues that psychology could not acquire the status of a proper science merely on the basis of the data of inner perception. We need to make sure that our conception of the mind is not biased by our own experiences. The data that help us to guard against such an egocentric bias include (a) biological facts and the behavior of others indicating, for example, that they feel hunger or thirst like we do, (b) the mutual understanding of communicative acts such as gestures or linguistic behavior, as well as (c) the recognition of behavior as voluntary actions performed with certain intentions.

In drawing on these resources, Brentano shows no skepticism towards our social instincts. It is part of our daily practice to infer from the behavior of others whether someone is hungry, whether he is ashamed because he has done something wrong, and so on. For Brentano, there is no reason why a scientific psychology should dispense with such inferences. On the contrary, he acknowledges that a psychology that limits itself exclusively to the knowledge of its own mental phenomena is exposed to the danger of far-reaching self-deceptions. Psychology must face the task of tracking down such deceptions.

Brentano thus provides a thoroughly optimistic picture of the empirical basis of psychology. With inner perception, it can count on the immediate and potentially error-free access that we have to our own experiences, while the fallacies that arise in self-observation can be rectified by relying on a rich repertoire of inferences about other people’s mental states.

c. Intentionality

Psychology is a self-standing science, says Brentano, because it has a specific subject matter, namely mental phenomena. In his Psychology from an Empirical Standpoint (1874), Brentano argues that these phenomena form a single and unified class thanks to a general characteristic that distinguishes them from all other phenomena: their intentional directedness to objects.

This so-called “intentionality thesis” has sparked a wide-ranging debate among Brentano’s pupils. Husserl and Twardowski–just to mention two–came up with different readings of what Brentano describes as the “intentional directedness towards an object.” Their disagreement paved the way for an interpretation (due to Oskar Kraus and later taken up by Roderick Chisholm) that ascribes to Brentano a fundamental change of view about the nature of intentionality. According to this interpretation, we find in the early Brentano a rich ontology of intentional objects that is threatened by inconsistency. Therefore, Brentano later developed a radical critique of non-real objects that forced him to accept a non-relational theory of intentionality. In the meantime, this interpretation has been disputed for various reasons: Some scholars (for example, Barry Smith, Arkadiusz Chrudzimski) tried to resolve the alleged inconsistencies in Brentano’s early ontology, thus removing the need for a radical shift to a non-relational theory of intentionality, while other scholars (e.g. Mauro Antonelli and Werner Sauer) tried to show that Brentano’s later view is not very different from his earlier conception. Still others have discussed whether one finds at the core of Brentano’s theory a relational concept of intentionality that applies to some but not all mental phenomena, thus putting pressure on the thesis that all mental phenomena share the same feature of intentionality (see Brandl, 2023).

These different descriptions of intentionality played an important role in the reception of the concept by Brentano’s students. Following Twardowski, Husserl insisted on the distinction between the content and the object of an intentional act, stressing (against Twardowski) that some acts are intentional and are yet objectless, e.g. my presentation of a golden mountain. Following this line, Husserl developed a non-relational, semantic view of intentionality in his Logical Investigations (1900/1901), according to which intentional acts are acts of “etwas meinen” (meaning something). Going beyond Brentano and Twardowski, Husserl argues that acts of meaning instantiate ideal species that account for the objectivity of meaning. Although Brentano seemed to be aware of the semantic problem of the objectivity of meaning in some of his lectures, this was clearly not a central concern for his understanding of intentionality. Meinong, on the other hand, extends Brentano’s concept of intentionality in the opposite direction to Husserl’s: not only is there no sharp distinction between semantics and ontology in Meinong, but all mental acts, including my presentation of a golden mountain, in his view have an object, which may or may not exist, or may simply belong to the extra-ontological realm (Außersein). In any case, Meinong defends a fully relational view of intentionality.

Given the plurality of developments of the concept within his school and the plurality of descriptions offered by Brentano himself, one lesson to be drawn from this debate is that the scholastic terminology used by Brentano is much less informative than it has been taken to be. Brentano seems to use different terms for object-directedness in the same way that one uses different numerical terms to measure the temperature of a body in Celsius or Fahrenheit. In putting mental phenomena on different scales, metaphorically speaking, he highlights some of their differences and commonalities and describes these as “having the same intentional content” or “being directed at the same object.” Take sense perception, for example. If every mental phenomenon has an object, then both my seeing and my imagining an elephant have an object. While this is true according to one use of the term “object,” it is not true if we want to compare an act of perception with an act of hallucination. Then, the so-called “object” of the sensory experience might be better described as a “content” of experience. It fills the grammatical gap in the expression “I see X,” which the hallucinator might also use.

Another controversy concerns phenomena that do not appear to be intentional at all. For instance, one can simply be in a sad mood without being sad about a particular event. Or one may generally tend to jealousy, without that jealousy being triggered by any particular object. Brentano’s intentionality thesis has been defended on the grounds that all mental states fulfill a representational function. The function of being sad could be, for example, to cast any goal we are striving for in a negative light. This could explain the paralyzing effect of sadness. But there is also the possibility of allowing that some mental phenomena may have objects only in a derived sense: for instance, when a subject perceives herself as being in a certain mood. The mood or character trait in itself might be “undirected,” but it would still be an object of inner consciousness when the subject perceives herself as being sad.

Brentano was not yet concerned with the many further questions raised by a representational theory of the mind. Thus, it is not clear how he would treat activities, processes, and operations that underlie our sensory experiences, feelings or acts of the will. Would they have intentional content for him only if they are part of our consciousness? Or would Brentano regard them as non-intentional “physical” phenomena to be studied by the physiological sciences? Perhaps the best way to relate Brentano to the contemporary debate about such questions is to say that he promotes a kind of qualitative psychology that does not need to invoke unconscious mental processes (see Citlak, 2023).

d. Descriptive Psychology

In the manuscripts starting from 1875, in which Brentano worked on the continuation of his Psychology from an Empirical Standpoint, one finds the first explicit mention of a separation between descriptive and genetic psychology. The basic idea behind this distinction is that a science of psychology cannot get off the ground unless it is able to identify the components of consciousness and their interrelationships. This requires something like a “geography” of mental concepts, which descriptive psychology is supposed to provide.

Although Brentano approaches this issue in the spirit of the empiricism of Locke and Hume, his objectives are somewhat closer to Descartes’, that is, to find the source of conceptual truths that are self-evident. Combining empiricism about concepts with the search for self-evident conceptual truths is the key to understanding the aim of descriptive psychology, which can be summarized as follows: Descriptive psychology aims to determine the elements of consciousness on the basis of inner perception, and thus to show how we arrive at concepts and judgements that we accept as self-evident and absolutely true. How Brentano tries to implement this plan can be seen in some examples that he discusses in his lecture courses. In one example, he asks whether it is possible to make contradictory judgements. Brentano denies that this is possible as soon as we have a clear perception of the fact that the judgements “A exists” and “A does not exist” cannot both be true. In another example, he asks whether one can feel a sensation without attributing content to that experience. Again, Brentano argues that this is impossible, and that we know this once we have a clear grasp of what “feeling a sensation” means. And we get a clear grasp of what “feeling a sensation” means by describing or analyzing the experience of feeling a sensation into its more basic constituents, by providing examples of such cases, by contrasting different cases, and so on.  Doubts about the validity of these results are possible, but they can be treated as temporary. They can only arise while our knowledge of the elements of consciousness and their connections is incomplete.

In summary, we can say that Brentano conceived of descriptive psychology as an epistemological tool. It is aimed at principles which, like the axioms of mathematics, are immediately self-evident or can be traced back, in principle, to immediately self-evident truths. The caveat “in principle” reminds us that we may have good reason to believe that these self-evident truths exist even if we do not yet know what they are.

3. The Triad of Truth, Goodness and Beauty

In the second half of the 19th century the question of the objective character of logic, ethics and aesthetics became a much-contested issue in philosophy. Brentano’s answer to this question is guided by a grand idea: In these disciplines, the theoretical principles of psychology find their application. The domain of application is fixed in each case by one of the three fundamental classes of mental phenomena: logic is the art of making correct judgements, notably judgements inferred from other judgements; ethics deals with attitudes of interest, such as emotional and volitional acts, which direct us to what is good; while aesthetics examines presentations and our pleasure in having them, which makes us acquainted with what is beautiful.

Brentano thus places psychology at the base of what appears to be a philosophical system. The idea of such a system, however, was severely criticized by Husserl, who accused Brentano of psychologism. From the point of view of logic, this accusation weighs heavily since Brentano transforms the laws of logic into laws of correct thinking. But the objection here is a general one. It can also be used to deny that ethics should be based on moral psychology, and that aesthetics should be based on the psychology of imagination. If Brentano has a good answer to Husserl, it must be a general one.

a. A Philosophical System

Brentano was never convinced by Husserl’s claim that phenomenology can only play a foundational role within philosophy if it is freed from its psychological roots. His answer to Husserl’s reproach was a counter-reproach: Whoever speaks of “psychologism” really means subjectivism, which says that knowledge claims may only be valid for a single subject. For instance, an argument may be valid for me, even if others do not share my view. Pointing to his notion of self-evidence, which is not a subjective feeling at all, Brentano sees his position as shielded from this danger. He regards Husserl’s accusation as based on confusion or deliberate misunderstanding.

The jury is still out on this. While most phenomenologists tend to agree with Husserl, others see nothing wrong with the kind of psychologism that Brentano advocates. Some go even further and insist that only by reviving Brentano’s project of grounding mental concepts in experience can we hope to avoid the dead ends in the study of consciousness to which materialism and functionalism lead (see Tim Crane, 2021).

An interpretation of Brentano that also tries to make progress on the issue of psychologism has been proposed by Uriah Kriegel:

In order to understand the true, the good and the beautiful, we must get a clear idea of (i) the distinctive mental acts that aim at them, and (ii) the success of this aim. According to Brentano, the true is that which is right or fitting or appropriate to believe; the good is that which is right/appropriate to love or like or approve of; and the beautiful is that which is right/appropriate to be pleased with (U. Kriegel: The Routledge Handbook of Franz Brentano and the Brentano School. p. 21).

Kriegel’s interpretation is inspired by a tradition, going back to Moses Mendelssohn, of conceiving of truth, goodness, and beauty as closely related concepts. Brentano gives this idea a psychological twist:

It is necessary, then, to interpret this triad of the Beautiful, the True, and the Good, in a somewhat different fashion. In so doing, it will emerge that they are related to three aspects of our mental life: not, however, to knowledge, feeling and will, but to the triad that we have distinguished in the three basic classes of mental phenomena (Brentano 1995, 261).

Brentano scholars must decide how much weight to give to the idea expressed in this quotation. While Kriegel believes that the idea deserves full development, others are more sceptical. They point out that such a system-oriented interpretation goes against the spirit of Brentano’s philosophising. Wolfgang Huemer takes this line when he suggests that “Brentano’s hostility to German system philosophy and his empiricist and positivist approach made him immune to the temptation to construct a system of his own” (Huemer 2021, p. 11).

Another question that arises at this point is how to integrate metaphysics into the system proposed by Kriegel. Brentano hints at a possible answer to this question by adding to the triad “the ideal of ideals,” which consists of “the unity of all truth, goodness and beauty” (Brentano 1995, 262). Ideals are achieved by the correct use of some mental faculties. Whether this also holds for the “ideal of ideals” is unclear. Perhaps Brentano is referring here to a form of wisdom that emerges from our ability to perceive, analyse and describe the facts of our mental life (see Susan Gabriel, 2013).

b. Judgement and Truth

Brentano adopts a psychological approach to logic which stands opposed to the anti-psychologistic consensus in modern logic that takes propositions, sentences, or assertions as the primary bearers of truth. Brentano’s approach starts from the observation that simple judgements can easily be divided into positive and negative ones. For Brentano, a simple judgement is correct if it makes the right choice between two responses. We can either acknowledge a presented object when we judge that it exists, or reject it when we judge that it does not exist. To illustrate this, suppose you have an auditory experience where you are presented suddenly with a gunshot. The sound wakes you up and you have no clue as to whether what you heard was real or just a dream. You are presented with content (the gunshot heard), and now you take a stance on this content, either by accepting it (“yes, it’s a gunshot”) or rejecting it (“no, it’s not a gunshot”). The stance you take will be right or wrong, and the resulting judgement will be true or false.

Now, there are two different ideas here that need to be carefully distinguished. One is the following concept acquisition principle:

We acquire the concept of truth by abstracting it from experiences of correct judgement.

The other is a definition of truth:

A judgement is true if and only if it is correct to acknowledge its object or if it is correct to reject its object.

The idea of defining truth in this way raises the question of how it relates to the classical correspondence theory of truth. The critical issue here is the concept of “object,” and the ontological commitments connected with this term. The common view is that Brentano rejected the idea that facts or states of affairs could be objects standing in a correspondence relation to a judgement. In a lecture given in 1889, entitled “On the Concept of Truth,” Brentano rehearses some of the criticisms that the correspondence theory has received. He draws particular attention to the problem of negative existential judgements such as “there are no dragons.” This judgement seems to be true precisely because nothing in reality corresponds to the term “dragon.” For Brentano, this speaks not only against the introduction of negative facts, such as the fact that there are no dragons, but also against the acceptance in one’s ontology of states of affairs that might obtain if there were dragons, for example, the state of affairs that dragons can sing.

But abandoning the correspondence theory has its price. How can Brentano distinguish between truth on the one hand, and the more demanding notion of correctness on the other? Correct judgements should help us to attain knowledge. If we want to know whether the noise that we heard was a gunshot, we are not engaged in a game of guessing, but rather we are trying to be as reasonable as possible about the probability that it was in fact a real gunshot. What is it that makes our judgement correct, if it is not the correspondence with a real object or fact?

Brentano tries to capture this more demanding notion of correctness with his notion of “self-evidence.” He points out that the judgement of a subject can be true even if it is not self-evident to that subject. This leaves open the possibility that it is evident to another subject whose judgement would then be correct. Suppose your judgement is that John is happy. You may have good reasons for judging so, but the truth of this judgement is not self-evident to you. There is, however, a person, John, who might be able to judge with self-evidence that he is happy right now. To say that your judgement is true means that it agrees with the judgement John would make when asked whether he is happy or not.

But what about judgements like “Jack has won the lottery”? In this case we can ask the company running the lottery whether this assertion is true, but we will find no one in a position to resolve this question in a self-evident judgement. How then can Brentano’s definition be considered a general definition of truth that applies to all judgements?

It is at this point that the distinction between a definition of truth and a principle of concept acquisition becomes crucial. When we ask how we acquire the concept of truth, the slogan “Correctness First!” tells us that we acquire the concept of truth only later: the first step is to recognize that judging with self-evidence means to judge correctly (see Textor 2019). But we must not conclude from this that the slogan also applies when we define the concept of truth. Towards the end of his lecture “On the Concept of Truth” Brentano notices that one can remove the notion of “correspondence” from the classical Aristotelian definition of truth, without making it incomprehensible or false. “A judgement of the form ‘A exists’ is true if and only if A exists; a judgement of the form ‘A does not exist’ is true if A does not exist.” Was Brentano then a pioneer of a minimalist theory of truth? It has been argued that this is at least an interesting alternative to the epistemological interpretation described above (see Brandl, 2017). It explains why Brentano in his lecture on the concept of truth finds it unproblematic that a definition of truth may seem trivial. In doing so, he allows that a definition need not be informative about how we acquire the concept of truth.

c. Interest and the Good

The distinction between defining a concept and explaining how we acquire that concept also plays a role in Brentano’s meta-ethical theory of correctness. A few months before his lecture on the concept of truth, the Viennese Law Society had asked Brentano to present his views on whether there was such a thing as a natural sense of justice. In his lecture “On the Origin of our Knowledge of Right and Wrong” (1889), Brentano gives a positive answer to the question posed by the Society, but he makes it clear that the term “natural sense” can be understood in different ways. For him, it is not an innate ability to see what is just or unjust. Rather, what Brentano is defending is the idea that there are “rules which can be known to be right and binding, in and for themselves, and by virtue of their own nature” (Brentano, 1889, p. 3).

Brentano’s meta-ethical theory of right and wrong can therefore be seen as a close cousin of his theory of truth. It is a highly original theory because it steers a middle course between the empirical sentimentalism of Hume and the a priori rationalism of Kant. Brentano carves out a third option by asking: What are the phenomena of interest (as Brentano calls them) that form the basis of our moral attitudes and decisions? Introducing the notion of “correct love,” he proposes the following principle of concept acquisition: We acquire the concept of the good by abstracting it from instances of correct love. The term “love” here stands for a positive interest, in polar opposition to “hate,” which includes for Brentano any negative dis-interest. For Brentano, these are phenomena in the same category as simple feelings of benevolence, but with a more complex structure that makes them cognitively much more powerful. By introducing these more powerful notions, Brentano hopes to show how one can take a psychological approach in meta-ethics that still “radically and completely [breaks] with ethical subjectivism” (ibid., p. xi, transl. modified).

We have already mentioned that Brentano denies that our moral attitudes and decisions are based on an innate, and in this sense “natural,” instinct. It may be true that we instinctively love children and cute pets, and that these creatures fully deserve our caring response. But such instinctive or habitual responses can also be misleading. We may instinctively or habitually love things that do not deserve our love, for example if we have become addicted to them. A theory that breaks with ethical subjectivism must be able to tell us why our love for children and pets is right and our love for a drug is wrong.

One way to approach this matter is to interpret Brentano as a precursor of a “fitting attitude” theory (see Kriegel 2017, p. 224ff). When we love an object that deserves our love, we may call this a “fitting attitude.” But to know what “fittingness” means, we have to turn to inner perception. Inner perception tells us, for example, when we love the kindness of a person, that this is a correct emotional response. Once we know that a person is kind, we know immediately that her kindness is something to be loved. This is “self-imposing,” as Kriegel says.

The question now is whether inner perception will also provide us with a list of preferences that is beyond doubt. For example, does inner perception tell us that being healthy or happy is better than being sick or sad, or as Brentano would put it: that it is correct to love health and happiness more than sickness or sadness? Such claims are open to counterexamples, or so it would seem. A person might want to get sick, or give in to sadness, or deliberately hurt herself. But there is a response Brentano can make to defend the self-imposing character of a preference-order. He could say that people have such deviant preferences only for the purpose of achieving some further goal. If we ask for good things that can be final goals, i.e. goals that we do not pursue for the sake of some other goal, then no one can reasonably doubt that health, happiness, and knowledge are better than sickness, sadness, and error.

Following this line, these cases could be treated in the same way as true judgements that are not self-evident to the subject making them: Even if it is not self-evident to a subject that health is a good, there may be subjects who take it as an ultimate goal and for whom it is a self-imposing good. Brentano can thus also claim to avoid a dilemma that threatens his meta-ethics: If goodness just depends on our emotional responses, this would make this notion fully subjective. But if it covers only those cases when the correctness of our love is self-evident to us, its domain of application would be very small indeed. Hence the idea that goodness must correlate with the responses of a perfect moral agent.

Brentano’s notion of correct love may explain how we acquire the concept of goodness, but it need not figure as part of a substantive definition of this concept. Cases of moral behavior whose correctness is self-imposing may still be informative, without comparing our moral behavior with an ideal moral agent. These cases suggest a preference order of goodness. For example, cases in which people sacrifice all their possessions to get medical treatment may help us to see why health is such a high good, perhaps even an ultimate good. And cases in which people risk their health for moral reasons may help us to see that there are goods that are ranked even higher than good health. Just as self-evident judgements lead us to the idea that truth is the highest epistemic value, so acts of love, whose correctness is self-imposing, can lead us to the idea of a supreme good.

d. Presentation and Beauty

Brentano concludes his analysis of normative concepts with an analysis of the concept of beauty. Extending his arguments against subjectivism, he attacks the common view that beauty exists only “in the eyes of the beholder.” For Brentano, beauty is a form of goodness that is no less objective than other forms of goodness. What the common view rightly points out is merely a fact about how we acquire the concept of beauty, namely by recourse to experience. Since this also holds for the concepts of truth and moral goodness, as Brentano says, beauty is no exception.

While we do not have a fully worked-out version of Brentano’s aesthetic theory, its outlines are clear from lecture notes in his Nachlass (see Brentano: Grundzüge der Ästhetik, 1959). Again taking a psychological approach, Brentano argues that it is not a simple form of pleasure that makes us see or feel the beauty of an object. As in the case of moral judgements, it is a more complex mental state with a two-level structure. When we judge something to be beautiful, the first-level acts are acts of presentation: we perceive something or recall an image from memory. To this, we add a second-level phenomenon of interest: either “delight” or “disgust.” This gives us the following principle of concept acquisition: We acquire the concept of beauty by abstracting it from experiences of correct delight in a presentation.

Brentano’s theory nicely incorporates the fact that people differ in their aesthetic feelings. Musicians feel delight when hearing their favorite piece of music, art lovers when looking at a favorite painting, and nature lovers when enjoying a good view of the landscape. What they have in common is a particular kind of experience, which is why they can use the term “beautiful” in the same way. It is the experience of delight that underlies their understanding of beauty. Yet it is not a simple enjoyment like the pleasure we may feel when we indulge in ice cream or when we take a hot bath. Aesthetic delight is a response that requires a more reflective mode. To feel a higher-order pleasure, we must pay attention to the way objects appear to us and contemplate the peculiarities of these representations.

This reconstruction of Brentano’s aesthetic theory suggests that Brentano does not need a substantive definition of beauty. He can get by with general principles that connect the notions of beauty and delight, mimicking those that connect goodness with love, or truth with judgement and existence. As early as 1866, Brentano used this formulation of such a principle in its application to aesthetics: something is beautiful if its representation is the object of a correct emotion (see Brentano, “Habilitation Theses”). While these principles come with a sense of obviousness, interesting consequences follow from Brentano’s explanation of how we acquire our knowledge of them.

First, it follows that just as we can be mistaken in our judgements and emotional responses, so we can fail to appreciate the aesthetic qualities of an object. People may feel pleasure from things that do not warrant such pleasure, or they may fail to respond with pleasure to even the most beautiful things in front of their eyes. A plausible explanation for such cases of aesthetic incompetence is that people can perceive things very differently. They may not hear what the music lover hears, or fail to see what the nature lover sees, and therefore cannot understand why such enthusiasts find these things so pleasurable.

Second, Brentano’s theory implies that no relation of correspondence between mind and reality will explain what justifies aesthetic pleasure. Such justification can only come from inner perception, which provides us with exemplary cases of aesthetic delight. In such cases, as in the case of moral feelings, the correctness of the enjoyment is self-imposing. Such experiences may serve as a yardstick for judging cases whose beauty is less obvious and therefore more controversial.

4. Epistemology and Metaphysics

Brentano’s interest in questions of psychology is matched by an equally deep interest in questions of metaphysics. In both areas Brentano draws inspiration from Aristotle and the Aristotelian tradition. We can see this from his broad notion of metaphysics, encompassing ontological, cosmological, and theological questions. The central ontological question for Aristotle is the question of “being as such”; what does it mean to say that something is, is being, or has being? This is followed by questions concerning the categories of being, e.g.: What are the highest categories into which being can be divided? The second part of metaphysics, cosmology, seeks to establish the first principles of the cosmic order. It addresses questions concerning space, time and causality. Finally, the pinnacle of metaphysics is to be found in natural theology, which asks for the reason of all being: Does the world have a first cause, and does the order of the world suggest a wise and benevolent creator of the world?

For Brentano, as for Kant, these metaphysical questions pose an epistemological challenge first of all because they tend to exceed the limits of human understanding. To defend the possibility of metaphysical knowledge against skepticism, Brentano takes a preliminary step which he calls “Transcendental Philosophy,” but without adopting Kant’s Copernican turn. On the contrary, for Brentano, Kant himself counts as a skeptic because he declares things in themselves to be unknowable. What Kant overlooked, Brentano argues, is the self-evidence with which we make certain judgements. From this experience, he believes we can derive metaphysical principles whose validity is unquestionable.

a. Kinds of Knowledge

Skeptics of metaphysical knowledge often draw a contrast between metaphysical and scientific knowledge. Brentano resisted such an opposition, opting instead for an integration of scientific knowledge into metaphysics. He must therefore face the following conundrum: How can philosophy integrate the results obtained by the special sciences while at the same time making claims that go beyond the scope of any of the individual sciences?

Kant’s attempt to resolve this dilemma is based on the special nature of synthetic judgements a priori. Such judgements have empirical content, Kant holds, but we recognize them as true by the understanding alone, independently of sensory experience. Metaphysical knowledge thus becomes possible, but it is constrained by the scope of the synthetic judgements a priori. Brentano rejects this Kantian solution as inconclusive. Whatever a synthetic judgement a priori may be, it lacks what Brentano calls “self-evidence.” Metaphysics must not be constrained by such “blind prejudices,” Brentano holds.

Rejecting Kant’s view, Brentano proposes his own classification of judgements based on two distinctions: He divides judgements into those that are made with and without self-evidence, and he distinguishes between judgements that are purely assertive in character and apodictic (necessarily true or necessarily false) judgements. Knowledge in the narrow sense can only be found in judgements that are self-evident or that can be deduced from self-evident judgements. This does not mean, however, that non-evident judgements have no epistemic value for Brentano. To see this, we can replace the assertive/apodictic distinction with a distinction between purely empirical and not purely empirical judgements. Crossing this distinction with the evident/not-evident distinction gives us four possible kinds of knowledge:

  • knowledge through self-evident purely empirical judgements
  • knowledge through non-self-evident purely empirical judgements
  • knowledge through self-evident not purely empirical judgements, and
  • knowledge through non-self-evident not purely empirical judgements.

To the first category belong judgements of inner perception in which we acknowledge the existence of a current mental phenomenon. Such judgements are always made with self-evidence, according to Brentano. For instance, you immediately recognize the truth of “I am presently thinking about Socrates” when you do have such thoughts and therefore cannot reasonably doubt their existence.

Category 2 contains the non-self-evident empirical judgements, including all empirical hypotheses. These judgements differ in the degree of their confirmation, which is expressed in probability judgements. Although they lack self-evidence, Brentano allows that they may have what he calls “physical certainty.” Repeated observation may lead us to the certainty, for instance, that the sun will rise tomorrow. It is a judgement with a very high probability.

Category 3 contains apodictic universal judgements which, according to Brentano, are always negative in character. For instance, the true form of the law of excluded middle is not expressed by “for all x, either x is F or x is not-F,” but rather by the existential negative form: “there is no x which is F and not-F.” The truth of such a judgement is self-evident in a negative sense, i.e. it expresses the impossibility of acknowledging the conjunction of a property and its negation.

Finally, what about judgements in category 4? These judgements are similar to those which Kant classified as synthetic a priori. While Brentano denies that mathematical propositions fall into this category, taking them to be analytic and self-evident, he recognises a special status for judgements such as “Something red cannot possibly be blue” (that is, blue and red in the same place; see Brentano: Versuch über die Erkenntnis, p. 47). Or consider the judgement “It is impossible for parallels to cross.” These judgements are based on experiences described in the framework of commonsense psychology or Euclidean geometry. We can imagine alternative frameworks in which these judgements turn out to be false. But for practical reasons we can ignore this possibility and classify them as axioms and hence as knowledge.

Brentano’s official doctrine is that there are no degrees of evidence and that all axioms are self-evident judgements in category 3, despite the fact that many of them have been disputed. This puts a lot of pressure on explaining away these doubts as unjustified. Brentano was confident that this was a promising project and that it was the only way to show how metaphysics could aim at a form of wisdom that goes beyond the individual sciences. A more modest approach would emphasize the fact that axioms are not purely empirical judgements: they are part of conceptual frameworks like commonsense psychology or Euclidean geometry. These frameworks operate with relations (for example, of opposition or correlation) that may seem self-evident, but alternative frameworks in which other relations hold are conceivable. Treating axioms as knowledge in category 4 leads to a more modest epistemology, which might still serve metaphysics in the way Brentano conceived its role.

b. A World of Things

There are many possible answers to the question “What is the world made of?” The first task of ontology is therefore to compare the different possible answers and to provide criteria for choosing one ontology over another. After considering various options, Brentano settled on the view that the world is made of real things and nothing else. To emphasize that Brentano defends a particular version of realism, his view is called “Reism.”

To better understand Brentano’s Reism, it is necessary to follow his psychological approach. Crucial to Brentano’s view is the idea that irrealia cannot be primary objects of presentations. When we affirm something, we may mention all kinds of non-real entities. But the judgements we make must be based exclusively on presentations of real things. There are many terms that obscure this fact. Brentano calls these terms “linguistic fictions,” which give the false impression that our thoughts could also concern non-real entities. Brentano’s list of such misleading expressions is long:

One cannot make the being or nonbeing of a centaur an object [i.e. a primary object] as one can a centaur. […] Neither the present, past, nor future, neither present things, past things, nor future things, nor existence and non-existence, nor necessity, nor non-necessity, neither possibility nor impossibility, nor the necessary nor the non-necessary, neither the possible nor the impossible, neither truth nor falsity, neither the true nor the false, nor good nor bad. Nor can […] such abstractions as redness, shape, human nature, and the like, ever be the [primary] objects of a mental reference (F. Brentano: Psychology from an Empirical Standpoint, 1973, p. 294).

Commentators have spent a great deal of time examining the arguments Brentano uses to support his Reism (see Sauer 2017, p. 139). His method may be illustrated with the following example. Suppose that someone judges correctly:

Rain is likely to come.

Brentano proposes the following analysis of this statement:

Someone correctly acknowledges rain as a likely event.

This analysis shows that it is not the probability of rain that gets acknowledged, which would be something non-real. If the first statement is equivalent to the second, then the existence of a real thing is sufficient for the truth of both statements. This “truth-maker” is revealed in the second statement: It is a thinking thing that recognizes rain as an impending event, or, as Brentano says, a being that makes the judgement “rain exists” in the mode of the future and with some degree of probability.

With examples like this, Brentano inspired many philosophers in the analytic tradition to use linguistic analysis to promote ontological parsimony. Not many, however, would go as far as Brentano when it comes to analysing perceptual judgements. Brentano is committed to the view that in an act of perception there is only one real thing that must exist: the subject who enjoys the sensory experience. To defend this position, he relies on the following dictum: “What is real does not appear to us, and what appears to us is not real.”

Interpreting this dictum is a serious task. Does Brentano mean to say that all secondary qualities, such as colours, sounds and tastes, are not real because they are mere appearances? Or is he saying that secondary qualities have a special form of “phenomenal reality”? There seem to be good reasons counting against both claims. The phenomenal reality of colours, shapes and tastes counts against classifying them as “non-real,” and the deceptive character that appearances can have counts against a notion of “phenomenal reality” that applies to all appearances across the board. It is only by resolving such questions that a proper understanding of Brentano’s Reism can be achieved.

c. Substance and Accidents

Another classic theme in Aristotelian metaphysics is the relationship between substance and accidents, which include both properties and relations. Brentano divides substances into material and immaterial ones, and, in line with his reistic position, holds that both substances and accidents are real things in the broad sense of the word. Material substances are real because they exist independently of our sensory and mental activities. Mental substances, on the other hand, are real things because we innerly perceive them with self-evidence.

Both substances and accidents raise the question of how they exist in space and time. Let us start with space: Brentano holds on to the common view that material substances occupy portions of space (space regions), much in line with a container view of space. Extending this geometric view to mental substances, Brentano considers modeling them as points that occupy no location. To make sense of this idea, he introduces the limiting case of a zero-dimensional topology. It is a limiting case because the lack of any dimensions means that the totality of space in this topology is represented as a single point. Since, by definition, the totality of space does not exist in any other space, we get the desired result that a mental substance occupies no location. In a further step, Brentano then compares material substances with one- or multi-dimensional continua and argues that one can represent mental substances as boundaries of such continua without assigning them a location. This analogy between mental substances and points on the one hand, and material substances and one- or multi-dimensional continua on the other, forms the basis of Brentano’s version of substance dualism (on dualism, see below).

What about time? As in the case of space, Brentano represents points of time as boundaries of a continuum. With the exception of the now-point, they are fictions cum fundamentum in re, which Brentano sometimes also calls metaphysical parts. In contrast with space, however, Brentano holds that time is not a real continuum. More precisely, it is an unfinished (unfertiges) continuum of which only the now-point is real. This makes Brentano a presentist about the reality of time. It is a view that seems counterintuitive since it implies that a temporal continuum, such as a melody, can only be perceived at the now-point because, strictly speaking, your hearing and your awareness of hearing the melody cannot extend beyond the now-point. Brentano’s solution to this problem was to argue for persistence in the tonal presentations of the melody through an original (that is, innate) association between these presentations. When you hear the fourth note of a melody, your presentations of the first three notes are retained, which accounts for the impression of time-consciousness (although technically there is no time-consciousness). This account was very influential on Husserl (see Fréchette 2017 on Brentano’s conception of time-consciousness).

Now back to substance. We often characterize substantial change as a change of accidents inhering in a substance. For example, a substance may lose weight; a leaf on a tree may change its color from red to green, thereby losing an accident and gaining a new one. This common view is not Brentano’s view. On the contrary, for Brentano, spatial and temporal accidents are absolute accidents. This means that a substance cannot lose or gain any such properties. To make sense of this, Brentano does two things: first, he turns the traditional view of the substance-accident relation upside down. Instead of seeing the substance as a fundamental being and its accidents as inhering in it, Brentano changes the order: for him, accidents are more fundamental than substances. Second, he rejects inherence as a tie between substance and accident, and replaces it with the parthood relation: In his account, accidents are wholes of which a substance is a part. To illustrate this, take a glass of water which contains 200 ml of water at t1. After you have taken a sip from the glass, it then contains 170 ml of water at t2. According to Brentano, what changes are the wholes while the same substance is part of two different wholes at t1 and t2. Similarly, Brentano argues, when I heat the water in my cup, the water not only expands in space, but also takes up more or less “space” on a temporal continuum of thermal states. This temporal continuum of thermal states is also a continuum of wholes containing the same substance. Expansion does not require new properties to be added to a substance. It just requires an increase that is measurable on some continuum.

It did not escape Brentano that an Aristotelian realism about universals poses a serious problem. According to this view, universals exist in re, that is, not independently of the substances in which they occur. It follows, in the case of material substances, that an accident can exist in more than one place at the same time. It may also be that an accident ceases to exist altogether when all its instances disappear, but later begins to exist again as soon as there is a new instance in which it occurs.

At this point, one must consider again the role that Brentano assigns to linguistic analysis. Brentano relies on such an analysis when he interprets simple categorical judgements as existential judgements. This analysis should remove the root from which the problem of multiple existence arises. Consider, for example, the simple categorical judgement:

A. Some men are bald-headed.

Brentano reduces this judgement to the following existential judgement:

B. A bald-headed man exists.

In the categorical judgement A, the term “man” denotes a substance and the term “bald-headed” denotes an accident of that substance if the judgement is true. In the existential judgement B, we have a single complex term, “bald-headed man,” which corresponds to a complex presentation, independently of whether a bald-headed man exists or not. The function of the judgement is to acknowledge the existence of the object presented in this way, without adding any further “tie” between substance and accident.
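The contrast between judgement A and judgement B can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not Brentano’s own formalism: the class `Thing`, the functions `categorical_some` and `existential`, and the encoding of terms as sets of strings are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    """A presented object together with the simple terms that apply to it (assumed encoding)."""
    name: str
    terms: frozenset


def categorical_some(domain, subject, predicate):
    """Judgement A, 'Some S are P': pick out the S's, then tie the predicate P to one of them."""
    return any(predicate in t.terms for t in domain if subject in t.terms)


def existential(domain, complex_term):
    """Judgement B, 'An SP exists': acknowledge an object that falls under the complex term."""
    return any(complex_term <= t.terms for t in domain)


# Hypothetical domain used only for illustration.
socrates = Thing("Socrates", frozenset({"man", "bald-headed"}))
plato = Thing("Plato", frozenset({"man"}))
domain = [socrates, plato]

a = categorical_some(domain, "man", "bald-headed")
b = existential(domain, frozenset({"bald-headed", "man"}))
```

On any domain the two functions return the same truth value. The difference lies in their shape: the second takes a single complex term and needs no separate step that links a predicate to a subject.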

Brentano’s mereology provides us with an ontology that matches this semantic analysis. The idea is the following: The complex term “bald-headed man” denotes a whole. The term has two components (“bald-headed” and “man”), both of which seem to denote parts of this whole. Now, which is here the substance, and which is the accident? Brentano decides that only the term “man” denotes a substance, and only the complex term “bald-headed man” denotes an accident. This means a substance, say the man Socrates, can be part of a whole (e.g. the bald-headed man Socrates) without there having to be another part which must be added to this substance in order for this whole to exist.

Brentano offers us an account of substance that is peculiar because it goes against the principle of supplementation in extensional mereology. This principle expresses the intuition that a whole which has a proper part must have at least one further part that supplements it. Brentano doesn’t follow that intuition: For him, properties like “bald-headedness” are not further parts that add to the substance to make it a whole. He thereby corrects the Aristotelian form of realism about universals at the very point that raises the problem of multiple existence.
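The conflict with supplementation can be checked mechanically on toy examples. The sketch below assumes the weak form of the principle (if x is a proper part of y, then y has another proper part that does not overlap x) and represents proper parthood as a set of (part, whole) pairs. The function names and both example structures are hypothetical and serve only as illustration.

```python
def proper_parts(whole, parthood):
    """All proper parts of `whole`, given proper parthood as a set of (part, whole) pairs."""
    return {p for (p, w) in parthood if w == whole}


def overlaps(a, b, parthood):
    """Two things overlap if they are identical or share a proper part."""
    parts_a = proper_parts(a, parthood) | {a}
    parts_b = proper_parts(b, parthood) | {b}
    return bool(parts_a & parts_b)


def satisfies_weak_supplementation(whole, parthood):
    """Every proper part of `whole` must be supplemented by a disjoint proper part."""
    pps = proper_parts(whole, parthood)
    return all(any(not overlaps(x, z, parthood) for z in pps) for x in pps)


# A structure of the ordinary kind: the cup has two disjoint proper parts.
classical = {("handle", "cup"), ("bowl", "cup")}

# A Brentano-style structure: the accident is a whole whose only proper part is the substance.
brentano = {("Socrates", "bald-headed Socrates")}

classical_ok = satisfies_weak_supplementation("cup", classical)
brentano_ok = satisfies_weak_supplementation("bald-headed Socrates", brentano)
```

The first structure passes the check and the second fails it, since nothing supplements Socrates within the whole. This is the sense in which Brentano’s accidents are wholes without an additional part.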

d. Dualism, Immortality, God

One of the oldest problems in metaphysics is the so-called “mind-body problem.” Brentano confronts this problem in a traditional as well as in a modern way. The traditional setting is provided by the Aristotelian doctrine that different forms of life are explained by the different kinds of souls that organisms possess. Plants and animals have embodied souls that form the lower levels of the human soul, which also includes a “thinking soul” as a higher part. In this framework, the mind-body problem consists of two questions: (1) how do the lower parts of the human soul fulfil their body-bound function? and (2) how, if at all, does the thinking soul depend on the body?

The modern setting of the mind-body problem is provided by Descartes’ arguments for the immateriality of the soul. These arguments make use of considerations that suggest that the activity of the mind is thinking, and that thinking requires a substance that thinks, but not a body with sensory organs. Following the Cartesian argument, the mind-body problem becomes primarily a problem of causation: (1) how can sensory processes have a causal effect on the thinking mind; and (2) how can our thoughts have a causal effect on our behavior?

Brentano’s ambitious aim in his Psychology from an Empirical Standpoint was to bridge the gap between these two historical frameworks. If he succeeded, he would be able to use Cartesian ideas to answer questions arising within the Aristotelian framework, and he would be able to use Aristotelian ideas to answer questions about mental causation. But since Brentano left his Psychology unfinished, we do not know for sure how Brentano hoped to resolve the question, “whether it is conceivable that mental life continues after the dissolution of the body” (Brentano 1973, xxvii).

While Brentano did not get to the part of his Psychology that was meant to deal with the immortality question, he prepares this discussion in a chapter on the unity of consciousness. He begins with the following observation:

We are forced to take the multiplicity of the various acts of sensing […] as well as the inner perception which provides us with knowledge of all of them, as parts of one single phenomenon in which they are contained, as one single and unified thing (Brentano 1973, p. 97).

Here Brentano is only talking about the unity of the experience we have when, for example, we simultaneously hear and see a musician playing an instrument. But the same idea can be extended when we think about our own future. Suppose I am looking forward to a vacation trip I have planned. There is a special unity involved in this because I could be planning a trip without looking forward to it, and I could be looking forward to a trip that someone else is planning. In the present case, however, my intention and my joy are linked, and in a double way. Both phenomena relate to the same object: my future self that enjoys the trip. Using examples like this, an argument for the immortality of the soul could be made along the following lines:

  1. The unity I perceive between my intentions and the pleasures I feel now does not rely on any local part of my body.
  2. If it doesn’t rely on any part of my body now, it will not rely on my body in the future.
  3. Therefore, the unity I perceive now may well outlive my body.

While the question of immortality concerns the end of our life, there is also a question about its beginning: How does it come about that each human being has an individual soul? Brentano takes up this question in a lengthy debate with the philologist and philosopher Eduard Zeller. The subject of this debate was whether Aristotle could be credited with the so-called “creationist” view, according to which the existence of each individual soul is due to God’s creation. Brentano affirms such an interpretation, and we may assume that it coincides with his own view of the matter. It is a view that presupposes a fundamental difference between human and non-human creatures, but also allows some continuity in the way souls enter the bodies of living creatures:

Lest this divine intercession should appear too incredible, Aristotle calls attention to the fact that the powers of the lower elements do not suffice for the genesis of any living being whatever. Rather, the forces of the heavenly substances participate in a certain way as a cause, thereby making such beings more godlike. The participation of the deity in the creation of man, therefore, has its analogy in the generation of lower life (Brentano 1978, 111ff.).

What this suggests is that individual souls of all kinds are created in such a way that God’s contribution to the process is recognisable. This is remarkable, because it means that the human soul owes its existence to a process of creation that is in many ways analogous to processes that enable the existence of plants and animals.

Such considerations fit nicely into the Aristotelian framework of the mind-body problem. Some scholars have therefore questioned whether Brentano could also advocate a Cartesian substance dualism. Dieter Münch for instance suggests that there are clear traces of a “monistic tendency” in Brentano (D. Münch, 1995/96, p. 137).

On the other hand, we also find in Brentano considerations that favor the Cartesian framework. For example, when Brentano offers a psychological proof of the existence of God, he follows closely in the footsteps of Descartes (On the Existence of God, sections 435-464). In this proof, Brentano criticizes what he calls an “Aristotelian semi-materialism,” and argues that the unity of consciousness is incompatible with any form of materialism. The tension between these arguments (which may have been modified somewhat by Brentano’s disciples) and the passages quoted above seems difficult to resolve. We may take this as a sign that the gap between the Aristotelian and Cartesian frameworks is too wide to be bridged (see Textor 2017).

5. History of Philosophy and Metaphilosophy

Brentano’s contributions to the history of philosophy are above all an expression of his metaphilosophical optimism. Brentano believed that philosophy develops according to a model that distinguishes phases of progress and phases of decline. Phases of progress are relatively rare and followed by much longer phases of decline. Only a few philosophers, such as Aristotle, Thomas Aquinas, Leibniz and Descartes, fulfill the highest standards which Brentano applies. Nevertheless, he was optimistic that another phase of progress would come, and that it would provide concrete solutions to philosophical problems.

Brentano’s phase model is undoubtedly speculative. It is the result of his reflections on how to approach the history of philosophy. In his view, it makes a big difference whether one studies the history of philosophy as an historian or as a philosopher. Brentano tried to convey to his students the relevance of this distinction, and it is for this purpose that he most often invoked his phase model.

a. How to do History of Philosophy

In a lecture given to the Viennese Philosophical Society in 1888, entitled “On the Method of Historical Research in the Field of Philosophy” (a draft of which has been published in Franz Brentano: Geschichte der Philosophie der Neuzeit (1987, pp. 81-105)), Brentano sets out his recommendations for doing history of philosophy. One should “approach the author’s thought in a philosophical way,” he says. This requires two critical competences: a specific hermeneutic competence, and a broad understanding of the main currents of progress and decline in philosophy.

The hermeneutic competence Brentano requires “consists in allowing oneself to be penetrated, as it were, by the spirit of the philosopher whose teachings one is studying” (ibid., p. 90). In his debate with Zeller, Brentano uses this requirement as an argument against a purely historical interpretation of Aristotle’s texts:

One must try to resemble as closely as possible the spirit whose imperfectly expressed thoughts one wants to understand. In other words, one must prepare the way for understanding by first meeting the philosopher philosophically, before concluding as a historian (Brentano: Aristoteles’ Lehre vom Ursprung des menschlichen Geistes (1911), p. 165).

The second requirement, namely an awareness of the main currents structuring the development of philosophy, brings us back to Brentano’s phase model. Such models were popular among historians in Brentano’s time. One of these historians was Ernst von Lasaulx, whose lectures Brentano attended as a student in Munich (see Schäfer 2020). A few years later, Brentano came across Auguste Comte’s model and subjected it to critical examination (see Brentano’s lecture “The Four Phases of Philosophy and its Current State” (1895)).

Brentano believes that progress of philosophy results from a combination of metaphysical interest with a strictly scientific attitude. He therefore disagrees with Comte on three points. Firstly, Comte fails to notice the repetitive cycles of progress and decline. Secondly, he does not see the classical period of Greek philosophy as a phase in which philosophers were driven by a purely theoretical interest. And thirdly, Comte mistakenly believes that modern philosophy had to pass through a theological and metaphysical phase before it could enter its scientific phase.

The broad perspective that Brentano takes on the history of philosophy leads him to cast a shadow over all philosophers who belong to a phase of decline. But this is not the only form of criticism to be found in Brentano. There are also independent and highly illuminating discussions of the works of Thomas Reid and Ernst Mach, as well as profound criticisms of Windelband, Sigwart and other logicians of Brentano’s time. From today’s point of view, these are all important contributions to the history of philosophy.

b. Aristotle’s Worldview

For Brentano, the study of Aristotle’s works was a major source of inspiration and went hand in hand with the development of his own ideas. Almost all his works contain commentaries on Aristotle, often in the form of a fictitious dialogue in which Brentano tested the viability of his own ideas and sought support for his own views from a historical authority.

Brentano was concerned with Aristotle’s philosophy throughout his life, from his early writings on ontology (1862) and psychology (1867), through his debate with Eduard Zeller on the origin of the human soul (1883), to his treatise on Aristotle’s worldview a few years before his death. Throughout these studies, we can observe Brentano applying his hermeneutic method to fill in the gaps he finds in Aristotle’s argumentation and to resolve apparent contradictions.

In his final treatment of Aristotle, Brentano focuses on the doctrine of wisdom as the highest form of knowledge. He approaches this problem by not confining himself to what Aristotle says in the Metaphysics, but “by using incidental remarks from various works” (Aristotle and His World View, p. ix), and by including commentaries such as Theophrastus’s Metaphysics. With the help of Theophrastus and other sources, he hopes to resolve what he believes are only apparent contradictions in Aristotle’s writings. Brentano’s ultimate aim in these late texts is to present his own doctrine of wisdom as the highest form of knowledge (see the manuscripts published in the volume Über Aristoteles 1986).

Another thorny issue taken up by Brentano is Aristotle’s analysis of induction. Brentano praises Aristotle for having recognized the importance of induction for empirical knowledge when, for example, he discussed the question of how we can infer the spherical shape of the moon from observing its phases. It was, however, “left for a much later age to shed full light, by means of the probability calculus, upon the theory of measure of justified confidence in induction and analogy” (Aristotle and His World View, p. 35). Brentano’s own attempt to solve the problem of induction is tentative. Its solution, he says, will depend on the extent to which the future mathematical analysis of the concept of probability coincides with the intuitive judgements of common sense (ibid.).

Brentano follows in Aristotle’s footsteps in treating metaphysics as a discipline that includes not only ontology, but also cosmology and (natural) theology. The pinnacle of metaphysics, for Brentano, would be a proof of the existence of God. Brentano already hints at this idea in his fifth habilitation thesis: “The multiplicity in the world refutes pantheism and the unity in it refutes atheism” (Brentano, 1867).

Brentano worked extensively on a proof of God’s existence that relies both on a priori and a posteriori forms of reasoning (see Brentano: On the Existence of God. 1987). Here, too, it is not difficult to find the Aristotelian roots of Brentano’s thinking. In a manuscript from May 1901, Brentano writes: “Aristotle called theology the first philosophy because, just as God is the first among all things, the knowledge of God (in the factual, if not in the temporal order) is the first among all knowledge” (Religion and Philosophy, 90).

We can see from these late writings that there were at least two constants in Brentano’s work: one is his engagement with Aristotle, another one is his theism. But these early imprints are not the only ones. There is yet another historical source that became decisive for Brentano early on, namely the contemporary positivism of Mill, Spencer and Comte.

c. Positivism and the Renewal of Philosophy

In 1859, the Franco-Luxembourgish philosopher and sociologist Théophile Funck, who became Brentano’s brother-in-law in 1860, published a book entitled Philosophie et lois de l’histoire, in which he connects the positivist movement with a model of historical development resembling Brentano’s phase model. How much Brentano was impressed by positivism can be seen in his review of another book by the same author:

In the most recent epoch there has appeared in the person of Auguste Comte a thinker lacking neither the enthusiastic zeal for the most sublime questions, nor the insight capable of linking ideas, which elevate the truly great philosopher above the mass of lesser minds. Mill doesn’t hesitate to put him on the same level as Descartes and Leibniz, he even calls him superior to them, if not deeper, if only because he was able to bring to bear a similar spiritual force in a more advanced cultural epoch (Brentano 1876, 3. Our transl.).

This enthusiasm for Comte’s philosophy may have changed in light of the anti-metaphysical tendencies in Comte’s thought and his reservations about psychology becoming a proper science. But there are other factors, too, that may have played a role in this context: For example, the discovery of Mill’s monograph on Comte, which Brentano read in French translation in 1868, may have suggested to him a fundamental agreement between British empiricism and French positivism. Also, at this young age, Brentano was preparing for an academic career in philosophy as a Catholic priest. This was a difficult task in Germany in those days. Brentano had to expect fierce opposition to his beliefs, especially in Würzburg. The study of English and French positivism may have seemed to him an appropriate means of countering such resistance.

One way of preparing for this opposition was to ensure that his psychology had a firm empirical basis. To do this, Brentano wanted to show that mental phenomena are subject to distinct laws which are nevertheless similar or analogous to the laws of physics. Here, Brentano could rely on his doctrine that philosophy (including psychology) and the natural sciences share a common method, a doctrine which he defended as one of his theses for the habilitation. But Brentano had yet to substantiate this claim with concrete examples. In this respect, the debate on the Weber-Fechner law provided him with a welcome opportunity.

Brentano states the law as follows: “It has been found that the increase of the physical stimulus which produces a just barely noticeable increase in the strength of the sensations always bears a constant relation to the magnitude of the stimulus to which it is added” (Psychology from an Empirical Standpoint, p. 67). Brentano then goes on to correct what he takes to be a common mistake in applying this law:

Since it was assumed to be self-evident that each barely noticeable increase of sensation is to be regarded as equal, the law was formulated that the intensity of sensation increases by equal amounts when the relative increase of the physical stimulus is the same. In reality, it is by no means self-evident that each barely noticeable increase in sensation is equal, but only that it is equally noticeable (ibid.).

This example highlights two key moves in Brentano’s thinking. First, while he fully acknowledges the significance of the Weber-Fechner law, he points out a mistake in the formulation of the law. It takes the sharp eye of a philosopher, trained in making precise conceptual distinctions, to reveal this mistake. Second, Brentano draws the consequence that there are two kinds of laws which together explain the experimental findings by Weber and Fechner: a physiological law, as well as a law of descriptive psychology. These laws are perfectly tuned to each other, which shows that they are similar or at least analogous (for details on the Fechner-Brentano debate, see the introduction by Mauro Antonelli to Brentano and Fechner 2015 and Seron 2023).
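The distinction between “equal” and “equally noticeable” increases can be illustrated numerically. The following Python sketch is an illustration under stated assumptions, not a reconstruction of Brentano’s or Fechner’s calculations. The function names, the Weber fraction `k` and the growth factor `c` are hypothetical. The proportional rule is only one way in which the “equally noticeable” reading could be spelled out.

```python
import math


def noticeable_steps(r0, r, k):
    """Number of just noticeable steps from stimulus r0 to r when each step multiplies the stimulus by (1 + k)."""
    return math.log(r / r0) / math.log(1 + k)


def sensation_equal_steps(r0, r, k, delta=1.0):
    """Reading criticised in the quotation: every noticeable step adds the same amount delta."""
    return delta * noticeable_steps(r0, r, k)


def sensation_proportional_steps(r0, r, k, s0=1.0, c=0.05):
    """Assumed alternative: every noticeable step increases the sensation by the same proportion c."""
    return s0 * (1 + c) ** noticeable_steps(r0, r, k)


k = 0.1  # assumed Weber fraction, chosen only for illustration

# Doubling the stimulus twice under each reading.
equal_once = sensation_equal_steps(1.0, 2.0, k)
equal_twice = sensation_equal_steps(1.0, 4.0, k)
prop_once = sensation_proportional_steps(1.0, 2.0, k)
prop_twice = sensation_proportional_steps(1.0, 4.0, k)
```

Under the first reading, each doubling of the stimulus adds the same amount of sensation, which is the logarithmic form of the law. Under the second, each doubling multiplies the sensation by the same factor. Both readings agree on when an increase becomes noticeable, so the experimental findings alone do not decide between them.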

In conclusion, it must be admitted that Brentano’s attitude to the positivist tradition remains somewhat ambiguous. This can also be seen in his criticism of Ernst Mach, who was appointed professor of philosophy at the University of Vienna in the same year that Brentano left the city. They shared the ambition of renewing philosophy by drawing upon the resources of the natural sciences. However, Brentano could not accept Mach’s theory of sensation, and the monism it implies. He believed that a firm distinction between physical and mental phenomena precluded such monism (see Brentano: Über Ernst Mach’s “Erkenntnis und Irrtum” 1905/1906, first published in 1988).

d. Philosophical Optimism

Brentano was a philosophical optimist through and through, firmly convinced that philosophy not only had a great past, but an even greater future ahead of it. He emphasized the continuity between the endeavors of philosophy and the empirical sciences, and he attacked in his lectures with great eloquence a cultural pessimism that was popular at the time. In addition, he backed up his optimism with his theistic worldview.

In his lecture “On the Reasons for a Loss of Confidence in the Area of Philosophy” (1874), Brentano admits that so far philosophy has not been able to keep pace with the progress of the natural sciences. It lacks continuity, consensus, and practical usefulness. However, Brentano denies that this is a permanent deficiency and counters the widespread view that the questions of philosophy cannot be treated as precisely as scientific questions. He therefore urges philosophers to orient themselves more closely to the natural sciences and, in particular, to take advantage of the new findings of physiology: “Now that even physiology is beginning to thrive more vigorously, we no longer lack for signs pointing to the time for philosophy, too, to awaken to productive life” (Brentano: “On the Reasons for a Loss of Confidence in the Area of Philosophy” 2022, 499).

With such colorful words, Brentano’s aim was to instill in his students a belief in the integrity of philosophy as a rigorous discipline. If this was not good enough to convince his audience, Brentano reminded them of the human capacity to search for ultimate reasons, thereby appealing to a religious impulse in some of his students. One of them was Alfred Kastil, who as an editor often changed Brentano’s words, but claims that the following passage comes from one of Brentano’s manuscripts:

Man demands by nature not merely a knowledge of what is [Kenntnis des Dass], but also a knowledge of why [Kenntnis des Warum]. For this reason alone, the knowledge of God, as that of the first reason, is a great good, but it is also so insofar as the most joyful conception of the world, the most blissful hopes are attached to it (Brentano: Religion und Philosophie, 253 fn. Our translation).

Yet it would be rash to reduce Brentano’s philosophical optimism to religious enthusiasm. Looking back on his own life, he speaks of the “duty of the wise man, having reached the age of maturity, to subject his religious convictions to an examination”, and in doing so, with all due respect for popular religion, to retain the freedom to oppose the unauthorized restrictions on research by an “ecclesiastical government” (Ibid., p. 251).

Clearly, Brentano’s conflict with the Catholic Church and the Austrian state left deep marks on his personality. But it was not only religious institutions that Brentano opposed. As a philosopher, he fought all his life against Kantian philosophy and the scepticism and relativism he saw as its corollary. And although he saw much of philosophy as grounded in psychology, he fought against the popular view that such psychologism implied a form of subjectivism.

6. References and Further Reading

a. Monographs published by Brentano

  • On the Several Senses of Being in Aristotle. Transl. R. George. Berkeley: University of California Press 1975. [first published in 1862]
    • Brentano’s PhD dissertation. Argues that being in the sense of the categories is the most basic meaning of being according to Aristotle.
  • The Psychology of Aristotle, Especially His Doctrine of the Nous Poietikos. Transl. R. George. Berkeley: University of California Press 1977. [first published in 1867]
    • Brentano’s habilitation thesis in which he challenges the traditional view of the active intellect in Aristotle.
  • The Origin of Our Knowledge of Right and Wrong. Transl. R. M. Chisholm and E. H. Schneewind. London: Routledge 1969. [first published in 1889]
    • An expanded version of Brentano’s lecture bearing the same title, along with extensive footnotes and additions, which elaborate and defend his conception of the Good in terms of correct love.
  • Psychology from an Empirical Standpoint. Transl. A. C. Rancurello, D. B. Terrell, and L. L. McAlister. London: Routledge 1973. [first published in 1874]
    • Brentano’s most important book, conceived as the first two parts of a six-volume treatise on psychology, four of which were never published. In it, Brentano develops his conception of psychology as an empirical science based on inner perception.
  • Von der Klassifikation der psychischen Phänomene. Berlin: Duncker & Humblot, 1911. [On the Classification of Mental Phenomena]
    • Reprint of the second book of his Psychology from an Empirical Standpoint along with an important appendix in which Brentano explains some changes regarding his earlier views. Included in the English translation of the Psychology from an Empirical Standpoint.
  • Untersuchungen zur Sinnespsychologie. Berlin: Duncker & Humblot 1907. [Investigations on Sensory Psychology]
    • A collection of papers previously published between 1890 and 1906 on sensory psychology. These papers are largely based on the Vienna lectures on descriptive psychology. Not yet translated into English.
  • Aristotle and His World View. Transl. R. George and R. M. Chisholm. Berkeley: University of California Press 1978. [first published in 1911]
    • Originally published as a chapter of an edited book on the great figures of philosophy and expanded into a book. Offers a general presentation of Aristotle’s metaphysics, in many respects influenced by Theophrastus’ reading of Aristotle.
  • Aristoteles Lehre vom Ursprung des menschlichen Geistes. Leipzig: Veit & Comp. 1911. [Aristotle’s Doctrine of the Origins of the Human Mind]
    • Brentano’s last word on the origin of the human soul in Aristotle in his debate with Eduard Zeller. Not yet translated into English.

b. Other Philosophical Works Published by Brentano

  • “Habilitation Theses”. Transl. S. Gabriel. In I. Tănăsescu et al. (eds.), Brentano and the Positive Philosophy of Comte and Mill. Berlin: De Gruyter 2022, 433-436. [first published in 1867]
    • A list of 25 theses (originally in Latin) that Brentano defended for his habilitation.
  • “Auguste Comte and Positive Philosophy”. Transl. S. Gabriel. In I. Tănăsescu et al. (eds.), Brentano and the Positive Philosophy of Comte and Mill. Berlin: De Gruyter 2022, 437-455. [first published in 1869]
    • The first and only published article out of a series of eight planned articles in which Brentano critically examines Comte’s conception of philosophy, his classification of sciences and his view on the cycles and phases of history of philosophy.
  • “Der Atheismus und die Wissenschaft”. 1873. [“Atheism and Science”]. Reprinted in: Franz Brentano: Sämtliche veröffentlichte Schriften. Hgg. von Th. Binder und A. Chrudzimski. Volume IX: Vermischte Schriften. Berlin: De Gruyter 2019, 37-61.
    • Brentano’s reply to an article in a Viennese newspaper published anonymously a few weeks earlier. Not yet translated into English.
  • “Der neueste philosophische Versuch in Frankreich”. 1876. [“The Latest Philosophical Attempt in France”]. Reprinted in: Franz Brentano: Sämtliche veröffentlichte Schriften. Hgg. von Th. Binder und A. Chrudzimski. Volume III: Schriften zur Ethik und Ästhetik. Berlin: De Gruyter 2011, 1-17.
    • Brentano’s anonymous review of a book by T. Funck-Brentano. Not yet translated into English.
  • “Das Genie”. Berlin: Duncker & Humblot 1892. [“On Genius”]. Reprinted in: Franz Brentano: Sämtliche veröffentlichte Schriften. Hgg. von Th. Binder und A. Chrudzimski. Volume III: Schriften zur Ethik und Ästhetik. Berlin: De Gruyter 2011, 99-127.
    • Offers an exposition of his aesthetics according to which beauty is a property of acts of presentation and gives an account of the genetic preconditions of beautiful presentations. Not yet translated into English.
  • “On the Reasons for a Loss of Confidence in the Area of Philosophy”. Transl. S. Gabriel. In I. Tănăsescu et al. (eds.), Brentano and the Positive Philosophy of Comte and Mill. Berlin: De Gruyter 2022, 489-500. [first published in 1874]
    • Brentano’s Inaugural Lecture at the University of Vienna, in which he articulates his optimism about the progress in philosophy.
  • “Miklosich über subjektlose Sätze”. 1883. [Miklosich on sentences without subject terms]. Reprinted in: Franz Brentano: Sämtliche veröffentlichte Schriften. Hgg. von Th. Binder und A. Chrudzimski. Volume IX: Vermischte Schriften. Berlin: DeGruyter 2019, 105-115.
    • An appreciative review of a short treatise by Franz Miklosich, a contemporary linguist, which Brentano included (minus the final paragraph) as an appendix in Our Knowledge of Right and Wrong (1889).
  • “The Four Phases of Philosophy and Its Current State.” Trans. B. M. Mezei and B. Smith. In Balázs M. Mezei and Barry Smith (eds.), The Four Phases of Philosophy. Amsterdam: Rodopi. 1998. [first published in 1895].
    • Brentano’s exposition of his cyclic view of the history of philosophy in phases of ascent and decline.
  • “On the Future of Philosophy”. Transl. S. Gabriel. In I. Tănăsescu, et al. (eds.), Brentano and the Positive Philosophy of Comte and Mill. Berlin: DeGruyter 2022, 523-570 [first published in 1893].
    • Brentano’s very critical reply to a lecture on political education delivered by Adolf Exner on the occasion of Exner’s inauguration as rector of the University of Vienna.

c. Selected Works Published Posthumously from Brentano’s Nachlass

Since Brentano’s death, many editions of his manuscripts and lectures have been produced. Given the editorial policy of his first editors, which consisted in adapting the text to what they took to be Brentano’s considered view on these matters, some of these texts are heavily edited and not always historically reliable.

  • Philosophy of Mind and Psychology
  • Sensory and Noetic Consciousness. Transl. M. Schättle and L.L. McAlister. London: Routledge 1981.
    • A collection of manuscripts on psychology and metaphysics, misleadingly presented as the third book of the Psychology from an Empirical Standpoint.
  • Descriptive Psychology. Trans. B. Müller. London: Routledge 1995.
    • Material from the lecture courses held in Vienna in the late 1880s, which Husserl often refers to.
  • Briefwechsel über Psychophysik 1874-1878. [Correspondence on Psychophysics 1874-1878] Berlin: DeGruyter 2015. With an Introduction by Mauro Antonelli.
    • Brentano’s correspondence with Gustav Fechner.
  • Epistemology and Truth
  • Versuch über die Erkenntnis [Essay on Knowledge]. Leipzig: Felix Meiner 1925.
    • Contains an edition of the treatise Nieder mit den Vorurteilen! [Down with prejudices!], the most explicit attack of Brentano on Kant’s notion of the synthetic apriori.
  • The True and the Evident. Trans. R. M. Chisholm, I. Politzer, and K. R. Fischer. London: Routledge 1966.
    • Manuscripts and lectures from the period between 1889 and 1915 on truth and ontology, including the lecture “On the concept of truth”.
  • Logic, Ethics, and Aesthetics
  • Die Lehre vom richtigen Urteil [The Theory of Correct Judgement]. Bern: Francke 1956.
    • An edition of Brentano’s logic lectures, dealing among other things with the existential reduction of judgements.
  • The Foundation and Construction of Ethics. Trans. E. H. Schneewind. London: Routledge 1973.
    • An edition of various lectures by Brentano on practical philosophy.
  • Grundzüge der Ästhetik [Outlines of Aesthetics]. Bern: Francke 1959.
    • Brentano’s Vienna lectures on psychology and aesthetics, which include an account of the distinction between intuitive and conceptual presentations.
  • Ontology and Cosmology
  • Die Abkehr vom Nichtrealen. [The Turn Away from the Non-Real]. Bern: Francke 1966.
    • A collection of selected letters from Brentano to Marty and his later students on the motives which led Brentano to adopt Reism.
  • The Theory of Categories. Trans. R. M. Chisholm and N. Guterman. The Hague: Martinus Nijhoff 1981.
    • A collection of manuscripts of the late Brentano on metaphysics.
  • Philosophical Investigations on Space, Time, and the Continuum. Trans. B. Smith. London: Routledge and Kegan Paul 1988.
    • A collection of manuscripts of the late Brentano dealing in detail with his mereology and his conception of boundaries.
  • Religion and Christian Faith
  • On the Existence of God. Trans. S. Krantz. The Hague: Martinus Nijhoff 1987.
    • An edition of various lectures on metaphysics (especially cosmology) and on the proofs of the existence of God.
  • Religion und Philosophie [Religion and Philosophy]. Bern: Francke 1954.
    • A collection of numerous essays on cosmology and immortality, as well as on wisdom, chance and theodicy.
  • The Teachings of Jesus and Their Enduring Significance. New York: Springer 2021.
    • Further studies prepared shortly before Brentano’s death offering a final word on his conception of Christian belief.
  • On the History of Philosophy
  • Über Aristoteles [On Aristotle]. Hamburg: Felix Meiner 1986.
    • A collection of manuscripts on Aristotle, along with correspondence on related topics.
  • Geschichte der Philosophie der Neuzeit [History of Modern Philosophy]. Hamburg: Meiner 1987.
    • Material on Brentano’s lecture course on the history of philosophy from Bacon to Schopenhauer, as well as notes for his lecture on the proper method of doing history of philosophy.
  • Über Ernst Machs Erkenntnis und Irrtum [On Ernst Mach’s Knowledge and Error]. Amsterdam: Rodopi 1988.
    • Manuscripts and excerpts of lectures in which Brentano critically examines Mach’s positivism.

d. Secondary Sources

A valuable selection of earlier literature on Brentano up to 2010 can be found in the 4-volume collection Franz Brentano: Critical Assessment, edited by M. Antonelli and F. Boccaccini, Routledge 2019.

Works with an asterisk * are quoted in the text.

  • Antonelli, Mauro, and Thomas Binder (eds.) The Philosophy of Brentano. Studien zur Österreichischen Philosophie Vol. 49: Brill 2021.
  • Antonelli, Mauro, and Federico Boccaccini (eds.) Franz Brentano: Critical Assessment. Routledge Critical Assessment of Leading Philosophers. 4 volumes. Routledge 2019.
  • Antonelli, Mauro, and Federico Boccaccini. Brentano. Mente, Coscienza, Realtà. Carocci, 2021.
  • *Binder, Thomas. Franz Brentano und sein philosophischer Nachlass. DeGruyter 2019.
  • *Brandl, Johannes L. “Was Brentano an Early Deflationist about Truth?”, The Monist 100 (2017), 1-14.
  • *Brandl, Johannes L. (ed.) Brentano on Intentional Inexistence and Intentionality as the Mark of the Mental. Special Issue of Grazer Philosophische Studien. Volume 100 (2023).
  • *Citlak, Amadeusz: “Qualitative Psychology of the Brentano School and its Inspirations”, Theory and Psychology (2023), 1-22.
  • *Crane, Tim. Aspects of Psychologism. Harvard University Press, 2014.
  • Curvello, Flávio Vieira. “Brentano on Scientific Philosophy and Positivism.” Kriterion: Revista de Filosofia 62 (2021): 657-79.
  • Dewalque, Arnauld. “Brentano’s Case for Optimism.” Rivista di filosofia neoscolastica, CXL, 4 (2019): 835-47.
  • Fisette, Denis. La philosophie de Franz Brentano. Vrin 2022.
  • Fisette, Denis, and Guillaume Fréchette (eds.) Themes from Brentano. Vol. 44: Rodopi 2013.
  • Fisette, Denis, Guillaume Fréchette, and Hynek Janoušek (eds.). Franz Brentano’s Philosophy after One Hundred Years: From History of Philosophy to Reism. Springer Nature, 2021.
  • Fisette, Denis, Guillaume Fréchette, and Friedrich Stadler (eds.) Franz Brentano and Austrian Philosophy. Springer, 2020.
  • *Fréchette, Guillaume. “Brentano on Time-Consciousness.” In: U. Kriegel (ed.): The Routledge Handbook of Franz Brentano and The Brentano School. New York 2017, 75-86.
  • Fréchette, Guillaume, and Hamid Taieb (eds.) Descriptive Psychology: Franz Brentano’s Project Today. Special Issue of European Journal of Philosophy Issue 31 (2023).
  • *Gabriel, Susan. “Brentano at the Intersection of Psychology, Ontology, and the Good.” In: D. Fisette and G. Fréchette (eds.) Themes from Brentano. Brill, 2013. 247-71.
  • *Huemer, Wolfgang. “Was Brentano a Systematic Philosopher?” In: Antonelli, Mauro, and Thomas Binder. The Philosophy of Brentano. Studien zur Österreichischen Philosophie Vol. 49: Brill 2021, 11-27.
  • Kriegel, Uriah (ed.) The Routledge Handbook of Franz Brentano and the Brentano School. Taylor & Francis, 2017.
  • *Kriegel, Uriah. Brentano’s Philosophical System: Mind, Being, Value. Oxford University Press, 2018.
  • Massin, Olivier, and Kevin Mulligan. Décrire: La Psychologie De Franz Brentano. Vrin 2021.
  • *Münch, Dieter. “Die Einheit Von Geist und Leib: Brentanos Habilitationsschrift über Die Psychologie des Aristoteles als Antwort auf Zeller.” Brentano Studien. Internationales Jahrbuch der Franz Brentano Forschung 6 (1996), 125-144.
  • *Rollinger, Robin D. “Brentano’s Psychology from an Empirical Standpoint: Its Background and Conception.” In: I. Tănăsescu: Franz Brentano’s Metaphysics and Psychology (2012): 261-309.
  • Rollinger, Robin D. Concept and Judgment in Brentano’s Logic Lectures: Analysis and Materials. Vol. 48: Brill 2020.
  • *Sauer, Werner. “Brentano’s Reism.” The Routledge Handbook of Franz Brentano and the Brentano School. Routledge, 2017. 133-43.
  • Schaefer, Richard. “Learning from Lasaulx: The Origins of Brentano’s Four Phases Theory.” Franz Brentano and Austrian Philosophy. Brill (2020): 181-96.
  • *Seron, Denis. “Psychology First!” The Philosophy of Brentano. Brill, 2021. 141-55.
  • *Seron, Denis. “The Fechner-Brentano Controversy on the Measurement of Sensation.” In: D. Fisette et al. (eds.): Franz Brentano and Austrian Philosophy. Springer, 2020, 344-67.
  • *Soldati, Gianfranco. “Brentano on Self-Knowledge.” In: U. Kriegel: The Routledge Handbook of Franz Brentano and the Brentano School. Taylor & Francis, 2017, pp. 124-129.
  • Tănăsescu, Ion (ed.) Franz Brentano’s Metaphysics and Psychology. Zeta Books 2012.
  • Tănăsescu, Ion, et al (eds.) Brentano and the Positive Philosophy of Comte and Mill: With Translations of Original Writings on Philosophy as Science by Franz Brentano. De Gruyter 2022.
  • Tassone, Biagio G. From Psychology to Phenomenology: Franz Brentano’s ‘Psychology from an Empirical Standpoint’ and Contemporary Philosophy of Mind. Palgrave-Macmillan 2012.
  • Textor, Mark. Brentano’s Mind. Oxford University Press 2017.
  • Textor, Mark. “Correctness First: Brentano on Judgment and Truth.” The Act and Object of Judgment: Historical and Philosophical Perspectives. Eds. Ball, Brian Andrew and Christoph Schuringa: Routledge, 2019.
  • *Textor, Mark. “From Mental Holism to the Soul and Back.” The Monist 100.1 (2017): 133-54.
  • Textor, Mark. “That’s Correct! Brentano on Intuitive Judgement.” British Journal for the History of Philosophy 31.4 (2022): 805-24.

 

Author Information

Johannes L. Brandl
Email: johannes.brandl@plus.ac.at
University of Salzburg
Austria

and

Guillaume Frechette
Email: guillaume.frechette@unige.ch
University of Geneva
Switzerland

The Definition of Art

[Figure: Alfred Wallis (1855-1942), The Hold House Port Mear Square Island Port Mear Beach, ?c. 1932]

A definition of art attempts to spell out what the word “art” means. In everyday life, we sometimes debate whether something qualifies as art: Can video games be considered artworks? Should my 6-year-old’s painting belong to the same category as Wallis’ Hold House Port Mear Square Island (see picture)? Is the flamboyant Christmas tree at the mall fundamentally different from a Louvre sculpture? Is a banana taped to a wall really art? Definitions of art in analytic philosophy typically answer these questions by proposing necessary and sufficient conditions for an entity x to fall under the category of art.

Defining art is distinct from the ontological question of what kind of entities artworks are (for example, material objects, mental entities, abstractions, universals…). We do not, for example, need to know whether a novel and a sculpture have a distinct ontological status to decide whether they can be called “artworks.”

Definitions of art can be classified into six families. (1) Classical views hold that all artworks share certain characteristics that are recognizable within the works themselves (that is, internal properties), such as imitating nature (mimesis), representing and arousing emotions (expressivism), or having a notable form (formalism). A modified version of this last option is enjoying a revival in 21st-century philosophy, where art is said (2) to have been produced with the aim of instantiating aesthetic properties (functionalism). Classical definitions initially met with negative reactions, so much so that in the mid-twentieth century, some philosophers advocated (3) skepticism about the possibility of defining art while others critiqued the bias of the current definitions. Taking up the challenge laid out by these critics, (4) a fourth family of approaches defines art in terms of the relations that artworks enjoy with certain institutions (institutionalism) or historical practices (historicism). (5) A fifth family of approaches proposes to analyze art by focusing on specific art forms (music, cinema, painting, and so on) rather than on art in general (determinable-determinate definitions). (6) A last family claims that “art” needs to be defined by a disjunctive list of traits, with a few borrowed from classical and relational approaches (disjunctivism).

Table of Contents

  1. Some Constraints for a Definition of Art
    1. Criteria
      1. What a Definition of Art Should Include
      2. What a Definition of Art Should Exclude
      3. What a Definition of Art Should Account For
  2. Classical Definitions of Art
    1. Mimesis
      1. The Ancients
      2. The Moderns
      3. The Limits of the Mimesis Theory
    2. Expressivism
      1. Tolstoy
      2. Collingwood and Langer
      3. Limits of Expressivism
    3. Formalism
      1. Clive Bell
      2. Limits of Formalism
  3. The Skeptical Reaction
    1. Anti-Essentialist Approaches
    2. The Cluster Approach
    3. Limits of Skepticism
  4. Relational Definitions
    1. Institutionalism
      1. Advantages of Institutionalism
      2. Limits of Institutionalism
    2. Historicism
      1. Advantages of Historicism
      2. Limits of Historicism
  5. Feminist Aesthetics, Black Aesthetics, and Anti-Discriminatory Approaches
    1. The Art as Critique
    2. Philosophical Critiques
  6. Functionalist Definitions
    1. Neo-Formalist Functionalism
      1. Advantages of Neo-Formalist Functionalism
      2. Limits of Neo-Formalist Functionalism
  7. Determinable-Determinate Definitions
    1. Buck-passing
      1. Advantages of Buck-passing
      2. Limits of Buck-passing
  8. Disjunctive Definitions
    1. Symptomatic Disjunctivism
      1. Advantages of Symptomatic Disjunctivism
      2. Limits of Symptomatic Disjunctivism
    2. Synthetic Disjunctivism
      1. Advantages of Synthetic Disjunctivism
      2. Limits of Synthetic Disjunctivism
  9. Conclusion
  10. References and Further Reading
    1. References
    2. Further Reading

1. Some Constraints for a Definition of Art

The concept expressed by the word “art” may have a relatively recent, and geographically specific, origin. According to some, the semantic distinction between “art” and “crafts” emerged in Europe in the 18th century with the notion of “fine arts,” which includes music, sculpture, painting, and poetry (Kivy 1997, Chapter 1). Indeed, terms such as ars in Latin and tekhnê in Ancient Greek bear some relation to today’s concept of art but they also referred to trades or techniques such as carpentry or blacksmithing. In the Middle Ages, “liberal arts” included things such as mathematics and astronomy, not only crafts. This old meaning of “art” survives in expressions such as “the art of…”—for example, the art of opening a bottle of beer. Similar remarks can be made for related non-Western notions, such as the Hindu notion of “kala” (कला), which involves sixty-four practices, not all of which we would call artistic (Ganguly 1962). “Art” nowadays is more likely understood to mean something more restricted than these traditional meanings.

These differences in terminology do not mean that past or non-Western cultures don’t make art. On the contrary, making art is arguably a typical human activity (Dutton 2006). Moreover, the fact that a culture does not have a word co-referent with “art” does not mean that it does not have the concept of art or, at least, a concept that largely overlaps with it—see Porter (2009) against the idea that the concept of art has emerged only with the notion of “fine arts.” The following definitions of art are thus intended to apply to the practices and productions from all cultures, whether or not they possess a specific term for the concept.

Defining art typically relies on conceptual analysis. Philosophers aim to provide criteria that capture what people mean in everyday life when they talk and think about art while, at the same time, avoiding conceptual inconsistencies. This methodology goes hand in hand with a certain number of criteria that any definition of art must respect.

a. Criteria

Despite the immense diversity of definitions of art, philosophers usually agree on a set of minimum criteria that a good definition must meet in order to respect both folk and specialist uses of the term (in art history, art criticism, aesthetics…) while avoiding mere trivialities (Lamarque 2010). Three classes of criteria can be distinguished: those specifying what a definition must include, those specifying what it must exclude, and those specifying the cases that a good definition must take into account.

i. What a Definition of Art Should Include

[i] Art of a mediocre quality.

[ii] Art produced by people who do not possess the concept of art.

[iii] Avant-garde, future, or possible art (for example, extraterrestrial art).

Criterion [i] touches on what might be called the descriptive project of an art definition. As noted by Dickie (1969), we sometimes use the word “art” descriptively—“The archeologists found tools, clothes, and artworks”—and sometimes evaluatively—“Wow, mom, your couscous is a real artwork!” Dickie points out that, in the past, the descriptive and evaluative (or prescriptive) uses of the term have often been confused. Introducing mediocre art as a criterion excludes the prescriptive or evaluative use. One reason for this is the practice of folk and professional criticism: we may talk of “bad plays”, “insipid books”, or “kitsch songs”, without denying that they are artworks. Doing so also avoids confusing personal or cultural preferences with the essence of art.

Criterion [ii] avoids excluding art produced before the 18th century, non-Western art, as well as the art brut (or outsider art) produced by people whose education or cognitive conditions make it unlikely that they are familiar with the concept of art. A good definition, therefore, does not require the artist to possess the contemporary, Western notion of art to produce artworks.

Criterion [iii] implies that a good definition must do more than designate a set of entities; it must be able to play an explanatory role and make fruitful predictions (Gaut 2000; Lopes 2014). Considering the upheaval that traditional art has undergone in the twentieth century with the emergence of cinema, conceptual art, and so on, this criterion takes on particular importance: any definition that fails to be predictive is doomed to soon be outdated.

ii. What a Definition of Art Should Exclude

[iv] Purely natural objects

[v] What purports to be art but has failed completely

[vi] Non-artistic artifacts (including those with an aesthetic function)

One of the few consensuses in aesthetics is that an artwork is an artifact [iv], that is, an object intentionally created for a certain purpose. Although a tree or the feathers of a peacock possess undeniable aesthetic qualities, they are not called artworks in the descriptive sense. Note that criterion [iv], as formulated, does not necessarily exclude productions by Artificial Intelligence even if one denies that AI models are genuine creators of artifacts (see Mikalonytė and Kneer 2022 for empirical explorations of who is considered the creator of AI-generated art).

A “minimal achievement” is also required for an object to be an artwork [v]. Thus, if Sandrine attempts to play a violin sonata without ever having touched an instrument in her life, she risks failing to produce something identifiable as an artwork. It’s not that she will have played a “bad piece,” but rather that she will have failed to produce a piece of music at all.

Finally, [vi] a good definition of art must also exclude certain artifacts, despite their aesthetic qualities. Even if the maker of a shoelace, a nail, or an alarm bell may have intentionally endowed them with aesthetic properties, this does not seem sufficient to qualify them as art. It is possible to create non-artistic aesthetic artifacts. It is also possible to create shoelaces that are artworks. However, this is not, and should not be, the case for most of them.

iii. What a Definition of Art Should Account For

[vii] Borderline cases

A good definition must be able to reflect the fact [vii] that there are many borderline cases, that is, cases where it is not clear whether the concept applies, such as children’s drawings, lullabies, paintings produced by animals, Christmas trees, rituals, jokes, YouTube playthroughs, drafts… A good definition might exclude or include these cases or even account for their tricky nature; in any case, it should not remain silent about them. After all, these are the cases that most often raise the question “What is the meaning of ‘art’?”

Most contemporary definitions respect these criteria or consider the difficulties posed by some of them. This is less the case for definitions dating from before the second half of the twentieth century.

2. Classical Definitions of Art

Although the focus here is on contemporary definitions of art within the analytic philosophy tradition, it is worth taking a quick tour of the definitions that previously prevailed in the West and of the problems they face.

a. Mimesis

i. The Ancients

Plato, Aristotle, and their contemporaries grouped most of what is called “art” today under the heading of “mimesis”. In the Republic or the Poetics (for example, 1447a14-15), the focus is on works that would undeniably be considered artworks today: the poems of Homer, the sculptures of Phidias, works of music, architecture, painting, and so on.

Mimesis is an imitation or representation of the natural, in the sense that artists depict, represent, or copy movements, forms, emotions, and concepts found in nature, human beings, and even gods. The aim is not to achieve hyper-realism (à la Duane Hanson) or naturalism (à la Emile Zola), but to represent possible situations or even universals.

Perhaps surprisingly, even purely instrumental music was considered to be imitative. Beyond the representation of birdsong or the human voice, Aristotle among others believed that music could resemble the movements of the soul, and thus imitate the emotions, moods, and even character traits (vices and virtues) of sentient beings (see, for example, Aristotle’s Politics, 1340a). From that point of view, there is no such thing as non-representational art.

ii. The Moderns

The idea of mimetic art was hegemonic for around two millennia. Witness Charles Batteux who, in Les Beaux-arts réduits à un même principe (1746), popularized the concept of “fine arts” (Beaux-arts) and argued that their essence lies in their ability to represent. In particular, he sees them as an assemblage of rules for imitating what is Beautiful in Nature.

It is interesting to note that when Batteux wrote these lines, he was faced with the unprecedented development of music whose aim was not to imitate anything, as the genres of concerto, ricercare, and sonata were emerging. Batteux is opposed to the idea of non-representational music or dance, which, in his view, “goes astray” (“s’égarer”) (Batteux 1746, 1, chap. 2, §§14-15). Rousseau’s remarks (for example, 1753) on music—and, more particularly, on sonatas—also reflect the conception of art as mimetic among 18th-century thinkers.

iii. The Limits of the Mimesis Theory

The first problem with this theory is that imitation (mimesis) is not a necessary condition for the definition of art. Today, it seems clear that abstract art—that is, works that are not representational—is possible. Just think about Yayoi Kusama’s infinity nets. Thus, mimesis theorists seem to be violating the criterion [iii]: a definition of art must be able to include avant-garde, future, or possible art. Besides, Batteux’s assertion that non-representational music is defective seems to be a case where the descriptive and the prescriptive views of art are conflated. In the same vein, the notion of fine arts has also been vigorously criticized by feminist philosophers as it tends to exclude many works from the realm of genuine arts, including art practices traditionally associated with women, such as embroideries or quilts.

Nevertheless, a more charitable reading of the mimesis theory would see its criterion as going beyond obviously figurative representations. For example, Philippe Schlenker (2017) and Roger Scruton (1999) argue that all music, however abstract, represents or is a metaphor for real-world events, such as spatial movements or emotional changes. This idea captures the fact that, when listening to purely instrumental music, it’s easy to imagine a whole host of shapes or situations, sometimes cheerful and lively, sometimes melancholic and dark, that correspond more or less to the music’s rhythms, melodies, and harmony. Animations illustrating instrumental music also come to mind, as in Walt Disney’s Fantasia or Oskar Fischinger’s Optical Poem. In this broad sense, all art, even if it is not intended to be representational, can be seen as mimetic.

This attempt to save the mimesis theory can be criticized (see, for example, Zangwill (2010) on Scruton). However, even if one dismisses these criticisms, there seems to be a fatal problem: mimesis is in any case not a sufficient condition. For example, it cannot exclude non-artistic artifacts (criterion [vi]), nor can it properly account for borderline cases (criterion [vii]). Indeed, objects such as passports, souvenir key rings, or any sentence endowed with meaning are mimetic without falling into the category of art. This theory is therefore unsatisfactory.

b. Expressivism

Expressivist theories of art have been championed by many Romantic and post-Romantic philosophers, including Leo Tolstoy, Robin Collingwood, Benedetto Croce, John Dewey, and Susanne Langer. Let us start with the first, often cited as a paradigmatic representative.

i. Tolstoy

The hegemony of the mimesis theory was gradually replaced by expressivist theories during the (late) Romantic period. Tolstoy is a flamboyant example. In What Is Art? (1898; cited here from the 2023 edition), he defends the following thesis:

To evoke in oneself a feeling one has once experienced, and having evoked it in oneself, then, by means of movements, lines, colors, sounds, or forms expressed in words, so to transmit that feeling that others may experience the same feeling—this is the activity of art (Tolstoy 1898 [2023], 52).

Tolstoy’s expressivist view is particularly strong in that it implies, on the one hand, that the artist experiences a feeling and, on the other, that this feeling is evoked in the audience. Other versions of the expressivist thesis require only one or the other of these conditions, but for Tolstoy, until the audience feels what the artist feels, there is no art. Note also that, in an expressivist approach, communicating both positive and negative emotions can lead to a successful work. Thus, Francis Bacon’s tortured works are expressivist masterpieces since they communicate the painter’s near-suicidal state (see Freeland 2002, Chap. 6).

At first glance, expressivism is seductive: if we go to the movies, read a novel, or listen to a love song, is it not to undergo certain emotions? A second look, however, reveals an obvious problem: the existence of “cold” art. There seem to exist artworks whose purpose is not to communicate any affect. We can think of modern or contemporary cases, such as Malevich’s White on White or Warhol’s Empire—an 8-hour-long static shot of the Empire State Building. But we can also think of more traditional artworks, such as Albrecht Dürer’s Hare (see picture), a masterpiece of observation from nature that does not clearly meet Tolstoy’s expressivist criteria.

 

ii. Collingwood and Langer

A less demanding expressivist theory is advocated by Collingwood and refined by Langer. Unlike Tolstoy’s, Collingwood’s theory does not require the audience to feel anything: it’s enough that the artist has felt certain “emotions” and expresses them. Langer, for her part, argues that the artist does not need to feel an emotion but should be able to imagine it, and thus to have a certain knowledge of that emotion (see the IEP article on Susanne K. Langer).

According to one charitable interpretation, by “emotions” Collingwood means something similar to Susanne Langer’s notion of feelings:

The word ‘feeling’ must be taken here in its broadest sense, meaning everything that can be felt, from physical sensation, pain and comfort, excitement and repose, to the most complex emotions, intellectual tensions, or the steady feeling-tones of a conscious human life (Langer 1957, p. 15).

So, a broad category of mental states can be qualified as feelings, including being struck by a contrast of colors or a combination of sounds (Wiltsher 2018).

By “expression”, Collingwood means an exercise in “transmuting” emotion (in the broad sense) into a medium that makes it shareable, which would require an exercise of the imagination (Wiltsher 2018, 771-8). Langer’s theory of art is similar: she emphasizes that expressing feelings requires the use of a symbolic form to transmit what the artist grasps about the value that an event has for them (Langer 1967). In that sense, it’s possible that Dürer could have expressed the elements that struck him in the hare he depicts.

iii. Limits of Expressivism

Although Collingwood’s and Langer’s theories are rich and sophisticated, they seem to run up against the same kind of objections as Tolstoy’s view, since they require that the artist has actually experienced what is supposed to be expressed in the work. However, it’s doubtful that every artwork meets this criterion, and it is possible that many do not. An example is provided in Edgar Allan Poe’s The Philosophy of Composition (1846). He explains that he wrote The Raven purely procedurally, “with the precision and rigorous logic of a mathematical problem” (idem, 349), without reference to anything like feelings or emotions. Of course, Poe must have gone through certain mental states and feelings when writing this famous poem, but the point is that it is arguably not necessary that he had a feeling that he then expressed in order to create this artwork.

Even if one remains convinced that Poe and any other artist must in fact express feelings in the artworks they create, there is another problem with an expressivist theory which, as in the mimesis case, seems insurmountable: The expression of feelings is not a sufficient condition for art. Love letters, insults, emoticons, and a host of other human productions have an expressive purpose, sometimes brilliantly achieved, but this does not make them artworks. Expressivists, like mimesis theorists, seem unable to accommodate criteria [vi] and [vii]: excluding non-artistic artifacts and accounting for borderline cases (Lüdeking 1988, Chap. 1).

c. Formalism

After the Romantic period and the apex of expressivism, a new, more objectivist trend emerged, closer to the spirit of its contemporary abstract artists. Instead of focusing on the artist’s feelings, formalism attempts to define art by concentrating on formal aesthetic properties—that is, aesthetic properties that are internal to the object and accessible by direct sensation, such as the aesthetic properties of colors, sounds, or movements (see the IEP article on Aesthetic Formalism). This (quasi-)objectivist approach is infused by Kant’s view on beauty (Freeland 2002, 15)—see Immanuel Kant: Aesthetics.

Note that formalism has affinities with aesthetic perceptualism, the view that any aesthetic property is a formal aesthetic property (Shelley 2003). However, moderate formalism does not imply aesthetic perceptualism (Zangwill 2000): A formalist may accept that artworks possess non-formal relevant properties—for example, originality, expressiveness, or aesthetic properties that depend on symbols (for example, in a poem). Nevertheless, for the formalist, these properties do not define art. This is developed in section 6.

i. Clive Bell

One of the leading figures of formalism is Clive Bell (1914), followed later by Harold Osborne (1952). According to Bell, the essence of visual art is to possess a “significant form,” which is a combination of lines and colors that are aesthetically moving. This is how he introduces this idea:

There must be some one quality without which a work of art cannot exist; possessing which, in the least degree, no work is altogether worthless. What is this quality? What quality is shared by all objects that provoke our aesthetic emotions? What quality is common to Sta. Sophia and the windows at Chartres, Mexican sculpture, a Persian bowl, Chinese carpets, Giotto’s frescoes at Padua, and the masterpieces of Poussin, Piero della Francesca, and Cézanne? Only one answer seems possible–significant form (Bell 1914, 22).

A clear advantage of formalism over expressivism is that it allows us to account for what is herein defined as “cold art”. Malevich’s White on White, Warhol’s Empire, and Dürer’s Hare do not necessarily trigger strong emotions, but the arrangement of colors and lines in these works nevertheless possesses notable aesthetic properties. Similarly, Edgar Allan Poe’s poem may not express the author’s feelings, but its formal properties are noteworthy. Another notable advantage of formalism, especially over the mimesis theory, is that it readily accounts for non-representational art, contemporary as well as ancient or non-Western. Indeed, Bell was particularly sensitive to the emergence of abstract art among his contemporaries.

ii. Limits of Formalism

The first problem for formalism is that there seem to be artworks that lack formal aesthetic properties, particularly in conceptual art. The prototypical example is Duchamp’s famous Fountain (Fontaine). While some might argue that the urinal used by Duchamp possesses certain formal aesthetic properties (its immaculate whiteness or its generous curves), these are irrelevant to identifying the artwork that is Fountain (Binkley 1977). Duchamp specifically chose an object which, in his opinion, was devoid of aesthetic qualities, and, in general, his readymades can be composed of any everyday object selected by the author (a shovel, a bottle-holder…). Similarly, a performance like Joseph Beuys’ I Like America and America Likes Me seems devoid of any formal aesthetic properties: Beuys’ performance consisted mainly of being locked in a cage with a coyote for three days. Formalism is thus threatened by criterion [iii]: there are avant-garde or possible art forms that go beyond formalism. It should be noted, however, that Zangwill offers possible answers to these counterexamples, which are discussed in section 6.a.

A second problem for formalism is that the possession of formal aesthetic properties is not sufficient to be art. Again, the problem concerns criteria [vi] and [vii]: excluding non-artistic artifacts and accounting for borderline cases. As noted above (1.c.), there are a whole host of artifacts that are elegant, pretty, catchy, and so on in virtue of their perceptual properties but that are not artworks. Formalism seems unable to answer this objection (we will see how neo-formalism tries to avoid it in section 6).

3. The Skeptical Reaction

From the 1950s onwards, a general tendency against attempts to define art emerged among analytic philosophers. The main classical theories were being challenged by avant-garde art which constantly pushed back the boundaries of the concept. At the same time, a general suspicion towards definitions employing necessary and sufficient conditions emerged under the influence of Wittgenstein. In response, philosophers such as Margaret MacDonald and Morris Weitz adopted a radical attitude: they argued that art simply cannot be defined.

a. Anti-Essentialist Approaches

Wittgenstein (1953, §§ 66-67) famously endorses a form of skepticism with regard to the definition of games in terms of necessary and sufficient conditions. He postulates that there are only non-systematic features shared only by some sub-categories of the super-category ’games’—for example, one can win at football and chess…—from which a general impression of similarity among all members emerges, which he calls “a family resemblance”.

Margaret MacDonald, a student of Wittgenstein, is historically the first to take up this argument for art (see Whiting 2022). In her pioneering work, she argues that artworks should be compared “to a family having different branches rather than to a class united by common properties which can be expressed in a simple and comprehensive definition” (MacDonald 1952, 206–207). In other words, art has no essence and cannot be defined with sufficient and necessary conditions.

Several philosophers came to the same conclusion in the years following MacDonald’s publications. Among them, Morris Weitz’s (1956) argument is the most influential. It claims that the sub-categories of artworks (such as novel, sonata, sculpture…) are concepts that can be described as “open” in the sense that one can always modify their intensions, as their extensions grow due to artistic innovations. Since all art sub-categories are open, the general “art” category should be open too. For instance: is Finnegans Wake a novel (or something brand new)? Is Les Années a novel (or a memoir)? To include these works in the “novel” category, the definition of the term needs to be revised. The same applies to all the other sub-categories of artworks: they need to be revised as avant-garde art progresses. So, since the sub-categories of art cannot be closed, we cannot provide a “closed” definition of art with necessary and sufficient conditions. Weitz moreover thinks that art should not be defined since, in his view, this hinders artists’ creativity.

The view of art as family resemblances is neither a definition nor even a characterization of art—it’s essentially a negative theory. Indeed, the family-resemblance approach does not seem very promising as a positive theory of art. For instance, Duchamp’s In Advance of a Broken Arm looks more like a standard shovel than any other artwork. A naive family resemblance approach might lead to the unfortunate conclusion that either this work is not art or that all shovels are (Carroll 1999, 223). This is probably of little importance to a true skeptic who doesn’t think the category of artworks will ever be definitely set. However, more moderate approaches have attempted to bring a stronger epistemic power to the family resemblance approach.

b. The Cluster Approach

Berys Gaut (2000) agrees with MacDonald and Weitz that any attempt to define art in terms of necessary and sufficient conditions is doomed to fail. Nevertheless, he defends an approach that can guide us to minimally understand the term “art,” capture borderline cases (cf. criterion [vii]), and establish fruitful theories with other human domains (cf. criterion [iii]). This is the cluster approach, which takes the form of a disjunction of relevant properties, none of which is necessary, but which are jointly sufficient. The idea is that, for something to qualify as an artwork, it must meet a certain number of these criteria, though none of them need to be met by all artworks. Gaut provides the following list:

(1) Possessing positive aesthetic properties … (2) being expressive of emotion … (3) being intellectually challenging … (4) being formally complex and coherent … (5) having a capacity to convey complex meanings; (6) exhibiting an individual point of view; (7) being an exercise of creative imagination (being original); (8) being an artifact or performance which is the product of a high degree of skill; (9) belonging to an established artistic form (music, painting, film, and so forth); (10) being the product of an intention to make an artwork (2000, 28).

For the most part, these criteria correspond to definitions proposed by other approaches. Roughly speaking, criteria (1) and (4) correspond to formalism, (2) and (6) to expressivism, (9) and (10) correspond to relationalism, and criteria (7) and (8) correspond to Kantian art theory—see Immanuel Kant: Aesthetics.

Gaut points out that the list can incorporate other elements in such a way as to modify the content but not the global form of the account. The approach is therefore good at incorporating new types of art—criterion [iii].

c. Limits of Skepticism

Skeptical approaches, and in particular the cluster approach, have been vigorously criticized. Note, however, that the cluster approach has also inspired a non-skeptical approach called “disjunctive definitions” that is discussed in section 8.

A first and obvious objection against the cluster approach concerns the question of which criteria can be added to the list. For instance, why is the criterion “costs a lot” not on the list, while statistically, many artworks cost a lot? (Meskin 2007) More profoundly, the cluster approach has no resource to reject any irrelevant criteria—such as “has been made on a Tuesday” which can be true of some artworks. It would be absurd to elongate the disjunction infinitely with such criteria. Hence, without an element to connect the properties on the list, we run the risk of arbitrary clusters such as “flamp,” which stands for “either a flower or a lamp” (Beardsley 2004, 57). Intuitively, the term “art” is not as arbitrary as “flamp.”

Another question one may ask regarding the cluster theory is how many criteria an item must meet to be art. Some philosophical papers possess properties (3-6) and yet are not artworks (Adajian 2003). Presumably, some criteria are weighted more heavily than others, but this again leads to problems of arbitrariness (Fokt 2014).

Regarding skepticism more generally, Adajian (2003) points out that it has few resources for demonstrating that art has no essence. For instance, Dickie (1969) notices a problem with the most influential argument (by Weitz): it does not follow from the fact that all the sub-concepts of a category are open that the super-category is itself open. One can conceive of a closed super-category, such as “insect,” and sub-categories open to new, as of yet unknown, types of individuals, for example a new species of dragonfly. Conversely, Wittgenstein may be right to argue that the super-category “game” is open, but this does not mean that sub-categories such as “football,” “role-playing game,” or “chess” are open. It seems, then, that there is no necessary symmetrical relation between open or closed sub-categories and open or closed super-categories.

4. Relational Definitions

Another reaction to classical definitions emerged in the second part of the 20th century. Contrary to skepticism, it did not give up on the attempt to spell out the necessary and sufficient conditions of art. Instead, it claimed that these conditions should not be found in the internal properties of artworks, but in the relational properties that hold between artworks and other entities, namely art institutions—for institutionalism—and art history—for historicism.

a. Institutionalism

Arthur Danto, in his seminal paper “The Artworld” (1964), describes Andy Warhol’s Brillo Boxes as what inspired his institutionalist definition. These boxes, exhibited in an art gallery, are visually indistinguishable from the Brillo boxes found in supermarkets. But, crucially, the latter are not artworks. Danto concluded that the identity of an artwork depends not only on its internal properties, but also (and crucially) on relational properties such as the context of its creation, its place of exhibition, or the profession of its author. This led him to propose that artworks are constituted by being interpreted (see Irvin 2005 for a discussion). So, Warhol’s Brillo boxes are distinct from commonplace objects because they are interpreted in a specific way. This means that when Warhol made his Brillo boxes, he had an intention to inscribe the work in an artistic tradition or to depart from it, to comment on other artists, to mock them, and so on. Warhol’s interpretation of Brillo boxes is of a specific kind: it is related to the artworld, that is, to a set of institutions and institutional players made up of art galleries, art critics, museum curators, collection managers, artists, conservatories, art schools, art historians, and so on.

According to Danto, a child dipping a tie in blue paint could not, on his own, make an artwork, even if he intended to make it prettier. Picasso, on the other hand, could dip the same tie in the same pot and turn it into an artwork. He would achieve this by virtue of his knowledge of the artworld, which he intends to summon through such a gesture. As the example of the child and the tie shows, Danto’s position implies that the creator of art possesses explicit knowledge of the artworld. Danto is thus led to deny that Paleolithic paintings, such as those in the Lascaux caves, can be art—a consequence he seems happy to accept (Danto 1964, 195). This result is highly counter-intuitive given that almost everyone intuitively attributes artistic status to these frescoes. What’s more, Danto excludes the vast majority of non-Western art and art brut. These problems are discussed in more detail below (4.a.ii).

George Dickie’s (1969) institutionalist definition aims to avoid Danto’s counterintuitive results. Dickie distinguishes three meanings of the term “art”: a primary (classificatory) meaning, a secondary (derivative) meaning, and an evaluative (prescriptive) meaning. For Dickie, throughout the history of art up to, roughly, Duchamp, the three meanings were intertwined. The primary meaning of art is what Dickie seeks to define; it reflects the sense of the term which would unify all artworks. The secondary meaning refers to objects that resemble paradigmatic works; for example, a seashell may be metaphorically qualified as an “artwork” given its remarkable proportions that are also used in many artworks. Finally, the third meaning corresponds to the evaluative use of the term art, as in “Your cake is a real work of art!”. Duchamp’s great innovation was to have succeeded in separating the primary from the secondary meaning, by creating works that in no way resembled paradigmatic works of the past (secondary meaning), but which nevertheless managed to be classified as artworks (primary meaning).

Indeed, Duchamp’s Bicycle Wheel and Camille Claudel’s La petite châtelaine are not linked by a relationship of perceptual resemblance but are nevertheless linked by the fact that both sculptures have been recognized by the guarantors of the artworld as possessing certain qualities. It is this institutional recognition that would allow us to classify these two very different sculptures as artworks.

Dickie’s original definition of art is the following:

A work of art in the descriptive sense is (1) an artifact (2) upon which [a] some society or some sub-group of a society [b] has conferred [c] the status of candidate for appreciation (Dickie 1969, 254).

Condition (1) is simply the criterion that artworks are produced by people. It is in condition (2) that one finds Dickie’s most significant contribution to the debate. Let’s consider its various features: the society or sub-group to whom [a] refers should be understood as the relevant actors from the artworld (1969, 254). It may be a single person, such as the artist who created the artifact, but more commonly involves various actors such as gallerists, museum curators, critics, other artists, and so on. Condition [b] refers to the symbolic act of conferring a special status. Dickie compares it to the act of declaring someone as a candidate for alderman (idem, 255)—in a way reminiscent of Austin’s (1962) performative speech acts. This comparison illustrates that the symbolic act of conferring cannot be performed by just anyone in any context. They need to have a special role that allows them to act on behalf of the artworld.

To return to the above example from Danto, a child dipping a tie in paint doesn’t have the institutional status necessary to turn the artifact into an artwork. On the other hand, his father, a museum curator, could take hold of the tie and confer on it the status of art brut by exhibiting it. To perform this act successfully, the father must, according to Dickie, confer on it [c] the status of candidate for appreciation. This doesn’t mean that anyone needs to actually appreciate the tie, but only that, thanks to the status the curator has conferred on it, people apprehend the tie in ways that they usually apprehend other artworks so as to find the experience worthy or valuable (Dickie 1969, 255). Of course, it seems circular to define an artwork using the notion of the artworld. Dickie readily admits this. Nevertheless, he argues that this circularity is not vicious, as it is sufficiently informative: we can qualify the artworld through a whole host of notions that do not directly summon up the notion of art itself. Dickie describes the artworld through, for instance, historical, organizational, and sociological descriptions that make the notion more substantial than by describing it uninformatively in terms of the “people who make art art.”

It should also be noted that Dickie reformulated and then substantially modified his original definition in the face of various criticisms (see the section Further Reading below).

i. Advantages of Institutionalism

A definite advantage of institutionalist definitions is that their extension corresponds much better to the standard use of the term “art” than the classical definitions discussed above. Thus, since Dickie’s improvement over Danto’s initial idea, institutionalism fulfills criteria [i] to [vi], at least partially.

Indeed, it allows for the inclusion of [i] mediocre art, [ii] art produced by people who do not possess the concept “art,” and [iii] avant-garde, future, or possible art, insofar as the status of artwork is attributed by the institutions of the artworld to these works. We’ll see that this last condition is open to criticism, but in any case, it allows institutionalism to explain without any problem why Duchamp’s readymades are artworks, which is not obvious from classical or skeptical theories. Art brut, cult objects (for example, prehistoric), video games, chimpanzee paintings, music created by AIs, non-existent or extraterrestrial types of objects, but also natural objects (for example, a banana) can become art as soon as these entities are adequately recognized by the artworld. In this way, arguments such as “It’s not art, my 5-year-old could do the same” that one sometimes hears about contemporary art lose all their weight.

Institutionalism also makes it possible to exclude [iv] purely natural objects, [v] what purports to be art but has completely failed, and [vi] non-artistic artifacts, including those with an aesthetic function, as long as the status of artwork has not been conferred on these objects by the relevant institutions. Again, this last condition is debatable, but it does help to explain why Aunt Jeanne’s beautiful Christmas tree or Jeremy’s selfie on the beach are not considered artworks.

ii. Limits of Institutionalism

Levinson (1979) and Monseré (2012) argue that primitive or non-Western arts, as well as art brut, are not made with an intention related to the artworld. But they do not need to wait for pundits, a fortiori from the Western artworld, to be appropriately qualified as art. Think for instance of Aloïse Corbaz’s work before Dubuffet or Breton exhibited her, or Judith Scott’s (see picture) before her recognition in the art brut world. Through their exhibitions of art brut and non-Western art, Dubuffet and Breton did not transform non-artistic objects into artworks; rather, they helped to reveal works that were already art. Turning to another example: in the culture of the Pirahãs, a small, isolated group of hunter-gatherers living in the forests of Amazonia, there is no such thing as the artworld (Everett 2008). Yet this group produces songs whose complexity, aesthetic interest, and expressiveness clearly make them, in the eyes of many, artworks. Similar remarks can be made about prehistoric works, such as those in the Chauvet cave.

What underlies counterexamples such as art brut, non-Western, or prehistoric art is that there seem to be reasons independent of institutionalization itself that justify artworld participants institutionalizing certain artifacts and not others. In line with this idea, Wollheim (1980, 157-166) proposed a major challenge for any institutional theory: either artworld participants have reasons for conferring the status of art or they do not. If they have reasons, then these are sufficient to determine whether an artifact is art. If they do not, then institutionalization is arbitrary, and there’s no reason to take artworld participants seriously. An institutional definition is therefore either redundant or arbitrary and untrustworthy.

To revisit a previous example, the museum curator must somehow convince his peers in the artworld that his son’s tie dipped in paint is a legitimate candidate for appreciation. For this, it seems intuitive that he must be able to invoke aesthetic, expressive, conceptual, or other reasons to justify his desire to exhibit this object. But if these reasons are valid, then the institutional theory is putting the cart before the horse: it’s not because the artifact is institutionalized that it is art, but because there are good reasons to institutionalize it (see also Zangwill’s Appendix on Dickie, 2007). Conversely, if there are no good reasons to institutionalize the artifact, then the father simply had a whim. If institutionalism still allows the father to confer the status of artwork, then institutionalization is arbitrary, and participants in the artworld can never be mistaken about what counts as art.

The objections discussed have been partially answered by more recent versions of institutionalism; see notably Abell (2011) and Fokt (2017).

b. Historicism

Starting in 1979, Jerrold Levinson sought to develop a definition of art that, while inspired by the institutionalist theses of Danto and Dickie, also avoided certain of their undesirable consequences. Levinson’s theory retains the intuition that art must be defined by relational properties of the works. However, instead of basing his definition on the artworld, Levinson emphasizes a work’s relationship to art history. Noël Carroll (1993) is another well-known advocate of historicism while theories put forward by Stecker (1997) and Davies (2015) also contain a historicist element and are discussed in section 8 below. Here, the focus is on Levinson’s account, which is the oldest, most elaborate, and most influential.

Levinson summarizes his position as follows:

[A]n artwork is a thing (item, object, entity) that has been seriously intended for regard-as-a-work-of-art, i.e., regard in any way preexisting artworks are or were correctly regarded (Levinson 1989, 21).

Before delving into the details, let’s consider an example. We have seen that the institutional definitions of Danto or Dickie struggled to account for art produced by someone from a culture that lacked our concept of art (4.a.ii). Levinson (1979, 238) discusses the case of an entirely isolated individual—imagine, for instance, Mowgli from The Jungle Book. Mowgli could create something beautiful, let’s say a colored stone sculpture, with the intention, among other things, of eliciting admiration from Bagheera. Although Mowgli does not possess the concept of art, and although his sculpture has not been instituted by any representative of the artworld, his artifact is related to past works through the type of intention deployed by Mowgli.

Indeed, one can highlight at least three types of resemblance between the intention with which Mowgli created his sculpture and that of past artists (1979, 235). Firstly, Mowgli wanted to endow his sculpture with formal aesthetic properties—symmetry, vibrant colors, skillful assembly… Secondly, he aimed to evoke a particular kind of experience in his spectators—aesthetic pleasure, admiration, interest… Thirdly, he intended his spectators to adopt a specific attitude towards his sculpture—contemplate it, closely examine its skillfully assembled form, recognize the success of the color palette.

To produce art, according to Levinson, Mowgli does not need to have in mind works from the past, but his production must have been created with these types of intention—as long as it is with these types of intention that the art of the past was created.

The resemblance between an artist’s intentions and those of past artists may seem to lead Levinson to a form of circularity, but that is not the case; it leads him instead to a successive referral of past intentions to past intentions that themselves refer to older ones until arriving, at the end of the chain, at the first art productions the world has known. For Levinson, one does not need to know precisely what these first arts are. What matters is that, at some point in the prehistory of art, there are objects that can easily be called art—like the Chauvet caves (see picture) or the Venus of Willendorf.

This way of defining art can be compared to how biologists typically classify the living world. A biological genus is defined through its common origins, even though individuals of different species may have evolved divergently. The first arts are comparable to the common ancestors of species in the same genus. To know what this common ancestor looked like, one must trace the genetic lineages. But there is no need to know exactly what these ancestors looked like to know that the species belong to the same genus.

Notice that Carroll (1993) proposes a historicist definition of art that does not require an intentional connection like Levinson’s. For Carroll, there is no need for artists to have intentions comparable to those of previous artists; instead, their works must allow a connection to the past history that forms a coherent narrative. If an object can be given an intelligible place in the development of existing artistic practices, then it can also be considered art.

i. Advantages of Historicism

The first advantage of historicism over the theories reviewed so far is to explain the diversification of ways of making and appreciating art. From classical painting to horror movies, from ballet to readymades, the property of being considered-as-art gradually complexifies. Thus, the apparent impression of disparity or lack of unity in what is called “art” is attenuated. A whale, a bat, and a horse strike us as being very different, yet their common ancestor can be traced, explaining their common classification as mammals. It would be the same for a video game, a cathedral, and a performance by Joseph Beuys: although quite different, their common origin can be traced, thus explaining what binds them together. And their historical connection would be the only essential element shared by all these artworks.

Historicism also enjoys several advantages over institutionalism while being able to account for cases that institutionalism deals with successfully. The first advantage is that it sidesteps the issue concerning works not recognized by the artworld (criterion [ii]). A second advantage over institutionalism is what one may call “the primacy of artists over the public.” Danto’s or Dickie’s institutionalism implies that someone belonging to the artworld simply cannot be mistaken in their judgment that something is art. This is not the case with Levinson’s theory. Thus, an art historian who comes across an Inuit toy that was not created with artistic intentions and still catalogs it in a book as an “Inuit artwork” would be mistaken about the status of this artifact, even if, from their perspective, there are good reasons to classify it as such: according to Levinson, if this toy was not created with the right intentions, then it cannot become art just because an archaeologist or curator considers it as such (cf. Levinson’s discussion of ritual objects 1979, 237). By the same token, historicism has no problem explaining how an artifact can be an artwork before being recognized by institutions, for example, the early works of Aloïse Corbaz and Judith Scott. Before being institutionalized by art brut curators, these artists nonetheless created artifacts with the intention that they be regarded-as-works-of-art.

For other advantages of historicism, see the references in the section Further Reading below.

ii. Limits of Historicism

The first difficulty of historicism concerns the first arts. It can be introduced through the age-old problem of Euthyphro, in which Socrates shows Euthyphro that he is unable to say whether a person is pious because loved by the gods or loved by the gods because pious. Here, the question would be: Is something considered art because it fits into the history of art, or does something fit into the history of art because it is art? (Davies 1997)

Outside of first arts, Levinson’s answer is clear: something is considered art when it fits into the history of art through the intention-based relation described above. However, this answer is not possible for first arts because, by definition, they have no artistic predecessors. In his initial definition, Levinson simply stipulates that first arts are arts (Levinson 1979, 249). Consequently, he is forced to admit that first arts are art not because they fit into the history of art; hence the problem of Euthyphro.

Gregory Currie (1993) highlights a similar problem through the following thought experiment: imagine that there is a discovery on Mars of a civilization older than any that has ever existed on Earth. Long before the first arts appeared on our planet, Martians were creating objects that we would unquestionably call “art.” The reason for labeling these objects as “art” does not seem tied to the contingent history of human art but rather to ahistorical aspects.

A last problematic case that can be raised in connection with criterion [vii] concerns the intentions governing activities seemingly more mundane than art. In particular, a five-year-old child drawing a picture to hang on the fridge has a similar intention to that of a painter who wants to present their work: the five-year-old wants to evoke admiration, create an object that is as beautiful as possible, and so on. Examples can be multiplied easily. Think of the careful preparation of a Christmas tree, the elegant furnishing of a living room, a spicy diary, or vacation photos on Instagram (see Carroll 1999, 247). In all these cases, the intentions clearly resemble those of paradigmatic works of the past, but we don’t want to label all of them as artworks.

Levinson replied to some of these objections, but this goes beyond the scope of this article (see the section Further Reading below).

5. Feminist Aesthetics, Black Aesthetics, and Anti-Discriminatory Approaches

In addition to the skeptical approaches discussed in section 3, important criticisms against the project of defining art have been raised within feminist aesthetics, Black aesthetics, and anti-discriminatory approaches to art. In contrast with skepticism, these critiques constitute a constructive challenge for a definition of art—and especially for relational views (see section 4)—rather than an anti-essentialist position.

Two kinds of criticisms are distinguished: (a) those made by artists and (b) those made by philosophers and theorists of art. Art practiced by women, Black people, queer artists, or other groups underrepresented in the history of (Western) art can challenge traditional approaches to art notably by highlighting how the works and perspectives from members of these groups have been unfairly marginalized, ignored, discarded, or stripped of credibility and value. In parallel, one can find critiques from philosophers whose approach is guided by their sensitivity to discrimination, which leads them to detect problematic issues in the necessary and sufficient conditions of existing art definitions.

a. The Art as Critique

Let’s start by stating something quite obvious: not all art produced by a person who suffers from discrimination constitutes a critique of this discrimination. Similarly, not all feminist or anti-racist criticisms go in the same direction. For instance, Judy Chicago’s work The Dinner Party—a 1979 installation where plates shaped like vaginas and bearing the names of important women figures are arranged on a triangular table (see picture)—has been praised as a seminal feminist artwork but has also been criticized as naively essentialist (Freeland 2002, Chap. 5). It is thus difficult to make general statements that apply to all the relevant artworks.

That being said, much feminist art, which gained prominence during the 20th century and especially its second half, bears a certain continuity with Dadaism and conceptual art (Korsmeyer and Brand Weiser 2021). Works created by feminist artists have often radically challenged the most traditional conceptions of art (see section 2.a.)—think, for instance, of the protest art of the Guerrilla Girls or Marina Abramović’s performances.

Non-Western art and art made by marginalized communities, such as artworks by women, have also often been excluded or not considered central to the project of defining art. For this reason, some of these works can unsettle and destabilize the traditional Western conceptions of art discussed above. For instance, many cultures in Africa do not make a rigid distinction between the interpretation and creation of art, everyday practices of crafting, and “contemplation” of beauty (see Taylor 2016, 24). As Taylor notes, these diverse practices can serve as points of comparison with classical Western art for those who seek to elaborate a more general theory of art and so to respect the criteria [ii] and [iii].

It is noteworthy that philosophers have not always had this inclusive attitude towards art produced by marginalized artists. Batteux’s or later Hegel’s notion of “fine art” (see section 2.a.ii) leads to the neglect of any type of art that does not fit into the “noble” categories of painting, sculpture, architecture, music, poetry, dance, or theater. These categories exclude art practices that have been associated with women or non-Western cultures, relegating these practices to a lesser status, such as craft. Think, for instance, about Otomi embroidery—a Mesoamerican art form—or about quilts, which have traditionally been made by women. These textile creations are not among the fine art categories. It was only in the latter part of the 20th century that some quilts started to be recognized as artworks, exhibited in art galleries and museums, thanks notably to the work of feminist artists such as Radka Donnell, who have used quilts as a means to subvert traditional, male-dominated perspectives on art. Donnell even considered quilts as a liberation issue for women (for example, Donnell 1990).

The emergence of expressivism and formalism, which also challenged the fine art categories, has perhaps helped philosophers and art theorists to adopt a more universalist and hence more inclusive approach to the diversity of the arts (Korsmeyer 2004, 111). Unfortunately, it is unclear whether this helped women and discriminated minorities to be recognized as artists since the art forms favored by formalism or expressivism have remained male- and white-dominated (see Freeland 2002, Chap. 5 and Korsmeyer and Brand Weiser 2021, Sect. 4 for a discussion).

b. Philosophical Critiques

Korsmeyer and Brand Weiser (2021), two important feminist aestheticians, state that “There is no particular feminist ‘definition’ of art, but there are many uses to which feminists and postfeminists turn their creative efforts.” Similar remarks have been made for Black aesthetics (Taylor 2016, 23). As far as definitions of art are concerned, much effort in these traditions has concentrated on pointing to biases in prevalent proposals. A point of emphasis has been explaining the structures of power reflected in the concept of art, often from a Marxist perspective (Korsmeyer 2004, 109). Some philosophers working on minority viewpoints view typical attempts to provide necessary and sufficient conditions for art with suspicion, regarding them as systematically guilty of biases concerning gender, race, or class (Brand Weiser 2000, 182).

A striking example of such criticisms is the work of Peg Brand Weiser (for example, 2000). In addition to pointing to the exclusionary nature of the fine art categories that were stressed above (section 5.a.), she highlights important objections to the relational definitions (section 4). Concerning institutionalism (section 4.a.), a way to reconstruct Brand Weiser’s argument is the following. According to this theory, institutional authorities have the power to make art. However, these authorities have historically been (and still mostly are) white men who have acquired their institutional power by virtue of their male and white privilege (among other factors) and whose perspective on what should count as art is biased. Thus, the institutional definition is flawed because it inherits the biases of the authorities that decide what is art.

Note that this criticism holds even if institutions change and gradually become less patriarchal. According to institutionalism, indeed, quilts weren’t art until the 1970s—when art authorities started giving institutional accolades to this practice. However, it seems that some quilts always have been art, only of a neglected kind (cf. the counterexample of art brut in section 4.a.ii).

Brand Weiser also criticizes historicism and in particular Carroll’s version (see section 4.b.). What is art, according to this definition, depends on what has been considered to be well-established art practices in the past. However, since these practices have long been exclusive of arts practiced by marginalized and underrepresented groups, this makes the historicist definition decidedly suspect. Like the institutionalism of Danto or Dickie, this definition is flawed because it inherits historically prevalent biases.

In addition to these criticisms, Brand Weiser has also made positive proposals for a non-biased definition. She offers six recommendations that a definition of art should adopt; they can be summarized with these three points: (1) we must recognize that past art and aesthetic theories have been dominated by people with a particular taste and agenda that may suffer from racist and sexist biases, (2) the definition of fine arts is flawed, (3) “gender and race are essential components of the context in which an artwork is created and thus cannot be excluded from consideration in procedural […] definitions of ‘art’” (Brand Weiser 2000, 194).

There are potential limitations to Brand Weiser’s critique. First, concerning institutionalism, while her criticisms apply to Danto and Dickie’s versions, it should be noted that more recent versions may be able to avoid them. For instance, Abel (2011) and Fokt (2017) propose characterizations of art institutions that can exist independently of the Western art world and its white-dominated authorities (see section 4.a.ii).

Concerning historicism, while Brand Weiser’s criticisms may apply to Carroll’s version, it is less evident that they apply to Levinson’s. This is because, according to the latter, new artworks must be connected to artworks from the past through the intentions of artists. Thus, for Levinson, what makes something an artwork today is not the art practices, art genres, art institutions, famous artworks, or even written art histories. These may well be biased while the relevant artistic intentions are not so biased. Think again about our Mowgli example (section 4.b.): his colored rock sculpture is art because his intention in creating it is connected to past artistic intentions—let’s say intentions to overtly endow the sculpture with formal aesthetic properties. Arguably, these kinds of intentions need not be polluted with racism or sexism. If they are not, then Levinson’s historicism can avoid Brand Weiser’s criticism.

Regarding Brand Weiser’s positive suggestions, while (1) and (2) are important historical lessons that philosophers such as Dutton (2006), Abel (2011), Davies (2015), Fokt (2017), and Lopes (2018) have taken into account, (3) may seem too strong. Gender, social class, and sexual orientation are important sociological contextual factors, but one should resist importing them into (any) definition of art, since parity of reasoning would then require importing indefinitely many further contextual factors. For instance, Michael Baxandall (1985) shows that the intention to create, say, the Eiffel Tower could have emerged only thanks to a dozen contextual conditions such as the artist’s social circle, views on aesthetics popular at the time, the trend for positivism regarding science, the state of technological advances, and Gustave Eiffel’s interest in the technique of puddling. These too are essential components of the context in which this artwork has been created (to use Brand Weiser’s formulation), but it doesn’t seem that all such variables, and more, should be included in a definition of art, however important they are to understand the relevant artworks.

It bears repeating that feminist and anti-racist approaches do not have an essential and particular definition of art. Nevertheless, their positive proposals show that such approaches can lead to new definitions of art, definitions that would pay particular attention to avoiding the negative biases and prejudicial attitudes towards minorities and marginalized groups, attitudes that have all-too-often polluted the (Western) history of art.

6. Functionalist Definitions

A lesson that may be drawn from the discussion of relational approaches is that it is difficult to define art without reference to the non-relational or internal properties of artifacts if one wants to avoid arbitrary definitions. The family of functionalist theories resurrects the idea shared by classical definitions that if x is art, it is in virtue of a non-relational property of x. Here, what x must possess is an aesthetic function.

a. Neo-Formalist Functionalism

A significant portion of functionalist approaches resembles and is inspired by classical formalism (see section 2.c.) since its main proponents acknowledge a close connection between the aesthetic function of artworks and formal aesthetic properties directly accessible by the senses. This sub-family, however, distinguishes itself from classical formalism since it is the aesthetic function in an artifact that makes it an artwork rather than the mere presence of aesthetic properties. Let’s call this sub-family neo-formalist functionalism (or just neo-formalism).

Monroe Beardsley (1961) and more recently Nick Zangwill are the major representatives of this line of thought. Zangwill is skeptical of attempts to include the most extreme cases of contemporary art in a definition of art (2007, 33). Instead, he focuses on the pleasure most of us experience in engaging in more traditional artistic activities and on the metaphysics of formal aesthetic properties.

Formal aesthetic properties supervene on properties that can be perceived through our five senses. For instance, the elegance of a statue supervenes on its curves—that is, there can be no difference in the elegance of the statue without a difference in its curves (see Supervenience and Determination). The elegance of a statue is a formal aesthetic property since its curves are non-aesthetic properties we can perceive through our eyes or hands. This metaphysical position leads Zangwill to believe that what fascinates us primarily when we create or contemplate an artwork are formal aesthetic properties: what strikes us when we listen to music is primarily the balance between instruments, and this balance supervenes on the sound properties of the instruments and tempo; what pleases us when we post a photo on Instagram is finding the appropriate filter to make our image enchanting. By choosing a filter, we do nothing more than decide to endow our photo with formal aesthetic properties based on non-aesthetic properties that can be perceived.

Zangwill’s neo-formalist definition reflects this idea: x is an artwork if and only if (a) x is an artifact endowed with aesthetic properties supervening on non-aesthetic properties, (b) the designer of x had the insight to provide x with aesthetic properties via non-aesthetic properties, and (c) this initial intention has been minimally successful (Zangwill 1995, 307).

Criterion (a) reflects Zangwill’s metaphysical position; (b) reflects the functionalist nature of this approach: to have an aesthetic function, something needs to have been conceived to possess aesthetic properties—which excludes natural objects (see criterion [iv]). Of course, the aesthetic function thus endowed is not necessarily the only one that artworks possess; religious artifacts can have both an aesthetic and a sacred function. More importantly, just like Kant, Zangwill argues that art cannot be created simply by applying aesthetic rules; one must have insight, a moment of “aesthetic understanding” (Zangwill 2007, 44), about how aesthetic properties will be instantiated by non-aesthetic properties. Finally, (c) ensures that an artist’s intention must be minimally successful. There can be dysfunctional artworks, but not so dysfunctional that they fail to have any aesthetic properties at all—this respects the criterion [v] to exclude what purports to be art but has failed completely.

As a “moderate” neo-formalist (as he calls himself), Zangwill does not deny that properties broader than formal aesthetic properties can make an artwork overall interesting—this is the case for relational properties such as originality, for example (2000, 490). However, genuine artworks (and, in fact, most artworks) must possess formal aesthetic functions.

This neo-formalist idea, though moderate, excludes conceptual artworks, since they have no formal aesthetic properties. If we consider Duchamp’s Fountain, we see only an ordinary urinal devoid of relevant formal aesthetic properties—in fact, we would not understand Duchamp’s attempt if we interpreted it as a work aiming to display the elegance of a urinal’s curves (Binkley 1977). But recall that excluding such works is precisely the point of neo-formalism, which reacts against institutionalism. To be more specific, Zangwill considers that these works indirectly involve aesthetic functions since they refer to, or are inspired by, traditional works of art, which are, in turn, endowed with aesthetic functions. Provocatively, Zangwill labels these cases as “second-order” and even “parasite” types of artworks (2002, 113). In doing so, Zangwill accounts for the primacy of formal aesthetic properties in defining art, while at the same time accounting for contemporary art.

i. Advantages of Neo-Formalist Functionalism

The main advantage of neo-formalism lies in its success in capturing an important aspect of our daily engagement with artworks: they are “evaluative artifacts” (Zangwill 2007, 28) in the sense that, from Minoan frescos to Taeuber-Arp’s non-representational paintings, these artifacts have been positively evaluated based on the formal beauty intended by their creator. Interestingly, as Korsmeyer (2004, 111) points out, formalism (and its heir) paves the way for appreciation of work created by non-Western cultures—just as we can appreciate beautiful creations of the past.

By contrast, art theorists and philosophers still have a hard time explaining to the person on the street why the most radical cases of conceptual works, such as Comedian by Maurizio Cattelan (a banana stuck on a wall by duct tape), should be considered art. It seems that “non-aesthetic” works—works that do not aim at instantiating aesthetic properties, such as hyper-realistic sculptures or conceptual art—are indeed not considered as paradigmatic examples of artworks by laypeople. Zangwill’s ambition, in opposition to institutionalism, consists precisely in focusing on paradigmatic cases (for example, The Fall of Icarus by Bruegel the Elder, see picture) and avoiding a definition of art based on exceptions. In the same vein, Roger Pouivet (2007, 29) adds that practicing a “theory for exception” as institutionalism does is even harmful. The risk would be to provide a definition that would be a purely scholastic exercise but would no longer be related to what is the main interest of art.

Another advantage of neo-formalism is that the definition of art clearly aligns with its ontology within this theory. Zangwill’s definition is indeed based on an ontological analysis of artifacts and formal aesthetic properties. By contrast, the ontology underlying institutionalism is much less clear (Irvin and Dodd 2017).

ii. Limits of Neo-Formalist Functionalism

Limitations to the neo-formalist approach are numerous, given that its very ambition is not to capture all so-called “exceptions” of contemporary art. On paper, neo-formalism meets criterion [iii]: a new type of art can emerge as long as the artworks constituting it possess an aesthetic function. However, its rejection of many contemporary artworks is at odds with the ambition to offer a definition of art across the board. In such a case, it is good to apply the principle of reflective equilibrium, which attempts to determine a balance between the coherence of the theory and its ability to capture our intuitions.

A first notable worry concerns whether “non-aesthetic” works are considered as genuine artworks by Zangwill or as artworks “in name only.” According to his own definition, Fountain is not a genuine artwork, but Zangwill wants to account for it anyway. The status of “second-order” or “parasite” art should thus be clarified.

Another worry concerns artworks such as Tracey Emin’s My Bed (an unkempt bed littered with debris) or, to a lesser extent, Francis Bacon’s tortured works. Contrary to counterexamples such as conceptual art, these cases were realized with an aesthetic insight, namely that of providing them with negative formal aesthetic properties. Since neo-formalism focuses on positive formal aesthetic properties, it says almost nothing about these negative cases (see, by contrast, expressivism, section 2.b.). These cases are nevertheless quite complex since an artwork can possess an overall (or organic) positive value by possessing intended negative proper parts (Ingarden 1963, Osborne 1952). If one wishes to conserve the idea that art involves the insight to endow artifacts with positive aesthetic properties (which leads to positive appreciation), one must significantly complicate the neo-formalist approach. Indeed, the insight should focus on two layers of aesthetic properties: those concerning the aesthetic properties of the proper parts and those concerning the aesthetic property of the whole.

A broader objection challenges Zangwill’s metaphysical claim that formal aesthetic properties supervene on properties perceived by the five senses. Indeed, many internal aesthetic properties of literary works are central to this art form but are not directly accessible through the senses. Think for instance of the dramatic beauty of tragedy, the elegance with which a character’s psychology is depicted, or the exquisite wit of a remark in a dialogue. If this is true, this is a major objection to most metaphysical approaches to formal aesthetic properties via perceptual properties (Binkley 1977; Shelley 2003). Since a novel cannot be conceived with the intention that it possesses formal aesthetic properties supervening on perceptual properties, a novel cannot be art. Zangwill swallows this bitter pill: by extending “art” to objects that do not possess formal properties, we would commit the same mistake as calling both jadeite and nephrite “jade” (Zangwill 2002). Literature would be taken for art due to metaphorical resemblances with formal aesthetic artifacts. It is not a matter of rejecting the value of literature; it is simply a matter of denying that it has formal aesthetic properties and, therefore, that it is genuine art.

This response is highly problematic. Firstly, while it is true that narrative properties do not supervene on perceptual properties, we nevertheless enjoy them in the same way as we enjoy formal properties, namely through our sensibility (emotions, feelings, impressions…). Moreover, we attribute evaluative properties supervening on the internal properties of literary works with the same predicates used for formal properties: we speak of elegant narration, stunning style, the compelling rhythm of the plot, the sketchy psychology of a character… Why should these narrative properties be only metaphorically aesthetic since they share these two relevant features with formal aesthetic properties (Goffin 2018)? In a nutshell, narrative properties cannot be excluded from the domain of aesthetic properties just because they would constitute an exception in our ontological system—this is the “No true Scotsman” fallacy (but see Zangwill 2002, 116).

Finally, Zangwill’s definition is both too narrow and too broad. It is too narrow since it excludes literature from art although works such as Jane Eyre, The Book of Sand, or the Iliad seem to be paradigmatic examples of artworks. It is too broad since it includes any artifact created for aesthetic purposes from an insight—such as floral arrangements or even (some) cake decorations (Zangwill 1995). Such cases risk fatal consequences if we want to preserve criterion [vi] and exclude non-artistic artifacts (including those with an aesthetic function).

Patrick Grafton-Cardwell (2021) and, to a certain extent, Marcia Eaton (2000) suggest a definition of art able to keep the spirit of functionalism while bypassing the too-broad objection. They argue that the function of artworks consists in aesthetically engaging us—that is, in directing our attention to aesthetic properties.

Cake decorations may indeed be designed with the insight to endow them with aesthetic properties, but not with the intention of making us contemplate or reflect on the aesthetic properties of these cakes. One can even argue, with Eaton, that in our culture, cake decorations do not deserve the relevant aesthetic attention—while leaving the door open for another culture that would consider cake decoration as deserving it.

While the aesthetic engagement approach sidesteps a major objection to neo-formalist functionalism, it provides no resource (except our intuitions) to distinguish aesthetic engagement from other types of engagement—for example, epistemic engagement toward a philosophical question. It thus seems too uninformative and vague to have sufficient predictive power—cf. criterion [iii]. That being said, Eaton argues that aesthetic engagement concerns attention toward the internal features of an object that can produce delight. This is, however, still too vague (see Fudge 2003).

7. Determinable-Determinate Definitions

Up to this point, this article has discussed general definitions of art. To then define individual arts (for example, painting), one would have to add further conditions to the general definition. One may consider this as a genus-species view of art: the super-category has a positive definition, involving necessary and sufficient conditions, that is independent of the sub-categories. In biology, the class of mammals is defined by attributes such as being vertebrate, having warm blood, nursing their young with maternal milk, and so on. The different species belonging to this class (bats, whales, ponies, humans…) share these properties and distinguish themselves by species-specific properties. In aesthetics, art would be a super-category that contains a set of sub-categories such as literature, music, sculpture… This idea is precisely what is attacked by skeptics.

A promising strategy is to rethink the relationship between art and the specific arts through a determinable-determinate approach—where arts (painting, cinema, music…) possess determinate definitions whereas art is understood as being “one of the arts to be determined”. Take the relationship between being colored and the property of having a specific color. It is difficult, if not impossible, to define “being colored” independently of different colors. “Being colored” is defined as the disjunction “being blue” or “being burgundy” or “being saffron yellow”… Thus, “being colored” is a determinable whereas “being burgundy” is a determinate. A determinate should have a specific, independent definition, but a determinable must be defined by its determinates. If art is like color, then the responsibility for defining “art” must be transferred to specific arts.

a. Buck-passing

Dominic Lopes embraces this idea. Here is his definition:

(R) item x is a work of art if and only if x is a work in activity P and P is one of the arts. (2008, 109)

The philosopher’s task would thus be to define each specific art, and once this task is accomplished, there is no need to add anything to define the category “art.” Just as by knowing that an object is red, you thereby know that it is colored, by knowing that an entity is a musical work, you thereby know it is art, with no further explanation needed. This definition is called “buck-passing” because the definition of art is delegated to the definitions of specific arts.

As Lopes notices, this matches the idea that the generic concept of art plays a marginal role in our daily interaction with the arts. When seeking advice, do we look for an “expert in art”? That sounds pompous and somehow out of place. We rather seek advice from a photographer, a garden designer, or a book critic (see Lopes 2018, Chap. 1). As Kennick provocatively puts it:

[O]nly a man corrupted by aesthetics would think of judging a work of art as a work of art in general, as opposed to this poem, that picture, or this symphony (in Lopes 2008, 121).

This may be true even when it comes to avant-garde works. Expecting to listen to a classical musical work, the audience will certainly be baffled by John Cage’s 4′33″. The audience’s bewilderment does not come from a difficulty in considering 4′33″ as art; what baffles them is that they do not know what kind of work they are dealing with: is it a piece of music? A theatrical performance? A joke? Thus, when the audience learns that this work should be classified as being halfway between a noise piece and a conceptual artwork, their bafflement should diminish.

The elephant in the room concerning buck-passing is, therefore, the following: Which are the individual arts? How can we determine if a particular artifact belongs to one of the arts or not? How many arts are there? Without good answers to these questions, buck-passing risks being nothing more than a circular or non-informative theory.

Intuitively, the list of arts contains the “fine arts” (architecture, sculpture, painting, music, poetry), dance (the sixth art), cinema (the seventh art), and so forth. Each of these arts belongs to a medium, meaning a set of (material) resources exploited and transformed through a “technique” (Lopes 2014, 139). Typically, sculptures can be made with several materials (bronze, marble, wood…) shaped by specific tools (chisels, burins…).

However, this is not enough. Buck-passing still has to overcome two challenges: (a) disqualify non-artistic artifacts belonging to the same medium as artistic artifacts (see criterion [vi]), and (b) account for cases in which an artifact does not seem to fall under an already-known medium but we want to say it is or could be an artistic medium (see criterion [iii]).

The first challenge (a), named by Lopes the “coffee mug objection,” is arguably the most important one. Although ceramic can be an artistic medium, there are many ceramic objects, such as our everyday coffee mugs, which arguably do not qualify as art. So, a criterion must be found to distinguish ceramic objects that are artworks from those that are not. To solve this challenge, Lopes has a surprisingly simple strategy: “[artworks] are works that exploit a medium in order to realize artistic properties and values.” (Lopes 2014, 144) In other words, what distinguishes a coffee mug from a ceramic sculpture is that the latter is an “appreciative kind” of object (2014, 121): in the latter, the way resources are exploited in a medium aims to realize aesthetic or artistic properties.

Now, regarding (b), let us return to the analogy with colors. Some new media resemble mixtures of colors; they are mixes between existing media—the most well-known example being cinema. This poses no problem for buck-passers as the same reasons that make these pre-existing media arts can make the new medium an art too. However, some artifacts belong to totally innovative media—for example, Robert Barry’s Inert Gas Series, in which the artist simply released inert gases into the desert. These cases resemble the discovery of a color never seen before. Lopes admits that his approach would struggle to explain why these new media should enter the category of arts. However, he considers this a challenge not for buck-passing alone but for any account of art.

i. Advantages of Buck-passing

Let’s start by pointing out that since the notion of art does not play a central role in buck-passing, an individual (such as a non-Western individual) could produce artworks without necessarily knowing the Western notion of “art” (Lopes 2014, 78). Buck-passing also aligns with our daily experience of the arts and explains our reactions to borderline cases (such as 4′33″).

However, the main advantage of buck-passing can be found elsewhere. Since its definition of art is “decentralized,” there can be a focus on descriptive as well as evaluative specificities of specific arts. In a nutshell, buck-passing encompasses diversity regarding the ontological status and the relevant way we evaluate each art type (Lopes 2008, 125). To see why this might be fruitful, let’s look at the example of video games.

Grant Tavinor defines video games as “artifacts of a digital visual medium” created with the intention of entertaining either through the deployment of rules, “objective gameplay,” or through “interactive fiction” (Tavinor 2009, 32-33). Tavinor’s definition highlights the aesthetic specificity of video games compared to other arts: gameplay and the interactive nature of (possible) narratives. Thus, even if the video game Journey features landscapes of splendid dunes, as can be found in Lawrence of Arabia, an aesthetic specificity of Journey compared to a film comes from the fact that these dunes can be explored through a set of game rules requiring the player to press keys in a clever order and with careful timing (that is, the gameplay). The specificity of this art is at the heart of the aesthetic judgments of video game critics. It is not uncommon for a game with undeniable formal aesthetic properties to be criticized by specialists for the mediocrity of its gameplay. Conversely, games without notable formal aesthetic properties have been praised for the high interactivity of their narrative (for example, Undertale) or the depth of their gameplay (for example, Super Meat Boy).

Within genus-species approaches the focus is on the genus (“Are video games art?”) rather than on the specificities of the medium. This leads to a debate on the artistic nature of video games precisely because of the focus on shared criteria across all arts. Some scholars exclude video games from the realm of art because of their interactive nature, which would leave no room for an appropriate aesthetic attitude (or contemplation) toward art in general (Rough 2018). Others exclude them because of their gameplay: one can win or lose at video games, and since one cannot win or lose by watching a film or looking at a painting, video games cannot be art (Ebert 2010).

Buck-passing scores a point over skeptical, relationalist, or neo-formalist theories by highlighting that each art requires specific approaches. Rather than limiting its efforts to setting out abstract properties for “art” that would govern all the arts, it directly considers the diversity of the arts and is thereby better suited to accounting for their ontological and evaluative specificities (for references on the ontological diversity of the arts, see the section Further Reading below).

ii. Limits of Buck-passing

Although promising in many respects, buck-passing turns out, upon closer inspection, to fall short of providing a satisfying response to the worries raised against it.

Lopes’ response to the coffee mug problem is that arts are “appreciative kinds,” that is, media used to realize artistic properties (Lopes 2014, 144). But this dangerously resembles a neo-formalist approach “with an extra step”: [buck-passing] item x is an artwork if and only if x is a work in activity P and P is one of the arts…[formalism] an activity P is one of the arts if and only if P is a medium intentionally used to provide any x with aesthetic properties. In such a definition, the aesthetic function is distributed over P and x; ultimately, both derive their artistic character from this aesthetic function. We are thus dealing with a buck-stopping theory (Young 2016). An elegant way out would be to argue that not all arts necessarily need to possess an aesthetic function. For instance, Xhignesse (2020) suggests that new artistic media can emerge through convention. The risk, once again, would be to end up with a historicist view “with an extra step.”
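
The first objection can be laid out schematically. What follows is only a sketch: the predicate names (Art, WorkIn, ArtKind, AesthMedium) are labels introduced here for illustration and are not notation used by Lopes or Young. AesthMedium(P) abbreviates “P is a medium intentionally used to provide works with aesthetic properties.”

```latex
\begin{align*}
\text{(buck-passing)}\quad
  & \mathrm{Art}(x) \leftrightarrow \exists P\,\bigl(\mathrm{WorkIn}(x,P) \land \mathrm{ArtKind}(P)\bigr)\\
\text{(formalist step)}\quad
  & \mathrm{ArtKind}(P) \leftrightarrow \mathrm{AesthMedium}(P)\\
\text{(by substitution)}\quad
  & \mathrm{Art}(x) \leftrightarrow \exists P\,\bigl(\mathrm{WorkIn}(x,P) \land \mathrm{AesthMedium}(P)\bigr)
\end{align*}
```

On this reading, the last line defines art without any remaining appeal to the arts as such, which is one way of seeing why the resulting view is described as buck-stopping.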

Another issue concerns the way one talks about art in the public sphere, which challenges the idea that a definition of art independent of the arts is not needed. Annelies Monseré (2016) emphasizes that artists must often justify “making art” to secure funding, be exhibited, and attract the attention of critics. If an artist simply said, “I am going to make a film, therefore I am going to make art,” they would certainly not obtain funds to subsidize their project. When artists justify themselves before committees, they seem to refer to a generic definition of art that is more substantial than that of buck-passing. Monseré’s objection highlights the evaluative nature of the generic term “art”: making and consuming art engages us in a particular way. Saying that something is art simply because it belongs to an artform fails to capture this evaluative character. Lopes is aware of this since he admits that artistic mediums are “appreciative kinds.” But this brings us back to the first objection discussed.

8. Disjunctive Definitions

A last strategy for approaching the definition of art is to partially concede to skeptics such as MacDonald, Weitz, and Gaut that there are no necessary and sufficient conditions for art. However, rather than concluding that it is not possible to define art, the strategy is to give a definition of art by enumerating conditions while acknowledging that, although none of them is necessary (in contrast to regular definitions), some of them, or certain conjunctions of them, are sufficient. For a formalization and a detailed analysis of the very notion of disjunctive definition, see Longworth and Scarantino (2010).

There are two approaches within this strategy: a symptomatic approach—which lists a large number of “typical” or “symptomatic” disjoints—and a synthetic approach—which combines different definitions of art through disjunction.

a. Symptomatic Disjunctivism

In a pioneering article from 1975, E. J. Bond responded to Weitz’s skeptical theory (1956) by observing that a set of conditions can be sufficient for something to be considered art without any member of that set being necessary. Bond proposed several conditions such that, if all are satisfied by an artifact, that artifact is undoubtedly an artwork (1975, 180). An artifact meeting fewer criteria could still be considered art; however, the fewer criteria fulfilled, the more dubious its artistic status. An analogy can be drawn with the symptoms of a syndrome in psychiatry according to manuals like the DSM: a person is diagnosed with autism spectrum disorder when they exhibit certain symptoms; however, none of these particular symptoms is necessary for the diagnosis (cf. Dutton 2006, 373).

Note that this approach is disjunctive in the sense that the list of conditions for art is disjunctive, but being art requires fulfilling a conjunction of conditions. This contrasts with the synthetic disjunctive approach presented below (8.b.).

Bond’s paper was not very influential, but a similar approach has gained interest, notably through the works of Denis Dutton (especially 2006), who drew inspiration from Bond. Dutton provides a list of disjoints that he calls “criteria for recognition,” some of which are similar to Bond’s and a few of which resemble Gaut’s criteria (2000). Here is the list (we describe only the items absent from Gaut’s list; see section 3.b.):

(i) Direct pleasure: the artistic object or performance is appreciated as a source of immediately pleasing experience. (ii) Skill or virtuosity. (iii) Style: it is executed in a recognizable style, following rules of form, composition, or expression. (iv) Novelty and creativity. (v) Criticism: there is a practice of criticism, often elaborate, commenting on this type of object or performance. (vi) Representation: the object or performance represents or imitates real or imaginary experiences. (vii) “Special” focus: it tends to be set apart from ordinary life and becomes a distinct and dramatic center of attention. (viii) Expressive individuality. (ix) Emotional saturation. (x) Intellectual challenge. (xi) Art traditions and institutions. (xii) Imaginative experience: the object or performance offers an imaginative experience to producers and the audience (2006, 369-372).

i. Advantages of Symptomatic Disjunctivism

As Dutton himself points out, his disjunctive approach avoids several objections raised against Gaut’s cluster approach (2000). Although both approaches superficially resemble each other, Dutton aims at providing a genuine definition of art.

Firstly, since the symptomatic disjunctive approach does not adopt the idea of family resemblance, we can exclude the embarrassing idea that any superficial resemblance (such as “executed on a Tuesday,” see section 3.c.) could be part of the list. For example, Dutton excludes the contrast of form/substance and eccentricity criteria from his list, as these would apply to too many artifacts (2006, 373).

Furthermore, the symptomatic disjunctive approach can accept new criteria without running the risk of becoming an ad hoc theory. Even if a new criterion (xiii) were accepted, and if an artifact only fulfilled (xiii), we could not consider that artifact an artwork (2006, 375). Given the evolution of art in our societies, which Dutton believes forged criteria (i)-(xii), it is highly improbable that another criterion will appear (or already exists) that could be sufficient on its own.

ii. Limits of Symptomatic Disjunctivism

A first objection that arises when reading Dutton’s (or Bond’s) list is that finding what unifies the different disjoints is a challenge, an objection that had already been raised against Gaut (see 3.c. above).

A second objection, which applies to the cluster approach as well (see Fokt 2014), points out that the weight assigned to certain criteria is arbitrary. For instance, criterion (xii) is central for Dutton: art would be primarily linked to exercises of imagination. But why would this criterion be more central than emotional saturation (ix), as suggested by the expressivists (section 2.b.)?

b. Synthetic Disjunctivism

Whereas symptomatic disjunctivism uses recognition criteria, synthetic disjunctivism involves blending, so to speak, multiple definitions of art (see, for example, Stecker 1997, Davies 2015). Consider, for example, the definition proposed by Davies:

I propose that something is art (a) if it shows excellence of skill and achievement in realizing significant aesthetic goals, and either doing so is its primary […] identifying function, or (b) if it falls under an art genre or art form established and publicly recognized within an art tradition, or (c) if it is intended by its maker/presenter to be art and its maker/presenter does what is necessary and appropriate to realizing that intention. (Davies 2015, 377-8)

Disjoint (a) closely resembles a neo-formalist definition through the notion of aesthetic goals, while disjoint (b) borrows the notion of artistic tradition from the historicist definition, and disjoint (c) is inspired by the institutionalist idea that certain individuals can confer the status of art.

Davies’ approach does not intend to be a mere collage of different definitions; he assigns them different roles in the emergence of artistic practices and in the subsequent evolution of the concept of art. According to him, the first arts (in the sense used in section 4.b) are art by virtue of disjoint (a). The latter “does the important work of getting the art-ball rolling” (Davies 2015, 378). Disjoints (b) and (c) must be elaborated based on the historical development of the first arts. In this regard, Davies aligns with the historicist definition while encompassing the first arts under functionalism.

Note that this approach is disjunctive in the sense that the list of conditions for art is disjunctive, and being art requires meeting at least one condition, not a conjunction of symptom criteria, as in the symptomatic disjunctive approach (8.a.).
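
The contrast between the two disjunctive approaches can be sketched in logical form. The notation is introduced here purely for illustration and does not appear in Dutton or Davies: $C_1,\dots,C_{12}$ stand for Dutton’s criteria (i) to (xii), $\mathcal{S}$ is a hypothetical family of jointly sufficient sets of criteria (neither author specifies it), and $D_a, D_b, D_c$ stand for Davies’ disjoints (a) to (c). Reading the synthetic definition as a biconditional follows the gloss given above (“being art requires meeting at least one condition”); Davies’ own wording states only sufficient conditions.

```latex
% Symptomatic disjunctivism: some conjunction of criteria is sufficient,
% while no single criterion is necessary.
\[
\mathrm{Art}(x) \;\leftrightarrow\; \bigvee_{S \in \mathcal{S}} \;\bigwedge_{i \in S} C_i(x),
\qquad
\{1,\dots,12\} \in \mathcal{S},
\qquad
\text{for each } i:\ \mathrm{Art}(x) \not\rightarrow C_i(x).
\]

% Synthetic disjunctivism: meeting any one disjunct is enough.
\[
\mathrm{Art}(x) \;\leftrightarrow\; D_a(x) \lor D_b(x) \lor D_c(x).
\]
```

On this sketch, Dutton’s claim that a newly accepted criterion (xiii) would not suffice on its own amounts to saying that the singleton set containing (xiii) would not belong to $\mathcal{S}$.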

i. Advantages of Synthetic Disjunctivism

Synthetic disjunctivism has numerous advantages, especially when compared to the formalist, institutionalist, and historicist definitions from which it draws inspiration. For instance, it avoids the Euthyphro problem posed to historicism (see section 4.b.ii.) by clearly assuming that the first arts are art in virtue of their aesthetic functions. In other words, the Euthyphro problem is resolved as follows: something is art (a) because it possesses a formal function that makes it art or (b) because it fits into the history of art. Consequently, it also bypasses Currie’s (1993) objection about Martian art (see section 4.b.ii.). Martian art may not share the same history, but it should have the same “formalist” beginning.

Synthetic disjunctivism also has an advantage over neo-formalism: it can account for the fact that certain works lack formal quality without having to label them as secondary (or parasitic) arts or exclude them from the realm of art (see section 6.b.ii).

Finally, and perhaps most importantly, the synthetic disjunctive approach has strong descriptive powers: thanks to its disjunctive form, it has no problem accounting for neglected art genres and non-Western art traditions that some of the above definitions wrongly excluded (see Sections 4.a.ii and 5 above), such as the sophisticated make-up and masks of Kathakali dancers. In fact, one can hardly find a case of artwork that escapes this definition.

ii. Limits of Synthetic Disjunctivism

The encompassing aspect of synthetic disjunctivism is a double-edged sword. By borrowing characteristics from previous theories, it inherits some of their shortcomings. Davies’s definition thus accumulates difficulties of overgeneralization from functionalism, institutionalism, and historicism: it struggles to exclude certain non-artistic artifacts and activities or to account for borderline cases (criteria [vi] and [vii]). For example, disjoint (c) imports from institutionalism the problem of the zealous curator who decides to bring an object into the artworld with arbitrary authority (see section 4.a.ii). And with disjoint (a), it seems to include everyday objects such as flower arrangements, similarly to neo-formalism (section 6.a.ii).

A second objection argues that the synthetic disjunctive definition is not a theory in itself. Indeed, Davies explains how the notion of art has become more complex, transitioning from a practice that a formalist definition can capture to practices requiring a disjunctive definition. However, formalism and institutionalism are opposing theories with antithetical insights. Bringing together functionalism, institutionalism, and historicism in a single disjunction does capture a maximum number of cases but at the expense of the unity offered by each of the definitions taken separately. Davies could counter that this is not a problem; after all, if the concept of art is rich and complex, it is precisely because it has a rich history that has led to a disjunctive understanding of the concept. We have already seen this strategy applied to symptomatic disjunctivism (section 8.a.i.).

9. Conclusion

This overview of definitions of art leads us to identify four broad strategies that philosophers have employed: a focus on the internal elements of artistic artifacts (classical definitions; functionalism); a focus on the relational elements of artistic artifacts (institutionalism, historicism); an emphasis on the artistic media rather than art in general (determinable-determinate definitions); and the combination of both internal and relational elements (disjunctive definitions). Opposing these four strategies are the skeptics, for whom art is indefinable; at best, one can provide typical characteristics (family resemblance) or an open cluster of properties. Each of these approaches has its advantages and drawbacks, reflecting its contributions to the literature as well as its limitations.

The most important advantage of classical and functionalist definitions seems to be their intuitive simplicity. Aesthetics is the science of beauty, and artworks are a subcategory of objects studied by aestheticians for their formal beauty. An important drawback of these definitions is their inadequate extension, which is both too broad and too narrow compared to common conceptions of art. A definition that categorizes all songs hummed while doing the dishes as artworks and simultaneously excludes cold art, abstract art, or literature is problematic.

A major advantage of relationalist definitions seems to be their great adaptability to borderline cases and new types of art. Inevitably, new forms of art will emerge and are emerging; institutionalism has clear criteria for accepting them. A crucial drawback of these definitions is the room left for arbitrariness, concerning either the authorities capable of institutionalizing an artifact or the indefinable nature of first arts.

One major advantage of determinable-determinate definitions is their ability to capture the aesthetic and ontological specificities of each art. Except for Oscar Wilde, no one can claim to be an expert “in art”; each domain requires a particular approach and expertise. One drawback of these definitions is the difficulty of finding a determinable-determinate approach to the arts that is actually not reducible to another definition “with an extra step.”

Disjunctive definitions have the advantage of being inclusive—a quality that helps address the challenges raised by anti-discriminatory approaches to the definition of art. It seems reasonable to think that a good definition of art does not contain a single criterion but a set of internal and relational elements. However, a major drawback of these definitions is the fact that they are over-encompassing since they fail to exclude many kinds of non-artistic artifacts.

Faced with these definitions, skepticism seems to stand out by predicting that a positive definition of art cannot be given. However, skeptical theories have not proven that a good definition is impossible and even less that this project is a failure. Rather, the positive theories, with their commendable though imperfect efforts, have yielded many insights by revealing hidden complexities in the concept of art. Contrary to skeptical predictions, it seems that progress has been made in understanding what art is.

10. References and Further Reading

a. References

  • Abell, Catharine. 2011. “Art: What It Is and Why It Matters.” Philosophy and Phenomenological Research 85 (3): 671–91.
  • Adajian, Thomas. 2003. “On the Cluster Account of Art.” The British Journal of Aesthetics 43 (4): 379–85.
  • Austin, John L. 1962. How to Do Things with Words. Oxford: Clarendon Press.
  • Batteux, Charles. 1746. Les Beaux arts réduits à un même principe. Paris: Chez Durand.
  • Baxandall, Michael. 1985. Patterns of Intention: On the Historical Explanation of Pictures. Yale University Press.
  • Beardsley, Monroe C. 1961. “The Definitions of the Arts.” Journal of Aesthetics and Art Criticism 20 (2): 175–87.
  • Beardsley, Monroe C. 2004. “The Concept of Literature.” In The Philosophy of Literature: Contemporary and Classic Readings—An Anthology, edited by Eileen John and Dominic McIver Lopes, 1st edition, 51–58. Malden, MA: Wiley-Blackwell.
  • Bell, Clive. 1914. Art. London: Chatto and Windus.
  • Binkley, Timothy. 1977. “Piece: Contra Aesthetics.” Journal of Aesthetics and Art Criticism 35 (3): 265–77.
  • Bond, Edward J. 1975. “The Essential Nature of Art.” American Philosophical Quarterly 12 (2): 177–83.
  • Brand Weiser, Peg Zeglin. 2000. “Glaring Omissions in Traditional Theories of Art.” In Theories of Art Today, edited by Noël Carroll, 175–98. Wisconsin: University of Wisconsin Press.
  • Carroll, Noël. 1993. “Historical Narratives and the Philosophy of Art.” Journal of Aesthetics and Art Criticism 51 (3): 313–26.
  • Carroll, Noël. 1999. Philosophy of Art: A Contemporary Introduction. London ; New York: Routledge.
  • Collingwood, Robin G. 1938. The Principles of Art. Oxford: Oxford University Press.
  • Currie, Gregory. 1993. “Aliens, Too.” Analysis 53 (2): 116–18.
  • Danto, Arthur. 1964. “The Artworld.” The Journal of Philosophy 61 (19): 571–84.
  • Davies, Stephen. 1997. “First Art and Art’s Definition.” Southern Journal of Philosophy 35 (1): 19–34.
  • Davies, Stephen. 2015. “Defining Art and Artworlds.” The Journal of Aesthetics and Art Criticism 73 (4): 375–84.
  • Dickie, George. 1969. “Defining Art.” American Philosophical Quarterly 6 (3): 253–56.
  • Donnell, Radka. 1990. Quilts as women’s art: A quilt poetics. North Vancouver, BC: Gallerie Publications.
  • Dutton, Denis. 2006. “A Naturalist Definition of Art.” The Journal of Aesthetics and Art Criticism 64 (3): 367–77.
  • Eaton, Marcia. 2010. “A Sustainable Definition of ‘Art.’” In Theories of Art Today, edited by Noël Carroll, 141–59. Wisconsin: University of Wisconsin Press.
  • Ebert, Roger. 2010. “Video Games Can Never Be Art.” RogerEbert.com. https://www.rogerebert.com/rogers-journal/video-games-can-never-be-art.
  • Everett, Daniel. 2008. Don’t Sleep, There Are Snakes. London: Profile Books.
  • Fokt, Simon. 2014. “The Cluster Account of Art: A Historical Dilemma.” Contemporary Aesthetics 12.
  • Fokt, Simon. 2017. “The Cultural Definition of Art.” Metaphilosophy 48 (4): 404–29.
  • Freeland, Cynthia. 2002. But Is It Art?: An Introduction to Art Theory. Oxford, New York: Oxford University Press.
  • Fudge, Robert. 2003. “Problems with Contextualizing Aesthetic Properties.” Journal of Aesthetics and Art Criticism 61 (1): 67–70.
  • Ganguly, Anil Baran. 1962. Sixty-Four Arts in Ancient India. English Book Store, New Delhi.
  • Gaut, Berys. 2000. “‘Art’ as a Cluster Concept.” In Theories of Art Today, edited by Noël Carroll, 25–44. Wisconsin: University of Wisconsin Press.
  • Goffin, Kris. 2018. “The Affective Experience of Aesthetic Properties.” Pacific Philosophical Quarterly 0 (0): 1–18.
  • Grafton-Cardwell, Patrick. 2021. “The Aesthetic Engagement Theory of Art.” Ergo an Open Access Journal of Philosophy 8 (0).
  • Ingarden, Roman. 1964. “Artistic and aesthetic values.” The British Journal of Aesthetics 4 (3): 198–213.
  • Irvin, Sherri. 2005. “Interpretation et description d’une oeuvre d’art.” Philosophiques 32 (1): 135–48.
  • Irvin, Sherri, and Julian Dodd. 2017. “In Advance of the Broken Theory: Philosophy and Contemporary Art.” Journal of Aesthetics and Art Criticism 75 (4): 375–86.
  • Kivy, Peter. 1997. Philosophies of Arts: An Essay in Differences. Cambridge ; New York: Cambridge University Press.
  • Korsmeyer, Carolyn. 2004. Gender and Aesthetics: An Introduction. New York: Routledge.
  • Korsmeyer, Carolyn, and Peg Brand Weiser. 2021. “Feminist Aesthetics.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Winter 2021. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/win2021/entries/feminism-aesthetics/.
  • Lamarque, Peter. 2010. Work and Object: Explorations in the Metaphysics of Art. Oxford, New York: Oxford University Press.
  • Langer, Susanne. 1942. Philosophy in a New Key: A Study in the Symbolism of Reason, Rite and Art. Cambridge, MA: Harvard University Press.
  • Langer, Susanne. 1957. Problems of Art: Ten Philosophical Lectures. New York: Charles Scribner’s.
  • Langer, Susanne. 1967. Mind: An Essay on Human Feeling. Johns Hopkins Press.
  • Levinson, Jerrold. 1979. “Defining Art Historically.” British Journal of Aesthetics 19 (3): 232–50.
  • Levinson, Jerrold. 1989. “Refining Art Historically.” Journal of Aesthetics and Art Criticism 47 (1): 21–33.
  • Longworth, Francis, and Scarantino, Andrea. 2010. “The Disjunctive Theory of Art: The Cluster Account Reformulated.” The British Journal of Aesthetics 50 (2): 151–67.
  • Lopes, Dominic McIver. 2008. “Nobody Needs a Theory of Art.” The Journal of Philosophy 105 (3): 109–27.
  • Lopes, Dominic McIver. 2014. Beyond Art. Oxford ; New York, NY: OUP Oxford.
  • Lopes, Dominic McIver. 2018. Being for Beauty: Aesthetic Agency and Value. Oxford, New York: Oxford University Press.
  • Lüdeking, Karlheinz. 1988. Analytische Philosophie Der Kunst. Frankfurt am Main: Athenäum.
  • MacDonald, Margaret. 1952. “Art and Imagination.” Proceedings of the Aristotelian Society 53: 205–26.
  • Meskin, Aaron. 2007. “The Cluster Account of Art Reconsidered.” The British Journal of Aesthetics 47 (4): 388–400.
  • Mikalonytė, Elzė Sigutė, and Markus Kneer. 2022. “Can Artificial Intelligence Make Art?: Folk Intuitions as to Whether AI-Driven Robots Can Be Viewed as Artists and Produce Art.” ACM Transactions on Human-Robot Interaction 11 (4): 43:1-43:19.
  • Monseré, Annelies. 2012. “Non-Western Art and the Concept of Art: Can Cluster Theories of Art Account for the Universality of Art?” Estetika 49 (2): 148–65.
  • Monseré, Annelies. 2016. “Why We Need a Theory of Art.” Estetika 53 (2): 165–83.
  • Osborne, Harold. 1952. Theory of Beauty: An Introduction to Aesthetics. Routledge and K. Paul.
  • Poe, Edgar Allan. 1846. “The Philosophy of Composition.” Graham’s Magazine 28 (4): 163–67.
  • Porter, James I. 2009. “Is Art Modern? Kristeller’s ‘Modern System of the Arts’ Reconsidered.” British Journal of Aesthetics 49 (1): 1–24.
  • Pouivet, Roger. 2007. Qu’est-ce qu’une œuvre d’art. Chemins philosophique. Paris: Librairie Philosophique Vrin.
  • Rough, Brock. 2018. “The Incompatibility of Games and Artworks.” Journal of the Philosophy of Games 1 (1).
  • Rousseau, Jean-Jacques. 1753. Lettre sur la musique françoise. Unidentified editor.
  • Schlenker, Philippe. 2017. “Outline of Music Semantics.” Music Perception: An Interdisciplinary Journal 35 (1): 3–37.
  • Scruton, Roger. 1999. The Aesthetics of Music. Oxford, New York: Oxford University Press.
  • Shelley, James. 2003. “The Problem of Non-Perceptual Art.” British Journal of Aesthetics 43 (4): 363–78.
  • Stecker, Robert. 1997. “Artworks: Definition, Meaning, Value.” Journal of Aesthetics and Art Criticism 56 (3): 311–13.
  • Tavinor, Grant. 2009. The Art of Videogames. New Directions in Aesthetics 10. Malden, Mass.: Wiley-Blackwell.
  • Taylor, Paul C. 2016. Black Is Beautiful: A Philosophy of Black Aesthetics. 1st edition. Chichester, West Sussex: Wiley-Blackwell.
  • Tolstoy, Leo. 1898 [2023]. What is art?. Germany: Culturae.
  • Weitz, Morris. 1956. “The Role of Theory in Aesthetics.” The Journal of Aesthetics and Art Criticism 15 (1): 27–35.
  • Whiting, Daniel. 2022. “Margaret Macdonald on the Definition of Art.” British Journal for the History of Philosophy 30 (6): 1074–95.
  • Wiltsher, Nick. 2018. “Feeling, Emotion and Imagination: In Defence of Collingwood’s Expression Theory of Art.” British Journal for the History of Philosophy 26 (4): 759–81.
  • Wittgenstein, Ludwig. 1953. Philosophical Investigations. Translated by G. E. M. Anscombe. 3. ed, 2000. Englewood Cliffs, N.J: Prentice Hall.
  • Wollheim, Richard. 1980. Art and Its Objects: With Six Supplementary Essays. Cambridge University Press.
  • Xhignesse, Michel-Antoine. 2020. “What Makes a Kind an Art-Kind?” British Journal of Aesthetics 60 (4): 471–88.
  • Young, James O. 2016. “The Buck Passing Theory of Art.” Symposion: Theoretical and Applied Inquiries in Philosophy and Social Sciences 3 (4): 421–33.
  • Zangwill, Nick. 1995. “The Creative Theory of Art.” American Philosophical Quarterly 32 (4): 307–23.
  • Zangwill, Nick. 2000. “In Defence of Moderate Aesthetic Formalism.” The Philosophical Quarterly (1950-) 50 (201): 476–93.
  • Zangwill, Nick. 2002. “Are There Counterexamples to Aesthetic Theories of Art?” Journal of Aesthetics and Art Criticism 60 (2): 111–18.
  • Zangwill, Nick. 2007. Aesthetic Creation. Oxford, New York: Oxford University Press.

b. Further Reading

This sub-section provides further references that were not discussed for reasons of space and of accessibility. These references contain elaborations on most of the theories discussed in this article and many also respond to objections raised above.

  • Expressivism:
  • Croce, Benedetto. 1902. Estetica come scienza dell’espressione e linguistica generale, Florence: Sandron.
  • Dewey, John. 1934. Art as Experience. New York: Capricorn Books.
  • Skepticism:
  • Gaut, Berys. 2005. “The Cluster Account of Art Defended.” The British Journal of Aesthetics 45 (3): 273–88.
  • Institutionalism:
  • Danto, Arthur. 1981. The Transfiguration of the Commonplace: A Philosophy of Art. Cambridge: Harvard University Press.
  • Dickie, George. 1974. Art and the Aesthetic: An Institutional Analysis. Vol. 86. Cornell University Press.
  • Dickie, George. 1984. “The New Institutional Theory of Art.” Proceedings of the 8th Wittgenstein Symposium, no. 10: 57–64.
  • Historicism:
  • Carney, James D. 1994. “Defining Art Externally.” British Journal of Aesthetics 34 (2): 114–23.
  • Levinson, Jerrold. 1993. “Extending Art Historically.” Journal of Aesthetics and Art Criticism 51 (3): 411–23.
  • Levinson, Jerrold. 2002. “The Irreducible Historicality of the Concept of Art.” British Journal of Aesthetics 42 (4): 367–79.
  • Pignocchi, Alessandro. 2012. “The Intuitive Concept of Art.” Philosophical Psychology 27 (3): 425–44.
  • Functionalism:
  • Zangwill, Nick. 2001. The Metaphysics of Beauty. Ithaca, NY: Cornell University Press.
  • Determinable-Determinate Definitions, on the Ontological Diversity of the Arts:
  • Davies, Stephen. 2009. “Ontology of Art.” In The Oxford Handbook of Aesthetics, edited by Jerrold Levinson. Vol. 1. Oxford University Press.
  • Kania, Andrew. 2005. “Pieces of Music: The Ontology of Classical, Rock, and Jazz Music.” University of Maryland. http://drum.lib.umd.edu/handle/1903/2689.
  • Walton, Kendall L. 1970. “Categories of Art.” The Philosophical Review 79 (3): 334–67.
  • Disjunctivism:
  • Davies, Stephen. 2007. Philosophical Perspectives on Art. Oxford: Oxford University Press.
  • Davies, Stephen. 2012. The Artful Species: Aesthetics, Art, and Evolution. Oxford, New York: Oxford University Press.
  • Dutton, Denis. 2000. “But They Don’t Have Our Concept of Art.” In Theories of Art Today, edited by Noël Carroll, 217–40. Wisconsin: University of Wisconsin Press.
  • Dutton, Denis. 2009. The Art Instinct: Beauty, Pleasure, and Human Evolution. Oxford: Oxford University Press.
  • Stecker, Robert. 2000. “Is It Reasonable to Attempt to Define Art?” In Theories of Art Today, edited by Noël Carroll, 45–64. Wisconsin: University of Wisconsin Press.

 

Authors Information

Constant Bonard
Email: constant.bonard@gmail.com
University of Bern
Switzerland

and

Steve Humbert-Droz
Email: steve.humbert.droz@gmail.com
Umeå University
Sweden

 

Al-Ghazālī (c. 1056–1111)

Al-Ghazālī did not regard himself as a philosopher. During his period in Islamic intellectual history, philosophy was associated with the Aristotelian tradition promulgated primarily by Avicenna (Ibn Sina), and al-Ghazālī regarded Avicenna as an unbeliever whose philosophical views (such as his commitment to the eternity of the world) fell outside the scope of orthodox Sunni Islam. From the perspective of Islamic orthodoxy, a serious stigma would therefore have attached to al-Ghazālī’s identifying with the philosophers. Instead, al-Ghazālī regarded himself primarily as a Sufi (mystic), theologian, and jurist.

Yet despite his aversion to particular philosophical theses, it is clear that al-Ghazālī was not only sympathetic to particular disciplines and methodologies of philosophy (for example, logic and ethics) but also produced work that would certainly qualify as philosophical both in his day and in ours. Indeed, he contributed immensely to the history of Islamic philosophy and the history of philosophy more generally, and he is considered to be one of the greatest and most influential thinkers in Islamic intellectual history. Al-Ghazālī’s philosophical work spans epistemology, metaphysics, philosophy of mind, natural philosophy, and ethics. His philosophical work had a wide-reaching influence within the Islamic world, and his Incoherence of the Philosophers, in particular, was well received by other medieval philosophers and by the Latin philosophical tradition.

Table of Contents

  1. Life and Works
  2. Skepticism in the Deliverance from Error
    1. Motivations
    2. Sensory Perception Doubt and Dream Doubt
    3. Resolution of Skepticism
    4. Al-Ghazālī and Descartes
  3. Assessment of Philosophy
    1. Materialists, Naturalists, and Theists
    2. The Philosophical Sciences
  4. Incoherence of the Philosophers
    1. The Eternity of the World
    2. God’s Knowledge: Universals vs. Particulars
    3. Bodily Resurrection
    4. Causation and Occasionalism
    5. Averroes’ Response to the Charge of Unbelief
  5. Revival of the Religious Sciences
    1. The Heart
    2. The Intellect
    3. Sufism and Ethics
    4. Theodicy: The Best of All Possible Worlds
  6. Islamic Philosophy after Al-Ghazālī
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life and Works

Al-Ghazālī, who holds the title of the “Proof of Islam,” was a Persian-Islamic jurist, mystic, theologian, and philosopher, born c. 1058 in Tus, Khorasan (a region in the northeast of modern-day Iran, near Mashhad). According to tradition, al-Ghazālī’s father, before he died, left the young al-Ghazālī and his brother Ahmad to the tutelage of an Islamic teacher and Sufi; after this teacher died, al-Ghazālī was eventually transferred to a local madrasa (Islamic school), where he continued his religious studies. After this early period of Islamic learning, al-Ghazālī received an advanced education in the Islamic sciences, particularly theology, from Imam al-Juwaynī (the leading Ash‘arite theologian of the day) in Nishapur.

An early experience that shaped al-Ghazālī’s scholarly character occurred when his caravan was raided during his travels. Al-Ghazālī demanded from the leader of the robbers that his notebooks and texts not be seized from him. When the robber asked al-Ghazālī what these writings were, al-Ghazālī responded by saying, “My writings contain all of the knowledge that I have acquired during my travels.” The leader then asked al-Ghazālī: “If you claim that you possess this knowledge, then how can I take it away from you?” Al-Ghazālī was deeply moved by this question, regarding the robber’s utterance as divinely sent, and afterwards resolved to commit everything he studied to memory.

His reputation as a young scholar eventually led him to be appointed as a professor at the Nizamiyya College in Baghdad in 1091 by Niẓām al-Mulk. Although al-Ghazālī was highly successful as a professor, accruing a large following of students (he claims at one point that he had 300 students), during the period of his professorship it seems that he grew skeptical of his intentions with respect to teaching. That is, he questioned whether his professional ambitions were truly for the sake of God or personal aggrandizement, and he ultimately underwent a skeptical and spiritual crisis (this crisis is distinct from the skeptical crisis discussed in section 2). As he describes it, this crisis had psychosomatic effects, preventing him from speaking and thereby teaching his classes. Resolved to purify his intentions and recover an authentic experience of Islam, al-Ghazālī left his post and his family in Baghdad in 1095 to travel for ten years pursuing and cultivating the Sufi or mystical way of life. He visited Damascus, Jerusalem, and Hebron; and made a pilgrimage to Mecca and Medina. During this time, he writes that:

My only occupation was seclusion and solitude and spiritual exercise and combat with a view to devoting myself to the purification of my soul and the cultivation of virtues and cleansing my heart for the remembrance of God Most High, in the way I had learned from the writings of the sufis. (Deliverance: 80)

Al-Ghazālī eventually returned to teaching in Nishapur (Northeast Iran), seemingly motivated to rid the masses of theological, spiritual, and philosophical confusions, and convinced by his colleagues that he could help revive the true theory and practice of Islam. However, he eventually retired to Tus, where he established a Sufi school, and lived a quiet life of scholarly work and meditation. He died in 1111 at the age of 55 in Tus.

Al-Ghazālī is regarded as one of the most important intellectuals in all of Islamic history. He is most well-known for his work Revival of the Religious Sciences, a work divided into forty books which covers creed, theology, mysticism, ethics, and jurisprudence. Al-Ghazālī’s philosophical and theological work emerges in texts such as Deliverance from Error, Incoherence of the Philosophers, and Niche of Lights.

2. Skepticism in the Deliverance from Error

Written in the style of a spiritual and intellectual autobiography (Menn 2003), Deliverance from Error is one of al-Ghazālī’s best-known texts. It covers a number of topics, ranging from the errors of the Islamic philosophers (in particular, al-Farabi and Avicenna) and the nature of prophecy to the truth of Sufism. However, it has attracted the attention of historians of philosophy especially for its engagement with skepticism in the first part of the text. This section discusses al-Ghazālī’s skeptical arguments, his solution to skepticism, and the popular question of the similarities between al-Ghazālī’s skepticism and that of Descartes.

a. Motivations

Before introducing his skeptical arguments, al-Ghazālī first explains his motivations for engaging skepticism and then offers a theory of knowledge that grounds a strategy for determining whether a subject’s belief in a proposition amounts to knowledge. Al-Ghazālī claims that from a young age, he had a strong desire to seek knowledge: “The thirst for grasping the real meaning of things was indeed my habit and wont from my early years and in the prime of my life” (Deliverance: 54). However, the following observation raised a nagging doubt for him as to whether this desire would lead him to objective knowledge:

The children of Christians always grew up embracing Christianity, and the children of Jews grew up adhering to Judaism, and the children of Muslims always grew up following the religion of Islam. (Ibid. 55)

The worry that al-Ghazālī has is that what we take to be knowledge (for example, a religious claim to truth) might be due more to parental or societal conditioning (blind imitation or taqlid) than to objective epistemic standards. This led al-Ghazālī to inquire into the true nature of knowledge. So, what then are the objective standards for knowledge? Al-Ghazālī writes:

Sure and certain knowledge is that in which the thing known is made so manifest that no doubt clings to it, nor is it accompanied by the possibility of error and deception. (Ibid.)

Al-Ghazālī’s account of knowledge here has received sophisticated treatment in the literature (See Albertini 2005; Hadisi 2022). It is also important to note that this is not the only type of knowledge that al-Ghazālī recognizes. For example, later in Deliverance from Error al-Ghazālī will identify other types of knowledge, such as dhawq or fruitful experience (a type of knowledge-by-acquaintance—see Götz 2003).

However, an operative definition of Ghazālīan knowledge, at least as it is used in this part of the Deliverance from Error, is the following: A subject, S, knows that p if and only if: (1) S believes that p, (2) p is true, and (3) p is absolutely certain. Certainty is the key criterion here, as for al-Ghazālī there can be no room for doubt in a genuine item of knowledge. (For an in-depth analysis of al-Ghazālī’s conception of absolute certainty, see Hadisi 2022.)
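
One way to make this operative definition explicit is to write it symbolically. The following is only an illustrative sketch, not al-Ghazālī’s own notation: the symbols K_S, B_S, and C_S are assumptions introduced here for convenience.

```latex
% Illustrative notation (assumed for this sketch, not from the source):
%   K_S(p): S knows that p
%   B_S(p): S believes that p
%   C_S(p): p is absolutely certain (indubitable) for S
\[
  K_S(p) \iff B_S(p) \,\land\, p \,\land\, C_S(p)
\]
% Consequence used in the strategy of doubt: a dubitable belief is not knowledge.
\[
  \neg C_S(p) \implies \neg K_S(p)
\]
```

On this rendering, the second line follows directly from the first, which is why showing that a belief can be doubted is enough to disqualify it as knowledge.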

With this definition of knowledge in hand, al-Ghazālī now devises a strategy for determining whether his beliefs—particularly sensory beliefs and rational beliefs—amount to knowledge. If any proposition or belief lacks certainty (that is, is dubitable) then it cannot count as knowledge: “Whatever I did not know in this way and was not certain of with this kind of certainty was unreliable and unsure knowledge” (See also Book of Knowledge: 216). Al-Ghazālī then turns to his sensory and rational beliefs to see if they can meet his standard for knowledge.

b. Sensory Perception Doubt and Dream Doubt

With this theory of knowledge and strategy for examining his beliefs in place, al-Ghazālī raises two separate skeptical arguments that generate his skepticism: a sensory perception doubt, which targets the reliability of sense data; and a dream doubt, which targets the reliability of the primary truths of the intellect.

The sensory perception doubt runs as follows:

The strongest of the senses is the sense of sight. Now this looks at a shadow and sees it standing still and motionless and judges that motion must be denied. Then, due to experience and observation an hour later it knows that the shadow is moving, and that it did not move in a sudden spurt, but so gradually and imperceptibly that it was never completely at rest. (Deliverance: 56)

Here is a plausible reconstruction of the argument:

    1. If the senses misrepresent the properties of external objects, then sense data cannot be a source of knowledge about the external world.
    2. The senses misrepresent the properties of external objects.
    3. Therefore, sense data cannot be a source of knowledge about the external world.
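
The logical form of this reconstruction appears to be a simple modus ponens. As a rough sketch, with M and N as placeholder letters that are assumptions of this illustration rather than part of the source:

```latex
% M: the senses misrepresent the properties of external objects
% N: sense data cannot be a source of knowledge about the external world
\[
  \frac{M \to N \qquad M}{N} \quad \text{(modus ponens)}
\]
```

Because the form is valid, any resistance to the conclusion would have to target one of the two premises; the shadow example is offered in support of the second.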

After having engaged the sensory perception doubt, al-Ghazālī admits that his sensory beliefs cannot amount to knowledge. Instead, he proposes that his rational beliefs, particularly his beliefs in primary truths, are still secure. For al-Ghazālī, primary truths are necessary truths which are foundational to proofs, such as the law of non-contradiction: “One and the same thing cannot be simultaneously affirmed and denied” (Deliverance: 56). However, against his reliance on primary truths, al-Ghazālī raises the dream doubt, which runs as follows:

Don’t you see that when you are asleep you believe certain things and imagine certain circumstances and believe they are fixed and lasting and entertain no doubts about that being their status? Then you wake up and know that all your imaginings and beliefs were groundless and unsubstantial. So while everything you believe through sensation or intellection in your waking state may be true in relation to that state, what assurance have you that you may not suddenly experience a state which would have the same relation to your waking state as the latter has to your dreaming, and your waking state would be dreaming in relation to that new and further state? If you found yourself in such a state, you would be sure that all your rational beliefs were unsubstantial fancies. (Ibid. 57)

Here is a plausible reconstruction of the argument:

    1. In our current cognitive position, we cannot doubt primary truths.
    2. However, if our current cognitive position may be akin to a dream state where what we perceive is false, then what we take to be primary truths cannot be a source of knowledge.
    3. We cannot rule out the possibility that our current cognitive position is akin to such a dream state.
    4. It is possible that we can wake up from this dream state into a new and higher cognitive position where we realize that the primary truths we held to be necessarily true are actually false.
    5. Therefore, our belief in primary truths in our current cognitive position cannot amount to knowledge.

After having raised the dream doubt, al-Ghazālī finds himself in a skeptical crisis. (It is important to note that these two doubts do not generate a hyperbolic doubt, under which every proposition is considered dubious. This is evidenced by the fact that al-Ghazālī does not rule out the existence of God during his skeptical crisis.) He writes that he attempted to refute the dream doubt; however, his:

Effort was unsuccessful, since the objections could be refuted only by proof. But the only way to put together a proof was to combine primary cognitions. So if, in my case, these were inadmissible, it was impossible to construct the proof. (Ibid.)

The skeptical challenge al-Ghazālī has raised is particularly pernicious. In order to refute either skeptical doubt, al-Ghazālī would have to put together a proof. However, proofs require primary truths, which are now suspect. As such, al-Ghazālī has his hands tied, as it were, for he cannot construct a proof. Consequently, there is no way for al-Ghazālī himself to defeat skepticism.

c. Resolution of Skepticism

Al-Ghazālī claims that his skepticism lasted for two months. However, he claims that he eventually overcame it. He did not defeat skepticism through any proof; rather, his recovery came about through God’s intervention via a divine light:

At length God Most High cured me of that sickness. My soul regained its health and equilibrium and once again I accepted the self-evident data of reason and relied on them with safety and certainty. But that was not achieved by constructing a proof or putting together an argument. On the contrary, it was the effect of a light which God Most High cast into my breast. And that light is the key to most knowledge. (Ibid.)

Commentators generally understand al-Ghazālī as having an experience of dhawq via the divine light, which is supposed to secure the foundations of his knowledge. In the Niche of Lights, al-Ghazālī explains that the function of a light is to reveal the existence of something, and a divine light in particular reveals the true nature of that thing and is the source of certainty (Hesova 2012; Loumia 2020). On Hadisi’s (2022) reading, it is engagement in Sufi practices that cultivate the imagination that makes this experience of the divine light possible. While there is much scholarly debate about the nature of this divine light and the conditions for its experience, it is argued that (unlike Descartes) al-Ghazālī clearly does not defeat skepticism through any rational efforts of his own. Rather, he is rescued from skepticism through a divine intervention.

d. Al-Ghazālī and Descartes

On the face of it, there are a lot of similarities between al-Ghazālī’s Deliverance from Error and Descartes’ Discourse on Method (1637) and Meditations on First Philosophy (1641), both of which were written some 500 years after the Deliverance from Error. As such, many modern commentators have been interested in studying the relationship between Descartes and al-Ghazālī (see Albertini 2005; Azadpur 2003; Götz 2003; Moad 2009). Commentators have debated two issues on this score: what the exact similarities are between al-Ghazālī’s and Descartes’ engagement with skepticism, and whether there is historical evidence that al-Ghazālī had any real influence on Descartes.

Commentators generally agree that there are similarities between the two philosophers’ search for certainty and, in particular, the skeptical arguments they raise. In 1857, George Henry Lewes wrote that the Deliverance from Error: “bears so remarkable a resemblance to the Discours sur la Méthode of Descartes, that had any translation of it existed in the days of Descartes, everyone would have cried out against the plagiarism” (1970: 306). For starters, both al-Ghazālī and Descartes place absolute certainty at the center of their epistemology and develop a similar strategy for evaluating whether their beliefs amount to knowledge, that is, a belief that is dubitable in any respect cannot amount to true knowledge. Furthermore, both al-Ghazālī and Descartes employ a sensory perception doubt and a dream doubt. However, while the sensory perception doubt functions similarly for both philosophers, the dream doubt does not. Recall that al-Ghazālī uses the dream doubt to target the reliability of intellectual beliefs or primary truths. However, in the Meditations on First Philosophy Descartes uses the dream doubt to question the existence of the external world. Descartes uses a separate doubt, what is often called the defective nature doubt, to question the reliability of the intellect and intellectual beliefs. Moreover, while al-Ghazālī’s skeptical arguments do not generate hyperbolic doubt (because he never questions the existence of God), Descartes’ skepticism does end up in global skepticism. There also seems to be a difference in how the philosophers defeat skepticism. As is well known, Descartes ultimately claims to defeat skepticism through the natural light of reason. However, al-Ghazālī requires supernatural assistance through the divine light and explicitly denies that he can defeat skepticism through the use of reason and rational proofs.

As for whether Descartes had access to al-Ghazālī’s works, there is a theory that there was a translation of the Deliverance from Error in Latin that Descartes had access to; however, this theory is unlikely. Indeed, the Deliverance from Error was not translated until 1842, when it was translated into French (Van Ess 2018). There is also the possibility that the Deliverance from Error was orally translated for Descartes by Golius, an orientalist who had access to the Arabic Deliverance from Error (Götz 2003). However, there is only circumstantial evidence for this claim. In the end, the best evidence for there being an influence seems to be the striking similarities between al-Ghazālī and Descartes’ engagement with skepticism.

3. Assessment of Philosophy

a. Materialists, Naturalists, and Theists

After having defeated skepticism, al-Ghazālī proceeds to examine the “various categories of those seeking the truth,” to determine whether he can gain anything from them in his newfound path to knowledge, and in particular, whether these categories of seekers conform with the truth on religious matters according to orthodox Islam or are in a state of unbelief (kufr). Regarding his study of philosophy, al-Ghazālī claims that he studied and reflected on philosophy independently (without a teacher) for just under three years. He claims: “God Most High gave me an insight into the farthest reaches of the philosophers’ sciences in less than two years,” and he reflected “assiduously on it for nearly a year” until he understood philosophy’s “precision and delusion” (Deliverance: 61).

The philosophers, al-Ghazālī claims, are those “who maintain that they are the men of logic and apodeictic demonstration” (Ibid. 58) and he divides them into three groups: materialists, naturalists, and theists. Regarding the materialists, al-Ghazālī writes that these were ancient philosophers who denied the existence of an omniscient and omnipotent “Creator-Ruler.” Their fundamental claim is that the world is eternal, and it exists without a creator. According to al-Ghazālī, these philosophers are “the godless in the full sense of the term” (Ibid. 61).

Regarding the naturalists, al-Ghazālī writes that these were the ancient philosophers who devoted themselves to the study of “nature and the marvels found in animals and plants” (Ibid. 62). Unlike the materialists, these philosophers found God’s wisdom in the creation of the universe. What was problematic about these philosophers is that they denied the immortality of the soul, claiming that the soul ceased to exist upon the corruption of its humors, and thus it would be impossible to bring back a non-existent soul. As such, these philosophers denied the afterlife, despite believing in God and his attributes. These philosophers were not thoroughly godless in the way that the materialists were, but they still held beliefs that are objectionable from the perspective of orthodox Islam. These are unbelievers as well.

Regarding the theists, al-Ghazālī writes that these were the later ancient philosophers, such as Socrates, Plato, and Aristotle, who admitted a Creator-Ruler in their philosophical systems. These philosophers, according to al-Ghazālī, refuted the materialists and naturalists and also refined the philosophical sciences. Nonetheless, all of these philosophers and their transmitters among the Islamic philosophers (that is, al-Farabi and Ibn Sina) must be charged with unbelief due to their still subscribing to theses such as the eternity of the world.

b. The Philosophical Sciences

Al-Ghazālī divides the sciences of philosophy into six fields: mathematics, logic, physical science, metaphysics, political philosophy, and moral philosophy or ethics (In the Book of Knowledge, he divides them into four: mathematics, logic, physical science, and metaphysics). Here, we will discuss his approach to mathematics, logic, physical science, political philosophy, and ethics, reserving his views on metaphysics for section 4 on the “Incoherence of the Philosophers.”

One might think that since al-Ghazālī wrote the Incoherence of the Philosophers, and claims that the ancient philosophers and their Islamic transmitters, al-Farabi and Avicenna, are unbelievers, he is completely antithetical to philosophy. However, it is important to note that the title of his famous text is not the Incoherence of Philosophy. Rather, al-Ghazālī, in general, takes issue with certain philosophical theses, rather than with the discipline of philosophy itself. Nonetheless, there are ways in which he thinks that an improper engagement with philosophy can lead to certain evils.

Regarding mathematics, which concerns arithmetic, geometry, and astronomy, he claims that: “nothing in them entails denial or affirmation of religious matters” (Deliverance: 63); rather, they contain facts which cannot be denied. Nonetheless, he identifies two evils that can follow from an improper engagement with mathematics.

The first evil occurs when a person who studies mathematics assumes that since philosophers have precision in their mathematical proofs, such precision in arriving at the truth extends to other areas of philosophy, such as metaphysics. This will make them susceptible to believing false metaphysical positions that are not actually demonstrated according to al-Ghazālī. Furthermore, this makes the person susceptible to unbelief because they will conform to the philosophers’ disdain (in al-Ghazālī’s view) for the law and religion more generally.

The second evil comes from the champion of Islam who believes that everything in philosophy must be rejected. When another person hears of such negative claims about philosophy as a whole, this will lead to them having a low opinion of Islam, for it seems that Islam is incompatible with clear mathematical truths established by demonstration, so “he becomes all the more enamored with philosophy and envenomed against Islam” (Ibid. 64).

Similarly, al-Ghazālī claims that logic does not conflict with religion, for it only concerns proofs, syllogisms, the conditions of demonstration, and the requisites for sound definitions. He admits that the philosophers have a greater refinement in their study and use of logic, which exceeds that of the theologians. Nonetheless, two evils can come from an improper understanding of the philosophers’ use of logic. First, if logic is rejected by the believer, this will lead the philosophers to have a low opinion of this rejector, and of their religion more generally. Second, if the student of logic has too high an opinion of the philosophers’ use of logic, they may come to believe that the philosophers’ metaphysical theses that amount to unbelief must actually be backed by demonstration, before even coming to examine the metaphysics per se.

The physical sciences or natural philosophy, which concern the study of the heavens, stars, the sublunar world, and composite bodies, are also not objectionable with respect to religion, except for certain issues that al-Ghazālī covers in the Incoherence of the Philosophers. Al-Ghazālī stresses that one must remember that every aspect of physics is “totally subject to God Most High: it does not act of itself but is used as an instrument by its Creator” (Deliverance: 66). Any physical science whose theory rests on a denial of the dependency of an aspect of the physical world on the divine is objectionable (see section 4d for al-Ghazālī’s views on causation and occasionalism).

Regarding political philosophy, al-Ghazālī does not regard the philosophers as contributing anything novel, for they “simply took these over from the scriptures revealed to the prophets” (Ibid. 67). As such, it seems that al-Ghazālī does not find issues with the philosophers’ views on political philosophy.

In al-Ghazālī’s view, ethics concerns “the qualities and habits of the soul, and recording their generic and specific kinds, and the way to cultivate the good ones and combat the bad” (Ibid.). According to al-Ghazālī, the philosophers’ ethical views derive from the sayings of the Sufis and prophets, and thus in general their ethical views are not objectionable (Kukkonen 2016; cf. Quasem 1974). Nonetheless, there are two evils that can arise from an improper engagement with philosophical ethics.

First, there is an evil that arises for the person who rejects philosophical ethics because they subscribe to a wholesale rejection of philosophy, due to their knowledge of other errors of the philosophers (for example, in metaphysics). This is problematic because, for example, the utterances of the prophets are true and should not be rejected. The fundamental problem here, according to al-Ghazālī, is that this “dim-witted” person does not know how to evaluate the truth: “The intelligent man, therefore, first knows the truth, then he considers what is actually said by someone. If it is true, he accepts it, whether the speaker be wrong or right in other matters” (Ibid. 68). Second, there is the evil that arises for the person who accepts philosophical ethics wholesale. The evil here is that one will slip into accepting the metaphysical errors of the philosophers that are mixed in with their ethical teachings.

4. Incoherence of the Philosophers

In the Incoherence of the Philosophers, al-Ghazālī famously condemns al-Farabi and Avicenna as unbelievers for their philosophical views. The notion of unbelief has come up thus far in this article, but what does it mean to be an unbeliever from the Islamic perspective? Al-Ghazālī defines unbelief or kufr as follows: “‘unbelief’ is to deem anything the Prophet brought to be a lie” (On the Boundaries of Theological Tolerance: 92). Unbelief is contrasted with its opposite, faith or iman: “‘faith’ is to deem everything he brought to be true” (Ibid.). From this basic definition, it follows:

Every Unbeliever deems one or more of the prophets to be a liar. And everyone who deems one or more of the prophets to be a liar is an Unbeliever. This is the criterion that should be applied evenly across the board.  (Ibid. 93)

According to al-Ghazālī, the so-called “Islamic philosophers” subscribe to philosophical theses that are fundamentally incompatible with the religion of Islam, that is, these theses imply that the Prophet Muhammad, in particular, is a liar, and thus put them outside the fold of orthodoxy. Out of the twenty theses he identifies, three in particular conflict fundamentally with Islam: (1) belief in the eternity of the world, (2) belief that God can only know universals and not particulars, and (3) denial of the bodily resurrection. The other seventeen are innovations and are thus heterodox positions from the perspective of orthodox Islam. They do not, strictly speaking, constitute unbelief and thus one could still technically remain a Muslim while believing one of these innovations. But according to al-Ghazālī, the other three doctrines:

Do not agree with Islam in any respect. The one who believes them believes that prophets utter falsehoods and that they said whatever they have said by way of [promoting common] utility, to give examples and explanation to the multitudes of created mankind. This is manifest infidelity which none of the Islamic sects have believed. (Incoherence: 226)

In addition to holding heretical beliefs, al-Ghazālī claims that the philosophers are heretics in practice as well, for they:

Have rejected the Islamic duties regarding acts of worship, disdained religious rites pertaining to the offices of prayer and the avoidance of prohibited things, belittled the devotions and ordinances prescribed by the divine law, not halting in the face of its prohibitions and restrictions. (Ibid. 1-2)

With respect to their philosophical views, however, al-Ghazālī does not view the Islamic philosophers as original. In his view, al-Farabi and Ibn Sina do not produce anything philosophically novel; rather, they merely regurgitate and reproduce ancient Greek philosophical views. As he writes: “There is no basis for their unbelief other than traditional, conventional imitation” (Ibid. 2; for more on the philosophers’ imitation or taqlid see Griffel 2005). The charge of unbelief or apostasy against al-Farabi and Avicenna has severe implications: in particular, it allows for them to be killed. As Frank Griffel makes explicit, for al-Ghazālī, “Whoever publicly supports or teaches the three named positions indeed deserves to be killed” (2007: 103).

Al-Ghazālī employs a unique strategy in revealing the philosophers’ “incoherence,” which is that he wants to beat the philosophers at their own game. The philosophers claim to establish their theses based on valid demonstrations with true and certain premises. However, al-Ghazālī aims to show that their demonstrations actually do not meet their own standards for truth, certainty, and validity:

There is neither firm foundation nor perfection in the doctrine they hold; that they judge in terms of supposition and surmise, without verification or certainty; that they use the appearance of their mathematical and logical sciences as evidential proof for the truth of their metaphysical sciences, using [this] as a gradual enticement for the weak in mind. (Incoherence: 4)

While al-Ghazālī does use scripture as a motivation for his own views, his objections to the philosophers do not amount to merely citing scripture that conflicts with a philosophical thesis. Rather, he raises philosophical objections against them with the aim of showing that their views imply problematic consequences, which they should concede given their own logical standards and epistemic position:

I do not enter into [argument] objecting to them, except as one who demands and denies, not as one who claims [and] affirms. I will render murky what they believe in [by showing] conclusively that they must hold to various consequences [of their theories]. (Ibid. 7)

It is important to note that for each Discussion in the Incoherence of the Philosophers, al-Ghazālī considers a variety of proofs for the philosophers’ views, and correspondingly raises many objections against them. Here, we will only consider a portion of these arguments.

a. The Eternity of the World

In the First Discussion of the Incoherence of the Philosophers, al-Ghazālī considers four distinct proofs for the eternity of the world. The thesis of the eternity of the world is objectionable within orthodox Islam, according to al-Ghazālī, because scripture is clear that the world was created ex nihilo. For example, there is a verse in the Qur’an that states: “All it takes, when He wills something is simply to say to it: “Be!” And it is!” (36: 82). What this verse essentially implies is that God can create an existent out of nothing. Of course, this serves as a motivation to refute the philosophers’ proofs for the eternity of the world but does not constitute a refutation on its own, as al-Ghazālī wants to show that the philosophers’ own demonstrations fail.

Let us consider the first proof for the eternity of the world, which he claims is the philosophers’ strongest and most imaginative one. The first proof depends on the fundamental concepts of will and causality (Hourani 1958). Every change (whether physical or mental) requires some cause. For example, a billiard ball is moved by a billiard stick, and the perception of a bear can raise the emotion of fear in a subject. Thus, if God wills some new state of affairs, this must occur due to some cause external to God. Let us suppose, then, that the world was created ex nihilo at the beginning of time, and is not eternal. If God created the world in this way, then something must have acted on his constant, eternal, and unmoved nature so that his will arrived at this volition. But we are supposing that nothing exists besides God. If so, then there is nothing outside of God’s will to have such an influence on Him, because there are no causes that exist external to God. This would seem to imply that the world could never exist and that God exists alone eternally without anything else existing alongside Him. But we know that the world does exist. The only option that remains is that the world has existed eternally, as an eternal emanation from the divine essence.

Al-Ghazālī responds by arguing that God’s will can, as it were, postdate the existence of the world so that the world comes into existence at a designated point in time. As such, the world would not have to be eternal. The philosophers (on al-Ghazālī’s reconstruction) would answer to this as follows:

It is impossible for that which necessitates [a thing] to exist with all the conditions of its being necessitating, [all the conditions] of its principles and causes fulfilled, such that nothing at all remains awaited, and then for the necessitated [effect] to be delayed. (Incoherence: 15)

In other words, postdating the existence of the world is impossible because if God has the will for the world to exist, then this volition must have existed eternally. Unless there is an obstacle to delay God’s will from coming to fruition, then there should be no delay in the creation of the world since all the conditions for the world to exist have been met. Thus, the world must be eternal.

Al-Ghazālī objects that the philosophers do not truly know that it is impossible for God’s will to postdate the existence of the world:

It is incumbent on you to set up a demonstrative proof according to the condition of logic that would show the impossibility of this. For, in all of what you have stated, there is nothing but [an expression of] unlikelihood and the drawing of an analogy with our resolve and will, this being false, since the eternal will does not resemble temporal [human] intentions. As regards the sheer deeming of something as unlikely, without demonstrative proof, [this] is not sufficient. (Ibid. 17)

The philosophers claim that a finite and temporal world being created by an eternal will is impossible, but they have not shown the contradiction in God’s will postdating the existence of the world, even without there being an obstacle to his will creating the world earlier. All they have done is provide an analogy between human will and divine will, which is not sufficient for proof.

b. God’s Knowledge: Universals vs. Particulars

In the Thirteenth Discussion of the Incoherence of the Philosophers, al-Ghazālī aims to refute the philosophers’ claim that God only knows universals, but not particulars. This is objectionable within Islam because the Qur’an claims that: “Not an atom’s weight is hidden from Him in the heavens or the earth” (34: 3). According to al-Ghazālī, scripture is clear that God’s knowledge is infinite and all-encompassing, extending to everything that exists and everything that is possible (Moderation in Belief: 104). Again, scripture serves as a motivation for refuting the philosophers on this score, but al-Ghazālī will provide independent objections against their demonstrations.

According to Avicenna, God does not have knowledge of particulars per se; rather, He has knowledge of particulars in a universal manner. This is an implication of Avicenna’s conception of God or the necessary being (Belo 2006). According to Avicenna, God is an intellect that consists of pure thought and activity and is a being wholly distinct from matter and extension. Insofar as God’s essence is thought, this implies that God’s essence is identical with knowledge, since any type of thought that does not qualify as knowledge is not fitting to the perfection of God. Furthermore, the perfection of God’s knowledge requires that God himself is the first object of his knowledge, a reflective act which requires an identity between subject and object of knowledge.

Insofar as God is wholly distinct from matter, this implies that God’s knowledge does not depend on matter or sensory perception in any way, as it does for human beings. As such, God cannot have knowledge of particulars (at least in the way that humans do) because he is not subject to the same physical and temporal processes involved in being a human being, and sensory perception more generally. The example discussed here by Avicenna and al-Ghazālī is that of an eclipse (Belo 2006). We can divide the knowledge an astronomer has of an eclipse into three stages. First, before an eclipse occurs, an astronomer will know that an eclipse is not occurring but will be expecting it. Second, when the eclipse occurs, the astronomer will know that it is occurring. Third, when the eclipse ends, the astronomer will know that the eclipse is a past event. With each stage of the eclipse, there is a corresponding change in the knowledge of the astronomer, and thus a change in the astronomer himself. But this is problematic with respect to God’s knowledge of the eclipse. As al-Ghazālī formulates Avicenna’s view, the claim is that:

If the object known changes, knowledge changes; and if knowledge changes, the knower inescapably changes. But change in God is impossible. (Incoherence: 135)

Since knowing particulars per se would make God subject to change, Avicenna instead maintains that God has knowledge of particulars in a universal manner. This holds in three senses (Belo 2006): First, God’s knowledge is universal in the sense that it is intellectual, and not sensory in any way. Second, God’s knowledge is universal in the sense that his knowledge precedes the objects of his knowledge because he is the cause of their existence. Third, God’s knowledge of particulars extends only to their general qualities. Al-Ghazālī finds this view entirely objectionable from the perspective of orthodox Islam:

This is a principle which they believed and through which they uprooted religious laws in their entirety, since it entails that if Zayd, for example, obeys or disobeys God, God would not know what of his states has newly come about, because He does not know Zayd in his particularity. (Ibid. 136)

Al-Ghazālī agrees that God’s essence cannot admit change, but claims that Avicenna’s analysis of the eclipse example rests on confusion. The essence of Avicenna’s argument, in al-Ghazālī’s view, is that God would undergo change if he knew the particulars of an eclipse because he would move from a state of ignorance to a state of knowledge about the eclipse (in the transition from stages one to two), which would constitute a change in God’s essence. Al-Ghazālī objects that God’s knowledge is singular throughout the process of the changes in the eclipse. That is, God’s knowledge of the eclipse before it exists, when it exists, and after it exists are all the same knowledge. In al-Ghazālī’s opinion, the differences in the states of the eclipse are relational—that is, in the way God relates to the changes in the eclipse—and thus do not require change in the intrinsic knowledge or the essence of the knower (God).

c. Bodily Resurrection

In the Twentieth Discussion of the Incoherence of the Philosophers, al-Ghazālī aims to refute the philosophers’ denial of bodily resurrection. The philosophers maintain that the soul is immortal, but deny a bodily resurrection in the afterlife. For al-Ghazālī, this is in clear contradiction with scripture and religious law as well. Al-Ghazālī reconstructs three different ways one might conceive of bodily resurrection, explains how the philosophers would respond to each type of resurrection, and provides his own analysis of each type.

First, one might view the resurrection as “God’s returning the annihilated body back to existence and the returning of the life which had been annihilated” (Incoherence: 216). In this view, both the body and the soul are annihilated upon death, and newly created upon resurrection. The philosophers argue that this view of resurrection is false because the human being that is resurrected will not be identical to the original human being, but merely similar to the original human being. This is because there is not something that continues to exist between the annihilation of the body and the soul, and the human being’s “resurrection,” for “unless there is something that [continues to] exist, there [also] being numerically two things that are similar [but] separated by time, then [the meaning of] the term ‘return’ is not fulfilled” (Incoherence: 216). Al-Ghazālī responds by conceding to the philosophers that this option does not amount to a resurrection:

For the human is a human not by virtue of his matter and the earth that is in him, since all or most of [his] parts are changed for him through nourishment while he remains that very same individual. For he is what he is by virtue of his spirit or soul. If, then, life or spirit ceases to exist, then the return of what ceases to exist is unintelligible. (Ibid. 216)

Second, one might view the resurrection as occurring when “the soul exists and survives death but that the first body is changed back [into existence] with all its very parts” (Ibid. 215). The philosophers argue that if this option were conceivable, it would constitute a resurrection. However, they argue that this option is impossible. The philosophers raise several objections here. One of them appeals to cannibalism:

If a human eats the flesh of another human, which is customary in some lands and becomes frequent in times of famine, the [bodily] resurrection of both together becomes impossible because one substance was the body of the individual eaten and has become, through eating, [part of] the body of the eater. And it is impossible to return two souls to the same body. (Ibid. 217)

In short, the philosophers claim that it is impossible to resurrect the body with its original matter.

Third, one might view the resurrection as occurring when the soul is returned to a body, regardless of whether that body is constituted by its original matter, for: “The [person] resurrected would be that [identical] human inasmuch as the soul would be that [same] soul” (Ibid. 215). The philosophers argue that this option does not work either because the individual resurrected “would not be a human unless the parts of his body divide into flesh, bone, and [the four] humors” (Ibid. 218). In other words, the philosophers claim that the raw materials of the earth, for example, wood and iron, are not sufficient to reconstitute a body: “it is impossible to return the human and his body [to life] from wood or iron” (Ibid.).

Al-Ghazālī’s response to the philosophers’ dismissal of the second and third types of resurrection is to argue that the body does not matter at all in the identity of the individual resurrected. Rather, it is the continuity of the soul that matters:

This is possible by returning [the soul] to the body, whatever body this might be, whether [composed] of the matter of the first body [or from that] of another or from matter whose creation commences anew. For [an individual] is what he is by virtue of his soul, not his body, since the parts of the body change over for him from childhood to old age through being emaciated, becoming fat, and [undergoing] change of nourishment. His temperament changes with [all] this, while yet remaining that very same human. (Ibid. 219)

According to al-Ghazālī, during the life of a particular human being, the body will undergo a variety of changes while the person remains the same, due to the continued existence of their soul. In the same way, it does not matter from which materials the body is reconstituted when the human being is resurrected, for if the same soul exists in the afterlife as it did in the previous life in the new body, then the individual is the same.

d. Causation and Occasionalism

In the Seventeenth Discussion of the Incoherence of the Philosophers, al-Ghazālī aims to refute the philosophers’ claim that there is a necessary connection between cause and effect. This view is problematic with respect to Islam because it conflicts with the existence of miracles, and God’s ability to change the course of nature at will. This is technically one of the innovative theses of the philosophers.

According to Avicenna’s conception of (efficient) causation, there is a necessary connection between cause and effect (the necessitation thesis—see Richardson 2020). Avicenna’s argument for the necessitation thesis is in part based on his modal metaphysics of the necessary and the possible or contingent. Avicenna distinguishes between the necessary and the possible as follows: “That which in itself is a necessary existence has no cause, while that which in itself is a possible existent has a cause” (Metaphysics of the Healing I, 6: 30). According to Avicenna, the only necessary being is God because existence is a part of the essential nature of God. There is nothing external to God, a cause, that brings God into existence. All other beings, however, are merely possible in themselves because existence is not a part of their essence. Because they are possible, something external to them, a cause, is required to bring them into existence, otherwise they would not exist. As such, there must be an external cause that necessitates the existence of possible things. As Avicenna explains it in his discussion of ontological priority in the Metaphysics of the Healing:

The existence of the second is from the first, so that it [derives] from the first the necessary existence which is neither from nor in itself, having in itself only possibility—allowing [that is] that the first is such that, as long as it exists, it follows as a necessary consequence of its existence that it is the cause of the necessary existence of the second—then the first is prior in existence to the second. (Metaphysics of the Healing IV, 1: 126)

Al-Ghazālī rejects the necessitation thesis:

The connection between what is habitually believed to be a cause and what is habitually believed to be an effect is not necessary, according to us. But [with] any two things, where “this” is not “that” and “that” is not “this” and where neither the affirmation of the one entails the affirmation of the other nor the negation of the one entails negation of the other, it is not a necessity of the existence of the one that the other should exist, and it is not a necessity of the nonexistence of the one that the other should not exist—for example, the quenching of thirst and drinking, satiety and eating, burning and contact with fire, light and the appearance of the sun, death and decapitation, healing and the drinking of medicine, the purging of the bowels and the using of a purgative, and so on to [include] all [that is] observable among connected things in medicine, astronomy, arts, and crafts. Their connection is due to the prior decree of God, who creates them side by side, not to its being necessary in itself, incapable of separation. On the contrary, it is within [divine] power to create satiety without eating, to create death without decapitation, to continue life after decapitation, and so on to all connected things. The philosophers denied the possibility of [this] and claimed it to be impossible. (Incoherence: 166)

According to al-Ghazālī, there is no necessary connection between cause and effect, because one can affirm the existence of the cause, without having to affirm the existence of the effect, and one can deny the existence of the cause, without having to deny the existence of the effect.

Consider the example of the burning of cotton when it comes in contact with fire. According to the philosophers, there is a necessary connection between the fire and the burning of the cotton. The fire is the cause or agent of change that necessitates the effect, namely the burning. Denying this causal relation would result in a contradiction. According to al-Ghazālī, however, the connection between fire and burning is not one of necessity; rather, God has created these two events concomitantly or side by side. Our repeated or habitual perception of these events occurring side by side is what makes us believe that there is a genuine causal relation:

But the continuous habit of their occurrence repeatedly, one time after another, fixes unshakably in our minds the belief in their occurrence according to past habit. (Ibid. 170)

Here, al-Ghazālī anticipates Hume’s theory of causation, according to which causation is nothing more than a constant conjunction between two events (on the differences between Hume and al-Ghazālī, see Moad 2008).

On the standard reading, al-Ghazālī subscribes to occasionalism, according to which creatures do not have any causal power; rather, God is the sole source of causal power in the world (Moad 2005). As al-Ghazālī writes:

The one who enacts the burning by creating blackness in the cotton, [causing] separation in its parts, and making it cinder or ashes is God, either through the mediation of His angels or without mediation. As for fire, which is inanimate, it has no action. (Incoherence: 167)

On behalf of the philosophers, al-Ghazālī raises a unique objection against his theory of causality, namely, that it entails a type of radical skepticism (Dutton 2001). If one denies the necessary connection between cause and effect, then one cannot know what events God—through his unrestrained freedom—will create as conjoined side by side:

If someone leaves a book in the house, let him allow as possible its change on his returning home into a beardless slave boy—intelligent, busy with his tasks—or into an animal; or if he leaves a boy in his house, let him allow the possibility of his changing into a dog; or [again] if he leaves ashes, [let him allow] the possibility of its change into musk; and let him allow the possibility of stone changing into gold and gold into stone. (Ibid. 170)

As Blake Dutton formulates it, the radical skepticism of al-Ghazālī’s theory of causality runs as follows:

Given that we must rely on causal inferences for our knowledge of the world and our ability to navigate our way through it, Ghazali’s position entails that we must face the world with no expectations and adopt a position of skeptical uncertainty. (Dutton 2001: 39)

Al-Ghazālī’s response is to argue that although such transformations are possible, God institutes a habitual course of nature: “The idea is that although God is free to bring about whatever he desires in any order at all, the actual sequence of events that he creates in the world is regular” (Ibid.). As such, we can rely on our normal inferences about how events unfold in the world.

e. Averroes’ Response to the Charge of Unbelief

Averroes (Ibn Rushd) provides a systematic response to the Incoherence of the Philosophers in his Incoherence of the Incoherence. It is beyond the scope of this article to discuss Averroes’ specific responses to al-Ghazālī’s criticisms. But in his Decisive Treatise Averroes provides a philosophical and legal defense of the compatibility of philosophy with Islam, and of al-Farabi and Avicenna as Islamic philosophers, that is worth briefly delving into.

According to Averroes, philosophy is obligatory according to scripture and law, because there are a variety of verses that call for a rational reflection on the nature of the world, which amounts to philosophical activity. Furthermore, Averroes claims that studying ancient philosophers is obligatory for the qualified Muslim—the one with intellect and moral virtue—because it is necessary to see what philosophical progress has already been made, in order to make further philosophical developments.

However, there seems to be a conflict between philosophical truth and religious truth when one looks at scripture. According to Averroes, though, there are no distinct standards of truth, for truth cannot contradict truth. When it comes to interpreting scripture from a philosophical perspective, then, one must engage in appropriate forms of allegorical interpretation in order for scripture to conform with demonstrated truths. An allegorical interpretation must be offered for metaphysical claims in scripture whose literal meaning does not conform to a demonstrated metaphysical truth. For example, a verse of scripture that implies that God has hands is problematic because God is not corporeal. The reference to God’s hands can, however, be allegorically interpreted as an indication of God’s power. Engaging in allegorical interpretation is only appropriate for philosophers who are capable of logical demonstrations; it is not suitable for the masses or even the theologians.

The philosophers who do engage in allegorical interpretation have some leeway with respect to making errors because there isn’t consensus amongst scholars about metaphysical issues in the Qur’an in particular. Indeed, it seems that there can’t be a consensus—or at least it is difficult to establish consensus—because scholars hold on to the principle that interpretations of esoteric and theoretical matters should not be divulged to others. A philosopher, then, can assent to (what may turn out to be) a false allegorical interpretation, and still be a Muslim. Applying this view of unbelief to al-Farabi and Avicenna, Averroes argues that al-Ghazālī’s condemnation of al-Farabi and Avicenna can only be tentative because there is no consensus amongst the scholars about theoretical matters such as the eternity of the world, God’s knowledge, and bodily resurrection.

5. Revival of the Religious Sciences

The Revival of the Religious Sciences constitutes al-Ghazālī’s magnum opus. The primary purpose of the Revival of the Religious Sciences is ethical and spiritual in nature, as al-Ghazālī aims to instruct the reader in both theory and practice that will lead to spiritual enlightenment and an experiential knowledge of God. This section will briefly consider al-Ghazālī’s views on the heart, intellect, Sufism, and theodicy that arise across the Revival of the Religious Sciences.

a. The Heart

The non-physical heart, according to al-Ghazālī, is a “subtle tenuous substance of an ethereal spiritual sort, which is connected with the physical heart” (Marvels of the Heart: 6). The heart in this sense constitutes “the real essence of man” (Ibid.). Furthermore, it is through the perception of the heart that human beings can understand reality, by reflecting reality in the mirror of the heart:

In its relationship to the real nature of intelligibles, it is like a mirror in its relationship to the forms of changing appearances. For even as that which changes has a form, and the image of that form is reflected in the mirror and represented therein, so also every intelligible has its specific nature, and this specific nature has a form that is reflected and made manifest in the mirror of the heart. Even as the mirror is one thing, the forms of individuals another, and the representation of their image in the mirror another, being thus three things in all, so here, too, there are three things: the heart, the specific natures of things, and the representation and presence of these in the heart. (Ibid. 35)

Al-Ghazālī identifies five different reasons why a heart will fail to understand reality, that is, fail to perceive the real nature of intelligibles (Ibid. 36-38). First, there may be an imperfection in the constitution of the heart’s mirror. Second, the mirror of the heart may be dull due to acts of disobedience, which preclude the “purity and cleanness of heart”. Third, the heart may not be directed to reality due to the person being excessively preoccupied with his appetites and worldly pursuits (such as wealth and livelihood). Fourth, there is a veil that can block even the obedient who has conquered his appetites and devoted himself to understanding reality. This veil consists of theological and legal beliefs that one has accepted blindly from their youth. These beliefs can “harden the soul,” constituting a veil between the person and their perception of reality. Fifth, in order to obtain knowledge of the unknown, a person must be able to place his prior knowledge in a “process of deduction” to find the “direction of the thing sought”:

For the things that are not instinctive, which one desires to know, cannot be caught save in the net of acquired knowledge; indeed, no item of knowledge is acquired except from two preceding items of knowledge that are related and combined in a special way, and from their combination a third item of knowledge is gained. (Ibid. 38)

In other words, logic and knowledge of syllogisms in particular are necessary to comprehend the true nature of the intelligible.

The primary upshot is that in order for a person to understand reality they must, broadly construed, be in a state of obedience and their heart must be purified. According to al-Ghazālī, “the purpose of improvement is to achieve the illumination of faith in it; I mean the shining of the light of knowledge [of God]” (Ibid. 41). The illumination of faith comes in three degrees, according to al-Ghazālī: faith based on blind imitation (what the masses possess), faith based on partial logical reasoning (what the theologians possess), and faith based on the light of certainty (what the Sufis possess).

To illustrate the difference between these degrees of faith, al-Ghazālī offers the following example. Suppose that Zayd is in a house. One way of arriving at the belief that Zayd is in the house is to be told that he is in the house by someone trustworthy and truthful. This is blind imitation and the lowest level of faith. The second way of arriving at the belief that Zayd is in the house is by hearing the voice of Zayd from inside the house. This is a stronger form of faith, as it is based on a type of experience of Zayd, and not mere hearsay. But this type of belief requires logical reasoning, as there is an inference made from hearing Zayd’s voice to the conclusion that he is in the house. The third way of arriving at the belief that Zayd is in the house is to go inside the house and see Zayd within it. This is experiential knowledge that possesses the light of certainty. This is the level of belief and faith that should be the goal of the believer (see section c. on Sufism and Ethics, below).

b. The Intellect

According to al-Ghazālī, the seat of knowledge is the spiritual heart. The intellect is a faculty of the heart, which he defines as “an expression for the heart in which there exists the image of the specific nature of things” (Marvels of the Heart: 35). In the Book of Knowledge, al-Ghazālī distinguishes four different senses of the intellect and divides it into various categories of knowledge. Before explaining these different senses, however, it will be useful to discuss al-Ghazālī’s views on the original condition (fiṭra) of the human being.

In the Deliverance from Error, al-Ghazālī writes that:

Man’s essence, in his original condition, is created in blank simplicity without any information about the “worlds” of God Most High…. Man gets his information about the “worlds” by means of perception. Each one of his kinds of perception is created in order that man may get to know thereby a “world” of the existents—and by “worlds” we mean the categories of existing things. (Deliverance: 83)

Here, al-Ghazālī seems to espouse the Lockean tabula rasa claim, namely, that the mind is initially devoid of any information, and that all information is ultimately acquired through perception. On this view, then, the intellect would acquire its information through empirical means, that is, through perception. This would make al-Ghazālī close to Avicenna on this matter, although there is some scholarly debate as to whether primary truths or axiomatic knowledge is acquired via perception. Nonetheless, according to Alexander Treiger (2012), al-Ghazālī’s four-part division of the intellect is partly indebted to Avicenna’s conception of the intellect. The first three stages of intellect in particular correspond to Avicenna’s conceptions of the material intellect, the intellect in habitu, and the actual intellect. For Avicenna, the material intellect is devoid of content, consisting in only a disposition to acquire information; the intellect in habitu is the stage of intellection where primary truths are abstracted from sensory experience; and the actual intellect is the stage of intellection where the intellect has access to and understanding of the intelligible.

According to al-Ghazālī, the order of perceptual capacities created in man is touch, sight, hearing, taste, discernment (around the age of seven), and then the intellect which perceives things “not found in the previous stages” (Ibid. 83). On the first meaning of the intellect, the intellect “is the attribute that differentiates human beings from all other animals and affords them the ability to apprehend the speculative sciences and to organize the subtle rational disciplines” (Book of Knowledge: 253). In other words, the intellect is a distinctive quality of human beings to understand theoretical matters, which is not shared by other animals.

On the second meaning of the intellect, the intellect “is the science that comes or types of knowledge that come into being in the disposition of the child” (Ibid. 255). More specifically, these types of knowledge concern possibility, necessity, and impossibility, such as the knowledge that “two is more than one, and that one person cannot be in two places at the same time.” According to al-Ghazālī, this type of axiomatic knowledge does not come about through blind imitation or instruction but is an endowment to the soul. This is different from acquired knowledge, which is acquired through learning and deduction (Marvels of the Heart: 45). Furthermore, axiomatic knowledge enters the disposition of the child after sensory perception has been created in the child.

On the third meaning of the intellect, the intellect “is the sciences derived through the observation of events and circumstances as they arise” (Ibid.). In other words, the intellect at this stage allows for the rational processing of empirical events. On the fourth meaning of the intellect, the intellect is “the capacity to discern the consequences of one’s actions” (Ibid.). More specifically, it includes the capacity to overcome one’s own desires that motivate one towards immediate pleasures. This sense of intellect is also what distinguishes human beings from animals. This fourth sense of the intellect is “the final fruit and its ultimate goal” (Book of Knowledge: 256). (For alternative conceptions of the intellect in al-Ghazālī’s work, see Treiger 2012.)

c. Sufism and Ethics

According to al-Ghazālī, a central goal of Sufism is not a discursive understanding of God, but an experiential one. The concept al-Ghazālī employs for experiencing the divine is dhawq or fruition experience:

It became clear to me that their [the Sufis] most distinctive characteristic is something that can be attained, not by study, but rather by fruitful experience and the state of ecstasy and “the exchange of qualities.” How great a difference there is between your knowing the definitions and causes and conditions of health and satiety and your being healthy and sated! And how great a difference there is between your knowing the definition of drunkenness…and your actually being drunk! (Deliverance: 78)

Dhawq is a type of knowledge-by-acquaintance, an unmediated experience of the divine (gnosis), as opposed to propositional knowledge of God that would depend on prior beliefs and inferences. Al-Ghazālī claims that this is the highest type of knowledge one can attain, superior even to proofs of God and faith. Ultimately, to verify the Sufi claim that an experiential knowledge of God is possible, one must embark on the Sufi path.

Broadly construed, the means of attaining dhawq is purifying the heart, that is, ridding it of spiritual diseases and replacing them with virtues. As we have seen, for al-Ghazālī, the spiritual heart is the seat of perception, intelligence, and knowledge, with the primary function of “the heart being the acquisition of wisdom and gnosis, which is the specific property of the human soul which distinguishes man from animals” (Disciplining the Soul: 46-47). However, the heart has to be in the right condition to acquire knowledge generally. It must be in a purified and disciplined state to acquire the experiential knowledge of God. To reach that state, the believer must engage in a variety of spiritual practices.

In Marvels of the Heart, al-Ghazālī explains, in general, the method of the Sufis for purifying the heart. First, the believer must cut off ties with the present world by “taking away concern for family, possessions, children, homeland, knowledge, rule, and rank” (Marvels of the Heart: 54). Then, he must withdraw into a private place of seclusion in order to fulfill the obligatory and supererogatory religious duties: “He must sit with an empty heart” and “strive [such] that nothing save God, the Exalted, shall come into his mind” (Ibid.). Next, in his seclusion, he must repeatedly recite the name of God (Allah) with his heart fixed on God, until “the form and letters of the expression and the very appearance of the word is effaced from the heart and there remains present in it nought save the ideal meaning” (Marvels of the Heart: 54-55). Engaging in these practices seems to be a necessary, but not sufficient, condition for an unveiling of the divine realities. According to al-Ghazālī, it is ultimately up to God’s mercy to grant the believer gnosis if, again, he has purified his heart: “By what he has done thus far he has exposed himself to the breezes of God’s mercy, and it only remains for him to wait for such mercy” (Marvels of the Heart: 55).

In Disciplining the Soul, al-Ghazālī lays out an ethical program for cultivating the good character that is necessary for acquiring gnosis. Here we see the influence of Ancient Greek virtue ethics on al-Ghazālī’s ethical thought. According to al-Ghazālī, a character trait “is a firmly established condition of the soul, from which actions proceed easily without any need for thinking or forethought” (Disciplining the Soul: 17). Good character traits lead to beautiful acts, whereas bad character traits lead to ugly acts. Character, however, is not the same as action; rather, it is a “term for the condition and inner aspect of the soul” (Ibid. 18).

According to al-Ghazālī, the fundamental character traits are wisdom, courage, temperance, and justice, from which a variety of other character traits derive (Ibid. 18-19). Wisdom is the condition of the soul that allows it to distinguish truth from falsity in all acts of volition. Justice is the condition of the soul that controls desire through wisdom. Courage is the control of anger, or the irascible faculty, by the intellect. Temperance is the control of the appetitive faculty through the intellect and religious law. The purpose of cultivating this character is ultimately to cut off the love of the world in order to make space for the love of God in the heart (Ibid. 33).

The key to curing the diseases of the heart, according to al-Ghazālī, ultimately comes down to renouncing one’s desires. Al-Ghazālī writes that the essence of self-discipline consists in the soul “not taking pleasure in anything which will not be present in the grave” (Ibid. 60). The believer should restrict the fulfillment of their desires to the absolute necessities of life (for example, food, marriage, and clothing), and occupy their time in devotion to God. The believer who is solely occupied with the remembrance of God in this way is one of the “Truthful Saints” (Ibid.).

d. Theodicy: The Best of All Possible Worlds

In Faith in Divine Unity and Trust in Divine Providence, al-Ghazālī makes a remarkable claim (anticipating Leibniz’s optimism) about the omnipotence and benevolence of God, namely, that God created the best of all possible worlds: “Nor is anything more fitting, more perfect, and more attractive within the realm of possibility” (Faith in Divine Unity: 45-6). Similarly, in Principles of the Creed, al-Ghazālī writes that everything “proceeds forth from His justice in the best, most perfect, most complete, and most equitable way” (Principles of the Creed: 14). This constitutes a theodicy for the existence of evil in the actual world. Here is the immediate context for this claim:

For everything which God Most High distributes among His servants: care and an appointed time, happiness and sadness, weakness and power, faith and unbelief, obedience and apostasy—all of it is unqualifiedly just with no injustice in it, true with no wrong infecting it. Indeed, all this happens according to a necessary and true order, according to what is appropriate as it is appropriate and in the measure that is proper to it; nor is anything more fitting, more perfect, and more attractive within the realm of possibility. For if something were to exist and remind one of the sheer omnipotence [of God] and not of the good things accomplished by His action, that would utterly contradict [God’s] generosity, and be an injustice contrary to the Just One. And if God were not omnipotent, He would be impotent, thereby contradicting the nature of divinity. (Faith in Divine Unity: 45-6)

According to al-Ghazālī, the world and existence more generally reflect God’s complete benevolence and justice. God’s maximal power, manifested in the creation of the existing world, is balanced by his maximal benevolence. As Ormsby puts it in his seminal study of this text, al-Ghazālī’s arguments for having trust in God depend in part on divine providence and on the actual world being the best of all possible worlds:

The aspirant to trust in God must therefore learn to see the world as it really is—not as the product of blind chance or of any series of causes and effects, nor as the arena of his own endeavors, but as the direct expression of the divine will and wisdom, down to the least particular. Trust in God presupposes the recognition of the perfect Rightness of the actual. (1984: 43)

Al-Ghazālī, however, received much criticism from later theologians for his best-of-all-possible-worlds thesis. Broadly construed, three primary and interrelated criticisms were raised (Ormsby 1984; cf. Ogden 2016). The first criticism is that, in terms of possibility, the actual world is not the best of all possible worlds, as it certainly could be improved upon by reducing suffering and increasing goodness. The second criticism is that al-Ghazālī was following the views of the philosophers in thinking that the world is created by a natural necessity, in that God’s creation necessarily emanates from his essence and, relatedly, that since God is perfect, the creation that follows from his essence is necessarily perfect as well. This conflicts with the view that God has freedom and decrees what he wills. The third criticism is that al-Ghazālī’s best-of-all-possible-worlds thesis is dangerously close to a Mu’tazilite doctrine according to which God is obligated to do the best for creation. This conflicts with the Ash’arite doctrine according to which God is not obligated to create at all, because he is omnipotent and no constraints can be placed on Him.

6. Islamic Philosophy after Al-Ghazālī

Al-Ghazālī, with his writing of the Incoherence of the Philosophers and his condemnation of al-Farabi and Avicenna, is often charged with causing a decline of philosophy and science in the Islamic world. After al-Ghazālī, the thought goes, Islamic intellectuals abandoned philosophical and scientific inquiry in favor of mysticism, theology, and the traditional Islamic sciences. As Steven Weinberg writes:

Alas, Islam turned against science in the twelfth century. The most influential figure was the philosopher Abu Hamid al-Ghazālī, who argued in The Incoherence of the Philosophers against the very idea of laws of nature…after al-Ghazālī, there was no more science worth mentioning in Islamic countries. (2007)

In Montgomery Watt’s view, such a charge against al-Ghazālī is unjustified because Islamic philosophy was arguably already on the decline after the death of Avicenna:

It is tempting to conclude that his attack on the philosophers had been so devastating that philosophy was killed off; but such a conclusion is not justified. It is true that there were not outstanding philosophers in the east after 1100 who stood within the ‘pure’ Aristotelian and Neoplatonic tradition; but it is also true that the last great philosopher there, Avicenna, had died in 1037; twenty years before al-Ghazālī was born; and so the decline of philosophy may have begun long before the Tahfut [Incoherence of the Philosophers] appeared. (1962: 91)

According to other scholars, however, it is simply false that there was a decline in philosophy after the appearance of the Incoherence of the Philosophers. While the Incoherence of the Philosophers may have convinced certain orthodox thinkers to steer clear of philosophy and even encouraged some persecution, the Islamic philosophical tradition lived on. As Frank Griffel argues: “If al-Ghazālī tried to establish thought-police in Islam, he remained unsuccessful. There was simply no Inquisition in Islam” (2007: 158). In addition to Averroes, who wrote the Incoherence of the Incoherence, there is a rich tradition of post-classical Islamic philosophy, including a variety of philosophers who have been neglected in the story of Islamic philosophy (for example, Suhrawardī and the Illuminationist school of thought). As such, while al-Ghazālī’s Incoherence of the Philosophers was influential, it arguably did not put an end to philosophy in the Islamic world.

7. References and Further Reading

a. Primary Sources

  • Al-Ghazālī. 1980. Deliverance from Error. Translated by R.J. McCarthy. Louisville: Fons Vitae.
  • Al-Ghazālī. 1998. The Niche of Lights: A Parallel English-Arabic Text. Edited and Translated by D. Buchman. Provo: Brigham Young University Press.
  • Al-Ghazālī. 2000. The Incoherence of the Philosophers: A Parallel English-Arabic Text. Edited and Translated by M.E. Marmura, 2nd ed. Provo: Brigham Young University Press.
  • Al-Ghazālī. 2001. Kitāb al-Tawḥid wa’l-Tawakkul (Faith in Divine Unity and Trust in Divine Providence) [Book XXXV]. Translated by D.B. Burrell. Louisville: Fons Vitae.
  • Al-Ghazālī. 2002. On the Boundaries of Theological Tolerance in Islam: Abu Hamid al-Ghazālī’s Faysal al-Tafriqa. Translated by S.A. Jackson. Karachi: Oxford University Press.
  • Al-Ghazālī. 2010. Kitāb sharḥ ‘ajā’ib al-qalb (The Marvels of the Heart) [Book XXI]. Translated by W.J. Skellie. Louisville: Fons Vitae.
  • Al-Ghazālī. 2013. Al-Ghazali’s Moderation in Belief. Translated by A.M. Yaqub. Chicago: University of Chicago Press.
  • Al-Ghazālī. 2015. Kitāb al-‘ilm (The Book of Knowledge) [Book I]. Translated by K. Honerkamp. Louisville: Fons Vitae.
  • Al-Ghazālī. 2016. Kitāb qawā‘id al-‘aqā’id (The Principles of the Creed) [Book II]. Translated by K. Williams. Louisville: Fons Vitae.
  • Al-Ghazālī. 2016. On Disciplining the Soul and On Breaking the Two Desires: Books XXII and XXIII of the Revival of the Religious Sciences. Translated by T.J. Winter. Cambridge: The Islamic Texts Society.
  • Averroes. 1954. Averroes’ Tahafut Al-Tahafut (The Incoherence of the Incoherence). Translated by S. van den Bergh. 2 vols. London: Luzac.
  • Averroes. 2001. The Book of the Decisive Treatise Determining the Connection Between Law and Wisdom. Edited and Translated by C. Butterworth. Provo: Brigham Young University Press.
  • Avicenna. 2005. The Metaphysics of The Healing: A Parallel English-Arabic Text. Edited and Translated by M.E. Marmura. Provo: Brigham Young University Press.

b. Secondary Sources

  • Abrahamov, Binyamin. 1988. “Al-Ghazālī’s Theory of Causality.” Studia Islamica 67: 75-98.
  • Abrahamov, Binyamin. 1993. “Al-Ghazālī’s Supreme Way to Know God.” Studia Islamica 77: 141-168.
  • Al-Allaf, Mashhad. 2006. “Al-Ghazālī on Logical Necessity, Causality, and Miracles.” Journal of Islamic Philosophy 2 (1): 37-52.
  • Albertini, Tamara. 2005. “Crisis and Certainty of Knowledge in Al-Ghazālī and Descartes.” Philosophy East and West 55: 1-14.
  • Azadpur, Mohammad. 2003. “Unveiling the Hidden: On the Meditations of Descartes & al-Ghazālī.” In The Passions of the Soul: A Dialogue Between Phenomenology and Islamic Philosophy, edited by Anna-Teresa Tymieniecka, 219-240. Kluwer.
  • Belo, C. 2006. “Averroes on God’s Knowledge of Particulars.” Journal of Islamic Studies 17 (2): 177-199.
  • Burrell, David B. 1987. “The Unknowability of God in Al-Ghazali.” Religious Studies 23 (2): 171-182.
  • Campanini, Massimo. 2018. Al-Ghazali and the Divine. New York: Routledge.
  • Dutton, Blake. 2001. “Al-Ghazālī on Possibility and the Critique of Causality.” Medieval Philosophy and Theology 10 (1): 23-46.
  • Ferhat, Loumia. 2020. “Al-Ghazālī’s Heart as a Medium of Light: Illumination and the Soteriological Process.” Journal of Islamic Ethics 4 (1-2): 201-222.
  • Goodman, Lenn Evan. 1978. “Did Al-Ghazālī Deny Causality?” Studia Islamica 47: 83-120.
  • Götz, Ignacio. 2003. “The Quest for Certainty: Al-Ghazālī and Descartes.” Journal of Philosophical Research 28: 1–22.
  • Griffel, Frank. 2001. “Toleration and Exclusion: Al-Shāfiʾī and al-Ghazālī on the Treatment of Apostates.” Bulletin of the School of Oriental and African Studies, University of London 64 (3): 339-354.
  • Griffel, Frank. 2005. “Taqlîd of the Philosophers. Al-Ghazâlî’s Initial Accusation in the Tahâfut.” In Ideas, Images, and Methods of Portrayal. Insights into Arabic Literature and Islam, edited by S. Günther, 253-273. Leiden: Brill.
  • Griffel, Frank. 2009. al-Ghazālī’s Philosophical Theology. New York: Oxford University Press.
  • Griffel, Frank. 2011. “The Western Reception of al-Ghazālī’s Cosmology from the Middle Ages to the 21st Century.” Dîvân: Disiplinlerarası Çalışmalar Dergisi/Dîvân: Journal of Interdisciplinary Studies 16: 33-62.
  • Griffel, Frank. 2012. “Al-Ghazālī’s Use of “Original Human Disposition” (Fiṭra) and Its Background in the Teachings of Al-Fārābī and Avicenna.” The Muslim World 102 (1): 1-32.
  • Griffel, Frank. 2021. The Formation of Post-Classical Philosophy in Islam. New York: Oxford University Press.
  • Hadisi, Reza. 2022. “Ghazālī’s Transformative Answer to Scepticism.” Theoria 88 (1): 109-142.
  • Hasan, Ali. 2013. “Al-Ghazali and Ibn Rushd (Averroes) on Creation and the Divine Attributes.” In Models of God and Alternative Ultimate Realities, edited by Jeanine Diller & Asa Kasher, 141-156. The Netherlands: Springer.
  • Hesova, Zora. 2012. “The Notion of Illumination in the Perspective of Ghazali’s Mishkāt-al-Anwār.” Journal of Islamic Thought and Civilization 2: 65-85.
  • Hourani, George. 1958. “The Dialogue Between Al-Ghazālī and the Philosophers on the Origin of the World.” Muslim World 48: 183-191.
  • Kukkonen, Taneli. 2000. “Possible Worlds in the Tahâfut al-Falâsifa: Al-Ghazālī on Creation and Contingency.” Journal of the History of Philosophy 38: 479-502.
  • Kukkonen, Taneli. 2010. “Al-Ghazālī’s Skepticism Revisited.” In Rethinking the History of Skepticism: The Missing Medieval Background, edited by Henrik Lagerlund, 103-129. Leiden: Brill.
  • Kukkonen, Taneli. 2010. “Al-Ghazālī on the Signification of Names.” Vivarium 48 (1/2): 55-74.
  • Kukkonen, Taneli. 2012. “Receptive to Reality: Al-Ghazālī on the Structure of the Soul.” The Muslim World 102: 541-561.
  • Kukkonen, Taneli. 2016. “Al-Ghazālī on the Origins of Ethics.” Numen 63 (2/3): 271-298.
  • Lewes, George Henry. 1970. The Biographical History of Philosophy [originally published 1857]. London: J.W. Parker & Son.
  • Marmura, Michael E. 1965. “Ghazālī and Demonstrative Science.” Journal of the History of Philosophy 3: 183-204.
  • Marmura, Michael E. 1981. “Al-Ghazālī’s Second Causal Theory in the 17th Discussion of His Tahâfut.” In Islamic Philosophy and Mysticism, edited by Parviz Morewedge, 85-112. Delmar: Caravan Books.
  • Marmura, Michael E. 1995. “Ghazālīan Causes and Intermediaries.” Journal of the American Oriental Society 115: 89-100.
  • Martin, Nicholas. 2017. “Simplicity’s Deficiency: Al-Ghazālī’s Defense of the Divine Attributes and Contemporary Trinitarian Metaphysics.” Topoi 36 (4): 665-673.
  • Menn, Stephen. 2003. “The Discourse on the Method and the Tradition of Intellectual Autobiography.” In Hellenistic and Early Modern Philosophy, edited by Jon Miller and Brad Inwood, 141-191. Cambridge: Cambridge University Press.
  • Moad, Edward Omar. 2005. “Al-Ghazali’s Occasionalism and the Natures of Creatures.” International Journal for Philosophy of Religion 58 (2): 95-101.
  • Moad, Edward Omar. 2007. “Al-Ghazali on Power, Causation, and Acquisition.” Philosophy East and West 57 (1): 1-13.
  • Moad, Edward Omar. 2008. “A Significant Difference Between Al-Ghazālī and Hume on Causation.” Journal of Islamic Philosophy 3: 22-39.
  • Moad, Edward Omar. 2009. “Comparing Phases of Skepticism in al-Ghazālī and Descartes: Some First Meditations on Deliverance from Error.” Philosophy East and West 59 (1): 88-101.
  • Naumkin, V.V. 1987. “Some Problems Related to the Study of Works by al-Ghazālī.” In Ghazālī, la raison et le miracle, 119-124. Paris: UNESCO.
  • Ogden, Stephen R. 2016. “Problems in Al-Ghazālī’s Perfect World.” In Islam and Rationality, edited by F. Griffel, 54-89. Leiden: Brill.
  • Ormsby, Eric Linn. 1984. Theodicy in Islamic Thought. The Dispute over al-Ghazālī’s ‘Best of All Possible Worlds.’ Princeton: Princeton University Press.
  • Özel, Aytekin. 2008. “Al-Ghazālī’s Method of Doubt and its Epistemological and Logical Criticism.” Journal of Islamic Philosophy 4: 69-76.
  • Quasem, Muhammad Abul. 1974. “Al-Ghazali’s Rejection of Philosophic Ethics.” Islamic Studies 13 (2): 111-127.
  • Richardson, Kara. 2015. “Causation in Arabic and Islamic Thought (The Stanford Encyclopedia of Philosophy).” Edited by Edward N. Zalta. Stanford Encyclopedia of Philosophy. Winter 2020 Edition. https://plato.stanford.edu/archives/win2020/entries/arabic-islamic-causation/.
  • Riker, Stephen. 1996. “Al-Ghazālī on Necessary Causality in ‘The Incoherence of the Philosophers.’” The Monist 79 (3): 315-324.
  • Ruddle-Miyamoto, Akira O. 2017. “Regarding Doubt and Certainty in al-Ghazālī’s Deliverance from Error and Descartes’ Meditations.” Philosophy East and West 67 (1): 160-176.
  • Treiger, Alexander. 2007. “Monism and Monotheism in al-Ghazālī’s Mishkāt al-Anwār.” Journal of Qur’anic Studies 9 (1): 1-27.
  • Treiger, Alexander. 2012. Inspired Knowledge in Islamic Thought. Al-Ghazālī’s Theory of Mystical Cognition and its Avicennian Foundation. London and New York: Routledge.
  • Van Ess, Josef. 2018. “Quelques Remarques Sur le Munqid min al-dalal.” In Kleine Schriften Vol. 3, edited by Hinrich Biesterfeldt, 1526-1537. Leiden: Brill.
  • Watt, W. Montgomery. 1962. Islamic Philosophy and Theology. Edinburgh: Edinburgh University Press.
  • Watt, W. Montgomery. 1963. Muslim Intellectual: A Study of al-Ghazālī. Edinburgh: Edinburgh University Press.
  • Weinberg, Steven. 2007. “A Deadly Certitude.” Times Literary Supplement, January 17.
  • Wilson, Catherine. 1996. “Modern Western Philosophy.” In History of Islamic Philosophy, edited by Seyyed Hossein Nasr and Oliver Leaman. London: Routledge.
  • Zamir, Syed Rizwan. 2010. “Descartes and al-Ghazālī: Doubt, Certitude and Light.” Islamic Studies 49 (2): 219-251.

 

Author Information

Saja Paravizian
Email: sparavizian@outlook.com
University of Illinois, Chicago
U. S. A.

Alan Gewirth (1912-2004)

Gewirth photo, courtesy of University of Chicago News Office

Alan Gewirth was an American philosopher, famous for his argument that universal human rights can be rationally justified as the outcome of claims necessarily made by rational agents. According to this argument, first outlined in Reason and Morality (1978), all agents necessarily want to be successful in their actions, and since freedom and well-being are the generally necessary conditions of successful agency, every agent must claim rights to freedom and well-being. As the justifying reason for the agent’s rights-claim is the very fact that she is an agent with purposes that she wants to realize, she must also accept the universalized claim that all agents have rights to freedom and well-being. Gewirth calls these rights generic as they correspond to features generically necessary to successful agency. Hence, the supreme principle of morality is the Principle of Generic Consistency (PGC), stating that every agent should act in accord with the generic rights of the recipients of her actions as well as of herself. While freedom refers to one’s control of one’s behaviour in accordance with one’s own unforced choice, well-being refers to the general conditions needed for one to be able to act and to maintain and expand one’s capacity for successful agency.

The PGC applies not only to interpersonal actions but also to social and political morality as it justifies rules and institutions needed for the protection of the generic rights to freedom and well-being at the level of political communities. The minimal state, preventing violations of basic rights, as well as the democratic state, upholding the right to freedom at the political level, are justified in this way. In The Community of Rights (1996), Gewirth argues that the PGC also justifies a more supportive state, involving rights to economic democracy and to productive agency as further specifications of the generic rights to freedom and well-being.

In his last published work Self-Fulfillment (1998), Gewirth outlines a normative theory of self-fulfilment based on a distinction between aspiration-fulfilment and capacity-fulfilment. In aspiration-fulfilment, one’s aim is to satisfy one’s deepest desires; in capacity-fulfilment it is to make the best of oneself. While one’s deepest desires might be for goals that are unrealistic or even immoral, trying to make the best of oneself requires that one’s goals and aspirations are consistent with the requirements of reason. Thus, capacity-fulfilment involves making the most effective possible use of one’s freedom and well-being within the limits set by the PGC as the rationally justified supreme principle of morality.

Beginning with an analysis of the normative claims involved in agency, Gewirth manages not only to justify a supreme moral principle, but also to derive implications of that principle for political and personal morality. His work is not only a major contribution to contemporary moral philosophy, but also an impressive example of how philosophy can make sense of our lives as agents who are rationally committed to morality.

Table of Contents

  1. Life
  2. Gewirth in the Context of Twentieth Century Moral Philosophy
  3. Gewirth’s Moral Theory: Agency, Reason, and the Principle of Generic Consistency
    a. The Normative Structure of Agency and the Necessary Goods of Agency
    b. Conflicting Rights
  4. The Community of Rights
  5. The Good Life of Agents
  6. References and Further Reading
    a. Primary Works
      i. Monographs
      ii. Articles and Book Chapters
    b. Secondary Works

1. Life

Alan Gewirth was born in Union City, New Jersey, on November 28, 1912 as Isidore Gewirtz. His parents, Hyman Gewirtz and Rose Lees Gewirtz, were immigrants from what was then Tsarist Russia, where the antisemitic pogroms of the early twentieth century forced many people to cross the Atlantic in the hope of a new beginning and a better life for themselves. Gewirth later dedicated his 1982 book Human Rights “To the memory of my Mother and Father and to Aunt Rebecca and Cousin Libby who as young emigrants from Czarist Russia knew the importance of human rights.” At age eleven, after having been teased by playmates on the schoolyard as “Dizzy Izzy,” he announced to his parents that from now on his first name was to be Alan. The source of inspiration here was a character, Alan Breck Stewart, in Robert Louis Stevenson’s historical adventure novel Kidnapped. In the novel, Alan Breck Stewart was an eighteenth-century Scottish Jacobite whom the young boy Isidore Gewirtz admired as a fearless man of the people. Later, in 1942, he changed his last name from Gewirtz to Gewirth. At a time when antisemitism was also rife in the US, many Jewish Americans found it necessary to anglicize their names. In this way Isidore Gewirtz became Alan Gewirth.

His father, who once had entertained a dream of becoming a concert violinist, gave him violin lessons when Alan was just four or five years old, and later had him take professional lessons. At around age eleven or twelve, Alan himself started to give violin lessons to younger children in the family’s apartment. After entering Columbia University in 1930, he joined the Columbia University Orchestra as a violinist, becoming concertmaster in 1934.

At Columbia, Gewirth was encouraged to pursue philosophical studies by his teacher Richard McKeon. In 1937, he became McKeon’s research assistant at the University of Chicago. Gewirth served in the US Army from 1942 to 1946, moving up the ranks from private to captain, after which he spent the 1946-47 academic year at Columbia on the GI Bill, completing his doctorate in philosophy with a dissertation on Marsilius of Padua and medieval political philosophy (published as a book in 1951). From 1947 onwards, he was a regular member of the faculty of the University of Chicago, and from 1960 a full professor of philosophy. Gewirth was elected a Fellow of the American Academy of Arts and Sciences in 1975, and served as President of the American Philosophical Association Western Division (1973-74) as well as President of the American Society for Political and Legal Philosophy (1983-84). He was the recipient of several prizes and awards, including the Gordon J. Laing Prize for Reason and Morality. He was appointed the Edward Carson Waller Distinguished Service Professor of Philosophy at the University of Chicago in 1975; he became an Emeritus Professor in 1992.

Gewirth continued to give lectures well into his eighties, teaching a course on the philosophical foundations of human rights within the newly constituted Human Rights Program at the University of Chicago as late as 1997-2000. His last public talk was given in August 2003 at the XXI World Congress of Philosophy in Istanbul, Turkey. Alan Gewirth died on May 9, 2004. He was married three times: between 1942 and 1954 to Janet Adams (1916-1997), from 1956 until her death to Marcella Tilton (1928-1992), and from 1996 to Jean Laves (1936-2018). In his first marriage, he was the father of James Gewirth and Susan Gewirth Kumar; in his second marriage, he was the father of Andrew Gewirth, Daniel Gewirth, and Letitia Rose Gewirth Naigles. His younger brother was the educational psychologist Nathaniel Gage (1917-2008).

2. Gewirth in the Context of Twentieth Century Moral Philosophy

The twentieth century was not a very hospitable age for philosophers trying to provide an objectivist foundation for moral principles. In the first half of the century the dominant mode of philosophical thinking about ethics was emotivist and non-cognitivist. Emotivism regarded moral statements about what is right and wrong as mere expressions of the speaker’s attitudes and of her desire to make us share these attitudes. Like other emotive statements, they could be neither true nor false, and there could be no way of proving them. Moral pronouncements came to be thought of as similar to claims made in advertising or in various forms of propaganda. Taking its point of departure in the works of philosophers such as A. J. Ayer in the U.K. and Charles Leslie Stevenson in the U.S., emotivism maintained a dominant presence in analytic philosophy throughout the Cold War years.

The later years of the twentieth century witnessed the rise of postmodernism and a return of pre-modernist cultural relativism. This created room for moral values in public debate, but these values were regarded as relative to various traditions or cultures. Once again, belief in objective and universally justified moral values, such as human rights, was rejected as a culturally produced superstition. However, there was also the additional suspicion that any talk of universal moral values was in reality a disguised attempt by the specific culture of Western Enlightenment to impersonate global reason. Hence, the discourse of universal human rights could either be dismissed as a form of cultural imperialism or find itself compared with a belief in witches and unicorns. The latter claim was made by Alasdair MacIntyre, one of the leading proponents of a communitarian form of relativism.

Against this rather hostile background, Alan Gewirth took it upon himself to prove that a rational foundation could indeed be given for normative ethics, a foundation that would be valid for all rational agents regardless of their subjective preferences or cultural context. Over a period of twenty years, Gewirth published four books and more than sixty journal articles, developing and defending his argument that we, as rational agents who want to realize our goals, logically must claim rights to freedom and well-being, since these rights are the necessary conditions of all successful action. His theory has been the topic of one large analytical monograph and three edited anthologies of philosophical comments and criticism (see References and Further Reading for details). It continues to receive the attention of human rights scholars, moral philosophers, sociologists, and political scientists, and is likely to have a lasting influence on normative analysis and debate in general.

3. Gewirth’s Moral Theory: Agency, Reason, and the Principle of Generic Consistency

In Reason and Morality (1978), Gewirth described his theory as “a modified naturalism” (Gewirth 1978, 363). He wanted to anchor morality in the empirical world of agents and in the canons of deductive and inductive logic, rather than relegating it to intuitions or emotive attitudes that could be arbitrarily accepted or rejected.

Agency is a natural context for a moral theory given that moral prescriptions are normally about what we should or should not do. Here it could be objected that at least one type of moral theory, virtue ethics, is about what kind of dispositions we should have and so focuses on what we should be like rather than on what we should do. However, virtues are at least implicitly related to agency, as the dispositions that they are meant to cultivate are manifested in the ways virtuous persons act. A prudent person is a person who acts prudently, a courageous person is one who acts courageously, a temperate person is one who acts temperately, and so on. Moreover, a person’s inculcating and developing virtues in herself involves agency – as Aristotle pointed out, one becomes prudent, courageous, and temperate by acting prudently, courageously, and temperately; virtues are acquired by practice and habituation.

a. The Normative Structure of Agency and the Necessary Goods of Agency

Any kind of naturalist normative ethics will face the formidable obstacle known as Hume’s Law, saying that we cannot derive an “ought” from an “is.” According to Hume’s Law, descriptive and prescriptive statements inhabit different logical domains. That is, we cannot derive moral conclusions directly from non-moral empirical premises. For instance, from the descriptive observation that, as a matter of fact, most people in a particular society are in favour of a criminalization of blasphemy, we cannot derive the prescriptive conclusion that blasphemy indeed should be criminalized (in this or in any other society).

However, Gewirth argues that agency provides a context in which it is indeed possible to escape the constraints of Hume’s Law. This is so because agency in itself has a normative structure involving evaluative and normative judgements made by agents about the very conditions of their agency. Gewirth begins with certain evaluative judgements that necessarily must be made by all rational agents. He then moves dialectically from these judgements to a moral rights-claim that likewise must be embraced by all rational agents. To derive moral rights in this way is to proceed in accordance with the Dialectically Necessary Method. The claim that all agents have rights is therefore presented not as a moral conclusion derived from a non-moral empirical premise about what all agents need, but instead as what rationally consistent agents must accept, given their own necessary evaluative judgements about what it means to be an agent.

To begin with, every agent can be assumed to consider the purpose of her action as something good. Gewirth takes this as a conceptual truth of agency. We involve ourselves in agency for the sake of something that we want to achieve by our action, and in this sense our action reveals a positive evaluation of its purpose. However, for an agent to hold her purpose to be good is not necessarily for her to make a moral evaluation. In this context, “good” should be understood simply as “worth achieving,” according to whatever criteria of worth an agent may have. These criteria need not be moral or even prudent ones. For instance, a burglar might have as his purpose to break into people’s homes and steal their possessions, and a person of a hedonist persuasion might have as his purpose to eat and drink as much as he can, although he is aware that this is bad for his long-term health. Still, the burglar and the hedonist hold their respective purposes to be good in the minimal sense that they want to pursue them; this is why they move from non-action to action with the intention of realizing their purposes.

Hence, agency is purposive and for an agent to have a purpose involves that she holds it to be good. However, agency must also be voluntary, in that the agent must be able to control her behaviour by means of her own unforced choice. It must be her choice to act in a certain way; otherwise, the purposes for which she acts would not be her purposes.

Gewirth’s conceptual analysis thus results in an account of action as being controlled by the agent’s unforced and informed choice (voluntariness) and guided by her intention to realize some goal of hers that she judges to be good (purposiveness). Voluntariness and purposiveness are the generic features of agency, as they necessarily pertain to all actions.

Having argued that agency involves voluntariness and purposiveness, and that purposiveness involves the agent’s positive evaluation of her goals of action, Gewirth goes on to claim that all agents must hold that the capacities and conditions needed for generally successful agency constitute necessary goods, regardless of their many different particular goals of action. Of course, the capacities and conditions needed for climbing mountains are very different from the capacities and conditions needed for writing a doctoral dissertation in philosophy, not to speak of the capacities and conditions needed for being a good chess player or being good at growing tomatoes. However, common to all agents, including the mountain climber, the dissertation writer, the chess player, and the tomato grower, is the need for certain general capacities and conditions without which it is either impossible or at least unlikely that they will be successful in the realization of any of their purposes. As every agent necessarily wants to be successful in her actions – the very point of her agency being to achieve her purposes – every agent must accept that the capacities and conditions generally needed for any pursuit are necessary goods. They are goods, in the sense of being objects of positive value, and they are necessary, in the sense that no agent can do without them.

According to Gewirth, the necessary goods of agency are freedom and well-being. Freedom and well-being can also be conceptualized as the generic goods of agency, as they correspond to the generic features of agency, voluntariness and purposiveness. Freedom, corresponding to voluntariness, hence refers to the agent’s actual ability to control her behaviour in accordance with her informed and unforced choice. It requires that the agent is not subjected to violence, coercion, and deception in a way that negatively affects her capacity to control her agency. Moreover, it also requires that the agent should not suffer from any compulsive obsession that would interfere with her capacity for informed and unforced choice; nor should she be addicted to drugs that would negatively affect her capacity to control her behaviour. Without freedom, a person’s behaviour would not qualify as agency, as it would not reflect her choices and her will; instead, she would resemble a slave, being the tool of other people, or the powerless victim of uncontrollable impulses.

Well-being, corresponding to purposiveness, refers to the agent’s possession of abilities and resources necessary to her successful realization of her purposes, involving basic preconditions of agency as well as the conditions required for maintaining and developing capacities for agency. The well-being relevant to successful agency therefore has three levels. Basic well-being includes life, health, physical integrity, and mental equilibrium, as well as objects necessary to maintain life and health, such as food, shelter, and clothing. Nonsubtractive well-being includes whatever is necessary to an agent’s maintaining an undiminished capacity for agency, such as not being the victim of theft, broken promises, malicious slander, or generally unsafe conditions of life and work. Additive well-being, finally, includes whatever is necessary to expand an agent’s capacity for agency, such as having self-esteem and the virtues of prudence, temperance, and courage, as well as education, income, and wealth.

Since all agents necessarily want to be successful in their actions, and since freedom and well-being are necessary to all successful action, all rational agents must find it unacceptable to be deprived of or prevented from having freedom and well-being. Consequently, “[s]ince the agent regards as necessary goods the freedom and well-being that constitute the generic features of his successful action, he logically must also hold that he has rights to these generic features” (Gewirth 1978, 63). For an agent not to claim such rights would be for her to hold that it is acceptable that she is left without freedom and well-being. But she cannot hold this to be acceptable, since she, simply by being an agent, must view freedom and well-being as indispensable goods. Hence, any agent who denied that she has rights to freedom and well-being would thereby involve herself in a contradiction, since she would both hold and at the same time deny that she must have freedom and well-being.

Now, so far, the agent has not made a moral rights-claim. She has only made a prudential rights-claim, that is, a rights-claim that is intended to protect the agent’s own interest in being a successful and efficient agent. However, since the sufficient ground for her rights-claim is simply the fact that she is an agent with purposes that she wants to fulfil, she must also recognize that the same rights-claim can and must be made by all other agents as well. Thus, every rational agent must accept the normative conclusion “All agents have rights to freedom and well-being.” Now this is a moral rights-claim, since it refers to the important interests not only of the individual agent, but of all agents. More precisely, it refers to all prospective purposive agents, since the claim applies not only to persons who are presently involved in agency, but also to persons who in the future can be expected to engage in agency.

Here it is once again important to note that Gewirth does not derive moral rights from facts about agency. His argument is not of the form “Because A is an agent, A has moral rights to freedom and well-being.” Instead he argues that each and every rational agent, from within her own perspective as such an agent, must claim rights to freedom and well-being. That is, from within her own perspective as an agent, A holds that (1) “My freedom and well-being are necessary goods.” Having accepted (1), A, who wants to achieve her goals of action and who is unable to achieve these goals without freedom and well-being, must, on pain of self-contradiction, embrace the evaluative judgement (2) “I must have freedom and well-being.” The “must” of (2) has implications for how A has to conceive of possible threats to her possession of freedom and well-being emanating from other persons. That is, she is logically compelled to hold that (3) “Other persons should not interfere with my having freedom and well-being.” And since (3) is equivalent to claiming a protected possession of freedom and well-being, it is also equivalent to a rights-claim: (4) “I have rights to freedom and well-being.” And since the sufficient condition of A’s rights-claim is that A is an agent, A must then also accept the generalized claim (5) “All agents have rights to freedom and well-being.” While (4) is a prudential rights-claim, (5) is a moral rights-claim, referring to rights had by all agents and not only by A.

Strictly speaking, what Gewirth has proven is not that all agents have rights to freedom and well-being, but that all rational agents must hold that all agents have rights to freedom and well-being. However, this does not in any way diminish the practical relevance of Gewirth’s argument. As he himself observes, “what a rational agent ought to do … is what he is rationally justified in thinking he ought to do. But what he is rationally justified in thinking he ought to do is what he logically must accept that he ought to do” (Gewirth 1978, 153). Hence, if every agent must logically accept that all agents have rights to freedom and well-being, then every agent has as good reason as there could ever be to adhere to a moral principle prescribing universal respect for the equal rights of all agents to freedom and well-being.

Such a moral principle is also the outcome of Gewirth’s argument. This is the Principle of Generic Consistency (PGC): “Act in accord with the generic rights of your recipients as well as of yourself” (Gewirth 1978, 135). The term “recipients” refers to the people who are affected by an agent’s action, while “generic rights” denotes rights to freedom and well-being, the generic and necessary goods of agency. Therefore, without any loss of meaning, the PGC could also be stated as “Act in accord with the rights to freedom and well-being of your recipients as well as of yourself.”

The rights prescribed by the PGC are both negative and positive. Agents are required both not to interfere with their recipients’ possession of freedom and well-being, and to help them have freedom and well-being when they are unable to secure these necessary goods by their own efforts, and when help can be given at no comparable cost to the helping agent. To refuse to give such help would be tantamount to a practical denial of the equality of rights prescribed by the PGC.

b. Conflicting Rights

The rights of different agents and recipients may conflict with each other: an unlimited right to freedom for murderers and robbers would allow them to infringe the rights to well-being for their victims; to uphold the right to basic well-being for poor people might require a welfare state that taxes wealthier people and so interferes with their right to property; the additive right to well-being includes a right to have an income, but to have an income from selling drugs to children conflicts with the children’s rights to freedom and basic well-being, and so on. Now, according to Gewirth, conflicts between rights are to be resolved in accordance with three different criteria.

The first criterion is about Prevention or Removal of Inconsistency, according to which agents violating the generic rights of their recipients – aggressor agents, for short – can have their own generic rights justifiably interfered with. Agents who interfere with their recipients’ rights to freedom and well-being by, for instance, coercing them, threatening them, manipulating them, killing them, terrorizing them, assaulting them, or stealing from them are guilty of an inconsistency in that they deny rights to others that they must claim for themselves. Even if they do not make an explicit claim to be superior to other agents, their actions involve a practical rejection of the PGC.

In order to protect or restore the equality of rights prescribed by the PGC, aggressor agents must be either prevented from violating their recipients’ rights in the first place or, if they have already violated these rights, punished for their transgressions. The force used in preventing aggressor agents from violating their recipients’ rights should not exceed what is necessary to protect these rights. Likewise, the severity of the punishment meted out to aggressor agents should be proportionate to the seriousness of the violation of rights that they have inflicted on their victims. Preventing and punishing violations of rights will necessarily involve an interference with aggressor agents’ rights to freedom and well-being, but as it is needed to uphold the equality of rights prescribed by the PGC, such interference is also morally justified by that principle.

The criterion concerning Prevention or Removal of Inconsistency also points to the need for a legal system that can administer punishment in a fair manner. Such a legal system involves laws, courts, judges, prosecutors, defence lawyers, and police officers. By extension, the justification of such a legal system implies the justification of states that are capable of implementing and upholding rights-protecting laws within their borders.

The second criterion is about Degrees of Needfulness for Action, giving priority to the right whose object is most needed for successful agency. For instance, the right to life (basic well-being) is more important to successful agency than the right not to have one’s property interfered with (nonsubtractive well-being). Hence, if I can save a drowning child only at the cost of ruining my clothes, I have a duty to do so. Likewise, at the political or societal level, taxing wealthy people for the sake of providing poor people with basic healthcare or providing their children with basic education is morally justified. It is not that wealthy people do not have a right to non-interference with their property, but instead that this right is overridden by the rights of poorer people to life, health, and education. This is so, not because the poor usually outnumber the rich, but because the specific rights of the poor that we are considering here are more important from the point of view of successful agency than the right of wealthier people to have their wealth left untouched.

While the criterion concerning Prevention or Removal of Inconsistency mainly deals with negative rights – rights not to have one’s freedom or well-being interfered with – the criterion concerning Degrees of Needfulness for Action has more to do with positive rights – rights to have one’s freedom and well-being effectively upheld and protected by other agents. The criterion of Degrees of Needfulness for Action also has a bearing on the justification of what is commonly known as the welfare state, that is, a state which upholds the basic and additive rights of well-being of all citizens by means of redistributive taxation.

The third criterion is about Institutional Requirements, allowing for interference with people’s freedom and well-being when doing so is necessary to uphold institutions and social rules that are themselves required by the PGC. Thus, when a judge sentences a murderer to lifetime imprisonment, she interferes with the murderer’s right to freedom, but she is not thereby doing anything morally wrong. On the contrary, she is representing a legal system upholding the generic rights to freedom and well-being as prescribed by the PGC. More specifically, she represents an institution – the state with its laws, courts, judges, police officers, and so on – designed to remove inconsistencies as described by the first criterion above. The judge is not acting as a private person, but as a representative of the law; therefore, although the convicted murderer has not interfered with the judge’s freedom, he cannot argue against the judge that she has violated the PGC by denying him his right to freedom. This is not a case of an individual agent acting on an individual recipient, but of a representative of the law upholding a moral (and legal) principle of justice.

Many rights are not absolute, as they can be overridden by other, more important, rights, as outlined by the criterion of Degrees of Needfulness for Action. Gewirth hence defends a consequentialist position, according to which the rightness of an action depends on how it affects the recipient’s rights, and, in cases of conflicts between rights, on how upholding one right affects other, and possibly more important, rights. This is what he calls a deontological consequentialism (Gewirth 1978, 216), focusing on protecting rights rather than on producing good results in general. Thus, it should be distinguished from a utilitarian consequentialism, according to which the right action or rule of action is the one that results in the greatest total quantity of happiness or preference satisfaction, regardless of how benefits and burdens, pleasures and pains are distributed among individuals.

According to Gewirth, there are indeed rights that are absolute, in the sense that they cannot be overridden. As an example, he presents the case of Abrams, a young lawyer and prominent member of society, who is being blackmailed by a group of terrorists threatening to use nuclear weapons against the city in which Abrams lives, unless Abrams publicly tortures his mother to death. So – should Abrams give in to the terrorists’ threats and torture his mother to death for the sake of saving the lives of thousands of his fellow citizens?

Gewirth says no. He provides an argument based on agent responsibility that is capable of explaining why the right to basic well-being of Abrams’s mother in this case is indeed absolute. Central to his argument is the Principle of the Intervening Action. According to this principle, if there is a causal connection between one person A’s action or inaction X and some harm Z being inflicted on some other person C, then A’s moral responsibility for Z is removed if, between X and Z, there intervenes some person B’s action Y, and Y is what actually brings about Z (Gewirth 1982, 229).

In the case of Abrams and the terrorists, the Principle of the Intervening Action means that although there is a causal connection between Abrams’s refusing to torture his mother to death and the deaths of thousands of innocent people, Abrams is not morally responsible for their deaths. This is so because there is an action that intervenes between his refusal and the death of these people and which actually brings about their deaths, namely, the terrorists’ use of their nuclear weapons. It is not Abrams’s refusal by itself that kills thousands of innocent people, but the terrorists’ use of their weapons. No one is forcing the terrorists to kill anyone, least of all Abrams; they freely choose to detonate their bombs or whatever type of weapon they have at their disposal as a response to Abrams’s refusal to torture his mother to death. This is their decision and their action, no one else’s.

Thus, the terrorists are both causally and morally responsible for the deaths of these thousands of innocent people, not Abrams:

The important point is not that he lets these persons die rather than kills them, or that he does not harm them but only fails to help them, or that he intends their deaths only obliquely but not directly. The point is rather that it is only through the intervening lethal actions of the terrorists that his refusal eventuates in the many deaths. (Gewirth 1982, 230)

The conflict here is not between Abrams’s refusal to torture his mother to death and the survival of thousands of innocent townspeople, but between their survival and the terrorists’ intention to use nuclear weapons against them. Consequently, Abrams’s duty to respect his mother’s right to basic well-being is not affected by the terrorists’ threat. In this sense, Abrams’s mother has an absolute right not to be tortured to death by her son. By implication, all innocent persons (and not only mothers) have an absolute right not to be tortured to death by anyone (and not only by their sons).

4. The Community of Rights

The PGC applies not only to individual agents and their interaction with their recipients, but also to the collective level of political communities, institutions, and states. The minimal state with its laws against criminal transgressions such as murder, rape, robbery, fraud, enslavement, and so on is justified as instrumentally necessary to the protection of negative rights to freedom and well-being, that is, rights that are about not having one’s freedom and well-being interfered with. The right to freedom also justifies the democratic state, that is, a state that functions according to the method of consent, allowing the people to be a community of citizens, deciding about their own collective fate, and not just the subjects of an autocratic ruler. But the PGC also justifies a supportive state – in common parlance known as the welfare state – which is instrumentally necessary to the protection of positive rights to freedom and well-being, that is, rights that are about actually possessing the freedom and well-being needed for successful agency. Such positive rights imply that people who are unable to develop, or are prevented from developing, freedom and well-being for themselves should receive support from the state to overcome these obstacles to their successful agency. The support provided would involve access to education, health care, and employment whereby individuals can secure an income for themselves, but also involve public goods such as clean air and water, safe roads, public libraries, and similar commodities that contribute to everyone’s actual possession of freedom and well-being. A supportive state that responds in this manner to its citizens’ positive rights is thereby also a community of rights. The Community of Rights is also the title of Gewirth’s 1996 sequel to Reason and Morality.

According to Gewirth, making the state a community of rights is not only justified in the sense of being permissible but is indeed necessitated by the PGC, as this principle justifies positive rights to freedom and well-being. To act in accord with one’s recipients’ rights to freedom and well-being is not only about not interfering with these goods, but also, when it is necessary, and when one can do so at no comparable cost to oneself, assisting one’s recipients in actually having freedom and well-being. To refuse to help a person who is unable to maintain her basic well-being when one can do so without jeopardizing any aspect of one’s own basic well-being would imply a practical denial of that person’s generic rights of agency. Now, sharing a political community – a sovereign territorial state governed by its citizens – transforms the relationship between agents and their recipients from one of individual persons directly interacting with each other to one of a collective of persons indirectly interacting with each other, by means of social rules and institutions that they or their elected political representatives have decided about. A community of rights aims to remove structural threats to the equality of rights, focusing on “situations where threats to freedom and well-being arise from social or institutional contexts, such as where economic or political conditions make for unemployment, homelessness, or persecution” (Gewirth 1996, 41).

In a political community characterized by social inequality, some people may enjoy prosperous lives with good salaries, wealth, and property, while vast numbers might be left in unemployment and poverty. These poor people may live in unhealthy homes in crime-ridden neighbourhoods, often lacking sufficient education and suffering from a hopelessness that inclines them to abuse drugs and alcohol and to neglect their responsibilities as spouses and parents. Often such inequality has an institutional dimension in that it is being maintained by laws or the absence of laws concerning social welfare, work hours, taxation, property, unionization, and so on. Therefore, providing impoverished and marginalized groups with effective rights to freedom and well-being involves creating a supportive legal and institutional framework that gives them access to education and employment. These goods allow them to make a living for themselves and develop their capacities for successful agency. This is what the community of rights is about.

In this context, it is the state that is the relevant agent of change, rather than individual charity workers, as it is a matter of changing a social condition affecting the political community at large, and as the required changes involve political decisions about laws and institutions. Moreover, as the state derives its moral justification from being instrumental to the upholding and maintenance of its citizens’ equal rights to freedom and well-being, it cannot without undermining its own moral foundation remain passive when confronted with the unfulfilled rights of vast numbers of its citizens.

Acting on behalf of all its citizens, the state can be seen as establishing a link between wealthy and poor groups of citizens, connecting the rights of the poor to the duties of the wealthy: “In so acting the state carries out the mutuality of human rights. For since each person has rights to freedom and well-being against all other persons who in turn have these rights against him or her, the state, as the persons’ representative, sees to it that these mutual rights are fulfilled. … So the state, in helping unemployed persons to obtain employment, enables its other members to fulfill positive duties that, in principle, are incumbent on all persons who can provide the needed help” (Gewirth 1996, 219).

However, here it is important to note that the poor have a right to be helped only when they are indeed unable to secure their rights to freedom and well-being for themselves. The positive right to help is based on necessity – that is, it applies only to cases in which one cannot have one’s rights to freedom and well-being realized without the help of other people or social institutions. Thus, “the positive rights require personal responsibility on the part of would-be recipients of help: the responsibility to provide for themselves if at all possible, prior to seeking the help of others” (Gewirth 1996, 42). Moreover, where there is a justified right to assistance, it is about securing goods needed for successful agency – food, housing, education, a job with a sufficient income, and so on; it is not about satisfying whatever particular need a person might have in the light of her personal interests and preferences.

Central among the positive rights outlined and discussed in The Community of Rights is the right to productive agency. According to Gewirth, unemployment, the lack of education, the lack of affordable health care, and so on, are morally problematic because they deprive persons of their capacity for successful agency. Hence, according to Gewirth, the way to deal with these societal shortcomings is not just to give poor people money but rather to help them develop for themselves the means of self-support, making them capable of standing on their own two feet rather than being reduced to permanent recipients of welfare cheques or charity. To be such a permanent recipient of welfare cheques would be detrimental both to one’s autonomy (central to freedom) and to one’s self-esteem (central to additive well-being); living on welfare is not the solution to the moral problem of poverty but rather a problem in its own right.

Instead of welfare cheques, the community of rights focuses on two mutually reinforcing strategies. At the level of individuals, it aims to develop their capacity for productive work by means of education. At the societal level, it aims to establish a system of full employment combined with a system of economic democracy to be applied to firms competing on market conditions. Gewirth’s ideas here are quite bold, combining ideas usually associated with socialism with a defence of market principles usually associated with liberalism. This comes out clearly in his discussion of economic democracy “in which products are put out to be sold in competitive markets and the workers themselves control the productive organization and process,” which in turn may involve “aspects of ownership either by the workers themselves or by the state” (Gewirth 1996, 260).

Gewirth’s argument that the state is morally obligated to guarantee employment to all citizens reflects his belief that there is a human right to employment. This specific right is derived from the more general right to well-being, as work in exchange for money is a morally justified way of satisfying both basic and additive well-being, enabling one not only to have food on the table and a roof over one’s head, but also to buy various consumer goods whereby one can increase one’s quality of life. (Theft, robbery, selling drugs, blackmail, and fraud would exemplify morally unjustified ways of making a living that negatively interfere with others’ well-being.) By participating in the production of goods and services one recognizes the mutuality of human rights by offering goods and services to consumers that they value as components of their well-being, while they in turn pay for these goods and services, thereby contributing to one’s own well-being. By participating in such a mutual exchange, one can also derive a sense of justified self-respect and pride. One contributes something of value to others and earns a living from that contribution. Hence, one is a productive member of one’s community – a person who adds to its total wealth and to the well-being of its members.

The right to productive agency includes a right to education that prepares the individual for work life but also promotes “cultural, intellectual, aesthetic, and other values that contribute to additive well-being, including an effective sense of personal responsibility” (Gewirth 1996, 149–150). In this way, the individual is helped to avoid welfare dependence while at the same time being made aware of human rights and the cultural values of her own as well as other communities. Once again, it is the state, as the institutional representative of the community of rights, that should see to it that all members of the community receive this kind of education.

The right to productive agency implies a right to employment. It has both a negative version – the right not to be arbitrarily deprived of employment – and a positive version – the right to actually have employment in the first place. The state, representing a community of rights, “should seek to secure the negative right to employment by enforcing on private corporate employers the duty not to infringe this right through policies that disregard their severe impact on the employment security of workers.” As for the positive right to employment, it is also the state “that has the correlative duty to take the steps required to provide work for unemployed persons who are able and willing to work” (Gewirth 1996, 218–219). This latter duty can be effectuated by offering unemployed workers retraining to equip them for a new job, or by the state directly employing them in the public sector.

In addition to defending state interventionism in the job market, Gewirth offers an even more radical solution to the problem of how to protect workers’ rights, namely, economic democracy. Given both the background of universal rights to freedom and well-being and that workers are exposed to the power of their capitalist employers regarding decisions about wages, job security, and conditions of work, Gewirth argues that workers should own and control the companies and corporations that employ them. Still, the Gewirthian idea of economic democracy does not try to dispose of the market, nor does it try to eliminate capitalists as such. What it does is to separate the role of capitalists from the power of ownership. Firms owned and controlled by workers will still compete with each other on a market that is guided by the demands of consumers. These firms will also need capital for investment and development. Sometimes they might need to borrow money from banks or private investors. However, the banks or investors will not themselves be owners or shareholders of the firms in question: “In capitalist systems, capital hires workers; in economic democracy, workers hire capital” (Gewirth 1996, 261).

Firms owned by their workers and competing on a market may of course fail in this competition. In a capitalist system, this would entail that workers lose their jobs as their firm is trying to cover its losses. In a system of economic democracy, however, there will be a general organization of firms that intervenes to prevent such threats to workers’ well-being: “When firms are threatened with failure and consequent layoffs of workers, they are not simply permitted to fail. Instead, the general organization … helps them either to improve their efficiency or to convert to some other line of production in which they can successfully compete” (Gewirth 1996, 295).

Here it is also important to point out that Gewirthian economic democracy does not imply a justification of state socialism of the kind associated with the former Eastern Bloc. On the contrary, Gewirth holds that the introduction of a system of economic democracy must depend on such a system being freely accepted by the citizens in a political democracy. It “should not be imposed by fiat; it should be the result of a democratic process of discussion, deliberation, and negotiation in which arguments pro and con are carefully considered by the electorate of political democracy” (Gewirth 1996, 263). Although Gewirth favours a system of economic democracy, he is also aware of the complexity of the question. The argument for workers’ ownership of firms involves empirical assumptions about motivation, solidarity, and productivity that are not settled by the rational justification of the rights of all agents to freedom and well-being. While political democracy, based on the human right to freedom, and its method of consent are rationally justified and morally necessary, the question of how to best organize the economic structure of the community of rights admits of more than one answer. As Gewirth himself recognizes, “[w]hile there are rational arguments for the economic rights I have upheld, there may also be rational arguments for different and even opposed arrangements, so that the rights have a certain element of normative contingency, as against the generic rights [to freedom and well-being] themselves” (Gewirth 1996, 323).

5. The Good Life of Agents

In his last published book, Self-Fulfillment (1998), Gewirth demonstrates that the necessary goods of agency are central not only to interpersonal and political morality but also to personal morality and the quest for a fulfilling and meaningful life. Distinguishing between two varieties of self-fulfilment, aspiration-fulfilment and capacity-fulfilment, he outlines a normative theory of the good life that is also consistent with the prescriptions of the PGC. In aspiration-fulfilment, the aim is to satisfy one’s deepest desires; in capacity-fulfilment, it is to make the best of oneself. Although these two aims are not mutually exclusive – our deepest desire might be to make the best of ourselves – they are conceptually distinct. For instance, it might be the case that our deepest desires are for other things than making the best of ourselves.

Together, aspiration-fulfilment and capacity-fulfilment define us as persons and agents. In aspiration-fulfilment, we are guided by our actual deepest desires; in capacity-fulfilment, we are guided by an idea of what is best in us that might well go beyond our actual desires and even conflict with them. Here, we should also observe that many people do not have self-fulfilment (in either variety) as the direct and conscious goal of their actions. Instead, “their self-fulfillment and their awareness of it emerge as ‘by-products’ of their achieving the direct external objects of their aspirations, whether these consist in composing beautiful music or pursuing political objectives or whatever” (Gewirth 1998, 50).

Aspiration-fulfilment makes an important contribution to the good life as it provides the aspiring person with motivation and purposes to guide her; thus, “[t]he person who has aspirations has something to live for that is especially significant for him, something that gives meaning, zest, and focus to his life” (Gewirth 1998, 32). However, aspiration-fulfilment can also be problematic, from a prudential as well as from a moral point of view.

Prudentially speaking, one can be mistaken about the contents of one’s deepest desires. Hence, one might aim for targets that will leave one frustrated, either because one fails to understand what it takes to realize them, or because one has an exaggerated view of the satisfaction one will get from realizing them. For instance, one might aspire to become a famous novelist without realizing the effort required to achieve this goal, or one might have exaggerated expectations about the happiness one will get from actually becoming a famous novelist.

From a moral point of view, aspirations can be immoral, conflicting with the human rights to freedom and well-being prescribed by the PGC. One’s deepest desires might be about dominating other people, thereby violating their right to freedom. Or they might be about creating a racially, religiously, culturally or ideologically “pure” society in which people identified as having the “wrong” ethnicity, faith, sexual preferences, or political beliefs are killed, enslaved, imprisoned, or persecuted, thereby having their most basic rights to freedom and well-being violated.

In order to overcome the potential errors of aspirations and reconcile self-fulfilment with the requirements of morality and the PGC, Gewirth moves from aspiration-fulfilment to capacity-fulfilment. This form of self-fulfilment is, as we have already noted, about making the best of oneself. Making the best of oneself involves a move in the direction of objectivity and rationality and away from the subjectivity and arbitrariness of the desires of aspiration-fulfilment. This does not mean that capacity-fulfilment rejects desires. On the contrary: as in the case of aspiration-fulfilment, capacity-fulfilment involves agency, and as there can be no agency without desires for some outcome, “there is ultimately no disconnection between capacity-fulfillment and aspiration-fulfillment” (Gewirth 1998, 159). However, in capacity-fulfilment one’s desires have passed through a process of critical assessment, guided by the goal of making the best of oneself. We might describe this process as the move from aspiration-fulfilment’s question “What do I want for myself?” to capacity-fulfilment’s question “What should I want for myself?”

Now, according to Gewirth, “making the best of oneself” is about acting in accordance with the best of one’s capacities. Reason, as the capacity for ascertaining and preserving truth, would belong to this category of capacities, as it is needed for all rational deliberation, including deliberation about what to do with one’s life, what one should aspire to, how one might best realize one’s aspirations, how to handle conflicts between one’s own goals, or between one’s own and other people’s goals, and so on. In Gewirth’s terminology, reason should be understood as “the canons of deductive and inductive logic, including in the former the operations of conceptual analysis and in the latter the deliverances of sense perception” (Gewirth 1998, 72). In relation to self-fulfilment, we use reason to collect, analyse, and evaluate facts about ourselves and our capacities, as well as to make relevant inferences from these facts and apply them to our aspirations and goals.

One important aspect of ourselves that is ascertained by reason is that we are agents – indeed, the very idea of self-fulfilment implies agency, that is, that we can and should do something with our lives, whether it is about satisfying our deepest desires or making the best of ourselves. Now, as Gewirth has shown in his earlier work, being agents, we must claim rights to freedom and well-being, and we must recognize the claim of all other agents that they too have rights to freedom and well-being. Hence, as capacity-fulfilment involves the use of reason, and as reason justifies human rights to freedom and well-being as prescribed by the PGC, capacity-fulfilment involves recognizing a universalist morality of human rights. Accordingly, to make the best of oneself involves acting in accord with the rights to freedom and well-being of one’s recipients as well as of oneself.

Here we can see how capacity-fulfilment comes to modify aspiration-fulfilment – aspirations that are inconsistent with the human rights to freedom and well-being will be rejected by a rational agent as unjustified and impermissible. Therefore, “capacity-fulfillment can sit in reasoned judgment over aspiration-fulfillment” (Gewirth 1998, 101). Moreover, as agents guided by reason must conceive of freedom and well-being as necessary goods, these goods should also figure prominently in projects whereby agents try to realize a good life for themselves. According to Gewirth, “to fulfill oneself by way of capacity-fulfillment, one must make the most effective use one can of one’s freedom and well-being, within the limits set by the overriding authority of universalist morality” (Gewirth 1998, 110).

“Most effective” should not be understood in purely quantitative terms as the realization of as many goals of action as possible. Instead it is about using one’s critical and reflective understanding of one’s abilities and preferences to select some major goals, such as pursuing a certain career that one finds meaningful, or supporting a cause that one finds valuable. Different agents will make different choices here, depending on their varying abilities and preferences. Some agents may dedicate themselves to human rights activism or to the creation of art that widens the horizons of its audience. However, capacity-fulfilment can also be achieved within more ordinary types of occupations. Thus, people can make the best of themselves by becoming “a professional athlete or an electrician or an engineer or a philosopher or a journalist, and so forth” (Gewirth 1998, 131). Within these various occupations, professions, and callings there will be standards of excellence, and by striving to meet these standards the agent will be able to make the best of herself and so achieve capacity-fulfilment. (Here we should once again remind ourselves that the occupations and professions themselves must be consistent with the human rights to freedom and well-being; hence, an agent would not be able to achieve capacity-fulfilment by excelling as an SS officer in the service of Adolf Hitler or as an NKVD agent working for Joseph Stalin.)

It is important in this context to note that making the best of oneself might well imply different commitments and projects at different times in one’s life. Choosing a particular career for oneself might be an important aspect of one’s capacity-fulfilment when one is young; as one grows older, and all or most of one’s working life belongs to the past, it might be more relevant to have a plan about how to make the most of one’s retirement, possibly developing new skills in the process. Such a dynamic conception of capacity-fulfilment implies a realistic view of human life, according to which “one must accept oneself for what one is; in this way one can age gracefully, as against a neurotic longing for one’s past youth” (Gewirth 1998, 118).

An agent’s effectiveness involves her acquiring certain virtues that enhance her capacity for successful agency. Important among these virtues is prudence, which is the ability to “ascertain which of one’s possible ends are most worth pursuing in light of one’s overall capacities and aspirations,” including “both self-knowledge and knowledge of one’s natural and social environment, as well as the proximate ability and tendency to bring these to bear on one’s actions and projects” (Gewirth 1998, 126).

The development of knowledge about oneself and one’s environment can be promoted by education (including self-education) and culture in the form of art and literature that enlighten and enlarge the agent’s understanding of herself and of the world. As a result of such a widening of her horizons, the agent might find inspiration and motivation to develop her own skills in a way that works well for her, given her talents and abilities.

As the agent tries to achieve her ends, she might be confronted with obstacles in the form of fears, self-doubt, setbacks, frustrations, temptations, and disruptive urges. Thus, she will also need the virtues of courage and temperance, helping her to persevere in the face of adversity and to overcome unfounded fears as well as to control her appetites and inclinations so that they do not undermine her determination and ability to achieve her ends.

A good life for an agent often includes various social commitments. She might experience fulfilment in love relationships or family life, or by participating in various voluntary associations, or by patriotic dedication to her country and political community. By identifying with smaller or larger groups, she might provide her own life with meaning and significance, making the best of herself by being a loyal and supportive member of one or many of these groups. Moreover, “a sense of belonging, of being part of a larger nurturing whole, is a valuable component of additive well-being and self-fulfillment” (Gewirth 1998, 151) as it provides the individual with the identity and confidence needed to make the best of herself.

Social commitments typically involve preferential treatment of other members of one’s group. Lovers typically care for each other in a way that they do not care for others, parents typically support their own children in a way that goes beyond whatever support they offer children in general, citizens are typically willing to make sacrifices for their political community that they would not contemplate in relation to other nations or states, and so on. Such particularist allegiances are consistent with the requirements of reason and the principle of equal human rights justified by these requirements – that is, the PGC – as long as the preferential treatment involved does not result in a violation of innocent persons’ rights to freedom and well-being.

Indeed, certain preferential concerns are justified by the PGC. For instance, the right to freedom involves a more specific right to join others in voluntary associations, such as families. In these associations one typically also acquires special responsibilities for each other’s well-being. Parents, for instance, are morally obligated to care for their children as the very existence of these children depends on the parents’ exercise of their freedom to procreate. Likewise, the right to well-being justifies the existence of states as necessary to the protection of that right and makes support for such rights-protecting states a duty for their respective citizens. Therefore, citizens are not only justified but morally obligated to support their state – provided, that is, that the state in question is indeed protective of their human rights and does not unjustifiably threaten the rights of members of other political communities. Such support is effectuated by the citizens when they pay their taxes to maintain rights-protecting institutions or when they take part in the defence of their political community in a just war.

Preferential treatment conflicts with reason and the principle of equal human rights to freedom and well-being only when it involves violations of these rights. A mother does not offend against the universality of human rights by choosing to prioritize the feeding of her own starving child in a situation in which there are many other starving children around. However, things would be different if she feeds her child with food that she has taken from someone else’s starving child; then she would have violated the right to basic well-being of that child. Likewise, a citizen of a rights-respecting state does not offend against the universality of human rights by taking a particular interest in the flourishing of her own political community and by being willing to make sacrifices for that community that she would not make for any other political community. However, she would be guilty of contributing to violations of human rights if her patriotic loyalty were to extend to support for her political community even as that community violates human rights, for instance, by perpetrating a genocidal attack on a religious or ethnic minority.

Gewirth’s justification of agency-based rights and of the PGC has received many critical comments from other philosophers. Among other things, it has been argued (in the 1984 anthology Gewirth’s Ethical Rationalism, edited by Edward J. Regis Jr.) that agents are not logically compelled to claim moral rights just because they want to be successful in achieving their own goals of action (R. M. Hare); that Gewirth might not have been successful in bridging the gap between the “is” of human agency and the “ought” of morality (W. D. Hudson); that while Gewirth might be capable of justifying negative rights, his theory is unable to justify positive rights and hence also unable to justify the supportive or welfare state (Jan Narveson). Gewirth has replied to these and many other objections (in the anthology mentioned above, as well as in many separate articles in various philosophy journals).

In addition to Gewirth’s own replies to his critics, his theory has been carefully and thoroughly defended by Deryck Beyleveld, who in his 1991 book The Dialectical Necessity of Morality listed 66 categories of objections to the justification of the PGC (often including more than one critic in each category) and went on to show how each and every objection either had already been convincingly dealt with by Gewirth himself or could be dealt with by means of a rational reconstruction of Gewirth’s argument.

In 1997 a conference dedicated to the exploration of Gewirth’s moral philosophy took place at Marymount University, Arlington, Virginia. The comments presented by the participants in this conference, together with a reply by Gewirth himself, were later published as a book with the title Gewirth, edited by Michael Boylan. Gewirth’s theory has continued to attract attention after his death in 2004, including a 2016 anthology entitled Gewirthian Perspectives on Human Rights (edited by the Swedish philosopher Per Bauhn); likewise, the PGC is frequently referred to in discussions relating to human rights and social justice.

One reason for the continued interest in Gewirth’s theory is, of course, that we live in troubled times. As Gewirth himself once pointed out, “[i]n a century when the evils that man can do to man have reached unparalleled extremes of barbarism and tragedy, the philosophical concern with rational justification in ethics is more than a quest for certainty” (Gewirth 1978, ix). Referring to the twentieth century with its two World Wars and the Holocaust, these words have certainly not lost their relevance in the twenty-first century, when mankind is tormented by fanaticism and terrorism, as well as by widespread global inequalities between men and women, and between those who have and those who have not.

The need for a rationally justified morality is as great as ever before in human history, if not greater, given the facts of globalization. Different cultures and moralities are brought in ever closer contact with each other, thereby creating possibilities for conflict as well as for cooperation, while new technologies enable us to affect the lives of people across the globe. Thus, questions of agency, morality, and rights will be of the utmost importance for our deliberations about how to shape our individual and collective futures. It is in the context of such deliberations that Alan Gewirth’s carefully developed arguments have their place; his contributions to modern moral and political philosophy are of a significant and lasting kind.

6. References and Further Reading

a. Primary Works

i. Monographs

  • Marsilius of Padua and Medieval Political Philosophy (New York: Columbia University Press, 1951). Gewirth’s doctoral dissertation on the 14th-century philosopher who challenged papal authority in political matters and defended the idea of popular sovereignty.
  • Reason and Morality (Chicago: The University of Chicago Press, 1978). Gewirth’s main work in moral philosophy, providing a detailed argument for the Principle of Generic Consistency (PGC) and its derivation from agents’ necessary evaluation of freedom and well-being as the necessary goods of successful agency.
  • Human Rights (Chicago: The University of Chicago Press, 1982). A collection of essays by Gewirth, dealing with the justification and application of the PGC.
  • The Community of Rights (Chicago: The University of Chicago Press, 1996). Gewirth’s main work in social and political philosophy, in which he provides an argument for positive rights to freedom and well-being, including rights to employment and to economic democracy, as well as for a welfare state that is also a community, based on values such as respect and care.
  • Self-Fulfillment (Princeton: Princeton University Press, 1998). In this work, Gewirth sets out to argue that self-fulfilment comes in two forms, as aspiration-fulfilment and as capacity-fulfilment, and that making the best of one’s life must include adherence to universal human rights, as defined by the PGC.

ii. Articles and Book Chapters

  • “Introduction”, in Gewirth, Alan (ed.) Political Philosophy (London: Collier-Macmillan, 1965), pp. 1–30. This introductory chapter provides valuable clues to Gewirth’s later thinking on political rights and social justice, as well as his early ideas on combining natural law with consequentialism.
  • “The Epistemology of Human Rights”, Social Philosophy & Policy 1 (2), 1984, 1–24. In this article Gewirth outlines the conceptual and logical structure of human rights in general and his dialectically necessary justification of the PGC in particular.
  • “Practical Philosophy, Civil Liberties, and Poverty”, The Monist 67 (4), 1984, 549–568. Here Gewirth outlines his ideas about how philosophy can be practical, exemplifying by discussing how the poor can be provided with effective access to the political process.
  • “Private Philanthropy and Positive Rights”, Social Philosophy & Policy 4 (2), 1987, 55–78. In this article, Gewirth argues that while private philanthropy might contribute to important human values, for reasons of justice and fairness, the primary responsibility for upholding citizens’ positive rights to basic well-being should rest with the state.
  • “Ethical Universalism and Particularism”, The Journal of Philosophy 85 (6), 1988, 283–302. Here Gewirth argues that certain particularist commitments, for instance to one’s family and country, are not only consistent with but are indeed justified by universalist morality and its supreme principle, the PGC.
  • “Is Cultural Pluralism Relevant to Moral Knowledge?” Social Philosophy & Policy 11 (1), 1994, 22–43. In this article Gewirth addresses the topic of multiculturalism, arguing that the norms and values of different cultures must themselves be assessed from the perspective of rational moral knowledge as embodied in the PGC.
  • “Duties to Fulfill the Human Rights of the Poor”, in Pogge, Thomas (ed.), Freedom from Poverty as a Human Right (Oxford: Oxford University Press, 2007), pp. 219–236. This book chapter is based on “Justice: Its Conditions and Contents”, Gewirth’s keynote address at the XXI World Congress of Philosophy in Istanbul, Turkey, delivered on August 17, 2003. Here Gewirth outlines a positive duty of wealthier nations to provide poorer nations with agency-empowering assistance. This argument can be seen as containing ideas for a book that Gewirth was working on at the time, entitled Human Rights and Global Justice; the book was left unfinished at the time of his death.

b. Secondary Works

  • Bauhn, Per (ed.). Gewirthian Perspectives on Human Rights (New York: Routledge, 2016). A collection of essays with new interpretations and applications of Gewirth’s theory, with a particular focus on human rights.
  • Beyleveld, Deryck. The Dialectical Necessity of Morality (Chicago: The University of Chicago Press, 1991). An extensive and detailed defence of Gewirth’s argument for the PGC, dealing with sixty-six distinct types of objections made by philosophers; foreword by Gewirth.
  • Boylan, Michael (ed.). Gewirth (Lanham: Rowman & Littlefield, 1999). A collection of essays commenting on Gewirth’s theory, how it relates to Kantianism, rationalism in ethics, altruism, and community; the book also contains Gewirth’s replies to the comments, as well as a chronological list of all his published writings up to 1998.
  • Regis Jr., Edward (ed.). Gewirth’s Ethical Rationalism (Chicago: The University of Chicago Press, 1984). A collection of critical essays dealing with various aspects of Gewirth’s theory, such as the “is–ought” problem, duties relating to positive rights, and marginal agents; Gewirth replies to his critics in the last chapter.

 

Author Information

Per Bauhn
Email: Per.Bauhn@Lnu.se
Linnaeus University
Sweden

William Hazlitt (1778 – 1830)

William Hazlitt is best known as a brilliant essayist and critic. His essays include criticism of art, poetry, fiction, and drama. He wrote social and political commentary, portraits of major writers and political figures of his age, and a biography of his great hero, Napoleon. He had intended to follow his father into the Unitarian ministry but became instead a painter of portraits before settling into a career as a writer. His earliest writing is philosophical, and his key ideas are incorporated into his later work as a critic and conversational essayist.

Hazlitt was acquainted with many of the leading figures of the period, including Wordsworth and Coleridge, Keats and Shelley, the philosopher William Godwin, and the essayists Leigh Hunt and Charles Lamb. Like other political radicals of the time, he was persecuted by the Tory press, being referred to disparagingly by one periodical as belonging, with Keats and Hunt, to the ‘Cockney School’. His most notorious work, Liber Amoris (1823), gave ammunition to his enemies by candidly recounting the story of his infatuation with Sarah Walker, the daughter of his landlady, for whom he divorced his wife only to be rejected. He died in 1830, at the age of 52.

Hazlitt was educated at New College, Hackney, a Dissenting academy, where he acquired a thorough grounding in philosophy and literature. He left prematurely, but not before he had begun developing the ideas that he later described as his ‘metaphysical discovery’ and that formed the core arguments of his first book, An Essay on the Principles of Human Action (1805). In this he argues against psychological egoism, materialism, associationism, and a Lockean account of personal identity. He argues for the formative power of the mind and the natural disinterestedness of human action regarding future benefits for oneself and others.

Table of Contents

  1. Life
  2. Early Philosophical Works
    1. The ‘Metaphysical Discovery’
    2. Hartley and Helvétius
    3. History of Philosophy
    4. Kant and Idealism
  3. Political Thought
    1. Early Political Writing
    2. Virtue and Liberty
    3. The People
    4. The Press and Freedom of Speech
  4. The Essayist as Philosopher
    1. Late Twentieth and Early Twenty-First Century Studies
    2. Abstraction and the Poetic
    3. Power and the Poetic
    4. Conclusion
  5. References and Further Reading

1. Life

Hazlitt was born on April 10, 1778, in Maidstone, in the English county of Kent. His Irish father, also named William Hazlitt, was a Presbyterian minister, an author of theological and philosophical works, and a friend of leading Dissenting thinkers such as Joseph Priestley and Richard Price. His mother, Grace, was from an English Dissenting family. In 1780 the family moved to Bandon, County Cork, Ireland. Running into political difficulties with the local community, the Rev. Hazlitt moved the family again, this time to the United States, where he founded the first Unitarian church in Boston but failed to become established. The family returned to England in 1787, to the village of Wem, near Shrewsbury, Shropshire.

William started his formal education in the small school run by his father. He also had periods of schooling in Liverpool, from where he wrote home, precociously, about the injustice of the slave trade and of the Test and Corporation Acts. William intended to follow his father into the Presbyterian, specifically Unitarian, ministry. Because Unitarianism is a rational and politically liberal Dissenting tradition, the family welcomed the French Revolution in 1789. In July 1791 the Birmingham home, library, and laboratory of Joseph Priestley were destroyed by a mob. Young William penned a passionate letter in defence of Priestley, which was published in the Shrewsbury Chronicle.

In 1793 Hazlitt left Wem and the oversight of his father to begin his formal training for the ministry at New College, Hackney, just north of London. This was a Dissenting Academy for the education of lay and ministerial students. If its building rivalled Oxford and Cambridge colleges for grandeur, its curriculum exceeded theirs in breadth, and its pedagogy exceeded theirs in the promotion of free enquiry and philosophical debate. Robert Southey wrote that students ‘came away believers in blind necessity and gross materialism – and nothing else’. Internal disputes and financial difficulties, as well as its reputation as a hotbed of sedition, were already starting to destabilise the college. Many of the students were restless, radicalised by works such as Godwin’s Enquiry Concerning Political Justice (1793) or distracted by the theatres and other pleasures afforded by its proximity to London. Hazlitt left prematurely, having lost his sense of a vocation, but the college had given him a solid grounding in philosophy and literature; and he may already have made the ‘metaphysical discovery’ that would form the basis of his first book.

Hazlitt chose to follow his brother John into a career as a portrait painter, and to live and train with him in London. He soon met leading radicals and thinkers, including William Godwin and Joseph Fawcett. The year 1798 was a landmark, as he recalls in his 1823 essay ‘My First Acquaintance with Poets’ (CW 12). When Samuel Taylor Coleridge came to Shrewsbury, to consider a vacancy as a Unitarian minister, Hazlitt went to hear him preach, and later at dinner in Wem, the poet ‘dilated in a very edifying manner on Mary Wollstonecraft and [James] Mackintosh’. Coleridge stayed the night; in the morning he received a letter from Thomas Wedgwood offering him £150 a year to relinquish his ministerial intentions and devote himself to poetry and philosophy, which he immediately accepted. Any disappointment Hazlitt felt was assuaged by the poet’s invitation to visit him at Nether Stowey in Somerset. Delighted, the nineteen-year-old William accompanied Coleridge back to Shrewsbury. On this walk, Hazlitt attempted, with difficulty, to outline the argument of his ‘metaphysical discovery’.

His stay later that year with Coleridge at Nether Stowey, and with Wordsworth at nearby Alfoxton, was another formative experience. The diarist Henry Crabb Robinson, who first met Hazlitt at this time, describes him as a shy and somewhat tongue-tied young man, but also as the cleverest person he knew. By now it was becoming clear that the tide was turning against the radicals and reformers. At a lecture, ‘On the Law of Nature and Nations’, Hazlitt heard James Mackintosh renounce his support for the Revolution and radicalism. Thereafter Hazlitt had nothing but contempt for apostasy of this kind.

Hazlitt painted portraits in Manchester, Liverpool, and Bury St Edmunds. A portrait of his father was exhibited at the Royal Academy in 1802. On a visit to Paris, he caught a glimpse of his hero, Napoleon, and spent hours copying works by Titian, Raphael, and Poussin in the Louvre. He told an Englishman who praised his work that rapid sketching was his forte and that, after the first hour or two, he generally made his pictures worse. He later wrote about his career as a painter in essays such as ‘On the Pleasure of Painting’ (CW 18).

When he was twenty-five, Hazlitt visited the Lake District in Northern England, where Coleridge and Wordsworth were living. The poets regarded him as a moody and easily enraged young man but possessed of real genius. However, the stay ended badly. In 1815, Wordsworth gave an account of the episode to Crabb Robinson as an explanation for his coolness towards Hazlitt. According to this (not necessarily reliable) account, Wordsworth had rescued Hazlitt from a ducking following Hazlitt’s assault of a local woman.

An important new friendship dates from October 1804. Charles Lamb was an old schoolmate and friend of Coleridge and was already a published poet and journalist. Hazlitt also saw a good deal of William Godwin. He was still attempting to get his ‘metaphysical discovery’ into decent order, and it is likely that Godwin advised him. He certainly assisted practically by recommending the work to the publisher Joseph Johnson. The book was published in July 1805. Johnson clearly did not anticipate a huge demand for a work of metaphysics from an unknown author: the first edition (and the only one in Hazlitt’s lifetime) consisted of just 250 copies. Yet when he was denigrated by the editor of the Quarterly Review, William Gifford, as a writer of third-rate books, Hazlitt responded: ‘For myself, there is no work of mine which I would rate so high, except one, which I dare say you never heard of – An Essay on the Principles of Human Action’ (CW 9: 51).

Mary Lamb, Charles Lamb’s sister, was attempting to interest Hazlitt in a relationship with her friend Sarah Stoddart, who lived in a cottage in Winterslow in Wiltshire. Hazlitt was often busy, so meetings were few and far between. He was working on a second publication, Free Thoughts on Public Affairs (1806). In 1807 he completed his abridgement of The Light of Nature Pursued by Abraham Tucker, and he worked on a series of letters, published in William Cobbett’s Political Register and subsequently as a book entitled A Reply to the Essay on Population, by the Rev. T. R. Malthus (CW 1). Then came his The Eloquence of the British Senate, an anthology of political speeches. Occupied with researching and writing, and still painting, he somehow found time to correspond with Miss Stoddart. She was, at 33, three years older than him. After their largely epistolary courtship, they married on May 1, 1808. The financial interference of Sarah’s brother, John Stoddart, a lawyer and future editor of The Times, rankled with Hazlitt. There were tensions from the start. Sarah liked tidiness and busyness, and despite his recent flurry of publications, she suspected Hazlitt was an idler. In fact, he was working on a book Godwin had commissioned: A New and Improved Grammar of the English Tongue (CW 2). Hazlitt and Sarah went to live in Winterslow. A child was born but lived only a few months. In September 1811, their only child to survive, another William, was born.

Hazlitt worked on a completion of the Memoirs of the Late Thomas Holcroft, which was not published until 1816, in part because Godwin, who had been a close friend of the playwright and novelist, objected to the way Hazlitt had made use of Holcroft’s diary. Earlier, a plan to write a History of English Philosophy had failed due to insufficient subscribers. The scheme was reinvented as a lecture series, with lectures planned on Hobbes, Locke, Berkeley, self-love and benevolence, Helvétius, Price and Priestley on free will and necessity, John Horne Tooke on language, and natural religion. The lectures were delivered at the Russell Institution in Bloomsbury from January 14, 1812. The first lecture was considered monotonous, but subsequently the delivery improved, and Crabb Robinson reported that the final lecture was ‘very well delivered and full of shrewd observation’. But another attempt to publish the series as a book again failed to attract sufficient subscriptions. Having rejected the ministry, and with only mediocre success as a portrait painter and as a philosopher, Hazlitt was ready for his true vocation: as a journalist, critic, and essayist.

In October 1812 Hazlitt was engaged as a Parliamentary reporter by James Perry, the proprietor of the Morning Chronicle. The four guineas a week he was paid enabled him to move his lodgings to a house in Westminster, one recently vacated by James Mill and owned by Jeremy Bentham, who was also a neighbour (although they may never have met). Within a few months, Hazlitt progressed to an appointment as drama critic, with the opportunity also to contribute political pieces. However, his public support for the fallen Napoleon caused him difficulty with editors. More sympathetic were Leigh Hunt and his brother John Hunt of the Examiner, who were steady in their commitment to political reform. Writing now on drama, painting, and poetry, Hazlitt contributed also to The Edinburgh Review, the leading liberal periodical. For Hazlitt, Napoleon’s return from Elba in March 1815 and his subsequent defeat at Waterloo represented ‘the utter extinction of human liberty from the earth’. A period of depression and heavy drinking followed. Godwin was one of the few friends who shared his anguish.

Hazlitt’s family situation was difficult. He and Sarah quarrelled, and his brother was now alcoholic and in decline. Hazlitt worked relentlessly to cover household expenses. In 1816 the Memoirs of the Late Thomas Holcroft (CW 3) was finally published, and in 1817 came his first essay collections: The Round Table and Characters of Shakespeare’s Plays (both CW 4). The Round Table shows the mastery of form that Hazlitt had already achieved as an essayist, with 41 titles, including ‘On the Love of Life’, ‘On Mr Kean’s Iago’, ‘On Hogarth’s Marriage a-la-Mode’, ‘On Milton’s Lycidas’, ‘On the Tendency of Sects’, ‘On Patriotism’, ‘On The Character of Rousseau’, ‘Why the Arts are Not Progressive’ and, perhaps most famously, ‘On Gusto’. (There were additional essays by Leigh Hunt in the original edition.)

Hazlitt met John Keats for the first time in 1817: Keats admired him and regarded him as a philosophical mentor. He had also met Percy Shelley, probably at Godwin’s. His relationship with the older Romantics was not good. When John Murray published Coleridge’s long-uncompleted poems ‘Christabel’ and ‘Kubla Khan’, Hazlitt’s reviews displayed the full extent of his frustration with his early mentor. He subsequently criticized Coleridge’s Biographia Literaria vigorously. These attacks upset their mutual friend Charles Lamb and Coleridge himself. Henceforth, in Duncan Wu’s words, Coleridge and Wordsworth ‘dedicated themselves to the dismantling of Hazlitt’s reputation, by fair means or foul’ (Wu, 2008: 191). Hazlitt’s attacks on the poets continued with a review of Robert Southey’s youthful dramatic poem Wat Tyler, which had recently been published against Southey’s wishes. Hazlitt compared its radical sentiments with the poet’s more recent ultra-Royalist articles in the Quarterly Review. But the renegade poets were not the only people he criticized: liberals and reformers were not sacrosanct.

In June 1817, Hazlitt became the drama critic of The Times, which sold 7000 copies a day. The work was exhilarating and exhausting as the two main theatres, Drury Lane and Covent Garden, changed their bills daily, and Hazlitt would often compose the review in his head as he hurried through the streets to dictate it to the printer. In 1818 his reviews were collected in A View of the English Stage. In the same year Lectures on the English Poets was published, based on a lecture series he had given at the Surrey Institution (both CW 5). His lectures had been applauded and cheered, despite some provocative political allusions. The reviews of the books were good, except, inevitably, those in the Tory periodicals. Hazlitt regarded William Gifford, the editor of the Quarterly Review, as ‘the Government’s Critic, the invisible link that connects literature with the police’.

In the summer of 1818 Hazlitt retired to Winterslow to stay by himself at an inn. He wanted to be close to his beloved son, but he was estranged from Sarah. He worked on a new series of lectures, on English comic writers from Ben Jonson to Henry Fielding and Laurence Sterne. By the end of that summer, however, he was rocked by a vituperative article in Blackwood’s Edinburgh Magazine entitled ‘Hazlitt Cross-Questioned’. The author, J. G. Lockhart, the co-editor, had heard from Wordsworth about Hazlitt’s Lake District episode of 1803. The periodical also attacked Keats and Leigh Hunt, ridiculing the three of them as the ‘Cockney School’. It was a scurrilous political campaign, aimed at harming him professionally—and it partly succeeded, for Taylor and Hessey, who had published previous lecture series, withdrew an offer of £20 for the copyright of the Comic Writers series. Hazlitt took legal action and eventually settled out of court, winning £100 damages, plus costs.

Hazlitt now had yearly earnings of approximately £400, some of which he may have gifted to his elderly parents (now living, with his sister Peggy, in Devon). He was not inclined to save money. He was often in default of his rent, and Bentham eventually evicted him. Lectures on the English Comic Writers (CW 6) was published in 1819, followed in August by his Political Essays (CW 7), which included the major two-part essay ‘What is the People’. As usual, the reviews were partisan. William Gifford, in the Quarterly Review, returned to the attack, and the Anti-Jacobin Review called for Hazlitt’s arrest. The attacks, as A. C. Grayling notes (2000: 248), were purely personal. The Government’s repressive measures had raised the political temperature and public discontent. In August eleven people were killed and 600 injured when dragoons charged demonstrators at St Peter’s Field in Manchester.

In November 1819, Hazlitt began a series of lectures on Elizabethan dramatists other than Shakespeare. Lectures on the Dramatic Literature of the Age of Elizabeth was published in February 1820 (CW 6). He was at the height of his reputation, widely recognised as a great critic and prose stylist. In April 1820 he embarked on a series of essays that would further enhance his reputation. These were published as Table Talk, in two volumes in 1821 and 1822 (CW 8). Less polemical than the Round Table essays, they are longer and more reflective. One of the best known of the thirty-three essays is ‘The Indian Jugglers’; others include ‘On Genius and Common Sense’, ‘On the Ignorance of the Learned’, ‘Why Distant Objects Please’, and ‘On the Knowledge of Character’. News of his father’s death had reached Hazlitt belatedly at Winterslow and his feelings about his father are movingly expressed in the essay ‘On the Pleasure of Painting’. After a visit to his mother and sister, he returned to lodgings in London, once again in Southampton Buildings. It was about to become the scene of the most painful episode of his life.

Sarah Walker, the landlady’s daughter, was nineteen. A flirtatious relationship became, on Hazlitt’s side, a passionate infatuation. But Hazlitt was not the only lodger with whom Sarah flirted. When he overheard Sarah and her mother talking lewdly about other lodgers, he was shocked and frustrated. He decided to push Sarah towards a commitment by freeing himself to marry. By February 1822 he was in Scotland, where divorce was easier to obtain, arranging for his wife to discover him with a prostitute. A period of residence in Scotland was required, and he spent the time writing and lecturing on Shakespeare. His wife’s attitude was pragmatic, and she took the opportunity to walk independently in the Highlands.

News from London concerning Sarah Walker’s behaviour caused Hazlitt additional agony; he rushed to see her only to be met with a cold reception. After a tormented week, he returned to Edinburgh to complete divorce proceedings. Then, back in London and newly single, he witnessed Sarah Walker walking with his main rival, John Tomkins, in a way that convinced him they were lovers. By now he regarded her as ‘a regular lodging-house decoy’. His friends witnessed the ‘insanity’ of his conflict of adoration and jealousy. Meeting him in the street, Mary Shelley was shocked by his changed appearance. As late as September 1823, visiting London from Winterslow, he spent hours watching Sarah Walker’s door. Hazlitt chose now to compile from his notes and letters a confessional account of the whole affair. This became Liber Amoris, or The New Pygmalion (CW 9), his most notorious book. It was published anonymously in 1823, but no one doubted its authorship. It caused an uproar and allowed his enemies further to impugn his morality and his judgment.

During his period of infatuation, both Keats and Shelley had died. Before his death, Shelley had collaborated with Leigh Hunt and Lord Byron to launch a new journal, The Liberal. One of Hazlitt’s contributions was ‘My First Acquaintance with Poets’, the essay in which he discusses his early meetings with Coleridge and Wordsworth, and which ends with an affectionate tribute to Lamb. If Hazlitt’s literary power was undiminished by those recent events, his financial situation certainly was not, and in February 1823 he was arrested for debt. It was an unpleasant experience, but brief because his friends were able to supply ‘terms of accommodation’. He now resumed his regular contributions to periodicals, and he started work on the character portraits that would be republished in The Spirit of the Age (CW 11) in 1825. It sold well and is considered one of his finest achievements. The men portrayed include Bentham, Godwin, Coleridge, Sir Walter Scott, Lord Byron, Southey, Wordsworth, Mackintosh, Malthus, Cobbett, and Lamb.

A relief to his emotional struggles and financial crisis presented itself, conveniently, in the form of marriage to an independent woman, Isabella Bridgewater. She was intelligent and educated, a widow with £300 a year. Their love (or understanding) developed rapidly, and in early 1824 they were in Scotland (where his divorce was recognised) to get married. They then embarked on a continental tour, during which he contributed travel pieces to the Morning Chronicle, subsequently published (in 1826) as Notes of a Journey Through France and Italy (CW 10). In Paris they visited the Louvre. His hopes and commitments had not changed since his last visit in 1802. In Florence he visited Leigh Hunt and Walter Savage Landor. He liked Venice more than Rome but admired the Sistine Chapel. They returned via the Italian Lakes and Geneva, where he enjoyed scenes associated with Jean-Jacques Rousseau.

In 1826 Hazlitt finished preparing for publication the essays collected in his final major collection, The Plain Speaker (CW 12). These included some of his greatest essays, such as ‘On the Prose Style of Poets’, ‘On the Conversation of Authors’, ‘On Reason and Imagination’, ‘On Londoners and Country People’, ‘On Egotism’, ‘On the Reading of Old Books’, ‘On Personal Character’, and, perhaps best known, ‘On the Pleasure of Hating’. The book was published in May 1826, in the same month as Notes of a Journey. Settled in Down Street, Piccadilly, with Isabella, without financial worries, Hazlitt contemplated his most ambitious work: a biography of Napoleon. Researching this would require a prolonged stay in Paris. This, or perhaps young William’s unfriendly manner towards his step-mother, unsettled Isabella, and the marriage foundered. It was said that she had fallen in love with Hazlitt because of his writings and parted from him because of the boy.

By December 1827 Hazlitt was complaining of ill-health. In February 1828 he wrote his ‘Farewell to Essay Writing’, a powerful justification of his ‘unbendingness’: ‘What I have once made up my mind to, I abide by to the end of the chapter’ (CW 17: 319). He returned to Paris to pour everything into the completion of The Life of Napoleon Buonaparte (CW 13 – 15). Then, back in London, he continued a project he had started in autumn 1827: a series of ‘conversations’ with James Northcote (1746 – 1831), the artist and former pupil and biographer of Sir Joshua Reynolds. Crabb Robinson reported that there was ‘more shrewdness and originality’ in Northcote and Hazlitt himself than in Dr Johnson and James Boswell. The essays Hazlitt wrote in the last few months of his life showed no signs of decline. One, ‘The Sick Room’, describes his pleasure in reading when ‘bed-rid’: ‘If the stage shows us the masks of men and the pageant of the world, books let us into their souls and lay open to us the secrets of our own. They are the first and last, the most home-felt, the most heart-felt of our enjoyments’ (CW 17: 375 – 76).

Having been often disappointed by the course of European history, Hazlitt survived long enough to hear that the Bourbon monarchy had been overthrown. He wrote in his final essay, ‘On Personal Politics’, that should the monarchy be restored once more, liberty would live on because the hatred of oppression is ‘the unquenchable flame, the worm that dies not’ (CW 19: 334n). Hazlitt died on 18 September 1830, most likely from stomach cancer. He was 52. According to his son, his last words were, ‘Well, I’ve had a happy life’. He was buried in the churchyard of St Anne’s, Soho. The funeral was arranged by Charles Lamb, who had been with him when he died.

2. Early Philosophical Works

a. The ‘Metaphysical Discovery’

At the age of sixteen or seventeen Hazlitt made a ‘metaphysical discovery’. It would be another ten years before he could articulate this insight to his satisfaction and work out its wider implications. When the work containing them, An Essay on the Principles of Human Action (CW 1), was finally published in 1805, it was largely ignored. From the late twentieth century on, it has received more attention than ever before. Hazlitt did not pursue a career as a metaphysician, but the ideas remained central to his thinking: he found ways better suited to his genius to infiltrate them into public consciousness.

The ‘metaphysical discovery’ occurred as Hazlitt was reading Baron d’Holbach’s arguments for self-love. He contemplated the possibility that we have a greater tendency to altruism than Hobbes, and most philosophers since, had allowed. Voluntary action concerns future consequences, so questions about egoism and altruism are ultimately about the individual’s relation to his future self. Psychological egoism suggests that even an apparently benevolent action has an underlying selfish motivation, and Hazlitt does not deny that this can be the case, but he wonders what accounts for it. Is the principle of self-love inherent in human nature, or is there a metaphysical case for questioning this dispiriting conclusion?

Hazlitt argues as follows. If I now regret an earlier generous action and, looking back, hold my past self accountable, I am presuming a continuity of personal identity between past and present—and with good reason, for my past self is causally connected to my present self through memory. There is some kind of ‘mechanical’ or psycho-physiological process that connects my past decisions to my present consciousness. But if I now anticipate a future benefit or injury to myself, resulting from a present decision, there can be no comparable connection because the future has not occurred, it does not exist. Therefore, Hazlitt insists, a principle of self-interest cannot apply to my future self: at least, not one that posits an actual continuity or identity of self through time. But is there not some faculty of mind that connects me to it, and is this not as ‘personal’ to me, as exclusive, as memory and consciousness are? These faculties give me access to my past and present experience of a kind that I cannot have to anyone else’s: is not my anticipation of my future experience a directly parallel case?

Hazlitt argues that it is not. There is, currently, no future self. The faculty of mind that anticipates the future self is imagination and, yes, it allows me to anticipate my future, but only in the same way as it allows me to anticipate your future or anyone else’s. We are ‘thrown forwards’ into our futures but not in the intimate, exclusive way in which we connect through memory with our past or through consciousness with our present. The connection I have with my future self, through imagination, has the same degree of disinterestedness or impersonality as my relationship with another person’s future self. An action that might be described as motivated by self-love could equally be described as motivated by disinterested benevolence, for my future self has the metaphysical status of otherness.

This seems counterintuitive. It is true that I can anticipate another person’s pleasure or pain to some extent, but not with the same force or degree of interest as I do my own. Hazlitt knows that we do, as a matter of fact, have a bias towards our own future interests, and that this provides some sense of continuity. However, his supposition is that this bias is acquired: the selfishness that other philosophers argued was inherent is actually the result of nurture, of socialisation. The point is not that benevolence is inherent, but that humans are ‘naturally disinterested’, and therefore we could be educated to think and act differently.

For some commentators this reorientation of the argument concerning egoism and altruism is the Essay’s main point of interest, while for others it is the argument for the discontinuity of personal identity. John Locke had argued in An Essay Concerning Human Understanding (first published in 1689) that what makes someone the same person today as yesterday, or as in their distant childhood, is memory. His ‘Prince and Cobbler’ example, in which the memories of each switch bodies, was intended to show that psychological rather than physical continuity is what guarantees identity through time (Locke, 1975: 340). This was questioned by Thomas Reid in his Essays on the Intellectual Powers (1785). His ‘General and Ensign’ example suggested that personal identity could not be reduced to psychological continuity (Reid, 1983: 216 – 218). Renewed interest in the question from the 1960s onwards produced a range of fission-based thought experiments that led Derek Parfit, for example, to conclude that it is not personal identity that matters but some degree of psychological survival (Parfit, 1987: 245 – 302). In some scenarios it might not be me who persists, but someone qualitatively very much like me.

Hazlitt appears to have anticipated this distinction. He employs a multiple-fission example to show that the kind of connection we have with our future self cannot guarantee personal identity. What if the connection between past and present were non-causal: could that produce identity? If Person A’s consciousness were replicated (non-causally) in Person B, would A not feel ‘imposed upon’ by a false claim to identity? Anticipating twentieth-century examples involving multiple replicants, such as Parfit’s ‘Mars Tele-Transporter’ example, Hazlitt asks: if a Deity multiplied my self any number of times, would they all be equally myself? Where would my self-interest lie? He concludes: ‘Here then I saw an end to my speculations about absolute self-interest and personal identity’ (CW 12: 47). Hazlitt’s point is that if the concept of personal identity cannot be carried through with logical consistency to a future self which is one and the same as the present self who acts, neither can the idea of the necessity of self-interested action. In any case, although what I am now depends on what I have been, the chain of communication cannot run backwards from the future to the present. If the Deity multiplied me any number of times in the future, or destroyed me, it could not affect my present self.

My only interest in a future self comes from the psychological bias I have acquired from experience, including my upbringing, and if I have thereby acquired a sense of self and a capacity for sympathy with my future self, I have equally acquired a potential for sympathy (empathy) with others. My future self is in fact one of those others with whom I can empathize. Imagination enables me to project out of myself into the feelings of others. Moreover, ‘I could not love myself, if I were not capable of loving others’ (CW 1: 2). If this implies that I could not wish good things for my future self if I did not wish good things for the future selves of others, it prompts the question: what motivates the desire for good things to happen to anyone?

Hazlitt’s position suggests an account of child development that sees children as learning in stages to distinguish between self and others and to identify with their own current and anticipated longer-term interests. He knew that people sometimes fail to acquire a moral sense and that they can be driven by circumstances to evil, yet still, he argues, they must have in choosing between alternative actions some notion of good. So, although he rejects the hypothesis that we are naturally self-interested, he admits that there is something which is inherent, that ‘naturally excites desire or aversion’, and this is ‘the very idea of good or evil’. Regardless of what I think makes a future consequence a good one, ‘it cannot be indifferent to me whether I believe that any being will be made happy or miserable in consequence of my actions, whether this be myself or another’ (CW 1: 11 – 12).

It follows from this that both selfish and altruistic actions are in a sense impersonal, for it is the idea of good that motivates, rather than a rational calculation or allocation of benefits. Hazlitt may have had in mind a role for parenting and education in refining and extending the child’s understanding of good. The sensitivity of the faculty of imagination in differentiating degrees of good and evil is improvable. No doubt, in anticipating a future benefit or pleasure, imagination can stimulate an illusion of continuous identity, and the satisfaction one gains from imagining one’s own future pleasure is especially forceful because one knows from experience what one’s future feelings might be, but this does not make the connection with the future self parallel with the connection with one’s past self: it is still a fiction. Imagination provides the freedom to think in a more expansive way, to project one’s love of good beyond self-interest to others one is close to, and beyond to others unknown. It is the freedom to aspire to universal benevolence.

Hazlitt’s disjunction between the self, as constituted by memory and consciousness, and the putative future self appears to have been an original observation. We have seen how Hazlitt has been said (for example, by Martin and Barresi, 1995) to anticipate Parfit on personal identity. A. C. Grayling (2000: 363 – 4) finds a parallel between Peter Strawson’s argument concerning other minds and Hazlitt’s ‘transcendental’ argument that being capable of having an interest in other people’s future is a condition for being capable of having an interest in one’s own. Just as, according to Strawson, one can ascribe states of consciousness to oneself only if one can ascribe them to others, which suggests that observable, bodily behaviours constitute logically adequate criteria for ascribing states of consciousness to others, so, according to Hazlitt, one’s relation to one’s future self has the same status as one’s relation to another person’s future self and this suggests that it must be a condition of acting benevolently towards oneself (self-interestedly) that one can act benevolently towards others.

b. Hartley and Helvétius

The Essay has a second part, Remarks on the Systems of Hartley and Helvétius. David Hartley (1705 – 1757) had presented a physiological and mechanical account of the impact of sensation on the brain. Ideas become associated through repetition, so that one sensation can cause, involuntarily, multiple ideas. For Hazlitt this form of associationism provides an insufficient account of the mind. A physiological chain of vibrations, or the ‘proximity of different impressions’, can no more produce consciousness than ‘by placing a number of persons together in a line we should produce in them an immediate consciousness and perfect knowledge of what was passing in each other’s mind’. Furthermore, the suggestion that different ideas have a definite location in the brain is simply absurd; nor can associationism account for the mingling of different experiences in one idea, as when one hears with joy the song of a thrush and imagines it coming beyond the hill from some warm wood shelter. Every beginning of a series of association must derive from some act of the mind which does not depend on association. Association, where it does exist, is only a particular and accidental effect of some more general principle. Hartley’s account leaves no room for such voluntary mental activity as comparison of one idea with another, for abstraction, reasoning, imagination, judgment—in short, ‘nothing that is essential or honourable to the human mind would be left to it’.

Helvétius (1715 – 1771), Hazlitt’s other disputant in the second part of the Essay, had argued (in De l’esprit, 1758) for a materialist theory of mind and for self-interest as the sole motive of human action, a reduction of right and wrong to pleasure and pain: benevolent actions are an attempt to remove the uneasiness which pity creates in our own minds. Therefore, any disinterestedness hypothesis must be wrong, because only self-gratification provides the required causal mechanism. Hazlitt responds, firstly, that this is irrelevant to the issue. The relation of voluntary action to the future does not differ according to whether the principle impelling it is directed towards the self or towards others. It is no more mechanical in the former case than in the latter. Secondly, there is no reason to resolve feelings of compassion or benevolence into a principle of mechanical self-love. We are necessarily affected emotionally by our actions and their consequences: it would be ‘palpable nonsense’ to suggest that to feel for others we must in reality feel nothing. If all love were self-love, what would be the meaning of ‘self’? It must either point to a distinction in certain cases or be redundant. There must be clear limits to the meaning of the term ‘self-love’; but, in any case, purely as a matter of fact, Hazlitt thinks, it is incorrect to think that we have a mechanical disposition to seek our own good, or to think that, when we act benevolently, an accompanying pleasure sensation necessarily displaces the painful feeling occasioned by another’s distress.

The relevant distress is the other person’s, not my own; it is the relief of his or her distress that I will. To the argument that my love of others amounts to self-love because ‘the impression exciting my sympathy must exist in my mind and so be part of me’, Hazlitt responds that ‘this is using words without affixing any distinct meaning to them’. After all, any object of thought could be described as a part of ourselves: ‘the whole world is contained within us’. If any thought or feeling about or for another person is directed not to them but to me, then by the same token I might sometimes be said to be filled with self-hate: ‘For what is this man whom I think I see before me but an object existing in my mind, and therefore a part of myself?… If I am always necessarily the object of my own thoughts and actions, I must hate, love, serve, or stab myself as it happens’ (CW 1: 89 – 90).

Hazlitt concludes the Essay by affirming the common-sense view that compassion for another person’s injury is not a selfish feeling. When I am wounded, the pain is the effect of ‘physical sensibility’; when I see another person’s wound, my experience of pain is ‘an affair of imagination’. Benevolence ‘has the same necessary foundation in the human mind as the love of ourselves’ (CW 1: 91).

c. History of Philosophy

Hazlitt’s intention to write a history of English philosophy was first heard of in 1809 when an eight-page pamphlet was published advertising ‘Proposals for Publishing, in One Large Quarto… A History of English Philosophy: containing an Account of the Rise and Progress of modern Metaphysics, an Analysis of the Systems of the most celebrated Writers who have treated on the Subject, and an Examination of the principal Arguments by which they are supported. By the Author of An Essay on the Principles of Human Action, and An Abridgement of the Light of Nature Pursued’ (CW 2: 112). By 1810 Hazlitt had decided to turn the History into a series of essays, and in January 1812 these became lectures. After the lecture series had been successfully completed, the Proposals for a History of English Philosophy was republished, with a list of subscribers, but assured sales were too few to cover production costs and the book was never published. Most of the lectures were eventually published as essays in Literary Remains of the Late William Hazlitt in 1836.

The Proposals (entitled Prospectus in Howe’s Complete Works) outlines the positive claims on which Hazlitt’s critique of English philosophy would be based. These include the following: that the mind is not material; that the intellectual powers of the mind are distinct from sensation; that the power of abstraction is a necessary consequence of the limitation of the comprehending power of the mind; that reason is a source of knowledge distinct from, and above, experience; that the principle of association does not account for all our ideas, feelings and actions; that there is a principle of natural benevolence in the human mind; that the love of pleasure or happiness is not the only principle of action, but that there are others implied in the nature of man as an active and intelligent being; that moral obligation is not the strongest motive which could justify any action whatever; that the mind is not mechanical, but a rational and voluntary agent—it is free in as far as it is not the slave of external impressions, physical impulses, or blind senseless motives; and that the idea of power is inseparable from activity—we get it from the exertion of it in ourselves (CW 2: 116 – 119).

The lectures of 1812 included ‘On the Writings of Hobbes’, ‘On Locke’s Essay’, ‘On Self-Love’, and ‘On Liberty and Necessity’. In the first of these he argues that, contrary to popular opinion, Locke was not the founder of ‘the modern system of philosophy’. He sees Locke as a follower of Hobbes. Hazlitt’s argument in these essays takes forward the aim referred to in his Proposals, to oppose ‘the material, or modern, philosophy, as it has been called’, according to which ‘the mind is nothing, and external impressions everything. All thought is to be resolved into sensation, all morality into the love of pleasure, and all action into mechanical impulse’ (CW 2: 113 – 4). This theory, he writes, derives from a false interpretation of Francis Bacon’s use of the word ‘experience’, according to which the term applies to external things only and not to the mind. To apply the experimental methodology of natural philosophy to the mind is to assume an affinity based on ‘no better foundation than an unmeaning and palpable play of words’ (CW 2: 114).

In ‘On Liberty and Necessity’, Hazlitt largely agrees with Hobbes’s account of necessity as implying no more than a connection between cause and effect. Free will is not unmotivated: the motives which cause free actions originate in the mind. ‘The will may be said to be free when it has the power to obey the dictates of the understanding’ (CW 2: 255). Liberty is not an absence of obstruction or an uncertainty; it is ‘the concurrence of certain powers of an agent in the production of that event’. It is as real a thing ‘as the necessity to which it is thus opposed’ (CW 2: 258 – 9).

In the same year as the Proposals first appeared (1809), Hazlitt published A New and Improved Grammar of the English Tongue (CW 2). Hazlitt claims some originality for his theoretical and logical analysis of language. He rejects the assumption that grammatical distinctions and words of different kinds relate to different sorts of things or ideas rather than to our manner of relating to them. The same word can play many roles: what changes is the way things are reordered in relation to one another in our thoughts and discourse. A substantive, for example, is not the name of a substance or quality subsisting by itself but of something considered as subsisting by itself. It is an abstraction. Grammatical distinctions also mark changes in the orientation of the speaker to the hearer (‘the poisonous plant’ vs. ‘the plant is poisonous’). Verbs, like adjectives, express attributes and direct the hearer either to a familiar connection between things or to a new or unknown one. Verbs are not the only words that express ideas of being, doing, or suffering, but they have a certain eminence in that, without them, we cannot affirm or deny, ask for information or communicate a desire, express or understand an idea. Hazlitt appears to have grasped something of the pragmatics, in addition to the syntactical and semantic features, of language and communication.

Hazlitt acknowledged the importance and influence of John Horne Tooke’s The Diversions of Purley (1786), which had provided a general theory of language and the mind, but he disagreed with Tooke’s ideas concerning abstractions. He returned to this in ‘On Abstract Ideas’ and ‘On Tooke’s Diversions of Purley’, two of the lectures delivered in 1812. Tooke agrees with Hobbes, Hume, Berkeley, and others that there are no abstract or complex ideas. Hazlitt counters that, on the contrary, ‘we have no others’, for if all ideas were simple and individual we could not have an idea even of such things as a chair, a blade of grass, a grain of sand, each of which is a ‘certain configuration’ or assemblage of different things or qualities. Every idea of a simple object is ‘an imperfect and general notion of an aggregate’ (CW 2: 191). ‘Without the cementing power of the mind, all our ideas would be necessarily decomposed… We could indeed never carry on a chain of reasoning on any subject, for the very links of which this chain must consist, would be ground to powder’ (CW 2: 280).

Hazlitt alludes to an idea that was to stay central to his philosophical outlook: ‘The mind alone is formative, to borrow the expression of a celebrated German writer’ (CW 2: 280). How much did Hazlitt know of Immanuel Kant’s philosophy and to what extent was he an idealist? One reason for asking this is that the debate concerning politics and epistemology, brought to the fore by the empiricism of Burke’s Reflections on the Revolution in France (1790), enticed Romantic writers to explore, as Timothy Michael puts it, ‘the idea… that it is through rational activity that things like liberty and justice cease to be merely ideas’ (Michael, 2016: 1). Hazlitt was one writer who went some way towards idealism. It promised a potential alternative to Godwin’s utopianism, to Bentham’s felicific calculus, and to Burke’s arguments from experience and tradition.

d. Kant and Idealism

Hazlitt’s opinion that particulars are abstract ideas constructed by an abstract entity we call ‘the mind’ suggests that he had absorbed at least some of the ideas of idealism. He is opposed to any materialist epistemology that has no place for the active power of the mind, but he does not agree with Berkeley that there is no mind-independent world. We can, to an extent, experience an external reality but we cannot conceptualize, know, or understand it without the mind’s faculty of abstraction.

‘Abstraction,’ Hazlitt writes, ‘is a trick to supply the defect of comprehension’. This sentence occurs in the Preface to his 1807 abridgement of Abraham Tucker’s The Light of Nature Pursued. He goes on to argue that abstraction is only half of the understanding: common sense is also needed, and he sees Tucker’s ‘sound, practical, comprehensive good sense’ as the great merit of his (too-voluminous) work (CW 1: 125). There are only two sorts of philosophy: one ‘rests chiefly on the general notions and conscious perceptions of mankind, and endeavours to discover what the mind is, by looking into the mind itself; the other denies the existence of everything in the mind, of which it cannot find some rubbishly archetype, and visible image in its crucibles and furnaces, or in the distinct forms of verbal analysis’. The latter can be left to chemists and logicians, the former is ‘the only philosophy fit for men of sense’.

Hazlitt himself connects Tucker’s philosophy with Kant’s. Tucker ‘believed with professor Kant in the unity of consciousness, or “that the mind alone is formative,” that fundamental article of the transcendental creed’. It is not clear when Hazlitt first became acquainted with Kant’s philosophy. Before he had finished preparing the arguments of the Essay for publication, in 1805, he may have encountered Friedrich August Nitsch’s A General and Introductory View of Professor Kant’s Principles concerning Man, the World and the Deity, submitted to the Consideration of the Learned (1796) or John Richardson’s Principles of Critical Philosophy, selected from the works of Emmanuel Kant and expounded by J. S. Beck (1797) or the same writer’s later publications, but it is most likely that he had encountered Anthony Willich’s Elements of the Critical Philosophy (1798). We know that Coleridge possessed a copy of Willich’s Elements, and he may well have discussed Kant’s philosophy directly with Hazlitt, or with a mutual acquaintance such as Godwin or Crabb Robinson. By 1807, possibly by 1805, Hazlitt certainly knew something of Kant and appreciated him as a formidable opponent of ‘the empirical or mechanical philosophy’ and as a proponent of the doctrine of the creative and active power of the mind.

Hazlitt had definitely seen Willich’s translation by 1814. In his review of Madame de Staël’s Account of German Philosophy and Literature, he mentions Willich’s summary of the Critique of Pure Reason as including the proposition that ‘We are in possession of certain notions a priori which are absolutely independent of all experience, although the elements of experience correspond with them, and which are distinguished by necessity and strict universality’ (CW 20: 18). A footnote takes issue with this idea: ‘This, if the translation is correct… is, as it appears to me, the great stumbling block in Kant’s Philosophy. It is quite enough to shew, not that there are certain notions a priori or independent of sensation, but certain faculties independent of the senses or sensible objects, which are the intellect itself, and necessary, after the objects are given, to form ideas of them’. Having rejected Locke’s doctrine of the mind as a blank slate, Hazlitt was not keen to fill it with what he saw as innate ideas. Whether he or Willich is to blame for the misreading, it seems that, even in 1814, Hazlitt’s understanding of Kant’s philosophy was incomplete.

Did he appreciate Kant’s moral philosophy? There is, perhaps, some similarity between the categorical imperative and the role the mind plays in Hazlitt’s account of disinterested voluntary action. Certainly, for Hazlitt moral action is dissociated from a calculation of material advantage. It is not utilitarian or teleological, but neither is it exactly deontological in the sense of being based on universal rules or duties. There are parallels with Kant’s notion of a priori understanding in that moral action is conformity to a moral standard that is not derived from sensory experience, but for Hazlitt sensory experience is not all experience. Moral action is free of self-interest, but it is not free of selfhood, of the passions and habits and dispositions of the individual self. Hazlitt’s faculty of imagination lacks the purity of Kant’s idea of reason. Its recognition of good is based on experience and past preferences, so the active nature of the individual mind does not entail that reasonable choices converge.

3. Political Thought

a. Early Political Writing

One of the intellectual virtues that Hazlitt championed as a critic and essayist was openness, in the sense both of open-mindedness and of candour. This does not mean that he was flexible in his core political commitments. On the contrary, he valued a principled steadiness. True to his Dissenting roots, he was unshakeable in his commitment to civil and religious liberty and in his opposition to Toryism, the war with France, and the restoration of the Bourbon monarchy. Like many of his generation, he admired Godwin’s Enquiry Concerning Political Justice, but not uncritically, and if he espoused the politics of radical reform, he did so with a degree of skepticism concerning the modifiability of human nature. His open-mindedness and candour meant that he was prepared to criticize, and to antagonize, people who shared his commitments, and to praise those with whom he disagreed politically.

After the Essay was published, Hazlitt was quick to apply its fundamental insights to politics. In his 1806 pamphlet Free Thoughts on Public Affairs (CW 1), he denounces the ‘false patriotism’ of Tory policies, seeing it as a cover for militaristic nationalism, imperialism, and the erosion of constitutional rights. He blames the recently deceased William Pitt the Younger for diffusing ‘a spirit of passive obedience and non-resistance’ (CW 1: 112). He does not reject, as Godwin had done, the legitimacy of the state as an institution, but he insists that a radically reformed state and senate should reject selfish attachments in favour of disinterested policies that bring universal benefit. In this early statement of his political commitments, the ‘metaphysical discovery’ underpins his opposition to tyranny, capitalism, and imperialism.

In The Eloquence of the British Senate (CW 1), an anthology of political speeches (with commentary), published in 1807, Hazlitt makes clear his commitment to the British people’s right and constitutional duty ‘to resist the insidious encroachments of monarchical power’. He condemns the political corruption rampant in the parliamentary system and praises (pre-1789) Edmund Burke and Charles James Fox for their disinterested patriotism and benevolence. In the same year he composed five letters addressed to Thomas Malthus, three of which were published in William Cobbett’s Political Register. Malthus had originally written An Essay on the Principle of Population (1798) in response to Nicolas de Condorcet’s and Godwin’s optimism about the consequences of social improvement. Malthus’s argument that population growth would inevitably outstrip the potential for subsistence was proving influential even in Whig circles. Godwin’s major response would not be published until 1820; meanwhile, Hazlitt’s letters were a significant contribution to the defence of social progress and justice. Hazlitt condemns Malthus’s fatalism and advocacy of the principle of self-love. He also opposes Samuel Whitbread’s Malthusian Poor Bill, whose proposed national system of parochial schools would, in Hazlitt’s view, indoctrinate the poor and further increase their vassalage. Like Godwin, he saw any such state-sponsored system as undermining independence of thought and the principle of popular democracy.

b. Virtue and Liberty

As Hazlitt’s career as a journalist, critic, and essayist developed, he focused on the particular and the individual rather than on abstract principles, but his opposition to the unjust exercise of power was clear and consistent. His Preface to Political Essays (1819) emphasizes his commitment to autonomy, to candour, to opposing selfishness and corruption. He writes:

I am no politician, and still less can I be said to be a party-man: but I have a hatred for tyranny, and a contempt for its tools… I have no mind to have my person made a property of, nor my understanding made a dupe of. I deny that liberty and slavery are convertible terms, that right and wrong, truth and falsehood, plenty and famine, the comforts or wretchedness of a people, are matters of perfect indifference. (CW 7: 7)

Openness, integrity, and sincerity are the virtues Hazlitt opposes to the temptations of advancement through corruption or the allure of power. ‘The admiration of power in others,’ he writes, ‘is as common to man as the love of it in himself: the one makes him a tyrant, the other a slave’ (CW 7: 148). The willingness of a people to become the instruments of tyrants and oppressors allows power to claim legitimacy. It does not speak well for human nature if it can be seduced in this way. Once embedded in people’s minds, power is almost irremovable. This hatred of unjust power explains Hazlitt’s opposition to hereditary monarchy and the idea of divine right; and it explains also, and more controversially, his admiration for Napoleon, whom he came to see as the final bastion against the threat to liberty represented by European monarchies.

Though not a ‘party-man’, Hazlitt thinks like a modern politician when he concedes the need to make pragmatic and partisan concessions in the cause of liberty:

If we were engaged in a friendly contest, where integrity and fair dealing were the order of the day, our means might be as unimpeachable as our ends; but in a struggle with the passions, interests, and prejudices of men, right reason, pure intention, are hardly competent to carry through: we want another stimulus. The vices must be opposed to each other sometimes with advantage and propriety. (CW 17: 40)

Integrity sometimes permits one to speak truth to power in language that power understands.

Freedom of will and political freedom are linked in Hazlitt’s conception of the mind’s innate power, subject only to the laws of its own innate constitution, and arbitrary political power which tries to make us passive machines. Both kinds of power may be tyrannical, and we are too inclined to admire political power in others. Uttara Natarajan observes that in his conversational essays we see Hazlitt attempting to translate into practice the ideal of the innate power of the individual resisting arbitrary political power (1998: 116). Potentially the most powerful instrument in the cause of liberty, poetry is neutral, and the power of language can be put to use on either side. In a sense liberty and political power are unevenly matched, for the former is diffused and the latter concentrated, and liberty must contend also with ego, pride, and prejudice. As with the will of individuals, it is not inevitable that the will of the people will be directed to the common good, but at least it has the capacity to be so directed.

That provides at least a degree of hope; yet Hazlitt sometimes comes close to despair about the prospects for genuine change. In one of his many aphorisms, he states:

If reform were to gain the day, reform would become as vulgar as cant of any other kind. We only shew a spirit of independence and resistance to power, as long as power is against us. As soon as the cause of opposition prevails, its essence and character are gone out of it; and the most flagrant radicalism degenerates into the tamest servility. (CW 20: 333)

Nevertheless, it was important to sustain resistance, to exert freedom of the will, in order to retain whatever liberty remained.

c. The People

Like Winston Smith in George Orwell’s Nineteen Eighty-Four, Hazlitt looks to the proletariat. In essays depicting country people and townspeople, he characterizes both with frankness. A Cockney is someone who ‘sees and hears a vast number of things, and knows nothing’ (CW 12: 66). By his lack of servility, ‘Your true Cockney is your only true leveller’ (CW 12: 67). Whereas the country dweller is petty and parochial, the urban dweller benefits from his exposure to the mass of people. London is described as ‘a visible body-politic, a type and image of that great Leviathan’. The urban social experience is an emancipation from ‘petty interests and personal dependence’.

Hazlitt recognises that although differences of character, talent, and discrimination mean there is undeniable superiority in particular spheres of life, including art, poetry, and criticism, nevertheless, superiority ‘arises out of the presupposed ground of equality’ (CW 8: 208). The benefit that ordinary people gain from society, from ‘free communication and comparing of ideas’, is denied to people of rank, ‘where all is submission on one side, and condescension on the other’. He is astonished by the airs and graces some people give themselves when there is so ‘little difference… in mankind (either in body or mind)’. Individual achievement is grounded in the essential equality of the people: ‘I am proud up to the point of equality—every thing above or below that appears to me arrant impertinence or abject meanness’ (CW 20: 123). Differences are largely due to disparities of opportunity and esteem.

In two powerful essays entitled ‘What is the People?’ (CW 7), Hazlitt attacks efforts by Southey and others to associate parliamentary reform with insurrection. The Poet Laureate criticizes the maxim vox populi vox Dei—the voice of the people is the voice of God. Hazlitt’s answer to the question What is the people? is:

Millions of men, like you, with hearts beating in their bosoms, with thoughts stirring in their minds, with blood circulating in their veins, with wants and appetites, and passions and anxious cares, and busy purposes and affections for others and a respect for themselves, and a desire for happiness, and a right to freedom, and a will to be free. (CW 7: 259)

He launches into a ferocious attack on his antagonist, who would lay the mighty heart of the nation ‘bare and bleeding at the foot of despotism’, who would ‘make the throne every thing, and the people nothing’ and be himself a ‘cringing sycophant, a court favourite, a pander to Legitimacy’.

This notion, legitimacy, is nothing other than the old doctrine of Divine Right ‘new-vamped’. The purpose of the Government should be to benefit the governed; its interests should not be at variance with those of the people, which are common and equal rights, yet the Government, Hazlitt thinks, sees its interest as preserving its privileges and those of the great and powerful. The dog kennels of the great and powerful are ‘glutted with the food which would maintain the children of the poor’. The people obstruct their absolute power; therefore rulers will always try to root out ‘the germs of every popular right and liberal principle’. How can rulers such as these be expected to have sympathy with those whose loss of liberty is their gain? The wealth of the few is composed of ‘the tears, the sweat, and blood of millions’ (CW 7: 264 – 5).

If a corrupt, self-interested Government cannot be trusted to serve the people’s interest, what can? There is no better solution, Hazlitt insists, than a popular democracy: ‘Vox populi vox Dei is the rule of all good Government: for in that voice, truly collected and freely expressed… we have all the sincerity and all the wisdom of the community’. In fact, the closer we can get to a direct democracy, in which each individual’s consciousness of his or her own needs and desires is registered, the better. In the opposite extreme (hereditary despotism), the people are ‘an inert, torpid mass, without the power, scarcely with the will, to make its wants or wishes known’ (CW 7: 268).

Hazlitt does not appear to endorse Godwin’s anarchistic localism, for he thinks representation and universal suffrage are the closest to direct democracy that can be achieved, but there are Godwinian themes when he addresses, in the second part of the essay, the question ‘Where are we to find the intellect of the people?’. His answer is everywhere. Public opinion incorporates ‘all those minds that have ever devoted themselves to the love of truth and the good of mankind’ (CW 7: 269). Lord Bacon was a great man, but not because he was a lord; Burke received his pension from the King, but not his understanding or his eloquence. What have hereditary monarchs ever done for the people? What wisdom is there in the Established Church, in the slave trade, in error, inhumanity, corruption and intolerance, in Church-and-King mobs but not in petitions for parliamentary reform? According to Hazlitt,

Loyalty, patriotism, and religion, are regarded as the natural virtues and plain unerring instincts of the common people: the mixture of ignorance or prejudice is never objected to in these: it is only their love of liberty or hatred of oppression that are discovered… to be the proofs of a base and vulgar disposition. (CW 7: 271)

Vox populi is the voice of God because it is the cry raised against ‘intolerable oppression and the utmost extremity of human suffering’ (CW 7: 278). Freed from attempts to stifle it or give it a false bias, it must lead ‘to the gradual discovery and diffusion of knowledge in this as in all other departments of human knowledge’. Indoctrinated by the Church and State, the people have been denied a proper (non-catechistic) education, and therefore the opportunity to ‘take the management of our own affairs into our own hands, or to seek our fortunes in the world ourselves’. Liberty requires the people to want it more than they want power, and to recognise their oppression. He shares with Godwin the view that individual judgment will improve when people are allowed to exercise it.

To develop his attack on the hereditary principle, Hazlitt invents a fable or thought experiment. He supposes that the actor Edmund Kean takes out letters patent ‘to empower him and his heirs for ever… by the grace of God and the favour of the Prince Regent’ to take the lead in all future stage tragedies, regardless of actual talent, and despite the fact that his son is ‘a little crook-kneed, raven-voiced, disagreeable, mischievous, stupid urchin, with the faults of his father’s acting, and none of his fine qualities’ (CW 7: 274). Unless forced to attend at the point of a bayonet, the public would simply stay away. ‘Surely, if government were a thing requiring the utmost genius, wisdom, virtue, to carry it on, the office of King would never even have been dreamt of as hereditary, any more than that of poet, painter, or philosopher’ (CW 7: 274). Near idiots are supposed capable of ruling while the people are denigrated as ‘a swinish multitude’ and mocked for their lack of refinement and philosophy. When will the ‘long minority’ of the people expire? Despotic rulers, tenacious of power, should indeed fear the people’s fury, even if timely reform might prevent, or delay, a revolution.

‘What is the People?’ is Hazlitt at his most polemical. The pronouns ‘we’ and ‘our’ become prominent as he aligns himself with the people. His tendency to skepticism about enduring progress is suppressed in favour of a defiant tone that conveys the tensions in the period before the Peterloo Massacre of 1819. It is in stark contrast to the aloof voice of An Essay on the Principles of Human Action, though the militant is identical with the metaphysician.

d. The Press and Freedom of Speech

The idea of a disinterested state of mind, first developed in the Essay, grounds Hazlitt’s political thinking and, specifically, his commitment to freedom of speech and the liberty of the press. The faculties of mind, including imagination, are active and receptive, and they develop through exposure to ideas and beliefs that are encountered through conversation and, especially, through reading. We are naturally disposed to sympathize with the feelings of others, but our faculties need cultivation. We need to be challenged and to exercise judgment in the careful consideration of different points of view, and without prioritising our own interests or settled opinions.

The invention of printed books made knowledge more widely available, and the press is, potentially at least, ‘the great organ of intellectual improvement and civilisation’ (CW 13: 34). Hazlitt was, of course, aware that periodicals could equally be organs of Government propaganda. Editors, booksellers, and publishers were prosecuted under the Sedition and Libel acts more often than authors themselves. For example, in 1798 Joseph Johnson, the publisher of the Essay, and of Hazlitt’s father’s sermons, of Priestley, Wollstonecraft, Godwin and others, was tried and imprisoned for sedition. John Hunt, Hazlitt’s friend and the publisher of The Examiner and The Yellow Dwarf, was twice imprisoned. William Cobbett fled to the United States to avoid arrest. Hazlitt is often at his most splenetic (and least disinterested, perhaps) in considering the editors of the Tory press and their turncoat contributors. In the former category is William Gifford, the editor of the Quarterly Review. Hazlitt describes him as being so well qualified for the situation ‘by a happy combination of defects, natural and acquired’ that at his death ‘it will be difficult to provide him a suitable successor’ (CW 11: 114). Mercilessly denouncing Whigs and Radicals as ‘dunces and miscreants’, Gifford ‘entitles himself to the protection of the Church and State’ (CW 11: 117). People like this ‘poison the sources of public opinion’ (CW 11: 124). A puppet press manipulates public opinion, diverting it from truth, justice, and liberty. Without the opportunity to develop independent thinking, individuals cannot break free from prejudice and received opinion.

The discontinuity of personal identity is alluded to, ironically, in Hazlitt’s response to Southey’s attempt to suppress the unauthorised publication, in 1817, of his youthful dramatic poem Wat Tyler, which the Poet Laureate now regarded as seditious. The former Radical had become a strong opponent of parliamentary reform and of free speech. What could prompt such a turnaround? Hazlitt imputes it ‘rather to a defect of moral principle than to an accession of intellectual strength’ (CW 7: 180). Hazlitt had admired the Lake poets’ earlier work, but he insists on his right to criticize them in print. Integrity requires Hazlitt to speak truth, also, to those allies he feels sometimes undermine the cause: Godwin was too utopian, Percy Shelley too extreme, Robert Owen disingenuous in claiming originality. The underlying idea is still disinterestedness: a critical, candid, disinterested response to the spirit of the age, to the cacophony of its leading voices. What would be the point of independent, disinterested judgment if, from tact or pusillanimity, one preferred self-censorship to candid free speech? Freedom is the right to criticize and disagree.

4. The Essayist as Philosopher

a. Late Twentieth and Early Twenty-First Century Studies

Interest in Hazlitt, and especially in his philosophy, was largely dormant throughout Victorian times and most of the twentieth century. Herschel Baker (1962) and W. P. Albrecht (1965) both comment on the philosophy, in Baker’s case none too favourably, but it is since Roy Park’s 1971 study that interest has been sustained. Several biographies and critical studies have appeared that have attempted to do justice to Hazlitt the philosopher and political thinker, as well as to Hazlitt the critic and conversational essayist. These include biographies by A. C. Grayling (2000) and Duncan Wu (2008). Stanley Jones (1989) focuses on Hazlitt’s later life. David Bromwich’s intellectual biography The Mind of the Critic (1983) is recognised as a major critical study of Hazlitt as a leading figure of Romanticism. John Kinnaird (1978) traces Hazlitt’s use of the term ‘power’ in both political and creative contexts. Tom Paulin (1998) emphasizes Hazlitt’s genius as a prose stylist and radical thinker. Stephen Burley (2014) places Hazlitt’s life and thought in the context of his Unitarian upbringing and education and focuses on his early philosophical work. Kevin Gilmartin (2015) puts politics at the centre of Hazlitt’s achievement as a critic and essayist. A major collection of essays by several authors, Metaphysical Hazlitt: Bicentenary Essays (2005), marked the bicentenary of Hazlitt’s Essay on the Principles of Human Action and explored its relevance to his other work.

These works testify to the modern interest in Hazlitt’s overall achievement. But it is worth taking a closer look at two distinctive interpretations of the philosophy, focused on different key concepts, in order to relate them to his work as a critic and essayist. Park’s Hazlitt and the Spirit of the Age (1971) focuses on what he sees as Hazlitt’s entirely original theory of abstract ideas; Uttara Natarajan’s Hazlitt and the Reach of Sense: Criticism, Morals, and the Metaphysics of Power (1998) focuses on Hazlitt’s insistence on the formative power of the mind. Both books investigate how Hazlitt’s philosophical commitments were integral to the style and content of the literary essays on which his reputation as a writer rests.

b. Abstraction and the Poetic

Roy Park argues that Hazlitt’s theory of abstraction explains the role that both painting and philosophy played in the formulation of his literary theory. He emphasizes how Hazlitt disguises the philosophy in his essays by focusing the reader’s attention on the concrete and particular. He sees partial parallels with the thought of Coleridge, Thomas Carlyle, and Matthew Arnold in the way Hazlitt represents abstraction as a threat to our experience of the poetic, which is to say, to civilized living. Literature is a response to life at its deepest level, to life experienced imaginatively rather than rationally. Contemporary philosophers, such as Bentham, Condorcet, even Godwin, represent humankind in materialistic terms. ‘Experience’ should not be restricted to material or physiological experience. The moral theories of egoism and utilitarianism, which make pleasure and pain the only criteria of right action, as well as Malthus’s theory of population, are the outcome of empirical epistemology. Park sees Hazlitt as attempting (like Thomas Reid) to harmonize the ‘material’ with the ‘intellectual’ or ‘imaginative’, and (like Francis Bacon) the abstract with the concrete, the individual with the universal, the scientific with the spiritual (1971: 20 – 21).

Hazlitt was not interested, Park writes, in converting the poetic into something other than itself. To ask ‘What is the poetic?’ is to ask an improper question, for the essence of poetry and life is lost when we attempt to explain it. It is not so much that thought is mysterious as that mystery is part of the thought. Hazlitt’s experiential response to the poetic and the existential was appreciated by Percy Shelley and John Keats: ‘Hazlitt initiated the response… but it is to Keats that we owe the classic formulation of the experiential standpoint in his characterization of negative capability as a condition in which “man is capable of being in uncertainties”’ (Park, 1971: 32).

Park sees Hazlitt’s objection to abstraction as being an objection ‘to all closed systems of thought in which the whole of human experience was interpreted in the light of the system’s initial premiss, empirical or metaphysical, with scant regard to the individuality, complexity and diversity of “the truth of things”’ (1971: 35). He quotes Hazlitt’s observation in the Lectures on English Philosophy:

‘They [system-makers] have in fact always a purpose… [which] takes away the tremulous sensibility to every slight and wandering impression which is necessary to complete the fine balance of the mind, and enable[s] us to follow all the infinite fluctuations of thought through their nicest distinctions.’ (CW 2: 269; Park, 1971: 37)

One of Hazlitt’s objections to Wordsworth’s ‘The Excursion’ is that it has ‘palpable designs upon us’.

Hazlitt’s Essay prepared the way for his rejection of abstraction by its rejection of mechanical associationism and of psychological egoism, and by its discovery of natural disinterestedness and the active power of the mind. Park thinks Hazlitt’s analysis of personal identity is not as significant as the argument against psychological egoism or the positioning of the imagination—the faculty of self-transcendence—as the moral faculty. It is the prerequisite for the openness characteristic of the existential stance. Furthermore, Park argues, it parallels Kant’s autonomously legislative will: practical reason and imagination are both essentially experiential.

But imagination’s role is not just as a mode of volitional consciousness: it is also the faculty of the poetic (1971: 49). Hazlitt connects imagination with the vital role played by sentiments in developing our habits and affections. The existential or sentimental relationship with the world around us is what Hazlitt calls ‘the poetry of life’—what makes life worth living. Imagination extends beyond volitional consciousness to include art and life in general. We are poetical animals because we love, hope, fear, hate, despair, wonder, admire. For Hazlitt, the spirit of poetry and the spirit of humanity are the same. The past becomes a fable, investing objects with value; objects become epiphanies. Childhood is important for its ‘symbolic fecundity and its subsequent symbolic richness’ (Park, 1971: 66). Poetry expresses this revelation of the significance of human life, modifying and enlivening the mystery of existence, the real and interior spirit of things. Value is quality, not, as in Bentham’s utilitarianism, quantity. The fine arts and poetry are self-authenticating; their value is never instrumental: ‘they toil not, neither do they spin’ (CW 18: 167). But that is not to say that they have no cultural implications for the individual or for society. Through literature and the arts, we are humanized: they enable us to become aware of our inter-relatedness with the rest of humanity.

Intellectual progress is not towards abstraction, as Locke and later philosophers had thought, but towards individuation. Objects of sense are complex; they are ‘seen’ with our understanding and our hearts: the more we observe, the more we see. Hazlitt, Park observes, had learned this as a painter. Detail is the essence of the poetic. Hazlitt’s critical vocabulary is full of terms derived from painting, terms related to the kind of ‘gusto’ appropriate to that branch of literature, and terms connected with particularity and individuation: ‘detail’, ‘distinction’, ‘tint’, ‘local’, ‘concrete’, ‘subtle’, and the contrasting terms such as ‘abstract’, ‘vague’, ‘universal’, ‘indefinite’, ‘theoretical’. It is out of particularity that the universal emerges. It is the precise and the vague that are in opposition, not the individual and the general. Park observes that Hazlitt’s attitude to abstraction helps us to understand his own view of his work as ‘the thoughts of a metaphysician expressed by a painter’.

Hazlitt’s use of the term ‘gusto’ epitomises his experiential view of the poetic. Derived from art criticism, it refers to the particular character of a work of art or literature: the quality which, as it were, differentiates one grain of sand from another. What the work expresses, in all its complexity, can only be expressed in the work itself. Our job is to submit to the artist’s or poet’s vision. In an essay that itself exemplifies gusto, ‘The Indian Jugglers’, Hazlitt refers to the poet’s ability to unravel ‘the mysterious web of thought and feeling’ through ‘the power of that trembling sensitivity which is awake to every change and every modification of its ever-varying impressions’ (CW 8: 83).

Philosophers are too connected to their form of abstraction. Park quotes Ludwig Wittgenstein (1958: 18) on the ‘craving for generality’ and the tendency of philosophers to try to answer questions in the way that scientists do (Park, 1971: 210). Feeling is the most important factor in Hazlitt’s distrust of abstraction. Imagination is the power of carrying feeling into other situations, including into other people’s situations. Park uses the term ‘imaginative sincerity of feeling’ to refer to the power of imagination at work in art and in moral action. This and gusto and the distrust of abstraction are all facets, Park insists, of Hazlitt’s experiential view of the poetic: they give a unity to his criticism (1971: 169). The combination and balance between these facets serve to isolate the peculiar, original, and characteristic nature of a work in relation to the artist’s individual genius, and to discriminate kinds and degrees of poetic excellence.

Park argues that for Hazlitt hope lies in the nature of poetry itself and in the spirit of man. His disappointment with his own age is demonstrated in The Spirit of the Age (CW 11), published in 1825, a work Park describes as ‘a masterpiece of indirectness’, an ‘aggregate of well-founded particulars’ (1971: 213 – 214). It is like an historical painting of the age, starting with portraits of Bentham, Godwin, and Coleridge, and proceeding with writers and politicians that exemplify aspects of the times or who, like Coleridge, had capitulated to the spirit of the age. Abstraction is to blame for its political and aesthetic limitations: the principle of utility, for example, is characteristic.

c. Power and the Poetic

Uttara Natarajan thinks the attention given by Park to Hazlitt’s criticism of abstraction is ‘at the expense of the larger theoretical framework of his writing’ (Natarajan, 1998: 6). She calls for the recognition of Hazlitt as not only a great critic but also a profound philosopher. Hazlitt’s criticism of art and literature, and his political and social criticism, are pervaded by the epistemology and metaphysics of the Essay, and also by its moral theory. Her main claim is that all Hazlitt’s subsequent thought follows from what she sees as the central idea of the Essay: the concept of power. Power is the independence of the mind from manipulation by the senses (or, equivalently, by external objects). The concept of power ‘is at the very core of Hazlitt’s celebration of all intellectual activity as the vindication of an innate self-directing principle with which the mind is endowed’ (Natarajan, 1998: 27).

The formative power of the mind is evidenced by the structure of language, and poetic language especially has a reach that extends beyond the mind to objective reality. In Natarajan’s view, therefore, Hazlitt’s linguistic philosophy is more important than has previously been recognised. Language is the means by which we can understand the self. Words affirm, Natarajan writes, ‘the relation between mind and nature: the moral goal, unity’ (1998: 146). Unity is always a function of the self-projecting attribute of the mind and its ability to create relations and perceive wholes. Language is not a limitation but a manifestation of the mind’s power.

Whereas ordinary language does not reveal externally ‘subsisting objects’, poetic or ‘inspired’ language is true to nature when it conveys the impression the object makes on the mind. Its role is to evoke things as they are felt by the speaker, their emotional significance. Hazlitt’s term ‘impression’, Natarajan observes, is imbued with the weight of feeling. The prime mover is not the object itself but the imagination: it is constructive, it assembles the whole. The power of human perception is embodied in the ‘excess of the imagination beyond the actual or ordinary impression of any object or feeling’ (1998: 27). Hazlitt ‘grants to the purely intellectual a degree of actuality equal to, if not greater than, the impressions of the sense’ (1998: 39). It is through the apprehension of imagination that, in Wordsworth’s words, we ‘see into the life of things’.

Natarajan reminds us that Hazlitt had been educated in Unitarianism, in which God is the power that unifies the order of creation. In Hazlitt’s work, the human intellectual faculty replaces the divine. ‘Nature’ and ‘truth’ are imagination embodied in words. Imagination brings about a process of association, whereby it projects itself into the order of nature to produce its own immaterial creation, its own unity. This active form of associationism opposes the mechanical, materialist, passive, and deterministic associationism of Hartley and Priestley. We never see the whole of an object by looking at it: we ‘read’ it with an associative power that allows it endlessly to accrue meaning. The mind is no blank slate; rather, nature is the blank that gains meaning from the mind. It grows with its growth. Looking into nature is looking into oneself, and vice-versa.

Hazlitt adopts a pluralistic view of truth: it ‘is not one but many; and an observation may be true in itself that contradicts another equally true, according to the point of view from which we contemplate the subject’ (CW 9: 228). Truth is an aggregation, a universal composed of particulars, a union of the abstract and the concrete, synthesized by the imagination. A point of view is true when it is authentic and produced, as in poetry and art, from well-founded particulars. Poetic truth is individualistic, an original insight made possible by the poet’s experience and circumstance and influenced by innate biases or predispositions. This suggests a paradox of determinism simultaneous with free volition. The agent’s dispositions influence choices, but the idea of freedom of the will refers to the choice between motives, which is an inner activity, not externally directed. So, although an agent, artist, or poet is inevitably constrained by the constitution of his or her own mind, it is precisely this that creates the individuality to which she or he must be true. Imaginative truth is an exclusive original insight: truth but not the truth. The artistic genius is compelled to communicate this truth to us, almost tyrannically, in a kind of ‘transfusion of mind’, and yet this is also a sublimation of self.

Hazlitt’s idea of the empowered mind, Natarajan observes, is where his linguistics, epistemology, and poetics converge. It provides, too, a model of the self ‘at once the origin and the product of power’ (Natarajan, 1998: 78). Self-love and benevolence are identically manifestations of innate imagination. By refuting the mechanistic and passive notion of the self, Hazlitt is able to emphasize the capacity of the mind to apprehend holistically both self and others. ‘Alterity’, Natarajan writes, ‘validates the moral nature of man… If that which is other to the self can be shown to constitute a real motive to action, then the self owns moral agency’ (1998: 121). However, there is a love of power in the mind independent of the love of good. Hazlitt emphasizes this especially in The Plain Speaker essays (CW 12), where the empowered mind is shown as bigoted and exclusive. To reach a just determination, we must set aside the bias of the will, the mind’s dispositions. This reconciliation of wisdom and power is achievable because the self is the instrument for action, not the motive. Whereas ‘power’ expresses moral capacity, ‘good’ expresses moral purpose. The ‘metaphysical discovery’ shows the mind’s imaginative capacity, not its achievement, but it allows us to adjust our motives to suit this new understanding of the self as naturally disinterested and to expand the circle of our sympathy.

d. Conclusion

Natarajan and Park agree that a full appreciation of Hazlitt as a philosopher requires us to explore the philosophy in his criticism and conversational essays as well as in his earlier, more explicitly philosophical works. His role in the conversational essays is to be both an artist and a moralist, to create and to criticize, to entertain and to enlighten, and sometimes to enrage. As Natarajan points out, he constantly revisits favourite themes, qualifying, refining, contradicting, aggregating, and composing. Like Michel de Montaigne, whom he admired, Hazlitt was not afraid to turn the spotlight on himself and to explore his own contradictions. Despite its huge variety, there is a unity in his work, a continuity of interests and commitments that reaches right back into young adulthood. His ideas and themes continue to deserve to be revisited. Interpretation and appreciation of his philosophy have flourished since Park’s and Natarajan’s books were published, and interesting approaches continue to be explored.

Charles Lamb’s review of the first volume of Table Talk provides a fine assessment of his friend’s achievement:

To an extraordinary power of original observation he adds an equal power of familiar and striking expression… In fact, he all along acts as his own interpreter, and is continually translating his thoughts out of their original metaphysical obscurity into the language of the senses and of common observation. (Lamb, 1980: 306 – 7)

5. References and Further Reading

Bibliographical note: references to Hazlitt’s works are generally to the 21-volume Complete Works, edited by P. P. Howe, and are indicated by CW + volume number.

  • Hazlitt’s Complete Works:
  • Howe, P. P. (ed.) 1930-34. The Complete Works of William Hazlitt. 21 vols. London: Dent.
    • This is the standard edition of Hazlitt’s writing. It contains texts of the full-length volumes, annotated. Not quite complete. It is available from the Internet Archive.
  • Selections, Letters and Memoirs:
  • Cook, Jon (ed.). 1998. William Hazlitt: Selected Writings. World’s Classics. Oxford: Oxford University Press.
    • A one-volume paperback selection with useful annotations and an introduction. It contains some of Hazlitt’s shorter philosophical essays and some of his aphorisms.
  • Dart, Gregory (ed.). 2005. William Hazlitt: Metropolitan Writings. Manchester: Carcanet Press.
    • A collection of Hazlitt’s major metropolitan essays, with a critical introduction on his attitude to London life.
  • Dart, Gregory (ed.). 2008. Liber Amoris and Related Writings. Manchester: Carcanet Press.
    • Sets Liber Amoris, Hazlitt’s memoir of the main crisis of his life, in the context of his other writings from 1822-23, with notes and a critical introduction by the editor.
  • Hazlitt, William Carew. (ed.) 1867. Memoirs of William Hazlitt. 2 volumes. London: Richard Bentley.
    • Edited by Hazlitt’s grandson. Includes portions of his correspondence. Available from Internet Archive and Forgotten Books.
  • Keynes, Geoffrey (ed.). 1948. Selected Essays of William Hazlitt. London: Nonesuch Press.
  • Mee, Jon and Grande, James (eds.). 2021. The Spirit of Controversy and Other Essays. Oxford: Oxford University Press.
    • A useful, updated World’s Classics edition. The texts are drawn from the original periodical publications, rather than those subsequently prepared for book publication.
  • Paulin, Tom and Chandler, David (eds.). 2000. William Hazlitt: The Fight and Other Writings. London: Penguin.
    • Substantial annotated Penguin Classics selection, with an introduction by Tom Paulin.
  • Sikes, Herschel Moreland, Bonner, William Hallam, and Lahey, Gerald (eds.). 1978. The Letters of William Hazlitt. New York: New York University Press.
  • Wu, Duncan (ed.). 1998. The Plain Speaker: The Key Essays. Oxford: Blackwell.
    • Paperback selection of some of Hazlitt’s best essays, plus a newly discovered essay, ‘A Half-Length’. Introduction by Tom Paulin.
  • Wu, Duncan (ed.). 1998. The Selected Writings of William Hazlitt. 9 vols. London: Pickering and Chatto.
    • A major selection in nine volumes, with updates of Howe’s texts and annotations. Introduction by Tom Paulin. It includes two previously unpublished essays and An Essay on the Principles of Human Action but excludes much of the philosophical writing to be found in CW 20.
  • Wu, Duncan (ed.). 2007. New Writings of William Hazlitt. 2 vols. Oxford: Oxford University Press.
    • A collection of 205 more recently discovered writings, including major essays on the poetry of Wordsworth and Coleridge and some late philosophical essays not previously recognised as Hazlitt’s.
  • Wu, Duncan. 2014. All That is Worth Remembering: Selected Essays of William Hazlitt. London: Notting Hill Editions.
  • Biographies:
  • Baker, Herschel. 1962. William Hazlitt. Cambridge, Mass.: Harvard University Press.
  • Grayling, A. C. 2000. The Quarrel of the Age: The Life and Times of William Hazlitt. London: Weidenfeld and Nicolson.
    • A lively biography which appreciates the man and the philosophy.
  • Howe, P. P. 1947 (new edition). The Life of William Hazlitt. London: Hamish Hamilton.
    • A standard biography, which draws heavily on Crabb Robinson’s diary. Useful but dated.
  • Jones, Stanley. 1989. Hazlitt: A Life. From Winterslow to Frith Street. Oxford: Oxford University Press.
    • An important critical biography. Covers later part of Hazlitt’s life and work.
  • Wu, Duncan. 2008. William Hazlitt: the First Modern Man. Oxford: Oxford University Press.
    • An important and substantial biography by a leading Hazlitt scholar.
  • Critical and Historical Studies:
  • Albrecht, W. P. 1965. Hazlitt and the Creative Imagination. Lawrence, Kan.: University of Kansas Press.
    • A study of Hazlitt’s concept of imagination, his political thought and his literary judgements.
  • Barbalet, Jack. 2009. ‘Disinterestedness and Self-Formation: Principles of Action in William Hazlitt’. European Journal of Social Theory, 12 (2), 195 – 211.
  • Bromwich, David. 1983. Hazlitt: The Mind of a Critic. Oxford: Oxford University Press.
    • A major modern study of Hazlitt’s philosophy, politics, criticism, and moral theory. Makes the case for Hazlitt as a major critic.
  • Bullitt, J. M. 1945. ‘Hazlitt and the Romantic Conception of the Imagination’. Philological Quarterly, 24.4, 343-61.
  • Burley, Stephen. 2014. Hazlitt the Dissenter: Religion, Philosophy, and Politics, 1766-1816. London: Palgrave Macmillan.
    • A major study with an emphasis on the Dissenting tradition’s influence on Hazlitt’s early philosophical and political writing.
  • Butler, Marilyn. 1981. Romantics, Rebels and Reactionaries: English Literature and its Background 1760-1830. Oxford: Oxford University Press.
    • A major work on the period, it includes a summary of Hazlitt’s career as a radical and characteristically English thinker. It puts the Romantic movement in its historical setting and emphasizes its contradictions.
  • Cook, Jon. 2023. ‘Hazlitt’s First Acquaintance with Poets’. The Hazlitt Review, 16, 33 – 47.
  • Dart, Gregory. 2000. ‘Romantic Cockneyism: Hazlitt and the Periodical Press’. Romanticism 6.2, 143-62.
  • Eagleton, Terry. 1973. ‘William Hazlitt: An Empiricist Radical’. New Blackfriars, 54, 108 – 117.
  • Eagleton, Terry. 2009. ‘The Critic as Partisan: William Hazlitt’s Radical Imagination’. Harper’s Magazine, 318 (1907), 77 – 82.
  • Gilmartin, Kevin. 1996. Print Politics: The Press and Radical Opposition in Early Nineteenth Century England. Cambridge: Cambridge University Press.
  • Gilmartin, Kevin. 2015. William Hazlitt: Political Essayist. Oxford: Oxford University Press.
    • A major study which makes a case for the centrality of Hazlitt’s political thought to his achievement as an essayist.
  • Harling, Philip. 1997. ‘William Hazlitt and Radical Journalism’. Romanticism, 3.1, 53-65.
  • Hunnekuhl, Philipp. 2017. ‘Hazlitt, Crabb Robinson, and Kant: 1806 and Beyond’. The Hazlitt Review, 10, 45 – 62.
  • Johnston, Freya. 2018. ‘Keeping to William Hazlitt’, in Thinking through Style: Non-Fiction Prose of the Long Nineteenth Century. Oxford: Oxford University Press.
  • Kinnaird, John. 1977. ‘Hazlitt, Keats, and the Poetics of Intersubjectivity’. Criticism, 19 (1), 1 – 16.
  • Kinnaird, John. 1978. William Hazlitt: Critic of Power. New York: Columbia University Press.
    • Takes ‘power’ to be the unifying theme of Hazlitt’s works, in both its political sense and in the sense of creative energy.
  • Lockridge, Laurence S. 1989. The Ethics of Romanticism. Cambridge: Cambridge University Press.
    • Chapter 7 is entitled ‘Hazlitt: Common Sense of a Dissenter.’
  • Martin, Raymond and Barresi, John. 1995. ‘Hazlitt on the Future of the Self’, Journal of the History of Ideas, 56 (3), 463 – 81.
    • Makes a strong claim for the originality of Hazlitt’s theory of personal identity.
  • Martin, Raymond and Barresi, John. 2003. ‘Self-Concern from Priestley to Hazlitt’. British Journal for the History of Philosophy, 11 (3), 499 – 507.
  • McFarland, Thomas. 1987. Romantic Cruxes: The English Essayists and the Spirit of the Age. Oxford: Clarendon Press.
  • Mee, Jon. 2011. Conversable Worlds: Literature, Contention, and Community 1762 – 1830. Oxford: Oxford University Press.
  • Michael, Timothy. 2024. ‘Hazlitt, Disinterestedness, and the Liberty of the Press’. The Review of English Studies, 75 (318), 57 – 74.
  • Milnes, Tim. 2000. ‘Seeing in the Dark: Hazlitt’s Immanent Idealism’. Studies in Romanticism, 39 (1), 3 – 25.
  • Milnes, Tim. 2003. Knowledge and Indifference in English Romantic Prose. Cambridge: Cambridge University Press.
  • Milnes, Tim. 2017. ‘“This Happy Nonentity”: Hazlitt, Hume, and the Essay’. The Hazlitt Review, 10, 63 – 72.
  • Milnes, Tim. 2019. The Testimony of Sense: Empiricism and the Essay from Hume to Hazlitt. Oxford: Oxford University Press.
  • Mulvihill, James. 1990. ‘Hazlitt and “First Principles”’. Studies in Romanticism, 29 (2), 241 – 255.
  • Natarajan, Uttara. 1996. ‘Abstracting Passion: Hazlitt’s Ideal of Power’. New Blackfriars, 77, 276 – 287.
  • Natarajan, Uttara. 1998. Hazlitt and the Reach of Sense: Criticism, Morals, and the Metaphysics of Power. Oxford: Clarendon Press.
    • A major study, focusing on the innate, independent activity of the mind. It makes a case for the importance of the philosophy for the essays and criticism.
  • Natarajan, Uttara, Paulin, Tom, Wu, Duncan (eds.). 2005. Metaphysical Hazlitt: Bicentenary Essays. London: Routledge.
    • Commemorates the bicentenary of Hazlitt’s Essay on the Principles of Human Action, with essays by important Hazlitt scholars and philosophers.
  • Noxon, James. 1963. ‘Hazlitt as Moral Philosopher’. Ethics, 73 (4), 279 – 283.
  • Park, Roy. 1971. Hazlitt and the Spirit of the Age: Abstraction and Critical Theory. Oxford: Clarendon Press.
    • An important study, relating Hazlitt’s literary works to his painting and to his philosophy, especially his concept of abstraction.
  • Paulin, Tom. 1998. The Day-Star of Liberty: William Hazlitt’s Radical Style. London: Faber.
    • A passionate argument for Hazlitt’s status as a prose artist and political radical.
  • Philp, Mark. 2020. Radical Conduct: Politics, Sociability and Equality in London 1789 – 1815. Cambridge: Cambridge University Press.
  • Postle, Martin. 2015. ‘“Boswell Redivivus”: Northcote, Hazlitt, and the British School’. The Hazlitt Review, 8, 5 – 20.
  • Rée, Jonathan. 2019. Witcraft: The Invention of Philosophy in English. London: Allen Lane.
    • As well as intellectual portraits of celebrated philosophers there is discussion of the philosophical work of literary authors. A section entitled ‘1801: Politics, Paradise and Personal Identity’ provides a lively narrative concerning Hazlitt’s family, influences, relationships and ideas.
  • Schneider, Elisabeth W. 1933. The Aesthetics of William Hazlitt: A Study of the Philosophical Basis of his Criticism. Philadelphia: University of Pennsylvania Press.
  • Tomalin, Marcus. 2012. Romanticism and Linguistic Theory: William Hazlitt, Language and Literature. Cambridge: Cambridge University Press.
  • Wakefield, James. 2021. ‘On Whether William Hazlitt Was A Philosophical Idealist (and Why It Matters)’. The Hazlitt Review, 14, 5 – 23.
  • Wellek, Rene. 1931. Immanuel Kant in England 1793 – 1838. Princeton, N.J.: Princeton University Press.
  • Whale, John. 2000. ‘Hazlitt and the Limits of the Sympathetic Imagination’, in Imagination under Pressure, 1789-1832: Aesthetics, Politics and Utility. Cambridge: Cambridge University Press, 110-39.
  • Wu, Duncan. 2006. ‘Hazlitt’s Unpublished History of English Philosophy: The Larger Context’. The Library: The Transactions of the Bibliographical Society, 7 (1), 25 – 64.
  • Other Resources:
  • Keynes, Geoffrey. 1931. Bibliography of William Hazlitt. London: Nonesuch Press.
  • Peter Landry’s Hazlitt Page: http://www.blupete.com/Literature/Essays/WorksHaz.htm.
  • Romantic Circles: romantic-circles.org.
  • The Hazlitt Review, an annual peer-reviewed journal published by the Hazlitt Society.
  • The Hazlitt Society: www.ucl.ac.uk/hazlitt-society.
  • Other References:
  • Lamb, Charles. 1980. Lamb as Critic. R. Park (ed.). London: Routledge and Kegan Paul.
  • Locke, John. 1975. An Essay Concerning Human Understanding. P. H. Nidditch (ed.). Oxford: Oxford University Press.
  • Martin, Raymond and Barresi, John. 2000. Naturalization of the Soul: Self and Personal Identity in the Eighteenth Century. London and New York: Routledge.
  • Michael, Timothy. 2016. British Romanticism and the Critique of Political Reason. Baltimore, Md: Johns Hopkins University Press.
  • Parfit, Derek. 1987. Reasons and Persons. Oxford: Oxford University Press.
  • Reid, Thomas. 1983. Inquiry and Essays. R. E. Beanblossom and K. Lehrer (eds.). Indianapolis, Ind.: Hackett Publishing Company.
  • Strawson, Peter. 1964. Individuals: An Essay in Descriptive Metaphysics. London: Methuen University Paperback.
  • Wittgenstein, Ludwig. 1958. The Blue and Brown Books. Oxford: Basil Blackwell.

 

Author Information

Graham Nutbrown
Email: gn291@bath.ac.uk
University of Bath
United Kingdom

Eternalism

Eternalism is a metaphysical view regarding the nature of time. It posits the equal existence of all times: the past, the present, and the future. Every event, from the big bang to the heat death of the universe, including our births and deaths, is equally real.

Under standard eternalism, temporal locations are somewhat akin to spatial locations. No place is exclusively real. When someone says that they stand ‘here’, it is clear that the term ‘here’ refers to their position. ‘Back’ and ‘front’ exist as well. Eternalists stress that ‘now’ is indexical in a similar way. It is equally real with ‘past’ and ‘future’. Events are classified as past, present, or future from some perspective.

Eternalism is contrasted with presentism, which maintains that only present things exist, and with the growing block view (also known as possibilism or no-futurism), which holds that past and present things exist but not future ones. The moving spotlight view suggests that all times exist but that the present is the only actual time. This view can be termed eternalist, but it preserves a non-perspectival difference between past, present, and future by treating tense as absolute. Additionally, the moving spotlight view retains some characteristics of presentism by maintaining that the ‘now’ is unique and privileged.

Broadly speaking, presentism is a common-sensical view, and so aligns with our manifest image of time. This view is, however, at odds with the scientific image of time. The primary motivation for eternalism arises from orthodox interpretations of the theories of relativity. According to them, simultaneity is relative, not absolute. This implies that there is no universal ‘now’ stretched out across the entire universe. One observer’s present can be another’s past or future. If the universe is a four-dimensional spacetime, all events exist unconditionally.

The classical argument for eternalism was devised by Rietdijk (1966) and Putnam (1967), with subsequent follow-ups by Maxwell (1985), Penrose (1989), and Peterson and Silberstein (2010). This argument and its ramifications remain the subject of ongoing debate. They are explored further in this article, along with their relation to other issues in the metaphysics of time and the philosophy of physics.

Table of Contents

  1. Introduction
    1. Motivation
    2. Central Definitions and Positions
  2. The Classical Argument for Eternalism
    1. History of the Concept
    2. The Argument
  3. Eternalism in Relation to Other Metaphysical Issues
    1. Dynamic A-Theory and Static B-Theory
    2. Passage of Time
      1. The Illusion Argument
      2. Error Theory
      3. Moving Spotlight
      4. Deflationary Passage
    3. Persistence and Dimensions of the World
    4. Free Will and Agency
    5. The Possibility of Time Travel
  4. Objections
    1. Conventionality of Simultaneity
    2. Neo-Lorentzianism
    3. General Relativity
    4. Quantum Physics
    5. Triviality
  5. References and Further Reading

1. Introduction

a. Motivation

It is usually thought that presentism and perhaps the growing block view are a better match with our common-sensical idea of time than eternalism. It seems obvious that the present is different from the past and the future. As you are presently reading these sentences, your reading is real. At least it is real in comparison to what you did a long time ago or will do a long time from now. There is, however, a problem. If the ‘now’ is exclusively real, that moment is entirely universal. The ‘now’, the moment your eyes skim through these lines, is the same ‘now’ as the ‘now’ in other regions of the universe. Independently of where we are, the ‘now’ is the same. If this is true, simultaneity is absolute. Both presentism and the growing block view assume the absoluteness of simultaneity. There is a knife-edge present moment stretched throughout the entire universe. That universal now is the boundary of all that exists. According to presentism, what was before that moment and what lies ahead of that moment do not exist. According to no-futurism, what lies ahead of that moment does not exist.

This common-sensical picture is in tension with the special theory of relativity. This theory is built into two central pillars of contemporary physics: the general theory of relativity and quantum field theory. Whether we are dealing with gravitational effects or high-energy physics, time dilation is prevalent. This result is of central importance to eternalism, as time dilation comes together with the relativity of simultaneity. Simultaneity differs across frames of reference. If the time difference between two events is zero in some frame of reference, the events are simultaneous in that frame. In another frame, the time difference between those same events is not zero, so the events are successive. They might be successive in the opposite order in a third frame.
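The frame-dependence of simultaneity can be illustrated numerically. The following is a minimal sketch, assuming the standard Lorentz transformation for a time difference; the function name, the separation of 3000 km, and the frame velocities of half the speed of light are illustrative assumptions, not values taken from the literature discussed here.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def time_difference_in_moving_frame(dt, dx, v):
    """Time difference between two events in a frame moving at velocity v along x.

    dt, dx: time (s) and space (m) separation of the events in the original frame.
    Uses the standard Lorentz transformation dt' = gamma * (dt - v * dx / c**2).
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (dt - v * dx / C ** 2)


# Illustrative values (assumptions): two events 3000 km apart, simultaneous in the first frame.
dt, dx = 0.0, 3.0e6

same_frame = time_difference_in_moving_frame(dt, dx, 0.0)
moving_right = time_difference_in_moving_frame(dt, dx, 0.5 * C)
moving_left = time_difference_in_moving_frame(dt, dx, -0.5 * C)

print(same_frame)    # zero: the events are simultaneous
print(moving_right)  # negative: the second event comes first
print(moving_left)   # positive: the first event comes first
```

The sign of the transformed time difference flips with the direction of motion, which is the sense in which spacelike separated events can be successive in opposite orders for different observers.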

The classical argument for eternalism hinges on this result. Before delving into this argument more explicitly in Section 2, let’s consider some intuitions that arise from relative simultaneity. Imagine three observers witnessing two events in a room with one door and a window. The first observer stands still and sees the window and the door open simultaneously. In their frame, the two events are simultaneous, that is, happening now. The second observer moves toward the window and away from the door. For them, the window opens now, and the opening of the door is in their future. The third observer moves toward the door and away from the window. For them, the door opens now, and the opening of the window is in their future. Thus, the three observers disagree on what happens now. Someone’s ‘now’ can be someone else’s future, and it can also be that someone’s ‘now’ is someone else’s past.

Another way to motivate eternalism is to imagine, for the sake of the argument, that there is an alien standing still in a faraway galaxy, around ten billion light-years from us. I am now practically motionless as I am typing these sentences. Let’s imagine we could draw a now-slice that marks the simultaneity of the alien standing and me sitting. Provided we do not move, we share the same moment ‘now’. Then the alien starts to move, not too fast, away from me. The alien’s now-slice is now tilted toward my past. With a great enough distance between us, even at what could be considered a relatively slow speed, the alien would carve up spacetime so that their now-slice would no longer match with me. It would match with Napoleon’s invasion of the Russian Empire. If the alien turns in the opposite direction, their now-slice would be tilted toward my future. Their ‘now’ would be, well, something we do not know. Perhaps a war with sentient robots on Earth?

According to the general theory of relativity, gravitational time dilation causes time in one location to differ from time in another. Clocks closer to massive sources of gravity tick slower than those farther away. This phenomenon is evident even in simple scenarios, such as two clocks in a room – one high on the wall and the other lower on a table – showing different times. According to presentism, there are no different times. There is one and only one time, the present time. However, the potential number of clocks is indefinite, leading to countless different times. This is clearly in tension with presentism. There are many different times, not just one unique time.
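To give a sense of scale for the two clocks in a room, here is a rough sketch using the weak-field approximation, in which the fractional rate difference between two clocks is about g·h/c². The height difference of one metre and the constant names are assumptions chosen for illustration.

```python
C = 299_792_458.0   # speed of light, m/s
G_SURFACE = 9.81    # approximate gravitational acceleration at the Earth's surface, m/s^2


def fractional_rate_difference(height_m):
    """Weak-field approximation: the higher clock runs faster by about g*h/c^2."""
    return G_SURFACE * height_m / C ** 2


height = 1.0  # assumed height difference between wall clock and table clock, in metres
rate = fractional_rate_difference(height)
seconds_per_year = 365.25 * 24 * 3600
gain_per_year = rate * seconds_per_year

print(f"fractional rate difference: {rate:.2e}")
print(f"wall clock gains about {gain_per_year:.2e} s per year")
```

The effect is tiny at this scale, on the order of nanoseconds per year, but it is not zero, which is all the argument in the text requires.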

At first sight, eternalism is backed up by empirical science. We are not dealing with a purely a priori argument. Time dilation is apparent in a plethora of experiments and applications that utilize relativity. These include, for example, the functioning of the LHC at CERN, the detection of muons at the Earth’s surface, and GPS technology. The important point for motivating eternalism is that the empirical evidence for the existence of time dilation in nature is very extensive and well-corroborated. Metaphysicians with a naturalist bent have reasons to take the ramifications of relativity seriously.
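The muon case can be made concrete with a back-of-the-envelope estimate. The sketch below compares the fraction of muons expected to survive a trip to the ground with and without time dilation. The mean lifetime is the measured value of about 2.2 microseconds; the production altitude of 15 km and the speed of 0.998c are typical textbook figures assumed here for illustration.

```python
import math

C = 299_792_458.0           # speed of light, m/s
MUON_LIFETIME = 2.197e-6    # mean muon lifetime at rest, in seconds

# Assumed illustrative values, not taken from the article:
ALTITUDE = 15_000.0         # production altitude in metres
SPEED = 0.998 * C           # muon speed


def survival_fraction(distance, speed, lifetime):
    """Fraction of particles surviving a trip of the given length (exponential decay)."""
    travel_time = distance / speed
    return math.exp(-travel_time / lifetime)


gamma = 1.0 / math.sqrt(1.0 - (SPEED / C) ** 2)

without_dilation = survival_fraction(ALTITUDE, SPEED, MUON_LIFETIME)
with_dilation = survival_fraction(ALTITUDE, SPEED, gamma * MUON_LIFETIME)

print(f"gamma = {gamma:.1f}")
print(f"surviving fraction without time dilation: {without_dilation:.1e}")
print(f"surviving fraction with time dilation:    {with_dilation:.2f}")
```

Under these assumptions almost no muons would reach the ground without time dilation, whereas a substantial fraction does once the dilated lifetime is used.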

b. Central Definitions and Positions

The different metaphysics of time can be clarified by introducing reality values. The assumption is that an event has either a reality value of 1 (to exist) or 0 (to not exist). An event does not hover between existing and non-existing (it is however possible to connect the truth values of future-tensed statements to probabilities as the future might be open and non-existent; more about this in Section 4.d).

Eternalism

Temporal location    Past    Present    Future
Reality value        1       1          1

Presentism

Temporal location    Past    Present    Future
Reality value        0       1          0

Growing block, possibilism, no-futurism

Temporal location    Past    Present    Future
Reality value        1       1          0

As the moving spotlight view differs in terms of actuality, it is appropriate to add an actuality value on its own row:

Moving spotlight

Temporal location    Past    Present    Future
Reality value        1       1          1
Actuality value      0       1          0

Presentism, the growing block view, and the moving spotlight view all treat tenses as absolutes, whereas eternalism treats them as indexicals. The former hold that the distinction between past, present, and future is absolute, whereas the latter holds that it is perspectival. The former are instances of the A-theory of time, whereas the latter is a B-theory or a C-theory of time, as shall be clarified in Section 3.a. This article focuses on eternalism, so not too much will be said about the other metaphysics of time. For those, see Section 14.a “Presentism, the Growing-Past, Eternalism, and the Block-Universe” of the Internet Encyclopedia of Philosophy article on Time, as well as the section on A-theory and B-theory.

2. The Classical Argument for Eternalism

a. History of the Concept

It is far from clear who was the first to use the concept of “eternalism”. It is not even clear where to look in the first place. Frege’s late 19th-century theory of propositions has been interpreted in eternalist terms, as propositions have an eternal character (O’Sullivan 2023). Hinton published an essay “What is the Fourth Dimension?” in the 1880s. This was perhaps a precursor to eternalism, as the idea of the fourth dimension is co-extensive with the idea of the four-dimensional spacetime. It has been suggested that Spinoza in the 17th century (Waller 2012) and Anselm around 1100 (Rogers 2007) argued for an eternalist ontology. Spinoza held that all temporal parts of bodies exist, thus anticipating a perdurantist account of persistence, a view that aligns well with the existence of all times. Anselm thought that God is a timeless, eternal being (a view also held by his predecessors like Augustine and Boethius) and that all times and places are under his immediate perception. Past, present, and future are not absolute distinctions but relative to a temporal perceiver. The history of the philosophy of time certainly stretches farther in time and place. It might very well be that eternalism, or a position close to it, was conceptualized by the ancient philosophers in the West and the East.

Considering the contemporary notion of eternalism and the debates within current metaphysics of time, the special theory of relativity includes the essential ingredients for eternalism. Although the theory has forerunners, it is typically thought to originate in Einstein’s 1905 article “On the Electrodynamics of Moving Bodies” and in Minkowski’s 1908 Cologne lecture “Space and Time.” We can assume that the earliest relevant publications concerning eternalism that draw on relativity came out in the first quarter of the twentieth century.

The classical argument for eternalism was formulated in the 1960s by Rietdijk and Putnam, independently of each other, but neither used the notion of eternalism explicitly. In the 1980s, Maxwell and Penrose argued along the lines of Rietdijk and Putnam without using the notion of eternalism. Rietdijk’s or Putnam’s predecessors like Williams (1951) and Smart (1949) did not invoke eternalism explicitly. Surprisingly, not even Russell, who is known for his tenseless theory of time, mentions eternalism in his 1925 exposition of relativity, The ABC of Relativity. The last chapter of that book is “Philosophical consequences,” in which one would expect something to be said about the ontology of time.

It is also worth mentioning one famous quote of Einstein, drawn from a letter to the widow of his long-time friend Michele Besso. This was written shortly after Besso’s demise (he died on March 15, 1955, and Einstein’s letter is dated March 21, 1955). The letter reads, “Now he has departed from this strange world a little ahead of me. That signifies nothing. For those of us who believe in physics, the distinction between past, present, and future is only a stubbornly persistent illusion” (Mar 2017, 469). It is not difficult to find Internet forums in which this part of the letter is characterized as eternalist. To dub this eternalism is, however, to read a personal, poetic letter out of its context. Moreover, the letter is cryptic. It is by no means an explicit endorsement of a philosophical position, as one would expect from a personal, moving letter.

There are quite a few 20th-century philosophers, physicists, and mathematicians, like Cassirer, Eddington, Einstein, Grünbaum, Gödel, Lewis, Minkowski, Quine, Smart, and Weyl, who have endorsed a view akin to eternalism (Thyssen 2023, 3). Yet the central argument and motivation for contemporary eternalism comes from the Rietdijk-Putnam-Maxwell-Penrose argument, which will be denoted the ‘classical argument’ below. The most recent extensive defense of this same idea comes from Peterson and Silberstein (2010).

b. The Argument

The two important notions required for the eternalist argument are reality value and reality relation. Reality values, or r-values for short in logical notation, represent the ontological status of any event: 1 denotes a real event, 0 an unreal event. An ideal spacetime diagram that represents everything from the beginning to the end would record all events with a reality value of 1, but none with a reality value of 0. This starting point omits further values like “possibly real” or “potentially real in the future,” which will become relevant in the discussion about the openness of the future in quantum physics (Section 4.d). The uniqueness criterion of reality means that an event has only one reality value. An event having two different reality values, 1 and 0, would be a contradiction. Reality relations, for their part, apply to events that share the same reality value. They can be translated into equally real relations: when two events are equally real, they are in a reality relation with each other (Peterson and Silberstein 2010, 212).

If events A, B, and C are equally real, then ArBrC. The properties of the reality relation are reflexivity, symmetry, and transitivity. Reflexivity stipulates that ArA, since A has a unique reality value. Symmetry stipulates that if ArB then BrA, since A and B have the same reality value. Transitivity stipulates that if ArB and BrC, then ArC, since A, B, and C all have the same reality value. The transitivity condition is the most controversial (see Section 4.a).
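The three properties can be checked mechanically for a small set of events. The sketch below is only an illustration: it assumes that the reality relation is defined as “sharing the same r-value”, and the dictionary of r-values and the function name `r` are hypothetical.

```python
from itertools import product

# Hypothetical r-values for three events; 1 = real, 0 = unreal.
r_value = {"A": 1, "B": 1, "C": 1}
events = list(r_value)


def r(x, y):
    """Reality relation: x and y are equally real (they share the same r-value)."""
    return r_value[x] == r_value[y]


reflexive = all(r(x, x) for x in events)
symmetric = all(r(y, x) for x, y in product(events, repeat=2) if r(x, y))
transitive = all(
    r(x, z)
    for x, y, z in product(events, repeat=3)
    if r(x, y) and r(y, z)
)

print(reflexive, symmetric, transitive)
```

Defined this way, the relation satisfies all three properties by construction. Whether equal reality can be read off from frame-relative simultaneity in the way the classical argument requires is a separate question that such a check does not settle.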

As noted already in the introduction, the relativity of simultaneity is of the utmost importance for eternalism. So is the idea of four-dimensional spacetime, and concepts related to spacetime diagrams, something originally introduced by Minkowski in 1908.

Figure 1. Minkowski light cones

Only events lying outside light cones, that is, spacelike separated events that are in each other’s absolute elsewhere, may be simultaneous. These events are not causally related, and no signals may traverse between them. If one can establish that two spatially separated events are connected with a hyperplane of simultaneity, they are simultaneous. Hyperplanes do not apply to lightlike (the edges of the cones) or timelike (the observer’s worldline) separated events. Events in the observer’s past or future light cones cannot be simultaneous with the event at the origin. Different observers do not agree on the temporal order of spacelike separated events. Two or more events happen at the same time in different places, according to some observers, but not according to all.

Figure 2. The classical argument illustrated with spacetime diagrams.

In Figure 2, we have three events, A, B, and C. There are two observers, that is, inertial frames of reference. Observer 1 is marked with a blue axis, and observer 2 with a red axis. A and B are spacelike separated from each other, as are A and C. B and C are timelike separated in relation to each other. Provided that one may establish a hyperplane of simultaneity among spacelike separated events, A is simultaneous with B in observer 1’s frame, and A is simultaneous with C in observer 2’s frame. Using the notation s to denote simultaneity, we may write: AsB for 1, and AsC for 2.
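A configuration of this kind can be reproduced with concrete numbers. In the sketch below the coordinates of A, B, and C are assumptions chosen to satisfy the description (they are not read off Figure 2), units are chosen so that c = 1, and the helper names are hypothetical.

```python
import math


def interval_type(e1, e2):
    """Classify the separation of two events given as (t, x), in units where c = 1."""
    dt, dx = e2[0] - e1[0], e2[1] - e1[1]
    s2 = dt ** 2 - dx ** 2
    if s2 > 0:
        return "timelike"
    if s2 < 0:
        return "spacelike"
    return "lightlike"


def boosted_time(event, v):
    """Time coordinate of an event in a frame moving with velocity v (c = 1)."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v ** 2)
    return gamma * (t - v * x)


# Assumed coordinates (t, x) in observer 1's frame; not read off Figure 2.
A, B, C = (0.0, 0.0), (0.0, 4.0), (2.0, 4.0)

print(interval_type(A, B), interval_type(A, C), interval_type(B, C))

# Velocity of the frame in which A and C are simultaneous (slope dt/dx of the A-C line).
v2 = (C[0] - A[0]) / (C[1] - A[1])
a_s_b_in_frame_1 = boosted_time(A, 0.0) == boosted_time(B, 0.0)
a_s_c_in_frame_2 = math.isclose(boosted_time(A, v2), boosted_time(C, v2), abs_tol=1e-12)
print(v2, a_s_b_in_frame_1, a_s_c_in_frame_2)
```

With these coordinates, A is spacelike separated from both B and C, B and C are timelike separated, AsB holds for the observer at rest, and AsC holds for an observer moving at half the speed of light.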

The classical argument assumes that events have unique r-values. Physical events exist independently of observers; they are located somewhere in spacetime. Whether a coordinate system is designated matters not to the existence of the event. Moreover, when dealing with separate, distinct events, these do not affect each other in any way. We may assume that distant events are equally real. If this assumption is correct, then simultaneous events are equally real. AsB should align with ArB, and AsC with ArC.

To spell out the eternalist argument:

Premise 1        AsB

Premise 2        AsC

Premise 3        AsB → ArB

Premise 4        AsC → ArC

Premise 5        ArB ∧ ArC

Conclusion      ArB ∧ ArC → BrC

The presentist or growing blocker cannot accept the above conclusion. They accept only two of its premises, AsB and AsB → ArB, and add a premise of their own:

Premise 1        AsB

Premise 2        AsB → ArB

Premise 3        ¬BsC → ¬BrC

Conclusion      ArB ∧ ¬BrC

The presentist and growing blocker both agree that the present is completely universal. In any place of the universe, what occurs in a moment is the exact same moment as in any other place. Every existing thing is simultaneous with any other existent thing. Everything that exists exists now. The now becomes redundant with such claims, as “happens now” means simply “happens”, as happening outside of the present has no reality value. All events are simultaneous. If B occurs now, C cannot occur, as it does not yet exist. From the eternalist viewpoint, B and C are equally real. If the classical argument for eternalism that draws on the relativity of simultaneity is valid, then presentism and the growing block view are committed to BrC ∧ ¬BrC. That would render these doctrines contradictory and absurd.
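The logical shape of the dispute can be checked with a small truth table. This is a sketch under stated assumptions: each relational claim is treated as a propositional variable, and the step from ArB ∧ ArC to BrC, which relies on the symmetry and transitivity of the reality relation, is encoded as a single conditional. The variable and function names are hypothetical.

```python
from itertools import product

NAMES = ["AsB", "AsC", "ArB", "ArC", "BrC"]


def implies(p, q):
    """Material conditional p -> q."""
    return (not p) or q


def eternalist_premises(v):
    """Premises of the classical argument, evaluated under the valuation v."""
    return (
        v["AsB"]
        and v["AsC"]
        and implies(v["AsB"], v["ArB"])
        and implies(v["AsC"], v["ArC"])
        and implies(v["ArB"] and v["ArC"], v["BrC"])  # symmetry plus transitivity
    )


valuations = [dict(zip(NAMES, bits)) for bits in product([True, False], repeat=len(NAMES))]
models = [v for v in valuations if eternalist_premises(v)]

# BrC is entailed if it holds in every valuation that satisfies the premises.
entails_BrC = all(v["BrC"] for v in models)
# Adding the presentist claim (not BrC) is consistent only if some model falsifies BrC.
consistent_with_not_BrC = any(not v["BrC"] for v in models)

print(len(models), entails_BrC, consistent_with_not_BrC)
```

Under this encoding the premises entail BrC, and no valuation satisfies them together with ¬BrC. The check only concerns the logical form; it says nothing about whether the premises themselves, especially the move from simultaneity to equal reality, should be accepted.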

Penrose (1989) presents a similar eternalist argument. Imagine two observers, Bob and Alice. They pass each other at normal human walking speeds. Alice walks toward the Andromeda galaxy, Bob in the opposite direction. Andromeda is about 2 × 10^19 kilometers from Earth. Stipulating that the Earth and Andromeda are at rest with respect to each other (which they are actually not, as is also the case in the alien example discussed previously in Section 1.a), Alice’s and Bob’s planes of simultaneity intersect Andromeda’s worldline at points about five days apart. Imagine then that an Andromedan civilization initiates war against Earth. They decide to attack us at a time that sits between Alice’s and Bob’s planes of simultaneity. This means that the launch is in Alice’s past and in Bob’s future. The space fleet launching the attack is an unavoidable event. An event can be in some observer’s past (Alice), in some observer’s future (Bob), and in some observers’ present (at the Andromedans’ home, at the time they take off, this event occurs now for them).
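The size of the gap between the two planes of simultaneity can be estimated with the low-speed approximation Δt ≈ v·d/c², where v is the relative speed of the two walkers and d the distance to Andromeda. The sketch below uses the distance quoted in the text; the walking speed of about 1 m/s per person is an assumption, as is the function name.

```python
C = 299_792_458.0          # speed of light, m/s
DISTANCE_M = 2e19 * 1e3    # about 2 x 10^19 km, the figure quoted above, in metres


def simultaneity_shift(relative_speed, distance):
    """Low-speed approximation: shift of the 'now' at a distant place, v * d / c^2."""
    return relative_speed * distance / C ** 2


# Assumption: each walker moves at about 1 m/s, so their relative speed is about 2 m/s.
relative_speed = 2.0
shift_seconds = simultaneity_shift(relative_speed, DISTANCE_M)
shift_days = shift_seconds / 86_400

print(f"{shift_days:.1f} days")
```

With these inputs the shift comes out at roughly five days, in line with the figure given in the text. Since the shift grows linearly with both speed and distance, a brisker walking pace or a more distant galaxy would widen the gap accordingly.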

So far, we have focused on the reality values of events. The Minkowski coordinate system enables us to assess tenses: the present at the origin, and the past and the future in the light cones. How one argues for the direction of time is a huge topic of its own; it will not be dealt with here. Here it is assumed that future light cones point toward later times and past light cones toward earlier times. We may stick with the same events, A and C, as in Figure 2. Let’s say there is some observer at event A. For them, event A occurs now, and event C is in their future. For another observer at event C, C occurs now, while A is in their past.

Figure 3. On the left: Future event C for an observer at A when A occurs now for them. On the right: Past event A for an observer at C when C occurs now for them.

Let’s add a further event that has not been mentioned so far: D. This event occurs now for an observer located at D. A is in their past, and C in their future.

Figure 4. For an observer at D, D occurs now, A is in the past, and C in the future.

This brings us to the semantic argument for eternalism initiated by Putnam. He contrasts his position with Aristotle’s view of future contingents. Putnam sees Aristotle as an indeterminist: statements about potential future events do not have truth values in the present time. Putnam maintains that Aristotle’s theory is obsolete, as it does not fit with relativity. The semantic argument can be clarified with the aid of Figure 4 above. When an observer at A utters a statement, “Event D will occur,” and an observer at C utters a statement, “Event D did occur,” one would expect both statements to have definite truth-values. Claims about future or past events are either true or false. Provided a physical event exists in some spacetime location, it does not matter in which spacetime location the observer who utters the existence claim is located. The occurrence of a physical event is not a subjective matter. From the four-dimensional perspective, D’s occurrence has a definite truth-value grounded in its definite reality value.

Putnam’s argument can be bolstered by truthmaker semantics. For something to be true about the world, there has to be something on the side of the world, perhaps a fact, state of affairs, being, or process, that makes the statement, assertion, proposition, or theory about that aspect of the world true. In the case of physical events like D, that event itself would be the truthmaker for an existence claim like “Event D occurs at a given location in spacetime.” The truthmaker does not depend on the contingent spacetime location in which the existence claim is uttered. Even if past, present, and future are frame-relative, the physical event itself is not. Unlike tensed predicates (past, present, future), truthmakers (like an event) are not indexical. Armstrong (2004, chapter 11), for one, supports eternalism, or omnitemporalism, as he calls it, based on a truthmaker theory. Eternalism does not face some of the difficulties that presentism has about truthmaking, including postulating truthmakers in the present, finding them outside of time, or accepting non-existents as truthmakers.

3. Eternalism in Relation to Other Metaphysical Issues

a. Dynamic A-theory and Static B-theory

An exposition of the A-theory and the B-theory is provided in this Internet Encyclopedia of Philosophy article. In short, A-theorists think time is structured into past, present, and future. B-theorists think time is structured according to earlier than, simultaneous with, and later than relations. There is also the C-theory of time, which maintains that time is structured according to temporal in-betweenness relations. The A-theory is typically called dynamic; the B/C-theory is static. Presentists, growing blockers and moving spotlighters are A-theorists. Eternalists are typically, but not always, B-theorists. A-theorists maintain that time passes, while B/C-theorists deny the passage of time.

Mellor (1998) is a B-theorist who denies that events could be absolutely past, present, or future. Under his theory, properties like being past, being present, and being future do not exist. When statements about the pastness, presentness, or futureness of events are made, they can be reformulated in a way that uses the resources of the B-theory. Let e be an event and t the time at which the statement about it is made. That e happened in the past means that e is earlier than t. That e is happening now means that e is located at t. That e will happen means that e is later than t. Tensed sentences are switched into tenseless sentences.
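The reformulation can be pictured as a simple mapping from B-relations to tensed words. The sketch below is an illustration only, assuming that t is read as the time at which the statement is made; the function name and the example dates are hypothetical.

```python
def tense_of(event_time, utterance_time):
    """Tenseless reading of a tensed claim, relative to the time t of the utterance."""
    if event_time < utterance_time:
        return "past"      # e is earlier than t
    if event_time > utterance_time:
        return "future"    # e is later than t
    return "present"       # e is located at t


event_year = 1812  # hypothetical date of some event e
print(tense_of(event_year, 1800), tense_of(event_year, 1812), tense_of(event_year, 2024))
```

The same event counts as future, present, or past depending only on when the statement is made, while the earlier-than and later-than relations between e and each t stay fixed.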

Eternalism and the B-theory are typically categorized as static theories of time. In this view, there is no passage of time in the sense that the future approaches, turns into the present, and then drifts off into the past. The four-dimensional world is thought to be changeless. This is based on the following consideration (Price 1996, 13). We are misled by imagining the universe as a three-dimensional static spatial block, with time treated as an external dimension. However, in the framework of four-dimensional spacetime, time is not extrinsic but intrinsic. Time is one dimension of spacetime. Each clock measures proper time, the length of the clock’s own trajectory in spacetime. Along the observer’s timelike worldline, events are organized successively, perhaps according to earlier than and later than relations. Over and above this temporal order there is no temporal passage.

The proponents of eternalism typically do not admit temporal flux, an objective change in A-properties, to be part of reality. Grünbaum (1973) vehemently criticized the idea of the passage of time. In his view, relativity does not permit postulating a transient now. The now denoting present time is an arbitrary zero, the origin of temporal coordinates at the tip of Minkowski light cones. Absolute future and absolute past are events that take place later than or earlier than the arbitrarily chosen origin, the present moment. Relativity allows events to exist and sustain earlier than t or later than t relations, not any kind of objective becoming. In his view, organisms are conscious of some events in spacetime. Organisms receive novel information about events; there is no coming into existence and then fading away.

b. Passage of Time

A great many philosophers (not only philosophers of physics like Grünbaum) are and have been against the idea of the passage of time. Traditionally, logical a priori arguments have been leveled against passage. These can be found in very early philosophical sources, like Parmenides’ fragments. There is an inconsistency involved in thinking about passage. If the future, which is nothing, becomes the now, which is something, then this existing now becomes the past, which is nothing. How can nothing become something and something become nothing? How can non-existing turn into existing, and existing disappear into non-existing?

At first sight, eternalism is inconsistent with passage. If all times, past, present, and future, exist, then the future does not come to us, switch into the ‘now’, and then disappear into the past. No thing comes into existence; no thing goes out of existence. All entities simply be, tenselessly. How does eternalism deal with the passage of time? There are different strategies for answering this question. 1) Passage is an illusion. We might experience a passage, but this experience is mistaken. 2) We believe that we experience passage, but that belief is mistaken. 3) Although orthodox relativistic eternalism points towards the B-theory and the perspectival, indexical character of tense, the moving spotlighters disagree. They maintain that passage is a genuine feature of reality. 4) There is a passage of time, but that passage is something different from change along the past–present–future series. Deflationary theories treat passage as a succession of events. The first two options are anti-realist about passage, while the last two are realist.

i. The Illusion Argument

According to Parmenides, reality lacks time and change in general. Parmenidean monism suggests that the one and only world, our world, is timeless. Our experience of things changing in time is an illusion. All-encompassing antirealism about time is not currently popular. Yet the classic article for contemporary debates, McTaggart’s “The Unreality of Time” from 1908, is antirealist. Somewhat like Parmenides, McTaggart maintained that describing the world with tensed concepts is illogical. Past, present, and future are incompatible notions. An event should be either past, present, or future. It would be contradictory to claim that it occupies more than one tensed location in time. But that is how it should be if time passes. Perhaps the distinction introduced by the A-series “is simply a constant illusion of our minds,” surmises McTaggart (1908, 458).

In addition to aprioristic reasoning, there are empirical cases to be made for the illusion argument. There are various motion illusions. The color phi phenomenon might be taken to lend support to the argument that passage of time is an illusion. In the case of color phi, we wrongly see a persistent colored blob moving back and forth. It appears to change its color. Our experience is not veridical. There is no one blob changing its color, but two blobs of different colors. There is no reciprocating motion to begin with. We somehow construct the dynamic scenario in our experience. Perhaps we also create the animation of the flow of time from the myriad of sensory inputs. Another motion illusion: Say someone spins rapidly multiple times. After they stop spinning, the environment seems to move around them. It does not; that is the illusion. This is caused by the inner ear’s fluid rotation. It could be that the flux of time is a similar kind of phenomenon.

ii. Error Theory

Temporal error theory is the following claim: Our belief in the experience of time passing from the future to the present and to the past is false. Temporal error theory can be challenged by considering the origin of our temporal beliefs. Where does our belief in the passage of time come from, if not from a genuine experience of passage, of a very real feeling of time passing by? Torrengo (2017, 175) puts it as follows: “It is part of the common-sense narrative about reality and our experience of it not only that time passes, but that we believe so because we feel that time passes.”

One common metaphor is the flowing river. It is not difficult to find inspirational quotes and bits of prose in which time is compared to the flowing of water (although it is difficult to authenticate such sources!):

“Time flows away like the water in the river.” – Confucius

“Everything flows and nothing abides; everything gives way and nothing stays fixed.” – Heraclitus

“Time is like a river made up of the events which happen, and a violent stream; for as soon as a thing has been seen, it is carried away, and another comes in its place, and this will be carried away too.” – Marcus Aurelius

“River is time in water; as it came, still so it flows, yet never is the same.” – Barten Holyday

“Time is a river without banks.” – Marc Chagall

“I wanted to change my fate, to force it down another road. I wanted to stand in the river of time and make it flow in a different direction, if just for a little while.” – April Genevieve Tucholke

Perhaps we pick up metaphors of a flowing river from fiction and from our ways of using language more generally. Miller, Holcombe, and Latham (2020) speculate that all languages are at least to some degree passage-friendly. That is how we come to mis-describe our phenomenology of time. This approach does not imply that we tacitly conceptualize the world as containing passage, and then come to describe our experience as including time’s passing. Instead, “we only come to tacitly conceptualize the world as containing passage—and hence to believe that it does—once we come to deploy passage-laden language,” Miller, Holcombe, and Latham (2020, 758) write. By this conceptualization, we not only believe time to be passing, but we also describe our temporal phenomenology in terms of time’s passing. From an error-theoretic point of view, this means that we mis-describe our temporal experience.

Note that the error theory should be separated from the illusionist thesis. In the case of illusions, we humans erroneously observe something to be what it is not. There are, for example, well-known optical illusions. Take the finger sausage illusion. Put two index fingers close to your eyes, and you see an individual “sausage” floating in the middle. There is no sausage floating in the air. The illusion is that you really see the floating sausage. We know the mechanism that is productive of the finger sausage illusion, how the gaze direction of the eyes is merged, and the brain corrects this by suppressing one end of the finger. According to the error theory, passage of time is not an illusion because we do not experience time flowing in the first place. We rather falsely believe and describe our temporal phenomenology by using passage-friendly and passage-laden language.

iii. Moving Spotlight

Broad originally expressed (1923, 59) the idea of a moving spotlight: “We imagine the characteristic of presentness as moving, somewhat like the spot of light from a policeman’s bull’s-eye traversing the fronts of the houses in a street.” The illuminated part is the present moment, what was just illuminated is the past, and what so far has not been illuminated is the future. Broad remained critical of this kind of theory. He thought originally in eternalist terms, but his metaphysics of time changed to resemble the growing block view of temporal existence (Thomas 2019).

The moving spotlight theory is a form of eternalism. The past, the present, and the future all exist, yet there is objective becoming. Only one time is absolutely present. That present “glows with a special metaphysical status” (Skow 2009, 666). Cameron (2015, 2) maintains both privileged present and temporary presentness. The former is a thesis according to which there is a unique and privileged present time. The latter is a thesis according to which this objectively special time changes. In other words, for the moving spotlighter, temporal passage is a fundamental feature of reality. The moving spotlight view therefore connects the A-theory with eternalism.

iv. Deflationary Passage

The deflationary theory agrees with traditional anti-realism about passage. There is no unique global passage and direction of time across the entire universe. There is no A-theoretic, tensed passage. There are however local passages of time along observers’ timelike worldlines. Fazekas argues that special relativity supports the idea of “multiple invariant temporal orderings,” that is, multiple B-series of timelike related events. She calls this “the multiple B-series view.” Timelike related events are the only events that genuinely occur successively. “So,” in the view of Fazekas (2016, 216), “time passes in each of the multiple B-series, but there is no passage of time spanning across all events.”

Slavov (2022) argues that the passage of time is a relational, not substantial, feature of reality (see the debate between substantivalism and relationism). Over and above the succession of events, there is no time that independently flows. This should fit with the four-dimensional block view. It, Slavov (2022, 119) argues,

contains dynamicity. Time path belongs to spacetime. The succession of events along observers’ timelike worldlines is objectively, although not uniquely, ordered. One thing comes after another. The totality of what exists remains the same, but there is change between local parts of spacetime regions between an earlier and a later time.

Passage requires that temporally ordered events exist and that there is change from an earlier time to a later time. This is how Mozersky describes his deflationist view in an interview: “such a minimal account captures everything we need the concept of temporal passage to capture: growth, decay, aging, evolution, motion, and so forth.” Growing, decaying, aging, evolving, and moving are all related to change.

c. Persistence and Dimensions of the World

How do things survive change across time? At later times, an object is, however slightly, different from what it used to be at an earlier time. Yet that object is the same object. How is this possible? There are two major views about persistence: endurantism and perdurantism (for a much more detailed and nuanced analysis of persistence, see this article). The former maintains that objects are wholly present at each time of their existence. The objects have spatial parts, but they are not divisible temporally. The eternalists typically side with the latter. Perdurantism is the view that objects are made of spatial and temporal parts, or more specifically, spatiotemporal parts. Most humans are composed of legs, belly, and head in the same way as most humans are composed of childhood, middle-age, and eld. Ordinary objects are so-called spacetime worms; they stretch out through time like earthworms stretch out through space (Hawley 2020).

Endurantism is a three-dimensional theory. An object endures in three-dimensional space. It just sits there, occupying a place in three-dimensional Cartesian space. Time is completely external and independent of the enduring object. Endurantism assumes that the three spatial dimensions are categorically different from the temporal dimension. Perdurantism, for its part, is a four-dimensional theory of persistence. Space and time cannot be completely separated. Bodies do not endure in time-independent space. Rather, objects are composed of spatiotemporal parts. Earlier and later parts exist, as they are parts of the same object. The perdurantist view aligns with eternalism and relativity (Hales and Johnson 2003; Balashov 2010).

Figure 5. Temporal parts of an object in spacetime.

The perdurantist explains change in terms of the qualitative difference among different parts. Change occurs in case an object has incompatible properties at different times. How are the temporal parts connected? What makes, for instance, a person the same person at different times? Lewis (1976a) mentions mental continuity and connectedness. The mental states of a person should have future successors. There should be a succession of continuing mental states. There is a bond of similarity and causal dependence between earlier and later states of the person.

Perdurantism fits nicely with eternalism. It predicates the existence of all temporal parts and times, and it is consistent with the universe having four dimensions. Considering humans, every event in our lives, from our births to our deaths, is real.

d. Free Will and Agency

In his article “A Rigorous Proof of Determinism Derived from the Special Theory of Relativity,” Rietdijk (1966) argues that special relativity theory indicates fatalism and negates the existence of free will. Consider Figure 2. An observer at event B should be able to influence their future. So, they should be able to influence how C will unfold. However, for another observer at event A, event C is simultaneous with A, suggesting that event C is fixed and unalterable. Yet, C lies in the absolute future of the observer at B. C is predetermined. It is an inevitable event, akin to the Andromedan space fleet in Penrose’s example. This notion poses a threat to at least some conception of free will. If the future must be open and indeterminate for agents to choose between alternative possibilities, a relativistic block universe does not allow for such openness.

There are reasons to think that eternalism does not contradict free will. Let’s assume that the future is fixed. It exists one way only. Statements about future events are true or false. Suppose, Miller explains,

It is true that there will be a war with sentient robots. In a sense, we cannot do anything about that; whatever we in fact do, the war with the robots will come to pass. But that does not mean that what you or I choose to do makes no difference to the way the world turns out or that somehow our choices are constrained in a deleterious manner. It is consistent with it being the case that there will be a war with sentient robots, that the reason there is such a war is because of what you and I do now. Indeed, one would expect that the reason there is such a war is in part because we build such robots. We make certain choices, and these choices have a causal impact on the way the world is. These choices, in effect, bring it about that there is a war with the robots in the future. Moreover, it is consistent with the fact that there will be such a war, that had we all made different choices, there would have been no war, and the facts about the future would have been different. The future would equally have been fixed, but the fixed facts would have been other than they are. From the fact that whatever choices we in fact make, these lead to a war with the robots, it does not follow that had we made different choices, there would nevertheless have been a war with the robots (Miller 2013, 357–8).

The future condition of later local regions of the universe depends on the state of their earlier local regions. We, as human agents, have some degree of influence over how things will unfold. For instance, as I compose this article in the 2020s, our actions regarding climate will partly shape the climate conditions in the 2050s. We are not omnipotent, and our understanding of consequences, especially those far into the future, is somewhat uncertain. However, even if it is a fact that the future exists in only one definite way, this does not inherently exclude free will or the causal relationship between actions and their consequences. Subscribing to eternalism does not resolve the debate over free will; one can be a fatalist or affirm the reality of free will within an eternalist framework.

One weakness to note about Rietdijk and Penrose’s arguments, at least if they are used to deny freedom of the will, is their focus on spacelike separated events. These events exist beyond any conceivable causal influence. It is obvious that if a distant civilization, with which we have no communication or interaction, decides to attack us, we cannot influence that decision. Events occurring in regions of the universe beyond our reach remain indifferent to our capacity to make free choices. What truly pertains to freedom of choice are events lying within the future light cone of the observer, not those outside or within the past light cone. Norton provides an illustrative example with mini spacetimes:

Figure 6. Causal connectibility.
Drawing based on Norton (2015, 211).

An observer at the origin of their path in O may only influence events that will be in the multiplicity of the future light cones, along the line towards event E, as they are timelike separated from them. There is no way to affect anything toward the spacelike separated event F. Only future timelike or lightlike separated events can be affected, as in that case the influence stays within the cosmic speed limit, the speed of light. Action from O to S would require the observer to surpass the maximum speed limit.
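The rule of causal connectibility can be sketched in a few lines of code. The following is a minimal illustration, assuming flat spacetime with a single spatial dimension; the `Event` class, the function names, and the coordinates are hypothetical and are not taken from Norton's figure.

```python
from dataclasses import dataclass

C = 299_792_458.0  # speed of light in metres per second


@dataclass(frozen=True)
class Event:
    t: float  # time coordinate in seconds
    x: float  # position in metres (one spatial dimension only)


def separation(a: Event, b: Event) -> str:
    """Classify the interval between two events by the sign of (c*dt)^2 - dx^2."""
    interval_sq = (C * (b.t - a.t)) ** 2 - (b.x - a.x) ** 2
    if interval_sq > 0:
        return "timelike"
    if interval_sq < 0:
        return "spacelike"
    return "lightlike"


def can_influence(origin: Event, target: Event) -> bool:
    """True if target lies on or inside the future light cone of origin."""
    return target.t > origin.t and separation(origin, target) != "spacelike"


# Hypothetical coordinates, chosen only for illustration.
O = Event(t=0.0, x=0.0)
inside_cone = Event(t=2.0, x=C)       # reachable at half the speed of light
outside_cone = Event(t=1.0, x=2 * C)  # would require twice the speed of light

print(separation(O, inside_cone), can_influence(O, inside_cone))    # timelike True
print(separation(O, outside_cone), can_influence(O, outside_cone))  # spacelike False
```

Only the first event could be affected by an agent at O; reaching the second would require exceeding the speed of light.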

An important asymmetry within the eternalist framework is between perception and action. We may never perceive the future or affect the past. When perceiving something, we always perceive the past. An event, distant from the observer, occurs, and then there is a time-lag during which the information gets transmitted to and processed by the observer. The event causing the perception occurs before its perception. Actions, for their part, are always directed toward the future. All of this can be translated into B-language (Mellor 1998, 105). We may affect what happens at a time t before t, but not after t. We may perceive what happens at a time t after t, but not before t. Stated this abstractly, the characterization might be misleading. To clarify, imagine time t as the time when we have lunch. Our breakfast occurs before t, and our dinner happens after t. At breakfast, we can influence what we are going to have for lunch, but we cannot observe it yet. At dinner, we can observe what we had for lunch, but we can no longer influence it.

Traditionally, philosophies of time akin to eternalism employ an all-knowing being that can see all times. Around a thousand years ago, Anselm argued that God is timeless, and so the entire world is immediately present to Him. This indicates that every place and every time is under His immediate perception. All times from the beginning to the end are real. This is a tenseless view of time, which treats past, present, and future as perspectives relative to a temporal perceiver (Rogers 2007, 3). A more modern, science fiction example could be a being who can intuit the four-dimensional spacetime. That kind of being could somehow see the past, the present, and the future equally. The movie Men in Black 3 portrays a being like this, an alien named Griffin. He sees a past baseball game and a future attack of the movie’s villain in the same way as the present.

There is, however, a notable difficulty when it comes to observing the future. Perceiving the future would require turning the temporal asymmetry of causation around. It is hard to understand how causation would function in perception if one could observe events occurring later in time. For example, observing something outdoors requires light originating from the Sun to be reflected towards the observer’s eyes. The photons that strike the retina are eventually transduced into electric charges, which then navigate their way through the brain, resulting in the creation of a visual experience. Reversing the temporal direction in this process would be extremely weird. It would imply that the charges in the brain are transformed into light particles, which are then emitted from the eyes towards the object on Earth, subsequently traveling back towards the Sun and initiating physical processes there.

e. The Possibility of Time Travel

At first sight, presentism cannot accommodate time travel because, according to presentism, there are no various times in which one could travel. There is only the present time, but no other times that we could, even in principle, access. Past objects and future objects do not exist, so we cannot go and meet them, just like we cannot meet fictional beings. Not everyone, however, thinks that presentism could not deal with time travel (see Monton 2003).

For its part, eternalism is, in principle, hospitable to the idea of time travel. If the entire universe exists unconditionally, all of spacetime with its varying regions simply is. We could travel to different times because all times exist. Traveling to different spatial locations is made possible by the existence of all spatial locations and the paths between them. Four-dimensionally, there is a timelike path between different spacetime locations.

Travel to the future is in some sense a trivial idea: we are going toward later times all the time. As writing this encyclopedia article takes x months of my time, it means I am x months farther from my birth and x months closer to my death from starting to finishing writing. Time dilation is consistent with future time travel in another way. An observer traveling close to the speed of light or situated close to a black hole will age more slowly than an observer staying on Earth. After such a space journey, when they get back home to Earth, the traveler would have traveled into the future.
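The size of the effect can be illustrated with the special-relativistic time dilation formula. The sketch below assumes constant speed and ignores acceleration phases and gravitational time dilation; the function names and the journey parameters are hypothetical.

```python
import math


def lorentz_factor(speed_fraction: float) -> float:
    """Lorentz factor for a speed given as a fraction of the speed of light."""
    if not 0 <= speed_fraction < 1:
        raise ValueError("speed must satisfy 0 <= v/c < 1")
    return 1.0 / math.sqrt(1.0 - speed_fraction ** 2)


def traveler_time(earth_years: float, speed_fraction: float) -> float:
    """Proper time elapsed for the traveler while earth_years pass on Earth."""
    return earth_years / lorentz_factor(speed_fraction)


# Hypothetical journey: 10 Earth years at 80% of the speed of light.
earth_years = 10.0
aged = traveler_time(earth_years, 0.8)
print(f"Traveler ages about {aged:.1f} years; Earth ages {earth_years:.1f} years")
```

On these assumptions the traveler returns having aged about six years, while ten years have passed on Earth. In this sense the traveler has moved four years into Earth's future.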

The question about traveling into the past, like to regions of spacetime that precede our own births, is more controversial. If closed timelike curves are possible, then at least in principle time travel to times earlier than our births is possible. This raises interesting questions about what we could do in our pasts. Can we go on and kill our grandparents? Lewis published a famous article in the 1970s, in which he argued that travel to the past is possible, and there is nothing paradoxical about it.

4. Objections

Eternalism has faced numerous strands of criticism. Typical objections concern eternalism’s putative incapability of dealing with change and free will (Ingram 2024). The issue of changelessness was tackled in Section 3.a, and the issue of free will in Section 3.d. Below, five more objections are presented, including possible answers to them.

a. Conventionality of Simultaneity

The conventionality of simultaneity poses a challenge to the classical argument for eternalism (see Ben-Yami 2006; Dieks 2012; Rovelli 2019). The conventionality of simultaneity was something already noted by Poincaré in the late 1800s and by Einstein in the early 1900s. If we are dealing with two spatially separated places, or spacelike separated events, how can we know that these places share the same time and that these events happen at the same time? How do we know that clocks at different places show the same time? How do we know that the ‘now’ here is the same ‘now’ in another place? How do we know that the present extends across space even within a designated inertial frame?

Here we have the problem of synchronization. Say we could construct two ideal clocks with the exact same constitution separated by a distance AB. According to Einstein’s proposal, we may send a ray of light from location A to B, which is then reflected from B to A. The time the signal leaves from A is tA, which is measured by an ideal clock at A. The time it gets to B and bounces back is tB, and that time is measured by an identical clock at B. The time of arrival back at A is tA’, which is measured by the clock at A. The two clocks are in synchrony if tB – tA = tA’ – tB.

In his Philosophy of Space and Time (1958), Reichenbach went on to argue that simultaneity of two distant events is indeterminate. He adopted Einstein’s notation but added the synchronization parameter ε. The definition of synchrony becomes tB = tA + ε(tA’ – tA), 0 < ε < 1. If the speed of light is isotropic, that is, the same in all directions, ε = 1/2. However, because the constant one-way speed of light is a postulate based on a definition, not on any brute fact about nature, the choice of the synchronization parameter is conventional (Jammer 2006, 176–8). Provided that simultaneity is a matter of convention, different choices of the synchronization parameter ε yield different simultaneity relations. If hyperplanes are arbitrary constructions, arguments relying on ontological simultaneity and co-reality relations become questionable. Conventionality implies that spacelike separated events are not even in a definite temporal order. If this is correct, the classical eternalist argument does not even get off the ground.
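The role of the synchronization parameter can be made concrete with a small calculation. This is a sketch under stated assumptions: times are in seconds, distances in light-seconds, and the function names and the round-trip numbers are hypothetical. It applies the definition tB = tA + ε(tA’ – tA) and reports the one-way speeds of light that each choice of ε implies.

```python
def reflection_time(t_depart: float, t_return: float, epsilon: float = 0.5) -> float:
    """Time assigned to the reflection at B: tB = tA + epsilon * (tA' - tA)."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return t_depart + epsilon * (t_return - t_depart)


def one_way_speeds(distance: float, t_depart: float, t_return: float, epsilon: float):
    """Outbound and return speeds of light implied by a choice of epsilon."""
    t_reflect = reflection_time(t_depart, t_return, epsilon)
    outbound = distance / (t_reflect - t_depart)
    inbound = distance / (t_return - t_reflect)
    return outbound, inbound


# Hypothetical setup: B is 3 light-seconds from A; the signal leaves A at 0 s
# and arrives back at A at 6 s. Speeds are in units of c.
for eps in (0.5, 0.25, 0.75):
    t_b = reflection_time(0.0, 6.0, eps)
    out, back = one_way_speeds(3.0, 0.0, 6.0, eps)
    print(f"epsilon={eps}: tB={t_b} s, outbound={out:.3f} c, return={back:.3f} c")
```

Every choice of ε reproduces the same measured round trip of 6 seconds, so the round-trip measurement alone does not fix which event at A counts as simultaneous with the reflection at B. Only ε = 1/2 makes the two one-way speeds equal.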

The conventionality objection relates to the issue of transitivity. Some relations are clearly transitive. They form chains in which transitivity holds. If A is bigger than B and B bigger than C, then A is bigger than C. Yet if A is B’s friend and B is C’s friend, it does not necessarily follow that A is C’s friend. How about simultaneous events being equally real? Do the premises AsB → ArB and AsC → ArC hold in the first place? They should, if we are to truthfully infer that ArB ∧ ArC → BrC. If distant simultaneity is a matter of convention, there seems to be no room for implying that events happening at the same time share the same reality value. Moreover, per Reichenbach’s causal theory of time, only causally connectable events lying in the observer’s light cones are genuinely temporally related. Outside light cones, we are in the regions of superluminal signals. What lies outside light cones is in principle not temporally related; spacelike separated events are neither simultaneous (occurring now) nor past or future. There is no fact of the matter as to which order non-causally connectable events occur. This applies to a great many events in the universe. As the universe is expanding, there are regions that are not causally connected. The different regions have the Big Bang as their common cause, but they are not otherwise affecting each other. They are therefore not temporally related.

Although conventionalism can be laid against eternalism, it can also be taken to support eternalism. Presentism and the growing block view require that the present moment is universal. There should be a unique, completely universal present hyperplane that connects every physical event in the universe. In that case, it should be true that the present time for the observer on Earth is the same time as in any other part of the universe. This means that it also should be true that a time that is past for an observer on Earth is past for an observer at any other place in the universe. Likewise, a future time for an observer on Earth, which both the presentist and the no-futurist think does not yet exist, does not exist for any observer anywhere. Presentism and the growing block view both should accept the following to be true: “The moment a nuclear reaction in Betelgeuse occurs, the year 2000 on Earth has either passed or not.” If we consider the relativity of simultaneity, in some frames the year 2000 has passed at the time the reaction occurs but in some other frames it has not. If we consider the conventionality of simultaneity, the statement in question is not factual to begin with. For the non-eternalist, the statement must be true; for the eternalist, it is either false (based on the relativity of simultaneity) or not truth-apt (based on the conventionality of simultaneity). In both cases, the statement in question is not true, as the passing of the year 2000 and the reaction occurring have no unique simultaneity relation.
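The frame-dependence of the Betelgeuse statement can be illustrated with the Lorentz transformation of time. The sketch below uses units in which c = 1 (years and light-years). The distance of 600 light-years is an assumed round figure used only for illustration, and the function names are hypothetical.

```python
import math


def time_in_moving_frame(t: float, x: float, v: float) -> float:
    """Lorentz-transformed time t' = gamma * (t - v*x), in units where c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x)


def verdict(v: float, distance: float = 600.0) -> str:
    """Has the year 2000 passed on Earth when the distant reaction occurs?

    In Earth's rest frame the reaction (at x = distance) is taken to be
    simultaneous with the end of the year 2000 on Earth (t = 0, x = 0).
    """
    t_reaction = time_in_moving_frame(0.0, distance, v)
    t_year_end = time_in_moving_frame(0.0, 0.0, v)
    if t_reaction > t_year_end:
        return "passed"
    if t_reaction < t_year_end:
        return "not yet passed"
    return "ending right now"


# Frames moving away from the star, at rest, and toward the star.
for v in (-0.1, 0.0, 0.1):
    print(f"v = {v:+.1f} c: the year 2000 has {verdict(v)}")
```

The same pair of events receives three different temporal verdicts, depending on the frame. This is the sense in which the quoted statement lacks a unique truth value under the relativity of simultaneity.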

b. Neo-Lorentzianism

Historically, Lorentz provided an ether-based account of special relativity. His theory retains absolute simultaneity and is, in some circumstances, empirically equivalent to Einstein-Minkowski four-dimensional theory. Lorentz’s theory is not part of currently established science. It was abandoned quite a long time ago, as it did not fit with emerging general relativity and quantum physics (Acuña 2014).

There has, however, been an emerging interest in so-called Neo-Lorentzianism about relativity in the 2000s. Craig (2001) and Zimmerman (2007) have both argued, although not in exactly the same way, for the existence of absolute simultaneity. This interpretation of special relativity would, against the orthodox interpretation, back up presentism. Craig’s theological presentism leans on the existence of God. “For God in the ‘now’ of metaphysical time,” Craig explains (2001, 173), would “know which events in the universe are now being created by Him and are therefore absolutely simultaneous with each other and with His ‘now.’” According to this interpretation, there is a privileged frame of reference, God’s frame of reference. Zimmerman does not explicitly invoke Neo-Lorentzianism. In his view, there is nevertheless a privileged way to carve up spacetime into a present hyperplane:

My commitment to presentism stems from the difficulty I have in believing in the existence of such entities as Bucephalus (Alexander the Great’s horse) and the Peloponnesian War, my first grandchild, and the inauguration of the first female US president. It is past and future objects and events that stick in my craw. The four-dimensional manifold of space-time points, on the other hand, is a theoretical entity posited by a scientific theory; it is something we would not have believed in were it not for its role in this theory, and we should let the theory tell us what it needs to be like. As a presentist, I believe that only one slice of this manifold is filled with events and objects (Zimmerman 2007, 219).

These approaches maintain that the present moment is ontologically privileged, as there is a privileged frame of reference and a privileged foliation of spacetime. These views can be seen as revising science on theological and metaphysical grounds. Some, like Balashov and Janssen (2003), Wüthrich (2010, 2012), and Baron (2018), have criticized these strategies. Here we may also refer to Wilson’s (2020, 17) evidential asymmetry principle. To paint with a broad brush, physical theories are better corroborated than metaphysical theories. Physical theories, like special relativity, are supported by a vast, cross-cultural, and global consensus (not to mention the immense amount of empirical evidence and technology that requires the theory). Making our metaphysics match with science is less controversial than making our science match with intuitively appealing metaphysics. Many intuitions (only the present exists, time passes unidirectionally along past-present-future, a parent cannot be younger than their children) can be challenged based on modern physics.

Alternative interpretations of relativity usually invoke something like the underdetermination thesis. There is more than just one theory, or more than one version of the same theory, that corresponds to the empirical data. Hence, the empirical data alone do not determine what we should believe. This motivates the juxtaposition of the Einstein-Minkowski four-dimensional theory and the Lorentz ether theory. Even though there are historical cases of rival theories that at some point in history accounted for the data equally well, this does not mean that the two theories are equal contestants in contemporary science. The two theories might not be empirically equivalent based on current knowledge. Impetus/inertia, phlogiston/oxygen, and Lorentz/Einstein make interesting alternatives from a historical viewpoint. It would be a false balance to portray them as equally valid ways of understanding the natural world. Lorentz’s theory did not fit with the emerging general relativity and quantum field theories. These theories have made a lot of progress throughout the 20th and 21st centuries, and they are parts of established science, unlike the ether theory. Moreover, Einstein’s 1905 theory did not only pave the way for subsequent physics. It also corrected the preceding Maxwellian-Hertzian electrodynamics by showing that electric and magnetic fields are relative quantities; they do not require an ether in which the energy of the field is contained. Maxwell’s electrodynamics is an important part of classical physics and electrical engineering without the assumption of a space-permeating ether.

c. General Relativity

Although special relativity, at least its orthodox interpretation, does not lend support to presentism or the growing block view, things might be different in case of general relativity. That theory includes the so-called Friedmann-Lemaître-Robertson-Walker (FLRW) metric. This enables one to argue for cosmic simultaneity, the unique hypersurface of cosmic time. This idea requires a fundamental observer that could be construed as one who is stationary relative to the microwave background. In this sense, presentists or growing blockers may argue that although special relativity is at odds with their accounts of the nature of time, this is not so in the case of more advanced science. Swinburne (2008, 244), for one, claims “that there is absolute simultaneity in our homogeneous and isotropic universe of galaxies receding from each other with a metric described by the” FLRW solution. As pointed out by Callender (2017, 75–6) and Read and Qureshi-Hurst (2021), however, we are not fundamental observers, as we are in various relative states of motion. We move on our planet; our planet rotates around its axis; it orbits the Sun; it moves in relation to our galaxy, which in turn moves in relation to other galaxies (for a more astute description, see Pettini 2018, 2). This indicates that we do not have local access to cosmic time.

Black hole physics is also troublesome for presentism and the growing block view. It retains the frame-relativity of simultaneity, so not all observers agree on what is present (see Romero and Pérez 2014, section “Black holes and the present”). Baron and Le Bihan (2023) consider the idea of surface presentism based on general relativity. According to surface presentism, what exists is limited to a three-dimensional hypersurface. Although surface presentism allows that there is no preferred frame of reference in physics, it maintains there is a preferred frame in the metaphysical sense. This anchors the one and only actual present moment. Consider the event horizon. Nothing, not even massless particles like photons, can escape from a black hole. The event horizon is the limit between what goes into the black hole and can never leave, and the rest of the universe. The later times of the region beyond the event horizon are ones in which nothing that has entered it will ever escape. What is relevant for the metaphysics of time is the ontological dependence of earlier and later times in black hole physics.

According to the argument of Baron and Le Bihan, there would be no event horizons if surface presentism were true. Surface presentism, as well as presentism in general and the growing block view, maintains there is no future or times later than the present time. Briefly put, the very existence of black holes as evidenced by general relativity is against presentism. As nothing can escape the interior of a black hole after entering it, there is an ontological reference to a later time. No matter how long it takes, nothing can escape. To paraphrase and interpret Curiel (2019, 29), the location of the event horizon in spacetime requires the entire structure of spacetime, from the beginning to the end (and all the way to infinity). All spacetime exists.

d. Quantum Physics

Whereas relativity is a classical, determinist theory and so fits well with positing a fixed future, quantum physics is often interpreted in probabilistic terms. This may imply that the future is nonexistent and open, which would flatly contradict eternalism. There are good reasons for this conclusion. Take the famous double-slit experiment.

In this experiment, a gun steadily fires electrons or photons individually towards two slits. Behind the slits, a photographic plate registers the hits. Each particle leaves a mark on the plate, one at a time. When these particles are shot one by one, they land individually on the detector screen. However, when a large number of particles are shot, an interference pattern begins to emerge on the screen, indicating wave-like behavior.

At first glance, this is highly peculiar because particles and waves are fundamentally different: particles are located in specific spatial regions, while waves spread out across space.

What is important for debates concerning the nature of time is the probabilistic character of the experimental outcome. Determinist theories, within the margin of error, enable the experimenter to precisely predict the location of the particle in advance of the experiment.

In the double-slit experiment, one cannot even in principle know, before carrying out the experiment, where the individual particle will eventually hit the screen. To put it a bit more precisely, the quantum particle is associated with a probability density. There is a proportionality that connects the wave and the particle nature of matter and light. The probability density is proportional to the square of the amplitude of the associated wave: the electromagnetic wave in the case of light, and the wave function in the case of electrons. This means we can assign probabilities for detecting the particle at the screen. It is more likely that it will be observed at one location as opposed to another.
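The link between squared amplitude and detection probability can be illustrated with a toy simulation. This is an idealized sketch, not a model of any particular experiment: it uses the small-angle approximation, ignores the single-slit envelope, and all apparatus values and names are hypothetical.

```python
import math
import random


def relative_density(y, slit_separation, wavelength, screen_distance):
    """Idealized two-slit pattern: density proportional to the squared amplitude."""
    phase = math.pi * slit_separation * y / (wavelength * screen_distance)
    amplitude = math.cos(phase)
    return amplitude ** 2


# Hypothetical apparatus in SI units, chosen only for illustration.
d, lam, L = 1e-4, 5e-7, 1.0                       # fringe spacing lam*L/d = 5 mm
positions = [i * 1e-4 for i in range(-100, 101)]  # screen from -10 mm to +10 mm
weights = [relative_density(y, d, lam, L) for y in positions]
total = sum(weights)
probabilities = [w / total for w in weights]

# Each simulated particle lands at a single position that cannot be predicted...
rng = random.Random(0)
hits = rng.choices(positions, weights=probabilities, k=5000)

# ...but many hits pile up near bright fringes and avoid dark ones.
near_bright = sum(1 for y in hits if abs(y) < 5e-4)
near_dark = sum(1 for y in hits if abs(abs(y) - 2.5e-3) < 5e-4)
print("hits near the central bright fringe:", near_bright)
print("hits near the first dark fringes:", near_dark)
```

The simulation only assigns probabilities to screen positions. Where any single simulated particle lands is settled by the random draw, which mirrors the point that the theory gives likelihoods, not individual outcomes.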

This can be taken to imply that the future location of the particle is a matter of open possibility. Before performing the experiment, it is a random tensed fact where the particle will be. Yet eternalism cannot allow such open possibilities, because it treats any event as tenselessly existing. To give a more commonsensical example, eternalism indicates that the winning lottery numbers of the next week’s lottery exist. We were ignorant of those numbers before the lottery because we were not at the spacetime location in which we could see the numbers. Yet the no-futurist, probabilistic interpretation of the situation suggests that the next week’s lottery numbers do not yet exist. The machine will eventually randomly pick out a bunch of numbers.

Putnam’s classical semantic argument for eternalism assumes that statements concerning events have definite truth values, independently of whether they are in the observer’s past or future. This is consistent with a determinist theory, but quantum theory might require future-tense statements that have probabilistic truth values. Hence future events would have probabilistic reality values. Statements concerning what will occur do not have bivalent truth values; instead, their values range between 0 and 1 in the closed interval [0, 1]. Sudbery (2017) has developed a logic of the future from a quantum perspective. Sudbery (2017, 4451–4452) argues “that the statements any one of us can make, from his or her own perspective in the universe,” when they concern the future, “are to be identified with probabilities.” This account seems to go well with quantum physics and no-futurist views like presentism or the growing block.

One way for the eternalist to answer this objection is to consider a determinist interpretation of quantum mechanics, like the many worlds interpretation, initiated by Everett in 1957. Greaves and Myrvold (2010, 264) encapsulate the underlying Everettian idea: “All outcomes with non-zero amplitude are actualized on different branches.” All quantum measurements correspond to multiple measurement results that genuinely occur in different branches of reality. Under Everettianism, one could think that there is no ‘now’ that is the same everywhere in all of physical reality, but different worlds/branches have their own times. Different worlds within the Everett multiverse or different branches within the single universe are causally isolated. This is not much different from relativistic spacelike separation: different locations in the universe are in each other’s absolute elsewhere, not connected by any privileged hyperplane of simultaneity. There is no unique present moment that cuts through everything that exists and defines all that exists at that instant.

A potential challenge to classical eternalist arguments that draw on the relativity of simultaneity comes from quantum entanglement. Based on the ideas of Bell, Aspect and his team were able in the 1980s to experimentally corroborate a non-local theory. Two particles, separated by distance, turn out instantly to have correlated properties. This could not be the case with only locally defined physical states of the particles. Maudlin (2011, 2) explains that they:

appear to remain “connected” or “in communication” no matter how distantly separated they may become. The outcome of experiments performed on one member of the pair appears to depend not just on that member’s own intrinsic physical state but also on the result of experiments carried out on its twin.

Non-locality might introduce a privileged frame of reference. (For a thorough discussion on non-locality and relativity, see Maudlin (2011).) If this is so, the classical argument that relies on chains of simultaneity relations and their transitivities would perhaps be challenged. A question still remains about time. Gödel (1949) argued that objective passage requires a completely universal hyperplane, a global ‘now’ that constantly recreates itself. It is not clear whether the instant correlation of distant particles in quantum entanglement introduces this kind of unique spacetime foliation required for exclusive global passage of time.

The research programs on quantum gravity aim to weave together relativistic and quantum physics, considering both gravitational and quantum effects. This could potentially yield the most fundamental physical theory. Some approaches to quantum gravity indicate that spacetime is not fundamental. When one reaches the Planck scale, 1.62 × 10^-35 meters and 5.40 × 10^-44 seconds (Crowther 2016, 14), there might not be space and time as we know them. At first glance, this might challenge eternalism, as the classical argument for it leans on four-dimensional spacetime. Le Bihan (2020) analyzes string theory and loop quantum gravity, arguing that both align with an eternalist metaphysics of time. If deep-down the world is in some sense timeless, the distinct parts of the natural world still exist unconditionally. This undermines presentism and no-futurism, since they rely on positing a global present by means of a unique, universal hyperplane and an absolute distinction between A-properties. Whether something is past, present, or future is not determined by fundamental physics.

e. Triviality

A case has been made that the presentist/eternalist debate lacks substance. Dorato (2006, 51), for one, claims that the whole issue is ill-founded from an ontological viewpoint. Presentism and eternalism reflect our different practical attitudes toward the past, present, and future. Dorato notes that Putnam’s assertion “any future event X is already real” (Putnam 1967, 243) is problematic. This assertion seems to implicitly assume presentism and eternalism. By saying that “the future is already,” we are saying something like “the future is now.” This is contradictory: the present and the future are different times. They cannot exist at the same time.

To be more precise, Dorato (2006, 65) distinguishes between the tensed and tenseless senses of the verb “exist.” In the tensed sense, an event exists in the sense that the event exists now. In the tenseless sense, an event exists in the sense that it existed, exists now, or will exist. These become trivial definitions: both presentists and eternalists accept them. The whole debate is verbal. Presentism maintains that past or future events do not exist now. What happens is that “presentism becomes a triviality” because “both presentists and eternalists must agree that whatever occurs” in the past or future “does not exist now!”. Instead of being an ontological debate, the presentism/eternalism dialogue is a matter of differing existential attitudes toward the past, present, and future.

One way to answer the worry of triviality—and the charge that the whole debate is merely verbal—is to add that presentism and eternalism disagree on what exists unconditionally (the Latin phrase is simpliciter). Consider the following statement S:

S: “Cleopatra exists unconditionally.”

The presentist thinks S is false. All that exists for the presentist are the presently existing entities. When I am composing or you are reading this article, Cleopatra does not exist anymore. The original claim is false, according to the presentist. The eternalist disagrees: S is true for the eternalist while I’m writing and you are reading. We are not in the same spacetime location as Cleopatra, but nevertheless there is a location that the living Cleopatra occupies (Curtis and Robson 2016, 94).

The sense in which presentism and eternalism ontologically agree/disagree can be clarified by specifying the domain of quantification. This clarification borrows from predicate logic. Presentism and eternalism both agree on restricted quantification. It is true, according to presentism and eternalism, that Cleopatra does not exist anymore while writing or reading this article. The two nevertheless respond differently to unrestricted quantification. When quantifying over the totality of what exists, the presentist maintains that the quantification is over the presently existing entities, while the eternalist maintains that the quantification is over the past, the present, and the future entities.
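The contrast between restricted and unrestricted quantification can be sketched with a toy model. The following is only an illustration under stated assumptions: the entity list, the temporal tags, and the function names are hypothetical and serve solely to show that the two views agree on the restricted domain while assigning different domains to the unrestricted quantifier.

```python
# Toy model: each entity is tagged with a temporal location relative to "now".
# The entities and tags are illustrative assumptions, not taken from the literature.
ENTITIES = {
    "Cleopatra": "past",
    "the reader": "present",
    "tomorrow's sunrise": "future",
}


def restricted_domain():
    """Quantification restricted to what exists now; both views accept this domain."""
    return {name for name, when in ENTITIES.items() if when == "present"}


def unrestricted_domain(view):
    """What each view takes the unrestricted quantifier to range over."""
    if view == "presentism":
        # Only presently existing entities exist unconditionally.
        return {name for name, when in ENTITIES.items() if when == "present"}
    if view == "eternalism":
        # Past, present, and future entities all exist unconditionally.
        return set(ENTITIES)
    raise ValueError(f"unknown view: {view}")


def exists(name, domain):
    """Evaluate 'there is an x such that x = name' over the given domain."""
    return name in domain


# Statement S ("Cleopatra exists unconditionally") under each view.
s_presentist = exists("Cleopatra", unrestricted_domain("presentism"))
s_eternalist = exists("Cleopatra", unrestricted_domain("eternalism"))
# The restricted claim ("Cleopatra exists now"), on which both views agree.
cleopatra_now = exists("Cleopatra", restricted_domain())
```

In this sketch the disagreement shows up only in `unrestricted_domain`; `restricted_domain` does not take a view as input, mirroring the point that both sides accept the restricted claim.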

There are many other reasons to think that presentism and eternalism imply separate temporal ontologies. These aspects have already been discussed in this article: dimensionality of the world, indexicality, and persistence. Presentism maintains the existence of a three-dimensional spatial totality plus the universal present moment that cuts through the whole universe. Eternalism denies there is such a universal present moment and that existence is restricted to the present time. Instead, the entire block universe exists (Wüthrich 2010). Eternalism is very different from presentism because it predicates the existence of all events, irrespective of their contingent spacetime location. Presentists cannot accept the existence of all events from the beginning of the universe to its end but limit existence to the presently existing entities, which are thought to be the only existent entities. Presentists may well treat the spatial location ‘here’ as indexical but hold the present to be absolute. Eternalists (apart from the moving spotlight view) treat both spatial and temporal tensed locations as indexicals. Presentists typically assume an endurantist account of persistence, the view that objects persist through time as temporally indivisible wholes, whereas eternalists typically subscribe to perdurantism, the view that objects have temporal parts in spacetime.

5. References and Further Reading

  • Acuña, P. (2014) “On the Empirical Equivalence between Special Relativity and Lorentz’s Ether Theory.” Studies in History and Philosophy of Modern Physics 46: 283-302.
  • Armstrong, D. M. (2004) Truth and Truthmakers. Cambridge: Cambridge University Press.
  • Balashov, Y. and M. Janssen (2003) “Presentism and Relativity.” British Journal for the Philosophy of Science 54: 327-46.
  • Balashov, Y. (2010) Persistence and Spacetime. New York: Oxford University Press.
  • Baron, S. (2018). “Time, Physics, and Philosophy: It’s All Relative.” Philosophy Compass 13: 1-14.
  • Baron, S. and B. Le Bihan (2023) “Composing Spacetime.” The Journal of Philosophy.
  • Ben-Yami, H. (2006) “Causality and Temporal Order in Special Relativity.” The British Journal for the Philosophy of Science 57: 459-79.
  • Broad, C. D. (1923) Scientific Thought. London: Routledge and Kegan Paul.
  • Callender, C. (2017) What Makes Time Special? New York: Oxford University Press.
  • Cameron, R. (2015) The Moving Spotlight. New York: Oxford University Press.
  • Craig, W. L. (2001). Time and the Metaphysics of Relativity. Dordrecht: Springer.
  • Crowther, K. (2016) Effective Spacetime. Understanding Emergence in Effective Field Theory and Quantum Gravity. Cham: Springer.
  • Curiel, E. (2019) “The Many Definitions of a Black Hole.” Nature Astronomy 3: 27–34.
  • Curtis L. and Robson J. (2016) A Critical Introduction to the Metaphysics of Time. London: Bloomsbury Academic.
  • Dieks, D. (2012) “Time, Space, Spacetime.” Metascience 21: 617-9.
  • Dorato, M. (2006) “Putnam on Time and Special Relativity: A Long Journey from Ontology to Ethics.” European Journal of Analytic Philosophy 4: 51-70.
  • Dowden, B. (2024) “Time.” Internet Encyclopedia of Philosophy, https://iep.utm.edu/time/.
  • Einstein, A. (1905/1923) “On the Electrodynamics of Moving Bodies.” In Lorentz et al. (eds.), The Principle of Relativity, 35‒65. Translated by W. Perrett and G. B. Jeffery. Dover Publications.
  • Everett, H. I. (1957) “Relative State Formulation of Quantum Mechanics.” Reviews of Modern Physics 29: 454-62.
  • Fazekas, K. (2016) “Special Relativity, Multiple B-Series, and the Passage of Time.” American Philosophical Quarterly 53: 215-29.
  • Greaves, H. and W. Myrvold (2010) “Everett and Evidence.” In Saunders, S., Barrett, J., Kent, A. and Wallace, D. (eds.), Many Worlds? Everett, Quantum Theory, & Reality, 264-306. New York: Oxford University Press.
  • Grünbaum, A. (1973) Philosophical Problems of Space and Time. Second, enlarged version. Dordrecht: Reidel.
  • Gödel, K. (1949) “A Remark about the Relationship between Relativity Theory and the Idealistic Philosophy.” In Schilpp, P. A. (ed.), Albert Einstein: Philosopher-Scientist, 555-62. La Salle, Illinois: Open Court.
  • Hales, S. and Johnson, T. (2003) “Endurantism, Perdurantism and Special Relativity.” Philosophical Quarterly 53: 524–539.
  • Hawley, K. (2020) “Temporal Parts.” The Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/temporal-parts/.
  • Hinton, C. (1904) The Fourth Dimension. London: Swan Sonnenschein & Co. Ltd.
  • Ingram, D. (2024) “Presentism and Eternalism.” In N. Emery (ed.), The Routledge Companion to Philosophy of Time. Routledge.
  • Jammer, M. (2006) Concepts of Simultaneity: From Antiquity to Einstein and Beyond. Baltimore: Johns Hopkins University Press.
  • Maxwell, N. (1985) “Are Probabilism and Special Relativity Incompatible?” Philosophy of Science 52: 23–43.
  • Le Bihan, B. (2020) “String Theory, Loop Quantum Gravity, and Eternalism.” European Journal for Philosophy of Science 10: 1-22.
  • Lewis, D. (1976a) “Survival and Identity.” In A. O. Rorty (ed.), The Identities of Persons. Berkeley: University of California Press, 17-40.
  • Lewis, D. (1976b) “The Paradoxes of Time Travel.” American Philosophical Quarterly 13: 145-52.
  • Mar, G. (2017) “Gödel’s Ontological Dreams.” In Wuppuluri, S. and Ghirardi, G. (eds.), Space, Time, and the Limits of Human Understanding, 461-78. Cham: Springer.
  • Maudlin, T. (2011) Quantum Non-locality and Relativity. Third Edition. Wiley-Blackwell.
  • McTaggart, J. M. E. (1908) “The Unreality of Time.” Mind 17: 457-74.
  • Mellor, H. (1998) Real Time II. London and New York: Routledge.
  • Miller, K. (2013). “Presentism, Eternalism, and the Growing Block.” In H. Dyke & A. Bardon, eds., A Companion to the Philosophy of Time. Malden, MA: Wiley-Blackwell, 345-64.
  • Miller, K. et al. (2020) “Temporal Phenomenology: Phenomenological Illusion versus Cognitive Error.” Synthese 197: 751-71.
  • Minkowski, H. (1923) “Space and Time.” In Lorentz et al. (eds.), The Principle of Relativity, 73-91. Translated by W. Perrett and G. B. Jeffery. Dover Publications.
  • Monton, B. (2003) “Presentists Can Believe in Closed Timelike Curves.” Analysis 63: 199-202.
  • Norton, J. (2015) “What Can We Learn about the Ontology of Space and Time from the Theory of Relativity.” In L. Sklar (ed.), Physical Theory: Method and Interpretation, Oxford University Press, 185-228.
  • O’Sullivan, L. (2023) “Frege and the Logic of Historical Propositions.” Journal of the Philosophy of History 18: 1-26.
  • Penrose, R. (1989) The Emperor’s New Mind: Concerning Computers, Minds, and Laws of Physics. New York and Oxford: Oxford University Press.
  • Peterson, D. and M. D. Silberstein (2010) “Relativity of Simultaneity and Eternalism.” In Petkov (ed.), Space, Time, and Spacetime, 209-37. Heidelberg: Springer.
  • Pettini, M. (2018) “Introduction to Cosmology – Lecture 1. Basic concepts.” Online lecture notes: https://people.ast.cam.ac.uk/~pettini/Intro%20Cosmology/Lecture01.pdf
  • Price, H. (2011) “The flow of time.” In C. Callender, ed., The Oxford Handbook of Philosophy of Time. Oxford: Oxford University Press, 276-311.
  • Putnam, H. (1967) “Time and Physical Geometry.” The Journal of Philosophy 64: 240-7.
  • Read, J. and E. Qureshi-Hurst (2021) “Getting Tense about Relativity.” Synthese 198: 8103–8125.
  • Reichenbach, H. (1958) The Philosophy of Space and Time. Translated by Maria Reichenbach and John Freund. New York: Dover Publications, Inc.
  • Rietdijk, C. W. (1966) “A Rigorous Proof of Determinism Derived from the Special Theory of Relativity.” Philosophy of Science 33: 341-4.
  • Rogers, K. A. (2007) “Anselmian Eternalism: The Presence of a Timeless God.” Faith and Philosophy 24: 3-27.
  • Romero, G. E. and D. Pérez (2014) “Presentism Meets Black Holes.” European Journal for Philosophy of Science 4: 293-308.
  • Rovelli, C. (2019) “Neither Presentism nor Eternalism.” Foundations of Physics 49: 1325-35.
  • Russell, B. (1925) The ABC of Relativity. New York and London: Harper and Brothers.
  • Russell, B. (1915) “On the Experience of Time.” The Monist 25: 212-23.
  • Skow, B. (2009) “Relativity and Moving Spotlight.” The Journal of Philosophy 106: 666–78.
  • Slavov, M. (2022) Relational Passage of Time. New York: Routledge.
  • Smart, J. (1949) “The river of time.” Mind 58: 483-94.
  • Sudbery, A. (2017) “The Logic of the Future in Quantum Theory.” Synthese 194: 4429-53.
  • Swinburne, R. (2008) “Cosmic Simultaneity.” In W. L. Craig & Q. Smith (Eds.), Einstein, Relativity and Absolute Simultaneity, 224–261. London: Routledge.
  • Thomas, E. (2019) “The Roots of C. D. Broad’s Growing Block Theory of Time.” Mind 128: 527-49.
  • Thyssen, P. (2023) “The Rietdijk–Putnam–Maxwell Argument.” https://philarchive.org/rec/THYTRA.
  • Torrengo, G. (2017) “Feeling the Passing of Time.” Journal of Philosophy 114: 165–188.
  • Waller, J. (2012) Persistence through Time in Spinoza. Lexington Books.
  • Williams, D. C. (1951) “The Myth of Passage.” The Journal of Philosophy 48: 457-72.
  • Wilson, A. (2020) The Nature of Contingency: Quantum Physics as Modal Realism. Oxford: Oxford University Press.
  • Wüthrich, C. (2010) “No Presentism in Quantum Gravity.” In V. Petkov (ed.), Space, Time, and Spacetime: Physical and Philosophical Implications of Minkowski’s Unification of Space and Time, 257-78. Heidelberg: Springer.
  • Wüthrich, C. (2012) “The Fate of Presentism in Modern Physics.” In Ciuni, Miller, and Torrengo (eds.), New Papers on the Present—Focus on Presentism, 92-133. München: Philosophia Verlag.
  • Zimmerman, D. (2007) “The Privileged Present: Defending an ‘A-Theory’ of Time.” In Sider, T. et al. (eds.), Contemporary Debates in Metaphysics, 211-25. Blackwell.

 

Author Information

Matias Slavov
Email: matias.slavov@tuni.fi
Tampere University
Finland

The Cognitive Foundations and Epistemology of Arithmetic and Geometry

How is knowledge of arithmetic and geometry developed and acquired? In the tradition established by Plato and often associated with Kant, the epistemology of mathematics has been focused on a priori approaches, which take mathematical knowledge and its study to be essentially independent of sensory experience. Within this tradition, there are two a priori approaches. In the epistemological a priori approach, mathematical knowledge is seen as being a priori in character. In the methodological a priori approach, the study of the nature of mathematical knowledge is seen primarily as an a priori philosophical pursuit. Historically, there have been philosophers, most notably Mill in 1843, who have challenged the epistemological a priori approach. By contrast, until the 21st century, the methodological a priori approach has remained unchallenged by philosophers.

In the first two decades of the 21st century, the methodological a priori approach has received serious challenges concerning both arithmetic and geometry, which are generally considered to be among the most fundamental areas of mathematics. Empirical results have emerged that suggest that human infants and many non-human animals have something akin to arithmetical and geometrical capacities. There has been a great deal of disagreement over the philosophical significance of such results. Some philosophers believe that these results are directly relevant to philosophical questions concerning mathematical knowledge, while others remain sceptical.

This article presents some key empirical findings from the cognitive sciences and how they have been applied to the epistemology of arithmetic and geometry. It is divided into two parts. The first part is focused on arithmetic. Results on early quantitative cognition are reviewed, and important conceptual and terminological distinctions are made, after which the importance of these empirical data for the epistemology of arithmetic is discussed. Two separate but connected problems are distinguished: the development of arithmetical knowledge on the level of individual ontogeny, and on the level of phylogeny and cultural history. The role of culture in the development of arithmetical knowledge is discussed, after which general epistemological considerations are provided. While at present the empirical data relevant to the development of arithmetic are stronger and more plentiful, there is also a growing body of data relevant to the development of geometry. In the second part, these data are used to provide geometrical knowledge with a similar treatment to that provided to arithmetical knowledge.

Table of Contents

  1. Arithmetic
    1. The A Posteriori and Mathematics
      1. Empiricism in the Philosophy of Mathematics
    2. Empirical Research and the Philosophy of Mathematics
  2. Numerical and Arithmetical Cognition
    1. Arithmetic and Proto-Arithmetic
    2. Acquisition of Number Concepts and Arithmetical Knowledge
    3. Embodied Mind and Enculturation
    4. From Ontogeny to Phylogeny and Cultural History
    5. Ordinal or Cardinal
    6. Empirically-Informed Epistemology of Arithmetic
  3. Geometry
    1. The Cognitive Foundations of Geometry
      1. Proto-Geometrical Cognition
      2. Shape Recognition
      3. Orientation
    2. The Development of Geometric Cognition
  4. References and Further Reading

1. Arithmetic

a. The A Posteriori and Mathematics

i. Empiricism in the Philosophy of Mathematics

Traditional pre-19th-century Western philosophy of mathematics is often associated with two specific views. The first is Plato’s (The Republic) notion that mathematics concerns mind-independent abstract objects. The second is Immanuel Kant’s (1787) view of mathematical knowledge as synthetic a priori. These views, while not necessarily connected, are compatible. Combining them yields a standard Platonist view of mathematics: mathematical knowledge is acquired and justified through reason and recollection, and it concerns mind-independent abstract objects.

This standard view can be challenged from different directions. Conventionalists, for example, deny that mathematical knowledge concerns mind-independent objects. Hence, on their view, mathematics does not give us genuinely new knowledge about the world (in a broad sense), and should be considered analytic a priori (see, for example, Carnap, 1937). A less popular challenge claims that mathematical knowledge is not essentially independent of sensory experience, but a posteriori in character. The most famous such empiricist view was presented by John Stuart Mill (1843). In the late 20th century, empiricist epistemologies of mathematics were supported by Philip Kitcher (1983) and Penelope Maddy (1990). Maddy connects empiricism to mathematical realism, while, according to Kitcher, mathematical knowledge concerns generalisations of operations that we undertake in our environment. A similar view is supported by George Lakoff and Rafael Núñez (2000), who focus on the use of conceptual metaphors in cognitive development.

While there are many similarities in the approaches of Kitcher and Lakoff and Núñez, there are also notable differences. Importantly, Lakoff and Núñez make more connections to the empirical literature on the development of mathematical cognition. From the 1990s on, authors began taking this approach even more seriously, making important use of empirical studies on early numerical and geometrical cognition in the epistemology of mathematics. Significantly, such authors are not necessarily empiricists concerning mathematical knowledge. Accordingly, it is important to distinguish between empiricist and empirically-informed epistemological theories of mathematics. While the former are likely to include the latter, the opposite is not necessarily the case. Since the late 20th century, many philosophers have proposed epistemological accounts of arithmetic and geometry that are based in a significant way on empirical research, but which do not support the view that mathematical knowledge is essentially empirical in character.

b. Empirical Research and the Philosophy of Mathematics

An important reason for the emergence of empirically-informed accounts is the extensive development of empirical research on numerical cognition in the 1990s. This empirical work was presented in two popular books: The Number Sense by Stanislas Dehaene (Dehaene 1997) and What Counts: How Every Brain is Hardwired for Math (Butterworth 1999), published in the United Kingdom as The Mathematical Brain. Within this research corpus, one of the most famous items is Karen Wynn’s paper “Addition and subtraction by human infants” (Wynn 1992). In it, Wynn presents her research on five-month-old infants, whom she interprets as possessing genuine number concepts and arithmetical abilities. Wynn’s experiment is widely discussed in the subsequent literature, and since it illuminates the different kinds of interpretations that can be made of empirical research, it is worth presenting in detail.

In the experiment, infants were shown dolls, and their reactions were observed to determine whether they had numerical abilities. In the first variation of the experiment, infants were shown two dolls placed, one by one, behind an opaque screen. In some trials, one of the dolls was removed without the infant seeing its removal, revealing only one doll when the screen was lifted. In others, both dolls were left behind the screen, revealing two dolls. In the second variation of the experiment, the infants were first shown two dolls and, after one was visibly removed, either one or two dolls were revealed.

[Figure: visual sequence of the trials in Wynn’s experiment]

Wynn’s experiment showed that infants reacted with surprise (measured through longer looking times) to the trials where the revealed quantity of dolls was unnatural (namely, when only one doll was revealed in the first variation, and when two dolls were revealed in the second variation). Wynn argued that this showed that “infants can calculate the results of simple arithmetical operations on small numbers of items. This indicates that infants possess true numerical concepts, and suggests that humans are innately endowed with arithmetical abilities.” (Wynn 1992, 749). Others were equally excited by the results. Dehaene, for example, motivated his book by asking: “How can a five-month-old baby know that 1 plus 1 equals 2?” (Dehaene 2011, xvii). Yet, a lot is assumed in these claims. Do infants really possess true numerical concepts? Are they innately endowed with arithmetical abilities? Their behaviour in the experiment notwithstanding, do they know that one plus one equals two? These questions warrant a detailed analysis before we say more about the arithmetical capacities of infants. But, clearly, empirical research of this type is highly philosophically relevant; after all, if Wynn’s and Dehaene’s interpretations are correct, some mathematical knowledge is already possessed by infants. This would pose a serious challenge to epistemological theories according to which mathematical knowledge is acquired solely through reason and recollection.

2. Numerical and Arithmetical Cognition

a. Arithmetic and Proto-Arithmetic

Neither Wynn nor Dehaene proposes that infants possess arithmetical knowledge in the same sense in which arithmetically-educated adults do. Yet, as they interpret the empirical results, adult arithmetical knowledge is a later stage of a developmental trajectory that builds on innate arithmetical abilities. Standardly, two innate abilities are identified (see, for example, Dehaene, 2011; Feigenson et al., 2004; Hyde, 2011; Spelke, 2011). The first is the ability to subitize, which was first reported by Kaufman et al. (1949). Subitizing is the ability to determine the quantity of objects in the field of vision without counting. The subitizing ability allows for the precise determination of quantities but standardly stops being applied for collections larger than four items. In one of the earliest results supporting the existence of an infant ability to discriminate quantities, Starkey and Cooper (1980) reported that 22-week-old infants subitize. Since then, it has been established that many non-human animals have the ability to subitize (see, for example, Dehaene, 2011 for an overview). The second ability is estimating the size of an observed collection of objects. Unlike subitizing, the estimating ability is not limited to small quantities. Yet, it becomes increasingly inaccurate as the estimated collections become larger. Indeed, the accuracy of the estimations decreases in a logarithmic manner, following the so-called Weber-Fechner law (also called Weber’s law in the literature): it is more difficult to distinguish between, say, 17 and 18 objects than it is between 7 and 8 (Dehaene 2011; Fechner 1948). Since the performance signatures of these abilities (for both human children and non-human animals) are different for small and large collections of objects, subitizing (being precise but limited) and estimating (being approximate but essentially unlimited) are standardly thought to be distinct abilities (Feigenson, Dehaene, and Spelke 2004).
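The 17-versus-18 and 7-versus-8 comparison can be made concrete with a small calculation. The sketch below assumes the common reading of the Weber-Fechner law on which discriminability depends on the ratio of the two quantities, or equivalently on their distance on a logarithmic scale. The function names are hypothetical, and no claim is made about how the cited studies model their data.

```python
import math


def ratio(n1, n2):
    """Ratio of the larger to the smaller quantity; values near 1 are harder to tell apart."""
    return max(n1, n2) / min(n1, n2)


def log_distance(n1, n2):
    """Distance between the two quantities on a logarithmic scale."""
    return abs(math.log(n1) - math.log(n2))


# Both pairs differ by exactly one object, yet the ratios differ.
small_pair = ratio(7, 8)     # 8/7, roughly 1.14
large_pair = ratio(17, 18)   # 18/17, roughly 1.06
```

Although both pairs differ by a single object, the larger pair has a ratio closer to 1 and a smaller logarithmic distance, which on this reading is why it is the harder pair to discriminate.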

In addition to infants, the subitizing and estimating abilities have been detected in many non-human animals. Among them are animals that are generally considered to be intelligent, like primates and birds in the corvid family (for review, see Dehaene 2011; Knops 2020; Nieder 2019). More surprising have been empirical results showing that goldfish (DeLong et al. 2017), newborn chicks (Rugani et al. 2009) and honeybees (Howard et al. 2019) also seem to possess similar quantitative abilities. These data suggest that the abilities have early evolutionary origins, or evolved several times. In either case, their existence has been an important reason to reconsider the origins of numerical abilities.

Given that subitizing and estimating are innate abilities, it is commonplace among empirical researchers to attribute them to so-called core cognitive systems (Carey 2009; Spelke 2000). According to Susan Carey, core cognition refers to how human cognition begins with “highly structured innate mechanisms designed to build representations with specific content” (Carey 2009, 67). The core system responsible for the subitizing ability allows for the tracking of persisting objects in the field of vision, and it is usually called the object tracking system (OTS) (Knops 2020), but sometimes also the parallel individuation system (Carey 2009; Hyde 2011). Unlike the OTS, which has functions besides determining the quantity of objects, the core system responsible for the estimating ability is standardly thought to be quantity specific. It is usually called the approximate number system (ANS) (Spelke 2000), and it is what Dehaene called the number sense (Dehaene 1997), even though some research suggests that instead of being number-specific, the estimation system is common to space, time, and number (Walsh 2003).

As mentioned above, it is commonplace to think of the OTS and the ANS as two distinct core cognitive systems (see, for example, Hyde, 2011; Nieder, 2019). Recently, though, a mathematical model has been proposed according to which, under limited informational capacity, only one innate system is responsible for different performance signatures for small and large collections (Cheyette and Piantadosi 2020). From a philosophical standpoint, more important than the exact division between systems is how the core cognitive systems should be understood in terms of the development of arithmetical cognition. Based on Wynn’s (1992) report of her experiment, it is likely that infants are subitizing when determining the quantity of the dolls. She believes that they also practice arithmetic and possess true numerical concepts. This prompts the question: could the infants’ behaviour be understood in another way, one that does not assign genuine arithmetical ability or numerical concepts to them? Many researchers believe that subitizing, based on the OTS, works by means of the observed objects occupying mental object files (Carey 2009; Noles, Scholl, and Mitroff 2005). When two objects are observed, two object files are occupied. Under this explanation, the infants’ surprise during Wynn’s experiment is explained by their observations not matching the occupied object files. Importantly, the infants are not thought to observe “twoness”, representing the number of dolls in terms of numerical concepts.
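The object-file explanation of Wynn’s results can be sketched as a toy model. This is only an illustration under stated assumptions: the class, its methods, and the capacity constant are hypothetical, and "surprise" is reduced to a mismatch between the number of occupied object files and the number of dolls revealed. Nothing in the model represents a number concept.

```python
OTS_LIMIT = 4  # assumed capacity, following the four-item limit mentioned above


class ObjectFiles:
    """Toy stand-in for the object tracking system: one file per tracked object."""

    def __init__(self):
        self.occupied = 0

    def sees_object_added(self):
        # A new object file is opened only while capacity remains.
        if self.occupied < OTS_LIMIT:
            self.occupied += 1

    def sees_object_removed(self):
        # A file is released only when the removal is actually observed.
        if self.occupied > 0:
            self.occupied -= 1

    def is_surprised(self, revealed):
        # Surprise is modelled as a mismatch with the occupied files.
        return revealed != self.occupied


# First variation: two dolls placed one by one; any removal happens unseen.
first = ObjectFiles()
first.sees_object_added()
first.sees_object_added()

# Second variation: two dolls shown, then one is visibly removed.
second = ObjectFiles()
second.sees_object_added()
second.sees_object_added()
second.sees_object_removed()
```

On this sketch, `first.is_surprised(1)` and `second.is_surprised(2)` come out true, matching the trials in which infants looked longer, without the model ever representing "twoness".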

Based on such considerations, it is commonplace to distinguish between genuinely arithmetical ability and the innate quantitative abilities of subitizing (that is, quickly recognizing and naming the number in a group without counting) and estimating. For this reason, Markus Pantsar (2014; 2019) has called the latter abilities proto-arithmetical, whereas Núñez (2017) calls them quantical. Arithmetic, under these distinctions, refers exclusively to the culturally developed human system of natural numbers and their operations. Arithmetic does not necessarily mean modern sophisticated formal systems, like the Dedekind-Peano axiomatization (Peano 1889). Instead, under the characterization of Pantsar (2018, 287), arithmetic refers to a “sufficiently rich discrete system of explicit number words or symbols with specified rules of operations.” What counts as “sufficiently rich” is left purposefully undefined; more important is the idea that arithmetic, in contrast to proto-arithmetic, needs to consistently and specifically discriminate between different cardinalities (unlike with the ANS), without there being a pre-set limit in size (unlike with the OTS). Put more precisely in mathematical terms, the system must “sufficiently follow the structure of the omega progression, that is, the standard ordering of the set of natural numbers” (ibid.).

A similar distinction has also been proposed when it comes to the subject matter of proto-arithmetical/quantical abilities. While some argue that they concern numbers (see, for example, Carey, 2009; Clarke & Beck, 2021), others insist that we need to distinguish them from numbers as objects of arithmetic. Hence it has been proposed that instead of numbers, proto-arithmetical abilities should be discussed as detecting numerosities (De Cruz and De Smedt 2010; Pantsar 2014). Under this distinction, numerosities refer to the quantity-specific content that can be processed using proto-arithmetical abilities. In what follows, we employ the double distinction between arithmetic and proto-arithmetic and between numbers and numerosities.

b. Acquisition of Number Concepts and Arithmetical Knowledge

Recall that Wynn interpreted the results of her experiment to imply that “infants possess true numerical concepts” (Wynn 1992, 749). With the distinction between numbers and numerosities in place, this conclusion seems dubious. The infants do appear to be able to process numerosities, but this ability is entirely proto-arithmetical, not arithmetical. Hence there is no reason to believe that number concepts are innate to humans (or non-human animals). This is supported by evidence from anumeric cultures such as the Pirahã and the Munduruku of the Amazon. These cultures have not developed arithmetic and their languages do not have numeral words, with the possible exceptions of words for one and two; their members show no arithmetical abilities (Gordon 2004; Pica et al. 2004). Yet experiments show that members of these cultures do possess proto-arithmetical abilities (Dehaene et al. 2008; Everett and Madora 2012; Frank et al. 2008). Therefore, it is likely that, while proto-arithmetical abilities are innate and universal to neurotypical humans, number concepts develop only in particular cultural contexts, in close connection with the development of numeral words (or symbols).

There are, however, disagreeing voices in the literature. Aside from Wynn, Rochel Gelman and C. Randy Gallistel are proponents of a nativist view according to which number concepts are innate and pre-verbal (Gallistel 2017; Gelman and Gallistel 2004). In addition, Dehaene and Brian Butterworth have both presented influential accounts which could be interpreted as forms of nativism. Butterworth (1999) has argued for the existence of an innate “number module”, while Dehaene (2011) has argued for an innate “mental number line.” Yet, it seems that nativist interpretations of these accounts result from misleading terminology. While both authors support innate numerical capacities, given our terminological distinctions, this amounts to the innateness of proto-arithmetical abilities.

If number concepts are not innate, they must be acquired during individual ontogeny. The first influential account along these lines was presented by Jean Piaget (1965). At the time, it was not known that children possess pre-verbal proto-arithmetical abilities, so Piaget endorsed a view according to which all numerical abilities arise from logical capacities and typically emerge only at around the age of five. Over the years after Piaget, researchers have contended that he was wrong about numerical abilities, but he seems to have been right about number concepts not being innate. However, Piaget got the age of this development wrong, as the first number concepts seem to emerge at the age of two.

Parents of young children may want to object at this point, since children can count already before the age of two. Yet it is important to distinguish between different types of counting. Paul Benacerraf (1965) distinguishes between intransitive and transitive counting. The former consists merely of repeating the numeral word sequence in order, as in the beginning of a game of hide and seek. In the latter, the numeral words are used to enumerate items in a collection, like when counting grapes on a plate. In the empirical literature, these two types of counting are often referred to by different terms. Transitive counting is often called enumeration, and the word “counting” is simply used for intransitive counting (for example, Rips et al., 2008).

When exploring number concept acquisition, it is imperative that we distinguish both types of counting from counting with number concepts, which is a further stage in cognitive development. It is known that intransitive counting precedes transitive counting, but also that transitive counting is not sufficient for possession of number concepts (see, for example, Davidson et al., 2012). The way this is usually established is through the “give-n” test developed by Wynn (1990). In the test, children are presented with a collection of objects and asked to give n of them. If they consistently give n objects, they are thought to possess the number concept of n. Yet, as pointed out by Davidson and colleagues, there is a stage at which children can transitively count objects but do not pass the give-n test.

At about two years of age, children start passing the test for n = 1, at which point they are called one-knowers. After that, they acquire the next three number concepts in ascending order, in stages that typically take 4-5 months (Knops 2020). Subsequently, a qualitative change occurs in children’s cognitive processing. After becoming four-knowers, children do not continue along this step-by-step trajectory. Instead, in the next stage, they grasp something more general about numbers: in addition to the give-5 test, they start passing the give-n test for six, seven and so on (Lee and Sarnecka 2010). At this point, when they have acquired a general understanding that the last uttered word in the counting list refers to the cardinality of the objects, they are called cardinality-principle-knowers (Lee and Sarnecka 2011).
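The knower-level stages described above can be made concrete with a small sketch. It is only an illustration under simplifying assumptions, not a scoring procedure from the cited studies: the function name `knower_level`, the label "pre-number-knower" for children who do not yet pass give-1, and the rule that a level counts only if all lower levels are also passed are choices made here for illustration.

```python
def knower_level(passed, subset_limit=4):
    """Classify a child from the set of n for which give-n is consistently passed.

    Assumption for illustration: a child counts as an n-knower only if the
    give-k test is passed for every k from 1 up to n.
    """
    level = 0
    # Climb the consecutive run of passed tests starting from give-1.
    while (level + 1) in passed:
        level += 1
    # Passing beyond the first four concepts is treated as having grasped
    # the cardinality principle.
    if level > subset_limit:
        return "cardinality-principle-knower"
    names = {
        0: "pre-number-knower",  # hypothetical label, not from the text above
        1: "one-knower",
        2: "two-knower",
        3: "three-knower",
        4: "four-knower",
    }
    return names[level]


print(knower_level({1}))
print(knower_level({1, 2, 3, 4}))
print(knower_level({1, 2, 3, 4, 5, 6, 7}))
```

The sketch captures only the qualitative shape of the trajectory: four separate steps followed by a single generalization to the rest of the counting list.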

What happens in children’s cognitive development when they become cardinality-principle-knowers? The theory of number concept acquisition that is currently most thoroughly developed in the literature is called bootstrapping. Originally presented by Susan Carey (2004), this theory has since been further developed and clarified by Carey and others (Beck 2017; Carey 2009; Pantsar 2021a). In a nutshell, the bootstrapping account ascribes a central role to the object tracking system in the process. After acquiring the counting list (that is, being able to intransitively count), children are thought to form mental models of different sizes of collections, based on computational constraints set by the object files of the OTS (Beck 2017, 116). When observing two objects, for example, two mental object files are occupied. Such instances are thought to form a representation of two objects in the long-term memory (Carey 2009, 477). Then, through counting games in which the counting list is repeated while pointing to objects, this representation is connected to a particular number word, like “two” (Beck 2017, 119). This explanation can only hold for up to four, though, which is the limit of the OTS. After this, in the last stage of the bootstrapping process, children are thought to grasp, through inductive and analogical reasoning, that the way in which the first four number concepts are in an ascending, regular, order can be extrapolated to the rest of the counting list (Beck 2017, 119). This is the stage at which children become cardinality-principle-knowers.

The bootstrapping theory has critics. Pantsar (2021a) has asked how children ever grasp that there are numerosities larger than four if the OTS is the sole proto-arithmetical system relevant to the bootstrapping process. Lance Rips and colleagues (2006) ask why children bootstrap a linear number system and not, say, a cyclical one. Why, after acquiring the number concept for twelve, for example, is thirteen the next step, instead of one? Rips and colleagues (2008) argue that there needs to be a mathematical schema concerning numbers already in place to prevent such cyclical systems or other “deviant interpretations”. For them, grasping the concept of natural number requires understanding general truths of the type “for all numbers, a + b = b + a”. This, however, is quite problematic as it implies that grasping natural numbers requires understanding something like the Dedekind-Peano axioms, which standardly only happens quite late in individual ontogeny (if at all). It is implausible to associate grasping number concepts with such sophisticated mathematical understanding given that children already possess considerable arithmetical knowledge and skills much earlier.
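The force of the cyclical-system challenge can be illustrated with a short sketch. It assumes, purely for illustration, that a number system is modelled by a successor function; the function names and the choice of twelve as the turning point (taken from the example above) are not part of any cited account.

```python
def linear_successor(n):
    """The standard interpretation: every number has a new successor."""
    return n + 1


def cyclical_successor(n, modulus=12):
    """A deviant interpretation: after `modulus`, the count returns to one."""
    return 1 if n == modulus else n + 1


def count_sequence(successor, steps):
    """Recite `steps` number terms, starting from one, under a successor rule."""
    sequence = [1]
    for _ in range(steps - 1):
        sequence.append(successor(sequence[-1]))
    return sequence


# Up to twelve, the two interpretations are indistinguishable.
print(count_sequence(linear_successor, 12) == count_sequence(cyclical_successor, 12))

# They diverge only at the next step.
print(count_sequence(linear_successor, 13)[-1])
print(count_sequence(cyclical_successor, 13)[-1])
```

A learner whose evidence stops at twelve has, on this toy model, no data that separates the two rules, which is the shape of the problem that the responses discussed below try to address.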

The conclusion of Rips and colleagues (2008, p. 640) is that the concept of natural number may be completely independent of proto-arithmetical abilities. While they believe that number concepts are learned in a “top-down” manner by grasping general principles, others have suggested different explanations for why we do not acquire deviant interpretations such as cyclical number systems. Margolis and Laurence (2008) have suggested that the deviant interpretation challenge supports a nativist view about number concepts. Yet others have looked for solutions to the challenge that do not entirely abandon the bootstrapping account. Paula Quinon (2021) has suggested that an innate sense for rhythm can explain the preference for linear, regular number systems. Pantsar (2021a) has argued that the approximate number system can influence the bootstrapping process so as to prevent cyclical number systems. Jacob Beck (2017), by contrast, rejects the overall importance of the problem, pointing out that it is just another instance of the general problem of inductive learning, as formulated by the likes of Kripke (1982) and Goodman (1955).

So far, we have been discussing number concepts, but acquiring them is only the first step in developing arithmetical knowledge and skills. While acquiring number concepts would seem to be a necessary condition for having arithmetical knowledge, possessing number concepts alone does not ensure possession of arithmetical knowledge. In accounts like that of Lakoff and Núñez (2000), detailed schemas of the acquisition of arithmetical knowledge are presented. They see the application of conceptual metaphors as the foundation of mathematical knowledge. Addition, for example, is seen as the metaphorical counterpart of the physical task of putting collections of objects together (Lakoff and Núñez 2000, 55).

In other accounts, addition is understood as a more direct continuation of counting procedures, which are mastered in the process of acquiring number concepts. The psychologists Fuson and Secada (1986), for example, presented an account of grasping the addition operation based on a novel understanding of counting. In counting a + b, children typically first use the “counting-all” strategy. When presented with a objects, the children count to a. When presented with b additional objects and asked to count all (that is, a + b) objects, they re-count to a and then continue the count to a + b. Yet, at some stage, children switch to the “counting-on” strategy, starting the count directly from where they finished with a. That is, instead of counting, say, 5 + 3 by counting 1, 2, 3, 4, 5, 6, 7, 8, when applying the “counting-on” strategy, children simply count 6, 7, 8 in the second stage. This strategy has been shown to be helpful in understanding addition (Fuson and Secada 1986, 130).
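The difference between the two strategies can be sketched as follows. This is a minimal illustration rather than a model from Fuson and Secada; the function names are hypothetical, and each function returns the number words that are uttered when the b additional objects are presented.

```python
def counting_all(a, b):
    """Re-count the first a objects, then continue through the b new ones."""
    return list(range(1, a + b + 1))


def counting_on(a, b):
    """Start from the known total a and count only the b new objects."""
    return list(range(a + 1, a + b + 1))


all_words = counting_all(5, 3)
on_words = counting_on(5, 3)

print(all_words)  # [1, 2, 3, 4, 5, 6, 7, 8]
print(on_words)   # [6, 7, 8]

# Both strategies end on the same word, which names the sum.
print(all_words[-1] == on_words[-1] == 5 + 3)
```

Counting-on reaches the same final word with b utterances instead of a + b, which is one way of seeing why it brings the child closer to treating a as a starting quantity to which b is added.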

This example shows how arithmetical operations can be grasped by applying knowledge that children already possess (that is, about counting). Addition is thus understood as a direct descendant of counting. Similarly, multiplication can be understood as a descendant of addition, and so on. Therefore, acquiring arithmetical knowledge and skills can build directly on number concept acquisition and the knowledge and skills that children learn in that process. One should be careful, though, not to make too many assumptions about ontogenetic trajectory. For empirically-informed approaches, it is important to be faithful to the actual empirical details regarding the development of children during ontogeny. While schemas like that presented by Lakoff and Núñez (2000) can be instructive, we should not confuse them with empirical evidence. Understanding ontogenetic trajectories in learning arithmetic is an important challenge that, like that of number concept acquisition, needs further empirical research.

c. Embodied Mind and Enculturation

One consequence of the traditional a priori emphasis in philosophy of mathematics has been that the embodied aspects of mathematics have often been dismissed as philosophically inconsequential. This was famously the case for Frege (1879), who distinguished between the context of discovery (how truths are learned) and the context of justification (how truths are established). For arithmetic, Frege (1884) deemed the context of discovery philosophically irrelevant, ridiculing Mill’s empiricist account as “pebble arithmetic”. However, as we have seen, many researchers currently see the ontogenetic aspects of arithmetic as important to a philosophical understanding of arithmetical knowledge. So far, we have been focused on the role of the proto-arithmetical abilities of subitizing and estimating, and the cognitive core systems associated with them. But in individual ontogeny, embodied aspects and external resources also play important roles. Therefore, it is important to expand epistemological accounts of arithmetic beyond proto-arithmetical abilities.

This is the case already for number concepts. Recognizing the role of embodied aspects of cognition and external resources within cognitive development does not, by itself, imply a preference for one account of number concepts over another. In Beck’s (2017) formulation of the bootstrapping account, number concepts are acquired with the help of two types of external resources: the counting list and counting games. But it is not the presence of external resources that distinguishes between bootstrapping accounts and nativist accounts. Nativists concerning number concepts also acknowledge the importance of education in grasping number concepts. The difference lies in what kind of influence the external resources have. Within nativist accounts, external resources are seen as facilitators that help innate number concepts to emerge. Within bootstrapping accounts, number concepts are shaped by a combination of innate cognitive capacities and external cultural influences.

While non-nativist accounts acknowledge the importance of external factors for number concept acquisition and the development of arithmetical cognition in general, it is not always clear how such accounts view interactions between learning subjects and their environment. Helen De Cruz (2008) was one of the first to tackle this problem systematically by adopting the extended mind perspective within the philosophy of arithmetical cognition. According to the extended mind thesis, cognitive processes are extended into the environment in the sense that they are constituted by both internal cognitive capacities and external resources (Clark and Chalmers 1998). De Cruz argued that this holds for arithmetical cognition; external media such as numeral words, body parts, and numeral symbols are constitutive of number concepts and arithmetical operations on them.

The embodiment of mind is often associated with the extended mind thesis. Arithmetical cognition seems to provide good examples of how our bodies shape our cognitive processes. The use of fingers, in particular, has been widely established across cultures to be significant in the early grasp of counting processes (Barrocas et al. 2020; Bender and Beller 2012; Fuson 1987). Furthermore, finger gnosis, the ability to differentiate between one’s own fingers without visual feedback, is a predictor of numerical and arithmetical ability levels (Noël 2005; Penner-Wilger and Anderson 2013; Wasner et al. 2016). The use of body parts has also been identified by philosophers as an important factor in developing arithmetical cognition (Fabry 2020). In addition to finger (and other body part) counting, there are many other important embodied processes involved in learning arithmetic. Manipulating physical objects like building blocks has been established as important in early arithmetic education (Verdine et al. 2014). The use of cognitive tools, such as pen and paper or abacus, becomes important in later stages (Fabry and Pantsar 2021). It is important to note, though, that this kind of support for the embodiment of mind does not require subscribing to the stronger extended mind thesis. Under a weaker understanding of the embodiment of mind, embodied processes shape the concepts and cognitive processes involved in arithmetic. According to the extended mind thesis, embodied processes are constitutive of concepts and cognitive processes.

External resources are important for the development of arithmetical cognition in many ways. In addition to body part counting and the use of physical cognitive tools, numeral word and numeral symbol systems also play an important role. The Mandarin numeral word system, for example, has shorter words than the English system and follows the recursive base of ten more closely. This has been suggested as an explanation of why native Mandarin speakers typically learn to count faster than native English speakers (Miller et al. 1995). For numeral symbol systems, similar data is not available, largely due to the widespread use of Indo-Arabic numerals. Yet, philosophers have worked on different numeral symbol systems and the way they can shape arithmetical cognition (see, for example, Schlimm 2021).

The importance of external resources for the development of arithmetical cognition appears to raise an age-old question: what is the role of nature and what is the role of nurture in this process? How much of arithmetical cognition comes from evolutionarily developed and innate proto-arithmetical abilities, and how much comes from culturally developed practices? From the above considerations, it seems clear that both play important roles, yet it is difficult to determine their relative impact. These types of considerations have led some researchers to abandon the crude nature vs. nurture framework. Influentially, Richard Menary (2015) has argued that arithmetical cognition is the result of enculturation. Enculturation refers to the transformative process through which interactions with the surrounding culture determine the way cognitive practices are acquired and developed (Fabry 2018; Jones 2020; Menary 2015; Pantsar 2020; Pantsar and Dutilh Novaes 2020). In the enculturation account, we must consider the development of cognitive abilities through such interactions. Menary focuses mainly on the brain and its capacity to adapt learning in accordance with the surrounding culture. Regina Fabry (2020) has extended this focus to the rest of the body, emphasizing the importance of embodied processes for the development of arithmetical cognition. The enculturation account has also proven to be fruitful for studying the general role of material representations, like symbols and diagrams, in the development of arithmetical (and other mathematical) cognition (Johansen and Misfeldt 2020; Vold and Schlimm 2020).

The enculturation account helps us understand how the brain changes as a result of learning cognitive practices involving cognitive tools, such as numeral symbol systems or tools for writing, like pen and paper. These changes typically involve the same areas of the brain across individuals. Menary (2015) has explained such predictable changes in the brain through the notion of learning driven plasticity, which includes both structural and functional changes. At present, there are two competing accounts for explaining learning driven plasticity. Menary follows Dehaene’s (2009) notion of neuronal recycling, according to which evolutionarily developed neural circuits are recycled for new, culturally specific purposes. In the case of arithmetic, that means that evolutionarily developed proto-arithmetical neural circuits are (partly) redeployed for culturally specific, arithmetical, purposes. There is empirical data in support of this, showing, for example, that the prefrontal and posterior parietal lobes, especially the intraparietal sulcus, activate when processing numerosities both symbolically and non-symbolically (Nieder and Dehaene 2009).

Michael Anderson (2010; 2015), by contrast, has argued that neural reuse is the basic organizational principle involved in enculturation. According to the neural reuse principle, neural circuits generally do not perform specific cognitive or behavioural functions. Instead, due to the flexibility of the brain, cognitive and behavioural functions can employ (and re-deploy) resources from many neural circuits across different brain areas. Brain regions do, though, have functional biases, which explain why, during enculturation, the brains of individuals typically go through similar structural and functional changes. Recently, both Fabry (2020) and Max Jones (2020) have argued for neural reuse as a better explanation of enculturation in the specific context of arithmetic (see Pantsar (2024a) for further discussion). In addition, due to the importance of the embodied aspects of arithmetical cognition, Fabry (2020) has emphasized the need for also explaining adaptability outside the brain, introducing the notion of learning driven bodily adaptability.

One further question that has emerged in the literature on the development of arithmetical cognition is that of representations. The standard accounts among empirical researchers imply that proto-arithmetical numerosities are represented in the mind either in object files (OTS) or a mental number line (ANS) (see, for example, Carey, 2009; Dehaene, 2011). However, this goes against radical enactivist views according to which basic, that is, unenculturated, minds do not have content or representations (Hutto and Myin 2013; 2017). Recently, radical enactivist accounts of mathematical cognition have been presented (Hutto 2019; Zahidi 2021; Zahidi and Myin 2018). These accounts try to explain the development of arithmetical cognition in ontogeny without invoking representational models of proto-arithmetical abilities. This topic can be expected to get more attention in the future, as our empirical understanding of proto-arithmetical abilities improves.

d. From Ontogeny to Phylogeny and Cultural History

The surrounding culture and its impact on learning play key roles in the enculturation account of arithmetical knowledge. This prompts the question: how can cultures develop arithmetical knowledge in the first place? The problem is already present at the level of number concepts. In the bootstrapping account, for example, counting lists and counting games are thought to be integral to the acquisition of number concepts. But how can such cultural practices develop if members of a culture do not already possess number concepts? In many cultures, this happened through cultural transmissions from other cultures (Everett 2017). But this cannot be the case for all cultures. Most fundamentally, as Jean-Charles Pelland, among others, observed, the question concerns the origin of numeral words. Is it possible that numeral words – and consequently counting lists – could have developed without there first being number concepts (Pelland 2018)?

In response to this question, the role of external representations of numbers has become an important topic in the literature (see, for example, Schlimm, 2018, 2021). To explain how external representations like numeral symbols and words can influence concept formation, many researchers have turned to the cultural evolution of practices and concepts. Particularly influential in this field is the theory of cumulative cultural evolution (Boyd and Richerson 1985; 2005; Henrich 2015; Heyes 2018; Tomasello 1999). The central idea of cumulative cultural evolution is that cultural developments frequently take place in small (trans-)generational increments. From this background, it has been argued that material engagement with our environment (Malafouris 2013) has been central to the cultural evolution of numeral words, numeral symbols, and number concepts (dos Santos 2021; Overmann 2023; Pantsar 2024a). The emergence of numeral symbols has been traced to around 8,000-4,500 B.C.E. in Elam and Mesopotamia, where clay tokens were used to represent quantities in accounting (Overmann 2018; Schmandt-Besserat 1996). At first, a proper array of different tokens was put in a clay case to represent the quantity, but later the outside of the case was marked with a symbol to signal the contents (Ifrah 1998, xx). In the next phase, people bypassed the case and simply used the symbol (Nissen, Damerow, and Englund 1994).

For numeral words, similar connections to material practices have been identified. Not all languages have a system of numeral words. Hence, we must ask: how can numeral words be introduced into a language? Recently, César dos Santos (2021) has brought philosophical attention to the interesting case of the Hup language spoken by cultures in the Amazonia, which seem to be in the process of developing a numeral word system. In the Hup language, the word for two means “eye quantity” and the word for three comes from a word for a three-chambered rubber plant seed (Epps 2006). Thus, in the Hup language, numeral words appear to be emerging from prototypic representations of quantities of natural phenomena. Lieven Decock (2008) has called such representations “canonical collections”. Canonical collections may also hold the key to understanding how number concepts have evolved. The linguist Heike Wiese (2007) has argued that numeral words and number concepts co-evolved. Within philosophy, this idea has been pursued by dos Santos (2021) and Pantsar (2024b). Canonical collections in natural phenomena, in addition to body parts like fingers, have provided references that proto-arithmetical numerosity representations can latch onto. The word for the rubber plant seed in the Hup language, for example, has gradually changed meaning to also concern a quantity. Such words can be connected to body part counting procedures, which make them part of the counting procedure, and together with other similarly evolved words, part of the counting list. Through this process of co-evolution, the concept of number develops into the kind of exact notion of natural number that we have in arithmetic. Thus, the co-evolution of numeral words and number concepts is a proposed solution to the question presented by Pelland; numeral words can develop without there being prior number concepts because the concepts themselves can emerge as part of the same (or parallel) development.

Unfortunately, the emergence of numeral words and number concepts is, in most cases, impossible to trace. Similarly, the evolution of arithmetical operations is difficult to study due to a dearth of surviving material from the early stages. There are, however, important works that help improve our understanding of the evolution of numbers and arithmetic, including those by the mathematician Georges Ifrah (1998), the cognitive archaeologist Karenleigh Overmann (2023), and the anthropologist Caleb Everett (2017). History of mathematics also provides important insights concerning the development of arithmetic into the modern discipline with which we are familiar (for example, Merzbach and Boyer 2011). This kind of interdisciplinary research is highly philosophically relevant; it provides important material for recent epistemological accounts of arithmetic, like that of Pantsar (2024a). There are good reasons to expect this development to continue in the future, with the cultural development of arithmetic receiving more philosophical attention.

For the epistemology and ontology of arithmetic, both the ontogenetic proto-arithmetical foundations and the cultural development of arithmetical knowledge and skills are highly relevant. According to one view, mathematical objects like numbers are social constructs (Cole 2013; 2015; Feferman 2009). If this is the case, then an important question arises: what kind of social constructs are they? According to conventionalist philosophy of mathematics, which has found popularity in different forms ever since the early 20th century, mathematical truths are merely firmly entrenched conventions (Ayer 1970; Carnap 1937; Warren 2020; Wittgenstein 1978). Thus, one way of interpreting numbers as social constructs would be as parts of such conventions, which ultimately could be arbitrary. However, it has been argued that the proto-arithmetical origins of arithmetic (partly) determine its content, which means that arithmetical truths are (partly) based on evolutionarily developed cognitive architecture and not purely conventional (Pantsar 2021b).

e. Ordinal or Cardinal

Let us next consider whether knowledge of natural numbers is primarily ordinal or cardinal. In many languages, numeral symbols are connected to two numeral words, one cardinal (one, two, three, and so on) and the other ordinal (first, second, third, and so on). In the philosophy of mathematics, this is an old and much-discussed problem that traces back to Cantor (1883), who defined the cardinal numbers based on the ordinal numbers, which has been interpreted to imply the primariness of ordinals (Hallett 1988). Also, within set theory, ordinals are often seen as more fundamental, since the cardinal number of a set can be defined as the least ordinal number whose members can be put in a one-to-one correspondence with the members of the set (Assadian and Buijsman 2019, 565). Finally, in structuralist philosophy of mathematics, numbers are understood to be fundamentally places in the natural number structure. In Stewart Shapiro’s structuralist account, for example, natural numbers are explicitly defined in terms of their (ordinal) position in the natural number structure (Shapiro 1997, 72).
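The set-theoretic definition mentioned above can be written out explicitly. The notation is an assumption made here for illustration: ordinals are taken to be von Neumann ordinals (so that each ordinal is itself a set with members), Ord denotes the class of ordinals, and every set is assumed to be well-orderable, which in general requires the axiom of choice.

```latex
% Cardinal number of a set A as the least ordinal equinumerous with A
|A| \;=\; \min \{\, \alpha \in \mathrm{Ord} \;:\; \text{there is a bijection } f \colon \alpha \to A \,\}
```

On this definition, the ordinals come first and the cardinals are singled out among them, which is the sense in which ordinals are often treated as the more fundamental notion.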

Are the cognitive foundations of arithmetic relevant to whether our knowledge of natural numbers is primarily ordinal or cardinal? If the OTS and the ANS are indeed integral cognitive systems relevant to acquiring number concepts, this is at least plausible. Both subitizing and estimating detect the cardinality of the observed collection of objects, and there is nothing to suggest that the order of the occupied object files in the OTS, for example, influences numerosity determination. Therefore, our first numerosity representations are likely cardinal in nature. This also receives direct empirical support. Studies show that children who do not possess numeral words cannot make ordinal judgments while they are able to make (some) cardinal judgments (Brannon and Van de Walle 2001). It could be that the ordinal understanding of numerosities is something that only emerges through numeral words and the ordered counting lists they comprise.

Stefan Buijsman (2019) has suggested that the acquisition of the first number concepts depends on understanding claims of the logical form “there exists exactly one F”, which requires grasping the singular/plural distinction from syntactic clues. Since that distinction does not concern ordinality, this may likewise suggest that the first number concepts are cardinal in nature. Yet, Buijsman (2021) has also argued that acquisition of larger number concepts requires ordinal understanding, which fits well, for example, with the role of ordered counting lists in the bootstrapping process.

f. Empirically-Informed Epistemology of Arithmetic

To conclude our discussion of arithmetic, we evaluate the philosophical significance of the work presented above. This is important because there is a potential counterargument to it having any philosophical significance. Specifically, one can acknowledge that we conduct meaningful research on numerical cognition, yet insist, as Frege (1884) did, that such considerations are only about the context of discovery and not the philosophically important context of justification. From this perspective, arithmetical knowledge can be completely a priori and fit the rationalist paradigm; the empirical studies reviewed above merely concern developmental trajectories in acquiring the kind of conceptual and reasoning abilities required for arithmetical knowledge.

This potential counterargument should be taken seriously. After all, even within the kind of empirically-informed philosophy of arithmetic that this article presents, there are empirical dimensions that are not considered to be philosophically relevant. The fact that we need to have visual (or tactile) access to number symbols, for example, is generally not considered to make epistemology of arithmetic somehow empirical, even though it clearly connects arithmetical knowledge to sensory experience. The counterargument formulated above is essentially similar: what if most – perhaps even all – of the empirical data on numerical cognition is connected to arithmetical knowledge only within the context of discovery, and not relevant to the context of justification and the philosophically important characteristics of arithmetical knowledge?

This counterargument can be divided into two. First, one may agree that the empirical data on numerical cognition is epistemologically relevant but insist that there can be multiple routes to arithmetical knowledge; while people can acquire arithmetical knowledge based on proto-arithmetical abilities, this is not necessarily the case. According to this counterargument, there is no path-dependency in arithmetical knowledge that would require it to develop through proto-arithmetical abilities. Most advocates of empirically-informed philosophy of arithmetic accept this counterargument. They are not claiming that arithmetical knowledge could not be acquired, at least in principle, through an essentially different ontogenetic path. What they maintain is that arithmetical knowledge standardly develops based on (one or more) proto-arithmetical abilities. Given that proto-arithmetical abilities are universal, this ontogenetic path is available to anyone who has access to the kind of enculturation required to move from proto-arithmetical abilities to proper arithmetic. But it is not a necessary path: we cannot rule out that somebody learns arithmetic entirely based on principles of equinumerosity and logic, as described by Frege (see also Linnebo 2018).

The second form of the counterargument is more serious. According to it, the empirical data on numerical cognition is not epistemologically relevant at all. This is similar to the way Frege (1884) dismissed the psychologistic theories of arithmetic of his time, though we should resist the temptation to speculate about what he would have thought of the kind of modern empirical research presented in this article. While this counterargument is rarely explicitly stated, it seems to be implicitly accepted by many philosophers of mathematics that the importance of empirical research is at least very limited, as demonstrated by the way the topic is ignored in most modern textbooks and encyclopaedia articles on the philosophy of mathematics (see, for example, Horsten 2023; Linnebo 2017).

While further work is needed to determine the epistemological consequences of empirical research on numerical cognition, some progress has already been made. Pantsar (2024a) has presented an epistemological account within which arithmetical knowledge is characterized as contextually a priori. According to this account, the experience of applying our proto-arithmetical abilities in ontogeny sets the context for developing arithmetical knowledge. While that context is thus constrained by our way of experiencing the world, within that context, arithmetical statements are thought to be knowable a priori. Therefore, arithmetical statements are neither refuted nor corroborated by observations. Distinguishing the account from that of Kitcher (1983), Pantsar does not claim that basic arithmetical truths (concerning finite natural numbers and their operations) are generalizations of operations in our environment. Instead, they are determined by our proto-arithmetical abilities. This also distinguishes the account from conventionalist views of mathematics. In his account, the reason for, say, 2 + 2 = 4 being an arithmetical truth is not that it is a firmly entrenched convention; instead, it is because our evolutionarily developed cognitive architecture—in this case the OTS—(partly) determines the domain of arithmetical truths.

3. Geometry

a. The Cognitive Foundations of Geometry

i. Proto-Geometrical Cognition

Historically, within the philosophy of mathematics, geometry has generally been seen in a similar light to arithmetic. In Ancient Greece, geometry was the paradigmatic field of mathematics and provided key content for Plato’s treatment of mathematics (The Republic). Indeed, Euclid, who gathered the Ancient knowledge of geometry into his famous Elements (Euclid, 1956), treated arithmetic essentially as an extension of geometry, with numbers defined as lengths of line segments. Geometry and arithmetic are likewise treated similarly in the work of Kant (1787), for whom geometry is also a paradigmatic case of synthetic a priori knowledge. These philosophical views suggest the innateness of geometrical abilities. Recently, this suggestion has been supported by the psychologist Gallistel (1990), who takes both arithmetic and (Euclidean) geometry to be the result of innate cognitive mechanisms.

More recently still, though, nativist views concerning geometrical abilities have been contested. There have been philosophical accounts, like that of Ferreirós and García-Pérez (2020), that emphasize the cultural characteristics of Euclidean geometry. Yet philosophers have also pursued accounts, like the arithmetical ones discussed above, according to which geometry is based on proto-mathematical, evolutionarily developed, capacities. While the empirical research on proto-geometrical abilities has not been as extensive as that on proto-arithmetical abilities, there are important results that should be considered by philosophers. Potentially, these may provide at least a partial cognitive foundation for the ontogenetic, and perhaps also the phylogenetic and cultural development, of geometrical knowledge and skills. This kind of philosophical work has been discussed most extensively by Mateusz Hohol in his book Foundations of Geometric Cognition (Hohol 2019), but there are also articles focusing on similar approaches concerning the cognitive foundations of geometry (for example, Hohol & Miłkowski, 2019; Pantsar, 2022).

The state of the art in empirical research on proto-geometrical cognition is fundamentally similar to that on proto-arithmetical cognition: two proto-geometrical abilities have been identified in the literature. The first of these concerns shape recognition, the second concerns orientation. Similarly to the OTS and the ANS, it has been proposed that these two abilities are due to two different core cognitive systems (Spelke 2011). And just like in the case of arithmetic, both have been seen as forming (at least a partial) cognitive foundation for the development of geometry (Hohol 2019).

ii. Shape Recognition

Let us first focus on the ability to recognize geometric shapes, which Hohol (2019) calls object recognition (shape recognition is a more fitting term because it is not clear that all recognized shapes are treated cognitively as objects). For a long time, psychology was dominated by the views of Piaget (1960), according to which children are born with no conception of objects and consequently, no conception of shapes. This view started to be contested in the 1970s, as evidence of neonates recognizing geometrical shapes emerged (Schwartz, Day, and Cohen 1979). Since then, there have been various empirical reports of infants (Bomba and Siqueland 1983; Newcombe and Huttenlocher 2000), non-human animals (Spelke and Lee 2012), and members of isolated cultures (Dehaene et al. 2006) being sensitive to geometric shapes in their observations and behaviour. Importantly, these abilities are almost always reported in terms of Euclidean geometry. Véronique Izard and colleagues, for example, report that Munduruku adults and children show an “intuitive understanding [of] essential properties of Euclidean geometry” (Izard et al. 2011, 9782). This includes estimating the sum of the internal angles of a triangle to be roughly 180 degrees and judging that exactly one line parallel to a given line can be drawn through a given point. Izard and Elizabeth Spelke (2009) have also described the different developmental stages in children’s learning of shape recognition in Euclidean terms.

Is there an innate ability, or at least a tendency, toward recognising Euclidean geometric shapes? In addition to psychologists like Gallistel, some philosophers have supported strong nativist views of Euclidean representations (see, for example, Hatfield, 2003). However, as described by Izard and Spelke (2009), the shape recognition system does not have enough resources to fully represent Euclidean geometry. While preschool children can detect, for example, curved lines among straight lines, and right angles among different types of angles, there are many notions of Euclidean geometry that they are not sensitive to. These are typically higher-order properties, such as symmetry (Izard and Spelke 2009). Given the limitations of the preschoolers’ abilities, we need to ask whether it makes sense to call them geometrical in the first place. Mirroring the distinction between arithmetic and proto-arithmetic, it seems necessary to distinguish between proto-geometrical and properly geometrical abilities. This distinction takes proto-geometrical abilities to be evolutionarily developed and innate, while geometrical abilities are culturally developed. Unlike proto-geometrical abilities, proper geometrical abilities are not limited to specific characteristics of shapes.

This does not mean that geometrical ability requires knowledge of Euclidean geometry, or other axiomatic systems. Such a definition would classify most people as geometrically ignorant. While it is not possible to precisely define geometrical ability, we can characterize the difference between geometrical and proto-geometrical abilities. In terms of distinguishing between angles, for example, Izard and Spelke (2009) report preschool children as being able to distinguish a different angle regardless of whether it is presented among acute, straight, or obtuse angles. This kind of ability is far from the geometrical knowledge that, for example, the angles of a triangle sum to two right angles. This latter kind of systematic knowledge about shapes and ways to understand them through precise notions, such as the size of an angle, should be considered geometrical. Under this characterization, it does not make sense to talk about proto-geometrical cognition being “Euclidean”. Only properly geometrical abilities can be Euclidean or non-Euclidean; proto-geometrical abilities are too imprecise to be so classified.

The core cognitive system for shape recognition should therefore be understood as proto-geometrical, but what kind of system is it, and what is the evidence for it? One key experiment reported that infants react to changes in angle size rather than to changes in the orientation of an angle (Cohen and Younger 1984). In this experiment, which has since been replicated by others (Lindskog et al. 2019), 6-week-old and 14-week-old infants were habituated to simple two-dimensional forms consisting of two lines that formed an angle. In the test trials, the angle (which was either 45 or 135 degrees) remained the same, but its orientation was changed. The eye movements of the infants showed that six-week-olds dishabituated (indicated by longer looking times) to a change in orientation, suggesting that they were surprised by the changing orientation. However, the fourteen-week-olds dishabituated to the angle size and not the orientation. If the angle stayed the same, they were not surprised by the next form being presented, but as soon as the angle size changed, their looking times became longer.

Cohen and Younger (1984) concluded that there has to be a developmental shift between the ages of six and fourteen weeks during which the infants start to recognize the geometric property of two lines being at a certain angle. Similar tests have been run on older children and it has been established that, starting from at least four years of age, children can consistently pick out a deviant geometric form, such as a different angle, from a collection of forms (Izard and Spelke 2009). Such results strongly imply that there is a core cognitive ability that enables shape recognition, and which develops with age without explicit understanding of geometry as a mathematical theory. This is also in line with data on members of the Amazonian Munduruku culture. Adults and children of at least four years of age have been reported to make shape discriminations similar to those of European and North American children (Izard et al. 2011). Hence there are good reasons to understand the core cognitive shape recognition ability as being proto-geometrical.

iii. Orientation

In addition to shape recognition, there are also extensive data on a proto-geometrical ability concerning orientation, which Hohol (2019) calls spatial navigation (orientation is a better term, given that spatial navigation also applies the shape recognition ability). In cognitive and comparative psychology, there is a long-standing idea that navigation in both humans and non-human animals is based on so-called “cognitive maps” (Tolman 1948). Cognitive maps are thought to be mental representations that animals form of new environments. In Tolman’s account, these representations are enduring, Euclidean, and independent of the location of the observer. Thus, the account can be seen as an early hypothesis suggesting a proto-geometrical orientation ability, even though under the present distinction between proto-geometrical and geometrical we should not call the representations Euclidean. Recently, these types of mental representations have been extensively criticized by the so-called radical enactivist philosophers (Hutto and Myin 2013), according to whom all representations are dependent on language. Among psychologists, though, the idea of cognitive maps has endured. Yet, according to the modern understanding, cognitive maps are not necessarily enduring, observer-free, or Euclidean (Spelke and Lee 2012). Instead, they can be momentary and tied to particular viewpoints, and contain “wormholes” that go against Euclidean characteristics (Rothman and Warren 2006; Wehner and Menzel 1990).

While the characteristics, and indeed the very existence, of cognitive maps are a topic of much debate, there is little doubt about the existence of the orientation ability with which they are connected in the literature. This ability is thought to represent distances and directions on large-scale surfaces and spaces. In one key early experiment reported by Ken Cheng (1986), rats were shown the location of buried food in a rectangular space. After being disoriented, they looked for the food almost equally at the right location and at the point rotated 180 degrees from it. This was remarkable, because the rectangular environment had distinct features in the corners, which the rats could have used to represent the space. It was only after training that rats started to use these features for navigation, suggesting that the orientation ability is the primary one used in spatial navigation. Similar behaviour has been reported for young children (Hermer and Spelke 1996; McGurk 1972), as well as non-human animals like ants (Wystrach and Beugnon 2009).
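The geometric ambiguity behind the rats’ rotational error can be illustrated with a small sketch. The following Python toy model is an assumption made here for illustration and is not Cheng’s own analysis: the arena dimensions, the food location, and the function names are all hypothetical. It represents a location only by its distances to the four walls in clockwise order, and shows that the correct location and the point rotated 180 degrees from it share the same description, whereas a mirror-image location does not.

```python
# Illustrative toy model only; not the analysis used by Cheng (1986).

def wall_distances(x, y, width, height):
    """Distances from (x, y) to the walls, in clockwise order: south, west, north, east."""
    return (y, x, height - y, width - x)


def geometrically_equivalent(p, q, width, height):
    """True if p and q stand in the same relation to the rectangle's shape.

    A non-square rectangle maps onto itself under a 180-degree rotation, so two
    points are indistinguishable by shape alone when their clockwise
    wall-distance tuples agree directly or after shifting by two walls.
    """
    a = wall_distances(p[0], p[1], width, height)
    b = wall_distances(q[0], q[1], width, height)
    return a == b or a == b[2:] + b[:2]


def rotate_180(p, width, height):
    """Rotate a point by 180 degrees about the centre of the rectangle."""
    return (width - p[0], height - p[1])


def mirror(p, width, height):
    """Reflect a point across the vertical midline of the rectangle."""
    return (width - p[0], p[1])


WIDTH, HEIGHT = 120, 60  # hypothetical arena dimensions
food = (20, 15)          # hypothetical food location

rotated = rotate_180(food, WIDTH, HEIGHT)
mirrored = mirror(food, WIDTH, HEIGHT)

print(geometrically_equivalent(food, rotated, WIDTH, HEIGHT))   # True
print(geometrically_equivalent(food, mirrored, WIDTH, HEIGHT))  # False
```

On this simplified picture, an animal relying on the shape of the space alone has no basis for preferring the correct location over its rotational twin, which is consistent with the nearly equal search rates described above. Only additional cues, such as the features in the corners, could break the tie.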

Moreover, there is evidence that the proto-geometrical ability for orientation is based on abstract representations that animals can use in a multi-modal manner. In one experiment, it was reported that rats navigate according to the shape of a chamber even in the dark (Quirk, Muller, and Kubie 1990). In addition, the orientation ability appears to override other changes in the environment. It has been reported that rats’ navigation in a chamber remains unchanged even with radical changes in the texture, material, and colour of the chamber (Lever et al. 2002). In the brain, the orientation ability has been strongly associated with the hippocampus (O’Keefe and Burgess 1996). Interestingly, some studies suggest that the spatial structure is mirrored in the location of neurons firing in the hippocampus. Colin Lever and colleagues, for example, report an experiment in which rats repeatedly exposed to two differently shaped environments develop different hippocampal place-cell representations (Lever et al. 2002). This is explained by there being “grid cells” located in the entorhinal cortex, which is the interface between the hippocampus and the neocortex. The grid cells are thought to be “activated whenever the animal’s position coincides with any vertex of a regular grid of equilateral triangles spanning the surface of the environment” (Hafting et al. 2005, 801). Such studies should not be seen as suggesting that spatial representations are generally expected to be mirrored in neural coding, but they do raise interesting questions about the way proto-geometrical representations are implemented in the brain. The connection between hippocampus and spatial representations could also explain interesting phenomena in humans, including data showing that London taxi drivers, who presumably need extensive cognitive maps for orientation, have significantly larger posterior hippocampal volume than control subjects (Maguire et al. 2000).
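The firing rule quoted from Hafting and colleagues can be made concrete with a short sketch. The Python code below is a deliberately idealized illustration, not a model taken from the cited studies: the grid spacing, the radius of the firing field, and the function names are hypothetical, and real grid cells show graded rather than all-or-nothing firing. It places the vertices of a grid of equilateral triangles on the plane and treats the cell as active when the animal’s position lies close to any vertex.

```python
import math


def nearest_vertex_distance(x, y, spacing):
    """Distance from (x, y) to the closest vertex of an equilateral-triangle grid.

    Vertices sit at i*a + j*b for integers i and j, with basis vectors
    a = (spacing, 0) and b = (spacing / 2, spacing * sqrt(3) / 2).
    """
    row_height = spacing * math.sqrt(3) / 2
    j = y / row_height        # fractional lattice coordinate along b
    i = x / spacing - j / 2   # fractional lattice coordinate along a
    best = math.inf
    # The point lies in one lattice parallelogram made of two equilateral
    # triangles, so its nearest vertex is one of the four corners.
    for di in (0, 1):
        for dj in (0, 1):
            vi = math.floor(i) + di
            vj = math.floor(j) + dj
            vx = spacing * (vi + vj / 2)
            vy = row_height * vj
            best = min(best, math.hypot(x - vx, y - vy))
    return best


def grid_cell_fires(x, y, spacing=30.0, field_radius=5.0):
    """Idealized rule: active when the position is within field_radius of a vertex."""
    return nearest_vertex_distance(x, y, spacing) <= field_radius


print(grid_cell_fires(0.0, 0.0))                        # on a vertex
print(grid_cell_fires(15.0, 30.0 * math.sqrt(3) / 6))   # centre of a triangle
```

The first call places the animal on a vertex, so the idealized cell is active; the second places it at the centre of a triangle, as far from every vertex as possible, so it is not.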

Spatial navigation in small children and non-human animals is not conducted exclusively with the orientation ability, though. In navigation, animals also use cues from so-called “landmark objects,” applying the proto-geometrical ability for shape recognition (Izard et al. 2011). Object representations based on shape recognition are thought to be different in three ways from spatial representations due to the orientation system (Spelke and Lee 2012, 2789). First, the shape representations fail to capture absolute lengths and distances between parts. Second, they are “sense-invariant,” that is, they do not capture the difference between a shape and its mirror image. Third, the shape representations capture relationships between lengths and angles that allow distinguishing between shapes. The cognitive map of grid cells is thought to be anchored also to landmarks, but it remains in their absence. This has been seen as evidence for the orientation ability being a distinct system from the shape recognition system (Hafting et al. 2005). Further support for this comes from there being distinct correlates both in neural (concerning brain areas) and cognitive (concerning learning strategies) terms for navigating based on landmarks (shape recognition) and extended surfaces (orientation), with data showing similar neural activity in rats and humans (Doeller, King, and Burgess 2008; Doeller and Burgess 2008; Spelke and Lee 2012).

b. The Development of Geometric Cognition

As in the case of arithmetic, the development of geometrical cognition needs to be divided into two questions, one concerning its ontogeny and the other its phylogeny and cultural history. Based on the kind of research summarized above, Hohol (2019) has argued that geometrical cognition has developed in a largely similar way to how arithmetical development was described in the enculturation account above. The two proto-geometrical core cognitive systems that we possess already in infancy and share with many non-human animals form a partial foundation of geometry, but the development of geometry would not be possible without human linguistic capacity, the ability to abstract, and the capacity to create and understand diagrams (Hohol 2019). Turning attention to these capacities—evidently exclusive to humans—it is apparent how they have shaped the development of geometry in phylogeny and cultural history, continuing to do so in ontogeny for every new generation.

It should be noted at this point that, just like in the case of arithmetic, the empirical literature on proto-geometrical cognition does not distinguish terminologically between proto-geometrical and geometrical abilities, or between proto-geometrical and geometrical representations. Virtually all articles report the abilities of young children and non-human animals as concerning “geometry.” As in the case of the epistemology of arithmetic, the conflation of two very different types of abilities and representations can be damaging for developing a proper philosophical understanding of the nature of geometrical knowledge.

To understand just how important that difference is, we must consider both the ontogeny and the phylogeny and cultural history. In terms of ontogeny, it is clear that acquiring proper geometrical knowledge requires a considerable amount of education even in order to grasp general geometrical notions like line and angle. Furthermore, many additional years of education must typically be completed before one can reach an understanding of geometry in the formal mathematical sense of a system of axioms and proofs based on them. In terms of phylogeny and cultural history, the matter is likely to be no less complex. If we accept that geometrical knowledge is based (partly) on proto-geometrical abilities, we are faced with the enormous challenge of explaining how these simple abilities for shape recognition and orientation have developed into axiomatic systems of geometry. This has led philosophers to criticize the notion of Spelke and colleagues (2010) that Euclidean geometry is “natural geometry.” Ferreirós and García-Pérez (2020) have argued that the gap between proto-geometrical abilities and Euclidean geometry is so wide that the latter cannot be called “natural” in any relevant sense. Instead, it is the product of a long line of cultural development in which cognitive artifacts (such as the ruler and the compass) and external representations (such as pictures and diagrams) have played crucial roles.

Ferreirós and García-Pérez (2020, 194) formulate a three-level model of the emergence of geometrical knowledge. The first level is called visuo-spatial cognition, which includes what has here been called proto-geometrical cognition. The second level they call “proto-geometry” which, contrary to the use of this terminology in this article to refer to evolutionarily developed abilities, refers in their taxonomy to developing basic concepts like circles and squares. On this level, tools and external representations play a key role. Finally, on the third level we get actual geometry, which requires the systematic and specific development of the second level. While the conceptual distinctions made by Ferreirós and García-Pérez are partly different from the ones introduced in this article, their key content is compatible with the approach here. We should be careful not to confuse lower-level abilities with higher-level concepts. Terminology-wise, the most important consequence is that we cannot call the ability of ants and rats “geometry,” as is often done in the literature (for example, Wystrach and Beugnon 2009). But the philosophically more important point is that geometry is so far away from our evolutionarily developed cognitive abilities that any conception of Euclidean geometry as “natural” geometry is potentially problematic.

This is directly related to one key problem Ferreirós and García-Pérez see with the approach of Spelke and colleagues (2010), namely the way the latter describe Euclidean concepts as “extremely simple,” because “just five postulates, together with some axioms of logic, suffice to specify all the properties of points, lines, and forms” (Spelke, Lee, and Izard 2010, 2785). To challenge this idea, Ferreirós and García-Pérez (2020, 187) point out two ways in which Euclidean concepts are not as simple as they may seem. First, the logical foundation of a system of geometry must be much richer than that of Euclid’s five postulates (Manders 2008). Second, and more importantly for the present purposes, Euclidean geometry cannot be equated with modern formal Hilbert-type (Hilbert 1902) theories of geometry that focus on logical step-by-step proofs. Instead, Euclidean geometry is fundamentally based on proofs that use (lettered) diagrams (Manders 2008; Netz 1999). Hence, Ferreirós (2016) has argued that the Euclidean postulates cannot be considered similar to modern Hilbertian axioms. The upshot of this, Ferreirós and García-Pérez (2020, 188) argue, is that very different types of cognitive abilities are involved in the two ways of practicing geometry. While Hilbertian geometry can be seen as reasoning based on simple postulates, Euclidean geometry is possible only by means of artifacts (ruler and compass) that allow for the construction of lettered diagrams. Therefore, Euclidean concepts may not be so simple after all.

While this history of geometry is important to recognize, it does not imply that we cannot trace the development of Euclidean geometry from proto-geometrical abilities. Such an effort is made by Hohol (2019), who has identified embodiment, abstraction, and cognitive artifacts as key notions in explaining this development. Among other sources, he refers to the work of Lakoff and Núñez (2000) in explaining the process of abstraction. They describe the process of abstraction as creating metaphorical counterparts of embodied processes in our environment. In the case of arithmetic, for example, addition is seen as the metaphorical counterpart of putting collections of physical objects together (p. 55). Geometrical concepts can feasibly be seen as being based on this kind of abstraction: a line in geometry, for example, has no width so it corresponds to no physical object, but it can be seen as an abstract counterpart of physically drawn lines.

In his analysis of cognitive artifacts, Hohol (2019) focuses on diagrams and formulae. As argued by Netz (1999), the introduction of lettered diagrams was central to the development of Greek geometry and its deductive method. While agreeing with this, Hohol and Miłkowski emphasize also the general role of linguistic formulae as cognitive artifacts in the development of geometry (Hohol and Miłkowski 2019). It was only through these cognitive artifacts that ancient practitioners of geometry could communicate with each other and intersubjectively develop their knowledge.

Finally, the matter of non-Euclidean geometries should be discussed. While there are competing axiomatizations of arithmetic, their differences are not as fundamental as the difference between Euclidean and non-Euclidean geometries. In non-Euclidean geometries, the fifth postulate of Euclid (called the parallel postulate) is rejected. According to this postulate, for any line l and point a not on l, there is exactly one line through a that does not intersect l. In hyperbolic geometry, there are infinitely many such non-intersecting lines. In elliptic geometry, all lines through a intersect l.
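The three cases can be stated compactly. In the following LaTeX sketch (which assumes the amsmath package), the notation N(a, l) for the number of lines through a that do not intersect l is introduced here purely for illustration and is not taken from the works cited. The second column adds the corresponding standard result for the interior angles α, β, and γ of a triangle, which ties the parallel postulate to the angle-sum property mentioned earlier.

```latex
\[
N(a, l) =
\begin{cases}
1      & \text{Euclidean geometry} \\
\infty & \text{hyperbolic geometry} \\
0      & \text{elliptic geometry}
\end{cases}
\qquad
\alpha + \beta + \gamma
\begin{cases}
= \pi & \text{Euclidean geometry} \\
< \pi & \text{hyperbolic geometry} \\
> \pi & \text{elliptic geometry}
\end{cases}
\]
```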

From the perspective of cognitive foundations, how can we account for non-Euclidean geometries? This problem used to be mainly academic, but gained importance when Einstein used non-Euclidean Riemannian geometry in his general theory of relativity. Currently, our best theory of macro-level physics applies a non-Euclidean geometry (even though the effects of general relativity tend to be detectable only in phenomena that are on a much larger scale than our everyday experiences). If, as argued by Spelke and others, our “natural” geometry is Euclidean, how is it possible that this natural geometry is not the geometry of the world, so to speak? Should this be seen as evidence against Euclidean geometry being natural in the sense of it being based on our basic cognitive architecture, that is, our proto-geometrical abilities? While such connections may be tempting to make, it is important to not read too much into proto-geometrical origins. First, we must remember that Euclidean geometry is a distant development from proto-geometrical origins, and, as such, it already contains a lot that is not present in our basic cognitive architecture. Non-Euclidean geometries may simply be a further step in culturally developing geometry. Second, it is possible that our basic proto-geometrical abilities are proto-Euclidean in the specific sense of agreeing with the intuitive content of the parallel postulate. The geometrical structure of the world, on the other hand, may be different from that intuitive content.

As in the case of arithmetic, it is important to note that this type of empirically-informed epistemology emerged over the last decades of the 20th century. So far, quite little has been written explicitly about the connection between the foundations of geometrical cognition and the nature of geometrical knowledge. Indeed, even less has been written about geometrical knowledge than arithmetical knowledge. There are at least two reasons for this. First, the empirical data relevant to arithmetical cognition are currently stronger both in quantity and quality. Second, philosophers have been working for longer on the cognitive foundations of arithmetic. There is, however, no reason to believe that geometrical knowledge will not be given an empirically-informed epistemological treatment similar to that provided for arithmetical knowledge. In fact, many of the considerations relevant to arithmetic seem to be applicable, mutatis mutandis, to geometry. For example, the point made above about the path-dependency of arithmetical knowledge also applies to geometrical knowledge: while there is growing evidence that geometrical knowledge is at least partly based on evolutionarily developed proto-geometrical abilities, that does not imply that this is the only ontogenetic or cultural historical trajectory leading to geometrical knowledge. It is at least in principle possible that geometrical knowledge can be acquired and developed independently of proto-geometrical abilities. Yet, there are good reasons to think that, standardly, children apply their proto-geometrical abilities in learning geometry. Exactly how this happens is a question that demands a lot more future work, and progress in this work will most likely make it increasingly philosophically relevant. It is to be expected that in coming years both fields will receive increasing attention from philosophers.

4. References and Further Reading

  • Anderson, Michael. 2015. After Phrenology: Neural Reuse and the Interactive Brain. Cambridge, MA: MIT Press.
  • Anderson, Michael. 2010. “Neural Reuse: A Fundamental Organizational Principle of the Brain.” Behavioral and Brain Sciences 33 (4): 245–66.
  • Assadian, Bahram, and Stefan Buijsman. 2019. “Are the Natural Numbers Fundamentally Ordinals?” Philosophy and Phenomenological Research 99 (3): 564–80. https://doi.org/10.1111/phpr.12499.
  • Ayer, Alfred Jules. 1970. Language, Truth and Logic. Unabridged and Unaltered republ. of the 2. (1946) ed. New York, NY: Dover Publications.
  • Barrocas, Roberta, Stephanie Roesch, Caterina Gawrilow, and Korbinian Moeller. 2020. “Putting a Finger on Numerical Development – Reviewing the Contributions of Kindergarten Finger Gnosis and Fine Motor Skills to Numerical Abilities.” Frontiers in Psychology 11:1012. https://doi.org/10.3389/fpsyg.2020.01012.
  • Beck, Jacob. 2017. “Can Bootstrapping Explain Concept Learning?” Cognition 158: 110–21.
  • Benacerraf, Paul. 1965. “What Numbers Could Not Be.” The Philosophical Review 74 (1): 47–73. https://doi.org/10.2307/2183530.
  • Bender, Andrea, and Sieghard Beller. 2012. “Nature and Culture of Finger Counting: Diversity and Representational Effects of an Embodied Cognitive Tool.” Cognition 124 (2): 156–82. https://doi.org/10.1016/j.cognition.2012.05.005.
  • Bomba, Paul C., and Einar R. Siqueland. 1983. “The Nature and Structure of Infant Form Categories.” Journal of Experimental Child Psychology 35 (2): 294–328. https://doi.org/10.1016/0022-0965(83)90085-1.
  • Boyd, Robert, and Peter J. Richerson. 1985. Culture and the Evolutionary Process. Chicago: University of Chicago Press.
  • Boyd, Robert, and Peter J. Richerson. 2005. Not by Genes Alone. Chicago: University of Chicago Press.
  • Brannon, Elizabeth M., and Gretchen A. Van de Walle. 2001. “The Development of Ordinal Numerical Competence in Young Children.” Cognitive Psychology 43 (1): 53–81. https://doi.org/10.1006/cogp.2001.0756.
  • Buijsman, Stefan. 2019. “Learning the Natural Numbers as a Child.” Noûs 53 (1): 3–22.
  • Buijsman, Stefan. 2021. “How Do We Semantically Individuate Natural Numbers?” Philosophia Mathematica 29 (2): 214–33. https://doi.org/10.1093/philmat/nkab001.
  • Butterworth, Brian. 1999. What Counts: How Every Brain Is Hardwired for Math. New York: The Free Press.
  • Cantor, Georg. 1883. “Über unendliche, lineare Punktmannigfaltigkeiten, 5.” Mathematische Annalen 21: 545–86.
  • Carey, Susan. 2004. “Bootstrapping & the Origin of Concepts.” Daedalus 133 (1): 59–68.
  • Carey, Susan. 2009. The Origin of Concepts. Oxford: Oxford University Press.
  • Carnap, Rudolf. 1937. The Logical Syntax of Language. Open Court Classics. Chicago, Ill: Open Court.
  • Cheng, Ken. 1986. “A Purely Geometric Module in the Rat’s Spatial Representation.” Cognition 23 (2): 149–78. https://doi.org/10.1016/0010-0277(86)90041-7.
  • Cheyette, Samuel J., and Steven T. Piantadosi. 2020. “A Unified Account of Numerosity Perception.” Nature Human Behaviour 4 (12): 1265–72. https://doi.org/10.1038/s41562-020-00946-0.
  • Clark, Andy, and David Chalmers. 1998. “The Extended Mind.” Analysis 58 (1): 7–19.
  • Clarke, Sam, and Jacob Beck. 2021. “The Number Sense Represents (Rational) Numbers.” Behavioral and Brain Sciences, April, 1–57. https://doi.org/10.1017/S0140525X21000571.
  • Cohen, Leslie B., and Barbara A. Younger. 1984. “Infant Perception of Angular Relations.” Infant Behavior and Development 7: 37–47.
  • Cole, Julian C. 2013. “Towards an Institutional Account of the Objectivity, Necessity, and Atemporality of Mathematics.” Philosophia Mathematica 21 (1): 9–36. https://doi.org/10.1093/philmat/nks019.
  • Cole, Julian C. 2015. “Social Construction, Mathematics, and the Collective Imposition of Function onto Reality.” Erkenntnis 80 (6): 1101–24. https://doi.org/10.1007/s10670-014-9708-8.
  • Davidson, Kathryn, Kortney Eng, and David Barner. 2012. “Does Learning to Count Involve a Semantic Induction?” Cognition 123: 162–73.
  • De Cruz, Helen, and Johan De Smedt. 2010. “The Innateness Hypothesis and Mathematical Concepts.” Topoi 29 (1): 3–13.
  • De Cruz, Helen. 2008. “An Extended Mind Perspective on Natural Number Representation.” Philosophical Psychology 21 (4): 475–90. https://doi.org/10.1080/09515080802285289.
  • Decock, Lieven. 2008. “The Conceptual Basis of Numerical Abilities: One-to-One Correspondence Versus the Successor Relation.” Philosophical Psychology 21 (4): 459–73. https://doi.org/10.1080/09515080802285255.
  • Dehaene, Stanislas. 1997. The Number Sense: How the Mind Creates Mathematics. 2nd ed. New York: Oxford University Press.
  • Dehaene, Stanislas. 2009. Reading in the Brain: The New Science of How We Read. London: Penguin.
  • Dehaene, Stanislas, Véronique Izard, Elizabeth Spelke, and Pierre Pica. 2008. “Log or Linear? Distinct Intuitions of the Number Scale in Western and Amazonian Indigene Cultures.” Science 320: 1217–20.
  • Dehaene, Stanislas. 2011. The Number Sense: How the Mind Creates Mathematics. Revised and updated ed. New York: Oxford University Press.
  • Dehaene, Stanislas, Véronique Izard, Pierre Pica, and Elizabeth Spelke. 2006. “Core Knowledge of Geometry in an Amazonian Indigene Group.” Science 311 (5759): 381–84. https://doi.org/10.1126/science.1121739.
  • DeLong, Caroline M., Stephanie Barbato, Taylor O’Leary, and K. Tyler Wilcox. 2017. “Small and Large Number Discrimination in Goldfish (Carassius Auratus) with Extensive Training.” Behavioural Processes, The Cognition of Fish, 141 (August):172–83. https://doi.org/10.1016/j.beproc.2016.11.011.
  • Doeller, Christian F., and Neil Burgess. 2008. “Distinct Error-Correcting and Incidental Learning of Location Relative to Landmarks and Boundaries.” Proceedings of the National Academy of Sciences 105 (15): 5909–14. https://doi.org/10.1073/pnas.0711433105.
  • Doeller, Christian F., John A. King, and Neil Burgess. 2008. “Parallel Striatal and Hippocampal Systems for Landmarks and Boundaries in Spatial Memory.” Proceedings of the National Academy of Sciences 105 (15): 5915–20. https://doi.org/10.1073/pnas.0801489105.
  • Epps, Patience. 2006. “Growing a Numeral System: The Historical Development of Numerals in an Amazonian Language Family.” Diachronica 23 (2): 259–88. https://doi.org/10.1075/dia.23.2.03epp.
  • Euclid. 1956. The Thirteen Books of Euclid’s Elements. Vol. 1: Introduction and Books I, II. Second edition revised with additions. Vol. 1. New York: Dover Publications.
  • Everett, Caleb. 2017. Numbers and the Making of Us: Counting and the Course of Human Cultures. Harvard University Press.
  • Everett, Caleb, and Keren Madora. 2012. “Quantity Recognition Among Speakers of an Anumeric Language.” Cognitive Science 36 (1): 130–41. https://doi.org/10.1111/j.1551-6709.2011.01209.x.
  • Fabry, Regina E. 2018. “Betwixt and between: The Enculturated Predictive Processing Approach to Cognition.” Synthese 195 (6): 2483–2518.
  • Fabry, Regina E. 2020. “The Cerebral, Extra-Cerebral Bodily, and Socio-Cultural Dimensions of Enculturated Arithmetical Cognition.” Synthese 197:3685–3720.
  • Fabry, Regina E., and Markus Pantsar. 2021. “A Fresh Look at Research Strategies in Computational Cognitive Science: The Case of Enculturated Mathematical Problem Solving.” Synthese 198 (4): 3221–63. https://doi.org/10.1007/s11229-019-02276-9.
  • Fechner, Gustav Theodor. 1948. “Elements of Psychophysics, 1860.” In Readings in the History of Psychology, 206–13. Century Psychology Series. East Norwalk, CT, US: Appleton-Century-Crofts. https://doi.org/10.1037/11304-026.
  • Feferman, Solomon. 2009. “Conceptions of the Continuum.” Intellectica 51 (1): 169–89.
  • Feigenson, Lisa, Stanislas Dehaene, and Elizabeth Spelke. 2004. “Core Systems of Number.” Trends in Cognitive Sciences 8 (7): 307–14.
  • Ferreirós, José. 2016. Mathematical Knowledge and the Interplay of Practices. Princeton: Princeton.
  • Ferreirós, José, and Manuel J. García-Pérez. 2020. “Beyond Natural Geometry: On the Nature of Proto-Geometry.” Philosophical Psychology 33 (2): 181–205.
  • Frank, Michael C., Daniel L. Everett, Evelina Fedorenko, and Edward Gibson. 2008. “Number as a Cognitive Technology: Evidence from Pirahã Language and Cognition.” Cognition 108 (3): 819–24. https://doi.org/10.1016/j.cognition.2008.04.007.
  • Frege, Gottlob. 1879. “Begriffsschift.” In From Frege to Gödel: A source book in mathematical logic, 1879-1931, edited by J. Heijenoort, 1–82. Harvard University Press.
  • Frege, Gottlob. 1884. The Foundations of Arithmetic. Oxford: Basil Blackwell.
  • Fuson, Karen C. 1987. Children’s Counting and Concepts of Number. New York: Springer.
  • Fuson, Karen C., and Walter G. Secada. 1986. “Teaching Children to Add by Counting-On with One-Handed Finger Patterns.” Cognition and Instruction 3 (3): 229–60.
  • Gallistel, Charles R. 1990. The Organization of Learning. Cambridge, Mass: Mit Pr.
  • Gallistel, Charles R. 2017. “Numbers and Brains.” Learning & Behaviour 45 (4): 327–28.
  • Gelman, Rochel and Charles R. Gallistel. 2004. “Language and the Origin of Numerical Concepts.” Science 306: 441–43.
  • Goodman, Nelson. 1955. Fact, Fiction, and Forecast. Second. Harvard University Press.
  • Gordon, Peter. 2004. “Numerical Cognition without Words: Evidence from Amazonia.” Science 306 (5695): 496–99.
  • Hafting, Torkel, Marianne Fyhn, Sturla Molden, May-Britt Moser, and Edvard I. Moser. 2005. “Microstructure of a Spatial Map in the Entorhinal Cortex.” Nature 436 (7052): 801–6. https://doi.org/10.1038/nature03721.
  • Hallett, Michael. 1988. Cantorian Set Theory and Limitation of Size. Oxford Logic Guides 10. Oxford [England] : New York: Clarendon Press ; Oxford University Press.
  • Hatfield, Gary. 2003. The Natural and the Normative: Theories of Spatial Perception from Kant to Helmholtz. The MIT Press. https://doi.org/10.7551/mitpress/4852.001.0001.
  • Henrich, Joseph. 2015. The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter. Princeton University Press.
  • Hermer, Linda, and Elizabeth Spelke. 1996. “Modularity and Development: The Case of Spatial Reorientation.” Cognition 61 (3): 195–232. https://doi.org/10.1016/S0010-0277(96)00714-7.
  • Heyes, Cecilia. 2018. Cognitive Gadgets: The Cultural Evolution of Thinking. Cambridge: Harvard University Press.
  • Hilbert, David. 1902. The Foundations of Geometry. Open court publishing Company.
  • Hohol, Mateusz. 2019. Foundations of Geometric Cognition. New York: Routledge.
  • Hohol, Mateusz, and Marcin Miłkowski. 2019. “Cognitive Artifacts for Geometric Reasoning.” Foundations of Science 24 (4): 657–80. https://doi.org/10.1007/s10699-019-09603-w.
  • Horsten, Leon. 2023. “Philosophy of Mathematics.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta and Uri Nodelman, Winter 2023. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/win2023/entries/philosophy-mathematics/.
  • Howard, Scarlett R., Aurore Avarguès-Weber, Jair E. Garcia, Andrew D. Greentree, and Adrian G. Dyer. 2019. “Numerical Cognition in Honeybees Enables Addition and Subtraction.” Science Advances 5 (2): eaav0961. https://doi.org/10.1126/sciadv.aav0961.
  • Hutto, Daniel D. 2019. “Re-Doing the Math: Making Enactivism Add Up.” Philosophical Studies 176: 827–37.
  • Hutto, Daniel D., and Erik Myin. 2013. Radicalizing Enactivism. Basic Minds without Content. Cambridge, MA: MIT Press.
  • Hutto, Daniel D., and Erik Myin. 2017. Evolving enactivism. Basic minds meet content. Cambridge, MA: MIT Press.
  • Hyde, Daniel C. 2011. “Two Systems of Non-Symbolic Numerical Cognition.” Frontiers in Human Neuroscience 5: 150.
  • Ifrah, Georges. 1998. The Universal History of Numbers: From Prehistory to the Invention of the Computer. London: Harville Press.
  • Izard, Véronique, Pierre Pica, Elizabeth S. Spelke, and Stanislas Dehaene. 2011. “Flexible Intuitions of Euclidean Geometry in an Amazonian Indigene Group.” Proceedings of the National Academy of Sciences 108 (24): 9782–87.
  • Izard, Véronique, and Elizabeth S. Spelke. 2009. “Development of Sensitivity to Geometry in Visual Forms.” Human Evolution 23 (3): 213.
  • Izard, Véronique, Pierre Pica, Stanislas Dehaene, Danielle Hinchey, and Elizabeth Spelke. 2011. “Geometry as a Universal Mental Construction.” In Space, Time and Number in the Brain, edited by Stanislas Dehaene and Elizabeth M. Brannon, 319–32. San Diego: Academic Press. https://doi.org/10.1016/B978-0-12-385948-8.00019-0.
  • Johansen, Mikkel W., and Morten Misfeldt. 2020. “Material Representations in Mathematical Research Practice.” Synthese 197 (9): 3721–41. https://doi.org/10.1007/s11229-018-02033-4.
  • Jones, Max. 2020. “Numerals and Neural Reuse.” Synthese 197: 3657–81.
  • Kant, Immanuel. 1787. Critique of Pure Reason. Cambridge University Press.
  • Kaufman, Edna L., Miles W. Lord, Thomas W. Reese, and John Volkmann. 1949. “The Discrimination of Visual Number.” The American Journal of Psychology 62: 498–525. https://doi.org/10.2307/1418556.
  • Kitcher, Philip. 1983. The Nature of Mathematical Knowledge. New York: Oxford University Press.
  • Knops, Andre. 2020. Numerical Cognition. The Basics. New York: Routledge.
  • Kripke, Saul A. 1982. Wittgenstein on Rules and Private Language: An Elementary Exposition. Harvard University Press.
  • Lakoff, George, and Rafael Núñez. 2000. Where Mathematics Comes From. New York: Basic Books.
  • Lee, Michael D., and Barbara W. Sarnecka. 2010. “A Model of Knower‐level Behavior in Number Concept Development.” Cognitive Science 34 (1): 51–67.
  • Lee, Michael D., and Barbara W. Sarnecka. 2011. “Number-Knower Levels in Young Children: Insights from Bayesian Modeling.” Cognition 120 (3): 391–402.
  • Lever, Colin, Tom Wills, Francesca Cacucci, Neil Burgess, and John O’Keefe. 2002. “Long-Term Plasticity in Hippocampal Place-Cell Representation of Environmental Geometry.” Nature 416 (6876): 90.
  • Lindskog, Marcus, Maria Rogell, Gustaf Gredebäck and Ben Kenward. 2019. “Discrimination of Small Forms in a Deviant-Detection Paradigm by 10-Month-Old Infants.” Frontiers in Psychology 10: 1032.
  • Linnebo, Øystein. 2017. Philosophy of Mathematics. Princeton: Princeton Unviesrity Press.
  • Linnebo, Øystein. 2018. Thin Objects. Oxford: Oxford University Press.
  • Maddy, Penelope. 1990. Realism in Mathematics. Oxford: Oxford University Press.
  • Maguire, Eleanor A., David G. Gadian, Ingrid S. Johnsrude, Catriona D. Good, John Ashburner, Richard S. J. Frackowiak, and Christopher D. Frith. 2000. “Navigation-Related Structural Change in the Hippocampi of Taxi Drivers.” Proceedings of the National Academy of Sciences 97 (8): 4398–4403. https://doi.org/10.1073/pnas.070039597.
  • Malafouris, Lambros. 2013. How Things Shape the Mind: A Theory of Material Engagement. Cambridge, MA, USA: MIT Press.
  • Manders, Kenneth. 2008. “The Euclidean Diagram.” In The Philosophy of Mathematical Practice, edited by P. Mancosu, 80–133. Oxford: Oxford University Press.
  • Margolis, Eric and Stephen Laurence. 2008. “How to Learn the Natural Numbers: Inductive Inference and the Acquisition of Number Concepts.” Cognition 106: 924–39.
  • McGurk, Harry. 1972. “Infant Discrimination of Orientation.” Journal of Experimental Child Psychology 14 (1): 151–64. https://doi.org/10.1016/0022-0965(72)90040-9.
  • Menary, Richard. 2015. Mathematical Cognition: A Case of Enculturation. Frankfurt am Main: Open MIND, MIND Group.
  • Merzbach, Uta C., and Carl B. Boyer. 2011. A History of Mathematics. 3rd ed. Hoboken, N.J: John Wiley.
  • Mill, John Stuart. 1843. “A System of Logic.” In Collected Works of John Stuart Mill, edited by J.M. Robson. Vol. vols. 7 & 8. Toronto: University of Toronto Press.
  • Miller, Kevin F., Catherine M. Smith, Jianjun Zhu, and Houcan Zhang. 1995. “Preschool Origins of Cross-National Differences in Mathematical Competence: The Role of Number-Naming Systems.” Psychological Science 6 (1): 56–60.
  • Netz, Reviel. 1999. The Shaping of Deduction in Greek Mathematics. Cambridge, UK: Cambridge University Press.
  • Newcombe, Nora S., and Janellen Huttenlocher. 2000. Making Space: The Development of Spatial Representation and Reasoning. 1st edition. Cambridge, Mass: Bradford Books.
  • Nieder, A., and S. Dehaene. 2009. “Representation of Number in the Brain.” Annual Review of Neuroscience 32:185–208.
  • Nieder, Andreas. 2019. A Brain for Numbers: The Biology of the Number Instinct. Illustrated edition. Cambridge, Massachusetts: The MIT Press.
  • Nissen, Hans J., Peter Damerow, and Robert K. Englund. 1994. Archaic Bookkeeping: Early Writing and Techniques of Economic Administration in the Ancient Near East. Translated by Paul Larsen. 1st edition. Chicago, Ill: University of Chicago Press.
  • Noël, Marie-Pascale. 2005. “Finger Gnosia: A Predictor of Numerical Abilities in Children?” Child Neuropsychology 11 (5): 413–30. https://doi.org/10.1080/09297040590951550.
  • Noles, Nicholaus S., Brian J. Scholl, and Stephen R. Mitroff. 2005. “The Persistence of Object File Representations.” Perception & Psychophysics 67 (2): 324–34. https://doi.org/10.3758/BF03206495.
  • Núñez, Rafael E. 2017. “Is There Really an Evolved Capacity for Number?” Trends in Cognitive Science 21:409–24.
  • O’Keefe, John, and Neil Burgess. 1996. “Geometric Determinants of the Place Fields of Hippocampal Neurons.” Nature 381 (6581): 425–28. https://doi.org/10.1038/381425a0.
  • Overmann, Karenleigh A. 2018. “Constructing a Concept of Number.” Journal of Numerical Cognition 4 (2).
  • Overmann, Karenleigh A. 2023. The Materiality of Numbers: Emergence and Elaboration from Prehistory to Present. Cambridge ; New York, NY: Cambridge University Press.
  • Pantsar, Markus and Catarina Dutilh Novaes. 2020. “Synthese Special Issue: Mathematical Cognition and Enculturation.” Synthese 197. https://doi.org/10.1007/s11229-019-02478-1.
  • Pantsar, Markus. 2014. “An Empirically Feasible Approach to the Epistemology of Arithmetic.” Synthese 191 (17): 4201–29. https://doi.org/10.1007/s11229-014-0526-y.
  • Pantsar, Markus. 2018. “Early Numerical Cognition and Mathematical Processes.” THEORIA. Revista de Teoría, Historia y Fundamentos de La Ciencia 33 (2): 285–304.
  • Pantsar, Markus. 2019. “The Enculturated Move from Proto-Arithmetic to Arithmetic.” Frontiers in Psychology 10:1454.
  • Pantsar, Markus. 2020. “Mathematical Cognition and Enculturation: Introduction to the Synthese Special Issue.” Synthese 197 (9): 3647–55. https://doi.org/10.1007/s11229-019-02478-1.
  • Pantsar, Markus. 2021a. “Bootstrapping of Integer Concepts: The Stronger Deviant-Interpretation Challenge.” Synthese 199 (3–4): 5791–5814. https://doi.org/10.1007/s11229-021-03046-2.
  • Pantsar, Markus. 2021b. “Objectivity in Mathematics, Without Mathematical Objects†.” Philosophia Mathematica 29 (3): 318–52. https://doi.org/10.1093/philmat/nkab010.
  • Pantsar, Markus. 2022. “On the Development of Geometric Cognition: Beyond Nature vs. Nurture.” Philosophical Psychology 35 (4): 595–616. https://doi.org/10.1080/09515089.2021.2014441.
  • Pantsar, Markus. 2024a. Numerical Cognition and the Epistemology of Arithmetic. Cambridge University Press.
  • Pantsar, Markus. 2024b. “Why Do Numbers Exist? A Psychologist Constructivist Account.” Inquiry 0 (0): 1–33. https://doi.org/10.1080/0020174X.2024.2305386.
  • Peano, Giuseppe. 1889. “The Principles of Arithmetic, Presented by a New Method.” In Selected Works of Giuseppe Peano, edited by H. Kennedy, 101–34. Toronto; Buffalo: University of Toronto Press.
  • Pelland, Jean-Charles. 2018. “Which Came First, the Number or the Numeral?” In Naturalizing Logico-Mathematical Knowledge: Approaches from Philosophy, Psychology and Cognitive Science, edited by S. Bangu, 179–94. New York and London: Routledge.
  • Penner-Wilger, Marcie, and Michael L. Anderson. 2013. “The Relation between Finger Gnosis and Mathematical Ability: Why Redeployment of Neural Circuits Best Explains the Finding.” Frontiers in Psychology 4 (December):877. https://doi.org/10.3389/fpsyg.2013.00877.
  • Piaget, Jean. 1960. Childs Concept Of Geometry. Basic Books.
  • Piaget, Jean. 1965. Child’s Conception Of Number. Copyright 1965 edition. Princeton, N.J.: W. W. Norton & Company.
  • Pica, Pierre, Cathy Lemer, Véronique Izard, and Stanislas Dehaene. 2004. “Exact and Approximate Arithmetic in an Amazonian Indigene Group.” Science 306 (5695): 499–503.
  • Plato. 1992. The Republic. Translated by G.M.A Grube. Second. Indianapolis: Hackett Publishing Company. Putnam.
  • Quinon, Paula. 2021. “Cognitive Structuralism: Explaining the Regularity of the Natural Numbers Progression.” Review of Philosophy and Psychology. Springer. https://link.springer.com/article/10.1007/s13164-021-00524-x.
  • Quirk, Gregory J., Robert U. Muller, and John L. Kubie. 1990. “The Firing of Hippocampal Place Cells in the Dark Depends on the Rat’s Recent Experience.” Journal of Neuroscience 10 (6): 2008–17.
  • Rips, Lance J., Amber Bloomfield, and Jennifer Asmuth. 2008. “From Numerical Concepts to Concepts of Number.” Behavioral and Brain Sciences 31 (6): 623–42. https://doi.org/10.1017/S0140525X08005566.
  • Rips, Lance J., Jennifer Asmuth, and Amber Bloomfield. 2006. “Giving the Boot to the Bootstrap: How Not to Learn the Natural Numbers.” Cognition 101 (3): 51–60.
  • Rothman, Daniel B., and William H. Warren. 2006. “Wormholes in Virtual Reality and the Geometry of Cognitive Maps.” Journal of Vision 6 (6): 143. https://doi.org/10.1167/6.6.143.
  • Rugani, Rosa, Laura Fontanari, Eleonora Simoni, Lucia Regolin, and Giorgio Vallortigara. 2009. “Arithmetic in Newborn Chicks.” Proceedings of the Royal Society B: Biological Sciences 276 (1666): 2451–60.
  • dos Santos, César Frederico. 2021. “Enculturation and the Historical Origins of Number Words and Concepts.” Synthese, June. https://doi.org/10.1007/s11229-021-03202-8.
  • Schlimm, Dirk. 2018. “Numbers Through Numerals: The Constitutive Role of External Representations.” In Naturalizing Logico-Mathematical Knowledge. Routledge.
  • Schlimm, Dirk 2021. “How Can Numerals Be Iconic? More Varieties of Iconicity.” In Diagrammatic Representation and Inference, edited by Amrita Basu, Gem Stapleton, Sven Linker, Catherine Legg, Emmanuel Manalo, and Petrucio Viana, 520–28. Lecture Notes in Computer Science. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-86062-2_53.
  • Schmandt-Besserat, Denise. 1996. How Writing Came About. University of Texas Press.
  • Schwartz, Marcelle, R. H. Day, and Leslie B. Cohen. 1979. “Visual Shape Perception in Early Infancy.” Monographs of the Society for Research in Child Development 44 (7): 1–63. https://doi.org/10.2307/1165963.
  • Shapiro, Stewart. 1997. Philosophy of Mathematics: Structure and Ontology. New York: Oxford University Press.
  • Spelke, Elizabeth, Sang Ah Lee, and Véronique Izard. 2010. “Beyond Core Knowledge: Natural Geometry.” Cognitive Science 34 (5): 863–84. https://doi.org/10.1111/j.1551-6709.2010.01110.x.
  • Spelke, Elizabeth S. 2000. “Core Knowledge.” American Psychologist 55 (11): 1233–43. https://doi.org/10.1037/0003-066X.55.11.1233.
  • Spelke, Elizabeth S., and Sang Ah Lee. 2012. “Core Systems of Geometry in Animal Minds.” Philosophical Transactions of the Royal Society B: Biological Sciences 367 (1603): 2784–93. https://doi.org/10.1098/rstb.2012.0210.
  • Spelke, Elizabeth S. 2011. “Natural Number and Natural Geometry.” In Space, Time and Number in the Brain, edited by S. Dehaene and E. Brannon, 287–317. London: Academic Press.
  • Starkey, Prentice, and Robert G. Cooper. 1980. “Perception of Numbers by Human Infants.” Science 210 (4473): 1033–35.
  • Tolman, Edward C. 1948. “Cognitive Maps in Rats and Men.” Psychological Review 55 (4): 189–208. https://doi.org/10.1037/h0061626.
  • Tomasello, Michael. 1999. The Cultural Origins of Human Cognition. Cambridge, MA: Harvard University Press.
  • Verdine, Brian N., Roberta Michnick Golinkoff, Kathryn Hirsh-Pasek, Nora S. Newcombe, Andrew T. Filipowicz, and Alicia Chang. 2014. “Deconstructing Building Blocks: Preschoolers’ Spatial Assembly Performance Relates to Early Mathematics Skills.” Child Development 85 (3): 1062–76. https://doi.org/10.1111/cdev.12165.
  • Vold, Karina, and Dirk Schlimm. 2020. “Extended Mathematical Cognition: External Representations with Non-Derived Content.” Synthese 197 (9): 3757–77. https://doi.org/10.1007/s11229-019-02097-w.
  • Walsh, Vincent. 2003. “A Theory of Magnitude: Common Cortical Metrics of Time, Space and Quantity.” Trends in Cognitive Sciences 7 (11): 483–88. https://doi.org/10.1016/j.tics.2003.09.002.
  • Warren, Jared. 2020. Shadows of Syntax: Revitalizing Logical and Mathematical Conventionalism. New York, NY, United States of America: Oxford University Press.
  • Wasner, Mirjam, Hans-Christoph Nuerk, Laura Martignon, Stephanie Roesch, and Korbinian Moeller. 2016. “Finger Gnosis Predicts a Unique but Small Part of Variance in Initial Arithmetic Performance.” Journal of Experimental Child Psychology 146:1–16.
  • Wehner, Rüdiger, and Randolf Menzel. 1990. “Do Insects Have Cognitive Maps?” Annual Review of Neuroscience 13 (1): 403–14. https://doi.org/10.1146/annurev.ne.13.030190.002155.
  • Wiese, Heike. 2007. “The Co-Evolution of Number Concepts and Counting Words.” Lingua 117: 758–72.
  • Wittgenstein, Ludwig. 1978. Remarks on the Foundations of Mathematics. Translated by Georg Henrik von Wright. Rev. ed., 4. print. Cambridge, Mass.: MIT Press.
  • Wynn, Karen. 1990. “Children’s Understanding of Counting.” Cognition 36 (2): 155–93.
  • Wynn, Karen. 1992. “Addition and Subtraction by Human Infants.” Nature 358:749–51.
  • Wystrach, Antoine and Guy Beugnon. 2009. “Ants Learn Geometry and Features.” Current Biology 19 (1): 61–66.
  • Zahidi, Karim. 2021. “Radicalizing numerical cognition.” Synthese 198 (Suppl 1): 529–45.
  • Zahidi, Karim and Erik Myin. 2018. “Making Sense of Numbers without a Number Sense.” In Naturalizing Logico Mathematical Knowledge : Approaches from Philosophy, Psychology and Cognitive Science, edited by S. Bangu, 218–33. London: Routledge.

 

Author Information

Markus Pantsar
Email: markus.pantsar@gmail.com
RWTH Aachen University
Germany

Leśniewski: Logic

Stanisław Leśniewski (1886-1939) was a Polish logician and philosopher, co-founder with his colleague Jan Łukasiewicz of one of the most active logic centers of the twentieth century: the Warsaw School of Logic. As an alternative to Whitehead’s and Russell’s Principia Mathematica, he developed his own program for the foundations of mathematics on the basis of three systems. The first, called ‘Protothetic’, is a quantified propositional logic. The second, called ‘Ontology’, is a modernized, higher-order version of term logic. The last and most famous one is a general theory of parts and wholes, called ‘Mereology’. His concern for rigor in analysis and formalization led him to a logical work remarkable in its generality and precision. As a nominalist, he developed one of the major attempts to provide nominalistically acceptable foundations of mathematics. Although his logical systems have not been widely adopted and remain on the margins of standard logic, many of his views and innovations have greatly influenced the progress of logic: his conception of higher-order quantification, his development of a free and plural logic, his outline of natural deduction, his concern for the distinctions between use and mention and between language and meta-language, his canons of good definition, his formalization of the theory of parts and wholes. All this makes him one of the key figures of twentieth-century logic.

Table of Contents

  1. Life and Work
  2. Logical Systems
    1. Protothetic (Propositional Logic)
      1. A Quantified Propositional Logic
      2. Definition as a Rule
      3. Bivalency and Extensionality
      4. Semantic Categories and Contextual Syntax
    2. Ontology (Term Logic)
      1. Names and Copula
      2. The Axiomatic System
      3. Higher Orders
    3. Mereology (Part-Whole Theory)
      1. Mereology and Russell’s Paradox
      2. The Axiomatic System
  3. Foundations of Mathematics
    1. Mereology and Set Theory
    2. Ontology and Arithmetic
  4. References and Further Reading

1. Life and Work

Stanisław Leśniewski was born on March 30, 1886, to Polish parents in Serpukhov, a small Russian town near Moscow. His father, a railway engineer, moved the family from one construction site to another, which is how young Leśniewski came to attend the Gymnasium in the Siberian city of Irkutsk. Between 1904 and 1910, he studied mathematics and philosophy in St. Petersburg and at several German-speaking universities, in Leipzig, Heidelberg, Zurich, and Munich. In 1910 he moved to the University of Lvov (then a Polish city in Austria-Hungary, later known as Lviv in Ukraine), where he obtained his doctorate within two years under the supervision of Kazimierz Twardowski, with a dissertation on the analysis of existential propositions.

Like many Polish philosophers of his time, Leśniewski was deeply influenced by Twardowski. Though he would later diverge from his master’s philosophical views, the rigorous spirit and quest for the greatest linguistic precision instilled by Twardowski, inherited from Brentano, permeated Leśniewski’s entire body of work. A pivotal moment in Leśniewski’s intellectual development occurred in 1911, when he encountered symbolic logic and Russell’s paradox through Jan Łukasiewicz’s book, On the Principle of Contradiction in Aristotle. In the ensuing years, Leśniewski published several papers, mainly devoted to the analysis of existential propositions, to the excluded middle and to the principle of contradiction.

At the outbreak of World War I, Poland found itself in the midst of the conflict, prompting Leśniewski’s decision to return to Russia. There, he took up teaching positions in Polish schools located in Moscow. It was during this period that he published his initial analysis of Russell’s paradox (1914) and formulated the first version of his Mereology (1916). Leśniewski’s Mereology is a theory of parts and wholes. It introduces the notion of collective class, a concrete notion of class elaborated by Leśniewski directly against Cantor’s sets, Frege’s extensions of concepts and Russell’s and Whitehead’s classes as incomplete symbols. Constituting the initial phase of his work, all the papers from 1911 to 1916 were characterized by an informal style, almost devoid of symbolic notation.

With the advent of the Bolshevik Revolution, Leśniewski departed Russia and permanently settled in Poland. After an initial attempt to obtain his habilitation in Lvov, he eventually attained it in 1918 at the University of Warsaw. During the years 1919-1921, Leśniewski played a role as a code breaker in Poland’s efforts to thwart the Red Army’s advance on the newly independent nation. By the war’s end, Warsaw University had emerged as a significant center for mathematics. In 1919, Leśniewski accepted a chair especially established for him, dedicated to the foundations of mathematics. Together with his colleague Jan Łukasiewicz, he co-founded the Warsaw School of Logic, which was to be the most important center for symbolic logic during the interwar period. Leśniewski and Łukasiewicz attracted exceptionally talented students, including the young Alfred Tarski, who would be Leśniewski’s sole doctoral student throughout his career.

From 1919 until his passing in 1939, Leśniewski consistently taught and refined his principal logical achievements: the three systems known as ‘Protothetic’ (a generalized version of propositional logic, encompassing quantification), ‘Ontology’ (a modern version of term logic), and ‘Mereology’. However, Leśniewski’s perfectionism hindered him from promptly publishing his results, as he insisted on attaining the utmost precision. His decision to employ logical formal tools stemmed from the desire to express his philosophical intuitions with exceptional rigor. He was dissatisfied with the prevailing logical works of his time and found Whitehead’s and Russell’s Principia Mathematica in particular to be lacking the requisite precision and rigor. Frege’s work was closer to his methodological requirements, although Leśniewski criticized Frege’s Platonist leanings and regarded Frege’s logic as overly influenced by mathematical objectives. Leśniewski endeavored to establish an organon wherein principles were not adopted to ensure a consistent account of mathematics, but rather to faithfully express our general logical intuitions. Despite his exacting standards, which often left him dissatisfied with his own output, he resumed publishing from 1927 onward. Notably, he authored a series of eleven papers titled On the Foundations of Mathematics. During this phase of publication, Leśniewski abandoned the informal style of his earlier writings in favor of a formal discussion of his three systems.

Tragically, he died of thyroid cancer on May 13, 1939, aged 53. He left behind a substantial collection of notes and manuscripts entrusted to his pupil Bolesław Sobociński. Regrettably, this material was lost during the Nazi destruction of Warsaw in 1944. Since its publication in 1991, the English edition of his Collected Works has been the primary means of access to Leśniewski’s work. Furthermore, there is a volume of lecture notes from Leśniewski’s students compiled and published in English in 1988. Testimonies and reconstructions provided by members of the Warsaw School, notably Bolesław Sobociński, Jerzy Słupecki, Czesław Lejewski, and to a lesser extent, Alfred Tarski, shed light on the lost aspects of his oeuvre. Comprehensive presentations of Leśniewski’s work can be found in works by Luschei (1962), Miéville (1984), Urbaniak (2014), and in a series of six special issues of the Swiss journal Travaux de Logique (2001-2009). Additionally, significant articles on Leśniewski’s systems are featured in collections edited by Srzednicki and Rickey (1984), Miéville and Vernant (1996), and Srzednicki and Stachniak (1998). Rickey’s annotated bibliography, available online, offers a comprehensive reference guide to Leśniewski’s work and related topics.

2. Logical Systems

Becoming skeptical about expressing his research in natural language, as he had done in his early writings, Leśniewski was persuaded in the early 1920s to adopt a symbolic language, despite his reservations toward the symbolic logic works of his time. Consequently, he chose to present his Mereology using a new symbolic language that aligned with his linguistic intuitions. By 1916, Mereology had already been axiomatized, albeit using expressions like ‘A is B’, ‘A is a part of B’ or ‘If an object A is a, then there exists an object B which is a class of the objects a’, expressions which, as Leśniewski recognized, lack precision. He then embarked on constructing a logical calculus capable of incorporating the specific terms of Mereology, such as ‘class’ and ‘part’. This calculus was designed to make explicit his interpretation of the copula ‘is’. Initially, Leśniewski focused on the analysis of singular propositions of the form ‘a is b’, which he symbolized as ‘a ε b’. This emphasis on the copula ‘is’ led Leśniewski to name his system ‘Ontology’. He believed that he could express all the intended meanings using only propositions of the form ‘a ε b’, along with a general logical framework incorporating propositional logic and quantification theory. Ontology thus emerged as a term logic grounded in a more fundamental calculus that Leśniewski called ‘Protothetic’ (literally, the theory of first theses). Protothetic is the most basic system, with Ontology and Mereology being subsequent expansions of it, even though Leśniewski created the three systems in reverse order, starting from the applied theory of Mereology, progressing to the purely logical system of Ontology, and finally arriving at Protothetic. This underscores his use of formalization as a tool for the accurate expression of his intuitions. He did not adhere to a formalist conception of logic. The formalist idea of a pure syntax, subject to various subsequent interpretations, is completely foreign to Leśniewski. In his systems, all formulas are intended to be meaningful from the outset. The primitive constants do not get their value from axioms and rules; rather, it is the meaning of the primitive constants that makes the axioms true, and the rules correct.

As a nominalist, he rejected the existence of general entities, a stance that significantly influenced his conception of formal languages and systems. Leśniewski rejected abstract types of expressions and infinite sets of formulas provided from the outset by formal definitions. To him, a theorem within a system is the final entry in a tangible list of explicitly written inscriptions, with axioms preceding it and subsequent entries obtained through the application of explicit rules to previous inscriptions. This perspective views a logical system as a concrete complex of meaningful inscriptions, inherently situated within space and time. Each system is thus composed of a finite list of inscriptions, yet it remains open to the inscription of new theorems. One significant implication of this unusual conception of formal systems is the absence of a predetermined and definitive definition of what constitutes a well-formed formula. Leśniewski had to formulate his rules in a way that ensured both logical validity and grammatical conformity. This nominalist approach to formal syntax empowered Leśniewski to develop a logic where the characterization of the rules reached an extraordinary level of precision. However, it is worth emphasizing that adopting his logic does not require endorsing his nominalist convictions. His systems are equally suitable for reasoning about both concrete and abstract entities.

a. Protothetic (Propositional Logic)

The distinctiveness of Protothetic emerges when contrasted with a more usual deductive system for propositional logic, such as the following:

System L

(1) Formal language of L

Let A = {p,q,r,…} ∪ {⊃,~} ∪ {( , )} be the set of symbols.

Let F be the set of formulas, defined as the smallest set E such that

(i) {p,q,r,…} ⊂ E

(ii) If α,β ∈ E, then ~α ∈ E and (α ⊃ β) ∈ E

(2) Axioms of L

AxL1: (p ⊃ (q ⊃ p))

AxL2: ((p ⊃ (q ⊃ r)) ⊃ ((p ⊃ q) ⊃ (p ⊃ r)))

AxL3: ((~p ⊃ ~q) ⊃ ((~p ⊃ q) ⊃ p))

(3) Rules of inference of L

Modus ponens

Substitution

Such a system is closed, in the sense that the sets of symbols and formulas are given once and for all, and the set of theorems is fully determined from the outset as the closure of the set of axioms under the rules of inference. It is known to be an adequate axiomatization of the classical bivalent propositional calculus, possessing essential properties such as soundness, consistency, and completeness. Moreover, its concise set of connectives (two symbols interpreted as negation and conditional) is adequate for the expression of all bivalent truth-functions. This last feature allows for the introduction of additional connectives through definition, for example:

Conjunction: α ∧ β ≝ ~(α ⊃ ~β)

Disjunction: α ∨ β ≝ ~α ⊃ β

Biconditional: α ≡ β ≝ (~α ⊃ β) ⊃ ~(α ⊃ ~β)
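That the three definitions above capture the intended truth functions can be checked by brute force. The following is a minimal sketch in Python, assuming the standard bivalent reading of ⊃ and ~; the helper names (`imp`, `neg`, `conj`, `disj`, `bicond`) are illustrative and are not part of System L.

```python
from itertools import product

def imp(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

def neg(a):
    """Classical negation."""
    return not a

# The three definitions from the text, written with ~ and ⊃ only.
def conj(a, b):
    return neg(imp(a, neg(b)))

def disj(a, b):
    return imp(neg(a), b)

def bicond(a, b):
    return imp(imp(neg(a), b), neg(imp(a, neg(b))))

# Compare each defined connective with the usual truth function on all four rows.
definitions_ok = all(
    conj(a, b) == (a and b)
    and disj(a, b) == (a or b)
    and bicond(a, b) == (a == b)
    for a, b in product([True, False], repeat=2)
)
print(definitions_ok)
```

The check is purely semantic: it says nothing about the status of the defined symbols within the formal language of L, which is the issue discussed next.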

Although it seems that the sets of formulas and theorems of L can be extended to certain expressions containing the defined connectives, these new expressions are not official formulas of the system L. We can use them in our proofs and deductions, but only as convenient metalinguistic abbreviations of official formulas. For Leśniewski, a system involving only a few primitive constants, in which defined constants have no status and can only be used in the metalanguage, was unacceptable. In his view, a complete system for propositional logic should make it possible to express any truth-functional meaning with an official constant of its own formal language. In the standard perspective, if we want a system with the defined constants as official ones, we must proceed to the construction of a suitable expansion L* of L. For that purpose, the sets of symbols and formulas must be enlarged, and the set of axioms must also be completed with suitable additional axioms. This must be done while preserving soundness, consistency, and completeness in L*. In principle, our former definitions should be used to provide the expected additional axioms, but additional axioms must be formulas of the object language of L*. There are thus two obvious reasons why these definitions cannot be taken as axioms in their current form. Firstly, they are schematic expressions, so the metavariables they contain must be replaced by object language variables. Secondly, they contain the special definition symbol ‘≝’, which is used in order to stipulate a logical equivalence between the definiendum and the definiens. Transforming the definitions into axioms requires the use of object language symbols able to express the same equivalence between the two constituents of the expressions. An obvious solution is the use of biconditional formulas. For instance, the additional axiom devoted to conjunction would then be:

(p ∧ q) ≡ ~(p ⊃ ~q)

However, for the introduction of biconditional itself, this solution would obviously be circular. Another solution for the construction of the expansion L* would be to replace the definition sign by the exclusive use of the primitive constants of the original system L. Instead of adding a single biconditional axiom, Tarski suggested adding a pair of conditional expressions. In the case of conjunction, we would then have to add this pair of axioms:

(p ∧ q) ⊃ ~(p ⊃ ~q)

~(p ⊃ ~q) ⊃ (p ∧ q)

This is perfectly suitable and would also be convenient for the introduction of the biconditional. But Leśniewski was more demanding. For him, the most natural solution for the introduction of a new constant was by way of a single biconditional expression. He then turned to the idea that an initial system for propositional logic must involve the biconditional among its primitive constants. Since he attached great importance to the question of parsimony, he sought to elaborate an initial system for his future Protothetic containing the biconditional as the only primitive propositional connective.

i. A Quantified Propositional Logic

In the early 1920s, it was already known that in classical logic the Sheffer stroke, like its dual connective, could serve as the unique primitive connective to express all the truth functions. Moreover, Leśniewski also knew another result which was published by Russell in his Principles of Mathematics (1903). Russell indeed showed in this early work that it is possible to conceive a complete system for propositional logic with conditional as the single primitive connective, provided that the propositional variables could be universally quantified. His definition of negation is the following: “not-p is equivalent to the assertion that p implies all propositions” (1903: 18). We can express this definition by the symbolic expression:

~p ≝ p ⊃ (∀r)r

Leśniewski knew that a similar solution holds for the definition of negation in terms of the biconditional:

~p ≝ p ≡ (∀r)r
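Both definitions of negation rest on the fact that, under a bivalent reading, (∀r)r is false. A minimal Python sketch of this observation follows; the function `forall_prop` and the other names are illustrative assumptions, and the quantifier is simply read as "for both truth values".

```python
BOOLS = (True, False)

def forall_prop(body):
    """(∀r)body(r), read bivalently: body must hold for both truth values."""
    return all(body(r) for r in BOOLS)

def imp(a, b):
    return (not a) or b

def iff(a, b):
    return a == b

# (∀r)r is false, because r itself fails when r is false.
absurd = forall_prop(lambda r: r)

def neg_via_imp(p):
    """Russell's definition: ~p as p ⊃ (∀r)r."""
    return imp(p, absurd)

def neg_via_iff(p):
    """The biconditional variant: ~p as p ≡ (∀r)r."""
    return iff(p, absurd)

both_agree = all(
    neg_via_imp(p) == (not p) and neg_via_iff(p) == (not p) for p in BOOLS
)
print(absurd, both_agree)
```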

However, with the biconditional solution, difficulties remain for the expression of other connectives. For example, conjunction and disjunction are not expressible by a simple combination of biconditional and negation. A brilliant solution to this issue was discovered by the young Alfred Tarski. In 1923, he established in his PhD thesis, written under the supervision of Leśniewski, that a quantified system of propositional logic with the biconditional as its single primitive connective allows the expression of all truth functions. In the introduction to this work, Tarski set out the issue, and its relation to his adviser’s project is clear:

The problem of which I here offer a solution […] seems to me to be interesting for the following reason. We know that it is possible to construct the system of logistic by means of a single primitive term, employing for this purpose either the sign of implication [conditional], if we wish to follow the example of Russell, or by making use of the idea of Sheffer, who adopts as the primitive term the sign of incompatibility, especially introduced for this purpose. Now in order to really attain our goal, it is necessary to guard against the entry of any constant special term into the wording of the definitions involved, if this special term is at the same time distinct from the primitive term adopted, from terms previously defined, and from the term to be defined. The sign of equivalence [biconditional], if we employ it as our primitive term, presents from this standpoint the advantage that it permits to observe the above rule quite strictly and at the same time to give to our definitions a form as natural as it is convenient, that is to say the form of equivalences.

The theorem which is proved in §1 of this article,

(∀pq)((p ∧ q) ≡ (∀f)(p ≡ ((∀r)(p ≡ f(r)) ≡ (∀r)(q ≡ f(r))))) [modified notation]

constitutes a positive answer to the question raised above. In fact, it can serve as a definition of the symbol of logical product [conjunction] in terms of the equivalence symbol and the universal quantifier; and as soon as we are able to use the symbol of logical product, the definitions of other terms of logistic do not present any difficulty, […] (Tarski, 1923: pp. 1-2).
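Tarski's theorem can be checked semantically by letting the variable f range over the four unary truth functions. The sketch below assumes the bivalent reading of the quantifiers; the representation of truth functions as dictionaries and the name `tarski_conj` are choices made here for illustration only.

```python
from itertools import product

BOOLS = (True, False)

# The four unary truth functions, each as a dict from argument to value.
UNARY = [dict(zip(BOOLS, vals)) for vals in product(BOOLS, repeat=2)]

def iff(a, b):
    return a == b

def tarski_conj(p, q):
    """Right-hand side: (∀f)(p ≡ ((∀r)(p ≡ f(r)) ≡ (∀r)(q ≡ f(r))))."""
    return all(
        iff(p, iff(all(iff(p, f[r]) for r in BOOLS),
                   all(iff(q, f[r]) for r in BOOLS)))
        for f in UNARY
    )

# Truth table of the quantified biconditional expression.
table = {(p, q): tarski_conj(p, q) for p, q in product(BOOLS, repeat=2)}
matches_conjunction = all(table[(p, q)] == (p and q) for (p, q) in table)
print(matches_conjunction)
```

The expression is true only when both p and q are true, which is the truth table of conjunction, even though only the biconditional and the universal quantifier occur in it.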

This result by Tarski was a cornerstone of the future Protothetic. But it was not sufficient to overcome all the obstacles. The complete biconditional fragment of propositional logic was already known in the Warsaw School, but using Tarski’s solution required a version of this fragment allowing quantifiers to bind not only propositional variables, but also variables for propositional connectives. For the axiomatization of this extended biconditional fragment, Leśniewski’s idea was to work first on the basis of the universal closure of two axioms known to form a good basis for the unextended fragment:

AxP1: (∀pqr)(((p ≡ r) ≡ (q ≡ p)) ≡ (r ≡ q))

AxP2: (∀pqr)((p ≡ (q ≡ r)) ≡ ((p ≡ q) ≡ r))

As for the inference rules, they had to include a detachment rule for biconditional expressions, as well as two rules for taking advantage of quantified expressions:

Detachment Rule (Det): Φ ≡ Ψ, Φ ⊢ Ψ

Substitution Rule (Sub): (∀α1 α2⋯αn)Φ ⊢ (∀β1⋯βm α2⋯αn)Φ[α1 / Ψ(β1⋯βm)]

Distribution Rule (Dis): (∀α1 α2⋯αn)(Φ ≡ Ψ) ⊢ (∀α2⋯αn)((∀α1)Φ ≡ (∀α1)Ψ)

Without going into a detailed and rigorous characterization of the system S1 based on the above described axioms and rules, let us consider as an illustration how to prove a few theorems:

Theorems        Justifications
AxP1: (∀pqr)(((p ≡ r) ≡ (q ≡ p)) ≡ (r ≡ q))        Ax
AxP2: (∀pqr)((p ≡ (q ≡ r)) ≡ ((p ≡ q) ≡ r))        Ax
P1: (∀pqr)(((p ≡ r) ≡ ((p ≡ q) ≡ p)) ≡ (r ≡ (p ≡ q)))        AxP1, Sub, q⁄p ≡ q
P2: (∀pq)(((p ≡ (q ≡ p)) ≡ ((p ≡ q) ≡ p)) ≡ ((q ≡ p) ≡ (p ≡ q)))        P1, Sub, r⁄q ≡ p
P3: (∀p)((∀q)((p ≡ (q ≡ p)) ≡ ((p ≡ q) ≡ p)) ≡ (∀q)((q ≡ p) ≡ (p ≡ q)))        P2, Dis, q
P4: (∀pq)((p ≡ (q ≡ p)) ≡ ((p ≡ q) ≡ p)) ≡ (∀pq)((q ≡ p) ≡ (p ≡ q))        P3, Dis, p
P5: (∀pq)((p ≡ (q ≡ p)) ≡ ((p ≡ q) ≡ p))        AxP2, Sub, r⁄p
P6: (∀pq)((q ≡ p) ≡ (p ≡ q))        P4, P5, Det
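The derivation of P1 to P6 can be replayed mechanically. The following Python sketch is a simplified model of the three rules and not Leśniewski's own formulation: formulas are nested tuples, Sub is allowed to act on any variable of the initial quantifier, no check against variable capture is made, and quantifier prefixes are kept sorted so that formulas differing only in the order of variables within one quantifier compare equal. All function names are illustrative.

```python
# Formulas: a variable is a str, ('≡', A, B) is a biconditional,
# and ('∀', ('p', 'q'), body) is a quantified formula.

def norm(f):
    """Merge directly nested quantifiers, drop empty prefixes, sort each prefix."""
    if isinstance(f, str):
        return f
    if f[0] == '≡':
        return ('≡', norm(f[1]), norm(f[2]))
    _, variables, body = f
    body = norm(body)
    if not isinstance(body, str) and body[0] == '∀':
        variables, body = tuple(variables) + body[1], body[2]
    if not variables:
        return body
    return ('∀', tuple(sorted(set(variables))), body)

def replace(f, var, repl):
    """Replace free occurrences of var by repl (no capture check: a simplification)."""
    if isinstance(f, str):
        return repl if f == var else f
    if f[0] == '≡':
        return ('≡', replace(f[1], var, repl), replace(f[2], var, repl))
    if var in f[1]:
        return f
    return ('∀', f[1], replace(f[2], var, repl))

def sub(theorem, var, repl, new_vars=()):
    """Sub: instantiate the quantified variable var by repl, binding new_vars in front."""
    assert theorem[0] == '∀' and var in theorem[1]
    rest = tuple(v for v in theorem[1] if v != var)
    return norm(('∀', tuple(new_vars) + rest, replace(theorem[2], var, repl)))

def dis(theorem, var):
    """Dis: distribute the quantifier binding var over a quantified biconditional."""
    assert theorem[0] == '∀' and var in theorem[1] and theorem[2][0] == '≡'
    rest = tuple(v for v in theorem[1] if v != var)
    _, left, right = theorem[2]
    return norm(('∀', rest, ('≡', ('∀', (var,), left), ('∀', (var,), right))))

def det(major, minor):
    """Det: from Φ ≡ Ψ and Φ, infer Ψ."""
    assert major[0] == '≡' and major[1] == minor
    return major[2]

def eq(a, b):
    return ('≡', a, b)

def show(f):
    """Render a formula in the notation used in the text."""
    if isinstance(f, str):
        return f
    if f[0] == '≡':
        return f"({show(f[1])} ≡ {show(f[2])})"
    return f"(∀{''.join(f[1])}){show(f[2])}"

ax_p1 = ('∀', ('p', 'q', 'r'), eq(eq(eq('p', 'r'), eq('q', 'p')), eq('r', 'q')))
ax_p2 = ('∀', ('p', 'q', 'r'), eq(eq('p', eq('q', 'r')), eq(eq('p', 'q'), 'r')))

p1 = sub(ax_p1, 'q', eq('p', 'q'), new_vars=('p', 'q'))  # AxP1, Sub, q/(p ≡ q)
p2 = sub(p1, 'r', eq('q', 'p'), new_vars=('p', 'q'))     # P1, Sub, r/(q ≡ p)
p3 = dis(p2, 'q')                                        # P2, Dis, q
p4 = dis(p3, 'p')                                        # P3, Dis, p
p5 = sub(ax_p2, 'r', 'p', new_vars=('p',))               # AxP2, Sub, r/p
p6 = det(p4, p5)                                         # P4, P5, Det

print(show(p6))  # (∀pq)((q ≡ p) ≡ (p ≡ q))
```

Note that the detachment in the last step only succeeds because the left-hand side of P4 is literally the same expression as P5, which is why the second axiom must be taken in the form (p ≡ (q ≡ r)) ≡ ((p ≡ q) ≡ r).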

These few examples show how the rules apply in S1. Sub and Dis are formulated so that the quantifiers always remain saturated, which ensures that there are never free variables in theorems. With a bivalent interpretation of the variables, the standard truth-table for the single connective, and the quantifier understood as expressing “whatever the values of …”, it is easy to show that S1 is sound. Moreover, the closure of any biconditional tautology is provable in this system. Nevertheless, S1 is not complete. For instance, the following formulas, which are obviously valid in the intended interpretation, remain unprovable:

(∀p)p ≡ (∀r)r        (not provable in S1)
(∀p)((∀q)(p ≡ q) ≡ (∀r)(p ≡ r))        (not provable in S1)
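The validity of these two formulas in the intended interpretation can be confirmed with a small evaluator. The sketch below is illustrative only: it assumes the bivalent reading of the quantifier and a tuple encoding of formulas chosen here for convenience. It establishes validity, not unprovability; the latter is a claim about the rules of S1 and is not something a truth-value computation can show.

```python
BOOLS = (True, False)

def value(formula, env=None):
    """Truth value of a formula under the bivalent reading.

    A variable is a str, ('≡', A, B) is a biconditional,
    and ('∀', 'p', body) binds the single variable p.
    """
    env = env or {}
    if isinstance(formula, str):
        return env[formula]
    if formula[0] == '≡':
        return value(formula[1], env) == value(formula[2], env)
    _, var, body = formula
    return all(value(body, {**env, var: b}) for b in BOOLS)

# (∀p)p ≡ (∀r)r
f1 = ('≡', ('∀', 'p', 'p'), ('∀', 'r', 'r'))

# (∀p)((∀q)(p ≡ q) ≡ (∀r)(p ≡ r))
f2 = ('∀', 'p', ('≡', ('∀', 'q', ('≡', 'p', 'q')),
                      ('∀', 'r', ('≡', 'p', 'r'))))

print(value(f1), value(f2))
```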

Before examining this limitation, let us explore how Leśniewski reinforced the system by introducing a new rule for the introduction of definitions. The resulting system, S2, features a formal language that can be expanded step by step through the official admission of defined constants. Within such a system, the notions of formula and theorem are no longer absolute ones. They become relative to what will be called ‘the developments of the system’. A development is a finite ordered sequence of explicitly written expressions, which are the theorems of that development. The first theorems of a development are necessarily the axioms. Every further theorem must have been explicitly written, applying one of the inference rules on previously written theorems. As a result, each time we write a new theorem, we get a new development. The above written sequence of theorems of S1 is a development in S2 (say the development P6, using the label of its last theorem). We can now get new developments by writing for example the following additional theorems:

P7: (∀p)(((p ≡ p) ≡ (p ≡ p)) ≡ (p ≡ p))        AxP1, Sub, q⁄p, r⁄p
P8: (∀p)((p ≡ p) ≡ (p ≡ p)) ≡ (∀p)(p ≡ p)        P7, Dis, p
P9: (∀p)((p ≡ p) ≡ (p ≡ p))        P6, Sub, q⁄p
P10: (∀p)(p ≡ p)        P8, P9, Det
P11: (∀r)r ≡ (∀r)r        P10, Sub, p⁄(∀r)r

At first glance, developments seem to be exactly like proofs, but there are important differences. First, there are only developments that have been explicitly written. Developments are indeed concrete objects. Moreover, every time a definition is stated, the language available in further developments is increased. As a result, it is possible that a theorem in a certain development is not even a well-formed formula in another one.

ii. Definition as a Rule

Now let us consider how a new theorem can be written in S2 by applying the additional rule for stating definitions:

Definition Rule (Def-S2):

 

In a given development Pn, an expression D can be written as theorem Pn+1 with the rule Def-S2 if and only if D is a closed biconditional formula of the form:

(∀α1⋯αn)(Dum ≡ Diens)      or      Dum ≡ Diens (when n = 0)

where:

1. α1, …, αn are different variables belonging to categories already available in the development Pn;

2. the expression Dum (the definiendum) is of the form c(α1⋯αn) or c (when n = 0), c being a new constant symbol (not already present in the development Pn);

3. the expression Diens (the definiens) is a formula well formed in accordance with categories and syntactic contexts available in the development Pn;

4. the expressions Dum and Diens have exactly the same free variables (if any).

 

This formulation is not fully rigorous, as it refers to categories and syntactic contexts available in a certain development. It serves here as a suggestive presentation, summarizing the long and meticulous explanations Leśniewski provided in order to specify precisely the conditions under which an expression can be written as resulting from the application of the definition rule. Without going into this formal precision here, let us instead examine how new developments can be written in S2.

 

P12:                                                    Def-S2

P13:  P10, Sub,

P14:                                                                                    P12, P13, Det

Theorem P12 is an example of definition in which the definiendum has no variable. It introduces the first propositional constant (constant of category S). As the definiens can be shown to be a theorem (P13), the new constant can be written as a theorem (P14) and can be understood as the constant true. Now consider how the constant false can also be introduced, using for its definiens an explosive expression (an expression from which every available formula would be derivable by Sub).

P15:                                                      Def-S2

P16:   P10, Sub,

P17:                           Def-S2

P18:                                        P17, Sub,

P19:                                                                P16, P18, Det

 

Using the newly defined constant false, P17 introduces classical negation by definition. It is worth noting that this definition also introduces for the first time the category of unary connectives (or the category labeled S/S, that is, the category of functors taking a unique sentence as their argument and resulting in a sentence). P19 expresses that the negation of false is a theorem. Now come three definitions of binary connectives (category S/SS, that is of functors which give a sentence from two sentences):

 

P20: Def-S2
P21: Def-S2
P22: Def-S2

It is worth noting that none of these definitions could have been formulated without the prior definition of negation in P17. This is obvious with P20 and P22 which explicitly include negation in their definiens. In the case of P21 (Tarski’s definition of conjunction), negation is not specifically needed. However, the definiens of P21 involves bound variables for unary connectives. As a principle of constructing developments, the use of variables from a specific category (in this case, S/S) is permissible only if this category either is already included in the axioms or has been introduced through a preceding definition.

 

P23: Def-S2
P24: P23, Sub, f/≡
P25: P6, P24, Det

Definition P23 again introduces a constant of a new category: S/(S/SS). Theorem P25 expresses that the biconditional is a commutative binary connective.

 

Although the definitional machinery is powerful in S2, it still has a limitation that Leśniewski wanted to overcome. It is indeed impossible to define in S2 operations on connectives, such as “the dual of …” or “the composition of … and …”. All the categories that can be introduced in S2 give a result of category S. In order to reach more complex categories, the definition rule has to be reinforced. Let us call ‘S3’ the system in which the definition rule is modified as follows:

 

Definition Rule (Def-Proto):

 

The rule is like Def-S2, except for condition 2, which is replaced by the following one:

 

2′. the expression Dum is of the form c(α1⋯αi)(αi+1⋯αj)⋯(αk⋯αn) or c (in case n = 0), c being a new constant symbol (not already present in the development Pn).

 

The only difference in this new version is that variables in Dum can be distributed in several successive pairs of brackets. Let us look at two examples:

 

P26:  Def-Proto

P27:  Def-Proto

Definition P26 introduces the operation which gives the dual of a binary connective. The new constant is of category (S/SS)/(S/SS). P27 introduces the composition or logical product of two binary connectives. The category of the defined constant is then (S/SS)/(S/SS)(S/SS). When the definiendum has more than one pair of brackets, the result of the application of the new functor is again a functor. The numerator of its category index is itself a fraction, so that the introduced constant is a functor-forming functor (or a many-link functor), which was not possible to define with Def-S2.
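The idea of a functor-forming functor can be illustrated with higher-order functions. The Python sketch below rests on assumptions, since it does not reproduce the formulas of P26 and P27: the dual of a connective f is taken here to be the connective that maps p, q to ~f(~p, ~q), and the logical product of f and g to be the connective that maps p, q to f(p, q) ∧ g(p, q). The names `dual`, `compose` and `same` are illustrative.

```python
from itertools import product

BOOLS = (True, False)

def dual(f):
    """Assumed reading of 'the dual of f': dual(f)(p, q) = ~f(~p, ~q)."""
    return lambda p, q: not f(not p, not q)

def compose(f, g):
    """Assumed reading of the logical product: compose(f, g)(p, q) = f(p, q) ∧ g(p, q)."""
    return lambda p, q: f(p, q) and g(p, q)

def same(f, g):
    """Extensional equality of two binary connectives."""
    return all(f(p, q) == g(p, q) for p, q in product(BOOLS, repeat=2))

conj = lambda p, q: p and q
disj = lambda p, q: p or q
imp = lambda p, q: (not p) or q
conv_imp = lambda p, q: imp(q, p)
iff = lambda p, q: p == q

# Two successive pairs of brackets: dual(conj) is itself a binary connective,
# which is then applied to a pair of sentences.
print(dual(conj)(False, True))
print(same(dual(conj), disj), same(compose(imp, conv_imp), iff))
```

An application such as `dual(conj)(False, True)` mirrors the two successive pairs of brackets in the definiendum: the first application yields a connective of category S/SS, and the second yields a sentence.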

 

These few examples sufficiently show how powerful the definition machinery can be in S2 and S3. Nevertheless, we must go back here to the limitations of S1. Let us remember that valid formulas such as the following were not provable in this system:

 

     (not provable in S1) 

Inevitably, these limitations also affect S2 and S3. Leśniewski understood early that systems like S1–S3 suffered from too weak a characterization of quantification. In the early 1920s, he realized that this weakness could be overcome if the axiomatic basis explicitly enforced propositional bivalency and extensionality.

iii. Bivalency and Extensionality

In a quantified system of propositional logic, propositional bivalency and extensionality can be expressed by the following formulas:

 

Bivalency for category S:

(Something holds for all propositions iff it holds for true and false)

 

Extensionality for category S:

(Two propositions are equivalent iff everything that holds for one, holds for the other)
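Read semantically, the two glosses can be checked over the two truth values and the four unary truth functions. The Python sketch below assumes one natural rendering of them, given in the comments; these renderings are reconstructions from the glosses and may differ in detail from the formulas of the text. In the two-valued model both principles hold trivially; what matters for Protothetic is that they be provable from the axiomatic basis.

```python
from itertools import product

BOOLS = (True, False)

# The four unary truth functions, each as a dict from argument to value.
UNARY = [dict(zip(BOOLS, vals)) for vals in product(BOOLS, repeat=2)]

# Bivalency for S, assumed rendering:
#   (∀f)((∀p)f(p) ≡ (f(true) ∧ f(false)))
bivalency = all(
    all(f[p] for p in BOOLS) == (f[True] and f[False]) for f in UNARY
)

# Extensionality for S, assumed rendering:
#   (∀pq)((p ≡ q) ≡ (∀f)(f(p) ≡ f(q)))
extensionality = all(
    (p == q) == all(f[p] == f[q] for f in UNARY)
    for p, q in product(BOOLS, repeat=2)
)

print(bivalency, extensionality)
```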

 

Leśniewski wanted these formulas to be provable in his Protothetic. In 1922, he was able, with Tarski, to establish that in a system with all the usual laws of quantifiers, these two formulas were equivalent. Subsequently, Leśniewski found that in a system like S3, assuming only bivalency was sufficient to reinforce quantification adequately and thereby achieve extensionality for S. However, he could not simply adopt the formula for bivalency as an additional axiom. He had first to eliminate the defined terms in the formula. This could be done by applying the following transformations:

 

 

(bivalency, with change of letters)

 

(commutation)

 

(elimination of and )

 (elimination of )

 

In order to avoid the introduction of an additional category in the axiomatic basis, he still had to transform the resulting formula, using variables for binary connectives instead of unary ones. He thus arrived at the following third axiom, the addition of which strengthened quantification and made it possible to derive both bivalency and extensionality for the category S of sentences:

 

AxP3: {

}

 

The system S4, based on the three axioms AxP1-AxP3 and the four rules Det, Sub, Dis, and Def-Proto, is strong enough to reach at least a full classical calculus of all possible truth-functional unary and binary connectives. But Leśniewski still did not consider a system like S4 to be satisfactory. He wanted extensionality formulas to be provable not only for sentences, but also for all the categories that could potentially be introduced by definitions. In other words, he wanted his axiomatic basis to enforce extensionality for all the potentially definable functors (not only connectives or functors with propositional arguments, but also functors whose arguments are functors, such as those introduced by definitions P26 and P27). This goal could not be achieved by once again adding axioms. An infinity of axioms would have been necessary and each of them would have required specific categories for its formulation.

 

Leśniewski’s solution was to add a fifth rule of inference. In a given development Pn, the Rule of Extensionality (Ext) allows one to write a new theorem expressing extensionality for a category C, provided the development Pn already contains a definition of a constant of category C as well as one constant of category S/C. A general description of this rule would be too long here. It will only be illustrated by a couple of examples.

 

As a first example, definition P20 introduces a constant of category S/SS and P23 a constant of category S/(S/SS). The rule Ext then allows one to write as a new theorem the following formula expressing extensionality for the category S/SS:

 

P28: 

P20, P23, Ext

Definition P26 gives us our second example. It introduces a constant of category (S/SS)/(S/SS). However, in order to apply Ext for this category, we still need to introduce a definition of a constant of category S/((S/SS)/(S/SS)). The following definition would be adequate for that purpose:

 

P29: 

Def-Proto

Now the conditions are satisfied to get by Ext an extensionality theorem for (S/SS)/(S/SS):

 

P30: 

P26, P29, Ext

Leśniewski’s full Protothetic is the system based on the three axioms AxP1-AxP3 and the five inference rules Det, Sub, Dis, Def-Proto, and Ext. Leśniewski labeled this version of his Protothetic Ϭ5. To get an insight into the expressive power of Ϭ5, consider a few theorems expressing important properties of the category of unary connectives (these theorems are presented here without proof):

 

P31: 

 This theorem expresses extensionality for the category S/S.

     P32: 

This is known as the law of development for category S/S.

P33: 

                                 

This theorem is known as the law of the number of functions for the category S/S, the four constants occurring in the formula (‘~’, ‘Ass’, ‘Fal’, and ‘Ver’) being the four non-equivalent constants that can be defined for the four unary truth functions.


Leśniewski showed that for every category to be introduced in the language, it is always possible to construct a development involving theorems analogous to P31-33 and to determine precisely which and how many non-equivalent constants can be defined in this category. The main interest of that result is that it is always possible, for any category, to eliminate from expressions quantifiers binding variables of that category. In the case of S, the theorem for bivalency expresses this fact. In the case of S/S, it is the following theorem, which could be called ‘quadrivalency of S/S’:

 

P34: 

On the basis of P34 and analogous results for other categories (for example, theorems of 16-valency for S/SS, of 2¹⁶-valency for S/(S/SS), and so on), it is always possible to make explicit the precise meaning of a quantified expression by a finite process. As Luschei wrote, “Protothetic is Leśniewski’s indefinitely extensible logic of propositions, connectors, connector-forming functors, higher-level functor-forming functors—indeed of constants and variables of any semantic category in the unbounded hierarchies constructible on the basis of propositional expressions” (1962: 143).
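The number of non-equivalent constants in a category of Protothetic follows from its index alone: a functor with a given result category and given argument categories has as many possible extensions as there are functions from the argument values to the result values. The following Python sketch computes these numbers; the encoding of categories as nested pairs and the name `size` are illustrative assumptions.

```python
from math import prod

# A category is 'S' or a pair (result_category, [argument_categories]).
def size(cat):
    """Number of extensionally distinct values of a category built on S alone."""
    if cat == 'S':
        return 2
    result, args = cat
    return size(result) ** prod(size(a) for a in args)

S = 'S'
S_S = (S, [S])                 # unary connectives, S/S
S_SS = (S, [S, S])             # binary connectives, S/SS
S_OF_S_SS = (S, [S_SS])        # S/(S/SS), the category of Com
DUAL_CAT = (S_SS, [S_SS])      # (S/SS)/(S/SS), the category of the dual operation

print(size(S_S))        # 4
print(size(S_SS))       # 16
print(size(S_OF_S_SS))  # 65536
print(size(DUAL_CAT))   # 18446744073709551616
```

The first three values correspond to the quadrivalency of S/S, the 16-valency of S/SS and the 2¹⁶-valency of S/(S/SS) mentioned above.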

 

The question of the completeness of Protothetic has also been discussed by Leśniewski and his followers. Leśniewski considered full Protothetic to be strongly complete (which means that if α is a closed well-formed formula of a given development, then either α or its negation is provable from that development), even though he did not have the time to give a demonstration of that result. Słupecki (1953) gave a partial demonstration of the strong completeness of the large sub-system of Protothetic where only functors of sentence-forming categories are available.

 

In 1926, Leśniewski discovered that his Protothetic could be based on a single biconditional axiom. Sobociński was able to improve Leśniewski’s result by working out the following single axiom, which is the shortest known one:

 

ShortAxP:  

 

iv. Semantic Categories and Contextual Syntax

In the language of Protothetic, as in other interpreted systems of logic, symbols and expressions divide into different mutually disjoint types or categories according to their syntactic role and the way they contribute to the meaning of the formulas in which they occur. In developing his notion of semantic category in the 1920s, Leśniewski drew inspiration from the traditional grammatical theory of parts of speech and from Husserl’s notion of Bedeutungskategorie. In fact, Leśniewski never gave an explicit theory of semantic categories, being content to use the notion in his logical constructions. Later popularized in an explicit theoretical formulation by Ajdukiewicz (1935), the notion of category has also been applied to natural languages, opening the development of categorial grammars (Bar-Hillel, Montague and Lambek being the most representative authors in this field). Ajdukiewicz introduced a convenient notation which permits the indication, through a simple index, of all that is characteristic of a certain category. A single letter is used for the index of a basic category. In Leśniewski’s languages there are only two basic categories: the category of propositions (labeled S) and the category of names (labeled N). In the propositional language of Protothetic, only the former is used. The latter will be added in the language of further theories, namely Ontology and Mereology. Naturally, languages also contain categories for different combining symbols or expressions which are called ‘functors’: connectives, operators, relators, predicates, and so on. All these combining expressions belong to derived categories. The category of a functor is determined by three pieces of information: (1) the number of the arguments it needs, (2) the respective categories of these arguments, and (3) the category of the whole generated by the application of the functor to its arguments. Ajdukiewicz’s notation gathers all this information in a single suggestive index. For example, the category of the biconditional connective is labeled S/SS, since it builds a proposition when it applies to two propositional arguments. The (S/SS)/(S/S)(S/SS) index would be that of the category of functors operating on a unary connective (S/S) on the one hand, a binary connective (S/SS) on the other hand, and generating a binary connective (S/SS). In his 1935 paper, Ajdukiewicz developed a procedure for showing grammatical well-formedness of expressions using this categorial notation and a rule for categorial simplification. However, Ajdukiewicz’s procedure requires the category of each of the signs in an expression to be known in advance. This was not possible in Protothetic. Owing to its evolving nature, the system does not fit with a conception of formal language in which all the required categorial information would have been determined from the outset. Obviously, no such language would have been rich enough for all the definitions that can be stated in the successive developments.
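The kind of check that presupposes a fixed assignment of categories can be sketched as follows. This Python fragment is written in the spirit of Ajdukiewicz's categorial simplification and is not his exact procedure: instead of cancelling indices in a string, it computes the category of an expression recursively. The lexicon, the sign 'op' and the function `category` are hypothetical and serve only as an illustration.

```python
# A category is 'S' or a pair (result, (arg1, ..., argn)).
S = 'S'
S_S = (S, (S,))
S_SS = (S, (S, S))

# Hypothetical lexicon: the category of every sign is fixed in advance.
LEXICON = {
    'p': S, 'q': S,
    '~': S_S,
    '≡': S_SS,
    'op': (S_SS, (S_S, S_SS)),   # index (S/SS)/(S/S)(S/SS) from the text
}

def category(expr):
    """Category of an expression, or None if it is ill formed.

    An expression is a sign (str) or a pair (functor_expr, [argument_exprs]).
    """
    if isinstance(expr, str):
        return LEXICON.get(expr)
    functor, args = expr
    fcat = category(functor)
    if fcat is None or fcat == S:
        return None                      # only functors can take arguments
    result, expected = fcat
    actual = tuple(category(a) for a in args)
    return result if actual == expected else None

well_formed = category(('≡', ['p', ('~', ['q'])]))        # ≡(p ~(q))
many_link = category((('op', ['~', '≡']), ['p', 'q']))     # op(~ ≡)(p q)
ill_formed = category(('~', ['p', 'q']))                   # ~ with two arguments

print(well_formed, many_link, ill_formed)
```

The dependence on `LEXICON` is the point of the illustration: such a check is only possible if every sign has been assigned its category beforehand, which is what the open-ended developments of Protothetic rule out.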

 

Leśniewski developed a new concept of formal syntax, often referred to as ‘inscriptional syntax’, though it is more aptly named ‘contextual syntax’. In Leśniewski’s syntax, the role and the category of a symbol are not indicated by its belonging to a certain previously established list of signs, but by the specific context in which it occurs. For such a syntax, Leśniewski needed a specific notation and an adequate way to ensure that his formulas are well formed. Consider for example the above given first axiom of Protothetic (written in a standard notation):

 

AxP1: (∀pqr)(((p ≡ r) ≡ (q ≡ p)) ≡ (r ≡ q))   (in standard notation)

The usual analysis of the grammaticality of such a formula goes recursively from simple constituents to more complex expressions: starting from the letters, which are in the list of symbols for propositional variables, we can get the expression by successive applications of the biconditional formation rule and one application of the quantifier formation rule. In a contextual syntax, on the contrary, it is the form of the complete expression that determines the nature and categories of its constituents. In Leśniewski’s notation, AxP1 would have been written as:

 

AxP1: ⌊pqr⌋⌈≡(≡(≡(pr)≡(qp))≡(rq))⌉   (in Leśniewski-style notation)

As in Łukasiewicz’s well-known notation, Leśniewski’s notation is a prefixed one: every functor is followed by the list of its arguments, but contrary to Łukasiewicz’s notation, parentheses are not removed. So instead of , we get . As for the quantifier, it is always indicated by the use of specialized lower and upper square brackets: instead of , we get . Like every theorem, the whole formula of AxP1 is an expression of the basic category S. Its general form is that of a quantified expression. This implies that the expression within the upper corner brackets also belongs to category S. This last expression is of the general form . As both positions within the round brackets are again occupied by expressions of the form , this means that is here the symbol of a functor of the category S/SS. By carrying out the analysis down to the last constituents of the formula, it can be determined that the letters p, q, r are here symbols of the category S, for they occur in the positions of arguments of the context . The construction of the system’s developments adheres to this principle: upon the initial introduction of a category, its associated context must remain exclusive to symbols or expressions within that category throughout subsequent developments, ensuring no use of symbols or expressions from any other category. In other words, parentheses of the same shape, delimiting the same number of arguments, must not be associated with functors of different categories. Parentheses are then no longer used to delimit the scope of functors, but to indicate their categories. Let us examine three examples extracted from the previously provided definitions (written here in Leśniewski-compliant versions):

 

P17: 

P23: ⌊f⌋⌈≡(Com⟨f⟩⌊pq⌋⌈≡(f(qp)f(pq))⌉)⌉

 

P27: 
 

Definition P17 is the first to introduce a unary functor S/S in the developments. It must then be associated with a new context. The analysis of the definiens shows that the letter p is here of category S, for it occurs in the first place of a context . Therefore, the new constant is of category S/S. The choice of for the new context is suitable and introduces no confusion, for it differs from  by the number of argument places.

Definition P23 introduces another unary functor. In the definiens, its argument f is of category S/SS (for f occurs just before a context). The category of the defined functor Com is then S/(S/SS). This category is a new one in the development. So, it must again be associated with a new context. This time round brackets are excluded, for such a choice would introduce ambiguity and would make indistinct the categories S/S and S/(S/SS).

Definition P27 is a more complex example. The defined constant applies to two arguments f and g of category S/SS (as it is clear in the definiens). The result of this application is the expression . But this expression, once again, applies to a pair of arguments, leading to the expression . The use of the context for the second application indicates that the expression is of category S/SS. So applying to two S/SS arguments, the defined constant gives a S/SS expression as a result. The category of the defined constant is thus (S/SS)/(S/SS)(S/SS), and its context is from now on .

In such a syntax, grammaticality does not depend on the choice of letters and symbols for this or that variable. Only the arrangement of the categories indicated by the contexts formed by specific brackets determines whether an expression is well formed or not. From a theoretical point of view, a definition such as P27 could perfectly well have been written using one of the following two alternative expressions:

 

The choice of the letters p and q for the propositional variables, f and g for the connector variables in P27 has no other reason than to avoid offending the reader’s habits. As for brackets and symbols for constants, their choice is free at the time of their first occurrence. However, the choices must respect the differentiation of contexts and be respected throughout the developments.

 

Finally, it is important to recognize that the differentiation between constants and variables is contextual as well. Since all the axioms and theorems are closed expressions, all the symbols which are bound by a quantifier are variables, whereas the other symbols—apart from brackets—are necessarily constants. This section gives only an outline of the principles of contextual syntax. Leśniewski provides a detailed and scrupulously complete description of them through what he called his ‘terminological explanations’. In this way, he demonstrates how contextuality and the application of the notion of semantic category make it possible to have a rigorous formal language which, like ordinary language and the usual notations of science, remains continuously open to enrichment and novelty. This part of Leśniewski’s work is a masterpiece in the philosophy of notation.

 

The inherent openness of Leśniewski’s systems requires a notation that unequivocally and contextually determines the categories of symbols and expressions. This aspect, combined with Leśniewski’s meticulous axiomatic presentations of his systems, makes them challenging for twenty-first century logicians to apprehend. But the same difficulty arises with the original works of Frege, Peirce, or Hilbert. It is largely due to the age of the systems and to the evolution of logicians’ habits. However, it is known that Leśniewski used his contextual syntax only where ambiguities could arise. In his everyday practice, he also formulated his proofs using a form of natural deduction. This method was common among the members of the Warsaw School. It was only codified later by Jaśkowski. Surprisingly, this codification was not applied to Leśniewski’s systems. A natural deduction system for Protothetic, close to the streamlined methods of twenty-first century logic, is available in Joray (2020).

 

b. Ontology (Term Logic)

Like standard predicate logic, which is built on the basis of a propositional calculus, Leśniewski’s system called ‘Ontology’ is an expansion of Protothetic. The aim of Ontology is mainly to enlarge propositional deductive logic to the analysis and expression of predication. In spite of these similarities, there are important differences between Ontology and standard predicate logic. Firstly, Ontology is not a theory of quantification. The system indeed inherits quantification from Protothetic. Secondly, the language of Ontology makes no distinction of category between singular and conceptual terms. Instead of having a first category for singular names and another for predicates, Ontology has only a wide category of names. In this respect, Ontology is closer to traditional term logic than it is to predicate logic. Ontology then extends Protothetic by introducing a second basic category, the category of names, labeled N, and a copula as a new primitive constant. It is known as an extensional calculus of names, which constitutes a free and plural logic.

 

i. Names and Copula

Leśniewski’s notion of name is considerably broader than it is in the Russellian tradition. For him, not only simple singular terms like ‘Socrates’ are names, but also complex referring expressions like ‘Plato’s master’ and terms or expressions that refer to more than one object, like ‘planet’ or ‘author of the Principia Mathematica’. Whether simple or composed, a name may be singular, plural or even empty if there is no object to which it refers, as is the case with ‘unicorn’ or ‘square circle’. In a sentence like ‘Socrates is Greek’, there are two names according to Leśniewski. ‘Socrates’ is a singular one because it refers to one individual, and ‘Greek’ is a plural one because it refers to many individuals. It should be noted that there is no way in Leśniewski’s nominalist conception of names to interpret plural names as denoting any single abstract totality (like a set, a class, or a collection) which would have the signified objects as members. A plural name simply refers directly to these objects. Like Protothetic, Ontology is an interpreted system. It is in no way a pure syntax waiting for interpretation. It thus has an intuitive semantics from the beginning. All names in Ontology belong to the same category N. In the intended semantics, they can be of three sorts: singular, plural, or empty. In order to represent these three possibilities, Lejewski (1958) proposed to use suggestive diagrams. In Figure I, the diagrams represent the three possibilities for a name ‘a’: singular (I.1), plural (I.2), or empty (I.3):

[Figure I: diagrams I.1 (singular), I.2 (plural) and I.3 (empty) for a name ‘a’]


Figure II shows 16 diagrams representing the possible situations in which two names ‘a’ and ‘b’ can stand in relation to each other.

[Figure II: sixteen diagrams, numbered II.1 to II.16, for the two names ‘a’ and ‘b’]

In (II.1) both names are singular, and they denote the same object. In (II.2), both are singular, but denote different objects. In (II.3), ‘a’ denotes one object which is among the objects denoted by the plural name ‘b’. From (II.9) to (II.13), both names are plural. They denote, for example, exactly the same objects in (II.9), and the objects denoted by ‘a’ are strictly among those denoted by ‘b’ in (II.10). The main interest of these diagrams is that they make it possible to explain in a rather precise way the meaning with which Leśniewski proposed to use the only primitive term of Ontology, namely the epsilon copula. This copula ‘ε’ applies to two names (two arguments of category N) and results in a proposition (category S). It is then of category S/NN. Expressions of the form

 

(often written in the simplified form ‘a ε b’)

 

are called ‘elementary propositions’ of Ontology. Their truth conditions in the intended semantics can be explained in the following way: an elementary proposition ‘a ε b’ is true if and only if the two arguments ‘a’ and ‘b’ stand either in situation (II.1) or in situation (II.3) of Lejewski’s Figure II. In ordinary language, the meaning of such an elementary proposition can be approximated by “The object denoted by ‘a’ is the object, or one of the objects, denoted by ‘b’”. For convenience, it is often read “a is among the b’s” or even “a is a b”.

 

It should be stressed that, unlike syllogistic, where a singular term can never occur in a predicate position, there is no restriction in Ontology as to the sort of names that may serve as arguments in an elementary proposition. Any name, whether singular, plural or empty, may occur either in the first argument position (say as subject) or in the second position (say as predicate) in an elementary proposition. In all the sixteen situations of Lejewski’s Figure II, we would get a well-formed proposition. This proposition, of course, would only be true in cases (II.1) and (II.3). In all other situations, it would be false, but meaningful and perfectly well-formed.
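These truth conditions can be illustrated with a small computational sketch. It rests on an assumption that is not part of Leśniewski’s system: each name is modelled by the finite set of objects it denotes. This is only a modelling device, since a plural name refers to its objects directly and not to a set. The function names `sort_of` and `eps` are hypothetical labels chosen for the illustration.

```python
def sort_of(name):
    """Classify a name (modelled as a frozenset of objects) by its sort."""
    if not name:
        return "empty"
    return "singular" if len(name) == 1 else "plural"


def eps(a, b):
    """'a ε b': 'a' denotes exactly one object, and that object is among the b's."""
    return len(a) == 1 and a <= b


# Illustrative names; the denotations are assumptions made for the example.
socrates = frozenset({"Socrates"})
greek = frozenset({"Socrates", "Plato", "Aristotle"})
unicorn = frozenset()

for subject, predicate in [(socrates, greek), (socrates, socrates),
                           (greek, socrates), (unicorn, greek)]:
    # Every combination is well formed; only some are true.
    print(sort_of(subject), sort_of(predicate), eps(subject, predicate))
```

In this model, any pair of names yields a value, which mirrors the point that every elementary proposition is well formed, whatever the sort of its arguments.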

 

ii. The Axiomatic System

Drawing on his intuitive semantics, Leśniewski formulated in 1920 a single axiom for Ontology. This axiom is presented here both in a Leśniewski-compliant form and in a more usual notation:

 

AxOnto:

 

 

By introducing the elementary proposition ‘a ε b’, with its new context, on the left-hand side of the biconditional expression, the axiom presents a formal similarity with a definition of Protothetic. However, the new symbol also occurs in the right-hand argument, and the axiom is then like an implicit definition, introducing the new symbol ‘ε’ with a formal characterization that fits the intended truth conditions of the elementary proposition:

 

For all a, b,
a is a b           if and only if
1. at least one thing is an a   and
2. at most one thing is an a   and
3. all that is an a is also a b.
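The three clauses can be checked by brute force in a finite model. The sketch below assumes, purely for illustration, that names are modelled as subsets of a small universe and that ‘a ε b’ holds when ‘a’ denotes exactly one object that is among the b’s. Clause 2 is rendered in a common way (any two things that are a’s stand in ε to each other); this rendering, like the helper names `all_names`, `eps` and `axiom_rhs`, is an assumption of the example.

```python
from itertools import combinations


def all_names(universe):
    """All possible denotations of a name over a finite universe."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def eps(a, b):
    """Assumed model of 'a ε b'."""
    return len(a) == 1 and a <= b


def axiom_rhs(a, b, names):
    """Right-hand side of the axiom, clause by clause."""
    at_least_one = any(eps(c, a) for c in names)
    at_most_one = all(eps(c, d) for c in names for d in names
                      if eps(c, a) and eps(d, a))
    all_a_are_b = all(eps(c, b) for c in names if eps(c, a))
    return at_least_one and at_most_one and all_a_are_b


def axiom_holds(universe):
    """Check the biconditional for every pair of names over the universe."""
    names = all_names(universe)
    return all(eps(a, b) == axiom_rhs(a, b, names)
               for a in names for b in names)


print(axiom_holds({1, 2, 3}))
```

A finite check of this kind is no proof, but it shows how the three clauses jointly force the subject to be singular and included in the predicate.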

This axiom shows a striking analogy between Leśniewski’s analysis of singular propositions and Russell’s theory of definite descriptions. Despite numerous oppositions between the two logicians, Leśniewski acknowledged that Ontology has certain similarities with Russell’s work, for example its formal proximity with simple type theory.

 

Concerning the rules of inference, Ontology inherits adapted forms of the rules of Protothetic. In addition, it also has new versions of the rules for definition and for extensionality. With only the protothetical rules, one can already establish significant theorems concerning the constant epsilon, notably the following:

 

T1: (epsilon is transitive)
T2: (one of the most characteristic properties of epsilon)
T3:

These three theorems sufficiently indicate that Leśniewski’s epsilon is formally very different from the epsilon of set theory. Theorem T3 rather shows similarities between Ontology and Aristotle’s syllogistic. In one direction, the biconditional expression is analogous to the Barbara syllogism, while in the other direction it bears formal resemblance to what Aristotle termed ‘ecthesis’. Resting on results due to Tarski and Sobociński, Leśniewski has shown that T3 can be adopted as a shorter single axiom of Ontology.
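A finite model makes the contrast with set-theoretic membership tangible. The sketch assumes that names are modelled as subsets of a small universe. The transitivity check follows the gloss of T1; the other two formulas are assumed readings, not quotations: ‘if a ε b then a ε a’, and ‘a ε b if and only if, for some c, a ε c and c ε b’ (the reading suggested by the Barbara and ecthesis analogy). All function names are hypothetical.

```python
from itertools import combinations


def all_names(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def eps(a, b):
    """Assumed model of 'a ε b': 'a' is singular and among the b's."""
    return len(a) == 1 and a <= b


NAMES = all_names({1, 2, 3})


def transitivity():
    # If a ε b and b ε c, then a ε c.
    return all(eps(a, c) for a in NAMES for b in NAMES for c in NAMES
               if eps(a, b) and eps(b, c))


def subject_is_itself():
    # Assumed reading: if a ε b, then a ε a.
    return all(eps(a, a) for a in NAMES for b in NAMES if eps(a, b))


def middle_term():
    # Assumed reading: a ε b iff for some c, a ε c and c ε b.
    return all(eps(a, b) == any(eps(a, c) and eps(c, b) for c in NAMES)
               for a in NAMES for b in NAMES)


print(transitivity(), subject_is_itself(), middle_term())
```

Neither transitivity nor ‘a ε a’ holds of set-theoretic membership in general, which is one way to see the formal difference mentioned above.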

 

Using only the protothetical rules, it is also possible to state some interesting definitions, such as those of relators of the same category as epsilon (S/NN):

 

T4: 

Def-Proto (nominal inclusion)

T5: 

Def-Proto (nominal co-extensionality)


T6:
 

Def-Proto (singular identity)
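The three relators can be sketched in a finite model. The definientia used below are assumptions, chosen because they are the usual ones for these notions: inclusion as ‘every a is a b’, co-extensionality as ‘the a’s and the b’s are the same’, and singular identity as ‘a ε b and b ε a’. Names are modelled as subsets of a small universe, and all function names are hypothetical.

```python
from itertools import combinations


def all_names(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def eps(a, b):
    """Assumed model of 'a ε b'."""
    return len(a) == 1 and a <= b


NAMES = all_names({1, 2, 3})


def included(a, b):
    # Assumed definiens: for all c, if c ε a then c ε b.
    return all(eps(c, b) for c in NAMES if eps(c, a))


def coextensive(a, b):
    # Assumed definiens: for all c, c ε a iff c ε b.
    return all(eps(c, a) == eps(c, b) for c in NAMES)


def identical(a, b):
    # Assumed definiens: a ε b and b ε a.
    return eps(a, b) and eps(b, a)


empty = frozenset()
print(included(empty, frozenset({1})))  # an empty name is included in any name
print(coextensive(empty, empty))        # two empty names are co-extensive
print(identical(empty, empty))          # but singular identity requires one object
```

The last line illustrates why identity is called singular: it only holds between names that each denote exactly one object.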

 

Other definitions are still possible with Def-Proto, for instance those of the three following S/N functors, expressing properties of names, and also a definition of a relation between such properties (category S/(S/N)(S/N)):

 

T7: 
Def-Proto

T8: 
Def-Proto

T9: 
Def-Proto

T10: 
Def-Proto (S/N-functors co-extensionality)

 

From AxOnto, definitions T4 and T8, it is easy to derive the following theorem:

 

T11: 

 

The introduction of nominal co-extensionality in T5 gives the opportunity to define an interesting many-link functor:

 

T12: 
Def-Proto

 

From the co-extensionality of ‘a’ and ‘b’ (‘a’ and ‘b’ denote the same objects), the definition abstracts the second argument ‘b’, resulting in an expression which belongs to the category S/N and expresses a property of names (to denote the same objects as ‘a’). The functor ‘Ext’ is then a many-link functor of category (S/N)/N. It is tempting to interpret the result of applying ‘Ext’ to ‘a’ as denoting ‘the extension of a’. This is however merely a façon de parler, for this expression is not the name of an object, but rather the expression of a function. Nevertheless, it is worth noting that from T5, T10 and T12, a formal analogue of Frege’s famous Basic Law V can be derived:

 

T13: 

 

Contrary to Frege’s law, this theorem is perfectly harmless in Ontology, for what ‘Ext’ yields when applied to ‘a’ is a function, not an object that could be among the denotations of ‘a’.

 

In addition to defining nominal properties and relations, Boolean operations on names can also be introduced. However, these operations necessitate the application of an Ontology-specific definition rule. Instead of using, like in Protothetic, the following general form (with a definiendum of category S):

 

 Def-Proto

 

the new rule allows definitions of the following form, with a definiendum of category N:

 

 Def-Onto

 

The formal conditions for a well-formed definition are pretty much the same in the new version. It is worth noting that the expression ‘a ε a’ is included on the right-hand side of the biconditional to ensure that the name ‘a’ is singular. In practice, this addition is unnecessary when the definiens alone already ensures that ‘a’ denotes exactly one object. Here are some examples, including the elementary Boolean operations:

 

T14:  Def-Onto (nominal union)

T15:  Def-Onto (nominal intersection)

T16:  Def-Onto (nominal difference)

T17:  Def-Onto (empty name)

T18:  

Def-Onto (universal name)
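The effect of such definitions can be sketched in a finite model. The definientia below are assumptions (for instance, ‘a ε b or a ε c’ for the union and the negation of ‘a ε a’ for the empty name); names are modelled as subsets of a small universe, and the helper `define_name` is a hypothetical stand-in for the definition rule, not a rendering of it.

```python
def eps(a, b):
    """Assumed model of 'a ε b'."""
    return len(a) == 1 and a <= b


def define_name(universe, definiens):
    """The new name denotes each object whose singular name 'a'
    satisfies 'a ε a' together with the definiens."""
    denoted = set()
    for x in universe:
        a = frozenset({x})
        if eps(a, a) and definiens(a):
            denoted.add(x)
    return frozenset(denoted)


U = {1, 2, 3, 4}
b = frozenset({1, 2})
c = frozenset({2, 3})

union = define_name(U, lambda a: eps(a, b) or eps(a, c))
intersection = define_name(U, lambda a: eps(a, b) and eps(a, c))
difference = define_name(U, lambda a: eps(a, b) and not eps(a, c))
empty_name = define_name(U, lambda a: not eps(a, a))
universal_name = define_name(U, lambda a: True)

print(sorted(union), sorted(intersection), sorted(difference))
print(sorted(empty_name), sorted(universal_name))
```

In the model, the defined names coincide with the familiar Boolean operations on denotations, while the empty name arises from a definiens that nothing satisfies.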

 

iii. Higher Orders

Like Protothetic, Ontology is a higher-order logic. Where the Principia Mathematica are based on an unbounded hierarchy of logical types, Ontology is based on a potentially infinite hierarchy of semantic categories. Each time a constant of a new category of this hierarchy is defined in a development, variables of that category and quantifiers binding those variables become available in the language of the development in question. At the root of the hierarchy, the category N includes names, which have the semantic role of designating extra-linguistic objects. Moving up the hierarchy, one first comes across categories that take names as arguments: N/N, N/NN, and so on (categories which make it possible to operate on names), and S/N, S/NN, and so on (categories which make it possible to assert something about names). Operators and functors of these categories can in turn serve as arguments for new operators and functors from categories higher up in the hierarchy. This ascent can go on, step by step, without limit of complexity. This increase in complexity must nevertheless be put in perspective. Firstly, a specific extensionality rule enables the derivation of extensionality theorems for each of the new categories that can be reached in Ontology. These theorems are analogous to the following theorem, which expresses extensionality for the basic category N:

 

T19: 

 

Secondly, there exists a structural analogy within Ontology among the various levels of the category hierarchy. For example, from a semantic perspective, the way a name has meaning (by denoting one object, a plurality of objects or no object at all) parallels the way a S/N functor conveys meaning (by being satisfied by one nominal meaning, multiple nominal meanings or none at all). From a syntactic perspective, it is possible to define in each category an analogue of the primitive epsilon (called a ‘higher epsilon’) and then to derive a structural analogue of the axiom of Ontology. For example, in the category S/N, the following definition of a S/(S/N)(S/N)-epsilon is adequate for that purpose:

 

T20: (∀αβ)[αεβ ≡.

Def-Proto


Using this definition, Miéville (1984: 334-7) has given a proof of the following S/N-analogue of the axiom of Ontology:

 

T21: 

 

Def-Proto

 

Ontology presents a systematic analogy between categories that is similar to the analogy between types in the Principia Mathematica. Using homonyms of different categories, it offers the possibility to speak about functions or incomplete meanings (like for instance properties, relations, extensions, numbers, and so on) as if they were objects, but without any reification. On this point, Ontology attains the same ends as the Principia Mathematica. It constitutes a powerful logic as strong as type theory. But contrary to Whitehead and Russell, Leśniewski attains this result with means which are strictly regulated by his axiomatic system, without resorting either to any external convention of systematic ambiguity or to the non-explicit definitions of logical fictions, like that of classes in the Principia Mathematica.

 

c. Mereology (Part-Whole Theory)

While Mereology is chronologically the first system elaborated by Leśniewski, it is theoretically the last one, as it is based on both Protothetic and Ontology. Unlike these two deductive systems, Mereology is not a purely logical system. Dealing with the relations between parts and wholes, it contains non-logical proper terms like ‘class’, ‘element of’, or ‘part of’. Although it was considered by Leśniewski as a class theory—a nominalistically acceptable alternative to set theory—it has come to be widely regarded as a powerful formal theory of parts, wholes and related concepts.

 

At the time he discovered formal logic, Leśniewski planned to provide an alternative way into the issue of the foundations of mathematics. In this matter, the main points of disagreement with his contemporaries were about the ways to analyze and solve the paradoxes which appeared with the pre-theoretical notions of set, class or extension, especially Russell’s paradox. For Leśniewski, the solutions available at that time (in particular those of Frege and Russell and, a bit later, that of Zermelo) were only ad hoc means. In his view, their only justification was to avoid contradictions in their systems. He did not find in these solutions a satisfactory analysis of the root causes of the contradictions, which were, according to him, based on a confusion between what he called the distributive and the collective conceptions of classes. From Cantor’s early approach, Leśniewski retained the basic idea that a set is literally made up of its elements. But this basic idea was for him incompatible with the existence of an empty set and with the distinction between an object and the set having this object as its unique element.

 

The difference between distributive and collective views of sets or classes can be easily grasped by means of a geometrical example such as the following figure:

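[Figure: a geometrical figure that can be viewed as made up of points, of line segments, or of triangles]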

It is common to conceive such a figure as being a class of certain points. It can also be conceived as a class of certain line segments or even as a class of certain triangles. All these possibilities have different theoretical advantages. But if the notion of class is understood in a distributive sense (for example as a set in a standard sense), the same figure cannot be at the same time a class of points, a class of segments, and a class of triangles. A set or a distributive class of points has only points as elements, whereas a mereological or collective class of points may perfectly have also other objects as elements. A single figure can be at the same time a mereological class of points, a class of segments and a class of triangles. In the collective conception, the three classes are the same object, namely the figure in question.

 

i. Mereology and Russell’s Paradox

Leśniewski gave a famous analysis of Russell’s paradox. During his lifetime, he did not publish the final state of this analysis, which was reconstructed by his pupil Bolesław Sobociński in 1949. Taking up Russell’s approach, Leśniewski’s analysis begins with a proof that the principles of comprehension and extensionality of class theory are indeed incompatible. To this end, the following ‘Ontological’ definition of the concept ℜ is introduced (that is, the concept of object which is not a class that falls under a concept of which it is the class):

 

 Def:

 

The term Cl(-), which occurs in the definiens, must be considered here as a primitive constant satisfying the two principles of comprehension and extensionality (here expressed in the language of Ontology):

 

Class Comprehension Principle (CCP): 
 

(For all conceptual terms a, there is an object which is a class of the a’s.)

 

 Class Extensionality Principle (CEP):

 (For all conceptual terms a and b, if one and the same object is both a class of the a’s and a class of the b’s, then the a’s and the b’s are exactly the same objects.)

 

Taking these elements as granted, Leśniewski easily shows that we get the following contradiction:

 

     (Nothing is a class of the ℜ’s)

        (Something is a class of the ℜ’s)

 

For Leśniewski, a contradiction acquires the value of an antinomy only if it logically follows from principles in which we undoubtedly believe. In his view, this was not the case with the contradiction encountered in this context. In the intuitive conception of classes or sets, CCP is not free from doubt. Indeed, what this principle expresses is at least doubtful in the case of empty concepts. If a is an empty conceptual term, there is no intuitive reason to decide whether or not there is an object which is a class of a. The reasons for adopting CCP lie in the goals we set for set or class theory, not in our intuitive conceptions of sets or classes. In order to uncover the genuine antinomy behind Russell’s paradox, the analysis must not address the incompatibility between CEP and CCP. It must deal with what would happen to a theory that admits among its principles CEP and the doubtless part of what is expressed by CCP, namely the following weakened principle:


Weak Class Comprehension Principle (WCCP):

(For all non-empty conceptual terms a, there is an object which is a class of the a’s.)

 

With the primitive term Cl(-) satisfying now CEP and the revised principle WCCP, clearly only the first horn of the previous contradiction remains:

 

        (Nothing is a class of the ℜ’s)

 

But WCCP then imposes that ℜ is an empty term:

 

 

 

This means (by definition of ℜ) that every object is a class that falls under a concept of which it is the class:

 

 

 

Leśniewski shows then that this class is precisely the class of the term in question:

 

      (Every object is a class of itself)

 

From these first results, he then draws an unexpected consequence:

 

      (There is at most one object)

 

The demonstration can be informally sketched as follows:

 

Assume that a and b are individual objects.

Let a ∪ b then be the conceptual term under which exactly the objects a and b fall.

The term a ∪ b is not empty.

By WCCP, it follows that there is an object c which is a class of a ∪ b.

But, like every object, c is also a class of itself.

So c is both a class of a ∪ b and a class of c.

By CEP, we then get that exactly the same objects fall under a ∪ b and c.

Since c is a singular term, so is a ∪ b.

The only way for a ∪ b to be a singular term is that a and b are one and the same object.

 

The joint adoption of CEP and WCCP results then in the existence of at most one object in the universe. Although such a statement could not be refuted on a purely logical basis, Leśniewski considered that no set or class theorist could tolerate it and that they should indisputably believe in its negation. This had for him the value of a genuine antinomy.

 

Sobociński’s reconstruction of the analysis does not fully clarify why Leśniewski considered that the root causes of this antinomy lay in a confusion between what he called the distributive view and the collective view of classes. It could be argued that, for him, WCCP constituted an unquestionable belief only with a collective view, whereas a doubtless belief in CEP could only result from a distributive view. In any case, his solution to the antinomy was to expurgate from CEP what is only acceptable from a distributive view of classes. Introducing the notion of element with the following definition:

 

 DefEl:

 

he showed that CEP is logically equivalent to the conjunction of the following two expressions:

 

 CEP1:

 

(All the objects that fall under a concept are elements of the class of that concept.)

 

 CEP2:

 (All the elements of the class of a concept fall under that concept.)

 

Unlike CEP1, which is undoubtedly true under both intuitive understandings of ‘class’, CEP2 can only appear to be true to one who adopts a distributive view. For Leśniewski, whoever relies on Cantor’s opening idea that a set or a class is a real object that gathers into a whole the elements that literally constitute it faces an antinomy when confusingly admitting the truth of CEP2. His nominalist tendency was to lead him to consider the mereological approach to classes as the only acceptable one, whereas any theoretical approach aiming at saving CEP2 led, according to him, to make classes either fictions (as in Russell’s no-class theory) or disputable abstract entities subject to ad hoc criteria of existence (as in Zermelo’s approach).

 

ii. The Axiomatic System

Between 1916 and 1921, Leśniewski developed four axiomatizations of his Mereology taking different terms as a unique primitive. The first two were based on the term ‘part’ (Leśniewski’s word for the somewhat more usual ‘proper part’), the third on ‘ingredient’ (a synonym for ‘element’) and the last on ‘exterior of’ (Leśniewski’s term for ‘disjoint’). When taking ‘element’ (or ‘ingredient’) as primitive, Leśniewski gives the following definitions of other important mereological terms:



(a is a part of b when it is a strict element of b, that is, an element of b different from b itself.)

  

 (a is an external of b when it has no element in common with b.)

 

 

                                        

     

 

This last definition introduces the notion of mereological class. In order to be the class of the b’s, a certain a must meet the three conditions stipulated in the definition:

 

1. a must be an object;

2. all the b’s must be elements of a;

3. every element of a must itself have at least one element in common with one of the b’s.

 

This is worth clarifying. Let us take as an illustration the mereological class of Swiss people. From condition 2, every Swiss person is an element of this class. But the class is intended to have many other elements. For example, sub-classes of Swiss people are also elements of that class (like the class of Swiss people living in Lugano, that of French-speaking Swiss people, or that of Swiss people who practice yodeling). There are also elements of the class of Swiss people that are neither Swiss persons nor sub-classes of them: for example, an element of a Swiss person (such as the nose of the President of the Confederation) or a class of such elements (such as the class of the feet of Swiss people who have climbed the Matterhorn). Condition 3, however, precisely limits this wealth of elements. It requires that each element itself has at least something (an element) in common with a Swiss person. The Swiss people sub-classes clearly meet this requirement and so do the President’s nose and the Swiss mountaineers’ feet class. In contrast, the class of the noses of all European leaders will not be retained as an element of the Swiss people class because there is at least one element of this class of noses (for example, the nose of the Italian president) which has nothing (no element) in common with a Swiss person.
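The three conditions can be tried out on a toy model. The model is an assumption made for illustration only: objects are the non-empty sets of a few ‘atoms’, ‘element of’ is set inclusion, and two objects have an element in common when they overlap. The atoms and the names `anna`, `beat` and `luigi` are hypothetical, as are all function names.

```python
from itertools import combinations


def objects(atoms):
    """Every object of the toy model: any non-empty combination of atoms."""
    items = sorted(atoms)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def is_element(x, y):
    """Assumed model of 'x is an element of y' (non-strict)."""
    return x <= y


def is_class_of(a, bs, domain):
    cond1 = a in domain                                  # 1. a is an object
    cond2 = all(is_element(b, a) for b in bs)            # 2. every b is an element of a
    cond3 = all(any(e & b for b in bs)                   # 3. every element of a shares
                for e in domain if is_element(e, a))     #    an element with some b
    return cond1 and cond2 and cond3


DOMAIN = objects({"anna_nose", "anna_rest", "beat_nose", "beat_rest", "luigi"})
anna = frozenset({"anna_nose", "anna_rest"})
beat = frozenset({"beat_nose", "beat_rest"})
swiss = [anna, beat]

classes = [o for o in DOMAIN if is_class_of(o, swiss, DOMAIN)]
print(classes == [anna | beat])                          # exactly one class
print(is_element(frozenset({"anna_nose"}), classes[0]))  # a nose is an element
print(is_element(frozenset({"anna_nose", "luigi"}), classes[0]))  # not an element
```

In this model the only object meeting the three conditions is the sum of the two persons, and no object is a class of an empty list of b’s, since condition 3 then fails for every candidate.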

 

This illustration highlights that the mereological class corresponds to what is better known as the mereological sum. It also makes it clear that, for the three conditions of Leśniewski’s definition to actually reach what is expected from mereological classes, it is necessary for the term ‘element’ to be characterized in an appropriate way (for example, it is pretty clear that for condition 3 to be relevant, the relation element of must be transitive). This characterization is what Leśniewski does with the following axiomatization of 1920, using ‘element’ (originally ‘ingredient’) as the sole primitive term:

 

 

 

 

 

 

                          

 

 

          

 

The first two axioms are rather simple. AxM1 is the contraction of the following two formulas:

 

  (only objects have elements)

 

 

(element of is an antisymmetric relation)

 

The second axiom is literally:

 

 

(element of is a transitive relation)

 

Furthermore, Leśniewski shows that AxM1 and AxM2 imply the following formula:

 

 (element of is a reflexive relation on the object domain)

 

The first two axioms thus make element of a non-strict partial order relation on the object domain.
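The order properties can be checked on a toy model. As an assumption made only for illustration, objects are modelled as the non-empty sets of three atoms and ‘element of’ as set inclusion; ‘part of’ is then strict inclusion. The names used in the code are hypothetical.

```python
from itertools import combinations


def objects(atoms):
    items = sorted(atoms)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def is_element(x, y):
    """Assumed model of 'element of': non-strict inclusion."""
    return x <= y


def is_part(x, y):
    """Assumed model of 'part of': an element of y different from y."""
    return is_element(x, y) and x != y


DOMAIN = objects({"p", "q", "r"})

reflexive = all(is_element(x, x) for x in DOMAIN)
antisymmetric = all(x == y for x in DOMAIN for y in DOMAIN
                    if is_element(x, y) and is_element(y, x))
transitive = all(is_element(x, z)
                 for x in DOMAIN for y in DOMAIN for z in DOMAIN
                 if is_element(x, y) and is_element(y, z))
part_irreflexive = not any(is_part(x, x) for x in DOMAIN)

print(reflexive, antisymmetric, transitive, part_irreflexive)
```

The last check shows the difference between the two notions: every object is an element of itself, but no object is a part of itself.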

 

The last two axioms are more difficult to grasp. However, when examined in light of the definition of class, it becomes apparent that they respectively embody a principle of class uniqueness and a principle of class existence (the latter being nothing but WCCP):

 

 

(a class of a certain term is unique)

 

 

(There is a class for every nonempty term)


Leśniewski and his followers devoted much effort to finding the shortest and most economical system of axioms for Mereology. But this could only be achieved at a certain expense of intuitive clarity. The axiomatization of 1920 is in an intermediate position in this respect. To one who is concerned neither with having a single primitive, nor with the independence of the axioms, the adoption of the two terms El(-) and Cl(-) as primitives, together with the six axioms (i)-(vi) plus the defining formula for ‘class’ taken as the seventh axiom, would constitute a rather clear axiomatization. The only remaining difficulty lies in the effort to grasp this definition of class (especially with regard to the third condition of the definiens, which gives the mereological class its specificity). This definition of class, elaborated by Leśniewski from the very beginning of his research, can certainly be considered as one of the touchstones of his mereology.

 

All of Leśniewski’s axiomatizations have been proved to be equivalent. The consistency of his Mereology has also been established. Clay gave in 1968 a model of Mereology in the arithmetic of real numbers, and later a proof of consistency relative to topology. But the most significant proof is that of Lejewski, who showed in 1969 that both Mereology and its underlying logic (namely Ontology) are consistent relative to an elementary sub-system of Protothetic.

 

3. Foundations of Mathematics

With Leśniewski’s untimely death in 1939, the picture he left us of his program for the foundations of mathematics remains unfortunately unfinished and in some respects ambivalent. Developed in connection with his analysis of Russell’s paradox, his Mereology continues today to be widely studied and developed as a rich applied theory of the part-whole relation. However, Leśniewski intended his Mereology to be a nominalistically acceptable alternative to set theory. One of his purposes was to show that Mereology could be used to provide a foundation for mathematics that would not postulate the existence of questionable abstract entities. On the other hand, he did not deny the relevance of the distributive intuition about classes. He nevertheless considered that an expression like ‘the class of the b’s’ taken in the distributive sense could only be an apparent name that should be eliminated in favor of a language of pure logic. Thus, to assert in the distributive sense a sentence like ‘a is an element of the class of the b’s’ was only for him a façon de parler that amounts to asserting that a is one of the b’s, precisely what is expressed by ‘a ε b’ in his Ontology. This eliminativist conception of distributive classes could arguably have led Leśniewski to consider, as several of his followers did, that the core notions of arithmetic (in particular, that of cardinal number) should find their foundations not in mereology, but directly in the purely logical system of Ontology.

a. Mereology and Set Theory

In order to determine to what extent Mereology could be an alternative to set theory, Leśniewski set out to prove numerous theorems that he considered to be analogues of important results of set theory. Unfortunately, this task remained unfinished. Perhaps the most interesting proof he gave was the mereological analogue of Cantor’s theorem on the cardinality of power sets. The problem with mereological classes is that they do not generally carry a specific cardinality. As Frege already remarked about aggregates (his naive notion for mereological classes), firstly there is no aggregate that consists of zero objects, and secondly an aggregate can only have a cardinal number when it is conceived through a concept. To overcome this last difficulty, Leśniewski introduces two notions: the notion of mereological collection and the notion of discrete name. These notions can be defined as follows:

 

 

 

For any non-empty term b (like ‘Swiss people’), we have seen that there is exactly one class of the b’s. With D4, collections of b’s are all those classes that are generated either by all the b’s (so the class of Swiss people is a collection of Swiss people), or by only certain b’s (the class of Swiss people living in Lugano is also a collection of Swiss people), or even by only one b (Roger Federer is then also a collection of Swiss people). At first glance, it seems there must be 2^n − 1 collections of b’s, where n is the number of objects that fall under the name ‘b’. But this is only correct provided the b’s are disconnected objects (objects that are all mereologically external to each other, that is, with no element in common). In other words, using D5, one can say that the result is correct only when ‘b’ is a discrete name. Leśniewski proved then what he considered to be the mereological analogue of Cantor’s theorem:

 

 

(If ‘b’ is a plural discrete name, then there are strictly more collections of b’s than there are b’s themselves.)
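The role of discreteness in this result can be illustrated on a toy model. The following Python sketch assumes, purely for illustration, that each object is modelled as a set of atoms, that the fusion of several objects is the union of their atoms, and that two objects are external to each other when they share no atom. The names `is_discrete` and `collections` and the sample data are hypothetical; none of this is Leśniewski’s own formalism.

```python
from itertools import combinations

def is_discrete(bs):
    """True if the objects are pairwise disjoint (no atom in common)."""
    return all(x.isdisjoint(y) for x, y in combinations(bs, 2))

def collections(bs):
    """Distinct fusions of every non-empty selection of the b's."""
    fusions = set()
    for k in range(1, len(bs) + 1):
        for chosen in combinations(bs, k):
            # The fusion of the chosen objects is the union of their atoms.
            fusions.add(frozenset().union(*chosen))
    return fusions

# Hypothetical toy data: each object is a frozenset of atoms.
discrete_bs = [frozenset({1}), frozenset({2}), frozenset({3})]
overlapping_bs = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 2, 3})]

n_discrete = len(collections(discrete_bs))        # every selection gives a new fusion
n_overlapping = len(collections(overlapping_bs))  # several selections coincide
```

In this model the three discrete objects yield seven collections (2³ − 1), strictly more than three, whereas the three overlapping objects yield only three distinct collections, so the strict inequality is not guaranteed once discreteness is dropped.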

 

The analogy is quite clear. With this result, Leśniewski could have tried to establish other important analogous results (for example, the existence of an endless hierarchy of increasingly large infinite cardinals in a Mereology supplemented with an axiom of infinity). Unfortunately, he did not have time to develop his efforts in such a direction, and no subsequent work has been able to show that Mereology is strong enough to achieve such goals while respecting the nominalistic requirements of Leśniewski.

 

In the Leśniewskian perspective, however, an important difference with the set-theoretic approach must be emphasized. Whereas in Cantor’s theorem, the cardinalities to be compared are those of sets (a set and its power set), in Leśniewski’s analogue, it is not mereological classes that carry a cardinality, but names (a name ‘b’ on the one hand and the name ‘Coll(b)’ on the other hand). What is compared in the mereological analogue is not the number of the elements of different classes, but the number of objects falling under certain names. This observation leads to two remarks. First, even if the Leśniewskian approach does not introduce any ambiguity or confusion, it still mixes both the collective and distributive views on pluralities. Secondly, the notion of cardinality introduced in this context clearly belongs to the underlying logic (Ontology), and not specifically to Mereology. This provides a rather decisive reason to favor a foundational approach to arithmetic from Ontology and not from Mereology.

b. Ontology and Arithmetic

Although he did not explicitly give clear philosophical reasons against the idea of a logicist approach to arithmetic, Leśniewski made no attempt to reduce arithmetic to his Ontology. Instead, he merely developed in the language of Ontology an axiomatization of arithmetic that is more or less a translation of Peano’s second-order arithmetic. In view of the great wealth of his systems, this is still a disappointing result, which the Polish logician presumably regarded as no more than a preliminary stage. In any case, Canty showed in his 1967 PhD thesis that the arithmetic of the Principia Mathematica could be fully reconstructed within Ontology. By exploiting several of Canty’s techniques, Gessler, Joray, and Degrange set up in 2005 a logicist program in which they show that Peano’s second-order arithmetic can be reduced to Ontology, using only a single non-logical axiom (an axiom of infinity) that can be stated as follows in the primitive language of Ontology:

 

 

(There is an object a for which there is a one-one relation such that every object is in its domain and every object except a is in its codomain.)

 

With suitable definitions of ‘one-one relation’, ‘domain’ and ‘codomain’, the logicist construction of Peano’s arithmetic can then be obtained resting on the definition of nominal equinumerosity and an associated definition of the many-link functor ‘Card’:

 

 This last definition immediately leads to the perfectly predicative Leśniewskian analogue of Frege’s renowned law, now commonly referred to as Hume’s Principle:

 

 

 

From here on, cardinal numbers are introduced as those S/N-functors which satisfy the following definition:

 

 

 

and natural numbers as those cardinal numbers which are inductive in the sense of Frege:

 

 

 

Completing this in a manner very similar to Frege’s (in particular by making explicit the definitions of ‘zero’, ‘successor’, and of inductivity), one obtains a development of infinite Ontology in which the axioms of Peano’s second-order arithmetic are provable. This construction simplifies and substantially improves that found in the Principia Mathematica. But, as with Whitehead and Russell, it leads to an arithmetic that has to be duplicated in higher orders if its numbers are to be applied not only to the counting of objects, but also to the counting of properties, functions, relations, and so on. This might explain why Leśniewski did not investigate in this direction and why he engaged in a foundational attempt based on his Mereology. Without textual evidence or decisive testimony on that subject, this unfortunately remains a matter of speculation.

4. References and Further Reading

  • Ajdukiewicz, K. “Die Syntaktische Konnexität.” Studia Philosophica 1 (1935): 1-27. (English translation in S. McCall, ed. Polish Logic 1920-1939. Oxford: Clarendon, 1967: 207-231.)
  • Canty, J. T. Leśniewski’s Ontology and Gödel’s Incompleteness Theorem. PhD Thesis of the University of Notre Dame, 1967.
  • Canty, J. T. “The Numerical Epsilon.” Notre Dame Journal of Formal Logic 10 (1969): 47-63.
  • Clay, R. E. “The Relation of Weakly Discrete to Set and Equinumerosity in Mereology.” Notre Dame Journal of Formal Logic 6 (1965): 325-340.
  • Clay, R. E. “The Consistency of Leśniewski’s Mereology Relative to the Real Numbers.” Journal of Symbolic Logic 33 (1968): 251-257.
  • Gessler, N., Joray, P., and Degrange, C. Le logicisme catégoriel. Travaux de Logique (Neuchâtel University) 16 (2005): 1-143.
  • Joray, P. “Logicism in Leśniewski’s Ontology.” Logica Trianguli 6 (2002): 3-20.
  • Joray, P. “A New Path to the Logicist Construction of Numbers.” Travaux de Logique (Neuchâtel University) 18 (2007): 147-165.
  • Joray, P. “Un système de déduction naturelle pour la Protothétique de Leśniewski.” Argumentum 18 (2020): 45-65.
  • Küng, G. “The Meaning of the Quantifiers in the Logic of Leśniewski.” Studia Logica 26 (1977): 309-322.
  • Lejewski, C. “Logic and Existence.” British Journal for the Philosophy of Science 5 (1954): 104-119.
  • Lejewski, C. “On Leśniewski’s Ontology.” Ratio 1 (1958): 150-176.
  • Lejewski, C. “Consistency of Leśniewski’s Mereology.” Journal of Symbolic Logic 34 (1969): 321-328.
  • Leśniewski, S. “Introductory Remarks to the Continuation of my Article ‘Grundzüge eines neuen Systems der Grundlagen der Mathematik’.” in S. McCall (ed.) Polish Logic 1920-1939. Oxford: Clarendon, 1967: 116-169.
  • Leśniewski, S. “On Definitions in the So-called Theory of Deduction.” in S. McCall (ed.) Polish Logic 1920-1939. Oxford: Clarendon, 1967: 170-187.
  • Leśniewski, S. “On the Foundations of Mathematics.” Topoi 2 (1983): 7-52.
  • Leśniewski, S. S. Leśniewski’s Lecture Notes in Logic. edited by J. T. J. Srzednicki and Z. Stachniak, Dordrecht: Kluwer, 1988.
  • Leśniewski, S. Collected Works. edited by S. J. Surma, J. T. J. Srzednicki, J. D. Barnett, and V. F. Rickey, 2 vols, Dordrecht: Kluwer / Warszawa: PWN Polish Scientific Publishers, 1992.
  • Łukasiewicz, J. The Principle of Contradiction in Aristotle. A Critical Study. (1910), English translation by H. R. Heine, Honolulu: Topos Books, 2021.
  • Luschei, E. C. The Logical Systems of Leśniewski. Amsterdam: North Holland, 1962.
  • Miéville, D. Un développement des systèmes logiques de Stanisław Leśniewski. Protothétique-Ontologie-Méréologie. Bern: Peter Lang, 1984.
  • Miéville, D. and Vernant, D. (eds.) Stanisław Leśniewski Aujourd’hui. Grenoble: Groupe de Recherche sur la Philosophie et le Langage, 1996.
  • Miéville, D., Gessler, D., and Peeters, M. Introduction à l’oeuvre de S. Leśniewski. Vols I-VI. Series of special issues of Travaux de Logique (Neuchâtel University), 2001-09.
  • Rickey, V. F. “Interpretations of Leśniewski’s Ontology.” Dialectica 39 (1985): 182-192.
  • Rickey, V. F. “An Annotated Leśniewski Bibliography.” First version 1972, last version 2019, available at https://lesniewski.info/.
  • Russell, B. Principles of Mathematics. London: Allen and Unwin, 1903.
  • Simons, P. “A Semantics for Ontology.” Dialectica 39 (1985): 193-216.
  • Simons, P. “Stanisław Leśniewski.” Stanford Encyclopedia of Philosophy (2015).
  • Słupecki, J. “St. Leśniewski’s Protothetics.” Studia Logica 1 (1953): 44-112.
  • Słupecki, J. “Leśniewski’s Calculus of Names.” Studia Logica 3 (1955): 7-72.
  • Sobociński, B. “L’analyse de l’antinomie russellienne par Leśniewski.” Methodos 1 (1949): 94-107, 220-228, 308-316 and Methodos 2 (1950): 237-257. (English Translation in Srzednicki, J. T. J. and Rickey, V. F. eds, 1984: 11-44.)
  • Srzednicki, J. T. J. and Rickey, V. F. (eds.) Leśniewski’s Systems: Ontology and Mereology. The Hague: Nijhoff / Wrocław: Ossolineum, 1984.
  • Srzednicki, J. T. J. and Stachniak, Z. (eds.) Leśniewski’s Systems: Protothetic. Dordrecht: Kluwer, 1998.
  • Stachniak, Z. Introduction to Model Theory for Leśniewski’s Ontology. Wrocław: Wydawnictwo Uniwersytetu Wrocławskiego, 1981.
  • Tarski, A. “On the Primitive Term of Logistic.” (1923), in Logic, Semantics, Metamathematics. Papers from 1923-1938 by Alfred Tarski. Oxford: Clarendon, 1956, 1-23.
  • Urbaniak, R. Leśniewski’s Systems of Logic and Foundations of Mathematics. Cham: Springer, 2014.
  • Whitehead, A. N. and Russell, B. Principia Mathematica. 2nd ed. Cambridge University Press, 1927.

 

Author Information

Pierre Joray
Email: pierre.joray@univ-rennes.fr
University of Rennes
France


Abelard: Logic

This article describes and reconstructs Peter Abelard’s logic of the twelfth century. Much of what he regarded as logic is now classified as ontology or philosophical semantics. The article concentrates on his treatment of the relation of consequence. Abelard’s most important logical innovations consist of two points:

The distinction between two kinds of negation. This is used to extend the traditional Square of Opposition to an octagon.

The introduction of a relevant implication. The aim is to avoid the paradoxes of strict implication and to safeguard the basic principles of connexive logic.

For the latter goal, Abelard rejected the traditional “locus ab oppositis” which says that if one of two opposite concepts is predicated of a certain subject, the other concept has to be denied of the same subject. We now know this approach failed. Alberic of Paris developed an “embarrassing argument” which showed that—in contradiction to Aristotle’s connexive theses—there exist propositions which logically imply their own negation. The conclusiveness of Alberic’s counterexample does not presuppose the validity of the “locus ab oppositis.” Other aspects of Abelard’s philosophy are treated in the main Abelard article.

Table of Contents

  1. Abelard’s Logical Works
  2. Outlines of the Theory of the Syllogism
  3. Abelard’s Theory of Negation
    1. “Extinctive” vs. “Separating” Negation
    2. Negating Singular Propositions
    3. Negating Quantified Propositions
    4. Abelard’s Octagon of Opposition
  4. Abelard’s Quantification of the Predicate
    1. Singular Propositions with Quantified Predicate
    2. Categorical Forms with Quantified Predicate
  5. Inferences and Implications
    1. Perfect vs. Imperfect Inferences
    2. Strict vs. Relevant Implication
  6. Abelard’s Defence of the Principles of Connexive Logic
    1. The First “Embarrassing Argument”
    2. Alberic’s Argument
  7. References and Further Reading
    1. Editions of Abelard’s Logical Works
    2. Secondary Literature

1. Abelard’s Logical Works

Abelard’s philosophical works were first edited in 1836 by Victor Cousin. Besides the rather theological essay “Sic et non,” the Ouvrages inédits d’Abélard contain various logical works, namely, commentaries on Aristotle, Porphyry, and Boethius, and a preliminary version of the Dialectica. The next important edition of Abelard’s logical writings was achieved by Bernhard Geyer who, in the period from 1919 to 1933, published Peter Abaelards Philosophische Schriften. This collection contains in particular a Logica ‘Ingredientibus’ and a Logica ‘Nostrorum petitioni.’ The strange titles do not have any specific meaning; Geyer simply chose them according to the words with which the texts begin. In 1954, Mario Dal Pra edited Pietro Abelardo Scritti filosofici, which contain, in particular, Abelard’s so-called “children’s logic” (“logica parvulorum”). A complete version of Abelard’s most important logical work, the Dialectica, based on a manuscript in Paris, was edited in 1959 by Lambert Marie de Rijk. This volume forms the basis for the reception of Abelard’s logic which started in the last third of the 20th century.

In 1964, Maria Teresa Beonio-Brocchieri Fumagalli published the small volume La logica di Abelardo, which in 1969 appeared in English as The Logic of Abelard. The title, however, is something of a misnomer because the book does not really deal with Abelard’s logic. The genuine innovations of Abelard’s logical theories were first uncovered by Christopher Martin in the 1980s, especially in his dissertation on Theories of Inference and Entailment in the Middle Ages, and in the papers Martin (1986), (1987), and (2006). Abelard’s theory of the modal operators is extensively discussed in Binini (2022).

King & Arlig (2018) maintain that:

Abelard […] devised a purely truth-functional logic […], and worked out a complete theory of entailment. […] An entailment is complete (perfecta) when it holds in virtue of the logical form (complexio) of the propositions involved. By this […] he means that the entailment holds under any uniform substitution in its terms […]. The traditional moods of the categorical syllogism […] are all instances of complete entailments, or as we should say, valid inference.

Abelard spends a great deal of effort to explore the complexities of the theory of topical inference […]. One of the surprising results of his investigation is that he denies that a correlate of the Deduction Theorem holds, maintaining that a valid argument need not correspond to an acceptable conditional […].

In the end, it seems that Abelard’s principles of topical inferences do not work, a fact that became evident with regard to the topic “from opposites.” Abelard’s principles lead to inconsistent results.

These claims have to be modified, corrected, and supplemented in several respects. First, for Abelard both entailment and disjunction are intensional or modal rather than extensional, that is, they are not merely truth-functional. Second, his theory of entailment distinguishes not only between perfect and imperfect inferences, but also between what is nowadays called strict implication and the even stronger conception of “relevant” implication. Third, in connection with traditional logic it doesn’t make much sense to speak of the Deduction Theorem, which says that if, in a logical calculus with certain axioms and rules of deduction, one may deduce a conclusion C from a set (or conjunction) of premises P1, …, Pn, then the implication (P1 ∧ … ∧ Pn → C) is provable (in that calculus). However, medieval logic has never been developed in the form of an axiomatic calculus! Fourth, as regards Abelard’s principles of “topical inference”, it is not quite correct to maintain that they lead to “inconsistent results.” Rather, Abelard rejected the traditional topic from opposites in order to save “Aristotle’s Theses” from refutation, but his attempt turned out to be unsuccessful since Alberic of Paris presented an ingenious counterexample to the connexive principles which does not make use of the topic from opposites.

2. Outlines of the Theory of the Syllogism

Abelard was well acquainted with the theory of the syllogism as it had been invented by Aristotle (ca. 384-322 BC) and elaborated by Boethius (ca. 480-525). This theory deals with the categorical forms, in which a subject term S is related to a predicate term P:

Universal affirmative proposition (UA)                 Every S is P

Universal negative proposition (UN)                      No S is P

Particular affirmative proposition (PA)                 Some S is P

Particular negative proposition (PN)                      Some S isn’t P.

Later medieval logicians referred to these forms by means of the vowels ‘a’, ‘e’, ‘i’, and ‘o.’ Although Abelard did not use such abbreviations, the forms  are here symbolized as SaP, SeP, SiP, and SoP, respectively. The traditional doctrine of subalternation saying that the universal propositions entail their particular counterparts is then formalized as follows (where ‘⇒’ symbolizes a logical implication):

Sub 1                 SaP ⇒ SiP

Sub 2                 SeP ⇒ SoP.

According to the modern analysis of the categorical forms in terms of first order logic, these laws are not unrestrictedly valid but hold only under the assumption that the subject term S is not empty.
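The dependence of subalternation on a non-empty subject term can be checked on a small finite model. The following Python sketch reads the four categorical forms in the modern way, with terms modelled as sets. The function names and the sample extensions are illustrative assumptions and are not drawn from Abelard or from the traditional presentation.

```python
def a(S, P):
    """Every S is P (modern reading: no existential import)."""
    return all(x in P for x in S)

def e(S, P):
    """No S is P."""
    return all(x not in P for x in S)

def i(S, P):
    """Some S is P."""
    return any(x in P for x in S)

def o(S, P):
    """Some S is not P."""
    return any(x not in P for x in S)

# Hypothetical extensions, chosen only for illustration.
men = {"socrates", "plato"}
mortals = {"socrates", "plato", "fido"}
unicorns = set()  # an empty subject term

# Sub 1 read as a material conditional: SaP implies SiP.
sub1_nonempty = (not a(men, mortals)) or i(men, mortals)
sub1_empty = (not a(unicorns, mortals)) or i(unicorns, mortals)

# Sub 2: SeP implies SoP, again with an empty subject term.
sub2_empty = (not e(unicorns, mortals)) or o(unicorns, mortals)
```

With the non-empty term the conditional holds, while with the empty term both universal propositions are vacuously true and both particular propositions are false, so Sub 1 and Sub 2 fail.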

The theory of opposition says that the contradictory opposite, or negation, of the UA is the PN, and that the negation of the UN is the PA. If the negation operator is symbolized by ‘¬’, these laws take the form:

Opp 1                 ¬SaP ⇔ SoP

Opp 2                 ¬SeP ⇔ SiP.

Hence, ‘It is not the case that every S is P’ is equivalent to ‘Some S is not P’, and ‘It is not the case that no S is P’ means as much as ‘Some S is P’. From this it follows that there is a contrary opposition between the two universal propositions, which means that SaP and SeP can never be true together while it is possible that neither of them is true. Furthermore, the two particular forms are subcontrary, which means that SiP and SoP can never be false together while it is possible that both are true. The laws of subalternation and opposition are often summarized in the well-known “Square of Opposition”:

[Figure: the traditional Square of Opposition]

The traditional theory of conversion says that a PA and a UN may be converted “simpliciter,” that is, one may simply exchange the predicate and the subject:

Conv 1               SiP ⇒ PiS

Conv 2              SeP ⇒ PeS.

Clearly, if some S is P, then conversely some P is S; and if no S is P, then also no P is S. In contrast, the UA can only be converted “per accidens,” that is, the “quantity” of the proposition must be diminished from ‘universal’ to ‘particular’:

Conv 3              SaP ⇒ PiS.

The validity of Conv 3 follows from the law of subalternation, Sub 1, in conjunction with Conv 1: if every S is P, then a fortiori, some S is P so that, conversely, some P is S. Similarly, one might state another law of conversion according to which the UN can also be converted “accidentally”:

Conv 4              SeP ⇒ PoS.

This follows from Conv 2 by means of Sub 2.
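Because the accidental conversions rest on subalternation, they inherit its restriction when the categorical forms are read in the modern way. The following Python sketch searches all pairs of terms over a three-element universe for counterexamples to each conversion law. The set-based reading, the universe, and the helper names are assumptions made for illustration, and a search over one small universe is a finite check rather than a proof.

```python
from itertools import combinations

UNIVERSE = (0, 1, 2)
# Every subset of the universe is a possible extension of a term.
TERMS = [frozenset(c) for r in range(len(UNIVERSE) + 1)
         for c in combinations(UNIVERSE, r)]

def a(S, P):
    return S <= P            # Every S is P

def e(S, P):
    return not (S & P)       # No S is P

def i(S, P):
    return bool(S & P)       # Some S is P

def o(S, P):
    return bool(S - P)       # Some S is not P

LAWS = {
    "Conv 1": lambda S, P: (not i(S, P)) or i(P, S),
    "Conv 2": lambda S, P: (not e(S, P)) or e(P, S),
    "Conv 3": lambda S, P: (not a(S, P)) or i(P, S),
    "Conv 4": lambda S, P: (not e(S, P)) or o(P, S),
}

def counterexamples(law):
    """All pairs (S, P) of terms for which the law fails."""
    return [(S, P) for S in TERMS for P in TERMS if not law(S, P)]

failures = {name: counterexamples(law) for name, law in LAWS.items()}
```

In this model the simple conversions Conv 1 and Conv 2 have no counterexamples. Conv 3 fails exactly when S is empty, and Conv 4 fails exactly when P is empty, since its last step applies Sub 2 to the converted proposition PeS, whose subject is P.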

Finally, most medieval logicians accepted the principle of “conversion by contraposition,” according to which the subject and the predicate of a UA may be exchanged when the terms ‘S’ and ‘P’ are replaced by their negations. That is, if every S is P, then every not-P is not-S. If the negation of a term is symbolized by ‘~’ (thus distinguishing it from the negation operator for propositions, ‘¬’), the law of contraposition takes the form:

Contra             SaP ⇒ ~Pa~S.

According to the principle of “obversion”, the negative propositions UN and PN can equivalently be transformed into affirmative propositions (with a negated predicate):

Obv 1                 SeP ⇔ Sa~P

Obv 2                SoP ⇔ Si~P.

Hence, ‘No S is P’ is equivalent to ‘Every S is not-P’, and ‘Some S isn’t P’ is equivalent to ‘Some S is not-P.’ As a corollary, it follows that conversely the affirmative propositions UA and PA can equivalently be expressed as negative propositions (with a negated predicate):

Obv 3                SaP ⇔ Se~P

Obv 4                SiP ⇔ So~P.

The mutual derivability of these principles presupposes the law of double negation:

Neg 1                ~~T = T.
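Contraposition, obversion, and double negation of terms can be checked in the same set-based fashion. The sketch below assumes that the negation ~T of a term is its complement relative to a fixed universe of discourse. This is a modern modelling choice introduced for illustration, and the names `neg`, `UNIVERSE`, and `PAIRS` are hypothetical.

```python
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2})
TERMS = [frozenset(c) for r in range(len(UNIVERSE) + 1)
         for c in combinations(sorted(UNIVERSE), r)]
PAIRS = [(S, P) for S in TERMS for P in TERMS]

def neg(T):
    """Term negation ~T, modelled as the complement within the universe."""
    return UNIVERSE - T

def a(S, P):
    return S <= P            # Every S is P

def e(S, P):
    return not (S & P)       # No S is P

def i(S, P):
    return bool(S & P)       # Some S is P

def o(S, P):
    return bool(S - P)       # Some S is not P

results = {
    "Contra": all((not a(S, P)) or a(neg(P), neg(S)) for S, P in PAIRS),
    "Obv 1": all(e(S, P) == a(S, neg(P)) for S, P in PAIRS),
    "Obv 2": all(o(S, P) == i(S, neg(P)) for S, P in PAIRS),
    "Obv 3": all(a(S, P) == e(S, neg(P)) for S, P in PAIRS),
    "Obv 4": all(i(S, P) == o(S, neg(P)) for S, P in PAIRS),
    "Neg 1": all(neg(neg(T)) == T for T in TERMS),
}
```

Under these assumptions every entry of `results` is true, including for empty terms. The existential restriction that affects subalternation therefore does not arise for obversion or contraposition on the modern reading.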

A proper syllogism is an inference from two premises P1, P2 to a conclusion C where (normally) all these propositions are categorical forms and the premises must have one term in common. The best-known examples are the four “perfect” syllogisms:

Barbara          CaD, BaC ⇒ BaD

Celarent         CeD, BaC ⇒ BeD

Darii                 CaD, BiC ⇒ BiD

Ferio                CeD, BiC ⇒ BoD.
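The four perfect moods can be tested mechanically by searching for a counterexample, that is, an assignment of extensions to B, C, and D that makes both premises true and the conclusion false. The following Python sketch does this over a three-element universe. The set-based reading of the forms and all names are illustrative assumptions, the search is a finite check rather than a proof, and the final mood (`a`, `i`, `a`) is added here only as a contrasting invalid pattern.

```python
from itertools import combinations

UNIVERSE = (0, 1, 2)
TERMS = [frozenset(c) for r in range(len(UNIVERSE) + 1)
         for c in combinations(UNIVERSE, r)]

def a(X, Y):
    return X <= Y            # Every X is Y

def e(X, Y):
    return not (X & Y)       # No X is Y

def i(X, Y):
    return bool(X & Y)       # Some X is Y

def o(X, Y):
    return bool(X - Y)       # Some X is not Y

# Each mood: (form of the premise about C and D, form of the premise
# about B and C, form of the conclusion about B and D).
MOODS = {
    "Barbara": (a, a, a),
    "Celarent": (e, a, e),
    "Darii": (a, i, i),
    "Ferio": (e, i, o),
}

def is_valid(first, second, conclusion):
    """True if no choice of B, C, D verifies both premises and falsifies the conclusion."""
    return all(
        conclusion(B, D)
        for B in TERMS for C in TERMS for D in TERMS
        if first(C, D) and second(B, C)
    )

validity = {name: is_valid(*forms) for name, forms in MOODS.items()}

# Contrast case (not from the source): CaD, BiC, therefore BaD.
invalid_example = is_valid(a, i, a)
```

All four moods pass the search, whereas the contrasting pattern fails, for instance with B = {0, 1} and C = D = {0}.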

In Abelard’s logic these inferences are not presented in such an abstract form, however, but mainly by way of examples. For instance, Abelard illustrates the “sillogismi perfecti” as follows:

Every just is good; Every virtuous is just; therefore, Every virtuous is good

No good is bad; Every just is good; therefore, No just is bad

Every good is virtuous; Some just is good; therefore, Some just is virtuous

No good is bad; Some just is good; therefore, Some just is not bad. (Compare Dialectica, p. 236)

In some places, however, Abelard also mentions corresponding generalized rules such as, in the case of Barbara:

If something is predicated universally of something else, and another is subjected universally to the subject, then the same is also subjected in the same way, i.e., universally, to the predicate (Compare Dialectica, p. 237),

or, in the case of Ferio:

If something is removed universally from something else, and another is subjected particularly to the subject, then the first predicate is removed particularly from the second subject (ibid.)

Abelard largely endorsed the traditional theory of the syllogism including the laws of subalternation, opposition, conversion, and obversion. In particular, in Logica ‘Ingredientibus’ he drew a standard square of opposition in which the logical relation between the ‘a’ and the ‘o’ proposition as well as the relation between the ‘e’ and the ‘i’ proposition are characterized as “contradictorie,” while the ‘a’ and the ‘e’ proposition are opposed “contrarie” and the ‘i’ and the ‘o’ proposition “subcontrarie.” Finally, a “subalterne” relation is drawn between the ‘a’ and the ‘i’ and between the ‘e’ and the ‘o’ proposition. The only difference between Abelard’s square (p. 412) and the usual square consists in the fact that his example deals with the special case where the universe of discourse has only two elements. Thus, instead of ‘Every man is white,’ ‘No man is white,’ and ‘Some man is white,’ Abelard has ‘Both men are white’ (“Uterque istorum est albus”), ‘Neither is white’ (“Neuter, i. e. nullus ipsorum est albus”) and ‘At least one of them is white’ (“Alter est albus”).

The next section shows, however, that in Dialectica Abelard eventually rejected the traditional laws of opposition. His distinction between so-called “destructive negation” and “separating negation” requires considering two variants of each categorical form, and the resulting ensemble of eight propositions, which Abelard arranged into two squares of opposition, can be united into an octagon.

3. Abelard’s Theory of Negation

As was mentioned already in the introduction, one of Abelard’s major logical innovations consists in the introduction of two kinds of negation by means of which the traditional Square of Opposition is extended to an octagon.

a. “Extinctive” vs. “Separating” Negation

In Logica ‘Ingredientibus,’ Abelard explains:

Not only with respect to categorical propositions, but also with respect to hypothetical propositions one has to distinguish between separating negation and extinctive negation. A separating negation [“negatio separativa”] obtains when by the position of the negative particle the terms are separated from each other […]. But an extinctive negation [“negatio exstinctiva” or “negatio destructiva”] obtains when by the position of the negative particle in front of the entire proposition this proposition is destroyed (Geyer, p. 406).

The extinctive negation of a proposition α is just the ordinary negation, ¬α. It can always be formed by putting ‘not’ in front of α. Thus, with respect to the categorical forms, one obtains:

Not every S is P

Not no S is P

Not some S is P

Not some S isn’t P.

With respect to “hypothetical” propositions, one similarly gets:

Not: If α then β        ¬(α → β).

The extinctive negation satisfies the law of double negation,

Neg 2              ¬¬α ⇔ α,

but for some rather obscure reason Abelard hesitated to accept this law.

A separating negation obtains whenever the expression ‘not’ is placed somewhere “within” a proposition α so that it separates the predicate of a categorical proposition from its subject:

Every S is not P

No S is not P

Some S is not P

Some S isn’t not P.

With respect to hypothetical propositions, a separating ‘not’ separates the antecedent from the consequent:

If α, then not β.

Propositions with an extinctive negation differ from their separating counterparts in so far as, for example, the extinctively negated UA, ‘Not every S is P’, doesn’t have the same meaning (or the same truth-condition) as the separating negation ‘Every S is not P’. In view of the laws of obversion, the latter proposition rather expresses the same as a UN! Similarly, ‘Not no S is P’ means as much as ‘Some S is P’ while, according to the principle of obversion, ‘No S is not P’ amounts to ‘Every S is P’. Similar remarks apply to the extinctive vs. separating negations of the PA and the PN. Yet this doesn’t mean that there exists a general logical or semantical difference between the two kinds of negation. Let us have a closer look at Abelard’s theory of negation as applied, firstly, to singular propositions and then, secondly to categorical propositions.

b. Negating Singular Propositions

Starting from a singular proposition such as:

S1        Socrates is just (Socrates est iustus),

one can consider besides the extinctive negation:

S2        Not: Socrates is just (Non Socrates est iustus),

two variants of a separating negation:

S3a      Socrates is-not just (Socrates non est iustus).
S3b      Socrates is not-just (Socrates est non iustus).

According to Abelard, the variants S3a and S3b are equivalent. Therefore, in what follows, they shall simply be referred to as ‘S3’. Furthermore, S3 can itself be negated as:

S4        Not: Socrates is not just (Non Socrates est non iustus).

According to Abelard, the separating negation ‘Socrates is not just’ is the contrary opposite of the affirmation ‘Socrates is just,’ because both propositions become false when the subject ‘Socrates’ doesn’t exist! More generally, Abelard accepts the following principle:

Exist     For any singular term s and any predicate P: the proposition ‘s is P’ implies (or presupposes) that ‘s is’, that is, that ‘s exists’.

Hence, if s doesn’t exist, both ‘s is P’ and ‘s is not-P’ are false. Therefore, the two (affirmative) propositions ‘s is P’ and ‘s is not-P’ together with their extinctive negations form the following square of opposition:

S1   s is P          contrary       s is ~P       S3
S4   ¬(s is ~P)      subcontrary    ¬(s is P)     S2
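The effect of the principle Exist on this square can be made concrete with a small model. The Python sketch below assumes that a singular affirmation, with either a positive or a negated predicate, is true only if the subject belongs to a set of existing individuals. The function names and the individuals are hypothetical illustrations.

```python
def is_P(s, P, existing):
    """'s is P': true only if s exists and falls under P."""
    return s in existing and s in P

def is_not_P(s, P, existing):
    """'s is ~P' (separating negation): true only if s exists and does not fall under P."""
    return s in existing and s not in P

def square(s, P, existing):
    """Truth values of S1 to S4 for subject s and predicate P."""
    s1 = is_P(s, P, existing)
    s3 = is_not_P(s, P, existing)
    return {"S1": s1, "S2": not s1, "S3": s3, "S4": not s3}

# Hypothetical model: only Socrates exists, and he is just.
existing = {"socrates"}
just = {"socrates"}

with_existing_subject = square("socrates", just, existing)
with_missing_subject = square("pegasus", just, existing)
```

For the existing subject, S1 and S4 are true while S2 and S3 are false. For the non-existing subject, the contraries S1 and S3 are both false and the subcontraries S2 and S4 are both true, which is the pattern the square expresses.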

c. Negating Quantified Propositions

Starting from a PA such as:

C1   Some man is white (Quidam homo est albus),

one can consider besides the extinctive negation:

C2   Not: Some man is white      (Non quidam homo est albus)

two variants of a separative negation:

C3a   Some man is-not white       (Quidam homo non est albus)

C3b   Some man is not-white       (Quidam homo est non albus).

While C3a is a particular negative proposition, C3b is usually considered as a particular affirmative proposition with a negative predicate. Since, in accordance with the theory of obversion, Abelard considered these propositions as equivalent, C3a and C3b may simply be referred to as ‘C3’.

Similarly, starting from a UA such as:

C4   Every man is white              (Omnis homo est albus)

one can consider besides the extinctive negation:

C5   Not: Every man is white     (Non omnis homo est albus)

two variants of a separative negation:

C6a   Every man is-not white       (Omnis homo non est albus)

C6b   Every man is not-white       (Omnis homo est non albus).

In view of the flexible grammar of the Latin language, C6a might be understood as synonymous with C5. Abelard, however, apparently understands C6a as equivalent to C6b, and both variants express the same state of affairs as the UN ‘No man is white.’ Therefore, both variants may simply be referred to as ‘C6’.

On p. 407 of Logica ‘Ingredientibus,’ Abelard draws the following diagram:

Omnis homo est albus [contrary] Omnis homo non est albus
[⇓] [⇓]
Quidam homo est albus [subcontrary] Quidam homo non est albus

This appears to be a normal Square of Opposition formed by the propositions C4, C6, C1, and C3. Next, Abelard draws another diagram consisting of the “extinctive” negations of the previous propositions:

Non omnis homo est albus Non omnis homo non est albus
[⇑] [⇑]
Non quidam homo est albus Non quidam homo non est albus

This appears to be a mirrored version of a normal Square of Opposition formed by the propositions C5, C2, and the following two:

C7   Not every man isn’t white (Non omnis homo non est albus)

C8   Not some man isn’t white (Non quidam homo non est albus).

A few pages later, Abelard presents variants of these diagrams. The first diagram is entitled ‘exstinctiva’ because all negations are “extinctive”:

Omnis homo est albus contrarie Non quidam homo est albus
subalterne subalterne
Quidam homo est albus subcontrarie Non omnis homo est albus

Abelard’s annotations ‘contrarie’, ‘contradictorie’, ‘subcontrarie’ and ‘subalterne’ suggest that the figure represents an ordinary square of opposition. This also appears to hold true for the next diagram which is entitled ‘separativa’ since each proposition is now paraphrased by means of a separating negation.

Omnis homo non est albus […] Non quidam homo [non] est albus
[…] […]
Quidam homo non est albus […] Non omnis homo non est albus

Note that here again the structure of the ordinary square is mirrored: The UN stands at the place of the UA and vice versa.

d. Abelard’s Octagon of Opposition

However, both in Logica ‘Ingredientibus’ and in Dialectica Abelard insists that the traditional view of a contradictory opposition between the UA and the PN is mistaken. The contradictory opposite of SaP is ‘Not: Every S is P,’ but this proposition is not equivalent to SoP. Rather, ‘Some S is not P’ is contrary to ‘Every S is P’ because it is possible that both propositions are false. Thus, Abelard explains:

Also with respect to categorical propositions the only correct negation of an affirmation, sharing the truth-values with it, appears to be that proposition which destroys the sense of the sentence by placing the negation in front of it; thus the negation of ‘Every man is a man’ is ‘Not every man is a man’, but not ‘Some man is not a man’; the latter might be false together with the affirmation. For if it were the case that there are no men at all, then neither ‘Every man is a man’ nor ‘Some man is not a man’ would be true. (Compare Dialectica, p. 176)

Hence, according to Abelard, if the subject-term S is “empty,” then both ‘Some S is not S’ and ‘Every S is S’ become false. More generally, if S is “empty”, then, for every P, the UA ‘Every S is P’ is false, that is, this proposition has “existential import”: it entails that ∃xS(x). This consideration leads to the assumption of altogether eight propositions. On the one hand, we have the four “normal” categorical forms which can be formalized by means of the quantifiers ‘∃x’ (‘there exists at least one x’) and ‘∀x’ (‘for every x’) plus the symbols ‘∧’, ‘∨’ and ‘⊃’ for the propositional operators of conjunction, disjunction and (material) implication:

C8 UA Not some S is not P ¬∃x(S(x) ∧ ¬P(x))
C2 UN Not some S is P ¬∃x(S(x) ∧ P(x))
C1 PA Some S is P ∃x(S(x) ∧ P(x))
C3 PN Some S is not P ∃x(S(x) ∧ ¬P(x))

On the other hand, one obtains two “strong” versions of universal propositions with existential import:

C4 UA+ Every S is P ∃xS(x) ∧ ∀x(S(x) ⊃ P(x))
C6 UN+ Every S is not P ∃xS(x) ∧ ∀x(S(x) ⊃ ¬P(x))

Furthermore, the negations of these “strong” universal propositions yield “weak” interpretations of corresponding particular propositions:

C7 PA- Not every S is not P ¬∃xS(x) ∨ ∃x(S(x) ∧ P(x))
C5 PN- Not every S is P ¬∃xS(x) ∨ ∃x(S(x) ∧ ¬P(x))
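The eight formalizations can be evaluated on a small finite model to see which logical relations they support. The following Python sketch encodes each formula as a condition on the extensions of S and P and tests some relations over every pair of terms in a three-element universe. The encoding, the helper names, and the selection of relations tested are illustrative assumptions, and the search is a finite check rather than a proof.

```python
from itertools import combinations

UNIVERSE = (0, 1, 2)
TERMS = [frozenset(c) for r in range(len(UNIVERSE) + 1)
         for c in combinations(UNIVERSE, r)]
PAIRS = [(S, P) for S in TERMS for P in TERMS]

FORMS = {
    "C8": lambda S, P: not (S - P),               # UA:  no S lacks P
    "C2": lambda S, P: not (S & P),               # UN:  no S has P
    "C1": lambda S, P: bool(S & P),               # PA:  some S has P
    "C3": lambda S, P: bool(S - P),               # PN:  some S lacks P
    "C4": lambda S, P: bool(S) and S <= P,        # UA+: S non-empty and every S has P
    "C6": lambda S, P: bool(S) and not (S & P),   # UN+: S non-empty and no S has P
    "C7": lambda S, P: not S or bool(S & P),      # PA-: S empty or some S has P
    "C5": lambda S, P: not S or bool(S - P),      # PN-: S empty or some S lacks P
}

def implies(x, y):
    """x entails y in every case of the model."""
    return all((not FORMS[x](S, P)) or FORMS[y](S, P) for S, P in PAIRS)

def contradictory(x, y):
    """x and y always have opposite truth values."""
    return all(FORMS[x](S, P) != FORMS[y](S, P) for S, P in PAIRS)

def contrary(x, y):
    """x and y are never both true, but are both false in some case."""
    never_both_true = not any(FORMS[x](S, P) and FORMS[y](S, P) for S, P in PAIRS)
    both_false_somewhere = any(
        not FORMS[x](S, P) and not FORMS[y](S, P) for S, P in PAIRS)
    return never_both_true and both_false_somewhere

checks = {
    "C4 contradicts C5": contradictory("C4", "C5"),
    "C6 contradicts C7": contradictory("C6", "C7"),
    "C8 contradicts C3": contradictory("C8", "C3"),
    "C2 contradicts C1": contradictory("C2", "C1"),
    "C4 is contrary to C3": contrary("C4", "C3"),
    "C4 implies C8": implies("C4", "C8"),
    "C8 implies C7": implies("C8", "C7"),
    "C8 implies C1": implies("C8", "C1"),
}
```

Every entry of `checks` is true except the last. The strong ‘Every S is P’ (C4) is contradicted only by its extinctive negation C5 and is merely contrary to ‘Some S is not P’ (C3), as Abelard maintains, while the traditional subalternation from C8 to C1 fails in this model because of the empty subject term.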

The logical relations between these propositions are displayed in the subsequent “Octagon of Opposition” where horizontal dotted lines indicate contradictory oppositions, or negations, bold arrows stand for (unrestrictedly valid) logical implications, while thin arrows symbolize the traditional inferences of subalternation which hold only for “non-empty” subject terms:

4. Abelard’s Quantification of the Predicate

According to the standard historiography of logic (for example, Kneale 1962), the theory of the “quantification of the predicate” was developed only in the 19th century by William Hamilton and by Augustus De Morgan. However, preliminary versions of such a theory may have already been apparent in the 17th-century work of Leibniz (compare Lenzen 2010), in the 16th-century work of Caramuel (compare Lenzen 2017), and in the 14th-century work of Buridan (compare Read 2012). Interestingly, Abelard had already dealt with this issue both in Logica ‘Ingredientibus’ and in Dialectica. He developed the theory in two steps: first for propositions with a singular subject, then for categorical forms with a quantified subject.

a. Singular Propositions with Quantified Predicate

On pp. 189-190 of Dialectica, Abelard considers the following propositions:

SQ1 Socrates est omnis homo
SQ2 Socrates non est aliquis homo
SQ3 Socrates est aliquis homo
SQ4 Socrates non est omnis homo

According to Abelard, SQ1 and SQ2 are contrary to each other. Furthermore, SQ3 follows from SQ1 by subalternation, and similarly SQ2 entails SQ4. Hence SQ3 and SQ4 are “opposed” as subcontraries, and one obtains the following Square of Opposition:

SQ1 Socr. is every man Socr. is no man SQ2
SQ3 Socr. is some man Socr. is not every man SQ4

Next, Abelard considers the following propositions:

SQ5 Omnis homo est Socrates
SQ6 Nullus homo est Socrates
SQ7 Aliquis homo est Socrates
SQ8 Non omnis homo est Socrates

According to Abelard, SQ5 is equivalent to SQ1, and SQ6 is equivalent to SQ2. Furthermore, although Abelard himself doesn’t explicitly say this, SQ7 is equivalent to SQ3, and SQ8 is equivalent to SQ4.

Within the framework of first-order logic, SQ5, ‘Every man is Socrates,’ is most naturally interpreted as: ‘For every x: If x is a man, then x is identical with Socrates,’ symbolically ∀x(M(x) ⊃ (x = s)). Similarly, SQ6, ‘No man is Socrates,’ can be formalized as ¬∃x(M(x) ∧ (x = s)). By way of subalternation, SQ5 entails ∃x(M(x) ∧ (x = s)), and SQ6, or its equivalent ∀x(M(x) ⊃ (x ≠ s)), similarly entails ∃x(M(x) ∧ (x ≠ s)). All these relations can be represented by another Square of Opposition:

SQ1/SQ5 ∀x(M(x) ⊃ (x = s)) ∀x(M(x) ⊃ (x ≠ s)) SQ2/SQ6
SQ3/SQ7 ∃x(M(x) ∧ (x = s)) ∃x(M(x) ∧ (x ≠ s)) SQ4/SQ8

This reconstruction largely accords with two squares which Abelard himself presented in Logica Ingredientibus, p. 411 (for more details compare Lenzen (2021), ch. 10).
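The relations in this square can be tested by enumerating small models. The sketch below is an assumption-laden illustration: `SQ`, `models` and `never` are hypothetical helper names, ‘man’ is a tuple of booleans `M`, Socrates is a chosen element `s` of the domain, and domains are limited to four elements. It suggests that contrariety, subcontrariety and subalternation hold when M is non-empty, while the two diagonals are contradictory without restriction.

```python
from itertools import product

# The four corners of the square, read as in the first-order reconstruction.
SQ = {
    "SQ1": lambda D, M, s: all(x == s for x in D if M[x]),   # also SQ5
    "SQ2": lambda D, M, s: all(x != s for x in D if M[x]),   # also SQ6
    "SQ3": lambda D, M, s: any(M[x] and x == s for x in D),  # also SQ7
    "SQ4": lambda D, M, s: any(M[x] and x != s for x in D),  # also SQ8
}

def models(max_size=4, allow_empty=False):
    """Enumerate domains, extensions of M, and a referent s for 'Socrates'."""
    for n in range(1, max_size + 1):
        for M in product([False, True], repeat=n):
            if any(M) or allow_empty:
                for s in range(n):
                    yield range(n), M, s

def never(condition, allow_empty=False):
    """True if no enumerated model satisfies the given condition."""
    return not any(condition(D, M, s) for D, M, s in models(allow_empty=allow_empty))

def v(name, D, M, s):
    return SQ[name](D, M, s)

# Contraries: never both true (for non-empty M).
contrary = never(lambda D, M, s: v("SQ1", D, M, s) and v("SQ2", D, M, s))
# Subcontraries: never both false (for non-empty M).
subcontrary = never(lambda D, M, s: not v("SQ3", D, M, s) and not v("SQ4", D, M, s))
# Subalternation: SQ1 true and SQ3 false never occurs (for non-empty M).
subaltern = never(lambda D, M, s: v("SQ1", D, M, s) and not v("SQ3", D, M, s))
# Contradictories: SQ1 and SQ4 always differ, even when M is empty.
diagonal = never(lambda D, M, s: v("SQ1", D, M, s) == v("SQ4", D, M, s), allow_empty=True)
# Without the non-emptiness restriction, SQ1 and SQ2 can both be true.
contrary_unrestricted = never(
    lambda D, M, s: v("SQ1", D, M, s) and v("SQ2", D, M, s), allow_empty=True)

print(contrary, subcontrary, subaltern, diagonal, contrary_unrestricted)
# True True True True False
```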

b. Categorical Forms with Quantified Predicate

In a very condensed passage of Dialectica (p. 190), Abelard sketches how the theory of the quantification of the predicate can be transferred from singular propositions to categorical propositions. He starts with a generalisation of ‘Socrates est omne animal’ and ‘Socrates non est omne animal’:

CQ1                 Every man is every animal (Omnis homo est omne animal).

CQ2                 No man is every animal (Nullus homo est omne animal).

According to Abelard, these two propositions are doubly contrary (“dupliciter contrarie”) to each other. Next Abelard considers the subcontrary propositions:

CQ3                 Some man is some animal (Quidam homo est aliquod animal).

CQ4                 Some man is not every animal (Quidam homo non est omne animal).

He maintains that CQ3 follows from CQ1 by subalternation. Similarly, CQ4 follows from CQ2 by subalternation. Furthermore, Abelard maintains that another subalternation exists between:

CQ5                 No man is some animal (Nullus homo est aliquod animal).

CQ6                 Some man is every animal (Quidam homo est omne animal).

These propositions can be formalized as follows:

CQ1 Every man is every animal ∀x(Mx ⊃ ∀y(Ay ⊃ (x = y)))
CQ2 No man is every animal ¬∃x(Mx ∧ ∀y(Ay ⊃ (x = y)))
CQ3 Some man is some animal ∃x(Mx ∧ ∃y(Ay ∧ (x = y)))
CQ4 Some man is not every animal ∃x(Mx ∧ ¬∀y(Ay ⊃ (x = y)))
CQ5 No man is some animal ¬∃x(Mx ∧ ∃y(Ay ∧ (x = y)))
CQ6 Some man is every animal ∃x(Mx ∧ ∀y(Ay ⊃ (x = y)))

It is easy to see that Abelard’s theses concerning the contradictory opposition between CQ3 and CQ5 and between CQ4 and CQ1 are correct. Also, CQ1 and CQ2 are contrary to each other. Furthermore, as Abelard explains, CQ3 logically follows from CQ1 by way of subalternation. However, he failed to see that CQ3 follows from CQ1 so to speak by a double subalternation: CQ1 first entails:

CQ7 Every man is some animal ∀x(Mx ⊃ ∃y(Ay ∧ (x = y)))

And CQ7 in turn entails CQ3. Altogether the logical relations between the affirmative propositions CQ1, CQ3, CQ6, and CQ7 can be displayed as follows:

The logical relations of this diagram are reversed when one considers the negations of the four propositions. We know already that the negation of CQ1 is CQ4, that of CQ6 is CQ2, and that of CQ3 is CQ5. So, we only have to add the negation of CQ7 (‘Every man is some animal’) which amounts to:

CQ8 Some man is not some animal ∃x(Mx ∧ ¬∃y(Ay ∧ (x = y)))
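The eight formalizations CQ1 to CQ8 can be compared by the same kind of finite search. The following sketch is hedged accordingly: `CQ`, `only`, `some`, `models`, `entails` and `contradictory` are hypothetical names, the domains have at most four elements, and the result is a check of the first-order readings given above, not of Abelard's own text. Under these assumptions the four negation pairs come out as contradictories without restriction, whereas the implications among the affirmative forms depend on both terms being non-empty.

```python
from itertools import product

def only(D, A, x):
    """∀y(Ay ⊃ x = y): x is 'every animal'."""
    return all(x == y for y in D if A[y])

def some(D, A, x):
    """∃y(Ay ∧ x = y): x is 'some animal'."""
    return any(A[y] and x == y for y in D)

CQ = {
    "CQ1": lambda D, M, A: all(only(D, A, x) for x in D if M[x]),
    "CQ2": lambda D, M, A: not any(M[x] and only(D, A, x) for x in D),
    "CQ3": lambda D, M, A: any(M[x] and some(D, A, x) for x in D),
    "CQ4": lambda D, M, A: any(M[x] and not only(D, A, x) for x in D),
    "CQ5": lambda D, M, A: not any(M[x] and some(D, A, x) for x in D),
    "CQ6": lambda D, M, A: any(M[x] and only(D, A, x) for x in D),
    "CQ7": lambda D, M, A: all(some(D, A, x) for x in D if M[x]),
    "CQ8": lambda D, M, A: any(M[x] and not some(D, A, x) for x in D),
}

def models(max_size=4, nonempty_terms=False):
    """Enumerate interpretations of M and A over domains of size 1..max_size."""
    for n in range(1, max_size + 1):
        for M in product([False, True], repeat=n):
            for A in product([False, True], repeat=n):
                if not nonempty_terms or (any(M) and any(A)):
                    yield range(n), M, A

def entails(a, b, nonempty_terms=False):
    return all(CQ[b](D, M, A)
               for D, M, A in models(nonempty_terms=nonempty_terms)
               if CQ[a](D, M, A))

def contradictory(a, b):
    return all(CQ[a](D, M, A) != CQ[b](D, M, A) for D, M, A in models())

for pair in [("CQ1", "CQ4"), ("CQ6", "CQ2"), ("CQ3", "CQ5"), ("CQ7", "CQ8")]:
    print(pair, contradictory(*pair))            # each pair: True

for step in [("CQ1", "CQ7"), ("CQ7", "CQ3"), ("CQ1", "CQ6"), ("CQ6", "CQ3")]:
    print(step, entails(*step, nonempty_terms=True))   # each step: True
```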

Hence one obtains the following diagram for negative categorical propositions with quantified predicate:

On p. 411 of Logica Ingredientibus, Abelard presents two squares of opposition, one entitled “exstinctiva”, the other “separativa.” After correcting a minor mistake, these squares accord with our diagrams, and both squares can easily be combined into the following octagon:

Here again dotted lines indicate a contradictory opposition while the arrows symbolize logical implications.

Most likely Abelard understood propositions CQ3 (‘Some S is some P’) and CQ5 (‘No S is some P’) as alternative formulations of the ordinary PA and UN. Similarly, propositions CQ7 (‘Every S is some P’) and CQ8 (‘Some S is not some P’), which were “overlooked” by Abelard, may be interpreted as alternative formulations of the ordinary UA and PN. Therefore, the above octagon contains as a substructure the usual square of opposition:

The thin arrows again signal that these inferences of subalternation hold only for non-empty terms.

5. Inferences and Implications

Like many other medieval logicians, Abelard fails to make a systematic distinction between inferences and implications. He refers to them equally as “inferentia”, “consequentia”, or “consecutio.” If the inference is a genuine syllogism consisting of two categorical propositions as premises and another categorical proposition as conclusion, Abelard typically separates them by means of “ergo,” for instance:

Omnis homo est animal

Omne animal est animatum […]

Ergo omnis homo est animatus. (Dialectica, p. 254)

However, he has no qualms about expressing this inference equivalently by the conditional:

Si omnis homo est animal et omne animal est animatum, omnis homo est animatus (ibid.).

Also, he has no qualms about referring to the premise(s) of an inference as “antecedens,” to the conclusion as “consequens”, and to the entire inference as “argumentum.”

a. Perfect vs. Imperfect Inferences

Abelard defines an inference as perfect:

[…] when it holds in virtue of the logical form (complexio) of the propositions involved. By this […] he means that the entailment holds under any uniform substitution in its terms […]. The traditional moods of the categorical syllogism […] are all instances of complete entailments, or as we should say, valid inference. (King & Arlig 2018)

Somewhat surprisingly, Abelard was not willing to grant the attribute ‘perfect’ also to a tautology such as ‘si est animatum, est animatum’ (for a closer discussion compare Lenzen (2021), ch. 11).

Typical examples of imperfect inferences are enthymemes, which is to say, inferences which are not (yet) formally valid, but that can be turned into such by the addition of another premise. Thus, Abelard mentions the example ‘si omnis homo est animal, omnis homo est animatus,’ which may be transformed into a perfect syllogism by the addition of ‘omne animal est animatum’. The latter proposition, which Abelard also paraphrases without quantifier as ‘si est animal, est animatum’, is necessarily true because of the “nature of things.” Nowadays we would call such propositions analytically true.

b. Strict vs. Relevant Implication

As was already mentioned above, some modern commentators express their amazement that, allegedly, “Abelard denied the deduction theorem” (Guilfoy (2008)) or that he at least denied “that a correlate of the Deduction Theorem holds” (King & Arlig (2018)). This point had first been raised by Martin who pointed out:

The deduction theorem has often been regarded as central in logic, and it has been felt that one hardly has a logic for entailment if validity of argument and so derivability are not connected in an appropriate way to the truth of a conditional. There is some connection for Abelard since, if a conditional is true, it satisfies condition C, and so the corresponding argument will certainly be valid in the sense of satisfying condition I. In general, however, entailment [as] expressed in true conditionals is not the converse of derivability or logical consequence as expressed in valid arguments. (Martin 1986, p. 569)

In a later essay, he similarly maintained that:

[…] one cannot conditionalize a valid argument to obtain a true conditional and so the Deduction Theorem does not hold for Abelard’s logic, a feature which shocked his student John of Salisbury. (Martin 2006, p. 182)

Actually, John of Salisbury expressed his “shock” as follows:

I am amazed that the Peripatetic of Pallet so narrowly laid down the law for hypotheticals that he judged only those to be accepted the consequent of which is included in the antecedent […] indeed while he freely accepted argumenta, he rejected hypotheticals unless forced by the most manifest necessity. (Translation from Martin 2006, p. 196)

Now, ever since Aristotle, an inference has been regarded as logically valid if and only if it is impossible that the premise(s) are all true and yet the conclusion be false. A large majority of medieval logicians similarly considered a conditional A → C as true if and only if it can’t be the case that the antecedent A is true while the consequent C is false. This is the common definition of a strict implication (in distinction from a merely material implication) as it was re-invented in the 20th century by C. I. Lewis. Abelard thought it necessary to further distinguish between two kinds of the strictness or necessity of the consequence relation:

There seem to be two necessities of consequences, one in a larger sense, if namely that what is maintained in the antecedent cannot be the case without that what is maintained in the consequent; the other in a narrower sense, if namely not only the antecedent cannot be true without the consequent, but if also the antecedent by itself requires the consequent. (Cf. Dialectica, p. 283-4)

Abelard clearly saw that the former definition gives rise to what are nowadays called the “paradoxes” of strict implication, in particular the principle which later medieval logicians came to call “Ex impossibili quodlibet”:

EIQ   If A is impossible, then the inference from A to B (or the implication A → B) is valid (or true) for every proposition B.
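Why the first, wider criterion yields EIQ can be seen in a toy possible-worlds model. The sketch below rests on assumptions not found in the source: propositions are represented as sets of worlds, there are three hypothetical worlds, and `strictly_implies` and `propositions` are illustrative names. An impossible proposition is true in no world, so there is no world in which it is true and the consequent false.

```python
from itertools import combinations

WORLDS = frozenset(range(3))   # hypothetical set of possible worlds

def propositions(worlds):
    """All propositions over the given worlds, each as the set of worlds where it is true."""
    ws = sorted(worlds)
    return [frozenset(c) for r in range(len(ws) + 1) for c in combinations(ws, r)]

def strictly_implies(a, b, worlds=WORLDS):
    """A strictly implies B if no world makes A true and B false."""
    return all(w in b for w in worlds if w in a)

IMPOSSIBLE = frozenset()       # true in no world

# EIQ: the impossible proposition strictly implies every proposition B.
eiq_holds = all(strictly_implies(IMPOSSIBLE, b) for b in propositions(WORLDS))
print(eiq_holds)                                                # True

# A possible antecedent does not imply everything.
print(strictly_implies(frozenset({0}), frozenset({1})))         # False
```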

Thus, Abelard considered the proposition ‘If Socrates is a stone, he is an ass’ which according to the first, liberal criterion counts as true because “it is impossible that Socrates should be a stone, and so impossible that he should be a stone without being an ass” (Kneale 1962, p. 217). For reasons discussed in section 6, Abelard did not want to accept this inference (or conditional) as sound. Therefore he suggested the stronger condition which Martin (2006: 181) explained as follows:

The antecedent is required to be relevant to the consequent in that its truth is genuinely sufficient for that of the consequent and this is guaranteed by the consequent being in some way contained in the antecedent.

As a standard example for a correct “relevant” implication Abelard mentions:

(i) If he is a man, he is an animal (“Si est homo, est animal”).

Here, the antecedent requires the consequent by itself (“ex se ipso”) since the notion of man contains the notion of animal. In contrast,

(ii) If he is a man, he is not a stone (“Si est homo, non est lapis”)

is not accepted by Abelard as a correct relevant implication, although, of course, it satisfies the weaker criterion of a strict implication. Abelard argues (Dialectica, p. 284) that the truth of (ii) only rests on our experience which shows that the properties ‘man’ and ‘stone’ are disparate, that is to say, they do not simultaneously subsist in one and the same thing. Yet, as the Kneales explained “the sense of the consequent […] is not contained in the sense of the antecedent” (Kneale 1962: p. 218).

In the wake of Abelard, many attempts have been made to elaborate the idea of a “relevant” implication and to develop a full-fledged logic of “containment.” Until today, no real agreement has been reached. Abelard contributed to this enterprise mainly by suggesting that, in generalization of example (i), a “relevant” implication obtains whenever the antecedent refers to a certain species while the consequent refers to the corresponding kind. The correctness of such conditionals does not depend on whether the antecedent is true or false. Even impossible antecedents can support correct conditionals, that is:

If Socrates is a pearl, Socrates is a stone (cf. Logica ‘Ingredientibus’, 329)

If Socrates is an ass, Socrates is an animal (cf. Dialectica, 346).

6. Abelard’s Defence of the Principles of Connexive Logic

In the 1960s, Storrs McCall introduced the idea of a connexive implication which can be characterized by the requirement that the operator ‘→’ satisfies “Aristotle’s Thesis” and “Boethius’ Thesis.” The crucial passage from Prior Analytics 57b3-14 was interpreted by McCall as follows:

What Aristotle is trying to show here is that two implications of the form ‘If p then q’ and ‘If not-p then q’ cannot both be true. The first yields, by contraposition, ‘If not-q then not-p’, and this together with the second gives ‘If not-q then q’ by transitivity. But this, Aristotle says, is impossible: a proposition cannot be implied by its own negation. […] We shall henceforth refer to the principle that no proposition can be implied by its own negation, in symbols ‘~(~p → p)’, as Aristotle’s [first] thesis […] The [other] connexive principle ~[(p → q) & (~p → q)] will be referred to as Aristotle’s second thesis. (McCall 2012, p. 415)

If one replaces McCall’s symbols ‘~’ and ‘&’ for negation and conjunction by our symbols ‘¬’ and ‘∧’, one obtains:

Arist 1              ¬(¬p → p)

Arist 2              ¬((p → q) ∧ (¬p → q)).

The second principle can be paraphrased by saying that no proposition is implied by both of two contradictory propositions. Abelard similarly maintained that “one and the same consequent cannot follow from the affirmation and from the negation of the same proposition” (cf. Dialectica, p. 290). Like Aristotle, Abelard also argued that if Arist 2 did not hold, then Arist 1 would not hold either, and this would be absurd since “the truth of one of two contradictory propositions not only does not require the truth of the other, but instead it expels and extinguishes it” (ibid.).

Moreover, Abelard pointed out that Aristotle’s Thesis (“regula aristotelica”) not only holds in the version where it is denied “that one and the same follows from the affirmation and from the negation of the same”, but also in the variant that “the affirmation and the negation of the same cannot be implied by one and the same proposition”, that is:

Abel 2              ¬((p → q) ∧ (p → ¬q)).

For example, the propositions ‘If x is a man, x is an animal’ and ‘If x is a man, x is not an animal’ cannot both be true, because otherwise one might derive the “inconveniency” ‘If x is a man, x is not a man,’ which in Abelard’s opinion is “impossible.” The corresponding generalization is:

Abel 1              ¬(p → ¬p).

Principle Abel 2 is usually referred to as ‘Boethius’ Thesis’. Thus, McCall picked up a passage from De Syllogismo hypothetico where Boethius (ca. 480-524) maintained: “Si est A, cum sit B, est C; […] atqui cum sit B, non est C; non est igitur A.” McCall then “transliterated” this as the inference:

If p, then if q then r,
If q then not-r
Therefore, not-p.

The reasoning that led Boethius to assert the validity of this schema was presumably this. Since the two implications ‘If q then r’ and ‘If q then not-r’ are incompatible, the second premise contradicts the consequent of the first premise. Hence, by modus tollens, we get the negation of the antecedent of the first premise, namely ‘not-p’. […] The corresponding conditional, If q → r then ~(q → ~r) will be denoted Boethius’ thesis, and serves with the thesis ~(p → ~p) as the distinguishing mark of connexive logic (McCall 2012, p. 416).

As was argued in Lenzen (2020), Boethius’ term-logical principle primarily expresses the idea that if a UA of type ‘If x is A, then x is B’ is true, then the UN ‘If x is A, then x is not-B’ can’t be true as well, which is to say, the two universal propositions are contrary to each other. Yet it is probably correct to assume that Boethius would also have endorsed the propositional principle called ‘Boethius’ Thesis’, that is, Abel 2. On the other hand, Boethius nowhere put forward a term-logical counterpart of Abel 1. Therefore, it seems preferable to refer to these principles as Abelard’s Theses.

a. The First “Embarrassing Argument”

Logicians from the 12th-century school of the “Montanae” developed an argument to show that the connexive principles do not hold without restriction. Martin (1986: 569-70) reconstructed their “embarrassing argument” as follows:

1. If Socrates is a man and a stone, Socrates is a man.
2. If Socrates is a man, Socrates is not a stone.
So 3. If Socrates is a man and a stone, Socrates is not a stone.
But 4. If Socrates is not a stone, Socrates is not a man and a stone.
So 5. If Socrates is a man and a stone, Socrates is not a man and a stone.

Conclusion (5) has the logical structure (p ∧ q) → ¬(p ∧ q), hence it contradicts Abel 1. However, Abelard was not much worried by this counterexample because he considered step (2) as not valid. This step consists of an application of the traditional “Locus ab oppositis.” In the invaluable collection Logica Modernorum edited by Lambert M. De Rijk, the rule is formulated as “si aliquid oppositorum predicatur de aliquo, aliud abnegatur ab illo” (De Rijk 1967, II/2, p. 62). This means:

Opp 3    If one of two opposite predicates is affirmed of a certain subject, then the other predicate must be denied (of the same subject).

The notion of opposite predicates here has to be understood as not only applying to contradictory concepts like ‘man’ and ‘not-man,’ but also to contrary concepts like ‘man’ and ‘horse’. Somewhat more exactly:

Opp 4    Two predicates (or concepts) P1, P2 are opposite to each other if and only if there can’t exist any x such that P1(x) and P2(x).

In particular, ‘man’ and ‘stone’ are opposite concepts; for any x, it is impossible that x is a man, M(x), and x is a stone, S(x). Hence, according to the definition given above, M(x) strictly implies ¬S(x). Yet, for Abelard, this is not a relevant or “natural” implication because:

Not being a stone does not follow in the appropriate way from being a man, even though it is inseparable from being a man. It does not follow in the appropriate way since it is no part of the nature of a man that he not be a stone. (Martin 1987, p. 392)

The plausibility of this view need not be discussed here because it soon turned out that there are other counter-examples to the connexive principles Abel 1, 2 which do not rely on the “locus ab oppositis.”

b. Alberic’s Argument

As was first reported in Martin (1986), Alberic of Paris put forward the subsequent “embarrassing argument”:

1. If Socrates is a man and is not an animal, Socrates is not an animal.
2. If Socrates is not an animal, Socrates is not a man.
3. If Socrates is not a man it is not the case that Socrates is a man and an animal.
C*. If Socrates is a man and not an animal, it is not the case that Socrates is a man and not an animal. (Martin 1987, pp. 394-5)

Since conclusion C* has the structure (p ∧ ¬q) → ¬(p ∧ ¬q), it constitutes another counterexample to Abel 1. Furthermore, the argument does not depend on the “locus ab oppositis” because Line 2 is obtained by applying the principle of contraposition to the unproblematic conditional ‘If Socrates is a man, Socrates is an animal.’ Since the proof makes use only of logical laws which Abelard regarded as indispensable, “[…] confronted with this argument Master Peter essentially threw up his hands and granted its necessity” (Martin 1987, p. 395).
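The structure of Alberic's argument can be traced in a toy possible-worlds model. The sketch below is an illustration under explicit assumptions: the arrow is read as strict implication (Abelard's own “relevant” implication is stronger and is not modelled here), propositions are sets of three hypothetical worlds, step 3 is read as ¬p → ¬(p ∧ ¬q), which is the form the conclusion C* requires, and `alberic`, `strict` and `neg` are invented names. Whenever ‘man’ strictly implies ‘animal’, every step holds and the antecedent of C* turns out to be impossible.

```python
from itertools import combinations

WORLDS = frozenset(range(3))   # hypothetical set of possible worlds
PROPS = [frozenset(c) for r in range(len(WORLDS) + 1)
         for c in combinations(sorted(WORLDS), r)]

def neg(p):
    """Negation: true exactly in the worlds where p is false."""
    return WORLDS - p

def strict(a, b):
    """Strict implication: every world where a is true makes b true."""
    return a <= b

def alberic(man, animal):
    """Evaluate the three steps and the conclusion C* for given propositions."""
    antecedent = man & neg(animal)                 # p ∧ ¬q
    step1 = strict(antecedent, neg(animal))        # (p ∧ ¬q) → ¬q
    step2 = strict(neg(animal), neg(man))          # ¬q → ¬p (contraposition of p → q)
    step3 = strict(neg(man), neg(antecedent))      # ¬p → ¬(p ∧ ¬q)
    conclusion = strict(antecedent, neg(antecedent))
    return step1, step2, step3, conclusion, antecedent == frozenset()

# Restrict attention to models where 'If man then animal' is strictly true.
results = [alberic(m, a) for m in PROPS for a in PROPS if strict(m, a)]
print(all(all(r) for r in results))   # True
```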

In the decades after Abelard, logicians from the schools of the Nominales, the Melidunenses and the Porretani tried to cope with the problems created by Alberic’s argument. As was argued in Lenzen (2023), however, their sophisticated arguments turned out to be inconclusive. As several brilliant logicians from the 13th to the 15th century recognized, Aristotle’s and Abelard’s connexive theses have to be restricted to self-consistent antecedents and/or to non-necessary consequents. For example, Robert Kilwardby (1222-1277), who in his Notule libri Priorum desperately tried to defend Aristotle’s theses against counter-examples, eventually came to admit: “So it should be granted that from the impossible its opposite follows, and that the necessary follows from its opposite” (Thom & Scott 2015, p. 1145). Furthermore, in On the Purity of the Art of Logic, Walter Burley (ca. 1275-1345) proved that “every conditional is true in which an antecedent that includes opposites implies its contradictory. For example, it follows: ‘You know you are a stone; therefore, you do not know you are a stone’” (Spade 2000, p. 156). Burley concluded that Aristotle’s thesis is only restrictedly valid: “I say that the same consequent does not follow from the same antecedent affirmed and denied, unless the opposite of that consequent includes contradictories. And this is how Aristotle’s statement has to be understood” (Spade 2000, p. 160). In a similar way, John Buridan (ca. 1300-1360) rather incidentally noted that “a possible [!] proposition never entails its own contradictory” (Hughes 1982, p. 38). The editor of Buridan’s Sophismata remarked: “Note that the principle appealed to is not that no proposition whatsoever can entail its own contradictory, but only that no possible proposition can do so. This is a standard principle of modal logic” (Hughes 1982, p. 86).
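The restrictions just described can be illustrated in a small possible-worlds model. The following sketch makes several assumptions that are not in the source: the arrow is read as strict implication, propositions are sets of three hypothetical worlds, and all function names are invented for the illustration. Under this reading, Abel 1 and Abel 2 fail exactly when the antecedent p is impossible, and Arist 1 and Arist 2 fail exactly when the consequent is necessary.

```python
from itertools import combinations

WORLDS = frozenset(range(3))   # hypothetical set of possible worlds
PROPS = [frozenset(c) for r in range(len(WORLDS) + 1)
         for c in combinations(sorted(WORLDS), r)]

def neg(p):
    return WORLDS - p

def strict(a, b):
    """Strict implication: no world where a is true and b is false."""
    return a <= b

def possible(p):
    return p != frozenset()

def necessary(p):
    return p == WORLDS

# The four connexive theses, with the inner arrow read as strict implication.
def abel1(p):     return not strict(p, neg(p))
def arist1(p):    return not strict(neg(p), p)
def abel2(p, q):  return not (strict(p, q) and strict(p, neg(q)))
def arist2(p, q): return not (strict(p, q) and strict(neg(p), q))

# Each thesis holds exactly under the corresponding restriction.
abel1_ok  = all(abel1(p) == possible(p) for p in PROPS)
arist1_ok = all(arist1(p) == (not necessary(p)) for p in PROPS)
abel2_ok  = all(abel2(p, q) == possible(p) for p in PROPS for q in PROPS)
arist2_ok = all(arist2(p, q) == (not necessary(q)) for p in PROPS for q in PROPS)

print(abel1_ok, arist1_ok, abel2_ok, arist2_ok)   # True True True True
```

This matches Buridan's formulation for Abel 1: in the model, no possible proposition strictly implies its own contradictory, while the impossible proposition does.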

In view of these interesting and important discoveries, the history of connexive logic, as it was sketched in McCall (2012), needs to be fundamentally corrected. This has been achieved in Lenzen (2022).

7. References and Further Reading

a. Editions of Abelard’s Logical Works

  • Victor Cousin (Ed.), Ouvrages inédits d’Abélard, Paris (Imprimerie Royale) 1836.
  • Mario Dal Pra (Ed.), Pietro Abelardo Scritti Filosofici: Editio super Porphyrium; Glossae in Categorias; Super Aristotelem De Interpretatione; De divisionibus; Super Topica Glossae, Roma (Bocca) 1954.
  • Lambert Marie De Rijk (Ed.), Petrus Abaelardus Dialectica – First Complete Edition of the Parisian Manuscript, Assen (van Gorcum) 1959. (The manuscript itself can be downloaded from the Bibliothèque Nationale de France under: https://gallica.bnf.fr/ark:/12148/btv1b6000788f?rk=321890;0)
  • Bernhard Geyer (Ed.), Peter Abaelards Philosophische Schriften. In: Beiträge zur Geschichte der Philosophie und Theologie des Mittelalters, vol. 21, issues 1-4, Münster (Aschendorf) 1919-1933.

b. Secondary Literature

  • Maria Teresa Beonio-Brocchieri Fumagalli (1964): La Logica di Abelardo, Firenze (Pubblicazione della Università degli Studi di Milano); engl. translation: The Logic of Abelard, Dordrecht (Reidel) 1969.
  • Irene Binini (2022): Possibility and Necessity in the Time of Peter Abelard, Leiden (Brill).
  • Lambert Marie De Rijk (Ed.) (1967), Logica Modernorum – A Contribution to the History of Early Terminist Logic. Assen (Van Gorcum).
  • Kevin Guilfoy (2008): “Peter Abelard”, in J. Fieser & B. Dowden (Ed.), Internet Encyclopedia of Philosophy: https://iep.utm.edu/abelard/
  • George E. Hughes (1982): John Buridan on Self-Reference, Cambridge (Cambridge University Press).
  • Peter King & Andrew Arlig (2018): “Peter Abelard”, in E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy, Fall 2018 edition: https://plato.stanford.edu/entries/abelard/
  • William and Martha Kneale (1962): The Development of Logic, Oxford (Clarendon).
  • Wolfgang Lenzen (2010): “The Quantification of the Predicate – Leibniz, Ploucquet and the (Double) Square of Opposition.” In J. A. Nicolás (Ed.), Leibniz und die Entstehung der Modernität, Stuttgart (Steiner), 179-191.
  • Wolfgang Lenzen (2017): “Caramuel’s Theory of Opposition.” In South American Journal of Logic 3, 361-384.
  • Wolfgang Lenzen (2020): “A Critical Examination of the Historical Origins of Connexive Logic.” In History and Philosophy of Logic 41, 16-35.
  • Wolfgang Lenzen (2021): Abaelards Logik, Paderborn (mentis/Brill).
  • Wolfgang Lenzen (2022): “Rewriting the History of Connexive Logic.” In Journal of Philosophical Logic 51, 523-553.
  • Wolfgang Lenzen (2023): “Abelard and the Development of Connexive Logic.” In: M. Blicha & I. Sedlár (eds.), Logica 2022 Yearbook, London (College Publications), 55-78.
  • Christopher Martin (1986): “William’s Machine.” In The Journal of Philosophy 83, 564-572.
  • Christopher Martin (1987): “Embarrassing Arguments and Surprising Conclusions in the Development of Theories of the Conditional in the Twelfth Century.” In J. Jolivet & A. de Libera (Ed.), Gilbert de Poitiers et ses contemporains: aux origines de la Logica Modernorum, Napoli (Bibliopolis), 377-400.
  • Christopher Martin (2006): “Logic.” In J. Brower & K. Guilfoy (Eds.), The Cambridge Companion to Abelard, Cambridge (Cambridge University Press), 158-199.
  • Storrs McCall (2012): “A History of Connexivity.” In D. M. Gabbay, F. J. Pelletier & J. Woods (Ed.), Handbook of the History of Logic, Vol. 11, Logic: A History of its Central Concepts, Elsevier, 415-449.
  • Jacques Paul Migne (Ed.) (1860): Anicii Manlii Severini Boethii Opera omnia, Paris.
  • Stephen Read (2012): “John Buridan’s Theory of Consequence and His Octagons of Opposition.” In J.-Y. Béziau & D. Jacquette (Ed.), Around and Beyond the Square of Opposition, Basel (Birkhäuser), 93-110.
  • Paul V. Spade (Ed.) (2000): Walter Burley – On the Purity of the Art of Logic – The Shorter and the Longer Treatise. New Haven & London (Yale University Press).
  • Paul Thom & John Scott (Eds.) (2015): Robert Kilwardby, Notule libri Priorum. Oxford (Oxford University Press).

 

Author Information

Wolfgang Lenzen
Email: lenzen@uos.de
University of Osnabrück
Germany

Knowledge-First Theories of Justification

Knowledge-first theories of justification are theories of justification that give knowledge priority when it comes to explaining when and why someone has justification for an attitude or an action. The emphasis of this article is on knowledge-first theories of justification for belief. As it turns out, there are a number of ways of giving knowledge priority when theorizing about justification, and what follows is a survey of more than a dozen existing options that have emerged since the publication in 2000 of Timothy Williamson’s Knowledge and Its Limits.

The present article first traces several of the general theoretical motivations that have been offered for putting knowledge first in the theory of justification. It then provides an examination of existing knowledge-first theories of justification and their objections. There are doubtless more ways to give knowledge priority in the theory of justified belief than are covered here, but the survey is instructive because it highlights potential shortcomings that would-be knowledge-first theorists may wish to avoid.

The history of the Gettier problem in epistemology is a long history of failed attempts to give a reductive account of knowledge in terms of justification and other conditions. In light of this, many have since judged the project of providing a reductive analysis of knowledge to be a degenerating research program. In putting knowledge first in the theory of justification, epistemologists are exploring whether we can more successfully reverse the order of explanation in epistemology by giving an account of justified belief in terms of knowledge. This article concludes with a reflection about the extent to which the short history of the many controversial attempts to secure an unproblematic knowledge-first account of justified belief has begun to resemble the older Gettier dialectic.

Table of Contents

  1. Motivating Knowledge-First Approaches
  2. The Token-Identity Theory
  3. Modal Theories
  4. Reasons-First, Knowledge-First Theories
  5. Perspectival Theories
  6. Infallibilist Knowledge-First Virtue Epistemology
  7. Proficiency-Theoretic Knowledge-First Virtue Epistemology
  8. Functionalist & Ability-Theoretic Knowledge-First Virtue Epistemology
  9. Know-How Theories and the No-Defeat Condition
  10. Excused Belief vs. Justified Belief
  11. A Methodological Reflection on Gettier
  12. References and Further Reading

1. Motivating Knowledge-First Approaches

Knowledge-first theories of justified belief give knowledge priority when it comes to explaining when and why someone has a justified belief. As it turns out, there are a number of ways of giving knowledge priority when theorizing about justified belief, and what follows is a survey of several existing options.

Before examining specific knowledge-first theories of justification it is worth considering what might motivate such an approach to begin with. One kind of motivation involves the need for an extensionally adequate theory of justified belief. After all, there is some set of possible cases where agents have justified beliefs, and a knowledge-first theory of justified belief should pick out that set and offer us a knowledge-centric explanation for why that set has exactly the members that it has. Traditional epistemologists should note that progress has been made in this direction, and this provides at least some reason to think that some knowledge-centric account of justification is correct. But there is more to be observed when it comes to motivating knowledge-first accounts of justified belief.

Consider, first, conceptual relations between knowledge and justification. Sutton (2005; 2007) has argued that grasping the concept of epistemic justification depends on our prior understanding of knowledge:

We only understand what it is to be justified in the appropriate sense because we understand what it is to know, and can extend the notion of justification to non-knowledge only because they are would-be knowers. We grasp the circumstances—ordinary rather than extraordinary—in which the justified would know. Justification in the relevant sense is perhaps a disjunctive concept—it is knowledge or would-be knowledge (Sutton 2005: 361).

If our concept of epistemic justification depends on our concept of knowledge, then that surely provides at least some reason to think that knowledge might be more basic a kind than justified belief. At the very least it provides us with reason to explore that possibility.

Second, consider some plausible claims about the normativity of belief. As Williamson (2014: 5) reasons: “If justification is the fundamental epistemic norm of belief, and a belief ought to constitute knowledge, then justification should be understood in terms of knowledge too.” Here Williamson is connecting norms for good instances of a kind and norms for bringing about instances of that kind. So if one is justified in holding a belief only if it is a good belief, and a good belief is one that constitutes knowledge, then it seems to follow that a justified belief has to be understood in terms of knowledge (Kelp, et al. 2016; Simion 2019).

A third reason for putting knowledge first in the theory of justification stems from Williamson’s (2000) defense of the unanalyzability of knowledge together with the E=K thesis, which says that the evidence you possess is just what you know. Assuming we should understand justification in terms of having sufficient evidence, it seems to follow that we should understand justification in terms of knowledge. (For critical discussion of E=K see Silins (2005), Pritchard and Greenough (2009), Neta (2017), and Fratantonio (2019).)

A fourth reason stems from the way in which asymmetries of knowledge can explain certain asymmetries of justification. While much of the knowledge-first literature on lottery beliefs has focused on assertion (see the article Knowledge Norms), the points are easily extended to justified belief. One cannot have justification to believe that (L) one has a losing lottery ticket just on the basis of one’s statistical evidence. But one can have justification to believe (L) on the basis of a newspaper report. What can explain this asymmetry? Knowledge. For one cannot know (L) on the basis of merely statistical evidence, but one can know (L) on the basis of a newspaper report. Accordingly, knowledge can play a role in explaining the justificatory asymmetry involving (L) (Hawthorne 2004; Smithies 2012). A similar asymmetry and knowledge-first explanation can be drawn from the literature on pragmatic encroachment (Smithies 2012; DeRose 1996). See also Dutant and Littlejohn (2020) for further justificatory asymmetries that certain knowledge-first approaches to justified belief can explain.

Fifth, putting knowledge in the explanatory forefront can explain (broadly) Moorean absurdities. Consider, for instance, the absurdity involved in believing p while also believing that one does not know p. Some explanation for the irrationality of this combination of beliefs should fall out of a theory of justification that tells us when and why a belief is (or is not) justified. Theories of justification that explain justification in terms of knowledge have an easy time explaining this (Williamson 2000; 2009; 2014).

Lastly, putting knowledge in the explanatory forefront of justification can provide an explanation of the tight connection between justification and knowledge. For it is widely believed that knowing p or being in a position to know p entails that one has justification for believing p. The traditional explanation of this entailment relation involves the idea that knowledge is to be analyzed in terms of, and hence entails, justification. But another way of explaining this entailment is by saying that knowledge or being in a position to know is constitutively required for justification (Sylvan 2018).

2. The Token-Identity Theory

Perhaps the first knowledge-first theory of justified belief is the token-identity theory, according to which token instances of justified belief just are token instances of knowledge, which yields the following biconditional (Williamson 2009, 2014; Sutton 2005, 2007; Littlejohn 2017: 41-42):

(J=K) S’s belief that p is justified iff S knows that p.

The term ‘iff’ abbreviates “if and only if.” This is a theory of a justified state of believing (doxastic justification), not a theory of having justification to believe, whether or not one does in fact believe (propositional justification). But it is not hard to see how a (J=K) theorist might accommodate propositional justification (Silva 2018: 2926):

(PJ=PK) S has justification to believe p iff S is in a position to know p.

What does it take to be in a position to know p? One type of characterization takes being in a position to know as being in a position where all the non-doxastic demands on knowing are met (Smithies 2012; Neta 2017; Rosenkranz 2018; Lord 2018). The doxastic demands involve believing p in the right kind of way, that is, the kind of way required for knowing. The non-doxastic demands involve the truth of p and one’s standing in a suitably non-accidental relation to p such that, typically, were one to believe p in the right kind of way, one would know that p. (For further characterizations of being in a position to know see Williamson 2000: 95; Rosenkranz 2007: 70-71.)

One issue raised by characterizing being in a position to know in counterfactual terms concerns what we might call doxastic masks: features of one’s situation that are triggered by one’s act of coming to believe p at a time t+1 that would preclude one from knowing p despite all the non-doxastic requirements of knowledge being met at an earlier time t. For example, you might have all the evidence it could take for anyone to know p, but suppose Lewis’ (1997) sorcerer does not want you to know p. So, in all or most nearby worlds when the sorcerer sees you beginning to form the belief in p, he dishes out some kind of defeater that prevents you from knowing p. So, on standard possible worlds analyses of counterfactuals, it is false that you have some way of coming to believe p such that were you to use it, you would know p (compare Whitcomb 2014). Alternatively, one might seek to characterize being in a position to know in terms of having a disposition to know, which is compatible with the existence of doxastic masks. Another alternative is to give up on the idea that being in a position to know is best understood in terms of worlds and situations nearby or close to one’s actual situation, thereby making the target characterization of being in a position to know a more idealized notion, one that is discussed below (compare Smithies 2012: 268, 2019: sect 10.4; Rosenkranz 2018; Chalmers 2012).

There are various problems with (J=K) and, by extension, (PJ=PK). First, (J=K) is incompatible with the fallibility of justification, that is, the possibility of having justified false beliefs. Since knowledge entails truth, (J=K) implies that every justified belief is true. But any theory of justification that rules out justified false beliefs is widely seen to be implausible (Bird 2007; Comesaña and Kantin 2010; Madison 2010; Whitcomb 2014; Ichikawa 2014).

Second, (J=K) is incompatible with the possibility of having a justified true belief in the absence of knowledge. Gettier cases are typically cases of justified true belief that do not constitute knowledge. But (J=K) implies that there are no such cases because it implies that there can be no cases of justification without knowledge. This bucks against a history of strong intuitions to the contrary (Bird 2007; Comesaña and Kantin 2010; Madison 2010; Whitcomb 2014; Ichikawa 2014).
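
The first two objections can be stated compactly. The following is only a sketch in notation introduced here for illustration (it is not drawn from the cited authors): J_S p abbreviates “S’s belief that p is justified,” K_S p abbreviates “S knows that p,” and the one added assumption is that knowledge is factive. The display assumes the amsmath package.

```latex
\begin{align*}
&\text{(J=K)}       && J_S p \leftrightarrow K_S p \\
&\text{(Factivity)} && K_S p \rightarrow p \\
&\text{Hence}       && J_S p \rightarrow p
   && \text{no justified false beliefs} \\
&\text{and}         && \neg\,(J_S p \wedge p \wedge \neg K_S p)
   && \text{no justified true belief without knowledge}
\end{align*}
```

The third line needs both premises, whereas the fourth follows from (J=K) alone, which is why Gettier cases are excluded whatever one thinks about factivity.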

Third, (J=K) is incompatible with the new evil demon hypothesis. Consider someone who, unwittingly, has had their brain removed, placed in a vat, and is now being stimulated in such a way that the person’s life seems to go on as normal. According to the new evil demon hypothesis: if in normal circumstances S holds a justified belief that p, then S’s recently envatted brain-duplicate also holds a justified belief that p. It is beyond the scope of this article to defend the new evil demon hypothesis. But as Neta and Pritchard (2007) point out, it is a widely shared intuition in 21st century epistemology. This generates problems for (J=K). For since one cannot know that one is looking at a hand (or that a hand is in the room) if one is a recently envatted brain who merely seems to be looking at a hand, then according to (J=K) one cannot be justified in believing it either (Bird 2007; Ichikawa 2014). For further discussion see the article on The New Evil Demon Hypothesis. See also Meylan (2017).

3. Modal Theories

To avoid the problems with (J=K), some have sought to connect justified belief and knowledge in a less direct way, invoking some modal relation or other.

Here is Alexander Bird’s (2007) knowledge-first account of justified judgment, which can be transformed into a theory of justified belief (i.e. arguably the end state of a justified act of judging):

(JuJu) If in world w1 S has mental states M and then forms a judgment [or belief], that judgment [or belief] is justified iff there is some world w2 where, with the same mental states M, S forms a corresponding judgment and that judgment [or belief] yields knowledge.

(JuJu) counts as a knowledge-first theory because it explains one’s justified judgment/belief in terms of the knowledge of one’s mental state duplicates. It does a good deal better than (J=K) when it comes to accounting for intuitive characteristics of justified belief: namely, its fallibility, its compatibility with Gettier cases, and its compatibility with the new evil demon hypothesis.
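
One way to see why (JuJu) tolerates justified false beliefs is to make its quantifier structure explicit. The rendering below is a hedged paraphrase rather than Bird’s own formulation: M_w(S) stands for S’s total mental state in world w, p* for the judgment corresponding to p, and J and K are relativized to worlds; all of these symbols are assumptions introduced for illustration.

```latex
\[
  J_{w_1}(S,p) \;\leftrightarrow\;
  \exists w_2 \,\bigl[\, M_{w_2}(S) = M_{w_1}(S) \;\wedge\; K_{w_2}(S,p^{*}) \,\bigr]
\]
```

Because the witnessing world w_2 need not be w_1, the right-hand side can hold even when p is false, Gettiered, or believed by an envatted subject in w_1. The objections that follow target the conjunct requiring sameness of mental states.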

Despite this, various problems have been pointed out concerning (JuJu). First, it seems that we can obtain justified false beliefs from justified false beliefs. For example, suppose S knew that:

(a) Hesperus is Venus.

But, due to some misleading evidence, S had the justified false belief that:

(b) Hesperus is not Phosphorus.

Putting these two together S could infer that:

(c) Phosphorus is not Venus.

As Ichikawa (2014: 191-192) argues, S could justifiably believe (c) on this inferential basis. But, according to (JuJu), S can justifiably believe (c) on the basis of an inference from (a) and (b) only if it is possible for a mental state duplicate of S’s to know (c) on this basis. But content externalism precludes such a possibility. For content externalism implies that any mental state duplicate of S’s who believes (c) on the basis of (a) and (b) is a thinker for whom the terms ‘Phosphorus’ and ‘Venus’ refer to the very same astral body, thus making knowledge of (c) on the basis of (a) and (b) impossible. Because of this, (JuJu) implies that you cannot have justification to believe (c) on this inferential basis, contrary to what seems to be the case. This is not just a problem for (JuJu), but also a problem for (J=K).
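
The inference from (a) and (b) to (c) can be checked step by step. In this sketch the constants h, ph, and v (for Hesperus, Phosphorus, and Venus) are labels chosen here, and the derivation uses only the symmetry and transitivity of identity. The display assumes the amsmath package.

```latex
\begin{align*}
&(a) && h = v               && \text{known premise} \\
&(b) && h \neq \mathit{ph}  && \text{justified but false premise} \\
&(1) && \mathit{ph} = v     && \text{assumption for reductio} \\
&(2) && h = \mathit{ph}     && \text{from (a) and (1), symmetry and transitivity of } = \\
&(3) && \bot                && \text{from (b) and (2)} \\
&(c) && \mathit{ph} \neq v  && \text{from (1)--(3), discharging (1)}
\end{align*}
```

So the inference is valid, which is what makes S’s belief in (c) seem justified. But since Phosphorus in fact is Venus, (c) is false, and no subject whose terms co-refer in this way can know it.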

Second, (JuJu) fails to survive the Williamsonian counterexamples to internalism. Williamson’s counterexamples, as McGlynn (2014: 44ff) observes, were not intended to undermine (JuJu) but they do so anyway. Here is one example:

Suppose that it looks and sounds to you as though you see and hear a barking dog; you believe that a dog is barking on the basis of the argument ‘That dog is barking; therefore, a dog is barking’. Unfortunately, you are the victim of an illusion, your demonstrative fails to refer, your premise sentence thereby fails to express a proposition, and your lack of a corresponding singular belief is a feature of your mental state, according to the content externalist. If you rationally believe that a dog is barking, then by [JuJu] someone could be in exactly the same mental state as you actually are and know that a dog is barking. But that person, too, would lack a singular belief to serve as the premise of the inference, and would therefore not know that a dog is barking. (Williamson 2000: 57-58).

McGlynn (2014: 44) draws attention to the fact that a “natural verdict is that one’s belief that a dog is barking is rational or justified” despite the fact that one cannot know this while having the same mental states. For any (non-factive) mental state duplicate will be one for whom the sentence ‘That dog is barking’ cannot be true, and hence cannot be known either. So we have another counterexample to (JuJu). Again, this is not just a problem for (JuJu), but also (J=K).

Since (JuJu)’s problems stem from its insistence on sameness of mental states, a natural response is to abandon that emphasis and focus on what a thinker and, say, her duplicate on Twin Earth can have in common. This is just what Ichikawa (2014: 189) attempts to do:

(JPK) S has a justified belief iff S has a possible counterpart, alike to S in all relevant intrinsic respects, whose corresponding belief is knowledge.

The target intrinsic respects are limited to the non-intentional properties that S and her Twin Earth duplicate can share. But they are not intended to include all such properties. Ichikawa wants to maintain that if, say, S unwittingly lost her body in an envattment procedure, she could still have a justified belief that she has a body even though the only counterparts of hers who could know this are ones who have a body. So, the target intrinsic respects are to be further restricted to what S and her envatted counterpart could share. In the end, this seems to amount to sameness of brain states or something close to that. This aspect of (JPK) goes a long way towards making it internalist-friendly and also helps (JPK) avoid the difficulties facing (JuJu) and (J=K). (See Ichikawa (2017) for his most recent work on knowledge-first approaches to justification.)

Nevertheless, (JPK) has problems of its own. Both problems stem from the attempt to reconcile (JPK) with the idea that justified belief is a type of creditable belief. Here is how Ichikawa (2014: 187) describes the first problem: Zagzebski (1996: 300-303) and many others have argued that it is plausible that S’s holding a justified belief entails that S is creditworthy (that is, praiseworthy) for believing as she does. Moreover, S is creditworthy because S holds a justified belief. That is, it is S’s particular act of believing that explains why S deserves credit. But (JPK) seems forced to explain S’s creditworthiness in terms of facts about S’s counterparts, since it is one’s counterparts that explain one’s doxastic justification. As others have pointed out, this can seem odd (Silva 2017): why should facts about a merely possible, distinct individual make me creditworthy for believing as I actually do? One promising response involves noting that having a justified belief immediately grounds being creditworthy for believing, just as our intuition has it. And facts about one’s counterparts’ knowledge immediately ground having a justified belief. But immediate grounding is not transitive, so facts about knowledge do not immediately ground being creditworthy for believing. So, the odd consequence does not follow. A consequence that does follow is that facts about knowledge mediately ground being creditworthy for believing, since there is a chain of immediate grounds connecting them. But here it is open for the knowledge-firster to say that our intuition really concerns only immediate grounding.

Ichikawa is clear that (JPK) is a theory of justified belief (doxastic justification) and that this is the notion of justification that is connected to a belief’s being creditworthy. But doxastic justification has a basing requirement, and this makes doxastic justification partly a historical matter. Epistemic credit and blame also seem to depend on historical factors (Greco 2014). Thus, Ichikawa’s defense of (JPK) is susceptible to cases like the following:

Bad Past: At t S comes to believe that there is a ceiling overhead. S believes this because she just took a pill which she knew would induce random changes in her intrinsic states. In advance of taking the pill, S knew it would very likely cause her to have many false perceptual beliefs. But as it happens, the pill induced a total re-organization of her intrinsic states such that at t S has a counterpart who knows a ceiling is overhead.

(JPK) implies that S has a justified belief in Bad Past because she happens to have a knowledgeable counterpart. And because she has a justified belief, she is also creditworthy. But this seems wrong. Rather, S seems positively blameworthy for believing as she does. See Silva (2017) for further discussion of (JuJu) and (JPK) and see Greco (2014) for further discussion of historical defeaters for doxastic justification.

An alternative solution to these problems would be to revise (JPK) so that it is only a theory about propositional justification:

(PJPK) S has justification to believe p iff S has a possible counterpart, alike to S in all relevant intrinsic respects, whose corresponding belief is knowledge.

One could then, arguably, concoct a knowledge-first theory of doxastic justification by adding some kind of historical condition that rules out cases like Bad Past.

It should be noted that (PJPK) has a strange result. For if your internal counterpart knows p, then your internal counterpart believes p. But if your internal counterpart believes p, then you also believe p, provided you and your counterpart are not in very different environments (for example, Earth vs. Twin Earth) that shift the content of the belief (compare Whitcomb 2014). So if (PJPK) is true, you only have propositional justification to believe p if you actually believe p. But it is usually assumed that it is possible to have propositional justification to believe p even if you do not believe p. To accommodate this (PJPK) may need revision.
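
The steps of this argument can be laid out explicitly. The following is a sketch under stated assumptions: s is the subject, c ranges over possible counterparts, c ≈ s abbreviates “c is alike to s in all relevant intrinsic respects,” B abbreviates belief, and step (3) builds in the proviso that s and c share an environment. None of this notation comes from the cited authors. The display assumes the amsmath package.

```latex
\begin{align*}
&(1) && PJ_s\,p \leftrightarrow \exists c\,\bigl[\,c \approx s \wedge K_c\,p\,\bigr]
   && \text{(PJPK)} \\
&(2) && K_c\,p \rightarrow B_c\,p
   && \text{knowledge entails belief} \\
&(3) && (c \approx s \wedge B_c\,p) \rightarrow B_s\,p
   && \text{intrinsic duplicates in like environments share beliefs} \\
&(4) && PJ_s\,p \rightarrow B_s\,p
   && \text{from (1)--(3)}
\end{align*}
```

Line (4) is the strange result: propositional justification would require actual belief. A revision could target (1), or restrict the cases in which (3) applies.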

4. Reasons-First, Knowledge-First Theories

Sylvan (2018) and Lord (2018) each take a reasons-first approach to justification, on which justified belief just is belief that is held for sufficient reason:

(J=SR) S’s belief that p is justified iff (i) S possesses sufficient reason to believe p, and (ii) S believes that p for the right reasons.

While (J=SR) is not itself a knowledge-first view of justification, it becomes one when combined with a knowledge-first account of condition (i). Lord (2018: ch3) and Sylvan (2018: 212) both do this, taking reasons to be facts and arguing that one possesses a fact just in case one is in a position to know it:

(Pos=PK) S possesses the fact that p as a reason to respond in some way w iff S is in a position to know that p.

Others have argued for some kind of knowledge-first restriction on (Pos=PK). For example, Neta (2017) has argued that our evidence is the set of propositions we are in a position to know non-inferentially. Provided one’s evidence just is the set of reasons one has for belief, this view will fall into the reasons-first, knowledge-first camp. For objections to (Pos=PK) see Kiesewetter (2017: 200-201, 208-209) and Silva (2023).

Surprisingly, the category of reasons-first, knowledge-first views cross-cuts some of the other categories. For example, (J=K) theorists have tended to fall into this camp. Williamson (2009) and Littlejohn (2018) take one’s evidence to consist of the propositions that one knows. Again, provided one’s evidence just is the set of reasons one has for belief, this leads to a view on which one possesses p iff one knows p. This is a more restrictive knowledge-first view of possession, but together with (J=SR) and (J=K) it constitutes a kind of reasons-first, knowledge-first theory of justification. Since justified belief that p and knowledge that p never separate on this view, it can seem hardly worth mentioning this view as a reasons-first view. But there is more in need of epistemic justification than belief (though that will not be discussed here). There are other doxastic attitudes (for example, suspension, credence, acceptance, faith) as well as actions and feelings that are in need of epistemic justification. On knowledge-first, reasons-first views these states can only be justified by one’s knowledge.

As mentioned above (J=K) is subject to a range of objections. What follows focuses on Lord and Sylvan’s incarnation of the knowledge-first program that consists of (J=SR) and (Pos=PK). These two principles give us a knowledge-first theory of justification that avoids some of the main problems facing (J=K).

First, (J=SR) and (Pos=PK) are consistent with the existence of justified false beliefs. This is due to the fact that one’s reasons (the facts one is in a position to know) can provide one with sufficient, yet non-conclusive, reason to believe further propositions that may be false. The fact that a drunk has always lied about being sober can be a sufficient yet non-conclusive inductive reason to believe that he will lie about being sober in the future. Since it is non-conclusive, having justification for this belief is consistent with it turning out to be false. So this view can allow for justified yet false inferential beliefs. The possibility of justified false perceptual beliefs is discussed below in connection with the new evil demon hypothesis.

Second, (J=SR) and (Pos=PK) are consistent with the existence of unknown, justified true beliefs. Because Smith can have justified false beliefs in the way described above, he can have a justified false belief that Jones will get the job based on the fact that the employer said so and the fact that this is a highly reliable indicator of who will get the job. Smith may also know that Jones has ten coins in his pocket based on perception. So, through an appropriate inferential process, Smith can come by a justified true inferential belief that the person who will get the job has ten coins in his pocket. This is a Gettier case, that is, an instance of a justified true belief without knowledge.

There are a few caveats. First, it’s worth noting that the reasons-first, knowledge-first theory of justification only has this implication under the assumption that the justificatory support one derives from facts one is in a position to know is transitive, or can at least sometimes carry over inferences from premises that one is not in a position to know. For, here, Smith’s false belief that Jones will get the job is justified by the reasons Smith is in a position to know, and we are assuming this justified false belief—which Smith is not in a position to know—can nevertheless facilitate Smith’s ability to acquire inferential justification for believing that the person who will get the job has ten coins in his pocket. For worries about the non-transitivity of the justification relation see Silins (2007) and Roche and Shogenji (2014).

Second, it is also worth noting that while Lord and Sylvan’s view is consistent with some intuitions about Gettier cases, it is not consistent with all such intuitions. After all, their view seems to be that we possess different reasons or evidence in the Gettier cases than we do in the good cases. This will seem counterintuitive to those who think that we have the same evidence in both cases.

Third, (J=SR) and (Pos=PK) are consistent with some intuitions about the new evil demon hypothesis. In the standard telling, the recently envatted brain has a non-veridical perceptual experience of p and believes p on the basis of that non-veridical experience. While the non-veridical experience does not give one access to the fact that p (if it is a fact), there is an inferential process that can give the envatted brain a justified belief according to (J=SR) and (Pos=PK). This is because mature thinkers who are recently envatted can know (or be in a position to know) that in the past their visual experiences have been a reliable guide to reality, and can sometimes know that they are now having an experience of p. Together, these are facts that can give one sufficient reason to believe p even if one is an unwittingly recently envatted brain.

Of course, the weakness here is that the envatted brain’s perceptual belief that p is not based on her inferential source of propositional justification to believe p. Rather, the envatted brain holds her belief in response to her perceptual experience. So, she is not doxastically justified, that is, her belief itself fails to be justified. So, there is some bullet to bite unless, perhaps, one can argue that knowledge of the fact that one is having an experience of p can itself be a reason to believe p even when one is an unwittingly envatted brain.

There are further problems that the reasons-first, knowledge-first view faces. They are along the lines of the problems for Bird’s (JuJu). For if reasons are facts, then one cannot obtain justified false beliefs from justified false-premise beliefs unless, as noted above, one’s justified false-premise beliefs are themselves inferentially justified and justificatory support carries over (see the discussion of (JuJu) above). Similarly, it is unclear whether one can gain justified beliefs from contentless beliefs. For contentless “premise” beliefs do not stand in inferential relations to their “conclusions,” and such relations seem essential to the ability of justificatory support to transmit across inferences.

For a further concern about this view, see Littlejohn’s (2019) “Being More Realistic About Reasons,” where he argues that the conjunction of (J=SR) and (Pos=PK) generates explanatory lacunas regarding how reasons should constrain our credences.

5. Perspectival Theories

Perspectival knowledge-first theories of justification put “knowledge first” by letting one’s point of view on whether one has knowledge determine whether one has justification. Smithies (2012), for example, argues that:

(PJ=PJK) S has justification to believe that p iff S has justification to believe that she is in a position to know that p.

Smithies (2012: 268) treats being in a position to know as a matter of being in a position where all the non-psychological conditions for knowing are met. Smithies is clear that this is only a theory of propositional justification (having justification to believe), not doxastic justification (having a justified belief). For as a theory of doxastic justification it would be too demanding: it would require an infinite hierarchy of beliefs, and it would require that one have epistemic concepts (e.g. KNOWS, JUSTIFIED, POSITION TO KNOW) if one is to have any justified beliefs at all. This would over-intellectualize justification, excluding agents incapable of epistemic reflection (for example, young children, people with handicaps, smart non-humans). Worse, if knowledge requires justification then this would also rob such beings of knowledge.

It is important to note that (PJ=PJK) is neutral on which side of the biconditional gets explanatory priority. To be a genuinely knowledge-first view it must be the condition on the right-hand side that explains why the condition on the left-hand side obtains. This is something that Smithies himself rejects. And there are good reasons for this, as there are objections to (PJ=PJK) that emerge only if we give the right-hand side explanatory priority. But there is also a general objection to this view that is independent of which side gets priority. This section starts with the general objection and then turns to the others.

A central worry to have about (PJ=PJK), irrespective of which side gets explanatory priority, is the extent to which Smithies’ purely non-psychological conception of propositional justification is a theoretically valuable conception of justification as opposed to a theoretically valuable conception of evidential support. For our evidence can support propositions in virtue of entailment and probabilistic relations, where these propositions can be so complex as to be well beyond our psychological abilities to grasp. For example, even before I had the concept of a Gettier case, my evidence supported the claim that I exist or I’m in a Gettier case, just in virtue of the fact that the proposition that I exist was already part of my evidence and entailed that disjunction. But since I did not have the concept GETTIER CASE, I could not have formed that belief.

So one general question concerns whether the motivations appealed to in support of (PJ=PJK) wrongly identify the following two epistemic notions:

Evidential Support: Having evidence, E, such that E entails or probabilistically supports p.

Justification: Having evidence, E, such that E gives one justification to believe p.

Certain evidentialists will like the idea of binding these notions together, thinking that strong evidential support is all there is to epistemic justification (Smithies 2019). Yet many have objected to the kind of evidentialism implicit in making evidential support necessary and sufficient for justification. The necessity direction has been objected to due to lottery problems, pragmatic encroachment, and the existence of justified beliefs not derived from evidence (so-called “basic” or “immediate” or “foundational” justified beliefs). The sufficiency direction, while rarely challenged, is also objectionable (Conee 1987, 1994; Silva 2018). For example, some mental states are such that we are not in a position to know that we are in them even upon reflection (Williamson 2000). Suppose you knew that you just took a pill that ensured that you are in a mental state M iff you do not believe (A) that you are in M. A rational response to this knowledge would be to suspend belief in (A) due to your knowledge of this biconditional: for if you believe (A) then it is false, and if you disbelieve (A) then it is true. So suspension seems like the only rational response available to you. In at least some such cases where you consciously suspend belief in (A), you will also know that you have suspended belief in (A). This is at least a metaphysical possibility, and certainly a logical possibility. Now, since you know the biconditional and since you know you have suspended belief in (A), your evidence entails that you are in M. But it is logically impossible for you to justifiably believe or know (A) on your evidence, and you can know this a priori. For believing (A) on your evidence entails that (A) is false. So connecting justification to evidential support in this way is inconsistent with the following plausible idea: S has justification to believe P on E only if it is logically possible for S to justifiably believe P on E.
For further discussion of these and related reasons to separate justification from evidential support see Silva (2018) and Silva and Tal (2020). For further objections to Smithies see Smith (2012). For further defense of Smithies’ theory see Smithies (2019: sect 9.4).
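
The pill case can be set out step by step. This is a sketch in notation introduced here for illustration: M is the claim that you are in mental state M, (A) is the proposition that M holds, and B(A) says that you believe (A). The display assumes the amsmath package.

```latex
\begin{align*}
&(1) && M \leftrightarrow \neg B(A)
   && \text{known effect of the pill} \\
&(2) && \neg B(A)
   && \text{known, since you have suspended belief in (A)} \\
&(3) && M
   && \text{from (1) and (2); so your evidence entails (A)} \\
&(4) && B(A) \rightarrow \neg M
   && \text{from (1); so believing (A) makes (A) false}
\end{align*}
```

Lines (1) to (3) show that the evidence entails (A), while line (4) shows that no belief in (A) formed under the pill’s influence could be true, and hence none could amount to knowledge. This is the sense in which evidential support and the possibility of justified belief come apart in this case.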

Further, as Smith (2012) points out, (PJ=PJK) implies that having justification to believe p requires having justification to believe an infinite hierarchy of meta-justificatory claims:

One thing that we can immediately observe is that [PJ=PJK]… is recursive, in that it can be reapplied to the results of previous applications. If one has justification to believe that p (Jp) then, by [PJ=PJK], one must have justification to believe that one is in a position to know that p (JKp). But if one has justification to believe that one is in a position to know that p (JKp) then, by [PJ=PJK], one must have justification to believe that one is in a position to know that one is in a position to know that p (JKKp) and so on… In general, we have it that Jp ⊃ JKⁿp for any positive integer n.
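
The general claim at the end of this passage follows by induction on n. The sketch below is a reconstruction rather than a quotation: φ is a schematic letter for any proposition, K abbreviates “one is in a position to know that,” J abbreviates “one has justification to believe that,” and the arrow does the work of ⊃ in the passage. The display assumes the amsmath package.

```latex
\begin{align*}
&\text{Schema:}     && J\varphi \rightarrow JK\varphi
   && \text{left-to-right half of (PJ=PJK), any } \varphi \\
&\text{Base:}       && Jp \rightarrow JKp
   && \varphi := p \\
&\text{Hypothesis:} && Jp \rightarrow JK^{n}p \\
&\text{Instance:}   && JK^{n}p \rightarrow JK^{n+1}p
   && \varphi := K^{n}p \\
&\text{Step:}       && Jp \rightarrow JK^{n+1}p
   && \text{chaining the two lines above}
\end{align*}
```

Only the left-to-right direction of the biconditional is used, so the regress arises even for views that accept just that half of the principle.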

If one adds to this the priority claim that having justification to believe that one is in a position to know p is the source of one’s justification to believe p, one must either accept a skeptical result due to grounding worries about the infinite hierarchy of meta-justificatory claims, or accept a knowledge-first form of infinitism. But even overcoming the standard general worries with infinitism, knowledge-first infinitism will be especially difficult to handle due to luminosity failures for KK. For example, in Williamson’s (2000: 229) unmarked clock case, one is argued to know a proposition p, while also knowing that it is very improbable that one knows p. Intuitively, this is a case where one knows p and so justifiably believes p even though one lacks justification to believe that one knows p. (For a discussion of the limits of the unmarked clock case see Horowitz 2014.)

The final issue with (PJ=PJK) is whether or not having justification to believe that one is in a position to know is the source of one’s propositional justification to believe p (which would make this a knowledge-first view) or whether it is a non-explanatory necessary and sufficient condition on having justification to believe p (Smithies’ view). To illustrate the difference, suppose there is an infallible record of people’s heights. It is certainly true that Paul is 5’11’’ at t if and only if the infallible record says that Paul is 5’11’’ at t. But the right-hand-side of that biconditional is plausibly non-explanatory. The fact that there is an infallible record does not make or otherwise explain Paul’s height. Now, if the advocate of (PJ=PJK) holds that having justification to believe that one is in a position to know is the source of one’s justification, then having a doxastically justified belief will, according to tradition, require one to base one’s belief that p on that source of justification. But ordinarily we do not base our beliefs on further facts about knowing or being in a position to know. So if we are not to risk an unacceptable skepticism about doxastically justified belief (and hence knowledge), it seems we will either have to give up the tradition or treat the right-hand-side of (PJ=PJK) as specifying a mere non-explanatory necessary and sufficient condition. However, if that is the case, it can seem puzzling why there should be such a modally robust connection between justification and one’s perspective on whether one knows.

A view much like (PJ=PJPK) that avoids all but this final problem is Dutant and Littlejohn’s (2020) thesis:

(Probable Knowledge) It is rational for S to believe p iff the probability that S is in a position to know p is sufficiently high.

Even after specifying the relevant notion of ‘in a position to know’ and the relevant notion of ‘probability’ (objective, subjective, epistemic, together with some specification of what counts as an agent’s evidence), provided we can and should distinguish between propositionally and doxastically rational belief, it seems that (Probable Knowledge) is either not going to be a genuinely knowledge-first view or one that does not allow for enough doxastically rational beliefs due to the basing worry described above in connection with Bad Past.

Reynolds (2013) offers a related view of doxastic justification on which justified belief is the appearance of knowledge: “I believe with justification that I am currently working on this paper if and only if there has been an appearance to me of my knowing that I am currently working on this paper.” Generalizing this we get:

(J=AK) S’s belief that p is justified if and only if S is appeared to as though S knows that p.

On his view, appearances are not doxastic states, nor are they conceptually demanding. As he explains the target notion:

Consider the following example: Walking in a park I notice an unfamiliar bird, and decide I would like to find out what it is. Fortunately, it doesn’t immediately fly away, so I observe it for two or three minutes. A few hours later, having returned home, I look up a web site, find a few photos, follow up by watching a video, and conclude confidently that I saw a Steller’s Jay. I think it is perfectly correct to say that the bird I saw had the appearance of a Steller’s Jay, even though I didn’t know that that’s what it was at the time. If it hadn’t had the appearance of a Steller’s Jay, I wouldn’t have been able to remember that appearance later and match it to the photos and video of Steller’s Jays. I didn’t have the concept of a Steller’s Jay, yet I had an appearance of a Steller’s Jay. (Reynolds 2013: 369)

(J=AK) has advantages over (PJ=PJPK). It does not lead to an infinite hierarchy of meta-justificatory claims and it is not hard to see how many of our occurrent beliefs might be based on such appearances, thereby avoiding some of the skeptical challenges that threatened (PJ=PJPK). But there are problems.

One concern with (J=AK) is its self-reflective character. To have a justified belief you have to be (or have been) in a state in which it appears to you as though you have knowledge. This requires introspective abilities, which arguably some knowing creatures might lack. As Dretske (2009) put it: a dog can know where its bowl is, and a cat can know where the mouse ran. The correctness of these and other knowledge ascriptions does not seem to turn on whether or not dogs and cats have the capacity to access their own mental lives in such a way that they can appear to themselves to have knowledge.

Moreover, (J=AK) implies that every justified belief is a belief with such an appearance. But many of the justified beliefs we form and much of the knowledge we acquire is merely dispositional, that is, it involves dispositional beliefs that are never or only very briefly made occurrent. Do we, as a matter of psychological fact, also have the appearance of knowledge with regard to all such states? There is non-trivial empirical reason to find this suspicious. In the psychology of memory, it has been observed that our memory systems are not purely preservative, they are also constructive. For example, our sub-personal memory systems often lead us to forget very specific beliefs while forming new beliefs that are more general in character. Sometimes this leads to new knowledge and new justified beliefs (Grundmann and Bernecker 2019). But if the new belief is the product of sub-personal operations and the more general belief is itself unretrieved, then it is unclear how that more general unretrieved justified belief could appear to oneself as a case of knowing.

A final concern with (J=AK) is its ability to handle undercutting defeat and the plausible idea that beliefs can cognitively penetrate appearances (see cognitive penetration). For suppose you have strong undefeated evidence that you are in fake-barn country, but you brazenly believe without justification that you are looking at the one real barn in all the country. Perhaps this is because you pathologically believe in your own good fortune. But pathology is not necessary to make the point, as it is often assumed that we can have unjustified beliefs that we believe to be justified. If either is your situation, your belief that you are looking at a real barn can appear to you to be knowledge given your normal visual experience and the fact that you (unjustifiably) believe your defeater to have been defeated. According to (J=AK) your belief is then justified. But that is the wrong result. Unjustified beliefs that enable the appearance of knowledge should not have the ability to neutralize defeaters.

Here is a final perspectival, knowledge-first theory of justification. It is mentioned by Smithies (2012) and explored by Rosenkranz (2018):

(J=¬K¬K): S has justification to believe p iff S is not in a position to know that S is not in a position to know that p.

Like Smithies, Rosenkranz relies on a conception of justification and being in a position to know that is psychologically undemanding. But unlike Smithies, Rosenkranz explicitly regards his view as being about justification for idealized agents and leaves open what relevance this notion has for ordinary, non-idealized agents like us.

There are at least two concerns with this view of justification. First, suppose we were to treat (J=¬K¬K) as a theory of justification for ordinary non-ideal agents and imposed (as many wish to) substantive psychological limits on what one has justification to believe. With such limits in place, (J=¬K¬K) would face not an over-intellectualization problem but an under-intellectualization problem. For agents who lack the concept KNOWLEDGE or the complicated concept POSITION TO KNOW could never be in a position to know that they are not in a position to know. So, such agents would be justified in believing anything.

But even once psychological limits are stripped away, and with them the under-intellectualization problem, another problem remains. Smithies (2012: 270) points out that, on this view, to lack justification one must be in a position to know that one is not in a position to know. Since being in a position to know is factive, this limits defeating information to factive defeating information. But it seems like misleading (non-factive) information can also defeat knowledge and justification. For example, suppose you are told that you are in fake-barn country. But in fact you are not, so you are not in a position to know that you are in fake-barn country. Still, the misleading testimony that you are in fake-barn country gives you justification to believe that you are in fake-barn country. Intuitively, this misleading testimony will defeat your justification to believe that there is a barn ahead; the misleading testimony ensures you should not believe that. But you are not in a position to know that you are not in a position to know that there is a barn ahead—recall the testimony you receive is misleading. So (J=¬K¬K) says you have justification when intuitively you do not.

In response, it seems open to advocates of (J=¬K¬K) to argue that while one might not be in a position to know the content of the misleading testimony (because it is false), the misleading testimony itself can defeat. In this case, for example, it is arguable that the misleading testimony that one is in circumstances that make one’s knowing that p improbable itself defeats one’s being in a position to know p, and so prevents one’s good visual contact with an actual nearby barn in normal conditions from putting one in a position to know that a barn is nearby. However, recent arguments for the existence of “unreasonable knowledge” (that is, knowledge that p while knowing that it is improbable that one knows p) will challenge the integrity of this response in defense of (J=¬K¬K). For more on unreasonable knowledge see Lasonen-Aarnio (2010, 2014) and Benton and Baker-Hytch (2015).

6. Infallibilist Knowledge-First Virtue Epistemology

We are not simply retainers of propositional knowledge. We are also able to acquire it. You are, for example, able to figure out whether your bathroom faucet is currently leaking, you are able to figure out whether your favorite sports team won more games this season than last season, you are able to figure out the sum of 294 and 3342, and so on. In normal circumstances when you exercise this ability you gain propositional knowledge. If you are able to figure out whether the faucet is leaking and you use that ability, the typical result is knowledge that the faucet is leaking (if it is leaking) or knowledge that the faucet is not leaking (if it is not leaking). The core idea behind knowledge-first virtue epistemology (KFVE) is that justified belief is belief that is somehow connected to exercises of an ability to know. Predictably, (KFVE)-theorists have had different things to say about how justified belief is connected to such abilities.

Some have argued that success is a general feature of exercises of abilities (Millar 2016). That is, one exercises an ability only if one does what the ability is an ability to do. It is widely thought that belief formation is a part of exercising an ability to know because knowing is constituted by believing. It then follows, in the special case of exercises of abilities to know, that:

(Exercise Infallibilism) S’s belief is the product of an exercise of an ability to know only if S’s belief constitutes knowledge.

For example, Millar (2019) argues for a special instance of this in arguing that we cannot exercise an ability to know by perception without thereby acquiring perceptual knowledge.

If (Exercise Infallibilism) is true, and if justified beliefs just are beliefs that are products of abilities to know, then (J=K) follows. And so we have a virtue theoretic account of justified belief that faces all the same problems we saw above facing (J=K). Of note is the inability of such a view to accommodate the following desiderata:

Desideratum 1. Justification is non-factive, that is, one can have justified false beliefs.

Desideratum 2. One can have justified true beliefs that do not constitute knowledge, as in standard Gettier cases.

Desideratum 3. One can have justified perceptual beliefs even if one is in an environment where perceptual knowledge is impossible due to systematically misleading features of one’s perceptual environment. This can happen on a more global scale (as in the new evil demon case), and it can happen on a more local scale (as in beech-elm cases discussed below).

7. Proficiency-Theoretic Knowledge-First Virtue Epistemology

The central difference between Millar’s virtue theory and the remaining virtue theories is that the latter reject (Exercise Infallibilism). It is this rejection that makes the resulting theories resilient to the objections facing (J=K). On Miracchi’s (2015) preferred instance of (KFVE), exercises of abilities to know explain our justified beliefs, but it is not mere abilities to know that have the potential to yield justified beliefs. Rather, it is only proficient abilities to know (“competences”) that yield justified beliefs, and not all abilities to know are proficient abilities to know. One has a proficient ability to know just in case an exercise of one’s ability to know ensures a sufficiently high objective probability of knowing. That is, the conditional objective probability that S knows p given that S exercised a relevant ability to know is sufficiently high. This is a kind of in situ reliability demand on justification.

We can summarize her view of justified belief, roughly, as follows:

(KFVE-Proficiency) S has a justified belief iff S’s belief is competent, where S’s belief is competent iff S’s belief is produced by an exercise of a proficient ability to know.

Central to her view is the idea that exercises of proficient abilities are fallible, that is, an agent can exercise an ability to know without succeeding in knowing. So (Exercise Infallibilism) is given up. This enables (KFVE-Proficiency) to accommodate justified false beliefs (that is, Desideratum 1) as well as justified true beliefs that do not constitute knowledge (that is, Desideratum 2). So (KFVE-Proficiency) avoids two of the main challenges to (J=K) and Millar’s (KFVE-Infallibilism).

However, by limiting justified beliefs to beliefs produced by proficient abilities, Miracchi’s view is, like (J=K) and Millar’s infallibilist view, unable to accommodate Desideratum 3, that is, the possibility of justified beliefs formed in certain deceptive environments. The first case of this is just the familiar new evil demon case. For the recently envatted brain, as Kelp (2016; 2017; 2018) argues, retains the ability to know by perception that, say, they have hands by responding to visual appearances in normal circumstances. But because they are no longer in normal circumstances, they no longer possess a proficient ability to know. In other words, the recently envatted brain’s change of environment robs them of the proficiency needed to form justified beliefs.

Miracchi (2020) rejects, or is at least deeply suspicious of, the metaphysical possibility of the new evil demon hypothesis. But we need not rely on fantastical envatted brain scenarios to make this style of objection to (KFVE-Proficiency). Suppose you grew up in an environment with lots of beech trees and developed the ability to visually identify them and thus the ability to know that a beech tree is nearby by sight. Since exercises of abilities are fallible, you could exercise this beech-identification ability if you were to unwittingly end up in another environment where there are only elms (which, according to Putnam, look indistinguishable from beeches to the untrained). But this is not an environment where your ability to identify beeches amounts to a proficiency: conditional on your exercise of your ability to identify and come to know that beeches are nearby, it is objectively highly likely that you will fail to know. So the intuition that you can have justified perceptual beliefs about beeches being nearby in such a case appears inconsistent with (KFVE-Proficiency). While there may be some doubt about the metaphysical possibility of the new evil demon hypothesis, this is a perfectly possible scenario. See Kelp (2018: 92) for a similar objection to Miracchi’s view.

One last concern with (KFVE-Proficiency) regards its ability to accommodate defeat. This is discussed in the section below.

8. Functionalist & Ability-Theoretic Knowledge-First Virtue Epistemology

Kelp (2016; 2017; 2018) and Simion (2019) offer versions of (KFVE) that do not tie justification so closely to in situ reliability and thereby avoid not only the problem of having justified false beliefs and the possibility of Gettier cases, but also problems arising from the new evil demon hypothesis and very local cases of deception (like the beech-elm case above). So Desiderata 1–3 are easily managed. This section first explains their distinctive views and then mentions some concerns they share.

On Kelp’s (2016; 2017; 2019) view, justified belief is competent belief, and competent beliefs are generated by exercises of an agent’s ability to know. Importantly, such exercises do not require proficiency in Miracchi’s sense. Kelp’s view, roughly, amounts to this:

(KFVE-Ability) S has a justified belief iff S’s belief is competent, where S’s belief is competent iff S’s belief is produced by an exercise of an ability to know.

On Simion’s (2019) view, in contrast, justified beliefs are beliefs that are generated by properly functioning cognitive processes that are aimed at yielding knowledge. Presumably, if an agent has properly functioning cognitive processes that are aimed at yielding knowledge, then such an agent has an ability to know as well. So it’s not too much of a taxonomic stretch to place Simion’s theory among the virtue theories. Like the exercise of abilities, cognitive processes can properly function without proficiency:

(KFVE-Functionalism) S’s belief is justified iff S’s belief is produced by a properly functioning cognitive process that has the etiological function of generating knowledge.

These statements of Kelp and Simion’s views are relatively coarse-grained and both Kelp and Simion defend more refined theses.

Kelp and Simion’s views are not unrelated to each other. For the ability to know is an ability one has in virtue of having certain belief-producing cognitive processes, and Kelp’s (2018) preferred account of how the ability to know is acquired is the same general kind of account that Simion (2019) relies on in arguing that the cognitive processes that constitute one’s ability to know are cognitive processes whose function is knowledge production. Nevertheless, the views are distinct in that (KFVE-Ability) grounds justification in agent abilities, while (KFVE-Functionalism) grounds them in cognitive processes. See Kelp (2019) for a discussion of the importance of this difference.

Central to their views is the idea that exercises of abilities to know are fallible, and given the fallibility of exercises of the ability to know, (KFVE-Ability) and (KFVE-Functionalism) allow for justified false beliefs and justified true beliefs that do not constitute knowledge. So, Desiderata 1 and 2 are easily accommodated.

Desideratum 3 is likewise easily accommodated. In Kelp’s (2018) telling, the recently envatted brain retains and exercises an ability to know when believing she has a hand upon having the visual experience as of a hand. According to Simion (2019), just as an envatted heart pumping orange juice counts as a properly functioning heart, a recently envatted brain counts as properly functioning when it comes to believe it has a hand upon having the visual experience as of a hand. And if justified belief can be had in cases of such systematic perceptual deception, then it can also be had in cases of localized perceptual deception as in the beech-elm scenario above.

So (KFVE-Ability) and (KFVE-Functionalism) can accommodate Desiderata 1–3. What about the desiderata that emerged in the objections to (JuJu), (JPK), and reasons-first, knowledge-first views? That is:

Desideratum 4. Justified beliefs can be based on inferences from justified false beliefs.

Desideratum 5. Justified beliefs can be based on inferential acts involving contentless beliefs.

Desideratum 6. Justified belief is a kind of creditable belief.

Desideratum 7. Justified belief has a historical dimension that is incompatible with situations like Bad Past.

If (KFVE-Ability) or (KFVE-Functionalism) imply that a recently envatted brain is able to have justified beliefs from an exercise of an ability to know or as a product of their cognitive competences which aim at knowledge, then it is easy to see how Desiderata 4 and 5 are satisfied by (KFVE-Ability) and (KFVE-Functionalism). For these seem like more local cases of deception. As for 6 and 7, the virtue-theoretic machinery here is key. For both can be explained by the demand that justified beliefs are beliefs that issue from an ability or a properly functioning cognitive process. But that was exactly what was lacking in the cases discussed above that motivated 6 and 7. See Silva (2017) for an extended discussion of how certain versions of (KFVE) can satisfy these desiderata.

There are some worries about these versions of (KFVE). Consider Schroeder’s (2015) discussion of defeater pairing. Any objective condition, d, which defeats knowledge that p is such that: if one justifiedly believes that d obtains then this justified belief will defeat one’s justification to believe p. For example, suppose you formed the belief that a wall is red from an ability to know this by perception and that you are in normal circumstances where the wall is in fact red. You will have a justified belief according to each of the fallibilist versions of (KFVE) above. But suppose you were given misleading yet apparently reliable undercutting information that the wall is illuminated by red lights and so might not actually be red. This is not true, but were it true it would defeat your knowledge; were it true you would be in a Gettier situation. Now the defeater pairing insight says that the fact that you justifiedly believe the wall is illuminated by red lights defeats your justification to believe the wall is red. But since you arrived at your belief that the wall is red through an exercise of your proficiency or ability or properly functioning cognitive process, you have a justified belief according to (KFVE-Proficiency), (KFVE-Ability), and (KFVE-Functionalism). That is inconsistent with the intuition that the justification for your belief is defeated.

So this objection gives rise to a further potential demand on an adequate theory of justified belief:

Desideratum 8. Justified belief is susceptible to defeat by justified defeating information.

A possible response to this objection is to maintain that exercises of abilities, or the use of reliable processes, always depend on the absence of credible defeating information. In that case, the versions of (KFVE) above may be able to accommodate Desideratum 8.

Another response is to resist Desideratum 8 and the supposed phenomenon of defeater pairing. For more on this, see the discussion of “unreasonable justified beliefs” in Lasonen-Aarnio (2010, 2014) and Benton and Baker-Hytch (2015). For qualified opposition see Horowitz (2014).

The second concern to have about (KFVE-Ability) and (KFVE-Functionalism) is that there is a question about the extent to which abilities/cognitive processes are “in the head.” For example, consider the amputee gymnast. She lost her leg and so no longer has the ability to do a backflip. So her ability to do backflips is located, in part, in her ability to successfully interact with the physical world in some ways. In this case, it is located in her ability to control her body’s physical movements in certain ways. This does not conflate proficiency with mere ability, for even with both legs the gymnast might not have a proficiency because she’s in an inhospitable environment for performing backflips (high winds, buckling floors, and so forth). We might wonder, then, whether the envatted brain’s ability to know by perception is lost with the loss of her body and the body’s perceptual apparatus just as the gymnast’s ability to do backflips is lost with the loss of her leg. If so, then it is a mistake to think (KFVE-Ability) and (KFVE-Functionalism) are compatible with the new evil demon hypothesis, and hence with Desideratum 3. This threatens to make these views more revisionary than they initially appeared to be.

9. Know-How Theories and the No-Defeat Condition

Silva (2017) argues that justification is grounded in our practical knowledge (knowledge-how) concerning the acquisition of propositional knowledge (knowledge-that). The motivation for this incarnation of (KFVE) starts with the simple observation that we know how to acquire propositional knowledge. You, for example, know how to figure out whether your bathroom faucet is currently leaking, you know how to figure out whether your favorite sports team won more games this season than last season, you know how to figure out the sum of 294 and 3342, and so on. In normal circumstances when you exercise such know-how you typically gain propositional knowledge. If you know how to figure out whether the faucet is leaking and you use that know-how, the typical result is knowledge that the faucet is leaking (if it is leaking) or knowledge that the faucet is not leaking (if it is not leaking). One way of thinking about the grounds of justification is that it is crucially connected to this kind of know-how: justified belief is, roughly, belief produced by one’s knowledge how to acquire propositional knowledge.

Here is a characterization of Silva’s (2017) view:

(KFVE-KnowHow) S has a justified belief iff (i) S’s belief is produced by an exercise of S’s knowledge of how to gain propositional knowledge, and (ii) S is not justified in thinking she is not in a position to acquire propositional knowledge in her current circumstances.

One advantage of (KFVE-KnowHow) is that it is formulated in terms of know-how and so avoids worries about abilities not being “in the head.” For example, while the amputee gymnast discussed above lacks the ability to perform backflips, she still knows how to do them. Similarly, in thinking about the recently envatted brain, she still knows how to acquire propositional knowledge by perception even if she lacks the ability to do so because she has lost the necessary perceptual apparatus. So Desideratum 3 is, arguably, easier to accommodate with (KFVE-KnowHow).

Similarly, since exercises of know-how are fallible in situ (Hawley 2003), (KFVE-KnowHow) has no trouble explaining how exercises of one’s knowledge how to know could lead one to have a false belief (that is, Desideratum 1) or have true beliefs that do not constitute knowledge (that is, Desideratum 2). For similar reasons (KFVE-KnowHow) is able to satisfy Desiderata 4-7. See Silva (2017) for further discussion.

Lastly, condition (ii) is a kind of “no defeater” condition that makes (KFVE-KnowHow) compatible with Schroeder’s defeater-pairing thesis and standard intuitions about undercutting defeat. So it manages to accommodate Desideratum 8. So (KFVE-KnowHow) appears capable of satisfying all the desiderata that emerged above. Accordingly, to the extent that one finds some subset of Desiderata 1–8 objectionable one will have reason to object to (KFVE-KnowHow). For one way of developing this point see the next section.

10. Excused Belief vs. Justified Belief

The objections to knowledge-first views of justification above assumed, among other things, that justification has the following properties:

Desideratum 1. Justification is non-factive, that is, one can have justified false beliefs.

Desideratum 2. One can have justified true beliefs that do not constitute knowledge, as in standard Gettier cases.

Desideratum 3. One can have justified perceptual beliefs even if one is in an environment where perceptual knowledge is impossible due to systematically misleading features of one’s perceptual environment. This can happen on a more global scale (as in the new evil demon case), and it can happen on a more local scale (as in beech-elm cases discussed above).

Desideratum 4. Justified beliefs can be based on inferences from justified false beliefs.

Desideratum 5. Justified beliefs can be based on inferential activities involving contentless beliefs.

Desideratum 6. Justified belief is a kind of creditable belief.

Desideratum 7. Justified belief has a historical dimension that is incompatible with situations like Bad Past.

Desideratum 8. Justified belief is susceptible to defeat by justified defeating information.

Knowledge-first virtue epistemology has the easiest time accommodating these assumed properties of justification, with (KFVE-KnowHow) being able to accommodate all of them.

In defense of alternative knowledge-first views some might argue that Desiderata 1–8 (or some subset thereof) are not genuine properties of justification, but rather properties of a kindred notion, like excuse. Littlejohn (2012: ch. 6; 2020) and Williamson (2014: 5; 2020) have argued that the failure to properly distinguish justification from excuses undermines many of the arguments that object to there being a tight connection between knowledge and justification. An excuse renders you blameless in violating some norm, and it is easy to see how some might argue that 1–8 (or some subset thereof) indicate situations in which an agent is excusable, and so blameless, although her belief is not justified. For the locus classicus on the concept of excuse see Austin’s “A Plea for Excuses.” For critical discussion of the excuse maneuver in defense of knowledge-first theories (of assertion and justification) see Lackey (2007), Gerken (2011), Kvanvig (2011), Schechter (2017), Madison (2018), and Brown (2018).

Arguably, the most accommodating knowledge-first virtue theory, (KFVE-KnowHow), threatens to make the concept of an excuse nearly inapplicable in epistemology. For the situations indicated in 1-8 are so inclusive that it can be hard to see what work is left for excuses. If one thought there should be deep parallels between epistemology and moral theory, which leaves substantive work for excuses, then one might worry that any theory that can accommodate all of Desiderata 1-8 will in some way be guilty of conflating justification with excuse.

11. A Methodological Reflection on Gettier

The history of the Gettier problem is a long history of failed attempts to give a reductive account of knowledge in terms of justification and other conditions. In light of this, many have since judged the project of providing a reductive analysis of knowledge to be a degenerating research program. In putting knowledge first in the theory of justification, epistemologists are exploring whether we can more successfully reverse the order of explanation in epistemology by giving an account of justified belief in terms of knowledge. Attempts to put knowledge first in the theory of justification began during the early twenty-first century, and their course has been reminiscent of the history of attempts to solve the Gettier problem: knowledge-first theories are proposed, counterexamples are given, new knowledge-first theories (or error theories) are developed, new counterexamples are given, and so on (Whitcomb 2014: sect. 6).

Perhaps this repeat of Gettierology merits a new approach. One such approach, advocated by Gerken (2018), is an ‘equilibristic epistemology’ according to which there is not a single epistemic phenomenon or concept that comes first in the project of the analysis of knowledge or justification. Rather, there are various basic epistemic phenomena that are not reductively analyzable. At most they may be co-elucidated in a non-reductive manner. Alternatively, perhaps we should return to the tradition from which knowledge-first epistemology sprang. That is, perhaps we should return to the prior project of providing a reductive analysis of knowledge in terms of other conditions. One manifestation of a return to the traditional approach involves drawing a distinction between knowledge and awareness, where the diagnosis of the failure of post-Gettier analyses of knowledge is, in part, taken to be a failure to appreciate the differences between knowledge and awareness (Silva 2023: ch. 8–9).

12. References and Further Reading

  • Benton, M. and M. Baker-Hytch. 2015. ‘Defeatism Defeated.’ Philosophical Perspectives 29: 40-66.
  • Bird, Alexander. 2007. ‘Justified Judging.’ Philosophy and Phenomenological Research, 74: 81-110.
  • Brown, J. 2018. Fallibilism. Oxford: Oxford University Press.
  • Chalmers, D. 2012. Constructing the World. Oxford: Oxford University Press.
  • Comesana, J. and Kantin, H. 2010. ‘Is Evidence Knowledge?’ Philosophy and Phenomenological Research, 89: 447-455.
  • Conee, E. 1987. ‘Evident, but Rationally unacceptable’. Australasian Journal of Philosophy 65: 316-26.
  • Conee, E. 1994. ‘Against an Epistemic Dilemma’. Australasian Journal of Philosophy 72: 475-81.
  • Dretske, F. 2009. Perception, Knowledge, Belief. Cambridge: Cambridge University Press.
  • Dutant, J. and C. Littlejohn. 2020. ‘Defeaters as indicators of ignorance.’ In J. Brown and M. Simion (eds.), Reasons, Justification, and Defeat. Oxford: Oxford University Press.
  • Fratantonio, G. 2019. ‘Armchair Access and Imagination.’ Dialectica 72(4): 525-547.
  • Gerken, M. 2011. ‘Warrant and Action.’ Synthese, 178(3): 529-47.
  • Gerken, M. 2018. ‘Against Knowledge-First Epistemology.’ In J. A. Carter, E. Gordon, and B. Jarvis (eds.), Knowledge First: Approaches in Epistemology and Mind. Oxford: Oxford University Press. pp. 46-71.
  • Greco, J. 2014. ‘Justification is not Internal.’ In M. Steup, J. Turri, and E. Sosa (eds.) Contemporary Debates in Epistemology. Oxford: Wiley Blackwell: 325-336.
  • Grundmann, T. and S. Bernecker. 2019. ‘Knowledge from Forgetting.’ Philosophy and Phenomenological Research XCVIII: 525-539.
  • Hawley, K. 2003. ‘Success and Knowledge-How.’ American Philosophical Quarterly, 40: 19-3.
  • Hawthorne, J. Knowledge and Lotteries. Oxford:  Oxford University Press.
  • Horowitz, S. 2014. ‘Epistemic Akrasia.’ Nous 48/4: 718-744.
  • Ichikawa, J.J. 2014. ‘Justification is Potential Knowledge.’ Canadian Journal of Philosophy, 44: 184-206.
  • Ichikawa, J.J. 2017. ‘Basic Knowledge First.’ Episteme 14(3): 343-361.
  • Ichikawa, J. and Steup, M. 2012. ‘The Analysis of Knowledge.’ Stanford Encyclopedia of Philosophy.
  • Ichikawa, J. and C.S.I. Jenkins. 2018. In Joseph Adam Carter, Emma C. Gordon & Benjamin Jarvis (eds.), Knowledge First: Approaches in Epistemology and Mind. Oxford University Press.
  • Kelp, C., M. Simion, H. Ghijsen. 2016. ‘Norms of Belief.’ Philosophical Issues 16: 374-92.
  • Kelp, C. 2016. ‘Justified Belief: Knowledge First-Style.’ Philosophy and Phenomenological Research 93: 79-100.
  • Kelp, C. 2017. ‘Knowledge First Virtue Epistemology.’ In Carter, A., Gordon, E. and Jarvis, B. (eds.) Knowledge First: Approaches in Epistemology and Mind. Oxford: Oxford University Press.
  • Kelp, C. 2019b. ‘How to Be a Reliabilist.’ Philosophy and Phenomenological Research 98: 346-74.
  • Kelp, C. 2018. Good Thinking: A Knowledge-First Virtue Epistemology. New York: Routledge.
  • Kiesewetter, B. 2017. The Normativity of Rationality. Oxford: Oxford University Press.
  • Kvanvig, J. L. 2011. ‘Norms of Assertion.’ In Jessica Brown and Herman Cappelen (eds.), Assertion: New Philosophical Essays. Oxford: Oxford University Press.
  • Lackey, J. 2007. ‘Norms of Assertion.’ Nous 41: 594-626.
  • Lasonen-Aarnio, M. 2010. ‘Unreasonable knowledge.’ Philosophical Perspectives 24: 1-21.
  • Lasonen-Aarnio, M. 2014. ‘Higher-order evidence and the limits of defeat.’ Philosophy and Phenomenological Research 88: 314–345.
  • Lewis, D. 1997. ‘Finkish Dispositions.’ The Philosophical Quarterly 47: 143-58.
  • Littlejohn, C. 2017. ‘How and Why Knowledge is First.’ In A. Carter, E. Gordon & B. Jarvis (eds.), Knowledge First. Oxford: Oxford University Press.
  • Littlejohn, C. 2012. Justification and the Truth-Connection. Cambridge: Cambridge University Press.
  • Littlejohn, C. 2019. ‘Being More Realistic About Reasons: On Rationality and Reasons Perspectivism.’ Philosophy and Phenomenological Research 99/3: 605-627.
  • Littlejohn, C. 2020. ‘Plea for Epistemic Excuses.’ In F. Dorsch and J. Dutant (eds.), The New Evil Demon Problem. Oxford: Oxford University Press.
  • Madison, B. 2010. ‘Is Justification Knowledge?’ Journal of Philosophical Research 35:173-191.
  • Madison, B. 2018. ‘On Justifications and Excuses.’ Synthese 195 (10):4551-4562.
  • McGlynn, A. 2014. Knowledge First? Palgrave MacMillan.
  • Meylan, A. 2017. ‘In support of the knowledge-first conception of the normativity of justification.’ In Carter, A., Gordon, E. and Jarvis, B. (eds.) Knowledge First: Approaches in Epistemology and Mind. Oxford: Oxford University Press.
  • Millar, A. 2016. ‘Abilities, Competences, and Fallibility.’ In M. Á. Fernández (ed.), Performance Epistemology. Oxford: Oxford University Press.
  • Millar, A. 2019. Knowing by Perceiving. Oxford: Oxford University Press.
  • Miracchi, L. 2015. ‘Competence to Know.’ Philosophical Studies, 172: 29-56.
  • Miracchi, L. 2020. ‘Competent Perspectives and the New Evil Demon Problem.’ In J. Dutant and F. Dorsch, (eds.), The New Evil Demon. Oxford: Oxford University Press.
  • Neta, R. and D. Pritchard. 2007. ‘McDowell and the New Evil Genius.’ Philosophy and Phenomenological Research, 74: 381-396.
  • Neta, R. 2017. ‘Why Must Evidence Be True?’ in The Factive Turn in Epistemology, edited by Velislava Mitova. Cambridge: Cambridge University Press.
  • Pritchard, D. and Greenough, P. Williamson on Knowledge. Oxford: Oxford University Press.
  • Reynolds, S. 2013. ‘Justification as the Appearance of Knowledge.’ Philosophical Studies, 163: 367-383.
  • Rosenkranz, S. 2007. ‘Agnosticism as a Third Stance.’ Mind 116: 55-104.
  • Rosenkranz, S. 2018. ‘The Structure of Justification.’ Mind 127: 309-338.
  • Roche, W. and T. Shogenji. 2014. ‘Confirmation, transitivity, and Moore: The Screening-off Approach.’ Philosophical Studies 168: 797-817.
  • Schechter, J. 2017. ‘No Need for Excuses.’ In J. Adam Carter, Emma Gordon & Benjamin Jarvis (eds.), Knowledge-First: Approaches in Epistemology and Mind. Oxford University Press. pp. 132-159.
  • Silins, N. 2005. ‘Deception and Evidence.’ Philosophical Perspectives 19: 375-404.
  • Silins, N. 2007. ‘Basic justification and the Moorean response to the skeptic.’ In T. Gendler & J. Hawthorne (Eds.), Oxford Studies in Epistemology (Vol. 2, pp. 108–140). Oxford: Oxford University Press.
  • Silva, P. 2017. ‘Knowing How to Put Knowledge First in the Theory of Justification.’ Episteme 14 (4): 393-412.
  • Silva, P. 2018. ‘Explaining Enkratic Asymmetries: Knowledge-First Style.’ Philosophical Studies 175 (11): 2907-2930.
  • Silva P. & Tal, E. 2021. ‘Knowledge-First Evidentialism and the Dilemmas of Self-Impact.’ In Kevin McCain, Scott Stapleford & Matthias Steup (eds.), Epistemic Dilemmas. London: Routledge.
  • Silva, P. 2023. Awareness and the Substructure of Knowledge. Oxford: Oxford University Press.
  • Simion, M. 2019. ‘Knowledge‐first functionalism.’ Philosophical Issues 29 (1): 254-267.
  • Smith, M. 2012. ‘Some Thoughts on the JK-Rule.’ Nous 46(4): 791-802.
  • Smithies, D. 2012. ‘The Normative Role of Knowledge.’ Nous 46(2): 265-288.
  • Smithies, D. 2019. The Epistemic Role of Consciousness. Oxford: Oxford University Press.
  • Sutton, J. 2005. ‘Stick to What You Know.’ Nous 39(3): 359-396.
  • Sutton, J. 2007. Beyond Justification. Cambridge: MIT Press.
  • Sylvan, K. 2018. ‘Knowledge as a Non-Normative Relation.’ Philosophy and Phenomenological Research 97 (1): 190-222.
  • Whitcomb, D. 2014. ‘Can there be a knowledge-first ethics of belief?’ In Jonathan Matheson & Rico Vitz (eds.), The Ethics of Belief: Individual and Social. Oxford: Oxford University Press.
  • Williamson, T. 2000. Knowledge and its Limits. Oxford: Oxford University Press.
  • Williamson, T. 2009. ‘Replies to Critics.’ In Duncan Pritchard & Patrick Greenough (eds.), Williamson on Knowledge. Oxford: Oxford University Press. pp. 279-384.
  • Williamson, T. 2014. ‘Knowledge First.’ In M. Steup, J. Turri, and E. Sosa (eds.), Contemporary Debates in Epistemology (Second Edition). Oxford: Wiley-Blackwell.
  • Williamson, T. 2020. ‘Justifications, Excuses, and Sceptical Scenarios.’ In J. Dutant and F. Dorsch (eds.), The New Evil Demon. Oxford: Oxford University Press.
  • Zagzebski, L. 1996. Virtues of the Mind: An Inquiry into the Nature of Virtue and the Ethical Foundations of Knowledge. Cambridge: Cambridge University Press.

 

Author Information

Paul Silva Jr.
Email: psilvajr@gmail.com
University of Cologne
Germany

British Empiricism

‘British Empiricism’ is a name traditionally used to pick out a group of eighteenth-century thinkers who prioritised knowledge via the senses over reason or the intellect and who denied the existence of innate ideas. The name includes most notably John Locke, George Berkeley, and David Hume. The counterpart to British Empiricism is traditionally considered to be Continental Rationalism, which was advocated by Descartes, Spinoza, and Leibniz, all of whom lived in Continental Europe beyond the British Isles and all of whom embraced innate ideas. This article characterizes empiricists more broadly as those thinkers who accept Locke’s Axiom that there is no idea in the mind that cannot be traced back to some particular experience. It includes British-Irish philosophy from the seventeenth, eighteenth, and nineteenth centuries. As well as exploring the traditional connections between empiricism and metaphysics and epistemology, it examines how British empiricists dealt with issues in moral philosophy and the existence and nature of God. The article identifies some challenges to the standard understanding of British Empiricism by including early modern thinkers from typically marginalised groups, especially women. Finally, in showing that there is nothing uniquely British about being an empiricist, it examines a particular case study of the eighteenth-century philosopher Anton Wilhelm Amo, the first African to receive a doctorate in Europe.

Table of Contents

  1. Introduction
    1. Historiography
  2. The Origins of Empiricism
    1. Precursors to Locke
    2. Locke
  3. Our Knowledge of the External World and Causation
    1. Berkeley on the Nature of the External World
    2. Hume on the Nature of Causation
    3. Shepherd on Berkeley and Hume
  4. Morality
    1. Hutcheson and the Moral Sense
    2. Hume on Taste and the Moral Sense
    3. Newcome on Pain, Pleasure, and Morality
  5. God and Free-Thinking
    1. Anthony Collins
    2. John Toland
    3. George Berkeley
  6. Anton Wilhelm Amo: A Case Study in the Limits of British Empiricism
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Introduction

This article is called ‘British Empiricism’, but it could just as accurately have been titled ‘British-Irish Philosophy from the seventeenth to the nineteenth century and the Lockean Axiom’. The article focuses on the commitment to the Lockean version of the Peripatetic axiom that is shared by many British and Irish thinkers in the seventeenth, eighteenth, and nineteenth centuries. Following John Locke (1632–1704), virtually all the empiricist thinkers considered in this article accept that “nothing is in the intellect that was not first in the senses” (De veritate q. 2 a. 3 arg. 19), to use Thomas Aquinas’s (1225–1274) phrasing of what is known as the Peripatetic Axiom (see Cranefield 1970 for more on the origin of the phrase).

While the shared acceptance of this axiom is a unifying feature for the thinkers considered in this article, it is worth starting off with some problematization of the term ‘British Empiricism’. The term ‘British’ here is used in a sense common in the early modern period which includes both what in the early twenty-first century was the United Kingdom of Great Britain and Northern Ireland and the Republic of Ireland, and thus includes thinkers such as the Ardagh-born John Toland (1670–1722) and the Kilkenny-born George Berkeley (1685–1753). The term ‘British’ here also excludes the many British colonies, meaning that the scope of this article is not global but Western European. The term ‘empiricism’ considered here is neither exhaustive nor confined to ‘Britain’. In other words, this article does not discuss all British thinkers who are committed to the Peripatetic axiom. Nor do we claim that such a commitment only exists among British thinkers (see also section 6.1). We further problematize the term by discussing its historiography (section 1.1). This helps to explain why we chose to keep (and use) the term and how the issues and thinkers considered in this article were selected. After all, it is important to be transparent about the fact that an article like this, which focuses on a philosophical tradition, tells a particular story. This will inevitably involve choices by the authors that are shaped by factors like their own introduction to that tradition and which concern the protagonists and the content considered, both of which we outline below.

Section 2 considers the history of the Peripatetic axiom and Locke’s interpretation of it, which here is called the Lockean Axiom.

Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience.

Subsequent sections consider how this axiom, accepted in some form by all the thinkers below, was applied to a variety of questions. Section 3 discusses its application to our knowledge of the external world, focusing on George Berkeley (1685–1753), David Hume (1711–1776), and Mary Shepherd (1777–1847). Section 4 focuses on how the axiom influenced moral philosophy in the period, focusing on Hume, Francis Hutcheson (1694–1746), and Susanna Newcome (1685–1763). Section 5 examines the application of the axiom to our knowledge of God, and it focuses on Berkeley, Toland, and Anthony Collins (1676–1729). The final section (section 6) focuses on the limitations of the narrative developed here by considering the case of Anton Wilhelm Amo (c. 1703–1759). Amo is committed to a version of the Lockean Axiom and thus there is a strong reason to consider him within the narrative developed here. However, including Amo comes at the price of challenging the moniker ‘British’ and thus of another feature that determined the selection.

In other words, the purpose of including Amo is twofold. First, it highlights the limits of our narrative. Second, it points to the arbitrary nature of any historical narrative concerning ‘British Empiricism.’ This results from the fact, which we highlight in the next section, that (‘British’) ‘Empiricism’ is an external ascription applied by scholars to certain philosophers – and not a self-expression of a common identity these philosophers took themselves to have shared. In other words, it is an analyst’s category and not an actor’s one. As such, any narrative using this category is always, more or less explicitly, guided by the assumptions, interests, values, and goals of the scholar or analyst employing it. In an attempt to be as transparent as possible about these assumptions, as well as to bolster the case for our claim of arbitrariness, we consider the historiography of the term ‘empiricism’ in the next section. This will also serve to shed further light on the nature and scope of the narrative we develop here, and the ways in which it deviates from the standard narrative.

a. Historiography

A crucial thing to note about both the term ‘British Empiricism’ and what is traditionally thought of as its counterpart ‘Continental Rationalism’ is that they are both anachronisms in the previously introduced sense of being analysts’, and not actors’, categories. To put it differently, none of the thinkers considered in this article, nor thinkers like René Descartes (1596–1650), Baruch Spinoza (1632–1677), or Gottfried Wilhelm Leibniz (1646–1716), who are usually thought of as ‘rationalists,’ used these terms to describe themselves. These thinkers did not think of themselves as working in unified traditions that were opposed to each other. Take the case of Berkeley for instance: while Berkeley critically reacts to Descartes (for example, Letter 44), he is even more critical of Locke. As a case in point, consider his rejection of the existence of abstract ideas in the Introduction to A Treatise Concerning the Principles of Human Knowledge. In fact, we know of no place in Berkeley’s work where he clearly suggests that he sees himself as working in some sort of tandem with Locke, against the likes of Descartes or Leibniz. Leibniz even writes about Berkeley’s Principles that “[t]here is much here that is correct and close to my own view” (AG 307). At the same time, Leibniz defends the notion of innate ideas against Locke (see New Essays, G VI), but he also has a critical attitude towards Cartesianism on a variety of issues (see, for example, Anfray 2019 for a concise overview). In summary, the interrelations between these various actors (Berkeley, Locke, Descartes, and Leibniz in this instance) are complex, and it would be a stretch to suggest they saw themselves in two opposing camps.

The fact that it is highly doubtful that ‘empiricists’ (and ‘rationalists’) perceived themselves as such is important. This raises the question of why it is still often taken to be the case that there are two antagonistic philosophical traditions in early modern Europe epitomized by, on the one hand, Descartes, Leibniz, and Spinoza, and Berkeley, Hume, and Locke on the other. What is more, there is evidence that the contrast between these traditions, as we know it today, was invented in the 1850s by the German historian Kuno Fischer (1824–1907) (see Mercer 2020, 73; for more on the rise of these labels see also Loeb 2010, Norton 1981, Vanzo 2016).

However, despite its complicated history, and further potential challenges which we discuss towards the end of this section, we believe retaining the label ‘British Empiricism’ is fruitful as long as one is fully aware of the fact that it is an analyst’s category. Importantly, there needs to be transparency about the criteria that are used to group certain thinkers together. In our case, the group of thinkers considered here are all, with one exception, British or Irish in the previously outlined sense and share a commitment to the Lockean Axiom, that ‘there is no idea in the mind that cannot be traced back to some particular experience’. This axiom was developed in response to the notion that humans possess innate ideas or innate knowledge (whether that be of mathematical/geometrical truths, or of God), which had previously been endorsed by Plato and was defended in the seventeenth century by thinkers like Descartes, later Cartesians such as Nicolas Malebranche (1638–1715), and Leibniz (for Locke’s metaphysics and epistemology, see, for example, Ayers 1991, Bennet 1971, Chappell 1992, Jolley 1999, Mackie 1976, Yolton 1956, Wilson 1999).

Locke, and subsequent thinkers who would go on to be characterised as empiricists, rejected this innatist notion. Indeed, it is standard to view responses to this question, of whether there are innate ideas in the human mind, as a central dividing line between empiricists and rationalists more generally. Thus, in an attempt to bridge the gap between the old standard narrative and new ways of speaking about the history of early modern philosophy, we keep this starting point, yet use it to tell a different story in terms of actors and issues considered. This we deem to be important because of exclusionary tendencies of the traditional early modern canon. By this we mean the fact that the voices of women and other marginalized groups were often systematically excluded when the early modern canon was formed (not to mention that many of the philosophers that became part of the canon have problematic views on issues pertaining to sex, gender, class, race, or species) (see, for example, O’Neill 1998; Conley 2006; Shapiro 2016; Hutton 2021; Lapointe and Heck 2023). Thus, it is crucial that any new narrative about ‘British Empiricism’ considers non-canonical (that is, traditionally underrepresented) thinkers as well. With that in mind, our decision to focus on the Lockean Axiom is significant because it allows us to integrate non-canonical thinkers such as Collins, Toland, Shepherd, or Newcome alongside the traditional ‘big three’ of Locke, Berkeley, and Hume. Additionally, focusing on this axiom enables us to consider a larger variety of issues compared to the standard narrative, which focuses primarily on our knowledge of the external world (covered in section 3). For, as will become evident in the subsequent sections, the interests of even Berkeley, Locke, and Hume go well beyond this epistemological issue and encompass, for example, theological and moral questions.

Yet, even if our narrative is more inclusive than the standard story, it is nonetheless important to note its limitations. In closing this section, we illustrate this point with the case of comparatively well-known British women philosophers from the early modern period who do not neatly fall into the category of ‘empiricism’ – either in our use of the term or in its more traditional sense.

It might seem obvious that an article focusing on the Lockean Axiom, as we have called it, does not discuss Margaret Cavendish (1623–1673). After all, Cavendish died over a decade before the Essay was published. However, a comprehensive account of philosophy in early modern Britain cannot afford to neglect such a prolific writer. Over her lifetime, Cavendish wrote numerous philosophical treatises, plays, and poems, as well as novels (perhaps most famously The Blazing World in 1668). Yet, Cavendish, perhaps at this stage the most ‘canonical’ woman in early modern philosophy, does not fit neatly into either the ‘empiricist’ or ‘rationalist’ camp. She is critical of Descartes on several issues, including his views on the transfer of motion (which she rejects in favor of an account of self-motion as ubiquitous throughout nature) and his dualism (see her Observations upon Experimental Philosophy and Grounds of Natural Philosophy (both published in 1668); for discussion of Cavendish’s system of nature see Boyle 2017, Lascano 2023, Detlefsen 2006, Cunning 2016). But she is also committed to some (possibly weak) form of ‘innatism’ (discussed in section 2.2), whereby all parts of nature, including humans, have an innate knowledge of God’s existence. Note that (as discussed in section 2.1) there is a version of the story of ‘empiricism’ that can be told that brings Thomas Hobbes into the fold. Despite her being contemporaneous with Hobbes, Cavendish’s metaphysical and epistemological commitments make it difficult to do the same with her. Thus, by framing the story of early modern British philosophy as one concerned with ‘empiricism’, there is a danger of excluding Cavendish. As recent scholars like Marcy Lascano (2023) have argued, this motivates developing alternative stories – ones that might focus on ‘vitalism’, for instance – alongside more traditional narratives, which feature Cavendish and other women as protagonists.

Another case in point is Mary Astell (1666–1731). One way of telling the story of ‘empiricism’ is as a tradition that formed in opposition to Cartesianism. But if an opposition to Cartesianism is overemphasized, then a thinker like Astell is likely to fall through the cracks. For even though Astell was writing during Locke’s lifetime and critically engages with him when developing her views on education, love, and theology (see, for example, Proposal to the Ladies, Parts I and II. Wherein a Method is offer’d for the Improvement of their Minds from 1694 and 1697 or The Christian Religion, As Profess’d by a Daughter Of the Church of England from 1705), she is quite explicitly committed to a form of substance dualism that shares many features with that of Descartes (see Atherton 1993 and Broad 2015).

While it may be hard, as we have suggested, to incorporate Cavendish or Astell into a traditional ‘empiricist’ narrative, there are several thinkers that might more easily fit under that label. Take the case of Anne Conway (1631–1679), who is as critical of ‘rationalists’ like Descartes and Spinoza (along with other figures like Hobbes) in her Principles of the Most Ancient and Modern Philosophy (for example, chap. 7) as any of the ‘usual suspects’, such as Berkeley or Locke (for more on Conway’s philosophical system, see Hutton 2004; Thomas 2017; Lascano 2023). But since Conway is not focused on the Peripatetic axiom but wants to offer a philosophical system that can explain the nature of mind and matter as well as how God and the creation are related, it is hard to place her in the narrative developed in this article. This also holds for someone like Damaris Masham (1658–1708) who – despite knowing Locke and corresponding with Leibniz and Astell – is not overly concerned with the Lockean Axiom. Rather, Masham focuses on moral issues as well as love and happiness (see, for example, Discourse Concerning the Love of God in 1696 and her Occasional Thoughts in 1705), arguing for a notion of humans as social and rational beings (for more on Masham’s social philosophy, see Broad 2006 and 2019; Frankel 1989; Hutton 2014 and 2018; Myers 2013). Finally, our focus on the Lockean Axiom means that even someone like Mary Wollstonecraft is hard to incorporate into the narrative. While Wollstonecraft is deeply influenced by Locke’s views on education and love, which play an important role in the background of her Vindication of the Rights of Woman from 1792, her focus is on women’s rights. There is no obvious sense in which she is an ‘empiricist’ – on either a traditional conception of that term or the way we have conceived it in this article (that is, as committed to the Lockean Axiom) (see Bahar 2002; Bergès 2013; Bergès and Coffee 2016; Falco 1996; Sapiro 1992).

Wollstonecraft’s case is of particular interest because it illustrates that one can even be a Lockean of sorts and still not fit the bill, as it were. In turn, this emphasizes that any narrative that scholars develop will have to make tough choices about who to include, which is why it is so important to be transparent about the reasoning behind these choices. We strongly believe that this must be kept in mind when reading this article and engaging in both teaching and scholarship in the history of philosophy more generally.

In sum, we have strived to present here a narrative that does justice to the existing tradition while correcting some of its main flaws (in particular, its exclusionary tendencies) in terms of issues and thinkers considered. Nonetheless, it is important to be mindful of the fact that this narrative is just one of many stories that could be told about British philosophy from the seventeenth to the nineteenth century. After all, each narrative – no matter its vices and virtues – will have to deal with the fact that it is arbitrary in the sense of being the product of a particular analyst’s choices. It might well be the case that other scholars deem it better to forgo these labels altogether in research and teaching (see, for example, Gordon-Roth and Kendrick 2015).

2. The Origins of Empiricism

a. Precursors to Locke

As noted in the previous section, this article on ‘British Empiricism’ will focus on a particular narrative that takes Locke’s Essay Concerning Human Understanding as a starting point for the ‘British empiricist’ tradition. Inevitably, there is a degree of arbitrariness in this decision – as we suggested in the previous section, such is the case with any historical narrative that chooses some thinkers or ideas and not others. Nonetheless, we think that this particular narrative has the theoretical virtue of allowing us to expand the canon of ‘British empiricism’ and discuss a greater range of topics (covering moral philosophy and theology, for example, as well as epistemology and metaphysics).

Even if ‘empiricism’ is tied to an acceptance of some version of the ‘Peripatetic Axiom’ (as it is in this article), it is important to note that ‘empiricism’ is neither uniquely British nor a uniquely early modern phenomenon, and Locke was not the first early modern thinker to draw heavily from the ‘Peripatetic Axiom’ in his approach to knowledge. In this section, we briefly outline the history of the ‘Peripatetic Axiom’ prior to Locke before introducing Locke’s usage of it as espoused in the Essay. We do so by charting the emergence of this ‘Peripatetic Axiom’ which, in a very general form, is as follows:

Peripatetic Axiom: there is nothing in the intellect not first in the senses.

The name comes from the axiom’s association with Aristotle (see Gasser-Wingate 2021), the ‘Peripatetic’ philosopher, so called because he liked to philosophise while walking. We will argue that, in the hands of Locke, the Peripatetic Axiom, which has a long history, was turned into the Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience (which we discuss in greater detail in section 2.2).

Prior to Locke, the axiom can be found in the writings of medieval Aristotelian writers including Thomas Aquinas (1225–1274) and Roger Bacon, other early modern writers like Thomas Hobbes (1588–1679), and perhaps even in the work of Ancient Greek thinkers like Aristotle (ca. 384-322 BCE) and Heraclitus (ca. 500 BCE). Our contention is that, in Locke’s Essay, the Peripatetic Axiom took on a particular shape that would go on to be hugely influential in seventeenth- and eighteenth-century philosophy, especially in Britain. One reason for this is that Locke’s Essay was extremely widely read in Britain; for example, it was a standard set text for philosophy in British universities.

For the purposes of the discussion in this article, we take empiricists to be those thinkers who are committed, in some form or another, to the view that all knowledge (everything that is ‘in the mind’) can be traced back to some kind of experience. Often, ‘experience’ is construed in terms of sense-perception, although, as we will find, in Locke’s Essay, ‘experience’ covers both outward sense experience and inward, introspective experience of the operations and contents of one’s own mind – what Locke calls ‘reflection’ (Essay 2.1.2). Thus, Locke can be thought of as having expanded the scope of what can be ‘experienced’, compared to many of his early modern, medieval, and ancient predecessors.

There is some evidence of something close to a commitment to ‘empiricism’ – perhaps a kind of ‘proto-empiricism’ – in Pre-Socratic writers such as Heraclitus, Empedocles (ca. 495–435 BCE), or Xenophanes (ca. 570–475 BCE). However, their writings make it hard to determine whether they are committed to a recognisable form of the Peripatetic Axiom or are simply resistant to thinkers like Parmenides (ca. 515–445 BCE), who argued that the senses are unreliable and that a priori reasoning is the only appropriate way to grasp the nature of reality. Similarly, Aristotle rejects his teacher Plato’s (427–347 BCE) account of knowledge as recollection and the theory of innate ideas that follows from it. Plato had argued that our knowledge of, for example, mathematical principles is in fact knowledge of the Forms (Republic 510c1–511b2). The Forms – perfect, idealised, abstract entities which inhabit a ‘Realm of Forms’ distinct from our own world of sense experience – can be accessed, according to Plato, by recollection or intuition. Aristotle rejects this account of knowledge as recollection (for example, APo. 100a), a move that would later be repeated by Locke in his own discussion of innate ideas in Book I of the Essay. Instead, Aristotle claims that “to gain light on things imperceptible we must use the evidence of perceptible things” (EN 1104a13–14). Similarly, Aristotle rejects the idea, found in thinkers like Parmenides and Plato, that reality can be understood through a priori reasoning, claiming instead that “we should accept what is evident to the senses rather than reasoning” (GA 760b29–33). Like later thinkers who accept the Peripatetic Axiom, such as Locke and Hume, Aristotle argues that – since inquiry is limited by what we are able to experience – when it comes to certain observable phenomena, we may, at best, be able to arrive at possible causes (Meteor 344a5–7).

In medieval thought, we begin to find explicit formulations of the Peripatetic Axiom. Note that, despite being called ‘Peripatetic’, the axiom is more explicitly articulated by later followers of Aristotle. Perhaps the most famous follower of Aristotle in Western philosophy, Thomas Aquinas, claims that “without sense perception no one can either learn anything new, nor understand matters already learned” (In DA 3.13 [para. 791]). In other words, according to Aquinas, we only learn new things via sense-perception. Clearly, this implies that there is nothing (new) in the mind that is not first in the senses. Similarly, another medieval thinker who pre-empts some of the ideas that would go on to be central to Locke’s view, Roger Bacon (1215–1292), writes that “without experience nothing can be sufficiently known” (OM 6.1). This is not quite the same as the claim that there is no knowledge (at all) without experience, but is still an endorsement of the crucial, necessary role that experience plays in knowledge acquisition that is central to the empiricist tradition.

Perhaps the most significant immediate precursor to Locke – in the context of the history of the Peripatetic Axiom – is Thomas Hobbes. Hobbes commits himself to the Peripatetic Axiom when he writes, in Leviathan (1651), that “there is no conception in a man’s mind, which hath not at first, totally, or by parts, been begotten upon the organs of Sense” (Leviathan, 1.1). Indeed, arguably one could tell a somewhat different story of early modern (or even ‘British’) ‘empiricism’ that takes Hobbes as its starting point. As Peter Nidditch explains, Hobbes (along with the French philosopher Pierre Gassendi (1592–1655)) “first produced in the modern era, especially in his Leviathan and De Corpore, a philosophy of mind and cognition that built on empiricist principles” (Nidditch 1975, viii). Nidditch goes on to suggest, speculatively, that it is most likely Hobbes’ reputation – as a highly unorthodox thinker, at best, and a secret atheist, at worst – that prevented him, retrospectively, from being seen as the ‘father of empiricism’ in the standard narrative. Whatever the explanation, it is Locke rather than Hobbes who would go on to be widely read and highly influential in Britain, and elsewhere, in the seventeenth and eighteenth centuries. As Nidditch puts it: “The Essay gained for itself a unique standing as the most thorough and plausible formulation of empiricism – a viewpoint that it caused to become an enduring powerful force” (Nidditch 1975, vii). Due to the Essay’s widespread influence, we focus on the role that Locke, rather than Hobbes, played in the development of British thought during these centuries; a role which would go on to be seen as so important that it even becomes possible, in hindsight, to speak of a more or less unified group and label it ‘British empiricism’. As we have suggested, there is a story to be told about Hobbes and empiricism, but it is one that, for the most part, we do not tell here (see section 1).

b. Locke

As was noted in the introduction, the question of whether there are innate ideas in the human mind is often seen as a central dividing line between empiricism and rationalism as they are standardly construed. While we pointed out various issues with this standard narrative, our own narrative also makes use of the issue of innatism. Crucially, though, our focus is less on finding a dividing line and more on finding a common denominator in the views of mainly ‘British’ and ‘Irish’ philosophers (for more on issues concerning the ‘British’ moniker see § 6). With that in mind, let us turn to the issue of innatism and the way Locke deals with it.

Locke characterises his innatist opponents’ position like so: “It is an established Opinion amongst some Men, That there are in the Understanding certain innate Principles; some primary Notions…as it were stamped upon the Mind of Man, which the Soul receives in its very first Being; and brings into the world with it” (Essay, 1.2.5).

Whether or not this is a fair characterisation of his opponents’ views, as Locke sees it, the term ‘innate’ suggests that, on the innatist account, human beings are quite literally born with some in-built knowledge – some principles or propositions that the mind need not acquire but already possesses. In short, on this view, prior to any experience – that is, at the very first instant of its having come into existence – the human mind knows something. Locke develops two lines of argument against the innatist position, which will be referred to in what follows as (1) the Argument from Superfluousness and (2) the Argument from Universal Assent.

The Argument from Superfluousness proceeds as follows:

It would be sufficient to convince unprejudiced Readers of the falseness of this Supposition, if I should only shew (as I hope I shall in the following Parts of this Discourse) how Men, barely [that is, only] by the Use of their natural Faculties, may attain to all the Knowledge they have, without the help of any innate Impressions. (Essay, 1.2.1)

Locke’s point here is that all it takes to convince an ‘unprejudiced reader’ (that is, one who is willing to be swayed by reasonable argument) of the falseness of innatism is evidence that all knowledge can be traced back to instances in which our human “natural Faculties” – that is, our faculties of sense-perception and reflection – were in use. This argument thus depends upon the plausibility of Locke’s claim that all knowledge can be traced back to some kind of experience. We leave aside the Argument from Superfluousness for the moment since we discuss this claim in greater detail below.

In contrast, the Argument from Universal Assent is a standalone argument that does not depend upon any additional claims about the sources of human knowledge. Locke claims that if the human mind possessed certain principles innately then there would surely have to be certain spoken or written propositions that all human beings would assent to. In other words, if there were an innate principle X such that all human beings, regardless of their lives and experiences, knew X, then when confronted with a written or verbal statement of X (“X”), all human beings would agree that “X” is true. For example, let us assume for the moment that ‘murder is wrong’ is a principle that is innately known to the human mind. Locke’s point is that, if presented with a written or verbal statement of “murder is wrong”, surely all human beings would assent to it.

And yet, Locke argues, this does not seem to be true of this or any other principle (evidenced, for example, by the fact that people do, in fact, commit murder). He writes: “[this] seems to me a Demonstration that there are none such [innate principles of knowledge]: Because there are none to which all Mankind gives an Universal assent” (Essay, 1.2.4). If by ‘Demonstration’, here, Locke means that it logically follows that, since there are no universally assented-to propositions, there must not be any innately known principles, he is not quite right. For there might be other reasons why certain propositions are not universally assented to – perhaps not everyone understands the statements they are being presented with, or perhaps they are lying (perhaps murderers know murder is wrong, but commit it nonetheless). At best, the Argument from Universal Assent provides a probable case against innatism, or places the burden of proof on the innatist to explain why there are no universally assented-to propositions, or else neutralises the converse view (which Locke thinks his opponents subscribe to; see Essay, 1.2.4) that the existence of innate principles can be proven by appealing to the existence of universally assented-to propositions. And, of course, Locke’s reasoning also depends upon the truth of the claim that there are, in fact, no universally assented-to propositions (perhaps people have just not had the chance to assent to them yet, because they have not yet been articulated).
Given all these mitigating factors, it seems most charitable to suggest that Locke is simply hoping to point out the implausibility, or even absurdity, of the innatist position – especially given an increasing societal awareness of cultural relativity in different societies and religions outside Europe in the seventeenth century (Essay, 1.4.8), not to mention the fact that neither Plato nor Aristotle, nor any other pre-Christians, would have assented to propositions like ‘God exists’ or ‘God is to be worshipped’ which, Locke claims, are paradigm cases of so-called ‘innate principles’ (Essay, 1.4.8).

Having, to his own satisfaction at least, provided one argument against the innatist position, Locke develops an account of the sources of human knowledge that supports the Argument from Superfluousness – by showing how all human knowledge can be traced back to some kind of experience. In contrast to innatists, Locke maintains that at birth the human mind is a blank slate or ‘tabula rasa’. If we picture the mind as a “white Paper, void of all characters”, Locke asks, “How comes it to be furnished?” (Essay, 2.1.2). His response is: “I answer, in one word, From Experience: In that, all our Knowledge is founded; and from that ultimately derives itself” (Essay, 2.1.2).

Locke then divides experience into two subcategories with respective mental faculties: ‘sensation’ and ‘reflection’ (Essay, 2.1.2). Concerning sensation, he writes:

Our Senses, conversant about particular sensible Objects, do convey into the Mind, several distinct Perceptions of things, according to those various ways, wherein those Objects do affect them: And thus we come by those Ideas, we have of Yellow, White, Heat, Cold, Soft, Hard, Bitter, Sweet, and all those which we call sensible qualities. (Essay, 2.1.3)

Our ideas of sensation, Locke explains, are those which pertain to the qualities of things we perceive via the (five) external senses: the objects of vision, touch, smell, hearing, and taste. But of course, this does not exhaust the objects of the mind – we can also have ideas of things that are not perceived by the ‘outward’ senses. As Locke writes:

The other Fountain, from which experience furnisheth the Understanding with Ideas, is the Perception of the Operations of our own Minds within us, as it is employ’d about the Ideas it has got; which Operations, when the Soul comes to reflect on, and consider, do furnish the Understanding with another set of Ideas, which could not be had from the things without: and such are, Perception, Thinking, Doubting, Believing, Reasoning, Knowing, Willing, and all the different actings of our own Mind. (Essay, 2.1.4)

In a sense, then, Locke’s point is this: While we standardly talk as though we ‘experience’ only those things that can be perceived by the senses, in actual fact we also experience the operations of our own mind as well as things external to it. We can, that is, observe ourselves thinking, doubting, believing, reasoning, and so on – and we can observe ourselves perceiving, too (this claim is contentious: Do we really observe ourselves perceiving, or are we simply aware of ourselves perceiving?).

Locke’s aim is to establish that no object of knowledge, no ‘idea’ (Essay, 1.1.8), can fail to be traced back to one of these two ‘fountains’ of knowledge. In doing so, Locke thus commits himself to a particular formulation of the ‘Peripatetic Axiom’ (discussed in section 2.1). While the ‘Peripatetic Axiom’ – found in medieval Aristotelians and in Hobbes – states that ‘there is nothing in the intellect not first in the senses,’ Locke’s claim, which is central to the way ‘empiricism’ is construed in this article, is:

Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience.

The Lockean Axiom would go on to be very influential in seventeenth- and eighteenth-century thought, especially in Britain.

3. Our Knowledge of the External World and Causation

This section focuses on the application of the Lockean Axiom (there is no idea in the mind that cannot be traced back to some particular experience) to our knowledge of the external world. In doing so it most closely resembles the standard narrative of ‘British empiricism’ because the focus rests on Berkeley’s rejection of materialism and Hume’s denial of necessary connection. However, in contrast to the standard narrative, we close this section by emphasising how Mary Shepherd, who is said to have read Locke’s Essay when she was eight years old (Jeckyl 1894, 217), rejects both positions, although, as will become evident, in doing so she draws not on the Lockean Axiom but on two causal principles.

a. Berkeley on the Nature of the External World

In A Treatise Concerning the Principles of Human Knowledge (1710/34) and Three Dialogues between Hylas and Philonous (1713/34), Berkeley defends the doctrine he is most famous for: Immaterialism. In a nutshell, Berkeley holds that everything that exists is either an immaterial mind or an idea (for example, PHK §§ 25–27). This explains his commitment to the notorious dictum esse est percipi aut percipere (“To be is to be perceived or to perceive”) (compare NB 429, 429a; PHK § 3).

Two key features of his argument for immaterialism are Berkeley’s claims that the “existence of an idea consists in being perceived” (PHK § 3) and that “an idea can be like nothing but an idea” (PHK § 8). Since Berkeley is convinced that sense perception works via resemblance (for example, Works II, 129; TVV § 39) (see Fasko and West 2020; Atherton 1990; West 2021), and because we know that (most) objects of human knowledge are ideas – either “imprinted on the senses” or “formed by help of memory and imagination” (PHK § 1) – he argues that we can infer that the objects in the external world must also be ideas or collections of ideas (PHK §§ 1–8). According to Berkeley, when we say something like ‘the table exists’, we mean that it can be perceived. And what is perceived is an idea (PHK § 3) (Daniel 2021; Fields 2011; Jones 2021; Rickless 2013; Saporiti 2006).

It is important to note that, in developing this argument, Berkeley, implicitly, draws on the Lockean Axiom that there is no idea that cannot be traced back to some particular experience. For Berkeley’s point is that our experience of the external world and its objects clearly suggests that they only exist when they are perceived. That is, when we trace back our ideas of things in the external world to the experiences we have of them, we come to understand that these ‘things’ are also ideas.

Berkeley fortifies his case for immaterialism by rejecting what is, to his mind, the only viable alternative: Materialism. More specifically, Berkeley argues against the existence of a (Lockean) material substance. In doing so, he, again, draws from the Lockean Axiom – and, in that sense, uses Locke’s own claim against him – by raising the question of whether we even have an idea of material substance in the first place. Berkeley then claims that even materialists, like Locke on his reading, must accept that we do not; for, as they themselves admit, there is nothing we can say about it (DHP 261). The reason we do not have an idea of material substance, Berkeley contends, is that there is no such thing in the first place and, thus, no experience of such a thing (and where there is no experience, there can be no idea). In fact, Berkeley believes that the very notion of such a thing would be “repugnant” (DHP 232; PHK § 17). As he puts it:

I have no reason for believing the existence of matter. I have no immediate intuition thereof: neither can I mediately from my sensations, ideas, notions, actions or passions, infer an unthinking, unperceiving, inactive substance, either by probable deduction, or necessary consequence. (DHP 233)

Even worse, assuming the existence of a material substance leads to skepticism concerning the existence of the external world and ultimately also God’s existence (that is, it leads to atheism, compare also PHK § 92) because it leads one to become “ignorant of the true nature of every thing, but you know not whether any thing really exists, or whether there are any true natures at all” (DHP 229). When challenged by his imagined opponent with the argument that we also have no idea of God or other minds (see also section 4.3) – and thus no reason to assume they exist – Berkeley appeals to the (first personal) experience we can have of these entities (DHP 233). This is consistent with the Lockean Axiom which, while it does entail that every idea can be traced back to an experience, does not entail that every experience must lead to an idea.

In sum, in arguing for his immaterialism Berkeley makes implicit use of the Lockean Axiom: he draws on it to establish that the external world and its objects must consist of ideas, because our experience of the external world and its objects is such that they consist of perceivable things. The Lockean Axiom also plays a role in Berkeley’s argument against the existence of material substance, in that the lack of experience of matter is taken to explain the lack of a corresponding idea – and an analysis of the idea shows its repugnancy.

b. Hume on the Nature of Causation

At least in the context of contemporary Western thought, Hume’s account of causation is perhaps one of the best known and most discussed theories to have come out of the early modern period (see, for example, Garrett 2015; Bell 2008; Beauchamp and Rosenberg 1981). In An Enquiry Concerning Human Understanding (1748), Hume sets out to demonstrate that causal relations – or what he calls ‘necessary connections’ – are not something that we experience in the world around us (see Noxon 1973 or Traiger 2006 for a discussion of the development of Hume’s thought and the relation between the Treatise and the EHU). Rather, Hume claims, we form the idea or concept of causation in our mind as a result of repeated experiences of ‘causes’ preceding ‘effects’, and the ‘sentiment’ that such repeated experiences generate in us (EHU 7). In other words, on Hume’s view, we feel as though certain events or objects (like smoke and fire) are necessarily connected, by a causal relation, because we see them occur in conjunction with one another repeatedly. But, strictly speaking, Hume argues, we do not experience any such causal relations and thus cannot know with certainty that the two things are necessarily connected – at best, we can have probable knowledge. What is important, for the concerns of this article, is that Hume’s reasoning for this view is premised upon a version of the Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience. In other words, it is Hume’s ‘empiricism’ (in the sense that we have used the term in this article) that leads him to arrive at his skeptical account of causation. For an ‘empiricist’, knowledge is dependent upon experience – and Hume’s point in the EHU is that we cannot experience causation. We run through Hume’s argument in more detail below.

Hume begins section 2 of the EHU (where his discussion of the origin of ideas takes place) by establishing what has come to be known as ‘the Copy Principle’ (for further discussion, see Coventry and Seppalainen 2012; Landy 2006 and 2012). The Copy Principle concerns the relation between what Hume calls ‘impressions’ and ‘ideas.’ The crucial thing for our purposes is that, for Hume, ‘impression’ refers (amongst other things) to any direct experience or sense-perception we have of an external object. When I look outside my window and see the sun, for instance, I am receiving an ‘impression’ of the sun. That is, the sun is ‘impressing’ itself upon my sense organs, similarly to a stamp that impresses an insignia upon wax. ‘Ideas,’ on the other hand, are what are left behind, in the mind, by such impressions; Hume’s use of the term ‘idea’ is thus slightly different to that of Locke or Berkeley, who both use ‘idea’ in a way that also encompasses Humean impressions. When I remember the sun, as I lie in bed at night, I am having an ‘idea’ of the sun. And, similarly, if I lie in bed and imagine tomorrow’s sun, I am also forming an ‘idea’ of it. In terms of our experiences of them, impressions and ideas are differentiated by their degrees of vividness and strength: my impression of the sun, for instance, will be stronger and more vivid (perhaps brighter) than my idea of the sun. As Hume puts it:

These faculties [of memory and imagination] may mimic or copy the perceptions of the senses; but they never can entirely reach the force and vivacity of the original sentiment. The utmost we say of them, even when they operate with greatest vigour, is, that they represent their object in so lively a manner, that we could almost say we feel or see it: But, except the mind be disordered by disease or madness, they never can arrive at such a pitch of vivacity, as to render these perceptions altogether undistinguishable. (EHU 2.1, 17)

An idea might somewhat resemble the strength or vividness of an impression but, Hume claims, an idea of the sun and the sun itself (unless one’s mind is ‘disordered’) will never be entirely indistinguishable.

The Copy Principle entails that every (simple) idea is a copy of an impression. Hume writes:

It seems a proposition, which will not admit of much dispute, that all our ideas are nothing but copies of our impressions, or, in other words that it is impossible for us to think of anything which we have not antecedently felt, either by our external or internal senses. (EHU 7.1.4, 62)

This principle is strongly empiricist in character and closely related to both the Lockean Axiom and the Peripatetic Axiom, which states that there is nothing in the mind not first in the senses. Like the Lockean Axiom, the Copy Principle (as articulated in this passage) tells us that if I have an idea of X, then I must previously have had an experience, or ‘impression’, of X.

For Hume, all of this makes the issue of where we get our idea of causation extremely pressing. Hume denies that we do in fact have any impressions of causation or ‘necessary connections’ between things:

When we look about us to external objects…we are never able in a single instance, to discover any power or necessary connexion; any quality which binds the effect to the cause, and renders one an infallible consequence of the other. We only find that one does actually, in fact, follow the other. (EHU 7.1.6, 63)

Consider the case of a white billiard ball rolling along a table and knocking a red ball. Hume asks: can you in fact experience or perceive the ‘necessary connection’ (or causal relation) that makes it the case that when the white ball knocks the red ball the red ball moves away? His answer is no: what you experience, strictly speaking, is a white ball moving and then a red ball moving. But if we do not have an impression of causation, in such instances, why do we have an idea of causation?

Hume concludes that while we do not have an outward impression of causation, because we repeatedly experience uniform instances of, for example, smoke following fire, or red balls moving away from white balls, we come to feel a new impression which Hume calls a ‘sentiment’. That is, we feel as though we are experiencing causation – even though, in strict truth, we are not. This new feeling or sentiment is “a customary connexion in the thought or imagination between one object and its usual attendant; and this sentiment is the original of that idea which we seek for” (EHU 7.2.30, 78). In other words, while our idea of causation or necessary connection cannot be traced back to a specific impression, it can nonetheless be traced back to experience more generally. Repeated uniform experience, Hume claims, induces us to generate the idea of causation – and is the foundation of our ‘knowledge’ of cause-and-effect relations in the world around us. In line with the Lockean Axiom, then, Hume’s view is that we would have no idea of causation, were it not for our experience of certain events or objects (‘causes’) regularly preceding others (‘effects’).

c. Shepherd on Berkeley and Hume

The previous subsections have established that Berkeley and Hume both draw on the Lockean Axiom that there is no idea that cannot be traced back to some particular experience in important ways. Both thinkers draw on this principle inasmuch as they take the absence of particular experiences (about the external world or causation) not only to entail that there is no idea but that the things in question (material substance or necessary connections) do not exist. In this section we consider how Mary Shepherd rejects both Berkeley’s immaterialism and Hume’s skeptical account of causation. As will become evident, however, Shepherd does so not by drawing on the Lockean Axiom – which does not play any role in her account of the mind – but by using two causal principles that she introduces in her works. Shepherd is thus an example of the limits of the narrative developed here. For even though she conceives of Locke as her closest ‘philosophical ally’ (LoLordo 2020, 9), Shepherd concludes that one needs, in order to refute Berkeley and Hume, to consider the issue of causation first – and not issues concerning (mental) representation. For Shepherd believes that even (mental) representation and the mental content it allows for ought ultimately to be understood in causal terms.

Shepherd’s first causal principle, the so-called Causal Principle (CP), holds that “nothing can ‘begin its own existence’” (for example, ERCE 94). Her second, the Causal-Likeness-Principle (CLP), states that “like causes, must generate like Effects” (for example, ERCE 194). It is important to note that the CLP is a biconditional, as Shepherd claims in her second book Essays on the Perception of an External Universe (1827) that “like effects must have like causes” (EPEU 99).

Shepherd defends both principles in her first book, Essay on the Relation of Cause and Effect (1824). The main aim of this work is to refute a Humean account of causation as constant conjunction. In particular, Shepherd wants to establish, against Hume, that causes and effects are necessarily connected (ERCE 10). While the details of Shepherd’s argument can be put aside for now, the crucial thing to note is that she does not draw from the Peripatetic Axiom or the Lockean Axiom. Instead, Shepherd focuses on rejecting Hume’s theory of mental representation and his claim that the possibility of separating cause and effect in thought tells us something about their actual relation (Bolton 2010 & 2019; Landy 2020a & 2020b). Crucially, this rejection of Hume, in turn, fortifies her case for her two causal principles – both of which play a crucial role in arguing against Berkeley.

Meanwhile, in rejecting Berkeley’s version of immaterialism, Shepherd contends that we have sensations of solidity and extension (EPEU 218), and drawing from the CP, we know that these must have a cause. Since we know the mind to be a cause for sensations (for example, EPEU 14–15), there must also be another cause for these sensations. Thus, we can come to know that matter (which she also calls ‘body’) is the “continually exciting cause, for exhibition of the perception of extension and solidity on the mind in particular” (EPEU 155) and matter is “unperceived extended impenetrability” (LMSM 697). In other words, the causal connection between our mental content and the external world allows Shepherd to draw inferences about its objects, which show them not to be ideational, that is, not to merely consist of ideas as Berkeley, for instance, would have it (while Shepherd thus clearly rejects a Berkeleyan brand of immaterialism (see Atherton 1996, Rickless 2018), it is not clear whether she is opposed to all kinds of immaterialism whatsoever; as Boyle (2020, 101) points out, ‘(im-)material’ seems to be a “label” for capacities and it is unclear whether more than capacities exist in Shepherd’s metaphysics).

In sum, Shepherd is a fitting end point for this part of the narrative not only because she closely engages with Berkeley and Hume (and their applications of the Lockean Axiom) but also because Locke is such a close philosophical ally for her, although scholars have noted that Shepherd sometimes advances an idiosyncratic reading of Locke (Boyle 2023; LoLordo 2022). Even more to the point, Shepherd suggests that her theory is a ‘modified Berkeleian theory’ (LMSM 698) and thus aligns herself explicitly with a key figure of the ‘standard’ narrative of British empiricism.

Thus, despite the fact that the Lockean Axiom does not play a role in Shepherd’s argumentation, and in fact it is unclear what she thinks about it, there are good reasons to consider her within this narrative. For Shepherd’s philosophy focuses on key figures within this narrative to the point where she aligns herself implicitly and explicitly with at least two of them.

4. Morality

One of the most interesting upshots of the widespread acceptance of the Lockean Axiom, or what we might call his ‘empiricist’ philosophy, in Britain and Ireland during the eighteenth century is the effect it had on theorising about morality; specifically concerning the question of where we get our moral ideas (like good, bad, right, wrong, virtuous, and vicious) from. The Lockean Axiom dictates that there is no idea that cannot be traced back to some particular experience. While that might fit nicely with how we get our ideas of concepts like colour, sound, or touch (and any other ideas that can be traced to sense perception), ideas like justice/injustice, good/bad, or right/wrong, do not seem to be easily traceable to some particular experience. It does not seem controversial to suggest that ‘redness’ or ‘loudness’ are qualities we can experience in the world around us, but it is much less obvious that we experience qualities such as ‘goodness’, ‘badness’, ‘rightness’, or ‘wrongness’. For a start, barring cases of, for example, blindness, deafness, or any other sensory deficiency, there is likely to be agreement about an object’s colour or the volume of a sound, whereas there is, generally speaking, considerable disagreement when it comes to the goodness/badness or rightness/wrongness of an action. The same applies in the case of beauty and other aesthetic qualities, and there is a great deal that could be said about ‘empiricist’ approaches to aesthetics (we do not discuss these issues here but for discussion of Hume’s aesthetics see, for example, Costello 2007, Gracyk 1994, Townsend 2001, and for discussion of Hutcheson’s aesthetics, see, for example, Shelley 2013, Michael 1984, Kivy 2003).

This section looks at three thinkers’ views on morality and examines the role that the Lockean Axiom played in their theorising. All three are important figures in the history of (Western) ethics. Francis Hutcheson was one of the first philosophers to apply the Lockean Axiom to questions of morality and, though he was Irish-born, would go on to be known as a central figure in the so-called ‘Scottish Enlightenment’ (his parents were Scottish Presbyterian and he would spend most of his career in Scotland). David Hume pre-empts discussions of utility in ethical theorising that would come to the fore in the work of Mill and Bentham and develops the idea of a sense of ‘taste’ which allows us to perceive the moral characteristics of persons and actions. Meanwhile, Susanna Newcome (1685–1763) has recently been identified (Connolly 2021) as one of the earliest thinkers to defend what is recognisably a form of utilitarianism.

a. Hutcheson and the Moral Sense

In An Inquiry into the Original of Our Ideas of Beauty and Virtue (1725), Francis Hutcheson explicitly acknowledges the indebtedness of his discussion of morality (as well as beauty) to Locke (for example, Inquiry, 1.VII). He begins the Inquiry by defining sensations as “[t]hose Ideas which are rais’d in the Mind upon the presence of external Objects, and their acting upon our Bodys” and adds that “We find that the Mind in such Cases is passive, and has not Power directly to prevent the Perception or Idea” (Inquiry, 1.I). A little later, Hutcheson explains that “no Definition can raise any simple Idea which has not been before perceived by the Senses” (Inquiry, 1.IV). In making these claims, Hutcheson is committing himself to a version of the Lockean Axiom, the claim that there is no idea in the mind that cannot be traced to some particular experience (strictly speaking, this should read ‘simple idea’, since Hutcheson’s view is that all simple ideas must be traced back to some experience, whereas compound ideas might be the product of reason).

Hutcheson’s commitment to the Lockean Axiom leads him to conclude that humans have a “Moral Sense” (see Frankena 1955; Harris 2017) as well as external senses of seeing, hearing, touching, tasting, and smelling. In fact, in his Essay on the Nature and Conduct of the Passions and Affections (1742), Hutcheson claims we have a range of ‘internal’ senses including a “Publick Sense”, concerned with the happiness of others, a “Sense of Honour”, and a sense of “decency and dignity” (Essay, 5–30). This is understandable given that, for Hutcheson, a sensation is ‘an idea raised in the mind upon the presence of external objects’ – and it is external objects, or more often external people (and their actions), that raise in us ideas of right, wrong, good, bad, justice, or injustice.

In the Essay, Hutcheson lays out a line of reasoning which justifies this view: “If we may call every Determination of our Minds to receive Ideas Independently on our Will, and to have Perceptions of Pleasure and Pain, A SENSE, we shall find many other Senses besides those commonly explained” (Essay, 5). His point is this: a sense is a ‘determination’ or faculty of the mind by means of which it receives (passively) certain kinds of ideas. Our sense of vision, for instance, is where we get our visual ideas, for example, ideas of colour or brightness/darkness. Our olfactory sense is where we get our ideas of smell such as sourness, putridness, and so on. However, if we can identify ideas that cannot be traced back to one of the five external senses – vision, hearing, taste, touch, smell – Hutcheson argues, then there must be another sense, an internal sense, by means of which the mind has received that idea. Such is the case with our ideas of good, bad, right, wrong, and so on. Since these ideas cannot be traced to any of the five external senses – because we do not literally see, hear, taste, touch, or smell good or bad, or right or wrong – we can infer that there must be a moral sense by which the mind has received them. Hutcheson describes this moral sense as that by which “we perceive Virtue, or Vice in our selves, or others” (Essay, 20). That is, through our naturally built-in moral sense, humans can detect virtue and vice. Note that this view implies that virtue and vice, and relatedly notions like good, bad, right, wrong, justice, and injustice, are qualities out there to be sensed. But what is it exactly that we are perceiving with our moral sense? And how does the human mind perceive virtue and vice in ourselves and other people?

For Hutcheson, the answer is that our ideas of virtue, vice, and other moral concepts are grounded in perceptions of pleasure and pain. Indeed, as the quotation above suggests, for Hutcheson, all perceptions are accompanied by a feeling of pleasure or pain. Some objects excite pleasure or pain in us, Hutcheson explains, even when we cannot see any “Advantage or Detriment the Use of such Objects might tend: Nor would the most accurate Knowledge of these things vary either the Pleasure or Pain of the Perception” (Inquiry, 1.VI). That is, some objects are naturally pleasurable or painful to sense – and such objects, according to Hutcheson, are beautiful or ugly, respectively. Similarly, the actions of some people generate pleasure or pain in us, and this is what determines whether we characterise those people as virtuous or vicious. Hutcheson maintains that it is a moral sense that generates our ideas of virtue or vice (just as it is an aesthetic sense that generates ideas of beauty or ugliness), rather than, say, a judgement or act of reason, because those ideas do “not arise from any Knowledge of Principles, Proportions, Causes, or of the Usefulness of the Object” (Inquiry, 1.XII). Instead, just as we are ‘struck’ with the colour of an object or the pitch of a sound, we are ‘struck’ by the rightness or wrongness, or virtuousness or viciousness, of a person or action.

In short, Hutcheson’s view is that we feel a kind of pleasure or displeasure in response to certain character traits or actions which determines whether we characterise them as virtuous or vicious. For example, one might feel pleasure witnessing an act of charity, or displeasure witnessing an act of cruelty. In the former case, an idea of virtue (or goodness, or rightness) is raised in our minds, while in the latter it is an idea of vice (or badness, or wrongness). Hutcheson thereby provides an empiricist account of the origins of ideas concerning moral concepts, that is, one that draws on the Lockean Axiom.

b. Hume on Taste and the Moral Sense

Like Hutcheson, Hume is interested in identifying the source of our ideas of moral concepts like virtue, vice, justice, injustice, right, and wrong. And, like Hutcheson, Hume arrives at the view that such ideas are derived from some kind of moral sense, which he calls ‘taste’ (see, for example, T 3.3.6; Shelley 1998). (Another similarity with Hutcheson is that many of Hume’s claims about our sense of morality are paralleled in his discussion of beauty—including the claim that we have a sense of beauty.) In An Enquiry Concerning the Principles of Morals (1751), Hume’s account of moral sense, or taste, is part of a wider discussion of whether it is reason or sentiment, that is, feeling, that gives us our principles of morality. Hume lays out the debate like so:

There has been a controversy started of late…concerning the general foundation of MORALS; whether they can be derived from REASON, or from SENTIMENT; whether we attain the knowledge of them by a chain of argument and induction, or by an immediate feeling and finer internal sense; whether, like all sound judgment of truth and falsehood, they should be the same to every rational intelligent being; or whether, like the perception of beauty and deformity, they be founded entirely on the particular fabric and constitution of the human species. (EPM 1.3, 170)

In other words, the question is: do we reach conclusions about what is right or wrong in the same way we reach the conclusion of a mathematical formula, or do we reach such conclusions in the way we arrive at judgements about what counts as beautiful? The question is significant because, Hume claims, if our moral principles are more like judgements of beauty, then they might not, strictly speaking, be objective. They might instead be grounded in specific human values, concerns, desires, and judgements. Whereas if they are more like conclusions arrived at using reasoning, such as mathematical conclusions, Hume claims, then they can be more appropriately described as objective.

Hume opts for a decidedly ‘empiricist’ approach, that is, he draws from the Lockean Axiom, in answering this question, which ultimately leads him to reject the claim that moral principles are the product of reason. He explains that in the sciences, or ‘natural philosophy’, thinkers “will hearken to no arguments but those which are derived from experience” (EPM 1.10, 172). The same ought to be true, he claims, in ethics. In line with the Lockean Axiom, Hume then suggests that we ought to “reject every system of ethics, however subtile [that is, subtle] or ingenious, which is not founded on fact and observation” (ibid.); that is, the previously mentioned experience which underlies the arguments must be tied to the world we can perceive by our senses. Thus, like the natural philosophers of the Royal Society in London (the natural scientists he is referring to here), who rejected armchair theorising about nature in favour of going out and making observations, Hume aims to arrive at an account of the origin of our moral principles that is based on observations of which traits or actions people do, in practice, deem virtuous or vicious, and why they do so.

What Hume claims to find is that traits like benevolence, humanity, friendship, gratitude, and public spirit, in short all those which “proceed from a tender sympathy with others, and a generous concern for our kind and species” (EPM 2.1.5, 175), receive the greatest approbation, that is, approval or praise. What all these character traits, which Hume calls the ‘social virtues’, have in common is their utility (see also Galvagni 2022 for more on Hume’s notion of virtue). This is no coincidence, Hume argues, for “the UTILITY, resulting from the social virtues, forms, at least, a part of their merit, and is one source of that approbation, and regard so universally paid to them” (EPM 2.2.3, 176). This leads Hume to develop the following line of reasoning: There is a set of traits, or ‘social virtues’, that are deemed the most praiseworthy by society. People who exhibit these traits are characterised as ‘virtuous.’ What these virtuous traits have in common is that they promote the interests of society at large, that is, they are useful to it. Thus, Hume concludes, utility is at the heart of morality. This conclusion would go on to influence later thinkers like Jeremy Bentham and John Stuart Mill and is central to the normative ethical theory of utilitarianism, which we also discuss in section 4.3 in relation to Susanna Newcome.

What role does the moral sense, or taste, play in Hume’s account of the origins of morality? His answer is that taste serves to motivate us to action, based on the pleasure that comes with approbation or the displeasure that comes with condemnation. He writes: “The hypothesis which we embrace is plain. It maintains that morality is determined by sentiment. It defines virtue to be whatever mental action or quality gives to a spectator the pleasing sentiment of approbation; and vice the contrary” (EPM, Appendix 1, I, 289). In other words, Hume’s point is that we enjoy responding to a person or action with approval and do not enjoy, and may even take displeasure from, responding to persons or actions with blame or condemnation. Thus, like Hutcheson, Hume thinks that ideas we receive via our moral sense are accompanied by feelings of pleasure or pain. This is a claim about human psychology and, again, an idea that would go on to play an important role in the utilitarian ethics of Bentham and Mill, especially the idea, known as ‘psychological hedonism’, that humans are driven by a desire to pursue pleasure and to avoid pain. Hume himself seems to endorse a kind of psychological hedonism when he claims that if you ask someone “why he hates pain, it is impossible he can ever give any [answer]” (EPM, Appendix 1, V, 293).

In line with the Lockean Axiom, Hume concludes that it cannot be reason alone that is the source of our moral principles. Again, like Hutcheson, Hume thinks that we sense (immediately, prior to any rational judgement) rightness or wrongness, or virtue or vice, in certain persons or actions. What is more, he argues, reason alone is “cool and disengaged” and is thus “no motive to action”, whereas taste, or moral sense, is a motive for action since it involves a feeling of pleasure or pain (EPM, Appendix 1, V, 294). For that reason, Hume concludes that, in morality, taste “is the first spring or impulse to desire or volition” (ibid.).

c. Newcome on Pain, Pleasure, and Morality

There is no explicit commitment to the Lockean Axiom in the writings of Susanna Newcome. However, what we do find in Newcome is a development of the idea, also found in Hutcheson and Hume, that moral theorising is rooted in experiences of pleasure and pain, ideas which, as we found in sections 4.1 and 4.2, are themselves premised upon an acceptance of the Lockean Axiom. Thus, in Newcome (as in Shepherd), we find a thinker deeply influenced by the work of others who did adhere to the Lockean Axiom. In a sense, then, Newcome’s work is indirectly influenced by that Axiom. What we also find in Newcome is a bridge between the ‘empiricism’ of Locke, and thinkers like Hutcheson and Hume, who accept the Lockean Axiom, and the later utilitarianism of Jeremy Bentham and John Stuart Mill. For these reasons, Newcome’s ethical thinking merits inclusion in this article on ‘empiricism’ and the story of the development of the Lockean Axiom we have chosen to tell (see section 1).

In An Enquiry into the Evidence of the Christian Religion (1728/1732), Newcome provides the basis for a normative ethical theory that looks strikingly similar to the utilitarianism later, and more famously, defended by Jeremy Bentham and John Stuart Mill. For this reason, Connolly (2021) argues that Newcome—whose work pre-dates that of both Bentham and Mill—could plausibly be identified as the first utilitarian. Newcome bases her claims about ethics on claims about our experiences of pleasure and pain. What is also interesting, for our present purposes, is that Newcome identifies ‘rationality’ with acting in a way that maximises pleasure or happiness. Consequently, on Newcome’s view, we can work out what actions are rational by paying attention to which actions lead to experiences of pleasure or happiness—and the same applies to irrational actions, which lead to experiences of pain or unhappiness. In the remainder of this section, we outline Newcome’s views on pleasure and pain, happiness and unhappiness, and rational and irrational action.

Newcome begins her discussion of ethics by claiming that pleasure and pain cannot be defined (Enquiry, II.I). She explains that happiness and misery are not the same as pleasure and pain. Rather, she claims, happiness is “collected Pleasure, or a Sum Total of Pleasure” while misery is “collected Pain, or a Sum Total of Pain” (Enquiry, II.II–III). In other words, happiness is made up of feelings of pleasure, and misery is made up of feelings of pain. One is in a state of happiness when one is experiencing pleasure, and one is in a state of misery when one is experiencing pain. Newcome then goes on to commit herself to what has come to be known as ‘psychological hedonism’ (see section 4.2): the view that humans are naturally driven to pursue pleasure and avoid pain. As she puts it, “To all sensible Beings Pleasure is preferable to Pain” (Enquiry, III.I) and “If to all sensible Beings Pleasure is preferable to Pain, then all such Beings must will and desire Pleasure, and will an Avoidance of Pain” (Enquiry, III.II). Newcome then moves from these claims about what humans naturally pursue or avoid to a claim about what is most ‘fit’ for us. Like later utilitarians such as Bentham and Mill, Newcome thus bases her normative ethical theory—that is, her account of how we ought to act—on psychological hedonism, an account of how we naturally tend to act. She writes: “What sensible Beings must always prefer, will, and desire, is most fit for them” (Enquiry, III.III) and “What sensible Beings must always will contrary to, shun and avoid, is most unfit for them” (Enquiry, III.IV). She concludes: “Happiness is then in its own Nature most fit for sensible Beings” and “Misery is in its own Nature most unfit for them” (Enquiry, III.V–VI).

As we noted at the beginning of this section, Newcome does not explicitly commit herself to the Lockean Axiom that there is no idea in the mind that cannot be traced back to some particular experience. Nonetheless, it is true to say that Newcome arrives at her conception of how humans ought to act on the basis of claims about experience. As we saw, Newcome’s view is that pleasure and pain cannot be defined. Her view seems to be that we all just know what it is to feel pleasure and experience pain, through experience. In much the same way that one could not convey an accurate notion of light or darkness to someone blind from birth or loud and quiet to someone deaf from birth, Newcome’s view seems to be that the only way to know what pleasure and pain are is to have pleasurable and painful experiences. And it is on the basis of such experiences that Newcome, in turn, arrives at her conception of happiness, misery, and ‘fit’ or ‘unfit’ actions, that is, the kinds of actions that are ‘right’ or ‘wrong’, respectively, for us to perform. As we suggested above, Newcome’s moral philosophy is also noteworthy in that she identifies rational actions with those which are conducive to pleasure. She explains: “As Reason is that Power of the Mind by which it finds Truth, and the Fitness and Unfitness of Things, it follows, that whatever is True or Fit, is also Rational, Reasonable, or according to Reason” (Enquiry, IV).

And she adds that “all those Actions of Beings which are Means to their Happiness, are rational” (Enquiry, IV.V). In Newcome, then, we find not only a normative ethical theory but also an account of rational action that is grounded, ultimately, in our experience of things. Rational action is action conducive to happiness, and happiness is the accumulation of pleasure. We work out what actions are rational or irrational, then, by appealing to our experience of pleasure or pain.

5. God and Free-Thinking

This section focuses on the application of the Lockean Axiom to questions concerning the existence of God and the divine attributes of wisdom, goodness, and power—a crucial issue for philosophers during the early modern period, when theological issues were seen as just as important as other philosophical or scientific issues.

If you accept the Lockean Axiom, this seems to pose a problem for talk of God and his attributes (although it is worth noting that Locke does not seem to see it that way; rather, he thinks the idea of God is on the same epistemic footing as our idea of other minds (Essay 2.23.33–35)). As the ‘free-thinking’ philosopher Anthony Collins (1676–1729) argues, if all the ideas in our minds can be traced back to some particular experience, and if we cannot experience God directly (as orthodox Christian teachings, particularly in the Anglican tradition, would have it), then it seems impossible that we could have an idea of God. But if we cannot have an idea of the deity, one might worry, how can we know or learn anything about God? And what does that mean for the Bible, which is supposed to help us do just that? Thus, thinkers like Collins would argue, while you can have faith in God’s existence, whether this is a reasonable or justified belief is an entirely different question.

A potential rebuttal to Collins’ way of arguing, however, is to point to divine revelations in the form of miracles or other Christian mysteries. Perhaps miracles do constitute instances in which those present can, or could, experience God, or divine actions. This kind of response is attacked by another free-thinker, John Toland (1670–1722), who argues that religious mysteries cannot even be an object of belief because they are inconceivable. For example, the idea that God is father, son, and holy spirit all at once is something that seems both inconceivable and contrary to reason. Against these lines of reasoning, more orthodox thinkers like George Berkeley, who, crucially, accepts the Lockean Axiom as well (see section 3.1), argue that even though we cannot have an idea of God, we can nonetheless experience the deity through our experience of the divine creation, nature. We outline Collins’, Toland’s, and Berkeley’s views on God, and their relation to ‘empiricism’, in the subsections below.

a. Anthony Collins

Anthony Collins had a close friendship with Locke, but he adopted the Lockean Axiom to advance his free-thinking agenda. Like Toland (see section 5.2), Collins is concerned with defending the right to make “use of the understanding, in endeavouring to find out the meaning of any proposition whatsoever, in considering the nature of evidence for or against it, and in judging of it according to the seeming force or weakness of the evidence” (Discourse, 3).

Crucially, this process ought not to be interfered with by authority figures, particularly religious ones. Rather, everyone needs to be able to judge the evidence freely on their own (Discourse 3–21).

When it comes to applying the Lockean Axiom to questions concerning God’s existence and the divine attributes, Collins takes a concession by an orthodox cleric, archbishop William King (1650–1729), as his starting point. King writes that “it is in effect agreed on all hands, that the Nature of God, as it is in it self, is incomprehensible by human Understanding; and not only his Nature, but likewise his Powers and Faculties” (Sermon § 3). While experience is not explicitly mentioned here, the underlying thought is that God is mysterious because we cannot experience God himself or the divine attributes. In other words, we do not have an idea of God because we cannot experience God. Thus, Collins argues that the word ‘God’ is empty (that is, does not denote anything in the world) and that when we say something like ‘God is wise,’ this is basically meaningless (Vindication, 12–13). In particular, Collins emphasizes that because of this lack of experience and the subsequent emptiness of the term, it becomes impossible to prove the existence of God against atheists. For the term cannot refer to more than a “general cause or effect” (Vindication, 13), something that, he thinks, even atheists agree exists (Vindication, 14). They would only deny that this cause is wise, or would refuse the notion that this cause is immaterial, equating it instead with the “Material Universe” (Vindication, 14). To put it differently, Collins comes close to using the Lockean Axiom to advance atheism. At the very least, he makes it evident that accepting this axiom undermines fundamental theological commitments, because God and the divine attributes are generally held to be beyond the realm of creaturely experience and thus whatever idea we have of God must be empty (for discussion of Collins’ philosophy and the question of whether he is an atheist see O’Higgins 1970, Taranto 2000, and Agnesina 2018).

As we discuss in the next subsection, a similar way of arguing can also be found in John Toland’s Christianity not Mysterious (1696), which might have been an influence on Collins’ thinking. In contrast to Collins, though, Toland puts more emphasis on the connection between the Lockean Axiom and language, something that he also adopts from Locke.

b. John Toland

John Toland was an Irish-born deist who was raised as a Roman Catholic but converted to Anglicanism (the predominant denomination in Britain at the time) in his twenties. Throughout his writing career, Toland challenged figures in positions of authority. In Christianity not Mysterious, Toland takes aim at the Anglican clergy; this, ultimately, led to a public burning of several copies of the book by a hangman and Toland fleeing Dublin.

As mentioned in the previous subsection, Toland argues in Christianity not Mysterious in a way similar to Collins. This is no surprise if we consider that Toland highly esteemed Locke and accepts the Lockean Axiom. In fact, and, again, similarly to Collins, he implicitly draws from the axiom (or rather its contraposition) to argue against the religious mysteries of Christianity, such as the virgin birth of Jesus Christ or the latter’s resurrection from the dead. These events are mysterious in the sense that they cannot be explained without invoking a supernatural power because they conflict with the way things ‘naturally’ are. In line with such an understanding, Toland defines mysteries as “a thing of its own Nature inconceivable, and not to be judg’d by our ordinary Faculties and Ideas” (CNM, 93). The underlying idea is that mysteries are beyond the realm of our experience and that we cannot have an idea of any mystery because we cannot experience them, and so Toland says that “a Mystery expresses Nothing by Words that have no Ideas at all” (CNM, 84). In saying this, Toland intends to follow Locke in holding that every meaningful word must stand for an idea and as such can be traced to some experience. As Locke says: “He that hath Names without Ideas, wants meaning in his Words, and speaks only empty Sounds” (Essay 3.10.31). On this basis Toland argues that terms referring to mysteries are empty or meaningless because there can be no experiences of them. For instance, Toland criticises the doctrine of the Holy Trinity on this ground as well as arguing that it is neither supported by the Bible nor any other form of divine revelation (CNM, § 3). He does not, however, reject the existence of divine revelation outright (compare CNM, 12).

In keeping with his critical attitude towards (religious) authorities, Toland claims that the Holy Trinity and other mysteries are an invention of “priest-craft” (CNM, 100) and nothing but a tool for submission. This point ties into his overall emancipatory aim of arguing for the right of everyone to use their reason in order to interpret the Bible on their own, without interference by religious authorities (CNM, 5–14). For Toland believes that every reasonable person is capable of understanding the Bible because reason is God-given. As Toland puts this point when addressing a potential clerical reader: “The uncorrupted Doctrines of Christianity are not above their [that is, the lay people’s] Reach or Comprehension, but the Gibberish of your Divinity Schools they understand not” (CNM, 87).

In short, Toland makes use of Lockean insights to tackle what were difficult and important theological questions of the day. By implicitly drawing on the Lockean Axiom and a broadly Lockean understanding of meaning, he argues against an overreach of clerical authority and against the existence of religious mysteries. Toland holds that if something is really part of Christianity, it must also be accessible by our God-given reason (see also Daniel 1984 and the essays in Toland 1997 for more on Toland’s position).

c. George Berkeley

Throughout his life, Berkeley was very concerned with battling atheism or ideas which he thought undermined Christian teachings. His Principles of Human Knowledge was dedicated to identifying and rejecting the “grounds” for “Atheism and Irreligion” (Works II, 20). He also defends the idea that vision (NTV1709 § 147) or nature (NTV1732 § 147) is a divine language in his New Theory of Vision. Yet his most elaborate defense of the idea that we can experience God through nature is found in the fourth dialogue of Alciphron; or the Minute Philosopher (1732/52), which is a set of philosophical dialogues. In a nutshell, Berkeley argues that we have no direct access to, or experience of, any other mind, including those of our fellow human beings or other rational agents. Nonetheless most of us believe that other rational agents exist. The reason for this, Berkeley contends, is that these agents exhibit “signs” of their rationality which we can experience. Most notably, they communicate with us using language (AMP 4.5–7). Berkeley then argues (AMP 4.8–16) that nature (that is, everything we see, hear, smell, taste, and touch) literally forms a divine language (there are competing interpretations of how to best interpret this divine language; see, for example, Fasko 2021 and Pearce 2017). This language not only shows that God (as a rational agent) exists, but also displays the divine goodness by providing us with “a sort of foresight which enables us to regulate our actions for the benefit of life. And without this we should be eternally at a loss” (PHK § 31). For example, God ensures that where there is fire there is smoke and, in this way, ‘tells’ us there is fire nearby, when we see smoke. In this way, Berkeley objects to the line of reasoning introduced at the beginning of this section (that we cannot have an idea of God because we cannot experience the deity) by showing that there is a sense in which we experience God, via the divine language that constitutes nature.

Thus, Berkeley not only accepts the Lockean Axiom, but also accepts Collins’ point that we cannot immediately experience God. What he rejects is the notion that there are no mediate signs for God’s existence because nature, as a divine language, is abundant with them.

While Alciphron provides evidence of God’s existence, Berkeley’s account of how we know (something) about God’s nature can be found in the Three Dialogues. There, he explains:

[T]aking the word ‘idea’ in a large sense, my soul may be said to furnish me with an idea, that is, an image or likeness of God, though indeed extremely inadequate. For all the notion I have of God is obtained by reflecting on my own soul, heightening its powers, and removing its imperfections. (DHP 231–32)

In other words, by reflecting on my own mind, but endeavoring to remove my imperfections, I can get a sense of what God’s mind must be like. Combined with the claims in Alciphron, Berkeley thus offers an account of knowledge of God’s existence and nature.

In the seventh dialogue of Alciphron, Berkeley tackles the challenge issued by Toland. Berkeley argues that it is not problematic that some words do not signify ideas and thus their meaning cannot be traced back to some experience. In fact, Berkeley argues, our everyday language is full of such words. These words still have a meaning because they serve a purpose:

[T]here may be another use of words besides that of marking and suggesting distinct ideas, to wit, the influencing our conduct and actions, which may be done either by forming rules for us to act by, or by raising certain passions, dispositions, or emotions in our minds (AMP 7.5).

Berkeley thus deems it irrelevant for the meaningfulness of a term whether it refers to ideas that are ultimately grounded in experience. Rather, its meaning needs to be judged by the function it serves. When it comes to the mysteries that Toland attacked, Berkeley argues it is irrelevant that we cannot experience them: as long as talking about them serves the right function, such talk is still meaningful (AMP 7.14–31) (see Jakapi 2002; West 2018).

6. Anton Wilhelm Amo: A Case Study in the Limits of British Empiricism

We argued in the first section of this article that considering the Peripatetic axiom, or more precisely the Lockean Axiom, allows for a more inclusive and diverse alternative story than the standard narrative of ‘British Empiricism’ which solely focuses on Locke, Berkeley, and Hume. This was our motivation for moving away from the standard narrative and focusing on the Lockean Axiom. The advantage of the narrative presented here is that it can incorporate a wider variety of issues and thinkers. However, we also pointed out that the narrative told here is neither exclusive nor exhaustive. Rather than this being a fault specific to our chosen narrative, we think this is an inevitable consequence of developing narratives that include some figures or ideas and exclude others.

This final section’s aim is to further put this narrative into perspective—not least, to make it abundantly clear that we do not intend to replace the standard narrative with the ‘correct’ story of ‘British Empiricism’. Rather, our aim is to illustrate that we are forced to tell stories that involve difficult choices, which ought to be, nonetheless, deliberate (and transparent) and to show what kind of stories can be told and what the limitations of narratives, such as the one developed here, are. In the following, we therefore first introduce a fringe case—that is, a thinker who could, on certain readings, be read as an ‘empiricist’—in the form of Anton Wilhelm Amo (1703–1756), the first African to receive a doctorate in Europe and a figure who is increasingly of interest to Early Modern scholars (for example, Wiredu 2004, Emma-Adamah 2015, Meyns 2019, Menn and Smith 2020, Smith 2015, Walsh 2019, West 2022).

The aim in doing so is to demonstrate that the Peripatetic Axiom transcended the boundaries of early modern Britain and that it was quite possible for thinkers on the continent to have just as much (if not more) in common with, for example, Locke than with Descartes (in turn, this indicates that the traditional story of ‘empiricism versus rationalism’ cannot simply be replaced with ‘Lockeanism versus Cartesianism’). The case of Amo also puts pressure on the cohesiveness of the concept of ‘British Empiricism’: in short, there is nothing uniquely British about being an ‘empiricist’ (that is, accepting the Peripatetic Axiom or the Lockean Axiom). We begin with a very brief overview of Amo’s philosophy before drawing out the tension between, on the one hand, Amo’s commitment to the Peripatetic Axiom and, on the other, the difficulty that arises if we try to place him in the ‘empiricist’ tradition. The case of Amo, we think, shows that there simply is not (in any realist sense) any fact of the matter about whether this or that philosopher is or is not an ‘empiricist.’

Anton Wilhelm Amo wrote four texts during his lifetime: the Inaugural Dissertation on the Impassivity of the Human Mind, the Philosophical Disputation Containing a Distinct Idea of those Things that Pertain either to the Mind or to Our Living and Organic Body (both written in 1734), a Treatise on the Art of Philosophising Soberly and Accurately, and On the Rights of Moors in Europe (his first text, published in 1729, which, sadly, is now lost). The three surviving texts outline Amo’s account of the mind-body relation, which is substance dualist, and his theory of knowledge. Specifically, the Inaugural Dissertation and the Philosophical Disputation both defend a roughly Cartesian account of the mind-body relation and mind-body interaction. Amo is critical of certain elements of Descartes’ view, in particular the idea that the mind can ‘suffer’ (that is, passively experience sensations) with the body (ID, 179–81). Yet, while he is critical, Amo’s aim is not to dismiss but to fix these kinds of issues with Descartes’ dualism (Nwala 1978, 163; Smith 2015, 219). While it is not clear cut, there is therefore a case to be made for thinking of Amo as a ‘Cartesian’ if, by ‘Cartesian’, we mean something like a thinker who sets out to augment Descartes’ worldview in order to defend or support it. He is certainly not an outright critic. At the very least, it would be difficult to place Amo in the ‘empiricist’ tradition, at least as it is typically construed, given the underlying Cartesian flavour of his philosophical system.

What makes Amo an interesting ‘fringe’ case for ‘empiricism’ (and, indeed, for Cartesianism too) is his explicit commitment to the Peripatetic Axiom (see, for example, Treatise, 139, 141, 146). Like Hobbes and Locke, as well as Aristotelian scholastics like Aquinas before them (see section 2.1), Amo maintains that there is nothing in the intellect not first in the senses. Other Cartesians, such as Antoine Arnauld, explicitly rejected the Peripatetic Axiom. As Arnauld puts it, “It is false…that all of our ideas come through our senses” (Arnauld 1970, 7). Now, it is worth noting that Amo is not a lone outlier. Other Cartesians, like Robert Desgabets or Pierre-Sylvain Régis, also accepted the Peripatetic Axiom; thus, there are further fringe cases. Nonetheless, Amo’s body of work is well suited to illustrating the limitations of our narrative and, in fact, of any narrative that makes use of ‘empiricism’, or related notions like ‘Cartesianism’, as labels. For, on the one hand, Amo has in common with traditional ‘empiricists’, like Locke, a commitment to the Peripatetic Axiom. But on the other, he wants to defend and improve the philosophical system of someone (that is, Descartes) who has come to epitomize like no other what ‘rationalism’ is about.

One might demand to know: ‘Well, is Amo an empiricist or not?’ But what this discussion shows, we contend, is that when it comes to Amo, or others like Desgabets or Régis, there is no simple (or ‘right’) answer. The answer depends on what is meant by ‘empiricist’, and this, in turn, might depend upon the context in which that concept is being employed or the use to which it is being put.

In that sense, Amo’s body of work illustrates the fundamental ‘danger’ that the attribution of any “-ism” (that is, of analysts’ rather than actors’ categories) runs, particularly if these positions are taken to be dichotomous with others: such attributions risk obscuring important similarities or differences between thinkers’ ideas, or simply omitting interesting thinkers or ideas just because they do not fit the story and, crucially, not because they are undeserving of attention.

There are other reasons to think of someone like Amo as a particularly significant figure when it comes to examining, and revising, the historical canon—and categories like ‘empiricism’ in particular. In light of growing interest in and demand for non-Western figures, and thinkers from typically marginalised backgrounds, both in teaching and scholarship, Amo—the first African to receive a doctorate in Europe—has picked up considerable attention. But in what context should we teach or write about Amo? Continuing to think in terms of the standard narrative of ‘British Empiricism’ versus ‘Continental Rationalism’ will, as the above discussion showed, not make it easy to incorporate Amo’s work into syllabi or research—precisely because there is no objective fact of the matter about whether he is one or the other. And, as we have already suggested, Amo is not alone; this is true of many figures who, not coincidentally, have never quite found a place in the standard early modern canon. We think there are ways to incorporate figures like Amo into our familiar narratives—for instance, construing ‘empiricism’ in terms of an adherence to the Peripatetic Axiom does, in that sense, make Amo an ‘empiricist’—but such cases also provide reasons to think that we ought to take a serious look at what purpose those narratives serve and whether we, as scholars and educators, want them to continue to do so. New narratives are available and might better serve our aims, and correspond with our values, in teaching and scholarship going forward.

7. References and Further Reading

When citing primary sources, we have always aimed to use the canonical or most established forms. Where there are no such forms, we have used abbreviations that seemed sensible to us. Also, where a text was not originally written in English, we have used standard translations. Finally, we want to note that on each of these figures and issues there is far more high-quality scholarship than we were able to point towards in this article. The references we provide are merely intended to be a starting point for anyone who wants to explore these figures and issues in more detail.

a. Primary Sources

  • Amo, Anton Wilhelm. Anton Wilhelm Amo’s Philosophical Dissertations on Mind and Body. Edited by Smith, Justin EH, and Stephen Menn. Oxford: Oxford University Press, 2020.
  • The first critical translation of Amo’s work espousing his philosophy of mind.

  • Amo, Anton Wilhelm. Treatise on the art of philosophising soberly and accurately (with commentaries). In T. U. Nwala (Ed.), William Amo Centre for African Philosophy. University of Nigeria, 1990.
  • Amo’s most systematic text in which he offers a guide to logic and fleshes out his account of the mind-body relation and philosophy of mind.

  • Aristotle. [APo.], Posterior Analytics, trans. Hugh Tredennick, in Aristotle: Posterior Analytics, Topica, Loeb Classical Library, Cambridge, MA; London: William Heinemann, 1964, pp. 2–261.
  • One of the most prominent English translations of Aristotle’s famous work on science.

  • Aristotle. [EN], The Nicomachean Ethics, trans. H. Rackham, Loeb Classical Library, London: William Heinemann, Cambridge, MA: Harvard University Press, 1947.
  • One of the most prominent English translations of Aristotle’s famous work on ethics.

  • Aristotle. [GA], De la génération des animaux, ed. Pierre Louis, Collection des Universités de France, Paris: Les Belles Lettres, 1961; trans. A. L. Peck, in Aristotle, Generation of Animals, Loeb Classical Library, London: William Heinemann, Cambridge, MA: Harvard University Press, 1953.
  • One of the most prominent English translations of Aristotle’s famous work on biology.

  • Aristotle. [Meteor.], Meteorologica, trans. H. D. P. Lee, Loeb Classical Library, London: William Heinemann, Cambridge, MA: Harvard University Press, 1962.
  • One of the most prominent English translations of Aristotle’s famous work on the elements.

  • Aristotle. The Complete Works of Aristotle. Edited by Jonathan Barnes. Princeton: Princeton University Press, 1984.
  • Standard English translation used by scholars of Aristotle’s complete works.

  • Arnauld, Antoine. La Logique, ou L’Art de penser. Flammarion, 1970.
  • Edition of Arnauld and Nicole’s logic textbook.

  • Arnauld, Antoine and Nicole, Pierre. Logic, or, The art of thinking in which, besides the common, are contain’d many excellent new rules, very profitable for directing of reason and acquiring of judgment in things as well relating to the instruction of for the excellency of the matter printed many times in French and Latin, and now for publick good translated into English by several hands. London: Printed by T.B. for H. Sawbridge, 1685.
  • Early English translation of this important text for the so-called Port-Royal Logic; an influential logic textbook.

  • Aquinas, Thomas. [DA] A Commentary on Aristotle’s De anima. Edited by Robert Pasnau. New Haven, CT: Yale University Press, 1999.
  • English translation of Aquinas’ commentary on Aristotle’s famous text on the soul.

  • Aquinas, Thomas. Truth. Translated by Mulligan, Robert W., James V. McGlynn, and Robert W. Schmidt. 3 volumes. Indianapolis: Hackett, 1994.
  • English translation of Aquinas’ disputed questions on truth.

  • Astell, Mary. A Serious Proposal to the Ladies. Parts I and II. Edited by P. Springborg. Ontario: Broadview Literary Texts, 2002.
  • Argues for women’s education and offers a way for women to improve their critical thinking skills.

  • Astell, Mary. The Christian Religion, As Profess’d by a Daughter of the Church of England. In a Letter to the Right Honourable, T.L. C.I., London: R. Wilkin, 1705.
  • Introduces Astell’s religious and philosophical views and continues her feminist project.

  • Bacon, Francis. The Works. Edited by J. Spedding, R. L. Ellis, and D. D. Heath. 15 volumes. London: Houghton Mifflin, 1857–1900.
  • First edition of Bacon’s works, still in use by scholars.

  • Bacon, Roger. [OM] The ‘Opus Maius’ of Roger Bacon. Edited by Robert Belle Burke. 2 volumes. New York: Russell & Russell, 1928.
  • One of, if not the, most important works of Bacon, attempting to cover all aspects of natural science.

  • Berkeley, George. The Correspondence of George Berkeley. Edited by Marc A. Hight. Cambridge: Cambridge University Press, 2013.
  • Most comprehensive edition of Berkeley’s correspondence with friends, family, and contemporary thinkers.

  • Berkeley, George. The Works of George Berkeley, Bishop of Cloyne. Edited by A. A. Luce and T. E. Jessop. 9 volumes. London: Thomas Nelson and Sons, 1948–1957.
  • Currently the standard scholarly edition of Berkeley’s writings.

  • Cavendish, Margaret. Observations upon Experimental Philosophy. Edited by Eileen O’Neill. Cambridge: Cambridge University Press, 2001.
  • Cavendish’s critique of the experimental philosophy of the Royal Society in London, and a defence of her own philosophical system.

  • Cavendish, Margaret. Grounds of Natural Philosophy. Edited by Anne M. Thell. Peterborough, Canada: Broadview Press, 2020.
  • The most detailed articulation of Cavendish’s ‘vitalist’ philosophical system of nature.

  • Cavendish, Margaret. The Blazing World and Other Writings. London: Penguin Classics, 1994.
  • Cavendish’s fantasy novel, which critiques the Royal Society and was published alongside her Observations.

  • Collins, Anthony. A Discourse of Free-thinking: Occasion’d by the Rise and Growth of a Sect Call’d Free-thinkers. London, 1713.
  • A defence of the right to think for oneself on any question.

  • Collins, Anthony. A vindication of the divine attributes In some remarks on his grace the Archbishop of Dublin’s sermon, intituled, Divine predestination and foreknowledg consistent with the freedom of man’s will. H. Hills, and sold by the booksellers of London and Westminster, 1710.
  • A critique of Archbishop King’s sermon, arguing that King’s position is effectively no different from atheism.

  • Conway, Anne. The Principles of the Most Ancient and Modern Philosophy. Translated by J. C[lark]. London, 1692.
  • First English translation of Conway’s only known book introducing her metaphysics and system of nature.

  • Hobbes, Thomas. Leviathan, with selected variants from the Latin edition of 1668. Edited by Edwin Curley. Indianapolis: Hackett, 1994.
  • Hobbes’ influential political treatise, in which he also defends materialism and an ‘empiricist’ theory of knowledge.

  • Hume, David. Enquiries concerning Human Understanding and concerning the Principles of Morals, edited by L. A. Selby-Bigge, 3rd ed. revised by P. H. Nidditch, Oxford: Clarendon Press, 1975.
  • Standard scholarly edition of Hume’s famous work in which he lays out his moral and political philosophy.

  • Hume, David. A Treatise of Human Nature. Edited by L. A. Selby-Bigge, 2nd edition revised by P. H. Nidditch. Oxford: Clarendon Press, 1975.
  • Standard scholarly edition of Hume’s famous work in which he lays out his account of human nature and begins to develop an account of the human mind.

  • Hutcheson, Francis. An Inquiry into the Original of Our Ideas of Beauty and Virtue. Edited by Wolfgang Leidhold. Indianapolis: Liberty Fund, 2004.
  • Hutcheson’s influential texts on ethics and aesthetics, in which he argues that we have both a moral sense and a sense of beauty.

  • Hutcheson, Francis. An Essay on the Nature and Conduct of the Passions, with Illustrations on the Moral Sense. Edited by Aaron Garrett. Indianapolis: Liberty Fund, 2002.
  • A text outlining Hutcheson’s moral philosophy.

  • King, William. Archbishop King’s Sermon on Predestination. Edited by David Berman and Andrew Carpenter. Cadenus Press: Dublin, 1976.
  • A sermon on predestination that revolves around the issue of divine attributes and the way we can meaningfully talk about these attributes and God’s nature.

  • Leibniz, Gottfried Wilhelm. Die philosophischen Schriften. Edited by Carl Immanuel Gerhardt. 7 volumes. Weidmann: Berlin, 1875–90.
  • Standard scholarly edition of all of Leibniz’s works.

  • Locke, John. An Essay concerning Human Understanding. Edited by Peter H. Nidditch. Oxford: Clarendon Press, 1975.
  • Standard scholarly edition of Locke’s most famous work, providing his description of the human mind.

  • Masham, Damaris. Occasional Thoughts in Reference to a Vertuous or Christian Life, London: A. and J. Churchil, 1705.
  • Masham’s second book develops the views of the Discourse in relation to practical morality.

  • Masham, Damaris. A Discourse Concerning the Love of God, London: Awnsham and John Churchill, 1696.
  • Argues that humans are social and rational as well as motivated by love of happiness.

  • Newcome, Susanna. An Enquiry into the Evidence of the Christian Religion. By a Lady [ie S. Newcome]. The second edition, with additions. London: William Innys, 1732.
  • Newcome’s book espousing her views on morality and a defence of the Christian religion.

  • Plato. Plato: Complete Works. Edited by John M. Cooper. Indianapolis: Hackett, 1997.
  • A standard English edition of Plato’s complete works.

  • Shepherd, Mary. Essays on the Perception of an External Universe, and Other Subjects connected with the Doctrine of Causation. London: John Hatchard and Son, 1827.
  • Shepherd’s second book introducing her metaphysics by establishing that there is an independently and continuously existing external world.

  • Shepherd, Mary. An Essay upon the Relation of Cause and Effect, controverting the Doctrine of Mr. Hume, concerning the Nature of the Relation; with Observations upon the Opinions of Dr. Brown and Mr. Lawrence, Connected with the Same Subject. London: printed for T. Hookham, Old Bond Street, 1824.
  • Shepherd’s first book introducing her notion of causation by way of rejecting a Humean notion of causation.

  • Toland, John. John Toland’s Christianity Not Mysterious: Text, Associated Works, and Critical Essays. Edited by Philip McGuinness, Alan Harrison, and Richard Kearney. Dublin: Lilliput Press, 1997.
  • Critical edition of one of Toland’s most famous works, which argues that nothing that is above or beyond reason is part of Christianity.

  • Wollstonecraft, Mary. A Vindication of the Rights of Woman with Strictures on Political and Moral Subjects. Edited by Sylvana Tomaselli, in A Vindication of the Rights of Men with A Vindication of the Rights of Woman and Hints, Cambridge: Cambridge University Press, 1995.
  • Critical edition of Wollstonecraft’s groundbreaking work arguing for women’s rights.

b. Secondary Sources

  • Agnesina, Jacopo. The philosophy of Anthony Collins: free-thought and atheism. Paris: Honoré Champion, 2018.
  • Consideration of Collins’ philosophy with a focus on the question whether he is an atheist.

  • Anfray, Jean-Pascal. “Leibniz and Descartes.” In The Oxford Handbook of Descartes and Cartesianism, edited by Steven Nadler, Tad M. Schmaltz, and Delphine Antoine-Mahut, 721–37. Oxford: Oxford University Press, 2019.
  • Essay considering the complicated relationship between two rationalists.

  • Atherton, Margaret. “Lady Mary Shepherd’s Case Against George Berkeley.” British Journal for the History of Philosophy 4 (1996): 347–66. Doi: 10.1080/09608789608570945
  • First article to discuss and evaluate Shepherd’s criticism of Berkeley.

  • Atherton, Margaret, ed. Women philosophers of the early modern period. Indianapolis/Cambridge: Hackett, 1994.
  • Groundbreaking volume that covers various women philosophers and presents excerpts of their works, intended to support their inclusion in the classroom.

  • Atherton, Margaret. “Cartesian reason and gendered reason.” In A mind of one’s own, edited by Louise Antony and Charlotte Witt, 21–37. Boulder, CO: Westview Press, 1993.
  • Argues, against first-generation feminist critiques, for the emancipatory potential that Cartesianism held for some female thinkers.

  • Atherton, Margaret. Berkeley’s Revolution in Vision. Ithaca: Cornell University Press, 1990.
  • The most comprehensive study of Berkeley’s theory of vision and philosophy of perception.

  • Ayers, Michael. Locke: Epistemology and Ontology. London: Routledge, 1991.
  • An in-depth discussion of Locke’s theory of knowledge and metaphysics.

  • Bahar, Saba. Mary Wollstonecraft’s Social and Aesthetic Philosophy: An Eve to Please Me. New York: Palgrave, 2002.
  • Sustained discussion of the way that aesthetic considerations (pertaining to the presentation of women) play a crucial role for Wollstonecraft’s feminist project.

  • Beauchamp, T.L. and A. Rosenberg. Hume and the Problem of Causation. Oxford: Oxford University Press, 1981.
  • Classical study of the Humean notion of causation and its problems.

  • Bell, Martin. “Hume on Causation.” In The Cambridge Companion to Hume’s Treatise, edited by Donald C. Ainslie, and Annemarie Butler, 147–76, Cambridge: Cambridge University Press, 2015.
  • Consideration of Hume’s view of causation, highlighting the centrality of this issue for understanding his philosophical system.

  • Bennett, Jonathan. Locke, Berkeley, Hume: Central Themes. Oxford: Oxford University Press, 1971.
  • Classic study of the three so-called empiricists, which highlights issues discussed by all of these thinkers.

  • Bergès, Sandrine, and Coffee, Alan. The Social and Political Philosophy of Mary Wollstonecraft. Oxford: Oxford University Press, 2016.
  • Essays that distinctively consider Wollstonecraft as a philosopher and relate her to her intellectual context as well as contemporary debates.

  • Bergès, Sandrine. The Routledge guidebook to Wollstonecraft’s A Vindication of the Rights of Woman. London: Routledge, 2013.
  • Contributions introducing readers to Wollstonecraft’s famous work on women’s rights and hence also to the origins of feminist thought.

  • Bolton, Martha Brandt. “Lady Mary Shepherd and David Hume on Cause and Effect.” In Feminist History of Philosophy: The Recovery and Evaluation of Women’s Philosophical Thought, edited by Eileen O’Neill and Marcy P. Lascano, 129–52. Cham: Springer, 2019.
  • Sustained discussion of the different understandings of causation in Hume and Shepherd.

  • Bolton, Martha. “Causality and Causal Induction: The Necessitarian Theory of Lady Mary Shepherd.” In Causation and Modern Philosophy, edited by Keith Allen and Tom Stoneham, 242–61. New York: Routledge, 2010.
  • Classic article on Shepherd’s idiosyncratic notion of causation and the way she departs from Hume.

  • Boyle, Deborah. Mary Shepherd: A Guide. Oxford: Oxford University Press, 2023.
  • First book-length treatment of Shepherd’s metaphysics, discussing her core commitments and pointing to helpful secondary literature.

  • Boyle, Deborah. “Mary Shepherd on Mind, Soul, and Self.” Journal of the History of Philosophy 58, no. 1 (2020): 93–112. Doi: 10.1353/hph.2020.0005
  • First sustained discussion of Shepherd’s philosophy of mind.

  • Boyle, Deborah A. The well-ordered universe: The philosophy of Margaret Cavendish. Oxford: Oxford University Press, 2018.
  • In-depth discussion of Cavendish’s metaphysics.

  • Broad, Jacqueline. “Damaris Masham on Women and Liberty of Conscience.” In Feminist History of Philosophy: The Recovery and Evaluation of Women’s Philosophical Thought, edited by Eileen O’Neill and Marcy P. Lascano, 319–36. Cham: Springer, 2019.
  • One of the first considerations of Masham’s views on women and the ethics of toleration.

  • Broad, Jacqueline. The philosophy of Mary Astell: An early modern theory of virtue. Oxford: Oxford University Press, 2015.
  • Argues that Astell’s ethical goals are at the center of her philosophical project and help to unite some of her seemingly diverging commitments.

  • Broad, Jacqueline. “A woman’s influence? John Locke and Damaris Masham on moral accountability.” Journal of the History of Ideas 67, no. 3 (2006): 489–510. Doi: https://www.jstor.org/stable/30141038
  • Considers the influence Masham had on Locke’s notion of moral accountability.

  • Chappell, Vere, ed. Essays on Early Modern Philosophy, John Locke—Theory of Knowledge. London: Garland Publishing, 1992.
  • Contributions on a broad variety of issues that pertain to Locke’s theory of knowledge, ranging from triangles to memory.

  • Conley, John J. “Suppressing Women Philosophers: The Case of the Early Modern Canon.” Early Modern Women: An Interdisciplinary Journal 1, no. 1 (2006): 99–114. Doi: 10.1086/EMW23541458
  • Consideration of the exclusion of women from the history of philosophy with a focus on the challenges of their reintegration.

  • Connolly, Patrick J. “Susanna Newcome and the Origins of Utilitarianism.” Utilitas 33, no. 4 (2021): 384–98. Doi: 10.1017/S0953820821000108
  • One of the few scholarly works on Newcome arguing that she occupies a noteworthy position at the dawn of utilitarianism.

  • Costelloe, Timothy M. Aesthetics and morals in the philosophy of David Hume. London: Routledge, 2013.
  • A broad discussion of Hume’s ethics and aesthetics.

  • Cranefield, Paul F. “On the Origin of the Phrase NIHIL EST IN INTELLECTU QUOD NON PRIUS FUERIT IN SENSU.” Journal of the history of medicine and allied sciences 25, no. 1 (1970): 77–80. Doi: 10.1093/jhmas/XXV.1.77
  • Early article looking into the origin of the Peripatetic axiom as found in Locke.

  • Cruz, Maité. “Shepherd’s Case for the Demonstrability of Causal Principles.” Ergo: An Open Access Journal of Philosophy (forthcoming).
  • Argues that Shepherd endorses a broadly Lockean or Aristotelian substance metaphysics.

  • Cunning, David. Cavendish. London: Routledge, 2016.
  • An introduction to Cavendish’s life and philosophical contributions.

  • Daniel, Stephen H. George Berkeley and Early Modern Philosophy. Oxford: Oxford University Press, 2021.
  • Book-length treatment of Berkeley, relating his views to many other Early Modern figures and to Ramism.

  • Daniel, Stephen Hartley. John Toland: His methods, manners, and mind. Kingston/Montreal: McGill-Queen’s Press-MQUP, 1984.
  • One of only a few book-length studies of Toland and his philosophy.

  • Detlefsen, Karen. “Atomism, Monism, and Causation in the Natural Philosophy of Margaret Cavendish.” Oxford Studies in Early Modern Philosophy 3 (2006): 199–240. Doi: 10.1093/oso/9780199203949.003.0007
  • A paper covering Cavendish’s rejection of atomism and commitment to monism, and her theory of causation.

  • Emma-Adamah, Victor U. “Anton Wilhelm Amo (1703-1756) the African‐German philosopher of mind: an eighteen-century intellectual history.” PhD diss., University of the Free State, 2015.
  • A doctoral dissertation on Amo’s account of the mind-body relation.

  • Falco, Maria J., ed. Feminist Interpretations of Mary Wollstonecraft. University Park PA: Penn State Press, 2010.
  • Includes contributions on the political and social impact of Wollstonecraft’s views.

  • Fasko, Manuel. Die Sprache Gottes: George Berkeleys Auffassung des Naturgeschehens. Basel/Berlin: Schwabe Verlag, 2021.
  • Detailed discussion of Berkeley’s divine language hypothesis arguing, contra Pearce, that only vision is the language of God.

  • Fasko, Manuel, and Peter West. “The Irish Context of Berkeley’s ‘Resemblance Thesis.’” Royal Institute of Philosophy Supplements 88 (2020): 7–31. Doi: 10.1017/S1358246120000089
  • Argues for the importance, in Berkeley’s intellectual context, of the notion that representation requires resemblance.

  • Fields, Keota. Berkeley: Ideas, Immaterialism, and Objective Presence. Lanham: Lexington Books, 2011.
  • Discussion of Berkeley’s immaterialism in the context of Descartes’ notion of objective presence, which requires causal explanations of the content of ideas.

  • Frankel, Lois. “Damaris Cudworth Masham: A seventeenth century feminist philosopher.” Hypatia 4, no. 1 (1989): 80–90. Doi: 10.1111/j.1527-2001.1989.tb00868.x
  • Early article showing that Masham is a philosopher in her own right by espousing her feminist views.

  • Frankena, William. “Hutcheson’s Moral Sense Theory.” Journal of the History of Ideas (1955): 356–75. Doi: https://www.jstor.org/stable/2707637
  • Classic article on Hutcheson’s notion that we have a moral sense (much like a sense for seeing).

  • Galvagni, Enrico. “Secret Sentiments: Hume on Pride, Decency, and Virtue.” Hume Studies 47, no. 1 (2022): 131–55. Doi: 10.1353/hms.2022.0007
  • Discusses Hume’s account of decency and argues that it challenges standard virtue ethical interpretations of Hume.

  • Garrett, Don. “Hume’s Theory of Causation.” In The Cambridge Companion to Hume’s Treatise, edited by Donald C. Ainslie, and Annemarie Butler, 69–100, Cambridge: Cambridge University Press, 2015.
  • An introductory overview of Hume’s controversial theory of causation.

  • Gasser-Wingate, Marc. Aristotle’s Empiricism. Oxford: Oxford University Press, 2021.
  • An in-depth discussion of Aristotle’s view that all knowledge comes from perception.

  • Gordon‐Roth, Jessica, and Nancy Kendrick. “Including Early Modern Women Writers in Survey Courses: A Call to Action.” Metaphilosophy 46, no. 3 (2015): 364–79. Doi: 10.1111/meta.12137
  • Argues for the importance of including women philosophers, not least because of the current underrepresentation of women in the discipline.

  • Gracyk, Theodore A. “Rethinking Hume’s standard of taste.” The Journal of Aesthetics and Art Criticism 52, no. 2 (1994): 169–82.
  • A novel reading of Hume’s account of our knowledge of beauty.

  • Harris, James A. “Shaftesbury, Hutcheson and the Moral Sense.” In The Cambridge History of Moral Philosophy, edited by Sacha Golob and Jens Timmermann, 325–37. Cambridge: Cambridge University Press, 2017. Doi: 10.1017/9781139519267.026
  • An introductory overview of Hutcheson’s account of the moral sense.

  • Hutton, Sarah. “Women, philosophy and the history of philosophy.” In Women Philosophers from the Renaissance to the Enlightenment, edited by Ruth Hagengruber and Sarah Hutton, 12–29. New York: Routledge, 2021.
  • A discussion of why and how women are omitted from many histories of philosophy.

  • Hutton, Sarah. “Liberty of Mind: Women Philosophers and the Freedom to Philosophize.” In Women and liberty, 1600-1800: philosophical essays, edited by Jacqueline Broad and Karen Detlefsen, 123–37. Oxford: Oxford University Press, 2017.
  • A paper arguing that women in early modern philosophy construed liberty as ‘freedom of the mind.’

  • Hutton, Sarah. “Religion and sociability in the correspondence of Damaris Masham (1658–1708).” In Religion and Women in Britain, c. 1660-1760, edited by Sarah Apetrei and Hannah Smith, 117–30. London: Routledge, 2016.
  • A discussion of Masham’s religious and social views, as espoused in her correspondences.

  • Hutton, Sarah. Anne Conway: A woman philosopher. Cambridge: Cambridge University Press, 2004.
  • Detailed discussion of Conway’s philosophy and her intellectual context.

  • Jakapi, Roomet. “Emotive meaning and Christian mysteries in Berkeley’s Alciphron.” British journal for the history of philosophy 10, no. 3 (2002): 401–11. Doi: https://doi.org/10.1080/09608780210143218
  • Discusses the notion that Berkeley has an emotive theory of meaning.

  • Jolley, Nicholas. Locke, His Philosophical Thought. Oxford: Oxford University Press, 1999.
  • A broad discussion of Locke’s philosophical project.

  • Jones, Tom. George Berkeley: A Philosophical Life. Princeton: Princeton University Press, 2021.
  • The most comprehensive study of Berkeley’s life and intellectual context.

  • Kivy, Peter. The Seventh Sense: Francis Hutcheson and Eighteenth-Century British Aesthetics. Oxford: Clarendon Press, 2003.
  • An in-depth discussion of Hutcheson’s account of the sense of beauty.

  • Landy, David. “Shepherd on Hume’s Argument for the Possibility of Uncaused Existence.” Journal of Modern Philosophy 2, no. 1 (2020a). Doi: 10.32881/jomp.128
  • Discusses Shepherd’s criticism of Hume’s argument.

  • Landy, David. “A Defense of Shepherd’s Account of Cause and Effect as Synchronous.” Journal of Modern Philosophy 2, no. 1 (2020). Doi: 10.32881/jomp.46
  • Important discussion of Shepherd’s account of synchronicity, defending this account against Humean worries.

  • Landy, David. “Hume’s theory of mental representation.” Hume Studies 38, no. 1 (2012): 23–54. Doi: 10.1353/hms.2012.0001
  • A novel interpretation of Hume’s account of how the mind represents external objects.

  • Landy, David. “Hume’s impression/idea distinction.” Hume Studies 32, no. 1 (2006): 119–39. Doi: 10.1353/hms.2011.0295
  • A discussion of Hume’s account of the relation between impressions and ideas.

  • Lascano, Marcy P. The Metaphysics of Margaret Cavendish and Anne Conway: Monism, Vitalism, and Self-Motion. Oxford: Oxford University Press, 2023.
  • Comprehensive discussion and comparison of Cavendish and Conway on three major themes in their philosophy.

  • Loeb, Louis E. Reflection and the stability of belief: essays on Descartes, Hume, and Reid. Oxford: Oxford University Press, 2010.
  • A discussion of the connections between Descartes, Hume, and Reid’s philosophies.

  • LoLordo, Antonia. Mary Shepherd. Cambridge: Cambridge University Press, 2022.
  • A broad overview of Shepherd’s philosophy, suitable for beginners.

  • LoLordo, Antonia, ed. Mary Shepherd’s Essays on the Perception of an External Universe. Oxford: Oxford University Press, 2020.
  • First critical edition of Shepherd’s 1827 book and 1832 paper.

  • Mackie, J. L. Problems from Locke, Oxford: Clarendon Press, 1971.
  • A discussion of the philosophical problems, relevant even today, that arise in Locke’s writing.

  • Mercer, Christia. “Empowering Philosophy.” In Proceedings and Addresses of the APA, vol. 94 (2020): 68–96.
  • An attempt to use philosophy’s past to empower its present and to promote a public-facing attitude to philosophy.

  • Meyns, Chris. “Anton Wilhelm Amo’s philosophy of mind.” Philosophy Compass 14, no. 3 (2019): e12571. Doi: 10.1111/phc3.12571
  • The first paper to provide a reconstruction of Amo’s philosophy of mind, suitable for beginners.

  • Michael, Emily. “Francis Hutcheson on aesthetic perception and aesthetic pleasure.” The British Journal of Aesthetics 24, no. 3 (1984): 241–55. Doi: 10.1093/bjaesthetics/24.3.241
  • A discussion of the sense of beauty and the feeling of pleasure in Hutcheson.

  • Myers, Joanne E. “Enthusiastic Improvement: Mary Astell and Damaris Masham on Sociability.” Hypatia 28, no. 3 (2013): 533–50. Doi: 10.1111/j.1527-2001.2012.01294.x
  • A discussion of the social philosophy of two early modern women.

  • Nwala, T. Uzodinma. “Anthony William Amo of Ghana on The Mind-Body Problem.” Présence Africaine 4 (1978): 158–65. Doi: 10.3917/presa.108.0158
  • An early attempt to reconstruct Amo’s response to the mind-body problem.

  • Influential paper since it is one of the first to discuss the problems and limits of the standard narrative that contrasts empiricism and rationalism.

  • Noxon, J. Hume’s Philosophical Development. Oxford: Oxford University Press, 1973.
  • A discussion of the development and changes in Hume’s philosophy over his lifetime.

  • O’Higgins, James. Anthony Collins the Man and His Works. The Hague: Martinus Nijhoff, 1970.
  • Still one of the most detailed discussions of Collins’ philosophy and intellectual context in English.

  • O’Neill, Eileen. “HISTORY OF PHILOSOPHY: Disappearing Ink: Early Modern Women Philosophers and Their Fate in History.” In Philosophy in a Feminist Voice: Critiques and Reconstructions, edited by Janet A. Kourany, 17–62. Princeton: Princeton University Press, 1998.
  • Groundbreaking paper demonstrating how women thinkers have been eradicated from the history of philosophy.

  • Pearce, Kenneth L. Language and the Structure of Berkeley’s World. Oxford: Oxford University Press, 2017.
  • Detailed consideration of Berkeley’s divine language hypothesis (that is, the notion that nature is the language of God).

  • Rickless, Samuel C. “Is Shepherd’s pen mightier than Berkeley’s word?” British Journal for the History of Philosophy 26, no. 2 (2018): 317–30. Doi: 10.1080/09608788.2017.1381584
  • Discussion of Shepherd’s criticism of Berkeley.

  • Rickless, Samuel C. Berkeley’s argument for idealism. Oxford: Oxford University Press, 2013.
  • Critically discusses Berkeley’s arguments for idealism.

  • Sapiro, Virginia. A vindication of political virtue: The political theory of Mary Wollstonecraft. Chicago: University of Chicago Press, 1992.
  • One of the first detailed discussions of Wollstonecraft’s political thought.

  • Saporiti, Katia. Die Wirklichkeit der Dinge. Frankfurt a. M.: Klostermann, 2006.
  • Critical examination of Berkeley’s metaphysics.

  • Seppalainen, Tom, and Angela Coventry. “Hume’s Empiricist Inner Epistemology: A Reassessment of the Copy Principle.” In The Continuum Companion to Hume, edited by Alan Bailey and Dan O’Brien, 38–56. London: Continuum, 2012.
  • Looks at exactly how Hume’s ‘copy principle’ (the claim that all ideas are copies of impressions) works.

  • Shapiro, Lisa. “Revisiting the early modern philosophical canon.” Journal of the American Philosophical Association 2, no. 3 (2016): 365–83. Doi: 10.1017/apa.2016.27
  • Critical consideration of the standard narrative arguing for a more inclusive story in terms of figures and issues considered.

  • Shelley, James. “Empiricism: Hutcheson and Hume.” In The Routledge companion to aesthetics, edited by Berys Gaut and Dominic Lopes, 55–68. London: Routledge, 2005.
  • An overview of Hutcheson and Hume’s ‘empiricist’ approach to beauty and aesthetics.

  • Shelley, James R. “Hume and the Nature of Taste.” The Journal of Aesthetics and Art Criticism 56, no. 1 (1998): 29–38. Doi: 10.2307/431945
  • Focuses on the ‘normative force’ in Hume’s conception of taste.

  • Smith, Justin EH. Nature, human nature, and human difference: Race in early modern philosophy. Princeton: Princeton University Press, 2015.
  • Investigates the rise of the category of race in the Early Modern period.

  • Taranto, Pascal. Du déisme à l’athéisme: la libre-pensée d’Anthony Collins. Paris: Honoré Champion, 2000.
  • Discusses Collins’ writings and the question whether he is a (covert) atheist.

  • Thomas, Emily. “Time, Space, and Process in Anne Conway.” British Journal for the History of Philosophy 25, no. 5 (2017): 990–1010. Doi: 10.1080/09608788.2017.1302408
  • Discussion of Conway’s views in relation to Leibniz, arguing that Conway is ultimately closer to Henry More.

  • Townsend, Dabney. Hume’s aesthetic theory: Taste and sentiment. London: Routledge, 2013.
  • Close examination of Hume’s aesthetic theory.

  • Traiger, Saul, ed. The Blackwell Guide to Hume’s “Treatise”. Oxford: Blackwell, 2006. Student guide to Hume’s famous work. Doi: 10.1353/jhi.2016.0017
  • Arguing for the emergence of the standard narrative in 20th century based on its simplicity and aptness for teaching.

  • Walsh, Julie. “Amo on the Heterogeneity Problem.” Philosophers’ Imprint 19, no. 41 (2019): 1–18. Doi: http://hdl.handle.net/2027/spo.3521354.0019.041
  • A discussion of a problem facing Amo’s philosophy, about how the mind and body can be in unison if they are heterogeneous entities.

  • West, Peter. “Why Can An Idea Be Like Nothing But Another Idea? A Conceptual Interpretation of Berkeley’s Likeness Principle.” Journal of the American Philosophical Association 7, no. 4 (2021): 530–48. Doi: 10.1017/apa.2020.34
  • An account of why Berkeley thinks an idea can be like nothing but another idea.

  • West, Peter. “Mind-Body Commerce: Occasional Causation and Mental Representation in Anton Wilhelm Amo.” Philosophy Compass 17, no. 9 (2022). Doi: 10.1111/phc3.12872
  • An overview of secondary literature on Amo’s philosophy of mind so far, and a new reading of how his theory of mental representation works.

 

Author Information

Manuel Fasko
Email: manuel.fasko@unibas.ch
University of Basel
Switzerland

and

Peter West
Email: Peter.west@nulondon.ac.uk
Northeastern University London
United Kingdom

Susanne K. Langer (1895—1985)

Susanne Langer was an American philosopher working across the analytic and continental divide in the fields of logic, aesthetics, and theory of mind. Her work connects in various ways to her central concerns of feeling and meaning.

Feeling, in Langer’s philosophy, encompasses the qualitative, sensory, and emotional aspects of human experience. It is not limited to mere emotional states but includes the entire range of sensory and emotional qualities that humans perceive and experience. Langer argues that feeling is not separate from rationality but, rather, an integral part of human intelligence and creativity.

Unlike the logical positivists with whom she is sometimes associated, Langer argues for an expanded field of meaning. Whereas the early Wittgenstein argues for a very limited field of meaning bounded by strict usage of language, Langer argues that symbolisms other than language are capable of expressing thoughts that language cannot.

Langer’s theory of feeling is closely tied to her theory of art, where she argues that artworks express forms of feeling. Artists use various elements, such as colours, shapes, sounds, and rhythms, to formulate feeling in their work, with each artwork being an art symbol. According to Langer, the artist’s task is to formulate the quality or gestalt of a particular feeling in their chosen medium.

In her broader philosophy of mind, Langer suggests that feeling is a fundamental aspect of human consciousness. She contends that feeling is not limited to individual emotions but is the basis for all forms of human thought, perception, and expression. In this sense, feeling serves as the foundation for higher-level cognitive processes, including symbolic thought and language.

Langer’s legacy includes her influential books on logic, philosophy of art, and theory of mind. Her position, whilst subject to minor terminological changes during her career, remains overwhelmingly consistent over half a century, and the resulting vision is a bold and original contribution to philosophy. Her ideas in the philosophy of art have been engaged with by various philosophers, including Nelson Goodman, Malcolm Budd, Peter Kivy, Brian Massumi, and Jenefer Robinson. In neuroscience and psychology, her notion of feeling, and her conceptual framework of mind, have been made use of by figures including Antonio Damasio and Jaak Panksepp. Overall, Langer’s work has left a lasting impact on philosophy, with her insights into the role of feeling in human life continuing to resonate with contemporary scholars and researchers.

Langer’s inclusiveness and rigor have recommended her thought to the generations since her passing. In the arts and biosciences her ideas are becoming more widely known. Langer’s work is a model of synthetic conceptual thinking which is both broad and coherent.

Table of Contents

  1. Life and Work
  2. Feeling
  3. Logic
  4. The ‘New Key’ in Philosophy
  5. Theory of Art
  6. Theory of Mind
  7. Political Philosophy and Contribution to the ‘Modern Man’ Discourse
  8. Legacy
  9. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life and Work

Susanne K. Langer (née Knauth) grew up on the Upper West Side of Manhattan, New York. The second of five children born to affluent German immigrants, Langer spoke German at home and French at school, and later claimed she only felt fully comfortable with English by the time she reached high school. Formative experiences included family summer vacations to Lake George and family music recitals in which Langer played the cello.

She attended Radcliffe College from 1916 and was awarded her doctorate in 1926 (Langer took the same classes as the male students at Harvard during this time, although the men were taught separately; Harvard would not award men and women degrees on an equal basis until 1975). During her time at Radcliffe, Langer studied logic under Henry Sheffer, who introduced her to the ideas of Russell and the early Wittgenstein. She also studied under Alfred North Whitehead, attending the lecture series which would become Process and Reality (1929). Whitehead would also supervise Langer’s doctoral thesis and write the introduction to her first book, The Practice of Philosophy (1930). Sheffer published very little, and Langer’s second book, An Introduction to Symbolic Logic (1937), is presented as putting forward Sheffer’s approach to logic, something Sheffer himself never did.

In 1921 Langer married William Langer, who would go on to become a scholar of European history, and the two spent much of their first year of marriage in Vienna. Langer lived the rest of her life in America, though she returned to Europe with her family in the summer of 1933 for a European tour and to visit Edmund Husserl in Schluchsee, Germany. The couple had two children but divorced in 1941, with Langer never remarrying.

In addition to the intellectual influences of Whitehead, Sheffer, and Wittgenstein, Langer was strongly taken by the ideas of Ernst Cassirer; they met and corresponded, with Langer going on to translate Cassirer’s Language and Myth (1946) into English.

Langer’s third book, Philosophy in a New Key (1942), sold more than half a million copies. Arguing that there had been a shift in many fields towards recognition of the role of the symbolic in human life, ritual, art and language, the book brought together findings from many areas and offered a conceptual framework within which to understand, in particular, language and music.

After her divorce, Langer moved to New York City and stayed there for a decade as she wrote her theory of art, Feeling and Form (1953). Langer had part-time and temporary positions at various academic departments, including Radcliffe (1926–42) and Columbia (1945–50), but she did not have a full-time academic post until 1954, when she took up the chair of the philosophy department at Connecticut College for Women. From 1962, she was funded by a grant from the Edgar J. Kaufmann Foundation for her major work on theory of mind, at which point she retired to concentrate on her writing. After this, she split her time between Old Lyme, Connecticut and summers in a wood cabin in Ulster County, New York. Due to ill health and, in particular, failing eyesight, she published a curtailed version of her final, third volume of Mind in 1982. She died in 1985.

2. Feeling

Langer’s notion of feeling underpins all her other work. Feeling tells organisms how they are doing in various categories of need, both internal and external. As Langer puts it:

Feeling is the constant, systematic, but private display of what is going on in our own system, the index of much that goes on below the limen of sentience, and ultimately of the whole organic process, or life, that feeds and uses the sensory and cerebral system. (Langer, 1967)

Langer’s basic analytical unit of life is the act, which she considers in terms of phases. Langer repeatedly acknowledges the futility of drawing hard dividing lines in the natural sciences. Her preference instead is to find centres of activity which hold together because they are functional. An act is a functional unit, and can be considered on dramatically different scales, from cell to organ, to organism and ecosystem. Feeling is anything that can be felt, which is to say that it is a felt phase of an act. Feeling is the mark of at least primitive mentality or mentation, though not, at least in known non-human animals, mind. As for the relationship of feeling to logic, Langer argues for an expanded logical field of meaning which includes feeling. She treats feeling not as an irrational disturbance in an organism but as the origin of logic, writing that only a highly emotional animal could have developed the methods of logic. Lastly, there are unconscious processes, but there is no unconscious feeling: whatever can be felt is felt consciously. The claim that anything that can be felt is a felt phase of an act emphasises this.

Langer describes how a phase is not a thing but a mode of appearance, explaining that when iron, for instance, is heated to become red-hot, redness is a phase, a mode of appearance, rather than being a new entity. When the iron is cooled the redness vanishes; Langer claims that feeling is like this redness of the iron, mere appearance that has no independent existence. This is not to deny the importance of these appearances, however, since they are what the organism has to guide its negotiation with both its internal and external environment. Langer considers the notion of ‘feelings’ to be a reification: the process of feeling does not result in ontologically distinct products.

To the extent that an organism is able to react to a stimulus, it is able to feel. There are processes that may appear animated, such as leaves blowing along a path or water bubbling up from a geyser, but these processes are entirely dictated by the external environment; the leaves and the water are not active agents seeking to maintain certain balances. If the stimuli in these examples ceased (the wind for the leaves, the heat for the geyser), the animation would immediately cease too.

Animals feel: they feel their internal and external environment, they feel their own responses to the environment, and they feel as the environment responds to their actions. On human feeling, Langer writes:

Pure sensation—now pain, now pleasure—would have no unity, and would change the receptivity of the body for future pains and pleasures only in rudimentary ways. It is sensation remembered and anticipated, feared or sought or even imagined and eschewed that is important in human life. It is perception molded by imagination that gives us the outward world we know. And it is the continuity of thought that systematizes our emotional reactions into attitudes with distinct feeling tones, and sets a certain scope for an individual’s passions. In other words: by virtue of our thought and imagination we have… a life of feeling. (Langer, 1953)

Langer’s ideas are distinguished from those of the Classical Associationists; feeling is far from being a passive or neutral process, as Langer here stresses the feedback loop of imagination and perception in giving us access to the world. In stressing the continuity of the life of feeling, Langer is stressing the continuity of consciousness—not entirely unbroken in human experience, but normatively present. Feeling, for Langer, is the driving force of consciousness, motivating, among other functions, imagining and seeking and remembering.

This view of feeling leads to a particular view of consciousness: not as an emergent property of complex organisms such as humans but as a continuum along which there are simpler and more complex consciousnesses; whatever being is capable of feeling has at least a primitive awareness of, at a minimum, its sensory environment. Langer considers these very simple organisms, therefore, to be feeling, which is to be constantly attaining psychical phases of sensory acts, and that this constitutes mental activity. Langer describes this as mentation until it reaches the high development that it does in humans, which is the point at which this activity passes the threshold to be considered mind.

The obvious question arising from this is what, if not consciousness, accounts for the gulf between animal mentation and the human mind. For Langer, the answer is symbolic thought.

Many animals are capable of reacting appropriately to signs (in later works Langer calls these signals), but, in known examples, only humans respond symbolically to the environment. A sign or signal, for Langer, is a symptom of an event; this can be natural, as in footprints signifying that a person or animal has walked a certain way, or artificial, as in a school bell signifying that the end of the school day has come. Symbols, by contrast, call attention primarily to concepts rather than objects; Langer writes that if someone says “Napoleon,” the correct response is not to look around for him but to ask “What about Napoleon?” The symbolic therefore allows people to imagine non-actual situations, including other times and places and the speculative.

Langer considers both emotion and logic to be high developments of feeling. Langer writes that logic is a device for leading people between intuitions, these intuitions being meaningful personal understandings (see the next section for a fuller discussion of Langer’s logic). Langer does not have a fully developed theory of emotion, though she refers to emotional situations in individual people and groups not infrequently. Her notion of feeling is certainly compatible with the use that is made of it by scientists such as Jaak Panksepp and Antonio Damasio, though it need not necessitate their ideas of emotion.

Langer’s notion of art concerns feeling as well: she argues that artworks present forms of feeling for contemplation. The purpose of art is to render pre-reflexive experience available to consciousness so that it can be reflected (rather than merely acted) upon. Knowledge of feeling captures what artworks are meant to help us with educationally, socially, and cross-culturally. We have access, in life and in art, to forms only, from which we extrapolate meaning. In life, the forms of feeling are too embedded in practical situations for us to contemplate them. When artworks are viewed as art, the experience of them is disinterested: the forms are isolated from practical situations.

Despite Langer’s emphasis on embodiment, she also clearly emphasises cognitive evaluations. As in many other areas, Langer’s work can be seen to bridge perspectives that are often considered incompatible: in this case, that emotion is either fundamentally embodied or fundamentally cognitive:

Certainly in our history, presumably for long ages – eons, lasting into present times – the human world has been filled more with creatures of fantasy than of flesh and blood. Every perceived object, scene, and especially every expectation is imbued with fantasy elements, and those phantasms really have a stronger tendency to form systematic patterns, largely of a dramatic character, than factual impressions. The result is that human experience is a constant dialectic of sensory and imaginative activity – a making of scenes, acts, beings, intentions and realizations such as I believe animals do not encounter. (Langer, 1972)

Langer here clearly believes cognitive evaluations matter—beliefs, whether about ghosts and monsters and gods or about why the bus is late and what might be done about it, and especially expectations, which determine to a surprising extent what is perceived. Langer also stresses here the dynamic real-time mixing of sensory and imaginative activity, disposing the holder of these expectations towards certain kinds of experience.

This emphasis on feeling in Langer has clear parallels to the work of her contemporary John Dewey, who focused similarly on experience. These parallels have been drawn out most thoroughly by Robert Innis in his monograph on Langer.

3. Logic

Langer’s most distinctive contribution to the philosophy of logic is her controversial claim of a presentational logic that operates differently from, but is no less reasonable than, traditional logic. This presentational logic functions by association rather than by logical implication (as in traditional logic) or causality; nonetheless, Langer considers it also to be a logic because presentational forms contain relational patterns. Langer first put forward this idea in her doctoral dissertation in 1926, ‘A Logical Analysis of Meaning’, in which Langer investigated the meaning of meaning from the starting point that the dictionary definition of meaning seems to have little to do with meaning in art or religion.

Langer developed this idea further in her first book, The Practice of Philosophy (1930), in which she also situated philosophy in relation to science. Arguing that analysis is an indispensable part of any complex understanding, she distinguished between the empirical sciences which pursue facts and the rational sciences which instead pursue meanings—the latter exemplified by mathematics and logic. These rational sciences, Langer claimed, are the foundation of the ‘higher’ and more concrete subjects of ethics and metaphysics. Langer points out, for instance, that it was in studying numbers that philosophers gained the understanding they needed to approach more accurately the concept of infinity, and that Zeno’s paradox—that matter in its eternal motion is really at rest—is solved by a clear understanding of the continuum.

Aspects of Langer’s views here are heavily influenced by logical positivism, and this impression is likely to be strengthened in the reader’s mind by Langer’s positive discussion of Bertrand Russell and of the early Wittgenstein of the Tractatus. One feature that Langer shares with logical positivism, for instance, is her view that philosophy is a critique of language. But even in this first book, published at approximately the peak of logical positivism’s popularity, Langer explicitly distinguishes her views from those of logical positivism. Already at this point, Langer is insisting on the importance of an interpretant in the meaning relation, reinserting the aspect of personal experience which logical positivism had carefully removed.

One of Langer’s contributions to the logic of signs and symbols is the claim that the semantic power of language is predicated on the lack of any rival interest in vocables. Langer uses the example of an actual peach to replace the spoken word ‘plenty’, and she argues that we are too interested in peaches for this to be effective: the peach would be both distracting and wasted. It is the irrelevance of vocables for any other purpose than language that leads to the transparency of spoken language, where meaning appears to flow through the words.

Langer’s textbook, An Introduction to Symbolic Logic (1937), was written expressly to take students to the point where they could tackle Russell and Whitehead’s Principia Mathematica (1910–13). This textbook contains not only instruction on the formal aspects of symbolic logic, Boolean as well as that of Principia Mathematica, but also extensive philosophical discussion on metaphor, exemplification, generalization and abstraction. As well as achieving this task, the book functions as an introduction to how Sheffer practiced logic, since he did not publish such a text.

Sheffer had followed Josiah Royce in considering logic to be a relational structure rather than dealing solely with inference. Langer takes this notion and follows it through its implications, paying special attention to the distinctions between types of logic and meaning.

From one perspective, Langer’s view is very radical, since expanding the notion of meaning to logical resemblance incorporates huge swathes of life which had been dismissed by many of the thinkers she cites most, such as Russell and the early Wittgenstein, as nonsense. However, this emphasis on the structure of relations can also be seen as a form of hylomorphism, connecting Langer’s views to a tradition which stretches back to Aristotle.

4. The ‘New Key’ in Philosophy

Langer’s next book, Philosophy in a New Key (1942), might be thought of as her central work, in that it serves as a summation and development of her previous work in logic and an expanded field of meaning, but also gives early formulation to her ideas in all the fields which would preoccupy her for the rest of her career, including art and theory of mind, but also touching on linguistics, myth, ritual, and ethnography.

In the book Langer claims that figures as diverse as Freud, Cassirer, Whitehead, Russell, and Wittgenstein are all engaged in a shared project to understand the nature of human symbolization. Along the way, Langer touches on a wide variety of subjects of philosophical interest. Her theory of depiction, for instance, is given, along with a speculative account of the early development of language, and the relation of fantasy to rational thought.

Langer justifies the exploration of all these different topics in a single text by relating them all to a single idea: that across a wide range of humanities subjects there had been, in the late 19th and early 20th centuries, a fundamental shift in the intellectual framework within which work was done in these disciplines and that this shift was related in every case to an expanded appreciation of the nature of human symbolization. Langer describes this shift using the musical metaphor of a key change—hence Philosophy in a New Key. In her introduction, Langer offers a brief account of previous shifts in philosophy such as, for instance, the Cartesian notion of looking at reality as a dichotomy of inner experience and outer world.

Langer refers to her theory of the symbolic as a semantic theory, which proved controversial, as her theory includes but is not limited to language. This is the expanded field of meaning that Langer sought to describe and for which she provided conceptual scaffolding. Where Wittgenstein’s Tractatus Logico-Philosophicus famously ends with the statement that “whereof one cannot speak, thereof one must be silent” (Wittgenstein, 1922), Langer argues that language is only one of many symbolisms, albeit one of particular importance, and that other symbolisms, including myth, ritual, and art, can form thoughts which language cannot. Whether or not Langer is correct depends not only on whether the semantic can be broadened in this way (so that, for instance, the semantic does not need a corresponding syntax) but also on whether there are thoughts which language is not capable of expressing.

Langer’s distinction between discursive and presentational symbolic forms in Philosophy in a New Key has received extensive discussion. Briefly, discursive forms are to be read and interpreted successively, whereas presentational forms are to be read and interpreted as a whole. Another important difference is that in discursive symbolisms the individual elements have independent meaning whereas in non-discursive symbolisms they do not; words have independent meaning even when isolated from a wider text or utterance, whilst lines, colours and shapes isolated from an artwork do not have independent meaning.

Scientific language and mathematical proofs are straightforwardly discursive, whereas photographs and paintings are straightforwardly presentational. Some less intuitive but still important applications of this distinction exist, however, with novels and poems, for instance, being considered presentational forms by Langer; despite being formed with language, the artwork functions as a whole, and cannot be judged without considering the whole. On the other hand, graphs and charts function discursively, despite being visual.

Langer’s discussion of ritual is related to her careful reading of Ernst Cassirer, whom Langer met and corresponded with, and who considered Philosophy in a New Key to be the book on art which corresponded to his three-volume Philosophy of Symbolic Forms (1923–29). Langer would translate and write the introduction for the English-language edition of Cassirer’s Language and Myth (1946). Considering rain dances, for instance, Langer discusses them neither as a dishonest trick of tribal seniors nor as magic. Instead, the group activity is seen as symbolic:

Rain-making may well have begun in the celebration of an imminent shower after long drought; that the first harbinger clouds would be greeted with entreaty, excitement, and mimetic suggestion is obvious. The ritual evolves while a capricious heaven is making up its mind. Its successive acts mark the stages that bring the storm nearer. (Langer, 1942)

Langer notes, moreover, that participants do not try to make it snow in mid-summer, nor to ripen fruits entirely out of season. Instead, the elements are either present or imminent, and participants encourage them.

Langer’s treatment of music in the book is notable for its defence of the critic Clive Bell’s famous description of art, in Art (1914), as ‘significant form’. Langer argues that the sense in which this is true is that music is a symbolic form without fixed, conventionally assigned meanings; she calls music an “unconsummated symbolism” (Langer, 1942). Langer dismisses both the hedonic theory of art and the contagion theory, and she argues instead that music expresses the composer’s knowledge of feeling, an idea she attempts to elucidate and clarify but which she attributes to numerous European composers, critics, and philosophers including Wagner, Liszt and Johann Adam Hüller.

Philosophy in a New Key might also be thought to be central because Langer’s later theory of art is explicitly introduced on its cover as being derived from Philosophy in a New Key, and subsequently her Mind trilogy is introduced as having come out of her research on living form that informed her philosophy of art. Langer herself frequently refers back to Philosophy in a New Key in her later works, whereas The Practice of Philosophy never went beyond the first edition, with Langer in her later life turning down requests by the publisher to put it back in print.

The book was unexpectedly popular. Despite its enthusiastic popular reception, the book was largely neglected by the academic community at the time. The book’s success may partly explain Langer’s relative prominence within musical aesthetics compared to her relative neglect in the aesthetics of other artforms since her treatment of music in Philosophy in a New Key is much fuller than her brief and scattered comments on other artforms. Langer was well aware of this, and indeed the subsequent work, Feeling and Form, gives separate and sustained attention to a wide variety of artforms.

5. Theory of Art

After Philosophy in a New Key’s popular success in giving an account of music, Langer generalised its account to a theory of all the arts. Feeling and Form (1953) is split into three major parts: the first deals with introductory matters; part two, by far the largest part of the book, gives separate and sustained attention to each of the artforms dealt with in turn, including painting, sculpture, architecture, music, dance, poetry, myth and legend, prose fiction, comedic drama and tragic drama (there is also a short appendix on film at the end of the book); then, in part three, Langer gives her general account.

Helpfully, in part three she compares her ideas in detail to those of R. G. Collingwood, whose Principles of Art (1938) had appeared just fifteen years before; this very much helps to locate Langer’s position. The final chapter of Feeling and Form considers art from the point of view of its public, considering the educational and social role of art, in a way that both ties Feeling and Form into the sections on ritual and myth in Philosophy in a New Key and anticipates some arguments Langer would make in Volumes 1 and 3 of Mind. The theory of art presented here is based primarily on Feeling and Form, but also includes elements and quotes from the two other later books in which Langer discusses art at length: Problems of Art (1957) and Mind: Volume 1 (1967).

Langer’s theory states that artworks present forms of feeling. This is possible because both feeling and artistic elements are experienced as qualitative gradients; the forms of each are congruent. Feeling may be complex or simple—more or fewer gradients can be experienced simultaneously; artworks, similarly, may present many gradients at once or very few. In either case, there is a unity to the feeling or artwork—an overall quality. It is this quality of feeling that an artist tries to express when creating a work, negotiating the artistic elements.

Artists work by weighing qualities in the forming artwork, a formulation that seems to capture practices as diverse as traditional easel painting or the selection of ready-mades, a composer writing a symphony or a rock band writing a song, or theatre directors giving feedback to actors on blocking or actors improvising a scene of street theatre. “Artistic forms,” Langer writes, “are more complex than any other symbolic forms we know. They are, indeed, not abstractable from the works that exhibit them. We may abstract a shape from an object that has that shape, by disregarding color, weight and texture, even size; but to the total effect that is an artistic form, the color matters, the thickness of lines matters, and the appearance of texture and weight.” (Langer, 1957) The value of art is intrinsic to the work, rather than lying in its use as a communication medium, and it is the sensuous qualities of the work which give the viewer access to the meaning (literary work being experienced in the sensuous imagination).

Langer holds that artworks are each a symbol expressive of human feeling. By expression (literally, to press out) Langer means projection; she uses the example of horns projected from the head of a reindeer. An art object is therefore a projection not of spontaneous feeling but of the artist’s knowledge of feeling. Langer’s Expressivism, moreover, does not insist on melodrama and high emotion. Whilst it could be argued that Langer’s concept of expression differs too significantly from others in the Expressivist tradition to be called such, Langer herself writes that she, Croce, and Collingwood are embarked on a shared project, as are Bell, Fry, and Cassirer. (Langer, 1953) So long as it is remembered that Langer does not claim that artworks express emotion, the grouping seems fair; Langer’s account concerns expressive form articulating knowledge of feeling rather than a contagious and spontaneous outpouring.

Langer writes that artists try to express a unitary gestalt:

What any true artist – painter or poet, it does not matter – tries to “re-create” is not a yellow chair, a hay wain or a morally perplexed prince, as a “symbol of his emotion,” but that quality which he has once known, the emotional “value” that events, situations, sounds or sights in their passing have had for him. He need not represent those same items of his experience, though psychologically it is a natural thing to do if they were outstanding forms; the rhythm they let him see and feel may be projected in other sensible forms, perhaps even more purely. When he finds a theme that excites him it is because he thinks that in his rendering of it he can endow it with some such quality, which is really a way of feeling. (Langer, 1967)

Langer believes that all people feel, but that artists have developed a special sensitivity to feeling. When working in an artistic mode, they seek to articulate what they have felt, so that the resulting artwork seems to possess the same quality as the feeling the artist has in mind (remembering that consciousness is a fundamentally embodied process for Langer, feeling raised above the “limen of sentience”). Langer stresses that the artist need not have experienced the feeling, but they must be capable of imagining it.

Langer distinguishes between what she calls primary and secondary illusions. A primary illusion is what an artform stably presents—so painting, sculpture, and architecture must present virtual space whilst a piece of music must present virtual time. This is the contextual framework within which artistic elements exist. Further primary illusions include virtual powers (dance) and virtual memory (literature). Primary illusions do not come and go, and are not a presentation of gradients; because of this they are not the site of particular interest in most artworks—Langer, for instance, criticises the work of Russian artist Malevich for generating a sense of space in his “magic squares” but nothing else. Secondary illusions, by contrast, present gradients; artworks can function because of the congruence between gradients of feeling and gradients in artworks. Gradients are projected into artworks, and while there are no set rules for how this is done, it is possible to analyse an artwork to see how a work has been achieved. By stressing the absence of rules of projection, what Langer means is that the results of these analyses cannot be generalised and reapplied—this is one major way in which art images are distinguished from models, which generally do have a single stable rule of projection; the salience of a gradient depends on the artwork. The relationship of secondary illusions to primary illusion is that of feeling to a life of feeling.

Feeling and Form did not find the success of its predecessor, yet it has been mentioned or taught in some aesthetics programmes in the UK and US; perhaps surprisingly, it also seems to have featured in university aesthetics syllabuses in China and India. Feeling and Form has also been used by philosophers seeking to put forward accounts of particular artforms. Robert Hopkins, for instance, has offered a limited defence of her ideas of virtual kinetic volume in sculpture as found in Feeling and Form.

Philosopher Paul Guyer has suggested, in his sub-chapter on Langer in his History of Modern Aesthetics, that the reason for the neglect of Feeling and Form may be timing; the publication of Feeling and Form in 1953 coincided with the publication of Wittgenstein’s Philosophical Investigations, the latter text preoccupying philosophy departments for decades. Accounts of art such as Langer’s which offered a single function which all artworks were meant to perform, expression in Langer’s case, were not in keeping with the intellectual fashion for proceduralist theories, such as George Dickie’s institutional theory of art or Arthur Danto’s historical account.

Langer produced two other books in this phase of her career. Problems of Art (1957) is a transcribed and edited collection of Langer’s talks on art. Because the talks were given to different sorts of audiences, from general and non-specialist to technical, and in correspondingly different registers, her position on many points emerges more clearly. She had also had four years since the publication of Feeling and Form in which to work many of her ideas into a clearer form. The book also contains a reprint of her important and otherwise difficult-to-find essay from 1951 in honour of Henry Sheffer, ‘Abstraction in Science and Abstraction in Art’.

Secondly, Langer produced Reflections on Art: A Source Book of Writings by Artists, Critics, and Philosophers (1958). This latter book is a collection of writings on art which Langer considered to be both important and otherwise hard to find. Whilst invaluable in tracing influences on Langer’s ideas, Reflections on Art is not particularly helpful as an introductory text because of its focus on, in particular, metaphor and expression, at the expense of a wider survey of writings on art.

6. Theory of Mind

In the first volume of Mind (1967), Langer sets out the problem as she sees it: the mind-body duality resists efforts to solve it because it is built on faulty premises, that mind is not metaphysically distinct from body, and that behaviorism in psychology has previously led to an avoidance of the most pressing issues of the discipline. To tackle this, Langer puts forward the thesis, which she planned to substantiate over three volumes, that the whole of animal and human life, including law, the arts, and the sciences, is a development of feeling. The result is a biologically grounded theory of mind, a conceptual structure within which work in the life sciences can be integrated.

Furthermore, Langer claims that it is possible to know the character of the mind by studying the history of art, which shows the development and variety of feeling in its objectified forms. Langer proceeds, then, to first take issue with the ‘idols of the laboratory’ (jargon, controlled experiment, and objectivity), claiming that each of these has its place but has held back progress in the life sciences. Claiming that each of these weaknesses is philosophical, Langer argues that scientific knowledge ultimately aims to explain phenomena, and that at a pre-scientific level work is motivated and guided by images which present the phenomenal character of a particular dynamic experience. Images are susceptible to analysis in a way that feeling itself is not. Here Langer calls art symbols a “systematic device whereby observations can be made, combined, recorded and judged, elements distinguished and imaginatively permuted, and, most important, data exhibited and shared, impressions corroborated.” (Langer, 1967) This is material art history seen as a data set, a treasure trove for psychological research.

Langer goes on to explore artistic projection, the artistic idea, and abstraction in art and science before considering living form—that functional artworks need a semblance of livingness, something Aristotle already remarked upon as the single most important characteristic of art. The image of mind that art provides can be used by those studying the mind to test the validity of their accounts.

This then sets up Langer’s discussion of acts and the growth and evolution of acts. Langer coins a new term, pressions, to name the class of relations which hold between acts and situations, such as impression, expression and suppression. Langer sees the evolution of life as fundamentally the evolution of acts, and sees the dominance of both mechanical models and imputing an agent such as God or Nature to situations as antithetical to serious understanding of this process.

The second volume of Mind deals with a single section of her project, ‘The Great Shift’ from animal mentation to human mind. Starting with plankton, Langer considers progressively more complex examples, considering topics including instinct and the growth of acts. Langer seeks neither to deny animal feeling nor to anthropomorphise animal feeling and behaviour. Langer draws on Jakob von Uexküll’s idea of animal ambient: that differing sensory abilities lead to different animals living in different experiential spaces even if they share the same actual space.

Langer discusses the migration of birds and other animals, arguing that animal travels should be seen as round trips, and migration as an elaboration of the same: a round trip with a long stopover. Also discussed are the parent-young relations of dolphins and the alleged use of language by chimpanzees. Langer brings a large amount of empirical material to bear on these issues, before moving on to consider the specialisation of man. She argues that Homo sapiens has been successful because of specialisation, against the argument that the species is a generalist. Langer considers the shape of the human foot, arguing that it offers no evidence that humans ever lived entirely in trees. She discusses the role of the foot’s shape in facilitating bipedality, as well as upright posture and the larger brain, and she considers the hand as a sense organ. Langer then stresses a hugely important feature of the human brain: that it is able to finish impulses on a virtual level instead of needing to enact these in the actual world. This liberates the brain for conceptual thought.

Langer discusses dreaming and argues that the evidence suggests the brain requires constant activation, which is what has driven its increase in size and function. She then links the biologically grounded model of mentation she has drawn so far with the components of symbolization, discussing how mental abstraction is affected by memory, the origin of imagination, and the origins of speech in expression rather than communication.

Langer claims then that speech is necessary for social organisation and that all natural languages are socially adequate. Langer discusses the dangers of the imaginative capacity of humanity, and the feeling of reality, before discussing morality—a concern she notes is peculiar to man.

The final volume of Mind is not what Langer had planned: it was to have included an epistemological theory and a metaphysics. Due to poor health and failing eyesight, Langer left the final section of the book with only a brief outline.

What the third volume accomplishes, however, is to make connections between the model of man as the symbolic animal, which had been achieved by the end of the second volume, and various anthropological data relating to tribes, city states, and other societies. The focus of the third volume is considerably broadened to accommodate symbolic mind in society, and Langer by necessity only offers glimpses into this; Adrienne Dengerink Chaplin calls it a “holistic, biologically based, philosophical anthropology.” (Dengerink Chaplin, 2019)

Langer also offers a view of philosophy of religion, that “even as the power of symbolic thought creates the danger of letting the mind run wild, it also furnishes the saving counterbalance of cultural restraint, the orientating dictates of religion.” (Langer, 1953) A religious community and religious symbols keep a rein on individuation, strengthening social bonds; the loss of these religious frameworks in the modern world is a large part of the disorientation of modern life.

As the trajectory of her intellectual career intersected with Wittgenstein’s at several important junctures, it is of interest that she gives a brief verdict on his Philosophical Investigations: that it is a despairing resort to behaviourism.

7. Political Philosophy and Contribution to the ‘Modern Man’ Discourse

Langer’s contribution to political philosophy has received little attention, and her interest in it is certainly minor compared to her substantial interests in logic, the arts, and theory of mind. It consists of chapters on the structure of society in Philosophy in a New Key and the third volume of Mind, and, most notably, articles on the political danger of outdated symbolism governing societies in ‘The Lord of Creation’ (1944), and on what might be done to tackle the persistence of international war in ‘World Law and World Reform’ (1951).

‘The Lord of Creation’ essentially presents the arguments of Philosophy in a New Key through the lens of political philosophy and sociology. Symbolisation, Langer argues, is the source of the distinctiveness of human society—whilst animals, intelligent or not, live very realistic lives, humans are characteristically unrealistic: “magic and exorcism and holocausts—rites that have no connection with common-sense methods of self-preservation.” (Langer, 1944) This is because, Langer claims, people live lives in which there is a constant dialectic of sensory and imaginative activity, so that fantastic elements permeate our experience of reality: “The mind that can see past and future, the poles and the antipodes, and guess at obscure mechanisms of nature, is ever in danger of seeing what is not there, imagining false and fantastic causes, and courting death instead of life.” (Langer, 1944) This human condition has become a human crisis, according to Langer, because scientific progress has led to such upheavals in human living, especially in terms of the symbols which previously gave a shared context to human life.

Industrialisation, secularisation and globalisation have within two centuries, and in many places less, led to a poverty in the governing symbols available to humanity, according to Langer. People are now living together without overarching societal ties of religion or ethnicity, and are left with the vague notion of nationality to unite them, a concept Langer has little patience for, considering it to be a degraded tribalism:

At first glance it seems odd that the concept of nationality should reach its highest development just as all actual marks of national origins – language, dress, physiognomy, and religion – are becoming mixed and obliterated by our new mobility and cosmopolitan traffic. But it is just the loss of these things that inspires this hungry seeking for something like the old egocentric pattern in the vast and formless brotherhood of the whole earth. (Langer, 1944)

The problem is not merely industrial warfare, for Langer, but industrial warfare at a time when ‘modern man’ is simultaneously symbolically impoverished.

‘World Law and World Reform’ is a densely argued twelve pages; Langer argues that whilst civil war is a failure of institutions, and as such not eradicable, international war is, by nature, institutional. What she means by this is that the power of nation states is backed up by the threat and use of force, and it is the display and use of this force which enables diplomacy. Langer dismisses the notion of popular demand for war, arguing that it is diplomats (here she lists kings, presidents, premiers, other leading personages and their cabinets) who prepare and make war: “The threat of violence is the accepted means of backing claims in the concert of nations, as suit and judgement are in civil life.” (Langer, 1951)

Langer argues that this situation is the result of an essentially tribal philosophy of government, which did relatively little damage in the past, but which has the potential to end human life on earth since the invention of atomic weapons. Her solution is the creation and empowerment of a world judiciary, which would be invested with the power to adjudicate and enforce its decisions. She acknowledges that the United Nations is the most notable international institution of her era and lists five reforms which would make it suitable to perform the role of this world judiciary: “1) Extend membership to all nations; 2) Make the General Assembly a legislative body with power to adopt a constitution; 3) Give the World Court the power of summons, and make its decisions binding; 4) Set up a high secretariat (or other executive) to administer world interests; 5) Internationalize all armed force, setting up a federal guard (not enlisted by national units) and allowing the several nationals national police guards of their own, for domestic use.” (Langer, 1951)

Langer is not optimistic about these steps happening in short order, but she argues that historical parallels exist, and that the steps need not happen in one go and can be worked towards as a far-sighted goal. Her historical parallels are action to combat the Black Death and, later, to end legal child labour. In both of these situations, a pre-existing social malady became intolerable due to social changes which exacerbated it, and it was this which prompted social reform. Langer argues that, similarly, properly constituted world courts could bring an end to international war.

8. Legacy

Because of Langer’s many temporary academic positions, and her focus on research instead of teaching from the mid-1950s onwards, her legacy is mainly to be found in her publications, especially books, rather than in direct influence on students. Having said this, numerous individuals who would go on to be influential in their fields studied with Langer, including artist Eva Hesse and philosopher Arthur Danto. Danto would write the preface to the abridged version of Mind.

Langer herself is a subject of growing interest, with research being undertaken into her life, career, and index card system. The Susanne K. Langer Circle is an international group of scholars with interest in Langer’s work and life and is affiliated with Utrecht University. It hosted the first international conference on Langer’s work in 2022.

Langer’s textbook Introduction to Symbolic Logic was the first introductory book on the subject and made the methods of symbolic logic much more accessible. Randall Auxier has published an updated version of this with many more exercises and expanded discussion.

In philosophy of art, Langer’s ideas on expression have been engaged with by a range of prominent thinkers in the philosophy of music, including Malcolm Budd, Peter Kivy, and Jenefer Robinson. Nelson Goodman’s positions on many issues, in particular those he discusses in Languages of Art (1968), are influenced by Langer’s ideas, something Goodman half acknowledges in his introduction, though Goodman somewhat disingenuously cites Langer directly only as Cassirer’s translator.

Philosopher Brian Massumi has engaged with Langerean thought, particularly her work in Feeling and Form, discussing her ideas on motif and, especially, semblance, writing “Langer has probably gone further than any other aesthetic philosopher toward analyzing art-forms not as “media” but according to the type of experiential event they effect.” (Massumi, 2011) Massumi’s reading of Langer has been a very influential reference for younger philosophers engaging with her thought.

Jaron Lanier, the ‘father of virtual reality’, attributes the term ‘virtual world’ to Langer; computing and virtual reality pioneer Ivan Sutherland had the Langer of the Feeling and Form era in mind. Here is the first reference to a virtual world in Langer. She is discussing architecture, in particular how one nomadic camp may be set up in the same geographical area where one from another culture used to be, but the sense is extremely evocative when considered in light of virtual reality:

A place, in this non-geographical sense, is a created thing, an ethnic domain made visible, tangible, sensible. As such it is, of course, an illusion. Like any other plastic symbol, it is primarily an illusion of self-contained, self-sufficient, perceptual space. But the principle of organization is its own: for it is organized as a functional realm made visible —the center of a virtual world, the “ethnic domain,” and itself a geographical semblance. (Langer, 1953)

Lanier made the change from virtual world to virtual reality, but the fundamental notion is Langerean. Pioneering media theorist Marshall McLuhan similarly seems to have had Langer in mind, occasionally citing her, when considering how media reshapes and reconstitutes (again, the above quote is suggestive here when considering McLuhan’s famous dictum “the medium is the message”).

In neuroscience, several notable figures have referred in print approvingly to Langer’s ideas on feeling, including Jaak Panksepp, Gerald Edelman, and Antonio Damasio. The latter refers to his notion of background feeling as being what Langer describes, though he arrived at it independently. In psychology, Fred Levin writes that Langer anticipated by decades the notion of feeling that the biosciences would adopt.

9. References and Further Reading

a. Primary Sources

  • A Logical Analysis of Meaning, doctoral thesis, Radcliffe College, 1926.
    • Unpublished thesis making the case for an expanded understanding of meaning which includes religion and the arts, argues that philosophy is the clarification of concepts.
  • The Practice of Philosophy. New York: Henry Holt, 1930.
    • Explains Langer’s perspective on what it is to do philosophy, and its distinction from and relation to other fields, including science, mathematics and logic, and art.
  • An Introduction to Symbolic Logic. New York: Allen and Unwin, 1937. Second revised edition, New York, Dover, 1953.
    • Textbook aiming to take beginners to the point of being able to tackle Russell and Whitehead’s Principia Mathematica.
  • Philosophy in a New Key: A Study in the Symbolism of Reason, Rite and Art. Cambridge, MA: Harvard University Press, 1942.
    • Langer’s most influential book—drawing together researches in psychology, art, ritual, language and logic to claim that there had been a recent philosophical shift predicated on an expanded awareness of the symbolic.
  • ‘The Lord of Creation’. In Fortune Magazine (1944).
    • Popular treatment discussing how the power of symbolisation is both a strength and source of the precariousness of human society.
  • ‘Abstraction in Science and Abstraction in Art’. In Structure, Method and Meaning: Essays in Honor of Henry M. Sheffer, edited by Paul Henle, Horace M. Kallen and Susanne K. Langer, 171–82. New York: Liberal Arts Press, 1951.
    • Defends the thesis that scientific abstraction concerns generalisation whilst artistic abstraction specifies unique objects which are forms of feeling.
  • ‘World Law and World Reform’. The Antioch Review, Vol. 11, No. 4 (Winter, 1951).
    • Langer’s most sustained political philosophy defending the implementation of empowered world courts.
  • Feeling and Form: A Theory of Art Developed from Philosophy in a New Key. New York: Charles Scribner’s, 1953.
    • An Expressivist theory of art which discusses numerous artforms in detail before generalising the conclusions.
  • Problems of Art: Ten Philosophical Lectures. New York: Charles Scribner’s, 1957.
    • Accessible collection of lectures to different audiences on art topics.
  • Reflections on Art: A Source Book of Writings by Artists, Critics, and Philosophers. Editor. Baltimore, MD: Johns Hopkins University Press, 1958.
    • Langer’s choice of aesthetics readings with introduction.

b. Secondary Sources

  • Auxier, Randall E. ‘Susanne Langer on Symbols and Analogy: A Case of Misplaced Concreteness?’ Process Studies 26 (1998): 86–106.
    • Suggests a modification to Langer’s account of symbols and considers this part of her account in relation to that of both Whitehead and Cassirer.
  • Auxier, Randall E. Logic: From Images to Digits. Ronkonkoma: Linus Learning, 2021.
    • An accessible and updated version of Langer’s symbolic logic, separating it from the implied metaphysics of the original.
  • Browning, Margaret M. ‘The Import of Feeling in the Organization of Mind’ in Psychoanalytic Psychology, Vol. 33, No. 2 (2016), pp. 284–298.
    • Pursues and defends a Langerean view of feeling from a neuroscientific perspective.
  • Browning, Margaret M. ‘Our Symbolic Minds: What Are They Really?’ in The Psychoanalytic Quarterly, Vol. 88, No. 1 (2019), pp. 25–52.
    • Discusses intersubjectivity from a Langerean perspective.
  • Budd, M. Music and the Emotions, London: Routledge, 1985.
    • A serious critique of Langer’s musical aesthetics.
  • Dengerink Chaplin, Adrienne. The Philosophy of Susanne Langer: Embodied Meaning in Logic, Art, and Feeling, London: Bloomsbury Academic, 2019.
    • Monograph on Langer with a particular focus on the influence of Wittgenstein, Whitehead, Sheffer and Cassirer on the development of Langer’s thought.
  • Dryden, Donald. ‘The Philosopher as Prophet and Visionary: Susanne Langer’s Essay on Human Feeling in the Light of Subsequent Developments in the Sciences’. Journal of Speculative Philosophy 21, No. 1 (2007): 27–43.
    • A brief summary of some of the applications of Langer’s theory of mind with a view to defending the applicability of the Langerean view.
  • Dryden, Donald. ‘Susanne Langer and William James: Art and the Dynamics of the Stream of Consciousness’. Journal of Speculative Philosophy, New Series, 15, No. 4 (1 January 2001): 272–85.
    • Traces commonalities and distinctions in the ideas on thinking and feeling of James and Langer.
  • Dryden, Donald. ‘Whitehead’s Influence on Susanne Langer’s Conception of Living Form’, Process Studies 26, No. 1–2 (1997): 62–85.
    • A clear account of what Langer does and does not take from Whitehead particularly concerning act form.
  • Gaikis, L. (ed.) The Bloomsbury Handbook of Susanne K. Langer. London: Bloomsbury Academic, 2024.
    • Featuring an extensive collection of major scholars on Langer, this book elucidates her transdisciplinary connections and insights across philosophy, psychology, aesthetics, history, and the arts.
  • Ghosh, Ranjan K. Aesthetic Theory and Art: A Study in Susanne K. Langer. Delhi: Ajanta Books International, 1979.
    • Doctoral dissertation on Langer which takes the unusual step, in an appendix, of applying her theories to specific artworks.
  • Hopkins, R. ‘Sculpture’ in Jerrold Levinson (ed.), The Oxford Handbook of Aesthetics. Oxford University Press. pp. 572–582 (2003).
    • Criticizes and offers a limited defence of Langer’s notion of virtual kinetic volume in sculpture.
  • Innis, Robert E. Susanne Langer in Focus: The Symbolic Mind. Bloomington: Indiana University Press, 2009.
    • The first English-language monograph on Langer; particularly helpful in locating Langer in relation to the pragmatist tradition.
  • Lachmann, Rolf. Susanne K. Langer: die lebendige Form menschlichen Fühlens und Verstehens. Munich: W. Fink, 2000.
    • The first monograph on Langer (German language).
  • Lachmann, Rolf. “From Metaphysics to Art and Back: The Relevance of Susanne K. Langer’s Philosophy for Process Metaphysics.” Process Studies, Vol. 26, No. 1–2, Spring-Summer 1997, 107–25.
    • English-language summary by Lachmann of his above book.
  • Massumi, B. Semblance and Event: Activist Philosophy and the Occurrent Arts. Cambridge, MA: The MIT Press, 2011.
    • An aesthetics of interactive art, ephemeral art, performance art, and art intervention. The titular semblance is Langerean and the early part of the book features an extended discussion on and from ideas taken from Feeling and Form.
  • Nelson, Beatrice K. ‘Susanne K. Langer’s Conception of “Symbol” – Making Connections through Ambiguity’. Journal of Speculative Philosophy, New Series 8, No. 4 (1 January 1994): 277–96.
    • Considers what is involved and at stake in Langer’s synthetic project.
  • Reichling, Mary. ‘Susanne Langer’s Concept of Secondary Illusion in Music and Art’. Journal of Aesthetic Education 29, No. 4 (1 December 1995): 39–51.
    • Opening up of the philosophical discussion on secondary illusions with reference to specific works and art criticism.
  • Sargeant, Winthrop. ‘Philosopher in a New Key’. New Yorker, 3 December 1960.
    • New Yorker profile on Langer.
  • Saxena, Sushil. Hindustani Sangeet and a Philosopher of Art: Music, Rhythm, and Kathak Dance Vis-À-Vis Aesthetics of Susanne K. Langer. New Delhi: D. K. Printworld, 2001.
    • Applies Langerean aesthetics to a type of music Langer did not discuss.
  • Schultz, William. Cassirer and Langer on Myth: An Introduction. London: Routledge, 2000.
    • Discussion of literary myths in Cassirer and Langer, both commonalities and distinctions in their positions.
  • van der Tuin, Iris. ‘Bergson before Bergsonism: Traversing ‘Bergson’s Failing’ in Susanne K. Langer’s Philosophy of Art’. Journal of French and Francophone Philosophy 24, No. 2 (1 December 2016): 176–202.
    • Considers Feeling and Form in relation to the philosophy and reception of Henri Bergson.

 

Author Information

Peter Windle
Email: peterwindle@gmail.com
University of Kent
United Kingdom

Aristotle: Epistemology

For Aristotle, human life is marked by special varieties of knowledge and understanding. Where other animals can only know that things are so, humans are able to understand why they are so. Furthermore, humans are the only animals capable of deliberating in a way that is guided by a conception of a flourishing life. The highest types of human knowledge also differ in having an exceptional degree of reliability and stability over time. These special types of knowledge constitute excellences of the soul, and they allow us to engage in characteristic activities that are integral to a good human life, including the study of scientific theories and the construction of political communities.

Aristotle’s central interest in epistemology lies in these higher types of knowledge. Among them, Aristotle draws a sharp division between knowledge that aims at action and knowledge that aims at contemplation, valuing both immensely. He gives a theory of the former, that is, of practically oriented epistemic virtues, in the context of ethics (primarily in the sixth book of the Nicomachean Ethics [Nic. Eth.], which is shared with the Eudemian Ethics [Eud. Eth.]), and he gives a theory of the latter both there and in the Posterior Analytics [Post. An.], where the topic of epistemology is not sharply distinguished from the philosophy of science. Lower types of knowledge and other epistemically valuable states are treated piecemeal, as topics like perception, memory and experience arise in these texts as well as in psychological, biological, and other contexts.

Although Aristotle is interested in various forms of error and epistemic mistakes, his theory of knowledge is not primarily a response to the possibility that we are grossly deceived, or that the nature of reality is radically different from the way we apprehend it in our practical dealings and scientific theories. Instead, Aristotle takes it for granted that we, like other animals, enjoy various forms of knowledge, and sets out to enumerate their diverse standards, objects, purposes and relative value. He emphasizes the differences among mundane forms of knowledge such as perception and higher forms such as scientific theorizing, but he also presents an account on which the latter grows organically out of the former. His pluralism about knowledge and his sensitivity to the different roles various forms of knowledge play in our lives give his theory enduring relevance and interest.

Table of Contents

  1. Knowledge in General
  2. Perception
  3. Memory
  4. Experience
  5. Knowledge as an Intellectual Virtue
    1. The Division of the Soul
    2. Scientific Virtues
      1. Theoretical Wisdom
      2. Demonstrative Knowledge
      3. Non-Demonstrative Scientific Knowledge
    3. Practical Knowledge and the Calculative Virtues
      1. Craft
      2. Practical Wisdom
  6. References and Further Reading
    1. Bibliography

1. Knowledge in General

Knowledge in a broad sense (gnōsis, from whose root the word “knowledge” derives; sometimes also eidenai) is enjoyed by all animals from an early stage in their individual development (Generation of Animals [Gen. An.] I 23, 731a30–4). In Aristotle’s usage, it includes everything from a worm’s capacity to discriminate hot and cold to the human ability to explain a lunar eclipse or contemplate the divine (for representative usages, see Post. An. I 1, 71a1–2; II 8, 93a22; II 19, 99b38–9). However, Aristotle shows comparatively little interest in knowledge in this broad sense. The Aristotelian corpus has no surviving treatise devoted to knowledge in all generality, and there is no evidence that Aristotle ever authored such a text. His main interest is in more specific kinds of knowledge. Nevertheless, a few features of Aristotle’s view regarding knowledge in general deserve comment.

First, it is relatively clear that he takes gnōsis to be at least factive (although this is disputed by Gail Fine). That is, if someone (or some animal) has gnōsis that something is the case, then that thing is true. Plausibly, Aristotle takes gnōsis to be not only true cognition, but cognition that is produced by a faculty like perception which reliably yields truths. This makes it tempting to compare Aristotle’s general view of knowledge with contemporary forms of reliabilism such as Ernest Sosa’s or John Greco’s, though the reliability of gnōsis is not a point Aristotle stresses.

Second, Aristotle also treats most kinds of knowledge as relatives (Physics [Phys.] VII 3, 247b1–3). A relative, in Aristotle’s metaphysical scheme, is an entity which is essentially of something else (Cat. 7, 6a37). One example is that of a double, since a double is essentially the double of something else (Cat. 7, 6a39–b1). Likewise, knowledge is essentially knowledge of something-or-other (Cat. 7, 6b5), be it an external particular (De Anima [De An.] II 5, 417b25–7), a universal within the soul (De An. II 5, 417b22–3; compare Phys. VII 3, 247b4–5, 17–18), or the human good (Nic. Eth. VI 5, 1140a25–8). It is fundamental to Aristotle’s way of thinking about knowledge that it is in this way object directed, where the notion of object is a broad one that includes facts, particulars, theories and ethical norms. Aristotle frequently characterizes different types of knowledge by the types of objects they are directed at.

Third, for Aristotle, knowledge generally builds upon itself. In many cases, learning amounts to reconceiving the knowledge we already have, or coming to understand it in a new way (Post. An. I 1, 71b5–8). Further, Aristotle notes that the knowledge we gain when we learn something is often closely connected to the knowledge that we need to already have in order to learn this. For instance, in order to gain a proper geometrical understanding of why a given triangle has internal angles that sum to 180 degrees, we must already know that triangles in general have this angle sum and know that this particular figure is a triangle, whereupon it may be asked: what is this if not already to know that the particular triangle has this angle sum (Post. An. I 1, 71a19–27; compare Pr. An. II 21, 67a12–22)? Likewise, in order to arrive at knowledge of what something is, that is, of its definition, we must perform an inquiry that involves identifying and scrutinizing things of the relevant kind. That requires knowing that the relevant things exist; however, how can we identify these things if we do not know what defines instances of that kind (Post. An. II 7, 92b4–11; II 8, 93a19–22)?

Aristotle identifies such questions with a famous puzzle raised in Plato’s Meno: how can we search for anything that we do not already know (Post. An. I 1, 71a29, compare Pr. An. II 21, 67a21–2)? Either we already know it, in which case we do not need to look for it, or we do not know it, in which case we do not know what we are seeking to learn and we will therefore not recognize it when we have found it (Meno 80e).

As David Bronstein and Gail Fine have shown, much of Aristotle’s epistemology is structured around this challenge. Aristotle is confident that we can distinguish the prior knowledge required for various types of learning from what we seek to learn; hence, for Aristotle, the puzzle in the Meno amounts to a challenge to articulate what prior knowledge various kinds of learning depend upon. The picture of learning and inquiry we get from Aristotle is, consequently, a thoroughly cumulative one. Typically, we learn by building on and combining what we already know rather than going from a state of complete ignorance to a state of knowledge. Aristotle is concerned to detail the various gradations in intellectual achievement that exist between mundane knowledge and full scientific or practical expertise.

This approach, however, raises a different worry. If we can only gain knowledge by building on knowledge we already have, then the question arises: where does our learning begin? Plato’s answer, at least as Aristotle understands it, is that we have innate latent beliefs in our souls which we can recollect and hence come to know (Post. An. II 19, 99b25–6). Aristotle rejects this view, taking it to require, implausibly, that we have more precise cognitive states in us than we are aware of (Post. An. II 19, 99b26–7). Instead, he adverts to perception as the type of knowledge from which higher cognitive states originate (Post. An. II 19, 99b34–5; compare Met. I 1, 980a26–7). At least the most rudimentary types of perception allow us to gain knowledge without drawing on any prior knowledge. Thus, for Aristotle, everything learned (both for us and for other animals) starts with perception, such that any lack in perception must necessarily result in a corresponding lack in knowledge (Post. An. I 18, 81a38–9). Depending on the intellectual capabilities of a given animal, perception may be the highest type of knowledge available, or the animal may naturally learn from it, ascending to higher types of knowledge from which the animal can learn in turn (Post. An. II 19, 99b34–100a3; Met. I 1, 980a27–981b6).

2. Perception

For Aristotle, perception is a capacity to discriminate that is possessed by all human and non-human animals (Post. An. II 19, 99b36–7; De An. II 2, 413b2; Gen. An. I 23, 731a30–4), including insects and grubs (Met. I 1, 980a27–b24; De An. II 2, 413b19–22). Every animal possesses at least the sense of touch, even though some may lack other sensory modalities (De An. II 2, 413b8–10, 414a2–3). Each sense has a proper object which only that perceptual modality can detect as such (De An. II 6, 418a9–12): color for sight, sound for hearing, flavor for taste, odor for smell and various unspecified objects for touch (De An. II 6, 418a12–14). For Aristotle, perception is not, however, limited to the proper objects of the sensory faculties. He allows that we and other animals also perceive a range of other things: various common properties which can be registered by multiple senses, such as shape, size, motion and amount (De An. II 6, 418a17–18), incidental objects such as a pale thing or even the fact that the pale thing is the son of Diares (De An. II 6, 418a21), and possibly even general facts such as that fire is hot (Met. I 1, 981b13; but see below).

Aristotle holds that we are never, or at least only very rarely, in error about the proper objects of perception, like color, sound, flavor, and so on (De An. II 6, 418a12; De An. III 3, 428b18–19). We are, however, regularly mistaken about other types of perceptual objects (De An. III 3, 428b19–25). While I can be mistaken, for instance, about the identity of the red thing I am perceiving (Is it an ember? Is it a glowing insect? Is it just an artifact of the lighting?), I usually am not mistaken that I am seeing red. In Aristotle’s language, this is to say that I am more often in error regarding the incidental objects of perception (De An. III 3, 428b19–22). The common objects of perception are, in his view, even more prone to error (De An. III 3, 428b22–5); for example, I can easily misperceive the size of the red thing or the number of red things there are.

Aristotle gives an account of the way perception works which spans physiology, epistemology and philosophy of mind. In order for perception to occur, there must be an external object with some quality to be perceived and a perceptual organ capable of being affected in an appropriate way (De An. II 5, 417b20–1, 418a3–5). Aristotle posits that each sense organ is specialized and can only be affected in specific ways without being harmed. This explains both why different sensory modalities have different proper objects and why overwhelming stimuli can disable or damage these senses (De An. II 12, 424a28–34; III 2, 426a30–b3; III 13, 435b4–19). Perception takes place when the sensory organ is altered within its natural bounds, in such a way as to take on the sensible quality of the object perceived. In this way, the perceptual organ takes on the sensible form of the object without its matter (De An. II 12, 424a17–19). Much debate has revolved around whether Aristotle means that the organ literally takes on the sensible property (whether, for instance, the eye literally becomes red upon seeing red), or whether Aristotle means that it does so rather in some metaphorical or otherwise attenuated sense.

Some animals, Aristotle holds, have no form of knowledge other than perception. Such animals, in his view, only have knowledge when they are actually perceiving (Post. An. II 19, 99b38–9); they know only what is present to them when their perceptual capacities are in play. The same holds for human perceptual knowledge. If we can be said to have knowledge on account of our merely perceiving something, then this is knowledge we have only at the time when this perception is occurring (Pr. An. II 21, 67a39–67b1; compare Met. VII 15, 1039b27–30). A person has, for instance, perceptual knowledge that Socrates is sitting only when actually perceiving Socrates in a seated position. It follows that we cease to have this knowledge as soon as we cease to perceive the thing that we know by perception (Nic. Eth. VI 3, 1139b21–22; Topics [Top.] V 3, 131b21–22).

For Aristotle, this represents a shortcoming of perceptual knowledge. Perceptual knowledge is transitory or unstable in a way that knowledge ideally is not, since knowledge is supposed to be a cognitive state which we can rely upon (Categories [Cat.] 8, 8b27–30; Posterior Analytics [Post. An.] I 33, 89a5–10). Perception is also lacking as a form of knowledge in other ways. Higher types of knowledge confer a grasp of the reasons why something is so, but perception at best allows us to know that something is so (Metaphysics [Met.] I 1, 981b12–13; Post. An. I 31, 88a1–2). The content of perception is also tied to a particular location and time: what I perceive is that this thing here has this property now (Post. An. I 31, 87b28–30). Even if the content of my perception is a fact like the fact that fire is hot (rather than that this fire is hot), a perceptual experience cannot, according to Aristotle, tell me that fire is in general hot, since that would require me to understand why fire is hot (Met. I 1, 981b13).

Hence, while knowledge begins with perception, the types of knowledge which are most distinctively human are the exercise of cognitive abilities that far surpass perception (Gen. An. I 23, 731a34–731b5). In creatures like us, perception ignites a curiosity that prompts repeated observation of connected phenomena and leads us through a series of more demanding cognitive states that ideally culminate in scientific knowledge or craft (Met. I 1, 980a21–27; Post. An. I 31, 88a2–5; Post. An. II 19, 100a3–b5). The two most important of these intermediate states are memory and experience. Let us turn to these, before considering the types of knowledge that Aristotle considers to be virtues of the soul.

3. Memory

For Aristotle, perception provides the prior knowledge needed to form memories. The capacity to form memories allows us to continue to be aware of what we perceived in the past once the perceived object is no longer present, and thus to enjoy forms of knowledge that do not depend on the continued presence of their objects. Learning from our perceptions in order to form memories thus constitutes an important step in the ascent from perception to higher types of knowledge. With the formation of a memory, we gain epistemic access to the contents of our perceptions that transcends the present moment and place.

Aristotle distinguishes memory from recollection. Whereas recollection denotes an active, typically conscious “search” (On Memory and Recollection [De Mem.] 453a12), memory is a cognitive state that results passively from perception (De Mem. 453a15). In order to form a memory, the perceived object must leave an impression in the soul, like a stamp on a tablet (De Mem. 450a31–2). This requires the soul to be in an appropriate receptive condition, a condition which Aristotle holds to be absent or impaired in both the elderly and the very young (De Mem. 450a32–b7). Aristotle denies, however, that a memory is formed simultaneously with the impression of the perceived object, since we do not remember what we are currently perceiving; we have memories only of things in the past (De Mem. 449b24–26). He infers that there must be a lapse of time between the perceptual impression and the formation of a memory (De Mem. 449b28, 451a24–5, 29–30).

The fact that memory requires an impression raises a puzzle, as Aristotle notices: if perception is necessarily of a present object, but only an impression left by the object is present in our memory, do we really remember the same things that we perceive (De Mem. 450a25–37, 450b11–13)? His solution is to introduce a representational model of memory. The impression formed in us by a sensory object is a type of picture (De Mem. 450a29–30). Like any picture, it can be considered either as a present artifact or as a representation of something else (De Mem. 450b20–5). When we remember something, we access the representational content of this impression-picture. Memory thus requires a sensory impression, but it is not of the sensory impression; it is of the object this impression depicts (De Mem. 450b27–451a8).

While the capacity to form memories represents a cognitive advance over perception thanks to its cross-temporal character, memory is still a rudimentary form of knowledge, which Aristotle takes not to belong to the intellect strictly speaking (De An. I 4, 408b25–9; De Mem. 450a13–14). Memories need not possess any generality (although Aristotle does not seem to rule out the possibility of remembering generalizations), nor does memory as such tell us the reasons why things are so. A more venerable cognitive achievement than memory, though one that still falls short of full scientific knowledge or craft, is what Aristotle calls “experience” (empeiria).

4. Experience

Memories constitute the prior knowledge required to gain experience, which we gain by means of consciously or unconsciously grouping memories of the same thing (Post. An. II 19, 100a4–6; Met. I 1, 980b28–981a1). The type of knowledge we gain in experience confers practical success; in some cases, the practical efficacy (which is not to say the overall value) of this type of knowledge surpasses that of scientific knowledge (Met. I 1, 981a12–15). Aristotle emphasizes the pivotal role of experience in the acquisition of knowledge of scientific principles (Pr. An. I 30, 46a17–20; Post. An. II 19, 100a6), but he considers the proper grasp of scientific principles to be a strictly different (and more valuable) kind of knowledge.

Experience thus sits midway between the awareness of the past we enjoy by way of memory and the explanatory capacity we have in scientific knowledge. Aristotle’s characterization of the content of the knowledge we have in experience has given rise to divergent interpretations. He contrasts experience with the “art” that a scientifically informed doctor has as follows:

[T]o have a judgment that when Callias was ill of this disease this did him good, and similarly in the case of Socrates and in many individual cases, is a matter of experience; but to judge that it has done good to all persons of a certain constitution, marked off in one class, when they were ill of this disease, e.g. to phlegmatic or bilious people when burning with fever–this is a matter of art. (Met. I 1, 981a7–12, trans. Ross)

On one traditional reading, the contrast Aristotle wishes to draw here concerns the generality of what one knows in experience and in scientific knowledge respectively. A person with scientific knowledge knows a universal generalization (for example, “all phlegmatic people are helped by such-and-such a drug when burning with a fever”), whereas a person with experience knows only a string of particular cases which fall under this generalization (“Socrates was helped by such-and-such a drug when burning with a fever”, “Callias was helped by such-and-such a drug when burning with a fever”, and so on).

What distinguishes the experienced person from someone who merely remembers these things, however, is that the memories of the experienced person are grouped or connected (Post. An. II 19, 100a4–6; Met. I 1, 980b28–981a1). Precisely what this grouping or connection comes to is not made clear by the text, but one point suggested by the passage above is that it allows one to competently treat new cases by comparison with old ones. An experienced person would thus, in this example, be able to prescribe the correct drug if, for instance, Polus should arrive with a fever and be of the relevant constitution to benefit from it. The experienced person will do this, however, by comparing Polus with Socrates and Callias, not by means of an explicit grasp of the universal generalization that all phlegmatic people benefit from this drug when suffering from a fever (or even that most of them do). The person with experience thus has a capacity to generalize, but not yet any explicit grasp of the underlying generalization.

One problem for this reading is that outside of this passage Aristotle describes generalizations, even scientifically accurate ones, as things known by experience. In particular, Aristotle describes scientific explananda as things known by experience, where these are taken to be general facts like the fact that round wounds heal more slowly (Post. An. I 13, 79a14–16 with Met. I 1, 981a28–30; Historia Animalium [Hist. An.] VIII 24, 604b25–7; Pr. An. I 30, 46a17–27; and, possibly, Post. An. II 19, 100a3–8). According to Pieter Sjoerd Hasper and Joel Yurdin, the content of experience is no less general and no less accurate than that of scientific knowledge. Instead, what one has experience of is fully precise scientific facts, but what one lacks is a grasp of their causes. On this view, Aristotle’s point in the passage quoted is that someone with experience knows that a certain treatment is effective for all feverish patients who are phlegmatic, but the person does not know why. Experience thus gives one knowledge of scientific explananda; further inquiry or reflection is, however, needed to have properly scientific knowledge, which requires a grasp of the causes of what one knows by experience.

On either of these interpretations, experience can be seen to contribute a further dimension to the temporal reach of our knowledge. Where memory allows us to retain perceptual knowledge, and thus extends our knowledge into the past, experience extends our knowledge into the future. A person with experience has not only a retrospective grasp of what has cured certain patients; this person has learned from this knowledge what will cure (or is likely to cure) the next patient with the relevant malady, either by direct comparison with previous cases or by grasping the relevant generalization. Since experience presupposes memory, an experienced person has knowledge whose reach extends both backward and forward in time.

5. Knowledge as an Intellectual Virtue

A virtue, for Aristotle, is a particular respect in which a thing is excellent at being what it is or doing what it is meant to do. If, with Aristotle, we suppose that not only our characters but also our intellects can be in better or worse conditions, it makes sense to talk about virtues of intellect as well as virtues of character.

Unlike contemporary virtue epistemologists, who tend to identify knowledge as a type of success issuing from intellectual virtues, Aristotle directly identifies the most desirable types of knowledge with certain intellectual virtues. This has an important effect on his epistemology. A virtue is a kind of stable condition, something a person is qualified with over a period of time rather than (primarily) a thing a person may be said to have or lack on a given occasion. The identification of the highest types of knowledge with virtues thus leads Aristotle to think of these kinds of knowledge as abilities. The relevant abilities include not just practical ones (like building a house) but also purely intellectual abilities, most importantly the ability to contemplate.

Since intellectual virtues must be stable states of the intellect, the best types of knowledge are also those that are difficult to acquire and, conversely, cannot be easily lost or forgotten (Cat. 8, 8b26–9a10). This does not hold of memories (which are easily formed and routinely forgotten) and even less so of perceptual knowledge (which, as we have seen, is for Aristotle a type of knowledge we have just when we are actually perceiving). Only the type of knowledge that is the outcome of protracted instruction or research counts as knowledge in the sense of a virtue (Nic. Eth. VI 7, 1141a18–22). Further, Aristotle thinks we only have this type of knowledge of necessary generalizations which belong to an axiomatizable theory, on the one hand, and of practically pertinent generalizations together with particular facts about their implementation, on the other. His reasons for this view are connected with his division of the human soul.

a. The Division of the Soul

Aristotle takes the human soul to have distinct parts corresponding to our various capacities. He divides the soul first into a rational and a non-rational part. The non-rational part of the soul accounts for the capacities we share with other animals. This part of the soul is divided into a vegetative part, which represents capacities for growth and nutrition, and a part representing the capacities we share with other animals but not with plants.

The rational part of our soul accounts for those capacities by which we seek to grasp truth, capacities which Aristotle takes to be limited to humans and the divine. By “truth”, Aristotle means both the theoretical truth of things that hold independently of us and the practical “truth” of an action or intention that accords with our rational desires (Nic. Eth. VI 2, 1139a26–31). Accordingly, Aristotle divides the rational soul into a calculative and a scientific part corresponding to the different types of truth we seek to grasp (Nic. Eth. VI 1, 1139a6–15; compare Pol. VII 14, 1333a24–5). The calculative part of the rational soul is responsible for the cognitive component of our practical deliberation, while the scientific part of the soul is responsible for our grasp of what we seek to know for its own sake.

Each part of the soul can, for Aristotle, be in a better or a worse condition. Aristotle notes that this also holds of the nutritive part and the capacities for perception, but he shows little interest in the perfection of these capacities in normative contexts, since they are not distinctively human (Nic. Eth. I 7, 1097b33–5; I 13, 1102a32–b3). To perfect the non-rational part of the soul is, for humans, to acquire virtues of character, such as courage, temperance and magnanimity. These are acquired, if at all, through a process of habituation beginning in childhood (Nic. Eth. II 1, 1103a25–6, b23–5). To perfect the rational part of the soul, on the other hand, is to acquire what Aristotle calls the “intellectual virtues” (Nic. Eth. I 13, 1103a3–5; VI 1, 1138b35–39a1). Such virtues are also acquired only gradually and over a long period of time, but “mostly as a result of instruction” (didaskalia, Nic. Eth. II 1, 1103a15) rather than habituation.

In addition to taking the calculative and the scientific parts of the soul to be concerned with practical and theoretical truth respectively, Aristotle also distinguishes them according to the modal statuses of the truths that they grasp. The virtues of the calculative part of the soul are excellences for grasping truth concerning what is contingent or can be otherwise (Nic. Eth. VI 1, 1139a8), whereas the virtues of the scientific part of the soul concern “things whose principles cannot be otherwise” (Nic. Eth. VI 1, 1139a7–8). This careful formulation leaves open the possibility that the scientific soul may grasp contingencies so long as the things about which it grasps these contingencies have principles which are necessary. There are, for instance, necessary principles which govern the eclipse of the moon, so that one can have scientific knowledge of the eclipse of the moon even though the moon is not always or necessarily eclipsed (Post. An. I 8, 75b33–6; compare I 31, 87b39–88a5; II 8, 93a35–93b3). Aristotle, however, tends to treat such cases as secondary, taking the primary objects of the scientific part of the soul to be strict and exceptionless necessities.

Aristotle takes there to be different intellectual capacities devoted to the grasp of truths of differing modal statuses for a variety of reasons. On the one hand, he thinks of action as the manipulation of truth. If I fashion some planks of wood into a table, I am making it true that these planks are a table (which, before I begin, is false). It follows that intellectual capacities that are directed towards action must have contingent truths as their objects, since if something cannot be otherwise, then a fortiori it cannot come to be otherwise by someone’s agency (Nic. Eth. VI 2, 1139a36–b11).

Conversely, Aristotle takes only necessary truths to be appropriate objects for the form of knowledge that pertains to the scientific part of the soul. The best condition for this part of the soul is one that allows someone to contemplate the truth freely and at will (De An. II 5, 417b23–5). This means that the knower ought not to need to monitor, intervene in or otherwise “check on” how things stand in the world with respect to what is known. Aristotle thinks that if one could have this sort of knowledge of a contingent state of affairs, then this state of affairs might change without our awareness, pulling the rug, as it were, out from under our knowledge (Nic. Eth. VI 3, 1139b21–2). For instance, if I could have scientific knowledge that Socrates is sitting, and Socrates gets up without my noticing, then I would suddenly no longer know that Socrates is sitting (since it would no longer be true that Socrates is sitting). Hence, if my knowledge is guaranteed to remain knowledge just by my having learned it in the appropriate way, then what I know must be a state of affairs that does not change, and this will be so if scientific knowledge is of necessities.

b. Scientific Virtues

i. Theoretical Wisdom

Wisdom (sophia) is Aristotle’s name for the best condition of the scientific part of the soul (Nic. Eth. VI 1, 1139a16; compare Met. I 2, 983a9–10) and the “most precise of the kinds of knowledge” (Nic. Eth. VI 7, 1141a17–8). This is the state that we are in when our soul grasps the best objects in the universe in the most intellectually admirable way, enabling us to contemplate these objects with total comprehension (Nic. Eth. VI 7, 1141a20–1, 1141b2–8; X 7, 1177a32–b24; Met. I 1, 981b25–982a3). Among the best objects, Aristotle surely intends to include God (Met. I 2, 983a4–5) and possibly also the celestial bodies or other things studied in the books of the Metaphysics. He makes clear that humans and their polities are not among these most venerable things: we are in his view plainly not “the best thing there is in the universe” (Nic. Eth. VI 7, 1141a21). Humans and their goals may be the most fitting objects of practical knowledge (Nic. Eth. VI 7, 1141b4–15), but there are better things to contemplate.

To this extent, theoretical wisdom is a distinctively disinterested type of knowledge. It is the limiting case of the type of knowledge we seek when we want to understand something for its own sake rather than because it benefits us or has practical utility, and Aristotle associates it strongly with leisure (scholē) (Met. I 2, 982a14–16; Nic. Eth. VI 12; Politics [Pol.] VII 14, 1333a16–b5). This does not mean, however, that it is neutral with respect to its ethical value. On the contrary, Aristotle takes the person with superlative wisdom to be “superlatively happy” (Nic. Eth. X 8, 1179a31), and the pursuit of theoretical wisdom is undoubtedly a central component of the good life in his view.

Aristotle also holds that wisdom can be practically advantageous in more mundane ways. He recounts a story about Thales putting his philosophical knowledge to work so as to predict an excellent olive crop and amassing a fortune by buying up all of the oil presses and then loaning them out at a profit (Pol. I 11, 1259a5–23). Yet he stresses that sophia is not for the sake of such practical advantages (Nic. Eth. VI 7, 1141b2–8), nor does it require its possessor to be practically wise (Nic. Eth. VI 7, 1141b20–1). He depicts Thales as amassing this wealth to show that “philosophers could easily become wealthy if they wished, but this is not their concern” (Pol. I 11, 1259a15–18).

The best kind of theoretical knowledge has, for Aristotle, the structure of an axiomatic science. One has the best theoretical orientation towards the world when one grasps how each fact of a science concerning the highest things follows from the principles of the highest things (Met. I 2, 982a14–16). Wisdom thus divides into two components: scientific knowledge of certain principles (nous) and the type of scientific knowledge that consists in grasping a scientific proof or “demonstration” issuing from these principles (Nic. Eth. VI 7, 1141a17–20). Someone with the virtue of wisdom understands why the basic principles of theology (or whatever science deals with the best things) are the basic principles of that science, and is also able to prove, in axiomatic fashion, every other theorem in that science on the basis of these principles.

While wisdom is for Aristotle the best kind of theoretical knowledge, he does not hold that this sort of knowledge ought to form a foundation for all other kinds of knowledge or even all other scientific knowledge. This is because he holds that each kind of thing is only properly understood when we understand it according to its own, specific principles (Post. An. I 2, 71b23–25, 72a6; I 6, 74b24–26; I 7; Met. I 3, 983a23–25; Phys. I 1, 184a1–15). Knowledge of the first principles of the highest science might give someone a general understanding of a range of other things (Met. I 2, 982a7–10, 23–24; Nic. Eth. VI 7, 1141a12–15)—it might explain, for instance, why animals move at all by saying that they move in imitation of divine motion—but this sort of general understanding is, for Aristotle, no substitute for the specific kind of understanding we have when we grasp, for example, the mechanics of a particular animal’s motion or the function of this motion in its peculiar form of life.

For this reason, Aristotle takes each scientifically explicable domain to be associated with its own dual virtues of demonstrative and non-demonstrative scientific knowledge. The virtues of demonstrative and non-demonstrative knowledge are, therefore, not characteristics which a person can be said to simply have or to lack in general. Instead, someone might possess the virtues of scientific knowledge with respect to, say, geometry and lack them with respect to, say, human physiology. While one type of scientific knowledge might assist in the acquisition of another, and perhaps even provide some of its principles (Post. An. I 7, 75b14–17; compare I 9, 76a16–25), Aristotle insists that there is a different virtuous state associated with each distinct scientific domain (Nic. Eth. VI 10, 1143a3–4). He does, however, take all such virtues to share a common axiomatic structure, which he lays out in the Posterior Analytics in the course of giving a theory of demonstration.

ii. Demonstrative Knowledge

A demonstration (apodeixis), for Aristotle, is a deductive argument whose grasp imparts scientific knowledge of its conclusion (Post. An. I 2, 71b18–19). Aristotle takes it for granted that we possess a distinctive kind of knowledge by way of deductive reasoning and asks what conditions a deductive argument must satisfy in order to confer scientific knowledge. His primary model for this type of knowledge is mathematics, which, alongside geometrical construction, included the practice of providing a deductive argument from basic principles to prove that the construction satisfies the stated problem. Aristotle, however, seeks to generalize and extend this model to broadly “mathematical” sciences like astronomy and optics, and, with some qualifications, to non-mathematical sciences like botany and meteorology. His theory of knowledge in the Posterior Analytics (especially the first book) investigates this ideal knowledge state by asking what conditions an argument must satisfy in order to be a demonstration.

Aristotle observes, to begin, that not all deductive arguments are demonstrations (Post. An. I 2, 71b24–26). In particular, an argument from false premises does not confer knowledge of its conclusion (Post. An. I 2, 71b26–27). The notion of demonstration is not, however, simply the notion of a sound deductive argument, since even sound arguments do not provide knowledge of their conclusions unless the premises are already known. Moreover, even sound arguments from known premises may not provide the best kind of knowledge of the conclusion. Aristotle holds that in order to impart the best kind of knowledge of a necessary truth, an argument must establish this truth on the basis of principles that properly pertain to the type of thing the demonstration concerns. A demonstration of some astronomical fact must, for instance, proceed from properly astronomical principles (Pr. An. I 30, 46a19–20). What this rules out is, on the one hand, arguments from accidental and “chance” features of an object (Post. An. I 6, especially 74b5–12, 75a28–37; I 30), and arguments from the principles of a different science, on the other (Post. An. I 7, 75b37–40).

Two requirements for demonstration, then, are that the premises be true and that they be non-accidental facts belonging to the relevant science. Assuming that all principles of a science are true and non-accidental, this reduces to the condition that a demonstration be from principles belonging to the relevant science. This, however, is still not a sufficient condition for an argument to be a demonstration. Aristotle famously contrasts the following two arguments (Post. An. I 13, 78a30–7):

Argument One

Things that do not twinkle are near;
the planets do not twinkle;
therefore, the planets are near.

Argument Two

What is near does not twinkle;
the planets are near;
therefore, the planets do not twinkle.

Here by “twinkle” we should understand the specific astronomical phenomenon whereby a celestial body’s visual intensity modulates in the way that a distant star’s does on a clear night, and we should understand both of these arguments to quantify over visible celestial bodies. If we do, then both of these arguments are sound. The planets are near the earth (relative to most astronomical bodies), and they do not display the astronomical property of twinkling. It is also true that bodies which are relatively close to us, as compared to the stars, fail to display this effect, so the first premise of Argument Two is true. Further, only visible celestial bodies which are near to us fail to twinkle, so the first premise of Argument One is also true. All of the premises in these two arguments are also, in Aristotle’s view, properly astronomical facts. To this extent they both establish that their respective conclusions hold as a matter of astronomical science.

The latter argument is in Aristotle’s view superior, however, in that it establishes not only that the conclusion holds but also why it does. In a completed theory of astronomy, the non-twinkling of the planets might be explained by recourse to their nearness to us, for example by adding that other celestial bodies obstruct the light issuing from more distant ones. Argument Two conveys this explanation by presenting the immediate cause of the conclusion, nearness, as a middle term shared between the two premises. The two premises in the argument thus not only prove the conclusion; they jointly explain what makes the conclusion true.

On the other hand, while the facts that non-twinkling celestial bodies are near and that all planets are non-twinkling celestial bodies do provide perfectly legitimate grounds to infer that the planets are near, an argument from these premises provides little insight into why the planets are near. The fact that the planets are near might take significant work to establish (it might even be established using a chain of reasoning such as that in Argument One), but it would be a confusion, in Aristotle’s view, to think that the soundness of Argument One and the scientific character of its premises show that the non-twinkling of the planets explains their nearness. The order of explanation runs rather in the opposite direction: they do not twinkle because they are near. In a completed science of astronomy as Aristotle conceives it, where it is assumed that the gross distance of all celestial bodies from the earth is eternally fixed, the nearness of the planets would presumably be treated as a fundamental given from which other things may be explained, not as a fact requiring explanation.

Someone is in a better cognitive condition with respect to a given fact, Aristotle evidently holds, if that person not only knows that it is true but also grasps why it is true. Aristotle does not argue for this position, but it is not difficult to imagine what reasons he might give. We naturally desire not just to know but to understand; curiosity is sated by explanation rather than sheer fact. Further, understanding confers stability on what we know, and Aristotle takes stability to be a desirable quality of knowledge (Cat. 8, 8b28–30; Post. An. I 33, 89a5–10). If I understand why the planets must not twinkle (rather than knowing that this is so but having no idea why), then I will be less likely to give up this belief in light of an apparent observation to the contrary, since to do so would require me to also revise my beliefs about what I take to be the explanation. This is especially so if I understand how this fact is grounded, as Aristotle requires of demonstrative knowledge, in the first principles of a science, since renouncing that piece of knowledge would then require me to renounce the very principles of my scientific theory.

Hence, the type of deduction which places one in the best cognitive condition with respect to an object must be explanatory of its conclusion in addition to being a sound argument with premises drawn from the correct science. The notion of explanation Aristotle works with in laying down this condition is a resolutely objective one. Scientific explanations are not just arguments that someone, or some select class of people, finds illuminating. They are the best or most appropriate kinds of explanations available for the fact stated in the conclusion because they argue from what is prior to the conclusion in the order of nature (Phys. I 1, 184a10–23). Further, the fact that a given set of premises explains their conclusion need not be obvious or immediately clear. Aristotle leaves open the possibility that it might be a significant cognitive achievement to see that the premises of a given demonstration explain its conclusion (Post. An. I 7, 76a25–30).

When someone does grasp demonstrative premises as explanatory of a given demonstrative conclusion, the argument is edifying because it tracks some objective fact about how things stand with the relevant kind (celestial bodies, triangles, and so on). Aristotle describes the way that scientific knowledge correctly tracks the order of things in terms of “priority” (Post. An. I 2, 71b35–72a6). Borrowing the terminology of contemporary metaphysics, we might gloss this by saying that demonstrations reveal facts about grounding in a way that not all deductive arguments do. The second syllogism is better than the first one because the fact that the planets are near and the relevant universal generalization about the optical behavior of near bodies together ground the fact that the planets do not twinkle. They are in an objective sense responsible for the fact that the planets do not twinkle. Given the assumption that grounding is antisymmetric, the premises of the first syllogism cannot also ground its conclusion.

Aristotle’s account of the specific content and logical form of scientific principles is notoriously obscure. Key texts, which do not obviously stand in agreement, are Post. An. I 2, I 4, I 10, II 19 and Pr. An. I 30. Nevertheless, there are a few key ideas to which Aristotle remains consistently committed. First, a particularly important type of principle is one that states what something is, or its definition (Post. An. I 2, 72a22). The centrality accorded to this type of principle suggests a project of grounding all truths of a given science in facts about the essences of the kind or kinds that this science concerns. Aristotle, however, seems aware of problems with such a rigid view, and admits non-specific or “common” axioms into demonstrative sciences (Post. An. I 2, 72a17–19; I 10, 76b10–12). These include axioms like the principle of non-contradiction, which are in some sense assumed in every science (Post. An. I 2, 72a15–17; I 11, 77a30), as well as those like the axiom that equals taken from equals leave equals, which can be given both an arithmetical and a geometrical interpretation (Post. An. I 10, 76a41–b1; I 11, 77a30).

Aristotle also briefly discusses the profile of conviction (pistis) that someone with scientific knowledge ought to display. At least some of the principles of a demonstration, Aristotle holds, should be “better known” to the expert scientist than their conclusions, and the scientist should be “more convinced” of them (Post. An. I 2, 72a25–32). This is motivated in part by the idea that the principles of demonstrations are supposed to be the source or grounds for our knowledge of whatever we demonstrate in science. Aristotle also connects it with the requirement that someone who grasps a demonstration should be “incapable of being persuaded otherwise” (Post. An. I 2, 72b3). Someone with demonstrative knowledge in the fullest sense will never renounce their beliefs under dialectical pressure, and Aristotle thinks this requires someone to be supremely confident in the principles that found their demonstrations.

iii. Non-Demonstrative Scientific Knowledge

The claim that the principles of demonstrations must be better known than their conclusions generates a problem. If the best way to know something theoretically is by demonstration, and the premises of demonstrations must be at least as well known as their conclusions, then the premises of demonstrations will themselves need to be demonstrated. However, these demonstrations in turn will also have premises requiring demonstration, and so on. A regress looms.

Aristotle canvasses three possible responses to this problem. First, demonstrations might extend back infinitely: there might be an infinite number of intermediate demonstrations between a conclusion and its first principles (Post. An. I 3, 72b8–10). Aristotle dismisses this solution on the grounds that we can indeed have demonstrative knowledge (Post. An. I 1, 71a34–72b1), and that we could not have it if having it required surveying an infinite series of arguments (Post. An. I 3, 72b10–11).

Two other views, which Aristotle attributes to two unnamed groups of philosophers, are treated more seriously. Both assume that chains of demonstrations terminate. One group says that they terminate in principles which cannot be demonstrated and consequently cannot be known (or at least, not in the demanding way required by scientific knowledge (Post. An. I 3, 72b11–13)). The other holds that demonstrations “proceed in a circle or reciprocally” (Post. An. I 3, 72b17–18): the principles are demonstrated, but from premises which are in turn demonstrated (directly or indirectly) from them.

Aristotle rejects both of these views, since both possibilities run afoul of the requirement that the principles of demonstrations be better known than their conclusions. The first alternative maintains that the principles are not known, or at least not in any scientifically demanding way, while the second requires that the principles in turn be demonstrated from (and hence not better known than) other demonstrable facts.

Aristotle’s solution is to embrace the claim that the principles are indemonstrable but to deny that this implies they are not known or understood in a rigorous and demanding way. There is no good reason, Aristotle maintains, to hold that demonstrative knowledge is the only or even the best type of scientific knowledge (Post. An. I 3, 72b23–5); it is only the best kind of scientific knowledge regarding what can be demonstrated. There is a different kind of knowledge regarding scientific principles. Aristotle sometimes refers to this as “non-demonstrative scientific knowledge” (Post. An. I 3, 72b20; I 33, 88b36) and identifies or associates it strongly with nous (Post. An. I 33, 88b35; II 19, 100b12), which is translated variously as comprehension, intellection and insight.

Aristotle is therefore a foundationalist insofar as he takes all demonstrative knowledge to depend on a special sort of knowledge of indemonstrable truths. It should be stressed, however, that Aristotle’s form of foundationalism differs from the types that are now more common in epistemology.

First, Aristotle professes foundationalism only regarding demonstrative knowledge in particular: he does not make any similar claim about perceptual knowledge as having a foundation in, for instance, the perception of sense data, nor for practical knowledge nor for our knowledge of scientific principles.

Second, as we have seen, Aristotle’s view is that scientific knowledge is domain-specific. Expert knowledge in one science does not automatically confer expert knowledge in any other, and Aristotle explicitly rejects the idea of a “super-science” containing the principles for all other sciences (Post. An. I 32). Hence, Aristotle defends what we might term a “local” foundationalism about scientific knowledge. Our knowledge of geometry will have one set of foundations, our knowledge of physics another, and so on (compare Post. An. I 7; I 9, 75b37–40; 76a13–16; Nic. Eth. VI 10, 1143a3–4).

Third, the faculty which provides our knowledge of the ultimate principles of demonstrative knowledge is, as we have seen, itself a rational faculty, albeit one which does not owe its knowledge to demonstration. Hence, Aristotle does not take the foundation of our demonstrative knowledge to be “brute” or “given”; his claim, more modestly, is that our knowledge of scientific principles must be of a different kind than demonstrative knowledge.

Finally, Aristotle’s foundationalism should not be taken to imply that we need to have knowledge of principles prior to discovering any other scientific facts. In at least some cases, Aristotle takes knowledge of scientific explananda to be acquired first (Post. An. II 2, 90a8–9), by perception or induction (Post. An. I 13, 78a34–5). Only later do we discover the principles which allow us to demonstrate them, and thus enjoy scientific knowledge of them.

Aristotle does not say much about the specific character of our knowledge of principles, and its nature has been the subject of much debate. As we have seen, Aristotle requires the principles to be “better known” (at least in part) than their demonstrative consequences, and he also refers to this type of knowledge as “more precise” (Post. An. II 19, 99b27) than demonstrative knowledge. Some scholars take Aristotle’s view to be that the principles are self-explanatory, while others take the principles to be inexplicable.

His views about the way we acquire knowledge of first principles have also been subject to varying interpretations. Traditionally, Aristotle’s view was taken to be that we learn the first principles by means of exercising nous, understood as a capacity for abstracting intelligible forms from the impressions left by perception. Subsequent scholars have pointed to the dearth of textual evidence for ascribing such a view to Aristotle in the Posterior Analytics, however. Aristotle calls the state we are in when we know first principles nous (Post. An. II 19, 100b12), but he does not claim that we learn first principles by means of exercising a capacity called nous.

A second possibility is that Aristotle thinks we obtain knowledge of scientific principles through some form of dialectic—a competitive argumentative practice outlined in the Topics that operates with different standards and procedures than scientific demonstration. Another view, defended by Marc Gasser-Wingate, is that our knowledge of the first principles is both justified and acquired by what Aristotle calls “induction” (epagōgē)—a non-deductive form of scientific argument in which we generalize from a string of observed cases or instances.

Some scholars also divide the question about how we first come to know the first principles from questions about what justifies this knowledge in the context of a science. One suggestion is that Aristotle takes the justification for nous to consist in a recognition of the principles as the best explanations for other scientific truths. On one version of this view, forcefully defended by David Charles, knowledge of first principles is not acquired prior to our knowledge of demonstrable truths; rather, we gain the two in lockstep as we engage in the process of scientific explanation. On other versions of this view, we come to know the first principles in some less demanding way before we come to appreciate their explanatory significance and thus have proper scientific knowledge of them. David Bronstein, who defends a version of the latter view, argues that Aristotle recommends a range of special methods for determining first principles, including, importantly, a rehabilitation of Plato’s method of division.

c. Practical Knowledge and the Calculative Virtues

Wisdom and scientific knowledge (demonstrative and non-demonstrative) are the excellences of the scientific part of our soul, that part of us devoted to the contemplation of unchanging realities. Aristotle takes the type of knowledge that we employ in our dealings with other people and our manipulation of our environment to be different in kind from these types of knowledge, and gives a separate account of their respective justification, acquisition, and purpose.

The goal of practical knowledge is to enable us to bring about changes in the world. Where attaining theoretical knowledge is a matter of bringing one’s intellect into conformity with unchanging structures in reality, practical knowledge involves a bidirectional relationship between one’s intellect and desires, on the one hand, and the world, on the other. As practical knowers we seek not only to conform our desires and intellects to facts about what is effective, ethical and practically pertinent; we also seek to conform these situations to what we judge to be such. Hence, practical knowledge can have as its objects neither necessities (since no one can coherently decide to, for example, change the angle sum of a triangle (Nic. Eth. III 2, 1112a21–31)) nor what is in the past (someone might make a decision to sack Troy, but “no one decides to have sacked Troy” (Nic. Eth. VI 2, 1139b7)). Only present and future contingencies are, in Aristotle’s view, possible objects of practical knowledge.

Aristotle distinguishes two activities that are enabled by practical thinking: action (praxis) and production (poiēsis) (Nic. Eth. VI 2, 1139a31–b5). Production refers to those doings whose end lies outside of the action itself (Nic. Eth. VI 5, 1140b6–7), the paradigm of which is the fashioning of a craft object like a shoe or a house. Aristotle recognizes that not all of our doings fit this mold, however: making a friend, doing a courageous act, or other activities laden with ethical significance can be thought of on the model of manufacturing a product only with strain. Such activities do aim to bring about changes in reality, but their end is not separate from the action itself. In performing a courageous act, say, I am, in Aristotle’s view, simply aiming to exercise the virtue of courage appropriately. Praxis is Aristotle’s name for doings such as these. It refers to a distinctively human kind of action, one not shared by other animals (Nic. Eth. VI 2, 1139a20; III 3, 1112b32; III 5, 1113b18), involving deliberation and judgment that an action is the best way of fulfilling one’s goals (Nic. Eth. VI 2, 1139a31).

Both action and production require more than just knowledge in order to be performed well. In particular, the best kind of action also requires the doer to be virtuous, and this, for Aristotle, has a desiderative as well as an epistemic component. Someone is only virtuous if that person desires the right things, in the right way, to the right extent (Nic. Eth. II 3, 1104b3–13; II 9, 1109b1–5; III 4, 1113a31–3; IV 1, 1120a26–7; X 1, 1172a20–3). Further, Aristotle does not take the starting points of our practical knowledge to be themselves objects of practical knowledge. It is not part of someone’s technical expertise to know that, for example, a sword is to be made or a patient to be healed; rather, a blacksmith or a doctor in her capacity as such takes for granted that these ends are to be pursued, and it is the job of her practical knowledge to determine actions which bring them about (Nic. Eth. III 3, 1112b11–16). The proper ends of actions, meanwhile, are given by virtue (Nic. Eth. VI 12, 1144a8, 20), and the virtues are habituated into us from childhood (Nic. Eth. II 2, 1103a17–18, 23–26, 1103b23–25). Nevertheless, Aristotle takes certain types of knowledge to be indispensable for engaging in action and production in the best possible ways. He identifies these types of knowledge with the intellectual virtues of practical wisdom (phronēsis) and craft (technē).

i. Craft

Craft (technē) is Aristotle’s name for the type of knowledge that perfects production (Nic. Eth. VI 4, 1140a1–10; Met. IX 2, 1046a36–b4). Aristotle mentions a treatise on craft which seems to have given a treatment of it roughly parallel to the treatment of scientific knowledge he gives in the Posterior Analytics (Nic. Eth. VI 4, 1140a2–3; compare VI 3, 1139b32), but this treatise is lost. Aristotle’s views on technē must be pieced together from scattered remarks and an outline of this treatise’s contents in the Nicomachean Ethics (VI 4).

As with scientific knowledge, Aristotle does not take craft to be a monolithic body of knowledge. He holds, sensibly enough, that a different type of technical knowledge is required for bringing into being a different kind of object. Aristotle’s stock example of a technē is the construction of houses (Nic. Eth. VI 4, 1140a4). In constructing a house, a craftsperson begins with the form of the house in mind together with a desire to bring one about, and practical knowledge is what enables this to lead to the actual presence of a house (Met. VII 7, 1032a32–1032b23; Phys. II 2, 194a24–27; II 3, 195b16–25).

In order for this to occur, a person with craft must know the “true prescription” pertaining to that practice (Nic. Eth. VI 4, 1140a21), that is, the general truths concerning how the relevant product is to be brought about. In the case of housebuilding, these might include the order in which various housebuilding activities need to be carried out, the right materials to use for various parts, and the correct methods for joining different types of materials. While a merely “experienced” housebuilder might manage to bring about a house without such prescriptions, they would not, in Aristotle’s view, bring about a house in the best or most felicitous way, and hence could not be said to operate according to the craft of housebuilding.

Aristotle indicates that these prescriptions fit together in a causal or explanatory way (Met. I 1, 981a1–3, 28–981b6; Post. An. II 19, 100a9). This view is plausible. Someone with the best kind of knowledge about how to bring about some product will presumably not only know what should be done but also understand why that is the correct thing to do. Such understanding, after all, has not only theoretical interest but also practical benefit. Suppose, for instance, that the craft of housebuilding prescribes that one should bind bricks using straw. Someone who understands why this is prescribed will be in a better position to know what else can be substituted should straw be unavailable, or even when it may be permissible to omit the binding agent. None of this is to say that a practitioner of a craft requires the same depth of understanding as someone with scientific knowledge, however. A technician of housebuilding does not need to know, for example, the chemical or physical principles which explain why and how binding agents work at a microscopic scale.

Given that it involves a kind of understanding, knowing the craft’s correct prescriptions in the way required by craft is a significant intellectual accomplishment. Nevertheless, this is not sufficient for having craft knowledge, according to Aristotle. Someone with craft knowledge must also have a “productive disposition” (Nic. Eth. VI 4, 1140a20–1), that is, a tendency to actually produce the goods according to these prescriptions when they have the desire to do so. Aristotle makes this disposition a part of craft knowledge itself, and not merely an extra condition required for practicing the craft, for at least three reasons.

First, someone does not count as having craft knowledge if that person has only a theoretical grasp of how houses are to be made, for example. Having craft knowledge requires knowledge of how to build houses, and Aristotle thinks that this sort of knowledge is only available to someone with a disposition to actually build them. Second, unlike mathematical generalizations, the prescriptions grasped in technē are not exceptionless necessities (Nic. Eth. VI 4, 1140a1–2; compare Nic. Eth. VI 2, 1139a6–8). Hence, simply knowing these prescriptions (even if one has every intention of fulfilling them) is not in itself sufficient for an ability to actually bring about the relevant product. One must have an ability to recognize when a rule of thumb about, say, the correct materials to use in building a house fails to apply. Aristotle thinks of this type of knowledge as existing in a disposition to apply the prescriptions correctly rather than as an auxiliary theoretical prescription.

A third, related reason is that the process of production requires one to make particular decisions that go beyond what is specified in the prescriptions given by that craft. Thus, even where the craft prescription instructs the builder to, for instance, separate cooking and sleeping quarters or to have a separate top floor for this kind of house, it may not specify the specific arrangement of these quarters or the precise elevation of the second floor. The ability to make such decisions in the context of practicing a craft is, for Aristotle, conferred by the productive disposition involved in craft knowledge rather than by the grasp of additional prescriptions.

ii. Practical Wisdom

Practical wisdom is the central virtue of the calculative part of the soul. This type of knowledge makes one excellent at deliberation (Nic. Eth. VI 9, 1142b31–3; VI 5, 1140b25). Since deliberation is Aristotle’s general term for reasoning well in practical circumstances, practical wisdom is also the type of knowledge that perfects action (praxis). More generally, practical wisdom is the intellectual virtue “concerned with things just and fine and good for a human being” (Nic. Eth. VI 12, 1143b21–22). It includes, or is closely allied with, a number of related types of practical knowledge that inform ethical behavior: good judgment (gnōmē), which Aristotle characterizes as a sensitivity to what is reasonable in a given situation (Nic. Eth. VI 12, 1143a19–24); comprehension (sunesis), an ability to discern whether a given action or statement accords with practical wisdom (Nic. Eth. VI 10, 1143a9–10); and practical intelligence (nous, related to, but distinct from, the theoretical virtue discussed above), which allows one to spot or recognize practically pertinent particulars (Nic. Eth. VI 11, 1143b4–5).

Practical wisdom thus serves to render action rather than production excellent. One important difference between practical wisdom and craft immediately follows. Whereas in craft someone performs an action for the purpose of creating something “other” than the production itself, the end of practical wisdom is the perfection of the action itself (Nic. Eth. VI 5, 1140b6–7). Nevertheless, in many respects, Aristotle’s view of practical wisdom is modeled on his view of craft knowledge. Like craft knowledge, the goal of practical wisdom is to effect some good change rather than simply to register the facts as they stand. In addition, like craft, this type of knowledge involves both a grasp of general prescriptions governing the relevant domain and an ability to translate these generalities into concrete actions. In the case of practical wisdom, the domain is the good human life generally (Nic. Eth. VI 8, 1141b15; Nic. Eth. VI 12, 1143b21–2), and the actions which it enables are ethically good actions. Hence, the general prescriptions associated with practical wisdom concern the living of a flourishing human life, rather than any more particular sphere of action. Practical wisdom also, like craft, involves an ability to grasp the connections between facts, but in a way that is specifically oriented towards action (Nic. Eth. VI 2, 1139a33–1139b5; Nic. Eth. VI 7, 1141b16–20).

Some of the complications involved in moving from general ethical prescriptions to concrete actions also mirror those regarding the movement from a craft prescription to the production of a craft object. For one, Aristotle holds that many or all general truths in ethics likewise hold only for the most part (Nic. Eth. I 3, 1094b12–27; 1098a25-34). The ethical prescription, for instance, to reciprocate generosity is a true ethical generalization, even if not an exceptionless one (Nic. Eth. IX 2, 1164b31). If ethical norms permit of exceptions, then knowing these norms will not always be sufficient for working out the ethical thing to do. A further epistemic capacity will be required in order to judge whether general ethical prescriptions apply in the concrete case at hand, and this is plausibly one function of phronēsis.

Aristotle also describes phronēsis as a capacity to work out what furthers good ends (Nic. Eth. VI 5, 1140a25–9; VI 9, 1142b31–3). He distinguishes it from the trait of cleverness, a form of means-end reasoning that is indifferent to the ethical quality of the ends in question (Nic. Eth. VI 12, 1144a23–5). Phronēsis is an ability to further good ends in particular, and to further them in the most appropriate ways. It has also been argued that phronēsis has the function of recognizing, not only the means to one’s virtuous ends, but also what would constitute the realization of those ends in the first place. For instance, I might have the intention to be generous, but it is another thing to work out what it means to be generous to this friend at this time under these circumstances. This is parallel to the way that, in seeking to construct a house, say, one needs to decide which particular type of house to construct given the constraints of location and resources.

One crucial difference between craft knowledge and practical wisdom is, however, the following. Whereas it suffices for craft knowledge to find a means to an end which is in accord with the goals of that craft, a practically wise person must find a way of realizing an ethical prescription which is in accord with all of the ethical virtues (Nic. Eth. VI 12, 1144a29–1144b1). This is a considerable practical ability in its own right, especially when the demands of different virtues come into conflict, as they might, for instance, when the just thing to do is not (or not obviously) the same as the kind or the generous thing to do. Practical wisdom thus requires, first, that one has all of the virtues so as to be sensitive to their various demands (Nic. Eth. VI 13, 1145a1–2). Over and above the possession of the virtues, practical wisdom calls for an ability to navigate their various requirements and arbitrate between them in concrete cases. In this way, it constitutes a far higher achievement than craft knowledge, since a person with practical wisdom grasps and succeeds in coordinating all of the goods constitutive of a human life rather than merely those directed towards the production of some particular kind of thing or the attainment of some specific goal.

6. References and Further Reading

Two good overviews of Aristotle’s views about knowledge, with complementary points of emphasis, are Taylor (1990) and Hetherington (2012). Bolton (2012) emphasizes Aristotle’s debt to Plato in epistemology. Fine (2021) is one of the few to treat Aristotle’s theory of knowledge in all generality at significant length, but readers should be aware that some of her central theses are not widely supported by other scholars. More advanced but nevertheless accessible pieces on Aristotle’s epistemology and philosophy of science may be found in Smith (2019), Anagnostopoulos (2009) and Barnes (1995).

The most in-depth study of Aristotle’s theory of scientific knowledge in the Posterior Analytics is Bronstein (2016), which focuses on the prior knowledge requirement and reads Aristotle’s views as a response to Meno’s paradox. See also Angioni (2016) on Aristotle’s definition of scientific knowledge in the Posterior Analytics. McKirahan (1992) and Barnes (1993) both provide useful commentary on the Posterior Analytics. See Barnes (1969), Burnyeat (1981), Lesher (2001) and Pasnau (2013) for views concerning whether Aristotle’s theory in the Posterior Analytics is best viewed as an epistemology, a philosophy of science, or something else. Sorabji (1980) also contains penetrating discussions of many specific issues in Aristotle’s epistemology and philosophy of science. For scholarly issues, Berti (1981) is still an excellent resource.

On Aristotle’s scientific method more generally, see Lennox (2021), Bolton (1987) and Charles (2002). For how we acquire knowledge of first principles, important contributions include Kahn (1981) and Bayer (1997) (who defend a view close to the traditional one), Irwin (1988) (who argues for the importance of a form of dialectic in coming to know first principles), and Gasser-Wingate (2016) (who argues for the role of induction and perception). Morison (2019) as well as Bronstein (2016) discuss at length the nature of knowledge of first principles and its relationship to nous in Aristotle.

Shields (2016) provides an excellent translation and up-to-date commentary on the De Anima. Kelsey (2022) gives a novel reading of De Anima as a response to Protagorean relativism. For Aristotle’s views on perception, see Modrak (1987) and Marmodoro (2014). Gasser-Wingate (2021) argues for an empiricist reading of Aristotle, against the rationalist reading of Frede (1996). On the more specific issue of whether Aristotle takes perception to involve a literal change in the sense organ, one can start with Caston (2004), Sorabji (1992) and Burnyeat (2002).

For the Nicomachean Ethics, Broadie and Rowe (2002) provide useful, if partisan, philosophical introduction and commentary, while Reeve (2014) provides extensive cross-references to other texts. For Aristotle’s views about practical wisdom, Russell (2014) and Reeve (2013) are useful starting points. Walker (2018) gives a prolonged treatment of Aristotle’s views about contemplation and its alleged “uselessness”, and Ward (2022) provides interesting background on the religious context of Aristotle’s views.

a. Bibliography

  • Anagnostopoulos, Georgios (ed.). 2009. A Companion to Aristotle. Sussex: Wiley-Blackwell.
  • Angioni, Lucas. 2016. Aristotle’s Definition of Scientific Knowledge. Logical Analysis and History of Philosophy 19: 140–66.
  • Barnes, Jonathan. 1969. Aristotle’s Theory of Demonstration. Phronesis 14: 123–52.
  • Barnes, Jonathan. 1993. Aristotle: Posterior Analytics. Oxford: Clarendon Press.
  • Barnes, Jonathan (ed.). 1995. The Cambridge Companion to Aristotle. Cambridge: Cambridge University Press.
  • Bayer, Greg. 1997. Coming to Know Principles in Posterior Analytics II.19. Apeiron 30: 109–42.
  • Berti, Enrico (ed.). 1981. Aristotle on Science: The Posterior Analytics. Proceedings of the Eighth Symposium Aristotelicum Held in Padua from September 7 to 15, 1978. Padua: Editrice Antenore.
  • Bolton, Robert. 1987. Definition and Scientific Method in Aristotle’s Posterior Analytics and Generation of Animals. In Philosophical Issues in Aristotle’s Biology. Cambridge: Cambridge University Press.
  • Bolton, Robert. 1997. Aristotle on Essence and Necessity. Proceedings of the Boston Area Colloquium in Ancient Philosophy (edited by John J. Cleary) 13:113–38. Leiden: Brill.
  • Bolton, Robert. 2012. Science and Scientific Inquiry in Aristotle: A Platonic Provenance. In The Oxford Handbook of Aristotle (edited by Christopher Shields), 46–59. Oxford: Oxford University Press.
  • Bolton, Robert. 2014. Intuition in Aristotle. In Rational Intuition: Philosophical Roots, Scientific Investigations, 39–54. Cambridge: Cambridge University Press.
  • Bolton, Robert. 2018. The Search for Principles in Aristotle. In Aristotle’s Generation of Animals: A Critical Guide (edited by Andrea Falcon and David Lefebvre), 227–48. Cambridge: Cambridge University Press.
  • Broadie, Sarah, and Christopher Rowe. 2002. Nicomachean Ethics. Philosophical Introduction and Commentary by Sarah Broadie (translated by Christopher Rowe). New York: Oxford University Press.
  • Bronstein, David. 2010. Meno’s Paradox in Posterior Analytics 1.1. Oxford Studies in Ancient Philosophy 38: 115–41.
  • Bronstein, David. 2012. The Origin and Aim of Posterior Analytics II.19. Phronesis 57(1): 29–62.
  • Bronstein, David. 2016. Aristotle on Knowledge and Learning: The Posterior Analytics. Oxford: Oxford University Press.
  • Bronstein, David. 2020. Aristotle’s Virtue Epistemology. In What the Ancients Offer to Contemporary Epistemology (edited by Stephen Hetherington and Nicholas Smith), 157–77. New York: Routledge.
  • Burnyeat, Myles. 1981. Aristotle on Understanding Knowledge. In Aristotle on Science: The Posterior Analytics (edited by Enrico Berti). Padua: Editrice Antenore.
  • Burnyeat, Myles. 2002. De Anima II 5. Phronesis 47(1): 28–90.
  • Burnyeat, Myles. 2011. Episteme. In Episteme, Etc. Essays in Honour of Jonathan Barnes (edited by Benjamin Morison and Katerina Ierodiakonou), 3–29. Oxford: Oxford University Press.
  • Byrne, Patrick. 1997. Analysis and Science in Aristotle. Albany: State University of New York Press.
  • Caston, Victor. 2004. The Spirit and the Letter: Aristotle on Perception. In Metaphysics, Soul and Ethics: Themes from the Work of Richard Sorabji (edited by Ricardo Salles), 245–320. Oxford University Press.
  • Charles, David. 2002. Aristotle on Meaning and Essence. Oxford: Oxford University Press.
  • Fine, Gail. 2021. Aristotle on Knowledge. In Essays in Ancient Epistemology, 221–32. Oxford University Press.
  • Frede, Michael. 1996. Aristotle’s Rationalism. In Rationality in Greek Thought (edited by Michael Frede and Gisela Striker), 157–73. Oxford University Press.
  • Gasser-Wingate, Marc. 2016. Aristotle on Induction and First Principles. Philosopher’s Imprint 16(4): 1–20.
  • Gasser-Wingate, Marc. 2019. Aristotle on the Perception of Universals. British Journal for the History of Philosophy 27(3): 446–67.
  • Gasser-Wingate, Marc. 2021. Aristotle’s Empiricism. New York: Oxford University Press.
  • Goldin, Owen. 1996. Explaining an Eclipse: Aristotle’s Posterior Analytics 2.1–10. Ann Arbor: The University of Michigan Press.
  • Greco, John. 2010. Achieving Knowledge. Cambridge: Cambridge University Press.
  • Hasper, Pieter Sjoerd, and Joel Yurdin. 2014. Between Perception and Scientific Knowledge: Aristotle’s Account of Experience. In Oxford Studies in Ancient Philosophy (edited by Brad Inwood), 47:119–50.
  • Hetherington, Stephen (ed.). 2012. Aristotle on Knowledge. In Epistemology: The Key Thinkers, 50–71. London: Continuum.
  • Hintikka, Jaakko. 1967. Time, Truth and Knowledge in Ancient Greek Philosophy. American Philosophical Quarterly 4(1): 1–14.
  • Irwin, Terence. 1988. Aristotle’s First Principles. Oxford: Clarendon Press.
  • Kahn, Charles. 1981. The Role of Nous in the Cognition of First Principles in Posterior Analytics II 19. In Aristotle on Science: The Posterior Analytics. Proceedings of the Eighth Symposium Aristotelicum Held in Padua from September 7 to 15, 1978 (edited by Enrico Berti). Padua: Editrice Antenore.
  • Kelsey, Sean. 2022. Mind and World in Aristotle’s De Anima. Cambridge, UK: Cambridge University Press.
  • Kiefer, Thomas. 2007. Aristotle’s Theory of Knowledge. London: Continuum.
  • Kosman, Aryeh. 2013. Understanding, Explanation, and Insight in Aristotle’s Posterior Analytics. In Virtues of Thought, 7–26. Cambridge: Harvard University Press.
  • Lennox, James G. 2021. Aristotle on Inquiry. Cambridge: Cambridge University Press.
  • Lesher, James H. 2001. On Aristotelian Ἐπιστήμη as ‘Understanding’. Ancient Philosophy 21(1): 45–55.
  • Lorenz, Hendrik. 2014. Understanding, Knowledge and Inquiry in Aristotle. In The Routledge Companion to Ancient Philosophy, 290–303. New York: Routledge.
  • Malink, Marko. 2013. Aristotle on Circular Proof. Phronesis 58(3): 215–48.
  • Marmodoro, Anna. 2014. Aristotle on Perceiving Objects. New York: Oxford University Press.
  • McKirahan, Richard. 1992. Principles and Proofs. Princeton: Princeton University Press.
  • Modrak, Deborah K. W. 1987. Aristotle: The Power of Perception. Chicago: University of Chicago Press.
  • Morison, Benjamin. 2012. Colloquium 2: An Aristotelian Distinction Between Two Types of Knowledge. In Proceedings of the Boston Area Colloquium of Ancient Philosophy (edited by Gary Gurtler and William Wians), 27:29–63.
  • Morison, Benjamin. 2019. Theoretical Nous in the Posterior Analytics. Manuscrito 42(4): 1–43.
  • Pasnau, Robert. 2013. Epistemology Idealized. Mind 122: 987–1021.
  • Reeve, C. D. C. 2013. Aristotle on Practical Wisdom: Nicomachean Ethics VI. Cambridge: Harvard University Press.
  • Reeve, C. D. C. 2014. Aristotle: Nicomachean Ethics. Indianapolis: Hackett.
  • Russell, Daniel C. 2014. Phronesis and the Virtues (NE VI 12–13). In The Cambridge Companion to Aristotle’s Nicomachean Ethics (edited by Ronald Polansky), 203–20. New York: Cambridge University Press.
  • Shields, Christopher. 2016. Aristotle. De Anima. Oxford: Clarendon Press.
  • Smith, Nicholas D. (ed.). 2019. The Philosophy of Knowledge: A History (Vol. I: Knowledge in Ancient Philosophy). London: Bloomsbury Academic.
  • Sorabji, Richard. 1980. Necessity, Cause and Blame. Perspectives on Aristotle’s Theory. Ithaca: Cornell University Press.
  • Sorabji, Richard. 1992. Intentionality and Physiological Processes: Aristotle’s Theory of Sense-Perception. In Essays on Aristotle’s De Anima (edited by Martha C. Nussbaum and Amelie Oksenberg Rorty), 195–225. Clarendon Press.
  • Sosa, Ernest. 2010. Knowing Full Well. Princeton: Princeton University Press.
  • Taylor, C. C. W. 1990. Aristotle’s Epistemology. In Epistemology (edited by Stephen Everson), 116–42. Cambridge: Cambridge University Press.
  • Walker, Matthew D. 2018. Aristotle on the Uses of Contemplation. Cambridge: Cambridge University Press.
  • Ward, Julie K. 2022. Searching for the Divine in Plato and Aristotle: Philosophical Theoria and Traditional Practice. Cambridge: Cambridge University Press.

 

Author Information

Joshua Mendelsohn
Email: jmendelsohn@luc.edu
Loyola University Chicago
U. S. A.

History of Utilitarianism

The term “utilitarianism” is most commonly used to refer to an ethical theory or a family of related ethical theories. It is taken to be a form of consequentialism, which is the view that the moral status of an action depends on the kinds of consequences the action produces. Stated this way, consequentialism is not committed to any view of what makes certain outcomes desirable. A consequentialist could claim (rather absurdly) that individuals have a moral obligation to cause as much suffering as possible. Similarly, a consequentialist could adopt an ethical egoist position, that individuals are morally required to promote their own interests. Utilitarians have their own position on these matters. They claim that it is utility (such as happiness or well-being) that makes an outcome desirable, and that an outcome with greater utility is morally preferable to one with less. Contrary to the ethical egoist, the utilitarian is committed to everyone’s interests being regarded as equally morally important.

These features are fairly uncontroversial among utilitarians, but other features are the subject of considerable dispute. How “utility” should be understood is contested. The favoured ways of understanding utilitarianism have varied significantly since Jeremy Bentham—seen as the “father of utilitarianism”—produced the first systematic treatise of the view. There have also been proponents of views that resemble utilitarianism throughout history, dating back to the ancient world.

This article begins by examining some of the ancient forerunners to utilitarianism, identifying relevant similarities to the position that eventually became known as utilitarianism. It then explores the development of what has been called “classical utilitarianism”. Despite the name, “classical utilitarianism” emerged in the 18th and 19th centuries, and it is associated with Jeremy Bentham and John Stuart Mill. Once the main features of the view are explained, some common historical objections and responses are considered. Utilitarianism as a social movement, particularly influential in the 19th century, is then discussed, followed by a review of some of the modifications of utilitarianism in the 20th century. The article ends with a reflection on the influence of utilitarianism since then.

Table of Contents

  1. Precursors to Utilitarianism in the Ancient World
    1. Mozi
    2. Epicureanism
  2. The Development of Classical Utilitarianism
    1. Hutcheson
    2. Christian Utilitarianism
    3. French Utilitarianism
  3. Classical Utilitarianism
    1. Origin of the Term
    2. Bentham
    3. Features of Classical Utilitarianism
      1. Consequentialism
      2. Hedonism
      3. Aggregation
      4. Optimific (‘Maximising’)
      5. Impartiality
      6. Inclusivity
    4. Early Objections and Mill’s Utilitarianism
      1. Dickens’ Gradgrindian Criticism
      2. The ‘Swine’ Objection and ‘Higher Pleasures’
      3. Demandingness
      4. Decision Procedure
  4. The Utilitarian Movement
  5. Utilitarianism in the 20th Century
    1. Hedonism and Welfarism
    2. Anscombe and ‘Consequentialism’
    3. Act versus Rule
    4. Satisficing and Scalar Views
  6. Utilitarianism in the Early 21st Century
  7. References and Further Reading

1. Precursors to Utilitarianism in the Ancient World

While utilitarianism became a refined philosophical theory (and the term “utilitarianism” was first used) in the 18th century, positions which bear strong similarities to utilitarianism have been deployed throughout history. For example, similarities are sometimes drawn between utilitarianism and the teachings of Aristotle, the Buddha and Jesus Christ. In this section, two views from the ancient world are considered. The first is that of Mozi, who is sometimes described as the first utilitarian (though this is disputed). The second is that of Epicurus, whose hedonism was influential on the development of utilitarianism.

a. Mozi

Mozi (c.400s-300s B.C.E.), also known as Mo-Tzu, Mo Di and Mo Ti, led the Mohist school in Chinese philosophy, which, alongside the Confucian school, was one of the two major schools of thought during the Warring States period (403-221 B.C.E.). In this article, some salient similarities between his ethical outlook and utilitarianism will be observed. For a more detailed discussion of Mozi’s philosophy, including how appropriate it is to view him as a utilitarian, see the article devoted to his writings.

Utilitarians are explicit about the importance of impartiality, namely that the well-being of any one individual is no more important than the well-being of anyone else. This is also found in Mozi’s writings. The term jian’ai is often translated as “universal love”, but it is better understood as impartial care or concern. This notion is regarded as the cornerstone of Mohism. The Mohists saw excessive partiality as the central obstacle to good behaviour. The thief steals because they do not sufficiently care for the person they steal from, and rulers instigate wars because they care more for their own good than for the people whose countries they invade. Thus, Mozi implored his followers to “replace partiality with impartiality”.

His emphasis on the importance of impartiality bears striking similarities to arguments later made by Bentham and Sidgwick. Mozi’s impartiality is like the utilitarian’s in that it implies inclusivity and equality. Every person’s interests are morally important, and they are equally important.

A second clear similarity between Mohists and utilitarians is the focus on consequences when considering the justifications for actions or practices. Unlike the Confucians, who saw rituals and customs as having moral significance, Mozi would accept them only if they served some useful purpose; a custom that serves none should be disposed of. For example, it was customary at the time to spend large quantities of resources on funeral rites, but Mozi criticised this practice because it conferred no practical benefit. This scrutiny of the status quo, and willingness to reform practices deemed unbeneficial, is something found repeatedly in utilitarians in the 18th century and beyond (see section 4).

A particularly interesting suggestion made by Mozi is that the belief in ghosts and spirits should be encouraged. He claimed that historically, a belief in ghosts who would punish dishonesty or corrupt behaviour had motivated people to act well. Upon seeing scepticism about ghosts in his time, Mozi thought this meant people felt free to act poorly without punishment: “If the ability of ghosts and spirits to reward the worthy and punish the wicked could be firmly established as fact, it would surely bring order to the state and great benefit to the people” (The Mozi, chapter 31).

Mozi approves of the belief in the existence of ghosts, whether or not they actually exist, because of the useful consequences of this belief. This suggestion that utility may count in favour of believing falsehoods is reminiscent of a claim by Henry Sidgwick (1838-1900). Sidgwick was a utilitarian, but he acknowledged that the general public may be happier if they did not believe utilitarianism was true. If that was the case, Sidgwick suggests that the truth of utilitarianism should be kept secret, and some other moral system that makes people happier be taught to society generally. This controversial implication (that it might be morally appropriate to mislead the general public when it is useful) is radical, but it is a reasonable inference from this type of moral view, which Mozi embraced.

A significant difference between Mozi and the utilitarians of the 18th century is the theory of the good he endorsed. Mozi sought to promote a range of goods, specifically order, wealth and a large population. Classical utilitarians, however, regarded happiness or pleasure as the only good. This view was presented shortly after Mozi, in Ancient Greece.

b. Epicureanism

The Epicureans, led by Epicurus (341-271 B.C.E.), were (alongside the Stoics and the Skeptics) one of the three major Hellenistic schools of philosophy. The Epicureans were hedonistic, which means that they saw pleasure as the only thing that was valuable in itself, and pain (or suffering) as the only ultimately bad thing.

This commitment is shared by later utilitarians, and it can be seen in slogans like “the greatest happiness of the greatest number”, which was later used by Francis Hutcheson and popularised by Bentham (though he later disliked it as too imprecise).

Though the Epicureans saw pleasure as the only good, the way they understood pleasure was somewhat different to the way one might imagine pleasure today. They realised that the most intense pleasures, perhaps through eating large amounts of tasty food or having sex, are short-lived. Eating too much will lead to pain further down the line, and appetites for sex dwindle. Even if appetites do not fade, becoming accustomed to intense pleasures may lead to sadness (a mental pain) further down the line if one’s desires cannot be satisfied. Thus, Epicurus endorsed finding pleasure in simple activities that could be reliably maintained for long periods of time. Rather than elaborate feasts and orgies, Epicurus recommended seeking joy in discussion with friends, developing tastes that could easily be satisfied and becoming self-sufficient.

A particular difference between the Epicurean view of pleasure and the view of later hedonists is that Epicurus regards a state of painlessness (being without any physical pains or mental disturbances) as one of pleasure. In particular, Epicurus thought we should aim towards a state of ataraxia, a state of tranquillity or serenity. For this reason, the Epicurean view is similar to a version of utilitarianism sometimes known as negative utilitarianism, which claims that morality requires agents to minimise suffering, as opposed to the emphasis typical utilitarians place on promoting happiness.

Epicurus also differed from utilitarians in terms of the scope of his teachings. His guidance was fairly insular, amounting to something like egoistic hedonism—one that encouraged everyone to promote their own personal pleasure. Epicurus encouraged his followers to find comfort with friends, and make their families and communities happy. This is a stark difference from the attitude of radical reform exhibited by Jeremy Bentham and his followers, who intended to increase the levels of happiness all over the world, rather than merely in the secluded garden that they happened to inhabit.

Epicurean teaching continued long after Epicurus’ death, with Epicurean communities flourishing throughout Greece. However, with the rise of Christianity, the influence of Epicureanism waned. There are several reasons that may explain this. The metaphysical picture of the world painted by Epicureans was one lacking in divine providence, which was seen as impious. Furthermore, the Epicurean attitude towards pleasure was often distorted, and portrayed as degrading and animalistic. This criticism, albeit unfair, would go on to be a typical criticism of utilitarianism (see 3.d.ii). Due to these perceptions, Epicureanism was neglected in the Middle Ages.

By the 15th century, this trend had begun to reverse. The Italian Renaissance philosopher Lorenzo Valla (1407-1457) was influenced by Epicurus and the ancient Epicurean Lucretius (99-55 B.C.E.). Valla defended Epicurean ideas, particularly in his work On Pleasure, and attempted to reconcile them with Christianity. Thomas More (1478-1535) continued the rehabilitation of hedonism. In Utopia (1516), More describes an idyllic society, where individuals are guided by the quest for pleasure. The Utopian citizens prioritised spiritual pleasures over animalistic ones, which may have made this view more amenable to More’s contemporaries. Later still, the French philosopher Pierre Gassendi (1592-1655) embraced significant portions of Epicurean thinking, including the commitment to ataraxia (tranquillity) as the highest pleasure. The Renaissance revival of Epicureanism paved the way for the development of utilitarianism.

2. The Development of Classical Utilitarianism

In the 17th and early 18th centuries, philosophical positions that are recognisably utilitarian gained prominence. None of the following thinkers labelled themselves as “utilitarians” (the word had not yet been introduced), and whether some should properly be described in this way is a matter of some dispute, but each contains significant utilitarian features and has an important place in the intellectual history.

a. Hutcheson

Francis Hutcheson (1694-1746) was a Scots-Irish philosopher sometimes seen as the first true utilitarian. Geoffrey Scarre (1996) suggests that Hutcheson deserves the title of “father of British utilitarianism” (though Bentham is more typically described in this kind of way). As with many attributions of this sort, this is heavily contested. Colin Heydt, for instance, suggests Hutcheson should not be classified as a utilitarian. Regardless, his contribution to the development of utilitarian thought is undisputed.

Hutcheson was a moral sense theorist. This means he thought that human beings have a special faculty for detecting the moral features of the world. The moral sense gives a person a feeling of pleasure when they observe pleasure in others. Further, the sense approves of actions which are benevolent. Benevolent actions are those that aim towards the general good.

One particular passage that had significant influence on utilitarians can be found in Hutcheson’s Inquiry Concerning the Original of Our Ideas of Virtue or Moral Good (1725):

In the same manner, the moral evil, or vice, is as the degree of misery, and number of sufferers; so that, that action is best, which procures the greatest happiness for the greatest numbers; and that, worst, which, in like manner, occasions misery.

The phrase, “greatest happiness for the greatest number(s)” became one of the major slogans of utilitarianism. This seems to be the first appearance of the phrase in English (though it was used decades previously by Leibniz). Because of this position, it is easy to see how Hutcheson can be interpreted as a utilitarian.

One important distinction between Hutcheson and utilitarians, however, is that he views the motives of individuals as what is valuable, rather than the state of affairs the action brings about. Whereas utilitarians view happiness itself as good, Hutcheson thinks it is the motives identified by our moral sense (which aim at happiness), which are good.

Hutcheson anticipates something similar to Mill’s higher/lower pleasures distinction (see 3.d.ii). In his posthumously published A System of Moral Philosophy, he says there are “a great variety of pleasures of different and sometimes inconsistent kinds, some of them also higher and more durable than others” (1755). Hutcheson associates dignity and virtuous action with the higher pleasures, and claims that “the exercise of virtue, for some short period, provided it is not succeeded by something vicious, is of incomparably greater value than the most lasting sensual pleasures”. These “higher” pleasures include social and intellectual activities, and are held to trump “lower” pleasures, like food and sex. Hutcheson is aware, however, that pleasures are “generally blended”. Lower pleasures may be accompanied by socialising, moral qualities, or friendship.

This appreciation for the variety and combinations of pleasure adds a rich texture to Hutcheson’s account. However, these intricacies may indicate a further difference between his view and utilitarianism. For the utilitarian, for a certain type of activity to be more valuable than another, this must be explained in terms of pleasure. Hutcheson, however, seems to determine which pleasures are higher and lower based on prior views he harbours about which are noble. He supposes that people who possess “diviner faculties and fuller knowledge” will be able to judge which pleasures are better, and thus which it is better to engage in and promote in others.

Hutcheson is further distinct from utilitarians in that it is unclear whether he is actually trying to provide a theory of right action. He notes that our moral sense can discern which actions are best and worst, but he does not explicitly link this to an account of what it is our duty to do, or what it would be wrong for us not to do. This could be viewed simply as something Hutcheson omitted, but alternatively could be interpreted as a version of scalar utilitarianism (see section 5.d).

b. Christian Utilitarianism

Utilitarianism today is usually seen as a secular doctrine. From Bentham onwards, utilitarians typically attempted to describe their worldview without referring to any theistic commitments. In the 18th century, however, there was a distinct branch of early utilitarians who gave theistic justifications for their position. Participants in this strand are sometimes referred to as “Anglican utilitarians”. Richard Cumberland (1631-1718) was an early example of this, and was later followed by John Gay (1699-1745), Soame Jenyns (1704-1787), Joseph Priestley (1733-1804), and William Paley (1743-1805). Paley’s Principles of Moral and Political Philosophy (1785) was the first to bring utilitarianism to a wider audience, and it remained the most discussed example of utilitarianism well into the 19th century.

Cumberland was a natural law theorist, which is to say that he held that moral truths are determined by or can be derived from features of the world, including the nature of human beings. In Cumberland’s view, because human beings find pleasure good and pain bad, they can discern that God wills that they promote pleasure and diminish pain. In A Treatise of the Laws of Nature (1672), he writes:

Having duly pondered on these matters to the best of our ability, our minds will be able to bring forth certain general precepts for deciding what sort of human actions may best promote the common good of all beings, and especially of rational beings, in which the proper happiness of each is contained. In such precepts, provided they be true and necessary, is the law of nature contained.

So, armed only with empirical facts about the world, like experiences of pleasure and pain, and our possessing the faculty of reason, Cumberland claimed that it was possible to ascertain that human beings have a God-given duty to promote the general happiness.

While secular versions of utilitarianism came to dominate the tradition, this type of argument for utilitarianism actually has some distinct advantages. Notably, it can provide a simple answer to the question “Why be moral?”. Everyone may value their own happiness, so this provides everyone with a reason to act in ways that increase their own happiness. However, there are instances where promoting one’s own personal happiness seems to conflict with the common good. John Gay issued a challenge for secular versions of utilitarianism to explain why an agent in such a position has reason to sacrifice their own happiness to help others: “But how can the Good of Mankind be any Obligation to me, when perhaps in particular Cases, such as laying down my Life, or the like, it is contrary to my Happiness?” (Concerning the Fundamental Principle of Virtue or Morality, 1731).

For the Anglican utilitarian, this question is resolved easily. While it might appear that an individual’s happiness is best promoted by a selfish act contrary to the public good, this is only because rewards of the afterlife have not been taken into account. When someone recognises the infinite rewards for complying with God’s will (or infinite punishments for defying it), they will realise that acting in the interests of the common good (promoting the general happiness) is actually in their best interests. This kind of solution to the problem of moral motivation is not available for secular utilitarians.

Although theistically grounded versions of utilitarianism may stand on firmer ground when it comes to the problem of moral motivation, there are costs too. There are challenges to the existence of an all-powerful creator (see arguments for atheism). Even if those are avoided, the natural law reasoning championed by the Anglican utilitarians might not be persuasive. The inference from what kinds of things people enjoy to a specific divine purpose of human beings (for example, Priestley claims that we can discover that God “made us to be happy”) is one that might be scrutinised. Furthermore, the theistic utilitarian faces a version of the Euthyphro problem: is happiness good because God desires it, or does God desire happiness because it is good?

The Anglican utilitarians foresaw some of the problems that would become serious areas of discussion for later utilitarians. In Priestley, for instance, one can find a discussion of what would later be known as the “demandingness objection” (discussed in section 3.d.iii).

William Paley’s utilitarianism is of historical interest because he discussed several features of the view that have concerned utilitarians and their critics since. For example, he raised the question of whether certain types of action usually deemed to be evil, such as bribery or deceit, might be regarded as morally good if they lead to good consequences:

It may be useful to get possession of a place…or of a seat in parliament, by bribery or false swearing: as by means of them we may serve the public more effectually than in our private station. What then shall we say? Must we admit these actions to be right, which would be to justify assassination, plunder and perjury; or must we give up our principle, that the criterion of right is utility? (The Principles of Moral and Political Philosophy, 1785: 854).

In his answer to this question, Paley suggests a form of what would later be known as rule-utilitarianism (discussed further in section 5.c). He suggests that two types of consequences of an action can be distinguished—the general consequences and the particular consequences. The particular consequence is what follows from a specific action, that is, bribing someone on a given occasion. The general consequence is what follows from acting on that rule, and it is the general consequence Paley views as more important. Paley suggests that, in considering whether bribery to gain a political position is right, one should think about the consequences if everyone accepted a rule where bribery was allowed. Once this is taken into account, Paley argues, it will become apparent that bribery is not useful.

Like Epicurus, Paley is somewhat dismissive of animalistic pleasures, but his explanation for this differs. He makes a distinction between pleasures, which are fleeting, and happiness, which he seems to regard as possessed over longer periods of time:

Happiness does not consist in the pleasures of sense, in whatever profusion or variety they be enjoyed. By the pleasures of sense, I mean, as well the animal gratifications of eating, drinking, and that by which the species is continued, as the more refined pleasures of music, painting, architecture, gardening, splendid shows, theatric exhibitions; and the pleasures, lastly, of active sports, as of hunting, shooting, fishing, etc. (Principles of Moral and Political Philosophy, 35)

He claims these bodily pleasures do not contribute to happiness because they are too fleeting and “by repetition, lose their relish”. Rather, Paley sees happiness as consisting in social activities, the exercise of our faculties, and good health. Paley might then be seen as suggesting that happiness is something one does, rather than something one experiences. He also emphasises the importance of “prudent constitution of the habits” (which bears similarities to Aristotelian ethics). This distinguishes Paley somewhat from the classical utilitarians, who regarded pleasure as a mental state, and happiness as consisting in pleasure as well as an absence of pain.

William Paley is also somewhat distinctive due to his conservative values. Unlike Bentham and his followers, who were radical reformers, Paley found the status quo satisfactory. This difference arises for a few different reasons. One explanation for this is that he thought that happiness was relatively evenly distributed around society. He did not think, for instance, that the wealthy were significantly happier than the poor. He argued that this was the case because of his view of happiness—he thought the wealthy and the poor had fairly equal access to social activities, utilising their faculties, and good health.

In his discussions of what acts should be regarded as criminal and what the punishments should be, he does appeal to utility, but also regularly to scripture. As a consequence, Paley’s position on many social issues is one that would now be considered extremely regressive. For example, he favoured financial penalties for women guilty of adultery (but did not suggest the same for men) and argued that we should not pursue leisure activities (like playing cards or frequenting taverns) on the Sabbath. Like many of the later utilitarians, Paley did argue that slavery should be abolished, criticising it as an “odious institution”, but he was in favour of a “gradual” emancipation.

The Anglican utilitarians were extremely influential. Bentham was familiar with their work, citing Joseph Priestley in particular as a major inspiration. Many of the discussions that later became strongly associated with utilitarianism originated here (or were at least brought to a wider audience). An obvious difference between many of the Anglican utilitarians and the later (Benthamite) utilitarians is the conservativism of the former. (One notable exception is perhaps found in Priestley, who celebrated the French Revolution. This reaction was met with such animosity—his chapel was destroyed in a riot—that he emigrated to America.) The Anglican utilitarians were committed to the traditional role of the church and did not endorse anything like the kind of radical reform championed by Bentham and his followers.

c. French Utilitarianism

The development of utilitarianism is strongly associated with Britain. John Plamenatz described the doctrine as “essentially English”. However, a distinctly utilitarian movement also took place in 18th-century France. Of the French utilitarians, Claude Helvétius (1715-1771) and François-Jean de Chastellux (1734-1788) are of particular interest.

While the dominant form of utilitarianism in Britain in the 18th century was the Anglican utilitarianism of John Gay (see 2.b), the French utilitarians argued from no divine commitments. Helvétius’ De l’Esprit (1758) was ordered to be burned due to its apparently sacrilegious content. That the French utilitarians were secular has some implications that make the movement historically noteworthy. As mentioned above (section 2.b), one advantage of theistically grounded utilitarianism is that it solves the problem of moral motivation: one should promote the well-being of others because God desires it, and, even if one is fundamentally self-interested, it is in one’s interests to please God (because one’s happiness in the afterlife depends on God’s will). Without the appeal to God, giving an account of why anyone should promote the general happiness, rather than their own, becomes a serious challenge.

Helvétius poses an answer to this challenge. He accepts that the general good is what we should promote, but also, influenced by the Hobbesian or Mandevillian view of human nature, holds that people are generally self-interested. So, people should promote the general good, but human nature will mean that they will promote their individual goods. Helvétius takes this to show that we need to design our laws and policies so that private interest aligns with the general good. If everyone’s actions will be directed towards their own good, as a matter of human nature, “it is only by incorporating personal and general interest, that they can be rendered virtuous.” For this reason, he claims that morality is a frivolous science, “unless blended with policy and legislation”. Colin Heydt identifies this as the key insight that Bentham takes from Helvétius.

Taking this commitment seriously, Helvétius considered what it took to make a human life happy, and what circumstances would be most likely to bring this about. He approached this with a scientific attitude, suggesting “that ethics ought to be treated like all the other sciences. Its methods are those of experimental physics”. But this raises the question of how policy and legislation should be designed to make people happy.

Helvétius thought that to be happy, people needed to have their fundamental needs met. In addition to this, they needed to be occupied. Wealthy people may often find themselves bored, but the “man who is occupied is the happy man”. So, the legislator should seek to ensure that citizens’ fundamental needs are met, but also that they are not idle, because he viewed labour as an important component in the happy life. Helvétius treats the suggestion that labour is a negative feature of life with scorn, claiming:

“To regard the necessity of labour as the consequence of an original sin, and a punishment from God, is an absurdity. This necessity is, on the contrary, a favour from heaven” (A Treatise on Man: His Intellectual Faculties and Education, volume 2).

Furthermore, certain desires and dispositions are amenable to an individual’s happiness, so the legislator should encourage citizens to psychologically develop a certain way. For instance, people should be persuaded that they do not need excessive wealth to be happy, and that in fact, luxury does not enhance the happiness of the rich. Because of this, he proposed institutional restrictions on what powers, privileges, and property people could legally acquire. In addition, Helvétius suggested that education should serve to restrict citizens’ beliefs about what they should even want to acquire, that is, people could be taught (or indoctrinated?) not to want anything that would not be conducive to the public good.

As poverty does negatively affect the happiness of the poor, Helvétius defended limited redistribution of wealth. Specifically, one suggestion he offered was to force families that have shrunk in size to relinquish some of their land to families which have grown. Exactly what is the best way to move from a state of misery (which he thought most people were in) to a state of happiness would vary from society to society. So specific suggestions may have limited application. Helvétius urged that this transformation should take place and might involve changing how people think.

In Chastellux’s work, the view that governments should act primarily to promote public happiness is explicit. In his De la Félicité publique (1774), he says: It is an indisputable point, (or at least, there is room to think it, in this philosophical age, an acknowledged truth) that the first object of all governments, should be to render the people happy.

Accepting this, Chastellux asked how this should be done. What is most noteworthy in Chastellux is that he pursued a historical methodology, examining what methods of government had been most successful in creating a happy populace, so that the more successful efforts might be emulated and developed. From his observations, Chastellux claimed that no society so far had discovered the best way to ensure the happiness of its citizens, but he did not find this disheartening. He noted that even if all governments had aimed at the happiness of their citizens, it would “be no matter of astonishment” that they had so far failed, because human civilisation was still in its infancy. He harboured optimism that the technological developments of the future could help improve the quality of life of the poorest in society.

While the historical methodology found in Chastellux may be questionable (Geoffrey Scarre describes it as “fanciful and impressionistic”), it showed a willingness to utilise empirical measures in determining what is most likely to promote the general happiness.

Of the French utilitarians, Helvétius had the greatest influence on later developments in Britain; he was regularly acknowledged by Jeremy Bentham, William Godwin, and John Stuart Mill. The conviction to create good legislation and policies forms the crucial desire of utilitarians in the political realm. In Helvétius, we can also see the optimism of the radical reformer utilitarians, holding to his hope that “wise laws would be able without doubt to bring about the miracle of a universal happiness”.

3. Classical Utilitarianism

While many thinkers were promoting recognisably utilitarian ideas long before him, it is Jeremy Bentham who is credited with providing the first systematic account of utilitarianism in his Introduction to the Principles of Morals and Legislation (1789).

a. Origin of the Term

The word “utilitarianism” is not used in Jeremy Bentham’s Introduction to the Principles of Morals and Legislation (IPML). There he introduces the ‘principle of utility’, that “principle which approves or disapproves of every action whatsoever, according to the tendency it appears to have to augment or diminish the happiness of the party whose interest is in question; or, what is the same thing in other words to promote or to oppose that happiness”. Bentham borrows the term “utility” from David Hume’s Treatise of Human Nature (1739-1740). There, Hume argues that whenever a character trait is viewed as a virtue, this can be explained by the propensity of that trait to cause happiness (‘utility’). Bentham later reported that upon reading this, he “felt as if scales had fallen from my eyes”.

The first recorded use of the word “utilitarianism” comes in a letter Bentham wrote in 1781. The term did not catch on immediately. In 1802, in another letter, Bentham was still resisting the label “Benthamite” and encouraging the use of “utilitarian” instead. While Bentham seems to have originated the term, this does not seem to have been common knowledge. John Stuart Mill, in Utilitarianism (1861), notes that he found the term in an 1821 John Galt novel. He was using it as early as 1822, when he formed a society called the ‘Utilitarian Society’, a group of young men who met every two weeks for three and a half years. After this, the term entered common parlance.

b. Bentham

As well as providing what became the common name of the view, Jeremy Bentham (1748-1832) is credited with making utilitarianism a systematic ethical view. His utilitarian inclinations were sparked when he read Joseph Priestley’s Essay on Government (1768), and he claims that the “greatest happiness of the greatest number” is the measure of right and wrong in his Fragment on Government (1776). It is in IPML, however, where the ideas are presented most clearly and explicitly.

In IPML, Bentham defines utility as “that property in any object, whereby it tends to produce benefit, advantage, pleasure, good, or happiness”. In the opening of IPML, Bentham makes clear his view that utility (pleasure and pain) determines the rightness or wrongness of an action. He states:

Nature has placed mankind under the governance of two sovereign masters, pain and pleasure. It is for them alone to point out what we ought to do, as well as determine what we shall do. On the one hand the standard of right and wrong, on the other the chain of causes and effects, are fastened to their throne. They govern us in all we do, in all we say, in all we think: every effort we can make to throw off our subjection, will serve but to demonstrate and confirm it.

As well as emphasising hedonism as the standard of rightness (normative hedonism), Bentham seems here committed to a certain view about our motivation. He not only claims that the rightness or wrongness of an action is determined by pain/pleasure, but also that these notions determine what we will do. Specifically, following Hobbes, Bentham thought that everyone is, as a matter of fact, always motivated by their own happiness, a form of psychological egoism. If we accept the ought-implies-can principle, the idea that we can only be required to act in ways that it is actually possible for us to act, this is a difficult position to reconcile with the claim that we ought to promote the general happiness. If human beings are necessarily always motivated by their own self-interest, imploring them to promote the interests of others seems futile.

Bentham was aware of this sort of objection. One type of response he gives is to claim that we should ensure, where possible, that society is structured so that when individuals act in their own interests, this is conducive to the general happiness. This answer is reminiscent of the strategy deployed by Helvétius (section 2.c). When the incentive and punitive structures in society are structured in this way, self-interested actions benefit the wider community. A second response is to suggest that individuals do benefit from living in a community where the general good is promoted. This amounts to a denial that any self-interested action actually does clash with the general good. This strikes many as implausible, as any action that would promote the general good but be bad for the individual acting would disprove it. This move is rendered unnecessary if psychological egoism is abandoned, and given some of the arguments against the view, Bentham’s utilitarianism may be better off without that psychological claim.

One of the ideas Bentham is known for is the “hedonic calculus” or “felicific calculus” (though Bentham never himself used either of these terms). The crux of this is the thought that to determine the value of an action, one can use a kind of moral ledger. On one side of the ledger, the expected good effects of the action and how good they are can be added up. On the other side, the bad effects of the action can be added. The total value of the negative effects can then be subtracted from the value of the positive effects, giving the total value of the action (or policy). This idea was first introduced by Pierre Bayle (1647-1706), though Bentham adds considerable depth to the idea.
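The ledger arithmetic described above can be illustrated with a minimal sketch. The effects, their numerical values, and the names Effect and net_value are hypothetical and chosen only for illustration; Bentham supplies no units or scale of this kind.

```python
from dataclasses import dataclass


@dataclass
class Effect:
    description: str
    value: float  # hypothetical units: positive for a good effect, negative for a bad one


def net_value(effects):
    """Total of the good side of the ledger minus the total of the bad side."""
    goods = sum(e.value for e in effects if e.value > 0)
    bads = sum(-e.value for e in effects if e.value < 0)
    return goods - bads


# An invented policy with one good effect and two bad effects.
policy = [
    Effect("new park enjoyed by residents", 8.0),
    Effect("construction noise", -3.0),
    Effect("higher local taxes", -2.0),
]

print(net_value(policy))  # 8.0 - (3.0 + 2.0) = 3.0
```

On this toy ledger the policy has a positive net value, so it would count as better than doing nothing (net value zero), provided the estimates of its effects are accepted.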

In considering how to value a quantity of pleasure (or pain), Bentham observed that we can evaluate it with regards to seven dimensions or elements. These are the pleasure’s:

(1) intensity

(2) duration (how long the pleasure lasts)

(3) certainty/uncertainty (the probability it will occur)

(4) propinquity or remoteness (how soon the pleasure will occur)

(5) fecundity (how likely it is to be followed by further pleasures)

(6) purity (how likely it is to be followed or accompanied by pains)

(7) extent (the number of persons it extends to)

Bentham included a poem in the second edition of IPML, so that people could remember these dimensions:

Intense, long, certain, speedy, fruitful, pure –
Such marks in pleasures and in pains endure.
Such pleasures seek if private be thy end:
If it be public, wide let them extend
Such pains avoid, whichever be thy view:
If pains must come, let them extend to few.

On Bentham’s view, these are all the features we must know of a certain pleasure. Importantly, even a frivolous game, if it turns out to have the same intensity, duration, and so forth, is just as good as intellectual pursuits. He says this explicitly about the game push-pin (a children’s game where players try to hit each other’s pins on a table): “Prejudice apart, the game of push-pin is of equal value with the arts and sciences of music and poetry”. Notably, this view set him apart from those who claimed a difference in kind between types of pleasures, like John Stuart Mill (see section 3.d.ii).

While Bentham does suggest that this kind of happiness arithmetic would be successful in determining what actions are best, he does not suggest that we consider every factor of every possible action in advance of every given action. This would obviously be excessively time consuming, and could result in a failure to act, which would often be bad in terms of utility. Rather, we should use our experience as a guide to what will likely promote utility best.

Though the term “greatest happiness for the greatest number” has become strongly associated with utilitarianism and is used by Bentham in earlier works, he later distanced himself from it, because in it “lurks a source of misconception”. One interpretation of the expression suggests we should ascertain the largest number of people benefited by an action (the greatest number), and benefit those as much as possible, no matter what the effects are on the remainder. For instance, we could imagine a policy that enslaved 1% of the population for the benefit of the 99%, greatly benefiting that majority, but making the enslaved miserable. A policy like this, which ignores entirely the well-being of some, is certainly not what Bentham intended. He later speaks simply of the “greatest happiness principle”, the requirement to promote the greatest happiness across the whole community.

Bentham was an active reformer. He argued for radical political changes, including the right to vote for women, significant prison reforms, the abolition of slavery, the elimination of capital punishment, and sexual freedom. Each of these was argued for on grounds of utility. Bentham gained a number of intellectual followers. One of the most notable of these was James Mill (1773-1836), who was one of the major figures in 19th century philosophy and economics. Mill’s reputation was international, attracting attention from Karl Marx (1818-1883), and he is still seen as one of the most important figures in utilitarianism, but today he is overshadowed by his son, John Stuart. John Stuart Mill (1806-1873) met Bentham when he was two years old, and, under the influences of Bentham and his father, became one of utilitarianism’s fiercest champions. John Stuart Mill’s defence of utilitarianism is still the most widely read today (discussed in more depth in 3.d).

c. Features of Classical Utilitarianism

It is a matter of some dispute what features make a moral theory appropriate for the name utilitarianism. The core features mentioned here are those commonly associated with classical utilitarianism. It is not clear how many of those associated with utilitarianism, even in 19th century Britain, actually accepted classical utilitarianism, that is, who thought the correct moral theory possessed these six features. For instance, though John Stuart Mill is regarded as the man who did most to popularise the view, he rejected elements of this picture, as he explicitly rejected the requirement to maximise utility (see Jacobson 2008 for a discussion of how Mill deviates from this orthodox picture). Regardless of how many actually held it, the view consisting of these claims has become the archetype of utilitarianism. The more a moral view departs from these, the less likely it is to be deemed a version of utilitarianism.

i. Consequentialism

Views are classed as consequentialist if they place particular emphasis on the role of the outcome of actions, rather than features intrinsic to the actions (for example, whether it involves killing, deception, kindness, or sympathy) as forms of deontology do, or what the actions might reveal about the character of the agent performing them (as does virtue ethics).

Classical utilitarianism is uncontroversially consequentialist. Later variations, such as rule-utilitarianism (see section 5.c), which regard consequences as having an important role, are less easily categorised. Versions of utilitarianism that do not assess actions solely in terms of the utility they produce are sometimes referred to as indirect forms of utilitarianism.

ii. Hedonism

Following the Epicureans, classical utilitarianism regards pleasure as the only thing that is valuable in itself. Pleasure is the “utility” in classical utilitarianism. On this view, actions are morally better if they result in more pleasure, and worse if they result in less.

Hedonists differ on how they understand pleasure. The Epicureans, for instance, regarded a state of tranquility (ataraxia) as a form of pleasure, and one that should be pursued because it is sustainable. Classical utilitarians typically regard pleasure as a mental state which the individual experiences as positive. Bentham evaluated pleasures across his seven elements, but importantly thought no pleasure was superior in kind to any other. For example, the pleasure from eating fast food is no less valuable than the pleasure one may attain from reading a great novel, though they may differ in terms of sustainability (one might become ill fairly quickly from eating fast food) or propinquity (pleasure from fast food may be quick, whereas it may take some time to come to appreciate complex prose). This parity of pleasures was something John Stuart Mill disagreed with, leading to a notable difference in their views (see 3.d.ii).

Many contemporary utilitarians, recognising issues with hedonism, have instead adopted welfarism, the weaker claim that the only thing that is intrinsically valuable is well-being, that is, whatever it is that makes a life go well. Well-being could be given a hedonistic analysis, as in classical utilitarianism, but alternatively a preference-satisfaction view (which states that one’s well-being consists in having one’s preferences satisfied) or an objective-list view (which states that lives go well or badly depending on how well they satisfy a set list of criteria) could be adopted.

iii. Aggregation

The utilitarian thinks that everyone’s individual pleasure is good, but they also think it makes sense to evaluate how good an outcome is by adding together all the respective quantities of pleasure (and pain) of the individuals affected. Imagine that we can assign a numerical value to how happy every person is (say 10 is as happy as you could be, zero is neither happy nor unhappy, and -10 is as unhappy as you could be). The aggregative claim holds that we can simply add the quantities together for each available action to see which is the best.

One of the criticisms sometimes made of utilitarianism is that it ignores the separateness of persons. When we decide actions based on aggregated sums of happiness, we no longer think about individuals as individuals. Instead, they are treated more like happiness containers. A related complaint is that determining the best outcome by adding together the happiness scores of every individual can obscure extremes that might be morally relevant. This has implications that many find counterintuitive, such as that this method may judge an outcome where one person undergoes horrific torture to be a good outcome, so long as enough other people are happy.
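A small sketch can make both the aggregative claim and the complaint about obscured extremes concrete. It uses the scale from -10 to 10 described above; the two outcomes, the scores assigned to each person, and the function name aggregate are invented purely for illustration.

```python
def aggregate(scores):
    """Add together the happiness scores of everyone affected."""
    return sum(scores)


# Hypothetical scores for five people on the scale from -10 to 10.
outcome_even = [3, 3, 3, 3, 3]       # everyone is moderately happy
outcome_skewed = [7, 7, 7, 7, -10]   # four are very happy, one is as unhappy as possible

print(aggregate(outcome_even))    # 15
print(aggregate(outcome_skewed))  # 18

# Simple aggregation ranks the skewed outcome higher...
best = max([outcome_even, outcome_skewed], key=aggregate)
print(best is outcome_skewed)     # True

# ...even though its worst-off person fares far worse, a fact the total conceals.
print(min(outcome_even), min(outcome_skewed))  # 3 -10
```

The totals alone do not reveal that one person in the second outcome sits at the bottom of the scale, which is the feature critics regard as morally relevant.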

iv. Optimific (‘Maximising’)

Hedonists believe pleasure is the only good. Aggregation commits utilitarians to the idea that the pleasures and pains of different people can be added to compare the value of outcomes. One could accept these claims without thinking that a moral agent must always do the best. Classical utilitarianism does hold that one is required to perform the best action. In other words, classical utilitarianism is a maximising doctrine (“maximising” is another word introduced into English by Jeremy Bentham).

Maximising views are controversial. One reason for this is that they eliminate the possibility of supererogatory actions, that is, actions that are beyond the call of duty. For example, we might think donating most of your income to charity would be a wonderful and admirable thing to do, but not something that is usually required. The maximiser claims that you must do the best action, and this is the case even if doing so is really difficult, or really costly, for the person acting.

Some of the most persistent criticisms of utilitarianism concern how much it demands. In response, some of the 20th-century revisions of the view sought to abandon this element, for example, satisficing versions and scalar views (5.d).

v. Impartiality

Utilitarians embrace a form of egalitarianism. No individual’s well-being is more important than any other’s. Because of this, utilitarians believe that it is just as important to help distant strangers as it is to help people nearby, including one’s friends or family. As Mill puts it, utilitarianism requires an agent “to be as strictly impartial as a disinterested and benevolent spectator”.

In fact, sometimes impartiality may require a person to help a stranger instead of a loved one. William Godwin (1756-1836) highlighted this in a famous example. He described a scenario where a fire broke out, and a bystander was able to save either Archbishop Fénelon (a famous thinker and author of the time) or a chambermaid. Godwin argued that because of Fénelon’s contributions to humanity, a bystander would be morally required to save him. Moreover, Godwin claimed, one would be required to save Fénelon even if the chambermaid was one’s mother.

This requirement for strict impartiality strikes many as uncomfortable, or even alienating. When challenged, Godwin defended his position, but insisted that scenarios where this kind of sacrifice is required would be rare. In most instances, he thought, people do happen to be more able to bring happiness to themselves or their loved ones, because of greater knowledge or increased proximity. In this way, some partial treatment, like paying more attention to one’s friends or family, can be defended impartially.

vi. Inclusivity

The classical utilitarian accepts the hedonist commitment that happiness is what is valuable. It is a separate question whose happiness should count. Utilitarians answer this with the most inclusive answer possible—everyone’s. Any subject that is capable of pleasure or pain should be taken into consideration.

This has some radical implications. As well as human beings, many animals can also experience pleasure or pain. On this topic, one passage from Bentham is regularly deployed by defenders of animal rights:

It may come one day to be recognized, that the number of legs, the villosity of the skin, or the termination of the os sacrum, are reasons equally insufficient for abandoning a sensitive being to the same fate. What else is it that should trace the insuperable line? Is it the faculty of reason, or perhaps, the faculty for discourse? …the question is not, Can they reason? nor, Can they talk? but, Can they suffer? (IPML, chapter XVII)

Reasoning of this sort extends the domain of morally relevant beings further than many were comfortable with. Bentham was not alone among utilitarians in suggesting that non-human life should be taken into moral consideration. In his Utilitarianism, Mill noted that lives full of happiness and free from pain should be “secured to all mankind; and not to them only, but, so far as the nature of things admits, to the whole sentient creation.” This emphasis on the importance of the well-being of animal life, as well as human life, has persisted into contemporary utilitarian thought.

d. Early Objections and Mill’s Utilitarianism

In the 19th century, knowledge of utilitarianism spread throughout society. This resulted in many criticisms of the view. Some of these were legitimate challenges to the view, which persist in some form today. Others, however, were based upon mistaken impressions.

In 1861, frustrated by what he saw as misunderstandings of the view, John Stuart Mill published a series of articles in Fraser’s Magazine, introducing the theory and addressing some common misconceptions. This was later published as a book, Utilitarianism (1863). Mill was somewhat dismissive of the importance of this work. In letters, he described it as a “little treatise”, and barely mentioned it in his Autobiography (unlike all his other major works). Despite this, it is the most widely consulted defence of utilitarianism.

Here are some of the early criticisms of utilitarianism, and Mill’s responses.

i. Dickens’ Gradgrindian Criticism

In the 19th century, utilitarianism was perceived by some of its detractors as cold, calculating, and unfeeling. In his 1854 novel, Hard Times, Charles Dickens portrays a caricature of a utilitarian in the character of Thomas Gradgrind. Gradgrind, who is described explicitly as a utilitarian, is introduced as follows:

Thomas Gradgrind, sir. A man of realities. A man of facts and calculations. A man who proceeds upon the principle that two and two are four, and nothing over, and who is not to be talked into allowing for anything over. Thomas Gradgrind, sir—peremptorily Thomas—Thomas Gradgrind. With a rule and a pair of scales, and the multiplication table always in his pocket, sir, ready to weigh and measure any parcel of human nature, and tell you exactly what it comes to. It is a mere question of figures, a case of simple arithmetic. You might hope to get some other nonsensical belief into the head of George Gradgrind, or Augustus Gradgrind, or John Gradgrind, or Joseph Gradgrind (all supposititious, non-existent persons), but into the head of Thomas Gradgrind—no, sir!

The reputation of utilitarians for being joyless and overly fixated on precision was so established that John Stuart Mill addressed this misconception in Utilitarianism (1861). Mill complains that opponents of utilitarianism have wrongly supposed that the view opposes pleasure, an error he describes as an “ignorant blunder”. This view of the position may come, in part, from its name, and the focus on utility, or what is useful or functional, terms seldom associated with happiness.

Despite Mill’s frustrations with this criticism, the colloquial use of the word “utilitarian” continued to have a similar connotation long after his death. In an episode of the sitcom Seinfeld, for example, Elaine notes that while the female body is aesthetically appealing, “The male body is utilitarian; it’s for getting around. It’s like a Jeep” (1997). The implication is that utilitarian objects are functional rather than fun. This association may be unfortunate and unfair, as Mill argues, but it has been a persistent one.

This particular criticism may be unfortunate, but aspects of it (such as the focus on measurement and arithmetic) foreshadow some of utilitarianism’s later criticisms, like John Rawls’ (1921-2002) suggestion that it cannot appreciate the separateness of persons, or Bernard Williams’ (1923-2003) complaint that the view insists that people regard themselves as merely nodes in a utility calculus.

ii. The ‘Swine’ Objection and ‘Higher Pleasures’

Another criticism that was regularly levelled against utilitarianism was that it is unfit for humans, because the focus on pleasure would not allow for the pursuit of uniquely human goods. This was a criticism also made (unfairly) of the Epicureans. It suggested that the hedonist would endorse a life consisting entirely in eating, sleeping, and having sex, a life devoid of more sophisticated activities like listening to music, playing card games, or enjoying poetry. The allegation suggests that the utilitarian proffers an ethics for swine, which is undignified for human beings. Consequently, the opponent suggests, the view must be rejected.

There are several ways a utilitarian could respond to this. They could make use of the Epicurean strategy, which is to suggest that the animalistic pleasures are just as good, but they are not sustainable. If you try to spend all your time eating delicious food, your appetite will run out, and you may make yourself sick. Pleasures of the mind, however, might be pursued for a longer time. If someone is able to take pleasure in listening to poetry or music, this might also be more readily satisfied. Indulging in pleasures of these sorts does not require scarce resources, and so could be less vulnerable to contingent environmental factors. A bad harvest may ruin one’s ability to enjoy a certain food, but it would not tarnish one’s ability to enjoy a piece of music or think about philosophy. This is the type of response that would satisfy Bentham. He thought that no type of pleasure was intrinsically better than another (that push-pin “is of equal value with the arts and sciences of music and poetry”).

Mill disagreed with Bentham on this matter, claiming instead that “some kinds of pleasure are more desirable and more valuable than others”. On his view, the pleasure gained from appreciating a sophisticated poem or an opera could be better than the pleasure from push-pin, even if both instances had the same duration, were equally intense, and had no additional relevant consequences.

This was a controversial aspect of Mill’s utilitarianism, and many found his justification for this unconvincing. He suggested that someone who had experienced two different kinds of pleasures would be able to discern which was the higher quality. Some people may not be able to appreciate some forms of pleasure, because of ignorance or a lack of intelligence, just as animals are not capable of enjoying a great novel. But, according to Mill, it is generally better to be the intelligent person than the fool, and better to be a human than a pig, even a happy one: “It is better to be a human being dissatisfied than a pig satisfied; better to be Socrates dissatisfied than a fool satisfied. And if the fool, or the pig, is of a different opinion, it is only because they only know their own side of the question” (Mill, Utilitarianism, chapter 2).

Mill’s suggestion, however, invites scrutiny. Many people do opt for “lower” pleasures, rather than “higher” ones, even when capable of enjoying both. One might also wonder whether some mixture of different kinds of pleasures might be preferable to restricting oneself to pleasures more closely associated with the intellect and reasoning (which Mill regards as superior). Mill does not consider this possibility, nor the possibility that different people may simply have different preferences regarding some of these kinds of pleasure, without that indicating any superiority or inferiority. Mill’s proposal raises many questions, so a utilitarian may find that the simpler, Benthamite ‘quantitative hedonism’ is preferable to Mill’s ‘qualitative hedonism’.

While this aspect of Mill’s utilitarianism is contentious, a similar type of argument is still utilised to justify the claim that animals have a different moral status (see also the discussion of animals and ethics).

iii. Demandingness

Because of the classical utilitarian commitment to maximisation, utilitarianism is sometimes accused of being excessively demanding. Everyone is required, according to the classical utilitarian, to bring about the most happiness. If an individual can best serve the general utility by living an austere, self-sacrificial life, this is what the utilitarian calculus demands. However, this strikes many as counterintuitive. According to common-sense moral thinking, people can use their time in myriad ways without having morally failed, but the maximiser states that one must always do the very best. Morality then threatens to encroach on every decision.

Mill was aware of this criticism. He identified two particular ways this might be a concern.

First, utilitarianism may be seen to require that moral agents are always thinking about duty, that this must be the motive in every action a person performs. Thinking about morality must be central in all a person’s decisions. This, he claims, is a mistake. Mill argues that the business of ethics is people’s conduct, not whether they act because of a conscious desire to bring about the greatest utility. He provides an example to illustrate this. If a bystander notices someone drowning, what matters is that they save them, whatever their reasons might be:

He who saves a fellow creature from drowning does what is morally right, whether his motive be duty, or the hope of being paid for his trouble: he who betrays the friend that trusts him, is guilty of a crime, even if his object be to serve another friend to whom he is under greater obligations. (Utilitarianism, chapter 2)

Here, Mill makes a distinction between the moral worth of the action and the moral worth of an agent. As far as the action is concerned, the drowning person being rescued is what matters. Whether the person doing the saving is an admirable person might depend on whether they did it for noble reasons (like preventing suffering) or selfish reasons (like the hope of some reward), but utilitarianism is primarily concerned with what actions one should do. In other places, Mill does talk extensively about what makes a virtuous person, and this is strongly connected to his utilitarian commitments.

Second, Mill was aware of the worry that utilitarianism might dominate one’s life. If every action one performs must maximise utility, will this not condemn one to be constantly acting for the sake of others, to the neglect of the things that make one’s own life meaningful? Mill was dismissive of this worry, claiming that “the occasions on which any person (except one in a thousand) has it in his power to do this on an extended scale, in other words, to be a public benefactor, are but exceptional”. Sometimes, one might find oneself in a situation where one could save a drowning stranger, but such scenarios are rare. Most of the time, Mill thought, one individual does not have the ability to affect the happiness of others to any great degree, so they can focus on improving their own situation, or the situations of their friends or families.

In the 19th century, this response may have been more satisfactory, but today it seems wildly implausible. Due to the existence of effective charities, and the ability to send resources around the world instantly, an affluent person can make enormous differences to the lives of people halfway around the world. This could be in terms of providing food to countries experiencing famine, inoculations against debilitating illnesses or simply money to alleviate extreme poverty. In his time, perhaps Mill could not have been confident that small sums of money could prevent considerable suffering, but today’s middle classes have no such excuse.

Because of technological developments, for many people in affluent countries, maximising happiness may require living a very austere life, while giving most of their resources to the world’s poorest people. This demand appears implausible to many people, and this intuition forms the basis of one of the major objections to utilitarianism today. Some have responded to this by moving to rule, satisficing, or scalar forms of utilitarianism (see section 5).

iv. Decision Procedure

The utilitarian claims that the right action is that which maximises utility. When an agent acts, they should act in a way that maximises expected utility. But how do they determine this? One way is to consider every possible action one might do, and for each one, think about all the consequences one might expect (with appropriate weightings for how likely each consequence would be), come up with an expected happiness value for each action, and then pick the one with the highest score. However, this sounds like a very time-consuming process. This will often be impossible, as time is limited. Is this a problem for utilitarians? Does it make the view impractical?
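The calculation described above can be made concrete with a minimal sketch. It assumes, purely for illustration, that each action's possible consequences can be listed with a probability and a numerical happiness value; the action names and the figures below are hypothetical and are not drawn from any utilitarian text.

```python
from typing import Dict, List, Tuple

# Each consequence is a (probability, happiness value) pair.
# Both the pairs and the action names are illustrative assumptions.
Consequences = List[Tuple[float, float]]


def expected_happiness(consequences: Consequences) -> float:
    """Weight each consequence's happiness value by its likelihood and sum."""
    return sum(probability * happiness for probability, happiness in consequences)


def best_action(options: Dict[str, Consequences]) -> str:
    """Return the action with the highest expected happiness value."""
    return max(options, key=lambda action: expected_happiness(options[action]))


# Hypothetical choice between two actions.
options = {
    "keep_promise": [(0.75, 8.0), (0.25, -4.0)],
    "break_promise": [(0.5, 20.0), (0.5, -30.0)],
}

scores = {action: expected_happiness(c) for action, c in options.items()}
choice = best_action(options)
```

Even in this toy case, every available action and every consequence has to be enumerated and scored before a choice can be made, which is the source of the worry about time.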

Mill was aware of this concern, that “there is not time, previous to action, for calculating and weighing the effects of any line of conduct on the general happiness.” However, Mill thinks this objection obscures relevant information gained throughout human history. As people have acted in all sorts of ways, with varying results, any person today can draw upon humanity’s wealth of knowledge of causes and effects, as well as from their own experiences. This background knowledge provides reasons to think that some actions are likely to be more conducive to happiness than others. Often, Mill thinks, an agent will not need to perform any calculations of utility to determine which actions best promote happiness; it will just be obvious.

Mill ridicules the suggestion that individuals would be completely ignorant of what actions they must do if they were to adopt utilitarianism. There would, of course, be no need to contemplate on each occasion whether theft or murder promote utility—and even if there were, he suggests that this would still not be particularly puzzling. Acknowledging this criticism with some derision, Mill notes that “there is no difficulty in proving any ethical standard whatever to work ill, if we suppose universal idiocy to be conjoined with it”.

However, this kind of objection relates to an interesting question. Should a utilitarian endorse reasoning like a utilitarian? Mill suggests that it is preferable on many occasions to make use of rules that have been previously accepted. But how does one determine when to use a rule and when to perform a utility calculation? Some of Mill’s remarks about how to use rules have prompted commentators to regard him as a rule-utilitarian (see section 5.c). Utilitarianism also seems to allow for the possibility that no one should believe that utilitarianism is true. If, for instance, it turns out that the world would be a happier place if everyone accepted a Kantian ethical theory, the utilitarian should, by their own lights, favour a world where everyone believes Kant. Henry Sidgwick (1838-1900) took this seriously, and he defended the idea that perhaps only an “enlightened few” should know the truth about morality, and keep it hidden from the masses.

Utilitarians can say that the truth of their view does not depend on what the correct decision procedure is. Whether performing a utility calculus or simply acting on common-sense morality leads to most happiness, they can still say that the right actions are those that lead to happiness being maximised, that is, that utilitarianism is the correct theory. However, given that utilitarians do tend to care about how people should act, and want to change behaviours, the question of how one should decide what to do is pertinent. Exactly what the relationship between utilitarianism and practical reasoning is, or should be, according to utilitarians, is a persisting question.

4. The Utilitarian Movement

Today, utilitarianism is regarded primarily as a moral theory which can be used to determine the obligations of an individual in a situation. This focus on individual morality gives an inaccurate impression of the Utilitarian movement (‘Utilitarianism’ with a capital ‘U’ will be used to indicate the movement, as distinct from the moral theory) in the 18th and 19th centuries. The Utilitarians were keenly focused on social change. This took the form of revising social policy with the aim of improving the general happiness. Bentham is explicit on the first page of Introduction to the Principles of Morals and Legislation that the principle of utility applies not only to actions of private individuals, but also to “every measure of government”. Helvétius was similarly minded, emphasising the importance of laws that could make people happy, as well as ways to change people, so that they could be made happy more easily.

The Utilitarian project was an ambitious one. Every policy, every law, every custom was open to scrutiny. If it was deemed not conducive to general happiness, the Utilitarians suggested it should be disregarded or replaced. Because they were so willing to disregard customs—even those the general community placed high values on—the Utilitarians were a radical group. This section discusses some of the policies supported by Utilitarians.

A common plea from Utilitarians, deemed radical at the time, was for women’s suffrage. A notable example of this comes from Harriet Taylor (1807-1858). Taylor befriended and later married John Stuart Mill, and she is regarded as a prominent Utilitarian in her own right. She had a significant influence on Mill’s writing (exactly how much influence she had is a matter of dispute, though Mill said in his introduction to On Liberty, “Like all that” he had “written for many years, it belongs as much to her as to” him). In Taylor’s Enfranchisement of Women (1851), she argues that women should have equal political rights to men, including the right to vote and to serve on juries. In fact, Taylor’s arguments call for equal access to all spheres of public life. In particular, she claimed women should be able to enter all professions, including running for political office.

In the same essay, Taylor condemned slavery. This was another point Utilitarians were largely united on. Bentham also criticised slavery on the grounds that it had negative effects on the general happiness, and when abolition was discussed in parliament, he actively opposed compensating slave-traders for their losses. John Stuart Mill was also vocal on the topic of slavery and the just treatment of former slaves. As a Member of Parliament, Mill chaired the Jamaica Committee, which aimed to prosecute Governor Eyre of Jamaica, who used excessive and deadly force in suppressing an uprising at Morant Bay in 1865. This pitted Mill against many prominent intellectuals, including his contemporary (and sometimes friend) Thomas Carlyle (1795-1881). Mill received assassination threats for his position, which was seen by many as overly sympathetic towards the Black Jamaicans.

Like his wife, John Stuart Mill also campaigned for the rights of women. He thought not only that society would benefit considerably from the liberation of women, but also that there would be an “unspeakable gain in private happiness to the liberated half of the species; the difference to them between a life of subjection to the will of others, and a life of rational freedom”. As well as making the case in his book The Subjection of Women (which drew heavily upon material from his wife’s previous work), Mill spoke passionately in favour of expanding suffrage in Parliament. This cause clearly moved Mill, who was reportedly arrested as a teenager for distributing information about contraception. Henry Sidgwick was also an active campaigner, particularly regarding education reform. He became one of the leading voices advocating for access to higher education for women and was one of the organisers of “Lectures for Ladies” at Cambridge, which, in 1871, led to the formation of Newnham College, an all-women’s college (at the time, women were not allowed to attend the university).

Jeremy Bentham, in the early 1800s, wrote essays defending sexual freedom. He was motivated by the harsh way that society treated homosexuals and thought there could be no utilitarian justification for this. While many members of the public may have been offended by these behaviours, they were not harmful, but the restrictions and punishments faced by the marginalised groups were.

Utilitarians were also vocal in defence of animal welfare. Bentham argued that the question relevant to whether an entity has moral status “is not, Can they reason? nor, Can they talk? but, Can they suffer?”. Mill, despite famously arguing that humans can appreciate “higher pleasures” than animals, is insistent that animal welfare is relevant. He thought it obvious that, for a utilitarian, any practice that led to more animal suffering than human pleasure was immoral, so it seems likely he would have opposed factory farming practices.

Not all of the proposals endorsed by Utilitarians are looked on quite so favourably with a modern eye. While John Stuart Mill argued, from utilitarian principles, for a liberal democratic state, he suggested that those arguments did not apply to “barbarians” who were “unfit for representative government”. Infamously, Mill considered India unsuitable for democracy, and is seen by some as an apologist for the British Empire for defending this kind of view.

Another infamous proposal from the Utilitarians comes from Bentham in the domain of prison reform. Bentham suggested an innovative prison design known as the “panopticon” (1787). This was designed to be humane and efficient. A panopticon prison is circular with cells around the edges, and an inspector’s lodge in the middle, situated so that the guard can view each cell. From the inspection lodge each cell would be visible, but blinds to the inspector’s lodge would prevent the prisoners from seeing whether they were being watched, or even whether a guard was present, at any given time. The mere possibility that they were being watched at any time, Bentham thought, would suffice to ensure good behaviour. He also thought that this would prevent guards from mistreating prisoners, as that too would be widely visible. The panopticon was later popularised and criticised by Michel Foucault in Discipline and Punish. The panopticon is notorious for imposing psychological punishment on inmates. Never knowing whether one is being watched can be psychologically stressful. For better or worse, the panopticon anticipated many developments in surveillance present in early 21st-century society.

In each of these proposals, the Utilitarians insisted that policies, laws, or customs must be justified by their effects. If the effects were positive, they were good and could be maintained. If the effects were negative, they should be dispensed with. This attitude, and the radical political ambition, characterised Utilitarianism as a movement.

5. Utilitarianism in the 20th Century

Despite its many detractors, utilitarianism in one form or another continued to hold sway as one of the major moral approaches throughout the 20th century. Philippa Foot (1920-2010) claimed in 1985 that it “tends to haunt” even those who reject the view. That being said, during the 20th century, new criticisms of the view emerged, and previous objections were explored in considerably more depth. This resulted in additional complications to the view, novel defences, and variations on the classical view.

In this section, some of the major 20th-century developments for utilitarianism are discussed. Some advances that may have been described under the heading of “utilitarianism” previously have been omitted, because they veer too far from the core view. For example, G. E. Moore’s “ideal utilitarianism”, despite the name, departs significantly from the central utilitarian commitments, so is not included here (in the early 21st century, this was typically regarded as a non-utilitarian form of consequentialism).

a. Hedonism and Welfarism

The hedonism embraced by classical utilitarianism is controversial. Some of the reasons for this have already been discussed, such as the suggestion that the view that pleasure is all that matters is crude or a doctrine “worthy of swine”. An additional complaint is that hedonism offers an impoverished theory of the good, because it ignores the values of achievement or authenticity. One example that illustrates this is the thought experiment of the “experience machine” given by Robert Nozick (1938-2002):

Suppose there were an experience machine that would give you any experience you desired. Superduper neuropsychologists could stimulate your brain so that you would think and feel you were writing a great novel, or making a friend, or reading an interesting book. All the time you would be floating in a tank, with electrodes attached to your brain. Should you plug into this machine for life, pre-programming your life’s experiences? (Nozick, Anarchy, State, and Utopia, 1974)

Nozick supposes that many people would be reluctant to plug into the machine. Given that the machine could guarantee more pleasurable experiences than life outside it could, this suggests that people value something other than simply the pleasurable sensations. If some of the things that one would miss out on inside the machine (like forming relationships or changing the world in various ways) are valuable, this suggests that hedonism—the claim that only pleasure matters—is false.

In the 20th century, as a result of rejecting the hedonistic component, several utilitarians modified their view, such that utility could be understood differently. One way to change this is to suggest that the classical view is right that it is important that a person’s life goes well (their well-being), and also that this is the only thing that matters morally, but that it gets something wrong about what makes a person’s life go well. Rather than just a matter of how much pleasure a life contains, we might think well-being is best understood in another way. If a view holds that the well-being of individuals—however this is best understood—is the only moral value, it is welfarist.

One account of well-being regards preferences as especially important, such that a person’s life is made better by their preferences being satisfied. This view, which when joined to utilitarianism is known as preference utilitarianism, is able to evade the problems caused by the experience machine, because some of our preferences are not just to experience certain sensations, but to do things and to have relationships. These preferences would remain unsatisfied in an artificial reality, so the preference utilitarian could regard a person’s life as going less well as a result (even if they do not know it).

However, preference utilitarianism has problems of its own. For instance, some preferences simply do not seem that important. John Rawls (1921-2002) imagines a case of an intellectually gifted person, whose only desire is to count blades of grass. According to preference-satisfaction theories of well-being, if such a person is able to spend all their time grass-counting, their life is as good as it can be. Yet many have the intuition that this life is lacking some important features, like participating in social relationships or enjoying cultural pursuits. If there is some value lacking in the life of the grass-counter, this implies something wrong with the preference-satisfaction account of well-being.

Another objection against preference utilitarianism concerns preferences a person no longer has. If someone has a preference for something to happen, then forgets about it, never to find out whether it occurs, does this actually make their life go better? To take this to an extreme, does a person’s life improve if one of their preferences is satisfied after they die? Utilitarians who are more hedonistically inclined find this implausible. Peter Singer, one of utilitarianism’s most famous defenders, previously endorsed preference utilitarianism, but has since abandoned this in favour of hedonistic utilitarianism.

b. Anscombe and ‘Consequentialism’

G.E.M. Anscombe (1919-2001) was an influential figure in 20th-century philosophy. She was not a utilitarian but was responsible for significant changes in how utilitarianism was discussed. In ‘Modern Moral Philosophy’ (1958), Anscombe expressed extremely critical views about the state of moral philosophy. She thought the notion of morality as laws or rules that one must follow made little sense in a secular world; that without a divine law-maker (God), injunctions to or prohibitions against acting some way lacked authority. She was similarly critical of Kant, claiming that the idea that one could legislate for oneself was “absurd”. Among other things, her paper, together with Anscombe’s general rejection of the major ethical theories of her day, sparked renewed interest in Aristotelian ethical thinking and the development of virtue ethics.

Anscombe also criticised utilitarianism as a “shallow philosophy” because it suggested that it was always able to give clear-cut answers. She claimed that in ethics borderline cases are ubiquitous. In these cases, there is not an obvious answer, and even if there is a correct answer, it might be something one should be conflicted about.

Anscombe’s criticisms of utilitarians since Sidgwick were particularly scathing. She claimed that they held a view of intention that meant everything that was foreseen was intended—a view she thought was “obviously incorrect”. Anscombe invented the term “consequentialism” as a name for the view she was critical of, distinguishing this from “old-fashioned Utilitarianism”. After Anscombe, “consequentialism” became a broader label than utilitarianism. As well as the classical view outlined above, “consequentialism” allowed for different conceptions of the good. For example, a view that thought that only consequences matter, but held that—as well as happiness or well-being—beauty is intrinsically valuable would be consequentialist, but not utilitarian (this is why G.E. Moore’s “ideal utilitarianism” has not been discussed in this article, as he makes claims of this sort). Today, the term “consequentialism” is used more often by philosophers than “utilitarianism”, though many of those identifying as consequentialists either embrace or sympathise with utilitarianism.

c. Act versus Rule

In the 20th century, a distinction that had been noted previously was scrutinised and given a name. This is the act/rule distinction. Versions of rule-utilitarianism had been given before the 20th century. The rule utilitarian claims that, rather than examining the consequences of any particular action to determine the ethical status of an action, one should consider whether it is compatible with a set of rules that would have good consequences if (roughly) most people accepted them.

The term “rule-utilitarian” was not in popular use until the second half of the 20th century, but the central claim—that the rules one is acting in accordance with determine the moral status of one’s actions—was much older. George Berkeley (1685-1753) is sometimes suggested to have offered the first formulation of rule-utilitarianism. He suggested that we should design rules that aim towards the well-being of humanity, that “The Rule is framed with respect to the Good of Mankind, but our Practice must be always shaped immediately by the Rule”.

Later in the 18th century, William Paley (1743-1804) also suggested something like rule-utilitarianism in response to the problem that his view would seemingly condone horrible behaviours, like lying one’s way to a powerful position, or murder, if the consequences were only good enough. Paley rejected this by claiming that the consequences of the rule should be considered. If one was willing to lie or cheat or steal in order to promote the good, Paley suggested this would license others to lie, cheat, or steal in other situations. If others did, from this precedent, decide that lying, cheating, and stealing were permissible, this would have bad consequences, particularly when people did these actions for nefarious reasons. Thus, Paley reasoned, these behaviours should be prohibited. Later still, in his Utilitarianism, John Stuart Mill proposed what some have interpreted as a form of rule-utilitarianism, though this interpretation is controversial.

While principles that can properly be regarded as rule-utilitarian were proposed before, it was in the 20th century that these views received the name “rule-utilitarianism” and were given extensive scrutiny.

Before considering some of the serious objections to rule-utilitarianism, it is worth noting that the view has some apparent advantages over classical act-utilitarianism. Act-utilitarians have a difficulty in making sense of prohibitions resulting from rights. Jeremy Bentham famously described the idea that there might exist moral rights as “nonsense on stilts”, but this is a controversial position. It is often argued that we do have rights, and that these are unconditional and inalienable, such as the right to bodily autonomy. If one person has a right to bodily autonomy, this is understood as requiring that others do not use their body in certain ways, regardless of the consequences. However, basic act-utilitarianism cannot make sense of this. In a famous example, Judith Jarvis Thomson (1929-2020) imagines a surgeon who realises they could save the lives of five patients by killing a healthy person who happens to be the right blood type. Assuming they could avoid special negative consequences from the surgeon killing an innocent healthy person (perhaps they can perform the killing so that it looks like an accident to prevent the public panicking about murderous surgeons), an act-utilitarian seems committed to the view that the surgeon should kill the one in order to save the five. The rule-utilitarian, however, has a neat response. They can suggest that a set of rules that gives people rights over their own bodies—rights that preclude surgeons killing them even if they have useful organs—leads to more happiness overall, perhaps because of the feeling of safety or self-respect that this might result in. So the rule-utilitarian can say such a killing was wrong, even if on this particular occasion it would have resulted in the best consequences.

Another potential advantage for rule-utilitarians is that they may have an easier time avoiding giving extremely demanding moral verdicts. For the act-utilitarian, one must always perform the action which has the best consequences, regardless of how burdensome this might be. Given the state of the world today, and how much people in affluent countries could improve the lives of those living in extreme poverty with small sums of money, act-utilitarianism seems to imply that affluent people in developed nations must donate the vast majority of their disposable income to those in extreme poverty. If buying a cup of coffee does not have expected consequences as good as donating the money to the Against Malaria Foundation to spend on mosquito nets, the act-utilitarian claims that buying the cup of coffee is morally wrong (because of the commitment to maximising). Rule-utilitarians can give a different answer. They consider what moral rule would be best for society. One of the reasons act-utilitarianism is so burdensome for a given individual is that the vast majority of people give nothing or very little. However, if every middle-class person in developed nations donated 10% of their income, this might be sufficient to eliminate extreme poverty. So perhaps that would be the rule a rule-utilitarian would endorse.

Despite some advantages, rule-utilitarianism does have many problems of its own. One issue pertains to the strength of the rules. Consider a rule prohibiting lying. This might seem like a good rule for a moral code. However, applying this rule in a case where a would-be murderer asks for the location of a would-be victim would seemingly have disastrous consequences (Kant is often ridiculed for his absolutist stance in this case). One response here would be to suggest that the rules could be more specific. Maybe “do not lie” is too broad, and instead the rule “do not lie, unless it saves a life” is better? But if all rules should be made more and more complicated when this leads to rules with better consequences, this defeats the purpose of the rules. As J. J. C. Smart (1920-2012) pointed out, the view then seems to collapse into a version of act-utilitarianism. In Smart’s words:

 I conclude that in every case if there is a rule R the keeping of which is in general optimific, but such that in a special sort of circumstances the optimific behaviour is to break R, then in these circumstances we should break R…. But if we do come to the conclusion that we should break the rule…what reason remains for keeping the rule?  (Smart, ‘Extreme and Restricted Utilitarianism’, 1956)

On the other hand, one might suggest that the rules stand, and that lying is wrong in this instance. However, this looks like an absurd position for a utilitarian to take, as they claim that what matters is promoting good consequences, yet they will be forced to endorse an action with disastrous consequences. If they suggest rule-following even when the consequences are terrible, this is difficult to reconcile with core consequentialist commitments, and looks like—in Smart’s terms—“superstitious rule worship”. Is it not incoherent to suggest that only the consequences matter, but also that sometimes one should not try to bring about the best consequences? The rule-utilitarian thus seems to face a dilemma. Of the two obvious responses available, one leads to a collapse into act-utilitarianism and the other leads to incoherence.

Richard Brandt (1910-1997) was the first to offer a rigorous defence of rule-utilitarianism. He offers one way of responding to the above criticism. He suggests that the rules should be of a fairly simple sort, like “do not lie”, “do not steal” and so on, but in extreme scenarios, these rules will be suspended. When a murderer arrives at the door asking for the location of one’s friends, this is an extreme example, so ordinary rules can be suspended so that disaster can be averted. A version of this strategy, where the correct set of rules includes an “avoid disaster” rule, is defended by contemporary rule-consequentialist Brad Hooker (Hooker’s own view is not strictly rule-utilitarian because his code includes a prioritarian caveat—he thinks there is some moral importance to prioritising the worst-off in society, over and above their benefits to well-being).

A second problem for rule-utilitarians concerns issues relating to partial compliance. If everyone always acted morally decently and followed the rules, this would mean that certain rules would not be required. For instance, there would be no rules needed for dealing with rule-breakers. But it is not realistic to think that everyone will always follow the rules. So, what degree of compliance should a rule-utilitarian cater for when devising their rules? Whatever answer is given to this is likely to look arbitrary. Some rule-utilitarians devise the rules not in terms of compliance, but acceptance or internalisation. Someone may have accepted the rules but, because of weakness of will or a misunderstanding, still break the rules. Formulating the view this way means that the resulting code will incorporate rules for rule-breakers.

A further dispute concerns whether rule-utilitarianism should really be classified as a form of utilitarianism at all. Because the rightness of an action is only connected to consequences indirectly (via whether or not the action accords with a rule and whether the rule relates to the consequences in the right way), it is sometimes argued that this should not count as a version of utilitarianism (or consequentialism).

d. Satisficing and Scalar Views

A common objection to act-utilitarianism is that, by always requiring the best action, it demands too much. In ordinary life, people do not view each other as failing whenever they do something that does not maximise utility. One response to this is to reconstrue utilitarianism without the claim that an agent must always do the best. Two attempts at such a move will be considered here. One replaces the requirement to do the best with a requirement to do at least good enough. This is known as satisficing utilitarianism. A second adjustment removes obligation entirely. This is known as scalar utilitarianism.

Discussions of satisficing were introduced into moral philosophy by Michael Slote, who found maximising versions of utilitarianism unsatisfactory. Satisficing versions of utilitarianism hope to provide more intuitive verdicts. When someone does not give most of their money to an effective charity, which may be the best thing they could do, they might still do something good enough by giving some donation or helping the needy in other ways. According to the satisficing utilitarian, there is a standard which actions can be measured against. A big problem for satisficing views arises when they are challenged to say how this standard is arrived at—how do they figure out what makes an action good enough? Simple answers to the question have major issues. If, for instance, they suggest that everyone should bring about consequences at least 90% as good as they possibly can, this suggests someone can always permissibly do only 90% of the best. But in some cases, doing what brings about 90% of the best outcome looks really bad. For example, if 10 people are drowning, and an observer can decide how many to save without any cost to themselves, picking 9—and allowing one to die needlessly—would be a monstrous decision. Many sophisticated versions of satisficing utilitarianism have been proposed, but none so far has escaped some counterintuitive implications.
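The threshold objection above can be made concrete with a small worked sketch. The Python below is only an illustration under stated assumptions: the function name `permissible_options`, the numeric values (one unit of value per life saved), and the reading of "good enough" as a fixed percentage of the best available outcome are hypothetical choices made for this example, not a formulation taken from Slote or any other satisficing utilitarian.

```python
def permissible_options(outcomes, threshold_percent=100):
    """Return the options whose value reaches the given percentage of the best value.

    outcomes: dict mapping an option label to a (hypothetical) integer value.
    threshold_percent=100 models a maximising view; 90 models the simple 90% rule.
    Assumes values are non-negative integers.
    """
    best = max(outcomes.values())
    # Integer comparison avoids floating-point rounding at the boundary.
    return [
        option
        for option, value in outcomes.items()
        if value * 100 >= threshold_percent * best
    ]


# Hypothetical drowning case: the observer can save 0..10 people at no cost,
# and each life saved is assumed to count as one unit of value.
drowning_case = {f"save {n}": n for n in range(11)}

maximising = permissible_options(drowning_case, threshold_percent=100)
satisficing = permissible_options(drowning_case, threshold_percent=90)

print(maximising)   # ['save 10']
print(satisficing)  # ['save 9', 'save 10']
```

On these assumptions, the simple 90% rule counts saving nine as permissible even though saving the tenth person would cost nothing, which is exactly the verdict the objection finds monstrous.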

The problem of where to set the bar is not one faced by the scalar utilitarians, as they deny that there is a bar. The scalar utilitarian acknowledges that what makes actions better or worse is their effects on people’s well-being but shuns the application of “rightness” and “wrongness”. This approach avoids problems of being overly or insufficiently demanding, because it makes no demands. The scalar view avoids deontic categories, like permissible, impermissible, required, and forbidden. Why might such a view seem appealing? For one thing, the categories of right and wrong are typically seen as binary—the act-utilitarian says actions are either right or wrong, a black-and-white matter. If the moral quality of actions is extremely richly textured, this might look unsatisfactory. Furthermore, using the blunt categories of “right” and “wrong”, someone confident that they have acted rightly may become morally complacent. Unless you are doing the very best, there is room for improvement, scope for doing better, which can be obfuscated by viewing acts as merely permissible or impermissible. While some utilitarians have found this model attractive, abandoning “right” and “wrong” is a radical move, and perhaps unhelpful. It might seem very useful, for instance, for some actions to be regarded as forbidden. Similarly, an account of morality which sets the boundaries of permissible action may be much more useful for regulating behaviour than viewing it merely as a matter of degree.

6. Utilitarianism in the Early 21st Century

In moral theory, discussions of utilitarianism have been partly subsumed under discussions of consequentialism. As typically classified, utilitarianism is simply a form of consequentialism, so any problems that a theory faces in virtue of being consequentialist are also faced by utilitarian views. Some consequentialists will also explicitly reject the label of “utilitarianism” because of its commitment to a hedonistic or welfarist account of the good. Brad Hooker, for example, endorses a rule-consequentialism where not only the total quantity of happiness matters (as the utilitarian would suggest), but where the distribution of happiness is also non-instrumentally important. This allows him to claim that a world with slightly less overall happiness, but where the poorest are happier, is all-things-considered better than a world with more total happiness, but where the worst-off are miserable.

While many of the discussions concern consequentialism more broadly, many of the arguments involved in these discussions still resemble those from the 19th century. The major objections levelled against consequentialism in the early 21st century—for example, whether it demands too much, whether it can account for rights or justice, or whether it allows partial treatment in a satisfactory way—target its utilitarian aspects.

The influence of utilitarian thinking and the Utilitarian movement is still observable. One place where Utilitarian thinking is particularly conspicuous is in the Effective Altruism movement. Like the 19th-century Utilitarians, Effective Altruists ask what interventions in the world will actually make a difference and promote the behaviours that are the best. Groups such as Giving What We Can urge individuals to pledge a portion of their income to effective charities. What makes a charity effective is determined by rigorous scientific research to ascertain which interventions have the best prospects for improving people’s lives. Like the classical utilitarians and their predecessors, they answer the question of “what is good?” by asking “what is useful?”. In this respect, the spirit of utilitarianism lives on.

7. References and Further Reading

  • Ahern, Dennis M. (1976): ‘Is Mo Tzu a Utilitarian?’, Journal of Chinese Philosophy, 3, 185-193.
    • A discussion about whether the utilitarian label is appropriate for Mozi.
  • Anscombe, G. E. M. (1958): ‘Modern Moral Philosophy’, Philosophy, 33(124), 1-19.
    • Influential paper where Anscombe criticises various forms of utilitarianism popular at the time she was writing, and also introduces the word “consequentialism”.
  • Bentham, Jeremy (1776): A Fragment on Government, F. C. Montague (ed.) Oxford: Clarendon Press (1891).
    • One of the first places utilitarian thinking can be seen in Bentham’s writings.
  • Bentham, Jeremy (1787): ‘Panopticon or The Inspection House’, in The Panopticon Writings. Ed. Miran Bozovic (London: Verso, 1995), pp. 29-95.
    • This is where Bentham proposes his innovative prison model, the “panopticon”. It also includes lengthy discussions of how prisoners should be treated, as well as proposals for hospitals, “mad-houses” and schools.
  • Bentham, Jeremy (1789): An Introduction to the Principles of Morals and Legislation, Oxford: Clarendon Press, 1907.
    • Seen as the first rigorous account of utilitarianism. It begins by describing the principle of utility, and it continues by considering applications of the principle in morality and legal policy.
  • Brandt, R. B. (1959): Ethical Theory, Englewood-Cliffs, NJ: Prentice Hall.
    • This book offers a clear formulation of rule-utilitarianism, and it is one of the earliest resources that refers to the view explicitly as “rule-utilitarianism”.
  • Chastellux, François-Jean de (1774): De la Félicité publique, (“Essay on Public Happiness”), London: Cadell; facsimile reprint New York: Augustus Kelley, 1969.
    • This book is where Chastellux investigates the history of human societies in terms of their successes (and failures) in securing happiness for their citizens.
  • Cumberland, Richard (1672): A Treatise of the Laws of Nature (De Legibus Naturae), selection printed in British Moralists 1650-1800 (1991), D.D. Raphael (ed.), Hackett.
    • Here Cumberland discusses the nature of things, and introduces his natural law view, which leads to some utilitarian-like conclusions.
  • Dabhoiwala, Faramerz (2014): ‘Of Sexual Irregularities by Jeremy Bentham—review’, The Guardian, https://www.theguardian.com/books/2014/jun/26/sexual-irregularities-morality-jeremy-bentham-review.
    • Article about a recent book discussing Bentham’s position on sexual ethics.
  • De Lazari-Radek, Katarzyna and Singer, Peter (2014): The Point of View of the Universe, Oxford University Press.
    • An exposition of Henry Sidgwick’s utilitarianism, considering his view in light of contemporary ethical discussions.
  • Dickens, Charles (1854): Hard Times, Bradbury & Evans.
    • Novel featuring Thomas Gradgrind—a caricature of a utilitarian.
  • Foot, Philippa (1985): ‘Utilitarianism and the Virtues’, Mind, 94(374), 196-209.
    • Foot—an opponent of utilitarianism—notes how utilitarianism has been extremely persistent. She suggests that one reason for this is that utilitarianism’s opponents have been willing to grant that it makes sense to think of objectively better and worse “states of affairs”, and she scrutinises this assumption.
  • Gay, John (1731): Concerning the Fundamental Principle of Virtue or Morality, selection printed in British Moralists 1650-1800 (1991), D.D. Raphael (ed.), Hackett.
    • This includes Gay’s challenge to secular versions of utilitarianism, to explain moral motivation.
  • Helvétius, Claude (1777): A Treatise on Man, His Intellectual Faculties, and His Education, 2 vols., London: B. Law and G. Robinson.
    • Published after Helvétius’ death, this work includes lengthy discussions of how society may be altered to better promote happiness.
  • Heydt, Colin (2014): ‘Utilitarianism before Bentham’, in The Cambridge Companion to Utilitarianism, pp. 16-37. Cambridge: Cambridge University Press. doi:10.1017/CCO9781139096737.002
    • This paper describes the intellectual development of utilitarianism, drawing attention to the non-utilitarian origins, as well as the distinct religious and secular variations of utilitarianism in Britain, and the French utilitarians.
  • Hooker, Brad (2000): Ideal Code, Real World: A Rule-consequentialist Theory of Morality. Oxford University Press.
    • This book offers a rigorous defence of rule-consequentialism. Hooker’s account is not rule-utilitarian (because he claims that some priority should be given to the worst-off in society), but he offers defences against all the major objections to rule-utilitarianism.
  • Hruschka, Joachim (1991): ‘The Greatest Happiness Principle and Other Early German Anticipations of Utilitarian Theory’, Utilitas, 3, 165-177.
    • Hruschka dispels some myths about the origins of the term “greatest happiness for the greatest number”, and he explores the history of the idea in Germany prior to the development of utilitarianism in Britain.
  • Hutcheson, Francis (1725): Inquiry Concerning the Original of Our Ideas of Virtue or Moral Good, treatise II of An Inquiry into the Original of our Ideas of Beauty and Virtue, selection printed in British Moralists 1650-1800 (1991), D.D. Raphael (ed.), Hackett.
    • This work provides a detailed account of Hutcheson’s moral and aesthetic theory.
  • Hutcheson, Francis (1755): A System of Moral Philosophy, three volumes, London.
    • Published after Hutcheson’s death, this book was written specifically for students. It further develops Hutcheson’s moral thinking, and it includes a discussion of different kinds of pleasures.
  • Jacobson, Daniel (2008): ‘Utilitarianism without Consequentialism: The Case of John Stuart Mill’, Philosophical Review, 117(2), 159-191.
    • This article makes a case for distinguishing the view of John Stuart Mill and his contemporaries from consequentialism, as the view is discussed today. This locates “Utilitarianism” within a certain socio-historical context and identifies ways in which its commitments differ from those of “consequentialism”.
  • MacAskill, William (2015): Doing Good Better: Effective Altruism and How You Can Make a Difference, Random House.
    • An introduction to the Effective Altruism movement, which can be seen as an intellectual descendant of the Utilitarians.
  • Mill, John Stuart (1861): Utilitarianism, originally published in Fraser’s Magazine, now widely available, e.g., https://www.utilitarianism.net/books/utilitarianism-john-stuart-mill/1
    • This is an attempt from John Stuart Mill to demonstrate that utilitarianism is much more appealing than critics at the time implied. This is often seen today as the foundational text for utilitarianism, though Mill did not seem to regard it as highly as some of his other works, like On Liberty and Considerations on Representative Government.
  • Mill, John Stuart (1867): ‘House of Commons Speech’, Hansard. https://hansard.parliament.uk/Commons/1867-05-20/debates/c38e8bdb-704c-4952-9375-e33d7967a5a4/Clauses34ProgressMay17?highlight=%22conceding%20to%22#contribution-b39e743f-6b70-45e4-82c4-8ac642f8fd18
    • A lengthy speech given by Mill as an MP arguing for suffrage for women.
  • Mozi (2010): The Mozi: A Complete Translation, Ian Johnston (trans.), The Chinese University Press.
    • A translated version of Mozi’s work, accompanied by commentary.
  • Nozick, Robert (1974): Anarchy, State & Utopia, New York: Basic Books.
    • In this book, as well as his general account of the requirements of justice, Nozick introduces the example of the “experience machine”, which is often thought to demonstrate a problem for hedonism.
  • O’Keefe, Tim (2009): Epicureanism, Acumen Publishing.
    • O’Keefe discusses the teachings of Epicurus. As well as Epicurean ethics, this includes large discussions of Epicurean thoughts on metaphysics and epistemology.
  • Paley, William (1785): Principles of Moral and Political Philosophy, Boston: Richardson and Lord (1821).
    • Paley’s Principles of Moral and Political Philosophy was the most influential work of utilitarianism for much of the 19th century. It also includes an early defence of what would be later termed rule-utilitarianism.
  • Priestley, Joseph (1768): Essay on the First Principles of Government, London.
    • In this work, Priestley claims that the greatest happiness for the greatest number is the measure of right and wrong. Bentham says this influenced him significantly.
  • Railton, Peter (1984): ‘Alienation, Consequentialism and the Demands of Morality’, Philosophy & Public Affairs, 13(2), 134-171.
    • Elaborates a complaint relating to utilitarian decision procedure, and how this may lead to alienation. Railton offers a distinction between “objective” and “subjective” versions of consequentialism, endorsing the former.
  • Rawls, John (1971): A Theory of Justice, Cambridge, MA: Harvard University Press.
    • When developing his influential theory of justice, Rawls criticises the inability of classical utilitarianism to properly appreciate the individual nature of persons.
  • Rosen, Frederick (2003): Classical Utilitarianism from Hume to Mill, London: Routledge.
    • This book traces the influence of the idea that utility is the basis of morality and justice, starting from Hume. It includes many of the figures discussed in this article in significantly more depth. It also devotes two chapters to considering the notion of utility as found in the works of Adam Smith.
  • Scarre, Geoffrey (1996): Utilitarianism, London: Routledge.
    • This book provides a wonderful discussion of utilitarianism. The first few chapters of the book were extremely useful in the creation of this article.
  • Schultz, Bart and Varouxakis, Georgios (2005): Utilitarianism and Empire, Oxford: Lexington.
    • This book is a collection of essays that consider the relationship between Utilitarianism—particularly as a social movement—and the British Empire. It explores the criticisms that early Utilitarians, like Jeremy Bentham and John Stuart Mill, were racist, insufficiently critical of slavery, and served as apologists for the British Empire.
  • Slote, Michael (1984): ‘Satisficing Consequentialism’, Proceedings of the Aristotelian Society, 58, 139-163.
    • This article marks the introduction of satisficing views, which remove the feature of maximising from utilitarianism, instead claiming that it is (at least) sometimes permissible to perform actions which do not have the best consequences, but which are good enough.
  • Smart, J. J. C. and Williams, Bernard (1973): Utilitarianism: For & Against, Cambridge University Press.
    • A pair of essays for and against utilitarianism. Williams’ part includes his objection that utilitarianism undermines the integrity of moral agents, which has been very influential.
  • Taylor, Harriet (1851): ‘Enfranchisement of Women’, available here: https://www.utilitarianism.net/books/enfranchisement-of-women-harriet-taylor-mill
    • Harriet Taylor’s essay arguing for the legal equality of women.
  • Thomson, Judith Jarvis (1976): ‘Killing, Letting Die and The Trolley Problem’, The Monist, 59(2), 204-217.
    • This paper uses the case of a surgeon who must decide whether to kill one healthy person to save five, which has been used since to show problems utilitarianism has with making sense of rights. It also introduces the term “trolley problem” for a type of case that has become commonplace in moral philosophy.

 

Author Information

Joe Slater
Email: Joe.Slater@glasgow.ac.uk
University of Glasgow
United Kingdom

Moral Perception

It is a familiar thought that many of our beliefs are directly justified epistemically by perception. For example, a person sees what looks to her to be a cat on the mat, and from this she is justified in saying “There is a cat on the mat.” This article explores the idea that our moral beliefs can be justified empirically in a similar manner. More precisely, it focuses on canonical moral perception (CMP), which restricts perceptual experiences to sensory perceptual experiences, such as vision, touch, taste, smell, and hearing. For ease of exposition, this article uses visual perceptual experiences as the sensory modality of choice.

We should be interested in the viability of such a thesis for several reasons. First, if CMP is a plausible epistemology of justification of moral beliefs, then it is uniform with a broader perceptual epistemology and therefore comes with ready-made responses to skeptical challenges to morality. Second, CMP avoids over-intellectualising moral epistemology, and it explains how it is that lay people have justified moral beliefs. Third, CMP, if true, has interesting implications for our methodology of investigating morality. In effect, CMP states that experience comes first, contrary to how some (but not all) rival views characterize moral epistemology as starting from the armchair.

First, the thesis of CMP is presented in detail. The following section considers prima facie arguments in favor of CMP, which are the considerations of epistemic uniformity and the role of experience in moral inquiry. Next, the article discusses prima facie arguments against CMP, which are the problems of counterfactual knowledge, the causal objection, and the ‘looks’ objection. Finally, the article presents arguments for CMP that draw from the philosophy of perception and the philosophy of mind, and it concludes that much of the debate surrounding CMP is continuous with debates in the general philosophy of perception and the philosophy of mind.

Table of Contents

  1. The Central Thesis
  2. The Prima Facie Case for Moral Perception
    1. Moral Perception and Epistemic Uniformity
    2. The Role of Experience in Moral Inquiry
  3. The Prima Facie Case Against Moral Perception
    1. Justification of Counterfactual Moral Beliefs
    2. The Causal Objection
    3. The ‘Looks’ Objection
  4. Arguments from Philosophy of Perception
    1. High-Level Contents in Perception
    2. Phenomenal Contrast Arguments
    3. Phenomenal Contrast and Parity Considerations
    4. Cognitive Penetration
    5. The Mediation Challenge
    6. Moral Perception and Wider Debates in The Philosophy of Perception
  5. Summary: Looking Forward
  6. References and Further Reading

1. The Central Thesis

Suppose upon returning home one evening, someone encounters a stranger harming a senior citizen for entertainment. As they witness this act, they form the belief that what they are witnessing is morally wrong. Assuming that the belief is epistemically justified, it remains a question what the source of justification for this particular moral belief is. One answer is that perceptual states (such as sight and hearing) provide the justification. This thesis is called canonical moral perception:

CMP: Some moral beliefs are non-inferentially justified by sensory perceptual experiences.

To be clear, CMP claims only that some moral beliefs are non-inferentially justified by sensory perceptual experiences, not that all of them are. This leaves open the possibility of multiple sources for the justification of moral beliefs while showing that there is an interesting debate here regarding the possibility of CMP, since rivals of the view will deny that any moral beliefs are justified in such a way. For purposes of exposition, this article uses vision as the perceptual state of choice, but it should be kept in mind that this is not to convey that vision is the only source of perceptual justification for moral beliefs. Despite the fact that emotions are sometimes spoken of as if they are a kind of perception, this article does not consider emotional perception in any detail. Someone who endorses CMP may be called a ‘perceptualist.’

Fundamentally, the epistemic contribution of perception is to justify belief and play the role of a justificatory regress stopper. Given that justification for some beliefs bottoms out in perceptual experience, and that some moral beliefs are justified but not on the basis of other beliefs, CMP extends perceptual justification to the moral domain. CMP is a foundationalist theory of the justification of moral beliefs and this article treats it as such. Other foundationalists, such as intuitionists and emotional perceptualists, will have their own ways of handling the regress problem that differ from that of canonical moral perception. In particular, the perceptualist (at least) holds that what is essential to perception is its representational nature, the phenomenological character of perceptual experience, and its role as a non-inferential source of justification, and will offer a stopper to the regress problem based on those characteristics. Intuitionists and emotional perceptualists may agree that some of those characteristics are essential to their justificatory source as well, but the story for how their regress stoppers work will differ based on how emotions and intuitions differ from perception. For example, emotional perceptualists may say that what is special about emotional perceptual states is that they are valenced, and that this plays a special role in their justificatory story.

Furthermore, this article assumes on behalf of the perceptualist a phenomenal dogmatist account of foundationalism of the kind espoused by Jim Pryor, where someone is immediately, but defeasibly, justified by their perceptual experience (Pryor 2000). Phenomenal dogmatism is not a very strong foundationalism in that it does not require an infallibly known belief to ground all the remaining knowledge one may possess. Rather, what phenomenal dogmatism grants us is the claim that representational seeming states justify beliefs based on those seeming states in virtue of having those seeming states. Insofar as one may be concerned about challenges to a general foundationalist picture, the perceptualist will follow Pryor in responding to those objections.

Some of the philosophers mentioned in this article will talk about theories of perceptual moral knowledge, and most of what this article says will be compatible with those theories. A perceptually justified moral belief in the absence of defeaters is perceptual moral knowledge, after all.

2. The Prima Facie Case for Moral Perception

a. Moral Perception and Epistemic Uniformity

Considerations of uniformity and economy within epistemology might push one towards adopting CMP over its more traditional rivals, such as intuitionism. CMP entails that the methodology of obtaining justified moral beliefs does not differ in any significant or substantial way from other kinds of justification gained by perceptual experiences. That is, just as one forms the justified belief that there is a cat in the room by seeing that there is in fact a cat in the room, one forms the justified belief that some act is wrong by perceiving the wrongness of the act. This leads us to the considerations of uniformity. If there is no special methodology that differentiates justified moral beliefs from other justified beliefs in different domains, then the need for positing a special source of justification, such as the intellectual seemings of the intuitionist, is moot. Another advantage of CMP is that it gives us a foundationalist epistemology, thereby avoiding regress and circularity worries regarding justification. To be clear, the advantages mentioned are shared with some rival accounts of moral epistemology, so these are not unique advantages but rather considerations that keep it a live theory.

b. The Role of Experience in Moral Inquiry

CMP captures the role that experience seems to play in moral inquiry. If we consider how non-philosophers form most of their moral beliefs, it is unlikely that the sole basic source is a priori reasoning. Most people do not sit in an armchair and contemplate runaway trolleys, yet it seems that most individuals have justified basic moral beliefs. When an individual is asked to explain how they know that an action is wrong, a common answer among lay people is that they saw the wrongness of that action. CMP takes this statement at face value, and considering that moral philosophers are not different in kind from the typical human being, we might think that when engaging in a moral thought experiment the philosopher is making use of past moral observations.

If we are persuaded that experience plays a role in answering moral questions, then a natural thought is that particular moral beliefs are among the most epistemically basic; particular moral beliefs form part of our evidential bedrock. They are basic in the sense that, from justified particular moral beliefs we can infer additional justified moral beliefs, but we cannot make an inference in the opposite direction. For example, one basic justified particular moral belief for the perceptualist may be a very specific claim such as, ‘The instance of a father hugging his child I witnessed yesterday is morally good.’ From this particular experience of goodness, once we return to the armchair and ponder if fathers hugging their children is good, we might inductively infer a more general statement such as ‘It is usually good for fathers to hug their children.’ In short, we draw from experience to reach conclusions about more abstract moral questions. Sarah McGrath motivates CMP with these considerations in mind (2018, 2019). As McGrath explains:

[A] significant part of our most fundamental evidence for [moral] theorizing consists in singular moral judgments that we know to be true. But I also think that there is a fairly widespread tendency to neglect this fact, and to think that our evidence, or what we ultimately have to go on in our ethical theorizing, consists exclusively of judgments with more general content (2018).

To expand on this: it is a common self-conception of moral philosophers that the methodology of moral inquiry they perform is to consider cases or action types, form judgments about those cases and reach general moral principles (such as ‘It is usually good for fathers to hug their children’, or ‘All things being equal, it is wrong to intentionally cause harm’) that are broadly applicable. That is, judgments about very specific cases will be formed by way of considering the more general principles. As McGrath points out, when considering the morality of an action type, we often draw upon our past experiences of tokens of an action to make moral judgments. To illustrate this, we can imagine an agent who yesterday saw the goodness of a father hugging a child, and then the next day is presented with a thought experiment that asks the agent to consider a near identical scenario. Presumably, this agent will judge the hugging once again to be good, and this judgment will be based on the observations they made the day before. Thus, CMP denies that intuitions about general moral beliefs reached in the armchair are always methodologically prior to experience in moral theorizing.

If intuitions about general moral principles are epistemically basic, then making use of particular moral judgements is epistemically mistaken. However, drawing on past observations to reach judgements on thought experiments about fathers hugging their children, or even the trolley problem, is not obviously an epistemic misstep. In fact, we often draw on past observations and experiences to give advice on problems that our friends and family experience. Rather than draw on general principles to advise a friend to end her relationship, we usually appeal to previous relationships we have been through to make such a judgment. These are the common and legitimate ways we form moral beliefs, and CMP is the most natural epistemic explanation of our practice of moral inquiry as we find it.

That said, we may worry about cases where we have background knowledge informing our experience of a situation; it may seem strange that we can have the kind of experientially justified moral beliefs CMP promises while at the same time recognizing that background knowledge changes what we may justifiably believe on the basis of our perceptual experiences. For example, we can imagine the father hugging his child, but now we have the background information that the father has a criminal record of incest. There are two ways for the perceptualist to handle cases where background knowledge informs the observation. The first is to stick with Pryor-style phenomenal dogmatism, on which the perceptual seeming of goodness delivers prima facie justification for believing the hugging is morally good, but this justification is defeated by the additional knowledge of the father’s criminal record. The second option is to lean into the phenomenon of cognitive penetration, and answer that the background knowledge does change the perceptual experience of the father hugging the child from one of goodness to one of badness, since on this option our propositional attitudes would contour our perceptual experience. In sum, there are two possible ways for the perceptualist to answer this kind of concern, but adjudicating between the two options canvassed here is beyond the scope of this article.

3. The Prima Facie Case Against Moral Perception

a. Justification of Counterfactual Moral Beliefs

Although CMP provides a theory of justification in actual situations, situations in which you see a morally valenced act, we might wonder what the theory says about justification of moral beliefs gained via thought experiments or reading fiction. Call the kind of justification gained in these instances counterfactual justification. Both Hutton and Wodak challenge CMP to provide an account of how one can have counterfactual moral justification (Hutton 2021; Wodak 2019). The challenge takes the following general form: By hypothesis, CMP explains moral justification in localized, everyday cases. However, we do not receive justification for moral beliefs solely through sensory perception, since we can have counterfactual moral justification. So, CMP is an incomplete explanation of the sources of moral justification. Because CMP cannot capture cases where we receive justification through literature or thought experiments, an epistemological theory that can provide a unified explanation of both counterfactual justification and justification gained in everyday cases is preferable on the grounds of parsimony. The following two paragraphs present particular versions of this challenge.

Hutton asks us to consider a case of someone reading a book depicting the brutalities of slavery, stipulating that they have an emotional response to the scenarios depicted in the book. Here, no perception is present (other than of words on a page), but there is a strong emotional response and plausibly, Hutton claims, the individual reading the book forms the justified moral belief that slavery is wrong. The upshot of Hutton’s argument is that CMP cannot explain what the source of justification is in the case of literature, while emotion can explain the source of justification both for moral beliefs formed from reading literature and for those formed in everyday cases.

Like Hutton, Wodak notes that much of our moral inquiry is a priori, and intuitionism is far better suited to capture instances where our justified moral beliefs come from imagining scenarios. When sitting in the armchair imagining a trolley scenario, when we form the justified moral belief that pulling the lever is the right action, we can ask what justifies the belief, and Wodak states “The intuitionist can explain this very easily: our intuitions can concern actual and hypothetical cases” (Wodak 2019). That is, the intuitionist’s story for justification stays the same between imagined cases and cases we encounter on the street. CMP cannot appeal to perceptual justification because in thought experiments there is no perception of the scenario. Because CMP lacks resources to explain the source of the justification, and intuitionism can explain the source of justification in both thought-experiments and everyday cases, Wodak concludes that intuitionism should be preferred on the grounds of parsimony.

While it is true that CMP by itself is unable to capture counterfactual justification, and that this gives some prima facie consideration against the view, this should not be cause for alarm on the part of the advocate of CMP. Recall that CMP states that some of our moral beliefs are perceptually justified, not that all moral beliefs are justified in such a way. The advocate of CMP has the option to make a disjunctive response to challenges from counterfactual justification such as those made by Wodak and Hutton. This response needs to be made with care; the advocate of CMP should avoid introducing an account of counterfactual justification that suffices to explain actual justification as well. Even though the challenge of providing a story for counterfactual justification has yet to be fully answered, there are other considerations for adhering to CMP.

b. The Causal Objection

The causal objection argues that we cannot perceive moral properties because we cannot be put in a causal relation with them. That is, one might think that moral properties are causally inert, and for this reason we cannot perceive them. Put in the form of an argument, the causal objection appears as:

    1. To perceive some property, one must be placed in the appropriate causal relation with that property.
    2. One can never be put in the proper causal relation with moral properties.
    3. One cannot perceive moral properties.

McBrayer responds to the causal objection by pointing out that on three of the most popular realist accounts of moral properties, premise two comes out false (McBrayer 2010). These three proposals are (i) treating moral properties as secondary properties, (ii) treating moral properties as natural properties, and (iii) treating moral properties as non-natural properties.

When moral properties are held to be secondary properties, where secondary properties are properties that under appropriate viewing conditions are perceived as such, premise two fails as demonstrated by an analogy between colors and moral properties. We can imagine looking at a chair under midday light and perceiving it to be brown. What causally contributes to our perceptual experience is not the brownness of the chair (due to the nature of secondary properties), but the other properties of the chair. Nonetheless, perceiving the chair results in knowledge of the chair’s color, so we are still put in an appropriate causal relation with the property of brownness. In the case of moral properties, stipulated to be secondary properties, we will be placed in the same causal relation with them as we are with colors. Under ideal viewing circumstances, we will be placed in a causal relation with the base properties (such as a father hugging a child) and perceive the goodness of that action. In short, if we take moral properties to be secondary properties, the response to the causal objection is a common cause style of explanation.

If one takes a reductionist naturalist account of the moral properties, matters are even simpler. Because moral properties are identical to natural properties, the explanation as to how we are able to be in the proper causal relation with them is the same explanation as to how we are able to be in the proper causal relationship with chairs, cars, and human actions.

Finally, according to McBrayer, non-naturalism about moral properties avoids the causal objection as well. What the proponent of the causal objection wants is a non-accidental connection between our perceptual beliefs and the moral facts, and an account that delivers such a connection suffices to defuse the causal objection. This is so even if the connection is not causal, strictly speaking. To see this, first note that we are stipulating the supervenience principle: the moral facts necessarily supervene on the natural facts such that there is no change in the moral without a change in the natural. Assuming that we can see supervening properties, the accidentality is eliminated because whenever we see a supervening property we see the natural property that serves as its base, and our causal relation to that natural property satisfies the causal constraint.

The causal objection is an instance of a general challenge to the perception of high-level properties. In this case, the causal objection is an instance of explanatory superfluity. This challenge is as follows: One might think that we cannot be put in a causal relation with high-level properties, and so we do not perceive them. There is no need to claim that we are in a causal relation with trees when being in a causal relation with the lower-level properties of trees is sufficient for justified tree belief; further causal contact would be an instance of overdetermination. To put the objection in a slightly different way, if our perceptual states are in a causal relation with the property of being a pine tree, then the content of our perceptual experience of a pine tree would be causally overdetermined. There is no reason to think that our perceptual experiences are overdetermined, so our perceptual states are not in a causal relation with the property of being a pine tree. It is not clear how much the defender of CMP should worry about this objection. Because the causal objection shares strong features with the causal exclusion problem of mind-body interaction, which provides a framework for addressing these issues, the objection may not carry much weight (Kim 1993, Yablo 2003).

c. The ‘Looks’ Objection

If perception justifies some moral beliefs, then this is presumably because there is a phenomenological character, a what-it-is-likeness, when perceiving moral properties. The ‘looks objection’ claims that this is not the case: we do not have perceptual justification of moral beliefs because there is no phenomenological character for moral properties (Huemer 2005, Reiland 2021). The argument is commonly structured this way:

    1. A moral belief is perceptually justified only if there is some way that a moral property looks.
    2. Moral properties have no look.
    3. No moral beliefs are perceptually justified.

We can deny the ‘looks’ objection by rejecting premise one or premise two, or by arguing that the conclusion does not follow. Because ‘looks’ is ambiguous in the argument, one strategy for denying the objection is to interpret ‘looks’ in various ways and see if the argument still goes through. McBrayer (2010a) tackles the ‘looks’ objection by considering several possible readings of ‘looks’ other than the phenomenal ‘looks’ mentioned above. The upshot of McBrayer’s strategy is that on all interpretations of ‘looks’ he considers, the objection fails. McBrayer settles on a possible reading of ‘looks’ which is supposed to provide the strongest version of the objection. This is the ‘normally looks’ reading, on which ‘looks’ is understood as the way something resembles something else. If we substitute ‘normally look’ in premise two, we get:

2′. Moral properties do not normally look like anything.

Even with ‘normally looks’, the objection does not succeed, for the following reasons. When ‘normally looks’ is read as normally looking a way to multiple people, the argument fails because many non-moral properties, assuming they have a normal look, do not appear the same way to multiple people. For example, imagine a group of individuals looking at a car from different viewpoints; there is no single way the car appears to all of them. Yet, if a car has a normal look but can appear different ways to different individuals, then there is no principled reason to think that rightness cannot appear different ways yet have a normal look as well. Understood in this cross-person sense, 2′ comes out false. Similarly, when 2′ is read as the way a thing normally looks to an individual, the objection still fails. Even if 2′ is true, it is only true of low-level properties such as colors, since no matter what angle you view red from, it would always look the same. Many high-level properties, such as danger, do not have a way of normally looking to an individual. But, assuming we are perceptually justified in judgments of danger despite its disparate looks, such as a rattlesnake looking dangerous and a loaded gun looking dangerous, premise 1 does not hold. We may still be perceptually justified in a belief about a property even if there is no particular look for that property. Finally, if an opponent argues that there is a complex and ineffable way that high-level properties normally look, then this strategy is open to the defender of moral perception as well, so 2′ again comes out false. On all readings McBrayer considers, the ‘looks’ objection is unsound.

Proponents of the ‘looks objection’ may be unsatisfied with McBrayer’s response, however. The kind of ‘looks’ that is likely intended by opponents of CMP is ‘phenomenal looks’. That is, the what-it-is-likeness of perceiving something, such as what it is like to perceive a car or a cat, is the intended meaning of “looks” in the argument. “Looks” in fact was characterized as the phenomenal kind in the opening paragraph of this section. However, McBrayer omits this reading of ‘looks’, and misses the most plausible and strongest reading of the objection. It remains up to contemporary defenders of CMP to provide an account of what the phenomenological ‘looks’ of moral properties are like. Until an account is provided, the looks objection remains a live challenge.

Whatever this account may be, it will also provide a general strategy for answering a general looks objection in the philosophy of perception. This objection is the same as the looks objection listed above, but with instances of ‘moral’ replaced with ‘high-level property’, and concludes that our high-level property beliefs are not perceptually justified (McGrath 2017). If an account is successful at articulating what the phenomenal look of a high-level property is, or at motivating the belief that high-level properties have one, then this provides a framework for CMP to use in answering the moral looks objection.

4. Arguments from Philosophy of Perception

While the prima facie arguments provide initial motivation for CMP, the thesis is ultimately about the epistemic deliverances of our sensory faculty. Accordingly, much of the debate about the viability of CMP parallels debates in the general philosophy of perception. In this section, we will see the arguments for and against moral perception drawing from empirical perceptual psychology and general philosophy of perception.

a. High-Level Contents in Perception

A natural move for the moral perceptualist in defense of the claim that we are non-inferentially justified by perception is to argue that we see moral properties. The perceptualist here means to be taken literally: we see moral properties in the way we see the yellow of a lemon or the shape of a motorcycle. If we do perceive moral properties, then a very straightforward epistemic story can be told. With this story, the perceptualist aims to explain how perceptual justification of moral beliefs is the same as perceptual justification of beliefs about ordinary objects. For example, the explanation for how someone knows there is a car before them is that they see a car and form the corresponding belief that there is a car. The story for justification of moral beliefs here will be that someone sees the wrongness of some action and forms the corresponding belief that the action is wrong (absent defeaters). The perceptualist will typically flesh out this move by assuming an additional epistemic requirement, called the Matching Content Constraint (MCC):

MCC: If your visual experience E gives you immediate justification to believe some external world proposition that P, then it’s a phenomenal content of E that P (Silins 2011).

The MCC states that one is non-inferentially justified only if there is a match in contents between a perceiver’s perceptual state and doxastic state (their belief). The reason perceptual contents matter to CMP is that if perceptual contents include moral properties, then one has a perceptual experience of those moral properties, and if one has an experience of those moral properties then a story for a non-inferential perceptual justification of moral beliefs is in hand, which is no different from our perceptual justification of other objects. On the other hand, if there is a mismatch between our perceptual contents and our moral beliefs, then we may find a non-inferentialist perceptual epistemology such as CMP to be implausible.

Given the MCC, the perceptualist needs it to be the case that perceptual experience includes high-level contents, such as being a car, being a pine tree, or being a cause of some effect. If perceptual experiences do contain high-level contents, then the inclusion of moral contents in perceptual experiences is a natural theoretical next step, barring a principled reason for exclusion. After all, if we commit to arguing that we perceive causation and carhood, extending the contents of perception to rightness (and wrongness) does not appear to require too large a stretch of the imagination. The extension of perceptual experiences to include moral contents meets the matching content constraint, and it clears the way for arguing for CMP. However, if the contents of our perceptual experiences are restricted to low-level contents, which are colors, shapes, depth, and motion (although what counts as a low-level content may vary between theorists), the defense of CMP becomes much trickier.

Holding onto CMP because one accepts a high-level theory of content comes with its own risk. If a thin view of contents turns out to be the correct account of perceptual content, such that what makes up the content of our perceptual states are color arrays, shapes, depth and motion, then CMP appears to lose much of its motivation. It would be theoretically awkward to insist that moral contents show up if contents about cars, pine trees, and causation are incapable of doing so. And if moral properties do not appear in the contents of perceptual experience, then a simple story as to how we can have perceptual justification for moral beliefs is lost.

Even if perception has neither high-level contents nor moral contents, this does not mean that CMP is a failed theory of moral epistemology. Sarah McGrath provides a story as to how we can have perceptually justified moral beliefs in the absence of high-level contents in perception (2018, 2019). This story is an externalist one; the source of the justification comes from a Bayesian account of the adjustment of priors (the probability that a belief is true) given non-moral observations, rather than any experiential contents of morality itself. Through perceptual training and experience our perceptual system is trained to detect morally relevant stimuli, such as detecting the whimper of pain a dog may voice when kicked. On McGrath’s view, then, one is perceptually justified in a moral belief when the perceptual system reliably tracks the moral facts. The upshot for the defender of CMP is that there is much theorizing to be done about the compatibility between CMP and the thin-content view, and McGrath’s view shows one way to reconcile the two.

b. Phenomenal Contrast Arguments

An argument for thinking that we do perceive moral properties, as well as other high-level properties, is the argument from phenomenal contrast. Susanna Siegel develops a kind of phenomenal contrast argument as a general strategy for arguing that the contents of our perception are far richer than a thin view of contents would allow (2006, 2011, 2017). A phenomenal contrast argument works as follows. We are asked to imagine two scenarios, one in which a property is present, and a contrast scenario where the same property is absent. If the intuition about these cases is that the perceptual phenomenology is different for a perceiver in these scenarios, then one can argue that what explains the difference in experience is the presence of the property in one case and its absence in the other, which makes a difference to what is perceptually experienced. The reason an advocate of CMP would want to use this strategy is that if there is a phenomenal contrast between two cases, then there is an explanatory gap that CMP fills; if there is a moral experience in one case but not in a different similar case, CMP can explain the difference by saying that a moral property is perceived in one case but not in the other, thus explaining the phenomenal difference.

To better illustrate phenomenal contrast, here is a concrete example from Siegel arguing that causation appears in the contents of perception (2011). Imagine two cases, in both of which we are placed behind a shade and see two silhouettes of objects. In the control case, we see one silhouette of an object bump into another object, and the second object begins to roll. In the contrast case, whenever one silhouette begins to move towards the other silhouette, the other silhouette begins to move as well, keeping a steady distance from the first silhouette. If we have the intuition that these two cases are phenomenally different for a perceiver, then Siegel argues that the best explanation for this difference is that causation is perceptually represented in the former case and not the latter, whereas competitors who deny that causation appears in the content would have to find some alternative, and more complicated, explanation for the contrast.

The phenomenal contrast argument has been wielded to argue for moral contents specifically by Preston Werner (2014). Werner asks us to imagine two different individuals, a neurotypical individual and an emotionally empathetic dysfunctional individual (EEDI), coming across the same morally-valenced scenario. Let this scenario be a father hugging a child. When the neurotypical individual comes upon the scene of the father hugging his child, this individual is likely to be moved and have a variety of physiological and psychological responses (such as feeling the “warm fuzzies”). When the EEDI comes upon the scene of the father hugging his child, however, they will be left completely cold, lacking the physiological and psychological responses the neurotypical individual underwent. This version of the phenomenal contrast argument purports to show that what best accounts for the experiential difference between these two individuals is that the neurotypical individual is able to perceptually represent the moral goodness of the father hugging the child, thus explaining the emotional reaction, whereas the EEDI was left cold because of their inability to perceptually represent moral goodness. If this argument is successful, then we have reason to think that moral properties appear in the contents of experience.

One might object here that Werner is not following the general methodology that Siegel sets out for phenomenal contrast. Werner defends his case as a phenomenal contrast by arguing that making use of two different scenarios would be too controversial to be fruitful because of the difference between learning to recognise morally valenced situations and having the recognitional disposition to recognise pine trees, and that the two individuals in the scenario are sufficiently similar in that they both have generally functional psychologies, but interestingly different in that the EEDI lacks the ability to properly emotionally respond to situations. Even so, we might wonder about the use of an EEDI in this phenomenal contrast case. Although the EEDI possesses much of the same cognitive architecture as the neurotypical individual, the EEDI is also different in significant respects. First, an immediate explanation of the difference might appeal to emotions, rather than perceptual experiences; the EEDI lacks the relevant emotions requisite for moral experiences. Second, the EEDI makes for a poor contrast if they lack the moral concepts needed to recognise moral properties in the first place. Further, the use of an EEDI as a contrast may prove problematic because the exact nature of an EEDI is unclear; claiming that the best explanation of the difference between the two individuals’ experiences is a representational difference may be premature in the face of numerous and conflicting theories about the pathology of an EEDI. That is, because an EEDI’s perceptual system is identical to that of the neurotypical individual, the EEDI may still perceptually represent moral properties but fail to respond to or recognise them for some other reason. If this hypothesis is correct, then the use of an EEDI is illegitimate because it does not capture the purported experiential difference.

c. Phenomenal Contrast and Parity Considerations

Even if CMP gets the right result, this does not rule out that other views can explain the phenomenology as well. For example, Pekka Väyrynen claims that inferentialism provides a better explanation of moral experiences, particularly regarding explanations of different experiences in phenomenal contrast scenarios (2018). To show this, Väyrynen first provides a rival hypothesis to a perceptualist account, which is as follows. When we see a father hugging his child, our experience of moral goodness is a representation that “results from an implicit habitual inference or some other type of transition in thought which can be reliably prompted by the non-moral perceptual inputs jointly with the relevant background moral beliefs” (Väyrynen 2018). This rival hypothesis aims to explain the phenomenological experiences targeted by phenomenal contrast arguments by stating that rather than moral properties appearing in our perceptual contents, what happens when we have a moral experience is that past moral learning, in conjunction with the non-moral perceptual inputs, forms a moral belief downstream from perception.

To see how this might work in a non-moral case, we can consider the following vignette, Fine Wine (Väyrynen 2018):

Greg, an experienced wine maker, reports that when he samples wine he perceives it as having various non-evaluative qualities which form his basis for classifying it as fine or not. Michael, a wine connoisseur, says that he can taste also fineness in wine.

Väyrynen asks if Michael has a perceptual experience of a property, in this case, fineness, that Greg cannot pick up on, and argues that there is no difference in perceptual experience. Granting that Greg and Michael’s experiences of the wine can differ, we need not appeal to Michael being able to perceive the property of fineness in order to explain this difference. What explains the difference in phenomenology, according to Väyrynen, is that Michael’s representations of fineness are “plausibly an upshot of inferences or some other reliable transitions in thought…” (Väyrynen 2018). Väyrynen’s hypothesis aims to reveal the phenomenal contrast argument as lacking the virtue of parsimony. That is, the perceptualist is using more theoretical machinery than needed to explain the difference in phenomenal experiences. The phenomenal contrast argument explains the difference in phenomenology between two individuals by claiming that moral properties appear in the contents of perception. Väyrynen’s rival hypothesis is supposed to be a simpler and more plausible alternative that explains why we may think high-level contents are in perception. First, it explains what appears to be a difference in perceptual experience as a difference in doxastic experience (a difference in beliefs). Second, because the difference is in doxastic experience, Väyrynen’s hypothesis does not commit to high-level contents in perception. Everyone who is party to this debate agrees on the existence of low-level perceptual contents and doxastic experiences, so to endorse high-level contents is to take on board an extra commitment. All things being equal, it is better to explain a phenomenon with fewer theoretical posits. In other words, Väyrynen’s hypothesis gets better explanatory mileage than the perceptualist’s phenomenal contrast argument.

Consider a moral counterpart of Fine Wine, in which Greg and Michael witness a father hugging his child. Greg rarely engages in moral theorizing, but he classifies the action as morally good based on some of the non-moral features he perceives. Michael, on the other hand, is a world-class moral philosopher who claims he can see the goodness or badness of actions. The perceptualist will say that Michael perceives goodness, whereas Greg is perceptually lacking such that he cannot pick up on moral properties. The perceptualist who makes use of phenomenal contrast arguments is committed to saying here that Michael’s perceptual system has been trained to detect moral properties and has moral contents in perceptual experience, whereas Greg has to do extra cognitive work to make a moral judgment. Väyrynen’s rival hypothesis, on the other hand, need not claim that we perceptually represent moral properties. It can instead explain the difference in phenomenology by appealing to the implicit inferences one may make in response to non-moral properties to which one has a trained sensitivity. According to Väyrynen’s hypothesis, Michael’s cognitive system is trained to make implicit inferences in response to certain non-moral properties, whereas Greg needs to do a bit more explicit cognitive work to make a moral judgment. What seems like a difference in perceptual experience is explained away as a difference in post-perceptual experience.

Väyrynen’s hypothesis also challenges Werner’s phenomenal contrast argument above, as it offers an explanation of the EEDI’s different phenomenological experience. The neurotypical individual has the moral experience because of the implicit inferences being made, but the EEDI fails to have the same experience because the EEDI lacks a sensitivity to the moral properties altogether and so fails to draw the inferences the neurotypical individual is trained to make. In short, the rival hypothesis explains the difference in phenomenological experience by differences in belief, rather than in perception.

d. Cognitive Penetration

It is already clear how low-level contents make it into perception, as perceptual scientists are familiar with the rod and cone cells that make up the retina and process incoming light, as well as with how that information is used by the early visual system. It is less clear how high-level contents make their way into perceptual experience. If perception does contain high-level contents, then a mechanism is required to explain how such contents make it into perceptual experience. The mechanism of choice for philosophers of perception and cognitive scientists is cognitive penetration. Cognitive penetration is a psychological hypothesis claiming that at least some of an individual’s perceptual states are shaped by that individual’s propositional attitudes, such as beliefs, desires, and fears. Put another way, cognitive penetration is the claim that perceptual experience is theory-laden.

To understand how cognitive penetration is supposed to work, we should consider another phenomenal contrast case. Imagine that you are working at a nature conservation center and are unfamiliar with the plant known as Queen Anne’s lace. While working at the conservation center, you are told by your supervisor that plants that look a certain way are Queen Anne’s lace. After repeated exposure to the plant, that a plant is Queen Anne’s lace becomes visually salient to you. In other words, your perceptual experience of Queen Anne’s lace prior to learning to recognize it is different from the perceptual experience you have of the plant after you have learned to recognize it. Cognitive penetration explains this shift in perceptual experience by holding that your beliefs about Queen Anne’s lace shape your perceptual experience, such that the property of ‘being Queen Anne’s lace’ makes it into the content of your perception. In other words, the difference in perceptual experiences is explained by the difference in perceptual contents, which in turn is explained by perceptual experiences being mediated by propositional attitudes. We should take care to separate this from a similar thesis which claims that there is no change in perceptual experience after learning to recognize Queen Anne’s lace, but that the shift in the phenomenology (the what-it-is-likeness) of looking at Queen Anne’s lace is explained by changes in post-perceptual experience, such as having new beliefs about the plant. Cognitive penetration claims that the phenomenological difference is between perceptual experiences, and that it is the beliefs about Queen Anne’s lace that change the perceptual experience.

Cognitive penetration is an attractive option for the perceptualist because it provides a mechanism to explain how moral properties make their way into the contents of perception. Consequently, the perceptualist’s account of how we see the rightness or wrongness of actions will be identical to the story about Queen Anne’s lace above: an individual learns about morality from their community and forms moral beliefs, which in turn prime the perceptual system to perceive moral properties. Once the perceptualist has cognitive penetration, they have a story for how moral properties enter the contents of perception, and they can then deliver an elegant epistemology of moral justification. This epistemology respects the matching content constraint, which states that in order for a belief to be justified by perception, the contents of the belief must match the contents of perception. The perceptualist may then say that we have foundational perceptual justification for our moral beliefs, in the same way that we have foundational perceptual justification for tree beliefs. Just as we see that there is a tree before us, we see that an action is wrong.

e. The Mediation Challenge

The perceptualist’s use of cognitive penetration has led to challenges to the view on the grounds that cognitive penetration, the thesis that propositional attitudes influence perceptual experiences, leads to counterintuitive consequences. One of the most prominent challenges to the possibility of moral perception comes from Faraci (2015), who argues that if cognitive penetration is true, then CMP must be false. To motivate the argument that no moral justification is grounded in perception, Faraci defends a principle he calls mediation:

If perceptions of X are grounded in experiences as of Y, then perceptions of X produce perceptual justification only if they are mediated by background knowledge of some relation between X and Y. (Faraci 2015)

What mediation states is that one can only have perceptual justification of some high-level property if the experience of that higher level property is in some way informed by knowledge of its relation to the lower-level properties it is grounded in. To motivate the plausibility of mediation, Faraci appeals to the non-moral example of seeing someone angry. If Norm sees Vera angry, presumably he knows that she is angry because he sees her furrowed brow and scowl, and he knows that a furrowed brow and scowl is the kind of behavior that indicates that someone is angry. In an analogous moral case, someone witnessing a father hugging his child knows that they are seeing a morally good action only if they have the relevant background knowledge, the relevant moral beliefs, connecting parental affection with goodness. If the witness did not possess the moral bridge principle that parental affection was good, then the witness would not know that they had seen a morally good action. The thrust of the argument is that if mediation is plausible in the non-moral case, then it is plausible in the moral case as well. If mediation is plausible in the moral case, then CMP is an implausible account of moral epistemology because it will need to appeal to background moral knowledge not gained in perceptual experience to explain how we see that the father hugging the child is a morally good action.

Faraci considers three possible ways in which the defender of moral perception might avoid appeal to background knowledge. The first option is to claim that the moral bridge principles are themselves known through perceptual experiences. In the case of the father hugging his child, then, we antecedently had a perceptual experience that justified the belief that parental affection is good. The problem with this response is that it leads to a regress, since we would need further background knowledge connecting parental affection and goodness (such as that parental affection causes pleasure, and pleasure is good), and experientially gained knowledge of each further bridge principle.

The second option is that one could already know some basic moral principles a priori. The problem with this response should be apparent, since if we know some background principles a priori, then this is to concede the argument to Faraci that none of our most basic moral knowledge is known through experience.

Finally, someone could try to argue that one comes to know a moral fact by witnessing an action multiple times and its correlation with its perceived goodness, but the problem with this is that if each individually viewed action is perceived as being good, then we already have background knowledge informing us of the goodness of that act. If so, then we have not properly answered the mediation challenge and shown that CMP is a plausible epistemology of morality.

One way to defend CMP in response to Faraci’s challenge is to follow Preston Werner’s claim that mediation is too strong and offer a reliabilist account of justification that is compatible with a weaker reading of mediation. Werner considers a weak reading and a strong reading of Faraci’s mediation condition (Werner 2018). Werner rejects the strong reading of mediation on the grounds that while it may make Faraci’s argument against the plausibility of moral perception work, it overgeneralises to cases of perception of ordinary objects. Werner points out that the strong reading of mediation requires that we be able to make explicit the background knowledge of perceptual judgements that we make; if we perceive a chair, the strong reading requires that we be able to articulate the ‘chair theory’ that is informing our perceptual experience, otherwise our perceptual judgment that there is a chair is unjustified. Because the vast majority of non-mereologists are not able to make explicit a ‘chair-theory’ informing their perception, the strong reading yields the verdict that the vast majority of us are unjustified in our perceptual judgment of there being a chair.

The weak reading that Werner offers is what he calls “thin-background knowledge”, which he characterizes as “subdoxastic information that can ground reliable transitions from perceptual information about some property Y to perceptual information as of some other property X” (Werner 2018). The upshot is that a pure perceptualist epistemology of morality is compatible with the thin-background knowledge reading of mediation: We do not need to have access to the subdoxastic states that ground our perceptual judgments in order for us to know that our perceptual judgments are justified. Werner’s response to Faraci, in summary, is that a pure perceptualist epistemology is plausible because thin-background knowledge gives us an explanation as to how our perceptual moral judgments are in good epistemic standing.

f. Moral Perception and Wider Debates in The Philosophy of Perception

A broader lesson from the mediation challenge, however, is that many of the issues facing CMP are the same as those that appear in general debates regarding the epistemology and metaphysics of perception. In the case of Faraci (2015), the argument is a particular instance of a wider concern about cognitive penetration.

What the mediation challenge reflects is a general concern about epistemic dependence and epistemic downgrade in relation to cognitive penetration. In particular, the mediation principle is an instance of the general challenge of epistemic dependence:

A state or process, e, epistemically depends upon another state, d, with respect to content c if state or process e is justified or justification-conferring with respect to c only if (and partly because) d is justified or justification-conferring with respect to c. (Cowan 2014, 674)

The reason one might worry about epistemic dependence in connection with cognitive penetration is that the justification-conferring states in instances of cognitive penetration might be the belief states shaping the perceptual experiences, rather than the perceptual experiences themselves. If this is true, then a perceptual epistemology of all high-level contents is doubtful, since what does the justificatory work in identifying pine trees will be either training or reflecting on patterns shared between trees, neither of which lends itself to a perceptual story.

There is another general worry for cognitive penetration: epistemic downgrade. Assuming cognitive penetration is true, even if one were able to explain away epistemic dependence, one might still think that our perceptual justification is held hostage by the beliefs that shape our experiences. For illustration, let us say we have the belief that anyone with a hoodie is carrying a knife. If we see someone wearing a hoodie and they pull a cellphone out, our belief may shape our perceptual state such that our perceptual experience is that of the person in the hoodie pulling a knife. We then believe that the person is pulling a knife. Another example of epistemic downgrade is that of anger:

Before seeing Jack, Jill fears that Jack is angry at her. When she sees him, her fear causes her to have a visual experience in which he looks angry at her. She goes on to believe that he is angry (Siegel 2019, 67).

In both cases there appears to be an epistemic defect: a belief is shaping a perceptual experience, which in turn provides support to the very same belief that shaped that experience. It is an epistemically vicious feedback loop. The worry about epistemic downgrade and high-level contents should be clear. In the case of morality, our background beliefs may be false, which will in turn shape our moral perceptual experiences so that they misrepresent. This appears to provide a defeater for perceptual justification of morality, and it forces the perceptualist to defend the moral background beliefs. That defense may turn out to be an a priori exercise, undermining the a posteriori character of justified moral beliefs that the perceptualist wanted.

One way to avoid these epistemic worries is for the moral perceptualist to endorse some form of epistemic dogmatism, which is to claim that seemings (perceptual or doxastic) provide immediate, prima facie, defeasible justification for belief. The perceptualist who adopts this strategy can argue that the worry of epistemic dependence is misplaced: although the presence of high-level content is causally dependent on the influence of background beliefs, on the dogmatist theory justification for a belief epistemically depends only on the perceptual experience itself. To see this, consider the following analogy: If one is wearing sunglasses, the perceptual experience one has will depend on those sunglasses, yet one’s perceptual beliefs are justified not by the sunglasses but by the perceptual experience itself (Pryor 2000). For concerns about epistemic downgrade, the perceptualist may give a similar response, which is to state that one is defeasibly justified in a perceptual belief until one is made aware of a defeater, which in this case is the vicious feedback loop. To be clear, no moral perceptualist has made use of this response in print, as most opt for a kind of externalist account of perceptual justification. We should keep in mind that the dogmatist response is made in debates in general perceptual epistemology, and because debates about the epistemic effects of cognitive penetration in moral perception are instances of the general debate, the dogmatist strategy is available should the moral perceptualist wish to use it.

Apart from the epistemic difficulties cognitive penetration incurs, because cognitive penetration is a thesis about the structure of human cognitive architecture, it must withstand scrutiny from cognitive science and empirical psychology. A central assumption of the cognitive science and psychology of perception is that the perceptual system is modular, or informationally encapsulated. However, cognitive penetration assumes the opposite because it claims that beliefs influence perceptual experience. Because cognitive penetration holds that the perceptual system is non-modular and receives input from the cognitive system, it falls upon advocates of the hypothesis to show that there is empirical support for the thesis. The problem is that most empirical tests purporting to demonstrate effects of cognitive penetration are questionable. The results have either been debunked as explainable by other psychological effects, such as attention effects, or been dismissed on the grounds of poor methodology and difficulty of replication (Firestone and Scholl 2016a). Furthermore, in the case of perceptual learning, cognitive penetration predicts changes in the neurophysiology of the cognitive system rather than in the perceptual system, as it would be new beliefs that explain learning to recognize an object. Research in perceptual neurophysiology shows the opposite: perceptual learning is accompanied by changes in the neurophysiology of the perceptual system (Connolly 2019). The viability of CMP, insofar as it depends on cognitive penetration for high-level contents, is subject not only to epistemic pressures, but also to empirical fortune.

5. Summary: Looking Forward

For moral epistemologists, a foundationalist epistemology that provides responses to skeptical challenges is highly desirable. While a variety of theories of moral epistemology do provide foundations, CMP provides an epistemology that grounds our justification in a perceptual faculty with which we are all familiar, and it provides a unified story for all perceptual justification.

The overall takeaway is that the arguments made by both defenders and challengers of CMP are instances of general issues in the philosophy of perception. The lesson to be drawn here for CMP is that the way forward is to pay close attention to the general philosophy of perception literature. Because the literature on CMP itself remains in very early development, paying attention to the general issues will prevent the advocate of CMP from falling into mistakes made in the general literature, as well as open potential pathways for developing CMP in interesting and novel ways.

6. References and Further Reading

  • Audi, Robert. 2013. Moral Perception. Princeton University Press.
    • A book length defense of CMP. A good example of the kind of epistemic ecumenicism a perceptualist may adopt.
  • Bergqvist, Anna, and Robert Cowan (eds.). 2018. Evaluative Perception. Oxford: Oxford University Press.
    • Collection of essays on the plausibility of CMP and emotional perception.
  • Church, Jennifer. 2013. “Moral Perception.” Possibilities of Perception (pp. 187-224). Oxford: Oxford University Press.
    • Presents a Kantian take on moral perception.
  • Crow, Daniel. 2016. “The Mystery of Moral Perception.” Journal of Moral Philosophy 13, 187-210.
    • Challenges moral perception with a reliability challenge.
  • Connolly, Kevin. 2019. Perceptual Learning: The Flexibility of the Senses. Oxford: Oxford University Press.
    • Discusses the findings of the neuroscience and psychology of perception in relation to theses in the philosophy of mind. Chapter 2 argues against cognitive penetration.
  • Cowan, Robert. 2014. “Cognitive Penetrability and Ethical Perception.” Review of Philosophy and Psychology 6, 665-682.
    • Discusses the epistemic challenges posed to moral perception by cognitive penetration. Focuses on epistemic dependence.
  • Cowan, Robert. 2015. “Perceptual Intuitionism.” Philosophy and Phenomenological Research 90, 164-193.
    • Defends the emotional perception of morality.
  • Cowan, Robert. 2016. “Epistemic perceptualism and neo-sentimentalist objections.” Canadian Journal of Philosophy 46, 59-81.
    • Defends the emotional perception of morality.
  • Faraci, David. 2015. “A hard look at moral perception.” Philosophical Studies 172, 2055-2072.
  • Faraci, David. 2019. “Moral Perception and the Reliability Challenge.” Journal of Moral Philosophy 16, 63-73.
    • Responds to Werner 2018. Argues that moral perception has a reliability challenge.
  • Firestone, Chaz, and Brian J. Scholl. 2016a. “Cognition Does Not Affect Perception: Evaluating the Evidence for ‘Top-down’ Effects.” Behavioral and Brain Sciences 39.
    • Challenges studies that purport to demonstrate the effects of cognitive penetration.
  • Firestone, Chaz, and Brian J. Scholl. 2016b. “‘Moral Perception’ Reflects Neither Morality Nor Perception.” Trends in Cognitive Sciences 20, 75-76.
    • Response to Gantman and Van Bavel 2015.
  • Fodor, Jerry. 1983. The Modularity of Mind. Cambridge, Massachusetts: MIT Press.
    • Argues for the informational encapsulation of the perceptual system.
  • Gantman, Ana P. and Jay J. Van Bavel. 2014. “The moral pop-out effect: Enhanced perceptual awareness of morally relevant stimuli.” Cognition 132, 22-29.
    • Argues that findings in perceptual psychology support moral perception.
  • Gantman, Ana P. and Jay J. Van Bavel. 2015. “Moral Perception.” Trends in Cognitive Sciences 19, 631-633.
  • Hutton, James. 2022. “Moral Experience: Perception or Emotion?” Ethics 132, 570-597.
  • Huemer, Michael. 2005. Ethical Intuitionism. New York: Palgrave Macmillan.
    • Section 4.4.1 presents the ‘looks’ objection.
  • Kim, Jaegwon. 1993. Supervenience and Mind: Selected Philosophical Essays. Cambridge: Cambridge University Press.
    • A collection of essays discussing the causal exclusion problem.
  • McBrayer, Justin P. 2010a. “A limited defense of moral perception.” Philosophical Studies 149, 305–320.
  • McBrayer, Justin P. 2010b. “Moral perception and the causal objection.” Ratio 23, 291-307.
  • McGrath, Matthew. 2017. “Knowing what things look like.” Philosophical Review 126, 1-41.
    • Presents a general version of the ‘looks’ objection.
  • McGrath, Sarah. 2004. “Moral Knowledge by Perception.” Philosophical Perspectives 18, 209-228.
    • An early formulation of CMP, discusses the epistemic motivations for the view.
  • McGrath, Sarah. 2018. “Moral Perception and its Rivals.” In Anna Bergqvist and Robert Cowan (eds.), Evaluative Perception (pp. 161-182). Oxford: Oxford University Press.
  • McGrath, Sarah. 2019. Moral Knowledge. Oxford: Oxford University Press.
    • Chapter 4 is a presentation of CMP that does not require high-level contents. Chapter 1 is a criticism of some views on the methodology of moral inquiry.
  • Pylyshyn, Zenon. 1999. “Is Vision Continuous with Cognition? The Case for Cognitive Impenetrability of Visual Perception.” Behavioral and Brain Sciences 22, 341-365.
  • Pryor, James. 2000. “The Skeptic and the Dogmatist.” Noûs 34, 517-549.
    • Early presentation of phenomenal dogmatism. Responds to epistemic concerns about the theory-ladenness of perception.
  • Reiland, Indrek. 2021. “On experiencing moral properties.” Synthese 198, 315-325.
    • Presents a version of the ‘looks’ objection.
  • Siegel, Susanna. 2006. “Which properties are represented in perception.” In Gendler, Tamar S. & John Hawthorne (eds.), Perceptual Experience (pp. 481-503). Oxford: Oxford University Press.
    • Argues that perceptual experience includes high-level contents.
  • Siegel, Susanna. 2011. The Contents of Visual Experience. Oxford: Oxford University Press.
    • Book length defense of high-level contents in perceptual experience.
  • Siegel, Susanna. 2012. “Cognitive Penetrability and Perceptual Justification.” Noûs 46.
    • Discusses the issue of epistemic downgrade.
  • Siegel, Susanna. 2019. The Rationality of Perception. Oxford: Oxford University Press.
    • Chapter 4 is a discussion of epistemic downgrade and responds to criticisms of the problem.
  • Siegel, Susanna & Byrne, Alex. 2016. “Rich or thin?” In Bence Nanay (ed.), Current Controversies in Philosophy of Perception (pp. 59-80). New York: Routledge-Taylor & Francis.
    • Byrne and Siegel debate whether or not there are high-level perceptual contents.
  • Väyrynen, Pekka. 2018. “Doubts about Moral Perception.” In Anna Bergqvist and Robert Cowan (eds.), Evaluative Perception (pp. 109-128). Oxford: Oxford University Press.
  • Werner, Preston J. 2016. “Moral Perception and the Contents of Experience.” Journal of Moral Philosophy 13, 294-317.
  • Werner, Preston J. 2017. “A Posteriori Ethical Intuitionism and the Problem of Cognitive Penetrability.” European Journal of Philosophy 25, 1791-1809.
    • Argues that synchronic cognitive penetration is a problem for CMP, but diachronic cognitive penetration is epistemically harmless.
  • Werner, Preston J. 2018. “Moral Perception without (Prior) Moral Knowledge.” Journal of Moral Philosophy 15, 164-181.
    • Response to Faraci 2015.
  • Werner, Preston J. 2018. “An epistemic argument for liberalism about perceptual content.” Philosophical Psychology 32, 143-159.
    • Defends the claim that there are high-level contents in perception by arguing that it best explains some findings in perceptual psychology, such as facial recognition.
  • Werner, Preston J. 2020. “Which Moral Properties Are Eligible for Perceptual Awareness?” Journal of Moral Philosophy 17, 290-319.
    • Discusses which moral properties we can perceive, concludes that we perceive at least pro-tanto evaluative properties.
  • Wodak, Daniel. 2019. “Moral perception, inference, and intuition.” Philosophical Studies 176, 1495-1512.
  • Yablo, Stephen. 2005. “Wide Causation” In Stephen Yablo (ed.), Thoughts: Papers on Mind, Meaning, and Modality. Oxford: Oxford University Press.
    • Presents a solution to the causal exclusion problem. Argues that mental states are causally efficacious in a ‘wide’ sense in that they would still be explanatorily valuable even if the ‘thin’ causes, the physical states, were different.

 

Author Information

Erich Jones
Email: Jones.7269@buckeyemail.osu.edu
The Ohio State University
U. S. A.

The Arrow of Time

Many philosophers and physicists claim that time has an arrow that points in a special direction. The Roman poet Ovid may have referred to this one-way property of time when he said, “Time itself glides on with constant motion, ever as a flowing river.”

Experts who accept the existence of time’s arrow divide into two broad camps on the question of the arrow’s foundation. Members of the intrinsic camp say time’s arrow is an intrinsic feature of time itself, a property it has on its own independently of what processes there are. They are apt to describe the arrow as an uninterrupted passage or flow from the past to the future. Members of the extrinsic camp say the arrow is not intrinsic to time itself but instead is all the processes that happen to go regularly and naturally in only one direction, such as milk dispersing into black coffee, turning it brown, and people becoming biologically older, not younger.

If the extrinsic position were correct, one might naturally expect the underlying laws of physics to reveal why all those macroscopic processes are seen to go regularly in only one time direction; but the laws do not—at least the fundamental ones do not. A normal documentary movie, if shown in reverse, is surprising to the viewer because the arrow of time reverses, but it does not show anything that is not permitted by the fundamental laws, yet those laws supposedly describe the microscopic processes that produce all the macroscopic processes. To explain the apparent inconsistency between the observed time-asymmetry of macroscopic processes and the time-symmetry of all the significant, fundamental, microscopic processes, some philosophers of physics point to the need to discover a new fundamental law that implies an arrow of time. Others suggest the arrow might be inexplicable, a brute fact of nature. Members of the “entropy camp” claim there is an explanation, but it does not need any new fundamental law because the arrow is produced by entropy increasing plus the fact that the universe once had minimal entropy. (Entropy can be a useful numerical measure of a physical system’s randomness or disorder or closeness to equilibrium. The system can be the universe as a whole.)

Commenting upon the observed asymmetry between past and future, a leading member of the extrinsic camp said, “The arrow of time is all of the ways in which the past is different from the future.” Some of these ways are entropy increasing in the future, causes never producing effects in the past, the universe’s continuing to expand in volume, and our remembering the past but never the future. Can some of these ways be used to explain others, but not vice versa? This is called the taxonomy problem, and there are competing attempts to solve the problem.

Some philosophers even ask whether there could be distant regions of space and time having an arrow pointing in reverse compared to our arrow. If so, would adults there naturally walk backwards on their way to becoming infants while they remember the future?

Table of Contents

  1. Introduction
  2. The Intrinsic Theory
    1. Criticisms
  3. The Extrinsic Theory
    1. Criticisms
  4. The Entropy Arrow
    1. The Past Hypothesis
  5. Other Arrows
    1. The Memory Arrow
    2. The Cosmological Arrow
    3. The Causal Arrow
  6. Living with Arrow-Reversal
  7. Reversibility and Time-Reversal Symmetry
    1. Summary
    2. More Details
  8. References and Further Reading

1. Introduction

In 1927, Arthur Eddington coined the term time’s arrow when he said, “I shall use the phrase ‘time’s arrow’ to express this one-way property of time which has no analogue in space. It is a singularly interesting property from a philosophical standpoint.”

Writers use a wide variety of terms to characterize the arrow. They say, for example, that time has a distinguished direction. The philosopher Max Black described other terminology:

Instead of saying that time has a “direction,” writers will sometimes say that time is “asymmetrical” or “irreversible”, or that the passing of time is “irrevocable.” Although these alternative expressions are not exact synonyms, they are closely connected in meaning, and may well be considered together….Those who say that time has a direction will often say, by way of contrast, that space has no direction, or that space is “symmetrical.” Or they will say that events are reversible with respect to their spatial locations, but not with respect to their temporal locations (Black 1959, 54, 57).

Time is very much like a straight line of instants or moments. Given a time line of instants, any segment between two of those instants such as instants A and B has two directions, that is, two orientations: from A to B, and from B to A. Specifying time’s arrow distinguishes one of those directions from the other. Expressed in the language of mathematical physics, the antisymmetric relation of before-or-simultaneous-with provides a linear ordering of the instants. But so does the antisymmetric relation of later-than-or-simultaneous-with. Philosophers want to know what in the world or in consciousness distinguishes one of these relations from the other.
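
The point that both relations order the instants equally well can be checked mechanically. The following is a minimal sketch, assuming a toy time line of four instants labeled by integers (the labels and function names are hypothetical, introduced only for illustration). It tests each relation for the standard order-theoretic properties of a linear ordering and confirms that each relation is the converse of the other.

```python
from itertools import product

# Hypothetical labels for four instants on a toy time line.
instants = [0, 1, 2, 3]

def before_or_simultaneous(a, b):
    """True when instant a is before, or identical to, instant b."""
    return a <= b

def later_or_simultaneous(a, b):
    """True when instant a is later than, or identical to, instant b."""
    return a >= b

def is_linear_order(relation, domain):
    """Check reflexivity, antisymmetry, transitivity, and totality."""
    reflexive = all(relation(a, a) for a in domain)
    antisymmetric = all(
        a == b
        for a, b in product(domain, repeat=2)
        if relation(a, b) and relation(b, a)
    )
    transitive = all(
        relation(a, c)
        for a, b, c in product(domain, repeat=3)
        if relation(a, b) and relation(b, c)
    )
    total = all(
        relation(a, b) or relation(b, a)
        for a, b in product(domain, repeat=2)
    )
    return reflexive and antisymmetric and transitive and total

def are_converses(r1, r2, domain):
    """True when r1(a, b) holds exactly when r2(b, a) holds."""
    return all(r1(a, b) == r2(b, a) for a, b in product(domain, repeat=2))

print(is_linear_order(before_or_simultaneous, instants))
print(is_linear_order(later_or_simultaneous, instants))
print(are_converses(before_or_simultaneous, later_or_simultaneous, instants))
```

All three checks succeed, which illustrates why the formal structure of a linear ordering alone cannot say which orientation is the direction of the arrow.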

Time’s arrow is somewhat like an arrow that is shot from an archer’s bow. It has an intrinsic difference between its head and its tail, as you can tell just by looking at its shape. However, describing and explaining time’s arrow is more challenging.

Here is a long list of topics and issues to be resolved in formulating a successful philosophical theory of time’s arrow. They are all discussed below. The order on the list is not very important. Conceptual clarity is the philosophical goal here.

There should be definitions of the terms “time’s arrow” and “arrow of time” and some decision on whether the arrow is objective or subjective. Is there more than one arrow? Is time’s arrow simply the fact that the future is different from the past? Where is the arrow pointing? Does it necessarily point there, or is this a contingent fact? What might count as good grounds for saying time has an arrow, and do we have those grounds? What is the relationship between time reversal and arrow reversal, and what exactly do those terms mean? Was Hans Reichenbach correct when he said that “we cannot speak of a direction of time for the whole; only certain sections of time have directions, and these directions are not the same”? Is time’s arrow a spatially-limited, local feature or instead a larger, perhaps global feature of the universe? If the universe had started out differently, but with time existing, might time never have had the arrow it has? How do we reconcile the fact that we humans always experience one-way processes with the fact that the fundamental laws of physics allow the fundamental processes to go either way in time?

Researchers do not agree on what counts as evidence of the arrow nor on what is a good example of the presence of the arrow, so a resolution is required. There are some rare decay processes of mesons in electroweak interactions that do go only one way in time. Could these account for time’s arrow? One of the main questions is whether the arrow is intrinsic to time itself or instead only an extrinsic feature due to processes that happen to occur over time the way they do, and with time itself being totally undirected like a long line with no preference for one direction over the other. Do the past and future happen to be different, or do they have to be different by definition or for some other reason? Can it be shown that time’s arrow is natural and to be expected, or is its existence merely a primitive fact, as some have argued? Is the arrow fundamental, or can it be derived from something more fundamental? Does the existence of an arrow have ontological implications such as placing special demands on the structure of spacetime other than what is required by relativity theory? Is it clear that temporal phenomena even seem to us to have an arrow? Researchers are divided on that question, too. So, this set of issues is fertile territory for philosophers. The questions need answers or, in some cases, explanations of why they are not good questions.

Much has been said on these issues in the academic literature. This article provides only introductions and clarifications of some of them, and it does not attempt to provide definitive analyses and treatments of any open issues.

Looking back over history since the late 1800s, it is clear that our civilization is continually learning more about the philosophical issues involving the arrow of time. The academic accomplishments here provide a paradigm counterexample to the carelessly repeated quip that there is no progress in the field of philosophy. Progress in philosophy need not simply be a change from no consensus to consensus on an open issue.

Regarding the issue of whether the arrow is a local or a global feature, the philosopher Geoffrey Matthews remarked that “Much of the recent literature on the problem of the direction of time assumes…it must be solved globally, for the entire universe at once, rather than locally, for some small part or parts of the universe,” but he recommended that it be solved locally (Matthews 1979, 82). However, the cosmologist George Ellis has a different perspective. He says: “The direction of time is the cosmologically determined direction in which time flows globally” (Ellis 2013).

Max Black claimed time has an arrow if, but only if, ordinary discourse about time such as “This event happens earlier than that event” is objectively true or false and is not dependent upon who is speaking. That approach to the arrow via ordinary language retained few followers into the twenty-first century.

To say the arrow exists objectively implies that it exists independently of what we think, feel, do, or say. This is often expressed briefly as saying the arrow is mind-independent. One might consider whether there is a sense in which the arrow is mind-independent and another sense in which it is not. Or perhaps there are two arrows, an objective one and a subjective one, or a physical one and a wholly phenomenological one. It takes a conscious observer to notice time’s arrow, but is the arrow that is noticed dependent on that observer? Money is observer dependent. If there is no agreement to treat a piece of paper bearing an ex-president’s picture as money, then it is not money and is merely a piece of paper. Is time’s arrow like money in this sense or in a sense that gives the arrow some other observer-dependent existence?

On this issue, Huw Price presents an argument inspired by David Hume’s claim that necessary connections among events are projected into the external world by our minds. Price says that, when seen from the Archimedean point of view that objectively looks down upon all the past, present and future events, it becomes clear that just as the color blue is a subjective feature of external reality so also time’s arrow is merely a subjective or intersubjective projection onto an inherently time-symmetric external reality (Price 1996). Seen from this atemporal perspective, the arrow exists only in our collective imagination. Physical time is not subjective, but its arrow is, he says. Craig Callender objected to Price’s use of the word “merely” here. He claimed Price tells only half the story. In addition to the subjective projection, there is also an objective asymmetry of time because, “Thermodynamic behaviour not only shapes asymmetric agents but also provides an objective asymmetry in the world” (Callender 1998, 157). The thermodynamic behavior Callender is talking about is entropy increase. Theoretical physicists have reached no consensus regarding whether time’s arrow is an objective feature of the world, and neither have philosophers of physics.

Physical time is what clocks are designed to measure. Phenomenological time or psychological time, unlike physical time, is private time. It is also called subjective time. Compared to physical time that is shown on a clock, our psychological time can change its rate depending on whether we are bored or intensively involved. Many philosophers in the twenty-first century believe psychological time is best understood not as a kind of time but rather as awareness of physical time. Psychological time is what people usually are thinking of when they ask whether time is just a construct of the mind. But see (Freundlich 1973) for a defense of the claim that “physical time acquires meaning only through phenomenological time.”

It is surprising to many people to learn that nearly all scientists believe there does not exist a physical time that is independent of space. What has independent existence is spacetime, say the scientists; they are drawing this conclusion from Einstein’s theory of relativity. The theory of relativity implies spacetime is the set of all events, and spacetime is an amalgam of space and time, with each of these two parts being different in different reference frames. According to relativity theory, the amount of time an event lasts is relative. It is relative to someone’s choice of a reference frame or coordinate system or vantage point. How long your dinner party lasted last night is very different depending on whether it is measured by a clock on the dinner table or by a clock in a spaceship speeding by at close to the speed of light. If no reference frame has been pre-selected, then it is a violation of relativity theory to say one duration is correct and the other is incorrect. In this article, advocates of the extrinsic theory presume time exists, but only as a feature of spacetime and only relative to a reference frame. Advocates of the intrinsic theory may or may not presume this.
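
The frame-dependence of the dinner party’s duration can be made concrete with the standard time-dilation formula of special relativity, in which a duration measured by a clock at rest with the event is multiplied by the Lorentz factor 1/√(1 − v²/c²) to obtain the duration in a frame moving at constant speed v. The following is a minimal sketch, assuming inertial frames, a party lasting three hours by the table clock, and a spaceship moving at 0.8 of the speed of light. These numbers and the function names are illustrative assumptions, not values from the discussion above.

```python
import math

C = 299_792_458.0  # speed of light in metres per second (exact SI value)

def lorentz_factor(speed_m_per_s: float) -> float:
    """Return 1 / sqrt(1 - v^2 / c^2) for a speed v with 0 <= v < c."""
    if not 0 <= speed_m_per_s < C:
        raise ValueError("speed must satisfy 0 <= v < c")
    beta = speed_m_per_s / C
    return 1.0 / math.sqrt(1.0 - beta * beta)

def duration_in_moving_frame(proper_duration: float, speed_m_per_s: float) -> float:
    """Duration of an event in a frame moving at the given speed,
    given the duration measured by a clock at rest with the event."""
    return lorentz_factor(speed_m_per_s) * proper_duration

# Assumed illustrative values: a three-hour party and a ship at 0.8c.
party_hours_table_clock = 3.0
ship_speed = 0.8 * C

party_hours_ship_frame = duration_in_moving_frame(party_hours_table_clock, ship_speed)
print(f"Table clock: {party_hours_table_clock:.2f} hours")
print(f"Ship frame:  {party_hours_ship_frame:.2f} hours")
```

Under these assumptions the party lasts three hours in the table’s frame and about five hours in the ship’s frame, and relativity theory gives no ground for calling either figure the uniquely correct one.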

One minor point concerns the word “state,” which is used below. The word “state” usually means a system’s state at one time. For a physical system with fundamental parts, the state of the system at a time is also called its “configuration.” If it were to turn out that time emerges from fundamental timeless entities, then the concept of a state would need to be re-defined to be useful for the most fundamental level.

2. The Intrinsic Theory

Philosophers ask whether there is an arrow of time or in time—that is, whether (i) there is an arrow of time itself in the sense of its being part of time’s intrinsic structure, or instead (e) there is only an arrow in time that is extrinsic to time itself and that is due to time’s contents, specifically to physical processes that evolve one way over time. The intrinsic theory is committed to claim (i), and the extrinsic theory is committed to claim (e). The intrinsic theory is interested in the asymmetry of time, of time itself, over and above the asymmetry of processes in time. This difference in the two philosophical theories is sometimes expressed as their differing on whether time’s arrow is due to time’s inherent form or to time’s content.

Defenders of both the intrinsic theory and the extrinsic theory agree that the arrow of time reveals itself in the wide variety of one-way processes such as people growing older and never younger, punctured balloons bursting and never un-bursting, and candles burning but never un-burning. But only those persons promoting the extrinsic theory suggest that time’s arrow is identical to, or produced by, the set of these one-way processes.

The image of time proposed by the class of intrinsic theories is closer to the commonsense understanding of time, to what Wilfrid Sellars called the “manifest image,” than it is to the scientific image of time, which is time as seen through the lens of contemporary science.

All who say the arrow is intrinsic to time believe that strong evidence for the arrow is found in personal experience, usually both internal and external experience. What motivates most persons in the intrinsic camp is that having an intrinsic arrow seems to be the best explanation for the everyday flux of their own temporal experiences plus the similar experiences of others. Some imagine themselves advancing through time; others, that they stand still while time flows like a river past them. Some speak of more events becoming real. The philosopher Jenann Ismael mentions “the felt whoosh of experience.” It is not simply that we all occasionally experience time as directed but that this is relentless. Time stops for no one, says the proverb.

Robin Le Poidevin points out that the experience of the arrow is both internal and external. He says, “we are not only aware of [time’s passage] when we reflect on our memories of what has happened. We just see time passing in front of us, in the movement of a second hand around a clock, or the falling of sand through an hourglass, or indeed any motion or change at all” (Le Poidevin 2007, 76).

Assuming those in the intrinsic camp are correct in claiming that there is strong evidence for the arrow being found in personal experience, the following question is relevant: “Did time’s arrow exist before consciousness evolved on Earth?”

Most twenty-first century experts in the intrinsic camp find the arrow not just within experience but also in what is experienced; it is found in what is experienced in the sense that, even though it may require conscious creatures to detect the arrow, nevertheless what is detected is the kind of thing that was a feature of nature long before beings with a conscious mind evolved on Earth to detect it, and it deserves a central place in any temporal metaphysics. To briefly summarize this extended bit of argumentation, it seems to us that the external world has an arrow of time because it does, and it does because we all experience it. All members of the intrinsic camp say time has an intrinsic arrow and space does not.

When it is said the arrow is a feature of time’s intrinsic structure, what is meant by the term “structure”? That term is about overall form rather than specific content. The structure is not a feature of a single three-dimensional object, nor is it detectable in an experience that lasts for only an instant.

When it is said the arrow is intrinsic to time or is a feature of time’s intrinsic structure, what is meant by the term “intrinsic”? Intrinsic is like internal or inherent. An intrinsic property can apply to just one thing, like a quality; an extrinsic property is a relational property. An object’s mass is intrinsic to it, but its weight is not. Here is another example from the metaphysician Theodore Sider. He reminds us that a person having long hair is very different from her having a long-haired brother. Having long hair is an intrinsic property she has without involving anyone else. But having a long-haired brother is not intrinsic to her. It is an extrinsic or relational property that she has because of her relationship to someone else, namely her brother. Notice that her having the intrinsic property of long hair does not imply that long hair is essential to her. To return to the first example, a potato’s mass is intrinsic to it, but its weight is extrinsic and depends, for example, upon whether it is on the moon or on Earth. The metaphysicians David Lewis and Bernard Katz offer a deeper discussion of the difference between intrinsic and extrinsic in (Lewis 1983) and (Katz 1983).
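
The contrast between mass and weight can be illustrated with the elementary relation that weight equals mass times the local gravitational acceleration. The following is a minimal sketch, assuming a potato of 0.2 kilograms and approximate surface gravities for Earth and the moon. The potato’s mass, the constant names, and the function name are illustrative assumptions.

```python
# Approximate surface gravitational accelerations in metres per second squared.
G_EARTH = 9.81  # approximate value for Earth
G_MOON = 1.62   # approximate value for the moon

def weight_newtons(mass_kg: float, local_gravity: float) -> float:
    """Weight is relational: it depends on the object's mass and on
    the gravitational acceleration of wherever the object happens to be."""
    return mass_kg * local_gravity

potato_mass_kg = 0.2  # assumed mass; the same wherever the potato is

print(f"Mass on Earth and on the moon: {potato_mass_kg} kg")
print(f"Weight on Earth:    {weight_newtons(potato_mass_kg, G_EARTH):.3f} N")
print(f"Weight on the moon: {weight_newtons(potato_mass_kg, G_MOON):.3f} N")
```

The mass appears as a single value belonging to the potato alone, whereas the weight cannot be computed without a second input describing the potato’s surroundings, which is the mark of a relational property.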

Although the concept of being intrinsic is not the same as the concept of being essential, many researchers in the intrinsic camp believe the arrow is essential to time in the sense that time necessarily has an arrow, that it is not a contingent feature, and that there would not be real time without it. Thus, arrow reversal would imply time reversal, and vice versa. Those in the extrinsic camp are much less apt to hold this position, and they are content to say that the arrow is a contingent feature, but that time would be very strange without it. They also have a different definition of time reversal from that of the intrinsic camp.

Within the intrinsic camp there are different explanations of why time is intrinsically directed. The largest sub-camp is the dynamic camp that promotes a dynamic theory of time’s arrow. Members are commonly called “temporal dynamists.”

Figure: An Euler diagram of philosophical theories of time’s arrow that are discussed below

Each point in the Euler two-circle diagram represents a specific theory of the arrow of time.

Let’s consider first the dynamic sub-camp of the intrinsic theorists. Its members frequently say time has an internal structure that is robustly “dynamic” or “transitory” or “active.” What does this mean? In answering, some philosophers offer a very picturesque style of description and appeal to the idea of a “river of time” that is now gushing out of nothingness. Less picturesquely, most answer by saying time “passes” or “flows” or “has a flux” or “lapses” or “runs” or has a “moving present” or has a feature called “becoming.” All these terms are intended to help describe time’s intrinsic arrow.

Let’s attempt to be clearer. Most advocates of time’s passing claim that passage or becoming is part of the physical world. What part is that?

Here is one clarification of time’s passing:

We take temporal passage to consist in (a) there being a fact of the matter regarding which entities are objectively present, and (b) there being changes in which [of the] entities are objectively present. Presentism, the growing block theory, the dropping branches theory and the moving spotlight theory are all theories according to which time passes (Miller and Norton 2021, 21).

If you believe in becoming or time’s passing, then you must believe that in some sense the future is not as real as the present or past.

But how does a researcher go about showing that a dynamic theory is true or false? George Schlesinger made an interesting remark about this. Think of what he calls the “transient theory of time” as what this article calls the “dynamic theory.”

There is no doubt that the transient theory of time is consistent and intelligible. Is it true? I do not believe that this is the kind of question to which a final conclusive answer is possible. As with all genuinely metaphysical theories, what we may reasonably expect is further clarification concerning its precise presuppositions and implications and an increasingly more detailed list of its advantages and disadvantages. (Schlesinger 1985, 92).

The various, prominent dynamic theories of time are presented in (Zimmerman 2005) and (Dainton 2020). The present article introduces a sample of them.

In 1927, C.D. Broad said becoming is the transitory aspect of time that must be added to the “mere” ordering of events via McTaggart’s B-series relations. The B-series is too static to capture the dynamic character of time, Broad claimed, because if one event happens before another then it always does, and this fact never changes. Yet facts do change. It was once a fact that dinosaurs exist. The B-series is a static representation. Broad said absolute becoming “seems to me to be the rock-bottom peculiarity of time, distinguishing temporal sequence from all other instances of one-dimensional order, such as that of points on a line, numbers in order of magnitude, and so on.” He believed the A-theory was needed to capture this objective, non-phenomenological sense of becoming. Arthur Eddington said, “we have direct insight into ‘becoming’ which sweeps aside all symbolic knowledge as being on an inferior plane.”

Broad’s moving spotlight theory is suggested with the following metaphor from the Spanish philosopher George Santayana: “The essence of nowness runs like fire along the fuse of time.” The theory treats the dimension of time much like a single dimension of space with all the past, present, and future events arranged in temporal order along the time dimension (the fuse of time), and with simultaneous events having the same location. Promoting the moving spotlight theory as a way of clarifying what he meant by becoming, C. D. Broad said, “What is illuminated is the present, what has been illuminated is the past, and what has not yet been illuminated is the future” (Broad 1923, 59). The theory assumes eternalism in the sense that all past, present, and future events exist (in the tenseless sense of the term), but it has the extra assumptions that the present time is a metaphysically privileged time and that A-theory sentences about the past such as “The year 1923 is no longer our present year” are metaphysically basic, unlike the B-theory which might analyze that sentence as “The year 1923 happens before the year in which you are reading this sentence.” Most versions of the moving spotlight theory imply being present is primitive in the sense of being an unanalyzable feature of reality. For its advocates Timothy Williamson and Quentin Smith, the idea that the present is metaphysically privileged implies that future events are not spatial; they acquire the property of being spatial, or in space, as they become highlighted by the spotlight, and then they shed this property and become non-spatial past events. For an examination of the spotlight theory, see (Zimmerman 2005) and (Miller 2019).

Some dynamists, such as C. D. Broad in his 1923 book Scientific Thought, have explained time’s passage as reality’s growing by the continual accretion of new moments or new facts or the creation of more states of affairs. This is not an eternalist theory nor a presentist theory. It employs a growing-block model of events in Einstein’s sense of the technical term “block,” but without the actual future events that exist in the traditional block universe. That is, the growing-block consists of all real present and past point-events; and the latest moment that exists in the block is metaphysically privileged and called “the present” and “now.” All longer-duration events are composed of point-events. In classical physics, the block can have a four-dimensional Cartesian coordinate system of three spatial dimensions and one time dimension. The block grows in volume over time as more moments and events occur and more facts about the present and past are created. The direction of the arrow of time is the direction that the block grows.

Enamored of the idea of reality’s growing by the accretion of facts, Michael Tooley argued that “the world is dynamic, and dynamic in a certain way: it is a world where tenseless states of affairs come into existence, but never drop out of existence, and therefore a world where the past and the present are real, but the future is not.” A tenseless state of affairs is a fact in which any use of tense in its description is not semantically or ontologically significant. When someone says, “one plus two is three,” the word “is” occurs in the present tense, but we know to ignore this and treat it as insignificant and assume the speaker’s statement is not only about the present. See (Dainton 2020) to learn more about tenseless vs. tensed characteristics.

An important feature of Tooley’s and others’ growing-block theories is that they make what is real depend on what time it is. Adherents to the theory say one of its virtues is its promotion of the unreality of the future because this naturally allows the future to be “open” or indeterminate, unlike the past and present, which are “closed” and determinate and so cannot change. This openness is part of the manifest image of time that almost all persons hold, but this sometimes means different things to different people. It might mean the future is non-existent, or that it is not straightforwardly knowable as is the past, or that we human beings are able to shape the future but not the past. Some researchers say this openness shows itself semantically by the fact that a contingent proposition about the future such as “There will be a sea battle tomorrow” is neither true nor false presently. The eternalist is more apt to say the proposition has a truth value and is eternally true (that is, true at all times) or else is eternally false, but we just do not know which one it is.

See (Miller 2013) for a comparison of the growing-block ontology of time with its competitors: presentism and eternalism. For some subsequent research on the growing block, see (Grandjean 2022).

Let us turn to intrinsic theories that are promoted by those outside of the large, dynamic camp. One theory implies time has an intrinsic arrow because time is intrinsically anisotropic (that is, one direction is intrinsically privileged over the other), and sufficient evidence of time’s being intrinsically anisotropic would be the existence of time-anisotropic processes obeying time-anisotropic laws of nature. (The terms time-anisotropic and time-asymmetric and temporally directed have different senses, but they denote the same thing.)

Ferrel Christensen’s argument in favor of time’s being intrinsically asymmetric appealed to the evidence of there being so many kinds of one-way processes in nature, and the simplest explanation of this, he suggested, is that time itself is intrinsically asymmetric.

The bare fact that our experience of time is mediated by processes that take place in time doesn’t argue that any or all of the structural features of the latter aren’t also possessed by temporality in its own right. …Is it not plausible to suggest that a single asymmetry is responsible for them all, namely that of time itself? For reasons having to do with economy, the ability of a single feature to theoretically explain a diversity of phenomena is in general regarded in science as good evidence for the reality of that feature; it has the effect of unifying and organizing our picture of the world. Surely there must be some common reason, one is tempted to argue, for the evidence of the various asymmetries in time—what else might it be if not the asymmetry of time itself? (Christensen 1987, 238 and 243).

Tim Maudlin is a member of the intrinsic camp who also does not promote a dynamic theory of time’s arrow. See the letter “m” for Maudlin’s theory in the diagram above. He accepts the block universe theory in the sense that the past, present, and future are equally real, but he also accepts the passage of time and does not characterize the block as static. Maudlin’s reasoning here is that the word “static” refers to objects that persist through time and never change. The block universe does not persist through time. Instead, time exists within it.

Maudlin says that time’s passage is an intrinsic or inherent asymmetry in the structure of spacetime itself and that time passes objectively and independently of the material contents of spacetime and their processes. That is how he interprets “becoming.” Maudlin argues that the direction of time is primitive and so cannot be deduced or otherwise explained more deeply. He believes that “except in a metaphorical sense, time does not move or flow,” but it does pass. Maudlin adds that “the passing of time…is the foundation of our asymmetrical treatment of the initial and final states of the universe” (Maudlin 2007, 142). Stephen Savitt, Dennis Dieks, and Mauro Dorato also have claimed that time passes in the block universe. Many philosophers (for example, Dorato 2006) would agree that if becoming is to be an objective feature of nature and not merely a subjective feature of human experience, then necessarily there is some sort of ontological asymmetry between past and future events.

However, if passage were merely having a one-dimensional asymmetric continuum structure, then the real numbers would pass from less to greater numbers, which would be an odd use of the word “pass.” Related to this point, Maudlin said, “The passage of time connotes more than just an intrinsic asymmetry; not just any asymmetry would produce passing … the passage of time underwrites claims about one state ‘coming out of’ or ‘being produced from’ another, while a generic spatial (or temporal) asymmetry would not underwrite such locutions….” Maudlin’s notion of production means that states at later times exist in virtue of states at earlier times.

Christian Loew explains Maudlin’s position this way:

(E)arlier states are metaphysically more fundamental than the later states that exist in virtue of them…. The laws of nature, given the state of the universe at some earlier time, constrain its state at all later times (either deterministically or by specifying probabilities). But it is only because of the intrinsic direction of time that earlier states are metaphysically more fundamental than later states and that later states exist in virtue of them. Think of the laws like the bed of a river and think of the direction of time like water pressure. The laws of nature chart out how the world must evolve, but the intrinsic direction of time is the driving force that gets this evolution going….Maudlin’s notion of production captures this generation of all other states from the initial state and the laws of nature: these other states are produced from the laws of nature operating on the initial state. By contrast, if time has no direction, then all states in time are equally fundamental and there is no explanation in terms of production (Loew 2018, 489, 490).

Maudlin believes he has pinpointed why so many physicists do not agree with him that it is a fundamental, irreducible fact of nature that time is an intrinsically directed object. Overly influenced by Einstein’s theory of relativity, these physicists treat time as if it were a space dimension, then note that no space dimension has an arrow, so they conclude the time dimension has no arrow. Einstein himself never made such an argument. According to Maudlin:

I think the reason it’s hard for physicists to see the direction of time is they use the piece of mathematics that was developed to analyze space, and there is no direction in space. So, you have this mathematical theory built for something with no direction in it, you then try to analyze spacetime with it, and you say “Gosh, I don’t see a direction anymore; it must be an illusion.” …It’s not an illusion.

a. Criticisms

A variety of criticisms of the intrinsic theory have been offered. For example, those in the extrinsic camp often say there is an over-emphasis on the phenomenology of temporal awareness. In reply, those in the intrinsic camp often accuse those in the extrinsic camp of scientism due to their irresponsible rejection of our valid temporal intuitions.

One argument given for why the intrinsic theory improperly describes time is that relativity theory treats time as the fourth dimension, a one-dimensional subspace of spacetime, and we know space has no direction. To elaborate on this criticism, notice that people on Earth sometimes believe space has a direction toward “down,” but that is a mistake. Space seems to have a down arrow only because we happen to live on the surface of a very massive object which pulls objects toward its center, but if we lived out in space away from any very massive objects, we would not be inclined to assign a direction to space. Similarly, people commonly suppose time has a direction because they experience so many one-way processes, but they experience these only because their past is associated with an extremely low entropy state, namely, the big bang. If this peculiar event were not in their past, then they would appreciate that time could just as well have had no arrow or had a reversed arrow. Once they free their minds from the influence of the presence of Earth and the influence of a low-entropy big bang in their past, they could see more clearly that neither space nor time has an intrinsic arrow. Advocates of the intrinsic theory usually respond to this criticism by saying it makes too much of the weak analogy between time and space.

One other broad criticism claims the intrinsic theory is not coherent. Huw Price said, “I am not convinced that it is possible to make sense of the possibility that time itself might have a certain direction” (Price 2002, 87). Dynamists typically, but not universally, promote an A-theory of time in which the concept of time’s passage depends upon the coherence of the idea that pastness and presentness are intrinsic properties of events, properties that are gained and lost over time. This dependence is illustrated by the fact that, according to the A-theory, the birthday party occurred last week because the party event has a week’s degree of pastness, a degree that will keep increasing. Critics argue that these technical A-concepts are superficially coherent but ultimately not coherent and that B-concepts are sufficient for the task.

Nathan Oaklander criticized the moving-now theory or spotlight theory because it seems to be committed to the claim that the same NOW exists at every time, but he doubted anything sensible can be made of the NOW being the “same” (1985). For more discussion of this criticism, see (Prosser 2016).

Other critics of the intrinsic theory say the problem is not that the theory is inconsistent or nonsense but that it is obscure and unexplanatory.

Still others complain about subjectivity. They say the advocates of the intrinsic theory and its passage of time are relying ultimately on McTaggart’s A-series, but “A-series change and the passage of time are mind dependent in the sense of being merely matters of psychological projection,” unlike the B-theory with its arrow in time (Bardon 2013, 102).

Cognitive scientists and biochemists are naturally interested in learning more about the bodily mechanisms, including the mental mechanisms, that allow people to detect time and also to detect time’s arrow. However, say the critics, they should attend more to the difference between the two. We see sand fall in the hourglass. Our detection of the change is correctly said to be evidence for us that time exists, but is it also evidence that the arrow of time exists, as Le Poidevin believes? No, say some critics from the extrinsic camp. What would be evidence would be noticing that the sand never falls up.

A motivation for adopting the dynamic theory is that it seems to its advocates that they directly experience the dynamic character of time. “It would be futile to try to deny these experiences,” said D.C. Williams, who believed it seems to all of us that time passes. George Schlesinger said, “Practically all agree that the passage of time intuitively appears to be one of the most central features of reality.” Claims about their phenomenology of time do clearly motivate those experts in the intrinsic camp to say time passes, but it does not seem to be a strong motivation for the average person who is not an expert on issues of time’s arrow. Kristie Miller and her associates “found that, on average, participants only weakly agreed that it seems as though time passes, suggesting that most people do not unambiguously have a phenomenology as of time passing.” Her results in experimental philosophy suggest “that ~70% of people represent actual time as dynamical and ~30% represent it as non-dynamical” (Miller 2020).

Some critics question the effectiveness of the argument that, if time intuitively seems to be dynamic, then it is. There are several very different ways this criticism is made. Many critics say that, even though much of our experience is not an illusion, our experience of the passage of time is merely an illusion—something we experience that should be explained away:

A moving spotlight theorist might…argue: his theory is superior because it is only in his theory that things are as they seem. But this is not a good argument. A B-theorist might have an excellent story to tell about why things are not as they seem. If he does, then it should not count against his theory that it says we are subject to an illusion (Skow 2011, 361).

In this spirit, other critics of dynamical time say that it seems to most of us that rocks are perfectly solid and contain no empty space; nevertheless, science rightly tells us we are mistaken. These critics say the intrinsic camp’s supposed intrinsic asymmetry of time that so many seem to find in their own experience and the experience of others is only a product of people, including certain philosophers of physics, overly relying on their intuitions and uncritical impressions, while misinterpreting their temporal experiences and being insufficiently sensitive to science. Their arguments do not take into account that science is properly in the business of precisification of concepts and of promoting concepts that are maximally useful for understanding nature. For more detailed criticism along these lines, see (Callender 2017).

Dennis Dieks argued that, “on closer inspection it appears that the scientific B-theory may explain our intuition better than the A-theory, even though the latter at first sight seems to completely mirror our direct experience…. There is becoming and change in this picture in the following sense: events occur after each other in time, displaying different qualities at different instants” (Dieks 2012, 103 and 111). Opponents of a dynamic sense of becoming often say that becoming is real only in the sense that an event comes into being out of others in its local past. So, the B-theorist need not deny temporal passage provided it is not the robust passage promoted by the A-theorist.

The intrinsic theory is commonly criticized for its language use, for its violation of what philosophers of language call “logical grammar.” For example, pointing out how those in the intrinsic camp use the word becoming, J.J.C. Smart said:

Events happen, things become, and things do not just become, they become something or other. “Become” is a transitive verb; if we start using it intransitively, we can expect nothing but trouble. This is part of what is wrong with Whitehead’s metaphysics; see, for example, Process and Reality, p. 111, where he says that actual occasions “become.” (Smart 1949, 486).

C.D. Broad does not make this mistake in his use of language, says Smart, but he makes another mistake with language: his use of the transitive phrase “become existent” is misleading philosophically. Emphasizing Broad’s faulty use of language, Smart declares:

With what sorts of words can we use the expressions “to change” and “to become”? …I think that if certain philosophers, notably Whitehead and McTaggart, had asked themselves this question…they would have saved themselves from much gratuitous metaphysics… (Smart 1949, 486).

One prominent complaint made against the growing-block theory (GBT) is that it cannot give us a good reason to believe that we are now in the objective present and Napoleon is not. Bourne says, “We are in no better epistemic position than thinking subjects located in the objective past who are wrongly believing that they are located in the objective present, since ‘[…] we would have all the same beliefs […] even if we were past’” (Bourne 2002, 362). Vincent Grandjean adds, “the epistemic objection does not merely concern GBT, but is equally applicable to every A-theory of time that distinguishes between the notions of existing at the present time and just existing. For example, the epistemic objection is equally applicable to the moving spotlight theorist.”

Another frequent complaint made against the growing-block theory is that it is not compatible with the theory of relativity because it presumes absolute simultaneity rather than simultaneity relative to a conventionally chosen reference frame.

For consideration of the variety of these and other philosophical objections that are made to the tensed account of the dynamic, growing block, see chapter 10 of (Tooley 1997). See (Dorato 2006) for an argument that the unreality of the future, a proclaimed virtue of Tooley’s theory, is not a necessary condition for temporal passage. See (Earman 2008) regarding the prospects for revisions of a growing-block model of the universe.

The role the fields of psychology and cognitive science can and should play in understanding time’s arrow is an interesting issue. The human mind and body have some clock-like features, but clearly there is no single neuron that tracks time’s arrow. There may be some mental and neuronal structures to be found that do track time’s arrow, but there is no consensus that these have been found. Presumably there would be multiple mental procedures involved, and the neuronal structures would be both complex and distributed around the brain. But researchers cannot simply presume that what accounts for our temporal phenomenology is, among other things, time’s arrow. It will not be if there are pervasive phenomenal illusions regarding the arrow. Perhaps the mechanisms that account for our phenomenology that is purported to be about time’s arrow do not actually track the arrow. So, there is much useful, future research ahead. For more about these issues, see (Braddon-Mitchell and Miller 2017).

There is a subtle sub-issue here about how things seem. A distinction can be made between phenomenal error and cognitive error:

Temporal non-dynamists hold that there is no temporal passage, but [they] concede that many of us judge that it seems as though time passes. Phenomenal Illusionists suppose that things do seem this way, even though things are not this way. They attempt to explain how it is that we are subject to a pervasive phenomenal illusion. More recently, Cognitive Error Theorists have argued that our experiences do not seem that way; rather, we are subject to an error that leads us mistakenly to believe that our experiences seem that way. …We aim to show that Cognitive Error Theory is a plausible competitor to Phenomenal Illusion Theory (Miller et al. 2020).

Adolf Grünbaum complained that the main weakness of dynamic theories is that passage and becoming and the arrow have no appropriate place in the fundamental laws. He probably would have found support in Jill North’s remark that, “There is no more structure to the world than what the fundamental laws indicate there is.” Some members of the dynamic camp reacted to Grünbaum’s complaint by saying the intrinsic theory does not need a law of physics to recognize the arrow: “[T]here is in the world an asymmetric relation holding among events, the temporal priority relation, and…we can know when this relation holds or fails to hold, at least sometimes, without relying upon any features of the lawlike nature of the world in time” (Sklar 1974, 407-410).

Other members of the dynamic camp reacted very differently to Grünbaum’s complaint by saying the fundamental laws do need to be revised in order to recognize some extra structure that reveals time’s intrinsic arrow. This suggestion has faced fierce resistance. Frank Wilczek, a Nobel Laureate in physics, objected to any revision like this. Coining the term “Core theory” for the theories of relativity and quantum mechanics (including quantum field theory and the standard model of particle physics), which are our two currently accepted fundamental theories of physics, Wilczek declared:

The Core has such a proven record of success over an enormous range of applications that I can’t imagine people will ever want to junk it. I’ll go further: I think the Core provides a complete foundation for biology, chemistry, and stellar astrophysics that will never require modification. (Well, “never” is a long time. Let’s say for a few billion years.)

Some members of the intrinsic camp might say, “We are not junking it, just supplementing it so it can be used to explain even more.”

Many critics of dynamic theories of time’s arrow speak approvingly of the 1951 article, “The Myth of Passage,” in which Harvard University metaphysician D.C. Williams argued that the passage of time is a myth, and that time does not really move or flow or pass or have any inherent dynamic character whatsoever. According to Williams, all proponents of a dynamic theory of time believe that:

Over and above the sheer spread of events, with their several qualities, along the time axis, …there is something extra, something active and dynamic, which is often and perhaps best described as “passage.” This something extra I think is a myth…one which is fundamentally false, deceiving us about the facts, and blocking our understanding of them. The literature of “passage” is immense, but it is naturally not very exact and lucid, and we cannot be sure of distinguishing in it between mere harmless metaphorical phenomenology and the special metaphysical declaration which I criticize. But “passage” it would seem, is a character supposed to inhabit and glorify the present, “the passing present,” “the moving present,” the “travelling now….” It is James’s “passing moment.” It is what Broad calls “the transitory aspect” of time…. It is Bergson’s living felt duration. It is Heidegger’s Zeitlichkeit. It is Tillich’s “moment that is creation and fact….” It is the dynamic essence which Professor Ushenko believes that Einstein omits from the world. It is the mainspring of McTaggart’s “A-series” which puts movement in time, and it is Broad’s pure becoming.

The dynamic theories lead to other troubles, says J.J.C. Smart. For instance, when we critically examine the metaphor of time’s passage and ask about the rate of flow of time:

We are postulating a second time-scale with respect to which the flow of events along the first time-dimension is measured…the speed of flow of the second stream is a rate of change with respect to a third time-dimension, and so we can go on indefinitely postulating fresh streams…. Sooner or later we shall have to stop thinking of time as a stream….

With respect to motion in space it is always possible to ask “how fast is it?” …Contrast the pseudo-question “how fast am I advancing through time?” or “How fast did time flow yesterday?” …We do not even know the sort of units in which our answer should be expressed. “I am advancing through time at how many seconds per ___?” we might begin, and then we should have to stop. What could possibly fill the blank? Not “seconds” surely. In that case the most we could hope for would be the not very illuminating remark that there is just one second in every second. (Smart 1949, 485).

D.C. Williams agreed with Smart, and he added:

Bergson, Broad, and some of the followers of Whitehead have tried to soften the paradoxes of passage by supposing that the present does not move across the total time level, but that it is the very fountain where the river of time gushes out of nothingness (or out of the power of God). The past, then, having swum into being and floated away, is eternally real, but the future has no existence at all. This may be a more appealing figure, but logically it involves the same anomalies of meta-happening and meta-time which we observed in the other version.

Huw Price has complained that, “A rate of second per second is not a rate at all in physical terms. It is a dimensionless quantity, rather than a rate of any sort. (We might just as well say that the ratio of the circumference of a circle to its diameter flows at π seconds per second!).”

Tim Maudlin (who advocates an intrinsic theory but not a dynamic theory) and others have bitten the bullet and argued that time actually does pass at the rate of one second per second. He claimed the belief that the seconds cancel out is a mistake. Critics of the intrinsic arrow ask: If the rate of one second per second does make sense, then so does a rate of two seconds per second, and what would that be like? It would be absurd. George Schlesinger claimed the rate of two seconds per second does make sense (Schlesinger 1985). See (Skow 2012) and (Miller and Norton 2021) for more discussion of differential passage and of time’s rate of passage with and without a hypertime against which time’s rate is compared.

An additional criticism of the dynamic camp’s position made by members of the extrinsic camp is that grounding time’s arrow on new nows being produced is a mistake because the concept of now, in the sense of a present for all of us, is inconsistent with scientific fact. According to the theory of relativity, if the reference frames of two observers, in which each observer is stationary in their own frame, are moving relative to each other, they must disagree on which events are happening now, with their having more disagreement the farther away that the events occur and the greater the relative speed between the two frames, so the concept of “now” cannot be objective. It is relative to a person’s favored frame of reference. The proper way to understand the word “now,” say most of these critics, is as an indexical that changes its reference from person to person and from time to time, as does the word “here;” and just as the changing reference of “here” indicates no arrow of space, neither does the changing reference of “now” indicate an arrow of time. For a defense of the moving spotlight theory against this criticism, see (Skow 2009).
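The frame dependence of “now” described above can be made quantitative with the standard Lorentz transformation of special relativity. As a sketch, suppose two events are simultaneous in one observer’s frame and are separated by a distance Δx along the direction of relative motion. The symbols v (relative speed of the two frames), c (speed of light), and γ (the Lorentz factor) are the usual textbook ones and are introduced here only for illustration:

```latex
\Delta t' = \gamma\left(\Delta t - \frac{v\,\Delta x}{c^{2}}\right),
\qquad
\gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}
\quad\Longrightarrow\quad
\Delta t' = -\,\frac{\gamma\, v\,\Delta x}{c^{2}}
\quad\text{when } \Delta t = 0 .
```

So the second observer assigns the two events a nonzero time difference whose magnitude grows with both the separation Δx and the relative speed v, which is the pattern of disagreement the critics appeal to.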

For many members of the intrinsic camp, to explain time’s arrow is to explain the intrinsic difference between the past and the future. In this regard, some say there is something irrevocable or closed about past events that distinguishes them from future events. This is a deep and metaphysically significant fact. In response, Williams said in 1951, “As for the irrevocability of past time, it seems to be no more than the trivial fact that the particular events of 1902, let us say, cannot also be the events of 1952.”

Tim Maudlin has promoted an intrinsic theory but not a dynamic theory. Instead, he claimed it is a fundamental, irreducible fact that time is a directed object. The key metaphor for Maudlin is that of production: the present “produces” the future. Christian Loew highlights what he believes is a problem for Maudlin’s attempt to explain the thermodynamic arrow of entropy change in terms of production:

It is unclear what role an intrinsic direction of time that underwrites production could play in explaining the thermodynamic asymmetry. Nothing about production seems to rule out that low entropy macrostates have been produced from earlier states of higher entropy…. The time asymmetry of production, therefore, cannot explain the thermodynamic asymmetry by itself. To account for why entropy increase is typical toward the future but not toward the past, Maudlin’s account needs to be supplemented with restrictions on the boundary conditions. Maudlin seems to acknowledge this need…when he emphasizes that production guarantees that microstates in our universe have an atypical, low-entropy past only “given how it [i.e., production] started.” This appeal to how production started seems to presuppose the special initial boundary condition of the actual universe. But bringing in the boundary conditions in this way threatens to make production superfluous in an explanation of the thermodynamic asymmetry. (Loew 2018, 487).

Some other critics of Maudlin’s position on time’s arrow claim that, when Maudlin says, “change and flow and motion all presuppose the passage of time,” he should have said instead that they all presuppose the existence of time, not its passage. Maudlin was once asked “What does it mean for time to pass? Is that synonymous with ‘time has a direction,’ or is there something in addition?” Maudlin responded: “There’s something in addition. ‘For time to pass’ means for events to be linearly ordered by earlier and later.” Maudlin’s opponent in the extrinsic camp can be expected to say, “Wait! That linear ordering is just what I mean by time existing.”

Focusing on undermining objections to his position that time passes, Maudlin said:

There are three sorts of objections to the passage of time, which we may group as logical, scientific, and epistemological. Logical objections contend that there is something incoherent about the idea of the passage of time per se. Scientific objections claim that the notion of the passage of time is incompatible with current scientific theory, and so would demand a radical revision of the account of temporal structure provided by physics itself. Epistemological objections contend that even if there were such a thing as the passage of time, we could not know that there was, or in which direction time passes (Maudlin 2002, 260).

Maudlin proceeded from there to argue that there are adequate responses to all three kinds of objections. He praised Huw Price’s book Time’s Arrow & Archimedes’ Point: New Directions for the Physics of Time for carefully presenting these objections and the responses.

3. The Extrinsic Theory

What would you think if some morning you noticed a mess of several broken eggs on your kitchen floor and then saw the mess apparently assemble spontaneously into unbroken eggs that rose up and landed on the nearby tabletop? You would think something is wrong here. Change does not occur that way naturally. Perhaps someone is secretly intervening to play a trick on you. If you could wait patiently for trillions and trillions of years, it is overwhelmingly probable you still would never witness eggs naturally behaving that way, yet those strange reverse-processes do not violate the fundamental laws, namely the equations of the fundamental theories of physics.

What if you could take a God’s eye view of the universe, and some morning you noticed that every process played out in a reverse direction to what you have learned to expect? You probably would conclude that time’s arrow had reversed. Appreciating this interpretation of the scenario provides a motivation for adopting the extrinsic theory of time’s arrow, which implies that the arrow is due only to processes regularly showing one-way behavior spontaneously, and it is not due to some inherent structure within time as those in the intrinsic camp believe. It is a real pattern in time’s content in Daniel Dennett’s sense of the term “real pattern.” The extrinsic theory is more popular among physicists than among philosophers.

The extrinsic theory is committed to three claims: (1) time’s arrow is extrinsic to time itself; (2) time’s arrow is identical to, or produced by, the presence of physical processes that are never observed to go the other way spontaneously even if the laws allow them to go that way; and (3) if anything depends on our choice of reference frame, or our choice of coordinate system, it is thereby not an objective feature of the world. It is not independently or intrinsically “real.” Regarding point (3), time’s arrow does not have this frame dependence, which is why there can be a frame-free master arrow of time but not a frame-free master clock.

What it means for a process to go spontaneously is that no one intervenes and uses outside energy to manipulate the process. The assumption is that there is no violation of the system’s boundary being closed and isolated. For example, ice cubes spontaneously melt in a warm room, but the puddle of water can be forced to turn into an ice cube if someone intervenes and collects all the water and places it into a refrigerator’s freezer, then takes the new cube out of the refrigerator and places it in the room where it was.

Those in the extrinsic camp consider the following to be the major feature of nature they need to explain: What emerges at a higher scale has an arrow, but what it emerges from at the lowest scale does not.

Those in the intrinsic camp disagree with clause (2) above and say the one-way physical processes illustrate time’s arrow or indicate it or exemplify it, but they do not produce it. The intrinsic camp and extrinsic camp also give different answers to the question, “What is the relationship between time’s directedness and time’s arrow?” Those in the intrinsic camp are very likely to say they are the same. Many in the extrinsic camp are likely to say they are not the same because time itself, like space itself, has no direction. It just appears to have an arrow itself because of an important event in our past, the big bang. It could have been so different that time’s arrow would now go in reverse to how it now goes. Similarly, space has no direction even if it seems to have the direction we call “down,” but that is just because we have an important object below our feet, the earth. In outer space, it would be clearer that there is no intrinsic arrow of space. Analogously, with a different origin of the universe, the arrow of processes might go in reverse or might not exist. That is why there is no intrinsic arrow of time.

If you were in the extrinsic camp, and you were looking for science to tell you about time’s arrow, you would naturally look to confirmed theories of physics rather than to the theories of the special sciences such as geology and plant science. The theories of physics underlie or explain the proper use of the word “time” in all the sciences. Our two comprehensive and fundamental theories of physics are the general theory of relativity and quantum mechanics. They are fundamental because they cannot be inferred from other theories of physics. Time is primitive in all the fundamental theories. Surprisingly, their laws appear to be nearly oblivious to time’s arrow. Advocates of the extrinsic theory have concluded from this that time itself has no arrow or nearly no arrow. The need for the hedge term “nearly” is discussed in a later section. Some physicists, such as Ilya Prigogine, have concluded instead that if the laws are oblivious to the arrow, then new time asymmetric laws are needed.

The largest sub-group within the extrinsic camp constitutes the entropy camp. Its members believe they have uncovered a physical foundation for time’s arrow. It is entropy increase. Entropy-increase plus the fact that the universe had a minimal amount of entropy in the past is why our universe has our current B-relation rather than its inverse, they would say. A more detailed presentation of what entropy is and what role it plays in time’s arrow is provided later in this article, but loosely it can be said that entropy is a quantitative measure of an isolated system’s closeness to equilibrium or how run down it has become or how disordered or how decayed or how close to being homogeneous or to having its energy spread out and dispersed. “Isolated” means the system is left to itself with no tweaking from outside the system. Entropy is higher after the light bulb has burned out, the cup of hot coffee has cooled to room temperature, the neat pile of leaves you just raked has been scattered by a gust of wind, the battery has run down, and the tree has died. These macroscopic processes are always found to run in only one direction in time. Even the best swinging pendulum is a one-way process because of friction. Here is a brief description of the key idea:

We all have the incontestable experience of the flow of time. The problem is to explain it in terms of physics; for we feel that the direction of time is not merely “subjective,” but rooted in the nature of things or “objective.” …Many physicists and philosophers believed it to be solved when the second law of thermodynamics was discovered. (Hutten 1959).

Sean Carroll commented on this point:

We can explain our impressions that time flows and that there is something more real about now than about other moments as consequences of the arrow of time, which is itself a consequence of increasing entropy (Carroll 2022c, 136).

The second law describes spontaneous entropy increase. The law is expressed in terms of probabilities of what will happen, not what must happen, so, for the entropy camp, the appeal to probability is a key to explaining the arrow of time.
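The probabilistic character of entropy increase can be illustrated with a toy simulation. The sketch below uses the Ehrenfest urn model, a standard textbook idealization that is not discussed in this article: particles sit in the left or right half of a box, and at each step one randomly chosen particle switches sides. The function names, the particle count of 100, the step count, and the random seed are all illustrative assumptions, and entropy is taken to be the logarithm of the number of microstates compatible with the coarse-grained count of particles on the left.

```python
import math
import random
from typing import List


def boltzmann_entropy(n_left: int, n_total: int) -> float:
    """Return ln W, where W counts the microstates with n_left particles on the left."""
    return math.log(math.comb(n_total, n_left))


def simulate_urn(n_total: int = 100, steps: int = 2000, seed: int = 0) -> List[float]:
    """Ehrenfest urn model: at each step one randomly chosen particle switches sides."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    n_left = n_total  # low-entropy start: every particle on the left
    history = [boltzmann_entropy(n_left, n_total)]
    for _ in range(steps):
        # The chosen particle is on the left with probability n_left / n_total.
        if rng.random() < n_left / n_total:
            n_left -= 1
        else:
            n_left += 1
        history.append(boltzmann_entropy(n_left, n_total))
    return history


entropy = simulate_urn()
max_entropy = boltzmann_entropy(50, 100)  # the evenly split macrostate
print(f"start: {entropy[0]:.2f}, end: {entropy[-1]:.2f}, maximum: {max_entropy:.2f}")
```

The update rule itself favors neither direction of time, and individual steps may lower the entropy. The overall climb toward the maximum comes from starting in a very improbable macrostate, which is the kind of explanation the entropy camp has in mind.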

Here are seven tenets of the entropy camp. They believe the presence of time’s arrow in a closed and isolated region, including the region we call our observable universe itself, (1) can be explained or defined by the overall entropy increase in the region, (2) depends on the region having a very large number of atoms or molecules in motion so entropy can be well-defined, (3) emerges only at the macroscopic scale of description, and (4) depends on the fact that entropy was low earlier. (5) Rather than saying time’s arrow necessarily points towards the future, as members of the intrinsic camp say, members of the extrinsic camp say it points towards equilibrium. Equilibrium is the state in which entropy has its maximum value. In that state, time’s arrow disappears, and the average distribution of particles in the system stays the same for a very, very long time. (6) The direction toward equilibrium in the region is, as a matter of fact, the future direction, regardless of reference frame. (7) The informal remark that time flows, or passes, or lapses is explicated as time’s existing.

The term “region” is purposefully vague to allow different claims about the size of the region. Also, there is a slight vagueness in the concept of entropy because there is no minimum number of particles required and no minimum number of new configurations or states per second that need to be present for a dynamic system to have a definite value for its entropy. At room temperature, there are roughly 10^24 molecules in about 17 cubic centimeters of water or 3.5 teaspoons. Also, at room temperature, a single water molecule collides with its neighbors about 10^14 times per second. Therefore, the number of new configurations per second is enormous. It is so enormous that most experts believe the various sources of vagueness just mentioned are irrelevant to philosophical issues involving time’s arrow.
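The order of magnitude of these figures can be checked with a short calculation. The sketch below assumes a density of about 0.997 grams per cubic centimeter for water near room temperature and a molar mass of about 18.015 grams per mole. It takes the collision rate from the text rather than deriving it, and it treats every collision as involving exactly two molecules, which is a simplifying assumption.

```python
import math

AVOGADRO = 6.02214076e23    # molecules per mole (defined SI value)
MOLAR_MASS_WATER = 18.015   # grams per mole (approximate)
DENSITY_WATER = 0.997       # grams per cubic centimeter near 25 °C (assumed)


def molecules_in_water(volume_cm3: float) -> float:
    """Estimate the number of molecules in a given volume of liquid water."""
    mass_grams = volume_cm3 * DENSITY_WATER
    return mass_grams / MOLAR_MASS_WATER * AVOGADRO


count = molecules_in_water(17.0)

# Collision rate per molecule quoted in the text; each collision is counted
# once by dividing by two (assumes binary collisions only).
collisions_per_molecule_per_second = 1e14
total_collisions_per_second = count * collisions_per_molecule_per_second / 2

print(f"molecules in 17 cm^3: about 10^{round(math.log10(count))}")
print(f"collisions per second: about 10^{round(math.log10(total_collisions_per_second))}")
```

On these assumptions the sample holds a little under one mole of water, so the molecule count sits just below 10^24, and the number of collision events per second in the sample is many orders of magnitude larger still.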

Members of the extrinsic camp would say that, if you clean your messy room and so decrease its entropy, you are not reversing the room’s arrow. They would say the arrow of time applies to the room or is manifest in the room, but there is no “room’s arrow.” Time’s arrow is overall entropy increase throughout the universe, they would say, even though there can be entropy decrease in some smaller sub-systems.

According to the entropy camp, the arrow emerges as the scale increases. This kind of emergence is not a process in time such as when an oak tree emerges from an acorn. It is a coarse-graining feature. It is something that reveals itself as the unhelpful information in the finer details is not taken into account. The arrow’s emergence is not strong emergence of something new and independent of whatever happens at the lower scale, but only weak emergence. Carroll explained the intended sense of the term “emergent”:

To say that something is emergent is to say that it’s part of an approximate description of reality that is valid at a certain (usually macroscopic) level and is to be contrasted with “fundamental” things, which are part of an exact description at the microscopic level…. Fundamental versus emergent is one distinction, and real versus not-real is a completely separate one.

The term “microscopic level” is a vague term that designates the atomic scale, that is, the world at the level of atoms and molecules, or much smaller scales. It is not tied to the use of a microscope. Emergent features are those we posit because they transcend the obscuring details of the microscopic level and give us useful, succinct information that improves our understanding of the phenomena we are interested in at a larger scale. When you want to understand why you just heard a loud noise, as a practical matter you must ignore the information about positions and velocities of all the molecules (even if you were to have some of this information) and focus on the more useful coarse-grained information that the room contained a glass of water which fell onto a hard floor and broke, thereby sending a loud sound throughout the room and into your ear. However, the lower-scale information about each molecule having this or that momentum and position and what external forces are acting on the molecules can in principle be used to explain both any exceptions to the higher-level regularities and the limits of those regularities. An example of an exception is when a falling glass of water does not break even though it hits the floor.

Because the law of entropy increase is a coarse-grained feature of nature, it is irrelevant to Laplace’s Demon. The demon uses the fine-grained features.

How should one explain why the direction of entropy increase coincides with the direction of time? Perhaps it is an inexplicable fact. Perhaps it is a coincidence. Perhaps there is a deeper explanation. This is a bad question, said Julian Barbour: “It is wrong to seek to explain why the direction of entropy increase coincides with the direction of time. The direction of entropy increase is the direction of time.” Everyone in the intrinsic camp disagrees with Barbour.

Although adherents to the extrinsic theory often speak of the “ways” in which the past differs from the future as being “arrows,” this article often calls them “mini-arrows” in order to distinguish a mini-arrow from time’s master arrow that includes all the mini-arrows. The term “mini-arrow” is not a term commonly used in the literature. Typical mini-arrows recognized by the extrinsic camp are entropy increasing, causes always preceding their effects, space’s constantly expanding and never contracting, radiation flowing away from accelerated charges (such as in a candle flame or light bulb) and not into them, people having access to records of the past but not of the future, heat flowing naturally only from hot to cold, and our being able to intervene and affect the future but never the past. Explaining time’s arrow in more depth requires solving the problem of showing how these mini-arrows are related to each other. Perhaps some can be used to explain others but not vice versa. Huw Price called this the taxonomy problem. Attempts to solve the problem are explored in a later section.

The main goals of the entropy camp are (i) to describe how emergence works in more detail, (ii) to explain why the universe and its sub-systems have not yet reached equilibrium, (iii) to understand why entropy was lower in the past, and (iv) to solve the taxonomy problem.

Carroll commented:

In reality, as far as the laws of physics are concerned, all directions in space are created equal. If you were an astronaut, floating in your spacesuit while you performed an extravehicular activity, you wouldn’t notice any difference between one direction in space and another. The reason why there’s a noticeable distinction between up and down for us isn’t because of the nature of space; it’s because we live in the vicinity of an extremely influential object: the Earth…. Time works the same way. In our everyday world, time’s arrow is unmistakable, and you would be forgiven for thinking that there is an intrinsic difference between past and future. In reality, both directions of time are created equal. The reason why there’s a noticeable distinction between past and future isn’t because of the nature of time….

Instead, he would say, it is because of the nature of entropy change—a universal tendency to evolve toward equilibrium—and the fact that entropy was low in the distant past. Also, when he said, “both directions of time are created equal” he did not mean to imply that there was intentional creation involved.

Carroll’s position on the intrinsic theory and its appeal to our impression that time flows is that, “we can explain our impressions that time flows and that there is something more real about now than about other moments as consequences of the arrow of time, which is itself a consequence of increasing entropy” (Carroll 2022c, 136).

In his 1956 book The Direction of Time, Hans Reichenbach proposed an influential version of the entropy theory. He said, “positive time is the direction toward higher entropy,” and he defined the future direction of time as the direction of the entropy increase of most branch sub-systems. These are sub-systems that become isolated temporarily from the main system of objects being analyzed. Very probably, these isolated branch systems undergo an entropy increase over time, and the overall direction they all go is toward equilibrium. Reichenbach’s overall goal was to explain the direction of time in terms of the direction from causes to their effects.

a. Criticisms

A variety of criticisms has been directed at the extrinsic theory of time’s arrow. A very popular one is that the theory is too static. It misses the dynamic feature of time that is the key feature of the intrinsic arrow. It misses what is customarily called time’s becoming or time’s passage. A-theorists complain that the B-theorists of the extrinsic camp mistakenly promote a non-dynamic or static block theory of time in which insufficient attention is paid to change because the B-series of events is only about what events occur before what other events. For example, the B-theory fails to capture the dynamical fact that the present keeps moving along the A-series of events as time goes by.

Maudlin would say the extrinsic theory misses the dynamical fact that the present “produces” the future. Those in the entropy camp believe entropy tends to increase over time in isolated systems because higher entropy states are “typical” or “likely,” compared to lower entropy states. These theorists, says Maudlin, do not appreciate that their notion of typicality needs to rely on an assumption of the “production” of events that serves as a “driving force” that gets states to evolve into other states. Yet it is the intrinsic arrow that provides this production, this driving force. The intrinsic arrow is why production is an asymmetric relation: If A produces B, then B does not produce A. In this sense, earlier states are metaphysically more fundamental than the later states that are produced by them and exist because of them. So, those in the entropy camp have things backward.

Bertrand Russell was an influential promoter of the B-series. In response to Russell, his colleague at Cambridge University J.M.E. McTaggart said:

No, Russell, no. What you identify as “change” isn’t change at all. The “B-series world” you think is the real world is…a world without becoming, a world in which nothing happens.

By “B-series world,” he means a world without an A-series. If Russell were to have lived in the twenty-first century, he might have responded to McTaggart by saying McTaggart’s mistake is to imply by analogy that a video file in a computer could never represent anything that changes because the file itself does not change.

Strictly speaking, the A-theory and the B-theory are theories about the ordering of events, not times. But if you think of a time as a set of simultaneous, instantaneous events, then the theories are about the ordering of times.

Members of the extrinsic camp have a responsibility to answer the following criticism: If, as they believe, objectively there is no dynamic flow or passage inherent in physical time, then why do so many people believe there is? Surely these people have gotten something right about the nature of time. Craig Callender has tried to defend the extrinsic theory against this criticism:

While physical time does not itself flow, we can explain why creatures like us embedded in a world like this one would nonetheless claim that it does…. In contrast to simply positing a primitive metaphysical flow and crossing one’s fingers in hope that somehow we sense it, the present theory advances independently suggested mechanisms, makes a number of specific claims, unifies some types of phenomena and theory, and suggests fruitful lines of inquiry. By any reasonable standard of theory choice, it is a better theory of passage than any currently on offer in metaphysics (Callender 2017, 227 and 263).

Let us turn now to criticisms that are more specific to the entropy camp. A critic might complain that members of the entropy camp cannot successfully defend their belief that the relation of earlier-than reduces to the lower-to-higher entropy relation rather than to the higher-to-lower entropy relation. At best they must presuppose what they need to prove.

Many critics of the entropy camp say that, even though time’s arrow is directly correlated with the universe’s increase over time in its entropy, this is an interesting but merely contingent feature of the universe that is not crucial to characterizing or explaining time’s arrow, although it might be a sign of the arrow’s presence. Stop signs near intersections of roads are signs of the presence of cars in the world, but cars do not need stop signs in order to be cars.

Does time need an entropy arrow or any kind of extrinsic arrow at all? Ferrell Christensen has an interesting perspective on this. “It is puzzling that Boltzmann’s thesis of extrinsic temporal asymmetry is accepted so widely without question—it is virtually an article of faith among philosophers and physicists in certain quarters. I suggest that such an attitude is unjustified. At least for now, the assumption that time needs an extrinsic arrow is in error” (Christensen 1987, 247).

Castagnino and Lombardi argue that the arrow is intrinsic to relativistic spacetime, but it is not based on or reducible to entropy changes. They embrace:

Earman’s ‘Time Direction Heresy’, according to which the arrow of time, if it exists, is an intrinsic feature of spacetime which does not need and cannot be reduced to non-temporal features [such as entropy] (Earman 1974, 20) and it cannot be “defined” in terms of entropy…[and the] geometrical approach to the problem of the arrow of time has conceptual priority over the entropic approach, since the geometrical properties of the universe are more basic than its thermodynamic properties…. [T]o confidently transfer the concept of entropy from the field of thermodynamics to cosmology is a controversial move. The definition of entropy in cosmology is still a very problematic issue, even more than in thermodynamics: there is not a consensus among physicists regarding how to define a global entropy for the universe. In fact, it is usual to work only with the entropy associated with matter and radiation because there is not yet a clear idea about how to define the entropy due to the gravitational field (Castagnino and Lombardi 2009, 8).

Critics of the entropy theory have complained that entropy is anthropomorphic and person-dependent, but time’s arrow is not. This criticism is explored later in this article after more has been said about the nature of entropy and after we consider the ideas of Edwin T. Jaynes.

Entropy change is not the same as complexity change. The state of the universe at the big bang was simple. A future state near equilibrium will be simple. Today’s state is quite complex. One important alternative to the entropy theory among advocates of the extrinsic theory of time is that “the direction gets into time not through the growth of disorder but through the growth of structure and complexity” (Barbour 2020, 20).

Turning back to broader criticisms of the extrinsic theory that do not apply only to the entropy theory, Tim Maudlin and others complain that proponents of the extrinsic theory do not understand the nature of time-reversal symmetry. Saying the universe obeys time-reversal symmetry or time-reversal invariance, which is the same thing, is the technical way of saying that whatever happens could have happened in reverse. Maudlin believes misunderstanding this feature is a mistake that causes so many of those in the extrinsic camp to accept the following faulty argument: The fundamental physical laws are time-reversal invariant. So, there is no direction of time in fundamental physics. This issue is explored further in the final section of this article.

4. The Entropy Arrow

The idea that entropy is intimately connected to time’s arrow originated from noticing that increases of entropy in closed and isolated systems are correlated with increases in time. The increases are directly correlated and highly correlated but not perfectly correlated because entropy increase is a strong tendency, not a certainty, although this difference between tendency and certainty was not clear to the physics community until thermodynamics became grounded in statistical mechanics.

The word “entropy” is not part of the vocabulary of the ordinary person, but the concept is indispensable in modern science. It has many helpful characterizations depending on the situation. For example, in some systems it can be a useful numerical measure of the system’s randomness, the amount of energy that has become useless in the system, lack of knowledge of the system’s precise microstate, and the system’s nearness to equilibrium. In equilibrium, the system’s macroscopic properties are constant and nothing interesting is happening macroscopically even though its microstate does keep changing. In equilibrium there is time but no arrow of time (according to the entropy theory but not the intrinsic theory).

In a closed and isolated and non-expanding system, total energy stays the same and merely changes form over time while total entropy increases over time. So, entropy is not energy. Nor is entropy a kind of substance that retains its identity over time. It is a special property of a multi-particle system (for example, a group of molecules) that is changing its configuration over time.

In the late 17th century, Robert Boyle claimed, without strong evidence, that physical phenomena can be explained by “the particular sizes, shapes, and situations of the extremely little bodies that cause them.” In the 1870s, it still was not generally accepted by physicists that there are these “little bodies” or corpuscles that we now call atoms and molecules. At that time, Boltzmann envisioned thought experiments for a universe that he treated as a discrete particle system, as if it were a system of tiny, colliding billiard balls in constant motion in three dimensions of space. However, he believed his atomism, like Isaac Newton’s atomism, was merely a counterfactual assumption, one that was literally false yet helpful in describing thermodynamic behavior. Almost all physicists, including Boltzmann, believed at the time that matter is continuous and infinitely divisible. He soon changed his mind, however, and became an early exponent of the atomic thesis, the claim that atoms are real and not just useful mathematical artifacts. His contemporaries were slow to accept atoms, and he came under severe criticism for promoting the atomic hypothesis. They argued that atoms cannot be seen and never would be, so why believe in them? Boltzmann’s response was that atoms are real because they are so helpful for explaining the second law and other thermodynamic principles.

Boltzmann had other insights. The notion of a macrostate as opposed to a microstate of a physical system was first suggested by him. At any point in time, a system is in one macrostate which is produced by exactly one microstate but which could have been produced by any one of a great many other microstates because at the macrolevel no one except Laplace’s Demon could practically tell one of these microstates from another.

Scientists never actually know the microstate at a given time of any macroscopic system, such as the positions and momenta of its massive number of atoms and molecules; they know only the values of some of the system’s useful, emergent, macroscopic variables such as volume, pressure, temperature, and (going beyond thermodynamics) voltage, color, species, and the number of cows milked this morning in barn 6. The values of those macro-variables constitute the system’s macrostate. Boltzmann’s most important insight in statistical thermodynamics was that there is multiple realizability of any macrostate by an enormous number of microstates. Boltzmann realized that this feature can be used to quantitatively define entropy. He had the fruitful insight that the entropy of a system can be characterized quantitatively as the logarithm of how many ways the constituent corpuscles or molecules of a closed and isolated physical system can be re-configured so that external observers cannot tell the difference macroscopically among all those configurations.
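
The counting idea can be illustrated with a toy model. The following sketch is not drawn from the sources cited in this article. It assumes a hypothetical system of N coins in which a microstate is an exact sequence of heads and tails and a macrostate is only the total number of heads, and it sets Boltzmann’s constant to 1 for simplicity. The function names are illustrative.

```python
from math import comb, log

def multiplicity(n_coins: int, n_heads: int) -> int:
    """Count the microstates (exact head/tail sequences) that realize
    the macrostate 'n_heads heads out of n_coins coins'."""
    return comb(n_coins, n_heads)

def boltzmann_entropy(n_coins: int, n_heads: int, k: float = 1.0) -> float:
    """Entropy as k times the natural logarithm of the multiplicity.
    Setting k = 1 is a simplifying assumption of this sketch."""
    return k * log(multiplicity(n_coins, n_heads))

if __name__ == "__main__":
    n = 100
    # Compare a perfectly ordered macrostate with more mixed ones.
    for heads in (0, 10, 50):
        print(heads, multiplicity(n, heads), boltzmann_entropy(n, heads))
```

In this toy model the all-tails macrostate is realized by exactly one microstate, so its entropy is zero, whereas the half-heads macrostate is realized by the largest number of microstates and so has the highest entropy.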

Here is a helpful analogy from Brian Roberts for understanding what he calls Boltzmann’s counting argument for why a closed system’s high-entropy states can happen in such an enormous variety of ways that they occupy the ‘greatest volume’ of possibilities.

It is like imagining a house with a thousand blue rooms and … (several rooms that are red). …Think of the blue rooms as analogous to high-entropy states (of the isolated house). Now, suppose that you were to leave a red room and enter an arbitrary new room (with a uniform probability of entering any given room); it is overwhelmingly likely that the new room will be blue. Moreover, it is likely that if you continue to repeat this process, your room colour will (with high probability) be unchanging or ‘stationary’ over time…. Unfortunately, this is not enough to explain the time-asymmetric behaviour of such systems. Returning to the house analogy, suppose we find a person in a red room and ask what colour room they are most likely to have come from? The very same volume argument concludes: a blue one. That is, the counting argument by itself provides equally good evidence for a high entropy state to the future and to the past (Roberts 2022).

(The first two parenthetic phrases in the quotation were added by the author of this article.) Your entering a new room is analogous to the system’s entering a new configuration, a new state. Explaining the time-asymmetric behavior of the house requires the addition of the Past Hypothesis and other assumptions to be discussed below.
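
The symmetry of the counting argument in the house analogy can be checked with exact arithmetic. The sketch below is only an illustration under stated assumptions: the number of red rooms (five) is hypothetical because the quotation says only “several,” a move is assumed to go to any other room with equal probability, and the backward inference assumes a uniform prior over which room was occupied previously. The function names are illustrative.

```python
from fractions import Fraction

def forward_blue(n_blue: int, n_red: int) -> Fraction:
    """P(next room is blue | current room is red), assuming a jump
    to any one of the other rooms with equal probability."""
    return Fraction(n_blue, n_blue + n_red - 1)

def backward_blue(n_blue: int, n_red: int) -> Fraction:
    """P(previous room was blue | current room is red), by Bayes' rule.
    Assumes a uniform prior over the previously occupied room."""
    total = n_blue + n_red
    prior = Fraction(1, total)      # uniform prior over the previous room
    jump = Fraction(1, total - 1)   # chance of jumping to this particular room
    blue_weight = n_blue * prior * jump        # previous room was a blue one
    red_weight = (n_red - 1) * prior * jump    # previous room was another red one
    return blue_weight / (blue_weight + red_weight)

if __name__ == "__main__":
    # 1000 blue rooms as in the quotation; 5 red rooms is an assumed value.
    print(forward_blue(1000, 5), backward_blue(1000, 5))
```

Under these assumptions the two probabilities are equal, which is the point of the quotation: counting alone predicts a blue room in both temporal directions.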

The principal law in physics that involves entropy is the second law of thermodynamics. There is agreement that it has never been violated, but there is no agreement on what has not been violated. The philosopher of physics Percy Bridgman quipped that “There have been nearly as many formulations of the second law as there have been discussions of it.” Here is the best, short, non-mathematical version for the philosophical considerations of this article:

Second Law of Thermodynamics: There is a strong tendency for entropy to increase with time as a closed, isolated system moves towards equilibrium.

This is the case provided the system is not already at equilibrium. If it is already in equilibrium, then entropy would have a strong tendency to stay the same. If the entropy camp is correct that time’s arrow is due to entropy increase, then as the universe approaches equilibrium time’s arrow fades away. The above statement of the second law for a system’s total entropy is quoted from the physicist Richard Feynman. A system is said to be closed if no matter can cross its boundary and isolated if no energy can cross its boundary.

Another way to express the second law is to say that, for a closed and isolated system having a nonequilibrium initial condition, it is far more likely to move closer to equilibrium than away from equilibrium. So, the proper answer to the question, “Why does total entropy increase over time?” is that this is what is very, very likely to happen.

The second law is not a fundamental law of physics. Physicists agree that it should be derivable from the fundamental laws using the techniques of statistical mechanics (statistical techniques for handling a system with a very large number of components), perhaps with other assumptions. However, there is no consensus on the details of the derivation. From 1902 to 1905, Einstein worked unsuccessfully to derive the second law from basic physical features, and he stopped working on the problem in 1905 when he produced the special theory of relativity. The derivation difficulty continues in the 21st century. In the 20th century the field of thermodynamics came more and more to be understood as the field of coarse-grained statistical mechanics in which the coarse graining is due to the loss of a considerable amount of information about the configuration of the system’s constituent particles.

All other things being equal, physicists prefer exact laws to probabilistic laws, and the standard form of the second law is probabilistic. The second law is about a strong tendency or propensity, not a necessity, although in the early days of thermodynamics the law was mistakenly presented as a necessity, namely that total entropy never decreases in a closed and isolated system. This mistake led some critics to make the inaccurate comment that the growth of life on Earth is inconsistent with the second law. The reason there can be life on Earth is that our sun sends high-temperature, low-entropy, yellow sunlight to Earth where it is used for photosynthesis and so forth and then radiated away (as lower-temperature, higher-entropy, infrared energy). If the sun’s energy were not continually re-radiated, the Earth would slowly heat up, and we all would die of global warming.

Regarding this notion of tendency, consider that when two dice are rolled, the tendency is to not get two sixes. The failure to roll two sixes is not a necessity; it is merely likely. Similarly for entropy. When the state of a closed and isolated system of many molecules changes, it has a propensity to change to a state with higher total entropy because that is the most likely thing to happen. It is enormously likely, almost certain in systems with a great many particles that frequently change their configurations. However, the second law is not true for systems with a small number of parts or for systems of parts that do not change their configurations.
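
The difference between a tendency and a necessity can be made concrete with two small calculations. The sketch below is an illustration, not a derivation of the second law. It assumes fair dice and, for the gas example, the hypothetical idealization that each molecule is independently and equally likely to be found in either half of its container. The function names are illustrative.

```python
from fractions import Fraction

def prob_not_double_six() -> Fraction:
    """Probability that a roll of two fair dice is not two sixes."""
    return 1 - Fraction(1, 6) ** 2

def prob_all_in_left_half(n_molecules: int) -> Fraction:
    """Probability that every molecule is found in the left half of a box,
    assuming each molecule is independently equally likely to be in
    either half (an idealization made for this sketch)."""
    return Fraction(1, 2) ** n_molecules

if __name__ == "__main__":
    print(prob_not_double_six())
    for n in (2, 10, 100):
        print(n, float(prob_all_in_left_half(n)))
```

With two molecules the low-entropy arrangement is common. As the number of molecules grows, its probability shrinks exponentially, which is why the tendency toward higher entropy is almost a certainty for large systems but fails for small ones.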

Boltzmann was the first person to have this insight about likelihood. Since then, it has been considered a misunderstanding of the second law and of the concept of entropy to believe there is an asymmetric force or mechanism that causes entropy to increase.

The second law is expressed as an inequality, not an equation, and the actual speed or rate with which a system increases its entropy is not specified, even by the quantified version of the law. The rate depends on many factors that are not discussed in this article and that do not affect the philosophical claims about time’s arrow.

What the second law does not imply is that, for a single system that tends to increase its total entropy, all of its sub-systems tend to increase their entropy, too. Entropy within some sub-system might decrease, but one can expect there will be an even greater entropy increase elsewhere in the system. For example, think of human civilization on Earth as a sub-system of the effectively isolated solar system. Civilization can thrive on Earth—but at the expense of entropy increase due to the sun’s continuing to burn its nuclear fuel.

There are several background assumptions made in founding thermodynamics upon statistical mechanics. The system must have very many particles, and they must readily change their configurations so that it looks as if chance is operating. More specifically, it has proved useful for explaining a system’s evolution in the future to assume that all or almost all the microscopic states underpinning a given macroscopic state are equiprobable, and it is presumed this idealization is not misleading. (But is it legitimate to assume that all the blue rooms in the above analogy are sufficiently equiprobable? In the real world beyond this toy model, the “blue rooms” surely are not exactly equiprobable.) It is also clear that, if one wants to explain evolution from the past and not just evolution toward the future, there needs to be some assumption about why the entropy started out lower rather than higher than it is today. Unfortunately, there is no consensus among physicists regarding whether more assumptions are needed in order to establish in detail that thermodynamics is founded upon statistical mechanics.

Another point of controversy is the complaint that entropy is subjective or anthropomorphic. The influential physicist Edwin T. Jaynes remarked that:

Entropy is an anthropomorphic concept, not only in the well-known statistical sense that it measures the extent of human ignorance as to the microstate. Even at the purely phenomenological level, entropy is an anthropomorphic concept. For it is a property, not of the physical system, but of the particular experiments you or I choose to perform on it

because of our choice of what level of coarse-graining to use. Jaynes’ position is presented and then attacked by the philosopher Adolf Grünbaum as misunderstanding the concept of entropy. It is clear that entropy is somewhat observer-dependent, but what is more important is the extent and significance of this observer dependence. Grünbaum’s point is that all the different ways of coarse-graining lead to nearly the same result, to the same value for the entropy. Thus, entropy is not significantly subjective (Grünbaum 1973, 648-659). Nobel Prize-winning physicist Roger Penrose agreed:

In view of these problems of subjectivity, it is remarkable that the concept of entropy is useful at all in precise scientific descriptions—which it certainly is! The reason for this utility is that the changes from order to disorder in a system, in terms of detailed particle positions and velocities, are utterly enormous, and (in almost all circumstances) will completely swamp any reasonable differences of viewpoint as to what is or is not ‘manifest order’ on the macroscopic scale (Penrose 1989, 310).

Not everyone adopted the position that Grünbaum and Penrose advocated. Carlo Rovelli and Huw Price did not. Rovelli said, “The directionality of time is…real but perspectival…: the entropy of the world in relation to us increases…and…the increase in entropy which we observe depends on our interaction with the universe….” For the entropy camp, the objectivity of time’s arrow turns on the outcome of this continuing debate.

Time’s arrow is not illustrated by the fact that tomorrow comes after today. That fact is true by definition. Instead, according to the entropy camp, the arrow of our universe is shown by the fact that today has a greater value of entropy than yesterday and so on for tomorrow and the foreseeable days ahead. The universe will never be seen to have a state just like it has today. Why is this? If things change, why can’t they change back? They can, but the probability that they will is insignificant.

According to the entropy camp, there are two ways to have time without an arrow. There would be no arrow if entropy were to stop changing; it would stop at equilibrium. There also would be no arrow if the entropy changes were to become randomly directed. Members of the intrinsic camp disagree with these two exceptions and say there is no way to have time without an arrow. Even at equilibrium, time would continue to go from past to future, they say.

In the 21st century, Stephen Wolfram said that nature is a cosmic computer. Every physical process is actually a natural computation, Wolfram argued, and “… the passage of time basically corresponds to the progressive updating” of this computer. This updating is what other physicists call evolving according to the laws of nature. One of Wolfram’s critics, the philosopher of physics Tim Maudlin, reacted by remarking, “The physics determines the computational structure, not the other way around.”

For a not-too-mathematical introduction to entropy and some of its many sub-issues, see the chapter “Entropy and Disorder” in (Carroll 2010). For a more mathematical, but easy-to-understand introduction, see (Lebowitz 1993). For an examination of how entropy has been misunderstood in the literature, see (Lazarovici and Reichert 2015).

a. The Past Hypothesis

Physicists presume that the fundamental laws relevant to understanding entropy change are reversible in the sense that they have time-reversal symmetry. This implies (among many other things) that, for every solution of the equations for which entropy increases, there is also a time-reversed (that is, process-reversed) solution in which entropy decreases. Yet we never notice any significant entropy decreases. We invariably experience entropy increases. These increases all together are what those in the entropy camp call the arrow of time. Philosophers and physicists want to know why the other half of the equations’ solutions are not manifested.
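
The puzzle can be seen in miniature in the Kac ring, a standard toy model of reversible dynamics. It is not discussed in the sources cited here and is offered only as a hedged illustration. Balls sit at sites on a ring, every ball moves one site per tick, and a ball flips color whenever it crosses a marked edge. The ring size, the fraction of marked edges, the random seed, and the function names below are all assumptions of this sketch.

```python
import random
from math import comb, log

def step(colors, marked, forward=True):
    """Advance the ring by one tick. colors[i] is 0 (white) or 1 (black);
    marked[i] is 1 if the edge between site i and site i + 1 is marked."""
    n = len(colors)
    new = [0] * n
    for i in range(n):
        if forward:
            # The ball at site i crosses edge i and lands on site i + 1,
            # flipping color if that edge is marked.
            new[(i + 1) % n] = colors[i] ^ marked[i]
        else:
            # Time-reversed rule: the ball at site i crosses edge i - 1
            # in the opposite direction and lands on site i - 1.
            j = (i - 1) % n
            new[j] = colors[i] ^ marked[j]
    return new

def evolve(colors, marked, ticks, forward=True):
    """Apply the one-tick rule repeatedly."""
    for _ in range(ticks):
        colors = step(colors, marked, forward)
    return colors

def coarse_entropy(colors):
    """Natural log of the number of configurations that share this
    macrostate, where the macrostate is the count of black balls."""
    return log(comb(len(colors), sum(colors)))

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, an arbitrary choice
    n = 1000
    marked = [1 if rng.random() < 0.1 else 0 for _ in range(n)]
    start = [0] * n  # all white: a special, low-entropy starting state
    later = evolve(start, marked, 20)
    back = evolve(later, marked, 20, forward=False)
    print(coarse_entropy(start), coarse_entropy(later), coarse_entropy(back))
```

Running the reversed rule for the same number of ticks restores the starting configuration exactly, so every entropy-raising history of this model has a matching entropy-lowering one. Nothing in the rule itself picks out which of the two is to be expected; in the sketch, that work is done by the choice of a special starting state.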

So, careful attention is needed in order to explain in detail why the entropy in our universe generally increases from past to future and not from future to past. If not in our universe, then at least in our observable universe. Ludwig Boltzmann, with a little subsequent help from modern statistical mechanics, tried to explain this by appealing to the second law of thermodynamics with no assumption that time is intrinsically directed, but he invariably had to assume that entropy was low in the distant past.

But why was entropy so low in the distant past of our observable universe? Is this just an ad hoc assumption? It certainly makes the distant past be very special. This fact about low entropy is not derivable from any of the fundamental laws, and it is not known a priori. Richard Feynman highlighted the need for this assumption when he said in 1963:

So far as we know all the fundamental laws of physics, like Newton’s equations, are reversible. Then where does irreversibility come from? It comes from going from order to disorder, but we do not understand this till we know the origin of the order… for some reason the universe at one time had a very low entropy for its energy content, and since then the entropy has increased. So that is the way towards the future. That is the origin of all irreversibility, that is what makes the process of growth and decay, that makes us remember the past and not the future…. One possible explanation of the high degree of order in the present-day world is that it is just a question of luck. Perhaps our universe happened to have had a fluctuation of some kind in the past…. We would like to argue that this is not the case.

In 1965, Feynman said: “I think it necessary to add to the physical laws the hypothesis that in the past the universe was more ordered, in the technical sense, than it is today” (Feynman 1965, 116). In 2000, the philosopher David Albert suggested we assume that the entropy of the observable universe was minimal all the way back in time, and if time began at the Big Bang, then entropy was minimal at the Big Bang. This low-entropy boundary condition in the past is his Past Hypothesis. The hypothesis is not a dynamical law, but it is a law in the sense that it provides a lot of information in a compact and simple expression, which is all David Lewis requires of a law.

At the big bang, the character of gravity was much more important than anything else. In that early and brief era, the universe was very highly ordered. This was gravitational order, not thermodynamic order. Gravity was in a very special, improbable state while everything else was as random as it could be. Gravitation produces clumpiness, but the universe then was not at all clumpy, although it is presently clumped into stars and galaxies. Although the low entropy of the gravitational order dominated the concurrent high thermodynamic entropy at the big bang, the reverse holds these days.

Sean Carroll defended the Past Hypothesis:

You need the Past Hypothesis…. Now, to be fair, the story I am telling you here, this is the standard story that most physicists or philosophers would tell you. There are people who don’t go along with the standard story. There are people who…think that time just has a direction built into it…that there is a flow from the past to the future. I don’t think that. Most working physicists don’t think that, but there are people who think that.… Even if you believe that, it doesn’t by itself tell you whether the past had low entropy.

To me the logic goes in the following way. You might want to think that time fundamentally has a direction–or that time doesn’t fundamentally have a direction [and] it’s just that it started with low entropy and so we perceive it to have a direction macroscopically. But if you think that time fundamentally has a direction, you still need to explain why the early universe had low entropy. That doesn’t come for granted. There is no feature about saying time has a direction that then says if I take the current state of the universe and evolve it into the past, the entropy goes down. There is no connection there, right? So, even if you believe that time has a direction, you still need to have some past hypothesis. And once you have the past hypothesis, you don’t need to assume that time has a direction because it will have a direction macroscopically [because of the second law of thermodynamics] even if microscopically it’s completely reversible. I think that’s why most people like the Past Hypothesis package when it comes to explaining the asymmetry of time (Carroll 2022b).

In the above quotation, Carroll supported two claims: (1) The Past Hypothesis is needed in order to successfully use entropy to explain the existence of time’s arrow. (2) The extrinsic theory, especially the entropy theory, is more appropriate than any intrinsic theory for explaining time’s arrow.

There is a fine point that is relevant here: the difference between ordinary thermodynamic entropy and gravitational entropy. At the time of the big bang, the universe was smooth, that is, highly ordered, in regard to its gravitational degrees of freedom, but not in regard to its thermodynamic entropy, the entropy of heat. The latter was very high. The gravitational entropy was the more important factor at that time, and it was very, very low.

The Past Hypothesis does not require that the low entropy state at the big bang had no prior state nor that it is the very lowest possible value for entropy. Entropy in the distant past just needs to be low enough that the universe could easily have evolved into the observable universe of today.

But this raises another fine point: the difference between a boundary condition on the past as opposed to a boundary condition on the future. If our goal were to explain only why entropy increases in the future, then we could assume the Principle of Indifference—that we are indifferent about which microstate is producing a given macrostate—and not bother with the Past Hypothesis. Not so, when it comes to explaining why entropy decreases in the past. In that case, as explained in (Carroll 2020):

[W]e have to supplement the Principle of Indifference with the Past Hypothesis. When it comes to picking out microstates within our macrostate, we do not assign every one equal probability: We choose only those microstates that are compatible with a much lower-entropy past (a very tiny fraction), and take all of those to have equal probability. …If we use statistical mechanics to predict the future behavior of a system, the predictions we get based on the Principle of Indifference plus the Past Hypothesis are indistinguishable from those we would get from the Principle of Indifference alone. As long as there is no assumption of any special future boundary conditions, all is well.
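The contrast between the two ways of picking microstates can be illustrated with a toy model. The sketch below uses the Kac ring, a standard reversible toy system that is not mentioned in the sources above and is chosen here only as an assumption for illustration. Balls coloured +1 or -1 sit on a ring, every tick each ball moves one site clockwise, and a ball flips colour when it crosses a marked edge. The function names, the ring size, the marking probability, and the number of ticks are all hypothetical choices. The entropy of a macrostate is taken to be the logarithm of the number of arrangements with the same count of +1 balls.

```python
import math
import random

def step(balls, marked, forward=True):
    """One tick of the Kac ring; the backward tick exactly undoes the forward tick."""
    n = len(balls)
    new = [0] * n
    for i in range(n):
        if forward:
            # The ball at site i crosses edge i and lands on site i+1.
            new[(i + 1) % n] = -balls[i] if marked[i] else balls[i]
        else:
            # The ball at site i+1 crosses edge i in reverse and lands on site i.
            new[i] = -balls[(i + 1) % n] if marked[i] else balls[(i + 1) % n]
    return new

def evolve(balls, marked, steps):
    """Evolve by |steps| ticks; a negative value runs the dynamics into the past."""
    for _ in range(abs(steps)):
        balls = step(balls, marked, forward=steps > 0)
    return balls

def entropy(balls):
    """Boltzmann entropy ln W of the macrostate 'k balls out of n are +1'."""
    n, k = len(balls), balls.count(1)
    return math.log(math.comb(n, k))

def run_demo(n=10000, mark_prob=0.1, t=5, seed=0):
    rng = random.Random(seed)
    marked = [rng.random() < mark_prob for _ in range(n)]
    # Past Hypothesis: the present microstate descends from a zero-entropy
    # state (all balls +1) that existed t ticks ago.
    ph_now = evolve([1] * n, marked, t)
    # Indifference alone: same present macrostate, but the arrangement is
    # drawn at random from all arrangements with that count of +1 balls.
    typical_now = ph_now[:]
    rng.shuffle(typical_now)
    result = {}
    for name, state in (("past_hypothesis", ph_now), ("indifference", typical_now)):
        # Entropy t ticks in the past, now, and t ticks in the future.
        result[name] = {dt: entropy(evolve(state, marked, dt)) for dt in (-t, 0, t)}
    return result
```

Calling `run_demo()` returns the entropy at three times for each kind of microstate. For the randomly chosen microstate, entropy is expected to be higher both t ticks later and t ticks earlier, which is the wrong retrodiction. For the microstate selected by the low-entropy past condition, entropy is zero at the earlier time and rises afterwards, while its future entropy stays close to that of the randomly chosen microstate.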

Speaking for the community of cosmologists, Brian Greene issued a warning: The Past Hypothesis is “observationally motivated but theoretically unexplained.” Instead of merely adopting Albert’s hypothesis, cosmologists want a theoretical reason why the Big Bang began at a relatively low entropy macrostate, a reason that makes this low entropy natural and not merely assumed ad hoc. The search for that theoretical reason has turned out to be extremely difficult. About this search, Roger Penrose declared, “To me, it’s the greatest puzzle about the Big Bang.”

Craig Callender proposed a solution to Penrose’s puzzle: “It seems a philosophically respectable position to maintain that the Past Hypothesis doesn’t need explanation” because it is a brute fact, a rock-bottom truth.

Anything whatsoever could be explained by the right choice of unusual initial conditions. Is the Past Hypothesis true merely because of some random fluctuation? Remarking on what he believed is the weakness of that explanation, Carroll said:

The state of the early universe was not chosen randomly among all possible states. Everyone in the world who has thought about the problem agrees with that. What they don’t agree on is why the early universe was so special—what is the mechanism that put it in that state? And, since we shouldn’t be temporal chauvinists about it, why doesn’t the same mechanism put the late universe in a similar state? (Carroll 2010 301-2).

Motivated by this explanatory optimism, many cosmologists have produced speculative theories that appeal to special conditions long before the Big Bang that have led naturally to low entropy at the Big Bang. However, none of these theories has attracted many supporters.

Among cosmologists, the most widely supported explanation of why the Big Bang was at relatively low entropy is that this is implied by cosmic inflation, a special version of the Big Bang theory that supposes there was early, exponential inflation of space, a swelling that proceeded much faster than the speed of light. This inflation theory establishes what direction the arrow of time points, and it is attractive also because it provides a ready explanation for many other unsolved problems in physics such as why the oldest and most distant microwave radiation arriving now on Earth from all directions is so uniform in frequency and temperature.

The leading theory for why this inflation began is that it was ignited by a fluctuation in a pre-existing inflaton field (not inflation field) that was at even lower entropy. It is believed that at a very early time all the energy of the universe was contained within the inflaton field. Unfortunately, there is no convincing reason why the inflaton field exists and why it fluctuated as it did and why entropy was lower before then—a convincing reason that this was natural and to be expected—other than that these assumptions help explain the value of entropy just as inflation began. So, the conclusion has to be accepted that Penrose’s puzzle remains unsolved.

See (Wallace 2017) for an exploration of the controversy about the Past Hypothesis and how it should be properly formulated.

5. Other Arrows

Mini-arrows are time-asymmetries of kinds of macro processes. Time has many mini-arrows that distinguish the future from the past, and these are part of what constitutes or exemplifies time’s master arrow according to the extrinsic camp. This article has mentioned several, but there are others. These mini-arrows are deep and interesting asymmetries of nature, and philosophers and physicists would like to know how the mini-arrows are related to each other. Can some be used to explain others? This is the taxonomy problem.

Sean Carroll has a precisely expressed position on the taxonomy problem:

All of the macroscopic manifestations of the arrow of time…can be traced to the tendency of entropy to increase, in accordance with the Second Law of Thermodynamics.

So, that is the single thing that enables all these other asymmetries between past and future. The fact that entropy is increasing is the reason why there is an arrow of time. I would not say it is the arrow of time.

Not all members of the entropy camp agree with Carroll that entropy is the fundamental mini-arrow in terms of which all the other mini-arrows can be explained. See (Maudlin 2007) for more discussion of which of time’s mini-arrows can be explained in terms of which others. The following sub-sections consider three mini-arrows: the memory arrow, the cosmological arrow, and the causal arrow. Two other arrows are not discussed further here. The radiative arrow is shown by light leaving a candle flame rather than converging from all directions into the flame. The action arrow is shown by our ability to affect the future and not the past.

a. The Memory Arrow

The memory mini-arrow or psychological mini-arrow shows itself in the fact that we remember the past and never the future. The most popular explanation of this mini-arrow appeals to entropy. Past events often have present traces, but future events never have them. When you see a footprint in the sand, you think someone in the past stepped there, never that someone is coming. Past events also often leave present traces in our brains. Remembering an event is a mental process that interrogates the brain’s stored trace of the event. It is reviewing but not re-viewing the event. The trace in the sand requires the sand to increase its entropy; and the trace in our brains requires our neuron structure to increase its entropy.

Adrian Bardon offers a summary of the principal account:

In forming a memory, we reconfigure our neurons. This creates a local increase in order (within parts of our brain responsible for memory), but only at the expense of a slight expenditure of energy, a dissipation of bodily heat, and an overall entropy increase. Therefore, on this account, the formation of memories is relative to the larger thermodynamic trend. Our brains getting themselves into better order happens within the context of the trend towards overall heat dissipation. In a universe where systems necessarily decrease in entropy, our brains couldn’t be getting themselves into better order. According to the theory, then, the psychological order is dependent on the entropic arrow—and is thus just as contingent as the entropic arrow (Bardon 2013 121).

b. The Cosmological Arrow

In 1937, Arthur Eddington coined the phrase “cosmological arrow” as the name for the mini-arrow of time produced by the relentless increase in the volume of the universe over time.

The most-favored explanation of the cosmological mini-arrow, and why it is directly correlated with the increase in time, involves dark energy. In 1998, cosmologists discovered the universal presence of dark energy. It exerts a small, negative, repulsive pressure on space making it expand everywhere. For billions of years, space has slowly been increasing the rate of this expansion, namely the rate at which clusters of galaxies expand away from each other. As time goes on, it will never stop expanding because dark energy never dilutes away, so when the volume doubles, so does the amount of dark energy (or so it is predicted, but this has never been experimentally or observationally established). Things might have started out differently, but they did not. This is the standard explanation of why there is a cosmological mini-arrow.

The physicist Richard Muller argued that this cosmological arrow grounds time’s arrow. Muller is in the intrinsic camp. He said the problem of time’s arrow is really the problem of “why time flows forward rather than backward.” And: “The flow of time consists of the continuous creation of new moments, new nows, that accompany the creation of new space” during cosmic expansion. So, the arrow of time is cosmic expansion.

Almost all cosmologists believe the Big Bang’s expansion is only of 3-D space and not 4-D spacetime. Muller challenged this popular position. He said, “The progression of time can be understood by assuming that the Hubble expansion takes place in 4 dimensions rather than in 3.” This is a version of the growing-block theory.

The usual assumption in cosmology is that 3D spatial expansion has no effect on the value of the universe’s entropy. According to Muller, this is not so. See his article, “Throwing Entropy under the Bus.” Penrose believes Muller’s proposal about entropy lacks promise. Penrose has said, “There is a common view that the entropy increase in the second law is somehow just a necessary consequence of the expansion of the universe…. This opinion seems to be based on…misunderstanding” (Penrose 2004 701).

George Ellis promoted the cosmological arrow of spatial expansion as the key to understanding time’s arrow. He advocated an intrinsic theory of time’s arrow via a growing-block theory in which:

The cosmological direction of time…is set by the start of the universe. There is no mechanism that can stop or reverse the cosmological flow of time, set by the start of the universe. It sets the direction of flow of the time parameter t…; time starts at t = 0 and then increases monotonically…. The gravitational equations…are time symmetric (because the Einstein equations are time symmetric), but the actual universe had a start. This broke the time symmetry and set the master arrow of time: the universe is expanding, not contracting, because it started off from a zero-volume state. It had nowhere to grow but larger….

A ‘past condition’ cascades down from cosmological to micro scales, being realized in many microstructures and setting the arrow of time at the quantum level by top-down causation. This physics arrow of time then propagates up, through underlying emergence of higher-level structures, to geology, astronomy, engineering, and biology. …The overall picture that emerges is one of the arrow of time in physical systems being determined in a top-down manner, starting off from a special initial condition at the cosmological scale where the cosmological arrow of time sets the basic direction of causation, but then emerging in complex systems through bottom-up causation… (Ellis 2013).

c. The Causal Arrow

Noting that causes happen before their effects, some researchers have suggested that time’s arrow and its mini-arrows can be explained or defined in terms of the causal mini-arrow. This causal theory is a bold proposal for solving the taxonomy problem.

Some philosophers believe that it is true by definition that causes precede their effects. Others disagree and say this definition inappropriately rules out backward causation because the existence or non-existence of backward causation should be an empirical matter, not a matter of definition. Everyone agrees, though, that normally causes do happen before their effects.

Tooley says that causes “fix” their effects in the sense of making them real. If so, where do uncaused events fit into the temporal order, or are uncaused events nonexistent?

One would like to know more specifically how cause-effect relations are tied to the arrow of time. There have been many suggestions. For example, in his 1956 book The Direction of Time, Hans Reichenbach advocated a causal theory of time. Like Leibniz, he believed time order reduces to causal order. Reichenbach believed that macroscopic causality produces a temporal ordering on events, although the ordering alone is insufficient for supplying time’s direction (that is, specifying which of the two possible orderings is actual). Think of a horizontal line. Its points are ordered from left to right but also ordered from right to left. Intrinsically the two orders have the same structure; one order is just as good as the other. What is needed in addition for distinguishing one direction in time from the other, says Reichenbach, is entropy flow in branch systems. His point is explained below. He does not rely on a hypothesis about entropy starting off at a minimum. For another causal theory of time, see chapters 10 and 11 of Real Time II by D.H. Mellor. For commentary on the effectiveness of the program of using causation to establish an ordering on time, see (Smart 1969) and “Time and Causation” by Mathias Frisch in (Dyke and Bardon 2013). Here are some highlights and background.

An important issue is whether causes exist at both the microlevel and macrolevel. The physicist Lee Smolin insists that time’s arrow is intrinsic to time and that causes exist at any scale, no matter how small. Sean Carroll disagrees. He argues that time’s arrow is extrinsic and that, at the microlevel, the fundamental laws of physics imply there is no distinction between past and future and thus no causality. Commenting upon the fact that particle physicists do speak of cause and effect when discussing the microlevel, Carroll says they are using a different notion of causality; they simply mean that signaling occurs slower than the speed of light and occurs only within light cones.

Many researchers have considered the concepts of cause and effect to be metaphysically dubious compared to the clearer concept of temporal order. Is it appropriate to assume that the concept of cause is even coherent? In the nineteenth century, the distinguished physicist Gustav Kirchhoff said the concept of cause is “infected with vagueness,” and Ernst Mach said the concept has no place in physics. In 1912, Bertrand Russell declared the concept to be a “relic of a bygone era” that is not useful in fundamental physics, and so physicists who aim to speak clearly about the foundations of physics should confine themselves to using differential equations and avoid causal discourse.

In the early twenty-first century, the philosophers Nancy Cartwright and David Albert and the computer scientist Judea Pearl argued for at least the coherence and usefulness of causal discourse. Pearl remarked, “Physicists write equations, but they talk cause and effect in the cafeteria.”

There is also the issue of whether the causal arrow is objective or subjective. The philosopher Huw Price (Price 1992) declared that causal discourse is “perspectival” or subjective, and causal order is not an objective asymmetry of nature. One implication of this is that backward causation is possible.

Assuming for the moment that the concept of causality is in fact coherent, consistent and an objective asymmetry in nature, how might it help us understand order relations for physical time? Some have said that to understand temporal precedence, it is sufficient to say:

Event C temporally precedes event E just in case event C could have been part of the cause of event E, the effect.

If this is correct, we can understand the “happens before” relation if we can understand the modal notion of “could have been part of the cause.” This proposal presupposes that we can be clear about how to distinguish causes from effects without relying on our knowledge of what happens before what. Can we? Mauro Dorato, among others, has argued that, “there is no physical property, attributable only to an event qua cause, that intrinsically (non-relationally) differentiates it from its effect” (Dorato 2000 S524). If causes can be distinguished from effects only by assuming causes happen before effects, then we have the fallacy of begging the question, which is a form of circular reasoning.

Here are some suggestions that have been offered to avoid this circularity. The first comes from Frank Ramsey, and it was adopted by Hans Reichenbach. Using a macroscopic concept of causality, we can know what causes what independently of knowing whether causes happen before effects by an:

Appeal to intervention:

Event C is a cause of event E if controlling C is an agent’s effective means of controlling E and not vice versa.

Appeal to probability:

One event is the cause of another if the appearance of the first event is followed with a high probability by the appearance of the second, and there is no third event that we can use to factor out the probability relationship between the first and second events and thereby declare the relationship to be spurious.

Appeal to conditional probability:

Fact C causes fact E if the chance of E, given C, is greater than the chance of E, given not-C.

Appeal to counterfactuals:

What “C causes E” means is that, if C had been different, but everything else had stayed the same, then E would have been different.

Appeal to possible worlds:

What “C causes E” means is that in a possible world like ours in which E doesn’t happen, C doesn’t happen.

Philosophers of physics must assess whether any of these appeals succeed, perhaps with revision.
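The appeal to probability and the appeal to conditional probability can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: the three binary variables F, C, and E, the numerical probabilities, and the function names are all hypothetical and do not come from the authors cited above. F plays the role of a common cause, such as a drop in air pressure, that independently makes both C (a falling barometer) and E (a storm) more likely. The code checks whether C raises the chance of E, and whether a third event F factors out that relationship so that it counts as spurious.

```python
from fractions import Fraction as Fr
from itertools import product

def joint():
    """Joint distribution over worlds (F, C, E) for a hypothetical common-cause setup."""
    p_f = Fr(3, 10)                                   # chance of the common cause F
    p_c = {True: Fr(9, 10), False: Fr(1, 10)}         # chance of C given F / given not-F
    p_e = {True: Fr(8, 10), False: Fr(1, 10)}         # chance of E given F / given not-F
    dist = {}
    for f, c, e in product([True, False], repeat=3):
        pf = p_f if f else 1 - p_f
        pc = p_c[f] if c else 1 - p_c[f]
        pe = p_e[f] if e else 1 - p_e[f]
        dist[(f, c, e)] = pf * pc * pe                # C and E are independent given F
    return dist

def prob(dist, event, given=lambda w: True):
    """Conditional probability P(event | given) computed from the joint distribution."""
    num = sum(p for w, p in dist.items() if given(w) and event(w))
    den = sum(p for w, p in dist.items() if given(w))
    return num / den

def raises_probability(dist):
    """Conditional-probability test: is P(E | C) greater than P(E | not-C)?"""
    p_e_c = prob(dist, lambda w: w[2], lambda w: w[1])
    p_e_not_c = prob(dist, lambda w: w[2], lambda w: not w[1])
    return p_e_c > p_e_not_c

def screened_off(dist):
    """True if, once F is held fixed, C makes no difference to the chance of E."""
    for f in (True, False):
        with_c = prob(dist, lambda w: w[2], lambda w: w[0] == f and w[1])
        without_c = prob(dist, lambda w: w[2], lambda w: w[0] == f and not w[1])
        if with_c != without_c:
            return False
    return True
```

With these assumed numbers, `raises_probability(joint())` is true, since P(E | C) is 223/340 while P(E | not-C) is 29/220, yet `screened_off(joint())` is also true. On the probability appeal as stated above, the relationship between C and E would therefore be declared spurious, which shows why the bare conditional-probability test cannot by itself separate causes from mere correlates.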

Some other scholars, such as Tim Maudlin, recommend not relying upon any of these appeals because causal order, and thus the distinction between cause and effect, is a primitive feature of the universe and cannot be defined or explained in terms of anything more fundamental. His claim successfully avoids the charge of circular reasoning, but it faces other problems involving how our knowledge of patterns of events, such as this kind of event being followed by that kind, ever produces our knowledge of causal relations.

For a detailed discussion of attempts to avoid the charge of circular reasoning when defining or explaining temporal precedence in terms of causes preceding their effects, see (Papineau 1996). See also (Woodward 2014). The philosophical literature on the nature of causation is voluminous, and here we touch briefly on only a few points, but a point of entry into the literature is this encyclopedia’s article on causation.

Can entropy increase be explained in terms of causality? Can the cosmological mini-arrow (the expansion of the universe) also be explained in terms of causality? These are difficult questions to answer positively, but some researchers are optimistic that this can be done as part of a broader program aimed at the taxonomy problem.

Even if temporal precedence can be explained in terms of causal order, there is an additional issue involving the intrinsic camp vs. the extrinsic camp. Many in the intrinsic camp say that once we have temporal precedence we can say the arrow is simply the transformation from past to future. Those in the extrinsic camp disagree. More is needed, they say. To explain the arrow, we need to explain why so many processes go one-way in time; they do not do this because of temporal precedence. Perhaps a story about entropy increase is required.

Instead of trying to define or explain time’s arrow in terms of the direction of causation, Ferrel Christensen suggested doing the reverse. Perhaps the features giving time its intrinsic arrow are what is responsible for the direction of causation. See (Christensen 1987) for more about this research program.

Or perhaps the features giving time its extrinsic arrow rather than intrinsic arrow are what is responsible for causation and its direction. That is a position taken by many members of the entropy camp. Sean Carroll offered a promissory note: “We should be able to trace the fact that causes precede effects to the fact that entropy is increasing over time” (with the help of the Past Hypothesis). He means all causes, not merely some causes, but only causes at the macroscopic level. As Carroll describes causality, one can distinguish causes from effects at the macrolevel because the causes have “leverage” over the future, and this does not work the other way in time. He explains leverage in terms of intervention by saying that causation occurs when a small change at one time produces a large change at a later time (and the converse fails). We are confident, he says, that intervening and making a small change to the effect would not have made a change to the cause, and this is so due to the presence of the arrow of time. Because of this leverage, we can say the small change is the cause and the large change is the effect. At the fundamental microscale, says Carroll, there are no causes and effects, just patterns of events that particles follow with no arrow of time being apparent. To summarize, Carroll’s position is that causes make sense only on the large scale, and causes must occur before their effects, and this is so because of the direction of time, which in turn is due to entropy increase. He provides a sixty-second video of his argument at “Do Causes and Effects Really Exist?”.

Huw Price, who also is an advocate of a causal theory of time, has objected to Carroll’s position:

I argue that the asymmetry of causation cannot be reduced to any of the available physical asymmetries, such as the second law of thermodynamics. The basic problem for such a reduction is that the available physical asymmetries are essentially macroscopic, and therefore cannot account for causal asymmetry in microphysics (Price 1996, pp. 9-10).

Many physicists do not agree with Price’s assumption that there is causal asymmetry in microphysics. Brian Greene, for example, insists that a causal relationship is an emergent, macroscopic phenomenon.

6. Living with Arrow-Reversal

What would it be like to live with time’s arrow going in reverse to the way it actually does go? Before examining the many proposed answers to this question, remember that the phrase “arrow going in reverse” is ambiguous. The intrinsic camp and extrinsic camp agree that time reversal implies arrow reversal, but members of the intrinsic camp understand arrow-reversal to be about the reversal of time’s intrinsic passage, and members of the extrinsic camp understand it to be about process reversal such as explosions becoming implosions.

Could we use a telescope to look back in time to some place and find time running in reverse there compared to our arrow? What would that look like? Or is the arrow always global?

Sean Carroll said, “The thought experiment of an entire universe with a reversed arrow of time is much less interesting than that of some subsystem of the universe with a reversed arrow. The reason is simple: Nobody would ever notice…. If the whole thing ran in reverse, it would be precisely the same as it appears now.”

Roger Penrose claimed an entire universe with a reversed arrow of time is quite interesting. He said that, if we lived there, then we would ascribe teleological effects to omelets assembling themselves into unbroken eggs or water droplets distributed across the floor along with nearby broken shards of glass assembling themselves into an unbroken glass of water. According to Penrose, “‘Look!’, we would say, ‘It’s happening again. That mess is going to assemble itself into another glass of water!’”

There is a significant amount of empirical evidence that some processes in distant galaxies unfold in the same time direction as they do here on Earth, and there is no contrary empirical evidence. For example, light signals are received only after they are sent, never before. Nevertheless, Horwich said: “I will defend the idea that the ‘directional character’ of time might vary from one region to another” (Horwich 1987 42). Boltzmann and Reichenbach tried to define the arrow locally, so they, too, supported the idea that the arrow could point in different directions in different regions.

How about the “directional character” of time pointing in different directions for different persons? Ferrel Christensen said:

Conceivably, then, the earlier-later asymmetry of common experience is limited to our region of time or of space. Indeed, suppose it were so highly spatially localized that different persons could have opposite time-senses: then one would remember events which for another are still in the future (Christensen 1987 232-3).

In 1902 in his Appearance and Reality, the British idealist philosopher and member of the intrinsic camp F.H. Bradley said that when time runs backward, “Death would come before birth, the blow would follow the wound, and all must seem irrational.” The philosopher J.J.C. Smart disagreed about the irrationality. He said all would seem as it is now because memory would become precognition, so an inhabitant of a time-reversed region would feel the blow and then the wound, just as in our normal region.

Stephen Hawking, also in the extrinsic camp with Smart, suggested in 1988 in A Brief History of Time:

Suppose, however, that God decided that…disorder would decrease with time. You would see broken cups gathering themselves together and jumping back onto the table. However, any human beings who were observing the cups would be living in a universe in which disorder decreased with time. I shall argue that such beings would have a psychological arrow of time that was backward. That is, they would remember events in the future, and not remember events in the past.

Hilary Putnam investigated the possibility of communication between our region of space with a normal arrow and a region with a reversed arrow:

Suppose…there is a solar system X in which the people “live backwards in time” (relative to us). Then if we go and spy on these people (bringing our own radiation source, since their sun sucks radiation in, and doesn’t supply it), we will see the sort of thing we see when we watch a motion picture run backwards…. It is difficult to talk about such extremely weird situations without deviating from ordinary idiomatic usage of English. But this difficulty should not be mistaken for a proof that these situations could not arise.

Tim Maudlin disagreed with Putnam and argued that there is a convincing argument that these situations could not arise. Assuming naturalism and the supervenience of the mental on the physical, and introducing a creative use of the asterisk symbol, Maudlin said:

[G]iven the actual sequence of physical states of your body over the last ten minutes, the time-reversed sequence of time-reversed states is also physically possible…. Let’s call this sequence of states your time-reversed Doppelgänger. […Introducing an asterisk notation] I will speak of the Doppelgänger’s neuron*s; these are just the bits of the Doppelgänger which correspond, under the obvious mapping, to the original’s neurons. …[T]he physical processes going on the Doppelgänger’s brain* are quite unlike the processes going on in a normal brain. …The visual system* of the Doppelgänger is also quite unusual: rather than absorbing light from the environment, the retina*s emit light out into the environment. …In every detail, the physical processes going on in the Doppelgänger are completely unlike any physical processes we have ever encountered or studied in a laboratory, quite unlike any biological processes we have ever met. We have no reason whatsoever to suppose that any mental state at all would be associated with the physical processes in the Doppelgänger. Given that the Doppelgänger anti-metabolizes, etc., it is doubtful that it could even properly be called a living organism (rather than a living* organism*), much less conscious living organism.

Norbert Wiener claimed any attempt to communicate between the normal region and the arrow-reversed region would “ruin everything” because one of the regions would rapidly collapse—the one that is very delicately balanced so that the entropy flows in reverse compared to our region. Sean Carroll agreed. A microstate that leads to entropy decrease is extraordinarily unstable under small perturbations, and entropy increase would take over again very quickly.

Sean Carroll proposed an argument against there actually being any time-reversed regions. Throughout the universe, cosmic rays continually hurtle from one region into another, so if there were a time-reversed region it would continually be encountering cosmic rays, but they would be anti-particles for that region. However, any encounter between particles and anti-particles creates large releases of energy, much larger than the energy arriving on Earth from any distant star. Those large releases have never been observed, but they would have been if they existed.

7. Reversibility and Time-Reversal Symmetry

a. Summary

Understanding time-reversibility and time-reversal symmetry can be difficult because the terms are often mistakenly assumed to always be synonyms. Here is a summary of what is explained in the next section in more detail.

Physicists and philosophers of physics are interested in whether the universe is reversible, that is, time-reversible. If it is, then from knowledge of the present state of a closed and isolated system, Laplace’s Demon could both predict its future and retrodict its past. (If the system is not closed, then we also need to know what is crossing the system’s boundary and all the external forces acting on the system.) We can do all this predicting and retrodicting because of the conservation of information in a closed and isolated system. So, in the main sense of “reversibility,”

reversibility ⇔ conservation of information.

The logic symbol “⇔” represents “if and only if.” What this means, anthropomorphically expressed, is that a closed and isolated system remembers where it came from and knows where it is going. For example, reversibility implies that, if your apartment burns down, then the information in the smoke, heat, light, and charred remains is sufficient for reconstructing your apartment exactly. The apartment information is preserved, though it is not accessible practically.

Even if the universe actually is reversible, we would never notice it because all our experience is of one-way macroscopic processes that are statistically irreversible. We may have experienced a burning apartment but never an unburning apartment. More generally, if time-reversibility holds in the universe in the sense of information conservation, then when a closed and isolated system changes from being in an instantaneous state 1 at time 1 to a new instantaneous state 2 at any other time 2 according to the fundamental laws, those same laws imply that, if the system were in state 2 at time 2, then it must have been in state 1 at time 1.

Time-reversibility is sometimes called time-symmetry, but the two terms are not synonymous if by “time-symmetry” we mean time-reversal symmetry. This is because,

time-reversibility ⇒ time-reversal symmetry.

The first implies the second, but the second does not imply the first. The reason for this is explained simply in (Carroll 2022c, 128).

Time-reversal symmetry implies that, if any process goes one way, it could in principle go the other way without violating any fundamental laws, even if it would violate a less fundamental law such as the second law of thermodynamics.

There is general agreement that,

Time-reversal symmetry ⇔ time-reversal invariance under the time-reversal operation T.

But the term “operation T” is ambiguous. The intrinsic theory says it is reversing time. The extrinsic theory says T is reversing the dynamics and not time itself. That is why advocates of the extrinsic theory complain that it is misleading to call T the “time-reversal operation.” Nevertheless, adherents of the extrinsic theory continue to call it that because it has been customary to do so.

Intrinsic T ⇔ time runs in reverse.

Extrinsic T ⇔ all processes run in reverse.

The extrinsic T is about event-order reversal, such as replacing the time variable t by its negation -t when applying the fundamental laws. Informally, for an adherent of the extrinsic theory of time’s arrow, a theory is time-reversal symmetric just in case its laws do not indicate a time direction.

Ever since Isaac Newton created what is now called classical mechanics, time-reversal symmetry was considered by scientists to be an exact feature of the fundamental laws and to be a brute feature of nature. It was presumed to be a universal symmetry that would hold in all fundamental theories. Then in the 20th century, a direct failure of time-reversal symmetry was detected. Philosophers of physics immediately disagreed about the significance of this discovery. Some physicists and philosophers claimed the arrow of time had been found in nature. Others claimed instead that this surprising discovery is very interesting but irrelevant to the problem of explaining the arrow of time. Members of the entropy camp claimed that whether time-reversal symmetry holds or does not hold is irrelevant to time’s arrow because the symmetry has nothing to do with entropy.

b. More Details

What complicates the discussion of this topic is that often two people in disagreement will give different senses to the same term without explicitly mentioning this and perhaps not noticing it.

Let us delve into the details. Symmetry is about actions or operations that conserve something. Symmetry is a special kind of pattern. It is a pattern of staying the same (staying invariant or not varying) in an important way despite a certain kind of change. If a symmetry is preserved under a transformation (that is, change, operation, and so forth), then it is said to be invariant under that transformation.

Asymmetry is not merely lack of symmetry. It has special features, too, as we shall see below. Symmetry and asymmetry are contraries of each other, not contradictories of each other. For example, on the integers, the “=” relation is symmetric, the “<” relation is asymmetric, and the “≤” relation is neither.

It is often claimed that the group of instants wouldn’t really be time if they weren’t asymmetric (so that, if instant a happens before instant b, then instant b cannot happen before instant a). However, this requirement would rule out closed time loops a priori. To prevent this and make the question of whether time can circle back on itself be an empirical question, physicists will not require asymmetry for all of time’s instants, but only for smaller segments of time, “neighborhoods” of instants.

We won’t pause here to explore the many kinds of symmetry and asymmetry. Instead, this section focuses on the two most relevant symmetries for time: time-translation symmetry and time-reversal symmetry.

Time-translation symmetry or time-translation invariance (which is the same thing), means the fundamental laws of nature do not depend on what time it is, and they cannot change as time goes by. For example, your health might change as time goes by, but the fundamental laws underlying your health that held last year are the same as those that hold today. This translation symmetry property of time implies time’s homogeneity, that all instants are equivalent. This feature of time can be expressed using the language of coordinate systems by saying that replacing the time variable t everywhere in a fundamental law by t + 4 does not change what processes are allowed by the law. The choice of “4” was an arbitrary choice of a real number. Requiring the laws of physics to be time-translation symmetric was proposed by Isaac Newton. The theory of General Relativity is completely time-translation invariant. There is no special time in the universe according to general relativity.
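The substitution of t + 4 for t can be illustrated with a short worked equation. This is only a sketch under an assumption that is not in the text above: a single particle of mass m obeying Newton’s second law with a force F that depends on position alone. The symbols x, F, c, and the shifted trajectory x̃ are introduced here purely for illustration, with c standing in for the arbitrary number 4.

```latex
m\,\frac{d^{2}x}{dt^{2}}(t) = F\bigl(x(t)\bigr)
\quad\text{and}\quad
\tilde{x}(t) := x(t+c)
\;\Longrightarrow\;
m\,\frac{d^{2}\tilde{x}}{dt^{2}}(t)
  = m\,\frac{d^{2}x}{dt^{2}}(t+c)
  = F\bigl(x(t+c)\bigr)
  = F\bigl(\tilde{x}(t)\bigr).
```

So the shifted trajectory is allowed by the law whenever the original trajectory is. If the force depended explicitly on the time, as in F(x, t), the last step would generally fail, and the law would single out particular times.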

Members of the intrinsic camp who advocate this symmetry would say the symmetry exists because time passes the same regardless of whether it is Sunday or Monday. Members of the extrinsic camp who advocate time-translation symmetry would say, instead, that the symmetry exists because the allowable fundamental physical processes stay the same regardless of whether it is Sunday or Monday. That is, the fundamental laws are the same on Sunday and Monday. Members of both camps agree that specific physical systems within space-time need not have this symmetry. For example, your body could be healthy on Sunday but not Monday.

Some physicists believe time-translation symmetry fails, and they offer two reasons. First, assuming the universe had a first moment, such as at the big bang, then the first moment is distinguished from all later moments because it is the only moment without a predecessor. Second, the fundamental laws we have now will change in the future. However, there is no empirical evidence of either symmetry failure; each is merely an educated guess.

Another symmetry, time-reversal symmetry, is the more important symmetry for discussions of time’s arrow, although many experts say it, too, is irrelevant to the arrow. Unsurprisingly, the two camps give different senses to the term “time-reversal symmetry.”

The camps agree that time-reversal symmetry is the property of invariance under the time-reversal transformation T, but they do not agree on what the time-reversal transformation T is. The intrinsic camp says it is about time running in reverse, but the extrinsic camp says it is about fundamental processes running in reverse and not time itself running in reverse, that is, a theory is time-reversal symmetric only if the reversal transformation T always transforms a possible process into another possible process.

Both camps agree that time-reversal symmetry is not about time’s stopping and then reversing, and both agree that you cannot run experiments backward in time, regardless of the funding you are given. But the two camps disagree about whether the universe’s being time-reversal asymmetric implies that time is intrinsically asymmetric. The intrinsic camp says it does. The extrinsic camp says it does or does not depending upon how you define T.

A naive definition of T is violated in rare cases of particle decay involving the weak force, but some physicists point out that perhaps there is a more sophisticated definition that says we should re-define “time-reversal” to include charge reversal and parity reversal so that there is still time-reversal symmetry in this new, less-naive sense of symmetry under CPT reversal.

To explain and explore these issues, let’s start with what it means for a binary relation to be symmetric. Binary relations are two-argument or two-place relations.

A binary relation R on a set S of objects is symmetric if and only if, for any members of S, such as x and y, if xRy, then also yRx.

A binary relation R on a set S of objects is asymmetric if and only if, for any members of S, such as x and y, if xRy, then not also yRx.

Symmetry and asymmetry are contraries of each other, so failure of symmetry does not automatically produce asymmetry.
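The two definitions above can be checked mechanically on a small sample. The following is a minimal sketch, assuming a finite range of integers as the set S; the function names `is_symmetric` and `is_asymmetric` and the sample domain are illustrative choices, not part of any standard library. A finite check illustrates the definitions but does not prove anything about all the integers.

```python
import operator

def is_symmetric(relation, domain):
    """True if, whenever x R y holds on the domain, y R x holds as well."""
    return all(relation(y, x)
               for x in domain for y in domain if relation(x, y))

def is_asymmetric(relation, domain):
    """True if, whenever x R y holds on the domain, y R x fails."""
    return all(not relation(y, x)
               for x in domain for y in domain if relation(x, y))

# Assumed sample domain: a few integers around zero.
domain = range(-3, 4)

relations = [("=", operator.eq), ("<", operator.lt), ("<=", operator.le)]
results = {name: (is_symmetric(rel, domain), is_asymmetric(rel, domain))
           for name, rel in relations}

for name, (sym, asym) in results.items():
    print(f"{name:>2}: symmetric={sym}, asymmetric={asym}")
```

On this sample, equality comes out symmetric, less-than comes out asymmetric, and less-than-or-equal comes out as neither, which is what it means for symmetry and asymmetry to be contraries rather than contradictories.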

If the binary relation of time order were to be asymmetric, this would rule out time-travel loops or closed time-like curves. The reason why is that, if the universe in state A eventually evolves into state B that is the same as A, then it will do it again and again, and state A will be both before and after state B, showing that time order cannot be asymmetric.

The time axis in a coordinate system is always asymmetric, and so is any space axis:

The time coordinates are asymmetric if and only if the binary relation “temporally less-than” is asymmetric on the set of time coordinates of point-events.

No disagreement here between the two camps.

A coordinate system is what the analyst places on a reference frame to help specify locations in the frame. Time coordinates are locations in time of point-events. “Noon” is a time coordinate, but it is an ambiguous one because it might be noon today or noon tomorrow. For mathematical usefulness, we want both our space and time axes and their parallel coordinate lines to be asymmetric regardless of whether space and time themselves are asymmetric. We ensure this usefulness via the choice of coordinate system that we add to a mathematical line. The geometric line without coordinates has no privileged direction. A continuous line from point x to point y is also a continuous line from y to x. Real numbers, the decimals, make good time coordinates on a mathematical line because they are continuous and because in their natural order they have a greater-than direction that orients the points along the line, and because, if a point-event is moved continuously in time, its time coordinates also change continuously. These features are crucial for an efficient application of calculus.

In describing time-reversal asymmetry (or anisotropy), the intrinsic camp says it crucially involves intrinsic properties of time, and the extrinsic camp says it crucially involves only extrinsic properties. Here is the definition that the intrinsic camp uses:

(Intrinsic definition) Time is asymmetric if and only if time does not run backward.

Asked what that means, they are likely to say:

Time does not run backward (in any time period) if and only if the “happens before” binary relation on the set of non-simultaneous point-events is asymmetric (in any time period).

The reason given for why we should believe the “happens before” relation is asymmetric (and so time always runs one way and never runs backward) would likely mention the one-way character of passage or flow or becoming or perhaps now-ness running like fire along the fuse of time toward the future, with these being objective features of the universe that are described a bit vaguely.

If time fails to be asymmetric, this might be because it does not flow one way on Tuesday nights or does not flow one way on Mars, but otherwise flows one way. Those strange cases would imply failure of both symmetry and asymmetry.

Members of the extrinsic camp say any definition of time being asymmetric should not rely on the intrinsic character of the “happens before” relation. Instead, the definition should state whether every process that is described by the fundamental laws could have run backward.

Most members of the extrinsic camp prefer definitions such as these:

(Extrinsic definition) Time is symmetric if and only if all the fundamental theories are time-reversal symmetric.

(Extrinsic definition) A theory is time-reversal symmetric (or time-reversal invariant) if and only if, whenever a history, such as a sequence of instantaneous states S1, S2, …, Sn, is physically possible according to that theory, then the temporally inverted sequence T(Sn), T(Sn-1), …, T(S1) is also possible according to that theory.

n is a positive integer. T is the so-called “time-reversal operator” that reverses the dynamics, but not time itself. For any particle, its position is the same in T(Si) and Si, but its velocity and momentum are reversed. Applying T twice must return the original state; that is, T(T(S)) = S for every state S, so T composed with itself is the identity operator. The point of creating the definition is to make more precise what it means to say “the same thing is happening backwards in time.” A theory is time-reversal invariant if and only if the time-reversal transformation turns solutions into solutions and non-solutions into non-solutions of the theory’s equations. (Aside: Saying “Time is invariant” and “Time is not invariant” makes no sense. Also, note that the definitions above do not refer to a coordinate system.)
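The extrinsic definition can be tried out on a toy example. The following is a minimal sketch, not a model of any real physical theory: it assumes a hypothetical “theory” whose states are pairs (position on a ring of N sites, velocity of +1 or -1) and whose only law is a deterministic update rule. The names `free_motion`, `conveyor`, and `is_time_reversal_symmetric` are invented for this illustration.

```python
from itertools import product

N = 5  # hypothetical ring with N positions
STATES = list(product(range(N), (-1, 1)))  # (position, velocity) pairs

def T(state):
    """Time-reversal operation: same position, reversed velocity."""
    position, velocity = state
    return (position, -velocity)

def free_motion(state):
    """Toy law 1: the particle moves one site in the direction of its velocity."""
    position, velocity = state
    return ((position + velocity) % N, velocity)

def conveyor(state):
    """Toy law 2: the particle is carried one site forward whatever its velocity."""
    position, velocity = state
    return ((position + 1) % N, velocity)

def histories(rule, length):
    """All histories S1, ..., Sn of the given length that the rule allows."""
    allowed = set()
    for start in STATES:
        history = [start]
        for _ in range(length - 1):
            history.append(rule(history[-1]))
        allowed.add(tuple(history))
    return allowed

def is_time_reversal_symmetric(rule, length=4):
    """Check that T(Sn), ..., T(S1) is allowed whenever S1, ..., Sn is allowed."""
    allowed = histories(rule, length)
    return all(tuple(T(s) for s in reversed(h)) in allowed for h in allowed)

print(is_time_reversal_symmetric(free_motion))  # free motion passes the test
print(is_time_reversal_symmetric(conveyor))     # the one-way conveyor fails it
```

The conveyor law fails because its rule itself picks out a direction: the reversed history would require the particle to move backward along the ring, which that law never allows.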

A state at some instant is a complete description at that instant of the universe or of whatever system is the focus of attention. A possible history is a sequence of instantaneous states. A dynamical law, as opposed to other kinds of laws, says how one state is to be updated to another state over time. Assuming the dynamical laws are not probabilistic, the dynamical laws determine what new states will occur. A state is often called a configuration, so when a system changes, what changes is its configuration. All states are mutually exclusive in the sense that, if a system is in one state, then it cannot also be in another state at the same time. All these ideas are often summarized by saying that, for the extrinsic theory, a scientific theory is time-reversal symmetric if its laws do not indicate a time direction.

Reversal in the extrinsic sense for a classical system obeying Newton’s mechanics works like this. For arbitrary times t1 and tn, suppose the physical system evolves over a sequence of times from an initial time t1 to a final time tn. Time-reversal invariance, according to the extrinsic theory, implies that taking the state at tn and reversing all the velocities (or momenta) of the constituent particles (not only the speed but also the direction of motion), then evolving the system back to the initial time t1 according to the laws, and then flipping all the velocities (or momenta) again, will produce the original state at t1. What was done is undone. If a person is walking forward up the steps in state, say, S3, with a velocity v3, then in T(S3), the result of applying the time-reversal transformation to S3, that person is walking backward down the steps at velocity -v3, which is what you’d see if the film of the sequence were shown to you in reverse. Reversing a velocity requires changing the direction of the velocity while keeping the speed the same. Under the time-reversal transformation, gravity does not become repulsive, nor do two north poles of magnets attract each other, but light rays do change their direction. The fundamental laws stay the same. It is only the instances of sequences of actual processes that reverse.
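The procedure of reversing the velocities, evolving by the same law, and reversing the velocities again can be sketched numerically. This is a minimal illustration under assumptions that are not in the text: a single particle moving vertically under a constant downward acceleration, updated in discrete time steps, with exact rational arithmetic so that rounding error does not blur the result. The values of `G`, `dt`, and the initial state are arbitrary.

```python
from fractions import Fraction

G = Fraction(-98, 10)  # assumed constant acceleration; it is NOT flipped by T

def step(state, dt):
    """Advance (position, velocity) by one time step under constant acceleration G."""
    x, v = state
    return (x + v * dt + G * dt * dt / 2, v + G * dt)

def evolve(state, dt, n_steps):
    """Apply the same dynamical law n_steps times."""
    for _ in range(n_steps):
        state = step(state, dt)
    return state

def T(state):
    """Time-reversal operation: keep the position, reverse the velocity."""
    x, v = state
    return (x, -v)

dt = Fraction(1, 100)
initial = (Fraction(0), Fraction(20))  # thrown upward at 20 units per second

final = evolve(initial, dt, 150)               # run the process forward
recovered = T(evolve(T(final), dt, 150))       # flip, evolve by the same law, flip again

print(recovered == initial)
```

The acceleration keeps its sign throughout, which mirrors the remark that gravity does not become repulsive under the time-reversal transformation. Only the state is transformed, not the law.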

So far, this is the treatment in classical mechanics. But this treatment is naive, some say. What requires reversing depends in general on which theory is being used. For classical electromagnetism, the magnetic field also needs to be reversed. If we include other theories, such as all those within the Standard Model of particle physics, then several other things need to be reversed, such as charge and parity, in order to preserve symmetry. On this view, to preserve time-reversal symmetry we should redefine time reversal to mean CPT reversal. Others complain that this redefinition is an ad hoc maneuver.
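Why the magnetic field must also be reversed can be seen in a short worked equation. This is a sketch that assumes the standard Lorentz force law for a point particle of mass m and charge q in given fields E and B. The primed quantities are notation introduced here for the time-reversed trajectory and fields.

```latex
% Original trajectory obeys the Lorentz force law:
m\,\ddot{\mathbf{x}}(t) = q\bigl[\mathbf{E}(\mathbf{x}(t),t)
      + \dot{\mathbf{x}}(t)\times\mathbf{B}(\mathbf{x}(t),t)\bigr]

% Time-reversed trajectory and fields:
\mathbf{x}'(t) := \mathbf{x}(-t), \qquad
\dot{\mathbf{x}}'(t) = -\dot{\mathbf{x}}(-t), \qquad
\ddot{\mathbf{x}}'(t) = \ddot{\mathbf{x}}(-t)

\mathbf{E}'(\mathbf{x},t) := \mathbf{E}(\mathbf{x},-t), \qquad
\mathbf{B}'(\mathbf{x},t) := -\mathbf{B}(\mathbf{x},-t)

% Then the reversed trajectory obeys the same law in the primed fields:
m\,\ddot{\mathbf{x}}'(t)
  = q\bigl[\mathbf{E}(\mathbf{x}(-t),-t)
      + \dot{\mathbf{x}}(-t)\times\mathbf{B}(\mathbf{x}(-t),-t)\bigr]
  = q\bigl[\mathbf{E}'(\mathbf{x}'(t),t)
      + \dot{\mathbf{x}}'(t)\times\mathbf{B}'(\mathbf{x}'(t),t)\bigr]
```

The velocity changes sign under the reversal, so the cross product keeps its value only if B changes sign as well. Without that extra flip, the reversed trajectory would not be a solution.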

We are trying to find a proper definition for what it is to reverse the dynamics. To properly describe time-reversal symmetry, some argue that more steps are involved than merely sending t to -t. To handle all our fundamental theories in physics, a less naive time-reversal operator T requires also simultaneously reversing momentum, charge, and parity. For a deeper treatment of the details of the time-reversal operator, see (North 2008).

For simplicity, the above definition of being time-reversal symmetric assumed a digital sequence of times for the process’ evolution even though in the real world the process may evolve continuously. That assumption does not affect the philosophical points being made. For how to define T in other theories such as electromagnetic theory and quantum field theory, see (Earman 2002). Another assumption was that we were using systems of identical point particles. In less ideal systems, the state needs to specify the mass, orientation and angular momentum of each elementary particle. For an even better treatment, the state needs to be specified in terms of Schrödinger’s quantum wave function.

Time-reversal symmetry of all the fundamental laws implies determinism because the final state can be in the future, but determinism does not imply time-reversal symmetry. A deterministic theory can allow two different present states to evolve into the same future state. The theory of relativity is deterministic and time-reversal symmetric (except where singularities are involved). So is statistical mechanics. So is the evolution of the wave function in quantum mechanics, assuming no measurement is being made. On the Copenhagen Interpretation of quantum mechanics, the evolution is indeterministic because the wave function collapses instantaneously during a measurement.
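The remark that a deterministic theory can allow two different present states to evolve into the same future state can be illustrated with a toy example. The following is a minimal sketch, assuming a hypothetical system with eight states labeled 0 to 7 and two invented update rules. Both rules are deterministic, but only one allows the past state to be recovered from the present one.

```python
STATES = list(range(8))  # hypothetical eight-state system

def shift(state):
    """Deterministic and one-to-one: every state has a unique predecessor."""
    return (state + 3) % 8

def halve(state):
    """Deterministic but many-to-one: states 2k and 2k+1 both evolve into k."""
    return state // 2

def conserves_information(rule, states):
    """True if no two distinct states share a successor, so retrodiction is possible."""
    successors = [rule(s) for s in states]
    return len(set(successors)) == len(states)

print(conserves_information(shift, STATES))  # distinct states stay distinct
print(conserves_information(halve, STATES))  # distinct states merge
```

Under the `halve` rule, knowing that the system is now in state 1 does not reveal whether it was previously in state 2 or state 3, so prediction is possible but retrodiction is not.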

For several hundred years after the acceptance of Newtonian mechanics, all the fundamental laws of physics were believed by all physicists to be time-reversal symmetric (assuming the quantum measurement process is an exception that is not covered by these laws). This was envisioned as being simply a very precise and very fundamental feature of reality. Then, in 1964, experimental results by Cronin and Fitch suggested that the decay of some long-lived K0 mesons (called neutral kaons) violated time-reversal symmetry (what those above called “naive” time-reversal symmetry). The term “long-lived” here means about 5 × 10⁻⁸ seconds. The two experimenters were awarded the Nobel Prize for this result. In a Nobel ceremony speech, Gösta Ekspong described the impact of this result on our understanding of the arrow of time:

The laws of physics resemble a canon by Bach. They are symmetric in space and time. They do not distinguish between left and right, nor between forward and backward movements. For a long time everyone thought it had to be like that…. [Cronin and Fitch’s] discovery…implied consequences for time reflection. At least one theme is played more slowly backwards than forwards by Nature.

Later experiments more carefully confirmed this failure of time-reversal symmetry by using some decays of B mesons and the oscillation of electron-neutrinos. Failure of symmetry, however, does not entail asymmetry because symmetry and asymmetry are contraries.

These rare exceptions to time-reversal symmetry are brute facts of nature that cannot be deduced from the standard model of particle physics. Many philosophers of physics argued that these rare exceptions of time-reversal symmetry are not relevant to explaining time’s arrow; surely they have nothing to do with, say, eggs breaking and never unbreaking and people getting biologically older and never younger. For more about the negative reactions within the philosophical community to these experimental results, see (Roberts 2022, especially p. 20). Roberts argues that these experimental results can be a foundation for time’s arrow.

Although the failure of time-reversal symmetry occurs very rarely and, according to most researchers, has no significant effects on ordinary objects or our ordinary lives outside the physics laboratory, one cannot avoid the conclusion that there are fundamental phenomena involving the weak nuclear force that are sensitive to time’s direction and to the binary relation “happens before.” Those who accept what might be called the “single exception assumption”—the assumption that the existence of even a single time-asymmetric fundamental law of nature is a sufficient condition for a fundamental mini-arrow and for time overall to fail to be symmetric—are committed to the claim that our universe’s laws are actually not all time-reversal symmetric.

Others say the above treatment is naive. They say time-symmetry fails only for a naive version of time-reversal symmetry, one that fails to incorporate C and P reversal. Time-reversal should be redefined to mean CPT reversal, which has never been violated. This is a combination of three transformations: charge reversal, parity reversal, and time reversal. C is charge replacement (including replacing particles by their anti-particles). P is parity reversal (reversing handedness or becoming your mirror reflection), and T is (naive) time-reversal. Still others complain that this is an ad hoc rescue that corrupts what we mean by “time-reversal.”

There is still another reaction to the discovery of exceptions to so-called naive time-reversal. Some researchers do not accept the single exception assumption, believing it overemphasizes trivial exceptions, and they affirm that the fundamental laws are time-reversal symmetric even in what was called the naive sense. The researchers recommend covering up trivial exceptions so as not to mislead the reader with insignificant complications; they say the laws of fundamental physics are “not significantly sensitive to time’s direction.”

To point out another complication that involves properly defining terms, there is a second notion called reversibility or sometimes time-reversibility. Failure to distinguish it from time-reversal symmetry under the T transformation occasionally leads to confusion. Reversibility says the information of a closed system is conserved over time. This implies time-reversal invariance, but the converse does not hold.

Back to the arrow. According to a member of the extrinsic camp:

Time-reversal invariance T has nothing to do with the arrow of time. That feature of particle physics [namely T], while important in its own right, leaves reversibility intact. The arrow of time stems from the fact that the macroscopic world does not appear reversible, even though the microscopic world seemingly is…What, then, is responsible for irreversibility, and thus for time’s arrow? The ultimate answer lies in the fact that the entropy of a closed system, including the universe as a whole, tends to increase over time (Carroll 2022c, 127 and 129).

8. References and Further Reading

  • Albert, David Z. 2000. Time and Chance. Harvard University Press. Cambridge, MA.
    • A technical treatise surveying the philosophical and physical issues involving time’s direction. The book never uses the word “arrow.” Reading between the lines, he says time has no intrinsic arrow, and that the arrow is basically due to processes taking place over time, but he is not optimistic that all the mini-arrows can be explained in terms of entropy change. On p. 11, Albert defines what it is for something to happen backward. On p. 20, he says, “classical electrodynamics is not time-reversal invariant.” Chapter 4 introduces his Past-Hypothesis that he calls a “fundamental…law of nature.” Albert describes a connection between the problem of time’s direction and the measurement problem in quantum mechanics.
  • Arntzenius, Frank. 1997. “Mirrors and the Direction of Time.” Philosophy of Science, December, Vol. 64, Supplement. Proceedings of the 1996 Biennial Meetings of the Philosophy of Science Association. Part II: Symposia Papers. The University of Chicago Press, pp. S213-S222.
    • Challenges an argument he had made two years earlier that if even one of the laws of nature is not time-reversal symmetric, then that is all that is required for us to infer that time has an objective direction. Assumes familiarity with quantum mechanics.
  • Arntzenius, Frank and Hilary Greaves. 2009. “Time Reversal in Classical Electromagnetism,” The British Journal for the Philosophy of Science, Volume 60, Number 3, pp. 557-584.
    • Surveys the debate between David Albert and David Malament regarding what time-reversal means, especially whether it always means reversing the order of states in a trajectory.
  • Augustynek, Zdzisław. 1968. “Homogeneity of Time,” American Journal of Physics, 36, pp. 126-132.
    • A discussion of the physical equivalence of all time’s instants and the principles of invariance and symmetry involving time. The author worries about whether the principles of time symmetry are analytic or synthetic. Is time’s symmetry tautological or empirical? Explains why the principle of time’s symmetry implies, via Noether’s Theorem, the principle of the conservation of energy. Aimed at an audience of professional physicists.
  • Barbour, Julian B. 2020. The Janus Point: A New Theory of Time. Basic Books, New York.
    • Contains an argument that the Past Hypothesis is a necessary consequence of a new fundamental law of the universe yet to be discovered.
  • Bardon, Adrian. 2013. A Brief History of the Philosophy of Time. Oxford University Press.
    • Chapter five offers a brief analysis of the relationships among the psychological arrow, the causal arrow, and the entropic arrow.
  • Black, Max. 1959. “The Direction of Time.” Analysis, Vol. 19, No. 3, pp. 54-63.
    • Contains this philosopher’s proposal to explain the direction of time in terms of the objectivity of the truth values of ordinary language statements involving the temporal relation is-earlier-than.
  • Bourne, Craig. 2002. “When Am I? A tense time for some tense theorists?” Australasian Journal of Philosophy, 80, pp. 359-371.
    • Criticizes the growing-block model for its inability to distinguish our own objective present.
  • Braddon-Mitchell, David and Kristie Miller. 2017. “On Time and the Varieties of Science.” Boston Studies in the Philosophy and History of Science, vol. 326, pp. 67-85.
    • A study of how physics and the other sciences should work together to understand time. The authors say, “The special sciences…tell us where, amongst a theory of the physical world, we should expect to locate phenomena such as temporality; they tell us what it would take for there to be time. Physical theory tells us whether there is anything like that in the world and what its hidden nature is.”
  • Broad, Charlie Dunbar. 1923. Scientific Thought. London: Kegan Paul.
    • C.D. Broad describes a version of the moving spotlight theory, a growing-block theory.
  • Broad, Charlie Dunbar. 1938. Examination of McTaggart’s Philosophy, Volume II. Cambridge University Press.
    • Examines McTaggart’s proposals, including the existence of a universal now. Oaklander has written extensively on Broad’s treatment of time and how it changed during his lifetime. Broad’s 1938 position is considered to be his clearest and most defensible treatment.
  • Callender, Craig. 1998. “Review: The View from No-When” in The British Journal for the Philosophy of Science, Vol. 49, March. pp. 135-159.
    • This is a review of Huw Price’s book Time’s Arrow and Archimedes’ Point: New Directions for the Physics of Time. He says Price aims to answer the question: What does the world look like when we remove the effects of our temporally asymmetric prejudices?
  • Callender, Craig. 1999. “Reducing Thermodynamics to Statistical Mechanics: The Case of Entropy.” Journal of Philosophy vol. 96, pp. 348-373.
    • Examines the issue of how to explain thermodynamics in terms of statistical mechanics. The techniques of statistical physics are needed when systems are so complicated that statistical features are more useful than exact values of the variables—for example the statistical feature of average kinetic energy that is called temperature is more useful than trying to acquire knowledge of the position at a time of this or that molecule. From 1902 to 1905, Einstein worked unsuccessfully to derive the 2nd Law from basic physical features.
  • Callender, Craig. 2004. “There is No Puzzle About the Low Entropy Past.” In Contemporary Debates in Philosophy of Science, edited by C. Hitchcock, pp. 240-55. Malden: Wiley-Blackwell.
    • Explores some critical comments made about the Past Hypothesis.
  • Callender, Craig. 2017. What Makes Time Special? Oxford University Press, Oxford, U.K.
    • A comprehensive monograph on the relationship between the manifest image of time and the scientific image. He claims philosophers who defend parts of the manifest image have created all sorts of technical models (that is, theories) that try to revise and improve the scientific image. According to Callender, “These models of time are typically sophisticated products and shouldn’t be confused with manifest time. Instead, they are models that adorn the time of physics with all manner of fancy temporal dress: primitive flows, tensed presents, transient presents, ersatz presents, Meinongian times, existent presents, priority presents, thick and skipping presents, moving spotlights, becoming, and at least half a dozen different types of branching! What unites this otherwise motley class is that each model has features that allegedly vindicate core aspects of manifest time. However, these tricked out times have not met with much success” (p. 29). Chapter 11 is devoted to the flow of time.
  • Carroll, Sean. 2008. “The Cosmic Origins of Time’s Arrow.” Scientific American.
    • Describes the thermodynamic arrow and speculates that to solve the problem of the direction of time, one should accept a multiverse in which in some universes time runs in reverse to how it does in ours.
  • Carroll, Sean. 2010. From Eternity to Here: The Quest for the Ultimate Theory of Time. Dutton/Penguin Group: New York.
    • A popular, lucid, and deep presentation of what can be learned from current science about the nature of time. Of all Carroll’s popular publications, this is the one that has the most to say about the arrow of time.
  • Carroll, Sean. 2016. The Big Picture. Dutton/Penguin Random House. New York.
    • “The parts address how a complex universe can emerge from basic physical laws, how we can discover these laws, what we already know about them, and what implications they have for the evolution of life, and for consciousness, and for human values,” says David Kordahl in his review in The New Atlantis. Carroll explains how entropy can rise even as a system becomes less complex.
  • Carroll, Sean. 2019. “Sean Carroll on Causality and the Arrow of Time,” FQXI Foundational Questions Institute, August 21. Available on YouTube.
    • He sketches his program to explain how entropy increase can explain the causal arrow. He admits that his explanation is still a work in progress.
  • Carroll, Sean. 2020. “Why Boltzmann Brains are Bad,” in Current Controversies in Philosophy of Science, 1st Edition, edited by Shamik Dasgupta, Ravit Dotan, and Brad Weslake. Routledge. pp. 7-20.
    • Argues that theories predicting Boltzmann Brains cannot simultaneously be true and justifiably believed.
  • Carroll, Sean. 2022a. “The Arrow of Time in Causal Networks.” U.C. Berkeley Physics Colloquium, April 22. YouTube https://www.youtube.com/watch?v=6slug9rjaIQ.
    • Discussion of how the thermodynamic arrow can explain the causal arrow. This talk is aimed at mathematical physicists.
  • Carroll, Sean. 2022b. “Ask Me Anything,” Mindscape podcasts, April AMA and May AMA. https://www.preposterousuniverse.com/podcast/.
    • Carroll discusses time’s having no fundamental or intrinsic arrow, why we need to adopt the Past Hypothesis, and how to define the term “arrow of time.”
  • Carroll, Sean. 2022c. The Biggest Ideas in the Universe: Space, Time, and Motion. Dutton/Penguin Random House.
    • A sophisticated survey of what modern physics implies about space, time, and motion, especially relativity theory without quantum mechanics. There is some emphasis on the philosophical issues. Introduces the relevant equations, but it is aimed at a general audience and not physicists. Chapter Five on Time is highly recommended for disentangling the meaning of time reversibility from the meaning of time reversal symmetry. Advocates the extrinsic theory of time’s arrow in terms of entropy.
  • Castagnino, Mario and Olimpia Lombardi. 2009. “The Global Non-Entropic Arrow of Time: From Global Geometrical Asymmetry to Local Energy Flow,” Synthese, vol. 169, no. 1, July, pp. 1-25.
    • Challenges the claim that time’s arrow should be explicated in terms of entropy. The authors’ goal is to show how to define a global arrow of time from the geometrical properties of spacetime and how this arrow can be “transferred to the local level, where it takes the form of a non-spacelike local energy flow that provides the criterion for breaking the symmetry resulting from the time-reversal invariant laws of local physics.”
  • Christensen, Ferrel. 1987. “Time’s Error: Is Time’s Asymmetry Extrinsic?” Erkenntnis, March, pp. 231-248.
    • Examination of whether time’s arrow is intrinsic or extrinsic. He claims, “there are no very strong arguments in favor of the view that time is only extrinsically anisotropic. Moreover, there are some serious arguments in opposition to the claim.” He is in the intrinsic camp, but he says the concept of time passing is nonsensical.
  • Dainton, Barry. 2010. Time and Space, Second Edition. McGill-Queens University Press. Ithaca.
    • An easy-to-read textbook that surveys the major philosophical issues about time and offers many arguments. It is not primarily about time’s arrow. Regarding time’s arrow, Dainton suggests the goal is “defining the direction of time in terms of entropy” (p. 49) rather than explaining the direction in terms of entropy.
  • Davies, Paul C. W. 1974. The Physics of Time Asymmetry. University of California Press. Berkeley and Los Angeles.
    • A survey by a proponent of the extrinsic theories of time.
  • Deng, Natalja M. 2017. “On ‘Experiencing Time’: A Response to Simon Prosser,” Inquiry: An Interdisciplinary Journal of Philosophy 61(3), pp. 281-301.
    • A chapter-by-chapter critique of (Prosser 2016). Explores the psychology of time.
  • Dieks, Dennis. 1986. “Physics and the Direction of Causation,” Erkenntnis, vol. 25, no. 1, July, pp. 85-110.
    • Explores how physics can recognize the direction of causation.
  • Dieks, Dennis. 2012. “The Physics and Metaphysics of Time,” European Journal of Analytic Philosophy, pp. 103-119.
    • Surveys the physics and metaphysics of time and argues in favor of the B-theory over the A-theory. Confronts the claim that physics needs to be revised to account for the arrow and the claim that the B-theory cannot give an accurate description of our temporal experiences.
  • Dorato, Mauro. 2000. “Becoming and the Arrow of Causation.” Philosophy of Science, Vol. 67, Supplement. Proceedings of the 1998 Biennial Meetings of the Philosophy of Science Association. Part II: Symposia Papers, September, pp. S523-S534.
    • The author focuses on what would be required to establish the objectivity of becoming. He recommends solving the taxonomy problem by saying causation is the main philosophical asymmetry, in Horwich’s sense of that term, among the philosophical asymmetries of trace, knowledge, explanation, action, counterfactual dependence, and our subjective sense of the passage of time.
  • Dorato, Mauro. 2006. “Absolute Becoming, Relational Becoming and the Arrow of Time: Some Non-Conventional Remarks on the Relationship Between Physics and Metaphysics,” Studies in History and Philosophy of Modern Physics, 37, 3, 2006, 559–76. Reprinted in (Oaklander 2008).
    • Provides an in-depth analysis of becoming. Argues that the arrow of becoming is more fundamental than the arrow of entropy change. And he asserts that, because the conceptual link between becoming and the issue of the direction of time requires regarding the asymmetry of causation as fundamental, such an asymmetry cannot turn out to be merely extrinsically correlated to irreversible physical processes.
  • Dyke, Heather and Adrian Bardon (eds.). 2013. A Companion to the Philosophy of Time. Wiley-Blackwell.
    • A collection of academic articles on a wide variety of issues in the philosophy of time.
  • Earman, John. 1974. “An Attempt to Add a Little Direction to ‘The Problem of the Direction of Time.’” Philosophy of Science. 41: 15-47.
    • Comments on the role of semantic ambiguity in discussions of time’s arrow. Speculates on life in a time-reversed world. Argues that the arrow of time is an intrinsic feature of spacetime.
  • Earman, John. 2002. “What Time Reversal Invariance Is and Why It Matters.” International Studies in the Philosophy of Science, 16, 245-264.
    • Explains how the time reversal operator must be defined differently in different situations.
  • Earman, John. 2006. “The ‘Past Hypothesis’: Not Even False.” Studies in History and Philosophy of Modern Physics 37, 399-430.
    • Criticizes the Past Hypothesis and the view that the asymmetry of entropy can be explicated through its role within cosmological theories.
  • Earman, John. 2008. “Reassessing the Prospects for a Growing Block Model of the Universe,” International Studies in the Philosophy of Science 22, 135-164.
    • Explains the growing-block model and examines arguments for and against it. Hermann Minkowski invented the block model in 1908. His block contains not all the future events that might happen but rather all the future events that will happen.
  • Ellis, George. 2013. “The Arrow of Time and the Nature of Spacetime.” Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 44 (3): 242-262.
    • Promotes a growing block theory with top-down causation due to a past condition of the universe. What he means by a past condition is the idea that global conditions determine the arrow of time by top-down causation. His ideas are developed with the tools of quantum field theory.
  • Falk, Dan. 2008. In Search of Time: The History, Physics, and Philosophy of Time. St. Martin’s Griffin. New York.
    • A popular survey by a reliable guide.
  • Farr, Matt and Alexander Reutlinger. 2013. “A Relic of a Bygone Age? Causation, Time Symmetry and the Directionality Argument.” Erkenntnis 78, Supplement 2, pp. 215-235.
    • An assessment of Russell’s argument that the time symmetry of fundamental physics is inconsistent with the time asymmetry of causation.
  • Freundlich, Yehudah. 1973. “‘Becoming’ and the Asymmetries of Time,” Philosophy of Science, Vol. 40, No. 4, pp. 496-517.
    • Examines the senses in which time’s arrow is mind-dependent, and the relationship between the possible asymmetries of phenomenological and physical time. He says, “We find that physical time acquires meaning only through phenomenological time, and that phenomenological time is fundamentally asymmetric. …The central thesis of this paper will be that merely to differentiate between appearance and reality is implicitly to assume a directed flow of time [from past to future]. …The focal point of any phenomenalist position is the assertion that the meaningful content of any physical statement is exhausted by the claims that statement makes as regards the ways we are appeared to.”
  • Frisch, Mathias. 2013. “Time and Becoming” in Dyke and Bardon 2013.
    • Endorses the dynamic theory and develops the causal theory.
  • Frisch, Mathias. 2014. Causal Reasoning in Physics. Cambridge: Cambridge University Press.
    • Explores the variety of issues involved in using causal reasoning in physics, including the relationship of the causal mini-arrow to other mini-arrows.
  • Grandjean, Vincent. 2022. The Asymmetric Nature of Time: Accounting for the Open Future and the Fixed Past. Synthese Library, volume 468. Springer. https://link.springer.com/book/10.1007/978-3-031-09763-8.
    • This book develops and defends a version of the growing-block theory.
  • Greene, Brian. 2004. The Fabric of the Cosmos: Space, Time, and the Texture of Reality. Alfred A. Knopf. New York.
    • A leading theoretical physicist provides a popular introduction to cosmology, relativity, string theory, and time’s arrow.
  • Greene, Brian. 2020. “Your Daily Equation #30: What Sparked the Big Bang?” May 20. https://www.youtube.com/watch?v=7QkT7evF2-E.
    • Describes repulsive gravity and cosmic inflation. Presupposes the viewer’s facility with partial differential equations.
  • Grünbaum, Adolf. 1973. Philosophical Problems of Space and Time. Second edition. Alfred A. Knopf. New York.
    • His views on time’s arrow are chiefly presented in the two chapters “The Anisotropy of Time,” and “Is the Coarse-Grained Entropy of Classical Statistical Mechanics an Anthropomorphism?” The first edition of 1963 was expanded in 1973 with new material.
  • Horwich, Paul. 1987. Asymmetries in Time: Problems in the Philosophy of Science. The MIT Press. Cambridge.
    • An analysis of many theories of time’s arrow. Horwich claims there is no intrinsic difference between the past and the future. Time itself is symmetric and does not itself have an arrow. David Hume was correct, says Horwich, in asserting that causes happen before their effects only because of our human convention about what those words mean. Horwich has a unique solution to the taxonomy problem that gives special weight to the knowledge mini-arrow and its explanation in terms of the fork asymmetry. This book is written for experts in the field.
  • Hutten, Ernest H. 1959. “Reviewed Work(s): The Direction of Time by H. Reichenbach.” Philosophy, Vol. 34, No. 128, January, pp. 65-66.
    • Briefly summarizes the main themes in Reichenbach’s causal theory of time. Hutten believes Reichenbach makes several serious, irreparable mistakes in his argument.
  • Ismael, Jenann T. 2017. “Passage, Flow, and the Logic of Temporal Perspectives,” in Time of Nature and the Nature of Time: Philosophical Perspectives of Time in Natural Sciences. Ed. by Christophe Bouton and Philippe Huneman. Boston Studies in the History and Philosophy of Science 326. Springer International Publishing. Pp. 23-38.
    • A careful examination of some of the formal features of temporal perspectives such as time’s passage. Explores the logic of the content of temporal experience rather than of the quality of that experience.
  • Kajimoto, Naoyuki and Kristie Miller and James Norton. 2020. “Primitive Directionality and Diachronic Grounding,” Acta Analytica, pp. 195-211.
    • Considers how to defend the claim that time’s directionality is primitive by using the concept of grounding.
  • Katz, Bernard D. 1983. “The Identity of Indiscernibles Revisited,” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 44, July, pp. 37-44.
    • Explores the difficulties of distinguishing intrinsic properties from extrinsic properties.
  • Lazarovici, Dustin, and Peter Reichert. 2015. “Typicality, Irreversibility and the Status of Macroscopic Laws.” Erkenntnis 80:4. 689-716.
    • Examines how entropy and the second law have been misunderstood in the academic literature. Considers theories that imply the Past Hypothesis.
  • Le Poidevin, Robin. 2007. The Images of Time: An Essay on Temporal Representation, Oxford University Press. Oxford.
    • Explores temporal representation and temporal phenomenology.
  • Lebowitz, Joel L. 1993. “Boltzmann’s Entropy and Time’s Arrow.” Physics Today. September. Pp. 32-38.
    • A popular treatment of the entropy arrow aimed at high school and college undergraduate physics teachers.
  • Lees, J. P., et al. 2012. “Observation of Time-Reversal Violation in the B0 Meson System.” Physical Review Letters, 109, 211801.
    • SLAC National Accelerator Laboratory at Stanford University reported the first direct test of time-reversal symmetry without any dependence on charge reversal or parity reversal. B0 mesons failed to decay time-reversibly via the weak interaction.
  • Lewis, David. 1979. “Counterfactual Dependence and Time’s Arrow.” Noûs, 13, 455-476.
    • Explores the possibility of explaining time’s arrow in terms of the causal arrow.
  • Lewis, David. 1983. “New Work for a Theory of Universals,” Australasian Journal of Philosophy 61, pp. 343-377.
    • Explores the difficulties in distinguishing intrinsic properties from extrinsic properties.
  • Loew, Christian. 2018. “Fundamentality and Time’s Arrow.” Philosophy of Science, 85. July. Page 483.
    • Develops and defends Maudlin’s views on time. Claims the intrinsic arrow is needed to explain why there is asymmetry of entropy change. He says, “My goal is to flesh out a way of understanding the idea that time has an intrinsic direction that can underwrite explanation.”
  • Matthews, Geoffrey. 1979. “Time’s Arrow and the Structure of Spacetime,” Philosophy of Science, vol. 46, pp. 82-97.
    • Argues that time’s arrow is a local rather than a global feature of the universe.
  • Maudlin, Tim. 2002. “Remarks on the Passing of Time,” Proceedings of the Aristotelian Society, New Series, Vol. 102, pp. 259-274. Oxford University Press.
    • Defends his claim that the passage of time is an intrinsic asymmetry in the structure of spacetime itself, an asymmetry that is metaphysically independent of the material contents of spacetime such as the entropy gradient. Focuses not on the positive reasons to accept his claim but rather on the negative program of undermining arguments given against his claim.
  • Maudlin, Tim. 2007. The Metaphysics Within Physics. Oxford University Press. Oxford.
    • Argues that time passes. Maudlin says the fundamental laws of nature and the direction of time require no philosophical analysis, but causation does. He says spacetime has a fundamental, intrinsic, inexplicable temporal direction, and this explains why earlier states produce later states but not vice versa. He objects to the Humean program of analysis in philosophy of physics that implies (1) “laws are nothing but patterns in the physical state of the world,” and (2) the direction of time is “nothing but a matter of how physical systems are disposed to behave throughout the universe” (which is the extrinsic theory). Maudlin advocates a non-Humean primitivist approach to both the fundamental laws and time’s arrow.
  • Mellor, D. H. 1991. “Causation and the Direction of Time.” Erkenntnis, 35, pp. 191–203.
    • A defense of a causal theory of time.
  • Mellor, D. H. 1995. The Facts of Causation. Routledge.
    • An influential analysis of the concept of causation that emphasizes singular causation, the causation of one fact by another.
  • Miller, Kristie. 2013. “Presentism, Eternalism, and the Growing Block,” in (Dyke and Bardon 2013, 345-364).
    • A careful presentation of the three main ontologies of time, followed by an investigation of whether advocates of the ontologies are engaged in merely semantic disagreements and are “talking past” each other. The pros and cons of each ontology are considered.
  • Miller, Kristie. 2019. “The Cresting Wave: A New Moving Spotlight Theory.” Canadian Journal of Philosophy, 49, pp. 94-122.
    • A revision of the moving spotlight theory that adds a cresting wave of causal efficacy. Miller is not a temporal dynamist.
  • Miller, Kristie and A. Holcombe and A.J. Latham. 2020. “Temporal Phenomenology: Phenomenological Illusion versus Cognitive Error.” Synthese 197, pp. 751–771.
    • Our temporal phenomenology is our experience of temporal properties and relations such as order, succession, duration, and passage. The article defends the claim that a person can make a cognitive error in saying it seems to them that time passes because they fail to make a careful distinction between “how actual time is taken to be” and “a representation of what it is to be time: of what is essential to time.” Investigates how we represent time in all possible worlds.
  • Miller, Kristie and James Norton. 2021. “If Time Can Pass, Time Can Pass at Different Rates,” Analytic Philosophy Vol. 62, March, pp. 21–32.
    • Offers an explication of the notion of time passing and considers whether it always must pass at the same rate.
  • Muller, Richard A. 2016. Now: The Physics of Time, New York: W. W. Norton & Co.
    • Argues that the arrow of time is not due to entropy increase but is only correlated with it. The relevant chapter is titled “Throwing Entropy under the Bus.”
  • Muller, Richard A. and Shaun Maguire. 2016. “Now and the Flow of Time,” arXiv. https://arxiv.org/pdf/1606.07975.pdf.
    • An original argument for why the thermodynamic arrow is not the fundamental arrow of time. The progression of time can be understood, they say, by assuming the flow of time consists of the continuous creation of new moments, new nows, that accompany the creation of new space. This is a version of the growing block theory of time.
  • Musser, George. 2017. “A Defense of the Reality of Time,” Quanta Magazine. May 16. https://www.quantamagazine.org/a-defense-of-the-reality-of-time-20170516/.
    • A condensed interview with Tim Maudlin.
  • North, Jill. 2002. “What Is the Problem about the Time-Asymmetry of Thermodynamics? A Reply to Price.” The British Journal for the Philosophy of Science, Vol. 53, No. 1, March, pp. 121-136.
    • Commentary and critique of the positions taken by Huw Price. “Price argues that there are two conceptions of the puzzle of the time-asymmetry of thermodynamics. He thinks this puzzle has remained unsolved for so long partly due to a misunderstanding about which of these conceptions is the right one and what form a solution ought to take.” North argues that it is Price’s understanding that is mistaken.
  • North, Jill. 2008. “Two Views on Time Reversal.” Philosophy of Science, Vol. 75, No. 2, April, pp. 201-223.
    • Clearly addresses the issue of what we could and should mean by time reversal in the context of classical physics.
  • North, Jill. 2009. “The ‘Structure’ of Physics: A Case Study,” Journal of Philosophy, vol. 106, pp. 57–88.
    • North asks what a fundamental theory of physics says about the structure of the world when the theory has two significantly different mathematical formulations, such as Newton’s mechanics in its Lagrangian and Hamiltonian versions. Each of the two has its own system of coordinates and equations of motion. North considers scenarios in which the two versions of a theory indicate different structures of the world itself versus when they indicate simply two different descriptions of the same underlying structure.
  • Oaklander, L. Nathan. 1985. “A Reply to Schlesinger.” The Philosophical Quarterly, Vol. 35, No. 138, January, pp. 93-94.
    • Criticizes the moving-now theory that was presented by Schlesinger. One criticism is that the theory is committed to the claim that the same NOW applies to all times, but that impales the theory on the horns of a dilemma: it is either incoherent or circular.
  • Oaklander, L. Nathan. 2008. Editor. The Ontology of Time. Routledge.
    • A collection of diverse, but influential, articles on the major issues about time.
  • Papineau, David. 1996. “Philosophy of Science,” in The Blackwell Companion to Philosophy edited by Nicholas Bunnin and E. P. Tsui-James, Blackwell Publishers Inc.: Oxford. pp. 290-324.
    • Discusses a variety of attempts to escape the circularity problem that arises in trying to define or explain time’s arrow.
  • Penrose, Oliver. 2001. “The Direction of Time.” in Chance in Physics: Foundations and Perspectives, edited by J. Bricmont, D. Dürr, M. C. Galavotti, G. C. Ghirardi, F. Petruccione and N. Zanghi. Springer Verlag.
    • Adopts an extrinsic theory of time’s arrow. Argues that Reichenbach’s principle of the common cause is the proper approach to understanding the time direction of asymmetric processes. Presumes a familiarity with advanced mathematics and physics.
  • Penrose, Roger. 1989. The Emperor’s New Mind: Concerning Computers, Minds, and The Laws of Physics. Oxford University Press: Oxford. Reprinted with corrections in 1990.
    • A wide-ranging, popular physics book that contains speculations on living in a time-reversed world plus other philosophical commentary by this future Nobel Prize winner.
  • Penrose, Roger. 2004. The Road to Reality: A Complete Guide to the Laws of the Universe. Alfred A. Knopf: New York.
    • An expert in general relativity, Penrose provides an advanced presentation of all the most important laws of physics, interspersed with philosophical comments.
  • Pooley, Oliver. 2013. “Relativity, the Open Future, and the Passage of Time,” Proceedings of the Aristotelian Society, CXIII part 3: 321-363.
    • Discusses whether time passes according to the B-theory and the theory of relativity.
  • Price, Huw. 1992. “The Direction of Causation: Ramsey’s Ultimate Contingency,” PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association. Pp. 253-267.
    • Following Frank Ramsey, Price argues that the best way to account for causal asymmetry is to consider it to be a feature that agents project onto the world.
  • Price, Huw. 1996. Time’s Arrow & Archimedes’ Point: New Directions for the Physics of Time. Oxford University Press, Inc.: Oxford.
    • This book is filled with clear, expository material, but Price offers much original material. He is interested in more clearly separating subjective asymmetries from objective asymmetries. He argues that, objectively, time has no arrow, it has no direction, it does not flow, the future can affect the past, and philosophers of physics need to adopt an Archimedean point of view outside of time in order to discuss time in an unbiased way. Although Price admits we cannot literally step outside time, we can take the Archimedean point of view of “nowhen” from which we can view the timeless block universe and see that time’s arrow is inherently anthropomorphic, as is the directionality of causation. Makes a case that many arguments trying to show how and why temporal order exists presuppose that temporal order exists. He claims there is no good reason to rule out the possibility that what happens now can depend on what happens in the future, at least for microphysical phenomena. The book is reviewed in (Callender 1998).
  • Price, Huw. 2002. “Boltzmann’s Time Bomb,” The British Journal for the Philosophy of Science, March, Vol. 53, No. 1, pp. 83-119.
    • Price agrees that statistical arguments alone do not give us good reason to expect that entropy will always continue to increase because a past hypothesis is also needed. He clarifies the different assumptions being made in attempts to explain thermodynamic asymmetry, and he emphasizes that if thermodynamics were not time-asymmetric this could be for two different reasons: “The world might exhibit entropic gradients in both temporal directions, without a global temporal preference, at least on the large scale. For example, there might be a single long period of entropy ‘increase’, ‘followed’ by a matching period of entropy ‘decrease’.” Or, instead, “entropy might be approximately constant everywhere, and at all times” (pp. 89-90). Price’s position is challenged in (North 2002).
  • Prosser, Simon. 2016. Experiencing Time. Oxford University Press, Oxford.
    • Covers a broad range of topics in the interface between the philosophy of mind and the philosophy of time. Claims the A-theory is unintelligible. Says it is impossible to experience the passage of time. Argues that the present is not an ontologically privileged time. But see (Deng 2017).
  • Reichenbach, Hans. 1956. The Direction of Time. University of California Press: Berkeley.
    • An influential, technical treatise on the causal theory of time. One noteworthy feature is that it tries to establish the direction of time using the concept of entropy but without relying on a past hypothesis. Reichenbach died before being able to write his final chapter of the book that was to be on the relationship between time’s objective properties and our subjective experience of time.
  • Roberts, Bryan W. 2022. Reversing the Arrow of Time, Cambridge University Press.
    • This philosopher of physics surveys the issues involved in understanding the arrow of time. He tries to debunk the claim that the many mini-arrows commonly dealt with in the taxonomy problem are in fact arrows. And he argues that classical thermodynamics is not a temporally asymmetric theory.
  • Rovelli, Carlo. 2018. The Order of Time. Riverhead Books. New York.
    • A popular presentation of this physicist’s many original ideas about time. Claims that both entropy and the Past-Hypothesis are human-dependent.
  • Russell, Bertrand. 1913. “On the Notion of Cause,” Proceedings of the Aristotelian Society 13, pp. 1-26.
    • Russell argues that causes and causal relations should not be part of the fundamental physical description of the world.
  • Salmon, W. 1971. Statistical Explanation and Statistical Relevance. University of Pittsburgh Press: Pittsburgh.
    • Salmon argues that a good definition of causation should ward off counterexamples due to common causes.
  • Savitt, Steven F. 1991. “Asymmetries in Time: Problems in the Philosophy of Science by Paul Horwich,” Canadian Journal of Philosophy, Volume 21, no. 3, pp. 399-417.
    • A review of (Horwich 1987).
  • Savitt, Steven F. 1995. Time’s Arrows Today: Recent Physical and Philosophical Work on the Direction of Time. Editor. Cambridge University Press. Cambridge.
    • A collection of independent research papers by distinguished investigators of the topic.
  • Schlegel, Richard. 1968. Time and the Physical World. Dover Publications, Inc., New York. This Dover reprint was originally published by Michigan State University Press in 1961.
    • This book in the philosophy of physics compares the manifest image to the scientific image of physical time as it was understood in 1961.
  • Schlesinger, George. 1985. “How to Navigate the River of Time”, The Philosophical Quarterly. Vol. 35, No. 138. January. Pp. 91-92.
    • Presents and defends the moving-now theory. He agrees that time seems to everyone to pass. Schlesinger defends the claim that time can change its rate. In the same journal two years earlier, Oaklander had claimed Schlesinger’s position is incoherent.
  • Sklar, Lawrence. 1974. Space, Time, and Spacetime. University of California Press: Berkeley, CA.
    • Attacks the intrinsic theory of the arrow. Surveys the causal theory of time and various theories of time’s arrow. Pages 379-394 describe the changes in Boltzmann’s thinking about the second law.
  • Skow, Bradford. 2009. “Relativity and the Moving Spotlight,” The Journal of Philosophy 106, pp. 666-678.
    • Argues that the moving spotlight theory is consistent with special relativity, particularly with its implication that the present or the NOW is relative to a reference frame.
  • Skow, Bradford. 2011. “Experience and the Passage of Time.” Philosophical Perspectives, 25, Metaphysics, pp. 359-387.
    • An examination of the argument that the best explanation of our experience is that time passes. Focuses on the moving spotlight theory which he believes is the best A-theory of time.
  • Skow, Bradford. 2012. “One Second Per Second.” Philosophy and Phenomenological Research 85, pp. 377-389.
    • Analyzes various arguments for and against the coherence of the phrase “a rate of one second per second,” which has multiple interpretations.
  • Smart, J.J.C. 1949. “The River of Time.” Mind, October, Vol. 58, No. 232, pp. 483-494.
    • Provides a variety of arguments against the intrinsic theory of time’s arrow. He emphasizes that things change, and events happen, but events do not change except when they are described too vaguely. This article emphasizes the analysis and clarification of ordinary language using the techniques of “logical grammar.”
  • Smart, J.J.C. 1967. “Time” in The Encyclopedia of Philosophy, ed. by Paul Edwards, volume 8. Macmillan Publishing Co., Inc. & The Free Press: New York, pp. 126-134.
    • A survey of philosophical issues about time from a member of the extrinsic camp. The views of the intrinsic camp are not given much attention.
  • Smart, J.J.C. 1969. “Causal Theories of Time,” The Monist, Vol. 53, No. 3. July. Pp. 385-395.
    • Criticizes some of the early causal theories of time from Hans Reichenbach, Adolf Grünbaum, and Henryk Mehlberg.
  • Thébault, Karim P. Y. 2021. “The Problem of Time,” in Routledge Companion to the Philosophy of Physics, edited by Eleanor Knox and Alastair Wilson. Routledge: London.
    • Explores the representation of time in classical, relativistic, and quantum physics. Not written at the easy intellectual level of the present encyclopedia article, but it provides a broad, accurate introduction to the problem of time and its academic literature.
  • Tooley, Michael. 1997. Time, Tense, and Causation. Oxford University Press, Clarendon Press: Oxford.
    • A growing-block model. In his dynamic and tensed theory of time, facts are tenseless states of affairs that come into existence, never to go out of existence. Causation is primitive, and the theory of relativity needs modification to allow for our common present. Tooley believes that both the tenseless theory and the standard tensed theory are false.
  • Wallace, David. 2013. “The Arrow of Time in Physics,” in (Dyke and Bardon 2013).
    • In this chapter, Wallace concentrates on the arrow of time as it occurs in physics. He explores how the arrow can exist even though there is time symmetry in thermodynamics and statistical mechanics. He provides a broad discussion of arrows other than the entropy arrow.
  • Wallace, D. 2017. “The Nature of the Past Hypothesis,” in The Philosophy of Cosmology, edited by K. Chamcham, et al. Cambridge University Press: Cambridge, pp. 486–99.
    • Explores the controversies about the acceptability of the Past Hypothesis and which formulation of it is appropriate.
  • Williams, Donald C. 1951. “The Myth of Passage,” Journal of Philosophy, volume 48, July 19, pp. 457-472.
    • Influential argument that time’s passage is a myth.
  • Woodward, James. 2014. “A Functional Account of Causation; or, A Defense of the Legitimacy of Causal Thinking by Reference to the Only Standard That Matters—Usefulness (as Opposed to Metaphysics or Agreement with Intuitive Judgment),” Philosophy of Science, volume 81, pp. 691–713.
    • A technically demanding treatment of the question “How does causation fit with physics?” and of the impact an appropriate answer has for understanding the metaphysical, descriptive, and functional roles of causation.
  • Zimmerman, Dean. 2005. “The A-Theory of Time, the B-Theory of Time, and ‘Taking Tense Seriously’.” Dialectica volume 59, number 4, pp. 401-457.
    • In exploring the issues mentioned in the title, Zimmerman considers different versions of the spotlight theory, those with a real future and those without, those having events shedding their A-properties and those without this shedding.
  • Zimmerman, Dean. 2011. “Presentism and the Space-Time Manifold”, in The Oxford Handbook of Philosophy of Time, ed. C. Callender, pp. 163–244, Oxford: Oxford University Press.
    • Considers how to reconcile presentism with relativity theory by finding a privileged reference frame.

 

Author Information

Bradley H. Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.

William Godwin (1756–1836)

Following the publication of An Enquiry Concerning Political Justice in 1793 and his most successful novel, Caleb Williams, in 1794, William Godwin was briefly celebrated as the most influential English thinker of the age. At the time of his marriage to the writer Mary Wollstonecraft in 1797, the achievements and influence of both writers, as well as their personal happiness together, seemed likely to extend into the new century. It was not to be. The war with revolutionary France and the rise of a new spirit of patriotic fervour turned opinion against reformers, and against Godwin in particular. Following Wollstonecraft’s death in September 1797, a few days after the birth of a daughter, Mary, Godwin published a candid memoir of her that ignited an increasingly strident propaganda campaign against them both. He published a third edition of Political Justice and a second major novel, St. Leon, but the tide was clearly turning. And while he continued writing into old age, he never again achieved the success, nor the financial security, he had enjoyed in the 1790s. Today he is most often referenced as the husband of Mary Wollstonecraft, as the father of Mary Wollstonecraft Shelley (the author of Frankenstein and The Last Man), and as the founding father of philosophical anarchism. He also deserves to be remembered as a significant philosopher of education.

In An Enquiry Concerning Political Justice, Godwin argues that individuals have the power to free themselves from the intellectual and social restrictions imposed by government and state institutions. The argument starts with the very demanding requirement that we assess options impartially and rationally. We should act only according to a conviction that arises from a conscientious assessment of what would contribute most to the general good. Incorporated in the argument are principles of impartiality, utility, duty, benevolence, perfectionism, and, crucially, independent private judgment.

Godwin insists that we are not free, morally or rationally, to make whatever choices we like. He subscribes to a form of necessitarianism, but he also believes that choices are constrained by duty and that one’s duty is always to put the general good first. Duties precede rights; rights are simply claims we make on people who have duties towards us. Ultimately, it is the priority of the principle of independent private judgment that produces Godwin’s approach to education, to law and punishment, to government, and to property. Independent private judgment generates truth, and therefore virtue, benevolence, justice, and happiness. Anything that inhibits it, such as political institutions or modes of government, must be replaced by progressively improved social practices.

When Godwin first started An Enquiry Concerning Political Justice, he intended it to explore how government can best benefit humanity. He and the publisher George Robinson wanted to catch the wave of interest created by the French Revolution itself and by Edmund Burke’s Reflections on the Revolution in France, which so provoked British supporters of the revolution. Robinson agreed to support Godwin financially while he worked, with the understanding that he would send sections of the work as he completed them. This meant that the first chapters were printed before he had fully realised the implications of his arguments. The inconsistencies that resulted were addressed in subsequent editions. His philosophical ideas were further revised and developed in The Enquirer (1797), Thoughts Occasioned by the Perusal of Dr. Parr’s Spital Sermon (1801), Of Population (1820), and Thoughts on Man (1831), and in his novels. He also wrote several works of history and biography, and wrote or edited several texts for children, which were published by the Juvenile Library that he started with his second wife, Mary Jane Clairmont.

Table of Contents

  1. Life
  2. Godwin’s Philosophy: An Enquiry Concerning Political Justice
    1. Summary of Principles
    2. From Private Judgment to Political Justice
  3. Educational Implications of Godwin’s Philosophy
    1. Progressive Education
    2. Education, Epistemology, and Language
    3. Education, Volition, and Necessitarianism
    4. Government, the State, and Education
  4. Godwin’s Philosophical Anarchism
    1. Introduction
    2. Punishment
    3. Property
    4. Response to Malthus
  5. Godwin’s Fiction
    1. Caleb Williams (1794)
    2. St. Leon: A Tale of the Sixteenth Century (1799)
  6. Conclusion
  7. References and Further Reading
    1. Works by William Godwin
      1. Early Editions of An Enquiry Concerning Political Justice
      2. Other Editions of An Enquiry Concerning Political Justice
      3. Collected Editions of Godwin’s Works and Correspondence
      4. First Editions of Other Works by Godwin
      5. Online Resources
      6. Other Editions of Selected Works by Godwin
    2. Biographies of Godwin
    3. Social and Historical Background
    4. Other Secondary Sources in Philosophy, Education, Fiction, and Anarchism

1. Life

William Godwin was born in 1756 in Wisbech in Cambridgeshire, England, the seventh of thirteen children. His father was a Dissenting minister; his mother was the daughter of a successful shipowner. Godwin was fond of his lively mother, less so of his strictly Calvinist father. He was a pious and academically precocious boy, readily acquiring a close knowledge of the Old and New Testaments. After three years at a local school, where he read widely, learned some Latin and developed a passion for the classics, he moved at the age of 11 to Norwich to become the only pupil of the Reverend Samuel Newton. Newton was an adherent of Sandemanianism, a particularly strict form of Calvinism. Godwin found him pedantic and unjustly critical. The Calvinist doctrines of original sin and predestination weighed heavily. Calvinism left emotional scars, but it also shaped his thinking. This was evidenced, Godwin later stated, in the errors of the first edition of Political Justice: its tendency to stoicism regarding pleasure and pain, and its inattention to feeling and private affections.

After a period as an assistant teacher of writing and arithmetic, Godwin began to develop his own ideas about education and to take an interest in contemporary politics. When William’s father died in 1772, his mother paid for her clever son to attend the New College, a Dissenting Academy in Hoxton, north of the City of London. By then Godwin had become, somewhat awkwardly, a Tory, a supporter of the aristocratic ruling class. Dissenters generally supported the Whigs, not least because they opposed the Test Acts, which prohibited anyone who was not an Anglican communicant from holding a public office. At Hoxton Godwin received a more comprehensive higher education than he would have received at Oxford or Cambridge universities (from which Dissenters were effectively barred). The pedagogy was liberal, based on free enquiry, and the curriculum was wide-ranging, covering psychology, ethics, politics, theology, philosophy, science, and mathematics. Hoxton introduced Godwin to the rational dissenting creeds, Socinianism and Unitarianism, to which philosophers and political reformers such as Joseph Priestley and Richard Price subscribed.

Godwin seems to have graduated from Hoxton with both his Sandemanianism and Toryism in place. But the speeches of Edmund Burke and Charles James Fox, the leading liberal Whigs, impressed him and his political opinions began to change. After several attempts to become a Dissenting minister, he accepted that congregations simply did not take to him; and his religious views began a journey through deism to atheism. He was influenced by his reading of the French philosophes. He settled in London, aiming to make a living from writing, and had some early encouragement. Having already completed a biography of William Pitt, Earl of Chatham, he now contributed reviews to the English Review and published a collection of sermons. By 1784 he had published three minor novels, all quite favourably reviewed, and a satirical pamphlet entitled The Herald of Literature, a collection of spoof ‘extracts’ from works purporting to be by contemporary writers. He also contemplated a career in education, for in July 1783 he published a prospectus for a small school that he planned to open in Epsom, Surrey.

For the next several years Godwin was able to earn a modest living as a writer, thanks in part to his former teacher at Hoxton, Andrew Kippis, who commissioned him to write on British and Foreign History for the New Annual Register. The work built him a reputation as a competent political commentator and introduced him to a circle of liberal Whig politicians, publishers, actors, artists, and authors. Then, in 1789, events in France raised hopes for radical reform in Great Britain. On November 4 Godwin was present at a sermon delivered by Richard Price which, while primarily celebrating the Glorious Revolution of 1688, anticipated many of the themes of Political Justice: universal justice and benevolence; rationalism; and a war on ignorance, intolerance, persecution, and slavery. The special significance of the sermon is that it roused Edmund Burke to write Reflections on the Revolution in France, which was published in November 1790. Godwin had admired Burke, and he was disappointed by this furious attack on the Revolution and by its support for custom, tradition, and aristocracy.

He was not alone in his disappointment. Thomas Paine’s Rights of Man and Mary Wollstonecraft’s A Vindication of the Rights of Men were early responses to Burke. Godwin proposed to his publisher, George Robinson, a treatise on political principles, and Robinson agreed to sponsor him while he wrote it. Over the sixteen months of writing, Godwin’s ideas veered towards the philosophical anarchism for which the work is best known.

Political Justice, as Godwin declared in the preface, was the child of the French Revolution. As he finished writing it in January 1793, the French Republic declared war on the Kingdom of Great Britain. It was not the safest time for an anti-monarchist, anti-aristocracy, anti-government treatise to appear. Prime Minister William Pitt thought the two volumes too expensive to attract a mass readership; otherwise, the Government might have prosecuted Godwin and Robinson for sedition. In fact, the book sold well and immediately boosted Godwin’s fame and reputation. It was enthusiastically reviewed in much of the press and keenly welcomed by radicals and Dissenters. Among his many new admirers were young writers with whom Godwin soon became acquainted: William Wordsworth, Robert Southey, Samuel Taylor Coleridge, and a very youthful William Hazlitt.

In 1794 Godwin wrote two works that were impressive and successful in different ways. The novel Things as They Are: or The Adventures of Caleb Williams stands out as an original exploration of human psychology and the wrongs of society. Cursory Strictures on the Charge delivered by Lord Chief Justice Eyre to the Grand Jury first appeared in the Morning Chronicle newspaper. Pitt’s administration had become increasingly repressive, charging supporters of British reform societies with sedition. On May 12, 1794, Thomas Hardy, the chair of the London Corresponding Society (LCS), was arrested and committed with six others to the Tower of London; then John Thelwall, a radical lecturer, and John Horne Tooke, a leading light in the Society for Constitutional Information (SCI), were arrested. The charge was High Treason, and the potential penalty was death. Habeas Corpus had been suspended, and the trials did not begin until October. Godwin had attended reform meetings and knew these men. He was especially close to Thomas Holcroft, the novelist and playwright. Godwin argued in Cursory Strictures that there was no evidence that the LCS and SCI were involved in any seditious plots, and he accused Lord Chief Justice Eyre of expanding the definition of treason to include mere criticism of the government. ‘This is the most important crisis in the history of English liberty,’ he concluded. Hardy was called to trial on October 25, and, after twelve days, the jury returned a verdict of not guilty. Subsequently, Horne Tooke and Thelwall were tried and acquitted, and others were dismissed. Godwin’s article was considered decisive in undermining the charge of treason. In Hazlitt’s view, Godwin had saved the lives of twelve innocent men (Hazlitt, 2000: 290). The collapse of the Treason Trials caused a surge of hope for reform, but a division between middle-class intellectuals and the leaders of labouring-class agitation hastened the decline of British Jacobinism. This did not, however, end the anti-Jacobin propaganda campaign, nor the satirical attacks on Godwin himself.

A series of essays, published as The Enquirer: Reflections on Education, Manners and Literature (1797), developed a position on education equally opposed to Jean-Jacques Rousseau’s progressivism (in Emile) and to traditional education. Other essays modified or developed ideas from Political Justice. One essay, ‘Of English Style’, describes clarity and propriety of style as the ‘transparent envelope’ of thoughts. Another essay, ‘Of Avarice and Profusion’, prompted the Rev. Thomas Malthus to respond with his An Essay on the Principle of Population (1798).

At the lodgings of a mutual friend, the writer Mary Hays, Godwin became reacquainted with a woman he had first met in 1791 at one of the publisher Joseph Johnson’s regular dinners, when he had wanted to converse with Thomas Paine rather than with her. Since then, Mary Wollstonecraft had spent time in revolutionary Paris, fallen in love with an American businessman, Gilbert Imlay, and given birth to a daughter, Fanny. Imlay first left her, then sent her on a business mission to Scandinavia. This led to the publication of Letters Written During a Short Residence in Sweden, Norway and Denmark (1796). She had completed A Vindication of the Rights of Woman in 1792, a more substantial work than her earlier A Vindication of the Rights of Men. She had also recently survived a second attempt at suicide. Having previously published Mary: A Fiction in 1788, she was working on a second novel, The Wrongs of Woman: or, Maria. A friendship soon became a courtship. When Mary became pregnant, they chose to get married and to brave the inevitable ridicule, both previously having condemned the institution of marriage (in Godwin’s view it was ‘the worst of monopolies’). They were married on March 29, 1797. They worked apart during the daytime, Godwin in a rented room near their apartment in Somers Town, St. Pancras, north of central London, and came together in the evening.

Godwin enjoyed the dramatic change in his life: the unfamiliar affections and the semi-independent domesticity. Their daughter was born on August 30. The birth itself went well but the placenta had broken apart in the womb; a doctor was called to remove it, and an infection took hold. Mary died on September 10. At the end she said of Godwin that he was ‘the kindest, best man in the world’. Heartbroken, he wrote that he could see no prospect of future happiness: ‘I firmly believe that there does not exist her equal in the world. I know from experience we were formed to make each other happy’. He could not bring himself to attend the funeral in the churchyard of St. Pancras Church, where just a few months earlier they had married.

Godwin quickly threw himself into writing a memoir of Wollstonecraft’s life. Within a few weeks he had completed a work for which he was ridiculed at the time, and for which he has been criticised by historians who feel that it delayed the progress of women’s rights. The Memoirs of the Author of a Vindication of the Rights of Woman (1798) is a tender tribute, and a frank attempt to explore his own feelings, but Godwin’s commitment to complete candour meant that he underestimated, or was insensitive to, the likely consequence of revealing ‘disreputable’ details of Mary’s past, not least that Fanny had been born out of wedlock. It was a gift to moralists, humourists, and government propagandists.

Godwin was now a widower with a baby, Mary, and a toddler, Fanny, to care for. With help from a nursemaid and, subsequently, a housekeeper, he settled into the role of affectionate father and patient home educator. However, he retained a daily routine of writing, reading, and conversation. A new novel was to prove almost as successful as Caleb Williams. This was St. Leon: A Tale of the Sixteenth Century. It is the story of an ambitious nobleman disgraced by greed and an addiction to gambling, then alienated from society by the character-corrupting acquisition of alchemical secrets. It is also the story of the tragic loss of an exceptional wife and of domestic happiness: it has been seen as a tribute to Wollstonecraft and as a correction to the neglect of the affections in Political Justice.

The reaction against Godwin continued into the new century, with satirical attacks coming from all sides. It was not until he read a serious attack by his friend Dr. Samuel Parr that he was stung into a whole-hearted defence, engaging also with criticisms by James Mackintosh and Thomas Malthus. Thoughts Occasioned by the Perusal of Dr. Parr’s Spital Sermon was published in 1801. His replies to Mackintosh and Malthus were measured, but his response to Parr was more problematic, making concessions that could be seen as undermining the close connection between truth and justice that is crucial to the argument of Political Justice.

Since Mary Wollstonecraft’s death, Godwin had acquired several new friends, including Charles and Mary Lamb, but he clearly missed the domesticity he had enjoyed so briefly; and he needed a mother for the girls. The story goes that Godwin first encountered his second wife in May 1801, shortly before he started work on the reply to Dr. Parr. He was sitting reading on his balcony when he was hailed from next door: ‘Is it possible that I behold the immortal Godwin?’ Mary Jane Clairmont had two children, Charles and Jane, who were similar in age to Fanny and Mary. Godwin’s friends largely disapproved – they found Mary Jane bad-tempered and artificial – but Godwin married her, and their partnership endured until his death.

Godwin had a moderate success with a Life of Chaucer, failed badly as a dramatist, and completed another novel, Fleetwood, or the New Man of Feeling (1805), but he was not earning enough to provide for his family by his pen alone. He and Mary Jane conceived the idea of starting a children’s bookshop and publishing business. For several years the Juvenile Library supplied stationery and books of all sorts for children and schools, including history books and story collections written or edited by ‘Edward Baldwin’, Godwin’s own name being considered too notorious. Despite some publishing successes, such as Charles and Mary Lamb’s Tales from Shakespeare, the bookshop never really prospered. As he slipped into serious debt, Godwin felt he was slipping also into obscurity. In 1809 he wrote an Essay on Sepulchres: A Proposal for Erecting some Memorial of the Illustrious Dead in All Ages on the Spot where their Remains have been Interred. The Essay was generally well-received, but the proposal was ignored. With the Juvenile Library on the point of collapse, the family needed a benefactor who could bring them financial security.

Percy Bysshe Shelley was just twenty, recently expelled from Oxford University for atheism, and newly married and disinherited, when in January 1812 he wrote a fan letter to a philosopher he had not been sure was still living. His reading of Political Justice at school had ‘opened to my mind fresh & more extensive views’, he wrote. Shelley went off to Ireland to agitate for independence and distribute his pamphlet An Address to the Irish People. Godwin disapproved of the inflammatory tone, but invited Shelley and his wife, Harriet, to London. They eventually arrived in October and Shelley and Godwin thereafter maintained a friendly correspondence. Shelley’s first major poem, Queen Mab, with its Godwinian themes and references, was published at this time. During 1813, as he and Shelley continued to meet, Godwin saw a good deal of a new friend and admirer, Robert Owen, the reforming entrepreneur and philanthropist. Hazlitt commented that Owen’s ideas of Universal Benevolence, the Omnipotence of Truth and the Perfectibility of Human Nature were exactly those of Political Justice. Others thought Owen’s ‘socialism’ was Godwinianism by another name. As Godwin pleaded with friends and admirers for loans and deferrals to help keep the business afloat, the prospect of a major loan from Shelley was thwarted by Sir Timothy Shelley withholding his son’s inheritance when he turned twenty-one.

Godwin’s troubles took a different turn when Mary Godwin, aged sixteen, returned from a stay with friends in Scotland looking healthy and pretty. Harriet Shelley was in Bath with a baby. Shelley dined frequently with the Godwins and took walks with Mary and Jane. Soon he was dedicating an ode to ‘Mary Wollstonecraft Godwin’. On June 26 Mary declared her love as they lay together in St. Pancras Churchyard, beside her mother’s grave, Jane lingering nearby. By July Shelley had informed Harriet that he had only ever loved her as a brother. Godwin was appalled and remonstrated angrily, but early on the morning of July 28 he found a letter on his dressing table: Mary had eloped with Shelley, and they had taken Jane with them.

Godwin’s life over the next eight years, until Shelley’s tragic death in 1822, was far less dramatic or romantic than those of Mary and Shelley, or of Claire (as Jane now called herself). Their travels in Europe, the births and deaths of several children, including Claire’s daughter by Lord Byron, the precocious literary achievements (Shelley’s poems and Mary’s novel Frankenstein) are well known. Meanwhile, in London, Mary Wollstonecraft’s daughter, Fanny, was left unhappily behind. The atmosphere at home was tense and gloomy. Godwin refused to meet Mary and her lover until they were married, although the estrangement did not stop him accepting money from Shelley. A protracted struggle ensued, with neither party appearing to live up to Godwinian standards of candour and disinterestedness. Then, in October 1816, Fanny left the family home, ostensibly to travel to Ireland to visit her aunts (Wollstonecraft’s sisters). In Swansea, she killed herself by taking an overdose of laudanum. She was buried in an unnamed pauper’s grave, Godwin being fearful of further scandal connected with himself and Wollstonecraft. Shortly after this, Harriet Shelley’s body was pulled from the Serpentine in London. Shelley and Mary could now marry, and before long they escaped to Italy, with Claire (Jane) still in tow.

Despite these troubles and the precarious position of the Juvenile Library, Godwin managed to complete another novel, Mandeville, A Tale of the Seventeenth Century in England (1817). He took pride in his daughter’s novel and in his son-in-law’s use of Godwinian ideas in his poems. At the end of 1817, Godwin began his fullest response to Malthus. It took him three years of difficult research to complete Of Population. Meanwhile, his financial difficulties had reached a crisis point. He besieged Shelley in Italy with desperate requests to fulfil his promised commitments, but Shelley had lost patience and refused. The money he had already given, he complained, ‘might as well have been thrown into the sea’. A brief reprieve allowed the Godwins to move, with the Juvenile Library, to better premises. Then came the tragedy of July 8, 1822. Shelley drowned in rough seas in the Gulf of La Spezia. Mary Shelley returned to England in 1823 to live by her pen. In 1826 she published The Last Man, a work, set in the twenty-first century, in which an English monarch becomes a popular republican leader only to survive a world-wide pandemic as the last man left alive. Godwin’s influence is seen in the ambition and originality of her speculative fiction.

Godwin himself worked for the next five years on a four-volume History of the Commonwealth—the period between the execution in 1649 of Charles I and the restoration in 1660 of Charles II. He describes the liberty that Cromwell and the Parliamentarians represented as a means, not an end in itself; the end is the interests and happiness of the whole: ‘But, unfortunately, men in all ages are the creatures of passions, perpetually prompting them to defy the rein, and break loose from the dictates of sobriety and speculation.’

In 1825, Godwin was finally declared bankrupt, and he and Mary Jane were relieved of the burden of the Juvenile Library. They moved to cheaper accommodation. Godwin had the comfort of good relations with his daughter and grandson. He hoped for an academic position with University College, which Jeremy Bentham had recently helped to establish, but was disappointed. He worked on two further novels, Cloudesley and Deloraine. In 1831 came Thoughts on Man, a collection of essays in which he revisited familiar philosophical topics. In 1834, the last work to appear in his lifetime was published. Lives of the Necromancers is a history of superstition, magic, and credulity, in which Godwin laments that we make ourselves ‘passive and terrified slaves of the creatures of our imagination’. A collection of essays on religion, published posthumously, made similar points but commended a religious sense of awe and wonder in the presence of nature.

The 1832 Reform Bill’s extension of the male franchise pleased Godwin. In 1833, the Whig government awarded him a pension of £200 a year and a residence in New Palace Yard, within the Palace of Westminster parliamentary estate—an odd residence for an anarchist. When the Palace of Westminster was largely destroyed by fire, in October 1834, the new Tory Government renewed his pension, even though he had been responsible for fire safety at Westminster and the upkeep of the fire engine. He spent the last years of his life in relative security with Mary Jane, mourning the deaths of old friends and meeting a new generation of writers. He died at the age of eighty on April 7, 1836. He was buried in St. Pancras Churchyard, in the same grave as Mary Wollstonecraft. When Mary Shelley died in 1851, her son and his wife had Godwin’s and Wollstonecraft’s remains reburied with her in the graveyard of St. Peter’s Church in Bournemouth, on the south coast.

2. Godwin’s Philosophy: An Enquiry Concerning Political Justice

Note: references to An Enquiry Concerning Political Justice (PJ) give the volume number and page number of the two-volume third edition of 1798, which is reproduced in the 1946 University of Toronto Press facsimile edition, edited by F. E. L. Priestley. This is followed by the book and chapter number of the first edition (for example, PJ II: 497; Bk VIII, vi). Page numbers of other works are those of the first edition.

a. Summary of Principles

The first edition of An Enquiry Concerning Political Justice was published in 1793. A second edition was published in 1796 and a third in 1798. Despite the modifications in the later editions, Godwin considered that ‘the spirit and the great outlines of the work remain untouched’ (PJ I, xv; Preface to second edition). Arguably, he was underplaying the significance of the changes. They make clear that pleasure and pain are the only bases on which morality can rest, that feeling, rather than reason or judgment, is what motivates action, and that private affections have a legitimate place in our rational deliberations.

The modifications are incorporated in the ‘Summary of Principles’ (SP) that he added to the start of the third edition (PJ I, xxiii–xxvii). The eight principles are:

(1) ‘The true object of moral and political disquisition, is pleasure or happiness.’ Godwin divides pleasures between those of the senses and those that are ‘probably more exquisite’, such as the pleasures of intellectual feeling, sympathy, and self-approbation. The most desirable and civilized state is that in which we have access to all these diverse sources of pleasure and possess a happiness ‘the most varied and uninterrupted’.

(2) ‘The most desirable condition of the human species, is a state of society.’ Although government was intended to secure us from injustice and violence, in practice it embodies and perpetuates them, inciting passions and producing oppression, despotism, war, and conquest.

(3) ‘The immediate object of government is security.’ But, in practice, the means adopted by government restrict individual independence, limiting self-approbation and our ability to be wise, useful, or happy. Therefore, the best kind of society is one in which there is as little encroachment as possible by government upon individual independence.

(4) ‘The true standard of the conduct of one man to another is justice.’ Justice is universal: it requires us to be impartial and to aim to produce the greatest possible sum of pleasure and happiness.

(5) ‘Duty is the mode of proceeding, which constitutes the best application of the capacity of the individual, to the general advantage.’ Rights are claims which derive from duties; they include claims on the forbearance of others.

(6) ‘The voluntary actions of men are under the direction of their feelings.’ Reason is a controlling and balancing faculty; it does not cause actions but regulates them ‘according to the comparative worth it ascribes to different excitements’. Therefore, it is the improvement of reason that will produce social improvements.

(7) ‘Reason depends for its clearness and strength upon the cultivation of knowledge.’ As improvement in knowledge is limitless, ‘human inventions, and modes of social existence, are susceptible of perpetual improvement’. Any institution that perpetuates particular modes of thinking or conditions of existence is pernicious.

(8) ‘The pleasures of intellectual feeling, and the pleasures of self-approbation, together with the right cultivation of all our pleasures, are connected with the soundness of understanding.’ Prejudices and falsehoods are incompatible with soundness of understanding, which is connected, rather, with free enquiry and free speech (subject only to the requirements of public security). It is also connected with simplicity of manners and leisure for intellectual self-improvement: consequently, an unequal distribution of property is not compatible with a just society.

b. From Private Judgment to Political Justice

Godwin claims there is a reciprocal relationship between the political character of a nation and its people’s experience. He rejects Montesquieu’s suggestion that political character is caused by external contingencies such as the country’s climate. Initially, Godwin seems prepared to argue that good government produces virtuous people. He wants to establish that the political and moral character of a nation is not static; rather, it is capable of progressive change. Subsequently, he makes clear that a society of progressively virtuous people requires progressively less governmental interference. He is contesting Burke’s arguments for tradition and stability, but readers who hoped that Godwin would go on to argue for a rapid, or violent, revolution were to be disappointed. There is even a Burkean strain in his view that sudden change can risk undoing political and social progress by breaking the interdependency between people’s intellectual and emotional worlds and the social and political worlds they inhabit. He wants a gradual march of opinions and ideas. The restlessness he argues for is intellectual, and it is encouraged in individuals by education.

Unlike Thomas Paine and Mary Wollstonecraft in their responses to Burke, Godwin rejects the language of rights. Obligations precede rights and our fundamental obligation is to do what we can to benefit society as a whole. If we do that, we act justly; if we act with a view to benefit only ourselves or those closest to us, we act unjustly. A close family relationship is not a sufficient reason for a moral preference, nor is social rank. Individuals have moral value according to their potential utility. In a fire your duty would be to rescue someone like Archbishop Fénelon, a benefactor to humankind, rather than, say, a member of your own family. (Fénelon’s 1699 didactic novel The Adventures of Telemachus, Son of Ulysses criticised European monarchies and advocated universal brotherhood and human rights; it influenced Rousseau’s philosophy of education.) It seems, then, that it is the consequences of one’s actions that make them right or wrong, that Godwin’s moral philosophy is a form of utilitarianism. However, Mark Philp (1986) argues that Godwin’s position is more accurately characterised as a form of perfectionism: one’s intentions matter and these, crucially, are improvable.

What makes our intentions improvable is our capacity for private judgment. As Godwin has often been unfairly described, both in his own day and more recently, as a cold-hearted rationalist, it is important to clarify what he means by ‘judgment’. It involves a scrupulous process of weighing relevant considerations (beliefs, feelings, pleasures, alternative opinions, potential consequences) in order to reach a reasonable conclusion. In the third edition (SP 6–8), he implies that motivating force is not restricted to feelings (passions, desires), but includes preferences of all kinds. The reason/passion binary is resisted. An existing opinion or intellectual commitment might be described as a feeling, as something which pleases us and earns a place in the deliberative process. In his Reply to Parr, Godwin mentions that the choice of saving Fénelon could be viewed as motivated by the love of the man’s excellence or by an eagerness ‘to achieve and secure the welfare and improvement of millions’ (1801: 41). Furthermore, any kind of feeling that comes to mind thereby becomes ratiocinative or cognitive; the mind could not otherwise include it in the comparing and balancing process. Godwin rejects the reason/passion binary most explicitly in Book VIII of Political Justice, ‘On Property’. The word ‘passion’, he tells us, is mischievous, perpetually shifting its meaning. Intellectual processes that compare and balance preferences and other considerations are perfectible (improvable); the idea that passions cannot be corrected is absurd, he insists. The only alternative position would be that the deliberative process is epiphenomenal, something Godwin could not accept. (For the shifting meaning of ‘passion’ in this period, and its political significance, see Hewitt, 2017.)

Judgments are unavoidably individual in the sense that the combination of relevant considerations in a particular case is bound to be unique, and also in the sense that personal integrity and autonomy are built into the concept of judgment. If we have conscientiously weighed all the relevant considerations, we cannot be blamed for trusting our own judgment over that of others or the dictates of authority. Nothing—no person or institution, certainly not the government—can provide more perfect judgments. Only autonomous acts, Godwin insists, are moral acts, regardless of actual benefit. Individual judgments are fallible, but our capacity for good judgment is perfectible (SP 6). Although autonomous and impartial judgments might not produce an immediate consensus, conversations and a conscientious consideration of different points of view help us to refine our judgment and to converge on moral truths.

In the first edition of Political Justice, it is the mind’s predisposition for truth that motivates our judgments and actions; in later editions, when it is said to be feelings that motivate, justice still requires an exercise of impartiality, a divestment of our own predilections (SP 4). Any judgment that fails the impartiality test would not be virtuous because it would not be conducive to truth. Godwin is not distinguishing knowledge from mere belief by specifying truth and justified belief conditions; rather, he is specifying the conditions of virtuous judgments: they intentionally or consciously aim at truth and impartiality. A preference for the general good is the dominant motivating passion when judgments are good and actions virtuous. The inclusion in the deliberation process of all relevant feelings and preferences arises from the complexity involved in identifying the general good in particular circumstances. Impartiality demands that we consider different options conscientiously; it does not preclude sometimes judging it best to benefit our friends or family.

Is the development of human intellect a means to an end or an end in itself? Is it intrinsically good? Is it the means to achieving the good of humankind or is the good of humankind the development of intellect? If the means and the end are one and the same, then, as Mark Philp (1986) argues, Godwin cannot be counted, straightforwardly at least, a utilitarian, even though the principle of utility plays a major role in delineating moral actions. If actions and practices with the greatest possible utility are those which promote the development of human intellect, universal benevolence and happiness must consist in providing the conditions for intellectual enhancement and the widest possible diffusion of knowledge. The happiest and most just society would be the one that achieved this for all.

When the capacity for private judgment has been enhanced, and improvements in knowledge and understanding have been achieved, individuals will no longer require the various forms of coercion and constraint that government and law impose on them, and which currently inhibit intellectual autonomy (SP 3). In time, Godwin speculates, mind could be so enhanced in its capacities that it will conquer physical processes such as sleep, even death. At the time he was mocked for such speculations, but their boldness is impressive: science and medicine have greatly prolonged the average lifespan, farm equipment (as he foretold) really can plough fields without human control, and research continues into the feasibility (and desirability) of immortality.

Anticipating the arguments of John Stuart Mill, Godwin argues that truth is generated by intellectual liberty and the duty to speak candidly and sincerely in robust dialogue with others whose judgments differ from one’s own. Ultimately, a process of mutual individual and societal improvement would evolve, including changes in opinion. Godwin’s anarchistic vision of future society anticipates the removal of the barriers to intellectual equality and justice and the widest possible access to education and to knowledge.

3. Educational Implications of Godwin’s Philosophy

a. Progressive Education

Godwin’s interest in progressive education was revealed as early as July 1783 when the Morning Herald published An Account of the Seminary. This was the prospectus for a school—‘For the Instruction of 12 Pupils in the Greek, Latin, French and English Languages’—that he planned to open in Epsom, Surrey. It is unusually philosophical for a school prospectus. It asserts, for example, that when children are born their minds are tabula rasa, blank sheets susceptible to impressions; that by nature we are equal; that freedom can be achieved by changing our modes of thinking; that moral dispositions and character derive from education and from ignorance. The school’s curriculum would focus on languages and history, but the ‘book of nature’ would be preferred to human compositions. The prospectus criticizes Rousseau’s system for its inflexibility and existing schools for failing to accommodate children’s pursuits to their capacities. Small group tuition would be preferred to Rousseauian solitary tutoring. Teachers would not be fearsome: ‘There is not in the world,’ Godwin writes, ‘a truer object of pity than a child terrified at every glance, and watching with anxious uncertainty the caprices of a pedagogue’. Although nothing transpired because too few pupils were recruited, the episode reveals how central education was becoming to Godwin’s political and social thinking. In the Index to the third edition of Political Justice, there are references to topics such as education’s effects on the human mind, arguments for and against a national education system, the danger of education being a producer of fixed opinions and a tool of national government. Discussions of epistemological, psychological, and political questions with implications for education are frequent. What follows aims to synthesize Godwin’s ideas about education and to draw out some implications.

Many of Godwin’s ideas about education are undoubtedly radical, but they are not easily assimilated into the child-centred progressivism that traces its origin back to Rousseau. Godwin, like Wollstonecraft, admired Rousseau’s work, but they both took issue with aspects of the model of education described in Emile, or On Education (1762). Rousseau believed a child’s capacity for rationality should be allowed to grow stage by stage, not be forced. Godwin sees the child as a rational soul from birth. The ability to make and to grasp inferences is essential to children’s nature, and social communication is essential to their flourishing. Children need to develop, and to refine, the communication and reasoning skills that will allow them to participate in conversations, to learn, and to start contributing to society’s progressive improvement. A collision of opinions in discussions refines judgment. This rules out a solitary education of the kind Emile experiences. Whatever intellectual advancement is achieved, diversity of opinion will always be a condition of social progress, and discussion, debate, disagreement (‘conversation’) will remain necessary in education.

Unlike Rousseau, Godwin does not appear to be especially concerned with stages of development, with limits to learning or reading at particular ages. He is not as concerned as Rousseau is about the danger of children being corrupted by what they encounter. We know that his own children read widely and were encouraged to write, to think critically, to be imaginative. They listened to and learned from articulate visitors such as Coleridge. Godwin’s interest in children’s reading encouraged him to start the Juvenile Library. One publication was an English dictionary, to which Godwin prefixed A New Guide to the English Tongue. He hoped to inspire children with the inclination to ‘dissect’ their words, to be clear about the primary and secondary ideas they represent. The implication is that the development of linguistic judgment is closely connected with the development of epistemic judgment, with the capacity for conveying truths accurately and persuasively. The kind of interactive dialogue that he believes to be truth-conducive would require mutual trust and respect. There would be little point in discussion, in a collision of ideas, if one could not trust the other participants to exercise the same linguistic and epistemic virtues as oneself. Judgment might be private but education for Godwin is interpersonal.

A point on which Godwin and Rousseau agree is that children are not born in sin, nor do they have a propensity to evil. Godwin is explicit in connecting their development with the intellectual ethos of their early environment, the opinions that have had an impact on them when they were young. Some of these opinions are inevitably false and harmful, especially in societies in which a powerful hierarchy intends children to grow up taking inequalities for granted. As their opinions and thinking develop through early childhood to adulthood, it is important that individuals learn to think independently and critically in order to protect themselves from false and corrupt opinions.

Godwin does not advocate the kind of manipulative tutoring to which Rousseau’s Emile is subjected; nor does he distinguish between the capacities or needs of boys and girls in the way that Rousseau does in his discussion of the education appropriate to Emile’s future wife, Sophie. According to Rousseau, a woman is formed to please a man, to be subjected to him, and therefore requires an education appropriate to that role. Mary Wollstonecraft, in Chapter 3 of A Vindication of the Rights of Woman, had similarly rejected Rousseau’s differentiation. Another difference is that, whereas Rousseau intends education to produce citizens who will contribute to an improved system of government, Godwin intends education to produce individuals with the independence of mind to contribute to a society that requires only minimal governmental or institutional superintendence.

b. Education, Epistemology, and Language

Underlying Godwin’s educational thinking are important epistemological principles. In acquiring skills of communication, understanding, reasoning, discussion, and judgment, children acquire the virtue of complete sincerity or truthfulness. Learning is understanding, not memorisation. Understanding is the percipience of truth and requires sincere conviction. One cannot be said to have learned or to know or to have understood something, and one’s conduct cannot properly be guided by it, unless one has a sincere conviction of its truth. The connection between reason and conduct is crucial. Correct conduct is accessible to reason, to conscientious judgment. When they are given reasons for acting one way rather than another, children must be open to being convinced. This suggests that pedagogy should emphasise explanation and persuasion rather than monological direct instruction. Moral education is important in regard to conduct, but, as all education prepares individuals to contribute to the general good, all education is moral education.

Godwin gives an interesting analysis of the concept of truth, especially in the second and third editions of Political Justice. Children will need to learn that private judgment cannot guarantee truth. Not only are judgments clearly fallible, but—at least by the third edition—‘truth’ for Godwin does not indicate a transcendental idea, with an existence independent of human minds or propositions. ‘True’ propositions are always tentative, correctable on the basis of further evidence. The probability of a proposition being true can only be assessed by an active process of monitoring available evidence. Although Godwin frequently refers to truth, misleadingly perhaps, as ‘omnipotent’, he can only mean that the concept provides a standard, a degree of probability that precludes reasonable doubt. This suggests that ‘conviction’ is an epistemic judgment that there is sufficient probability to warrant avowal.

The reason why Godwin tends to emphasize truth rather than knowledge may be that we cannot transmit knowledge because we cannot transmit the rational conviction that would turn a reception of a truth into the epistemic achievement of knowing. Each recipient of truths must supply their own conviction via their own private judgment. Godwin insists that we should take no opinions on trust without independent thought and conviction. Judgments need to be refreshed to ensure that what was in the general interest previously still is. When we bind ourselves to the wisdom of our ancestors, to articles of faith or outdated teachings, we are inhibiting individual improvement and the general progress of knowledge. Conviction comes with a duty to bear witness, to pass on the truth clearly and candidly in ‘conversations’. The term ‘conversation’ implies a two-way, open-ended exchange, with at least the possibility of challenge. Integrity would not permit a proposition with an insufficient degree of probability to be conveyed without some indication of its lesser epistemic status, as with conjectures or hearsay. In modern terms, appreciating the difference in the epistemic commitments implicated by different speech acts, such as assertions, confessions, and speculations, would be important to the child’s acquisition of linguistic and epistemic skills or virtues.

c. Education, Volition, and Necessitarianism

Another aspect of Godwin’s philosophy that makes children’s education in reasoning and discussion important is his account of volition and voluntary choice. If a judgment produced no volition, it could be overruled by hidden or unconscious feelings or desires, and there would be no prospect of developing self-control. Disinterested deliberation would be a delusion and moral education would be powerless. Although Godwin made concessions concerning reason’s role in the motivation of judgments and actions, and in time developed doubts about the potential for improving the human capacity for impartiality, he did not alter the central point that it is thoughts that are present to mind, cognitive states with content, that play a role in motivation. Not all thoughts are inferences. By the time passions or desires, or any kind of preference, become objects of awareness, they are ratiocinative; the intellect is necessarily involved in emotion and desire. This ensures there is a point in developing critical thinking skills, in learning to compare and balance conscientiously whatever preferences and considerations are present to mind.

Godwin admits that some people are more able than others to conquer their appetites and desires; nevertheless, he thinks all humans share a common nature and can, potentially, achieve the same level of self-control, allowing judgment to dominate. This suggests that learning self-control should be an educational priority. Young people are capable of being improved, not by any form of manipulative training, coercion, or indoctrination, but by an education that promotes independence of mind through reflective reading and discussion. He is confident that a society freed from governmental institutions and power interests would strengthen individuals’ resistance to self-love and allow them to identify their own interests with the good of all. It would be through education that they would learn what constitutes the general good and, therefore, what their duties are. Although actions motivated by a passion for the general good are virtuous, they still require a foundation in knowledge and understanding.

The accusation that Godwin had too optimistic a view of the human capacity for disinterested rationality and self-control was one made by contemporaries, including Thomas Malthus. In later editions of Political Justice, reason is represented as a capacity for deliberative prudence, a capacity that can be developed and refined even to the extent of exercising control over sexual desire. Malthus doubted that most people would ever be capable of the kind of prudence and self-control that Godwin anticipated. Malthus’s arguments pointed towards a refusal to extend benevolence to the poor and oppressed; Godwin’s pointed towards generosity and equity.

The influence on Godwin’s perfectionism of the rational Dissenters, especially Richard Price and Joseph Priestley, is most apparent in the first edition of Political Justice. He took from them, and also from David Hartley and Jonathan Edwards, the doctrine of philosophical necessity, according to which a person’s life is part of a chain of causes extending through eternity ‘and through the whole period of his existence, in consequence of which it is impossible for him to act in any instance otherwise than he has acted’ (PJ I: 385; Bk IV, vi). Thoughts, and therefore judgments, are not exceptions: they succeed each other according to necessary laws. What stops us from being mere automatons is the fact that experience creates habits of mind which compose our moral and epistemic character, the degree of perfection in our weighing of preferences in pursuit of truth. The more rational, or perfect, our wills have become, the more they subordinate other considerations to truth. But the course of our lives, including our mental deliberations, is influenced by our desires and passions and by external intrusions, including by government, so to become autonomous we need to resist distortions and diversions. Experience and active participation in candid discussion help to develop our judgment and cognitive capacities, and as this process of improvement spreads through society, the need for government intervention and coercion reduces.

In revising this account of perfectionism and necessitarianism for the second and third editions of Political Justice, Godwin attempts to keep it compatible with the more positive role he then allows desire and passion. The language shifts towards a more Humean account of causation, whereby regularity and observed concurrences are all we are entitled to use in explanations and predictions, and patterns of feeling are more completely absorbed into our intellectual character. Godwin’s shift towards empiricism and scepticism is apparent, too, in the way truth loses much of its immutability and teleological attraction. This can be viewed as a reformulation rather than a diminution of reason, at least in so far as the changes do not diminish the importance of rational autonomy. We think and act autonomously, Godwin might say, when our judgments are in accordance with our character—that is, with our individual combination of moral and epistemic virtues and vices, which we maintain or improve by conscientiously monitoring and recalibrating our opinions and preferences. Autonomy requires that we do not escape the trajectory of our character but do try to improve it.

It is important to Godwin that we can make a conceptual distinction between voluntary and involuntary actions. He would not want young people to become fatalistic as a consequence of learning about scientific determinism, and yet he did not believe people should be blamed or made to suffer for their false opinions and bad actions: the complexity in the internal and environmental determinants of character is too great for that. Wordsworth for one accepted the compatibility of these positions. ‘Throw aside your books of chemistry,’ Hazlitt reports him saying to a student, ‘and read Godwin on Necessity’ (Hazlitt, 2000: 280).

d. Government, the State, and Education

For Godwin, progress towards the general good is delineated by progressive improvement in education and the development of private judgment. The general good is sometimes referred to by Godwin in utilitarian terms as ‘happiness’, although he avoids the Benthamite notion of the greatest happiness of the greatest number; and there is no question of pushpin being as good as poetry. A just society is a happy society for all, not just because individual people are contented but because they are contented for a particular reason: they enjoy a society, an egalitarian democracy, that allows them to use their education and intellectual development for the general good, including the good of future generations. A proper appreciation of the aims of education will be sufficient inspiration for children to want to learn; they will not require the extrinsic motivation of rewards and sanctions.

Godwin’s critique of forms of government, in Book V of Political Justice, is linked to their respective merits or demerits in relation to education. The best form of government is the one that ‘least impedes the activity and application of intellectual powers’ (PJ II: 5; Bk V, i). A monarchy gives power to someone whose judgment and understanding have not been developed by vulnerability to the vicissitudes of fortune. All individuals need an education that provides not only access to books and conversation but also to experience of the diversity of minds and characters. The pampered, protected education of a prince inculcates epistemic vices such as intellectual arrogance and insouciance. He is likely to be misled by flatterers and be saved from rebellion only by the servility, credulity, and ignorance of the populace. No one person, not even an enlightened and virtuous despot, can match a deliberative assembly for breadth of knowledge and experience. A truly virtuous monarch, even an elected one, would immediately abolish the constitution that brought him to power. Any monarch is in the worst possible position to choose the best people for public office or to take responsibility for errors, and yet his subjects are expected to be guided by him rather than by justice and truth.

Similar arguments apply to aristocracies, to presidential systems, to any constitution that invests power in one person or class, that divides rulers from the people, including by a difference in access to education. Heredity cannot confer virtue or wisdom; only education, leisure and prosperity can explain differences of that kind. In a just society no one would be condemned to stupidity and vice. ‘The dissolution of aristocracy is equally in the interest of the oppressor and the oppressed. The one will be delivered from the listlessness of tyranny, and the other from brutalising operation of servitude’ (PJ II: 99; Bk V, xi).

Godwin recognises that democracy, too, has weaknesses, especially representative democracy. Uneducated people are likely to misjudge characters, be deceived by meretricious attractions or dazzled by eloquence. The solution is not epistocracy but an education for all that allows people to trust their own judgment, to find their own voice. Representative assemblies might play a temporary role, but when the people as a whole are more confident and well-informed, a direct democracy would be preferable. Secret ballots encourage timidity and inconstancy, so decisions and elections should be settled by an open vote.

The close connection between Godwin’s ideas about education and his philosophical anarchism is clear. Had he been less sceptical about government involvement in education, he might have embraced more immediately implementable education policies. His optimism derives from a belief that the less interference there is by political institutions, the more likely people are to be persuaded by arguments and evidence to prefer virtue to vice, impartial justice to self-love. It is not the ‘whatever is, is right’ optimism of Leibniz, Pope, Bolingbroke, Mandeville, and others; clearly, things can and should be better than they are. Complacency about the status quo benefits only the ruling elites. The state restricts reason by imposing false standards and self-interested values that limit the ordinary person’s sense of his or her potential mental capacities and contribution to society. Godwin’s recognition of a systemic denial of a voice to all but an elite suggests that his notion of political and educational injustice compares with what Miranda Fricker (2007) calls epistemic injustice. Social injustice for Godwin just is epistemic injustice in that social evils derive from ignorance, systemic prejudices, and inequalities of power; and epistemic injustice, ultimately, is educational injustice.

A major benefit of the future anarchistic society will be the reduction in drudgery and toil, and the increase in leisure time. Godwin recognises that the labouring classes especially are deprived of time in which to improve their minds. He welcomes technology such as printing, which helps to spread knowledge and literacy, but abhors such features of industrialisation as factories, the division of labour that makes single purpose machines of men, women, and children, and a commercial system that keeps the masses in poverty and makes a few opulently wealthy. Increased leisure and longevity create time for education and help to build the stock of educated and enlightened thinkers. Social and cultural improvement results from this accretion. Freed from governmental interference, education will benefit from a free press and increased exposure to a diversity of opinion. Godwin expresses ‘the belief that once freed from the bonds of outmoded ideas and educational practices, there was no limit to human abilities, to what men could do and achieve’ (Simon, 1960: 50). It is a mistake, Godwin writes towards the end of Political Justice, to assume that inequality in the distribution of what conduces to the well-being of all, education included, is recognised only by the ‘lower orders’. The beneficiaries of educational inequality, once brought to an appreciation of what constitutes justice, will inevitably initiate change. The diffusion of education will be initiated by an educated elite, but local discussion and reading groups will play a role: the educated and the less educated bearing witness to their own knowledge, passing it on and learning from frank conversation.

Unlike Paine and Wollstonecraft, Godwin does not advocate a planned national or state system of mass education. Neither the state nor the church could be trusted to develop curricula and pedagogical styles that educate children in an unbiased way. He is wary of the possibility of a mass education system levelling down, of reducing children to a ‘naked and savage equality’ that suits the interests of the ruling elite. Nor could we trust state-accredited teachers to be unbiased or to model open-mindedness and explorative discussion. He puts his faith, rather, in the practices of a just community, one in which a moral duty to educate all children is enacted without restraint. Presumably, each community would evolve its own practices and make progressive improvements. The education of its children, and of adults, would find a place within the community’s exploration of how to thrive without government regulation and coercion. Paine wanted governmental involvement in a mass literacy movement, and Wollstonecraft wanted a system of coeducational schools for younger children, but Godwin sees a danger in any proposal that systematizes education.

Godwin’s vision of society does not allow him to specify in any detail a particular curriculum. Again, to do so would come too close to institutionalising education, inhibiting local democratic choice and diversity. He does, however, advocate epistemic practices which have pedagogical implications. Children should be taught to venerate truth, to enquire, to present reasons for belief, to reject as prejudice beliefs unsupported by evidence, to examine objections. ‘Refer them to reading, to conversation, to meditation; but teach them neither creeds nor catechisms, neither moral nor political’ (PJ II: 300; Bk VI, viii). In The Enquirer he writes: ‘It is probable that there is no one thing that it is of eminent importance for a child to learn. The true object of juvenile education, is to provide, against the age of five and twenty, a mind well regulated, active, and prepared to learn’ (1797: 77-78).

In the essay ‘Of Public and Private Education’, Godwin considers the advantages and disadvantages of education by private tutor rather than by public schooling. He concludes by wondering whether there might be a middle way: ‘Perhaps an adventurous and undaunted philosophy would lead to the rejecting them altogether, and pursuing the investigation of a mode totally dissimilar’ (1797: 64). His criticisms of both are reinforced in his novel Mandeville, in which the main character is educated privately by an evangelical minister, and then sent, unhappily, to Winchester College; he experiences both modes as an imposition on his liberty and natural dispositions. Certainly, Godwin’s ideas rule out traditional schools, with set timetables and curricula, with authoritarian teachers, ‘the worst of slaves’, whose only mode of teaching is direct instruction, and deferential pupils who ‘learn their lessons after the manner of parrots’ (1797: 81). The first task of a teacher, Godwin suggests in the essay ‘Of the Communication of Knowledge’, is to provide pupils with an intrinsic motive to learn—that is, with ‘a perception of the value of the thing learned’ (1797: 78). This is easiest if the teacher follows the pupil’s interests and facilitates his or her enquiries. The teacher’s task then is to smooth the pupil’s path, to be a consultant and a participant in discussions and debates, modelling the epistemic and linguistic virtues required for learning with and from each other. The pupil and the ‘preceptor’ will be co-learners and the forerunners of individuals who, in successive generations, will develop increasingly wise and comprehensive views.

In Godwin’s view, there will never be a need for a national system of pay or accreditation, but there will be a need, in the short-term, for leadership by a bourgeois educated elite. It is interesting to compare this view with Coleridge’s idea of a ‘clerisy’, a permanent national intellectual elite, most fully developed by Coleridge in On the Constitution of the Church and State (1830). The term ‘clerisy’ refers to a state-sponsored group of intellectual and learned individuals who would diffuse indispensable knowledge to the nation, whose role would be to humanize, cultivate, and unify. Where Godwin anticipates an erosion of differences of rank and an equitable education for all, Coleridge wants education for the labouring classes to be limited, prudentially, to religion and civility, with a more extensive liberal education for the higher classes. The clerisy is a secular clergy, holding the balance between agricultural and landed interests on the one hand, and mercantile and professional interests on the other. Sages and scholars in the frontline of the physical and moral sciences would serve also as the instructors of a larger group whose role would be to disseminate knowledge and culture to every ‘parish’. Coleridge discussed the idea with Godwin, but very little in it could appeal to a philosopher who anticipated a withering away of the national state; nor could Godwin have agreed with the idea of a permanent intellectual class accredited and paid by the state, or with the idea of a society that depended for its unity on a permanently maintained intelligentsia. Coleridge’s idea put limits on the learning of the majority and denied them the freedom, and the capacity, to pursue their own enquiries and opinions—as did the national education system that developed in Britain in the nineteenth and twentieth centuries.

Godwin’s educational ideas have had little direct impact. They were not as well-known as those of Rousseau to later progressivist educational theorists and practitioners. He had, perhaps, an over-intellectualised conception of children’s development, and too utopian a vision of the kind of society in which his educational ideas could flourish. Nevertheless, it is interesting that his emphasis on autonomous thinking and critical discussion, on equality and justice in the distribution of knowledge and understanding, and his awareness of how powerful interests and dominant ideologies are insinuated through education, are among the key themes of modern educational discourse. The way in which his ideas about education are completely integral to his anarchist political philosophy is one reason why he deserves attention from philosophers of education, as well as from political theorists.

4. Godwin’s Philosophical Anarchism

a. Introduction

Godwin was the first to argue for anarchism from first principles. The examination of his ideas about education has introduced important aspects of his anarchism, including the preference for local community-based practices, rather than any national systems or institutions. His anarchism is both individualistic and socially oriented. He believes that the development of private judgment enables an improved access to truth, and truth enables progression towards a just society. Monarchical and aristocratic modes of government, together with any form of authority based on social rank or religion, are inconsistent with the development of private judgment. Godwin’s libertarianism in respect of freedom of thought and expression deserves recognition, but his commitment to sincerity and candour, to speech that presumes to assert as true only what is epistemically sound, means that not all speech is epistemically responsible. Nor is all listening responsible: free speech, like persuasive argument, requires a fair-minded and tolerant reception. To prepare individuals and society for the responsible exercise of freedom of thought and expression is a task for education.

Godwin was a philosophical anarchist. He did not specify ways in which like-minded people should organise or build a mass movement. Even in the 1790s, when the enthusiasm for the French Revolution was at its height, he was cautious about precipitating unrest. With regard to the practical politics of his day, he was a liberal Whig, never a revolutionary. But the final two Books of Political Justice take Godwin’s anarchism forward with arguments concerning crime and punishment (Book VII) and property (Book VIII). It is here that some of his most striking ideas are to be found, and where he engages with practical policy issues as well as with philosophical principles.

b. Punishment

Godwin sees punishment as inhumane and cruel. In keeping with his necessitarianism, he cannot accept that criminals make a genuinely free choice to commit a crime: ‘the assassin cannot help the murder he commits any more than the dagger’ (PJ II: 324; Bk VII, i). Human beings are not born into sin, but neither are they born virtuous. Crime is caused environmentally, by social circumstances, by ignorance, inequality, oppression. When the wealthy acknowledge this, they will recognise that if their circumstances and those of the poor were reversed, so, too, would be their crimes. Therefore, Godwin rejects the notions of desert and retributive justice. Only the future benefit that might result from punishment matters, and he finds no evidence that suffering is ever beneficial. Laws, like all prescriptions and prohibitions, condemn the mind to imbecility, alienating it from truth, inviting insincerity when obedience is coerced. Laws, and all the legal and penal apparatus of states, weaken us morally and intellectually by causing us to defer to authority and to ignore our responsibilities.

Godwin considers various potential justifications of punishment. It cannot be justified by the future deterrent effect on the same offender, for a mere suspicion of criminal conduct would justify it. It cannot be justified by its reformative effect, for patient persuasion would be more genuinely effective. It cannot be justified by its deterrent effect on non-offenders, for then the greatest possible suffering would be justified because that would have the greatest deterrent effect. Any argument for proportionality would be absurd because how can that be determined when there are so many variables of motivation, intention, provocation, harm done? Laws and penal sentences are too inflexible to produce justice. Prisons are seminaries of vice, and hard labour, like slavery of any kind, is evil. Only for the purposes of temporary restraint should people ever be deprived of their liberty. A radical alternative to punishment is required.

The development of individuals’ capacities for reason and judgment will be accompanied by a gradual emancipation from law and punishment. The community will apply its new spirit of independence to advance the general good. Simpler, more humane and just practices will emerge. The development of private judgment will enable finer distinctions, and better understanding, to move society towards genuine justice. When people trust themselves and their communities to shoulder responsibility as individuals, they will learn to be ‘as perspicacious in distinguishing, as they are now indiscriminate in confounding, the merit of actions and characters’ (PJ II: 412; Bk VI, viii).

c. Property

Property, Godwin argues, is responsible for oppression, servility, fraud, malice, revenge, fear, selfishness, and suspicion. The abolition—or, at least, transformation—of property will be a key achievement of a just society. If I have a superfluity of loaves and one loaf would save a starving neighbour’s life, to whom does that loaf justly belong? Equity is determined by benefit or utility: ‘Every man has a right to that, the exclusive possession of which being awarded to him, a greater sum of benefit or pleasure will result, than could have arisen from its being otherwise appropriated’ (PJ II: 423; Bk VIII, i).

It is not just a question of subsistence, but of all indispensable means of improvement and happiness. It includes the distribution of education, skills, and knowledge. The poor are kept in ignorance while the rich are honoured and rewarded for being acquisitive, dissipated, and indolent. Leisure would be more evenly distributed if the rich man’s superfluities were removed, and this would allow more time for intellectual improvement. Godwin’s response to the objection that a superfluity of property generates excellence—culture, industry, employment, decoration, arts—is that all these would increase if leisure and intellectual cultivation were evenly distributed. Free from oppression and drudgery, people would discover new pleasures and capacities. They will see the benefit of their own exertions to the general good ‘and all will be animated by the example of all’ (PJ II: 488; Bk VIII, iv).

Godwin addresses another objection to his egalitarianism in relation to property: the impossibility of its being rendered permanent. We might see equality as desirable but lack the capacity to sustain it; human nature will always reassert itself. To this Godwin’s response is that equality can be sustained if the members of the community are sufficiently convinced that it is just and that it generates happiness. Only the current ‘infestation of mind’ could see inequality dissolve, happiness increase, and be willing to sacrifice that. In time people will grow less vulnerable to greed, flattery, fame, power, and more attracted to simplicity, frugality, and truth.

But if we choose to receive no more than our just share, why should we impose this restriction on others, why should we impose on their moral independence? Godwin replies that moral error needs to be censured frankly and contested by argument and persuasion, but we should govern ourselves ‘through no medium but that of inclination and conviction’ (PJ II: 497; Bk VIII, vi). If a conflict between the principle of equality and the principle of independent judgment appears, priority should go with the latter. The proper way to respect other people’s independence of mind is to engage them in discussion and seek to persuade them. Conversation remains, for Godwin, the most fertile source of improvement. If people trust their own opinions and resist all challenges to them, they are serving the community because the worst possible state of affairs would be a clockwork uniformity of opinion. This is why education should not seek to cast the minds of children in a particular mould.

In a society built on anarchist principles, property will no longer provide an excuse for the exploitation of other people’s time and labour; but it will still exist to the extent that each person retains items required for their welfare and day-to-day subsistence. They should not be selfish or jealous of them. If two people dispute an item, Godwin writes, let justice, not law, decide between them. All will serve on temporary juries for resolving disputes or agreeing on practices, and all will have the capacity to do so without fear or favour.

d. Response to Malthus

The final objection to his egalitarian strictures on property is addressed in the Political Justice chapter ‘Of the objection to this system from the principle of population’ (Bk VIII, vii). The objection raises the possibility that an egalitarian world might become too populous to sustain human life. Godwin argues that if this were to threaten human existence, people would develop the strength of mind to overcome the urge to propagate. Combined with the banishment of disease and increased longevity—even perhaps the achievement of immortality—the nature of the world’s population would change. Long life, extended education, a progressive improvement in concentration, a reduced need for sleep, and other advances, would result in a rapid increase in wisdom and benevolence. People would find ways to keep the world’s population at a sustainable level.

This chapter, together with the essay ‘Of Avarice and Profusion’ (The Enquirer, 1797), contributed to Thomas Malthus’s decision to write An Essay on the Principle of Population, first published in 1798. He argued that Godwin was too optimistic about social progress. They met and discussed the question amicably, and a response was included in Godwin’s Reply to Dr Parr, but his major response, Of Population, was not published until 1820, by which time Malthus’s Essay was into its fifth, greatly expanded, edition. Godwin argues against Malthus’s geometrical ratio for population increase and his arithmetical ratio for the increase in food production, drawing where possible on international census figures. He looks to mechanisation, to the untapped resources of the sea, to an increase in crop cultivation rather than meat production, and to chemistry’s potential for producing new foodstuffs. With regard to sexual passions, he repeats his opinion from Political Justice that men and women are capable of immense powers of restraint, and with regard to the Poor Laws, which Malthus wished to abolish, he argues that they are better for the poor than no support at all. Where Malthus argued for lower wages for the poor, Godwin argued for higher pay, to redistribute wealth and stimulate the economy.

When Malthus read Of Population, he rather sourly called it ‘the poorest and most old-womanish performance to have fallen from a writer of note’. The work shows that Godwin remained positive about the capacity of humankind to overcome misery and to achieve individual and social improvement. He knew that if Malthus was right, hopes for radical social progress, and even short-term relief for the poor and oppressed, were futile.

5. Godwin’s Fiction

a. Caleb Williams (1794)

Godwin wrote three minor novels before he wrote Political Justice. They had some success, but nothing like that of the two novels he completed in the 1790s. Caleb Williams and St. Leon were not only the most successful and intellectually ambitious of his novels but were also the two that relate most closely to his philosophical work of the 1790s. He wrote two more novels that were well received: Fleetwood; or, The New Man of Feeling (1805) and Mandeville, a Tale of the Seventeenth Century in England (1817). His final two novels, Cloudesley (1830) and Deloraine (1833), were more romantic and less successful.

Things As They Are; or The Adventures of Caleb Williams is both a study of individual psychology and a continuation, or popularization, of Godwin’s critical analysis of English society in Political Justice. It explores how aristocracy insinuates authority and deference throughout society. One of the two main characters, Falkland, is a wealthy philanthropist whose tragic flaw is a desire to maintain at any cost his reputation as an honourable and benevolent gentleman. The other, Caleb, is his bright, self-educated servant with insatiable curiosity. Caleb admires Falkland, but he begins to suspect that it was his master who murdered the uncouth and boorish neighbouring squire, Barnabas Tyrrel. When the opportunity arises for him to search the contents of a mysterious chest in Falkland’s library, Caleb cannot resist. He is discovered by Falkland and learns the truth from him. Not only was Falkland the murderer, but he had allowed innocent people to die for the crime. He is driven to protect his reputation and honour at any cost. Caleb is chased across the country, and around Europe, by Falkland’s agents. He is resourceful and courageous in eluding them, but Falkland’s power and resources are able to wear him down and bring him to court, where Falkland and Caleb face each other. They are both emotionally, psychologically, and physically exhausted. In different ways, both have been persecuted and corrupted by the other, and yet theirs is almost a love story. The trial establishes the facts as far as they interest the law, but it is not the whole truth: not, from a moral perspective, in terms of true guilt and innocence, and not from a psychological perspective.

Caleb’s adventures during his pursuit across Britain and Europe allow us to see different aspects of human character and psyche, and of the state of society. Caleb recounts his adventures himself, allowing the reader to infer the degree to which he is reliably understanding and confessing his own moral and psychological decline. He espouses principles of sincerity and candour, but his narrative shows the difficulty of being truly honest with oneself. The emotional and mental effects of his persecution are amplified by his growing paranoia.

The novel was recognised as an attack on values and institutions embedded in English society, such as religion, law, prisons, inequality, social class, the abuse of power, and aristocratic notions of honour. One of the more didactic passages occurs when Caleb is visited in prison by Thomas, a fellow servant. Thomas looks at the conditions in which Caleb is kept—shackled and without even straw for a bed—and exclaims, ‘Zounds, I have been choused. They told me what a fine thing it was to be an Englishman, and about liberty and property, and all that there; and I find it is all flam’ (2009: 195). In another episode, Caleb encounters a group of bandits. Their leader, Raymond, justifies their activities to Caleb: ‘We undertake to counteract the partiality and iniquity of public institutions. We, who are thieves without a licence, are at open war with another set of men, who are thieves according to law… we act, not by choice, but only as our wise governors force us to act’ (2009: 209).

It is also a story of communication failure, of mutual distrust and resentment that could have been resolved by conversation. Caleb’s curiosity made him investigate the chest for himself, rather than openly confront Falkland with his suspicions. Both men have failed to exercise their private judgment independently of the values and expectations of their social situation. By the end of the novel, any hope of resolution has evaporated: a frank and rational discussion at the right time could have achieved it. It was, at least in part, the social environment—social inequality—that created their individual characters and the communication barrier.

As well as themes from Political Justice, there are echoes of the persecution and surveillance of British radicals at the time of writing and of the false values, as Godwin saw them, of Burke’s arguments in favour of tradition and aristocracy, of ‘things as they are’. It is not surprising that the novel was especially praised by readers with radical views. In his character sketch of Godwin (in The Spirit of the Age), Hazlitt wrote that ‘no one ever began Caleb Williams that did not read it through: no one that ever read it could possibly forget it, or speak of it after any length of time but with an impression as if the events and feelings had been personal to himself’ (Hazlitt, 2000: 288).

b. St. Leon: A Tale of the Sixteenth Century (1799)

Despite its historical setting, St. Leon is as concerned as Caleb Williams is with the condition of contemporary society and with themes from Political Justice. Gary Kelly (1976) has coupled St. Leon with Caleb Williams as an English Jacobin novel (together with works by Elizabeth Inchbald, Robert Bage, and Thomas Holcroft), and Pamela Clemit (1993) classes them as Rational or Godwinian novels (together with works by Mary Shelley and the American novelist Charles Brockden Brown). They are certainly philosophical novels. St. Leon is also an historical novel in that its setting in sixteenth century Europe is accurately depicted, and it is a Gothic novel in that it contains mystery, horror, arcane secrets, and dark dungeons. B. J. Tysdahl (1981) refers to its ‘recalcitrant Gothicism’. When Lord Byron asked why he did not write another novel, Godwin replied that it would kill him. ‘And what matter,’ Byron replied, ‘we should have another St. Leon’.

The central character and narrator, St. Leon, is as imbued with the values of his own country, class, and period as Falkland. At the start of the novel, he is a young French nobleman in thrall to chivalric values and anxious to create a great reputation as a knight. A high point of his youth is his attendance at the Field of the Cloth of Gold in 1520, when Francis I of France and Henry VIII of England met in awe-inspiring splendour, as if to mark the end of medievalism. A low point is when the French are defeated at the Battle of Pavia. St. Leon’s education had prepared him for a chivalric way of life; its passing leaves him unprepared for a world with more commercial values. His hopes of aristocratic glory are finally destroyed by an addiction to gambling. He loses his wealth and the respect of his son, Charles, and might have lost everything had he been married to a less extraordinary woman. Marguerite sees their financial ruin as a blessing in disguise, and for a period the family enjoys domestic contentment in a humble setting in Switzerland.

This changes when St. Leon encounters a stranger who has selected him to succeed to the possession of arcane knowledge. The alchemical secrets he is gifted—the philosopher’s stone and the elixir of life—restore his wealth and give him immortality. He seizes the opportunity to make amends to his family and to society by becoming the world’s benefactor. But the gift turns out to be a curse. His wife dies, his philanthropic schemes fail, and he becomes an outcast, mistrusted and alienated forever. Generations pass; St. Leon persists but sees himself as a monster undeserving of life. Only by unburdening himself of the alchemical knowledge, as the stranger had done, could he free himself to die. Otherwise, he must live forever a life of deceit and disguise. As the narrator, he cannot provide clues even to the recipients of his narration, in whatever age we might live. We pity him but we cannot entirely trust him. Even as a narrator he is suspected. As in Caleb Williams, the impossibility of candour and truthfulness is shown to be corrupting, and as in Mary Shelley’s Frankenstein, unique knowledge and a unique form of life are shown to bring desolation in the absence of affection, trust, and communication.

We can interpret St. Leon as a renewal of Godwin’s critique of Burke and of the British mixture of tradition and commercialism. We can see in Marguerite a tribute to Mary Wollstonecraft. Is there also, as Gary Kelly suggests (1976: 210), a parallel between the radical philosophers of the late eighteenth century—polymaths like Joseph Priestley and Richard Price, perhaps, or Godwin and Wollstonecraft themselves—and the alchemical adept whose knowledge and intentions society suspects and is unprepared for? Writing St. Leon so shortly after the death of Wollstonecraft, when he is enduring satirical attacks, Godwin must have felt himself in danger of becoming isolated and insufficiently appreciated. We can see the novel as pessimistic, reflecting Godwin’s doubts about the potential for radical change in his lifetime. But Godwin well knew that alchemy paved the way for chemical science, so perhaps the message is more optimistic: what seems like wishful thinking today will lead us to tomorrow’s accepted wisdom.

6. Conclusion

Godwin died on the cusp of the Victorian age, having played a part in the transition from the Enlightenment to Romanticism. His influence persisted as Political Justice reached a new, working-class readership through quotation in Owenite and Chartist pamphlets and a cheap edition published in 1842, and his ideas were discussed at labour movement meetings. His novels influenced Dickens, Poe, Hawthorne, Balzac, and others. According to Marshall (1984: 392), Marx knew of Godwin through Engels, but disagreed with his individualism and about which social class would be the agent of reform. Of the great anarchist thinkers who came after him, Bakunin does not refer to him, Tolstoy does but may not have read him directly; Kropotkin, however, hailed him as the first to define the principles of anarchism.

Godwin’s political philosophy can appear utopian, and his view of the potential for human improvement naively optimistic, but his ideas still have resonance and relevance. As a moral philosopher, he has not received sufficient credit for his version of utilitarian principles, contemporaneous with Bentham’s, a version that anticipates John Stuart Mill’s. He was both intellectually courageous in sticking to his fundamental principles, and conscientious in admitting to errors. Unlike Malthus, he believed the conditions of the poor and oppressed can and should be improved. He was confident that an egalitarian democracy free of government interference would allow individuals to thrive. One of his most important contributions to social and political theory is his analysis of how educational injustice is a primary source of social injustice. The journey to political justice begins and ends with educational justice.

7. References and Further Reading

a. Works by William Godwin

i. Early Editions of An Enquiry Concerning Political Justice

  • 1793. An Enquiry Concerning Political Justice, and Its Influence on General Virtue and Happiness. First edition. 2 vols. London: G.G. and J. Robinson.
  • 1796. An Enquiry Concerning Political Justice, and Its Influence on General Virtue and Happiness. Second edition. 2 vols. London: G.G. and J. Robinson.
  • 1798. An Enquiry Concerning Political Justice, and Its Influence on General Virtue and Happiness. Third edition. 2 vols. London: G.G. and J. Robinson.

ii. Other Editions of An Enquiry Concerning Political Justice

  • 1946. An Enquiry Concerning Political Justice. F. E. L. Priestley (ed.). 3 vols. Toronto: University of Toronto Press.
    • This is a facsimile of the third edition. Volume 3 contains variants from the first and second editions.
  • 2013. An Enquiry Concerning Political Justice. Mark Philp (ed.). Oxford World’s Classics. Oxford: Oxford University Press.
    • This is based on the text of the 1793 first edition. In addition to an introduction by Mark Philp, it includes a chronology of Godwin’s life and explanatory notes.
  • 2015. Enquiry Concerning Political Justice: And Its Influence On Morals And Happiness. Isaac Kramnick (ed.). London: Penguin.
    • This is based on the text of the 1798 third edition. It includes the Summary of Principles. Introduction and Editor’s Notes by Isaac Kramnick.

iii. Collected Editions of Godwin’s Works and Correspondence

  • 1992. Collected Novels and Memoirs of William Godwin. 8 vols. Mark Philp (ed.). London: Pickering and Chatto Publishers, Ltd.
    • A scholarly series that includes Memoirs of the Author of a Vindication of the Rights of Woman as well as the text of all Godwin’s fiction and some unpublished pieces.
  • 1993. Political and Philosophical Writings of William Godwin. 7 vols. Mark Philp (ed.). London: Pickering and Chatto Publishers, Ltd.
    • A scholarly edition of Godwin’s principal political and philosophical works, including some previously unpublished pieces. Volume 1 includes a complete bibliography of Godwin’s works and political essays. Volume 2 contains the remaining political essays. Volume 3 contains the text of the first edition of Political Justice; volume 4 contains variants from the second and third editions. Volumes 5 and 6 contain educational and literary works, including The Enquirer essays. Volume 7 includes Godwin’s final (unfinished) work, published posthumously: The Genius of Christianity Unveiled.
  • 2011, 2014. The Letters of William Godwin. Volume 1: 1778–1797, Volume 2: 1798–1805. Pamela Clemit (ed.). Oxford: Oxford University Press.
    • A projected six-volume series.

iv. First Editions of Other Works by Godwin

  • 1783. An Account of the Seminary That Will Be Opened on Monday the Fourth Day of August at Epsom in Surrey. London: T. Cadell.
  • 1784. The Herald of Literature, as a Review of the Most Considerable Publications That Will Be Made in the Course of the Ensuing Winter. London: J. Murray.
  • 1794a. Cursory Strictures on the Charge Delivered by Lord Chief Justice Eyre to the Grand Jury. London: D. I. Eaton.
  • 1794b. Things As They Are; or The Adventures of Caleb Williams. 3 vols. London: B. Crosby.
  • 1797. The Enquirer: Reflections on Education, Manners and Literature. London: G.G. and J. Robinson.
  • 1798. Memoirs of the Author of a Vindication of the Rights of Woman. London: J. Johnson.
  • 1799. St. Leon, A Tale of the Sixteenth Century. 4 vols. London: G.G. and J. Robinson.
  • 1801. Thoughts Occasioned by the Perusal of Dr. Parr’s Spital Sermon, Preached at Christ Church, April 15, 1800: Being a Reply to the Attacks of Dr. Parr, Mr. Mackintosh, the Author of an Essay on Population, and Others. London: G.G. and J. Robinson.
  • 1805. Fleetwood; or, The New Man of Feeling. 3 vols. London: R. Phillips.
  • 1817. Mandeville, a Tale of the Seventeenth Century in England. 3 vols. London: Longman, Hurst, Rees, Orme and Brown.
  • 1820. Of Population. An Enquiry Concerning the Power of Increase in the Numbers of Mankind, Being an Answer to Mr. Malthus’s Essay on That Subject. London: Longman, Hurst, Rees, Orme and Brown.
  • 1824. History of the Commonwealth of England from Its Commencement to Its Restoration. 4 vols. London: H. Colburn.
  • 1831. Thoughts on Man, His Nature, Productions, and Discoveries. Interspersed with Some Particulars Respecting the Author. London: Effingham Wilson.

v. Online Resources

  • 2010. The Diary of William Godwin. Victoria Myers, David O’Shaughnessy, and Mark Philp (eds.). Oxford: Oxford Digital Library. http://godwindiary.bodleian.ox.ac.uk/index2.html.
    • Godwin kept a diary from 1788 to 1836. It is held by the Bodleian Library, University of Oxford as part of the Abinger Collection. Godwin recorded meetings, topics of conversation, his reading and writing in succinct notes.

vi. Other Editions of Selected Works by Godwin

  • 1986. Romantic Rationalist: A William Godwin Reader. Peter Marshall (ed.). London: Freedom Press.
    • Contains selections from Godwin’s works, arranged by theme.
  • 1988. Caleb Williams. Maurice Hindle (ed.). London: Penguin Books.
  • 1994. St. Leon. Pamela Clemit (ed.). Oxford World’s Classics. Oxford: Oxford University Press.
  • 2005. Godwin on Wollstonecraft: Memoirs of the Author of a Vindication of the Rights of Woman. Richard Holmes (ed.). London: Harper Perennial.
  • 2009. Caleb Williams. Pamela Clemit (ed.). Oxford World’s Classics. Oxford: Oxford University Press.
  • 2019. Fleetwood. Classic Reprint. London: Forgotten Books.
  • 2019. Mandeville: A Tale of the Seventeenth Century in England. Miami, Fl: Hard Press Books.

b. Biographies of Godwin

  • Brailsford, H. N. 1951. Shelley, Godwin and Their Circle. Second edition. Home University Library of Modern Knowledge. Oxford: Oxford University Press.
  • Brown, Ford K. 1926. The Life of William Godwin. London: J. M. Dent and Sons.
  • Clemit, Pamela (ed.). 1999. Godwin. Lives of the Great Romantics III: Godwin, Wollstonecraft and Mary Shelley by their Contemporaries. Volume 1. London: Pickering and Chatto.
  • Goulbourne, Russell, and David Higgins (eds.). 2017. Jean-Jacques Rousseau and British Romanticism: Gender and Selfhood, Politics and Nation. London: Bloomsbury.
  • Hazlitt, William. 2000. ‘William Godwin’ in The Fight and Other Writings. Tom Paulin (ed.). London: Penguin.
  • Locke, Don. 1980. A Fantasy of Reason: The Life and Thought of William Godwin. London: Routledge and Kegan Paul.
    • This is described as a ‘philosophical biography’.
  • Marshall, Peter. 1984. William Godwin. New Haven: Yale University Press.
    • A new edition is entitled William Godwin: Philosopher, Novelist, Revolutionary. PM Press, 2017. The text appears the same. A standard biography.
  • Paul, Charles Kegan. 1876. William Godwin: His Friends and Contemporaries. 2 vols. London: H. S. King.
    • An early and thorough biography, with important manuscript material.
  • St Clair, William. 1989. The Godwins and the Shelleys: The Biography of a Family. London: Faber and Faber.
  • Thomas, Richard Gough. 2019. William Godwin: A Political Life. London: Pluto Press.
  • Woodcock, George. 1946. William Godwin: A Biographical Study. London: Porcupine Press.

c. Social and Historical Background

  • Butler, Marilyn. 1984. Burke, Paine, Godwin and the Revolution Controversy. Cambridge: Cambridge University Press.
  • Grayling, A. C. 2007. Towards the Light: The Story of the Struggles for Liberty and Rights. London: Bloomsbury.
  • Hay, Daisy. 2022. Dinner with Joseph Johnson: Books and Friendship in a Revolutionary Age. London: Chatto and Windus.
    • A study of the regular dinners held by the radical publisher, whose guests included Godwin, Wollstonecraft, Fuseli, Blake, and many other writers, artists, and radicals.
  • Hewitt, Rachel. 2017. A Revolution in Feeling: The Decade that Forged the Modern Mind. London: Granta.
  • Norman, Jesse. 2013. Edmund Burke: Philosopher Politician Prophet. London: William Collins.
  • Philp, Mark. 2020. Radical Conduct: Politics, Sociability and Equality in London 1789–1815. Cambridge UK: Cambridge University Press.
    • A study of the radical intellectual culture of the period and of Godwin’s position within it.
  • Simon, Brian. 1960. Studies in the History of Education, 1780 – 1870. London: Lawrence and Wishart.
  • Tomalin, Claire. 1974. The Life and Death of Mary Wollstonecraft. London: Weidenfeld and Nicolson.
  • Uglow, Jenny. 2014. In These Times: Living in Britain Through Napoleon’s Wars 1798 – 1815. London: Faber and Faber.

d. Other Secondary Sources in Philosophy, Education, Fiction, and Anarchism

  • Bottoms, Jane. 2004. ‘“Awakening the Mind”: The Educational Philosophy of William Godwin’. History of Education 33 (3): 267–82.
  • Claeys, Gregory. 1983. ‘The Concept of “Political Justice” in Godwin’s Political Justice.’ Political Theory 11 (4): 565–84.
  • Clark, John P. 1977. The Philosophical Anarchism of William Godwin. Princeton: Princeton University Press.
  • Clemit, Pamela. 1993. The Godwinian Novel. Oxford: Clarendon Press.
  • Crowder, George. 1991. Classical Anarchism: The Political Thought of Godwin, Proudhon, Bakunin and Kropotkin. Oxford: Oxford University Press.
  • Eltzbacher, Paul. 1960. Anarchism: Seven Exponents of the Anarchist Philosophy. London: Freedom Press.
  • Fleisher, David. 1951. William Godwin: A Study of Liberalism. London: Allen and Unwin.
  • Fricker, Miranda. 2007. Epistemic Injustice: Power and the Ethics of Knowing. Oxford: Oxford University Press.
  • Kelly, Gary. 1976. The English Jacobin Novel 1780 – 1805. Oxford: Clarendon Press.
  • Knights, B. 1978. The Idea of the Clerisy in the Nineteenth Century. Cambridge UK: Cambridge University Press.
  • Lamb, Robert. 2006. ‘The Foundations of Godwinian Impartiality’. Utilitas 18 (2): 134–53.
  • Lamb, Robert. 2009. ‘Was William Godwin a Utilitarian?’ Journal of the History of Ideas 70 (1): 119–41.
  • Manniquis, Robert, Myers, Victoria. 2011. Godwinian Moments: From Enlightenment to Romanticism. Toronto: University of Toronto/Clark Library UCLA.
  • Marshall, Peter. 2010. Demanding the Impossible: A History of Anarchism. Oakland, CA: PM Press.
  • Mee, Jon. 2011. ‘The Use of Conversation: William Godwin’s Conversable World and Romantic Sociability’. Studies in Romanticism 50 (4): 567–90.
  • Monro, D.H. 1953. Godwin’s Moral Philosophy. Oxford: Oxford University Press.
  • O’Brien, Eliza, Stark, Helen, Turner, Beatrice (eds.) 2021. New Approaches to William Godwin: Forms, Fears, Futures. London: Palgrave/MacMillan.
  • Philp, Mark. 1986. Godwin’s Political Justice. London: Duckworth.
    • A detailed analysis of Godwin’s major philosophical work.
  • Pollin, Burton R. 1962. Education and Enlightenment in the Works of William Godwin. New York: Las Americas Publishing Company.
    • Still the most thorough study of Godwin’s educational thought.
  • Scrivener, Michael. 1978. ‘Godwin’s Philosophy Re-evaluated’. Journal of the History of Ideas 39: 615–26.
  • Simon, Brian, (ed). 1972. The Radical Tradition in Education in Great Britain. London: Lawrence and Wishart.
  • Singer, Peter, Leslie Cannold, Helga Kuhse. 1995. ‘William Godwin and the Defence of Impartialist Ethics’. Utilitas 7 (1): 67–86.
  • Suissa, Judith. 2010. Anarchism and Education: A Philosophical Perspective. Second edition. Oakland, CA: PM Press.
  • Tysdahl, B J. 1981. William Godwin as Novelist. London: Athlone Press.
  • Weston, Rowland. 2002. ‘Passion and the “Puritan Temper”: Godwin’s Critique of Enlightened Modernity’. Studies in Romanticism 41 (3): 445–70.
  • Weston, Rowland. 2013. ‘Radical Enlightenment and Antimodernism: The Apostasy of William Godwin (1756–1836)’. Journal for the Study of Radicalism 7 (2): 1–30.

Author Information

Graham Nutbrown
Email: gn291@bath.ac.uk
University of Bath
United Kingdom

Noam Chomsky (1928 – )

Noam Chomsky is an American linguist who has had a profound impact on philosophy. Chomsky’s linguistic work has been motivated by the observation that nearly all adult human beings have the ability to effortlessly produce and understand a potentially infinite number of sentences. For instance, it is very likely that before now you have never encountered this very sentence you are reading, yet if you are a native English speaker, you easily understand it. While this ability often goes unnoticed, it is a remarkable fact that every developmentally normal person gains this kind of competence in their first few years, no matter their background or general intellectual ability. Chomsky’s explanation of these facts is that language is an innate and universal human property, a species-wide trait that develops as one matures in much the same manner as the organs of the body. A language is, according to Chomsky, a state obtained by a specific mental computational system that develops naturally and whose exact parameters are set by the linguistic environment that the individual is exposed to as a child. This definition, which is at odds with the common notion of a language as a public system of verbal signals shared by a group of speakers, has important implications for the nature of the mind.

Over decades of active research, Chomsky’s model of the human language faculty—the part of the mind responsible for the acquisition and use of language—has evolved from a complex system of rules for generating sentences to a more computationally elegant system that consists essentially of just constrained recursion (the ability of a function to apply itself repeatedly to its own output). What has remained constant is the view of language as a mental system that is based on a genetic endowment universal to all humans, an outlook that implies that all natural languages, from Latin to Kalaallisut, are variations on a Universal Grammar, differing only in relatively unimportant surface details. Chomsky’s research program has been revolutionary but contentious, and critics include prominent philosophers as well as linguists who argue that Chomsky discounts the diversity displayed by human languages.

Chomsky is also well known as a champion of liberal political causes and as a trenchant critic of United States foreign policy. However, this article focuses on the philosophical implications of his work on language. After a biographical sketch, it discusses Chomsky’s conception of linguistic science, which often departs sharply from other widespread ideas in this field. It then gives a thumbnail summary of the evolution of Chomsky’s research program, especially the points of interest to philosophers. This is followed by a discussion of some of Chomsky’s key ideas on the nature of language, language acquisition, and meaning. Finally, there is a section covering his influence on the philosophy of mind.

Table of Contents

  1. Life
  2. Philosophy of Linguistics
    1. Behaviorism and Linguistics
    2. The Galilean Method
    3. The Nature of the Evidence
    4. Linguistic Structures
  3. The Development of Chomsky’s Linguistic Theory
    1. Logical Constructivism
    2. The Standard Model
    3. The Extended Standard Model
    4. Principles and Parameters
    5. The Minimalist Program
  4. Language and Languages
    1. Universal Grammar
    2. Plato’s Problem and Language Acquisition
    3. I vs. E languages
    4. Meaning and Analyticity
    5. Kripkenstein and Rule Following
  5. Cognitive Science and Philosophy of Mind
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Avram Noam Chomsky was born in Philadelphia in 1928 to Jewish parents who had immigrated from Russia and Ukraine. He manifested an early interest in politics and, from his teenage years, frequented anarchist bookstores and political circles in New York City. Chomsky attended the University of Pennsylvania at the age of 16, but he initially found his studies unstimulating. After meeting the mathematical linguist Zellig Harris through political connections, Chomsky developed an interest in language, taking graduate courses with Harris and, on his advice, studying philosophy with Nelson Goodman. Chomsky’s 1951 undergraduate honors thesis, on Modern Hebrew, would form the basis of his MA thesis, also from the University of Pennsylvania. Although Chomsky would later have intellectual fallings out with both Harris and Goodman, they were major influences on him, particularly in their rigorous approach, informed by mathematics and logic, which would become a prominent feature of his own work.

After earning his MA, Chomsky spent the next four years with the Society of Fellows at Harvard, where he had applied largely because of his interest in the work of W.V.O. Quine, a Harvard professor and major figure in analytic philosophy. This would later prove to be somewhat ironic, as Chomsky’s work developed into the antithesis of Quine’s behaviorist approach to language and mind. In 1955, Chomsky was awarded his doctorate and became an assistant professor at the Massachusetts Institute of Technology, where he would continue to work as an emeritus professor even after his retirement in 2002. Throughout this long tenure at MIT, Chomsky produced an enormous volume of work in linguistics, beginning with the 1957 publication of Syntactic Structures. Although his work initially met with indifference or even hostility, including from his former mentors, it gradually altered the very nature of the field, and Chomsky grew to be widely recognized as one of the most important figures in the history of language science. Since 2017, he has been a laureate professor in the linguistics department at the University of Arizona.

Throughout his career, Chomsky has been at least as prolific in social, economic, and political criticism as in linguistics. Chomsky became publicly outspoken about his political views with the escalation of the Vietnam War, which he always referred to as an “invasion”. He was heavily involved in the anti-war movement, sometimes risking both his professional and personal security, and was arrested several times. He remained politically active and, among many other causes, was a vocal critic of US interventions in Latin America during the 1980s, the reaction to the September 2001 attacks, and the invasion of Iraq. Chomsky has opposed, since his early youth, the capitalist economic model and supported the Occupy movement of the early 2010s. He has also been an unwavering advocate of intellectual freedom and freedom of speech, a position that has at times pitted him against other left-leaning intellectuals and caused him to defend the rights of others who have very different views from his own. Despite the speculations of many biographers, Chomsky has always denied any connection between his work in language and politics, sometimes quipping that someone was allowed to have more than one interest.

In 1949, Chomsky married the linguist Carol Doris Chomsky (née Schatz), a childhood friend from Philadelphia. They had three children and remained married until her death in 2008. Chomsky married Valeria Wasserman, a Brazilian professional translator, in 2014.

2. Philosophy of Linguistics

Chomsky’s approach to linguistic science, indeed his entire vision of what the subject matter of the discipline consists of, is a sharp departure from the attitudes prevalent in the mid-20th century. To simplify, prior to Chomsky, language was studied as a type of communicative behavior, an approach that is still widespread among those who do not accept Chomsky’s ideas. In contrast, his focus is on language as a type of (often unconscious) knowledge. The study of language has, as Chomsky states, three aspects: determining what the system of knowledge a language user has consists of, how that knowledge is acquired, and how that knowledge is used. A number of points in Chomsky’s approach are of interest to the philosophy of linguistics and to the philosophy of science more generally, and some of these points are discussed below.

a. Behaviorism and Linguistics

When Chomsky was first entering academia in the 1950s, the mainstream school of linguistics for several decades had been what is known as structuralism. The structuralist approach, endorsed by Chomsky’s mentor Zellig Harris, among others, concentrated on analyzing corpora, or records of the actual use of a language, either spoken or written. The goal of the analysis was to identify patterns in the data that might be studied to yield, among other things, the grammatical rules of the language in question. Reflecting this focus on language as it is used, structuralists viewed language as a social phenomenon, a communicative tool shared by groups of speakers. Structuralist linguistics might well be described as consisting of the study of what happens between a speaker’s mouth and a listener’s ear; as one well-known structuralist put it, “the linguist deals only with the speech signal” (Bloomfield, 1933: 32). This is in marked contrast to Chomsky and his followers, who concentrate on what is going on in the mind of a speaker and who look there to identify grammatical rules.

Structuralist linguistics was itself symptomatic of behaviorism, a paradigm prominently championed in psychology by B.F. Skinner and in philosophy by W.V.O. Quine and which was dominant in the midcentury. Behaviorism held that science should restrict itself to observable phenomena. In psychology, this meant seeking explanations entirely in terms of external behavior without discussing minds, which are, by their very nature, unobservable. Language was to be studied in terms of subjects’ responses to stimuli and their resulting verbal output. Behaviorist theories were often formed on the basis of laboratory experiments in which animals were conditioned by being given food rewards or tortured with electric shock in order to shape their behavior. It was thought that human behavior could be similarly explained in terms of conditioning that shapes reactions to specific stimuli. This approach perhaps reached its zenith with the publication of Skinner’s Verbal Behavior (1957), which sought to reduce human language to conditioned responses. According to Skinner, speakers are conditioned as children, through training by adults, to respond to stimuli with an appropriate verbal response. For example, a child might realize that if they see a piece of candy (the stimulus) and respond by saying “candy”, they might be rewarded by adults with the desired sweet, reinforcing that particular response. For an adult speaker, the pattern of stimuli and response could be very complex, and what specific aspect of a situation is being responded to might be difficult to ascertain, but the underlying principle was held to be the same.

Chomsky’s scathing 1959 review of Verbal Behavior has actually become far better known than the original book. Although Chomsky conceded to Skinner that the only data available for the study of language consisted of what people say, he denied that meaningful explanations were to be found at that level. He argued that in order to explain a complex behavior, such as language use, exhibited by a complex organism such as a human being, it is necessary to inquire into the internal organization of the organism and how it processes information. In other words, it was necessary to make inferences about the language user’s mind. Elsewhere, Chomsky likened the procedure of studying language to what engineers would do if confronted with a hypothetical “black box”, a mysterious machine whose input and output were available for inspection but whose internal functioning was hidden. Merely detecting patterns in the output would not be accepted as real understanding; instead, that would come from inferring what internal processes might be at work.

Chomsky particularly criticized Skinner’s theory that utterances could be classified as responses to subtle properties of an object or event. The observation that human languages seem to exhibit stimulus-freedom goes back at least to Descartes in the 17th century, and about the same time as Chomsky was reviewing Skinner, the linguist Charles Hockett (later one of Chomsky’s most determined critics) suggested that this is one of the features that distinguish human languages from most examples of animal communication. For instance, a vervet monkey will give a distinct alarm call any time she spots an eagle and at no other times. In contrast, a human being might say anything or nothing in response to any given stimulus. Viewing a painting, one might say, “Dutch…clashes with the wallpaper…. Tilted, hanging too low, beautiful, hideous, remember our camping trip last summer? or whatever else might come to our minds when looking at a picture.” (Chomsky, 1959:2). What aspect of an object, event, or environment triggers a particular response rather than another can only be explained in mental terms. The most relevant fact is what the speaker is thinking about, so a true explanation must take internal psychology into account.

Chomsky’s observation concerning speech was part of his more general criticism of the behaviorist approach. Chomsky held that attempts to explain behavior in terms of stimuli and responses “will be in general a vain pursuit. In all but the most elementary cases, what a person does depends in large measure on what he knows, believes, and anticipates” (Chomsky, 2006: xv). This was also meant to apply to the behaviorist and empiricist philosophy exemplified by Quine. Although Quine has remained important in other aspects of analytic philosophy, such as logic and ontology, his behaviorism is largely forgotten. Chomsky is widely regarded as having inaugurated the era of cognitive science as it is practiced today, that is, as a study of the mental.

b. The Galilean Method

Chomsky’s fundamental approach to doing science was and remains different from that of many other linguists, not only in his concentration on mentalistic explanation. One approach to studying any phenomenon, including language, is to amass a large amount of data, look for patterns, and then formulate theories to explain those patterns. This method, which might seem like the obvious approach to doing any type of science, was favored by structuralist linguists, who valued the study of extensive catalogs of actual speech in the world’s languages. The goal of the structuralists was to provide descriptions of a language at various levels, starting with the analysis of pronunciation and eventually building up to a grammar for the language that would be an adequate description of the regularities identifiable in the data.

In contrast, Chomsky’s method is to concentrate not on a comprehensive analysis but rather on certain crucial data, or data that is better explained by his theory than by its rivals. This sort of methodology is often called “Galilean”, since it takes as its model the work of Galileo and Newton. These physicists, judiciously, did not attempt to identify the laws of motion by recording and studying the trajectory of as many moving objects as possible. In the normal course of events, the exact paths traced by objects in motion are the results of the complex interactions of numerous phenomena such as air resistance, surface friction, human interference, and so on. As a result, it is difficult to clearly isolate the phenomena of interest. Instead, the early physicists concentrated on certain key cases, such as the behavior of masses in free fall or even idealized fictions such as objects gliding over frictionless planes, in order to identify the principles that, in turn, could explain the wider data. For similar reasons, Chomsky doubts that the study of actual speech (what he calls performance) will yield theoretically important insights. In a widely cited passage (Chomsky, 1962: 531), he noted that:

Actual discourse consists of interrupted fragments, false starts, lapses, slurring, and other phenomena that can only be understood as distortions of an underlying idealized pattern.

Like the ordinary movements of objects observable in nature, which Galileo largely ignored, actual speech performance is likely to be the product of a mass of interacting factors, such as the social conventions governing the speech exchange, the urgency of the message and the time available, the psychological states of the speakers (excited, panicked, drunk), and so on, of which purely linguistic phenomena will form only a small part. It is the idealized patterns concealed by these effects and the mental system that generates those patterns (the underlying competence possessed by language users) that Chomsky regards as the proper subject of linguistic study. (Although the terms competence and performance have been superseded by the I-Language/E-Language distinction, discussed in 4.c. below, these labels are fairly entrenched and still widely used.)

c. The Nature of the Evidence

Early in his career (1965), Chomsky specified three levels of adequacy that a theory of language should satisfy, and this has remained a feature of his work. The first level is observational, to determine what sentences are grammatically acceptable in a language. The second is descriptive, to provide an account of what the speaker of the language knows, and the third is explanatory, to give an explanation of how such knowledge is acquired. Only the observational level can be attained by studying what speakers actually say, which cannot provide much insight into what they know about language, much less how they came to have that knowledge. A source of information about the second and third levels, perhaps surprisingly, is what speakers do not say, and this has been a focus of Chomsky’s program. This negative data is drawn from the judgments of native speakers about what they feel they can’t say in their language. This is not, of course, in the sense of being unable to produce these strings of words or of being unable, with a little effort, to understand the intended message, but simply a gut feeling that “you can’t say that”. Chomsky himself calls these interpretable but unsayable sentences “perfectly fine thoughts”, while the philosopher Georges Rey gave them the pithier name “WhyNots”. Consider the following examples from Rey 2022 (the “*” symbol is used by linguists to mark a string that is ill-formed in that it violates some principle of grammar):

(1) * Who did John and kiss Mary? (Compared to John, and who kissed Mary? and who-initial questions like Who did John kiss?)

(2) * Who did stories about terrify Mary? (Compared to stories about who terrified Mary?)

Or the following question/answer pairs:

(3) Which cheese did you recommend without tasting? * I recommended the brie without tasting. (Compared to… without tasting it.)

(4) Have you any wool? * Yes, I have any wool.

An introductory linguistics textbook provides two further examples (O’Grady et al. 2005):

(5) * I went to movie. (Compared to I went to school.)

(6) * Mary ate a cookie, and then Johnnie ate some cake, too. (Compared to Mary ate a cookie, and then Johnnie ate a cookie too/ate a snack too.)

The vast majority of English speakers would balk at these sentences, although they would generally find it difficult to say precisely what the issue is (the textbook challenges the reader to try to explain). Analogous “WhyNot” sentences exist in every language yet studied.

What Chomsky holds to be significant about this fact is that almost no one, aside from those who are well read in linguistics or philosophy of language, has ever been exposed to (1)–(6) or any sentences like them. Analysis of corpora shows that sentences constructed along these lines virtually never occur, even in the speech of young children. This makes it very difficult to accept the explanation, favored by behaviorists, that we recognize them to be unacceptable as the result of training and conditioning. Since children do not produce utterances like (1)–(6), parents never have a chance to explain what is wrong, to correct them, and to tell them that such sentences are not part of English. Further, since they are almost never spoken by anyone, it is vanishingly unlikely that a parent and child would overhear them so that the parent could point them out as ill-formed. Neither is this knowledge learned through formal instruction in school. Instruction in not saying sentences like (1)–(6) is not a part of any curriculum, and an English speaker who has never attended a day of school is as capable of recognizing the unacceptability of (1)–(6) as any college graduate.

Examples can be multiplied far beyond (1)–(6); there are indefinite numbers of strings of English words (or words of any language) that are comprehensible but unacceptable. If speakers are not trained to recognize them as ill-formed, how do they acquire this knowledge? Chomsky argues that this demonstrates that human beings possess an underlying competence capable of forming and identifying grammatical structures (words, phrases, clauses, and sentences) in a way that operates almost entirely outside of conscious awareness, computing over structural features of language that are not actually pronounced or written down but which are critical to the production and understanding of sentences. This competence and its acquisition are the proper subject matter for linguistic science, as Chomsky defines the field.

d. Linguistic Structures

An important part of Chomsky’s linguistic theory (although it is an idea that predates him by several decades and is also endorsed by some rival theories) is that it postulates structures that lie below the surface of language. The presence of such structures is supported by, among other evidence, considering cases of non-linear dependency between the words in a sentence, that is, cases where a word modifies another word that is some distance away in the linear order of the sentence as it is pronounced. For instance, in the sentence (from Berwick and Chomsky, 2017: 117):

(7) Instinctively, birds who fly swim.

we know that instinctively applies to swim rather than fly, indicating an unspoken connection that bypasses the three intervening words and which the language faculty of our mind somehow detects when parsing the sentence. Chomsky’s hypothesis of a dedicated language faculty (a part of the mind existing for the sole purpose of forming and interpreting linguistic structures, operating in isolation from other mental systems) is supported by the fact that nonlinguistic knowledge does not seem to be relied on to arrive at the correct interpretation of sentences such as (7). Try replacing swim with play chess. Although you know that birds instinctively fly and do not play chess, your language faculty provides the intended meaning without any difficulty. Chomsky’s theory would suggest that this is because that faculty parses the underlying structure of the sentence rather than relying on your knowledge about birds.

According to Chomsky, the dependence of human languages on these structures can also be observed in the way that certain types of sentences are produced from more basic ones. He frequently discusses the formation of questions from declarative sentences. For instance, any English speaker understands that the question form of (8) is (9), and not (10) (Chomsky, 1986: 45):

(8) The man who is here is tall.

(9) Is the man who is here tall?

(10) * Is the man who here is tall?

What rule does a child learning English have to grasp to know this? To a Martian linguist unfamiliar with the way that human languages work, a reasonable initial guess might be to move the fourth word of the sentence to the front, which is obviously incorrect. To see this, change (8) to:

(11) The man who was here yesterday was tall.

A more sophisticated hypothesis might be to move the second auxiliary verb in the sentence, is in the case of (8), to the front. But this is also not correct, as more complicated cases show:

(12) The woman who is in charge of deciding who is hired is ready to see him now.          

(13) * Is the woman who is in charge of deciding who hired is ready to see him now?

In fact, in no human language do transformations from one type of sentence to another require taking the linear order of words into account, although there is no obvious reason why they shouldn’t. A language that works on a principle such as switch the first and second words of a sentence to indicate a question is certainly imaginable and would seem simple to learn, but no language yet cataloged operates in such a way.

The correct rule in the cases of (8) through (13) is that the question is formed by moving the auxiliary verb (is) occurring in the verb phrase of the main clause of the sentence, not the one in the relative clause (a clause modifying a noun, such as who is here). Thus, knowing that (9) is the correct question form of (8) or that (13) is wrong requires sensitivity to the way that the elements of a sentence are grouped together into phrases and clauses. This is something that is not apparent on the surface of either the spoken or written forms of (8) or (12), yet a speaker with no formal instruction grasps it without difficulty. It is the study of these underlying structures and the way that the mind processes them that is the core concern of Chomskyan linguistics, rather than the analysis of the strings of words actually articulated by speakers.
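The contrast between a linear rule and a structure-dependent rule can be made concrete with a small Python sketch. The three-part representation of a sentence (subject phrase, main-clause auxiliary, remaining predicate), the list of auxiliaries, and the function names are hypothetical simplifications introduced here for illustration; actual syntactic analyses use much richer tree structures.

```python
# Toy illustration only: the data layout and names below are assumptions,
# not part of Chomsky's formalism.
AUXILIARIES = {"is", "was"}

def linear_rule(words):
    """Front the first auxiliary in the flat word string (a purely linear rule)."""
    i = next(k for k, w in enumerate(words) if w in AUXILIARIES)
    return [words[i]] + words[:i] + words[i + 1:]

def structural_rule(subject, aux, predicate):
    """Front the main-clause auxiliary, leaving the subject phrase untouched."""
    return [aux] + subject + predicate

def as_question(words):
    """Join words into a capitalized question string."""
    return " ".join(words).capitalize() + "?"

# Sentence (8), split into its subject phrase (which contains the relative
# clause "who is here"), the main-clause auxiliary, and the predicate.
subject = ["the", "man", "who", "is", "here"]
aux = "is"
predicate = ["tall"]

flat = subject + [aux] + predicate
linear_result = as_question(linear_rule(flat))
structural_result = as_question(structural_rule(subject, aux, predicate))

print(linear_result)      # the ill-formed string (10)
print(structural_result)  # the well-formed question (9)
```

The linear rule has access only to word order and so picks the auxiliary inside the relative clause, while the structural rule can succeed only because the grouping of words into phrases is given to it in advance. That grouping is exactly the information that is absent from the surface string.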

3. The Development of Chomsky’s Linguistic Theory

Chomsky’s research program, which has grown to involve the work of many other linguists, is closely associated with generative linguistics. This name refers to the project of identifying sets of rules (grammars) that will generate all and only the sentences of a language. Although explicit rules eventually drop out of the picture, replaced by more abstract “principles”, the goal remains to identify a system that can produce the potentially infinite number of sentences of a human language using the resources contained in the mind of a speaker, which are necessarily finite.

Chomsky’s work has implications for the study of language as a whole, but his concentration has been on syntax. This branch of linguistic science is concerned with the grammars that govern the production of sentences that are acceptable in a language and divide them from nonacceptable strings of words, as opposed to semantics, the part of linguistics concerned with the meaning of words and sentences, and pragmatics, which studies the use of language in context.

Although the methodological principles have remained constant from the start, Chomsky’s theory has undergone major changes over the years, and various iterations may seem, at least on a first look, to have little obvious common ground. Critics present this as evidence that the program has been stumbling down one dead end after another, while Chomsky asserts in response that rapid evolution is characteristic of new fields of study and that changes in a program’s guiding theory are evidence of healthy intellectual progress. Five major stages of development might be identified, corresponding to the subsections below. Each stage, proponents maintain, builds on the previous ones; superseded iterations should be regarded not as false but as replaced by a more complete explanation.

a. Logical Constructivism

Chomsky’s theory of language began to be codified in the 1950s, first set down in a massive manuscript that was later published as The Logical Structure of Linguistic Theory (1975) and then partially in the much shorter and more widely read Syntactic Structures (1957). These books differed significantly from later iterations of Chomsky’s work in that they were more an attempt to show what an adequate theory of natural language would need to look like than an attempt to fully work out such a theory. The focus was on demonstrating how a small set of rules could operate over a finite vocabulary to generate an infinite number of sentences, as opposed to identifying a psychologically realistic account of the processes actually occurring in the mind of a speaker.

Even before Chomsky, since at least the 1930s, the structure of a sentence was thought to consist of a series of phrases, such as noun phrases or verb phrases. In Chomsky’s early theory, two sorts of rules governed the generation of such structures. Basic structures were given by rewrite rules, procedures that indicate the more basic constituents of structural components. For example,

S → NP VP

indicates that a noun phrase, NP, followed directly by a verb phrase, VP, constitutes a sentence, S. Similarly, “NP → N” indicates that a noun may constitute a noun phrase. Eventually, the application of these rewrite rules stops when every constituent of a structure has been replaced by a syntactic element, a lexical word such as Albert or meows. Transformation rules alter those basic structures in various ways to produce structures corresponding to complex sentences. Importantly, certain transformation rules allowed recursion. This is a concept central to computer science and mathematical logic, by which a rule could be applied to its own output an unlimited number of times (for instance, in mathematics, one can start with 0 and apply the recursive function add 1 repeatedly to yield the natural numbers 0, 1, 2, 3, and so forth). The presence of recursive rules allows the embedding of structures within other structures, such as placing Albert meows under Leisa thinks to get Leisa thinks Albert meows. This could then be placed under Casey says that to produce Casey says that Leisa thinks Albert meows, and so on. Embedding could be done as many times as desired, so that recursive rules could produce sentences of any length and complexity, an important requirement for a theory of natural language. Recursion has not only remained central to subsequent iterations of Chomsky’s work but, more recently, has come to be seen as the defining characteristic of human languages.
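The interplay of rewrite rules and embedding described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `sentence`, its input format, and the rule VP → V S are hypothetical, and the recursion is folded into a rewrite rule for brevity, whereas the early theory assigned it to transformation rules.

```python
def sentence(clauses):
    """Build a word list by applying toy rewrite rules.

    `clauses` is a list of (noun, verb) pairs, outermost clause first.
    The rules used here are illustrative assumptions:
        S  -> NP VP
        NP -> N
        VP -> V      or      VP -> V S   (an S embedded inside an S)
    """
    (noun, verb), rest = clauses[0], clauses[1:]
    np = [noun]                       # NP -> N
    if rest:
        vp = [verb] + sentence(rest)  # VP -> V S: the rule re-enters S
    else:
        vp = [verb]                   # VP -> V
    return np + vp                    # S -> NP VP


simple = sentence([("Albert", "meows")])
embedded = sentence([
    ("Casey", "says that"),
    ("Leisa", "thinks"),
    ("Albert", "meows"),
])
print(" ".join(simple))
print(" ".join(embedded))
```

Because `sentence` calls itself whenever a further clause remains, the same finite set of rules yields sentences of any depth of embedding, which is the point the passage makes about recursion.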

Chomsky’s interest in rules that could be represented as operations over symbols reflected influence from philosophers inclined towards formal methods, such as Goodman and Quine. This is a central feature of Chomsky’s work to the present day, even though subsequent developments have also taken psychological realism into account. Among Chomsky’s most influential contributions from his early career (the late 1950s and early 1960s) was the invention of formal language theory, a branch of mathematics dealing with languages consisting of an alphabet of symbols from which strings could be formed in accordance with a formal grammar, a set of specific rules. The Chomsky Hierarchy provides a method of classifying formal languages according to the complexity of the strings that could be generated by the language’s grammar (Chomsky 1956). Chomsky was able to demonstrate that natural human languages could not be produced by the lowest level of grammar on the hierarchy, contrary to many linguistic theories popular at the time. Formal language theory and the Chomsky Hierarchy have continued to have applications both in linguistics and elsewhere, particularly in computer science.
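A standard textbook illustration of why the lowest level of the hierarchy is too weak uses the artificial language of strings consisting of some number of a's followed by the same number of b's. The sketch below is an assumption-laden illustration rather than anything drawn from Chomsky's own presentation: the function name `matches_nested` is hypothetical, and the pattern merely stands in for nested dependencies in natural language, such as paired if and then clauses. Recognizing the pattern requires a counter that can grow without bound, which a device with a fixed, finite number of states does not have.

```python
def matches_nested(s):
    """Return True if `s` is n a's followed by exactly n b's (n >= 0).

    The variable `depth` acts as unbounded memory: it can grow as large
    as the input demands, unlike the fixed memory of a finite-state device.
    """
    depth = 0
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:          # an 'a' after a 'b' breaks the pattern
                return False
            depth += 1
        elif ch == "b":
            seen_b = True
            depth -= 1
            if depth < 0:       # more b's than a's so far
                return False
        else:
            return False        # symbol outside the alphabet
    return depth == 0           # every 'a' was matched by a 'b'


for example in ["ab", "aabb", "aab", "abab"]:
    print(example, matches_nested(example))
```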

b. The Standard Model

Chomsky’s 1965 landmark work, Aspects of the Theory of Syntax, which devoted much space to philosophical foundations, introduced what later became known as the “Standard Model”. While the theory itself was in many respects an extension of the ideas contained in Syntactic Structures, there was a shift in explanatory goals as Chomsky addressed what he calls “Plato’s Problem”, the mystery of how children can learn something as complex as the grammar of a natural language from the sparse evidence they are presented with. The sentences of a human language are infinite in number, and no child ever hears more than a tiny subset of them, yet they master the grammar that allows them to produce every sentence in their language. (“Plato’s Problem” is an allusion to Plato’s Meno, a discussion of similar puzzles surrounding geometry. Section 4.b provides a fuller discussion of the issue as well as more recent developments in Chomsky’s model of language acquisition.) This led Chomsky, inspired by early modern rationalist philosophers such as Descartes and Leibniz, to postulate innate mechanisms that would guide a child in this process. Every human child was held to be born with a mental system for language acquisition, operating largely subconsciously, preprogrammed to recognize the underlying structure of incoming linguistic signals, identify possible grammars that could generate those structures, and then to select the simplest such grammar. It was never fully worked out how, on this model, possible grammars were to be compared, and this early picture has subsequently been modified, but the idea of language acquisition as relying on innate knowledge remains at the heart of Chomsky’s work.

An important idea introduced in Aspects was the existence of two levels of linguistic structure: deep structure and surface structure. A deep structure contains structural information necessary for interpreting sentence meaning. Transformations on a deep structure (moving, deleting, and adding elements in accordance with the grammar of a language) yield a surface structure that determines the way that the sentence is pronounced. Chomsky explained (in a 1968 lecture) that,

If this approach is correct in general, then a person who knows a specific language has control of a grammar that generates the infinite set of potential deep structures, maps them onto associated surface structures, and determines the semantic and phonetic interpretations of these abstract objects (Chomsky, 2006: 46).

Note that, for Chomsky, the deep structure was a grammatical object that contains structural information related to meaning. This is very different from conceiving of a deep structure as a meaning itself, although a theory to that effect, generative semantics, was developed by some of Chomsky’s colleagues (initiating a debate acrimonious enough to sometimes be referred to as “the linguistic wars”). The names and exact roles of the two levels would evolve over time, and they were finally dropped altogether in the 1990s (although this is not always noticed, a matter that sometimes confuses the discussion of Chomsky’s theories).

Aspects was also notable for the introduction of the competence/performance distinction, or the distinction between the underlying mental systems that give a speaker mastery of her language (competence) and her actual use of the language (performance), which will seldom fully reflect that mastery. Although these terms have technically been superseded by E-language and I-language (see 4.c), they remain useful concepts in understanding Chomsky’s ideas, and the vocabulary is still frequently used.

c. The Extended Standard Model

Throughout the 1970s, a number of technical changes, aimed at simplification and consolidation, were made to the Standard Model set out in Aspects. These gradually led to what became known as the “Extended Standard Model”. The grammars of the Standard Model contained dozens of highly specific transformation rules that successively rearranged elements of a deep structure to produce a surface structure. Eventually, a much simpler and more empirically adequate theory was arrived at by postulating only a single operation that moved any element of a structure to any place in that structure. This operation, move α, was subject to many “constraints” that limited its applications and therefore restrained what could be generated. For instance, under certain conditions, parts of a structure form “islands” that block movement (as when who is blocked from moving from the conjunction in John and who had lunch? to give *Who did John and have lunch?). Importantly, the constraints seemed to be highly consistent across human languages.

Grammars were also simplified by cutting out information that seemed to be specified in the vocabulary of a language. For example, some verbs must be followed by nouns, while others must not. Compare I like coffee and She slept to *I like and *She slept a book. Knowing which of these strings are correct is part of knowing the words like and slept, and it seems that a speaker’s mind contains a sort of lexicon, or dictionary, that encodes this type of information for each word she knows. There is no need for a rule in the grammar to state that some verbs need an object and others do not, which would just be repeating information already in the lexicon. The properties of the lexical items are therefore said to “project” onto the grammar, constraining and shaping the structures available in a language. Projection remains a key aspect of the theory, so that lexicon and grammar are thought to be tightly integrated.

Chomsky has frequently described a language as a mapping from meaning to sound. Around the time of the Extended Standard Model, he introduced a schema whereby grammar forms a bridge between the Phonetic Form, or PF, the form of a sentence that would actually be pronounced, and the Logical Form, or LF, which contained the structural specification of a sentence necessary to determine meaning. To consider an example beloved by introductory logic teachers, Everyone loves someone might mean that each person loves some person (possibly a different person in each case), or it might mean that there is some one person that everyone loves. Although these two readings have identical PFs, they have different LFs.

Linking the idea of LF and PF to that of deep structure and surface structure (now called D-structure and S-structure, and with somewhat altered roles) gives the “T-model” of language:

            D-structure
                 |
          transformations
                 |
    PF  –   S-structure   –  LF

As the diagram indicates, the grammar generates the D-structure, which contains the basic structural relations of the sentence. The D-structure undergoes transformations to arrive at the S-structure, which differs from the PF in that it still contains unpronounced “traces” in places previously occupied by an element that was then moved elsewhere. The S-structure is then interpreted in two ways: phonetically as the PF and semantically as the LF. The PF is passed from the language system to the cognitive system responsible for producing actual speech. The LF, which is not a meaning itself but contains structural information needed for semantic interpretation, is passed to the cognitive system responsible for semantics. This idea of syntactic structures and transformations over those structures as mediating between meaning and physical expression has been further developed and simplified, but the basic concept remains an important part of Chomsky’s theories.

d. Principles and Parameters

In the 1980s, the Extended Standard Model would develop into what is perhaps the best known iteration of Chomskyan linguistics, what was first referred to as “Government and Binding”, after Chomsky’s book Lectures on Government and Binding (1981). Chomsky developed these ideas further in Barriers (1986), and the theory took on the more intuitive name “Principles and Parameters”. The fundamental idea was quite simple. As with previous versions, human beings have in their minds a computational system that generates the syntactic structures linking meanings to sounds. According to Principles and Parameters Theory, all of these systems share certain fixed settings (principles) for their core components, explaining the deep commonalities that Chomsky and his followers see between human languages. Other elements (parameters) are flexible and have values that are set during the language learning process, reflecting the variations observable across different languages. An analogy can be made with an early computer of the sort that was programmed by setting the position of switches on a control panel: the core, unchanging, circuitry of the computer is analogous to principles, the switches to parameters, and the program created by one of the possible arrangements of the switches to a language such as English, Japanese, or St’at’imcets (although this simple picture captures the essence of early Principles and Parameters, the details are a great deal more complicated, especially considering subsequent developments).

Principles are the core aspects of language, including the dependence on underlying structure and lexical projection, features that the theory predicts will be shared by all natural human languages. Parameters are aspects with binary settings that vary from language to language. Among the most widely discussed parameters, which might serve as convenient illustrations, are the Head and Pro-Drop parameters.

A head is the key element that gives a phrase its name, such as the noun in a noun phrase. The rest of the phrase is the complement. It can be observed that in English, the head comes before the complement, as in the noun phrase medicine for cats, where the noun medicine is before the complement for cats; in the verb phrase passed her the tea, the verb passed is first, and in the prepositional phrase in his pocket, the preposition in is first. But consider the following Japanese sentence (Cook and Newson, 1996: 14):

(14) E wa kabe ni kakatte imasu
picture [subject marker] wall on hanging is
The picture is hanging on the wall

Notice that the head of the verb phrase, the verb kakatte imasu, is after its complement, kabe ni, and ni (on) is a postposition that occurs after its complement, kabe. English and Japanese thus represent different settings of a parameter, the Head, or Head Directionality, Parameter. Although this and other parameters are set during a child’s development by the language they hear around them, it seems that very little exposure is needed to fix the correct value. It is taken as evidence of this that mistakes with head positioning are vanishingly rare; English-speaking children almost never make mistakes like *The picture the wall on is at any point in their development.
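The effect of the Head Parameter on word order can be sketched as a single switch applied uniformly to every phrase. The code below is a minimal illustration, not a piece of linguistic theory: the function name `linearize`, the representation of a phrase as a (head, complement) pair, and the treatment of multi-word items such as the wall as unanalyzed units are all assumptions made for brevity.

```python
def linearize(phrase, head_initial):
    """Flatten a phrase into a list of words.

    `phrase` is either a plain string or a (head, complement) tuple whose
    parts may themselves be phrases. `head_initial` plays the role of the
    Head Parameter: True puts every head before its complement, False
    puts every head after it.
    """
    if isinstance(phrase, str):
        return [phrase]
    head, complement = phrase
    h = linearize(head, head_initial)
    c = linearize(complement, head_initial)
    return h + c if head_initial else c + h


# Verb phrase whose complement is an adpositional phrase.
english_vp = ("is hanging", ("on", "the wall"))
japanese_vp = ("kakatte imasu", ("ni", "kabe"))

print(" ".join(linearize(english_vp, head_initial=True)))
print(" ".join(linearize(japanese_vp, head_initial=False)))
```

One setting covers the verb phrase and the adpositional phrase alike, which mirrors the claim that a child fixes head order once rather than learning a separate rule for each type of phrase.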

The Pro-Drop Parameter explains the fact that certain languages can leave the pronoun subjects of a sentence implied, or up to context. For instance, in Italian, a pro-drop language, the following sentences are permitted (Cook and Newson, 1996: 55).

(15) Sono il tricheco
be (1st-person-present) the walrus
I am the walrus.

(16) È pericoloso sporger-si
be (3rd-person-present) dangerous lean out-(reflexive)
It is dangerous to lean out. [a warning posted on trains]

On the other hand, the direct English translations *Am the walrus and *Is dangerous to lean out are ungrammatical, reflecting a different parameter setting, “non-pro-drop”, which requires an explicit subject for sentences.

A number of other, often complex, differences beyond whether subjects must be included in all sentences were thought to come from the settings of Pro-Drop and the way it interacts with other parameters. For example, it has been observed that many pro-drop languages allow the normal order of subjects and verbs to be inverted; Cade la notte is acceptable in Italian, unlike its direct translation in English, *Falls the night. However, this feature is not universal among pro-drop languages, and it was theorized that whether it is present or not depends on the settings of other parameters.

Examples such as these reflect the general theme of Principles and Parameters, in which “rules” of the sort that had been postulated in Chomsky’s previous work are no longer needed. Instead of syntactical rules present in a speaker’s mental language faculty, the particular grammar of a language was hypothesized to be the result of the complex interaction of principles, the setting of parameters, and the projection properties of lexical items. As a relatively simple example, there is no need for an English-speaking child to learn a bundle of related rules such as noun first in a noun phrase, verb first in a verb phrase, and so on, or for a Japanese-speaking child to learn the opposite rules for each type of phrase; all of this is covered by the setting of the Head Parameter. As Chomsky (1995: 388) puts it,

A language is not, then, a system of rules but a set of specifications for parameters in an invariant system of principles of Universal Grammar. Languages have no rules at all in anything like the traditional sense.

This outlook represents an important shift in approach, which is often not fully appreciated by philosophers and other non-specialists. Many scholars assume that Chomsky and his followers still regard languages as particular sets of rules internally represented by speakers, as opposed to principles that are realized without being explicitly represented in the brain.

The Principles and Parameters approach led many linguists, especially during the last two decades of the 20th century, to hope that the resemblances and differences between individual languages could be neatly explained by parameter settings. Language learning also seemed much less puzzling, since it was now thought to be a matter, not of learning complex sets of rules and constraints, but rather of setting each parameter, of which there were at one time believed to be about twenty, to the correct value for the local language, a process that has been compared to the children’s game of “twenty questions”. It was even speculated that a table could be established where languages could be arranged by their parameter settings, in analogy to the periodic table on which elements could be placed and their chemical properties predicted by their atomic structures.

Unfortunately, as the program developed, things did not prove so simple. Researchers failed to reach a consensus on what parameters there are, what values they can take, and how they interact, and there seemed to be vastly more of them than initially believed. Additionally, parameters often failed to have the explanatory power they were envisioned as having. For example, as discussed above, it was originally claimed that the Pro-Drop parameter explained a large number of differences between languages with opposite settings for that parameter. However, these predictions were made on the basis of an analysis of several related European languages and were not fully borne out when checked against a wider sample. Many linguists now see the parameters themselves as emerging from the interactions of “microparameters” that explain the differences between closely related dialects of the same language and which are often found in the properties of individual words projecting onto the syntax. There is ongoing debate as to the explanatory value of parameters as they were originally conceived.

During the Principles and Parameters era, Chomsky sharpened the notions of competence and performance into the dichotomy of I-languages and E-languages. The former is a state of the language system in the mind of an individual speaker, while the latter, which corresponds to the common notion of a language, is a publicly shared system such as “English”, “French”, or “Swahili”. Chomsky was sharply critical of the study of E-languages, deriding them as poorly defined entities that play no role in the serious study of linguistics. This is a controversial attitude, as E-languages are what many linguists regard as precisely the subject matter of their discipline. It remains an important point in his work and will be discussed more fully in 4.c below.

e. The Minimalist Program

From the beginning, critics have argued that the rule systems Chomsky postulated were too complex to be plausibly grasped by a child learning a language, even if important parts of this knowledge were innate. Initially, the replacement of rules by a limited number of parameters in the Principles and Parameters paradigm seemed to offer a solution, as by this theory, instead of an unwieldy set of rules, the child needed only to grasp the setting of some parameters. But, while it was initially hoped that twenty or so parameters might be identified, the number has increased to the point where, although there is no exact consensus, it is too large to offer much hope of providing a simple explanation of language learning, and microparameters further complicate the picture.

The Minimalist Program was initiated in the mid-1990s partially to respond to such criticisms by continuing the trend towards simplicity that had begun with the Extended Standard Theory, with the goal of the greatest possible degree of elegance and parsimony. The minimalist approach is regarded by advocates not as a full theory of syntax but rather as a program of research working towards such a theory, building on the key features of Principles and Parameters.

In the Minimalist Program, syntactic structures corresponding to sentences are constructed using a single operation, Merge, that combines a head with a complement, for example, merging Albert with will meow to give Albert will meow. Importantly, Merge is recursive, so that it can be applied over and over to give sentences of any length. For instance, the sentence just discussed can be merged with thinks to give thinks Albert will meow and then again with Leisa to form the sentence Leisa thinks Albert will meow. Instead of elements within a structure moving from one place to another, a structure merges with an element already inside of it and then deletes redundant elements; a question can be formed from Albert will meow by first producing will Albert will meow, and finally will Albert meow? In order to prevent the production of ungrammatical strings, Merge must be constrained in various ways. The main constraints are postulated to be lexical, coming from the syntactic features of the words in a language. These features control which elements can be merged together, which cannot, and when merging is obligatory, for instance, to provide an object for a transitive verb.
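The way a single combining operation can build the structures just described may be easier to see in a small sketch. Everything in it is an illustrative assumption: syntactic objects are modeled as nested Python tuples, the names `merge`, `words`, and `pronounce` are hypothetical, and the lexical features that constrain Merge in the actual proposal are left out entirely.

```python
def merge(a, b):
    """Combine two syntactic objects into a single new object."""
    return (a, b)


def words(obj):
    """Read the words off a merged structure from left to right."""
    if isinstance(obj, str):
        return [obj]
    return [w for part in obj for w in words(part)]


def pronounce(obj, moved):
    """Spell out a structure, keeping only the first copy of `moved`."""
    out, seen = [], False
    for w in words(obj):
        if w == moved:
            if seen:
                continue      # delete the redundant lower copy
            seen = True
        out.append(w)
    return out


# Repeated application of merge builds ever larger structures.
clause = merge("Albert", merge("will", "meow"))
report = merge("Leisa", merge("thinks", clause))
print(" ".join(words(report)))

# Question formation: re-merge an element already inside the structure,
# then leave the lower copy unpronounced.
question = merge("will", clause)
print(" ".join(words(question)))
print(" ".join(pronounce(question, "will")))
```

Since the output of `merge` is itself a valid input to `merge`, nothing in the operation limits how deeply structures can be nested.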

During the Minimalist Program era, Chomsky has worked on a more specific model for the architecture of the language faculty, which he divides into the Faculty of Language, Broad (FLB) and the Faculty of Language, Narrow (FLN). The FLN is the syntactic computational system that had been the subject of Chomsky’s work from the beginning, now envisioned as using a single operation, that of recursive Merge. The FLB is postulated to include the FLN, but additionally the perceptual-articulatory system that handles the reception and production of physical messages (spoken or signed words and sentences) and the conceptual-intentional system that handles interpreting the meaning of those messages. In a schema similar to a flattened version of the T-model, the FLN forms a bridge between the other systems of the FLB. Incoming messages are given a structural form by the FLN that is passed to the conceptual-intentional system to be interpreted, and the reverse process allows thoughts to be articulated as speech. The different structural levels, D-structure and S-structure, of the T-model are eliminated in favor of maximal simplicity (the upside-down T is now just a flat line). The FLN is held to have a single level on which structures are derived through Merge, and two interfaces connected to the other parts of the FLB.

One important implication of this proposed architecture is the special role of recursion. The perceptual-articulatory system and conceptual-intentional system have clear analogs in other species, many of whom can obviously sense and produce signals and, in at least some cases, seem to be able to link meanings to them. Chomsky argues that, in contrast, recursion is uniquely human and that no system of communication among non-human animals allows users to limitlessly combine elements to produce a potential infinity of messages. In many ways, Chomsky is just restating what had been an important part of his theory from the beginning, which is that human language is unique in being productive or capable of expressing an infinity of different meanings, an insight he credits to Descartes. This makes recursion the characteristic aspect of human language that sets it apart from anything else in the natural world, and a central part of what it is to be human.

The status of recursion in Chomsky’s theory has been challenged in various ways, sometimes with the claim that some human language has been observed to be non-recursive (discussed below, in 4.a). That recursion is a uniquely human ability has also been called into question by experiments in which monkeys and corvids were apparently trained in recursive tasks under laboratory conditions. On the other hand, it has also been suggested that if the recursive FLN really does not have any counterpart among non-human species, it is unclear how such a mechanism might have evolved. This last point is only the latest version of a long-running objection that Chomsky’s ideas are difficult to reconcile with the theory of evolution since he postulates uniquely human traits for which, it is argued by critics, there is no plausible evolutionary history. Chomsky counters that it is not unlikely that the FLN appeared as a single mutation, one that would be selected due to the usefulness of recursion for general thought outside of communication. Providing evolutionary details and exploring the relationship between the language faculty and the physical brain have formed a large part of Chomsky’s most recent work.

The central place of recursion in the Minimalist Program also brought about an interesting change in Chomsky’s thoughts on hypothetical extraterrestrial languages. During the 1980s, he speculated that alien languages would be unlearnable by human beings since they would not share the same principles as human languages. As such, one could be studied as a natural phenomenon in the way that humans study physics or biology, but it would be impossible for researchers to truly learn the language in the way that field linguists master newly encountered human languages. More recently, however, Chomsky hypothesized that since recursion is apparently the core, universal property of human language and any extraterrestrial language will almost certainly be recursive as well, alien languages may not be that different from our own, after all.

4. Language and Languages

Chomsky is a linguist, and his primary concern has always been, of course, language. His study of this phenomenon eventually led him not only to formulate theories that were very much at odds with those held at one time by the majority of linguists and philosophers, but also to have a fundamentally different view about the thing, language, that was being studied and theorized about. Chomsky’s views have been influential, but many of them remain controversial today. This section discusses some of Chomsky’s important ideas that will be of interest to philosophers, especially concerning the nature and acquisition of language, as well as meaning and analyticity, topics that are traditionally the central concerns of the philosophy of language.

a. Universal Grammar

Perhaps the single most salient feature of Chomsky’s theory is the idea of Universal Grammar (UG). This is the central aspect of language that he argues is shared by all human beings, a part of the organization of the mind. Since it is widely assumed that mental features correspond, at some level, to physical features of the brain, UG is ultimately a biological hypothesis: it would be part of the genetic inheritance that all humans are born with.

In terms of Principles and Parameters Theory, UG consists of the principles common to all languages and which will not change as the speaker matures. UG also consists of parameters, but the values of the parameters are not part of UG. Instead, parameters may change from their initial setting as a child grows up, based on the language she hears spoken around her. For instance, an English-speaking child will learn that every sentence must have a subject, setting her Pro-Drop parameter to a certain value, the opposite of the value it would take for a Spanish-speaking child. While the Pro-Drop parameter is a part of UG, this particular setting of the parameter is a part of English and other languages where the subject must be overtly included in the sentence. All of the differences between human languages are then differences in vocabulary and in the settings of parameters, but they are all organized around a common core given by UG.

Chomsky has frequently stated that the important aspects of human languages are set by UG. From a sufficiently detached viewpoint, for instance, that of a hypothetical Martian linguist, there would only be minor regional variations of a single language spoken by all human beings. Further, the variations between languages are predictable from the architecture of UG and can only occur within narrowly constrained limits set by that structure. This was a dramatic departure from the assumption, largely unquestioned until the mid-20th century, that languages can vary virtually without limit and in unpredictable ways. This part of Chomsky’s theory has remained controversial, with some authorities on crosslinguistic work, such as the British psycholinguist Stephen Levinson (2016), arguing that it discounts real and important differences among languages. Other linguists argue the exact contrary: that data from the study of languages worldwide backs Chomsky’s claims. Because the debate ultimately concerns invisible mental features of human beings and how they relate to unpronounced linguistic structures, the interpretation of the evidence is not straightforward, and both sides claim that the available empirical data supports their position.

The theory of UG is an important aspect of Chomsky’s ideas for many reasons, among which is that it clearly sets his theories apart as different from paradigms that had previously been dominant in linguistics. This is because UG is not a theory about behavior or how people use language, but instead about the internal composition of the human mind. Indeed, for Chomsky and others working within the framework of his ideas, language is not something that is spoken, signed, or written but instead exists inside of us. What many people think of as language (externalized acts of communication) are merely products of that internal mental faculty. This in turn has further implications for theories of language acquisition (see 4.b) and how different languages should be identified (4.c).

An important implication of UG is that it makes Chomsky’s theories empirically testable. A common criticism of his work is that because it abstracts away from the study of actual language use to seek underlying idealized patterns, no evidence can ever count against it. Instead, apparent counterexamples can always be dismissed as artifacts of performance rather than the competence that Chomsky was concerned with. If correct, this would be problematic since it is widely agreed that a good scientific theory should be testable in some way. However, this criticism is often based on misunderstandings. A linguist dismissing an apparent failure of the theory as due to performance would need to provide evidence that performance factors really are involved, rather than a problem with the underlying theory of competence. Further, if a language was discovered to be organized around principles that contravened those of UG, then many of the core aspects of Chomsky’s theories would be falsified. Although candidate languages have been proposed, all of them are highly controversial, and none is anything close to universally accepted as falsifying UG.

In order to count as a counterexample to UG, a language must actually breach one of its principles; it is not enough that a principle merely not be displayed. As an example, one of the principles is what is known as structure dependence: when an element of a linguistic structure is moved to derive a different structure, that movement depends on the structure and its organization into phrases. For instance, to arrive at the correct question form of The cat who is on the desk is hungry, it is the is in the main clause, the one before hungry, that is moved to the front of the sentence, not the one in the relative clause (between who and on). However, in some languages, for instance Japanese, elements are not moved to form questions; instead, a question marker (ka) is added at the end of the sentence. This does not make Japanese a counterexample to the UG principle that movement is always structurally dependent. Japanese speakers simply do not exercise this principle when forming questions, but the principle is not violated either. A counterexample to UG would be a language that moved elements but did so in a way that did not depend on structure, for instance, by always moving the third word to the front or inverting the word order to form a question.
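The contrast between a structure-dependent rule and a purely linear one can be made concrete with a toy sketch. The Python below is only an illustration under stated assumptions: the division of the example sentence into subject, auxiliary, and predicate is supplied by hand rather than computed, and the function names (linear_question, structural_question) are hypothetical, not drawn from any linguistic theory or library.

```python
# Hand-built analysis of "The cat who is on the desk is hungry".
# This split is an assumption of the sketch, not the output of a parser.
SUBJECT = ["the", "cat", "who", "is", "on", "the", "desk"]  # contains a relative clause
MAIN_AUX = "is"                                              # auxiliary of the main clause
PREDICATE = ["hungry"]


def linear_question(words):
    """Structure-independent rule: front the first 'is' in the word string."""
    i = words.index("is")
    return ["is"] + words[:i] + words[i + 1:]


def structural_question(subject, aux, predicate):
    """Structure-dependent rule: front the auxiliary of the main clause."""
    return [aux] + subject + predicate


flat = SUBJECT + [MAIN_AUX] + PREDICATE
wrong = " ".join(linear_question(flat))
right = " ".join(structural_question(SUBJECT, MAIN_AUX, PREDICATE))

print(wrong)  # is the cat who on the desk is hungry
print(right)  # is the cat who is on the desk hungry
```

The linear rule pulls the auxiliary out of the relative clause and yields an ill-formed string, while the structural rule can only be stated by referring to the phrase organization of the sentence. A language whose question formation behaved like linear_question would be the kind of counterexample described above.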

A case that generated a great deal of recent controversy has been the claim that Pirahã, a language with a few hundred speakers in the Brazilian rain forest, lacks recursion (Everett 2005). This has been frequently presented as falsifying UG, since recursion is the most important principle, indeed the identifying feature, of human language, according to the Minimalist Program. This alleged counterexample received widespread and often incautious coverage in the popular press, at times being compared to the discovery of evidence that would disprove the theory of relativity.

This assertion that Pirahã has no recursion has itself been frequently challenged, and the status of this claim is unclear. But there is also a lack of agreement on whether, if true, this claim would invalidate UG or whether it would just be a case similar to the one discussed above, the absence of movement in Japanese when forming questions, where a principle is not being exercised. Proponents of Chomsky’s ideas counter that UG is a theory of mental organization and underlying competence, a competence that may or may not be put fully to use. The fact that the Pirahã are capable of learning Portuguese (the majority language in Brazil) shows that they have the same UG present in their minds as anyone else. Chomsky points out that there are numerous cases of human beings choosing not to exercise some sort of biological capacity that they have. Chomsky’s own example is that although humans are biologically capable of swimming, many would drown if placed in water. It has been suggested by sympathetic scholars that this example is not particularly felicitous, as swimming is not an instinctive behavior for humans, and a better example might be monks who are sworn to celibacy. Debate has continued concerning this case, with some still arguing that if a language without recursion would not be accepted as evidence against UG, it is difficult to imagine what would be.

b. Plato’s Problem and Language Acquisition

One of Chomsky’s major goals has always been to explain the way in which human children learn language. Since he sees language as a type of knowledge, it is important to understand how that knowledge is acquired. It seems inexplicable that children acquire something as complex as the grammar and vocabulary of a language, let alone the speed and accuracy with which they do so, at an age when they cannot yet learn how to tie their shoes or do basic arithmetic. The mystery is deepened by the difficulty that adults, who are usually much better learners than small children, have with acquiring a second language.

Chomsky addressed this puzzle in Aspects of the Theory of Syntax (1965), where he called it “Plato’s Problem”. This name is a reference to Plato’s Meno, a dialog in which Socrates guides a young boy, without a formal education, into producing a fairly complex geometric proof, apparently from the child’s own mental resources. Considering the difficult question of where this apparent knowledge of geometry came from, Plato, speaking through Socrates, concludes that it must have been present in the child already, although dormant until the right conditions were presented for it to be awakened. Chomsky would endorse largely the same explanation for language acquisition. He also cites Leibniz and Descartes as holding similar views concerning important areas of knowledge.

Chomsky’s theories regarding language acquisition are largely motivated by what has become known as the “Poverty of the Stimulus Argument,” the observation that the information about their native language that children are exposed to seems inadequate to explain the linguistic knowledge that they arrive at. Children only ever hear a small subset of the sentences that they can produce or understand. Furthermore, the language that they do hear is often “corrupt” in some way, such as the incomplete sentences frequently used in casual exchanges. Yet on this basis, children somehow master the complex grammars of their native languages.

Chomsky pointed out that the Poverty of the Stimulus makes it difficult to maintain that language is learned through the same general-purpose learning mechanisms that allow a human being to learn about other aspects of the world. There are many other factors that he and his followers cite to underline this point. All developmentally normal children worldwide are able to speak their native languages at roughly the same age, despite vast differences in their cultural and material circumstances or the educational levels of their families. Indeed, language learning seems to be independent of the child’s own cognitive abilities, as children with high IQs do not learn the grammar of their language faster, on average, than others. There is a notable lack of explicit instruction; analyses of speech corpora show that adult correction of children’s grammar is rare, and it is usually ineffective when it does occur. Considering these factors together, it seems that the way in which human children acquire language requires an explanation in a way that learning, say, table manners or how to put shoes on does not.

The solution to this puzzle is, according to Chomsky, that language is not learned through experience but innate. Children are born with Universal Grammar already in them, so the principles of language are present from birth. What remains is “merely” learning the particularities of the child’s native language. Because language is a part of the human mind, a part that each human being is born with, a child learning her native language is just undergoing the process of shaping that part of her mind into a particular form. In terms of the Principles and Parameters Theory, language learning is setting the value of the parameters. Although subsequent research has shown that things are more complicated than the simple setting of switches, the basic picture remains a part of Chomsky’s theory. The core principles of UG remain unchanged as the child grows, while peripheral elements are more plastic and are shaped by the linguistic environment of the child.

Chomsky has sometimes put the innateness of language in very strong terms and has stated that it is misleading to call language acquisition “language learning”. The language system of the mind is a mental organ, and its development is similar, Chomsky argues, to the growth of bodily organs such as the heart or lungs, an automatic process that is complete at some point in a child’s development. The language system also stabilizes at a certain point, after which changes will be relatively minor, such as the addition of new words to a speaker’s vocabulary. Even many of those who are firm adherents to Chomsky’s theories regard such statements as incautious. It is sometimes pointed out that while the growth of organs does not require having any particular experiences, proper language development requires being exposed to language within a certain critical period in early childhood. This requirement is evidenced by tragic cases of severely neglected children who were denied the needed input and, as a result, never learned to speak with full proficiency.

It has also been pointed out that even the rationalist philosophers whom Chomsky frequently cites did not seem to view innate and learned as mutually exclusive. Leibniz (1704), for instance, stated that arithmetical knowledge is “in us” but still learned, drawn out by demonstration and testing on examples. It has been suggested that some such view is necessary to explain language acquisition. Since humans are not born able to speak in the way that, for example, a horse is able to run within hours of birth, some learning seems to be involved, but those sympathetic to Chomsky regard the Poverty of the Stimulus as ruling out simply acquiring language completely from external sources. According to this view, we are born with language inside of us, but the proper experiences are required to draw that knowledge out and make it available.

The idea of innate language is not universally accepted. The behaviorist theory that language learning is a result of social conditioning, or training, is no longer considered viable. But it is a widely held view that general statistical learning mechanisms, the same mechanisms by which a child learns about other aspects of the world and human society, are responsible for language learning, with only the most general features of language being innate. These sorts of theories tend to have the most traction in schools of linguistic thought that reject the idea of Universal Grammar, maintaining that no deep commonalities hold across human languages. On such a view, there is little about language that can be said to be shared by all humans and therefore innate, so language would have to be acquired by children in the same way as other local customs. Advocates of Chomsky’s views counter that such theories cannot be upheld given the complexity of grammar and the Poverty of the Stimulus, and that the very fact that language acquisition occurs given these considerations is evidence for Universal Grammar. The degree to which language is innate remains a highly contested issue in both philosophy and science.

Although the application of statistical learning mechanisms to machine learning programs, such as OpenAI’s ChatGPT, has proven incredibly successful, Chomsky points out that the architecture of such programs is very different from that of the human mind: “A child’s operating system is completely different from that of a machine learning program” (Chomsky, Roberts, and Watumull, 2023). This difference, Chomskyans maintain, precludes drawing conclusions about the use or acquisition of language by humans on the basis of studying these models.

c. I vs. E languages

Perhaps the way in which Chomsky’s theories differ most sharply from those of other linguists and philosophers is in his understanding of what language is and how a language is to be identified. Almost from the beginning, he has been careful to distinguish speaker performance from underlying linguistic competence, which is the target of his inquiry. During the 1980s, this methodological point would be further developed into the I-language/E-language distinction.

A common concept of what an individual language is, explicitly endorsed by philosophers such as David Lewis (1969), Michael Dummett (1986), and Michael Devitt (2022), is a system of conventions shared between speakers to allow coordination. On this view, language is a public entity used for communication. It is something like this that most linguists and philosophers of language have in mind when they talk about “English” or “Hindi”. Chomsky calls this concept of language E-language, where the “E” stands for external and extensional. What is meant by “extensional” is somewhat technical and will be discussed later in this subsection. “External” refers to the idea just discussed, where language is a public system that exists externally to any of its speakers. Chomsky points out that such a notion is inherently vague, and it is difficult to point to any criteria of identity that would allow one to draw firm boundaries that could be used to tell one such language apart from another. It has been observed that people living near border areas often cannot be neatly categorized as speaking one language or the other; Germans living near the Dutch border are comprehensible to the nearby Dutch but not to many Germans from the southern part of Germany. Based on the position of the border, we say that they are speaking “German” rather than “Dutch” or some other E-language, but a border is a political entity with negligible linguistic significance. Chomsky (1997: 7) also called attention to what he calls “semi-grammatical sentences,” such as the string of words:

(17) *The child seems sleeping.

Although (17) is clearly ill-formed, most “English” speakers will be able to assign some meaning to it. Given these conflicting facts, there seems to be no answer to whether (17) or similar strings are part of “English”.

Based on considerations like those just mentioned, Chomsky derides E-languages as indistinct entities that are of no interest to linguistic science. The real concept of interest is that of an I-language, where the “I” refers to intensional and internal. “Intensional” is in opposition to “extensional”, and will be discussed in a moment. “Internal” means contained in the mind of some individual human being. Chomsky defines language as a computational system contained in an individual mind, one that produces syntactic structures that are passed to the mental systems responsible for articulation and interpretation. A particular state of such a system, shaped by the linguistic environment it is exposed to, constitutes an I-language. Because all I-languages contain Universal Grammar, they will all resemble each other in their core aspects, and because more peripheral parts of language are set by the input received, the I-languages of two members of the same linguistic community will resemble one another more closely. For Chomsky, for whom the study of language is ultimately the study of the mind, it is the I-language that is the proper topic of concern for linguists. When Chomsky speaks of “English” or “Swahili”, this is to be understood as shorthand for a cluster of characteristics that are typically displayed by the I-languages of people in a particular linguistic community.

This rejection of external languages as worthy of study is closely related to another point where Chomsky goes against a widely held belief in the philosophy of language, as he does not accept the common hypothesis that language is primarily a means of communication. The idea of external languages is largely motivated by the widespread theory that language is a means for interpersonal communication, something that evolved so that humans could come together, coordinate to solve problems, and share ideas. Chomsky responds that language serves many uses, including to speak silently to oneself for mental clarity, to aid in memorization, to solve problems, to plan, or to conduct other activities that are entirely internal to the individual, in addition to communication. There is no reason to emphasize one of these purposes over any other. Communication is one purpose of language—an important one, to be sure—but it is not the purpose.

Besides the internal/external dichotomy, there is the intensional/extensional distinction, referring to two different ways that sets might be specified. The extension of a set is what elements are in that set, while the intension is how the set is defined and the members are divided from non-members. For instance, the set {1, 2, 3} has as its extension the numbers 1, 2, and 3. The intension of the same set might be the first three positive integers, or the square roots of 1, 4, and 9, or the first three divisors of 6; indeed, an infinite number of intensions might generate the same set extension.
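The point that distinct intensions can pick out one and the same extension can be checked directly. The following Python sketch is illustrative only; the variable names are hypothetical, and each set comprehension stands in for one of the three defining conditions mentioned above.

```python
import math

# Three different intensions (defining conditions) ...
first_three_positive = {n for n in range(1, 4)}
square_roots = {math.isqrt(n) for n in (1, 4, 9)}
first_three_divisors_of_6 = set([d for d in range(1, 7) if 6 % d == 0][:3])

# ... that all determine the same extension, the set {1, 2, 3}.
same_extension = (
    first_three_positive
    == square_roots
    == first_three_divisors_of_6
    == {1, 2, 3}
)
print(same_extension)  # True
```

Comparing the sets with == inspects only their members, which is exactly what an extensional definition does; the three different expressions that built them play the role of intensions.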

Applying this concept to languages, a language might be defined extensionally in terms of the sentences of the language or intensionally in terms of the grammar that generates all of those sentences but no others. While Chomsky favors the second approach, he attributes the first to two virtually opposite traditions. Structuralist linguists, who place great value on studying corpora, and other linguists and philosophers who focus on the actual use of language define a language in terms of the sentences attested in corpora and those that fit similar patterns. A very different tradition consists of philosophers of language who are known as “Platonists”, and who are exemplified by Jerrold Katz (1981, 1985) and Scott Soames (1984), former disciples of Chomsky. On this view, every possible language is a mathematical object, a set of possible sentences that really exist in the same abstract sense that sets of numbers do. Some of these sets happen to be the languages that humans speak.

Both of these extensional approaches are rejected by Chomsky, who maintains that language is an aspect of the human mind, so what is of interest is the organization of that part of the mind, the I-language. This is an intensional approach, since a particular I-language will constitute a grammar that will produce a certain set of sentences. Chomsky argues that both extensional approaches, the mathematical and the usage-based, are insufficiently focused on the mental to be of explanatory value. If a language is an abstract mathematical object, a set of sentences, it is unclear how humans are supposed to acquire knowledge of such a thing or to use it. The usage-based approach, as a theory of behavior, is insufficiently explanatory because any real explanation of how language is acquired and used must be in mental terms, which means looking at the organization of the underlying I-language.

While many who study language accept the concept of the I-language and agree with its importance, Chomsky’s complete dismissal of E-languages as worthy of study has not been widely endorsed. E-languages, even if they are ultimately fiction, seem to be a necessary fiction for disciplines such as sociolinguistics or for the historical analysis of how languages have evolved over time. Further, having vague criteria of identity does not automatically disqualify a class of entities from being used in science. For example, the idea of species is open to many of the same criticisms concerning vagueness that Chomsky directs at E-languages, and its status as a real category has been debated, but the concept often plays a useful role in biology.

d. Meaning and Analyticity

It might be said that the main concern of the philosophy of language is the question of meaning. How is it that language corresponds to, and allows us to communicate about, states of affairs in the world or to describe possible states of affairs? A related question is whether there are such things as analytic truths, that is, sentences that are (as they were often traditionally characterized) necessarily true by virtue of meaning alone. It might seem like anyone who understands all the words in:

(18) If Albert is a cat, then Albert is an animal.

knows that it has to be true, just in virtue of knowing what it means. Appeals to such knowledge were frequently the basis for explaining our apparent a priori knowledge of logic and mathematics and for what came to be known as “analytic philosophy” in the 20th century. But the exact nature and scope of this sort of truth and knowledge are surprisingly hard to clarify, and many philosophers, notably Quine (1953) and Fodor (1998), argue that allegedly analytic statements are no different from any other belief that is widely held, such as:

(19) The world is more than a day old.

On this outlook, not only are apparently analytic truths open to revision just like any other belief, but the entire idea of determinate meanings becomes questionable.

As mentioned earlier, Chomsky’s focus has been not on meaning but instead on syntax, the grammatical rules that govern the production of well-formed sentences, considered largely independent of their meanings. Much of the critical data for his program has consisted of unacceptable sentences, the “WhyNots,” such as:

(20) * She’s as likely as he’s to get ill. (Rey 2022)

Sentences like (20), or (1)-(6) in 2.c above, are problematic, not because they have no meaning or have an anomalous meaning in some way, but because of often subtle issues under the surface concerning the syntactic structure of the sentence. Chomsky frequently argued that syntax is independent of meaning, and a theory of language should be able to explain the syntactic data without entering into questions of meaning. This idea, sometimes called “the autonomy of syntax”, is supported by, among other evidence, considering sentences such as:

(21) Colorless green ideas sleep furiously. (Chomsky 1965: 149)

which makes no sense if understood literally but is immediately recognizable as a grammatical sentence in English. Whether syntax is entirely independent of meaning and use has proven somewhat contentious, with some arguing that, on the contrary, questions of grammaticality cannot be separated from pragmatic and semantic issues. However, the distinction fits well with Chomsky’s conception of I-language, an internal computational device that produces syntactic structures that are then passed to other mental systems. These include the conceptual-intentional system responsible for assigning meaning to the structures, a system that interfaces with the language faculty but is not itself part of that faculty, strictly speaking.

Despite his focus on syntax, Chomsky does frequently discuss questions of meaning, at least from 1965 on. Chomsky regards the words (and other lexical items, such as prefixes and suffixes) that a speaker has stored in her lexicon as bundles of semantic, syntactic, and phonetic features, indicating information about meaning, grammatical role, and pronunciation. Some features that Chomsky classifies as syntactic may seem to be more related to meaning, such as being abstract. Folding these features into syntax seemed to be supported by the observation that, for example,

(22) * A very running person passed us.

is anomalous because very requires an abstract complement in such a context (a very interesting person is fine). In Aspects of the Theory of Syntax (1965), he also introduced the notion of “selectional rules” that identify sentences such as:

(23) Golf plays John (1965: 149)

as “deviant”. A particularly interesting example is:

(24) Both of John’s parents are married to aunts of mine. (1965: 77)

In 1965, (24) might have seemed to be analytically false, but in the 21st century, such a sentence may very well be true!

One popular theory of semantics is that the meaning of a sentence consists of its truth conditions, that is, the state of affairs that would make the sentence true. This idea, associated with the philosopher of language Donald Davidson (1967), might be said to be almost an orthodoxy in the study of semantics, and it certainly has an intuitive appeal. To know what The cat is on the mat means is to know that this sentence is true if and only if the cat is indeed on the mat. Starting in the late 1990s, Chomsky would challenge this picture of meaning as an oversimplification of the way that language works.

According to Chomsky’s view, also developed by Paul Pietroski (2005), among others, the sentences of a language do not, themselves, have truth conditions. Instead, sentences are tools that might be used, among other things, to make statements that have truth values relative to their context of use. Support for this position is drawn from the phenomenon of polysemy, where the same word might be used with different truth-conditional roles within a single sentence, such as in:

(25) The bank was destroyed by the fire and so moved across the street. (Chomsky 2000: 180)

where the word bank is used to refer to both a building and a financial institution. There is also open texture, a phenomenon by which the meaning of a word might be extended in multiple ways, many of which might have once been impossible to foresee (Waismann 1945). An oft-cited example is mother: in modern times, unlike in the past, it is possible that two women, the woman who produces the ovum and the woman who carries the fetus, may both be called mothers of the child. One might also consider the way that a computer, at one time a human being engaged in performing computations, was easily extended to cover electronic machines that are sometimes said to think, something that was also at one time reserved for humans.

Considering these phenomena, it seems that the traditional idea of words as having fixed “meanings” might be better replaced by the idea of words as “filters or lenses, providing ways of looking at things and thinking about the products of our minds” (Chomsky 2000: 36), or, as Pietroski (2005) puts it, as pointers in conceptual space. A speaker uses the structures made available by her I-language in order to arrange these “pointers” in such a way as to convey information, producing statements that might be assigned truth values given the context. But a speaker is hardly constrained to her I-language, which might be supplemented by resources such as gestures, common knowledge, shared cultural background, or sensitivity to the listener’s psychology and ability to fill in gaps. Consider a speaker nodding towards a picture of the Eiffel Tower and saying “been there”; to the right audience, under the right circumstances, this is a perfectly clear statement with a determinate truth value, even though the I-language, which produces structures corresponding to grammatical sentences, has been overridden in the interests of efficiency.

It has been suggested (Rey 2022) that this outlook on meaning offers a solution to the question of whether there are sentences that are analytically true and that are distinct from merely strongly held beliefs. Sentences such as If Albert is a cat, he is an animal may be analytic in the sense that, in the lexicon accessed by the I-language, [animal] is a feature of cat (as argued by Katz 1990). On the other hand, the I-language might be overruled in the face of future evidence, such as discovering that cats are really robots from another planet (as Putnam 1962 imagined). These two apparently opposing facts can be accommodated by the open texture of the word cat, which might come to be used in cases where it does not, at present, apply.

Chomsky, throughout his long career, seems to have frequently vacillated concerning the existence of analytic truths. Early on, as in Aspects (1965), he endorses analyticity, citing (24) and similar examples. At other times, he seems to echo Quine, at one point (1975) stating that the study of meaning cannot be dissociated from systems of belief. More recently (1997) he explicitly allows for analytic truths, arguing that necessary connections occur between the concepts denoted by the lexicons of human languages. For example, “If John persuaded Bill to go to college, then Bill at some point decided or intended to go to college… this is a truth of meaning” (1997: 30). This is to say that it is an analytic truth based on a relation that obtains between the concepts persuade and intend. Ultimately, though, Chomsky regards analyticity as an empirical issue, not one to be settled by considering philosophical intuitions but rather through careful investigation of language acquisition, crosslinguistic comparison, and the relation of language to other cognitive systems, among other evidence. Currently, he holds that allowing for analytic truths based on relations between concepts seems more promising than alternative proposals, but this is an empirical question to be resolved through science.

Finally, mention should be made of the way that Chomsky connects considerations of meaning with “Plato’s Problem”, the question of how small children manage to do something as difficult as learning language. Chomsky notes that the acquisition of vocabulary poses this problem “in a very sharp form” (1997: 29). During the peak periods of language learning, children learn several words a day, often after hearing them a single time. Chomsky accounts for this rapid acquisition in the same way as the acquisition of grammar: what is being learned must already be in the child. The concepts themselves are innate, and what a child is doing is simply learning what sounds people in the local community use to label concepts she already possesses. Chomsky acknowledges that this idea has been criticized. Hilary Putnam (1988), for example, asks how evolution could have possibly had the foresight to equip humans with a concept such as carburetor. Chomsky’s response is simply that this certainly seems surprising, but that “the empirical facts appear to leave open few other possibilities” (1997: 26). Conceptual relations, like those mentioned above between persuade and intend, or between chase and follow with the intent of staying on one’s path, are, Chomsky asserts, grasped by children on the basis of virtually no evidence. He concludes that this indicates that children approach language learning with an intuitive understanding of important concepts, such as intending, causing something to happen, having a goal, and so on.

Chomsky suggests a parallel to his theory of lexical acquisition in the Nobel Prize-winning work of the immunologist Niels Jerne. The number of antigens (substances that trigger the production of antibodies) in the world is so enormous, including man-made toxins, that it may seem absurd to propose that immune systems would have evolved to have an innate supply of specific antibodies. However, Jerne’s work upheld the theory that an animal could not be stimulated to make an antibody in response to a specific antigen unless it had already produced such an antibody before encountering the antigen. In fact, Jerne’s (1985) Nobel speech was entitled “The Generative Grammar of the Immune System”.

Chomsky’s theories of innate concepts fit with those of some philosophers, such as Jerry Fodor (1975). On the other hand, this approach has been challenged by other philosophers and by linguists such as Stephen Levinson and Nicholas Evans (2009), who argue that the concepts labeled by words in one language very seldom map neatly onto the vocabulary of another. This is sometimes true even of very basic terms, such as the English preposition “in”, which has no exact counterpart in, for example, Korean or Tzeltal, languages that instead have a range of words that more specifically identify the relation between the contained object and the container. This kind of evidence is understood by some linguists to cast doubt on the idea that childhood language acquisition is a matter of acquiring labels for preexisting universal concepts.

e. Kripkenstein and Rule Following

This subsection introduces the “Wittgensteinian Problem”, one of the most famous philosophical objections to Chomsky’s notion of an underlying linguistic competence. Chomsky himself stated that out of the various criticisms his theory had received over the years, “this seems to me to be the most interesting” (1986: 223). Inspired by Ludwig Wittgenstein’s cryptic statement that “no course of action could be determined by a rule, because every course of action could be made out to accord with the rule” (1953: §201), Saul Kripke (1982) developed a line of argument that entailed a deep skepticism about the nature of rule-following activities, including the use of language. Kripke is frequently regarded as having gone beyond what Wittgenstein might have intended, so his argument is often attributed to a composite figure, “Kripke’s Wittgenstein” or “Kripkenstein”. A full treatment of this fascinating, but lengthy and complex, argument is beyond the scope of this article (the interested reader might consult the article “Kripke’s Wittgenstein”). It can be summarized as asserting that, in a case where a person seems to be following a rule, there are no facts about her that determine which rule she is actually following. To take Kripke’s example, if someone seems to be adding numbers in accordance with the normal rules of addition but then gives a deviant answer, say 68 + 57 = 5, there is no way to establish that she was not actually performing an operation called quaddition instead, which is like addition except that it gives an answer of 5 whenever either of the numbers involved is 57 or larger. Kripke claims that any evidence, including her own introspection, that she was performing addition and made a bizarre mistake is equally compatible with the hypothesis that she was performing quaddition. Ultimately, he concludes, there is no way to settle such questions, even in theory; there is simply no fact of the matter about what rule is being followed.
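The addition/quaddition example can be made concrete in a few lines. The Python sketch below is only an illustration: the function names plus and quus are chosen for the example, and the exact cutoff (ordinary addition when both arguments are below 57, and 5 otherwise) is an assumption of the sketch rather than something established in this article.

```python
def plus(x, y):
    """Ordinary addition."""
    return x + y


def quus(x, y):
    """Deviant rule: agrees with addition when both arguments are below 57,
    and returns 5 otherwise (the cutoff is an assumption of this sketch)."""
    return x + y if x < 57 and y < 57 else 5


# On every pair of numbers below the cutoff, the two rules give the same answer,
# so a history of calculations confined to such numbers fits both equally well.
agree_on_small = all(
    plus(x, y) == quus(x, y) for x in range(57) for y in range(57)
)

print(agree_on_small)  # True
print(plus(68, 57))    # 125
print(quus(68, 57))    # 5
```

Any finite record of past answers drawn from below the cutoff is consistent with both functions, which is the sense in which the observed behavior alone does not single out one rule. The skeptical argument goes further, of course, since it also denies that introspection or any other fact about the person settles the matter.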

Kripke’s argument is relevant to Chomsky’s linguistic theory because it directly confronts his notion of language as an internalized system of rules (or, in later iterations, a system of principles and parameters that gives rise to rules that are not themselves represented). According to Chomsky’s theory, a grammatical error is explained as a performance issue, for example, a mistake brought on by inattention or distraction that causes a deviation from the system of rules in the mind of the speaker. According to Kripke, calling this a deviation from those rules, rather than an indication that different rules (or no rules) are being followed, is like trying to decide the question of addition vs. quaddition: there is no fact of the matter in the linguistic case any more than in the arithmetical one. Therefore, “it would seem that the use of the ideas of rules and competence in linguistics needs serious reconsideration” (1982: 31).

An essential part of Chomsky’s response to Kripke’s criticism was that the question of what is going on inside a speaker is no different in principle than any other question investigated by the sciences. Given a language user, say Jones, “We then try… to construct a complete theory, the best one we can, of relevant aspects of how Jones is constructed” (1986: 237). Such a theory would involve specifying that Jones incorporates a particular language, consisting of fixed principles and the setting of parameters, and that he follows the rules that would emerge from the interactions of these factors. Any particular theory like this could be proven wrong (Chomsky notes, “This has frequently been the case”) and, therefore, such a theory is an empirically testable one that can be found to be correct or incorrect. That is, given a theory of the speaker’s underlying linguistic competence, whether she is making a mistake or the theory is wrong is “surely as ascertainable as any other fact about a complex system” (Rey 2020: 125). What would be required is an acceptable explanation of why a mistake was made. The issues here are very similar to those surrounding Chomsky’s adaptation of the “Galilean Method” (see 2.b above) and the testability, or lack thereof, of his theories in general (see 4.a).

5. Cognitive Science and Philosophy of Mind

Because Chomsky regards language as a part of the human mind, his work has inevitably overlapped with both cognitive science and philosophy of mind. Although Chomsky has not ventured far into general questions about mental architecture outside of the areas concerned with language, his impact has been enormous, especially concerning methodology. Prior to Chomsky, the dominant paradigm in both philosophy and cognitive science was behaviorism, the idea that only external behavior could be legitimately studied and that the mind was a scientifically dubious entity. In extreme cases, most notably Quine (1960), the mind was regarded as a fiction best dropped from serious philosophy. Chomsky began receiving widespread notice in the 1950s for challenging this orthodoxy, arguing that it was a totally inadequate framework for the study of language (see 2.a, above), and he is widely held to have dramatically altered the scientific landscape by reintroducing the mind as a legitimate object of study.

Chomsky has remained committed throughout his career to the view that the mind is an important target of inquiry. He cautions against what he calls “methodological dualism” (2000: 135), the view that the study of the human mind must somehow proceed differently than the study of other natural phenomena. Although Chomsky says that few contemporary philosophers or scientists would overtly admit to following such a principle, he suggests that in practice it is widespread.

Chomsky postulates that the human mind contains a language faculty, or module, a biological computer that operates largely independently of other mental systems to produce and parse linguistic structures. This theory is supported by the fact that we, as language users, apparently systematically perform highly complex operations, largely subconsciously, in order to derive appropriate structures that can be used to think and communicate our thoughts and to parse incoming structures underlying messages from other language users. These activities point to the presence of a mental computational device that carries them out. This has been interpreted by some as strong evidence for the computational theory of mind, essentially the idea that the entire mind is a biological computer. Chomsky himself cautions against such a conclusion, stating that the extension from the language module to the whole mind is as of yet unwarranted.

In his work over the last two decades, Chomsky has dealt more with questions of how the language faculty relates to the mind more broadly, as well as the physical brain, questions that he had previously not addressed extensively. Most recently, he proposed a scheme by which the language faculty, narrowly defined, or FLN, consists only of a computational device responsible for constructing syntactic structures. This device provides a bridge between the two other systems that constitute the language faculty more broadly, one of which is responsible for providing conceptual interpretations for the structures of the FLN, the other for physical expression and reception. Thus, while, in this view, the actual language faculty plays a narrow role, it is a critical one that allows the communication of concepts. The FLN itself works with a single operation, merge, which combines two elements. This operation is recursive, allowing elements to be merged repeatedly. He suggests that the FLN, which is the only part of the system unique to humans, evolved due to the usefulness of recursion not only for communication but also for planning, navigation, and other types of complex thought. Because the FLN is thought to have no analog among other species, recursion is theorized to be an important characteristic of human thought, which gives it its unique nature.
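The description of merge as a recursive operation on two elements can be illustrated with a short sketch. The Python below is an illustration under stated assumptions rather than Chomsky's own formalism: modelling a merged object as an unordered two-member set (a `frozenset`) is an assumption of this sketch, and the function names `merge` and `depth` and the sample words are hypothetical.

```python
def merge(a, b):
    """Combine two syntactic objects into a new, unordered object.
    (Assumption of this sketch: the result is modelled as a frozenset.)"""
    return frozenset({a, b})


def depth(obj):
    """Count the nested merge steps in an object (0 for a bare word)."""
    if isinstance(obj, frozenset):
        return 1 + max(depth(part) for part in obj)
    return 0


# The output of one merge can be an input to the next,
# which is what makes the operation recursive.
noun_phrase = merge("the", "book")
verb_phrase = merge("read", noun_phrase)
clause = merge("they", verb_phrase)
```

Since the result of each step is itself a legitimate input, a single operation can build structures of any depth, for example `depth(clause)` here is one greater than `depth(verb_phrase)`.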

While the FLN interfaces with other mental systems, passing syntactic structures between them, the system itself is postulated to carry out its operations in isolation. This follows from Chomsky’s view of syntax as largely autonomous from questions of meaning and also from the way that linguistic knowledge seems to be specialized and independent of our general knowledge about the world. For instance, we can recognize a sentence such as:

(26) On later engines, fully floating gudgeon pins are fitted (Cook and Newsom 1998: 83).

as well-formed, despite the fact that most readers will not know what it means. This concept of a specialized language faculty, which has been a constant in Chomsky’s work almost from the start, represents a substantive commitment to the “modularity of mind”, a thesis that the mind consists, at least in part, of specialized and autonomous systems. There is debate among cognitive scientists and in the philosophy of psychology regarding the degree to which this picture is accurate, as opposed to the idea that mental processes result from the interaction of general faculties, such as memory and perception, which are not domain-specific in the way of the hypothesized language faculty.

It should be emphasized that the language faculty Chomsky hypothesizes is mental, not a specific physical organ in the brain, unlike, for example, the hippocampus. Asking where it is in the brain is something like asking where a certain program is in a computer; both emerge from the functioning of many physical processes that may be scattered in different locations throughout the entire physical device. At the same time, although Chomsky’s theory concerns mental systems and their operations, this is intended as a description, at a high level of abstraction, of computational processes instantiated in the physical brain. Opponents of Chomsky’s ideas frequently point out that there has been little progress in actually mapping these mental systems onto the brain. Chomsky acknowledges that “we do not really know how [language] is actually implemented in neural circuitry” (Berwick and Chomsky 2017: 157). However, he also holds that this is entirely unsurprising, given that neuroscience, like linguistics, is as of yet in its infancy as a serious science. Even in much simpler cases, such as insect navigation, where researchers carry out experiments and genetic manipulations that cannot be performed on humans, “we still do not know in detail how that computation is implemented” (2017: 157).

In his most recent publications, Chomsky has worked towards unifying his theories of language and mind with neuroscience and theories of the physical brain. He has at times expressed pessimism about the possibility of fully unifying these fields, which would require explaining linguistic and psychological phenomena completely in terms of physical events and structures in the brain. While he holds that this may be possible at some point in the distant future, it may require a fundamental conceptual shift in neuroscience. He cautions that it is also possible that such a unification may never be fully achieved. Chomsky points to Descartes’ discussion of the “creative” nature of human thought and language, which is the observation that in ordinary circumstances the use of these abilities is “innovative without bounds, appropriate to circumstances but not caused by them” (Chomsky 2014: 1), as well as our apparent possession of free will. Chomsky suggests that such phenomena may lie beyond our inherent cognitive limits and be impossible for us ever to understand fully.

6. References and Further Reading

a. Primary Sources

Chomsky has been a highly prolific author who has written dozens of books explaining and promoting his theories. Although almost all of them are of great interest to anyone interested in language and mind, including philosophers, they vary greatly in the degree to which they are accessible to non-specialists. The following is a short list of some of the relatively non-technical works of philosophical importance:

  • Chomsky, N. 1956. “Three Models for the Description of Language”. IRE Transactions on Information Theory 2(3): 113–124.
    • The earliest presentation of the Chomsky Hierarchy.
  • Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton and Company.
  • Chomsky, N. 1959. “A Review of B.F. Skinner’s Verbal Behavior”. Language 35(1): 26–58.
  • Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
    • While many of the exact proposals about syntax are dated, this contains what is still one of the best summaries of Chomsky’s ideas concerning language acquisition and the connections he sees between his program and the work of the early modern rationalist philosophers.
  • Chomsky, N. 1968. “Quine’s Empirical Assumptions”. Synthese 19(1/2): 53–68.
    • A critique of Quine’s philosophical objections.
  • Chomsky, N. 1975. The Logical Structure of Linguistic Theory. Berlin: Springer.
    • The earliest statement of Chomsky’s theory, now somewhat outdated, originally circulated as a typescript in 1956.
  • Chomsky, N. 1981. Lectures on Government and Binding. The Hague: Mouton.
  • Chomsky, N. 1986. Barriers. Cambridge, MA: MIT Press.
  • Chomsky, N. 1986. Knowledge of Language: Its Nature, Origin and Use. Westport, CT: Praeger.
    • Contains Chomsky’s response to “Kripkenstein”, as well as the first discussion of the E-language/I-language distinction.
  • Chomsky, N. 1988. Language and Problems of Knowledge: The Managua Lectures. Cambridge, MA: MIT Press.
    • A series of lectures for a popular audience that introduces Chomsky’s linguistic work.
  • Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
  • Chomsky, N. 1997. “Language and Problems of Knowledge”. Teorema 16(2): 5–33.
    • This is probably the best short introduction to Chomsky’s ideas on the nature and acquisition of language, especially the E-language/I-language distinction.
  • Chomsky, N. 2000. New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
    • It is philosophically interesting in that it contains a significant discussion of Chomsky’s views on contemporary trends in the philosophy of language, particularly his rejection of “externalist” theories of meaning.
  • Hauser, M., Chomsky, N., and Fitch, T. 2002. “The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?”. Science 298: 1569–1579.
    • A good summary of directions in generative linguistics, including proposals about the structure of the language faculty in terms of FLN/FLB.
  • Chomsky, N. 2006. Language and Mind. Cambridge: Cambridge University Press.
    • Also contains valuable historical context.
  • Chomsky, N. 2014. “Science, Mind and Limits of Understanding”. The Science and Faith Foundation, https://chomsky.info/201401/. The Vatican.
  • Berwick, R. and Chomsky, N. 2016. Why Only Us: Language and Evolution. Cambridge, MA: MIT Press.
    • It is valuable as a non-technical look at the current state of Chomsky’s theories as well as a discussion of the evolutionary development of language.
  • Keating, B. 2020. “Noam Chomsky: Is it Possible to Communicate with Extraterrestrials”. YouTube. https://www.youtube.com/watch?v=n7mvUw37g-U.
    • Chomsky discusses hypothetical extraterrestrial languages and the possibility of communicating with aliens.
  • Chomsky, N., Roberts, I., and Watumull, J. 2023. “Noam Chomsky: The False Promise of ChatGPT”. New York Times. March 8, 2023.
    • For someone interested in exploring Chomsky’s linguistic theories in depth, the following are a few key works tracing their development (along with Aspects, listed above).

b. Secondary Sources

There is a vast secondary literature surrounding Chomsky that seeks to explain, develop, and often criticize his theories. The following is a small sampling of works interesting to non-specialists. After a list of sources that cover Chomsky’s work in general, sources that are relevant to more specific aspects are listed by the section of this article they were referenced in or apply to.

  • General: 
    • Cook, V. and Newsom, M. 1996. Chomsky’s Universal Grammar: An Introduction. Malden, MA: Blackwell.
      • Very clear introduction to Chomsky’s theories and their importance to linguistic science. The first three chapters are especially valuable to non-specialists.
    • Rey, G. 2020. Representation of Language: Philosophical Issues in a Chomskyan Linguistics. Oxford: Oxford University Press.
      • A useful and thorough overview of the philosophical implications of Chomsky’s theories, particularly regarding the philosophy of science and the philosophy of mind, as well as a summary of the core linguistic theory.
    • Scholz, B., Pelletier, F., Pullum, G., and Nefdt, R. 2022. “Philosophy of Linguistics”. The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.).
      • This article is an excellent critical comparison of Chomsky’s theories on language and linguistic science with the major rival approaches.
  • Life:
    • Rai, M. 1995. Chomsky’s Politics. London: Verso.
    • Cohen, J., and Rogers, J. 1991. “Knowledge, Morality and Hope: The Social Thought of Noam Chomsky.” New Left Review. I/187: 5–27.
  • Philosophy of Linguistics:
    • Bloomfield, L. 1933. Language. New York: Holt, Rinehart, and Winston.
    • Hockett, C. 1960. “The Origin of Speech”. Scientific American 203: 88–111.
    • Quine, W. 1960. Word and Object. Cambridge, MA: MIT Press.
    • Skinner, B. 1957. Verbal Behavior. New York: Appleton-Century-Crofts.
  • The Development of Chomsky’s Linguistic Theory:
    • Baker, M. 2001. The Atoms of Language. New York: Basic Books.
      • Easily readable presentation of Principles and Parameters Theory.
    • Harris, R. 2021. The Linguistics Wars. Oxford: Oxford University Press.
    • Liao, D., et al. 2022. “Recursive Sequence Generation in Crows”. Science Advances 8(44).
      • Summarizes recent challenges to Chomsky’s claim that recursion is uniquely human.
    • Tomalin, M. 2006. Linguistics and the Formal Sciences: The Origins of Generative Grammar. Cambridge, UK: Cambridge University Press.
      • Provides interesting historical background connecting Chomsky’s early work with contemporary developments in logic and mathematics.
  • Generative Grammar (technical):
    • Lasnik, H. 1999. Minimalist Analysis. Malden, MA: Blackwell.
    • Lasnik, H. 2000. Syntactic Structures Revisited. Cambridge, MA: MIT Press.
    • Lasnik, H. and Uriagereka, J. 1988. A Course in GB Syntax. Cambridge, MA: MIT Press.
  • Language and Languages:
    • Criticisms of Universal Grammar:
      • Evans, N. and Levinson, S. 2009. “The Myth of Language Universals: Language Diversity and its Importance for Cognitive Science”. Behavioral and Brain Sciences 32(5): 429–492.
      • Levinson, S. 2016. “Language and Mind: Let’s Get the Issues Straight!”. Making Sense of Language (Blum, S., ed.). Oxford: Oxford University Press: 68–80.
    • Relevant to the debate over the I-language/E-language distinction:
      • Devitt, M. 2022. Overlooking Conventions: The Trouble with Linguistic Pragmatism. Oxford: Oxford University Press.
      • Dummett, M. 1986. “‘A Nice Derangement of Epitaphs’: Some Comments on Davidson and Hacking”. Truth and Interpretation (Lepore, E. ed.). Oxford: Blackwell.
      • Katz, J. 1981. Language and Other Abstract Objects. Lanham, MD: Rowman and Littlefield.
      • Katz, J. 1985. The Philosophy of Linguistics. Oxford: Oxford University Press.
      • Lewis, D. 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
      • Soames, S. 1984. “Linguistics and Psychology”. Linguistics and Philosophy 7: 155–179.
  • Meaning and Analyticity:
    • Davidson, D. 1967. “Truth and Meaning”. Synthese 17(3): 304–323.
    • Fodor, J. 1998. Concepts: Where Cognitive Science Went Wrong. Cambridge, MA: MIT Press.
    • Katz, J. 1990. The Metaphysics of Meaning. Oxford: Oxford University Press.
    • Pietroski, P. 2005. “Meaning Before Truth”. Contextualism in Philosophy: Knowledge, Meaning and Truth. Oxford: Oxford University Press.
    • Putnam, H. 1962. “It Ain’t Necessarily So”. Journal of Philosophy 59: 658–671.
    • Quine, W. 1953. “Two Dogmas of Empiricism”. From a Logical Point of View. Cambridge, MA: Harvard University Press.
    • Rey, G. 2022. “The Analytic/Synthetic Distinction”. The Stanford Encyclopedia of Philosophy (Spring 2023 Edition), Edward N. Zalta & Uri Nodelman (eds.).
      • See especially the supplement specifically on Chomsky and analyticity.
    • Waismann, F. 1945. “Verifiability”. Proceedings of the Aristotelian Society 19.
  • Language Acquisition and the Theory of Innate Concepts:
    • Fodor, J. 1975. The Language of Thought. Scranton, PA: Crowell.
    • Jerne, N. 1985. “The Generative Grammar of the Immune System”. Science 229: 1057–1059.
    • Putnam, H. 1988. Representation and Reality. Cambridge, MA: MIT Press.
  • “Kripkenstein” and Rule-Following:
    • Kripke, S. 1982. Wittgenstein on Rules and Private Language. Cambridge, MA: Harvard University Press.
    • Wittgenstein, L. 1953. Philosophical Investigations (Anscombe, G. translator). Oxford: Blackwell.
  • On Pirahã:
    • Everett, D. 2005. “Cultural Constraints on Grammar and Cognition in Pirahã”. Current Anthropology 46(4): 621–646.
      • The original claim that a language without recursion had been identified, allegedly showing Universal Grammar to be false.
    • Hornstein, N. and Robinson, J. 2016. “100 Ways to Misrepresent Noam Chomsky”. Current Affairs.
      • Representative responses to Everett from those in Chomsky’s camp assert that even if his claims are correct, they would not represent a counterexample to Universal Grammar.
    • McWhorter, J. 2016. “The bonfire of Noam Chomsky: journalist Tom Wolfe targets the acclaimed linguist”. Vox.
      • Linguist John McWhorter provides a very understandable summary of the issues and assesses the often incautious way that the case has been handled in the popular press.
    • Nevins, A., Pesetsky, D., and Rodrigues, C. 2009. “Pirahã Exceptionality: A Reassessment”. Language 85(2): 355–404.
      • Technical article criticizing Everett’s assessment of Pirahã syntax.
  • Other:
    • Lakoff, G. 1971. “On Generative Semantics”. Semantics (Steinberg, G. and Jacobovits, I. ed.). Cambridge, UK: Cambridge University Press.
      • An important work critical of Chomsky’s “autonomy of syntax”.
  • Cognitive Science and Philosophy of Mind:
    • Rey, G. 1997. Contemporary Philosophy of Mind. Hoboken: Wiley-Blackwell.
      • Covers Chomsky’s contributions in this area, particularly regarding the downfall of behaviorism and the development of the computational theory of mind.

 

Author Information

Casey A. Enos
Email: cenos@georgiasouthern.edu
Georgia Southern University
U. S. A.

Humean Arguments from Evil against Theism

Arguments from evil are arguments against Theism, which is broadly construed as the view that there is a supremely powerful, knowledgeable, and good creator of the universe. Arguments from evil attempt to show that there is a problem with Theism. Some arguments from evil try to show that Theism is known to be false, while others try to show that Theism is known to be probably false, or unreasonable, or that there is strong evidence against it. Arguments from evil are part of the project of criticizing religions, and because religions offer comprehensive worldviews, arguments from evil are also part of the project of evaluating which comprehensive worldviews are true or false.

Humean arguments from evil take their argumentative strategy from Philo’s argument from evil in part XI of Hume’s Dialogues Concerning Natural Religion. Philo’s argumentative strategy is distinctive in that it is fundamentally explanatory in nature. Philo takes as his data for explanation the good and evil we know about. He asks which hypothesis about a creator best explains that data. He argues that the good and evil we know about is best explained not by Theism but by some rival hypothesis to Theism. In this way, the good and evil we know about provides a reason for rejecting Theism.

This article surveys Humean arguments from evil. It begins by explaining Philo’s original argument from evil as well as some potential drawbacks of that argument. Then it turns to more fully explaining the distinctive features of Humean arguments from evil in comparison to other arguments from evil. It highlights three features in particular: they appeal to facts about good and evil, they are comparative, and they are abductive. The remainder of the article articulates a modern, prototypical Humean argument inspired by the work of Paul Draper. It explains the idea that the good and evil we know about is better explained by a rival to Theism called the “Hypothesis of Indifference,” roughly, the hypothesis that there is no creator who cares about the world one way or the other. It then shows how to strengthen Humean arguments from evil by providing additional support for the rival hypothesis to Theism. Finally, it examines four prominent objections to Humean arguments.

This article focuses on Humean arguments that try to show that Theism is known to be false, or probably false, or unreasonable to believe. These kinds of Humean arguments are ambitious, as they try to draw an overall conclusion about Theism itself. But there can also be more modest Humean arguments that try to show that some evidence favors a rival to Theism without necessarily drawing any overall conclusions about Theism itself. This article focuses on ambitious Humean arguments rather than these modest Humean arguments mostly because ambitious Humean arguments are the ones contemporary philosophers have focused on. But it is important to keep in mind that Humean arguments from evil—like arguments from evil more generally—come in different shapes and sizes and may have different strengths and weaknesses.

Table of Contents

  1. Philo’s Argument from Evil
  2. Distinctive Features of Humean Arguments
  3. Modern Humean Arguments
  4. Strengthening Humean Arguments
  5. Criticisms of Humean Arguments
    1. Objection 1: Limited Biological Roles
    2. Objection 2: Naturalism and Normativity
    3. Objection 3: God’s Obligations
    4. Objection 4: Skeptical Theism
  6. References and Further Reading

1. Philo’s Argument from Evil

Natural theology is the attempt to provide arguments for the existence of God by only appealing to natural facts—that is, facts that are not (purportedly) revealed or otherwise supernatural. Three of the traditional arguments for the existence of God—the ontological argument, the cosmological argument, and the teleological argument—belong to the project of natural theology. Conversely, natural atheology is the attempt to provide arguments against the existence of God by appealing to natural (non-supernatural, non-revealed) facts as well.

Hume’s Dialogues Concerning Natural Religion is a classic work of natural atheology. In the dialogue, the interlocutors assume that there is a creator (or creators) of the world; they advance arguments about the nature or character of this creator. Most of the dialogue—parts II-VIII—discusses design arguments for the existence of God whereas later parts—parts X-XI—discuss arguments from evil. In the dialogue, Philo offers a variety of critiques of prototypical theistic ideas. (Because it is controversial whether Philo speaks for Hume—and if so, where—this article attributes the reasoning to Philo.)

In part X, the interlocutors discuss what is called a “logical” or “incompatibility” argument from evil. They begin by describing various facts about good and evil they have observed. For instance, many people experience pleasure in life; but oftentimes they also experience great pain; the strong prey upon the weak; people use their imaginations not just for relaxation but to create new fears and anxieties; and so forth. They consider whether those facts are logically inconsistent with the existence of a creator with infinite power, wisdom, and goodness. Ultimately Philo does not think it would be reasonable to infer the existence of such a creator from those facts; but he does concede that they are logically consistent with the existence of such a creator (X.35; XI.4, 12). But Philo’s concession is not his last word on the subject.

In part XI, Philo constructs a different argument from evil. Philo begins by articulating additional claims about good and evil he takes himself to know. Most of these additional claims describe causes of suffering that seem to be unnecessary. For example, variations in weather cause suffering, yet seem to serve no purpose; pain teaches animals and people how to act, but it seems that pleasure would be just as effective at motivating people to act; and so forth. Given these claims, Philo considers what we can reasonably infer about the creator (or creators) of the universe. He considers four potential hypotheses:

  1. The creator(s) of the universe are supremely good.
  2. The creator(s) of the universe are supremely malicious.
  3. The creator(s) of the universe have some mixture of both goodness and malice.
  4. The creator(s) of the universe have neither goodness nor malice.

In evaluating these hypotheses, Philo uses a Humean principle of reasoning that “like effects have like causes.” In other words, the only kinds of features it is reasonable to infer from an effect to its cause(s) are features that would be similar between the two. (He uses this principle throughout the Dialogues; see also II.7, II.8, II.14, II.17, V.1, VI.1.) Using this principle, he argues that of these hypotheses the fourth is “by far the most probable” (XI.15). He rejects the first and the second because the causes of the universe would be too dissimilar to the universe itself. The world is mixed, containing both good and evil. Thus, one cannot infer that the cause of the world contains no evil (as the first hypothesis suggests) or contains no good (as the second hypothesis suggests). Those causes are too dissimilar. He also rejects the third hypothesis. He assumes that if the creators of the universe had some mixture of goodness and malice, this would be because some of them would be good and some of them would be malicious. And, he assumes, the universe would then be like a battlefield between them. But the regularity of the world suggests the universe is not a battlefield between dueling creators. Having ruled out the first three hypotheses, Philo concludes that the most probable hypothesis is the fourth. As Philo himself says of this hypothesis, using language that was graphic then and remains so now (XI.13):

The whole [of the universe] presents nothing but the idea of a blind nature, impregnated by a great vivifying principle, and pouring forth from her lap, without discernment or parental care, her maimed and abortive children.

Philo’s conclusion has both a weak and a strong interpretation. On the strong interpretation, Philo is concluding that we can reasonably believe something about the nature of the creator(s), namely, that they are indifferent. On the weak interpretation, Philo is concluding that of these four hypotheses, the fourth is the most probable, though it may not be sufficiently probable to reasonably believe. Either way, the most reasonable of the four hypotheses is that the creator has neither goodness nor malice.

At first blush, it might not be obvious how Philo’s conclusion provides a reason for rejecting Theism. In fact, it might look like Philo is just concerned to undermine an argument from our knowledge of good and evil to Theism. And, one might point out, undermining an argument for a conclusion is not the same thing as providing a reason for rejecting that conclusion. To see how Philo’s conclusion provides a reason for rejecting Theism, notice two things. First, Philo is not merely claiming something purely negative, like that some argument for Theism fails. Rather, he is also claiming something positive, namely, that the fourth hypothesis, on which the creator has neither goodness nor malice, is the most reasonable of the four considered, given our knowledge of good and evil. Second, that hypothesis is inconsistent with Theism, which maintains (at the very least) that God is supremely good. Since the most reasonable thing to believe, given that data, is inconsistent with Theism, that data provides a reason for rejecting Theism. In this way, Philo is not simply undermining an argument for Theism; he is also providing a reason for rejecting Theism.

Philo’s specific argument has received a mixed reaction both historically and in the early 21st century. From a contemporary perspective, there are at least two drawbacks to Philo’s specific argument. First, Philo and his interlocutors assume that there is a creator (or creators) of the universe. Thus, they only consider hypotheses that imply that there is a creator (or creators) of the universe. But many contemporary naturalists and atheists do not assume that there is any creator at all. From a contemporary perspective, it would be better to consider a wider range of hypotheses, including some that do not imply that there is a creator. Second, when evaluating hypotheses, Philo uses Hume’s principle of reasoning that “like effects have like causes.” But many contemporary philosophers reject such principles. Insofar as Philo’s reasoning assumes Hume’s own principles of reasoning, it will exhibit the various problems philosophers have identified for those principles.

But even if Philo’s specific argument suffers from drawbacks, his argumentative strategy is both distinctive and significant. Thus, one might mount an argument that shares several of the distinctive features of his argumentative strategy without committing oneself to the specific details of Philo’s own argument. Toward the end of the 20th and beginning of the 21st century, Paul Draper did exactly that, constructing arguments against Theism that utilize Philo’s argumentative strategy while relying on a more modern epistemology. It is natural to call these arguments Humean arguments since their strategy originates in a dialogue written by Hume—even if modern defenses of them vary from Hume’s original epistemology. The next section describes in more detail several of the distinctive features of Philo’s argumentative strategy.

2. Distinctive Features of Humean Arguments

First, many arguments from evil focus exclusively on facts about evil. Some arguments focus on our inability to see reasons that would justify God’s permission of those evils (Martin (1978), Rowe (1979)). Other arguments focus on the horrific nature of such evils (Adams (1999)). By contrast, Humean arguments from evil focus on facts about both good and evil. The focus on both good and evil is appropriate and significant.

The focus on good and evil is appropriate because, if God exists, God cares about preventing evil but also bringing about what is good. The focus on good and evil is significant because it provides a richer set of data with which to reason about the existence of God. For it is conceivable that facts about evil provide some evidence against the existence of God, but facts about good provide even stronger evidence for the existence of God, thereby offsetting that evidence. Or, alternatively, it is conceivable that facts about evil provide little to no evidence against the existence of God, but facts about good and evil together provide strong evidence against the existence of God. By focusing on both good and evil, Humean arguments provide a richer set of data to reason about the moral character of a purported creator.

Second, Humean arguments compare Theism to some rival hypothesis that is inconsistent with Theism. Normally, the rival hypothesis is more specific than the denial of Theism. For instance, Philo’s argument considered rival hypotheses to Theism that are fairly specific. And we can distinguish between different Humean arguments on the basis of the different rival hypotheses they use.

There is an important advantage of using a specific rival hypothesis to Theism. The simplest rival to Theism is the denial of Theism. But consider all of the views that are inconsistent with Theism. That set includes various forms of naturalism, but also pantheism, panentheism, non-theistic idealisms, various forms of pagan religions, and perhaps others yet. So, the denial of Theism is logically equivalent to the disjunction of these various theories. But it is not at all obvious what a disjunction of these various theories will predict. By contrast, it is normally more obvious what a more specific, rival hypothesis to Theism predicts. Thus, by focusing on a more specific rival hypothesis to Theism, it is easier to compare Theism to that rival.
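One way to see why the bare denial of Theism is hard to assess is to read “predicts” probabilistically, which is an assumption the text does not commit to. If the rival theories are treated as mutually exclusive and jointly exhaustive alternatives to Theism, the probability that the denial of Theism assigns to some data is a weighted average of what each rival predicts, with weights given by how probable each rival is if Theism is false. The sketch below illustrates this; the rival names, weights, and likelihoods are hypothetical placeholders, not estimates.

```python
# Illustrative only: the rivals, weights, and likelihoods are hypothetical.

def disjunction_likelihood(rivals):
    """Return P(data | not-Theism) as a weighted average over rivals.

    `rivals` maps a rival's name to a pair
    (P(rival | not-Theism), P(data | rival)).
    The rivals are assumed mutually exclusive and jointly exhaustive,
    so the weights must sum to 1.
    """
    total_weight = sum(weight for weight, _ in rivals.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weight * likelihood for weight, likelihood in rivals.values())


rivals = {
    "naturalism": (0.5, 0.30),
    "pantheism": (0.3, 0.10),
    "other": (0.2, 0.05),
}
print(disjunction_likelihood(rivals))
```

The result depends on the weights as much as on what any single rival predicts, and those weights are rarely known. A specific rival sidesteps the problem because only its own prediction matters.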

Third, Humean arguments are best understood abductively. They compare to what degree a specific rival to Theism better explains, or otherwise predicts, some data. Even Philo’s own argument could be understood abductively: the hypothesis that there is a supremely good creator does not explain the good and evil Philo observes because the creator proposed by that hypothesis is not similar to the good and evil he observes. To be clear, Humean arguments need not claim that the rival actually provides the best explanation of those facts. Rather, their claim is more modest, but with real bite: a rival to Theism does a better job than Theism of explaining some facts about good and evil.

Some Humean arguments may stop here with a comparison between Theism and a specific rival hypothesis. But many Humean arguments are more ambitious than that: they try to provide a reason for rejecting Theism. This feature of such Humean arguments deserves further clarification. Sometimes abductive reasoning is characterized as “inference to the best explanation.” In a specific inference to the best explanation, one infers that some hypothesis is true because it is part of the best explanation of some data. Such Humean arguments need not be understood as inference to the best explanation in this sense. Though it is not as catchy, some Humean arguments could be understood as “inference away from a worse explanation.” Some body of data gives us reason to reject Theism because some hypothesis other than Theism does a better job of explaining that data and that hypothesis is inconsistent with Theism. Notice that a specific rival to Theism can do a better job of explaining that data even if some other hypothesis does an even better job yet.

Lastly, Humean arguments are evidential arguments from evil, not logical arguments from evil. More specifically, Humean arguments do not claim that some known facts are logically inconsistent with Theism. Rather, they claim that some known facts are strong evidence against Theism. Logical arguments from evil have an important methodological feature. If some known fact is logically inconsistent with Theism, then it does not matter what evidence people muster for Theism: we already know that Theism is false. By contrast, evidential arguments may need to be evidentially shored up. Even if the arguments are successful in providing strong evidence against Theism, it may be that there is also strong evidence in favor of Theism. This difference between evidential arguments and logical arguments is relevant in section 4, which indicates how to strengthen Humean arguments.

3. Modern Humean Arguments

This section explains a modern, prototypical Humean argument. The author who has done the most to develop Humean arguments is Paul Draper. The argument in this section is inspired by Paul Draper’s work without being an interpretation of any specific argument Draper has given. Humean arguments compare Theism to some specific rival to Theism; and different Humean arguments may use different specific rivals to compare to Theism. Consequently, it is important to begin by clarifying what specific rival is used to generate Humean arguments.

This article uses the term Hypothesis of Indifference. The Hypothesis of Indifference is the claim that it is not the case that the nature or condition of life on earth is the result of a creator (or creators) who cares positively or negatively about that life. The Hypothesis of Indifference is a natural hypothesis to focus on for several reasons. First, it is inconsistent with Theism, but is more specific than just the denial of Theism. Second, it does not imply that there is a creator. Third, it is consistent with metaphysical naturalism, the view that there are no supernatural facts. These last two reasons are important to a modern audience—many people believe that there is no creator of the universe, and many philosophers accept metaphysical naturalism.

The central claim of this Humean argument is this: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does. This article refers to this claim as Central Claim. Central Claim does not claim that the Hypothesis of Indifference perfectly predicts the good and evil we know about. It does not even claim that the Hypothesis of Indifference is the best explanation of the good and evil we know about. Rather, it claims that in comparison to Theism, the Hypothesis of Indifference does a much better job of predicting the good and evil we know about.

The comparison in Central Claim is an antecedent comparison. That is, it compares what the Hypothesis of Indifference and Theism predict about good and evil antecedent of our actual knowledge of that good and evil. We are to set aside, or bracket, our actual knowledge of good and evil and ask to what degree each hypothesis—the Hypothesis of Indifference, Theism—predicts what we know.

This procedure of antecedent comparison is not unique to Humean arguments. It is frequently used in the sciences. A classic example of the same procedure is the retrograde movement of Mars. Viewed with the naked eye, Mars sometimes seems to move “backwards” through the sky. Some astronomers argued that the retrograde motion of Mars was better explained by heliocentrism than geocentrism. But in making their arguments, they first set aside what they already knew about the retrograde motion of Mars. They then asked to what degree each hypothesis would predict the retrograde motion of Mars before considering whether Mars exhibits retrograde motion.

There are different strategies one might use to defend Central Claim. One strategy appeals to what is normally called our background knowledge. This is knowledge we already have “in the background.” Such knowledge is frequently relied upon when we are evaluating claims about evidence, prediction, explanation, and the like. For instance, suppose I hear a loud repeating shrieking noise from my kitchen. I will immediately take that as evidence that there is smoke in my kitchen and go to investigate. However, when I take that noise as evidence of smoke in my kitchen, I rely upon a huge range of knowledge that is in the background, such as: loud repeating shrieking noises do not happen at random; that noise is not caused by a person or pet; there is a smoke detector in my kitchen; smoke detectors are designed to emit loud noises in the presence of smoke; and so on. I rely on this background knowledge, implicitly or explicitly, when I take that noise as evidence of smoke in my kitchen. Indeed, if I lacked all of that background knowledge, it is very unlikely I would immediately take that noise as evidence of smoke in my kitchen.

One strategy for defending Central Claim relies upon our background knowledge. The basic strategy has four parts. First, one argues that our background knowledge supports certain kinds of predictions about good and evil. Second, one argues that those predictions are, to a certain degree, accurate. Third, one argues that the Hypothesis of Indifference does not interfere with or undermine those predictions. Finally, one argues that Theism interferes with or undermines those predictions, producing more inaccurate predictions. The end result, then, is that the combination of the Hypothesis of Indifference with our background knowledge does a better job of predicting the data of good and evil than the combination of Theism with our background knowledge.

This strategy can be implemented in various ways. One way of implementing it appeals to our background knowledge of the biological role or function of pleasure and pain (Draper (1989)). Specifically, our background knowledge predicts that pleasure and pain will play certain adaptive roles or functions for organisms. And when we consider the pleasure and pain we know about, we find that it frequently plays those kinds of roles. For instance, warm sunlight on the skin is pleasant, but also releases an important vitamin (vitamin D); rotten food normally produces an unpleasant odor; extreme temperatures that are bad for the body are also painful to experience for extended durations; and so forth. So, our background knowledge makes certain predictions about the biological role or function of pleasure and pain, and those predictions are fairly accurate.

The Hypothesis of Indifference does not interfere with, or undermine, those predictions as it does not imply the existence of a creator who has moral reasons for deviating from the biological role of pleasure and pain. By contrast, Theism does interfere with, and undermine, those predictions. For pleasure is a good and pain a bad. Thus, given Theism, one might expect pleasure and pain to play moral or religious roles or functions. The exact nature of those moral or religious roles might be open to debate. But they might include things like the righteous receiving happiness or perhaps good people getting the pleasure they deserve. Similarly, given Theism, one might expect pain to not play certain biological roles if it does not simultaneously play moral or religious roles. For instance, given Theism, one might not expect organisms that are not moral agents to undergo intense physical pain (regardless of whether that pain serves a biological role). In this way, Theism may interfere with the fairly accurate predictions from our background information. Thus, the combination of the Hypothesis of Indifference and our background knowledge does a better job of predicting some of our knowledge of good and evil—namely, the distribution of pleasure and pain—than the combination of Theism and our background knowledge.

A second strategy for defending Central Claim utilizes a thought experiment (compare Hume Dialogue, XI.4, Dougherty and Draper (2013), Morriston (2014)). Imagine two alien creatures who are of roughly human intelligence and skill. One of them accepts Theism, and the other accepts the Hypothesis of Indifference. But neither of them knows anything about the condition of life on earth. They first make predictions about the nature and quality of life on earth, then they learn about the accuracy of their predictions. One might argue that the alien who accepts the Hypothesis of Indifference will do a much better job predicting the good and evil on earth than the alien who accepts Theism. But as it goes for the aliens so it goes for us: the Hypothesis of Indifference does a much better job of predicting the good and evil we know about than Theism does.

The alien who accepts Theism might be surprised as it learns about the actual good and evil of life on earth. For the alien’s acceptance of Theism gives it reason to expect a better overall balance of good and evil than we know about. By contrast, the alien who accepts the Hypothesis of Indifference might not be surprised by the good and evil that we know about because the Hypothesis of Indifference does not imply the existence of a creator with a moral reason for influencing the good and evil the earth has. So the alien’s acceptance of the Hypothesis of Indifference does not give it a reason for anticipating any particular distribution of good and evil. Thus, the alien accepting the Hypothesis of Indifference might not be surprised to discover the specific good and evil it does in fact know about.

Recall that Central Claim involves an antecedent comparison—it compares to what degree two hypotheses predict some data antecedent of our actual knowledge of that data. This thought experiment models the idea of an antecedent comparison by having the aliens not actually know the relevant data of good and evil. Their ignorance of the good and evil models our “bracketing” of our own knowledge.

Having considered some defenses of Central Claim, we can now formulate some Humean arguments that use Central Claim as a premise. One Humean argument goes like this:

Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.

Therefore, the good and evil we know about is evidence favoring the Hypothesis of Indifference over Theism.

This argument is valid. But the inference of this argument is modest on two fronts. First, evidence comes in degrees, from weak evidence to overwhelming evidence. The conclusion of this argument merely states that the good and evil we know about is evidence favoring one hypothesis over another without specifying the strength of that evidence. Second, this conclusion is consistent with a wide range of views about what is reasonable for us to believe. The conclusion is consistent with views like: it is reasonable to believe Theism; it is reasonable to believe the Hypothesis of Indifference; it is not reasonable to believe or disbelieve either. To be sure, this argument still asserts Central Claim; and as we see in section 5, a number of authors have objected to Central Claim and arguments for it. But the conclusion drawn from Central Claim is quite modest. Perhaps for these reasons, defenders of Humean arguments from Philo to the present have tended to defend Humean arguments with more ambitious conclusions.
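On a probabilistic reading of “predicts” (an assumption here, not something the argument itself requires), the step from Central Claim to this modest conclusion resembles what statisticians call the law of likelihood: data favor one hypothesis over another when the first makes the data more probable, and the ratio of the two probabilities measures how strongly. A minimal sketch, with hypothetical numbers and illustrative function names:

```python
def likelihood_ratio(p_data_given_h1, p_data_given_h2):
    """Ratio P(data | H1) / P(data | H2); values above 1 favor H1."""
    if p_data_given_h2 <= 0:
        raise ValueError("P(data | H2) must be positive")
    return p_data_given_h1 / p_data_given_h2


def favors_h1(p_data_given_h1, p_data_given_h2):
    """True when the data are more probable on H1 than on H2."""
    return likelihood_ratio(p_data_given_h1, p_data_given_h2) > 1


# Hypothetical values for how strongly each hypothesis predicts the data.
p_data_given_indifference = 0.2
p_data_given_theism = 0.02

print(favors_h1(p_data_given_indifference, p_data_given_theism))
```

The modest conclusion asserts only that the ratio exceeds 1. Nothing in the ratio says how probable either hypothesis is overall, which is why the conclusion is compatible with so many views about what it is reasonable to believe.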

Consider the following simple Humean argument against Theism:

Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.

Therefore, Theism is probably false.

This argument does not draw a conclusion comparing Theism to some rival. Rather, it draws a conclusion about Theism itself. In this way it is more ambitious than the argument just considered. What makes this Humean argument a simple Humean argument is that it only has one premise—Central Claim. However, this argument is not valid, and there are several reasons for thinking it is not very strong. The next section explains what those reasons are and how to strengthen Humean arguments by adding additional premises to produce a better (and arguably valid) argument.

4. Strengthening Humean Arguments

Suppose that Central Claim is true. Then a rival hypothesis (Hypothesis of Indifference) to a hypothesis (Theism) does a much better job predicting some data (what we know about good and evil). However, that fact on its own might not make it reasonable to believe the rival hypothesis (Hypothesis of Indifference) or disbelieve the relevant hypothesis (Theism). For the rival hypothesis might have other problems such as being ad hoc or not predicting other data (compare Plantinga (1996)).

An analogy will be useful in explaining these points. Suppose I come home to find that one of the glass windows on the back door of my home has been broken. These facts are “data” that I want to explain. One hypothesis is that the kids next door were playing and accidentally broke the glass with a ball (Accident Hypothesis). A rival hypothesis is that a burglar broke into my home by breaking the glass (Burglar Hypothesis). Now the Burglar Hypothesis better predicts the data. If the burglar is going to break into my home, an effective way to do that is to break the glass on the door to thereby unlock the door. By contrast, the Accident Hypothesis does a worse job predicting the data. Even if the kids were playing, the ball might not hit my door. And even if the ball did hit the door, it might not hit the glass with enough force to break it. So, in this case, the rival hypothesis (Burglar Hypothesis) to a hypothesis (Accident Hypothesis) does a much better job predicting some data (the broken glass on my back door). Does it thereby follow that it is reasonable for me to believe the rival hypothesis (Burglar Hypothesis) or it is unreasonable for me to believe the hypothesis (Accident Hypothesis)?

No, or at least, not yet. First, the Burglar Hypothesis is much less simple than the Accident Hypothesis. I already know that there are kids next door who like to play outside. I do not already know that there is a burglar who wants to break into my home. So the Burglar Hypothesis is committed to the existence of more things than I already know about. That makes the Burglar Hypothesis less ontologically simple. Second, the Burglar Hypothesis might not predict as well other data that I know. Suppose, for instance, there is a baseball rolling around inside my home, and nothing has been stolen. The Accident Hypothesis does a much better job predicting this data than the Burglar Hypothesis. So even if the Burglar Hypothesis better predicts some data, on its own, that would not make it reasonable for me to believe the Burglar Hypothesis or make it reasonable to disbelieve the Accident Hypothesis.
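The window example can be made concrete with the odds form of Bayes’ theorem, in which posterior odds equal prior odds multiplied by the likelihood ratio of each piece of data. This is a sketch under stated assumptions: every number is invented for illustration, lower simplicity is modeled as a lower prior, the two hypotheses are treated as the only candidates, and the pieces of data are treated as independent given each hypothesis.

```python
def posterior_odds(prior_odds, likelihood_ratios):
    """Odds of H1 against H2 after updating on each likelihood ratio.

    Assumes the pieces of data are conditionally independent given
    each hypothesis, so their likelihood ratios simply multiply.
    """
    odds = prior_odds
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds


# All values are hypothetical. Odds are Burglar : Accident.
prior = 1 / 20            # Burglar is the less simple hypothesis
broken_glass = 5.0        # better predicted by Burglar
baseball_inside = 1 / 50  # much better predicted by Accident
nothing_stolen = 1 / 10   # much better predicted by Accident

after_glass = posterior_odds(prior, [broken_glass])
after_all = posterior_odds(
    prior, [broken_glass, baseball_inside, nothing_stolen]
)
print(after_glass, after_all)
```

With these numbers the broken glass raises the odds of a burglary, yet the odds remain below even because of the low prior, and the further data push them lower still. Better prediction of one body of data does not settle which hypothesis it is reasonable to believe.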

Returning to Humean arguments, suppose Central Claim is true so that a rival to Theism, specifically the Hypothesis of Indifference, better predicts the good and evil we know about. It may not yet follow that it is reasonable to believe the Hypothesis of Indifference or disbelieve Theism. For it may be that the rival is much less simple than Theism. Or it may be that the rival to Theism does a much worse job predicting other data that we know about.

To strengthen Humean arguments, additional premises can be added (compare Dougherty and Draper (2013), Perrine and Wykstra (2014), Morriston (2014)). For instance, an additional premise might be Simplicity Claim: the Hypothesis of Indifference is just as simple as Theism, if not simpler. Another premise might be Not-Counterbalanced Claim: there is no body of data we know about that Theism does a much better job predicting than the Hypothesis of Indifference. The strengthened argument looks like this:

Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.

Simplicity Claim: the Hypothesis of Indifference is just as simple as Theism, if not simpler.

Not-Counterbalanced Claim: there is no body of data we know about that Theism does a much better job predicting than the Hypothesis of Indifference.

Therefore, Theism is probably false.

This argument is a stronger argument than the simple one-premise argument from the previous section. Arguably, it is valid. (Whether it is valid depends partly on the relationship between issues like simplicity and probability; but see Dougherty and Draper (2013: 69) for an argument that it is valid.)
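One way to see why the strengthened argument might be valid is to assume, as the parenthetical above notes is contestable, that simplicity translates into prior probability and predictive success into likelihood ratios. Because the two hypotheses are inconsistent, their probabilities cannot sum to more than 1, so odds that favor the Hypothesis of Indifference put a ceiling on the probability of Theism. The sketch below is only an illustration of that bound; the function name and the numbers are hypothetical, and it is not offered as a reconstruction of any published argument.

```python
def upper_bound_on_theism(odds_indifference_over_theism):
    """Largest P(Theism) compatible with the given odds.

    Because the two hypotheses are mutually exclusive,
    P(HI) + P(T) <= 1. If P(HI) = k * P(T), then P(T) <= 1 / (1 + k).
    """
    k = odds_indifference_over_theism
    if k < 0:
        raise ValueError("odds must be non-negative")
    return 1 / (1 + k)


# Hypothetical inputs standing in for the three premises.
prior_odds = 1.0          # Simplicity Claim: HI at least as simple
good_and_evil_ratio = 10  # Central Claim: HI predicts this data much better
other_data_ratio = 1.0    # Not-Counterbalanced Claim: no offsetting data

final_odds = prior_odds * good_and_evil_ratio * other_data_ratio
bound = upper_bound_on_theism(final_odds)
print(bound)
```

Whenever the final odds exceed 1, the bound falls below one half. If any of the three inputs drops far enough below 1, the bound rises above one half and the conclusion no longer follows, which is why all three premises are needed.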

Premises like Simplicity Claim and Not-Counterbalanced Claim are not always defended in discussion of arguments from evil. But they can be defended by pressing into service other work in the philosophy of religion. For instance, natural theologians try to provide evidence for the existence of God by appealing to facts we know about. Critics argue that such evidence does not support Theism or, perhaps, supports Theism only to a limited degree. These exchanges are relevant to evaluating Not-Counterbalanced Claim. To be sure, Humean arguments compare Theism to some rival. So other work in philosophy of religion might not straightforwardly apply if it does not consider a rival to Theism or considers a different rival than the one used in the relevant Humean argument.

These additional premises strengthen Humean arguments because Humean arguments are not logical or incompatibility arguments. That is, they do not claim that the good and evil we know about is logically inconsistent with Theism. Rather, they are abductive arguments. They claim that what we know about good and evil is evidence against Theism because some rival to Theism better predicts or explains it. But in evaluating how well a hypothesis explains some data, it is oftentimes important to also consider further facts about the hypothesis, such as how simple it is or whether it is known to be false or otherwise problematic.

Lastly, some might think that the relation between simple and strengthened Humean arguments is just a matter of whether we have considered some evidence against Theism or all relevant evidence for or against Theism. On this view, considering some of the evidence and considering all of the evidence are just two different tasks, and the first task can be done without consideration of the second. However, the relation between simple and strengthened Humean arguments is a little more complex than that for certain methodological reasons.

Each of the premises of a strengthened Humean argument involves a comparison of Theism with a specific rival to Theism. But the specific choice of the rival might make it easier to defend some of the comparisons while simultaneously making it harder to defend other comparisons. For instance, the Hypothesis of Indifference does not posit any entity that has the ability or desire to influence life on earth. Some defenders of Central Claim might use that feature to argue that the Hypothesis of Indifference has better predictive fit than Theism with regard to the good and evil we know about. But exactly because the Hypothesis of Indifference does not posit any entity that has the ability or desire to influence life on earth, it may have worse predictive fit when it comes to the fine-tuning of the universe, the existence of life at all, the existence of conscious organisms, the existence of moral agents, and other potential evidence. So picking the Hypothesis of Indifference might make it easier to defend some premises of a strengthened Humean argument (perhaps Central Claim) while also making it harder to defend other premises of a strengthened Humean argument (perhaps Not-Counterbalanced Claim).

As such, the relationship between a simple and strengthened Humean argument is more complex. It is not simply a matter of considering one potential pool of evidence and then considering a larger pool of evidence. Rather, the choice of a specific rival to Theism is relevant to an evaluation of both simple and strengthened Humean arguments. Some specific rivals might make it easier to defend a simple Humean argument while also making it harder to defend a strengthened Humean argument (or vice versa). Defenders of Humean arguments have to carefully choose a specific rival that balances simplicity and predictive strength to challenge Theism.

5. Criticisms of Humean Arguments

Like all philosophical arguments, Humean arguments have received their fair share of criticisms. This section describes a handful of criticisms and potential responses to those criticisms. These criticisms are all criticisms of Central Claim (or premises like it). Consequently, these objections could be lodged against simple Humean arguments and strengthened Humean arguments, as well as the “modest” Humean argument mentioned at the end of section 3. (For a discussion of historical responses to Hume’s writing on religion, see Pyle (2006: chapter 5).)

a. Objection 1: Limited Biological Roles

Some authors object to the biological role argument for Central Claim (Plantinga (1996), Dougherty and Draper (2013)). Consider the wide range of pleasure and pain we know about. For instance, I get pleasure out of reading a gripping novel, listening to a well-crafted musical album, or tasting the subtle flavors of a well-balanced curry. Likewise, consider the pain of self-sacrifice, the displeasure of a hard workout, or the frustration of seeing a coworker still fail to fill in standardized forms correctly. The objection goes that these pleasures and pains do not seem to serve any biological roles.

Defenders of Humean arguments might respond in two ways. First, they might distinguish between the pleasure and pain of humans and of non-human animals. It might be that the pleasure and pain in non-human animals is much more likely to play a biological role than the pleasure and pain in humans. Thus, overall, pleasure and pain are more likely to play a biological role. Second, they might point out that Central Claim does not imply that the Hypothesis of Indifference does a good job explaining pleasure and pain. Rather, it implies that the Hypothesis of Indifference does a much better job than Theism. Thus, from the mere fact that some pleasures and pains do not seem to serve any biological roles it would not follow that Theism does a better job of predicting pleasure and pain than the Hypothesis of Indifference.

b. Objection 2: Naturalism and Normativity

Humean arguments maintain that what we know about good and evil is better predicted or explained by some rival to Theism than by Theism itself. In a simple understanding, what we know about good and evil includes claims like: it is bad that stray cats starve in the winter. However, some critics argue that the best explanation of the existence of good and evil is Theism itself. That is, they might argue that a purely naturalistic world, devoid of any supernatural reality, does a much worse job predicting the existence of good and evil than a claim like Theism. The argument here is abductive: there might not be any contradiction in claiming that the world is purely naturalistic and that there is good and evil. Nonetheless, a purely naturalistic hypothesis does a much worse job predicting or explaining good and evil than Theism. Thus, these critics argue, premises like Central Claim are false, since Theism does a much better job of explaining the existence of good and evil than naturalistic alternatives to Theism (see Lauinger (2014) for an example of this criticism).

Note that this objection only applies to certain kinds of Humean arguments. Specifically, it only applies to Humean arguments that implicitly or explicitly assume a rival to Theism that is a purely naturalistic hypothesis. However, not all rivals to Theism need be a purely naturalistic hypothesis. For instance, some of the rivals that Philo considered are not purely naturalistic. Nonetheless, many contemporary authors do accept a purely naturalistic worldview and would compare that worldview with a Theistic one.

In response, defenders of Humean arguments might defend metaethical naturalism. According to metaethical naturalism, normative facts, including facts about good and evil, are natural facts. Defenders of Humean arguments might argue that given metaethical naturalism, a purely naturalistic worldview does predict, to a high degree, normative facts. Determining whether this response succeeds, though, would require a foray into complex issues in metaethics.

c. Objection 3: God’s Obligations

Many philosophers and ordinary people assume that if Theism is true, then God has certain obligations to us. For instance, God is obligated to not bring about evil for us for absolutely no reason at all. These obligations might be based in God’s nature or some independent order. Either way, God is required to treat us in certain ways. The idea that if Theism is true, then God has certain obligations to us is a key idea in defending arguments from evil, including Humean arguments from evil. For instance, one of the defenses of Central Claim from above said that Theists might be surprised at the distribution of good and evil we know about. They might be surprised because they expect God to prevent that evil, since God has an obligation to prevent it and, being all-powerful, could prevent it. In this way, defenses of Central Claim (and premises like it) may implicitly assume that if Theism is true, then God has certain obligations to us.

However, some philosophers reject the claim that God has certain obligations to us (Adams (2013), Murphy (2017)). In these views, God might have a justifying reason to prevent evils and harms to us; but God does not have requiring reasons of the sort generated by obligations. There are different arguments for these views, and they are normally quite complex. But the arguments normally articulate a conception of God in which God is not a moral agent in the same way an average human person is a moral agent. But if God is not required to prevent evils and harms for us, God is closer to Hume’s “indifferent creator.” Just as an indifferent creator may, if they so desire, improve the lives of humans and animals, so too God may, if God so desires, improve the lives of humans and animals. But neither God nor the indifferent creator must do so.

Defenders of Humean arguments may respond to these arguments by simply criticizing these conceptions of God. Defenders of Humean arguments might argue that those conceptions are false or subtly incoherent. Alternatively, they might argue that those conceptions of God make it more difficult to challenge premises like Not-Counterbalanced Claim. For if God only has justifying reasons for treating us in certain ways, there might be a wide range of potential ways God would allow the world to be. But if there is a wide range of potential ways God would allow the world to be, then Theism does not make very specific predictions about how the world is. In this way, critics of Humean arguments may make it easier to challenge a premise like Central Claim but at the cost of making it harder to challenge a premise like Not-Counterbalanced Claim.

d. Objection 4: Skeptical Theism

Perhaps some of the most persistent critics of Humean arguments are skeptical theists (van Inwagen (1991), Bergmann (2009), Perrine and Wykstra (2014), Perrine (2019)). While there are many forms of skeptical theism, a unifying idea is that even if God were to exist, we should be skeptical of our ability to predict what the universe is like—including what the universe is like regarding good and evil. Skeptical theists develop and apply these ideas to a wide range of arguments against Theism, including Humean arguments.

Skeptical theistic critiques of Humean arguments can be quite complex. Here the critiques are simplified into two parts that form a simple modus tollens structure. The first part is to argue that there are certain claims that we cannot reasonably disbelieve or otherwise reasonably rule out. (In other words, we should be skeptical of their truth.) The second part is to argue that if we are reasonable in believing Central Claim (or something like it), then it is reasonable for us to disbelieve those claims. Since it is not reasonable for us to disbelieve those claims, it follows that we are not reasonable in believing Central Claim (or something like it).

For the first part, consider a claim like this:

Limitation. God is unable to create a world with a better balance of good and evil without sacrificing other morally significant goods.

Skeptical theists argue that it is not reasonable for us to believe that Limitation is false; rather, we should be skeptical of its truth or falsity. One might argue that it is reasonable for us to believe that Limitation is false because it is hard for us to identify the relevant morally significant goods. But skeptical theists argue that this is a poor reason for disbelieving Limitation since God is likely to have created the world with many morally significant goods that are obscure to us. One might argue that it is reasonable for us to believe that Limitation is false because it is easy for us to imagine or conceive of a world in which it is false. But skeptical theists argue that this is a poor reason for disbelieving Limitation because conceivability is an unreliable guide to possibility when it comes to complex claims such as Limitation. In general, skeptical theists argue that our grasp of the goods and evils there are, as well as how they are connected, is too poor for us to reasonably disbelieve something like Limitation. In this way, they are skeptical of our access to all of the reasons God might have that are relevant to the permission of evil.

The second part of the skeptical theist’s critique is that if it is not reasonable for us to believe Limitation is false, then it is not reasonable for us to believe Central Claim is true. This part of the skeptical theist’s critique may seem surprising. Central Claim is a comparison between two hypotheses. Limitation is not comparative. Nonetheless, skeptical theists think they are importantly related. To see how they might relate, an analogy might be useful.

Suppose Keith is a caring doctor. How likely is it that Keith will cut a patient with a scalpel? At first blush, it might seem that it is extremely unlikely. Caring doctors do not cut people with scalpels! But on second thought, it is natural to think that whether Keith will cut a patient with a scalpel depends upon the kinds of reasons Keith has. If Keith has no compelling medical reason to do so, then given that Keith is a caring doctor, it is extremely unlikely Keith will cut a patient with a scalpel. But if Keith does have a compelling reason—he is performing surgery or a biopsy, for instance—then even if Keith is a caring doctor, it is extremely likely he will cut a patient with a scalpel. Now suppose someone claims that Keith will not cut a patient with a scalpel. That person is committed to a further claim: that Keith lacks a compelling medical reason to cut the patient with a scalpel. After all, even a caring doctor will cut a patient with a scalpel if there is a compelling medical reason to do so.

So, reconsider:

Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.

There are several arguments one can give for Central Claim. But most of them utilize a simple idea: if Theism is true, there is a God who has reason for preventing the suffering and evil we know about, but if the Hypothesis of Indifference is true, there is no creator with such reasons. But, skeptical theists claim, God might have reasons for permitting suffering and evil if by doing so God can achieve other morally significant goods. Thus, to claim that God would prevent the suffering and evil we know about assumes that God could create a world with a better balance of good and evil without sacrificing other morally significant goods. (Compare: to claim that Keith, the kindly doctor, would not cut a patient with a scalpel assumes that Keith lacks a compelling medical reason to cut the patient with a scalpel.) Thus, if it is reasonable for us to believe Central Claim, it must also be reasonable for us to disbelieve:

Limitation: God is unable to create a world with a better balance of good and evil without sacrificing other morally significant goods.

After all, God might create a world with this balance of good and evil if it were necessary for other morally significant goods. But at this point, the first part of the skeptical theistic critique is relevant. For the skeptical theist claims that it is not reasonable for us to disbelieve Limitation. To do that, we would have to have a better understanding of the relationship between goods and evils than we do. Since it is not reasonable for us to reject Limitation, it is not reasonable for us to accept Central Claim.

As indicated earlier, the skeptical theist’s critique is quite complex. Nonetheless, some defenders of Humean arguments think that the criticism fails because the reasons skeptical theists give for doubting Central Claim can be offset or cancelled out. The defenders of Humean arguments reason by parity here. Suppose that the skeptical theist is right and that, for all we know, God could not have created a better balance of good and evil without sacrificing other morally significant goods. And suppose that the skeptical theist is right that this gives us a reason for doubting Central Claim. Well, that skepticism cuts both ways. For all we know, God could have created a better balance of good and evil without sacrificing other morally significant goods. By parity, that gives us a reason for accepting Central Claim. Thus, the skepticism of skeptical theism gives us both a reason to doubt Central Claim and a reason for accepting Central Claim. These reasons offset or cancel each other out. But once we set aside these offsetting reasons, we are still left with strong reasons for accepting Central Claim—namely, the reasons given by the arguments of section II. So, the skeptical theist’s critique does not ultimately succeed.

6. References and Further Reading

  • Adams, Marilyn McCord. (1999). Horrendous Evils and the Goodness of God. Cornell University Press.
  • Develops and responds to an argument from evil based on horrendous evils.

  • Adams, Marilyn McCord. (2013). “Ignorance, Instrumentality, Compensation, and the Problem of Evil.” Sophia. 52: 7-26.
  • Argues that God does not have obligations to us to prevent evil.

  • Bergmann, Michael. (2009). “Skeptical Theism and the Problem of Evil.” In Thomas Flint and Michael Rea, eds., The Oxford Handbook of Philosophical Theology. Oxford University Press.
  • A general introduction to skeptical theism that also briefly criticizes Humean arguments.

  • Hume, David. Dialogues Concerning Natural Religion, part XI.
  • The original presentation of a Humean argument.

  • Dougherty, Trent and Paul Draper. (2013). “Explanation and the Problem of Evil.” In Justin McBrayer and Daniel Howard-Snyder, eds., The Blackwell Companion to the Problem of Evil. Blackwell Publishing.
  • A debate on Humean arguments.

  • Draper, Paul. (1989). “Pain and Pleasure: An Evidential Problem for Theists.” Noûs. 23: 331-350.
  • A classic modern presentation of a Humean argument.

  • Draper, Paul. (2013). “The Limitation of Pure Skeptical Theism.” Res Philosophica. 90.1: 97-111.
  • A defense of Humean arguments from skeptical theistic critiques.

  • Draper, Paul. (2017). “Evil and the God of Abraham, Anselm, and Murphy.” Religious Studies. 53: 564-72.
  • A defense of Humean arguments from the criticism that God lacks obligations to us.

  • Lauinger, William. (2014). “The Neutralization of Draper-Style Evidential Arguments from Evil.” Faith and Philosophy. 31.3: 303-324.
  • A critique of Humean arguments that good and evil better fit with Theism than naturalism.

  • Martin, Michael. (1978). “Is Evil Evidence Against the Existence of God?” Mind. 87.347: 429-432.
  • A brief argument that our inability to see God’s reasons for permitting suffering is evidence against Theism.

  • Morriston, Wes. (2014). “Skeptical Demonism: A Failed Response to a Humean Challenge.” In Trent Dougherty and Justin McBrayer, eds., Skeptical Theism. Oxford University Press.
  • A defense of a Humean argument from Skeptical Theism.

  • Murphy, Mark. (2017). God’s Own Ethics. Oxford: Oxford University Press.
  • A criticism of Humean arguments from the claim that God lacks obligations to us.

  • O’Connor, David. (2001). Hume on Religion. Routledge Press, chapter 9.
  • A modern discussion of Philo’s argument from evil that discusses the weak and strong interpretations.

  • Perrine, Timothy and Stephen Wykstra. (2014). “Skeptical Theism, Abductive Atheology, and Theory Versioning.” In Trent Dougherty and Justin McBrayer, eds., Skeptical Theism. Oxford University Press.
  • A skeptical theistic critique of Humean arguments, focusing on the methodology of the arguments.

  • Perrine, Timothy. (2019). “Skeptical Theism and Morriston’s Humean Argument from Evil.” Sophia. 58: 115-135.
  • A skeptical theistic critique of Humean arguments that defends them from the offsetting objection.

  • Pitson, Tony. (2008). “The Miseries of Life: Hume and the Problem of Evil.” Hume Studies. 34.1: 89-114.
  • A historical discussion of Hume’s views on the relation between the problem of evil and natural theology and atheology.

  • Plantinga, Alvin. (1996). “On Being Evidentially Challenged.” In Daniel Howard-Snyder, ed., The Evidential Argument From Evil. Bloomington, IN: Indiana University Press.
  • An argument that Humean arguments need to be strengthened to be cogent.

  • Pyle, Andrew. (2006). Hume’s Dialogues Concerning Natural Religion. Continuum.
  • A modern commentary on Hume’s Dialogues that provides a discussion of its historical place and reception.

  • Van Inwagen, Peter. (1991 [1996]). “The Problem of Evil, the Problem of Air, and the Problem of Silence.” Reprinted in Daniel Howard-Snyder, ed., The Evidential Argument From Evil. Bloomington, IN: Indiana University Press.
  • An earlier skeptical theistic critique of Humean arguments.

 

Author Information

Timothy Perrine
Email: tp654@scarletmail.rutgers.edu
Rutgers University
U. S. A.

The Metaphysics of Nothing

This article is about nothing. It is not the case that there is no thing that the article is about; nevertheless, the article does indeed explore the absence of referents as well as referring to absence. Nothing is said to have many extraordinary properties, but in predicating anything of nothingness we risk contradicting ourselves. In trying to avoid such misleading descriptions, nothingness could be theorised as ineffable, though that theorisation itself is an attempt to describe it. Maybe nothingness is dialetheic, or maybe there are no things that are dialetheic, since contradictions are infamous for leading to absurdity. Contradictions and nothingness can explode very quickly into infinity, giving us everything out of nothing. So, perhaps nothing is something after all.

This article considers different metaphysical and logical understandings of nothingness via an analysis of the presence/absence distinction, by considering nothing first as the presence of absence, second as the absence of presence, third as both a presence and an absence, and fourth as neither a presence nor an absence. In short, it analyses nothingness as a noun, a quantifier, a verb, and a place, and it postulates nothingness as a presence, an absence, both, and neither.

Table of Contents

  1. Introduction—Nothing and No-thing
  2. Nothing as Presence of Absence
  3. No-thing as Absence of Presence
    1. Eliminating Negation
    2. Eliminating True Negative Existentials
    3. Eliminating Referring Terms
    4. Eliminating Existentially Loaded Quantification
  4. Beyond the Binary—Both Presence and Absence
    1. Dialectical Becoming
    2. Dialetheic Nothing
  5. Beyond the Binary—Neither Presence nor Absence
    1. The Nothing Noths
    2. Absolute Nothing
  6. Conclusion
  7. References and Further Reading

1. Introduction—Nothing and No-thing

Consider the opening sentence:

“This article is about nothing.”

This has two readings:

(i) This article is about no-thing (in that there is no thing that this article is about).

(ii) This article is about Nothing (in that there is something that this article is about).

The first reading (i) is a quantificational reading about the (lack of) quantity of things that this article is about. ‘Quantificational’ comes from ‘quantifier’, where a quantifier is a quantity term that ranges over entities of a certain kind. In (i), the quantity is none, and the entities that there are none of are things. This reading is referred to throughout the article as ‘no-thing’ (hyphenated, rather than the ambiguous ‘nothing’) to highlight this absence of things. The second reading (ii) is a noun phrase about the identity of the thing that this article is about. This reading is referred to throughout the article as ‘Nothing’ (capitalised, again avoiding the ambiguous ‘nothing’) to highlight the presence of a thing. In going from (i) to (ii), we have made a noun out of a quantity (a process we can call ‘nounification’). We have given a name to the absence, Nothing, giving it a presence. Sometimes this presence is referred to as ‘nothingness’, but that locution is avoided here since usually the ‘-ness’ suffix in other contexts indicates a quality or way of being, rather than a being itself (compare the redness of a thing to red as a thing, for example), and as such ‘nothingness’ is reserved for describing the nothing-y state of the presence Nothing and the absence no-thing.

It is important not to conflate these readings, and they cannot be reduced to one or the other. To demonstrate their distinctness, consider that (i) and (ii) have different truth values: (ii) is true whilst (i) is false. It is not the case that this article is not about anything (that is, that there is no x whatsoever that this article is about). Were that so, the article would be very short indeed (or even empty), bereft of a topic and perhaps bereft of meaning. I intend to do better than that. My intentional states are directed towards Nothing, hence the truth of (ii): there is indeed a topic of this article, and that topic (the subject, or even object, of it) is Nothing.

There has been much debate over whether it is legitimate to nounify the quantificational reading of no-thing. Those who are sceptical would say that the ambiguous ‘nothing’ is really not ambiguous at all and should only be understood as a (lack of) quantity, rather than a thing itself. They might further argue that it is just a slip of language that confuses us into taking Nothing to be a thing, and that some of the so-called paradoxes of nothingness arise from illegitimate nounification and otherwise dissolve into mere linguistic confusions. The dialogues between characters in Lewis Carroll’s Alice in Wonderland and Through the Looking Glass are often cited as exemplars of such slippage and confusions. For instance [with my own commentary in square brackets]:

“‘I see nobody [that is, no-body as a quantifier] on the road’, said Alice.

‘I only wish I had such eyes’, the King remarked in a fretful tone.

‘To be able to see Nobody! [that is, Nobody as a noun] And at that distance too! Why, it’s as much as I can do to see real people [that is, somebodyness, rather than nobodyness, as states], by this light!’” (1871 p234)

Here, the term under consideration is ‘nobody’, and the same treatment applies to it as to ‘nothing’ (in that we can disambiguate ‘nobody’ into the quantificational no-body and nounified Nobody). Alice intended to convey that there were no-bodies (an absence of presence) in quantitative terms. But the King then nounifies the quantifier, moving to a presence of absence, and applauds Alice on her apparent capacity to see Nobody.
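The gap between Alice’s quantifier and the King’s noun can be made concrete in a short sketch. The Python below is only an illustration under stated assumptions: the names sees_no_body, sees_nobody_entity, and NOBODY are hypothetical and are not drawn from Carroll or from the literature. The first function evaluates the quantificational reading over a domain of bodies; the second treats ‘Nobody’ as a name for an object that could itself be among the things seen.

```python
def sees_no_body(bodies_on_road):
    """Quantifier reading: there is no x such that x is a body on the road."""
    return not any(True for _ in bodies_on_road)


# Hypothetical stand-in for the King's nounified 'Nobody': a single object
# that a sentence could refer to.
NOBODY = object()


def sees_nobody_entity(things_seen):
    """Noun reading: the object named 'Nobody' is among the things seen."""
    return any(thing is NOBODY for thing in things_seen)


empty_road = []
print(sees_no_body(empty_road))        # Alice's reading of her own remark
print(sees_nobody_entity(empty_road))  # the King's reading of the same remark
```

On an empty road the first call returns True and the second returns False, so the two readings come apart in truth value. The quantifier reading needs no object to be true, whereas the noun reading is true only if some object has been supplied for the name to pick out.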

Making this shift from things to bodies is helpful because bodies are less abstract than things (presumably you are reading this article using your body, your family members have bodies, animals have bodies, and so you have an intuitive understanding of what a body is). Once we have determined what is going on with no-body and every-body, we can apply it to no-thing and every-thing. So, consider now ‘everybody’. When understood as a quantifier, every-body is taken to mean all the bodies in the relevant domain of quantification (where a domain of quantification can be understood as the selection of entities that our quantifier terms range over). Do all those bodies, together, create the referent of Everybody as a noun? In other words, does Everybody as a noun refer to all the bodies within the quantitative every-body? One of the mistakes made by the likes of the King is to treat the referent of the noun as itself an instance of the type of entity the quantifier term is quantifying over. This is clear with respect to bodies, as Everybody is not the right sort of entity to be a body itself. All those bodies, together, are not themselves a body (unless your understanding of what a body is can accommodate such a conglomerate monster). Likewise, Nobody, when understood alongside its quantifier reading of no-body as a lack of bodies, is not itself a body (as, by definition, it has no bodies). So, the King, who is able to see only ‘real people’, makes a category mistake in taking Nobody to be, presumably, ‘unreal people’. Nobody, like Everybody, is quite simply not the right category of entity to instantiate or exemplify people-hood or bodyness, or to be a body itself.

The lesson we have learnt from considering ‘nobody’ is that nounifying the quantifier (no-body) does not create an entity (Nobody) of the kind that is being quantified over (bodies). So, returning to the more general terms ‘nothing’ and ‘everything’, are they the right kind of entities to be things themselves? Do Nothing and Everything, as nouns, refer to things, the same category of thing that their quantifier readings of no-thing and every-thing quantify over? The level of generality we are working with when talking of things makes it more difficult to diagnose what is going on in these cases (by comparison with Nobody and Everybody, for example).

To help, we can apply the lessons learnt from Alfred Tarski (1944): when talking of these entities as things, we do so within a higher order or level of language, a metalanguage, in order to avoid paradox. We can see how this works with the Liar Paradox. Consider the following sentence, call it ‘S’: ‘This sentence is false’. Now consider a second sentence that says that S is true, and call it ‘S*’: ‘S is true’. If S (and thereby also S*) is true, then what S says is the case, and since S says of itself that it is false, S is false. On the other hand, if S (and thereby also S*) is false, then S turns out to be true, since S states ‘This sentence is false’ and would thereby be saying something true. Tarski’s trick is to say that S and S* are in different levels of language. By distinguishing the level of language that S is talking in when it says it ‘… is false’ from the level of language that S* is talking in when it says that S ‘is true’, we avoid the contradiction of having S be true and false at the same time within the same level. S is in the first level or order of language, the object language, and when we talk about S we ascend to a higher level or order of language, the metalanguage. As such, the truth and falsity appealed to in S are of the object language, and the truth and falsity appealed to in S* are of the metalanguage.
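Both halves of this reasoning can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Tarski’s own formalism: the Sentence class, the numeric level tags, and the function names are hypothetical devices introduced here. The first function shows that no classical truth value satisfies the Liar’s condition; the second models the levelling idea by refusing any truth ascription that is not made from a strictly higher level.

```python
from dataclasses import dataclass


def consistent_liar_values():
    """Classical values v for S, where S is true exactly when S is false."""
    return [v for v in (True, False) if v == (not v)]


@dataclass(frozen=True)
class Sentence:
    text: str
    level: int  # 0 = object language, 1 = metalanguage, and so on


def ascribe_truth(target, level):
    """Build the sentence "'<target>' is true" at the given level.

    Assumption of this sketch: a truth ascription must sit at a strictly
    higher level than the sentence it is about.
    """
    if level <= target.level:
        raise ValueError("truth can only be ascribed from a higher level")
    return Sentence(f"'{target.text}' is true", level)


s = Sentence("This sentence is false", 0)
s_star = ascribe_truth(s, 1)  # allowed: S* talks about S from one level up

print(consistent_liar_values())
print(s_star.level)
```

The list of consistent values is empty, which is the paradox in miniature. Calling ascribe_truth(s, 0) raises an error, which mirrors the thought that talk about the truth of S has to take place one level above S.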

Applying Tarski’s trick to Nothing, perhaps Nothing cannot be considered a thing at the same level as the things it is not, just as Everything cannot be considered a thing at the same level as all the things it encapsulates. As quantifier terms, no-thing and every-thing quantify over things in the first level or order of the object language. As nouns, Nothing and Everything can only be considered things themselves in the higher level or order of the metalanguage, which speaks about the object language. The ‘things’ (or lack of) quantified over by every-thing and no-thing are of the object language, whereas the type of ‘thing’ that Everything and Nothing are belongs to the metalanguage. This avoids Nothing being a thing of the same type that there are no-things of.

Finally, then, with such terminology and distinctions in hand, we are now in a position to understand the difference between the presence of an absence (Nothing, noun) and the absence of a presence (no-thing, quantifier). Lumped into these two theoretical categories are the related positions of referring to a non-existing thing and failing to refer to any thing at all (whilst there are important variations between them, there are illuminating similarities that justify their shared treatment). Each of these approaches is explored in turn before describing other ways in which one can derive (and attempt to avoid deriving) the existence of some-thing from no-thing.

2. Nothing as Presence of Absence

When we sing that childhood song, ‘There’s a hole in my bucket, dear Liza’, the lyrics can be interpreted as straightforwardly meaning that there really is, there really exists, a hole in the bucket, and it is to that hole that the lyrics refer. Extrapolating existence in this sort of way from our language is a Quinean (inspired by the work of W. V. O. Quine) criterion for deriving ontological commitments, and specifically Quine argued that we should take to exist what our best scientific theories refer to. Much of our language is about things, and according to the principle of intentionality, so are our thoughts, in that they are directed towards or refer to things. (Of course, not all language and thought point to things: for example, in the lyrics above, the words ‘a’ and ‘in’ do not pick out entities in the way that ‘bucket’ and ‘Liza’ do. The question is whether ‘hole’ and ‘nothing’ function more like nonreferential ‘a’ and ‘in’ or referential ‘bucket’ and ‘Liza’.)

In our perceptual experiences and in our languages and theories we can find many examples of seeming references to nothingness, including to holes, gaps, lacks, losses, absences, silences, voids, vacancies, emptiness, and space. If we take such experiences, thoughts, and language at face value, then nothingness, in its various forms, is a genuine feature of reality. Jean-Paul Sartre is in this camp, and, in Being and Nothingness, he argues that absences can be the objects of judgements. Famously, Sartre described the situation in which he arrived late for his appointment with Pierre at a café, and ‘sees’ the absence of Pierre (because Pierre is who he is expecting to see, and the absence of Pierre frustrates that expectation and creates a presence of that absence—Sartre does not also ‘see’ the absence of me, because he was not expecting to see me). Relatedly, and perhaps more infamously, Alexius Meinong takes non-existent things to have some form of Being, such that they are to be included in our ontology, though Meinongians—those inspired by Meinong—disagree on what things specifically should be taken as non-existent.

So, what things should we take to exist? Consider the Eleatic principle, which states that only causes are real. Using this principle, Leucippus noted that voids have causal power, and generalised that nonbeings are causally efficacious, such that they are as real as atoms and beings in general. When we sing, on the part of Henry, his complaints to dear Liza that the water is leaking from his bucket, the hole is blamed as the cause of this leakage, and from this we might deduce the hole’s existence (the presence of an absence with causal powers). Similarly, we might interpret Taoists as believing that a wide variety of absences can be causes (for example, doing no-thing, or as little as possible so as to minimise disruption to the natural way of the Tao, is considered the best course of ‘(in)action’), and as such are part of our reality. As James Legge has translated from the Tao Te Ching: “Vacancy, stillness, placidity, tastelessness, quietude, silence, and non-action, this is the level of heaven and earth, and the perfection of the Tao and its characteristics” (1891 p13).

Roy Sorensen (2022) has gone to great lengths to describe the ontological status of various nothings, and his book on ‘Nothing’ (aptly named Nothing) opens with the following interesting case about when the Mona Lisa was stolen from the Louvre in Paris. Apparently, at the time, more Parisians visited the Louvre to ‘see’ the absence than they did the presence of the Mona Lisa, and the ‘wall of shame’ where the Mona Lisa once hung was kept vacant for weeks to accommodate demand. The Parisians regarded this presence of the absence of the Mona Lisa as something that could be photographed, and they aimed to get a good view of this presence of absence for such a photo, otherwise complaining that they could not ‘see’ if their view was obstructed. Applying the Eleatic principle, the principle of intentionality, a criterion for ontological commitment, or other such metaphysical tests to this scenario (as with Sartre’s scenario) may provide a theoretical basis for interpreting the ‘object’ of the Parisians’ hype (and the missing Pierre) as a presence of absence (of presence)—a thing, specifically, a Nothing.

Interpreting Nothing as a presence of absence requires us to understand Nothing as a noun that picks out such a presence of absence. If there is no such presence of this nothingness, and instead such a state is simply describing where something is not, then it is to be understood as an absence of presence via a quantificational reading of there being no-thing that there is. It can be argued that the burden of proof is on the latter position, which denies Nothing as a noun, to argue that there is only absence of a presence rather than a presence of absence. Therefore, in what follows, we pay close attention to this sceptical view to determine whether we can get away with nothingness as an absence, where there is no-thing, rather than there being a presence of Nothing as our language and experience seem to suggest.

3. No-thing as Absence of Presence

Returning to Liza and that leaking bucket, instead of there being a hole in the bucket, we could reinterpret the situation as the bucket having a certain perforated shape. Rather than there being a presence of a hole (where the hole is an absence), we could say that there is an absence of bucket (where the bucket is a presence) at the site of the leaking water. Such a strategy can be used not only to avoid the existence of holes as things themselves, but also to reinterpret other negative states in positive ways. For example, Aristotle, like Leucippus, argues from the Eleatic principle in saying that omissions can be causes, but to avoid the existence of omissions themselves this seeming causation-by-absence must be redescribed within the framework of Being. As such, negative nothings are just placeholders for positive somethings.

We can see a parallel move happen with Augustine who treats Nothing as a linguistic confusion—where others took there to be negative things (presences of an absence), Augustine redescribed those negative things as mere lacks of positive things (absences of a presence). For example, Mani thought ‘evil’ names a substance, but Augustine says ‘evil’ names an absence of goodness just as ‘cold’ names the absence of heat. Saying that evil exists is as misleading as saying cold exists, as absences are mere privations, and privations of presences specifically. Adeodatus and his father argue similarly, where Adeodatus says ‘nihil’ refers to what is not, and in response his father says that to refer to what is not is to simply fail to refer (see Sorensen 2022 p175). This interpretation of language is speculated to have been imported from Arab grammarians and been influenced by Indian languages where negative statements such as ‘Ostriches do not fly’ are understood as metacognitive remarks that warn us not to believe in ostrich flight rather than a description of the non-flight of ostriches (again see Sorensen 2022 p176 and p181).

Bertrand Russell attempted to generalise this interpretation of negative statements by reducing all negative truths to positive truths (1985). For example, he tried to paraphrase ‘the cat is not on the mat’ as ‘there is a state of affairs incompatible with the cat being on the mat’. But of course, this paraphrase still makes use of negation with respect to ‘incompatible’, which simply means ‘not compatible’, and even when he tried to model ‘not p’ as an expression of ‘disbelief that p’, this too requires negation in the form of believing that something is not the case (or not believing that something is the case). This ineliminability of negation, and of the negative facts we find it in, meant that Russell eventually abandoned this project and (in a famous lecture at Harvard) conceded that irreducibly negative facts exist. Dorothy Wrinch (1918) jests at the self-refuting nature of such positions that try to eliminate the negative, by saying that it is “a little unwise to base a theory on such a disputable point as the non-existence of negative facts”. So can we eliminate Nothing in favour of no-thing? Can we try, as Russell did, to avoid the presence of negative absences like Nothing, and instead only appeal to the absence of positive presences like no-thing? Can we escape commitment to the new thing created by nounifying no-thing into Nothing? Can no-thing do all the work that Nothing does? Consider various strategies.

a. Eliminating Negation

Despite Russell’s attempt, it seems we cannot eliminate negative facts from our natural language. But from the point of view of formal languages, like that of logic, negation is in fact dispensable. Take, for example, the pioneering work of Christine Ladd-Franklin. In 1883, her dissertation put forward an entire logical system based on exclusion, where she coined the NAND operator which reads ‘not … and not …’, or ‘neither … nor …’. This closely resembles the work of Henry Sheffer, who later, in 1913, demonstrated that all of the logical connectives can be defined in terms of the dual of disjunction, which he named NOR (short for NOT OR, ‘neither … nor …’), or the dual of conjunction, which was (confusingly) named NAND (short for NOT AND, ‘either not … or not …’) and has come to be known as the Sheffer stroke. This Sheffer stroke, like Ladd-Franklin’s earlier NAND operator, does away with the need for a symbolic representation of negation. Another example of such a method is Alonzo Church’s formal language, in which the propositional constant f is stipulated to always be false (1956, §10), and negation can then be defined in terms of f: ~A =df A → f. If we can do away with formal negation, then perhaps this mirrors the possibility of doing away with informal negation, including Nothing.
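The definability claims in this paragraph can be checked by brute force over truth values. The sketch below is an illustration, not a reconstruction of Ladd-Franklin’s, Sheffer’s, or Church’s systems: it assumes the standard modern truth tables for NAND (‘not both’) and NOR (‘neither … nor …’), and the function names and the constant F are hypothetical labels introduced here.

```python
from itertools import product


def nand(a, b):
    """Sheffer stroke on the modern reading: true unless both inputs are true."""
    return not (a and b)


def nor(a, b):
    """'Neither ... nor ...': true only when both inputs are false."""
    return not (a or b)


def implies(a, b):
    """Material conditional."""
    return (not a) or b


F = False  # stands in for Church's always-false propositional constant f


def definitions_agree():
    """Compare each negation-free definition with the connective it defines."""
    for a, b in product((True, False), repeat=2):
        checks = [
            nand(a, a) == (not a),                      # negation from NAND
            nand(nand(a, b), nand(a, b)) == (a and b),  # conjunction from NAND
            nand(nand(a, a), nand(b, b)) == (a or b),   # disjunction from NAND
            nor(a, a) == (not a),                       # negation from NOR
            nor(nor(a, a), nor(b, b)) == (a and b),     # conjunction from NOR
            nor(nor(a, b), nor(a, b)) == (a or b),      # disjunction from NOR
            implies(a, F) == (not a),                   # Church: ~A =df A -> f
        ]
        if not all(checks):
            return False
    return True


print(definitions_agree())
```

The check succeeds on every assignment. Note, however, that the sketch itself uses Python’s ‘not’ to state the truth tables, so it shows only that the negation symbol is dispensable in the object language, not that the concept has disappeared from the interpretation.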

An issue with using this general method of escaping negative reality regards what is known as ‘true negative existentials’ (for example, ‘Pegasus does not exist’). Using Sheffer’s NAND, this is ‘Pegasus exists NAND Pegasus exists’, which is read ‘either it is not the case that Pegasus exists or it is not the case that Pegasus exists’, and which we would want to be true. Truth-functionally, the NAND sentence is true just in case ‘Pegasus exists’ is false. But in classical logic every name must denote something in the domain of quantification, so ‘Pegasus exists’ comes out true as soon as ‘Pegasus’ is admitted as a name, and the NAND sentence will not be true. Rewriting the negation has relocated the problem rather than removed it. As we shall see, this is a persistent problem which has motivated many alternatives to the classical logic setup.

Another issue concerns whether the concept of negation has really been translated away in these cases, or whether negation has just become embedded within the formal language elsewhere under the guise of some sort of falsehood, ever present in the interpretation. This questioning of the priority of the concept of negation was put forward by Martin Heidegger, when he asks: “Is there Nothing only because there is ‘not’, i.e. negation? Or is it the other way round? Is there negation and ‘not’ only because there is Nothing?” (1929 p12) Heidegger’s answer is that “‘Nothing’ is prior to ‘not’ and negation” (ibid.), and so whilst ‘not’ and negation may be conceptually eliminable because they are not primitive, ‘Nothing’ cannot be so. Try as we might to rid ourselves of Nothing, we will fail, even if we succeed in ridding our formal language of ‘not’ and negation. We shall now turn to more of these eliminative methods.

b. Eliminating True Negative Existentials

The riddle, or paradox, of non-being describes the problem of true negative existentials, where propositions like ‘Pegasus does not exist’ are true but seem to bring with them some commitment to an entity ‘Pegasus’. As we learn from Plato’s Parmenides, “Non-being is… being something that is not, – if it’s going not to be” (1996 p81). It is thus self-defeating to say that something, like Pegasus, does not exist, and so it is impossible to speak of what there is not (but even this very argument negates itself). What do we do in such a predicament?

In the seminal paper ‘On What There Is’ (1948), Quine described this riddle of non-being as ‘Plato’s Beard’—overgrown, full of non-entities beyond necessity, to be shaved off with Ockham’s Razor. The problem arises because we bring a thing into existence in order to deny its existence. It is as if we are pointing towards something, and accusing what we are pointing at of not being there to be pointed at. This is reflected in the classical logic that Quine endorsed, where both ‘there is’ and ‘there exists’ are expressed by means of the ‘existential quantifier’ (∃), which is, consequently, interpreted as having ontological import. As a result, such formal systems render the statement ‘There is something that does not exist’ false, nonsensical, inexpressible, or contradictory. How can we get around this issue, in order to rescue the truth of negative existentials like ‘Pegasus does not exist’ without formalising it as ‘Pegasus—an existent thing—does not exist’?

This issue closely resembles the paradox of understanding Nothing—in referring to nothingness as if it were something. As Thales argues, thinking about nothing makes it something, so there can only truly be nothing if there is no one to contemplate it (see Frank Close 2009 p5). The very act of contemplation, or the very act of referring, brings something into existence, and turns no-thing into some-thing, which is self-defeating for the purposes of acknowledging an absence or denying existence. In his entry on ‘Nothingness’ in The Oxford Companion to the Mind, Oliver Sacks summarises the difficulty in the following way: “How can one describe nothingness, not-being, nonentity, when there is, literally, nothing to describe?” (1987 p564)

c. Eliminating Referring Terms

Bertrand Russell (1905) provides a way to ‘describe nothingness’ by removing the referent from definite descriptions. Russell analyses true negative existentials such as ‘The present King of France does not exist’ as ‘It is not the case that there is exactly one present King of France and all present Kings of France exist’. By transforming definite descriptions into quantificational terms, we do not end up referring to an entity in order to deny its existence; rather, the lack of an entity that meets the description ensures the truth of the negative existential. Quine (1948) takes this method a step further by rendering all names as disguised descriptions, and thereby analyses ‘Pegasus does not exist’ as more accurately reading ‘The thing that pegasizes does not exist’. Such paraphrasing away of referring devices removes the problem of pointing to an entity when asserting its nonexistence, thereby eliminating the problem of true negative existentials.

However, such methods are not without criticism, with some claiming their resolutions are worse than the problems they were initially trying to resolve. As Karel Lambert argues, they come with their own problems and place “undue weight both on Russell’s controversial theory of descriptions as the correct analysis of definite descriptions and on the validity of Quine’s elimination of grammatically proper names” (1967 p137). Lambert proposes that, instead of ridding language of singular terms via these questionable means, one could rid singular terms of their ontological import. He creates a system of ‘free logic’ whereby singular terms like names need not refer in order to be meaningful, and propositions containing such empty terms can indeed be true. Therefore, ‘Pegasus does not exist’ may be meaningful and true even whilst ‘Pegasus’ does not refer, without contradiction or fancy footwork via paraphrasing into definite descriptions and quantificational statements.

Lambert (1963) also insists that such a move to free logic is required in order to prevent getting something from nothing in classical logic, when we derive an existential claim from a corresponding universal claim where the predicate in use is not true of anything in the domain. This happens when we infer according to the rule of ‘Universal Instantiation’ whereby what is true of all things is true of some (or particular) things, for example:

∀x(Fx → Gx)

∃x(Fx & Gx)

If no thing in the domain is F, then theoretically hypothesizing that all Fs are Gs leads to inferring that some Fs are Gs, thereby deriving an x that is F and G from the domain where there was no thing in the domain that was F to start with. Rather than the ad hoc limitation of the validity of such inferences to domains that include (at least) things that are F (or are more generally simply not empty), Lambert instead proposes his system of free logic where there need not be a thing in the domain for statements to be true.
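
The worry about existential import can be made concrete on a small finite model. The following is a minimal sketch, not Lambert's own formalism: the domain and the predicates `is_F` and `is_G` are hypothetical, chosen so that nothing in the domain is F.

```python
def universal_premise(domain, F, G):
    """∀x(Fx → Gx): every F in the domain is G (vacuously true if nothing is F)."""
    return all((not F(x)) or G(x) for x in domain)

def existential_conclusion(domain, F, G):
    """∃x(Fx & Gx): at least one thing in the domain is both F and G."""
    return any(F(x) and G(x) for x in domain)

# Hypothetical model: two objects, neither of which is F.
DOMAIN = ["a", "b"]
is_F = lambda x: False
is_G = lambda x: x == "a"

print(universal_premise(DOMAIN, is_F, is_G))       # premise holds vacuously
print(existential_conclusion(DOMAIN, is_F, is_G))  # but nothing is both F and G

# The same pattern appears on an empty domain.
print(universal_premise([], is_F, is_G), existential_conclusion([], is_F, is_G))
```

On this model the universal claim is true and the existential claim is false, so any rule that licenses the step from the first to the second would be producing an F that is G where the domain supplies none.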

But what about Nothing? Is ‘Nothing’ a referring term? For Rudolf Carnap, asking such a question is “based on the mistake of employing the word ‘nothing’ as a noun, because in ordinary language it is customary to use it in this form in order to construct negative existential statements… [E]ven if it were admissible to use ‘nothing’ as a name or description of an entity, still the existence of this entity would be denied by its very definition” (1959 p70). Many have argued against the first part of Carnap’s argument, to show that there are occurrences of ‘Nothing’ as a noun which cannot be understood in quantificational terms or as the null object without at least some loss of meaning (see, for example, Casati and Fujikawa 2019). Nevertheless, many have agreed with the second part of Carnap’s argument that even as a noun ‘Nothing’ would fail to refer to an existent thing (see, for example, Oliver and Smiley 2013). But if Nothing does not refer to an existent thing, what then is this encyclopaedia article about?

As Maria Reicher (2022) states, “One of the difficulties of this solution, however, is to give an account of what makes such sentences true, i.e., of what their truthmakers are (given the principle that, for every true sentence, there is something in the world that makes it true, i.e., something that is the sentence’s truthmaker).” The truthmaker of my opening sentence ‘This article is about nothing’ might then be that Nothing is what this article is about, even when Nothing is the name for the nounified no-thing. The problematic situation we seem to find ourselves in is this: Without an entity that the statement is about, the statement lacks a truthmaker; but with an entity that the statement is about, the statement becomes self-refuting in denying that very entity’s existence. But there is another option. ‘Nothing’ may not refer to an existent thing, yet this need not entail the lack of a referent altogether, because instead perhaps ‘Nothing’ refers to a non-existent thing, as we shall now explore.

d. Eliminating Existentially Loaded Quantification

Meinong’s ‘Theory of Objects’ (1904) explains how we can speak meaningfully and truthfully about entities that do not exist. Meinongians believe that we can refer to non-existent things, and talk of them truthfully, due to quantifying over them and having them as members in our domains of quantification. When we speak of non-existent things, then, our talk refers to entities in the domain that are non-existent things. So it is not that our language can be true without referring at all (as in free logic), but rather that our language can be true without referring to an existent thing (where instead what is referred to is a non-existent thing, which acts as a truthmaker). This approach grants that flying horses do not exist, but this does not imply that there are no flying horses. According to the Meinongian, there are flying horses, and they (presumably) belong to the class of non-existent things, where Pegasus is one of them. This class of non-existent things might also include the present King of France, Santa Claus, the largest prime number, the square circle, and every/any-thing you could possibly imagine if taken to not exist—maybe even Nothing.

So, for the Meinongian, naïvely put, there are existents and non-existents. Both are types of ‘thing’, and the over-arching name for what these things have is ‘being’. All existent things have being, but not all things that have being exist. And perhaps on such an account, Nothing could have ‘being’ regardless of its non/existence. Since Meinongians quantify over both existent and non-existent things, their quantification over domains containing both such things must be ontologically neutral (namely, by not having existential import), and they can differentiate between the two types of things by employing a predicate for existence which existent things instantiate and non-existent things do not. The classical universal and existential quantifiers (∀ and ∃) can then be defined using the neutral universal and particular quantifiers (Λ and Σ) together with the existence predicate (E!) as such:

∀x =df Λx(E!x)

∃x =df Σx(E!x)

‘All existent things are F’ can be written as such:

∀x(Fx) =df Λx(E!x → Fx)

And ‘Some existent things are F’ can be written as such:

∃x(Fx) =df Σx(E!x & Fx)

Using these neutral quantifiers, we can then say, without contradiction, that some things do not exist, as such:

Σx(~E!x)
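
These definitions can be illustrated with a toy domain that mixes existent and non-existent items. The following is a sketch under stated assumptions: the `Item` class, the sample domain, and the function names are all hypothetical, and the `exists` field simply plays the role of the predicate E!.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    name: str
    exists: bool  # stands in for the existence predicate E!

# Hypothetical Meinongian domain containing existents and non-existents.
DOMAIN = [
    Item("Venus", True),
    Item("Pegasus", False),
    Item("the square circle", False),
]

def neutral_all(pred, domain=DOMAIN):
    """Λx: ranges over everything in the domain, existent or not."""
    return all(pred(x) for x in domain)

def neutral_some(pred, domain=DOMAIN):
    """Σx: ranges over everything in the domain, existent or not."""
    return any(pred(x) for x in domain)

def loaded_all(pred, domain=DOMAIN):
    """∀x(Fx) =df Λx(E!x → Fx)."""
    return neutral_all(lambda x: (not x.exists) or pred(x), domain)

def loaded_some(pred, domain=DOMAIN):
    """∃x(Fx) =df Σx(E!x & Fx)."""
    return neutral_some(lambda x: x.exists and pred(x), domain)

# Σx(~E!x): some things do not exist. True with the neutral quantifier.
some_things_do_not_exist = neutral_some(lambda x: not x.exists)

# The loaded counterpart ∃x(~E!x) can never be true, by its definition.
loaded_counterpart = loaded_some(lambda x: not x.exists)

print(some_things_do_not_exist, loaded_counterpart)
```

The loaded quantifiers are implemented as the neutral ones restricted by `exists`, which matches the point that the two differ only in range.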

Despite these definitions, it would be erroneous to describe Meinongianism as “the way of the two quantifiers” (Peter van Inwagen 2003 p138). This is because the ontologically loaded quantifier ∃ can be considered as being restricted to existents, and so is different to Σ only by a matter of degree with respect to what is in the domain, that is, its range. Such a restriction of the domain can be understood as part and parcel of restricting what it is to count as a ‘thing’, where, for Quine, every-(and only)-thing(s) exists.

One need not be a Meinongian to treat the quantifiers as ontologically neutral, however. For example, Czeslaw Lejewski argues that the existentially non-committal ‘particular quantifier’ is “a nearer approximation to ordinary usage” and claims to “not see a contradiction in saying that something does not exist” (1954 p114). Another way to free the quantifiers of their ontological import is to demarcate ontological commitment from quantificational commitment, as in the work of Jody Azzouni (2004). Even the very basic idea of quantificational commitment leading to a commitment to an object in the domain of quantification can be challenged, by taking the quantifiers to be substitutional rather than objectual. In a substitutional interpretation, a quantificational claim is true not because there is an object in the domain that it is true of, but because there is a term in the language that it is true of (for an early pioneer of substitutional quantification, see Ruth Barcan-Marcus 1962).
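
The contrast between objectual and substitutional readings can be sketched with a toy language. Everything here is an illustrative assumption: the table of names, the use of `None` to mark an empty name, and the stipulation that the sentence ‘n does not exist’ is true exactly when the name n has no referent.

```python
# Hypothetical toy language: each name maps to its referent, or None if empty.
REFERENTS = {"Venus": "venus", "Pegasus": None}

# The domain of objects contains only what the names actually pick out.
DOMAIN = {r for r in REFERENTS.values() if r is not None}

def does_not_exist_sentence(name: str) -> bool:
    """Truth value stipulated for the sentence '<name> does not exist'."""
    return REFERENTS[name] is None

def substitutional_some(sentence_schema, names) -> bool:
    """True if some substitution instance of the schema is a true sentence."""
    return any(sentence_schema(n) for n in names)

def objectual_some(condition, domain) -> bool:
    """True if some object in the domain satisfies the condition."""
    return any(condition(obj) for obj in domain)

# 'Something does not exist', read substitutionally: the name 'Pegasus' witnesses it.
print(substitutional_some(does_not_exist_sentence, REFERENTS))

# Read objectually: no object in the domain is missing from the domain.
print(objectual_some(lambda obj: obj not in DOMAIN, DOMAIN))
```

On the substitutional reading the claim is witnessed by a term of the language rather than by an object, so its truth carries no commitment to a non-existent member of the domain.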

In contrast to these alternative systems, for Quine (1948), “to be is to be the value of a bound variable”, which simply means to be quantified over by a quantifier, which further simplified means to be in the domain of quantification. An ontology, then, can be read straight from the domain, which contains (only) the existent things, which happens to be all the ‘things’ that there are. As we have seen, this is problematic with respect to understanding nonexistence. But that is not all. Ladd-Franklin (1912 p653), for example, argues that domains are just ‘fields of thought’, and thus the domain of discourse may vary, and it cannot simply be assumed to contain all of (and only) the things that exist in our reality. Even when the field of thought is physics, or whatever our best science may be, the domain of quantification still leaves us none the wiser with respect to what there is in reality. As Mary Hesse argues, “it is precisely what this domain of values is that is often a matter of dispute within physics” (1962 p243). Indeed, she continues, the very act of axiomatizing a theory in order to answer the question ‘what are the values of its variables?’ implies the adoption of a certain interpretation, which in turn is equivalent to the decisions involved in answering the question ‘what are entities?’ Therefore, one cannot informatively answer ‘what is there?’ with ‘the values of the bound variables’. Extrapolating from the domain is thus no guide to reality: it can give us some-thing from no-thing, regardless of whether every-thing includes more than every (existent) thing. And we cannot infer the existence of Nothing from ‘Nothing’.

4. Beyond the Binary—Both Presence and Absence

As we shall now see, the supposed choice between the binary options of understanding ‘nothing’ as Nothing (a noun, presence of absence) or no-thing (a quantifier, absence of presence) can itself be challenged. To get to that point, firstly, we introduce the dialectical process of Becoming which Nothing participates in, and then we introduce dialetheic understandings of the contradictory nature of Nothing.

a. Dialectical Becoming

In G. W. F. Hegel’s dialectics, a particular pattern is followed when it comes to conceptual analysis. To start, a positive concept is introduced as the ‘thesis’. Then, that positive concept is negated to create the ‘antithesis’ which opposes the thesis. The magic happens when the positive concept and the negative concept are unified to create a third concept, the ‘synthesis’ of the thesis and antithesis. When Hegel applied this dialectic of thesis-antithesis-synthesis to the topic we are considering in this article, the resulting pattern is Being-Nothing-Becoming. To start, he took Being as the positive thesis, which he stated is ‘meant’ to be the concept of presence. Negating this thesis of Being, we get what he stated is ‘meant’ to be the concept of absence, namely, Nothing, as the antithesis.

It is important to note that for Hegel the difference between Being and Nothing is only “something merely meant” (1991 remark to §87) in that we do mean to be highlighting different things when we use the term ‘Nothing’ rather than ‘Being’ or vice versa, but in content they are actually the same. What is the content of Being and Nothing, then, that would equate them in this extensional manner? Well, as purely abstract concepts, Being and Nothing are said to have no further determination, in that Being asserts bare presence, and Nothing asserts bare absence. Given that both are bare, and thus undetermined, they have the same (lack of) properties or content. (Compare the situation with the morning star and evening star—these terms were employed to mean different things, but actually they both refer to Venus.)

There is a presence to Nothing in its asserting absence, and there is an absence to Being in its empty presence. As Julie Maybee (2020) has described, “Being’s lack of determination thus leads it to sublate itself and pass into the concept of Nothing”, and this movement goes both ways. In speculating the bidirectional relationship between Being and Nothing, we enter the dialectic moment of synthesis that unifies and combines them into a state of Becoming. To Become is to go from Being to Nothing or from Nothing to Being, as we do when we consider their equally undefined content. But despite their extensional similarity (in what content they pick out), intensionally (their intended definitional meaning) Being and Nothing are different. Any contradiction that may arise from their synthesis can thus be avoided by reference to this difference. But what if such contradictions provide a more accurate understanding of nothingness, to better reflect its paradoxical nature? This is the idea we will now take up.

b. Dialetheic Nothing

Heidegger pointed out that in speaking of Nothing we make it into something and thereby contradict ourselves. Much like in that dialectical moment of synthesis, we posit Nothing as a being—as a thing—even though by our quantificational understanding that is precisely what it is not (see Krell 1977 p98f). Where can we go from here? Does this mean it is impossible to speak of Nothing without instantaneous self-defeat, by turning Nothing into not-no-thing, namely, some-thing? To this, Graham Priest adds, “One cannot, therefore, say anything of nothing. To say anything, whether that it is something or other, or just that it is, or even to refer to it at all, is to treat it as an object, which it is not” (2002 p241, emphasis in original).

Of course, Priest did say something about Nothing, as did Heidegger, and as does this article. It therefore is not impossible to talk of it. Perhaps the lesson to learn is that any talk of it will be false because the very act of doing so turns it into what it is not. This would be a kind of error-theory of Nothing, that whatever theorising is done will be in error, by virtue of postulating an object to be theorised where there is no object. But this will not do once we consider statements that motivate such a theory, like ‘Nothing is not an object’, which the error-theorist would want to be true in order for all (other) statements about Nothing to be false. Can we not even say that we cannot say anything about Nothing, then? Nor say that?

These problems reflect issues of ineffability. To be ineffable is to not be able to be effed, where to be effed is to be described in some way. Start with the idea that Nothing is ineffable, because in trying to describe it (a no-thing) we end up turning it into some-thing (a thing) that it is not. But, to say that Nothing is ineffable is a self-refuting statement, since ‘Nothing is ineffable’ is to say something about Nothing, namely, that it is ineffable. Furthermore, if it is true that Nothing is ineffable, then it is not true that no-thing is ineffable, because Nothing is. So, to repeat, can the (in)effability of nothingness be effed? And what about effing that?

Ludwig Wittgenstein’s Tractatus is also an example of trying to eff the ineffable, via a self-conscious process of ‘showing’ rather than ‘saying’ what cannot be said, or else rendering it all meaningless. Wittgenstein’s work explores (among other things) the limits of our language in relation to the limits of our world, and the messy paths that philosophical reflection on our language can take us down. Applying this to Nothing, it might be that the contradictions that arise from attempts to express nothingness reflect contradictions in its very nature. And maybe when we get caught up in linguistic knots trying to understand Nothing it is because Nothing is knotty (which pleasingly rhymes with not-y). Perhaps then we need not try to find a way out of contradictions that stem from analysing nothingness if those contradictions are true. So, is it true that Nothing is both an object and not an object? Is it true that Nothing is both a thing and no-thing? Whilst this would not be Wittgenstein’s remedy, according to Priest, ‘yes’, we ought to bite this bullet and accept the paradoxical nature of Nothing at face value. To treat such a contradiction as true, one must endorse a dialetheic metaphysics, with a paraconsistent logic to match, where Nothing is a dialetheia.
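
What a paraconsistent logic buys the dialetheist can be shown with a small truth-table check. The text does not say which paraconsistent logic is intended; the following sketch assumes Priest's three-valued Logic of Paradox (LP), in which a sentence may be true only, false only, or both, and the encoding of the values as integers is purely illustrative.

```python
from itertools import product

# Truth values ordered F < B < T, where B means 'both true and false'.
F, B, T = 0, 1, 2
DESIGNATED = {B, T}  # values that count as (at least) true

def neg(a): return 2 - a          # swaps T and F, leaves B fixed
def conj(a, b): return min(a, b)
def disj(a, b): return max(a, b)

def valid(premises, conclusion, n_atoms: int) -> bool:
    """Valid iff every valuation designating all premises designates the conclusion."""
    for vals in product((F, B, T), repeat=n_atoms):
        if all(p(*vals) in DESIGNATED for p in premises):
            if conclusion(*vals) not in DESIGNATED:
                return False
    return True

# A contradiction can hold: if A is B, then A & ~A is also B, which is designated.
contradiction_holds = conj(B, neg(B)) in DESIGNATED

# Explosion (A, ~A therefore C) fails: take A = B and C = F.
explosion_valid = valid([lambda a, c: a, lambda a, c: neg(a)], lambda a, c: c, 2)

# Excluded middle (A or ~A) remains valid.
excluded_middle_valid = valid([], lambda a: disj(a, neg(a)), 1)

print(contradiction_holds, explosion_valid, excluded_middle_valid)
```

Because explosion fails, accepting that Nothing is both an object and not an object does not commit one to accepting every claim whatsoever, which is what allows the contradiction to be taken at face value.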

5. Beyond the Binary—Neither Presence nor Absence

a. The Nothing Noths

As we have seen, when contemplating nothingness, we can quickly go from no-thing to Nothing, which is no longer a ‘nothing’ due to being some-thing. When we turn towards nothingness, it turns away from us by turning itself into something else. This makes nothingness rather active, or rather re-active, in a self-destructive sort of way. As Heidegger put it, “the nothing itself noths or nihilates” (1929 p90).

Carnap was vehemently against such metaphysical musings, claiming that they were meaningless (1959 p65-67). Indeed, Heidegger and the Vienna Circle (of which Carnap was a leading and central figure) were in opposition in many ways, not least with respect to Heidegger’s antisemitism and affiliation with the Nazis in contrast with the Vienna Circle’s large proportion of Jewish and socialist members (see David Edmonds 2020 for the relationship between the political and philosophical disputes).

Somewhat mediating on the logical side of things, Oliver and Smiley (2013) consider ‘the nothing noths’ as “merely a case of verbing a noun” and argue: “If ‘critiques’ is what a critique does, and ‘references’ is what a reference does, ‘nichtet’ is what das Nichts does. The upshot of all this is that ‘das Nichts nichtet’ [‘the nothing noths’] translates as ‘zilch is zilch’ or, in symbols, ‘O=O’. Far from being a metaphysical pseudo-statement, it is a straightforward logical truth” (p611). If verbing a noun is legitimate, what about nouning a quantifier? If ‘Criticisms’ is the name for all criticisms, and ‘References’ is the name for all references, then is not ‘Everything’ the name for every-thing, and likewise ‘Nothing’ the name for no-thing? Such an understanding would make the path to such entities quite trivial, a triviality that ‘straightforward logical truths’ share. But if we have learnt anything about Nothing so far, it is surely that it is a long way (at least 8,000 words away) from being trivial.

Heidegger avoids charges of triviality by clarifying that Nothing is “‘higher’ than or beyond all ‘positivity’ and ‘negativity’” (see Krummel 2017 p256 which cites Beiträge). This resonates with Eastern understandings of true nothingness as irreducible to and outside of binary oppositions, which is prominent in the views of Nishida Kitarō from the Kyoto School. What are they good for? ‘Absolute nothing’ (and more).

b. Absolute Nothing

When Edwin Starr sang that war was good for absolutely nothing (1970), the message being conveyed was that there was no-thing for which war was good. This was emphasised and made salient by the ‘absolutely’. When we are analysing nothingness, we might likewise want to emphasise that what we are analysing is absolutely nothing. But what would that emphasis do? In what way does our conception of nothingness change when we make its absoluteness salient?

For the Kyoto School, this ‘absolute’ means cutting off oppositional understandings, in a bid to go beyond relativity. The way we comprehend reality is very much bound up in such oppositions: life/death, yes/no, true/false, black/white, man/woman, good/bad, acid/alkaline, high/low, left/right, on/off, 0/1, even/odd, this/that, us/them, in/out, hot/cold… and challenging such binaries is an important part of engaging in critical analysis to better grasp the complexities of reality. But these binaries may very well include opposites we have been relying upon in our understanding of nothingness, namely, presence/absence, thing/no-thing, no-thing/Nothing, binary/nonbinary, relative/absolute, and so forth. It seems whatever concept or term or object we hold (like Hegel’s ‘thesis’), we can negate it (like Hegel’s ‘antithesis’), making a set of opposites. What then can be beyond such oppositional dialectic? Nothing. (Or is it no-thing?)

Zen Buddhism explains that true nothingness is absolute, not relative—beyond the realm of things. Our earlier attempts at elucidating Nothing and no-thing were very much conceptually related to things, and so to get a truer, more absolute nothingness, we must go beyond no-thing/thing and no-thing/Nothing. Only once detached from all contrasts do we have absolute nothingness.

Nishida says absolute negation (zettai hitei 絶対否定) is beyond the affirmative/negative itself, and so is a rejection of what it colloquially represents: true negation is thereby a negation of negation. This is not the double-negation of classical logic (whereby something being not not true is for that something to be true) and it is not the mealy-mouthed multiple-negation of conversation (whereby not disliking someone does not entail liking them but rather just finding them incredibly annoying, for example). Instead, this negation of negation leaves the realm of relativity behind: it goes beyond (or negates) that which can be negated to enter the absolute realm. No-thing can be absolute without being absolved of any defining opposition that would render it merely relative. And so Nothing can only be absolute when it goes beyond the binaries that attempt to define it in the world of being. This does not place the absolute nothingness in the realm of nonbeing; rather, absolute nothingness transcends the being/nonbeing distinction.

Without anything to define absolute nothingness in relation to, it is quite literally undefined. As such, Nothing cannot be made into a subject or object that could be judged, and so is completely undetermined. It would not make sense, then, to interpret ‘absolute nothing’ as a thing, because that would bring it into the purview of predication. Instead, Nishida (2000 467, 482) speaks of it as a place: “the place of absolute nothing” (zettai mu no basho) or “the place of true nothing” (shin no mu no basho). Within this place is every determination of all beings, and as such is infinitely determined. But this is in contradiction with its status as being completely undetermined, beyond the realm of relative definition. Is absolute nothingness really beyond the realm of relative definition if it is defined in contrast to relativity, namely, as absolute? It seems that we have stumbled upon contradictions and binaries again. (Ask yourself: Can we avoid them? Ought we avoid them?) Like the dialetheic understanding of Nothing, this absolute nothingness is effed as ineffable in terms of what it is and is not. And like the nothing-that-noths, this absolute nothingness is active, but rather than nihilating anything that comes in its path, it creates every-thing.

6. Conclusion

This article has analysed nothingness as a noun, a quantifier, a verb, and a place. It has postulated nothingness as a presence, an absence, both, and neither. Through an exploration of metaphysical and logical theories that crossed the analytic/continental and East/West divides, it started with nothing, got something, and ended up with everything. What other topic could be quite as encompassing? Without further ado, and after much ado about nothing, let us conclude the same way that Priest does in his article ‘Everything and Nothing’ (which hopefully you, the reader, will now be able to disambiguate):

“Everything is interesting; but perhaps nothing is more interesting than nothing” (Gabriel and Priest 2022 p38).

7. References and Further Reading

  • Jody Azzouni (2004) Deflating Existential Consequence: A Case for Nominalism, Oxford University Press.
  • Ruth Barcan-Marcus (1962) ‘Interpreting Quantification’, Inquiry, V: 252–259.
  • Filippo Casati and Naoya Fujikawa (2019) ‘Nothingness, Meinongianism and Inconsistent Mereology’, Synthese, 196.9: 3739–3772.
  • Rudolf Carnap (1959) ‘The Elimination Of Metaphysics Through Logical Analysis of Language’, A. Pap (trans.) in A. J. Ayer (ed.) Logical Positivism, New York: Free Press, 60–81.
  • Lewis Carroll (1871) Through the Looking-Glass and What Alice Found There, in M. Gardner (ed.) The Annotated Alice: The Definitive Edition, Harmondsworth: Penguin, 2000.
  • Alonzo Church (1956) Introduction to Mathematical Logic, Princeton University Press.
  • Frank Close (2009) Nothing: A very short introduction, Oxford University Press.
  • David Edmonds (2020) The Murder of Professor Schlick: The Rise and Fall of the Vienna Circle, Princeton University Press.
  • Suki Finn (2018) ‘The Hole Truth’, Aeon.
  • Suki Finn (2021) ‘Nothing’, Philosophy Bites. https://podcasts.google.com/feed/aHR0cHM6Ly9waGlsb3NvcGh5Yml0ZXMubGlic3luLmNvbS9yc3M.
  • Suki Finn (2023) ‘Nothing To Speak Of’, Think, 22.63: 39–45.
  • Markus Gabriel and Graham Priest (2022) Everything and Nothing, Polity Press.
  • G. W. F. Hegel (1991) The Encyclopedia Logic: Part 1 of the Encyclopaedia of Philosophical Sciences, T. F. Geraets, W. A. Suchting, and H. S. Harris (trans.), Indianapolis: Hackett.
  • Martin Heidegger (1929) ‘What is Metaphysics?’, in (1949) Existence and Being, Henry Regnery Co.
  • Mary Hesse (1962) ‘On What There Is in Physics’, British Journal for the Philosophy of Science, 13.51: 234–244.
  • Peter van Inwagen (2003) ‘Existence, Ontological Commitment, and Fictional Entities’, in Michael Loux and Dean Zimmerman (eds.) The Oxford Handbook of Metaphysics, Oxford University Press, 131–157.
  • David F. Krell (ed.) (1977) Martin Heidegger: Basic Writings, New York: Harper & Row.
  • John W. M. Krummel (2017) ‘On (the) nothing: Heidegger and Nishida’, Continental Philosophy Review, 51.2: 239–268.
  • Christine Ladd-Franklin (1883) ‘The Algebra of Logic’, in Charles S. Peirce (ed.) Studies in Logic, Boston: Little, Brown & Co.
  • Christine Ladd-Franklin (1912) ‘Implication and Existence in Logic’, The Philosophical Review, 21.6: 641–665.
  • Karel Lambert (1963) ‘Existential Import Revisited’, Notre Dame Journal of Formal Logic, 4.4: 288–292.
  • Karel Lambert (1967) ‘Free Logic and the Concept of Existence’, Notre Dame Journal of Formal Logic 8.1-2: 133–144.
  • James Legge (1891) The Writings of Chuang Tzu, Oxford University Press.
  • Czeslaw Lejewski (1954) ‘Logic and Existence’, British Journal for the Philosophy of Science, 5: 104–19.
  • Julie E. Maybee (2020) ‘Hegel’s Dialectics’, The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.), <https://plato.stanford.edu/archives/win2020/entries/hegel-dialectics/>.
  • Alexius Meinong (1904) ‘Über Gegenstandstheorie’, in Alexius Meinong (ed.) Untersuchungen zur Gegenstandstheorie und Psychologie, Leipzig: J. A. Barth.
  • Kitarō Nishida (2000) Nishida Kitarō zenshū [Collected works of Nishida Kitarō], Tokyo: Iwanami.
  • Alex Oliver and Timothy Smiley (2013) ‘Zilch’, Analysis, 73.4: 601–613.
  • Plato (1996) Parmenides, A. K. Whitaker (trans.) Newburyport, MA: Focus Philosophical Library.
  • Graham Priest (2002) Beyond the Limits of Thought, Oxford University Press.
  • W.V.O. Quine (1948) ‘On What There Is’, The Review of Metaphysics, 2.5: 21–38.
  • Maria Reicher (2022) ‘Non-existent Objects’, The Stanford Encyclopedia of Philosophy, Edward N. Zalta and Uri Nodelman (eds.), URL = <https://plato.stanford.edu/archives/win2022/entries/non-existent-objects/>.
  • Bertrand Russell (1905) ‘On Denoting’, Mind, 14: 479–493.
  • Bertrand Russell (1985) The Philosophy of Logical Atomism, La Salle, IL: Open Court.
  • Oliver Sacks (1987) ‘Nothingness’, in Richard L. Gregory (ed.) The Oxford Companion to the Mind, Oxford University Press.
  • Jean-Paul Sartre (1956) Being and Nothingness: An Essay on Phenomenological Ontology, Hazel E. Barnes (trans.), New York: Philosophical Library.
  • Henry Sheffer (1913) ‘A Set of Five Independent Postulates for Boolean Algebras, with Applications to Logical Constants’, Transactions of the American Mathematical Society, 14: 481–488.
  • Roy Sorensen (2022) Nothing: A Philosophical History, Oxford: Oxford University Press.
  • Edwin Starr (1970) War, Motown: Gordy Records.
  • Alfred Tarski (1944) ‘The Semantic Conception of Truth’, Philosophy and Phenomenological Research, 4.3: 341–376.
  • Ludwig Wittgenstein (1961) Tractatus Logico-Philosophicus, D. F. Pears and B. F. McGuinness (trans.), New York: Humanities Press.
  • Dorothy Wrinch (1918) ‘Recent Work In Mathematical Logic’, The Monist, 28.4: 620–623.

 

Author Information

Suki Finn
Email: suki.finn@rhul.ac.uk
Royal Holloway University of London
United Kingdom

Impossible Worlds

Actual facts abound and actual propositions are true because there is a world, the actual world, that the propositions correctly describe. Possibilities abound as well. The actual world reveals what there is, but it is far from clear that it also reveals what there might be. Philosophers have been aware of this limitation and have introduced the notion of a possible world. Finally, impossibilities abound because possibilities turn out not to exhaust modal space as a whole. Besides the actual facts and the facts about what is possible, there are facts about what is impossible. In order to explain this, philosophers have introduced the notion of an impossible world.

This article is about impossible worlds. First, there is a presentation of the motivations for postulating impossible worlds as a tool for analysing impossible phenomena. This apparatus seems to deliver great advances in modal logic and semantics, but at the same time it gives rise to metaphysical issues concerning the nature of impossible worlds. Discourse about impossible worlds is explained in Sections 2 and 3. Section 4 provides an overview of the theories in discussion in the academic literature, and Section 5 summarises the drawbacks of those theories. Section 6 takes a closer look at the logical structure of impossible worlds, and Section 7 discusses the connection between impossible worlds and hyperintensionality.

Table of Contents

  1. Introduction
  2. The First Argument for Impossible Worlds
  3. Impossible Worlds and Their Applications
  4. The Metaphysics of Impossible Worlds
  5. Troubles with Impossible Worlds
  6. The Logic of Impossible Worlds
  7. Impossible Worlds and Hyperintensionality
  8. Conclusion
  9. References and Further Readings

1. Introduction

Modal notions are those such as ‘possibility’, ‘necessity’, and ‘impossibility’, whose analysis requires a different account than so-called indicative notions. To compare the two, indicative propositions are about this world, the world that obtains; and all the true indicative propositions describe the world completely. Modal propositions are about the world as well, although in a different sense. They are about its modal features or, said otherwise, about alternatives to it. Philosophers call these alternatives possible worlds.

For a start, it is important to consider the distinction between pre-theoretical and theoretical terms. Pre-theoretical terms are terms we handle before we engage in philosophical theorizing. Theoretical terms, on the other hand, are introduced by philosophers via sets of definitions. Such terms are usually defined via terms that we already understand in advance. The debate about possible worlds can be understood along similar lines. The word ‘world’ is a theoretical notion that differs from the word as we use it in everyday life. In the latter, the world is everything we live in and interact with. The philosophical ‘world’ represents the world and is one of many such representations. Its uniqueness rests on the correct representation of it. ‘Actual world’, ‘possible world’, as well as ‘impossible world’ are thus theoretical terms.

An example will be helpful here. Consider the following proposition:

(1)  Canberra is the capital of Australia.

Given the constitutional order of Australia, (1) is true because Canberra is the capital of Australia. In contrast, the proposition:

(2)  Melbourne is the capital of Australia

is false, because Melbourne is not the capital of Australia. So (1) and (2) are factual claims, because they describe the constitutional order in Australia. Consider, however, the following proposition:

(3)  Melbourne could be the capital of Australia.

At first sight, (3) also appears to be about our world in some sense, yet it displays structurally different features than (1) and (2). Why is that? Some philosophers dismiss this question by rejecting its coherence. Others propose a positive solution by means of other worlds. In the following two sections I provide two arguments for doing so.

2. The First Argument for Impossible Worlds

In his Counterfactuals (1973), David Lewis states the following:

I believe, and so do you, that things could have been different in countless ways. But what does this mean? Ordinary language permits the paraphrase: there are many ways things could have been besides the way they actually are. I believe that things could have been different in countless ways; I believe permissible paraphrases of what I believe; taking the paraphrase at its face value, I therefore believe in the existence of entities that might be called ‘ways things could have been.’ I prefer to call them ‘possible worlds’. (Lewis 1973: 84)

Takashi Yagisawa builds on Lewis’s view as follows:

There are other ways of the world than the way the world actually is. Call them ‘possible worlds.’ That, we recall, was Lewis’ argument. There are other ways of the world than the ways the world could be. Call them ‘impossible worlds’. (Yagisawa 1988: 183)

These two quotes reflect a need for an analysis of modality in terms of worlds. While Lewis postulates possible worlds as the best tool for analysing modal propositions, Yagisawa extends the framework by adding impossible worlds. In other words, Lewis accepts:

(P) It is possible that P if and only if there is a possible world, w, such that at w, P.

and:

(I) It is impossible that P if and only if there is no possible world, i, such that at i, P.

as definitions of possibility and impossibility.

An alternative analysis of impossibility extends the space of worlds and, in addition to possible worlds, commits to impossible worlds. As a consequence, proponents of impossible worlds formulate a dilemma in the form of modus tollens and modus ponens respectively:

    1. If we endorse arguments for the existence of possible worlds, then, with all needed changes made, we should endorse the same kind of argument for the existence of impossible worlds.
    2. There are arguments that disqualify impossible worlds from being acceptable entities.

Therefore:

There are no possible worlds. (By modus tollens.)

Or:

1*. If we endorse arguments for the existence of possible worlds, then mutatis mutandis, we should endorse the same kind of argument for the existence of impossible worlds.

2*. There are arguments that establish possible worlds as acceptable entities.

Therefore:

There are impossible worlds. (By modus ponens.)

The first reason for postulating impossible worlds is thus the assumption that if the paraphrase argument justifies belief in worlds as ways things could have been, then the same argument justifies belief in worlds as ways things could not have been. The second reason is the applicability of impossible worlds. I will discuss some applications of impossible worlds in the next section.

3. Impossible Worlds and Their Applications

It is something of a platitude that the introduction of theoretical terms ought to be justified by their theoretical utility. Moreover, theoretical terms should not serve to solve one particular problem only. Instead, their applications should range over various philosophical phenomena and systematically contribute to their explanation.

The theoretical usefulness of possible worlds has been proven in the analysis of de re as well as de dicto modalities (see the article on Frege’s Problem: Referential Opacity, Section 2), as well as in the analysis of counterfactual conditionals, propositional states, intensional entities, or relations between philosophical theories. Given their applicability, possible worlds have turned out to be a useful philosophical approach to longstanding philosophical problems.

To begin with, representing properties and propositions as sets of their instances, possible individuals and possible worlds respectively, has offered many advantages in philosophy. Impossible worlds, in turn, provide a more nuanced explanation of modality in a way that an unadulterated possible worlds framework does not. Like possible worlds, impossible worlds are ‘localisers’, albeit ones where impossible things happen. Consider these two statements:

(4)  2 + 2 = 5

and

(5)  Melbourne both is and is not in Australia.

(4), according to a possible worlds semantic treatment, does not hold in any possible world, because possible worlds are worlds at which only possible things happen. Also, there is no possible world at which Melbourne both is and is not in Australia. Given these two data, and assuming the widely accepted, although disputable, view of propositions as sets of possible worlds, (4) and (5) are ontologically one and the same proposition. It is the empty set. However, (4) and (5) are about different subject matters, namely arithmetic and geography. In order not to confuse these two (impossible) subjects, one sort of way out is presented by impossible worlds: there is an impossible world at which (4) is true and (5) is false, and vice versa.
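The point about (4) and (5) can be illustrated with a minimal sketch that treats a proposition as the set of worlds at which a sentence holds. The world names and the facts assigned to them are hypothetical stipulations made only for this illustration; nothing here is meant as a definitive model of any particular theory.

```python
# Hypothetical toy model: world names and the sentences they verify are
# illustrative assumptions, not part of any published semantics.
possible_worlds = {
    "w_actual": {"Canberra is the capital of Australia"},
    "w_alt": {"Melbourne is the capital of Australia"},
}
impossible_worlds = {
    "i_arith": {"2 + 2 = 5"},
    "i_geo": {"Melbourne both is and is not in Australia"},
}

def proposition(sentence, worlds):
    """Return the set of world names (from the given space) where the sentence holds."""
    return frozenset(name for name, facts in worlds.items() if sentence in facts)

s4 = "2 + 2 = 5"
s5 = "Melbourne both is and is not in Australia"

# With possible worlds only, both sentences express the empty set.
p4 = proposition(s4, possible_worlds)
p5 = proposition(s5, possible_worlds)

# With impossible worlds added, the two sentences come apart.
all_worlds = {**possible_worlds, **impossible_worlds}
q4 = proposition(s4, all_worlds)
q5 = proposition(s5, all_worlds)

print(p4 == p5)  # True: (4) and (5) collapse into one proposition
print(q4 == q5)  # False: (4) and (5) are distinguished
```

On this sketch, the distinction between the arithmetical and the geographical impossibility is recovered simply by enlarging the space of worlds over which propositions are defined.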

The well-known reductio ad absurdum mode of argument is another, although controversial, reason for taking impossible worlds seriously (for a more detailed exposition of this, see the article on Reductio ad Absurdum). The internal structure of such arguments starts with certain assumptions and then, via logically valid steps, leads to a contradiction. The occurrence of such an assumption shows that, although the conclusion is contradictory, the impossible assumption gives rise to a counterfactual string of mutually interconnected and meaningful premises. Some proponents of impossible worlds insist that unless we take such impossible assumptions seriously, reductio ad absurdum arguments would not play such a crucial role in philosophical reasoning. For the opposite view according to which mathematical practice does not depend on using counterfactuals, see Williamson (2007, 2017). For a more substantive discussion of the reductio ad absurdum and impossible worlds, see also Berto & Jago (2019, especially Chapter XII).

Whatever the machinery behind the reductio ad absurdum argument is, there is nonetheless a strong reason to postulate impossible worlds for the analysis of a certain sort of counterfactual conditional. According to the most prevalent theory, a counterfactual is true if and only if there is no possible world w more similar to the actual world than some possible world w′ such that (i) the antecedent and the consequent of the conditional are both true in w′, and (ii) the antecedent is true but the consequent is not true in w. Clearly, such an account falls short in analysing counterpossible conditionals unless we either deny their possible worlds interpretation (Fine 2012), admit that they are trivially true (Lewis 1973, Williamson 2007), treat the putative triviality by other means (Vetter 2016) or simply accept impossible worlds. To demonstrate the problem, here is a pair of famous examples, originally from Nolan (1997):

(6) If Hobbes had (secretly) squared the circle, sick children in the mountains of South America at the time would have cared.

(7) If Hobbes had (secretly) squared the circle, sick children in the mountains of South America at the time would not have cared.

Although intuitions are usually controversial in philosophy, there is something intriguing about (7). Namely, although its antecedent is impossible, we seem to take (7) to be true. For, in fact, no sick children would have cared if the antecedent had been true, since this would have made no difference to sick children whatsoever. By the same reasoning, (6) is intuitively false; for again, no sick children would have cared if the antecedent had been true. Consequently, the occurrence of these distinct truth values requires a distinctive analysis, and an impossible worlds analysis is one candidate.
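The contrast between (6) and (7) can be made concrete with a small sketch of a closest-worlds test. It simplifies the similarity analysis by assuming a finite set of worlds with numeric distances from actuality (so that closest antecedent-worlds always exist), and it treats ‘cared’ and ‘did not care’ as unanalysed atomic facts. All worlds, facts, and distances are hypothetical.

```python
# Hypothetical toy model: worlds, facts and distances are stipulated for
# illustration only; the numeric distance stands in for comparative similarity.
SQUARED = "Hobbes squared the circle"
CARED = "the sick children cared"
NOT_CARED = "the sick children did not care"

possible = [
    {"name": "actual", "distance": 0, "facts": {NOT_CARED}},
    {"name": "w1", "distance": 3, "facts": {CARED}},
]
impossible = [
    {"name": "i1", "distance": 5, "facts": {SQUARED, NOT_CARED}},
    {"name": "i2", "distance": 9, "facts": {SQUARED, CARED}},
]

def counterfactual(antecedent, consequent, worlds):
    """True if no world verifies the antecedent (the vacuous case), or if the
    consequent holds at every closest world that verifies the antecedent."""
    a_worlds = [w for w in worlds if antecedent in w["facts"]]
    if not a_worlds:
        return True
    nearest = min(w["distance"] for w in a_worlds)
    return all(consequent in w["facts"]
               for w in a_worlds if w["distance"] == nearest)

# Possible worlds only: (6) and (7) are both vacuously true.
only_possible = (counterfactual(SQUARED, CARED, possible),
                 counterfactual(SQUARED, NOT_CARED, possible))

# Extended space: (6) is false and (7) is true.
extended = (counterfactual(SQUARED, CARED, possible + impossible),
            counterfactual(SQUARED, NOT_CARED, possible + impossible))

print(only_possible)  # (True, True)
print(extended)       # (False, True)
```

The verdict in the extended model depends entirely on the stipulated ordering, which places the world where nobody cares closer to actuality; the sketch shows only that impossible worlds make room for non-trivial truth values.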

Disagreements in metaphysical disputes display another feature of impossibility. Metaphysicians argue with each other about lots of issues. For instance, they variously disagree about the nature of properties. Suppose that trope theory is the correct theory of properties and so is necessarily true (see the article on Universals). Then this means that the theory of properties as transcendent universals and the theory of properties as immanent universals are both (a) impossible, and (b) distinct. But they are true in the same possible worlds (that is, none), and to distinguish these two views in terms of where they are true requires impossible worlds. Similarly, proponents of modal realism and modal ersatzism disagree about the nature of possible worlds (see the article on Modal Metaphysics). But they both agree that if either of these theories is true, it is true in all possible worlds; necessarily so. By this reasoning, one’s opponent’s claim is necessarily wrong; she defends an impossible hypothesis. For more details on this (and other issues) see Nolan (1997) and Miller (2017).

Although theories of fiction abound, its analyses in terms of possible worlds dominate. According to such analyses, what happens in a work of fiction happens at a set of possible worlds, full stop. However, the problem is that fiction fairly often hosts impossible events.

For instance, ‘Sylvan’s Box’ (Priest 1997) is a short story about an object which is inconsistent because it is both empty and non-empty. A usual treatment of such stories uses the terminology of worlds which realise what is stated in the story. However, Priest claims, any interpretation of the story in terms of sub-sets of internally consistent sets of possible worlds (see Lewis 1978) misrepresents the story.

Of course, these applications of impossible worlds are not exhaustive and, as we will see in Section 5, impossible worlds have limitations. Let us, however, suppose that the dilemma is irresistible, and that impossible worlds are, at least to some extent, as applicable as possible worlds are. If so, one must always consider the cost of such commitment. Since the theoretical application of any entity brings with it an ontological burden, an optimal trade-off between application and ontological commitments must be sought. And impossible worlds are an excellent example of such a trade-off. The next section overviews several metaphysical issues about impossible worlds.

4. The Metaphysics of Impossible Worlds

The introduction of theoretical entities requires a view about their metaphysical nature. The introduction of impossible worlds is not an exception and requires an answer to the question of what impossible worlds are, and, additionally, how impossible worlds differ from possible worlds. We can think of the questions as the identification question and the kind question, respectively.

The identification question concerns the nature of impossible worlds. Like proponents of possible worlds, proponents of impossible worlds disagree about their metaphysical nature and divide into several camps. To start with realism about worlds, these views share a common idea that whatever worlds are, these worlds exist. Probably the most prominent version of modal realism is genuine modal realism. While modal realism is a thesis according to which possible worlds exist, genuine modal realism claims that possible worlds exist and, moreover, possible worlds exist in the very same way as ‘we and our surroundings’; they are as concrete as we, buildings, animals, and cars are. What is more, every individual exists in one possible world only (for more on transworld identity, see the article on David Lewis). The actual world is a world which has temporal and spatial dimensions and, consequently, every possible world fulfils this requirement. According to genuine modal realism, possible worlds are concrete spatiotemporal entities.

Another version of modal realism with impossible worlds is presented by Kris McDaniel (2004). His strategy is to withdraw Lewis’s commitment to individuals existing in one possible world only. Instead, he allows an individual to exist in many worlds and thus to bear the ‘exists at’ relation to more than one world. Such so-called modal realism with overlap is genuine realism, because it accepts concrete possible worlds and their inhabitants.

A modified version of modal realism is presented by Yagisawa (2010). Under the name of modal dimensionalism, Yagisawa postulates so-called metaphysical indices. These indices represent the spatial, temporal, and modal dimensions of the world. According to Yagisawa, the world has spatial, temporal, and additionally modal dimensions, in the same way that I have my own spatial, temporal and modal dimensions. Namely, my temporal dimension includes, among other things, me as a child, me nine minutes ago, and me in the future. My spatial dimensions are the space occupied by my hands, head, as well as the rest of my body. My modal dimension includes my possible stages of being a president, a football player and so forth.

A more moderate version of modal realism is modal ersatzism. Like genuine modal realism, modal ersatzism takes possible worlds to be existent entities (see again the article on Modal Metaphysics), yet denies that they have spatiotemporal dimensions. Naturally, such a brand of realism attracts fans of less exotic ontology because possible worlds are considered as already accepted surrogates for otherwise unwelcome philosophical commitments: complete and consistent sets of propositions or sentences, complete and consistent properties, or complete and consistent states of affairs. Usually, these entities are non-concrete in nature and are parts of the actual world (the view is sometimes called actualism). For an excellent overview of various kinds of ersatzism, see Divers (2002).

Finally, there are views according to which worlds do not in fact exist. Under the name of modal anti-realism, such views reject modal realism for primarily epistemological reasons, although they deny neither the meaningfulness of modal talk nor the accuracy of its worlds semantics. Although modal anti-realism is not so widespread in the literature, several positive proposals have demonstrated its prospects. For instance, Rosen (1990) proposes a strategy of ‘fictionalising’ the realist’s positions by treating them as useful fictions. Although his primary target is genuine modal realism, it is easy to generalise the idea to other versions of modal realism.

The kind question asks whether possible and impossible worlds are of the same metaphysical category or fall under metaphysically distinct categories. To the extent that we identify possible worlds with a certain kind of entity (the identification question) and accept impossible worlds for one reason or another, our response to the kind question predetermines our views about the nature of impossible worlds.

A positive response to the kind question is put forward in Priest (1997). As he puts it, anyone who accepts a particular theory of possible worlds, be it concrete entities, abstract entities, or non-existent entities, has no cogent reason to posit an ontological difference between merely possible and impossible worlds (see Priest 1997: 580–581). The idea is expressed by the so-called parity thesis, which says that theories of the nature of possible worlds should be applied equally to impossible worlds.

Now, particular versions of modal realism together with the parity thesis lead to specific views of impossible worlds. To begin with genuine modal realism, extended genuine modal realism accepts concrete possible and impossible worlds. These worlds are spatiotemporal entities, and whatever is impossible holds in some concrete impossible world. For the idea of paraphrasing Lewis’s original argument from ways, see Naylor (1986) and Yagisawa (1988).

Modal dimensionalism as well as modal realism with overlap find their impossible alternatives relatively easily. In the former, I simply have impossible stages as well. In the latter, modal realism with overlap allows that an individual can have mutually incompatible properties at two different possible worlds. For example, an individual, a, bears the ‘exists at’ relation to a world at which a is round, and bears the ‘exists at’ relation to another world at which a is square, thus representing the situation ‘a is round and square’. Since it is impossible to be both round and square, this is an impossible situation.

A moderate version of modal realism, modal ersatzism, combined with the parity thesis is, so to speak, in an easier position. Given that the ersatzist’s preferred entities, be they sets, sentences, propositions, or something else, are already assumed to exist, it is only one step further to introduce impossible worlds as their incomplete and inconsistent counterparts without incurring any additional ontological commitments.

Proponents of the negative response to the kind question, on the other hand, deny the parity thesis. Impossible worlds, according to them, are a distinct kind of entity. Interestingly, such a metaphysical stance allows for a ‘recombination’ of philosophically competing positions. For instance, hybrid genuine modal realism, indicated in Restall (1997) and Divers (2002) and further developed in Berto (2009), posits concrete possible worlds as the best representation of possible phenomena, but abstract impossible worlds as the ‘safest’ representation of impossible phenomena. In other words, what is possible happens in concrete possible worlds as genuine modal realism conceives them, and what is impossible is represented by more moderate ontological commitments. In particular, possible worlds are concrete and impossible worlds are, according to hybrid genuine modal realism, sets of propositions modelled in accordance with genuine modal realism. Notably, hybrid genuine modal realism is only one of many options for opponents of the parity thesis. As mentioned earlier, the hybrid approach to modality allows us to interpret the possibility/impossibility pair in terms of distinct metaphysical categories and, depending on the choice of category, explicates the duality via the identification question (possible tropes/inconsistent sets; maximal properties/impossible fictions, or other alternatives). Given that this variety of versions remains an underdeveloped region of modal metaphysics in the early twenty-first century, it is a challenge for the future to fill in the gaps in the literature.

5. Troubles with Impossible Worlds

Undoubtedly, any introduction of suspicious entities into philosophy comes with problems, and impossible worlds are not an exception. Besides incredulous stares toward them, philosophical arguments against impossible worlds abound.

A general argument against impossible worlds points to the analysis of modality. For, as far as the goal is to provide an account of modal concepts in more graspable notions, the introduction of impossible worlds puts the accuracy of the analysis at stake. Recall the initial impossibility schema (I):

(I) It is impossible that P if and only if there is no possible world, i, such that at i, P.

An impossible worlds reading substitutes the occurrence of ‘no possible world’ with ‘impossible world’ along the lines of (I*):

(I*) It is impossible that P if and only if there is an impossible world, i, such that at i, P.

(I*) mimics the structure of (P), and proponents of impossible worlds might be expected to be tempted by it. However, (I*) is only ‘superficially tempting’. For, although (P) and (I*) are both biconditionals, it is hard to accept the right-to-left direction of (I*). For instance, although it is impossible that A & ~A, the conjuncts themselves may be contingent and, by (P), be true in some possible world. Yet an impossible world at which A & ~A holds is, presumably, also a world at which A holds, so the right-to-left direction of (I*) would wrongly classify A as impossible. Such disanalogy between (P) and (I*) makes impossible worlds of not much use in the theory of impossibility in the first place.
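The worry about the right-to-left direction of (I*) can be sketched in a few lines. The model below is a hypothetical toy: it assumes that an impossible world verifying ‘A & ~A’ also verifies each conjunct, and it treats sentences as plain strings.

```python
# Hypothetical toy model: an impossible world that verifies 'A & ~A' is
# assumed (for illustration) to verify both conjuncts as well.
possible_worlds = {"w1": {"A"}, "w2": {"~A"}}
impossible_worlds = {"i1": {"A", "~A", "A & ~A"}}

def holds_somewhere(sentence, worlds):
    """True if at least one world in the given space verifies the sentence."""
    return any(sentence in facts for facts in worlds.values())

def impossible_by_I(sentence):
    # Schema (I): impossible iff no possible world verifies it.
    return not holds_somewhere(sentence, possible_worlds)

def impossible_by_I_star(sentence):
    # Schema (I*): impossible iff some impossible world verifies it.
    return holds_somewhere(sentence, impossible_worlds)

for s in ["A & ~A", "A"]:
    print(s, impossible_by_I(s), impossible_by_I_star(s))
# 'A & ~A' is impossible on both schemas, whereas the contingent 'A'
# is impossible by (I*) only.
```

The two schemas agree on the contradiction but disagree on the contingent conjunct, which is the disanalogy between (P) and (I*) described above.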

Other problems concern particular theories of modality. Starting with extended modal realism, Lewis himself did not feel the need to dedicate much space to its rejection. There are two reasons. The first reason is that to provide an extensional, non-modal analysis of modality and, at the same time, distinguish possible worlds from impossible worlds without making use of modal notions is hardly a viable project. The second reason is that a restricting modifier, like ‘in a world’, works by limiting domains of implicit and explicit quantification to a certain part of all that there is, and therefore has no effect on the truth-functional connectives (Lewis 1986: 7, fn. 3). By this, Lewis means that insofar as you admit an impossible thing in some impossible world, you thereby admit impossibility into reality. Since this is an unacceptable conclusion, Lewis rejects the extended version of his modal realism via a simple argument:

1. There is a concrete impossible world at which (A & ~A)

2. At w (A & ~A) if and only if at w A & ~(at w A)

3. The right-hand side of (2) is literally a true contradiction

4. The Law of Non-Contradiction is an indisputable logical principle.

C. There are no concrete impossible worlds.

For Lewis, restricting modifiers work by limiting domains of implicit and explicit quantification to a certain part of all there is. Therefore, ‘On the mountain both P and Q’ is equivalent to ‘On the mountain P, and on the mountain Q’; likewise, ‘On the mountain not P’ is equivalent to ‘Not: on the mountain P’. As a result, ‘On the mountain both P and not P’ is equivalent to the overt contradiction ‘On the mountain P, and not: on the mountain P’. In other words, there is no difference between a contradiction within the scope of the modifier and a plain contradiction that has the modifier within it. See Lewis (1986: 7, fn. 3) for a full exposition of this argument.

Modal dimensionalism is not without problems either. Jago (2013) argues that adding an impossible stage of ‘Martin’s being a philosopher and not a philosopher’ to my modal profile generates undesired consequences, for modal stages are subject to existential quantification in the same way that actual stages are. And since both actual and modal stages exist, they instantiate inconsistencies, full stop. For the opposing view, see Yagisawa’s response (2015), as well as Vacek (2017).

Modal realism with overlap has its problems too. A simple counterexample to it relies on the (usually) indisputable necessity of identity and the view according to which identical objects share all their properties: Leibniz’s law. The argument goes as follows: it is impossible for Richard Routley not to be Richard Sylvan because this is one and the same person (in 1983 Richard Routley adopted the last name “Sylvan”):

    1. It is impossible that ∼ (Routley = Sylvan)

Therefore, there is an impossible world i where ∼ (Routley = Sylvan). Now, take the property ‘being a logician’. It is impossible for Routley but not Sylvan to be a logician which, by modal realism with overlap’s lights, means that Routley, but not Sylvan, bears the ‘being a logician’ relation to a world i. Generalising the idea,

    2. For some property P, in i Routley has P, but Sylvan does not.

However, by Leibniz’s law, it follows that ∼ (Routley = Sylvan). And that is absurd.

What about modal ersatzism? Recall that this alternative to (extended) modal realism takes possible worlds to be existent entities of a more modest kind. The move from ersatz possible worlds to impossible worlds, together with the parity thesis, leads to the inheritance of the various problems of ersatz theories. One such problem is the failure of the reductive analysis of modality. As Lewis argues, any ersatzist theory must at some point appeal to primitive modality and thus give up the project of analysing modality in non-modal terms. Another problem is that entities like states of affairs, properties and propositions are intensional in nature and thus do not contribute to a fully extensional analysis. For scepticism about intensional entities, see Quine (1956). For more problems with modal ersatzism, see Lewis (1986: ch. 3).

Modal fictionalism can be a way of avoiding the realist’s problems. For, if ‘according to the possible worlds fiction’ explains possibility, then ‘according to the possible and impossible worlds fiction’ offers a finer-grained analysis with no exotic ontological commitments. But again, such a relatively easy move from possibility to impossibility faces the threat of inheriting the problems of modal fictionalism. One such difficulty is that fictionalism is committed to weird abstract objects, to wit, ‘stories’. Another worry about (extended) modal fictionalism is the story operator itself. For, unless the operator is understood as primitive, it should receive an analysis in more basic terms. And the same applies to the ‘according to the possible and impossible worlds fiction’ operator.

Moreover, even if modal fictionalists provide us with an account of their fiction operator, it will probably face the same importation problem that the modal realist does. The argument goes as follows. First, suppose that reasoning within the fiction is governed by classical logic. Second, if something is true in the fiction, so are any of its classical consequences. Third, given the explosion principle (everything follows from a contradiction), an inconsistent fiction implies that every sentence is true in the fiction. Fourth, take an arbitrary sentence and translate it as ‘according to the fiction, A’. Fifth, ‘according to the fiction, A’ is true (because an inconsistent fiction implies that all sentences are true within it). Sixth, given that A is the actual truth, ‘according to the fiction, A’ implies: actually A. But it seems literally false to say that any arbitrary sentence is actually true. For more details, see Jago (2014).

The hybrid view has its limitations too. One limitation is that the view introduces two ontological categories and is, so to speak, ideologically less parsimonious than theories following the parity thesis. Moreover, as Vander Laan (1997, 600) points out, there does not seem to be any ontological principle which would justify two different ontological categories in one modal language, namely the language of possibility and impossibility.

Yet, there are at least two responses available for the hybrid view. First, proponents of the hybrid view might simply claim that if the best theory of modality plays out that way, that is, if the theory which best systematises our intuitions about modality approves such a distinction, the objection is illegitimate. Second, even the ersatzer faces the same objection. The actual world has two different interpretations and, consequently, two different ontological categories. The actual world can be understood either as us and all our (concrete) surroundings, or as an abstract representation of them.

Undoubtedly, there is much more to be said about the metaphysics of impossible worlds. Since they come in various versions, one might worry whether any systematic account of such entities is available. Be that as it may, the story does not end with metaphysics. Besides semantic applications of impossible worlds and their metaphysical interpretation, there are logical criteria which complicate their story even more. The next section therefore discusses the logical boundaries (if any) of impossible worlds.

6. The Logic of Impossible Worlds

One might wonder how far impossibility goes, because, one might think, impossible worlds have no logical borders. Yet, one way to think of impossible worlds is as so-called ‘logic violators’. According to this definition, impossible worlds are worlds where the laws of a logic fail. I use the indefinite article here because it is an open question what the correct logic is. Suppose we grant classical logic its exclusive status among other logics. Then, impossible worlds are worlds where the laws and principles of classical logic cease to hold, and the proper description of the logical behaviour of impossible worlds requires a different logic.

We might therefore wonder whether there is a logic which impossible worlds are closed under. One such candidate is paraconsistent logic(s). Such logics are not explosive, which means that it is not the case that from contradictory premises anything follows. Formally, paraconsistent logic denies the principle α, ~α |= β, and its proponents argue that, impossibly, there are worlds at which inconsistent events happen. Given their denial of the explosion principle, paraconsistent logics should be the tool for an accurate and appropriate analysis of such phenomena. For an extensive discussion of paraconsistent logics, see Priest, Beall, and Armour-Garb (2004).
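The failure of explosion can be checked mechanically in one well-known paraconsistent system, Priest’s Logic of Paradox (LP). The sketch below is a minimal illustration rather than a general theorem prover: the numeric encoding of the three values (1 for true only, 0.5 for both true and false, 0 for false only) and the function name are assumptions made here for convenience.

```python
from itertools import product

# Assumed encoding: 1 = true only, 0.5 = both true and false, 0 = false only.
# A value is 'designated' (counts as holding) when it is at least 0.5.
CLASSICAL_VALUES = [0, 1]
LP_VALUES = [0, 0.5, 1]

def explosion_valid(values):
    """Check whether alpha, ~alpha |= beta holds over the given truth values:
    every assignment that designates both premises must designate beta."""
    for alpha, beta in product(values, repeat=2):
        not_alpha = 1 - alpha  # negation swaps 1 and 0 and fixes 0.5
        premises_hold = alpha >= 0.5 and not_alpha >= 0.5
        if premises_hold and not beta >= 0.5:
            return False  # counterexample found
    return True

print(explosion_valid(CLASSICAL_VALUES))  # True: no assignment satisfies both premises
print(explosion_valid(LP_VALUES))         # False: alpha = 0.5, beta = 0 is a counterexample
```

Classically the inference is valid only vacuously, since no assignment makes both α and ~α true; once a ‘both’ value is available, the premises can hold together while β fails.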

However, some examples show that even paraconsistent logics are not sufficient for describing the plenitude of the impossible. For example, paraconsistent logics usually preserve at least some principles of classical logic (see the article on Paraconsistent Logic) and thus cannot treat the impossibilities of their violations. A solution would be to introduce a weaker alternative which would violate those principles. But even this manoeuvre seems not to be enough because, as Nolan (1997) puts it, there is tension between a need for at least some logical principles on one side and the impossibility of their failure on the other. For, ‘if for any cherished logical principle there are logics available where that principle fails… if there is an impossible situation for every way things cannot be, there will be impossible situations where even the principles of (any) subclassical logics fail’ (Nolan 1997, 547). In other words, if we think of a weaker logic as validating fewer arguments, we easily end up with logical nihilism (Russell 2018). Another option is to admit a plurality of logics (Beall & Restall 2006) or, controversially, accept the explosion principle and fall into trivialism: every proposition follows (Kabay 2008).

7. Impossible Worlds and Hyperintensionality

Let me finish with the question of the place of impossibility in reality. The question remains whether impossibility is a matter of reality or a matter of representing it; that is, are impossible matters non-representational or representational? While the literature about impossible issues is inclined towards the latter option, some authors have located the failure of necessary equivalence, that is, the failure of substituting extensionally as well as intensionally equivalent terms, within the world.

To be more precise, levels of analysis ascend from the extensional, to the intensional, to the hyperintensional level. Nolan (2014) suggests that a position in a sentence is extensional if expressions with the same extension can be substituted into that position without changing the truth-value of the sentence. An intensional position in a sentence is then characterised as a non-extensional position in which expressions that are necessarily co-extensional are freely substitutable while preserving the truth-value of the sentence. Finally, a hyperintensional position in a sentence is neither extensional nor intensional: one can substitute necessary equivalents while failing to preserve the truth-value of the sentence. Apparently, the introduction of impossible worlds moves philosophical analyses to the hyperintensional level, since even when A and B are necessarily equivalent (be this logical, mathematical, or metaphysical necessity), substituting one of them for the other may result in a difference in truth-value. But if that is so, and if some hyperintensional phenomena are non-representational, then impossibility is very much a part of reality.
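
The role impossible worlds play here can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: sentences are modelled as the sets of worlds at which they are true, the world names `w1`, `w2`, `i1` are invented for illustration, and `operator_over` is a hypothetical stand-in for any operator sensitive to impossible worlds (for instance an attitude or counterfactual operator), not an analysis endorsed by any of the authors cited.

```python
# Toy model: two possible worlds and one impossible world (assumed names).
POSSIBLE = {"w1", "w2"}
IMPOSSIBLE = {"i1"}

# A and B are true at exactly the same possible worlds, but they come
# apart at the impossible world i1.
A = {"w1", "w2", "i1"}
B = {"w1", "w2"}


def necessarily_equivalent(p, q):
    """True when p and q agree at every possible world."""
    return (p & POSSIBLE) == (q & POSSIBLE)


def box(p):
    """Intensional operator: truth at every possible world."""
    return POSSIBLE <= p


def operator_over(accessible, p):
    """Hypothetical hyperintensional operator: truth at every accessible
    world, where accessible worlds may include impossible ones."""
    return accessible <= p


# Worlds the hypothetical operator quantifies over.
ACCESSIBLE = {"w1", "i1"}

print(necessarily_equivalent(A, B))      # A and B are necessary equivalents
print(box(A) == box(B))                  # substitution is safe under box
print(operator_over(ACCESSIBLE, A),      # but not under an operator that
      operator_over(ACCESSIBLE, B))      # reaches the impossible world
```

In this model the position governed by `box` is intensional, since swapping A for B never changes the result, whereas the position governed by `operator_over` is hyperintensional, because the same swap changes the truth-value.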

There are several cases which both display worldly features and are hyperintensional. For instance, some counterfactual conditionals with impossible antecedents are non-representational (Nolan 2014). Also, Schaffer (2009) contrasts the supervenience relation with the grounding relation, and concludes that there are substantive grounding questions regarding mathematical entities and relations between them. Yet, framed in terms of the supervenience relation, the corresponding claims turn out to be vacuously true. Explanation as a hyperintensional phenomenon might be understood non-representationally as well, namely as an asymmetric relation between the explanans and its necessarily equivalent explanandum. Among other things, some dispositions (Jenkins & Nolan 2012), the notion of intrinsicality (Nolan 2014), the notion of essence (Fine 1994) or omissions (Bernstein 2016) might be understood in the same way. Admittedly, all these examples are subject to criticism, but the reader might at least feel some pressure to distinguish between ‘merely’ representational and non-representational hyperintensionality. For more details, see Nolan (2014) and Berto & Jago (2019) and, for an alternative approach to hyperintensionality, Duží, Jespersen, Kosterec, and Vacek (2023).

8. Conclusion

Impossible worlds have been with us, at least implicitly, since the introduction of possible worlds. The reason for this is the equivalence of the phrases ‘it is possible’ and ‘it is not impossible’, or ‘it is impossible’ and ‘it is not possible’. The controversies about impossible worlds can also be understood as a sequel to the controversies about possible worlds. In the beginning, possible worlds were hard to understand, and this produced some difficult philosophical debates. It is therefore no surprise that impossible worlds have come to follow the same philosophical path.

9. References and Further Readings

  • Beall, J. & Restall, G. (2006). Logical Pluralism, Oxford: Oxford University Press.
  • A developed account of a position according to which there is more than one (correct) logic.

  • Bernstein, S. (2016). Omission Impossible, Philosophical Studies 173, pp. 2575–2589.
  • A view according to which omissions with impossible outcomes play an explanatory role.

  • Berto, F. (2008). Modal Meinongianism for Fictional Objects, Metaphysica 9, pp. 205–218.
  • A combination of Meinongian tradition and impossible worlds.

  • Berto, F. (2010). Impossible Worlds and Propositions: Against the Parity Thesis, Philosophical Quarterly 60, pp. 471–486.
  • A version of modal realism which distinguishes distinct impossible propositions, identifies impossible worlds as sets and avoids primitive modality.

  • Berto, F. & Jago, M. (2019). Impossible Worlds, Oxford: Oxford University Press.
  • A detailed overview of theories of impossible worlds.

  • Divers, J. (2002). Possible Worlds, London: Routledge.
  • A detailed overview of the possible world ontologies.

  • Duží, M.; Jespersen, B.; Kosterec, M.; Vacek, D. (eds.). (2023). Transparent Intensional Logic, College Publications.
  • A detailed survey of the foundations of Transparent Intensional Logic.

  • Fine, K. (1994). Essence and Modality: The Second Philosophical Perspectives Lecture, Philosophical Perspectives 8, pp. 1–16.
  • A paper arguing that the notion of essence cannot be analysed in purely modal terms.

  • Fine, K. (2012). Counterfactuals Without Possible Worlds, Journal of Philosophy 109, pp. 221–246.
  • The paper argues that counterfactuals raise a serious difficulty for possible worlds semantics.

  • Jago, M. (2013). Against Yagisawa’s Modal Realism, Analysis 73, pp. 10–17.
  • This paper attacks modal dimensionalism from both possibility and impossibility angles.

  • Jago, M. (2014). The Impossible: An Essay on Hyperintensionality, Oxford: Oxford University Press.
  • A detailed overview of the history, as well as the current state of impossible worlds discourse.

  • Jenkins, C. S. & Nolan, D. (2012). Disposition Impossible, Noûs 46, pp. 732–753.
  • An original account of impossible dispositions.

  • Kabay, P. D. (2008). A Defense of Trivialism, PhD thesis, University of Melbourne.
  • A defence of trivialism, on the basis that there are good reasons for thinking that trivialism is true.

  • Kiourti, I. (2010). Real Impossible Worlds: The Bounds of Possibility, PhD thesis, University of St Andrews.
  • A defence of Lewisian impossible worlds. It provides two alternative extensions of modal realism by adding impossible worlds.

  • Lewis, D. (1973). Counterfactuals, Cambridge, MA: Harvard University Press.
  • One of the first explicit articulations of modal realism and its analysis of counterfactual conditionals.

  • Lewis, D. (1978). Truth in Fiction, American Philosophical Quarterly 15, pp. 37–46.
  • An approach which aims at dispensing with inconsistent fictions via the method of union or the method of intersection. According to Lewis, we can explain away an inconsistent story via maximally consistent fragments of it.

  • Lewis, D. (1986). On the Plurality of Worlds, Oxford: Blackwell.
  • A detailed defence of modal realism, including an overview of arguments against modal ersatzism.

  • McDaniel, K. (2004). Modal Realism with Overlap, Australasian Journal of Philosophy 82, pp. 137–152.
  • An approach according to which the worlds of modal realism overlap, resulting in transworld identity.

  • Miller, K. (2017). A Hyperintensional Account of Metaphysical Equivalence, Philosophical Quarterly 67, pp. 772–793.
  • This paper presents an account of hyperintensional equivalency in terms of impossible worlds.

  • Naylor, M. (1986). A Note on David Lewis’ Realism about Possible Worlds, Analysis 46, pp. 28–29.
  • One of the first modus tollens arguments given in response to modal realism.

  • Nolan, D. (1997). Impossible Worlds: A Modest Approach, Notre Dame Journal of Formal Logic 38, pp. 535–572.
  • Besides giving an original account of counterpossible conditionals, this paper introduces the strangeness of impossibility condition: any possible world is more similar (nearer) to the actual world than any impossible world.

  • Nolan, D. (2014). Hyperintensional Metaphysics, Philosophical Studies 171, pp. 149–160.
  • An account of hyperintensionality which distinguishes extensional, intensional, and hyperintensional positions and argues that some hyperintensional phenomena are non-representational.

  • Priest, G. (1997). Sylvan’s Box: A Short Story and Ten Morals, Notre Dame Journal of Formal Logic 38, pp. 573–582.
  • A short story which is internally inconsistent, yet perfectly intelligible.

  • Priest, G., Beall, J. C., & Armour-Garb, B. (eds.). (2004). The Law of Non-Contradiction, Oxford: Oxford University Press.
  • A collection of papers dedicated to the defence as well as the rejection of the law of non-contradiction.

  • Russell, G. (2018). Logical Nihilism: Could There Be No Logic?, Philosophical Issues 28, pp. 308–324.
  • A proposal according to which there is no logic at all.

  • Schaffer, J. (2009). On What Grounds What, in D. Chalmers, D. Manley, and R. Wasserman (eds.), Metametaphysics: New Essays on the Foundations of Ontology, Oxford: Oxford University Press, pp. 347–383.
  • A defence of the grounding relation as providing a philosophical explanation.

  • Quine, W. V. (1956). Quantifiers and Propositional Attitudes, Journal of Philosophy 53, pp. 177–187.
  • According to Quine, propositional attitude constructions are ambiguous, yet an intensional analysis of them does not work.

  • Restall, G. (1997). Ways Things Can’t Be, Notre Dame Journal of Formal Logic 38, pp. 583–596.
  • In the paper, Restall identifies impossible worlds with sets of possible worlds.

  • Rosen, G. (1990). Modal Fictionalism, Mind 99, pp. 327–354.
  • An initial fictionalist account of modality, ‘parasiting’ on the advantages of modal realism, while avoiding its ontological commitments.

  • Vacek, M. (2017). Extended Modal Dimensionalism, Acta Analytica 32, pp. 13–28.
  • A defence of modal dimensionalism with impossible worlds.

  • Vander Laan, D. (1997). The Ontology of Impossible Worlds, Notre Dame Journal of Formal Logic 38, pp. 597–620.
  • A theory of impossible worlds as maximal inconsistent classes of propositions, as well as a critique of various alternative positions.

  • Vetter, B. (2016). Counterpossibles (not only) for Dispositionalists, Philosophical Studies 173, pp. 2681–2700.
  • A proposal according to which the non-vacuity of some counterpossibles does not require impossible worlds.

  • Williamson, T. (2017). Counterpossibles in Semantics and Metaphysics, Argumenta 2, pp. 195–226.
  • A substantial contribution to the semantics of counterpossible conditionals.

  • Yagisawa, T. (1988). Beyond Possible Worlds, Philosophical Studies 53, pp. 175–204.
  • An influential work about the need for impossible worlds, especially with regard to modal realism.

  • Yagisawa, T. (2010). Worlds and Individuals, Possible and Otherwise, Oxford: Oxford University Press.
  • A detailed account of modal dimensionalism and its ontological, semantic and epistemological applications.

  • Yagisawa, T. (2015). Impossibilia and Modally Tensed Predication, Acta Analytica 30, pp. 317–323.
  • The paper provides responses to several arguments against modal dimensionalism.

 

Author Information

Martin Vacek
Email: martin.vacek@savba.sk
Institute of Philosophy at the Slovak Academy of Sciences
Slovakia

Boethius (480-524)

Boethius was a prolific Roman scholar of the sixth century AD who played an important role in transmitting Greek science and philosophy to the medieval Latin world. His most influential work is The Consolation of Philosophy. Boethius left a deep mark on Christian theology and provided the basis for the development of mathematics, music, logic, and dialectic in medieval Latin schools. He devoted his life to political affairs as the first minister of the Ostrogothic regime of Theodoric in Italy while pursuing Greek wisdom in devoted translations, commentaries, and treatises.

During the twentieth century, his academic modus operandi and his Christian faith became a matter of renewed discussion. There are many reasons to believe his academic work was not a servile translation of Greek sources.

The Contra Eutychen is the most original work by Boethius. It is original in its speculative solution and in its methodology of using hypothetical and categorical logic in the analysis of terms, propositions, and arguments. The Consolation of Philosophy is also original, though many authors restrict its originality to its methodology and the arrangement of its elements, and deny it to the content, which would represent the Neoplatonic school of Iamblichus, Syrianus, and Proclus. Boethius was primarily inspired by Plato, Aristotle, and Pythagoras. His scientific, mathematical, and logical works are not original, as he himself recognized.

Table of Contents

  1. Life
  2. Time
  3. Writings
    1. Literary Writings 
      1. The Consolation of Philosophy
    2. Theological Treatises
    3. Scientific Treatises
    4. Logical Writings
      1. Translations
      2. Commentaries
      3. Treatises
        1. On the Division
        2. On the Topics
        3. On the Hypothetical Syllogisms
      4. Treatises on Categorical Syllogisms
        1. The De Syllogismo Categorico
        2. The Introductio ad Syllogismos Categoricos
  4. Influence of the Treatises
  5. His Sources
  6. References and Further Reading

1. Life

Anicius Manlius Severinus Boethius (c. 480-524 AD) was a prominent member of the gens Anicia, a family with a strong presence in republican and imperial Roman life. From the time of Constantine its members were converts to, and advocates of, the doctrine of the Christian church of Rome. The study of Latin epigraphy (compare Martindale 1980, p. 232) and some biographical details about his childhood given by Boethius himself (Consolation of Philosophy ii, 3, 5) suggest that his father was another Boethius, Narius Manlius Boethius, who was praetorian prefect, then prefect of Italy, and finally consul and patrician in 487 AD (compare Cameron 1981, pp. 181-183). It is not clear whether this Boethius is the one who was prefect of Alexandria in 457 AD, but Courcelle (1970, p. 299, n.1) suggested so in order to give more weight to his hypothesis that Boethius could have used his social position to go to Athens or Alexandria to learn Greek and deepen his study of philosophy and theology. What seems more likely is that Boethius’ grandfather was the same Boethius who was murdered by Valentinian III in 454 AD (compare Martindale 1980, p. 231).

After his father’s death, which occurred when Boethius was a child, he received the protection of Quintus Aurelius Memmius Symmachus, who belonged to a very influential family of the Roman nobility. Later, Boethius married Symmachus’ daughter, Rusticiana, sealing a family alliance that was disturbing to Theodoric, the Ostrogoth king, who was in Italy to impose authority and governance on the collapsed Western Empire at the request of Flavius Zeno, the Eastern Roman Emperor. The political commitment of Boethius to Rome is attested not only by his public office of magister officiorum, the highest political rank that could be held in the reign of Theodoric, but also by the education and cursus honorum of his two sons, Symmachus and Boethius, who became senators (Consolation of Philosophy ii, 3,8; 4,7).

The prestige of Boethius in sixth-century Rome is attested not only by the honors granted him during his youth (some of which were denied to his older fellows, compare Consolation of Philosophy ii, 3), but also by the requests from friends and relatives to write commentaries and treatises explaining difficult matters. In addition, Cassiodorus (Magnus Aurelius Cassiodorus), well known for founding in 529 AD the monastery of Vivarium, reports a scientific mission entrusted to Boethius by Theodoric: to provide a horologium, a clock regulated by a measured flow of water, for the king of the Burgundians, Gundobad (compare Variae I, 45 and I, 10. Mommsen ed. 1894).

2. Time

Theodoric must have been an ominous character for Romans, perhaps the lesser evil. The difficulties involved in moving from the pure ideal of Rome to Theodoric’s nascent eclectic culture must have been the context in which Boethius lived. By this time the unity of the Western Roman Empire was fragile, and political power was continuously disputed by various Germanic warlords, from Genseric, the Vandal king, in 455 AD to Theodoric, the Ostrogoth king, in 526 AD.

It was Theodoric who organized a more stable government and achieved greater political unity among the leaders of the two dominant ethnic groups, the Roman and the Ostrogoth. In 493 Theodoric established in Ravenna, northern Italy, the political and diplomatic capital of his government after defeating Odoacer there, as planned by the Emperor Flavius Zeno in Constantinople as a punishment for Odoacer’s failure to respect the authority of the Eastern Empire.

Theodoric’s brief reign (he died in 526, two years after Boethius) kept the administrative structure of the Roman Empire and sustained a joint government between the two main ethnic and political groups. Theodoric was not an entirely uneducated man (though see Excerpta Valesiana II, 79. Moreau ed. 1968) and would have had some familiarity with Greek culture after staying in Constantinople as a hostage at the age of eight; it is known that, whatever his motivation was, he regularly respected the Roman civil institutions (but see the opinion of Anderson 1990, pp. 111-115). Boethius himself delivered a panegyric to Theodoric during the ceremony in which Boethius’ two children were elected consuls (Consolation of Philosophy ii, 3, 30).

But the association of the two powers, the military power of Theodoric and the political power of Rome, had many reasons to be strained. By this time, Boethius must have been not only the most influential Roman politician in the Ostrogoth government but also the most distinguished public figure of the Roman class. The personal and political opposition was, after all, deep and irreconcilable. The Arianism of Theodoric and the Catholicism of Boethius clashed in 518, when Justin was appointed Roman emperor of the East. Justin abolished the Henoticon, embarked on a policy of restoring the Catholic faith of the Council of Chalcedon, and began a plan of rapprochement with Rome (Matthews 1981, p. 35). The most difficult years came as the aging Theodoric began to worry about the fate of his non-Catholic Eastern allies and about his own stability in Italy. Around 524 AD, Boethius was accused of treason by Theodoric himself, without the right to be defended by the Roman Senate, which was also accused of treason (compare also Excerpta Valesiana II, 85-87. Mommsen ed. 1984). He was quickly imprisoned near Pavia, where he remained until he was executed.

The detailed circumstances of the accusation have not been entirely clear to posterity, even though Boethius gives a summary of them in his Consolation of Philosophy i, 4. In essence, the charge was treason against the Ostrogoth government by seeking an alliance with Justin in Constantinople. The evidence for this charge included Boethius’ intention to defend the Senate against the accusation of protecting the senator Albinus (who had earlier faced the same charge), and the exhibition of some letters sent to Justin that contained expressions adverse to Theodoric and his regime. Boethius calls these letters apocryphal (Consolation of Philosophy i, 4). Probably Albinus was not in secret negotiations with the Eastern Empire, and Boethius, in wishing to defend the Senate against the charge of treason and concealment, was innocent. However, he was accused and punished for conspiracy, at the mercy of a violent and despotic king who neither allowed a proper defense nor proved the charge against him. The execution of Boethius came quickly, and the killing of his father-in-law, Symmachus, followed soon after, as did the abuse and death of Pope John I. During his imprisonment, Boethius wrote his masterpiece, The Consolation of Philosophy, which was not only a work of great influence in the Middle Ages and the Renaissance, but also one of the most respected works of human creativity.

3. Writings

Boethius’ writings divide into three kinds: philosophical, theological, and scientific. The scientific writings are divided into mathematical and logical. The relationship between Boethius and his work remains complex. His completed works are traditionally regarded as original. The disorganized and incomplete shape of some of his works, especially his scientific treatises, is explained by his execution. However, many twentieth-century scholars believe that this classical description applies only to his theological treatises and partly to the Consolation of Philosophy, since elsewhere Boethius depends on his sources more than an original author would. This opinion, in turn, is something of a generalization from the situation of the scientific writings, and the truth lies somewhere in the middle.

a. Literary Writings

i. The Consolation of Philosophy

Boethius’ philosophical work is identified with his Consolatio Philosophiae, which combines stylistic refinement through the composition of prose and poetry with philosophical ideas within a conceptual framework based on a Neoplatonic interpretation of Aristotle with some subtle touches of Christian philosophy, although this has been a matter of discussion. The unexpected visit of Lady Philosophy to his prison allows for a dialogue with a wonderful counterpoint between human opinion and the wisdom of Lady Philosophy, although Boethius says that Lady Philosophy is just the announcer of the light of truth (IV, 1, 5). The themes raised by The Consolation of Philosophy, such as the nature of fortune, human happiness, the existence of God and evil, human freedom and divine providence, became the focus of attention for Christian metaphysics of the Latin Middle Ages.

In Book I, Boethius briefly reviews his political life and the reasons for his accusation and imprisonment, showing that he is fully aware of who accused him. In Book II, he discusses the nature of fortune and the reasons why no one should trust in it. In Book III he argues, already in a different sense from what we might expect from the philosophy of Plato and Aristotle, that true happiness (beatitudo) is identical with divinity itself, whose nature is unique and simple. He identifies the highest good (perfectum bonum) with the father of all things (III, 10, 43), and maintains that it is not possible to possess happiness without first having access to the highest good. The difference between his theory of happiness and that of Aristotle and Plato is that Boethius makes God a sine qua non condition for the possession of happiness, implying that every man must trust in divine wisdom and providence to be happy. In Book IV, he addresses the issue of the existence of evil in the realm of one who knows and can do everything (IV, 1, 12: in regno scientis omnia potentis omnia). The allusion to the kingdom of God (regnum dei) is highly significant as evidence of the work’s implicit Christianity, mostly because he completes this allusion with the metaphor of the gold and clay vessels that the master of the house arranges, a symbol found in the Letters of Saint Paul (2 Timothy 2,20; 2 Corinthians 4,7; and Romans 9, 21) and widely cited in Patristic literature. In Book V, Boethius examines one of the most complex problems of post-Aristotelian philosophy: the compatibility of human freedom and divine foreknowledge (divina praescientia). Boethius’ treatment proved to be of great theoretical value for later philosophy, and traces of his discussion can be seen in Thomas Aquinas, Valla, and Leibniz (compare Correia (2002a), pp. 175-186).

Neoplatonic influence has been discerned in the Consolation, especially that of Proclus (412-485 AD) and Iamblichus. But this fact is not enough to affirm that Boethius in the Consolation only follows Neoplatonic authors. The issue is whether there is an implicitly Christian philosophy in this work. The absence of the name of Christ and of Christian authors has led some scholars to believe that the Consolation is not a work of Christian philosophy, and Boethius’ Christianity has even been doubted on this ground (compare Courcelle, 1967, pp. 7-8). Added to this is the objection that, if Boethius were a Christian, he would have sought consolation in the Christian faith rather than in pagan philosophy. However, it must be considered that the genre of philosophical consolation, in the form of logotherapy, was traditional in Greek philosophy. Crantor of Soli, Epicurus, Cicero, and Seneca had written consolations about the loss of life, exile, and other ills that affect the human spirit. Cicero in his Tusculan Disputations (3.76) even shows that the different philosophical schools were committed to the task of consoling the dejected and recognizes various strategies applied by the different schools according to how they conceived the place of human beings in the universe. Boethius was surely aware of this tradition (Cicero wrote his consolation for himself), and if we grant this assumption, Boethius’ Consolation of Philosophy would fit within the genre of consolation as a universal genre together with the themes of universal human grief (evil, destiny, fortune, unhappiness). At the same time, Boethius would be renewing this literary genre as a Christian one, since Lady Philosophy does not lead Boethius’ spirit towards pagan philosophy in general, but rather towards a new philosophy that should be called Christian.

We see this not only in the evocations of Saint Paul’s letters and the new theory of happiness but also when, in Book V, Boethius identifies God with the efficient principle (de operante principio) capable of creating from nothing (V, 1, 24-29). Hence, he adapts Aristotle’s definition of chance by incorporating the role of divine providence (providentia) in disposing all things in time and place (locis temporibusque disponit: V, 1, 53-58).

b. Theological Treatises

The Opuscula sacra or Theological treatises are original efforts to resolve some theological controversies of his time, which were dominated by Christological issues and by the Acacian schism (485-519). Basically, they are anti-heretical writings, in which a refined set of Aristotelian concepts and reasoning is put at the service of the Chalcedonian formula on the unity of God and against both Nestorius’ dyophysitism and Eutyches’ monophysitism. Boethius claims to be seeking an explanation of these issues using Aristotle’s logic. This makes him a forerunner of those theologians who entrust theological speculation to logic. The following five treatises are now accepted as authentic: (1) De Trinitate, (2) If the Father and the Son and the Holy Spirit are substantially predicated of divinity, (3) How the substances are good in virtue of their existence without being substantially good, (4) Treatise against Eutyches and Nestorius, and (5) De Fide Catholica. The most original and influential of Boethius’ theological treatises is the Contra Eutychen.

Because of the absence of explicit Christian doctrines in the Consolation of Philosophy, the authenticity of the theological treatises was doubted by some scholars in the early modern era. But Alfred Holder discovered a fragment of Cassiodorus in the manuscript of Reichenau, later published by H. Usener (1877), in which the existence of these treatises and their attribution to Boethius is reported. Cassiodorus served as senator with Boethius and succeeded him in the office of magister officiorum in Theodoric’s government. Cassiodorus mentions that Boethius wrote “a book on the Trinity, some chapters of dogmatic teaching, and a book against Nestorius” (compare Anecdoton Holderi, p. 4, 12-19. Usener ed. 1877). This discovery not only confirmed the authenticity of Boethius’ theological treatises, but also dispelled the doubts over whether Boethius was a Christian. The Treatise against Eutyches and Nestorius has been recognized as the most original of Boethius’ theological treatises (Mair, 1981, p. 208). By the year 518 Boethius had translated, commented on, and treated a large part of Aristotle’s Organon (compare De Rijk’s chronology, 1964). Thus, Boethius makes use of Aristotelian logic as an instrument. In Contra Eutychen, he uses all the resources that are relevant to the subject in question: division and definition, hypothetical syllogisms, distinction of ambiguous meanings of terms, and detection and resolution of the fallacies involved. This is accompanied by the idea that human intelligence can store arguments for or against a certain thesis so as to possess a copia argumentorum (a stock of arguments; p. 100, 126-130), suggesting that there can be several arguments to demonstrate the same point under discussion, which is reminiscent of Aristotle’s Topics.

Thus, Boethius gives a perfect order of exposition, rigorously declared at the beginning of the central discussion of the treatise: 1) Define nature and person, and distinguish these two concepts by means of a specific difference; 2) Know the extreme errors of the positions of Nestorius and Eutyches; 3) Know the middle path of the solution of the Catholic faith. Boethius’ solution is the Catholic solution to the theological problem of the two natures of Christ. According to his view, Christ is one person with two natures, the divine and the human, which are perfect and united without being confused. He is thus consubstantial with humanity and consubstantial with God.

c. Scientific Treatises

Within the scientific writings, we find mathematical and logical works. Boethius gives us scientific writings on arithmetic, geometry, and music; no work on astronomy has survived, but Cassiodorus (Variae I. 45, 4) attributed to him one on Ptolemy. Similarly, Cassiodorus attributes to him another work, on geometry, with a translation of Euclid’s Elementa, but what we count as Boethius’ writing on geometry does not correspond to Cassiodorus’ description. His logical works are on demonstrative and inventive logic. A treatise on division (De divisione) has also been credited to him (compare Magee 1998). But one on definition (De definitione) has been rejected as authentic and attributed to Marius Victorinus (Usener, 1877). Boethius devotes three types of writings to logic: translations, commentaries, and treatises.

Boethius uses the term quadrivium (De Institutione Arithmetica I, 1, 28) to refer to arithmetic, geometry, music, and astronomy, which reveals that he might have been engaged not only in the development of these sciences, but also in the teaching of them. However, his works on logic do not reveal that this plan might also have covered the other disciplines of the trivium, grammar and rhetoric.

The scientific writings of Boethius occupied an important place in the education of Latin Christendom. The influence that these treatises had on the medieval quadrivium and even on the early modern tradition is such that only Newton’s physics, Descartes’ analytical geometry, and Newton’s and Leibniz’s infinitesimal calculus were able to prevail over the Boethian scientific tradition.

It is known that the way Boethius approaches arithmetic and music is speculative and mathematical. Arithmetic is understood as the science of numbers and does not necessarily include calculation. And music is a theoretical doctrine of proportion and harmony and has nothing directly to do with making music or with techniques of musical performance. In De Institutione Musica I, 2, 20-23, Boethius distinguishes three types of music: cosmic (mundana), human (humana), and instrumental. He distinguishes them according to their universality. The mundana is the most universal, since it corresponds to celestial harmony and the order of the stars: some stars rotate lower, others higher, but all form a set with each other. It is followed by human music, which is what we, as humans, experience and reproduce directly within ourselves. It is the song, the melodies that are created by poetry. It is responsible for our own harmony, especially the harmonious conjunction between the sensitive part and the intellectual part of our nature, just as the bass and treble voices articulate in musical consonance. The third is instrumental music, generated by the tension of a string, by the breath of air, or by percussion.

At the beginning of his De Institutione Musica (I, 10, 3-6), following Nicomachus of Gerasa, Boethius adopts without criticism not only Pythagoras’ theory of music, but also the supernatural context in which Pythagoras announces the origin of music through a divine revelation given by the symmetric and proportional sounds coming from a blacksmith. The marked Pythagorean tendency of his theory of music prevents Boethius from giving a richer account of music that would include the more empirical approach of Aristoxenus, who is criticized by Boethius just as the Stoics are in logic.

d. Logical Writings

Boethius has three kinds of works on logic: translations, commentaries, and treatises. Their content revolves mainly around Aristotle’s logical writings: Categories, De Interpretatione, Prior Analytics, Posterior Analytics, Topics and Sophistical Refutations, traditionally called the Organon. But even if Boethius wanted to devote a work to each one, he did not complete the task.

i. Translations

As a translator, Boethius has a consummate artistry. His translations are literal and systematic. They do not lack the force of the Greek, and they never spoil the style of Latin. His literal translation method has been compared to that developed later by William of Moerbeke (who translated some works of Aristotle and other Greek commentators) for the use and study of Thomas Aquinas. Boethius’ translations from Greek are so systematic that scholars often can determine what the Greek term behind the Latin word is. Boethius’ translations are edited in Aristoteles Latinus (1961-1975). Translations of every work in Aristotle’s Organon have been found. In addition to these works, Boethius translated the Isagoge of Porphyry, which is an introduction (Eisagogé is the Greek term for ‘introduction’) to Aristotle’s Categories.

In these translations, Boethius exceeded the art of Marius Victorinus, who had earlier translated into Latin Aristotle’s Categories and De Interpretatione, and Porphyry’s Isagoge. Boethius himself attributed certain errors and confusions to Marius Victorinus and informs us that Vettius Praetextatus’ translation of Aristotle’s Prior Analytics, rather than being a translation of Aristotle’s text, is a paraphrase of the paraphrase made by Themistius on this Aristotelian work (compare Boethius in Int 2, p. 3; Meiser ed. 1877-1880). The translation of Greek works into Latin was common. Apuleius of Madaura, a Latin writer of the second century AD, born and settled in North Africa, had translated the arithmetic of Nicomachus of Gerasa and written an abridgement of Aristotelian logic. In general, we can say that Boethius saw very clearly the importance of systematic translations into Latin of Greek philosophy and science as an educational service to the nascent European Latin Christianity.

ii. Commentaries

Even if Boethius planned to comment on the complete Organon, he finished only the following:

    • On Porphyry’s Isagoge (In Porphyry Isagogen, two editions).
    • On Aristotle’s Categories (In Aristotelis Categorias, two editions).
    • On Aristotle’s De Interpretatione (In Aristotelis Peri hermeneias, two editions).
    • On the Topics of Cicero (In Ciceronis Topica, one edition).

Though no commentary on Posterior Analytics, Topics or Sophistical Refutations exists, this does not suggest that Boethius was unaware of them. In his Introductio ad syllogismos categoricos (p. 48, 2), when Boethius deals with singular propositions, he seems to follow some explanations closely related to a commentary on Sophistical Refutations. Even if his plan of writing a double commentary on every work is not original, he explained this modus operandi. The first edition contains everything which is simple to understand, and the second edition focuses on everything which is more subtle and requires deeper, longer explanation.

The influence of these commentaries on medieval education was enormous, as they contain key concepts that became central to both the logica vetus and medieval philosophy. In fact, his comments on Porphyry’s Isagoge contain the so-called problem of universals (Brandt 1906 ed., p. 24, 159), and his comments on De Interpretatione give the linguistic and semantic basis of the long tradition of logical analysis of medieval thinkers until Peter Abelard. Additionally, his comments on Cicero’s Topics were influential in the history of logic and sciences by dividing logic into the demonstrative and the dialectic branches, underlining the distinction between Aristotle’s Analytics and Topics.

Boethius’ commentaries often proceed through long explanations, but they contain valuable information on the history of logic, as they build upon many doctrines of earlier commentators on Aristotle. The commentary on Aristotle’s logic had a long Greek tradition, and Boethius knew how to select those commentators and doctrines that improve Aristotle’s text. In that tradition, earlier authors exerted an important influence on later ones. However, there is important evidence that Boethius is not following any continuous copy of any of the earlier Greek commentators.

iii. Treatises

Boethius not only translated and commented on the works of Aristotle and Porphyry, but he wrote some monographs or logical treatises that are different from his commentaries, for they are not intended to provide the correct interpretation of Aristotle’s text, but to improve the theory itself. If we leave aside the De definitione, five treatises are recognized:

    • On Division (De divisione liber)
    • On Categorical syllogism (De syllogismo categorico)
    • Introduction to categorical syllogisms (Introductio ad syllogismos categoricos)
    • On Topical Differences (De Topicis differentiis)
    • On hypothetical syllogisms (De hypotheticis syllogismis).

1. On the Division

Boethius’ De divisione transmitted the Aristotelian doctrine of division, that is, the doctrine that divides a genus into subordinate species. The aim of division is to define (compare Magee 1998). For example:

 

In Aristotle’s works there are examples of divisions (for example, Politics 1290b21, De generatione et corruptione 330a1), which proves that Boethius accepted this definition method regardless of the fact that its origin was Platonic. The logical procedure was also appreciated by the first Peripatetics, and the proof is that, as Boethius reports at the beginning of this treatise, Andronicus of Rhodes published a book on division, because of its considerable interest to Peripatetic philosophy (De divisione 875D; compare also Magee 1998, pp. xxxiv-xliii). Also, the Neoplatonic philosopher Plotinus studied Andronicus’ book and Porphyry adapted its contents for commenting on Plato’s Sophist (De divisione 876D). Boethius recounts that every division is either secundum se or secundum accidens. The first has three branches: (i) a genus into species (for example, animal is divided into rational and non-rational); (ii) the whole into its parts (for example, the parts of a house); and (iii) a term into its own meanings (for example, ‘dog’ means quadruped capable of barking, a star in Orion and an aquatic animal). The division secundum accidens is also triple: (i) a subject into its accidents (for example, a man into black, white and an intermediate color); (ii) accidents into a subject (for example, among the things that are sought, some belong to the soul and some belong to the body); finally, (iii) accidents into accidents (for example, among white things some are liquid and some are solid).

It is worth noting that not all genus-species divisions are dichotomous, as they were for the Platonists, because Peripatetic philosophers also accepted that a genus can be divided into three species or more, since the general condition for a division to be correct is that it must never have fewer than two species and never infinitely many (De divisione 877C-D). This seems to be one of the differences between Aristotle and the Platonists. In fact, Aristotle criticizes the Platonists’ dependence on dichotomous divisions by arguing that if all divisions were dichotomous, then the number of animal species would have to be four or some other power of two (Aristotle, Parts of Animals I, 3, 643a16-24).
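Aristotle’s objection can be read as a counting argument. The following is a minimal sketch, assuming that every class is split at every level and that all branches are carried to the same depth; the function names are illustrative only and do not come from any source discussed here.

```python
def count_final_species(depth: int, branching: int = 2) -> int:
    """Number of lowest species after `depth` rounds of division,
    assuming every class at each level splits into `branching` subclasses."""
    count = 1  # start from the single genus
    for _ in range(depth):
        count *= branching
    return count


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ..."""
    return n >= 1 and (n & (n - 1)) == 0


# Strict dichotomy carried to depths 1..5.
dichotomous_counts = [count_final_species(d) for d in range(1, 6)]
all_powers_of_two = all(is_power_of_two(n) for n in dichotomous_counts)

# A single three-way division gives a count that dichotomy cannot produce.
three_way = count_final_species(1, branching=3)
```

Under these assumptions a purely dichotomous scheme can only ever yield 2, 4, 8, 16, and so on, whereas the Peripatetic allowance for three or more species per division removes that restriction.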

2. On the Topics

Boethius’ idea of logic is complex and in no way reduces only to formal demonstration. When he refers to logic as such (compare In Isagogen 138,4-143,7; De Top Diff 1173C.; and In Ciceronis topica I 2.6-8), he distinguishes between demonstrative and dialectical syllogism and criticizes the Stoics for leaving out the dialectical part of logic and maintaining a narrower idea of it. In fact, Boethius does not reduce logic to demonstration, but he divides logic into two parts: judgement and the discovery of arguments. Since he identifies the former with the Analytics and the latter with the Topics, the division serves to reconcile these two main procedures of logic. Logic is both a demonstration and a justification of reasonable premises, as the syllogism can manage necessary or possible matters.

In his In Ciceronis Topica, Boethius comments on Cicero’s Topics. The objective of this work is to adopt Ciceronian forensic cases and explain them within his understanding of the Peripatetic tradition of Aristotle’s Topics. Boethius’ notion of topic is based on what seems to be the Theophrastean notion, which is a universal proposition, primitive and indemonstrable, and in and of itself known (Stump, 1988, pp. 210-211). A topic is true if demonstrated through human experience, and its function is to serve as a premise within the argument sought. The topic may be within or outside argumentation. One example in the treatise (1185C) appears to be autobiographical: the question of whether to be ruled by a king is better than by a consul. According to Boethius, one should argue thus: the rule of a king lasts longer than the government maintained by a consul. If we assume that both governments are good, it must be said that a good that lasts longer is better than one that lasts a shorter time. Consequently, to be ruled by a king is better than to be governed by a consul. This argument clearly shows the topic or precept: goods that last longer are more valuable than those that last a shorter time. Within the argument it works as an indemonstrable proposition. Boethius often calls such a topic a maximal proposition (propositio maxima).

Boethius called dialectic the discipline studying this type of argumentation. The syllogism can be categorical or hypothetical, but it will be dialectic if the matter in its premises is only credible and non-demonstrative. In De Top Diff 1180C, Boethius introduces a general classification of arguments in which demonstrative arguments can be non-evident to human opinion and nevertheless demonstratively true. In fact, our science has innumerable non-evident affirmations that are entirely demonstrable. On the other hand, dialectical arguments are evident to human opinion, but they could lack demonstration.

Boethius devotes the entire Book 5 of this commentary to discussing dialectical hypothetical syllogisms and here, as in his treatise on hypothetical syllogisms, the role of belief (fides) is quite important in defining dialectical arguments in general, as will be explained further in the following section.

3. On the Hypothetical Syllogisms

The De hypothetico syllogismo (DHS), perhaps originally titled by Boethius De hypotheticis syllogismis, as Brandt (1903, p. 38) has suggested, was published in Venice in 1492 (1st ed.) and 1499 (2nd ed.). This double edition was the basis for the editions of Basel (1546 and 1570) and the subsequent publication by J.P. Migne in Patrologia Latina, vols. 63 and 64 (1st ed., 1847; 2nd ed., 1860), which appears to be a reprint of the Basel edition. The editions of 1492 and 1499 form the editio princeps, which was used regularly for the study of this work until the critical revision of the text by Obertello (1969). DHS is the most original and complete treatise on hypothetical logic of all those written in antiquity that have survived. It was not systematically studied during medieval times, but it had a renaissance in the twentieth century, through the works of Dürr (1951), Maroth (1979), Obertello (1969), and others.

According to the conjecture of Brandt (1903, p. 38), it was written by Boethius between 510 and 523, but De Rijk (1964, p. 159) maintains that it was written between 516 and 522. In DHS Boethius does not follow any text by Aristotle but rather Peripatetic doctrines. This is because Aristotle wrote nothing about hypothetical syllogisms, although he was aware of the difference between categorical and hypothetical propositions. Thus, De Interpretatione 17a15-16 states that “A single-statement-making sentence is either one that reveals a single thing or one that is single in virtue of a connective” (Ackrill’s translation, 1963), and later (17a20-22) he adds, “Of these the one is a simple statement, affirming or denying something of something, the other is compounded of simple statements and is a kind of composite sentence” (Ackrill’s translation, 1963). Even if Aristotle promised to explain how categorical and hypothetical syllogisms are related to each other (compare Prior Analytics 45b19-20 and 50a39-b1), he never did.

Aristotle only developed a syllogistic logic with simple or categorical propositions, that is, propositions saying something of something (for example, “Virtue is good”). The syllogism with conditional premises (for example, “The man is happy, if he is wise”) was covered by the first associates of Aristotle, Theophrastus and Eudemus (DHS I, 1,3). Boethius’ DHS contains the most complete information about this Peripatetic development. The theory is divided into two parts: disjunctive and connective propositions. A connective (conditional) proposition has the form “If P, then Q”, where P and Q are simple propositions. A disjunctive proposition has the form “Either P or Q”. Boethius presents two indemonstrable syllogisms for each part. The first disjunctive syllogism: ‘It is P or it is Q. But it is not P. Therefore, it is Q.’ And the second: ‘It is P or it is Q. But it is not Q. Therefore, it is P.’ As to connectives, the first syllogism is “If it is P, then it is Q. But it is P. Then, it is Q”. And the second is “If it is P, then it is Q. But it is not Q. Then, it is not P”. Boethius accepts that ‘It is P or it is Q’ is equivalent to ‘If it is not P, then it is Q’. Accordingly, Boethius leaves implicit the concordance between hypothetical and disjunctive syllogisms:

First disjunctive syll.    First hypothetical syll.   Second disjunctive syll.   Second hypothetical syll.
It is P or it is Q         If it is not P, it is Q    It is P or it is Q         If it is not P, it is Q
It is not P                It is not P                It is not Q                It is not Q
Therefore, it is Q.        Therefore, it is Q.        Therefore, it is P.        Therefore, it is P.
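The concordance can be checked mechanically. The sketch below is only an illustration under a modern assumption: it reads “or” as inclusive disjunction and “if ... then” as the material conditional, which is not necessarily how Boethius or his Peripatetic sources understood these connectives. The helper names `implies` and `is_valid` are hypothetical.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material conditional (a modern assumption, see the note above)."""
    return (not a) or b


def is_valid(premises, conclusion) -> bool:
    """Valid when no valuation makes every premise true and the conclusion false."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False
    return True


SYLLOGISMS = {
    "first disjunctive": ([lambda p, q: p or q, lambda p, q: not p],
                          lambda p, q: q),
    "first hypothetical": ([lambda p, q: implies(not p, q), lambda p, q: not p],
                           lambda p, q: q),
    "second disjunctive": ([lambda p, q: p or q, lambda p, q: not q],
                           lambda p, q: p),
    "second hypothetical": ([lambda p, q: implies(not p, q), lambda p, q: not q],
                            lambda p, q: p),
}

validity = {name: is_valid(premises, conclusion)
            for name, (premises, conclusion) in SYLLOGISMS.items()}

# "It is P or it is Q" and "If it is not P, it is Q" agree on every valuation.
disjunction_matches_conditional = all(
    (p or q) == implies(not p, q)
    for p, q in product([True, False], repeat=2)
)
```

On this truth-functional reading all four forms are valid, and the equivalence Boethius accepts between the disjunction and the conditional with negated antecedent holds on every valuation.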

The theory also develops more complex syllogisms and classifies them in modes. For example, DHS II, 11, 7, says correctly that “The eighth mode is what forms this proposition: ‘If it is not a, it is not b; and if it is not b, it is not c; but it is c; therefore, it must be a’”.
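That the eighth mode is conclusive can be confirmed by a short truth-table check. This is a sketch only, assuming the conditionals are read as material conditionals; the names `implies` and `is_valid` are illustrative.

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """Material conditional (assumed reading of 'if ... then')."""
    return (not x) or y


def is_valid(premises, conclusion) -> bool:
    """Valid when no valuation of a, b, c satisfies the premises but not the conclusion."""
    for a, b, c in product([True, False], repeat=3):
        if all(premise(a, b, c) for premise in premises) and not conclusion(a, b, c):
            return False
    return True


eighth_mode_premises = [
    lambda a, b, c: implies(not a, not b),  # if it is not a, it is not b
    lambda a, b, c: implies(not b, not c),  # if it is not b, it is not c
    lambda a, b, c: c,                      # but it is c
]

eighth_mode_valid = is_valid(eighth_mode_premises, lambda a, b, c: a)

# Without the categorical premise "it is c", the conclusion no longer follows.
without_minor_valid = is_valid(eighth_mode_premises[:2], lambda a, b, c: a)
```

The second check shows that the categorical premise does real work: the two conditionals alone do not yield the conclusion.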

Boethius’ development does not use conjunctions, and this must be an important difference between the Stoic theory and the Peripatetic original development. This fact makes Boethius deny the hypothetical affirmation “If it is P, then it is Q” by attaching the negative particle to the consequent, thus: ‘If it is P, then it is not Q’ (DHS I, 9,7). This is an internal negation, unlike Stoic negation, which is external or propositional, since it applies the negative particle to the entire proposition. This explains why he does not consider Stoic axioms based on conjunction in DHS, which he did in his In Ciceronis Topica, V.

The question of whether Boethius is right in believing that the theory comes from Theophrastus and other Peripatetics is still difficult to answer. Speca (2001, p. 71) argues that we cannot presently be certain of its Peripatetic provenance, because the sources cannot go further back than the end of the second century AD, and by then the hypothetical theory was already conflated with Stoic terminology. He is right, if we look at Boethius’ examples like ‘If it is day, then it is light’, and so forth, which are from the Stoic school. On the other hand, Bobzien (2002 and 2002a) has supported the contrary view, and she is inclined to believe in the historical accuracy of Boethius’ account.

The scrupulous view of Speca (2001) is methodologically safe, but it is worth noticing that there are at least three important differences between Boethius’ hypothetical syllogistic logic and Stoic logic. One is negation: Peripatetic hypothetical negation follows the argumentative line of categorical negation; the negative particle must be placed before the most important part of the proposition, and that is the consequent in the case of a conditional proposition. Thus, as said, the negation of “If P, then Q” will be “If P, then not Q”. Stoic negation places the negative particle before the entire proposition. And thus, the negation will be “It is not the case that if P, then Q”.
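The two negations are not interchangeable, which a truth table makes visible. The following sketch assumes a material reading of the conditional, a modern simplification that neither school necessarily shared; the function names are hypothetical.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material conditional (assumed reading)."""
    return (not p) or q


def internal_negation(p: bool, q: bool) -> bool:
    """Peripatetic style: 'If P, then not Q'."""
    return implies(p, not q)


def external_negation(p: bool, q: bool) -> bool:
    """Stoic style: 'It is not the case that if P, then Q'."""
    return not implies(p, q)


valuations = list(product([True, False], repeat=2))

# Valuations (P, Q) on which the two negations take different truth values.
disagreements = [(p, q) for p, q in valuations
                 if internal_negation(p, q) != external_negation(p, q)]

# Whenever the external negation is true, the internal one is true as well.
external_entails_internal = all(internal_negation(p, q)
                                for p, q in valuations
                                if external_negation(p, q))
```

On this reading the two forms agree whenever P is true and come apart only when P is false, so the external negation is the stronger of the two.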

The second difference is that Boethius, in his DHS, distinguishes material and formal conclusions just as he does in his treatises on categorical logic (compare DHS I, iv, 1-2; 3; and I, ii, 1-7; II, ii, 7). In a hypothetical syllogism, to affirm the consequent is fallacious, but if the terms exclude each other (as if they had an impossible matter) and the third hypothetical mood is given (“If it is not P, it is Q”), there will be a syllogism. Boethius gives the example “If it is not day, it is night. It is night. Therefore, it is not day”. But the conclusion does not follow if ‘white’ and ‘black’ are substituted for P and Q respectively. Thus, a syllogism, either categorical or hypothetical, is logically valid if it does not depend on a specific matter of proposition to be conclusive. By contrast, material syllogisms, either categorical or hypothetical, are valid only under certain matters within a certain form; they are not logical conclusions, for they are not valid universally or in every propositional matter. Accordingly, Boethius (DHS II, iv, 2) distinguishes between the nature of the relation (natura complexionis) and the nature of the terms (natura terminorum).
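One way to picture the contrast between form and matter is to check the same argument twice: once over all valuations, and once only over valuations that respect a constraint on the terms. The sketch below is a modern illustration, not Boethius’ own procedure. It assumes a material conditional and models the day/night case simply as the constraint that P and Q are never true together; the parameter name `matter` is hypothetical.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material conditional (assumed reading)."""
    return (not p) or q


def is_valid(premises, conclusion, matter=lambda p, q: True) -> bool:
    """Valid over every valuation admitted by `matter`, a constraint on the terms."""
    for p, q in product([True, False], repeat=2):
        if not matter(p, q):
            continue  # valuation excluded by the nature of the terms
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False
    return True


# "If it is not P, it is Q. It is Q. Therefore, it is not P."
premises = [lambda p, q: implies(not p, q), lambda p, q: q]
conclusion = lambda p, q: not p

# By form alone the argument fails (P and Q both true is a counterexample).
formally_valid = is_valid(premises, conclusion)

# If the terms exclude each other, as day and night do, it goes through.
materially_valid = is_valid(premises, conclusion,
                            matter=lambda p, q: not (p and q))
```

The first result depends only on the form of the argument, while the second depends on what the terms are, which is roughly the distinction between natura complexionis and natura terminorum.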

The third difference lies in the function Boethius assigns to fides, belief (DHS I, 2,4; I, 2,5; II, 1,2). The role of fides is the crucial core of Boethius’ DHS. According to him, if someone argues through the first indemonstrable, or by any other hypothetical syllogism, he needs to confirm the minor premise, which is a belief. It is not the syllogism as such which is in doubt, but its conclusion, which is conditional on the truth of the categorical proposition. Boethius’ reason is the originality and primitiveness of categorical syllogisms. He calls categorical syllogisms ‘simple’ and hypothetical syllogisms ‘non-simple’, because the latter resolve into the former (DHS I, 2,4. Non simplices vero dicuntur quoniam ex simplicibus constant, atque in eosdem ultimos resolvuntur). The role of belief in Boethius’ theory of hypothetical syllogisms is also emphasized in his In Ciceronis Topica (ICT) and, if Stump (1988, pp. 210-1) is right in recognizing the activity of Theophrastus behind Boethius’ theory of Aristotle’s Topics, then Theophrastus and the activity of the first Peripatetics could well be behind DHS.

iv. Treatises on Categorical Syllogisms

The De syllogismo categorico (DSC) and Introductio ad syllogismos categoricos (ISC) are two treatises on categorical syllogisms composed by the young Boethius. Their contents are similar and almost parallel, a fact that has prompted various explanations, the most recent in the early twenty-first century. They have greatly influenced the teaching of logic in medieval Western thought, especially the former, which is the only one that contains syllogistic logic.

1. The De Syllogismo Categorico

DSC was written by Boethius early in his life, perhaps around 505 or 506 AD (for the chronology of Boethius’ works in logic, compare De Rijk 1964). Despite its importance, it did not receive a critical edition until the work by Thomsen Thörnqvist (2008a). In the oldest codices (for example, Orleans 267, p. 57), DSC is entitled “Introductio in syllogi cathegoricos”, but this title changed to De syllogismo categorico after the editions by Martianus Rota (Venice, 1543) and Henrichus Loritus Glareanus (Basel, 1546). The edition of Migne (1891) is based on these two editions of the sixteenth century. During the twentieth century, most scholars corrected this title to De categoricis syllogismis, after Brandt (1903, p. 238, n. 4) argued for using the plural.

The sources of DSC seem to be a certain introduction to categorical syllogistic logic that Porphyry had written to examine and approve the syllogistic theory of Theophrastus, whose principles are inspired by Aristotle’s Prior Analytics. This seems to be suggested from what Boethius says at the end of this work (p. 101, 6-8): “When composing this on the introduction to the categorical syllogism as fully as the brevity of an introductory work would allow, I have followed Aristotle as my principal source and borrowed from Theophrastus and Porphyry occasionally” (Thomsen Thörnqvist transl.). The existence of a similar work by Theophrastus is confirmed by various ancient references; for example, Boethius attributes to him the work “On the affirmation and negation” (in Int 2, 9, 25; Meiser ed.; also Alexander of Aphrodisias in An Pr 367, 15 and so forth), and Alexander of Aphrodisias cites profusely Theophrastus’ own Prior Analytics (in An Pr 123, 19 and 388, 18; Wallies ed. On the works by Theophrastus, see Bochenski 1947 and Sharples 1992, p. 114-123). Moreover, J. Bidez, in the life and works of Porphyry (compare Bidez 1923, p. 198, and Bidez 1964, p. 66*) confirms the existence of a written work entitled “Introduction to categorical syllogisms” written by Porphyry.

DSC is divided into two books. In the first, Boethius reviews the theory of simple propositions, in a way that recalls his commentaries on Aristotle’s De Interpretatione (ed. Meiser 1877-1880). However, DSC exceeds both the commentaries and what Aristotle teaches in his De Interpretatione. In fact, it includes some extra matters: (i) the law of subalternation when reviewing the logical relationships of the Square of oppositions; (ii) a broader explanation on conversion by containing conversion in contraposition (which Aristotle only developed for universal affirmative propositions); (iii) conversion by accident for universal negative propositions (which Aristotle did not include); and (iv) the division of simple propositions.

The second book is a synopsis of the central part of Aristotle’s theory of syllogism (Prior Analytics I, 2-8) plus Theophrastus’ doctrine of indirect syllogistic moods. Theophrastus added five indirect moods to Aristotle’s four moods. Medieval logicians knew these moods by the technical names Baralipton, Celantes, Dabitis, Fapesmo, and Frisesomorum. Moreover, the second book of DSC (69, 8-72, 11) contains a complete explanation of the definition of syllogism, which recalls Alexander of Aphrodisias’ teaching in his commentary on Aristotle’s Topics. Again, DSC is more technical and elaborate than Aristotle’s Prior Analytics. In addition, Boethius’ explanation of how to reduce the imperfect moods of the second and third syllogistic figures to the first four moods of the first figure (Barbara, Celarent, Darii and Ferio) is more systematic than Aristotle’s own explanations.
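The indirect moods can be explored with a brute-force check over small finite models. The sketch below is an illustration under stated assumptions, not a reconstruction of Boethius’ or Theophrastus’ proofs: terms are modelled as non-empty subsets of a three-element universe (so existential import is assumed), the premise arrangement for each mood follows the traditional scholastic scheme rather than anything spelled out in this article, and a check over such small models is evidence rather than a proof.

```python
from itertools import combinations, product

UNIVERSE = (0, 1, 2)
# Every non-empty subset of the universe; terms are assumed to be non-empty.
TERMS = [frozenset(c) for r in range(1, len(UNIVERSE) + 1)
         for c in combinations(UNIVERSE, r)]


def a(x, y): return x <= y          # every x is y
def e(x, y): return not (x & y)     # no x is y
def i(x, y): return bool(x & y)     # some x is y
def o(x, y): return not (x <= y)    # some x is not y


def holds_in_all_models(major, minor, conclusion) -> bool:
    """major relates (M, P), minor relates (S, M), conclusion receives (S, P)."""
    return all(conclusion(s, p)
               for s, m, p in product(TERMS, repeat=3)
               if major(m, p) and minor(s, m))


MOODS = {
    "Barbara":      (a, a, lambda s, p: a(s, p)),  # direct: every S is P
    "Baralipton":   (a, a, lambda s, p: i(p, s)),  # indirect: some P is S
    "Celantes":     (e, a, lambda s, p: e(p, s)),  # indirect: no P is S
    "Dabitis":      (a, i, lambda s, p: i(p, s)),  # indirect: some P is S
    "Fapesmo":      (a, e, lambda s, p: o(p, s)),  # indirect: some P is not S
    "Frisesomorum": (i, e, lambda s, p: o(p, s)),  # indirect: some P is not S
}

results = {name: holds_in_all_models(*mood) for name, mood in MOODS.items()}

# Control: "every M is P, every S is M, therefore every P is S" should fail.
control = holds_in_all_models(a, a, lambda s, p: a(p, s))
```

What marks the indirect moods in this encoding is that the conclusion takes the major term P as subject and the minor term S as predicate, the reverse of the direct moods such as Barbara.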

A careful reading of the logical contents of DSC also makes clear that Boethius (DSC 17, 10) is following a division of categorical propositions to define the three main logical operations of Aristotelian logic: the opposition of propositions (contradiction, contrariety, and subcontrariety); the conversion of propositions (simple, by accident, and by contraposition); and syllogisms, with their figures, syllogistic moods, and the main extensions of the first figure. This division is not Boethius’. Already Alexander of Aphrodisias (In An Pr 45,9) makes full use of it. There are remnants in Apuleius (PeriH 7, 9-14, p. 183) and Galen (Inst Log, 6,3), and it reappears in Boethius’ time in Ammonius (In An Pr 35.26) and Philoponus (In An Pr 40.31). It is also present in later authors.

Boethius, after commenting on the definitions of the elements of simple propositions (name, verb, indefinite name and verb, and phrase), takes a pair of propositions and divides them into categorical propositions as follows: a pair of simple propositions can or cannot have terms in common. If they do not have any term in common, then they do not have any logical relation. But if they have some terms in common, there is an alternative: either both terms are in common or only one is. If both terms are in common, they can or cannot have the same order. When they have the same order, the theory of Opposition is stated. If both terms change their order, the theory of Conversion is defined. On the other hand, if the pair has only one term in common, the syllogistic theory will appear.
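The division just described amounts to a small decision procedure. As a minimal sketch, each proposition can be represented by its subject and predicate terms alone; quantity and quality are deliberately left out, subject and predicate are assumed to be distinct, and the function name `classify_pair` is hypothetical.

```python
def classify_pair(first: tuple, second: tuple) -> str:
    """Classify a pair of propositions, each given as (subject, predicate)."""
    shared = set(first) & set(second)
    if not shared:
        return "no logical relation"
    if len(shared) == 2:
        # Both terms in common: same order or exchanged order.
        return "opposition" if first == second else "conversion"
    # Exactly one term in common: it can serve as a middle term.
    return "syllogism"


examples = {
    "same terms, same order": classify_pair(("man", "animal"), ("man", "animal")),
    "same terms, order exchanged": classify_pair(("man", "animal"), ("animal", "man")),
    "one term shared": classify_pair(("man", "animal"), ("animal", "substance")),
    "no term shared": classify_pair(("man", "animal"), ("stone", "white")),
}
```

The four example pairs land in the four branches of the division: opposition, conversion, syllogism, and no logical relation.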

2. The Introductio ad Syllogismos Categoricos

Boethius is the author of DSC and ISC, two treatises on categorical logic. They have a striking similarity, and they look parallel to some extent. This raises the question of why Boethius wrote two. The first modern explanation proposed a strong dependence between them. Prantl (1855, I, p. 682, n.80) believed that the first book of DSC was an excerpt of ISC. But the presence of syllogistic logic in the second book of DSC and its total absence in ISC is enough to contradict Prantl’s explanation. Brandt (1903, p. 245) was right in refuting him. However, the reason why the treatises are so alike had still not been found. Murari (1905) and McKinlay (1907) suggested that the second book of DSC (dedicated to syllogistic logic) was originally the second book of ISC, while the first book of DSC was not by Boethius, but was attached later to the codices in the Middle Ages. According to McKinlay’s later revision of his hypothesis (1938, p. 218), ISC must be identified with Boethius’s Institutio categorica, thought to be lost, and mentioned by Boethius in his treatise On Hypothetical Syllogism (833B).

McKinlay’s hypothesis has lost support due to later works by De Rijk (1964, p. 39) and Magee (1998, p. xvii-xix). In the early twenty-first century, in her critical edition of both treatises, Christina Thomsen Thörnqvist (2008a and 2008b) has given a new explanation. She thinks (2008a, p. xxxix) that ISC is a review of the first book of DSC and that Boethius was intending to give a review of DSC’s two books, but this original plan was not completed (compare Thomsen Thörnqvist), for while Boethius was writing the first book, he realized that he had gone too far in what was supposed to be nothing more than an introduction to Aristotle’s syllogistic logic. In this conjecture she follows Marenbon (2003, p. 49).

In any case, ISC differs from DSC not only because it lacks syllogistic logic. ISC (15.2) incorporates the notion of strict and non-strict definitions of the elements of the categorical proposition (name, verb, and so on). It takes a strong interest in proofs based on the matters of proposition (29.18). And it pays close attention to singular propositions by including material that was not in his commentaries (48.2). Additionally, ISC contains a crucial difference: the logic of indefinite propositions. It states their opposition (51.9) and their equivalence (62.9), and it develops conversion by contraposition in more detail (69.1).

3. The Divisions of DSC and ISC

Shiel (1958, p. 238) thinks that ISC is the breviarium Boethius promised to write in his second commentary on Aristotle’s De Interpretatione (in Int 2, p. 251, 8-9). However, this cannot be so, because ISC contains more than Boethius’ commentaries on De Interpretatione. The essence of ISC must come from its division.

After developing the linguistic section of Aristotle’s De Interpretatione, both ISC and DSC present their plans through the establishment of a division of a pair of categorical propositions. These divisions contain identical branches, but they also contain important differences. On the one hand, the division of ISC is not as complete as that of DSC, because it does not incorporate the theory of syllogism, but it is more specific than that of DSC by incorporating indefinite terms, on which DSC says nothing. The following description shows how both divisions overlap one another, and what the differences between them are:

On the one hand, if ISC were the first book of DSC, then the indefinite propositions (which only ISC develops) would not play any part in the second book of DSC (which is only on syllogisms). Accordingly, their introduction would be purposeless. On the other hand, if the plan of ISC were a review of DSC’s two books, then Boethius would have been obliged to develop a theory of syllogisms with indefinite premises, which is unlikely since ISC’s division does not contain syllogistic logic (despite ISC’s being an introduction to syllogistic). But even if one thinks that this could have been so, there are several doubts as to whether Boethius’ sources had the logical capacity to do so, even though the issue was not unknown. Boethius indeed recounts (in Int 2, 12-26, p. 316) that Plato and others made conclusive syllogisms with negative premises, which is not allowed by Aristotle in his Prior Analytics (I, 4.41b7-9). According to Boethius, it is possible because Plato in Theaetetus (186e3-4) knew that sometimes a negative categorical proposition can be replaced with the corresponding affirmation with indefinite predicate terms. Boethius (in Int 2, 9, p. 317) cites Alexander of Aphrodisias as one of the ancient authors who dealt with syllogisms with an indefinite premise, which is certain because Alexander, in his commentary on Aristotle’s Prior Analytics, quotes another syllogism of this sort (in An Pr 397, 5-14). Even Aristotle’s De caelo (269b29-31) has another example. However, this does not seem sufficient to believe that Boethius in his ISC was able to introduce a theory of syllogistic logic with indefinite premises. (On this point, compare I. M. Bochenski (1948), pp. 35-37; and Thomas (1949), pp. 145-160; also, Álvarez & Correia (2012), pp. 297-306. Compare also Correia (2001), pp. 161-174.)

4. Influence of the Treatises

DSC and ISC were taken together and never considered separate. There are no signs that either treatise was studied by medieval logicians and philosophers before the eleventh century (compare Van de Vyver, 1929, pp. 425-452).

The first text where the influence of their teaching is clear is the twelfth-century anonymous Abbreviatio Montana. The other is the Dialectic by Peter Abelard. We know this not only because the name of Boethius is cited as the main source, but also because the division of propositions we have seen above is accepted and maintained by Abelard and the anonymous author of the Abbreviatio.

Later on, the authority of these treatises is more evident. In the thirteenth century, Peter of Spain’s Summulae logicales adopted the indirect moods of the first figure and the doctrine of the matters of proposition (which can be traced in the history of logic as far back as Alexander of Aphrodisias and Apuleius), and he follows Boethius in the idea, originally found in Aristotle, of reducing the imperfect moods of the second and third syllogistic figures to the first four perfect moods of the first figure.

5. His Sources

The Contra Eutychen is the most original work by Boethius. It is original in its speculative solution and its methodology of using hypothetical and categorical logic in his analysis of terms, propositions, and arguments. The Consolation of Philosophy is also original, though many authors restrict its originality to its methodology and the arrangement of its elements, excluding the content, which would represent the Neoplatonic school of Iamblichus, Syrianus, and Proclus. As to his inspiring figures, Boethius gives his most respectful words to Plato and Aristotle, but the figure of Pythagoras is also venerated in De Institutione musica (DIM I, 10-20).

As to his scientific writings, his mathematical and logical works are not original, and Boethius recognizes it. When dealing with these scientific matters, Boethius relies on specific Greek sources: in the mathematical disciplines, he follows the middle-Platonist Nicomachus of Gerasa (compare Bower, C., 1978, p. 5). However, not everything comes from him (Barbera, A., 1991, pp. 1-3 and 48-49). In his De Institutione musica (IV, 1-2), he follows, with some changes (Barbera, ibid., pp. 38-60), the Sectio Canonis, attributed to Euclid; and, in developing book V and part of book IV, he uses Ptolemy’s Harmonics (compare DIM V, 4, 25; V, 5, 5; V, 8, 13; V, 11, 1; V, 14, 21; V, 18, 24 et al.; also, Redondo Reyes, 2002, p. cxv).

As to Aristotelian logic, he acknowledges his agreement with the Peripatetic doctrines as reviewed by the Neoplatonist Porphyry (compare Boethius, in Int 2, 24-27, p. 17, Meiser ed., 1877-1880), but here too not everything comes from him, for Boethius also names Syrianus, Proclus’s master.

As to the sources of his logical works, though the question is far from being resolved, there is basic agreement in rejecting the hypothesis proposed by Pierre Courcelle (1969, p. 28) that they depend on the work of Ammonius Hermias in Alexandria. The same rebuttal undermines the widespread belief (also from Courcelle) that Boethius studied Greek in Alexandria. Indeed, Courcelle followed Bidez (1923, pp. 189-201), who some years before had shown that Boethius’ logical commentaries (not the treatises) owed almost everything to Porphyry. But Courcelle (1969) made a valuable observation about this: Boethius also refers to Syrianus, the teacher of Proclus, who taught later than Porphyry. Accordingly, Courcelle proposed that the occurrence of post-Porphyrian authors was due to Boethius’ reliance on the school of Ammonius in Alexandria, as Boethius’ logical works were written between 500 and 524, and by this time the school of Athens had fallen into decline after the death of Proclus in 485. On the other hand, Alexandria, where Ammonius taught from this date, had flourished as the center of philological, philosophical, and medical studies. Courcelle showed several parallels in the texts, but these, as he also saw, implied only a common source. However, he proposed that, in a passage of the second commentary on Aristotle’s De Interpretatione (in Int 2, 9, p. 361), the corrupt phrase sicut audivimus docet should be emended as follows: sicut Ammonius docet. Courcelle knew that the absence of the name of Ammonius in Boethius’ writings was the main objection to his hypothesis, but this emendation made it very convincing. He therefore rejected the emendation that Meiser had made earlier, in 1880, in the critical edition of Boethius’s commentaries on De Interpretatione (compare Praefatio, iv). Indeed, before Courcelle, Meiser had proposed emending the phrase to read: sicut Eudemus docet. Subsequent studies showed that Meiser’s emendation was correct because the doctrine in question was given by Eudemus.

Courcelle’s contribution, however, was to bring the problem of the sources of Boethius’s logical writings into the correct focus. That is why Shiel (1958, pp. 217-244) offered a new explanation of this status quaestionis: he proposed that Boethius drew all his material, whether pre- or post-Porphyrian, from a Greek codex of Aristotle’s Organon containing glosses and marginal notes, from which he translated all the comments and explanations. This singular hypothesis has seduced many scholars and has even been generalized as Boethius’ general modus operandi. Shiel’s hypothesis is plausible in some respects when applied to the works on logic, but it seems to have many problems when applied to other kinds of writing. Many scholars have accepted the existence of this manuscript in Boethius’s hands on the basis of his verbatim allusions (for example, in Int 2 20-3 p. 250), although not all have accepted Shiel’s conclusions, which deny Boethius all originality by presenting him only as a mechanical translator of these Greek glosses. And even though Shiel always referred to Boethius’ works on logic, and it is easy to extend the servile attitude he finds in the scientific material to the other works, the poems, the philosophical synthesis of the Consolation, and the logical analysis of Contra Eutychen have no parallel in earlier sources and are by themselves evidence of a lucid thinker.

According to Shiel (1990), Boethius’s logic comes from a copy of the commentary of Porphyry that was used in the school of Proclus in Athens. This copy was a codex containing Aristotle’s Organon with margins heavily annotated with comments and explanations. Magee has shown the difficulty of accepting the existence of this kind of codex before the ninth century AD (Magee, 1998, Introduction). On the other hand, some scholars find that Shiel’s hypothesis does not accurately apply to all the logical writings of Boethius, as Stump (1974, pp. 77-93) has argued in her defense of the comments on the Topics. Moreover, the absence of Proclus’s name in Boethius’ works on logic, even though Proclus made important contributions to logic, as in the case of the Canon of Proclus (compare Correia, 2002, pp. 71-84), raises new doubts about the accuracy of the formula given by Shiel.

6. References and Further Reading

  • Ackrill, J.L., 1963. Aristotle’s Categories and De Interpretatione. Translation with notes. Oxford: Oxford University Press.
  • Alvarez, E. & Correia, M. 2012. “Syllogistic with indefinite terms”. History and Philosophy of Logic, 33, 4, pp. 297-306.
  • Anderson, P. 1990. Transiciones de la antigüedad al feudalismo, Madrid: Siglo XXI.
  • Barbera, A. 1991. The Euclidean division of the Canon. Greek and Latin sources. Lincoln: The University of Nebraska Press, pp. 1-3 and 48-49.
  • Bidez, J. 1923. “Boèce et Porphyre”, in Revue Belge de Philologie et d’Histoire, 2 (1923), pp. 189-201.
  • Bidez, J. 1964. Vie de Porphyre. Le philosophe néoplatonicien, Hildesheim: G. Olms.
  • Bidez, J. 1984. Boethius und Porphyrios, in Boethius, M. Fuhrmann and J. Gruber (eds.). Wege der Forschung, Band 483, Darmstadt, pp. 133-145.
  • Bobzien, S. 2002. “The development of Modus Ponens in Antiquity: from Aristotle to the 2nd century AD”. Phronesis vol. 47, 4, pp. 359-394.
  • Bobzien, S. 2002a. “A Greek Parallel to Boethius’ De Hypotheticis Syllogismis”, Mnemosyne 55 (2002a), pp. 285-300.
  • Bochenski, I.M. 1947. La logique de Théophraste, 2nd ed., Fribourg: Libraire de L’Université.
  • Bochenski, I.M. 1948. “On the categorical syllogism”, in Dominican Studies, vol. I, 1, pp. 35-37.
  • Bower, C., 1978. “Boethius and Nicomachus: An essay concerning the sources of De Institutione Musica”, Vivarium, 6, 1, pp. 1-45.
  • Brandt, S. 1903. “Entstehungszeit und zeitliche Folge der Werke von Boethius”, in Philologus, 62, pp. 234-275.
  • Cameron, A. 1981. “Boethius’ father’s name”, in Zeitschrift für Papyrologie und Epigraphik, 44, 1981, pp. 181-183.
  • Chadwick, H. 1981. “Introduction”, in Gibson (1981), pp. 1-12.
  • Correia, M. 2001. “Boethius on syllogisms with negative premisses”, in Ancient Philosophy 21, pp. 161-174.
  • Correia, M. 2002. “El Canon de Proclo y la idea de lógica en Aristóteles”. Méthexis 15, pp. 71-84.
  • Correia, M. 2002a. “Libertad humana y presciencia divina en Boecio”, Teología y Vida, XLIII (2002), pp. 175-186.
  • Correia, M. 2009. “The syllogistic theory of Boethius”, in Ancient Philosophy 29, pp. 391-405.
  • Courcelle, P. 1967. La Consolation de Philosophie dans la tradition littéraire. Antécédent et Postérité de Boèce. Etudes Augustiniennes. Paris: Editions du Centre National de la Recherche Scientifique. C.N.R.S.
  • Courcelle, P. 1969. Late Latin Writers and their Sources, Harvard University Press, Cambridge/Massachusetts (see: Les Lettres Grecques en Occident de Macrobe à Cassiodore, 2nd. Ed., París, 1948).
  • De Rijk, L. 1964. “On the chronology of Boethius’ works on logic (I and II)”, in Vivarium, vol. 2, parts 1 & 2, pp. 1-49 and 122-162.
  • Devereux, D. & Pellegrin, P. 1990. Biologie, logique et métaphysique chez Aristote, D. Devereux et P. Pellegrin (eds.). Paris: Editions du Centre National de la Recherche Scientifique. C.N.R.S.
  • Dürr, K. 1951. The propositional logic of Boethius. Amsterdam: North Holland Publishing. (Reprinted in 1980 by Greenwood Press, USA).
  • Friedlein, G. 1867. Anicii Manlii Torquati Severini Boetii De Institutione Arithmetica libri duo. De Institutione Musica libri quinque. Accedit Geometria quae fertur Boetii. G. Friedlein (ed.). Leipzig: Teubner.
  • Fuhrmann, M. & Gruber, J. 1984. Boethius. M. Fuhrmann and J. Gruber (eds.). Wege der Forschung, Band 483, Darmstadt.
  • Gibson, M. 1981. Boethius, his life, thought and influence. Gibson, M. (ed.). Oxford: Blackwell.
  • Isaac, I. 1953. Le Peri Hermeneias en Occident de Boèce à Saint Thomas. Histoire littéraire d’un traité d’Aristote, Paris.
  • Kaylor, N.H. & Phillips, P.E. 2012. A Companion to Boethius in the Middle Ages. Kaylor, N.H. & Phillips, P.E (eds.). Leiden/Boston : Brill.
  • Kretzmann, N. 1982. “Syncategoremata, exponibilia, sophismata”. The Cambridge History of Later Medieval Philosophy, pp. 211-214. Cambridge: Cambridge University Press.
  • Lloyd, G.E.R. 1990. “Aristotle’s Zoology and his Metaphysics: the status quaestionis. A critical review of some recent theories”, in Devereux & Pellegrin (1990), pp. 8-35.
  • Lukasiewicz, J. 1951. Aristotle’s Syllogistic. Oxford: Oxford University Press.
  • Magee, J. 1998. Anicii Manlii Severini Boethii De divisione liber. Critical edition, translation, prolegomena and commentary. Leiden/Boston/Köln: Brill.
  • Mair, J. 1981. “The text of the Opuscula Sacra”, pp. 206-213. In Gibson, M. (1981).
  • Marenbon, J. 2003. Boethius. Oxford: Oxford University Press.
  • Marenbon, J. 2009. The Cambridge Companion to Boethius. Cambridge: Cambridge University Press.
  • Maroth, M. 1979. “Die hypothetischen Syllogismen”, in Acta Antiqua 27 (1979), pp. 407-436.
  • Martindale, J. R. 1980. The prosopography of the Later Roman Empire: A.D. 395-527. Cambridge: Cambridge University Press.
  • Matthews, J., 1981. Boethius. His life, thought and influence, in M. Gibson (ed.), Oxford.
  • McKinlay, A. P. 1907. “Stylistic tests and the chronology of the works of Boethius”, in Harvard Studies in Classical Philology, XVIII, pp. 123-156.
  • McKinlay, A.P. 1938. “The De syllogismis categoricis and Introductio ad syllogismos categoricos of Boethius”, in Classical and Mediaeval Studies in honor of E. K. Rand, pp. 209-219.
  • Meiser, C. 1877-1880. Anicii Manlii Severini Boetii Commentarii in Librum Aristotelis PERI ERMHNEIAS. Prima et secunda editio. C. Meiser (ed.), Leipzig.
  • Migne, J.-P. 1891. De Syllogismo Categorico, in Patrologia Latina, 64, vol. 2, J.-P. Migne (ed.), Paris.
  • Migne, J.-P. 1891. Introductio ad Syllogismos Categoricos, in Patrologia Latina, 64, vol. 2, J.-P. Migne (ed.), Paris.
  • Minio-Paluello, L. 1972. Opuscula. The Latin Aristotle, Amsterdam: A. Hakkert.
  • Minio-Paluello, L. 1965. Aristoteles Latinus, II, 1-2, L. Minio-Paluello (ed.), Paris: Desclée de Brouwer.
  • Mommsen, Th. 1894. Cassiodori Senatoris Variae, Monumenta Germaniae Historica, Auctorum Antiquissimorum Tomus XII, Mommsen, Th. (ed.), Berlin: Weidmann.
  • Moreau, J. & Velkov, V. 1968. Excerpta Valesiana. Moreau, J and Velkov, V. (eds.), Leipzig, Academia Scientiarum Germanica Berolinensis: Teubner.
  • Murari, R. 1905. Dante e Boezio, Contributo allo studio delle fonti Dantesche. Bologna: Nicola Zanichelli.
  • Mynors, R.A.B. 1963. Cassiodori Senatori Institutiones, R.A.B. Mynors (Ed.), Oxford: Clarendon Press.
  • Nuchelmans, G. 1973. Theories of the Proposition, Leiden 1973: North Holland.
  • Obertello, L.A.M. 1969. Severino Boezio De hypotheticis syllogismis. Testo, traduzione e commento. Brescia: Paideia Editrice.
  • Prantl, C., 1855. Geschichte der Logik im Abendlande, Leipzig, 1855-1870: G. Olms.
  • Prior, A.N. 1963. “The Logic of the Negative Terms in Boethius”, in Franciscan Studies, 13, vol. I, pp. 1-6.
  • Prior, A.N. 1962. Formal Logic, Oxford: Clarendon Press.
  • Rand, Stewart & Tester. 1990. Boethius, the Theological tractates and The Consolation of philosophy, Translated by H.F. Stewart, E.K. Rand and S.J. Tester. Cambridge Massachusetts/London, England. The Loeb Classical Library: Harvard University Press.
  • Redondo Reyes, P. 2002. La Harmónica de Claudio Ptolomeo: edición crítica, introducción, traducción y comentario. PhD thesis, Murcia, Spain.
  • Sharples, R. 1992. Theophrastus of Eresus. Sources for his Life, Writings, Thought and Influence, vols. i-iii, W.W. Fortenbaugh, P.M. Huby, R.W. Sharples, D. Gutas (Eds.), together with A.D. Barker, J.J. Keaney, D.C. Mirhady, D. Sedley and M.G. Sollenberger. Leiden: Brill.
  • Shiel, J. 1990. “Boethius’ Commentaries on Aristotle”, in Sorabji (1990), pp. 349-372 (also: Medieval and Renaissance Studies 4, 1958, pp. 217-44).
  • Sorabji, R. 1990. Aristotle Transformed. The Ancient Commentators and their Influence. Sorabji, R. (ed.). London: Duckworth.
  • Spade, P.V., 1982. “The semantics of terms”, in The Cambridge History of Later Medieval Philosophy, pp. 190-1. Cambridge: Cambridge University Press.
  • Speca, A. 2001. Hypothetical syllogistic & Stoic logic. Leiden/Boston/Köln: Brill.
  • Stump, E., 1974. “Boethius’ Works on Topics”, in Vivarium, 12, 2, pp. 77-93.
  • Sullivan, M.W., 1967. Apuleian Logic. The Nature, Sources, and Influence of Apuleius’s Peri Hermeneias, in Studies in Logic and the Foundations of Mathematics, Amsterdam: North-Holland.
  • Usener, H. 1877. Anecdoton Holderi: ein Beitrag zur Geschichte Roms in ostgotischer Zeit. Leipzig: Teubner.
  • Thomas, P. 1908. Apulei Opera quae Supersunt, vol iii, Apulei Platonici Madaurensis De Philosophia Libri, liber PERI ERMHNEIAS, Thomas P. (ed.), pp. 176-194, Leipzig: Teubner.
  • Thomas, I., O.P. 1949. “CS(n): An Extension of CS”, in Dominican Studies, pp. 145-160.
  • Thomsen Thörnqvist, C. 2008a. Anicii Manlii Seuerini Boethii De syllogismo categorico. A critical edition with introduction, translation, notes and indexes. Studia Graeca et Latina Gothoburgensia LXVIII, University of Gothenburg: Acta Universitatis Gothoburgensis.
  • Thomsen Thörnqvist, C. 2008b. Anicii Manlii Severini Boethii Introductio ad syllogismos categoricos. A critical edition with introduction, commentary and indexes. Studia Graeca et Latina Gothoburgensia LXIX, University of Gothenburg: Acta Universitatis Gothoburgensis.
  • Van de Vyver, A., 1929. “Les étapes du développement philosophique du haut Moyen-Age”, Revue Belge de Philologie et d’Histoire, viii (1929), pp. 425-452. Brussels: Société pour Le Progrès des Études Philosophiques et Historiques.
  • Wallies (1883): Alexandri in Aristotelis Analyticorum Priorum Librum I Commentarium, M. Wallies (ed.), in Commentaria in Aristotelem Graeca, vol. 2.1, Berlin: G. Reimerum.

 

Author Information

Manuel Correia
Email: mcorreia@uc.cl
Pontifical Catholic University of Chile
Chile

Enactivism

The term ‘enaction’ was first introduced in The Embodied Mind, co-authored by Varela, Thompson, and Rosch and published in 1991. That seminal work provides the first original contemporary formulation of enactivism. Its authors define cognition as enaction, which they in turn characterize as the ‘bringing forth’ of domains of significance through organismic activity that has itself been conditioned by a history of interactions between an organism and its environment.

To understand mentality, however complex and sophisticated it may be, it is necessary to appreciate how living beings dynamically interact with their environments. From an enactivist perspective, there is no prospect of understanding minds without reference to such interactions because interactions are taken to lie at the heart of mentality in all of its varied forms.

Since 1991, enactivism has attracted interest and attention from academics and practitioners in many fields, and it is a well-established framework for thinking about and investigating mind and cognition. It has been articulated into several recognizably distinct varieties distinguished by their specific commitments. Some versions of enactivism, such as those put forward by Thompson and Di Paolo and others, focus on expanding and developing the core ideas of the original formulation of enactivism advanced by Varela, Thompson, and Rosch. Other versions of enactivism, such as sensorimotor knowledge enactivism and radical enactivism, incorporate other ideas and influences in their articulation of enactivism, sometimes leaving aside and sometimes challenging the core assumptions of the original version of enactivism.

Table of Contents

  1. Core Commitments
  2. Contemporary Varieties of Enactivism
    1. Original Enactivism
      1. Biological Autonomy
      2. Bringing Forth Domains of Significance
      3. Phenomenological Connections
      4. Buddhist Connections
      5. Sense-Making
    2. Sensorimotor Knowledge Enactivism
    3. Radical Enactivism
  3. Forerunners
  4. Debates
  5. Applications and Influence
  6. Conclusion
  7. References and Further Reading

1. Core Commitments

What unifies different articulations of enactivism is that, at their core, they all look to living systems to understand minds, and they conceive of cognition as embodied activity. In enactivist terms, perceiving, imagining, remembering, and even the most abstract forms of thinking are to be understood, first and foremost, as organismic activities that dynamically unfold across time and space.

Enactivists conceive of the embodied cognitive activity that they take to constitute cognition as fundamentally interactive in at least two ways. First, the manner and style of any given bout of cognitive activity are conditioned by the cognizer’s prior history of engagement with environments and the particularities of the current environment with which they are actively engaged. Second, cognizers shape their environments and are, in turn, shaped by them in a variety of ways across multiple timescales.

A cornerstone commitment of enactivism is that minds arise and take shape through the precarious self-creating, self-sustaining, adaptive activities of living creatures as they regulate themselves by interacting with features of their environments. To take a central case, an organism’s characteristic patterns of sensorimotor interaction are deemed to be shaped by its prior history of active engagement with aspects of its environment. Its past engagements reinforce and tend to perpetuate its sensorimotor habits and tendencies. Yet organisms are not wholly creatures of past habits. Living beings always remain flexibly open to adjusting their repertoires and ways of doing things in novel ways. Cognition, which takes the form of patterns of open-ended, flexible, extended spatio-temporal activity, is thus deemed ‘autonomous’ in the sense that it unfolds in ways that are viable for sustaining itself and that are not externally regulated or pre-programmed.

Enactivists regard an organism’s environment as a domain of significance populated with items of relevance, not as a neutral setting that can be adequately characterized in, say, purely physical terms. Importantly, in this regard, organisms are said to ‘enact’ or ‘bring forth’ their ‘worlds’. Organisms not only adapt to and are shaped by their environments; they also dynamically fashion, curate, and adapt them. Through such activity and exchanges, both organisms and their environments are transformed and, in an important sense, brought into being. Enactivists often explicate the unprescribed bi-directional influence of organisms on their environments, and vice versa, poetically, using the metaphor of “laying down a path in walking”.

Another signature enactivist idea is that qualitative, phenomenal aspects of lived experience—what it is like to experience something—are an achievement of organismic activity. To take a central case, perceptual experience arises and takes shape through an organism’s active exploration of aspects of its environment. It is through such engaged efforts and the specific ways they are carried out that organisms experience the world in particular ways. Accordingly, organismic activities of certain kinds are required to achieve phenomenal access to aspects of the world or for things to ‘show up’ or “to be present” phenomenally.

Minds, conceived in enactivist terms, operate in ways that are fundamentally unlike those of mechanisms that are driven entirely by externally sourced programs and algorithms. Enactivism thus sees itself as directly opposing the views of cognition that understand it as essentially computational and representational in nature. In its original formulation, enactivism strongly rejects the idea that minds are in the business of collecting, transforming, and representing information sourced from a pre-given world that is assumed to exist independently of and prior to organisms. Strikingly, to conceive of cognition in line with the original version of enactivism entails holding that when organisms actively engage with aspects of their worlds, they always do so in mentality-constituting ways. Yet, enactivists hold that such cognitive activity neither involves constructing representations of those worlds based on retrieved information nor does it depend on any kind of computational processing. So conceived, enactivism rejects the longstanding idea that the core business of cognition is to represent and compute, and, concomitantly, it rejects the familiar explanatory strategies of orthodox cognitive science.

Enactivism is a significant philosophical enterprise because, at least under standard interpretations, it offers a foundational challenge to cognitivist accounts of mind—those that conceive of mentality in representational and computational terms. Enactivists regard such conceptions of mind, which dominate much mainstream analytic philosophy and cognitive science, not only as resting on a mistaken theoretical foundation but as presenting a tempting picture of mentality that, practically, subverts efforts to develop a healthier and more accurate understanding of ourselves and our place in nature.

2. Contemporary Varieties of Enactivism

There are several importantly different versions of enactivism occupying the contemporary philosophical landscape.

a. Original Enactivism

The Embodied Mind by Varela, Thompson, and Rosch, published in 1991, is the locus classicus of enactivism. That landmark work is widely regarded as formulating and advancing the most influential statement of enactivism in recent times. It is credited with being “the first and among the most profound” of the many and various enactivist offerings that have followed in its wake (Kabat-Zinn 2016, p. xiii).

Enactivism, as originally formulated, is not a neatly defined or finished theory. It is variously described in the literature as a broad, emerging ‘perspective’, ‘approach’, ‘paradigm’, or ‘framework’ for understanding mind and cognition (see, for instance, Varela, Thompson and Rosch 1991; Baerveldt and Verheggen 2012; Stewart and others 2010; Gallagher 2017). Enactivism is not a finished product; it continues to evolve as new versions of enactivism emerge which adjust, add to, or reject certain core and peripheral commitments of the original version.

Though the original version of enactivism resists definition in terms of a set of central theses, it does have distinctive features. There are three key and recurring themes emphasized in the original statement of enactivism. The first theme is that understanding organismic biological autonomy is the key to understanding minds. Original enactivism assumes that there is deep continuity between life and mind, such that understanding the biological autonomy of organisms sheds direct light on cognition. The second theme is that minds cannot be understood without coming to terms with subjective, lived experience, and consciousness. The third theme is that non-Western traditions, and in particular, Buddhist philosophy and its practices of meditation and mindfulness, should play a significant role in reforming and rethinking the future sciences of the mind, both theoretically and practically.

The original version of enactivism put forward in The Embodied Mind has been successively developed and expanded upon, mainly in works by Thompson, Di Paolo, and their co-authors (principally Thompson 2007, Froese and Di Paolo 2011, McGann and others 2013, Di Paolo and others 2017, Di Paolo and others 2018, Di Paolo 2018, 2021). Some speak of these works, collectively, as constituting and contributing to a variety of autopoietic enactivism (Hutto and Myin 2013, 2017, Ward and others 2017, Stapleton 2022). The label, which now has some purchase, was chosen because the original version of enactivism and those that seek directly to expand on it are united in looking to biological autonomy to understand the fundamentals of mindedness. Crucially, all enactivists of this stripe embrace the notion of autopoiesis, the self-creating and self-sustaining activity of living systems, as a common theoretical starting point, having been inspired by “the work of the biologists Humberto Maturana and Francisco Varela” (Baerveldt and Verheggen 2012, p. 165; see Maturana and Varela, 1980, 1987). Nevertheless, the label autopoietic enactivism is contested (see, for example, Thompson 2018, Netland 2022). It is thought to be misleading because, although these enactivists build upon the work of Varela and Maturana, they have added significant resources, expanding upon and modifying the initial conception of autopoiesis in their efforts to explicate key aspects of biological autonomy, namely, recognizing its teleological character (see, for instance, Thompson 2007, p. 127; Di Paolo 2009, p. 12; Di Paolo and others 2018, p. 37). As such, enactivists working on these topics deem autopoiesis, as originally conceived, to be, at most, necessary but insufficient for important world-involving forms of cognition (see Thompson 2007, pp. 149-150; see also p. 127). For these reasons, Barandiaran (2017) recommends the label autonomist enactivism instead. However, given these nuances, it may be safer and more accurate to speak of these positions simply as variants of original enactivism.

The primary aim of the original version of enactivism was to address the problem of understanding how lived experience fits into the world, as described by science, including cognitive science. On the face of it, the two appear irreconcilably different from one another. Thompson (2016) puts the apparent dilemma that motivated the first formulation of enactivism in terms of a hard choice: either “accept what science seems to be telling us and deny our experience… or hold fast to our experience and deny science” (p. xix).

The original version of enactivism was born from the aspiration of finding a way for cognitive science to give appropriate attention to lived experience. One of its key assumptions is that “we cannot begin to address… [the gap between science and experience] without relying on some kind of phenomenology, that is, on some kind of descriptive account of our experience in the everyday world” (Thompson 2016, p. xx).

Enactivism rejects mainstream conceptions of mind that strongly demarcate minds from bodies and environments. It holds that such conceptions are not justified and should be rethought. Enactivism aims to eradicate misleading dualisms that continue to dominate analytic philosophy of mind and much cognitive science. It aims to dissolve the mind-body problem by asking us to abandon our attachment to traditional dichotomies and to come to see that minds are not ultimately separate from bodies, environments, or others.

Original enactivism seeks to put the mind-body problem to rest once and for all. It also rejects the traditional input-output processing model of the mind, a model which pays homage, often explicitly, to the idea that minds are furnished by the senses by accepting that the senses supply minds with information about the external world. Original enactivism rejects this familiar characterization of mental activity, denying that minds ever pick up or process information from the environment. Concomitantly, original enactivism rejects the idea that minds are fundamentally information processing systems that manipulate informational content by categorizing, conceptualizing, and schematizing it by means of representational-computational processes. By also pressing us to radically rethink key notions (self, nature, and science), original enactivism aims to usher in “a new kind of cognitive science” (Rosch 2016, p. xxxv). So conceived, enactivism seeks to revolutionize and massively reform the sciences of the mind.

Embracing original enactivism entails rethinking foundational mainstream theoretical assumptions that are prevalent in much analytic philosophy of mind and cognitive science. Importantly, in this vein, original enactivists advocate not only for changes to our theoretical mindset but also for changes in existing practices and approaches we use in the cognitive sciences and cognate domains that study and engage with minds. Thus, the original version of enactivism envisions that future sciences of the mind will recognize and work with “another mode of knowing not based on an observer and observed” (Rosch 2016, p. x). Original enactivism, thus, issues a normative demand to create a space in which those working to understand and expand our lived experience can speak to and understand empirically focused scientists of the mind. In such a setting, there would be a dynamic and interactive ‘circulation’ and cross-fertilization of theory and practice (Thompson 2016, 2017).

This is the sense in which original enactivism seeks to provide “a framework for a far-reaching renewal of cognitive science as a whole” (Stewart, Gapenne, and Di Paolo 2010, p. viii).

It is an open question just how much of the ambition of original enactivism has been achieved, but it is undeniable that much has changed in the fields of philosophy and the sciences of the mind since its debut. Thompson (2016) summarizes the current state of the art:

The idea that there is a deep continuity in the principles of self-organization from the simplest living things to more complex cognitive beings — an idea central to Varela’s earlier work with neurobiologist Humberto Maturana — is now a mainstay of theoretical biology. Subjective experience and consciousness, once taboo subjects for cognitive science, are now important research topics, especially in cognitive neuroscience. Phenomenology now plays an active role in the philosophy of mind and experimental cognitive science. Meditation and mindfulness practices are increasingly used in clinical contexts and are a growing subject of investigation in behavioral psychology and cognitive neuroscience. And Buddhist philosophy is increasingly recognized as an important interlocutor in contemporary philosophy (p. xix).

Notably, there have been efforts to transform the way the science of intersubjectivity is itself conducted by getting researchers to participate, at once, both as subjects and objects of research. Details of this method, called PRISMA, are set out in De Jaegher and others (2017). Thompson (2017) praises this work for being “clearly animated by the full meaning of enaction as requiring not just a change in how we think but also in how we experience” (p. 43). For a related discussion of how cognitive science practice might change by giving due attention to dynamically evolving experience, see McGann (2022).

i. Biological Autonomy

All living systems (from simple cells to whole organisms, whether the latter are single-celled bacteria or human beings) actively individuate themselves from other aspects of their environments and maintain themselves by engaging in a constant “dynamical exchange of energy and matter that keeps the inside conditions just right for life to perpetuate itself” (Kabat-Zinn 2016, p. xiv). This is all part of the great game of life: staying far enough away from thermodynamic equilibrium, the state of maximal entropy, to survive.

Enactivists emphasize the autopoietic character—the self-creating and self-individuating results—of the activity through which living systems adaptively produce and maintain vital boundaries and relationships between themselves and what lies beyond them (Varela and others, 1991; Thompson, 2007). Accordingly, “organisms actively and continuously produce a distinction between themselves and their environment where none existed before they appeared and where none will remain after they are gone” (Di Paolo and others 2018, p. 23).

What determines the boundaries of a given organism? Where does a given organism end and the environment begin? Enactivists seek to answer such questions by pointing to the fact that living systems are organizationally and operationally closed, which is to say that they are “constituted as a network of interdependent processes, where the behavior of the whole emerges from the interaction dynamics of its component parts” (Barandiaran 2017, p. 411, see also Di Paolo and Thompson 2014, Di Paolo and others 2018; Kirchhoff 2018a).

The basic idea of operational closure is that self-defining autopoietic processes can be picked out by the fact that they exist in mutually enabling networks of circular means-end activities, such that “all of the processes that make up the system are enabled by other processes in the system” (Di Paolo and others 2018, p. 25). Operational closure is evident in the self-sustaining autonomous activity of, for example, metabolic networks in living systems. External influences —such as, say, the effects of sunlight being absorbed by chlorophyll —are any influences that are not mutually enabled or produced by processes within such a closed system.

The exact boundaries of a self-producing, self-individuating living system can be flexible. In this regard, Di Paolo and others (2018) cite the capacity of some insects and spiders to breathe underwater for certain periods of time. They manage to do this by trapping air bubbles in the hair on their abdomens. In such cases, these environmental features become part of the self-individuating enabling conditions of the organism’s operationally closed network: “These bubbles function like external gills as the partial pressure of oxygen within the bubble, diminished by respiration, equilibrates with that of the water as the external oxygen flows in” (Di Paolo and others 2018, p. 28, see also Turner 2000).

When we consider concrete cases, it is evident that autopoietic processes of self-production and self-distinction require living systems to continuously adjust to features of their environment. This involves the “selective opening and selective rejection of material flows—in other words, an adaptive regulation of what goes in and what stays out” (Di Paolo and others 2018, p. 40).

Adaptive regulation requires flexibility. It requires simultaneous adjustments at multiple timescales and various levels, where each adjustment must be responsive to particular speeds and rhythms at the scale required to meet specific thresholds. This is why the business of being and staying alive is necessarily complex, forever unfinished, precarious, and restless (Di Paolo and others, 2017; 2018). Though there is room for error, minimally, organisms that survive and propagate must actively avoid engaging in behaviors that are overly maladaptive.

Enactivists hold that such adaptive activity is autonomous. Living systems establish their own unprincipled norms of operation, norms that arise naturally from the activity of staying alive and far from thermodynamic equilibrium. It is because organisms generate their own norms through their activities that enactivists speak of them as having an immanent teleology (Thompson 2007, Di Paolo and others 2018).

It is claimed that this notion of autonomy is the very core of enactivism (Thompson 2007, Barandiaran 2017, p. 409; Di Paolo and others, 2018, p. 23). It is regarded as a notion that, strictly speaking, goes “beyond autopoiesis” (Di Paolo and others 2018, p. 25).

Enactivists contend that the fact that living systems are autonomous in the precise sense just defined is what distinguishes them from wholly lifeless, heteronomous machines of the sort that are driven only by external, exogenous instructions. A core idea of enactivism is that “the living body is a self-organizing system.” To think of living bodies in this way “contrasts with viewing it as a machine that happens to be made of meat rather than silicon” (Rosch 2016, p. xxviii). In line with this understanding, enactivists hold that organismic processes “operate and self-organize historically rather than function” (Di Paolo and others 2018, p. 20). It is precisely because organisms must always be ready to adjust to new possibilities and circumstances that the self-organizing activity of living systems cannot be governed by instructions in a functionally pre-specified manner (see Barandiaran 2017, p. 411).

Enactivists hold that autonomous norm generation is a feature of all modes and styles of cognitive activity, not just of basic organismic self-production, self-individuation, and self-maintenance. Di Paolo and others (2018), for example, identify two important dimensions of autonomous self-regulation beyond the basic cycles of regulation that sustain living organisms: cycles of sensorimotor interactions involved in action, perception, and emotion, and cycles of intersubjectivity involved in social engagements with others (Di Paolo and others 2018, p. 22).

ii. Bringing Forth Domains of Significance

Connected with their understanding of biological autonomy, enactivists reject the idea that organisms simply adapt to features of a pre-existing, neutrally characterized physical world. Instead, they hold that organisms are attuned to features of environments or domains that are significant to them —environments that organisms themselves bring into being. It is on this basis that enactivists “conceive of mental life as the ongoing meaningful engagement between precariously constituted embodied agents and the worlds of significance they bring forth in their self-asserting activity” (Di Paolo and others 2018, p. 20). Hence, a central thesis of enactivism is that “cognition is not the grasping of an independent, outside world by a separate mind or self, but instead the bringing forth or enacting of a dependent world of relevance in and through embodied action” (Thompson 2016, p. xviii).

In this view, organisms and environments dynamically co-emerge. The autonomous adaptive activity of organisms “brings forth, in the same stroke, what counts as other, the organism’s world” (Thompson 2007, p. 153). The pre-existing world, as characterized by physics and chemistry, is not equivalent to an organism’s environment. The latter, which is effectively captured by von Uexküll’s (1957) notion of an Umwelt, is a subset of the physicochemical world that is relevant to the organism in question. This environmental domain of significance or relevance for organisms, enactivists hold, is brought into being through the activity of organisms themselves.

For example, sucrose only serves as food for a bacterium because it has certain physical and chemical properties. Yet without organisms that use it as a nutrient, sucrose, understood merely as something that exists as part of the physicochemical world, is not food. Hence, that it is food for bacteria depends not only, or even primarily, on the physicochemical properties of sucrose itself but chiefly on the existence and properties of bacteria, properties connected to their metabolic needs and processes that they brought into being. Although, taking the stance of scientists, we can and do speak of aspects of an organism’s environment using the language of physics and chemistry, describing them in organism-neutral terms, it is only if we recognize the significance that such worldly features have for the organism that we are able to pick out the right aspects of the world that are relevant or important to it.

On the face of it, to suggest that organisms ‘bring forth’ or ‘enact’ their own environments may appear to be an extravagant thesis. Yet it finds support in the seminal work of biologists, principally Gould and Lewontin (1979), who question accounts of Darwinian adaptationism in two key respects. They reject construing natural selection as an external evolutionary force that separately targets and optimizes individuated organismic traits. They also reject the idea that natural selection fashions organisms to better compete against one another for the resources of the pre-existing physical world (for further details, see Godfrey-Smith 2001). In the place of strong adaptationism, original enactivists propose to understand evolution in terms of natural drift, seeing it as a holistic, “ongoing process of satisfaction that triggers (but does not specify) change in the form of viable trajectories” (see a full summary in Varela and others 1991, pp. 196–197 and also Maturana and Mpodozis 2000).

A major focus of the critique of adaptationism is the rejection of the idea that a living creature’s environment is an external, “preexistent element of nature formed by autonomous forces, as a kind of theatrical stage on which the organisms play out their lives” (Lewontin and Levins 1997, p. 96, Lewontin 2000).

Putting pressure on the idea that organisms simply adapt to a neutrally characterized external world, Lewontin and Levins (1997) observe that not all worldly forces affect every organism equally. In some cases, some forces greatly affect certain organisms, while the same forces matter to other creatures hardly at all. The all-pervasive force of gravity provides a shining example. All middle-sized plants and animals must contend with it. Not only does gravity affect the musculoskeletal, respiratory, and circulatory systems of such organisms, but it also affects their individual cells. Gravity influences cell size and processes such as mechanotransduction, by which cells electrochemically respond, at micro-timescales, to mechanical features and forces in the environment. Hence, even on a microlevel, gravity matters for such cognitively important activities as hearing, proprioception, touch, and balance. Due to their size, other organisms, however, must contend with and are shaped in their activities by other forces. For microorganisms, it is Brownian motion, not gravity, that matters most to their lives. It is reported that some microbes can survive the hypergravity of extraterrestrial, cosmic environments, which exert a gravitational force up to 400,000 times greater than that found on Earth (Deguchi and others 2011). This is one reason why bacteria “are ubiquitous, present in nearly every environment from the abyssal zone to the stratosphere at heights up to 60 km, from arctic ice to boiling volcanoes” (Sharma and Curtis 2022, p. 1).

These reminders support the enactivist claim that the relationship between organism and environment is dialectical —that the one cannot exist without the other. Maintaining that organisms and their environments reciprocally codetermine one another, defenders of this view of biological development hold that:

Environments are as much the product of organisms as organisms are of environments. There is no organism without an environment, but there is no environment without an organism. There is a physical world outside of organisms, and that world undergoes certain transformations that are autonomous. Volcanoes erupt, and the earth precesses on its axis of rotation. But the physical world is not an environment; only the circumstances from which environments can be made are (Lewontin and Levins 1997, p. 96).

Moreover, the relationship between organisms and their environments is not static; it coevolves dynamically over time: “As the species evolves in response to natural selection in its current environment, the world that it constructs around itself is actively changed” (Thompson 2007, p. 150). Lewontin and Levins (1997) provide a range of examples of how organisms relate to and actively construct their environments. These include organisms regulating ambient temperatures through the metabolic production of shells of warm, moist air around themselves and plant roots producing humic acids that alter the physicochemical structure of soil to help them absorb nutrients.

Looking to these foundations, Rolla and Figueiredo (2021) further explicate the evolutionary dynamics by which organisms can be said to, literally, bring forth their worlds. Drawing on the notion of niche construction, theirs is an effort to show that “enactivism is compatible with the idea of an independent reality without committing to the claim that organisms have cognitive access to a world composed of properties specified prior to any cognitive activity”. For more on the notion of niche construction, and why it is thought to be needed, see Laland and others (2014), Laland and others (2016), and Werner (2020).

iii. Phenomenological Connections

In line with its overarching aim, original enactivism seeks to give an account of “situated meaningful action that remains connected both to biology and to the hermeneutic and phenomenological studies of experience” (Baerveldt and Verheggen 2012, p. 165; see also Stapleton and Froese 2016, Netland 2022).

Original enactivism owes a great deal to the European tradition of phenomenology in that its account of virtual milieus and vital norms is inspired by Merleau-Ponty’s The Structure of Behaviour and, especially, his notion of “the lived body” (Kabat-Zinn 2016, p. xiii). Virtual milieus and their properties are not something found ‘objectively’ in the world; rather, they are enacted or brought forth by organisms. Organisms not only enact their environments (in the sense that sucrose might become food for certain creatures); they also enact their qualitative, felt experiences of the world. In this vein, enactivists advance the view that “our perceived world [the world as perceived]…is constituted through complex and delicate patterns of sensorimotor activity” (Varela and others 1991, p. 164).

By appealing to arguments from biology, enactivists defend the view that organisms and their environments are bound together in ways that make it impossible to characterize one without reference to the other when it comes to understanding mental life. They apply the same reasoning to the qualitative, phenomenally conscious aspects of mind, holding, for example, that “we will not be able to explain colour if we seek to locate it in a world independent of our perceptual capacities” (Varela and others 1991, p. 164). This is not meant to be a rejection of mind-independent realism in favor of mind-dependent idealism. Defenders of the original version of enactivism offer this proposal explicitly as providing a ‘middle way’ between these familiar options. By their lights, “colours are not ‘out there’ independent of our perceptual and cognitive capacities…[but equally] colors are not ‘in here’ independent of our surrounding biological and cultural world” (Varela and others 1991, p. 172).

For enactivists, colors cannot be understood independently of the very particular ways that experiencing beings respond to specific kinds of worldly offerings. Accordingly, it is not possible to think about the nature of colors qua colors without also referencing those ways of interacting with environmental offerings. This claim rests on two assumptions. First, the way colors appear to organisms —the way they experience them —is essential to understanding the nature of colors as such. Second, such experiential properties are inescapably bound up with organismic ways of responding to aspects of their environments.

Importantly, though enactivists deny that colors are objective properties of the world independent of organisms that perceive them, they neither claim nor imply that colors are wholly mind-dependent properties in the sense associated with classical Berkeleyan idealism as it is standardly portrayed.

Furthermore, it is precisely because enactivists hold that an organism’s ways of responding to aspects of its environment are not inherently representational or representationally mediated that “color provides a paradigm of a cognitive domain that is neither pregiven nor represented but rather experiential and enacted” (Varela and others 1991, p. 171). This conclusion is meant to generalize, applying to all phenomenological structures and aspects of what is brought forth by organisms as domains of significance through their autonomous activity.

In this regard, in its original formulation, enactivism drew on “significant resources in the phenomenological tradition for rethinking the mind” (Gallagher 2017, p. 5). Apart from explicitly borrowing from Merleau-Ponty, Varela and others (1991) also aligned their project with other classic thinkers of the phenomenological tradition, such as Husserl and Sartre, to some extent.

For example, although the enactivists wished to steer clear of what Hubert Dreyfus interpreted as Husserl’s representationalist leanings, they acknowledge the prime importance of his efforts to “develop a specific procedure for examining the structure of intentionality, which [for him] was the structure of experience itself” (Varela and others 1991, p. 16). For this reason, and by contrast, they explicitly oppose and criticize the cognitivist conviction that there is “a fundamental distinction between consciousness and intentionality” (p. 56). By their lights, drawing such a distinction creates a mind-mind problem and disunifies our understanding of the cognizing subject.

Nevertheless, despite borrowing in key respects from the Western phenomenological tradition, when formulating their initial statement of enactivism, Varela and others (1991) also criticized that tradition for, allegedly, being overly theoretical in its preoccupations. According to their assessment at the time, phenomenology “had gotten bogged down in abstract, theoretical reflection and had lost touch with its original inspiration to examine lived experience in a rigorous way” (Thompson 2016, pp. xx–xxi). This critical take on phenomenology motivated the original enactivists to “turn to the tradition of Buddhist philosophy and mindfulness-awareness meditation as a more promising phenomenological partner for cognitive science” (Thompson 2007, p. 413).

In time, Thompson and Varela, at least, came to revise original enactivism’s negative verdict concerning phenomenology’s limitations, both in their analysis of the specious present and in their work with Natalie Depraz. In his later writings, Thompson admits that the authors of The Embodied Mind wrongly gave short shrift to phenomenology. For example, in conceding that they had relied too heavily on second-hand sources and had not given careful attention to the primary texts, Thompson makes clear that the original enactivists came to hold, mistakenly, that Husserl sponsored an unwanted brand of representationalism (see Thompson 2007, appendix A; Thompson 2016).

Many contemporary enactivists, including Thompson, openly draw on and seek to renovate ideas from the phenomenological tradition, connecting them directly with current theorizing in the cognitive sciences (Gallagher 2005, Gallagher and Zahavi 2008/2021, Gallagher 2017). As Gallagher notes, for example, there has been new work in this vein on “Husserl’s concept of the ‘I can’ (the idea that I perceive things in my environment in terms of what I can do with them); Heidegger’s concept of the pragmatic ready-to-hand (Zuhanden) attitude (we experience the world primarily in terms of pre-reflective pragmatic, action-oriented use, rather than in reflective intellectual contemplation or scientific observation); and especially Merleau-Ponty’s focus on embodied practice” (Gallagher 2017, p. 5).

iv. Buddhist Connections

A major source of inspiration for original enactivists comes from Buddhist philosophy and practice. Thompson remarks in an interview that, to his knowledge, The Embodied Mind “was the first book that related Buddhist philosophy to cognitive science, the scientific study of the mind, and the Western philosophy of mind” (Littlefair 2020).

Speaking on behalf of the authors of The Embodied Mind, Rosch (2016) reports that “We turned to Buddhism because, in our judgment, it provided what both Western psychology and phenomenology lacked, a disciplined and nonmanipulative method of allowing the mind to know itself—a method that we (in retrospect naively) simply called mindfulness” (Rosch 2016, p. xli). Although the original enactivists turned to Buddhist philosophy and psychology because of a mistaken assessment of what Western phenomenology has to offer, enactivism continues to seek fruitful dialogues between Buddhist and Western traditions of philosophy of mind. Enactivism has helped to promote the recognition that phenomenological investigations need not be limited to work done in the European tradition.

There are potential gains to be had from conducting dialogues across traditions of thought for at least two reasons. Sometimes those working in a different tradition focus on phenomena unnoticed by other traditions. And sometimes those working in a different tradition offer novel observations about phenomena that are already of common interest. Recognizing the potential value of such dialogues, enactivists have a sustained interest in what Asian traditions of thought and practice have to offer when it comes to investigating and describing experience, and “in particular the various Buddhist and Hindu philosophical analyses of the nature of the mind and consciousness, based on contemplative mental training” (Thompson 2007, p. 474).

Inspired by these efforts at cross-fertilization, Varela initially formulated neurophenomenology, which was subsequently taken up by others (Varela 1996, 1999, Thompson 2007). Neurophenomenology was developed as a novel approach to the science of consciousness —one that incorporates empirical studies of mindful, meditative practice with the aim of getting beyond the hard problem of consciousness. Although, as a practical approach to the science of consciousness, neurophenomenology certainly breaks new ground, it has been criticized for failing to adequately address the theoretical roots of the hard problem of consciousness, which are grounded in particular metaphysical commitments (see, for example, Kirchhoff and Hutto 2016 and replies from commentators).

Another enactivist borrowing from Buddhist philosophy, of a more theoretical bent, is the claim that cognition and consciousness are absolutely groundless —that they are ultimately based only on empty co-dependent arising. Thompson (2016) reports that the original working title of The Embodied Mind was Worlds Without Grounds. That initial choice of title, though later changed, shows the centrality of the idea of groundlessness for the book’s authors. As Thompson explains, the notion of groundlessness in Buddhist philosophy is meant to capture the idea “that phenomena lack any inherent and independent being; they are said to be ‘empty’ of ‘own being’” (p. xviii).

The original enactivists saw a connection between the Buddhist notion of groundlessness and their view that cognition only arises through viable organismic activity and histories of interaction that are not predetermined. For them, the idea that cognition is groundless is supported by the conception of evolution as natural drift. Accordingly, they maintain that “our human embodiment and the world that is enacted by our history of coupling reflect only one of many possible evolutionary pathways. We are always constrained by the path we have laid down, but there is no ultimate ground to prescribe the steps that we take” (Varela and others 1991, p. 213). Or, as Thompson (2016) puts it, “Cognition as the enaction of a world means that cognition has no ground or foundation beyond its own history” (p. xviii).

Thompson (2021) has emphasized the apparently far-reaching consequences this view has for mainstream conceptions of science and nature. To take it fully on board is to hold that ultimate reality is ungraspable, that it is beyond conception, or that it is not ‘findable under analysis’. He observes that, on the face of it, the traditional Mahāyāna Buddhist idea of ‘emptiness’ (śūnyatā, the lack of intrinsic reality) appears to be at odds with standard, realist, and objectivist conceptions of scientific naturalism. This raises a deep question of what taking these Buddhist ideas seriously might mean “for scientific thinking and practice” (Thompson 2021, p. 78). Others too have sought to work through the implications of taking enactivist ideas seriously when thinking about an overall philosophy of nature (Hutto and Satne 2015, 2018a, 2018b; Gallagher 2017, 2018b; Meyer and Brancazio 2022). These developments raise the interesting question: To what extent, and at what point, might enactivist revisions to our understanding and practice of science come into direct tension with and begin to undermine attempts to make the notion of autonomous agency credible by “providing a factual, biological justification for it” (Varela 1991, p. 79)?

v. Sense-Making

A foundational, signature idea associated with the original version of enactivism and its direct descendants is that the autonomous agency of living systems and what it entails are a kind of sense-making. The notion of sense-making made its debut in the title of a presentation that Varela delivered in 1981, and the idea’s first published expression arrived with the publication of that presentation, as follows: “Order is order, relative to somebody or some being who takes such a stance towards it. In the world of the living, order is indeed inseparable from the ways in which living systems make sense, so that they can be said to have a world” (Varela 1984, p. 208; see Thompson 2011 for further discussion of the origins of the idea). The idea that living systems are sense-making systems has proved popular with many enactivists, although interestingly, there is no explicit mention of sense-making in The Embodied Mind.

Sense-making is variously characterised in the literature. Sometimes it is characterised austerely, serving simply as another name for the autonomous activity of living systems. In other uses, it picks out, more contentiously, what is claimed to be directly entailed by the autonomous activity of living systems. In the latter uses, different authors attribute a variety of diverse properties to sense-making activity in their efforts to demonstrate how phenomenological aspects of mind derive directly from, or are otherwise somehow connected with, the autonomous agency of living systems. Making the case for links between life and mind can be seen, broadly, as a continuation of Varela’s project “to establish a direct entailment from autopoiesis to the emergence of a world of significance” (Di Paolo and others 2018, p. 32).

At its simplest, sense-making is used to denote the autonomous agency of living systems. For example, that is how the notion is used in the following passages:

Living is a process of sense-making, of bringing forth significance and value. In this way, the environment becomes a place of valence, of attraction and repulsion, approach or escape (Thompson 2007, p. 158).

Sense-making is the capacity of an autonomous system to adaptively regulate its operation and its relation to the environment depending on the virtual consequences for its own viability as a form of life (Di Paolo and others 2018, p. 33).

Such an identification is at play when it is said that “even the simplest organisms regulate their interactions with the world in such a way that they transform the world into a place of salience, meaning, and value —into an environment (Umwelt) in the proper biological sense of the term. This transformation of the world into an environment happens through the organism’s sense-making activity” (Thompson and Stapleton 2009, p. 25). However, Di Paolo and others (2017) go further, claiming that “it is possible to deduce from processes of precarious, material self-individuation the concept of sense-making” (p. 7).

Enactivists add to this basic explication of sense-making, claiming that the autonomous activity of living systems is equivalent to, invariably gives rise to, entails, or is naturally accompanied by a plethora of additional properties: having a perspective, intentionality, interpretation, making sense of the world, care, concern, affect, values, evaluation, and meaning.

Thompson (2007) explains that the self-individuating and identity-forging activity of living systems “establishes logically and operationally the reference point or perspective for sense-making and a domain of interactions” (p. 148). It is claimed that such autonomous sense-making activity establishes “a perspective from which interactions with the world acquire a normative status” (Di Paolo and others 2018, p. 32). Di Paolo and others (2017) appear to add something more to this explication when they take sense-making to be equivalent to an organism not only having a perspective on things but having “a perspective of meaning on the world invested with interest for the agent itself” (p. 7).

Thompson (2007) tells us that, according to Varela, sense-making “is none other than intentionality in its minimal and original biological form” (Thompson 2007, p. 148; see Varela 1997a, Thompson 2004). This fits with the account of intentionality provided in The Embodied Mind, according to which “embodied action is always about or directed toward something that is missing… actions of the system are always directed toward situations that have yet to become actual” (Varela and others 1991, p. 205). In their classic statement of this view, the original enactivists held that intentionality “consists primarily in the directedness of action… to what the system takes its possibilities for action to be and to how the resulting situations fulfill or fail to fulfill these possibilities” (Varela and others 1991, pp. 205–206).

Talk of sense-making, despite the minimal operational definition provided above, is sometimes used interchangeably and synonymously with the notion that organisms make sense of their environments. This locution is at the heart of Varela’s initial presentation of the view in Varela (1984), and others retain the language. Thompson (2007) tells us that “an autopoietic system always has to make sense of the world so as to remain viable” (pp. 147–148). He also tells us, “Life is thus a self-affirming process that brings forth or enacts its own identity and makes sense of the world from the perspective of that identity” (Thompson 2007, p. 153). Rolla and Huffermann (2021) describe enactivists as committed to the claim that “organisms make sense of their environments through autopoiesis and sensorimotor autonomy, thereby establishing meaningful environmental encounters” (p. 345).

Enactivists also regard sense-making as the basis for values and evaluations, as these, they claim, appear even in the simplest and most basic forms of life (see, for example, Rosch 2016). This claim connects with the enactivist assumption that all living things have intrinsic purposiveness and an immanent teleology (Thompson 2007, Di Paolo and others 2018, see also Gambarotto and Mossio 2022).

Certain things are adaptive or maladaptive for organisms, and, as such, through their active sense-making, they tend to be attracted to the former and repulsed by the latter (Thompson 2007, p. 154). Accordingly, it is claimed that organisms must evaluate whatever they encounter. For example, a sense-making system “… ‘evaluates’ the environmental situation as nutrient-rich or nutrient-poor” (Di Paolo and others 2018, p. 32). It is claimed that such evaluation is necessary given that the “organism’s ‘concern’… is to keep on going, to continue living” (Di Paolo and others 2018, p. 33). Moreover, it is held that the autonomous sense-making activity of organisms generates norms that “must somehow be accessible (situations must be accordingly discernible) by the organism itself” (Di Paolo and others 2018, p. 32). So conceived, we are told that “sense-making… lies at the core of every form of action, perception, emotion, and cognition, since in no instance of these is the basic structure of concern or caring ever absent. This is constitutively what distinguishes mental life from other material and relational processes” (Di Paolo and others 2018, p. 33).

Those who have sought to develop the idea of sense-making also maintain that “cognition is behaviour in relation to meaning… that the system itself enacts or brings forth on the basis of its autonomy” (Thompson 2007, p. 126). In this regard, Cappuccio and Froese (2014) speak of an organism’s “active constitution of a meaningful ‘world-environment’ (Umwelt)” (p. 5).

Importantly, Thompson (2007) emphasizes that sense-making activity not only generates its own meaning but also simultaneously responds to it. He tells us that “meaning is generated within the system for the system itself —that is, it is generated and at the same time consumed by the system” (p. 148). This idea comes to the fore when he explicates his account of emotional responding, telling us that “an endogenously generated response… creates and carries the meaning of the stimulus for the animal. This meaning reflects the individual organism’s history, state of expectancy, and environmental context” (Thompson 2007, p. 368). Similarly, in advancing her own account of enactive emotions, Colombetti (2010) also speaks of organismic “meaning generating” activity and describes the non-neural body as a “vehicle of meaning” (2010, p. 146; p. 147).

Di Paolo and his co-authors defend similar views, holding that “the concept of sense-making describes how living organisms relate to their world in terms of meaning” (Di Paolo and others 2017, p. 7); and that an organism’s engagements with features of the environment “are appreciated as meaningful by the organism” (Di Paolo and others 2018, p. 32).

Enactivists who defend these views about sense-making are keen to note that the kind of ‘meaning’ that they assume is brought forth and consumed by organisms is not to be understood in terms of semantic content, nor does it entail the latter. As such, the kind of meaning that they hold organisms bring forth is not in any way connected to or dependent upon mental representations as standardly understood. We are told “if we wish to continue using the term representation, then we need to be aware of what sense this term can have for the enactive approach… Autonomous systems do not operate on the basis of internal representations; they enact an environment” (Thompson 2007, pp. 58–59). Indeed, in moving away from cognitivist assumptions, a major ambition of this variety of enactivism is to establish that “behavior…expresses meaning-constitution rather than information processing” (Thompson 2007, p. 71).

In sum, a main aspiration of original enactivism is to bring notions such as sense-making to bear to demonstrate how key observations about biological autonomy can ground phenomenological aspects of mindedness such as “concernful affect, caring attitudes, and meaningful engagements that underscore embodied experience” (Di Paolo and others 2018, p. 42). The sense-making interpretation of biological autonomy is meant to justify attributing basic structures of caring, concern, meaning, sense, and value to living systems quite generally (Di Paolo and others 2018, p. 22). Crucially, it is claimed of the original version of enactivism that through its understanding of “precarious autonomy, adaptivity, and sense-making, the core aspect of mind is naturalized” (Di Paolo and others 2018, p. 33).

In pursuing its naturalizing ambition, the original version of enactivism faces a particular challenge. Simply put, the weaker (more austere and deflated) its account of sense-making, the more credible it will be for the purpose of explicating the natural origins of minds, but the less capable it will be of accounting for all aspects of mindedness. Contrariwise, the stronger (more fulsome and inflated) its account of sense-making, the more capable it will be of accounting for all aspects of mindedness, but the less credible it will be for the purpose of explicating the natural origins of minds.

For example, in their original statement of enactivism, Varela and others (1991) speak of the most primitive organisms enacting domains of ‘significance’ and ‘relevance’. They add that this implies that ‘some kind of interpretation’ is going on. Yet, they are also careful to emphasize that they use their terms advisedly and are at pains to highlight that “this interpretation is a far cry from the kinds of interpretation that depend on experience” (p. 156). More recently, Stapleton (2022) maintains that:

The autopoietic enactivist is, of course, not committed to viewing the bacterium as experiencing the value that things in its environment have for it. Nor, to viewing the bacterium as purposefully regulating its coupling with the environment, where ‘purposeful’ is understood in the terms we normally use it—as implying some kind of reflection on a goal state and striving to achieve that goal state by behaving in a way in which one could have done otherwise (p. 168).

Even if it is accepted that all cognition lies along a continuum, anyone who acknowledges that there are significantly different varieties of cognition that have additional properties not exhibited by the most basic forms must face up to the ‘scaling up’ challenge. As Froese and Di Paolo (2009) ask, “Is it a question of merely adding more complexity, that is, of just having more of the same kind of organizations and mechanisms? Then why is it seemingly impossible to properly address the hallmarks of human cognition with only these basic biological principles?” (p. 441). In this regard, Froese and Di Paolo (2009) admit that even if the notion of sense-making is thought to be appropriate for characterizing the activity of the simplest living creatures, it still “cries out for further specification that can distinguish between different modes of sense-making” (p. 446).

With the scaling up challenge in sight, several enactivists have been working to explicate how certain, seemingly distinctive high-end human forms of sense-making relate to those of the most basic, primitive forms of life (Froese and Di Paolo 2009; De Jaegher and Froese 2009; Froese, Woodward and Ikegami 2013, Kee 2018). Working in this vein, Cuffari and others (2015) and Di Paolo and others (2018) have broken new ground by providing a sense-making account of human language in their efforts to dissolve the scaling-up problem and demonstrate the full scope and power of key ideas from the original version of enactivism.

b. Sensorimotor Knowledge Enactivism

At a first pass, what is sometimes called simply sensorimotor enactivism holds that perceiving and perceptual experience “isn’t something that happens in us, it is something we do” (Noë 2004, p. 216). Accordingly, perceiving and experiencing are “realized in the active life of the skillful animal” (Noë 2004, p. 227). Its main proponent, Alva Noë (2021), tells us:

The core claim of the enactive approach, as I understand it, and as this was developed in Noë, 2004, and also O’Regan and Noë, 2001… [is that] the presence of the world, in thought and experience, is not something that happens to us but rather something that we achieve or enact (p. 958).

This version of enactivism travels under various names in the literature, including ‘the enactive approach’ (Noë 2004, 2009, 2012, 2016, 2021); ‘sensorimotor theory’ (O’Regan and Noë 2001; Myin and O’Regan 2002; Myin and Noë 2005; O’Regan 2011); ‘the dynamic sensorimotor approach’ (Hurley and Noë 2003), which also drew on Hurley (1998); and ‘actionism’ (Noë 2012, 2016). In Noë (2021), the new label ‘sensorimotor knowledge enactivism’ was introduced to underscore the key importance of the idea that perceiving and perceptual experiences are grounded in a special kind of knowledge. Hence, a fuller and more precise explication of the core view of this version of enactivism is that experience of the world comes in the form of an understanding that is achieved through an active exploration of the world, which is mediated by practical knowledge of its relevant sensorimotor contingencies.

The emphasis on sensorimotor understanding and knowledge is what makes this version of enactivism distinctive. Sensorimotor knowledge enactivism holds that in order “to perceive, you must have sensory stimulation that you understand” (Noë 2004, p. 183; see also p. 180, p. 3). In explicating this view, Noë (2012) is thus at pains to highlight “the central role understanding, knowledge, and skill play in opening up the world for experience… the world is blank and flat until we understand it” (Noë 2012, p. 2). Later in the same book, he underscores this crucial point yet again, saying that:

According to the actionist (or enactive) direct realism that I am developing here, there is no perceptual experience of an object that is not dependent on the exercise by the perceiver of a special kind of knowledge. Perceptual awareness of objects, for actionist-direct realism, is an achievement of sensorimotor understanding. (Noë 2012, p. 65).

These claims also echo the original statement of the view, which tells us that “the central idea of our new approach is that vision is a mode of exploration of the world that is mediated by knowledge of what we call sensorimotor contingencies” (O’Regan and Noë 2001, p. 940, see also Noë 2004, p. 228).

Putting this together, Noë (2004) holds that “all perception is intrinsically thoughtful” (p. 3). Accordingly, canonical forms of perceiving and thinking really just lie at different points along the same spectrum: “perception is… a kind of thoughtful exploration of the world, and thought is… a kind of extended perception” (Noë 2012, pp. 104–105). Sensorimotor knowledge enactivism thus asks us to think of the distinction between thought and perception as “a distinction among different styles of access to what there is… thought and experience are different styles of exploring and achieving, or trying to achieve, access to the world” (Noë 2012, pp. 104–105).

The view is motivated by the longstanding observation that we cannot achieve an accurate phenomenology of experience if we only focus on the raw stimulation and perturbation of sensory modalities. A range of considerations support this general position. A proper phenomenology of experience requires an account of what it is to grasp the perceptual presence of objects in the environment. But this cannot be accounted for solely by focusing on raw sensations. The visual experience of, say, seeing a tomato is an experience of a three-dimensional object that takes up space voluminously. This cannot be explained simply by appealing to what is passively ‘given’ to or supplied by the senses. For what is, strictly, provided to the visual system is only, at most, a partial, two-dimensional take of the tomato.

Empirical findings also reveal the need to distinguish between mere sensing and experiencing. It has been shown that it is possible to be sensorially stimulated in normal ways without this resulting in the experience of features or aspects of the surrounding environment in genuinely perceptual ways, that is, in ways that allow subjects to competently engage with worldly offerings or to make genuinely perceptual reports. This is the situation, for example, for those first learning to manipulate sensory substitution devices (O’Regan and Noë 2001, Noë 2004, Roberts 2010).

There are longstanding philosophical and empirical reasons for thinking that something must be added to sensory stimulation to yield a full-blown experience of worldly offerings and to enable organisms to engage with them successfully.

A familiar cognitivist answer is that the extra ingredient needed for perceiving comes in the form of inner images or mental representations. Sensorimotor knowledge enactivism rejects these proposals, denying that perceiving depends on mental representations, however rich and detailed. In this regard, sensorimotor knowledge enactivism also sets its face against the core assumption of the popular predictive processing accounts of cognition by holding that

the world does not show up for us “as it does because we project or interpret or confabulate or hypothesize… in something like the way a scientist might posit the existence of an unobserved force” (Noë 2012, p. 5).

Sensorimotor knowledge enactivism, by contrast, holds that perceptual experience proper is grounded in the possession and use of implicit, practical knowledge such that, when such knowledge is deployed properly, it constitutes understanding and allows organisms to make successful contact with the world.

Successfully perceiving the world and enjoying perceptual experiences of it are mediated and made possible by the possession and skillful deployment of a special kind of practical knowledge of sensorimotor contingencies, namely, knowledge of the ways in which stimulation of sense modalities changes, contingent upon aspects of the environment and the organism’s own activities.

Having the sensation of softness consists in being aware that one can exercise certain practical skills with respect to the sponge: one can, for example, press it, and it will yield under the pressure. The experience of the softness of the sponge is characterized by a variety of such possible patterns of interaction with the sponge, and the laws that describe these sensorimotor interactions we call, following MacKay (1962), the laws of sensorimotor contingency (O’Regan and Noë, 2001). (O’Regan and others, 2005, p. 56, emphasis added).

Knowledge of this special sort is meant to account for the expectations that perceivers have concerning how things will appear in the light of possible actions. It amounts to knowing how things will manifest themselves if the environment is perceptually explored in certain ways. At some level, so the theory claims, successful perceivers must have implicit mastery of relevant laws concerning sensorimotor contingencies.

Echoing ideas first set out in the original version of enactivism, sensorimotor knowledge enactivism holds that the phenomenal properties of experience —what-it-is-like properties —are not to be identified with extra ingredients over and above the dynamic, interactive responses of organisms. As such, its advocates hold that “we enact our perceptual experience: we act it out” (Noë 2004, p. 1). In line with the position advanced by other enactivists, Noë (2004) claims that:

Different animals inhabit different perceptual worlds even though they inhabit the same physical world. The sights, sounds, odors, and so on that are available to humans may be unavailable to some creatures, and likewise, there is much we ourselves cannot perceive. We lack the sensorimotor tuning and the understanding to encounter those qualities. The qualities themselves are not subjective in the sense of being sensations. We don’t bring them into existence. But only a very special kind of creature has the biological capacity, as it were, to enact them (p. 156).

On their face, some of the statements Noë makes about phenomenal properties appear to be of a wholly realist bent. For example, he says, “There is a sense in which we move about in a sea of perspectival properties and are aware of them (usually without thought or notice) whenever we are perceptually conscious. Indeed, to be perceptually conscious is to be aware of them” (Noë 2004, p. 167). Yet, he also appears to endorse a middle-way position that recognizes that the world can be understood as a domain of perceptual activity just as much as it can be understood as a place consisting of or containing the properties and facts that interest us (Noë 2004, p. 167).

It is against that backdrop that Noë holds, “Colours are environmental phenomena, and colour experience depends not only on movement-dependent but also on object-dependent sensorimotor contingencies… colour experience is grounded in the complex tangle of our embodied existence” (Noë 2004, p. 158). In the end, sensorimotor knowledge enactivism offers the following answer to the problem of consciousness: “How the world shows up for us depends not only on our brains and nervous systems but also on our bodies, our skills, our environment, and the way we are placed in and at home in the world” (Noë 2012, pp. 132–3).

Ultimately, “perceptual experience presents the world as being this way or that; to have experience, therefore, one must be able to appreciate how the experience presents things as being” (Noë 2004, p. 180). This is not something that is automatically done for organisms; it is something that they sometimes achieve. Thus, “The world shows up for us thanks to what we can do… We make complicated adjustments to bring the world into focus… We achieve access to the world. We enact it by enabling it to show up for us… If I don’t have the relevant skills of literacy, for example, the words written on the wall do not show up for me” (Noë 2012, pp. 132–133).

So understood, sensorimotor knowledge enactivism resists standard representational accounts of perception, holding that “perceivings are not about the world; they are episodes of contact with the world” (Noë 2012, p. 64). It sponsors a form of enactive realism according to which the content of perceiving only becomes properly perceptual content that represents how things are when the skillful use of knowledge makes successful contact with the world. There is no guarantee of achieving that outcome. Hence, many attempts at perceiving might be groping, provisional efforts in which we only gain access to how things appear to be and not how they are.

On this view, “perception is an activity of learning about the world by exploring it. In that sense, then, perception is mediated by appearance” (Noë 2004, p. 166). Achieving access to the world via knowledgeable, skillful exploration is to discover the relevant patterns that reveal “how things are from how they appear” (Noë 2012, p. 164). Thus, “hearing, like sight and touch, is a way of learning about the world… Auditory experience, like visual experience, can represent how things are” (Noë 2004, p. 160).

Accordingly, Noë (2004) holds that the perceptual content of experience has a dual character: it presents the world as being a certain way and presents how things are experienced, capturing how things look, or sound, or feel from the vantage point of the perceiver. It is because Noë assumes perceptual content has both of these aspects that he is able to defend the view that perceptual experience is a “way of encountering how things are by making contact with how they appear to be” (Noë 2004, p. 164).

The key equation for how this is possible, according to sensorimotor knowledge enactivism, is as follows: “How [things] (merely) appear to be plus sensorimotor knowledge gives you how things are” (Noë 2004, p. 164). Put otherwise, “for perceptual sensation to constitute experience —that is, for it to have genuine representational content —the perceiver must possess and make use of sensorimotor knowledge” (Noë 2004, p. 17).

Even though knowledge and understanding lie at the heart of sensorimotor knowledge enactivism, Noë (2012) stresses that “your consciousness of… the larger world around you is not an intellectual feat” (Noë 2012, p. 6). He proposes to explain how to square these ideas by offering a putatively de-intellectualized account of knowledge and understanding, advancing a “practical, active, tool-like conception of concepts and the understanding” (Noë 2012, p. 105).

Sensorimotor knowledge enactivism bills itself as rejecting standard representationalism about cognition while also maintaining that perceptual experiences make claims or demands on how things are (Noë 2021). Since, to this extent, sensorimotor knowledge enactivism retains this traditional notion of representational content, at its core, Noë (2021) has come to regard the ‘real task’ for defenders of this view as “to rethink what representation, content, and the other notions are or could be” (p. 961).

It remains to be seen if sensorimotor knowledge enactivism can explicate its peculiar notions of implicit, practical understanding, and representational content in sufficiently novel and deflated ways that can do all the philosophical work asked of them without collapsing into or otherwise relying on standard cognitivist conceptions of such notions. This is the longstanding major challenge faced by this version of enactivism (Block 2005, Hutto 2005).

c. Radical Enactivism

Radical enactivism, also known as radical enactive cognition or REC, saw its debut in Hutto (2005) and was developed and supported in subsequent publications (Menary 2006, Hutto 2008, 2011a, 2011c, 2013a, 2013c, 2017, 2020, Hutto and Myin 2013, 2017, 2018a, 2018b, 2021). It was originally proposed as a critical adjustment to sensorimotor enactivism’s conservative tendencies, as set out in O’Regan and Noë (2001), tendencies which were deemed to be at odds with the professed anti-representationalism of the original version of enactivism. Radical enactivism proposes an account of enactive cognition that rejects characterizing or explaining the most basic forms of cognition in terms of mediating knowledge. This is because radical enactivists deem it unlikely that such notions can be non-vacuously explicated or accounted for naturalistically.

Importantly, radical enactivism never sought to advance a wholly original, new type or brand of enactivism. Instead, its goal was always to identify a minimal core set of tenable yet non-trivial enactivist theses and defend them through analysis and argument.

Much of the work of radical enactivists is subtractive —it adds by cutting away, operating on the assumption that often less is more. The adopted approach is explicated in greater detail in Evolving Enactivism, wherein several non-enactivist proposals about cognition are examined in an effort to assess whether they could be modified and allied with radical enactivism. This process, known as RECtification, is one “through which…. target accounts of cognition are radicalized by analysis and argument, rendering them compatible with a Radically Enactive account of Cognition” (Hutto and Myin 2017, p. xviii).

In advancing this cause, Hutto and Myin (2013) restrict radical enactivism’s ambitions to only promoting strong versions of what they call the Embodiment Thesis and the Developmental-Explanatory Thesis.

The Embodiment Thesis conceives of basic cognition in terms of concrete, spatio-temporally extended patterns of dynamic interaction between organisms and their environments. These interactions are assumed to take the form of individuals engaging with aspects of their environments across time, often in complex ways and on multiple scales. Radical enactivists maintain that these dynamic interactions are loopy, not linear. Such sensitive interactions are assumed, constitutively, to involve aspects of the non-neural bodies and environments of organisms. Hence, they hold that cognitive activity is not restricted to what goes on in the brain. In conceiving of cognition in terms of relevant kinds of world-involving organismic activity, radical enactivists characterize it as essentially extensive, not merely extended, in contrast to what Clark and Chalmers (1998) famously argued (see Hutto and Myin 2013; Hutto, Kirchhoff and Myin 2014).

The Developmental-Explanatory Thesis holds that mentality-constituting interactions are grounded in, shaped by, and explained by nothing more than the history of an organism’s previous interactions and features of its current environment. Sentience and sapience emerge, in the main, through repeated processes of organismic engagement with environmental offerings. An organism’s prolonged history of engaged encounters is the basis of its current embodied tendencies, know-how, and skills.

Radical enactivism differs from other versions of enactivism precisely in rejecting their more extravagant claims. It seeks to get by without the assumption that basic cognition involves mediating knowledge and understanding. Similarly, radical enactivism seeks to get by without assuming that basic cognition involves sense-making. It challenges the grounds for thinking that basic forms of cognition have the full array of psychological and phenomenological attributes associated with sense-making by other enactivists. Radical enactivists, for example, resist the idea that basic cognition involves organisms somehow creating, carrying, and consuming meanings.

Additionally, radical enactivists do not assume that intentionality and phenomenality are constitutively or inseparably linked. Its supporters do not endorse the connection principle according to which intentionality and phenomenal consciousness are taken to be intrinsically related (see Searle 1992, Ch. 7; compare Varela and others, 1991, p. 22). Instead, radical enactivists maintain that there can be instances of world-directed cognition that are lacking in phenomenality, even though, in the most common human cases, acts of world-directed cognition possess a distinctive phenomenal character (Hutto 2000, p. 70).

Most pivotally, radical enactivism thoroughly rejects positing representational contents at the level of basic mentality. One of its most signature claims, and one in which it agrees with original enactivism, is that basic forms of mental activity neither involve nor are best explained by the manipulation of contentful representations. Its special contribution has been to advance novel arguments designed to support the idea that organismic activity, conceived of as engaging with features of their environments in specifiable ways, suffices for the most basic kinds of cognition.

To encourage acceptance of this view, radical enactivists articulated the hard problem of content (Hutto 2013c, Hutto and Myin 2013, Hutto and Myin 2018a, 2018b). This hard problem, posed as a challenge to the whole field, rests on the observation that information understood only in terms of covariance does not constitute any kind of content. Hutto and Myin (2013) erect this observation into a principle and use it to reveal the hard choice dilemma that anyone seeking to give a naturalistic account of basic cognition must face. The first option is to rely only on the notion of information-as-covariance in securing the naturalistic credentials of explanatory resources, at the cost of not having adequate resources to explain the natural origins of the content that basic forms of cognition are assumed to have. The second option is to presuppose an expanded or inflated notion of information, one which can adequately account for the content of basic forms of cognition, at the cost of having to surrender its naturalistic credentials. Either way, so the analysis goes, it is not possible to give a naturalized account of the content of basic forms of cognition.

Providing a straight solution to the hard problem of content requires “explaining how it is possible to get from non-semantic, non-contentful informational foundations to a theory of content using only the resources of a respectable explanatory naturalism” (Hutto 2018, p. 245).

Hutto and Myin (2013) put existing naturalistic theories of content to the test, assessing their capacity to answer this challenge. As Salis (2022, p.1) describes this work, they offer “an ensemble of reasons” for thinking naturalistic accounts of content will fail.

Radical enactivism wears the moniker ‘radical’ due to its interest in getting to the root of issues concerning cognition and its conviction that not all versions of enactivism have been properly steadfast in their commitment to anti-content, anti-representational views about the character of basic mindedness. For example, when first explicating their conception of the aboutness or intentionality of cognition as embodied action, the original enactivists note that the mainstream assumption is that “in general, intentionality has two sides: first, intentionality includes how the system construes the world to be (specified in terms of the semantic content of intentional states); second, intentionality includes how the world satisfies or fails to satisfy this construal (specified in terms of the conditions of satisfaction of intentional states)” (Varela and others 1991, p. 205). That mainstream notion of intentionality, which is tied to a particular notion of content, is precisely the kind of intentionality that radical enactivism claims does not feature in basic cognition. In providing compelling arguments against the assumption that basic cognition is contentful in that sense, radical enactivism’s primary ambition is to strengthen enactivism by securely radicalizing it.

Several researchers have since argued the hard problem of content has already been solved, or, at least, that it can be answered in principle or otherwise avoided (Miłkowski 2015, Raleigh 2018, Lee 2019, Ramstead and others 2020, Buckner 2021, Piccinini 2022). Yet, see Hutto and Myin (2017, 2018a, 2018b) and Segundo-Ortin and Hutto (2021) for assessments of the potential moves.

On the positive side of the ledger, radical enactivists contend that the kind of mindedness found at the roots of cognition can be fruitfully characterized as a kind of Ur-intentionality. It is a kind of intentionality that lacks the sort of content associated with truth or accuracy conditions (Hutto and Myin 2013, 2017, 2018a, Zahnoun 2020, 2021b, 2021c). Moreover, radical enactivists hold that we can adequately account for Ur-intentionality, naturalistically, using biosemiotics – a modified teleosemantics inspired, in the main, by Millikan (1984) but stripped of its problematic semantic ambitions. This proposed adjustment of Millikan’s theory was originally advanced in Hutto (1999) in the guise of a modest biosemantics that sought to explain forms of intentionality with only nonconceptual content. That version of the position was abandoned and later radicalized to become a content-free biosemiotics (see Hutto 2006, 2008, Ch. 3). The pros and cons of the Ur-intentionality proposal continue to be debated in the literature (Abramova and Villalobos 2015, De Jesus 2016, 2018, Schlicht and Starzak 2019, Legg 2021, Paolucci 2021, Zipoli Caiani 2022, Mann and Pain 2022).

Importantly, radical enactivists only put biosemiotics to the theoretical use of explicating the properties of non-contentful forms of world-involving cognition. Relatedly, they hold that when engaged in acts of basic cognition, organisms are often sensitive to covariant environmental information, even though it is a mere metaphor to say organisms process it. Although organisms are sensitive to relevant indicative, informational relationships, “these relationships were not lying about ready-made to be pressed into service for their purposes” (Hutto 2008, pp. 53–54). When it comes to understanding biological cognition, the existence of the relevant correspondences is not explained by appeals to ahistorical natural laws but by various selectionist forces.

As Thompson (2011b) notes, if radical enactivism’s account of biosemiotics is to find common ground with original enactivism and its direct descendants, it would have to put aside strong adaptationist views of evolution. In fact, although radical enactivism does place great explanatory weight on natural selection, it agrees with original enactivism at least to the extent that it does not hold that biological traits are individually optimized —not selected for —in isolation from one another to make organisms maximally fit to deal with features of a neutral, pre-existing world.

Radical enactivists accept that content-involving cognition exists even though they hold that our basic ways of engaging with the world and others are contentless. In line with this position, they have sought to develop an account of The Natural Origins of Content, a project pursued in several publications by Hutto and Satne (2015, 2017a, 2017b) and Hutto and Myin (2017). In these works, the authors have proposed that capacities for contentful speech and thought emerge with the mastery of distinctive socio-cultural practices —specifically, varieties of discursive practices with their own special norms. These authors also hold that the mastery of such practices introduces kinks into the cognitive mix, such as the capacity for ratio-logical reasoning (see, for example, Rolla 2021). Nevertheless, defenders of radical enactivism maintain that these kinks do not constitute a gap or break in the natural or evolutionary order (see Myin and Van den Herik 2020 for a defense of this position and Moyal-Sharrock 2021b for its critique). Instead, radical enactivists argue that the content-involving practices that enable the development of distinctively kinky cognitive capacities can be best understood as a product of constructed environmental niches (Hutto and Kirchhoff 2015). Rolla and Huffermann (2021) propose that in fleshing out this account, radical enactivism could combine with Di Paolo and others (2018)’s new work on linguistic bodies to understand the cognitive basis of language mastery, characterizing it as a kind of norm-infused and acquired shared know-how.

3. Forerunners

In the opening pages of Sensorimotor Life, its authors describe their contribution to the enactive literature as that of adding a ‘tributary to the flow of ideas’ which found its first expression in Varela, Thompson and Rosch’s The Embodied Mind. Making use of that metaphor, they also astutely note the value of looking “upstream to discover ‘new’ predecessors,” namely precursors to enactivism that can only be identified in retrospect: those which might qualify as “enactivists avant la lettre” (Di Paolo and others 2017, p. 3).

Enactivism clearly has “roots that predate psychology in its modern academic form” (Baerveldt and Verheggen 2012, p. 165). For example, in challenging early modern Cartesian conceptions of the mind as a kind of mechanism, it reaches back to a more Aristotelian vision of the mind that emphasizes its biological basis and features shared with all living things. Baerveldt and Verheggen (2012) also see clear links between enactivism and “a particular ‘radical’ tradition in Western Enlightenment thinking that can be traced at least to Spinoza” (p. 165). Gallagher argues that Anaxagoras should be considered the first enactivist based on his claim that human hands are what make humans the most intelligent of animals.

In the domain of biological ecology, there are clear and explicit connections between enactivism and the work of the German biologist Jakob von Uexküll, who introduced the notion of Umwelt, which had great influence in cybernetics and robotics. Resonances with enactivism can also be found in the work of Helmuth Plessner, a German sociologist and philosopher who studied with Husserl and authored Levels of Organic Life and the Human.

Another philosopher, Hans Jonas, who studied with both Heidegger and Husserl, stands out in this regard. As Di Paolo and others (2017) note, “Varela read his work relatively late in his career and was impressed with the resonances with his own thinking” (p. 3). In a collection of his essays, The Phenomenon of Life, very much in the spirit of the original version of enactivism, Jonas defends the view that there exists a deep, existential continuity between life and mind.

Many key enactivist ideas have also been advanced by leading figures in the American pragmatist tradition. As Gallagher (2017) observes, “many of the ideas of Peirce, Dewey, and Mead can be considered forerunners of enactivism” (p. 5). Gallagher and Lindgren (2015) go a step further, maintaining that the pioneers of enactivism “could have easily drawn on the work of John Dewey and other pragmatists. Indeed, long before Varela and others (1991), Dewey (1896) clearly characterized what has become known as enactivism” (p. 392). See also Gallagher (2014), Gallagher and Miyahara (2012), and Barrett (2019).

In advocating the so-called actional turn, enactivists touch on recurrent themes of central importance in Wittgenstein’s later philosophy, in particular his emphasis on the importance of our animal nature, forms of life, and the fundamental importance of action for understanding mind, knowledge, and language use. Contemporary enactivists characterize the nature of minds and how they fundamentally relate to the world in ways that not only echo but, in many ways, fully concur with the later Wittgenstein’s trademark philosophical remarks on the same topics. Indeed, Moyal-Sharrock (2021a) goes so far as to say that “Wittgenstein is —and should be recognized to be —at the root of the important contemporary philosophical movement called enactivism” (p. 8). The connections between Wittgenstein and enactivism are set out by many other authors (Hutto 2013d, 2015c, Boncompagni 2013, Loughlin 2014, 2021a, 2021b, Heras-Escribano and others 2015. See also Loughlin 2021c, for a discussion of how some of Wittgenstein’s ideas might also challenge enactivist assumptions).

4. Debates

Enactivism bills itself as providing an antidote to accounts of cognition that “take representation as their central notion” (Varela and others 1991, p. 172). Most fundamentally, in proposing that minds, like all living systems, are distinguished from machines by their biological autonomy, it sees itself as opposed to and rejects computational theories and functionalist theories of mind, including extended functionalist theories of mind (Di Paolo and others 2017, Gallagher 2017). Enactivism thus looks to work in robotics in the tradition of Brooks (1991) and dynamical systems theory (Smith and Thelen 1994, Beer 1998, Juarrero 1999) for representation-free and model-free ways of characterising and potentially explaining extensive cognitive activity (Kirchhoff and Meyer 2019, Meyer 2020a, 2020b).

In a series of publications, Villalobos and coauthors offer a sustained critique of enactivism for its commitment to biological autonomy on the grounds that its conception of mind is not sufficiently naturalistic. These critics deem enactivism’s commitment to teleology as the most problematic and seek to develop, in its place, an account of biological cognition built on a more austere interpretation of autopoietic theory (Villalobos 2013, Villalobos and Ward 2015, Abramova and Villalobos 2015, Villalobos and Ward 2016, Villalobos and Silverman 2018, Villalobos 2020, Villalobos and Razeto-Barry 2020, Villalobos and Palacios 2021).

An important topic in this body of work, taken up by Villalobos and Dewhurst (2017), is the proposal that enactivism may be compatible, despite its resistance to the idea, with a computational approach to cognitive mechanisms. This possibility seems plausible to some given the articulation of conceptions of computation that allow for computation without representation (see, for example, Piccinini 2008, 2015, 2020). For a critical response to the suggestion that enactivism is or should want to be compatible with a representation-free computationalism, see Hutto and others (2019) and Hutto and others (2020).

Several authors see great potential in allying enactivism and ecological psychology, a tradition in psychology initiated by James Gibson which places responsiveness to affordances at its center (Gibson 1979). In recent times, this possibility has become more attractive with the articulation of radical embodied cognitive science (Chemero 2009), which seeks to connect Gibsonian ideas with dynamical systems theory, without invoking mental representations.

A joint ecological-enactive approach to cognition has been proposed in the form of the skilled intentionality framework (Rietveld and Kiverstein 2014, Bruineberg and Rietveld 2014, Kiverstein and Rietveld 2015, 2018, Bruineberg and others 2016, Rietveld, Denys and Van Westen 2018, Bruineberg, Chemero and Rietveld 2019). It seeks to provide an integrated basis for understanding the situated and affective aspects of the embodied mind, emphasizing that organisms must always be sensitive to multiple affordances simultaneously in concrete situations.

The task of ironing out apparent disagreements between enactivism and ecological psychology to forge a tenable alliance of these two traditions has also been actively pursued by others (see Heras-Escribano 2016, Stapleton 2016, Segundo-Ortin and others 2019, Heras-Escribano 2019, Crippen 2020, Heft 2020, Myin 2020, Ryan and Gallagher 2020, Segundo-Ortin 2020, McGann and others 2020, Heras-Escribano 2021, Jurgens 2021, Rolla and Novaes 2022).

A longstanding sticking point that has impeded a fully-fledged enactivist-ecological psychology alliance is the apparent tension between enactivism’s wholesale rejection of the notion that cognition involves information processing and the tendency of those in the ecological psychology tradition to talk of perception as involving the ‘pickup’ of information ‘about’ environmental affordances (see Varela and others 1991, p. 201-204; Hutto and Myin 2017, p. 86). See also Van Dijk and others (2015). The use of such language can make it appear as if the Gibsonian framework is committed to the view that perceiving is a matter of organisms attuning to the covariant structures of a pre-given world. Notably, Baggs and Chemero (2021) attempt to directly address this obstacle to uniting the two frameworks (see also de Carvalho and Rolla 2020).

There have been attempts to take enactivist ideas seriously by some versions of predictive processing theories of cognition. In several publications, Andy Clark (2013, 2015, 2016) has sought to develop a version of predictive processing accounts of cognition that is informed, to some extent, by the embodied, non-intellectualist, action-orientated vision of mind promoted by enactivists.

Yet most enactivist-friendly advocates of predictive processing accounts of cognition tend to baulk when it comes to giving up the idea that cognition is grounded in models and mental representations. Clark (2015) tells us that he can’t imagine how to get by without such constructs when he rhetorically asks himself, “Why not simply ditch the talk of inner models and internal representations and stay on the true path of enactivist virtue?” (Clark 2015, p. 4; see also Clark 2016, p. 293). Whether a tenable compromise is achievable or whether there is a way around this impasse is a recurring and now prominent theme in the literature on predictive processing (see, for example, Gärtner and Clowes 2017, Constant and others 2021, Venter 2021, Constant and others 2022, Gallagher and others 2022, Gallagher 2022b).

Several philosophers have argued that it is possible to develop entirely non-representationalist predictive processing accounts of cognition that could be fully compatible with enactivism (Bruineberg and Rietveld 2014; Bruineberg, Kiverstein, and Rietveld 2016; Bruineberg and others, 2018; Bruineberg and Rietveld 2019). This promised union comes in the form of what Venter (2021) has called free energy enactivism. The Free Energy Principle articulated by Friston (2010, 2011) maintains that what unites all self-organizing systems (including non-living systems) is that they work to minimize free energy. Many have sought to build similar bridges between enactivism and free energy theorizing (Kirchhoff 2015, Kirchhoff and Froese 2017, Kirchhoff and Robertson 2018, Kirchhoff 2018a, 2018b, Kirchhoff and others 2018, Robertson and Kirchhoff 2019, Ramstead and others 2020a, Hesp and others 2019). However, Di Paolo, Thompson, and Beer (2022) identify what they take to be fundamental differences between the enactive approach and the free energy framework that appear to make such a union unlikely, if not impossible.

5. Applications and Influence

Enactivism’s novel framework for conceiving of minds and our place in nature has proved fertile. It serves as an attractive philosophical platform from which many researchers and practitioners are inspired to launch fresh investigations into a great variety of topics, investigations that have potentially deep and wide-ranging implications for theory and practice.

In the domain of philosophy of psychology, beyond breaking new ground in our thinking about the phenomenality and intentionality of perception and perceptual experience, enactivism has generated many fresh lines of research. Enactivists have contributed to new thinking about: the nature of habits and their intelligence (for example, Di Paolo and others 2017; Ramírez-Vizcaya and Froese 2019; Zarco and Egbert 2019; Hutto and Robertson 2020); emotions and, especially, the distinction in the affective sciences between basic and non-basic emotions (for example, Colombetti and Thompson 2008; Hutto 2012; Colombetti 2014; Hutto, Robertson, and Kirchhoff 2018); pretense (Rucińska 2016, 2019; Weichold and Rucińska 2021, 2022); imagination (for example, Thompson 2007; Medina 2013; Hutto 2015a; Roelofs 2018; Facchin 2021); memory (for example, Hutto and Peeters 2018; Michaelian and Sant’Anna 2021); mathematical cognition (for example, Zahidi and Myin 2016; Gallagher 2017, 2019; Hutto 2019; Zahidi 2021); and social cognition. On the last of these, enactivists have advanced the proposal that the most basic forms of intersubjectivity take the form of direct, engaged interactions between agents, where this is variously understood in terms of unprincipled embodied engagements scaffolded by narrative practices (Hutto 2006; Gallagher and Hutto 2008; see also Paolucci 2020; Hutto and Jurgens 2019), interaction theory (Gallagher 2005, 2017, 2020a), and participatory sense-making (De Jaegher and Di Paolo 2007; De Jaegher 2009).

In addition to stimulating new thinking about mind and cognition, enactive ideas have also influenced research on topics in many other domains, including: AI and technological development (Froese and Ziemke 2009; Froese and others 2012; Ihde and Malafouris 2019; Sato and McKinney 2022; Rolla and others 2022); art, music, and aesthetics (Noë 2015; Schiavio and De Jaegher 2017; Fingerhut 2018; Murphy 2019; Gallagher 2021; Høffding and Schiavio 2021); cognitive archaeology (Garofoli 2015, 2018, 2019; Garofoli and Iliopoulos 2018); cross-cultural philosophy (McKinney 2020; Janz 2022; Lai 2022); education and pedagogical design (Hutto and others 2015; Gallagher and Lindgren 2015; Abrahamson and others 2016; Hutto and Abrahamson 2022); epistemology (Vörös 2016; Venturinha 2016; Rolla 2018; De Jaegher 2021; Moyal-Sharrock 2021); ethics and values (Varela 1999a; Colombetti and Torrance 2009; Di Paolo and De Jaegher 2022); expertise and skilled performance (Hutto and Sánchez-García 2015; Miyahara and Segundo-Ortin 2022; Robertson and Hutto 2023); mental health, psychopathology, and psychiatry (Fuchs 2018; de Haan 2020; Jurgens and others 2020; Maiese 2022b, 2022c, 2022d); and rationality (Rolla 2021).

6. Conclusion

There can be no doubt that enactivism is making waves in today’s philosophy and cognitive science, and beyond the boundaries of the academy. Although only newly born, enactivism has established itself as a force to be reckoned with in our thinking about mind, cognition, the world around us, and many other related topics. What remains to be seen is whether, and to what extent, different versions of enactivism will continue to develop productively, whether they will unite or diverge, whether they will find new partners, and, most crucially, whether enactivist ideas will continue to be actively taken up and widely influential. For now, this much is certain: The enactivist game is very much afoot.

7. References and Further Reading

  • Abrahamson, D., Shayan, S., Bakker, A., and Van der Schaaf, M. 2016. Eye-tracking Piaget: Capturing the Emergence of Attentional Anchors in the Coordination of Proportional Motor Action. Human Development, 58(4-5), 218–244.
  • Abramova, K. and Villalobos, M. 2015. The Apparent Ur-Intentionality of Living Beings and the Game of Content. Philosophia, 43(3), 651-668.
  • Baerveldt, C. and Verheggen, T. 2012. Enactivism. The Oxford Handbook of Culture and Psychology. Valsiner, J. (ed). Oxford. Oxford University Press. pp. 165–190.
  • Baggs, E. and Chemero, A. 2021. Radical Embodiment in Two Directions. Synthese, 198:S9, 2175–2190.
  • Barandiaran, X. E. 2017. Autonomy and Enactivism: Towards a Theory of Sensorimotor Autonomous Agency. Topoi, 36(3), 409–430.
  • Barandiaran, X. and Di Paolo, E. 2014. A Genealogical Map of the Concept of Habit. Frontiers in Human Neuroscience.
  • Barrett, L. 2019. Enactivism, Pragmatism … Behaviorism? Philosophical Studies. 176(3), 807–818.
  • Beer, R. 1998. Framing the Debate between Computational and Dynamical Approaches to Cognitive Science. Behavioral and Brain Sciences, 21(5), 630.
  • Boncompagni, A. 2020. Enactivism and Normativity: The case of Aesthetic Gestures. JOLMA – The Journal for the Philosophy of Language, Mind, and the Arts, 2(1):177-194.
  • Boncompagni, A. 2013. Enactivism and the ‘Explanatory Trap’: A Wittgensteinian Perspective. Methode – Analytic Perspectives, 2, 27-49.
  • Brooks, R. 1991. Intelligence without Representation. Artificial Intelligence, 47, 139-159.
  • Bruineberg, J., Chemero, A., and Rietveld, E. 2019. General Ecological Information supports Engagement with Affordances for ‘Higher’ Cognition. Synthese, 196(12), 5231–5251.
  • Bruineberg, J., Kiverstein, J., and Rietveld, E. 2016. The Anticipating Brain is Not a Scientist: The Free-Energy Principle from an Ecological-Enactive Perspective. Synthese, 195(6), 2417-2444.
  • Bruineberg, J., and Rietveld, E. 2014. Self-organisation, Free Energy Minimization, and Optimal Grip on a Field of Affordances. Frontiers in Human Neuroscience 8(599), 1-14. doi.org/10.3389/fnhum.2014.00599.
  • Buckner, C. 2021. A Forward-Looking Theory of Content. Ergo. 8:37. 367-401.
  • Burnett, M. and Gallagher, S. 2020. 4E Cognition and the Spectrum of Aesthetic Experience. JOLMA – The Journal for the Philosophy of Language, Mind, and the Arts. 1: 2. 157–176.
  • Candiotto, L. 2022. Loving the Earth by Loving a Place: A Situated Approach to the Love of Nature, Constructivist Foundations, 17(3), 179–189.
  • Cappuccio, M. and Froese, T. 2014. Introduction. In Cappuccio, M. and Froese, T. (eds.), Enactive Cognition at the Edge of Sense-Making: Making Sense of Nonsense. Basingstoke: Palgrave Macmillan. pp. 1-33.
  • Chemero, A. 2009. Radical Embodied Cognitive Science. Cambridge, MA: MIT Press.
  • Clark, A. 2015. Predicting Peace: The End of the Representation Wars: Reply to Madary. In Open MIND: 7(R), ed. T. Metzinger and J. M. Windt. MIND Group. doi: 10.15502/9783958570979.
  • Clark, A. 2016. Surfing Uncertainty: Prediction, Action, and the Embodied Mind. New York: Oxford University Press.
  • Colombetti, G. 2014. The Feeling Body: Affective Science Meets the Enactive Mind. Cambridge, MA, MIT Press.
  • Colombetti, G. 2010. Enaction, Sense-Making and Emotion. In Stewart, J., Gapenne, O., and Di Paolo, E.A. (eds.). Enaction: Toward a New Paradigm for Cognitive Science. Cambridge, MA: MIT Press, 145-164.
  • Colombetti, G. and Torrance, S. 2009. Emotion and Ethics: An Inter-(En)active Approach. Phenomenology and the Cognitive Sciences, 8 (4): 505-526.
  • Colombetti, G. and Thompson, E. 2008. The Feeling Body: Towards an Enactive Approach to Emotion. In W. F. Overton, U. Müller and J. L. Newman (eds.), Developmental Perspectives on Embodiment and Consciousness. Erlbaum. pp. 45-68.
  • Constant, A., Clark, A., and Friston, K. 2021. Representation Wars: Enacting an Armistice Through Active Inference. Frontiers in Psychology.
  • Constant, A., Clark, A., Kirchhoff, M., and Friston, K. 2022. Extended Active Inference: Constructing Predictive Cognition Beyond Skulls. Mind and Language, 37(3), 373-394.
  • Crippen, M. 2020. Enactive Pragmatism and Ecological Psychology. Frontiers in Psychology, 11. 203–204.
  • Cuffari, E.C., Di Paolo, E.A., and De Jaegher, H. 2015. From Participatory Sense-Making to Language: There and Back Again. Phenomenology and the Cognitive Sciences, 14 (4), 1089-1125.
  • de Carvalho, E., and Rolla, G. 2020. An Enactive‐Ecological Approach to Information and Uncertainty. Frontiers in Psychology, 11, 1–11.
  • de Haan, S. 2020. Enactive Psychiatry. Cambridge, UK: Cambridge University Press.
  • De Jaegher, H. 2021. Loving and Knowing: Reflections for an Engaged Epistemology. Phenomenology and the Cognitive Sciences 20, 847–870.
  • De Jaegher, H. 2015. How We Affect Each Other: Michel Henry’s ‘Pathos-With’ and the Enactive Approach to Intersubjectivity. Journal of Consciousness Studies 22 (1-2), 112-132.
  • De Jaegher, H. 2013. Embodiment and Sense-Making in Autism. Frontiers in Integrative Neuroscience, 7, 15. doi:10.3389/fnint.2013.00015
  • De Jaegher, H. 2009. Social Understanding through Direct Perception? Yes, by Interacting. Consciousness and Cognition. 18 (2), 535-542.
  • De Jaegher, H. and Di Paolo, E.A. 2008. Making Sense in Participation: An Enactive Approach to Social Cognition. In Morganti, F. and others (eds.). Enacting Intersubjectivity. IOS Press.
  • De Jaegher, H. and Di Paolo, E. 2007. Participatory Sense-Making: An Enactive Approach to Social Cognition. Phenomenology and the Cognitive Sciences 6(4): 485-507.
  • De Jaegher, H., Di Paolo, E.A., and Gallagher, S. 2010. Can Social Interaction Constitute Social Cognition? Trends in Cognitive Sciences, 14 (10), 441-447.
  • De Jaegher, H., and Froese, T. 2009. On the Role of Social Interaction in Individual Agency. Adaptive Behavior, 17(5), 444‐460.
  • De Jaegher, H., Pieper, B., Clénin, D., and Fuchs, T. 2017. Grasping Intersubjectivity: An Invitation to Embody Social Interaction Research. Phenomenology and the Cognitive Sciences, 16(3), 491–523.
  • De Jesus, P. 2018. Thinking through Enactive Agency: Sense‐Making, Bio‐semiosis and the Ontologies of Organismic Worlds. Phenomenology and the Cognitive Sciences, 17(5), 861–887.
  • De Jesus, P. 2016. From Enactive Phenomenology to Biosemiotic Enactivism. Adaptive Behavior. 24(2):130-146.
  • Degenaar, J., and O’Regan, J. K. 2017. Sensorimotor Theory and Enactivism. Topoi, 36, 393–407.
  • Deguchi, S., Shimoshige, H., Tsudome, M., Mukai, S., Corkery, R.W., Ito, S., and Horikoshi, K. 2011. Microbial growth at hyperaccelerations up to 403,627 × g. PNAS. 108:19. 7997-8002.
  • Dewey, J. 1922. Human Nature and Conduct: An Introduction to Social Psychology, 1st edn. New York: Holt.
  • Di Paolo, E. A. 2021. Enactive Becoming. Phenomenology and the Cognitive Sciences, 20, 783–809.
  • Di Paolo, E. A. 2018. The Enactive Conception of Life. In A. Newen, L. de Bruin, and S. Gallagher (eds.). The Oxford Handbook of 4E Cognition (pp. 71–94). Oxford: Oxford University Press.
  • Di Paolo, E. A. 2009. Extended Life. Topoi, 28(9).
  • Di Paolo, E. A. 2005. Autopoiesis, Adaptivity, Teleology, Agency. Phenomenology and the Cognitive Sciences, 4, 429–452.
  • Di Paolo, E., Buhrmann, T., and Barandiaran, X. E. 2017. Sensorimotor Life. Oxford: Oxford University Press.
  • Di Paolo, E. A., Cuffari, E. C., and De Jaegher, H. 2018. Linguistic Bodies. The Continuity Between Life and Language. Cambridge, MA: MIT Press.
  • Di Paolo, E.A. and De Jaegher, H. 2022. Enactive Ethics: Difference Becoming Participation. Topoi. 41, 241–256.
  • Di Paolo, E.A. and De Jaegher, H. 2012. The Interactive Brain Hypothesis. Frontiers in Human Neuroscience.
  • Di Paolo, E.A., Rohde, M. and De Jaegher, H. 2010. Horizons for the Enactive Mind: Values, Social Interaction, and Play. In Stewart, J., Gapenne, O., and Di Paolo, E.A. (eds). Enaction: Toward a New Paradigm for Cognitive Science. Cambridge, MA: MIT Press.
  • Di Paolo, E. A. and Thompson, E. 2014. The Enactive Approach. In L. Shapiro (Ed.), The Routledge Handbook of Embodied Cognition (pp. 68–78). London: Routledge.
  • Di Paolo, E., Thompson, E. and Beer, R. 2022. Laying Down a Forking Path: Tensions between Enaction and the Free Energy Principle. Philosophy and the Mind Sciences. 3.
  • Facchin, M. 2021. Is Radically Enactive Imagination Really Contentless? Phenomenology and the Cognitive Sciences. 21. 1089–1105.
  • Fingerhut, J. 2018. Enactive Aesthetics and Neuroaesthetics. Phenomenology and Mind, 14, 80–97.
  • Froese, T. and Di Paolo, E.A. 2011. The Enactive Approach: Theoretical Sketches from Cell to Society. Pragmatics and Cognition, 19 (1), 1-36.
  • Froese, T., and Di Paolo, E.A. 2009. Sociality and the Life‐Mind Continuity Thesis. Phenomenology and the Cognitive Sciences, 8(4), 439-463.
  • Froese, T., McGann, M., Bigge, W., Spiers, A., and Seth, A.K. 2012. The Enactive Torch: A New Tool for the Science of Perception. IEEE Transactions on Haptics, 5(4), 365-375.
  • Froese, T., Woodward, A. and Ikegami, T. 2013. Turing Instabilities in Biology, Culture, and Consciousness? On the Enactive Origins of Symbolic Material Culture. Adaptive Behavior, 21 (3), 199-214.
  • Froese, T., and Ziemke, T. 2009. Enactive Artificial Intelligence: Investigating the Systemic Organization of Life and Mind. Artificial Intelligence, 173(3–4), 466–500.
  • Fuchs, T. 2018. Ecology of the Brain: The Phenomenology and Biology of the Embodied Mind. New York: Oxford University Press.
  • Gallagher, S. 2022b. Surprise! Why Enactivism and Predictive Processing are Parting Ways: The Case of Improvisation. Possibility Studies and Society.
  • Gallagher, S. 2021. Performance/Art: The Venetian Lectures. Milan: Mimesis International Edizioni.
  • Gallagher, S. 2020a. Action and Interaction. Oxford: Oxford University Press.
  • Gallagher, S. 2020b. Enactivism, Causality, and Therapy. Philosophy, Psychiatry, and Psychology, 27 (1), 27-28.
  • Gallagher, S. 2018a. Educating the Right Stuff: Lessons in Enactivist Learning. Educational Theory. 68 (6): 625-641.
  • Gallagher, S. 2018b. Rethinking Nature: Phenomenology and a Non-Reductionist Cognitive Science. Australasian Philosophical Review. 2 (2): 125-137
  • Gallagher, S. 2017. Enactivist Interventions: Rethinking the Mind. Oxford: Oxford University Press.
  • Gallagher, S. 2014. Pragmatic Interventions into Enactive and Extended Conceptions of Cognition. Philosophical Issues, 24 (1), 110-126.
  • Gallagher, S. 2005. How the Body Shapes the Mind. New York: Oxford University Press.
  • Gallagher, S. and Bower, M. 2014. Making Enactivism Even More Embodied. Avant, 5 (2), 232-247.
  • Gallagher, S. and Hutto, D. 2008. Understanding Others through Primary Interaction and Narrative Practice. In Zlatev, J., Racine, T., Sinha, C. and Itkonen, E. (eds). The Shared Mind: Perspectives on Intersubjectivity. John Benjamins. 17-38.
  • Gallagher, S., Hutto, D. and Hipólito, I. 2022. Predictive Processing and Some Disillusions about Illusions. Review of Philosophy and Psychology. 13, 999–1017.
  • Gallagher, S. and Lindgren, R. 2015. Enactive Metaphors: Learning through Full-Body Engagement. Educational Psychological Review. 27: 391–404.
  • Gallagher, S. and Miyahara, K. 2012. Neo-Pragmatism and Enactive Intentionality. In: Schulkin, J. (eds) Action, Perception and the Brain. New Directions in Philosophy and Cognitive Science. Palgrave Macmillan, London.
  • Garofoli, D. 2019. Embodied Cognition and the Archaeology of Mind: A Radical Reassessment. In Prentiss, A. M. (ed). Handbook of Evolutionary Research in Archaeology. Springer. 379-405.
  • Garofoli, D. 2018. RECkoning with Representational Apriorism in Evolutionary Cognitive Archaeology. Phenomenology and the Cognitive Sciences. 17, 973–995.
  • Garofoli, D. 2015. A Radical Embodied Approach to Lower Palaeolithic Spear-making. The Journal of Mind and Behavior. 1-25.
  • Garofoli, D. and Iliopoulos, A. 2018. Replacing Epiphenomenalism: A Pluralistic Enactive Take on the Metaplasticity of Early Body Ornamentation. Philosophy and Technology, 32, 215–242.
  • Gärtner, K. and Clowes, R. 2017. Enactivism, Radical Enactivism and Predictive Processing: What is Radical in Cognitive Science? Kairos. Journal of Philosophy and Science, 18(1). 54-83.
  • Gibson, J.J. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
  • Godfrey‐Smith, P. 2001. Three Kinds of Adaptationism. In Orzack, S. H. and Sober, E. (eds). Adaptationism and Optimality (pp. 335–357). Cambridge: Cambridge University Press.
  • Gould, S. J., and Lewontin, R.C. 1979. The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptationist Programme. Proceedings of the Royal Society of London—Biological Sciences, 205(1161), 581–598.
  • Heft, H. 2020. Ecological Psychology and Enaction Theory: Divergent Groundings. Frontiers in Psychology.
  • Heras-Escribano, M. 2021. Pragmatism, Enactivism, and Ecological Psychology: Towards a Unified Approach to Post-Cognitivism. Synthese, 198 (1), 337-363.
  • Heras-Escribano, M. 2019. The Philosophy of Affordances. Basingstoke: Palgrave Macmillan.
  • Heras-Escribano, M. 2016. Embracing the Environment: Ecological Answers for Enactive Problems. Constructivist Foundations, 11 (2), 309-312.
  • Heras-Escribano, M., Noble, J., and De Pinedo, M. 2015. Enactivism, Action and Normativity: A Wittgensteinian Analysis. Adaptive Behavior, 23 (1), 20-33.
  • Hesp, C., Ramstead, M., Constant, A., Badcock, P., Kirchhoff, M., Friston, K. 2019. A Multi-scale View of the Emergent Complexity of Life: A Free-Energy Proposal. In: Georgiev, G., Smart, J., Flores Martinez, C., Price, M. (eds) Evolution, Development and Complexity. Springer Proceedings in Complexity. Springer, Cham.
  • Hipólito, I., Hutto, D.D., and Chown, N. 2020. Understanding Autistic Individuals: Cognitive Diversity not Theoretical Deficit. In Rosqvist, H., Chown, N., and Stenning, A. (eds). Neurodiversity Studies: A New Critical Paradigm, 193-209.
  • Høffding, S. and Schiavio, A. 2021. Exploratory Expertise and the Dual Intentionality of Music-Making. Phenomenology and the Cognitive Sciences, 20 (5): 811-829.
  • Hurley, S. L. 1998. Consciousness in Action. Cambridge, MA: Harvard University Press.
  • Hurley, S. and Noë, A. 2003. Neural Plasticity and Consciousness. Biology and Philosophy 18, 131–168.
  • Hutto, D. D. 2020. From Radical Enactivism to Folk Philosophy. The Philosophers’ Magazine. 88. 75-82.
  • Hutto, D.D. 2019. Re-doing the Math: Making Enactivism Add Up. Philosophical Studies. 176. 827–837.
  • Hutto, D.D. 2018. Getting into Predictive Processing’s Great Guessing Game: Bootstrap Heaven or Hell? Synthese, 195, 2445–2458.
  • Hutto, D.D. 2017. REC: Revolution Effected by Clarification. Topoi. 36:3. 377–391
  • Hutto, D.D. 2015a. Overly Enactive Imagination? Radically Re-imagining Imagining. The Southern Journal of Philosophy. 53. 68–89.
  • Hutto, D.D. 2015b. Contentless Perceiving: The Very Idea. In Wittgenstein and Perception. O’Sullivan, M. and Campbell, M. (eds). London: Routledge. 64-84.
  • Hutto, D.D. 2013a. Radically Enactive Cognition in our Grasp. In The Hand – An Organ of the Mind. Radman, Z. (ed). Cambridge, MA: MIT Press. 227-258.
  • Hutto, D.D. 2013b. Enactivism from a Wittgensteinian Point of View. American Philosophical Quarterly. 50(3). 281-302.
  • Hutto, D.D. 2013c. Psychology Unified: From Folk Psychology to Radical Enactivism. Review of General Psychology. 17(2). 174-178.
  • Hutto, D.D. 2012. Truly Enactive Emotion. Emotion Review. 4:1. 176-181.
  • Hutto, D.D. 2011a. Philosophy of Mind’s New Lease on Life: Autopoietic Enactivism meets Teleosemiotics. Journal of Consciousness Studies. 18:5-6. 44-64.
  • Hutto, D.D. 2011c. Enactivism: Why be Radical? In Sehen und Handeln. Bredekamp, H. and Krois, J. M. (eds). Berlin: Akademie Verlag. 21-44
  • Hutto, D.D. 2008. Folk Psychological Narratives: The Socio-Cultural Basis of Understanding Reasons. Cambridge, MA: The MIT Press.
  • Hutto, D.D. 2006. Unprincipled Engagement: Emotional Experience, Expression and Response. In Menary, R. (ed.), Radical Enactivism: Intentionality, Phenomenology and Narrative: Focus on the Philosophy of Daniel D. Hutto. Amsterdam: John Benjamins.
  • Hutto, D.D. 2005. Knowing What? Radical versus Conservative Enactivism. Phenomenology and the Cognitive Sciences. 4(4). 389-405.
  • Hutto, D.D. 2000. Beyond Physicalism. Philadelphia/Amsterdam: John Benjamins.
  • Hutto, D.D. 1999. The Presence of Mind. Philadelphia/Amsterdam: John Benjamins.
  • Hutto, D.D. and Abrahamson, D. 2022. Embodied, Enactive Education: Conservative versus Radical Approaches. In Movement Matters: How Embodied Cognition Informs Teaching and Learning. Macrine and Fugate (eds). Cambridge, MA: MIT Press.
  • Hutto, D., Gallagher, S., Ilundáin-Agurruza, J., and Hipólito, I. 2020. Culture in Mind – An Enactivist Account: Not Cognitive Penetration but Cultural Permeation. In Kirmayer, L. J., Kitayama, S., Worthman, C.M., Lemelson, R. and Cummings, C.A. (Eds.), Culture, Mind, and Brain: Emerging Concepts, Models, Applications. New York, NY: Cambridge University Press. pp. 163–187.
  • Hutto, D.D. and Jurgens, A. 2019. Exploring Enactive Empathy: Actively Responding to and Understanding Others. In Matravers, D. and Waldow, A. (eds). Philosophical Perspectives on Empathy: Theoretical Approaches and Emerging Challenges. London: Routledge. pp. 111-128.
  • Hutto, D.D. and Kirchhoff, M. 2015. Looking Beyond the Brain: Social Neuroscience meets Narrative Practice. Cognitive Systems Research, 34, 5-17.
  • Hutto, D.D., Kirchhoff, M.D., and Abrahamson, D. 2015. The Enactive Roots of STEM: Rethinking Educational Design in Mathematics. Educational Psychology Review, 27(3), 371-389.
  • Hutto, D.D., Kirchhoff, M. and Myin, E. 2014. Extensive Enactivism: Why Keep it All In? Frontiers in Human Neuroscience. doi: 10.3389/fnhum.2014.00706.
  • Hutto, D.D. and Myin, E. 2021. Re-affirming Experience, Presence, and the World: Setting the RECord Straight in Reply to Noë. Phenomenology and the Cognitive Sciences. 20, vol. 5, no. 20, pp. 971-989
  • Hutto, D.D. and Myin, E. 2018a. Much Ado about Nothing? Why Going Non-semantic is Not Merely Semantics. Philosophical Explorations. 21(2). 187–203.
  • Hutto, D.D. and Myin, E. 2018b. Going Radical. In The Oxford Handbook of 4E Cognition. Newen, A. Gallagher, S. and de Bruin, L. (eds). Oxford: Oxford University Press. pp. 95-116.
  • Hutto, D.D. and Myin, E. 2017. Evolving Enactivism: Basic Minds meet Content. Cambridge, MA: The MIT Press.
  • Hutto, D.D. and Myin, E. 2013. Radicalizing Enactivism: Basic Minds without Content. Cambridge, MA: The MIT Press.
  • Hutto, D. D., Myin, E., Peeters, A, and Zahnoun, F. 2019. The Cognitive Basis of Computation: Putting Computation In its Place. In Colombo, M. and Sprevak, M. (eds). The Routledge Handbook of The Computational Mind. London: Routledge. 272-282.
  • Hutto, D.D. and Peeters, A. 2018. The Roots of Remembering. Extensively Enactive RECollection. In New Directions in the Philosophy of Memory. Michaelian, K. Debus, D. Perrin, D. (eds). London: Routledge. pp. 97-118.
  • Hutto, D.D. and Robertson, I. 2020. Clarifying the Character of Habits: Understanding What and How They Explain. In Habits: Pragmatist Approaches from Cognitive Science, Neuroscience, and Social Theory. Caruana, F. and Testa, I. (eds). Cambridge: Cambridge University Press. pp. 204-222.
  • Hutto, D.D., Robertson, I and Kirchhoff, M. 2018. A New, Better BET: Rescuing and Revising Basic Emotion Theory. Frontiers in Psychology, 9, 1217.
  • Hutto, D.D., Röhricht, F., Geuter, U., and S. Gallagher. 2014. Embodied Cognition and Body Psychotherapy: The Construction of New Therapeutic Environments. Sensoria: A Journal of Mind, Brain and Culture. 10(1).
  • Hutto, D.D. and Sánchez-García, R. 2015. Choking RECtified: Enactive Expertise Beyond Dreyfus. Phenomenology and the Cognitive Sciences. 14:2. 309-331.
  • Hutto, D.D., and Satne, G. 2018a. Naturalism in the Goldilock’s Zone: Wittgenstein’s Delicate Balancing Act. In Raleigh, T. and Cahill, K. (eds). Wittgenstein and Naturalism. London: Routledge. 56-76.
  • Hutto, D.D. and Satne, G. 2018b. Wittgenstein’s Inspiring View of Nature: On Connecting Philosophy and Science Aright. Philosophical Investigations. 41:2. 141-160.
  • Hutto, D.D., and Satne, G. 2017a. Continuity Scepticism in Doubt: A Radically Enactive Take. In Embodiment, Enaction, and Culture. Durt, C, Fuchs, T and Tewes, C (eds). Cambridge, MA. MIT Press. 107-126.
  • Hutto, D.D. and Satne, G. 2017b. Davidson Demystified: Radical Interpretation meets Radical Enactivism. Argumenta. 3:1. 127-144.
  • Hutto. D.D. and Satne, G. 2015. The Natural Origins of Content. Philosophia. 43. 521–536.
  • Ihde, D., and Malafouris, L. 2019. Homo Faber Revisited: Postphenomenology and Material Engagement Theory. Philosophy and Technology, 32(2), 195–214.
  • Janz, B. 2022. African Philosophy and Enactivist Cognition: The Space of Thought. Imprint Bloomsbury Academic.
  • Jonas, H. 1966. The Phenomenon of Life. Evanston: Northwestern University Press
  • Juarrero, A. 1999. Dynamics in Action: Intentional Behavior as a Complex System. Cambridge: The MIT Press.
  • Jurgens, A., Chown, N, Stenning, A. and Bertilsdotter-Rosqvist, H. 2020. Neurodiversity in a Neurotypical World: An Enactive Framework for Investigating Autism and Social Institutions. In Rosqvist, H., Chown, N., and Stenning, A. (eds). Neurodiversity Studies: A New Critical Paradigm, 73-88.
  • Jurgens, A. 2021. Re-conceptualizing the Role of Stimuli: An Enactive, Ecological Explanation of Spontaneous-Response Tasks. Phenomenology and the Cognitive Sciences, 20 (5), 915-934.
  • Kabat-Zinn, J. 2016. Foreword to the Revised Edition. In Varela, F. J., Thompson, E., and Rosch, E. The Embodied Mind: Cognitive Science and Human Experience. Revised Edition (6th ed.). Cambridge, MA: MIT Press.
  • Kee, H. 2018. Phenomenology and Naturalism in Autopoietic and Radical Enactivism: Exploring Sense-Making and Continuity from the Top Down. Synthese. pp. 2323–2343.
  • Kirchhoff, M. 2018a. Autopoiesis, Free Energy, and the Life–Mind Continuity Thesis, Synthese, 195 (6), 2519-2540.
  • Kirchhoff, M. 2018b. The Body in Action: Predictive Processing and the Embodiment Thesis. The Oxford Handbook of 4E Cognition. Oxford: Oxford University Press. pp. 243-260.
  • Kirchhoff, M. 2015. Species of Realization and the Free Energy Principle. Australasian Journal of Philosophy. 93 (4), 706-723.
  • Kirchhoff, M and Froese, T. 2017. Where There is Life, There is Mind: In Support of a Strong Life-Mind Continuity Thesis. Entropy, 19 (4), 169.
  • Kirchhoff, M. and Hutto, D.D. 2016. Never Mind the Gap: Neurophenomenology, Radical Enactivism and the Hard Problem of Consciousness. Constructivist Foundations. 11 (2): 302–30.
  • Kirchhoff, M and Meyer, R. 2019. Breaking Explanatory Boundaries: Flexible Borders and Plastic Minds. Phenomenology and the Cognitive Sciences, 18 (1), 185-204.
  • Kirchhoff, M., Parr, T., Palacios, E, Friston, K, and Kiverstein, J. 2018. The Markov Blankets of Life: Autonomy, Active Inference and the Free Energy Principle. Journal of The Royal Society Interface, 15 (138), 2017-0792.
  • Kirchhoff, M. D., and Robertson, I. 2018. Enactivism and Predictive Processing: A Non‐Representational View. Philosophical Explorations, 21(2), 264–281.
  • Kiverstein, J. D., and Rietveld, E. 2018. Reconceiving Representation‐Hungry Cognition: An Ecological‐Enactive Proposal. Adaptive Behavior, 26(4), 147–163.
  • Kiverstein, J. and Rietveld, E. 2015. The Primacy of Skilled Intentionality: On Hutto and Satne’s The Natural Origins of Content. Philosophia, 43 (3). 701–721.
  • Lai, K.L. 2022. Models of Knowledge in the Zhuangzi: Knowing with Chisels and Sticks. In Lai, K.L. (ed). Knowers and Knowledge in East-West Philosophy: Epistemology Extended. Basingstoke: Palgrave Macmillan 319-344.
  • Laland, K. N., Matthews, B., and Feldman, M.W. 2016. An Introduction to Niche Construction Theory. Evolutionary Ecology, 30(2), 191–202.
  • Laland, K., Uller, T., Feldman, M., Sterelny, K., Müller, G., Moczek, A., Jablonka, E., and Odling‐Smee, J. 2014. Does Evolutionary Theory Need a Rethink? Yes, Urgently. Nature, 514, 161–164.
  • Langland-Hassan, P. 2021. Why Pretense Poses a Problem for 4E Cognition (and how to Move Forward). Phenomenology and the Cognitive Sciences. 21. 1003 – 1021.
  • Langland-Hassan, P. 2022. Secret Charades: Reply to Hutto. Phenomenology and the Cognitive Sciences. 21. 1183 – 1187.
  • Lee, J. 2019. Structural Representation and the Two Problems of Content. Mind and Language. 34: 5. 606-626.
  • Legg, C. 2021. Discursive Habits: A Representationalist Re-Reading of Teleosemiotics. Synthese, 199(5), 14751-14768.
  • Lewontin, R. 2000. The Triple Helix: Gene. Cambridge, MA: Harvard University Press.
  • Lewontin, R. and Levins, R. 1997. Organism and Environment. Capitalism Nature Socialism. 8: 2. 95-98.
  • Littlefair, S. 2020. Why Evan Thompson isn’t a Buddhist. Lion’s Roar: Buddhist Wisdom for Our Time. https://www.lionsroar.com/evan-thompson-not-buddhist/
  • Loughlin, V. 2014. Radical Enactivism, Wittgenstein and the Cognitive Gap. Adaptive Behavior. 22 (5): 350-359.
  • Loughlin, V. 2021a. 4E Cognitive Science and Wittgenstein. Basingstoke: Palgrave Macmillan
  • Loughlin, V. 2021b. Why Enactivists Should Care about Wittgenstein. Philosophia 49(11–12).
  • Loughlin, V. 2021c. Wittgenstein’s Challenge to Enactivism. Synthese, 198 (Suppl 1), 391–404.
  • Maiese, M. 2022a. Autonomy, Enactivism, and Mental Disorder: A Philosophical Account. London: Routledge.
  • Maiese, M. 2022b. White Supremacy as an Affective Milieu. Topoi, 41 (5): 905-915.
  • Maiese, M. 2022c. Mindshaping, Enactivism, and Ideological Oppression. Topoi, 41 (2): 341-354.
  • Maiese, M. 2022d. Neoliberalism and Mental Health Education. Journal of Philosophy of Education. 56 (1): 67-77.
  • Malafouris, L. 2013. How Things Shape the Mind: A Theory of Material Engagement. Cambridge, MA: MIT Press.
  • Mann, S. and Pain, R. 2022. Teleosemantics and the Hard Problem of Content, Philosophical Psychology, 35:1, 22-46.
  • Maturana, H. R., and Varela, F. J. 1980. Autopoiesis and Cognition: The Realization of the Living. Boston: D. Reidel.
  • Maturana, H., and Varela, F. 1987. The Tree of Knowledge: The Biological Roots of Human Understanding. New Science Library/Shambhala Publications.
  • Maturana, H., and Mpodozis, J. 2000. The Origin of Species by Means of Natural Drift. Revista Chilena De Historia Natural, 73(2), 261–310.
  • McGann M. 2022. Connecting with the Subject of our Science: Course-of-Experience Research Supports Valid Theory Building in Cognitive Science. Adaptive Behavior. doi:10.1177/10597123221094360).
  • McGann M. 2021. Enactive and Ecological Dynamics Approaches: Complementarity and Differences for Interventions in Physical Education Lessons. Physical Education and Sport Pedagogy, 27(3):1-14.
  • McGann, M. 2007. Enactive Theorists Do it on Purpose: Toward An Enactive Account of Goals and Goal-directedness. Phenomenology and the Cognitive Sciences, 6, 463–483.
  • McGann, M, De Jaegher, H. and Di Paolo, E.A. 2013. Enaction and Psychology. Review of General Psychology, 17 (2), 203-209.
  • McGann, M. Di Paolo, E.A., Heras-Escribano, M. and Chemero, A. 2020. Enaction and Ecological Psychology: Convergences and Complementarities. Frontiers in Psychology. 11:1982.
  • McKinney, J. 2020. Ecological-Enactivism Through the Lens of Japanese Philosophy. Frontiers Psychology. 11.
  • Medina, J. 2013. An Enactivist Approach to the Imagination: Embodied Enactments and ‘Fictional Emotions’. American Philosophical Quarterly 50.3: 317–335.
  • Merleau-Ponty, M. 1963. The Structure of Behavior. Pittsburgh: Duquesne University Press .
  • Meyer, R. 2020a. The Nonmechanistic Option: Defending Dynamical Explanations. The British Journal for the Philosophy of Science. 71 (3):959-985
  • Meyer, R. 2020b. Dynamical Causes. Biology and Philosophy, 35 (5), 1-21.
  • Meyer, R and Brancazio, N. 2022. Putting Down the Revolt: Enactivism as a Philosophy of Nature.
  • Michaelian, K. and Sant’Anna, A. 2021. Memory without Content? Radical Enactivism and (Post)causal theories of Memory. Synthese, 198 (Suppl 1), 307–335.
  • Miłkowski, M. 2015. The Hard Problem of Content: Solved (Long Ago). Studies in Logic, 41(1): 73-88.
  • Miyahara, K and Segundo-Ortin, M. 2022. Situated Self-Awareness in Expert Performance: A Situated Normativity account of Riken no Ken, Synthese. 200, 192. https://doi.org/10.1007/s11229-022-03688-w
  • Moyal-Sharrock, 2016. The Animal in Epistemology: Wittgenstein’s Enactivist Solution to the Problem of Regress. International Journal for the Study of Skepticism. 6 (2-3): 97-119
  • Moyal-Sharrock, 2021a. Certainty In Action: Wittgenstein on Language, Mind and Epistemology. London: Bloomsbury.
  • Moyal-Sharrock, D. 2021b. From Deed to Word: Gapless and Kink-free Enactivism. Synthese, 198 (Suppl 1), 405–425.
  • Murphy, M. 2019. Enacting Lecoq: Movement in Theatre, Cognition, and Life. Basingstoke: Palgrave Macmillan.
  • Myin. E. 2020. On the Importance of Correctly Locating Content: Why and How REC Can Afford Affordance Perception. Synthese, 198 (Suppl 1):25-39.
  • Myin, E., and O’Regan, K. J. 2002. Perceptual Consciousness, Access to Modality, and Skill Theories: A Way to Naturalize Phenomenology? Journal of Consciousness Studies, 9, 27–46
  • Myin, E and Van den Herik, J.C. 2020. A Twofold Tale of One Mind: Revisiting REC’s Multi-Storey Story. Synthese,198 (12): 12175-12193.
  • Netland, T. 2022. The lived, living, and behavioral sense of perception. Phenomenology and the Cognitive Sciences, https://doi.org/10.1007/s11097-022-09858-y
  • Noë, A. 2021. The Enactive Approach: A Briefer Statement, with Some Remarks on ‘Radical Enactivism’. Phenomenology and the Cognitive Sciences, 20, 957–970
  • Noë, A. 2015. Strange Tools: Art and Human Nature. New York: Hill and Wang
  • Noë, A. 2012. Varieties of Presence. Cambridge, MA: Harvard University Press.
  • Noë, A. 2009. Out of Our Heads: Why You Are not Your Brain and Other Lessons from the Biology of Consciousness. New York: Hill and Wang.
  • Noë, A. 2004. Action in Perception. Cambridge, MA: MIT Press.
  • Øberg, G. K., Normann, B. and S Gallagher. 2015. Embodied-Enactive Clinical Reasoning in Physical Therapy. Physiotherapy Theory and Practice, 31 (4), 244-252.
  • O’Regan, J. K. 2011. Why Red Doesn’t Sound Like a Bell: Understanding the Feel of Consciousness. Oxford: Oxford University Press.
  • O’Regan, J. K., and Noë, A. 2001. A Sensorimotor Account of Vision and Visual Consciousness. Behavioral and Brain Sciences, 24, 883–917.
  • O’Regan, J. K., Myin, E., and Noë, A. 2005. Skill, Corporality and Alerting Capacity in an Account of Sensory Consciousness. Progress in Brain Research, 150, 55–68.
  • Paolucci, C. 2021. Cognitive Semiotics. Integrating Signs, Minds, Meaning and Cognition, Cham Switzerland: Springer.
  • Paolucci, C. 2020. A Radical Enactivist Approach to Social Cognition. In: Pennisi, A., Falzone, A. (eds). The Extended Theory of Cognitive Creativity. Perspectives in Pragmatics, Philosophy and Psychology. Cham: Springer.
  • Piccinini, G. 2022. Situated Neural Representations: Solving the Problems of Content. Frontiers in Neurorobotics. 14 April 2022, Volume 16.
  • Piccinini, G. 2020. Neurocognitive Mechanisms: Explaining Biological Cognition. New York: Oxford University Press.
  • Piccinini, G. 2015. Physical Computation: A Mechanistic Account. New York: Oxford University Press.
  • Piccinini. G. 2008. Computation without Representation. Philosophical Studies. 137 (2), 205-241.
  • Raleigh, T. 2018. Tolerant Enactivist Cognitive Science, Philosophical Explorations, 21:2, 226-244.
  • Ramírez-Vizcaya, S., and Froese, T. 2019. The Enactive Approach to Habits: New Concepts for the Cognitive Science of Bad Habits and Addiction. Frontiers in psychology, 10, 301.
  • Ramstead, MJD, Kirchhoff, M, and Friston, K. 2020a. A Tale of Two Densities: Active Inference is Enactive Inference. Adaptive Behavior. 28(4):225-239.
  • Ramstead, MJD, Friston, K, Hipolito, I. 2020b. Is the Free-Energy Principle a Formal Theory of Semantics? From Variational Density Dynamics to Neural and Phenotypic Representations. Entropy, 22(8), 889.
  • Reid, D. 2014. The Coherence of Enactivism and Mathematics Education Research: A Case Study. AVANT. 5:2. 137-172.
  • Rietveld, E., Denys, D. and Van Westen, M. 2018. Ecological-Enactive Cognition as Engaging with a Field of Relevant Affordances: The Skilled Intentionality Framework (SIF). In A. Newen, L. L. de Bruin and S. Gallagher (Eds.). Oxford Handbook of 4E Cognition. Oxford: Oxford University Press, 41-70.
  • Rietveld, E. and Kiverstein, J. 2014. A Rich Landscape of Affordances. Ecological Psychology, 26(4), 325-352.
  • Robertson, I. and Hutto, D. D. 2023. Against Intellectualism about Skill. Synthese201(4), 143.
  • Robertson, I. and Kirchhoff, M. D. (2019). Anticipatory Action: Active Inference in Embodied Cognitive Activity. Journal of Consciousness Studies, 27(3-4), 38-68.
  • Roelofs, L. 2018. Why Imagining Requires Content: A Reply to A Reply to an Objection to Radical Enactive Cognition. Thought: A Journal of Philosophy. 7 (4):246-254.
  • Rolla, G. 2021. Reconceiving Rationality: Situating Rationality into Radically Enactive Cognition. Synthese, 198(Suppl 1), pp. 571–590.
  • Rolla, G. 2018. Radical Enactivism and Self-Knowledge. Kriterion,141, pp. 732-743
  • Rolla, G. and Figueiredo, N. 2021. Bringing Forth a World, Literally. Phenomenology and the Cognitive Sciences.
  • Rolla, G. and Huffermann, J. 2021. Converging Enactivisms: Radical Enactivism meets Linguistic Bodies. Adaptive Behavior. 30(4). 345-359.
  • Rolla, G., and Novaes, F. 2022. Ecological-Enactive Scientific Cognition: Modeling and Material Engagement. Phenomenology and the Cognitive Sciences, 21, pp. 625–643.
  • Rolla, G., Vasconcelos, G., and Figueiredo, N. 2022. Virtual Reality, Embodiment, and Allusion: An Ecological-Enactive Approach. Philosophy and Technology. 35: 95.
  • Rosch, E. 2016. Introduction to the Revised Edition. In The Embodied Mind: Cognitive Science and Human Experience. Revised Edition (6th ed.). Cambridge, MA: MIT Press.
  • Rucińska, Z. 2019. Social and Enactive Perspectives on Pretending. Avant. 10:3. 1-27.
  • Rucińska, Z. 2016. What Guides Pretence? Towards the Interactive and the Narrative Approaches. Phenomenology and the Cognitive Sciences. 15: 117–133.
  • Ryan Jr, K.J. and S. Gallagher. 2020. Between Ecological Psychology and Enactivism: Is There Resonance? Frontiers in Psychology, 11, 1147.
  • Salis, P. 2022. The Given and the Hard problem of Content. Phenomenology and the Cognitive Sciences.
  • Sato, M. and McKinney, J. 2022. The Enactive and Interactive Dimensions of AI: Ingenuity and Imagination Through the Lens of Art and Music. Artificial Life. 28 (3): 310–321.
  • Schiavio, A. and De Jaegher, H. 2017. Participatory Sense-Making in Joint Musical Practice. Lesaffre, M. Maes, P-J, Marc Leman, M. (eds). The Routledge Companion to Embodied Music Interaction. London: Routledge, 31-39.
  • Schlicht, T. and Starzak, T. 2019. Prospects of Enactivist Approaches to Intentionality and Cognition. Synthese, 198 (Suppl 1): 89-113.
  • Searle, J. 1992. The Rediscovery of the Mind. Cambridge: The MIT Press.
  • Segundo-Ortin, M. 2020. Agency From a Radical Embodied Standpoint: An Ecological-Enactive Proposal. Frontiers in Psychology.
  • Segundo-Ortin, M, Heras-Escribano, M, and Raja, V. 2019. Ecological Psychology is Radical Enough. A Reply to Radical Enactivists. Philosophical Psychology, 32 (7), 1001-102340.
  • Segundo-Ortin, M and Hutto, D.D. 2021. Similarity-based Cognition: Radical Enactivism meets Cognitive Neuroscience. Synthese, 198 (1), 198, 5–23.
  • Seifert, L, Davids, K, Hauw, D and McGann, M. 2020. Editorial: Radical Embodied Cognitive Science of Human Behavior: Skill Acquisition, Expertise and Talent Development. Frontiers in Psychology 11.
  • Sharma, G. and Curtis, P.D. 2022. The Impacts of Microgravity on Bacterial Metabolism. Life (Basel). 12(6): 774.
  • Smith, L.B., and, E.Thelen. 1994. A Dynamic Systems Approach to the Development of Cognition and Action. Cambridge, MA: MIT Press.
  • Stapleton, M. 2022. Enacting Environments: From Umwelts to Institutions. In Lai, K.L. (ed). Knowers and Knowledge in East-West Philosophy: Epistemology Extended. Basingstoke: Palgrave Macmillan. 159-190.
  • Stapleton, M. and Froese, T. 2016. The Enactive Philosophy of Embodiment: From biological Foundations of Agency to the Phenomenology of Subjectivity. In M. García-Valdecasas, J. I. Murillo, and N. F. Barrett (Eds.), Biology and Subjectivity: Philosophical Contributions to non-Reductive Neuroscience. (pp. 113–129). Cham: Springer.
  • Stewart, O. Gapenne, and E. A. Di Paolo (Eds.). 2010. Enaction: Toward a New Paradigm for Cognitive Science (pp. 183–218). Cambridge: The MIT Press.
  • Thompson, E. 2021. Buddhist Philosophy and Scientific Naturalism. Sophia, 1-16.
  • Thompson, E. 2018. Review: Evolving Enactivism: Basic Minds Meet Content. Notre Dame Philosophical Reviews. http://ndpr.nd.edu/reviews/ evolving-enactivism-basic-minds-meet-content/
  • Thompson, E. 2017. Enaction without Hagiography. Constructivist Foundations, 13(1), 41–44.
  • Thompson, E. 2016. Introduction to the Revised Edition. In Varela, F. J., Thompson, E., and Rosch, E. The Embodied Mind: Cognitive Science and Human Experience. Revised Edition (6th ed.). Cambridge, MA: MIT Press.
  • Thompson, E. 2011a. Living Ways of Sense-Making, Philosophy Today: SPEP Supplement. 114-123.
  • Thompson, E. 2011b. Précis of Mind in Life. Journal of Consciousness Studies.18. 10-22.
  • Thompson, E. 2011c. Reply to Commentaries. Journal of Consciousness Studies.18. 176-223.
  • Thompson, E. 2007. Mind in Life: Biology, Phenomenology, and the Sciences of Mind. Cambridge, MA: Harvard University Press.
  • Thompson, E. 2005. Sensorimotor Subjectivity and the Enactive Approach to Experience. Phenomenology and the Cognitive Sciences, 4: 407-427.
  • Thompson, E. and Stapleton, M. 2009. Making Sense of Sense-Making: Reflections on Enactive and Extended Mind Theories. Topoi, 28: 23-30.
  • Turner, J. S. 2000. The Extended Organism: The Physiology of Animal-Built Structures. Cambridge, MA: Harvard University Press.
  • Van Dijk, L, Withagen, R.G. Bongers, R.M. 2015. Information without Content: A Gibsonian Reply to Enactivists’ Worries. Cognition, 134. pp. 210-214.
  • Varela, F. J. 1999a. Ethical Know-How: Action, Wisdom, and Cognition. Stanford, CA: Stanford University Press.
  • Varela, F.J. 1999b. The Specious Present: A Neurophenomenology of Time Consciousness. In J. Petitot, F. J. Varela, and B. R. M. Pachoud (Eds.), Naturalizing Phenomenology (pp. 266–314). Stanford: Stanford University Press
  • Varela, F. J. 1996. Neurophenomenology: A methodological remedy for the hard problem. Journal of Consciousness Studies, 4, 330–349
  • Varela F. J. 1991. Organism: A Meshwork of Selfless Selves. In: Tauber A. I. (ed.), Organism and the Origins of Self. Dordrecht: Kluwer, 79–107.
  • Varela, F.J. 1984. Living Ways of Sense-Making: A Middle Path for Neuroscience. In: P. Livingstone (Ed.), Order and Disorder: Proceedings of the Stanford International Symposium, Anma Libri, Stanford, pp.208-224.
  • Varela, F. J. 1979. Principles of Biological Autonomy. New York: Elsevier.
  • Varela, F. J., Thompson, E., and Rosch, E. 1991. The Embodied Mind: Cognitive Science and Human Experience. Cambridge: MIT Press.
  • Venter, E. 2021. Toward an Embodied, Embedded Predictive Processing Account. Frontiers in Psychology. doi: 10.3389/fpsyg.2021.543076
  • Venturinha, N. 2016. Moral Epistemology, Interpersonal Indeterminacy and Enactivism. In Gálvez, J.P. (ed). Action, Decision-Making and Forms of Life. Berlin, Boston: De Gruyter, pp. 109-120.
  • Villalobos, M. 2020. Living Beings as Autopoietic Bodies. Adaptive Behavior, 28 (1), 51-58.
  • Villalobos, M. 2013. Enactive Cognitive Science: Revisionism or Revolution? Adaptive Behavior, 21 (3), 159-167.
  • Villalobos, M., and Dewhurst, J. 2017. Why Post‐cognitivism Does Not (Necessarily) Entail Anti‐Computationalism. Adaptive Behavior, 25(3), 117–128.
  • Villalobos, M. and Palacios, S. 2021. Autopoietic Theory, Enactivism, and Their Incommensurable Marks of the Cognitive. Synthese, 198 (Suppl 1), 71–87.
  • Villalobos, M. and Razeto-Barry, P. 2020. Are Living Beings Extended Autopoietic Systems? An Embodied Reply. Adaptive Behavior, 28 (1), 3-13.
  • Villalobos, M. and Silverman, D. 2018. Extended Functionalism, Radical Enactivism, and the Autopoietic Theory of Cognition: Prospects for a Full Revolution in Cognitive Science. Phenomenology and the Cognitive Sciences, 17 (4), 719-739.
  • Villalobos, M., and Ward, D. 2016. Lived Experience and Cognitive Science: Reappraising Enactivism’s Jonasian Turn. Constructivist Foundations, 11, 204–233
  • Villalobos, M. and Ward, D. 2015. Living Systems: Autopoiesis, Autonomy and Enaction. Philosophy and Technology, 28 (2), 225-239.
  • Vörös, S., Froese, T., and Riegler, A. 2016. Epistemological Odyssey: Introduction to Special Issue on the Diversity of Enactivism and Neurophenomenology. Constructivist Foundations, 11(2), 189–203.
  • Ward, D., Silverman, D., and Villalobos, M. 2017. Introduction: The Varieties of Enactivism. Topoi, 36(3), 365–375.
  • Weichold, M. and Rucińska, Z. 2021. Pretense as Alternative Sense-making: A Praxeological Enactivist Account. Phenomenology and the Cognitive Sciences. 21. 1131–1156.
  • Weichold, M. and Rucińska, Z. 2022 Praxeological Enactivism vs. Radical Enactivism: Reply to Hutto. Phenomenology and the Cognitive Sciences. 21. 1177-1182.
  • Werner, K. 2020. Enactment and Construction of the Cognitive Niche: Toward an Ontology of the Mind‐World Connection. Synthese, 197(3), 1313–1341.
  • Zahidi, K. 2021. Radicalizing Numerical Cognition. Synthese, 198 (Suppl 1): S529–S545.
  • Zahidi, K., and Myin, E. 2016. Radically Enactive Numerical Cognition. In G. Etzelmüller and C. Tewes (Eds.), Embodiment in Evolution and Culture (pp. 57–72). Tübingen, Germany: Mohr Siebeck.
  • Zahnoun F. 2021a. The Socio-Normative Nature of Representation. Adaptive Behavior. 29(4): 417-429.
  • Zahnoun, F. 2021b. Some Inaccuracies about Accuracy Conditions. Phenomenology and the Cognitive Sciences.
  • Zahnoun, F. 2021c. On Representation Hungry Cognition (and Why We Should Stop Feeding It). Synthese, 198 (Suppl 1), 267–284.
  • Zahnoun, F. 2020. Truth or Accuracy? Theoria 86 (5):643-650.
  • Zarco M., Egbert M. D. 2019. Different Forms of Random Motor Activity Scaffold the Formation of Different Habits in a Simulated Robot. In Fellermann H., Bacardit J., Goni-Moreno A., Fuchslin M. (Eds.) The 2019 Conference on Artificial Life. No. 31, 582-589
  • Zipoli Caiani, S. 2022. Intelligence Involves Intensionality: An Explanatory Issue for Radical Enactivism (Again). Synthese. 200: 132.

 

Author Information

Daniel D. Hutto
Email: ddhutto@uow.edu.au
University of Wollongong
Australia

The Compactness Theorem

The compactness theorem is a fundamental theorem for the model theory of classical propositional and first-order logic. As well as having importance in several areas of mathematics, such as algebra and combinatorics, it also helps to pinpoint the strength of these logics, which are the standard ones used in mathematics and arguably the most important ones in philosophy.

The main focus of this article is the many different proofs of the compactness theorem, applying different Choice-like principles before later calibrating the strength of these and the compactness theorems themselves over Zermelo-Fraenkel set theory ZF. Although the article’s focus is mathematical, much of the discussion keeps an eye on philosophical applications and implications.

We first introduce some standard logics, detailing whether the compactness theorem holds or fails for these. We also broach the neglected question of whether natural language is compact. Besides algebra and combinatorics, the compactness theorem also has implications for topology and foundations of mathematics, via its interaction with the Axiom of Choice. We detail these results as well as those of a philosophical nature, such as apparent ‘paradoxes’ and non-standard models of arithmetic and analysis. We then provide several different proofs of the compactness theorem based on different Choice-like principles.

In later sections, we discuss several variations of compactness in logics that allow for infinite conjunctions / disjunctions or generalised quantifiers, and in higher-order logics. The article concludes with a history of the compactness theorem and its many proofs, starting from those that use syntactic proofs before moving to the semantic proofs model theorists are more accustomed to today.

Contents

  1. Introduction
  2. Compactness: Common Logics and Natural Language
  3. Implications of Compactness
  4. Some Non-topological Proofs
  5. Connection to Topology
  6. Extensions and Generalisations
  7. Relative Strength
  8. History of the Compactness Theorem
  9. References and Further Reading

1. Introduction

A logic consists of a language, grammar, semantics, and consequence relation \vDash. If \Gamma is a set of sentences of this logic and \delta one of its sentences, \Gamma \vDash \delta means that any model of \Gamma is a model of \delta. (A model of \Gamma is a model of all sentences in \Gamma.) Informally, a logic is called compact if it is determined by its behaviour on finite sets of sentences; there may be infinitely many sentences in the language, but we can always reduce our considerations to finitely many in any given situation.
More formally, a logic is compact just when the following hold for every set \Gamma of its sentences:

  • If every finite subset \Gamma^\text{fin} of \Gamma is satisfiable then \Gamma is also satisfiable.
  • If \Gamma is an unsatisfiable set of sentences then so is \Gamma^\text{fin} for some finite subset \Gamma^\text{fin} of \Gamma.

Some authors take the compactness of a logic to be its satisfaction of these statements’ biconditional versions. We have chosen to omit the reverse implications from the definition, as they follow easily from the meaning of satisfiability and of \Gamma \vDash \delta. The two characterisations of compactness above are equivalent, since the second statement is effectively the contrapositive of the first. In a logic containing a classical negation connective (by which we mean that for each sentence \delta there is a sentence \neg \delta such that, for every model \mathfrak{M}, \mathfrak{M} \vDash \neg \delta if and only if \mathfrak{M} \nvDash \delta), both statements are equivalent to:

  • If \Gamma \vDash \delta then \Gamma^\text{fin} \vDash \delta for some finite subset \Gamma^\text{fin} of \Gamma.

This equivalence follows from

    \[\Gamma \vDash \delta \text{ if and only if } \Gamma \cup \{\neg \delta\} \text{ is unsatisfiable.}\]
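
To spell out one direction of this equivalence (a sketch; the symbol \Delta is introduced here for illustration): suppose the second statement holds and \Gamma \vDash \delta. Then

    \[\Gamma \cup \{\neg \delta\} \text{ is unsatisfiable} \;\Rightarrow\; \Delta \text{ is unsatisfiable for some finite } \Delta \subseteq \Gamma \cup \{\neg \delta\} \;\Rightarrow\; (\Delta \setminus \{\neg \delta\}) \cup \{\neg \delta\} \text{ is unsatisfiable} \;\Rightarrow\; \Delta \setminus \{\neg \delta\} \vDash \delta,\]

so \Gamma^\text{fin} = \Delta \setminus \{\neg \delta\} is the required finite subset of \Gamma. Conversely, suppose the third statement holds and \Gamma is unsatisfiable. If \Gamma is empty, it is its own finite unsatisfiable subset. Otherwise pick \gamma \in \Gamma; then \Gamma \vDash \neg \gamma holds vacuously, so \Gamma^\text{fin} \vDash \neg \gamma for some finite \Gamma^\text{fin} \subseteq \Gamma, and \Gamma^\text{fin} \cup \{\gamma\} is a finite unsatisfiable subset of \Gamma.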

The compactness theorem is said to hold for a logic precisely when the logic is compact.

Alongside its close cousin, the completeness theorem for first-order logic, the compactness theorem for first-order logic is one of the most important theorems in contemporary logic.
In this entry, we give a few examples of compact and incompact logics and briefly discuss whether natural languages such as English are compact (Section 2). We then mention some mathematical and philosophical implications of the compactness of first-order logic (Section 3). Following that, we give some non-topological proofs of the compactness of propositional and first-order logic (Section 4), followed by a topological proof of the propositional case, which gives the compactness theorem its name (Section 5). We continue with a sketch of some generalisations of the usual notion of compactness (Section 6), a calibration of the strength of the compactness theorems relative to the \textsf{ZF} axioms (Section 7), and end with some notes on the history of the compactness theorems (Section 8). Our discussion concerns the logics philosophers are most familiar with.

2. Compactness: Common Logics and Natural Language

a. Common Logics

Propositional logic is usually taken to consist of a set of sentential atoms \{p_1, \ldots, p_n, \ldots \} and some truth-functionally complete set of Boolean connectives, for instance \{\neg, \vee, \wedge\}. If we denote by \textsf{PL}_\kappa a propositional logic with \kappa sentential atoms, \textsf{PL}_\omega is a propositional logic with a countable infinity of atoms. (\kappa will usually be taken to be an infinite cardinal in what follows.) Any propositional logic \textsf{PL}_\kappa is compact, whatever its set of truth-functional connectives may be. Notice that when \kappa = n is finite, the compactness of any \textsf{PL}_n is a trivial consequence of the fact that any sentence is logically equivalent to a sentence drawn from a fixed set of size no greater than 2^{2^n}. (This set is of size exactly 2^{2^n} just when the set of connectives is truth-functionally complete.)
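
The counting claim for finite \kappa = n can be checked by brute force for small n. The following Python sketch is illustrative only: it encodes each truth function as a bitmask over the 2^n valuations and closes the atoms under a chosen set of connectives. The function name count_truth_functions and the connective labels "not", "and", "or" are assumptions introduced here, not notation from the article.

```python
from itertools import product


def count_truth_functions(n, connectives):
    """Count the truth functions on n atoms definable from the given connectives.

    A truth function is stored as an integer whose bit v is 1 exactly when the
    function is true under valuation v (v ranges over 0 .. 2**n - 1, and atom i
    is true under v when bit i of v is set).
    """
    num_valuations = 2 ** n
    full = (1 << num_valuations) - 1  # bitmask of the constantly true function
    # Truth tables of the atoms p_0, ..., p_{n-1}.
    tables = {
        sum(1 << v for v in range(num_valuations) if (v >> i) & 1)
        for i in range(n)
    }
    changed = True
    while changed:  # close under the connectives until nothing new appears
        new = set()
        if "not" in connectives:
            new |= {full ^ a for a in tables}
        if "and" in connectives:
            new |= {a & b for a, b in product(tables, repeat=2)}
        if "or" in connectives:
            new |= {a | b for a, b in product(tables, repeat=2)}
        changed = not new <= tables
        tables |= new
    return len(tables)


print(count_truth_functions(2, {"not", "and", "or"}))  # truth-functionally complete set
print(count_truth_functions(2, {"and", "or"}))         # incomplete set
```

With n = 2 the complete set yields 16 = 2^{2^2} truth functions, while conjunction and disjunction alone yield only 4, in line with the parenthetical remark above.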

First-order logic with standard semantics is also compact. This fact is of tremendous importance for logic and its applications, since first-order logic remains the canonical logic to this day, the widespread interest in higher, supplementary and alternative logics notwithstanding. By ‘first-order logic’, we understand throughout first-order logic with identity. First-order logic without identity is of course also compact, since it is a sublogic of first-order logic with identity.

By second-order logic we mean second-order logic with standard or full semantics, in which second-order n-place predicate variables range over all sets of n-tuples from the domain of interpretation (and similarly for functional variables). In contrast to first-order logic, second-order logic is not compact. To see this, let \exists_{\geq n} be a sentence of first-order logic satisfied in all and only models with domain of size \geq n; \exists_{\geq 1} may thus be taken as \exists x(x = x), \exists_{\geq 2} as \exists x \exists y\neg (x = y), and so on. Since first-order logic is a sublogic of second-order logic, \exists_{\geq n} is a sentence of second-order logic too. Consider next the sentence

    \[\exists R (R \textnormal{ is functional } \wedge R \textnormal{ is injective } \wedge \neg R \textnormal{ is surjective})\]

where R is a binary predicate variable, `R is functional’ abbreviates \forall x \exists ! y Rxy, `R is injective’ abbreviates \forall x \forall y \forall z ((Rxz \wedge Ryz) \to x = y) and `R is surjective’ abbreviates \forall y \exists x Rxy. This sentence is true in an interpretation just when the domain of that interpretation is Dedekind-infinite. The following second-order argument is then valid:

    \[\exists_{\geq 1}\]

    \[\exists_{\geq 2}\]

    \[\vdots\]

    \[\exists_{\geq n}\]

    \[\vdots\]

    \[\rule{11cm}{0.7pt}\]

    \[\exists R (R \text{ is functional } \wedge R \text{ is injective } \wedge \neg R \text{ is surjective})\]

However, no finite subset of the premisses entails the conclusion. For let the finite subset be \{\exists_{\geq i_1}, \exists_{\geq i_2}, \dots, \exists_{\geq i_k}\} and take m \geq \max \{i_1, i_2, \dots, i_k\}. Then there is a model of size m in which the k premisses \exists_{\geq i_1}, \exists_{\geq i_2}, \dots, \exists_{\geq i_k} are true but the argument’s conclusion is false. Hence second-order logic is not compact.
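The countermodel step can be verified mechanically for small domains. The sketch below is an illustration under stated assumptions, not part of the original text: it identifies functional relations on a domain of size m with functions from the domain to itself, and the helper names `satisfies_at_least` and `has_injective_nonsurjective_map` are hypothetical. It confirms that a domain of size m satisfies every premiss \exists_{\geq i} with i \leq m while falsifying the conclusion.

```python
from itertools import product

def satisfies_at_least(m, i):
    """A domain of size m satisfies the premiss 'there are at least i things' iff m >= i."""
    return m >= i

def has_injective_nonsurjective_map(m):
    """Search all functions on a domain of size m for an injective, non-surjective one.

    Functional binary relations correspond exactly to such functions, so this
    decides the second-order conclusion on the finite domain {0, ..., m-1}.
    """
    domain = range(m)
    for images in product(domain, repeat=m):  # images[x] plays the role of f(x)
        injective = len(set(images)) == m
        surjective = set(images) == set(domain)
        if injective and not surjective:
            return True
    return False

# A finite premiss set with indices 1, 3 and 4: take m = 4 as the countermodel size.
indices = [1, 3, 4]
m = max(indices)
print(all(satisfies_at_least(m, i) for i in indices), has_injective_nonsurjective_map(m))
```

The search is exponential in m, so it is only a sanity check for small sizes; the general fact is that every injective function on a finite set is surjective.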

The compactness theorem also typically, but not invariably, fails for infinitary logics. Any logic which allows infinite disjunctions, for example, is incompact, since the set of sentences \{c \neq c_i: i \in \omega \} \cup \{\bigvee_{i \in \omega} c = c_i\} is \emph{finitely-satisfiable} (every finite subset is satisfiable) but unsatisfiable. We return to infinitary logics and to generalisations of the notion of compactness in Section 6.

b. Natural Language

Natural languages are languages such as English, Mandarin, French, and Arabic. Formal languages in contrast are logical languages such as those of propositional, first-order and second-order logic. Although of great mathematical and philosophical importance, the latter are not `natural’ in the intended sense because they are not anyone’s native language and are only ever `spoken’, if at all, in limited contexts.

Is natural language, say English, compact? We must first clarify what the question means. Assume there is such a thing as the relation of logical consequence in natural language. For example, consider these two natural-language arguments:

    \[\text{Hypatia is a woman.}\]

    \[\text{All women are mortal.}\]

    \[\rule{4cm}{0.7pt}\]

    \[\text{Hypatia is mortal.}\]


    \[\text{Hypatia is mortal.}\]

    \[\text{All women are mortal.}\]

    \[\rule{4cm}{0.7pt}\]

    \[\text{Hypatia is a woman.}\]

The first argument is logically valid, whereas the second argument is invalid.
These two examples, of a logically valid English argument and a logically invalid one respectively, help us home in on the notion we are interested in, namely English’s logical validity, but they do not, of course, provide definitions of it. Next, let’s say that a natural language is compact just when, for any logically valid argument in this language, there is a logically valid argument whose conclusion remains the same, yet the premiss set is a finite subset of the original argument’s premiss set. This definition is the analogue of our last definition of compactness above for a formal language: if \Gamma \vDash \delta then \Gamma^\text{fin} \vDash \delta for some finite subset \Gamma^\text{fin} of \Gamma. An equivalent definition could be given based on the other definition of compactness for a formal language.

The overwhelming majority of linguists, philosophers, and other theorists of language take natural language to consist of infinitely many sentences. The idea is that since such sentences are of finite, but arbitrary, length, there must be infinitely many of them. The set of these sentences may be specified by a set of recursive procedures, which generate sentences of arbitrary length. For example, all the following are sentences of English:

    \[\text{My grandfather was tall;}\]

    \[\text{My great-grandfather was tall;}\]

    \[\text{My great-great-grandfather was tall;}\]

Now as a matter of empirical fact, there is some finite number N such that I do not have a great^{N}-grandfather (which N is the least such may be vague). That does not affect the point at issue, which is that these infinitely many sentences are bona fide sentences of English.

Consider now the following English analogue of the second-order-logic argument presented earlier:

    \[\textnormal{There is at least one planet.}\]

    \[\textnormal{There are at least two planets.}\]

    \[\vdots\]

    \[\textnormal{There are at least } n \textnormal{ planets.}\]

    \[\vdots\]

    \[\rule{7cm}{0.7pt}\]

    \[\textnormal{There are infinitely many planets.}\]

This argument appears to be valid. Clearly, no finite subset of the premiss set entails the conclusion. If this is right, the English consequence relation is incompact. The moral carries over to any natural language into which the argument is translatable.

Resistance to the argument for the incompactness of English may take several forms. One line of resistance, for instance, would query whether any natural language has a single consequence relation. According to this objection, there are only various consequence relations that arise from looking at, say, English through a particular theoretical lens; none is the correct one. Logical pluralists would all take this view, as would some other philosophers of logic: Beall and Restall (2006) and Shapiro (2014) defend different versions of logical pluralism. The objection, then, is that the argument for the incompactness of English given above assumes that there is a determinate notion: the English consequence relation.

This objection, and others, must be addressed before we can conclude that English is incompact. Here we do not take sides but flag the issue as an important one. For more discussion, see Paseau and Griffiths (2021), and Griffiths and Paseau (2022, chap. 5).

3. Implications of Compactness

This section draws some implications of the compactness of first-order logic. The sample below is a small selection from a list that could fill volumes. Our choices are mostly guided by their philosophical implications, although there are a few examples of primarily mathematical interest. From this point on, we assume some knowledge of elementary model theory (Chang and Keisler 1990; Hodges 1997).

  1. No compact logic extending first-order logic can express the notions of finitude or infinitude (of a model). Suppose towards a contradiction that \phi_F is satisfied by all and only finite models. Then \{\phi_F\} \cup \{\exists_{\geq n}: n \in \omega\} is unsatisfiable, and hence by compactness it must have an unsatisfiable finite subset, which must be a subset of \{\phi_F\} \cup \{\exists_{\geq i_1}, \ldots, \exists_{\geq i_k}\} for some i_1 < \ldots < i_k. But any finite model with domain of size \geq i_k satisfies (any subset of) \{\phi_F\} \cup \{\exists_{\geq i_1}, \ldots, \exists_{\geq i_k}\}, thereby contradicting our hypothesis. And if there were a sentence \phi_I satisfied by all and only infinite models then \neg \phi_I would be satisfied by all and only finite models, a hypothesis we have just refuted.

    This application of the compactness theorem is entirely typical. Schematically, one shows by contradiction that the class of models with some \omega-property (expressible by the conjunction of an infinite set S of first-order sentences) is not definable by a single sentence \phi. In these arguments, \neg \phi together with any finite subset of S is satisfiable, but \{\neg \phi\} \cup S is not, contradicting compactness.

    Informally speaking, in these applications the \omega-property is the conjunction of the n-properties; in our example, the n-property was having size at least n and the \omega-property was having infinite size.

    However, it is possible that (depending on the language) there is a sentence which implies that any of its models must be infinite. As an example, take the first-order language with a single unary function symbol f, and take the sentence

        \[\phi := \forall x \forall y (f(x) = f(y) \to x = y) \wedge \exists x \forall y \neg (f(y) = x).\]

    The models of this sentence are sets endowed with an injective but non-surjective function. By the pigeonhole principle, each such model must be infinite. Since this sentence cannot express infinitude, there must be an infinite model not satisfying \phi. An example is the domain of natural numbers, over which f is interpreted as the identity function on the natural numbers.

    As a further illustration of this technique, compactness implies that the class of torsion-free abelian groups is not finitely-axiomatisable in first-order logic. (A group is torsion-free if the identity is the only element with finite order. The prototypical example of a torsion-free abelian group is (\mathbb{Z}, +).) For if it were, by the single sentence \phi say, then the set

        \[\text{[abelian group axioms]} \cup \{\neg \phi\} \cup \{\forall x \ne 0 (\underbrace{x + \cdots + x}_{n \text{ times}} \ne 0) : n = 1, 2, \dots\}\]


    would be finitely-satisfiable (for a finite subset in which the sums in the sentences from the rightmost set have at most n summands, take the integers under addition modulo p for some prime p > n). But the set itself is unsatisfiable, since

        \[\text{[abelian group axioms]} \cup \{\forall x \ne 0 (\underbrace{x + \cdots + x}_{n \text{ times}} \ne 0) : n = 1, 2, \dots\}\]

    is satisfied by all and only torsion-free abelian groups. This contradicts compactness. Note that here the \omega-property is that all non-identity elements of the model have infinite order, and the n-property that all non-identity elements have order \neq n (all these properties incorporate the abelian group axioms too).

    As a still further illustration, the same type of argument shows that the class of algebraically-closed fields is also not finitely-axiomatisable. The relevant \omega-property is that the field is algebraically-closed, and the relevant n-property is

        \[\forall y_1 \forall y_2 \ldots \forall y_n \exists x(x^n + y_1 \cdot x^{n-1} + \cdots + y_{n-1} \cdot x + y_n = 0)\]

    This style of argument is easily applied to many other domains.

  2. Suppose \Sigma_1 and \Sigma_2 are two sets of sentences of a compact logic (in which conjunction and negation are definable) with the property that every model satisfies either \Sigma_1 or \Sigma_2 but not both. Then there is a sentence \sigma such that \Sigma_1 is logically equivalent to \sigma (meaning that \{\sigma\} and \Sigma_1 have the same models) and \Sigma_2 is logically equivalent to \neg \sigma (Chang and Keisler 1990, p. 12, cor. 1.2.15).
  3. The compactness theorem may be used to show that any first-order theory of arithmetic T_{AR} satisfied by the standard model has a non-standard model. (By the standard model of arithmetic, for the theory T_{AR} in question, we mean the structure of natural numbers with the standard interpretation of the non-logical symbols in the language of T_{AR}; the constant \overline{0} denotes 0, the two-place symbol + denotes addition, \cdot denotes multiplication, and so forth. A non-standard model can have `infinite’ elements: objects greater than those denoted by the terms \overline{n} for all n. As an ordered set, a non-standard model looks like \mathbb{N} followed by blocks of \mathbb{Z}, which themselves form a dense linear order without endpoints.) Assuming that each numeral \overline{n} is definable in T_{AR}, consider

        \[T_{AR}^+ = T_{AR} \cup \{c \neq \overline{n}:n \in \omega \}\]

    where c is any constant not in the language of T_{AR}. Any finite subset of T_{AR}^+ is satisfied by the standard model, because we may interpret c as a number distinct from those n where c \ne \overline{n} occurs in the given finite subset. Hence by compactness, T_{AR}^+ has a model \mathfrak{M}. The reduct of \mathfrak{M} to the language of T_{AR} is non-standard since it contains an element c^{\mathfrak{M}} (the denotation of c in \mathfrak{M}) not identical to any natural number.

    Supplementing the argument just given with an appeal to the downward Löwenheim-Skolem Theorem (Chang and Keisler 1990, p. 66, cor. 2.1.4; Hodges 1997, p. 72, cor. 3.1.4) shows that any such first-order theory of arithmetic T_{AR} is not even \aleph_0-categorical, since it has a countably infinite non-standard model. (For an infinite cardinal \kappa, a theory is \kappa-categorical if it has exactly one model, up to isomorphism, of cardinality \kappa.) Observe that a theory may be complete even if it is not \aleph_0-categorical. To see this, run the previous argument supplemented by an application of the downward Löwenheim-Skolem Theorem for a complete theory of arithmetic T_{AR}. More generally, a theory may fail to be \lambda-categorical for every infinite cardinal \lambda and still be complete; in other words, the converse of the Łoś-Vaught test (Chang and Keisler 1990, p. 139, prop. 3.1.7) fails.

    The same general idea can be used to demonstrate the existence of non-standard models of real analysis. Let T_{AN} be a first-order theory of analysis satisfied by the standard model (the ordered field of real numbers). As above, consider T_{AN}^+ = T_{AN} \cup \{\overline{0} < c < \overline{n}^{-1} : n = 1,2, \ldots\} where c is any constant not in the language of T_{AN}. Any finite subset of T_{AN}^+ is satisfied by the standard model, because we may interpret c as a positive real number smaller than \frac{1}{n}, where n is the largest number for which the sentence \overline{0} < c < \overline{n}^{-1} is in the given finite subset. Hence by compactness, T_{AN}^+ has a model \mathfrak{M}. The reduct of \mathfrak{M} to the language of T_{AN} is non-standard since it contains an element c^{\mathfrak{M}} not identical to any real number. Indeed, this element must be a positive infinitesimal, meaning that it is a number greater than \overline{0}^{\mathfrak{M}} but smaller than every (\overline{n}^{-1})^{\mathfrak{M}}. As well as infinitesimals, our non-standard model also contains infinite elements, since the model satisfies \forall x \neq 0 \exists y(x \cdot y = 1) and thus any non-zero element has an inverse. From these foundations, a consistent theory of the calculus that revives to a degree the use of infinitesimals in early modern mathematics may be constructed, called non-standard analysis: see Goldblatt (1998) or the original Robinson (1966). Furthermore, since \mathbb{R} and \mathfrak{M} have the same theory, any sentence in the language of T_{AN} that can be proven from T_{AN}^+ will hold in \mathfrak{M} and thus automatically holds in \mathbb{R}. This is the transfer principle of non-standard analysis.

    Since second-order logic is incompact, the arguments just given fail for second-order theories of arithmetic and analysis. Indeed, there are categorical second-order axiomatisations of arithmetic (for example, Peano Arithmetic with second-order induction (Shapiro 1991, p. 82, thm. 4.8)) and real analysis (for example, the axioms for a complete ordered field (Shapiro 1991, p. 84, thm. 4.10)). For more on second-order theories, consult Shapiro (1991, chap. 3–6).

  4. Assuming the downward Löwenheim-Skolem Theorem, another corollary of the compactness of first-order logic is the upward Löwenheim-Skolem Theorem (Chang and Keisler 1990, p. 67, cor. 2.1.6; Hodges 1997, p. 127, cor. 5.1.4). This upward version of the theorem states that if a first-order language \mathcal{L} has cardinality \leq \lambda and \mathfrak{M} is an infinite model with domain of cardinality \leq \lambda then \mathfrak{M} has an elementary extension of cardinality \lambda. For the proof, we consider the set of sentences consisting of the elementary diagram of \mathfrak{M} (the set \{\phi \in \mathcal{L}(\mathfrak{M}) : \mathfrak{M} \vDash \phi\}, where \mathcal{L}(\mathfrak{M}) is the extension of \mathcal{L} obtained by adding constant symbols c_m for each m in the domain of \mathfrak{M}, and the interpretation of c_m in \mathfrak{M} is simply m) together with the sentences c_\alpha \ne c_\beta for all distinct \alpha, \beta \in \lambda, where the c_{\alpha} are new constants.

    This set is finitely-satisfiable (because the infinite model \mathfrak{M} satisfies any finite subset), and hence by compactness it is satisfiable. Say that it is satisfied by a model \mathfrak{N}, which must be of size \geq \lambda as it satisfies \{c_{\alpha} \neq c_{\beta}: \alpha, \beta \in \lambda \textnormal{ s.t. } \alpha \neq \beta\}. Since \mathfrak{N} also satisfies the elementary diagram of \mathfrak{M}, an elementary embedding of \mathfrak{M} into \mathfrak{N} exists, and thus there is an elementary extension \mathfrak{O} of \mathfrak{M} with domain of size \geq \lambda (\mathfrak{O} is an isomorph of \mathfrak{N} whose domain includes that of \mathfrak{M}). To find an elementary extension of \mathfrak{M} of size exactly \lambda, now apply the downward Löwenheim-Skolem Theorem to \mathfrak{O}.

    The upward Löwenheim-Skolem Theorem may be applied to show not only that theories of arithmetic and analysis satisfied by their respective standard models have non-standard models, but also that they have non-standard models of every infinite cardinality. More generally, any first-order theory in a countable language satisfied by an infinite model has models of every infinite cardinality. Applying this to \textsf{ZF} or \textsf{ZFC}, we obtain Skolem’s paradox: if there exists a model of set theory, then there exists a countably-infinite model, which will nonetheless model the existence of uncountable sets (Skolem 1922, 295–96).

    The upward and downward Löwenheim-Skolem Theorems and the so-called Skolem paradox have generated much philosophical debate. The most famous philosophical use of these results is found in Putnam (1980). Wielding the theorems, Putnam argued that our mathematical-scientific theories do not admit a determinate interpretation. In particular, he claimed that set theory augmented with any theoretical principles from science and scientific data we care to add admits a countable model. This argument has given rise to an extensive literature. The responses to Putnam’s argument, and to other versions, include technical discussion of exactly what Putnam’s argument requires mathematically speaking, as well as philosophical commentary.
  5. The compactness theorem may be used to prove the Order-Extension Principle (also known as Szpilrajn’s Extension Theorem): any partial order may be extended to a linear order. A partial order (A, <) consists of a domain A with an irreflexive and transitive relation <. A linear order is a partial order satisfying the additional linearity axiom \forall x \forall y(x < y \vee x = y \vee y < x). A linear order (A, <_*) extends the partial order (A, <) just when for all a_1, a_2 in A, if a_1 < a_2 then a_1 <_* a_2; in other words, the identity map from (A, <) into (A, <_*) is a homomorphism. The notions of partial and linear order are first-order definable in the language consisting of a single two-place non-logical predicate, which is what permits the application of the compactness theorem.

    Unusually for applications of the compactness theorem, this result is of potential importance outside mathematics, logic, and philosophy. A standard assumption in economics is that a subject’s preferences over goods are linearly ordered. Empirically, however, we find that people’s preferences tend at best to be partially ordered: I may prefer going out for a Japanese meal rather than an Italian meal, but I may have no preference between going to the cinema and going out for a Japanese meal. In light of the Order-Extension Principle, one might try to argue that preferences being linearly ordered is a justifiable idealisation of the empirical data. See Szpilrajn (1930) for the first proof of this theorem and Richter (1966) for an early economic application, as well as Gonczarowski, Kominers, and Shorrer
    (2019) for more applications of the compactness theorem in economics.

    How does one prove the Order-Extension Principle? A relatively straightforward proof by induction on the size of the domain shows that every partial order with a finite domain can be extended to a linear order. We may then use the compactness theorem to prove that every partial order can be extended to a linear order, whatever the size of the domain, be it finite or infinite. Given a partial order (A, <), let \Sigma_A be the set of sentences consisting of the axioms for a partial order, the positive (Robinson) diagram of (A, <) (the positive diagram of an \mathcal{L}-structure \mathfrak{A} is the set of atomic sentences in the language \mathcal{L}' = \mathcal{L} \cup \{c_a : a \in A\}, where the c_a are new constant symbols, satisfied in the expansion \mathfrak{A}' of \mathfrak{A} where we interpret c_a^{\mathfrak{A}'} := a for each a \in A), the sentences \neg (c_a = c_b) for all distinct a, b \in A, and the axioms c_a < c_b \vee c_b < c_a for all distinct a, b \in A. (The negated order literals of the full diagram must be left out, since they would contradict these last axioms whenever a and b are incomparable.) Since every partial order with a finite domain can be extended to a linear order, it follows that any finite subset of \Sigma_A is satisfiable. By compactness, \Sigma_A is therefore satisfiable. If \mathfrak{M} is a model of \Sigma_A then the map f sending each a \in A to the denotation of c_a in \mathfrak{M} is an injective homomorphism from (A, <) into the \mathcal{L}_<-reduct of \mathfrak{M}, which we call (B, <_B). The required linear order extending (A, <) is the inverse image under f of the restriction of the order <_B to f(A), that is a_1 <_* a_2 if and only if f(a_1) <_B f(a_2).

    Note in passing that the Order-Extension Principle implies that every set A can be linearly ordered: simply consider a linear order that extends the empty partial order on A.
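The finite case of the Order-Extension Principle, on which the compactness argument in item 5 relies, can be made concrete. The following is a minimal sketch under stated assumptions: the partial order is given as a finite list of elements plus a set of pairs (a, b) meaning a < b, assumed irreflexive and transitive, and the function name `linear_extension` is hypothetical. It repeatedly removes a minimal element, which mirrors the induction on the size of the domain.

```python
def linear_extension(elements, less_than):
    """Return the elements listed in an order that extends the strict partial order.

    `less_than` is a set of pairs (a, b) meaning a < b; it is assumed to be
    irreflexive and transitive, so every non-empty finite subset has a minimal element.
    """
    remaining = list(elements)
    ordering = []
    while remaining:
        # A minimal element has no remaining element strictly below it.
        minimal = next(a for a in remaining
                       if not any((b, a) in less_than for b in remaining))
        ordering.append(minimal)
        remaining.remove(minimal)
    return ordering

# The preference example from the text: Italian is ranked strictly below Japanese,
# and the cinema is incomparable with both.
options = ["japanese", "italian", "cinema"]
prefers = {("italian", "japanese")}
ranking = linear_extension(options, prefers)
position = {option: index for index, option in enumerate(ranking)}
print(all(position[a] < position[b] for a, b in prefers))
```

Different choices of minimal element give different linear extensions; the sketch simply takes the first one it finds, so the resulting ranking is one admissible idealisation among several.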

The compactness theorem for first-order logic has a great many other applications to model theory—as Keisler (1965, 113) puts it, “the most useful theorem in model theory is probably the compactness theorem”—as well as to set theory, other parts of logic, combinatorics, algebra, algebraic geometry; see Hodges (1997), especially chapter 5, for more.
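Returning to the torsion-free example in item 1 above, the finite-satisfiability step can be checked numerically. This is an illustrative sketch only: the helper names `next_prime_above` and `no_torsion_up_to` are hypothetical, and the check covers just the arithmetic fact that in the integers modulo a prime p > n no non-zero element is annihilated by k-fold addition for any k \leq n, although every element is annihilated by p-fold addition.

```python
def is_prime(k):
    """Trial-division primality test, adequate for the small numbers used here."""
    return k >= 2 and all(k % d != 0 for d in range(2, int(k ** 0.5) + 1))

def next_prime_above(n):
    """Smallest prime strictly greater than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate

def no_torsion_up_to(modulus, n):
    """True iff k*x != 0 (mod modulus) for every non-zero x and every 1 <= k <= n."""
    return all((k * x) % modulus != 0
               for x in range(1, modulus)
               for k in range(1, n + 1))

n = 6
p = next_prime_above(n)
# Z/pZ satisfies the first n torsion-freeness sentences, yet it is not torsion-free.
print(no_torsion_up_to(p, n), all((p * x) % p == 0 for x in range(p)))
```

The primality of the modulus matters: with modulus 6 and n = 3, for instance, the element 2 is already annihilated by 3-fold addition.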

4. Some Non-topological Proofs

In this section, we give three styles of proof of the compactness of propositional and first-order logic that do not draw on topology. These three proofs illustrate some important techniques, and could be broadly classified as deductive, syntactic, and semantic methods of proof.

a. Deductive: Proofs via Soundness and Completeness

We recall the definitions of soundness and completeness for a logic \mathcal{L} equipped with semantic consequence relation \vDash and deductive consequence relation \vdash. The logic is complete iff for any set of well-formed formulas \Gamma and any well-formed formula \delta, if \Gamma \vDash \delta then \Gamma \vdash \delta; and it is sound iff the converse obtains: if \Gamma \vdash \delta then \Gamma \vDash \delta.

If a logic has a sound and complete proof procedure, it is compact. A simple argument demonstrates this:

(1) \Gamma \vDash \delta Assumption

(2) \Gamma \vdash \delta From (1) by Completeness

(3) \Gamma^\text{fin} \vdash \delta From (2) by the finiteness of proofs

(4) \Gamma^\text{fin} \vDash \delta From (3) by Soundness

Here \Gamma^\text{fin} is some finite subset of \Gamma. Anything deserving of the name of `proof procedure’ usually satisfies a host of syntactic requirements. Given soundness and completeness the only such requirement needed for the validity of the inference above is that the step from (2) to (3) be valid, namely that proofs draw only on finitely many premisses. The argument just given therefore applies to any logic which has a sound and complete proof procedure in this liberal sense.

Thus, no incompact logic can be completable by a sound proof procedure; second-order logic in particular cannot.

The proof of compactness via completeness is in an important respect unsatisfactory because it is based on properties incidental to the semantic property of interest. The proof derives compactness, a semantic property, from a property of the logic relating its syntax to its semantics. Thus Keisler (1965, 113): “unlike the completeness theorem, the compactness theorem does not involve the notion of a formal deduction, and so it is desirable to prove it directly without using that notion”. Indeed, from the perspective of a model theorist who sees talk of syntax as a heuristic for the study of certain relations between structures that happen to have syntactic correlates, proving compactness via completeness is tantamount to heresy (Poizat 2000, 53).

b. Syntactic: Henkin-Style Proofs

This proof is modelled on Henkin’s (1949b) proof of the completeness theorem for first-order logic, with its deductive core replaced by a semantic one. We begin with a relatively concrete argument for the simplest case of \textsf{PL}_\omega before passing to more abstract versions. Unlike the usual versions of the proof, our argument does not assume any particular set of truth-functional connectives, only that the set of sentences is countably-infinite (and thus the set of connectives is countable); the argument is usually simpler (but less general) when a particular set of connectives has been specified.

Let \{s_i: i \in \omega\} be an enumeration of the set of sentences of \textsf{PL}_\omega. Given a finitely-satisfiable set \Gamma, define a denumerable sequence of sets \Gamma_n as follows:

    \begin{gather*} \Gamma_0 = \Gamma, \\ \Gamma_{n+1} = \begin{cases} \Gamma_{n} \cup \{s_n \} & \text{if } \Gamma_{n} \cup \{s_n \} \textnormal{ is finitely-satisfiable}, \\ \Gamma_{n} & \textnormal{ otherwise}, \end{cases} \\ \Gamma_\omega = \bigcup_{n \in \omega} \Gamma_n. \end{gather*}

By definition, \Gamma_{n} is finitely-satisfiable for all n, hence \Gamma_\omega is finitely-satisfiable since any finite subset of \Gamma_\omega must be drawn from some \Gamma_n.

From \Gamma_{\omega}, define the valuation v by letting v(p) = 1 if and only if p \in \Gamma_{\omega}, where p is a sentence letter. We now prove that v is a valuation in which all the sentences in \Gamma_{\omega} (not just the sentence letters) are true. Suppose towards a contradiction that v(\phi) = 0 for some \phi \in \Gamma_\omega. Without loss of generality (renumbering sentence letters if necessary), let p_1, \ldots, p_k be all the sentence letters in \phi with truth-value 1 under v and p_{k+1},\ldots, p_n all the sentence letters in \phi with truth-value 0 under v (one of these sets may be empty, in which case the argument to follow is easily modified). By the definition of v, none of p_{k+1},\ldots, p_n is in \Gamma_\omega so for each of these, let \Delta_i be a finite subset of \Gamma_\omega such that \Delta_i \cup \{p_i\} is unsatisfiable (k+1 \leq i \leq n); some such \Delta_i must exist given that p_i was omitted in the construction of \Gamma_\omega. Now consider the set

    \[S := \{p_1, \ldots, p_k, \phi\} \cup \left( \bigcup_{i = k + 1}^n \Delta_i \right).\]

Any valuation in which all the elements of \bigcup_{i = k + 1}^n \Delta_i are true is one in which each of p_{k+1}, \ldots, p_n is false, since \Delta_i \cup \{p_i\} is unsatisfiable, and so any valuation in which p_1, \dots, p_k and all elements of \bigcup_{i = k + 1}^n \Delta_i are true is one in which the sentence letters p_{k+1}, \ldots, p_n have truth-value 0. It follows that any such valuation is one in which \phi is false, since \phi contains no sentence letters other than p_1, \ldots, p_k, p_{k+1}, \ldots, p_n and v(\phi) = 0. Hence S is unsatisfiable, contradicting the fact that this set is a finite subset of \Gamma_\omega. Having proved by reductio that any sentence in \Gamma_{\omega} is true under v, it follows that \Gamma, which is a subset of \Gamma_\omega, is satisfiable. It is obvious from its construction that \Gamma_\omega is a maximal finitely-satisfiable set, meaning that it is finitely-satisfiable and none of its proper extensions are finitely-satisfiable.
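The construction of \Gamma_\omega and of the valuation v can be imitated on a finite toy fragment. The sketch below makes several assumptions that are not in the original: sentences are nested tuples built from "atom", "not", "and" and "or"; only finitely many atoms and a finite enumeration of sentences are used; and, because every set involved is finite, finite satisfiability is replaced by plain satisfiability, checked by brute force. All function names are hypothetical.

```python
from itertools import product

def evaluate(sentence, valuation):
    """Truth value of a tuple-encoded sentence under a dict from atom names to booleans."""
    op = sentence[0]
    if op == "atom":
        return valuation[sentence[1]]
    if op == "not":
        return not evaluate(sentence[1], valuation)
    if op == "and":
        return evaluate(sentence[1], valuation) and evaluate(sentence[2], valuation)
    if op == "or":
        return evaluate(sentence[1], valuation) or evaluate(sentence[2], valuation)
    raise ValueError(f"unknown connective: {op}")

def satisfiable(sentences, atoms):
    """Brute-force search over all valuations of the given atoms."""
    return any(all(evaluate(s, dict(zip(atoms, values))) for s in sentences)
               for values in product([False, True], repeat=len(atoms)))

def extend(gamma, enumeration, atoms):
    """Add each enumerated sentence in turn whenever the result stays satisfiable."""
    current = set(gamma)
    for s in enumeration:
        if satisfiable(current | {s}, atoms):
            current.add(s)
    return current

def valuation_from(gamma_omega, atoms):
    """v(p) = 1 if and only if the sentence letter p belongs to the extended set."""
    return {p: ("atom", p) in gamma_omega for p in atoms}

atoms = ["p", "q"]
p, q = ("atom", "p"), ("atom", "q")
gamma = {("or", p, q), ("not", p)}
enumeration = [p, q, ("and", p, q), ("not", q)]
gamma_omega = extend(gamma, enumeration, atoms)
v = valuation_from(gamma_omega, atoms)
print(all(evaluate(s, v) for s in gamma_omega))
```

In this run the letter p is rejected and q is accepted, so v makes p false and q true, and every member of the extended set, including the original members of \Gamma, comes out true under v.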

A more abstract version of the argument just presented runs as follows. Suppose \Gamma is finitely-satisfiable. Order by inclusion the set F_\Gamma of finitely-satisfiable sets of sentences of the language containing \Gamma. F_\Gamma is non-empty, since it contains at least \Gamma. Any chain in F_\Gamma has an upper bound, obtained by taking the union of the elements in the chain: this union contains \Gamma as a subset since all the members of the chain do, and it is finitely-satisfiable since any of its finite subsets must come from some element of the chain, which by hypothesis is finitely-satisfiable. Zorn’s Lemma, provable in \textsf{ZFC}, states precisely that every partial order with the property that every chain has an upper bound has a maximal element. Since the conditions of Zorn’s Lemma are satisfied, we deduce from it that F_\Gamma has a maximal element, that is to say, a maximal finitely-satisfiable set extending \Gamma. We then reason as in the previous paragraph to show that all the elements of \Gamma are true under the valuation v defined on sentence letters by v(p) = 1 if and only if p is a member of this maximal finitely-satisfiable set. Nowhere did we rely on the fact that the sentence letters are denumerably many, or on any assumption about the set of connectives. So this more general argument shows that \textsf{PL}_\kappa is compact for any cardinal \kappa whatsoever.

The abstract argument just given invoked Zorn’s Lemma, well known to be equivalent to the Axiom of Choice in \textsf{ZF}. Is this use of a \textsf{ZF}-equivalent of Choice necessary? No. A weaker principle than Zorn’s Lemma, namely the Ultrafilter Lemma, will do. (Section 7 has more on the relative strengths in \textsf{ZF} of the Ultrafilter Lemma and Zorn’s Lemma. Consult Moore (1982) for more on the Axiom of Choice and its foundational significance.) To explain what the Ultrafilter Lemma is, we must first define the notion of a filter on a Boolean algebra. We denote the bottom, top, join, meet and complement in a Boolean algebra (with which we assume basic familiarity, see Givant and Halmos (2009) for an introduction) by the symbols 0, 1, \vee, \wedge, \neg; the derived strong and weak inequality symbols have their customary meanings (x \leq y means that x \wedge y = x and so on). A filter on a Boolean algebra B is then a subset \mathcal{F} of B such that:

  1. 1 \in \mathcal{F}; 0 \notin \mathcal{F};
  2. \forall x,y \in \mathcal{F}, x \wedge y \in \mathcal{F};
  3. \forall x,y \in B, \textnormal{if } x \leq y \textnormal{ and } x \in \mathcal{F} \textnormal{ then } y \in \mathcal{F}.

An ultrafilter \mathcal{F} is a maximal filter on B, or alternatively a filter on B that contains exactly one of b or \neg b for each b \in B. The Ultrafilter Lemma, sometimes called the `Ultrafilter Theorem’ or `Ultrafilter Principle’, states that any filter on a Boolean algebra may be extended to an ultrafilter.
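The filter and ultrafilter conditions can be tested directly on a small Boolean algebra. The following sketch assumes, for illustration only, that the algebra is the powerset of a finite set, with intersection as meet, inclusion as \leq, the whole set as 1 and the empty set as 0; the function names are hypothetical. On such finite algebras every filter visibly extends to an ultrafilter, so the example illustrates the definitions rather than the strength of the Ultrafilter Lemma.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of the universe, as frozensets: the elements of the Boolean algebra."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_filter(F, universe):
    """Check the three filter conditions for a set F of frozensets."""
    top, bottom = frozenset(universe), frozenset()
    algebra = powerset(universe)
    contains_top_not_bottom = top in F and bottom not in F
    closed_under_meet = all(x & y in F for x in F for y in F)
    upward_closed = all(y in F for x in F for y in algebra if x <= y)
    return contains_top_not_bottom and closed_under_meet and upward_closed

def is_ultrafilter(F, universe):
    """A filter containing exactly one of b and its complement, for every b."""
    top = frozenset(universe)
    return is_filter(F, universe) and all((b in F) != ((top - b) in F)
                                          for b in powerset(universe))

universe = {0, 1, 2}
# The filter of all supersets of {0, 1}, and the principal ultrafilter at 0 extending it.
coarse = {b for b in powerset(universe) if frozenset({0, 1}) <= b}
principal = {b for b in powerset(universe) if 0 in b}
print(is_filter(coarse, universe), is_ultrafilter(coarse, universe),
      is_ultrafilter(principal, universe), coarse <= principal)
```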

Armed with the notion of an ultrafilter, we may now modify the abstract proof of \textsf{PL}_\kappa’s compactness to rely not on Zorn’s Lemma but on the \textsf{ZF}-weaker Ultrafilter Lemma instead. (Henceforth we assume that the set of propositional connectives is truth-functionally complete.) Consider the Boolean algebra B whose domain is the set of equivalence classes of sentences of \textsf{PL}_\kappa under logical equivalence, 0 and 1 being the equivalence class of a contradiction and tautology respectively, \vee, \wedge and \neg respectively denoting disjunction, conjunction and negation on the equivalence class representatives (easy check: this is well-defined); such a Boolean algebra is usually called a Lindenbaum algebra. To simplify notation, we denote the equivalence class of \phi by \phi itself; observe that the equivalence of \psi \leq \phi and \psi \vDash \phi follows from the fact that \phi \wedge \psi and \psi are logically equivalent if and only if \psi \vDash \phi.

Given a finitely-satisfiable set \Gamma, form the set \Gamma^+ consisting of all the sentences entailed by some finite subset of \Gamma, that is,

    \[\Gamma^+ = \{\phi: \exists \Gamma^\text{fin} \subseteq \Gamma \text{ finite s.t.~} \Gamma^\text{fin} \vDash \phi \}.\]

Clearly, \Gamma is a subset of \Gamma^+, and \Gamma^+ also has the following three properties:

  1. 1 \in \Gamma^+ but 0 \notin \Gamma^+ (since \Gamma is finitely-satisfiable);
  2. \forall x,y \in \Gamma^+, x \wedge y \in \Gamma^+;
  3. \forall x,y \in B, \textnormal{if } x \leq y \textnormal{ and } x \in \Gamma^+ \textnormal{ then } y \in \Gamma^+.

In other words, \Gamma^+ is a filter on the Boolean algebra of sentences of \textsf{PL}_\kappa. It is the smallest filter containing \Gamma, that is to say, \Gamma is a filter-base for \Gamma^+. By the Ultrafilter Lemma, \Gamma^+ may be extended to an ultrafilter. The alternative definition of an ultrafilter shows that this ultrafilter is a maximal finitely-satisfiable set of sentences containing \Gamma. The rest of the proof proceeds as above. We have thus proved the compactness of \textsf{PL}_\kappa using the Ultrafilter Lemma rather than Zorn’s Lemma.
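The construction of \Gamma^+ can be traced in a finite setting. The sketch below assumes a propositional language with just two sentence letters, and codes the equivalence class of a sentence as the set of (indices of) valuations satisfying it; the names `cls`, `entails`, `gamma_plus` and `principal` are hypothetical helpers introduced only for this illustration. With finitely many letters every ultrafilter is principal, so the example shows how an ultrafilter extending \Gamma^+ corresponds to a valuation satisfying \Gamma, not why the Ultrafilter Lemma is needed in general.

```python
from itertools import combinations, product

LETTERS = ("p", "q")
VALUATIONS = [dict(zip(LETTERS, bits))
              for bits in product((0, 1), repeat=len(LETTERS))]
TOP = frozenset(range(len(VALUATIONS)))      # class of a tautology
BOTTOM = frozenset()                         # class of a contradiction
ALGEBRA = [frozenset(c) for r in range(len(TOP) + 1)
           for c in combinations(sorted(TOP), r)]

def cls(sentence):
    """Equivalence class of a sentence, coded by the valuations that satisfy it."""
    return frozenset(i for i, v in enumerate(VALUATIONS) if sentence(v))

def entails(premises, conclusion):
    """premises |= conclusion: every valuation satisfying all premises satisfies it."""
    models = TOP
    for p in premises:
        models = models & p
    return models <= conclusion

def gamma_plus(gamma):
    """All classes entailed by some finite subset of gamma."""
    gamma = list(gamma)
    finite_subsets = [s for r in range(len(gamma) + 1)
                      for s in combinations(gamma, r)]
    return {phi for phi in ALGEBRA
            if any(entails(s, phi) for s in finite_subsets)}

def is_filter(family):
    return (TOP in family and BOTTOM not in family
            and all((x & y) in family for x in family for y in family)
            and all(y in family for x in family for y in ALGEBRA if x <= y))

def principal(i):
    """The ultrafilter of all classes true under valuation number i."""
    return {x for x in ALGEBRA if i in x}

# Gamma = {p or q, p -> q}; both sentences hold exactly where q holds.
GAMMA = [cls(lambda v: v["p"] or v["q"]),
         cls(lambda v: (not v["p"]) or v["q"])]
GAMMA_PLUS = gamma_plus(GAMMA)
# Valuations whose principal ultrafilter extends GAMMA_PLUS.
EXTENDING = [i for i in sorted(TOP) if GAMMA_PLUS <= principal(i)]
```

Each index in `EXTENDING` picks out a valuation that makes every member of `GAMMA` true, which is the finite shadow of the step from an ultrafilter to a satisfying valuation.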

The compactness of first-order logic may be demonstrated by a similar argument. We give two versions of the argument: the first and slightly more direct version uses the Axiom of Choice; the second uses only the weaker Ultrafilter Lemma. For the first version, let \kappa be the size of \text{WFF}(\mathcal{L}) \times \text{Var}(\mathcal{L}) where \mathcal{L} is the first-order language in which each of the sentences in our given finitely-satisfiable set \Gamma is expressed, \text{WFF}(\mathcal{L}) is the set of formulas of \mathcal{L}, and \text{Var}(\mathcal{L}) is \mathcal{L}’s set of variables. Add to the language a set of \kappa new constants (usually known as nullary Skolem functions), disjoint from \mathcal{L} and to be used as a source of witnesses, and call the extended language \mathcal{L}^+. Then \mathcal{L}^+ also has size \kappa (since \kappa^{< \aleph_0} = \kappa for infinite \kappa), and so we can well-order both the set of formulas \text{WFF}(\mathcal{L}^+) and the set of new constant symbols in order-type \kappa. We now augment the set \Gamma recursively by exploiting the well-ordering of \text{WFF}(\mathcal{L}^+) and the well-ordering of the set of new constants (each well-ordering being of type \kappa). Let \Gamma_0 = \Gamma and \Gamma_\lambda = \bigcup_{\alpha < \lambda} \Gamma_\alpha for \lambda a limit. For the successor case, let

    \[\Gamma_{\alpha + 1} := \Gamma_\alpha \cup \{\exists x \phi \to \phi[c\backslash x]\},\]

where \phi[c\backslash x] denotes \phi with c substituted for any free occurrences of x in \phi. This clause applies if \exists x \phi is the \alpha^{th} formula in the ordering of \text{WFF}(\mathcal{L}^+), and c is then the first constant in the ordering of the set of new constants that appears neither in any element of \Gamma_\alpha nor in \phi; otherwise (if the \alpha^{th} formula is not existential) let \Gamma_{\alpha + 1} := \Gamma_\alpha. Finally, let \Gamma_\kappa := \bigcup_{\alpha < \kappa} \Gamma_\alpha and define \Gamma^+ as the set of sentences entailed by some finite subset of \Gamma_\kappa, that is,

    \[\Gamma^+ := \{\phi \in \mathcal{L}^+: \exists \Gamma^\text{fin} \subseteq \Gamma_\kappa \text{ finite s.t.~} \Gamma^\text{fin} \vDash \phi \}.\]

An identical argument to the one in the propositional case now shows that \Gamma^+ is a filter on the Boolean algebra of the first-order \mathcal{L}^+-sentences quotiented by logical equivalence. The only difference lies in the verification that 0 \notin \Gamma^+, in other words, that \Gamma^+ is finitely-satisfiable. For if

    \[\{\gamma_1, \ldots, \gamma_n, \exists x_{i_1} \phi_{j_1} \to \phi_{j_1}[c_{k_1}\backslash x_{i_1}], \ldots, \exists x_{i_m} \phi_{j_m} \to \phi_{j_m}[c_{k_m}\backslash x_{i_m}]\}\]

were unsatisfiable, with \gamma_1, \ldots, \gamma_n \in \Gamma and c_{k_m} the greatest of the constants c_{k_1}, \ldots, c_{k_m} in their well-ordering (m and n here are natural numbers, and the i, j and k-indices are ordinals smaller than \kappa), then

    \[\{\gamma_1, \ldots, \gamma_n\} \cup \{\exists x_{i_l} \phi_{j_l} \to \phi_{j_l}[c_{k_l} \backslash x_{i_l}] : l=1, \dots, m - 1\} \vDash \exists x_{i_{m}} \phi_{j_{m}} \wedge \neg \phi_{j_{m}}[c_{k_{m}}\backslash x_{i_{m}}]. \]

But if the premisses

    \[\gamma_1, \ldots, \gamma_n, \exists x_{i_1} \phi_{j_1} \to \phi_{j_1}[c_{k_1}\backslash x_{i_1}], \ldots, \exists x_{i_{m-1}} \phi_{j_{m-1}} \to \phi_{j_{m-1}}[c_{k_{m-1}}\backslash x_{i_{m-1}}]\]

are jointly satisfiable then they cannot entail both \exists x_{i_{m}} \phi_{j_{m}} and \neg \phi_{j_{m}}[c_{k_{m}}\backslash x_{i_{m}}]. The reason is that if there is a model in which this premiss set and \exists x_{i_{m}} \phi_{j_{m}} are satisfied, then there is a model in which the interpretation of c_{k_m} may be chosen to witness the truth of \exists x_{i_{m}} \phi_{j_{m}}, since c_{k_{m}} does not appear in any other sentence than \phi_{j_{m}}[c_{k_{m}}\backslash x_{i_{m}}].

Having verified that \Gamma^+ is a filter, we invoke the Ultrafilter Lemma as before to extend \Gamma^+ to an ultrafilter \Gamma^{++}. Note that \Gamma^{++} contains \Gamma_\kappa and hence is witness-complete: every existential statement is satisfied by some constant. To show that \Gamma^{++} (and hence \Gamma) is satisfiable, we construct a term model \mathcal{T}. The term model’s domain consists of the closed terms of \mathcal{L}^+, quotiented by the relation of appearing in an identity statement of \Gamma^{++}, that is to say \tau_1 \sim \tau_2 if and only if \tau_1 = \tau_2 \in \Gamma^{++}. The interpretation a^{\mathcal{T}} of the constant a is a’s equivalence class, [a]; the interpretation f^{\mathcal{T}} of the function symbol f applied to [\tau_1], \ldots, [\tau_n] is [f(\tau_1, \ldots, \tau_n)]; and for any n-place relation symbol R and closed terms \tau_1, \ldots, \tau_n, R([\tau_1], \ldots, [\tau_n]) holds in the model if and only if R(\tau_1, \ldots, \tau_n) is an element of \Gamma^{++}. A routine argument shows that this interpretation is well-defined and that the term model \mathcal{T} is a model of \Gamma^{++}. It is an instructive exercise to determine where exactly the arguments just given break down in the case of second-order logic: see Paseau (2010a, 75–76) for details.

To see how to avoid the use of the Axiom of Choice, we show that turning \Gamma into a Skolem set (a set containing a witness for every existential statement) does not in fact require the Well-Ordering Principle. In this alternative argument for the compactness of first-order logic, we construct a set \{c_{\langle \phi, x \rangle}: \phi \in \text{WFF}(\mathcal{L}), x \in \text{Var}(\mathcal{L})\} of constants disjoint from the set of constants of \mathcal{L}, one for each ordered pair in \text{WFF}(\mathcal{L}) \times \text{Var}(\mathcal{L}). (If we were to formalise this argument in our set-theoretic metatheory, say \textsf{ZF}, we might for instance code each constant c_{\langle \phi, x \rangle} as \langle \phi, x, \text{WFF}(\mathcal{L}) \times \text{Var}(\mathcal{L})\rangle. The point is that no Choice principles are required for this recursive definition.) Then add the Skolem sentences \exists x \phi \to \phi[c_{\langle \phi, x \rangle} \backslash x] for every such ordered pair \langle \phi, x \rangle. This gives us a new set of sentences \Gamma_1 \supset \Gamma_0 = \Gamma in a language \mathcal{L}_1 \supset \mathcal{L}_0 = \mathcal{L}. Iterate this process \omega times, constructing a chain of sets of sentences \Gamma_0 \subset \Gamma_1 \subset \cdots \subset \Gamma_n \subset \cdots, and a corresponding chain of languages \mathcal{L}_0 \subset \mathcal{L}_1 \subset \cdots \subset \mathcal{L}_n \subset \cdots. By an inductive argument, each \Gamma_n is readily seen to be finitely-satisfiable (there is no need to assume the well-ordering of \mathcal{L}_{n+1} \setminus \mathcal{L}_n at this point), hence so is \Gamma_\omega := \bigcup_{n \in \omega} \Gamma_n. Finally, define \Gamma^+ as the filter generated by \Gamma_\omega. The rest of the argument, which defines a term model from the ultrafilter extending the filter \Gamma^+, proceeds as above. The version of the argument sketched in this paragraph invoked the Ultrafilter Lemma and at no point did it use the Axiom of Choice.

The set \Gamma^{++} is an example of a Hintikka set, which is a set of first-order sentences T that satisfies the following axioms. (For brevity, we take \neg, \wedge, \forall as primitive symbols in our first-order logic.)

  • For every atomic sentence \phi, if \phi \in T then \neg \phi \notin T,
  • For every closed term t, t = t \in T,
  • If \phi(x) is an atomic formula with a single free variable x, and s, t are closed terms such that \phi(s), s = t \in T, then \phi(t) \in T,
  • If \phi is a sentence and \neg \neg \phi \in T, then \phi \in T,
  • If \Phi is a finite set of sentences, then
    • \bigwedge \Phi \in T implies \Phi \subseteq T,
    • \neg \bigwedge \Phi \in T implies \neg \phi \in T for some \phi \in \Phi,
  • If \phi(x) is a formula with a single free variable x, then
    • \forall x \phi(x) \in T implies \phi(t) \in T for every closed term t,
    • \neg \forall x \phi(x) \in T implies \neg \phi(t) \in T for some closed term t.

Given a Hintikka set T, we can construct a term model (using the same definition given above) that satisfies T (Hodges 1997, 40–42).

c. Semantic: Ultraproduct Proofs

In this subsection, we prove the compactness theorem for first-order logic from Łoś’ Theorem.
We recall the definition of an ultraproduct and state the theorem. For a proof, see Chang and Keisler (1990, p. 217, thm. 4.1.9) or Hodges (1997, p. 241–242, thm. 8.5.3).

Let (\mathfrak{A}_i)_{i \in I} be a collection of first-order structures of the same signature, indexed by I, and U an ultrafilter over this indexing set I. In this context, the Boolean algebra is \mathcal{P}(I) with the subset ordering \subset, so a filter F \subset \mathcal{P}(I) is a set of subsets of I which does not contain the empty set and which is closed under finite intersection and upward containment. Denote the domain of each \mathfrak{A}_i by A_i. When U is an ultrafilter over I, defining a \sim_U b by \{i \in I: a(i) = b(i)\} \in U yields an equivalence relation over the product

    \[\prod_{i \in I} A_i = \Bigg\{f : I \to \bigcup_{i \in I} A_i \mid \forall i \in I, f(i) \in A_i\Bigg\}.\]

(We tend to view elements of the product as generalised sequences (x_i)_{i \in I}.) The characteristic function \mu_U of U is a finitely-additive full measure on I and thus intuitively the elements of U are `large subsets of I’: two functions in \prod_{i \in I} A_i are identified by this relation precisely when they agree `U-almost-everywhere’. We define a new structure \mathfrak{B} = \prod_{i \in I} \mathfrak{A}_i / U as follows:

  • Domain of \mathfrak{B} = \prod_{i \in I} A_i quotiented by \sim_U; [a] denotes the equivalence class of a;
  • c^\mathfrak{B} = [(\cdots, c^{\mathfrak{A}_i}, \cdots)];
  • f^{\mathfrak{B}}([a_1], \cdots, [a_k]) = [(\cdots, f^{\mathfrak{A}_i}(a_1(i), \cdots, a_k(i)), \cdots)];
  • R^{\mathfrak{B}}([a_1], \cdots, [a_k]) if and only if \{i \in I: \mathfrak{A}_i \vDash R(a_1(i), \cdots, a_k(i))\} \in U.

for any k-place function and relation symbols f and R and any constant c of the first-order language in question. As may be checked, the fact that U is a filter means that the definitions just given do not depend on the choice of representatives from \prod_{i \in I} A_i. For \phi atomic we have by definition:

    \[\mathfrak{B} \vDash \phi(a_1, \cdots, a_n) \text{ if and only if } \{i \in I: \mathfrak{A}_i \vDash \phi(a_1(i), \cdots, a_n(i))\} \in U.\]

Łoś’ theorem, which is proved by induction on the complexity of \phi, extends this to all \phi:

for all \phi, \mathfrak{B} \vDash \phi(a_1, \cdots, a_n) if and only if \{i \in I: \mathfrak{A}_i \vDash \phi(a_1(i), \cdots, a_n(i))\} \in U.

The induction step for the quantifiers does require some Choice. In fact, the Ultrafilter Lemma together with Łoś’ theorem implies the full Axiom of Choice (Howard 1975).

To prove the compactness theorem for first-order logic from Łoś’ Theorem, let \Sigma be a finitely-satisfiable set of first-order sentences. Let I be the set of finite subsets of \Sigma and suppose we are given for each i \in I a model \mathfrak{A}_i for the sentences in i. For i \in I let

    \begin{gather*} J_i \df := \{j \in I: i \subset j\}, \\ F \df := \{J \subset I: J_i \subset J \textnormal{ for some } i \in I\}. \end{gather*}

The collection of J_i is closed under finite intersection because J_{i_1} \cap J_{i_2} = J_{i_1 \cup i_2}, hence F is also closed under finite intersection; by definition, F is closed under upward containment, and F does not contain the empty set because each J_i contains i. Thus F satisfies the conditions for being a filter and may therefore be extended to an ultrafilter U. Let \mathfrak{B} = \prod_{i \in I} \mathfrak{A}_i / U. Now for any \sigma \in \Sigma, we have \{\sigma\} \in I and J_{\{\sigma\}} \subset \{i \in I: \mathfrak{A}_i \vDash \sigma \}, from which it follows that \{i \in I: \mathfrak{A}_i \vDash \sigma \} \in U. By Łoś’ theorem, \mathfrak{B} \vDash \sigma. This is true for every \sigma \in \Sigma, so \mathfrak{B} \vDash \Sigma; in other words, \Sigma is satisfiable. Intuitively, the model \mathfrak{B} was designed so as to agree with \mathfrak{A}_i on the `large subset’ J_i of I and in particular to agree with \mathfrak{A}_i on the truth-value of each element of i; as this is true for all i, \mathfrak{B} satisfies \Sigma. This completes the ultraproduct proof of the compactness of first-order logic.
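The bookkeeping behind the filter F can be checked on a toy instance. The sketch below assumes a finite \Sigma of four `sentences’, where sentence k is read as `x > k’ and a `model’ is simply an integer; the functions `holds`, `model_for`, `J` and `truth_set` are hypothetical names introduced for the illustration. With \Sigma finite the filter is already principal, so the example verifies only the identities used in the proof, not the need for an ultrafilter.

```python
from itertools import combinations

SIGMA = (0, 1, 2, 3)   # toy assumption: sentence k is read as "x > k"

# I is the set of finite subsets of SIGMA.
I = [frozenset(c) for r in range(len(SIGMA) + 1)
     for c in combinations(SIGMA, r)]

def holds(model, sentence):
    """Toy satisfaction relation: the integer `model` satisfies "x > sentence"."""
    return model > sentence

def model_for(i):
    """A model A_i of every sentence in the finite set i."""
    return max(i, default=-1) + 1

def J(i):
    """J_i = the set of finite subsets j of SIGMA with i included in j."""
    return frozenset(j for j in I if i <= j)

def truth_set(sentence):
    """The set of indices i such that A_i satisfies the sentence."""
    return frozenset(i for i in I if holds(model_for(i), sentence))

# The three facts used in the proof, checked exhaustively.
INTERSECTION_LAW = all(J(a) & J(b) == J(a | b) for a in I for b in I)
NONEMPTY = all(i in J(i) for i in I)
LARGE = all(J(frozenset({s})) <= truth_set(s) for s in SIGMA)
```

The flag `LARGE` records that each truth set contains a generator of F, which is what places it in any ultrafilter extending F.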

For particular applications, we may not need the full strength of the Axiom of Choice to apply the ultraproduct method. For example, suppose we wish to use the ultraproduct method to construct a non-standard model of arithmetic (see Section 2c) with language \mathcal{L} and standard model \mathfrak{A} with domain \omega. We extend the language to \mathcal{L}' by introducing a new constant symbol c and, for each n \in \omega, let \mathfrak{A}_n be the \mathcal{L}'-expansion of \mathfrak{A} given by interpreting c^{\mathfrak{A}_n} := n. Let F be the filter of cofinite subsets of \omega and let U be an ultrafilter that extends F. Then the particular instance of Łoś’ Theorem can be proven for (\mathfrak{A}_n)_{n \in \omega} and U, utilising the well-foundedness of \omega in place of the Axiom of Choice. Thus \mathfrak{B} = \prod_{n \in \omega} \mathfrak{A}_n / U satisfies the same \mathcal{L}-sentences as \mathfrak{A}, and for all m \in \omega,

    \[\{n \in \omega : \mathfrak{A}_n \vDash c \ne \underbrace{1 + \dots + 1}_{m \text{ times}}\} = \omega \setminus \{m\} \in F \subseteq U,\]

so by Łoś’ Theorem it follows that \mathfrak{B} \vDash c \ne \underbrace{1 + \dots + 1}_{m \text{ times}}. Therefore \mathfrak{B} is a non-standard model of arithmetic.

5. Connection to Topology

Tarski’s (1952) article gives the compactness theorem its name (see theorems 13 and 17), observing its similarity with the finite-intersection-property definition of compactness in topology.
Here we will demonstrate the topological connection in two stages. First, we show that the compactness theorem for a propositional logic is equivalent to the claim that its associated valuation space is compact. (Recall that an open cover of a topological space X is a collection \mathcal{C} of open subsets of X whose union is X, and that a space X is compact if every open cover has a finite subcover—a finite subset that is also a cover. Equivalently, every collection of closed subsets with the finite-intersection property—every finite subset has non-empty intersection—has non-empty intersection. Intuitively, you cannot `escape’ a compact space; every collection of points will accumulate somewhere.) Second, we show that this space is indeed compact. We assume some basic knowledge of topology (Sutherland 2009; Willard 1970).

In this section, we spell out this reasoning for the case of propositional logic. We start with an argument that demonstrates the compactness of any \textsf{PL}_\kappa, initially assuming for the sake of simplicity that the version of \textsf{PL}_\kappa in question is equipped with a truth-functionally complete set of connectives.

Let V_{\kappa} be the set of all valuations of \textsf{PL}_\kappa. For each sentence \phi of \textsf{PL}_\kappa define U(\phi) = \{v \in V_{\kappa}: v(\phi) = 1\}, the set of valuations in which \phi is true. Since U(\phi_1) \cap U(\phi_2) = \{v \in V_{\kappa}: v(\phi_1) = 1\} \cap \{v \in V_{\kappa}: v(\phi_2) = 1\} = U(\phi_1 \wedge \phi_2), the sets U(\phi) form a basis for a topology on V_{\kappa}; call this basis \mathcal{B}. (A basis \mathcal{B} for a topology \tau is a collection of open sets with the property that U \in \tau if and only if U = \bigcup \mathcal{C} for some \mathcal{C} \subset \mathcal{B}. Given a collection \mathcal{B} of subsets of X, if its union is X and for all B_1, B_2 \in \mathcal{B}, B_1 \cap B_2 = \bigcup \mathcal{C} for some \mathcal{C} \subset \mathcal{B}, then there exists a unique topology on X with basis \mathcal{B} (Willard 1970, p. 38, thm. 5.3).) The topological space we shall be interested in has domain V_{\kappa} and topology generated by the basis \mathcal{B} = \{U(\phi): \phi \textnormal{ is a sentence of } \textsf{PL}_\kappa \}; with this topology, V_\kappa is the dual space of the Boolean algebra of sentences of \textsf{PL}_\kappa. Thus properties of this space are determined only by the logical properties of valuations.

Now consider any set of sentences \Sigma of \textsf{PL}_\kappa. \Sigma is unsatisfiable if and only if every valuation assigns truth-value 0 to at least one of its members, or equivalently every valuation assigns truth-value 1 to at least one sentence \neg s such that s \in \Sigma. It follows that \Sigma is unsatisfiable if and only if each valuation is in some open set of the form U(\neg s) where s \in \Sigma, or equivalently that \{U(\neg s): s \in \Sigma\} is an open cover of the topological space V_{\kappa}. The compactness of \textsf{PL}_\kappa is therefore equivalent to the following claim:

if \{U(\neg s): s \in \Sigma\} is an open cover of V_{\kappa} then there is a finite subset
\Sigma^\text{fin} of \Sigma such that \{U(\neg s): s \in \Sigma^\text{fin}\} is an open cover of V_{\kappa}.

Thus \textsf{PL}_\kappa is compact if and only if V_\kappa is compact. (Note that any open set is the union of some basis sets and that any basis set U(\phi) is identical to U(\neg s) for s = \neg \phi. Also, a space is compact if every open cover drawn from a fixed basis has a finite subcover.) It suffices, then, to show that V_\kappa is compact in the usual topological sense.
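The correspondence between unsatisfiability and open covers can be checked directly when there are only finitely many sentence letters. The following is a minimal sketch under that assumption (three letters, sentences coded as Python predicates on valuations); `U`, `neg`, `conj`, `satisfiable` and `negations_cover` are illustrative names, with `U` mirroring the basic open sets U(\phi) of the text.

```python
from itertools import product

LETTERS = ("p", "q", "r")
# The (finite) valuation space: all assignments of 0/1 to the letters.
V = [dict(zip(LETTERS, bits)) for bits in product((0, 1), repeat=len(LETTERS))]
ALL_POINTS = frozenset(range(len(V)))

def U(sentence):
    """Basic open set U(phi): indices of the valuations making phi true."""
    return frozenset(i for i, v in enumerate(V) if sentence(v))

def neg(s):
    return lambda v: not s(v)

def conj(s, t):
    return lambda v: bool(s(v)) and bool(t(v))

def satisfiable(sigma):
    """Some valuation makes every member of sigma true."""
    return any(all(s(v) for s in sigma) for v in V)

def negations_cover(sigma):
    """True if the sets U(not s), for s in sigma, cover the whole space."""
    covered = frozenset().union(*(U(neg(s)) for s in sigma))
    return covered == ALL_POINTS

p = lambda v: v["p"]
q = lambda v: v["q"]
p_implies_q = lambda v: (not v["p"]) or v["q"]

SIGMA_SAT = [p, p_implies_q]             # satisfied by any valuation with p = q = 1
SIGMA_UNSAT = [p, p_implies_q, neg(q)]   # jointly contradictory
```

For both sample sets, `satisfiable` and `negations_cover` return opposite answers, matching the equivalence stated above.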

The last claim is immediate from the fact that V_\kappa is homeomorphic to \{0,1\}^\kappa = \{f : \kappa \to \{0, 1\}\} with the (Tychonoff) product topology, when \{0,1\} is given the discrete topology \{\emptyset, \{0\}, \{1\}, \{0,1\}\}. Recall that the product topology on the product \prod_{i \in I}X_i of a family of topological spaces (X_i, \tau_i) indexed by I takes as its basic open sets \prod_{i \in I}O_i, where O_i \in \tau_i, and O_i = X_i for all but finitely many elements i in I. (For a finite collection (X_1, \tau_1), \dots, (X_n, \tau_n), the product topology on X_1 \times \dots \times X_n has basis \{U_1 \times \dots \times U_n : \forall i , U_i \in \tau_i\}. In general, the product topology is determined by the following universal property: for any space Y and function f : Y \to \prod_{i \in I} X_i, f is continuous if and only if \pi_i \circ f is continuous for all i \in I, where \pi_i : \prod_{j \in I} X_j \to X_i is the projection map onto the i-th coordinate. If we remove the restriction that all-but-finitely-many O_i’s are equal to X_i, we obtain the box product topology. This coincides with the product topology for finite index sets, but differs drastically for infinite products.) Since \{0,1\}^\kappa with the product topology is homeomorphic to V_\kappa (if the sentence letters in \textsf{PL}_\kappa are p_\alpha for \alpha \in \kappa, then f(v)(\alpha) := v(p_\alpha) defines a homeomorphism f : V_\kappa \to \{0, 1\}^\kappa), the latter is compact if and only if the former is. Now \{0,1\} with the discrete topology is compact because finite spaces are trivially compact. Tychonoff’s Theorem, a \textsf{ZF}-equivalent of the Axiom of Choice, states that the product of compact spaces is compact. Putting the pieces together, it follows that \{0,1\}^\kappa with the product topology is compact. This proves the compactness of \textsf{PL}_\kappa.

We do not in fact need the full power of Tychonoff’s Theorem. The weaker principle that the product of compact Hausdorff spaces is compact will do, because \{0,1\} with the discrete topology is Hausdorff (any two of its elements—0 and 1—are contained in disjoint open sets \{0\} and \{1\} ). This weaker version of Tychonoff’s Theorem is in fact equivalent to the compactness theorem for propositional logic (and first-order logic), as discussed in Section 7.

The topological proof of \textsf{PL}_\kappa’s compactness just given assumed that \textsf{PL}_\kappa’s set of connectives is truth-functionally complete. Without this assumption, there is no guarantee that the set \mathcal{B} is a basis for the product topology; for example, if a propositional logic cannot define negation, no element of \mathcal{B} may correspond to the open set consisting of all elements of \{0,1\}^\kappa with first coordinate 0, and if it cannot define conjunction then \mathcal{B} need not be closed under intersection and thus may fail to be a basis. The compactness of a propositional logic with \kappa sentence letters but without a truth-functionally complete set of connectives is of course immediate from the fact that it is a sublogic of a truth-functionally complete propositional logic with the same \kappa sentence letters. The topological way of seeing this is to endow the space V_\kappa with the topology which has \mathcal{B} as a subbase and denote it by Y.
(A subbase for a topological space (X, \tau) is a collection of open sets \mathcal{S} \subset \tau such that \mathcal{B} := \{X\} \cup \{\bigcap \mathcal{F} : \mathcal{F} \subset \mathcal{S} \text{ finite}\} is a basis for (X, \tau). Given an arbitrary collection \mathcal{S} of subsets of a set X, there is a unique topology \tau on X which has \mathcal{S} as a subbase (Willard 1970, p. 39, thm. 5.6).) This topology is coarser than the dual space topology on V_\kappa: every open set in Y is open in the dual space topology. In general, if the topology \tau on some set X is coarser than the topology \tau^* on X and (X, \tau^*) is compact, then so is (X, \tau). Intuitively, there are fewer means of `escaping’ (via the finite-intersection property) in \tau than in \tau^*, so if there are no such routes in \tau^* then there are none in \tau. It follows that the topological space Y is compact, and thus any propositional logic is compact, whether or not its set of connectives is truth-functionally complete.

A `bare hands’ argument that avoids anything as strong as either the general Tychonoff Theorem or its compact Hausdorff version above may also be given for the compactness of V_\omega. We present this argument, since \textsf{PL}_\omega is the most common propositional logic.

Suppose that some open cover \mathcal{C} of the space A = \{0,1\}^\omega with the product topology lacks a finite subcover. We reduce this supposition to absurdity by constructing an element a of A with the following property: the set of all elements of A extending an arbitrary finite initial segment of a cannot be covered by any finite subset of \mathcal{C}. If f is a finite sequence of 0’s and 1’s, define A_f as the set of elements of A extending f. Let a_0 = \langle 0 \rangle if A_{\langle 0 \rangle} does not admit a finite subcover from \mathcal{C}, and a_0 = \langle 1 \rangle otherwise. More generally, if a_m has been defined, let a_{m+1} = a_m\frown\langle 0 \rangle if A_{a_m\frown\langle 0 \rangle} does not admit a finite subcover from \mathcal{C}, and a_{m+1} = a_m\frown\langle 1 \rangle otherwise. Since by the assumption to be reduced to absurdity A = A_{\emptyset} does not admit a finite subcover from \mathcal{C}, an easy inductive argument shows that A_{a_m} does not admit a finite subcover from \mathcal{C} either, for any m. Letting a be the element of A whose restriction to m+1 is a_m (more formally, a = \bigcup_{m \in \omega} a_m), it follows from the fact that \mathcal{C} is an open cover of A that a is an element of some basis set B \subset O, where O is an open set in \mathcal{C}. From a \in B and the definition of the product topology, it further follows that for some m, A_{a_m} \subset B, so that A_{a_m} does in fact admit a finite subcover from \mathcal{C}, namely the one consisting of the single open set O \in \mathcal{C} which contains B as a subset. This contradicts what was just shown, so our supposition is false and A = \{0,1\}^\omega with the product topology is compact.
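The branch construction in this argument can be imitated computationally in a restricted setting. The sketch below assumes that the family \mathcal{C} is a finite list of basic open sets A_f, each coded by the finite 0/1 tuple f; for an arbitrary infinite cover the test `admits a finite subcover’ is not something a program can decide, so this is an illustration of the recursion only. The names `is_prefix`, `covered` and `uncovered_branch` are hypothetical.

```python
def is_prefix(c, f):
    """True if the finite sequence c is an initial segment of f."""
    return len(c) <= len(f) and f[:len(c)] == c

def covered(f, cylinders):
    """Decide whether A_f is covered by the sets A_c, for c in `cylinders`."""
    if any(is_prefix(c, f) for c in cylinders):
        return True                      # a single A_c already contains A_f
    depth = max((len(c) for c in cylinders), default=0)
    if len(f) >= depth:
        return False                     # no A_c can contain any point of A_f
    # Otherwise A_f is covered exactly when both halves are.
    return covered(f + (0,), cylinders) and covered(f + (1,), cylinders)

def uncovered_branch(cylinders):
    """Follow the text: extend by 0 if that half is uncovered, else by 1.

    Returns None if the whole space is covered; otherwise a finite sequence
    none of whose extensions lies in any of the given sets.
    """
    if covered((), cylinders):
        return None
    depth = max((len(c) for c in cylinders), default=0)
    a = ()
    while len(a) < depth:
        a = a + (0,) if not covered(a + (0,), cylinders) else a + (1,)
    return a

PARTIAL = [(0,), (1, 0)]            # misses every sequence starting 1, 1
FULL = [(0,), (1, 0), (1, 1)]       # covers the whole space
```

Because the extension rule always prefers 0 when it is available, no choice is made at any stage, which mirrors the remark that the argument needs no version of Choice.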

The argument just given, which constructs an infinite branch through a binary branching tree, did not require any version of Choice. The reason is that at each stage in the construction of a we used the ordering on \{0, 1\} to extend a_m by 0 if possible and 1 if not. An alternative argument for the compactness of \{0,1\}^\omega uses the fact that a countable product \prod_{n \in \omega}(X_n, d_n) of metric spaces (X_n, d_n), each bounded by 1, is metrisable via the metric

    \[d_{\omega}(x, y) = \sum_{n \in \omega} \frac {d_n(x_n, y_n)}{2^n}.\]

To say that \prod_{n \in \omega}(X_n, d_n) is metrisable in this way is to say that the metric d_{\omega} induces the product topology on \prod_{n \in \omega}(X_n, d_n). The rest of the argument turns on showing that the metric space (\{0,1\}^\omega, d_\omega) is compact, by exploiting the space’s metric properties. This argument cannot be generalised to show that \textsf{PL}_\kappa’s valuation space is compact for any \kappa \geq \omega_1: the uncountable product of metric spaces, each having at least two points, is not metrisable, as no point has a countable neighbourhood base.
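The metric d_\omega can be explored numerically for \{0,1\}^\omega. The sketch below assumes each factor carries the discrete metric (distance 1 between 0 and 1), codes a point as a Python function from indices to \{0,1\}, and truncates the infinite sum after a fixed number of terms, so it computes a lower approximation and not the exact distance; `d_omega`, `spike` and `tail_ones` are illustrative names.

```python
from fractions import Fraction

def d_omega(x, y, terms=64):
    """Truncated d_omega: sum over n < terms of d_n(x_n, y_n) / 2^n.

    d_n is the discrete metric on {0, 1}. The neglected tail of the true
    infinite sum is at most 2^(1 - terms).
    """
    return sum(Fraction(int(x(n) != y(n)), 2 ** n) for n in range(terms))

def zeros(n):
    """The constant sequence 0, 0, 0, ..."""
    return 0

def spike(k):
    """The sequence that is 1 at coordinate k and 0 elsewhere."""
    return lambda n: 1 if n == k else 0

def tail_ones(k):
    """The sequence that is 0 before coordinate k and 1 from k onwards."""
    return lambda n: 1 if n >= k else 0
```

Two sequences that agree on their first m coordinates are within 2^{1-m} of each other, which is the link between small d_\omega-balls and the basic open sets of the product topology.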

Moving to arbitrary logics \mathcal{L}, we construct the space X_{\mathcal{L}} of theories \text{Th}(\mathfrak{M}) of \mathcal{L}-structures \mathfrak{M}, topologised by the subbasic open sets U_\phi := \{T \in X_{\mathcal{L}} : \phi \notin T\} for each formula \phi in \mathcal{L}. The compactness of \mathcal{L} is then easily seen to be equivalent to the compactness of X_{\mathcal{L}} with this topology (see Figure 1). In the case of propositional logic, X_{\textsf{PL}_\kappa} \cong V_\kappa. (Note that, unlike for bases, the implication from `every open cover drawn from a fixed subbase has a finite subcover’ to the compactness of the space requires some form of the Axiom of Choice. This is the Alexander subbase theorem (Willard 1970, p. 129, problem 17S).) However, the argument for first-order logic is more complicated than that for propositional logic and not as elementary as exhibiting a simple homeomorphism.


Figure 1: The connection between the compactness of a logic and the compactness of its space of theories.

6. Extensions and Generalisations

The discussion of the compactness theorem in the most common logics, our concern here, could now ramify in several directions. There are two main dimensions of variation when considering a proof of compactness for a logic: we could vary the notion of compactness, or we could vary the logic. As examples of the latter, we might consider modal logics (\textsf{K}, \textsf{T}, \textsf{S4}, \textsf{S5}); infinitary logics \mathcal{L}_{\kappa \lambda} (\mathcal{L}_{\kappa \lambda} being the extension of first-order logic which allows for conjunctions and disjunctions of length less than \kappa and existential and universal quantifications of length less than \lambda; usually \kappa \geq \lambda, and we use \infty in place of \kappa or \lambda to represent no bound on the connectives or quantifiers respectively); and other extensions of first-order logic, such as logics with cardinality quantifiers.

A rough rule of thumb is that infinitary logics are not compact. In particular, as demonstrated in Section 2 with the example of \{c \neq c_i: i \in \omega \} \cup \{\bigvee_{i \in \omega} c = c_i\}, \mathcal{L}_{\kappa \lambda} is not compact whenever \kappa > \omega. In contrast, logics extending first-order logic with generalised quantifiers tend to be compact, but not if they contain cardinality quantifiers. Roughly speaking, the quantifiers Q_{\aleph_{\kappa}}, Q_{>\aleph_{\kappa}}, Q_{<\aleph_{\kappa}}, Q_{\geq \aleph_{\kappa}}, and Q_{\leq \aleph_{\kappa}} are interpretable as ‘there are exactly \aleph_{\kappa}’, ‘there are more than \aleph_{\kappa}’, ‘there are fewer than \aleph_{\kappa}’, ‘there are at least \aleph_{\kappa}’, and ‘there are no more than \aleph_{\kappa}’. Consider the set

    \[\left\{\neg Q_{\geq \aleph_{\kappa}} x(x=x)\right\} \cup\left\{c_{\alpha} \neq c_{\beta}: \alpha, \beta \in \aleph_{\kappa} \text { s.t. } \alpha \neq \beta\right\}\]

in a logic containing Q_{\geq \aleph_{\kappa}}; this set is finitely-satisfiable but unsatisfiable. (The arguments involving the other cardinality quantifiers are similar.) However, such logics tend to satisfy weaker notions of compactness. In particular, the logic that extends first-order logic with the quantifier Q_{>\aleph_{0}}, interpreted as ‘there are uncountably many’, is countably-compact: that is, a countable set of sentences is satisfiable if and only if it is finitely-satisfiable (Keisler 1970).

As an aside, it is worth observing that not all logics with higher-order (second-order or above) quantifiers are incompact. Second-order logic with non-standard (Henkin) semantics is compact (Enderton 2001, chap. 4), as is existential second-order logic, whose sentences are of the form \exists S_1 \dots \exists S_n \phi, where S_1, \dots, S_n are second-order functions / relations, and \phi is a first-order sentence in a language with symbols for S_1, \dots, S_n. (Via embeddings, this logic is equivalent to many first-order logics extended with extra quantifiers and game-theoretic semantics, such as Henkin’s quantifier, Hintikka-Sandu’s independence-friendly logic, and Väänänen’s dependence logic. However, these logics do not have a classical negation and the alternative form of compactness given in the introduction of this article is false: see section 2a of this article. For further details on these logics, consult Väänänen (2007) and Mann, Sandu, and Sevenster (2011)). As another example, consider what has been called pure second-order logic with identity, that is, second-order logic with neither functional nor first-order variables but with both second-order and first-order identity (as well as predicate variables and quantifiers). Pure second-order logic may be thought of as the complement of first-order logic relative to second-order logic in the following sense: first-order logic has object but not predicate quantifiers, pure second-order logic has predicate but not object quantifiers, and second-order logic combines the two. In other words, second-order logic merges first-order and pure second-order logic. As Paseau (2010b) shows, pure second-order logic with identity is compact; Denyer (1992) gives an argument that applies to pure second-order logic without identity. The moral is that the incompactness of second-order logic is not owed solely to the presence of second-order quantifiers, but to the combination of both first- and second-order quantifiers. 
Lindström’s Theorem tells us that any regular logical system weakly extending first-order logic satisfying both the downward Löwenheim-Skolem property (if a sentence is satisfiable, it is satisfiable in an at most countable model) and the compactness theorem is in fact identical to first-order logic itself (Lindström 1969, p. 8, thm. 2).

The other natural dimension of generalisation is the notion of compactness. A fairly obvious generalisation is:

    a logic is (\kappa, \lambda)-compact, where \kappa is an infinite cardinal or \infty and \lambda is an infinite cardinal \leq \kappa, just when: whenever \Gamma is an unsatisfiable set of sentences of cardinality \leq \kappa there is an unsatisfiable \Gamma^{\lambda} \subset \Gamma of cardinality < \lambda. (By convention \kappa < \infty for every cardinal \kappa.)

The ordinary notion of compactness is (\infty, \aleph_0)-compactness: whenever \Gamma is any unsatisfiable set of sentences (of any cardinality) there is an unsatisfiable \Gamma^{\aleph_0} \subset \Gamma of cardinality < \aleph_0, in other words of finite cardinality. A special case arises when \kappa = \lambda and the logic is \mathcal{L}_{\lambda \lambda}. If this logic satisfies (\lambda,\lambda)-compactness, that is, if whenever \Gamma is an unsatisfiable set of sentences of cardinality \leq \lambda then \Gamma has an unsatisfiable subset of cardinality < \lambda, we say that it satisfies the Weak Compactness Theorem. One can show that for infinite cardinals, having the tree property is equivalent to satisfying the Weak Compactness Theorem, and either implies the weak inaccessibility of the cardinal. Cardinals satisfying the Weak Compactness Theorem are exactly those satisfying the combinatorial partition property \kappa \to (\kappa)^2, and indeed the latter is the usual definition of a weakly-compact cardinal. For the definitions of the tree property and the partition property, and more on weakly-compact cardinals, consult chapters 9 and 17 of Jech (2003).

Another generalisation of the notion of compactness is strong compactness. An uncountable cardinal \kappa is strongly-compact if for any set S, any \kappa-complete filter on S can be extended to a \kappa-complete ultrafilter on S. (A filter \mathcal{F} on a set is \kappa-complete if \bigcap \mathcal{A} \in \mathcal{F} for every \mathcal{A} \subset \mathcal{F} with |\mathcal{A}| < \kappa.) One may then show that \kappa is strongly-compact if and only if the logic \mathcal{L}_{\kappa \omega} (or \mathcal{L}_{\kappa \kappa}) has the property of (\infty, \kappa)-compactness, that is, if \Gamma is an unsatisfiable set of sentences then it has an unsatisfiable subset of cardinality < \kappa (Jech 2003, p. 366, lemma 20.2).

Both weakly- and strongly-compact cardinals are large cardinal properties. (A sentence \phi(\kappa) is a large cardinal property if it satisfies the following: \textsf{ZFC} \cup \{\neg \exists \kappa \phi(\kappa)\} is equiconsistent with \textsf{ZFC}, \textsf{ZFC} \vdash \forall \kappa (\phi(\kappa) \to \kappa \text{ is a cardinal}), and the consistency strength of \textsf{ZFC} \cup \{\exists \kappa \phi(\kappa)\} is strictly larger than \textsf{ZFC}, by which we mean the consistency of the former implies that of the latter, but there is no finitistic argument for the reverse implication, for example, if \textsf{ZFC} \cup \{\exists \kappa \phi(\kappa)\} implies the arithmetised consistency of \textsf{ZFC}, by Gödel’s second incompleteness theorem.) Another large cardinal property is the extendible cardinal property, which is defined in terms of elementary embeddings of levels in the von Neumann cumulative hierarchy. However, it was later discovered that extendibility of a cardinal \kappa is equivalent to (\infty, \kappa)-compactness of \mathcal{L}_{\kappa \omega}^2, or the (\infty, \kappa)-compactness of \mathcal{L}_{\kappa \kappa}^n for each positive integer n (Kanamori 2009, p. 315, thm. 23.4). \mathcal{L}_{\kappa \lambda}^n is the nth-order extension of \mathcal{L}_{\kappa \lambda}.

Generalising the argument in Section 5 above, it is easily shown that a logic \mathcal{L} is (\kappa, \lambda)-compact precisely when every subbasic open cover \{U_\phi : \phi \in \Gamma\} of X_{\mathcal{L}}, where \Gamma is a set of \mathcal{L}-sentences of cardinality at most \kappa, has a subcover of cardinality less than \lambda. This is weaker than stating that X_{\mathcal{L}} is (\kappa, \lambda)-compact, meaning every cover of cardinality at most \kappa has a subcover of cardinality less than \lambda. For example, if \mathcal{L} is the extension of first-order logic with the quantifier Q_{> \aleph_0}, then \mathcal{L} is (\aleph_0, \aleph_0)-compact, but X_{\mathcal{L}} is not (\aleph_0, \aleph_0)-compact. See Ebbinghaus (1985, sec. 5.1), Makowsky (1985, sec. 1.1), Mannila (1983), and Stephenson (1984) for more on (\kappa, \lambda)-compact logics and spaces.

These remarks do no more than gesture at an extensive literature generalising the notion of compactness and connecting it to topics of set-theoretic interest. We would be remiss if we did not at least mention in passing the Barwise compactness theorem, which is an important theorem in generalised recursion theory (Keisler and Knight 2004). The interested reader is invited to explore the references contained within this section.

7. Relative Strength

As we saw, the compactness theorem for both propositional and first-order logic may be proven in \textsf{ZFC}; but as also indicated, all that is required for its proof, in the case of both propositional and first-order logic, is the Ultrafilter Lemma, which is in fact weaker than Choice relative to \textsf{ZF}. Indeed, the Ultrafilter Lemma turns out to be equivalent to the compactness theorem in \textsf{ZF}. By the capitalised name ‘Compactness Theorem’ we hereafter understand the compactness theorem for both propositional and first-order logic, since these are of equivalent strength relative to \textsf{ZF}. The relative strength of these and related principles has been well-studied; in this section we report several of these results.

a. ZF-Equivalents

Tychonoff’s Theorem states that every product of compact spaces (with the product topology) is compact. In proving compactness earlier, we only required Tychonoff’s Theorem in the case when the product consisted of Hausdorff spaces. This weakened version is in fact equivalent to the Compactness Theorem. Some other equivalents are:

  • The Boolean Prime Ideal Theorem: every Boolean algebra has a prime ideal.
  • The Ultrafilter Lemma: every filter on a Boolean algebra can be extended to an ultrafilter.
  • Stone’s Representation Theorem: every Boolean algebra is isomorphic to a field of sets.
  • Alexander subbase theorem: if X is a topological space with a subbase \mathcal{S} and every open cover \mathcal{U} \subseteq \mathcal{S} has a finite subcover, then X is compact (Rubin and Scott 1954).
  • For every graph G, if every finite subgraph of G is 3-colourable then G is also 3-colourable (Cowen 1990).
  • If \Sigma is a set of propositional sentences, each consisting of a disjunction of at most three literals, and every finite subset of \Sigma is satisfiable, then \Sigma is also satisfiable (Cowen 1990).

The equivalences of the first three statements with the Compactness Theorem can be found in Jech (1973, chap. 2). For more information, consult Howard and Rubin (1998, form 14).
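
To make the hypothesis of the graph-colouring equivalent listed above concrete, the following is a minimal Python sketch that tests a finite graph for 3-colourability by brute force. The function name three_colouring and the two example graphs are illustrative assumptions and are not taken from Cowen (1990). The code only checks the finite hypothesis; it says nothing about how colourings of the finite subgraphs are assembled into a colouring of an infinite graph, which is exactly the step that requires the Compactness Theorem.

```python
from itertools import product


def three_colouring(vertices, edges):
    """Return a proper 3-colouring of a finite graph as a dict, or None.

    `vertices` is a finite iterable and `edges` a list of pairs (u, v).
    Every assignment of the colours 0, 1, 2 is tried in turn.
    """
    vertices = list(vertices)
    for colours in product(range(3), repeat=len(vertices)):
        colouring = dict(zip(vertices, colours))
        # A colouring is proper when adjacent vertices get different colours.
        if all(colouring[u] != colouring[v] for u, v in edges):
            return colouring
    return None


# A finite piece of the infinite path 0 - 1 - 2 - ... (hypothetical example).
path_vertices = range(6)
path_edges = [(n, n + 1) for n in range(5)]
path_colouring = three_colouring(path_vertices, path_edges)

# The complete graph on four vertices, which has no proper 3-colouring.
k4_vertices = range(4)
k4_edges = [(u, v) for u in range(4) for v in range(u + 1, 4)]
k4_colouring = three_colouring(k4_vertices, k4_edges)
```

The search is exponential in the number of vertices, so it is only meant for very small examples.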

b. Principles Strictly ZF-Stronger Than the Compactness Theorem

It is known that the full Tychonoff Theorem is equivalent to the Axiom of Choice in \textsf{ZF} (Kunen 2011, 72). To show that the Compactness Theorem does not imply Choice, we will briefly sketch the construction of a model where Choice fails but the Compactness Theorem holds. Suppose we are working in \mathsf{ZFA} + \mathsf{AC} with a countably infinite set of atoms A (objects which do not contain any elements and are different from the empty set); \mathsf{ZFA} is a variant of \textsf{ZF} with atoms (Jech 1973, sec. 4.1). We can develop all of our standard theory in this system, with some minor modifications (for example, the Axiom of Extensionality states that two sets are equal if they have the same elements). Let \lhd be a dense linear ordering on A without endpoints. Every permutation \pi of A has an extension \tilde{\pi} to the entire universe, defined recursively by \tilde{\pi}(X) := \{\tilde{\pi}(x) : x \in X\} for each set X. Let G be the group of order-preserving permutations of A and for every set x define fix(x) := \{ \pi \in G : \forall y \in x, \tilde{\pi}(y) = y \} and sym(x) := \{ \pi \in G : \tilde{\pi}(x) = x \}. We shall say that an object x is symmetric if there is a finite set of atoms E such that fix(E) \subset sym(x). Denote the class of hereditarily symmetric objects by M.

If V denotes the standard cumulative hierarchy in this theory, then it is easily seen that V \subset M. We may also show that \mathsf{ZFA} holds in M. As the Axiom of Choice holds in V, a set X \in M is well-orderable, according to M, precisely when there is a symmetric injective function mapping X to some set in V. This turns out to be the case precisely when there is a finite set of atoms E such that fix(E) \subset fix(X). Since for every finite set of atoms E, there is a non-identity order-preserving permutation of A fixing E, it follows that A is not well-orderable according to M. However, the Compactness Theorem still holds in M.

Although we worked in a theory with atoms, it is possible to translate this model to a model without atoms where Choice fails but the Compactness Theorem holds. The interested reader should consult Jech (1973, pp. 44–54, chap. 4) for details of this proof. It now follows that, working in \textsf{ZF}, Tychonoff’s Theorem is not implied by the Compactness Theorem.

Here are some more examples of theorems equivalent to Choice (a principle we restate for the sake of completeness), and thus unprovable from the Compactness Theorem:

  • Axiom of Choice: every family of non-empty sets has a choice function.
  • Axiom of Multiple Choice: every collection of non-empty sets has a multiple choice function—a function that picks out a non-empty, finite subset from each set in the collection (Chapter 9).
  • Well-Ordering Principle: every set can be well-ordered (Kunen 2011, 68).
  • Zorn’s Lemma: every partial order in which every chain has an upper bound has a maximal element (68).
  • Every vector space has a basis (Blass 1984).
  • Every non-empty set can be endowed with a group operation (Hajnal and Kertész 1972).

See Howard and Rubin (1998, form 1) for more statements equivalent to the Axiom of Choice.

c. Principles Strictly ZF-Weaker Than the Compactness Theorem

The following is a small selection of Choice-like principles that are weaker than the Compactness Theorem. See Jech (1973) for the construction of models satisfying one of these statements but not the Compactness Theorem.

  • Order-Extension Principle: every partial order on a set can be extended to a linear order on that set. Consequently, every set can be linearly ordered.
  • Hahn-Banach theorem: If p is a sublinear functional on a real vector space X and \phi is a linear functional defined on a subspace V \leq X such that \phi(v) \leq p(v) for all v \in V, then there exists a linear extension \psi of \phi to all of X such that \psi(x) \leq p(x) for all x \in X. In fact, the Hahn-Banach theorem is equivalent to the existence of a finitely-additive probability measure on every Boolean algebra (28). It is easily seen that a prime ideal gives rise to a finitely-additive, \{0, 1\}-valued probability measure on every Boolean algebra, and thus the Hahn-Banach theorem follows from the Boolean Prime Ideal Theorem.
  • Every infinite set has a non-principal ultrafilter.
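
The remark in the Hahn-Banach item, that a prime ideal gives rise to a finitely-additive, \{0, 1\}-valued probability measure, can be checked by hand in a small finite case. The Python sketch below does this for the power set algebra of a three-element set. All names (powerset, measure_from_prime_ideal, is_finitely_additive) are hypothetical helpers introduced for illustration. In a finite Boolean algebra every prime ideal consists of the elements missing some fixed point, so the example involves no Choice principle and only illustrates the shape of the construction.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def measure_from_prime_ideal(ideal):
    """Send members of the ideal to 0 and everything else to 1."""
    return lambda s: 0 if s in ideal else 1


def is_finitely_additive(mu, algebra):
    """Check mu(s | t) == mu(s) + mu(t) for all disjoint s, t in the algebra."""
    return all(mu(s | t) == mu(s) + mu(t)
               for s in algebra for t in algebra
               if not (s & t))


universe = frozenset({0, 1, 2})
algebra = powerset(universe)

# Prime ideal of the power set algebra: all subsets that miss the point 1.
ideal = {s for s in algebra if 1 not in s}
mu = measure_from_prime_ideal(ideal)
```

Here mu assigns 1 to the whole set and 0 to the empty set, and is_finitely_additive(mu, algebra) confirms additivity on disjoint pairs.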

8. History of the Compactness Theorem

The first proof of a compactness theorem was published by Gödel:

Theorem X. For a denumerably infinite system of formulas to be satisfiable it is necessary and sufficient that every finite subsystem be satisfiable. (Gödel 1930, 118–19)

The formulas of the logic, called restricted (functional) calculus (103n3) or first-order predicate logic (Mal’cev 1936, 1), are those from first-order logic that do not contain any function, constant, nor equality symbols, but do contain propositional variables; these variables can only be interpreted as 1 (true) or 0 (false) in any model. Thus, this system extends that of propositional logic \mathsf{PL}_\omega.

The central idea behind Gödel’s proof is to create an equivalent sequence (\phi_n) of satisfiable formulas, where each \phi_{n+1} is a conjunction, with one of the conjuncts being \phi_n, for each n. The models for each \phi_n (in the language restricted to only those non-logical symbols that occur in \phi_n) form a finitely-branching tree of height \omega, ordered by extension, which by König’s lemma must contain a branch. (Whilst König’s lemma in general requires some Choice to prove, if the tree is countable then it can be proven in \textsf{ZF}. A finitely-branching tree of height \omega is a countable union of finite sets, which is not necessarily countable in \mathsf{ZF}, even if the finite sets are pairs (Pincus 1974, 224).) The interpretations along this branch are consistent and form a model of the whole sequence (\phi_n), which in turn models the original sequence. We have not included this approach to the compactness theorem in this entry; for details, see Paseau (2010a, 84).

This result also proves the corresponding result for first-order logic with a countable infinity of symbols. Indeed, Gödel does extend his result to allow for equality symbols (Gödel 1930, p. 117, thm. VIII). To allow for constant and function symbols, we introduce new predicate symbols R_c and R_f of arity 1 and n+1 for each constant symbol c and n-ary function symbol f, and make the following substitutions:

  • For a constant symbol c occurring in a sentence \phi, replace each occurrence of c with a variable y that does not occur in \phi. If \widehat{\phi} denotes the resulting formula, we then replace \phi with \exists y (\widehat{\phi} \wedge R_c(y)).
  • For an n-ary function symbol f in a sentence \phi, replace each occurrence of

        \[R(\sigma_1, \dots, \sigma_{k-1}, f(\tau_1, \dots, \tau_n), \sigma_{k+1}, \dots, \sigma_m)\]

    with

        \[\exists y (R(\sigma_1, \dots, \sigma_{k-1}, y, \sigma_{k+1}, \dots, \sigma_m) \wedge R_f(\tau_1, \dots, \tau_n, y)),\]

    where R is an m-ary predicate symbol, \sigma_1, \dots, \sigma_{k-1}, \sigma_{k+1}, \dots, \sigma_m, \tau_1, \dots, \tau_n are terms and y is a variable that does not occur in \phi. Viewing equality as a binary predicate, we can perform the same procedure on subformulas of this form.

Repeating this procedure until there are no occurrences of constant or function symbols, we obtain a new set of sentences \widehat{\Gamma}. Appending to this set sentences of the form \exists ! z R_c(z) for each constant symbol c and \forall x_1, \dots, x_n \exists ! z R_f(x_1, \dots, x_n, z) for each n-ary function symbol f, we find that the satisfiability of this new set is equivalent to the satisfiability of \Gamma, thus extending Gödel’s result to encompass languages with constant and function symbols.
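
The function-symbol substitution above can be sketched in a few lines of Python for a single atomic formula. This is an illustrative sketch under stated assumptions rather than a full implementation: terms are encoded either as variable names (strings) or as tuples (f, t_1, ..., t_n) standing for f(t_1, ..., t_n); the helper name flatten_atom is hypothetical; the fresh variables y0, y1, ... are assumed not to occur in the input; and the existential quantifiers are collected at the front of the result instead of being introduced one at a time, which yields an equivalent formula. Constant symbols and the uniqueness sentences for each R_f are not handled.

```python
from itertools import count


def flatten_atom(pred, args):
    """Eliminate function symbols from the atomic formula pred(args).

    Returns (bound, atoms), to be read as
        "exists bound . conjunction of atoms",
    where each atom is a pair (predicate_name, tuple_of_variables).
    """
    fresh = count()          # source of indices for the new variables
    bound, atoms = [], []

    def name(term):
        # A variable is already flat and is returned unchanged.
        if isinstance(term, str):
            return term
        f, *subterms = term
        # Flatten the arguments first, innermost function symbols first.
        flat_args = tuple(name(t) for t in subterms)
        y = f"y{next(fresh)}"            # assumed not to occur in the input
        bound.append(y)
        # R_f(t_1, ..., t_n, y) plays the role of "f(t_1, ..., t_n) = y".
        atoms.append((f"R_{f}", flat_args + (y,)))
        return y

    top = tuple(name(t) for t in args)
    atoms.insert(0, (pred, top))
    return bound, atoms


# P(x, f(x, g(z))) with f binary and g unary (hypothetical example).
bound_vars, flat_atoms = flatten_atom("P", ["x", ("f", "x", ("g", "z"))])
```

For the example, the result stands for \exists y0 \exists y1 (P(x, y1) \wedge R_g(z, y0) \wedge R_f(x, y0, y1)).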

Note that Gödel’s result is only applicable to the countable versions of propositional and first-order logic. In Gödel’s (1932) short paper, he proves the compactness theorem for propositional logic for arbitrary languages. Given a deductively-closed, consistent set \Gamma of propositional formulas and a well-order on the set of propositional formulas, Gödel defines by transfinite recursion:

  • \Gamma_0 := \Gamma,
  • for all ordinals \alpha, if there is a formula \phi such that neither \phi nor \neg \phi belong to \Gamma_\alpha, let \phi_\alpha be the least such \phi and define

        \[\Gamma_{\alpha + 1} := \{\psi : (\phi_\alpha \to \psi) \in \Gamma_\alpha\}.\]

    Otherwise, define \Gamma_{\alpha + 1} := \Gamma_\alpha.

  • for all limit ordinals \lambda, \Gamma_\lambda := \bigcup_{\alpha < \lambda} \Gamma_\alpha.

For an ordinal \alpha where \Gamma_{\alpha + 1} = \Gamma_\alpha, it follows that the valuation v defined as follows satisfies every formula in \Gamma: for all propositional letters p,

    \[v(p) = 1 \iff p \in \Gamma_\alpha.\]
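
A finite analogue of this construction can be run directly. The Python sketch below is an assumption-laden illustration, not Gödel's argument itself: it works with finitely many propositional letters, represents a deductively-closed set by its set of models instead of by its formulas, and decides only the letters (which suffices to decide every formula in the finite case). Passing from \Gamma_\alpha to \{\psi : (\phi_\alpha \to \psi) \in \Gamma_\alpha\} then corresponds to keeping just those models in which \phi_\alpha is true. The names models and complete, and the encoding of formulas as Python predicates on valuations, are hypothetical.

```python
from itertools import product


def models(letters, formulas):
    """All valuations over `letters` (as dicts) satisfying every formula.

    Each formula is a Python predicate taking a valuation dict.
    """
    valuations = [dict(zip(letters, bits))
                  for bits in product([False, True], repeat=len(letters))]
    return [v for v in valuations if all(f(v) for f in formulas)]


def complete(letters, gamma):
    """Return a valuation satisfying gamma, or None if gamma is unsatisfiable.

    The order of `letters` plays the role of the well-order in the text.
    """
    current = models(letters, gamma)
    if not current:
        return None
    for p in letters:
        always = all(v[p] for v in current)
        never = not any(v[p] for v in current)
        if not (always or never):
            # p is undecided: add it, i.e. keep the models where p is true.
            current = [v for v in current if v[p]]
    # v(p) = 1 exactly when p holds in every remaining model.
    return {p: all(v[p] for v in current) for p in letters}


letters = ["p", "q", "r"]
gamma = [lambda v: v["p"] or v["q"],            # p or q
         lambda v: not (v["p"] and v["r"])]     # not (p and r)
valuation = complete(letters, gamma)
```

Because an undecided letter is true in at least one remaining model, the set of models never becomes empty, and after the last letter exactly one model is left.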

Independently, Mal’cev (1936, thm. 1) also proved the compactness theorem for propositional logic, again using the full strength of the Axiom of Choice. His proof relies on transfinite induction, letting \kappa be an uncountable cardinal and assuming that the compactness theorem holds for sets of propositional sentences with cardinality strictly less than \kappa. Let \Gamma be a finitely-satisfiable set of sentences with cardinality \kappa and well-ordered in this order type. At each non-limit stage \alpha, he replaces the \alpha-th sentence \phi_\alpha with a conjunction of literals. Suppose this has been done up to, but not including, stage \alpha < \kappa, so that we have a set \Gamma_\alpha = \{\widehat{\phi}_\beta : \beta < \alpha\} \cup \{\phi_\beta : \alpha \leq \beta < \kappa\}. The \alpha-th substitution is made according to the following rule: as the initial segments of \Gamma_\alpha are satisfiable, we have a collection of \kappa-many models. One of the truth-value assignments to the variables that occur in \phi_\alpha must coincide with \kappa-many of the assignments for these models. Replace \phi_\alpha with \widehat{\phi}_\alpha, which is the conjunction of the letters in \phi_\alpha that are assigned true, together with the negations of the letters that are assigned false. By construction, the final set \Gamma_\kappa = \{\widehat{\phi}_\alpha : \alpha < \kappa\} is satisfiable, and any satisfying model is also a model of the original \Gamma.

Mal’cev claimed to have extended this result to first-order logic in the same paper. However, his terminology is slightly ambiguous. What seems to be at issue is the claim that every sentence is ‘equivalent’ to a \forall \exists-sentence, that is, one of the form \forall x_1 \dots \forall x_m \exists y_1 \dots \exists y_n \phi, where \phi is a quantifier-free formula. Read as logical equivalence, this is patently false: for instance, the arithmetical hierarchy does not collapse, so there exists a sentence that is not equivalent to a \Pi_2-sentence, even over Peano Arithmetic. However, by ‘equivalent’ Mal’cev appears to mean equisatisfiable:

    as is well known, every formula of FOPL can be replaced with an equivalent formula in the Skolem normal form for satisfiability. (4)

This is also noted in a review of Mal’cev’s later articles (Henkin and Mostowski 1959, 57). Under this interpretation, the statement is true. Skolem’s normal form for satisfiability can be found in (Skolem 1920):

Theorem 1. If U is an arbitrary first-order proposition, there exists a first-order proposition U' in normal form with the property that U is satisfiable in a given domain whenever U' is, and conversely. (255)

Note that the languages of U and U' are not in general the same. This result readily extends to sets of sentences and is a well-known tool in model theory. (Nowadays, Skolemisation of a theory usually refers to the introduction of function symbols directly in order to prove, amongst other things, the downward Löwenheim-Skolem theorem (Skolem 1920, 257–59). As we do not require this theorem, the predicate approach has the added advantage of not requiring Choice.)

The first explicit, published proofs of a compactness theorem from completeness were given independently by Henkin (1948, 1949b, 1950) and Robinson (1949, 1951). The logics considered in these dissertations are first-order logic / theory of simple types for Henkin, and restricted functional calculus for Robinson. However, the proof relies only upon the finitary nature of the deductive system and completeness of the logic. (The completeness theorem for propositional logic was proven independently by Bernays (1918, sec. 3; see Zach (1999, 340–48) for a discussion on authorship) and Post (1921, 169). For the restricted functional calculus, the question of completeness was first asked, for an empty set of premisses, in Hilbert and Ackermann (1928, 68). It was proven in the form of the model existence theorem: for countable sets in Gödel (1929, 96–101) and for arbitrary sets, using the Axiom of Choice, in Henkin (1949b, p. 164, cor. 2) and Robinson (1951, chap. 3). As noted above, this easily extends to completeness of first-order logic.)

The compactness theorem finally receives its name in Tarski (1952) in two forms. The second form (p. 467, thm. 17) is the form we are familiar with. By \mathbf{AC}, Tarski denotes the collection of arithmetical classes, which are classes of the form

    \[\mathcal{CL}(\phi) := \{\mathfrak{M} : \mathfrak{M} \models \phi\}\]

where \phi is a formula over a fixed first-order language \mathcal{L}, and the \mathfrak{M} range over \mathcal{L}-structures. In his terminology, the compactness theorem is the statement that if \Gamma is a collection of \mathcal{L}-formulas and \bigcap \{\mathcal{CL}(\phi) : \phi \in \Gamma\} = \emptyset (which means \Gamma is unsatisfiable) then there exists a finite set \Delta \subset \Gamma such that \bigcap \{\mathcal{CL}(\phi) : \phi \in \Delta\} = \emptyset (which means \Gamma is not finitely-satisfiable). The first form (p. 466, thm. 13) is a generalisation that applies to types (Chang and Keisler 1990, 78–79; Hodges 1997, 130–31).

Whilst Tarski does not provide a topological argument for these theorems, relying upon completeness instead, he does draw some connection between a topological space based on \mathbf{AC} and first-order logic. By defining a closure operator and then quotienting by equivalence of models, he obtains a space homeomorphic to the one constructed at the end of section 5 of this article. (There are some foundational issues at play here, since each equivalence class is a proper class, as are the open sets. Tarski takes only a set of models as opposed to the whole class. We avoided this issue in our construction by considering the space of theories. As the class of theories is a set, by taking enough representatives—using, for example, the Axiom of Collection—we find that Tarski’s space will be homeomorphic to our construction in section 5.) As he observes, the resulting space is a Stone space, noting that compactness of the space follows from the compactness of the logic. (Tarski uses the term bicompact for what we call compact. Compact used to refer to the countable compactness property (Willard 1970, 304).)

The ultraproduct proof is given in Frayne, Morel, and Scott (1962, p. 216, thm. 2.10), which is based on Łoś’ Theorem (p. 213, lemma 2.1)—see Łoś (1971, 105, “(2.6)”) for the original statement (without proof). The argument is similar to the one given in Section 4c. Frayne, Morel, and Scott (1962, 195) acknowledge that Alfred Tarski suggested the use of ultraproducts, which had several applications in the literature already, in proving the compactness theorem. David Gale gave a topological proof of the compactness theorem for propositional logic (as credited by Henkin (1949a, 48n4)), whilst Beth (1951) gives a topological proof of the completeness theorem before deriving compactness as a corollary. A direct topological proof is given by Frayne, Morel, and Scott (1962, p. 225, ex. 2), using a combination of ultraproducts and Stone spaces.

More details on this history can be found in Dawson (1993) and the references therein.

9. References and Further Reading

  • Barwise, J., and Solomon Feferman, eds. 1985. Model-Theoretic Logics. Perspectives in Mathematical Logic. New York: Springer.
    • A book on the model-theory of many logics that extend first-order logic.
  • Beall, J. C., and Greg Restall. 2006. Logical Pluralism. Oxford: Clarendon Press.
    • The book that started the contemporary debate on whether there is one correct foundational logic (logical monism) or more than one (logical pluralism). The authors defend logical pluralism.
  • Bernays, Paul. 1918. “Beiträge zur axiomatischen Behandlung des Logik-Kalküls.” In Ewald and Sieg 2013, 231–68.
    • Bernays’ habilitation thesis, in which he proves the completeness theorem for propositional logic, amongst other results related to the chosen propositional calculus.
  • Beth, E. W. 1951. “A Topological Proof of the Theorem of Löwenheim-Skolem-Gödel.” Indagationes Mathematicae (Proceedings) 54:436–44. https://doi.org/10.1016/S1385-7258(51)50062-8.
    • Beth presents a proof of the Löwenheim-Skolem-Gödel theorem: a set of first-order sentences Γ has a countable model if and only if Γ is consistent.
  • Blass, Andreas. 1984. “Existence of Bases Implies the Axiom of Choice.” In Axiomatic Set Theory, edited by James E. Baumgartner, Donald A. Martin, and Saharon Shelah, 31–33. Contemporary Mathematics 31. Providence, RI: American Mathematical Society.
    • Blass proves that if every vector space has a basis, then the Axiom of Choice holds, via the Axiom of Multiple Choice.
  • Chang, C. C., and H. Jerome Keisler. 1990. Model Theory. 3rd ed. Studies in Logic and the Foundations of Mathematics 73. Amsterdam: North-Holland.
    • A classic model theory text.
  • Cowen, Robert H. 1990. “Two Hypergraph Theorems Equivalent to BPI.” Notre Dame Journal of Formal Logic 31 (2): 232–40. https://doi.org/10.1305/ndjfl/1093635418.
    • Cowen proves that the Boolean Prime Ideal theorem (and thus the Compactness Theorem) is equivalent to the Compactness Theorem restricted to propositional formulas formed from a disjunction of at most 3 literals, and is also equivalent to the statement “a graph G is 3-colourable if every finite subgraph is 3-colourable”.
  • Dawson, John W., Jr. 1993. “The Compactness of First-Order Logic: From Gödel to Lindström.” History and Philosophy of Logic 14 (1): 15–37. https://doi.org/10.1080/01445349308837208.
    • Another article detailing the development and history of the compactness theorems.
  • Denyer, Nicholas. 1992. “Pure Second-Order Logic.” Notre Dame Journal of Formal Logic 33 (2): 220–24. https://doi.org/10.1305/ndjfl/1093636099.
    • Denyer proves that pure second-order logic (where formulas only have predicate variables) without identity is compact, whilst the corresponding logic with only functional variables is not.
  • Ebbinghaus, H.-D. 1985. “Extended Logics: The General Framework.” In Barwise and Feferman 1985, chap. 2.
    • Ebbinghaus presents a general framework for logics extending first-order logic and discusses various model-theoretic properties of these logics, in particular compactness and variations thereof.
  • Enderton, Herbert B. 2001. A Mathematical Introduction to Logic. 2nd ed. San Diego: Harcourt/Academic Press.
    • An introductory textbook on propositional, first-order, and second-order logic, as well as Gödel’s incompleteness theorems.
  • Ewald, William, and Wilfried Sieg, eds. 2013. David Hilbert’s Lectures on the Foundations of Arithmetic and Logic: 1917–1933. Vol. 3 of David Hilbert’s Lectures on the Foundations of Mathematics and Physics: 1891–1933. Heidelberg: Springer.
    • A collection of Hilbert’s works in the foundations of mathematics, with additional commentary and historical background, as well as a reproduction of Bernays’ habilitation thesis.
  • Feferman, Solomon, John W. Dawson Jr., Stephen C. Kleene, Gregory H. Moore, Robert M. Solovay, and Jean van Heijenoort, eds. 1986. Publications 1929–1936. Vol. 1 of Kurt Gödel Collected Works. Oxford: Oxford University Press.
    • A collection of many of Gödel’s important works, including his famous results on completeness, incompleteness, and compactness.
  • Frayne, T., A. C. Morel, and Dana S. Scott. 1962. “Reduced Direct Products.” Fundamenta Mathematicae 51:195–228. https://doi.org/10.4064/fm-51-3-195-228.
    • This paper presents several results on properties of reduced products and ultraproducts.
  • Givant, Steven, and Paul Halmos. 2009. Introduction to Boolean Algebras. Undergraduate Texts in Mathematics. New York: Springer.
    • An introductory textbook on algebraic, order-theoretic, and topological aspects of Boolean algebras.
  • Gödel, Kurt. 1929. “Über die Vollständigkeit des Logikkalküls.” Translated by Stefan Bauer-Mengelberg and Jean van Heijenoort. In Feferman, Dawson, Kleene, Moore, Solovay, and Van Heijenoort 1986, 60–101.
    • Gödel’s doctoral dissertation, in which he proves the completeness of the restricted functional calculus.
  • Gödel, Kurt. 1930. “Die Vollständigkeit der Axiome des logischen Funktionenkalküls.” Translated by Stefan Bauer-Mengelberg. In Feferman, Dawson, Kleene, Moore, Solovay, and Van Heijenoort 1986, 102–23.
    • Based on his 1929 doctoral dissertation, Gödel proves the compactness (for countable sets of sentences) of restricted functional calculus from the completeness theorem.
  • Gödel, Kurt. 1932. “Eine Eigenschaft der Realisierungen des Aussagenkalküls.” Translated by John W. Dawson Jr. In Feferman, Dawson, Kleene, Moore, Solovay, and Van Heijenoort 1986, 238–41.
    • Gödel presents his proof of the compactness theorem in general for propositional logic.
  • Goldblatt, Robert. 1998. Lectures on the Hyperreals: An Introduction to Nonstandard Analysis. Graduate Texts in Mathematics 188. New York: Springer.
    • A graduate-level textbook on the subject of non-standard analysis, including the foundational and model-theoretic justification for its methods.
  • Gonczarowski, Yannai A., Scott Duke Kominers, and Ran I. Shorrer. 2019. “To Infinity and Beyond: Scaling Economic Theories via Logical Compactness.” Harvard Business School Entrepreneurial Management Working Paper, no. 19–127, revised November 9, 2020. https://doi.org/10.2139/ssrn.3409828.
    • This substantial working paper demonstrates several applications of the compactness theorem in economics.
  • Griffiths, O., and A. C. Paseau. 2022. One True Logic. Oxford: Oxford University Press.
    • The authors argue that there is one correct foundational logic, and that it is highly infinitary.
  • Hajnal, A., and A. Kertész. 1972. “Some New Algebraic Equivalents of the Axiom of Choice.” Publicationes Mathematicae Debrecen 19:339–40.
    • The equivalence, under ZF, of the Axiom of Choice with the existence of group structures (or other algebraic structures) on any set is proven in this short article.
  • Henkin, Leon A. 1948. “The Completeness of Formal Systems.” PhD diss., Princeton University.
    • Henkin’s doctoral dissertation which contains his proof of the completeness and compactness theorems for first-order functional calculus and the simple theory of types, as well as applications of the compactness theorem to algebra among other results. These results were published in Henkin (1949b, 1950).
  • Henkin, Leon A. 1949a. “Fragments of the Propositional Calculus.” Journal of Symbolic Logic 14:42–48. https://doi.org/10.2307/2268976.
    • Henkin demonstrates how to obtain complete axiomatisations for fragments of propositional logic.
  • Henkin, Leon A. 1949b. “The Completeness of the First-Order Functional Calculus.” Journal of Symbolic Logic 14:159–66. https://doi.org/10.2307/2267044.
    • This article contains Henkin’s proof of the completeness and compactness theorems of first-order functional calculus, based on his doctoral dissertation.
  • Henkin, Leon A. 1950. “Completeness in the Theory of Types.” Journal of Symbolic Logic 15:81–91. https://doi.org/10.2307/2266967.
    • In this article, Henkin proves the completeness theorem for the theory of simple types, based on his doctoral dissertation.
  • Henkin, Leon A., and Andrzej Mostowski. 1959. Review of Anatoliĭ Ivanovič Mal’cev. 1941. “Ob odnom obščém métodé polučéniá lokal’nyh téorém téorii grupp.” Ivanovskij Gosudarstvénnyj Pédagogičéskij Institut, Učényé zapiski, Fiziko-matématičéskié nauki 1 (1): 3–9 and “O prédstavléniáh modéléj.” 1956 by Anatoliĭ Ivanovič Mal’cev. Doklady Akadémii Nauk SSSR 108:27–29. Journal of Symbolic Logic 24 (1): 55–57. https://doi.org/10.2307/2964581.
    • Henkin and Mostowski’s review of two of Mal’cev’s important papers on applications of the compactness theorems in algebra. English translations of these articles are found in Mal’cev (1971, 15–26).
  • Hilbert, D., and W. Ackermann. 1928. “Grundzüge der theoretischen Logik.” In Ewald and Sieg 2013, 809–915.
    • A collection of Hilbert’s works in the foundations of mathematics, with additional commentary and historical background, as well as a reproduction of Bernays’ habilitation thesis.
  • Hodges, Wilfrid. 1997. A Shorter Model Theory. Cambridge: Cambridge University Press.
    • A comprehensive textbook on model theory.
  • Howard, Paul E. 1975. “Łoś’ Theorem and the Boolean Prime Ideal Theorem Imply the Axiom of Choice.” Proceedings of the American Mathematical Society 49:426–28. https://doi.org/10.2307/2040659.
    • In this paper, Howard proves the title statement and discusses the possibility of assumption of the Boolean Prime Ideal Theorem.
  • Howard, Paul E., and Jean E. Rubin. 1998. Consequences of the Axiom of Choice. Mathematical Surveys and Monographs 59. Providence, RI: American Mathematical Society.
    • A comprehensive source of the myriad Choice principles that occur throughout mathematics, organised by equivalence as well as topic. It also includes descriptions of the many models of set theory that demonstrate non-provable implications between the different forms of Choice.
  • Jech, Thomas J. 1973. The Axiom of Choice. Studies in Logic and the Foundations of Mathematics 75. Amsterdam: North-Holland.
    • A textbook on the Axiom of Choice and other Choice-like principles, which covers the standard permutation models used for proving the independence of one statement from another.
  • Jech, Thomas J. 2003. Set Theory. 3rd ed. Springer Monographs in Mathematics. Berlin: Springer.
    • A graduate-level textbook on axiomatic set theory, covering infinitary combinatorics, large cardinals, inner models, and forcing.
  • Kanamori, Akihiro. 2009. The Higher Infinite: Large Cardinals in Set Theory from Their Beginnings. 2nd ed. Springer Monographs in Mathematics. Berlin: Springer.
    • A graduate-level textbook on the large cardinal hierarchy.
  • Keisler, H. Jerome. 1965. “A Survey of Ultraproducts.” In Logic, Methodology and Philosophy of Science II: Proceedings of the 1964 International Congress, edited by Yehoshua Bar-Hillel, 112–26. Studies in Logic and the Foundations of Mathematics. Amsterdam: North-Holland.
    • A survey on the use of ultraproducts in mathematics, including its relationship with the compactness theorem for first-order logic via Łoś’ theorem.
  • Keisler, H. Jerome. 1970. “Logic with the Quantifier ‘There Exist Uncountably Many’.” Annals of Pure and Applied Logic 1:1–93. https://doi.org/10.1016/S0003-4843(70)80005-5.
    • An extensive paper on the model-theoretic properties of logics that include the quantifier ‘there exist uncountably many’.
  • Keisler, H. Jerome, and Julia F. Knight. 2004. “Barwise: Infinitary Logic and Admissible Sets.” The Bulletin of Symbolic Logic 10 (1): 4–36. https://doi.org/10.2178/bsl/1080330272.
    • A survey of the Barwise compactness theorem of infinitary logic.
  • Kunen, Kenneth. 2011. Set Theory. Studies in Logic 34. London: College Publications.
    • A textbook on axiomatic set theory, covering infinitary combinatorics and forcing.
  • Lindström, Per. 1969. “On Extensions of Elementary Logic.” Theoria 35:1–11. https://doi.org/10.1111/j.1755-2567.1969.tb00356.x.
    • Lindström proves his famous theorem characterising first-order logic via the (countable) compactness and Löwenheim-Skolem theorems.
  • Łoś, Jerzy. 1971. “Quelques remarques, théorèmes et problèmes sur les classes définissables d’algèbres.” In Mathematical Interpretation of Formal Systems, 2nd ed., 16:98–113. Studies in Logic and the Foundations of Mathematics. Amsterdam: North-Holland.
    • In this article (originally published in 1955), Łoś states without proof the theorem which bears his name.
  • Makowsky, J. A. 1985. “Compactness, Embeddings and Definability.” In Barwise and Feferman 1985, chap. 18.
    • Makowsky presents results on several model-theoretic properties of abstract logics, including generalised compactness properties.
  • Mal’cev, Anatoliĭ Ivanovič. 1936. “Investigations in the realm of mathematical logic.” In Mal’cev 1971, 1–14.
    • In this article, Mal’cev proves the compactness theorems for propositional and first-order logic.
  • Mal’cev, Anatoliĭ Ivanovič. 1971. The Metamathematics of Algebraic Systems: Collected papers: 1936–1967. Edited and translated by Benjamin Franklin Wells III. Amsterdam: North-Holland.
    • Translations of most of Mal’cev’s works on algebraic applications of mathematical logic.
  • Mann, Allen L., Gabriel Sandu, and Merlijn Sevenster. 2011. Independence-Friendly Logic: A Game-Theoretic Approach. London Mathematical Society Lecture Note Series 386. Cambridge: Cambridge University Press.
    • This monograph provides an introduction to independence-friendly logic, a logic with game-theoretic semantics.
  • Mannila, Heikki. 1983. “A Topological Characterization of (λ, µ)-Compactness.” Annals of Pure and Applied Logic 25 (3): 301–5. https://doi.org/10.1016/0168-0072(83)90022-2.
    • Mannila gives a model-theoretic characterisation of the (λ, µ)-compactness of the model space of a logic.
  • Moore, Gregory H. 1982. Zermelo’s Axiom of Choice: Its Origins, Development, and Influence. Studies in the History of Mathematics and Physical Sciences 8. New York: Springer-Verlag.
    • A comprehensive book detailing the history of Zermelo’s explication of the Axiom of Choice and the consequent controversy surrounding it in the early twentieth century.
  • Paseau, A. C. 2010a. “Proofs of the Compactness Theorem.” History and Philosophy of Logic 31 (1): 73–98. https://doi.org/10.1080/01445340903495340. Corrected in “Proofs of the Compactness Theorem.” History and Philosophy of Logic 32, no. 4 (2011): 407. https://doi.org/10.1080/01445340.2011.618261.
    • This article examines different proofs of the compactness theorem, including some of the ones included in the present article, and draws some philosophical conclusions.
  • Paseau, A. C. 2010b. “Pure Second-Order Logic with Second-Order Identity.” Notre Dame Journal of Formal Logic 51 (3): 351–60. https://doi.org/10.1215/00294527-2010-021.
    • Pure second-order logic is second-order logic without first-order or functional variables. In the context of second-order logic, it is therefore, in a sense, the complement of first-order logic. The article states and proves some metalogical results for this logic.
  • Paseau, A. C., and Owen Griffiths. 2021. “Is English Consequence Compact?” Thought: A Journal of Philosophy 10 (3): 188–98. https://doi.org/10.1002/tht3.492.
    • A detailed investigation of the validity of the ‘planets’ argument discussed in section 2b of the present article.
  • Pincus, David. 1974. “The Strength of the Hahn-Banach Theorem.” In Victoria Symposium on Nonstandard Analysis: University of Victoria 1972, edited by Albert Hurd and Peter Loeb, 203–48. Lecture Notes in Mathematics 369. Berlin: Springer-Verlag.
    • Pincus shows the consistency (relative to ZF) of the Hahn-Banach theorem holding whilst the Boolean Prime Ideal Theorem fails.
  • Poizat, Bruno. 2000. A Course in Model Theory. Translated by Moses Klein. Universitext. An Introduction to Contemporary Mathematical Logic. New York: Springer.
    • A graduate-level textbook on model theory.
  • Post, Emil L. 1921. “Introduction to a General Theory of Elementary Propositions.” American Journal of Mathematics 43 (3): 163–85. https://doi.org/10.2307/2370324.
    • Based on his doctoral dissertation, Post proves the completeness theorem for propositional calculus, as well as some results on many-valued logics.
  • Putnam, Hilary. 1980. “Models and Reality.” Journal of Symbolic Logic 45 (3): 464–82. https://doi.org/10.2307/2273415.
    • In this article, Putnam sets out the argument mentioned in section 3 of the present article.
  • Richter, Marcel K. 1966. “Revealed Preference Theory.” Econometrica 34 (3): 635–45. https://doi.org/10.2307/1909773.
    • Richter characterises the representability and rationality of consumers, where the latter proof uses the Order-Extension Principle.
  • Robinson, Abraham. 1949. “On the Metamathematics of Algebraic Systems.” PhD diss., Birkbeck College, University of London.
    • Robinson’s doctoral dissertation, which includes his proof of the completeness and compactness theorems for first-order functional calculus. These results were later published in Robinson (1951).
  • Robinson, Abraham. 1951. On the Metamathematics of Algebra. Studies in Logic and the Foundations of Mathematics 4. Amsterdam: North-Holland.
    • A monograph largely based on Robinson’s doctoral dissertation.
  • Robinson, Abraham. 1966. Non-Standard Analysis. Studies in Logic and the Foundations of Mathematics 42. Amsterdam: North-Holland.
    • A foundational text in non-standard analysis, providing a rigorous justification of its methods.
  • Rubin, Herman, and Dana S. Scott. 1954. Some Topological Theorems Equivalent to the Boolean Prime Ideal Theorem. Presented at 503rd meeting of the American Mathematical Society, May 1, 1954. Abstract in W. Green. “The May Meeting in Yosemite.” Bulletin of the American Mathematical Society 60 (1954): 386–99. https://doi.org/10.1090/S0002-9904-1954-09827-0.
    • At this talk, several topological theorems are proven to be equivalent to the Boolean Prime Ideal Theorem.
  • Shapiro, Stewart. 1991. Foundations without Foundationalism: A Case for Second-Order Logic. Oxford Logic Guides 17. Oxford: Clarendon Press.
    • A detailed and extensive account of second-order logic, taking in its historical, mathematical and philosophical aspects.
  • Shapiro, Stewart. 2014. Varieties of Logic. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199696529.001.0001.
    • The author defends a more radical form of logical pluralism (the thesis that there is more than one correct foundational logic) than that of Beall and Restall (2006).
  • Skolem, Thoralf. 1920. “Logico-Combinatorial Investigations in the Satisfiability or Provability of Mathematical Propositions: A Simplified Proof of a Theorem by L. Löwenheim and Generalizations of the Theorem.” Translated by Stefan Bauer-Mengelberg. In From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, edited by Jean van Heijenoort, 254–63. Cambridge, MA: Harvard University Press.
    • A translated extract from Skolem’s 1920 paper, in which he presents his normal form for formulas in first-order logic as well as presenting a fully-correct proof of the downward Löwenheim-Skolem theorem for countable sets of sentences, which uses his Axiom of Choice.
  • Skolem, Thoralf. 1922. “Some Remarks on Axiomatized Set Theory.” Translated by Stefan Bauer-Mengelberg, 290–301.
    • A translated transcript of Skolem’s address at the Fifth Congress of Scandinavian Mathematicians held in Helsinki 4th–7th July 1922, in which he discusses several foundational aspects of set theory, and in particular demonstrates his ‘paradox’.
  • Stephenson, R. M., Jr. 1984. “Initially κ-Compact and Related Spaces.” In Handbook of Set-Theoretic Topology, edited by Kenneth Kunen and Jerry E. Vaughan, 603–32. Amsterdam: North-Holland.
    • This article presents a survey of results on (κ, λ)-compactness of topological spaces.
  • Sutherland, Wilson A. 2009. Introduction to Metric and Topological Spaces. 2nd ed. Oxford: Oxford University Press.
    • An introductory textbook to topology. Chapter 13 on compactness is particularly relevant to this article.
  • Szpilrajn, Edward. 1930. “Sur l’extension de l’ordre partiel.” Fundamenta Mathematicae 16:386–89. https://doi.org/10.4064/fm-16-1-386-389.
    • In this short paper, Szpilrajn (also known as Marczewski) proves his Order Extension Principle.
  • Tarski, Alfred. 1952. “Some Notions and Methods on the Borderline of Algebra and Metamathematics.” In Proceedings of the International Congress of Mathematicians: Cambridge, Massachusetts, U.S.A., August 30–September 6, 1950, 1:705–20. Providence, RI: American Mathematical Society.
    • Tarski presents several results in the language of arithmetical classes (the classes of structures that satisfy a particular first-order sentence). Theorem 13 gives the compactness theorem its name.
  • Väänänen, Jouko. 2007. Dependence Logic: A New Approach to Independence Friendly Logic. London Mathematical Society Student Texts 70. Cambridge: Cambridge University Press.
    • A monograph on dependence logic, and other logics with game-theoretic semantics.
  • Willard, Stephen. 1970. General Topology. Reading, MA: Addison-Wesley.
    • An introductory textbook on topology. Chapter 6 on compactness is particularly relevant to this article.
  • Zach, Richard. 1999. “Completeness Before Post: Bernays, Hilbert, and the Development of Propositional Logic.” Bulletin of Symbolic Logic 5 (3): 331–66. https://doi.org/10.2307/421184.
    • A historical article on Bernays and Hilbert’s role in development of mathematical logic, particularly focused on propositional logic.

 

Author Information

A. C. Paseau
Email: alexander.paseau@philosophy.ox.ac.uk
University of Oxford
United Kingdom

and

Robert Leek
Email: r.leek@bham.ac.uk
University of Birmingham
United Kingdom

Hunhu/Ubuntu in the Traditional Thought of Southern Africa

The term Ubuntu/Botho/Hunhu is a Zulu/Xhosa/Ndebele/Sesotho/Shona word referring to the moral attribute of a person, who is known in the Bantu languages as Munhu (among the Shona of Zimbabwe), Umuntu (among the Ndebele of Zimbabwe and the Zulu/Xhosa of South Africa), Muthu (among the Tswana of Botswana) and Omundu (among the Herero of Namibia), to name just a few of the Bantu tribal groupings. Though the term has a wider linguistic rendering in almost all the Bantu languages of Southern Africa, it has gained a lot of philosophical attention in Zimbabwe and South Africa, especially in the early twenty-first century, for the simple reason that both countries needed home-grown philosophies to move forward following the political disturbances caused by the liberation war and apartheid respectively. Philosophically, the term Ubuntu emphasises the importance of a group or community, and it finds clear expression in the Nguni/Ndebele phrase umuntu ngumuntu ngabantu, which when translated to Shona means munhu munhu muvanhu (a person is a person through other persons). This article critically reflects on hunhu/ubuntu as a traditional and/or indigenous philosophy by focussing particularly on its distinctive features, its components and how it is deployed in the public sphere.

Table of Contents

  1. Introduction
  2. About the Sources
  3. Hunhu/Ubuntu and Ethno-Philosophy
    1. The Deployment of Hunhu/Ubuntu in the Public Sphere
  4. The Distinctive Qualities/Features of Hunhu/Ubuntu
  5. The Components of Hunhu/Ubuntu
    1. Hunhu/Ubuntu Metaphysics
    2. Hunhu/Ubuntu Ethics
    3. Hunhu/Ubuntu Epistemology
  6. Conclusion
  7. References and Further Reading

1. Introduction

The subject of Hunhu/Ubuntu has generated considerable debate in public and private intellectual discussions, especially in South Africa and Zimbabwe, where the major focus has been on whether or not Hunhu/Ubuntu can compete with other philosophical world views and whether or not it can solve the socio-political challenges facing the two countries. Hunhu/Ubuntu is also a key theme in African philosophy, as it stresses the importance of group or communal existence as opposed to the West’s emphasis on individualism and individual human rights. Thus, Hunhu/Ubuntu, as an aspect of African philosophy, prides itself on the idea that the benefits and burdens of the community must be shared in such a way that no one is prejudiced, while everything is done to put the interests of the community ahead of the interests of the individual. To this end, the traditional philosophical meaning of the term Ubuntu/Botho/Hunhu is sought and its importance in the academy is highlighted and explained. The article also looks at how the concept is deployed in the public sphere. It provides an elaborate analysis of the qualities/features of Hunhu/Ubuntu as exemplified by John S Pobee’s expression Cognatus ergo sum, which means “I am related by blood, therefore I exist.” Finally, the article outlines and thoroughly explains the components cognate to Hunhu/Ubuntu as an aspect of ethno-philosophy, namely Hunhu/Ubuntu Metaphysics, Hunhu/Ubuntu Ethics and Hunhu/Ubuntu Epistemology.

2. About the Sources

Many scholars have written about Ubuntu, and it is only fair to limit our discussion to those scholars who have had an interest in the philosophical meaning of the term in Southern African thought. In this category, we have first-generation scholars of Ubuntu such as Mogobe Bernard Ramose (1999; 2014), who is credited with the definition of Ubuntu as humaneness; Stanlake Samkange and Tommie Marie Samkange (1980), who link Hunhu/Ubuntu with the idea of humanism; and Desmond Tutu (1999), who sees Ubuntu as a conflict resolution philosophy. These three are regarded as first-generation scholars of Ubuntu because, historically, they are among the first black philosophers hailing from Africa to write about Hunhu/Ubuntu as a philosophy. They also started writing as early as the 1980s and early 1990s, and they regarded Ubuntu, inspired by traditional southern African thought, as a human quality or as an attribute of the soul.

We also have second-generation scholars of Ubuntu such as Michael Onyebuchi Eze (2010), who is credited with a critical historicisation of the term Ubuntu; Michael Battle (2009), who is credited with some deep insights on the linguistic meaning of the term Ubuntu as well as his famous claim that Ubuntu is a gift to the Western world; Fainos Mangena (2012a and 2012b), who is credited with defining Hunhu/Ubuntu and extracting from it the idea of the Common Moral Position (CMP); Thaddeus Metz (2007), whose search for a basic principle that would define African ethics has attracted a lot of academic attention; and Christian BN Gade (2011; 2012 and 2013), who has taken the discourse of Hunhu/Ubuntu to another level by looking at the historical development of discourses on Ubuntu as well as the meaning of Ubuntu among South Africans of African Descent (SAADs). Finally, we have Martin H Prozesky, who has outlined some of the distinctive qualities/features of Hunhu/Ubuntu philosophy that are important for this article.

3. Hunhu/Ubuntu and Ethno-Philosophy

In order to define Ubuntu and show its nexus with ethno-philosophy, it is important that we first define ethno-philosophy. To this end, Zeverin Emagalit defines ethno-philosophy as a system of thought that deals with the collective worldviews of diverse African people as a unified form of knowledge based on myths, folk wisdom and the proverbs of the people. From this definition, we can pick out two important points: the first is that ethno-philosophy is a “system of thought,” and the second is that it concerns “the collective world views of diverse African people,” which together form a unified body of knowledge. This means that the diversity that characterises African people, in terms of geographical location, history and ethnicity, does not take away the fact that Africans have “a unified form of knowledge” that is based on group identity or community. This is what qualifies Ubuntu as an important aspect of ethno-philosophy.

This section defines Ubuntu and traces its historical roots in Southern African cultures. To begin with, the term Ubuntu comes from a group of sub-Saharan languages known as Bantu (Battle, 2009: 2). It is a term used to describe the quality or essence of being a person amongst many sub-Saharan tribes of the Bantu language family (Onyebuchi Eze, 2008: 107). While Battle does not make reference to the Shona equivalent of Ubuntu and recognises the words Ubuntu and Bantu by the common root –ntu (human), Ramose uses the Zulu/isiNdebele word Ubuntu concurrently with its Shona equivalent, hunhu, to denote the idea of existence. For Ramose, Hu– is ontological, while –nhu is epistemological, and the same holds for Ubu– and –ntu (Ramose 1999: 50). Having lived in Africa and Zimbabwe, Ramose is able to know with some degree of certainty the ontological and epistemological status of the words hunhu and ubuntu. It sometimes takes an insider to be able to correctly discern the meanings of such words.

Hunhu/ubuntu also says something about the character and conduct of a person (Samkange and Samkange 1980: 38). What this translates to is that hunhu/ubuntu is not only an ontological and epistemological concept; it is also an ethical concept. For Battle, Ubuntu is the interdependence of persons for the exercise, development and fulfilment of their potential to be both individuals and community. Desmond Tutu captures this aptly when he uses the Xhosa proverb ungamntu ngabanye abantu, whose Shona equivalent is munhu unoitwa munhu nevamwe vanhu (a person is made a person by other persons). Generally, this proverb, for Battle, means that each individual’s humanity is ideally expressed in relationship with others. This view was earlier expressed by Onyebuchi Eze (2008: 107) who put it thus:

More critical…is the understanding of a person as located in a community where being a person is to be in a dialogical relationship in this community. A person’s humanity is dependent on the appreciation, preservation and affirmation of other person’s humanity. To be a person is to recognize therefore that my subjectivity is in part constituted by other persons with whom I share the social world.

In regard to the proverbial character of Ubuntu, Ramose also remarks that, “Ubuntu is also consistent with the practices of African peoples as expressed in the proverbs and aphorisms of certain Nguni languages, specifically Zulu and Sotho” (Ramose quoted in van Niekerk 2013).

In his definition of ubuntu, Metz (2007: 323) follows Tutu and Ramose when he equates Ubuntu to the idea of humanness and to the maxim a person is a person through other persons. This maxim, for Metz, “has descriptive senses to the effect that one’s identity as a human being causally and even metaphysically depends on a community.” With this submission, Metz agrees with Ramose, Samkange and Samkange, and Gade that ubuntu is about the group/community more than it is about the self.

It may be important, at this juncture, to briefly consider the historical roots of the term Ubuntu in order to buttress the foregoing. To begin with, in his attempt to trace the history of the idea of Ubuntu, Michael Onyebuchi Eze (2010: 90) remarks that, when it comes to the idea of Ubuntu, “history adopts a new posture…where it is no longer a narrative of the past only but of the moment, the present and the future.” Other than asking a series of questions relating to “history as a narrative of the moment, present and future,” he does not adequately explain why this is so. Instead, he goes further to explain the view of “history as a narrative of the past.” As a narrative of the past, Onyebuchi Eze observes thus:

Ubuntu is projected to us in a rather hegemonic format; by way of an appeal to a unanimous past through which we may begin to understand the socio-cultural imaginary of the “African” people before the violence of colonialism; an imagination that must be rehabilitated in that percussive sense for its actual appeal for the contemporary African society (2010:93).

Onyebuchi Eze seems to be suggesting that there is too much romanticisation of the past when it comes to the conceptualisation and use of the term Ubuntu. He seems to question the idea of the unanimous character of Ubuntu before “the violence of colonialism,” reducing this idea to some kind of imagination that should have no place in contemporary African society. We are compelled to agree with him to that extent. Thus, unlike many scholars of Ubuntu who have tended to gloss over the limitations of Ubuntu, Onyebuchi Eze is no doubt looking at the history of this concept with a critical eye. One of the key arguments he presents that is worthy of our attention in this article concerns the status of the individual and that of the community in the definition and conceptualisation of Ubuntu.

While many Ubuntu writers have tended to glorify community over and above the individual, Onyebuchi Eze (2008: 106) is of the view that, “the individual and the community are not radically opposed in the sense of priority but engaged in contemporaneous formation.” Thus, while we agree with Onyebuchi Eze that both the individual and the community put together define Ubuntu, we maintain that their relationship is not that of equals but that the individual is submerged within the community and the interests and aspirations of the community matter more than those of the individual. This, however, should not be interpreted to mean that the individual plays an ancillary role in the definition of Ubuntu.  Below, we outline and explain the qualities/features of hunhu/ubuntu as an aspect of ethno-philosophy.

4. The Deployment of Hunhu/Ubuntu in the Public Sphere

Hunhu/Ubuntu has dominated public discourse, especially in Zimbabwe and South Africa, where it has been used to deal with both political and social differences. In Zimbabwe, for instance, hunhu/ubuntu has been used to bring together the Zimbabwe African National Union Patriotic Front (ZANU PF) and Patriotic Front Zimbabwe African People’s Union (PF ZAPU) after political tensions that led to the Midlands and Matabeleland disturbances of the early 1980s, which saw about 20,000 people killed by the North Korean-trained Fifth Brigade. The 1987 Unity Accord was concluded in the spirit of Ubuntu, where people had to put aside their political differences and advance the cause of the nation.

The Global Political Agreement of 2008, which led to the formation of the Government of National Unity (GNU), also saw hunhu/ubuntu being deployed to deal with the political differences between ZANU PF and the Movement for Democratic Change (MDC) formations that resulted from the violent elections of June 2008. This violence had sown the seeds of fear among the generality of the Zimbabwean population, and so it took hunhu/ubuntu to remove the fear and demonstrate the spirit of “I am because we are, since we are therefore I am.” The point is that the two political parties needed each other in the interest of the development of the nation of Zimbabwe.

In South Africa, Desmond Tutu, who was the Chairperson of the Truth and Reconciliation Commission (TRC), which was formed to investigate and deal with the apartheid atrocities in the 1990s, demonstrated in his final report that it took Ubuntu for people to confess, forgive and forget. In his book No Future without Forgiveness, published in 1999, Tutu writes, “the single main ingredient that made the achievements of the TRC possible was a uniquely African ingredient – Ubuntu.” Tutu maintains that what constrained so many to choose to forgive rather than to demand retribution, to be magnanimous and ready to forgive rather than to wreak revenge, was Ubuntu (Tutu quoted in Richardson, 2008: 67). As Onyebuchi Eze (2011: 12) would put it, “the TRC used Ubuntu as an ideology to achieve political ends.” As an ideology, Ubuntu has been used as a panacea for the socio-political problems affecting the continent of Africa, especially the southern part of the continent. This means that Ubuntu as a traditional thought has not been restricted to the academy alone but has also found its place in the public sphere, where it has been utilised to solve political conflicts and thereby bring about socio-political harmony. To underscore the importance of Ubuntu as both an intellectual and a public good, Gabriel Setiloane (quoted in Vicencio, 2009: 115) remarks thus, “Ubuntu is a piece of home grown African wisdom that the world would do well to make its own.” This suggests the southern African roots of ubuntu as a traditional thought.

5. The Distinctive Qualities/Features of Hunhu/Ubuntu

Martin H Prozesky (2003: 5-6) has identified ten qualities that are characteristic of hunhu/ubuntu. Although this article utilises only Prozesky’s ten qualities, it is important to note that the philosophy of hunhu/ubuntu has more than ten qualities or characteristics. Our justification for using Prozesky’s ten qualities is that they aptly capture the essence of Ubuntu as an aspect of ethno-philosophy. This article begins by outlining Prozesky’s ten qualities before attempting to explain only four of them, namely humaneness, gentleness, hospitality and generosity. Prozesky’s ten qualities are as follows:

    1. Humaneness
    2. Gentleness
    3. Hospitality
    4. Empathy or taking trouble for others
    5. Deep Kindness
    6. Friendliness
    7. Generosity
    8. Vulnerability
    9. Toughness
    10. Compassion

Hunhu/ubuntu as an important aspect of ethno-philosophy is an embodiment of these qualities. While Ramose uses humaneness to define hunhu/ubuntu, Samkange and Samkange use humanism to define and characterise the same. The impression one gets is that the former is similar to the latter. But this is far from the truth. Thus, with regard to the dissimilarity between humaneness and humanism, Gade (2011: 308) observes:

I have located three texts from the 1970s in which Ubuntu is identified as ‘African humanism.’ The texts do not explain what African humanism is, so it is possible that their authors understood African humanism as something different from a human quality.

Granted that this may be the case, the question then is: What is the difference between humaneness and humanism, and between African humaneness and African humanism, as aspects of hunhu/ubuntu philosophy? While humaneness may refer to the essence of being human, including the character traits that define it (Dolamo, 2013: 2), humanism is an ideology, an outlook or a thought system in which human interests and needs are given more value than the interests and needs of other beings (cf. Flexner, 1988: 645). Taken together, humaneness and humanism become definitive aspects of hunhu/ubuntu only if the prefix ‘African’ is added to them to give African humaneness and African humanism respectively. African humaneness would, then, entail that the qualities of selflessness and commitment to one’s group or community are more important than the selfish celebration of individual achievements and dispositions.

African humanism, on the other hand, would then refer to an ideology, outlook or thought system that values peaceful co-existence and the valorisation of community. In other words, it is a philosophy that sees human needs, interests and dignity as of fundamental importance and concern (Gyekye 1997: 158). Gyekye maintains that African humanism “is quite different from the Western classical notion of humanism which places a premium on acquired individual skills and favours a social and political system that encourages individual freedom and civil rights” (1997: 158).

Thus, among the Shona people of Zimbabwe, the expression munhu munhu muvanhu, which in isiNdebele and Zulu translates to umuntu ngumuntu ngabantu, and which in English means “a person is a person through other persons,” best explains the idea of African humanism (cf. Mangena 2012a; Mangena 2012b; Shutte 2008; Tutu 1999).

Onyebuchi Eze (2011: 12) adds his voice to the definition and characterisation of African humanism when he observes that:

As a public discourse, Ubuntu/botho has gained recognition as a peculiar form of African humanism, encapsulated in the following Bantu aphorisms, like Motho ke motho ka batho babang; Umuntu ngumuntu ngabantu (a person is a person through other people). In other words, a human being achieves humanity through his or her relations with other human beings.

Whether one prefers humaneness or humanism, the bottom line is that the two are definitive aspects of the philosophy of hunhu/ubuntu which places the communal interests ahead of the individual interests. Of course, this is a position which Onyebuchi Eze would not buy given that in his view, the community cannot be prioritised over the individual as:

The relation with ‘other’ is one of subjective equality, where the mutual recognition of our different but equal humanity opens the door to unconditional tolerance and a deep appreciation of the ‘other’ as an embedded gift that enriches one’s humanity (2011: 12).

Some believe that what distinguishes an African of black extraction from a Westerner is the view that the former is a communal being while the latter takes pride in the idea of selfhood or individualism. To these people, the moment we take the individual and the community as subjective equals [as Onyebuchi Eze does], we end up failing to draw the line between what is African and what is fundamentally Western.

Having defined humaneness, this article will now define and characterise the quality of gentleness as understood through hunhu/ubuntu. Gentleness encompasses softness of heart and being able to sacrifice one’s time for others. Thus, being gentle means being tender-hearted and having the ability to spend time attending to other people’s problems. Gentleness is a quality of the traditional thought of hunhu/ubuntu in that it resonates with John S Mbiti’s dictum: “I am because we are, since we are therefore I am” (1969: 215). The point is that with gentleness, one’s humanity is inseparably bound to that of others. Eric K Yamamoto (1997: 52) puts it differently in reference to the altruistic character of Ubuntu philosophy when he remarks thus:

Ubuntu is the idea that no one can be healthy when the community is sick. Ubuntu says I am human only because you are human. If I undermine your humanity, I dehumanise myself.

Both the definition of gentleness provided above and Mbiti’s dictum are equivalent to Yamamoto’s understanding of gentleness in that they both emphasise otherness rather than the self.

The attribute of hospitality also defines hunhu/ubuntu philosophy. Hospitality generally means being able to take care of your visitors in such a way that they feel comfortable having you as their host, and the relationship is not commercial. However, the Western definition of hospitality is such that the host goes out of his or her way to provide for the needs of his or her guests in return for some payment. This, however, should not be interpreted to mean that the Westerner is not hospitable outside of commerce. No doubt, Westerners can also be hospitable, but it is the magnitude of hospitality that differs.

In the case of the Shona/Ndebele communities in Africa, where hospitality is given for free, as when one provides accommodation and food to a stranger at his or her home, the magnitude is high. Coming to the idea of hospitality in Africa, it is important to note that in traditional Shona/Ndebele society, when a person travelled a long way in search of a relative, they would often tire before reaching the relative’s home, and it was common for them to be accommodated along the way for a day or two before they got there. During their short stay, they would be provided with food, accommodation and warm clothes (if they happened to travel during the winter season).

Among the Korekore-Nyombwe people of Northern Zimbabwe, strangers would be given water to drink before asking for directions or before they ask for accommodation in transit. The thinking was that the stranger would have travelled a very long distance and was probably tired and thirsty and so there was need to give them water to quench their thirst. Besides, water (in Africa) symbolises life and welfare and so by giving strangers water they were saying that life needed to be sustained and that as Africans, we are “our brothers’ keepers.” Thus, hunhu/ubuntu hospitality derives its impetus from this understanding that the life and welfare of strangers is as important as our own lives and welfare.

Now, this is different from the idea of home and hospitality in Western Cosmopolitans where a home is a place of privacy. Most homes in the West have durawalls or high fences to maximise the privacy of the owner and so a stranger cannot just walk in and be accommodated. This is quite understandable because in Western societies, the individual is conceived of as the centre of human existence and so there is need to respect his or her rights to privacy.  In the West, the idea of a stranger walking into a private space is called trespassing and one can be prosecuted for this act. And yet in African traditional thought, in general, and in the Shona/Ndebele society, in particular, the idea of trespassing does not exist in that way.

In pre-colonial Shona/Ndebele society, by contrast, the community was at the centre of human existence, and that is why the pre-colonial Shona/Ndebele people would easily accommodate strangers or visitors without asking many questions. Owing to the colonisation of Africa, some Africans have adopted the Western style of individual privacy, but this is contrary to hunhu/ubuntu hospitality, which is still practised in most Shona/Ndebele rural communities today. The point is that philosophies of hospitality, identity and belonging are more clearly played out on the home front than in the public sphere.

The last attribute to be discussed in this section is generosity. Generally, generosity refers to freedom or liberality in giving (Flexner 1988: 550). The attribute of generosity in Southern African thought is best expressed proverbially. In Shona culture, for instance, there are proverbs that explain the generosity of the Shona people or vanhu. Some of these include: Muenzi haapedzi dura (A visitor does not finish food), Chipavhurire uchakodzwa (The one who gives too much will also receive too much), Chawawana idya nehama mutogwa unokangamwa (Share whatever you get with your relatives because strangers are very forgetful) and Ukama igasva hunazadziswa nekudya (Relations cannot be complete without sharing food).

These proverbs not only demonstrate that Bantu people are generous people, but the proverbs also say something about the hunhu/ubuntu strand that runs through the traditional thought of almost all the Bantu cultures of Southern Africa whereby everything is done to promote the interests of the group or community. The proverbs show that the Bantu people are selfless people as summarised by the Nguni proverb which we referred to earlier, which says: Umuntu ngumuntu ngabantu (a person is a person through other persons) or as they put it in Shona: Munhu munhu muvanhu. Without the attribute of generosity, it may be difficult to express one’s selflessness.

6. The Components of Hunhu/Ubuntu

This section outlines the components of hunhu/ubuntu traditional philosophy, showing how these differ from the branches of Western philosophy. These components will be outlined as hunhu/ubuntu metaphysics, hunhu/ubuntu ethics and hunhu/ubuntu epistemology. The objective is to show that while Western philosophy is persona-centric and is summarised by Descartes’ famous phrase, Cogito ergo sum, which translates into English as “I think therefore I am,” hunhu/ubuntu traditional philosophy is communo-centric and is summarised by Pobee’s famous dictum, Cognatus ergo sum, which translates into English as “I am related by blood, therefore, I exist.” In much simpler terms, while Western philosophy emphasises the self and selfhood through the promotion of individual rights and freedoms, hunhu/ubuntu traditional thought emphasises the importance of the group or community through the promotion of group or communal interests.

a. Hunhu/Ubuntu Metaphysics

Before defining and characterising hunhu/ubuntu metaphysics, it is important to begin by defining the term Metaphysics itself. For lack of a better word in African cultures, the article will define metaphysics from the standpoint of Western philosophy. The article will then show that this definition, though it gives us a head start, can only partially be applied to non-Western cultures. To begin with, in the history of Western philosophy, Metaphysics is widely regarded as the most ancient branch of philosophy, and it was originally called first philosophy (Steward and Blocker 1987: 95). The term Metaphysics is but an accident of history, as it is thought to have resulted from an editor’s mistake as “he was sorting out Aristotle’s works in order to give them titles, several decades after Aristotle had died. It is thought that the editor came across a batch of Aristotle’s writings that followed The Physics and he called them Metaphysics, meaning After Physics” (1987: 96).

Metaphysics, then, is a branch of philosophy that deals with the nature of reality. It asks questions such as: What is reality? Is reality material, physical or an idea? As one tries to answer these questions, a world is opened to him or her that enables him or her to identify, name and describe the kinds of beings that exist in the universe. Thus, two words define being, namely: ontology and predication. While pre-Socratics such as Thales, Anaximander, Anaximenes, Heraclitus, Parmenides and others defined being in terms of appearance and reality as well as change and permanence, Classical philosophers such as Socrates/Plato and Aristotle defined change in terms of form and matter.

While for Socrates/Plato form was more real than matter and existed in a different realm, Aristotle argued that form and matter together formed substance, which was reality. Although the idea of being has always been defined in individual terms in the history of Western philosophy, it was given its definitive character by the French philosopher René Descartes, who defined it in terms of what he called Cogito ergo sum, which translates into English as “I think therefore I am.” Thus, the individual character of Western philosophy was firmly established with the popularisation of Descartes’ Cogito. A question can be asked: Does this understanding of metaphysics also apply to non-Western cultures? The answer is yes and no. Yes, in the sense that in non-Western cultures being is also explained in terms of appearance and reality as well as change and permanence; and no, in the sense that non-Western philosophies, especially the hunhu/ubuntu traditional philosophy of Southern Africa, have a communal character, not an individual character. What, then, is hunhu/ubuntu metaphysics?

Hunhu/ubuntu metaphysics is a component of hunhu/ubuntu traditional philosophy that deals with the nature of being as understood by people from Southern Africa. As we have already intimated, in Southern African traditional thought, being is understood in the communal, physical and spiritual sense. Thus, a human being is always in communion with other human beings as well as with the spiritual world. Sekou Toure (1959) calls this “the communion of persons” whereby being is a function of the “us” or “we” as opposed to the “I” as found in “the autonomy of the individuals” that is celebrated in the West and is especially more revealing in Descartes’ Cogito.  Pobee (1979) defines the African being in terms of what he calls Cognatus ergo sum which means “I am related by blood, therefore, I exist.” What this suggests is that in Southern Africa, just like in the rest of Sub-Saharan Africa, the idea of being is relational.

Coming to the communion of human beings with the spiritual world, it is important to remark that the idea of being has its full expression through participation. Just as Socrates/Plato’s matter partakes in the immutable forms, being in the Shona/Ndebele society depends solely on its relationship with the spiritual world, which is populated by ancestral spirits, avenging spirits, alien spirits and the greatest spiritual being called Musikavanhu/Nyadenga/unkulunkulu (The God of Creation). The greatest being works with his lieutenants, ancestors and other spirits to protect the interests of the lesser beings, vanhu/abantu. In return, vanhu/abantu enact rituals of appeasement so that it does not become a one-way kind of interaction. It is, however, important to note that while Socratic/Platonic Metaphysics is dualistic in character, hunhu/ubuntu Metaphysics is onto-triadic or tripartite in character. It involves the Supreme Being (God), other lesser spirits (ancestral, alien and avenging) and human beings.

b. Hunhu/Ubuntu Ethics

Hunhu/ubuntu ethics refer to the idea of hunhu/ubuntu moral terms and phrases such as tsika dzakanaka kana kuti dzakaipa (good or bad behaviour), kuzvibata kana kuti kusazvibata (self-control or reckless behaviour), kukudza vakuru (respecting or disrespecting elders) and kuteerera vabereki (being obedient or disobedient to one’s immediate parents and the other elders of the community) among others. In Shona society they say: Mwana anorerwa nemusha kana kuti nedunhu (It takes a clan, village or community to raise a child). Having defined hunhu/ubuntu ethics, it is important to distinguish them from hunhu/ubuntu morality which relates to the principles or rules that guide and regulate the behaviour of vanhu or abantu (human beings in the Shona/Ndebele sense of the word) within Bantu societies.

What distinguishes hunhu/ubuntu ethics from Western ethics is that the former are both upward-looking/transcendental and lateral, while the latter are only lateral. This section will briefly distinguish an upward-looking/transcendental kind of hunhu/ubuntu ethic from a lateral kind of hunhu/ubuntu ethic. By upward-looking/transcendental is meant that hunhu/ubuntu ethics are not confined to the interaction between humans; they also involve spiritual beings such as Mwari/Musikavanhu/Unkulunkulu (Creator God), Midzimu (ancestors) and Mashavi (Alien spirits). Thus, hunhu/ubuntu ethics are spiritual, dialogical and consensual (cf. Nafukho 2006). By dialogical and consensual is meant that the principles that guide and regulate the behaviour of vanhu/abantu are products of the dialogue between spiritual beings and human beings and the consensus that they reach. By lateral is meant that these principles or rules are crafted solely to guide human interactions.

As Mangena (2012: 11) would put it, hunhu/ubuntu ethics proceed through what is called the Common Moral Position (CMP). The CMP is not a position established by one person, as is the case with Plato’s justice theory, Aristotle’s eudaimonism, Kant’s deontology or Bentham’s hedonism (2012: 11). With the CMP, the community is the source, author and custodian of moral standards, and personhood is defined in terms of conformity to these established moral standards, whose objective is to have a person who is communo-centric rather than one who is individualistic. In Shona/Ndebele society, for instance, respect for elders is one of the ways in which personhood can be expressed, with the goal being to uphold communal values. It is within this context that respect for elders is a non-negotiable matter, since the elders are the custodians of these values and fountains of moral wisdom.

Thus, one is born and bred in a society that values respect for the elderly and he or she has to conform. One important point to note is that the process of attaining the CMP is dialogical and spiritual in the sense that elders set moral standards in consultation with the spirit world which, as intimated earlier, is made up of Mwari/Musikavanhu/Unkulunkulu (Creator God) and Midzimu (ancestors), and these moral standards are upheld by society (2012: 12). These moral standards, which make up the CMP, are not forced on society, as the elders (who represent society), Midzimu (who convey the message to Mwari) and Mwari (who gives a nod of approval) ensure that the standards are there to protect the interest of the community at large.

Communities are allowed to exercise their free will but remain responsible for the choices they make as well as their actions. For instance, if a community chooses to ignore the warnings of the spirit world regarding an impending danger such as a calamity resulting from failure by that community to enact an important ritual that will protect members of that community from say, flooding or famine; then the community will face the consequences.

c. Hunhu/Ubuntu Epistemology

What is epistemology? In the Western sense of the word, epistemology deals with the meaning, source and nature of knowledge. Western philosophers differ when it comes to the sources of knowledge, with some arguing that reason is the source of knowledge while others view experience or the use of the senses as the gateway to knowledge. This article will not delve much into these arguments, since they have already found an audience; instead, it focuses on hunhu/ubuntu epistemology. However, one cannot define and characterise hunhu/ubuntu traditional epistemology without first defining and demarcating the province of African epistemology as opposed to Western epistemology.

According to Michael Battle (2009: 135), “African epistemology begins with community and moves to individuality.” Thus, the idea of knowledge in Africa resides in the community and not in the individuals that make up the community. Inherent in the powerful wisdom of Africa is the ontological need of the individual to know self and community (2009: 135), and discourses on hunhu/ubuntu traditional epistemology stem from this wisdom. As Mogobe Ramose (1999) puts it, “the African tree of knowledge stems from ubuntu philosophy. Thus, ubuntu is a well spring that flows within African notions of existence and epistemology in which the two constitute a wholeness and oneness.” Just like hunhu/ubuntu ontology, hunhu/ubuntu epistemology is experiential.

In Shona society, for instance, the idea of knowledge is expressed through listening to elders telling stories of their experiences as youths and how such experiences can be relevant to the lives of the youths of today. Sometimes, they use proverbs to express their epistemology. The proverb Rega zvipore akabva mukutsva (Experience is the best teacher) is a case in point. One comes to know that promiscuity is bad after having been involved in it and having contracted a Sexually Transmitted Infection (STI) or suffered other bad consequences. No doubt, this person will be able to tell others that promiscuity is bad because of his or her experiences. The point is that hunhu/ubuntu epistemology is a function of experience. In Shona, they also say: Takabva noko kumhunga hakuna ipwa (We passed through the millet field and we know that there are no sweet reeds there). The point is that one gets to know that there are no sweet reeds in a millet field because he or she passed through the millet field. One has to use the senses to discern knowledge.

7. Conclusion

In this article, the traditional philosophy of hunhu/ubuntu was defined and characterised with a view to showing that Africa has a traditional philosophy and ethic which are distinctively communal and spiritual. This philosophy was also discussed with reference to how it has been deployed in the public sphere in both Zimbabwe and South Africa. The key distinctive qualities of this traditional philosophy were spelt out as humaneness, gentleness, hospitality and generosity. This philosophy was also discussed within the context of its three main components, namely: hunhu/ubuntu metaphysics, hunhu/ubuntu ethics and hunhu/ubuntu epistemology. In the final analysis, it was explained that hunhu/ubuntu metaphysics, hunhu/ubuntu ethics and hunhu/ubuntu epistemology form the aspects of what is known today as traditional southern African thought.

8. References and Further Reading

  • Appiah, K.A. 1992. In My Father’s House: Africa in the Philosophy of Culture. New York: Oxford University Press.
    • A thorough treatise of the idea of Africa in the philosophy of culture
  • Battle, M. 2009. Ubuntu: I in You and You in Me. New York: Seabury Publishing
    • A discussion of Ubuntu and how this idea has benefitted the Western world.
  • Dolamo, R. 2013. “Botho/Ubuntu: The Heart of African Ethics.” Scriptura, 112 (1), pp. 1-10
    • A thorough treatise on the notion of Ubuntu and its origin in Africa
  • Eze, M.O. 2011. “I am Because You Are.” The UNESCO Courier, pp. 10-13
    • A philosophical analysis of the idea of ubuntu
  • Eze, M.O. 2010. Intellectual History in Contemporary South Africa. New York: Palgrave Macmillan
    • A detailed outline of the definition and characterization of intellectual history in Contemporary Africa
  • Eze, M.O. 2008. “What is African Communitarianism? Against Consensus as a regulative ideal.” South African Journal of Philosophy. 27 (4), pp. 106-119
    • A philosophical discussion of the notions of community and individuality in African thought
  • Flexner, S et al. 1988. The Random House Dictionary. New York: Random House.
    • One of the best dictionaries used by academics
  • Gade, C.B.N. 2011. “The Historical Development of the Written Discourses on Ubuntu.” South African Journal of Philosophy, 30(3), pp. 303-330
    • A philosophical discussion of the historical development of the Ubuntu discourse in Southern Africa
  • Gade, C.B.N. 2012. “What is Ubuntu? Different Interpretations among South Africans of African Descent.” South African Journal of Philosophy, 31 (3), pp.484-503
    • A Case-study on how South Africans of African descent interpret ubuntu
  • Gade, C.B.N. 2013. “Restorative Justice and the South African Truth and Reconciliation Process.” African Journal of Philosophy, 32(1), pp. 10-35
    • A philosophical discussion of the origins of the idea of Restorative Justice
  • Gyekye, K. 1997. Tradition and Modernity: Reflections on the African Experience. New York: Oxford University Press
    • A philosophical rendition of the concepts of tradition and modernity in Africa
  • Hurka, T. 1993. Perfectionism. New York: Oxford University Press
    • A discussion on the notion of perfectionism
  • Makinde, M.A. 1988. African philosophy, Culture and Traditional Medicine. Athens: Africa Series number 53.
    • A thorough treatise on culture and philosophy in African thought
  • Mangena, F. 2012a. On Ubuntu and Retributive Justice in Korekore-Nyombwe Culture: Emerging Ethical Perspectives. Harare: Best Practices Books
    • A philosophical discussion of the place of Ubuntu and culture in the death penalty debate
  • Mangena, F. 2012b. “Towards a Hunhu/Ubuntu Dialogical Moral Theory.” Phronimon: Journal of the South African Society for Greek Philosophy and the Humanities, 13 (2), pp. 1-17
    • A philosophical discussion of the problems of applying Western ethical models in non-Western cultures
  • Mangena, F. 2014. “In Defense of Ethno-philosophy: A Brief Response to Kanu’s Eclecticism.” Filosofia Theoretica: A Journal of Philosophy, Culture and Religions, 3 (1), pp. 96-107
    • A reflection on the importance of ethno-philosophy in the African philosophy debate
  • Mangena, F. 2015. “Ethno-philosophy as Rational: A Reply to Two Famous Critics.” Thought and Practice: A Journal of the Philosophical Association of Kenya, 6 (2), pp. 24-38
    • A reaction to the Universalists regarding the place of ethno-philosophy in African thought
  • Mbiti, J.S. 1969. African Religions and Philosophy. London: Heinemann
    • A discussion of community in African philosophy
  • Metz, T. 2007. “Towards an African Moral Theory.” The Journal of Political Philosophy, 15(3), pp. 321-341
    • A philosophical outline of what Thaddeus Metz perceives as African ethics
  • Nafukho, F.M. 2006. “Ubuntu Worldview: A Traditional African View of Adult Learning in the Workplace.” Advances in Developing Human Resources, 8(3), pp. 408-415
    • A thorough treatise on the three pillars of ubuntu
  • Pobee, J.S. 1979. Towards an African Theology. Nashville: Abingdon Press.
    • A theological discussion of the notions of community and individuality in African thought
  • Prozesky, M.H. 2003. Frontiers of Conscience: Exploring Ethics in a New Millennium. Cascades: Equinym Publishing
    • An outline of Ubuntu’s ten qualities
  • Ramose, M.B. 1999. African Philosophy through Ubuntu. Harare: Mond Books.
    • A thorough discussion on the nature and character of ubuntu
  • Ramose, M.B. 2007. “But Hans Kelsen was not born in Africa: A reply to Metz.” South African Journal of Philosophy, 26(4), pp. 347-355
    • Ramose’s response to Thaddeus Metz’s claim that African ethics lack a basic norm
  • Ramose, M.B. 2014b.  “Ubuntu: Affirming Right and Seeking Remedies in South Africa.” In: L Praeg and S Magadla (Eds.). Ubuntu: Curating the Archive (pp. 121-1346). Scottsville: University of KwaZulu Natal Press
    • A discussion of Ubuntu as affirming right and wrong in South Africa
  • Samkange, S and Samkange, T.M. 1980. Hunhuism or Ubuntuism: A Zimbabwean Indigenous Political Philosophy. Salisbury: Graham Publishing
    • A philosophical handbook on notions of Hunhu/Ubuntu in Zimbabwe
  • Steward, D and Blocker H.G. 1987. Fundamentals of Philosophy. New York: Macmillan Publishing Company
    • A discussion of key topics in Western philosophy
  • Shutte, A. 2008. “African Ethics in a Globalizing World.” In: R Nicolson (Ed.). Persons in Community: African Ethics in a Global Culture (pp. 15-34). Scottsville: University of KwaZulu Natal Press
    • A philosophical discussion of African ethics and its place in the globe
  • Taylor, D.F.P. 2013. “Defining Ubuntu for Business Ethics: A Deontological Approach.” South African Journal of Philosophy, 33(3), pp.331-345
    • An attempt to apply Ubuntu in the field of Business in Africa
  • Toure, S. 1959. “The Political Leader considered as the Representative of Culture.” http://www.blackpast.org/1959-sekou-toure-political-leader-considered-representative-culture
    • A discussion of the link between leadership, politics and culture in Africa
  • Tutu, D. 1999. No Future without Forgiveness. New York: Doubleday
    • A philosophical discussion of the findings of the Truth and Reconciliation Commission in South Africa
  • van Niekerk, J. 2013. “Ubuntu and Moral Value.” Johannesburg (PhD Dissertation submitted to the Department of Philosophy, University of Witwatersrand)
    • A philosophical rendition of the discourse of ubuntu and moral value.
  • Yamamoto, E.K. 1997. “Race Apologies.” Journal of Gender, Race and Justice, Vol. 1, pp. 47-88
    • A critical reflection on the nexus of Ubuntu, race, gender and justice
  • Vicencio, C.V. 2009. Walk with Us and Listen: Political Reconciliation in Africa. Cape Town: University of Cape Town Press
    • A philosophical discussion of political reconciliation in Africa.
  • Richardson, N. R. 2006. “Reflections on Reconciliation and Ubuntu.” In: R Nicolson (Ed.). Persons in Community: African Ethics in a Global Culture. Scottsville: University of KwaZulu Natal Press
    • A discussion on reconciliation in light of the Truth and Reconciliation Commission in South Africa.

 

Author Information

Fainos Mangena
Email: fainosmangena@gmail.com
University of Zimbabwe
Zimbabwe

History of African Philosophy

This article traces the history of systematic African philosophy from the early 1920s to date. In Plato’s Theaetetus, Socrates suggests that philosophy begins with wonder. Aristotle agreed. However, recent research shows that wonder may have different subsets. If that is the case, which specific subset of wonder inspired the beginning of the systematic African philosophy? In the history of Western philosophy, there is the one called thaumazein interpreted as ‘awe’ and the other called miraculum interpreted as ‘curiosity’. History shows that these two subsets manifest in the African place as well, even during the pre-systematic era. However, there is now an idea appearing in recent African philosophy literature called ọnụma interpreted as ‘frustration,’ which is regarded as the subset of wonder that jump-started the systematic African philosophy. In the 1920s, a host of Africans who went to study in the West were just returning. They had experienced terrible racism and discrimination while in the West. They were referred to as descendants of slaves, as people from the slave colony, as sub-humans, and so on. On return to their native lands, they met the same maltreatment by the colonial officials. ‘Frustrated’ by colonialism and racialism as well as the legacies of slavery, they were jolted onto the path of philosophy—African philosophy—by what can be called ọnụma.

These ugly episodes of slavery, colonialism and racialism not only shaped the world’s perception of Africa; they also instigated a form of intellectual revolt from the African intelligentsia. The frustration with the colonial order eventually led to angry questions and reactions out of which African philosophy emerged, first in the form of nationalisms, and then in the form of ideological theorizations. But the frustration was born out of the colonial caricature of Africa as culturally naïve, intellectually docile and rationally inept. This caricature was created by European scholars such as Kant, Hegel and, much later, Levy-Bruhl, to name just a few. It was the reaction to this caricature that led some African scholars returning from the West into the type of philosophizing one can describe as systematic, beginning with the identity of the African people, their place in history, and their contributions to civilization. To dethrone the colonially built episteme became a ready outlet for African scholars’ vexed frustrations. Thus began the history of systematic African philosophy with the likes of JB Danquah, Meinrad Hebga, George James, SK Akesson, Aime Cesaire, Leopold Senghor, Kwame Nkrumah, Julius Nyerere, William Abraham, John Mbiti and others such as Placid Tempels and Janheinz Jahn, to name a few.

Table of Contents

  1. Introduction
  2. Criteria of African Philosophy
  3. Methods of African Philosophy
    1. The Communitarian Method
    2. The Complementarity Method
    3. The Conversational Method
  4. Schools of African Philosophy
    1. Ethnophilosophy School
    2. Nationalist/Ideological School
    3. Philosophic Sagacity
    4. Hermeneutical School
    5. Literary School
    6. Professional School
    7. Conversational School
  5. The Movements in African Philosophy
    1. Excavationism
    2. Afro-Constructionism/Afro-Deconstructionism
    3. Critical Reconstructionism/Afro-Eclecticism
    4. Conversationalism
  6. Epochs in African Philosophy
    1. Pre-systematic Epoch
    2. Systematic Epoch
  7. Periods of African Philosophy
    1. Early Period
    2. Middle Period
    3. Later Period
    4. New Era
  8. Conclusion
  9. References and Further Reading

1. Introduction

African philosophy as a systematic study has a very short history. This history is also a very dense one since actors sought to do in a few decades what would have been better done in many centuries. As a result, they also did in later years what ought to have been done earlier and vice versa, thus making the early and the middle periods overlap considerably. The reason for this overtime endeavor is not far-fetched. Soon after colonialism, actors realized that Africa had been sucked into the global matrix unprepared. During colonial times, the identity of the African was European; his thought system, standard and even his perception of reality were structured by the colonial shadow which stood towering behind him. It was easy for the African to position himself within these Western cultural appurtenances even though they had no real connection with his being.

The vanity of this presupposition and the emptiness of colonial assurances manifested soon after the towering colonial shadow vanished. Now, in the global matrix, it became shameful for the African to continue to identify himself within the European colonialist milieu. For one, he had just rejected colonialism, and for another, the deposed European colonialist made it clear that the identity of the African was no longer covered and insured by the European medium. So, actors realized suddenly that they had been disillusioned and had suffered severe self-deceit under the colonial temper. The question which trailed every African was, “Who are you?” Of course, the answers from the European perspective were savage, primitive, less than human, and so on. It was the urgent, sudden need to contradict these European positions that led some post-colonial Africans in search of African identity. So, to discover or rediscover African identity in order to initiate a non-colonial or original history for Africa in the global matrix and start a course of viable economic, political and social progress that is entirely African became one of the focal points of African philosophy. Here, the likes of Cesaire, Nkrumah and Leon Damas began articulating the negritude movement.

While JB Danquah (1928, 1944) and SK Akesson (1965) rationally investigated topics in African politics, law and metaphysics, George James (1954) reconstructed African philosophical history and Meinrad Hebga (1958) probed topics in African logic. These thinkers represent some of the early African philosophers. Placid Tempels (1959), the European missionary, also elected to help and, in his controversial book, Bantu Philosophy, sought to create Africa’s own philosophy as proof that Africa has its own peculiar identity and thought system. However, it was George James who attempted a much more ambitious project in his work, Stolen Legacy. In this work, there were strong suggestions not only that Africa had philosophy but that the so-called Western philosophy, the very bastion of European identity, was stolen from Africa. This claim was intended to make the proud European colonialists feel indebted to the humiliated Africans, but it was unsuccessful. That Greek philosophy had roots in Egypt does not imply, as some claim, that Egyptians were high-melanated or that high-melanated Africans created Egyptian philosophy. The use of the term “Africans” in this work is in keeping with George James’ demarcation that precludes the low-melanated people of North Africa and refers to the high-melanated people south of the Sahara.

Besides the two above, other Africans contributed ideas. Aime Cesaire, John Mbiti, Odera Oruka, Julius Nyerere, Leopold Senghor, Nnamdi Azikiwe, Kwame Nkrumah, Obafemi Awolowo, Alexis Kagame, Uzodinma Nwala, Emmanuel Edeh, Innocent Onyewuenyi, and Henry Olela, to name just a few, opened the doors of ideas. A few of the works produced sought to prove and establish the philosophical basis of a unique African identity in the history of humankind, while others sought to chart a course of Africa’s true identity through unique political and economic ideologies. It can be stated that many of these endeavors fall under the early period.

For its concerns, the middle period of African philosophy is characterized by the Great Debate. Those who sought to clarify and justify the position held in the early period and those who sought to criticize and deny the viability of such a position entangled themselves in a great debate. Some of the actors on this front include C. S. Momoh, Robin Horton, Henri Maurier, Lansana Keita, Peter Bodunrin, Kwasi Wiredu, Kwame Gyekye, Richard Wright, Barry Hallen, Joseph Omoregbe, C. B. Okolo, Theophilus Okere, Paulin Hountondji, Gordon Hunnings, Odera Oruka and Sophie Oluwole, to name a few.

The middle period eventually gave way to the later period, which had as its focus the construction of an African episteme. Two camps rivaled each other, namely the Critical Reconstructionists, who are the evolved Universalists/Deconstructionists, and the Eclectics, who are the evolved Traditionalists/Excavators. The former sought to build an African episteme untainted by ethnophilosophy, whereas the latter sought to do the same by a delicate fusion of relevant ideals of the two camps. In the end, Critical Reconstructionism ran into a brick wall when it became clear that whatever it produced could not truly be called African philosophy if it was all Western without African marks. The mere claim that it would be African philosophy simply because it was produced by Africans (Hountondji 1996 and Oruka 1975) would collapse like a house of cards under any argument. For this great failure, the influence of Critical Reconstructionism in the later period was whittled down, and it was later absorbed by its rival, Eclecticism.

The works of the Eclectics heralded the emergence of the New Era in African philosophy. The focus became conversational philosophizing, in which the production of a philosophically rigorous and original African episteme, better than what the Eclectics produced, occupied center stage.

Overall, the sum of what historians of African philosophy have done can be presented in two broad categories, namely the Pre-systematic epoch and the Systematic epoch. The former refers to Africa’s philosophical culture and the thoughts of the anonymous African thinkers, and may include the problems of Egyptian and Ethiopian legacies. The latter refers to the periods marking the return of Africa’s first eleven, Western-trained philosophers from the 1920s to date. This latter category could further be delineated into four periods:

    1. Early period 1920s – 1960s
    2. Middle period 1960s – 1980s
    3. Later period 1980s – 1990s
    4. New (Contemporary) period since 1990s

Note, of course, that this does not commit us to saying that, before the early period, people in Africa never philosophized; they did. But one fact that must not be denied is that many of their thoughts were not documented in writing; most of those that may have been documented in writing are either lost or destroyed, and, as such, scholars cannot attest to their systematicity or sources. In other words, what this periodization shows is that African philosophy as a system first began in the late 1920s. There are, of course, documented writings in ancient Egypt, medieval Ethiopia, etc. The historian Cheikh Anta Diop (1974) has gazetted some of the ideas. Some of the popularly cited figures include St Augustine, who was born in present-day Algeria but, being a Catholic priest of the Roman Church, was trained in Western-style philosophy and is counted amongst the medieval philosophers; Wilhelm Anton Amo, who was born in Ghana, West Africa, was sold into slavery as a little boy, and was later educated in Western-style philosophy in Germany, where he also practised; and Zera Yacob and Walda Heywat, both Ethiopian philosophers with Arabic and European educational influences. The question is, are the ideas produced by these people indubitably worthy of the name ‘African philosophies’? Their authors may be Africans by birth, but how independent are their views from foreign influences? We observe from these questions that the best that can be expected is a heated controversy. It would be uncharitable to say to the European historian of philosophy that St Augustine or Amo was not one of their own. Similarly, it may be uncharitable to say to the African historian that Amo or Yacob was not an African. But does being an African translate to being an African philosopher? If we set sentiments aside, it would be less difficult to see that all there is in those questions is a controversy.
Even if there were any substance beyond controversy, were those isolated and disconnected views (most of which were sociological, religious, ethnological and anthropological) from Egypt, Rome, Germany and Ethiopia adequate to form a coherent corpus of African philosophy? The conversationalists, a contemporary African philosophical movement, have provided us with a via media out of this controversy. Rather than discard this body of knowledge as non-African philosophy or uncritically accept it as African philosophy, as the likes of Obi Oguejiofor and Anke Graness do, the conversationalists urge that it be discussed as part of the pre-systematic epoch, which also includes those Innocent Asouzu (2004) describes as the “Anonymous Traditional African Philosophers”. These are the ancient African philosophers whose names were forgotten through the passage of time, and whose ideas were transmitted through orality.

Because there are credible objections among African philosophers to its inclusion in the historical chart of African philosophy, the Egyptian question (the idea that the creators of ancient Egyptian civilization were high-melanated Africans from the south of the Sahara) will be included as part of the controversies in the pre-systematic epoch. The main objection is that even if the philosophers of stolen legacy were able to prove a connection between Greece and Egypt, they could not prove in concrete terms that the Egyptians who created the philosophy stolen by the Greeks were high-melanated Africans or that high-melanated Africans were Egyptians. The frustration and desperation that motivated such an ambitious effort in the ugly colonial era captured above are understandable, but any reasonable person, judging by the responses of time and events in the last few decades, would know that it is high time Africans abandoned that unproven legacy and let go of that now helpless propaganda. If, however, some would want to retain it as part of African philosophy, it would properly fall within the pre-systematic epoch.

In this essay, the discussion will focus on the history of systematic African philosophy touching prominently on the criteria, schools, movements and periods in African philosophy. As much as the philosophers of a given era may disagree, they are inevitably united by the problem of their epoch. That is to say, it is orthodoxy that each epoch is defined by a common focus or problem. Therefore, the approach of the study of the history of philosophy can be done either through personality periscope or through the periods, but whichever approach one chooses, he unavoidably runs into the person who had chosen the other. This is a sign of unity of focus. Thus philosophers are those who seek to solve the problem of their time. In this presentation, the study of the history of African philosophy will be approached from the perspectives of criteria, periods, schools, and movements. The personalities will be discussed within these purviews.

2. Criteria of African Philosophy

To start with, more than three decades of debate on the status of philosophy ended with the affirmation that African philosophy exists. But what is it that makes a philosophy African? Answers to this question polarized actors into two main groups, namely the Traditionalists and Universalists. Whereas the Traditionalists aver that the studies of the philosophical elements in world-view of the people constitute African philosophy, the Universalists insist that it has to be a body of analytic and critical reflections of individual African philosophers. Further probing of the issue was done during the debate by the end of which the question of what makes a philosophy “African” produced two contrasting criteria. First, as a racial criterion; a philosophy would be African if it is produced by Africans. This is the view held by people like Paulin Hountondji, Odera Oruka (in part), and early Peter Bodunrin, derived from the two constituting terms—“African” and “philosophy”. African philosophy following this criterion is the philosophy done by Africans. This has been criticized as inadequate, incorrect and exclusivist. Second, as a tradition criterion; a philosophy is “African” if it designates a non-racial-bound philosophy tradition where the predicate “African” is treated as a solidarity term of no racial import and where the approach derives inspiration from African cultural background or system of thought. It does not matter whether the issues addressed are African or that the philosophy is done by an African insofar as it has universal applicability and emerged from the purview of African system of thought. African philosophy would then be that rigorous discourse of African issues or any issues whatsoever from the critical eye of African system of thought. Actors like Odera Oruka (in part), Meinrad Hebga, C. S. Momoh, Udo Etuk, Joseph Omoregbe, the later Peter Bodunrin, Jonathan Chimakonam can be grouped here. 
This criterion has also been criticized as courting uncritical elements of the past when it makes reference to the controversial idea of African logic tradition. Further discussion on this is well beyond the scope of this essay. What is, however, common in the two criteria is that African philosophy is a critical discourse on issues that may or may not affect Africa by African philosophers—the purview of this discourse remains unsettled. Recently, the issue of language has come to the fore as crucial in the determination of the Africanness of a philosophy. Inspired by the works of Euphrase Kezilahabi (1985), Ngugi wa Thiong’o (1986), AGA Bello (1987), Francis Ogunmodede (1998), to name just a few, the ‘language challenge’ is now taken as an important element in the affirmation of African philosophy. Advocates ask, should authentic African philosophy be done in African languages or in a foreign language with wider reach? Godfrey Tangwa (2017), Chukwueloka Uduagwu (2022) and Enyimba Maduka (2022) are some contemporary Africans who investigate this question. Alena Rettova (2007) represents non-African philosophers who engage the question.

3. Methods of African Philosophy

a. The Communitarian Method

This method speaks to the idea of mutuality, togetherness or harmony, the type found in the classic expression of ubuntu, “a person is a person through other persons”, or in the one credited to John Mbiti, “I am because we are, since we are, therefore I am”. Those who employ this method wish to demonstrate the idea of mutual interdependence of variables or the relational analysis of variables. It is most prominent in the works of researchers working in the areas of ubuntu, personhood and communalism. Some of the scholars who employ this method include Ifeanyi Menkiti, Mogobe Ramose, Kwame Gyekye, Thaddeus Metz, Fainos Mangena, Leonhard Praeg, Bernard Matolino, Michael Eze, Olajumoke Akiode, Rianna Oelofsen, and so forth.

b. The Complementarity Method

This method was propounded by Innocent Asouzu, and it emphasizes the idea of missing link. In it, no variable is useless. The system of reality is like a network in which each variable has an important role to play i.e. it complements and is, in return, complemented because no variable is self-sufficient. Each variable is then seen as a ‘missing link’ of reality to other variables. Here, method is viewed as a disposition or a bridge-building mechanism. As a disposition, it says a lot about the orientation of the philosopher who employs it. The method of complementary reflection seeks to bring together seemingly opposed variables into a functional unity. Other scholars whose works have followed this method include Mesembe Edet, Ada Agada, Jonathan Chimakonam and a host of others.

c. The Conversational Method

This is a formal procedure for assessing the relationships of opposed variables in which thoughts are shuffled through disjunctive and conjunctive modes to constantly recreate fresh thesis and anti-thesis, each time at a higher level of discourse, without the expectation of a synthesis. The three principal features of this method are relationality, the idea that variables necessarily interrelate; contextuality, the idea that the relationship of variables is informed and shaped by contexts; and complementarity, the idea that seemingly opposed variables can complement rather than contradict. It is an encounter between philosophers of rival schools of thought and between different philosophical traditions or cultures in which one party called nwa-nsa (the defender or proponent) holds a position and another party called nwa-nju (the doubter or opponent) doubts or questions the veracity and viability of the position. On the whole, this method points to the idea of relationships among interdependent, interrelated and interconnected realities existing in a network whose peculiar truth conditions can more accurately and broadly be determined within specific contexts. This method was first proposed by Jonathan Chimakonam and endorsed by the Conversational School of Philosophy. Other scholars who now employ this method include Victor Nweke, Mesembe Edet, Fainos Mangena, Enyimba Maduka, Ada Agada, Pius Mosima, L. Uchenna Ogbonnaya, Aribiah Attoe, Leyla Tavernaro-Haidarian, Amara Chimakonam, Chukwueloka Uduagwu, Patrick Ben, and a host of others.

4. Schools of African Philosophy

a. Ethnophilosophy School

This is the foremost school in systematic African philosophy, which equated African philosophy with culture-bound systems of thought. For this, their enterprise was scornfully described as substandard, hence the term “ethnophilosophy.” Thoughts of the members of the Excavationism movement like Placid Tempels and Alexis Kagame properly belong here, and their high point was in the early period of African philosophy.

b. Nationalist/Ideological School

The concern of this school was nationalist philosophical jingoism to combat colonialism and to create political philosophy and ideology for Africa from the indigenous traditional system as a project of decolonization. Thoughts of members of the Excavationism movement like Kwame Nkrumah, Leopold Sedar Senghor and Julius Nyerere in the early period can be brought under this school.

c. Philosophic Sagacity

There is also the philosophic sagacity school, whose main focus is to show that standard philosophical discourse existed and still exists in traditional Africa and can only be discovered through sage conversations. The chief proponent of this school was the brilliant Kenyan philosopher Odera Oruka, who took time to emphasize that Marcel Griaule’s similar programme is less sophisticated than his. Other adherents of this school include Gail Presbey, Anke Graness and the Cameroonian philosopher Pius Mosima. But since Oruka’s approach thrives on the method of oral interview of presumed sages whose authenticity can easily be challenged, what was produced may well distance itself from the sages and become the fruit of the interviewing philosopher. So, the sage connection and the tradition became questionable. Their enterprise falls within the movement of Critical Reconstructionism of the later period.

d. Hermeneutical School

Another prominent school is the hermeneutical school. Its focus is that the best approach to studying African philosophy is through interpretations of oral traditions and emerging philosophical texts. Theophilus Okere, Okonda Okolo, Tsenay Serequeberhan and Ademola Fayemi Kazeem are some of the major proponents and members of this school. The confusion, however, is that they reject ethnophilosophy whereas the oral tradition and most of the texts available for interpretation are ethnophilosophical in nature. The works of Okere and Okolo feasted on ethno-philosophy. This school exemplifies the movement called Afro-constructionism of the middle period.

e. Literary School

The literary school’s main concern is to make a philosophical presentation of African cultural values through literary or fictional means. Proponents like Chinua Achebe, Okot P’Bitek, Ngugi wa Thiong’o and Wole Soyinka, to name a few, have been outstanding. Yet critics have found it convenient to identify their discourse with ethnophilosophy from a literary angle, thereby denigrating it as sub-standard. Their enterprise marks the movement of Afro-constructionism of the middle period.

f. Professional School

Perhaps the most controversial is the one variously described as the professional, universalist or modernist school. It contends that all the other schools are engaged in one form of ethnophilosophy or another, that standard African philosophy is critical, individual discourse and that what qualifies as African philosophy must have universal merit and thrive on the method of critical analysis and individual discursive enterprise. It is not about talking; it is about doing. Some staunch, unrepentant members of this school include Kwasi Wiredu, Paulin Hountondji, Peter Bodunrin, Richard Wright and Henri Maurier, to name a few. They demolished all that had been built in African philosophy and built nothing as an alternative episteme. This school champions the movement of Afro-deconstructionism and the abortive Critical Reconstructionism of the middle and later periods, respectively.

Perhaps, one of the deeper criticisms that can be leveled against the position of the professional school comes from C. S. Momoh’s scornful description of the school as African logical neo-positivism. They agitate that (1) there is nothing as yet in African traditional philosophy that qualifies as philosophy and (2) that critical analysis should be the focus of African philosophy; so, what then is there to be critically analyzed? Professional school adherents are said to forget in their overt copying of European philosophy that analysis is a recent development in European philosophy which attained maturation in the 19th century after over 2000 years of historical evolution thereby requiring some downsizing. Would they also grant that philosophy in Europe before 19th century was not philosophy? The aim of this essay is not to offer criticisms of the schools but to present historical journey of philosophy in the African tradition. It is in opposition to and the need to fill the lacuna in the enterprise of the professional school that the new school called the conversational school has emerged in African philosophy.

g. Conversational School

 This new school thrives on fulfilling the yearning of the professional/modernist school to have a robust individual discourse as well as fulfilling the conviction of the traditionalists that a thorough-going African philosophy has to be erected on the foundation of African thought systems. They make the most of the criterion that presents African philosophy as a critical tradition that prioritizes engagements between philosophers and cultures, and projects individual discourses from the methodological lenses and thought system of Africa that features the principles of relationality, contextuality and complementarity. The school has an ideological structure consisting of four aspects: their working assumption that relationship and context are crucial to understanding reality; their main problem called border lines or the presentation of reality as binary opposites; their challenge, which is to trace the root cause of border lines; and their two main questions, which are: does difference amount to inferiority and are opposites irreconcilable? Those whose writings fit into this school include Pantaleon Iroegbu, Innocent Asouzu, Chris Ijiomah, Godfrey Ozumba, Andrew Uduigwomen, Bruce Janz, Jennifer Vest, Jonathan Chimakonam, Fainos Mangena, Victor Nweke, Paul Dottin, Aribiah Attoe, Leyla Tavernaro-Haidarian, Maduka Enyimba, L. Uchenna Ogbonnaya, Isaiah Negedu, Christiana Idika, Ada Agada, Amara Chimakonam, Patrick Ben, Emmanuel Ofuasia, Umezurike Ezugwu, to name a few. Their projects promote partly the movements of Afro-eclecticism and fully the conversationalism of the later and the new periods, respectively.

5. The Movements in African Philosophy

Four main movements can be identified in the history of African philosophy: Excavationism, Afro-constructionism / Afro-deconstructionism, Critical Reconstructionism / Afro-Eclecticism and Conversationalism.

a. Excavationism

 The Excavators are all those who sought to erect the edifice of African philosophy by systematizing the African cultural world-views. Some of them aimed at retrieving and reconstructing presumably lost African identity from the raw materials of African culture, while others sought to develop compatible political ideologies for Africa from the native political systems of African peoples. Members of this movement have all been grouped under the schools known as ethnophilosophy and nationalist/ideological schools, and they thrived in the early period of African philosophy. Their concern was to build and demonstrate unique African identity in various forms. A few of them include JB Danquah, SK Akesson, Placid Tempels, Julius Nyerere, John Mbiti, Alexis Kagame, Leopold Senghor, Kwame Nkrumah and Aime Cesaire, and so on.

b. Afro-Constructionism/Afro-Deconstructionism

The Afro-deconstructionists, sometimes called the Modernists or the Universalists, are those who sought to demolish the edifice erected by the Excavators on the ground that its raw materials are substandard cultural paraphernalia. They are opposed to the idea of a unique African identity or culture-bound philosophy and prefer a philosophy that will integrate African identity with the identity of all other races. They never built this philosophy. Some members of this movement include Paulin Hountondji, Kwasi Wiredu, Peter Bodunrin, Marcien Towa, Fabien Eboussi Boulaga, Richard Wright and Henri Maurier, and partly Kwame Appiah. Their opponents are the Afro-constructionists, sometimes called the Traditionalists or Particularists, who sought to add rigor to and promote the works of the Excavators as true African philosophy. Some prominent actors in this movement include Ifeanyi Menkiti, Innocent Onyewuenyi, Henry Olela, Lansana Keita, C. S. Momoh, Joseph Omoregbe, Janheinz Jahn, Sophie Oluwole and, in some ways, Kwame Gyekye. Members of this twin-movement have variously been grouped under the ethnophilosophy, philosophic sagacity, professional, hermeneutical and literary schools, and they thrived in the middle period of African philosophy. This is also known as the period of the Great Debate.

c. Critical Reconstructionism/Afro-Eclecticism

A few Afro-deconstructionists of the middle period evolved into Critical Reconstructionists, hoping to reconstruct from scratch the edifice of authentic African philosophy that would be critical, individualistic and universal. They hold that the edifice of ethnophilosophy, which they had demolished in the middle period, contained no critical rigor. Some of the members of this movement include Kwasi Wiredu, Olusegun Oladipo, Kwame Appiah, V. Y. Mudimbe, D. A. Masolo, Odera Oruka and, in some ways, Barry Hallen and J. O. Sodipo. Their opponents are the Afro-Eclectics, who evolved from Afro-constructionism of the middle period. Unable to sustain their advocacy and the structure of ethnophilosophy they had constructed, they stepped down a little bit to say, “Maybe we can combine meaningfully, some of the non-conflicting concerns of the Traditionalists and the Modernists.” They say (1) that it is a fact, as the Modernists claim, that African traditional philosophy is not rigorous enough; (2) that it is also a fact that the deconstructionist program of the Modernists did not offer, and is incapable of offering, an alternative episteme; and (3) that perhaps the rigor of the Modernists can be applied to the usable and relevant elements produced by the Traditionalists to produce the much elusive, authentic African philosophy. African philosophy for this movement therefore becomes a product of synthesis resulting from the application of tools of critical reasoning to the relevant traditions of the African life-world. A. F. Uduigwomen, Kwame Gyekye, Ifeanyi Menkiti, Kwame Appiah, Godwin Sogolo and Jay van Hook are some of the members of this movement. This movement played a vital reconciliatory role, the importance of which was not fully realized in African philosophy. Most importantly, they found a way out and laid the foundation for the emergence of Conversationalism. Members of this twin-movement thrived in the later period of African philosophy.

d. Conversationalism

The Conversationalists are those who seek to create an enduring corpus in African philosophy by engaging elements of tradition and individual thinkers in critical conversations. They emphasize originality, creativity, innovation, peer-criticism and cross-pollination of ideas in prescribing and evaluating their ideas. They hold that a new episteme in African philosophy can only be created by individual African philosophers who make use of the “usable past” and the depth of individual originality in finding solutions to contemporary demands. They do not lay emphasis on analysis alone but also on critical rigor and what is now called arumaristics, a creative reshuffling of thesis and anti-thesis that spins out new concepts and thoughts. Further, their methodological ambience features principles such as relationality, contextuality and complementarity. Members of this movement thrive in this contemporary period, and their school can be called the conversational school. Some of the philosophers that have demonstrated this trait include Pantaleon Iroegbu, Innocent Asouzu, Chris Ijiomah, Godfrey Ozumba, Andrew Uduigwomen, Bruce Janz, Jonathan Chimakonam, Fainos Mangena, Jennifer Lisa Vest, L. Uchenna Ogbonnaya, Maduka Enyimba, Leyla Tavernaro-Haidarian, Aribiah Attoe, and so forth.

6. Epochs in African Philosophy

Various historians of African philosophy have delineated the historiography of African philosophy differently. Most, like Obenga, Abanuka, Okoro, Oguejiofor, Graness, Fayemi, etc., have merely adapted the Western periodization model of ancient, medieval, modern and contemporary. But there is a strong objection to this model. Africa, for example, did not experience the medieval age as Europe did. The intellectual history of the ancient period of Europe shares little in common with ancient Africa. The same goes for the modern period. In other words, the names ancient, medieval and modern refer to actual historical periods in Europe with specific features in their intellectual heritage, which share very little in common with those exact dates in Africa. It thus makes no historical, let alone philosophical, sense to adopt such a model for African philosophy. Here, we have a classic case of what Innocent Asouzu calls “copycat philosophy”, which must be rejected. The conversationalists, therefore, propose a different model, one that actually reflects the true state of things. In this model, there are two broad categories, namely the Pre-systematic epoch and the Systematic epoch. The latter is further divided into four periods: the early, middle, later and contemporary periods.

a. Pre-systematic Epoch

This refers to the era from the time of the first Homo sapiens to the 1900s. African philosophers studied here are those Innocent Asouzu describes as the “Anonymous Traditional African Philosophers”, whose names have been lost in history. They may also include the ancient Egyptians, Ethiopians and Africans who thrived in Europe in that era. The controversies surrounding the nativity of the philosophies of St Augustine, Anton Amo, the Egyptian question, etc., may also be included.

b. Systematic Epoch

This refers to the era from the 1920s to date when systematicity that involves academic training, writing, publishing, engagements, etc., inspired by African conditions and geared towards addressing those conditions, became central to philosophical practice in Africa, South of the Sahara. This latter epoch could further be delineated into four periods: early, middle, later and the contemporary periods.

7. Periods of African Philosophy

a. Early Period

The early period of African philosophy was an era of the movement called cultural/ideological excavation, aimed at retrieving and reconstructing African identity. The schools that emerged and thrived in this period were the ethnophilosophy and ideological/nationalist schools. Hegel wrote that the Sub-Saharan Africans had no high cultures and made no contributions to world history and “civilization” (1975: 190). Lucien Lévy-Bruhl also suggested that they were “pre-logical” (1947: 17). The summary of these two positions, which represent the colonial mindset, is that Africans have no dignified identity like their European counterparts. This could be deciphered in the British colonial system that sought to erode the native thought system in the constitution of social systems in their colonies and also in the French policy of assimilation. Assimilation is a concept credited to the French philosopher Chris Talbot (1837), which rests on the idea of expanding French culture to the colonies outside of France in the 19th and 20th centuries. According to Betts (2005: 8), the natives of these colonies were considered French citizens as long as the “French culture” and customs were adopted to replace the indigenous system. The purpose of the theory of assimilation, for Michael Lambert, therefore, was to turn African natives into Frenchmen by educating them in the French language and culture (1993: 239-262).

During colonial times, the British, for example, educated their colonies in the British language and culture, strictly undermining the native languages and cultures. The products of this new social system were then given the impression that they were British, though second class, the king was their king, and the empire was also theirs. Suddenly, however, colonialism ended, and they found, to their chagrin, that they were treated as slave countries in the new post-colonial order. Their native identity had been destroyed, and their fake British identity had also been taken from them; what was left was amorphous and corrupt. It was in the heat of this confusion and frustration that the African philosophers sought to retrieve and recreate the original African identity lost in the event of colonization. Ruch and Anyanwu, therefore, ask, “What is this debate about African identity concerned with and what led to it? In other words, why should Africans search for their identity?” Their response to the questions is as follows:

The simple answer to these questions is this: Africans of the first half of this (20th century) century have begun to search for their identity, because they had, rightly or wrongly, the feeling that they had lost it or that they were being deprived of it. The three main factors which led to this feeling were: slavery, colonialism and racialism. (1981: 184-85)

Racialism, as Ruch and Anyanwu believed, may have sparked it off and slavery may have dealt the heaviest blow, but it was colonialism that entrenched it. Ironically, it was the same colonialism at its stylistic conclusion that opened the eyes of the Africans by stirring the hornet’s nest. An African can never be British or French, even with the colonially imposed language and culture. With this shock, the post-colonial African philosophers of the early period set out in search of Africa’s lost identity.

In 1954, George G. M. James published his monumental work Stolen Legacy. In it, he attempted to prove that the Egyptians were the true authors of Western philosophy; that Pythagoras, Socrates, Plato and Aristotle plagiarized the Egyptians; that the authorship of the individual doctrines of Greek philosophers is mere speculation perpetuated chiefly by Aristotle and executed by his school; and that the African continent gave the world its civilization, knowledge, arts and sciences, religion and philosophy, a fact that is destined to produce a change in the mentality both of the European and African peoples. In G. M. James’ words:

In this way, the Greeks stole the legacy of the African continent and called it their own. And as has already been pointed out, the result of this dishonesty had been the creation of an enormous world opinion; that the African continent has made no contribution to civilization, because her people are backward and low in intelligence and culture…This erroneous opinion about the Black people has seriously injured them through the centuries up to modern times in which it appears to have reached a climax in the history of human relations. (1954: 54)

These robust intellectual positions supported by evidential and well-thought-out arguments quickly heralded a shift in the intellectual culture of the world. However, there was one problem George James could not fix; he could not prove that the people of North Africa (Egyptians) who were the true authors of ancient art, sciences, religion and philosophy were high-melanated Africans, as can be seen in his hopeful but inconsistent conclusions:

This is going to mean a tremendous change in world opinion, and attitude, for all people and races who accept the new philosophy of African redemption, i.e. the truth that the Greeks were not the authors of Greek philosophy; but the people of North Africa; would change their opinion from one of disrespect to one of respect for the black people throughout the world and treat them accordingly. (1954: 153)

It is inconsistent to suppose that the achievements of North Africans (Egyptians) can redeem the black Africans. This is also the problem with Henry Olela’s article “The African Foundations of Greek Philosophy”.

However, in Onyewuenyi’s The African Origin of Greek Philosophy, an ambitious attempt emerges to fill this lacuna in the argument for a new philosophy of African redemption. In the first part of chapter two, he reduced Greek philosophy to Egyptian philosophy, and in the second part, he attempted to further reduce the Egyptians of the time to high-melanated Africans. There are, however, two holes he could not fill. First, Egypt is the world’s oldest standing country, and its people have told their own story in different forms. At no point did they or other historians describe the Egyptians as wholly high-melanated people. Second, if the Egyptians were at a time wholly high-melanated, why are they now wholly low-melanated? Given the failure of this group of scholars to prove that high-melanated Africans were the authors of Egyptian philosophy, one must abandon the Egyptian legacy or discuss it as one of the precursor arguments to systematic African philosophy until more evidence emerges.

There are other scholars of the early period who tried more reliable ways to assert African identity by establishing a native African philosophical heritage. Some examples include J. B. Danquah, who produced The Akan Doctrine of God (1944), Meinrad Hebga (1958), who wrote “Logic in Africa”, and S. K. Akesson, who published “The Akan Concept of Soul” (1965). Another is Tempels, who authored Bantu Philosophy (1959). They all proved that rationality was an important feature of traditional African culture. By systematizing Bantu philosophical ideas, Tempels confronted the racist orientation of the West, which depicted Africa as a continent of semi-humans. In fact, Tempels showed latent similarities in the spiritual inclinations of the Europeans and their African counterparts. In the opening passage of his work, he observed that the European who has taken to atheism quickly returns to a Christian viewpoint when suffering or pain threatens his survival. In much the same way, he says, the Christian Bantu returns to the ways of his ancestors when confronted by suffering and death. So, spiritual orientation or thinking is not found only in Africa.

In his attempt to explain the Bantu understanding of being, Tempels admits that this might not be the same as the understanding of the European. Instead, he argues that the Bantu construction is as much rational as that of the European. In his words:

So, the criteriology of the Bantu rests upon external evidence, upon the authority and dominating life force of the ancestors. It rests at the same time upon the internal evidence of experience of nature and of living phenomena, observed from their point of view. No doubt, anyone can show the error of their reasoning; but it must none the less be admitted that their notions are based on reason, that their criteriology and their wisdom belong to rational knowledge. (1959: 51)

Tempels obviously believes that the Bantu, like the rest of the African peoples, possess rationality, which undergirds their philosophical enterprise. The error in their reasoning is only obvious in the light of European logic. But Tempels was mistaken in his supposition that the Bantu system is erroneous. The Bantu categories only differ from those of the Europeans in terms of logic, which is why a first-time European on-looker would misinterpret them to be irrational or spiritual. Hebga demonstrates this and suggests the development of African logic. Thus, the racist assumption that Africans are less intelligent, which Tempels rejected with one hand, was smuggled in with the other. For this and his other errors, such as his depiction of Bantu ontology in terms of vital force and his arrogant claim that the Bantu could not write his philosophy, requiring the intervention of the European, some African philosophers, like Paulin Hountondji and Innocent Asouzu, to name just a few, criticized Tempels. Asouzu, for one, describes what he calls the “Tempelsian Damage” in African philosophy to refer to the undue and erroneous influence that Bantu Philosophy has had on contemporary Africans. For example, Tempels makes a case for Africa’s true identity, which, for him, could be found in African religion, within which African philosophy (ontology) is subsumed. In his words, “being is force, force is being”. This went on to influence the next generation of African philosophers, like the Rwandan Alexis Kagame. Kagame’s work The Bantu-Rwandan Philosophy (1956) offers similar arguments, thus further strengthening the claims made by Tempels, especially from an African’s perspective. The major criticism against their industry remains the association of their thoughts with ethnophilosophy, where ethnophilosophy is a derogatory term. A much more studied criticism is offered by Innocent Asouzu in his work Ibuanyidanda: New Complementary Ontology (2007).

Asouzu’s criticism was not directed at the validity of the thoughts they expressed or at whether Africa could boast of a rational enterprise such as philosophy, but at the logical foundation of their thoughts. Asouzu seems to quarrel with Tempels for allowing his native Aristotelian orientation to influence his construction of African philosophy and lambasts Kagame for following suit instead of correcting Tempels’ mistake. The principle of bivalence evident in the Western thought system was in the background of their construction.

Another important philosopher in this period is John Mbiti. His work African Religions and Philosophy (1969) avidly educated those who doubted Africans’ possession of their own identities before the arrival of the European by excavating and demonstrating the rationality in the religious and philosophical enterprises in African cultures. He boldly declared: “We shall use the singular, ‘philosophy’ to refer to the philosophical understanding of African peoples concerning different issues of life” (1969: 2). His presentation of time in African thought displays the pattern of excavation in his African philosophy. Although his studies focus primarily on the Kikamba and Gikuyu tribes of Africa, he observes that there are similarities in many African cultures, just as Tempels did earlier. He subsumes African philosophy in African religion on the assumption that African peoples do not know how to exist without religion. This idea is also shared by William Abraham in his book The Mind of Africa as well as by Tempels in Bantu Philosophy. African philosophy, from Mbiti’s treatment, could be likened to Tempels’ vital force, of which African religion is the outer cloak. The obvious focus of this book is on African views about God, political thought, afterlife, culture or world-view and creation; the philosophical aspects lie within these religious over-coats. Thus, Mbiti establishes that the true, and lost, identity of the African could be found within his religion. Another important observation Mbiti made was that this identity is communal and not individualistic. Hence, he states, “I am because we are and since we are therefore I am” (1969: 108). Therefore, the African has to re-enter his religion to find his philosophy and the community to find his identity.

But just like Kagame, Mbiti was unduly and erroneously influenced both by Tempels and the Judeo-Christian religion in accepting the vital force theory and in cloaking the African God with the attributes of the Judeo-Christian God.

This is a view shared by William Abraham. He shares Tempels’ and Mbiti’s views that the high-melanated African peoples have many similarities in their culture, though his studies focus on the culture and political thought of the Akan of present-day Ghana. Another important aspect of Abraham’s work is that he subsumed African philosophical thought in African culture taking, as Barry Hallen described, “an essentialist interpretation of African culture” (2002: 15). Thus for Abraham, like Tempels and Mbiti, the lost African identity could be found in the seabed of African indigenous culture in which religion features prominently.

On the other hand, there were those who sought to retrieve and establish, once again, Africa’s lost identity through economic and political ways. Some names discussed here include Kwame Nkrumah, Leopold Senghor and Julius Nyerere. These actors felt that the African could never be truly decolonized unless he found his own system of living and social organization. One cannot be African while living like the European. The question that guided their study, therefore, became, “What system of economic and social engineering will suit us and project our true identity?” Nkrumah advocates African socialism, which, according to Barry Hallen, is an original social, political and philosophical theory of African origin and orientation. This system is forged from the traditional, communal structure of African society, a view strongly projected by Mbiti. Like Amilcar Cabral and Julius Nyerere, Nkrumah suggests that a return to the African cultural system, with its astute moral values, communal ownership of land and humanitarian social and political engineering, holds the key to Africa rediscovering her lost identity. Systematizing this process will yield what he calls the African brand of socialism. In most of his books, he projects the idea that Africa’s lost identity is to be found in African native culture, within which is African philosophical thought and identity shaped by communal orientation. Some of his works include Neo-colonialism: The Last Stage of Imperialism (1965), I Speak of Freedom: A Statement of African Ideology (1961), Africa Must Unite (1970), and Consciencism (1965).

Leopold Sedar Senghor of Senegal charted a course similar to that of Nkrumah. In his works Negritude et Humanisme (1964) and Negritude and the Germans (1967), Senghor traced Africa’s philosophy of social engineering down to African culture, which he said is communal and laden with brotherly emotion. This is different from the European system, which he says is individualistic, having been marshaled purely by reason. He opposed the French colonial principle of assimilation aimed at turning Africans into Frenchmen by eroding and replacing African culture with French culture. African culture and languages are the bastions of African identity, and it is in this culture that he found the pedestal for constructing a political ideology that would project African identity. Senghor is in agreement with Nkrumah, Mbiti, Abraham and Tempels in many ways, especially with regard to the basis for Africa’s true identity.

Julius Nyerere of Tanzania is another philosopher of note in the early period of African philosophy. In his books Uhuru na Ujamaa: Freedom and Socialism (1964) and Ujamaa: The Basis of African Socialism (1968), he sought to retrieve and establish Africa’s true identity through economic and political ways. For him, Africans cannot regain their identity unless they are first free, and freedom (Uhuru) transcends independence. Cultural imperialism has to be overcome. And what is the best way to achieve this if not by developing a socio-political and economic ideology from the petals of African native culture and its traditional values of togetherness and brotherliness? Hence, Nyerere proposes Ujamaa, meaning familyhood: the “being-with” philosophy or the “we” instead of the “I-spirit” (Okoro 2004: 96). In the words of Barry Hallen, “Nyerere argued that there was a form of life and system of values indigenous to the culture of pre-colonial Africa, Tanzania in particular, that was distinctive if not unique and that had survived the onslaughts of colonialism sufficiently intact to be regenerated as the basis for an African polity” (2002: 74). Thus for Nyerere, the basis of African identity is the African culture, which is communal rather than individualistic. Nyerere was in agreement with other actors of this period on the path to full recovery of Africa’s lost identity. Some of the philosophers of this era not treated here include Aime Cesaire, Nnamdi Azikiwe, Obafemi Awolowo, Amilcar Cabral, and the two foreigners, Janheinz Jahn and Marcel Griaule.

b. Middle Period

The middle period of African philosophy was an era of the twin movement called Afro-constructionism and Afro-deconstructionism, otherwise called the Great Debate, when two rival schools, the Traditionalists and the Universalists, clashed. While the Traditionalists sought to construct an African identity based on excavated African cultural elements, the Universalists sought to demolish such an architectonic structure by associating it with ethnophilosophy. The schools that thrived in this era include the Philosophic Sagacity, Professional/Modernist/Universalist, Hermeneutical and Literary schools.

An important factor of the early period was that the thoughts on Africa’s true identity generated arguments that fostered the emergence of the Middle Period of African philosophy. These arguments result from questions that could be summarized as follows: (1) Is it proper to take for granted the sweeping assertion that all of Africa’s cultures share a few basic elements in common? It was this assumption that had necessitated the favorite phrase in the early period, “African philosophy,” rather than “African philosophies”. (2) Does Africa or African culture contain a philosophy in the strict sense of the term? (3) Can African philosophy emerge from the womb of African religion, world-view and culture? Answers and objections to answers soon took the shape of a debate, characterizing the middle period as the era of the Great Debate in African philosophy.

This debate was between members of Africa’s new crop of intellectual radicals. On the one hand were the demoters and, on the other, the promoters of the African philosophy established by the league of early-period intellectuals. The former sought to criticize this new philosophy of redemption, gave it the derogatory tag “ethnophilosophy” and consequently denigrated the African identity that was founded on it as a savage and primitive identity. At the other end, the promoters sought to clarify and defend this philosophy and justify the African identity that was rooted in it as true and original.

For clarity, the assessment of the debate era will begin from the middle instead of the beginning. In 1978, Odera Oruka, a Kenyan philosopher, presented a paper at the William Amo Symposium held in Accra, Ghana, on the topic “Four Trends in Current African Philosophy”, in which he grouped voices on African philosophy into four schools, namely ethnophilosophy, philosophic sagacity, the nationalistic-ideological school and professional philosophy. In 1990, he wrote another work, Sage Philosophy: Indigenous Thinkers and the Modern Debate on African Philosophy, in which he added two further schools, bringing the number of schools in African philosophy to six. Those two additions are the hermeneutic and the artistic/literary schools.

Those who uphold philosophy in African culture are the ethnophilosophers, and they include the actors treated as members of the early period of African philosophy and their followers or supporters in the Middle Period. Some would include C. S. Momoh, Joseph Omoregbe, Lansana Keita, Olusegun Oladipo, Gordon Hunnings, Kwame Gyekye, M. A. Makinde, Emmanuel Edeh, Uzodinma Nwala, K. C. Anyanwu and later E. A. Ruch, to name a few. The philosophic sagacity school, to which Oruka belongs, also accommodates C. S. Momoh, C. B. Nze, J. I. Omoregbe, C. B. Okolo and T. F. Mason. The nationalist-ideological school consists of those who sought to develop indigenous socio-political and economic ideologies for Africa. Prominent members include Julius Nyerere, Leopold Senghor, Kwame Nkrumah, Amilcar Cabral, Nnamdi Azikiwe and Obafemi Awolowo. The professional philosophy school insists that African philosophy must be done with professional philosophical methods such as analysis, critical reflection and logical argumentation, as it is in Western philosophy. Members of this school include: Paulin Hountondji, Henri Maurier, Richard Wright, Peter Bodunrin, Kwasi Wiredu, early E. A. Ruch, R. Horton, and later C. B. Okolo. The hermeneutic school recommends interpretation as a method of doing African philosophy. A few of its members include Theophilus Okere, Okonda Okolo, Tsenay Serequeberhan, Godwin Sogolo and partly J. Sodipo and B. Hallen. The Artistic/Literary school philosophically discusses the core of African norms in literary works, and includes Chinua Achebe, Okot P’Bitek, Ngugi wa Thiong’o, Wole Soyinka, Elechi Amadi and F. C. Ogbalu.

Also, in 1989, C. S. Momoh in his The Substance of African Philosophy outlined five schools, namely African logical neo-positivism, the colonial/missionary school of thought, the Egyptological school, the ideological school and the purist school. The article was titled “Nature, Issues and Substance of African Philosophy” and was reproduced in Jim Unah’s Metaphysics, Phenomenology and African Philosophy (1996).

In comparing Momoh’s delineations with Oruka’s, it can be said that the purist school encompasses Oruka’s ethnophilosophy, artistic/literary school and philosophic sagacity; African logical neo-positivism encompasses professional philosophy and the hermeneutical school; and the ideological and colonial/missionary schools correspond to Oruka’s nationalistic-ideological school. The Egyptological school, therefore, remains outstanding. Momoh describes it as a school that regards African philosophy as synonymous with Egyptian philosophy or, at least, as originating from it. Also, Egyptian philosophy as a product of African philosophy is expressed in the writings of George James, I. C. Onyewuenyi and Henry Olela.

Welding all these divisions together are the perspectives of Peter Bodunrin and Kwasi Wiredu. In the introduction to his 1985 edited volume Philosophy in Africa: Trends and Perspectives, Bodunrin created two broad schools for all the subdivisions in both Oruka and Momoh, namely the Traditionalist and Modernist schools. While the former includes Africa’s rich culture and past, the latter excludes them from the mainstream of African philosophy. Kwasi Wiredu also made this type of division, specifically Traditional and Modernist, in his paper “On Defining African Philosophy” in C. S. Momoh’s (1989) edited volume. Also, A. F. Uduigwomen created two broad schools, namely the Universalists and the Particularists, in his “Philosophy and the Place of African Philosophy” (1995). These can be equated to Bodunrin’s Modernist and Traditionalist schools, respectively. The significance of his contribution to the Great Debate rests on the new school he evolved from the compromise of the Universalist and the Particularist schools (1995/2009: 2-7). As Uduigwomen defines it, the Eclectic school accommodates discourses pertaining to African experiences, culture and world-view as parts of African philosophy. Those discourses must be critical, argumentative and rational. In other words, the so-called ethnophilosophy can comply with the analytic and argumentative standards that people like Bodunrin, Hountondji, and Wiredu insist upon. Some later African philosophers revived Uduigwomen’s Eclectic school as a much more decisive approach to African philosophy (Kanu 2013: 275-87). It is the era dominated by Eclecticism and meta-philosophy that is tagged the ‘Later period’ in the history of African philosophy. For perspicuity, therefore, the debate from these two broad schools shall be addressed as the perspectives of the Traditionalist or Particularist and the Modernist or Universalist.

The reader must now have understood the perspectives on which the individual philosophers of the middle period debated. Hence, when Richard Wright published his critical essay “Investigating African Philosophy” and Henri Maurier published his “Do we have an African Philosophy?”, denying the existence of African philosophy, at least as yet, the reader understands why Lansana Keita’s “The African Philosophical Tradition”, C. S. Momoh’s “African Philosophy … does it exist?” or J. I. Omoregbe’s “African Philosophy: Yesterday and Today” are offered as critical responses. When Wright arrived at the conclusion that the problems surrounding the study of African philosophy were so great that others were effectively prevented from any worthwhile work until their resolution, and Henri Maurier responded to the question, “Do we have an African Philosophy?” with “No! Not Yet!” (1984: 25), one would understand why Lansana Keita took it up to provide concrete evidence that Africa had and still has a philosophical tradition. In his words:

It is the purpose of this paper to present evidence that a sufficiently firm literate philosophical tradition has existed in Africa since ancient times, and that this tradition is of sufficient intellectual sophistication to warrant serious analysis…it is rather…an attempt to offer a defensible idea of African philosophy. (1984: 58)

Keita went on in that paper to excavate intellectual resources to prove his case, but it was J. I. Omoregbe who tackled the demoters on every front. Of particular interest are his critical commentaries on the position of Kwasi Wiredu and others who share Wiredu’s opinion that what is called African philosophy is not philosophy, but community thought at best. Omoregbe argues that the logic and method of African philosophy need not be the same as those of Western philosophy, which the demoters cling to. In his words:

It is not necessary to employ Aristotelian or the Russellian logic in this reflective activity before one can be deemed to be philosophizing. It is not necessary to carry out this reflective activity in the same way that the Western thinkers did. Ability to reason logically and coherently is an integral part of man’s rationality. The power of logical thinking is identical with the power of rationality. It is therefore false to say that people cannot think logically or reason coherently unless they employ Aristotle’s or Russell’s form of logic or even the Western-type argumentation. (1998: 4-5)

Omoregbe was addressing the position of most members of the Modernist school who believed that African philosophy must follow the pattern of Western philosophy if it were to exist. As he cautions:

Some people, trained in Western philosophy and its method, assert that there is no philosophy and no philosophizing outside the Western type of philosophy or the Western method of philosophizing (which they call “scientific” or “technical”). (1998: 5)

Philosophers like E. A. Ruch in some of his earlier writings, Peter Bodunrin, C. B. Okolo, and Robin Horton were direct recipients of Omoregbe’s criticism. Robin Horton’s “African Traditional Thought and Western Science” is a two-part essay that sought, in the long run, to expose the rational ineptitude in African thought. On the question of logic in African philosophy, Robin Horton’s “Traditional Thought and the emerging African Philosophy Department: A Comment on the Current Debate” first stirred the hornet’s nest and was ably challenged by Gordon Hunnings’ “Logic, Language and Culture”, as well as by Omoregbe’s “African Philosophy: Yesterday and Today”. Earlier, Meinrad Hebga’s “Logic in Africa” had made insightful ground-clearing on the matter. More recently, C. S. Momoh’s “The Logic Question in African Philosophy” and Udo Etuk’s “The Possibility of an African Logic”, as well as Jonathan C. Okeke’s “Why can’t there be an African Logic”, made impressions. This logic question is now gathering new momentum in African philosophical discourse. Jonathan O. Chimakonam (2020) has put together a new edited collection that compiles some of the seminal essays in the logic question debate.

On the philosophical angle, Kwasi Wiredu’s “How not to Compare African Traditional Thought with Western Thought” responded to the lopsided earlier effort of Robin Horton but ended up making its own criticisms of the status of African philosophy, which, for Wiredu, is yet to attain maturation. In his words, “[M]any traditional African institutions and cultural practices, such as the ones just mentioned, are based on superstition. By ‘superstition’ I mean a rationally unsupported belief in entities of any sort” (1976: 4-8 and 1995: 194). In his Philosophy and an African Culture, Wiredu was more pungent. He caricatured much of the discourse on African philosophy as community thought or folk thought unqualified to be called philosophy. For him, there had to be a practised distinction between “African philosophy as folk thought preserved in oral traditions and African philosophy as critical, individual reflection, using modern logical and conceptual techniques” (1980: 14). Olusegun Oladipo supports this in his Philosophy and the African Experience. As he puts it:

But this kind of attitude is mistaken. In Africa, we are engaged in the task of the improvement of “the condition of men”. There can be no successful execution of this task without a reasonable knowledge of, and control over, nature. But essential to the quest for knowledge of, and control over, nature are “logical, mathematical and analytical procedures” which are products of modern intellectual practices. The glorification of the “unanalytical cast of mind” which a conception of African philosophy as African folk thought encourages, would not avail us the opportunity of taking advantage of the theoretical and practical benefits offered by these intellectual procedures. It thus can only succeed in making the task of improving the condition of man in Africa a daunting one. (1996: 15)

Oladipo also shares similar thoughts in his The Idea of African Philosophy. African philosophy, for some of the Modernists, is practised in a debased sense. This position is considered opinionated by the Traditionalists. The later E. A. Ruch and K. C. Anyanwu, in their African Philosophy: An Introduction to the Main Philosophical Trends in Contemporary Africa, attempt to excavate the philosophical elements in folklore and myth. C. S. Momoh’s “The Mythological Question in African Philosophy” and K. C. Anyanwu’s “Philosophical Significance of Myth and Symbol in Dogon World-View” further reinforced the position of the Traditionalists (cf. Momoh 1989 and Anyanwu 1989).

However, it took Paulin Hountondji in his African Philosophy: Myth and Reality to drive a long nail in the coffin. African philosophy, for him, must be done in the same frame as Western philosophy, including its principles, methodologies and all. K. C. Anyanwu again admitted that Western philosophy is one of the challenges facing African philosophy but that only calls for systematization of African philosophy not its decimation. He made these arguments in his paper “The Problem of Method in African philosophy”.

Other arguments set Greek standards for authentic African philosophy, as can be found in Odera Oruka’s “The Fundamental Principles in the Question of ‘African Philosophy’ (I)” and Hountondji’s “African Wisdom and Modern Philosophy.” They were readily met by Lansana Keita’s “African Philosophical Systems: A Rational Reconstruction”, J. Kinyongo’s “Philosophy in Africa: An Existence” and even P. K. Roy’s “African Theory of Knowledge”. For every step the Modernists took, the Traditionalists replied with two, a response that lingered till the early 1990s, when a certain phase of disillusionment began to set in to quell the debate. Actors on both fronts had only then begun to reach a new consciousness, realizing that a new step had to be taken beyond the debate. Even Kwasi Wiredu, who had earlier justified the debate by his insistence that “without argument and clarification, there is strictly no philosophy” (1980: 47), had to admit that it was time to do something else. For him, African philosophers had to go beyond talking about African philosophy and get down to actually doing it.

It was with this sort of new orientation, which emerged from the disillusionment of the protracted debate, that the later period of African philosophy was born in the 1980s. As it is said in the Igbo proverb, “The music makers almost unanimously were changing the rhythm and the dancers had to change their dance steps.” One of the high points of the disillusionment was the emergence of the Eclectic school in the next period, called ‘the Later Period’ of African philosophy.

c. Later Period

This period of African philosophy heralds the emergence of movements that can be called Critical Reconstructionism and Afro-Eclecticism. For the Deconstructionists of the middle period, the focus shifted from deconstruction to reconstruction of African episteme in a universally integrated way; whereas, for the Eclectics, finding a reconcilable middle path between traditional African philosophy and modern African philosophy should be paramount. Thus they advocate a shift from entrenched ethnophilosophy and a universal hue to the reconstruction of an African episteme, albeit one somewhat different from both the imposed Westernism and the uncritical ethnophilosophy. So, both the Critical Reconstructionists and the Eclectics advocate one form of reconstruction or the other. The former desire a new episteme untainted by ethnophilosophy, while the latter sue for reconciled central and relevant ideals.

Not knowing how to proceed with this sort of task was a telling problem for all advocates of critical reconstruction in African philosophy, such as V. Y. Mudimbe, Eboussi Boulaga, Olusegun Oladipo, Franz Crahey, Jay van Hook, Godwin Sogolo, and Marcien Towa, to name a few. At the dawn of the era, these African legionnaires pointed out, in different terms, that reconstructing African episteme was imperative. But more urgent was the need to first analyse the haggard philosophical structure patched into existence with the cement of perverse dialogues. It appeared inexorable to these scholars and others of the time that none of these could be successful outside the shadow of Westernism. For whatever one writes, if it is effectively free from ethnophilosophy, then it is either contained in Western discourse or, at the very least, proceeds from its logic. If it is already contained in Western narrative or proceeds from its logic, what then makes it African? This became something of a dead-end for this illustrious group, which struggled against evolutions in their positions.

Intuitively, almost every analyst knows that discussing what has been discussed in Western philosophy or taking the cue from Western philosophy does not absolutely negate or vitiate what is produced as African philosophy. But how is this to be effectively justified? This appears to be the Achilles heel of the Critical Reconstructionists of the later period in African philosophy. The massive failure of these Critical Reconstructionists to go beyond the lines of recommendation and actually engage in reconstruction delayed their emergence as a school of thought in African philosophy. The diversionary trend which occurred at this point ensured that the later period, which began with the two rival camps of Critical Reconstructionists and Eclectics, ended with only the Eclectics left standing. Thus dying in its embryo, Critical Reconstructionism became absorbed in Eclecticism.

The campaign for Afro-reconstructionism had first emerged in the late 1980s in the writings of Peter Bodunrin, Kwasi Wiredu, V. Y. Mudimbe, Lucius Outlaw, and much later, in Godwin Sogolo, Olusegun Oladipo, and Jay van Hook, even though principals like Marcien Towa and Franz Crahey had hinted at it much earlier. The insights of the latter two never rang bells beyond the ear-shot of identity reconstruction, which was the echo of their time. Wiredu’s cry for conceptual decolonization and Hountondji’s call for the abandonment of the ship of ethnophilosophy were in the spirit of Afro-reconstructionism of the episteme. None of the Afro-reconstructionists except for Wiredu was able to truly chart a course for reconstruction. His was linguistic, even though the significance of his campaign was never truly appreciated. His 1998 work “Toward Decolonizing African Philosophy and Religion” was a clearer recapitulation of his works of preceding years.

Beyond this modest line, no other reconstructionist crusader of the time actually went beyond deconstruction and problem identification. Almost spontaneously, Afro-reconstructionism evolved into Afro-eclecticism in the early 1990s when the emerging Critical Reconstructionism ran into a brick wall of inactivity. The argument seems to say, ‘If it is not philosophically permissible to employ a logic or methods different from those of the West, perhaps we can make do with the merger of the approaches we have identified in African philosophy following the deconstructions.’ These approaches are the various schools of thought, from ethnophilosophy, philosophic sagacity and the ideological school to the universal, literary and hermeneutic schools, which were deconstructed into two broad approaches, namely the Traditionalist school and the Modernist school, also called the Particularist and the Universalist schools.

Eclectics, therefore, are those who think that the effective integration or complementation of the African native system and the Western system could produce a viable synthesis that is first African and then modern. Andrew Uduigwomen, the Nigerian philosopher, could be regarded as the founder of this school in African philosophy. In his 1995 work “Philosophy and the Place of African Philosophy,” he gave official birth to Afro-eclecticism. Identifying the Traditionalist and Modernist schools as the Particularist and Universalist schools, he created the eclectic school by carefully unifying their goals from the ruins of the deconstructed past.

Uduigwomen states that the eclectic school holds that an intellectual romance between the Universalist conception and the Particularist conception will give rise to an authentic African philosophy. The Universalist approach will provide the necessary analytic and conceptual framework for the Particularist school. Since, according to Uduigwomen, this framework cannot thrive in a vacuum, the Particularist approach will, in turn, supply the raw materials or indigenous data needed by the Universalist approach. From the submission of Uduigwomen above, one easily detects that eclecticism for him entails employing Western methods in analyzing African cultural paraphernalia.

However, Afro-Eclecticism is not without problems. The first problem is that Uduigwomen did not supply the yardstick for determining what is to be admitted and what must be left out of the corpus of African tradition. Not everything can meet the standard of genuine philosophy, nor should the philosophical selection be arbitrary. Hountondji, a chronic critic of traditional efforts, once called Tempels’ Bantu philosophy a sham. For him, it was not African or Bantu philosophy but Tempels’ philosophy with African paraphernalia. This could be extended to the vision of Afro-eclecticism. On the contrary, it could be argued that if Hountondji agrees that the synthesis contains even as little as African paraphernalia, then it is something new and, in this respect, can claim the tag of African philosophy. However, it leaves to be proven how philosophical that little African paraphernalia is.

Other notable eclectics include Batholomew Abanuka, Udobata Onunwa, C. C. Ekwealor and much later Chris Ijiomah. Abanuka posits in his 1994 work that a veritable way to do authentic African philosophy would be to recognize the unity of individual things and, by extension, theories in ontology, epistemology or ethics. There is a basic identity among these because they are connected and can be unified. Following C. S. Momoh (1985: 12), Abanuka went on in A History of African Philosophy to argue that synthesis should be the ultimate approach to doing African Philosophy. This position is shared by Onunwa on a micro level. He says that realities in African world-view are inter-connected and inter-dependent (1991: 66-71). Ekwealor and Ijiomah also believe in synthesis, noting that these realities are broadly dualistic, being physical and spiritual (cf. Ekwealor 1990: 30 and Ijiomah 2005: 76 and 84). So, it would be an anomaly to think of African philosophy as chiefly an exercise in analysis rather than synthesis. The ultimate methodological approach to African philosophy, therefore, has to reflect a unity of methods above all else.

Eclecticism survived in the contemporary period of African philosophy in conversational forms. Godfrey Ozumba and Jonathan Chimakonam on Njikoka philosophy, E. G. Ekwuru and later Innocent Egwutuorah on Afrizealotism, and even Innocent Asouzu on Ibuanyidanda ontology are all, in a small way, various forms of eclectic thinking. However, these theories are grouped in the New Era specifically for the time of their emergence and the robust conversational structure they have.

The purest development of eclectic thinking in the later period could be found in Pantaleon Iroegbu’s Uwa Ontology. He posits uwa (worlds) as an abstract generic concept with fifteen connotations and six zones. Everything is uwa, in uwa and can be known through uwa. For him, while the fifteen connotations are the different senses and aspects which the concept of uwa carries in African thought, the six zones are the spatio-temporal locations of the worlds in terms of their inhabitants. He adds that these six zones are dualistic and comprise the earthly and the spiritual. They are also dynamic and mutually related. Thus, Iroegbu suggests that the approach to authentic African philosophy could consist of the conglomeration of uwa. This demonstrates a veritable eclectic method in African philosophy.

One of the major hindrances of eclecticism of the later period is that it leads straight to applied philosophy. Following this approach in this period almost makes it impossible for second readers to do original and abstract philosophizing for its own sake. Eclectic theories and methods confine one to their internal dynamics, on the belief that for a work to be regarded as authentic African philosophy, it must follow the rules of Eclecticism. The wider implication is that while creativity might blossom, innovation and originality are stifled. Because of pertinent problems such as these, further evolutions in African philosophy became inevitable. The Kenyan philosopher Odera Oruka had magnified the thoughts concerning individual rather than group philosophizing, thoughts that had been variously expressed earlier by Peter Bodunrin, Paulin Hountondji and Kwasi Wiredu, who further admonished African philosophers to stop talking and start doing African philosophy. And V. Y. Mudimbe, in his The Invention of Africa…, suggested the development of an African conversational philosophy, and the reinvention of Africa by its philosophers, to undermine the Africa that Europe invented. The content of Lewis Gordon’s essay “African Philosophy’s search for Identity: Existential consideration of a recent effort”, and the works of Outlaw and Sogolo suggest a craving for a new line of development for African philosophy: a new approach which is to be critical, engaging and universal while still being African. This, in particular, is the spirit of conversational thinking, which was beginning to grip African philosophers in the late 1990s when Gordon wrote his paper. By the turn of the millennium, influences from these thoughts crystallized into a new mode of thinking, which then metamorphosed into conversational philosophy. The New Era in African philosophy was thus heralded. The focus and orientation of this New Era became conversational philosophy.

d. New Era

This period of African philosophy began in the late 1990s and took shape by the turn of the millennium. The orientation of this period is conversational philosophy, so conversationalism is the movement that thrives in this period. The University of Calabar has emerged as the international headquarters of this new movement, hosting various workshops, colloquia and conferences in African philosophy under the auspices of a revolutionary forum called The Conversational/Calabar School of Philosophy. This forum can fairly be described as revolutionary for the radical way it turned the fortunes of African philosophy around. When different schools and actors were still groping about, the new school provided a completely new and authentically African approach to doing philosophy. Hinged on the triple principles of relationality (that variables necessarily interrelate), contextuality (that the relationships of variables occur in contexts) and complementarity (that seemingly opposed variables can complement rather than merely contradict), its members formulated new methodologies (complementary reflection and conversational method) and developed original systems to inaugurate a new era in the history of African philosophy.

The Calabar School begins its philosophical inquiry with the assumptions that a) relationships are central to understanding the nature of reality, and b) each of these relationships must be contextualized and studied as such. They also identify border lines as the main problem of the 21st century. By border lines, they mean the divisive lines we draw between realities in order to establish them as binary opposites. These lines lead to all marginal problems such as racism, sexism, classism, creedoism, etc. To address these problems, they raise two questions: does difference amount to inferiority? And, are opposites irreconcilable? In the Calabar School of Philosophy, some prominent theories have emerged to respond to the border lines problem and the two questions that trail it. Some theoretic contributions of the Calabar School include uwa ontology (Pantaleon Iroegbu), ibuanyidanda (complementary philosophy) (Innocent Asouzu), harmonious monism (Chris Ijiomah), Njikoka philosophy (Godfrey Ozumba), conceptual mandelanism (Mesembe Edet), conversational thinking (Jonathan Chimakonam), consolation philosophy (Ada Agada), predeterministic historicity (Aribiah Attoe), personhood-based theory of right action (Amara Chimakonam), etc. All these theories speak to the method of conversational philosophy. Conversational philosophy is defined by the focus on studying relationships existing between variables and active engagement between individual African philosophers in the creation of critical narratives therefrom, through engaging the elements of tradition or straightforwardly by producing new thoughts or by engaging other individual thinkers. It thrives on incessant questioning geared toward the production of new concepts, opening up new vistas and sustaining the conversation.

Some of the African philosophers whose works follow this trajectory ironically have emerged in the Western world, notably in America. The American philosopher Jennifer Lisa Vest is one of them. Another one is Bruce Janz. These two, to name a few, suggest that the highest purification of African philosophy is to be realized in conversational-styled philosophizing. However, it was the Nigerian philosopher Innocent Asouzu who went beyond the earlier botched attempt of Leopold Senghor and transcended the foundations of Pantaleon Iroegbu and C. S. Momoh to erect a new model of African philosophy that is conversational. The New Era, therefore, is the beginning of conversational philosophy.

Iroegbu, in his Metaphysics: The Kpim of Philosophy, inaugurated the reconstructive and conversational approach in African philosophy. He studied the relationships between the zones and connotations of uwa. From the preceding, he engaged previous writers in a critical conversation out of which he produced his own thought (Uwa ontology), bearing the stamp of African tradition and thought systems but remarkably different in approach and method from ethnophilosophy. Frantz Fanon has highlighted the importance of sourcing African philosophical paraphernalia from African indigenous culture. This is corroborated in a way by Lucius Outlaw in his African Philosophy: Deconstructive and Reconstructive Challenges. In it, Outlaw advocates the deconstruction of European-invented Africa, to be replaced by a reconstruction done by conscientious Africans free from the grip of colonial mentality (1996: 11). Whereas Wiredu’s crusade sought to deconstruct the invented Africa, actors in the New Era of African philosophy seek to reconstruct through a conversational approach.

Iroegbu and Momoh inaugurated this drive, but it is Asouzu who has made the most of it. His theory of Ibuanyidanda ontology or complementary reflection maintains that “to be” simply means to be in a mutual, complementary relationship (2007: 251-55). Every being, therefore, is a variable with the capacity to join a mutual interaction. In this capacity, every being alone is seen as a missing link and serving a missing link of reality in the network of realities. One immediately suspects the apparent contradiction that might arise from the fusion of two opposed variables when considered logically. But the logic of this theory is not the two-valued classical logic but the three-valued system of logic developed in Africa (cf. Asouzu 2004, 2013; Ijiomah 2006, 2014, 2020; Chimakonam 2012, 2013, 2014a, 2017, 2018, 2019, 2020). In this, the two standard values are sub-contraries rather than contradictories, thereby facilitating effective complementation of variables. The possibility of the two standard values merging to form the third value in the complementary mode is what makes Ezumezu logic, one of the systems developed in the Calabar School, a powerful tool of thought.

A good number of African philosophers are tuning their works into the pattern of conversational style. Elsewhere in Africa, Michael Eze, Fainos Mangena, Bernard Matolino, Motsamai Molefe, Anthony Oyowe, Thaddeus Metz and Leonhard Praeg are doing this when they engage with the idea of ubuntu ethics and ontology, except that they fall short of studying relationships. Like all these scholars, the champions of the new conversational orientation are building the new edifice by reconstructing the deconstructed domain of thought in the later period of African philosophy. The central approach is conversation as a relational methodology. By studying relationships and engaging other African philosophers, entities or traditions in creative struggle, they hope to reconstruct the deconstructed edifice of African philosophy. Hence, the New Era of African philosophy is safe from the retrogressive, perverse dialogues which characterized the early and middle periods.

Also, with the critical deconstruction that occurred in the latter part of the middle period and the attendant eclecticism that emerged in the later period, the stage was set for the formidable reconstructions and conversational encounters that marked the arrival of the New Era of African philosophy.

8. Conclusion

The development of African philosophy through the periods yields two vital conceptions for African philosophy, namely that African philosophy is a critical engagement of tradition and individual thinkers on the one hand, and on the other hand, it is also a critical construction of futurity. When individual African philosophers engage tradition critically in order to ascertain its logical coherency and universal validity, they are doing African philosophy. And when they employ the tools of African logic in doing this, they are doing African philosophy. On the second conception, when African philosophers study relationships and engage in critical conversations with one another and in the construction of new thoughts in matters that concern Africa but which are nonetheless universal and projected from African native thought systems, they are doing African philosophy. So, the authentic African philosophy is not just a future project; it can also continue from the past.

On the whole, this essay discussed the journey of African philosophy from the beginning and focused on the criteria, schools and movements in African philosophical tradition. The historical account of the periods in African philosophy began with the early period through to the middle, the later and finally, the new period. These periods of African philosophy were covered, taking particular interest in the robust, individual contributions. Some questions still trail the development of African philosophy, among them: “Must African philosophy be tailored to the pattern of Western philosophy, even in less definitive issues? If African philosophy is found to be different in approach from Western philosophy, so what? Are logical issues likely to play any major roles in the structure and future of African philosophy? What is the future direction of African philosophy? Is the problem of the language of African philosophy pregnant? Would conversations in contemporary African philosophy totally eschew perverse dialogue? What shall be the rules of engagement in African philosophy?” These questions are likely to shape the next lines of thought in African philosophy.

9. References and Further Reading

  • Abanuka, Batholomew. A History of African Philosophy. Enugu: Snaap Press, 2011.
    • An epochal discussion of African philosophy.
  • Abraham, William. The Mind of Africa. Chicago: University of Chicago Press, 1962.
    • A philosophical discussion of culture, African thought and colonial times.
  • Achebe, Chinua. Morning yet on Creation Day. London: Heinemann, 1975.
    • A philosophical treatment of African tradition and colonial burden.
  • Anyanwu, K. C. “Philosophical Significance of Myth and Symbol in Dogon World-view”. C. S. Momoh ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
    • A discussion of the philosophical elements in an African culture.
  • Akesson, Sam. K. “The Akan Concept of Soul”. African Affairs: The Journal of the Royal African Society, 64(257), 280-291.
    • A discourse on African metaphysics and philosophy of religion.
  • Akiode, Olajumoke. “African philosophy, its questions, the place and the role of women and its disconnect with its world”. African Philosophy and the Epistemic Marginalization of Women; edited by Jonathan O. Chimakonam and Louise du Toit. Routledge, 2018.
    • A critical and Afro-feminist discussion of the communalist orientation in African philosophy.
  • Aristotle. Metaphysica, Translated into English under the editorship of W. D. Ross, M.A., Hon. LL.D (Edin.) Oxford. Vol. VIII, Second Edition, OXFORD at the Clarendon Press 1926. Online Edition. 982b.
    • A translation of Aristotle’s treatise on metaphysics.
  • Asouzu, Innocent. I. Ibuanyidanda: New Complementary Ontology Beyond World-Immanentism, Ethnocentric Reduction and Impositions. Litverlag, Münster, Zurich, New Brunswick, London, 2007.
    • An African perspectival treatment of metaphysics or the theory of complementarity of beings.
  • Asouzu, Innocent. I. The Method and Principles of Complementary, Calabar University Press, 2004.
    • A formulation of the method and theory of Complementary Reflection.
  • Asouzu, Innocent. I. Ibuanyidanda (Complementary Reflection) and Some Basic Philosophical Problems in Africa Today. Sense Experience, “ihe mkpuchi anya” and the Super-maxim. Litverlag, Münster, Zurich, Vienna, 2013.
    • A further discussion on the theory, method and logic of complementary Reflection.
  • Attoe, Aribiah David. “Examining the Method and Praxis of Conversationalism,” in Chimakonam Jonathan O., E Etieyibo, and I Odimegwu (eds). Essays on Contemporary Issues in African Philosophy. Cham: Springer, 2022.
    • A broad examination of the method of conversational thinking.
  • Babalola, Yai. “Theory and Practice in African Philosophy: The Poverty of Speculative Philosophy. A Review of the Work of P. Hountondji, M. Towa, et al.” Second Order, 2. 2. 1977.
    • A critical review of Hountondji and Towa.
  • Bello, A. G. A. Philosophy and African Language. Quest: Philosophical Discussions. An International African journal of Philosophy, Vol 1, No 1, Pp5-12, 1987.
    • A critical engagement on the subject of language of philosophy.
  • Betts, Raymond. Assimilation and Association in French Colonial Territory 1890 to 1915. (First ed. 1961), Reprinted. Nebraska: University of Nebraska Press, 2005
    • A discourse on French colonial policies.
  • Bodunrin, Peter. “The Question of African Philosophy”. Richard Wright (ed) African Philosophy: An Introduction 3rd ed. Lanham: UPA, 1984.
    • A discourse on the nature and universal conception of African philosophy.
  • Césaire, Aimé. Return to My Native Land. London: Penguin Books, 1969.
    • A presentation of colonial impact on the mind of the colonized.
  • Chimakonam, Jonathan. O. “On the System of Conversational Thinking: An Overview”, Arụmarụka: Journal of Conversational Thinking, 1(1), 2021, pp1-45.
    • A detailed discussion of the main components of Conversational Thinking.
  • Chimakonam Jonathan O. Ed. Logic and African Philosophy: Seminal Essays in African Systems of Thought. Delaware: Vernon Press, 2020.
    • A collection of selected seminal papers on the African logic debate.
  • Chimakonam, Jonathan O. Ezumezu A System of Logic for African Philosophy and Studies. Cham. Springer Nature, 2019.
    • A theoretic formulation of the system of Ezumezu logic.
  • Chimakonam, Jonathan, O. The ‘Demise’ of Philosophical Universalism and the Rise of Conversational Thinking in Contemporary African Philosophy. Method, Substance, and the Future of African Philosophy, ed. Etieyibo Edwin. 135-160. Cham. Springer Nature, 2018.
    • A critique of philosophical universalism.
  • Chimakonam Jonathan O. “Conversationalism as an Emerging Method of Thinking in and Beyond African Philosophy,” Acta Academica, 2017a. pp11-33, Vol 2.
    • A methodological presentation of Conversational thinking.
  • Chimakonam Jonathan O. “What is Conversational Philosophy? A Prescription of a New Theory and Method of Philosophising in and Beyond African Philosophy,” Phronimon, 2017b. pp115-130, Vol 18.
    • An intercultural formulation of the Conversational method.
  • Chimakonam, Jonathan, O. The Criteria Question in African Philosophy: Escape from the Horns of Jingoism and Afrocentrism. Atuolu Omalu: Some Unanswered Questions in Contemporary African Philosophy, ed. Jonathan O. Chimakonam. Pp101-123. University Press of America: Lanham, 2015a.
    • A discussion of the Criteria of African philosophy.
  • Chimakonam, Jonathan, O. Addressing Uduma’s Africanness of a Philosophy Question and Shifting the Paradigm from Metaphilosophy to Conversational Philosophy. Filosofia Theoretica: Journal of African Philosophy, Culture and Religions, Vol 4. No 1. 2015b, 33-50.
    • An engagement with Uduma on his Africanness of philosophy question from a conversational viewpoint.
  • Chimakonam, Jonathan, O. Conversational Philosophy as a New School of Thought in African Philosophy: A Conversation with Bruce Janz on the Concept of Philosophical Space. Confluence: Online Journal of World Philosophies. 9-40, 2015c.
    • A rejoinder to Bruce Janz on the concept of philosophical space.
  • Chimakonam Jonathan O. “Transforming the African philosophical place through conversations: An inquiry into the Global Expansion of Thought (GET)”, in South African Journal of Philosophy, Vol. 34, No. 4. 2015d, 462-479.
    • A formulation of some basic principles of conversational thinking.
  • Chimakonam, O. Jonathan. “Ezumezu: A Variant of Three-valued Logic—Insights and Controversies”. Paper presented at the Annual Conference of the Philosophical Society of Southern Africa. Free State University, Bloemfontein, South Africa. Jan. 20-22, 2014.
    • An articulation of the structure of Ezumezu/African logic tradition.
  • Chimakonam, O. Jonathan. “Principles of Indigenous African Logic: Toward Africa’s Development and Restoration of African Identity” Paper presented at the 19th Annual Conference of International Society for African Philosophy and Studies [ISAPS], ‘50 Years of OAU/AU: Revisiting the Questions of African Unity, Identity and Development’. Department of Philosophy, Nnamdi Azikiwe University, Awka. 27th – 29th May, 2013.
    • A presentation of the principles of Ezumezu/African logic tradition.
  • Chimakonam, O. Jonathan. “Integrative Humanism: Extensions and Clarifications”. Integrative Humanism Journal. 3.1, 2013.
    • Further discussions on the theory of integrative humanism.
  • Chimakonam Jonathan O. and Uti Ojah Egbai. “The Value of Conversational Thinking in Building a Decent World: The Perspective of Postcolonial Sub-Saharan Africa”, in Dialogue and Universalism, Vol XXVI No 4. 105-117, 2016.
  • Danquah, J.B. Gold Coast: Akan laws and customs and the Akim Abuakwa constitution. London: G. Routledge & Sons, 1928.
    • A discourse on African philosophy of law.
  • Danquah, J.B. The Akan doctrine of God: a fragment of Gold Coast ethics and religion. London: Cass, 1944.
    • A discourse on African metaphysics, ethics and philosophy of religion.
  • Diop, Cheikh Anta. The African Origin of Civilization: Myth or Reality. Mercer Cook Transl. New York: Lawrence Hill & Company, 1974.
  • Du Bois, W. E. B. The Souls of Black Folk. (1903). New York: Bantam Classic edition, 1989.
    • A discourse on race and cultural imperialism.
  • Edeh, Emmanuel. Igbo Metaphysics. Chicago: Loyola University Press, 1985.
    • An Igbo-African discourse on the nature of being.
  • Egbai, Uti Ojah & Jonathan O. Chimakonam. Why Conversational Thinking Could be an Alternative Method for Intercultural Philosophy, Journal of Intercultural Studies, 40:2, 2019. 172-189.
    • A discussion of conversational thinking as a method of intercultural philosophy.
  • Enyimba, Maduka. “On how to do African Philosophy in African Language: Some Objections and Extensions”. Philosophy Today, 66. 1, 2022. Pp. 25-37.
    • A discussion on how to do African philosophy using an African language.
  • Ekwealor, C. “The Igbo World-View: A General Survey”. The Humanities and All of Us. Emeka Oguegbu (ed) Onitsha: Watchword, 1990.
    • A philosophical presentation of Igbo life-world.
  • Etuk, Udo. “The Possibility of African logic”. The Third Way in African Philosophy, Olusegun Oladipo (ed). Ibadan: Hope Publications, 2002.
    • A discussion of the nature and possibility of African logic.
  • Fayemi, Ademola K. “African Philosophy in Search of Historiography”. Nokoko: Journal of Institute of African Studies. 6. 2017. 297-316.
    • A historiographical discussion of African philosophy.
  • Fanon, Frantz. The Wretched of the Earth. London: The Chaucer Press, 1965.
    • A critical discourse on race and colonialism.
  • Graness, Anke. “Writing the History of Philosophy in Africa: Where to Begin?”. Journal of African Cultural Studies. 28. 2. 2015. 132-146.
    • A Eurocentric historicization of African philosophy.
  • Graness, A., & Kresse, K. eds., Sagacious Reasoning: H. Odera Oruka in memoriam, Frankfurt: Peter Lang, 1997.
    • A collection of articles on Oruka’s Sage philosophy.
  • Griaule, Marcel. Conversations with Ogotemmêli, London: Oxford University Press for the International African Institute, 1965.
    • An interlocutory presentation of African philosophy.
  • Gyekye, Kwame. An Essay in African Philosophical Thought: The Akan Conceptual Scheme. Cambridge: Cambridge University Press, 1987.
    • A discussion of philosophy from an African cultural view point.
  • Hallen, Barry. A Short History of African Philosophy. Bloomington: Indiana University Press, 2002.
    • A presentation of the history of African philosophy from thematic and personality perspectives.
  • Hallen, B. and J. O. Sodipo. Knowledge, Belief and Witchcraft: Analytic Experiments in African Philosophy. Palo Alto, CA: Stanford University Press, 1997.
    • An analytic discourse of the universal nature of themes and terms in African philosophy.
  • Hebga, Meinrad. “Logic in Africa”. Philosophy Today, Vol.11 No.4/4 (1958).
    • A discourse on the structure of African logical tradition.
  • Hegel, Georg. Lectures on the Philosophy of World History. Cambridge: Cambridge University Press, reprint 1975.
    • Hegel’s discussion of his philosophy of world history.
  • Horton, Robin. “African Traditional Religion and Western Science” in Africa 37: 1 and 2, 1967.
    • A comparison of African and Western thought.
  • Horton, Robin. “Traditional Thought and the Emerging African Philosophy Department: A Comment on the Current Debate” in Second Order: An African Journal of Philosophy vol. III No. 1, 1977.
    • A logical critique of the idea of African philosophy.
  • Hountondji, Paulin. African Philosophy: Myth and Reality. Second Revised ed. Bloomington: Indiana University Press, 1996.
    • A critique of ethnophilosophy and an affirmation of African philosophy as a universal discourse.
  • Hunnings, Gordon. “Logic, Language and Culture”. Second Order: An African Journal of Philosophy, Vol.4, No.1. (1975).
    • A critique of classical logic and its laws in African thought and a suggestion of African logical tradition.
  • Ijiomah, Chris. “An Excavation of a Logic in African World-view”. African Journal of Religion, Culture and Society. 1. 1. (August, 2006): pp.29-35.
    • An extrapolation on a possible African logic tradition.
  • Iroegbu, Pantaleon. Metaphysics: The Kpim of Philosophy. Owerri: International Universities Press, 1995.
    • A conversational presentation of theory of being in African philosophy.
  • Jacques, Tomaz. “Philosophy in Black: African Philosophy as a Negritude”. Discursos Postcoloniales Entorno Africa. CIEA7, No. 17, 7th Congress of African Studies.
    • A critique of the rigor of African philosophy as a discipline.
  • James, George. Stolen Legacy: Greek Philosophy is Stolen Egyptian Philosophy. New York: Philosophical Library, 1954.
    • A philosophical discourse on race, culture, imperialism and colonial deceit.
  • Jahn, Janheinz. Muntu: An Outline of Neo-African Culture. New York: Grove Press, 1961.
    • A presentation of a new African culture as a synthesis and as philosophically relevant and rational.
  • Jewsiewicki, Bogumil. “African Historical Studies: Academic Knowledge as ‘usable past’ and Radical Scholarship”. The African Studies Review. Vol. 32. No. 3, December, 1989.
    • A discourse on the value of African tradition to modern scholarship.
  • Kanu, Ikechukwu. ‘Trends in African Philosophy: A Case for Eclectism.’ Filosofia Theoretica: A Journal of African Philosophy, Culture and Religion, 2(1), 2013. pp. 275-287.
    • A survey of the trends in African philosophy with a focus on Eclectism.
  • Keita, Lansana. “The African Philosophical Tradition”. Wright, Richard A., ed. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
    • An examination of African philosophical heritage.
  • Keita, Lansana. “Contemporary African Philosophy: The Search for a Method”. Tsanay Serequeberhan (ed) African Philosophy: The Essential Readings. New York: Paragon House, 1991.
    • An analysis of methodological issues in and basis of African philosophy.
  • Kezilahabi, Euphrase. African Philosophy and the Problem of Literary Interpretation. Unpublished Ph.D Dissertation. University of Wisconsin, Madison, 1985.
    • A doctoral dissertation on the problem of literary interpretation in African philosophy.
  • Lambert, Michael. “From Citizenship to Négritude: Making a Difference in Elite Ideologies of Colonized Francophone West Africa”. Comparative Studies in Society and History, Vol. 35, No. 2. (Apr., 1993), pp. 239–262.
    • A discourse on the problems of colonial policies in Francophone West Africa.
  • Lewis Gordon. “African Philosophy’s Search for Identity: Existential Considerations of a recent Effort”. The CLR James Journal, Winter 1997, pp. 98-117.
    • A survey of the identity crisis of African philosophical tradition.
  • Leo Apostel. African Philosophy. Belgium: Scientific Publishers, 1981.
    • An Afrocentrist presentation of African philosophy.
  • Levy-Bruhl, Lucien. Primitive Mentality. Paris: University of France Press, 1947.
    • A Eurocentrist presentation of non-European world.
  • Makinde, M.A. “Philosophy in Africa”. The Substance of African Philosophy. C.S. Momoh. Ed. Auchi: African Philosophy Projects’ Publications. 2000.
    • A discourse on the practice and relevance of philosophy in Africa.
  • Mangena, Fainos. “The Fallacy of Exclusion and the Promise of Conversational Philosophy in Africa”, in Chimakonam Jonathan O., E Etieyibo, and I Odimegwu (eds). Essays on Contemporary Issues in African Philosophy. Cham: Springer, 2022.
    • A discourse on the significance of conversational thinking.
  • Masolo, D. A. African Philosophy in Search of Identity. Bloomington: Indiana University Press, 1994.
    • An individual-based presentation of the history of African philosophy.
  • Maurier, Henri. “Do We have an African Philosophy?”. Wright, Richard A., ed. 1984. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
    • A critique of Ethnophilosophy as authentic African philosophy.
  • Mbiti, John. African Religions and Philosophy. London: Heinemann, 1969.
    • A discourse on African philosophical culture.
  • Momoh, Campbell. “Canons of African Philosophy”. Paper presented at the 6th Congress of the Nigerian Philosophical Association. University of Ife, July 31- August 3, 1985.
    • A presentation of the major schools of thought in African philosophy.
  • Momoh, Campbell, ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
    • A collection of essays on different issues in African philosophy.
  • Momoh, Campbell. “The Logic Question in African Philosophy”. C. S. Momoh ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
    • A defense of the thesis of a possible African logic tradition.
  • Mosima, P. M. Philosophic Sagacity and Intercultural Philosophy: Beyond Henry Odera Oruka, Leiden/Tilburg: African Studies Collection 62/Tilburg University, 2016.
  • Mudimbe, V. Y. The Invention of Africa: Gnosis, Philosophy and the Order of Knowledge (African Systems of Thought). Bloomington: Indiana University Press, 1988.
    • A discourse on culture, race, Eurocentrism and modern Africa as an invention of Western scholarship.
  • Nkrumah, Kwame. I Speak of Freedom: A Statement of African Ideology. London: Mercury Books, 1961.
    • A discourse on political ideology for Africa.
  • Nkrumah, Kwame. Towards Colonial Freedom. London: Heinemann. (First published in 1945), 1962.
    • A discussion of colonialism and its negative impact on Africa.
  • Nwala, Uzodinma. Igbo Philosophy. London: Lantern Books, 1985.
    • An Afrocentrist presentation of Igbo-African philosophical culture.
  • Nyerere, Julius. Freedom and Unity. Dar es Salaam: Oxford University Press, 1986.
    • A discussion of a postcolonial Africa that should thrive on freedom and unity.
  • Nyerere, Julius. Freedom and Socialism. Dar es Salaam: Oxford University Press, 1986.
    • A discourse on the fundamental traits of African socialism.
  • Nyerere, Julius. Ujamaa—Essays on Socialism. Dar-es-Salaam, Tanzania: Oxford University Press, 1986.
    • A collection of essays detailing the characteristics of African brand of socialism.
  • Obenga, Theophile. “Egypt: Ancient History of African Philosophy”. A Companion to African Philosophy. Ed. Kwasi Wiredu. Malden: Blackwell Publishing, 2004.
    • An Afrocentrist historicization of African philosophy.
  • Oelofsen, Rianna. “Women and ubuntu: Does ubuntu condone the subordination of women?” African Philosophy and the Epistemic Marginalization of Women; edited by Jonathan O. Chimakonam and Louise du Toit. Routledge, 2018.
    • A feminist discourse on ubuntu.
  • Ogbalu, F.C. Ilu Igbo: The Book of Igbo Proverbs. Onitsha: University Publishing Company, 1965.
    • A philosophical presentation of Igbo-African proverbs.
  • Ogbonnaya, L. Uchenna. “How Conversational Philosophy Profits from the Particularist and the Universalist Agenda”, in Chimakonam Jonathan O., E Etieyibo, and I Odimegwu (eds). Essays on Contemporary Issues in African Philosophy. Cham: Springer, 2023.
    • A conversational perspective on particularism and universalism.
  • Oguejiofor, J. Obi. “African Philosophy: The State of its Historiography”. Diogenes. 59. 3-4, 2014. 139-148.
    • A Euro-historical adaptation of African philosophy.
  • Ogunmodede, Francis. 1998. ‘African philosophy in African language.’ West African Journal of Philosophical Studies, Vol 1. Pp. 3-26.
    • A discourse on doing African philosophy in African languages.
  • Okeke, J. Chimakonam. “Why Can’t There be an African logic?”. Journal of Integrative Humanism. 1. 2. (2011). 141-152.
    • A defense of a possible African logic tradition and a critique of critics.
  • Okere, Theophilus. “The Relation between Culture and Philosophy,” in Uche 2 1976.
    • A discourse on the differences and similarities between culture and philosophy.
  • Okere, Theophilus. African Philosophy: A Historico-Hermeneutical Investigation of the Conditions of Its Possibility. Lanham, Md.: University Press of America, 1983.
    • A hermeneutical discourse on the basis of African philosophy.
  • Okolo, Chukwudum B. Problems of African Philosophy. Enugu: Cecta Nigeria Press, 1990.
    • An x-ray of the major hindrances facing African philosophy as a discipline.
  • Okoro, C. M. African Philosophy: Question and Debate, A Historical Study. Enugu: Paqon Press, 2004.
    • A historical presentation of the great debate in African philosophy.
  • Oladipo, Olusegun. (ed) The Third Way in African Philosophy. Ibadan: Hope, 2002.
    • A collection of essays on the topical issues in African philosophy of the time.
  • Oladipo, Olusegun. Core Issues in African Philosophy. Ibadan: Hope Publications, 2006.
    • A discussion of central issues of African philosophy.
  • Olela, Henry. “The African Foundations of Greek Philosophy”. Wright, Richard A., ed. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
    • An Afrocentrist presentation of African philosophy as the source of Greek philosophy.
  • Oluwole, Sophie. Philosophy and Oral Tradition. Lagos: Ark Publications, 1999.
    • A cultural excavationist programme in African philosophy.
  • Omoregbe, Joseph. “African Philosophy: Yesterday and Today”. African Philosophy: An Anthology. Emmanuel Eze (ed.), Massachusetts: Blackwell, 1998.
    • A survey of major issues in the debate and a critique of the Universalist school.
  • Onunwa, Udobata. “Humanism: The Bedrock of African Traditional Religion and Culture”. Religious Humanism. Vol. XXV, No. 2, Spring 1991, Pp 66 – 71.
    • A presentation of Humanism as the basis for African religion and culture.
  • Onyewuenyi, Innocent. African Origin of Greek Philosophy: An Exercise in Afrocentrism. Enugu: SNAAP Press, 1993.
    • An Afrocentrist presentation of philosophy as a child of African thought.
  • Oruka, H. Odera. “The Fundamental Principles in the Question of ‘African Philosophy,’ I.” Second Order 4, no. 1: 44–55, 1975.
    • A discussion of the main issues in the debate on African philosophy.
  • Oruka, H. Odera. “Four Trends in African Philosophy.” In Philosophy in the Present Situation of Africa, edited by Alwin Diemer. Wiesbaden, Germany: Franz Steiner Verlag. (First published in 1978), 1981.
    • A breakdown of the major schools of thought in the debate on African philosophy.
  • Oruka, H. Odera. Sage Philosophy: Indigenous Thinkers and the Modern Debate on African Philosophy. Leiden: E. J. Brill. 1990.
    • A survey of the journey so far in African philosophy and the identification of two additional schools of thought.
  • Osuagwu, I. Maduakonam. African Historical Reconsideration: A Methodological Option for African Studies, the North African Case of the Ancient History of Philosophy; Amamihe Lecture 1. Owerri: Amamihe, 1999.
    • A Euro-historical adaptation of African philosophy.
  • Outlaw, Lucius. “African ‘Philosophy’? Deconstructive and Reconstructive Challenges.” In his On Race and Philosophy. New York and London: Routledge. 1996.
    • A presentation of African philosophy as a tool for cultural renaissance.
  • Plato. Theætetus,155d, p.37.
    • Contains Plato’s theory of knowledge.
  • Presbey, G.M. “Who Counts as a Sage?, Problems in the future implementation of Sage Philosophy,” in: Quest- Philosophical Discussions: An International African Journal of Philosophy/ Revue Africaine Internationale de Philosophie, Vol.XI, No.1, 1997. 2:52-66.
  • Rettova, Alena. Afrophone Philosophies: Reality and Challenge. Zdenek Susa Stredokluky, 2007.
    • A Eurocentric discussion on Afrophone philosophies.
  • Ruch, E. A. and Anyanwu, K. C. African Philosophy: An Introduction to the Main Philosophical Trends in Contemporary Africa. Rome: Catholic Book Agency, 1981.
    • A discussion on racialism, slavery, colonialism and their influence on the emergence of African philosophy, in addition to current issues in the discipline.
  • Sogolo, Godwin. Foundations of African Philosophy. Ibadan: Ibadan University Press, 1993.
    • A discussion of the logical, epistemological and metaphysical grounds for African philosophy project.
  • Sogolo, Godwin. 1990. Options in African philosophy. Philosophy. 65. 251: 39-52.
    • A critical and eclectic proposal in African philosophy.
  • Tangwa, Godfrey. ‘Revisiting the Language Question in African Philosophy’. The Palgrave Handbook of African Philosophy. Eds. Adeshina Afolayan and Toyin Falola. Pp. 129-140. New York: Springer Nature, 2017.
    • A discourse on the language problem in African philosophy.
  • Tavernaro-Haidarian, Leyla. “Deliberative Epistemology: Towards an Ubuntu-based Epistemology that Accounts for A Priori Knowledge and Objective Truth,” South African Journal of Philosophy. 37(2), 229-242, 2018.
    • A conversational perspective on ubuntu-based epistemology.
  • Tempels, Placide. Bantu Philosophy. Paris: Presence Africaine.
    • A theorization on Bantu philosophy.
  • Towa, Marcien. “Conditions for the Affirmation of a Modern African Philosophical Thought”. Tsanay Serequeberhan (ed) African Philosophy: The Essential Readings. New York: Paragon House, 1991.
    • A presentation of important factors required for the emergence of African philosophy as a discipline.
  • Uduagwu, Chukwueloka. “Doing Philosophy in the African Place: A Perspective on the Language Challenge”. Jonathan Chimakonam et al (eds), Essays on Contemporary Issues in African Philosophy. Cham, Springer, 2023.
    • A discourse on the language problem in African philosophy.
  • Uduigwomen, F. Andrew. “Philosophy and the Place of African Philosophy”. A. F. Uduigwomen ed. From Footmarks to Landmarks on African Philosophy. 1995, 2nd Ed. 1995/2009. Lagos: O. O. P. 2009.
    • A collection of essays on different issues in African philosophy.
  • Uduma Orji. “Can there be an African Logic” in A. F. Uduigwomen (ed.) From Footmarks to Landmarks on African Philosophy. Lagos: O. O. P. Ltd, 2009.
    • A critique of a culture-bound logic in African thought.
  • Uduma Orji. “Between Universalism and Cultural Identity: Revisiting the Motivation for an African Logic”. A Paper delivered at an International Conference of the Council for Research in Values and Philosophy Washington D.C., USA at University of Cape Coast, Cape Coast Ghana 3–5 February, 2010.
    • A critique of a culture-bound logic in African thought and a presentation of logic as universal.
  • Van Hook, Jay M. “African Philosophy and the Universalist Thesis”. Metaphilosophy. 28. 4: 385-396, 1997.
    • A critique of the universalist thesis in African philosophy.
  • Van Hook, Jay M. “The Universalist Thesis Revisited: What Direction for African Philosophy in the New Millennium?” In Thought and Practice in African Philosophy, ed. G. Presbey, D. Smith, P. Abuya and O. Nyarwath, 87-93. Nairobi: Konrad Adenauer Stiftung, 2002.
    • A further critique of the universalist thesis in African philosophy.
  • Vest, J. L. 2009. ‘Perverse and necessary dialogues in African philosophy’, in: Thought and practice: a journal of the philosophical association of Kenya. New series, Vol.1 No.2, December, pp. 1-23.
    • A discussion of the proper direction and focus of African philosophy in the new age.
  • Wamba-ia Wamba, E. “Philosophy in Africa: Challenges of the African Philosopher,” in African Philosophy: The Essential Readings. New York: Paragon House, 1991.
    • A discussion of the technical problems of African philosophy as a discipline.
  • wa Thiong’o, Ngugi. Decolonizing the Mind: The Politics of Language in African Literature. London: J. Currey and Portsmouth, N. H: Heinemann, 1986.
    • A discourse on Eurocentrism, Africa’s decolonization and cultural imperialism.
  • Winch, Peter. “Understanding a Primitive Society”. American Philosophical Quarterly. No. 1, 1964.
    • A discussion and a defense of the rationality of primitive people.
  • Wiredu, Kwasi. Philosophy and an African Culture. Cambridge and New York: Cambridge University Press, 1980.
    • A discussion of the philosophical elements in an African culture and a call for a universalizable episteme for African philosophy.
  • Wiredu, Kwasi. “How Not to Compare African Thought with Western Thought.” Ch’Indaba no. 2 (July–December): 1976. 4–8. Reprinted in African Philosophy: An Introduction, edited by R. Wright. Washington, D.C.: University Press of America, 1977; and in African Philosophy: Selected Readings, edited by Albert G. Mosley. Englewood Cliffs, N.J.: Prentice Hall, 1995.
    • A critique of Robin Horton’s comparison of African and Western thought.
  • Wiredu, Kwasi. “Our Problem of Knowledge: Brief Reflections on Knowledge and Development in Africa”. African Philosophy as Cultural Inquiry. Ivan Karp and D. A. Masolo (ed). Bloomington, Indiana: Indiana University Press, 2000.
    • A discussion on the role of knowledge in the development of Africa.
  • Wiredu, Kwasi. Cultural Universals and Particulars: An African Perspective. Bloomington: Indiana University Press, 1996.
    • A collection of essays on sundry philosophical issues pertaining to comparative and cross-cultural philosophy.
  • Wiredu, Kwasi. Conceptual Decolonization in African Philosophy. Ed. Olusegun Oladipo. Ibadan: Hope Publications, 1995.
    • A discussion of the importance and relevance of the theory of conceptual decolonization in African philosophy.
  • Wiredu, Kwasi. “On Defining African Philosophy”. C. S. Momoh ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
    • A discourse on the parameters of the discipline of African philosophy.
  • Wright, Richard A., ed. “Investigating African Philosophy”. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
    • A critique of the existence of African philosophy as a discipline.

 

Author Information

Jonathan O. Chimakonam
Email: jchimakonam@unical.edu.ng
University of Calabar
Nigeria

Spinoza: Free Will and Freedom

Baruch Spinoza (1632-1677) was a Dutch Jewish rationalist philosopher who is most famous for his Ethics and Theological-Political Treatise. Although influenced by Stoicism, Maimonides, Machiavelli, Descartes, and Hobbes, among others, he developed distinct and innovative positions on a number of issues in metaphysics, epistemology, ethics, politics, biblical hermeneutics, and theology. He is also known as a pivotal figure in the development of Enlightenment thinking. Some of his most notorious claims and most radical views surround issues concerning determinism and free will. Spinoza was an adamant determinist, and he denied the existence of free will. This led to much controversy concerning his philosophy in subsequent centuries. He was, in fact, one of the first modern philosophers to both defend determinism and deny free will. Nevertheless, his philosophy champions freedom, both ethically and politically. It provides an ethics without free will but one that leads to freedom, virtue, and happiness. Prima facie, such an ethical project might seem paradoxical, but Spinoza distinguished between free will, which is an illusion, and freedom, which can be achieved. A thorough familiarity with Spinoza’s views on determinism, free will, freedom, and moral responsibility resolves this apparent paradox of an ethics without free will.

Table of Contents

  1. Spinoza’s Determinism
  2. Spinoza on Free Will
  3. Spinoza on Human Freedom
  4. The Free Man and the Way to Freedom
  5. Spinoza on Moral Responsibility
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Spinoza’s Determinism

Contrary to many of his predecessors and contemporaries, Spinoza is an adamant and notorious determinist. For him, nature is thoroughly determined. While there are many different varieties of determinism, Spinoza is committed to causal determinism, or what is sometimes called nomological determinism. Some commentators argue that Spinoza is also a necessitarian, that is, that he holds that the actual world is the only one possible (see IP33; for an overview, see Garrett 1991). In any case, as a causal determinist, Spinoza certainly argues that events are determined by previous events or causes (which are further determined by previous past events or causes, and so on) following the laws of nature. Spinoza clearly expresses that all events are determined by previous causes:

Every singular thing, or anything which is finite and has a determinate existence, can neither exist nor be determined to produce an effect unless it is determined to exist and produce an effect by another cause, which is also finite and has a determinate existence; and again, this cause can neither exist nor be determined to produce an effect unless it is determined to exist and produce an effect by another, which is also finite and has a determinate existence, and so on, to infinity. (IP28)

Here, Spinoza is arguing for an infinite chain of finite causes for any given effect, or, as he puts it, any singular thing which exists. Spinoza demonstrates the above proposition in his (in)famous geometrical method, which requires starting with definitions and axioms, demonstrating propositions from them, and building upon previous demonstrations. His commitment to causal determinism is already displayed in Axiom 3 of Part I: “From a given determinate cause, the effect follows necessarily; and conversely, if there is no determinate cause, it is impossible for an effect to follow.” Surprisingly, Spinoza uses only this axiom to demonstrate the preceding proposition, IP27: “A thing which has been determined by God to produce an effect cannot render itself undetermined.” His demonstration simply refers to Axiom 3: “This proposition is evident from A3.” So, it is clear that Spinoza thinks that every effect has a cause, but why he holds this view is not yet clear.

Understanding why Spinoza is committed to causal determinism requires an examination of his larger philosophical commitments. First, Spinoza is a rationalist, and as a rationalist, he holds that everything is, in principle, explainable or intelligible. This is to say that everything that exists and everything that occurs has a reason to be or to happen, and that this reason can be known and understood. This is known as the principle of sufficient reason, after Leibniz’s formulation. Secondly, Spinoza is committed to naturalism, at least a kind of naturalism that argues that there are no explanations or causes outside of nature. This is to say, there are no supernatural causes, and all events can be explained naturally with respect to nature and its laws. Spinoza’s rationalism and naturalism are in evidence when he argues for the necessary existence of the one infinite substance (IP11), God or Nature (Deus sive Natura), which is the immanent (IP18) and efficient cause (IP25) of all things.

The existence of everything cannot be a brute fact for Spinoza, nor does it make sense to him to postpone the reason for existence by referring to a personal God as the creator of all. Rather, he argues that the one substance (“God” or “Nature” in Spinoza’s terminology, but in the following just “God” with the caveat that Nature is implied) is the cause of itself and necessarily exists. “God, or a substance consisting of infinite attributes, each of which expresses eternal and infinite essence, necessarily exists” (IP11). In his alternate demonstration for this proposition, he explicitly uses the principle of sufficient reason: “for each thing there must be assigned a cause, or reason, both for its existence and for its nonexistence” (417). The one substance, or God, is the cause of itself, or, as he defines it “that whose essence involves existence, or that whose nature cannot be conceived except as existing” (ID1).

This necessary existence of God entails the necessity by which every individual thing is determined. This is because Spinoza is committed to substance monism, or the position that there is only one substance. This is markedly different from his rationalist predecessor, Descartes, who, though also arguing that only God is properly speaking an independent substance (Principles I, 51), held that there were indefinitely many substances of two kinds: bodies, or res extensae, and minds, or res cogitantes (Principles I, 52). Spinoza, though, defines God as one substance consisting of infinite attributes. An attribute is “what the intellect perceives of a substance as constituting its essence” (ID4). By “infinite” here, Spinoza refers primarily to a totality rather than a numerical infinity, so that the one substance has all possible attributes. Spinoza goes on to indicate that the human intellect knows two attributes, namely extension and thought (IIA5). Besides the one substance and its attributes, Spinoza’s ontology includes what he calls modes. Modes are defined as “affections of a substance or that which is in another thing through which it is also conceived” (ID5). Furthermore, Spinoza distinguishes between infinite modes (IP23) and finite modes, the latter generally taken to be all the singular finite things, such as apples, books, or dogs, as well as ideas of these things, thus also the human body and its mind.

There is much scholarly controversy about the question of how substance, attributes, and infinite and finite modes all relate to each other. Of particular contention is the relation between the finite modes and the one infinite substance. A more traditional interpretation of Spinoza’s substance monism takes finite modes to be parts of God, such that they are properties which inhere in the one substance, with the implication of some variety of pantheism, or the doctrine that everything is God. Edwin Curley, however, influentially argues that finite modes should be taken merely as causally and logically dependent on the one infinite substance, that is, God, which itself is causally independent, following Spinoza’s argument of substance as cause of itself or involving necessary existence (IP1-IP11). According to this interpretation, God is identified with its attributes (extension and thought) as the most general structural features of the universe with infinite modes, following necessarily from the attributes and expressing the necessary general laws of nature (for instance, Spinoza identifies the immediate infinite mode of the attribute of extension with motion and rest in Letter 64, 439). On this causal-nomological interpretation of substance, God is the cause of all things but should only be identified with the most general features of the universe rather than with everything existing, for instance the finite modes (Curley 1969, esp. 44-81).

There is, however, resistance to this causal interpretation of the relation between substance and finite modes (see Bennett 1984, 92-110; 1991; Nadler 2008). Jonathan Bennett argues against Curley’s interpretation, returning to the more traditional relation of modes as properties that inhere in a substance, by taking Spinoza’s proposition IP15 more literally: “Whatever is, is in God, and nothing can be, or be conceived without God.” Bennett identifies the finite modes as ways in which the attributes are expressed adjectively (for example, this region of extension is muddy), keeping closer to Spinoza’s use of “mode” as “affections of God’s attributes… by which God’s attributes are expressed in a specific and determinate way” (IP25C). But as Curley points out, Bennett’s interpretation has some difficulty explaining the precise relation of finite modes to infinite modes and attributes, the latter having an immediate causal relation to God (Curley 1991, 49). Leaving aside the larger interpretive controversies, the issue here is that God and its attributes, being infinite and eternal, cannot be the direct or proximate cause of finite modes, though God is the cause of everything, including finite modes. Spinoza writes “From the necessity of the divine nature there must follow infinitely many things in infinitely many modes (that is, everything that can fall under an infinite intellect)” (IP16). For this reason, Spinoza’s argument for determinism seems to recognize an infinite chain of finite causes and a finite chain of infinite causes. The former has already been referred to when Spinoza argues in IP28 that any particular finite thing is determined to exist or produce an effect by another finite cause “and so on, to infinity.” Indeed, in his demonstration, Spinoza states that God, being infinite and eternal, could not be the proximate cause of finite things. Further, in the Scholium to this proposition, Spinoza explains that God is the proximate cause of only those things produced immediately by him, which in turn are infinite and eternal (eternal here indicating necessity as in IP10S, 416). That is, Spinoza argues in IP21-P23 that whatever follows from the absolute nature of any of God’s attributes must likewise be infinite and eternal.

Some commentators interpret God as being the proximate cause (through its attributes) of the infinite modes, which are understood as part of the finite chain of infinite causes associated with the most basic laws of nature. While Spinoza does not write directly of the “laws of nature” in this discussion in the Ethics, he does so in the Theological-Political Treatise (TTP) in his discussion of miracles. Here Spinoza argues that nothing happens outside of the universal laws of nature, which for him are the same as God’s will and decree. Spinoza writes “But since nothing is necessarily true except by the divine decree alone, it follows quite clearly that the universal laws of nature are nothing but decrees of God, which follow from the necessity and perfection of the divine nature” (TTP VI.9). He goes on to argue that if a miracle were conceived as an occurrence contrary to the universal laws of nature, it would be contradictory in itself and mean that God was acting contrary to his own nature. From this passage, it is clear that Spinoza equates what follows from God’s nature with the universal laws of nature, which are eternal and immutable. For this reason, God’s attributes and the infinite modes are often identified with the most general features of the universe, expressing the laws of nature.

We tend to use “laws of nature” when referring to physical laws. Spinoza, however, holds that God can be understood under the attribute of extension or the attribute of thought, that is, God is both extended (IIP2) and thinking (IIP1). For this reason, laws of nature exist not only in the attribute of extension but also in that of thought. Bodies and ideas both follow the laws of nature. Bodies are finite modes of extension, while ideas are finite modes of thought. Accordingly, he argues that “the order and connection of ideas are the same as the order and connection of things” (IIP7). This is Spinoza’s famous “parallelism,” though he never uses this term. While there is much controversy concerning how to interpret this identity, Spinoza indicates that the extended thing and the thinking thing are one and the same thing expressed under two different attributes or conceived from two different perspectives (IIP7S). For this reason, a body, or an extended mode, and its correlating idea, or a thinking mode, are one and the same thing conceived from different perspectives, namely through the attributes of extension or thought.

This claim has two significant consequences. First, when Spinoza indicates that each singular finite thing is determined to exist and to produce an effect by another singular finite thing ad infinitum, this applies to ideas as well as bodies. For this reason, just as bodies and their motion or rest are the cause of other bodies and their motion or rest—in accordance with universal laws of nature, namely the laws of physics—ideas are the cause of other ideas (IIP9) in accordance with universal laws of nature, presumably psychological laws. Second, being one and the same thing, bodies and ideas do not interact causally. That is to say, the order and connection of ideas are one and the same as the order and connection of bodies, but ideas cannot bring about the motion or rest of bodies, nor can bodies bring about the thinking of ideas. Spinoza writes “The body cannot determine the mind to thinking, and the mind cannot determine the body to motion, to rest, or to anything else if there is anything else” (IIIP2). It is clear, then, that both bodies and ideas are causally determined within their respective attributes and that there is no interaction between them. This will have a significant consequence for Spinoza’s understandings of free will versus freedom.

Spinoza’s most challenging consequence from these positions is his blunt denial of contingency in IP29, where he states: “In nature there is nothing contingent, but all things have been determined from the necessity of the divine nature to exist and produce an effect in a certain way.” To recall, finite modes of the one infinite substance (in the case of the attributes of extension or thought, bodies and ideas) are determined to exist by a finite cause (that is, another body or idea), which is further determined to exist by another cause, and so on to infinity. Furthermore, though the connection between singular things and God (conceived as the one eternal, infinite substance) is complex, ultimately, God is the cause of everything that exists, and everything is determined according to the universal and necessary laws of nature expressed by the infinite modes and the other fundamental features of the attributes of God, as mentioned above. In other words, for Spinoza, every event is necessitated by previous causes and the laws of nature.

2. Spinoza on Free Will

Because he is a determinist, Spinoza denies the existence of free will. In contemporary terms, this would make him an incompatibilist as well as a determinist. Contemporary discussions of free will center mostly on the question of whether free will, and thereby moral responsibility, is compatible with determinism. There are two dominant solutions to this problem. Incompatibilism claims that free will and/or moral responsibility are incompatible with determinism because the latter prohibits free choice and thus accountability. Some incompatibilists, namely libertarians, even claim that, because human beings do have free will and we hold each other accountable for our actions, the world is not thoroughly determined. Other incompatibilists argue that if the world is determined, then there is no free will, but they may be agnostic about whether the world is determined. The opposite camp, compatibilism, claims that free will and/or moral responsibility are compatible with determinism, though compatibilists can also be agnostic about whether the world is determined.

Spinoza’s position cannot easily be sorted into this scheme because he distinguishes between free will (libera voluntas) and freedom (libertas). It is very clear that he denies free will because of his determinism: “In the mind there is no absolute, or free, will, but the mind is determined to will this or that by a cause which is also determined by another, and this again by another, and so to infinity” (IIP48). It is also, however, a consequence of Spinoza’s conception of the will. In the Scholium to IIP48, Spinoza explains that by “will” he means “a faculty of affirming or denying and not desire” (IIP48S, 484). That is to say, Spinoza, here, wants to emphasize will as a cognitive power rather than a conative one. In this respect, he seems to be following Descartes, who also understands the will as a faculty of affirming and denying, which, coupled with the understanding, produces judgements. However, Spinoza quickly qualifies against Descartes that the will is not, in fact, a faculty at all, but a universal notion abstracted from singular volitions: “we have demonstrated that these faculties are universal notions which are distinguished from the singulars from which we form them” (IIP48S, 484). Spinoza is here referring to his earlier explanation in the Ethics of the origin of “those notions called universals, like man, horse, dog, and the like” (IIP40S, 477). For Spinoza, these universal notions are imaginary or fictions that are formed “because so many images are formed at one time in the human body that they surpass the power of imagining.” The resulting universal notion combines what all of the singulars agree on and ignores distinctions.

Spinoza is making two bold and related claims here. First, there is no real faculty of will, that is, a faculty of affirming and denying. Rather, the will is a created fiction, a universal that adds to the illusion of free will. Second, the will is simply constituted by the individual volitions (our affirmations and denials), and these volitions are simply the very ideas themselves. For this reason, Spinoza claims that the will is the same as the intellect (or mind) (IIP49C). Therefore, it is not an ability to choose this or that as in the traditional understanding, and certainly not an ability to choose between alternative courses of action arbitrarily. It is not even an ability to affirm or deny, as Descartes claimed. Descartes, in explaining error in judgment, distinguishes the intellect from the will. Thus, with his claim that the will is the same as the intellect, Spinoza is directly criticizing the Cartesian view of free will. We will return to this criticism after examining Spinoza’s view of the human mind.

For Spinoza, the human mind is the idea of an actually existing singular thing (IIP11), namely the body (IIP13). So, for instance, my mind is the idea of my body. As mentioned above, Spinoza holds that the order and connection of ideas are the same as the order and connection of things (IIP7) insofar as God is understood through both the attribute of extension and the attribute of thought. This entails that for every body, there is an idea that has that body as its object, and this idea is one and the same as that body, although conceived under a different attribute. On the other hand, Spinoza also characterizes the human mind as a part of the infinite intellect of God (IIP11C) understood as the totality of ideas. For this reason, Spinoza explains that when the human mind perceives something, God has this idea “not insofar as he is infinite, but insofar as he is explained through the nature of the human mind, or insofar as he constitutes the essence of the human mind,” that is, as an affection or finite mode of the attribute of thought.

While Spinoza says the mind is the idea of the body, he also recognizes that the human body is considered an individual composed of multiple other bodies that form an individual body by the preservation of the ratio of motion and rest (II Physical Interlude, P1 and L5). Accordingly, every body that composes the individual’s body also has a correlative idea. Therefore, the mind is made up of a multitude of ideas just as the body is made up of a multitude of bodies (IIP15). Furthermore, when the human body interacts with the other bodies external to it, or has what Spinoza calls affections, ideas of these affections (the affections caused by external bodies in the individual human body) become part of the mind and the mind regards the external body as present (IIP16 and IIP17). These ideas of the affections, however, involve both the nature of the human body and that of the external body. Spinoza calls these “affections of the human body whose ideas present external bodies as present to us” images. He continues that “when the mind regards bodies in this way, we shall say that it imagines” (IIP17S, 465). Note here that Spinoza avers that images are the affections of the body caused by other bodies, and although they do not always “reproduce the figures of things”, he calls having the ideas of these affections of the body imagining.

As we can see, for Spinoza, the mind is a composite idea that is composed of ideas of the body and ideas of the body’s affections, which involve both the human body and the external body (and ideas of these ideas as well (IIP20)). Without these ideas of the affections of our body “the human mind does not know the human body, nor does it know that it exists, except through ideas of the affections by which the body is affected” (IIP19). At the same time, Spinoza explains that whenever the human mind perceives something, God has the idea of this thing together with the human mind (IIP11C); but God has the idea which constitutes the human mind only “insofar as he is considered to be affected by the idea of another singular thing” (IIP19D). That is, on the one hand, as explained in IP28, finite singular things come into existence or produce an effect by other finite singular things; on the other hand, to the extent that all things are modes of the one substance, each effect is at the same time caused by God. Though most of our knowledge of the body and the external world comes from ideas of affections, Spinoza claims that these ideas of the body and its affections are for the most part inadequate, that is, incomplete, partial, or mutilated, and therefore not clear and distinct. Spinoza writes “Insofar as he [God] also has the idea of another thing together with the human mind, we say that the human mind perceives the thing only partially, or inadequately” (IIP11C).

Spinoza argues that for the most part we only have inadequate knowledge (cognitio) of the state of our body, of external bodies that affect our body, and of our own mind (as ideas of ideas of our body) (IIP26C, IIP27, and IIP28). Our knowledge concerning our body and its affections and the external bodies affecting our body and our own mind is, therefore, limited in its distinctness. While it is not always entirely clear what Spinoza means by inadequate knowledge or an inadequate idea, he defines an adequate idea as “an idea which, insofar as it is considered in itself, without relation to an object, has all the denomination of a true idea” (IID4). Avoiding the epistemic problems of a correspondence theory of truth, Spinoza argues we can form adequate ideas insofar as “every idea which in us is absolute, or adequate and perfect, is true” (IIP34). An inadequate idea is an incomplete, partial, or mutilated idea, and Spinoza argues that “falsity consists in the privation of knowledge which inadequate, or mutilated and confused, ideas involve” (IIP35).

Returning to Spinoza’s claim that the will is the same as the intellect, the mind is just constituted by all the individual ideas. To say that the will is the same as the intellect means that, for Spinoza, the will as the sum of individual volitions is just the sum of these individual ideas which compose the mind. What Spinoza has in mind is that our ideas, which constitute our mind, already involve affirmations and negations. There is no special faculty needed. To give a simple example, while sitting in a café, I see my friend walk in, order a coffee, and sit down. Perceiving all this is to say that my mind has ideas of the affections of my body caused by external bodies (which is also to say that there is in God the idea of my mind together with the ideas of other things). All these ideas are inadequate, incomplete, or partial. Because I perceive my friend, the idea of the affection of my body affirms that she is present in the café, drinking coffee, sitting over yonder. I am not choosing to affirm these ideas, according to Spinoza, but the very ideas already involve affirmations. As I am distracted by other concerns, such as reading a book, these ideas continue to involve the affirmation of her being present in the café, regardless of whether that fact is true or not. If I look up and see her again, this new idea reaffirms her presence. But if I look up and she has gone, the new idea negates the previous idea.

Spinoza seems to hold that ideas involve beliefs. This is what Spinoza means when he says that the ideas themselves involve affirmations and negations. Rather than the will choosing to assent or deny things, the will is only the individual volitions that are in fact the individual ideas, which always already involve affirmation and/or negation. To be sure, even knowledge as simple as my friend’s presence will involve a complex of indefinite affirmations and negations, everything from the general laws of nature to mundane facts about daily life. A consequence of ideas as involving affirmation and negation is that error does not result from affirming judgments that are false but rather is a consequence of inadequate knowledge (IIP49SI, 485). Unfortunately, most of our ideas are inadequate. In the above example, it can easily be the case that I continue to have the idea of my friend’s presence when she is no longer in the café, because I will have this idea as long as no other idea negates it (IIP17C).

For Spinoza, therefore, the will is not free and is the same as the intellect. He is aware that this is a strange teaching, explicitly pointing out that most people do not recognize its truth. The reason for this failure to recognize the doctrine that the will is not free can, however, be understood both as an epistemic and a global confusion. Epistemically, most people do not understand that an idea involves an affirmation or negation, but they believe the will is free to affirm or deny ideas. According to Spinoza, “because many people either completely confuse these three – ideas, images, and words – or do not distinguish them accurately, or carefully enough, they have been completely ignorant of this doctrine concerning the will” (IIP49SII, 485-86). First, some people confuse ideas with images “which are formed in us from encounters with bodies.” Images, for Spinoza, are physical and extended, and are, therefore, not ideas. But these people take the ideas to be formed by the direct relation between the mind and body. This has two results: a) ideas of things of which no image can be formed are taken to be “only fictions which we feign from free choice of the will”. In other words, some ideas are not understood as ideas (which involve affirmation and negation) caused by other ideas but as choices of the free will; b) these people “look on ideas, therefore, as mute pictures on a panel,” which do not involve affirmation or negation but are affirmed and denied by the will. Second, some people confuse words with ideas or with the affirmation involved in the ideas. Here they confuse affirmations and negations with willfully affirming or denying in words. Spinoza points out that they cannot affirm or deny something contrary to what the very idea in the mind affirms or negates. They can only affirm or deny in words what is contrary to an idea. In the above example, I can deny in words that my friend is in the café, but these words will not be a negation of the idea which I had while perceiving her as being in the café. For Spinoza, images and words are both extended things and not ideas. This confusion, however, has hindered people from realizing that ideas in themselves already involve affirmations and negations.

Spinoza further explains these confusions and defends his view against possible objections. It is here that Spinoza launches his attack on the Cartesian defense of free will and its involvement in error. Before turning to these possible objections and Spinoza’s replies, a brief overview of Descartes’ view of the will is helpful. In Meditations 4, Descartes explains error through the different scopes of the intellect and the will. The former is limited since we only have limited knowledge, that is, clear and distinct ideas, while our will possibly extends to everything in application, and is thus infinite. Descartes writes, “This is because the will simply consists in our ability to do or not do something (that is, to affirm or deny, to pursue or avoid), or rather, it consists simply in the fact that when the intellect puts something forward for affirmation or denial, for pursuit or avoidance, our inclinations are such that we do not feel we are determined by any external force” (57). Descartes continues, however, that freedom of the will does not consist in indifference. The more the will is inclined toward the truth and goodness of what the intellect presents to it, the freer it is. Descartes’ remedy against error is the suspension of judgment whenever the intellect cannot perceive the truth or goodness clearly and distinctly. Descartes, therefore, understands the will as a faculty of choice, which can affirm or deny freely to make judgments upon ideas presented by the intellect. Though the will is freer when it is based on clear and distinct ideas, it still has an absolute power of free choice in its ability to affirm or deny.

Turning to the possible objections to Spinoza’s view of the will brought up in IIP49S, the first common objection concerns the alleged different scope of the intellect and the will. Spinoza disagrees that the “faculty of the will” has a greater scope than the “faculty of perception”. Spinoza argues that this only seems to be the case because: 1) if the intellect is taken to only involve clear and distinct ideas, then it will necessarily be more limited; and 2) the “faculty of the will” is itself a universal notion “by which we explain all the singular volitions, that is, it is what is common to them all” (488). Under this view of the will, the power of assenting seems infinite because it employs a universal idea of affirmation that seems applicable to everything. Nevertheless, this view of the will is a fiction. Against the second common objection, that we know from experience that we can suspend judgment, Spinoza denies that we have the power to do so. What actually happens when we seem to hold back our judgment is nothing but an awareness that we lack adequate ideas. Therefore, suspension of judgment is nothing more than perception and not an act of free volition. Spinoza provides examples to illustrate his argument, among them that of a child who imagines a winged horse. Unlike an adult, who has ideas that exclude the existence of winged horses, the child will not doubt the existence of the winged horse until he learns the inadequacy of such an idea. Spinoza is careful to note that perceptions themselves are not deceptive. But they do already involve affirmation independently of their adequacy. For this reason, if nothing negates the affirmation of a perception, the perceiver necessarily affirms the existence of what is perceived.

The third objection is that, since it seems that it is equally possible to affirm something which is true as to affirm something which is false, the affirmation cannot spring from knowledge but from the will. Therefore, the will must be distinct from the intellect. In reply to this, Spinoza reminds us that the will is something universal, which is ascribed to all ideas because all ideas affirm something. As soon as we turn to particular cases, the affirmation involved in the ideas is different. Moreover, Spinoza “denies absolutely” that we need the same power of thinking to affirm something as true which is true as we would need in the case of affirming something as true which is false. An adequate or true idea is perfect and has more reality than an inadequate idea, and therefore the affirmation involved in an adequate idea is different from that of an inadequate idea. Finally, the fourth objection refers to the famous Buridan’s ass, who is caught equidistantly from two piles of feed. A human in such an equilibrium, if he had no free will, would necessarily die. Spinoza, rather humorously, responds, “I say that I grant entirely that a man placed in such an equilibrium (namely, who perceives nothing but thirst and hunger and such food and drink as are equally distant from him) will perish of hunger and thirst. If they ask me whether such a man should be thought an ass rather than a man, I say that I do not know – just as I also do not know how highly we should esteem one who hangs himself, or children, fools, and madmen, and so on” (IIP49S, 490).

Besides answering the common objections to his identification of the will with the intellect, Spinoza also provides an explanation for the necessary origin of our illusionary belief that the will is free (see Melamed 2017). Spinoza alludes to this illusion a number of times. In the Ethics, it first occurs in the Appendix to Part 1 when he argues against natural teleology. He writes that,

All men are born ignorant of the causes of things, and that they all want to seek their own advantage and are conscious of this appetite. From these it follows, first, that men think themselves free, because they are conscious of their volitions and their appetites, and do not even think in their dreams, of the causes by which they are disposed to wanting and willing because they are ignorant of those causes. (440)

That is, because human beings are 1) ignorant of the causes of their volitions but 2) conscious of their desires, they necessarily believe themselves to be free. Hence, free will is an illusion born of ignorance. In his correspondence with Schuller, Spinoza provides a vivid image of the illusion of free will, writing that a stone, when put into motion, if it could judge, would believe itself free to move, though it is determined by external forces. The same holds for human beings’ belief in free will. Spinoza even writes that “because this prejudice is innate in all men, they are not so easily freed from it” (Letter 58, 428).

Spinoza has another extensive discussion of free will as a result of ignorance in the scholium of IIIP2 in the Ethics. The proposition states “The body cannot determine the mind to thinking, and the mind cannot determine the body to motion, to rest, or anything else (if there is anything else)” (IIIP2). Spinoza’s parallelism holds that the mind and the body are one and the same thing conceived through different attributes, so there is no inter-attribute causality. The order and connection of ideas are the same as the order and connection of bodies, but it is not possible to explain the movement of bodies in terms of the attribute of thought, nor is it possible to explain the thinking of ideas through the attribute of extension. Spinoza is well aware that this will be unacceptable to most people who believe their will is free and that it is the mind which causes the body to move: “They are so firmly persuaded that the body now moves, now is at rest, solely from the mind’s command, and that it does a great many things which depend only on the mind’s will and its art of thinking” (IIIP2S, 494-95).

Against this prejudice, Spinoza defends his position by pointing out 1) that human beings are so far quite ignorant of the mechanics of the human body and its workings (for instance, the brain) and 2) that human beings cannot explain how the mind can interact with the body. He further elucidates these points by responding to two objections taken from experience.

But they will say [i] that – whether or not they know by what means the mind moves the body – they still know by experience that unless the human mind were capable of thinking, the body would be inactive. And then [ii], they know by experience, that it is in the mind’s power alone both to speak and to be silent, and to do many other things, which they therefore believe to depend on the mind’s decision. (495)

In response to the first objection, Spinoza argues that while it is true that the body cannot move if the mind is not thinking, the contrary, that the mind cannot think if the body is inactive, is equally true, for they are, after all, one and the same thing conceived through different attributes. Against the great disbelief, though, that “the causes of buildings, of painting, and of things of this kind, which are made only of human skill, should be able to be deduced from the laws of Nature alone, insofar as it is considered corporeal” (496), Spinoza responds by reaffirming that humans are not yet aware of what the human body can do according to its own laws. He gives an interesting example of sleepwalkers doing all kinds of actions, none of which they recall when they are awake.

Concerning the second objection, that humans apparently speak (a physical action) from the free power of the mind and that this indicates that the mind controls the body, Spinoza states that humans have just as much control over their words as over their appetites. He points out that they can hold their tongue only in cases of a weak inclination to speak, just as they can resist indulgence in a weak inclination to certain pleasures. But when it comes to stronger inclinations, humans often suffer from akrasia, or weakness of will. Again, they believe themselves to be free when, in fact, they are driven by causes they do not know. He points to:

[The infant believing] he freely wants the milk; the angry child that he wants vengeance; and the timid, flight. So, the drunk believes it is from a free decision of the mind that he speaks the things he later, when sober, wishes he had not said. So, the madman, the chatterbox, the child, and a great many people of this kind believe they speak from a free decision of the mind, when really they cannot contain their impulse to speak. (496)

Here again, Spinoza argues that humans believe themselves free because they are conscious of their own desires but ignorant of the causes of them. Discussing the will with the body, he then states that, as bodies and minds are identical, decisions of the mind are the same as appetites and determinations of the body, understood under different attributes.

Finally, Spinoza points out that humans could not even speak unless they recollected words, though recollecting or forgetting itself is not at will, that is, by the free power of the mind. So it must be that the power of the mind consists only in deciding to speak or not to speak. However, Spinoza counters that often humans dream they are speaking and in their dreams believe that they do this freely, but they are not in fact speaking. In general, when humans are dreaming, they believe they are freely making many decisions, but in fact they are doing nothing. Spinoza asks pointedly:

So, I should very much like to know whether there are in the mind two kinds of decisions – those belonging to our fantasies and those that are free? And if we do not want to go that far in our madness, it must be granted that this decision of the mind, which is believed to be free, is not distinguished from the imagination itself, or the memory, nor is it anything beyond that affirmation which the idea, insofar as it is an idea, necessarily involves. And so the decisions of the mind arise by the same necessity as the idea of things which actually exist. Those, therefore, who believe that they speak or are silent or do anything from a free decision of the mind, dream with open eyes. (497)

One final point concerning the illusion of free will: Spinoza uses belief in free will as one of his examples of error in IIP35S. IIP35 states that “falsity consists in the privation of knowledge which inadequate, or mutilated and confused ideas, involve.” In the Scholium, he reiterates the now familiar cause of the belief in free will, namely, that humans are conscious of their volitions but ignorant of the causes which determine their volitions. However, Spinoza here is not just claiming that we have an inadequate knowledge of the causes of our volitions leading us to err in thinking the will is free. He makes the stronger claim that because our knowledge of the will is inadequate, we cannot help but imagine that our will is free, that is, we cannot help but experience our will as free in some way, even if we know that it is not.

This can be seen from the second example of error that he uses. When looking at the sun, we imagine that it is rather close. But, Spinoza argues, the problem is not just the error of judging the distance to be much smaller than it is. The problem is that we imagine (that is, we have an idea of the affection of our body affected by the sun) or experience the sun as being two hundred feet away regardless of whether we adequately know the true distance. Even knowing the sun’s true distance from our body, we will always experience it as being about two hundred feet away. Similarly, even if we adequately understand that our will is not free but that each of our volitions is determined, we will still experience it as free. The reason for this is explained in IIP48S, where Spinoza argues that the will, understood as an absolute faculty, is a “complete fiction or metaphysical being, or universal” which we form, however, necessarily. As mentioned above, universals are formed when the body is overloaded with images through affections, so that the power of imagining is surpassed and a notion is formed by focusing on similarities and ignoring a great many of the differences between its ideas. Spinoza’s point here in emphasizing the inevitability of error due to the prevalence of imagination and the limited scope of our reason is that humans cannot escape the illusion of free will.

3. Spinoza on Human Freedom

While Spinoza denies that the will is free, he does consider human freedom (libertas humana) as possible. Given the caveat just described, this freedom must be understood as limited. For Spinoza, freedom is the end of human striving. He often equates freedom with virtue, happiness, and blessedness (beatitudo), the more familiar end of human activity (for an overview, see Youpa 2010). Spinoza does not understand freedom as a capacity for choice, that is, as liberum arbitrium (free choice), but rather as consisting in acting as opposed to being acted upon. For Spinoza, freedom is ultimately constituted by activity. In Part I of the Ethics, Spinoza gives the definition: “that thing is called free which exists from the necessity of its nature alone, and is determined to act by itself alone. But a thing is called necessary, or rather compelled, which is determined by another thing to exist and produce an effect in a certain and determinate manner” (ID7). According to this definition, only God, properly speaking, is absolutely free, because only God exists from the necessity of his nature and is determined to act from his nature alone (IP17 and IP17C2). Nevertheless, Spinoza argues that freedom is possible for human beings insofar as they act: “I say we act when something happens, in us or outside of us, of which we are the adequate cause, that is, (by D1), when something in us or outside of us follows from our nature, which can be clearly and distinctly understood through it alone” (IIID2). IIID1 gives the definition of adequate cause: “I call that cause adequate whose effect can be clearly and distinctly perceived through it.” From these definitions, we can see that if human freedom is constituted by activity, then freedom will be constituted by having clear and distinct ideas or adequate knowledge.

Above, it was seen that for Spinoza, will and intellect are one and the same. The will is nothing but singular volitions, which are ideas. These ideas already involve affirmation and negation (commonly ascribed to the faculty of will). In Part II, when arguing against the Cartesian view of the will, Spinoza emphasizes the will as a supposed “faculty of affirming and denying” in order to dispel the universal notion of a free will. In Part III, in his discussion of affects, he provides a fuller description of the will and the affective nature of ideas, providing the tools for his discussion of human freedom. By “affect,” Spinoza understands “the affections of the body by which the body’s power of acting is increased or diminished, aided or restrained, and at the same time, the ideas of these affections” (IIID3). Accordingly, he concludes that “if we can be the adequate cause of any of these affections, I understand by the affect an action; otherwise, a passion.” There is thus a close connection between activity and adequate ideas, as well as between passions and inadequate ideas (IIIP3).

Since most of our knowledge involves ideas of affections of the body, which are inadequate ideas, human beings undergo many things, and the mind suffers many passions until the human body is ultimately destroyed. Nevertheless, Spinoza argues that “each thing, as far as it can by its own power, strives to persevere in its own being” (IIIP6). This is Spinoza’s famous conatus principle, by which each individual strives to preserve its being or maintain what might be called homeostasis. In fact, Spinoza argues that the conatus, or striving, is the very essence of each thing (IIIP7). Furthermore, this striving is the primary affect, appetite, or desire. The conatus, or striving, when related solely to the mind, is understood as the will. When the conatus is conceived as related to both mind and body, Spinoza calls it appetite, and when humans are conscious of their appetite, he calls it desire (IIIP9S). Hence, Spinoza gives the definition: “desire is man’s very essence, insofar as it is conceived to be determined, from any given affection of it, to do something” (Def. Aff. I).

The conatus is central to Spinoza’s entire moral psychology, from which he derives his theory of affects, his theory of freedom, and his ethical and political theories. In arguing that any human individual is fundamentally striving (conatus) to persevere in being, Spinoza follows Hobbes’ moral psychology. In the Leviathan, Hobbes introduces his concept of conatus in its English version: “the small beginnings of motion within the body of man, before they appear in walking, speaking, and other visible action, are commonly called endeavor [conatus]. This endeavor, when it is toward something which causes it, is called appetite or desire” (Leviathan VI.1-2). Such desire or voluntary motion does not spring from a free will, Hobbes argues, but has its origins from the motion of external bodies imparting their motion to the human body, producing sensation. That is, Hobbes already equates the conatus with the will. Also, Hobbes already derives a taxonomy of passions from the conatus, albeit one that is far less sophisticated and complex than Spinoza’s taxonomy. Furthermore, Hobbes holds that the entire life of human beings consists of an endless desire for power, by which he understands “the present means to attain some future apparent good” (Leviathan X.1). This desire for power ends only with the eventual death of an individual (Leviathan IX.2). For Hobbes, humans are, for the most part, led by their passions, as, for instance, in the construction of a commonwealth from the state of nature, in which they are led by the fear of death and hope for a better life (Leviathan XIII.14). Though, of course, reason provides the means by which the construction of the state is possible. While there are many parallels between Hobbes’ and Spinoza’s psychology, Hobbes understands the conatus entirely as physical, explained by a materialistic mechanical philosophy. In contrast, for Spinoza, the conatus is both physical and psychological, according to his parallelism. 
Notwithstanding his focus on an ethic, his account of the affects often emphasizes psychological explanations.

From desire, that is, the conscious appetite of striving, Spinoza derives two other primary affects, namely joy and sadness. Spinoza describes joy as the passage or transition of the mind from a lesser to a greater perfection or reality, and sadness as the opposite, the passage of the mind from a greater to a lesser perfection or reality (IIIP11S). The affect of joy as related to both mind and body, he calls “pleasure or cheerfulness,” that of sadness “pain or melancholy.” IIIP11 underlines Spinoza’s parallelism with respect to his theory of affects: “the idea of anything that increases or diminishes, aids or restrains, our body’s power of acting, increases or diminishes our power of thinking.” In these essential basic definitions, Spinoza employs the concept of perfection or reality (equated in IID6). What he means by this can be grasped rather intuitively. The more perfection or reality an individual has, the more power it has to persevere in being, or the more the individual is capable of acting and thinking. When this power increases through a transition to greater perfection, the individual experiences joy. But if it decreases to lesser perfection, it experiences sadness.

Spinoza holds that from these three main affects all others, in principle, can be deduced or explained. However, the variety of affects is dependent not only on the individual but also on all the external circumstances under which they strive. Still, Spinoza provides explanations of the major human affects and their origin from other affects. The first affects he deduces from joy and sadness are love and hate. Whatever an individual imagines increases their power and causes joy, they love; and what decreases their power and causes sadness, they hate: “Love is nothing but joy and the accompanying idea of an external cause, and hate is nothing but sadness with the accompanying idea of an external cause” (IIIP13S). Accordingly, human beings strive to imagine those things (that is, have ideas of the affections of their body caused by those things) that increase their power of acting and thinking (IIIP12), causing joy, while avoiding imagining things that decrease their power of acting and thinking, causing sadness. Like Hobbes, Spinoza holds that human beings strive to increase their power; Spinoza, though, understands this specifically as a power to act and indeed to think.

Furthermore, because “the human body can be affected in many ways in which its power of acting is increased or diminished, and also in others which render its power of acting neither greater nor less” (III Post.I), there are many things which become the accidental cause of joy or sadness. In other words, it can happen that an individual loves or hates something not according to what actually causes joy (or an increase in power) or sadness (or a decrease in power), but rather something that appears to bring joy or sadness. This is possible because human beings are usually affected by two or more things at once, one or more of which may increase or decrease their power or cause joy or sadness, while others have no effect. Moreover, an individual, remembering experiences of joy or sadness accidentally related to certain external causes, can come to love and hate many things by association (IIIP14). Indeed, Spinoza holds that there are as many kinds of joy, sadness, and desire as there are objects that can affect us (IIIP56), noting the well-known excessive desires of gluttony, drunkenness, lust, greed, and ambition.

Spinoza ultimately develops a rich taxonomy of passions and their mixtures, including the more common anger, envy, hope, fear, and pride, but also gratitude, benevolence, remorse, and wonder, to name a few. Not only does he define these passions, but he also gives an account of their logic, which is paramount for understanding the origin of these passions, and thereby ultimately overcoming them. True to his promise in the preface to the third part, Spinoza treats the affects “just as if they were a question of lines, planes, and bodies” (492). Initially and broadly, Spinoza discusses those affects that are passions because we experience them when we are acted upon. Human beings are passive in their striving to persevere in their being due to their inadequate ideas about themselves, their needs, as well as external things. Therefore, their striving to imagine what increases their power and avoiding what decreases their power fails, leading to a variety of affects of sadness. In contrast to traditional complaints about the weakness of humans with respect to their affects, however, Spinoza argues that “apart from the joy and desire which are passions, there are other affects of joy and desire which are related to us insofar as we act” (IIIP58) and that all such affects related to humans insofar as they act are ones of joy or desire and not sadness. Of course, this makes sense, as sadness is the transition from greater to lesser perfection and a decrease in the power of acting or thinking.

Spinoza’s theory of affects provides the foundation for his theory of human freedom, because ultimately freedom involves maximizing acting and minimizing being acted upon, that is, having active affects and not suffering passions. Recall that for Spinoza only God is absolutely free, because only God is independent as a self-caused substance and acts according to the necessity of his own nature, and because Spinoza defines a free thing as “existing from the necessity of its nature alone, and is determined to act by itself alone.” Human beings cannot be absolutely free. But insofar as they act, they are the adequate cause of their actions. This is to say that the action “follows from their nature, which can be clearly and distinctly understood through it alone” (IIID2). Therefore, when human beings act, they are free. This is opposed to being acted upon, or having passions, in which humans are only the inadequate or partial cause and are not acting according to their nature alone but are determined by something outside of themselves (see Kisner 2021). Therefore, the more human beings act, the freer they are; the more they suffer from passions, the less they are free.

Thus, Spinoza understands freedom in terms of activity as opposed to passivity, acting as opposed to being acted upon, or being the adequate cause of something as opposed to the inadequate cause of something: “I call that cause adequate whose effect can be clearly and distinctly perceived through it. But I call it partial or inadequate if its effect cannot be understood through it alone” (IIID1). From the perspective of the attribute of thought, being the adequate cause of an action is a function of having adequate ideas or true knowledge. He writes, “Our mind does certain things, [acts] and undergoes other things, namely, insofar as it has adequate ideas, it necessarily does certain things, and insofar as it has inadequate ideas, it necessarily undergoes other things” (IIIP1). Spinoza’s reasoning here is that when the mind has an adequate idea, this idea is adequate in God insofar as God constitutes the mind through the adequate idea. Thus, the mind is the adequate cause of the effect because the effect can be understood through the mind alone (by the adequate idea) and not something outside of the mind. But in the case of inadequate ideas, the mind is not the adequate cause of something, and thus the inadequate idea is, in God, the composite of the idea of the human mind together with the idea of something else. For this reason, the effect cannot be understood as being caused by the mind alone. Thus, it is the inadequate or partial cause. While this is Spinoza’s explanation of how being an adequate cause involves having adequate knowledge, there is some controversy among scholars about the status of humans having adequate ideas and true knowledge.

In Part II of the Ethics in IIP40S2, Spinoza differentiates three kinds of knowledge, which he calls imagination, reason, and intuitive knowledge. The first kind, imagination, mentioned above, has its sources in bodies affecting the human body and the ideas of these affections, or perception and sensation. It also includes associations with these things by signs or language. This kind of knowledge is entirely inadequate or incomplete, and Spinoza often writes that it has “no order for the intellect” or follows from the “common order of Nature,” that is, it is random and based on association. Passions, or passive affects, fall in the realm of imagination because imaginations are quite literally the result of the body being acted upon by other things, or, what is the same, ideas of these affections. The other two kinds of knowledge are adequate. Reason is knowledge that is derived from the knowledge of the common properties of all things, what Spinoza calls “common notions”. His thinking here is that there are certain properties shared by all things and that, being in the part and the whole, these properties can only be conceived adequately (IIP38 and IIP39). The ideas of these common properties cannot be but adequate in God when God is thinking the idea that constitutes the human mind and the idea which constitutes other things together in perception. Also, those ideas that are deduced from adequate ideas are also adequate (P40). The common notions, therefore, are the foundation of reasoning.

Some commentators, however, have pointed out that it seems impossible for humans to have adequate ideas. Michael Della Rocca, for instance, argues that having an adequate idea seems to involve knowledge of the entire causal history of a particular thing, which is not possible (Della Rocca 2001, 183, n. 29). This is because of Spinoza’s axiom that “the knowledge of an effect depends on and involves the knowledge of its cause” (IA4), and, as we have seen, finite singular things are determined to exist and produce an effect by another finite singular thing, and so on ad infinitum. Thus, adequate knowledge of anything would require adequate knowledge of all the finite causes in the infinite series. Eugene Marshall obviates this problem by arguing that it is possible to have adequate knowledge of the infinite modes (Marshall 2011, 31-36), which some commentators take, for Spinoza, to be the concern of the common notions (Curley 1988, 45fn; Bennett 1984, 107). Indeed, Spinoza argues that humans have adequate knowledge of God’s eternal and infinite essence (IIP45-P47), which would include knowledge of the attributes and infinite modes. Intuitive knowledge is also adequate, though it is less clear what specifically it entails. Spinoza defines it as “a kind of knowing [that] proceeds from an adequate idea of the formal essence of certain attributes of God to the adequate knowledge of the formal essence of things” (IIP40S2, 478). Here, Spinoza does indicate knowledge of the essence of singular things returning to the above problem, though Marshall, for instance, points out that Spinoza does not indicate the essence of finite modes existing in duration (existing in time), which would require knowledge of the causal history of a finite mode. Rather, he suggests that Spinoza here speaks of the idea of the essence of things as sub specie aeternitatis, or things considered existing in the eternal attributes of God (Marshall 2011, 41-50). 
Furthermore, rational knowledge and intuitive knowledge are both related (Spinoza argues that rational knowledge encourages intuitive knowledge) but also distinct (VP28).

Rational knowledge and intuitive knowledge, because they involve adequate ideas, are necessary for human freedom. Again, this is because human freedom is constituted by activity, and humans act when they are the adequate cause of something that follows from their nature (IIID2). Moreover, humans can be the adequate cause, in part, when the mind acts or has adequate ideas (IIIP1). This is how Spinoza explains the possibility of human freedom metaphysically. However, human freedom, which Spinoza equates with virtue and blessedness, is the end of human striving, that is, the ongoing project of human existence. The essence of a human being, the conatus, is the striving to persevere in being and consequently to increase the power of acting and thinking, and this increase brings about the affect of joy. This increase in the power of acting and thinking can occur passively (as the passion of joy) when human beings strive from inadequate ideas, or it can occur actively when human beings strive from adequate ideas, or from reason and intuitive knowledge. The more human beings strive for adequate ideas or act rationally in accordance with their own nature, the freer they are and the greater is their power of acting and thinking and the consequent joy. Therefore, reason and intuitive knowledge are paramount for freedom, virtue, and blessedness (VP36S) (see Soyarslan 2021).

For Spinoza, human freedom is very different from free will as ordinarily understood. It is not a faculty or ability apart from the intellect. Rather, it is a striving for a specific way of life defined by activity, reason, and knowledge instead of passivity and ignorance. Determinism is not opposed to this view of freedom, as freedom is understood as acting according to one’s own nature and not being compelled by external forces, especially passions. In this respect, it resembles, in different ways, the views of freedom held by Hobbes and by the Stoa. For Hobbes, being a materialist, freedom only applies properly to bodies and concerns the absence of external impediments to the motion of a body. Likewise, calling a human free indicates he is free “in those things which by his own strength and wit he is able to do is not hindered to do what he has a will to” (Leviathan XXI.1-2). However, Spinoza’s view of freedom differs substantially from Hobbes’s in that Spinoza has a more extensive view of what it means to be impeded by external forces, recognizing that the order of ideas and the order of bodies are one and the same. For the Stoa, generally speaking, freedom consists in living a rational life according to nature. If one lives according to nature, which is rational, one can be free despite the fact that nature is determined, because one conforms one’s desires to the order of nature through virtue. A famous illustration of such an understanding of freedom is given by a dog led by a cart. If the dog willingly follows the cart that is pulling it, it acts freely; if it resists the motion of the cart, being pulled along nonetheless, it lacks freedom (Long 1987, 386). For Spinoza, freedom does not conflict with determinism either, as long as human beings are active and not passive. Likewise, the greatest impediments to freedom are the passions, which can so overcome the power of an individual that they are in bondage. Spinoza famously writes “Man’s lack of power to moderate and restrain the affects I call bondage. For the man who is subject to affects is under the control, not of himself, but of fortune, in whose power he so greatly is that often, though he sees the better for himself, he is still forced to follow the worse” (IV Preface, 543). In these lines, Spinoza not only presents the problem that the passions pose to human thriving but also situates this problem within the context of the classic enigma of akrasia, or weakness of will.

In the first 18 propositions of Part IV of the Ethics, entitled “Of Human Bondage, or the Power of the Affects,” Spinoza aims to explain “the causes of man’s lack of power and inconstancy, and why men do not observe the precepts of reason” (IVP18S, 555). First, he sets up the general condition that human beings, being a part of nature, are necessarily acted upon by other things (IVP2). Their power in striving to persevere in being is limited and surpassed by the power of other things in nature (IVP3). Therefore, it is impossible for them to be completely free or act only in accordance with their own nature (IVP4). Accordingly, Spinoza admits, “from this it follows that man is necessarily always subject to passions, that he follows and obeys the common order of Nature, and accommodates himself to it as much as the nature of things requires” (IVP4C). This, of course, is the reason that human freedom is always limited and requires constant striving. Human beings are constantly beset by passions, but what is worse is that the power of a passion is defined by the power of external causes in relation to an individual’s power (IVP5). This is to say, human beings can be overwhelmed by the power of external causes in such a way that “the force of any passion or affect can surpass the other actions, or powers of a man, so that the affect stubbornly clings to the man” (IVP6). This can be easily understood from the universal human experiences of grief and loss, envy and ambition, great love and hatred, as well as from any form of addiction and excessive desire for pleasures. Such passions, and even lesser ones, are hard to regulate and can interrupt our striving for a good life or even completing the simple tasks of daily life.

In IVP7, Spinoza touches on the main issue in akrasia, writing that “an affect cannot be restrained or taken away except by an affect opposite to and stronger than the affect to be restrained”. Here we can see why merely knowing what is good or best does not restrain an affect, and why humans often see the better course of action but pursue the worse. The issue here is that Spinoza thinks that a true or adequate idea does not restrain a passion unless it is also an affect that increases the individual’s power of action (IVP14). Furthermore, an affect’s power is compounded by its temporal and modal relationship to the individual. For instance, temporally, an affect is stronger if its cause is imagined to be present than if it is not (IVP9), and stronger if its cause is imagined to be imminent rather than far in the future, or to have been present in the recent past rather than in distant memory (IVP10). Likewise, modally, an affect toward something humans view as necessary is more intense than if they view it as possible or contingent (IVP11).

Because the power of affects is temporally and modally affected and because an affect can be restrained only by an opposite and more powerful affect, it often is the case that a desire that does come from true knowledge or adequate ideas is still overcome by passions (IVP15). This can be easily seen in a desire for some future good, which is overcome by the longing for pleasures of the moment (IVP16), as is so often the case. However, “a desire that arises from joy is stronger, all things being equal, than one which arises from sadness” (IVP18). That joy is more powerful than sadness is prima facie a good thing, except that in order to overcome the passions and achieve the good life, true knowledge of good and evil in the affects is necessary. Spinoza’s conception of the good life, or what he calls blessedness, is in essence overcoming this domination of the passions and providing the tools for living a life of the mind, which is the life of freedom (see James 2009). Thus, Spinoza provides guidance for how such a good life can be achieved in Parts IV and V of the Ethics, namely in the ideal exemplar of the free man and the so-called remedies for the passions.

4. The Free Man and the Way to Freedom

In the preface to Part IV of the Ethics, Spinoza introduces the idea of the model of human nature, or the “free man”. The free man is understood as an exemplar to which humans can look to decide whether an action is good or evil (there is some controversy over the status of the free man, for instance, see Kisner 2011, 162-78; Nadler 2015; Homan 2015). Spinoza is often interpreted as a moral anti-realist because of some of his claims about moral values. For instance, he writes “We neither strive for, nor will, neither want, nor desire anything because we judge it to be good; on the contrary, we judge it to be good because we strive for it, will it, want it, and desire it” (IIIP9S). And by “good here I understand every kind of joy, and whatever leads to it, and especially what satisfies any kind of longing, whatever that may be. And by evil, every kind of sadness, and especially what frustrates longing” (IIIP39S, 516). However, as anything can be the accidental cause of joy or sadness (IIIP15), it would seem that good and evil, or some goods and evils, are relative to the individual, as is the case for Hobbes. Moreover, Spinoza indicates that in nature there is nothing good or evil in itself. He writes “As far as good and evil are concerned, they also indicate nothing positive in things, considered in themselves, nor are they anything other than modes of thinking or notions we form because we compare things to one another” (IV Preface, 545) (for an overview of Spinoza’s meta-ethics, see Marshall 2017).

Nevertheless, in Part IV of the Ethics, Spinoza redefines good and evil. Good is now understood as what is certainly known to be useful to us, and evil as what is certainly known to prevent the attainment of some good (IVD1 and IVD2). What does Spinoza mean here by “useful”? What is useful to a human individual is what will allow them to persevere in being and increase their power of acting and thinking, especially according to their own nature, or “what will really lead a man to greater perfection” (IVP18S, 555). This new definition of good as what is really useful is distinguished from mere joy or pleasure, which, insofar as it prevents us from attaining some other good, can be an evil. For Spinoza, the most useful thing for humans is virtue (IVP18S), by which they can attain greater perfection, or greater power of acting and thinking. In order to understand what is really useful and good, Spinoza proposed the idea of the free man “as a model of human nature which we may look to”. For this reason, he also defines good relative to this model, writing, “I shall understand by good, what we certainly know is a means by which we may approach nearer and nearer to the model of human nature we set before ourselves” (IV Preface, 545).

With this model of human nature in mind, Spinoza then goes on to give an analysis of what is good and evil in the affects. Generally speaking, all passions that involve sadness, that is, affects that decrease the perfection or reality of an individual and consequently the ability of the mind to think and the body to act, are evil (IVP41). For instance, hate towards other humans is never good (IVP45), and all species of such hate, such as envy, disdain, and anger, are evil (IVP45C2). Also, any affects that are mixed with sadness, such as pity (IVP50), or are vacillations of the mind, like hope and fear (IVP47), are not good in themselves. In contrast, all affects that are joyful, that is, which increase the reality or perfection of an individual and consequently the ability of the mind to think and the body to act, are directly good. Spinoza qualifies this, however, since the net increase and decrease in power of the individual has to be taken as a whole, with its particular conditions, and over time. For instance, the passion of joy and pleasure might be excessive (IVP43) or relate to only one part of an individual (IVP60), and the power of passions, being defined by the power of external causes, can easily overcome our power of acting and thinking as a whole and, thus, lead to greater sadness. Likewise, some sadness and pain might be good to the extent that they prevent a greater sadness or pain by restraining excessive desires (IVP43). It can easily be seen that love, which is a species of joy, if excessive, can be evil. Spinoza writes:

Sickness of the mind and misfortunes take their origin, especially, from too much love towards a thing which is liable to many variations and which we can never fully possess. For no one is disturbed or anxious concerning anything unless he loves it, nor do wrongs, suspicions, and enmities arise except from love for a thing which no one can really fully possess. (VP20S, 606)

Here again, it can be seen that, though joy in itself is directly good, it is often problematic as a passion and sometimes leads to sadness. Nevertheless, there is an interesting asymmetry here. While human beings’ passivity often leads them to experience passions that are varieties of sadness, there are certain passions of joy that can, all things being equal, increase the power of an individual. This asymmetry explains how human beings can increase their power of thinking and acting before they can act on adequate ideas. Therefore, it is important to note that joyful passions qua passions can be good and increase activity, despite being passions, and insofar as they increase our power of acting, they add to freedom (see Goldenbaum 2004; Kisner 2011, 168-69). In this respect, the view toward the passions developed by Spinoza, undoubtedly influenced by Stoicism, differs from the general Stoic view. For the Stoa, virtue is living according to reason. The goal of the Stoic sage is to reach ataraxia, a state of mental tranquility, through apatheia, a state in which one is not affected by passions (pathē), which by definition are bad. By contrast, Spinoza explicitly understands passions of joy, all things being equal, as good.

Moreover, Spinoza also emphasizes that there are many things external to the human individual that are useful and therefore good, including all the things that preserve the body (IVP39) and allow it to optimally interact with the world (IVP40): “It is the part of a wise man, I say, to refresh and restore himself in moderation with pleasant food and drink, with scents, with the beauty of green plants, with decorations, music, sport, the theater, and other things of this kind, which anyone can use without injury to another” (IVP45S, 572). Most significant in the category of external goods are other human beings. While other humans can be one of the greatest sources of conflict and turmoil insofar as they are subject to passions (IVP32-34), Spinoza also thinks that “there is no singular thing in Nature which is more useful to man than a man who lives according to the guidance of reason” (IVP35C). For this reason, Spinoza recognizes, similar to Aristotle, that good political organization and friendship are foundational to the good life, that is, to freedom, virtue, and blessedness (IVP73, for instance).

Leaving aside the many things in nature that are useful and good for human freedom, despite being external to the individual, what is ultimately constitutive of human freedom is active affects or, what is the same, rational activity, that is, striving to persevere in being through the guidance of reason and understanding. Actions are affects which are related to the mind because it understands them, and all such affects are joyful (IIIP59). Nor can desires arising from reason ever be excessive (IVP61). Thus, active joy and desire are always good. Spinoza equates the human striving to persevere in being through the guidance of reason with virtue, which he understands as power, following Machiavelli’s virtù, although for Spinoza this power consists in acting from reason and understanding. It can be seen that the conatus is intimately related to virtue, and it is indeed the foundation of virtue. Spinoza writes “The striving to preserve oneself is the first and only foundation of virtue” (IVP22C). When we strive to persevere in being, we seek our own advantage, pursuing what is useful (and therefore good) (IVP19) for increasing our power of acting and thinking. The more we pursue our own true advantage, the more virtue we have (IVP20).

Initially, this apparent egoism may seem like an odd foundation for virtue. However, virtue is the human power to persevere in being, and Spinoza qualifies: “A man cannot be said absolutely to act from virtue insofar as he is determined to do something because he has inadequate ideas, but only insofar as he is determined because he understands” (IVP23). So, virtue, properly speaking, is seeking one’s advantage according to knowledge and striving to persevere in being through the guidance of reason (IVP34). Furthermore, Spinoza argues that what we desire from reason is understanding (IVP26), and the only things that we know to be certainly good or evil are those things which lead us to understanding or prevent it (IVP27). Virtue, therefore, is a rational activity, or active affect, by which we strive to persevere in our being, increasing our power of acting and thinking, through the guidance of reason. Spinoza calls this virtue specifically fortitudo, or “strength of character”. He further divides the strength of character into animositas, or “tenacity” and generositas, or “nobility”. Tenacity is the desire to preserve one’s being through the dictates of reason alone. Nobility, likewise, is the desire to aid others and join them in friendship through the dictates of reason alone (IIIP59S). These two general virtues are both defined as a “desire to strive” to live according to the dictates of reason or to live a rational life of understanding and pursuing what is really to the advantage of the individual.

Though Spinoza does not give a systematic taxonomy of the two sets of virtues, certain specific virtues (and vices) can be found throughout the Ethics (for more, see Kisner 2011, 197-214). Neither does he give an exhaustive list of the “dictates of reason,” though many of these too can be gleaned from the text (see LeBuffe 2010, 177-179). For instance, he states, “He who lives according to the guidance of reason strives, as far as he can, to repay the other’s hate, anger, and disdain towards him with love and nobility” (IVP46). However, since there is nothing good or evil in nature in itself, the exemplar of the free man is used to consider, in any particular case, what is good and evil from the perspective of the life of freedom and blessedness or happiness. Similar to Aristotle’s phronimos, who is the model of phronesis for discerning virtue in practice, Spinoza’s “free man” can be interpreted as an exemplar to whom an individual can look in order to discern what is truly useful for persevering in being, and what is detrimental to leading a good life defined by rational activity and freedom. In IVP67-IVP73, the so-called “free man propositions”, Spinoza provides an outline of some dictates of reason derived from the exemplar of the free man. Striving to emulate the free man, an individual should not fear death (IVP67) and should use virtue to avoid danger (IVP68), avoid the favors of the ignorant (IVP70), be grateful (IVP71), always be honest (IVP72), and live a life in community rather than in solitude (IVP73). Ultimately, the exemplar of the free man is meant to provide a model for living a free life, avoiding negative passions by striving to live according to the dictates of reason. However, Spinoza is well aware, as some commentators have pointed out, that the state of the free man, as one who acts entirely from the dictates of reason, may not be entirely attainable for human individuals. In paragraph XXXII of the Appendix to Part IV, he writes, “But human power is very limited and infinitely surpassed by the power of external causes. So we do not have the absolute power to adapt things outside us to our use. Nevertheless, we shall bear calmly those things which happen to us contrary to what the principles of our advantage demand, if we are conscious that we have done our duty, that the power we have could not have extended itself to the point where we could have avoided those things, and that we are a part of the whole of nature, whose order we follow.”

In the final part of the Ethics, Spinoza proposes certain remedies to the passions, which he understands as the tools available to reason to overcome them, “the means, or way, leading to freedom.” In general, Spinoza thinks that the more an individual’s mind is made up of adequate ideas, the more active and free the individual is, and the less they will be subject to passions. For this reason, the remedies against the passions focus on activity and understanding. Spinoza outlines five general remedies for the passions:

I. In the knowledge itself of the affects;

II. In the fact that it [the mind] separates the affects from the thought of an external cause, which we imagine confusedly;

III. In the time by which the affection related to things we understand surpasses those related to things we conceive confusedly or in a mutilated way;

IV. In the multiplicity of causes by which affections related to common properties or to God are encouraged;

V. Finally, in the order by which the mind can order its affects and connect them to one another. (VP20S, 605)

The suggested techniques rely on Spinoza’s parallelism, stated in IIP7, that the order of ideas is the same as the order of things. For this reason, Spinoza argues that “in just the same way as thoughts and ideas of things are ordered and connected in the mind, so the affections of the body, or images of things are ordered and connected in the body” (VP1). Therefore, all the techniques suggested by Spinoza involve ordering the ideas according to adequate knowledge, through reason and intuitive knowledge. In this way, the individual becomes more active, and therefore freer, in being a necessary part of nature.

Spinoza’s first and foundational remedy involves an individual fully understanding their affects to obtain self-knowledge. Passive affects, or passions, are, after all, based on inadequate knowledge. Spinoza’s suggestion here is to move from inadequate knowledge to adequate knowledge by attempting to fully understand a passion, that is, to understand its cause. This is possible because, just as the mind is the idea of the body and has ideas of the affections of the body, it can also think ideas of ideas of the mind (IIP20). These ideas are connected to the mind in the same way as the mind is connected to the body (IIP21). Understanding a passion, then, is thinking about the ideas of the ideas of the affections of the body. Attempting to understand a passion has two main effects. First, by the very thinking about their passion, the individual is already more active. Second, by fully understanding their affect, an individual can change it from a passion to an action because “an affect which is a passion ceases to be a passion as soon as we form a clear and distinct idea of it” (VP3).

Spinoza’s argument for the possibility of this relies on the fact that all ideas of the affections of the body can involve some ideas that we can form adequately, that is, there are common properties of all things—the common notions or reason (VP4). So, by understanding affects, thinking ideas of the ideas of the affections of the body, particularly thinking of the causes of the affections of the body, we can form adequate ideas (that follow from our nature) and strive to transform passions into active affects. Spinoza does qualify that we can form some adequate ideas of the affections of the body, underlining that such understanding of passions is limited, but he also writes that “each of us has—in part, at least, if not absolutely—the power to understand himself and his affects, and consequently, the power to bring it about that he is less acted on by them” (VP4S, 598). Since “the appetite by which a man is said to act, and that by which he is said to be acted on are one and the same” (VP4S, 598) anything an individual does from a desire, which is a passion, can also be done from a rational affect.

Interconnected with the first remedy, Spinoza’s second remedy recommends the separation of the affect from the idea of the external cause. VP2 reads, “If we separate emotions, or affects, from the thought of an external cause and join them to other thoughts, then the love, or hate, towards the external cause is destroyed, as are the vacillations of the mind arising from these affects.” For Spinoza, love or hate are joy or sadness with an accompanying idea of the external cause. Here he indicates that by separating the affect from the thought of an external cause that we understand inadequately, and by understanding the affect as described above, that is, by forming some adequate ideas about it, we destroy the love and hate of the external cause. As mentioned earlier, anything can be the accidental cause of joy and sorrow (IIIP15), and therefore of love and hate. Furthermore, the strength of an affect is defined by the power of the external cause in relation to our own power (IVP5). Separating the passion from the external cause allows for understanding the affect in relation to the ideas of the mind alone. It might be difficult to grasp in the abstract what Spinoza means by separating the affect from the external cause, but consider the example of the jealous lover. Spinoza defines jealousy as “a vacillation of the mind born of love and hatred together, accompanied by the idea of another who is envied” (IIIP35S). The external causes accompanying the joy and sadness are the beloved and the (imagined) new lover who is envied. By separating the affect from the idea of the external cause, Spinoza is suggesting that a jealous lover could come to terms with the jealousy and form some clear and distinct ideas about it, that is, form some adequate ideas that reduce the power of the passion.

Spinoza’s third remedy involves the fact that “affects aroused by reason are, if we take account of time, more powerful than those related to singular things we regard as absent” (VP7). Simply put, “time heals all wounds,” but Spinoza gives an account of why this is. Whereas passions are inadequate ideas that diminish with the absence of the external cause (we have other ideas that exclude the imagining of the external object), an affect related to reason involves the common properties of things “which we always regard as present” (VP7D). Therefore, over time, rational affects are more powerful than passions. The mechanism of this remedy is readily seen in a variety of passions, from heartbreak to addiction.

Spinoza’s fourth and fifth remedies are more concerned with preventing the mind from being adversely affected by passions than with overcoming a specific passion which already exists. The fourth remedy involves relating an affect to a multitude of causes, because “if an affect is related to more and different causes, which the mind considers together with the affect itself, it is less harmful, we are less acted on by it, and we are affected less toward each cause than is the case with another equally great affect, which is related only to one cause or to fewer causes” (VP9). This is the case because, when considering that affect, the mind is engaged in thinking a multitude of different ideas, that is, its power of thinking is increased, and it is freer. Again, this remedy is, in large part, related to the first, foundational one. In understanding our affects, we form some adequate ideas and understand the cause of the affect, in part, from these ideas. Insofar as these adequate ideas are common notions concerning the common properties of things, we relate the affects to many things that can engage the mind. Spinoza ultimately claims that “the mind can bring it about that all the body’s affections, or images of things, are related to the idea of God” (VP14), for the mind has an adequate idea of the essence of God (IIP47). Because these affections are related to adequate ideas and follow from our own nature, they are affects of joy accompanied by the idea of God. In other words, all affections of the body can encourage an intellectual love of God. For Spinoza, “he who understands himself and his affects clearly and distinctly loves God, and does so the more, the more he understands himself and his affects” (VP15). This is a large part of how Spinoza conceives of the joyful life of reason and understanding that he calls blessedness.

Finally, the fifth remedy involves the fact that, as Spinoza argues, “so long as we are not torn by affects contrary to our nature, we have the power of ordering and connecting the affections of the body according to the order of the intellect” (VP10). What this amounts to is that the more adequate ideas the mind has, the less it will be affected by negative passions and the more it will order its ideas according to reason instead of the common order of nature. Spinoza’s suggestion is to “conceive of right principles of living, or sure maxims of life,” to which we can constantly look when confronted by the common occurrences and emotional disturbances of life. For instance, Spinoza gives the example of how to avoid being suddenly overwhelmed by hatred by preparing oneself by meditating “frequently on the common wrongs of men, and how they may be warded off best by nobility” (VP10S). This provides the practical mechanism by which we can use the virtues of tenacity and nobility to live a free life (see Steinberg 2014). All the remedies Spinoza mentions allow an individual to be rationally responsive to their environment rather than just being led by their emotions, and insofar as they are led by reason and adequate knowledge, they are free.

5. Spinoza on Moral Responsibility

The discussion about free will and freedom is often concerned with moral responsibility because free will is generally considered a necessary condition for moral responsibility. Moral responsibility is taken to be the condition under which an individual can be praised and blamed, rewarded and punished for their actions. Spinoza’s view on responsibility is complex and little commented upon. He does aver that praise and blame are only a result of the illusion of free will: “Because they think themselves free, those notions have arisen: praise and blame, sin and merit” (I Appendix, 444). Yet, though Spinoza does not speak directly of moral responsibility, his denial of free will does not lead him to disavow the idea of responsibility completely. In a series of letters exchanged with Oldenburg, he makes clear that he does think that individuals are responsible for their actions despite lacking free will, though his sense of responsibility is untraditional. Oldenburg asks Spinoza to explain some passages in the Theological-Political Treatise that seem, by equating God with Nature, to imply the elimination of divine providence, free will, and thereby moral responsibility. Spinoza indeed denies the traditional view of divine providence as one of free choice by God. For Spinoza, absolute freedom is acting from the necessity of one’s nature (ID7), and God is free in precisely the fact that everything follows from the necessity of the divine nature. But God does not arbitrarily choose to create the cosmos, as is traditionally argued.

In Letter 74, Oldenburg writes “I shall say what most distresses them. You seem to build on a fatal necessity of all things and actions. But, once that has been asserted and granted, they say the sinews of all laws, of all virtue and religion, are cut, and all rewards and punishments are useless. They think that whatever compels or implies necessity excuses. Therefore, they think no one will be inexcusable in the sight of God” (469). Oldenburg points out the classical argument against determinism, namely that it makes reward and punishment futile and pointless because if human beings have no free will, then they seem to have no control over their lives, and if they have no control over their lives, then there is no justification for punishment or reward. All actions become excusable if they are outside the control of individuals. However, in his response to Oldenburg, Spinoza maintains the significance of reward and punishment even within a deterministic framework. He states,

This inevitable necessity of things does not destroy either divine or human laws. For whether or not the moral teachings themselves receive the form of law or legislation from God himself, they are still divine and salutary. The good which follows from virtue and the love of God will be just as desirable whether we receive it from God as a judge or as something emanating from the necessity of the divine nature. Nor will the bad things which follow from evil actions and affects be any less to be feared because they follow from them necessarily. Finally, whether we do what we do necessarily or contingently, we are still led by hope and fear. (Letter 75, 471)

Spinoza has two points here. The first is that all reward and punishment are natural consequences of actions. Even if everything is determined, actions have good and evil consequences, and these are the natural results of actions. Determinism does not eliminate reward and punishment because there are determined consequences that are part of the natural order. On this first point, reward and punishment are justified by the power or right of nature. The second point is that these consequences can regulate human behavior because human beings are led by the hope for some good and the fear of some evil. Determinism does not destroy the law but rather gives it a framework for being effective. Spinoza here seems to be advocating something like a consequentialist theory of responsibility. What matters is that reward and punishment can act as a deterrent to bad behavior or a motivation for desired behavior. Traditional views on responsibility are tied to free will, but in this passage, Spinoza is indicating that reward and punishment are still justified from a social and political standpoint (see Kluz 2015).

To understand Spinoza’s points better, we have to examine his view of law. Spinoza thinks that law depends either on natural necessity, that is, laws of nature, or on human will. However, because human beings are a part of nature, human law will also be a part of natural law. Moreover, he also thinks that the term “law” is generally more applied to human experience. He writes, “Commonly nothing is understood by law but a command which men can either carry out or neglect—since law confines human power under certain limits, beyond which that power extends, and does not command anything beyond human powers.” For this reason, Spinoza qualifies, “Law seems to need to be defined more particularly: that it is a principle of living man prescribes to himself or to others for some end” (TTP IV.5). Spinoza further divides law into human and divine law. By “human law,” Spinoza specifically means “a principle of living which serves only to protect life and the republic” (TTP IV.9), or what we might call “political” or “civil” law. By “divine law,” he specifically means that which “aims only at the supreme good, that is, the true knowledge and love of God” (TTP IV.9), or what we might call “religious” and “moral” law. The different ends of the law are what distinguish human law from divine law. The first concerns providing security and stability in social life; the second concerns providing happiness and blessedness, which are defined by virtue and freedom. For this reason, “divine law” in Spinoza’s sense concerns what leads to the supreme good for human beings, that is, the rule of conduct that allows humans to achieve freedom, virtue, and happiness. Spinoza propounds this law in the moral precepts of the Ethics discussed above. These laws follow from human nature, that is, they describe what is, in fact, good for human individuals in their striving to persevere in their being, based upon rational knowledge of human beings and nature in general, with the free man as the exemplar toward which they strive.

However, it is not the case that all individuals can access and follow the “divine law” through reason alone, and, therefore, traditionally, divine law took the form of divine commandments ensconced within a system of reward and punishment (while still including, more or less, what Spinoza indicates by “divine law”). For Spinoza, what is true in Holy Scripture and “divine law” can also be gained by adequate knowledge because “divine law” is a rule of conduct men lay down for themselves that “aims only at the supreme good, that is, the true knowledge and love of God” (TTP IV.9). That is to say, “divine law” follows from human nature, which is a part of Nature, but while the free man follows these moral precepts because he rationally knows what is, in fact, advantageous for him, other individuals follow moral precepts because they are led by their passions, namely the hope for some good or the fear of some evil, that is, reward and punishment. Though reward and punishment are, ultimately, the same for the free man and other individuals, the free man is led by reason while other individuals are led by imagination, or inadequate ideas or passions. Likewise, human law, that is, political law, uses a system of reward and punishment to regulate human behavior through hope and fear. Human law provides security and stability for the state in which human individuals co-exist and punishes those who transgress the laws. Moreover, just as in the case of “divine law”, the free man follows human law because he rationally knows his advantage, while other individuals are led more by their passions. Returning to Spinoza’s response, determinism does not do away with law, moral or political, because the utility of the law, that is, the great advantages that following the law provides for the individual and the community and the disadvantages that result from transgressing the law, is retained whether or not human beings have free will. Ultimately, for Spinoza, moral precepts and the law are ensconced in a system of reward and punishment that is necessary for regulating human behavior even without free will.

6. References and Further Reading

All translations are from The Collected Works of Spinoza, Vol. I and II, ed. and trans. Edwin Curley.

a. Primary Sources

  • Descartes, Rene. The Philosophical Writings of Descartes, Vol. I and II, trans. John Cottingham et al. (Cambridge: Cambridge University Press, 1985).
  • Hobbes, Thomas. The Leviathan with Selected Variants from the Latin Edition of 1668, ed. Edwin Curley. (Indianapolis: Hackett Publishing Company, 1994).
  • Long, A. A., and D. N. Sedley, trans., The Hellenistic Philosophers, Vol. 1: Translations of the Principal Sources, with Philosophical Commentary. (Cambridge: Cambridge University Press, 1987).
  • Spinoza, Baruch. The Collected Works of Spinoza, Vol. I and II, ed. and trans. by Edwin Curley. (Princeton University Press, 1985).

b. Secondary Sources

  • Bennett, Jonathan. A Study of Spinoza’s Ethics. (Indianapolis: Hackett, 1984).
  • Bennett, Jonathan. “Spinoza’s Monism: A Reply to Curley”, in God and Nature: Spinoza’s Metaphysics, ed. Yirmiyahu Yovel. (Leiden: E.J. Brill, 1991), 53-59.
  • Curley, Edwin. Spinoza’s Metaphysics: An Essay in Interpretation. (Cambridge: Harvard University Press, 1969).
  • Curley, Edwin. Behind the Geometrical Method. (Princeton: Princeton University Press, 1985).
  • Curley, Edwin. “On Bennett’s Interpretation of Spinoza’s Monism”, in God and Nature: Spinoza’s Metaphysics, ed. Yirmiyahu Yovel. (Leiden: E.J. Brill, 1991), 35-52.
  • De Dijn, Herman. Spinoza: The Way to Wisdom. (West Lafayette, IN: Purdue University Press, 1996).
  • Della Rocca, Michael. Representation and the Mind-Body Problem in Spinoza. (Oxford: Oxford University Press, 1996).
  • Gatens, Moira. “Spinoza, Law and Responsibility”, in Spinoza: Critical Assessments of Leading Philosophers, Vol. III, ed. Genevieve Lloyd. (London: Routledge, 2001), 225-242.
  • Garrett, Don. “Spinoza’s Necessitarianism”, in God and Nature: Spinoza’s Metaphysics, ed. Yirmiyahu Yovel. (Leiden: E.J. Brill, 1991), 197-218.
  • Goldenbaum, Ursula. “The Affects as a Condition of Human Freedom in Spinoza’s Ethics”, in Spinoza on Reason and the “Free Man”, edited by Yirmiyahu Yovel. (New York: Little Room Press, 2004), 149-65.
  • Goldenbaum, Ursula, and Christopher Kluz, eds. Doing without Free Will: Spinoza and Contemporary Moral Problems. (New York: Lexington, 2015).
  • Hübner, Karolina. “Spinoza on Being Human and Human Perfection”, in Essays on Spinoza’s Ethical Theory, eds. Matthew Kisner and Andrew Youpa. (Oxford: Oxford University Press, 2014), 124-142.
  • Homan, Matthew. “Rehumanizing Spinoza’s Free Man”, in Doing without Free Will: Spinoza and Contemporary Moral Problems, eds. Ursula Goldenbaum and Christopher Kluz (New York: Lexington, 2015), 75-96.
  • James, Susan. “Freedom, Slavery, and the Passions”, in The Cambridge Companion to Spinoza’s Ethics, ed. by Olli Koistinen. (Cambridge: Cambridge University Press, 2009), 223-41.
  • Kisner, Matthew. Spinoza on Human Freedom: Reason, Autonomy and the Good Life. (Cambridge: Cambridge University Press, 2011).
  • Kisner, Matthew, and Andrew Youpa, eds. Essays on Spinoza’s Ethical Theory. (Oxford: Oxford University Press, 2014).
  • Kisner, Matthew. “Spinoza’s Activities: Freedom without Independence”, in Freedom, Action, and Motivation in Spinoza’s “Ethics”, ed. Noa Naaman-Zauderer. (London: Routledge, 2021), 37-61.
  • Kluz, Christopher. “Moral Responsibility without Free Will: Spinoza’s Social Approach”, in Doing without Free Will: Spinoza and Contemporary Moral Problems, eds. Ursula Goldenbaum and Christopher Kluz (New York: Lexington, 2015), 1-26.
  • LeBuffe, Michael. From Bondage to Freedom: Spinoza on Human Excellence. (Oxford: Oxford University Press, 2010).
  • Marshall, Colin. “Moral Realism in Spinoza’s Ethics”, in Cambridge Critical Guide to Spinoza’s Ethics, ed. Yitzhak Melamed. (Cambridge: Cambridge University Press, 2017), 248-265.
  • Marshall, Eugene. The Spiritual Automaton: Spinoza’s Science of the Mind. (Oxford: Oxford University Press, 2014).
  • Melamed, Yitzhak. “The Causes of our Belief in Free Will: Spinoza on Necessary, “Innate,” yet False Cognition”, in Cambridge Critical Guide to Spinoza’s Ethics, ed. Yitzhak Melamed. (Cambridge: Cambridge University Press, 2017).
  • Naaman-Zauderer, Noa, ed. Freedom, Action, and Motivation in Spinoza’s “Ethics”. (London: Routledge, 2021).
  • Nadler, Steven. “Whatever is, is in God: substance and things in Spinoza’s metaphysics”, in Interpreting Spinoza: Critical Essays, ed. Charles Huenemann. (Cambridge: Cambridge University Press, 2008), 53-70.
  • Nadler, Steven. “On Spinoza’s Free Man”, Journal of the American Philosophical Association, Volume 1, Issue 1, Spring 2015, 103-120.
  • Rutherford, Donald. “Deciding What to Do: The Relation of Affect and Reason in Spinoza’s Ethics”, in Freedom, Action, and Motivation in Spinoza’s “Ethics”, ed. Noa Naaman-Zauderer. (London: Routledge, 2021), 133-151.
  • Soyarslan, Sanem. “From Ordinary Life to Blessedness: The Power of Intuitive Knowledge in Spinoza’s Ethics”, in Essays on Spinoza’s Ethical Theory, eds. Matthew Kisner and Andrew Youpa. (Oxford: Oxford University Press, 2014), 236-257.
  • Steinberg, Justin. “Following a Recta Ratio Vivendi: The Practical Utility of Spinoza’s Dictates of Reason”, in Essays on Spinoza’s Ethical Theory, eds. Matthew Kisner and Andrew Youpa. (Oxford: Oxford University Press, 2014), 178-196.
  • Youpa, Andrew. “Spinoza’s Theory of the Good”, in The Cambridge Companion to Spinoza’s Ethics, ed. Olli Koistinen. (Cambridge: Cambridge University Press, 2009), 242-257.
  • Youpa, Andrew. The Ethics of Joy: Spinoza on the Empowered Life. (Oxford: Oxford University Press, 2019).
  • Yovel, Yirmiyahu, ed. Spinoza on Reason and the “Free Man”. (New York: Little Room Press, 2004).

Author Information

Christopher Kluz
Email: christopherkluz@cuhk.edu.cn
The Chinese University of Hong Kong, Shenzhen
China

Leibniz: Modal Metaphysics

Gottfried Wilhelm Leibniz (1646-1716) served as the natural end of the rationalist tradition on the European continent, which included Descartes, Spinoza, and Malebranche. His philosophy was one of the major influences on Kant. Although Leibniz had many philosophical and intellectual interests, he was arguably most concerned with reconciling the freedom required for moral responsibility and the determinism that seemed to be entailed by the new sciences being developed at the time. In fact, in several important writings, including the Theodicy, Leibniz refers to “the free and the necessary and their production as it relates to the origin of evil” as one of the “famous labyrinths where our reason very often goes astray.”

To address this labyrinth, Leibniz developed one of the most sophisticated accounts of compatibilism in the early modern period. Compatibilism is the view that freedom and determinism are compatible and not mutually exclusive. Free actions are fully determined, and yet not necessary—they could have been otherwise, were God to have created another possible world instead. According to Leibniz, free actions, whether they be for God or humans, are those that are intelligent, spontaneous, and contingent. He developed a framework of possible worlds that is most helpful in understanding the third and most complex criterion, contingency.

Leibniz’s theory of possible worlds went on to influence some of the standard ways in which modal metaphysics is analyzed in contemporary Anglo-American analytic philosophy. The theory of possible worlds that he developed and utilized in his philosophy was extremely nuanced and had implications for many different areas of his thought, including, but not limited to, his metaphysics, epistemology, jurisprudence, and philosophy of religion. Although Leibniz’s Metaphysics is treated in a separate article, this article is primarily concerned with Leibniz’s modal metaphysics, that is, with his understanding of the modal notions of necessity, contingency, and possibility, and their relation to human and divine freedom. For more specific details on Leibniz’s logic and possible worlds semantics, especially as it relates to the New Essays Concerning Human Understanding and to the Theodicy, please refer to “Leibniz’s Logic.”

Table of Contents

  1. The Threat of Necessitarianism
  2. Strategies for Contingency
    a. Compossibility
    b. Infinite Analysis
    c. God’s Choice and Metaphysical and Moral Necessity
    d. Absolute and Hypothetical Necessity
  3. Complete Individual Concepts
  4. The Containment Theory of Truth and Essentialism
    a. Superessentialism
    b. Moderate Essentialism
    c. Superintrinsicalness
  5. Leibnizian Optimism and the “Best” Possible World
  6. Compatibilist Freedom
    a. Human Freedom
    b. Divine Freedom
  7. References and Further Reading
    a. Primary Sources
    b. Secondary Sources

1. The Threat of Necessitarianism

Necessitarianism is the view according to which everything that is possible is actual, or, to put this in the language of possible worlds, there is only one possible world and it is the actual world. Not only is everything determined, but it is also metaphysically impossible that anything could be otherwise. In the seventeenth century, Baruch Spinoza was the paradigmatic necessitarian. According to Spinoza, insofar as everything follows from the nature of God with conceptual necessity, things could not possibly be other than they are. For Spinoza, necessitarianism had ethical implications: given that it is only possible for the universe to unfold in one way, we ought to learn to accept the way that the world is so that we can live happily. Happiness, Spinoza thought, is partly and importantly understood to be the rational acceptance of the fully determined nature of existence.

Spinoza’s necessitarianism follows directly from his conception of God and his commitment to the principle of sufficient reason, the thesis that there is a cause or reason why everything is the way it is rather than otherwise. In rejecting the anthropomorphic conception of God, he held instead that God is identical with Nature and that all things are, in some sense, in God. While Leibniz rejected the pantheistic/panentheistic understanding of God that Spinoza held, Leibniz’s view of God nevertheless compelled him to necessitarianism, at least in his early years. This article later reconsiders whether Leibniz’s mature views also commit him to necessitarianism. Consider the following letter that he wrote to Magnus Wedderkopf in 1671. Leibniz writes:

Since God is the most perfect mind, however, it is impossible for him not to be affected by the most perfect harmony, and thus to be necessitated to the best by the very ideality of things…Hence it follows that whatever has happened, is happening, or will happen is best and therefore necessary, but…with a necessity that takes nothing away from freedom because it takes nothing away from the will and the use of reason (A. II. I, 117; L 146).

In this early correspondence, Leibniz reasons that since God’s nature is essentially good, he must, by necessity, do only that which is best. It is impossible for God to do less than the best. After his meeting with Spinoza in 1676, Leibniz’s views related to modality began to shift and became much more nuanced. He went on to develop several strategies for securing contingency in order to reject this early necessitarian position. In his mature metaphysics, Leibniz maintained that God acts for the best, but denied that God acts for the best by necessity. How, though, did he attempt to reconcile these positions?

2. Strategies for Contingency

a. Compossibility

Leibniz’s first and arguably most important strategy for maintaining contingency is to argue that worlds are not possible with respect to God’s will; rather, worlds are intrinsically possible or impossible. If they were possible only with respect to God’s will, the argument from the letter to Wedderkopf would still be applicable: since God is committed to the best by his own essential nature, there is only one possible world, the actual world, which is best. Instead, Leibniz maintains that worlds by their very nature are either possible or impossible. He writes in a piece dated from 1680 to 1682 called On Freedom and Possibility:

Rather, we must say that God wills the best through his nature. “Therefore,” you will say “he wills by necessity.” I will say, with St. Augustine, that such necessity is blessed. “But surely it follows from this that things exist by necessity.” How so? Since the nonexistence of what God wills to exist implies a contradiction? I deny that this proposition is absolutely true, for otherwise that which God does not will would not be possible. For things remain possible, even if God does not choose them. Indeed, even if God does not will something to exist, it is possible for it to exist, since, by its nature, it could exist if God were to will it to exist. “But God cannot will it to exist.” I concede this, yet, such a thing remains possible in its nature, even if it is not possible with respect to the divine will, since we have defined as in its nature possible anything that, in itself, implies no contradiction, even though its coexistence with God can in some way be said to imply a contradiction (Grua 289; AG 20-21).

According to Leibniz, worlds are possible just in case they are compossible. Possibility is a property of an object when its properties are logically consistent. For example, winged horses are possible because there is nothing self-contradictory about a horse with wings. But a winged wingless horse would be internally incoherent. By contrast, compossibility is a feature of sets of things, like worlds, rather than individual things. So, when Leibniz insists that worlds are possible by their own nature, he means that the things in that world do not conflict with one another. For example, there is nothing self-contradictory about an unstoppable force or an immovable object. But those objects could not exist in the same world together because their natures would be inconsistent with one another—they rule each other out. So, while there is a possible world with an unstoppable force and a possible world with an immovable object, there is no possible world with both an unstoppable force and an immovable object.
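The contrast between possibility and compossibility can be pictured with a small sketch. It is only a toy model under stated assumptions: each essence is represented as a set of (feature, value) claims, and the feature names and the helper functions `is_consistent`, `is_possible`, and `are_compossible` are hypothetical illustrations, not anything found in Leibniz’s texts.

```python
# Toy model (an assumption for illustration): an essence is a set of
# (feature, value) claims. The feature names below are invented.
UNSTOPPABLE_FORCE = {("everything_can_be_moved", True)}
IMMOVABLE_OBJECT = {("everything_can_be_moved", False)}
WINGED_HORSE = {("has_wings", True), ("is_horse", True)}
WINGED_WINGLESS_HORSE = {("has_wings", True), ("has_wings", False)}


def is_consistent(claims):
    """True if no feature is assigned two different values."""
    seen = {}
    for feature, value in claims:
        if seen.setdefault(feature, value) != value:
            return False
    return True


def is_possible(essence):
    """Possibility: a single essence does not contradict itself."""
    return is_consistent(essence)


def are_compossible(essences):
    """Compossibility: the essences, pooled together, contain no contradiction."""
    return is_consistent(set().union(*essences))


print(is_possible(UNSTOPPABLE_FORCE))                          # True
print(is_possible(WINGED_WINGLESS_HORSE))                      # False
print(are_compossible([UNSTOPPABLE_FORCE, IMMOVABLE_OBJECT]))  # False
```

On this sketch each of the two objects passes the test for possibility on its own, while the pair fails the test for compossibility, which is the pattern the paragraph above describes.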

Although Leibniz often analyzes compossibility as a logical relation holding between the created essences of any given world, he sometimes treats it as a relation between the created essences and the laws of nature which God has decreed in each world. He writes in his correspondence to Arnauld:

I think there is an infinity of possible ways in which to create the world, according to the different designs which God could form, and that each possible world depends on certain principal designs or purposes of God which are distinctive of it, that is, certain primary free decrees (conceived sub ratione possibilitatis) or certain laws of the general order of this possible universe with which they are in accord and whose concept they determine, as they do also the concepts of all the individual substances which must enter into this same universe (G. II, 51; L 333).

Passages like this suggest that even logically inconsistent sets of objects like the unstoppable force and the immovable object could exist in a world together, so long as there is one set of laws governing them.

Although there are several different ways to analyze Leibniz’s notion of compossibility, there is good reason to think that he believed that preserving the intrinsic nature of the possibility of worlds was crucial to salvaging contingency. At one point he even suggests that contingency would be destroyed without such an account. He writes to Arnauld:

I agree there is no other reality in pure possibles than the reality they have in the divine understanding…For when speaking of possibilities, I am satisfied that we can form true propositions about them. For example, even if there were no perfect square in the world, we would still see that it does not imply a contradiction. And if we wished absolutely to reject pure possibles, contingency would be destroyed; for, if nothing were possible except what God actually created, then what God created would be necessary, in the case he resolved to create anything (G. II, 45; AG 75).

Importantly, the possibility of worlds is outside the scope of God’s will. God does not determine what is possible, any more than he determines mathematical, logical, or moral truths.

b. Infinite Analysis

Another strategy for understanding necessity and contingency is through Leibniz’s theory of infinite analysis. According to Leibniz, necessity and contingency are not defined in terms of possible worlds in the way that is common in contemporary metaphysics. According to the standard understanding in contemporary metaphysics, a proposition is possible just in case it is true in some possible world, and a proposition is necessary just in case it is true in every possible world. But for Leibniz, a proposition is necessary if and only if it can be reduced to an identity statement in a finite number of steps. Propositions are contingent just in case it would take an infinite number of steps to reduce the statement to an identity statement. He writes in a piece from 1686 called On Contingency:

Necessary truths are those that can be demonstrated through an analysis of terms, so that in the end they become identities, just as in algebra an equation expressing an identity ultimately results from the substitution of values. That is, necessary truths depend upon the principle of contradiction. Contingent truths cannot be reduced to the principle of contradiction; otherwise everything would be necessary and nothing would be possible other than that which actually attains existence (Grua 303; AG 28).

To see how the theory of infinite analysis works, recall that Leibniz holds that every truth is an analytic truth. Every true proposition is one where the concept of the predicate is contained in the concept of the subject. One way to understand this reduction is to ask, “Why is this proposition true?” Since every truth is an analytic truth, every truth is like, “A bachelor is an unmarried male.” So why is it true that a bachelor is an unmarried male? It is true because the concept of being an unmarried male is contained in the concept of a bachelor. A bachelor just is an unmarried male.

How would the theory of infinite analysis work for explaining contingency though? Consider the following propositions:

    1. 1+1=2
    2. Judas is the betrayer of Christ.

The first proposition is a simple mathematical truth that almost everyone in the 17th and 18th centuries would consider to be a necessary truth. For Leibniz, it is a necessary truth because it can be reduced to an identity statement in a finite number of steps; that is, by replacing 2 with its definition, 1+1, we could move from 1+1=2 to 1+1=1+1 in a straightforward manner. We could make a similar move for other mathematical and logical truths that are even more straightforward. The law of identity, that “A is identical to A,” is another truth that would take a finite number of steps to reduce to an identity.

The second proposition is an example of a contingent truth because the reduction would take an infinite number of steps to reach an identity statement. To understand how this analysis occurs, consider why it is true that Judas is the betrayer of Christ. This analysis would require reasons for Judas’s nature and his existence. Judas exists because God understood in his infinite wisdom that the best possible world would be one where Judas betrays Christ and Christ suffers. And why is Judas part of the best possible world? The only way to answer that question would be for God to compare the actual world with the infinite plurality of other possible worlds—an analysis that would take an infinite number of steps, even for God. Put simply, the sufficient reason for Judas’s contingent existence is that it is deemed to be best by God.
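The difference between the two kinds of reduction can be sketched as follows. This is a deliberately crude illustration, not Leibniz’s own calculus: the definition table, the string substitution, and the function names `steps_to_identity`, `reasons_for`, and `bounded_analysis` are all assumptions introduced here. A necessary truth reaches an identity after finitely many substitutions, whereas the analysis of a contingent truth is modelled as a chain of reasons that any finite budget leaves unfinished.

```python
from itertools import count, islice

# Hypothetical definitions: each defined term maps to the expression it abbreviates.
DEFINITIONS = {"2": "1+1"}


def steps_to_identity(lhs, rhs, max_steps=10):
    """Substitute definitions until both sides match.

    Returns the number of substitutions used, or None if no identity
    is reached within max_steps.
    """
    for step in range(max_steps + 1):
        if lhs == rhs:
            return step
        for term, definition in DEFINITIONS.items():
            if term in rhs:
                rhs = rhs.replace(term, definition, 1)
                break
            if term in lhs:
                lhs = lhs.replace(term, definition, 1)
                break
        else:
            return None  # nothing left to substitute, still no identity
    return None


def reasons_for(proposition):
    """Endless chain of reasons: every reason given calls for a further one."""
    for depth in count(1):
        yield f"reason {depth} for: {proposition}"


def bounded_analysis(proposition, budget):
    """Take only the first `budget` reasons; the chain never ends in an identity."""
    return list(islice(reasons_for(proposition), budget))


print(steps_to_identity("1+1", "2"))  # 1
print(steps_to_identity("A", "A"))    # 0
print(len(bounded_analysis("Judas is the betrayer of Christ", 5)))  # 5
```

However large the budget passed to `bounded_analysis`, the generator still has further reasons to supply, which is the sense in which a discursive analysis of a contingent truth is never complete.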

Importantly, Leibniz holds that not even God could complete the infinite analysis discursively; instead, God completes the analysis intuitively, in one feat of the mind. He writes in On Contingency:

For in necessary propositions, when the analysis is continued indefinitely, it arrives at an equation that is an identity; that is what it is to demonstrate a truth with geometrical rigor. But in contingent propositions one continues the analysis to infinity through reasons for reasons, so that one never has a complete demonstration, though there is always, underneath, a reason for the truth, but the reason is understood completely only by God, who alone traverses the infinite series in one stroke of the mind (Grua 303; AG 28).

c. God’s Choice and Metaphysical and Moral Necessity

Another strategy for salvaging contingency locates it not at the level of worlds, nor in God’s will, but at the level of God’s wisdom; that is, in the choice to actualize certain substances instead of others. Leibniz holds that we must take the reality of God’s choice seriously. As he writes in the Theodicy, “The nature of things, if taken as without intelligence and without choice, has in it nothing sufficiently determinant” (G. VI, 322; H 350).

Even if the many possible worlds remain possible in themselves, as the first strategy holds, or propositions are contingent because of the infinite analysis theory, as the second strategy holds, God’s choice still plays an important role in the causal and explanatory chain of events leading to the actualization of a world. In this way, Leibniz’s modal metaphysics stands again in stark contrast to Spinoza. For Spinoza, the world just is God, and in some sense, all things are in God. For Leibniz, by contrast, the creation and actualization of a world is a product of God’s will, and his will is fully determined by his perfect intellect. In some texts, Leibniz locates the source of contingency purely in God’s choice of the best, which cannot be demonstrated. And since the choice of the best cannot be demonstrated, God’s choice is contingent. He writes in On Contingency:

Assuming that the proposition “the proposition that has the greater reason for existing [that is, being true] exists [that is, is true]” is necessary, we must see whether it then follows that the proposition that has the greater reason for existing [that is, being true] is necessary. But it is justifiable to deny the consequence. For, if by definition a necessary proposition is one whose truth can be demonstrated with geometrical rigor, then indeed it could be the case that this proposition is demonstrable: “every truth and only a truth has greater reason,” or this: “God always acts with the highest wisdom.” But from this one cannot demonstrate the proposition “contingent proposition A has greater reason [for being true]” or “contingent proposition A is in conformity with divine wisdom.” And therefore it does not follow that contingent proposition A is necessary. So, although one can concede that it is necessary for God to choose the best, or that the best is necessary, it does not follow that what is chosen is necessary, since there is no demonstration that it is the best (Grua 305; AG 30).

Related to God’s choice is the distinction between moral and metaphysical necessity. Moral necessity is used by Leibniz in several different writings, beginning with his earliest jurisprudential writings up to and including his Theodicy. In the 17th century, moral necessity was very often understood in terms of the legal use of “obligation,” a term which Leibniz also applied to God. He writes in the Nova Methodus from 1667:

Morality, that is, the justice or injustice of an act, derives however from the quality of the acting person in relation to the action springing from previous actions, which is described as moral quality. But just as the real quality is twofold in relation to action: the power of acting (potentia agendi), and the necessity of acting (necessitas agendi); so also the moral power is called right (jus), the moral necessity is called obligation (obligatio) (A. VI. i. 301).

Leibniz echoes this sentiment into the 1690s in other jurisprudential writings. In the Codex Juris from 1693, Leibniz insists that “Right is a kind of moral power, and obligation is a moral necessity” (G. III. 386; L 421). In short, Leibniz held with remarkable consistency throughout his career that “right” and “obligation” are moral qualities that provide the capacity to do what is just.

Importantly, right and obligation are not just related notions; each bears directly on the other. As Leibniz writes in the Nova Methodus, “The causes of right in one person are a kind of loss of right in another and it concerns the process of acquiring an obligation. Conversely, the ways of losing an obligation are causes of recovering a right, and can be defined as liberation” (A. VI. vi, 305-306). The significance of the claim that a right imposes an obligation cannot be overstated. It is precisely for this reason that we can undertake the theodicean project in the first place. We have proper standing to ask for an explanation for God’s permission of suffering because we have a right to the explanation. And we have a right to the explanation because God is morally necessitated or obligated to create. For a point of comparison, contrast this with God’s response to Job when he demands an explanation for his own suffering. God responds, “Who has a claim against me that I must pay? Everything under heaven belongs to me” (Job 41:11). God does not provide an explanation for Job’s suffering because Job does not have proper standing to request such an explanation.

Leibniz contrasts moral necessity with metaphysical necessity. In the Theodicy, he describes “metaphysical necessity, which leaves no place for any choice, presenting only one possible object, and moral necessity, which obliges the wisest to choose the best” (G. VI, 333; H 367). This distinction becomes important for Leibniz because it allows him to say that God’s choice to create the best of all possible worlds is morally necessary, but not metaphysically necessary. God is morally bound to create the best world due to his divine nature, but since there are other worlds which are possible in themselves, his choice is not metaphysically necessary. Leibniz writes again in the Theodicy, “God chose between different courses all possible: thus, metaphysically speaking, he could have chosen or done what was not the best; but he could not morally speaking have done so” (G. VI, 256; H 271).

Some commentators insist that the dichotomy between metaphysical and moral necessity is illusory. Either it is necessary that God must create the best of all possible worlds, or it is not necessary that God must create the best of all possible worlds. Nevertheless, Leibniz took moral necessity to do both logical and theological work. Only with moral necessity could he preserve both the goodness and wisdom of God. If moral necessity is vacuous, then Leibniz would seem to be committed to necessitarianism.

d. Absolute and Hypothetical Necessity

One final strategy for understanding contingency is to make use of a well-known distinction between absolute and hypothetical necessity. This strategy was most fully utilized in Leibniz’s correspondence with Arnauld in the mid-1680s. Arnauld was deeply concerned about the implications of the theory of complete individual concepts for freedom. Since Leibniz held that every individual contains within itself complete truths about the universe, past, present, and future, it seems that there can be no room for freedom. If it is included in Judas’s concept from the moment the universe was created that he would ultimately betray Christ, then it seems as if it was necessary that he do so; Judas could not have done otherwise. Leibniz’s response draws on the distinction between absolute and hypothetical necessity. Consider the following propositions:

    1. Necessarily, Caesar crosses the Rubicon.
    2. Necessarily, if Caesar exists, then he crosses the Rubicon.

Leibniz would deny the first proposition, but readily accept the second proposition. He denies the first because it is not a necessary truth that Caesar crosses the Rubicon. The first proposition is not comparable to necessary truths like those of mathematics and logic, which reduce to identity statements and whose denials are self-contradictory. The second proposition makes the crossing only hypothetically necessary; although it is bound up in Caesar’s essence that he crosses the Rubicon, it does not follow that he necessarily does so. It is only necessary that Caesar crosses the Rubicon on the hypothesis that Caesar exists. And, of course, Caesar might not have existed at all. God might have actualized a world without Caesar because those worlds are compossible, that is, possible in themselves. This is what Leibniz means when he claims that contingent truths are certain, but not necessary. To use a simple analogy, once God pushes over the first domino, it is certain that the chain of dominoes will fall, but God might have pushed over a completely different set of dominoes instead. Once a series is actualized, the laws of the series govern it with certainty. And yet the series is not metaphysically necessary since there are other series that God could have actualized instead were it not for his divine benevolence. Leibniz writes in the Discourse on Metaphysics from 1686:

And it is true that we are maintaining that everything that must happen to a person is already contained virtually in his nature or notion, just as the properties of a circle are contained in its definition; thus the difficulty still remains. To address it firmly, I assert that connection or following is of two kinds. The one whose contrary implies a contradiction is absolutely necessary; this deduction occurs in eternal truths, for example, the truths of geometry. The other is necessary only ex hypothesi and, so to speak, accidentally, but it is contingent in itself, since its contrary does not imply a contradiction. And this connection is based not purely on ideas of God’s simple understanding, but on his free decrees and on the sequence of the universe (A. VI. iv, 1546-1547; AG 45).

Absolute necessity, then, applies to necessary truths that are outside the scope of God’s free decrees, and hypothetical necessity applies to contingent truths that are within the scope of God’s free decrees.
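The logical difference between the two Caesar propositions can be checked in a small sketch. It borrows the contemporary possible-worlds reading of “necessarily” purely as an illustrative device, which, as noted in the discussion of infinite analysis, is not how Leibniz himself defines necessity. The three toy worlds and the function names are assumptions made up for the example.

```python
# Hypothetical toy worlds: each world is the set of facts that hold in it.
WORLDS = [
    {"caesar_exists", "caesar_crosses_rubicon"},  # stands in for the actual world
    {"sextus_exists"},                            # a world without Caesar
    set(),                                        # another world without Caesar
]


def necessarily(condition):
    """True if the condition holds in every world of the toy model."""
    return all(condition(world) for world in WORLDS)


def caesar_crosses(world):
    return "caesar_crosses_rubicon" in world


def if_exists_then_crosses(world):
    # Material conditional: false only where Caesar exists and does not cross.
    return "caesar_exists" not in world or caesar_crosses(world)


print(necessarily(caesar_crosses))          # False: absolute necessity fails
print(necessarily(if_exists_then_crosses))  # True: hypothetical necessity holds
```

The first check fails because some worlds lack Caesar altogether; the second succeeds because no world in the model contains a Caesar who does not cross, which mirrors the claim that the crossing is necessary only on the hypothesis that Caesar exists.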

3. Complete Individual Concepts

According to Leibniz, one of the basic features of a substance is that every substance has a “complete individual concept” (CIC, hereafter). The CIC is an exhaustive account of every single property of each substance. He writes in the Discourse on Metaphysics, “the nature of an individual substance or of a complete being is to have a notion so complete that it is sufficient to contain and to allow us to deduce from it all the predicates of the subject to which this notion is attributed” (A. VI. iv, 1540; AG 41). From this logical conception of substance, Leibniz argues that properties included in the CIC are those of the past, present, and future. The CIC informs what is sometimes referred to as Leibniz’s doctrine of marks and traces. He illustrates this thesis using the example of Alexander the Great in the Discourse, writing:

Thus, when we consider carefully the connection of things, we can say that from all time in Alexander’s soul there are vestiges of everything that has happened to him and marks of everything that will happen to him and even traces of everything that happens in the universe, even though God alone could recognize them all (A. VI. iv, 1541; AG 41).

According to Leibniz, then, in analyzing any single substance, God would be able to understand every other substance in the universe, since every substance is conceptually connected to every other substance. For example, in analyzing the concept of Jesus, God would also be able to understand the concept of Judas. Because it is part of Jesus’s CIC that he was betrayed by Judas, it is also part of Judas’s CIC that he will betray Jesus. Every truth about the universe could be deduced this way as well. If a pebble were to fall off a cliff on Neptune in the year 2050, that would be included in Jesus’s CIC too. To use one image of which Leibniz is quite fond, every drop in the ocean is connected to every other drop in the ocean, even though the ripples from those drops could only be understood by God. He writes in the Theodicy:

For it must be known that all things are connected in each one of the possible worlds: the universe, whatever it may be, is all of one piece, like an ocean: the least movement extends its effect there to any distance whatsoever, even though this effect become less perceptible in proportion to the distance. Therein God has ordered all things beforehand once for all, having foreseen prayers, good and bad actions, and all the rest; and each thing as an idea has contributed, before its existence, to the resolution that has been made upon the existence of all things; so that nothing can be changed in the universe (any more than in a number) save its essence or, if you will, save its numerical individuality. Thus, if the smallest evil that comes to pass in the world were missing in it, it would no longer be this world; which nothing omitted and all allowance made, was found the best by the Creator who chose it (G. VI. 107-108; H 128).

In addition to describing substances as possessing a CIC, Leibniz also refers to the essential features of a substance as perception and appetition. These features are explained in more detail in an article on Leibniz’s Philosophy of Mind. In short though, Leibniz held that every single representation of each substance is already contained within itself from the moment it is created, such that the change from one representation to another is brought about by its own conatus. The conatus, or internal striving, is what Leibniz refers to as the appetitions of a substance. Leibniz writes in the late Principles of Nature and Grace:

A monad, in itself, at a moment, can be distinguished from another only by its internal qualities and actions, which can be nothing but its perceptions (that is, the representation of the composite, or what is external, in the simple) and its appetitions (that is, its tendencies to go from one perception to another) which are the principles of change (G. VI. 598; AG 207).

Because every perception of the entire universe is contained within each substance, the entire history of the world is already fully determined. This is the case not just for the actual world after the act of creation; it is true for every possible world. In fact, the fully determined nature of every possible world is what allows God in his infinite wisdom to actualize the best world. God can assess the value of every world precisely because the entire causal history, past, present, and future, is already set.

4. The Containment Theory of Truth and Essentialism

The main article on Leibniz describes his epistemological account in more general terms, but Leibniz’s theory of truth has implications for freedom, so some brief comments bear mentioning. According to Leibniz, propositions are true not if they correspond to the world, but instead based on the relationship between the subject and the predicate. The “predicate in notion principle” (PIN, hereafter), as he describes to Arnauld, is the view according to which “In every true affirmative proposition, whether necessary or contingent, universal or particular, the notion of the predicate is in some way included in that of the subject. Praedicatum inest subjecto; otherwise I do not know what truth is” (G. II, 56; L 337). For example, “Judas is the betrayer of Christ” is true not because there is a Judas who betrays Christ in the actual world, but because the predicate “betrayer of Christ” is contained in the concept of the subject, Judas. Judas’s essence, his thisness, or haecceity, to use the medieval terminology, is partly defined by his betrayal of Christ.
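The containment criterion can be pictured with a very small sketch. It assumes, purely for illustration, that a concept can be represented as a finite set of predicate labels; on Leibniz’s view a complete individual concept would contain infinitely many predicates, so the set `JUDAS` and the function `is_true` are hypothetical simplifications.

```python
# Hypothetical, drastically truncated concept; a complete individual
# concept would on Leibniz's view contain infinitely many predicates.
JUDAS = frozenset({"is a disciple", "is the betrayer of Christ"})


def is_true(subject_concept, predicate):
    """'S is P' is true when P is contained in the concept of S."""
    return predicate in subject_concept


print(is_true(JUDAS, "is the betrayer of Christ"))  # True
print(is_true(JUDAS, "is loyal to the end"))        # False
```

Note that `is_true` never consults a separate model of the world: truth is settled by inspecting the subject concept alone, which is the feature of the containment theory that raises the worry about freedom.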

The PIN theory of truth poses significant problems for freedom though. After all, if it is part of Judas’s essence that he is the betrayer of Christ, then it seems that Judas must betray Christ. And if Judas must betray Christ, then it seems that he cannot do otherwise. And if he cannot do otherwise, then Judas cannot be morally responsible for his actions. Judas cannot be blameworthy for the betrayal of Christ for doing something that was part of his very essence. And yet, despite this difficulty, Leibniz maintained a compatibilist theory of freedom, where Judas’s actions were certain, but not necessary.

Since Leibniz holds that every essence can be represented by God as having a complete concept and that every proposition is analytically true, he maintains that every property is essential to a substance’s being. Leibniz, therefore, straightforwardly adopts an essentialist position. Essentialism is the metaphysical view according to which some properties of a thing are essential to it, such that if it were to lose that property, the thing would cease to exist. Leibniz’s essentialism has been a contested issue in the secondary literature during the first few decades of the twenty-first century. The next section of this article highlights three of the more dominant and interesting interpretations of Leibniz’s essentialism in his mature philosophy: superessentialism, moderate essentialism, and superintrinsicalness.

a. Superessentialism

The most straightforward way of interpreting Leibniz’s mature ontology is that he agrees with the thesis of superessentialism. According to superessentialism, every property is essential to an individual substance’s CIC such that if the substance were to lack any property at all, then the substance would not exist. Leibniz often explains his superessentialist position in the context of explaining God’s actions. For example, in one passage he writes, “You will object that it is possible for you to ask why God did not give you more strength than he has. I answer: if he had done that, you would not exist, for he would have produced not you but another creature” (Grua 327).

In his correspondence with Arnauld, Leibniz makes use of the notion of “possible Adams” to explain what looks very much like superessentialism. In describing another possible Adam, Leibniz stresses to Arnauld the importance of taking every property to be part of a substance, or else we would only have an indeterminate notion, not a complete and perfect representation of him. This fully determinate notion is the way in which God conceives of Adam when evaluating which set of individuals to create when a world is actualized. Leibniz describes this perfect representation to Arnauld, “For by the individual concept of Adam I mean, to be sure, a perfect representation of a particular Adam who has particular individual conditions and who is thereby distinguished from an infinite number of other possible persons who are very similar but yet different from him…” (G. II, 20; LA. 15). The most natural way to interpret this passage is along the superessentialist reading such that if there were a property that were not essential to Adam, then we would have a “vague Adam.” Leibniz even says as much to Arnauld. He writes:

We must not conceive of a vague Adam, that is, a person to whom certain attributes of Adam belong, when we are concerned with determining whether all human events follow from positing his existence; rather we must attribute to him a notion so complete that everything that can be attributed to him can be deduced from it (G. II, 42; AG 73).

The notion of “vague Adams” is further described in a famous passage from the Theodicy. Leibniz describes the existence of other counterparts of Sextus in other possible worlds, that, though complete concepts in their own way, are nevertheless different from the CIC of Sextus in the actual world. Leibniz writes:

I will show you some, wherein shall be found, not absolutely the same Sextus as you have seen (that is not possible, he carries with him always that which he shall be) but several Sextuses resembling him, possessing all that you know already of the true Sextus, but not all that is already in him imperceptibly, nor in consequence all that shall yet happen to him. You will find in one world a very happy and noble Sextus, in another a Sextus content with a mediocre state, a Sextus, indeed, of every kind and endless diversity of forms (G. VI, 363; H 371).

These passages describing other possible Adams and other possible Sextuses suggest that Leibniz was committed to the very strong thesis of superessentialism. Because every property is essential to an individual’s being, every substance is world-bound; that is, each substance only exists in its own world. If any property of an individual were different, then the individual would cease to exist, but there are also an infinite number of other individuals that vary in different degrees, which occupy different worlds. For example, a Judas who was more loyal and did not ultimately betray Christ would not be the Judas of the actual world. Importantly, one small change would also ripple across and affect every other substance in the universe as well. After all, a loyal Judas who does not betray Christ would also mean that Christ was not betrayed, so it would affect his complete concept and essence as well. Put simply, on the superessentialist interpretation of Leibniz’s metaphysics, due to the complete interconnectedness of all things, if any single property of an individual in the world were different than it is, then every substance in the world would be different as well.

The most important worry that Arnauld had about Leibniz’s philosophy was the way in which essentialism threatens freedom. Arnauld thought that human freedom must entail the ability to do otherwise. In the language of possible worlds, this means that an individual is free if they do otherwise in another possible world. Of course, such a view requires the very same individual to exist in another possible world. According to Arnauld, Judas was free in his betrayal of Christ because there is another possible world where Judas does not betray Christ. Freedom requires the actual ability to do otherwise. But Arnauld worried that according to Leibniz’s superessentialism, since it really was not Judas in another possible world that did not betray Christ but instead a counterpart, an individual very similar in another possible world, then we cannot really say that Judas’s action was truly free. Leibniz anticipates this sort of objection in the Discourse, writing, “But someone will say, why is it that this man will assuredly commit this sin? The reply is easy: otherwise it would not be this man” (A. VI. iv, 1576; AG 61). Leibniz, like most classical compatibilists, argues that the actual ability to do otherwise is not a necessary condition for freedom. All that is required is the hypothetical ability to do otherwise. A compatibilist like Leibniz would insist that Judas’s action is nevertheless free even though he cannot do otherwise. If Judas’s past or the laws of nature were different, then he might not betray Christ. Framing freedom in these hypothetical terms is what allows Leibniz to say that the world is certain, but not necessary.

Leibniz’s motivation for superessentialism is driven partly by theodicean concerns. The basic issue in the classical problem of evil is the apparent incompatibility between a perfectly loving, powerful, and wise God on the one hand and cases of suffering on the other. Why would God permit Jesus to suffer? Leibniz’s answer here as it relates to superessentialism is twofold. First, while Jesus’s suffering is indeed tragic, Leibniz contends that it is better for Jesus to exist and suffer than not to exist at all. Second, because of the complete interconnectedness of all things, without Jesus’s suffering, the entire history of the world would be different. Jesus’s suffering is very much part of the calculus when God is discerning which world is the best. And importantly, God does not choose that Jesus suffers, but only chooses a world in which Jesus suffers. Leibniz writes in the Primary Truths from 1689:

Properly speaking, he did not decide that Peter sin or that Judas be damned, but only that Peter who would sin with certainty, though not with necessity, but freely, and Judas who would suffer damnation would attain existence rather than other possible things; that is, he decreed that the possible notion become actual (A. VI. iv, 1646; AG 32).

b. Moderate Essentialism

Despite the evidence for interpreting Leibniz as a superessentialist, there is also textual support for the view that superessentialism is simply too strong a thesis. One reason to adopt a weaker version of essentialism is to be logically consistent with transworld identity, the thesis that individuals can exist across possible worlds. Some commentators like Cover and O’Leary-Hawthorne argue for the weaker essentialist position on the grounds that superessentialism cannot utilize the scholastic difference between essential and accidental properties of which Leibniz sometimes makes use. According to moderate essentialism, Leibniz holds that properties that can be attributed to the species are essential in one way and principles attributed to individuals are essential in a different way.

The weaker thesis of moderate essentialism is the view that only monadic properties are essential to an individual substance, and relational or extrinsic properties should be reducible to monadic properties. The result of this view is that an individual is not “world-bound”; that is, a counterpart of that individual might exist in another possible world, and the essential properties of that individual are what designate it across possible worlds. What follows then is that Jesus, for example, could be said to be free for giving himself up in the Garden of Gethsemane because in another possible world, a counterpart of Jesus did not give himself up. Problematically though, Leibniz explicitly mentions in one of the letters to Arnauld that the laws of nature are indeed a part of an individual’s CIC. Leibniz writes to Arnauld, “As there exist an infinite number of possible worlds, there exists also an infinite number of laws, some peculiar to one world, some to another, and each possible individual contains in the concept of him the laws of his world” (G. II, 40; LA 43).

To reconcile the passages where Leibniz suggests that individuals are world-bound, some commentators argue that it is logically consistent to hold that only the perception or expression of the other substance must exist, but not the substance itself. And since monads are “windowless,” that is, causally isolated, the other substance need not exist at all. In his late correspondence with Des Bosses, Leibniz suggests this very thing, namely, that God could create one monad without the rest of the monads in that world. Leibniz writes:

My reply is easy and has already been given. He can do it absolutely; he cannot do it hypothetically, because he has decreed that all things should function most wisely and harmoniously. There would be no deception of rational creatures, however, even if everything outside of them did not correspond exactly to their experiences, or indeed if nothing did, just as if there were only one mind… (G. II, 496; L 611).

The letter to Des Bosses is compelling for moderate essentialism, but it does not entail it. In fact, conceiving of God’s ability to create only one monad in the actual world with only the expressions of every other substance is perfectly consistent with the superessentialist interpretation. The substances need not actually exist in order to support the claim that every property of a CIC is necessary for that substance. Put differently, if it were part of Peter’s CIC that he denied Christ three times, it need not follow that Christ actually existed for this property to hold, so long as the perceptions of Christ follow from the stores of Peter’s substance.

c. Superintrinsicalness

One final variation of essentialism which we might attribute to Leibniz is called superintrinsicalness. This thesis, defended primarily by Sleigh, states that every individual substance has all its properties intrinsically. This view is distinct from moderate essentialism in a very important way. According to superintrinsicalness, both monadic and extrinsic properties are essential to an individual’s CIC. But, contrary to the superessentialist thesis, the properties that compose an individual’s CIC could be different; that is, some components of a substance’s CIC are necessary, and some are contingent. Leibniz writes in the Discourse:

For it will be found that the demonstration of this predicate of Caesar is not as absolute as those of numbers or of geometry, but that it supposes the sequence of things that God has freely chosen, a sequence based on God’s first free decree always to do what is most perfect and on God’s decree with respect to human nature, following out of the first decree, that man will always do (although freely) that which appears to be best. But every truth based on these kinds of decrees is contingent, even though it is certain; for these decrees do not change the possibility of things, and, as I have already said, even though it is certain that God always chooses the best, this does not prevent something less perfect from being and remaining possible in itself, even though it will not happen, since it is not its impossibility but its imperfection which causes it to be rejected. And nothing is necessary whose contrary is possible (A. VI. iv, 1548; AG 46).

One of the consequences of this view is that a substance’s CIC is contingent on the will of God. For example, on this view, it is a logical possibility that Adam could have had a completely different set of properties altogether. And since a substance could have a completely different CIC and relational properties are part of that CIC, then superintrinsicalness would deny that substances are world-bound. Since Leibniz denies world-bound individuals on this interpretation, he would not need any sort of counterpart theory that comes along with the superessentialist reading. After all, Leibniz’s depiction of counterparts states that there are individuals in other possible worlds that, though they are very similar, are numerically distinct from each other. But on the superintrinsicalness thesis, it may be the case that an individual in another possible world is identical to an individual in the actual world.

There is some textual evidence supporting superintrinsicalness as well. Leibniz writes to Arnauld, “Thus, all human events could not fail to occur as in fact they did occur, once the choice of Adam is assumed; but not so much because of the individual concept of Adam, although this concept includes them, but because of God’s plans, which also enter into the individual concept of Adam” (G. II, 51; LA 57). And yet, if a substance could have had a different CIC, then the notion of a haecceity becomes meaningless. The haecceity serves to individuate substances across possible worlds. If the haecceity could be different than it is, then the concept loses its purpose. We could not pick out the Caesar of this world and another possible world, if the thing that makes Caesar can change.

And yet, if Leibniz accepted superintrinsicalness, then he would have had an easy response to Arnauld’s worry that the complete concept doctrine diminishes the possibility of freedom. Leibniz could have just responded to Arnauld that Judas freely betrayed Christ because, in another possible world, he did not betray Christ; although his haecceity in the actual world determined that he would betray Christ, the haecceity in another possible world may be different such that he did not betray Christ. But this is not the response that Leibniz gives. Instead, he draws on some of the strategies for contingency in defending a compatibilist view of freedom that were discussed earlier.

5. Leibnizian Optimism and the “Best” Possible World

To paraphrase Ivan in The Brothers Karamazov, “The crust of the earth is soaked by the tears of the suffering.” Events like the Thirty Years’ War deeply affected Leibniz. His theodicean project was an attempt at an explanation and justification for God’s permission of such suffering. Why would a perfectly wise, powerful, and good God permit suffering? And even if we were to grant that God must permit suffering to allow for greater goods such as compassion and empathy, why must there be so much of it? Would the world not have been better with less suffering? The crux of Leibniz’s philosophical optimism was that creating this world was the best that God could do: it was metaphysically impossible for the world to be better than it is. And so, God is absolved of responsibility for not creating something better. But how could Leibniz maintain a position in such contrast to our intuitions that the world could be better with less suffering?

Arguably the most famous part of Leibniz’s philosophy is his solution to the problem of evil. The problem of evil is the most significant objection to classical theism, and it is one that Leibniz developed an entire system of possible worlds to address. He argues that God freely created the best of all possible worlds from amongst an infinite plurality of alternatives. Voltaire mocked such optimism in his Candide, suggesting that if this is really the best world that God could create, then in the best case God is not worth much reverence, and in the worst case God does not exist at all. But what exactly did Leibniz mean by the “best” possible world? And was Voltaire’s criticism warranted? Leibniz has several responses to the problem of evil which draw on his complex theory of possible worlds.

First, Voltaire’s misinterpretation rests on the false assumption that Leibniz holds the actual world to be morally best. Instead, Leibniz contends that the world is metaphysically best. But how are these “moral” and “metaphysical” qualifications related to one another? After all, Leibniz sometimes remarks, as he does in the Discourse, that “God is the monarch of the most perfect republic, composed of all minds, and the happiness of this city of God is his principal purpose” (A. VI. iv, 1586; AG 67). And yet at other times, as in the Theodicy, he contends that “The happiness of rational creatures is one of the aims God has in view; but it is not his whole aim, nor even his ultimate aim” (G. VI, 169-170; H 189). It seems then that Leibniz is, at least on the face of it, unsure how much God is concerned with the happiness of creation. Happiness is a “principal” purpose of God, and yet not an “ultimate aim.”

One way to reconcile these apparently disparate positions is to be clearer about what Leibniz means by happiness. Leibniz often reminds the reader that the actual world is the best not because it guarantees every substance the most pleasurable existence. Rather, he holds, as he does in the Confessio, that “Happiness is the state of mind most agreeable to it, and nothing is agreeable to a mind outside of harmony” (A. VI. iii, 116; CP 29). Put differently, the best of all possible worlds is metaphysically best because it is the world where rational minds can contemplate the harmonious nature of creation. Leibniz goes into more detail in The Principles of Nature and Grace, writing:

It follows from the supreme perfection of God that in producing the universe he chose the best possible plan, containing the greatest variety together with the greatest order; the best arranged situation, place and time; the greatest effect produced by the simplest means; the most power, the most knowledge, the most happiness and goodness in created things of which the universe admitted (G. VI, 603).

In short, Leibniz holds that while there is concern with the happiness of minds during the act of creation, the kind of happiness that God wishes to guarantee is not physical pleasure or the absence of physical pain, but instead the rational recognition that the actual world is the most harmonious.

Second, Leibniz contends that “best” does not mean “perfect” or even “very good.” While it is true that we oftentimes have no idea why bad things sometimes happen to good people and why good things sometimes happen to bad people, what we can be sure of is that God, as an ens perfectissimum, a most perfect being, chose this world because it was the best. And it is the best because it contains the most variety and plurality of substances governed by the fewest laws of nature. He writes in the Discourse:

One can say, in whatever manner God might have created the world, it would always have been regular and in accordance with a certain general order. But God has chosen the most perfect world, that is, the one which is at the same time the simplest in hypotheses and richest in phenomena (A. VI. iv, 1538; AG 39).

Even if we were to grant that Leibniz means something particular by “best,” how should we understand the criteria according to which the “best” world is the one that is richest in phenomena and governed by the simplest laws?

It is critical that Leibniz has more than one criterion for the best possible world. If there were only one criterion, like the concern for the happiness of creatures, for example, then there is a problem of maximization. For whatever world God created, he could have created another world with more happiness. And since God could always create a better world, then he could never act for the best, for there is no best. But since there is a world, either this is not the best of all possible worlds, or there is no maximally perfect being. Malebranche (and Aquinas) held that there was no best world, and Leibniz wished to distance himself from their views. He writes in the Discourse, “They [the moderns like Malebranche] imagine that nothing is so perfect that there is not something more perfect—this is an error” (A. VI. iv, 1534; AG 37).

Rather than maximizing one feature of a world, which would be impossible, Leibniz reasons that God must optimize the competing criteria of richness of phenomena, simplicity of laws, and abundance of creatures. He writes in the Discourse:

As for the simplicity of the ways of God, this holds properly with respect to his means, as opposed to the variety, richness, and abundance, which holds with respect to his ends or effects. And the one must be in balance with the other, as are the costs of a building and the size and beauty one demands of it (A. VI. iv, 1537; AG 39).

God, like an architect with unlimited resources, must nevertheless weigh competing variables to optimize the best creation.

Even if we grant the claim that God considers competing variables in creating the best world, we might still wonder why those variables are the ones of concern. Although it is unclear why Leibniz chose variety, richness, and abundance as the criteria, he points to simplicity as a possible overarching principle. Unfortunately, simplicity alone will not do, for it would be simpler to have only one substance rather than an abundance of substances. It seems then that simplicity in conjunction with a world that is worthy of the majesty of God are the underlying criteria for the best of all possible worlds.

The notion of simplicity is critical for Leibniz’s theodicean account. In fact, simplicity is the key concept that sets Leibniz’s account of God’s justice directly in line with that of his contemporary, Nicolas Malebranche. Leibniz remarks at one point that Malebranche’s theodicean account reduces in most substantial ways to his own. He writes in the Theodicy, “One may, indeed, reduce these two conditions, simplicity and productivity, to a single advantage, which is to produce as much perfection as is possible: thus Father Malebranche’s system in this point amounts to the same as mine” (G. VI, 241; H 257). The similarities of their accounts are readily apparent. Consider Malebranche’s remark that “God, discovering in the infinite treasures of his wisdom an infinity of possible worlds…, determines himself to create that world…that ought to be the most perfect, with respect to the simplicity of the ways necessary to its production or to its conservation” (OCM. V, 28).

Third, Leibniz appeals to intellectual humility and insists that our intuition that this is not the best possible world is simply mistaken. If we had God’s wisdom, then we would understand that this is the best possible world. Part of the appeal to intellectual humility is also the recognition that God evaluates the value of each world in its totality. In just the same way that it would be unfair to judge the quality of a film by looking at a single frame of the reel, Leibniz reasons that it is also unfair to judge the quality of the world by any singular instance of suffering. And given our relatively small existence in the enormous history of the universe, even long periods of suffering should be judged with proper context. World wars, global pandemics, natural disasters, famine, genocide, slavery, and total climate catastrophe are immense tragedies to be sure, but they mean relatively little in the context of the history of the universe.

The recognition that these cases of suffering mean little should not be interpreted to imply that they mean nothing. A perfectly benevolent God cares about the suffering of every part of creation, and yet, God must also weigh that suffering against the happiness and flourishing of the entirety of the universe, past, present, and future. And moreover, Leibniz reasons that every bit of suffering will ultimately lead to a greater good that redeems or justifies the suffering. To use the language in the contemporary literature in philosophy of religion, there is no “gratuitous evil.” Every case of evil ultimately helps improve the value of the entire universe. In a mature piece called the Dialogue on Human Freedom and the Origin of Evil, Leibniz writes:

I believe that God did create things in ultimate perfection, though it does not seem so to us considering the parts of the universe. It’s a bit like what happens in music and painting, for shadows and dissonances truly enhance the other parts, and the wise author of such works derives such a great benefit for the total perfection of the work from these particular imperfections that it is much better to make a place for them than to attempt to do without them. Thus, we must believe that God would not have allowed sin nor would he have created things he knows will sin, if he could not derive from them a good incomparably greater than the resulting evil (Grua 365-366; AG 115).

6. Compatibilist Freedom

a. Human Freedom

Leibniz was deeply concerned with how to understand freedom properly. In one sense, though, his hands were tied; given his fundamental commitment to the principle of sufficient reason as one of the “great principles of human reason” (G. VI, 602), Leibniz was straightforwardly compelled to determinism. Since the principle of sufficient reason rules out causes which are isolated from the causal series, one of the paradigmatic signs of thoroughgoing libertarian accounts of free will, the most that Leibniz could hope for was a kind of compatibilist account of freedom. And indeed, Leibniz, like most of his contemporaries, openly embraced the view that freedom and determinism were compatible.

According to the account of freedom developed in his Theodicy, free actions are those that satisfy three individually necessary and jointly sufficient conditions—they must be intelligent, spontaneous, and contingent. He writes in the Theodicy:

I have shown that freedom according to the definition required in the schools of theology, consists in intelligence, which involves a clear knowledge of the object of deliberation, in spontaneity, whereby we determine, and in contingency, that is, in the exclusion of logical or metaphysical necessity (G. VI, 288; H 288).

Leibniz derives the intelligence and spontaneity conditions from Aristotle, but adds contingency as a separate requirement. For an action to be free, Leibniz contends that the agent must have “distinct knowledge of the object of deliberation” (G. VI, 288; H 288), meaning that the agent must have knowledge of their action and also of alternative courses of action. For an action to be spontaneous, the agent’s actions must derive from an internal source and not be externally caused. There is a sense in which every action is spontaneous, in that each substance is windowless, that is, causally isolated from every other substance. And finally, actions must be contingent; that is, they must exclude logical or metaphysical necessity.

b. Divine Freedom

It was not just human freedom, though, that Leibniz treated as intelligent, spontaneous, and contingent. In fact, one of the most remarkably consistent parts of Leibniz’s thought, going back to his jurisprudential writings in the 1660s all the way through to his mature views on metaphysics and philosophical theology, is that the gap between humans and God is a difference of degree and not type. There is nothing substantively different between humans and God. It is for precisely this reason that he insists in his natural law theory that we can discern the nature of justice and try to implement it in worldly affairs. Justice for humans ought to mirror the justice of God.

The implication of this theological view is that God is free in the same way that humans are free; God is perfectly free because his actions are also intelligent, spontaneous, and contingent. Since God is omniscient, he has perfect perceptions of the entire universe, past, present, and future. Since God determines his own actions without any external coercion, he is perfectly spontaneous. And since there is an infinite plurality of worlds, possible in themselves, which God could choose, his actions are contingent. Leibniz reasons that since God meets each of these conditions in the highest sense, God is perfectly free. And even though God is invariably led toward the Good, this in no way is an infringement on his freedom. He writes in the Theodicy:

…It is true freedom, and the most perfect, to be able to make the best use of one’s free will, and always to exercise this power, without being turned aside either by outward force or by inward passions, whereof the one enslaves our bodies and the other our souls. There is nothing less servile and more befitting the highest degree of freedom than to be always led towards the good, and always by one’s own inclination, without any constraint and without any displeasure. And to object that God therefore had need of external things is only a sophism (G. VI, 385; H 386).

Even with this mature account of freedom in place, Leibniz may still have the very same problem that he was concerned about prior to his meeting with Spinoza in 1676. If God’s nature requires him to do only the best, and assuming that there is only one uniquely best world, then it follows that the only possible world is the actual world. God’s essential nature and the fact of a uniquely best world entail that God must create the best. And so, we may end up back in the necessitarian position after all, albeit in a somewhat different way than Spinoza. Although Leibniz endorses the anthropomorphic conception of God that Spinoza denies, both philosophers hold that God’s nature necessitates, in some way, that there is only one possible world, the actual world. Ultimately, it is up to us to decide whether the strategies for contingency and the account of human and divine freedom that Leibniz develops over the course of his long and illustrious career are successful enough to avoid the necessitarian threat about which he was so concerned.

7. References and Further Reading

a. Primary Sources

  • [A] Sämtliche Schriften und Briefe. Ed. Deutsche Akademie der Wissenschaften. Darmstadt, Leipzig, Berlin: Akademie Verlag, 1923. Cited by series, volume, page.
  • [AG] Philosophical Essays. Translated and edited by Roger Ariew and Dan Garber. Indianapolis: Hackett, 1989.
  • [CP] Confessio Philosophi: Papers Concerning the Problem of Evil, 1671–1678. Translated and edited by Robert C. Sleigh, Jr. New Haven, CT: Yale University Press, 2005.
  • [G] Die Philosophischen Schriften von Gottfried Wilhelm Leibniz. Edited by C.I. Gerhardt. Berlin: Weidmann, 1875-1890. Reprint, Hildesheim: Georg Olms, 1978. Cited by volume, page.
  • [Grua] Textes inédits d’après les manuscrits de la Bibliothèque provinciale de Hanovre. Edited by Gaston Grua. Paris: Presses Universitaires, 1948. Reprint, New York and London: Garland Publishing, 1985.
  • [H] Theodicy: Essays on the Goodness of God, the Freedom of Man and the Origin of Evil. Translated by E.M. Huggard. La Salle, IL: Open Court, 1985.
  • [L] Philosophical Papers and Letters. Edited and translated by Leroy E. Loemker. 2nd Edition. Dordrecht: D. Reidel, 1969.
  • [LA] The Leibniz-Arnauld Correspondence. Edited by H.T. Mason. Manchester: Manchester University Press, 1967.
  • [OCM] Œuvres complètes de Malebranche (20 volumes). Edited by A. Robinet. Paris: J. Vrin, 1958–84.

b. Secondary Sources

  • Adams, Robert Merrihew. Leibniz: Determinist, Theist, Idealist. New York: Oxford University Press, 1994.
  • Bennett, Jonathan. Learning from Six Philosophers Vol. 1. New York: Oxford University Press, 2001.
  • Blumenfeld, David. “Is the Best Possible World Possible?” Philosophical Review 84, No. 2, April 1975.
  • Blumenfeld, David. “Perfection and Happiness in the Best Possible World.” In Cambridge Companion to Leibniz. Edited by Nicholas Jolley. Cambridge: Cambridge University Press, 1994.
  • Broad, C.D. Leibniz: An Introduction. Cambridge: Cambridge University Press, 1975.
  • Brown, Gregory and Yual Chiek. Leibniz on Compossibility and Possible Worlds. Cham, Switzerland: Springer, 2016.
  • Brown, Gregory. “Compossibility, Harmony, and Perfection in Leibniz.” The Philosophical Review 96, No. 2, April 1987.
  • Cover, J.A. and John O’Leary-Hawthorne. Substance and Individuation in Leibniz. Cambridge: Cambridge University Press, 1999.
  • Curley, Edwin. “Root of Contingency.” In Leibniz: A Collection of Critical Essays. Edited by Harry Frankfurt. New York: Doubleday, 1974.
  • D’Agostino, Fred. “Leibniz on Compossibility and Relational Predicates.” The Philosophical Quarterly 26, No. 103, April 1976.
  • Hacking, Ian. “A Leibnizian Theory of Truth.” In Leibniz: Critical and Interpretative Essays. Edited by Michael Hooker. Minneapolis: University of Minnesota Press, 1982.
  • Horn, Charles Joshua. “Leibniz and Impossible Ideas in the Divine Intellect” In Internationaler Leibniz-Kongress X Vorträge IV, Edited by Wenchao Li. Hannover: Olms, 2016.
  • Horn, Charles Joshua. “Leibniz and the Labyrinth of Divine Freedom.” In The Labyrinths of Leibniz’s Philosophy. Edited by Aleksandra Horowska. Peter Lang Verlag, 2022.
  • Koistinen, Olli, and Arto Repo. “Compossibility and Being in the Same World in Leibniz’s Metaphysic.” Studia Leibnitiana 31, 2021.
  • Look, Brandon. “Leibniz and the Shelf of Essence.” The Leibniz Review 15, 2005.
  • Maher, Patrick. “Leibniz on Contingency.” Studia Leibnitiana 12, 1980.
  • Mates, Benson. “Individuals and Modality in the Philosophy of Leibniz.” Studia Leibnitiana 4, 1972.
  • Mates, Benson. “Leibniz on Possible Worlds.” Leibniz: A Collection of Critical Essays, edited by Harry Frankfurt, 335-365. Notre Dame: University of Notre Dame Press, 1976.
  • Mates, Benson. The Philosophy of Leibniz: Metaphysics and Language. New York: Oxford University Press, 1986.
  • McDonough, Jeffrey. “Freedom and Contingency.” The Oxford Handbook of Leibniz. New York: Oxford University Press, 2018.
  • McDonough, Jeffrey. “The Puzzle of Compossibility: The Packing Strategy.” Philosophical Review 119, No. 2, 2010.
  • Merlo, Giovanni. “Complexity, Existence, and Infinite Analysis.” Leibniz Review 22, 2012.
  • Messina, James and Donald Rutherford. “Leibniz on Compossibility.” Philosophy Compass 4, No. 6, 2009.
  • Mondadori, Fabrizio. “Leibniz and the Doctrine of Inter-World Identity.” Studia Leibnitiana 7, 1975.
  • Mondadori, Fabrizio. “Reference, Essentialism, and Modality in Leibniz’s Metaphysics.” Studia Leibnitiana 5, 1973.
  • Rescher, Nicholas. Leibniz: An Introduction to His Philosophy. Totowa, New Jersey: Rowman and Littlefield, 1979.
  • Rescher, Nicholas. Leibniz’s Metaphysics of Nature. Dordrecht, 1981.
  • Rescher, Nicholas. The Philosophy of Leibniz. Englewood Cliffs, NJ: Prentice Hall, 1967.
  • Rowe, William. Can God Be Free? New York: Oxford University Press, 2006.
  • Russell, Bertrand. A Critical Exposition of the Philosophy of Leibniz, 2nd ed. London: George Allen and Unwin, 1937. Reprint London: Routledge, 1997.
  • Rutherford, Donald. Leibniz and the Rational Order of Nature. Cambridge: Cambridge University Press, 1995.
  • Rutherford, Donald. “The Actual World.” The Oxford Handbook of Leibniz. New York: Oxford University Press, 2018.
  • Sleigh, Robert C., Jr. Leibniz and Arnauld: A Commentary on Their Correspondence. New Haven: Yale University Press, 1990.
  • Wilson, Margaret D. “Compossibility and Law.” In Causation in Early Modern Philosophy: Cartesianism, Occasionalism, and Pre-Established Harmony. Edited by Steven Nadler. University Park, Pennsylvania: Pennsylvania State University Press, 1993.
  • Wilson, Margaret D. “Possible Gods.” Review of Metaphysics 32, 1978/79.

Author Information

Charles Joshua Horn
Email: jhorn@uwsp.edu
University of Wisconsin Stevens Point
U. S. A.

Faith: Contemporary Perspectives

Faith is a trusting commitment to someone or something. Faith helps us meet our goals, keeps our relationships secure, and enables us to retain our commitments over time. Faith is thus a central part of a flourishing life.

This article is about the philosophy of faith. There are many philosophical questions about faith, such as: What is faith? What are its main components or features? What are the different kinds of faith? What is the relationship between faith and other similar states, such as belief, trust, knowledge, desire, doubt, and hope? Can faith be epistemically rational? Practically rational? Morally permissible?

This article addresses these questions. It is divided into three main parts. The first is about the nature of faith. This includes different kinds of faith and various features of faith. The second discusses the way that faith relates to other states. For example, what is the difference between faith and hope? Can someone have faith that something is true even if they do not believe it is true? The third discusses three ways we might evaluate faith: epistemically, practically, and morally. While faith is not always rational or permissible, this section covers when and how it can be. The idea of faith as a virtue is also discussed.

This article focuses on contemporary work on faith, largely since the twentieth century. Historical accounts of faith are also significant and influential; for an overview of those, see the article “Faith: Historical Perspectives.”

Table of Contents

  1. The Nature of Faith
    1. Types of Faith
      1. Attitude-Focused vs. Act-Focused
      2. Faith-That vs. Faith-In
      3. Religious vs. Non-Religious
      4. Important vs. Mundane
    2. Features of Faith
      1. Trust
      2. Risk
      3. Resilience
      4. Going Beyond the Evidence
  2. Faith and Other States
    1. Faith and Belief
      1. Faith as a Belief
      2. Faith as Belief-like
      3. Faith as Totally Different from Belief
    2. Faith and Doubt
    3. Faith and Desire
    4. Faith and Hope
    5. Faith and Acceptance
  3. Evaluating Faith
    1. Faith’s Epistemic Rationality
      1. Faith and Evidence
      2. Faith and Knowledge
    2. Faith’s Practical Rationality
    3. Faith and Morality/Virtue
  4. Conclusion
  5. References and Further Reading

1. The Nature of Faith

As we saw above, faith is a trusting commitment to someone or something. While this definition is a good start, it leaves many questions unanswered. This section is on the nature of faith and is divided into two subsections. The first covers distinctions among different kinds of faith and the second explores features of faith.

a. Types of Faith

This subsection outlines distinctions among different kinds of faith. It focuses on four distinctions: attitude-focused faith vs. act-focused faith, faith-that vs. faith-in, religious vs. non-religious faith, and important vs. mundane faith.

i. Attitude-Focused vs. Act-Focused

One of the most important distinctions is faith as an attitude compared to faith as an action. Faith, understood as an attitude, is similar to attitudes like beliefs or desires. In the same way that you might believe that God exists, you might have faith that God exists. Both are attitudes (things in your head), rather than actions (things you do). Call this attitude-focused faith.

Attitude-focused faith is thought to involve at least two components (Audi 2011: 79). The first is a belief-like, or cognitive, component. This could simply be a belief. While some contend that faith always involves belief, others argue that faith can involve something weaker, but still belief-like: some confidence that the object of faith is true, thinking it is likely to be true, supported by the evidence, or the most likely of the options under consideration. Either way, attitude-focused faith involves something belief-like. For example, if you have faith that your friend will win their upcoming basketball game, you will think there is at least a decent chance they win. It does not make sense to have faith that your friend’s team will win if you are convinced that they are going to get crushed. Later, this article returns to questions about the exact connection between faith and belief, but it is relatively uncontroversial that attitude-focused faith involves a belief-like component.

The second component of attitude-focused faith is a desire-like, or conative, component. Attitude-focused faith involves a desire for, or a positive evaluation of, its object. Returning to our example, if you have faith that your friend will win their upcoming game, then you want them to win the game. You do not have faith that they will win if you are cheering for the other team or if you want them to lose. This example illustrates why plausibly, attitude-focused faith involves desire; this article returns to this later as well.

A second kind of faith is not in your head, but an action. This kind of faith is similar to taking a “leap of faith”—an act of trust in someone or something. For example, if your friend promises to pick you up at the airport, waiting for them rather than calling a taxi demonstrates faith that they will pick you up. Walking across a rickety bridge demonstrates faith that the bridge will hold you. Doing a trust fall demonstrates faith that someone will catch you. Call this type of faith an act of faith, or action-focused faith.

On some views, such as Kvanvig’s (2013), faith is a disposition. Just as glass is disposed to shatter (even if it never actually shatters), having faith, on dispositional views, is a matter of being disposed to do certain things (even if the faithful never actually do them). The view that faith is a disposition could be either attitude-focused or action-focused. Faith might be a disposition to act in certain ways, maybe ways that demonstrate trust or involve risk. This type of faith would be action-focused (see Kvanvig 2013). Faith might instead be a disposition to have certain attitudes, such as believing, being confident in, or desiring the truth of certain propositions. This type of faith would be attitude-focused (see Byerly 2012).

What is the relationship between attitude-focused faith and action-focused faith? They are distinct states, but does one always lead to the other? One might think that, in the same way that beliefs and desires cause actions (for example, your belief that there is food in the fridge and your desire for food lead you to open the fridge), attitude-focused faith will cause (or dispose you toward) action-focused faith, as attitude-focused faith is made up of belief- and desire-like states (see Jackson 2021). On the other hand, we may not always act on our beliefs and our desires. So one question is: could you have attitude-focused faith without action-focused faith?

A related question is whether you could have action-focused faith without attitude-focused faith. Could you take a leap of faith without having the belief- and desire-like components of attitude-focused faith? Speak (2007: 232) provides an example that suggests that you could take a leap of faith without a corresponding belief. Suppose Thomas was raised in circumstances that instilled a deep distrust of the police. Thomas finds himself in an unsafe situation and a police officer is attempting to save him; Thomas needs to jump from a dangerous spot so the officer can catch him. While the officer has provided Thomas with evidence that he is reliable, Thomas cannot shake the belief instilled from his upbringing that the police are not trustworthy. Nonetheless, Thomas jumps. Intuitively, Thomas put his faith in the officer, even without believing that the officer is trustworthy.

Generally, you can act on something, even rationally, if you have a lot to gain if it is true, even if you do not believe that it is true. Whether this counts as action-focused faith without attitude-focused faith, however, will depend on the relationship between faith and belief, a question addressed in a later section.

ii. Faith-That vs. Faith-In

A second distinction is between faith-that and faith-in. Faith-that is faith that a certain proposition is true. Propositions are true or false statements, expressed by declarative sentences. So 1+1=2, all apples are red, and God exists are all propositions. In the case of faith, you might have faith that a bridge will hold you, faith that your friend will pick you up from the airport, or faith that God exists. Faith-that is similar to other propositional attitudes, like belief and knowledge. This suggests that attitude-focused faith is a species of faith-that, since the attitudes closely associated with faith, like belief and hope, are propositional attitudes.

There’s also faith-in. Faith-in is not faith toward propositions, but faith toward persons or ideals. For example, you might have faith in yourself, faith in democracy, faith in your spouse, faith in a political party, or faith in recycling.

Some instances of faith can be expressed as both faith-that and faith-in. For example, theistic faith might be described as faith-that God exists or faith-in God. You might also have faith-that your spouse is a good person or faith-in your spouse. There are questions about the relationship between faith-that and faith-in. For example, is one more fundamental? Do all instances of faith-that reduce to faith-in, or vice versa? Or are they somewhat independent? Is there a significant difference between faith-in X, and faith-that a proposition about X is true?

iii. Religious vs. Non-Religious

A third distinction is between religious faith and secular faith. The paradigm example of religious faith is faith in God or gods, but religious faith can also include: faith that certain religious doctrines are true, faith in the testimony of a religious leader, faith in a Scripture or holy book, or faith in the church or in a religious group. In fact, according to one view that may be popular in certain religious circles, “faith” is simply belief in religious propositions (see Swindal 2021).

However, faith is not merely religious—there are ample examples of non-religious faith. This includes the faith that humans have in each other, faith in secular goals or ideals, and faith in ourselves. It is a mistake to think that faith is entirely a religious thing or reserved only for the religious. Faith is a trusting commitment—and this can involve many kinds of commitments. This includes religious commitment, but also includes interpersonal commitments like friendship or marriage, intrapersonal commitments we have to ourselves or our goals, and non-personal commitments we have to ideals or values.

One reason this distinction is important is that some projects have good reason to focus on one or the other. For example, on some religious traditions, like the Christian tradition, faith is a condition for salvation. But presumably, not any kind of faith will do—religious faith is required. One project in Christian philosophical theology provides an analysis of the religious faith that is closely connected to salvation (see Bates 2017). Projects like these have good reason to set secular faith aside. Others may have a special interest in secular faith and thus set religious faith aside.

This article considers both religious and non-religious faith. While they are different in key ways, they both involve trusting commitments, and many contemporary accounts of faith apply to both.

iv. Important vs. Mundane

A final distinction is between important faith and mundane faith. Important faith involves people, ideals, or values that are central to your life goals, projects, and commitments. Examples of important faith include religious faith, faith in your spouse, or faith in your political or ethical values. In most cases, important faith is essential to your life commitments and often marks values or people that you build your life around.

But not all faith is so important. You might have faith that your office chair will hold you, faith that your picnic will not be rained out, or faith that your spouse’s favorite football team will win their game this weekend. These are examples of mundane faith. While mundane faith still plausibly involves some kind of trusting commitment, this commitment is less important and more easily given up. You may have a weak commitment to your office chair. But—given it is not a family heirloom—if the chair started falling apart, you would quickly get rid of it and buy a new one. So important faith is associated with your central, life-shaping commitments, and mundane faith is associated with casual commitments that are more easily given up.

One might distinguish between objectively important faith—faith held to objectively valuable objects—and subjectively important faith—faith held to objects that are important to a particular individual but may or may not be objectively valuable. For example, some critics of religion might argue that while religious faith might be subjectively important to some, it is nonetheless not objectively important.

While this article focuses mostly on important faith, some of what is discussed also applies to mundane faith, but it may apply to a lesser degree. For example, if faith involves a desire, then the desires associated with mundane faith may be weaker. Now, consider features of faith.

b. Features of Faith

This subsection discusses four key features of faith: trust, risk, resilience, and going beyond the evidence. These four features are often associated with faith. They are not necessarily synonymous with faith, and not all accounts of faith give all four a starring role. Nonetheless, they play a role in understanding faith and its effects. Along the way, this article considers specific accounts that closely associate faith with each feature.

i. Trust

The first feature of faith is trust. As we have noted, faith is a trusting commitment. Trust involves reliance on another person. This can include, for example, believing what they say, depending on them, or being willing to take risks that hinge on them coming through for you. Faith and trust are closely connected, and some even use “faith” and “trust” synonymously (Bishop 2016).

The close association between faith and trust lends itself nicely to a certain view of faith: faith is believing another’s testimony. Testimony is another’s reporting that something is true. Accounts that connect faith and testimony are historically significant, tracing back to Augustine, Aquinas, and Locke. Recent accounts of faith as believing another’s testimony include Anscombe (2008) and Zagzebski (2012). Anscombe, for example, says that to have faith that p is to believe someone that p. Religious faith might be believing God’s testimony or the testimony of religious leaders. Interpersonal faith might be believing the testimony of your friends or family.

Plausibly, trust is a key feature—likely the key feature—of interpersonal faith. Faith in others involves trusting another person: this includes faith in God or gods, but also faith in other people and faith in ourselves. It is plausible that even propositional faith can be understood in terms of trust. For example, propositional faith that your friend will pick you up from the airport involves trusting your friend. Even in mundane cases propositional faith could be understood as trust: if you have faith it will be sunny tomorrow, you trust it will be sunny tomorrow.

ii. Risk

Faith is also closely related to risk. William James (1896/2011) discusses a hiker who gets lost. She finally finds her way back to civilization, but as she is walking, she encounters a deep and wide crevice on the only path home. Suppose that, to survive, she must jump this crevice, and it is not obvious that she can make the jump. She estimates that she has about a 50/50 chance. She has two choices: she can give up and likely die in the wilderness. Or she can take a (literal) leap of faith and do her best to make it across the crevice. This decision to jump involves a risk: she might fail to make it to the other side and fall to her death.

Risk involves making a decision in a situation where some bad outcome is possible but uncertain. Jumping a wide crevice involves the possible bad outcome of falling in. Gambling involves the possible bad outcome of losing money. Buying a stock involves the possible bad outcome of its value tanking.

If faith is connected to risk, this suggests two things about faith. First, faith is associated with a degree of uncertainty. For example, if one has faith that something is true, then one is uncertain regarding its truth or falsity. Second, faith is exercised in cases where there is a potentially bad outcome. The outcome might involve the object of faith’s being false, unreliable, or negative in some other way. For example, if you have faith that someone will pick you up at the airport, there is the possibility that they do not show up. If you have faith in a potential business partner, there is the possibility that they end up being dishonest or difficult to work with.

These examples illustrate the connection between risk and action-focused faith. When we act in faith, there is usually some degree of uncertainty involved and a potentially bad outcome. If you have action-focused faith that your spouse will pick you up, you wait for them rather than calling a taxi, and you risk waiting at the airport for a long time, and maybe even missing an important appointment, if your spouse does not show. If you have action-focused faith that someone is a good business partner, you dedicate time, money, and energy to your shared business, and you risk wasting all those resources if they are dishonest or impossible to work with. Or you might have action-focused faith that God exists and dedicate your life to God, which risks wasting your life if God does not exist.

Attitude-focused faith may also involve risk: some kind of mental risk. William James (1896/2011) discusses our two epistemic goals: believe truth and avoid error. We want to have true beliefs, but if that is all we cared about, we should believe everything. We want to avoid false beliefs, but if that is all we cared about, we should believe nothing. Much of the ethics of belief is about balancing these two goals, and this balance can involve a degree of mental risk. For example, suppose you have some evidence that God exists, but your evidence is not decisive, and you also recognize that there are some good arguments that God does not exist. While it is safer to withhold judgment on whether God exists, you also could miss out on a true belief. Instead, you might take a “mental” risk, and go ahead and believe that God exists. While you are not certain that God exists, and believing risks getting it wrong, you also face a bad outcome if you withhold judgment: missing out on a true belief. By believing that God exists in the face of indecisive evidence, you take a “mental” or “attitude” risk. James argues that this kind of mental risk can be rational (“lawful”) when “reason does not decide”—our evidence does not make it obvious that the statement believed is true or false—and we face a “forced choice”—we have to commit either way.

The view that faith involves an attitude-risk closely resembles John Bishop’s account of faith, which is inspired by insights from James. Bishop (2007) argues that faith is a “doxastic venture” (doxastic meaning belief-like). Bishop’s view is that faith involves believing beyond the evidence. Bishop argues that certain propositions (including what he calls “framework principles”) are evidentially undecidable, meaning our evidence cannot determine whether the claim is true or false. In these cases, you can form beliefs for non-evidential reasons—for example, beliefs can be caused by desires, emotions, affections, and so forth. This non-evidential believing enables you to believe beyond the evidence (see also Ali 2013).

iii. Resilience

A third feature of faith is resilience. Faith’s resilience stems from the connection between faith and commitment. Consider some examples. If you have faith that my team will win their upcoming game, you have some kind of commitment to my team. If you have faith that God exists, this involves a religious commitment. You might commit to finishing a degree, picking up a new instrument, a marriage, or a religion. These commitments can be difficult to keep—you get discouraged, doubt yourself or others, your desires and passions fade, and/or you get counterevidence that makes you wonder if you should have committed in the first place. Faith’s resilience helps you overcome these obstacles and keep your commitments.

Lara Buchak’s (2012) risky commitment view of faith brings risk and commitment together. On Buchak’s view, faith involves stopping one’s search for evidence and making a commitment. Once this commitment is made, you will maintain that commitment, even in the face of new counterevidence. For example, suppose you are considering making a religious commitment. For Buchak, religious faith involves stopping your search for evidence regarding whether God exists and taking action: making the commitment. Of course, this does not mean that you can no longer consider the evidence or have to stop reading philosophy of religion, but you are not looking for new evidence to decide whether to make (or keep) the commitment. Once you’ve made this religious commitment, you will continue in that commitment even if you receive evidence against the existence of God—at least, to a degree.

The literature on grit is also relevant to faith’s resilience. Grit, a phenomenon discussed by both philosophers and psychologists, is the ability to persevere to achieve long-term, difficult goals (Morton and Paul 2019). It takes grit to train for a marathon, survive a serious illness, or remain married for decades. Matheson (2018) argues that faith is gritty, and this helps explain how faith can be both rational and voluntary. Malcolm and Scott (2021) argue that faith’s grit helps the faithful be resilient to a variety of challenges. Along similar lines, Jackson (2021) argues that the belief- and desire-like components of faith explain how faith can help us keep our long-term commitments, in light of both epistemic and affective obstacles.

iv. Going Beyond the Evidence

A final feature of faith is that it goes beyond the evidence. This component is related to faith’s resilience. Faith helps you maintain your commitments because it goes beyond the evidence. You might receive counterevidence that makes you question whether you should have committed in the first place. For example, you might commit to a certain major, but a few months in, realize the required classes are quite difficult and demanding. You might wonder whether you are cut out for that field of study. Or you might have a religious commitment, but then encounter evidence that an all-good, all-loving God does not exist—such as the world’s serious and terrible evils. In either case, faith helps you continue in your commitment in light of this counterevidence. And if the evidence is misleading—so you are cut out for the major, or God does exist—then this is a very good thing.

The idea that faith goes beyond the evidence raises questions about rationality. How can faith go beyond the evidence but still be rational? Is it not irrational to disrespect or ignore evidence? This article returns to this question later, but for now, note that there is a difference between going beyond the evidence and going against the evidence. Going beyond the evidence might look like believing or acting when the evidence is decent but imperfect. Bishop’s account, for example, is a way that faith might “venture” beyond the evidence (2007). However, this does not mean faith goes against the evidence, requiring you to believe something that you have overwhelming evidence is false.

Some do argue that faith goes against the evidence. They fall into two main camps. The first camp thinks that faith goes against the evidence, and that this is a bad thing; faith is harmful, and we should avoid having faith at all costs. The New Atheists, such as Richard Dawkins and Sam Harris, have a view like this (but see Jackson 2020). The second camp thinks that faith goes against the evidence but that this is actually a good thing. This view is known as fideism. Kierkegaard argued for fideism, and he thought that faith is valuable because it is absurd: “The Absurd, or to act by virtue of the absurd, is to act upon faith” (Journals, 1849). Nonetheless, Kierkegaard thought having faith is one of the highest ideals to which one can aspire. This article returns to the idea that faith “goes beyond the evidence” in Section 3.

2. Faith and Other States

This section is about the relationship between faith and related attitudes, states, or actions: belief, doubt, desire, hope, and acceptance. Unlike the features just discussed, these states are normally not part of the definition or essence of faith. Nonetheless, these states are closely associated with faith. Appreciating the ways that faith is similar to, but also different from, these states provides a deeper understanding of the nature of faith.

a. Faith and Belief

When it comes to attitudes associated with faith, many first think of belief. Believing something is taking it to be the case or regarding it as true. Belief is a propositional attitude: an attitude taken toward a statement that is either true or false.

What is the relationship between faith and belief? Since belief is propositional, it is also natural to focus on propositional faith; so what is the relationship between belief that p and faith that p? More specifically: does belief that p entail faith that p? And: does faith that p entail belief that p? The answer to the first question is no; belief does not entail propositional faith. This is because propositional faith involves a desire-like or affective component; belief does not. You might believe that there is a global pandemic or believe that your picnic was rained out. However, you do not have faith that those things are true, because you do not desire them to be true.

The second question—whether propositional faith entails belief—is significantly more controversial. Does faith that p entail belief that p? Answers to this question divide into three main views. Those who say yes normally argue that faith is a kind of belief. The no camp divides into two groups. The first group argues that faith does not have to involve belief, but it involves something belief-like. A final group argues that faith is something totally different from belief. This article considers each view in turn. (See Buchak 2017 for a very helpful, more detailed taxonomy of various views of faith and belief.)

i. Faith as a Belief

On some views, faith is a belief. Call these “doxastic” views of faith. We have discussed two doxastic views already. The first is the view that faith is simply belief in a religious proposition; it was noted that, if intended as a general theory of faith, this seems narrow, as one can have non-religious faith. (But it may be more promising as an account of religious faith.) The second view is Anscombe’s (2008) and Zagzebski’s (2012) view that faith is a belief based on testimony, discussed in the previous section on trust. A third view traces back to Augustine and Calvin, and is more recently defended by Plantinga (2000). On this view, faith is a belief that is formed through a special mental faculty known as the sensus divinitatis, or the “sense of the divine.” For example, you might watch a beautiful sunset and form the belief that there is a Creator; you might be in danger and instinctively cry out to God for help. (Plantinga is also sympathetic to views that connect faith and testimony; see Plantinga 2000: ch. 9.)

Note two things about doxastic views. First, most doxastic views add other conditions in addition to belief. For instance, as we have discussed, it is widely accepted that faith has an affective, desire-like component. So on one doxastic view, faith involves a belief that p and a desire for p. You could also add other conditions: for example, faith is associated with dispositions to act in certain ways, take certain risks, or trust certain people. What unites doxastic views is that faith is a kind of belief; faith is belief-plus.

Second, the view that faith entails belief does not require you to accept that faith is a belief. You could have a view on which faith is not a belief, but every time you have faith that a statement is true, you also believe it; faith and belief “march in step” (analogy: just because every animal with a heart also has a kidney does not mean hearts are kidneys). So another view in the family of doxastic views is that faith is not a belief, but always goes along with belief.

ii. Faith as Belief-like

Some resist the idea that faith entails belief. Daniel Howard-Snyder (2013) provides several arguments against doxastic views of faith. First, Howard-Snyder argues that if one can have faith without belief, this makes sense of the idea that faith is compatible with doubt. Doubting might cause you to give up a belief, but Howard-Snyder argues that you can maintain your faith even in the face of serious doubts. Second, other belief-like attitudes can play belief’s role: for example, you could think p is likely, be confident in p, think p is more likely than not, and so forth. If you do not flat-out believe that God exists, but are confident enough that God exists, Howard-Snyder argues that you can still have faith that God exists. A final argument that you can have faith without belief involves real-life examples of faith without belief. Consider the case of Mother Teresa. Mother Teresa went through a “dark night of the soul” in her later life. During this dark time, in her journals, she confessed that her doubts were so serious that at times, she did not believe that God existed. Nonetheless, she maintained her commitment and dedication to God. Many would not merely say she had faith; Mother Teresa was a paradigm example of a person of faith. This again supports the idea that you can have faith without belief. In general, proponents of non-doxastic views do not want to exclude those who experience severe, belief-prohibiting doubts from having religious faith. In fact, one of the functions of faith is to help you keep your commitments in the face of such doubts.

Howard-Snyder’s positive view is that faith is “weakly doxastic.” Faith does not require belief but requires a belief-like attitude, such as confidence, thinking likely, and so forth. He adds other conditions as well; in addition to a belief-like attitude, he thinks that faith that p requires a positive view of p, a positive desire-like attitude toward p, and resilience to new counterevidence against the truth of p.

In response to Howard-Snyder, Malcolm and Scott (2017) defend the view that faith entails belief. While they agree with Howard-Snyder that faith is compatible with doubt, they point out that belief is also compatible with doubt. It is not uncommon or odd to say things like “I believe my meeting is at 3 pm, but I’m not sure,” or “I believe that God exists, but I have some doubts about it.” Malcolm and Scott go on to argue that faith without belief, especially religious faith without belief, is a form of religious fictionalism. Fictionalists speak about and act on something for pragmatic reasons, but they do not believe the claims that they are acting on and speaking about. For example, you might go to church, pray, or recite a creed, but you do not believe that God exists or what the creed says; you merely do those things for practical reasons. Malcolm and Scott argue that there is something suspicious about this, and there is reason to think that fictionalists do not have genuine faith. They conclude that faith entails belief, and more specifically, religious faith requires the belief that God exists.

This debate will not be settled here, but note that there are various responses that the defender of the weakly doxastic view of faith could provide. Concerning the point about doubt, a proponent of weak doxasticism might argue that faith is compatible with more doubt than belief is. Even if belief is compatible with some doubt (it seems fine to say, “I believe p but there’s a chance I’m wrong”), faith seems compatible with even more doubt: more counterevidence or lower probabilities. On fictionalism, Howard-Snyder (2018) responds that religious fictionalism is a problem only if the fictionalist actively believes that the claims they are acting on are false. However, if they are in doubt but moderately confident, or think the claims are likely, even if they do not believe the claims, it is more plausible that fictionalists can have faith. You might also respond by appealing to some of the distinctions discussed above: for example, perhaps religious faith entails belief, but non-religious faith does not.

iii. Faith as Totally Different from Belief

A third view pulls faith even further away from belief. On this view, faith does not entail belief, nor does faith entail something belief-like, but instead, faith is totally different from belief. This view is often known as the pragmatist view of faith.

This article returns to these views later, but here is a summary. Some authors argue that faith only involves accepting, or acting as if, something is true (Swinburne 1981; Alston 1996). Others argue that faith is a disposition to act in service of an ideal (Dewey 1934; Kvanvig 2013), or that faith involves pursuing a relationship with God (Buckareff 2005). Some even argue that faith is incompatible with belief; for example, Pojman (1986) argues that faith is profound hope, and Schellenberg (2005) argues that faith is imaginative assent. Both argue that one cannot have faith that p if they believe that p.

Pragmatist views depart drastically from both doxastic and weakly doxastic accounts of faith. Faith does not even resemble belief, but is something totally unlike belief, and more closely related to action, commitment, or a disposition to act.

There are two ways to view the debate between doxastic, weakly doxastic, and pragmatic views of faith. One possibility is that there is a single thing, “faith,” and there are various views about what exactly faith amounts to: is faith a belief, similar to a belief, or not at all like belief? Another possibility, however, is that there are actually different kinds of faith. Plausibly, both doxastic and weakly doxastic views are describing attitude-focused faith, and pragmatic views of faith are describing action-focused faith. This second possibility does not mean there are not any interesting debates regarding faith. It still leaves open whether attitude-focused faith requires belief, or merely something belief-like, and if the latter, what those belief-like attitudes can be, and how weak they can be. It also leaves open which view of action-focused faith is correct. However, you may not have to choose between pragmatist views on the one hand, and doxastic or weakly doxastic views on the other; each view may simply be describing a different strand of faith.

b. Faith and Doubt

One might initially think that faith and doubt are opposed to each other. That is, those with faith will never doubt, or if they do doubt, their faith is weak. However, if you agree with the points made in the previous section—Howard-Snyder’s argument that faith is compatible with doubt; and Malcolm and Scott’s point that belief is also compatible with doubt—there is reason to reject the view that faith and doubt are completely opposed to each other.

Howard-Snyder (2013: 359) distinguishes between two ways of doubting. First, you might simply doubt p. Howard-Snyder says that this involves an inclination to disbelieve p. If you doubt that it will rain tomorrow, you will tend to disbelieve that it will rain tomorrow. This type of doubt—doubting p—might be in tension with, or even inconsistent with faith. Even those who deny that faith entails belief nonetheless think that faith is not consistent with disbelief; you cannot have faith that p if you think p is false (but see Whitaker 2019 and Lebens 2021).

However, not all doubt is closely associated with disbelief. You might instead be in doubt about p, or have some doubts about p. Moon (2018) argues that this type of doubt involves (roughly) thinking you might be wrong. In these cases, you are pulled in two directions—maybe you believe something, but then receive some counterevidence. Moon argues that this second kind of doubt is compatible with belief (2018: 1831), and Howard-Snyder argues that it is compatible with faith. Howard-Snyder says, “Being in doubt is no impediment to faith. Doubt is not faith’s enemy; rather, the enemies of faith are misevaluation, indifference or hostility, and faintheartedness” (2013: 370).

Thus, there is good reason to think that having doubts is consistent with faith. Those who deny that faith entails belief might argue that faith is compatible with more doubt than belief is. What is more, faith may be a tool that helps us maintain our commitments in light of doubts. Jackson (2019), for instance, argues that evidence can move our confidence levels around, but it does not always change our beliefs. Suppose John is happily engaged and will be married soon, and based on his own sincerity and commitment and that of his future spouse, he has faith that they will not get divorced. Then, he learns that half of all marriages end in divorce. Learning this should lower his confidence that they will remain committed, causing him to have doubts that his marriage will last. However, this counterevidence does not mean he should give up his faith or the commitment. His faith in himself and his future spouse can help him maintain the commitment, even in light of the counterevidence and resulting doubts.

c. Faith and Desire

Recall that attitude-focused faith involves a desire for, or a positive evaluation of, the object of faith. If you have faith that your friend will win her upcoming race, then you want her to win; it does not make sense to claim to have faith she will win if you hope she will lose. Similarly, you would not have faith that your best friend has cancer, or that your father will continue smoking. A large majority of the authors writing on the philosophy of faith maintain that faith involves a positive evaluation of its object (Audi 2011: 67; Howard-Snyder 2013: 362–3). Even action-focused faith may involve desire. While it is more closely identified with actions, rather than attitudes, it could still involve or be associated with desires or pro-attitudes.

Malcolm and Scott (2021) challenge the orthodox view that faith entails desire or positivity. They argue that, while faith might often involve desire, the connection is not seamless. For example, you might have faith that the devil exists or faith that hell is populated—not because you want these to be true, but because these doctrines are a part of your religious commitment. You might find these doctrines confusing and difficult to swallow, and even hope that they are false, but you trust that God has a plan or reason to allow these to be true. Malcolm and Scott argue that faith in such cases does not involve positivity toward its object—and in fact, it may involve negativity.

Furthermore, crises of faith can involve the loss of desire for the object of faith. There has been much talk about how faith that p can be resilient in light of counterevidence: evidence that p is false. But what about evidence that p would be a bad thing? One might question their religious commitment, say, not because they doubt God’s existence, but because they doubt that God’s existence would be a good thing, or that God is worth committing to (see Jackson 2021). Malcolm and Scott argue that if one can maintain faith through a crisis of faith, this provides another reason to think that faith may not always involve positivity.

Note that more attention has been paid to the specifics of faith’s belief-like component than faith’s desire-like component. Many authors mention the positivity of faith, motivate it with a few examples, and then move on to other topics. But many similar questions that arise regarding faith and belief could also be raised regarding faith and desire. For example: does faith that p entail a desire for p? What if someone has something weaker than a desire, such as a second-order desire (a desire to desire p)? Or some desire for p, but also some desire for not-p? Could these people have faith? Can other attitudes play the role of desire in faith, such as a belief that p is good?

If you are willing to weaken the relationship between faith and desire, you could agree with Malcolm and Scott that the idea that faith entails desire is too strong, but nonetheless accept that a version of the positivity view is correct. Similar to a weakly doxastic account of faith, you could have a weakly positive account of faith and desire: faith’s desire-like condition could include things like second-order desires, conflicting desires, pro-attitudes, or beliefs about the good. In a crisis of faith, the faithful may have second-order desires or some weaker desire-like attitude. The prospect of weakly positive accounts of faith should be further explored. And in general, more attention should be paid to the relationship between faith and desire. In the religious case, this connection is related to the axiology of theism, the question of whether we should want God to exist (see The Axiology of Theism).

d. Faith and Hope

Faith and hope are often considered alongside each other, and for good reason. Like faith, hope also has a desire-like component and a belief-like component. The desire-like component in both attitudes is similar—whether you have faith that your friend will win their game or hope that they will win their game, you want them to win the game.

However, hope’s belief-like component is arguably weaker than faith’s. Hope that a statement is true merely requires thinking that statement is possibly true; it can be extremely unlikely. Even if there is a 95% chance of rain tomorrow, you can still hope your picnic will not be rained out. Hope’s belief-like component could be one of two things: a belief that p is possible, or a non-zero credence in p. (Credence is a measure of subjective probability: the confidence you have in the truth of some proposition. Credences are measured on a scale from 0 to 1, where 0 represents certainty that a proposition is false, and 1 represents certainty that it is true.) So if you hope that p, you cannot believe p is impossible or have a credence of 0 in p (certainty that p is false). At the same time, it seems odd to hope for things of which you are certain. You do not hope that 1+1=2 or hope that you exist, even if you desire those to be true. Thus, as Martin (2013: 69) notes, hope that p may be consistent with any credence in p strictly between 0 and 1.

Thus, on the standard view of hope, hope consists of two things: a desire for p to be true and a belief that p is possible (or non-zero credence). (See Milona 2019 for a recent defense of the standard view. Some argue that hope has additional components; for details of recent accounts of hope, see Rioux 2021.) Contrast this with faith. Unlike hope, faith that a statement is true is not compatible with thinking the statement is extremely unlikely or almost definitely false. If there is a 95% chance of rain tomorrow, you should not—and most would not—have faith that it will be sunny tomorrow. The chance of rain is just too high. But this does not preclude hoping that it will be sunny. Thus, you can hope that something is true when it is so unlikely that you cannot have faith.
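The contrast between hope and faith drawn above can be stated compactly. The notation here is an assumption introduced only for illustration: Cr(p) stands for one's credence in p, Des(p) for a desire that p be true, and t for some unspecified threshold above 0 that the text does not fix. Both lines give necessary conditions only, not full analyses of hope or faith.

```latex
\[
\text{Hope}(p) \;\Rightarrow\; \text{Des}(p) \,\wedge\, 0 < \text{Cr}(p) < 1
\]
\[
\text{Faith}(p) \;\Rightarrow\; \text{Cr}(p) > t \quad \text{for some } t > 0
\]
```

On this sketch, a credence of 0.05 that it will be sunny satisfies the condition for hope but, for any threshold t at or above 0.05, fails the condition for faith.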

This carves out a unique role for hope. Sometimes, after you make a commitment, you get lots of counterevidence challenging your basis for that commitment—counterevidence so strong that you must give up your faith. However, simply because you have to give up your faith does not mean you have to give up hope. You might hope your missing sibling is alive, even in light of evidence that they are dead, or hope that you will survive a concentration camp, or hope that you can endure a risky treatment for a serious illness. And resorting to hope does not always mean you should give up your commitment. Hope can, in general, underlie our commitments when we do not have enough evidence to have faith (see Jackson 2021).

While faith and hope are distinct in certain ways, Pojman (1986) argues that faith is a certain type of hope: profound hope. Pojman is not interested in casual hope, like the hope that your distant cousin will get the job he applied for, but is focused on the hope that is deep and central to our life projects. In addition to the two components of hope discussed above, profound hope also involves a disposition to act on p, an especially strong desire for p to be true, and a willingness to take great risks to bring p about. Pojman’s view draws on a connection between attitude-focused faith and action-focused faith, as Pojman’s account gives a central role to risky action. Those convinced by the idea that faith requires a bit more evidence than hope may also want to add a condition to Pojman’s view: the belief-like component of faith-as-hope must be sufficiently strong, as faith might require more than merely taking something to be possible.

e. Faith and Acceptance

Accepting that p is acting as if p. When you accept a proposition, you treat it as true in your practical reasoning, and when you make decisions, act as if p were true. According to Jonathan Cohen (1992: 4), when one accepts a proposition, one “includes that proposition… among one’s premises for deciding what to do or think in a particular context.” Often, we accept what we believe and believe what we accept. You believe coffee will wake you up, so you drink it when you are tired in the morning. You believe your car is parked north of campus, so you walk that way when you leave the office.

Sometimes, however, you act as if something is true even though you do not believe it. Say you are a judge in a court case, and the evidence is enough to legally establish that a particular suspect did it “beyond a reasonable doubt.” Suppose, though, you have other evidence that they are innocent, but it is personal, such that it cannot legally be used in a court of law. You may not be justified in believing they are guilty, but for legal reasons, you must accept that they are guilty and issue the “guilty” verdict. In other cases, you believe something, but do not act as if it is true. Suppose you are visiting a frozen lake with your young children, and they want to go play on the ice. You may rationally believe the ice is thick and safe, but refuse to let your children play, accepting that the ice will break, because of how bad it would be if they fell in.

Several authors have argued that faith and acceptance are closely connected. Alston (1996) argues that acceptance, rather than belief, is one of the primary components of faith. That is, those with faith may or may not believe the propositions of faith, but they act as if they are true. A similar view is Swinburne’s pragmatist faith. On Swinburne’s (1981) view, faith is acting on the assumption that p. Like Alston, Swinburne also maintains that faith does not require belief. Schellenberg’s (2005) view also gives acceptance a prominent place in faith. On Schellenberg’s view, faith is imaginative assent. If you have faith that p, you deliberately imagine p to be true, and, guided by this imaginative picture, you act on the truth of p. So Schellenberg’s picture of faith is imaginative assent plus acceptance. While these authors argue that acceptance is necessary for faith, most do not think it is sufficient; the faithful fulfill other conditions, including a pro-attitude towards the object of faith.

A final view is that faith involves a kind of allegiance. Allegiance is an action-oriented submission to a person or ideal. Dewey (1934) and Kvanvig (2013) defend the allegiance view of faith, on which the faithful are more characterized by their actions than their attitudes. The faithful are marked by their loyalty and committed action to the object of faith; in many cases, this could look like accepting certain propositions of faith, even if one does not believe them. Bates (2017) also proposes a model of Christian faith as allegiance, but for Bates, faith requires both a kind of intellectual assent (something belief-like) and allegiance, or enacted loyalty and obedience to God.

Whether these views that give acceptance or action a central role in faith are weakly doxastic or pragmatic depends on one’s view of acceptance: is acceptance a belief-like state or an action-like state? Since acceptance is acting as if something is true, and you can accept a proposition even if you think it is quite unlikely, in my opinion, these views are better characterized as pragmatic. However, some acceptance views, like Bates’, that involve both acceptance and something belief-like, may be doxastic or weakly doxastic.

3. Evaluating Faith

Thus far, this article has focused on the nature of faith. Section 1 covered types of faith and features of faith. Section 2 covered the way faith compares and contrasts with other related attitudes and actions. This final section is about evaluating faith. This section discusses three modes of evaluation: epistemic, practical, and moral.

Note that, like other attitudes and actions, faith is sometimes rational and sometimes irrational, sometimes permissible and sometimes impermissible. In the same way that beliefs can be rational or irrational, faith can be rational or irrational. Not all faith should be evaluated in the same way. The rationality of faith depends on several factors, including the nature of faith and the object of faith. Drawing on some of the above accounts of the nature of faith, this article discusses various answers to the question of why and when faith could be rational, and why and when faith could be irrational.

a. Faith’s Epistemic Rationality

Our first question is whether faith can be epistemically rational, and if so, when and how. Epistemic rationality is rationality that is aimed at getting at the truth and avoiding error, and it is associated with justified belief and knowledge. An epistemically rational belief has characteristics like being based on evidence, being reliably formed, being a candidate for knowledge, and being the result of a dependable process of inquiry. Paradigm examples of beliefs that are not epistemically rational are those based on wishful thinking, hasty generalizations, or emotional attachment.

Epistemic rationality is normally applied to attitudes, like beliefs, so faith’s epistemic rationality primarily concerns faith as a mental state. This article also focuses on propositional faith, and it divides the discussion of faith’s epistemic rationality into two parts: evidence and knowledge.

i. Faith and Evidence

Before discussing faith, it might help to discuss the relationship between evidence and epistemic rationality. It is widely thought that epistemically rational people follow the evidence. While the exact relationship between evidence and epistemic rationality is controversial, many endorse what is called evidentialism, the view that you are epistemically rational if and only if you proportion your beliefs to the evidence.

We have seen that faith is resilient: it helps us keep our commitments in the face of counterevidence. Given faith’s resilience, it is natural to think that faith goes beyond the evidence (or involves a disposition to go beyond the evidence). But would having faith not then violate evidentialism? Can faith both be perfectly proportioned to the evidence and go beyond the evidence? Answers to these questions fall into three main camps, taking different perspectives on faith, evidence, and evidentialism.

The first camp, mentioned previously, maintains that faith violates evidentialism because it goes beyond the evidence; but evidentialism is a requirement of rationality; thus, faith is irrational. Fideists and the New Atheists may represent such a view. However, you might think that the idea that all faith is always irrational is too strong, and that, instead, faith is more like belief: sometimes rational and sometimes irrational. Those that think faith can be rational fall into two camps.

The first of these, the second camp overall, holds that rational faith does not violate evidentialism and that there are ways to capture faith’s resilience that respect evidentialism. For example, consider Anscombe’s and Zagzebski’s view that faith is believing another’s testimony. On this view, faith is based on evidence, and rational faith is proportioned to the evidence: testimonial evidence. Of course, this assumes that testimony is evidence, but this is highly plausible: many of our geographical, scientific, and even everyday beliefs are based on testimony. Most of our scientific beliefs are not based on experiments we did ourselves; they are based on results reported by scientists. We trust their testimony. We believe geographical facts about the shape of the globe and things about other countries even though we have never traveled there ourselves, again based on testimony. We ask people for directions on the street and believe our family and friends when they report things to us. Testimony is an extremely important source of evidence, and without it, we would be in the dark about a lot of things.

In what sense does faith go beyond the evidence, on this view? Well, sometimes, we have only testimony to go on. We may not have the time or ability to verify what someone tells us without outside sources, and we may be torn about whether to trust someone. In choosing to take someone’s word for something, we go beyond the evidence. At the very least, we go beyond certain kinds of evidence, in that we do not require outside verifying evidence. One worry for this view, however, is that faith is straightforwardly based on evidence, and thus it cannot sufficiently explain faith’s resilience, or how faith goes beyond the evidence.

A second view on which rational faith goes beyond the evidence without violating evidentialism draws on a view in epistemology known as epistemic permissivism: the view that sometimes, the evidence allows for multiple different rational attitudes toward a proposition. In permissive cases, where your evidence does not point you one way or another, there is an evidential tie between two attitudes. You can then choose to hold the faithful attitude, consistent with, but not required by, your evidence. This does not violate evidentialism, as the faithful attitude is permitted by, and in that sense fits, your evidence. At the same time, faith goes beyond the evidence in the sense that the faithful attitude is not strictly required by your evidence.

Consider two concrete examples. First, suppose your brother is accused of a serious crime. Suppose that there are several good, competing explanations of what happened. It might be rational for you to withhold belief, or even believe your brother is guilty, but you could instead choose the explanation of the evidence that supports your brother’s innocence. This demonstrates faith that your brother is innocent without violating the evidence, since believing that he is innocent is a rational response to the data.

Or suppose you are trying to decide whether God exists. The evidence for (a)theism is complicated and difficult to assess, and there are good arguments on both sides. Suppose, because the evidence is complicated in this way, you could be rational as a theist (who believes God exists), atheist (who believes God does not exist), or agnostic (who is undecided on whether God exists). Say you go out on a limb and decide to have faith that God exists. You are going beyond the evidence, but you are also not irrational, since your evidence rationally permits you to be a theist. Again, this is a case where rational faith respects evidentialism, but also goes beyond the evidence. (Note that, depending on how evidentialism is defined, this response may better fit under the third view, discussed next. Some strong versions of evidentialism are inconsistent with permissivism, and on some versions of the permissivist theory of faith, non-evidential factors can break evidential ties, so things besides evidence affect rational belief.) Attempts to reconcile faith’s resilience with evidentialism include, for example, Jackson (2019) and Dormandy (2021).

The third and final camp holds the view that faith, in going beyond the evidence, violates evidentialism, but this does not mean that faith is irrational. (James 1896/2011 and Bishop 2007 may well be characterized as proponents of this view, as they explicitly reject Clifford’s evidentialism.) For example, you might maintain that evidentialism applies to belief, but not faith. After all, it is natural to think that faith goes beyond the evidence in a way that belief does not. To maintain evidentialism about belief, proponents of this view would need to say that rational faith is inconsistent with belief. Then, faith might be subject to different, non-evidentialist norms, but could still be rational and go beyond the evidence.

A second family of views that reject evidentialism but maintain faith’s rationality consists of externalist views. Externalists maintain that epistemic justification depends on factors that are external to the person. For example, your belief that there is a cup on the desk can be rational if it is formed by a reliable perceptual process, whether or not you have evidence that there is a cup. Plantinga in particular is an externalist who thinks epistemic justification (or “warrant”) is a matter of functioning properly. Plantinga (2000) argues that religious beliefs can be properly basic: rational even if not based on an argument. Plantinga’s view involves the sensus divinitatis: a sense of the divine that, when functioning properly, causes people to form beliefs about God (for example, “There is a Creator”; “God exists”; “God can help me”) especially in particular circumstances (for example, in nature, when in need of help, and so forth). These beliefs can be rational, even if not based on argument, and may be rational without any evidence at all.

That said, the view that religious belief can be properly basic does not, by itself, conflict with evidentialism. If a religious belief is based on experiential evidence, but not arguments, it can still be rational according to an evidentialist. Externalist views that deny evidentialism make a stronger claim: that religious belief can be rational without argument or evidence (see Plantinga 2000: 178).

Externalist views, at least ones that reject evidentialism, may be able to explain how rational faith goes beyond the evidence; evidence is not required for faith (or belief) to be epistemically rational. Even so, most externalist views include a no-defeater condition: if you get evidence that a belief is false (a defeater), that can affect, or even preclude, your epistemic justification. For example, you might form a warranted belief in God based on the sensus divinitatis but then begin to question why a loving, powerful God would allow the world’s serious and seemingly pointless evils; this counterevidence could remove the warrant for your belief in God. Generally, externalist views may need a story about how faith can be resilient in the face of counterevidence to fully capture the idea that faith goes beyond the evidence.

We have seen three views about the relationship between faith, evidence, and evidentialism. On the first view, evidentialism is true, and faith does not respect evidentialism, so faith is irrational. On the second, evidentialism is true, and rational faith goes beyond the evidence in a way that respects evidentialism. On the final view, evidentialism is false, so faith does not have to be based on evidence; this makes space for rational faith to go beyond the evidence. Now, we turn to a second topic concerning the epistemology of faith: faith and knowledge.

ii. Faith and Knowledge

Epistemology is the study of knowledge. Epistemologists mostly focus on propositional knowledge: knowledge that a proposition is true. For example, you might know that 1+1=2 or that it is cold today. Knowledge involves at least three components: justification, truth, and belief. If you know that it is cold today, you believe that it is cold today, it is indeed cold today, and your belief that it is cold today is epistemically justified. (While these three components are necessary for knowledge, many think they are not sufficient, due to Gettier’s (1963) famous counterexamples to the justified true belief account of knowledge.) Note that knowledge is a high epistemic ideal. When a belief amounts to knowledge, it is not merely justified, but it is also true. Many epistemologists also think that knowledge requires a high degree of justification, for example, quite good evidence.

There are three main views about the relationship between faith and knowledge. The first is that propositional faith is a kind of knowledge. Plantinga’s view lends itself to a view of faith along these lines, as Plantinga’s story about proper function is ultimately an account of knowledge. Plantinga’s view is inspired by Calvin, who defines faith as a “firm and certain” knowledge of God (Institutes III, ii, 7:551). If Plantinga is right that (undefeated) theistic beliefs, formed reliably by properly functioning faculties in the right conditions, amount to knowledge, then Plantinga’s view might be rightfully characterized as one on which faith is (closely tied to) knowledge. Relatedly, Aquinas discusses a kind of faith that resembles knowledge, but is ultimately “midway between knowledge and opinion” (Summa Theologica 2a2ae 1:2).

On a second view, propositional faith is not a kind of knowledge, but can amount to knowledge in certain circumstances. For example, one might hold that faith may be consistent with less evidence or justification than is required for knowledge, or that faith does not require belief. Thus, one could have faith that p—even rationally—even if one does not know that p. Keep in mind that knowledge is a high epistemic bar, so meeting this bar for knowledge may not be required for faith to be rational—faith that p might be rational even if, for example, p is false, so p is not known. However, faith that p may amount to knowledge when it meets the conditions for knowledge: p is justifiedly believed, true, and not Gettiered.

On a final view, faith that p is inconsistent with knowing p. For example, Howard-Snyder (2013: 370) suggests that for faith, one’s evidence is often “sub-optimal.” Along similar lines, Alston (1996: 12) notes that “[F]aith-that has at least a strong suggestion of a weak epistemic position vis-a-vis the proposition in question.” Since knowledge sets a high epistemic bar (the proposition in question must enjoy a high degree of justification, be true, and so forth), faith may play a role when your epistemic position is too poor to know. And if you know p, faith that p is not needed. This fits well with Kant’s famous remarks: “I have… found it necessary to deny knowledge, in order to make room for faith” (Preface to the Second Edition of the Critique of Pure Reason, 1787/1933: 29). On this third view, then, if you have faith that p, you do not know p, and if you know p, faith that p is unnecessary.

As noted, many epistemologists focus on knowledge-that: knowing that a proposition is true. However, there are other kinds of knowledge: knowledge-how, or knowing how to perform some action, such as riding a bike, and knowledge-who, or knowing someone personally. There has been some interesting work on non-propositional knowledge and faith: see Sliwa (2018) for knowledge-how, and Benton (2018) for knowledge-who. Note that non-propositional knowledge might better fit with non-propositional faith, such as faith-in. This raises several interesting questions, such as: does faith in God require interpersonal knowledge of God? And how does this relate to the belief that God exists? The relationship between non-propositional knowledge and faith merits further exploration.

b. Faith’s Practical Rationality

A second question is whether faith can be practically rational, and if so, when and how. Practical rationality, unlike epistemic rationality, is associated with what is good for you: what fulfills your desires and leads to your flourishing. Examples of practically rational actions include brushing your teeth, saving for retirement, pursuing your dream job, and other things conducive to meeting your goals and improving your life (although see Ballard 2017 for an argument that faith’s practical and epistemic rationality are importantly connected).

Practical rationality is normally applied to actions. Thus, it makes the most sense to evaluate action-focused faith for practical rationality. In particular, acceptance, or acting as if a proposition is true, is often associated with action-focused faith. Thus, this article focuses on what makes accepting a proposition of faith practically rational, and whether leaps of faith can be practically rational but go beyond the evidence.

Elizabeth Jackson’s (2021) view of faith focuses on how acceptance-based faith can be practically rational in light of counterevidence. Jackson notes that, on two major theories of rational action (the belief-desire view and the decision-theory view), rational action is caused by two things: beliefs and desires. If it is rational for you to go to the fridge, this is because you want food (a desire) and you believe there is food in the fridge (a belief). But you can believe and desire things to a greater or lesser degree; you might rationally act on something because you have a strong desire for it, even though you consider it unlikely. Suppose your brother goes missing. He has been missing for a long time, and there is a lot of evidence he is dead, but you think there is some chance he might be alive. Because it would be so good if he were alive and you found him, you have action-focused faith that he is alive: you put up missing posters, spend lots of time searching for him, and so forth. The goodness of finding him again makes this rational, despite your counterevidence. Or consider another example: you might rationally accept that God exists, by practicing a religion, participating in prayer and liturgy, and joining a spiritual community, even if you have strong evidence against theism. This is because you have a lot to gain if you accept that God exists and God does exist, and not much to lose if God does not exist.
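The decision-theoretic reasoning in the missing-brother case can be made concrete with a small expected-utility calculation. The following is a minimal sketch, not Jackson's own formalism: the function names, the payoff values, and the credence of 0.05 are all hypothetical, chosen only to show how a strong desire can outweigh a low credence.

```python
def expected_utility(credence, value_if_true, value_if_false):
    """Expected value of an action, weighting each outcome by credence in p."""
    return credence * value_if_true + (1.0 - credence) * value_if_false


def rational_to_accept(credence, accept_payoffs, reject_payoffs):
    """True if acting as if p has higher expected utility than not doing so.

    Each payoff pair is (value if p is true, value if p is false).
    """
    eu_accept = expected_utility(credence, *accept_payoffs)
    eu_reject = expected_utility(credence, *reject_payoffs)
    return eu_accept > eu_reject


# Hypothetical stakes for the missing-brother case (illustrative numbers only).
SEARCH = (1000.0, -10.0)   # found alive / time and effort spent in vain
GIVE_UP = (-500.0, 0.0)    # alive but never found / nothing further lost

# Low but non-zero credence that the brother is alive.
low_credence_verdict = rational_to_accept(0.05, SEARCH, GIVE_UP)
# Certainty that he is dead (credence 0).
certain_dead_verdict = rational_to_accept(0.0, SEARCH, GIVE_UP)
print(low_credence_verdict, certain_dead_verdict)
```

Under these assumed numbers, searching has the higher expected utility at a credence of 0.05, but not once the credence drops to 0. The particular values carry no weight; the point is only that the verdict depends jointly on credence and on what is at stake.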

Arguably, then, it is even easier for practically rational faith to go beyond the evidence than it is for epistemically rational faith. Performing an act of faith might be practically rational even if one has little evidence for the proposition one is accepting. Practically rational action depends on both your evidence and also what is at stake, and it can be rational to act as if something is true even if your evidence points the other way. In this way, practically rational faith can be resilient in light of counterevidence: what you lose in evidence can be made up for in desire.

Of course, this does not mean that faith is always practically rational. Both your beliefs/evidence and your desires/what is good for you can render faith practically irrational. For example, if you became certain your brother was dead (perhaps his body was found), then acting as if your brother is still alive would be practically irrational. Similarly, faith could be practically irrational if its object is not good for your flourishing: for example, faith that you will get back together with an abusive partner.

However, since it can be rational to accept that something is true even if you have overwhelming evidence that it is false, practically rational acts of faith go beyond (and even against) the evidence. For other related decision-theoretic accounts of how practically rational faith can go beyond the evidence, see Buchak (2012) and McKaughan (2013).
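
The decision-theoretic reasoning above can be made concrete with a small expected-utility calculation for the missing-brother case. The following is only a minimal sketch, not Jackson’s own formalism: the probabilities and payoffs are invented for illustration, and the function names are hypothetical.

```python
# Illustrative sketch only: all numbers are made-up assumptions,
# not values taken from Jackson (2021) or any other source.

def expected_utility(p_true, u_if_true, u_if_false):
    """Expected utility of an act, given the probability that the
    proposition is true and the act's payoff in each case."""
    return p_true * u_if_true + (1 - p_true) * u_if_false


def faith_is_practically_rational(p_alive, u_search, u_give_up):
    """Compare acting as if the brother is alive (searching) with giving up.

    u_search and u_give_up are (payoff if alive, payoff if dead) pairs.
    """
    eu_search = expected_utility(p_alive, *u_search)
    eu_give_up = expected_utility(p_alive, *u_give_up)
    return eu_search > eu_give_up


# Hypothetical stakes: finding him is very good; searching in vain is costly;
# giving up while he is alive is very bad; giving up when he is dead is neutral.
U_SEARCH = (1000, -10)
U_GIVE_UP = (-500, 0)

# Strong counterevidence (a 5% chance he is alive): searching still wins.
low_evidence = faith_is_practically_rational(0.05, U_SEARCH, U_GIVE_UP)

# Certainty that he is dead: searching is now the worse option.
certain_dead = faith_is_practically_rational(0.0, U_SEARCH, U_GIVE_UP)

print(low_evidence, certain_dead)
```

With these assumed numbers, high stakes outweigh a low probability, while certainty that the brother is dead makes the same act irrational. Different payoffs would, of course, shift the point at which faith stops being practically rational.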

c. Faith and Morality/Virtue

The third and final way to evaluate faith is from a moral perspective. There is a family of questions regarding the ethics of faith: Is faith morally permissible, and if so, when? Is faith ever morally obligatory? Is it appropriate to regard faith as a virtue? Can faith be immoral?

We normally ask what actions, rather than what mental states, are obligatory/permissible/wrong. While virtues are not themselves actions, they are (or lead to) dispositions to act. In either case, it makes sense to morally evaluate action-focused faith. (Some, however, argue for doxastic wronging, that is, the view that beliefs can morally wrong others. If they can, this suggests that beliefs, and perhaps other mental states, can be morally evaluated. This may open up space to morally evaluate attitude-focused faith as well.)

As with the epistemic and practical case, it would be wrong to think that all cases of faith fit into one moral category. Faith is not always moral: faith in an evil cause or evil person can be immoral. But faith is not always immoral, and may sometimes be morally good: faith in one’s close friends or family members, or faith in causes like world peace or ending world hunger seem morally permissible, if not even morally obligatory.

One of the most widely discussed topics on the ethics of faith is faith as a virtue (see Aquinas, Summa Theologiae II-II, q. 1-16). Faith is often taken to be both a virtue in general and a theological virtue (in the Christian tradition, along with hope and charity). For reasons just discussed, the idea that faith is a virtue by definition seems incorrect. Faith is not always morally good: it is possible to have faith in morally bad people or causes, and to have faith with morally bad effects. (This is why the discussion of faith as a virtue belongs in this section, rather than in previous sections on the nature of faith.)

This raises the question: Can faith satisfy the conditions for virtue? According to Aristotle, a virtue is a positive character trait that is demonstrated consistently, across situations and across time. Virtues are acquired freely and deliberately and bring benefits to both the virtuous person and to their community. For example, if you have the virtue of honesty, you will be honest in various situations and also over time; you will have acquired honesty freely and deliberately (not by accident), and your honesty will bring benefits both to yourself and those in your community. Thus, assuming this orthodox Aristotelian definition of virtue, when faith is a virtue, it is a stable character trait, acquired freely and deliberately, that brings benefits to both the faithful person and their community.

There have been several discussions of the virtue of faith in the literature. Anne Jeffrey (2017-a) argues that there is a tension between common assumptions about faith and Aristotelian virtue ethics. Specifically, some have argued that part of faith’s function depends on a limitation or an imperfection in the faithful person (for example, keeping us steadfast and committed in light of doubts or misguided affections). However, according to the Aristotelian view, virtues are traits held by fully virtuous people who have perfect practical knowledge and always choose the virtuous action. Taken together, these two views create a challenge for the idea that faith is a virtue, as faith seems to require imperfections or limitations incompatible with virtue. While this tension could be resolved by challenging the idea that faith’s role necessarily involves a limitation, Jeffrey instead argues that we should re-conceive Aristotelian virtue ethics and embrace the idea that even people with limitations can possess and exercise virtues. In another paper, Jeffrey (2017-b) argues that we can secure the practical rationality and moral permissibility of religious faith—which seems necessary if faith is a virtue—by appealing to the idea that faith is accompanied by another virtue, hope.

There is a second reason to think that the theological virtues (faith, hope, and charity) may not perfectly fit into the Aristotelian mold. While Aristotelian virtues are freely acquired by habituation, some thinkers suggest that theological virtues are infused immediately by God, rather than acquired over time (Aquinas, Summa Theologiae II-II, q. 6). While some may conclude from this that faith, like the other theological virtues, is not a true virtue, this may further support Jeffrey’s suggestion that Aristotle’s criteria for virtue may need to be altered or reconceived. Or perhaps there are two kinds of virtues: Aristotelian acquired virtues and theological infused virtues, each with its own characteristics.

A final topic that has been explored is the question of how virtuous faith interacts with other virtues. The relationship between faith and humility is widely discussed. Several authors have noted that prima facie, faith seems to be in tension with humility: faith involves taking various risks (both epistemic and action-focused risks), but in some cases, those risks may be a sign of overconfidence, which can be in tension with exhibiting humility (intellectual or otherwise). In response to this, both Kvanvig (2018) and Malcolm (2021) argue that faith and humility are two virtues that balance each other out. Kvanvig argues that humility is a matter of where your attention is directed (say, not at yourself), and this appropriately directed attention can guide faithful action. Malcolm argues that religious faith can be understood as a kind of trust in God—specifically, a reliance on God’s testimony, which, when virtuous, exhibits a kind of intellectual humility.

4. Conclusion

Faith is a trusting commitment to someone or something. There are at least four distinctions among kinds of faith: attitude-focused faith vs. action-focused faith, faith-that vs. faith-in, religious vs. non-religious faith, and important vs. mundane faith (Section 1.a). Trust, risk, resilience, and going beyond the evidence are all closely associated with faith (Section 1.b). Considering faith’s relationship to attitudes, states, or actions (belief, doubt, desire, hope, and acceptance) sheds further light on the nature of faith (Section 2). There are three main ways we might evaluate faith: epistemically, practically, and morally. While faith is not always epistemically rational, practically rational, or morally permissible, we have seen reason to think that faith can be positively evaluated in many cases (Section 3).

5. References and Further Reading

  • Ali, Zain. (2013). Faith, Philosophy, and the Reflective Muslim. London, UK: Palgrave Macmillan.
  • Alston, William. (1996). “Belief, Acceptance, and Religious Faith.” In J. Jordan and D. Howard-Snyder (eds.), Faith, Freedom, and Rationality pp. 3–27. Lanham, MD: Rowman and Littlefield.
  • Anscombe, G. E. M. (2008). “Faith.” In M. Geach and L. Gormally (eds.), Faith in a Hard Ground. Exeter: Imprint Academic, 11–19.
  • Audi, Robert. (2011). Rationality and Religious Commitment. New York: Oxford University Press.
  • Ballard, Brian. (2017). “The Rationality of Faith and the Benefits of Religion.” International Journal for Philosophy of Religion 81: 213–227.
  • Bates, Matthew. (2017). Salvation by Allegiance Alone. Grand Rapids: Baker Academic.
  • Benton, Matthew. (2018). “God and Interpersonal Knowledge.” Res Philosophica 95(3): 421–447.
  • Bishop, John. (2007). Believing by Faith: An Essay in the Epistemology and Ethics of Religious Belief. Oxford: OUP.
  • Bishop, John. (2016). “Faith.” Stanford Encyclopedia of Philosophy. Edward N. Zalta (ed.) https://plato.stanford.edu/entries/faith/
  • Buchak, Lara. (2012). “Can it Be Rational to Have Faith?” In Jake Chandler & Victoria Harrison (eds.), Probability in the Philosophy of Religion, pp. 225–247. Oxford: Oxford University Press.
  • Buchak, Lara. (2017). “Reason and Faith.” In The Oxford Handbook of the Epistemology of Theology (edited by William J. Abraham and Frederick D. Aquino), pp. 46–63. Oxford: OUP.
  • Buckareff, Andrei A. (2005). “Can Faith Be a Doxastic Venture?” Religious Studies 41: 435–45.
  • Byerly, T. R. (2012). “Faith as an Epistemic Disposition.” European Journal for Philosophy of Religion, 4(1): 109–128.
  • Cohen, Jonathan. (1992). An Essay on Belief and Acceptance. New York: Clarendon Press.
  • Dewey, John (1934). A Common Faith. New Haven, CT: Yale University Press.
  • Dormandy, Katherine. (2021). “True Faith: Against Doxastic Partiality about Faith (in God and Religious Communities) and in Defense of Evidentialism.” Australasian Philosophical Review 5(1): 4–28.
  • Gettier, Edmund. (1963). “Is Justified True Belief Knowledge?” Analysis 23(6): 121–123.
  • Howard-Snyder, Daniel. (2013). “Propositional Faith: What it is and What it is Not.” American Philosophical Quarterly 50(4): 357–372.
  • Howard-Snyder, Daniel. (2018). “Can Fictionalists Have Faith? It All Depends.” Religious Studies 55: 1–22.
  • Jackson, Elizabeth. (2019). “Belief, Credence, and Faith.” Religious Studies 55(2): 153–168.
  • Jackson, Elizabeth. (2020). “The Nature and Rationality of Faith.” A New Theist Response to the New Atheists (Joshua Rasmussen and Kevin Vallier, eds.), pp. 77–92. New York: Routledge.
  • Jackson, Elizabeth. (2021). “Belief, Faith, and Hope: On the Rationality of Long-Term Commitment.” Mind. 130(517): 35–57.
  • Jeffrey, Anne. (2017-a). “How Aristotelians Can Make Faith a Virtue.” Ethical Theory and Moral Practice 20(2): 393–409.
  • Jeffrey, Anne. (2017-b). “Does Hope Morally Vindicate Faith?” International Journal for Philosophy of Religion 81(1-2): 193–211.
  • James, William. (1896/2011). “The Will to Believe.” In J. Shook (ed.) The Essential William James, pp. 157–178. New York: Prometheus Books.
  • Kvanvig, Jonathan. (2018). Faith and Humility. Oxford: OUP.
  • Kvanvig, Jonathan. (2013). “Affective Theism and People of Faith.” Midwest Studies in Philosophy 37: 109–28.
  • Lebens, S. (2021). “Will I Get a Job? Contextualism, Belief, and Faith.” Synthese 199(3-4): 5769–5790.
  • Malcolm, Finlay. (2021). “Testimony, Faith, and Humility.” Religious Studies 57(3): 466–483.
  • Malcolm, Finlay and Michael Scott. (2017). “Faith, Belief, and Fictionalism.” Pacific Philosophical Quarterly 98(1): 257–274.
  • Malcolm, Finlay and Michael Scott. (2021). “True Grit and the Positivity of Faith.” European Journal of Analytic Philosophy 17(1): 5–32.
  • Martin, Adrienne M. (2013). How We Hope: A Moral Psychology. Princeton: Princeton University Press.
  • Matheson, Jonathan. (2018). “Gritty Faith.” American Catholic Philosophical Quarterly 92(3): 499–513.
  • McKaughan, Daniel. (2013). “Authentic Faith and Acknowledged Risk: Dissolving the Problem of Faith and Reason.” Religious Studies 49: 101–124.
  • Milona, Michael. (2019). “Finding Hope.” The Canadian Journal of Philosophy 49(5): 710–729.
  • Moon, Andrew. (2018). “The Nature of Doubt and a New Puzzle about Belief, Doubt, and Confidence.” Synthese 195(4): 1827–1848.
  • Paul, Sarah K., and Jennifer M. Morton. (2019). “Grit.” Ethics 129: 175–203.
  • Plantinga, Alvin (2000). Warranted Christian Belief. New York: Oxford University Press.
  • Pojman, Louis. (1986). “Faith Without Belief?” Faith and Philosophy 3(2): 157–176.
  • Rettler, Brad. (2018). “Analysis of Faith.” Philosophy Compass 13(9): 1–10.
  • Rioux, Catherine. (2021). “Hope: Conceptual and Normative Issues.” Philosophy Compass 16(3): 1–11.
  • Schellenberg, J.L. (2005). Prolegomena to a Philosophy of Religion. Ithaca: Cornell University Press.
  • Sliwa, Paulina. (2018). “Know-How and Acts of Faith.” In Matthew A. Benton, John Hawthorne & Dani Rabinowitz (eds.), Knowledge, Belief, and God: New Insights in Religious Epistemology, pp. 246–263. Oxford: Oxford University Press.
  • Speak, Daniel. (2007). “Salvation Without Belief.” Religious Studies 43(2): 229–236.
  • Swinburne, Richard. (1981). “The Nature of Faith.” In R. Swinburne, Faith and Reason, pp. 104–24. Oxford: Clarendon Press.
  • Swindal, James. (2021). “Faith: Historical Perspectives.” Internet Encyclopedia of Philosophy. https://iep.utm.edu/faith-re/
  • Whitaker, Robert K. (2019). “Faith and Disbelief.” International Journal for Philosophy of Religion 85: 149–172.
  • Zagzebski, Linda Trinkaus (2012). “Religious Authority.” In L. T. Zagzebski, Epistemic Authority: A Theory of Trust, Authority, and Autonomy in Belief. Oxford: Oxford University Press, 181–203.

 

Author Information

Elizabeth Jackson
Email: lizjackson111@ryerson.ca
Toronto Metropolitan University
Canada

Pseudoscience and the Demarcation Problem

The demarcation problem in philosophy of science refers to the question of how to meaningfully and reliably separate science from pseudoscience. Both the terms “science” and “pseudoscience” are notoriously difficult to define precisely, except in terms of family resemblance. The demarcation problem has a long history, tracing back at least to a speech given by Socrates in Plato’s Charmides, as well as to Cicero’s critique of Stoic ideas on divination. Karl Popper was the most influential modern philosopher to write on demarcation, proposing his criterion of falsifiability to sharply distinguish science from pseudoscience. Most contemporary practitioners, however, agree that Popper’s suggestion does not work. In fact, Larry Laudan suggested that the demarcation problem is insoluble and that philosophers would be better off focusing their efforts on something else. This led to a series of responses to Laudan and new proposals on how to move forward, collected in a landmark edited volume on the philosophy of pseudoscience. After the publication of this volume, the field saw a renaissance characterized by a number of innovative approaches. Two such approaches are particularly highlighted in this article: treating pseudoscience and pseudophilosophy as BS, that is, “bullshit” in Harry Frankfurt’s sense of the term, and applying virtue epistemology to the demarcation problem. This article also looks at the grassroots movement often referred to as scientific skepticism and at its philosophical bases.

Table of Contents

  1. An Ancient Problem with a Long History
  2. The Demise of Demarcation: The Laudan Paper
  3. The Return of Demarcation: The University of Chicago Press Volume
  4. The Renaissance of the Demarcation Problem
  5. Pseudoscience as BS
  6. Virtue Epistemology and Demarcation
  7. The Scientific Skepticism Movement
  8. References and Further Readings

1. An Ancient Problem with a Long History

In the Charmides (West and West translation, 1986), Plato has Socrates tackle what contemporary philosophers of science refer to as the demarcation problem, the separation between science and pseudoscience. In that dialogue, Socrates is referring to a specific but very practical demarcation issue: how to tell the difference between medicine and quackery. Here is the most relevant excerpt:

SOCRATES: Let us consider the matter in this way. If the wise man or any other man wants to distinguish the true physician from the false, how will he proceed? . . . He who would inquire into the nature of medicine must test it in health and disease, which are the sphere of medicine, and not in what is extraneous and is not its sphere?

CRITIAS: True.

SOCRATES: And he who wishes to make a fair test of the physician as a physician will test him in what relates to these?

CRITIAS: He will.

SOCRATES: He will consider whether what he says is true, and whether what he does is right, in relation to health and disease?

CRITIAS: He will.

SOCRATES: But can anyone pursue the inquiry into either, unless he has a knowledge of medicine?

CRITIAS: He cannot.

SOCRATES: No one at all, it would seem, except the physician can have this knowledge—and therefore not the wise man. He would have to be a physician as well as a wise man.

CRITIAS: Very true. (170e-171c)

The conclusion at which Socrates arrives, therefore, is that the wise person would have to develop expertise in medicine, as that is the only way to distinguish an actual doctor from a quack. Setting aside that such a solution is not practical for most people in most settings, the underlying question remains: how do we decide whom to pick as our instructor? What if we mistake a school of quackery for a medical one? Do quacks not also claim to be experts? Is this not a hopelessly circular conundrum?

A few centuries later, the Roman orator, statesman, and philosopher Marcus Tullius Cicero published a comprehensive attack on the notion of divination, essentially treating it as what we would today call a pseudoscience, and anticipating a number of arguments that have been developed by philosophers of science in modern times. As Fernandez-Beanato (2020a) points out, Cicero uses the Latin word “scientia” to refer to a broader set of disciplines than the English “science.” His meaning is closer to the German word “Wissenschaft,” which means that his treatment of demarcation potentially extends to what we would today call the humanities, such as history and philosophy.

Being a member of the New Academy, and therefore a moderate epistemic skeptic, Cicero writes: “As I fear to hastily give my assent to something false or insufficiently substantiated, it seems that I should make a careful comparison of arguments […]. For to hasten to give assent to something erroneous is shameful in all things” (De Divinatione, I.7 / Falconer translation, 2014). He thus frames the debate on unsubstantiated claims, and divination in particular, as a moral one.

Fernandez-Beanato identifies five modern criteria that often come up in discussions of demarcation and that are either explicitly or implicitly advocated by Cicero: internal logical consistency of whatever notion is under scrutiny; degree of empirical confirmation of the predictions made by a given hypothesis; degree of specificity of the proposed mechanisms underlying a certain phenomenon; degree of arbitrariness in the application of an idea; and degree of selectivity of the data presented by the practitioners of a particular approach. Divination fails, according to Cicero, because it is logically inconsistent, it lacks empirical confirmation, its practitioners have not proposed a suitable mechanism, said practitioners apply the notion arbitrarily, and they are highly selective in what they consider to be successes of their practice.

Jumping ahead to more recent times, arguably the first modern instance of a scientific investigation into allegedly pseudoscientific claims is the case of the famous Royal Commissions on Animal Magnetism appointed by King Louis XVI in 1784. One of them, the so-called Society Commission, was composed of five physicians from the Royal Society of Medicine; the other, the so-called Franklin Commission, comprised four physicians from the Paris Faculty of Medicine, as well as Benjamin Franklin. The goal of both commissions was to investigate claims of “mesmerism,” or animal magnetism, being made by Franz Mesmer and some of his students (Salas and Salas 1996; Armando and Belhoste 2018).

Mesmer was a medical doctor who began his career with a questionable study entitled “A Physico-Medical Dissertation on the Influence of the Planets.” Later, he developed a theory according to which all living organisms are permeated by a vital force that can, with particular techniques, be harnessed for therapeutic purposes. While mesmerism was popular and influential from the end of the 18th century through the whole of the 19th century, it is now considered a pseudoscience, in large part because of the failure to empirically replicate its claims and because vitalism in general has been abandoned as a theoretical notion in the biological sciences. Interestingly, though, Mesmer clearly thought he was doing good science within a physicalist paradigm and distanced himself from the more obviously supernatural practices of some of his contemporaries, such as the exorcist Johann Joseph Gassner.

For the purposes of this article, we need to stress the importance of the Franklin Commission in particular, since it represented arguably the first attempt in history to carry out controlled experiments. These were largely designed by Antoine Lavoisier, complete with a double-blind protocol in which neither subjects nor investigators knew which treatment they were dealing with at any particular time, the allegedly genuine one or a sham control. As Stephen Jay Gould (1989) put it:

The report of the Royal Commission of 1784 is a masterpiece of the genre, an enduring testimony to the power and beauty of reason. … The Report is a key document in the history of human reason. It should be rescued from its current obscurity, translated into all languages, and reprinted by organizations dedicated to the unmasking of quackery and the defense of rational thought.

Not surprisingly, neither Commission found any evidence supporting Mesmer’s claims. The Franklin report was printed in 20,000 copies and widely circulated in France and abroad, but this did not stop mesmerism from becoming widespread, with hundreds of books published on the subject in the period 1766–1925.

Arriving now at modern times, the philosopher who started the discussion on demarcation is Karl Popper (1959), who thought he had formulated a neat solution: falsifiability (Shea no date). He reckoned that, contrary to popular understanding, science does not make progress by proving its theories correct, since it is far too easy to selectively accumulate data that are favorable to one’s pre-established views. Rather, for Popper, science progresses by eliminating one bad theory after another, because once a notion has been proven to be false, it will stay that way. He concluded that what distinguishes science from pseudoscience is the (potential) falsifiability of scientific hypotheses, and the inability of pseudoscientific notions to be subjected to the falsifiability test.

For instance, Einstein’s theory of general relativity survived a crucial test in 1919, when one of its most extraordinary predictions—that light is bent by the presence of gravitational masses—was spectacularly confirmed during a total eclipse of the sun (Kennefick 2019). This did not prove that the theory is true, but it showed that it was falsifiable and, therefore, good science. Moreover, Einstein’s prediction was unusual and very specific, and hence very risky for the theory. This, for Popper, is a good feature of a scientific theory, as it is too easy to survive attempts at falsification when predictions based on the theory are mundane or common to multiple theories.

In contrast with the example of the 1919 eclipse, Popper thought that Freudian and Adlerian psychoanalysis, as well as Marxist theories of history, are unfalsifiable in principle; they are so vague that no empirical test could ever show them to be incorrect, if they are incorrect. The point is subtle but crucial. Popper did not argue that those theories are, in fact, wrong, only that one could not possibly know if they were, and they should not, therefore, be classed as good science.

Popper became interested in demarcation because he wanted to free science from a serious issue raised by David Hume (1748), the so-called problem of induction. Scientific reasoning is based on induction, a process by which we generalize from a set of observed events to all observable events. For instance, we “know” that the sun will rise again tomorrow because we have observed the sun rising countless times in the past. More importantly, we attribute causation to phenomena on the basis of inductive reasoning: since event X is always followed by event Y, we infer that X causes Y.

The problem as identified by Hume is twofold. First, unlike deduction (as used in logic and mathematics), induction does not guarantee a given conclusion; it only makes that conclusion probable as a function of the available empirical evidence. Second, there is no way to logically justify the inference of a causal connection. The human mind does so automatically, says Hume, as a leap of imagination.

Popper was not satisfied with the notion that science is, ultimately, based on a logically unsubstantiated step. He reckoned that if we were able to reframe scientific progress in terms of deductive, not inductive logic, Hume’s problem would be circumvented. Hence falsificationism, which is, essentially, an application of modus tollens (Hausman et al. 2021) to scientific hypotheses:

If P, then Q
Not Q
Therefore, not P

For instance, if General Relativity is true, then we should observe a certain deviation of light coming from the stars when their rays pass near the sun (during a total eclipse or under similarly favorable circumstances). We do observe the predicted deviation. Therefore, we have (currently) no reason to reject General Relativity. However, had the observations carried out during the 1919 eclipse not aligned with the prediction, then there would have been sufficient reason, according to Popper, to reject General Relativity based on the above syllogism.
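
The asymmetry Popper relies on can be checked mechanically with a truth table. The sketch below is illustrative only: P stands for “the theory is true” and Q for “the prediction is observed,” and the helper names are hypothetical. It shows that modus tollens is a valid form, whereas inferring the theory from a confirmed prediction (affirming the consequent) is not.

```python
from itertools import product


def implies(a, b):
    """Material conditional: 'if a, then b'."""
    return (not a) or b


def is_valid(premises, conclusion):
    """An argument form is valid if no assignment of truth values makes
    every premise true while the conclusion is false."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False
    return True


# Modus tollens: if P then Q; not Q; therefore not P.
modus_tollens = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: not q],
    conclusion=lambda p, q: not p,
)

# Affirming the consequent: if P then Q; Q; therefore P.
affirming_consequent = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: q],
    conclusion=lambda p, q: p,
)

print(modus_tollens, affirming_consequent)
```

The counterexample to the second form is the row where P is false and Q is true: a false theory can still yield a true prediction. This is why a successful test leaves a theory standing without proving it.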

Science, on this view, does not make progress one induction, or confirmation, after the other, but one discarded theory after the other. And as a bonus, thought Popper, this looks like a neat criterion to demarcate science from pseudoscience.

In fact, it is a bit too neat, unfortunately. Plenty of philosophers after Popper (for example, Laudan 1983) have pointed out that a number of pseudoscientific notions, such as astrology, are eminently falsifiable and have been shown to be false (Carlson 1985). Conversely, some notions that are currently considered scientific are, at least temporarily, unfalsifiable (for example, string theory in physics: Hossenfelder 2018).

A related issue with falsificationism is presented by the so-called Duhem-Quine theses (Curd and Cover 2012), two allied propositions about the nature of knowledge, scientific or otherwise, advanced independently by physicist Pierre Duhem and philosopher Willard Van Orman Quine.

Duhem pointed out that when scientists think they are testing a given hypothesis, as in the case of the 1919 eclipse test of General Relativity, they are, in reality, testing a broad set of propositions constituted by the central hypothesis plus a number of ancillary assumptions. For instance, while the attention of astronomers in 1919 was on Einstein’s theory and its implications for the laws of optics, they also simultaneously “tested” the reliability of their telescopes and cameras, among a number of more or less implicit additional hypotheses. Had something gone wrong, their likely first instinct, rightly, would have been to check that their equipment was functioning properly before taking the bold step of declaring General Relativity dead.
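
Duhem’s point can be stated in the same schematic style as the modus tollens syllogism given earlier. As a sketch, let H be the central hypothesis, A the conjunction of ancillary assumptions, and O the predicted observation; these symbols are introduced here purely for illustration.

```latex
\[
(H \land A) \rightarrow O, \qquad \neg O
\;\;\vdash\;\; \neg (H \land A) \;\equiv\; \neg H \lor \neg A
\]
```

A failed prediction therefore shows only that at least one of H and A is false. Logic alone does not say which one to give up.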

Quine, later on, articulated a broader account of human knowledge conceived as a web of beliefs. Part of this account is the notion that scientific theories are always underdetermined by the empirical evidence (Bonk 2008), meaning that different theories will be compatible with the same evidence at any given point in time. Indeed, for Quine it is not just that we test specific theories and their ancillary hypotheses. We literally test the entire web of human understanding. Certainly, if a test does not yield the predicted results we will first look at localized assumptions. But occasionally we may be forced to revise our notions at larger scales, up to and including mathematics and logic themselves.

The history of science does present good examples of how the Duhem-Quine theses undermine falsificationism. The twin tales of the spectacular discovery of a new planet and the equally spectacular failure to discover an additional one during the 19th century are classic examples.

Astronomers had uncovered anomalies in the orbit of Uranus, at that time the outermost known planet in the solar system. These anomalies did not appear, at first, to be explainable by standard Newtonian mechanics, and yet nobody thought even for a moment to reject that theory on the basis of the newly available empirical evidence. Instead, mathematician Urbain Le Verrier postulated that the anomalies were the result of the gravitational interference of an as yet unknown planet, situated outside of Uranus’ orbit. The new planet, Neptune, was in fact discovered on the night of 23-24 September 1846, thanks to the precise calculations of Le Verrier (Grosser 1962).

The situation repeated itself shortly thereafter, this time with anomalies discovered in the orbit of the innermost planet of our system, Mercury. Again, Le Verrier hypothesized the existence of a hitherto undiscovered planet, which he named Vulcan. But Vulcan never materialized. Eventually astronomers really did have to jettison Newtonian mechanics and deploy the more sophisticated tools provided by General Relativity, which accounted for the distortion of Mercury’s orbit in terms of gravitational effects originating with the Sun (Baum and Sheehan 1997).

What prompted astronomers to react so differently to two seemingly identical situations? Popper would have recognized the two similar hypotheses put forth by Le Verrier as being ad hoc and yet somewhat justified given the alternative, the rejection of Newtonian mechanics. But falsificationism has no tools capable of explaining why it is that sometimes ad hoc hypotheses are acceptable and at other times they are not. Nor, therefore, is it in a position to provide us with sure guidance in cases like those faced by Le Verrier and colleagues. This failure, together with wider criticism of Popper’s philosophy of science by the likes of Thomas Kuhn (1962), Imre Lakatos (1978), and Paul Feyerabend (1975), paved the way for a crisis of sorts for the whole project of demarcation in philosophy of science.

2. The Demise of Demarcation: The Laudan Paper

A landmark paper in the philosophy of demarcation was published by Larry Laudan in 1983. Provocatively entitled “The Demise of the Demarcation Problem,” it sought to dispatch the whole field of inquiry in one fell swoop. As the next section shows, the outcome was quite the opposite, as a number of philosophers responded to Laudan and reinvigorated the whole debate on demarcation. Nevertheless, it is instructive to look at Laudan’s paper and at some of his motivations for writing it.

Laudan was disturbed by the events that transpired during one of the classic legal cases concerning pseudoscience, specifically the teaching of so-called creation science in American classrooms. The case, McLean v. Arkansas Board of Education, was debated in 1982. Some of the fundamental questions that the presiding judge, William R. Overton, asked expert witnesses to address were whether Darwinian evolution is a science, whether creationism is also a science, and what criteria are typically used by the pertinent epistemic communities (that is, scientists and philosophers) to arrive at such assessments (LaFollette 1983).

One of the key witnesses on the evolution side was philosopher Michael Ruse, who presented Overton with a number of demarcation criteria, one of which was Popper’s falsificationism. According to Ruse’s testimony, creationism is not a science because, among other reasons, its claims cannot be falsified. In a famous and very public exchange with Ruse, Laudan (1988) objected to the use of falsificationism during the trial, on the grounds that Ruse must have known that that particular criterion had by then been rejected, or at least seriously questioned, by the majority of philosophers of science.

It was this episode that prompted Laudan to publish his landmark paper aimed at getting rid of the entire demarcation debate once and for all. One argument advanced by Laudan is that philosophers have been unable to agree on demarcation criteria since Aristotle and that it is therefore time to give up this particular quixotic quest. This is a rather questionable conclusion. Arguably, philosophy does not make progress by resolving debates, but by discovering and exploring alternative positions in the conceptual spaces defined by a particular philosophical question (Pigliucci 2017). Seen this way, falsificationism and modern debates on demarcation are a standard example of progress in philosophy of science, and there is no reason to abandon a fruitful line of inquiry so long as it keeps being fruitful.

Laudan then argues that the advent of fallibilism in epistemology (Feldman 1981) during the nineteenth century spelled the end of the demarcation problem, as epistemologists now recognize no meaningful distinction between opinion and knowledge. Setting aside that the notion of fallibilism far predates the nineteenth century and goes back at least to the New Academy of ancient Greece, it may be the case, as Laudan maintains, that many modern epistemologists do not endorse the notion of an absolute and universal truth, but such a notion is not needed for any serious project of science-pseudoscience demarcation. All one needs is that some “opinions” are far better established, by way of argument and evidence, than others and that scientific opinions tend to be dramatically better established than pseudoscientific ones.

It is certainly true, as Laudan maintains, that modern philosophers of science see science as a set of methods and procedures, not as a particular body of knowledge. But the two are tightly linked: the process of science yields reliable (if tentative) knowledge of the world. Conversely, the processes of pseudoscience, such as they are, do not yield any knowledge of the world. The distinction between science as a body of knowledge and science as a set of methods and procedures, therefore, does nothing to undermine the need for demarcation.

After a by now de rigueur criticism of the failure of positivism, Laudan attempts to undermine Popper’s falsificationism. But even Laudan himself seems to realize that the limits of falsificationism do not deal a death blow to the notion that there are recognizable sciences and pseudosciences: “One might respond to such criticisms [of falsificationism] by saying that scientific status is a matter of degree rather than kind” (Laudan 1983, 121). Indeed, that seems to be the currently dominant position of philosophers who are active in the area of demarcation.

The rest of Laudan’s critique boils down to the argument that no demarcation criterion proposed so far can provide a set of necessary and sufficient conditions to define an activity as scientific, and that the “epistemic heterogeneity of the activities and beliefs customarily regarded as scientific” (1983, 124) means that demarcation is a futile quest. This article now briefly examines each of these two claims.

Ever since Wittgenstein (1958), philosophers have recognized that any sufficiently complex concept will not likely be definable in terms of a small number of necessary and jointly sufficient conditions. That approach may work in basic math, geometry, and logic (for example, definitions of triangles and other geometric figures), but not for anything as complex as “science” or “pseudoscience.” This implies that single-criterion attempts like Popper’s should indeed finally be set aside, but it does not imply that multi-criterial or “fuzzy” approaches will not be useful. Again, rather than a failure, this shift should be regarded as evidence of progress in this particular philosophical debate.

Regarding Laudan’s second claim from above, that science is a fundamentally heterogeneous activity, this may or may not be the case; the jury is still very much out. Some philosophers of science have indeed suggested that there is a fundamental disunity to the sciences (Dupré 1993), but this is far from being a consensus position. Even if true, a heterogeneity of “science” does not preclude thinking of the sciences as a family resemblance set, perhaps with distinctly identifiable sub-sets, similar to the Wittgensteinian description of “games” and their subdivision into fuzzy sets including board games, ball games, and so forth. Indeed, some of the authors discussed later in this article have made this very same proposal regarding pseudoscience: there may be no fundamental unity grouping, say, astrology, creationism, and anti-vaccination conspiracy theories, but they nevertheless share enough Wittgensteinian threads to make it useful for us to talk of all three as examples of broadly defined pseudosciences.

3. The Return of Demarcation: The University of Chicago Press Volume

Laudan’s 1983 paper had the desired effect of convincing a number of philosophers of science that it was not worth engaging with demarcation issues. Yet, in the meantime pseudoscience kept being a noticeable social phenomenon, one that was having increasingly pernicious effects, for instance in the case of HIV, vaccine, and climate change denialism (Smith and Novella 2007; Navin 2013; Brulle 2020). It was probably inevitable, therefore, that philosophers of science who felt that their discipline ought to make positive contributions to society would, sooner or later, go back to the problem of demarcation.

The turning point was an edited volume entitled The Philosophy of Pseudoscience: Reconsidering the Demarcation Problem, published in 2013 by the University of Chicago Press (Pigliucci and Boudry 2013). The editors and contributors consciously and explicitly set out to respond to Laudan and to begin the work necessary to make progress (in something like the sense highlighted above) on the issue.

The first five chapters of The Philosophy of Pseudoscience take the form of various responses to Laudan, several of which hinge on the rejection of the strict requirement for a small set of necessary and jointly sufficient conditions to define science or pseudoscience. Contemporary philosophers of science, it seems, have no trouble with inherently fuzzy concepts. As for Laudan’s contention that the term “pseudoscience” does only negative, potentially inflammatory work, this is true and yet no different from, say, the use of “unethical” in moral philosophy, which few if any have thought of challenging.

The contributors to The Philosophy of Pseudoscience also readily admit that science is best considered as a family of related activities, with no fundamental essence to define it. Indeed, the same goes for pseudoscience as, for instance, vaccine denialism is very different from astrology, and both differ markedly from creationism. Nevertheless, there are common threads in both cases, and the existence of such threads justifies, in part, philosophical interest in demarcation. The same authors argue that we should focus on the borderline cases, precisely because there it is not easy to neatly separate activities into scientific and pseudoscientific. There is no controversy, for instance, in classifying fundamental physics and evolutionary biology as sciences, and there is no serious doubt that astrology and homeopathy are pseudosciences. But what are we to make of some research into the paranormal carried out by academic psychologists (Jeffers 2007)? Or of the epistemically questionable claims often, but not always, made by evolutionary psychologists (Kaplan 2006)?

The 2013 volume sought a consciously multidisciplinary approach to demarcation. Contributors include philosophers of science, but also sociologists, historians, and professional skeptics (meaning people who directly work on the examination of extraordinary claims). The group saw two fundamental reasons to continue scholarship on demarcation. On the one hand, science has acquired a high social status and commands large amounts of resources in modern society. This means that we ought to examine and understand its nature in order to make sound decisions about just how much trust to put into scientific institutions and proceedings, as well as how much money to pump into the social structure that is modern science. On the other hand, as noted above, pseudoscience is not a harmless pastime. It has negative effects on both individuals and societies. This means that an understanding of its nature, and of how it differs from science, has very practical consequences.

The Philosophy of Pseudoscience also tackles issues of history and sociology of the field. It contains a comprehensive history of the demarcation problem followed by a historical analysis of pseudoscience, which tracks down the coinage and currency of the term and explains its shifting meaning in tandem with the emerging historical identity of science. A contribution by a sociologist then provides an analysis of paranormalism as a “deviant discipline” violating the consensus of established science, and one chapter draws attention to the characteristic social organization of pseudosciences as a means of highlighting the corresponding sociological dimension of the scientific endeavor.

The volume explores the borderlands between science and pseudoscience, for instance by deploying the idea of causal asymmetries in evidential reasoning to differentiate between what are sometimes referred to as “hard” and “soft” sciences, arguing that misconceptions about this difference explain the higher incidence of pseudoscience and anti-science connected to the non-experimental sciences. One contribution looks at the demographics of pseudoscientific belief and examines how the demarcation problem is treated in legal cases. One chapter recounts the story of how at one time the pre-Darwinian concept of evolution was treated as pseudoscience in the same guise as mesmerism, before eventually becoming the professional science we are familiar with, thus challenging a conception of demarcation in terms of timeless and purely formal principles.

A discussion focusing on science and the supernatural includes the provocative suggestion that, contrary to recent philosophical trends, the appeal to the supernatural should not be ruled out from science on methodological grounds, as is often done, but rather because the very notion of supernatural intervention suffers from fatal flaws. Meanwhile, David Hume is enlisted to help navigate the treacherous territory between science and religious pseudoscience and to assess the epistemic credentials of supernaturalism.

The Philosophy of Pseudoscience includes an analysis of the tactics deployed by “true believers” in pseudoscience, beginning with a discussion of the ethics of argumentation about pseudoscience, followed by the suggestion that alternative medicine can be evaluated scientifically despite the immunizing strategies deployed by some of its most vocal supporters. One entry summarizes misgivings about Freudian psychoanalysis, arguing that we should move beyond assessments of the testability and other logical properties of a theory, shifting our attention instead to the spurious claims of validation and other recurrent misdemeanors on the part of pseudoscientists. It also includes a description of the different strategies used by climate change “skeptics” and other denialists, outlining the links between new and “traditional” pseudosciences.

The volume includes a section examining the complex cognitive roots of pseudoscience. Some of the contributors ask whether we actually evolved to be irrational, describing a number of heuristics that are rational in domains ecologically relevant to ancient Homo sapiens, but that lead us astray in modern contexts. One of the chapters explores the non-cognitive functions of super-empirical beliefs, analyzing the different attitudes of science and pseudoscience toward intuition. An additional entry distinguishes between two mindsets about science and explores the cognitive styles relating to authority and tradition in both science and pseudoscience. This is followed by an essay proposing that belief in pseudoscience may be partly explained by theories about the ethics of belief. There is also a chapter on pseudo-hermeneutics and the illusion of understanding, drawing inspiration from the cognitive psychology and philosophy of intentional thinking.

A simple search of online databases of peer-reviewed philosophical papers clearly shows that the 2013 volume has succeeded in countering Laudan’s 1983 paper, yielding a flourishing of new entries in the demarcation literature in particular, and in the newly established subfield of the philosophy of pseudoscience more generally. This article now turns to a brief survey of some of the prominent themes that have so far characterized this Renaissance of the field of demarcation.

4. The Renaissance of the Demarcation Problem

After the publication of The Philosophy of Pseudoscience collection, an increasing number of papers has been published on the demarcation problem and related issues in philosophy of science and epistemology. It is not possible to discuss all the major contributions in detail, so what follows is intended as a representative set of highlights and a brief guide to the primary literature.

Sven Ove Hansson (2017) proposed that science denialism, often considered a different issue from pseudoscience, is actually one form of the latter, the other form being what he terms pseudotheory promotion. Hansson examines in detail three case studies: relativity theory denialism, evolution denialism, and climate change denialism. The analysis is couched in terms of three criteria for the identification of pseudoscientific statements, previously laid out by Hansson (2013). A statement is pseudoscientific if it satisfies the following:

  1. It pertains to an issue within the domains of science in the broad sense (the criterion of scientific domain).
  2. It suffers from such a severe lack of reliability that it cannot at all be trusted (the criterion of unreliability).
  3. It is part of a doctrine whose major proponents try to create the impression that it represents the most reliable knowledge on its subject matter (the criterion of deviant doctrine).

On these bases, Hansson concludes that, for example, “The misrepresentations of history presented by Holocaust deniers and other pseudo-historians are very similar in nature to the misrepresentations of natural science promoted by creationists and homeopaths” (2017, 40). In general, Hansson proposes that there is a continuum between science denialism at one end (for example, regarding climate change, the Holocaust, or the general theory of relativity) and pseudotheory promotion at the other end (for example, astrology, homeopathy, iridology). He identifies four epistemological characteristics that account for the failure of science denialism to provide genuine knowledge:

  • Cherry picking. One example is Conservapedia’s entry listing alleged counterexamples to the general theory of relativity. Never mind that, of course, even a cursory inspection of such “anomalies” turns up only mistakes or misunderstandings.
  • Neglect of refuting information. Again concerning general relativity denialism, the proponents of the idea point to a theory advanced by the Swiss physicist Georges-Louis Le Sage that gravitational forces result from pressure exerted on physical bodies by a large number of small invisible particles. That idea might have been reasonably entertained when it was proposed, in the 18th century, but not after the devastating criticism it received in the 19th century—let alone the 21st.
  • Fabrication of fake controversies. Perhaps the most obvious example here is the “teach both theories” mantra so often repeated by creationists, which was adopted by Ronald Reagan during his 1980 presidential campaign. The fact is, there is no controversy about evolution within the pertinent epistemic community.
  • Deviant criteria of assent. For instance, in the 1920s and ’30s, special relativity was accused of not being sufficiently transpicuous, and its opponents went so far as to attempt to create a new “German physics” that would not use difficult mathematics and would, therefore, be accessible to everyone. Both Einstein and Planck ridiculed the whole notion that science ought to be transpicuous in the first place. The point is that part of the denialist’s strategy is to ask for impossible standards in science and then use the fact that such demands are not met (because they cannot be) as “evidence” against a given scientific notion. This is known as the unobtainable perfection fallacy (Gauch 2012).

Hansson lists ten sociological characteristics of denialism: that the focal theory (say, evolution) threatens the denialist’s worldview (for instance, a fundamentalist understanding of Christianity); complaints that the focal theory is too difficult to understand; a lack of expertise among denialists; a strong predominance of men among the denialists (that is, lack of diversity); an inability to publish in peer-reviewed journals; a tendency to embrace conspiracy theories; appeals directly to the public; the pretense of having support among scientists; a pattern of attacks against legitimate scientists; and strong political overtones.

Dawes (2018) acknowledges, with Laudan (1983), that there is a general consensus that no single criterion (or even small set of necessary and jointly sufficient criteria) is capable of discerning science from pseudoscience. However, he correctly maintains that this does not imply that there is no multifactorial account of demarcation, situating different kinds of science and pseudoscience along a continuum. One such criterion is that science is a social process, which entails that a theory is considered scientific because it is part of a research tradition that is pursued by the scientific community.

Dawes is careful to reject the sort of social constructionism endorsed by some sociologists of science (Bloor 1976) on the grounds that the sociological component is just one of the criteria that separate science from pseudoscience. Two additional criteria have been studied by philosophers of science for a long time: the evidential and the structural. The first refers to the connection between a given scientific theory and the empirical evidence that provides epistemic warrant for that theory. The second is concerned with the internal structure and coherence of a scientific theory.

Science, according to Dawes, is a cluster concept grouping a set of related, yet somewhat differentiated, kinds of activities. In this sense, his paper reinforces an increasingly widespread understanding of science in the philosophical community (see also Dupré 1993; Pigliucci 2013). Pseudoscience, then, is also a cluster concept, similarly grouping a number of related, yet varied, activities that attempt to mimic science but do so within the confines of an epistemically inert community.

The question, therefore, becomes, in part, one of distinguishing scientific from pseudoscientific communities, especially when the latter closely mimic the former. Take, for instance, homeopathy. While it is clearly a pseudoscience, the relevant community is made of self-professed “experts” who even publish a “peer-reviewed” journal, Homeopathy, put out by a major academic publisher, Elsevier. Here, Dawes builds on an account of scientific communities advanced by Robert Merton (1973). According to Merton, scientific communities are characterized by four norms, all of which are lacking in pseudoscientific communities: universalism, the notion that class, gender, ethnicity, and so forth are (ideally, at least) treated as irrelevant in the context of scientific discussions; communality, in the sense that the results of scientific inquiry belong (again, ideally) to everyone; disinterestedness, not because individual scientists are unbiased, but because community-level mechanisms counter individual biases; and organized skepticism, whereby no idea is exempt from critical scrutiny.

In the end, Dawes’s suggestion is that “We will have a pro tanto reason to regard a theory as pseudoscientific when it has been either refused admission to, or excluded from, a scientific research tradition that addresses the relevant problems” (2018, 293). Crucially, however, what is or is not recognized as a viable research tradition by the scientific community changes over time, so that the demarcation between science and pseudoscience is itself liable to shift as time passes.

One author who departs significantly from what otherwise seems to be an emerging consensus on demarcation is Angelo Fasce (2019). He rejects the notion that there is any meaningful continuum between science and pseudoscience, or that either concept can fruitfully be understood in terms of family resemblance, going so far as accusing some of his colleagues of “still engag[ing] in time-consuming, unproductive discussions on already discarded demarcation criteria, such as falsifiability” (2019, 155).

Fasce’s criticism hinges, in part, on the notion that gradualist criteria may create problems in policy decision making: just how close does an activity have to be to the pseudoscientific end of the spectrum in order for, say, a granting agency to raise issues? The answer is that there is no sharp demarcation because there cannot be, regardless of how much we would wish otherwise. In many cases, said granting agency should have no trouble classifying good science (for example, fundamental physics or evolutionary biology) as well as obvious pseudoscience (for example, astrology or homeopathy). But there will be some borderline cases (for instance, parapsychology? SETI?) where one will just have to exercise one’s best judgment based on what is known at the moment and deal with the possibility that one might make a mistake.

Fasce also argues that “Contradictory conceptions and decisions can be consistently and justifiably derived from [a given demarcation criterion]—i.e. mutually contradictory propositions could be legitimately derived from the same criterion because that criterion allows, or is based on, ‘subjective’ assessment” (2019, 159). Again, this is probably true, but it is also likely an inevitable feature of the nature of the problem, not a reflection of the failure of philosophers to adequately tackle it.

Fasce (2019, 62) states that there is no historical case of a pseudoscience turning into a legitimate science, which he takes as evidence that there is no meaningful continuum between the two classes of activities. But this does not take into account the case of pre-Darwinian evolutionary theories mentioned earlier, nor the many instances of the reverse transition, in which an activity initially considered scientific has, in fact, gradually turned into a pseudoscience, including alchemy (although its relationship with chemistry is actually historically complicated), astrology, phrenology, and, more recently, cold fusion—with the caveat that whether the latter notion ever reached scientific status is still being debated by historians and philosophers of science. These occurrences would seem to point to the existence of a continuum between the two categories of science and pseudoscience.

One interesting objection raised by Fasce is that philosophers who favor a cluster concept approach do not seem to be bothered by the fact that such a Wittgensteinian take has led some authors, like Richard Rorty, all the way down the path of radical relativism, a position that many philosophers of science reject. Then again, Fasce himself acknowledges that “Perhaps the authors who seek to carry out the demarcation of pseudoscience by means of family resemblance definitions do not follow Wittgenstein in all his philosophical commitments” (2019, 64).

Because of his dissatisfaction with gradualist interpretations of the science-pseudoscience landscape, Fasce (2019, 67) proposes what he calls a “metacriterion” to aid in the demarcation project. This is actually a set of four criteria, two of which he labels “procedural requirements” and two “criterion requirements.” The latter two are mandatory for demarcation, while the first two are not necessary, although they provide conditions of plausibility. The procedural requirements are: (i) that demarcation criteria should entail a minimum number of philosophical commitments; and (ii) that demarcation criteria should explain current consensus about what counts as science or pseudoscience. The criterion requirements are: (iii) that mimicry of science is a necessary condition for something to count as pseudoscience; and (iv) that all items of demarcation criteria be discriminant with respect to science.

Fasce (2018) has used his metacriterion to develop a demarcation criterion according to which pseudoscience: (1) refers to entities and/or processes outside the domain of science; (2) makes use of a deficient methodology; (3) is not supported by evidence; and (4) is presented as scientific knowledge. This turns out to be similar to a previous proposal by Hansson (2009). Fasce and Picó (2019) have also developed a scale of pseudoscientific belief based on the work discussed above.

Another author pushing a multicriterial approach to demarcation is Damian Fernandez-Beanato (2020b), whom this article already mentioned when discussing Cicero’s early debunking of divination. He provides a useful summary of previous mono-criterial proposals, as well as of two multicriterial ones advanced by Hempel (1951) and Kuhn (1962). The failure of these attempts is what in part led to the above-mentioned rejection of the entire demarcation project by Laudan (1983).

Fernandez-Beanato suggests improvements on a multicriterial approach originally put forth by Mahner (2007), consisting of a broad list of accepted characteristics or properties of science. The project, however, runs into significant difficulties for a number of reasons. First, like Fasce (2019), Fernandez-Beanato wishes for more precision than is likely possible, in his case aiming at a quantitative “cut value” on a multicriterial scale that would make it possible to distinguish science from non-science or pseudoscience in a way that is compatible with classical logic. It is hard to imagine how such quantitative estimates of “scientificity” may be obtained and operationalized. Second, the approach assumes a unity of science that is at odds with the above-mentioned emerging consensus in philosophy of science that “science” (and, similarly, “pseudoscience”) actually picks out a family of related activities, not a single epistemic practice. Third, Fernandez-Beanato rejects Hansson’s (and other authors’) notion that any demarcation criterion is, by necessity, temporally limited because what constitutes science or pseudoscience changes with our understanding of phenomena. But it seems hard to justify Fernandez-Beanato’s assumption that “Science … is currently, in general, mature enough for properties related to method to be included into a general and timeless definition of science” (2019, 384).

Kåre Letrud (2019), like Fasce (2019), seeks to improve on Hansson’s (2009) approach to demarcation, but from a very different perspective. He points out that Hansson’s original answer to the demarcation problem focuses on pseudoscientific statements, not disciplines. The problem with this, according to Letrud, is that Hansson’s approach does not take into sufficient account the sociological aspect of the science-pseudoscience divide. Moreover, following Hansson, again according to Letrud, one would get trapped into a never-ending debunking of individual (as distinct from systemic) pseudoscientific claims. Here Letrud invokes the “Bullshit Asymmetry Principle,” also known as “Brandolini’s Law” (named after the Italian programmer Alberto Brandolini, to whom it is attributed): “The amount of energy needed to refute BS is an order of magnitude bigger than to produce it.” Going pseudoscientific statement by pseudoscientific statement, then, is a losing proposition.

Letrud notes that Hansson (2009) adopts a broad definition of “science,” along the lines of the German Wissenschaft, which includes the social sciences and the humanities. While Fasce (2019) thinks this is problematically too broad, Letrud (2019) points out that a broader view of science implies a broader view of pseudoscience, which allows Hansson to include in the latter not just standard examples like astrology and homeopathy, but also Holocaust denialism, Bible “codes,” and so forth.

According to Letrud, however, Hansson’s original proposal does not do a good job differentiating between bad science and pseudoscience, which is important because we do not want to equate the two. Letrud suggests that bad science is characterized by discrete episodes of epistemic failure, which can occur even within established sciences. Pseudoscience, by contrast, features systemic epistemic failure. Bad science can even give rise to what Letrud calls “scientific myth propagation,” as in the case of the long-discredited notion that there are such things as learning styles in pedagogy. It can take time, even decades, to correct examples of bad science, but that does not ipso facto make them instances of pseudoscience.

Letrud applies Lakatos’s (1978) distinction of core vs. auxiliary statements for research programs to core vs. auxiliary statements typical of pseudosciences like astrology or homeopathy, thus bridging the gap between Hansson’s focus on individual statements and Letrud’s preferred focus on disciplines. For instance: “One can be an astrologist while believing that Virgos are loud, outgoing people (apparently, they are not). But one cannot hold that the positions of the stars and the character and behavior of people are unrelated” (Letrud 2019, 8). The first statement is auxiliary, the second, core.

To take homeopathy as an example, a skeptic could decide to spend an inordinate amount of time (according to Brandolini’s Law) debunking individual statements made by homeopaths. Or, more efficiently, the skeptic could target the two core principles of the discipline, namely potentization theory (that is, the notion that more diluted solutions are more effective) and the hypothesis that water holds a “memory” of substances once present in it. Letrud’s approach, then, retains the power of Hansson’s, but zeros in on the more foundational weakness of pseudoscience—its core claims—while at the same time satisfactorily separating pseudoscience from regular bad science. The debate, however, is not over, as more recently Hansson (2020) has replied to Letrud emphasizing that pseudosciences are doctrines, and that the reason they are so pernicious is precisely their doctrinal resistance to correction.

5. Pseudoscience as BS

One of the most intriguing papers on demarcation to appear in the course of what this article calls the Renaissance of scholarship on the issue of pseudoscience is entitled “Bullshit, Pseudoscience and Pseudophilosophy,” authored by Victor Moberger (2020). Moberger has found a neat (and somewhat provocative) way to describe the profound similarity between pseudoscience and pseudophilosophy: in a technical philosophical sense, it is all BS.

Moberger takes his inspiration from the famous essay by Harry Frankfurt (2005), On Bullshit. As Frankfurt puts it: “One of the most salient features of our culture is that there is so much bullshit” (2005, 1). Crucially, Frankfurt goes on to differentiate the BSer from the liar:

It is impossible for someone to lie unless he thinks he knows the truth. … A person who lies is thereby responding to the truth, and he is to that extent respectful of it. When an honest man speaks, he says only what he believes to be true; and for the liar, it is correspondingly indispensable that he consider his statements to be false. For the bullshitter, however, all these bets are off: he is neither on the side of the true nor on the side of the false. His eye is not on the facts at all, as the eyes of the honest man and of the liar are. … He does not care whether the things he says describe reality correctly. (2005, 55-56)

So, while both the honest person and the liar are concerned with the truth—though in opposite manners—the BSer is defined by his lack of concern for it. This lack of concern is of the culpable variety, so that it can be distinguished from other activities that involve not telling the truth, like acting. This means two important things: (i) BS is a normative concept, meaning that it is about how one ought to behave or not to behave; and (ii) the specific type of culpability that can be attributed to the BSer is epistemic culpability. As Moberger puts it, “the bullshitter is assumed to be capable of responding to reasons and argument, but fails to do so” (2020, 598) because he does not care enough.

Moberger does not make the connection in his paper, but since he focuses on BSing as an activity carried out by particular agents, and not as a body of statements that may be true or false, his treatment falls squarely into the realm of virtue epistemology (see below). We can all arrive at the wrong conclusion on a specific subject matter, or unwittingly defend incorrect notions. And indeed, to some extent we may all, more or less, be culpable of some degree of epistemic misconduct, because few if any people are the epistemological equivalent of sages, ideally virtuous individuals. But the BSer is pathologically epistemically culpable. He incurs epistemic vices and he does not care about it, so long as he gets whatever he wants out of the deal, be that to be “right” in a discussion, or to further his favorite a priori ideological position no matter what.

Accordingly, the charge of BSing, in the technical sense, has to be substantiated by serious philosophical analysis. The term cannot simply be thrown out there as an insult or an easy dismissal. For instance, when Kant famously disagreed with Hume on the role of reason (primary for Kant, subordinate to emotions for Hume) he could not just have labelled Hume’s position as BS and moved on, because Hume had articulated cogent arguments in defense of his take on the subject.

On the basis of Frankfurt’s notion of BSing, Moberger carries out a general analysis of pseudoscience and even pseudophilosophy. He uses the term pseudoscience to refer to well-known examples of epistemic malpractice, like astrology, creationism, homeopathy, ufology, and so on. According to Moberger, the term pseudophilosophy, by contrast, picks out two distinct classes of behaviors. The first is what he refers to as “a seemingly profound type of academic discourse that is pursued primarily within the humanities and social sciences” (2020, 600), which he calls obscurantist pseudophilosophy. The second, a “less familiar kind of pseudophilosophy is usually found in popular scientific contexts, where writers, typically with a background in the natural sciences, tend to wander into philosophical territory without realizing it, and again without awareness of relevant distinctions and arguments” (2020, 601). He calls this scientistic (Boudry and Pigliucci 2017) pseudophilosophy.

The bottom line is that pseudoscience is BS with scientific pretensions, while pseudophilosophy is BS with philosophical pretensions. What pseudoscience and pseudophilosophy have in common, then, is BS. While both pseudoscience and pseudophilosophy suffer from a lack of epistemic conscientiousness, this lack manifests itself differently, according to Moberger. In the case of pseudoscience, we tend to see a number of classical logical fallacies and other reasoning errors at play. In the case of pseudophilosophy, instead, we see “equivocation due to conceptual impressionism, whereby plausible but trivial propositions lend apparent credibility to interesting but implausible ones.”

Moberger’s analysis provides a unified explanatory framework for otherwise seemingly disparate phenomena, such as pseudoscience and pseudophilosophy. And it does so in terms of a single, more fundamental, epistemic problem: BSing. He then proceeds by fleshing out the concept—for instance, differentiating pseudoscience from scientific fraud—and by responding to a range of possible objections to his thesis, for example that the demarcation of concepts like pseudoscience, pseudophilosophy, and even BS is vague and imprecise. It is so by nature, Moberger responds, adopting the already encountered Wittgensteinian view that complex concepts are inherently fuzzy.

Importantly, Moberger reiterates a point made by other authors before, and yet very much worth reiterating: any demarcation in terms of content between science and pseudoscience (or philosophy and pseudophilosophy), cannot be timeless. Alchemy was once a science, but it is now a pseudoscience. What is timeless is the activity underlying both pseudoscience and pseudophilosophy: BSing.

There are several consequences of Moberger’s analysis. First, it is a mistake to focus exclusively, sometimes obsessively, on the specific claims made by proponents of pseudoscience, as so many skeptics do. That is because sometimes even pseudoscientific practitioners get things right, and because there simply are too many such claims to be successfully challenged (again, Brandolini’s Law). The focus should instead be on pseudoscientific practitioners’ epistemic malpractice, that is, on activity rather than content.

Second, what is bad about pseudoscience and pseudophilosophy is not that they are unscientific, because plenty of human activities are not scientific and yet are not objectionable (literature, for instance). Science is not the ultimate arbiter of what has or does not have value. While this point is hardly controversial, it is worth reiterating, considering that a number of prominent science popularizers have engaged in this mistake.

Third, pseudoscience does not lack empirical content. Astrology, for one, has plenty of it. But that content does not stand up to critical scrutiny. Astrology is a pseudoscience because its practitioners do not seem to be bothered by the fact that their statements about the world do not appear to be true.

One thing that is missing from Moberger’s paper, perhaps, is a warning that even practitioners of legitimate science and philosophy may be guilty of gross epistemic malpractice when they criticize their pseudo counterparts. Too often so-called skeptics reject unusual or unorthodox claims a priori, without critical analysis or investigation, for example in the notorious case of the so-called Campeche UFOs (Pigliucci, 2018, 97-98). From a virtue epistemological perspective, it comes down to the character of the agents. We all need to push ourselves to do the right thing, which includes mounting criticisms of others only when we have done our due diligence to actually understand what is going on. Therefore, a small digression into how virtue epistemology is relevant to the demarcation problem now seems to be in order.

6. Virtue Epistemology and Demarcation

Just like there are different ways to approach virtue ethics (for example, Aristotle, the Stoics), so there are different ways to approach virtue epistemology. What these various approaches have in common is the assumption that epistemology is a normative (that is, not merely descriptive) discipline, and that intellectual agents (and their communities) are the sources of epistemic evaluation.

The assumption of normativity very much sets virtue epistemology as a field at odds with W.V.O. Quine’s famous suggestion that epistemology should become a branch of psychology (see Naturalistic Epistemology): that is, a descriptive, not prescriptive discipline. That said, however, virtue epistemologists are sensitive to input from the empirical sciences, first and foremost psychology, as any sensible philosophical position ought to be.

A virtue epistemological approach, just like its counterpart in ethics, shifts the focus away from a “point of view from nowhere” and onto specific individuals (and their communities), who are treated as epistemic agents. In virtue ethics, the actions of a given agent are explained in terms of the moral virtues (or vices) of that agent, like courage or cowardice. Analogously, in virtue epistemology the judgments of a given agent are explained in terms of the epistemic virtues (or vices) of that agent, such as conscientiousness or gullibility.

Just like virtue ethics has its roots in ancient Greece and Rome, so too can virtue epistemologists claim a long philosophical pedigree, including but not limited to Plato, Aristotle, the Stoics, Thomas Aquinas, Descartes, Hume, and Bertrand Russell.

But what exactly is a virtue, in this context? Again, the analogy with ethics is illuminating. In virtue ethics, a virtue is a character trait that makes the agent an excellent, meaning ethical, human being. Similarly, in virtue epistemology a virtue is a character trait that makes the agent an excellent cognizer. Here is a partial list of epistemological virtues and vices to keep handy:

Epistemic virtues | Epistemic vices
Attentiveness | Close-mindedness
Benevolence (that is, principle of charity) | Dishonesty
Conscientiousness | Dogmatism
Creativity | Gullibility
Curiosity | Naïveté
Discernment | Obtuseness
Honesty | Self-deception
Humility | Superficiality
Objectivity | Wishful thinking
Parsimony |
Studiousness |
Understanding |
Warrant |
Wisdom |

Linda Zagzebski (1996) has proposed a unified account of epistemic and moral virtues that would cast the entire science-pseudoscience debate in more than just epistemic terms. The idea is to explicitly bring to epistemology the same inverse approach that virtue ethics brings to moral philosophy: analyzing right actions (or right beliefs) in terms of virtuous character, instead of the other way around.

For Zagzebski, intellectual virtues are actually to be thought of as a subset of moral virtues, which would make epistemology a branch of ethics. The notion is certainly intriguing: consider a standard moral virtue, like courage. It is typically understood as being rooted in the agent’s motivation to do good despite the risk of personal danger. Analogously, the virtuous epistemic agent is motivated by wanting to acquire knowledge, in pursuit of which goal she cultivates the appropriate virtues, like open-mindedness.

In the real world, sometimes virtues come in conflict with each other, for instance in cases where the intellectually bold course of action is also not the most humble, thus pitting courage and humility against each other. The virtuous moral or epistemic agent navigates a complex moral or epistemic problem by adopting an all-things-considered approach with as much wisdom as she can muster. Knowledge itself is then recast as a state of belief generated by acts of intellectual virtue.

Reconnecting all of this more explicitly with the issue of science-pseudoscience demarcation, it should now be clearer why Moberger’s focus on BS is essentially based on a virtue ethical framework. The BSer is obviously not acting virtuously from an epistemic perspective, and indeed, if Zagzebski is right, also from a moral perspective. This is particularly obvious in the cases of pseudoscientific claims made by, among others, anti-vaxxers and climate change denialists. It is not just the case that these people are not being epistemically conscientious. They are also acting unethically because their ideological stances are likely to hurt others.

A virtue epistemological approach to the demarcation problem is explicitly adopted in a paper by Sindhuja Bhakthavatsalam and Weimin Sun (2021), who provide a general outline of how virtue epistemology may be helpful concerning science-pseudoscience demarcation. The authors also explore in detail the specific example of the Chinese practice of Feng Shui, a type of pseudoscience employed in some parts of the world to direct architects to build in ways that maximize positive “qi” energy.

Bhakthavatsalam and Sun argue that discussions of demarcation do not aim solely at separating the usually epistemically reliable products of science from the typically epistemically unreliable ones that come out of pseudoscience. What we want is also to teach people, particularly the general public, to improve their epistemic judgments so that they do not fall prey to pseudoscientific claims. That is precisely where virtue epistemology comes in.

Bhakthavatsalam and Sun build on work by Anthony Derksen (1993) who arrived at what he called an epistemic-social-psychological profile of a pseudoscientist, which in turn led him to a list of epistemic “sins” that pseudoscientists regularly engage in: lack of reliable evidence for their claims; arbitrary “immunization” from empirically based criticism (Boudry and Braeckman 2011); assigning outsized significance to coincidences; adopting magical thinking; contending to have special insight into the truth; tendency to produce all-encompassing theories; and uncritical pretension in the claims put forth.

Conversely, one can arrive at a virtue epistemological understanding of science and other truth-conducive epistemic activities. As Bhakthavatsalam and Sun (2021, 6) remind us: “Virtue epistemologists contend that knowledge is non‐accidentally true belief. Specifically, it consists in belief of truth stemming from epistemic virtues rather than by luck. This idea is captured well by Wayne Riggs (2009): knowledge is an ‘achievement for which the knower deserves credit.’”

Bhakthavatsalam and Sun discuss two distinct yet, in their mind, complementary (especially with regard to demarcation) approaches to virtue epistemology: virtue reliabilism and virtue responsibilism. Briefly, virtue reliabilism (Sosa 1980, 2011) considers epistemic virtues to be stable behavioral dispositions, or competences, of epistemic agents. In the case of science, for instance, such virtues might include basic logical thinking skills, the ability to properly collect data, the ability to properly analyze data, and even the practical know-how necessary to use laboratory or field equipment. Clearly, these are precisely the sort of competences that are not found among practitioners of pseudoscience. But why not? This is where the other approach to virtue epistemology, virtue responsibilism, comes into play.

Responsibilism is about identifying and practicing epistemic virtues, as well as identifying and staying away from epistemic vices. The virtues and vices in question are along the lines of those listed in the table above. Of course, we all (including scientists and philosophers) engage in occasionally vicious, or simply sloppy, epistemological practices. But what distinguishes pseudoscientists is that they systematically tend toward the vicious end of the epistemic spectrum, while what characterizes the scientific community is a tendency to hone epistemic virtues, both by way of expressly designed training and by peer pressure internal to the community. Part of the advantage of thinking in terms of epistemic vices and virtues is that one then puts the responsibility squarely on the shoulders of the epistemic agent, who becomes praiseworthy or blameworthy, as the case may be.

Moreover, a virtue epistemological approach immediately provides at least a first-level explanation for why the scientific community is conducive to the truth while the pseudoscientific one is not. In the latter case, comments Cassam:

The fact that this is how [the pseudoscientist] goes about his business is a reflection of his intellectual character. He ignores critical evidence because he is grossly negligent, he relies on untrustworthy sources because he is gullible, he jumps to conclusions because he is lazy and careless. He is neither a responsible nor an effective inquirer, and it is the influence of his intellectual character traits which is responsible for this. (2016, 165)

In the end, Bhakthavatsalam and Sun arrive, by way of their virtue epistemological approach, at the same conclusion that we have seen other authors reach: both science and pseudoscience are Wittgensteinian-type cluster concepts. But virtue epistemology provides more than just a different point of view on demarcation. First, it identifies specific behavioral tendencies (virtues and vices) the cultivation (or elimination) of which yields epistemically reliable outcomes. Second, it shifts the responsibility to the agents as well as to the communal practices within which such agents operate. Third, it makes it possible to understand cases of bad science as being the result of scientists who have not sufficiently cultivated or sufficiently regarded their virtues, which in turn explains why we find the occasional legitimate scientist who endorses pseudoscientific notions.

How do we put all this into practice, involving philosophers and scientists in the sort of educational efforts that may help curb the problem of pseudoscience? Bhakthavatsalam and Sun articulate a call for action at both the personal and the systemic levels. At the personal level, we can virtuously engage with both purveyors of pseudoscience and, likely more effectively, with quasi-neutral bystanders who may be attracted to, but have not yet bought into, pseudoscientific notions. At the systemic level, we need to create the sort of educational and social environment that is conducive to the cultivation of epistemic virtues and the eradication of epistemic vices.

Bhakthavatsalam and Sun are aware of the perils of engaging defenders of pseudoscience directly, especially from the point of view of virtue epistemology. It is far too tempting to label them as “vicious,” lacking in critical thinking, gullible, and so forth and be done with it. But basic psychology tells us that this sort of direct character attack is not only unlikely to work, but nearly guaranteed to backfire. Bhakthavatsalam and Sun claim that we can “charge without blame” since our goal is “amelioration rather than blame” (2021, 15). But it is difficult to imagine how someone could be charged with the epistemic vice of dogmatism and not take that personally.

Far more promising are two different avenues: the systemic one, briefly discussed by Bhakthavatsalam and Sun, and the personal not in the sense of blaming others, but rather in the sense of modeling virtuous behavior ourselves.

In terms of systemic approaches, Bhakthavatsalam and Sun are correct that we need to reform both social and educational structures so that we reduce the chances of generating epistemically vicious agents and maximize the chances of producing epistemically virtuous ones. School reforms certainly come to mind, but also regulation of epistemically toxic environments like social media.

As for modeling good behavior, we can take a hint from the ancient Stoics, who focused not on blaming others, but on ethical self-improvement:

If a man is mistaken, instruct him kindly and show him his error. But if you are not able, blame yourself, or not even yourself. (Marcus Aurelius, Meditations, X.4)

A good starting point may be offered by the following checklist, which—in agreement with the notion that good epistemology begins with ourselves—is aimed at our own potential vices. The next time you engage someone, in person or especially on social media, ask yourself the following questions:

  • Did I carefully consider the other person’s arguments without dismissing them out of hand?
  • Did I interpret what they said in a charitable way before mounting a response?
  • Did I seriously entertain the possibility that I may be wrong? Or am I too blinded by my own preconceptions?
  • Am I an expert on this matter? If not, did I consult experts, or did I just conjure my own unfounded opinion?
  • Did I check the reliability of my sources, or just google whatever was convenient to throw at my interlocutor?
  • After having done my research, do I actually know what I’m talking about, or am I simply repeating someone else’s opinion?

After all, as Aristotle said: “Piety requires us to honor truth above our friends” (Nicomachean Ethics, book I), though some scholars have suggested that this was a rather unvirtuous comment aimed at his former mentor, Plato.

7. The Scientific Skepticism Movement

One of the interesting characteristics of the debate about science-pseudoscience demarcation is that it is an obvious example where philosophy of science and epistemology become directly useful in terms of public welfare. This, in other words, is not just an exercise in armchair philosophizing; it has the potential to affect lives and make society better. This is why we need to take a brief look at what is sometimes referred to as the skeptic movement—people and organizations who have devoted time and energy to debunking and fighting pseudoscience. Such efforts could benefit from a more sophisticated philosophical grounding, and in turn philosophers interested in demarcation would find their work to be immediately practically useful if they participated in organized skepticism.

That said, it was in fact a philosopher, Paul Kurtz, who played a major role in the development of the skeptical movement in the United States. Kurtz, together with Marcello Truzzi, founded the Committee for the Scientific Investigation of Claims of the Paranormal (CSICOP), in Amherst, New York in 1976. The organization changed its name to the Committee for Skeptical Inquiry (CSI) in November 2006 and has long been publishing the premier world magazine on scientific skepticism, Skeptical Inquirer. These groups, however, were preceded by a long history of skeptic organizations outside the US. The oldest skeptic organization on record is the Dutch Vereniging tegen de Kwakzalverij (VtdK), established in 1881. This was followed by the Belgian Comité Para in 1949, started in response to a large predatory industry of psychics exploiting the grief of people who had lost relatives during World War II.

In the United States, Michael Shermer, founder and editor of Skeptic Magazine, traced the origin of anti-pseudoscience skepticism to the publication of Martin Gardner’s Fads and Fallacies in the Name of Science in 1952. The French Association for Scientific Information (AFIS) was founded in 1968, and a series of groups got started worldwide between 1980 and 1990, including Australian Skeptics, Stichting Skepsis in the Netherlands, and CICAP in Italy. In 1996, the magician James Randi founded the James Randi Educational Foundation, which established a one-million-dollar prize to be given to anyone who could reproduce a paranormal phenomenon under controlled conditions. The prize was never claimed.

After the fall of the Berlin Wall, a series of groups began operating in Russia and its former satellites in response to yet another wave of pseudoscientific claims. This led to skeptic organizations in the Czech Republic, Hungary, and Poland, among others. The European Skeptic Congress was founded in 1989, and a number of World Skeptic Congresses have been held in the United States, Australia, and Europe.

Kurtz (1992) characterized scientific skepticism in the following manner: “Briefly stated, a skeptic is one who is willing to question any claim to truth, asking for clarity in definition, consistency in logic, and adequacy of evidence.” This differentiates scientific skepticism from ancient Pyrrhonian Skepticism, which famously made no claim to any opinion at all, but it makes it the intellectual descendant of the Skepticism of the New Academy as embodied especially by Carneades and Cicero (Machuca and Reed 2018).

One of the most famous slogans of scientific skepticism, “Extraordinary claims require extraordinary evidence,” was first introduced by Truzzi. It can easily be seen as a modernized version of David Hume’s (1748, Section X: Of Miracles; Part I. 87.) dictum that a wise person proportions his beliefs to the evidence, and it has been interpreted as an example of Bayesian thinking (McGrayne 2011).
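The Bayesian reading of the slogan can be made concrete with a small numerical sketch. The Python function below is only an illustration: the name posterior, the priors, and the likelihood ratios are assumptions chosen for the example and are not taken from Truzzi, Hume, or McGrayne. It applies Bayes' rule in odds form, where the posterior odds equal the prior odds multiplied by the likelihood ratio of the evidence.

```python
def posterior(prior: float, likelihood_ratio: float) -> float:
    """Return P(claim | evidence) using Bayes' rule in odds form.

    prior            -- P(claim) before seeing the evidence (assumed value)
    likelihood_ratio -- P(evidence | claim true) / P(evidence | claim false)
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the slogan.
    ordinary_prior = 0.5          # a claim that is as likely true as not
    extraordinary_prior = 1e-6    # a claim that conflicts with background knowledge

    # The same moderately strong evidence (10 times likelier if the claim is true).
    print(posterior(ordinary_prior, 10.0))
    print(posterior(extraordinary_prior, 10.0))

    # Far stronger ("extraordinary") evidence applied to the extraordinary claim.
    print(posterior(extraordinary_prior, 1e6))
```

Under these assumed numbers, evidence that makes the ordinary claim very credible leaves the extraordinary claim still highly improbable, and only evidence roughly a million times likelier under the claim than under its negation brings it to about even odds. This is one way to cash out the idea of proportioning belief to the evidence.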

According to another major, early exponent of scientific skepticism, astronomer Carl Sagan: “The question is not whether we like the conclusion that emerges out of a train of reasoning, but whether the conclusion follows from the premises or starting point and whether that premise is true” (1995).

Modern scientific skeptics take full advantage of the new electronic tools of communication. Two examples in particular are the Skeptics’ Guide to the Universe podcast published by Steve Novella and collaborators, which regularly reaches a large audience and features interviews with scientists, philosophers, and skeptic activists; and the “Guerrilla Skepticism” initiative coordinated by Susan Gerbic, which is devoted to the systematic improvement of skeptic-related content on Wikipedia.

Despite having deep philosophical roots, and despite the fact that some of its major exponents have been philosophers, scientific skepticism has an unfortunate tendency to find itself far more comfortable with science than with philosophy. Indeed, some major skeptics, such as author Sam Harris and scientific popularizers Richard Dawkins and Neil deGrasse Tyson, have been openly contemptuous of philosophy, thus giving the movement a bit of a scientistic bent. This is somewhat balanced by the interest in scientific skepticism of a number of philosophers (for instance, Maarten Boudry, Lee McIntyre) as well as by scientists who recognize the relevance of philosophy (for instance, Carl Sagan, Steve Novella).

Given the intertwining of not just scientific skepticism and philosophy of science, but also of social and natural science, the theoretical and practical study of the science-pseudoscience demarcation problem should be regarded as an extremely fruitful area of interdisciplinary endeavor—an endeavor in which philosophers can make significant contributions that go well beyond relatively narrow academic interests and actually have an impact on people’s quality of life and understanding of the world.

8. References and Further Readings

  • Armando, D. and Belhoste, B. (2018) Mesmerism Between the End of the Old Regime and the Revolution: Social Dynamics and Political Issues. Annales historiques de la Révolution française 391(1):3-26
  • Baum, R. and Sheehan, W. (1997) In Search of Planet Vulcan: The Ghost in Newton’s Clockwork Universe. Plenum.
  • Bhakthavatsalam, S. and Sun, W. (2021) A Virtue Epistemological Approach to the Demarcation Problem: Implications for Teaching About Feng Shui in Science Education. Science & Education 30:1421-1452. https://doi.org/10.1007/s11191-021-00256-5.
  • Bloor, D. (1976) Knowledge and Social Imagery. Routledge & Kegan Paul.
  • Bonk, T. (2008) Underdetermination: An Essay on Evidence and the Limits of Natural Knowledge. Springer.
  • Boudry, M. and Braeckman, J. (2011) Immunizing Strategies and Epistemic Defense Mechanisms. Philosophia 39(1):145-161.
  • Boudry, M. and Pigliucci, M. (2017) Science Unlimited? The Challenges of Scientism. University of Chicago Press.
  • Brulle, R.J. (2020) Denialism: Organized Opposition to Climate Change Action in the United States, in: D.M. Konisky (ed.) Handbook of U.S. Environmental Policy, Edward Elgar, chapter 24.
  • Carlson, S. (1985) A Double-Blind Test of Astrology. Nature 318:419-25.
  • Cassam, Q. (2016) Vice Epistemology. The Monist 99(2):159-180.
  • Cicero (2014) On Divination, in: Cicero—Complete Works, translated by W.A. Falconer, Delphi.
  • Curd, M. and Cover, J.A. (eds.) (2012) The Duhem-Quine Thesis and Underdetermination, in: Philosophy of Science: The Central Issues. Norton, pp. 225-333.
  • Dawes, G.W. (2018) Identifying Pseudoscience: A Social Process Criterion. Journal for General Philosophy of Science 49:283-298.
  • Derksen, A.A. (1993) The Seven Sins of Pseudo-Science. Journal for General Philosophy of Science 24:17-42.
  • Dupré, J. (1993) The Disorder of Things: Metaphysical Foundations of the Disunity of Science. Harvard University Press.
  • Fasce, A. (2018) What Do We Mean When We Speak of Pseudoscience? The Development of a Demarcation Criterion Based on the Analysis of Twenty-One Previous Attempts. Disputatio 6(7):459-488.
  • Fasce, A. (2019) Are Pseudosciences Like Seagulls? A Discriminant Metacriterion Facilitates the Solution of the Demarcation Problem. International Studies in the Philosophy of Science 32(3-4):155-175.
  • Fasce, A. and Picó, A. (2019) Conceptual Foundations and Validation of the Pseudoscientific Belief Scale. Applied Cognitive Psychology 33(4):617-628.
  • Feldman, R. (1981) Fallibilism and Knowing that One Knows, The Philosophical Review 90:266-282.
  • Fernandez-Beanato, D. (2020a) Cicero’s Demarcation of Science: A Report of Shared Criteria. Studies in History and Philosophy of Science Part A 83:97-102.
  • Fernandez-Beanato, D. (2020b) The Multicriterial Approach to the Problem of Demarcation. Journal for General Philosophy of Science 51:375-390.
  • Feyerabend, P. (1975) Against Method: Outline of an Anarchistic Theory of Knowledge. New Left Books.
  • Frankfurt, H. (2005) On Bullshit. Princeton University Press.
  • Gardner, M. (1952) Fads and Fallacies in the Name of Science. Dover.
  • Gauch, H.G. (2012) Scientific Method in Brief. Cambridge University Press.
  • Gould, S.J. (1989) The Chain of Reason vs. The Chain of Thumbs, Natural History, 89(7):16.
  • Grosser, M. (1962) The Discovery of Neptune. Harvard University Press.
  • Hansson, S.O. (2009) Cutting the Gordian Knot of Demarcation. International Studies in the Philosophy of Science 23(3):237-243.
  • Hansson, S.O. (2013) Defining Pseudoscience—and Science, in: M. Pigliucci and M. Boudry (eds.), The Philosophy of Pseudoscience. University of Chicago Press, pp. 61-77.
  • Hansson, S.O. (2017) Science Denial as a Form of Pseudoscience. Studies in History and Philosophy of Science 63:39-47.
  • Hansson, S.O. (2020) Disciplines, Doctrines, and Deviant Science. International Studies in the Philosophy of Science 33(1):43-52.
  • Hausman, A., Boardman, F., and Kahane, H. (2021) Logic and Philosophy: A Modern Introduction. Hackett.
  • Hempel, C.G. (1951) The Concept of Cognitive Significance: A Reconsideration. Proceedings of the American Academy of Arts and Sciences 80:61–77.
  • Hossenfelder, S. (2018) Lost in Math: How Beauty Leads Physics Astray. Basic Books.
  • Hume, D. (1748) An Enquiry Concerning Human Understanding, online at https://davidhume.org/texts/e/.
  • Jeffers, S. (2007) PEAR Lab Closes, Ending Decades of Psychic Research. Skeptical Inquirer 31(3), online at https://skepticalinquirer.org/2007/05/pear-lab-closes-ending-decades-of-psychic-research/.
  • Kaplan, J.M. (2006) More Misuses of Evolutionary Psychology. Metascience 15(1):177-181.
  • Kennefick, D. (2019) No Shadow of a Doubt: The 1919 Eclipse That Confirmed Einstein’s Theory of Relativity. Princeton University Press.
  • Kuhn, T. (1962) The Structure of Scientific Revolutions. University of Chicago Press.
  • Kurtz, P. (1992) The New Skepticism. Prometheus.
  • LaFollette, M. (1983) Creationism, Science and the Law. MIT Press.
  • Lakatos, I. (1978) The Methodology of Scientific Research Programmes. Cambridge University Press.
  • Laudan, L. (1983) The Demise of the Demarcation Problem, in: R.S. Cohen and L. Laudan (eds.), Physics, Philosophy and Psychoanalysis. D. Reidel, pp. 111–127.
  • Laudan, L. (1988) Science at the Bar—Causes for Concern. In M. Ruse (ed.), But Is It Science? Prometheus.
  • Letrud, K. (2019) The Gordian Knot of Demarcation: Tying Up Some Loose Ends. International Studies in the Philosophy of Science 32(1):3-11.
  • Machuca, D.E. and Reed, B. (2018) Skepticism: From Antiquity to the Present. Bloomsbury Academic.
  • Mahner, M. (2007) Demarcating Science from Non-Science, in: T. Kuipers (ed.), Handbook of the Philosophy of Science: General Philosophy of Science—Focal Issues. Elsevier, pp. 515-575.
  • McGrayne, S.B. (2011) The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy. Yale University Press.
  • Merton, R.K. (1973) The Normative Structure of Science, in: N.W. Storer (ed.), The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press, pp. 267-278.
  • Moberger, V. (2020) Bullshit, Pseudoscience and Pseudophilosophy. Theoria 86(5):595-611.
  • Navin, M. (2013) Competing Epistemic Spaces. How Social Epistemology Helps Explain and Evaluate Vaccine Denialism. Social Theory and Practice 39(2):241-264.
  • Pigliucci, M. (2013) The Demarcation Problem: A (Belated) Response to Laudan, in: M. Pigliucci and M. Boudry (eds.), The Philosophy of Pseudoscience. University of Chicago Press, pp. 9-28.
  • Pigliucci, M. (2017) Philosophy as the Evocation of Conceptual Landscapes, in: R. Blackford and D. Broderick (eds.), Philosophy’s Future: The Problem of Philosophical Progress. John Wiley & Sons, pp. 75-90.
  • Pigliucci, M. (2018) Nonsense on Stilts, 2nd edition. University of Chicago Press, pp. 97-98.
  • Pigliucci, M. and Boudry, M. (eds.) (2013) The Philosophy of Pseudoscience: Reconsidering the Demarcation Problem. University of Chicago Press.
  • Plato (1986) Charmides. Translated by T.G. West and G.S. West, Hackett Classics.
  • Popper, K. (1959) The Logic of Scientific Discovery. Hutchinson.
  • Riggs, W. (2009) Two Problems of Easy Credit. Synthese 169(1):201-216.
  • Sagan, C. (1995) The Demon-Haunted World. Ballantine.
  • Salas, D. and Salas, D. (translators) (1996) The First Scientific Investigation of the Paranormal Ever Conducted, Commissioned by King Louis XVI. Designed, conducted, & written by Benjamin Franklin, Antoine Lavoisier, & Others. Skeptic (Fall), pp. 68-83.
  • Shea, B. (no date) Karl Popper: Philosophy of Science. Internet Encyclopedia of Philosophy. https://iep.utm.edu/pop-sci/
  • Smith, T.C. and Novella, S.P. (2007) HIV Denial in the Internet Era. PLOS Medicine, https://doi.org/10.1371/journal.pmed.0040256.
  • Sosa, E. (1980) The Raft and the Pyramid: Coherence versus Foundations in the Theory of Knowledge. Midwest Studies in Philosophy 5(1):3-26.
  • Sosa, E. (2011) Knowing Full Well. Princeton University Press.
  • Wittgenstein, L. (1958) Philosophical Investigations. Blackwell.
  • Zagzebski, L.T. (1996) Virtues of the Mind: An Inquiry into the Nature of Virtue and the Ethical Foundations of Knowledge. Cambridge University Press.

 

Author Information

Massimo Pigliucci
Email: mpigliucci@ccny.cuny.edu
The City College of New York
U. S. A.

Substance

The term “substance” has two main uses in philosophy. Both originate in what is arguably the most influential work of philosophy ever written, Aristotle’s Categories. In its first sense, “substance” refers to those things that are object-like, rather than property-like. For example, an elephant is a substance in this sense, whereas the height or colour of the elephant is not. In its second sense, “substance” refers to the fundamental building blocks of reality. An elephant might count as a substance in this sense. However, this depends on whether we accept the kind of metaphysical theory that treats biological organisms as fundamental. Alternatively, we might judge that the properties of the elephant, or the physical particles that compose it, or entities of some other kind better qualify as substances in this second sense. Since the seventeenth century, a third use of “substance” has gained currency. According to this third use, a substance is something that underlies the properties of an ordinary object and that must be combined with these properties for the object to exist. To avoid confusion, philosophers often substitute the word “substratum” for “substance” when it is used in this third sense. The elephant’s substratum is what remains when you set aside its shape, size, colour, and all its other properties. These philosophical uses of “substance” differ from the everyday use of “substance” as a synonym for “stuff” or “material”. This is not a case of philosophers putting an ordinary word to eccentric use. Rather, “substance” entered modern languages as a philosophical term, and it is the everyday use that has drifted from the philosophical uses.

Table of Contents

  1. Substance in Classical Greek Philosophy
    1. Substance in Aristotle
    2. Substance in Hellenistic and Roman Philosophy
  2. Substance in Classical Indian Philosophy
    1. Nyaya-Vaisheshika and Jain Substances
    2. Upanishadic Substrata
    3. Buddhist Objections to Substance
  3. Substance in Medieval Arabic and Islamic Philosophy
    1. Al-Farabi
    2. Avicebron (Solomon ibn Gabirol)
  4. Substance in Medieval Scholastic Philosophy
    1. Thomas Aquinas
    2. Duns Scotus
  5. Substance in Early Modern Philosophy
    1. Descartes
    2. Spinoza
    3. Leibniz
    4. British Empiricism
  6. Substance in Twentieth-Century and Early-Twenty-First-Century Philosophy
    1. Criteria for Being a Substance
    2. The Structure of Substances
    3. Substance and the Mind-Body Problem
  7. References and Further Reading

1. Substance in Classical Greek Philosophy

The idea of substance enters philosophy at the start of Aristotle’s collected works, in the Categories 1a. It is further developed by Aristotle in other works, especially the Physics and the Metaphysics. Aristotle’s concept of substance was quickly taken up by other philosophers in the Aristotelian and Platonic schools. By late antiquity, the Categories, along with an introduction by Porphyry, was the first text standardly taught to philosophy students throughout the Roman world, a tradition that persisted in one form or another for more than a thousand years. As a result, Aristotle’s concept of substance can be found in works by philosophers across a tremendous range of times and places. Uptake of Aristotle’s concept of substance in Hellenistic and Roman philosophy was typically uncritical, however, and it is necessary to look to other traditions for influential challenges to and/or revisions of the Aristotelian concept.

a. Substance in Aristotle

The Categories centres on two ways of dividing up the kinds of things that exist (or, on some interpretations, the kinds of words or concepts for things that exist). Aristotle starts with a simple four-fold division. He then introduces a more complicated ten-fold division. Both give pride of place to the category of substances.

Aristotle draws the four-fold division in terms of two relations: that of existing in a subject in the way that the colour grey is in an elephant, and that of being said of a subject in the way that “animal” or “four-footed” is said of an elephant. Commentators often refer to these relations as inherence and predication, respectively.

Some things, Aristotle says, exist in a subject, and some are said of a subject. Some both exist in and are said of a subject. But members of a fourth group, substances, neither exist in nor are said of a subject:

A substance—that which is called a substance most strictly, primarily, and most of all—is that which is neither said of a subject nor in a subject, e.g. the individual man or the individual horse. (Categories, 2a11)

In other words, substances are those things that are neither inherent in, nor predicated of, anything else. A problem for understanding what this means is that Aristotle does not define the said of (predication) and in (inherence) relations. Aristotle (Categories, 2b5–6) does make it clear, however, that whatever is said of or in a subject, in the sense he has in mind, depends for its existence on that subject. The colour grey and the genus animal, for example, can exist only as the colour or genus of some subject—such as an elephant. Substances, according to Aristotle, do not depend on other things for their existence in this way: the elephant need not belong to some further thing in order to exist in the way that the colour grey and the genus animal (arguably) must. In this respect, Aristotle’s distinction between substances and non-substances approximates the everyday distinction between objects and properties.

Scholars tend to agree that Aristotle treats the things that are said of a subject as universals and other things as particulars. If so, Aristotle’s substances are particulars: unlike the genus animal, an individual elephant cannot have multiple instances. Scholars also tend to agree that Aristotle treats the things that exist in a subject as accidental and the other things as non-accidental. If so, substances are non-accidental. However, the term “accidental” usually signifies the relationship between a property and its bearer. For example, the colour grey is an accident of the elephant because it is not part of its essence, whereas the genus animal is not an accident of the elephant but is part of its essence. The claim that an object-like thing, such as a man, a horse, or an elephant, is non-accidental therefore seems trivially true.
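The four-fold division can be pictured as the result of asking two yes/no questions of each thing: is it said of a subject, and does it exist in a subject? The following is a minimal illustrative sketch, not anything found in Aristotle or the commentators. The names Item and classify are hypothetical, and the labels "universal" and "accidental" simply encode the scholarly reading reported above.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Item:
    """Hypothetical record for one thing to be classified."""
    name: str
    said_of_subject: bool  # predication, as "animal" is said of an elephant
    in_subject: bool       # inherence, as the colour grey is in an elephant


def classify(item: Item) -> dict:
    """Answer the two questions that generate the four-fold division."""
    return {
        # Reading reported in the text: what is said of a subject is a universal.
        "universal": item.said_of_subject,
        # Reading reported in the text: what exists in a subject is accidental.
        "accidental": item.in_subject,
        # A primary substance is neither said of nor in a subject.
        "primary_substance": not item.said_of_subject and not item.in_subject,
    }


elephant = Item("an individual elephant", said_of_subject=False, in_subject=False)
animal = Item("the genus animal", said_of_subject=True, in_subject=False)

# Enumerate all four combinations of the two relations.
four_fold = [
    classify(Item("case", said_of, inheres))
    for said_of, inheres in product([False, True], repeat=2)
]
```

On this encoding, exactly one of the four combinations counts as a primary substance, and it is the combination that is both particular and non-accidental.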

Unlike the four-fold division, Aristotle’s ten-fold division does not arise out of the systematic combination of two or more characteristics such as being said of or existing in a subject. It is presented simply as a list consisting of substance, quantity, qualification, relative, where, when, being-in-a-position, having, doing, and being-affected. Scholars have long debated whether Aristotle had a system for arriving at this list of categories or whether he “merely picked them up as they occurred to him” as Kant suggests (Critique of Pure Reason, Pt.2, Div.1, I.1, §3, 10).

Despite our ignorance about how he arrived at it, Aristotle’s ten-fold division helps clarify his concept of substance by providing a range of contrast cases: substances are not quantities, qualifications, relatives and so on, all of which depend on substances for their existence.

Having introduced the ten-fold division, Aristotle also highlights some characteristics that make substances stand out (Categories, 3b–8b): a substance is individual and numerically one, has no contrary (nothing stands to an elephant as knowledge stands to ignorance or justice to injustice), does not admit of more or less (no substance is more or less a substance than another substance, no elephant is more or less an elephant than another elephant), is not said in relation to anything else (one can know what an elephant is without knowing anything else to which it stands in some relation), and is able to receive contraries (an elephant can be hot at one time, cold at another). Aristotle emphasises that whereas substances share some of these characteristics with some non-substances, the ability to receive contraries while being numerically one is unique to substances (Categories, 4a10–13).

The core idea of a substance in the Categories applies to those object-like particulars that, uniquely, do not depend for their existence on some subject in which they must exist or of which they must be said, and that are capable of receiving contraries when they undergo change. That, at any rate, is how the Categories characterises those things that are “most strictly, primarily, and most of all” called “substances”. One complication must be noted. Aristotle adds that:

The species in which the things primarily called substances are, are called secondary substances, as also are the genera of these species. For example, the individual man belongs in a species, man, and animal is a genus of the species; so these—both man and animal—are called secondary substances. (Categories, 2a13)

Strictly, then, the Categories characterises two kinds of substances: primary substances, which have the characteristics we have looked at, and secondary substances, which are the species and genera to which primary substances belong. However, Aristotle’s decision to call the species and genera to which primary substances belong “secondary substances” is not typically adopted by later thinkers. When people talk about substances in philosophy, they almost always have in mind a sense of the term derived from Aristotle’s discussion of primary substances. Except where otherwise specified, the same is true of this article.

In singling out object-like particulars such as elephants as those things that are “most strictly, primarily and most of all” called “substance”, Aristotle implies that the term “substance” is no mere label, but that it signifies a special status. A clue as to what Aristotle has in mind here can be found in his choice of terminology. The Greek term translated “substance” is ousia, an abstract noun derived from the participle ousa of the Greek verb eimi, meaning—and cognate with—I am. Unlike the English “substance”, ousia carries no connotation of standing under or holding up. Rather, ousia suggests something close to what we mean by the word “being” when we use it as a noun. Presumably, therefore, Aristotle regards substances as those things that are most strictly and primarily counted as beings, as things that exist.

Aristotle sometimes refers to substances as hypokeimena, a term that does carry the connotation of standing under (or rather, lying under), and that is often translated with the term “subject”. Early translators of Aristotle into Latin frequently used a Latin rendering of hypokeimenon—namely, substantia—to translate both terms. This is how we have ended up with the English term “substance”. It is possible that this has contributed to some of the confusions that have emerged in later discussions, which have placed too much weight on the connotations of the English term (see section 5.c).

Aristotle also discusses the concept of substance in a number of other works. If these have not had the same degree of influence as the Categories, their impact has nonetheless been considerable, especially on scholastic Aristotelianism. Moreover, these works add much to what Aristotle says about substance in the Categories, in some places even seeming to contradict it.

The most important development of Aristotle’s concept of substance outside the Categories is his analysis of material substances into matter (hyle) and form (morphe)—an analysis that has come to be known as hylomorphism (though only since the late nineteenth century). This analysis is developed in the Physics, a text dedicated to things that undergo change, and which, unsurprisingly therefore, also has to do with substances. Given the distinctions drawn in the Categories, one might expect Aristotle’s account of change to simply say that change occurs when a substance gains or loses one of the things that is said of or that exists in it—before its bath, the elephant is hot and grey, but afterwards, it is cool and mud-coloured. However, Aristotle also has the task of accounting for substantial change. That is, the coming to be or ceasing to exist of a substance. An old tradition in Greek philosophy, beginning with Parmenides, suggests that substantial change should be impossible, since it involves something coming from nothing or vanishing into nothing. In the Physics, Aristotle addresses this issue by analysing material substances into the matter they are made of and the form that organises that matter. This allows him to explain substantial change. For example, when a vase comes into existence, the pre-existing clay acquires the form of a vase, and when it is destroyed, the clay loses the form of a vase. Neither process involves something coming from or vanishing into nothing. Likewise, when an elephant comes into existence, pre-existing matter acquires the form of an elephant. When an elephant ceases to exist, the matter loses the form of an elephant, becoming (mere) flesh and bones.

Aristotle returns to the topic of substance at length in the Metaphysics. Here, much to the confusion of readers, Aristotle raises the question of what is most properly called a “substance” afresh and considers three options: the matter of which something is made, the form that organises that matter, or the compound of matter and form. Contrary to what was said in the Categories and the Physics, Aristotle seems to say that the term “substance” applies most properly not to a compound of matter and form such as an elephant or a vase, but to the form that makes that compound the kind of thing it is. (The form that makes a hylomorphic compound the kind of thing it is, such as the form of an elephant or the form of a vase, is referred to as a substantial form, to distinguish it from accidental forms such as size or colour). Scholars do not agree on how to reconcile this position with that of Aristotle’s other works. In any case, it should be noted that it is Aristotle’s identification of substances with object-like particulars such as elephants and vases that has guided most later discussions of substance.

One explanation for Aristotle’s claim in the Metaphysics that it is the substantial form that most merits the title of “substance” concerns material change. In the Categories, Aristotle emphasises that substances are distinguished by their ability to survive through change. Living things, such as elephants, however, do not just change with respect to accidental forms such as temperature and colour. They also change with respect to the matter they are made of. As a result, it seems that if the elephant remains the same elephant over time, this must be in virtue of its having the same substantial form.

In the Metaphysics, Aristotle rejects the thesis that the term “substance” applies to matter. In discussing this thesis, he anticipates a usage that becomes popular from the seventeenth century onwards. On this usage, “substance” does not refer to object-like particulars such as elephants or vases; rather, it refers to an underlying thing that must be combined with properties to yield an object-like particular. This underlying thing is typically conceived as having no properties in itself, but as standing under or supporting the properties with which it must be combined. The application of the term “substance” to this underlying thing is confusing, and the common practice of favouring the word “substratum” in this context is followed here. The idea of a substratum that must be combined with properties to yield a substance in the ordinary sense is close to Aristotle’s idea of matter that must be combined with form. It is closer still to the concept of prime matter, which is traditionally (albeit controversially) attributed to Aristotle and which, unlike flesh or clay, is conceived as having no properties in its own right, except perhaps spatial extension. Though the concept of a substratum is not the same as the concept of substance in its original sense, it also plays an extremely important role in the history of philosophy, and one that has antecedents earlier than Aristotle in the Presocratics and in classical Indian philosophy, a topic discussed in section 2.b.

b. Substance in Hellenistic and Roman Philosophy

As noted in the previous section, in the Categories, Aristotle distinguishes two kinds of non-substance: those that exist in a subject and those that are said of a subject. He goes on to divide these further, into the ten categories from which the work takes its name: quantity, qualification, relative, where, when, being-in-a-position, having, doing, being-affected, and secondary substance (which we can count as non-substances for the reasons explained in section 1.a).

Although an enormous number of subsequent thinkers adopt the basic distinction between substances and non-substances, many omit the distinction between predication and inherence. That is, between non-substances that are said of a subject and non-substances that exist in a subject. Moreover, many compact the list of non-substances. For example, the late Neoplatonist Simplicius (480–560 C.E.) records that the second head of the Academy after Plato, Xenocrates (395/96–313/14 B.C.E.), as well as the eleventh head of the Peripatetic school, Andronicus of Rhodes (ca.60 B.C.E.), reduced Aristotle’s ten categories to two: things that exist in themselves, meaning substances, and things that exist in relation to something else, meaning non-substances.

In adopting the language of things that exist in themselves and those that exist in relation to something else, philosophers such as Xenocrates and Andronicus of Rhodes appear to have been recasting Aristotle’s distinction between substances and non-substances in a terminology that approximates that of Plato’s Sophist (255c). It can therefore be argued that the distinction between substances and non-substances that later thinkers inherit from Aristotle also has a line of descent from Plato, even if Plato devotes much less attention to the distinction.

The definition of substances as things that exist in themselves (kath’ auta or per se) is commonplace in the history of philosophy after Aristotle. The expression is, however, regrettably imprecise, both in the original Greek and in the various translations that have followed. For it is not clear what the preposition “in” is supposed to signify here. Clearly, it does not signify containment, as when water exists in a vase or a brick in a wall. It is plausible that the widespread currency of this vague phrase is responsible for the failure of the most influential philosophers from antiquity onwards to state explicit necessary and sufficient conditions for substancehood.

The simplification of the category of non-substances and the introduction of the Platonic in itself terminology are the main philosophical innovations respecting the concept of substance in Hellenistic and Roman philosophy. The concept would also be given a historic theological application when the Nicene Creed (ca.325 C.E.) defined the Father and Son of the Holy Trinity as consubstantial (homoousion) or of one substance. As a result, the philosophical concept of substance would play a central role in the Arian controversy that shaped early Christian theology.

Although Hellenistic and Roman discussions of substance tend to be uncritical, an exception can be found in the Pyrrhonist tradition. Sextus Empiricus records a Pyrrhonist argument against the distinction between substance and non-substance, which says, in effect, that:

  1. If things that exist in themselves do not differ from things that exist in relation to something else, then they too exist in relation to something else.
  2. If things that exist in themselves do differ from things that exist in relation to something else, then they too exist in relation to something else (for to differ from something is to stand in relation to it).
  3. Therefore, the idea of something that exists in itself is incoherent (see McEvilley 2002, 469).
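The first two steps of this argument have the shape of a dilemma: whether or not things that exist in themselves differ from relational things, they turn out to be relational. As a rough illustration, the validity of that propositional skeleton can be checked by brute force over truth values. The function names below are hypothetical, and the sketch assumes a plain material-conditional reading of “if … then”. It tests only the logical form, not whether the premises are true, and it does not cover the further step from “relational” to “incoherent”.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


def dilemma_is_valid() -> bool:
    """Search every truth assignment for a counterexample to the inference.

    differs    : things that exist in themselves differ from relational things
    relational : things that exist in themselves exist in relation to something else
    """
    for differs, relational in product([True, False], repeat=2):
        premise_1 = implies(not differs, relational)  # step 1 of the argument
        premise_2 = implies(differs, relational)      # step 2 of the argument
        if premise_1 and premise_2 and not relational:
            return False  # premises true but conclusion false
    return True


result = dilemma_is_valid()
```

Since no assignment makes both premises true and the conclusion false, anyone who wishes to resist the argument must reject a premise, most naturally the claim in step 2 that differing from something is a way of existing in relation to it.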

While arguing against the existence of substances is not a central preoccupation of Pyrrhonist philosophy, it is a central concern of the remarkably similar Buddhist Madhyamaka tradition, and there is a possibility of influence in one direction or the other.

2. Substance in Classical Indian Philosophy

The concept of substance in Western philosophy derives from Aristotle via the ancient and medieval philosophical traditions of Europe, the Middle East and North Africa. Either the same or a similar concept is central to the Indian Vaisheshika and Jain schools, to the Nyaya school with which Vaisheshika merged and, as an object of criticism, to various Buddhist schools. This appears to have been the first time that the concept of substance was subjected to sustained philosophical criticism, anticipating and possibly influencing the well-known criticisms of the idea of substance advanced by early modern Western thinkers.

a. Nyaya-Vaisheshika and Jain Substances

There exist six orthodox schools of Indian philosophy (those that acknowledge the authority of the Vedas, the principal Hindu scriptures) and four major unorthodox schools. The orthodox schools include Vaisheshika and Nyaya, which appear to have begun as separate traditions, but which merged some time before the eleventh century. The founding text of the Vaisheshika school, the Vaisheshikasutra, is attributed to a philosopher named Kaṇāda and was composed sometime between the fifth and the second century B.C.E. Like Aristotle’s Categories, the focus of the Vaisheshikasutra is on how we should divide up the kinds of things that exist. The Vaisheshikasutra presents a three-fold division into substance (dravya), quality (guna), and motion (karman). The substances are divided, in turn, into nine kinds. These are the five elements (earth, water, fire, air, and aether) with the addition of time, space, soul, and mind.

The early Vaisheshika commentators, Praśastapāda (ca.6th century) and Candrānanda (ca.8th century), expand the Vaisheshikasutra’s three-category division into what has become a canonical list of six categories. The additional categories are universal (samanya), particularity (vishesha), and inherence (samavaya), concepts which are also mentioned in the Vaisheshikasutra, but which are not, in that text, given the same prominence as substance, quality and motion (excepting one passage of a late edition which is of questionable authenticity).

The Sanskrit term translated as “substance”, dravya, comes from drú meaning wood or tree and has therefore a parallel etymology to Aristotle’s term for matter, hyle, which means wood in non-philosophical contexts. Nonetheless, it is widely recognised that the meaning of dravya is close to the meaning of Aristotle’s ousia: like Aristotle’s ousiai, dravyas are contrasted with quality and motion, they are distinguished by their ability to undergo change and by the fact that other things depend on them for their existence. McEvilley (2002, 526–7) lists further parallels.

At the same time, there exist important differences between the Vaisheshika approach to substance and that of Aristotle. One difference concerns the paradigmatic examples. Aristotle’s favourite examples of substances are individual objects, and it is not clear that he would count the five classical elements, soul, or mind, as substances. (Aristotle’s statements on these themes are ambiguous and interpretations differ.) Moreover, Aristotle would not class space or time as substances. This, however, need not be taken to show that the Vaisheshika and Aristotelian concepts of substance are themselves fundamentally different. For philosophers who inherit Aristotle’s concept of substance often disagree with Aristotle about its extension in respects similar to Vaisheshika philosophers.

A second difference between the Vaisheshika approach to substance and Aristotle’s is that according to Vaisheshika philosophers, composite substances (anityadravya, that is, noneternal substances), though they genuinely exist, do not persist through change. An individual atom of earth or water exists forever, but as soon as you remove a part of a tree, you have a new tree (Halbfass 1992, 96). A possible explanation for both differences between Vaisheshika and Aristotelian substances is that the former are not understood as compounds of matter and form but play rather a role somewhere between that of Aristotelian substances and Aristotelian matter.

Something closer to Aristotle’s position on this point is found in Jain discussions of substance, which appear to be indebted to the Vaisheshika notion, but which combine it with the idea of a vertical universal (urdhvatasamanya). The vertical universal plays a similar role to Aristotle’s substantial form, in that it accompanies an individual substance through nonessential modifications and can therefore account for its identity through material change.

The earliest parts of the Vaisheshikasutra are believed to have been authored between the fifth and second centuries B.C.E., with most parts being in place by the second century C.E. (Moise and Thite 2022, 46). This interval included a period of intense cultural exchange between Greece and India, beginning in the final quarter of the fourth century B.C.E. In view of the close parallels between the philosophy of Aristotle and that of the proponents of Vaisheshika, and of the interaction between the two cultures going on at this time, Thomas McEvilley (2002, 535) states that “it is possible to imagine stimulus diffusion channels” whereby elements of Vaisheshika’s thought “could reflect Greek, and specifically Peripatetic, influence”, including Aristotelian ideas about substance. However, it is also possible that the Vaisheshika and Aristotelian concepts of substance developed independently, despite their similarity.

b. Upanishadic Substrata

The paradigmatic examples of substances identified by Vaisheshika thinkers, like those identified by Aristotelians, are ordinary propertied things such as earth, water, humans and horses. Section 1.a noted that since the seventeenth century, the term “substance” has acquired another usage, according to which “substance” does not apply to ordinary propertied things, but to a putative underlying entity that is supposed to lack properties in itself but to combine with properties to yield substances of the ordinary sort. The underlying entity is often referred to as a substratum to distinguish it from substances in the traditional sense of the term. Although the application of the term “substance” to substrata only became well-established in the seventeenth century, the idea that substances can be analysed into properties and an underlying substratum is very old and merits attention here.

As already mentioned, the idea of a substratum is exemplified by the idea of prime matter traditionally attributed to Aristotle. An earlier precursor of this idea is the Presocratic Anaximander, according to whom the apeiron underlies everything that exists. Apeiron is usually translated “infinite”; however, in this context, a more illuminating (albeit etymologically parallel) translation would be “unlimited” or “indefinite”. Anaximander’s apeiron is a thing conceived of in abstraction from any characteristics that limit or define its nature: it is a propertyless substratum. It is reasonable, moreover, to attribute essentially the same idea to Anaximander’s teacher, Thales. For although Thales identified the thing underlying all reality as water, and not as the apeiron, once it is recognised that “water” here is used as a label for something that need not possess any of the distinctive properties of water, the two ideas turn out to be more or less the same.

Thales was the first of the Presocratics and, therefore, the earliest Western philosopher to whom the idea of a substratum can be attributed. Thomas McEvilley (2002) argues that it is possible to trace the idea of a substratum still further back to the Indian tradition. First, McEvilley proposes that Thales’ claim that everything is water resembles a claim advanced by Sanatkumara in the Chandogya Upanishad (ca.8th–6th century B.C.E.), which may well predate Thales. Moreover, just as we can recognise an approximation of the idea of a propertyless substratum in Thales’ claim, the same goes for Sanatkumara’s. McEvilley adds that even closer parallels can be found between Anaximander’s idea of the apeiron and numerous Upanishadic descriptions of brahman as that which underlies all beings, descriptions which, in this case, certainly appear much earlier.

The idea of substance in the sense of an underlying substratum can, therefore, be traced back as far as the Upanishads, and it is possible that the Upanishads influenced the Presocratic notion and, in turn, Aristotle. For there was significant Greek-Indian interchange in the Presocratic period, mediated by the Persian empire, and there is persuasive evidence that Presocratic thinkers had some knowledge of Upanishadic texts or of some unknown source that influenced both (McEvilley 2002, 28–44).

c. Buddhist Objections to Substance

The earliest sustained critiques of the notion of substance appear in Buddhist philosophy, beginning with objections to the idea of a substantial soul or atman. Early objections to the idea of a substantial soul are extended to substances in general by Nagarjuna, the founder of the Madhyamaka school, in around the second or third century C.E. As a result, discussions about substances would end up being central to the philosophical traditions across Eurasia in the succeeding centuries.

The earliest Buddhist philosophical texts are the discourses attributed to the Buddha himself and to his immediate disciples, collected in the Sutra Piṭaka. These are followed by the more technical and systematic Abhidharma writings collected in the Abhidhamma Piṭaka. The Sutra Piṭaka and the Abhidhamma Piṭaka are two of the three components of the Buddhist canon, the third being the collection of texts about monastic living known as the Vinaya Piṭaka. (The precise content of these collections differs in different Buddhist traditions, the Abhidhamma Piṭaka especially.)

The Sutra Piṭaka and the Abhidhamma Piṭaka both contain texts arguing against the idea of a substantial soul. According to the authors of these texts, the term atman is applied by convention to what is in fact a mere collection of mental and physical events. The Samyutta Nikaya, a subdivision of the Sutra Piṭaka, attributes a classic expression of this view to the Buddhist nun Vajira. Bhikkhu Bodhi (2000, 230) translates the relevant passage as follows:

Why now do you assume ‘a being’?
Mara, is that your speculative view?
This is a heap of sheer formations:
Here no being is found.

Just as, with an assemblage of parts,
The word ‘chariot’ is used,
So, when the aggregates exist,
There is the convention ‘a being’.

Although they oppose the idea of a substantial self, the texts collected in the Sutra Piṭaka and the Abhidhamma Piṭaka do not argue against the existence of substances generally. Indeed, Abhidharma philosophers analysed experiential reality into elements referred to as dharmas, which are often described in terms suggesting that they are substances (all the more so in later, noncanonical texts in the Abhidharma tradition).

The Madhyamaka school arose in response to Abhidharma philosophy as well as non-Buddhist schools such as Nyaya-Vaisheshika. In contrast to earlier Buddhist thought, its central preoccupation is the rejection of substances generally.

Madhyamaka means middle way. The school takes this name from its principal doctrine, which aims to establish a middle way between two opposing metaphysical views: realism (broadly the view that some things are ultimately real) and nihilism (the view that ultimately, nothing exists). Nagarjuna expresses the third alternative as the view that everything is characterised by emptiness (sunyata), which he explicates as the absence of svabhava. While svabhava has various interconnected meanings in Nagarjuna’s thought, it is mainly used to express the idea of substance understood as “any object that exists objectively, the existence and qualities of which are independent of other objects, human concepts, or interests” (Westerhoff 2009, 199).

Westerhoff (2009, 200–212) summarises several arguments against substance that can be attributed to Nagarjuna. These include an argument that substances could not stand in causal relations, an argument that substance could not undergo change, and an argument that there exists no satisfactory account of the relation between a substance and its properties. The first two appear to rule out substances only on the assumption that substances, if they exist at all, must stand in causal relations and undergo change, something that most, but not all, proponents of substances would hold. Regarding the self or soul, Nagarjuna joins with other Buddhist schools in arguing that what we habitually think of as a substantial self is in fact a collection of causally interconnected psychological and physical events.

The principal targets of Nagarjuna’s attacks on the concept of substance are Abhidharma and Nyaya-Vaisheshika philosophies. A central preoccupation of the Nyaya school is to respond to Buddhist arguments, including those against substance. It is possible that a secondary target is the concept of substance in Greek philosophy. As noted above, there is some evidence of influence between the Greek and Indian philosophical traditions in one or both directions. Greeks in India took a significant interest in Buddhism, with Greek converts contributing to Buddhist culture. The best known of these, Menander, a second-century B.C.E. king of Bactria, is one of the two principal interlocutors in the Milindapanha, a Buddhist philosophical dialogue that includes a famous presentation of Vajira’s chariot analogy.

There also exist striking parallels between the arguments of the Pyrrhonists, as recorded by Sextus Empiricus in around 200 C.E., and those of the Madhyamaka school founded by Nagarjuna at about the same time (McEvilley 2002; Neale 2014). Diogenes Laertius records that Pyrrho himself visited India with Alexander the Great’s army, spending time in Taxila, which would become a centre of Buddhist philosophy. Roman historians record flourishing trade between the Roman empire and India. There was, therefore, considerable opportunity for philosophical interchange during the period in question. Nonetheless, arguing against the idea of substance does not seem to have been as predominant a preoccupation for the Pyrrhonists as it was for the Madhyamaka philosophers.

3. Substance in Medieval Arabic and Islamic Philosophy

Late antiquity and the Middle Ages saw a decline in the influence of Greco-Roman culture in and beyond Europe, hastened by the rise of Islam. Nonetheless, the tradition of beginning philosophical education with Aristotle’s logical works, starting with the Categories, retained an enormous influence in Middle Eastern intellectual culture. (Aristotle’s work was read not only in Greek but also in Syriac and Arabic translations from the sixth and ninth centuries respectively.) The translation of Greek philosophical works into Arabic was accompanied by a renaissance in Aristotelian philosophy beginning with al-Kindi in the ninth century. Inevitably, this included discussions of the concept of substance, which is present throughout the philosophy of this period. Special attention is due to al-Farabi for an early detailed treatment of the topic and to Avicebron (Solomon ibn Gabirol) for his influential defence of the thesis that all substances must be material. Honourable mention is also due to the floating-man argument of Avicenna (Ibn Sina), which is widely seen as anticipating Descartes’ (in)famous disembodiment argument for the thesis that the mind is an immaterial substance.

a. Al-Farabi

The resurgence of Aristotelian philosophy in the Arabic and Islamic world is usually traced back to al-Kindi. Al-Kindi’s works on logic (the subject area to which the Categories is traditionally assigned) have however been lost, and with them any treatment of substance they might have contained. Thérèse-Anne Druart (1987) identifies al-Farabi’s discussion of djawhar, in his Book of Letters, as the first serious Arabic study of substance. There, al-Farabi distinguishes between the literal use of djawhar (meaning gem or ore), metaphorical uses to refer to something valuable or to the material of which something is constituted, and three philosophical uses as a term for substance or essence.

The first two philosophical uses of djawhar identified by al-Farabi approximate Aristotle’s primary and secondary substances. That is, in the first philosophical usage, djawhar refers to a particular that is not said of and does not exist in a subject. For example, an elephant. In the second philosophical usage, it refers to the essence of a substance in the first sense. For example, the species elephant. Al-Farabi adds a third use of djawhar, in which it refers to the essence of a non-substance. For example, to colour, the essence of the non-substance grey.

Al-Farabi says that the other categories depend on those of first and second substances and that this makes the categories of first and second substances more perfect than the others. He reviews alternative candidates for the status of djawhar put forward by unnamed philosophers. These include universals, indivisible atoms, spatial dimensions, mathematical points, and matter. The idea appears to be that these could turn out to be superior candidates for substances because they are more perfect. However, with one exception, al-Farabi does not discover anything more perfect than primary and secondary substances.

The exception is as follows. Al-Farabi claims that it can be proved that there exists a being that is neither in nor predicated of a subject and that is not a subject for anything else either. This being, al-Farabi claims, is more worthy of the term djawhar than the object-like primary substances, insofar as it is still more perfect. Although al-Farabi indicates that it would be reasonable to extend the philosophical usage of djawhar in this way, he does not propose to break with the established use. Insofar as “more perfect” means “more fundamental”, we see here the tension mentioned at the beginning of this article between the use of the term “substance” for object-like things and its use for whatever is most fundamental.

b. Avicebron (Solomon ibn Gabirol)

Avicebron was an eleventh century Iberian Jewish Neoplatonist. In addition to a large corpus of poetry, he wrote a philosophical dialogue, known by its Latin name, Fons Vitae (Fountain of Life), which would have a great influence on Christian scholastic philosophy in the twelfth and thirteenth centuries.

Avicebron’s principal contribution to the topic of substance is his presentation of the position known as universal hylomorphism. As explained in section 1, Aristotle defends hylomorphism, the view that material substances are composed of matter (hyle) and form (morphe). However, Aristotle does not extend this claim to all substances. He leaves room for the view that there exist many substances, including human intellects, that are immaterial. By late antiquity, a standard interpretation of Aristotle emerged, according to which such immaterial substances do in fact exist. By contrast, in the Fons Vitae, Avicebron defends the thesis that all substances, with the only exception of God, are composed of matter and form.

There is a sense in which Avicebron’s universal hylomorphism is a kind of materialism: he holds that created reality consists solely of material substances. It is however important not to be misled by this fact. For although they argue that all substances, barring God, are composed of matter and form, Avicebron and other universal hylomorphists draw a distinction between the ordinary matter that composes corporeal substances and the spiritual matter that composes spiritual substances. Spiritual matter plays the same role as ordinary matter in that it combines with a form to yield a substance. However, the resulting substances do not have the characteristics traditionally associated with material entities. They are not visible objects that take up space. Hence, universal hylomorphism would not satisfy traditional materialists such as Epicurus or Hobbes, who defend their position on the basis that everything that exists must take up space.

Scholars do not agree on what the case for universal hylomorphism is supposed to be. Paul Vincent Spade (2008) suggests that it results from two assumptions: that only God is metaphysically simple in all respects, and that anything that is not metaphysically simple in all respects is a composite of matter and form. However, Avicebron does not explicitly defend this argument, and it is not obvious why something could not qualify as non-simple in virtue of being complex in some way other than involving matter and form.

4. Substance in Medieval Scholastic Philosophy

In the early sixth century, Boethius set out to translate the works of Plato and Aristotle into Latin. This project was cut short when he was executed by Theodoric the Great, but Boethius still managed to translate Aristotle’s Categories and De Interpretatione. A century later, Isidore of Seville summarised Aristotle’s account of substance in the Categories in his Etymologiae, perhaps the most influential book of the Middle Ages after the Bible. As a result, the concept of substance introduced in Aristotle’s Categories remained familiar to philosophers after the fall of the Western Roman Empire. Nonetheless, prior to the twelfth century, philosophy in the Latin West consisted principally in elaborating on traditional views, inherited from the Church Fathers and other familiar authorities. It is only from the twelfth century onwards that philosophers made novel contributions to the topic of substance, influenced by Arabic-Islamic philosophy and by the recovery of ancient works by Aristotle and others. The most important are those of Thomas Aquinas and John Duns Scotus.

a. Thomas Aquinas

All the leading philosophers of this period adopted a version of Aristotle’s concept of substance. Many, and in particular those in the Franciscan order, such as Bonaventure, followed Avicebron in accepting universal hylomorphism. Aquinas’s main contribution to the topic of substance is his opposition to Avicebron’s position.

Aquinas endorses Aristotle’s definition of a substance as something that neither is said of, nor exists in, a subject, and he follows Aristotle in analysing material substances as composites of matter and form. However, Aquinas recognised a problem about how to square these views with his belief that some substances, including human souls, are immaterial.

Aquinas was committed to the view that, unlike God, created substances are characterised by potentiality. For example, before its bath, the elephant is actually hot but potentially cool. Aquinas takes the view that in material substances, it is matter that contributes potentiality. For matter is capable of receiving different forms. Since immaterial substances lack matter, it seems to follow that they also lack potentiality. Aquinas is happy to accept this conclusion respecting God whom he regards as pure act. He is however not willing to say the same of other immaterial substances, such as angels and human souls, which he takes to be characterised by potentiality no less than material substances.

One solution would be to adopt the universal hylomorphism of Avicebron, but Aquinas rejects this position on the basis that the potentiality of matter, as usually understood, consists ultimately in its ability to move through space. If so, it seems that matter can only belong to spatial, and hence corporeal, beings (Quaestiones Disputatae de Anima, 24.1.49.142–164).

Instead, Aquinas argues that although immaterial substances are not composed of matter and form, they are composed of essence and existence. In immaterial substances, it is their essence that contributes potentiality. This account of immaterial substances presupposes that existence and essence are distinct, an idea that had been anticipated by Avicenna as a corollary of his proof of God’s existence. Aquinas defends the distinction between existence and essence in De Ente et Essentia, though scholars disagree about how exactly the argument should be understood (see Gavin Kerr’s article on Aquinas’s Metaphysics).

Aquinas recognises that one might be inclined to refer to incorporeal potentiality as matter simply on the basis that it takes on, in spiritual substances, the role that matter plays in corporeal substances. However, he takes the view that this use of the term “matter” would be equivocal and potentially misleading.

A related, but more specific, contribution by Aquinas concerns the issue of how a human soul, if it is the form of a hylomorphic compound, can nonetheless be an immaterial substance in its own right, capable of existing without the body after its death. Aquinas compares the propensity of the soul to be embodied to the propensity of lighter objects to rise, observing that in both cases, the propensity can be obstructed while the object remains in existence. For more on this issue, see Christopher Brown’s article on Thomas Aquinas.

b. Duns Scotus

Like Aquinas, Scotus adopts the Categories’ account of substance. In contrast to earlier Franciscans, he agrees with Aquinas’s rejection of universal hylomorphism. Indeed, Scotus goes even further, claiming not only that form can exist without matter, but also that prime matter can exist without form. As a result, Scotus is committed to the view that matter has a kind of formless actuality, something that, in Aquinas’s system, looks like a contradiction.

Although he drops the doctrine of universal hylomorphism, Scotus maintains, against Aquinas, a second thesis concerning substances that is associated with Franciscan philosophers and often paired with universal hylomorphism: the view that a single substance can have multiple substantial forms (Ordinatio, 4).

According to Aquinas, a substance has only one substantial form. For example, the substantial form of an elephant is the species elephant. The parts of the elephant, such as its organs, do not have their own substantial forms. Because substantial forms are responsible for the identity of substances over time, this view has the counterintuitive consequence that when, for example, an organ transplant takes place, the organ acquired by the recipient is not the one that was possessed by the donor.

According to Scotus, by contrast, one substance can have multiple substantial forms. For example, the parts of the elephant, such as its organs, may each have their own substantial form. This allows followers of Scotus to take the intuitive view that when an organ transplant takes place, the organ acquired by the recipient is one and the same as the organ that the donor possessed, and not a new entity that has come into existence after the donor’s death. (Aristotle seems to endorse the position of Scotus in the Categories, and that of Aquinas in the Metaphysics.)

Scotus is also known for introducing the idea that every substance has a haecceity (thisness), that is, a property that makes it the particular thing that it is. In this, he echoes the earlier Vaisheshika idea of a vishesha (usually translated “particularity”) which plays approximately the same role (Kaipayil 2008, 79).

5. Substance in Early Modern Philosophy

Prior to the early modern period, Western philosophers tend to adopt both Aristotle’s definition of substance in the Categories and his analysis of material substances into matter and form. In the early modern period, this practice begins to change, with many philosophers offering new characterisations of substance, or rejecting the notion of substance entirely. The most influential contribution from this period is Descartes’ independence definition of substance. Although many earlier philosophers have been interpreted as saying that substances are things that have independent existence, Descartes appears to be the first prominent thinker to say this explicitly. Descartes’ influence, respecting this and other topics, was reinforced by Antoine Arnauld and Pierre Nicole’s Port-Royal Logic, which, towards the end of the seventeenth century, took the place of Aristotle’s Categories as the leading introduction to philosophy. Important contributions to the idea of substance in this period are also made by Spinoza, Leibniz, Locke and Hume, all of whom are known for resisting some aspect of Descartes’ account of substance.

a. Descartes

Substance is one of the central concepts of Descartes’ philosophy, and he returns to it on multiple occasions. In the second set of Objections and Replies to the Meditations on First Philosophy, Descartes advances a definition of substance that resembles Aristotle’s definition of substance in the Categories. This is not surprising given that Descartes underwent formal training in Aristotelian philosophy at the Royal College of La Flèche, France. In a number of other locations, however, Descartes offers what has been called the independence definition of substance. According to the independence definition, a substance is anything that could exist by itself or, equivalently, anything that does not depend on anything else for its existence (Oeuvres, vol. 7, 44, 226; vol. 3, 429; vol. 8a, 24).

Scholars disagree about how exactly we should understand Descartes’ independence definition. Some have argued that Descartes’ view is that substances must be causally independent, in the sense that they do not require anything else to cause them to exist. Another, perhaps more popular, view is that, for Descartes, substances are modally independent, meaning that the existence of a substance does not necessitate the existence of any other entity. This interpretation itself has several variants (see Weir 2021, 281–7).

In addition to offering a new definition of substance, Descartes draws a distinction between a strict and a more permissive sense of the term. A substance in the strict sense satisfies the independence definition without qualification. Descartes claims that there is only one such substance: God. For everything else depends on God for its existence. Descartes adds, however, that we can count as created substances those things that depend only on God for their existence. Descartes claims that finite minds and bodies qualify as created substances in this sense, whereas their properties (attributes, qualities and modes in his terminology) do not.

It is possible to view Descartes’ independence definition of substance as a disambiguation of Aristotle’s definition of substance in the Categories. Aristotle says that substances do not depend, for their existence, on any other being of which they must be predicated or in which they must inhere. He does not however say explicitly whether substances depend in some other way on other things for their existence. Descartes clarifies that they do not. This is consistent with, and may even be implied by, what Aristotle says in the Categories.

In other respects, Descartes’ understanding of substance departs dramatically from the Aristotelian orthodoxy of his day. For example, while Descartes accepts Aristotle’s claim that in the case of a living human, the soul serves as the form of the body, he exhibits little or no sympathy for hylomorphism beyond this. Rather than analysing material substances into matter and form like Aristotle, or substances in general into potency and act like Aquinas, Descartes proposes that every substance has, as its principal attribute, one of two properties, namely extension or thought, and that all accidental properties of substances are modes of their principal attribute. For example, being elephant-shaped is a mode of extension, and seeing sunlight glimmer on a lake is a mode of thought. In contrast to the scholastic theory of real accidents, Descartes holds that these modes are only conceptually distinct from, and cannot exist without, the substances to which they belong.

One consequence is that Descartes appears to accept what has come to be known as the bundle view of substances: the thesis that, in his words, “the attributes all taken together are the same as the substance” (Conversation with Burman, 7). To put it another way, once we have the principal attribute of the elephant—extension—and all of the accidental attributes, such as its size, shape, texture and so on, we have everything that this substance comprises. These attributes do not need to be combined with a propertyless substratum. (The bundle view, in the relevant sense, contrasts with the substratum view, according to which a substance is composed of properties and a substratum. Sometimes, the term “bundle view” is used in a stronger sense, to imply that the properties that make up a substance could exist separately, but Descartes does not endorse the bundle view in this stronger sense.)

A further consequence is that Descartes could not accept the standard transubstantiation account of the eucharist, which depended on the theory of real accidents, and was obliged to offer a competing account.

In the late seventeenth century, two followers of Descartes, Antoine Arnauld and Pierre Nicole, set out to author a modern introduction to logic that could serve in place of the texts of Aristotle’s Organon, including the Categories. (The word “logic” is used here in a traditional sense that is significantly broader than the sense that philosophers in the early twenty-first century would attribute to it, and that includes much of what these philosophers would recognise as metaphysics.) The result was La logique ou l’art de penser, better known as the Port-Royal Logic, a work that had an enormous influence on the next two centuries of philosophy. The Port-Royal Logic offers the following definition of substance:

I call whatever is conceived as subsisting by itself and as the subject of everything conceived about it, a thing. It is otherwise called a substance. […] This will be made clearer by some examples. When I think of a body, my idea of it represents a thing or a substance, because I consider it as a thing subsisting by itself and needing no other subject to exist. (30–21)

This definition combines Aristotle’s idea that a substance is the subject of other categories and Descartes’ claim that a substance does not need other things to exist. It is interesting to note here a shift in focus from what substances are to how they are conceived or considered. This reflects the general shift in focus from metaphysics to epistemology that characterised philosophy after Descartes.

b. Spinoza

Influential philosophers writing after Descartes tend to use Descartes’ views as a starting point, criticising or accepting them as they deem reasonable. Hence, a number of responses to Descartes’ account of substance appear in the early modern period.

In the only book published under his name in his lifetime, the 1663 Principles of Cartesian Philosophy, Spinoza endorses both Descartes’ definition of substance in the Second Replies (which is essentially Aristotle’s definition in the Categories) and the independence definition introduced in the Principles of Philosophy and elsewhere. Spinoza also endorses Descartes’ distinction between created and uncreated substances, his rejection of substantial forms and real accidents, and his division of substances into extended substances and thinking substances.

In the Ethics, published posthumously in 1677, Spinoza develops his own approach to these issues. Spinoza opens the Ethics by stating that “by substance I understand what is in itself and is conceived through itself”. Shortly after this, in the first of his axioms, he adds that “Whatever is, is either in itself or in another”. Spinoza’s contrast between substance, understood as those things that are in themselves, and non-substances, understood as those things that are in another, reflects the distinction introduced by Plato in the Sophist and taken up by countless later thinkers from antiquity onwards. As in the Port-Royal Logic, Spinoza’s initial definition of substance in terms of how it is conceived reflects the preoccupation of early modern philosophy with epistemology.

Spinoza clarifies the claim that a substance is conceived through itself by describing a substance as that “the conception of which does not require for its formation the conception of anything else”. This might mean that something is a substance if and only if it is possible to conceive of its existing by itself. If so, then Spinoza’s definition might be interpreted as an epistemological rewriting of Descartes’ independence definition.

Spinoza purports to show, on the basis of various definitions and axioms, that there can only be one substance, and that this substance is to be identified with God. What Descartes calls created substances are really modes of God. This conclusion is sometimes represented as a radical departure from Descartes. This is misleading, however. For Descartes also holds that only God qualifies as a substance in the strict sense of the word “substance”. To this extent, Spinoza is no more monistic than Descartes.

Spinoza’s Ethics does however depart from Descartes in (i) not making use of a category of created substances, and (ii) emphasising that those things that Descartes would class as created substances are modes of God. Despite this, Spinoza’s theory is not obviously incompatible with the existence of created substances in Descartes’ sense of the term, even if he does not make use of the category himself. It is plausibly a consequence of Descartes’ position that created substances are, strictly speaking, modes of God, even if Descartes does not state this explicitly.

c. Leibniz

In his Critical Thoughts on Descartes’ Principles of Philosophy, Leibniz raises the following objection to Descartes’ definition of created substances as things that depend only on God for their existence:

I do not know whether the definition of substance as that which needs for its existence only the concurrence of God fits any created substance known to us. […] For not only do we need other substances; we need our own accidents even much more. (389)

Leibniz does not explicitly explain here why substances should need other substances, setting aside God, for their existence. Still, his claim that substances need their own accidents is an early example of an objection that has had a significant degree of influence in the literature of the twentieth and twenty-first centuries on substance. According to this objection, nothing could satisfy Descartes’ independence definition of substance because every candidate substance (an elephant or a soul, for example) depends for its existence on its own properties. This objection is further discussed in section 6.

In the Discourse on Metaphysics, Leibniz does provide a reason for thinking that created substances need other substances to exist. There, he begins by accepting something close to Aristotle’s definition of substance in the Categories: a substance is something of which other things are predicated, but which is not itself predicated of anything else. However, Leibniz claims that this characterisation is insufficient, and sets out a novel theory of substance, according to which the haecceity of a substance includes everything true of it (see section 4.b for the notion of haecceity). Accordingly, Leibniz holds that from a perfect grasp of the concept of a particular substance, one could derive all other truths.

It is not obvious how Leibniz arrives at this unusual conception of substance, but it is clear that if the haecceity of one substance includes everything that is true of it, this will include the relationships in which it stands to every other substance. Hence, on Leibniz’s view, every substance turns out to necessitate, and so to depend modally on, every other for its existence, a conclusion that contrasts starkly with Descartes’ position.

Leibniz’s view illustrates the fact that it is possible to accept Aristotle’s definition of substance in the Categories while rejecting Descartes’ independence definition. Leibniz clearly agrees with Aristotle that a substance does not have to be said of or to exist in something in the way that properties do. However, he holds that substances depend for their existence on other things in a way that contradicts Descartes’ independence definition.

Leibniz’s enormous corpus makes a number of other distinctive claims about substances. The most important of these are the characterisation of substances as unities and as things that act, both of which can be found in his New Essays on Human Understanding. These ideas have precursors as far back as Aristotle, but they receive special emphasis in Leibniz’s work.

d. British Empiricism

Section 1 mentions that since the seventeenth century, a new usage of the term “substance” has become prevalent, on which it refers not to an object-like thing, such as an elephant, but to an underlying substratum that must be combined with properties to yield an object-like thing. On this usage, an elephant is a combination of properties, such as its shape, size and colour, and the underlying substance in which these properties inhere. The substance in this sense is often described as having no properties in itself, and therefore resembles Aristotelian prime matter more than the objects that serve as examples of substances in earlier traditions.

This new usage of “substance” is standardly traced back to Locke’s Essay Concerning Human Understanding, where he states that:

Substance [is] nothing, but the supposed, but unknown support of those qualities, we find existing, which we imagine cannot subsist, sine re substante, without something to support them, we call that support substantia; which, according to the true import of the word, is in plain English, standing under, or upholding. (II.23.2)

This and similar statements in Locke’s Essay initiated a longstanding tradition in which British empiricists, including Berkeley, Hume, and Russell, took for granted that the term “substance” typically refers to a propertyless substratum and criticised the concept on that basis.

Scholars debate whether Locke actually intended to identify substances with propertyless substrata. There are two main interpretations. On the traditional interpretation, associated with Leibniz and defended by Jonathan Bennett (1987), Locke uses the word “substance” to refer to a propertyless substratum that we posit to explain what supports the collections of properties that we observe, although Locke is sceptical of the value of this idea, since it stands for something of whose nature we are entirely ignorant. (Those who believe that Locke intended to identify substances with propertyless substrata disagree regarding the further issue of whether Locke reluctantly accepts or ultimately rejects such entities.)

The alternative interpretation, defended by Michael Ayers (1977), agrees that Locke identifies substance with an unknown substratum that underlies the collections of properties we observe. However, on this view, Locke does not regard the substratum as having no properties in itself. Rather, he holds that these properties are unknown to us, belonging as they do to the imperceptible microstructure of their bearer. This microstructure is posited to explain why a given cluster of properties should regularly appear together. On this reading, Locke’s substrata play a similar role to Aristotle’s secondary substances or Jain vertical universals in that they are the essences that explain the perceptible properties of objects. The principal advantage of this interpretation is that it explains how Locke can endorse the idea of a substratum while recognising the (apparent) incoherence of the idea of something having no properties in itself. The principal disadvantages of this interpretation include the meagre textual evidence in its favour and its difficulty accounting for Locke’s disparaging comments about the idea of a substratum.

Forrai (2010) suggests that the two interpretations of Locke’s approach to substances can be reconciled if we suppose that Locke takes our actual idea of substance to be that of a propertyless substratum while holding that we only think of that substratum as propertyless because we are ignorant of its nature, which is in fact that of an invisible microstructure.

In the passages traditionally interpreted as discussing the idea of a propertyless substratum, Locke refers to it as the “idea of substance in general”. In other passages, Locke discusses our ideas of “particular sorts of substances”. Locke’s particular sorts of substances resemble the things referred to as substances in earlier traditions. His examples include humans, horses, gold, and water. These, Locke claims, are in fact just collections of simple ideas that regularly appear together:

We come to have the Ideas of particular sorts of Substances, by collecting such Combinations of simple Ideas, as are by Experience and Observation of Men’s Senses taken notice of to exist together, and are therefore supposed to flow from the particular internal Constitution, or unknown Essence of that Substance. (Essay, II.23.3)

The idea of an elephant, on this view, is really just a collection comprising the ideas of a certain colour, a certain shape, and so on. Locke seems to take the view that the distinctions we draw between different sorts of substances are somewhat arbitrary and conventional. That is, the word “elephant” may not refer to what the philosophers of the twentieth and twenty-first centuries would consider a natural kind. Hence, substances in the traditional sense turn out to be subject-dependent in the sense that the identification of some collection of ideas as a substance is not an objective, mind-independent fact, but depends on arbitrary choices and conventions.

Locke’s comments about substance, particularly those traditionally regarded as identifying substances with propertyless substrata, had a great influence on Berkeley and Hume, both of whom followed Locke in treating substances as substrata and in criticising the notion on this basis, while granting the existence of substances in the deflationary sense of subject-dependent collections of ideas.

Berkeley’s position is distinctive in that he affirms an asymmetry between perceptible substances, such as elephants and vases, and spiritual substances, such as human and divine minds. Berkeley agrees with Locke that our ideas of perceptible substances are really just collections of ideas and that we are tempted to posit a substratum in which these ideas exist. Unlike Locke, Berkeley explicitly says that in the case of perceptible objects at least, we should posit no such thing:

If substance be taken in the vulgar sense for a combination of qualities such as extension, solidity, weight, and the like […] this we cannot be accused of taking away; but if it be taken in the philosophic sense for the support of accidents or qualities without the mind, then I acknowledge that we take it away. (Principles, 1.37)

Berkeley’s rejection of substrata in the case of material objects is not necessarily due to a rejection of the idea of substrata in general, however. It may be that Berkeley rejects substrata for material substances only, and does so solely on the basis that, according to his idealist metaphysics, those properties that make up perceptible objects really inhere in the minds of the perceivers.

Whether or not Berkeley thinks that spiritual substances involve propertyless substrata is hard to judge and it is not clear that Berkeley maintains a consistent view on this issue. On the one hand, Berkeley’s published criticisms of the idea of a substratum tend to focus exclusively on material objects, suggesting that he is not opposed to the existence of a substratum in the case of minds. On the other hand, several passages in Berkeley’s notebooks assert that there is nothing more to minds than the perceptions they undergo, suggesting that Berkeley rejects the idea of substrata in the case of minds as well (see in particular his Notebooks, 577 and 580). The task of interpreting Berkeley on this point is complicated by the fact that the relevant passages are marked with a “+”, which some but not all scholars interpret as indicating Berkeley’s dissatisfaction with them.

Hume’s Treatise of Human Nature echoes Locke’s claim that we have no idea of what a substance is and that we have only a confused idea of what a substance does. Although Hume does not explicitly state that these criticisms are intended to apply to the idea of substances as propertyless substrata, commentators tend to agree that this is his intention (see for example Baxter 2015). Hume seems to agree with Locke (as traditionally interpreted) that we introduce the idea of a propertyless substratum in order to make sense of the unity that we habitually attribute to what are in fact mere collections of properties that regularly appear together. Hume holds that we can have no idea of this substratum because any such idea would have to come from some sensory or affective impression while, in fact, ideas derived from sensory and affective impressions are always of accidents—that is, of properties.

Hume grants that we do have a clear idea of substances understood as Descartes defines them, that is, as things that can exist by themselves. However, Hume asserts that this definition applies to anything that we can think of, and hence, that to call something a substance in this sense is not to distinguish it from anything else.

Hume further argues that we can make no sense of the idea of the inherence relation that is supposed to exist between properties and the substances to which they belong. For the inherence relation is taken to be the relation that holds between an accident and something without which it could not exist (as per Aristotle’s description of inherence in the Categories, for example). According to Hume, however, nothing stands in any such relation to anything else. For he makes it an axiom that anything that we can distinguish in thought can exist separately in reality. It follows that not only do we have no idea of a substratum, but no such thing can exist, either in the case of perceptible objects or in the case of minds. For a substratum is supposed to be that in which properties inhere. It is natural to see Hume’s arguments on this topic as the culmination of Locke’s more circumspect criticisms of substrata.

It follows from Hume’s arguments that the entities that earlier philosophers regarded as substances, such as elephants and vases, are in fact just collections of ideas, each member of which could exist by itself. Hume emphasises that, as a consequence, the mind really consists in successive collections of ideas. Hence, Hume adopts a bundle view of the mind and other putative substances not only in the moderate sense that he denies that minds involve a propertyless substratum, but in the extreme sense that he holds that they are really swarms of independent entities.

There exists a close resemblance between Hume’s rejection of the existence of complex substances and his emphasis on the nonexistence of a substantial mind in particular, and the criticisms of substance advanced by Buddhist philosophers and described in section 2. It is possible that Hume was influenced by Buddhist thought on this and other topics during his stay at the Jesuit College of La Flèche, France, in 1735–37, through the Jesuit missionary Charles François Dolu (Gopnik 2009).

Although not himself a British empiricist (though see Stephen Priest’s (2007, 262 fn. 40) protest on this point), Kant developed an approach to substance in the tradition of Locke, Berkeley and Hume, with a characteristically Kantian twist. Kant endorses a traditional account of substance, according to which substances are subjects of predication and are distinguished by their capacity to persist through change. However, Kant adds that the category of substance is something that the understanding imposes upon experience, rather than something derived from our knowledge of things in themselves. For Kant, the category of substance is, therefore, a necessary feature of experience, and to that extent, it has a kind of objectivity. Kant nonetheless agrees with Locke, Berkeley (respecting material substances) and Hume that substances are subject-dependent. (See Messina (2021) for a complication concerning whether we might nonetheless be warranted in applying this category to things in themselves.)

While earlier thinkers beginning with Aristotle asserted that substances can persist through change, Kant goes further, claiming that substances exist permanently and that their doing so is a necessary condition for the unity of time. It seems to follow that for Kant, composites such as elephants or vases cannot be substances, since they come into and go out of existence. Given that Kant also rejects the existence of indivisible atoms in his discussion of the second antinomy, the only remaining candidate for a material substance in Kant appears to be matter taken as a whole. For an influential exposition, see Strawson (1997).

6. Substance in Twentieth-Century and Early-Twenty-First-Century Philosophy

The concept of substance lost its central place in philosophy after the early modern period, partly as a result of the criticisms of the British empiricists. However, philosophers of the twentieth and early twenty-first centuries have shown a revival of interest in the idea, with several philosophers arguing that we need to accept the concept of substance to account for the difference between object-like and property-like things, or to account for which entities are fundamental, or to address a range of neighbouring metaphysical issues. Discussions have centred on two main themes: the criteria for being a substance, and the structure of substances. O’Conaill (2022) provides a detailed overview of both. Moreover, in the late twentieth century, the concept of substance gained an important role in philosophy of mind, where it has been used to mark the difference between two kinds of mind-body dualism: substance dualism and property dualism.

a. Criteria for Being a Substance

As noted at the beginning of this article, the term “substance” has two main uses in philosophy. Some philosophers use this word to pick out those things that are object-like in contrast to things that are property-like (or, for some philosophers, event-like or stuff-like). Others use it to pick out those things that are fundamental, in contrast to things that are non-fundamental. Both uses derive from Aristotle’s Categories, which posits that the object-like things are the fundamental things. For some thinkers, however, object-like-ness and fundamentality come apart. When philosophers attempt to give precise criteria for being a substance, they tend to have one of two targets in mind. Some have in mind the task of stating what exactly makes something object-like, while others have in mind the task of stating what exactly makes something fundamental. Koslicki (2018, 164–7) describes the two approaches in detail. Naturally, this makes a difference to which criteria for being a substance seem reasonable, and occasionally this has resulted in philosophers talking past one another. Nonetheless, the hypothesis that the object-like things are the fundamental things is either sufficiently attractive, or sufficiently embedded in philosophical discourse, that there exists considerable overlap between the two approaches.

The most prominent criterion for being a substance in the philosophy of the beginning of the twenty-first century is independence. Many philosophers defend, and even more take as a starting point, the idea that what makes something a substance is the fact that it does not depend on other things. Philosophers differ, however, on what kind of independence is relevant here, and some have argued that independence criteria are unsatisfactory and that some other criterion for being a substance is needed.

The most common independence criteria for being a substance characterise substances in terms of modal (or metaphysical) independence. One thing a is modally independent of another thing b if and only if a could exist in the absence of b. The idea that substances are modally independent is attractive for two reasons. First, it seems that properties, such as shape, size or colour, could not exist without something they belong to—something they are the shape, size or colour of. In other words, property-like things seem to be modally dependent entities. By contrast, object-like things, such as elephants or vases, do not seem to depend on other things in this way. An elephant need not be the elephant of some elephant-having being. Therefore, one could argue for the claim that object-like things differ from property-like things by saying that the former are not modally dependent on other entities, while the latter are. Secondly, modally independent entities are arguably more fundamental than modally dependent entities. For example, it is tempting to say that modally independent entities are the basic elements that make up reality, whereas modally dependent entities are derivative aspects or ways of being that are abstracted from the modally independent entities.

Though attractive, the idea that substances are modally independent faces some objections. The most influential objection says that nothing is modally independent because nothing can exist without its own parts and/or properties (see Weir (2021, 287–291) for several examples). For example, an elephant might not have to be the elephant of some further, elephant-having being, but an elephant must have a size and shape, and countless material parts. An elephant cannot exist without a size, a shape and material parts, and so there is a sense in which an elephant is not modally independent of these things.

Several responses have been suggested. First, one might respond by drawing a distinction between different kinds of modal dependence (see, for example, Lowe 1998, 141; Koslicki 2018, 142–44). For instance, we might say that a is rigidly dependent on b if and only if a cannot exist without b, whereas a is generically dependent on entities of kind F if and only if a cannot exist without some entity of kind F. This allows us to distinguish between something that is weakly modally independent, in that there is no entity upon which it is rigidly dependent, and something that is strongly modally independent, in that there is no kind of entity on which it is generically dependent. It might then be argued that substances need only be weakly modally independent. Hence, the fact that an elephant cannot exist without having properties and parts of certain kinds will not disqualify it as a substance, so long as there is no particular, individual part or property that it must have. It is acceptable, for example, that an elephant must have countless carbon atoms as parts, so long as it can do without any given carbon atom (which, presumably, it can).
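The distinctions just drawn can be stated a little more formally. What follows is only a minimal sketch, not a formulation taken from the authors cited: the existence predicate E!, the modal operators, the abbreviations Ind, RD and GD, and the restriction to entities distinct from a are all assumptions introduced here for illustration.

```latex
\begin{align*}
\text{Modal independence:}\quad
  & \mathrm{Ind}(a,b) \iff \Diamond\,(E!a \land \lnot E!b) \\
\text{Rigid dependence:}\quad
  & \mathrm{RD}(a,b) \iff \Box\,(E!a \rightarrow E!b) \\
\text{Generic dependence:}\quad
  & \mathrm{GD}(a,F) \iff \Box\,\bigl(E!a \rightarrow \exists x\,(Fx \land x \neq a)\bigr) \\
\text{Weak modal independence of } a\text{:}\quad
  & \lnot\exists b\,\bigl(b \neq a \land \mathrm{RD}(a,b)\bigr) \\
\text{Strong modal independence of } a\text{:}\quad
  & \lnot\exists F\,\mathrm{GD}(a,F)
\end{align*}
```

On this rendering, Ind(a,b) is simply the negation of RD(a,b). An elephant that needs some carbon atoms or other, but no carbon atom in particular, is generically dependent on the kind carbon atom without being rigidly dependent on any individual carbon atom, which is the situation the first response is designed to allow.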

The problem with this response is that many putative examples of substances seem to have necessary parts or properties upon which they rigidly depend. For example, it is plausible that King Dutagamuna’s renowned elephant, Kandula, could have existed without some of his properties, such as that of exhibiting heroism at the siege of Vijitanagara. It is not plausible, however, that Kandula could have existed without certain others, such as that of being the unique member of the singleton set {Kandula}. Yet this does not seem like the kind of fact that should undermine Kandula’s claim to be a substance. Likewise, it is plausible that a given H2O molecule could not exist without the particular hydrogen atom it contains, and yet most philosophers would hesitate to conclude on this basis that an H2O molecule is not a substance.

A second kind of response to the dependence of substances on their properties and parts replaces modal independence with some other variety. One strategy of this kind appeals to the idea of a non-modal essence (see Fine 1994, 1995). Proponents of non-modal essences claim that things have essences that are narrower—that is, include less—than their necessary parts and properties. For example, it can be argued that although Kandula necessarily belongs to the set {Kandula}, this is not part of Kandula’s essence. After all, it is plausible that one could grasp what it is for Kandula to exist without ever thinking about the fact that Kandula belongs to the set {Kandula}. The fact that Kandula belongs to {Kandula} seems more like a side-effect of Kandula’s nature than a part of his nature. If we accept that things have non-modal essences, then it will be possible to propose that something is a substance if and only if it does not essentially depend on other entities—that is, if and only if no other entity is part of its non-modal essence.

The proposal that substances are essentially independent, in the sense specified, promises to get around the concern that Kandula fails to qualify as a substance because Kandula necessarily belongs to the set {Kandula}. However, other problems remain. For it is plausible that some entities of the sort that intuitively count as substances have some particular properties or parts essentially, and not merely necessarily. It is plausible, for example, that the particular hydrogen atom in a given H2O molecule is not only necessary to it but is also a part of its non-modal essence: a part of what it is for this H2O molecule to exist rather than some other H2O molecule is that it should contain this particular hydrogen atom. Yet, it is not obvious that this should disqualify the H2O molecule’s claim to be a substance.

Other responses that replace modal independence with some other variety include E. J. Lowe’s (1998; 2005) identity-independence and Benjamin Schnieder’s (2006) conceptual-independence criteria for substance. Like the essential-independence criterion, these get around at least some of the problems facing the simple modal independence criterion.

A more complex strategy is taken up by Joshua Hoffman & Gary Rosenkrantz (1997). Hoffman and Rosenkrantz introduce a hierarchy of categories with entity at level A, abstract and concrete at level B, and so on. After a lengthy discussion, they formulate the following definition:

x is a substance = df. x instantiates a level C category, C1, such that: (i) C1 could have a single instance throughout an interval of time, and (ii) C1’s instantiation does not entail the instantiation of another level C category which could have a single instance throughout an interval of time, and (iii) it is impossible for C1 to have an instance which has as a part an entity which instantiates another level C category, other than Concrete Proper Part, and other than Abstract Proper Part. (65)

For a full understanding of their approach, it is necessary to refer to Hoffman and Rosenkrantz’s text. However, the definition quoted is enough to illustrate how their strategy addresses the dependence of substances on their properties and parts. In short, Hoffman and Rosenkrantz retain a criterion of independence but qualify that criterion in two ways. First, on their definition, it is only necessary that some substances should satisfy the independence criterion. Substances that do not satisfy the criterion count as substances in virtue of being, in some other respect, the same kinds of entities as those that do. Secondly, even those substances that satisfy the independence criterion need only be able to exist without a carefully specified class of entities, namely those belonging to a “level C category which could have a single instance throughout an interval of time”.

Hoffman and Rosenkrantz’s definition of substance is carefully tailored to avoid the objection that substances do depend on their properties and parts, as well as a number of other objections. A drawback is that they leave it unclear what it is that unifies the category of substances, given that they only require that some substances should satisfy their qualified independence criterion.

Perhaps the simplest response to the dependence of substances on their properties and parts maintains that while a substance must be independent of all other entities, “other entities” should be taken to refer to things that are not included in the substance. This approach is proposed by Michael Gorman (2006, 151) and defended at length by Weir (2021). According to this response, while it is true that an elephant cannot exist without a shape, a size and countless material parts, this does not mean that the elephant cannot exist by itself or without anything else in the sense required for it to be a substance. For the elephant’s shape, size, and material parts are included in it. By contrast, the reason why property-like things, such as the shape of the elephant, do not count as substances is that they are incapable of existing without something that is not included in them. The shape of the elephant, for example, can only exist by being the shape of something that includes more than just the shape. Weir (2021, 296) suggests that the fact that the elephant includes the shape and not vice versa can be seen from the fact that it is possible to start with the whole elephant and subtract elements such as its colour, weight and so on, until one is left with just the shape, whereas it is not possible to start with just the shape and, by subtracting elements, arrive at the whole elephant.

Several other objections to independence criteria deserve mention. First, if there exist necessary beings, such as numbers or God, then trivially, no candidate substance will be able to exist without them. Secondly, if particulars necessarily instantiate abstract universals (if, for example, an elephant necessarily instantiates universals, such as grey, concretum, or animal), then no candidate substance will be able to exist without abstract universals. Thirdly, if space and time are something over and above their occupants (as they are on substantivalist theories of space and time), then no spatial or temporal substance will be able to exist without these. Some of the strategies for dealing with the dependence of substances on their properties and parts can be transferred to these issues. Other strategies have also been proposed. There exists no consensus on whether one or more independence criteria can satisfactorily be defended against such objections.

Those who deny that some independence criterion is necessary for being a substance, or who hold that an independence criterion needs to be supplemented, have proposed alternative criteria. Two popular options have been subjecthood and unity.

In the Categories, Aristotle introduces substances as those things that are subjects of predication and inherence and are neither predicated of nor inherent in anything else. Since he characterises predication and inherence as dependence relations, many readers have inferred that substances are to be distinguished by their independence. However, philosophers who are hesitant about relying on independence criteria often focus on the initial claim that substances are subjects of predication and inherence that are not predicated of, nor inherent in, other things; or, as it is often put, substances are property bearers that are not themselves properties (see, for example, Heil 2012, 12–17).

One difficulty for the subjecthood or property-bearer criterion for being a substance is that it is vulnerable to the objection that the distinctions we draw between properties and subjects of properties are arbitrary. For example, instead of saying that there is an elephant in the room, we might say that the room is elephant-ish. If we do so, it will no longer be true that elephants are subjects of predication that are not predicated of other things. A proponent of the independence criterion is in a position to assert that our ordinary linguistic practices reflect a deeper metaphysical fact: the reason why we do not say that the room is elephant-ish is that the elephant does not depend for its existence on the room in the way that properties depend on their bearers. Those who rely on the subjecthood criterion by itself cannot reply in this way.

Since Leibniz, many philosophers have proposed that substances are distinguished, either partly or solely, by their high degree of unity. In its extreme form, the criterion of unity says that substances must be simples in the sense that they have no detachable parts. Heil (2012, 21) argues that the simplicity criterion follows from the assumption that substances are property-bearers. For according to Heil, no composite can genuinely bear a property. Schaffer (2010) argues for the simplicity of substances on the basis of parsimony. He proposes that duplicating all the simple entities and their relations to one another would be sufficient to duplicate the entire cosmos, and that if this is so, then there is no good reason to posit further entities beyond the simple entities. Schaffer also argues that the fundamental entities that we posit should be “freely recombinable”, in the sense that the intrinsic properties of one such entity do not constrain the intrinsic properties of another, and that this will only be so if the fundamental entities are simples.

It is widely agreed that even if substances need not be simples, they must nonetheless satisfy some criterion of unity that prevents mere groups or aggregates from counting as substances. (Schnieder (2006) and Weir (2021) are liberal about counting aggregates as substances, however.) For example, Kathrin Koslicki (2018) defends a neo-Aristotelian view that, rather than employing an independence criterion as many Aristotelians do, accords to hylomorphic compounds the status of being substances on the basis of their exhibiting a special kind of unity. On Koslicki’s account of the relevant kind of unity, a structured whole is unified to the extent that its parts interact in such a way as to allow it to manifest team-work-requiring capacities, such as the capacity for visual perception that an organism has in virtue of the way its eye interacts with its brain and other parts.

b. The Structure of Substances

A second theme that has regained prominence in the twentieth century and the first two decades of the twenty-first century concerns the structure of substances. Increasing attention has been given to the question of whether substances should be regarded as comprising two components: properties and a substratum. At the same time, many philosophers have revived elements of Aristotle’s analysis of material substances into form and matter. (Hylomorphism can be thought of as one particularly important version of the analysis into properties and substratum, or as a distinct but somewhat similar position.)

As noted in section 5.d, Locke (perhaps inadvertently) popularised the idea that the word “substance” refers to a propertyless substratum and that we should be sceptical about the coherence or use of the idea of substances so understood. This idea persisted into the twentieth century in the works of thinkers such as Bertrand Russell (1945, 211) and J. L. Mackie (1976, 77) and is partly responsible for a widespread hostility to substances in this period. Justin Broackes (2006) reviews this development and attempts to rescue the traditional idea of substance from its association with a propertyless substratum.

At the same time, a number of thinkers have come to the defence of the idea that substances can be analysed into properties and substratum. As a result, by the dawn of the twenty-first century, it has become commonplace to speak of two main views about the structure of substances: the bundle view and the substratum view. (As explained in section 5, the bundle view here is simply the view that a substance consists of properties with no substratum. It need not entail the more extreme claim that the properties of a substance can exist separately.)

A prominent argument for the substratum view says that something resembling a propertyless substratum is needed to contribute particularity to a substance. On the standard version of this view, universal properties must be instantiated in a bare particular. An early defence of bare particulars is advanced by Gustav Bergmann (1947), and the view is then developed in works by, for example, David Armstrong (1978, 1997), Theodore Sider (2006) and Andrew Bailey (2012). In this context, Armstrong draws a contrast between what he terms a thick particular, which is “a thing taken along with all its properties”, and a thin particular, which is “a thing taken in abstraction from all its properties”. These correspond to the traditional idea of a substance and to that of a substratum of the bare-particular variety, respectively.

Although the idea of a bare particular can be seen as a version of Locke’s idea of a propertyless substratum, bare particulars are not typically introduced to play the role that Locke assigns to substrata, namely that of supporting properties. Rather, for Bergmann and others, the principal role of the bare particular is to account for the particularity of a substance whose other components, its properties, are all universals (things that can exist in multiple places at once). In this respect, the bare particular resembles the Vaisheshika vishesha and the Scotist haecceity.

A different line of argument for positing a substratum, advanced by C. B. Martin (1980), says that without a substratum to bind them together, we should expect the properties of an object to be capable of existing separately, like its parts, something that most philosophers believe that properties cannot do. Unlike the emphasis on the role of particularising, this line of argument may have some attraction for those who hold that properties are particulars rather than universals. One objection to Martin’s argument says that the properties in a bundle might depend on one another without depending on some further substratum (Denkel 1992).

While much of the discussion concerning the structure of substances has focused on the choice between the bundle view and the substratum view, some philosophers have also shown a revival of interest in Aristotle’s analysis of material substance into form and matter, including the prominent role he gives to substantial forms in determining the kinds to which substances belong.

The latter idea is given new life in Peter Geach’s (1962) and David Wiggins’ (2001) defences of the sortal-dependence of identity. A sortal is a term or concept that classifies an entity as belonging to a certain kind and that hence provides, like Aristotle’s substantial forms, an answer to the question “what is x?”. The claim that identity is sortal-dependent amounts to the claim that if some entity a at an earlier time is identical to an entity b at a later time, then there must be some sortal F such that a and b are the same F (the same elephant, for example, or the same molecule). As a result, the conditions under which a and b count as identical will depend on what sortal F is: the criteria for being the same elephant have to do with the kind of things elephants are; the criteria for being the same molecule have to do with the kind of things molecules are. Geach goes further than Wiggins in arguing that identity is not just sortal-dependent but also sortal-relative, so that a might be the same F as b but not the same G as b. Wiggins argues that the sortal-relativity of identity must be rejected, given Leibniz’s law of the indiscernibility of identicals.

The claim that identity is sortal-dependent implies that there is a degree of objectivity to the kinds under which we sort entities. It contrasts with the Lockean claims that the kinds that we employ are arbitrary and, as Leszek Kołakowski expresses it, that:

Nothing prevents us from dissecting surrounding material into fragments constructed in a manner completely different from what we are used to. Thus, speaking more simply, we could build a world where there would be no such objects as “horse”, “leaf”, “star”, and others allegedly devised by nature. Instead, there might be, for example, such objects as “half a horse and a piece of river”, “my ear and the moon”, and other similar products of a surrealist imagination. (1968, 47–8)

Insofar as Geach’s and Wiggins’ sortals play the role of Aristotle’s substantial forms, their claims about sortal-dependence can be seen as reviving elements of Aristotle’s hylomorphism in spirit if not in letter. Numerous works go further, in explicitly defending the analysis of material substances into matter and form. Examples include Johnston (2006), Jaworski (2011, 2012), Rea (2011), Koslicki (2018) and many others.

Early-twenty-first-century hylomorphists vary widely on the nature they attribute to forms, especially with respect to whether forms should be regarded as universals or particulars. Most, however, regard the form as the source of an object’s structure, unity, activity, and the kind to which it belongs. Motivations for reviving hylomorphic structure include its (putative) ability to differentiate between those composites that really exist and those that are mere aggregates, to account for change, and to make sense of the relationship between properties, their bearers, and resemblances between numerically distinct bearers (see, for example, Koslicki 2018, § 1.5). For these hylomorphists, as for Aristotle, the matter that the form organises need not be in itself propertyless, and thus, although hylomorphism can be viewed as one version of the substratum theory of substances, it can avoid the objection that the idea of an entity that is in itself propertyless is incoherent.

Critics of this sort of hylomorphism, such as Howard Robinson (2021), have questioned whether it can do this work while remaining consistent with the thesis that all events can be accounted for by physical forces (that is, the completeness of physics thesis). Robinson argues that if physics is complete, then forms cannot play any explanatory role.

c. Substance and the Mind-Body Problem

Philosophical work on the idea of substance typically arises as part of the project of describing reality in general. Yet, a more specific source of interest in substances has arisen in the context of philosophy of mind, where the distinction between substances and properties is used to distinguish between two kinds of dualism: substance dualism and property dualism.

The terms “substance dualism” and “property dualism” were hardly used before the 1970s (Michel et al. 2011). They appear to have gained prominence as a result of a desire among philosophers arguing for a revival of mind-body dualism to distinguish their position from the traditional forms of dualism endorsed by philosophers such as Plato and Descartes. Traditional dualists affirm that the mind is a nonphysical substance, something object-like that can exist separately from the body. By contrast, many twentieth-century and early-twenty-first-century proponents of dualism, beginning with Frank Jackson (1982), limit themselves to the claim that the mind involves nonphysical properties.

One advantage of positing nonphysical properties only is that this has allowed proponents of property dualism to represent their position as one that departs only slightly from popular physicalist theories and to distance themselves from the unfashionable idea that a person exists or might exist as a disembodied mind. At the same time, however, several philosophers have questioned whether it makes sense to posit nonphysical properties only, without nonphysical substances (for example, Searle 2002; Zimmerman 2010; Schneider 2012; Weir 2023). Several works, such as those collected in Loose et al. (2018), argue that substance dualism may have advantages over property dualism.

These discussions are complicated by the fact that at the beginning of the third decade of the twenty-first century, there still exists no consensus on how to define the notion of substance, and on what the distinction between substances and properties consists in. Hence, it is not always obvious what property-dualists take themselves to reject when they eschew nonphysical substances.

7. References and Further Reading

  • Aristotle. Categories and De Interpretatione. Edited and translated by J. L. Ackrill (1963). Oxford: Clarendon.
    • Contains Aristotle’s classic introduction of the concept of substance.
  • Aristotle. Aristotle’s Metaphysics. Edited and translated by W. D. Ross (1924). Oxford: Oxford University Press.
    • Develops and revises Aristotle’s account of the nature of substances.
  • Aristotle. Physics. Edited and translated by C. D. C. Reeve (2018). Indianapolis, IN: Hackett.
    • Explains change by analysing material substances into matter and form.
  • Armstrong, David. (1978). Universals and Scientific Realism. Cambridge: Cambridge University Press.
    • Contains a classic discussion of the bundle theory and the substratum theory.
  • Armstrong, David. (1997). A World of States of Affairs. Cambridge: Cambridge University Press.
    • Contains an influential discussion of thin (that is, bare) particulars.
  • Arnauld, Antoine and Pierre Nicole. (1662). Logic or the Art of Thinking. Edited and translated by J. V. Buroker (1996). Cambridge: Cambridge University Press.
    • A highly influential Cartesian substitute for Aristotle’s logical works, covering the concept of substance.
  • Aquinas, Thomas. De Ente et Essentia. Leonine Commission (Ed.), 1976. Rome: Vatican Polyglot Press.
    • Distinguishes essence from existence.
  • Aquinas, Thomas. Questiones Disputate de Anima. Leonine Commission (Ed.), 1996. Rome: Vatican Polyglot Press.
    • Rejects universal hylomorphism.
  • Ayers, M. R. (1977). The Ideas of Power and Substance in Locke’s Philosophy (revised edition of a 1975 paper). In I. C. Tipton (Ed.), Locke on Human Understanding (pp. 77–104). Oxford: Oxford University Press.
    • Defends an influential interpretation of Locke on substances.
  • Bailey, A. M. (2012). No Bare Particulars. Philosophical Studies, 158, 31–41.
    • Rejects bare particulars.
  • Barney, S. A., W. J. Lewis, J. A. Beach and Oliver Berghof. (2006). Introduction. In S. A. Barney, W. J. Lewis, J. A. Beach and O. Berghof (Eds. & Trans.), The Etymologies of Isidore of Seville (pp. 1–2). Cambridge: Cambridge University Press.
    • Introduces Isidore of Seville’s Etymologies.
  • Baxter, Donald. (2015). Hume on Substance: A Critique of Locke. In P. Lodge & T. Stoneham (Eds.), Locke and Leibniz on Substance (pp. 45–62). New York, NY: Routledge.
    • An exposition of Hume on substance.
  • Bennett, Jonathan. (1987). Substratum. History of Philosophy Quarterly, 4(2), 197–215.
    • Defends the traditional interpretation of Locke on substance.
  • Bergmann, Gustav. (1947). Russell on Particulars. The Philosophical Review, 56(1), 59–72.
    • Defends bare particulars against Russell.
  • Berkeley, George. The Works of George Berkeley, Bishop of Cloyne. A. A. Luce and T. E. Jessop (Eds.), 1948–1957. London: Thomas Nelson and Sons.
  • Bhikkhu Bodhi (Trans.). (2000). The Connected Discourses of the Buddha: A New Translation of the Samyutta Nikaya. Somerville, MA: Wisdom Publications.
    • Contains the version of the chariot argument against substance attributed to the ancient Buddhist nun, Vajira.
  • Broackes, Justin. (2006). Substance. Proceedings of the Aristotelian Society, 106, 133–68.
    • Traces the historical confusion between substance and substratum and defends the former concept.
  • Descartes, René. The Philosophical Writings of Descartes (3 vols.). Edited and translated by J. Cottingham, R. Stoothoff, D. Murdoch, and A. Kenny (1984–1991). Cambridge: Cambridge University Press.
    • Contains Descartes’ influential claims about substance, including his independence definition.
  • Descartes, René. Conversation with Burman. Translated by J. Bennett (2017). https://earlymoderntexts.com/assets/pdfs/descartes1648.pdf
    • Contains Descartes’ identification of the substance with its attributes.
  • Denkel, Arda. (1992). Substance Without Substratum. Philosophy and Phenomenological Research, 52(3), 705–711.
    • Argues that we can retain the concept of substance while rejecting that of a substratum.
  • Druart, Thérèse-Anne. (1987). Substance in Arabic Philosophy: Al-Farabi’s Discussion. Proceedings of the American Catholic Philosophical Association, 61, 88–97.
    • An exposition of al-Farabi on substance.
  • Fine, Kit. (1994). Essence and Modality. Philosophical Perspectives, 8, 1–16.
    • Defends the idea of non-modal essences.
  • Fine, Kit. (1995). Ontological Dependence. Proceedings of the Aristotelian Society, 95, 269–90.
    • Defends the idea of essential dependence.
  • Forrai, Gabor. (2010). Locke on Substance in General. Locke Studies, 10, 27–59.
    • Attempts to synthesise Bennett’s traditional and Ayers’ novel interpretations of Locke on substance.
  • Geach, Peter. (1962). Reference and Generality. Ithaca: Cornell University Press.
    • Defends the sortal-dependence and sortal-relativity of identity.
  • Gopnik, Alison. (2009). Could David Hume Have Known about Buddhism? Charles François Dolu, the Royal College of La Flèche, and the Global Jesuit Intellectual Network. Hume Studies, 35(1-2), 5–28.
    • Argues that Hume’s criticism of the idea of a substantial self may have been influenced by Buddhist philosophy.
  • Gorman, Michael. (2006). Independence and Substance. International Philosophical Quarterly, 46, 147–159.
    • Defends a definition of substances as things that do not inhere in anything.
  • Halbfass, Wilhelm. (1992). On Being and What There Is: Classical Vaisesika and the History of Indian Ontology. New York: SUNY Press.
    • Contains a very useful introduction to the concept of substance in classical Indian philosophy.
  • Hoffman, Joshua and Gary Rosenkrantz. (1996). Substance: Its Nature and Existence. London: Routledge.
    • A sustained examination and defence of a novel characterisation of substance.
  • Hume, David. A Treatise of Human Nature. Edited by D. F. Norton and M. J. Norton (2007). Oxford: Clarendon Press.
    • Contains Hume’s influential objections to the idea of substance.
  • Isidore of Seville. Etymologies. Edited and translated by S. A. Barney, W. J. Lewis, J. A. Beach and O. Berghof (2006). Cambridge: Cambridge University Press.
    • Played an important role in transmitting Aristotle’s characterisation of substance to medieval philosophers in the Latin West.
  • Jackson, Frank. (1982). Epiphenomenal Qualia. Philosophical Quarterly, 32(127), 127–36.
    • A classic defence of property-dualism.
  • Kaipayil, Joseph. (2008). An Essay on Ontology. Kochi: Karunikan.
    • Contains a discussion of the idea of substance in both Western and Indian philosophy.
  • Kant, Immanuel. (1787). Critique of Pure Reason. Edited and translated by N. K. Smith (2nd ed., 2007). Basingstoke: Palgrave Macmillan.
    • Contains Kant’s approach to the idea of substance and his comments on Aristotle’s Categories.
  • Kołakowski, Leszek. (1968). Towards a Marxist Humanism. New York: Grove Press.
    • Claims, contra Geach and Wiggins, that the kinds we divide the world into are arbitrary.
  • Koslicki, Katherine. (2018). Form, Matter and Substance. Oxford: Oxford University Press.
    • Defends a unity criterion that attributes substancehood to hylomorphic compounds.
  • Leibniz, G. W. Critical Thoughts on the General Part of the Principles of Descartes. In L. Loemker (Ed. & Trans.), Gottfried Leibniz: Philosophical Papers and Letters (2nd ed., 1989). Alphen aan den Rijn: Kluwer.
    • Contains a criticism of Descartes’ independence definition of substance.
  • Leibniz, G. W. Discourse on Metaphysics. Edited and translated by G. Rodriguez-Pereyra (2020). Oxford: Oxford University Press.
    • Presents Leibniz’s idiosyncratic conception of substance.
  • Locke, John. An Essay Concerning Human Understanding. Edited by P. H. Nidditch (1975). Oxford: Oxford University Press.
    • Contains Locke’s critical discussion of substance and substratum.
  • Loose, Jonathan, Angus Menuge, and J. P. Moreland (Eds.). (2018). The Blackwell Companion to Substance Dualism. Oxford: Blackwell.
    • Collects works bearing on substance dualism.
  • Lowe, E. J. (1998). The Possibility of Metaphysics: Substance, Identity and Time. Oxford: Clarendon Press.
    • Discusses substance and defends Lowe’s identity-independence criterion.
  • Lowe, E. J. (2005). The Four-Category Ontology: A Metaphysical Foundation for Natural Science. Oxford: Clarendon Press.
    • Further develops Lowe’s account of substance.
  • Martin, C. B. (1980). Substance Substantiated. Australasian Journal of Philosophy, 58(1), 3–10.
    • Argues that we should posit a substratum to explain why the properties of a substance cannot exist separately.
  • McEvilley, Thomas. (2002). The Shape of Ancient Thought: Comparative Studies in Greek and Indian Philosophies. London: Simon & Schuster.
    • Compares ancient Greek and classical Indian philosophy on many issues including the nature of substances.
  • Messina, James. (2021). The Content of Kant’s Pure Category of Substance and its Use on Phenomena and Noumena. Philosophers’ Imprint, 21(29), 1–22.
    • An exposition of Kant on substance.
  • Michel, Jean-Baptiste, et al. (2011). Quantitative Analysis of Culture Using Millions of Digitized Books. Science, 331(6014), 176–182.
    • Records development of Google’s Ngram which provides data on the appearance of the terms “substance dualism” and “property dualism”.
  • Moise, Ionut and G. U. Thite. (2022). Vaiśeṣikasūtra: A Translation. London: Routledge.
    • The founding text of the Vaisheshika school.
  • Neale, Matthew. (2014). Madhyamaka and Pyrrhonism: Doctrinal, Linguistic and Historical Parallels and Interactions Between Madhyamaka Buddhism and Hellenic Pyrrhonism. Ph.D. Thesis, University of Oxford.
    • Discusses the relationship between Madhyamaka and Pyrrhonism.
  • O’Conaill, Donnchadh. (2022). Substance. Cambridge: Cambridge University Press.
    • A detailed overview of philosophical work on substance.
  • Plato. Sophist. Edited and translated by N. White (1993). Indianapolis, IN: Hackett.
    • Contains Plato’s distinction between things that exist in themselves and those that exist in relation to something else.
  • Priest, Stephen. (2007). The British Empiricists (2nd ed.). London: Routledge.
    • An exposition of the ideas of the British Empiricists on topics including that of substance.
  • Rea, Michael. (2011). Hylomorphism Reconditioned. Philosophical Perspectives, 25(1), 341–58.
    • Defends a version of hylomorphism.
  • Robinson, Howard. (2021). Aristotelian Dualism, Good; Aristotelian Hylomorphism, Bad. In P. Gregoric and J. L. Fink (Eds.), Encounters with Aristotelian Philosophy of Mind (pp. 283–306). London: Routledge.
    • Criticises hylomorphism.
  • Russell, Bertrand. (1945). History of Western Philosophy. London: George Allen and Unwin.
    • Rejects the idea of substances understood as substrata.
  • Schnieder, Benjamin. (2006). A Certain Kind of Trinity: Dependence, Substance, Explanation. Philosophical Studies, 129, 393–419.
    • Defends a conceptual-independence criterion for substancehood.
  • Schneider, Susan. (2012). Why Property Dualists Must Reject Substance Physicalism. Philosophical Studies, 157, 61–76.
    • Argues that mind-body dualists must be substance dualists.
  • Scotus, John Duns. Opera Omnia. Edited by C. Balic et al. (1950-2013). Rome: Vatican Polyglot Press.
    • Contains Scotus’s influential discussions of substance.
  • Searle, John. (2002). Why I am Not a Property Dualist. Journal of Consciousness Studies, 9(12), 57–64.
    • Argues that mind-body dualists must be substance dualists.
  • Sider, Ted. (2006). Bare Particulars. Philosophical Perspectives, 20, 387–97.
    • Defends substrata understood as bare particulars.
  • Solomon ibn Gabirol. The Fount of Life (Fons Vitae). Translated by J. Laumakis (2014). Milwaukee, WI: Marquette University Press.
    • Presents Avicebron’s (Solomon ibn Gabirol’s) universal hylomorphism.
  • Spade, P. V. (2008). Binarium Famosissimum. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Fall 2008 Edition). <https://plato.stanford.edu/archives/fall2008/entries/binarium/>
    • Discusses the medieval case for universal hylomorphism.
  • Spinoza, Baruch. Principles of Cartesian Philosophy. Edited and translated by H. E. Wedeck (2014). New York: Open Road Integrated Media.
    • Contains Spinoza’s presentation of Descartes’ account of substance.
  • Spinoza, Baruch. Ethics: Proved in Geometrical Order. Edited by M. J. Kisner and translated by M. Silverthorne and M. J. Kisner (2018). Cambridge: Cambridge University Press.
    • Contains Spinoza’s account of substance and argument for substance monism.
  • Strawson, P. F. (1997). Kant on Substance. In P. F. Strawson, Entity and Identity and Other Essays (pp. 268–79). Oxford: Oxford University Press.
    • An exposition of Kant on substance.
  • Weir, R. S. (2021). Bring Back Substances! The Review of Metaphysics, 75(2), 265–308.
    • Defends the idea of substances as things that can exist by themselves.
  • Weir, R. S. (2023). The Mind-Body Problem and Metaphysics: An Argument from Consciousness to Mental Substance. London: Routledge.
    • Argues that those who posit nonphysical properties to solve the mind-body problem must also posit nonphysical substances.
  • Westerhoff, Jan. (2009). Nagarjuna’s Madhyamaka: A Philosophical Introduction. Oxford: Oxford University Press.
    • An introduction to Nagarjuna’s philosophy.
  • Wiggins, David. (2001). Sameness and Substance Renewed. Cambridge: Cambridge University Press.
    • Defends the sortal-dependence of identity, but rejects the sortal-relativity of identity.
  • Zimmerman, Dean. (2010). From Property Dualism to Substance Dualism. Aristotelian Society Supplementary Volume, 84(1), 119–150.
    • Argues that mind-body dualists must be substance dualists.

 

Author Information

Ralph Weir
Email: rweir@lincoln.ac.uk
University of Lincoln
United Kingdom

Arthur Schopenhauer: Logic and Dialectic

For Arthur Schopenhauer (1788–1860), logic as a discipline belongs to the human faculty of reason, more precisely to the faculty of language. This discipline of logic breaks down into two areas. Logic or analytics is one side of the coin; dialectic or the art of persuasion is the other. The former investigates rule-oriented and monological language. The latter investigates result-oriented and persuasive language.

Analytics or logic, in the proper sense, is a science that emerged from the self-observation of reason and the abstraction of all content. It deals with formal truth and investigates rule-governed thinking. The uniqueness of Schopenhauer’s logic emerges from its reference to intuition, which leads him to use numerous geometric forms in logic that are understood today as logic diagrams, combined with his aim of achieving the highest possible degree of naturalness, so that logic resembles mathematical proofs and, especially, the intentions of everyday thinking.

It follows from both logic and dialectic that Schopenhauer did not actively work to develop a logical calculus: in his view, axiomatisation contradicts natural thinking, and it also contradicts mathematics, whose foundations should rely upon intuition rather than upon the rigor that algebraic characters are supposed to possess. However, the visualization of logic through diagrams and of geometry through figures is not intended to be empirical; rather, it is about the imaginability of logical or mathematical forms. Schopenhauer is guided primarily by Aristotle with regard to naturalness, by Euler with regard to intuition, and by Kant with regard to the structure of logic.

Schopenhauer called dialectic ‘eristics’, the ‘art of persuasion’, and the ‘art of being right’. It has a practical dimension. Dialectic examines the forms of dialogue, especially arguments, in which speakers frequently violate logical and ethical rules in order to achieve their goal of argumentation. In pursuing this, Schopenhauer starts from the premise that reason is neutral and can, therefore, be used as a basis for valid reasoning, although it can also be misused. In the case of abuse, speakers instrumentalize reason in order to appear right and prevail against one or more opponents. Even if some texts on dialectic contain normative formulations, Schopenhauer’s goal is not to motivate invalid reasoning, but to protect against it. As such, scientific dialectic is not an ironic or sarcastic discipline, but a protective tool in the service of Enlightenment philosophy.

Schopenhauer’s dialectic is far better known than his analytics, although in direct comparison it makes up the smaller part of his writings on logic in general. For this reason, and because most texts on dialectic build on analytics, the following article is not structured around the two sub-disciplines, but around Schopenhauer’s very different texts on logic in general. First, logic is positioned as a discipline within the philosophical system. Then, the Berlin Lectures, his main text on analytics and dialectic, is introduced and followed, in chronological order, by his shorter texts on analytics and dialectic. The final section outlines research topics.

Table of Contents

  1. Logic and System
    a. Schopenhauer’s Philosophical System
    b. Normativism or Descriptivism
    c. Logic within the System
    d. Schopenhauer’s Treatises on Logic and Dialectics
  2. Schopenhauer’s Logica Maior (the Berlin Lectures)
    a. Doctrine of Concepts and Philosophy of Language
      i. Translation, Use-Theory, and Contextuality
      ii. Abstraction, Concretion, and Graphs
    b. Doctrine of Judgments
      i. Relational Diagrams
      ii. Stoic Logic and Oppositional Geometry
      iii. Conversion and Metalogic
      iv. Analytic-Synthetic Distinction and the Metaphor of the Concept
    c. Doctrine of Inferences
      i. Foundations of Logic
      ii. Logical Aristotelianism and Stoicism
      iii. Naturalness in Logic
    d. Further Topics of Analytic
      i. Schopenhauer’s History of Logic
      ii. Logic and Mathematics
      iii. Hermeneutics
    e. Dialectic or Art of Persuasion
  3. Schopenhauer’s Logica Minor
    a. Fourfold Root
    b. World as Will and Representation I (Chapter 9)
    c. Eristic Dialectics
    d. World as Will and Representation II (Chapters 9 and 10)
    e. Parerga and Paralipomena II
  4. Research Topics
  5. References and Further Readings
    a. Schopenhauer’s Works
    b. Other Works

1. Logic and System

a. Schopenhauer’s Philosophical System

Schopenhauer’s main work is The World as Will and Representation (W I). This work represents the foundation and overview of his entire philosophical system (and also includes a minor treatise on logic). It was first published in 1819 and was accepted as a habilitation thesis at the University of Berlin shortly thereafter. W I was also the basis for a revised and elaborated version, the Berlin Lectures (BL), written in the early 1820s. W I later appeared in slightly revised second and third editions (1844, 1859), accompanied by a second volume (W II) that functioned as a supplement or commentary. However, none of these later editions was as rich in content as the revision in the BL. All other writings, namely On the Fourfold Root of the Principle of Sufficient Reason (1813 as a dissertation, 1847), On the Will in Nature (1836, 1854), The Two Fundamental Problems of Ethics (1841, 1860), and Parerga and Paralipomena (1851), can also be regarded as supplements to W I or the BL.

The claim Schopenhauer makes for his system in W I (and also the BL) follows (early) modern and especially Kantian criteria for a system. He claimed that philosophy aims to depict, in one single system, the interrelationships between all the components that need to be examined. Following Kant, a good or perfect system is determined by the criterion of whether the system can describe all components of nature and mind without leaving any gaps or whether all categories, principles, and topics have been listed in order to describe all components of nature and mind. This claim to completeness becomes clear in Schopenhauer’s system, more precisely, in W I or BL, each of which is divided into four books. The first book deals mainly with those topics that would, in contemporary philosophy, be assigned to epistemology, philosophy of mind, philosophy of science, and philosophy of language. The second book is usually understood as covering metaphysics and the philosophy of nature. The third book presents his aesthetics and the fourth book practical philosophy, including topics such as ethics, theory of action, philosophy of law, political philosophy, social philosophy, philosophy of religion, and so forth.

b. Normativism or Descriptivism

Schopenhauer’s system, as described above (Sect. 1.a), has not been uniformly interpreted in its 200-year history of reception, a factor that has also played a significant role in the reception of his logic. The differences between the individual schools of interpretation have become increasingly obvious since the 1990s and are a significant subject of discussion in research (Schubbe and Lemanski 2019). Generally speaking, one can differentiate between two extreme schools of interpretation (although not every contemporary study on Schopenhauer can be explicitly and unambiguously assigned to one of the following positions):

  1. Normativists understand Schopenhauer’s system as the expression of one single thought that is marked by irrationality, pessimism, obscurantism, and denial of life. The starting point of Schopenhauer’s system is Kant’s epistemology, which provides the foundation for traversing the various subject areas of the system (metaphysics, aesthetics, ethics). However, all topics presented in the system are only introductions (“Vorschulen”) to the philosophy of religion, which Schopenhauer proclaims is the goal of his philosophy, that is, salvation through knowledge (“Erlösung durch Erkenntnis”). Normativists are above all influenced by various philosophical schools or periods of philosophy such as late idealism (Spätidealismus), the pessimism controversy, Lebensphilosophie, and Existentialism.
  2. Descriptivists understand Schopenhauer’s philosophy as a logically ordered representation of all components of the world in one single system, without one side being valued more than the other. Depending on the subject, Schopenhauer’s view alternates between rationalism and irrationalism, between optimism and pessimism, between affirmation and denial of life, and so forth. Thus, there is no intended priority for a particular component of the system (although, particularly in later years, Schopenhauer’s statements became more and more emphatic). This school is particularly influenced by those researchers who have studied Schopenhauer’s intense relationship with empiricism, logic, hermeneutics, and neo-Kantianism.

c. Logic within the System

The structure of logic is determined by three sub-disciplines: the doctrines of concepts, judgments, and inferences. However, the main focus of Schopenhauerian logic is not the doctrine of inferences in the sense of logical reasoning and proving but rather in the sense that his logic corresponds with his philosophy of mathematics. According to Schopenhauer, logical reasoning in particular is overrated, as people rarely put forward invalid inferences, although they often put forward false judgments. The intentional use of fallacies is an exception, which is why it is studied by dialectics.

The evaluation of Schopenhauer’s logic depends strongly on the school of interpretation. Normativists have either ignored Schopenhauer’s logic or identified it with (eristic) dialectic, which in turn has been reduced to a normative “Art of Being Right” or “of Winning an Argument” (see below, Sect. 2.e, 3.c). A relevant contribution to Schopenhauer’s analytics from the school of normativists is, therefore, not known, although there were definitely intriguing approaches to dialectics. As normativism was the more dominant school of interpretation until late in the 20th century, it shaped the public image of Schopenhauer as an enemy of logical and mathematical reasoning, and so forth.

Descriptivists emphasize logic as both the medium of the system and the subject of a particular topic within the W I-BL system. The first book of W I-BL deals with representation and is divided into two sections (Janaway 2014): 1. Cognition (W I §§3–7, BL chap. 1, 2), 2. Reason (W I §§8–16, BL 3–5). Cognition refers to the intuitive and concrete, reason to the discursive and abstract representation. In the paragraphs on cognition, Schopenhauer examines the intuitive representation and its conditions, that is, space, time, and causality, while reason is built on cognition and is, therefore, the ‘representation of representation’. Schopenhauer examines three faculties of reason, which form the three sections of these paragraphs: 1. language, 2. knowledge, and 3. practical reason. Language, in turn, is then broken down into three parts: general philosophy of language, logic, and dialectics. (Schopenhauer defines rhetoric as, primarily, the speech of one person to many, and he rarely dealt with it in any substantial detail.) Following the traditional structure, Schopenhauer then divides logic into sections on concepts, judgments, and inferences. Logic thus fulfills a double role in Schopenhauer’s system: it is a topic within the entire system and it is the focus through which the system is organized and communicated. Fig. 1 shows this classification using W I as an example.
Figure 1: The first part of Schopenhauer’s system focusing on logic

However, this prominent role of logic only becomes obvious when Schopenhauer presents the aim of his philosophy. The task of his system is “a complete recapitulation, a reflection, as it were, of the world, in abstract concepts”, whereby the discursive system becomes a finite “collection [Summe] of very universal judgments” (W I, 109, BL, 551). Since, in Schopenhauer’s system, logic alone clarifies what concepts and judgments are, it is a very important component for understanding his entire philosophy. Schopenhauer, however, vehemently resists an axiomatic approach because in logic, mathematics and, above all, philosophy, nothing can be assumed as certain; rather, every judgment may represent a problem. Philosophy itself must be such that it is allowed to be skeptical about tautologies or laws (such as the laws of thought). This distinguishes it from other sciences. Logic and language cannot, therefore, be the foundation of science and philosophy, but are instead their means and instrument (see below, Sect. 2.c.i).

Through this understanding of the role of logic within the system, the difference between the two schools of interpretation can now also be determined: Normativists deny the prominent role attributed to logic as they regard the linguistic-logical representation as a mere introduction (“Vorschule”) to philosophical salvation at the end of the fourth book of W I or BL. This state of salvation is no longer to be described using concepts and judgments. In contrast, descriptivists stress that Schopenhauer’s entire system aims to describe the world and man’s attitude to the world with the help of logic and language. This also applies to the philosophy of religion and the treatises on salvation at the end of W I and BL. As emphasized by Wittgensteinians in particular, Schopenhauer also shows, ultimately, what can still be logically expressed and what can no longer be grasped by language (Glock 1999, 439ff.).

d. Schopenhauer’s Treatises on Logic and Dialectics

Schopenhauer’s whole oeuvre is thought to contain a total of seven longer texts on logic. In chronological order, these are:

(1) In the summer semester of 1811, Schopenhauer attended Gottlob Ernst (“Aenesidemus”) Schulze’s lectures on logic and wrote several notes on Schulze’s textbook (d’Alfonso 2018). As these comments do not represent work by Schopenhauer himself, they are not discussed in this article. The same applies to Schopenhauer’s comments on other books on logic, such as those of Johann Gebhard Ehrenreich Maass, Johann Christoph Hoffbauer, Ernst Platner, Johann Gottfried Kiesewetter, Salomon Maimon et al. (Heinemann 2020), as well as his shorter manuscript notes published in the Manuscript Remains. (Schopenhauer made several references to his manuscripts in BL.)

(2) Schopenhauer’s first discussion of logic occurred in his dissertation of 1813 which presented a purely discursive reflection on some components of logic (concepts, truth, and so on). In particular, his reflections on the laws of thought were emphasized.

(3) For the first time in 1819, in § 9 of W I, Schopenhauer distinguished between analytics and dialectics in more detail. In the section on analytics, he specified a doctrine of concepts with the help of a few logic diagrams. However, he wrote in § 9 that this doctrine had already been fairly well explained in several textbooks and that it was, therefore, not necessary to load the memory of the ‘normal reader’ with these rules. In the section on dialectic, he sketched a large argument map for the first time. § 9 was only lightly revised in later editions; however, his last notes in preparation for a fourth edition indicate that he had planned a few more interesting changes and additions.

(4) During the 1820s, Schopenhauer took the W I system as a basis, supplemented the missing information from his previously published writings, and developed a system that eliminated some of the shortcomings and ambiguities of W I. The system within these manuscripts then served as a source for his lectures in Berlin in the early 1820s, that is, the BL. In the first book of the BL, there is a treatise on logic the size of a textbook.

(5) Eristic Dialectics is the title of a longer manuscript that Schopenhauer worked on in the late 1820s and early 1830s. This manuscript is one of Schopenhauer’s best-known texts, although it is unfinished. It takes many earlier approaches further, but the context to analytics (and to logic diagrams) is missing in this small fragment on dialectics. With the end of his university career in the early 1830s, Schopenhauer’s intensive engagement with logic came to an end.

(6) It was not until 1844, in W II, that Schopenhauer supplemented the doctrine of concepts given in W I with a 20-page doctrine of judgment and inference. This, however, is no longer compatible with the earlier logic treatises written before 1830, as Schopenhauer repeatedly suggests new diagrammatic logics, which he does not illustrate. Given these changes, the published texts on logic look inconsistent.

(7) In 1851, Schopenhauer once again published a short treatise entitled “Logic and Dialectics” in the second volume of Parerga and Paralipomena. This treatise, however, only deals with some topics from the philosophy of logic in aphoristic style and, otherwise, focuses more strongly on dialectic. Few new insights are found here.

Since the rediscovery of the Berlin Lectures by descriptivists, a distinction has been made, in the sense of scholastic subdivision, between Logica Maior (Great Logic) and Logica Minor (Small Logic): Treatises (2), (3), (5), (6) and (7) belong to the Logica Minor and are discussed briefly in Section 3. (For more information see Lemanski 2021b, chap. 1.) The only known treatise on logic written by Schopenhauer that deserves to be called a Logica Maior is treatise (4), a manuscript from the Berlin Lectures written in the 1820s. This book-length text is the most profitable reading of all the texts mentioned. Thus, it is discussed in more detail in Section 2.

2. Schopenhauer’s Logica Maior (the Berlin Lectures)

Due to the dominance of the normativists in Schopenhauer scholarship, the BL were long considered just a didactic version of W I and were, therefore, almost completely ignored by researchers until intensive research on Schopenhauer’s logic began in the middle of the 2010s. These lectures are not only interesting from a historical perspective, they also propose many innovations and topics that are still worth discussing today, especially in the area of diagrammatic reasoning and logic diagrams. As Albert Menne, former head of the working group ‘Mathematical Logic’ at the Ruhr-Universität in Bochum, stated: “Schopenhauer has an excellent command of the rules of formal logic (much better than Kant, for example). In the manuscript of his Berlin Lectures, syllogistics, in particular, is thoroughly analyzed and explained using striking examples” (Menne 2002, 201–2).

The BL are a revised and extended version of W I made for the students and guests who attended his lectures in Berlin. The belief that such an elaboration only has minor value is, however, not reasonable. Moreover, the extent, the content, and also the above-mentioned distinction between the exoteric-popular-philosophical and the esoteric-academic part of Schopenhauer’s work suggest a different evaluation. In W I, Schopenhauer deals only casually with difficult academic topics such as logic or philosophy of law; at the beginning of the BL, however, he states that these topics are the most important topics to teach prospective academics. Indeed, he repeatedly pointed out that he will also focus on logic in the title of his announcement for the Berlin Lectures. Thus, the lecture given in the winter semester of 1821-22 is titled “Dianologie und Logik” (BL, XII; Regehly 2018). Therefore, suspicion arises that research has hitherto ignored Schopenhauer’s most important textual version of his philosophical system, as the Berlin Lectures contain his complete system including some of the parts missing from W I, which are very important for the academic interpretation of the system such as logic and philosophy of law.

The first edition of the BL was published by Franz Mockrauer in 1913, reprinted by Volker Spierling in 1986, and a new edition was published in four volumes between 2017 and 2022 by Daniel Schubbe, Daniel Elon, and Judith Werntgen-Schmidt. An English translation is not available. The manuscript of the BL is deposited in the Staatsbibliothek zu Berlin Preussischer Kulturbesitz and can be viewed online at http://sammlungen.ub.uni-frankfurt.de/schopenhauer/content/titleinfo/7187127.

The Logica Maior is found in chapter III of the Berlin Lectures (book I). Here, Schopenhauer begins with (a) a treatise on the philosophy of language that announces essential elements of the subsequent theory of concepts. Then, (b) based on the diagrammatic representation of concepts, he develops a doctrine of judgment. (c) The majority of the work then deals with inferences, in which syllogistic, Stoic logic (propositional logic), modal logic, and the foundation and naturalness of logic are discussed. Together with (d) the appendix, these are the topics that belong to analytics or logic in the proper sense. (e) Finally, he addresses several topics related to dialectics.

a. Doctrine of Concepts and Philosophy of Language

This section mainly deals with BL, 234–260. Schopenhauer begins his discussion of logic with a treatise on language, which is foundational to the subsequent treatise. Several aspects of this part of the Logica Maior have been investigated and discussed to date—namely (i.) translation, use-theory, and contextuality as well as (ii.) abstraction, concretion, and graphs—which are outlined in the following subsections.

i. Translation, Use-Theory, and Contextuality

Schopenhauer distinguishes between words and concepts: he considers words to be signs for concepts, and concepts abstract representations that rest on other concepts or concrete representations (of something, that is, intuition). In order to make this difference explicit, Schopenhauer reflects on translation, as learning a foreign language and translating are the only ways to rationally understand how individuals learn abstract representations and how concepts develop and change over many generations within a particular language community.

In his translation theory, Schopenhauer defines three possible situations:

(1) The concept of the source language corresponds exactly to the concept of the target language (1:1 relation).

(2) The concept of the source language does not correspond to any concept of the target language (1:0 relation).

(3) The concept of the source language corresponds only partially to one or more concepts of the target language (1:(n−x)/n relation, where n is a natural number and x < n).

For Schopenhauer, the last relation is the most interesting one: it occurs frequently, causes many difficulties in the process of translation or language learning, and is the relation with which one can understand how best to learn words or the meaning of words. Remarkably, Schopenhauer developed three theories, arguments, or topics regarding the 1:(n−x)/n relation that have become important in modern logic, linguistics, and analytical philosophy, namely (a) spatial logic diagrams, (b) use-theory of meaning, and (c) the context principle. (a)–(c) are combined in a passage of text on the 1:(n−x)/n translation:

[T]ake the word honestum: its sphere is never hit concentrically by that of the word which any German word designates, such as Tugendhaft, Ehrenvoll, anständig, ehrbar, geziemend [that is, virtuousness, honorable, decent, appropriate, glorious and others]. They do not all hit concentrically: but as shown below:


That is why one learns not the true value of the words of a foreign language with the help of a lexicon, but only ex usu [by using], by reading in old languages and by speaking, staying in the country, by new languages: namely it is only from the various contexts in which the word is found that one abstracts its true meaning, finds the concept that the word designates. (BL, 245f.)

To what extent the penultimate sentence corresponds to what is called the ‘use theory of meaning’, the last sentence of the quote to the so-called ‘context principle’, and to what extent these sentences are consistent with the corresponding theories of 20th-century philosophy of language is highly controversial. Lemanski (2016; 2017, 2021b) and Dobrzański (2017; 2020) see similarities with the formulations of, for example, Gottlob Frege and Ludwig Wittgenstein. However, Schroeder (2012) and Schumann (2020) reject the idea of this similarity, and Weimer (1995; 2018) sees only a representationalist theory of language in Schopenhauer. Dümig (2020) contradicts a use theory and a context principle for quite different reasons, placing Schopenhauer closer to mentalism and cognitivism, while Koßler (2020) argues for the co-existence of various theories of language in Schopenhauer’s oeuvre.

ii. Abstraction, Concretion, and Graphs

With (b) and (c) Schopenhauer not only comes close to the modern philosophy of ordinary language, but he may also be the first philosopher in history to have used (a) logic diagrams to represent semantics or ontologies of concepts (independent of their function in judgments). In his philosophy of language, he also uses logic diagrams to sketch the processes of conceptual abstraction. Schopenhauer intends to describe processes of abstraction that are initially based on concrete representation, that is, the intuition of a concrete object, from which increasingly abstract concepts have formed over several generations within a linguistic community.

Figure 2 (SBB-IIIA, NL Schopenhauer, Fasz. 24, 112r = BL, 257)

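For example, Fig. 2 shows the ‘spheres’ of the words ‘grün’ (‘green’), ‘Baum’ (‘tree’), and ‘blüthetragend’ (‘flower-bearing’) using three circles. The diagram represents all combinations of subclasses by intersecting the spheres of the concepts that are to be analyzed, more specifically, the eight classes GTF, GT¬F, G¬TF, ¬GTF, G¬T¬F, ¬GT¬F, ¬G¬TF, and ¬G¬T¬F, where G, T, and F abbreviate the three words and ¬ marks exclusion from a sphere.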

There is a recognizable relationship with Venn diagrams here, as Schopenhauer uses the combination of the so-called ‘three-circle diagram’, a primary diagram in Venn’s sense. Schopenhauer distinguishes between an objective and a conceptual abstraction, as the following example illustrates: (1) GTF denotes a concept created by the objective abstraction from an object of intuitive representation, that is, a concretum. The object this was abstracted from belongs to the set of objects that are trees, bear flowers, and are green. All further steps of abstraction are conceptual abstractions or so-called ‘abstracta’. In the course of generations, language users have recognized that there are also (2) representations that can only be described with GF, but not with T, more precisely, G¬TF, where ¬ marks an excluded sphere (for example, a common daisy). In the next step (3), the concept F was excluded so that the abstract representation of G was formed, that is, G¬T¬F (for example, bryophytes). Finally, (4) a purely negative concept was formed, whose property is not G nor T nor F, more specifically, ¬G¬T¬F.

This region lies outside the conceptual sphere and, therefore, does not designate an abstractum or a concept anymore: it is merely a word without a definite meaning, such as ‘the absolute’, ‘substance’, and so forth (compare Xhignesse 2020).

Fig. 3: Interpretation of Fig. 2

In addition to the three-circle diagram (Fig. 2) and the eight classes, the interpretation in Fig. 3 includes a graph illustrating the four steps mentioned above: (1) corresponds to v1, (2) is the abstraction e1 from v1 to v2, (3) is the abstraction e2 from v2 to v3, and (4) is the abstraction e3 from v3 to v4. In this example, the graph can be interpreted as directed with v1 as the source vertex and v4 as the sink vertex. However, Schopenhauer also uses these diagrams in the opposite direction, that is, not only for abstraction but also for concretion. In both directions, the vertices in the graph represent concepts, whereas the edges represent abstraction or concretion. On account of the concretion, Schopenhauer has also been associated with reism, concretism theory, and reification of the Lwów-Warsaw School (compare Dobrzański 2017; Lemanski and Dobrzański 2020).
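The eight classes and the four-step path can be made concrete with a short Python sketch. It is only an illustration: the names `CONCEPTS`, `region_label`, and `ABSTRACTION_PATH`, and the use of ¬ to mark an excluded sphere, are assumptions introduced here, not Schopenhauer's notation.

```python
from itertools import product

# Hypothetical labels for the three spheres of Fig. 2: green, tree, flower-bearing.
CONCEPTS = ("G", "T", "F")


def region_label(membership):
    """Label one region of the three-circle diagram.

    `membership` holds one boolean per concept (inside the sphere or not),
    e.g. (True, False, True) becomes 'G¬TF'.
    """
    return "".join(c if inside else "¬" + c for c, inside in zip(CONCEPTS, membership))


# All 2**3 = 8 combinations of lying inside or outside each sphere.
ALL_REGIONS = [region_label(m) for m in product((True, False), repeat=3)]

# The four steps from the text, read as a directed path v1 -> v2 -> v3 -> v4.
ABSTRACTION_PATH = [
    (True, True, True),     # v1: green, flower-bearing tree (the concretum)
    (True, False, True),    # v2: green and flower-bearing, not a tree
    (True, False, False),   # v3: green only
    (False, False, False),  # v4: outside all three spheres
]


def edges(path):
    """Return the consecutive vertex pairs of the path as labelled edges."""
    labels = [region_label(m) for m in path]
    return list(zip(labels, labels[1:]))


if __name__ == "__main__":
    print(ALL_REGIONS)
    for source, target in edges(ABSTRACTION_PATH):
        print(f"{source} -> {target}")
```

Reversing `ABSTRACTION_PATH` before calling `edges` gives the opposite reading of the same graph, that is, concretion instead of abstraction.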

b. Doctrine of Judgments

This section mainly focuses on BL, 260–293. Even though Schopenhauer had already used logic diagrams in his doctrine of concepts (see above, Sect. 2.a), he explicitly introduced them in his doctrine of judgment, making reference to Euler and others. Nevertheless, in some cases Schopenhauer’s logic diagrams are fundamentally different from Euler diagrams, so in the following, the first subsection defines the expression (i) ‘Schopenhauer diagrams’ or ‘relational diagrams’. Then subsection (ii) outlines how Schopenhauer applies these diagrams to Stoic logic and how they relate to oppositional geometry. Finally, subsection (iii) discusses Schopenhauer’s theory of conversion, his use of the term metalogic, and subsection (iv) discusses his diagrammatic interpretation of the analytic-synthetic distinction.

i. Relational Diagrams

The essential feature of Schopenhauer’s Logica Maior is that, for the most part, it is based on a diagrammatic representation. Schopenhauer learned the function and application of logic diagrams, at the latest, in Gottlob Ernst Schulze’s lectures. This is known because, although Schulze did not publish any diagrams in his textbook, Schopenhauer drew Euler diagrams and made references to Leonhard Euler in his notes on Schulze’s lectures (d’Alfonso 2018). Thus, as early as 1819, Schopenhauer published a logic of concepts based on circle diagrams in W I, § 9 (see below, Sect. 3.b) that he worked through in the Logica Maior of the Berlin Lectures (BL, 272 et seqq.).

‘Diagrammatic representation’ and ‘logic diagrams’ are modern expressions for what Schopenhauer called ‘visual representation’ or ‘schemata’. Schopenhauer’s basic insight is that the relations of concepts in judgments are analogous to the circular lines in Euclidean space. One, therefore, only has to go through all possible circular relations and examine them according to their analogy to concept relations in order to obtain the basic forms of judgment on which all further logic is built. With critical reference to Kant, Schopenhauer calls his diagrammatic doctrine of judgment a ‘guide of schemata’ (Leitfaden der Schemata). As the following diagrams are intended to represent the basic relations of all judgments, they can also be called ‘relation diagrams’ (RD) as per Fig. 4.

Fig. 4.1 (RD1)

All R is all C.
All C is all R.

 

Fig. 4.2 (RD2)

All B is A.
Some A is B.
Nothing that is not A is B.
If B then A.

 

Fig. 4.3 (RD3)

No A is S.
No S is A.
Everything that is S is not A.
Everything that is A is not S.

 

Fig. 4.4 (RD4)

All A is C.
All S is C.
Nothing that is not C is A.
Nothing that is not C is S.

 

Fig. 4.5 (RD5)

Some R is F.
Some F is R.
Some R is not F.
Some F is not R.

 

Fig. 4.6 (RD6)

All B is either o or i.

All six RDs form the basis on which to build all logic, that is, both Aristotelian and Stoic logic. Schopenhauer states that geometric forms were first used by Euler, Johann Heinrich Lambert, and Gottfried Ploucquet to represent the four categorical propositions of Aristotelian syllogistics: All x are y (RD2), Some x are y (RD5), No x are y (RD3) and Some x are not y (RD5). These three diagrams, together with RD1, result in the relations that Joseph D. Gergonne described almost simultaneously in his famous treatise of 1817 (Moktefi 2020). RD4 may have been inspired by Kant and Karl Christian Friedrich Krause, although there are clear differences in interpretation here. However, Fig. 4.6 is probably Schopenhauer’s own invention even though there were many precursors to these RDs prior to and during the early modern period that Schopenhauer did not know about. On account of the various influences, it might be better to speak of ‘Schopenhauer diagrams’ or ‘relational diagrams’ rather than of ‘Euler diagrams’ or ‘Gergonne relations’ and so forth.
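One way to see how the two-sphere diagrams carry the four categorical propositions is to model each sphere as a finite set. The following Python sketch does this under stated assumptions: the function names are hypothetical, spheres are assumed to be non-empty, and only RD1, RD2, RD3, and RD5 are covered, since RD4 and RD6 involve a third sphere.

```python
def all_are(x, y):
    """'All x are y': the sphere of x lies inside the sphere of y."""
    return x <= y


def some_are(x, y):
    """'Some x are y': the two spheres overlap."""
    return bool(x & y)


def no_are(x, y):
    """'No x are y': the two spheres are disjoint."""
    return not (x & y)


def some_are_not(x, y):
    """'Some x are not y': part of x lies outside y."""
    return bool(x - y)


def classify(a, b):
    """Return the basic relation that two non-empty spheres instantiate."""
    if a == b:
        return "RD1"  # identical spheres
    if a < b or b < a:
        return "RD2"  # one sphere properly contains the other
    if not (a & b):
        return "RD3"  # separated spheres
    return "RD5"      # partially overlapping spheres


if __name__ == "__main__":
    r, f = {1, 2}, {2, 3}
    print(classify(r, f), some_are(r, f), some_are_not(r, f))
```

The sketch also shows the ambiguity discussed in the text: a single RD5 configuration makes both "Some x are y" and "Some x are not y" true at once.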

Schopenhauer shows how each RD can express more than just one aspect of information. This ambiguity can be evaluated in different ways. In contemporary formal approaches, the ambiguity of logic diagrams is often considered a deficiency. In contrast, Schopenhauer considers this ambiguity more an advantage than a deficiency as only a few circles in one diagram can represent a multitude of complex linguistic expressions. In this way, Schopenhauer can be seen as a precursor of contemporary theories about the so-called ‘observational advantages’ of diagrams. As meaning only arises through use and context (see above) and as axioms can never be the starting point of scientific knowledge (see above), the ambiguity of logic diagrams is no problem for Schopenhauer. For him, a formal system of logic is unnecessary. He wanted to analyze the ordinary and natural language with the help of diagrams.

ii. Stoic Logic and Oppositional Geometry

Nowadays, it is known that the relation diagrams described above can be transformed, under the definition of an arbitrary Boolean algebra, into diagrams showing the relations contrariety, contradiction, subcontrariety, and subalternation. The best-known of these diagrams, which are now gathered under the heading of ‘oppositional geometry’, is the square of opposition. Although no square of opposition has yet been found in Schopenhauer’s manuscripts, he did associate some of his RDs with the above-mentioned relations and in doing so also referred to “illustrations” (BL, 280, 287) that are no longer preserved in the manuscripts.

Schopenhauer went beyond Aristotelian logic with RD2 and RD6 and also attempted to represent Stoic logic with them, which in turn can be understood as a precursor of propositional logic (BL, 278–286). RD2 expresses hypothetical judgments (if …, then …), RD6 disjunctive judgments (either … or …). In particular, researchers have studied the RD6 diagrams, also called ‘partition diagrams’, more intensively. For Schopenhauer, the RDs for Stoic logic are similar to syllogistic diagrams. However, quantification does not initially play a major role here, as the diagrams are primarily intended to express transitivity (hypothetical judgments) or difference (disjunction). Only in his second step does Schopenhauer add quantification to the diagrams again (BL, 287 et seqq.). In this context, Schopenhauer treats the theory of oppositions on several pages (BL, 284–289); however, he merely indicates that the diagrammatic representation of oppositions would have to be further elaborated.

The basic RD6 in Fig. 4.6 shows a simple contradiction between the concepts o and i. However, as the RDs given above are only basic diagrams, they can be extended according to their construction principles. Thus, there is also a kind of compositional approach in Schopenhauer’s work. For example, one can imagine that a circle, such as that given in RD6, is not separated by one line but two, making each compartment a part of the circle and excluding all others. An example of this can be seen in Fig. 5, alongside its corresponding opposition diagram, a so-called ‘strong JSB hexagon’ (Lemanski and Demey 2021).

Figure 5: Partition diagram and Logical Hexagon (Aggregatzustand = state of matter, fester = solid, flüßiger = liquid, elastischer = elastic)

An example of a more complex Eulerian diagram of exclusive disjunctions used by Schopenhauer is illustrated in Fig. 6, which depicts Animalia, Vertebrata, Mammals, Birds, Reptiles, Pisces, Mollusca, ArtiCulata, and RaDiata. These terms are included as species in genera and are mutually exclusive. While the transformation into the form of oppositional geometry is found in Lemanski and Demey (2021), Fig. 6 expresses Schopenhauer’s judgments such as:

If something is A, it is either V or I.

If something is V, it is either M or B or R or P.

If something is A, but not V, it is either M or C or D.

Fig. 6: Schopenhauer’s Animalia-Diagram

Schopenhauer here notes that the transition between the logic of concepts, judgments, and inferences is fluid. The partition diagrams only show concepts or classes, but judgments can be read through their relation to each other, that is, in a combination of RD2 and RD3. However, as the relation of three concepts to each other can already be understood as inference, the class logic is already, in most cases, a logic of inferences. For example, the last judgment mentioned above could also be understood as enthymemic inference (BL, 281):

Something is A and not V. (If A and not V, then M or C or D.) Thus, it is either M or C or D.
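The way judgments and inferences are read off a partition diagram can be sketched in Python. The data structure and function names below are illustrative assumptions. Full names are used instead of the single-letter abbreviations, and the label "Invertebrata" for the class abbreviated as I is likewise an assumption made for this sketch.

```python
# Hypothetical encoding of Fig. 6: each genus maps to its mutually exclusive species.
PARTITION = {
    "Animalia": ["Vertebrata", "Invertebrata"],
    "Vertebrata": ["Mammals", "Birds", "Reptiles", "Pisces"],
    "Invertebrata": ["Mollusca", "Articulata", "Radiata"],
}


def leaves(genus):
    """Return the lowest species contained in a genus."""
    parts = PARTITION.get(genus)
    if parts is None:
        return [genus]
    result = []
    for part in parts:
        result.extend(leaves(part))
    return result


def disjunctive_judgment(genus):
    """Read a disjunctive judgment off one partitioned sphere."""
    return f"If something is {genus}, it is either " + " or ".join(PARTITION[genus]) + "."


def remaining_species(genus, excluded):
    """Species still possible for a member of `genus` once `excluded` is ruled out."""
    ruled_out = set(leaves(excluded))
    return [species for species in leaves(genus) if species not in ruled_out]


if __name__ == "__main__":
    print(disjunctive_judgment("Animalia"))
    print(remaining_species("Animalia", "Vertebrata"))
```

The call `remaining_species("Animalia", "Vertebrata")` mirrors the enthymemic inference: the suppressed premise is already contained in the partition, so only the conclusion needs to be read off.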

Schopenhauer’s partition diagrams have been adopted and applied in mathematics, especially by Adolph Diesterweg (compare Lemanski 2022b).

iii. Conversion and Metalogic

In his doctrine of judgments, Schopenhauer also covers all forms of conversion and laws of thought, in which he partly uses RDs, but partly also an equality notation (=) inspired by 18th-century Wolffians. The notation for the conversio simpliciter given in Fig. 4.5 is a convenient example of the doctrine of conversion:

universal negative: No A = B. No B = A.

particular affirmative: Some A = B. Some B = A.  (BL, 293).

Following this example, Schopenhauer demonstrates all the rules of the traditional doctrine of conversion. The equality notation is astonishing as it comes close to a form of algebraic logic that was later developed by Drobisch and others (Heinemann 2020).
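Why simple conversion works for the universal negative and the particular affirmative, but not for the other two categorical forms, can be checked mechanically on small set models. The Python sketch below is a brute-force illustration under assumptions introduced here (non-empty spheres over a three-element universe, hypothetical function names); it is not a reconstruction of Schopenhauer's own procedure.

```python
from itertools import combinations


def subsets(universe):
    """Return every subset of a finite universe as a frozenset."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


# The four categorical forms, read as relations between two spheres.
FORMS = {
    "universal affirmative": lambda a, b: a <= b,         # All A is B
    "universal negative": lambda a, b: not (a & b),       # No A is B
    "particular affirmative": lambda a, b: bool(a & b),   # Some A is B
    "particular negative": lambda a, b: bool(a - b),      # Some A is not B
}


def converts_simply(form, universe=range(3)):
    """Check whether swapping subject and predicate never changes the truth value."""
    holds = FORMS[form]
    spheres = [s for s in subsets(universe) if s]  # non-empty spheres only
    return all(holds(a, b) == holds(b, a) for a in spheres for b in spheres)


if __name__ == "__main__":
    for name in FORMS:
        print(name, converts_simply(name))
```

A `False` result is conclusive, because a single counterexample suffices. A `True` result on a small universe is only suggestive, although here it reflects the general fact that overlap and disjointness are symmetric relations.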

Furthermore, the first three laws of thought (BL, 262 et seqq.) correspond to the algebraic logic of the late 19th century, namely the:

(A) law of identity: A = A,

(B) law of contradiction: A = -A = 0,

(C) law of excluded middle: A aut = b, aut = non b.

(D) law of sufficient reason: divided into (1) the ground of becoming (Werdegrund), (2) the ground of cognition (Erkenntnisgrund), (3) the ground of being, and (4) the ground of action (Handlungsgrund).

Only the second class of the law of sufficient reason relates to logic. This ground of cognition (Erkenntnisgrund) is then divided into four further parts, which, together, form a complex truth theory. Schopenhauer distinguishes between (1) logical truth, (2) empirical truth, (3) metaphysical truth, and (4) metalogical truth. The last form is of particular interest (Béziau 2020). Metalogical truth is a reflection on the four classes of the principle of sufficient reason mentioned above. A judgment can be true if the content it expresses is in harmony with one or more of the listed laws of thought. Although some parts of modern logic have broken with these basic laws, Schopenhauer is the first logician to describe the discipline entitled “metalogy” in a similar way to Nicolai A. Vasiliev, Jan Łukasiewicz, and Alfred Tarski.

iv. Analytic-Synthetic Distinction and the Metaphor of the Concept

Another peculiarity of Schopenhauer’s doctrine of judgments is the portrayal of analytic and synthetic judgments. In Kant research, the definition of analytic and synthetic judgments has been regarded as problematic and highly worthy of discussion since Willard Van Orman Quine’s time—at the latest. This is particularly because Kant, as Quine and some of his predecessors have emphasized, used the unclear metaphors of “containment,” that is, “enthalten” (Critique of Pure Reason, Intr. IV) and “actually thinking in something,” that is “in etw. gedacht werden” (Prolegomena, §2b) to define what analytic and synthetic judgments are. In the section of the Berlin Lectures on cognition, Schopenhauer introduces the distinction between analytic and synthetic judgments as follows:

A distinction is made in judgment, more precisely, in the proposition, subject, and predicate, that is, between what something is said about, and what is said about it. Both concepts. Then the copula. Now the proposition is either mere subdivision (analysis) or addition (synthesis); which depends on whether the predicate was already thought of in the subject of the proposition, or is to be added only in consequence of the proposition. In the first case, the judgment is analytic, in the second synthetic.

All definitions are analytic judgments:

For example,

gold is yellow: analytic
gold is heavy: analytic
gold is ductile: analytic
gold is a chemically simple substance: synthetic (BL, 123)

Here, Schopenhauer initially adheres strictly to the expression of ‘actually thinking through something’ (‘mitdenken’ that is, analytically) or ‘adding something’ (‘hinzudenken’ that is, synthetically). However, he explains in detail that the distinction between the two forms of judgment is relative as it often depends on the knowledge and experience of the person making the judgment. An expert will, for example, classify many judgments from his field of knowledge as analytic, while other people would consider them to be synthetic. This is because the expert knows more about the characteristics of a subject than someone who has never learned about these things. In this respect, Schopenhauer is an advocate of ontological relativism. However, in the sense of transcendental philosophy, he suggests that every area of knowledge must have analytic judgments that are also a priori. For example, according to Kant, judgments such as “All bodies are extended” are analytic.

Even more interesting than these explanations taken from the doctrine of cognition (BL, 122–127) is the fact that Schopenhauer takes up the theory of analytic and synthetic judgments again in the Logica Maior (BL, 270 et seqq.). Here, Schopenhauer explains what the expression of ‘actually thinking through something’ (‘mitdenken’), which he borrowed from Kant, means. ‘Actually thinking in something’ can be translated with the metaphor of ‘containment’, and these expressions are linguistic representations of logic diagrams or RDs. To understand this more precisely, one must once again refer to Schopenhauer’s doctrine of concept (BL, 257 et seqq.). For Schopenhauer, there is no such thing as a ‘concept of the concept’. Rather, the concept itself is a metaphor that refers to containment. According to Schopenhauer, this is already evident in the etymology of the expression ‘concept’, which illustrates that something is being contained: horizein (Greek), concipere (Latin), begreifen (German). Concepts conceive of something linguistically, just as a hand grasps a stone. For this reason, the concept itself is not a concept, but a metaphor, and RDs are the only adequate means for representing the metaphor of the concept (Lemanski 2021b, chap. 2.2).

If one says that the concept ‘gold’ includes the concept ‘yellow’, one can also say that ‘gold’ is contained in ‘yellow’ (BL, 270 et seqq.). Both expressions are transfers from concrete representation into abstract representation, that is, from intuition into language. To explain this intuitive representation, one must use an RD2 (Fig. 4.2) such as is given in Fig. 7 (BL, 270):

c. Doctrine of Inferences

This section mainly deals with BL, 293–356. As one can see from the page references, the doctrine of inferences is the longest section of the Logica Maior in the Berlin Lectures. Herein, Schopenhauer (i) presents an original thesis for the foundation of logic and (ii) develops an archaic Aristotelian system of inferences, (iii) whose validity he sees as confirmed by the criterion of naturalness. In all three areas, logic diagrams or RDs—this time following mainly Euler’s intention—play a central role.

i. Foundations of Logic

Similar to the Cartesians, Schopenhauer claims that logical reasoning is innate in man by nature. Thus, the only purpose of academic logic is to make explicit what everyone implicitly masters. In this respect, the proof of inferential validity can only be a secondary topic in logic. In other words, logic is primarily a doctrine of judgment, not a doctrine of inference. Schopenhauer sums this up by saying that nobody who thinks for himself and intends to think correctly is able to draw invalid inferences without realizing it (BL, 344). For him, such seriously produced invalid inferences are a great rarity (in ‘monological thinking’), but false judgments are very common. Furthermore, learning logic does not protect against false judgments.

Schopenhauer, therefore, does not consider proving inferences to be the main task of logic; rather, logic should help one formulate judgments and correctly grasp conceptual relations. However, when it comes to proof, intuition plays an important role. Schopenhauer takes up an old skeptical argument in his doctrine of judgments and inference that problematizes the foundations of logic: (1) Conclusions arrived at by deduction are only explicative, not ampliative, and (2) deductions cannot be justified by deductions. Thus, no science can be thoroughly provable, no more than a building can hover in the air, he says (BL, 527).

Schopenhauer demonstrates this problem by referring to traditional proof theories. In syllogistics, for example, non-perfect inferences are reduced to perfect ones, more precisely, the so-called modus Barbara and the modus Celarent. Yet, why are the modes Barbara and Celarent considered perfect? Aristotle, for example, justifies this with the dictum de omni et nullo, while both Kantians and skeptics, such as Schopenhauer’s logic teacher Schulze, justify the perfection of Barbara and Celarent as well as the validity of the dictum de omni et nullo with the principle nota notae est nota rei ipsius. However, Schopenhauer goes one step further and explains that all discursive principles fail as the foundations of science because an abstract representation (such as a principle, axiom, or grounding) cannot be the foundation for one of the faculties of abstract representation (logic, for example). If one, nevertheless, wants to claim such a foundation, one inevitably runs into a regressive, a dogmatic, or a circular argument (BL, 272).

For this reason, Schopenhauer withdraws a step in the foundation of logic and offers a new solution that he repeats later as the foundation of geometry: Abstract representations are grounded on concrete representations, as abstract representations are themselves “representations of representations” (see above, Sect. 2.a.ii). The concrete representation is a posteriori or a priori intuition and both forms can be represented by RDs or logic diagrams. The abstract representation of logic is thus justified by the concrete representation of intuition, and the structures of intuition correspond to the structures of logic. For Schopenhauer, this argument can be proven directly using spatial logic diagrams (see above, Sect. 2.b.ii).

The validity of an inference can, thus, be shown in concreto, while most abstract proofs illustrated using algebraic notations are not convincing. As Schopenhauer demonstrates in his chapters on mathematics, abstract-discursive proofs are not false or useless for certain purposes, but they cannot achieve what philosophers, logicians, and mathematicians aim to achieve when they ask about the foundations of rational thinking (compare Lemanski 2021b, chap. 2.3). This argument can also be understood as part of Schopenhauer’s reism or concretism (see above, Sect. 2.a.ii).

ii. Logical Aristotelianism and Stoicism

As described above, Schopenhauer’s focus is not on proving the validity of inferences, but on the question of which logical systems are simpler, more efficient, and, above all, more natural. Although he always uses medieval mnemonics, he explains that the scholastic system attributes only a name-giving, not a proof-giving, function to inferences. On the one hand, he is arguing against Galen and many representatives of Arabic logic when he claims that the fourth figure in syllogistics has no original function. On the other hand, he is also of the opinion that Kant overstepped the mark by criticizing all figures except the first one. The result of this detailed critique, which he carried out on all 19 valid modes and for all syllogistic figures, is proof of the validity of the archaic Aristotelian Organon. Therefore, Schopenhauer claims that Aristotle is right when he establishes three figures in syllogistics and that he is also right when it comes to establishing all general and special rules. The only innovation that Schopenhauer accepts in this respect is that logic diagrams show the abstract rules and differences between the three figures concretely and intuitively.

According to Schopenhauer, a syllogistic inference is the realization of the relationship between two concepts formerly understood through the relationship of a third concept to each of them (BL, 296). Following the traditional doctrine, Schopenhauer divides the three terms into mAjor, mInor, and mEdius. He presents the 19 valid syllogisms as follows (BL, 304–321):

1st Figure

Barbara

All E is A, all I is E, thus all I is A.

Celarent

No E is A, all I is E, thus no I is A.

Darii

All E is A, some I is E, thus some I is A.

Ferio

No E is A, some I is E, thus some I is not A.

2nd Figure

Cesare

No A is E, all I is E, thus no I is A.

Camestres

All A is E, no I is E, thus no I is A.

Festino

No A is E, some I is E, thus some I is not A.

Baroco

All A is E, some I is not E, thus some I is not A.

3rd Figure

Darapti

All E is A, all E is I, thus some I is A.

Felapton

No E is A, all E is I, thus some I is not A.

Disamis

Some E is A, all E is I, thus some I is A.

Datisi

All E is A, some E is I, thus some I is A.

Bocardo

Some E is not A, all E is I, thus some I is not A.

Ferison

No E is A, some E is I, thus some I is not A.

 

4th Figure ≈ 1st Figure

Fesapo

No A is E, all E is I, thus some I is not A.

Dimatis

Some A is E, all E is I, thus some I is A.

Calemes

All A is E, no E is I, thus no I is A.

Bamalip

All A is E, all E is I, thus some I is A.

Fresison

No A is E, some E is I, thus some I is not A.
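The list above can be checked mechanically. The following is a minimal sketch in Python, not Schopenhauer's diagrammatic method: it assumes a modern extensional reading in which each term (A, I, E) is a set, and it assumes that every term is non-empty (existential import), as traditional syllogistics does. The names `FORMS`, `FIGURES`, `MODES`, `models`, and `is_valid` are illustrative. Because the four judgment forms depend only on which regions of a three-circle diagram are occupied, it suffices to try every combination of occupied and empty regions.

```python
from itertools import product

# The four judgment forms, read extensionally over sets (a modern reading).
FORMS = {
    "a": lambda s, p: s <= p,        # All S is P
    "e": lambda s, p: not (s & p),   # No S is P
    "i": lambda s, p: bool(s & p),   # Some S is P
    "o": lambda s, p: bool(s - p),   # Some S is not P
}

# (subject, predicate) of the major and the minor premise in each figure,
# using the section's letters: A = major, I = minor, E = middle term.
FIGURES = {
    1: (("E", "A"), ("I", "E")),
    2: (("A", "E"), ("I", "E")),
    3: (("E", "A"), ("E", "I")),
    4: (("A", "E"), ("E", "I")),
}

MODES = {
    1: ["Barbara", "Celarent", "Darii", "Ferio"],
    2: ["Cesare", "Camestres", "Festino", "Baroco"],
    3: ["Darapti", "Felapton", "Disamis", "Datisi", "Bocardo", "Ferison"],
    4: ["Fesapo", "Dimatis", "Calemes", "Bamalip", "Fresison"],
}

# One candidate individual per region of a three-circle diagram:
# a triple of truth values for membership in (A, I, E).
REGIONS = list(product([False, True], repeat=3))


def models(existential_import=True):
    """Yield every way of leaving regions empty or occupied."""
    for mask in product([False, True], repeat=len(REGIONS)):
        occupied = [r for r, keep in zip(REGIONS, mask) if keep]
        terms = {
            name: frozenset(r for r in occupied if r[idx])
            for idx, name in enumerate("AIE")
        }
        if existential_import and not all(terms.values()):
            continue  # skip models in which some term is empty
        yield terms


def is_valid(mode, figure, existential_import=True):
    """True if no model makes both premises true and the conclusion false."""
    # The first three vowels of the mnemonic name encode the judgment forms.
    major, minor, conclusion = [c for c in mode.lower() if c in "aeio"][:3]
    (s1, p1), (s2, p2) = FIGURES[figure]
    for t in models(existential_import):
        premises_hold = FORMS[major](t[s1], t[p1]) and FORMS[minor](t[s2], t[p2])
        if premises_hold and not FORMS[conclusion](t["I"], t["A"]):
            return False
    return True


if __name__ == "__main__":
    for figure, names in MODES.items():
        for name in names:
            print(figure, name, is_valid(name, figure))
```

Under these assumptions all 19 modes come out valid. If empty terms are admitted (`existential_import=False`), Darapti, Felapton, Fesapo, and Bamalip no longer pass, which shows how much the traditional list depends on the assumption that every term has instances.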

Remarkably, Schopenhauer transfers the method of dotted lines from Lambert’s line diagrams to his Euler-inspired RD3 (Moktefi 2020). These dotted lines, as in the case of Bocardo, are used to indicate the ambiguity of a judgment. Nevertheless, whether Schopenhauer applies this method consistently is a controversial issue (compare BL, 563 et seqq.).

In addition to Aristotelian syllogistics, Schopenhauer also discusses Stoic logic (BL, 333–339). However, Schopenhauer does not use diagrams in this discussion. He justifies this decision by saying that, here, one is dealing with already finished judgments rather than with concepts. Yet, this seems strange as, at this point in the text, Schopenhauer had already used diagrams in his discussion of the doctrine of judgment, which also represented inferences of Stoic logic. However, as the method was not yet well developed, it can be assumed that Schopenhauer failed to represent the entire Stoic logic with the help of RDs. Instead, in the chapter on Stoic logic, one finds a characterization of the modus ponendo ponens and the modus tollendo tollens (hypothetical inferences), as well as the modus ponendo tollens and the modus tollendo ponens (disjunctive inferences). In addition, he focuses more intensively on dilemmas.
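The four inference modes named above can be illustrated with a truth-table check. This is a sketch in modern propositional terms, not a reconstruction of Schopenhauer's own presentation. It assumes that the hypothetical premise is a material conditional and that the disjunctive premise is an exclusive ‘either … or’; the helper names `implies`, `either_or`, `valid`, and `STOIC_MODES` are illustrative.

```python
from itertools import product


def implies(p, q):
    """Material conditional: 'if p then q'."""
    return (not p) or q


def either_or(p, q):
    """Exclusive reading of the disjunction (an assumption of this sketch)."""
    return p != q


def valid(premises, conclusion):
    """Check a two-variable inference against all four truth assignments."""
    return all(
        conclusion(p, q)
        for p, q in product([False, True], repeat=2)
        if all(premise(p, q) for premise in premises)
    )


# Each entry: (list of premises, conclusion), all as functions of (p, q).
STOIC_MODES = {
    "modus ponendo ponens": ([implies, lambda p, q: p], lambda p, q: q),
    "modus tollendo tollens": ([implies, lambda p, q: not q], lambda p, q: not p),
    "modus ponendo tollens": ([either_or, lambda p, q: p], lambda p, q: not q),
    "modus tollendo ponens": ([either_or, lambda p, q: not p], lambda p, q: q),
}

if __name__ == "__main__":
    for name, (premises, conclusion) in STOIC_MODES.items():
        print(name, valid(premises, conclusion))
```

The choice of an exclusive disjunction matters: with an inclusive ‘or’, the modus ponendo tollens would fail, since both disjuncts could be true together.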

iii. Naturalness in Logic

One of the main topics in the doctrine of inferences is the naturalness of logic. For Schopenhauer, there are artificial logics, such as the mnemonics of scholastic logic or the mathematical demand for axiomatics, but there are also natural logics in certain degrees. Schopenhauer agrees with Kant that the first figure of Aristotelian syllogistics is the most natural one, “in that every thought can take its form” (BL, 302). Thus, the first figure is the “simplest and most essential rational operation” (ibid.) and most people unconsciously use one of the modes of the first figure for logical reasoning every day. In contrast to Kant, however, Schopenhauer does not conclude that all other figures are superfluous. For in order to make it clear that one wants to express a certain thought, one rightly falls back on the second and third figures.

To determine the naturalness of the first three figures, Schopenhauer examines the function of the inferences in everyday reasoning and, thus, asks what thought they express. Similar to Lambert, Schopenhauer states that we use the first figure to identify characteristics or decisions. We use the second figure if we want to make a difference explicit (BL, 309), while the third figure is used to express or prove a paradox, anomaly, or exception. Schopenhauer gives each of the three figures its own name according to the thought operation expressed with the figure: the first figure is the “Handhabe” (manipulator), the second the “Scheidewand” (septum), and the third the “Anzeiger” (indicator) (BL, 316). As it is natural for humans to make such thought operations explicit, the first three figures are also part of a natural logic. Schopenhauer also explains that each of these three figures has its own enthymemic form and that the function of the medius differs with each figure (BL, 329).

However, Schopenhauer argues intently against the fourth figure, which was introduced by Galen and then made public by Arabic logicians. It has no original function and is only the reversal of the first figure, that is to say, it does not indicate a decision itself, only evidence of a decision. Moreover, the fourth figure does not correspond to the natural grammatical structure through which people usually express their daily life. It is more natural when speakers put the narrower term in the place of the subject and the broader one in the place of the predicate. Although the terms can be reversed, which turns the first figure into the fourth, this reversal is unnatural. For example, it is more natural to say “No Bashire is a Christian” than to say “No Christian is a Bashire” (BL, 322).

In the chapter on Stoic logic, the intense discussion of naturalness is lost, yet Schopenhauer points out here and elsewhere that there are certain forms of propositional logic that appear natural in the sciences and everyday language. Mathematicians, for example, tend to use the modus tollendo ponens in proof techniques, even though this technique is prone to error, as the tertium non datur does not apply universally (BL, 337, 512f.). As a result of such theses, Schopenhauer is often associated with intuitionism and systems of natural deduction (compare Schueler et al. 2020; Koetsier 2005; Belle 2021).

d. Further Topics of Analytic

In addition to the areas mentioned thus far, the BL offer many other topics and arguments that should be of interest to many, not only researchers of the history and philosophy of logic. The major topics include, for example, a treatise on the Aristotelian rules, reasons, and principles of logic (BL, 323–331), a treatise on sorites (BL, 331–333), a treatise on modal logic (BL, 339–340), a further chapter on enthymemes (BL, 341–343), and a chapter on sophisms and false inferences (BL, 343–356).

In the following sections, Schopenhauer’s views on (i) the history and development of logic, (ii) the parallels between logic and mathematics, and (iii) hermeneutics are discussed. As the chapter on sophisms and so forth is also used in dialectics, it is presented in Sect. 2.e.

i. Schopenhauer’s History of Logic

A history of logic in the narrower sense cannot be found in Schopenhauer’s treatise on logic in general (BL, 356 and what follows). However, Schopenhauer discusses the origin and development of Aristotelian logic in a longer passage on the question raised by logical algebra in the mid-18th century—and then prominently denied by Kant: Has there been any progress in logic since Aristotle?

Naturally, as an Aristotelian and Kantian, Schopenhauer answers this question in the negative but admits that there have been “additions and improvements” to logic. Schopenhauer argues that Aristotle wrote the first “scientific logic”, but admits that there were earlier logical systems and claims that Aristotle united the attempts of his precursors into one scientific system. Schopenhauer also suggests that there may have been an early exchange between Indian and Greek logic.

The additions and improvements to Aristotelian logic concern a total of five points (Pluder 2020), some of which have already been mentioned above: (1) The discussion of the laws of thought; (2) the scholastic mnemonic technique; (3) the propositional logic; (4) Schopenhauer’s own treatise on the relation between intuition and concept; and (5) the fourth figure, introduced by Galen. Schopenhauer considers some of these additions to be improvements (1, 3, 4) and considers others to be deteriorations (2 and especially 5). It seems strange that Schopenhauer does not refer to the use of spatial logic diagrams once again (BL, 270).

ii. Logic and Mathematics

Another extensive chapter of the BL, which is closely related to logic, discusses mathematics. This is no surprise, as Schopenhauer spent a semester studying mathematics with Bernhard Friedrich Thibaut in Göttingen and systematically worked through the textbooks by Franz Ferdinand Schweins, among others (Lemanski 2022b). As already discussed above, one advantage of the BL is that Schopenhauer took W I as a basis, expanded parts of it considerably, and incorporated into it some essential topics from his supplementary works. Thus, before the treatise on mathematics, one finds a detailed presentation of the fourfold root of the principle of sufficient reason, which Schopenhauer covered in his dissertation.

Schopenhauer’s representation of mathematics concentrates primarily on geometry. His main thesis is that abstract-algebraic proofs are possible in geometry but that, as in logic, they lead to a circulus vitiosus, a dogma, or an infinite regress as soon as one tries to prove their foundation (see above, Sect. 2.c.i). Therefore, as in logic, Schopenhauer argues that abstract proofs should be dispensed with and that concrete-intuitive diagrams and figures should be regarded as the ultimate justification of proofs instead. Thus, he argues that feeling (Gefühl) is an important element, even possibly the basis, of proofs for geometry and logic (Follessa 2020). However, this feeling remains intersubjectively verifiable with the help of logic diagrams and geometric figures.

Schopenhauer discusses the main thesis of the text, in particular, in connection with the Euclidean system in which one finds both kinds of justification: discursive-abstract proofs, constructed with the help of axioms, postulates, and so forth, and concrete-intuitive proofs, constructed with the help of figures and diagrams. Similar to some historians of mathematics in the 20th century and some analytic philosophers in the 21st century, Schopenhauer believed that Euclid was seduced by rationalists into establishing an axiomatic-discursive system of geometry, although the validity of the propositions and problems was sufficiently justified by the possibility of concrete-intuitive representation (Béziau 1993).

Schopenhauer goes so far as to attribute Euclid’s axiomatic system to dialectic and persuasion. With his axiomatic system, Euclid could only show that something is like that (knowing-that), while the visual system can also show why something is like that (knowing-why). Schopenhauer demonstrates this in the BL with reference to Euclid’s Elements I 6, I 16, I 47, and VI 31. He develops his own picture proof for Pythagoras’s theorem (Bevan 2020), though he then corrects it over the years (Costanzo 2020). Given the probative power of the figures in geometry, there are clear parallels to the function of Schopenhauer diagrams in logic. Schopenhauer can, therefore, be regarded as an early representative of “diagrammatic proofs” and “visual reasoning” in mathematics.

Schopenhauer’s mathematics has been evaluated very differently in its two-hundred-year history of reception (Segala 2020, Lemanski 2021b, chap. 2.3). While Schopenhauer’s philosophy of geometry was received very positively until the middle of the 19th century, the Weierstrass School marks the beginning of a long period in which Schopenhauer’s approach was labeled a naive form of philosophy of mathematics. It was only with the advent of the so-called ‘proof without words’ movement and the rise of the so-called spatial or visual turn in the 1990s that Schopenhauer became interesting within the philosophy of mathematics once again (Costanzo 2020, Bevan 2020, Lemanski 2022b).

iii. Hermeneutics

The exploration and analysis of hermeneutics in Schopenhauer’s work are also closely related to logic. This has been the subject of intense and controversial discussion in Schopenhauer research. Overall, two positions can be identified: (1) Several researchers regard either Schopenhauer’s entire philosophy or some important parts of it as ‘hermeneutics’. (2) Some researchers, however, deny that Schopenhauer can be called a hermeneuticist at all.

(1) The form of hermeneutics that researchers see in Schopenhauer, however, diverges widely. For example, various researchers speak of “world hermeneutics”, “hermeneutics of existence”, “hermeneutics of factuality”, “positivist hermeneutics”, “hermeneutics of thought”, or “hermeneutics of knowledge” (Schubbe 2010, 2018, 2020; Shapshay 2020). What all these positions have in common is that they regard the activity of interpretation and deciphering as a central activity in Schopenhauer’s philosophy.

(2) Other researchers argue, however, that Schopenhauer should not be counted among the hermeneuticists, while some even go as far as arguing that he is an “anti-hermeneutic”. The arguments of these researchers can be summarized as follows: (A1) Schopenhauer does not refer to authors of his time who are, today, called hermeneuticists. (A2) The term ‘hermeneutics’ does not actually fit philosophers of the early 19th century at all, as it was not fully developed until the 20th century. (A3) Schopenhauer is not received by modern hermeneutics.

Representatives of position (1) consider the arguments outlined in (2) to be insufficiently substantiated (ibid). From a logical point of view, argument (A2) should be met with skepticism, as the term ‘hermeneutics’ can be traced back to the second book of the Organon of Aristotle at least. Schopenhauer takes up the theory of judgment contained in the Organon again in his Logica Maior (see above, Sect. 2.b) and, in addition, explains that judgment plays a central role not only in logic but also in his entire philosophy: Every insight is expressed in true judgments, namely, in conceptual relations that have a sufficient reason. Yet, guaranteeing the truth of judgments is more difficult than forming valid inferences from them (BL, 200, 360ff).

e. Dialectic or Art of Persuasion

In addition to the analytics discussed thus far, there is also a very important chapter on (eristic) dialectics or persuasion in the BL which can be seen as an addition to § 9 of W I and as a precursor of the famous fragment entitled Eristic Dialectics. The core chapter is BL, 363–366, but the chapters on paralogisms, fallacies, and sophisms, as well as some of the preliminary remarks, also relate to dialectics (BL, 343–363), as does quite a bit of the information on analytics, such as the RDs. As in Kant, for Schopenhauer, analytic is the doctrine of being and truth, whereas dialectic is the doctrine of appearance and illusion. In analytic, a solitary thinker reflects on the valid relations between concepts or judgments; in dialectic, a proponent aims to persuade an opponent of something that is possible.

According to Schopenhauer, the information presented in the chapter on paralogisms, fallacies, and sophisms belongs to both analytics and dialectics. In the former, their invalidity is examined; in the latter, their deliberate use in disputation is examined. Schopenhauer presents six paralogisms such as homonymy and amphiboly, seven fallacies such as ignoratio elenchi and petitio principii, and seven sophisms such as cornutus and crocodilinus. In total, 20 invalid argument types are described, with examples of each and partly subdivided into subtypes.

In the core chapter on dialectics or the art of persuasion, Schopenhauer tries to reduce these invalid arguments to a single technique (Lemanski 2023). His main aim is, thus, a reductionist approach that does not even consider the linguistic subtleties of the dishonest argument but reveals the essence of the deliberate fallacy. To this end, he draws on the RDs from analytics and explains that any invalid argument that is intentionally made is based on a confusion of spheres or RDs.

In an argument, one succumbs to a disingenuous opponent when one does not consider the RDs thoroughly but only superficially. Then one may admit that two terms in a judgment indicate a community without noticing that this community is only a partial one. Instead of the actual RD5 relation between two spheres, one is led, for example, by inattention or more covertly by paralogisms, fallacies, and sophisms, to acknowledge an RD1 or, more often, an RD2. According to Schopenhauer, dialectics is based on this confusion, as almost all concepts share a partial semantic neighborhood with another concept. Thus, it can happen that one concedes more and more small-step judgments to the opponent and then suddenly arrives at a larger judgment, a conclusion, that one would not have originally accepted at all.
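The confusion of spheres described here can be made concrete by treating the spheres as sets. The following sketch is a modern illustration with invented extensions, not Schopenhauer's notation. It assumes that RD1 denotes coextensive spheres; the section itself confirms only that RD2 is containment and RD5 partial community. The function names and the sample sets are hypothetical.

```python
def sphere_relation(s, p):
    """Classify how the extension of concept s relates to that of p."""
    s, p = set(s), set(p)
    if s == p:
        return "RD1"       # assumed here: coextensive spheres
    if s < p or p < s:
        return "RD2"       # one sphere lies entirely within the other
    if s & p:
        return "RD5"       # partial community only
    return "disjoint"      # no shared members; no RD label is assumed


def is_confusion(s, p, conceded):
    """True if the relation conceded in debate differs from the actual one."""
    return conceded != sphere_relation(s, p)


# Hypothetical extensions, chosen only to illustrate the three cases.
gold = {"ring", "coin"}
yellow = {"ring", "coin", "lemon"}
country_life = {"farm", "hamlet"}
natural = {"farm", "forest"}

if __name__ == "__main__":
    print(sphere_relation(gold, yellow))             # containment
    print(sphere_relation(country_life, natural))    # partial overlap
    print(is_confusion(country_life, natural, "RD2"))
```

In this toy setting, conceding an RD2 relation between ‘country life’ and ‘natural’ counts as a confusion, because the two invented sets merely overlap.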

Schopenhauer gives several examples of this procedure from science and everyday life and also simulates this confusion of spheres by constructing fictional discussions about ethical arguments between philosophers. In doing so, Schopenhauer uses RDs several times to demonstrate which is the valid (analytic) and which is the feigned (dialectical) relation of the spheres. Then, he goes one step further. In order to demonstrate that one can start from a concept and argue just as convincingly for or against it, Schopenhauer designs large argument maps to indicate possible courses of conversation (Lemanski 2021b, Bhattarcharjee et al. 2022).

Fig. 8 shows the sphere of the concept of good (“Gut”) on the left, the sphere of the concept of bad (“Uebel”) on the right, and the concept of country life (“Landleben”) in the middle. Starting with the term in the middle, namely, ‘country life’, the diagram reflects the partial relationship of this term with the adjacent spheres. When one chooses an adjacent sphere, for example, the adjacent circle ‘natural’ (“naturgemäß”), together, these two spheres form the small-step judgment: ‘Country life is natural’. This predicate can then be combined with another adjacent sphere to form a new judgment. Moving through the circles in this way, if one at some point arrives at ‘good’, for example, and the disputant has conceded all the small-step judgments en route, one can draw the overall conclusion that ‘country life is good’. However, as one can just as effectively argue for ‘country life is bad’ via other spheres, the argument map is a visualization of dialectical relations.
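The procedure of moving from circle to circle can be sketched as a path search in a graph whose nodes are spheres and whose edges are partial overlaps. This is a minimal illustration only: apart from ‘country life’, ‘natural’, ‘good’, and ‘bad’, which the description of Fig. 8 names, the node ‘solitary’ and all edges other than the one between ‘country life’ and ‘natural’ are invented for the example, and the names `ARGUMENT_MAP`, `chain_of_judgments`, and `as_judgments` are hypothetical.

```python
from collections import deque

# Adjacency list: each sphere is linked to the spheres it partially overlaps.
# Only the link 'country life' - 'natural' is taken from the text; the rest
# is an invented miniature of the map.
ARGUMENT_MAP = {
    "country life": ["natural", "solitary"],
    "natural": ["country life", "good"],
    "solitary": ["country life", "bad"],
    "good": ["natural"],
    "bad": ["solitary"],
}


def chain_of_judgments(graph, start, goal):
    """Breadth-first search for a shortest chain of overlapping spheres."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in graph.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # the goal sphere cannot be reached


def as_judgments(path):
    """Turn a path into the small-step judgments the opponent must concede."""
    return [f"{subject} is {predicate}" for subject, predicate in zip(path, path[1:])]


if __name__ == "__main__":
    for goal in ("good", "bad"):
        print(goal, as_judgments(chain_of_judgments(ARGUMENT_MAP, "country life", goal)))
```

That the same starting sphere reaches both ‘good’ and ‘bad’ in this toy graph mirrors the point of the map: each step looks harmless, yet opposite conclusions are equally reachable.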

Schopenhauer also used such diagrams in the dialectic of W I, § 9, for example, the more famous “diagram of good and evil”, which has been interpreted as one of the first logic diagrams for n terms (Moktefi and Lemanski 2018), as a precursor of a diagrammatic fuzzy-logic (Tarrazo 2004), and as an argument map in which the RD5s are used as graphs (Bhattarcharjee et al. 2022). If one relates the dialectic of the BL to the other texts on dialectics, it can be said that this dialectic serves as a bridge between the short diagrammatic dialectic of the W I and the well-known fragment entitled Eristic Dialectics, in which the paralogisms, in particular, were elaborated.

Figure 8

 

3. Schopenhauer’s Logica Minor

Schopenhauer’s Berlin Lectures must be considered a Logica Maior due to the enormous size and complexity of their original subjects (especially in comparison to many other 19th-century writings). Nevertheless, one can also locate and collect a Logica Minor in Schopenhauer’s other writings. In the following, the most important treatises on analytic and dialectic from the other works of Schopenhauer are briefly presented. Even though the BL and the other writings have some literal similarities, the BL should remain the primary reference when assessing the various topics in the other writings.

a. Fourfold Root

The first edition of Schopenhauer’s dissertation, the Fourfold Root of the Principle of Sufficient Reason, was published in 1813 and a revised and extended edition was published in 1847. The second edition contains numerous additions that are not always regarded as improvements or helpful supplements. In the 1813 version of chapter 5, logic is addressed through the principle of sufficient reason of knowing. Schopenhauer follows a typical compositional approach in which inferences are considered compositions of judgments and judgments as compositions of concepts. The treatise in this chapter, however, is primarily concerned with the doctrine of concepts.

Although Schopenhauer points out that concepts have a sphere, there are no logic diagrams to illustrate this metaphor in the work. Schopenhauer deals mainly with the utility of concepts, the relationship between concept and intuition, and the doctrine of truth. The philosophy of mathematics and its relation to logic are discussed in chapters 3 and 8.

The discussion of the doctrine of truth is especially close to the text of the BL as Schopenhauer already distinguishes between logical, empirical, metaphysical, and metalogical truth. Although the expression “metalogica” is much older, this book uses the term ‘metalogic’ in the modern sense for the first time (Béziau 2020).

Furthermore, it can be argued that Schopenhauer presented the first complete treatise on the principle of sufficient reason in this book. While the other principles popularized by Leibniz and Wolff have found their way into today’s classical logic, that is, the principles of non-contradiction, identity, and the excluded middle, the principle of sufficient reason was considered non-formalizable and, therefore, not a basic principle of logic in the early 20th century. Newton da Costa, on the other hand, proposed a formalization that has made Schopenhauer’s laws of thought worthy of discussion again (Béziau 1992).

b. World as Will and Representation I (Chapter 9)

Chapter 9 (that is, § 9) of the W I takes up the terminology of Fourfold Root again and extends several elements of it. Schopenhauer first develops a brief philosophy of language to clarify the relationship between intuition and concept. He then introduces analytic by explaining the metaphors used in the doctrine of concepts, that is, higher-lower (buildings of concepts) and wider-narrower (spheres of concepts). Schopenhauer keeps to the metaphor of the sphere and explains that Euler, Lambert, and Ploucquet had already represented this metaphor with the help of diagrams. He draws some of the diagrams discussed above in Sect. 2.a (RD3 is missing) and explains that these are the foundation for the entire doctrine of judgments and inferences. Here, too, Schopenhauer represents a merely compositional position: judgments are connections of concepts while inferences are composed of judgments. However, in § 9, there is no concrete doctrine of judgment or inference. The principles of logic are also listed briefly in only one sentence.

Although W I makes the descriptive claim to represent all elements of the world, the logic presented here must be considered highly imperfect and incomplete. Schopenhauer explains that everyone, by nature, masters logical operations; thus, it is reserved for academic teaching alone to present logic explicitly and in detail, and this is what is done in the BL for an academic audience.

In the further course of § 9, Schopenhauer also discusses dialectics. This discussion contains an argument map similar to the one illustrated above (see above, Sect. 2.e) and also lists some artifices (“Kunstgriffe”) known from later writings, including the BL and Eristic Dialectics (ibid.). The philosophy of mathematics and its relation to logic are discussed in § 15 of the W I.

c. Eristic Dialectics

Of all the texts on Schopenhauer’s logic listed here, the manuscript produced in the early 1830s that he entitled Eristic Dialectics is the best known. It is usually presented separately from all other texts in editions that bear ambiguous titles such as The Art of (Always) Being Right or The Art of Winning an Argument. The term ‘eristics’ comes from the Greek ‘erizein’ and means ‘contest, quarrel’; it is personified in Greek mythology by the goddess Eris. Although Schopenhauer also uses the above ambiguous expressions in the text (for example, VI, 668, 675), these are primarily understood as translations of the Greek expression ‘eristiké téchne’.

Regardless of the context, the ambiguous titles suggest that Schopenhauer is here recommending that his readers use obscure techniques in order to assert themselves against other speakers. Even though there are text fragments that partially convey this normative impression, Schopenhauer’s goal is of a preventive nature: He seeks to give readers a means to recognize and call out invalid but deliberately presented arguments and, thus, to be able to defend themselves (VI, 676). Therefore, Schopenhauer is not encouraging people to violate the ethical rules of good argumentation (Lemanski 2022a); rather, he is offering an antidote to such violations (Chichi 2002, 165, 170, Hordecki 2018). However, this fragment is often interpreted normatively, and in the late 20th and early 21st centuries, it was often instrumentalized in training courses for salesmen, managers, lawyers, politicians, and so forth, as a guide to successful argumentation.

The manuscript consists of two main parts. In the first, Schopenhauer describes the relationship between analytics and dialectics (VI, 666), defines dialectics several times (Chichi 2002, 165), and outlines its history with particular reference to Aristotle (VI, 670–675). The second main part is divided into two subsections. The first subsection describes the “basis of all dialectics” and gives two basic modes (VI, 677 f.). The second subsection (VI, 678–695) consists of 38 artifices (“Kunstgriffe”), which are explained clearly with examples. These artifices, which Schopenhauer also called ‘stratagems’, can be divided into preerotematic (artifices 1–6), erotematic (7–18), and posterotematic (19–38) stratagems (compare Chichi 2002, 177).

The manuscript is unfinished and, therefore, the fragment is also referred to by Schopenhauer as a “first attempt” (VI, 676f.). According to modern research, both main parts are revisions of the Berlin Lectures, designed for independent publication: the first main part being an extension of BL 356–363, the second main part a revised version of BL 343–356. It can be assumed that Schopenhauer either wanted to add another chapter on the reduction of all stratagems to diagrams (as given in BL 363–366) or that he intended to dispense with the diagrams, as they would have presupposed knowledge of analytics. In any case, it can be assumed that Schopenhauer would have edited the fragment further before publishing it, as the manuscripts are not at the same standard as Schopenhauer’s printed works.

Despite the misuse of the fragment described above, researchers in several areas, for example in the fields of law, politics, pedagogy, ludics and artificial intelligence, are using the fragment productively (for example, Fouqueré et al. 2012, Lübbig 2020, Marciniak 2020, Hordecki 2021).

d. World as Will and Representation II (Chapters 9 and 10)

In the very first edition of W II in 1844, Schopenhauer extended the incomplete explanations of logic given in W I with his doctrines of judgment (chapter 9) and inference (chapter 10). He adopts some text passages and results of the BL, but only briefly hints at many of these topics, theses, and arguments. In comparison to the BL, chapters 9 and 10 of W II also appear to be an unsatisfactory approach to logic.

In his discussion of the doctrine of judgment, Schopenhauer pays particular attention to the function of the copula in addition to giving further explanations of the forms of judgments. In the doctrine of inference, he continues to advocate for Aristotelianism and argues against both Galen’s fourth figure and Kant’s reduction to the first figure. Furthermore, the text suggests an explanation for why Schopenhauer presents such an abbreviated representation of logic here. Schopenhauer explains in chapter 10 that RDs are a suitable technique to prove syllogisms although they are not appropriate for use in propositional logic. It seems as if Schopenhauer is going against some of the arguments of his former doctrine of diagrammatic reasoning (presented, for example, in Sect. 2.b.ii). Nevertheless, he presents this critique or skepticism almost reluctantly as an addition to W I. Although he does include some RDs, which mainly represent syllogistic inferences, in chapters 9 and 10, he also hints at a more advanced diagrammatic system based on “bars” and “hooks” several times.

However, these text passages, which point to a new diagrammatic system, remain only hints whose meaning cannot yet be grasped. Based on these obscure passages, Kewe (1907) tried to reconstruct an alternative logic system that is supposed to resemble the structure of a voltaic column, since Schopenhauer himself hinted at such a comparison at the end of chapter 10 of W II. However, Kewe’s proposal is a logically trivial, if diagrammatically very complex, interpretation that brings out little more than its disadvantages in comparison to the system of RDs.

It seems more likely that in these passages Schopenhauer has in mind a diagrammatic technique that was published in Karl Christian Friedrich Krause’s Abriss des Systemes der Logik in the late 1820s. This interpretation of W II gains plausibility from the fact that Schopenhauer was in personal contact with Krause for an extended period (Göcke 2020). However, future research must clarify whether this thesis is tenable. To date, unfortunately, no note has been identified among the manuscript remains that might illustrate the technique described in W II, chapter 10.

e. Parerga and Paralipomena II

Parerga and Paralipomena II, chapter 2 contains a treatise on “Logic and Dialectic”. Although this chapter was written in the 1850s, it is the worst treatise Schopenhauer published on logic. In just a few paragraphs, he attempts to cover topics such as truth, analytic and synthetic judgments, and proofs. The remaining paragraphs are extracts from or paraphrases of the manuscript on Eristic Dialectics or the BL. One can see from these passages that there was a clear break in Schopenhauer’s writings around the 1830s and that his late work tended to omit rational topics. Schopenhauer also explained that he was no longer interested in working on the fragment on Eristic Dialectics, as the subject showed him the wickedness of human beings and he no longer wanted to concern himself with it.

4. Research Topics

Research into Schopenhauer’s philosophy of language, logic, and mathematics is still in its infancy because, for far too long, normativists concentrated on other topics in Schopenhauer’s theory of representation, including his epistemology and, especially, his idealism. The importance of the second part of the theory of representation, namely, the theory of reason (language, knowledge, practical action), has been almost completely ignored. However, as language and logic are the media that give expression to Schopenhauer’s entire system, it can be said that one of the most important methodological and content-related parts of the system of Schopenhauer’s complete oeuvre has, historically, been largely overlooked.

The following is a rough overview of research to be done on Schopenhauer’s logic. It is shown that these writings still offer interesting topics and theses. In particular, Schopenhauer’s use of logic diagrams is likely to meet with much interest in the course of intensive research into diagrammatic and visual reasoning. Nevertheless, many special problems and general questions remain unsolved. The most important general questions concern the following points:

  1. Do we have all of Schopenhauer’s writings on logic, or are there manuscripts that have not yet been identified? In particular, the fact that Schopenhauer uses diagrams that are not discussed in the text and discusses diagrams that are not illustrated in the text suggests that Schopenhauer knew more about logic diagrams than can be gleaned from his known books and manuscripts.
  2. How great is the influence of Schopenhauer’s logic on modern logic (especially the Vienna Circle, the school of Münster, the Lwów-Warsaw school, intuitionism, metalogic, and so forth)? Schopenhauer’s Berlin Lectures were first fully published in 1913, a period that saw the intensive reception of Schopenhauer’s teachings on logic in those schools. For example, numerous researchers have been discussing Schopenhauer’s influence on Wittgenstein for decades (compare Glock 1999). One can observe an influence on modern logic in the works of Moritz Schlick, Béla Juhos, Edith Matzun, and L. E. J. Brouwer. However, this relationship has, thus far, been consistently ignored in research.
  3. What is Schopenhauer’s relationship to the pioneering logicians of his time (for example, Krause, Jakob Friedrich Fries, Carl Friedrich Bachmann, and so forth)? Previous sections have indicated that Schopenhauer’s logic may have been close to that of Krause. Bachmann, another remarkable logician of the early 19th century, was also in contact with Schopenhauer. The fact that Schopenhauer was personally influenced by Schulze’s logic is well documented. In addition, Schopenhauer knew various logic systems from the 18th and 19th centuries; however, many studies are needed to clarify these relationships.
  4. To what extent does Schopenhauer’s logic differ from the systems of his contemporaries? Many of Schopenhauer’s innovations and additions to logic have already been recognized. Yet, the question remains, to what extent does Schopenhauer’s approach to visual reasoning correspond to the Zeitgeist? At first glance, it seems obvious, for example, that Schopenhauer strongly contradicted the Leibnizian and Hegelian schools, the Hegelian schools especially, by separating logic and metaphysics from each other and emphasizing instead the kinship of logic and intuition.
  5. To what extent can Schopenhauer’s ideas about logic and logic diagrams be applied to contemporary fields of research? Schopenhauer did not design ‘a logic’ that would meet today’s standards of logic without comment, but rather a stimulating philosophy of logic and ideas about visual reasoning. Schopenhauer questioned many principles that are often widely accepted today. Moreover, he offers many diagrammatic and graphical ideas that could be developed in many modern directions. Schopenhauer’s approaches, which were interpreted as contributions to fuzzy logic, -term logic, natural logic, metalogic, ludics, graph theory and so forth also require further intensive research.
  6. How can Schopenhauer’s system (for example, that of W I) be reconstructed using logic? This question is motivated by the fact that some logical techniques have already been successfully applied to Schopenhauer’s system. For example, Matsuda (2016) has offered a precise interpretation of Schopenhauer’s world as a cellular automaton based on the so-called Rule 30 elaborated by Stephen Wolfram. In Schopenhauer’s system, logic thus has a double function: As part of the world, the discipline called logic must be analyzed as any other part of the system. However, as an instrument or organon of expression and reason, it is itself the medium through which the world and everything in it are described. This raises the question of what an interpretation of Schopenhauer’s philosophical system using his logic diagrams would look like.

5. References and Further Readings

a. Schopenhauer’s Works

  • Schopenhauer, A.: Philosophische Vorlesungen, Vol. I. Ed. by F. Mockrauer. (= Sämtliche Werke. Ed. by P. Deussen, Vol. 9). München (1913). Cited as BL.
  • Schopenhauer, A.: The World as Will and Representation: Vol. I. Transl. by J. Norman, A. Welchman, C. Janaway. Cambridge (2014). Cited as W I.
  • Schopenhauer, A.: The World as Will and Representation: Vol. II. Transl. by J. Norman, A. Welchman, C. Janaway. Cambridge (2015). Cited as W II.
  • Schopenhauer, A.: Parerga and Paralipomena. Vol I. Translated by S. Roehr, C. Janaway. Cambridge (2014).
  • Schopenhauer, A.: Parerga and Paralipomena. Vol II. Translated by S. Roehr, C. Janaway. Cambridge (2014).
  • Schopenhauer, A.: Manuscript Remains: Early Manuscripts 1804–1818. Ed. by Arthur Hübscher; translated by E. F. J. Payne. Oxford et al. (1988).
  • Schopenhauer, A.: Manuscript Remains: Critical Debates (1809–1818). Ed. by Arthur Hübscher; translated by E. F. J. Payne. Oxford et al. (1989).
  • Schopenhauer, A.: Manuscript Remains: Berlin Manuscripts (1818–1830). Ed. by Arthur Hübscher; translated by E. F. J. Payne. Oxford et al. (1988).
  • Schopenhauer, A.: Manuscript Remains: The Manuscript Books of 1830–1852 and Last Manuscripts. Ed. by Arthur Hübscher; translated by E. F. J. Payne. Oxford et al. (1990).

b. Other Works

  • Baron, M. E. (1969) A Note on the Historical Development of Logic Diagrams: Leibniz, Euler and Venn. In Mathematical Gazette 53 (384), 113–125.
  • Bevan, M. (2020) Schopenhauer on Diagrammatic Proof. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 305–315.
  • Belle, van, M. (2021) Schopenbrouwer: De rehabilitatie van een miskend genie. Postbellum, Tilburg.
  • Béziau, J.-Y. (2020) Metalogic, Schopenhauer and Universal Logic. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 207–257.
  • Béziau, J.-Y. (1993) La critique Schopenhauerienne de l’usage de la logique en mathématiques. O que nos faz pensar 7, 81–88.
  • Béziau, J.-Y. (1992) O Princípio de Razão Suficiente e a lógica segundo Arthur Schopenhauer. In Évora, F. R. R. (Hg.): Século XIX. O Nascimento da Ciência Contemporânea. Campinas, 35–39.
  • Bhattacharjee, R., Lemanski, J. (2022) Combing Graphs and Eulerian Diagrams in Eristic. In: Giardino, V., Linker, S., Burns, R., Bellucci, F., Boucheix, JM., Viana, P. (eds) Diagrammatic Representation and Inference. Diagrams 2022. Lecture Notes in Computer Science, vol 13462. Springer, Cham, 97–113.
  • Birnbacher, D. (2018) Schopenhauer und die Tradition der Sprachkritik. In Schopenhauer-Jahrbuch 99, 37–56.
  • Chichi, G. M. (2002) Die Schopenhauersche Eristik. Ein Blick auf ihr Aristotelisches Erbe. In Schopenhauer-Jahrbuch 83, 163–183.
  • Coumet, E. (1977) Sur l’histoire des diagrammes logiques : figures géométriques. In Mathématiques et Sciences Humaines 60, 31–62.
  • Costanzo, J. (2020) Schopenhauer on Intuition and Proof in Mathematics. In Lemanski, J. (ed.) Language, Logic and Mathematics in Schopenhauer. Birkhäuser, Cham, 287–305.
  • Costanzo, Jason M. (2008) The Euclidean Mousetrap. Schopenhauer’s Criticism of the Synthetic Method in Geometry. In Journal of Idealistic Studies 38, 209–220.
  • D’Alfonso, M. V. (2018) Arthur Schopenhauer, Anmerkungen zu G. E. Schulzes Vorlesungen zur Logik (Göttingen 1811). In I Castelli di Yale Online 6(1), 191–246.
  • Demey, L. (2020) From Euler Diagrams in Schopenhauer to Aristotelian Diagrams in Logical Geometry. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 181–207.
  • Dobrzański, M. (2017) Begriff und Methode bei Arthur Schopenhauer. Königshausen & Neumann, Würzburg.
  • Dobrzański, M. (2020) Problems in Reconstructing Schopenhauer’s Theory of Meaning. With Reference to his Influence on Wittgenstein. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 25–57.
  • Dobrzański, M., Lemanski, J. (2020) Schopenhauer Diagrams for Conceptual Analysis. In: Pietarinen, A.-V. et al: Diagrammatic Representation and Inference. 11th International Conference, Diagrams 2020, Tallinn, Estonia, August 24–28, 2020, Proceedings. Springer, Cham, 281–288.
  • Dümig, S. (2016) Lebendiges Wort? Schopenhauers und Goethes Anschauungen von Sprache im Vergleich. In: D. Schubbe & S.R. Fauth (Hg.): Schopenhauer und Goethe. Biographische und philosophische Perspektiven. Meiner, Hamburg, 150–183.
  • Dümig, S. (2020) The World as Will and I-Language. Schopenhauer’s Philosophy as Precursor of Cognitive Sciences. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 85–95.
  • Fischer, K. (1908) Schopenhauers Leben, Werke und Lehre. 3rd ed. Winters, Heidelberg.
  • Follesa, L. (2020) From Necessary Truths to Feelings: The Foundations of Mathematics in Leibniz and Schopenhauer. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 315–326.
  • Fouqueré, C., Quatrini, M. (2012). Ludics and Natural Language: First Approaches. In: Béchet, D., Dikovsky, A. (eds.) Logical Aspects of Computational Linguistics. LACL 2012. Lecture Notes in Computer Science, vol 7351. Springer, Berlin, Heidelberg, 21–44.
  • Glock, H.-J. (1999) Schopenhauer and Wittgenstein: Language as Representation and Will. In: Janaway, C. (ed.) The Cambridge Companion to Schopenhauer. Cambridge Univ. Press, Cambridge, 422–458.
  • Göcke, B. P. (2020) Karl Christian Friedrich Krause’s Influence on Schopenhauer’s Philosophy. In Wicks, R. L. (ed.) The Oxford Handbook of Schopenhauer. Oxford Univ. Press, New York.
  • Heinemann, A. -S. (2020) Schopenhauer and the Equational Form of Predication. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 165–181.
  • Hordecki, B. (2018). The strategic dimension of the eristic dialectic in the context of the general theory of confrontational acts and situations. In: Przegląd Strategiczny 11, 19–26.
  • Hordecki, B. (2021) “Dialektyka erystyczna jako sztuka unikania rozmówców nieadekwatnych”, Res Rhetorica 8(2), 18–129.
  • Jacquette, D. (2012) Schopenhauer’s Philosophy of Logic and Mathematics. In Vandenabeele, B. (ed.) A Companion to Schopenhauer. Wiley-Blackwell, Chichester, 43–59.
  • Janaway, C. (2014) Schopenhauer on Cognition. In: O. Hallich & M. Koßler (ed.): Arthur Schopenhauer: Die Welt als Wille und Vorstellung. Akademie, Berlin, 35–50.
  • Kewe, A. (1907) Schopenhauer als Logiker. Bach, Bonn.
  • Koetsier, Teun (2005) Arthur Schopenhauer and L. E. J. Brouwer. A Comparison. In: L. Bergmans & T. Koetsier (ed.): Mathematics and the Divine. A Historical Study. Elsevier, Amsterdam, 571–595.
  • Koßler, M. (2020) Language as an “Indispensable Tool and Organ” of Reason. Intuition, Concept and Word in Schopenhauer. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 15–25.
  • Lemanski, J. (2016) Schopenhauers Gebrauchstheorie der Bedeutung und das Kontextprinzip. Eine Parallele zu Wittgensteins Philosophischen Untersuchungen. In: Schopenhauer-Jahrbuch 97, 171–197.
  • Lemanski, J. (2017) ショーペンハウアーにおける意味の使用理論と文脈原理 : ヴィトゲンシュタイン ショーペンハウアー研究 = Schopenhauer-Studies 22, 150–190.
  • Lemanski, J. (2021) World and Logic. College Publications, London.
  • Lemanski, J. (2022a) Discourse Ethics and Eristic. In: Polish Journal of Aesthetics 62, 151–162.
  • Lemanski, J. (2022b) Schopenhauers Logikdiagramme in den Mathematiklehrbüchern Adolph Diesterwegs. In: Siegener Beiträge zur Geschichte und Philosophie der Mathematik 16, 97–127.
  • Lemanski, J. (2023) Logic Diagrams as Argument Maps in Eristic Dialectics. In: Argumentation, 1–21.
  • Lemanski J. and Dobrzanski, M. (2020) Reism, Concretism, and Schopenhauer Diagrams. In: Studia Humana 9, 104–119.
  • Lübbig, T. (2020) Rhetorik für Plädoyer und forensischen Streit. Mit Schopenhauer im Gerichtssaal. Beck, München.
  • Matsuda, K. (2016) Spinoza’s Redundancy and Schopenhauer’s Concision. An Attempt to Compare Their Metaphysical Systems Using Diagrams. Schopenhauer-Jahrbuch 97, 117–131.
  • Marciniak, A. (2020) Wprowadzenie do erystyki dla pedagogów – Logos. Poprawność materialna argumentu. In: Studia z Teorii Wychowania 11:4, 59–85.
  • Menne, A. (2003) Arthur Schopenhauer. In: Hoerster, N. (ed.) Klassiker des philosophischen Denkens. Vol. 2. 7th ed. DTV, München, 194–230.
  • Moktefi, A. and Lemanski, J. (2018) Making Sense of Schopenhauer’s Diagram of Good and Evil. In: Chapman, P. et al. (eds.) Diagrammatic Representation and Inference. 10th international Conference, Diagrams 2018, Edinburgh, UK, June 18–22, 2018. Proceedings. Springer, Berlin et al., 721–724.
  • Moktefi, A. (2020) Schopenhauer’s Eulerian Diagrams. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 111–129.
  • Pedroso, M. P. O. M. (2016) Conhecimento enquanto Afirmação da Vontade de Vida. Um Estudo Acerca da Dialética Erística de Arthur Schopenhauer. Universidade de Brasília, Brasília.
  • Pluder, V. (2020) Schopenhauer’s Logic in its Historical Context. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Basel, 129–143.
  • Regehly, T. (2018) Die Berliner Vorlesungen: Schopenhauer als Dozent. In Schubbe, D., Koßler, M. (ed.) Schopenhauer-Handbuch: Leben – Werk – Wirkung. 2nd ed. Metzler, Stuttgart, 169–179.
  • Saaty, T. L. (2014) The Three Laws of Thought, Plus One: The Law of Comparisons. Axioms 3:1, 46–49.
  • Salviano, J. (2004) O Novíssimo Organon: Lógica e Dialética em Schopenhauer. In: J. C. Salles (Ed.). Schopenhauer e o Idealismo Alemão. Salvador, 99–113.
  • Schroeder, S. (2012) Schopenhauer’s Influence on Wittgenstein. In: Vandenabeele, B. (ed.) A Companion to Schopenhauer. Wiley-Blackwell, Chichester et al., 367–385.
  • Schubbe, D. (2010) Philosophie des Zwischen. Hermeneutik und Aporetik bei Schopenhauer. Königshausen & Neumann, Würzburg.
  • Schubbe, D. (2018) Philosophie de l’entre-deux. Herméneutique et aporétique chez Schopenhauer. Transl. by Marie-José Pernin. Presses Universitaires Nancy, Nancy.
  • Schubbe, D. and Lemanski, J. (2019) Problems and Interpretations of Schopenhauer’s World as Will and Representation. In: Voluntas – Revista Internacional de Filosofia 10(1), 199–210.
  • Schubbe, D. (2020) Schopenhauer als Hermeneutiker? Eine Replik auf Thomas Regehlys Kritik einer hermeneutischen Lesart Schopenhauers. In: Schopenhauer-Jahrbuch, 100, 139–147.
  • Schüler, H. M. & Lemanski, J. (2020) Arthur Schopenhauer on Naturalness in Logic. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 145–165.
  • Schulze, G. E. (1810) Grundsätze der Allgemeinen Logik. 2nd ed. Vandenhoeck und Ruprecht, Göttingen.
  • Schumann, G. (2020) A Comment on Lemanski’s “Concept Diagrams and the Context Principle”. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 73–85.
  • Segala, M. (2020) Schopenhauer and the Mathematical Intuition as the Foundation of Geometry. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 261–287.
  • Shapshay, S. (2020) The Enduring Kantian Presence in Schopenhauer’s Philosophy. In: R. L. Wicks (ed.) The Oxford Handbook of Schopenhauer. Oxford Univ. Press, Oxford, 110–126.
  • Tarrazo, M. (2004) Schopenhauer’s Prolegomenon to Fuzziness. In: Fuzzy Optimization and Decision Making 3, 227–254.
  • Weimer, W. (1995) Ist eine Deutung der Welt als Wille und Vorstellung heute noch möglich? Schopenhauer nach der Sprachanalytischen Philosophie. In: Schopenhauer-Jahrbuch 76, 11–53.
  • Weimer, W. (2018) Analytische Philosophie. In Schubbe, D., Koßler, M. (eds.) Schopenhauer-Handbuch. Leben – Werk – Wirkung. 2nd ed. Metzler, Stuttgart, 347–352.
  • Xhignesse, M. -A. (2020) Schopenhauer’s Perceptive Invective. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 95–107.

 

Author Information

Jens Lemanski
Email: jenslemanski@gmail.com
University of Münster
Germany

The Value of Art

Philosophical discourse concerning the value of art is a discourse concerning what makes an artwork valuable qua its being an artwork. Whereas the concern of the critic is what makes the artwork a good artwork, the question for the aesthetician is why it is a good artwork. When we refer to a work’s value qua art, we mean those elements of it that contribute to or detract from that work’s value considered as an artwork. In this way, we aim to exclude those things that are valuable or useful about an artwork, such as a sculpture’s being a good doorstop, but that are not relevant for assessment in artistic terms. Philosophers of art, then, attempt to justify for the critic the categories or determinants with which they can make an appropriate and successful appraisal of an artwork.

What people consider to be valuable about artworks has often tracked what they take artworks to be. In the humble beginnings of art, artworks were taken to be accurate representations of the world or beautiful, skillful creations that may also have served religious or political functions. Towards the eighteenth century, in light of Baumgarten’s introduction of the term aesthetics, alongside Hume’s and Kant’s treatises, the artwork’s definition and value moved toward the domains of aesthetic experience and judgments of beauty. Autonomy became desired, and political or moral commentary was supposedly inimical to value qua art. Contemporary art has pushed back against these boundaries, with social, ethical, and political messages and criticism being drawn back into our artistic assessments.

Different artworks manifest different kinds of composite values. The philosopher of art’s task is to examine which values can appropriately be considered determinants of artistic value and, subsequently, what the value of art might be beyond these determinants. There is substantial disagreement about which, and how, determinants affect artistic value. Consequently, there is a vast catalogue of positions to which aestheticians subscribe, and the terminology can make it difficult to know who is talking about what. To provide some clarity to the reader in navigating this terminology and discourse, the end of this article includes an alphabetized summary of those positions. The various positions are cashed out in reference to mainly visual art, with some treatment of literature. Although some positions are easily transferred to other forms of art, some are not.

Table of Contents

  1. The Nature of Artistic Value
    1. Aesthetic Value and Artistic Value
      1. Aesthetic Value
      2. The Relationship
    2. The Definition and Value of Art
  2. The (Im-)Moral Value of Art
    1. The Moral Value of Art
    2. The Immoral Value of Art
    3. The Directness Issue
  3. The Cognitive Value of Art
  4. The Political Value of Art
    1. Epistemic Progress
    2. The Pragmatic View
  5. Summary of Available Positions and Accounts
  6. References and Further Reading

1. The Nature of Artistic Value

From the outset it should be clear that, when discussing the value of art in philosophical terms, we are not talking about the money one is willing to exchange for purchase of an artwork. In fact, this points to a rather peculiar feature about the value of art insofar as it is not a kind of quantifiable value as is, say, monetary value. If a dealer were to ask what the value of an artwork is, we could give them a particular (albeit negotiable) sum, a quantity, something we can pick out. Philosophically, it does not look like the value of art operates in the same way. Rather, artistic value just appears to be how good or bad something is as art. So, for the dealer, Da Vinci’s Salvator Mundi (1500) might be more valuable than Manet’s Spring (1881) simply because it has attracted more money at auction. In the way we’re using artistic value, however, Spring could be a better artwork than Salvator Mundi for a variety of reasons, thus having greater artistic value. Of course, there may be (quite significant) correlation between monetary and artistic value – at least, one would hope there is.

Recent work by Louise Hanson (2013, 2017) on the nature of artistic value is informative here. To capture the idea that artistic value is not the same kind of value as monetary, moral, or aesthetic value, Hanson presents an analogy with the university category mistake. Suppose your friend is giving you a tour of the University of Liverpool, showing you the sports centre, the philosophy department, and the various libraries, and introducing you to faculty and students, the pro-vice chancellors, and so on. Eventually, she finishes the tour and you ask, “OK, all those things are great, but where is the university?” In this case, you’re making a category mistake; there is nothing over, above, or beyond the composite entities that your friend has shown you that constitutes the university (see Ryle, 1949/2009, p. 6 on category mistakes). The same thing is happening with artistic value, Hanson thinks. Artistic value is just a term we use to talk about something’s goodness or badness as art, and it is composed of (a number of) different determinant kinds of value, such as aesthetic, moral, cognitive, and political value.

In this way, artistic value is attributive goodness. It is just how good or bad something is in the category of artworks or, as philosophers like to say, qua art. Accordingly, in order for something to have artistic value, it must be an artwork; something that is not in the category of artworks cannot have artistic value if artistic value is the value (goodness) something has as art. Artistic value, then, is something all and only artworks have to the extent that they are good or bad artworks. Moreover, the assessment of artistic value constrains itself to the domain of artistic relevance: something might be good, but it does not necessarily follow that it is a good artwork. Conversely, something might be a good artwork, but not good simpliciter. That is, Nabokov’s Lolita might be a good artwork, it has high artistic value, but it is not good simpliciter. It might also make for a good coffee mug coaster, and so is good as coffee-mug-coaster, but this does not have bearing on its goodness as art. The reasons for assessing something as having high artistic value must be relevant to the category ‘artworks’, and so not all things valuable about an artwork are things that contribute to its artistic value.

a. Aesthetic Value and Artistic Value

i. Aesthetic Value

Given art’s intimate tie to the aesthetic, a good place to start the inquiry into the value of art appears to be aesthetic value. We are concerned in this subsection with the nature of aesthetic value and what it is as a kind of value, whereas in 1.a.ii we will examine the contentious question concerning the relationship between aesthetic and artistic value, such as whether these are one and the same value. In terms of the value of art, that question is the most important for our purposes. However, in order to answer it, we need to get a hold on what aesthetic value actually is. So, what is aesthetic value? Many agree that this question actually involves two subsidiary questions: first, what makes aesthetic value aesthetic and, second, what makes aesthetic value a value? The former has been referred to as the demarcation or aesthetic question, the latter as the normative question. This terminology originates in Lopes (2018; see specifically pp. 41-43 for the proposing of the questions, and pp. 43-50 for a brief discussion of them) and has been adopted by subsequent work in philosophical aesthetics (e.g., Shelley, 2019 and Van der Berg, 2019 both provide assessments in terms of these questions). To be specific, the aesthetic question asks us why some merits are distinctively aesthetic merits instead of some other kind of merit, whilst the normative question asks what makes that value reason-giving: how does it “lend weight to what an agent aesthetically should do?” (Lopes, 2018, p. 42).

One possible, and popular, answer is that aesthetic value is co-extensive with, or has its roots within, the aesthetic experience, a certain kind of pleasure derived from experiencing an aesthetic phenomenon. This is known as aesthetic hedonism. As Van der Berg suggests, the theory enjoys “a generous share of intuitive plausibility” (2019, p. 1). Given that we are likely to pursue pleasurable things, aesthetic hedonism provides a plausible answer to the normative question because we do value seeking out pleasurable experiences. What makes the pleasure aesthetic, however, is murkier territory. The aesthetic hedonist needs to provide an account of what makes the pleasure found in aesthetic experiences a distinctively aesthetic pleasure, rather than just pleasure we might find, say, when we finish writing our last graduate school paper.

What makes an aesthetic experience an aesthetic experience can be answered through two main routes: it’s either the content of the experience, or the way something is experienced. Carroll (2015) refers to the former as content approaches, himself endorsing such an approach, and the latter as valuing approaches. The content approach suggests that what the experience is directed towards, the “one or a combination of features” that are aesthetic features, makes the experience aesthetic (Carroll, 2015, p. 172). As Carroll suggests, the view is relatively straightforward, and so obtains the benefit of parsimony. All experiences have content and so, insofar as it is an experience, an aesthetic experience has content. The best explanation for an aesthetic experience’s being an aesthetic experience should derive, therefore, from its content, that is, aesthetic properties. The view also aligns with our intuitions. What we find valuable about aesthetic phenomena is how they look, their aesthetic properties as gestalts of formal and perceptible properties. These are the content of, and give rise to, the kind of experience we have in response to them. Aesthetic value, then, becomes wrapped up with the aesthetic experience which, in turn, is wrapped up with the formal, perceptual properties of the work, that is, its aesthetic properties. Couching the value in terms of aesthetic properties carries the advantage of explaining the aesthetic value of non-artistic, but nonetheless aesthetic, phenomena such as nature and mathematical or scientific formulae. We often refer to sunsets and landscapes as beautiful or dull. Likewise, we might attribute an attractive symmetry to a certain equation, or an enchanting simplicity to the proof of an otherwise complex theorem. As the content theory answers the aesthetic question by pointing to the experience’s content – aesthetic properties – scientific and mathematical formulae with aesthetic properties can invite aesthetic experiences.

One relatively significant objection to this, however, is that Carroll maintains a narrow approach to the concept of the aesthetic. That is, he bases aesthetic properties on formal, perceptible properties that give rise to the content of the work. Goldman criticizes this and instead endorses a broad notion of the aesthetic that includes moral, political, and cognitive concerns. If our aesthetic experiences are responsive to these sorts of features, and their pleasurable nature is affected accordingly, then it looks like Carroll’s view is too narrow. Indeed, this kind of objection is leveled against those who think aesthetic value and artistic value are one and the same thing, as we shall see later. It looks like we value art for more than just formal and perceptible properties; we say artworks can teach us, that they can provide moral commentary, and so on. That being said, if one does not commit to the identity of aesthetic value and artistic value, then the claim that aesthetic experiences are characterized by their content, understood as formal and perceptible properties, looks convincing when one remembers that we have aesthetic experiences of not just artworks, but also everyday phenomena including nature. Aesthetic value, therefore, needn’t include moral and cognitive concerns if it is also ascribed to things that are not artworks.

Valuing approaches, by contrast, vary in what they are committed to, but broadly speaking suggest that aesthetic experiences have a particular character (one that doesn’t necessarily rely on their content). This character is distinct from that of any other experience we might have, and so is unique to the aesthetic (thus answering the aesthetic question). They might be, for example, experiences ‘for their own sake’, or those that are emotionally rewarding on their own, without recourse to further utility, gain, or ends. The main task here is to account for how we get to an experience valued for its own sake, without necessarily referencing content, and in a way that can distinguish aesthetic pleasure from other, perhaps lesser, pleasures.

We can take a historical turn here: in his third Critique, Kant (1987) introduces the notion of disinterestedness to carve out the distinctive character of aesthetic experiences and judgements of beauty. Disinterestedness refers to the abstraction of ‘concepts’ in our judgement and experience of beautiful things. That is, when we view things aesthetically, we remove all constraints regarding what the thing is, how it is supposed to function, whether it makes any moral, political, or social claims, our own personal emotional states, and so on. A judgement of aesthetic value must also demarcate itself from ‘mere agreeableness’, which is perhaps the kind of aforementioned pleasure we have in submitting our final graduate school paper. Kant thinks this unique pleasure arises from our state of disinterestedness leading to the ‘free play’ of the imagination and understanding (see Kant, 1987, §1-§22; in contemporary aesthetics, the notion of disinterestedness has had greater uptake than the claim of ‘free play’). Something’s aesthetic value, on this account, is tied to the value of the experience we have of it without any instrumental, utilitarian, moral, or overall ‘conceptual’ concerns.

The notion of disinterestedness has sparked lively scholarship, not least because it appears to give rise to a contradiction in Kant’s own Critique. After suggesting that judgements of beauty (perhaps, judgements of aesthetic value in contemporary terms) employ no concepts, therefore subscribing to disinterestedness, he also suggests that a species of judgement of beauty, concerning what he terms dependent beauty, is in fact made with reference to concepts (see Kant, 1987, §16). Additional scholarship has attempted to refine or recalibrate the notion of disinterestedness, specifically with regard to what the aesthetic attitude entails. For example, Stolnitz (1960, 1978) suggests that the aesthetic attitude, which allows us to make perceptual contact with something in order to retrieve an aesthetic experience, is encompassed by disinterested and sympathetic attention to the object for its own sake. Bullough (1912), likewise, invokes a kind of distancing to access the aesthetic experience: over-distance and you’ll be too removed to gain the pleasure, under-distance and you’ll be too involved. The aesthetic question is answered thus: what makes aesthetic value aesthetic is that it is derived from a pleasurable experience of something to which we have adopted the aesthetic attitude.

On these kinds of views, then, aesthetic value is co-extensive with the pleasurable, aesthetic experience gained from perceiving an artwork (or other phenomenon), in virtue of some particular mode of attending. However, the notion of the aesthetic attitude has received substantial criticism in general (Dickie, 1964, is the canonical instigator of such criticism). For example, it is questionable whether we really do alter our attention to things in a special way, that goes beyond merely paying attention or not paying attention, in order to gather an aesthetic experience (Dickie, 1964). At least, it’s not something we’re aware of: I don’t enter an art gallery and, prior to looking at any artworks, undergo some attentional ritual that is regarded as “adopting the aesthetic attitude”. Additionally, aesthetic attitude theory appears to render most anything a source of an aesthetic experience: if it’s down to me what I attend to in this peculiar way, then it’s down to me what can be the source of an aesthetic experience. If this is the case, and aesthetic value is proportionate to the quality of the aesthetic experience, then aesthetic value doesn’t appear all that unique; anything could give us an aesthetic experience. Moreover, the particularized views, and those derivative, of Stolnitz (1960, 1978) and Bullough (1912) have been subject to much discourse. For example, some think that Stolnitz’s (1960) notion of sympathetic disinterested attention is paradoxical. Being sympathetic to the object requires we have some idea of what the thing is and what it’s for, but disinterestedness defies this.

Most aestheticians are keen to carve out the aforementioned distinctions between a ‘broad’ and ‘narrow’ concept of the aesthetic. The narrow view limits the aesthetic to the detection of aesthetic properties: formal features of the work with some relationship to the perceptible properties (see, for example, Carroll, 2012). For example, the vivacity and vibrancy of Kandinsky’s Swinging (1925) are aesthetic properties arising from the formal and perceptible, perhaps ‘lower-level’, properties, which in turn invite an aesthetic experience, perhaps one of ‘movement’. The broad view captures within the aesthetic, say, moral features (see, for example, Gaut, 2007 and Goldman, 2013) as they arise from this form. For the formalist, or narrow aestheticist, aesthetic value refers to either aesthetic properties in themselves, or is a relational value referring to the experience we have of them. For the broad theorist, as moral, political, or cognitive content is brought about through, and directly impacts our response to the form of the work, they can interact and shape aesthetic value. In Goldman’s terms, cognitive and affective work, such as inference, theme detection, moral insight, and so on, are as much a part of the aesthetic experience as is the detection of aesthetic properties. Artworks “engage us on all levels” (Goldman, 2013, p. 331) and, in turn, their aesthetic value is affected as such.

In terms of the value of art, or artistic value, if we equate aesthetic value with artistic value, then artistic value too is going to be grounded in the aesthetic, formal features of the work, in a way shaped by whether one takes the narrow or the broad view. If we’re not going to equate the two, then we can say that aesthetic value is one of many determinants of artistic value, bringing in other determinants such as cognitive value and moral value. These views come in nuanced forms, as we’ll now see.

ii. The Relationship

For some aestheticians, the issue with coming to an adequate and appropriate account of the nature of artistic value and aesthetic value is derivative of the core issue of defining the very concepts aesthetic and artistic (in section 1.b, we’ll look at the relationship between the definition of art and the nature of artistic value). The thought is that if we construct an appropriate definition of, and relationship between, art and the aesthetic, all issues in aesthetics will gradually be illuminated.

The most succinct, yet still rigorous, assessment and discussion of the relationship between aesthetic and artistic value was brought about through an interlocution between a paper by Lopes (2011) and subsequent response from Hanson (2013). Lopes has attempted to show that any kind of non-aesthetic artistic value is a myth, whilst Hanson has attempted to show that a non-aesthetic, distinctively artistic value, is a reality (which paves the way for her later account of artistic value, which we saw earlier in section 1). Lopes thinks we only have two options: to embrace the trivial theory, or equate artistic value with aesthetic value. The trivial theory suggests that artworks can have many different values, but none are characteristically artistic. For example, an artwork might grant us some moral wisdom, which is something valuable about it. Other things, though, also grant us moral wisdom, so granting moral wisdom isn’t a characteristically artistic value, that is, a value of an artwork qua artwork. The trivial theory, therefore, is uninformative and doesn’t really tell us anything about the value of art. Lopes arrives at these two options after ploughing through definitions of artistic value that, so he thinks, fail to adequately grant a particular value’s being an artistic value. Thus, the conclusion reached is that something is a value of art if and only if it is an aesthetic value of that thing within the kind of art form, genre, or other art kind.

As Hanson identifies, it is difficult to place Lopes’ position and what it exactly amounts to, but broadly there are three kinds of positions one might take to get rid of (non-aesthetic) artistic value: error theory, eliminativism, and aestheticism. On the surface these might appear to be the same position, but there are subtle differences between the three, hence the confusion in Lopes’ positioning. An error theory would claim that we are mistaken to talk about artistic value as there is no such thing as artistic value: “it appeals to a concept that does not exist” (Hanson, 2013, p. 494). Aestheticism – as Hanson is using the term – is a claim about aesthetic value and artistic value’s identity, as well as a denial of pluralism about artistic value. Pluralism would allow for many values to be contributory towards artistic, and aesthetic in this case, value, for example, moral and cognitive value might interact with aesthetic value. The aestheticist, however, thinks aesthetic value and artistic value are the same, and that only aesthetic value is a determinant of artistic value. The use of aesthetic value, in this discussion, pertains only to formal, perceptible properties, rather than a broad construal that draws in cognitive and affective components as identified in section 1.a.i. The value of an artwork, then, is comprised by its aesthetic value and its aesthetic value only for the aestheticist. For the eliminativist, an identity relation is also placed between aesthetic value and artistic value as we are talking about the same thing. Talk of artistic value is redundant as it is just aesthetic value, rather than, as for the error theorist, talk of a non-existent concept. So, for the eliminativist, we have the concept of artistic value, but it’s the same thing as aesthetic value. This position is endorsed by Berys Gaut (2007): aesthetic value is comprised by a pluralism of many different values, and artistic value is aesthetic value. 
The denial of pluralism, therefore, sets the aestheticist (only the formal matters for aesthetic value) and the eliminativist (only aesthetic value matters for artistic value, but other values may impact, or be drawn out through, form, i.e., aesthetic value) apart. That being said, an eliminativist need not be committed to pluralism.

We have seen reasons for thinking our talk of artistic value is conceptually and/or metaphysically mistaken just insofar as artistic value is not a kind of value in the same way all these determinant values are. Artistic value is something had by all and only artworks as a measure of how good (or bad) they are and, as such, “is just a handy linguistic construction that allows one to talk about the degree to which something is good art in a less cumbersome way than would otherwise be available” (Hanson, 2013, p. 500). In this way, artistic value is not the same kind of value as, but is indeed dependent upon, other kinds of value. With this reasoning in place, one can reject any position that places an identity claim between artistic value and aesthetic value because they are not kinds of value in the same way, and so cannot be identical. That being said, positions that acknowledge artistic value and aesthetic value’s distinct nature can claim that aesthetic value is the sole determinant of artistic value. So, for example, aestheticists might say that, yes, artistic value and aesthetic value are distinct values both in kind and nature, artistic value has determinants, and the only determinant of artistic value is aesthetic value.

Yet the task for the aestheticist (denying pluralism), arguing that aesthetic value is the sole determinant of artistic value, is rather difficult. Despite our intuitive inclinations towards the formal and the beautiful being significant determinants of artistic value, few would be inclined to suggest that art cannot provide moral, political, or social criticism, bestow knowledge upon us or clarify truths we already hold. Hence, in order to maintain their line, the aestheticist would either need to argue that (i) these other values interact with aesthetic value and, derivatively, affect artistic value (a form of eliminativism), or (ii) these other values are not values of art qua art. Route (i) is endorsed by those, such as Gaut, who see an intimate tie between aesthetic and other forms of value. Consider, for example, our labelling moral behavior beautiful and immoral behavior ugly. Known as the moral beauty view (see section 2.a for this view in greater detail), this looks like a good candidate for the interaction of aesthetic and other forms of value. The issue is that we can speak of the aesthetic value of lots of things, but those things need not be art, and often are not art, which puts pressure on the identity claim. It is mysterious to claim aesthetic value is value qua art whilst attributing aesthetic value to things that are not artworks.

One might then suggest that avenue (ii) stands a better chance at survival. We might argue that Guernica has exceptional aesthetic value: Picasso’s use of geometric forms is dramatic and imposing. We might also suggest that Guernica’s intent, the condemnation of the bombing of Guernica, is a merit of the work. However, we can consider these values separately: Guernica has high artistic value owing to its formalism, and it is a valuable thing owing to its ethico-political commentary, but the latter of these does not contribute to its value qua art. It might be valuable as art owing to its formal qualities, and valuable as a piece of ethico-political commentary, but we should not consider the latter in our assessment of its artistic value (this view, known as moderate autonomism, is explored in section 2). That is, an artwork can have many different valuable features, but the only one that determines its artistic value, its value qua art, is its aesthetic value.

Yet this just doesn’t appear to be how we view artworks. Instead, form and content work together to bear the value of the work qua art. Although content, like Guernica’s ethico-political criticism of war, is wrapped up with, and brought about through, the form of the work, it is not just that form that we value about the artwork as an artwork. We value Guernica qua art in part due to this political criticism which is drawn out through its jarring and unsettling composition, accentuating and confirming the critical attitude Picasso takes. Yet such an attitude and criticism is over and above simply the work’s being jarring and unsettling. More is needed beyond the form to access the content. If this political commentary, then, is something we find valuable about Guernica as an artwork (the denunciation of war, particularly the bombing of Guernica), and it is detachable from its aesthetic value (as jarring and unsettling), then our safest bet appears to be opting for Hanson’s line: artistic value is how good or bad something is as art, it is something had by all and only artworks, and it has a range of different determinant values one of which is, perhaps to a substantial but not wholly encompassing extent, its aesthetic value. In order to reach a full and proper understanding of the value of art, then, we need to explore these determinants and, importantly, how they interact. In sections two to five, we’ll look at some of the main values philosophers of art have thought to contribute to, or detract from, the value of an artwork.

Another issue for the aestheticist is the value we repeatedly attach to works that are deliberately anti-aesthetic. The predominant example here is conceptual art. Conceptual artworks do not bestow upon us aesthetic experiences, nor do they have any properties we would appropriately be willing to call aesthetic properties. Despite their anti-aesthetic nature, we willingly and consistently attribute to these works artistic value. If they lack any significant aesthetic value, as Stecker (2019) writes, but simultaneously possess rather substantial artistic value, then it doesn’t look like we can place an identity claim between aesthetic and artistic value, and to do so may be foolish. How can something with low aesthetic value have high artistic value if these two values are identical? In order to meet this objection, the aestheticist might wish to appeal to artistic-aesthetic holism or dependency. That is, anti-aesthetic artworks are valuable in a way that depends upon the concept of the aesthetic: it is only in virtue of the value placed on the aesthetic that anti-aesthetic art derives its reactionary value. This, however, places a burden on the aestheticist’s shoulders in trying to show how absence of something (aesthetic) gives rise to that something’s value (aesthetic value).

Perceptually indiscernible artworks might pose a different, albeit closely related, issue for the aestheticist. The problem of conceptual art for the aestheticist is explaining how purportedly non-aesthetic art can have artistic value if artistic value is aesthetic value, and aesthetic value depends on aesthetic properties and/or experience. As Stecker (2019) identifies, the problem of indiscernible works is this: aesthetic value and artistic value cannot be one and the same thing if two perceptually indiscernible entities can have differing artistic value. Duchamp’s Fountain might be indiscernible from a ‘regular’ urinal. If aesthetic value is realized through perceptible features, then the regular urinal and Fountain have the same aesthetic value. However, it is highly unlikely that anyone would be willing to admit that Fountain and the regular urinal have precisely the same artistic value. Hence, we should be of the view that artistic value is something other than, but perhaps may indeed include, aesthetic value. To be clear, the two distinct issues are these: (1) how does the aestheticist account for the aesthetic value of non-aesthetic or deliberately anti-aesthetic art and thus, given the identity claim, its artistic value? (2) How does the aestheticist account for two indiscernible things having (presumably) identical aesthetic value, based on formal features, but distinct artistic value?

It should be noted that some do hold that artworks can bear non-perceptual aesthetic properties. Shelley (2003) constructs such a case, arguing that it is a route scarcely travelled by aestheticists in the face of conceptual art. Instead, aestheticists tend to deny that conceptual artworks are art (an alternative is to deny that all artworks have aesthetic properties, but this is not a good move for the aestheticist!). Shelley expands aesthetic properties from the usual list – “grace, elegance, and beauty” – to include “daring, impudence, and wit” (2003, p. 373). This he does by drawing on a Sibley-inspired notion of aesthetic properties as striking us, rather than being inferred, and recognizing that it is false to say that properties that strike us rely on perceptible properties. As he writes, “ordinary urinals do not strike us with daring or wit, but Fountain, which for practical purposes is perceptually indistinguishable from them, does” (Shelley, 2003, p. 373). See Carroll (2004) for the same conclusion reached via alternative argument.

Consider, then, a different case against the aestheticist: forgeries and fakes. It is no surprise that our valuing of something changes upon our discovery that it is a forgery, and it is often, presumably, the case that this value change is a diminishment. The (in-)famous case of van Meegeren creating fake Vermeers is a commonplace in the literature. Upon the discovery that these ‘Vermeers’ were actually fakes produced by van Meegeren, their value suffered. Despite this, the aesthetic properties, or the aesthetic experience had, presumably stay the same, since the change is not at the level of the formal, perceptual properties of the work. Instead, it’s something else that changes; perhaps our valuing of it as, now, not an original, authentic work. Aestheticists might appeal to a change in the moral value or assessment of the work, but the best explanation for this kind of phenomenon appears to be that the aesthetic value, co-extensive with aesthetic experience or properties, remains the same, whereas the artistic value, which can include considerations such as originality, importance for art history, authenticity, and so on, changes. Indeed, it is precisely these problematic scenarios that lead Kulka (2005) to endorse what he terms aesthetic dualism: a divorce between aesthetic value and artistic value, where aesthetic value is gathered from the intrinsic properties of the work, and artistic value includes aesthetic value but also makes reference to extrinsic information such as originality and art-historical importance.

Notwithstanding, the conceptual and empirical dependency of the artistic upon the aesthetic is a popular view. Frank Sibley (1959; see also 2001 for a collected volume of his papers) proposed a priority of the aesthetic over the artistic: all that is artistic depends upon the concept of the aesthetic. Therefore, Sibley does indeed endorse the claim that anti-aesthetic art, by its very nature, depends on the concept of the aesthetic in order to retrieve its value as art. Andy Hamilton (2006), too, endorses a link of conceptual necessity between the aesthetic and the artistic. What he calls the reciprocity thesis is a conceptual holism between artistic and aesthetic; we cannot understand the concept of the artistic, or have it arise, without the aesthetic. His case is that it is unfathomable to conceive of a settling community that views a sunset and does not at the same time decorate their homes with ornaments and fanciful designs.

As we can see, many aestheticians appear to support with good reason the idea that aesthetic value and artistic value are not identical. However, we should not assume that the case is too one-sided and that proponents of the aesthetic-artistic value distinction do not have any burdens to meet. For example, in the remainder of this article we’ll look at some values philosophers of art take to be values of art, but the question is: how do we know these are values of artworks qua their being artworks, rather than values artworks have just adventitiously? How do we support the idea, for example, that an artwork’s teaching us something is a value of that work qua art, but an artwork’s covering up a crack in the wall is not a value of that work qua art? This is the main contention of Lopes’ argument against non-aesthetic artistic value: there is no non-trivial way of declaring that a value is an artistic value, that is, a value qua art.

b. The Definition and Value of Art

Before engaging such questions, we should examine the relationship between the definition of art and the value of art. As stated in the introduction, what we have taken artworks to be and what we value about them have been considered somewhat simultaneous. Rather than historically trace the definition of art and its correspondence with art’s value, we will focus here on some issues arising from the relationship between defining art and the value of art, in keeping with the article’s scope and purpose. First, a theory of art that picks out artworks based on what we deem to be valuable about them is called a value definition. It is more likely than not that this definition will also be a functional theory/definition of art, according to Davies’ (1990, 1991) delineation of functional and procedural theories of art. A functional theory defines artworks in terms of what they do, whereas a procedural theory defines artworks in terms of how they are brought about. Aesthetic theories, for example, are functional theories. The institutional theory, on the contrary, is a procedural theory. It is presumably not the case that we value artworks because they are those things picked out as candidates for appreciation by the art world (the institutional theory), but it might be the case that we value artworks because they are sources of aesthetic experiences (a version of the aesthetic theory).

Consequently, functional theories are often taken to have an advantage over procedural theories in terms of explanatory power. They tell us what an artwork is, alongside telling us why art matters to us. Indeed, it is often, then, a criticism of procedural theories that they do not go on to show us why and how we care about artworks. Although procedural theories might have an easier time encompassing more artworks under their definition (the institutional theory is praised for its extensional adequacy), they fail to meet an important desideratum of theories of art. One must be cautious, however, in approaching both the definition and value of art along the same track. If one takes what is valuable about an artwork to be the sole determinant of artistic value and that artworks are those things that have this value, then one runs into a conceptual conundrum. Such definitions exhibit what Hanson (2017) calls definition-evaluation parallelism. These theories are unable to accommodate the existence of bad art.

Hanson cites the theories of Bell and Collingwood as falling into this trap. For Bell, artworks have significant form, and this is the determinant of their artistic value. For Collingwood, artworks are expressive, and their expression is the determinant of their artistic value. The puzzle, however, is this: if artworks are valuable only because of their significant form, and are artworks because of their significant form, then all artworks are valuable. Something that doesn’t have significant form cannot be artistically valuable, nor can it be deemed art. As such, the existence of bad art becomes a contradiction, given that all artworks, insofar as they are artworks, possess the valuable feature. The same can be said of Collingwood’s expressive theory, substituting expression for significant form in this example. What it would take for an artwork to be bad, i.e., lacking the valuable thing about art, would also remove its artistic status. Hence, there can be no bad art.

Not all value definitions fall into the trap of definition-evaluation parallelism. It is possible, for example, to argue that all artworks have some value x, but this value is not the sole determinant of the value of art. Instead, a multitude of values constitute the value of art; it’s just that x is also what makes artworks artworks. If they follow this trajectory, theories of art are able to meet the desideratum of being able to at once explain what artworks are and why we value them. As Hanson points out, it has been a mistake by previous aestheticians to think of the issue of bad art as “a knock-down objection to value definitions” (2017, p. 424). Instead, it’s a burden only for those value definitions that at the same time invoke definition-evaluation parallelism.

In addition, it is not the case that one needs to pick the explanatory side they deem more praiseworthy in cases of defining art procedurally or functionally, for one can commit to, as Abell (2012) does, a hybrid theory of art. A hybrid theory of art would be one that is both functional and procedural at the same time. The motivation for a hybrid (procedural and functional) theory is that it can potentially take on the extensional power of a procedural theory (encompassing within the category of artworks the right kind of thing) as well as the explanatory power of a functional theory (letting us know how and why we care about art).

2. The (Im-)Moral Value of Art

The previous discussions set up, and invite, consideration of what other forms of value we consider to be contributory to, or detracting from, the value of an artwork qua art. Throughout the following considerations, the reader should consider whether the position and its commitments make claims about two different concerns: whether the value in question impacts the value of the work as a work of art, or whether we can assess the artwork in terms of that value, but the value doesn’t impact the value of a work of art as a work of art. The nature of such an interaction is cashed out with great intricacy in the numerous positions espoused in considerations of the (im-)moral value of art, and so it is to this value that we now turn as a good starting point.

a. The Moral Value of Art

The interaction between moral and aesthetic and/or artistic value has received extensive treatment in the literature, and with extensive treatment comes an extensive list of positions one might adopt. Another entry of the IEP also considers these positions: Ethical Criticism of Art. Nonetheless, the interaction is a considerable source of tension in philosophical aesthetics, and so I shall highlight and assess the key positions here. Roughly, the main positions are as follows. Radical Autonomists think that moral assessments of artworks are inappropriate in their entirety, that is, one should not engage in moral debate about, through, or from artworks. Moderate Autonomists think that artworks can be assessed in terms of their moral character and/or criticism, but this does not bear weight upon their value qua art, that is, their artistic value. Moralists think that a work’s moral value is a determinant of its artistic value. Radical Moralists think that the moral assessment of an artwork is the sole determinant of its artistic value. Ethicists think that, necessarily, a moral defect in a work is an aesthetic defect, and a moral merit is an aesthetic merit. Immoralists think that moral defects, or immoral properties, can be valuable for an artwork qua art; that is, they can contribute positively to artistic value.

It should be clear from this brief exposition that the varying terminology renders the debate rather murky. Some, such as Gaut, are arguing about moral value’s encroachment on aesthetic value, whereas others are making claims in particular about artistic value. Todd (2007), for example, identifies that a significant part of the tension of ethical interaction is sourced from conflating aesthetic and artistic value. In addition, in different literature we see talk of moral value, moral properties, the morality of the artist, moral defects, aesthetic merits, artistic merits, and so on. In fact, it has been pointed out that the debate regarding immoralism (the claim that moral defects can be aesthetic/artistic merits) is marred precisely owing to the lack of consensus and the terminological mud that is slung throughout the debate: no one has declared precisely what a moral defect is, and upon whom or what it falls (McGregor, 2014). A moral defect might be in the audience if they take up a flawed perspective, or it might be in the work’s suggestion that that response be taken up, or it might be in the display of immoral acts, and so on. In keeping with the focus of this article, I will consider the debate in terms of artistic value, where someone who thinks aesthetic value and artistic value are one and the same thing will be claiming that there is an interaction between (im-)moral properties and aesthetic value (as artistic value). That is, we will keep in line with the general agreement that what is at stake is the effect of (im-)moral properties on the value of artworks qua artworks. I will refer to moral and immoral aspects of the work in terms of properties and defects/merits.

Let’s start with the autonomist’s claim. Two strands of autonomism are prominent: radical and moderate. The former suggests that any and all moral assessment of an artwork is completely irrelevant to the artistic domain. It is a conceptual fallacy to suggest that morality and aesthetics interact in any substantive way. The artwork is a pure, formal phenomenon that exists in a realm divorced from concerns such as morality and politics. Clearly, however, this view has become outdated. It may have been convincing in the heyday of movements such as art for art’s sake; however, art has historically, and even more so in contemporary forms, been wrapped up with moral and political commentary, serving to criticize specific events, movements, and agendas. The latter strand, moderate autonomism, might prove more palatable. This is the claim that the moral properties of a work have no interaction with its artistic value, but artworks can still be assessed in light of morality. On this view, then, Riefenstahl’s Triumph of the Will is good art: it is aesthetically and artistically valuable. However, in contrast to the radical autonomist, we may wish to assess the artwork in terms of a moral system, in which case Triumph of the Will is (very) flawed, but this does not bear on our assessment of the film-documentary as art. The only thing that is relevant to the artistic value of Triumph of the Will is its aesthetic value, and on this view it is a good artwork.

There are two significant attractions to this view. Firstly, as mentioned in preceding sections, the idea that the aesthetic qualities of artworks are those things for which we value artworks is intuitively appealing; we praise artworks for their beauty, their form, and their use of this form to wrap up their content. Fundamentally, the autonomist says, we value artworks for how they look, and this is the ‘common-sense’ view of how and why we value art. Secondly, the claim that moral merits and defects do not impinge upon artistic value is supported by the common-denominator argument, first proposed by Carroll. If a value is a value qua art, then it must feature as a relevant value for assessment in the consideration of all artworks. However, there are a multitude of artworks for which moral assessment is inappropriate and/or irrelevant. Abstract works, for example, for the most part do not lay claim to moral criticism or commentary, and so assessing them as artworks in terms of such value is inappropriate. Moral assessment, then, is not a common denominator amongst all artworks and so is not appropriate for assessment of an artwork qua art.

However, there are two interrelated and concerning issues for the autonomist. Firstly, the view may be problematic in the light of the fact that artworks are valued for many reasons beyond their form and aesthetic qualities. Indeed, take the earlier examples of genres of art that proceed from an anti-aesthetic standpoint. Secondly, and more importantly, it is standard practice in art criticism to produce an assessment of (some) artworks in terms of their moral and immoral claims, and this seems indubitably relevant for their assessment as artworks, or, qua art. Producing a critical review of Lolita as a work of artistic literature, for example, that made no reference to the immorality of Humbert Humbert and the relevance of this for its value as an artwork (rather than just its nature as an artwork) would be simply to have missed the plot of Lolita. Similarly, Guernica may be an exceptional, revolutionary use of form, but its assessment as an artwork just intuitively must involve its commentary on civil war and its repercussions for civilians. Likewise, it seems to be relevant to its assessment as art that Triumph of the Will was a propagandistic film endorsing the abhorrent, accentuated narrative of the Nazi party.

The moralist, who thinks an interaction exists between moral value and artistic value, is likely to use these latter examples as motivations for their own view. In these cases, it looks like the very form of the work is in some sense determined by the moral attitudes and values explored. As such, the moralist will claim that “moral presuppositions [can] play a structural role in the design of many artworks” (Carroll, 1996, p. 233). Hence, if we’re going to value artworks for their form and content, and in some cases this is dependent upon the moral claims, views, or theories employed in the work, then we need to accept that the moral value of a work is going to affect its value as an artwork.

Moralists are divided on whether their rule about moral properties (that moral merits can be aesthetic merits) is one of necessity; that is, that moral merits are always going to lead to aesthetic merits. For example, moderate moralism suggests that sometimes, but not always, moral properties can impinge upon, or contribute to, artistic value (see, for example, Carroll, 1996). In contrast, the ethicist necessitates the relationship between moral merits and aesthetic merits. For the ethicist, each and every instance of a moral merit in a work of art is an aesthetic merit. This position was made prominent by Gaut (2007), who as we saw also thinks that aesthetic value and artistic value are one and the same value. As such, a proper and appropriate formulation of ethicism would be the following: moral merits are in every case aesthetic merits, and as such moral merits always contribute to the value of an artwork as art. A caveat here is that the moral merits are core features of the artwork, rather than extraneous elements coinciding with the work. For example, the moral actions of Forrest in Forrest Gump may be aesthetically relevant, but moral claims made by the film studio in the DVD leaflet are not.

Clearly, this is a very strong claim and so requires significant motivation. Gaut bases the endorsement of ethicism upon three arguments: the merited response, moral beauty, and a cognitivist argument, the first and second of which I explore here. A dominant version of the merited response argument runs as follows: artworks prescribe responses in their spectators/perceivers/readers derivative of their content, and their aesthetic success is determined, at least in part, by such a response to the work being merited. One way a response might be unmerited, at least in part, is if such a response is unethical. As the unethical response is the cause of a response being unmerited, and an artwork’s success depends upon the response being merited, ethical defects are aesthetic defects and ethical merits are aesthetic merits (Gaut, 2007; Sauchelli, 2012). In sum, there is a direct correlation between a response being merited and the moral prescriptions such a response holds. The second argument, the moral beauty view, identifies that “moral virtues are beautiful, and moral vices are ugly” (Gaut, 2007, p. 115). From here, we can suggest that if a work has moral virtues (that is, it has “ethically good attitudes”), then it has a kind of beauty. Beauty is, of course, canonically and paradigmatically an aesthetic value. Therefore, moral value contributes to aesthetic value. The argument, as Gaut suggests himself, is straightforward. To assess it requires an evaluation of the link between moral character and beauty, which falls beyond the scope of this article. Readers should note Gaut provides a powerful case for the relation: see Gaut (2007, chapter six).

b. The Immoral Value of Art

There is something intuitively appealing about the claim that moral merits in artworks can be artistic merits and, as such, contribute to the value of art. The same, however, cannot be said of moral defects as meritorious contributions to the value of art. It seems odd to think that an artwork could be better in part, or wholly, because of the immoral properties it possesses. Immoralism, generally, is the position in aesthetics that holds that moral defects in a work of art can be artistic merits. Despite the instinctive resistance to such a claim, we need not look far afield to find examples of artworks that might fit this sort of bill. Consider, for example, Nabokov’s Lolita, Harvey’s Myra, and Todd Phillips’ Joker. The value of these works seems to be sourced from, or tied to, their inclusion of immoral properties, acts, or events. The issue is that we do not value immoral properties in general, or simpliciter, so why do they sometimes contribute to value qua art?

The cognitive immoralist (Kieran, 2003) suggests that we value immoral properties in artworks because they invite a cognitive and epistemic gain. That is, immoral properties are artistically virtuous insofar as they allow us to access, gain an understanding of, cement, or cultivate our moral understanding. Lolita’s immoral properties are valuable because they provide further scaffolding to our understanding of the immorality of pedophilia. By accessing the perspective of the pedophile, we garner a more complete understanding of why the actions are wrong. In this way, it has been argued that we have an epistemic duty to seek out these artworks for the resultant moral-cognitive gain, for the more comprehensive understanding of goodness and badness. Just as, so Kieran argues, the subordination of another can help us understand why a bully bullies, that is, to gain pleasure from the subordination of others, so too can artworks offer us the perspectives of perpetrators that can improve our understanding. Importantly, however, this epistemic duty does not extend to the real world. It is the imaginative experience that indirectly and informatively entertains immorality through the suspension of our natural beliefs and attitudes.

The robust immoralist, by contrast, focuses on the aesthetic and artistic achievements upheld by the ability of a work to gather our appreciation of immoral characters (Eaton, 2012). Termed rough-heroes, these characters take on immoral adventures or acts in films, novels, TV shows, and so on, but for some reason we empathize with them, we like them, we might even fancy them. For example, we might sympathize with a character that is at the same time a murderer. For the robust immoralist, it is a formidable artistic achievement to place us into this juxtaposed attitude and, hence, morally defective works can be artistically valuable just insofar as they excel within this artistic technique (Eaton, 2012).

The immoralist falls into a similar issue to the moralist, however, insofar as they need to show that it is the immoral qualities qua immoral qualities that contribute to the artistic value (Paris, 2019; this paper represents a considerable attack on immoralism). For example, some have argued that there is a two-step relation between immoral qualities and artistic value. Against the cognitive immoralist, they argue that it is the cognitive value that contributes to the artistic value, rather than the immoral qualities themselves. Similarly, we might argue that it is the aesthetic achievement highlighted by the robust immoralist, and hence aesthetic value, that contributes to the artistic value, rather than the immoral qualities themselves. Hence, immoral qualities qua immoral qualities do not contribute to artistic value. Moreover, on this criticism, we could suggest that replacing the immoral qualities with qualities (perhaps moral qualities) that give rise to the same sort of aesthetic value and/or cognitive value will produce the same influence upon artistic value and so, again, it is not immoral qualities qua immoral qualities that do the work. An intriguing consequence of this kind of criticism of immoralism is that it equally undermines theories that argue moral properties contribute to artistic value. On the same reasoning, the artistic value is not located in the moral qualities qua moral qualities (since, presumably, they too are replaceable with some properties that yield the same cognitive or aesthetic gain).

Relatedly, some have argued that it is only insofar as these immoral qualities are accompanied by aesthetic and/or cognitive gain, which then masks or covers up the immoral qualities, that they are deemed artistically valuable (Paris, 2019). That something else – retribution of the character, epistemic gain, aesthetic success – derives from these immoral properties suggests that they would not be valuable on their own. Since they require covering or masking in terms of aesthetic or epistemic success, they are actually shown to be detrimental to artistic value. That is, without the masking or covering up of the immoral qualities, we wouldn’t actually find the work artistically valuable. It is as though, says the critic of immoralism, the immoral qualities require covering up or redemption in order to succeed in the artistic domain. This puts them a far cry from being valuable as art qua themselves.

Lastly, and this is a particular criticism of cognitive immoralism, it is hard to find works for which the status of the properties is genuinely immoral. If the reason we find immoral properties valuable is because of the ensuing cognitive, epistemic, moral cultivation — for example, Lolita helps us to verify and scaffold our understanding of pedophilia as immoral — then, upon calculation, the properties might not turn out to be immoral. That is, the subsequent moral cultivation outweighs the immorality of the fictional wrongs, and so the properties of the artwork are not, all things considered, immoral. The benefits outweigh the costs. If the artwork does not exhibit immoral properties, then there are no immoral properties in the first place out of which we can argue artistic value arises.

c. The Directness Issue

What the discussions of moralism and immoralism show is that for a property, quality, or value to legitimately be considered a determinant of artistic value, it must affect the value of the work qua art. In several different instances outlined, it doesn’t look like the moral and/or immoral property/value is affecting the work’s value qua art, but is instead determining some other value that we take to be valuable qua art. For example, some properties (moral or immoral) affect aesthetic value, which transitions to affect artistic value. It is, therefore, aesthetic value, not moral value, that influences artistic value. Or, some properties of artworks look to teach us things or cultivate our understanding, therefore there is a particular cognitive value about them, which has an effect on the artistic value.

The trouble that arises from this kind of thinking is the question of which values we are willing to take as fully and finally valuable qua art, and not just because they determine some other value. Perhaps this can be cast as a significant motivator for aestheticism, and indeed Lopes’ claim of the myth of non-aesthetic artistic value. Aesthetic value appears to be the only value on which there is universal consensus regarding its status as an artistic value. For moral/immoral interaction, it almost looks as if the burden always falls, in some unfair way, on the interactionist (who thinks that moral and aesthetic/artistic value interact) rather than the autonomist (who thinks they do not).

Such a concern has been legitimated by Hanson (2019), who suggests that two dogmas have been pervasively present in the interaction debate: two conditions that an interactionist must meet, but that together are incompatible. Let’s refer to these as the directness problem and the qua problem. Roughly, the directness problem highlights that those engaging in the interaction debate have implicitly assumed that the only way an interactionist can show that moral/immoral properties bear on the value of art is if they influence some other value that subsequently influences artistic value. Hence, if the interactionist shows that moral properties yield cognitive value, which in turn brings about artistic value, then they have proposed an indirect strategy. A direct strategy would be, say, the cognitive case, where cognitive value directly bears on artistic value. The second condition that the interactionist must meet is that they must show that it is the (im-)moral properties qua (im-)moral properties that bear on the value of art (the qua problem). That is, not some other, intermediary value. Clearly, however, it is logically impossible to propose that something can affect something qua itself indirectly. That is, one cannot conjure an interactionist theory that suggests moral properties are indirectly contributory to artistic value whilst at the same time maintaining that it is the moral properties qua moral properties that contribute to artistic value. One cannot, then, conjure an interactionist theory that meets these simultaneous requirements, or dogmas.

What, then, is the resolution? In order to not beg the question against the interactionist, aestheticians need to refrain from implicitly advancing both the directness and qua problem simultaneously, and instead only level one or neither. In her proposal, Hanson suggests we should take direct strategies seriously, with the caveat that this does not necessitate endorsing the qua constraint. Taking direct strategies seriously is legitimated because, well, we allow direct strategies in other cases. Consider aesthetic value, or cognitive value, or the value of originality, influence on subsequent art, and so on. All these values take on the stance of directly influencing artistic value, so why not moral value? Indeed, as Hanson suggests, we need to admit that at least some values are directly impactful on artistic value lest we enter an infinite regress. That is, if some value only contributes to artistic value via some other value, then does the latter contribute to artistic value directly? If not, then another value needs to be affected, which subsequently affects artistic value, to which the same question can be posed, and so on ad infinitum. Clearly, there must be a break in this chain somewhere such that some value(s) is (are) contributory, directly, to artistic value. As Hanson suggests, we should begin to take direct strategies more seriously, and the prospects look a lot “rosier” when we begin to do so.

3. The Cognitive Value of Art

Cognitive immoralism rests decisively on the claim that artistic value can be upheld, indeed augmented, by the cognitive value of an artwork. That is, the claim that artworks can be valuable insofar as they engender some form of cognitive gain. Indeed, a familiar endorsement of art is that it has something to teach us. These claims would be endorsed by a cognitivist about the arts: art can teach us, and it is aesthetically or artistically valuable in doing so. When discussing the cognitive value of art, it is crucial to get at what exactly the claim of “teaching us” amounts to: what is being taught and how are we being taught this? Rather usefully, Gibson (2008) has delineated the claims of different strands of artistic cognitivism. Gibson suggests that the cognitivist could argue for artworks granting us three kinds of knowledge: (i) propositional knowledge, (ii) experiential knowledge, or (iii) improvement/clarification. Other options are available for the cognitivist, such as a general increase in cognitive capacities, as we saw with cognitive immoralism. Another significant position is that art can train our empathic understanding. We’ll focus on Gibson’s assessment of cognitivism due to its informativity, before moving to Currie’s more recent analyses of cognitivism and, specifically, the enhancement of empathy.

The cognitivist endorsing (i) would suggest that artworks can give us knowledge that, such as x is y, or tomatoes are red. Artworks, for example, might serve as something akin to philosophical thought experiments, from which we can extrapolate some new truth. This strand of thought might argue that Guernica grants us propositional knowledge that civil wars affect citizens as much as infantry, and so are morally bankrupt, as seen through the bombing of Guernica. By endorsing (ii), the cognitivist is claiming that we can access “a region of human experience that would otherwise remain unknown” (Gibson, 2008, p. 583). For example, one might claim that Guernica offers us some form of access to what civilians experienced during the bombing of Guernica. In endorsing (iii), the cognitivist – Gibson labels this the neo-cognitivist position – would claim that artworks don’t teach us anything new, nor do they grant us access to otherwise inaccessible scenarios, but instead that they confirm, clarify, or accentuate knowledge we already hold. For example, we all know that war has consequences for citizens, that bombings are bad, and Guernica can shed light on, or improve our knowledge of, these facts.

There are four kinds of issue that cognitivists must overcome, with some addressing these different strands in a more targeted fashion. Gibson refers to these as the problem of unclaimed truths, missing tools of inquiry, the problem of fiction, and the nature of artistic creativity. The problem of unclaimed truths suggests that the truths found in artworks are borrowed from reality rather than revelatory of it. On this criticism, Guernica can’t grant us knowledge about the bombing of Guernica because, simply, the bombing of Guernica needed to take place in order for the artwork to borrow from such a reality. The missing tools of inquiry objection suggests that, in contrast to other cognitive pursuits, artworks don’t show us how to reach the knowledge, nor do they justify their truths, they merely show them. Picasso’s Guernica, then, can say that bombing in civil wars can lead to civilian deaths which is an immoral circumstance, but it cannot tell us why. The problem of fiction argues that the truths artworks disclose, if they do at all, are truths of the fictional world of the artwork, rather than truths that come into contact with reality. Hence, Guernica can show us that bombing cities in civil wars is wrong in the fictional world of the painting, rather than in our world of reality; the leap from fiction to reality is too large a leap to make. Relatedly, the nature of artistic creativity objection argues that artists create artefacts that are meant for “moment[s] of emancipation from reality” (Gibson, 2008, p. 578), and so praising the artistic discipline requires distancing ourselves from reality. Artworks, then, should not be valued for their cognitive gain because it is precisely the purpose of art to detach us from reality, rather than impart knowledge about it.

This final criticism can be launched against nearly all strands of cognitivism: if artworks should not be valued for the engendered cognitive gain, and artistic value is the value of something qua art, then cognitive value is not an artistic value. The problems of unclaimed truths and fiction penetrate the propositional and experiential knowledge accounts of artistic cognitivism. If the propositional or experiential knowledge qualifies the fictional world and can’t be transferred to reality, then that’s not a very valuable circumstance. Likewise, if this proposition or experience is something borrowed from reality, then again there’s no real teaching going on, for the knowledge needs to be in place in the first place for the artwork to borrow. Moreover, the fictionality objection hits the experiential account particularly hard: to put it simply, nothing is as good as the real thing. Gibson gives the example of a fictional tale about love; going to a library and reading it is not going to give the same experiential access as, should I be so lucky, finding love in reality!

The idea that artworks can give us propositional knowledge has been met with equal criticism. Consider our claim that, in order for something to contribute to artistic value, it must be valuable qua the artwork, that is, it must be something about or in the artwork that is valuable. Against the cognitivist endorsing the propositional knowledge view, one might suggest that the cognitive gain is made subsequent to engagement with the artwork. Just as with philosophical thought experiments, the knowledge isn’t held within the fiction, it is derivative of the cognitive efforts of the beholder. Guernica, considered as a thought experiment, doesn’t give us the knowledge qua itself as an artwork, but rather we extrapolate the moral propositions subsequent to our engagement. That is, Guernica doesn’t say “the killing of innocent civilians is bad”, but instead gives us a pictorial representation of the bombing of Guernica via which we subsequently undergo some cognitive work to get at this claim. Hence, the cognitive gain is not found within the artwork itself, and so cannot be a value qua art.

It looks like the strongest weapon in the cognitivist arsenal is what Gibson calls neo-cognitivism: the view that artworks clarify, accentuate, enhance, or improve knowledge that we already hold. The cognitive value of art, then, is not its offering of discrete propositional knowledge, but its amplificatory role in our cognitive lives. It offers, for example, a new way of getting at some truth. This is the kind of view many aestheticians hold. Diffey (1995) offers a view he calls moderate anti-cognitivism, based on a middle point between Stolnitz’s (1992) claim that there are no distinctive artistic truths and the cognitivist claim that new knowledge is gained through art. Diffey thinks, instead, that artworks can serve as ways of getting at a new contemplation of states of affairs. Thomson-Jones (2005) likewise suggests that artworks can grant us access to new ways of looking at some circumstances and/or states of affairs, particularly in the ethical and/or political domain. Indeed, Gibson, whose paper has substantially informed this section, concurs that neo-cognitivism is the most promising way forward for the cognitivist.

In recent work, Currie (2020) has scrutinized the claims of the cognitivist in a variety of forms, from the thought-experiment theorist, to the empathy cultivator, to those that think we can gain propositional knowledge, particularly in the context of literary fiction. Currie suggests a move away from knowledge acquisition in cognitivism to the more “guarded” term “learning” (2020, p. 217), arguing that the thought experiments contained within philosophical and scientific discourse offer an epistemic gain with which literary fiction cannot gain parity. He also casts significant doubt on the reliability of truths extracted from fiction, citing doubts about the expertise of authors, evidence of the disconnection between creativity and understanding, and the little support there is for “the practices and institutions of fiction” being bearers of truths (Currie, 2020, p. 198). Ultimately, Currie’s conclusion suggests that “essential features of literary fiction – narrative complexity and the centrality of literary style – seriously detract from any substantial or epistemically rewarding parallel” (Lamarque, 2021), and hence that pursuit of the claim that “‘fiction teaches us stuff’ needs to be abandoned” (Currie, 2020, p. 218). Nevertheless, Currie is careful to emphasize that he does not claim that literary fiction can never grant us knowledge. Rather, the point is that learning can take place through fiction, but often this is marred by an increase in “ignorance, error, or blunted sensibility” (Currie, 2020, p. 216). Where learning does successfully take place, Currie does suggest that such cognitive gain is contributory to literary value.

With regard to empathic understanding and its improvement via literary fiction, Currie notes that empathy for fictional characters should not be taken in the same light as empathy in ‘reality’. When empathizing with fictional characters, the success of such empathy is dependent upon our getting it right as the narrative tells of the characters (Currie, 2020, p. 201). A further distinction that should be drawn, Currie suggests, is that between, on the one hand, an increase in empathy and, on the other, an increase in the capacity for this empathy to be used discriminatively and in a positive way (Currie, 2020, p. 204). Drawing on empirical literature, Currie argues that evidence of gain in positive empathic discriminatory capacities is lacking and so we should not be overly optimistic (Currie, 2020, pp. 207-209). Again, though, he does not exclude the possibility of positive gain being made in empathy as a result of fiction. Some may improve, some may not; some may grant a positive effect, some negative. One work could produce empathic gain for some individual, loss for another (Currie, 2020, pp. 215-216). Currie’s agenda regarding empathy cultivation through literature, therefore, is to warn against over-optimism, as it was in the cases above concerning more classical cognitivist claims.

4. The Political Value of Art

a. Epistemic Progress

In light of the continued skepticism about what the cognitivist can and cannot claim, the views that art can give us experiential and/or propositional knowledge have decreased in popularity. However, in the context of contributions to political-epistemic progress, Simoniti (2021) has claimed that some art not only gives us propositional knowledge of the same standard as objective means (such as textbooks) of getting at epistemic progress, but that art sometimes has an advantage over these other forms. Put simply, Simoniti thinks that artworks can target political discourse and engender similar kinds of knowledge as do textbooks or news articles, without invoking special or peculiar art-specific knowledge – a now relatively unpopular view – alongside being able to plug a gap that objective discourse leaves open.

This is because objective discourse must deal with generalizations: people, events, political parties, and so on, are categorized and essentialized such that a view about the general commands the scope of the whole group. Through art, we come into contact with particularized, individual narratives and characters, following their stories or depictions. Consequently, artworks point out that the standpoint of the ‘ideal spectator’, abstracted away and taking an encompassing view of states of affairs, events, and groups, is not always the most beneficial one. By allowing us to focus on individuals, artworks can become genuine contributions to epistemic progress through reducing over-confidence in our positions, recalibrating our critical capacities, and facilitating a neutral position (Simoniti, 2021, pp. 569-570).

Indeed, the view that artworks can serve as pieces of political criticism or commentary is not an unpopular view. Guernica, to which we have repeatedly referred, contains political content in its denunciation of war. Rivera’s Man at the Crossroads (1934) had political motivations in its content, commissioning, and subsequent destruction. Banksy, the enigmatic graffiti street artist, is renowned for the political undertones of their art. Guerrilla Girls’ critical works on the Met Museum, repeatedly exposing the injustice that women enter the Met far more readily as the subjects of nude depictions than as artists, are raw forms of political commentary. The core question for philosophers looking at the value of art is whether political value is a genuine determinant of artistic value.

Most aestheticians would be willing to say that art can serve as political commentary or criticism, but not that this represents a specific value in and of itself. Rather, in similar fashion to Simoniti, it looks like the aesthetician would claim that this kind of value is cognitive value. That is, artworks contribute to our knowledge and understanding of politically meritorious and demeritorious states of affairs, raising our consciousness and awareness about them, and hopefully recalibrating our attitudes so as to realize the most sociopolitically beneficial states of affairs possible. This engenders, then, the assessment above regarding the interaction between cognitive value and artistic value, including whether art can genuinely bestow knowledge upon us. Alternatively, we might consider some political aspects of artworks to have effect upon their moral value, or indeed both the cognitive and moral value of the work.

One crucial import into this debate is the nature of aesthetic and/or artistic autonomy, which might be helpfully viewed as a recasting of the interaction debates considered above. This debate has encroached with particular force upon the political power of art. The idea concerns whether art can and whether it should provide criticism or commentary on political states of affairs. Although there is scholarship on the matter, we are not concerned here with political autonomy in terms of the restraint of the state from censoring or interfering with the production and dissemination of particular artworks. Rather, we are concerned with political autonomy in terms of whether art should be viewed politically. For example, if one is a formalist (or a radical autonomist as described above), one is going to suggest that art should not be assessed in terms of its political content. If one thinks that artistic value can be determined to some extent by cognitive and moral factors, then one is likely to allow political criticism and commentary to feature in the assessment of the value of an artwork.

Yet the debate regarding the political autonomy of art can become one that is much more entrenched. In this form, the debate concerns not whether artworks should be assessed in terms of their political content, but whether artists can or should involve political criticism or comment in the first place. The idea here is that the domain of art is supposed to be a realm of detachment from reality, not rendered ‘impure’ by external factors. Artworks are a source of disinterested pleasure, a way of escaping everyday life and the perils and anxieties we draw from it, appreciated solely for their form and the experiences that arise therefrom. According to this kind of autonomist, artworks should not involve themselves with, and therefore feature, any political content. The task of art is to creatively detach from reality, and serving political ends will only diminish that endeavor (for an edited collection on aesthetic and artistic autonomy, see Hulatt, 2013).

For some, such as W. E. B. Du Bois, the artwork and the political are inextricable: “all Art is propaganda and ever must be, despite the wailing of the purists” (Du Bois, 1926, 29). For Du Bois, art – especially at the time of his writing during the Harlem Renaissance – should be used for “furthering the cause of racial equality for African Americans” (Horton, 2013, p. 307), rather than being constructed to “pander to white audiences for the purposes of publication” (Horton, 2013, p. 308). The artist and their work cannot be severed from the ethical, political, and social environment within which they produce and operate, and the proposal of a detachment of the aesthetic and political is inimical to the cementing of extant progress in racial equality and rights (Horton, 2013). Art, then, should not be an autonomous avenue wherein politics is avoided and instead should be used as a political device.

b. The Pragmatic View

Some artworks make an explicit and direct contribution to political progress and the rectification of social issues and problems. These works are taken to be generally captured by the terms socially engaged art and relational aesthetics. Relational aesthetics (see Bourriaud, 2002, for the seminal work that introduces this term) tends to refer to those works that do not take on traditional artistic qualities, devices, practices, mediums, or techniques, but instead take as their form the interpersonal relations that they affect. For example, Rirkrit Tiravanija has conducted a series of relational works in different galleries and exhibitions, constructing makeshift kitchens that serve Thai food to visitors and staff alike, fostering dialogue between them and establishing (or furthering) social bonds. Socially engaged works are executed in ways very similar to social work by engendering direct socially facilitative effects. This might include Oda Projesi’s workshops and picnics for children in the community, Women on Waves, the Dorchester Housing Projects, or the works of the 2015 Turner Prize-winning collective Assemble. In each instance, the artists make a direct contribution to the resolution or easing of some social issue. In this way, their goals are pragmatic, rooted in tangible, actualized progress, rather than beautiful or formal as we often take artworks to be.

These works are (i) accepted as works of art, and (ii) have value therein. As such, they have artistic value. Simoniti suggests that they “worryingly disregard the confines of the artworld” (2018, p. 72) by lacking the employment of traditional artistic and aesthetic form or values. In fact, it is precisely in their nature that they deviate from, as we have seen, traditional forms of artistic production and merit. As a consequence, Simoniti introduces an account of artistic value that can capture the social-work-esque achievements of these works and capture these achievements as valuable qua art. Called the pragmatic account of artistic value, it is used to explain only the value of these works, and states that value v, possessed by an artwork, is an artistic value if v is the positive political, cognitive, or ethical impact of the work (Simoniti, 2018, p. 76). That is, the value of these works, as artworks, is found in the positive, pragmatic contribution they make to sociopolitical progress. It should be qualified that Simoniti does not think the pragmatic view should be extended to all forms of art when assessing their value. Instead, it is the sensible option to take with regard to the artistic value of specific forms of art, such as socially engaged art or relational art. Other forms of art, of course, can have their artistic value assessed in terms of the positions we have already explored.

There are some concerns one might wish to raise with these pragmatic works. Firstly, one might question whether we should be referring to these as artworks. If they make no attempt at semblance of traditional artistic forms or value, then why call them artworks? Indeed, if they operate closer to the sphere of social work than art, and indeed have no traditionally artistic qualities, we might want to call them social-works rather than art-works. The relevance here for our task – concerning artistic value – is that if artistic value is something’s value qua or as art, then it needs to be art to have it! Simoniti appeals to the institutional and historicist theories of art to meet this objection. A related concern regards the nature of artistic value as a comparative notion; it is just how good or bad an artwork is. If socially engaged works have a specific account of artistic value that applies to them, then it doesn’t look like we can provide a legitimate comparison of them to more traditional artistic forms. Moreover, as this particular domain of value assessment perhaps aligns with social work more so than art, we might argue that the comparison should take place between socially engaged works and, well, social work, rather than artworks. If this is the case, then we might wonder if the value assessment is actually about the works qua art, or some other domain, that is, social work.

Finally, one might suggest that extant accounts of artistic value may indeed capture the artistic value of these works. Consider, for example, the moral beauty view of Gaut that we saw earlier. If we can observe an interaction between ethically meritorious character and action and aesthetic value, suggesting that the former are beautiful, then this could be used to apply to the ethically meritorious character and action of these relational works. Likewise, the functional beauty view, endorsed in particular by Parsons and Carlson (2008), suggests that aesthetic value can be attributed on the basis of something’s appearance corresponding to its intended function. For example, a flat tyre is aesthetically displeasing because it appears as inhibiting the function of a car (Bueno, 2009, p. 47). Perhaps, we might claim, socially engaged works appear in a way that corresponds to their intended function. These two brief parses might suggest that the introduction of a specialized notion of artistic value may not be needed.

5. Summary of Available Positions and Accounts

There is a wealth of available views regarding artistic value, its determinants, its relationship to the value of art, its relationship to aesthetic value, whether and how determinants can affect it, and so on. Here, I want to provide a brief outline of the views discussed and available positions/accounts. The purpose is to provide a brief, working statement about the views at hand. This is especially useful as sometimes multiple different views can adopt the same heading term. This set is by no means exhaustive, may be incomplete, and will be updated as is seen fit.

Aestheticism – aesthetic value and artistic value are one and the same value, and only aesthetic value matters for determining artistic value (things like cognitive value, moral value, political value, don’t matter for an assessment qua art).

Anti-cognitivism – there is no such thing as a distinctively artistic truth, or a truth that only art can teach us (see Stolnitz, 1992).

Cognitivism – artworks have something to grant to us in terms of knowledge. This might be new propositional knowledge, experiential knowledge, specifically artistic knowledge, or the artwork may clarify or strengthen already-held truths.

Cognitive Immoralism – moral defects in an artwork can be artistically valuable insofar as they provide some cognitive value (for example, cultivation of our moral understanding).

Definition-Evaluation Parallelism – what makes an artwork an artwork is x, x is a value of art, and the value of art is determined by one sole value, x. Not all value definitions of art conform to definition-evaluation parallelism.

Eliminativism – aesthetic value and artistic value are one and the same thing, and as such talk of artistic value is redundant (things like cognitive value, moral value, and political value might matter for the eliminativist, if they commit to a broad notion of aesthetic value).

Error-theory about artistic value – aesthetic value is what we mean by value qua art; there is no such thing as artistic value. We are in error when we talk about it.

Ethicism – moral merits are always aesthetic merits, and moral defects are always aesthetic defects.

Immoralism – moral defects in an artwork can be aesthetic/artistic merits.

Interactionist – (about moral value) someone who thinks that the moral value of an artwork interacts with that artwork’s aesthetic/artistic value.

Moderate Autonomism – aesthetic value is all that matters for artistic value, but artworks might be assessed with reference to the moral domain. However, the latter has no bearing on the artistic value of the work (its value qua art).

Moderate Moralism – in some cases, a work of art is susceptible to treatment in the moral domain, and this can affect its artistic value (its value qua art).

Neo-cognitivism – artworks can be cognitively valuable, and their artistic value augmented as a result, insofar as they can serve to clarify or improve knowledge we already possess.

Pluralism about artistic value – there are many determinants of artistic value, such as aesthetic value, cognitive value, moral value, and political value.

Pragmatic View of Artistic Value – artistic value, explicitly and solely for the set of socially engaged artworks, is the positive cognitive, ethical, or political effect they entail. This view should not be used to apply to other kinds of art, such as painting, sculpture, music, and so on (see Simoniti, 2018).

Radical Autonomism – aesthetic value is all that matters for artistic value, and any assessment of morality with regard to an artwork is inappropriate even if one does not think it bears weight on artistic value.

Radical Moralism – the artistic value of a work of art is determined by, or reducible to, its moral value.

Robust Immoralism – moral defects in an artwork give rise to artistic value insofar as a work achieves aesthetic success through aesthetic properties that arise because of them. For example, fictional murder may be valuable insofar as it invites excitement, vivacity, or mystery.

The Trivial Theory (of artistic value) – artworks have lots of different determinant values, none of which are specific to, or characteristic of, art.

Value Definitions of Art – what makes an artwork an artwork is x, and x is also a (or the) value of art. If x is the sole determinant of the value of art, then the value definition is an instance of definition-evaluation parallelism.

6. References and Further Reading

  • Abell, C. (2012) ‘Art: What it Is and Why it Matters’. Philosophy and Phenomenological Research. Vol. 85 (3) pp. 671-691
  • Bourriaud, N. (2002) Relational Aesthetics. Dijon: Les Presses du réel
  • Bueno, O. (2009) ‘Functional Beauty: Some Applications, Some Worries’. Philosophical Books. Vol. 50 (1) pp. 47-54
  • Bullough, E. (1912) ‘“Psychical Distance” as a Factor in Art and an Aesthetic Principle’. British Journal of Psychology. Vol. 5 (2), pp. 87-118
  • Carroll, N. (1996) ‘Moderate Moralism’. British Journal of Aesthetics. Vol. 36 (3), pp. 223-238
  • Carroll, N. (2004) ‘Non-Perceptual Aesthetic Properties: Comments for James Shelley’. British Journal of Aesthetics. Vol. 44 (4) pp. 413-423
  • Carroll, N. (2012) ‘Recent Approaches to Aesthetic Experience’. The Journal of Aesthetics and Art Criticism. Vol. 70 (2) pp. 165-177
  • Carroll, N. (2015) ‘Defending the Content Approach to Aesthetic Experience’. Metaphilosophy. Vol. 46 (2) pp. 171-188
  • Currie, G. (2020) Imagining and Knowing. Oxford: Oxford University Press
  • Davies, S. (1990) ‘Functional and Procedural Definitions of Art’. Journal of Aesthetic Education. Vol. 24 (2) pp. 99-106
  • Davies, S. (1991) Definitions of Art. London: Cornell University Press
  • Dickie, G. (1964) ‘The Myth of the Aesthetic Attitude’. American Philosophical Quarterly. Vol. 1 (1) pp. 56-65
  • Diffey, T. J. (1995) ‘What can we learn from art?’. Australasian Journal of Philosophy. Vol. 73 (2) pp. 204-211
  • Du Bois, W. E. B. (1926) ‘Criteria of Negro Art’. The Crisis. Vol. 32 pp. 290-297
  • Eaton, A. W. (2012) ‘Robust Immoralism’. The Journal of Aesthetics and Art Criticism. Vol. 70 (3) pp. 281-292
  • Gaut, B. (2007) Art, Emotion, Ethics. Oxford: Oxford University Press
  • Gibson, J. (2008) ‘Cognitivism and the Arts’. Philosophy Compass. Vol. 3 (4) pp. 573-589
  • Goldman, A. (2013) ‘The Broad View of Aesthetic Experience’. The Journal of Aesthetics and Art Criticism. Vol. 71 (4) pp. 323-333
  • Hamilton, A. (2006) ‘Indeterminacy and reciprocity: contrasts and connections between natural and artistic beauty’. Journal of Visual Art Practice. Vol. 5 (3) pp. 183-193
  • Hanson, L. (2013) ‘The Reality of (Non-Aesthetic) Artistic Value’. The Philosophical Quarterly. Vol. 63 (252) pp. 492-508
  • Hanson, L. (2017) ‘Artistic Value is Attributive Goodness’. The Journal of Aesthetics and Art Criticism. Vol. 75 (4) pp. 415-427
  • Hanson, L. (2019) ‘Two Dogmas of the Artistic-Ethical Interaction Debate’. Canadian Journal of Philosophy. Vol. 50 (2) pp. 209-222
  • Horton, R. (2013) ‘Criteria of Negro Art’. The Literature of Propaganda. Vol. 1 pp. 307-309
  • Hulatt, O. ed. (2013) Aesthetic and Artistic Autonomy. New York: Bloomsbury Academic
  • Kant, I. (1987) Critique of Judgement translated by Werner Pluhar. Cambridge: Hackett Publishing Company.
  • Kieran, M. (2003) ‘Forbidden Knowledge: The Challenge of Immoralism’ in Bermúdez, J. L., Gardner, S. eds. (2003) Art and Morality. London: Routledge
  • Kulka, T. (2005) ‘Forgeries and Art Evaluation: An Argument for Dualism in Aesthetics’. The Journal of Aesthetic Education. Vol. 39 (3) pp. 58-70
  • Lopes, D. (2011) ‘The Myth of (Non-Aesthetic) Artistic Value’. The Philosophical Quarterly. Vol. 61 (244) pp. 518-536
  • Lopes, D. (2018) Being for Beauty. Oxford: Oxford University Press
  • Matravers, D. (2014) Introducing Philosophy of Art: in Eight Case Studies London: Routledge
  • McGregor, R. (2014) ‘A Critique of the Value Interaction Debate’. British Journal of Aesthetics. Vol. 54 (4) pp. 449-466
  • Parsons, G., Carlson, A. (2008) Functional Beauty. Oxford: Oxford University Press
  • Ryle, G. (1949/2009) The Concept of Mind: 60th Anniversary Edition Oxford: Routledge
  • Sauchelli, A. (2012) ‘Ethicism and Immoral Cognitivism: Gaut versus Kieran on Art and Morality’. The Journal of Aesthetic Education. Vol. 46 (3) pp. 107-118
  • Shelley, J. (2003) ‘The Problem of Non-Perceptual Art’. British Journal of Aesthetics. Vol. 43 (4) pp. 363-378
  • Shelley, J. (2019) ‘The Default Theory of Aesthetic Value’. British Journal of Aesthetics. Vol. 59 (1) pp. 1-12
  • Sibley, F. (1959) ‘Aesthetic Concepts’. Philosophical Review. Vol. 68 (4) pp. 421-450
  • Sibley, F. (2001) Approach to Aesthetics. Oxford: Oxford University Press
  • Simoniti, V. (2018) ‘Assessing Socially Engaged Art’. The Journal of Aesthetics and Art Criticism. Vol. 76 (1) pp. 71-82
  • Simoniti, V. (2021) ‘Art as Political Discourse’. British Journal of Aesthetics. Vol. 61 (4) pp. 559-574
  • Stecker, R. (2019) Intersections of Value: Art, Nature, and the Everyday. Oxford: Oxford University Press
  • Stolnitz, J. (1960) Aesthetics and Philosophy of Art Criticism. Boston: Houghton Mifflin
  • Stolnitz, J. (1978) ‘“The Aesthetic Attitude” in the Rise of Modern Aesthetics’. The Journal of Aesthetics and Art Criticism. Vol. 36 (4) pp. 409-422
  • Stolnitz, J. (1992) ‘On the Cognitive Triviality of Art’. British Journal of Aesthetics. Vol. 32 (3) pp. 191-200
  • Thomson-Jones, K. (2005) ‘Inseparable Insight: Reconciling Cognitivism and Formalism in Aesthetics’. The Journal of Aesthetics and Art Criticism. Vol. 63 (4) pp. 375-384
  • Van der Berg, S. (2019) ‘Aesthetic hedonism and its critics’. Philosophy Compass. Vol. 15 (1) e12645

 

Author Information

Harry Drummond
Email: harry.drummond@liverpool.ac.uk
University of Liverpool
United Kingdom

Charles Darwin (1809–1882)

Charles Darwin is primarily known as the architect of the theory of evolution by natural selection. With the publication of On the Origin of Species in 1859, he advanced a view of the development of life on earth that profoundly shaped nearly all biological and much philosophical thought which followed. A number of prior authors had proposed that species were not static and were capable of change over time, but Darwin was the first to argue that a wide variety of features of the biological world could be simultaneously explained if all organisms were descended from a single common ancestor and modified by a process of adaptation to environmental conditions that Darwin christened “natural selection.”

Although it would not be accurate to call Darwin himself a philosopher, as his training, his professional community, and his primary audience place him firmly in the fold of nineteenth-century naturalists, Darwin was deeply interested and well versed in philosophical works, which shaped his thought in a variety of ways. This foundation included (among others) the robust tradition of philosophy of science in Britain in the 1800s (including, for instance, J. S. Mill, William Whewell, and John F. W. Herschel), and German Romanticism (filtered importantly through Alexander von Humboldt). From these influences, Darwin would fashion a view of the living world focused on the continuity found between species in nature and a naturalistic explanation for the appearance of design and the adaptation of organismic characters to the world around them.

It is tempting to look for antecedents to nearly every topic present in contemporary philosophy of biology in the work of Darwin, and the extent to which Darwin anticipates a large number of issues that remain pertinent today is certainly remarkable. This article, however, focuses on Darwin’s historical context and the questions to which his writings were primarily dedicated.

Table of Contents

  1. Biography
  2. Darwin’s Philosophical Influences
    1. British Philosophy of Science
    2. German Romanticism
    3. Ethical and Moral Theory
  3. The Argument for Natural Selection
    1. Darwin’s Theory
    2. The Origin of Species
  4. Evolution, Humans, and Morality
    1. The Question of Human Evolution
    2. The Descent of Man
    3. Sexual Selection
  5. Design, Teleology, and Progress
    1. Design: The Darwin-Gray Correspondence
    2. Was Darwin a Teleologist?
    3. Is Natural Selection Progressive?
  6. The Reception of Darwin’s Work
    1. Scientific Reception
    2. Social and Religious Reception
    3. Darwin and Philosophy
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Charles Robert Darwin was born in Shropshire, England, on February 12, 1809. He came from a relatively illustrious and well-to-do background: his father, Robert Darwin (1766–1848), was a wealthy and successful surgeon, and his uncle Josiah Wedgwood II (1769–1843) was the son of the founder of the pottery and china works that still bear the family name. His grandfather was Erasmus Darwin (1731–1802), a co-founder of the Lunar Society, a group that brought together elite natural philosophers from across the English Midlands, including the chemist Joseph Priestley and the engineers James Watt and Matthew Boulton. Erasmus Darwin’s natural-philosophical poetry was widely known, especially Zoonomia (or “Laws of Life”), published between 1794 and 1796, and containing what we might today call some “proto-evolutionary” thought (Browne 1989).

Darwin had been expected to follow in his father’s footsteps and set out for the University of Edinburgh at the age of sixteen to study medicine. He was, anecdotally, so distressed by surgical demonstrations (in the years prior to anesthesia) that he quickly renounced any thoughts of becoming a doctor and turned his focus instead to the zoological lessons (and collecting expeditions) of Robert Edmond Grant, who would soon become his first real mentor. Darwin’s father, “very properly vehement against my turning into an idle sporting man, which then seemed my probable destination” (Autobiography, p. 56), sent him in 1828 to Cambridge, with the goal of becoming an Anglican parson. Cambridge, however, would put him in contact with John Stevens Henslow, an influential botanist who encouraged Darwin to begin studying geology.

His friendship with Henslow would trigger one of the pivotal experiences of Darwin’s life. The professor was offered a position as “ship’s naturalist” for the second voyage of the HMS Beagle, a vessel tasked with sailing around the world and preparing accurate charts of the coast of South America. Henslow, dissuaded by his wife from taking the position himself, offered it to Darwin. After convincing his father that there could, indeed, be a career waiting for him at the end of the trip, Darwin departed on December 27, 1831.

Darwin left England a barely credentialed, if promising, twenty-two-year-old student of zoology, botany, and geology. By the time the ship returned in 1836, Darwin had already become a well-known figure among British naturalists. This recognition occurred for several reasons. First, it was a voyage of intellectual transformation. One of Darwin’s most significant scientific influences was Charles Lyell, whose three-volume Principles of Geology arrived by post over the course of the voyage, in the process dramatically reshaping the way in which Darwin would view the geological, fossil, zoological, and botanical data that he collected on the trip. Second, Darwin spent the entire voyage – much of that time in inland South America, while the ship made circuits surveying the coastline – collecting a wide variety of extremely interesting specimens and sending them back to London. Those collections, along with Darwin’s letters describing his geological observations, made him a popular man upon his return, and a number of fellow scientists (including the geologist and fossil expert Richard Owen, later to be a staunch critic of Darwin’s, and the ornithologist John Gould) prepared, cataloged, and displayed these specimens, many of which were extensively discussed in Darwin’s absence.

It was also on this trip that Darwin made his famed visit to the islands of the Galapagos. It is certain that the classic presentation of the Galapagos trip as a sort of “eureka moment” for Darwin, in which he both originated and became convinced of the theory of natural selection in a single stroke by analyzing the beaks of the various species of finch found across several of the islands, is incorrect. (Notably, Darwin had mislabeled several of his collected finch and mockingbird specimens, and it was only after they were analyzed by the ornithologist Gould on his return and supplemented by several other samples collected by the ship’s captain FitzRoy, that he saw the connections between beak and mode of life that we now understand to be so crucial.) But the visit was nonetheless extremely important. For one thing, Darwin was struck by the fact that the organisms found in the Galapagos did not look like inhabitants of other tropical islands, but rather seemed most similar to those found in coastal regions of South America. Why, Darwin began to wonder, would a divine intelligence not create species better tailored to their island environment, rather than borrowing forms derived from the nearby continent? This argument from biogeography (inspired in part by Alexander von Humboldt, about whom more in the next section) was one Darwin always found persuasive, and it would later be included in the Origin.

Beginning with his return in late 1836, and commencing with a flurry of publications on the results of the Beagle voyage that would culminate with the appearance of the book that we now call Voyage of the Beagle (1839, then titled Journal of Researches into the Geology and Natural History of the Various Countries Visited by H.M.S. Beagle), Darwin would spend six fast-paced years moving through London’s scientific circles. This was a period of frantic over-work and rapidly progressing illness (the subject of extreme speculation in the decades since, with the latest hypothesis being an undiagnosed lactose intolerance). Darwin married his first cousin, Emma Wedgwood, in early 1839 (a fact that caused him constant worry over the health of his children), and the family escaped the pressures of London to settle at a country manor in Down, Kent, in 1842 (now renovated as a very attractive museum). Darwin would largely be a homebody from this point on; his poor health and deep attachment to his ten children kept him hearthside for much of the remainder of his career. The deaths of two of his children in infancy, and especially of a third, Annie, at the age of ten, were tragedies that weighed heavily upon him.

Before we turn to Darwin’s major scientific works, it is worth pausing to briefly discuss the extensive evidence revealing the development of Darwin’s thought. Luckily for those of us interested in studying the history of biology, he was a pack-rat. Darwin saved nearly every single letter he received and made pressed copies of those he wrote. He studiously preserved every notebook, piece of copy paper, or field note; we even have lists of the books that he read and when he read them, and some of his children’s drawings, if he later wrote down a brief jot of something on the back of them. As a result, we are able to chronicle the evolution (if you will) of his thinking nearly down to the day.

Thus, we know that over the London period – and particularly during two crucial years, 1837 and 1838 – Darwin would quickly become convinced that his accumulated zoological data offered unequivocal support for what he would call transformism: the idea that the species that exist today are modified descendants of species that once existed in the past but are now extinct. Across the top of his B notebook (started around July 1837), he wrote the word ZOONOMIA, in homage to his grandfather’s own transformist thought. The first “evolutionary tree” would soon follow. Around this time, he came to an understanding of natural selection as a mechanism for transformism, in essentially its modern form – since no organism is exempt from the struggle to survive and reproduce, any advantage, however slight, over its competitors will lead to more offspring in the long run, and hence the accumulation of advantageous change. With enough time, differences large enough to create the gulfs between species would arise.

In 1842, Darwin drafted a short version of this theory (now known as the Sketch) and expanded it to a much longer draft in 1844 (now known as the Essay), which he gave to his wife with instructions and an envelope of money to ensure that it would be published if Darwin died as a result of his persistent health problems. Somewhat inexplicably, he then set this work aside for around a decade, publishing a magisterial three-volume work on the classification of the barnacles. (So all-consuming was the pursuit that one of the Darwin children asked a friend where their father “did his barnacles.”) Hypotheses for the delay abound: aversion to conflict; fear of the religious implications of evolution; the impact of the wide ridicule of the rather slapdash anonymous “evolutionary” volume Vestiges of the Natural History of Creation, published in 1844; or simply a desire to immerse himself fully in the details of a taxonomic project prior to developing his own theoretical perspective.

In any event, he slowly began working on evolutionary ideas again over the mid-1850s (starting to draft a massive tome, likely in the end to have been multi-volume, now known as the Big Book or Natural Selection), until, on June 18, 1858, he received a draft of an article from fellow naturalist Alfred Russel Wallace. Darwin believed – whether or not this is true is another matter – that he had been entirely scooped on natural selection. Without his involvement, Lyell and the botanist Joseph Dalton Hooker arranged a meeting of the Linnean Society at which some of Darwin’s Sketch and Wallace’s paper would be read, allowing Darwin to secure priority in the discovery of natural selection. Meanwhile, Darwin turned to the preparation of an “abstract” of the larger book, much lighter on citations and biological detail than he would have liked, and he rushed it into print. On November 24, 1859, On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life was published. Its initial print run immediately sold out.

The book became a massive success, powered in no small part by the ability of natural selection to parsimoniously explain a staggering array of otherwise disunified biological facts (see section 6). Darwin was promoted by a variety of eloquent and influential defenders (such as Thomas Henry Huxley), and even a number of fellow naturalists who were otherwise skeptical (particularly about the theory’s relationship to religious belief) offered him public support.

Despite Darwin’s best efforts (see section 4) to exclude discussion of humans and human evolution from the Origin, both the scientific community and the general public were quick to see the striking impact that Darwin’s work would have on our conception of human origins. After publishing books on the fertilization of orchids, the morphology of climbing plants, and the variation of domesticated plants and animals, Darwin finally turned directly to the question of humans, publishing The Descent of Man in 1871. His efforts there to connect the mental capacities of animals with those of humans would be extended by The Expression of the Emotions in Man and Animals, published the following year, one of the first books to be illustrated with photographic plates. Further books on fertilization, flowers, movement in plants, and a final book on earthworms were Darwin’s remaining major scientific publications – all directed at offering small but important demonstrations of the power of natural selection in action, and the ability of gradual, continuous change to accumulate in significant ways.

Darwin died in April 1882, and is buried in Westminster Abbey, next to John Herschel and just across from Isaac Newton. As such an illustrious burial attests, his legacy as one of the leading scientists of the nineteenth century was immediately cemented, even if the theory of natural selection itself took several decades to meet with universal acceptance (see section 6). By the 1950s, biological theory as a whole had been remade in a largely Darwinian image, and in 1964, Theodosius Dobzhansky would famously write that “nothing makes sense in biology except in the light of evolution.” Darwin was even featured on one side of the British £10 note from 2000 to 2018.

2. Darwin’s Philosophical Influences

For all that Darwin was assuredly not a professional philosopher – as indicated above, his relatively scattered educational trajectory was not one that would have had him reading large numbers of philosophical texts – he was still quite well-read, and concepts from both British and broader European traditions can undeniably be detected in his work. Much debate surrounds the ways in which we should understand those influences, and how they might (or might not) have shaped the content of his later scientific works.

We can be certain that while Darwin studied at Cambridge, he would have received the standard sort of training for a young man interested in becoming a naturalist and an Anglican minister (see Sloan, in Hodge and Radick 2009). He would have studied the Bible, as well as some important works of philosophy (such as John Locke’s Essay). He wrote later in his autobiography about the extent to which reading the natural theology of William Paley had been formative for him – the young Darwin was a genuine admirer of Paley’s approach, and hints of Paley’s perspective on design in nature can be found in the respect with which Darwin would treat arguments concerning the difficulty of accounting for “perfectly” adapted characters like the eye of an eagle.

Darwin also began to engage with the two philosophical traditions that would, as many commentators have noted (see especially Richards and Ruse 2016), largely structure his perspective on the world: one British, consisting of the writings on science by authors like John Herschel, William Whewell, and John Stuart Mill, and one German, which, especially for the young Darwin, would focus on the Romanticism of Alexander von Humboldt.

a. British Philosophy of Science

The British tradition was born out of the professionalization and standardization of scientific practice. Whewell would coin the very term ‘scientist’ around this period, and he and others were engaged in an explicit attempt to clarify the nature of scientific theorizing and inference. Works doing exactly this were published in rapid succession just as Darwin was negotiating the demands of becoming a professional naturalist and fashioning his work for public consumption. Herschel’s Preliminary Discourse on the Study of Natural Philosophy was published in 1830 (Darwin read it the same year), Whewell’s massive History of the Inductive Sciences and Philosophy of the Inductive Sciences appeared in 1837 and 1840, respectively, and Mill’s System of Logic dates from 1843. The very concept of science itself, the ways in which scientific evidence ought to be collected and inferences drawn, and the kinds of character traits that should be possessed by the ideal scientist were all the object of extensive philosophical discourse.

For his part, Darwin certainly was aware of the works of these three authors, even those that he had not read, and was further exposed to them all through their presence in a variety of contemporary scientific texts. Works like Charles Lyell’s Principles of Geology (1830–1833) were self-consciously structured to fulfill all the canons of quality science that had been laid down by the philosophers of the day, and so served as practical exemplars for the kind of theorizing that Darwin would later attempt to offer.

Without going too far afield into the incredibly rich subject of nineteenth-century British philosophy of science, a brief sketch of these views is nonetheless illuminating. In the early years of the 1800s, British science had been left with an uneasy mix of two competing philosophies of science. On the one hand, we find a strict kind of inductivism, often attributed to Francis Bacon, as hardened and codified by Isaac Newton. Scientists are to disinterestedly pursue the collection of the largest possible basis of empirical data and generalize from them only when a theoretical claim has received sufficient evidential support. Such was, the story went, the way in which Newton himself had induced the theory of universal gravitation on the basis of celestial and terrestrial motions, and such was the intent behind his famous injunction, “hypotheses non fingo” – I frame no hypotheses.

Such a philosophy of science, however, ran afoul of perhaps the most significant theoretical development of the early nineteenth century: the construction of the wave theory of light, along with Thomas Young and Augustin Fresnel’s impressive experimental confirmations of the various phenomena of interference. This posed a straightforward set of challenges for British philosophers of science to solve. Other than the famous “crucial experiments” in interference, there was little inductive evidence for the wave theory. What was the medium that transmitted light waves? It seemed to escape any efforts at empirical detection. More generally, was not the wave theory of light exactly the sort of hypothesis that Newton was warning us against? And if so, how could we account for its substantial success? How should the Baconian inductive method be related to a more speculative, deductive one?

Herschel, Whewell, and Mill differ in their approaches to this cluster of questions: Herschel’s emphasis on the role of the senses, Whewell’s invocation of Kantianism, and Mill’s use of more formal tools stand out as particularly notable. But at the most general level, all were trying, among numerous other goals, to find ways in which more expansive conceptions of scientific inference and argument could make room for a “legitimate” way to propose and then evaluate more speculative or theoretical claims in the sciences.

Of course, any theory addressing changes in species over geologic time will confront many of the same sorts of epistemic problems that the wave theory of light had. Darwin’s introduction of natural selection, as we will see below, both profited and suffered from this active discussion around questions of scientific methodology. On the one hand, the room that had been explicitly made for the proposition of more speculative theories allowed for the kind of argument that Darwin wanted to offer. But on the other hand, because so much focus had been aimed at these kinds of questions in recent years, Darwin’s theory was, in a sense, walking into a philosophical trap, with interlocutors primed to point out just how different his work was from the inductivist tradition. To take just one example, Darwin would complain in a letter to a friend that he thought that his critics were asking him for a standard of proof that they did not demand in the case of the wave theory. This conflict will be made explicit in the context of the Origin in the next section.

b. German Romanticism

The other philosophical tradition which substantially shaped Darwin’s thought was a German Romantic one, largely present in the figure of the naturalist, explorer, and philosopher Alexander von Humboldt (1769–1859). Darwin seems to have first read Humboldt in the years between the completion of his bachelor’s degree and his departure on the Beagle. Throughout his life, he often described his interactions with the natural world in deeply aesthetic, if not spiritual, terms, frequently linking such reflections back to Humboldt’s influence. A whole host of Darwin’s writings on the environments and landscapes he saw during his voyage, from the geology of St. Jago (now Santiago) Island in Cape Verde to the rainforests of Brazil, are couched in deeply Humboldtian language.

But this influence was not only a matter of honing Darwin’s aesthetic perception of the world, though this was surely part of Humboldt’s impact. Humboldt described the world in relational terms, focusing in particular on the reciprocal connections between botany, geology, and geography, a perspective that would be central in Darwin’s own work. Humboldt also had expounded a nearly universally “gradualist” picture of life – emphasizing the continuity between animals and humans, plants and animals, and even animate and inanimate objects. As we will see below, this kind of continuity was essential to Darwin’s picture of human beings’ place in the world.

In addition to the widely recognized influence of Humboldt, Darwin knew the works of Carl Gustav Carus, a painter and physiologist who had proposed theories of the unity of type (the sharing of an “archetype” among all organisms of a particular kind, reminiscent as well of the botanical work of Goethe). That archetype theory, in turn, was influentially elaborated by Richard Owen, with whom Darwin would work extensively on the evaluation and classification of some of his fossil specimens after his return on the Beagle. As noted above, Darwin was quite familiar with the work of Whewell, who integrated a very particular sort of neo-Kantianism into the context of an otherwise very British philosophy of science (on this point, see particularly Richards’s contribution to Richards and Ruse 2016).

Controversy exists in the literature over the relative importance of the British and German traditions to Darwin’s thought. The debate in the early twenty-first century is somewhat personified in the figures of Michael Ruse and Robert J. Richards, partisans of the British and German influences on Darwin’s work, respectively. On Ruse’s picture, the British philosophy-of-science context, supplemented by the two equally British cultural forces of horticulture and animal breeding (hallmarks of the agrarian, land-owner class) and the division of labor and a harsh struggle for existence (features of nineteenth-century British entrepreneurial capitalism), offers us the best explanation for Darwin’s intellectual foundations. Richards, of course, does not want to deny the obvious presence of these influences in Darwin’s thought. For him, what marks Darwin’s approach out as distinctive is his knowledge of and facility with German Romantic influences. In particular, Richards argues, they let us understand Darwin’s perennial fascination with anatomy and embryology, aspects that are key in this German tradition and the inclusion of which in Darwin’s work might otherwise remain confusing.

c. Ethical and Moral Theory

Darwin recognized throughout his career that his approach to the natural world would have an impact on our understanding of humans. His later works on the evolution of our emotional, social, and moral capacities, then, require us to consider his knowledge of and relation to the traditions of nineteenth-century ethics.

In 1839, Darwin read the work of Adam Smith, in particular his Theory of Moral Sentiments, which he had already known through Dugald Stewart’s biography of Smith. (It is less likely that he was familiar first-hand with any of Smith’s economic work; see Priest 2017.) Smith’s approach to the moral sentiments – that is, his grounding of our moral conduct in our sympathy and social feelings toward one another – would be reinforced by a work that was meaningful for Darwin’s theorizing but is little studied today: James Mackintosh’s Dissertation on the Progress of Ethical Philosophy, published in 1836. For Smith and Mackintosh both, while rational reflection could aid us in better judging a decision, what really inspires moral behavior or right action is the feeling of sympathy for others, itself a fundamental feature of human nature. From his very first reading of Smith, Darwin would begin to write in his notebooks that such an approach to morality would enable us to ground ethical behavior in an emotional capacity that could be compared with those of the animals – and which could have been the target of natural selection.

Finally, we have the influence of Thomas Malthus. Darwin read Malthus’s Essay on the Principle of Population (1798) on September 28, 1838, just as he was formulating the theory of natural selection for the first time. Exactly what Darwin took from Malthus, and, therefore, the extent to which the reading of Malthus should be seen as a pivotal moment in the development of Darwin’s thought, is a matter of extensive debate. We may be certain that Darwin took from the first chapter of Malthus’s work a straightforward yet important mathematical insight. Left entirely to its own devices, Malthus notes, the growth of population is an exponential phenomenon. By contrast, even with optimistic assumptions about the ability of humans to increase efficiency and yield in our production of food, it seems impossible that growth in the capacity to supply resources for a given population could proceed faster than a linear increase.
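
The arithmetic behind this contrast can be made concrete with a minimal sketch. It is not drawn from Malthus’s or Darwin’s text: the function names, the starting values, the doubling ratio, and the unit increment are all illustrative assumptions, chosen only to show how quickly a geometric series outruns an arithmetic one.

```python
def malthus_table(generations, pop0=1.0, food0=1.0, ratio=2.0, increment=1.0):
    """Return (generation, population, food) rows.

    Assumptions (illustrative only): population multiplies by `ratio`
    each generation, while the food supply gains a fixed `increment`.
    """
    rows = []
    for t in range(generations + 1):
        population = pop0 * ratio ** t   # geometric (exponential) growth
        food = food0 + increment * t     # arithmetic (linear) growth
        rows.append((t, population, food))
    return rows


def first_shortfall(rows):
    """Return the first generation in which population exceeds food, or None."""
    for t, population, food in rows:
        if population > food:
            return t
    return None


if __name__ == "__main__":
    table = malthus_table(8)
    for t, population, food in table:
        print(f"generation {t}: population {population:g}, food {food:g}")
    print("first shortfall at generation", first_shortfall(table))
```

Under these assumed values, the two series start level and the gap then widens in every later generation. Changing `ratio` or `increment` alters when the shortfall begins, but whenever `ratio` exceeds 1 the geometric series eventually overtakes the arithmetic one.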

This insight became, as Darwin endeavored to produce a more general theory of change in species, crucial to the conviction that competition in nature – what he would call the struggle for existence – is omnipresent. Every organism is locked in a constant battle to survive and reproduce, whether with other members of its species, other species, or even its environmental conditions (of drought or temperature, for instance). This struggle can help us to understand both what would cause a species to go extinct, and to see why even the slightest heritable advantage could tilt the balance in favor of a newly arrived form.

Of course, Malthus’s book does not end after its first chapter. The reason that this inevitable overpopulation and hardship seems to be absent from much of the human condition, Malthus argues, is that (at least some) humans have been prudent enough to adopt other kinds of behaviors (like religious or social checks on marriage and reproduction) that prevent our population growth from proceeding at its unfettered, exponential pace. We must ensure, he argues, that efforts to improve the lives of the poor actually do so, rather than producing the conditions for problematic overpopulation. A number of commentators, perhaps most famously Friedrich Engels, have seen in this broader “Malthusianism” the moral imprint of upper-class British society. Others, by contrast, have argued that Darwin’s context is more complex than this, and requires us to carefully unpack his relationship to the multi-faceted social and cultural landscape of nineteenth-century Britain as a whole (see Hodge 2009 and Radick, in Hodge and Radick 2009).

3. The Argument for Natural Selection

Famously, Darwin described the Origin as consisting of “one long argument” for his theory of evolution by natural selection. From the earliest days of its publication, commentators were quick to recognize that while this was assuredly true, it was not the kind of argument that was familiar in the scientific method of the day.

a. Darwin’s Theory

The first question to pose, then, concerns just what Darwin is arguing for in the Origin. Strikingly, he does not use any form of the term “evolution” until the very last word of the book; he instead has a penchant for calling his position “my view” or “my theory.” Contemporary scholars tend to reconstruct this theory in two parts. First, there is the idea of descent with modification. It was common knowledge (more than a century after the taxonomic work of Linnaeus, for example) that the species that exist today seem to show us a complex network of similarities, forming a tree, composed of groups within groups. Darwin’s proposal, then, is that this structure of similarity is evidence of a structure of ancestry – species appear similar to one another precisely because they share common ancestors, with more similar species having, in general, shared an ancestor more recently. Carrying this reasoning to its logical conclusion, then, leads Darwin to propose that life itself was “originally breathed into a few forms or into one” (Origin, p. 490).

The second argumentative goal of the Origin is to describe a mechanism for the production of the changes which have differentiated species from one another over the history of life: natural selection. As organisms constantly vary, and those variations are occasionally more or less advantageous in the struggle for existence, the possessors of advantageous variations will find themselves able to leave more offspring, producing lasting changes in their lineage, and leading in the long run to the adaptation and diversification of life.
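
The logic of this mechanism can be illustrated with a toy calculation. It is a modern, deliberately simplified formalization rather than anything Darwin himself wrote down. It assumes two heritable types in a large population, no chance effects, and a fixed reproductive advantage `s` for the variant; the function names and parameter values are hypothetical.

```python
def next_frequency(p, s):
    """Frequency of the advantageous variant after one generation.

    Assumption: carriers of the variant leave (1 + s) offspring for every
    1 left by non-carriers, and offspring inherit their parent's type.
    """
    return p * (1 + s) / (1 + p * s)


def generations_to_majority(p0, s, max_generations=100_000):
    """Count generations until the variant exceeds half the population.

    Returns None if that does not happen within `max_generations`.
    """
    p = p0
    for generation in range(max_generations + 1):
        if p > 0.5:
            return generation
        p = next_frequency(p, s)
    return None


if __name__ == "__main__":
    # A variant starting at 1% of the population with a 1% advantage.
    print(generations_to_majority(p0=0.01, s=0.01))
```

In this sketch a variant with no advantage (`s = 0`) keeps its initial frequency, while any positive `s`, however small, raises its frequency every generation. This is one way of seeing why slight advantages can, over long stretches of time, produce lasting change in a lineage.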

Before turning to the argument itself, it is worth offering some context: what were the understandings of the distribution and diversity of life that were current in the scientific community of the day? Two issues here are particularly representative. First, the question of ‘species.’ What exactly was the concept of species to which Darwin was responding? As John Wilkins (2009) has argued, perhaps the most common anecdotal view – that prior to Darwin, everyone believed that species were immutable categories handed down by God – is simply not supported by the historical evidence. A variety of complex notions of species were in play in Darwin’s day, and the difficulty of interpretation here is compounded by the fact that Darwin’s own notion of species is far from clear in his works (there is debate, for example, concerning whether Darwin believed species categories were merely an epistemic convenience or an objective fact about the natural world). In short, Darwin was not as radical on this score as he is sometimes made out to be, in part because there was less theoretical consensus around the question of species than we often believe.

Second, there is the question of ‘gradualism.’ As we have seen, Darwin was heavily influenced by the geologist Charles Lyell, whose Principles of Geology argued for a gradualist picture of geological change (see Herbert 2005 on Darwin’s connections and contributions to geology). Rather than a history of “catastrophes” (Rudwick 1997), where major upheavals are taken to have shaped the geological features we see around us, Lyell argued for the contrary, “uniformitarian” view, on which the same geological causes that we see in action today (like erosion, earthquakes, tidal forces, and volcanic activity), extended over a much longer history of the Earth, could produce all of today’s observed phenomena. Lyell, however, had no interest in evolution. For him, species needed a different causal story: “centers of creation,” where the divine creative power was in the process of building new species, would counterbalance extinctions caused by steady change in the distribution of environmental and climatic conditions across the globe. It is easy to see, however, how Darwin’s own view of evolution by the gradual accumulation of favorable variations could fit naturally into a Lyellian picture of geological and environmental change. Darwin is, in many ways, a product of his time.

b. The Origin of Species

The Origin begins, then, with an analogy between artificial selection – as practiced by agricultural breeders, horticulturalists, or, Darwin’s favorite example, keepers of “fancy” pigeons – and natural selection. Consider for a moment how exactly artificial selection produces new varieties. We have an idea in mind for a new variation that would be aesthetically pleasing or practically useful. Well-trained observers watch for offspring that are born with characteristics that tend in this direction, and those organisms are then bred or crossed. The process repeats and – especially in the nineteenth century, when much work was ongoing to standardize or regularize commercially viable agricultural stocks – modifications can be realized in short order. Of course, this kind of breeding requires the active intervention of an intellect to select the organisms involved, and to plan for the “target” in mind. But this need not be the case. The goal could easily be removed; Darwin has us imagine cases where a simple inclination to keep one’s “best” animals safe during storms or other periods of danger could similarly create selective breeding of this sort, though now with an “unconscious” goal. Furthermore, Darwin will argue, the “selector” can also be done away with.

The next step in the analogy, then, is to demonstrate how such selection could be happening in the natural world. Organisms in nature do seem to vary just as our domestic plants and animals do, he argues – appearances to the contrary are likely just consequences of the fact that the kind of extreme attention to variation in characteristics that an animal breeder gives to their products is absent for wild populations. In just the same way that a breeder will ruthlessly cull any organisms that do not present desirable characters, organisms in the natural world are locked in a brutal struggle for existence. Far more organisms are born than can possibly survive, leading to a kind of Malthusian competition among conspecific organisms, and, in a variety of situations, struggles against the environment itself (heat, cold, drought, and so on) are also severe. Thus, all of the ingredients are there for the analogy to go through: the generation of variation, the relevance of that variation for survival, and the possibility for this process of selection to create adaptation and diversification.

Natural selection, then, because it can work not only on the kinds of visible characters that are of concern to the horticulturalist or animal breeder, but also on the internal construction of organisms, and because it selects for general criteria of success, not limited human goals, will be able to produce adaptations entirely beyond the reach of artificial selection. The result, Darwin writes, “is as immeasurably superior to man’s feeble efforts, as the works of Nature are to those of Art” (Origin, p. 61).

How exactly should we understand this analogy? What kind of evidential or logical support does Darwin think it brings to the process of natural selection? Analogical arguments were increasingly popular throughout the nineteenth century. In part, this may be traced back to Aristotelian and other Greek uses of analogy, which would have been familiar to Darwin and his peers. The role of analogy in the formulation of causal explanations in science had also been emphasized by authors like Herschel and Mill, who argued that one step in proposing a novel causal explanation was the demonstration of an analogy between its mode of action and other kinds of causes we already know to be present in nature.

Darwin then turns to a discussion of an array of objections that he knew would have already occurred to his contemporary readers. For instance: If species arose through gradual transitions, why are they now sharply distinguished from one another? Darwin replies that specialization and division of labor would produce increased opportunities for success and would thus tend to drive intermediate forms to extinction. How could natural selection possibly have created organs like the eyes of an eagle, whose extreme level of perfection had indicated to authors like Paley the signature of design? With enough time, he answers, if the intervening steps along the way were still useful to the organisms that possessed them, even such organs could be produced by a gradual process of selection. Darwin also considers the appearance of instincts, with the aim of demonstrating that natural selection could influence mental processes, and the supposed infertility of hybrids, which could be seen as a problem for the accumulation of variation by crossing.

Next comes a discussion of the imperfection of the geological record. The relative rarity, Darwin argues, of the conditions required for fossilization, along with our incomplete knowledge of the fossils that are present even in well-explored regions like Europe and North America, explains our ignorance of the complete set of transitional forms connecting ancestral species with the organisms alive today. This, then, serves as a segue to a collection of diverse, positive arguments for evolution by natural selection at the end of the volume, often likened to a Whewell-inspired “consilience of inductions” (a demonstration that a number of independent phenomena, not considered when the theory was first proposed, all serve as evidence for it). A number of facts about the distribution of fossils make more sense on an evolutionary picture, Darwin argues. Extinction is given a natural explanation as an outcome of competition, and the relations between extinct groups seem to follow the same kinds of patterns that natural selection successfully predicts to exist among living species.

This final “consilience” portion of the book continues by discussing geographical distribution. Rather than appearing as though they were specifically created for their environments, Darwin notes, the flora and fauna of tropical islands are closely affiliated with the species living on the nearest major continent. This indicates that normal means of dispersal (floating, being carried by birds, and so on), along with steady evolution by natural selection, offer a solid explanation for these distributional facts. Similarly, the Linnaean, tree-like structure of larger groups containing smaller groups which relates all extant species can be explained by common ancestry followed by selective divergence, rather than simply being taken to be a brute fact about the natural world. Brief discussions of morphology, embryology, and rudimentary organs close this section, followed by a summary conclusion.

Darwin’s argument for evolution by natural selection is thus a unique one. It combines a number of relatively different ingredients: an analogy with artificial selection, several direct rebuttals of potential counterarguments, and novel evolutionary explanations for a variety of phenomena that are taken to be improvements on the consensus at that time. The ways in which these arguments relate to one another and to the evidential base for natural selection are sometimes made explicit, but sometimes left as exercises for the reader. Darwin’s critics saw in this unorthodox structure an avenue for attack (about which more in section 6).

The character of Darwin’s argument has thus remained an interpretive challenge for philosophers of science. One can recognize in the elements from which the argument is constructed the influence of particular approaches to scientific reasoning – for instance, Herschel’s understanding of the vera causa tradition, Comte’s positivism, or Whewell’s development of the consilience of inductions. These clues can help us to construct an understanding of Darwin’s strategy as being in dialogue with the philosophy of his day. How to spell this out in detail, however, is challenging, especially because Darwin was himself no philosopher, and it can thus be difficult to determine to what extent he was really engaging with the details of any one philosopher’s work.

In a different vein, we can also use the Origin as a test case for a variety of contemporary pictures of scientific theory change. To take just one example, Darwin seems at times to offer an explicit argument in support of the epistemic virtues embodied by his theory. In particular, he directly considers the likely fertility of an evolutionary approach, arguing that future biological research in an evolutionary vein will be able to tackle a whole host of new problems that are inaccessible on a picture of special creation.

Similarly, evolutionary theory can serve as a test case for our understanding of scientific explanation in the context of historical sciences. Darwin’s argument relies crucially upon the ability to generalize from a local, short-term explanation (of, for instance, the creation of a new kind of pigeon by the accumulation of variations in a particular direction) to a long-term explanation of a broad trend in the history of life (like the evolution of flight). Darwin’s twin reliance on this sense of “deep time” and on explanations that often involve not the description of a specific causal pathway (one that Darwin could not have possibly known in the mid-nineteenth century) but a narrative establishing the plausibility of an evolutionary account for a phenomenon has since been recognized to be at the heart of a variety of scientific fields (Currie 2018).

4. Evolution, Humans, and Morality

Throughout the Origin, Darwin assiduously avoids discussion of the impact of evolutionary theory on humans. In a brief aside near the end of the conclusion, he writes only that “light will be thrown on the origin of man and his history” (Origin, p. 488). Of course, no reader could fail to notice that an evolutionary account of all other organisms, along with a unified mechanism for evolution across the tree of life, implies a new account of human origins as well. Caricatures of Darwin depicted as a monkey greeted the theory immediately upon its publication, and Darwin – whose notebooks and correspondence show us that he had always believed that human evolution was one of the most pressing questions for his theory to consider, even if it was absent from the Origin – finally tackled the question head-on when he published the two-volume Descent of Man, and Selection in Relation to Sex in 1871.

a. The Question of Human Evolution

It is important to see what Darwin’s explanatory goals were in writing the Descent. In the years since publishing the Origin (which was, at this point, already in its fifth edition, and had been substantially revised as he engaged with various critics), Darwin had remained convinced that his account of evolution and selection was largely correct. He had published further volumes on variation in domesticated products and the fertilization of orchids, which he took to secure even further his case for the presence of sufficient variation in nature for natural selection to produce adaptations. What, then, was left to describe with respect to human beings? What made human beings special?

It should be emphasized that humans did not merit an exception to Darwin’s gradualist, continuous picture of life on earth. There is no drastic difference in kind – even with respect to emotions, communication, intellect, or morality – that he thinks separates human beings from the other animals. The Descent is not, therefore, in the business of advancing an argument for some special distinguishing feature in human nature.

On the contrary, it is this very gradualism that Darwin believes requires a defense. Opposition to his argument for continuity between humans and the other animals came from at least two directions. On the one hand, religious objections were relatively strong. Any picture of continuity between humans and animals would, for many theologians, have to take the human soul into account. Constructing an account of this supposedly distinctive feature of human beings which could be incorporated into a narrative of human evolution was certainly possible – many authors did precisely this (see Livingstone 2014) – but would require significant work (see more on religious responses to Darwin in section 6.b).

On the other hand, and more problematic from Darwin’s perspective, was scientific opposition, perhaps best represented by Alfred Russel Wallace, who argued that the development of human mental capacity had given us the ability to exempt ourselves from natural selection’s impact on our anatomy entirely (on the Darwin-Wallace connection, see Costa 2014). This special place for human reason did not sit well with Darwin, who thought that natural selection would act no differently in the human case. (Wallace would go on to become a spiritualist, a bridge too far for Darwin; the men rarely communicated afterward.)

Further, as has been extensively, if provocatively, maintained by Desmond and Moore (2009), Darwin recognized the moral stakes of the question. The debate over the origins of human races was raging during this period, dividing those who believed that all human beings were members of a single species (monogenists) and those who argued that human races were in fact different species (polygenists). Darwin came from an abolitionist, anti-slavery family (his wife’s grandfather, the founder of the Wedgwood pottery works, famously produced a series of “Am I Not a Man and a Brother?” cameos, which became an icon of the British and American anti-slavery movements). He had seen first-hand the impact of slavery in South America during the Beagle voyage and was horrified. Desmond and Moore’s broader argument, that Darwin’s entire approach to evolution (in particular, his emphasis on common ancestry) was molded by these experiences, has received harsh criticism. But the more limited claim that Darwin was motivated at least to some extent by the ethical significance of an evolutionary account of human beings is inarguable.

b. The Descent of Man

The Descent therefore begins with a demonstration of the similarity between the physical and mental characteristics of humans and other animals. Darwin notes the many physical homologies (that is, parts that derive from the same part in a common ancestor) between humans and animals – including a number of features of adults, our processes of embryological development, and the presence of rudimentary organs that seem to be useful for other, non-human modes of life. When Darwin turns to the intellect, he notes that, of course, even when we compare “one of the lowest savages” to “one of the higher apes,” there is an “enormous” difference in mental capacity (Descent, p. 1:34). Nonetheless, he contends once again that there is no difference in kind between humans and animals. Whatever mental capabilities we consider (such as instincts, emotions, learning, tool use, or aesthetics), we are able to find some sort of analogy in animals. The mixture of love, fear, and reverence that a dog shows for his master, Darwin speculates, might be analogous with humans’ belief in God (Descent, p. 1:68). As regards the emotions in particular, Darwin would return to this subject a year later in his work The Expression of the Emotions in Man and Animals, a full treatise concerning emotional displays in animals and their similarities with those in humans.

Of course, demonstrating that it is possible for these faculties to be connected by analogy with those in animals is not the same thing as demonstrating how such faculties might have evolved for the first time in human ancestors who lacked them. That is Darwin’s next goal, and it merits consideration in some detail.

For Darwin, the evolution of higher intellectual capacities is intimately connected with the evolution of social life and the moral sense (Descent, pp. 1:70–74). We begin with the “social instincts,” which primarily consist of sympathy and reciprocal altruism (providing aid to fellow organisms in the hope of receiving the same in the future). These would do a tolerably good job of knitting together a sort of pre-society, though obviously they would extend only to the members of one’s own “tribe” or “group.” Social instincts, in turn, would give rise to a feeling of self-satisfaction or dissatisfaction with one’s behavior, insofar as it aligned or failed to align with those feelings of sympathy. The addition of communication or language to the mix allows for social consensus to develop, along with the clear expression of public opinion. All these influences, then, could be intensified as they became habits, giving our ancestors an increasingly intuitive feeling for the conformity of their behavior with these emerging social norms.

In short, what we have just described is the evolution of a moral sense. From a basic kind of instinctive sympathy, we move all the way to a habitual, linguistically encoded sense of praise or blame, an instinctive sentiment that one’s actions should or should not have been done, a feeling for right and wrong. Darwin hastens to add that this evolutionary story does not prescribe the content of any such morality. That content will emerge from the conditions of life of the group or tribe in which this process unfolds, in response to whatever encourages or discourages the survival and success of that group. Carrying the point to the extreme, Darwin writes that if people “were reared under precisely the same conditions as hive-bees, there can hardly be a doubt that our unmarried females would, like the worker-bees, think it a sacred duty to kill their brothers, and mothers would strive to kill their fertile daughters; and no one would think of interfering” (Descent, p. 1:73).

There is thus no derivation here of any particular principle of normative ethics – rather, Darwin wants to tell us a story on which it is possible, consistent with evolution, for human beings to have cobbled together a moral sense out of the kinds of ingredients which natural selection can easily afford us. He does argue, however, that there is no reason for us not to steadily expand the scope of our moral reasoning. As early civilizations are built, tribes become cities, which in turn become nations, and with them an incentive to extend our moral sympathy to people whom we do not know and have not met. “This point being once reached,” Darwin writes, “there is only an artificial barrier to prevent his sympathies extending to the men of all nations and races” (Descent, pp. 1:100–101).

We still, however, have not considered the precise evolutionary mechanism which could drive the development of such a moral sense. Humans are, Darwin argues, assuredly subject to natural selection. We know that humans vary, sometimes quite significantly, and experience in many cases (especially in the history of our evolution, as we are relatively frail and defenseless) the same kinds of struggles for existence that other animals do. There can be little doubt, then, that some of our features have been formed by natural selection. But the case is less obvious when we turn to mental capacities and the moral sense. In some situations, there will be clear advantages to survival and reproduction acquired by the advancement of some particular mental capacity – for instance, the ability to produce a device for obtaining food or performing well in battle.

The moral sense, however, offers a more complicated case. Darwin recognizes what is sometimes called the problem of biological altruism – that is, it seems likely that selfish individuals who freeload on the courage, bravery, and sacrifice of others will be more successful and leave behind more offspring than those with a more highly developed moral sense. If this is true, how can natural selection have favored the development of altruistic behavior? The correct interpretation of Darwin’s thinking here is a matter of fierce debate in the literature. Darwin’s explanation seems to invoke natural selection operating at the level of groups or tribes. “When two tribes of primeval man, living in the same country, came into competition,” he writes, “if the one tribe included (other circumstances being equal) a greater number of courageous, sympathetic, and faithful members, who were always ready to warn each other of danger, to aid and defend each other, this tribe would without doubt succeed best and conquer the other” (Descent, p. 1:162). This appears to refer to natural selection not in terms of individual organisms competing to leave more offspring, but of groups competing to produce more future groups, a process known as group selection. On the group-selection reading, then, what matters is that the moral sense emerges in a social context. While individually, a selfish member of a group might profit, a selfish tribe will be defeated in the long run by a selfless one, and thus tribes with moral senses will tend to proliferate.

Michael Ruse has, however, argued extensively for a tempering of this intuitive reading. Given that in nearly every other context in which Darwin discusses selection, he focuses on the individual level (even in cases like social insects or packs of wolves, where a group-level reading might be attractive), we should be cautious in ascribing a purely group-level explanation here. Among other considerations, the humans (or hominids) who formed such tribes would likely be related to one another, and hence a sort of “kin selection” (the process by which an organism promotes an “extended” version of its own success by helping out organisms that are related to it, and hence an individual-level explanation for apparent group-level phenomena) could be at play.

c. Sexual Selection

Notably, the material described so far has covered only around half of the first volume of the Descent. At this point, Darwin embarks on an examination of sexual selection – across the tree of life, from insects, to birds, to other mammals – that takes up the remaining volume and a half. He does so in order to respond to a unique problem that human beings pose. There is wide diversity in human morphology; different human races and populations look quite different. That said, this diversity seems not to arise as a result of the direct impact of the environment (as similar-looking humans have lived for long periods in radically different environments). It also seems not to be the sort of thing that can be explained by natural selection: there is nothing apparently adaptive about the different appearances of different human groups. How, then, could these differences have evolved?

Darwin answers this question by appealing to sexual selection (see Richards 2017). In just the same way that organisms must compete with others for survival, they must also compete when attracting and retaining mates. If the “standards of beauty” of a given species were to favor some particular characteristic for mating, this could produce change that was non-selective, or which even ran counter to natural selection. The classic example here is the tail of the peacock: even if the tail imposes a penalty in terms of the peacock’s ability to escape from predators, if having an elaborate tail is the only way in which to attract mates and hence to have offspring, the “selection” performed by peahens will become a vital part of their evolutionary story. A variety of non-selective differences in humans, then, could be described in terms of socially developed aesthetic preferences.

This explanation, too, has been the target of extensive debate. It is unclear whether or not sexual selection is a process that is genuinely distinct from natural selection – after all, if natural selection is intended to include aptitude for survival and reproduction, then it seems as though sexual selection is only a subset of natural selection. Further, the vast majority of Darwin’s examples of sexual selection in action involve traditional, nineteenth-century gender roles, with an emphasis on violent, aggressive males who compete for coy, choosy females. Can the theory be freed of these now outmoded assumptions, or should explanations that invoke sexual selection instead be discarded in favor of novel approaches that take more seriously the insights of contemporary theories of gender and sexuality (see, for instance, Roughgarden 2004)?

5. Design, Teleology, and Progress

Pre-Darwinian concepts of the character of life on earth shared a number of what we might call broad-scale or structural commitments. Features like the design of organismic traits, the use of teleological explanations, or an overarching sense of progress stood out as needing explanation in any biological theory. Many of these would be challenged by an evolutionary view. Darwin was aware of such implications of his work, though they are often addressed only partially or haphazardly in his most widely read books.

a. Design: The Darwin-Gray Correspondence

One aspect of selective explanations has posed a challenge for generations of students of evolutionary theory. The production of variations, as Darwin himself emphasized, is a random process. While he held out hope that we would someday come to understand some of the causal sequences in greater detail (as we indeed now do), in the aggregate it is “mere chance” that “might cause one variety to differ in some character from its parents” (Origin, p. 111). On the other hand, natural selection is a highly non-random process, which generates features that seem to us to be highly refined products of design.

Darwin, of course, recognized this tension, and discussed it at some length – only he did not do so, in general, in the context of his published works. It is his correspondence with the American botanist Asa Gray which casts the most light on Darwin’s thought on the matter (for an insightful recounting of the details, see Lennox 2010). Gray was what we might today call a committed “theistic evolutionist” – he believed that Darwin’s theory might be largely right in the details but hoped to preserve a role for a master plan, a divinely inspired design lying behind the agency of natural selection (which would on this view have been instituted by God as a secondary cause). Many theists since Newton had argued that God might have instituted the law of gravity as a way to govern a harmonious balance in the cosmos; in the same way, Gray wondered whether Darwin might have discovered the way in which the pre-ordained, harmonious balance in the living world was governed.

However, this would require a place for the “guidance” of design to enter, and Gray thought that variation was where it might happen. If, rather than being purely random, variations were guided, directed toward certain future benefits or a grand design, we might be able to preserve divine influence over the evolutionary process. Such a view is entirely consistent with what Darwin had written in the Origin. He often spoke of natural selection in precisely the “secondary cause” sense noted above (and selected two quotes for the Origin’s frontispiece that supported precisely this interpretation), and he stated clearly that what he really meant in calling variation “random” was that we were entirely ignorant of its causes. Could not this open a space for divinely directed evolution?

Darwin was not sure. His primary response to Gray’s questioning was confusion. He wrote to Gray that “in truth I am myself quite conscious that my mind is in a simple muddle about ‘designed laws’ & ‘undesigned consequences.’ — Does not Kant say that there are several subjects on which directly opposite conclusions can be proved true?!” (Darwin to Gray, July 1860, in Lennox 2010, p. 464). Darwin’s natural-historical observations seem to show him that nature is a disorderly, violent, dangerous place, not exactly one compatible with the kind of design that his British Anglican upbringing had led him to expect.

Another source is worthy of note. In his 1868 The Variation of Animals and Plants under Domestication, Darwin asks us to consider the example of a pile of stones that has accumulated at the base of a cliff. Even though we might call them “accidental,” the precise shapes of the stones in the pile are the result of a series of geological facts and physical laws. Now imagine that someone builds a building from the stones in the pile, without reshaping them further. Should we infer that the stones were there for the sake of the building thus erected? Darwin thinks not. “An omniscient Creator,” he writes, “must have foreseen every consequence which results from the laws imposed by Him. But can it be reasonably maintained that the Creator intentionally ordered, if we use the words in any ordinary sense, that certain fragments of rock should assume certain shapes so that the builder might erect his edifice?” (Variation, p. 2:431). Variation, Darwin claims, should be understood in much the same way. There is no sense, divine or otherwise, in which the laws generating variation are put in place for the sake of some single character in some particular organism. In this sense, evolution is a chancy (and hence undesigned) process for Darwin.

b. Was Darwin a Teleologist?

A related question concerns the role of teleological explanation in a Darwinian world. Darwin is often given credit (for example, by Engels) for having eliminated the last vestiges of teleology from nature. A teleological account of hearts, for instance, takes as a given that hearts are there in order to pump blood, and derives from this fact explanations of their features, their function and dysfunction, and so on. (See the discussion of final causes in the entry on Aristotle’s biology.) From the perspective of nineteenth-century, post-Newtonian science, however, such a teleological explanation seems to run contrary to the direction of causation. How could the fact that a heart would go on to pump blood in the future explain facts about its development now or its evolution in the past? Any such explanation would have to appeal either to a divine design (which Darwin doubted), or to some kind of vitalist force or idealist structure preexisting in the world.

A truly “Darwinian” replacement for such teleology, it is argued, reduces any apparent appeals to “ends” or “final causes” to structures of efficient causation, phrased perhaps in terms of the selective advantage that would be conferred by the feature at issue, or a physical or chemical process that might maintain the given feature over time. The presence of these structures of efficient causation could then be explained by describing their evolutionary histories. In this way, situations that might have seemed to call for teleological explanation are made intelligible without any appeal to final causes.

This does seem to be the position on teleology that was staked out by Darwin’s intellectual descendants in mid-twentieth century biology (such as Ernst Mayr). But is this Darwin’s view? It is not clear. A compelling line of argumentation (pursued by philosophers like James Lennox and David Depew) notes the presence of a suspiciously teleological sort of explanation that runs throughout Darwin’s work. For Darwin, natural selection causes adaptations. But the fact that an adaptation is adaptive also often forms part of an explanation for its eventual spread in the population. There is thus a sense in which adaptations come to exist precisely because they have the effect of improving the survival and reproduction of the organisms that bear them. There is no mistaking this as a teleological explanation – just as we explained hearts by their effect of pumping blood, here we are explaining adaptations by the effects they have on future survival and reproduction.

There are thus two questions to be disentangled here, neither of which has a consensus response in the contemporary literature. First, did Darwin actually advocate for this kind of explanation, or are these merely turns of phrase that he had inherited from his teachers in natural history and to which we should give little actual weight? Put differently, did Darwin banish teleology from biology or demonstrate once and for all the way in which teleology could be made compatible with an otherwise mechanistic understanding of the living world? Second, does contemporary biology give us reasons to reject these kinds of explanations today, or should we rehabilitate a revised notion of teleology in the evolutionary context (for the latter perspective, see, for instance, Walsh 2016)?

c. Is Natural Selection Progressive?

The observation of “progress” across the history of life is a reasonably intuitive one: by comparison to life’s first billion years, which exclusively featured single-celled, water-dwelling organisms, we are now surrounded by a bewildering diversity of living forms. This assessment is echoed in the history of philosophy by way of the scala naturae, the “great chain of being” containing all living things, ordered by complexity (with humans, or perhaps angels, at the top of the scale).

This view is difficult to reconcile with an evolutionary perspective. In short, the problem is that evolution does not proceed in a single direction. The bacteria of today have been evolving to solve certain kinds of environmental problems for just as long, and with just as much success, as human beings and our ancestors have been evolving to solve a very different set of environmental challenges. Any “progress” in evolution will thus be progress in a certain, unusual sense of “complexity.” In the context of contemporary biology, however, it is widely recognized that any one such ordering for all of life is extremely difficult to support. A number of different general definitions of “complexity” have been proposed, and none meets with universal acceptance.

Darwin acknowledged this problem himself. Sometimes he rejected the idea of progress in general. “It is absurd,” he wrote in a notebook in 1837 (B 74), “to talk of one animal being higher than another.” “Never speak of higher and lower,” he wrote as a marginal note in his copy of Robert Chambers’s extremely progressivist Vestiges of the Natural History of Creation. Other times, he was more nuanced. As he had written at the beginning of notebook B, among his earliest evolutionary thoughts: “Each species changes. [D]oes it progress? […] [T]he simplest cannot help – becoming more complicated; & if we look to first origin there must be progress.” When life first begins, there is an essentially necessary increase in complexity (a point emphasized in the contemporary context by authors like Stephen Jay Gould and Daniel McShea), as no organism can be “less complex” than some minimal threshold required to sustain life. Is this “progress”? Perhaps, but only of a very limited sort.

These quotes paint a picture of Darwin as a fairly revolutionary thinker about progress. Progress in general cannot be interpreted in an evolutionary frame; we must restrict ourselves to thinking about evolutionary complexity; this complexity would have been essentially guaranteed to increase in the early years of life on earth. Adaptation refines organismic characteristics within particular environments, but not with respect to any kind of objective, global, or transcendental standard. If this were all Darwin had said, he could be interpreted essentially as consistent with today’s philosophical reflections on the question of progress.

But this is clearly not the whole story. Darwin also seemed to think that this restricted notion of progress as increase in complexity and relative adaptation was related to, if not even equivalent to, progress in the classical sense – and that such progress was in fact guaranteed by natural selection. “And as natural selection works solely by and for the good of each being,” he wrote near the end of the Origin, “all corporeal and mental endowments will tend to progress toward perfection” (Origin, p. 489). The best way to interpret this trend within Darwin’s writing is also a matter of some debate. We might think that Darwin is here doing his best to extract from natural selection some semblance (even if relativized to the local contexts of adaptation to a given environment) of the notion of progress that was so culturally important in Victorian Britain. Or, we might argue, with Robert Richards, that natural selection has thus retained a sort of moral, progressive force for Darwin, a force that might have been borrowed from the ideas of progress present within the German Romantic tradition.

6. The Reception of Darwin’s Work

Darwin’s work was almost immediately recognized as heralding a massive shift in the biological sciences. He quickly developed a group of colleagues who worked to elaborate and defend his theory in the British and American scientific establishment of the day. He also, perhaps unsurprisingly, developed a host of critics. First, let us consider Darwin’s scientific detractors.

a. Scientific Reception

Two facts about the Origin were frequent targets of early scientific critique. First, despite being a work on the origin of species, Darwin never clearly defines what he means by ‘species.’ Second, and more problematically, Darwin attempts to treat the generation and distribution of variations as a black box. One of the goals of the analogy between artificial and natural selection (and Darwin’s later writing of the Variation) is to argue that variation is simply a brute fact about the natural world: whenever a potential adaptation could allow an organism to advantageously respond to a given selective pressure or environmental change, Darwin is confident that the relevant variations could at least potentially arise within the population at issue.

However, as a number of his critics noted (including, for instance, J. S. Mill), it seems to be this process of the generation of variation that is really responsible for the origin of species. If the variation needed for selection to respond is not available, then evolutionary change simply will not occur. It is thus impossible, these critics argued, to have an account of evolution without a corresponding explanation of the generation of variations – or, at the very least, any such account would be incapable of demonstrating that any particular adaptation could actually have been produced by natural selection.

Another vein of scientific criticism concerned Darwin’s evidence base. The classic inductivism that was part and parcel of much of nineteenth-century British philosophy of science (see section 2.a) seemed not to be satisfied by Darwin’s arguments. Darwin could not point to specific examples of evolution in the wild. He could not describe a detailed historical sequence of transitional forms connecting an ancestral species with a living species. He believed that he could tell portions of those stories, which he took to be sufficient, but this did not satisfy some critics. And he could not describe the discrete series of environmental changes or selection pressures that led to some particular evolutionary trajectory. Of course, these sorts of evidence are available to us today in a variety of cases, but that was of no help in 1859. Darwin was thus accused (for instance, in a scathing review of the Origin by the geologist Adam Sedgwick) of having inverted the proper order of explanation and having therefore proposed a theory without sufficient empirical evidence.

These scientific appraisals led to a period that has been called (not uncontroversially) the “eclipse of Darwinism” (a term coined by Julian Huxley in the mid-twentieth century; see Bowler 1992). It is notable that almost all of them are related to natural selection, not to the question of common ancestry. The vast majority of the scientific establishment quickly came to recognize that Darwin’s arguments for common ancestry and homology were extremely strong. There was thus a span of several decades during which Darwin’s “tree of life” was widely accepted, while his mechanism for that tree’s generation and diversification was not, even by scientific authorities as prestigious as Darwin’s famed defender Thomas Henry Huxley or the early geneticist T. H. Morgan. A host of alternative mechanisms were proposed, from neo-Lamarckian proposals of an inherent drive to improvement, to saltationist theories that proposed that variation proceeded not by gradual steps, but by large jumps between different forms. It was only with the integration of Mendelian genetics and the theory of evolution in the “Modern Synthesis” (developed in the 1920s and 1930s) that this controversy was finally laid to rest (see, for instance, Provine 1971).

b. Social and Religious Reception

The religious response to Darwin’s work is a complex subject, and was shaped by theological disputes of the day, local traditions of interaction (or lack thereof) with science, and questions of personal character and persuasion (see Livingstone 2014). Some religious authors were readily able to develop a version of natural selection that integrated human evolution into their picture of the world, making space for enough divine influence to allow for the special creation of humans, or at least for human souls. Others raised precisely the same kinds of objections to Darwin’s philosophy of science that we saw above, as they, too, had learned a sort of Baconian image of scientific methodology which they believed Darwin violated. But acceptance or rejection of Darwin’s theory was by no means entirely determined by religious affiliation. A number of figures in the Church of England at the time (an institution that was in the middle of its own crisis of modernization and liberalization) were themselves already quite willing to consider Darwin’s theory, or were even supporters, while a number of Darwin’s harshest critics were no friends to religion (Livingstone 2009).

Simplistic stories about the relationship between evolution and religious belief are thus very likely to be incorrect. The same is true for another classic presentation of religious opposition to Darwin, which is often used to reduce the entire spectrum of nuanced discussion to two interlocutors at a single event: the debate between Bishop Samuel (“Soapy Sam”) Wilberforce and Thomas Henry Huxley, held at the Oxford University Museum on the 30th of June, 1860. Wilberforce famously asked Huxley whether it was through his grandfather’s or grandmother’s side that he had descended from monkeys. As the classic story goes, Huxley calmly laid out the tenets of Darwin’s theory in response, clearly demonstrated the misunderstandings upon which Wilberforce’s question rested, and replied that while he was not ashamed to have descended from monkeys, he would be “ashamed to be connected with a man [Wilberforce] who used his great gifts to obscure the truth.” Huxley retired to thunderous applause, having carried the day.

The only trouble with this account is that it is almost certainly false. There are very few first-hand accounts of what actually took place that day, and many that exist are likely biased toward one side or the other. Huxley’s reputation had much to gain from his position as a staunch defender of science against the Church, and thus a sort of mythologized version of events was spread in the decades that followed the exchange. A number of attendees, however, noted rather blandly that, other than the monkey retort (which he almost certainly did make), Huxley’s remarks were unconvincing and likely of interest only to those who were already committed Darwinians (Livingstone 2009).

The Scopes Trial, another oft-cited “watershed” moment in the relationship between evolutionary theory and the general public, is also more complex than it might first appear. As Adam Shapiro (2013) has persuasively argued, the Scopes Trial was about far more than simple religious opposition to evolutionary theory (though this was certainly an ingredient). Biology had become part of a larger discussion of educational reform and the textbook system, making any hasty conclusions about the relationship between science and religion in this case difficult to support.

In summary, then, caution should be the order of the day whenever we attempt to analyze the relationship between religion and evolutionary theory. Religious institutions, from Darwin’s day to our own, are subject to a wide array of internal and external pressures, and their responses to science are not often made on the basis of a single, clear decision about the theological or scientific merits of some particular theory. This is especially true in Darwin’s case. Darwin’s theory quickly became part of larger social and cultural debates, whether these were about science and education (as in the United States), or, as was true globally, about broader ideological issues such as secularism, scientific or methodological naturalism, and the nature of the power and authority that scientists should wield in contemporary society.

There are few studies concerning the reception of Darwin by the public at large. Perhaps the most incisive remains that by the linguist Alvar Ellegård (1958), though his work only concerns the popular press in Britain for the first thirteen years after the publication of the Origin. The reaction he documents is much what one might have expected: the work itself was mostly ignored until its implications for human evolution and theology became more widely known. Even at that point, natural selection remained for the most part either neglected or rejected, and public reactions were, in general, shaped by preexisting social structures and intellectual or cultural affiliations.

c. Darwin and Philosophy

Philosophers were quick to realize that Darwin’s work could have impacts upon a whole host of philosophical concerns. Particularly quick to respond were Friedrich Nietzsche and William James, both of whom were incorporating evolutionary insights or critiques into their works very shortly after 1859. The number of philosophical questions potentially impacted by an evolutionary approach is far too large to describe here and would quickly become an inventory of contemporary philosophy. A few notable examples will have to suffice (for more, see Smith 2017).

Biological species had, since Aristotle, been regularly taken to be paradigmatic exemplars of essences or natural kinds. Darwin’s demonstration that their properties have been in constant flux throughout the history of life thus serves as an occasion to reexamine our very notions of natural kind and essence, a task that has been taken up by a number of metaphysicians and philosophers of biology. When applied to human beings, this mistrust of essentialism poses questions for the concept of human nature. The same is true for final causes and teleological explanations (see section 5.b), where evolutionarily inspired accounts of function have been used to rethink teleological explanations across philosophy and science.

More broadly, the recognition that human beings are themselves evolved creatures can be interpreted as a call to take much more seriously the biological bases of human cognition and experience in the world. Whether this takes the form of a fully-fledged “neurophilosophy” (to borrow the coinage of Patricia Churchland) or simply the acknowledgement that theories of perception, cognition, rationality, epistemology, ethics, and beyond must be consistent with our evolved origins, it is perhaps here that Darwin’s impact on philosophy could be the most significant.

7. References and Further Reading

a. Primary Sources

  • Nearly all of Darwin’s works, including his published books, articles, and notebooks, are freely available at Darwin Online: <http://darwin-online.org.uk>
  • Darwin’s correspondence is edited, published, and also digitized and made freely available by a project at the University of Cambridge: <https://www.darwinproject.ac.uk/>
  • Darwin, Charles. 1859. On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. 1st ed. London: John Murray.
    • The first edition of Darwin’s Origin is now that most commonly read by scholars, as it presents Darwin’s argument most clearly, without his extensive responses to later critics.
  • Darwin, Charles. 1862. On the Various Contrivances by Which British and Foreign Orchids Are Fertilised by Insects. London: John Murray.
    • The work on orchids offers insight into Darwin’s thought on coadaptation and the role of chance in evolution.
  • Darwin, Charles. 1868. The Variation of Animals and Plants Under Domestication. 1st ed. London: John Murray.
    • A two-volume work concerning the appearance and distribution of variations in domestic products.
  • Darwin, Charles. 1871. The Descent of Man, and Selection in Relation to Sex. 1st ed. London: John Murray.
    • Two-volume treatise on the evolution of humans, intelligence, morality, and sexual selection.
  • Darwin, Charles. 1872. The Expression of the Emotions in Man and Animals. London: John Murray.
    • An argument for continuity in emotional capacity between humans and the higher animals.
  • Barlow, Nora, ed. 1958. The Autobiography of Charles Darwin, 1809–1882. London: Collins.
    • Darwin’s autobiography, while occasionally of dubious historical merit, remains an important source for our understanding of his personal life.

b. Secondary Sources

  • Bowler, Peter J. 1992. The Eclipse of Darwinism: Anti-Darwinian Evolution Theories in the Decades around 1900. Baltimore, MD: Johns Hopkins University Press.
    • Explores the various debates surrounding natural selection and variation in the period from around Darwin’s death until the development of the early Modern Synthesis in the 1920s.
  • Browne, Janet. 1995. Charles Darwin: Voyaging, vol. 1. New York: Alfred A. Knopf.
  • Browne, Janet. 2002. Charles Darwin: The Power of Place, vol. 2. New York: Alfred A. Knopf.
    • The most detailed and highest quality general biography of Darwin, across two volumes loaded with references to published and archival materials.
  • Browne, Janet. 1989. “Botany for Gentlemen: Erasmus Darwin and ‘The Loves of the Plants.’” Isis 80: 593–621.
    • A presentation of the literary and social context of Darwin’s grandfather Erasmus’s poetic work on taxonomy and botany.
  • Costa, James T. 2014. Wallace, Darwin, and the Origin of Species. Cambridge, MA: Harvard University Press.
    • A careful discussion of the long relationship between Wallace and Darwin, ranging from the early proposal of natural selection to Wallace’s later defenses of natural and sexual selection, and forays into spiritualism.
  • Currie, Adrian. 2018. Rock, Bone, and Ruin: An Optimist’s Guide to the Historical Sciences. Cambridge, MA: The MIT Press.
    • An exploration of the conceptual issues posed by scientific explanation in the “historical sciences” (such as evolution, geology, and archaeology), from a contemporary perspective.
  • Desmond, Adrian, and James Moore. 2009. Darwin’s Sacred Cause: How a Hatred of Slavery Shaped Darwin’s Views on Human Evolution. Boston: Houghton Mifflin Harcourt.
    • Provocative biography of Darwin arguing that his development of evolution (in particular, his reliance on common ancestry) was motivated by his anti-slavery attitude and his exposure to the slave trade during the Beagle voyage.
  • Ellegård, Alvar. 1958. Darwin and the General Reader: The Reception of Darwin’s Theory of Evolution in the British Periodical Press, 1859–1872. Chicago: University of Chicago Press.
    • A wide-ranging study of the impact of Darwin’s works in the popular press of his day.
  • Herbert, Sandra. 2005. Charles Darwin, Geologist. Ithaca, NY: Cornell University Press.
    • Thorough presentation of Darwin’s work as a geologist, extremely important to his early career and to his development of the theory of natural selection.
  • Hodge, M. J. S. 2009. “Capitalist Contexts for Darwinian Theory: Land, Finance, Industry and Empire.” Journal of the History of Biology 42 (3): 399–416. https://doi.org/10.1007/s10739-009-9187-y.
    • An incisive discussion of the relationship between Darwin’s thought and the varying economic and social paradigms of nineteenth-century Britain.
  • Hodge, M. J. S., and Gregory Radick, eds. 2009. The Cambridge Companion to Darwin. 2nd ed. Cambridge: Cambridge University Press.
    • A broad, well written, and accessible collection of articles exploring Darwin’s impact across philosophy and science.
  • Lennox, James G. 2010. “The Darwin/Gray Correspondence 1857–1869: An Intelligent Discussion about Chance and Design.” Perspectives on Science 18 (4): 456–79.
    • Masterful survey of the correspondence between Charles Darwin and Asa Gray, a key source for Darwin’s thoughts about the relationship between evolution and design.
  • Livingstone, David N. 2014. Dealing with Darwin: Place, Politics, and Rhetoric in Religious Engagements with Evolution. Baltimore, MD: Johns Hopkins University Press.
    • A discussion of the wide diversity of ways in which Darwin’s religious and theological contemporaries responded to his work, with a focus on the importance of place and local tradition to those responses.
  • Livingstone, David N. 2009. “Myth 17: That Huxley Defeated Wilberforce in Their Debate over Evolution and Religion.” In Numbers, Ronald L., ed., Galileo Goes to Jail: And Other Myths about Science and Religion, pp. 152–160. Cambridge, MA: Harvard University Press.
    • A brief and extremely clear reconstruction of our best historical knowledge surrounding the Huxley/Wilberforce “debate.”
  • Manier, Edward. 1978. The Young Darwin and His Cultural Circle. Dordrecht: D. Reidel Publishing Company.
    • While somewhat dated now, this book still remains a rich resource for the context surrounding Darwin’s intellectual development.
  • Priest, Greg. 2017. “Charles Darwin’s Theory of Moral Sentiments: What Darwin’s Ethics Really Owes to Adam Smith.” Journal of the History of Ideas 78 (4): 571–93.
    • Explores the relationship between Adam Smith’s ethics and Darwin’s, arguing that Darwin did not derive any significant insights from Smith’s economic work.
  • Provine, William B. 1971. The Origins of Theoretical Population Genetics. Princeton, NJ: Princeton University Press.
    • Classic recounting of the historical and philosophical moves in the development of the Modern Synthesis, ranging from Darwin to the works of R. A. Fisher and Sewall Wright.
  • Richards, Evelleen. 2017. Darwin and the Making of Sexual Selection. Chicago: University of Chicago Press.
    • A carefully constructed history of Darwin’s development of sexual selection as it was presented in The Descent of Man, presented with careful and detailed reference to the theory’s social and cultural context.
  • Richards, Robert J., and Michael Ruse. 2016. Debating Darwin. Chicago: University of Chicago Press.
    • A volume constructed as a debate between Richards and Ruse, both excellent scholars of Darwin’s work and diametrically opposed on a variety of topics, from his intellectual influences to the nature of natural selection.
  • Roughgarden, Joan. 2004. Evolution’s Rainbow: Diversity, Gender, and Sexuality in Nature and People. Berkeley, CA: University of California Press.
    • A rethinking of Darwin’s theory of sexual selection for the contemporary context, with an emphasis on the reconstruction of biological explanations in the light of contemporary discussions of gender and sexuality.
  • Rudwick, M. J. S. 1997. Georges Cuvier, Fossil Bones, and Geological Catastrophes. Chicago: University of Chicago Press.
    • Describes the conflict between “uniformitarian” and “catastrophist” positions concerning the geological record in the years just prior to Darwin.
  • Ruse, Michael, and Robert J. Richards, eds. 2009. The Cambridge Companion to the “Origin of Species.” Cambridge: Cambridge University Press.
    • An excellent entry point into some of the more detailed questions surrounding the structure and content of Darwin’s Origin.
  • Shapiro, Adam R. 2013. Trying Biology: The Scopes Trial, Textbooks, and the Antievolution Movement in American Schools. Chicago: University of Chicago Press.
    • Insightful retelling of the place of the Scopes Trial in the American response to evolutionary theory, emphasizing a host of other, non-scientific drivers of anti-evolutionary sentiment.
  • Smith, David Livingstone, ed. 2017. How Biology Shapes Philosophy: New Foundations for Naturalism. Cambridge: Cambridge University Press.
    • This edited volume brings together a variety of perspectives on the ways in which biological insight has influenced and might continue to shape contemporary philosophical discussions.
  • Walsh, Denis M. 2016. Organisms, Agency, and Evolution. Cambridge: Cambridge University Press.
    • Develops a non-standard view of evolution on which teleology and organismic agency are given prominence over neo-Darwinian natural selection and population genetics.
  • Wilkins, John S. 2009. Species: A History of the Idea. Berkeley, CA: University of California Press.
    • A discussion of the history of the concept of species, useful for understanding Darwin’s place with respect to other theorists of his day.

 

Author Information

Charles H. Pence
Email: charles@charlespence.net
Université Catholique de Louvain
Belgium

Kripke’s Wittgenstein

Saul Kripke, in his celebrated book Wittgenstein on Rules and Private Language (1982), offers a novel reading of Ludwig Wittgenstein’s main remarks in his later works, especially in Philosophical Investigations (1953) and, to some extent, in Remarks on the Foundations of Mathematics (1956). Kripke presents Wittgenstein as proposing a skeptical argument against a certain conception of meaning and linguistic understanding, as well as a skeptical solution to such a problem. Many philosophers have called this interpretation of Wittgenstein Kripke’s Wittgenstein or Kripkenstein because, as Kripke himself emphasizes, it is “Wittgenstein’s argument as it struck Kripke, as it presented a problem for him” (Kripke 1982, 5) and “probably many of my formulations and re-castings of the argument are done in a way Wittgenstein would not himself approve” (Kripke 1982, 5). This interpretation has been the subject of extensive discussion since its publication and has generated a large literature on meaning skepticism in general and on Wittgenstein’s later view in particular.

According to the skeptical argument that Kripke extracts from Wittgenstein’s later remarks on meaning and rule-following, there is no fact about a speaker’s behavioral, mental or social life that can metaphysically determine, or constitute, what she means by her words and also fix a determinate connection between those meanings and the correctness of her use of these words. Such a skeptical conclusion has a disastrous consequence for the classical realist view of meaning: if we insist on the idea that meaning is essentially a factual matter, we face the bizarre conclusion that there is thereby “no such thing as meaning anything by any word” (Kripke 1982, 55).

According to the skeptical solution that Kripke attributes to Wittgenstein, such a radical conclusion is intolerable because we certainly do very often mean certain things by our words. The skeptical solution begins by rejecting the view that results in such a paradoxical conclusion, that is, the classical realist conception of meaning. The skeptical solution then offers a new picture of the practice of meaning-attribution, according to which we can legitimately assert that a speaker means something specific by her words if we, as members of a speech-community, can observe, in enough cases, that her use agrees with ours. We can judge, for instance, that she means by “green” what we mean by this word, namely, green, if we observe that her use of “green” agrees with our way of using it. Attributing meanings to others’ words, therefore, brings in the notion of a speech-community, whose members are uniform in their responses. As a result, there can be no private language.

This article begins by introducing Kripke’s Wittgenstein’s skeptical problem presented in Chapter 2 of Kripke’s book. It then explicates Kripke’s Wittgenstein’s skeptical solution to the skeptical problem, which is offered in Chapter 3 of the book. The article ends by reviewing some of the most important responses to the skeptical problem and the skeptical solution.

Table of Contents

  1. Kripke’s Wittgenstein: The Skeptical Challenge
    1. Meaning and Rule-Following
    2. The Skeptical Challenge: The Constitution Demand
    3. The Skeptical Challenge: The Normativity Demand
  2. Kripke’s Wittgenstein: The Skeptical Argument
    1. The Skeptic’s Strategy
    2. Reductionist Facts: The Dispositional View
      1. The Finitude Problem
      2. Systematic Errors
      3. The Normative Feature of Meaning
    3. Non-Reductionist Facts: Meaning as a Primitive State
    4. The Skeptical Conclusions and Classical Realism
  3. Kripke’s Wittgenstein: The Skeptical Solution
    1. Truth-Conditions vs. Assertibility Conditions
    2. The Private Language Argument
  4. Responses and Criticisms
  5. References and Further Reading
    1. References
    2. Further Reading

1. Kripke’s Wittgenstein: The Skeptical Challenge

Wittgenstein famously introduces a paradox in section 201 of the Philosophical Investigations, a paradox that Kripke takes to be the central problem of Wittgenstein’s book:

This was our paradox: no course of action could be determined by a rule, because every course of action can be made out to accord with the rule. The answer was: if everything can be made out to accord with the rule, then it can also be made out to conflict with it. And so there would be neither accord nor conflict here. (Wittgenstein 1953, §201)

Kripke’s book is formed around this paradox, investigating how Wittgenstein arrives at it and how he attempts to defuse it.

The main figure in Chapter 2 of Kripke’s book is a skeptic, Kripke’s Wittgenstein’s skeptic, who offers, on behalf of Wittgenstein, a skeptical argument against a certain sort of explanation of our commonsense notion of meaning. The argument is designed to ultimately lead to the Wittgensteinian paradox. According to this commonsense conception of meaning, we do not just randomly use words; rather, we are almost always confident that we meant something specific by them in the past and that it is because of that meaning that our current and future uses of them are regarded as correct. The sort of explanation of this commonsense conception that the skeptic aims to undermine is called “classical realism” (Kripke 1982, 73) or the “realistic or representational picture of language” (Kripke 1982, 85). According to this explanation, there are facts as to what a speaker meant by her words in the past that determine the correct way of using them in the future. The skeptical argument aims to subvert this explanation by revealing that it leads to the Wittgensteinian paradox. In the next section, this commonsense notion of meaning is outlined.

a. Meaning and Rule-Following

In Chapter 2, Kripke draws our attention to the ordinary way in which we talk about the notion of meaning something by an expression. Since this commonsense notion of meaning appeals to the notion of rule-following, Kripke initially describes the matter by using an arithmetic example, in which the notion of a rule has its clearest appearance, though the problem can be put in terms of the notion of meaning too (as well as that of intention and concept) and potentially applied to all sorts of linguistic expressions.

The commonsense notion of meaning points to a simple insight: in our everyday life, we “do not simply make an unjustified leap in the dark” (Kripke 1982, 10). Rather, we use our words in a certain way unhesitatingly and almost automatically and the reason why we do so seems to have its roots in the following two important aspects of the practice of meaning something by a word: (1) we meant something specific by our words in the past and (2) those past meanings determine the correct way of using these words now and in the future. Putting the matter in terms of rules, the point is that, for every word that we use, we grasped, and have been following, a specific rule governing the use of this word and such a rule determines how the word ought to be applied in the future. Consider the word “plus” or the plus sign “+”. According to the commonsense notion of meaning, our use of this word is determined by a rule, the addition rule, that we have learnt and that tells us how to add two numbers. The addition rule is general in that it indicates how to add and produce the sum of any two numbers, no matter how large they are. The correct answer to any addition problem is already determined by that specific rule.

Since we have learnt or grasped the addition rule in the past and have been following it since then, we are now confident that we ought to respond with “22” to the arithmetic query “12 + 10 =?” and that this answer is the only correct answer to this question. Moreover, although we have applied the addition rule only to a limited number of cases in the past, we are prepared to give the correct answer to any addition query we may be asked in the future. This is, as the skeptic emphasizes, the “whole point of the notion that in learning to add I grasp a rule” (Kripke 1982, 8). This conception of rules can be extended to other expressions of our language. For instance, we can say that there is a rule governing the use of the word “green”: it tells us that “green” applies to certain (that is, green) things only. If we are following this rule, applying “green” to a blue object is incorrect. Again, we have used “green” only in a limited number of cases in the past, but the rule determines how to apply this word on all future occasions.

Having presented such a general picture of meaning and rule-following, the skeptic raises two fundamental questions: (1) what makes it the case that we really meant this rather than that by a word in the past or that we have been following this rather than that rule all along? (2) how can such past meanings and rules be said to determine the correct use of words in all future cases? Each question makes a demand. We can call the first the Constitution Demand and the second the Normativity Demand. Each demand is introduced below, and it is shown how they cooperate to establish a deep skepticism about meaning and rule-following.

b. The Skeptical Challenge: The Constitution Demand

Kripke’s Wittgenstein’s skeptic makes a simple claim by asking the following questions: What if by our words we actually meant something different from what we think we did in the past? What if we have really been following a rule different from what we think we did or never really followed the same rule at all? After all, we have applied the addition rule only to a limited number of cases in the past. Imagine, for example, that all the numbers we have ever encountered in an addition are smaller than 57. As we are certain that we have always been following the addition rule, or meant plus by “plus”, we are confident that “125” is the correct answer to the new addition query “68 + 57 =?”. If the skeptic insists that this answer is wrong, all we can do is check our answer all over again, and the correct answer seems to be nothing but “125”.

The skeptic, however, makes a bizarre claim: the correct answer is “5” not “125”, and this is so not because 125 is not the sum of 57 and 68, but because we have not been following the addition rule at all. At first sight, such a claim seems unacceptable, but the skeptic invites us to assume the following possible scenario. Suppose that there is another rule called quaddition: the quaddition function is symbolized by “⊕” and the rule is defined as follows:

x ⊕ y = x + y, if x, y < 57
           = 5 otherwise. (Kripke 1982, 9)

Perhaps, just imagine, we have been following this rule rather than the addition rule and taken “+” to denote the quaddition rather than the addition function. Maybe, as the scenario goes, we have meant quus rather than plus when using “plus” in the past.

According to the skeptic, the answers we have given so far have all been the quum rather than the sum of numbers. When we were asked to add 57 to 68, we got confused and thought the correct answer is “125”, probably because the sum and the quum of the numbers smaller than 57 are all the same. The correct answer, however, is “5”; we mistakenly thought that we had been following the addition rule, while the rule we actually follow is quaddition. The skeptic’s fundamental question is: “Who is to say that this is not the function I previously meant by ‘+’?” (Kripke 1982, 9).
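
The skeptic’s scenario can be made concrete with a short sketch. The Python below is only an illustration and not part of Kripke’s text: the function names plus and quus, the constant THRESHOLD, and the assumption that all past computations involved natural numbers below 57 are ours. It checks that the two functions agree on every such past case and come apart on the new query “68 + 57 =?”.

```python
# Illustrative sketch only: plus, quus and THRESHOLD are hypothetical names
# chosen for this example; they do not come from Kripke's book.
THRESHOLD = 57


def plus(x: int, y: int) -> int:
    """Ordinary addition."""
    return x + y


def quus(x: int, y: int) -> int:
    """Quaddition: x + y when both arguments are below 57, otherwise 5."""
    if x < THRESHOLD and y < THRESHOLD:
        return x + y
    return 5


# Assumption: every past computation involved only natural numbers below 57.
past_cases = [(x, y) for x in range(THRESHOLD) for y in range(THRESHOLD)]

# On all of those past cases the two functions give the same answer.
agree_on_past = all(plus(x, y) == quus(x, y) for x, y in past_cases)

print(agree_on_past)  # True
print(plus(68, 57))   # 125
print(quus(68, 57))   # 5
```

Nothing in the finite record of past answers favors one function over the other, which is why the skeptic asks for some further fact about the speaker that settles which of the two she meant.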

The skeptic agrees that his claim is radical, “wild it indubitably is, no doubt it is false”, but observes “if it is false, there must be some fact about my past usage that can be cited to refute it” (Kripke 1982, 9). What is required are some facts about ourselves, about what we did in the past, what has gone on in our minds, and the like, that can do two things: (1) satisfy the Constitution Demand, that is, constitute the fact that in the past we meant plus by “plus”, and not anything else like quus, and (2) meet the Normativity Demand, that is, determine the correct way of using the word “plus” now and in the future (see Kripke 1982, 11).

The Constitution Demand reveals the metaphysical nature of the skeptical challenge. To see this, note that the answer “125” to “68 + 57 =?” is correct in two senses: in the arithmetical sense and in the meta-linguistic sense. It is correct in the arithmetical sense because 125 is the sum, and not, for instance, the product of 68 and 57: 125 is the result that we get after following the procedure of adding 68 to 57. Our answer is also correct in the meta-linguistic sense because, given that we mean plus by “plus” or intend “+” to denote the addition function, “125” is the only correct answer to “68 + 57 =?”. Of course, if we intend “+” to denote the quaddition function, “125” would be wrong. These two senses of correctness are distinct, for the addition function, independently of what we think of it, uniquely determines the sum of any two numbers. However, what function we intend “+” to denote is a meta-linguistic matter, that is, a matter of what we mean by our words and symbols.

The above distinction clarifies why the skeptic’s worry is not whether our computation of the sum of 68 and 57 is accurate or whether our memory works properly. Nor is his concern whether, in the case of using “green”, the objects we describe as being green are indeed green. He does not aim to raise an epistemological problem about how we know, or can be sure, that “125” is the correct answer to “57 + 68 =?”. His worry is metaphysical: is there any fact as to what we really meant by “plus”, “+”, “green”, “table” and so forth, in the past? If the skeptic successfully shows that there is no such fact, the question as to whether we accurately remember that meaning or rule would be beside the point: there is simply nothing determinate to remember.

The skeptic’s claim is not that because we may forget what “plus” means or because we may make mistakes in calculating the sum of some two big numbers, we can never be sure that our answer is correct. Of course, we make mistakes: we may neglect things; we may unintentionally apply “green” to a blue object, and so forth. From the fact that we make occasional mistakes it does not follow that there is thereby no fact as to what we mean by our words. On the contrary, it seems that it is because of the fact that we mean plus by “plus” that answering with “5” to “57 + 68 =?” is considered to be wrong. The same considerations apply to the case of memory failures: we may, for example, forget to carry a digit when calculating the sum of two large numbers. Memory failures and failures of attention do not cast doubt on the fact that we mean addition by “+”. The skeptic takes it for granted that we fully recall what we did in the past, that our memory works perfectly fine, that our eyes function normally and that we can accurately compute the sum of numbers. None of these matters because he has no objection to the fact that if we can show that plus is what we meant by “plus” in the past, “125” is the correct answer to “57 + 68 =?”. In the same vein, however, if he can show that quus is what we meant by “plus”, “5” is the correct answer.

The skeptic’s Constitution Demand asks us to cite some fact about ourselves that can constitute the fact that by “plus” we meant plus rather than quus in the past. He does not care about what such a fact is: “there are no limitations, in particular, no behaviorist limitations, on the facts that may be cited to answer the sceptic” (Kripke 1982, 14). Moreover, if the skeptic succeeds in arguing that there is no fact as to what we meant by our words in the past, he has at the same time shown that there is no fact determining what we mean by our words now or in the future. As he puts it, “if there can be no fact about which particular function I meant in the past, there can be none in the present either” (Kripke 1982, 13). However, he cannot make such a claim in the beginning: if the skeptic undermines the certainty in what the words mean in the present, it seems that he could not even start conversing with anyone, nor formulate his skeptical claims in some language.

c. The Skeptical Challenge: The Normativity Demand

The second aspect of the skeptical challenge is that any fact that we may cite to defuse it must also “show how I am justified in giving the answer ‘125’ to ‘68 + 57’” (Kripke 1982, 11). The Constitution and Normativity Demands are put by the skeptic as two separate but related requirements. The second presupposes the first: if we fail to show that the speaker has meant something specific by her words, it would be absurd to say that those meanings determined how she ought to use the words in the future. It is better to see these two demands as two aspects of the skeptical problem. The connection between them is so deep that it would be hard to sharply distinguish between them as two entirely different demands: if there is no normative constraint on our use of words, we would not be able to justifiably talk about them having any specific meaning at all. If there is no such thing as correct vs. incorrect application of a word, the notion of a word meaning something specific would just vanish. The skeptic’s main point in distinguishing between these demands is to emphasize that telling a story about how meanings are constituted may still fail to offer a convincing story about the normative aspect of meaning. That is, even if we can introduce a fact that is somehow capable of explaining what the speaker meant by her word in the past, this by itself would not suffice to rule out the skeptical problem because any such fact must also justify the fact that the speaker uses her words in the way she does. In other words, it must be explained that our confidence in thinking that “125” is the correct answer to “68 + 57 =?” is based exactly on that fact, and not on anything else. (For a different reading of such a relation between meaning and correct use, see Ginsborg (2011; 2021). See also Miller (2019) for further discussion.)

Moreover, the skeptic uses each demand to offer a different argument against the different sorts of facts that may be introduced to resist the skeptical problem. As regards the Normativity Demand, the argument is based on the requirement that such facts must determine the correctness conditions for the application of words, and that they must do so for a potentially indefinite number of cases in which the words may have an application. This requirement is spelled out by the skeptic’s famous claim that any candidate for a fact that is supposed to constitute what we meant by our words in the past must be normative, not descriptive: it must tell us how we ought to or should use the words, not simply describe how we did, do or will use them. This is also known as the Normativity of Meaning Thesis. The normativity of meaning (and content) is now a self-standing topic. (For some of the main works on this thesis, see Boghossian (1989; 2003; 2008), Coates (1986), Gibbard (1994; 2013), Ginsborg (2018; 2021), Glock (2019), Gluer and Wikforss (2009), Hattiangadi (2006; 2007; 2010), Horwich (1995), Kusch (2006), McGinn (1984), Railton (2006), Whiting (2007; 2013), Wright (1984), and Zalabardo (1997).)

One of the clearest characterizations of the Normativity Demand has been given by Paul Boghossian:

Suppose the expression ‘green’ means green. It follows immediately that the expression ‘green’ applies correctly only to these things (the green ones) and not to those (the non-greens). … My use of it is correct in application to certain objects and not in application to others. (Boghossian 1989, 513)

This definition is neutral to the transtemporal aspect of the relation between meaning and use, contrary to McGinn’s reading of this relation. For McGinn, an account of the normativity of meaning requires an explanation of two things: “(a) an account of what it is to mean something at a given time and (b) an account of what it is to mean the same thing at two different times – since (Kripkean) normativeness is a matter of meaning now what one meant earlier” (McGinn 1984, 174). Kripke’s Wittgenstein’s skeptic, however, seems to view the notion of normativity as a transtemporal notion of a different sort: the normativity of meaning concerns the relation between past meanings and future uses. In this sense, what we meant by our words in the past already determined how we ought to use them in the future.

Yet the matter is more complicated than that. As we saw, the skeptic did not start by questioning the correctness of our current use of words. He asked whether some current use of a word accords with what we think we meant by it in the past: if it does, it is correct. This, however, seems merely a tactical move: the skeptic’s ultimate goal is to undermine the claim that we mean anything by our words now, in the past, or in the future and thus to rule out the idea that our past, current, or future uses of words can be regarded as correct (or incorrect) at all. If so, it is better to think of the skeptic’s conception of the normativity relation as not necessarily temporal. For him, the claim is simply that meaning determines the conditions of correct use. Nonetheless, for the reasons mentioned above, the skeptic often prefers to put the matter in a temporal way: “The relation of meaning and intention to future action is normative, not descriptive” (Kripke 1982, 37). The question then is whether our current use of a word accords with what we meant by it in the past. That a word ought to be applied in a specific way now in order to be applied in accordance with what we meant by it in the past is said to be the normative consequence of the fact that we meant a specific thing by it in the past.

All this, however, is a familiar thesis: what we decided to do in the past often has consequences for what we ought to do in the future. For instance, if you believe or accept the claim that telling lies is wrong, it has consequences for how you ought to act in the future: you should not lie. The skeptic has a similar claim in mind with regard to the notion of meaning: we cannot attach a clear meaning to “table” as used by a speaker if she uses it without any constraint whatsoever, that is, if she applies it to tables now, to chairs a minute later, and then to apples, lions, the sky, and so forth, without there being any regularity and coherence in her use of it. In such cases, it is not clear that we can justifiably say that she means this rather than that by “table” at all. The skeptic’s real question is whether there is any fact about the speaker that constitutes the fact that the speaker means table by “table” in such a way as to determine the correct use of the word in the future. If we are to be able to tell that she means table by “table”, we should also be able to say that her use of “table” is correct now and that it is so because of her meaning table by “table”, and not anything else. The reason is that the relation between meaning and use is prescriptive, not descriptive: if you mean plus rather than quus by “plus”, you ought to answer “68 + 57 =?” with “125”. The normative feature of meaning was already present in the skeptic’s characterization of the commonsense notion of meaning: with each new case of using a word, we are confident as to how we should use it because we are confident as to what we meant by it in the past.

The last step that the skeptic must take in order to complete his argument is to argue that no fact about the speaker can satisfy the two aspects of the notion of meaning, that is, the Constitution Demand and the Normativity Demand. It is not possible to introduce his arguments against each candidate fact in detail here, since in chapter 2 of Kripke’s book, the skeptic examines at least ten candidates for such a fact and argues against each in detail. In what follows, the skeptic’s general strategy in rejecting them is described by focusing on two examples.

2. Kripke’s Wittgenstein: The Skeptical Argument

The skeptic considers a variety of suggestions for facts that someone might cite to meet the skeptic’s challenge, that is, to show that we really mean plus, and not quus or anything else, by “plus”. In particular, the skeptic discusses ten candidate sorts of facts, including: (1) facts about previous external behavior of the speaker; (2) facts concerning the instructions the speaker may have in mind when, for instance, she adds two numbers; (3) some mathematical laws that seemingly work only if “+” denotes the addition rule; (4) the speaker’s possession of a certain mental image in mind when, for instance, she applies “green” to a new object; (5) facts about the speaker’s dispositions to respond in certain ways on specific occasions; (6) facts about the functioning of some machines, such as calculators, as embodying our intentions to add numbers; (7) facts about words having Fregean, objective senses; (8) the fact that meaning plus by “plus” is the simplest hypothesis about what we mean by “plus” and is thus capable of constituting the fact that “plus” means plus; (9) the fact that meaning plus is an irreducible mental state of the speaker with its own unique quale or phenomenal character; and (10) the view that meaning facts are primitive, sui generis.

In order to see how the skeptic argues against each such fact, it is helpful to classify them as falling under two general categories: reductionist facts and non-reductionist facts. The skeptic’s claim will be that neither the reductionist nor the non-reductionist facts can constitute the fact that the speaker means one thing rather than another by her words. The first eight candidate facts mentioned above belong to the reductionist camp: they are facts about different aspects of the speaker’s life, mental and physical. Here, the opponent’s claim is that such facts are capable of successfully constituting the fact that the speaker means plus by “plus”. The two last suggestions are from the non-reductionist camp, attempting to view the fact that the speaker means one thing rather than another by a word as a self-standing fact, not reducible to any other fact about the speaker’s behavioral or mental life. The skeptic’s strategy is to argue that both reductionist and non-reductionist facts fail to meet the Constitution and the Normativity Demands.

a. The Skeptic’s Strategy

In the case of the reductionist facts, the skeptic’s strategy is to show that any aspect of the speaker’s physical or mental life that may be claimed to be capable of constituting a determinate meaning fact or rule can be interpreted in a non-standard way, that is, in such a way that it can equally well be treated as constituting a different possible meaning fact or rule. Any attempts to dodge such deviant interpretations, however, face a highly problematic dilemma: either we have to appeal to some other aspect of the speaker’s life in order to eliminate the possibility of deviant interpretations and thereby fix the desired meaning or rule, in which case we will be trapped in a vicious regress, or we have to stop at some point and claim that this aspect, whatever it is, cannot be interpreted non-standardly anymore and is somehow immune to the regress problem, in which case meaning would become entirely indeterminate or totally mysterious. For the skeptic’s question is now “why is it that such a fact or rule cannot be interpreted in a different way?” and since the whole point of the skeptical argument is to show there is no answer to this question, it seems that we cannot really answer it, except if we already have a solution to the skeptical problem. 
If we do not, the only options available seem to be the following: (1) either we concede that there is no answer to this question, but then the indeterminacy of meaning and the Wittgensteinian paradox are waiting for us because we have embraced the claim that there is nothing on the basis of which we can determine whether our use of a word accords, or not, with a rule; our use is then both correct and incorrect at the same time; (2) or we decide to ignore this question, but we have then made the ordinary notion of meaning and rules entirely mysterious: we have appealed to a “superlative” fact or rule, which is allegedly capable of constituting the fact that the speaker means plus by “plus” but which is, in a mysterious way, immune to the skeptical problem.

In the case of the non-reductionist responses, the skeptic’s strategy is a bit different: his focus is on showing that we cannot make the nature of such primitive meaning facts intelligible, so that not only would they become mysterious, but we also have to deal with very serious epistemological problems about our first-personal epistemic access to their general content.

The next section further explains these problems by considering some examples.

b. Reductionist Facts: The Dispositional View

The most serious reductionist responses to Kripke’s Wittgenstein’s skeptic are the following: (1) the claim that facts about what the speaker meant by her words in the past are constituted by the speaker’s dispositions to respond in a certain way on specific occasions—this is the response from the dispositional view or dispositionalism; (2) the suggestion that there are some instructions in the mind of the speaker, some mental images, samples, ideas, and the like and that facts about having them constitute the fact that the speaker means, for instance, green by “green”.

According to the dispositional view, what a speaker means by her word can be extracted or read off from the responses she is disposed to produce. As the skeptic characterizes it:

To mean addition by ‘+’ is to be disposed, when asked for any sum ‘x + y’, to give the sum of x and y as the answer (in particular, to say ‘125’ when queried about ‘68 + 57’); to mean quus is to be disposed when queried about any arguments, to respond with their quum (in particular to answer ‘5’ when queried about ‘68 + 57’). (Kripke 1982, 22-23)

What dispositions are and what characteristics they have is a self-standing topic. It is helpful, however, to consider a typical example. A glass is said to have the property of being fragile: it shatters if struck by a rock. A glass, in other words, is disposed to shatter when hit by a rock or dropped. However, it is one thing to possess a disposition, another to manifest it. For instance, although a glass is disposed to shatter, and glasses shatter very often around us, one particular glass may never actually shatter or may decay before finding any chance to manifest this disposition. Since the objects that are said to have such-and-such dispositions may never manifest them, we usually characterize their dispositional properties, or ascribe such dispositions to them, in a counterfactual way:

Glasses are disposed to shatter under certain conditions if and only if glasses would shatter if those conditions held.

These certain, normal, optimal, ideal, or standard conditions, as they are sometimes called, are supposed to exclude the conditions under which glasses may fail to manifest their disposition to shatter. There are various problems with how such conditions can be properly specified, which are not our concern here. (On dispositions, see Armstrong (1997), Bird (1998), Carnap (1928), Goodman (1973), Lewis (1997), Mellor (2000), Mumford (1998), Prior (1985), and Sellars (1958).)

Humans too can be said to possess different dispositions, which manifest themselves under certain circumstances. For instance, a child observes her parents pointing to a certain thing and saying “table”; they encourage the child to mouth “table” in the presence of that thing; the child tries to do so and when she is successful, she is rewarded; if she says “table” in the presence of a chair, she is corrected; and the process continues. The child is gradually conditioned to say “table” in the presence of the table. She then generalizes it: in the presence of a new table, she utters “table”. She is now said to be disposed to respond with “table” in the presence of tables, with “green” in the presence of green things, with the sum of two numbers when asked “x + y =?”, and so forth. Call these the “dispositional facts”. According to the dispositional view, such facts are capable of constituting what the speaker means by her words, or as the skeptic prefers to put it, from these dispositions we are supposed to “read off” what the speaker means by her words. For instance, if the speaker is disposed to apply “green” to green objects only, we can read off from such responses that she means green by “green”. Similarly, if she is disposed to apply “green” to green objects until a certain time t (for example, until the year 2100) and to blue objects after time t, we must conclude that she means something else, for instance, grue by “green” (Goodman 1973). Now, as the speaker is disposed to respond with “125” to “68 + 57 =?”, the dispositionalists’ claim is that the speaker means plus by “plus”.

The skeptic makes three objections. The first is that facts about dispositions cannot determine what the speaker means by “plus”; this is to say that the dispositional view fails to meet the Constitution Demand. The problem that the skeptic puts forward in this case is sometimes called the “Finitude Problem” or “Finiteness Problem” (Blackburn (1984a), Boghossian (1989), Ginsborg (2011), Horwich (1990), Soames (1997), and Wright (1984)). The other two objections concern the dispositional view’s success in accommodating the normative aspect of meaning: the dispositional view cannot account for systematic errors as “errors”, and dispositional facts are descriptive in nature, while meaning facts are supposed to be normative. These different problems are, however, related, as the next three sections make clear.

i. The Finitude Problem

According to the skeptic, facts about the speaker’s dispositions to respond in specific ways on certain occasions fail to constitute the fact that the speaker means plus by “plus” because “not only my actual performance, but also the totality of my dispositions, is finite” (Kripke 1982, 26). During our lifetime, we can produce only a limited number of responses. The skeptic now introduces a brand-new skeptical hypothesis: perhaps, the plus sign “+” stands for a function that we can call skaddition. It can be symbolized by “*” and defined as follows (see Kripke 1982, 30):

x * y = x + y, if x and y are small enough for us to have any disposition to add them in our lifetime;

x * y = 5, otherwise.

There are at least two possible meaning facts now, or two different rules, which are compatible with the totality of the responses a speaker can produce in her life: one possible fact is that she means addition by “+” and the other is that she means skaddition by “+”. The skeptic’s claim is that even if the speaker actually responds with the sum of all the numbers that she is asked to add in her lifetime, we still cannot read off from such responses that she really means plus by “plus”, for even the totality of her dispositions to respond to “x + y =?” is compatible with both “+” meaning addition and “+” meaning skaddition. The dispositional view fails to show that the fact that the speaker means addition, and not skaddition, by “+” can be constituted by facts about the speaker’s dispositions to respond with the sum of numbers. Therefore, the general strategy of the skeptic in this case is to uncover that no matter how the speaker actually responds, such responses can be interpreted differently, that is, in such a way that they remain compatible with different possible meaning facts or rules.
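The compatibility claim can be illustrated with a minimal Python sketch. The constant `LIFETIME_BOUND` is a hypothetical stand-in for “small enough for us to have any disposition to add them in our lifetime”; the text gives no numerical threshold, and the names `add`, `skadd`, and `LIFETIME_BOUND` are illustrative assumptions only.

```python
# Hypothetical cutoff standing in for "small enough to add in a lifetime".
# The value is an arbitrary assumption chosen for illustration.
LIFETIME_BOUND = 1000

def add(x, y):
    """Ordinary addition."""
    return x + y

def skadd(x, y):
    """Skaddition: agrees with addition below the bound, yields 5 otherwise."""
    if x < LIFETIME_BOUND and y < LIFETIME_BOUND:
        return x + y
    return 5

# A sample of queries the speaker could actually answer: all below the bound.
answerable = [(x, y) for x in range(0, LIFETIME_BOUND, 37)
                     for y in range(0, LIFETIME_BOUND, 41)]

# On every answerable query the two functions give the same result,
# so the speaker's responses cannot tell them apart.
agree_on_answerable = all(add(x, y) == skadd(x, y) for x, y in answerable)

# The functions diverge only on cases beyond the speaker's dispositions.
diverge_beyond = add(LIFETIME_BOUND, 1) != skadd(LIFETIME_BOUND, 1)
```

Under these assumptions, every response the speaker is disposed to give fits both functions, and the only cases that would separate them are ones for which she has no disposition at all.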

The skeptic anticipates an obvious objection from the dispositionalists, according to which the way the skeptic has characterized the dispositional view is too naive. A more sophisticated version of this view could avoid the finitude problem by including provisos like “under optimal conditions”. Their claim is that, under such conditions, “I surely will respond with the sum of any two numbers when queried” (Kripke 1982, 27). The main difficulty, however, is to specify these ideal, optimal or standard conditions in a non-question-begging way. For the skeptic, there are two general ways in which these conditions can be specified: (1) by using non-semantical and non-intentional terms, that is, in a purely naturalistic way, and (2) by using semantical and intentional terms. Both fail to bypass the skeptical problem, as the skeptic argues.

According to the skeptic, attempts at the first option lead to entirely indeterminate conjectures because we need to include conditions like “if my brain had been stuffed with sufficient extra matter”, “if it were given enough capacity to perform very large additions”, “if my life (in a healthy state) were prolonged enough”, and the like (see Kripke 1982, 27). Under such conditions, the dispositionalist may claim, I would respond with the sum of two numbers, no matter how large they are. According to the skeptic, however, “we have no idea what the results of such experiments would be. They might lead me to go insane, even to behave according to a quus-like rule. The outcome really is obviously indeterminate” (Kripke 1982, 27). It is completely unknown to us how such a person would be disposed to respond in a possible world in which she possesses such peculiar, beyond-imagination abilities.

In order to avoid such a problem, the dispositionalists may go for the second option and claim:

If I somehow were to be given the means to carry out my intentions with respect to numbers that presently are too long for me to add (or to grasp), and if I were to carry out these intentions, then if queried about ‘m + n’ for some big m and n, I would respond with their sum (and not with their quum). (Kripke 1982, 28)

The skeptic’s objection, however, is that this characterization of the optimal conditions is hopeless because it begs the question against the skeptic’s main challenge: what determines the intention of the speaker to use “+” in one way rather than another? The dispositional view presupposes, in its optimal conditions, that the speaker has a determinate intention toward what she wants to do with the numbers. Obviously, if I mean plus by “plus” or intend “+” to denote the addition function, I will be disposed to give their sum. But the problem is to determine what I mean by “plus” or what intention I have with regard to the use of “+”. This means that the dispositional view fails to meet the Constitution Demand.

ii. Systematic Errors

The dispositional account fails to accommodate the simple fact that we might be disposed to make systematic mistakes. Suppose that the speaker, for any reason, is disposed to respond slightly differently to certain arithmetic queries: she responds to “6 + 5 =?” with “10”, to “6 + 6 =?” with “11”, to “6 + 7 =?” with “12”, and so on. According to the skeptic, the dispositionalists cannot claim that the speaker means plus by “+” but simply makes mistakes, unless they beg the question against the skeptic. For, on their view, “the function someone means is to be read off from his dispositions” (Kripke 1982, 29). The dispositional account aims to show that because the speaker is disposed to respond with the sum of numbers, we can conclude that she follows the addition rule. But, in the above example, the speaker’s responses do not accord with the addition function; therefore, we cannot read off from these responses that she means plus by “plus”. Dispositionalists cannot claim that the speaker intends to give the sum of numbers but makes mistakes. Rather, all that they can say is that the speaker does not mean plus by “plus”. Otherwise, they beg the question against the skeptic by presupposing what the speaker means by “plus” in advance. This is related to the third problem with the dispositional view.
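The point can be sketched in Python under a stated assumption: the rule that the speaker’s answer falls one short whenever the first addend is 6 is only a hypothetical way of filling in the example’s “and so on”, and all function names are illustrative.

```python
def disposed_response(x, y):
    """Hypothetical speaker: answers one less than the sum when x == 6."""
    return x + y - 1 if x == 6 else x + y

def plus(x, y):
    """Ordinary addition."""
    return x + y

def shifted(x, y):
    """A deviant function that simply is the speaker's response pattern."""
    return x + y - 1 if x == 6 else x + y

queries = [(6, 5), (6, 6), (6, 7), (2, 3)]
responses = {q: disposed_response(*q) for q in queries}

# "Reading off" the function from dispositions: keep only the candidates
# that match every response the speaker is disposed to give.
candidates = {"plus": plus, "shifted": shifted}
read_off = [name for name, f in candidates.items()
            if all(f(*q) == responses[q] for q in queries)]
```

On this procedure `plus` is eliminated rather than retained with the deviations marked as errors, since nothing in the dispositions themselves marks a response as a mistake.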

iii. The Normative Feature of Meaning

According to the skeptic, not only does the dispositional view fail to meet the Constitution Demand, but it also fails to meet the Normativity Demand. As shown in the previous section, the dispositional view fails to accommodate the fact that a speaker might make systematic mistakes. The skeptic’s more general claim is that even if the dispositional view can somehow find a way to dodge the finitude problem, it still fails to accommodate the normative feature of meaning because the dispositional facts are descriptive in nature, not normative or prescriptive. As the skeptic puts it:

A dispositional account misconceives the sceptic’s problem – to find a past fact that justifies my present response. As a candidate for a ‘fact’ that determines what I mean, it fails to satisfy the basic condition on such a candidate, […], that it should tell me what I ought to do in each new instance. (Kripke 1982, 24)

When queried about “68 + 57 =?”, we are confident that the correct answer to this query is “125” because we are confident that we mean plus by “plus”. Meaning facts are normative, in that what we meant by “plus” in the past already determined how we ought to respond in the future. Nonetheless, facts about the speaker’s dispositions are descriptive: they do not say that because the speaker has been disposed to respond in this way, she should or ought to respond in that way in the future. They just describe how the speaker has used, uses or will use the word. Therefore, “this is not the proper account of the relation, which is normative, not descriptive” (Kripke 1982, 37): if you meant green by “green” in the past, you ought to apply it to this green object now. The dispositionalist cannot make such a claim, but must rather wait to see whether the speaker is or would be disposed to apply “green” to this green object.

The skeptic’s main objection against the dispositional view is that the speaker’s consistent responses cannot be counted as correct or as the responses the speaker ought to produce. If the responses that the speaker is disposed to produce cannot be viewed as correct, we cannot talk about their being in accordance with a determinate rule or a specific meaning: with no normative constraint on use, there can be no talk of meaning. According to Kripke, this is the skeptic’s chief objection to the dispositional view: “Ultimately, almost all objections to the dispositional account boil down to this one” (Kripke 1982, 24). (For defenses of the dispositional view against the skeptic see, for instance, Coates (1986), Blackburn (1984a), Horwich (1990; 1995; 2012; 2019), Ginsborg (2011; 2021), and Warren (2018).)

The skeptic’s strategy to reject the reductionist responses, such as the dispositional view, can thus be generally stated as follows: it does not matter how the speaker responds because, in whatever way she responds, it can be made compatible with her following different rules. Her answering with “125” to “68 + 57 =?” can be interpreted in such a way as to remain compatible with her following the skaddition rule. We then face a very problematic dilemma.

Suppose that one offers the following solution: each time that the speaker applies the addition rule, she has some other instruction or rule in mind, such as the “counting rule”; by appealing to this latter rule, we can then respond to the skeptic by claiming that “suppose we wish to add x and y. Take a huge bunch of marbles. First count out x marbles in one heap. Then count out y marbles in another. Put the two heaps together and count out the number of marbles in the union thus formed. The result is x + y” (Kripke 1982, 15). The skeptic’s response is obvious and based on the fact that a rule (the addition rule) is determined in terms of another rule (the counting rule). The skeptic can claim that, perhaps, by “count” the speaker always meant quount, not count; he then goes on to offer his non-standard, compatible-with-the-quus-scenario interpretation of “count” (see Kripke 1982, 16). The vicious regress of interpretations reappears, that of rules interpreting rules. At some point, we must stop and say that this rule cannot be interpreted in any other, non-standard way. The skeptic then asks: what is it about this special, “superlative” rule that prevents it from being interpreted in different ways? The skeptical challenge can be applied to this rule, unless we answer the skeptic’s question. But answering that very question is the whole point of the skeptical problem. Any attempt to escape the regress without answering the skeptic’s question, on the other hand, only makes such an alleged superlative rule mysterious.

c. Non-Reductionist Facts: Meaning as a Primitive State

The skeptic rejects a specific version of non-reductionism, according to which the fact that the speaker means plus by “plus” is primitive, irreducible to any other fact about the speaker’s behavioral or mental life. Whenever I use a word, I just directly know what I mean by it; nothing else about me is supposed to constitute this fact. The skeptic himself thinks that “such a move may in a sense be irrefutable” (Kripke 1982, 51). Nevertheless, he describes this suggestion as “desperate” (Kripke 1982, 51) and makes two objections to it: (1) it leaves the nature of such a primitive state completely mysterious, since this state supposedly possesses a general content that is present in an indefinite number of cases in which we may use the word, but our minds or brains do not have the capacity to consider each such case of use explicitly in advance; (2) it has to propose that we somehow have a direct, first-personal epistemic access to the general content of such a state, which is not known via introspection, but which seems to be, in a queer way, always available to us. The skeptic’s objections have also been called the “argument from queerness” (see Boghossian (1989; 1990) and Wright (1984)).

According to the skeptic, the non-reductionist response “leaves the nature of this postulated primitive state – the primitive state of ‘meaning addition by “plus”’ – completely mysterious” (Kripke 1982, 51). It is mysterious because it is supposed to be a finite state, embedded in the speaker’s finite mind or brain, whose capacity is limited, but it is also supposed to possess a general content that covers a potentially infinite number of cases in which the word may be used and that is always available to the speaker and tells her what the correct way of using the word is in every possible case:

Such a state does not consist in my explicitly thinking of each case of the addition table, nor even of my encoding each separate case in the brain: we lack the capacity for that. Yet (as Wittgenstein states in the Philosophical Investigations, §195) ‘in a queer way’ each such case already is ‘in some sense present’. (Kripke 1982, 52).

It is very hard, according to the skeptic, to make sense of the nature of such states that are finite but have such a general content.

Moreover, it is not clear how to explain our direct and non-inferential epistemic access to the content of these states. The primitive state of meaning plus by “plus” determines the correct use of the word in indefinitely (or even infinitely) many cases. Yet, as the skeptic says, “we supposedly are aware of it with some fair degree of certainty whenever it occurs” (Kripke 1982, 51). We directly and non-inferentially know how to use “plus” in each possible case of using it. As Wright characterizes the argument from queerness, “how can there be a state which each of us knows about, in his own case at least, non-inferentially and yet which is infinitely fecund, possessing specific directive content for no end of distinct situations?” (Wright 1984, 775). The skeptic’s claim is that there is no plausible answer to this question.

The skeptical argument is now complete: any reductionist or non-reductionist response to the skeptical problem is shown to be a failure. Granted that, it remains to be seen to what conclusions the skeptic has been leading us all along.

d. The Skeptical Conclusions and Classical Realism

George Wilson (1994; 1998) has usefully distinguished between two different conclusions that the skeptical argument establishes: (1) the Basic Skeptical Conclusion and (2) the Radical Skeptical Conclusion. The Basic Skeptical Conclusion is the outcome of the skeptic’s detailed arguments against the aforementioned candidate facts. After arguing that all of them fail to determine what the speaker means by her words, the skeptic claims that “there can be no fact as to what I mean by ‘plus’, or any other word at any time” (Kripke 1982, 21). In order to see why the argument has a further radical conclusion, we must consider why the skeptic thinks that his argument’s target is “classical realism” (Kripke 1982, 73, 85).

According to the broad realist treatment of meaning, there are facts as to what a (declarative) sentence means or what a speaker means by it. For Kripke, the early Wittgenstein in the Tractatus (1922) supports a similar view of meaning, according to which:

A declarative sentence gets its meaning by virtue of its truth conditions, by virtue of its correspondence to facts that must obtain if it is true. For example, ‘the cat is on the mat’ is understood by those speakers who realize that it is true if and only if a certain cat is on a certain mat; it is false otherwise. (Kripke 1982, 72)

We can tell the same story about the sentences by which we ascribe meaning to our and others’ utterances, such as “Jones means plus by “plus””. According to the realist, this sentence has a truth-condition: it is true if and only if Jones really means plus by “plus”, or if the fact that Jones means plus by “plus” obtains. It is a fact that Jones means plus, and not anything else, by “plus” and depending on the sort of realist view that one holds (such as naturalist reductionist, non-naturalist, non-reductionist, and so forth), such meaning facts are either primitive or, in one way or another, constituted by some other fact about the speaker. Such a realist conception of meaning provides an explanation of why we mean what we do by our words. The skeptical argument rejects the existence of any such fact, as it appears in its Basic Skeptical Conclusion.

If we support such a realist view of meaning, the skeptical argument has a very radical outcome because the combination of the Basic Skeptical Conclusion and the classical realist conception of meaning amounts to the Radical Skeptical Conclusion, according to which “there can be no such thing as meaning anything by any word” (Kripke 1982, 21). For Kripke, this conclusion captures the paradox that Wittgenstein presents in section 201 of the Philosophical Investigations. Any use you make of a word is both correct and incorrect at the same time because it is compatible with different meanings and there is no fact determining what meaning the speaker has in mind. The notion of meaning simply vanishes, together with that of correctness of use. The classical realist explanation of meaning, therefore, leads to the Wittgensteinian paradox. Kripke, however, believes that his Wittgenstein has a “solution” to this problem, though its aim is not to rescue classical realism.

3. Kripke’s Wittgenstein: The Skeptical Solution

The Radical Skeptical Conclusion seems to be obviously wrong for at least two reasons. For one thing, we do very often mean specific things by our words. For another, the Radical Skeptical Conclusion is “incredible and self-defeating” (Kripke 1982, 71) because if it is true, the skeptical conclusions themselves would not have any meaning. According to Kripke, his Wittgenstein does not “wish to leave us with his problem, but to solve it: the sceptical conclusion is insane and intolerable” (Kripke 1982, 60). Kripke’s Wittgenstein agrees with his skeptic that there is no fact about what we mean by our words and thus accepts the Basic Skeptical Conclusion: he thinks that the classical realist explanation of meaning is deeply problematic. Nonetheless, he rejects the Radical Skeptical Conclusion as unacceptable. Although there is no fact as to what someone means by her words, we do not need to accept the conclusion that there is thereby no such thing as meaning and understanding at all. What we need to do is instead to throw away the view that resulted in such a paradox, that is, the classical realist conception of meaning. Such a view was a misunderstanding of our ordinary notion of meaning.

Kripke distinguishes between two general sorts of solutions to the skeptical problem: straight solutions and skeptical solutions. A straight solution aims to show that the skeptic is wrong or unjustified in his claims (see Kripke 1982, 66). The suggested facts previously mentioned can be seen as various attempts to offer a straight solution. The skeptic argues that they are all hopeless as they lead to the paradox. A skeptical solution, however, starts by accepting the negative point of the skeptic’s argument, that is, that there is no fact as to what someone means by her words. The skeptical solution is built on the idea that “our ordinary practice or belief is justified because – contrary appearances notwithstanding – it need not require the justification the sceptic has shown to be untenable” (Kripke 1982, 67).

a. Truth-Conditions vs. Assertibility Conditions

Consider the sentences by which we attribute meaning to others and ourselves, that is, meaning-ascribing sentences, such as “Jones means plus by “plus”” or “I mean plus by “plus””. The classical realist conception of the meaning of such sentences is truth-conditional: the sentence “Jones means plus by “plus”” is true if and only if Jones means plus by “plus” (that is, if and only if the fact that Jones means plus by “plus” obtains) and thus its meaning is that Jones means plus by “plus”. Similarly, the sentence “I mean plus by “plus”” is true if and only if I do mean plus by “plus” (that is, if and only if the fact that I mean plus by “plus” obtains) and thus means that I do mean plus by “plus”. (The focus here will be on third-personal attributions of meaning such as “Jones means plus by “plus””, though similar considerations apply to the case of self-attributions.) The skeptic argues that no such fact obtains to make these sentences true. The skeptical solution abandons the classical realist truth-conditional treatment of meaning. (See Boghossian (1989), Horwich (1990), McDowell (1992), Peacocke (1984), Soames (1998), and Wilson (1994; 1998) for the claim that Wittgenstein’s aim has not been to rule out the notion of truth-conditions, but the classical realist conception of it.)

Alternatively, as Kripke puts it:

[His] Wittgenstein replaces the question “What must be the case for [a] sentence to be true?” by two others: first, “Under what conditions may this form of words be appropriately asserted (or denied)?”; second, given an answer to the first question, “What is the role, and the utility, in our lives of our practice of asserting (or denying) the form of words under these conditions?” (Kripke 1982, 73)

Once we give up on the classical realist view of meaning, all we need to do is to take a careful look at our ordinary practice of asserting meaning-ascribing sentences under certain conditions. Kripke’s Wittgenstein calls these conditions Assertibility Conditions or Justification Conditions (Kripke 1982, 74). In the most general sense, assertibility conditions tell us under what conditions we are justified in asserting something specific by using a sentence. When our concern is to attribute meaning to ourselves and others, these conditions tell us when we can justifiably assert that Jones means plus by “plus” or that I follow the addition rule. We already know that we cannot say that we are justified in asserting that Jones means plus by “plus” because the fact that he means plus obtains. Nor can we do the same in our own case: there is no fact about any of us constituting the fact that we mean this rather than that by our words.

Having agreed with the skeptic that there is no fact about meaning, it seems to Kripke’s Wittgenstein that all that we are left with is our feeling of confidence, blind inclinations, mere dispositions or natural propensities to respond or to use words in one way rather than another: it seems that “I apply the rule blindly” (Kripke 1982, 17). The assertibility conditions specify the conditions under which the subject is inclined, or feels confident, to apply her words in such and such a way: “the ‘assertibility conditions’ that license an individual to say that, on a given occasion, he ought to follow his rule this way rather than that, are, ultimately, that he does what he is inclined to do” (Kripke 1982, 88). This, however, does not imply that there is thereby no such thing as meaning one thing rather than another by some words. The evidence justifying us in asserting or judging that Jones means green by “green” is our observation of Jones’s linguistic behavior, that is, his use of the word under certain publicly observable circumstances. We can justifiably assert that Jones means green by “green” if we can observe, in enough cases, that he uses this word as we do or would do, or more generally, as others in his speech-community are inclined to do. This is the only justification there is, and the only justification we need, to assert that he means green by “green”. We can also tell a story about why such a practice has the shape it has and why we are participating in it at all, without appealing to any explanation of such practices, classical realist or otherwise: participating in them has endless benefits for us. Consider an example from Kripke:

Suppose I go to the grocer with a slip marked ‘five red apples’, and he hands over apples, reciting by heart the numerals up to five and handing over an apple as each numeral is intoned. It is under circumstances such as these that we are licensed to make utterances using numerals. (Kripke 1982, 75-76)

We can assert that the grocer and the customer both mean five by “five”, red by “red”, and apple by “apple” if they agree in the way they are inclined to apply these terms. Our lives depend on our participation and success in such practices. If the customer responds with some bizarre answers, others, including the grocer, start losing their justification to assert that he really means plus by “plus”: the only justification there is for making such assertions starts vanishing.

Note again that such agreed-on dispositions, blind inclinations or natural propensities to respond in certain ways, contrary to the dispositional account of meaning, are not supposed to form a fact that can constitute some meaning fact, such as the fact that the grocer means apple, and not anything else, by “apple”. The sort of responses we naturally agree to produce and the impact they have on our lives give rise to our “form of life”. The members of our speech-community agree to use “plus” and other words in specific ways: they are uniform in their responses. We live a plus-like form of life (see Kripke 1982, 96). However, there is and can be no (realist or otherwise) explanation of why we agree to respond as we do. Any attempt to cite some fact constituting such agreements leads to the emergence of the Wittgensteinian paradox. For this reason, it would be nothing but a brute empirical fact, a primitive aspect of our form of life, that we all agree as we do (see Kripke 1982, 91).

b. The Private Language Argument

Once we accept such an alternative picture of meaning, we realize that one of its consequences is the impossibility of a private language. Kripke’s Wittgenstein emphasizes that “if one person is considered in isolation, the notion of a rule as guiding the person who adopts it can have no substantive content” (Kripke 1982, 89). The skeptical solution cannot admit the possibility of a private language, that is, a language that someone invents and only she can understand, independently of the shared practices of a speech-community. This comes from the nature of the assertibility conditions: “It turns out that […] these conditions […] involve reference to a community. They are inapplicable to a single person considered in isolation. Thus, as we have said, Wittgenstein rejects ‘private language’” (Kripke 1982, 79).

Consider the case of a Robinson Crusoe who has been in isolation since birth on an island. Crusoe is inclined to apply his words in certain ways. He is confident, for instance, that when he applies “green” to an object, his use is correct, that he means green or in any case something determinate by this word. Facing a new object, he thinks he ought to apply “green” to this object too. As there is no one else with whose use or responses he can contrast his, all there is to assure him that his use is correct is himself and his confidence. To Crusoe, thus, whatever seems right is right, in which case no genuine notion of error, mistake, or disagreement can emerge: if he feels confident that “green” applies to a blue object, this is correct. The assertibility conditions in this case would be along these lines:

“Green” applies to this object if and only if Crusoe thinks or feels confident that “green” applies to the object.

This is the reason why Wittgenstein famously stated that “in the present case I have no criterion of correctness. One would like to say: whatever is going to seem right to me is right. And that only means that here we can’t talk about ‘right’” (Wittgenstein 1953, §258). In order for certain applications of “green” to be incorrect, there must be certain correct ways of applying it. For a solitary person, however, “there are no circumstances under which we can say that, even if he inclines to say ‘125’, he should have said ‘5’, or vice versa” (Kripke 1982, 88). The correct answer is simply “the answer that strikes him as natural and inevitable” (Kripke 1982, 88). Crusoe’s use is wrong only when he feels it is wrong.

Nonetheless, if Crusoe is a member of a speech-community, a new element enters the picture: although Crusoe may simply feel confident that applying “green” to this (blue) object is correct, others in his speech-community disagree. The assertibility conditions for how “green” applies turn into the following condition:

“Green” applies to this object if and only if others are inclined to apply “green” to that object, or if others feel confident that “green” applies to it.

As Kripke’s Wittgenstein puts it, “others will then have justification conditions for attributing correct or incorrect rule-following to the subject, and these will not be simply that the subject’s own authority is unconditionally to be accepted” (Kripke 1982, 89). This is the reason why Kripke thinks that Wittgenstein’s argument against the possibility of private language (known as the private language argument) is not an independent argument. Nor is it the main concern of Wittgenstein in the Investigations. Rather, it is the consequence of Wittgenstein’s new way of looking at our linguistic practices, according to which speaking and understanding a language is a sort of activity. As Wittgenstein famously puts it, “to understand a sentence means to understand a language. To understand a language means to be master of a technique” (Wittgenstein 1953, §199). If so, then “to obey a rule, to make a report, to give an order, to play a game of chess, are customs (uses, institutions)” (Wittgenstein 1953, §199). There is an extensive literature on the implications of the private language argument as well as Kripke’s reading of it (see for instance, Baker and Hacker (1984), Bar-On (1992), Blackburn (1984a), Davies (1988), Hanfling (1984), Hoffman (1985), Kusch (2006), Malcolm (1986), McDowell (1984; 1989), McGinn (1984), Williams (1991), and Wright (1984; 1991)).

4. Responses and Criticisms

Since the publication of Kripke’s book, almost every aspect of his interpretation of Wittgenstein has been carefully examined. The responses can be put in three main categories: those focusing on the correctness of Kripke’s interpretation of Wittgenstein, those discussing the plausibility of the skeptical argument and solution, and those attempting to offer an alternative solution to the skeptical problem. Many interesting and significant issues, which were first highlighted by Kripke in his book, have since turned into self-standing topics, such as the normativity of meaning, the dispositional view of meaning, and the community conception of language. In what follows, it will only be possible to glance at some of the most famous responses to Kripke’s Wittgenstein. They mainly concern the debates over the individualist vs. communitarian readings of Wittgenstein and the reductionist factualist vs. non-reductionist factualist interpretations of his remarks.

In their 1984 book, Scepticism, Rules and Language, Baker and Hacker defend an individualistic reading of Wittgenstein’s view of the notion of a practice and thereby reject Kripke’s suggested communitarian interpretation. For them, not only does Kripke misrepresent Wittgenstein, but the skeptical argument and the skeptical solution are both wrong. They believe that Wittgenstein never aimed to reject a philosophical view and defend another. Thus, they find it entirely unacceptable to agree with Kripke that Wittgenstein “who throughout his life found philosophical scepticism nonsensical […] should actually make a sceptical problem the pivotal point of his work. It would be even more surprising to find him accepting the sceptic’s premises […] rather than showing that they are ‘rubbish’” (Baker and Hacker 1984, 5). According to Baker and Hacker, the skeptical argument cannot even be treated as a plausible sort of skepticism; it rather leads to pure nihilism: “Why his argument is wrong may be worth investigating (as with any paradox), but that it is wrong is indubitable. It is not a sceptical problem but an absurdity” (Baker and Hacker 1984, 5). For, as they see it, a legitimate skepticism about a subject matter involves only epistemological rather than metaphysical doubts. An epistemological skeptic would claim that we do mean specific things by our words (as we normally do) but, for some reason, we can never be certain what that meaning is. For Kripke’s Wittgenstein’s skeptic, however, there is no fact about meaning at all and this leads to a paradox, which results in the conclusion that there is no such thing as meaning anything by any word. But “this is not scepticism at all, it is conceptual nihilism, and, unlike classical scepticism, it is manifestly self-refuting” (Baker and Hacker 1984, 5).

According to the way Baker and Hacker read Wittgenstein, the paradox mentioned in section 201 of the Investigations is intended by Wittgenstein to reveal a misunderstanding, not something that we should live with, and “this is shown by the fact that no interpretation, i.e. no rule for the application of a rule, can satisfy us, can definitively fix, by itself, what counts as accord. For each interpretation generates the same problem” (Baker and Hacker 1984, 13). Our understanding of words has nothing to do with the task of fixing a mediating interpretation because the result of such an attempt is a regress of interpretations. For Wittgenstein, understanding is nothing but that which manifests itself in our use of words, in our actions, in the technique of using language. Thus, Wittgenstein cannot be taken to be offering a skeptical solution either.

Moreover, for Baker and Hacker, the community view that Kripke attributes to Wittgenstein, as Wittgenstein’s alternative view, must be thrown away. For if it is the notion of a practice that Wittgenstein thinks of as fundamental, we can find no compelling reason to conclude that Crusoe cannot come up with a practice, in the sense of acquiring a technique to use his words and symbols. After all, it is enough that such an understanding manifests itself in Crusoe’s practices. According to Baker and Hacker, to participate in a practice is not just to act but to repeat an action over time with regularity. If so, then “nothing in this discussion involves any commitment to a multiplicity of agents. All the emphasis is on the regularity, the multiple occasions, of action” (Baker and Hacker 1984, 20).

Blackburn also defends an individualistic reading of Wittgenstein. For him, there is no metaphysical difference between the case of Crusoe and the case of a community. For whatever is available to Kripke’s Wittgenstein to avoid the skeptical problem in the case of a community of speakers is equally available to an anti-communitarian defending the case of Crusoe as a case of genuine rule-following. For instance, consider the problem with the finiteness of dispositions. If the objection is that the totality of the dispositions of an individual, being finite, fails to determine what the individual means by her words, then the totality of the dispositions of a community is finite too and thus equally fails to determine what its members mean by their words. This means that the community can also be seen as following the skaddition rule: the agreement in their similar responses would remain compatible with both scenarios, that is, their following the addition rule and their following the skaddition rule.
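The finiteness point can be illustrated with a minimal sketch. It assumes, for illustration only, that skaddition agrees with addition whenever both arguments are below 57 and yields 5 otherwise (a definition modelled on Kripke’s deviant rule, not taken from this section); the function names and the finite record of past community responses are likewise hypothetical.

```python
def add(x, y):
    """The addition rule."""
    return x + y


def skadd(x, y, bound=57):
    """Assumed deviant rule: agrees with addition below `bound`, else gives 5."""
    return x + y if x < bound and y < bound else 5


# A finite record of the responses a community has agreed on so far.
# Assumption: no one has yet computed with an argument of 57 or more.
community_record = {(x, y): x + y for x in range(57) for y in range(57)}


def compatible(rule, record):
    """True if the rule yields every response in the finite record."""
    return all(rule(x, y) == answer for (x, y), answer in record.items())


# Both rules fit the whole record of agreed responses...
fits_addition = compatible(add, community_record)
fits_skaddition = compatible(skadd, community_record)

# ...yet they come apart on a case outside the record.
new_case = (68, 57)
answers = (add(*new_case), skadd(*new_case))
```

On this toy model, adding more speakers only enlarges a record that remains finite, which is why the objection applies to a community as much as to Crusoe.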

On the other hand, according to Blackburn, if the claim is merely that it is only within a community of speakers that a practice can emerge, we are misreading Wittgenstein. The claim that a practice emerges only within a community may mean different things. It might for instance mean that to Crusoe, whatever seems right is right, so that a community is inevitably required to draw a distinction between what is right and what only seems right. As Blackburn points out, however, the case of an individual and that of a community do not differ metaphysically with respect to this issue because the same problem arises in the case of a community: whatever seems right to the community is right. Alternatively, the claim may mean that it is only because of the interactions between the members of a community that the notion of a practice can be given a legitimate meaning. Blackburn’s objection is that we have no argument against the possibility that Crusoe can interact with himself and thus form a practice: we can imagine that Crusoe interacts with his past self, with the symbols, signs and the like that he used in the past. There is no reason to assume that because his responses are not like ours, Crusoe’s practice is not a practice. The point is that if he is part of no community, there simply is no requirement that he respond as any others do. Consequently, it is implausible to claim that it is only within a speech-community that we can see ourselves as rule-followers: why could Crusoe not see himself as a rule-follower?

For Blackburn, the negative point that Wittgenstein makes is that we must not think of the connection between use of words and understanding them as mediated by something, such as some interpretation, mental image, idea, and so forth, because doing so leads to the regress of interpretations: the search for some other medium making the previous one fixed would go on forever. This is a misunderstanding of our practices. Wittgenstein’s positive insight is that “our rules are anchored in practice […] That is, dignifying each other as rule-following is essentially connected with seeing each other as successfully using techniques or practices” (Blackburn 1984a, 296). But such a notion of a practice is not necessarily hinged on a community: “we must not fall into the common trap of simply equating practice with public practice, if the notion is to give us the heartland of meaning” (Blackburn 1984b, 85). Blackburn, thus, defends an individualist view of rule-following against the communitarian view that Kripke’s Wittgenstein offers in his skeptical solution.

Colin McGinn, in his well-known book Wittgenstein on Meaning (1984), also defends an individualist reading of Wittgenstein. Some of his objections are similar to those made by Blackburn and by Baker and Hacker: Kripke neglects Wittgenstein’s positive remark, offered in the second part of section 201 of the Investigations, that the paradox is the result of a misunderstanding that must be removed. For McGinn, this forms a reductio for the conception of meaning that treats the notion of interpretation as essential to the possibility of understanding a language (McGinn 1984, 68). Wittgenstein’s aim has been to remove a misconception of this notion, according to which understanding is a kind of mental process, such as that of translating or interpreting words. Kripke is thus unjustified in his claim that Wittgenstein offers a skeptical problem and then a skeptical solution to such a problem. For McGinn, Wittgenstein has never been hostile to notions like “facts” and “truth-conditions” as they are ordinarily used; his target has rather been to unveil a misunderstanding of them, one that builds on the notion of interpretation. This means that McGinn supports a factualist reading of Wittgenstein against the non-factualist view that Kripke seems to attribute to him. This factualist view takes the notion of a practice, or the ability to use words in certain ways, to form a fact as to what someone means by her words: “At any rate, if we want to talk in terms of facts it seems that Wittgenstein does suggest that understanding consists in a fact, the fact of having an ability to use signs” (McGinn 1984, 71). (For some of the well-known factualist readings of Wittgenstein, and the skeptical solution, see, for instance, Byrne (1996), Davies (1998), Soames (1997; 1998), Stroud (1996), and Wilson (1994; 1998). See also Boghossian (1989; 1990), Kusch (2006) and Miller (2010) for further discussions.)

Moreover, for McGinn, the notion of a practice or a custom does not involve the notion of a community. Thus, he agrees with Blackburn and with Baker and Hacker on this point. It is true that Wittgenstein embraces the idea of multiplicity, but what matters is not a multiplicity of subjects; it is a multiplicity of instances of rule-following: a word cannot be said to have a meaning if it is used just once; meaning emerges as the result of using words repeatedly over time in a certain way. McGinn also sees the skeptic’s objections to non-reductionism as misplaced. For him, if we treat meaning as an irreducible state of the speaker, we may have a difficult time coming up with a theory that can explain how we directly know the general content of such states. But “lack of a theory of a phenomenon is not in itself a good reason to doubt the existence of it” (McGinn 1984, 161). (For a well-known criticism of McGinn’s view, see Wright (1989).)

On the other hand, McDowell and Peacocke have defended a communitarian reading of Wittgenstein. According to Peacocke, Wittgenstein’s considerations on rule-following reveal that following a rule is a practice, which is essentially communal: “what it is for a person to be following a rule, even individually, cannot ultimately be explained without reference to some community” (Peacocke 1981, 72). We need some public criteria in order to be able to draw the distinction between what seems right to the individual and what is right independently of what merely seems to her to be so, and to assess whether she follows a rule correctly; these criteria would emerge only if the individual can be considered as a member of a speech-community. For Peacocke, Wittgenstein has shown that the individualistic accounts of rule-following are based on a misunderstanding of what is fundamental to the existence of our ordinary linguistic practices.

According to McDowell, Kripke has misinterpreted Wittgenstein’s central point in his remarks on the paradox presented especially in section 201 of the Investigations. His chief remark is offered in the second part of the same paragraph, where Wittgenstein says: “It can be seen that there is a misunderstanding here […] What this shews is that there is a way of grasping a rule which is not an interpretation, but which is exhibited in what we call ‘obeying the rule’ and ‘going against it’ in actual cases” (Wittgenstein 1953, §201). If Wittgenstein views the paradox as the result of a misunderstanding, we cannot claim that he is sympathetic to any skeptic. According to McDowell, for Wittgenstein, the paradox comes not from adopting a realist picture of meaning but from a misconception of our linguistic practices, according to which meaning and understanding are mediated by some interpretation. When we face the question as to what constitutes such an understanding, “we tend to be enticed into looking for a fact that would constitute my having put an appropriate interpretation on what I was told and shown when I was instructed in [for instance] arithmetic” (McDowell 1984, 331). Such a conception of a fact determining an intermediate interpretation is a misunderstanding. For, as Wittgenstein famously said, “any interpretation still hangs in the air along with what it interprets, and cannot give it any support” (Wittgenstein 1953, §198).

For McDowell, if we miss this fundamental point, we then face a devastating dilemma: (1) we try to find facts that fix an interpretation, which obviously leads to the regress of interpretations; but then, (2) in order to escape such a regress, we may be tempted to read Wittgenstein as claiming that to understand is to possess an interpretation but “an interpretation that cannot be interpreted” (McDowell 1984, 332). The latter attempt, however, dramatically fails to dodge the regress of interpretations: it rather pushes us toward an even worse difficulty, that is, that there is a superlative rule which is, in a mysterious way, not susceptible to the problem of the regress of interpretations. For McDowell, “one of Wittgenstein’s main concerns is clearly to cast doubt on this mythology” (McDowell 1984, 332). Understanding has nothing to do with mediating interpretations at all.

McDowell is also against the skeptical solution, which begins by accepting the (basic) skeptical conclusion of the skeptical argument: the whole point of Wittgenstein’s discussion of the paradox in the second part of section 201 has been to warn us against the paradox and to show that the dilemma in question is not compulsory. The paradox emerges as the result of a misunderstood treatment of meaning and understanding, according to which understanding involves interpretation. If so, there is then no need for a skeptical solution at all. For McDowell, once we fully appreciate Wittgenstein’s point about the paradox, we can see that there really is nothing wrong with our ordinary talk of communal facts, that is, facts as to what we mean by our words in a speech-community: “I simply act as I have been trained to. […] The training in question is initiation into a custom. If it were not that […] our picture would not contain the materials to entitle us to speak of following (going by) a sign-post” (McDowell 1984, 339). To understand a language is to master the technique of using this language, that is, to acquire a practical ability. This, however, does not imply admitting a purely behaviorist view of language and thereby emptying the notion of meaning of its normative feature. McDowell’s Wittgenstein treats acting in a certain way in a community “as acting within a communal custom” (McDowell 1984, 352), which is a rule-governed activity.

As we saw, Blackburn, McGinn, and Baker and Hacker defend an individualist reading of Wittgenstein’s remarks on rule-following, while Peacocke and McDowell support a communitarian one. Boghossian (1989) and Goldfarb (1985) also raise serious doubts about whether the skeptical solution can successfully make the notion of a community central to the existence of the practice of meaning something by a word. For them, the assertibility conditions are either essentially descriptive, rather than normative (Goldfarb 1985, 482-485), or they are capable of being characterized in an individualistic way, in which no mention of others’ shared practices is made at all (Boghossian 1989, 521-522). Nonetheless, defending an individualist view of meaning is one thing, advocating a factualist view of it is another: there are individualist factualist views (such as McGinn’s), as well as communitarian factualist views (such as McDowell’s). Moreover, the factualist views may themselves be reductionist (such as Horwich’s) or non-reductionist (such as Wright’s).

For instance, although Wright has offered various criticisms of Kripke’s Wittgenstein’s view, he thinks that the proper solution to the skeptical problem is a particular version of non-reductionist factualism. Like McGinn, Wright finds the skeptic’s argument from queerness against non-reductionism unconvincing (Wright 1984, 775ff.). Nonetheless, contrary to McGinn, he believes that we need to solve the epistemological problems that come with such a view. According to Wright, the generality of the content of our semantical and intentional states or, as he calls it, their “indefinite fecundity”, is not mysterious at all: it is simply part of the ordinary notion of meaning and intention that these states possess such a general content. Wright gives an example to clarify his point: “suppose I intend, for example, to prosecute at the earliest possible date anyone who trespasses on my land” (Wright 1984, 776). The content of such an intention is general: it does not constrain my action to a specific time, occasion, or person, so that “there can indeed be no end of distinct responses, in distinct situations, which I must make if I remember this intention, continue to wish to fulfil it, and correctly apprehend the prevailing circumstances” (Wright 1984, 776). If so, the main problem with non-reductionism is to account for the problem of self-knowledge, that is, to offer an account of why and how it is that we, as first-persons, non-inferentially and directly know the general content of our meaning states on each occasion of use. For one thing, it is part of our ordinary notion of meaning and intention that “a subject has, in general, authoritative and non-inferential access to the content of his own intentions, and that this content may be open-ended and general, may relate to all situations of a certain kind” (Wright 1984, 776). For another, however, Wright believes that we must, and can, account for such a phenomenon. 
He attempts to put forward an account of how we know what we mean and intend, differently from the way others, third-persons, know such meanings and intentions. His account is called the “Judgement-Dependent” account of meaning and intention, which Wright develops in several of his writings. Unpacking this account involves much technicality that goes beyond the scope of this article. (See especially Wright (1992; 2001) for his account. For a different response-dependent response to Kripke’s Wittgenstein, which also defends non-reductionism, see Pettit (1990).) Wright’s main point is that the fact that the non-reductionist response must deal with the problem of self-knowledge forms no decisive argument against its plausibility. On this point, Boghossian is on board with Wright: in order to reject the non-reductionist response what the skeptic needs to do is to provide “a proof that no satisfactory epistemology was ultimately to be had” (Boghossian 1989, 542). The skeptic, however, has no such argument to offer. For Wright, this means that if we explain these features of meaning, non-reductionism “is available to confront Kripke’s sceptic, and that, so far as I can see, the Sceptical Argument is powerless against it” (Wright 1984, 776). (For more on Wright’s criticisms of Kripke, see Wright (1986; 1992, appendix to chapter 3; 2001, part II). For the main defenses of non-reductionism against Kripke’s Wittgenstein, see also Hanfling (1985), Pettit (1990), and Stroud (1996).)

Paul Horwich, on the contrary, defends a communitarian version of reductionist factualism, or more accurately a communitarian version of the dispositional view against the skeptic. His main attempt is to show that “Wittgenstein’s equation of meaning with ‘use’ (construed non-semantically) is the taken-to-be-obvious centrepiece of his view of the matter, […] [contrary] to Kripke’s interpretation [that] the centrepiece is his criticism of that equation!” (Horwich 2012, 146). For Horwich, facts about the speaker’s environment, or more particularly facts about his linguistic community, are important and must be carefully taken care of in our account of meaning. His community-based dispositional view goes against the individualistic theory, according to which “what a person means is determined solely by the dispositions of that person” (Horwich 1990, 111). The community-based version of this view aims to show that “individuals are said to mean by a word whatever that word means in the linguistic community they belong to”. Horwich calls this view the Community-Use Theory. According to Horwich, there are (naturalistic) facts with normative consequences, that is, facts about how a speaker is naturally disposed to respond as a member of a speech-community. If we accept what Horwich calls uncontroversial universal principles, that is, the principles of the form “Human beings should be treated with respect”, “one should believe the truth”, and the like, we can then see that such principles are capable of entailing the sort of conditionals that have certain factual claims as their antecedents and certain normative claims as their consequents. Such conditionals would have the following form: “If Jones is a human being, then he ought to be treated with respect” or “If it is true that 68 + 57 = 125, then one ought to believe it” (see Horwich 1990, 112). All we need is then certain agreed-on principles that can tell us what the normative outcomes of non-normative situations are. 
Since we can have non-semantical, dispositional facts as the antecedents of these conditionals, it would be a mistake to think that factual claims, such as those made by the naturalistic dispositional view of meaning, cannot have normative consequences. For Horwich, therefore, the communal version of the dispositional view can accommodate the normative feature of meaning: factual claims about what a speaker means, whose truth depends on the obtaining of certain facts about the speaker’s dispositions being in agreement with those of the members of the speech-community, can have normative outcomes. Horwich engages in detailed discussions of Wittgenstein’s view of the deflationary theory of truth, different aspects of the normativity of meaning thesis, and the notion of communal dispositions. (For a different sort of reductionist dispositional view, which treats the dispositional facts as irreducibly normative, see Ginsborg (2011; 2018; 2021). See also Maddy (2014) and Marie McGinn (2010) for certain naturalist responses to Kripke’s Wittgenstein.)

Further salient reactions to Kripke’s Wittgenstein, such as those made by Chomsky (1986), Goldfarb (1985), Kusch (2006), Pettit (1990), and Soames (1997), are too technical to be properly unpacked in this article. References to some further key works on the topic can be found in the Further Reading section.

5. References and Further Reading

a. References

  • Armstrong, David. 1997. A World of States of Affairs. Cambridge: Cambridge University Press.
  • Baker, Gordon P. and Hacker, P. M. S. 1984. Scepticism, Rules and Language. Oxford: Basil Blackwell.
  • Bar-On, Dorit. 1992. “On the Possibility of a Solitary Language”. Nous 26(1): 27–45.
  • Bird, Alexander. 1998. “Dispositions and Antidotes”. The Philosophical Quarterly 48: 227–234.
  • Blackburn, Simon. 1984a. “The Individual Strikes Back.” Synthese 58: 281–302.
  • Blackburn, Simon. 1984b. Spreading the Word. Oxford: Oxford University Press.
  • Boghossian, Paul. 1989. “The Rule-Following Considerations”. Mind 98: 507–549.
  • Boghossian, Paul. 1990. “The Status of Content”. The Philosophical Review 99(2): 157–184.
  • Boghossian, Paul. 2003. “The Normativity of Content”. Philosophical Issues 13: 31–45.
  • Boghossian, Paul. 2008. “Epistemic Rules”. The Journal of Philosophy 105(9): 472–500.
  • Byrne, Alex. 1996. “On Misinterpreting Kripke’s Wittgenstein”. Philosophy and Phenomenological Research 56(2): 339–343.
  • Carnap, Rudolf. 1928. The Logical Structure of the World. Berkeley: University of California Press.
  • Chomsky, Noam. 1986. Knowledge of Language: Its Nature, Origin and Use. New York: Praeger.
  • Coates, Paul. 1986. “Kripke’s Sceptical Paradox: Normativeness and Meaning”. Mind 95(377): 77–80.
  • Davies, David. 1998. “How Sceptical is Kripke’s ‘Sceptical Solution’?”. Philosophia 26: 119–40.
  • Davies, Stephen. 1988. “Kripke, Crusoe and Wittgenstein”. Australasian Journal of Philosophy 66(1): 52–66.
  • Gibbard, Allan. 1994. “Meaning and Normativity”. Philosophical Issues 5: 95–115.
  • Gibbard, Allan. 2013. Meaning and Normativity. Oxford: Oxford University Press.
  • Ginsborg, Hannah. 2011. “Primitive Normativity and Scepticism about Rules”. The Journal of Philosophy 108(5): 227–254.
  • Ginsborg, Hannah. 2018. “Normativity and Concepts”. In The Oxford Handbook of Reasons and Normativity, edited by Daniel Star, 989–1014. Oxford: Oxford University Press.
  • Ginsborg, Hannah. 2021. “Going On as One Ought: Kripke and Wittgenstein on the Normativity of Meaning”. Mind and Language: 1–17.
  • Glock, Hans-Johann. 2019. “The Normativity of Meaning Revisited”. In The Normative Animal?, edited by Neil Roughley and Kurt Bayertz, 295–318. Oxford: Oxford University Press.
  • Gluer, Kathrin and Wikforss, Asa. 2009. “Against Content Normativity”. Mind 118(469): 31–70.
  • Goldfarb, Warren. 1985. “Kripke on Wittgenstein on Rules”. The Journal of Philosophy 82(9): 471–488.
  • Goodman, Nelson. 1973. Fact, Fiction and Forecast. Indianapolis: Bobbs-Merrill.
  • Hanfling, Oswald. 1984. “What Does the Private Language Argument Prove?”. The Philosophical Quarterly 34(137): 468–481.
  • Hattiangadi, Anandi. 2006. “Is Meaning Normative?”. Mind and Language 21(2): 220–240.
  • Hattiangadi, Anandi. 2007. Oughts and Thoughts: Rule-Following and the Normativity of Content. Oxford: Oxford University Press.
  • Hattiangadi, Anandi. 2010. “Semantic Normativity in Context”. In New Waves in Philosophy of Language, edited by Sarah Sawyer, 87–107. London: Palgrave Macmillan.
  • Hoffman, Paul. 1985. “Kripke on Private Language”. Philosophical Studies 47: 23–28.
  • Horwich, Paul. 1990. “Wittgenstein and Kripke on the Nature of Meaning”. Mind and Language 5(2): 105–121.
  • Horwich, Paul. 1995. “Meaning, Use and Truth”. Mind 104(414): 355–368.
  • Horwich, Paul. 2012. Wittgenstein’s Metaphilosophy. Oxford: Oxford University Press.
  • Horwich, Paul. 2019. “Wittgenstein (and his Followers) on Meaning and Normativity”. Disputatio 8(9): 1–25.
  • Kripke, Saul. 1982. Wittgenstein on Rules and Private Language. Cambridge, MA.: Harvard University Press.
  • Kusch, Martin. 2006. A Sceptical Guide to Meaning and Rules: Defending Kripke’s Wittgenstein. Chesham: Acumen.
  • Lewis, David. 1997. “Finkish Dispositions”. The Philosophical Quarterly 47: 143–158.
  • Maddy, Penelope. 2014. The Logical Must: Wittgenstein on Logic. Oxford: Oxford University Press.
  • Malcolm, Norman. 1986. Nothing is Hidden. Oxford: Basil Blackwell.
  • McDowell, John. 1984. “Wittgenstein on Following a Rule”. Synthese 58: 325–363.
  • McDowell, John. 1989. “One Strand in the Private Language Argument”. Grazer Philosophische Studien 33(1): 285–303.
  • McDowell, John. 1991. “Intentionality and Interiority in Wittgenstein”. In Meaning Scepticism, edited by Klaus Puhl, 148–169. Berlin: De Gruyter.
  • McDowell, John. 1992. “Meaning and Intentionality in Wittgenstein’s Later Philosophy”. Midwest Studies in Philosophy 17(1): 40–52.
  • McGinn, Colin. 1984. Wittgenstein on Meaning. Oxford: Basil Blackwell.
  • McGinn, Marie. 2010. “Wittgenstein and Naturalism”. In Naturalism and Normativity, edited by Mario De Caro and David Macarthur, 322–351. New York: Columbia University Press.
  • Mellor, David Hugh. 2000. “The Semantics and Ontology of Dispositions”. Mind 109: 757–780.
  • Miller, Alexander. 2010. “Kripke’s Wittgenstein, Factualism and Meaning”. In The Later Wittgenstein on Language, edited by Daniel Whiting, 213–230. Basingstoke: Palgrave Macmillan.
  • Miller, Alexander. 2019. “Rule-Following, Meaning, and Primitive Normativity”. Mind 128(511): 735–760.
  • Mumford, Stephen. 1998. Dispositions. Oxford: Oxford University Press.
  • Peacocke, Christopher. 1981. “Rule-Following: The Nature of Wittgenstein’s Arguments”. In Wittgenstein: To Follow a Rule, edited by Steven Holtzman and Christopher Leich, 72–95. NY: Routledge.
  • Peacocke, Christopher. 1984. “Review of Wittgenstein on Rules and Private Language by Saul A. Kripke”. The Philosophical Review 93(2): 263–271.
  • Pettit, Philip. 1990. “The Reality of Rule-Following”. Mind 99(393): 1–21.
  • Prior, Elizabeth. 1985. Dispositions. Aberdeen: Aberdeen University Press.
  • Railton, Peter. 2006. “Normative Guidance”. In Oxford Studies in Metaethics: Volume 1, edited by Russ Shafer-Landau, 3–34. Oxford: Clarendon Press.
  • Sellars, Wilfrid. 1958. “Counterfactuals, Dispositions and the Causal Modalities”. Minnesota Studies in the Philosophy of Science 2: 225–308.
  • Soames, Scott. 1997. “Scepticism about Meaning, Indeterminacy, Normativity, and the Rule-Following Paradox”. Canadian Journal of Philosophy 27: 211–249.
  • Soames, Scott. 1998. “Facts, Truth Conditions, and the Skeptical Solution to the Rule-Following Paradox”. Nous 32(12): 313–348.
  • Stroud, Barry. 1996. “Mind, Meaning, and Practice”. In The Cambridge Companion to Wittgenstein, edited by Hans Sluga and David G. Stern, 296–319. Cambridge: Cambridge University Press.
  • Warren, Jared. 2020. “Killing Kripkenstein’s Monster”. Nous 54(2): 257–289.
  • Wedgwood, Ralph. 2006. “The Meaning of ‘Ought’”. In Oxford Studies in Metaethics: Volume 1, edited by Russ Shafer-Landau, 127–160. Oxford: Clarendon Press.
  • Wedgwood, Ralph. 2007. The Nature of Normativity. Oxford: Oxford University Press.
  • Whiting, Daniel. 2007. “The Normativity of Meaning Defended”. Analysis 67(294): 133–140.
  • Whiting, Daniel. 2013. “What Is the Normativity of Meaning?”. Inquiry 59(3): 219–238.
  • Williams, Meredith. 1991. “Blind Obedience: Rules, Community and the Individual”. In Meaning Scepticism, edited by Klaus Puhl, 93–125. Berlin: De Gruyter.
  • Wilson, George. 1994. “Kripke on Wittgenstein and Normativity”. Midwest Studies in Philosophy 19(1): 366–390.
  • Wilson, George. 1998. “Semantic Realism and Kripke’s Wittgenstein”. Philosophy and Phenomenological Research 58(1): 99–122.
  • Wittgenstein, Ludwig. 1922. Tractatus Logico-Philosophicus. Translated by C. K. Ogden. London: Kegan Paul.
  • Wittgenstein, Ludwig. 1953. Philosophical Investigations. Translated by G. E. M. Anscombe. Oxford: Basil Blackwell.
  • Wittgenstein, Ludwig. 1956. Remarks on the Foundations of Mathematics. Translated by G. E. M. Anscombe. Edited by G. H. von Wright, R. Rhees, and G. E. M. Anscombe. Oxford: Basil Blackwell.
  • Wright, Crispin. 1984. “Kripke’s Account of the Argument Against Private Language”. The Journal of Philosophy 81(12): 759–778.
  • Wright, Crispin. 1986. “Rule-Following, Meaning and Constructivism”. In Meaning and Interpretation, edited by Charles Travis, 271–297. Oxford: Blackwell.
  • Wright, Crispin. 1989. “Critical Study of Colin McGinn’s Wittgenstein on Meaning”. Mind 98(390): 289–305.
  • Wright, Crispin. 1991. “Wittgenstein’s Later Philosophy of Mind: Sensation, Privacy and Intention”. In Meaning Scepticism, edited by Klaus Puhl, 126–147. Berlin: De Gruyter.
  • Wright, Crispin. 1992. Truth and Objectivity. Cambridge, MA: Harvard University Press.
  • Wright, Crispin. 2001. Rails to Infinity: Essays on Themes from Wittgenstein’s Philosophical Investigations. Cambridge, MA: Harvard University Press.
  • Zalabardo, Jose. 1997. “Kripke’s Normativity Argument”. Canadian Journal of Philosophy 27(4): 467–488.

b. Further Reading

  • Bloor, David. 1997. Wittgenstein, Rules and Institutions. New York: Routledge.
  • Cavell, Stanley. 1990. Conditions Handsome and Unhandsome. Chicago: University of Chicago Press.
  • Cavell, Stanley. 2005. Philosophy the Day After Tomorrow. Cambridge, MA: Belknap Press of Harvard University Press.
  • Cavell, Stanley. 2006. “The Wittgensteinian Event”. In Reading Cavell, edited by Alice Crary and Sanford Shieh, 8–25. NY: Routledge.
  • Coates, Paul. 1997. “Meaning, Mistake, and Miscalculation”. Minds and Machines 7(2): 171–97.
  • Davidson, Donald. 1992. “The Second Person”. Midwest Studies in Philosophy 17: 255–267.
  • Davidson, Donald. 1994. “The Social Aspect of Language”. In The Philosophy of Michael Dummett, edited by B. McGuinness, 1–16. Dordrecht: Kluwer.
  • Diamond, Cora. 1989. “Rules: Looking in the Right Place”. In Wittgenstein: Attention to Particulars, edited by D. Z. Phillips and Peter Winch, 12–34. Hampshire: Basingstoke.
  • Ebbs, Gary. 1997. Rule-Following and Realism. Cambridge, MA: Harvard University Press.
  • Forbes, Graeme R. 1984. “Scepticism and Semantic Knowledge”. Proceedings of the Aristotelian Society 84: 223–37.
  • Hacking, Ian. 1993. “On Kripke’s and Goodman’s Uses of ‘Grue’”. Philosophy 68(265): 269–295.
  • Hanfling, Oswald. 1985. “Was Wittgenstein a Skeptic?”. Philosophical Investigations 8: 1–16.
  • Katz, Jerrold J. 1990. The Metaphysics of Meaning. Cambridge, MA: MIT Press.
  • Maddy, Penelope. 1986. “Mathematical Alchemy”. The British Journal for the Philosophy of Science 37(3): 279–314.
  • McGinn, Marie. 1997. The Routledge Guidebook to Wittgenstein’s Philosophical Investigations. New York: Routledge.
  • Miller, Alexander. 2020. “What Is the Sceptical Solution?”. Journal for the History of Analytical Philosophy 8(2): 1–22.
  • Millikan, Ruth Garrett. 1990. “Truth Rules, Hoverflies, and the Kripke-Wittgenstein Paradox”. The Philosophical Review 99(3): 323–353.
  • Peacocke, Christopher. 1992. A Study of Concepts. Cambridge, MA: MIT Press.
  • Searle, John R. 2002. Consciousness and Language. Cambridge: Cambridge University Press.
  • Smart, J. J. C. 1992. “Wittgenstein, Following a Rule, and Scientific Psychology”. In The Scientific Enterprise, edited by Edna Ullmann-Margalit, 123–138. Berlin: Springer.
  • Stern, David. 1995. Wittgenstein on Mind and Language. Oxford: Oxford University Press.
  • Stern, David. 2004. Wittgenstein’s Philosophical Investigations: An Introduction. Cambridge: Cambridge University Press.
  • Tait, William W. 1986. “Wittgenstein and the ‘Skeptical Paradoxes’”. Journal of Philosophy 83(9): 475–488.
  • Wilson, George. 2006. “Rule-Following, Meaning and Normativity”. In The Oxford Handbook of Philosophy of Language, edited by Ernest Lepore and Barry C. Smith, 1–18. Oxford: Oxford University Press.
  • Wilson, George. 2011. “On the Skepticism about Rule-Following in Kripke’s Version of Wittgenstein”. In Saul Kripke, edited by Alan Berger, 253–289. Cambridge: Cambridge University Press.

 

Author Information

Ali Hossein Khani
Email: hosseinkhani@irip.ac.ir
Iranian Institute of Philosophy (IRIP)
Iran

Mathematical Nominalism

Mathematical nominalism can be described as the view that mathematical entities—entities such as numbers, sets, functions, and groups—do not exist. However, stating the view requires some care. Though the opposing view (that mathematical objects do exist) may seem like a somewhat exotic metaphysical claim, it is usually motivated by the thought that mathematical objects are required to exist in order for mathematical claims to be true. If, for instance, it is true that there are infinitely many prime numbers, then prime numbers prima facie exist. Much contemporary work in mathematical nominalism divides into efforts to argue either that mathematical truths do not in fact require the existence of mathematical objects, or that we are entitled to regard mathematical claims, such as the one above, as false.

This article surveys contemporary attempts to defend mathematical nominalism. Firstly, it considers how to formulate mathematical nominalism, surveys the origins of the contemporary debate, and explains epistemic motivations for nominalism. Secondly, it examines a particularly prominent family of objections to mathematical nominalism, issuing from the applicability of mathematics. Thirdly, it looks at three kinds of response to that family of objections: reconstructive nominalism (aiming to show that, in principle, one can recreate the applications of mathematics without making mathematical claims), deflationary nominalism (aiming to show that the truth of mathematical claims does not require the existence of mathematical objects), and instrumentalist nominalism (aiming to show that one can make sense of standard mathematical practices without incurring a commitment to the truth of mathematical claims). Finally, it surveys the claims of some leading thinkers about the relationship between mathematical nominalism and naturalism.

Table of Contents

  1. Formulating Mathematical Nominalism
  2. The Origins of the Contemporary Debate
  3. Motivations for Nominalism
  4. Nominalism and the Application of Mathematics
    1. The Indispensability Argument
  5. Reconstructive Nominalism
    1. Chihara
      1. Constructibility Theory
      2. Constructibility Theory and Standard Type Theory
      3. The Role of Possible World Semantics
    2. Field
      1. Field’s Program
      2. Representation Theorems
      3. Conservativeness
      4. The Prospects of Field’s Project
      5. Conservativeness Again
      6. The Best Theory?
  6. Deflationary Nominalism
    1. Azzouni
      1. Quantifier Commitments and Ontological Commitments
      2. The Coherence of Denying Quine’s Criterion
      3. Excuse Clauses
  7. Instrumentalism
    1. Leng
      1. Mathematics and Make-Believe
      2. Explaining the Success of Mathematics
      3. Mathematical Explanations
      4. Nominalistic Content
  8. Mathematical Nominalism and Naturalism
    1. Quine’s Naturalism
    2. Maddy’s Naturalism
    3. Burgess and Rosen’s Naturalism
  9. References and Further Reading

1. Formulating Mathematical Nominalism

At a first pass, one can describe mathematical nominalism as the view that mathematical entities do not exist. Some clarifications and caveats, however, should be kept in mind. Firstly, some theorists have held that mathematical entities are, in some sense, mental objects. (The Dutch mathematician and philosopher L.E.J. Brouwer is sometimes interpreted as having endorsed this view.) Nominalists, however, deny the existence of mathematical objects understood as abstract objects whose existence does not depend on any mental or linguistic activity. To understand this claim, one must appreciate the thought that all that there is, or that there might be, can be divided into two exclusive and exhaustive categories: the concrete and the abstract. Nominalists hold that abstract objects do not exist. Examples of concrete objects are tables, chairs, stars, human beings, molecules, and microbes, as well as more exotic theoretical entities such as electrons, bosons, dark matter and so on. Paradigmatically, concrete objects are spatiotemporal, contingent, have causal powers and can themselves be affected, participate in events, and can be interacted with, even if only indirectly. There are two different senses in which entities can be abstract. In one sense, to be abstract means to be non-particular. It is in this sense that universals are said to be abstract. Universals are properties which can be instantiated by particular objects. The property of being negatively charged would be instantiated by particular electrons, for example. Some theorists have understood mathematical objects as universals (Bigelow 1988, Shapiro 1997). However, mathematical objects are typically conceived as being abstract in a different way: they are particular, but non-concrete. Paradigmatically, objects that are abstract in this sense are particular, but non-spatial, necessary, unchanging, acausal, and cannot be interacted with, even indirectly.
(The distinction between concrete and abstract entities is, however, difficult to analyze. Rosen (2020) provides a detailed discussion. See also Lewis (1986, 81–86) and Fitzgerald (2003).) Mathematical nominalism then can be more specifically described as the view that abstract mathematical entities do not exist, in either sense of “abstract”.

Secondly, some theorists hold that the term “exist” and its cognates are not univocal (see, for instance, Russell 1903; Brandom 1994; Miller 2002; Vallicella 2002; Putnam 2004; Hirsch 2011; Hofweber 2016; McDaniel 2017; Kimhi 2018). For example, the meaning of “exist(s)” in “Electrons exist” may not be the same as in “Giants exist in both Mesopotamian and Shinto mythology” or “A special bond exists between all philosophers of mathematics”. Further, some theorists hold that some uses of “exist(s)” are not ontologically committing (that is, one can talk of some things existing without thereby committing to those things being part of the furniture of reality) and, additionally, that this is true of existence claims about mathematical objects (Azzouni 2010b, 2017). On this view, mathematical objects can be said to exist in a way that is not ontologically significant, such that the existence of mathematical objects makes no demand on the world. Put differently, mathematical objects could be said to exist regardless of what the world is like. Mathematical nominalism, then, can be more specifically described as the view that abstract mathematical entities do not exist independently of mental or linguistic activity and in an ontologically significant sense of “exist”. However, to avoid unnecessary complication, these caveats will mostly be left implicit going forward.

The terminology used is not uniform across the literature. Those who hold that abstract mathematical objects exist independently of mental or linguistic activity, in an ontologically significant sense, are usually called mathematical platonists. However, small-“p” platonists in this sense are not necessarily followers of Plato, and some reserve the term “Platonist” for views that have more in common with Plato’s own platonism, distinguishing platonism from the more generic object realism (Linnebo 2017) or the apophatic anti-nominalism (Burgess and Rosen 1997). At least one theorist, Rayo (2016), refers to the view that mathematical objects exist in an ontologically insignificant sense of “exist” as trivialist platonism or subtle platonism.

2. The Origins of the Contemporary Debate

Understanding contemporary defenses of nominalism requires understanding the motivations underlying anti-nominalism. Perhaps primary among these is a broadly representationalist view of language, according to which declarative claims (at the very least, simple declarative claims of subject-predicate form) purport to represent or describe the world as being a certain way. Declarative claims (again, at the very least, simple declarative claims of subject-predicate form) are true just in those cases where the world is the way they represent it as being, and to take a claim of this sort to be true is to take the world to be the way the claim represents it as being. For example, “The Forth Rail Bridge is red” says of the Forth Rail Bridge (the subject) that it is red (the predicate). So, a claim of this sort purports to denote something (the Forth Rail Bridge) and attributes a property to that thing (redness). The claim is true just in case the thing it purports to denote exists and has the property it ascribes to it. If the Forth Rail Bridge were turquoise, or did not exist, the claim would not be an accurate representation or description.

Mathematics similarly contains simple declarative claims of subject-predicate form, such as “Seven is prime”. This says of the number seven that it is prime. So, according to this broadly representationalist view, the claim purports to denote something (the number seven) and attributes a property to that thing (primeness). Since simple declarative claims are true just in those cases in which they accurately describe that of which they speak, “Seven is prime” is true just in case the thing it purports to denote exists and has the property it ascribes to it. If seven were not prime, or did not exist, the claim would not be an accurate representation or description.

A major influence here is Gottlob Frege’s pathbreaking work in the foundations of mathematics. In his Grundlagen der Arithmetik (1884), Frege defended the view that numerical expressions function as singular terms, and that singular terms are the parts of language which purport to pick out or refer to exactly one object. For Frege, singular terms are those that can correctly flank an identity sign “=”, or “is” when used to express identity, for instance: “The shortest-serving prime minister of the twentieth century is Bonar Law” or “The smallest number expressible as the sum of two cubes in two different ways = 1729”. The terms “The shortest-serving prime minister of the twentieth century”, “Bonar Law”, “the smallest number expressible as the sum of two cubes in two different ways”, and “1729” all purport to refer to exactly one object. For Frege, the truth of claims such as these requires not only that their singular terms purport to refer to exactly one object, but also that they succeed in doing so; that is, there must be such an object. The claim “The smallest number expressible as the sum of two cubes in two different ways is 1729” is true because there is a number, 1729, that is the smallest number expressible as the sum of two cubes in two different ways. On the other hand, no claim about the largest prime number can be true, because there is no largest prime. These semantic considerations lay the groundwork for a simple but influential argument for mathematical platonism: some mathematical claims are true; therefore, there are mathematical objects. Since, it is widely supposed, these claims are true regardless of what anyone thinks or says, the existence of mathematical objects is mind- and language-independent.
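The arithmetical claim used in Frege's example can be checked directly. The following is a minimal sketch, assuming that "cubes" means cubes of positive integers and that two representations differing only in the order of the summands count as the same; the function names and the search bound of 5000 are illustrative choices, not part of the source.

```python
from itertools import combinations_with_replacement


def two_cube_sums(limit):
    """Map each n <= limit to the unordered pairs (a, b), a <= b, of
    positive integers with a**3 + b**3 == n."""
    sums = {}
    # Slightly over-estimate the largest base; oversized sums are filtered below.
    max_base = round(limit ** (1 / 3)) + 1
    for a, b in combinations_with_replacement(range(1, max_base + 1), 2):
        n = a**3 + b**3
        if n <= limit:
            sums.setdefault(n, []).append((a, b))
    return sums


def smallest_with_two_representations(limit=5000):
    """Return the least n <= limit that is a sum of two positive cubes
    in at least two different ways, or None if there is none."""
    sums = two_cube_sums(limit)
    candidates = [n for n, pairs in sums.items() if len(pairs) >= 2]
    return min(candidates) if candidates else None


taxicab = smallest_with_two_representations()
print(taxicab, two_cube_sums(5000)[taxicab])  # 1729 [(1, 12), (9, 10)]
```

The search confirms that 1729 = 1³ + 12³ = 9³ + 10³ and that no smaller positive integer has two such representations. It says nothing, of course, about the philosophical question of whether the truth of the claim requires that 1729 exist.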

The Polish logician Alfred Tarski’s (also pathbreaking) work on truth helped to ensconce this broadly representationalist picture. Tarski’s own interests were not in defending a philosophical account of language, but in showing how to define a notion true-in-L for some formal language L in a way that avoids the Liar Paradox (Tarski 1935, 1944). What emerged is known as a semantic theory of truth, so-called not because it has to do with meaning per se, but because, in line with contemporaneous usage of the word “semantic”, it has to do with relations between words and things. Tarski’s approach depends in part on stipulating what each singular term in L denotes, and which things or sequences of things “satisfy” the predicates of L. For example: “x is red” is satisfied by the Forth Rail Bridge if and only if the Forth Rail Bridge is red, and “x admires y” is satisfied by the ordered pair <Thom, Jonny> if and only if Thom admires Jonny.
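The stipulative character of this approach can be illustrated with a toy model. The sketch below is only an assumption-laden illustration, not Tarski's own formalism: the dictionaries `denotation` and `extension`, the functions `satisfies` and `is_true`, and the "facts" they encode are all hypothetical and chosen to mirror the two examples above.

```python
# Stipulated denotations: each singular term of the toy language L is
# assigned exactly one object (objects are represented here by strings).
denotation = {
    "forth_rail_bridge": "Forth Rail Bridge",
    "thom": "Thom",
    "jonny": "Jonny",
}

# Stipulated extensions: each predicate is assigned the set of sequences
# (tuples) of objects that satisfy it. These facts are made up for illustration.
extension = {
    "red": {("Forth Rail Bridge",)},
    "admires": {("Thom", "Jonny")},
}


def satisfies(sequence, predicate):
    """A sequence of objects satisfies a predicate just in case the
    sequence belongs to the predicate's stipulated extension."""
    return tuple(sequence) in extension[predicate]


def is_true(predicate, *terms):
    """An atomic sentence is true-in-L just in case the objects denoted
    by its terms, taken in order, satisfy its predicate."""
    return satisfies([denotation[t] for t in terms], predicate)


print(is_true("red", "forth_rail_bridge"))   # True
print(is_true("admires", "thom", "jonny"))   # True
print(is_true("admires", "jonny", "thom"))   # False: order matters
```

Note that `is_true` raises a `KeyError` for any term that has not been assigned a denotation. This loosely reflects the point at issue: on this picture, an atomic sentence is evaluated only by way of the objects its terms denote.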

Tarski’s work formed the basis of a new branch of mathematics, model theory, and, subsequently, of the model-theoretic accounts of language, which became mainstream in formal semantics by the end of the twentieth century. This is the approach to semantics familiar from logic textbooks. There, an interpretation of a language is understood as a function from the set of elements of the language itself—variables, constants, predicates, sentences—to the domain of that language—the set of things that language is about (given the interpretation).

Like Frege’s analysis of singular terms, the formal semantics that stems from Tarski’s semantic account of truth appears to entail platonism, so long as one holds that there are true mathematical sentences. This point was made by Paul Benacerraf in a canonical and widely cited 1973 paper “Mathematical Truth”. Benacerraf holds that Tarski’s is “the only viable systematic general account we have of truth” (Benacerraf 1973, 670) and that a uniform semantics or theory of truth should be given to both non-mathematical parts of natural language (for example, “There are at least three large cities older than New York”) and to mathematese (for example, “There are at least three perfect numbers greater than 17”). One reason is that:

The semantical apparatus of mathematics [should] be seen as part and parcel of that of the natural language in which it is done, and thus that whatever semantical account we are inclined to give of names or, more generally, of singular terms, predicates, and quantifiers in the mother tongue include those parts of the mother tongue which we classify as mathematese. (Benacerraf 1973, 666)

A distinct, but closely related, reason is that logical consequence is standardly defined in terms of truth (Tarski 1936). Roughly speaking: a set of sentences Σ logically entails a sentence ϕ just in case there is no interpretation function according to which Σ is true but ϕ is false. If mathematical truth is not understood along Tarskian (and therefore apparently platonist) lines, we would require a new account not just of mathematical truth but also of logical consequence, but no such accounts are forthcoming (Benacerraf 1973, 670).
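The rough definition of logical consequence just given can be made concrete for a very simple case. The sketch below assumes a propositional language, in which an interpretation is just an assignment of truth values to atomic sentences; this is a deliberate simplification of the first-order, model-theoretic notion at issue, and the function `entails` and the sentence representations are hypothetical illustrations.

```python
from itertools import product


def entails(premises, conclusion, atoms):
    """Return True just in case no interpretation makes every premise
    true and the conclusion false.

    Sentences are represented as functions from an interpretation (a dict
    mapping atom names to truth values) to a truth value.
    """
    for values in product([True, False], repeat=len(atoms)):
        interpretation = dict(zip(atoms, values))
        if all(p(interpretation) for p in premises) and not conclusion(interpretation):
            return False  # counter-interpretation found
    return True


# Example sentences over the atoms "p" and "q".
p = lambda v: v["p"]
q = lambda v: v["q"]
p_implies_q = lambda v: (not v["p"]) or v["q"]

print(entails([p_implies_q, p], q, ["p", "q"]))  # True  (modus ponens)
print(entails([p_implies_q, q], p, ["p", "q"]))  # False (affirming the consequent)
```

The definition of `entails` is stated entirely in terms of truth under an interpretation, which is why, as Benacerraf stresses, abandoning the Tarskian account of truth for mathematical sentences would also call for a new account of consequence.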

Similarly influential was the Harvard logician W.V.O. Quine. Quine’s (also canonical and widely cited) 1948 paper “On What There Is” did much to establish as orthodox the view that the existential quantifier is ontologically committing. In Quine’s own words:

The variables of quantification, ‘something’, ‘nothing’, ‘everything’, range over our whole ontology, whatever it may be; and we are convicted of a particular ontological presupposition if, and only if, the alleged presuppositum has to be reckoned among the entities over which our variables range in order to render one of our affirmations true. (Quine 1948, 32)

Despite the paper’s influence, what the argument for that conclusion is, or whether it offers an argument at all, is disputed. Elsewhere, however, Quine defends the claim that the existential quantifier expresses existence on the grounds that its meaning is given by the English phrase “there is an object x such that…” and that this expresses existence (Quine 1986, 89).

3. Motivations for Nominalism

Although mathematical nominalism is the metaphysical claim that no mathematical objects exist, the chief argument for nominalism centers around the epistemological concern that we cannot have knowledge of mind-independent and language-independent mathematical objects even if they do exist, or, more weakly, that it is a mystery how we could have knowledge of mathematical objects so conceived. (See the article on The Benacerraf Problem of Mathematical Truth and Knowledge.) The upshot of these arguments is not the de facto claim that mathematical objects do not exist, but the de jure claim that we ought not to believe in mathematical objects. The epistemological problem with mathematical objects arises from the difficulty in squaring what abstract objects are like, if they exist, with what we know about ourselves as enquirers with particular capacities, abilities, and faculties for gaining knowledge of what the world is like. Mathematical objects as abstract objects are not only the sort of things that cannot be touched or seen, but they also cannot be interacted with or manipulated in any way. They have no effects, nor do they participate in events that could, even in principle, impinge on one’s experience. Facts about abstract objects, as the platonist understands them, cannot make a difference to any data one might have or come to acquire, or to any beliefs one might come to hold.

The canonical articulation of the epistemological problem is due to Benacerraf (in the same paper in which he advocates a Tarskian account of mathematical truth). Our account of mathematical knowledge, Benacerraf claimed, “must fit into an over-all account of knowledge in a way that makes it intelligible how we have the mathematical knowledge that we have” (Benacerraf 1973, 667), in particular:

A causal account of knowledge on which for [some person] X to know that [some sentence] S is true requires some causal relation to obtain between X and the referents of the names, predicates, and quantifiers of S. (Benacerraf 1973, 671)

A causal criterion for knowledge immediately rules out knowledge of abstract objects since they are acausal.

As the causal theory of knowledge waned in popularity, so did Benacerraf’s particular formulation of the epistemological problem. However, it is too quick to conclude from the failure of causal analyses of knowledge that there is no sound causal-epistemological argument against the possibility of knowledge of abstract objects. This depends on whether the analysis fails because an appropriate causal connection between an agent and the object of belief is not sufficient for knowledge, or because such a connection is not necessary for knowledge. If it is the latter, then showing that there are no causal connections between an agent who holds mathematical beliefs and mathematical objects would not show that something required for knowledge is lacking. On the other hand, if appropriate causal connections are insufficient but necessary for knowledge, then the causal objection would go through. The most influential objections to causal theories are of the former sort: appropriate causal connections, it is argued, are necessary but not sufficient for knowledge, as an appropriate causal connection can exist between a person’s belief and the object of this belief, while the belief is true only by luck, and hence not known (Goldman 1976). Others, however, have claimed that causal connections cannot be necessary for knowledge, as this would rule out knowledge of the future (Burgess and Rosen 1997; Potter 2007).

Some have argued for more modest, restricted versions of a causal criterion, which are not committed to the claim that all knowledge requires an appropriate causal connection between the knower and the object of her knowledge. Colin Cheyne (1998, 2001) claims that the causal criterion applies to existential knowledge, arguing that this is supported by examples from empirical science. However, subsequently Nutting (2016) has argued that a Benacerraf-like argument can be made that does not rely on a general causal criterion, but on the more defensible claim that direct knowledge (that is, knowledge of a claim that is not gained via an inference from another claim) requires some kind of appropriate causal connection. This, along with the premise that the objects of mathematical knowledge are acausal mathematical objects, and the premise that if we have any mathematical knowledge, some of it must be direct, entails that mathematical knowledge is impossible.

Others have characterized the epistemological objection in different terms. W.D. Hart claims that the epistemological problem does not concern causal theories of knowledge in particular, but empiricism more generally. Empiricism, as Hart understands it, is “the doctrine that all knowledge is a posteriori” (Hart 1977, 125). A posteriori knowledge is “justified ultimately by experience” (ibid.). Experience, in turn, “requires causal interaction with the objects experienced” (ibid.). Yet, causal interaction with mathematical objects is impossible. Though there is not a strict incompatibility between these tenets—unless one reads them as making the more specific claim that, for all x, knowledge of x requires experience of x—it is enough to set up a prima facie tension between empiricism and platonism.

Hartry Field (1989) reformulated the epistemic problem as a challenge to explain how our beliefs about abstract, mathematical objects could be reliable. Realists about mathematical objects think that their beliefs about mathematical objects are largely true. If so, those beliefs are highly correlated with the mathematical facts. The platonist, however, “must not only accept the reliability, but must commit himself or herself to the possibility of explaining it” (Field 1989, 26). However, there appear to be serious difficulties in doing so. On the one hand, the platonist conception of mathematical objects as acausal and mind-independent:

means that we cannot explain the mathematicians’ beliefs and utterances on the basis of those mathematical facts being causally involved in the production of those beliefs and utterances; or on the basis of the beliefs and utterances causally producing the mathematical facts; or on the basis of some common cause producing both. (Field 1989, 231)

On the other hand, “it is very hard to see what [a] supposed non-causal explanation could be” (Field 1989, 231). If the reliability of mathematical beliefs (as understood by the platonist, that is, as beliefs about mind-independent abstract objects) appears impossible to explain, this would “undermine the belief in mathematical entities, despite whatever reason we might have for believing in them” (Field 1989, 26).

A prominent attempt to explain the reliability of mathematical beliefs, given platonism, is due to Balaguer (1998). Balaguer responds to Field’s challenge by invoking a “full-blooded platonist” view (often referred to as “set-theoretic pluralism” or as a “set-theoretic multiverse” view) according to which, roughly, every coherently describable universe of mathematical objects exists. So long, then, as our mathematical belief-forming methods result in consistent mathematical beliefs, they will accurately describe some mathematical objects. This, Balaguer argues, offers a platonistic explanation of the reliability of mathematical beliefs.

Two issues should be noted. Firstly, for the explanation to succeed, there must be something about our linguistic practices that makes it the case that our mathematical claims are always about the mathematical objects of which they would be true (Clarke-Doane 2020). For example, Zermelo-Fraenkel set theory with the axiom of choice (ZFC) is consistent both with the hypothesis CH that there is no set with cardinality larger than the integers but smaller than the real numbers, and with the negation of CH. On a universe view, there is one set-theoretic universe (characterizable, say, by ZFC), and it is the case that CH either accurately or inaccurately characterizes that universe. On a multiverse view, there is a plurality of set-theoretic universes characterizable by ZFC (as well as yet further universes characterizable by different consistent sets of axioms), some of which are accurately characterizable by CH (ZFC + CH universes) and others of which are accurately characterizable by the negation of CH (ZFC + ¬CH universes). To secure reliability, it must be the case that when one makes claims that are true of, for instance, ZFC + CH universes, one is in fact talking about ZFC + CH universes and not, rather, making false claims about ZFC + ¬CH universes. There are some difficulties in pinning down what it is about our language that would make this the case. (See Putnam (1980). Button (2013) gives a book-length treatment and Button and Walsh (2018, chapter 2) provides an introduction to these issues concerning reference.)

Secondly, meeting Field’s challenge removes one epistemological objection to platonism, but does not show that knowledge of mathematical objects, as understood by the platonist, is possible. This is because reliability is a necessary, but not a sufficient, condition for knowledge. It is possible for a belief-forming process to be serendipitously reliable in a way that is not sufficient for knowledge. For instance, if someone has a brain lesion that causes them to believe that they have a brain lesion (Plantinga 1993), that person’s belief-forming process reliably leads, in the case of this belief (that they have a brain lesion), to a true belief, but still involves a kind of epistemic luck that is antithetical to knowledge. Standard accounts of epistemic luck appeal to safety or sensitivity conditions to secure knowledge, which in turn are analyzed in terms of what the agent would have believed in metaphysically possible worlds suitably related to the actual world. For this reason, they are often taken to be inapplicable to mathematical platonism. If platonism is true, it is true in all metaphysically possible worlds. The upshot is that standard safety and sensitivity conditions are trivially met in cases of necessary truths, so that every belief whose object is a necessary truth would count as knowledge even if it is gained by luck. Collin (2018) argues that it is possible to formulate an epistemological argument against platonism in terms of epistemic luck: by analyzing safety and sensitivity conditions in terms of epistemically possible scenarios, rather than metaphysically possible worlds, one can apply safety and sensitivity conditions to necessary truths.
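The point that safety and sensitivity conditions are trivially met by necessary truths can be illustrated with a toy model. The following Python sketch is only a simplification under stated assumptions: the class name Scenario, the two checking functions, and the tiny lists of "nearby worlds" are all hypothetical, and the conditions are rough textbook glosses rather than any particular author's formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One nearby possible world, described only by what matters here (hypothetical model)."""
    p_is_true: bool   # is the proposition p true in this world?
    believes_p: bool  # does the agent believe p in this world?


def is_sensitive(nearby):
    """Rough sensitivity: in nearby worlds where p is false, the agent does not believe p."""
    return all(not w.believes_p for w in nearby if not w.p_is_true)


def is_safe(nearby):
    """Rough safety: in nearby worlds where the agent believes p, p is true."""
    return all(w.p_is_true for w in nearby if w.believes_p)


# A contingent belief held by luck: in one nearby world p is false but still believed.
lucky_contingent = [Scenario(True, True), Scenario(False, True)]

# A necessary truth believed on a whim: p is true in every world,
# and the belief comes and goes for no good reason.
lucky_necessary = [Scenario(True, True), Scenario(True, False), Scenario(True, True)]

print(is_safe(lucky_contingent), is_sensitive(lucky_contingent))
print(is_safe(lucky_necessary), is_sensitive(lucky_necessary))
```

In this toy model the lucky contingent belief fails both conditions, whereas the lucky belief in a necessary truth passes both, simply because there is no world in which the believed proposition is false.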

No formulation of the epistemological objection is uncontroversial, but a felt sense that something is epistemically worrying about abstract objects is common in the philosophical literature. Pinning down exactly what the epistemological problem with mathematical objects is remains an open task for nominalists.

4. Nominalism and the Application of Mathematics

The semantic argument for mathematical platonism presented above suggests two genera of nominalist responses. The first is to deny the mainstream semantic assumptions that undergird the inference from the truth of simple declarative mathematical claims to the existence of mathematical objects. In broad terms, this could mean either dropping representationalism—and holding that simple declarative mathematical claims do not purport to represent a domain of mathematical objects in any substantive sense of “represent”—or retaining representationalism but holding that the mathematical objects being represented are non-existent objects. The second is to accept mainstream semantic assumptions, but to hold that simple declarative mathematical claims are false because there are no mathematical objects. On this second kind of view, the standards of correctness or incorrectness for mathematical claims are not, strictly speaking, standards of truth and falsehood.

Both sorts of response are complicated by the applicability of mathematics. There exists a large literature on the applicability of mathematics (see, for example, Frege 1884; Suppes 1960; Carnap 1967; Putnam 1971; Krantz and others 1971; Field 1980; Resnik 1997; Shapiro 1997; Steiner 1998; Azzouni 2004; Chang 2004; Chihara 2004; van Fraassen 2008; Bueno and Colyvan 2011; Bangu 2012; Pincock 2012; Weisberg 2013; Morrison 2015; Bueno and French 2018; Ketland 2021; Leng 2021; the article on The Applicability of Mathematics). However, a brief overview is enough to reveal its relevance to mathematical nominalism. At a high level of generality, mathematics is applied within the sciences in the following way. Scientists devise equations which can be used to model or represent concrete systems. Measurement procedures are used to assign mathematical values to aspects of the target concrete system and these values are “plugged in” to the equations. When mathematics is not merely a predictive tool, different values within the equations correspond to different magnitudes of properties of the concrete system. By manipulating the equations, one can then make predictions about the concrete system. These predictions are (directly or indirectly) testable when the mathematical results are (directly or indirectly) associated with measurement procedures.

Consider, for example, a closed vessel of volume V (in m³) containing a gas. The pressure P (in pascals) of the gas can be measured using a manometer, and the temperature T (in kelvin) can be measured using a thermometer. Letting n be the number of moles of gas (where 1 mole ≈ 6.02 × 10²³ molecules), and R be the ideal gas constant (≈ 8.314 J/(mol·K)), the ideal gas law (in molar form) tells us:

PV = nRT

Though only approximately accurate, the ideal gas law allows one to calculate physical quantities and make predictions about the behavior of gasses in a range of circumstances. It not only tells us, for instance, that increasing the temperature of the gas while holding fixed the volume of the vessel will increase the pressure but allows us to calculate precisely (idealization notwithstanding) to what extent this is the case. It also allows us, for instance, to calculate the number of moles of gas, so long as we are able to measure the pressure, volume, and temperature of the system, by rearranging the equation:

n = PV/(RT)
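As a rough numerical illustration of the rearranged law n = PV/(RT), the following Python sketch computes the amount of gas from measured values and checks that heating the gas at fixed volume raises the pressure. The function names and the sample readings (a 0.05 m³ vessel at 101325 Pa and 300 K) are assumptions chosen for illustration, not values from the text.

```python
R = 8.314  # ideal gas constant in J/(mol*K), approximate value as quoted in the text


def moles_of_gas(pressure_pa, volume_m3, temperature_k):
    """Return n = PV / (RT) for an ideal gas."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (in kelvin)")
    return (pressure_pa * volume_m3) / (R * temperature_k)


def pressure_of_gas(n_moles, volume_m3, temperature_k):
    """Return P = nRT / V for an ideal gas."""
    if volume_m3 <= 0:
        raise ValueError("volume must be positive (in cubic metres)")
    return n_moles * R * temperature_k / volume_m3


# Hypothetical measurements of the closed vessel.
n = moles_of_gas(pressure_pa=101_325.0, volume_m3=0.05, temperature_k=300.0)
print(f"amount of gas: {n:.3f} mol")

# Heating the same gas in the same vessel: the pressure rises in proportion to T.
p_hot = pressure_of_gas(n, volume_m3=0.05, temperature_k=330.0)
print(f"pressure at 330 K: {p_hot:.0f} Pa")
```

The numbers fed in stand for the outputs of measurement procedures, and the arithmetic on them is what licenses the prediction about the concrete system.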

In other cases, the mathematics and the measurement procedures are far more complex. There are also enduring questions in areas such as the philosophy of quantum mechanics about what physical quantities mathematical objects such as wavefunctions correspond to, or whether they are merely predictive tools. However, the general contours of the picture remain the same: representing properties of physical systems using numbers allows algebraic reasoning to be used to describe, make predictions about, and explain features of the concrete world.

One, then, can think of the language of mathematical science as being two-sorted, that is, ranging over two kinds of thing:

    • concrete entities, using primary variables: x1, x2, …, xn.
    • abstract entities, using secondary variables: y1, y2, …, yn.

and, therefore, containing three kinds of predicate:

    (i) concrete predicates, expressing relations between concreta: C1, C2, …
    (ii) abstract predicates, expressing relations between abstracta: A1, A2, …
    (iii) mixed predicates, expressing relations between concrete and abstract objects: M1, M2, …

Measurement is one clear example of (iii), the mixed predicates. Measurements describe physical quantities by associating them with numerical magnitudes. Take the claim “The mass of d1 is 5 kilogrammes”, where d1 is a concrete object, or “The temperature of d2 is 30 kelvin”, where d2 is a concrete system. The first is expressed more formally as “M_kg(d1) = 5”, which describes a function from a concrete object, d1, to an abstract object, the number 5. The second is expressed as “T_K(d2) = 30”, which describes a function from a concrete system, d2, to an abstract object, the number 30. Although these are, in one sense, about the concrete world, they refer to both physical objects and abstract mathematical objects. Scientific theories, then, when regimented, involve a combination of claims about concrete entities, claims about mathematical entities, and claims about both concrete and mathematical entities.
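The idea that a measurement claim describes a function from concrete entities to numbers can be pictured with a small Python sketch. Everything here is a hypothetical illustration: the class Concrete, the lookup tables, and the function names mass_kg and temp_k are assumptions introduced only to mirror the two example claims about d1 and d2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concrete:
    """A concrete object or system, identified only by a label (hypothetical type)."""
    label: str


d1 = Concrete("d1")
d2 = Concrete("d2")

# Hypothetical measurement records: each table pairs a concrete entity with a number.
_MASS_KG = {d1: 5.0}
_TEMP_K = {d2: 30.0}


def mass_kg(obj):
    """Mixed function: takes a concrete object, returns a number (its mass in kilograms)."""
    return _MASS_KG[obj]


def temp_k(system):
    """Mixed function: takes a concrete system, returns a number (its temperature in kelvin)."""
    return _TEMP_K[system]


# "The mass of d1 is 5 kilogrammes" and "The temperature of d2 is 30 kelvin".
print(mass_kg(d1) == 5, temp_k(d2) == 30)

# A purely abstract claim relates numbers only, with no concrete entity involved.
print(5 < 30)
```

The arguments of the two functions are of the concrete sort and their values are of the abstract sort, which is what makes the corresponding claims mixed.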

a. The Indispensability Argument

This puts pressure on the two genera of nominalist views mentioned above. Regarding the first, if nominalists deny that mathematical sentences have the same semantics as sentences about concrete objects, then a puzzle arises about mixed sentences, roughly: what is the semantics for mathematical sentences and how does it combine with ordinary semantics to produce meaningful mixed sentences? Regarding the second, if nominalists deny that mathematical sentences are (strictly speaking) true, then they must also deny that many of the best scientific theories are true, for the best scientific theories are replete with mathematical claims. Moreover, we appear to have at least some justification for believing that the best scientific theories are true, as they receive empirical confirmation as a result of making testable predictions.

Considerations like these have brought about a very important and influential challenge to nominalism: the indispensability argument. In fact, to talk of the indispensability argument is misleading since there are a number of distinct arguments that fall under that rubric (see, for instance, Quine 1948, 1951, 1976, 1981; Putnam 1971; Maddy 1992; Resnik 1995; Colyvan 2001; Leng 2010; the article on the Indispensability Argument). Something like a core indispensability argument can however be isolated, and, because many forms that nominalism might take have come about largely in response to the premises of this core argument, describing it allows us to produce a useful taxonomy of nominalisms. At its heart, the indispensability argument is designed to show that nominalism is incompatible with the claims of science. Science—or at least the science of the twentieth and beginning of the twenty-first centuries—asserts the existence of abstract objects, so if its claims are true, nominalism is false; if we are justified in believing its claims, we are not justified in believing nominalism. This core indispensability argument has three premises:

(Realism) The best current scientific theories are true (or at least approximately true).

(Indispensability) The best current scientific theories indispensably quantify over abstract objects.

(Quine’s Criterion) The existential quantifier ∃x expresses existence.

Something should be said about each of the premises. (Realism) is not as straightforward as it looks, since the denial of realism, instrumentalism, can be characterized in a number of different ways. The anti-nominalists John Burgess and Gideon Rosen have characterized a rejection of (Realism) as amounting to the claim that “standard science and mathematics are no reliable guides to what there is” (Burgess and Rosen 1997, 60–61). However, the most fully developed instrumentalist nominalism, that of Mary Leng, seeks to provide an account of how a denial of realism is compatible with substantive scientific knowledge of the concrete world. (Quine’s Criterion) is motivated by broadly the sort of semantic considerations discussed earlier. Finally, (Indispensability) is also not wholly straightforward. For one thing, technical results show that, strictly speaking, (Indispensability) is false. The method of Craigian elimination can transform a two-sorted theory Γ, quantifying over two kinds of things, into a theory Γ° with infinitely many primitives and axioms that quantifies over only one of those kinds of things. So, there is a known mechanism by which quantification over mathematical entities can be dispensed with. Recommending Craigian elimination as a response to (Indispensability) appears however not to be sufficient for the nominalist, partly because the process is not thought to explain the success of mathematical theories (see Burgess and Rosen 1997, I.B.4.b for details). For another thing, the indispensability of quantification over mathematical objects may appear idle with respect to the argument. If the best scientific theories are true and assert the existence of mathematical objects, then nominalism is false regardless of whether it is possible to formulate other theories that dispense with quantification over mathematical objects. 
However, as is explained below, some have taken programs of dispensing with quantification over mathematical objects in physical theories as a means of explaining the predictive success of theories that do quantify over mathematical objects, without assuming the truth of what they say about mathematical objects, thereby undercutting the main motivation for (Realism). Alternatively, or in addition, nominalists who take it to be possible to dispense with quantification over mathematical objects may argue that the resulting theories are superior, perhaps on the grounds of ontological parsimony, or on the grounds that they avoid the epistemological problems associated with abstract objects, or on the grounds that they provide more perspicuous intrinsic descriptions and explanations of physical systems and their behavior, that is, descriptions and explanations that appeal to intrinsic facts about those systems rather than their relations to mathematical objects. In that case, the best scientific theories will not quantify over mathematical objects.

A fourth premise, confirmation holism, is also often thought to be crucial to the indispensability argument, both by those who defend and those who resist the argument (see, for instance, Colyvan 2001; Maddy 1997; Sober 1993; Leng 2010; Resnik 1997). Confirmation holism is the claim that confirmation accrues to theories as a whole rather than accruing only to proper parts of those theories. (Realism), (Indispensability) and (Quine’s Criterion) jointly entail the falsity of nominalism, so confirmation holism is not required to make the argument logically valid. However, confirmation holism is sometimes taken to support (Realism). If empirical confirmation could accrue only to proper parts of theories, nominalists might be able to argue that it accrues only to those claims that quantify only over concrete objects. If confirmation holism is true, however, then the empirical confirmation the best scientific theories enjoy also applies to the claims they make about mathematical objects.

The three premises of the core indispensability argument are reflected in a trifurcation of approaches to nominalism. Some nominalists reject (Indispensability) and attempt to show that we can (in certain important contexts) get by without talking about abstract objects. This is reconstructive nominalism. Others reject (Quine’s Criterion), holding that one can make true claims “about” abstract objects without abstract objects existing. “There are infinitely many primes” really can be true without any primes existing (in an ontologically significant sense). This is hermeneutic or deflationary nominalism. Still others reject (Realism). They take mathematical claims, even those that appear in the best scientific theories, to be strictly false. They do not attempt to show that we can get by without such claims, but instead offer an account of why we speak this way, and why it is useful to do so, that is not committed to the existence of mathematical objects. This is instrumentalist nominalism.

5. Reconstructive Nominalism

The first of the premises to be concertedly challenged by nominalists was (Indispensability), and there have been many attempts to discharge or partially discharge this aim (a useful overview can be found in Burgess and Rosen 1997, III.B.I.a). Hartry Field’s efforts, and responses to them, have been dominant in the philosophical literature on indispensability, to the extent that some discussion of indispensability carries on as though the failure of Field’s project would amount to the failure of reconstructive nominalism. Here we examine two important and representative strategies of dispensing with reference to and quantification over mathematical objects in some detail: Charles Chihara’s modal strategy, and Field’s geometrical strategy.

a. Chihara

Originally, Chihara responded to (Indispensability) by developing a predicative system of mathematics which avoided quantification over mathematical objects by using constructibility quantifiers instead of the standard quantifiers (Chihara 1973). Concerned that not all of the mathematics needed for contemporary science could be reconstructed in a predicative system, he has since retained the use of constructibility quantifiers but developed a different system without these restrictions (Chihara 1990, 2004, 2005). It is Chihara’s developed view that is discussed here.

In standard mathematics, the “official claims”, as it were, come in an apparently existential form: they appear to be claims about what mathematical objects exist and what relations they bear to each other. Things were not always so. In Euclid’s Elements we find the following axioms of geometry:

A straight line can be drawn joining any two points;

Any finite straight line can be extended continuously in a straight line;

For any line a circle can be drawn with the line as radius and an endpoint of the line as center.

These axioms concern not what exists or is “out there”, but what it is possible to construct. The claims of Euclidean geometry are modal rather than existential and, as a result, do not have any obvious ontological commitments to abstract (or, for that matter, concrete) objects. Geometry was principally carried out in this modal language for thousands of years, though by the twentieth century it had become common to make geometrical claims in existential language. Hilbert in his 1899 Grundlagen der Geometrie (Foundations of Geometry) gives the following as his first three axioms of geometry:

For every two points A, B there exists a line L that contains each of the points A, B;

For every two points A, B there exists no more than one line that contains each of the points A, B;

There exist at least two points on a line. There exist at least three points that do not lie on a line.

Hilbert’s axioms, in contrast to Euclid’s, appear existential, describing which points and lines exist. For the nominalist, this may be philosophically significant. It shows that it is possible to practice mathematics, or at least one part of mathematics, without making any claims about the existence of abstract mathematical objects. Chihara’s nominalism takes its cue from Euclid’s modal geometry; it aims to do all mathematics (or all the mathematics we need) in the modal, rather than the existential, mode. His goal is to:

Develop a mathematical system in which the existential theorems of traditional mathematics have been replaced by constructibility theorems: where, in traditional mathematics, it is asserted that such and such exists, in this system it will be asserted that such and such can be constructed. (Chihara 1990, 25)

Although Chihara works out this project in a good deal of technical detail, the fundamental idea behind it is straightforward enough. Where Field, as is explained below, attempts to replace mathematized physics with nominalistic physics, Chihara attempts to replace standard pure mathematics with a system of mathematics that makes no claims about the existence of mathematical objects. This nominalistic surrogate for standard mathematics, then, could be true without mathematical objects existing.

i. Constructibility Theory

Chihara works out a modal version of simple type theory (henceforth STT) called “constructibility theory” (henceforth Ct). The language of STT contains the standard quantifiers “∃x” (meaning “there is an object x such that…”) and “∀x” (meaning “every object x is such that…”) and the set-theoretic membership relation “∈” which is used to express which entities are in a set. “Thom ∈ {Thom, Jonny, Phil, Colin, Ed}” means that Thom is in the set containing Thom, Jonny, Phil, Colin and Ed, “√2 ∈ ℝ” means that the number √2 is in the set of real numbers, and so on. As the language of STT contains “∃x”, “∀x” and “∈”, STT is used (at least apparently) to make assertions about which sets exist. Sets can contain ordinary objects, both concrete and abstract, and they can also contain other sets. The claims that can be made in STT about which sets exist are not wholly unrestricted; if they were, one could claim that there is a set which contains all and only those sets that do not contain themselves: ∃x∀y(y ∈ x ↔ y ∉ y). Consider the set just described: does it contain itself? If it does not contain itself, then it follows that it does contain itself, as it is the set that contains all sets that do not contain themselves. On the other hand, if it does contain itself, then it follows that it does not contain itself because it is the set that contains only those sets that do not contain themselves. This is Russell’s paradox. To avoid this incoherence, sets in STT are on levels: a set can only contain objects or sets on a lower level than itself. On level-0 there are ordinary objects; on level-1, sets containing ordinary objects; on level-2, sets containing sets that contain ordinary objects; and so on.
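The way levels block Russell’s paradox can be pictured with a toy data structure. The Python sketch below is an illustration under assumptions: the class names Individual and TypedSet are hypothetical, and the membership rule implemented is the one stated above (members must sit on a lower level than the set), which is looser than the strict level-n/level-(n−1) discipline of some presentations of type theory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Individual:
    """A level-0 ordinary object (hypothetical type for this sketch)."""
    name: str
    level: int = 0


class TypedSet:
    """A set on level n >= 1 whose members must all sit on a lower level."""

    def __init__(self, level, members=()):
        if level < 1:
            raise ValueError("sets live on level 1 or higher")
        self.level = level
        self.members = []
        for member in members:
            self.add(member)

    def _check_level(self, thing):
        # Membership claims are only well formed if the candidate is on a lower level.
        if thing.level >= self.level:
            raise TypeError(
                f"a level-{thing.level} item cannot belong to a level-{self.level} set"
            )

    def add(self, member):
        self._check_level(member)
        self.members.append(member)

    def contains(self, candidate):
        self._check_level(candidate)
        return any(m == candidate for m in self.members)


thom = Individual("Thom")
band = TypedSet(1, [thom])    # a level-1 set of ordinary objects
bands = TypedSet(2, [band])   # a level-2 set of level-1 sets

print(band.contains(thom), bands.contains(band))

# The question "does this set contain itself?" cannot even be posed:
try:
    band.contains(band)
except TypeError as err:
    print("ill-formed:", err)
```

Because self-membership is ruled out as ill formed rather than as false, the condition that defines the Russell set cannot be written down in the first place.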

In Chihara’s system, the existential quantifier “∃x” and universal quantifier “∀x” are supplemented with modal constructibility quantifiers “Cx” and “Ax”, which, instead of making assertions about what exists, make assertions about which sentences are constructible. Corresponding to the existential quantifier “∃x” is “Cx”. Claims of the form “(Cϕ)ψϕ” mean:

It is possible to construct an open sentence ϕ such that ϕ satisfies ψ

Corresponding to the universal quantifier “∀x” is “Ax”. Claims of the form “(Aϕ)ψϕ” mean:

Every open sentence ϕ that it is possible to construct is such that ϕ satisfies ψ

To understand the constructibility quantifiers “Cx” and “Ax”, one must understand what it is for an open sentence to be satisfied or to satisfy other open sentences. Take the open sentence “x is the writer of Gormenghast”. This sentence is satisfied by Mervyn Peake—that is, the person who wrote Gormenghast. So, open sentences can be satisfied by ordinary objects. But they can also be satisfied by other open sentences. Consider the sentence “There is at least one object that satisfies F”. This is satisfied by the open sentence “x is the writer of Gormenghast”. Open sentences, like the sets of STT, are on levels: at level-0, there are ordinary objects; at level-1, open sentences that are satisfied by ordinary objects; at level-2, open sentences that are satisfied by open sentences that are satisfied by ordinary objects, and so on.
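The hierarchy of satisfaction just described can be mimicked with predicates in Python. This is a sketch under assumptions: the class OpenSentence, the helper names, and the two-element domain are hypothetical, and the sketch models only actual satisfaction, not the modal notion of what it is possible to construct.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class OpenSentence:
    """An open sentence on a given level, with a test for what satisfies it (hypothetical)."""
    text: str
    level: int
    condition: Callable[[Any], bool]


# A tiny, hypothetical domain of level-0 objects.
DOMAIN = ["Mervyn Peake", "Thom"]


def level_of(thing):
    """Ordinary objects are on level 0; open sentences carry their own level."""
    return thing.level if isinstance(thing, OpenSentence) else 0


def satisfies(thing, sentence):
    """A level-n open sentence can only be satisfied by items on level n - 1."""
    if level_of(thing) != sentence.level - 1:
        raise TypeError("level mismatch: satisfaction is not defined here")
    return sentence.condition(thing)


# Level 1: satisfied (or not) by ordinary objects.
writer = OpenSentence("x is the writer of Gormenghast", 1, lambda x: x == "Mervyn Peake")
self_distinct = OpenSentence("x is distinct from x", 1, lambda x: x != x)

# Level 2: satisfied (or not) by level-1 open sentences.
something_satisfies = OpenSentence(
    "There is at least one object that satisfies F",
    2,
    lambda F: any(satisfies(obj, F) for obj in DOMAIN),
)

print(satisfies("Mervyn Peake", writer))
print(satisfies(writer, something_satisfies))
print(satisfies(self_distinct, something_satisfies))
```

The level-2 sentence is satisfied by the sentence about Gormenghast, because something in the domain satisfies that sentence, and it is not satisfied by the sentence that nothing can satisfy.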

For a sentence to be constructible is just for it to be possible to construct. The sort of possibility at play here is not practical possibility; no particular person need be capable of constructing the relevant sentences. What Chihara has in mind is metaphysical possibility, which is sometimes (somewhat misleadingly) called “broadly logical possibility”. This is absolute possibility concerning how the world could have been. (Chihara sometimes also talks in terms of “conceptual possibility”, although conceptual and metaphysical possibility are not generally thought by philosophers to be equivalent.) Notice that the constructibility quantifiers are not epistemic in any way. That an open sentence ϕ is constructible does not mean that we know how to construct it, or even that it is possible in principle to know how it can be constructed. Similarly, it need not be the case that it is in principle knowable which objects or open sentences would satisfy ϕ.

After this sketch of STT and Ct, it is quite easy to see, in a general way, how Chihara’s strategy works. For every claim in STT about the existence of particular sets there corresponds a claim in Ct about the satisfiability of open sentences. Where STT says, for example, “There is a level-1 set x such that no level-0 object is in x”, Ct can say “It is possible to construct a level-1 open sentence x such that no level-0 object would satisfy x”. The set-theoretic membership relation “∈” is replaced by the satisfaction relation between objects and open sentences (or open sentences and other open sentences), and assertions about sets are replaced by assertions about the constructibility of open sentences. In this way, Chihara creates a branch of mathematics that does not require reference to or quantification over abstract objects. Everything one can do with STT one can do with Ct. STT is a foundational branch of mathematics, which is to say that other branches of mathematics can be reconstructed in it. Plausibly, then, STT is sufficient for any applications of mathematics that might arise in the sciences. Since Ct is a modalized version of STT, Ct, plausibly, is itself sufficient for any application of mathematics that might arise in the sciences. According to Chihara, (Indispensability) is therefore false.

ii. Constructibility Theory and Simple Type Theory

Chihara thinks of Ct as a modal version of STT, but Stewart Shapiro (1993, 1997) has claimed that Ct is in fact equivalent to STT and so could have no epistemological (or other) advantage over STT. In defense of this, Shapiro provides a recipe for transforming sentences of Ct into sentences of STT: first, replace all the variables of Ct that range over level-n open sentences with variables of STT that range over level-n sets; second, replace the symbol for satisfaction with the “∈” symbol for set membership; third, replace the constructibility quantifiers “Cx” and “Ax” with the quantifiers of predicate logic “∃x” and “∀x”. Call a sentence of STT “ϕ” and its Ct counterpart “tr(ϕ)”. Shapiro shows that ϕ is a theorem of, that is provable in, STT if and only if tr(ϕ) is a theorem of Ct, and that ϕ is true according to STT if and only if tr(ϕ)  is true according to Ct. That sentences of Ct can be transformed this way into sentences of STT and that these transformations preserve theoremhood and truth show, Shapiro claims, that the two systems are definitionally equivalent—that Ct is a mere “notational variant” of STT.
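The three-step recipe is purely syntactic, and a sketch of it fits in a few lines of Python. The encoding of formulas as nested tuples, the tag names ("C", "A", "sat", "exists", "forall", "in"), and the function name ct_to_stt are all assumptions made for this illustration; they are not Shapiro’s or Chihara’s notation.

```python
# Formulas are nested tuples (a hypothetical encoding):
#   ("C", var, level, body)       constructibility quantifier "it is possible to construct ..."
#   ("A", var, level, body)       constructibility quantifier "every constructible ..."
#   ("exists" | "forall", var, level, body)   standard quantifiers
#   ("sat", a, b)                 a satisfies b
#   ("in", a, b)                  a is a member of b
#   ("not", f), ("and", f, g), ("or", f, g), ("implies", f, g)

QUANTIFIER_MAP = {"C": "exists", "A": "forall"}


def ct_to_stt(formula):
    """Rewrite a Ct formula as its STT counterpart by the three substitutions."""
    op = formula[0]
    if op == "sat":
        # Step 2: satisfaction becomes set membership.
        _, a, b = formula
        return ("in", a, b)
    if op in QUANTIFIER_MAP:
        # Steps 1 and 3: a level-n open-sentence variable is reread as a
        # level-n set variable, and C/A become the standard quantifiers.
        _, var, level, body = formula
        return (QUANTIFIER_MAP[op], var, level, ct_to_stt(body))
    if op in ("exists", "forall"):
        _, var, level, body = formula
        return (op, var, level, ct_to_stt(body))
    if op == "not":
        return ("not", ct_to_stt(formula[1]))
    if op in ("and", "or", "implies"):
        return (op, ct_to_stt(formula[1]), ct_to_stt(formula[2]))
    raise ValueError(f"unknown formula tag: {op!r}")


# "It is possible to construct a level-1 open sentence x
#  such that no level-0 object would satisfy x."
ct_claim = ("C", "x", 1, ("forall", "y", 0, ("not", ("sat", "y", "x"))))

# "There is a level-1 set x such that no level-0 object is in x."
print(ct_to_stt(ct_claim))
```

The sketch shows only that the mapping is mechanical. Whether a mapping of this kind makes the two theories equivalent in any philosophically significant sense is a separate question that the code does not address.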

Chihara (2004) responds by noting that the ability to translate sentences of STT into Ct does not show that they are equivalent in a way that undermines his project. Though the ability to translate between the two theories would show that the sentences of STT and Ct share certain mathematically significant relationships, it would not show that these sentences have the same meaning, are true under the same circumstances, or are knowable or justifiably believed under the same circumstances. Sentences of STT entail the existence of sets and are true only if sets exist, whereas sentences of Ct do not and are not. Additionally, the two theories are confirmed in different ways. The Ct sentence “It is possible to construct an open sentence of level-1 that is not satisfied by any object” is supported by laws of modal logic, considerations about what is possible, coherent, and so on. The STT counterpart sentence “There exists a set of level-1 of which nothing is a member” is not supported by those considerations.

iii. The Role of Possible World Semantics

Another objection arises from the fact that Chihara (1990) uses possible world semantics to spell out, in a precise way, the logic of “Cx” and “Ax”. Possible world semantics is an extension of the model-theoretic semantics sketched earlier. Roughly speaking, instead of a single domain containing objects and sets of objects, possible world semantics makes use of possible worlds each with their own domain (at least in variable domain semantics). The basic, extensional model-theoretic semantics sketched before has an interpretation function mapping non-logical terms of the language to their extension: names are mapped to individuals of the (single) domain and predicates to sets of individuals in the (single) domain. Possible world semantics has an interpretation function mapping names and predicates to intensions. For names, intensions are functions mapping worlds to individuals in that world’s domain. For predicates, intensions are functions mapping worlds to sets in that world’s domain. Intuitively, an intension tells us what individual (if any) a name picks out at any given possible world, and what set of objects a predicate applies to at any given possible world. For example, the intension associated with “… is red” would map each world to the (possibly empty) set of red things in that world’s domain. Part of the philosophical interest of possible world semantics is that it allows one to characterize a logic (in fact a range of logics) for possibility and necessity operators. Intuitively, for some claim “ϕ”, the claim “Possibly ϕ” is true if and only if “ϕ” is true in at least one possible world, and the claim “Necessarily ϕ” is true if and only if “ϕ” is true in all possible worlds.
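A toy version of this apparatus can be written down directly. The Python sketch below rests on several assumptions: the three worlds, their domains, and the intensions are invented for illustration, and possibility and necessity are evaluated over all worlds with no accessibility relation, which is the simplest of the range of logics mentioned above.

```python
# Each world has its own domain (variable domain semantics); all data is hypothetical.
DOMAINS = {
    "w1": {"apple", "sky"},
    "w2": {"apple"},
    "w3": set(),
}

# Intension of the predicate "... is red": world -> set of red things in that world's domain.
IS_RED = {"w1": {"apple"}, "w2": set(), "w3": set()}

# Intension of a name: world -> the individual it picks out there (None if it picks out nothing).
THE_APPLE = {"w1": "apple", "w2": "apple", "w3": None}


def holds_at(predicate, name, world):
    """True if the name's referent at the world is in the predicate's extension there."""
    referent = name[world]
    return referent is not None and referent in predicate[world]


def possibly(claim):
    """'Possibly phi' is true iff phi is true in at least one world."""
    return any(claim(world) for world in DOMAINS)


def necessarily(claim):
    """'Necessarily phi' is true iff phi is true in all worlds."""
    return all(claim(world) for world in DOMAINS)


def apple_is_red(world):
    return holds_at(IS_RED, THE_APPLE, world)


def red_things_exist_locally(world):
    # Every red thing at a world belongs to that world's domain.
    return IS_RED[world] <= DOMAINS[world]


print(possibly(apple_is_red), necessarily(apple_is_red))
print(necessarily(red_things_exist_locally))
```

Even this toy model quantifies over sets and functions (here, Python sets and dictionaries), which is the feature of possible world semantics that raises the question of its availability to the nominalist.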

Possible world semantics itself, then, is a mathematical theory, quantifying over sets and functions. (Instead of modelling physical systems, it models meanings.) It might therefore be asked whether it is legitimate to engage in such possible worlds talk without believing in the mathematical objects it quantifies over, or whether this would be an instance of intellectual doublethink. Shapiro (1997) claims that possible world semantics is not available to the nominalist since it not only quantifies over abstract objects but is used in explanations. If possible world semantics is just a myth, then its falsehood precludes it from explaining anything, just as a story about Zeus (assuming his non-existence) cannot explain facts about the weather. Chihara (2004) responds by drawing a distinction between scientific explanations of natural phenomena and explications of ideas and concepts. The role of possible world semantics in Ct is not akin to a scientific explanation of an event, but to an explication of a concept. Possible world semantics is used to spell out how to make inferences using constructibility quantifiers. Put more picturesquely, it shows one how to reason with the constructibility quantifiers in broadly the same way that an allegorical tale, such as Animal Farm, shows one how to reason about totalitarian government (though the latter does so in a less rigorous but more open-ended way than the former). Just as a novel is capable of doing this without the things depicted in it really existing, so, too, possible world semantics is capable of doing this without possible worlds really existing.

None of the prominent objections to Chihara’s brand of reconstructive nominalism are decisive. Although the view has received comparatively little attention in the literature, it remains a live option for the nominalist who denies indispensability.

b. Field

Field’s reconstructive project has been utterly dominant in the literature on reconstructive nominalism since the publication of Field’s short but remarkable monograph Science Without Numbers in 1980, even if Field himself has said little about his project in print since the early nineteen nineties. Field takes there to be no mathematical objects, but also holds that the truth of mathematical sentences requires the existence of mathematical objects. As such, for Field, standard mathematical theories are (strictly speaking) false (see the introduction to Field (1989)). Mathematized science, however, uses mathematical models, equations, and so on to represent concrete systems. As a reconstructive nominalist, Field aims to show firstly that the best scientific theories can be restated in a way that avoids using mathematics. Here is a point of contrast with Chihara. Whereas Chihara claims not that mathematics per se is dispensable to science, but only the sort of mathematics that quantifies over abstract objects, Field’s project is to formulate scientific theories that do not make use of mathematics of any sort. Field also wants to establish that mathematical language is dispensable in principle: there is no context in which science would require mathematics to do something which it could not do without mathematics. To this end, the second goal of his project is to show that adding mathematical claims to claims about the concrete world does not allow us to infer anything about the concrete world that claims about the concrete world would not allow us to infer on their own.

i. Field’s Program

Field calls the process of removing reference to and quantification over mathematical objects “nominalization”. Field does not nominalize all of contemporary science—the task would be colossal—but one important theory: Newtonian Gravitational Theory (NGT). His hope is that, in doing so, he would show that a complete nominalization of science is, at least in principle, accomplishable.

Some mixed claims, expressing relationships between concrete and abstract objects, are easy to reformulate in a purely nominalistic way. “There are exactly two remaining Beatles” can be parsed:

∃x∃y(Bx ∧ By ∧ x ≠ y) ∧ ∀x∀y∀z((Bx ∧ By ∧ Bz) → (x = y ∨ y = z ∨ x = z))

Where “Bx” means “x is a remaining Beatle”. The best scientific theories, however, go far beyond claims about how many of a particular kind of object there are, so the means of nominalizing these theories will be more complex. As a result, the details of Field’s project are highly technical. A non-technical overview of its general contours, however, can be given.
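To see that the counting paraphrase does its job without mentioning the number two, one can evaluate it by brute force over a finite domain. The sketch below is only an illustration: the function name `exactly_two`, the domain `people`, and the set `remaining` are assumptions introduced for the example. The first conjunct says that at least two distinct things are B; the second says that any three things that are B include a repeat.

```python
from itertools import product


def exactly_two(domain, B):
    """Evaluate the first-order paraphrase of 'there are exactly two Bs'."""
    # At least two: some x and y are both B and are distinct.
    at_least_two = any(
        B(x) and B(y) and x != y for x, y in product(domain, repeat=2)
    )
    # At most two: whenever x, y, z are all B, two of them are identical.
    at_most_two = all(
        not (B(x) and B(y) and B(z)) or (x == y or y == z or x == z)
        for x, y, z in product(domain, repeat=3)
    )
    return at_least_two and at_most_two


# Hypothetical domain and predicate for illustration.
people = ["John", "Paul", "George", "Ringo"]
remaining = {"Paul", "Ringo"}


def is_remaining(x):
    return x in remaining


print(exactly_two(people, is_remaining))  # True
```

The check uses only identity and the predicate B; no numeral or numerical function appears in the evaluated formula.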

NGT describes the world by numerically assigning properties such as mass, distance, and so on, to points in space-time. Space-time itself is represented with a mathematical coordinate system, and quantity claims such as “The mass of b is 5 kg” are understood as meaning that there exists a mass-in-kilograms function f from a domain of concrete objects C to the real numbers ℝ such that f(b)=5. Instead of describing the concrete world by assigning it numerical values, Field’s theory (henceforth FGT) describes the concrete domain directly, using comparative language. In particular, distance claims are made using a betweenness relation “y Bet xz”, a simultaneity relation “x Simul y”, and a congruence relation “xy Cong zw”. These are primitives of the theory, but intuitively “y Bet xz” means that y is between x and z, “x Simul y” that x and y are simultaneous, and “xy Cong zw” that the distance from x to y is the same as the distance from z to w. In the same way, mass claims are expressed using mass-betweenness and mass-congruence relations. From these building blocks, Field develops a scientific theory capable of describing space-time and many of its properties without quantifying over mathematical objects.

ii. Representation Theorems

The next step is to show that FGT really is a (nominalistic) counterpart to NGT. To this end, Field proves a representation theorem. Intuitively, what Field’s representation theorem shows is that the domain of concrete things represented by FGT using comparative predicates (a space-time with mass-density and gravitational properties) has the same structural features as the abstract mathematical model of space-time with mass-density and gravitational properties given by NGT. NGT is a mathematical mirror image of FGT. In more detail, Field proves that:

there is a structure-preserving mapping ϕ from the sort of space described by FGT onto ordered quadruples of real numbers;

there is a structure-preserving mapping ρ from the mass-density properties that FGT ascribes to space-time onto an interval of non-negative real numbers;

there is a structure-preserving mapping ψ from the gravitational properties that FGT ascribes to space-time onto an interval of real numbers.

Here ϕ is unique up to a generalized Galilean transformation, ρ is unique up to a positive multiplicative transformation, and ψ is unique up to a positive linear transformation. (What this means, in essence, is that choice of measurement scales is conventional. Different measurement scales can be used, so long as they preserve the structural features of the measurement scales they replace. Saying that something is 96.5 kilograms or saying that it is 15.2 stones are two different ways of representing the same concrete fact about mass; no unique significance attaches to the numbers 96.5 or 15.2.)
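The conventionality of scale can be made concrete with a small sketch. Below, four hypothetical objects are assigned masses in kilograms, and the same masses are re-expressed in stones by a positive multiplicative transformation. The helper names `mass_bet`, `mass_cong`, and `comparative_facts` are assumptions made for the illustration, and the comparative relations are defined from the numbers only for convenience (in FGT they are primitive). Exact fractions are used so that equality tests are not disturbed by rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical objects with illustrative masses on a kilogram scale.
kg = {"a": Fraction(2), "b": Fraction(5), "c": Fraction(8), "d": Fraction(11)}

# One stone is 14 pounds, and one pound is 0.45359237 kg.
KG_PER_STONE = Fraction("6.35029318")
stone = {obj: m / KG_PER_STONE for obj, m in kg.items()}


def mass_bet(scale, y, x, z):
    """y's mass lies between the masses of x and z (inclusive)."""
    return min(scale[x], scale[z]) <= scale[y] <= max(scale[x], scale[z])


def mass_cong(scale, x, y, z, w):
    """The mass difference between x and y equals that between z and w."""
    return abs(scale[x] - scale[y]) == abs(scale[z] - scale[w])


def comparative_facts(scale):
    """Collect every betweenness and congruence fact that holds on a scale."""
    objs = sorted(scale)
    bet = {t for t in product(objs, repeat=3) if mass_bet(scale, *t)}
    cong = {q for q in product(objs, repeat=4) if mass_cong(scale, *q)}
    return bet, cong


# The numbers differ between scales, but the comparative facts coincide.
print(kg["b"] == stone["b"])                              # False
print(comparative_facts(kg) == comparative_facts(stone))  # True
```

The sketch covers only the mass case, where the admissible transformations are multiplications by a positive constant; it does not model the Galilean transformations relevant to ϕ.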

The representation theorem explains the utility of false mathematical theories: if the abstract mathematical model they describe has the same structure as the concrete world (described by true nominalistic theories), then reasoning about the abstract mathematical model will not lead us astray when making inferences about the concrete world. The following picture of nominalistic physics and its relation to scientific practice emerges: Nominalistic claims N1 … Nn have abstract counterparts N*1 … N*n which use mathematical methods to describe the same physical world described by N1 … Nn. One can ascend from N1 … Nn to N*1 … N*n, carry out derivations within the mathematical theory to arrive at some mathematized conclusion A*, and then descend to its nominalistic counterpart A. Mathematics facilitates inferences about the physical world, but these inferences could, according to Field, be made without mathematics, albeit more laboriously.

iii. Conservativeness

Having developed a nominalistic physics and proved a representation theorem, Field also needs to show that mathematics is truly dispensable to physics, at least in principle. This involves showing that there are no claims about the physical world that follow from nominalistic theories plus mathematics but that would not follow from nominalistic theories alone. In the jargon, it involves showing that mathematics is conservative over purely nominalistic theories. Conservativeness is important to Field’s kind of reconstructive nominalism for two reasons. Firstly, if the mathematics we apply is not conservative, then there are things that can be said about the physical world with mathematics that could not be said without mathematics, showing that it is not dispensable after all. Secondly, conservativeness provides part of the explanation of why we manage to use (false) platonistic theories so successfully: they have the same consequences about the concrete world as the true, nominalistic theories underlying them. Field informally describes conservativeness in the following way:

A mathematical theory S is conservative [if and only if] for any nominalistic assertion A, and any body N of such assertions, A isn’t a consequence of N + S unless A is a consequence of N alone. (Field 2016, 16)

Some clarifications are in order. The first is that Field does not have in mind pure mathematical theories, that is, mathematical theories whose vocabulary only ranges over mathematical objects. These theories are more or less trivially conservative: as they do not say anything about the physical world, they do not entail anything about the physical world (unless they are inconsistent, since, in classical logic, a contradiction entails all sentences). The second is that Field does not have in mind physical theories that use mathematics. These are more or less trivially nonconservative: it is their job to say substantive things about the physical world. If for instance N is some meagre body of nominalistic assertions, and S is Newtonian physics, then N + S will clearly enough have consequences that N does not. Instead, Field has in mind impure mathematical theories. Impure set theory, for instance, posits the existence not only of sets, but also conditionally posits the existence of sets of physical objects. Roughly speaking, for any physical object or objects, there exists a set containing that object or objects. More specifically, consider ZFC (Zermelo-Fraenkel set theory with the axiom of choice)—a theory of pure mathematics within which all core mathematics can be modelled. ZFC can be made into an applied, impure theory by adding supplementary axioms. In particular:

  1. a comprehension scheme, allowing definable kinds of concrete objects to form sets;
  2. a replacement scheme, saying that if a function from a set of concrete objects to other objects can be defined, then those latter objects also form a set. (Note that this allows one to formulate the kinds of quantity claims mentioned earlier.)

Call the theory obtained from ZFC by adding these supplementary axioms ZFCV(N). Field argues that good mathematics should be conservative; mathematics, on its own, ought not to impose constraints on the way the concrete world is. Were it to be discovered that it did, that would be a reason to consider it in need of revision:

[I]f it were to be discovered that standard mathematics implied that there are at least 10⁶ non-mathematical objects in the universe, or that the Paris Commune was defeated […] all but the most unregenerate rationalists would take this as showing that standard mathematics needed revision. Good mathematics is conservative; a discovery that accepted mathematics isn’t conservative would be a discovery that it isn’t good. (Field 1980, 13)

In addition to this, Field proves an important conservativeness result. An equivalent way of describing conservativeness is to say that a mathematical theory S is conservative if and only if for any consistent body N of nominalistic assertions, N + S is also consistent (Field 2016, 17). Field’s proof, in essence, gives a procedure that, beginning with a nominalistic theory N assumed to be consistent (that is, satisfied by a domain D of non-sets), shows one how to construct a domain that satisfies both N and ZFCV(N) (that is, shows that N and ZFCV(N) are jointly consistent). If one is able to construct a nominalistic counterpart theory such as FGT, conservativeness ensures that adding ZFCV(N) to that nominalistic theory will not entail any purely nominalistic claims that are not entailed by the nominalistic theory alone.
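The equivalence between the consequence formulation and the consistency formulation of conservativeness can be sketched as follows. This is a simplified reconstruction, not Field’s own proof: it assumes classical logic, assumes that the negation of a nominalistic assertion is itself nominalistic, and sets aside the restriction of quantifiers to non-mathematical objects that Field’s official definition requires. The labels (C1) and (C2) are introduced only for this sketch.

```latex
% Sketch under the simplifying assumptions stated in the lead-in.
% Notation: N + S \vdash A means that A is a consequence of N together with S.
\textbf{Claim.} For a mathematical theory $S$, the following are equivalent:
\begin{itemize}
  \item[(C1)] for every nominalistic assertion $A$ and every body $N$ of such
        assertions, if $N + S \vdash A$ then $N \vdash A$;
  \item[(C2)] for every body $N$ of nominalistic assertions, if $N$ is
        consistent then $N + S$ is consistent.
\end{itemize}

\textbf{(C1) implies (C2).} Let $N$ be consistent and suppose, for
contradiction, that $N + S$ is inconsistent. Then for any nominalistic $A$,
both $N + S \vdash A$ and $N + S \vdash \neg A$. By (C1), $N \vdash A$ and
$N \vdash \neg A$, so $N$ is inconsistent after all.

\textbf{(C2) implies (C1).} Suppose $N + S \vdash A$ but $N \nvdash A$. Then
$N \cup \{\neg A\}$ is a consistent body of nominalistic assertions, so by
(C2) the theory $N \cup \{\neg A\} + S$ is consistent. But it proves $A$
(because $N + S \vdash A$) and contains $\neg A$, which is a contradiction.
Hence $N \vdash A$.
```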

iv. The Prospects of Field’s Project

Field provides a nominalistic reconstruction of one important theory of physics, but some philosophers have questioned whether his project could be developed to provide nominalistic reconstructions of other theories such as General Relativity and quantum mechanics (QM). NGT uses mathematics to represent facts about concrete objects which FGT represents more directly. QM works differently. The mathematical formalism of QM is sometimes used to represent probabilities of measurement events, and a probability is not a concrete object. Even if QM could be reformulated to avoid reference to mathematical objects, it would remain a theory about probabilities, which is to say, a theory that talks about entities the nominalist does not take to exist. Balaguer (1996, 1998) has suggested that the Fieldian nominalist could take these probabilities to represent the propensities of the concrete systems they model. However, Balaguer admits that even if the details of any such nominalization were worked out, this would not provide a means of nominalizing phase space theories. Phase space theories use vectors to represent possible states of a concrete system. A Fieldian reconstruction of a phase space theory which avoids quantification over vectors would still quantify over possible states of concrete systems (Malament 1982), and these abstract objects could not be taken to represent propensities of concrete systems in the way that, plausibly, probabilities do.

A number of commentators have taken these considerations to license pessimism about the prospects of nominalizing the best contemporary scientific theories, but the Fieldian nominalist could contest this conclusion. In the first place, one can question the inference from the fact that mathematical language has not, at some point in time, yet been dispensed from the best scientific theories to the modal conclusion that mathematical language is indispensable from the best scientific theories. One would not similarly conclude that because Goldbach’s conjecture has not, at some point in time, yet been proven, it is unprovable. In the second place, progress has been made since Science Without Numbers was first published in 1980. For instance, Arntzenius and Dorr (2012) take on the task of nominalizing general relativity, which uses differential equations to describe the behavior of fields and particles in curved space-time and vector bundles. They express confidence that, given an interpretation of what concrete facts the mathematical formalism of QM represents, nominalizing strategies could be extended to apply in these cases.

v. Conservativeness Again

Field’s conservativeness claim has been criticized on a number of grounds. One is that semantic accounts of logical consequence quantify over sets. According to these, a theory is logically possible just in case it has a model in the set-theoretic sense sketched above. A claim ϕ is a consequence of a theory Γ if and only if there is no model of Γ & ¬ϕ. Logical consequence, it is sometimes argued, can only be understood if one posits the existence of sets. Field (1989, chapter 3; 1991) has a response: he takes logical possibility to be a primitive notion, not ultimately to be explained in terms of the existence of certain sets. There are considerations in favor of this. Explaining modal facts in terms of set-theoretic ones may get things backwards. As Leng (2007) argues, one ought to explain the fact that there exists no set of all sets on the grounds that there could not exist a set of all sets, rather than to explain why there could not exist a set of all sets on the grounds that there is no set of all sets.
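The model-theoretic definition of consequence can be illustrated with a propositional toy case, where “models” are simply assignments of truth values to atomic sentences. This is only an analogue of the first-order, set-theoretic notion at issue in the text; the function names `valuations` and `is_consequence` and the example sentences are assumptions made for the sketch.

```python
from itertools import product


def valuations(atoms):
    """Yield every assignment of truth values to the given atoms."""
    for values in product([True, False], repeat=len(atoms)):
        yield dict(zip(atoms, values))


def is_consequence(gamma, phi, atoms):
    """phi follows from gamma iff no valuation satisfies gamma and not-phi."""
    return not any(
        all(g(v) for g in gamma) and not phi(v) for v in valuations(atoms)
    )


# Hypothetical example sentences over the atoms p and q.
def p(v):
    return v["p"]


def q(v):
    return v["q"]


def p_implies_q(v):
    return (not v["p"]) or v["q"]


print(is_consequence([p_implies_q, p], q, ["p", "q"]))  # True (modus ponens)
print(is_consequence([p_implies_q, q], p, ["p", "q"]))  # False (counter-model: p false, q true)
```

The definition works by quantifying over the whole collection of valuations, which mirrors the feature of the semantic account that the objection targets: consequence is explained in terms of which set-theoretic structures exist.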

Another objection concerns the scope of Field’s conservativeness proof. Joseph Melia (2006) points out that although Field provides a proof of the conservativeness of ZFCV(N), he does not provide an argument that all useful applied mathematics can be carried out in ZFCV(N). Unless the Fieldian nominalist provides reasons to believe that this is the case, it is hard to assess the significance of Field’s proof, as it is hard to assess whether future applications of mathematics will be carried out in ZFCV(N).

vi. The Best Theory?

Mark Colyvan argues that talking of mathematical objects is not dispensable to the best scientific theories because nominalistic versions of those theories will be worse than their mathematical counterparts. Good theories must be both internally consistent and consistent with observations, but there are additional theoretical virtues that must be taken into consideration. Colyvan (2001) lists the following:

  • Simplicity / Parsimony: Given two theories with the same empirical consequences, we should prefer the theory that is simpler to state, and which has simpler ontological commitments;
  • Unificatory / Explanatory Power: We should prefer theories that predict the maximum number of observable consequences with the minimum number of theoretical devices;
  • Boldness / Fruitfulness: We should prefer theories that make bold predictions of novel phenomena over those that only account for familiar phenomena;
  • Formal Elegance: We should prefer theories that are, in a hard-to-define way, more beautiful than other theories.

Colyvan contends that mathematical theories are often more virtuous than nominalistic ones. (See the section Unification, Explanation and Confirmation in the article on The Applicability of Mathematics and the section The Explanatory Indispensability Argument in the article on The Indispensability Argument for examples of the unificatory and explanatory power of mathematics.)

Field, however, takes nominalistic theories to have greater explanatory power in some respects. They provide intrinsic explanations of physical phenomena, rather than appealing to extrinsic mathematical facts, and they eliminate the arbitrariness in the choice of units of measurement that accompanies mathematical theories (Field 1980, 1989 chapter 6). It is open to the Fieldian nominalist to argue that the theoretical virtues of her theories are somehow better or more fundamental than those enjoyed by mathematical theories. Since there is no agreed-upon metric for measuring theoretical virtues, nor agreement over their epistemic significance (are they really indicators of truth or mere pragmatic expediencies?), reaching a resolution in this area might not be easy.

6. Deflationary Nominalism

At the beginning of the twenty-first century, attempts at reconstructive nominalism ebbed and many nominalists came to accept (Indispensability) or to see it as somehow orthogonal to ontological questions. As the indispensability argument is valid, this requires rejecting either (Realism) or (Quine’s Criterion). This section explores the latter option.

A number of philosophers have questioned the semantic assumptions driving (Quine’s Criterion). Many natural language sentences employ apparent reference or quantification even when commitment to the existence of the things apparently referred to or quantified over would be, for various reasons, implausible:

  • There is a better way than this.
  • His lack of insight was astounding.
  • There are many similarities between Sellars and Brandom.
  • There is a chance we will make it in time.
  • I have a beef with the current administration.
  • The view from my office is wonderful.
  • She did it for your sake.

If one takes the semantic argument discussed earlier seriously, then, in holding these sentences to be true, one would be committed to the existence of ways, lacks, similarities, chances, beefs, views and sakes. Many, however, take it that positing the existence of objects such as these is bizarre, since they do not seem to be part of the “furniture of the universe”. One response, then, is to deny that “there is” always expresses existence, in an ontologically significant sense.

a. Azzouni

Jody Azzouni has been the most prominent defender of this approach to nominalism. Azzouni’s view (2004, 2007, 2010a, 2010b, 2017) is that both quantifiers and the term “exists” are neutral between ontologically committing and non-ontologically committing uses. Context decides whether they are being used in ontological or non-ontological ways. “God exists” uttered in a discussion between an atheist and a theist would (usually) express ontological commitment, but “Important similarities between Sellars and Brandom really do exist” would not (usually) be intended to express ontological commitment to similarities.

i. Quantifier Commitments and Ontological Commitments

Azzouni, then, draws a distinction between mere quantifier commitments and existential commitments (here understood in an ontologically significant sense); not all quantifier commitments are existential commitments. He replaces Quine’s semantic criterion for what a discourse is committed to with a metaphysical criterion for what exists. Something exists, according to Azzouni, if and only if it is mind- and language-independent. This requires both rejecting the Quinean criterion and motivating the metaphysical one. With respect to the latter, Azzouni does not give a metaphysical argument for the criterion but appeals instead to the de facto practices of people in general. One should adopt mind- and language-independence as a criterion for what exists because of the sociological fact that the community of speakers takes ontologically dependent items not to exist.

Given this criterion for existence, the heart of Azzouni’s account consists in spelling out the implications of mind- and language-independence for our knowledge-gathering behavior. For instance, language- and mind-independent objects cannot be stipulated into existence. Inventing a fictional character, on the other hand, involves nothing more than thinking of her. Fictional entities are paradigmatically mind-dependent and hence non-existent. This is to be contrasted with the way we form beliefs about mind-independent objects. In particular, our epistemic access to mind-independent posits possesses four salient features, namely robustness, refinement, monitoring, and grounding:

  • Robustness: Epistemic access to a posit is robust if results about that posit are independent of our expectations about it. For example, Newtonian mechanics (in conjunction with some auxiliary assumptions) predicted that the planet Uranus would have a particular perihelion which observation revealed it not to have.
  • Refinement: Epistemic access to a posit exhibits refinement when there are means by which to adjust or refine access to that posit. For example, more powerful telescopes allow improved access to distant parts of the observable universe.
  • Monitoring: Epistemic access to a posit involves monitoring when what the posit does through time can be tracked or when different aspects of the posit can be explored. For example, C.T.R. Wilson’s experiments appeared to reveal the trajectory of atoms by their observable effects on water vapor.
  • Grounding: Epistemic access to a posit exhibits grounding when properties of the posit itself explain why we can discover what properties the posit has. That stars emit light explains why they are visible to the naked eye at night.

If the way in which we establish truths about a posit does not fit with the way we establish truths about mind- and language-independent posits, then we are treating, in practice, this posit as mind- and language-dependent. Given Azzouni’s criterion for existence, this amounts to treating this posit as non-existent. The leading idea here is that an examination of scientific practice shows that we treat concrete posits—observables, but also theoretical posits such as subatomic particles—as mind- and language-independent but treat mathematical objects as mind- and language-dependent.

When robustness, refinement, monitoring, and grounding are active in deciding whether a posit exists and what features it has, one has what Azzouni calls “thick epistemic access” to it. Not all scientific posits which Azzouni takes to exist enjoy thick epistemic access, however. To take an example: because the expansion of the universe is accelerating, there are parts of the universe which are sufficiently distant from the Earth that no information from them will ever reach observers on Earth; these regions are outside our past light cone. Accepted cosmology posits the existence of galaxies, stars, nebulae, and suchlike outside our past light cone, yet concrete entities in these regions of the universe clearly fail to exhibit, at the very least, monitoring and grounding. Azzouni calls the sort of access we have to entities such as these “thin epistemic access”. In Azzouni’s developed view, thin posits are “the items we commit ourselves to on the basis of our theories about what the things we thickly access are like” (Azzouni 2012, 963). Moreover, thin posits require “excuse clauses”: explanations, stemming from the scientific theories that describe them, of why we fail to have thick access to them. In the case of galaxies outside our past light cone, we have thick access to things inside our past light cone, and widely accepted cosmological theories about the features of the posits in the early universe, in conjunction with natural laws, entail the existence of galaxies outside our light cone. Hence, theories about the things we thickly access commit us to the existence of galaxies which we cannot thickly access. They also provide the needed excuse clause: special relativity does not allow entities to travel through space faster than the speed of light.

ii. The Coherence of Denying Quine’s Criterion

Some have objected that denying Quine’s criterion is incoherent: how can sense be made of the view that it is both true that there are infinitely many primes and that no primes exist (see, for instance, Burgess 2004)? However, it is open to the deflationary nominalist to claim that “There are infinitely many primes” and “Primes do not exist” can both be true, so long as the quantifier “there are” does not express existence. Further, it is open to the deflationary nominalist to claim that “There are primes” and “There are no primes” can both be true, so long as the first occurrence of “there are” is not being used in a sense that expresses ontological commitment and the second occurrence of “there are” is being used in a sense that expresses ontological commitment. Azzouni (2004, 2007, 2010, 2017) argues that both “there is” and “there exists” are ontologically neutral, on the basis of examples such as those listed above. According to Azzouni, it is implausible to suppose that one expresses ontological commitment to, for instance, ways, on the basis of uttering phrases such as “There is more than one way to skin a cat”. One quantifies over all sorts of things but undertakes ontological commitments only to those things one treats as mind- and language-independent.

There are other ways to motivate similar claims. Hofweber (2016) defends a view according to which some singular terms function syntactically like names but do not purport to refer to objects. Drawing on work in linguistics and developmental psychology, Hofweber argues that arithmetical singular terms are of this kind, and so do not come with ontological commitments. This is expanded to cover a similarly non-representational account of quantification in arithmetic. This view, Hofweber argues further, solves a number of puzzles in the philosophy of arithmetic.

Others have drawn a distinction between quantifier commitments and ontological commitments by defending versions of Meinongianism—named after Alexius Meinong, an Austrian philosopher whose interest in our ability to think about that which does not exist led him to theorize about nonexistent objects. Parsons (1980), Routley (1980), and Zalta (1988) have all defended views according to which some objects are non-existent. More recently, Priest (2016) develops a possible world semantics according to which all worlds share a domain of objects, though different objects exist at different worlds. Here the “particular quantifier”—Priest avoids using the term “existential quantifier”, which he takes to bias the issue—ranges over all objects in the domain (existent and non-existent), and an existence predicate is used to make claims about what exists at a given world. According to Priest, only concrete objects exist at possible worlds, so mathematical objects do not exist at any possible world. On Priest’s view, then, mathematical objects do not exist, but (nonexistent) mathematical objects are included in the domain of discourse. When one talks about mathematical objects, what one says is true or false depending on how things stand with those non-existent objects.

Rayo (2013) rejects what he calls “metaphysicalism”: the view that there is a metaphysically privileged way of carving reality into its fundamental constituents, and that for a simple sentence of subject-predicate form to be true, there must be a correspondence between the logical form of the sentence and the metaphysical structure of reality. Once one has dropped the requirement that the structure of a true sentence must be mirrored in the metaphysical structure of the world, one is free to specify truth conditions for sentences according to which this is not the case. In particular, one can give truth conditions for mathematical sentences that are trivialist, that is, that make no requirements on the world (Rayo 2013, 2015, 2016). Claims such as “Infinitely many prime numbers exist” are, on this view, both true and ontologically insignificant. (Rayo, it should be noted, refers to the view as an ontologically weightless “trivialist subtle Platonism”, and presents it as a rival to nominalism.) A number of positions, then, provide frameworks within which it is prima facie coherent to deny Quine’s criterion.

iii. Excuse Clauses

Mark Colyvan has taken issue with Azzouni’s account of excuse clauses. On Azzouni’s view, posits can fail to exhibit robustness, refinement, monitoring, and grounding but still be included in our ontology so long as there is an excuse clause explaining why they fail to exhibit these features. Colyvan (2010) objects that abstract objects do have an excuse clause: namely they are acausal. Azzouni (2012) responds by noting that Colyvan’s excuse is a philosophical gloss, rather than stemming from actual scientific and mathematical practices. The force of this objection, and of the response, will depend on criteria for what counts as a legitimate excuse clause.

Bangu (2012, 28–30) has objected to Azzouni’s claim that the community of speakers treats ontological independence as the criterion for existence. Bangu points out that this is an empirical, statistical claim, but that Azzouni presents no empirical, statistical evidence for it. Were a study to be carried out, it may turn out that opinion on the matter is not uniform (and this much seems true in the philosophical community at least).

7. Instrumentalism

Until the beginning of the twenty-first century, instrumentalism—the rejection of (Realism)—was not a popular nominalist response to indispensability arguments. This may have been motivated by the thought that a rejection of (Realism) constituted a rejection of the ability of science to informatively represent the world. Indeed Burgess (1983, 93) characterizes instrumentalism as the view that “science is just a useful mythology, and no sort of approximation to or idealization of the truth”. However, some instrumentalist nominalist views that aim to avoid this result whilst maintaining that the theories of mathematical science are not strictly true have been proposed.

a. Leng

A detailed and sophisticated view of this sort is developed by Mary Leng (2002, 2005, 2007, 2010). Leng draws on earlier work by Mark Balaguer and Steven Yablo—both of whom, ultimately, defend the view that there is no fact of the matter over whether mathematical objects exist (see Balaguer 1998; Yablo 2009)—to defend a distinctive instrumentalist nominalism.

Instrumentalist nominalism stands apart from all other forms of nominalism by rejecting the claim that the best scientific theories are, strictly speaking, true or approximately true. One does not need to find replacement theories that do not quantify over abstract objects, nor to show that the theories that do quantify over abstract objects are in fact true, in order to vindicate nominalism. Leng aims to explain the usefulness of mathematics directly by studying mathematical practices and seeing if those practices can be understood without assuming the existence of mathematical objects. If one can make sense of mathematical-scientific practices—how they are used to describe, make predictions about, and explain features of the concrete domain—without positing mathematical objects, then positing the existence of mathematical objects is unnecessary. (An important and closely related program, though one that does not reject (Realism), is developed by Otávio Bueno (2005, 2009, 2012, 2016). Like Leng, Bueno looks to account for mathematical practices in a way that does not presuppose the existence of mathematical objects, but without challenging (Indispensability). Bueno’s program, however, is agnostic about the existence of mathematical objects.)

i. Mathematics and Make-Believe

Leng’s account is fictionalist. Mathematical objects can rationally be treated as fictional; doing so does not jar with mathematical practices. Leng adopts the account of fiction as make-believe developed by the aesthetician Kendall Walton (1990, 1993) to spell out the details. According to this account, in articulating a fiction, one generates a prescription to imagine that things are thus and so. When Dorothy Sayers writes that Lord Peter Wimsey earned a first at Oxford, we are invited to imagine that there is a person, Wimsey, who has this particular property. This prescription to imagine and the subsequent imagining do not require the existence of Wimsey. A text is a kind of “prop” which, in conjunction with the usual conventions and practices involving fiction, generates the content of the fiction. Other principles are involved in what one is prescribed to imagine: logical consequences, facts about the (real) world, the laws of nature, and so on also come into play. This is what prevents us from imagining, or being prescribed to imagine, that Wimsey both gained a first and did not gain a first at Oxford, or that he can fly by flapping his arms.

According to Walton’s account of fictionality, a claim S is fictional if we are prescribed to imagine that S is true. While sentences such as “Lord Peter Wimsey plays cricket” are strictly false (since no such person exists), there is something correct about them. This is what Walton’s notion of fictionality is supposed to capture. Such a sentence’s correctness is due to its fictionality—that is, to the fact that we are prescribed to imagine it as true; the writings of Sayers, along with our conventions regarding what we do with fictions—that is, how we use fictions in practice—prescribe us to imagine that Lord Peter Wimsey plays cricket.

Just as there can be mixed mathematical-concrete sentences, such as measurement claims, that describe relations between concrete and abstract entities, so, too, there can be mixed fictional-real sentences describing relations between fictions and the real world (Walton 1990, 410):

  • Oscar Wilde killed off Dorian Gray by putting a knife through his heart.
  • Most children like E.T. better than Mickey Mouse.
  • Sherlock Holmes is more famous than any other detective.
  • Vanquished by reality, by Spain, Don Quixote died in his native village in the year 1614. He was survived for a short time by Miguel de Cervantes.

Again, Walton rejects the strict truth of these sentences, but grants that their correctness is related to their fictionality, although not in as straightforward a sense as for the pure sentences of fiction. These sentences are not fictional within their respective fictions. The novel The Picture of Dorian Gray does not depict Oscar Wilde committing an act of murder and Mickey Mouse is not in E.T. (and vice versa). Instead, in making utterances like these we are engaged in “unofficial” games of fiction-making. In these contexts, we are invited to imagine that there are worlds created by their authors, allowing us to imagine relations between those worlds and between them and the real world. The correctness or incorrectness of these claims depends on their being partly fictional: they are correct if they are in accordance with what we are prescribed to imagine. Here, however, the fictionality of the sentences does not depend only on what the author’s writings prescribe one to imagine; it also depends on how things stand with the real world. The fictionality of “Sherlock Holmes is more intelligent than any detective I’ve met” depends both on real world “props” (in this case, the detectives that the utterer has met) and on what the prop of Doyle’s writings prescribes one to imagine of Holmes.

One value of this intermingling of fiction and reality—what Walton calls “prop-oriented make-believe”—is that it allows one to represent the real world indirectly. Saying “Sherlock Holmes is more intelligent than any detective I’ve met” allows one to indirectly express something about the intelligence of real detectives that might not otherwise be easy to express. Talk of fictional objects can be used to place restrictions on real objects. That sentences describing things that do not exist are strictly false does not disqualify them from being used to express or grasp facts about the real world.

Leng appropriates Walton’s account of prop-oriented make-believe to make sense of mathematized science. Mathematical make-believe can be used to place restrictions on non-mathematical objects, and hence to describe, indirectly, the concrete world. If one imagines that the set of real numbers, ℝ, exists, one can imagine that there are functions that map concreta onto different real numbers depending on their properties, and which would allow one to represent those properties quantitatively. Imagining that there is a mass function, one could say of a concrete object, d1, that Mass_kg(d1) = 5, and in doing so place restrictions on d1, thus representing it indirectly. When this goes right, the measurement ascription will be fictionally or nominalistically adequate: that is, correct with respect to the facts about the concrete world. (Rosen (2001) calls a theory Γ nominalistically adequate so long as the “concrete core” or largest wholly concrete part of a world W at which Γ is true is an exact intrinsic duplicate of the concrete core of the actual world.)

The falsity of “Sherlock Holmes is more intelligent than any detective I’ve met” does not prevent it from being capable of accurately representing the real world. According to Leng, in grasping sentences like these we grasp their nominalistic content: what they “say” about the real world. In an analogous way, when we grasp mixed mathematical-physical claims, we grasp their nominalistic content. Scientific instrumentalism, on this account, does not debar science from being an accurate guide to what the world is like. Treating mathematics as a form of make-believe is consistent with treating scientific theories as having the power to accurately represent the world.

ii. Explaining the success of mathematics

This, however, is not the end of the story. One reason many philosophers accept (Realism) is that they take it to be the only way to explain the predictive success of science. If mathematized scientific theories are false, it would then be a hugely improbable coincidence that the very precise predictions they make are correct. J.J.C. Smart (1963, 36) gave a well-known formulation of this thought:

Is it not odd that the phenomena of the world should be such as to make a purely instrumental theory true? On the other hand, if we interpret a theory in a realist way, then we have no need of such a cosmic coincidence: it is not surprising that galvanometers and cloud chambers behave in the sort of way they do, for if there really are electrons, etc., this is just what we should expect. A lot of surprising facts no longer seem surprising.

Smart had in mind instrumentalism regarding (concrete) theoretical entities such as subatomic particles, but many have endorsed the idea that the same problem carries over to instrumentalism about mathematical entities (Putnam 1971). Leng claims that the success of mathematized scientific theories is best explained in terms of their nominalistic adequacy, as opposed to their truth. Mathematized scientific theories describe non-causal relations between mathematical and concrete objects, but the behavior of concrete systems—the behavior that results in the observable events the theory predicts—cannot be in virtue of these relationships, since mathematical objects are abstract and cannot affect the behavior of concrete systems in any way. The explanation for the predictive success of mathematized theories must be that they respect the underlying concrete facts: the fundamental regularities that hold between concrete objects. Predictive success, in other words, is explained by nominalistic adequacy. Similar reasoning leads Leng to claim that what is tested empirically is the nominalistic adequacy of scientific theories, as opposed to their truth. This undercuts both (Realism) and confirmation holism. Regarding (Realism), what gets confirmed empirically is not the truth of these scientific theories, but their nominalistic adequacy. Regarding confirmation holism, the truth of the wholly nominalistic parts of these scientific theories does enjoy empirical confirmation—since truth and nominalistic adequacy are equivalent for wholly nominalistic claims—but the truth of the mathematical and mixed mathematical-concrete parts of them is not empirically confirmed. Leng notes that this line of reasoning applies to the platonist as well as the nominalist; even platonists should deny that the concrete objects described by scientific theories are the way they are because of abstract mathematical objects.

While it follows from the truth of a theory that it will be predictively successful, the explanation for why it is successful must be in terms of its nominalistic adequacy. Recall that Field explained the success of a (false) mathematized theory M by showing that it respects the non-mathematical relations that hold between concreta. He does this by creating a nominalistic theory N which describes the concrete world directly and proving a representation theorem which shows that N and M both place the same sorts of restrictions on the concrete world. If Leng’s reasoning is correct, Field’s project is superfluous, at least with regards to defending nominalism. One does not have to go to the trouble of spelling out a nominalistic counterpart theory, since it is the nominalistic adequacy of mathematized theories which explains their success regardless of whether a nominalistic counterpart theory is available.

iii. Mathematical Explanations

Some have argued that the ability to provide mathematical explanations of physical phenomena provides a reason to believe in the existence of mathematical objects over and above the reason provided by traditional indispensability arguments (see, for example, Bangu 2008, 2013; Baker 2005, 2009, 2012, 2017; Colyvan 2002; Lyon and Colyvan 2008; Lyon 2012). So, in addition to addressing the predictive success of mathematical science, Leng looks to account for the explanatory success of mathematical science. For many mathematical explanations, this is straightforward: mathematical models can be explanatory as a result of their representational role. The nominalistic adequacy of a theory explains why the concrete phenomenon in need of explanation occurs. Sometimes, though, the explanatory work done by mathematics is not exhausted by the nominalistic content of those theories, as there are cases in which, if one were able to represent the nominalistic content directly, explanatory power would actually be lost. Here is an example: the Honeycomb Conjecture—the claim that a hexagonal grid divides a surface into regions of equal area in a way that minimizes the total perimeter of cells—was proven in 1999 by Thomas Hales (this example is discussed in Lyon and Colyvan 2008 and Colyvan 2012). This proof, in conjunction with the premise that bees have evolved to minimize the amount of wax they must use while maximizing the amount of honey they can store, can be used to explain why bees build hexagonal honeycombs. It is not clear that any nominalistic explanation could do the same explanatory work as this abstract one. An explanation which quantified over particular concrete honeycombs would lack the scope of the abstract explanation, which allows us to see that any hives that respond to these evolutionary pressures will have a hexagonal structure.
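The geometric fact behind the honeycomb explanation can be illustrated with a small calculation. The sketch below is only an illustration under a strong simplifying assumption: it compares the three regular polygons that tile the plane, whereas Hales’s theorem covers arbitrary partitions into equal-area regions. The function and variable names are hypothetical and chosen for this example.

```python
from math import pi, sqrt, tan


def perimeter_of_unit_area_polygon(n_sides):
    """Perimeter of a regular polygon with n_sides sides and area 1.

    A regular n-gon with side s has area n * s**2 / (4 * tan(pi / n)),
    so fixing the area at 1 gives a perimeter of 2 * sqrt(n * tan(pi / n)).
    """
    return 2 * sqrt(n_sides * tan(pi / n_sides))


# Assumption for this sketch: only regular cells are compared.
# Triangles, squares and hexagons are the regular polygons that tile the plane.
regular_tilings = {"triangle": 3, "square": 4, "hexagon": 6}

perimeters = {
    name: perimeter_of_unit_area_polygon(n)
    for name, n in regular_tilings.items()
}

# Print the cells from smallest to largest perimeter.
for name, p in sorted(perimeters.items(), key=lambda item: item[1]):
    print(f"{name:8s} perimeter per unit-area cell: {p:.4f}")
```

For cells of equal area, the hexagon has the smallest perimeter (roughly 3.72, against 4.00 for the square and roughly 4.56 for the triangle). In a grid each wall is shared by two cells, so the wall length per cell is half of these figures, which leaves the ranking unchanged.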

Leng (2012, 2021) claims that structural explanations of this sort can also be accommodated by her form of nominalism. Structural explanations explain phenomena by showing that they will result if certain structural features are in place. The Honeycomb Conjecture is about abstract structures, but theories about abstract hexagonal structures can be (re)interpreted as being about concrete approximately hexagonal objects. Roughly speaking, what goes for ideal abstract hexagonal structures goes approximately for imperfect, concrete hexagonal structures. Axioms Γ characterizing abstract hexagonal structures will be approximately true of imperfect concrete hexagonal structures. If some claim ϕ (such as that hexagonal grids minimize the total perimeter of cells) is entailed by Γ, then ϕ will also hold in concrete systems approximately characterizable by Γ. More generally, a model of a theory Γ is a domain of objects that satisfies the axioms of Γ. If Γ entails some claim ϕ, then any model of Γ must also be a model of ϕ. Γ characterizes an abstract structure, which means that mathematical methods can be used to determine whether any model of Γ is a model of ϕ. However, the axioms of Γ can also be interpreted as being about a concrete system (the terms of Γ will be taken to denote concrete objects), in which case Γ will characterize the structural features of that concrete system. A mathematical proof that any model of Γ is a model of ϕ will, then, explain why any concrete system characterizable by Γ will exhibit ϕ (with ϕ now also interpreted as being about a concrete system).

According to Leng, then, using mathematical explanations does not commit one to the existence of mathematical objects, but only to the fact that the concrete system being modelled has the structural features which mathematical methods allow one to describe. The applicability of the explanation is accounted for not in terms of the existence of mathematical objects, but in terms of our ability to interpret the axioms of Γ as being approximately true of concrete systems. Again, Leng’s take on nominalism is not to rid ourselves of abstract talk, or to show how abstract talk could be true without abstract objects, but to explain why engaging in abstract talk can achieve certain ends despite its literal falsehood.

iv. Nominalistic Content

Leng claims that although the best scientific theories are not true, they allow us to grasp what the physical world is like. The best scientific theories are nominalistically adequate: they, in some sense, express a true nominalistic content given by what those theories require of the concrete domain. There are difficulties in explicating what this amounts to. One way to give the nominalistic content of a mixed mathematical-concrete sentence ϕ is to provide a nominalistic paraphrase ϕN of ϕ in the manner of Field’s reconstructive project. ϕN would describe the way the physical world would have to be for ϕ to be true. However, if one renounces the project of providing Field-style nominalizations, it is not clear how to articulate the nominalistic content of mathematical descriptions of the physical world (Field 1989, chapter 7).

Melia (2000) suggests that it is possible for the nominalist to “communicate or express his picture of what the world is really like” by asserting a mixed mathematical-concrete sentence ϕ and then denying some of its consequences. One might say, for example, “Everyone who Fs also Gs. Except Harry—he’s the one exception”. In doing so, it is possible to successfully communicate a picture of what the world is like. Similarly, one might say “ϕ—but there are no mathematical objects”. Colyvan (2010) objects to strategies of this kind on the grounds that what is communicated or expressed is ill-defined. Colyvan compares the strategy to Tolkien retracting the (fictional) existence of hobbits late on in The Lord of the Rings. We have no real grip on what is left of the story once hobbits have been extracted.

A number of theorists, however, have given accounts of nominalistic content—or the closely related notion of nominalistic adequacy—that may allow nominalistic content to be more clearly picked out without providing Field-style nominalistic paraphrases of mixed mathematical-physical sentences. Rosen (2001), Dorr (2007), and Yablo (2012) explore ways of cashing out nominalistic adequacy and nominalistic content by appealing to physically indiscernible worlds. Ketland (2011) defines nominalistic adequacy model-theoretically. Rayo (2015) defends a way of specifying nominalistic contents by “outscoping” mathematical vocabulary. Roughly speaking, instead of characterizing a world w as containing both mathematical and concrete objects and describing the relations that obtain between them, one characterizes w as containing only concrete objects and uses mathematical language to characterize the concrete objects at w. Since doing this still involves making mathematical assertions, there are interesting questions about whether the strategy is available to instrumentalist nominalists.

8. Mathematical Nominalism and Naturalism

Though the term “naturalism” is used in many different, sometimes conflicting ways, there are broadly two kinds of naturalism: metaphysical naturalism and methodological naturalism. Metaphysical naturalism is the claim that only natural things exist—there are no supernatural beings that, in some way, transcend the natural order. In many reasonable senses of “transcend the natural order”, platonism apparently violates metaphysical naturalism: mathematical objects are nonmaterial, eternal, immutable, acausal, not in space-time or subject to natural laws. For instance, Weir (2005, 473) says that “the ontological naturalist holds that we have a reasonably determinate conception of what it is to be physical and avers that everything is physical”. Mathematical objects would not usually be regarded as physical objects, even by those who regard them as postulates of physics.

Methodological naturalism involves a deferential attitude towards scientific practice. This cannot be quite as strong as the claim that one can only come to know (or be justified in believing) any given claim by scientific means, since the claim that one can only come to know (or be justified in believing) any given claim by scientific means is not something that one can come to know (or be justified in believing) by scientific means. It is not, after all, part of any empirically testable scientific hypothesis: no researcher could devise an experimental setup to empirically confirm this claim. That version of naturalism is not thereby shown to be strictly inconsistent, but it would entail its own unknowability (or inability to be justified under any circumstances). Instead, methodological naturalism can be parsed as something like the following: in scientific domains of inquiry, we should defer to the epistemic standards employed by working scientists, since there is no higher or better perspective from which to inquire into the nature of the world or from which to assess the claims of science (see, for instance, Quine 1975). If, for example, working scientists take themselves to be justified in holding that the universe is approximately 13 billion years old, we should not take ourselves to be in possession of philosophical reasons to reject this claim.

a. Quine’s Naturalism

Quine is probably the most famous advocate of methodological naturalism, particularly as it pertains to nominalism. According to Quine, natural science is:

An inquiry into reality, fallible and corrigible but not answerable to any supra-scientific tribunal, and not in need of any justification beyond observation and the hypothetico-deductive method. (Quine 1981, 72)

As such, ontology should be decided by looking to the content of the best contemporary scientific theories. If the overall body of scientific theory, suitably regimented into the language of first-order logic, states that something exists—which, for Quine, is equivalent to quantifying over that thing—then we are sufficiently justified in taking that thing to exist. (See the article on the Indispensability Argument for details of how this goes.) Quine takes empirical science to be the arbiter of what exists. Although he understands science broadly, including disciplines like psychology, economics, sociology and history, pure mathematics does not fall under this rubric. For Quine, we ought to believe the claims of mathematics, but only because they find application in empirical science. Quine changes his mind over what to make of mathematics that is not applied; at one point he claims the higher reaches of set theory are on a par with uninterpreted languages (Quine 1984), but later that we regard them as meaningful because they are “couched in the same grammar and vocabulary” as mathematics, which does get applied (Quine 1990, 94).

b. Maddy’s Naturalism

Penelope Maddy (1997) claims that the Quinean ignores some nuances of scientific practice that have a bearing on what the naturalist should take to be the real scientific standards of evidence. Maddy points out that a historical study of scientific practice reveals that, though atomic theory was entrenched to the point that quantification over atoms was indispensable to the best science by 1860, scientists did not believe in the existence of atoms until atoms themselves were detected directly by Jean Baptiste Perrin, who was able to experimentally verify the predictions made by Einstein’s 1905 account of Brownian motion in terms of atoms. Similarly, general relativity treats space-time as dense—for any two space-time points, there exists another point between them, meaning space-time is “smooth” rather than quantized—but scientists who model space-time this way do not assume that space-time itself really has this property. The choice of model is based on pragmatic factors such as convenience, effectiveness, and computational tractability. There are, then, things that the best scientific theories indispensably quantify over but which working scientists do not take themselves to be justified in believing in. As such, scientific practice appears at odds with confirmation holism: there are things affirmed by the best scientific theories that scientists do not treat as being empirically confirmed. With regards to nominalism, Maddy notes that, since mathematical objects could not be directly detectable in the manner of theoretical posits like atoms, the evidence for the existence of mathematical objects cannot be of the same sort as the evidence for the existence of some other theoretical posits. As with some concrete posits of theories, scientists include the mathematical posits they do because of pragmatic factors like convenience, effectiveness, and computational tractability. Maddy takes this to undermine the claim that the mere presence of reference to or quantification over mathematical objects in the best scientific theories provides justification for taking those objects to exist.

c. Burgess and Rosen’s Naturalism

John Burgess and Gideon Rosen (1997, 2005) have given an influential naturalistic argument against nominalism and in response to epistemological objections to belief in mathematical objects. The Burgess-Rosen-style naturalist makes no attempt to reconcile the picture of ourselves—given to us by biology, neuroscience, empirical psychology and so on—as embodied creatures availed of particular information-gathering abilities, with the claim that we have knowledge of a domain of abstract objects. Instead, he appeals to mathematical practices and the manner in which mathematicians come to accept mathematical claims. If one asks a working scientist why she believes in protons, she will cite the usual scientific evidence for the existence of protons. For the naturalist, this standard of evidence should be sufficient for belief in protons. In the same way, argue Burgess and Rosen, one should look to the usual means by which mathematicians provide evidence for mathematical claims—for the naturalist, the usual mathematical standards of evidence should be enough for us to accept the mathematical claims they aim to establish. Burgess and Rosen argue that nominalism requires rejecting the standards of justification at play in the mathematical community. They look to the circumstances under which mathematicians come to accept mathematical claims. Consider the claim:

There are numbers greater than 10^(10^10) that are prime.

Mathematicians accept this claim on the basis of mathematical proofs. There are many such proofs, but we can consider just one to form a sense of what is involved in mathematical standards of justification:

Assume that there are finitely many primes. These can be represented in a list: p1, p2, p3, p4, …, pn. Consider the number N = p1 × p2 × p3 × p4 × … × pn + 1. Either N is a prime number, or it is not. If N is prime, this contradicts the assumption that the list p1, p2, p3, p4, …, pn includes all the primes (N is greater than every number on the list), so N cannot be prime. If N is not prime, then it must have a prime divisor (all natural numbers greater than 1 are either prime or are the products of primes). But this divisor cannot be on the list p1, p2, p3, p4, …, pn, since dividing N by any number on the list would leave a remainder of 1. This also contradicts the assumption that the list includes all the primes, so N cannot not be prime. The assumption that there are finitely many primes entails that N cannot be prime and cannot be not prime, which is absurd. So, there are infinitely many primes. QED
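The central step of the proof, that N leaves a remainder of 1 when divided by any prime on the list, can be checked on a concrete case. The sketch below is only an illustration and not part of Burgess and Rosen’s presentation. The starting list of six primes and the helper names are assumptions made for the example.

```python
from math import prod


def euclid_number(primes):
    """Return N = p1 * p2 * ... * pn + 1 for a finite list of primes."""
    return prod(primes) + 1


def smallest_prime_factor(n):
    """Return the smallest prime factor of n (for n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


# Hypothetical "complete" list of primes, chosen for illustration.
primes = [2, 3, 5, 7, 11, 13]

N = euclid_number(primes)
remainders = [N % p for p in primes]   # each entry is 1
q = smallest_prime_factor(N)           # a prime that is not on the list

print(N, remainders, q)
```

Here N is 30031, which is not itself prime (30031 = 59 × 509), but its prime factors are missing from the original list. This is the second case of the proof: whichever finite list one starts with, the construction produces evidence of a prime outside it.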

That there are infinitely many prime numbers entails that there are numbers and, so, is incompatible with nominalism (given the kind of mainstream semantic presuppositions discussed earlier). Moreover, it is a claim that mathematicians take to be established by the reasoning above. If nominalists reject that claim, they must be using standards of justification different from the standards of mathematicians. Philosophers, though, have no higher or better vantage point from which to assess mathematical claims than mathematicians, so one should defer to the justificatory standards of mathematicians and accept that there is (decisive) justification for believing in abstract objects. Burgess and Rosen (2005) parse the argument the following way:

  1. The claims of standard mathematics appear to assert the existence of mathematical objects;
  2. Experts—mathematicians and scientists—accept these claims, using them in practical and theoretical reasoning;
  3. These claims are acceptable by mathematical standards. The claims that are not taken as axioms are supplied with proofs;
  4. The claims of standard mathematics not only appear to assert the existence of mathematical objects, they do assert the existence of mathematical objects;
  5. Accepting a claim—assenting to it verbally without reservations, using it in practical and theoretical reasoning and so on—just is believing the claim to be true;
  6. The claims of standard mathematics are not only acceptable by mathematical standards but are acceptable by scientific standards: empirical scientists defer to mathematicians on mathematical matters and there are no empirical scientific arguments against the claims of standard mathematics;
  7. There are no philosophical considerations that can override mathematical and scientific standards of acceptability;
  8. From (1), (2), (4) and (5): competent mathematicians and scientists believe in the existence of mathematical objects;
  9. From (3), (6), (7), (8): we are justified in believing in mathematical objects.

An interesting feature of the argument is that, if successful, it shows that nominalism is implausible whether or not mathematics is, in principle, dispensable to science: the de facto endorsement of mathematical claims by mathematicians is enough to undermine nominalism. Parts of the argument, however, can be contested by nominalists. For example, it might be answered that the seemingly formidable mathematicians who have denied the existence of mathematical objects—such as Alfred Tarski (see Frost-Arnold 2008), Timothy Gowers (2011), Solomon Feferman (1998), and Abraham Robinson (1979)—are not incompetent, contrary to what (8) would entail.

There have been different responses from nominalists to the considerations that motivate the argument. Leng’s pragmatic nominalism aims at showing why strictly false mathematical claims can reasonably be asserted and relied on in practical and theoretical reasoning. If her project is successful, then (5) is undercut, since accepting a claim, in the relevant sense of taking it to be nominalistically adequate, is very different from believing it to be true. Chihara (2006) has questioned (4). The claims of standard mathematics may not be genuine claims about the world, but they may, for instance, express what would be true in a structure, or may be only partially meaningful and lack complete truth conditions. Azzouni could also be understood as rejecting (1) and (4).

9. References and Further Reading

  • Arntzenius, Frank, and Cian Dorr. 2012. “Calculus as Geometry”. In Arntzenius, Frank, Space, Time, and Stuff. Oxford University Press.
  • Azzouni, Jody. 2004. “Theory, Observation and Scientific Realism”. Philosophy of Science 55: 371–92.
  • Azzouni, Jody. 2007. “A Cause for Concern: Standard Abstracta and Causation”. Philosophia Mathematica 16(3): 397–401.
  • Azzouni, Jody. 2010a. Talking about Nothing: Numbers, Hallucinations, and Fictions. Oxford University Press.
  • Azzouni, Jody. 2010b. “Ontology and the Word ‘Exist’: Uneasy Relations”. Philosophia Mathematica 18(1): 74–101.
  • Azzouni, Jody. 2012. “Taking the Easy Road Out of Dodge”. Mind 121(484): 951–65.
  • Azzouni, Jody. 2017. Ontology Without Borders. Oxford University Press.
  • Baker, Alan. 2005. “Are There Genuine Mathematical Explanations of Physical Phenomena?” Mind 114(454): 223–38.
  • Baker, Alan. 2009. “Mathematical Accidents and the End of Explanation”. In New Waves in Philosophy of Mathematics, edited by Otávio Bueno and Øystein Linnebo. Palgrave Macmillan: 137–59.
  • Baker, Alan. 2012. “Science-Driven Mathematical Explanation”. Mind 121(482): 243–67.
  • Baker, Alan. 2017. “Mathematical Spandrels”. Australasian Journal of Philosophy 95(4): 243–67.
  • Balaguer, Mark. 1996. “A Fictionalist Account of the Indispensable Applications of Mathematics”. Philosophical Studies 83(3): 291–314.
  • Balaguer, Mark. 1998. Platonism and Anti-Platonism in Mathematics. Oxford University Press.
  • Bangu, Sorin. 2008. “Inference to the Best Explanation and Mathematical Realism”. Synthese 160(1): 13–20.
  • Bangu, Sorin. 2012. The Applicability of Mathematics in Science: Indispensability and Ontology. Palgrave Macmillan.
  • Bangu, Sorin. 2013. “Indispensability and Explanation”. British Journal for the Philosophy of Science 64(2): 225–77.
  • Benacerraf, Paul. 1973. “Mathematical Truth”. The Journal of Philosophy 70(19): 661–79.
  • Bigelow, John. 1988. The Reality of Numbers: A Physicalist’s Philosophy of Mathematics. Clarendon Press.
  • Brandom, Robert. 1994. Making It Explicit: Reasoning, Representing, and Discursive Commitment. Harvard University Press.
  • Bueno, Otávio. 2005. “Dirac and the Dispensability of Mathematics”. Studies in History and Philosophy of Modern Physics 36: 465–90.
  • Bueno, Otávio. 2009. “Mathematical Fictionalism”. In New Waves in Philosophy of Mathematics, edited by Otávio Bueno and Øystein Linnebo. Palgrave Macmillan: 59–79.
  • Bueno, Otávio. 2012. “An Easy Road to Nominalism”. Mind 121(484): 967–82.
  • Bueno, Otávio. 2016. “An Anti-Realist Account of the Application of Mathematics”. Philosophical Studies 173: 2591–2604.
  • Bueno, Otávio, and Mark Colyvan. 2011. “An Inferential Conception of the Application of Mathematics”. Noûs 45(2): 345–74.
  • Bueno, Otávio, and Steven French. 2018. Applying Mathematics: Immersion, Inference, Interpretation. Oxford University Press.
  • Burgess, John. 1983. “Why I Am Not a Nominalist”. Notre Dame Journal of Formal Logic 24(1): 93–105.
  • Burgess, John. 2004. “Review of Deflating Existential Consequence: A Case for Nominalism”. The Bulletin of Symbolic Logic 10: 573–77.
  • Burgess, John, and Gideon Rosen. 1997. A Subject with No Object: Strategies for Nominalistic Interpretation of Mathematics. Oxford University Press.
  • Button, Tim. 2013. The Limits of Realism. Oxford University Press.
  • Button, Tim, and Sean Walsh. 2018. Philosophy and Model Theory. Oxford University Press.
  • Carnap, Rudolf. 1967. The Logical Structure of the World. University of California Press.
  • Chang, Hasok. 2004. Inventing Temperature: Measurement and Scientific Progress. Oxford University Press.
  • Cheyne, Colin. 1998. “Existence Claims and Causality”. Australasian Journal of Philosophy 76(1): 34–47.
  • Cheyne, Colin. 2001. Knowledge, Cause, and Abstract Objects: Causal Objections to Platonism. Kluwer Academic Publishers.
  • Chihara, Charles. 1973. Ontology and the Vicious Circle Principle. Cornell University Press.
  • Chihara, Charles. 1990. Constructibility and Mathematical Existence. Clarendon Press.
  • Chihara, Charles. 2004. A Structural Account of Mathematics. Oxford University Press.
  • Chihara, Charles. 2005. “Nominalism”. In The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro. Oxford University Press: 483–514.
  • Chihara, Charles. 2006. “Burgess’s ‘Scientific’ Arguments for the Existence of Mathematical Objects”. Philosophia Mathematica 14: 318–37.
  • Clarke-Doane, Justin. 2020. Morality and Mathematics. Oxford University Press.
  • Collin, James Henry. 2018. “Towards an Account of Epistemic Luck for Necessary Truths”. Acta Analytica 33: 483–504.
  • Colyvan, Mark. 2001. The Indispensability of Mathematics. Oxford University Press.
  • Colyvan, Mark. 2002. “Mathematics and Aesthetic Considerations in Science”. Mind 111(411): 69–74.
  • Colyvan, Mark. 2010. “There Is No Easy Road to Nominalism”. Mind 119(474): 285–306.
  • Colyvan, Mark. 2012. An Introduction to the Philosophy of Mathematics. Cambridge University Press.
  • Dorr, Cian. 2008. “There Are No Abstract Objects”. In Contemporary Debates in Metaphysics, edited by John Hawthorne, Theodore Sider, and Dean Zimmerman. Blackwell: 32–63.
  • Feferman, Solomon. 1998. In the Light of Logic. New York: Oxford University Press.
  • Field, Hartry. 1980. Science Without Numbers. Princeton University Press.
  • Field, Hartry. 1989. “Realism and Anti-Realism about Mathematics”. In Field, Hartry, Realism, Mathematics and Modality. Blackwell: 53–78.
  • Field, Hartry. 1991. “Metalogic and Modality”. Philosophical Studies 62(1): 1–22.
  • Field, Hartry. 2016. Science Without Numbers. 2nd ed. Oxford University Press.
  • Fitzgerald, Henry. 2003. “Nominalist Things”. Analysis 63(2): 170–71.
  • Frege, Gottlob. 1884. Die Grundlagen der Arithmetik: Eine logisch-mathematische Untersuchung über den Begriff der Zahl. Breslau: Verlag von Wilhelm Koebner.
  • Frost-Arnold, Greg. 2008. “Tarski’s Nominalism”. In New Essays on Tarski and Philosophy, edited by Douglas Patterson. Oxford University Press: 225–246.
  • Goldman, Alvin. 1976. “Discrimination and Perceptual Knowledge”. The Journal of Philosophy 73(20): 771–91.
  • Gowers, Timothy. 2011. “Comment on Gideon Rosen’s ‘The Reality of Mathematical Objects’”. In Meaning in Mathematics, edited by John Polkinghorne. Oxford University Press: 132–133.
  • Hart, W.D. 1977. “Review of Steiner, Mathematical Knowledge”. Journal of Philosophy 74: 118–29.
  • Hirsch, Eli. 2011. Quantifier Variance and Realism: Essays in Metaontology. Oxford University Press.
  • Hofweber, Thomas. 2016. Ontology and the Ambitions of Metaphysics. Oxford University Press.
  • Ketland, Jeffrey. 2011. “Nominalistic Adequacy”. Proceedings of the Aristotelian Society 111(2): 201–17.
  • Ketland, Jeffrey. 2021. “Foundations of Applied Mathematics I”. Synthese 199(1–2): 4151–4193.
  • Kimhi, Irad. 2018. Thinking and Being. Harvard University Press.
  • Krantz, David H., R. Duncan Luce, Patrick Suppes, and Amos Tversky. 1971. Foundations of Measurement. Vol. 1. Academic Press.
  • Leng, Mary. 2002. “What’s Wrong with Indispensability? (Or, the Case for Recreational Mathematics)”. Synthese 131(3): 395–417.
  • Leng, Mary. 2005. “Revolutionary Fictionalism: A Call to Arms”. Philosophia Mathematica 13(3): 277–93.
  • Leng, Mary. 2007. “What’s There to Know? A Fictionalist Account of Mathematical Knowledge”. In Mathematical Knowledge, edited by Alexander Paseau, Mary Leng and Michael Potter. Oxford University Press: 84–108.
  • Leng, Mary. 2010. Mathematics and Reality. Oxford University Press.
  • Leng, Mary. 2012. “Taking It Easy: A Response to Colyvan”. Mind 121(484): 983–95.
  • Leng, Mary. 2021. “Models, Structures, and the Explanatory Role of Mathematics in Empirical Science”. Synthese 199(3): 10415–40.
  • Lewis, David. 1986. On the Plurality of Worlds. John Wiley & Sons.
  • Linnebo, Øystein. 2017. Philosophy of Mathematics. Princeton University Press.
  • Lyon, Aidan. 2012. “Mathematical Explanations of Empirical Facts, and Mathematical Realism”. Australasian Journal of Philosophy 90(3): 559–78.
  • Lyon, Aidan, and Mark Colyvan. 2008. “The Explanatory Power of Phase Spaces”. Philosophia Mathematica 16(2): 227–43.
  • Maddy, Penelope. 1992. “Indispensability and Practice”. Journal of Philosophy 89(6): 275–89.
  • Maddy, Penelope. 1997. Naturalism in Mathematics. Oxford University Press.
  • Malament, David. 1982. “Review of Field’s Science Without Numbers”. Journal of Philosophy 79: 523–34.
  • McDaniel, Kris. 2017. The Fragmentation of Being. Oxford University Press.
  • Melia, Joseph. 2000. “Weaseling Away the Indispensability Argument”. Mind 109(435): 455–79.
  • Melia, Joseph. 2006. “The Conservativeness of Mathematics”. Analysis 66 (291): 202–8.
  • Miller, Barry. 2002. The Fullness of Being: A New Paradigm for Existence. University of Notre Dame Press.
  • Morrison, Margaret. 2015. Reconstructing Reality: Models, Mathematics, and Simulations. Oxford University Press.
  • Nutting, Eileen S. 2016. “To Bridge Gödel’s Gap”. Philosophical Studies 173: 2133–50.
  • Parsons, Terence. 1980. Nonexistent Objects. Yale University Press.
  • Pincock, Christopher. 2012. Mathematics and Scientific Representation. Oxford University Press.
  • Plantinga, Alvin. 1993. Warrant: The Current Debate. Oxford University Press.
  • Potter, Michael. 2007. “What Is the Problem of Mathematical Knowledge?”. In Mathematical Knowledge, edited by Mary Leng, Alexander Paseau, and Michael Potter. Oxford University Press: 16–32.
  • Priest, Graham. 2016. Towards Non-Being: The Logic and Metaphysics of Intentionality. 2nd ed. Oxford University Press.
  • Putnam, Hilary. 1971. Philosophy of Logic. Harper & Row.
  • Putnam, Hilary. 1980. “Models and Reality”. The Journal of Symbolic Logic 45(3): 464–82.
  • Putnam, Hilary. 2004. Ethics Without Ontology. Harvard University Press.
  • Quine, W.V.O. 1948. “On What There Is”. Review of Metaphysics 2(5): 21–36.
  • Quine, W.V.O. 1951. “Two Dogmas of Empiricism”. The Philosophical Review 60(1): 20–43.
  • Quine, W.V.O. 1975. “Five Milestones of Empiricism”. Reprinted in Theories and Things. Harvard University Press: 67–72.
  • Quine, W.V.O. 1976. “Whither Physical Objects?”. Studies in the Philosophy of Science 39: 303–10.
  • Quine, W.V.O. 1981. Theories and Things. Harvard University Press.
  • Quine, W.V.O. 1984. “Review of Charles Parsons’ Mathematics in Philosophy”. Journal of Philosophy.
  • Quine, W.V.O. 1986. Philosophy of Logic. 2nd ed. Harvard University Press.
  • Quine, W.V.O. 1990. Pursuit of Truth. Harvard University Press.
  • Rayo, Agustín. 2013. The Construction of Logical Space. Oxford University Press.
  • Rayo, Agustín. 2015. “Nominalism, Trivialism, Logicism”. Philosophia Mathematica 23(1): 65–86.
  • Rayo, Agustín. 2016. “Neo-Fregeanism Reconsidered”. In Abstractionism: Essays in Philosophy of Mathematics, edited by Philip Ebert and Marcus Rossberg. Oxford University Press: 203–221.
  • Resnik, Michael. 1995. “Scientific vs. Mathematical Realism: The Indispensability Argument”. Philosophia Mathematica 3(2): 166–74.
  • Resnik, Michael. 1997. Mathematics as a Science of Patterns. Clarendon Press.
  • Robinson, Abraham. 1979. “Formalism”. In Selected Papers of Abraham Robinson, edited by W.A.J. Luxemburg and S. Körner. North-Holland Publishing Company.
  • Rosen, Gideon. 2001. “Nominalism, Naturalism, Epistemic Relativism”. Philosophical Perspectives 15: 69–91.
  • Rosen, Gideon. 2020. “Abstract Objects”. In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta.
  • Rosen, Gideon, and John Burgess. 2005. “Nominalism Reconsidered”. In The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro. Oxford University Press: 515–535.
  • Routley, Richard. 1980. Exploring Meinong’s Jungle and Beyond. Canberra: Philosophy Department, Research School of the Social Sciences.
  • Russell, Bertrand. 1903. Principles of Mathematics. Cambridge University Press.
  • Shapiro, Stewart. 1993. “Modality and Ontology”. Mind 102(407): 455–81.
  • Shapiro, Stewart. 1997. Philosophy of Mathematics: Structure and Ontology. Oxford University Press.
  • Sober, Elliott. 1993. “Mathematics and Indispensability”. The Philosophical Review 102(1): 35–57.
  • Steiner, Mark. 1998. The Applicability of Mathematics as a Philosophical Problem. Harvard University Press.
  • Suppes, Patrick. 1960. “A Comparison of the Meaning and Use of Models in Mathematics and the Empirical Sciences”. Synthese 12(2-3): 287–301.
  • Tarski, Alfred. 1935. “Der Wahrheitsbegriff in den formalisierten Sprachen”. Studia Philosophica 1: 261–405.
  • Tarski, Alfred. 1936. “Über den Begriff der logischen Folgerung”. Actes du Congrès international de philosophie scientifique: Logique, Volume 7: 1–11.
  • Tarski, Alfred. 1944. “The Semantic Conception of Truth”. Philosophy and Phenomenological Research 3: 341–76.
  • Vallicella, William F. 2002. A Paradigm Theory of Existence: Onto-Theology Vindicated. Kluwer Academic Publishers.
  • van Fraassen, Bas. 2008. Scientific Representation: Paradoxes of Perspective. Oxford University Press.
  • Walton, Kendall. 1990. Mimesis as Make-Believe. Harvard University Press.
  • Walton, Kendall. 1993. “Metaphor and Prop Oriented Make-Believe”. European Journal of Philosophy 1: 39–57.
  • Weir, Alan. 2005. “Naturalism Reconsidered”. In The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro. Oxford University Press: 460–82.
  • Weisberg, Michael. 2013. Simulation and Similarity: Using Models to Understand the World. Oxford University Press.
  • Yablo, Stephen. 2009. “Must Existence-Questions Have Answers?”. In Metametaphysics: New Essays on the Foundations of Ontology, edited by David Manley, David Chalmers and Ryan Wasserman. Oxford University Press: 507–525.
  • Yablo, Stephen. 2012. “Explanation, Extrapolation, and Existence”. Mind 121: 1007–29.
  • Zalta, Edward N. 1988. Intensional Logic and the Metaphysics of Intensionality. MIT Press.

 

Author Information

James Henry Collin
Email: James.Collin@glasgow.ac.uk
University of Glasgow
Scotland

Epistemic Conditions of Moral Responsibility

What conditions on a person’s knowledge must be satisfied in order for them to be morally responsible for something they have done? The first two decades of the twenty-first century saw a surge of interest in this question. Must an agent, for example, be aware that their conduct is all-things-considered wrong to be blameworthy for it? Or could something weaker than this epistemic state suffice, such as having a mere belief in the act’s wrong-making features, or having the mere capacity for awareness of these features? Notice that these questions are not reducible to the question of whether moral responsibility for something requires free will or control over it. Initially, then, it is worth treating the epistemic condition (otherwise known as the “cognitive” or “knowledge” condition) on moral responsibility as distinct from the control condition. As we shall see, however, some make it part of the control condition.

This article introduces the epistemic conditions of moral responsibility. It starts by clarifying the parameters of the topic and then identifies the two most significant debates in the epistemic condition literature: (1) the debate on whether blameworthiness for wrongdoing requires awareness of wrongdoing, and (2) the debate on whether responsibility for the consequences of our behaviour requires foreseeing those consequences. The bulk of the article is devoted to an overview of each debate, and it closes with a consideration of future directions for research on the epistemic condition, especially concerning moral praiseworthiness, collective responsibility, and criminal liability.

Table of Contents

  1. The Epistemic Conditions: The Topic
  2. The Epistemic Conditions of Culpable Misconduct
    1. Basic & Control-Based Views
      1. Strong Internalism (aka “Volitionism”)
      2. Weak Internalism
      3. Basic and Control-Based Externalism
    2. Capacitarian Views
      1. Capacitarian Externalism
      2. Capacitarian Internalism?
    3. Quality-of-Will Views
      1. Moral Quality-of-Will Theories
      2. Epistemic Vice Theories
    4. Hybrid and Pluralist Views
  3. The Epistemic Conditions of Derivative Responsibility
    1. Foresight and Foreseeability Views
    2. No-Foreseeability Views
  4. Future Areas for Research
  5. References and Further Reading

1. The Epistemic Conditions: The Topic

The epistemic conditions of moral responsibility specify an epistemic property (or set of properties) of the agent that the agent must possess in order to be morally responsible for an act, attitude, trait, or event. “Epistemic” is understood loosely to mean “cognitive” or “intellectual.” The sense of “responsibility” here is, of course, to be distinguished from the sense of responsibility as a baseline moral capacity (being a “morally responsible agent”), as a virtue (“she is a very responsible child”), or as a role or obligation (having “the responsibility” to do something). The relevant sense of responsibility is the one involved in being held responsible for something, implying accountability, or eligibility for praise or blame for that thing. Moreover, nearly every theorist of the epistemic condition takes the “backward-looking” perspective on accountability, on which praise or blame is fitting only in response to something about the agent or what they have done in the past, rather than fitting for the purposes of bringing about good consequences (as on “forward-looking” views).

The topic of the epistemic condition has a rather large scope. For anything X that we can be held responsible for (whether X is an act, omission, mental state, character trait, event, or state of affairs), we might be concerned with the epistemic conditions of responsibility in general, for X, or the epistemic conditions of praiseworthiness or blameworthiness in particular, for X. Moreover, we might be concerned with different degrees of responsibility (blameworthiness, etc.) and different modes of responsibility for X. For modes of responsibility, direct/original/non-derivative responsibility for X obtains when all the conditions on responsibility are fulfilled at the time of X, whereas derivative/indirect responsibility for X obtains when one or more conditions are not fulfilled at the time of X but are fulfilled at some suitable prior time. When responsibility is derivative, we talk of “tracing” responsibility back to that prior time. Finally, we might even be interested in more than one concept of responsibility for X (Watson 1996).

Concerning the epistemic condition itself, relevant epistemic states in the agent could include beliefs, credences, or capacities to have those beliefs or credences. With respect to X, the content of these epistemic states could include:

  • that one is doing or causing or possesses (etc.) X;
  • that X has a certain moral significance (for example, “is wrong”) or has features that make it morally significant (for example, harms others);
  • that X has an alternative Y;
  • that X could cause some consequence Y;
  • that W is how to perform X; and
  • any combinations of the above.

There is also an important distinction between occurrent and dispositional beliefs/credences. Occurrent beliefs are consciously thought, considered, or felt, whereas dispositional beliefs are not occurrent but are disposed to be occurrent under certain conditions. Finally, often the concepts of knowledge, awareness, foresight, and ignorance are used in the literature to refer to relevant epistemic states. While the traditional view is that ignorance is the lack of knowledge and that awareness is knowledge (or justified true belief), recent theorists of the epistemic condition take true belief to be necessary and sufficient for awareness, and they identify ignorance as the lack of true belief, the opposite of awareness (Peels 2010; Rosen 2008; Rudy-Hiller 2017). Partly for this reason, and for the reason that there is a plausible argument for thinking that the lack of knowledge (even justified true belief) that an act is wrong is no excuse for performing wrongdoing if one still believes that it is wrong (Rosen 2008), positions in the literature tend not to be couched in terms of knowledge. Like awareness, foresight (of consequences) tends to be analysed in terms of true belief as well (Zimmerman 1986).

It is clear, then, how wide the topic of the epistemic condition could be. But given the typical focus in responsibility studies on blame, rather than praise, and on actions/omissions and their consequences, it is unsurprising that the current focus of the debate has been on blameworthiness for actions/omissions and their consequences. Moreover, given the conceptual links between culpable conduct (that is, conduct for which one is blameworthy) and wrongful conduct, or conduct that is bad in some other way (for example, the “suberogatory”; McKenna 2012, 182-3), the focus has largely been on whether awareness of our conduct’s wrongfulness (or badness) is required to be blameworthy for performing it (Section 2). Partly because some views in this debate invoke the notion of blameworthiness for consequences of our conduct, too, there is also an interrelated literature on whether, and if so, what kind of epistemic condition must be satisfied to be culpable for the bad consequences of our conduct (Section 3).

The focus on whether awareness of wrongdoing is necessary for blameworthiness has also been spurred on by interest in the revisionary implications of a view known as “volitionism” or “strong internalism” (see Strong Internalism (aka Volitionism) below). The revisionary implications in question are that we should revise most of our ordinary judgments and practices of blame. There are also views on the epistemic condition for derivative responsibility (in particular, Foresight and Foreseeability Views) that have similar sorts of revisionary implications that have been brought to the attention of philosophers in the debate on derivative responsibility (cf. Vargas 2005). Not surprisingly, many of the positions in these debates have been offered as attempts to avoid these revisionary implications and vindicate our ordinary judgments and practices of blame. In recent times, though, discussion of the relative merits of these non- or semi-revisionary views has come to take centre stage, and the literature will undoubtedly continue to move away from the question of how to respond to revisionism (see Section 4 Future Areas for Research).

2. The Epistemic Conditions of Culpable Misconduct

What are the epistemic conditions on blameworthiness for wrongful (or bad) conduct? A useful initial way to carve up the literature on this question is to divide views into culpability internalist and culpability externalist kinds. This is, of course, to use terminology familiar to theorists of rationality, motivation, knowledge, and epistemic justification. But internalist/externalist terminology is not without some precedent in the literature on the epistemic condition (Husak 2016; Wieland 2017; Cloos 2018), even though the distinction is not often clearly defined. Let us define culpability internalism as follows:

Culpability internalism

An agent is non-derivatively (directly, or originally) blameworthy for some conduct X only if, at the time of X, the agent possesses a belief/credence concerning X’s badness or X’s bad-making features (or a higher-order belief/credence about the need to have the capacity to form such a belief/credence).

(The qualification in parentheses becomes relevant when we discuss Capacitarian Views below.) Culpability externalism is then the denial of culpability internalism. To use George Sher’s (2009) pithy phrase, culpability externalism affirms the possibility of “responsibility without awareness.” The difference between culpability internalist and externalist views is best not defined in terms of awareness, though, since there are intuitively internalist views which regard acting contrary to one’s mistaken belief in wrongdoing to be blameworthy (Haji 1997). Thus, if a position demands belief in wrongdoing for the wrongdoing to be non-derivatively culpable, then the position is a form of culpability internalism. If, by contrast, a position demands only the capacity to believe that one’s conduct is wrong for it to be non-derivatively culpable, then the position counts as externalist.

The distinction between internalist and externalist theories of the epistemic condition, while useful, is very broad-brush. Fortunately, we can group views more informatively along the lines of what they take to support an internalist or externalist condition, for there are at least four different types of views about the underlying grounds for an epistemic condition: (1) basic views, (2) control-based views, (3) capacitarian views, and (4) quality-of-will views of the epistemic condition for culpable misconduct. Basic views hold that an epistemic condition is basic, that is, not based on any other condition for blameworthiness. Control-based views hold that an epistemic condition is based (partly) on the control or freedom condition for blameworthiness. Capacitarian views hold that an epistemic condition is based (partly) on a capacity-for-awareness condition of blameworthiness. And quality-of-will views hold that an epistemic condition is based (partly) on a quality-of-will condition for blameworthiness. This more informative taxonomy will be used to structure the overview of the debate on the epistemic condition for culpable misconduct.

a. Basic & Control-Based Views

Basic and control-based views tend to be treated as one family in the literature, as distinguished from the rest, and so the two will be treated together in the following sub-section.

According to basic views, an epistemic condition is a basic condition of culpability for misconduct. That is, it is not based even partly on any other condition for blameworthiness. There may be a control or quality-of-will condition for culpable misconduct, but such a condition is entirely independent of the epistemic condition; or there may be no other condition for culpable misconduct than an epistemic condition. Michael Zimmerman (1997), for example, identifies awareness as a “root requirement” of responsibility. And according to Alexander Guerrero (2007), a meat-eater is blameworthy simply if they eat meat while knowing that they don’t know whether the source of meat has “significant moral status.” Nothing else is required. Usually, the support for basic views is a mere appeal to intuition; Guerrero (2007), however, argues that his principle is supported by theories of right and wrong.

According to control-based views, an internalist/externalist epistemic condition is based (partly) upon the control condition for blameworthiness (“partly,” in order to accommodate views on which the epistemic condition is not entirely a subset of the control condition). Typically, the epistemic condition is internalist. The idea may be that a belief in the moral significance of the act is part of having the right sort of control at the time of the act: for example, “enhanced control” (Zimmerman 1986), the ability to do the right thing for the right reasons (Husak 2016; Nelkin and Rickless 2017), or the rational capacity to meet a reasonable expectation to act differently (Levy 2009; Robichaud 2014; cf. Rosen 2003).

Basic and control-based theorists are almost always internalists, and a distinction is usually drawn within basic and control-based internalism between a strong internalist view known as “volitionism” and weaker forms of basic or control-based internalism. Plausibly, though, there are basic and control-based theorists who are externalists about the epistemic condition—even though theorists of this kind tend not to be actively involved in the debate on the epistemic condition. This section will discuss, in turn, strong internalism, weak internalism, and then the possibility of basic and control-based externalism.

i. Strong Internalism (aka “Volitionism”)

Several philosophers (Levy 2009, 2011; Rosen 2003, 2004, 2007; Zimmerman 1997) defend the “strong internalist” (Cloos 2018) thesis, which also goes by the name of “volitionism” (Robichaud 2014), that blameworthiness for misconduct is, or is traceable to blameworthiness for, an act done in the occurrent belief that the act is (all-things-considered) wrong. Whether the belief must be true, and so the act objectively wrong, is debated. Since akrasia is (often) defined as acting contrary to such an overall moral or all-things-considered judgment, strong internalism is often described as requiring “akrasia” for blameworthiness (Rosen 2004; Levy 2009). And it is often described as requiring “clear-eyed” akrasia in particular (FitzPatrick 2008), because it requires that one acts contrary to this belief when occurrent.

Why accept strong internalism? The key reasons are that (a) someone is blameworthy for an act only if it is either an instance of clear-eyed akrasia, or done in or from culpable ignorance; and (b) ignorance is culpable only if culpability for the ignorance is itself traceable to an instance of clear-eyed akrasia. “Ignorance” here means the lack of an occurrent true belief in the wrongfulness of the act.

In support of (a), everyone in the debate agrees that clear-eyed akratic wrongdoing is blameworthy (perhaps even the paradigm case of blameworthiness). Deliberately cheating on your partner while consciously knowing that it is wrong to do so is obviously blameworthy, provided that the non-epistemic conditions on blameworthiness are met. But when the agent acts in or from ignorance of wrongdoing (when the wrongdoing is “unwitting”; Smith 1983), strong internalists appeal to the intuition that they can still be blameworthy for wrongdoing but only through blameworthiness for their ignorance. Thus, the pilot who does not know that she is wrongfully initiating take-off without disengaging the gust lock is still blameworthy if she is blameworthy for failing to know that the gust lock is still engaged. This is a case of “factual ignorance,” where the agent’s ignorance of wrongdoing is owing exclusively to ignorance of some non-normative fact. But strong internalists argue, more controversially, that the same principle applies “by parity” (Rosen 2003) to “moral ignorance,” where one’s ignorance of wrongdoing is owing to ignorance of moral truth. (Indeed, some strong internalists [Rosen 2003] argue that the principle applies even to all-things-considered normative ignorance or ignorance of the way that morality trumps self-interest under the circumstances.) Thus, the Battalion 101 policemen who executed Jewish women and children in the horrific Józefów Massacre (1942) would still have been blameworthy for the massacre if they did not know that they were doing wrong but were blameworthy for being ignorant of their wrongdoing.

However, strong internalists appeal to more than a mere intuition to bolster the claim that when the act is unwitting, it is culpable only if it is done in or from culpable factual or moral ignorance. They cite considerations of control in support of (a). When the agent is ignorant, the agent no longer has the relevant abilities (for example, Levy’s [2009] “rational capacity”) to avoid wrongdoing or to act deliberately (Zimmerman 1997, 421-22); it would no longer be reasonable to expect them to act differently (Rosen 2003; Levy 2009), and it would be inappropriate to react with the blaming emotions to the wrongdoer. But if the ignorance is culpable in the first place (as we shall see, due to the presence of these abilities at an earlier time), then lacking these abilities is no legitimate block for blame.

Intuitions of blameworthiness and control-based considerations are also adduced in support of claim (b), that ignorance is culpable only if culpability for the ignorance is itself traceable to an instance of clear-eyed akrasia. But a couple of further points are needed in support of (b). The first is that ignorance cannot be directly blameworthy (like akratic conduct), because the thesis of doxastic voluntarism is false: we do not have direct voluntary control over our belief-states. Thus, at best, ignorance can only be indirectly culpable through indirect control over it, which involves having direct control over prior acts that can (foreseeably) cause the formation or retention of such a state. (Strong internalists take a foresight or foreseeability view of responsibility for consequences; see Section 3, The Epistemic Conditions of Derivative Responsibility.) Ignorance-causing/-sustaining acts are, of course, known as “benighting acts,” after Holly Smith (1983). And everyone agrees with Smith that benighting acts must be culpable for the ignorance to be culpable. So strong internalists argue that ignorance is culpable only if culpability for ignorance is traceable to culpability for a benighting act. Not just any benighting act will do, however: the distinctive feature of strong internalism (and what goes beyond Smith’s work) is that culpable benighting acts must themselves either be occurrently akratic or culpably unwitting. Why not, after all, think that the principles already on the table regarding culpability for wrongdoing apply equally well to culpability for benighting acts (Rosen 2004, 303)? Furthermore, since unwitting acts are never directly culpable, strong internalists envision the possibility of yet further tracing when the benighting acts are unwitting, through an indefinitely long “regress” or “chain of culpability” (Zimmerman 2017), whose origin must lie in a case of clear-eyed akrasia. The result is the strong internalist’s “regress argument” (Wieland 2017).

Herein lies the strong internalist’s much-discussed revisionism about blameworthiness ascriptions. The strong internalist regress can end only with a case of clear-eyed akrasia, but how easy is that to find? Zimmerman and Levy argue that clear-eyed akratic benighting acts are extremely rare or at least rarer than many think (Levy 2011, 110-32; Zimmerman 1997, 425-6). How often are we in a position to take a precaution against ignorance but decide contrary to our all-things-considered moral judgment to forgo that precaution (and thereby commit a culpable benighting act)? The answer appears to be “not often.” Levy (2011, 121-2) appeals to compelling empirical work which supports this answer. In contrast to Zimmerman and Levy, Gideon Rosen (2004) argues less that the regress makes culpability rare than that the regress recommends skepticism about moral responsibility. Any instance of akrasia, he argues, is extremely difficult to ascertain, and so blameworthiness is difficult to ascertain. Why is akrasia difficult to ascertain? Rosen cites “the opacity of the mind—of other minds and even of one’s own mind” (2004, 308). Indeed, clear-eyed akrasia may be hard to notice even when we can see into someone’s mind because:

it is not readily distinguishable from an impostor: ordinary weakness of will. The akratic agent judges that A is the thing to do, and then does something else, retaining his original judgment undiminished. The ordinary moral weakling, by contrast, may initially judge that A is the thing to do, but when the time comes to act, loses confidence in this judgment and ultimately persuades himself (or finds himself persuaded) that the preferred alternative is at least as reasonable. (2004, 309)

This problem is then compounded when we have to look into the past to determine an episode of clear-eyed akrasia; and it is probably harder to find such an episode when it is a case of benighting akrasia. Strong internalists therefore argue that we should revise most of our ordinary practices and judgments of blame.

ii. Weak Internalism

One reaction to strong internalism and its culpability revisionism is to argue that the same grounds to which strong internalists appeal to support their view (basic and control-based grounds) in fact support an easier-to-satisfy form of culpability internalism. Call this form “weak internalism,” for the fact that its epistemic requirements are weaker than strong internalist requirements. A number of different views fall under weak internalism.

One is the dispositional belief-in-wrongdoing view, according to which wrongdoing performed with a merely non-occurrent belief in wrongdoing can still be originally blameworthy (Haji 1997; Peels 2011; cf. Husak 2016, ch. 4). In support of this view, Haji appeals to the intuition that:

Tara may be blameworthy for quaffing her third gin-and-tonic even though, at the time, she does not have the occurrent belief that getting inebriated is wrong [but has a dispositional belief that getting inebriated is wrong]. (1997, 531)

Indeed, it is perfectly consistent for the dispositional belief theorist to assert nonetheless that she knows full well that she shouldn’t, even if the circumstances prevent her from having this thought explicitly. But there may be good theoretical reasons to require occurrent belief.

On the widely accepted principle that one is non-derivatively blameworthy for an action only if it would have been reasonable to expect the agent to avoid the action, Levy argues that

we can only reasonably be expected to do what we can do by an explicit reasoning procedure, a procedure we choose to engage in, and when we engage in explicit reasoning we cannot deliberately guide our behavior by reasons of which we are unaware, precisely because we are unaware of them. (2009, 736, n. 16)

If Tara does not have the occurrent thought that it is wrong to have another gin, then how can she engage in an explicit reasoning procedure with the upshot of avoiding wrongdoing? But this, Levy would argue, is required for her to be subject to a reasonable expectation to avoid having another gin and hence to be blameworthy for having it. Dispositional belief theorists might, however, try to resist Levy’s argument here on the grounds that Tara is subject to a reasonable expectation to avoid wrongdoing, despite her dispositional belief in wrongdoing. Perhaps the fact that her belief in wrongdoing would ordinarily be occurrent under the circumstances is sufficient to ground a reasonable expectation to avoid wrongdoing (but see Capacitarian Internalism? below). Or perhaps she has some other kind of occurrent awareness which grounds the reasonable expectation to act differently (cf. “the phenomenology of deliberative alertness”; Yates 2021, 189-90). In the end, though, the dispositional belief theorist could dig their heels in with the reply that accepting Levy’s argument requires far too drastic a revision to our commonsense ascriptions and practices of blame for his conclusion to be acceptable (Robichaud 2014, 149-151). (It is worth noting that Zimmerman himself seems to allow for an exception to his general requirement of occurrent belief in cases of “deliberate wrongdoing in a routine or habitual… manner” [1997, 422; cf. Zimmerman 2017].)

Another set of weak internalist responses challenges the strong internalist’s requirement of belief in wrongdoing, where the content of the belief is in question. Focusing especially on direct culpability for benighting conduct (see also Nelkin and Rickless 2017), Philip Robichaud (2014) has argued that a wrongdoer can be blameworthy even though they have only “sufficient, non-decisive motivating reasons” to act differently. Robichaud defines these as reasons “strong enough” to make it (internally) rational to avoid wrongdoing, but not strong enough to decisively support the avoidance of wrongdoing (2014, 142). To take his example, although we do not believe that we have an obligation (or that we morally ought) to check the functionality of our brake lights every time we go to drive, we may believe that “it would be good” (2014, 143) to check them. “It would be good” or alternatively “it would be safe,” or “I haven’t checked them in a while” (not his examples), would then function as non-decisive motivating reasons to check them and not to ignore them, in contrast to the strong internalist decisive reasons of “it would be wrong not to,” “I overall ought to,” or “I have an obligation to” check them. Suppose, then, that your brake lights were to fail, causing a fatal accident. Robichaud argues that you could be originally blameworthy for the accident, even though you only had these non-decisive reasons. In support of his account, Robichaud appeals to the aforementioned reasonable expectations condition of blameworthiness, and argues, against Levy (2009), that it would be reasonable to expect you to check the brake lights despite having only non-decisive reasons to do so. This is because, he contends, you would still have the rational capacity to check your brake lights under these conditions.

Levy (2016) has responded that acting for non-decisive reasons is too “chancy” to count as making the act one that it would be reasonable to expect you to perform; that is, decisive reasons are required. The reason is that:

when it is genuinely the case that an agent has sufficient but not decisive reasons to choose from two or more conflicting options, chancy factors [such as ‘trivial aspects of the environment or of the agent herself’] will play a decisive role in how she chooses. (2016, 5)

But it is not clear that this should move Robichaud. On some accounts of control (for example, leeway incompatibilist accounts; see Free Will), cases in which one is torn between conflicting motivating reasons to do different things are often regarded as paradigm cases of responsibility-relevant control. Such a conflicted state might provide room for the exercise of agent-causal power on agent-causal accounts such as Roderick Chisholm’s (1976), and so it would not follow from a conflict between non-decisive reasons that “chancy factors” cause the choice. But does it follow that Robichaud needs to help himself to a controversial libertarian account of control to defend his appeal to non-decisive motivating reasons?

Another form of weak internalism that challenges the content of the strong internalist akrasia requirement is Alexander Guerrero’s (2007) moral risk view (cf. also Husak 2016, ch. 3). Guerrero responds to Gideon Rosen’s strong internalism by defending the principle, “Don’t Know Don’t Kill” (DKDK):

[if] someone knows that she doesn’t know whether a living organism has significant moral status or not, it is morally blameworthy for her to kill that organism or to have it killed, unless she believes that there is something of substantial moral significance compelling her to do so. (2007, 78-9)

Thus, DKDK entails that the Battalion 101 shooters would still have been blameworthy if they were merely uncertain whether Jewish women and children have “significant moral status,” and they lacked the belief that something compelled them to perform the executions. Guerrero argues then that a kind of moral recklessness can be grounds for original blameworthiness, alongside cases of clear-eyed akrasia. Indeed, Guerrero believes that forms of moral recklessness other than violating DKDK can be grounds for original blameworthiness too (cf. “Don’t Know Don’t Invade”; 2007, 94); however, he confines his attention to the defense of DKDK. Still, one might be tempted to generalise (and simplify) the view to the following: someone is directly blameworthy for an act only if they believe that the act is wrong or that the act risks wrongdoing (Husak 2016, ch. 3).

Guerrero has already been identified as a basic internalist, and that is because he does not appeal to considerations of control to support DKDK. Rather, he appeals directly to intuitions of culpability, especially in cases of meat-eating under moral uncertainty, but also to theories of right action which would look favourably upon DKDK. Notably, he takes DKDK to be supported by recent theories of what to do under moral uncertainty which (rationally or morally) prescribe taking the least morally risky option. Nevertheless, one could certainly cite control-based considerations to support a moral risk view—for instance, the consideration that moral uncertainty provides a non-decisive motivating reason to avoid wrongdoing.

More critically, if the moral risk view does appeal to a non-decisive motivating reason to avoid wrongdoing, its defender would of course have to deal with Levy’s (aforementioned) luck-based objection to Robichaud’s view. There may also be the problem, from Robichaud’s perspective, of the view being still too restrictive in its appeal to only akrasia or moral recklessness as bases for blameworthiness: for Robichaud, believing that checking the brake lights “would be good” can be epistemically sufficient for blameworthiness. On the other side, the strong internalist could object that there are no cases of moral recklessness without akrasia.

One final version of weak internalism can be found in the work of Carolina Sartorio (2017). According to Sartorio, non-derivative blameworthiness requires awareness of the moral significance of one’s behaviour. Moreover,

being aware of the moral significance of our behavior—could be satisfied in different ways in different circumstances. In circumstances where we act wrongly, it could be satisfied by the awareness that we were acting wrongly, or by the awareness that one ought to have behaved differently. In circumstances where we don’t act wrongly, and perhaps are aware that we don’t act wrongly, it could be satisfied simply by virtue of recognizing that we are acting from morally reproachable reasons. (2017, 20)

The way that Sartorio spells out awareness of moral significance here and throughout the paper seems to indicate that she takes blameworthiness to require awareness of moral significance conceived as such. To use language from the literature, she appears to demand “de dicto” awareness of moral significance (a term derived from “de dicto concern” about morality; Arpaly 2002). An alternative, weaker view would have it that mere de re awareness of moral significance could be epistemically sufficient for blameworthiness, where de re awareness of moral significance would simply be awareness of features of the act that, as a matter of fact, make the act have its moral significance, whether or not there is awareness of its moral significance as such.

But now internalists might wonder whether de dicto awareness of moral significance is really required for blameworthiness. Quality-of-will internalists deny this requirement (see below). But recall Robichaud’s view that non-decisive motivating reasons suffice, where such a reason could be “I haven’t checked the brake lights in a while” (not his example). This would be a mere de re moral belief. But now suppose that you had this belief while lacking the morally de dicto belief that “therefore, checking the brake lights is now morally right, obligatory, or good.” Even so, it seems that having this de re belief could be sufficient epistemic grounds for you to be blameworthy for causing an accident.

Whether or not Sartorio has a successful response to this objection, however, it is worth noting that she tries to account for an intuition of blameworthiness in a certain range of cases that have not been given enough attention in the epistemic condition literature. These are so-called “Nelkin-variants” of Frankfurt-style (1969) cases. Suppose that Jones shoots Smith even though he could not have done otherwise; a mad neuroscientist would have intervened if Jones faltered. According to Frankfurt and many of his followers (including Sartorio), Jones can still be blameworthy if he chooses to shoot Smith for reasons of his own. Now a “Nelkin-variant” of this Frankfurt-style case (named after cases raised in Dana Nelkin’s earlier work, cited in Sartorio 2017) would be one in which Jones becomes aware of the fact that a mad neuroscientist will intervene if Jones falters in his attempt to shoot Smith, and thereby comes to believe that he has no alternative to shooting Smith. Jones becomes aware of the neuroscientist’s intentions “at some point during the process” (2017, 8) resulting in the shot but in a way that (allegedly) leaves Jones unaffected, preserving his acting on the basis of his own reasons. On Sartorio’s view, Jones may still be blameworthy for shooting Smith if he “makes the choice completely on his own, on the basis of his own reasons (morally reproachable reasons, such as a desire for revenge), in exactly the same way he would have made it if he hadn’t been aware of the neuroscientist’s presence” (2017, 19). He would only need awareness of acting on those morally reproachable reasons. The upshot, for Sartorio, is that belief in alternatives is not an epistemic requirement on culpable conduct.

Plausibly, however, most of the views that we have discussed so far (especially due to Levy, Rosen, Robichaud, and Guerrero) assume such a requirement, and so we might wonder whether they are open to a plausible defense of this requirement. Perhaps they could question whether it is really possible (as Sartorio contends) for Jones to become aware of the neuroscientist’s presence and not let that affect his own assessment of his reasons to shoot Smith or of his alternatives. Perhaps he still has the (micro) alternative of shooting Smith not from his own reasons but by giving in to the neuroscientist’s manipulation. Thus, maybe awareness of this alternative is needed for Jones to be blameworthy.

We have canvassed a range of different weak basic and control-based internalist responses to strong internalism, but it is of course possible to combine elements of each. Robichaud (2014), for example, couples his appeal to non-decisive motivating reasons with an appeal to mere dispositional belief. This would further enable such views to account for the commonness of culpability. More recently, Thomas Yates (2021) has provided a sustained defense of weak control-based internalism which incorporates distinctive elements of each of the above views with his requirement, on direct culpability, that the wrongdoer has outweighing motivating reasons to avoid wrongdoing that are based upon the normative reasons to avoid wrongdoing.

iii. Basic and Control-Based Externalism

It would be premature to shift away from basic and control-based views without briefly discussing a sub-variety of these views that appears in ethics and the philosophy of action but that does not feature actively in the literature on the epistemic condition. This would be the sub-variety of basic and control-based views that are externalist about culpability, on which culpability internalism is false but on basic or control-based grounds. Consider, for example, a view on which freedom or control over wrongdoing is necessary and sufficient for it to be culpable, but where the relevant control does not include a belief/credence according to which one’s conduct is bad. (Such a view might still, of course, involve awareness of what one is doing, and of alternatives, but it would not count as internalist, unless this awareness entailed having a belief/credence in the badness or bad-making features of one’s conduct.) Those who tend to run together “free action” and “action for which one is morally responsible” might endorse such a view. Roderick Chisholm, for instance, states that a “responsible act is an act such that, at the time at which the agent undertook to perform it, he had it within his power not to perform the act” (1976, 86). Michael Boylan (2021) also ties responsibility and freedom tightly, and he contends that the judgments of right or wrong “assign praise or blame” (2021, 4-5). Indeed, on his view, ethics concerns only those actions that originate from “the free choice to do otherwise,” the same freedom that grounds moral responsibility for one’s actions. Later in the book, Boylan argues that cases of factually ignorant wrongdoing involve breaches of a prior duty (of “authenticity”) to “engage in all reasonable steps to properly justify a belief” (2021, 33), no doubt to justify it with respect to the “common body of knowledge” (2021, 34). Thus, as long as Boylan thinks that freely breaching a duty is culpable and need not involve awareness of that duty (or of the reasons for its application in the circumstances), such a view would then count as externalist. As on weak internalist tracing views such as Robichaud’s (2014), culpability for unwitting wrongdoing would not need to be traced back to culpability for clear-eyed akrasia. Nevertheless, culpability for the benighting act would be even easier to satisfy than on weak internalist views. (See Epistemic Vice Theories for a similar form of culpability externalism.)

While basic and control-based externalists may have the advantage of explaining more of our commonsense intuitions of blameworthiness than internalist views, many internalists would argue that basic and control-based externalists give us far too many false positive verdicts of blameworthiness. Consider, for example, that such views, if wedded to a simple conception of the ability to do otherwise, could easily pronounce youth, the elderly, the mentally impaired, the morally incompetent, and the morally ignorant (for example, cult members), blameworthy for their conduct, even though we might find it natural to excuse these wrongdoers. Proponents of such views must also find a way to successfully rebut internalist arguments to the effect that control-based considerations justify internalist requirements on culpable misconduct (see the debate between Levy and Robichaud above). Indeed, most control-based theorists of the epistemic condition think that there is more to culpability than wrongdoing or wrongdoing plus the ability to do otherwise.

b. Capacitarian Views

Another broad family of views on the epistemic condition for culpable misconduct goes by the name of “capacitarian” views (Clarke 2014, 2017; Murray 2017; Rudy-Hiller 2017 [who coined the term]; and Sher 2009). Their basic idea is that having the unexercised capacity for awareness without actual awareness of the act’s bad-making features can be grounds for direct blameworthiness. Thus, if a pilot initiates take-off despite failing to notice the engaged gust lock, the idea is that the pilot could still be directly blameworthy for doing so (and for thereby risking the lives of all the passengers on board) if the pilot could have been aware (that is, had the unexercised capacity to be aware) of the engaged gust lock. More conditions are added, but that is the core idea.

Some capacitarians are interested in giving a capacitarian account of control (Clarke 2017; Murray 2017; Rudy-Hiller 2017), and so it could be argued that they advocate a type of control-based account. However, some capacitarians (for example, Sher 2009, 94) deny that they are giving an account in terms of control. Moreover, the control-based views above tend to have the distinctive features that (i) culpable conduct is due to the volitional exercise of one’s capacities, in contrast to the capacitarian’s unanimous appeal to unexercised capacities (but see Nelkin & Rickless 2017); and (ii) the capacities that are emphasised as needed are capacities to act or omit rather than capacities for awareness.

Capacitarian views are externalist—or at least capacitarianism “proper” is externalist. But there seems to be the possibility of “a capacitarian” (Rudy-Hiller 2019, 726) view which nevertheless requires a certain kind of awareness of moral significance, albeit not a first-order awareness of the bad-making features of one’s conduct. Capacitarianism proper will first be discussed before the possibility of “capacitarian internalism.”

i. Capacitarian Externalism

Capacitarianism proper is externalist: it holds that original blameworthiness for misconduct requires either awareness or the capacity for awareness of that conduct’s bad-making features. (The capacity for awareness of these features also does not depend on possessing actual beliefs or credences in one’s conduct’s bad-making features.) The view is disjunctive, because capacitarians allow blameworthiness in cases of acting in awareness of the bad-making features as well. Capacitarians demand the satisfaction of other conditions related to the exercise of the capacity, too. Fernando Rudy-Hiller (2017, 405-6) states his capacitarian view as follows: when the agent is ignorant of some (non-moral) fact, they are blameworthy for their unwitting conduct (and their ignorance) only if they should and could be aware of that fact, where being able to be aware of this fact involves not only capacities to be aware of it but also the (fair) opportunity to exercise those capacities. And Rudy-Hiller’s view is representative. To illustrate, the three essential elements of a capacitarian view are (a) that the pilot must have the unexercised capacity to notice the engaged gust lock, (b) that the pilot must have the (fair) opportunity to (exercise the capacity to) notice the engaged gust lock, and (c) that the pilot should notice the engaged gust lock.

One significant advantage of capacitarianism is that it can accommodate folk intuitions of blameworthiness for so-called “unwitting omissions” (Clarke 2014), that is, cases of failing to do something you ought to do while lacking awareness of that failure. The case of the pilot failing to disengage the gust lock before taking off is one such example. (Indeed, the unwitting omissions that capacitarians typically have in mind are factually unwitting, although there may be reason for capacitarians to extend their accounts to cover cases of morally unwitting omissions too.) But another intuition that capacitarians account for is the intuition that culpability for unwitting omissions (or a subset of them) does not trace back to culpability for a benighting act. Now a tracing strategy could probably be employed to explain the pilot’s culpability in the airplane crash case (grounding culpability in the earlier failure to run through the pre-flight checklist); and indeed, tracing critics of capacitarianism have argued that many of the proposed “non-tracing” cases can be given a plausible tracing analysis (see Nelkin & Rickless’ [2017] discussion of cases given by Sher and Clarke). But let us try to consider an uncontroversial non-tracing case. Suppose that “a house burns down because someone forgot to turn off a stove” (Clarke 2017, 63), but where the culprit (call him Frank) has never before forgotten to turn it off, and where it never occurred to him this time, or ever, to be more vigilant about turning it off after using it. Even so, many of us report intuitions of blameworthiness. It might, after all, seem fair for the landlord or family member to blame Frank (morally) for the house fire, especially after learning that he forgot to turn off the stove. And yet Frank was not aware of leaving the stove on at all, let alone aware of its being wrong to do so. Thus, it looks like internalist views are in trouble. But capacitarians can account for the intuition of culpability by appealing to Frank’s capacity to notice the stove, opportunity to exercise this capacity, and obligation to notice the stove.

While all capacitarians endorse this thesis about direct blameworthiness, some—for example, Rudy-Hiller (2017, 417)—also require that the ignorance is culpable for the unwitting conduct to be culpable, but others deny this. Clarke (2014, 173-4) argues that the ignorance need only be faulty for the unwitting conduct to be directly culpable, while tracing would be required to explain culpability for the ignorance. But Rudy-Hiller does not think that a culpable ignorance requirement entails that culpability for unwitting conduct is derivative of culpability for the ignorance. Rather, he thinks that both the ignorance and the unwitting conduct are under “direct” capacitarian control (apparently accepting a kind of doxastic voluntarism).

Capacitarians generally agree on which kinds of cognitive processes or faculties constitute cognitive capacities; however, they disagree on how exactly to characterise them. Some also try to unify them under one “mother” capacity, for instance, vigilance (Murray 2017). On which kinds of faculties constitute cognitive capacities, Clarke has a useful passage cataloguing the relevant capacities:

Some are capacities to do things that are in a plain sense active: to turn one’s attention to, or maintain attention on, some matter; to raise a question in one’s mind or pursue such a question; to make a decision about whether to do this or that. These are, in fact, abilities to act. Others, though capacities to do things, aren’t capacities whose exercise consists in intentional action. These include capacities to remember, to think of relevant considerations, to notice features of one’s situation and appreciate their normative significance, to think at appropriate times to do things that need doing. (2017, 68)

Most capacitarians allow both kinds of capacities; however, some do not allow the first class of capacities that consist in abilities to act. For example, Sher argues that “if we did construe the cognitive capacities as ones that their possessors can choose to exercise, then we would have ushered [an internalist control-based view] out the front door only to see it reenter through the back” (2009, 114). It is not clear, however, that allowing these capacities to act would involve smuggling such a view back in, for capacitarians need not hold that as soon as we enter any domain of agency or choice, let alone the domain of exercising cognitive capacities, internalist conditions need to be met.

Capacitarians face the challenge of answering what it takes to have a relevant capacity for awareness. Clarke and Rudy-Hiller take a view on which the agent has the relevant capacity if, on similar occasions in the past, they have become aware of the relevant bad-making features. By contrast, Sher adopts a counterfactual analysis of capacities, according to which someone has the relevant capacity if she would have been aware of the relevant facts in a range of other similar circumstances (2009, 114). Whichever way we spell out the relevant capacity, there are some unique challenges that need to be met. For both the past-occurrences and counterfactual views, we might ask what (past or counterfactual) circumstances count as “sufficiently similar.” And concerning the past-occurrences view, we might be concerned with cases in which the agent has lost their capacity for awareness since they were last relevantly aware (Sher 2009, 109).

For capacitarians, having the capacity for awareness means nothing without a (fair) opportunity for it to manifest. Rudy-Hiller, for instance, requires that there are no “situational factors that decisively interfere with the deployment of the relevant abilities” (2017, 408). Frank would be excused for failing to turn off the stove if Frank collapsed with a heart attack during his cooking (although it is dubious that failing to turn off the stove would still count as wrong in this case). Clarke says something similar, although he argues that it is enough for an excuse that situational factors “sometimes mask… the manifestation of psychological capacities without diminishing or eliminating them” (Clarke 2017, 68). Imagine, instead, that Frank merely fell ill for the next couple of hours and had to lie down. In cases like this, Clarke argues that it would “not be reasonable to expect [him] to remember or think to do certain things that [he] has a capacity to remember or think to do” (2017, 68).

The last key requirement, according to the capacitarian, is that the agent should have been aware of the relevant considerations at the time of their action or omission. Why is such a condition indispensable? Well, just as internalist tracers argue that blameworthiness for an unwitting act requires performing a benighting act that falls below a standard that would have been reasonably expected of them, so capacitarians contend that blameworthiness requires that the agent’s awareness fell below a certain “cognitive standard” (Clarke 2014) that would have been reasonably expected of them. If, for example, Frank fell ill while cooking, it seems false that Frank ought to have remembered that the stove was on, for such a standard seems too harsh or demanding. Capacitarians disagree, however, on whether this standard is set by an obligation (Rudy-Hiller 2017, 415; Murray 2017, 513) or merely a norm (Clarke 2014, 167) of awareness.

A number of objections to capacitarianism, in addition to the problems for giving an adequate account of capacity for awareness, have been raised in the literature. One objection is that the appeal to capacities fails to capture anything that is morally relevant for attributions of moral responsibility. Sher (2009), for instance, argues that the fact that wrongdoing originated from the wrongdoer is sufficient for the wrongdoer’s culpability, never mind whether they had control, freedom, or ill will (see Quality-of-Will Views below). Sher’s story is complicated, and appeals to the way that we react, as blamers, to the whole person when we blame them, to all their psychological capacities, and not only to their vices. But A. Smith (2010) has argued that attributability via origination threatens to collapse attributions of moral responsibility into attributions of causal responsibility. Indeed, the problem seems particularly acute for accounts such as Sher’s which deny the control condition of blameworthiness, since those who appeal to control at least try to appeal to a widely accepted basis for responsibility attributions. Thus, a good deal seems to ride on a successful defense of the notion of capacitarian control.

A second objection is the reasonable expectations objection raised by Levy (2017) (cf. also Rudy-Hiller 2019). As we have seen, capacitarians appeal to the way that their conditions ground a reasonable expectation to avoid unwitting misconduct. Levy, however, argues that capacitarian conditions fail to ground such a reasonable expectation, because expecting someone to avoid wrongdoing through the exercise or activation of a capacity for awareness is expecting someone to avoid wrongdoing “by chance or by some kind of glitch in their agency” (2017, 255). The problem is especially pressing when one considers those capacities that are not, as Clarke describes them, “capacities to act,” and so it might be in the capacitarian’s interests to restrict the relevant capacities to those that require “effort to appropriately exercise” (Murray 2017, 516). Past-occurrent capacitarians could also reply (as they have done) that:

if an agent has demonstrated in the past that she has a certain capacity and there is no obvious impediment to its manifestation in the present circumstances, then it is reasonable to expect her to exercise it here and now. (Rudy-Hiller 2019, 734)

Even so, Levy’s point is that they would need awareness of the fact that, for example, their mind is wandering for them to have the right sort of control over their capacities, but (1) this is not required by capacitarians (at least of the externalist variety; see below) and (2) this awareness itself is not under the agent’s voluntary control (2017, 255). Rudy-Hiller (2019; see Capacitarian Internalism?) has also argued that there are cases in which the present circumstances are sufficiently different from previous circumstances (in which you demonstrated the relevant capacity for awareness), such that the agent in the present circumstances lacks awareness of the risk of not being aware of the relevant facts, and therefore lacks awareness of the need to “exert more vigilance in the particular circumstances she [is] in” (2019, 735). In these cases, he argues, it is not reasonable to expect the agent to avoid wrongdoing.

Of course, the capacitarian could deny the widely accepted reasonable expectations conditions of blameworthiness. But this would seem to come at the high price of exacerbating the first problem (above) of how to avoid collapsing moral responsibility into causal responsibility. William FitzPatrick (2017, 33) also argues that rejecting this condition fails to account for the way that reasonable expectations are grounded in moral desert, an indispensable aspect of blameworthiness on his view.

ii. Capacitarian Internalism?

But another response to the reasonable expectations objection to capacitarianism proper is to amend capacitarianism so as to include an awareness condition after all. This is Rudy-Hiller’s revised (2019) view. According to this view, the core elements of capacitarianism are left intact and constitute part of what he calls “cognitive control,” but the other part involves an awareness-of-risk condition, that is, awareness of the risk of “cognitive failure” (for example, awareness of the risk of not noticing that the stove is still on), and a know-how condition, involving awareness of how to avoid that cognitive failure in the circumstances. Rudy-Hiller argues that these conditions need to be added, because without having been in similar circumstances in the past, agents are “in the dark regarding the risks associated with allocating cognitive resources in certain ways and therefore… in the dark regarding the need to exercise that capacity” (2019, 731). Indeed, Rudy-Hiller would argue that these agents are “entitled to rely on the good functioning of [their] cognitive capacities without having to put in special effort to shore them up” (2019, 732; emphasis added). Thus, it turns out that many unwitting wrongdoers are blameless in the end, because they fail to satisfy the awareness-of-risk and know-how conditions. Imagine that Frank’s partner announces halfway through his meal preparation that her friends are coming over, and that they are gluten-free, and so he must now change his cooking plans to accommodate them. He has never had to do this. Suppose then that they arrive and he keeps himself occupied by being a good host. Unfortunately, this means that he is no longer mentally present enough to remember to turn the stove off, and it causes a kitchen fire partway through the evening. In this case, Rudy-Hiller would say that Frank is blameless, because he is not especially aware of the risk of failing to notice that the stove is still on.

Such a view seems to count as an internalist view, not only in the spirit of its appeal to awareness, but in the contents of the awareness itself. While it does not involve awareness of the badness or bad-making features of the wrongful omission, it does involve a kind of higher-order awareness of the need to have the capacity for awareness of those features (whatever they may be). (This then explains the parenthetical disjunct in the definition of culpability internalism above.) That being said, one could argue that failing to exercise enough vigilance is itself a wrongful mental omission which explains the subsequent omission to turn the stove off. If so, then awareness of the risk of failing to exercise enough vigilance in the circumstances satisfies the ordinary internalist requirement of possessing a “belief/credence in the bad-making features of one’s conduct.”

Rudy-Hiller’s capacitarian internalist view has certainly much to be said for it, and it is yet to receive significant criticism. However, it is unlikely to move those who wish to accommodate a strong intuition of culpability even in these special cases of “slips.” Rudy-Hiller sacrifices this advantage for the benefit of preserving the reasonable expectations and control conditions on responsibility. We might also wonder to what extent Rudy-Hiller’s capacitarian internalism is not a closet tracing view (a variation on the control-based internalist views from the last section) if it can intelligibly be argued that the omission to exert enough vigilance in the circumstances is a separate “benighting” mental omission that gives rise to the subsequent “unwitting” omission. These would, after all, be cases in which “the temporal gap between it and the unwitting [omission] is infinitesimal” (Smith 1983, 547).

c. Quality-of-Will Views

Another set of views on the epistemic condition for culpable misconduct approaches the topic from an entirely different perspective. According to these so-called “quality-of-will” views (which are also known as “attributionist” views, even though this term has been used for some capacitarians), blameworthiness for misconduct requires that a bad quality of will was on display in that misconduct, or in prior (benighting) misconduct. Moreover, the question of the epistemic condition for blameworthiness is to be answered by inquiring into the epistemic condition for the display of ill will. Thus, what licenses culpability ascriptions is not primarily control, as on control-based views, nor capacities, as on capacitarianism, but a bad will.

The basic idea of quality-of-will theories is simple and intuitive: the Battalion 101 shooters are blameworthy for their participation in the Józefów Massacre because they displayed an egregious disregard for the lives of Jewish women and children. The pilot who takes off without disengaging the gust lock acts carelessly and recklessly.

The main varieties of quality-of-will views are moral quality-of-will views and epistemic vice theories. Moral quality-of-will views appeal to morally reproachable qualities of the will (such as disregard for what’s morally significant). Epistemic vice theories are regarded in this article as quality-of-will views because they ground culpability for unwitting wrongdoing ultimately in the expression of a bad epistemic quality of will—for example, the epistemically vicious traits or attitudes of carelessness, inattentiveness, or arrogance. As we will see, moral quality-of-will views fall on either side of the culpability internalism/externalism debate, but epistemic vice theories are externalist.

i. Moral Quality-of-Will Theories

Moral quality-of-will theories appeal to morally reproachable qualities of the will. Accordingly, the “display of ill will” has been analysed in terms of the act’s expressing or being caused by an inadequate care for what’s morally significant (Arpaly 2002; Harman 2011), indifference towards others’ needs or interests (Talbert 2013, 2017; McKenna 2012), objectionable evaluative attitudes (A. Smith 2005), and reprehensible desires (H. Smith 1983, 2011).

These theorists are united in their view that one can be directly blameworthy for wrongdoing, even if it is done in the absence of a belief in wrongdoing or a de dicto belief in the moral significance of the act (against, for example, Sartorio). Even if the Battalion 101 shooters did not know that it was wrong to murder Jewish women and children, they are directly blameworthy for doing so, because they displayed an objectionable disregard for the moral status (humanity, etc.) of their victims. For some quality-of-will theorists (Talbert 2013, 234), this holds even if the shooters’ moral ignorance was blameless (or epistemically justified), given widespread cultural acceptance of the inferior status of Jews in Nazi Germany. However, others (Harman 2011, 461-2) would still require that their moral ignorance was blameworthy, even if culpability for their ignorance did not explain culpability for their unwitting wrongdoing. Nevertheless, quality-of-will theorists tend to make it easier than control-based theories for attitudes or states such as ignorance to be culpable, for these states tend to be regarded as directly, rather than indirectly, culpable, and under the same conditions as actions are culpable—namely, when they display ill will (consider, for example, prejudiced or misogynistic beliefs about women; Arpaly 2002, 104). Indeed, these theorists typically do not promote tracing explanations, because, like their “real self” forebears (Watson 1996), they hold that the relevant responsibility relation between agent and object (act, belief, etc.) is an atemporal or structural relation between the agent’s quality of will and the object of responsibility assessment. Not surprisingly, then, moral quality-of-will theorists tend not to focus on benighting conduct. But they could easily extend their views to cover benighting conduct in the way that epistemic vice theorists do below, or by appealing to the notion of motivated ignorance.

Moral quality-of-will theorists are divided on the culpability internalism/externalism debate. Matthew Talbert (2013) and Elizabeth Harman (2011, 460) are internalist, because they argue that caring inadequately for what is morally significant requires awareness of what is morally significant. Hence, they require only de re moral awareness, or awareness of the bad-making features of their conduct. Talbert has probably produced the most sustained defense of this idea. Suppose that walking on plants turns out to be wrong because it causes them to suffer, and you are ignorant of plant suffering (Levy’s [2005] example). Talbert argues that ignorance of plant suffering would excuse you from blame because doing so would not express “a judgment with which we disagree about the significance of the needs and interests of those [plants] affected by the action” (2013, 244). However, if you were to become aware that plants suffer, then you would no longer be excused for walking on plants, even if you believed that it was permissible to continue walking on them. This is because you now express a judgment concerning plant suffering that we disagree with, the judgment that plant suffering does not matter, or should not be regarded like the suffering of other living things.

Some moral quality-of-will theorists, by contrast, do not require awareness of misconduct’s bad-making features for it to be culpable. Most prominently, Angela Smith (2005) has argued that, among other things, unwitting omissions—such as her case of omitting to send your friend a card on her birthday because you have forgotten it is her birthday—are directly culpable because these omissions and their accompanying ignorance express objectionable evaluative attitudes (for example, the judgment that a friend’s birthday is unimportant). Critical to her argument for the culpability of unwitting omissions is her appeal to the concept of responsibility as answerability—as being open to “demands for reasons or justifications” (2005, 259)—a property which seems applicable to you in the case of forgetting your friend’s birthday. Since these kinds of cases involve the lack of any belief or credence in the bad-making features of one’s omissions (for example, the features that today is your friend’s birthday and that it would be inconsiderate not to give her a call), the view counts as externalist.

Quality-of-will externalists like Smith and capacitarians are therefore similar in two respects: both are concerned with unwitting omissions, and both argue against tracing strategies for explaining culpability for unwitting misconduct. Nevertheless, an important difference between these views is that quality-of-will externalists require displays of ill will for blameworthiness. To the extent that in the above house fire case, Frank has never in his cooking shown an objectionable orientation towards his home and his family (nor the house’s owner), we might think that on this occasion, when he forgets to turn the stove off, Frank does not display any ill will. If so, then even quality-of-will externalists would excuse him for not turning off the stove. We have seen, though, that Frank could easily fulfil the capacitarian’s conditions, and so this is a type of case in which the verdicts of quality-of-will theorists and capacitarians could easily diverge. Admittedly, Smith seems to take it that normal cases of unwitting omissions count as cases involving objectionable attitudes, and so there may not be much of a difference in practice between the verdicts of Smith and capacitarians. But certainly, the contrast between capacitarian views and quality-of-will internalist views is significant. While Talbert (2017) appears to concede to Smith that some cases of factually unwitting omissions are culpable, Talbert argues that “garden-variety” cases of unwitting omissions—including the one about forgetting your friend’s birthday—are not obvious cases of culpability because “quite often, we probably shouldn’t have much confidence that another person’s forgetfulness or his failure to notice something conveys much morally relevant information about what he values” (2017, 30). Talbert thinks (2017, 31ff.) that capacitarians and quality-of-will externalists have intuitions of culpability in such cases because humans have a bad tendency (according to studies in psychology) to attribute ill will to other humans (and even non-humans) when ill will is absent, especially when we see the harmful results of their behaviour.

Three important objections have been raised against moral quality-of-will theories in the literature. The first objection is one that we have already seen raised against capacitarians: quality-of-will theorists cannot account for the reasonable expectations conditions of blameworthiness (FitzPatrick 2017, 33-4). Consider, for example, that it might not have been reasonable to expect the Battalion 101 shooters to avoid participating in the massacre of Jewish women and children if they were entirely oblivious to the fact that it was wrong, but a quality-of-will theorist has the verdict that they are blameworthy. But this kind of case might reveal that there is a problem with the reasonable expectations conditions of blameworthiness, and this is how Talbert (2013) defends his quality-of-will theory.

A second objection to quality-of-will theories is that they collapse the “bad” and the “blameworthy” (Levy 2005)—once again, a similar objection to one raised against capacitarianism. Smith, after all, identifies the “precondition for legitimate moral assessment” (Smith 2005, 240) with the precondition for legitimate responsibility assessment—that is, she identifies “moral criticism” with moral blame. But mere negative moral assessments of a person given their behaviour—that is, judgments of their being vicious, having an objectionable attitude, or lacking sufficient care for others—seem to be crucially different from, and need not imply, judgments of moral responsibility or blameworthiness for the behaviour in question. Perhaps we think that people need the right kind of control over whether they display their ill will in order to be morally responsible for their behaviour (Levy 2005). Not according to A. Smith (2005): she is happy to accept the consequence that she collapses the bad and the blameworthy. But another quality-of-will response is to accept that this is a problem and try to explain the difference.

According to Holly Smith, we can “appropriately think worse of a person” who expresses a single or “isolated” quality of will that is objectionable, but we cannot blame her, unless she reveals “enough of her moral personality” (2011, 144). Consider her key example (2011, 133-4). Clara strongly dislikes Bonnie but has always managed to rein in “nasty” comments about her hair in order to keep a good reputation (among other reasons). One day, however, “Clara’s psychology teacher hypnotises Clara,” the outcome of which is that Clara no longer cares about her reputation (etc.). In consequence, Clara launches a “cutting attack on Bonnie’s appearance.” Now, what is important is that the attack manifests ill will (her desire to “wound” Bonnie). But H. Smith’s intuition is that she is not blameworthy. After all, the desires for maintaining her good reputation (etc.) that would normally inhibit her are not operative. Thus (apart from akrasia [H. Smith 2011, 145]), blameworthiness requires the display of a sufficient portion of the agent’s will, not just one part of it (for example, a single bad desire). Whether this distinguishes eligibility for moral criticism from eligibility for moral blame sufficiently is not clear, however. There are also concerns in the literature about the ability of quality-of-will theorists to account for intuitions of blamelessness arising from other “manipulation” cases.

A third objection to moral quality-of-will theories is simply that ill will is not necessary for blameworthiness, and the aforementioned capacitarian non-tracing cases are usually trotted out in this context. So a great deal hinges on what we are to make of that debate.

ii. Epistemic Vice Theories

Another subvariety of quality-of-will theories comprises James Montmarquet’s (1999) and William FitzPatrick’s (2008, 2017) epistemic vice theories. Interestingly, both theorists agree with those control-based internalists who argue that moral and factual ignorance excuses wrongdoing, but they contend that culpability for that wrongdoing traces back to culpability for the ignorance, which, they argue, is grounded in exercises of epistemic vice. The epistemic vices are apparently possessed as character traits on FitzPatrick’s (2008) view, but Montmarquet (1999) seems only to envision a momentary vicious attitude or motive (viz., insufficient “care” in belief-formation).

Consider Zimmerman’s case of Perry who, upon arriving at the scene of a car crash involving a trapped individual, Doris, and a burning car, “rushes in and quickly drags Doris free from the wreck, thinking that at any moment both he and she might get caught in the explosion” (1997, 410). Alas, Perry paralyses Doris in the act of dragging her free. In defense of the appeal to epistemic vices, Montmarquet (1999, 842) attaches significance to Zimmerman’s admission that the natural thing to say about this case is that Perry is culpable for unwittingly paralysing Doris and that this is due to Perry’s “carelessness,” “inconsiderateness,” or “inattentiveness” in failing to “entertain the possibility of doing more harm than good by means of a precipitate rescue” (Zimmerman 1997, 416). For Montmarquet, this is indeed what we should say. In fact, Montmarquet would argue that in this moment, Perry has “direct (albeit incomplete)” control (1999, 844) over his beliefs, and that the way he exercises that control is epistemically vicious, for it fails to exhibit enough “care” in belief-formation. (It is not, however, essential for epistemic vice theories to appeal to direct control over one’s beliefs. FitzPatrick (2008) denies doxastic voluntarism.) At any rate, grounding Perry’s culpability in his lack of care in belief-formation is externalist, because contrary to Zimmerman and other control-based internalists, Montmarquet and FitzPatrick would not require for Perry’s culpable ignorance that Perry was aware of his failure to be open-minded to “the possibility of doing more harm than good.”

The root idea… is that a certain quality of openness to truth- and value-related considerations is expected of persons and that this expectation is fundamental, at least in the following regard. The expectation is not derivative of or dependent upon one’s (at the moment in question) judging such openness as appropriate (good, required, etc.)—just the opposite: it would include a requirement that one be open to the need to be open, and if one is not open to this, one may be blameworthy precisely for that failure. (Montmarquet 1999, 845)

It is clear in this passage that Montmarquet employs the reasonable expectations condition of blameworthiness (well before it became a key focus of the debate in the late 2000s) and he evidently tries to account for how it is met by his epistemic vice theory. FitzPatrick (2008, 2017) also takes up this project, but he argues in response to Levy’s (2009) strong internalist requirement for reasonable expectations that if it is not reasonable to expect someone to avoid acting from their epistemic vices, then culpability traces even further back to culpability for those vices and for those vicious character-forming acts that it would have been reasonable to expect the agent to avoid in the first place (FitzPatrick 2017). It is not clear that this solves the issue from the strong internalist’s perspective, however, for the internalist would still require that the character-forming choices were themselves seen as wrong. It seems, then, that it is in the best interests of the epistemic vice theorist to resort to Montmarquet’s appeal to the fundamentality of exercises of epistemic vice with or without awareness of doing so (and with or without Montmarquet’s appeal to direct doxastic control).

The debate between epistemic vice theorists and other defenders of the reasonable expectations condition then becomes whether the epistemic vice theorist can ground a reasonable expectation without an internalist requirement. But clearly, it is open to these theorists to dispense with this requirement—as their cousins in the moral quality-of-will camp have done (see above).

But epistemic vice theorists have their own challenges, too. Why, for example, should benighting conduct be treated any differently from ordinary (non-benighting) conduct, as far as culpability ascriptions are concerned? It is difficult to see what it is about being the kind of act or omission that causes ignorance that makes it eligible for a different culpability assessment than any other kind of act or omission. Perhaps an epistemic vice theory is best employed in conjunction with a moral quality-of-will theory of culpability for non-benighting conduct, which does away with tracing.

d. Hybrid and Pluralist Views

We have nearly canvassed the full range of positions that are currently defended on the epistemic condition for culpable misconduct. What we have left are those positions that mix some of the above views in different ways. There are two ways that this can be done: (1) defend a hybrid theory, which combines one or more of the above views in a single theory of blameworthiness; or (2) defend pluralism, which divides blameworthiness into different kinds, and then assigns different epistemic conditions to each.

To take two examples of hybrid theories, FitzPatrick (2008) combines his epistemic vice theory with a kind of capacitarian requirement: the agent must have the capacity and the social opportunity to become aware of and avoid acting from epistemic vice. More recently, Christopher Cloos (2018, 211-2) argues that culpability for wrongdoing is secured either directly, under quality-of-will internalist conditions, or indirectly (when there is culpable factual ignorance) under weak internalist or epistemic vice theoretic conditions. Taking an all-inclusive approach like Cloos’s clearly gives you the advantage of accounting for as many of our ordinary intuitions of blameworthiness as possible, but it also inherits some of the distinctive problems of the views it combines. It must also face the charge of ad-hocness: is there some motivation for a hybrid theory other than its ability to account for intuitions about individual cases relevant to the epistemic condition? Is there, for instance, a plausible background theory about responsibility or blame that gives rise to such a hybrid?

By contrast, Elinor Mason (2019) and Michael Zimmerman (2017) offer pluralist accounts of the epistemic condition. Mason holds that there are three “ways to be blameworthy.” One form requires the satisfaction of strong internalist conditions; another demands only the satisfaction of quality-of-will conditions; and the third is generated voluntarily by taking responsibility for one’s conduct (bringing along epistemic conditions of a different kind). Zimmerman (2017) defends a similar sort of pluralism, submitting that in his earlier (1997) work, he was only intending to give a strong internalist account of one form of blameworthiness, the one that is supposedly the basis for punishment. Like hybrid views, pluralist views inherit some of the problems of the monist views discussed above, but they also face the challenge of accounting for why different forms of blameworthiness are needed to account for the relevant considerations. Given that simplicity should be preferred over complexity, it seems that the debate would need to be intractable enough to warrant splitting blameworthiness into multiple forms, but it is not clear that this is so. How, for instance, should Mason and Zimmerman reply to the control-based criticism of quality-of-will views that they do not specify sufficient conditions for blameworthiness but only for some form of closely related negative attributability which is often confused with blameworthiness (Levy 2005)? Another challenge for pluralist views is justifying the exclusion of those monist analyses above (that is, capacitarianism, for Mason and Zimmerman) that do not constitute an analysis of one of the ways to be blameworthy.

3. The Epistemic Conditions of Derivative Responsibility

Alongside the debate on the epistemic condition for culpable misconduct, an interrelated debate has taken place on the epistemic condition for derivative responsibility—that is, responsibility (especially blameworthiness) for the consequences of our conduct. Why the debate on the epistemic condition for derivative responsibility is interrelated with the debate on the epistemic condition for culpable misconduct should now be clear: in the latter debate, culpability for unwitting omissions is often traced back to culpability for prior conduct, and these tracing strategies nearly always make essential reference to culpability for ignorance as itself a consequence of prior (benighting) conduct. But we have also seen how derivative responsibility for character (epistemic vices) might be part of the story. Thus, many of the philosophers whose views have already been discussed address the question of the epistemic condition for derivative responsibility in the context of the above debate (see below). But as we shall see, a number of philosophers are interested in the question of the epistemic condition for derivative responsibility as a question worth thinking about in its own right, or else they address the question in the context of another debate in responsibility studies (for example, on doxastic responsibility: Nottelmann 2007; Peels 2017). There are also many views which affirm the idea of derivative responsibility but which leave out a discussion of its epistemic condition, and so it is not clear what they have to say on the epistemic condition.

a. Foresight and Foreseeability Views

Views on the epistemic condition for derivative responsibility divide into those we might call foresight views, foreseeability views, and no-foreseeability views. Foresight views have the strongest epistemic condition in their claim that foreseen consequences are the only consequences of our conduct for which we are responsible (see, for example, Boylan 2021, 5; H. Smith 1983; Nelkin and Rickless 2017; Zimmerman 1986, 1997). By contrast, foreseeability views claim that unforeseen but (reasonably) foreseeable consequences can also be consequences for which we are responsible (Fischer and Tognazzini 2009; Murray 2017; Rosen 2004, 2008; Rudy-Hiller 2017; Sartorio 2017; Vargas 2005). Before we discuss the debate between these views, it would be worth introducing various disagreements about the nature and content of the foresight that one must have or be able to have.

On both foresight and foreseeability views, the foresight is nearly always analysed in terms of belief concerning the relevant consequence of one’s conduct (see especially, Zimmerman 1986, 206; cf. Nottelmann’s [2007, 190-3] criticism). Sometimes there is also an appeal to reasonable foresight (see, for example, Nelkin and Rickless 2017; cf. “reasonable foreseeability,” Vargas 2005). Moreover, some theorists analyse foresight in terms of occurrent belief (Zimmerman 1986), while others argue that dispositional belief suffices (for example, Fischer and Tognazzini 2009). Intuitively, if the pilot decided to skip running through every item on the pre-flight checklist but did not consciously foresee that doing so could lead to a catastrophic airplane crash, she could still be blameworthy for these consequences even if she merely dispositionally believed that these were the risks of rushing the pre-flight check (that is, if she would have cited these as reasons not to rush the check if asked). But plausibly this debate hangs on whether a successful defence of the requirement of occurrent belief can be found for directly culpable misconduct (see above).

There are also a number of disagreements surrounding the content of the relevant foresight. One disagreement concerns whether an increased likelihood of the consequence of one’s conduct must be foreseen/foreseeable. Zimmerman (1986, 206) includes no such condition, citing merely belief that there is at least “some probability” that the consequence will occur. But it is much more common to require foresight/foreseeability of an increased risk or likelihood of the consequence (Nottelmann 2007, 191ff.; Nelkin and Rickless 2017, 120; Peels 2017, 177). Intuitively, foreseeing some probability but no increase in the risk of a bad consequence would not give one a reason to take a precaution against it.

Another issue is subject to greater debate: must the specific consequence be foreseen/foreseeable, or does it suffice that the general type of consequence (“consequence type”) is foreseen/foreseeable? Some (Zimmerman 1986; Vargas 2005) think that there must be foreseeability of the specific/token consequence. In contrast, others (Fischer and Tognazzini 2009; King 2017; Nelkin and Rickless 2017; Nottelmann 2007) think that foreseeability of the consequence type suffices. The latter view is perhaps more intuitive. Suppose that a teacher comes up with the wrong answer to a highly important question raised by a student after failing to prepare for class despite recognizing the need to be well-prepared. To be responsible for giving the wrong answer, it seems that the teacher need not have foreseen the specific question to which she gave the wrong answer, nor even foreseen responding wrongly to a student’s question. She need only have foreseen the risk of misguiding the students or asserting falsehoods in class as a consequence of not preparing. A consequence-type view would also more easily accommodate intuitions of derivative culpability for morally unwitting wrongdoing: if the Battalion 101 shooters had the opportunity to question Nazi ideology at some point in their life prior to the massacre while believing that failing to question this ideology could lead to harming the Jews, then they could well have been indirectly blameworthy for their participation in the massacre. How, then, can defenders of the requirement of foreseen/foreseeable token consequences respond to the intuitive sufficiency of consequence-type foresight/foreseeability? Perhaps there are problems with specifying how broad a “type” the token consequence can fall under. Would foresight of a consequence as general as “causing something bad” suffice?

The final disagreement concerning the content of the required foresight/foreseeability is about how the foresight/foreseeability of the consequence’s moral significance or morally significant features is to be spelled out. After all, foresight of the consequence’s morally significant status or features is surely required (cf. Vargas 2005; Fischer and Tognazzini 2009), even though it is sometimes left out of analyses (see, for example, Nelkin and Rickless 2017). Suppose, for example, that the pilot foresaw the risk of an airplane crash from failing to run through the pre-flight checklist but did not believe that this was wrong or bad, nor even that it risked being bad. Or suppose that the pilot was crucially factually ignorant, believing mistakenly but fully that she had been told to intentionally crash the plane for a film stunt. Given several of the intuitions generated in reflecting on the epistemic condition for culpable misconduct (above), she is surely blameless for the crash under one or more of these conditions, unless she was blameworthy for her ignorance, or she displayed ill will despite her factual ignorance, or she had the capacity to be aware that she was not on a film set.

What moral significance or morally significant features, in particular, must be foreseen/foreseeable? Plausibly the answer should be informed by one’s account of the epistemic condition for directly culpable misconduct. Thus, strong internalists and others (for example, Sartorio) who require de dicto awareness of moral significance might be tempted to require, for culpability, that the consequence is believed to be morally bad or wrong. Weak internalists such as Robichaud might only require foresight/foreseeability of reasons against the consequence. And quality-of-will theorists and capacitarians might only require foresight/foreseeability of the consequence’s bad-making features.

At last, we come to the debate between foresight and foreseeability views. Why demand a more restrictive foresight condition for derivative responsibility? Intuitively it seems that (reasonable) foreseeability could suffice. Suppose that the teacher failed to even foresee misleading her students as a consequence of not preparing for her class, but that this consequence was (at least reasonably) foreseeable for her. Even so, it seems that she could be blameworthy for misleading her students. At the very least, that is the type of view that quality-of-will externalists and capacitarians would be drawn to (cf. Rudy-Hiller 2017). Consider, after all, that she seems to meet capacitarian conditions with respect to the consequence of misleading her students: she seems to have the capacity and the opportunity to foresee, and failing to foresee falls short of a cognitive standard that applies to her (no doubt qua teacher). In fact, capacitarian conditions seem to provide a plausible analysis of the nature of foreseeability (compare Zimmerman’s [1986] discussion of an alternative analysis in terms of what the “reasonable person” would foresee, as used in the legal definition of negligence). Quality-of-will externalists might also appeal to the way that her failure to foresee misleading her students, despite its being reasonably foreseeable for her, reveals an objectionable indifference to their success.

But the fact that a foreseeability view is at home with externalism about directly culpable misconduct might give us a clue as to how the foresight view could plausibly be defended against it, despite being more restrictive and maybe less intuitive: we seem to get the best justification for the foresight view from internalism about directly culpable misconduct. Interestingly, however, some internalists (Rosen 2008; Fischer and Tognazzini 2009), who argue that blameless ignorance excuses wrongdoing done from that ignorance, defend a foreseeability view. But they do not tend to give an argument for this combination of internalism about direct culpability with a foreseeability view about indirect culpability. And, in fact, Daniel Miller (2017) has recently produced an ingenious argument for the inconsistency of this combination of commitments:

The argument begins from the premise that it is possible for an agent to be blameless for failing to foresee what was foreseeable for him. The second premise is the principle that an agent is blameworthy for acting from ignorance only if he is blameworthy for that ignorance. If blameless ignorance excuses agents for actions, though, then it also excuses agents for action consequences (the third premise). But, given the first premise, foreseeability versions of the tracing strategy contradict this: they imply that an agent can be blameworthy for some consequence even if he was blamelessly ignorant of it. (Miller 2017, 1567)

So it looks like Rosen and Fischer and Tognazzini owe Miller a reply. Perhaps they might do best to question premise one. If they cannot respond to this charge of inconsistency, however, they must revise one of their commitments.

b. No-Foreseeability Views

Foresight and foreseeability views are not the only views on the epistemic condition for derivative responsibility. No-foreseeability views (we might call them) hold that we can be responsible for the consequences of our conduct even if they are entirely (or at least reasonably) unforeseeable at the time of that conduct, provided that the consequences are appropriately (for example, “non-deviantly”) caused by it, or reflect the agent’s ill will, or what have you. Basic and control-based externalists and quality-of-will externalists could therefore be attracted to such a view. In fact, Rik Peels (2017) appears to defend a kind of no-foreseeability view of derivative responsibility for beliefs. On his view, we are responsible for those beliefs that we have merely influenced through our actions, where influence of a belief that p consists simply in the “ability to believe otherwise,” or there being some “action or series of actions A that [the agent] S could have performed such that if S had performed A, S would not have believed that p” (2017, 143). But this view seems to propose far too weak a condition of derivative responsibility for beliefs. A corresponding account of derivative responsibility for events would entail that, for example, if the pilot’s airplane crash could have been prevented had the pilot run through the pre-flight checklist, and the crash caused the airplane company to go into liquidation, then the pilot would be responsible for this consequence, even if the pilot had no way of foreseeing it (especially given her justified belief that the company was on firm financial footing). And it does not seem that beliefs as action consequences are relevantly different from events. From another point of view, quality-of-will externalists might try to justify a no-foreseeability view by arguing that there are cases in which the consequences of one’s conduct reflect ill will even though those consequences weren’t (reasonably) foreseeable. But even if the pilot displayed recklessness towards other people’s lives by rushing through the pre-flight checklist (in the case where the pilot does not believe she is doing a film stunt), it does not seem that she is morally responsible for throwing the company into liquidation, for this consequence does not seem to reflect ill will. But perhaps the quality-of-will externalist could try to argue that there are some unforeseeable consequences of the airplane crash that do reflect the pilot’s recklessness.

These are the challenges facing a no-foreseeability view of derivative responsibility. But a reason to take the view seriously is found in Manuel Vargas’ (2005) well-discussed dilemma for foresight and (reasonable) foreseeability views (which in many ways parallels the revisionist dilemma posed by strong internalists about culpable misconduct). According to Vargas’ dilemma, there are many cases in which the consequences of our behavior (for example, as youth) on our character and later choices are not foreseeable at the time of that behavior, and yet we are intuitively to blame for those consequences. Commonly discussed is his case of “Jeff the Jerk” in which Jeff, a high-school kid, endeavors to become more like the “jerks” who have “success” with their female classmates. He successfully becomes a jerk, but this means that later in life he is “rude and inconsiderate about the feelings of others” as he lays off his employees (2005, 271). Vargas argues that it is natural and common to think that we are culpable for these sorts of consequences of our earlier behavior, even though they are not reasonably foreseeable. But foresight and reasonable foreseeability views must regard these character traits and choices as something for which we lack responsibility. Thus, we have a dilemma: either we accept a reasonable foreseeability or foresight view and its culpability revisionist implications or we reject those views in order to vindicate our ordinary pre-theoretical intuitions.

But are foreseeability and foresight views stuck on the horns of this dilemma? In favour of a reasonable foreseeability view, Fischer and Tognazzini (2009) reply that Vargas’ cases are either cases in which the consequences in question are intuitively non-culpable, or they are culpable but there is a way for reasonable foreseeability views to account for their culpability. Concerning Jeff the Jerk, for instance, Fischer and Tognazzini argue that he is blameworthy for the way that he lays off his employees, since a relevant consequence type was foreseeable for Jeff: the consequence that he would “[treat] some people poorly at some point in the future as a result of his jerky character” (2009, 537). So it is not clear that Vargas’ dilemma for foresight and foreseeability views can successfully be used to defend no-foreseeability views, or at least used against consequence-type reasonable foreseeability views.

4. Future Areas for Research

The epistemic conditions of moral responsibility are thus a ripe field of philosophical research. While there is much more room for future contributions to the epistemic condition for culpable misconduct and for derivative responsibility, there are at least three other areas for future research on the epistemic conditions on which comparatively less has been written.

One of these areas is the epistemic condition for moral praiseworthiness, to which there are only a few extant contributions. Nomy Arpaly (2002) defends the view that cases of “inverse akrasia” or of doing something right while believing that it is wrong can in fact be morally praiseworthy, given appropriate care about the act’s right-making features. Paulina Sliwa (2017) disagrees, holding that there must be awareness of the rightness of the act to be praiseworthy for it. But even if we grant with Sliwa that a belief in wrongdoing undermines praiseworthiness, must there be awareness of the act’s rightness? What about a view modeled on a kind of weak internalism about culpability? But maybe there are reasons to embrace an asymmetry between the epistemic condition for praiseworthiness and the epistemic condition for blameworthiness?

Another area for future research is the epistemic condition for collective responsibility. As yet, there is not much work on this subject, but there are interesting questions to be asked on what the satisfaction of the above epistemic conditions on individual responsibility would look like at the collective level (supposing that such epistemic conditions ought to be satisfied for collective responsibility), and whether any unique epistemic conditions must be satisfied. If we took a “collectivist” approach to collective responsibility, according to which groups or corporations themselves can be morally responsible for collective actions and their consequences (whatever we say about the responsibility of individual members), we might wonder whether and under what conditions groups can themselves know or believe things, or whether this is even required for them to be morally responsible. Alternatively, if we took a more “individualistic” approach to collective responsibility, according to which only individual members of groups can be held responsible for collective actions and their consequences, it would seem that ordinary epistemic conditions apply concerning responsibility for their direct contribution to the collective action, but that further epistemic conditions need to be satisfied for them to be held responsible for collective actions and their consequences. On Seumas Miller’s (2006, 177) individualist approach, for instance, individual members are morally responsible for a collective action and any consequences of it only if they have a true belief that by acting in a certain way, “they will jointly realize an end which each of them has.”

A final area for future research is on the significance of the epistemic condition for criminal liability. In one of the first book-length studies of this kind, Douglas Husak (2016), a weak control-based internalist, argues that ignorance that an act is, or might be, morally wrong should ideally excuse offenders from criminal punishment. Such a view, if implemented, would force significant revisions to current (Anglo-American/common law) legal systems. Of course, it is already true in such systems that to determine whether a criminal offence has actually taken place (that is, to determine whether the accused performed the actus reus, or act, with the mens rea, or mental state, of criminal intent, knowledge, recklessness, or negligence), the satisfaction of certain epistemic conditions concerning awareness/ignorance of (non-moral/non-legal) facts must be proven beyond reasonable doubt. These conditions are part of the mens rea components of offenses. If your unattended child is harmed and you are ignorant of the risk of harm, but a “reasonable person” would have recognized that risk, then you are criminally negligent (for example, guilty of negligent homicide or endangerment). You are criminally reckless, by contrast, if you cause harm while recognizing the risk of harm; and you have criminal intent or knowledge if you cause harm knowing that the act would cause harm. Your sentence would likely also be heftier if you were found guilty of one of these latter forms of liability than if you were found guilty of mere negligence (matching the common but not uncontroversial assumption that akratic wrongdoing is more culpable than unwitting wrongdoing). Some existing offenses do also include awareness of the act’s illegality or wrongfulness in their mens rea components.
(And one might think of the existing “insanity defense” in this context, for how it allows offenders to avoid conviction on the grounds that they cannot “distinguish right from wrong.” But in responsibility terms, this would be to appeal to a lack of a baseline moral capacity of responsibility, rather than to appeal directly to ignorance of the act’s wrongfulness.) However, in Husak’s mind, we need to look beyond the way that actual jurisdictions impose criminal liability. If, guided by the “presumptive conformity” of law to morality, we were to consistently apply the correct (in Husak’s view, weak internalist) epistemic conditions of moral blameworthiness to criminal liability in the ideally just legal system (that is, without consideration of real-world problems concerning its applicability), then not only might we have to remove negligence as a form of criminal liability (for it is after all a form of ignorance of fact), but, argues Husak, we would have to “treat mistakes of fact and law [or morality] symmetrically by replicating the same normative structure in each context” (2016, 161). That is to say, the just legal system would impose criminal liability and punishment only on those offenders who are intentional, knowing, or reckless (and probably not on those who are merely negligent) with respect to the underlying morality of the offence, in particular with respect to whether it is “contrary to the balance of moral reasons and is wrong” (2016, 161).
In practice, the just legal system would then either explicitly or implicitly build a requirement of awareness of (the risk of) wrongdoing into the mens rea element of the definition of the offense (for example, “murder” would be “knowingly killing someone while knowing the wrongfulness of doing so”), or (less symmetrically) such a system would leave the definitions of offences untouched and provide a unique “mistake of law/morality” defense (alongside other defenses, such as the insanity defense) for a not-guilty plea (see Husak’s discussion in: 2016, 262ff).

Husak’s revisionary application of the epistemic condition to criminal liability raises a number of questions. One issue that many will have with his straightforward application of culpability internalism to criminal liability is that the ideally just legal system would not punish “zealous terrorists who are unaware of wrongdoing” (2016, 265), a rather counterintuitive consequence of the view! In this connection, we might ask whether it is true that a just legal system would make criminal liability depend on (at least one form of) moral blameworthiness, and thus on the satisfaction of its epistemic condition. Suppose that it wouldn’t. Would criminal liability still be structurally analogous to moral blameworthiness (cf. Rosen 2003, 80–81), such that a parallel epistemic condition applies? If it were to make criminal liability depend on moral blameworthiness or a structural analogue, would the just legal system make criminal liability depend on the most plausible view of the epistemic condition (for example, in Husak’s view, a weak control-based internalism), or rather would it make criminal liability depend on the most accepted view of moral blameworthiness, or maybe whatever view accords most with common-sense intuitions of blameworthiness? Or should criminal liability have nothing to do with moral blameworthiness (but be concerned exclusively with, say, mere wrongdoing, deterrence, or rehabilitation)? These are all important questions for future inquiries into the epistemic condition.

5. References and Further Reading

  • Arpaly, Nomy. Unprincipled Virtue. Oxford: Oxford University Press, 2002.
  • Boylan, Michael. Basic Ethics, 3rd ed. New York: Routledge, 2021.
  • Chisholm, Roderick. Person and Object: A Metaphysical Study. London: George Allen & Unwin Ltd, 1976.
  • Clarke, Randolph. “Blameworthiness and Unwitting Omissions.” In The Ethics and Law of Omissions, edited by Dana Kay Nelkin and Samuel C. Rickless. Oxford: Oxford University Press, 2017.
  • Clarke, Randolph. “Negligent Action and Unwitting Omission.” In Omissions: Agency, Metaphysics, and Responsibility. Oxford: Oxford University Press, 2014.
  • Cloos, Christopher Michael. Responsibility Beyond Belief: The Epistemic Condition on Moral Responsibility: a doctoral dissertation accepted by the University of California, Santa Barbara, September, 2018. Available at: https://escholarship.org/uc/item/1hr314cs.
  • Fischer, John Martin, and Neal A. Tognazzini. “The Truth about Tracing.” Noûs 43, no. 3 (2009): 531–56.
  • FitzPatrick, William. “Moral Responsibility and Normative Ignorance: Answering a New Skeptical Challenge.” Ethics 118, no. 4 (2008): 589–613.
  • FitzPatrick, William. “Unwitting Wrongdoing, Reasonable Expectations, and Blameworthiness.” In Responsibility: The Epistemic Condition, edited by Philip Robichaud and Jan Willem Wieland, 29–46. Oxford: Oxford University Press, 2017.
  • Frankfurt, Harry G. “Alternate Possibilities and Moral Responsibility.” The Journal of Philosophy 66, no. 23 (1969): 829–39.
  • Guerrero, Alexander. “Don’t Know, Don’t Kill: Moral Ignorance, Culpability, and Caution.” Philosophical Studies 136, no. 1 (2007): 59–97.
  • Haji, Ishtiyaque. “An Epistemic Dimension of Blameworthiness.” Philosophy and Phenomenological Research 57, no. 3 (1997): 523–44.
  • Harman, Elizabeth. “Does Moral Ignorance Exculpate?” Ratio 24, no. 4 (2011): 443–68.
  • Husak, Douglas. Ignorance of Law: A Philosophical Inquiry. Oxford: Oxford University Press, 2016.
  • Levy, Neil. “Culpable Ignorance and Moral Responsibility: A Reply to FitzPatrick.” Ethics 119, no. 4 (2009): 729–41.
  • Levy, Neil. “Culpable Ignorance: A Reply to Robichaud.” Journal of Philosophical Research 41 (2016): 263–71.
  • Levy, Neil. “The Good, the Bad and the Blameworthy.” Journal of Ethics and Social Philosophy 1, no. 2 (2005): 1–16.
  • Levy, Neil. Hard Luck: How Luck Undermines Free Will and Moral Responsibility. Oxford: Oxford University Press, 2011.
  • Levy, Neil. “Methodological Conservatism and the Epistemic Condition.” In Responsibility: The Epistemic Condition, edited by Philip Robichaud and Jan Willem Wieland, 252–65. Oxford: Oxford University Press, 2017.
  • Mason, Elinor. Ways to Be Blameworthy: Rightness, Wrongness, and Responsibility. Oxford: Oxford University Press, 2019.
  • McKenna, Michael. Conversation and Responsibility. Oxford: Oxford University Press, 2012.
  • Miller, Daniel. “Reasonable Foreseeability and Blameless Ignorance.” Philosophical Studies 174, no. 6 (2017): 1561–81.
  • Miller, Seumas. “Collective Moral Responsibility: An Individualist Account.” Midwest Studies in Philosophy 30 (2006): 176–93.
  • Montmarquet, James. “Zimmerman on Culpable Ignorance.” Ethics 109, no. 4 (1999): 842–45.
  • Murray, Samuel. “Responsibility and Vigilance.” Philosophical Studies 174, no. 2 (2017): 507–27.
  • Nelkin, Dana Kay, and Samuel C. Rickless. “Moral Responsibility for Unwitting Omissions.” In The Ethics and Law of Omissions, edited by Dana Kay Nelkin and Samuel C. Rickless, 106–30. New York: Oxford University Press, 2017.
  • Nottelmann, Nikolaj. Blameworthy Belief: A Study in Epistemic Deontologism. Dordrecht: Springer Netherlands, 2007.
  • Peels, Rik. Responsible Belief: A Theory in Ethics and Epistemology. Oxford: Oxford University Press, 2017.
  • Peels, Rik. “Tracing Culpable Ignorance.” Logos & Episteme 2, no. 4 (2011): 575–82.
  • Robichaud, Philip. “On Culpable Ignorance and Akrasia.” Ethics 125, no. 1 (2014): 137–51.
  • Rosen, Gideon. “Culpability and Ignorance.” Proceedings of the Aristotelian Society 103 (2003): 61–84.
  • Rosen, Gideon. “Kleinbart the Oblivious and Other Tales of Ignorance and Responsibility.” The Journal of Philosophy 105, no. 10 (2008): 591–610.
  • Rosen, Gideon. “Skepticism about Moral Responsibility.” Philosophical Perspectives 18 (2004): 295–313.
  • Rudy-Hiller, Fernando. “A Capacitarian Account of Culpable Ignorance.” Pacific Philosophical Quarterly 98 (2017): 398–426.
  • Rudy-Hiller, Fernando. “Give People a Break: Slips and Moral Responsibility.” Philosophical Quarterly 69, no. 277 (2019): 721–40.
  • Sartorio, Carolina. “Ignorance, Alternative Possibilities, and the Epistemic Conditions for Responsibility.” In Perspectives on Ignorance from Moral and Social Philosophy, edited by Rik Peels, 15–29. New York: Routledge, 2017.
  • Sher, George. Who Knew?: Responsibility Without Awareness. Oxford: Oxford University Press, 2009.
  • Sliwa, Paulina. “On Knowing What’s Right and Being Responsible For It.” In Responsibility: The Epistemic Condition, edited by Philip Robichaud and Jan Willem Wieland, 127–45. Oxford: Oxford University Press, 2017.
  • Smith, Angela. “Responsibility for Attitudes: Activity and Passivity in Mental Life.” Ethics 115, no. 2 (2005): 236–71.
  • Smith, Angela. “Review of George Sher’s Who Knew? Responsibility without Awareness.” Social Theory and Practice 36, no. 3 (2010): 515–24.
  • Smith, Holly. “Culpable Ignorance.” The Philosophical Review 92, no. 4 (1983): 543–71.
  • Smith, Holly. “Non-Tracing Cases of Culpable Ignorance.” Criminal Law and Philosophy 5, no. 2 (2011): 115–46.
  • Talbert, Matthew. “Omission and Attribution Error.” In The Ethics and Law of Omissions, edited by Dana Kay Nelkin and Samuel C. Rickless, 17–35. Oxford: Oxford University Press, 2017.
  • Talbert, Matthew. “Unwitting Wrongdoers and the Role of Moral Disagreement in Blame.” In Oxford Studies in Agency and Responsibility Volume 1, edited by David Shoemaker. Oxford: Oxford University Press, 2013.
  • Vargas, Manuel. “The Trouble with Tracing.” Midwest Studies in Philosophy 29 (2005): 269–91.
  • Watson, Gary. “Two Faces of Responsibility.” Philosophical Topics 24, no. 2 (1996): 227–48.
  • Wieland, Jan W. “Introduction: The Epistemic Condition.” In Responsibility: The Epistemic Condition, edited by Philip Robichaud and Jan Willem Wieland, 1–45. Oxford: Oxford University Press, 2017.
  • Yates, Thomas A. Moral Responsibility and Motivating Reasons: On the Epistemic Condition for Moral Blameworthiness: a doctoral dissertation accepted by the University of Auckland, February 5, 2021. Available at: https://researchspace.auckland.ac.nz/handle/2292/54410.
  • Zimmerman, Michael J. “Ignorance as a Moral Excuse.” In Perspectives on Ignorance from Moral and Social Philosophy, edited by Rik Peels, 77–94. New York: Routledge, 2017.
  • Zimmerman, Michael J. “Moral Responsibility and Ignorance.” Ethics 107 (1997): 410–26.
  • Zimmerman, Michael J. “Negligence and Moral Responsibility.” Noûs 20, no. 2 (1986): 199–218.

 

Author Information

Tom Yates
Email: tyatesnz@gmail.com
Massey University
New Zealand

Deductive and Inductive Arguments

In philosophy, an argument consists of a set of statements called premises that serve as grounds for affirming another statement called the conclusion. Philosophers typically distinguish arguments in natural languages (such as English) into two fundamentally different types: deductive and inductive. Each type of argument is said to have characteristics that categorically distinguish it from the other type. The two types of argument are also said to be subject to differing evaluative standards. Pointing to paradigmatic examples of each type of argument helps to clarify their key differences. The distinction between the two types of argument may hardly seem worthy of philosophical reflection, as evidenced by the fact that their differences are usually presented as straightforward, such as in many introductory philosophy textbooks. Nonetheless, the question of how best to distinguish deductive from inductive arguments, and indeed whether there is a coherent categorical distinction between them at all, turns out to be considerably more problematic than commonly recognized. This article identifies and discusses a range of different proposals for marking categorical differences between deductive and inductive arguments while highlighting the problems and limitations attending each. Consideration is also given to the ways in which one might do without a distinction between two types of argument by focusing instead solely on the application of evaluative standards to arguments.

Table of Contents

  1. Introduction
  2. Psychological Approaches
  3. Behavioral Approaches
  4. Arguments that “Purport”
  5. Evidential Completeness
  6. Logical Necessity vs. Probability
  7. The Question of Validity
  8. Formalization and Logical Rules to the Rescue?
  9. Other Even Less Promising Proposals
  10. An Evaluative Approach
  11. References and Further Reading

1. Introduction

In philosophy, an argument consists of a set of statements called premises that serve as grounds for affirming another statement called the conclusion. Philosophers typically distinguish arguments in natural languages (such as English) into two fundamentally different kinds: deductive and inductive. (Matters become more complicated when considering arguments in formal systems of logic as well as in the many forms of non-classical logic. Readers are invited to consult the articles on Logic in this encyclopedia to explore some of these more advanced topics.) In the philosophical literature, each type of argument is said to have characteristics that categorically distinguish it from the other type.

Deductive arguments are sometimes illustrated by providing an example in which an argument’s premises logically entail its conclusion. For example:

Socrates is a man.
All men are mortal.
Therefore, Socrates is mortal.

Assuming the truth of the two premises, it seems that it simply must be the case that Socrates is mortal. According to this view, then, this would be a deductive argument.

By contrast, inductive arguments are said to be those that make their conclusions merely probable. They might be illustrated by an example like the following:

Most Greeks eat olives.
Socrates is a Greek.
Therefore, Socrates eats olives.

Assuming the truth of those premises, it is likely that Socrates eats olives, but that is not guaranteed. According to this view, this argument is inductive.

This way of viewing arguments has a long history in philosophy. An explicit distinction between two fundamentally distinct argument types goes back to Aristotle (384-322 B.C.E.) who, in his works on logic (later dubbed “The Organon”, meaning “the instrument”), distinguished syllogistic reasoning (sullogismos) from “reasoning from particulars to universals” (epagôgê). Centuries later, induction was famously advertised by Francis Bacon (1561-1626) in his New Organon (1620) as the royal road to knowledge, while Rationalist mathematician-philosophers, such as René Descartes (1596-1650) in his Discourse on the Method (1637), favored deductive methods of inquiry. Albert Einstein (1879-1955) discussed the distinction in the context of science in his essay, “Induction and Deduction in Physics” (1919). Much contemporary professional philosophy, especially in the Analytic tradition, focuses on presenting and critiquing deductive and inductive arguments while considering objections and responses to them. It is therefore safe to say that a distinction between deductive and inductive arguments is fundamental to argument analysis in philosophy.

Although a distinction between deductive and inductive arguments is deeply woven into philosophy, and indeed into everyday life, many people probably first encounter an explicit distinction between these two kinds of argument in a pedagogical context. For example, students taking an elementary logic, critical thinking, or introductory philosophy course might be introduced to the distinction between the two types of argument and be taught that each has its own standards of evaluation. Deductive arguments may be said to be valid or invalid, and sound or unsound. A valid deductive argument is one whose logical structure or form is such that if the premises are true, the conclusion must be true. A sound argument is a valid argument with true premises. Inductive arguments, by contrast, are said to be strong or weak, and, although terminology varies, they may also be considered cogent or not cogent. A strong inductive argument is said to be one whose premises render the conclusion likely. A cogent argument is a strong argument with true premises. All arguments are made better by having true premises, of course, but the differences between deductive and inductive arguments concern structure, independent of whether the premises of an argument are true, which concerns semantics.
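The textbook definitions of validity and soundness can be made concrete with a small sketch. The Python code below is only an illustration under simplifying assumptions: it handles propositional argument forms (not quantified arguments such as the Socrates syllogism above), it treats “if P then Q” as the material conditional, and the names is_valid and is_sound are hypothetical helpers introduced here rather than part of any standard library. It checks validity by searching every truth assignment for a counterexample, that is, a case in which all premises are true and the conclusion false.

```python
from itertools import product

def is_valid(premises, conclusion, num_vars):
    """Valid: no truth assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=num_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # counterexample found, so the form is invalid
    return True

def is_sound(premises, conclusion, num_vars, actual_values):
    """Sound: valid, and every premise is true under the actual truth values."""
    return (is_valid(premises, conclusion, num_vars)
            and all(p(*actual_values) for p in premises))

# Modus ponens: "If P then Q. P. Therefore, Q."
modus_ponens = ([lambda p, q: (not p) or q, lambda p, q: p],
                lambda p, q: q)

# Affirming the consequent: "If P then Q. Q. Therefore, P."
affirming_consequent = ([lambda p, q: (not p) or q, lambda p, q: q],
                        lambda p, q: p)

mp_valid = is_valid(*modus_ponens, num_vars=2)
ac_valid = is_valid(*affirming_consequent, num_vars=2)
print("modus ponens valid:", mp_valid)
print("affirming the consequent valid:", ac_valid)
```

Note that the sketch separates the two questions in the way the definitions do: validity depends only on the form of the argument, whereas soundness additionally requires supplying the actual truth values of the premises. Nothing comparable is attempted here for inductive strength, which is a matter of degree rather than an all-or-nothing property.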

The distinction between deductive and inductive arguments is considered important because, among other things, it is crucial during argument analysis to apply the right evaluative standards to any argument one is considering. Indeed, it is not uncommon to be told that in order to assess any argument, three steps are necessary. First, one is to determine whether the argument being considered is a deductive argument or an inductive one. Second, one is to then determine whether the argument is valid or invalid. Finally, one is to determine whether the argument is sound or unsound (Teays 1996).

All of this would seem to be amongst the least controversial topics in philosophy. Controversies abound in metaphysics, epistemology, and ethics (such as those exhibited in the contexts of Ancient and Environmental Ethics, just to name a couple). By contrast, the basic distinctions between deductive and inductive arguments seem more solid, more secure; in short, more settled than those other topics. Accordingly, one might expect an encyclopedic article on deductive and inductive arguments to simply report the consensus view and to clearly explain and illustrate the distinction for readers not already familiar with it. However, the situation is made more difficult by three facts.

First, there appear to be other forms of argument that do not fit neatly into the classification of deductive or inductive arguments. Govier (1987) calls the view that there are only two kinds of argument (that is, deductive and inductive) “the positivist theory of argument”. She believes that it naturally fits into, and finds justification within, a positivist epistemology, according to which knowledge must be either a priori (stemming from logic or mathematics, deploying deductive arguments) or a posteriori (stemming from the empirical sciences, using inductive arguments). She points out that arguments as most people actually encounter them assume such a wide variety of forms that the “positivist theory of argument” fails to account for a great many of them.

Second, it can be difficult to distinguish arguments in ordinary, everyday discourse as clearly either deductive or inductive. The supposedly sharp distinction tends to blur in many cases, calling into question whether the binary nature of the deductive-inductive distinction is correct.

Third (this point being the main focus of this article), a perusal of elementary logic and critical thinking texts, as well as other presentations aimed at non-specialist readers, demonstrates that there is in fact no consensus about how to draw the supposedly straightforward deductive-inductive argument distinction, at least within the context of introducing the distinction to newcomers. Indeed, proposals vary from locating the distinction within subjective, psychological states of arguers to objective features of the arguments themselves, with other proposals landing somewhere in-between.

Remarkably, not only do proposals vary greatly, but the fact that they do so at all, and that they generate different and indeed incompatible conceptions of the deductive-inductive argument distinction, also seems to go largely unremarked upon by those advancing such proposals. Many authors confidently explain the distinction between deductive and inductive arguments without the slightest indication that there are other apparently incompatible ways of making such a distinction. Moreover, there appears to be little scholarly discussion concerning whether the alleged distinction even makes sense in the first place. That there is a coherent, unproblematic distinction between deductive and inductive arguments, and that the distinction neatly assigns arguments to one or the other of the two non-overlapping kinds, is an assumption that usually goes unnoticed and unchallenged. Even a text with the title Philosophy of Logics (Haack 1978) makes no mention of this fundamental philosophical problem.

A notable exception has already been mentioned in Govier (1987), who explicitly critiques what she calls “the hallowed old distinction between inductive and deductive arguments.” However, her insightful discussion turns out to be the exception that proves the rule. Her critique appears not to have awoken philosophers from their dogmatic slumbers concerning the aforementioned issues of the deductive-inductive argument classification. Moreover, her discussion, while perceptive, does not engage the issue with the level of sustained attention that it deserves, presumably because her primary concerns lay elsewhere. In short, the problem of distinguishing between deductive and inductive arguments seems not to have registered strongly amongst philosophers. A consequence is that the distinction is often presented as if it were entirely unproblematic. Whereas any number of other issues are subjected to penetrating philosophical analysis, this fundamental issue typically traipses past unnoticed.

Accordingly, this article surveys, discusses, and assesses a range of common (and other not-so-common) proposals for distinguishing between deductive and inductive arguments, ranging from psychological approaches that locate the distinction within the subjective mental states of arguers, to approaches that locate the distinction within objective features of arguments themselves. It aims first to provide a sense of the remarkable diversity of views on this topic, and hence of the significant, albeit typically unrecognized, disagreements concerning this issue. Along the way, it is pointed out that none of the proposed distinctions populating the relevant literature are entirely without problems. This is especially the case when related to other philosophical views which many philosophers would be inclined to accept, although some of the problems that many of the proposed distinctions face may be judged to be more serious than others.

In light of these difficulties, a fundamentally different approach is then sketched: rather than treating a categorical deductive-inductive argument distinction as entirely unproblematic (as a great many authors do), these problems are made explicit so that emphasis can be placed on the need to develop evaluative procedures for assessing arguments without identifying them as strictly “deductive” or “inductive.” This evaluative approach to argument analysis respects the fundamental rationale for distinguishing deductive from inductive arguments in the first place, namely as a tool for helping one to decide whether the conclusion of any argument deserves assent. Such an approach bypasses the problems associated with categorical approaches that attempt to draw a sharp distinction between deductive and inductive arguments. Ultimately, the deductive-inductive argument distinction should be dispensed with entirely, a move which is no doubt a counterintuitive conclusion for some that nonetheless can be made plausible by attending to the arguments that follow.

First, a word on strategy. Each of the proposals considered below will be presented from the outset in its most plausible form in order to see why it might seem attractive, at least initially so. The consequences of accepting each proposal are then delineated, consequences that might well give one pause in thinking that the deductive-inductive argument distinction in question is satisfactory.

2. Psychological Approaches

Perhaps the most popular approach to distinguishing between deductive and inductive arguments is to take a subjective psychological state of the agent advancing a given argument to be the crucial factor. For example, one might be informed that whereas a deductive argument is intended to provide logically conclusive support for its conclusion, an inductive argument is intended to provide only probable, but not conclusive, support (Barry 1992; Vaughn 2010; Harrell 2016; and many others). Some accounts of this sort could hardly be more explicit that such psychological factors alone are decisive. From this perspective, then, it may be said that the difference between deductive and inductive arguments does not lie in the words used within the arguments, but rather in the intentions of the arguer. That is to say, the difference between the two types of argument comes from the relationship the arguer takes there to be between the premises and the conclusion. If the arguer believes that the truth of the premises definitely establishes the truth of the conclusion, then the argument is deductive. If the arguer believes that the truth of the premises provides only good reasons to believe the conclusion is probably true, then the argument is inductive. According to this psychological account, the distinction between deductive and inductive arguments is determined exclusively by the intentions and/or beliefs of the person advancing an argument.

This psychological approach entails some interesting, albeit often unacknowledged, consequences. Because the difference between deductive and inductive arguments is said to be determined entirely by what an arguer intends or believes about any given argument, it follows that what is ostensibly the very same argument may be equally both deductive and inductive.

An example may help to illustrate this point. If person A believes that the premise in the argument “Dom Pérignon is a champagne; so, it is made in France” definitely establishes its conclusion (perhaps on the grounds that “champagne” is a type of sparkling wine produced only in the Champagne wine region of France), then according to the psychological approach being considered, this would be a deductive argument. However, if person B believes that the premise of the foregoing argument provides only good reasons to believe that the conclusion is true (perhaps because they think of “champagne” as merely any sort of fizzy wine), then the argument in question is also an inductive argument. Therefore, it is entirely possible on this psychological view for the same argument to be both a deductive and an inductive argument. It is a deductive argument because of what person A believes. It is also an inductive argument because of what person B believes. Indeed, this consequence need not involve different individuals at all. This result follows even if the same individual maintains different beliefs and/or intentions with respect to the argument’s strength at different times.

The belief-relativity inherent in this psychological approach is not by itself an objection, much less a decisive one. Olson (1975) explicitly advances such an account, and frankly embraces its intention- or belief-relative consequences. Perhaps the fundamental nature of arguments is relative to individuals’ intentions or beliefs, and thus the same argument can be both deductive and inductive. However, this psychological approach does place logical constraints on what else one can coherently claim. For example, one cannot coherently maintain that, given the way the terms ‘deductive argument’ and ‘inductive argument’ are categorized here, an argument is always one or the other and never both. If this psychological account of the deductive-inductive argument distinction is accepted, then the latter claim is necessarily false.

Of course, there is a way to reconcile the psychological approach considered here with the claim that an argument is either deductive or inductive, but never both. One could opt to individuate arguments on the basis of individuals’ specific intentions or beliefs about them. In this more sophisticated approach, what counts as a specific argument would depend on the intentions or beliefs regarding it. So, for example, if person A believes that “Dom Pérignon is a champagne; so, it is made in France” definitely establishes the truth of its conclusion, while person B believes that “Dom Pérignon is a champagne; so, it is made in France” provides only good reasons for thinking that its conclusion is true, then there isn’t just one argument here after all. Rather, according to this more sophisticated account, there are two distinct arguments here that just happen to be formulated using precisely the same words. According to this view, the belief that there is just one argument here would be naïve. Hence, it could still be the case that any argument is deductive or inductive, but never both. Arguments just need to be multiplied as needed.
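The contrast between the simple and the more sophisticated ways of individuating arguments can be made concrete with a small model. The following Python sketch is only an illustration under stated assumptions: the names `Advancement`, `classify`, `labels_by_text`, and `labels_by_text_and_belief`, as well as the belief labels `CONCLUSIVE` and `PROBABLE`, are hypothetical and are not drawn from any of the authors cited. It assumes that an arguer's belief about premise-conclusion support can be recorded as one of two values.

```python
from dataclasses import dataclass

# Hypothetical labels for what an arguer believes about how well
# the premises support the conclusion.
CONCLUSIVE = "conclusive"
PROBABLE = "probable"


@dataclass(frozen=True)
class Advancement:
    """One person's advancing of an argument, together with their belief."""
    arguer: str
    text: str
    belief: str


def classify(belief):
    """Psychological criterion: the arguer's belief alone fixes the type."""
    return "deductive" if belief == CONCLUSIVE else "inductive"


def labels_by_text(advancements):
    """Simple view: an argument is individuated by its wording alone."""
    labels = {}
    for adv in advancements:
        labels.setdefault(adv.text, set()).add(classify(adv.belief))
    return labels


def labels_by_text_and_belief(advancements):
    """More sophisticated view: wording plus belief individuates an argument."""
    return {(adv.text, adv.belief): classify(adv.belief) for adv in advancements}


TEXT = "Dom Pérignon is a champagne; so, it is made in France."
advancements = [
    Advancement("A", TEXT, CONCLUSIVE),
    Advancement("B", TEXT, PROBABLE),
]

same_wording = labels_by_text(advancements)
multiplied = labels_by_text_and_belief(advancements)
```

On the simple view, the single entry in `same_wording` carries both labels, so one argument counts as both deductive and inductive. On the more sophisticated view, `multiplied` contains two entries with one label each, which is the sense in which arguments are multiplied as needed.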

However, this more sophisticated strategy engenders some interesting consequences of its own. Since intentions and beliefs can vary in clarity, intensity, and certainty, any ostensible singular argument may turn out to represent as many distinct arguments as there are persons considering a given inference. So, for example, what might initially have seemed like a single argument (say, St. Anselm of Canterbury’s famous ontological argument for the existence of God) might turn out in this view to be any number of different arguments because different thinkers may harbor different degrees of intention or belief about how well the argument’s premises support its conclusion.

On a similar note, the same ostensible single argument may turn out to be any number of arguments if the same individual entertains different intentions or beliefs (or different degrees of intention or belief) at different times concerning how well its premises support its conclusion, as when one reflects upon an argument for some time. Again, this is not necessarily an objection to this psychological approach, much less a decisive one. A proponent of this psychological approach could simply bite the bullet and concede that what at first appeared to be a single argument may in fact be many.

Be that as it may, there are yet other logical consequences of adopting such a psychological account of the deductive-inductive argument distinction that, taken together with the foregoing considerations, may raise doubts about whether such an account could be the best way to capture the relevant distinction. Because intentions and beliefs are not publicly accessible, and indeed may not always be perfectly transparent even to oneself, confident differentiation of deductive and inductive arguments may be hard or even impossible in many, or even in all, cases. For example, in cases where one does not or cannot know what the arguer’s intentions or beliefs are (or were), it is necessarily impossible to identify which type of argument it is, assuming, again, that it must be either one type or the other. If the first step in evaluating an argument is determining which type of argument it is, one cannot even begin.

In response, it might be advised to look for the use of indicator words or phrases as clues to discerning an arguer’s intentions or beliefs. The use of words like “necessarily,” or “it follows that,” or “therefore it must be the case that” could be taken to indicate that the arguer intends the argument to definitely establish its conclusion, and therefore, according to the psychological proposal being considered, one might judge it to be a deductive argument. Alternatively, the use of words like “probably,” “it is reasonable to conclude,” or “it is likely” could be interpreted to indicate that the arguer intends only to make the argument’s conclusion probable. One might judge it to be an inductive argument on that basis.

However, while indicator words or phrases may suggest specific interpretations, they need to be viewed in context, and are far from infallible guides. At best, they are indirect clues as to what any arguer might believe or intend. Someone may say one thing, but intend or believe something else. This need not involve intentional lying. Intentions and beliefs are often opaque, even to the person whose intentions and beliefs they are. Moreover, they are of limited help in providing an unambiguous solution in many cases. Consider the following example:

Most Major League Baseball outfielders consistently have batting averages over .250. Since Ken Singleton played centerfield for the Orioles for three consecutive years, he must have been batting over .250 when he was traded.

If one takes seriously the “must have” clause in the last sentence, it might be concluded that the proponent of this argument intended to provide a deductive argument and thus, according to the psychological approach, it is a deductive argument. If one is not willing to ascribe that intention to the argument’s author, it might be concluded that he meant to advance an inductive argument. In some cases, it simply cannot be known. To offer another example, consider this argument:

It has rained every day so far this month.
If it has rained every day so far this month, then probably it will rain today.
Therefore, probably it will rain today.

The word “probably” appears twice, suggesting that this may be an inductive argument. Yet, many would agree that the argument’s conclusion is “definitely established” by its premises. Consequently, while being on the lookout for the appearance of certain indicator words is a commendable policy for dealing fairly with the arguments one encounters, it does not provide a perfectly reliable criterion for categorically distinguishing deductive and inductive arguments.

This consequence might be viewed as merely an inconvenient limitation on human knowledge, lamentably one more among a great many such limitations. However, there is a deeper worry associated with a psychological approach than has been considered thus far. Recall that a common psychological approach distinguishes deductive and inductive arguments in terms of the intentions or beliefs of the arguer with respect to any given argument being considered. If the arguer intends or believes the argument to be one that definitely establishes its conclusion, then it is a deductive argument. If the arguer intends or believes the argument to be one that merely makes its conclusion probable, then it is an inductive argument. But what if the person putting forth the argument intends or believes neither of those things?

Philosophy instructors routinely share arguments with their students without any firm beliefs regarding whether they definitely establish their conclusions or whether they instead merely make their conclusions probable. Likewise, they may not have any intentions with respect to the arguments in question other than merely the intention to share them with their students. For example, if an argument is put forth merely as an illustration, or rhetorically to show how someone might argue for an interesting thesis, with the person sharing the argument not embracing any intentions or beliefs about what it does show, then on the psychological approach, the argument is neither a deductive nor an inductive argument. This runs counter to the view that every argument must be one or the other.

Nor can it be said that such an argument must be deductive or inductive for someone else, due to the fact that there is no guarantee that anyone has any beliefs or intentions regarding the argument. In this case, then, if the set of sentences in question still qualifies as an argument, what sort of argument is it? It would seem to exist in a kind of logical limbo or no man’s land. It would be neither deductive nor inductive. Furthermore, there is no reason to suppose that it is some other type, unless it isn’t really an argument at all, since no one intends or believes anything about how well it establishes its conclusion. In that case, one is faced with the peculiar situation in which someone believes that a set of sentences is an argument, and yet it cannot be an argument because, according to the psychological view, no one has any intentions for the argument to establish its conclusion, nor any beliefs about how well it does so. However, it could still become a deductive or inductive argument should someone come to embrace it with greater, or with lesser, conviction, respectively. With this view, arguments could continually flicker into and out of existence.

These considerations do not show that a purely psychological criterion for distinguishing deductive and inductive arguments must be wrong, as that would require adopting some other presumably more correct standard for making the deductive-inductive argument distinction, which would then beg the question against any psychological approach. Logically speaking, nothing prevents one from accepting all the foregoing consequences, no matter how strange and inelegant they may be. However, there are other troubling consequences of adopting a psychological approach to consider.

Suppose that it is said that an argument is deductive if the person advancing it believes that it definitely establishes its conclusion. According to this account, if the person advancing an argument believes that it definitely establishes its conclusion, then it is definitively deductive. If, however, everyone else who considers the argument thinks that it makes its conclusion merely probable at best, then the person advancing the argument is completely right and everyone else is necessarily wrong.

For example, consider the following argument: “It has rained nearly every day so far this month. So, it will for sure rain tomorrow as well.” If the person advancing this argument believes that the premise definitely establishes its conclusion, then according to such a psychological view, it is necessarily a deductive argument, despite the fact that it would appear to most others to at best make its conclusion merely probable. Or, to take an even more striking example, consider Dr. Samuel Johnson’s famous attempted refutation of Bishop George Berkeley’s immaterialism (roughly, the view that there are no material things, but only ideas and minds) by forcefully kicking a stone and proclaiming “I refute it thus!” If Dr. Johnson sincerely believed that by his action he had logically refuted Berkeley’s immaterialism, then his stone-kicking declaration would be a deductive argument.

Likewise, some arguments that look like an example of a deductive argument will have to be re-classified on this view as inductive arguments if the authors of such arguments believe that the premises provide merely good reasons to accept the conclusions as true. For example, someone might give the following argument:

All men are mortal.
Socrates is a man.
Therefore, Socrates is mortal.

This is the classic example of a deductive argument included in many logic texts. However, if someone advancing this argument believes that the conclusion is merely probable given the premises, then it would, according to this psychological proposal, necessarily be an inductive argument, and not just merely be believed to be so, given that it meets a sufficient condition for being inductive.

A variation on this psychological approach focuses not on intentions and beliefs, but rather on doubts. According to this alternative view, a deductive argument is one such that, if one accepts the truth of the premises, one cannot doubt the truth of the conclusion. By contrast, an inductive argument is one such that, if one accepts the truth of the premises, one can doubt the truth of the conclusion. This view is sometimes expressed by saying that deductive arguments establish their conclusions “beyond a reasonable doubt” (Teays 1996). Deductive arguments, in this view, may be said to be psychologically compelling in a way that inductive arguments are not. Good deductive arguments compel assent, but even quite good inductive arguments do not.

However, a moment’s reflection demonstrates that this approach entails many of the same awkward consequences as do the other psychological criteria previously discussed. What people are capable of doubting is as variable as what they might intend or believe, making this doubt-centered view subject to the same sorts of agent-relative implications facing any intention-or-belief approach.

One might try to circumvent these difficulties by saying that a deductive argument should be understood as one that establishes its conclusion beyond a reasonable doubt. In other words, given the truth of the premises, one should not doubt the truth of the conclusion. Likewise, one might say that an inductive argument is one such that, given the truth of the premises, one should be permitted to doubt the truth of the conclusion. However, this tactic would be to change the subject from the question of what categorically distinguishes deductive and inductive arguments to that of the grounds for deciding whether an argument is a good one – a worthwhile question to ask, to be sure, but a different question than the one being considered here.

Again, in the absence of some independently established distinction between deductive and inductive arguments, these consequences alone cannot refute any psychological account. Collectively, however, they raise questions about whether this way of distinguishing deductive and inductive arguments should be accepted, given that such consequences are hard to reconcile with other common beliefs about arguments, say, about how individuals can be mistaken about what sort of argument they are advancing. Luckily, there are other approaches. However, upon closer analysis these other approaches fare no better than the various psychological approaches thus far considered.

3. Behavioral Approaches

Psychological approaches are, broadly speaking, cognitive. They concern individuals’ mental states, specifically their intentions, beliefs, and/or doubts. Given the necessarily private character of mental states (assuming that brain scans, so far at least, provide only indirect evidence of individuals’ mental states), it may be impossible to know what an individual’s intentions or beliefs really are, or what they are or are not capable of doubting. Hence, it may be impossible given any one psychological approach to know whether any given argument one is considering is a deductive or an inductive one. That and other consequences of that approach seem less than ideal. Can such consequences be avoided?

The problem of knowing others’ minds is not new. A movement in psychology that flourished in the mid-20th century, some of whose tenets are still evident within 21st century psychological science, was intended to circumvent problems associated with the essentially private nature of mental states in order to put psychology on a properly scientific footing. According to Behaviorism, one can set aside speculations about individuals’ inaccessible mental states to focus instead on individuals’ publicly observable behaviors. According to certain behaviorists, any purported psychological state can be re-described as a set of behaviors. For example, a belief such as “It will rain today” might be cashed out along the lines of an individual’s behavior of putting on wet-weather gear or carrying an umbrella, behaviors that are empirically accessible insofar as they are available for objective observation. In this way, it was hoped, one can bypass unknowable mental states entirely.

Setting aside the question of whether Behaviorism is viable as a general approach to the mind, a focus on behavior rather than on subjective psychological states in order to distinguish deductive and inductive arguments promises to circumvent the epistemic problems facing a cognitive approach. According to one such proposal, a deductive argument is one whose premises are claimed to support the conclusion such that it would be impossible for the premises to be true and for the conclusion to be false. An inductive argument is one whose premises are claimed to provide only some less-than-conclusive grounds for accepting the conclusion (Copi 1978; Hurley and Watson 2018). A variation on this approach says that deductive arguments are ones in which the conclusion is presented as following from the premises with necessity, whereas inductive arguments are ones in which the conclusion is presented as following from the premises only with some probability (Engel 1994). Notice that, unlike intending or believing, “claiming” and “presenting” are expressible as observable behaviors.

This behavioral approach thus promises to circumvent the epistemic problems facing psychological approaches. What someone explicitly claims an argument shows can usually, or at least often, be determined rather unproblematically. For example, if someone declares “The following argument is a deductive argument, that is, an argument whose premises definitely establish its conclusion,” then, according to the behavioral approach being considered here, it would be a sufficient condition to judge the argument in question to be a deductive argument. Likewise, if someone insists “The following argument is an inductive argument, that is, an argument such that if its premises are true, the conclusion is, at best, probably true as well,” this would be a sufficient condition to conclude that such an argument is inductive. Consequently, some of the problems associated with psychological proposals fall by the wayside. Initially, therefore, this approach looks promising.

The most obvious problem with this approach is that few arguments come equipped with a statement explicitly declaring what sort of argument it is thought to be. As Govier (1987) sardonically notes, “Few arguers are so considerate as to give us a clear indication as to whether they are claiming absolute conclusiveness in the technical sense in which logicians understand it.” This leaves plenty of room for interpretation and speculation concerning the vast majority of arguments, thereby negating the chief hoped-for advantage of focusing on behaviors rather than on psychological states.

Alas, other problems loom as well. Having already considered some of the troubling agent-relative consequences of adopting a purely psychological account, it will be easy to anticipate that behavioral approaches, while avoiding some of the psychological approach’s epistemic problems, nonetheless will inherit many of the latter’s agent-relativistic problems in virtually identical form.

First, what is ostensibly the very same argument (that is, consisting of the same sequence of words) in this view may be both a deductive and an inductive argument when advanced by individuals making different claims about what the argument purports to show, regardless of how unreasonable those claims appear to be on other grounds. For example, the following argument (a paradigmatic instance of the modus ponens argument form) would be a deductive argument if person A claims that, or otherwise behaves as if, the premises definitely establish the conclusion:

If P, then Q.
P.
Therefore, Q.

(The capital letters exhibited in this argument are to be understood as variables that can be replaced with declarative sentences, statements, or propositions, namely, items that are true or false. The investigation of logical forms that involve whole sentences is called Propositional Logic.)

However, by the same token, the foregoing argument equally would be an inductive argument if person B claims (even insincerely so, since psychological factors are by definition irrelevant under this view) that its premises provide only less than conclusive support for its conclusion.
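Whatever individuals may claim about it, the modus ponens form itself can be tested for validity mechanically within Propositional Logic. The following Python sketch enumerates every assignment of truth values to P and Q and looks for one on which all premises are true and the conclusion false. It is a minimal illustration, not a definitive procedure: it assumes that “If P, then Q” is read as the material conditional, and the helper names `implies` and `is_valid` are hypothetical. The second form tested, affirming the consequent, does not appear in the discussion above and is included only for contrast.

```python
from itertools import product


def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion, variables):
    """Return True if no assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False  # counterexample found
    return True


# Modus ponens: If P, then Q. P. Therefore, Q.
modus_ponens_valid = is_valid(
    premises=[lambda env: implies(env["P"], env["Q"]), lambda env: env["P"]],
    conclusion=lambda env: env["Q"],
    variables=["P", "Q"],
)

# Contrast case (an addition for illustration): If P, then Q. Q. Therefore, P.
affirming_consequent_valid = is_valid(
    premises=[lambda env: implies(env["P"], env["Q"]), lambda env: env["Q"]],
    conclusion=lambda env: env["P"],
    variables=["P", "Q"],
)
```

Here `modus_ponens_valid` is `True`, while `affirming_consequent_valid` is `False`, since the assignment on which P is false and Q is true makes both premises of the second form true and its conclusion false. The check concerns only the form of the argument. On the behavioral approach under consideration, it has no bearing on whether the argument counts as deductive or inductive, since that is fixed by what is claimed about it.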

Likewise, the following argument would be an inductive argument if person A claims that its premise provides less than conclusive support for its conclusion:

A random sample of voters in Los Angeles County supports a new leash law for pet turtles; so, the law will probably pass by a very wide margin.

However, it would also be a deductive argument if person B claims that its premise definitely establishes the truth of its conclusion. Recall that, on a behavioral approach, whether an argument is deductive or inductive is entirely relative to individuals’ claims about it, or to some other behavior. Indeed, this need not involve different individuals at all. An argument would be both a deductive and an inductive argument if the same individual makes contrary claims about it, say, at different times.

If one finds these consequences irksome, one could opt to individuate arguments on the basis of claims about them. So, two individuals might each claim that “Dom Pérignon is a champagne; so, it is made in France.” But if person A claims that the premise of this argument definitely establishes its conclusion, whereas person B claims that the premise merely makes its conclusion probable, there isn’t just one argument about Dom Pérignon being considered, but two: one deductive, the other inductive, each one corresponding to one of the two different claims. There is no need to rehearse the by-now familiar worries concerning these issues, given that these issues are nearly identical to the various ones discussed with regard to the aforementioned psychological approaches.

A proponent of any sort of behavioral approach might bite the bullet and accept all of the foregoing consequences. Since no alternative unproblematic account of the deduction-induction distinction has been presented thus far, such consequences cannot show that a behavioral approach is simply wrong. Likewise, the relativism inherent in this approach is not by itself an objection. Perhaps the distinction between deductive and inductive arguments is relative to the claims made about them. However, this approach is incompatible with the common belief that an argument is either deductive or inductive, but never both. This latter belief would have to be jettisoned if a behavioral view were to be adopted.

4. Arguments that “Purport”

Both the psychological and behavioral approaches take some aspect of an agent (various mental states or behaviors, respectively) to be the decisive factor distinguishing deductive from inductive arguments. An alternative to these approaches, on the other hand, would be to take some feature of the arguments themselves to be the crucial consideration instead. One such proposal of this type states that if an argument purports to definitely establish its conclusion, it is a deductive argument, whereas if an argument purports only to provide good reasons in support of its conclusion, it is an inductive argument (Black 1967). Another way to express this view involves saying that an argument that aims at being logically valid is deductive, whereas an argument that aims merely at making its conclusion probable is an inductive argument (White 1989; Perry and Bratman 1999; Harrell 2016). The primary attraction of these “purporting” or “aiming” approaches is that they promise to sidestep the thorny problems with the psychological and behavioral approaches detailed above by focusing on a feature of arguments themselves rather than on the persons advancing them. However, they generate some puzzles of their own that are worth considering.

The puzzles at issue all concern the notion of an argument “purporting” (or “aiming”) to do something. One might argue that “purporting” is something that only intentional agents can do, either directly or indirectly. Skyrms (1975) makes this criticism with regard to arguments that are said to intend a conclusion with a certain degree of support. A person, being an intentional agent, may purport to be telling the truth, or may purport to have more formal authority than they really possess, to give just a couple of examples. The products of such intentional agents (sentences, behaviors, and the like) may be said to purport to do something, but they in turn depend on what some intentional agent purports. Consequently, this “purporting” approach may collapse into a psychological or behavioral approach.

Suppose, however, that one takes arguments themselves to be the sorts of things that can purport to support their conclusions either conclusively or with strong probability. How does one distinguish the former type of argument from the latter, especially in cases in which it is not clear what the argument itself purports to show? Recall the example used previously: “Dom Pérignon is a champagne; so, it is made in France.” How strongly does this argument purport to support its conclusion? As already seen, this argument could be interpreted as purporting to show that the conclusion is logically entailed by the premise, since, by definition, “champagne” is a type of sparkling wine produced only in France. On the other hand, the argument could also be interpreted as purporting to show only that Dom Pérignon is probably made in France, since so much wine is produced in France. How does one know what an argument really purports?

One might attempt to answer this question by supposing that an argument’s purport is conveyed by certain indicator words. Words like “necessarily” may purport that the conclusion logically follows from the premises, whereas words like “probably” may purport that the conclusion is merely made probable by the premises. However, consider the following argument: “The economy will probably improve this year; so, necessarily, the economy will improve this year.” The word “probably” could be taken to indicate that this purports to be an inductive argument. The word “necessarily” could be taken to signal that this argument purports to be a deductive argument. So, which is it? One cannot strictly tell from these indicator words alone. Granted, this is indeed a very strange argument, but that is the point. What does the argument in question really purport, then? Whatever one says about the argument’s validity or soundness, highlighting its indicator words does not make clear precisely what it purports. So, highlighting indicator words may not always be a helpful strategy. To make matters more complicated, specifying from the outset that an argument purports to show something introduces an element of interpretation that is at odds with what was supposed to be the main selling point of this approach: that distinguishing deductive and inductive arguments depends solely on objective features of arguments themselves, rather than on agents’ intentions or interpretations.
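The difficulty with indicator words can be made concrete with a minimal sketch. The cue-word lists and the function name `classify_by_indicators` are illustrative assumptions rather than an established method; the point is only that a purely word-based test returns a conflicting verdict for the economy argument and no verdict at all for the Dom Pérignon argument.

```python
# Hypothetical cue-word lists, chosen for illustration only.
DEDUCTIVE_CUES = {"necessarily", "certainly", "must"}
INDUCTIVE_CUES = {"probably", "likely"}

def classify_by_indicators(argument: str) -> str:
    """Label an argument using only the cue words it contains."""
    # Normalize each word: strip surrounding punctuation and lowercase it.
    words = {w.strip(".,;:!?\"'“”").lower() for w in argument.split()}
    has_deductive = bool(words & DEDUCTIVE_CUES)
    has_inductive = bool(words & INDUCTIVE_CUES)
    if has_deductive and has_inductive:
        return "conflicting"
    if has_deductive:
        return "deductive"
    if has_inductive:
        return "inductive"
    return "undetermined"

economy = ("The economy will probably improve this year; "
           "so, necessarily, the economy will improve this year.")
champagne = "Dom Pérignon is a champagne; so, it is made in France."

print(classify_by_indicators(economy))    # conflicting
print(classify_by_indicators(champagne))  # undetermined
```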

5. Evidential Completeness

Another proposal for distinguishing deductive from inductive arguments with reference to features of arguments themselves focuses on evidential completeness. One might be told, for example, that an inductive argument is one that can be affected by acquiring new premises (evidence), but a deductive argument cannot be. Or, one might be told that whereas the premises in a deductive argument “stand alone” to sufficiently support its conclusion, all inductive arguments have “missing pieces of evidence” (Teays 1996). This evidential completeness approach is distinct from the psychological approaches considered above, given that an argument could be affected (that is, it could be strengthened or weakened) by acquiring new premises regardless of anyone’s intentions or beliefs about the argument under consideration. It is also distinct from the behavioral views discussed above, given that an argument could be affected by acquiring new premises without anyone claiming or presenting anything about it. Finally, it is distinct from the “purporting” view, too, since whether an argument can be affected by acquiring additional premises has no evident connection with what an argument purports to show.

How well does such an evidential completeness approach work to categorically distinguish deductive and inductive arguments? Once again, examination of an example may help to shed light on some of the implications of this approach. Consider the following argument:

All men are mortal.
Therefore, Socrates is mortal.

On the evidential completeness approach, this cannot be a deductive argument because it can be affected by adding a new premise, namely “Socrates is a man.” The addition of this premise makes the argument valid, a characteristic that only deductive arguments can have. On the other hand, were one to acquire the premise “Socrates is a god,” this also would greatly affect the argument, specifically by weakening it. At least in this case, adding a premise makes a difference. Without the inclusion of the “Socrates is a man” premise, it would be considered an inductive argument. With the “Socrates is a man” premise, the argument is deductive. As such, then, the evidential completeness approach looks promising.
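The effect of the added premise can be illustrated with a small truth-table check. This is a sketch under a simplifying assumption: the universal premise “All men are mortal” is replaced by its propositional instance “if Socrates is a man, then Socrates is mortal,” and the names `is_valid`, `m`, and `d` are introduced here purely for illustration.

```python
from itertools import product

def is_valid(premises, conclusion, n_vars=2):
    """True if no truth assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# m: "Socrates is a man"; d: "Socrates is mortal".
# Assumption: the universal premise is simplified to "if m then d".
all_men_mortal = lambda m, d: (not m) or d
socrates_is_man = lambda m, d: m
socrates_is_mortal = lambda m, d: d

print(is_valid([all_men_mortal], socrates_is_mortal))                   # False
print(is_valid([all_men_mortal, socrates_is_man], socrates_is_mortal))  # True
```

On this simplified rendering, the one-premise argument is invalid and the two-premise argument is valid, which matches the verdicts described above.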

However, it is worth noticing that to say that a deductive argument is one that cannot be affected (that is, it cannot be strengthened or weakened) by acquiring additional evidence or premises, whereas an inductive argument is one that can be affected by additional evidence or premises, is to already begin with an evaluation of the argument in question, only then to proceed to categorize it as deductive or inductive. “Strengthening” and “weakening” are evaluative assessments. This is to say that, with the evidential completeness approach being considered here, the categorization follows rather than precedes argument analysis and evaluation. This is precisely the opposite of the traditional claim that categorizing an argument as deductive or inductive must precede its analysis and evaluation. If categorization follows rather than precedes evaluation, one might wonder what actual work the categorization is doing. Be that as it may, perhaps in addition to such concerns, there is something to be said with regard to the idea that deductive and inductive arguments may differ in the way that their premises relate to their conclusions. That is an idea that deserves to be examined more closely.

6. Logical Necessity vs. Probability

Govier (1987) observes that “Most logic texts state that deductive arguments are those that ‘involve the claim’ that the truth of the premises renders the falsity of the conclusion impossible, whereas inductive arguments ‘involve’ the lesser claim that the truth of the premises renders the falsity of the conclusion unlikely, or improbable.” Setting aside the “involve the claim” clause (which Govier rightly puts in scare quotes), what is significant about this observation is how deductive and inductive arguments are said to differ in the way in which their premises are related to their conclusions.

Anyone acquainted with introductory logic texts will find quite familiar many of the following characterizations, one of them being the idea of “necessity.” For example, McInerny (2012) states that “a deductive argument is one whose conclusion always follows necessarily from the premises.” An inductive argument, by contrast, is one whose conclusion is merely made probable by the premises. Stated differently, “A deductive argument is one that would be justified by claiming that if the premises are true, they necessarily establish the truth of the conclusion” (Churchill 1987). Similarly, “deductive arguments … are arguments whose premises, if true, guarantee the truth of the conclusion” (Bowell and Kemp 2015). Or, one may be informed that in a valid deductive argument, anyone who accepts the premises is logically bound to accept the conclusion, whereas inductive arguments are never such that one is logically bound to accept the conclusion, even if one entirely accepts the premises (Solomon 1993). Furthermore, one might be told that a valid deductive argument is one in which it is impossible for the conclusion to be false given its true premises, whereas that is possible for an inductive argument.

Neidorf (1967) says that in a valid deductive argument, the conclusion certainly follows from the premises, whereas in an inductive argument, it probably does. Likewise, Salmon (1963) explains that in a deductive argument, if all the premises are true, the conclusion must be true, whereas in an inductive argument, if all the premises are true, the conclusion is only probably true. In a later edition of the same work, he says that “We may summarize by saying that the inductive argument expands upon the content of the premises by sacrificing necessity, whereas the deductive argument achieves necessity by sacrificing any expansion of content” (Salmon 1984).

Another popular approach along the same lines is to say that “the conclusion of a deductively valid argument is already ‘contained’ in the premises,” whereas inductive arguments have conclusions that “go beyond what is contained in their premises” (Hausman, Boardman, and Howard 2021). Likewise, one might be informed that “In a deductive argument, the … conclusion makes explicit a bit of information already implicit in the premises … Deductive inference involves the rearranging of information.” By contrast, “The conclusion of an inductive argument ‘goes beyond’ the premises” (Churchill 1986). A similar idea is expressed by saying that whereas deductive arguments are “demonstrative,” inductive arguments “outrun” their premises (Rescher 1976). The image one is left with in such presentations is that in deductive arguments, the conclusion is “hidden in” the premises, waiting there to be “squeezed” out of them, whereas the conclusion of an inductive argument has to be supplied from some other source. In other words, deductive arguments, in this view, are explicative, whereas inductive arguments are ampliative. These are all interesting suggestions, but their import may not yet be clear. Such import must now be made explicit.

7. The Question of Validity

Readers may have noticed in the foregoing discussion of such “necessitarian” characterizations of deductive and inductive arguments that whereas some authors identify deductive arguments as those whose premises necessitate their conclusions, others are careful to limit that characterization to valid deductive arguments. After all, it is only in valid deductive arguments that the conclusion follows with logical necessity from the premises. A different way to put it is that only in valid deductive arguments is the truth of the conclusion guaranteed by the truth of the premises; or, to use yet another characterization, only in valid deductive arguments do those who accept the premises find themselves logically bound to accept the conclusion. One could say that it is impossible for the conclusion to be false given that the premises are true, or that the conclusion is already contained in the premises (that is, the premises are necessarily truth-preserving). Thus, strictly speaking, these various necessitarian proposals apply only to a distinction between valid deductive arguments and inductive arguments.

Some authors appear to embrace such a conclusion. McIntyre (2019) writes the following:

Deductive arguments are and always will be valid because the truth of the premises is sufficient to guarantee the truth of the conclusion; if the premises are true, the conclusion will be also. This is to say that the truth of the conclusion cannot contain any information that is not already contained in the premises.

By contrast, he mentions that “With inductive arguments, the conclusion contains information that goes beyond what is contained in the premises.” Such a stance might well be thought to be no problem at all. After all, if an argument is valid, it is necessarily deductive; if it isn’t valid, then it is necessarily inductive. The notion of validity, therefore, appears to neatly sort arguments into either of the two categorically different argument types – deductive or inductive. Validity, then, may be the answer to the problems thus far mentioned.

There is, however, a cost to this tidy solution. Many philosophers want to say not only that all valid arguments are deductive, but also that not all deductive arguments are valid, and that whether a deductive argument is valid or invalid depends on its logical form. In other words, they want to leave open the possibility of there being invalid deductive arguments. The psychological approaches already considered do leave open this possibility, since they distinguish deductive and inductive arguments in relation to an arguer’s intentions and beliefs, rather than in relation to features of arguments themselves. Notice, however, that on the necessitarian proposals now being considered, there can be no invalid deductive arguments. “Deduction,” in this account, turns out to be a success term. There are no bad deductive arguments, at least so far as logical form is concerned (soundness being an entirely different matter). Consequently, if one adopts one of these necessitarian accounts, claims like the following must be judged to be simply incoherent: “A bad, or invalid, deductive argument is one whose form or structure is such that instances of it do, on occasion, proceed from true premises to a false conclusion” (Bergmann, Moor, and Nelson 1998). If deductive arguments are identical with valid arguments, then an “invalid deductive argument” is simply impossible: there cannot be any such type of argument. Salmon (1984) makes this point explicit, and even embraces it. Remarkably, he also extends automatic success to all bona fide inductive arguments, telling readers that “strictly speaking, there are no incorrect deductive or inductive arguments; there are valid deductions, correct inductions, and assorted fallacious arguments.” Essentially, therefore, one has a taxonomy of good and bad arguments.

Pointing out these consequences does not show that the necessitarian approach is wrong, however. One might simply accept that all deductive arguments are valid, and that all inductive arguments are strong, because “to be valid” and “to be strong” are just what it means to be a deductive or an inductive argument, respectively. One must then classify bad arguments as neither deductive nor inductive. An even more radical alternative would be to deny that bad “arguments” are arguments at all.

Still, to see why one might find these consequences problematic, consider the following argument:

If P, then Q.
Q.
Therefore, P.

This argument form is known as “affirming the consequent.” It is identified in introductory logic texts as a logical fallacy. In colloquial terms, someone may refer to a widely-accepted but false belief as a “fallacy.” In logic, however, a fallacy is not a mistaken belief. Rather, it is a mistaken form of inference. Arguments can fail as such in at least two distinct ways: their premises can be false (or unclear, incoherent, and so on), and the connection between the premises and conclusion can be defective. In logic, a fallacy is a failure of the latter sort. Introductory logic texts usually classify fallacies as either “formal” or “informal.” An ad hominem (Latin for “against the person”) attack is a classic informal fallacy. By contrast, “affirming the consequent,” such as the example above, is classified as a formal fallacy.
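That the defect lies in the connection between premises and conclusion can be checked mechanically. The following minimal sketch (the function name `counterexamples` is an illustrative choice) enumerates every assignment of truth values to P and Q, reading “If P, then Q” as the material conditional, and collects those on which both premises are true while the conclusion is false.

```python
from itertools import product

def counterexamples():
    """Return every (P, Q) assignment with true premises and a false conclusion."""
    found = []
    for p, q in product([True, False], repeat=2):
        premise_1 = (not p) or q   # If P, then Q
        premise_2 = q              # Q
        conclusion = p             # Therefore, P
        if premise_1 and premise_2 and not conclusion:
            found.append((p, q))
    return found

print(counterexamples())  # [(False, True)]
```

A single counterexample (P false, Q true) is enough to show that the form does not guarantee its conclusion.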

How are these considerations relevant to the deductive-inductive argument distinction under consideration? On the proposal being considered, the argument above in which “affirming the consequent” is exhibited cannot be a deductive argument, indeed not even a bad one, since it is manifestly invalid, given that all deductive arguments are necessarily valid. Rather, since the premises do not necessitate the conclusion, it must be an inductive argument. This is the case unless one follows Salmon (1984) in saying that it is neither deductive nor inductive but, being an instance of affirming the consequent, it is simply fallacious.

Perhaps it is easy to accept such a consequence. Necessitarian proposals are not out of consideration yet, however. Part of the appeal of such proposals is that they seem to provide philosophers with an understanding of how premises and conclusions are related to one another in valid deductive arguments. Is this a useful proposal after all?

Consider the idea that in a valid deductive argument, the conclusion is already contained in the premises. What might this mean? Certainly, all the words that appear in the conclusion of a valid argument need not appear in its premises. Rather, what is supposed to be contained in the premises of a valid argument is the claim expressed in its conclusion. This is the case given that in a valid argument the premises logically entail the conclusion. So, it can certainly be said that the claim expressed in the conclusion of a valid argument is already contained in the premises of the argument, since the premises entail the conclusion. Has there thus been any progress made in understanding validity?

To answer that question, consider the following six arguments, all of which are logically valid:

P.
Therefore, P.

P.
Therefore, either Q or not Q.

P and not P.
Therefore, Q.

P.
Therefore, P or Q.

P.
Therefore, if Q then Q.

P.
Therefore, if not P, then Q.

In any of these cases (except the first), is it at all obvious how the conclusion is contained in the premise? Insofar as the locution “contained in” is supposed to convey an understanding of validity, such accounts fall short of such an explicative ambition. This calls into question the aptness of the “contained in” metaphor for explaining the relationship between premises and conclusions regarding valid arguments.
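The validity of all six arguments can be confirmed by brute force, even where the “contained in” metaphor gives little guidance. The sketch below assumes classical two-valued logic and the material conditional; the helper names `is_valid` and `implies` are illustrative.

```python
from itertools import product

def is_valid(premise, conclusion):
    """Valid iff no assignment to P and Q makes the premise true and the conclusion false."""
    return all(conclusion(p, q)
               for p, q in product([True, False], repeat=2)
               if premise(p, q))

def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

arguments = {
    "P; therefore, P": (lambda p, q: p, lambda p, q: p),
    "P; therefore, either Q or not Q": (lambda p, q: p, lambda p, q: q or not q),
    "P and not P; therefore, Q": (lambda p, q: p and not p, lambda p, q: q),
    "P; therefore, P or Q": (lambda p, q: p, lambda p, q: p or q),
    "P; therefore, if Q then Q": (lambda p, q: p, lambda p, q: implies(q, q)),
    "P; therefore, if not P, then Q": (lambda p, q: p, lambda p, q: implies(not p, q)),
}

results = {name: is_valid(prem, concl) for name, (prem, concl) in arguments.items()}
for name, ok in results.items():
    print(f"{name}: {'valid' if ok else 'invalid'}")
```

Every argument is reported as valid. The third is valid only vacuously, since no assignment makes “P and not P” true, which is one reason the conclusion is hard to describe as contained in the premise.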

8. Formalization and Logical Rules to the Rescue?

In the previous section, it was assumed that some arguments can be determined to be logically valid simply in virtue of their abstract form. After all, the “P”s and “Q”s in the foregoing arguments are just variables or placeholders. It is the logical form of those arguments that determines whether they are valid or invalid. Rendering arguments in symbolic form helps to reveal their logical structure. Might not this insight provide a clue as to how one might categorically distinguish deductive and inductive arguments? Perhaps it is an argument’s capacity or incapacity for being rendered in symbolic form that distinguishes an argument as deductive or inductive, respectively.

To assess this idea, consider the following argument:

If today is Tuesday, we’ll be having tacos for lunch.
Today is Tuesday.
So, we’ll be having tacos for lunch.

This argument is an instance of the valid argument form modus ponens, which can be expressed symbolically as:

P → Q.
P.
∴ Q.

Any argument having this formal structure is a valid deductive argument and automatically can be seen as such. Significantly, according to the proposal that deductive but not inductive arguments can be rendered in symbolic form, a deductive argument need not instantiate a valid argument form. Recall the fallacious argument form known as “affirming the consequent”:

If P, then Q.
Q.
Therefore, P.

It, too, can be rendered in purely symbolic notation:

P → Q.
Q.
∴ P.

Consequently, this approach would permit one to say that deductive arguments may be valid or invalid, just as some philosophers would wish. It might be thought, on the other hand, that inductive arguments do not lend themselves to this sort of formalization. They are just too polymorphic to be represented in purely formal notation.

Note, however, that the success of this proposal depends on all inductive arguments being incapable of being represented formally. Unfortunately for this proposal, all arguments, both deductive and inductive, are capable of being rendered in formal notation. For example, consider the following argument:

We usually have tacos for lunch on Tuesdays.
Today is Tuesday.
So, we’re probably having tacos for lunch.

In other words, given that today is Tuesday, there is a better than even chance that tacos will be had for lunch. This might be rendered formally as:

P(A/B) > 0.5

It must be emphasized that the point here is not that this is the only or even the best way to render the argument in question in symbolic form. Rather, the point is that inductive arguments, no less than deductive arguments, can be rendered symbolically, or, at the very least, the burden of proof rests on deniers of this claim. But, if so, then it seems that the capacity for symbolic formalization cannot categorically distinguish deductive from inductive arguments.
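To see how such a formal rendering might be put to work, consider a minimal sketch that reads A as “we have tacos for lunch” and B as “today is Tuesday.” That reading, the function name, and the lunch log are all assumptions made here for illustration; the data are invented and carry no evidential weight.

```python
# Hypothetical lunch log: (day_is_tuesday, had_tacos). Invented for illustration only.
lunch_log = [
    (True, True), (True, True), (True, False), (True, True),
    (False, False), (False, True), (False, False), (False, False),
]

def conditional_probability(log):
    """Estimate P(A/B): the share of Tuesdays (B) on which tacos were had (A)."""
    tuesdays = [had_tacos for is_tuesday, had_tacos in log if is_tuesday]
    return sum(tuesdays) / len(tuesdays)

p_tacos_given_tuesday = conditional_probability(lunch_log)
print(p_tacos_given_tuesday)        # 0.75
print(p_tacos_given_tuesday > 0.5)  # True
```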

Another approach would be to say that whereas deductive arguments involve reasoning from one statement to another by means of logical rules, inductive arguments defy such rigid characterization (Solomon 1993). In this view, identifying a logical rule governing an argument would be sufficient to show that the argument is deductive. Failure to identify such a rule governing an argument, however, would not be sufficient to demonstrate that the argument is not deductive, since logical rules may nonetheless be operative but remain unrecognized.

The “reasoning” clause in this proposal is also worth reflecting upon. Reasoning is something that some rational agents do on some occasions. Strictly speaking, arguments, consisting of sentences lacking cognition, do not reason (recall that earlier a similar point was considered regarding the idea of arguments purporting something). Consequently, the “reasoning” clause is ambiguous, since it may mean either that: (a) there is a logical rule that governs (that is, justifies, warrants, or the like) the inference from the premise to the conclusion; or (b) some cognitional agent either explicitly or implicitly uses a logical rule to reason from one statement (or a set of statements) to another.

If the former, more generous interpretation is assumed, it is easy to see how this suggestion might work with respect to deductive arguments. Consider the following argument:

If today is Tuesday, then the taco truck is here.
The taco truck is not here.
Therefore, today is not Tuesday.

This argument instantiates the logical rule modus tollens:

If P, then Q.          P → Q
Not Q.                 ~Q
Therefore, not P.      ∴ ~P

Perhaps all deductive arguments explicitly or implicitly rely upon logical rules. However, for this proposal to categorically distinguish deductive from inductive arguments, it must be the case both that all deductive arguments embody logical rules, and that no inductive arguments do.

Is this true? It is not entirely clear. A good case can be made that all valid deductive arguments embody logical rules (such as modus ponens or modus tollens). However, if one wants to include some invalid arguments within the set of all deductive arguments, then it is hard to see what logical rules could underwrite invalid argument types such as affirming the consequent or denying the antecedent. It would seem bizarre to say that in inferring “P” from “If P, then Q” and “Q” that one relied upon the logical rule “affirming the consequent.” That is not a logical rule. It is a classic logical fallacy.
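The contrast between a genuine logical rule and a fallacious form can be stated precisely in a proof assistant. The following Lean 4 sketch is offered only as an illustration; the theorem names are chosen here and are not drawn from any library. The first theorem derives modus tollens, and the second shows that affirming the consequent cannot hold in general by exhibiting an instance with true premises and a false conclusion.

```lean
-- Modus tollens is derivable: from P → Q and ¬Q, infer ¬P.
theorem modus_tollens (P Q : Prop) (h : P → Q) (hnq : ¬Q) : ¬P :=
  fun hp => hnq (h hp)

-- Affirming the consequent is not a valid rule: taking P := False and
-- Q := True makes both premises true and the conclusion false.
theorem affirming_consequent_fails :
    ¬ (∀ P Q : Prop, (P → Q) → Q → P) :=
  fun h => h False True (fun _ => True.intro) True.intro
```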

Likewise, consider the following argument that many would consider to be an inductive argument:

Nearly all individuals polled in a random sample of registered voters contacted one week before the upcoming election indicated that they would vote to re-elect Senator Blowhard. Therefore, Senator Blowhard will be re-elected.

There may be any number of rules implicit in the foregoing inference. For example, the rule implicit in this argument might be something like this:

Random sampling of a relevant population’s voting preferences one week before an election provides good grounds for predicting that election’s results.

This is no doubt some sort of rule, even if it does not explicitly follow the more clear-cut logical rules thus far mentioned. Is the above the right sort of rule, however? Perhaps deductive arguments are those that involve reasoning from one statement to another by means of deductive rules. One could then stipulate what those deductive logical rules are, such that they exclude rules like the one implicit in the ostensibly inductive argument above. This would resolve the problem of distinguishing between deductive and inductive arguments, but at the cost of circularity (that is, by committing a logical fallacy).

If one objected that the inductive rule suggested above is not a formal rule, then a formal version of the rule could be devised. However, if that is right, then the current proposal, which states that deductive arguments, but not inductive ones, involve reasoning from one statement to another by means of logical rules, is false. Inductive arguments rely, or at least can rely, upon logical rules as well.
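One way a formal version of the polling rule might look is sketched below. It is an assumption-laden illustration, not a claim about how such a rule must be formalized: it uses the textbook normal approximation for a sample proportion, a hypothetical decision threshold of one half, and invented poll figures, and it ignores turnout, sampling bias, and late shifts in opinion.

```python
import math

def lower_confidence_bound(supporters: int, sample_size: int, z: float = 1.96) -> float:
    """Lower end of a normal-approximation confidence interval for a sample proportion."""
    p_hat = supporters / sample_size
    standard_error = math.sqrt(p_hat * (1 - p_hat) / sample_size)
    return p_hat - z * standard_error

def predicts_win(supporters: int, sample_size: int) -> bool:
    """Hypothetical rule: predict a win if the lower bound exceeds one half."""
    return lower_confidence_bound(supporters, sample_size) > 0.5

# Invented figures: 900 of 1,000 sampled voters favour the incumbent.
print(predicts_win(900, 1000))  # True
# A narrow lead of 510 out of 1,000 does not license the prediction under this rule.
print(predicts_win(510, 1000))  # False
```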

9. Other Even Less Promising Proposals

A perusal of introductory logic texts turns up a hodgepodge of other proposals for categorically distinguishing deductive and inductive arguments that, upon closer inspection, seem even less promising than the proposals surveyed thus far. One example will have to suffice.

Kreeft (2005) says that whereas deductive arguments begin with a “general” or “universal” premise and move to a less general conclusion, inductive arguments begin with “particular,” “specific,” or “individual” premises and move to a more general conclusion.

In light of this proposal, consider again the following argument:

All men are mortal.
Socrates is a man.
Therefore, Socrates is mortal.

As mentioned already, this argument is the classic example used in introductory logic texts to illustrate a deductive argument. It moves from a general (or universal) premise (exhibited by the phrase “all men”) to a specific (or particular) conclusion (exhibited by referring to “Socrates”). By contrast, consider the following argument:

Each spider so far examined has had eight legs.
Therefore, all spiders have eight legs.

This argument moves from specific instances (demarcated by the phrase “each spider so far examined”) to a general conclusion (as seen by the phrase “all spiders”). Therefore, on this proposal, this argument would be inductive.

So far, so good. However, this approach seems much too crude for drawing a categorical distinction between deductive and inductive arguments. Consider the following argument:

All As are Bs.
All Bs are Cs.
Therefore, all As are Cs.

On this account, this would be neither deductive nor inductive, since it involves only universal statements. Likewise, consider the following as well:

Each spider so far examined has had eight legs.
Therefore, likewise, the next spider examined will have eight legs.

According to Kreeft’s proposal, this would be neither a deductive nor an inductive argument, since it moves from a number of particulars to yet another particular. What kind of argument, then, is it? Despite the ancient pedigree of Kreeft’s proposal (since he ultimately draws upon both Platonic and Aristotelian texts), and the fact that one still finds it in some introductory logic texts, it faces such prima facie plausible exceptions that it is hard to see how it could be an acceptable, much less the best, view for categorically distinguishing between deductive and inductive arguments.

10. An Evaluative Approach

There have been many attempts to distinguish deductive from inductive arguments. Some approaches focus on the psychological states (such as the intentions, beliefs, or doubts) of those advancing an argument. Others focus on the objective behaviors of arguers, attending to what individuals claim about an argument or how they present it. Still others focus on features of arguments themselves, such as what an argument purports, its evidential completeness, its capacity for formalization, or the nature of the logical bond between its premises and conclusion. All of these proposals entail problems of one sort or another. The fact that there are so many radically different views about what distinguishes deductive from inductive arguments is itself noteworthy, too. This fact might not be evident from examining the account given in any specific text, but it emerges clearly when examining a range of different proposals and approaches, as has been done in this article. The diversity of views on this issue has so far garnered remarkably little attention. Some authors (such as Moore and Parker 2004) acknowledge that the best way of distinguishing deductive from inductive arguments is “controversial.” Yet, there seems to be remarkably little actual controversy about it. Instead, matters persist in a state of largely unacknowledged chaos.

Rather than leave matters in this state of confusion, one final approach must be considered. Instead of proposing yet another account of how deductive and inductive arguments differ, this proposal seeks to dispense altogether with the categorical approach of the proposals canvassed above.

Without necessarily acknowledging the difficulties explored above or citing them as a rationale for taking a fundamentally different approach, some authors nonetheless decline to define “deductive” and “inductive” (or more generally “non-deductive”) arguments at all, and instead adopt an evaluative approach that focuses on deductive and inductive standards for evaluating arguments (see Skyrms 1975; Bergmann, Moor, and Nelson 1998). When presented with any argument, one can ask: “Does the argument prove its conclusion, or does it only render it probable, or does it do neither?” One can then proceed to evaluate the argument by first asking whether the argument is valid, that is, whether the truth of the conclusion is entailed by the truth of the premises. If the answer to this initial question is affirmative, one can then proceed to determine whether the argument is sound by assessing the actual truth of the premises. If the argument is determined to be sound, then its conclusion is ceteris paribus worth believing. If the argument is determined to be invalid, one can then proceed to ask whether the truth of the premises would make the conclusion probable. If it would, one can judge the argument to be strong. If one then determines or judges that the argument’s premises are probably true, the argument can be declared cogent. Otherwise, it ought to be declared not-cogent (or the like). In this latter case, one ought not to believe the argument’s conclusion on the strength of its premises.
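The order of questions in this evaluative procedure can be sketched as a short function. The function name, parameter names, and the labels for the intermediate outcomes are assumptions introduced here for illustration; the four inputs stand for judgments that an evaluator must reach independently, and nothing in the sketch computes them.

```python
def evaluate(valid: bool, premises_true: bool,
             makes_conclusion_probable: bool, premises_probably_true: bool) -> str:
    """Apply the evaluative steps in order, with no deductive/inductive label."""
    if valid:
        # Validity settled; soundness now turns on the actual truth of the premises.
        return "sound" if premises_true else "valid but unsound"
    if makes_conclusion_probable:
        # Invalid but strong; cogency turns on whether the premises are probably true.
        return "cogent" if premises_probably_true else "strong but not cogent"
    return "neither valid nor strong"

print(evaluate(True, True, False, False))    # sound
print(evaluate(False, False, True, True))    # cogent
print(evaluate(False, False, False, True))   # neither valid nor strong
```

Nothing in the function asks whether the argument is deductive or inductive; the verdict depends only on the answers to the evaluative questions.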

What is noteworthy about this procedure is that at no time was it required to determine whether any argument is “deductive,” “inductive,” or more generally “non-deductive.” Such classificatory concepts played no role in executing the steps in the process of argument evaluation. Yet, the whole point of examining an argument in the first place is nevertheless achieved with this approach. That is, the effort to determine whether an argument provides satisfactory grounds for accepting its conclusion is carried out successfully. In order to discover what one can learn from an argument, the argument must be treated as charitably as possible. By first evaluating an argument in terms of validity and soundness, and, if necessary, then in terms of strength and cogency, one gives each argument its best shot at establishing its conclusion, either with a very high degree of certainty or at least with a degree of probability. One will then be in a better position to determine whether the argument’s conclusion should be believed on the basis of its premises.

This is of course not meant to minimize the difficulties associated with evaluating arguments. Evaluating arguments can be quite difficult. However, insisting that one first determine whether an argument is “deductive” or “inductive” before proceeding to evaluate it seems to insert a completely unnecessary step in the process of evaluation that does no useful work on its own. Moreover, a focus on argument evaluation rather than on argument classification promises to avoid the various problems associated with the categorical approaches discussed in this article. There is no need to speculate about the possibly unknowable intentions, beliefs, and/or doubts of someone advancing an argument. There is no need to guess at what an argument purports to show, or to ponder whether it can be formalized or represented by logical rules in order to determine whether one ought to believe the argument’s conclusion on the basis of its premises. In short, one does not need a categorical distinction between deductive and inductive arguments at all in order to successfully carry out argument evaluation.

This article is an attempt to practice what it preaches. Although there is much discussion in this article about deductive and inductive arguments, and a great deal of argumentation, there was no need to set out a categorical distinction between deductive and inductive arguments in order to critically evaluate a range of claims, positions, and arguments about the purported distinction between each type of argument. Hence, although such a distinction is central to the way in which argumentation is often presented, it is unclear what actual work it is doing for argument evaluation, and thus whether it must be retained. Perhaps it is time to give the deductive-inductive argument distinction its walking papers.

11. References and Further Reading

  • Aristotle. The Basic Works of Aristotle. New York: Random House, 1941.
  • Bacon, Francis. Francis Bacon: The Major Works. Oxford: Oxford University Press, 2002.
  • Barry, Vincent E. The Critical Edge: Critical Thinking for Reading and Writing. Orlando, FL: Holt, Rinehart and Winston, Inc., 1992.
  • Bergmann, Merrie, James Moor and Jack Nelson. The Logic Book. 3rd ed. New York: McGraw-Hill, 1998.
  • Black, Max. “Induction.” The Encyclopedia of Philosophy. Ed. Paul Edwards. Vol. 4. New York: Macmillan Publishing Co., Inc. & The Free Press, 1967. 169-181.
  • Bowell, Tracy and Gary Kemp. Critical Thinking: A Concise Guide. 4th ed. London: Routledge, 2015.
  • Churchill, Robert Paul. Becoming Logical: An Introduction to Logic. New York: St. Martin’s Press, 1986.
  • Copi, Irving. Introduction to Logic. 5th ed. New York: Macmillan, 1978.
  • Descartes, René. A Discourse on the Method. Oxford: Oxford University Press, 2006.
  • Einstein, Albert. “Induction and Deduction in Physics.” Einstein, Albert. The Collected Papers of Albert Einstein: The Berlin Years: Writings, 1918-1921. Trans. Alfred Engel. Vol. 7. Princeton: Princeton University Press, 2002. 108-109. <https://einsteinpapers.press.princeton.edu/vol7-trans/124>.
  • Engel, S. Morris. With Good Reason: An Introduction to Informal Fallacies. 5th ed. New York: St. Martin’s Press, 1994.
  • Govier, Trudy. Problems in Argument Analysis and Evaluation. Updated Edition. Windsor: Windsor Studies in Argumentation, 1987.
  • Haack, Susan. Philosophy of Logics. Cambridge: Cambridge University Press, 1978.
  • Harrell, Maralee. What is the Argument? An Introduction to Philosophical Argument and Analysis. Cambridge: The MIT Press, 2016.
  • Hausman, Alan, Frank Boardman and Howard Kahane. Logic and Philosophy: A Modern Introduction. 13th ed. Indianapolis: Hackett Publishing, 2021.
  • Hurley, Patrick J. and Lori Watson. A Concise Introduction to Logic. 13th ed. Belmont: Cengage Learning, 2018.
  • Kreeft, Peter. Socratic Logic: A Logic Text Using Socratic Method, Platonic Questions, and Aristotelian Principles. 2nd ed. South Bend: St. Augustine’s Press, 2005.
  • McInerny, D. Q. An Introduction to Foundational Logic. Elmhurst Township: The Priestly Fraternity of St. Peter, 2012.
  • McIntyre, Lee. The Scientific Attitude: Defending Science from Denial, Fraud, and Pseudoscience. Cambridge: The MIT Press, 2019.
  • Moore, Brooke Noel and Richard Parker. Critical Thinking. 7th ed. New York: McGraw-Hill, 2004.
  • Neidorf, Robert. Deductive Forms: An Elementary Logic. New York: Harper and Row, 1967.
  • Olson, Robert G. Meaning and Argument. New York: Harcourt, Brace, and World, 1975.
  • Perry, John and Michael Bratman. Introduction to Philosophy: Classical and Contemporary Readings. 3rd ed. New York: Oxford University Press, 1999.
  • Rescher, Nicholas. Plausible Reasoning. Assen: Van Gorcum, 1976.
  • Salmon, Wesley. Logic. Englewood Cliffs: Prentice Hall, 1963.
  • Salmon, Wesley. Logic. 3rd ed. Englewood Cliffs: Prentice Hall, 1984.
  • Skyrms, Brian. Choice and Chance. 2nd ed. Encino: Dickenson, 1975.
  • Solomon, Robert C. Introducing Philosophy: A Text with Integrated Readings. 5th ed. Fort Worth: Harcourt Brace Jovanovich, 1993.
  • Teays, Wanda. Second Thoughts: Critical Thinking from a Multicultural Perspective. Mountain View: Mayfield Publishing Company, 1996.
  • Vaughn, Lewis. The Power of Critical Thinking: Effective Reasoning about Ordinary and Extraordinary Claims. 3rd ed. New York: Oxford University Press, 2010.
  • White, James E. Introduction to Philosophy. St. Paul: West Publishing Company, 1989.

Author Information:
Timothy Shanahan
Email: timothy.shanahan@lmu.edu
Loyola Marymount University
U. S. A.

Frequently Asked Questions about Time

This supplement provides background information about many of the topics discussed in both the main Time article and its other companion article What Else Science Requires of Time (That Philosophers Should Know). It is not intended that this article be read in order by section number.

Table of Contents

  1. What Are Durations, Instants, Moments, and Points of Time?
  2. What Is an Event?
  3. What Is a Reference Frame?
  4. Curved Space and Cartesian Coordinates
  5. What Is an Inertial Frame?
  6. What Is Spacetime?
  7. What Is a Spacetime Diagram and a Light Cone?
  8. What Are Time’s Metric and Spacetime’s Interval?
  9. How Does Proper Time Differ from Standard Time and Coordinate Time?
  10. Is Time the Fourth Dimension?
  11. How Is Time Relative to the Observer?
  12. What Is the Relativity of Simultaneity?
  13. What Is the Conventionality of Simultaneity?
  14. What are the Absolute Past and the Absolute Elsewhere?
  15. What Is Time Dilation?
  16. How Does Gravity Affect Time?
  17. What Happens to Time near a Black Hole?
  18. What Is the Solution to the Twins Paradox?
  19. What Is the Solution to Zeno’s Paradoxes?
  20. How Are Coordinates Assigned to Time?
  21. How Do Dates Get Assigned to Actual Events?
  22. What Is Essential to Being a Clock?
  23. What Does It Mean for a Clock to Be Accurate?
  24. What Is Our Standard Clock or Master Clock?
    1. How Does an Atomic Clock Work?
    2. How Do We Find and Report the Standard Time?
  25. Why Are Some Standard Clocks Better than Others?
  26. What Is a Field?

1. What Are Durations, Instants, Moments, and Points of Time?

A duration is a measure of elapsed time. It is a number with a unit such as seconds or hours. “4” is not a duration, but “4 seconds” is. The second is the agreed-upon standard unit for the measurement of duration in the S.I. system (the International System of Units, that is, Le Système International d’Unités). How to carefully define the term second is discussed later in this supplement.

In informal conversation, an instant or moment is a very short duration. In physics, however, an instant is even shorter. It is instantaneous; it has zero duration. This is perhaps what the poet T.S. Eliot was thinking of when he said, “History is a pattern of timeless moments.”

There is another sense of the words instant and moment which means, not a very short duration, but rather a time, as when we say it happened at that instant or at that moment. In this sense, a moment is normally considered to be a special three-dimensional object, namely a ‘snapshot’ of our universe at a time. Midnight could be such a moment. This is the sense of the word moment meant by a determinist who says the state of the universe at one point of time determines the state of the universe at any later point or moment. This is a Leibnizian notion of what a state is.

It is assumed in all currently accepted fundamental theories of physics that any interval of time is a linear continuum of the points of time that compose it, but it is an interesting philosophical question to ask how physicists know time is a continuum. Nobody could ever measure time that finely, even indirectly.  Points of time cannot be detected. That is, there is no physically possible way to measure that the time is exactly noon even if it is true that the time is noon. Noon is 12 to an infinite number of decimal places, and no measuring apparatus is infinitely precise, and no measurement fails to have a margin of error. But given what we know about points, we should not be trying to detect points of anything. Belief in the existence of points of time is justified holistically by appealing to how they contribute to scientific success, that is, to how the points give our science extra power to explain, describe, predict, and enrich our understanding. In order to justify belief in the existence of points, we need confidence that our science would lose too many of these virtues without the points. Without points, we could not use calculus to describe change in nature.

Consider what a point in time really is. Any interval of time is a real-world model of a segment of the real numbers in their normal order. So, each instant corresponds to just one real number and vice versa. To say this again in other words, time is a line-like structure on sets of point events. Just as the real numbers are an actually infinite set of decimal numbers that can be linearly ordered by the less-than-or-equal relation, so time is an actually infinite set of instants or instantaneous moments that can be linearly ordered by the happens-before-or-at-the-same-time-as relation in a single reference frame. An instant or moment can be thought of as a set of point-events that are simultaneous in a single reference frame.

Although McTaggart disagrees, physicists generally hold that a moment cannot change, because change is something that is detectable only by comparing different moments.

There is a deep philosophical dispute about whether points of time actually exist, just as there is a similar dispute about whether spatial points actually exist. The dispute began when Plato said, “[T]his queer thing, the instant, …occupies no time at all….” (Plato 1961, p. 156d). Some philosophers wish to disallow point-events and point-times. They want to make do with intervals, and want an instant always to have a positive duration. The philosopher Michael Dummett, in (Dummett 2000), said time is not made of point-times but rather is a composition of overlapping intervals, that is, non-zero durations. Dummett required the endpoints of those intervals to be the initiation and termination of actual physical processes. This idea of treating time without instants developed a 1936 proposal of Bertrand Russell and Alfred North Whitehead. The central philosophical issue about Dummett’s treatment of motion is whether its adoption would negatively affect other areas of mathematics and science. It is likely that it would. For the history of the dispute between advocates of point-times and advocates of intervals, see (Øhrstrøm and Hasle 1995). The term interval in the phrase spacetime interval is a different kind of interval.

Even if time is made of points, it does not follow that matter is. It sometimes can be a useful approximation to say an electron or a quark is a point particle, but it remains an approximation. They are really vibrations of quantized fields.

2. What Is an Event?

In the manifest image, the universe is more fundamentally made of objects than events. In the scientific image, the universe is more fundamentally made of events than objects.

But the term event has multiple senses. There is sense 1 and sense 2. In ordinary discourse, one uses sense 1 in which an event is a happening lasting some duration during which some object changes its properties. For example, this morning’s event of buttering the toast is the toast’s changing from having the property of being unbuttered this morning to having the property of being buttered later this morning.

The philosopher Jaegwon Kim, among others, claimed that an event should be defined as an object’s having a property at a time. So, two events are the same if they are both events of the same object having the same property at the same time. This suggestion captures sense 1 of our informal concept of event, but with Kim’s suggestion it is difficult to make sense of the remark, “The vacation could have started an hour earlier.” On Kim’s analysis, the vacation event could not have started earlier because, if it did, it would be a different event. A possible-worlds analysis of events might be the way to solve this problem of change.

Physicists do sometimes use the term event this way, but they also use it differently—in what we here call sense 2—when they say events are point-events or regions of point-events often with no reference to any other properties of those events, such as their having the property of being buttered toast at that time. The simplest point-event in sense 2 is a location in spacetime with zero volume and zero duration. Hopefully, when the term event occurs, the context is there to help disambiguate sense 1 from sense 2. For instance, when an eternalist says our universe is a block of events, the person normally means the universe is the set of all point-events with their actual properties.

To a non-quantum physicist, any physical object is just a series of its point-events plus the values of all their intrinsic properties. For example, the process of a ball’s falling down is a continuous, infinite series of point-events along the path in spacetime of the ball.  One of those events would be this particular point piece of the ball being at a specific spatial location at some specific time. The reason for the qualification about “non-quantum” is discussed at the end of this section.

The physicists’ notion of point-event in real, physical space (rather than in mathematical space) is metaphysically unacceptable to some philosophers, in part because it deviates so much from the way the word event is used in ordinary language and in our manifest image. That is, sense 2 deviates too much from sense 1. For other philosophers, it is unacceptable because of its size, its infinitesimal size. In 1936, in order to avoid point-events altogether in physical space, Bertrand Russell and A. N. Whitehead developed a theory of time that is based on the assumption that every event in spacetime has a finite, non-zero duration. They believed this definition of an event is closer to our common sense beliefs, which it is. Unfortunately, they had to assume that any finite part of an event is also an event, and this assumption indirectly appeals to the concept of the infinitesimal and so is no closer to common sense than the physicist’s assumption that all events are composed of point-events.

McTaggart argued early in the twentieth century that events change. For example, he said the event of Queen Anne’s death is changing because it is receding ever farther into the past as time goes on. Many other philosophers (those of the so-called B-camp) believe it is improper to consider an event to be something that can change, and that the error is in not using the word change properly. This is still an open question in philosophy, but physicists use the term event as the B-theorists do, namely as something that does not change.

In non-quantum physics, specifying the state of a physical system at a time involves specifying the masses, positions and velocities of each of the system’s particles at that time. Not so in quantum mechanics. The simultaneous precise position and velocity of a particle—the key ingredients of a classical particle event—do not exist according to quantum physics. The more precise the position is, the less precise is the velocity, and vice versa. Also, many physicists consider the notion of event in physics to be emergent at a higher scale from a more fundamental lower scale that has no events. The philosopher David Wallace, among others, has emphasized this idea.
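The trade-off between the precision of position and the precision of velocity mentioned above is usually stated as Heisenberg's uncertainty relation. As a sketch in standard notation, none of which is used elsewhere in this supplement: σ_x, σ_p, and σ_v are the standard deviations of a particle's position, momentum, and velocity, m is its mass, ħ is the reduced Planck constant, and the second inequality assumes the non-relativistic relation p = mv.

```latex
\sigma_x \, \sigma_p \;\ge\; \frac{\hbar}{2},
\qquad p = m v
\quad\Longrightarrow\quad
\sigma_x \, \sigma_v \;\ge\; \frac{\hbar}{2m}
```

Because the product has a fixed lower bound, making σ_x smaller forces σ_v to be larger, and vice versa.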

The ontology of quantum physics is very different from that of non-quantum physics. The main Time article intentionally downplays this. But, says the physicist Sean Carroll, “at the deepest level, events are not a useful concept,” and one should focus on the wave function.

More than half the physicists in the first quarter of the 21st century believed that a theory of quantum gravity will require (1) quantizing time, (2) having time or spacetime be emergent from a more fundamental entity, (3) having only a finite maximum number of events that can occur in a finite volume. Current relativity theory and quantum theory have none of these three features.

For more discussion of what an event is, see the article on Events.

3. What Is a Reference Frame?

A reference frame is a standard viewpoint or perspective chosen by someone to display quantitative measurements about places of interest in a space plus the phenomena that take place there. It is not an objective feature of nature. To be suited for its quantitative purpose, a reference frame needs to include a coordinate system, that is, a system of assigning numerical locations or ordered sets of numerical locations to points of the space. If the space is physical spacetime, then each point needs to be assigned at least four numbers, three for its location in space, and one for its location in time. These numbers are called “coordinates.” For every coordinate system, every point-event in spacetime has three spatial coordinate numbers and one time coordinate number. It is a convention that we usually choose the time axis to be straight rather than some other shape, but this is not required, and on the globe we use longitudes as coordinate lines, and these are not straight and not parallel.

Choosing a coordinate system requires selecting some point to be called the system’s “origin” and selecting the appropriate number of coordinate axes that orient the frame in the space. You need at least as many axes as there are dimensions to the space. To add a coordinate system to a reference frame for a space is to add an arrangement of reference lines to the space so that all points of space have unique names. It is often assumed that an observer is located at the origin, but this is not required; it is sufficient to treat the frame “as if” it had an observer. The notion of a reference frame is modern; Newton did not know about reference frames.

The name of a point in a two-dimensional space is an ordered set of two numbers (the coordinates). If a Cartesian coordinate system is assigned to the space, then a point’s coordinate is its signed distance projected along each axis from the origin point, and the axes are straight and mutually perpendicular. The origin is customarily named “(0,0).” For a four-dimensional space, a point is named with an ordered set of four numbers. A coordinate system for n-dimensional space is a mapping from each point to an ordered set of its n coordinate numbers. The most useful numbers to assign as coordinates are real numbers because real numbers enable us to use the techniques of calculus and because their use makes it easy to satisfy the helpful convention that nearby points have nearby coordinates.

Physicists usually suggest that time is like a line. This means time is composed of durationless instants and the set of instants has a linearly ordered structure under the happens-before-or-at-the-same-time-as relation, so time is like what mathematicians call “the continuum” and what non-mathematicians call “a line.”

When we speak of the distance between two points in a space, we implicitly mean the distance along the shortest path between them because there might be an infinite number of paths one could take. If a space has a coordinate system, then it has an infinite number of available coordinate systems because there is an unlimited number of choices for an origin, or an orientation of the axes, or the scale.

There are many choices for kinds of reference frames, although the Cartesian coordinate system is the most popular. Its coordinate axes are straight lines and are mutually perpendicular. Assuming Euclidean geometry (and so no curvature of space), the equation of the circle of radius one centered on the origin of a Cartesian coordinate system is x² + y² = 1. This same circle has a very different equation if a polar coordinate system is used instead.
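To make the contrast concrete: in polar coordinates (r, θ), the circle x² + y² = 1 is described simply by r = 1, with θ left unconstrained. The following Python sketch checks that the two descriptions pick out the same points. The function names and the sample angles are illustrative choices made here, not anything taken from the article.

```python
import math

def to_polar(x, y):
    """Convert Cartesian (x, y) to polar (r, theta), theta in radians."""
    return math.hypot(x, y), math.atan2(y, x)

def on_circle_cartesian(x, y, tol=1e-12):
    """Cartesian form of the circle: x^2 + y^2 = 1."""
    return abs(x * x + y * y - 1.0) < tol

def on_circle_polar(r, theta, tol=1e-12):
    """Polar form of the same circle: r = 1, for any angle theta."""
    return abs(r - 1.0) < tol

# Sample points on the circle, generated by angle (arbitrary choices).
points = [(math.cos(a), math.sin(a)) for a in (0.0, 0.7, 2.0, 4.5)]
both_agree = all(
    on_circle_cartesian(x, y) and on_circle_polar(*to_polar(x, y))
    for x, y in points
)
```

The circle itself is the same geometric object in both cases; only its description changes with the coordinate system.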

Reference frames can be created for physical space, or for time, or for spacetime, or for things having nothing to do with real space and time. One might create a two-dimensional (2-D) Cartesian coordinate system, with one coordinate axis for displaying the salaries of a company’s salespersons and a second coordinate axis for displaying their names. Even if the space represented by the coordinate system were to be real physical space, its coordinates would not be physically real. You cannot add two points. From this fact it can be concluded that not all the mathematical structures in the coordinate system are also reflected in what the system represents. These extraneous mathematical structures are called “mathematical artifacts.”

Below is a picture of a reference frame spanning a space that contains a solid ball. The coordinates are not confined to the surface of the ball but also cover the surrounding space. What we have here is a 3-dimensional Euclidean space that uses a Cartesian coordinate system with three mutually perpendicular axes. The space contains a 3-dimensional (3-D) solid ball:

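[Figure: a three-dimensional Cartesian reference frame with its origin at the center of a solid ball. Image source: “Reference Frames,” kRPC 0.4.8 documentation.]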

The origin of the coordinate system is at the center of the ball, and the y-axis intersects the north pole and the south pole. Two of the three coordinate axes intersect the blue equator at specified places. The red line represents a typical longitude, but this longitude is not a coordinate axis. The three coordinates of any point in this space form an ordered set (x,y,z) of the x, y, and z coordinates of the point, with commas separating the coordinates. Thinking of the ball as the globe, there are points on the Earth, inside the Earth, and outside the Earth. For 3-D space, the individual coordinates normally would be real numbers. For example, we might say a point of interest deep inside the ball (the Earth) has the three coordinates (4.1,π,0), where it is assumed all three numbers have the same units, such as meters. It is customary in a three-dimensional space to label the three axes with the letters x, y, and z, and for (4.1,π,0) to mean that 4.1 meters is the x-coordinate of the point, π meters is the y-coordinate of the same point, and 0 meters is the z-coordinate of the point. The center of the Earth in this graph is located at the origin of the coordinate system; the origin of a frame has the coordinates (0,0,0). Mathematical physicists frequently suppress talk of the units and speak of π being the y-coordinate, although strictly speaking the y-coordinate is π meters. The x-axis is all the points (x,0,0); the y-axis is all the points (0,y,0); the z-axis is all the points (0,0,z), for all possible values of x, y, and z.

In a coordinate system, the axes need not be mutually perpendicular, but in order to be a Cartesian coordinate system, the axes must be mutually perpendicular, and the coordinates of a point in the space must be the values along the axes of the perpendicular projections of the point onto those axes. All Euclidean spaces can have Cartesian coordinate systems. If the space were the surface of the sphere above, not including its insides or outside, then this two-dimensional space would be a sphere, and it could not have a two-dimensional Cartesian coordinate system because all the axes could not lie within the space. The 2D surface could have a 3D Cartesian coordinate system, though. This coordinate system was used in our diagram above. A more useful coordinate system might be a 3D spherical coordinate system. Space and time in the theory of special relativity are traditionally represented by a frame with four independent, real coordinates (t,x,y,z).

Changing from one reference frame to another does not change any phenomenon in the real world being described with the reference frame, but is merely changing the perspective on the phenomena. If an object has certain coordinates in one reference frame, it usually has different coordinates in a different reference frame, and this is why coordinates are not physically real: they are not frame-free. Durations are not frame-free. Neither are positions, directions, and speeds. An object’s speed is different in different reference frames, with one exception. The upper limit on the speed of any object in space satisfying the principles of special relativity is c, the speed of light in a vacuum. This claim is not relative to a reference frame. This speed c is the upper limit on the speed of transmission from any cause to its effect. This c is the c in the equation E = mc². It is the speed of any particle with zero rest mass such as a photon. The notion of speed of travel through spacetime rather than through space is usually considered by physicists not to be sensible. Whether the notion of speed through time also is not sensible is a controversial topic in the philosophy of physics. See the main Time article’s section “The Passage or Flow of Time” for a discussion of whether it is sensible.
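The claim that speeds are frame-relative, with the speed of light as the one exception, can be illustrated with the standard special-relativistic rule for transforming a velocity between two frames in relative motion along a single line. The Python sketch below is illustrative only: the function name and the sample speeds are assumptions chosen here, and the formula covers only the simple case of collinear motion.

```python
C = 299_792_458.0  # speed of light in a vacuum, in meters per second

def transform_velocity(u, v):
    """Velocity of an object as described in a frame moving at speed v
    along the same line, given its velocity u in the original frame.

    Standard special-relativistic formula for collinear motion:
    u' = (u - v) / (1 - u*v/c^2).
    """
    return (u - v) / (1.0 - u * v / C**2)

frame_speed = 0.6 * C                             # hypothetical second frame
ball = transform_velocity(0.5 * C, frame_speed)   # slower-than-light object
light = transform_velocity(C, frame_speed)        # light itself
```

In this sketch the slower object's velocity changes from 0.5c to about -0.14c when described in the second frame, while the light signal's speed comes out as c in both.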

The word reference is often dropped from the phrase reference frame, and the terms frame and coordinate system are often used interchangeably. A frame for the physical space in which a particular object always has zero velocity is called the object’s rest frame or proper frame.

A reference frame is a possible viewpoint. When choosing to place a frame upon a space, there are an infinite number of legitimate choices. Choosing a frame carefully can make a situation much easier to describe. For example, suppose we are interested in events that occur along a highway. We might orient the z-axis by saying it points up away from the center of Earth, while the x-axis points along the highway, and the y-axis is perpendicular to the other two axes and points across the highway. If events are to be described, then a fourth axis for time would be needed, but its units would be temporal units and not spatial units. It usually is most helpful to make the time axis be perpendicular to the three spatial axes, and to require successive seconds along the axis to be the same duration as seconds of the standard clock. By applying a coordinate system to spacetime, a point of spacetime is specified uniquely by its four independent coordinate numbers, three spatial coordinates and one time coordinate. The word independent implies that knowing one coordinate of a point gives no information about the point’s other coordinates.

Coordinate systems of reference frames have to obey rules to be useful in science. No accepted theory of physics allows a time axis to be shaped like a figure eight. Frames need to honor the laws if they are to be perspectives on real events. For all reference frames allowed by relativity theory, if a particle collides with another particle, they must collide in all allowed reference frames. Relativity theory does not allow reference frames in which a particle of light is at rest. Quantum mechanics does. A frame with a time axis in which your shooting a gun is simultaneous with your bullet hitting a distant target is not allowed by relativity theory. Informally, we say it violates the fact that causes occur before their effects in all legitimate reference frames for relativity theory. Formally, we say it violates the light cone structure required by relativity theory.

How is the time axis oriented in the world? This is done by choosing t = 0 to be the time when a specific event occurs such as the Big Bang, or the birth of Jesus. A second along the t-axis usually is required to be congruent to a second of our civilization’s standard clock, especially for clocks not moving with respect to that clock.

A space with a topology defined on it and having any number of dimensions is called a manifold. Newtonian mechanics, special relativity, general relativity, and quantum theory all require the set of all events (in the sense of possible space-time locations) to form a four-dimensional manifold. Informally, what it means to be four-dimensional is that each point cannot be specified with fewer than four independent numbers. Formally, the definition of dimension is somewhat complicated.

Treating time as a special dimension of spacetime is called spatializing time, and doing this is what makes time precisely describable mathematically in a way that treating time only as becoming does not. It is a major reason why mathematical physics can be mathematical.

One needs to be careful not to confuse the features of time with the features of the mathematics used to describe time. Einstein admitted [see (Einstein 1982) p. 67] that even he often made this mistake of failing to distinguish the representation from the object represented, and it added years to the time it took him to create his general theory of relativity.

Times are not numbers, but time coordinates are. When a time-translation occurs with a magnitude of Δt, this implies the instant I at coordinate t is now associated with another instant I’ at coordinate t’ and this equality holds: t’ = t + Δt. If the laws of physics are time-translation symmetric, which is the normal assumption, then the laws of mathematical physics are invariant relative to the group of transformations of time coordinate t expressed by t ⇒ t + Δt where Δt is an arbitrarily chosen constant real number.
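A small numerical sketch may help show what invariance under t ⇒ t + Δt amounts to. The Python example below uses the textbook law for distance fallen from rest, chosen here purely as an illustration; the function names, the value g = 9.8 meters per second squared, and the sample times are assumptions that do not come from this supplement.

```python
def translate(t, delta_t):
    """Time translation: the new coordinate is t' = t + delta_t."""
    return t + delta_t

def fall_distance(t_start, t_end, g=9.8):
    """Distance fallen from rest between two time coordinates (seconds).

    The law depends only on the elapsed time t_end - t_start, which is
    why it is unaffected by a time translation.
    """
    elapsed = t_end - t_start
    return 0.5 * g * elapsed**2

# The same two-second fall, described before and after shifting
# every time coordinate by 100 seconds.
before = fall_distance(0.0, 2.0)
after = fall_distance(translate(0.0, 100.0), translate(2.0, 100.0))
```

The time coordinates of the start and the end of the fall both change, but the predicted distance does not, because the law refers only to the difference between the two coordinates.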

Some features of reality are relative to a reference frame and some are not. Duration is relative. Distance is relative. Spacetime interval is not. The speed of light in a vacuum is not.
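The contrast between frame-relative and frame-independent quantities can be checked numerically. The Python sketch below applies a standard Lorentz boost along one spatial axis to two events and compares their separation in time, their separation in space, and their spacetime interval in the two frames. The function names, the sample events, the relative speed of 0.8c, and the sign convention for the interval are all assumptions made here for illustration.

```python
import math

C = 299_792_458.0  # speed of light in a vacuum, m/s

def boost(t, x, v):
    """Lorentz boost along x: coordinates of the event (t, x) in a frame
    moving at constant velocity v relative to the original frame."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C**2), gamma * (x - v * t)

def interval_squared(dt, dx):
    """Squared spacetime interval, using the (c*dt)^2 - dx^2 sign convention."""
    return (C * dt) ** 2 - dx**2

# Two hypothetical events, described in the original frame (seconds, meters).
t1, x1 = 0.0, 0.0
t2, x2 = 2.0, 1.0e8

v = 0.8 * C
t1b, x1b = boost(t1, x1, v)
t2b, x2b = boost(t2, x2, v)

dt, dx = t2 - t1, x2 - x1          # separations in the original frame
dtb, dxb = t2b - t1b, x2b - x1b    # separations in the boosted frame
```

In this sketch the duration and the distance between the two events differ between the frames, while interval_squared(dt, dx) and interval_squared(dtb, dxb) agree up to rounding error.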

4. Curved Space and Cartesian Coordinates

According to general relativity theory, space curves near all masses. Here are the three main types of geometries for representing curvature:

  • Euclidean geometry.
  • Hyperbolic geometry.
  • Elliptical geometry.

The following diagram shows how the three, when viewed from a higher dimension, differ in curvature and in the character of their parallel lines, circles and triangles:

Source: Wikipedia

The geometry of a space exists independently of whatever coordinate system is used to describe it, so one has to take care to distinguish what is a real feature of the geometry from what is merely an artifact of the mathematics used to characterize the geometry.

A Cartesian coordinate system can handle all sorts of curved paths and curved objects, but it fails whenever the space itself curves.  What we just called “the space” could be real physical space or an abstract mathematical space or spacetime or just time.

Any Euclidean space can have a Cartesian coordinate system. A reference frame fixed to the surface of the Earth cannot have a Cartesian coordinate system covering all the surface because the surface curves and the space is therefore not Euclidean. Spaces with a curved geometry require curvilinear coordinate systems in which the axes curve as seen from a higher-dimensional Euclidean space in which the lower-dimensional space is embedded. This higher dimension can be real or unreal.

If the physical world were two-dimensional and curved like the surface of a sphere, then a two-dimensional Cartesian coordinate system for that space must fail to give coordinates to most places in the world. To give all the points of the 2D world their own Cartesian coordinates, one would need a 3D Cartesian system, and each point in the world would be assigned three coordinates, not merely two. For the same reason, if we want an arbitrary point in our real, curving 4D-spacetime to have only four coordinates and not five, then the coordinate system must be curvilinear and not Cartesian.  But what if we are stubborn and say we want to stick with the Cartesian coordinate system and we don’t care that we have to bring in an extra dimension and give our points of spacetime five coordinates instead of four? In that case we cannot trust the coordinate system’s standard metric to give correct answers.

Let’s see why this is so. Although the coordinate system can be chosen arbitrarily for any space or spacetime, different choices usually require different metrics. Suppose the universe is two-dimensional and shaped like the surface of a sphere when seen from a higher dimension.  The 2D sphere has no inside or outside; the extra dimension is merely for our visualization purposes. Then when we use the 3D system’s metric, based on the 3D version of the Pythagorean Theorem, to measure the spatial distance between two points in the space, say, the North Pole and the equator, the value produced is too low. The correct value is higher because it is along a longitude and must stay confined to the surface. The 3D Cartesian metric says the shortest line between the North Pole and a point on the equator cuts through the Earth and so escapes the universe, which indicates the Cartesian metric cannot be correct. The correct metric would compute distance within the space along a geodesic line (a great circle in this case such as a longitude) that is confined to the sphere’s surface.
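The discrepancy just described can be checked numerically. The following is a minimal sketch, assuming a unit sphere with the North Pole at (0, 0, 1) and the chosen equator point at (1, 0, 0); the function names are hypothetical and chosen only for illustration.

```python
import math

def chord_distance(p, q):
    """3D Pythagorean distance between p and q; this straight line cuts through the sphere."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def great_circle_distance(p, q, radius=1.0):
    """Distance along the sphere's surface between points p and q given as 3D vectors."""
    cos_angle = sum(a * b for a, b in zip(p, q)) / radius**2
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return radius * math.acos(cos_angle)

# Assumed illustrative points on a unit sphere.
north_pole = (0.0, 0.0, 1.0)
equator_point = (1.0, 0.0, 0.0)

chord = chord_distance(north_pole, equator_point)       # sqrt(2), about 1.414
arc = great_circle_distance(north_pole, equator_point)  # pi/2, about 1.571

print(f"3D Cartesian metric:  {chord:.3f}")
print(f"distance on surface:  {arc:.3f}")
```

The value from the 3D Cartesian metric is the smaller of the two because its path leaves the surface; the geodesic value is the distance available to an inhabitant of the 2D world.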

The orbit of the Earth around the Sun is curved in 3D space, but “straight” in 4D spacetime. The scare quotes are present because the orbit is straight only in the sense that a geodesic is straight. A geodesic path between two points of spacetime is a path of extremal spacetime interval between the points; for the path of a massive object such as the Earth, it is a path of locally greatest elapsed proper time.

One could cover a curved 4D-spacetime with a special Cartesian-like coordinate system by breaking up the spacetime into infinitesimal regions, giving each region its own Cartesian coordinate system, and then stitching the coordinate systems all together where they meet their neighbors. The stitching produces what is customarily called an atlas. Each point would have its own four unique coordinates, but when the flat Cartesian metric is used to compute intervals, lengths, and durations from the coordinate numbers of the atlas, the values will be incorrect.

Instead of considering a universe that is the surface of a sphere, consider a universe that is the surface of a cylinder. This 2D universe is curved when visualized from a 3D Euclidean space in which the cylinder is embedded. Surprisingly, it is not intrinsically curved at all. The measures of the three angles of any triangle sum to 180 degrees. Circumferences of its circles always equal pi times their diameters. We say that, unlike the sphere, the surface of a cylinder is extrinsically curved but intrinsically flat.

For a more sophisticated treatment of reference frames and coordinates, see Coordinate Systems. For an introduction to the notion of curvature of space, see chapter 42 in The Feynman Lectures on Physics by Richard Feynman.

5. What Is an Inertial Frame?

Galileo first had the idea that motion is relative. If you are inside a boat with no windows and are floating on a calm sea, you cannot tell whether the  boat is moving. Even if it is moving, you won’t detect this inside a closed cabin of the boat, say, by seeing a dropped ball curve as it falls or by feeling a push on yourself or seeing all the flies near you being pushed to the back of the room.  Galileo believed steady motion is motion relative to other objects, and there is no such thing as simply motion relative to nothing, or motion relative to fixed, absolute space. Newton disagreed with this. Einstein agreed.

Newton  believed in absolute motion. This is motion of an object that is not dependent upon its relations with any other object. Newton would say an inertial frame is a reference frame moving at constant velocity relative to absolute space.

An inertial observer is someone who feels weightless, as if they are floating. They feel no acceleration and no gravitational field, yet all the laws of physics apply to this observer as they do to anything else.

Einstein described an inertial frame as a reference frame in which Newton’s first law of motion holds. Newton’s first law says an isolated object, that is, an object affected by no total extrinsic force, has a constant velocity over time. It does not accelerate. In any inertial frame, any two separate objects that are moving in parallel and coasting along with no outside forces on them, will remain moving in parallel forever. Einstein described his special theory of relativity in 1905 by saying it requires the laws of physics to have the same form in any inertial frame of reference.

According to the general theory of relativity, there are no global inertial reference frames at all because Newton’s first law is not strictly true globally. It holds to an acceptable degree of approximation in some restricted regions that are sufficiently far away from masses.

Newton’s first law can be thought of as providing a definition of the concept of zero total external force; an object has zero total external force if it is moving with constant velocity. In the real world, no objects behave this way; they cannot be isolated from the force of gravity. Gravity cannot be turned off, and so Newton’s first law fails, and there are no inertial frames. But the first law does hold approximately. That is, it holds well enough for various purposes in many situations. It holds in any infinitesimal region. In larger regions, if spacetime curvature can be ignored for a certain phenomenon of interest, then one can find an inertial frame for the phenomenon. A Cartesian coordinate system fixed to Earth usually will serve adequately as an inertial frame for describing cars on a race track or describing the flight of a tennis ball, but not for describing a rocket’s flight from Paris to Mars. A coordinate frame for space that is fixed on the distant stars and is used by physicists only to describe phenomena far from any of those stars, and far from planets, and far from other massive objects, is very nearly an inertial frame in that region. Given that some frame is inertial, any frame that rotates or otherwise accelerates relative to this first frame is non-inertial.

Newton’s theory requires a flat, Euclidean geometry for space and for spacetime. Special relativity requires a flat Euclidean geometry for space but a flat, non-Euclidean geometry for spacetime. General relativity allows all these but also allows curvature for spacetime as well as space. If we demand that our reference frame’s coordinate system span all of spacetime, then a flat frame does not exist for the real world, just as a plane cannot cover the surface of a sphere. The existence of gravity requires there to be curvature of space around any object that has mass, thereby making a flat frame fail to span some of the space near the object.

Perhaps most importantly, it has been generally accepted since the 1920s that Euclid and Kant were mistaken about the geometry of the universe because they failed to distinguish mathematical geometry (which is a priori) from physical geometry (which is empirical). In philosophy, this point was made most strenuously by Hans Reichenbach.

For a deeper philosophical introduction to inertial frames, see chapter 2 of (Maudlin 2012).

6. What Is Spacetime?

Spacetime is a certain combination of space and time. It is the set of locations of events, or it can be considered to be a field where all events are located.

There are actual spacetimes and imaginary spacetimes. Our real four-dimensional spacetime has a single time dimension and at least three space dimensions. It is still an open question whether there are more than three spatial dimensions. But there definitely are imaginary spacetimes with twenty-seven dimensions or three hundred. There could be a three-dimensional  spacetime composed of two spatial dimensions and a time dimension in which points in space indicate the latitude and longitude in Canada for the sale of a company’s widget, and points along the time dimension indicate the date of the sale of the widget. In any spacetime, real or imaginary, the coordinates are the names of locations in space and time. Coordinates are mathematical artifacts.

In 1908, Einstein’s mathematics teacher Hermann Minkowski was the first person to say that real spacetime is fundamental and that space and time are just aspects of spacetime. And he was the first to say different reference frames will divide spacetime differently but correctly into their time part and space part. Einstein was soon convinced by Minkowski’s reasoning.

Later, Einstein discovered that real spacetime is dynamic and not static as in special relativity theory. It is dynamic because its structure, such as its geometry, changes over time. Einstein said it changes as the distribution of matter-energy changes. In special relativity and in Newton’s theory, spacetime is not dynamic; it stays the same regardless of what matter and energy are doing. In any spacetime obeying either the special or the general theory of relativity, the key idea about time is that there is a light-cone structure such that every point in spacetime has both a forward light-cone of future events and a backward light-cone of past events. What this means is explained momentarily.

In his general theory of relativity, Einstein said gravity is a feature of spacetime, namely its curvature. Spacetime curves near gravitational fields, and it curves more the stronger the field strength. The overall, cosmic curvature of space was far from zero at the Big Bang, but it is now about zero, although many cosmologists believe it is evolving toward a positive value. These days the largest curvature of spacetime is in black holes.

In general relativity, spacetime is assumed to be a fundamental feature of reality. It is very interesting to investigate whether this assumption is true. There have been serious attempts to construct theories of physics in which spacetime is not fundamental but instead emerges from something more fundamental such as quantum fields, but none of these attempts have stood up to any empirical observations or experiments that could show the new theories to be superior to the presently accepted theories.

The metaphysical question of whether spacetime is a substantial object or merely a relationship among events, or neither, is considered in the discussion of the relational theory of time in the main Time article. For some other philosophical questions about what spacetime is, see What is a Field?

According  to the physicist George Musser, “Gravity is not a force that propagates through space but a feature of spacetime itself. When you throw a ball high into the air, it arcs back to the ground because Earth distorts the spacetime around it, so that the paths of the ball and the ground intersect again.”

How do we detect that a space is curved if we cannot look down upon it from a higher dimension and see the curvature and must instead make the decision from within the space? The answer is that we can detect deviations from Euclidean geometry, such as (1) initially parallel lines becoming less parallel as they are extended, or (2) failure of the theorem that says the interior angles of a triangle sum to 180 degrees, or (3) failure of the circumference of a circle to equal the product of pi and its diameter.
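The triangle test can be illustrated with a short computation. This is a sketch under stated assumptions: the space is a unit sphere, the sides of the triangle are great-circle arcs, and the helper names are hypothetical. Every quantity used is an angle between directions within the surface, which is the kind of measurement an inhabitant could make without leaving the space; the 3D vectors serve only as bookkeeping.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _unit(v):
    n = math.sqrt(_dot(v, v))
    return tuple(x / n for x in v)

def _tangent(at, toward):
    """Unit tangent at point `at` along the great circle heading to `toward`."""
    d = _dot(at, toward)
    return _unit(tuple(b - d * a for a, b in zip(at, toward)))

def vertex_angle(a, b, c):
    """Interior angle, in degrees, at vertex a of the geodesic triangle abc."""
    cos_angle = _dot(_tangent(a, b), _tangent(a, c))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle))

def angle_sum(a, b, c):
    return vertex_angle(a, b, c) + vertex_angle(b, c, a) + vertex_angle(c, a, b)

# A large triangle: the North Pole and two equator points 90 degrees apart.
large = angle_sum((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))

# A small triangle near the pole, where the surface is nearly flat.
small = angle_sum((0.0, 0.0, 1.0), _unit((0.01, 0.0, 1.0)), _unit((0.0, 0.01, 1.0)))

print(f"large triangle: {large:.4f} degrees")
print(f"small triangle: {small:.4f} degrees")
```

The large triangle has three right angles, so its angles sum to 270 degrees rather than 180. The small triangle exceeds 180 degrees only slightly, which matches the idea that a curved space looks flat in sufficiently small regions.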

Many physicists who promote string theory believe that spacetime really has many more dimensions of space than three. There are the three common ones, customarily called our “brane” (short for “membrane”), plus others. The higher-dimensional  “hyperspace” in which our brane resides is called the “bulk.” Our 3D brane bends into the bulk. It is believed that light cannot escape our brane but gravity can.

7. What Is a Spacetime Diagram and a Light Cone?

A spacetime diagram is a graphical representation of the coordinates of events in spacetime. Think of the diagram as a picture of a reference frame. In classical spacetime diagrams, one designated coordinate axis is for time. The other axes are for space. A Minkowski spacetime diagram is a special kind of spacetime graph.  It is a particular 4-dimensional generalization of 3-D Cartesian coordinates, one that represents phenomena that obey the laws of special relativity. A Minkowski diagram allows no curvature of spacetime itself, although objects themselves can have curving sides, and they can have curving paths in space.

The following diagram is an example of a three-dimensional Minkowski spacetime diagram containing two spatial dimensions (with straight lines for the two axes) and a time dimension (with a vertical straight line for the time axis). If you are located at the origin, then the space part of this spacetime frame constitutes your rest frame; it’s the frame in which you have zero velocity. Two cones emerge upward and downward from the point-event of you, the zero-volume observer being here now at the origin of the reference frame of your spacetime diagram. These cones are your future and past light cones. The two cones are composed of green paths of possible unimpeded light rays emerging from the observer or converging into the observer. The light cone at a point of space exists even if there is no actual light there.

A 3D Minkowski diagram

Attribution: Stib at en.wikipedia, CC BY-SA 3.0

By convention, in a Minkowski spacetime diagram, a Cartesian (rectangular) coordinate system is used, the time axis is shown vertically, and one or two of the three spatial dimensions are suppressed (that is, not included).

If the Minkowski diagram has only one spatial dimension, then a flash of light in a vacuum has a perfectly straight-line representation, but it has a cone-shaped representation if the Minkowski diagram has two spatial dimensions, and it is a sphere if there are three spatial dimensions. Because light travels at such a high speed, it is common to choose the units along the axes so that the path of a light ray is at a 45-degree angle and the value of c is 1 light year per year, with light years being the units along each space axis and years being the units along the time axis. Or the value of c could have been chosen to be one light nanosecond per nanosecond. The careful choice of units for the axes in the diagram is important in order to prevent the light cones’ appearing too flat to be informative.

Below is an example of a Minkowski diagram having only one space dimension, so every future light cone has the shape of the letter “V.”

This Minkowski diagram represents a spatially-point-sized Albert Einstein standing still midway between two special places, places where there is an instantaneous flash of light at time t = 0 in coordinate time. At t = 0, Einstein cannot yet see the flashes because they are too far away for the light to reach him yet. The directed arrows represent the path of the four light rays from the flashes. In a Minkowski diagram, a physical point-object of zero volume is represented as occupying a single point at one time and as occupying a line containing all the spacetime points at which it exists. That line is called the  world line of the (point) object. All world lines representing real objects are continuous paths in spacetime. Accelerating objects have curved paths in spacetime. Real objects that are not just points have a world tube rather than merely a world line.

Events on the same horizontal line of the Minkowski diagram are simultaneous in the reference frame. The more tilted an object’s world line is away from the vertical, the faster the object is moving. Given the units chosen for the above diagram, no world line can tilt more than 45 degrees away from the vertical, or else that object is moving faster than c, the cosmic speed limit according to special relativity.

In the above diagram, Einstein’s world line is straight, indicating no total external force is acting on him. If an object’s world line meets another object’s world line, then the two objects collide.

The set of all possible photon histories or light-speed world lines going through a specific point-event defines the two light cones of that event, namely its past light cone and its future light cone. The future cone or forward cone is called a cone because, if the spacetime diagram were to have two space dimensions, then light emitted from a flash would spread out in the two spatial dimensions in a circle of ever-growing diameter, producing a cone shape over time. In a diagram for three-dimensional space, the light’s wavefront is an expanding sphere and not an expanding cone, but sometimes physicists still will speak informally of its cone.

Every point of spacetime has its own pair of light cones, but the light cone has to do with the structure of spacetime, not its contents, so the light cone of a point exists even if there is no light there.

Whether a member of a pair of events could have had a causal impact upon the other event is an objective feature of the universe and is not relative to a reference frame. A pair of events inside the same light cone are said to be causally-connectible because they could have affected each other by a signal going from one to the other at no faster than the speed of light, assuming there were no obstacles that would interfere. For two causally-connectible events, the relation between the two events is said to be timelike. If you were once located in spacetime at, let’s say, (x1,y1,z1,t1), then for the rest of your life you cannot affect or participate in any event that occurs outside of the forward light cone whose apex is at (x1,y1,z1,t1). Light cones are an especially helpful tool because different observers in different rest frames should agree on the light cones of any event, despite their disagreeing on what is simultaneous with what and their disagreeing on the duration between two events. So, the light-cone structure of spacetime is objectively real.

Einstein’s Special Theory does apply to gravitation, but it does so very poorly. It  falsely assumes that gravitational processes have no effect on the structure of spacetime. When attention needs to be given to the real effect of gravitational processes on the structure of spacetime, that is, when general relativity needs to be used, then Minkowski diagrams become inappropriate for spacetime. General relativity assumes that the geometry of spacetime is locally Minkowskian, but not globally Minkowskian. That is, spacetime is locally flat in the sense that in any infinitesimally-sized region one always finds spacetime to be 4D Minkowskian (which is 3D Euclidean for space but not 4D Euclidean for spacetime). When we say spacetime is curved and not flat, we mean it deviates from 4D Minkowskian geometry. In discussions like this, more often the term “Lorentzian” is used in place of “Minkowskian.”

8. What Are Time’s Metric and Spacetime’s Interval?

The metric of a space contains geometric information about the space. It tells the curvature at points, and it tells the distance between any two points along a curve containing the two points. The introduction below discusses distance and duration and spacetime interval. If you change to a different coordinate system, generally you must change the metric. In that sense, the metric is not objective.

In simple situations in a Euclidean space with a Cartesian coordinate system, the metric is a procedure that says that in order to find the duration subtract the event’s starting time from its ending time. More specifically, this metric for a one-dimensional space for time says that, in order to compute the duration between point-event a that occurs at time t(a) and point-event b that occurs at time t(b), one should compute |t(b) – t(a)|, the absolute value of their difference. This is the standard way to compute durations when curvature of spacetime is not involved. When it is involved, such as in general relativity, we need a more exotic metric, and the computations can be extremely complicated.

The metric for spacetime implies the metric for time. The spacetime metric tells the spacetime interval between two point events. The spacetime interval has both space aspects and time aspects. The interval is the measure of the spacetime separation between two point events along a specific spacetime path. Let’s delve into this issue a little more deeply.

There are multiple senses of the word space. A mathematical space is not a physical space. A physicist often represents time as a one-dimensional space, space as a three-dimensional space, and spacetime as a four-dimensional space. More generally, a metric for any sort of space is an equation that says how to compute the distance (or something distance-like, as we shall soon see) between any two points in that space along a curve in the space, given the location coordinates of the two points. Note the coordinate dependence. For ordinary Euclidean space, the usual metric is just the three-dimensional version of the Pythagorean Theorem. In a Euclidean four-dimensional space, the metric is the four-dimensional version of the Pythagorean Theorem.

In a one-dimensional Euclidean space along a straight line from point location x to a point location y, the metric says the distance d between the two points is |y – x|. It is assumed both locations use the same units.

The duration t(a,b) between an event a that occurs at time t(a) and an event b that occurs at time t(b) is given by the equation:

t(a,b) = |t(b) – t(a)|.

This is the standardly-accepted way to compute durations when curvature is not involved. Philosophers have asked whether one could just as well have used half that absolute value, or the square root of the absolute value. More generally, is one definition of the metric the correct one or just the more useful one? That is, philosophers are interested in the underlying issue of whether the choice of a metric is natural in the sense of being objective or whether its choice is a matter of convention.

Let’s bring in more dimensions. In a two-dimensional plane satisfying Euclidean geometry, the formula for the metric is:

$d^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2$.

It defines what is meant by the distance d between an arbitrary point with the Cartesian coordinates $(x_1, y_1)$ and another point with the Cartesian coordinates $(x_2, y_2)$, assuming all the units are the same, such as meters. The x numbers are values in the x dimension, that is, parallel to the x-axis, and the y numbers are values in the y dimension. The above equation is essentially the Pythagorean Theorem of plane geometry. Here is a visual representation of this for the two points:

If you imagine this graph is showing you what a crow would see flying above a square grid of streets, then the metric equation $d^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2$ gives you the distance d as the crow flies. But if your goal is a metric that gives the distance only for taxicabs that are restricted to travel vertically or horizontally, then a taxicab metric would compute the taxi’s distance this way:

$|x_2 - x_1| + |y_2 - y_1|$.

So, a space can have more than one metric, and we choose the metric depending on the character of the space and what our purpose is.
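As a brief illustration of two metrics on the same space, here is a minimal sketch that applies both formulas to one pair of points. The function names and the sample points (0, 0) and (3, 4) are assumptions made for the example.

```python
import math

def euclidean_distance(p, q):
    """Distance 'as the crow flies' from the Pythagorean formula."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def taxicab_distance(p, q):
    """Distance for travel restricted to horizontal and vertical streets."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

start, end = (0, 0), (3, 4)

print(euclidean_distance(start, end))  # 5.0
print(taxicab_distance(start, end))    # 7
```

The same two points are 5 units apart under one metric and 7 under the other, so the distance depends on which metric is adopted for the space.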

Usually for a physical space there is a best or intended or conventionally-assumed metric. If all we want is the shortest distance between two points in a two-dimensional Euclidean space, the conventional metric is:

$d^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2$

which is the Pythagorean theorem. But if we are interested in distances along an arbitrary path rather than just the shortest path, then the above metric is correct only infinitesimally, and a more sophisticated metric is required by using the tools of calculus. In this case, the above metric is re-expressed as a difference equation using the delta operator symbol Δ to produce:

$(\Delta s)^2 = (\Delta x)^2 + (\Delta y)^2$

where Δs is the spatial distance between the two points and $\Delta x = x_2 - x_1$ and $\Delta y = y_2 - y_1$. The delta symbol Δ is not a number but rather is an operator on two numbers that produces their difference. If the differences are extremely small, infinitesimally small, then they are called differentials instead of differences, and then Δs becomes ds, and Δx becomes dx, and Δy becomes dy, and we have entered the realm of differential calculus with:

$ds^2 = dx^2 + dy^2$

The letter d in a differential stands for an infinitesimally small delta operation, and it is not a number.
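To show how the infinitesimal form yields the length of an arbitrary path, the sketch below approximates the differentials by very small finite differences and adds up the resulting small distances. It is a numerical approximation rather than an exact integral, and the function names, the step count, and the quarter-circle example are assumptions chosen for illustration.

```python
import math

def path_length(curve, t_start, t_end, steps=10_000):
    """Approximate the length of a 2D curve by summing sqrt(dx^2 + dy^2) over small steps.

    `curve` maps a parameter value to an (x, y) point.
    """
    total = 0.0
    x_prev, y_prev = curve(t_start)
    for i in range(1, steps + 1):
        t = t_start + (t_end - t_start) * i / steps
        x, y = curve(t)
        total += math.hypot(x - x_prev, y - y_prev)  # one small piece ds
        x_prev, y_prev = x, y
    return total

def quarter_circle(theta):
    """A quarter of the unit circle, parameterized by angle."""
    return (math.cos(theta), math.sin(theta))

arc_length = path_length(quarter_circle, 0.0, math.pi / 2)
straight = math.hypot(1.0 - 0.0, 0.0 - 1.0)  # direct distance between the endpoints

print(f"along the curve: {arc_length:.6f}")
print(f"straight line:   {straight:.6f}")
```

The summed value approaches pi/2 as the steps shrink, whereas the Pythagorean formula applied once to the endpoints gives only the shorter straight-line distance.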

Let’s generalize this idea from 2D-space to 4D-spacetime. The metric we are now looking for is the space-time interval between two arbitrary point-events, not the distance between them nor the time between them. Although there is neither a duration between New York City and Paris, nor a spatial distance between noon today and midnight later, nevertheless there is a spacetime interval between New York City at noon and Paris at midnight.

Unlike temporal durations and spatial distances, intervals are objective in the sense that the spacetime interval is not relative to a reference frame or coordinate system. All observers measure the same value for an interval, assuming they measure it correctly. The value of an interval between two point events does not change if the reference frame changes. Alternatively, acceptable reference frames are those that preserve the intervals between points.

Any space’s metric says how to compute the value of the separation s between any two points in that space. In special relativity, the four-dimensional abstract space that represents spacetime is indeed special. Its 3-D spatial part is Euclidean and its 1-D temporal part is Euclidean, but the 4D space is not Euclidean, and its metric is exotic. It is said to be Minkowskian, and it is given a Lorentzian coordinate system. Its metric is defined between two infinitesimally close points of spacetime to be:

$ds^2 = c^2 dt^2 - dx^2$

where ds is an infinitesimal interval (or a so-called differential displacement of the spacetime coordinates) between two nearby point-events in the spacetime; c is the speed of light; the differential dt is the infinitesimal duration between the two time coordinates of the two events; and dx is the infinitesimal spatial distance between the two events. Notice the negative sign. If it were a plus sign, then the metric would be Euclidean.

Because there are three dimensions of space in a four-dimensional spacetime, say dimensions 1, 2, and 3, the differential spatial distance dx is defined to be:

$dx^2 = dx_1^2 + dx_2^2 + dx_3^2$

No negation signs. This equation is obtained in Cartesian coordinates by using the Pythagorean Theorem for three-dimensional space. The differential $dx_1$ is the displacement along dimension 1 of the three dimensions. Similarly for 2 and 3. This is the spatial distance between two point-events, not the interval between them.
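The claim that the interval does not depend on the reference frame can be checked with a small numerical sketch. The code below computes the squared interval from the two formulas above and then recomputes it after a change of frame. The change of frame uses the standard Lorentz boost along dimension 1, which is brought in from textbook special relativity and is not derived in this article; the function names, the units in which c = 1, and the sample numbers are assumptions.

```python
import math

C = 1.0  # assumed units in which the speed of light is 1

def interval_squared(dt, dx1, dx2, dx3, c=C):
    """Squared spacetime interval: c^2 dt^2 minus the squared spatial distance."""
    spatial_squared = dx1**2 + dx2**2 + dx3**2  # 3D Pythagorean Theorem
    return (c * dt) ** 2 - spatial_squared

def boost_along_1(dt, dx1, dx2, dx3, v, c=C):
    """Coordinate differences as seen from a frame moving at velocity v along dimension 1."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    dt_new = gamma * (dt - v * dx1 / c**2)
    dx1_new = gamma * (dx1 - v * dt)
    return dt_new, dx1_new, dx2, dx3

separation = (5.0, 3.0, 1.0, 2.0)           # (dt, dx1, dx2, dx3) in one frame
boosted = boost_along_1(*separation, v=0.6)  # the same pair of events in another frame

print("original frame:", interval_squared(*separation))
print("boosted frame: ", interval_squared(*boosted))
```

The duration and the spatial distance each change under the boost, yet the two printed interval values agree up to floating-point rounding.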

With these differential equations, the techniques of calculus can then be applied to find the interval between any two point-events along some path s even if they are not nearby in spacetime. That is, we can find it if we have the necessary information about the world line s, the path in spacetime, such as its equation in the coordinate system.

In special relativity, the interval between two events that occur at the same place, such as the place where the clock is sitting, is very simple. Since dx = 0, the interval, once divided by c to express it in units of time, is:

t(a,b) = |t(b) – t(a)|.

This is the absolute value of the difference between the real-valued time coordinates, assuming all times are specified in the same units, say, seconds. We began the discussion of this section by using that metric.

Now let us generalize this notion in order to find out how to use a clock for events that do not occur at the same place. The infinitesimal proper time dτ, rather than the differential coordinate-time dt, is the duration shown by a clock carried along the infinitesimal spacetime interval ds. It is defined in any spacetime obeying special relativity to be:

$d\tau^2 = ds^2/c^2$.

In general, dτ ≠ dt. They are equal only if the two point-events have the same spatial location so that dx = 0.
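The difference between proper time and coordinate time can be made concrete with a short numerical sketch. Substituting the special-relativistic metric into the definition of proper time gives dτ² = dt² − dx²/c², and the code below sums the square roots of these small pieces along a world line. The function names, the one-dimensional world lines, and the sample speed of 0.6c are assumptions chosen for illustration, and the sketch presumes the clock never reaches light speed.

```python
import math

C = 299_792_458.0  # speed of light in meters per second

def proper_time(worldline, t_start, t_end, steps=10_000):
    """Approximate the time shown by a clock carried along x = worldline(t).

    Sums d_tau = sqrt(dt^2 - dx^2 / c^2) over small coordinate-time steps.
    """
    total = 0.0
    dt = (t_end - t_start) / steps
    x_prev = worldline(t_start)
    for i in range(1, steps + 1):
        x = worldline(t_start + i * dt)
        dx = x - x_prev
        total += math.sqrt(dt * dt - (dx / C) ** 2)
        x_prev = x
    return total

def at_rest(t):
    return 0.0

def coasting(t):
    return 0.6 * C * t  # constant speed of 0.6 c

resting_clock = proper_time(at_rest, 0.0, 10.0)
moving_clock = proper_time(coasting, 0.0, 10.0)

print(f"clock at rest: {resting_clock:.6f} s")
print(f"moving clock:  {moving_clock:.6f} s")
```

Over 10 seconds of coordinate time, the clock at rest records about 10 seconds because dx = 0 for it, while the moving clock records about 8 seconds.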

As we have seen, the length of a path in spacetime is not calculated the way we calculate the length of a path in space. In space we use the Euclidean method; in spacetime we use the Minkowski method, which contains a negation sign in its equation $ds^2 = c^2 dt^2 - dx^2$. Because spacetime “distances” (intervals) can be negative, and because the spacetime interval between two different events can be zero even when the events are far apart in spatial distance (but reachable by a light ray if intervening material were not an obstacle), the term interval here is not what is normally meant by the term distance.

To get a sense of the oddness of a spacetime interval, note that the spacetime interval between the birth of a photon and its death far away when it is absorbed by an atom is zero even though the two events do not have a zero time interval.

There are three kinds of spacetime intervals: timelike, spacelike, and null. In spacetime, if two events are in principle connectable by a signal moving from one event to the other at less than light speed, the interval between the two events is called timelike. The interval is spacelike if there is no reference frame in which the two events occur at the same place, so they must occur at different places and be some spatial distance apart—thus the choice of the word spacelike. Two events connectable by a signal moving exactly at light speed are separated by a null interval, an interval of magnitude zero.

Here is an equivalent way of describing the three kinds of spacetime intervals. If one of the two events occurs at the origin or apex of a light cone, and the other event is within either the forward light cone or backward light cone, then the two events have a timelike interval. If the other event is outside the light cones, then the two events have a spacelike interval [and are in each other’s so-called absolute elsewhere]. If the two events lie directly on the same light cone, then their interval is null or zero.
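The three-way classification can be expressed as a short function that tests the sign of the squared interval. This is a minimal sketch assuming one spatial dimension and units in which c = 1; the function name and the tolerance used for comparing floating-point values with zero are hypothetical choices.

```python
def classify_interval(dt, dx, c=1.0, tol=1e-12):
    """Classify the separation of two events from the sign of c^2 dt^2 - dx^2."""
    s_squared = (c * dt) ** 2 - dx**2
    if abs(s_squared) <= tol:
        return "null"       # on the light cone
    if s_squared > 0:
        return "timelike"   # inside the light cone
    return "spacelike"      # outside the light cone

print(classify_interval(dt=2.0, dx=1.0))  # timelike
print(classify_interval(dt=1.0, dx=2.0))  # spacelike
print(classify_interval(dt=1.0, dx=1.0))  # null
```

A positive value places the second event inside the light cone of the first, a negative value places it outside, and zero places it on the cone itself.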

The spacetime interval between any two events in a human being’s life must be a timelike interval. No human being can do anything to affect an event outside their future light cone. Such is the human condition according to relativity theory.

The information in the more complicated metric for general relativity enables a computation of the curvature at any point. This more complicated metric is the Riemannian metric tensor field. This is what you know when you know the metric of spacetime.

A space’s metric provides a complete description of the local properties of the space, regardless of whether the space is a physical space or a mathematical space representing spacetime. By contrast, the space’s topology provides a complete description of the global properties of the space such as whether it has external curvature like a cylinder or no external curvature as in a plane; these two spaces are locally the same.

The metric for special relativity is complicated enough, but the metric for general relativity is very complicated.

The discussion of the metric continues in the discussion of time coordinates. For a helpful and more detailed presentation of the spacetime interval and the spacetime metric, see chapter 4 of (Maudlin 2012) and especially the chapter “Geometry” in The Biggest Ideas in the Universe: Space, Time, and Motion by Sean Carroll.

9. How Does Proper Time Differ from Standard Time and Coordinate Time?

Proper time is personal, and standard time is public. Standard time is the proper time reported by the standard clock of our conventionally-chosen standard coordinate system. Coordinate time is the time measured in some conventionally adopted coordinate system. Every properly functioning clock measures its own proper time, the time along its own world tube, no matter how the clock is moving or what forces are acting upon it. Loosely speaking, standard time is the time shown on a designated clock in Paris, France that reports the time in London, England that we agree to be the correct time. The Observatory is assumed to be stationary in the standard coordinate system. Given a coordinate system with a time coordinate and space coordinate, if you sit still, then your proper time is the same as the coordinate time.

But the faster your clock moves compared to the standard clock, or the greater the gravitational force on it compared to the standard clock, the more your clock readings will deviate from standard time, as would be very clear if the two clocks were ever to meet. This effect is called time dilation. Under normal circumstances, in which you move slowly compared to the speed of light and do not experience unusual gravitational forces, there is no noticeable difference between your proper time and your civilization’s standard time.

Think of any object’s proper time as the time that would be shown on an ideal, small, massless, correct clock that always travels with the object and has no physical effect upon the object and that is not affected if the object is ever frozen. Your cell phone is an exception. Although it has its own proper time, what it reports is not its proper time but instead the proper time of our standard clock adjusted by an hour for each time zone between it and the cell phone. People on Earth normally do not notice that they have different proper times from each other because the time dilation effect is so small for the kind of life they lead.

The proper time interval between two events (on a world line) is the amount of time that elapses according to an ideal clock that is transported between the two events. But there are many paths for the transportation, just as there are many roads between Paris and Berlin. Consider two point-events. Your own proper time between them is the duration between the two events as measured along the world line of your clock that is transported between the two events. Because there are so many physically possible ways to do the clock transporting, for example at slow speed or high speed and near a large mass or far from it, there are many different possible proper time intervals for the same two events. There is one exception here. The proper time between two points along the world line of a light ray is always zero. So, if you were a photon and traveled across the Milky Way Galaxy, no proper time would elapse during your journey, although external observers of your journey would measure a large amount of coordinate time.
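To illustrate how different paths between the same two events yield different proper times, here is a minimal Python sketch. It assumes special relativity only (no gravity) and a path made of segments traveled at constant speed; the function name and the sample speeds are hypothetical choices for the example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def proper_time(segments, c=C):
    """Sum the proper time over (coordinate_duration, speed) segments."""
    total = 0.0
    for duration, speed in segments:
        if not 0.0 <= speed <= c:
            raise ValueError("speed must be between 0 and c")
        # Each segment contributes its coordinate duration times sqrt(1 - v^2/c^2).
        total += duration * math.sqrt(1.0 - (speed / c) ** 2)
    return total


# Three paths between the same pair of events, 10 years apart in coordinate time.
stay_home = proper_time([(10.0, 0.0)])                       # clock at rest
round_trip = proper_time([(5.0, 0.8 * C), (5.0, 0.8 * C)])   # out and back at 0.8c
light_path = proper_time([(10.0, C)])                        # limiting case of light

print(stay_home, round_trip, light_path)
```

The clock at rest records the most proper time, the traveling clock records less, and the limiting light-speed path records none.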

Here is a way to maximize the difference between proper time and standard time. If you and your clock pass through the event horizon of a black hole and fall toward the hole’s center, you will not notice anything unusual about your proper time, but external observers using Earth’s standard time will measure that you took an extremely long time to enter the horizon.

The actual process by which coordinate time is computed from the proper times of real clocks and the process by which a distant clock is synchronized with a local clock are very complicated, though some of the philosophically most interesting issues here—regarding the relativity of simultaneity and the conventionality of simultaneity—are discussed below.

Authors and speakers who use the word time often do not specify whether they mean proper time or standard time or coordinate time. They assume the context is sufficient for us to know what they mean.

10. Is Time the Fourth Dimension?

Yes and no; it depends on what is meant by the question. It is correct to say time is a dimension but not to say time is a spatial dimension. Time is the fourth dimension of 4D spacetime, but time is not the fourth dimension of physical space because that space has only three dimensions, as far as we know. In 4D spacetime, the time dimension is special and differs in a fundamental way from the other three dimensions.

Mathematicians have a broader notion of the term space than the average person. In their sense, a space need not contain any geographical locations nor any times, and it can have any number of dimensions, even an infinite number. Such a space might be two-dimensional and contain points represented by the ordered pairs in which a pair’s first member is the name of a voter in London and its second member is the average monthly income of that voter. Not paying attention to the two meanings of the term space is the source of all the confusion about whether time is the fourth dimension.

Newton treated space as three dimensional and treated time as a separate one-dimensional space. He could have used Minkowski’s 1908 idea, if he had thought of it, namely the idea of treating spacetime as four-dimensional.

The mathematical space used by mathematical physicists to represent physical spacetime that obeys the laws of relativity is four-dimensional; and in that mathematical space, the space of places is a 3D sub-space, and time is another sub-space, a 1D one. The mathematician Hermann Minkowski was the first person to construct such a 4D mathematical space for spacetime, although in 1895 H. G. Wells treated time informally as the fourth dimension in his novel The Time Machine.

In 1908, Minkowski remarked that “Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.” Many people mistakenly took this to mean that time is partly space, and vice versa. The philosopher C. D. Broad countered that the discovery of spacetime did not break down the distinction between time and space but only their independence or isolation.

The reason why time is not partly space is that, within a single frame, time is always distinct from space. Another way of saying this is to say time always is a distinguished dimension of spacetime, not an arbitrary dimension. What being distinguished amounts to, speaking informally, is that when you set up a rectangular coordinate system on a spacetime with an origin at, say, some important event, you may point the x-axis east or north or up or any of an infinity of other directions, but you may not point it forward in time—you may do that only with the t-axis, the time axis.

For any coordinate system on spacetime, mathematicians of the early twentieth century believed it was necessary to treat a point-event with at least four independent numbers in order to account for the four dimensionality of spacetime. Actually this appeal to the 19th-century definition of dimensionality, which is due to Bernhard Riemann, is not quite adequate because mathematicians have subsequently discovered how to assign each point on the plane to a point on the line without any two points on the plane being assigned to the same point on the line. The idea comes from the work of Georg Cantor. Because of this one-to-one correspondence between the plane’s points and the line’s points, the points on a plane could be specified with just one number instead of two. If so, then the line and plane must have the same dimensions according to the Riemann definition of dimension. To avoid this result, and to keep the plane being a 2D object, the notion of dimensionality of space has been given a new, but rather complex, definition.

There has been much research in string theory regarding whether space has more than three dimensions and whether the dimensionality can differ in different regions of spacetime. If string theory is correct, space might have many more dimensions than three, but string theory is an unconfirmed theory. A space with more dimensions than three for our universe is commonly called “the bulk.”

11. How Is Time Relative to the Observer?

The rate that a clock ticks is relative to the observer. Given one event, the first observer’s clock can measure one value for its duration, but a second clock can measure a different value if it is moving or being affected differently by gravity. Yet, says Einstein, both measurements can be correct. That is what it means to say time is relative to the observer. This relativity is quite a shock to our manifest image of time. According to Newton’s physics, in principle there is no reason why observers cannot agree on what time it is now or how long an event lasts or when some distant event occurred. Einstein’s theory disagrees with Newton’s on all this.

The term “observer” in relativity theory has a technical meaning.  The observer has no effect on the observation. The observer at a point is idealized as a massless point particle having no impact on its environment. Ideally, an observer is a conscious being who can report an observation and who has a certain orientation to what is observed, such as being next to the measured event or being three light years away. If so, the observation is called objective. An observation is the result of the action of observing. It establishes the values of one or more variables as in: “It was noon on my spaceship’s clock when the asteroid impact was detected, so because of the travel time of light I compute that the impact occurred at 11:00.”

Think of an observer as being an omniscient reference frame. Consider what is involved  in being an omniscient reference frame. Information about any desired variable is reported from a point-sized spectator at each spacetime location. A spectator is always accompanied by an ideal, point-sized, massless, perfectly functioning clock that is synchronized with the clocks of other spectators at all other points of spacetime. The observer at a location has all the tools needed for reporting values of variables such as voltage or the presence or absence of grape jelly at that location.

12. What Is the Relativity of Simultaneity?

The relativity of simultaneity is the feature of spacetime in which observers using different reference frames disagree on which events are simultaneous. Simultaneity is relative to the chosen reference frame. A large percentage of both physicists and philosophers of time suggest that this implies simultaneity is not objectively real, and they conclude also that the present is not objectively real, the present being all the events that are simultaneous with being here now.

Why is there disagreement about what is simultaneous with what? It occurs because the two events occur spatially far from each other.

In our ordinary lives, we can neglect all this because we are interested in nearby events. If two events occur near us, we can just look and see whether they occurred simultaneously. But suppose we are on a spaceship circling Mars when a time signal is received saying it is noon in London, England. Did the events of sending and receiving occur simultaneously? No. Light takes several minutes to travel from the Earth to the spaceship; suppose it takes ten minutes. If we want to use this time signal to synchronize our clock with the Earth clock, then instead of setting our spaceship clock to noon, we should set it to ten minutes after noon.

This scenario conveys the essence of properly synchronizing distant clocks with our nearby clock. There are some assumptions that are ignored for now, namely that we can determine that the spaceship was relatively stationary with respect to Earth and was not in a different gravitational potential field from that of the Earth clock.

The diagram below illustrates the relativity of simultaneity for the so-called midway method of synchronization. There are two light flashes. Did they occur simultaneously?

[Figure: Minkowski diagram of the two light flashes]

The Minkowski diagram represents Einstein sitting still in the reference frame indicated by the coordinate system with the thick black axes. Lorentz is traveling rapidly away from him and toward the source of flash 2. Because Lorentz’s world line is a straight line, we can tell that he is moving at a constant speed. The two flashes of light arrive simultaneously at their midpoint according to Einstein but not according to Lorentz. Lorentz sees flash 2 before flash 1. That is, the event A of Lorentz seeing flash 2 occurs before event C of Lorentz seeing flash 1. So, Einstein will readily say the flashes are simultaneous, but Lorentz will have to do some computing to figure out that the flashes are simultaneous in the Einstein frame because they are not simultaneous to him in a reference frame in which he is at rest.  However, if we’d chosen a different reference frame from the one above, one in which Lorentz is not moving but Einstein is, then it would be correct to say flash 2 occurs before flash 1. So, whether the flashes are or are not simultaneous depends on which reference frame is used in making the judgment. It’s all relative.
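The frame dependence of simultaneity can be checked numerically with the Lorentz transformation for the time coordinate, t′ = γ(t − vx/c²). The sketch below is an illustration under stated assumptions: units with c = 1, flash 2 placed at positive x, flash 1 at negative x, and a made-up relative speed of 0.5c for Lorentz. None of these numbers comes from the diagram.

```python
import math


def lorentz_time(t, x, v, c=1.0):
    """Time coordinate of event (t, x) in a frame moving at velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c ** 2)


# Two flashes that are simultaneous (t = 0) in the Einstein frame.
flash_1 = (0.0, -1.0)
flash_2 = (0.0, +1.0)
v = 0.5  # Lorentz's velocity toward the source of flash 2 (illustrative value)

t1_lorentz = lorentz_time(*flash_1, v)
t2_lorentz = lorentz_time(*flash_2, v)

print(t1_lorentz, t2_lorentz)
```

In the Lorentz frame the time coordinate of flash 2 comes out smaller than that of flash 1, so flash 2 occurs first there, even though both flashes have t = 0 in the Einstein frame.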

There is a related philosophical issue involved with assumptions being made in, say, claiming that Einstein was initially midway between the two flashes. Can the midway determination be made independently of adopting a convention about whether the speed of light is independent of its direction of travel? This is the issue of whether there is a ‘conventionality’ of simultaneity.

13. What Is the Conventionality of Simultaneity?

The relativity of simultaneity is philosophically less controversial than the conventionality of simultaneity. To appreciate the difference, consider what is involved in making a determination regarding simultaneity. The central problem is that you can measure the speed of light only for a round trip, not a one-way trip, so you cannot simultaneously check what time it is on your clock and on a distant clock. A related, simpler problem is to determine whether the speed of light is the same in opposite directions.

Given two events that happen essentially at the same place, physicists assume they can tell by direct observation whether the events happened simultaneously. If they cannot detect that one of them is happening first, then they say they happened simultaneously, and they assign the events the same time coordinate in the reference frame. The determination of simultaneity is very much more difficult if the two events happen very far apart, such as claiming that the two flashes of light reaching Einstein in the scenario of the previous section began at the same time. One way to measure (operationally define) simultaneity at a distance is the midway method. Say that two events are simultaneous in the reference frame in which we are stationary if unobstructed light signals caused by the two events reach us simultaneously when we are midway between the two places where they occurred. This is the operational definition of simultaneity used by Einstein in his theory of special relativity.

This midway method has a significant presumption: that the light beams coming from opposite directions travel at the same speed. Is this a fact or just a convenient convention to adopt? Einstein and the philosophers of time Hans Reichenbach and Adolf Grünbaum have called this a reasonable convention because any attempt to experimentally confirm the equality of speeds, they believed, presupposes that we already know how to determine simultaneity at a distance.

Hilary Putnam, Michael Friedman, and Graham Nerlich object to calling it a convention—on the grounds that to make any other assumption about light’s speed would unnecessarily complicate our description of nature, and we often make choices about how nature is on the basis of simplification of our description of nature.

To understand the dispute from another perspective, notice that the midway method above is not the only way to define simultaneity. Consider a second method, the mirror reflection method. Select an Earth-based frame of reference, and send a flash of light from Earth to Mars where it hits a mirror and is reflected back to its source. The flash occurred at 12:00 according to a correct Earth clock, let’s say, and its reflection arrived back on Earth 20 minutes later. The light traveled the same empty, undisturbed path coming and going. At what time did the light flash hit the mirror? The answer involves the conventionality of simultaneity. All physicists agree one should say the reflection event occurred at 12:10 because they assume it took ten minutes going to Mars, and ten minutes coming back. The difficult philosophical question is whether this way of calculating the ten minutes is really just a convention. Einstein pointed out that there would be no inconsistency in our saying that the flash hit the mirror at 12:17, provided we live with the awkward consequence that light was relatively slow reaching the mirror, but then traveled back to Earth at a faster speed.

Suppose we want to synchronize a Mars clock with our clock on Earth using the reflection method. Let’s draw a Minkowski diagram of the situation and consider just one spatial dimension in which we are at location A on Earth next to the standard clock used for the time axis of the reference frame. The distant clock on Mars that we want to synchronize with Earth time is at location B. See the diagram.

[Figure: Minkowski diagram illustrating the conventionality of simultaneity]

The fact that the world line of the B-clock is parallel to the time axis shows that the two clocks are assumed to be relatively stationary. (If they are not, and we know their relative speed, we might be able to correct for this.) We send light signals from Earth in order to synchronize the two clocks. Send a light signal from A at time t1 to B, where it is reflected back to us at A, arriving at time t3. So, the total travel time for the light signal is t3 – t1, as judged by the Earth-based frame of reference. Then the reading tr on the distant clock at the time of the reflection event should be set to t2, where:

t2 = t1 + (1/2)(t3 – t1).

If tr = t2, then the two spatially separated clocks are supposedly synchronized.

Einstein noticed that the use of the fraction 1/2 rather than the use of some other fraction implicitly assumes that the light speed to and from B is the same. He said this assumption is a convention, the so-called conventionality of simultaneity, and is not something we could check to see whether it is correct.  Only with the fraction (1/2) are the travel speeds the same going and coming back.
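The role of the fraction can be made concrete with a short Python sketch. It generalizes the formula to t2 = t1 + ε(t3 − t1), where the parameter name epsilon is introduced here for illustration, and it reports the one-way speeds that each choice of the fraction implies. The distance of 10 light-minutes and the 20-minute round trip are taken from the Mars mirror example; the function names are hypothetical.

```python
def reflection_time(t1, t3, epsilon=0.5):
    """Time assigned to the reflection event: t2 = t1 + epsilon * (t3 - t1)."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return t1 + epsilon * (t3 - t1)


def one_way_speeds(distance, t1, t3, epsilon=0.5):
    """Outbound and return speeds implied by a choice of epsilon."""
    t2 = reflection_time(t1, t3, epsilon)
    return distance / (t2 - t1), distance / (t3 - t2)


# Times in minutes after 12:00, distance in light-minutes.
print(reflection_time(0.0, 20.0, 0.5))        # standard choice
print(reflection_time(0.0, 20.0, 0.85))       # non-standard choice
print(one_way_speeds(10.0, 0.0, 20.0, 0.85))  # slow outbound, fast return
```

With the fraction 1/2 the reflection is assigned to 12:10 and both one-way speeds equal one light-minute per minute. With 0.85 the reflection is assigned to 12:17, the outbound speed is slower than that, and the return speed is faster, while the round-trip average is unchanged.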

Suppose we try to check whether the two light speeds really are the same. We would send a light signal from A to B, and see if the travel time was the same as when we sent it from B to A. But to trust these durations we would already need to have synchronized the clocks at A and B. But that synchronization process will presuppose some value for the fraction, said Einstein.

Not all philosophers of science agree with Einstein that the choice of (1/2) is a convention, nor with those philosophers such as Putnam who say the messiness of any other choice shows that the choice of 1/2 must be correct. Everyone does agree, though, that any other choice than 1/2 would make for messy physics.

Some researchers suggest that there is a way to check on the light speeds and not simply presume they are the same. Create two duplicate, correct clocks at A. Transport one of the clocks to B at an infinitesimal speed. Going this slow, the clock will arrive at B without having its own time reports deviate from that of the A-clock. That is, the two clocks will be synchronized even though they are distant from each other. Now the two clocks can be used to find the time when a light signal left A and the time when it arrived at B, and similarly for a return trip. The difference of the two time reports on the A and B clocks can be used to compute the light speed in each direction, given the distance of separation. This speed can be compared with the speed computed with the midway method. The experiment has never been performed, but the recommenders are sure that the speeds to and from will turn out to be identical, so they are sure that the (1/2) is correct and not a convention.
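The claim that a slowly transported clock stays in step with the clock left behind can be illustrated numerically. The following Python sketch assumes special relativity, a constant transport speed, and a hypothetical separation of a million kilometers; it computes how far the carried clock lags on arrival.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def transport_lag(distance, speed, c=C):
    """Seconds by which a clock carried at constant speed lags the clock left at A."""
    beta_sq = (speed / c) ** 2
    coordinate_time = distance / speed
    # 1 - sqrt(1 - b) is rewritten as b / (1 + sqrt(1 - b)) to avoid rounding loss.
    return coordinate_time * beta_sq / (1.0 + math.sqrt(1.0 - beta_sq))


distance = 1.0e9  # meters (illustrative separation between A and B)
for speed in (3.0e4, 3.0e2, 3.0):
    print(speed, transport_lag(distance, speed))
```

For small speeds the lag is approximately distance × speed / (2c²), so it shrinks toward zero as the transport speed is reduced, even though the trip itself takes longer.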

Sean Carroll has yet another position on the issue. He says “The right strategy is to give up on the idea of comparing clocks that are far away from each other” (Carroll 2022, 150).

The conventionality of simultaneity is discussed in the encyclopedia article on Eternalism. For additional discussion of the controversies involved in the conventionality of simultaneity, see (Callender 2017, p. 51) and pp. 179-184 of The Blackwell Guide to the Philosophy of Science, edited by Peter Machamer and Michael Silberstein, Blackwell Publishers, Inc., 2002.

14. What are the Absolute Past and the Absolute Elsewhere?

What does it mean to say the human condition is one in which you never will be able to affect an event outside your forward light cone? Here is a visual representation of the human condition according to the special theory of relativity, whose spacetime can always be represented by a Minkowski diagram of the following sort:

[Figure: Minkowski diagram of an event’s light cones]

The absolutely past events (the green events in the diagram above) are the events in or on the backward light cone of your present event, your here-and-now. The backward light cone of event Q is the imaginary cone-shaped surface of spacetime points formed by the paths of all light rays reaching Q from the past.

The events in your absolute past zone or region are those that could have directly or indirectly affected you, the observer, at the present moment, assuming there were no intervening obstacles. The events in your absolute future zone are those that you could directly or indirectly affect.

An event’s being in another event’s absolute past is a feature of spacetime itself because the event is in the point’s past in all possible reference frames. This feature is frame-independent. For any event in your absolute past, every observer in the universe (who is not making an error) will agree the event happened in your past. Not so for events that are in your past but not in your absolute past. Past events not in your absolute past are in what Eddington called your absolute elsewhere. The absolute elsewhere is the region of spacetime containing events that are not causally connectible to your here-and-now. Your absolute elsewhere is the region of spacetime that is neither in nor on either your forward or backward light cones. No event here and now can affect any event in your absolute elsewhere; and no event in your absolute elsewhere can affect you here and now.

If you look through a telescope you can see a galaxy that is a million light-years away, and you see it as it was a million years ago. But you cannot see what it looks like now because the present version of that galaxy is outside your light cone, and is in your absolute elsewhere.

A single point’s absolute elsewhere, absolute future, and absolute past form a partition of all spacetime into three disjoint regions. If point-event A is in point-event B’s absolute elsewhere, the two events are said to be spacelike related. If the two are in each other’s forward or backward light cones they are said to be time-like related or to be causally connectible. We can affect or be affected by events that are time-like related to us here and now; we cannot affect or be affected by events that are space-like separated from our here and now. Whether a space-like event occurs before the event of your being here now depends on the chosen frame of reference, but the order of occurrence of a time-like event and our here-and-now is not frame-relative. Another way to make the point is to say that, when choosing a reference frame, we have a free choice about the time order of two events that are space-like related, but we have no freedom when it comes to two events that are time-like related because the causal order determines their time order. That is why the absolute elsewhere is also called the extended present. There is no fact of the matter about whether a point in your absolute elsewhere is in your present, your past, or your future. It is simply a conventional choice of reference frame that fixes what events in your absolute elsewhere are present events.

For any two events in spacetime, they are time-like, space-like, or light-like separated, and this is an objective feature of the pair that cannot change with a change in the reference frame. This is another implication of the fact that the light-cone structure of spacetime is real and objective, unlike features such as durations and lengths.

The past light cone looks like a cone in small regions in a spacetime diagram with one dimension of time and two of space. However, the past light cone is not cone-shaped in a large cosmological region, but rather has a pear-shape because all very ancient light lines must have come from the infinitesimal volume at the Big Bang.

15. What Is Time Dilation?

Time dilation occurs when two synchronized clocks get out of synchrony due either to their relative motion or to their being in regions of different gravitational field strengths. An observer always notices that it is the other person’s clock that is behaving oddly, never that their own clock is behaving oddly. When two observers are in relative motion, each can see that the other person’s clock is running slow relative to their own clock. It’s as if the other person’s time is stretched or dilated. There is philosophical controversy about whether the dilation is literally a change in time itself or only a change in how durations are measured using someone else’s clock as opposed to one’s own clock.

The specific amount of time dilation depends on the relative speed of the two clocks, not on whether one clock is moving toward or away from the other. If one clock circles the other, the distance between them stays constant, yet the circling clock is still moving relative to the central clock, so there is still time dilation due to speed. Clocks on orbiting satellites are an example.

The sister of time dilation is space contraction. The length of an object changes in different reference frames to compensate for time dilation so that the speed of light c in a vacuum is constant in any frame. The object’s length measured perpendicular to the direction of motion is not affected by the motion, but the length measured in the direction of the motion is affected. If you are doing the measuring, then moving sticks get shorter if moving toward you or away from you. The length changes not because of forces, but rather because space itself contracts.  What a shock this is to our manifest image! No one notices that the space around themselves is contracting, only that the space somewhere else seems to be affected.
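Both effects are governed by the same quantity, the Lorentz factor γ = 1/√(1 − v²/c²). Here is a minimal Python sketch, with speeds expressed as a fraction of the speed of light; the function names and the sample speed of 0.6c are illustrative assumptions.

```python
import math


def lorentz_factor(beta):
    """Lorentz factor for a relative speed given as the fraction beta = v/c."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def dilated_duration(proper_duration, beta):
    """Duration of a moving clock's tick as measured in the observer's frame."""
    return proper_duration * lorentz_factor(beta)


def contracted_length(rest_length, beta):
    """Length along the direction of motion as measured in the observer's frame."""
    return rest_length / lorentz_factor(beta)


beta = 0.6
print(lorentz_factor(beta))          # about 1.25
print(dilated_duration(1.0, beta))   # a 1-second tick is measured as about 1.25 s
print(contracted_length(1.0, beta))  # a 1-meter stick is measured as about 0.8 m
```

Only the length along the direction of motion is divided by γ; lengths perpendicular to the motion are unchanged.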

Here is a picture of the visual distortion of moving objects due to space contraction:

[Figure: rolling wheel]
Image: Corvin Zahn, Institute of Physics, Universität Hildesheim,
Space Time Travel (http://www.spacetimetravel.org/)

The picture describes the same wheel in different colors: (green) rotating in place just below the speed of light; (blue) moving left to right just below the speed of light; and (red) remaining still.

To give some idea of the quantitative effect of time dilation:

Among particles in cosmic rays we find protons…that move so fast that their velocities differ infinitesimally from the speed of light: the difference occurs only in the twentieth (sic!) non-zero decimal after the decimal point. Time for them flows more slowly than for us by a factor of ten billion. If, by our clock, such a proton takes a hundred thousand years to cross our stellar system—the Galaxy—then by ‘its own clock’ the proton needs only five minutes to cover the same distance (Novikov 1998, p. 59).
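The figures in this quotation can be checked with simple arithmetic. The Python sketch below takes the dilation factor of ten billion and the crossing time of a hundred thousand years from the quotation; the year length of 365.25 days and the large-γ approximation 1 − v/c ≈ 1/(2γ²) are assumptions of the sketch.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

gamma = 1e10                 # dilation factor quoted by Novikov
galaxy_crossing_years = 1e5  # crossing time by our clocks

# Proper time for the proton = coordinate time / gamma, converted to minutes.
proton_minutes = galaxy_crossing_years / gamma * SECONDS_PER_YEAR / 60

# For very large gamma, the fractional shortfall from light speed is about 1 / (2 gamma^2).
speed_deficit = 1 / (2 * gamma ** 2)

print(proton_minutes)
print(speed_deficit)
```

The proton's own travel time comes out at a little over five minutes, and the fractional shortfall from the speed of light is of order 10⁻²¹, in rough agreement with the quotation.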

16. How Does Gravity Affect Time?

According to the general theory of relativity, gravitational differences affect time by dilating it—in the sense that observers in a less intense gravitational potential field find that clocks in a more intense gravitational potential field run slow relative to their own clocks. It’s as if the time of the clock in the intense gravitational field is stretched out and not ticking fast enough. For this reason, people in ground floor apartments outlive their twins in penthouses, all other things being equal. Basement flashlights will be shifted toward the red end of the visible spectrum compared to the flashlights in attics. All these phenomena are the effects of gravitational time dilation.
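For small height differences near Earth's surface, the size of the effect can be estimated with the weak-field approximation, in which the fractional difference in clock rates is about gh/c². The following Python sketch uses that approximation; the 100-meter penthouse height and the 80-year lifetime are made-up illustrative values.

```python
C = 299_792_458.0  # speed of light in m/s
G_SURFACE = 9.81   # gravitational acceleration near Earth's surface in m/s^2


def fractional_rate_difference(height_m, g=G_SURFACE, c=C):
    """Approximate fractional speed-up of a clock raised by height_m (weak field)."""
    return g * height_m / c ** 2


def extra_seconds(height_m, elapsed_seconds):
    """Extra time accumulated by the higher clock over elapsed_seconds."""
    return fractional_rate_difference(height_m) * elapsed_seconds


one_centimeter = fractional_rate_difference(0.01)
lifetime = 80 * 365.25 * 24 * 3600          # 80 years in seconds
penthouse_gain = extra_seconds(100.0, lifetime)

print(one_centimeter)   # roughly 1e-18
print(penthouse_gain)   # tens of microseconds
```

On these assumptions, the penthouse twin ages only tens of microseconds more over a lifetime, which is why the effect goes unnoticed in daily life.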

Spacetime in the presence of gravity is curved, according to general relativity. So, time is curved, too. When time curves, clocks do not bend in space as if in a Salvador Dali painting. Instead they undergo gravitational time dilation.

Information from the Global Positioning System (GPS) of satellites orbiting Earth is used by your cell phone to tell you whether you should turn right at the next intersection. The GPS is basically a group of flying atomic clocks that broadcast the time. The curvature of spacetime near Earth is significant enough that gravitational time dilation must be accounted for by these clocks to keep us from making navigation errors. Compared to clocks on the ground, GPS clocks speed up because they are in a weaker gravitational field, and they slow down because of time dilation due to their high speed. The gravitational effect is the larger of the two, and the combined effect is that the GPS clocks gain about 38 microseconds per day. Therefore, these GPS satellites are launched with their clocks adjusted to tick slightly slower than Earth clocks, and then the clocks are periodically readjusted so that they stay synchronized with Earth’s standard time. The smaller the error in the atomic clock in the satellite and in the standard clock, the better the GPS works. That is one reason physicists keep trying to build better clocks. In 2018, gravitational time dilation was measured in Boulder, Colorado, U.S.A. so carefully that it detected the difference in ticking of two initially synchronized atomic clocks that differed in height by only a centimeter. Unfortunately, the actual atomic clocks used in GPS satellites are much less accurate than these atomic clocks.
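As a rough check on the size of the two competing effects on a GPS clock, here is a Python sketch based on standard weak-field approximations. It assumes a circular orbit at a radius of about 26,571 km, a ground clock at Earth's mean radius, and it ignores Earth's rotation and orbital eccentricity; the constant and function names are chosen for the example.

```python
GM_EARTH = 3.986004418e14  # Earth's gravitational parameter in m^3/s^2
C = 299_792_458.0          # speed of light in m/s
R_GROUND = 6.371e6         # mean Earth radius in m (assumed ground clock position)
R_ORBIT = 2.6571e7         # assumed GPS orbital radius in m (about 20,200 km altitude)
SECONDS_PER_DAY = 86_400.0


def gps_daily_offsets():
    """Return (gravitational, velocity, net) satellite clock offsets in seconds per day."""
    # Higher gravitational potential: the satellite clock runs fast (positive offset).
    gravitational = GM_EARTH / C ** 2 * (1 / R_GROUND - 1 / R_ORBIT) * SECONDS_PER_DAY
    # Circular orbit: v^2 = GM / r. Motion makes the satellite clock run slow.
    orbital_speed_sq = GM_EARTH / R_ORBIT
    velocity = -orbital_speed_sq / (2 * C ** 2) * SECONDS_PER_DAY
    return gravitational, velocity, gravitational + velocity


grav, vel, net = gps_daily_offsets()
print(grav * 1e6, vel * 1e6, net * 1e6)  # microseconds per day
```

Under these assumptions the gravitational term is roughly +46 microseconds per day, the velocity term roughly −7 microseconds per day, and the net roughly +38 microseconds per day.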

When a metaphysician asks the question, “What is gravity?” there are three legitimate, but very different, answers. Gravity is (1) a force, (2) intrinsic curvature of spacetime, and (3) exchanges of virtual particles. All three answers are correct and have their uses. When speaking of spilling milk or designing a rocket to visit the moon, the first answer is most appropriate to use. In the context of general relativity, the second answer is most appropriate. For ontology, the third answer is best.

In the context of a future theory of quantum gravity that incorporates gravity into the best features of quantum mechanics and the best features of the standard model of particle physics, the third answer is expected to be best. At this more fundamental level, forces are features of field activity. Those virtual gravity particles are called gravitons, and they are fluctuations within the gravitational field. What is happening with spilled milk is that pairs of virtual entangled particles bubble up out of the relevant fields. Normally one member of the pair has positive momentum, and the other member has negative momentum. Those particles with negative momentum are exchanged between the milk and the Earth and floor, thereby causing the milk to be attracted to the floor in analogy to how, when someone throws a boomerang beyond you, it can hit you on its way back and push you closer to the thrower.

17. What Happens to Time near a Black Hole?

Time, as measured by an Earth-based clock, slows down for all processes near a black hole. For example, a clock on a spaceship approaching the hole will be slowed down as measured by our standard Earth-based clock. So will people’s actions and thoughts.

What is a black hole? Although Einstein believed a black hole is too strange to actually exist, black holes subsequently have been recognized to be real phenomena existing throughout the universe. Princeton physicist Richard Gott described a black hole as a spherical hotel in which you can check in but cannot check out. Even light sent out from a laser that has fallen in will get dragged back and never escape. That is why the hole is called black. Every massive object has an escape velocity, namely a minimum speed required for a spaceship to fly directly away from the object and escape into outer space. The escape velocity of any black hole is larger than the velocity of light, which is why it is black.
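The remarks about escape velocity and slowed clocks can be made concrete with a small calculation. The following sketch assumes the simplest case, a non-rotating, uncharged (Schwarzschild) black hole, and a clock that hovers at a fixed radius outside the horizon. The function names are hypothetical, and the escape-velocity formula is the Newtonian one, which happens to give the speed of light at the same radius as the relativistic horizon.

```python
import math

G = 6.67430e-11        # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0      # speed of light, m/s
M_SUN = 1.989e30       # approximate mass of the Sun, kg


def schwarzschild_radius(mass_kg):
    """Horizon radius of a non-rotating, uncharged black hole of the given mass."""
    return 2.0 * G * mass_kg / C**2


def escape_velocity(mass_kg, radius_m):
    """Newtonian escape velocity from distance radius_m of a mass mass_kg."""
    return math.sqrt(2.0 * G * mass_kg / radius_m)


def static_clock_rate(mass_kg, radius_m):
    """Ticking rate of a clock hovering at radius_m, relative to a far-away clock.

    Returns a number between 0 and 1; smaller means more time dilation.
    """
    r_s = schwarzschild_radius(mass_kg)
    if radius_m <= r_s:
        raise ValueError("a clock cannot hover at or inside the event horizon")
    return math.sqrt(1.0 - r_s / radius_m)


if __name__ == "__main__":
    r_s = schwarzschild_radius(M_SUN)
    print("horizon radius for one solar mass (m):", r_s)
    print("escape velocity at that radius / c:", escape_velocity(M_SUN, r_s) / C)
    for multiple in (10.0, 2.0, 1.01):
        print("clock rate at", multiple, "horizon radii:",
              static_clock_rate(M_SUN, multiple * r_s))
```

For one solar mass the horizon radius comes out at roughly three kilometers, and the hovering clock's rate drops toward zero as the clock is placed closer and closer to the horizon.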

Black holes are made from material objects getting crushed together, but those material objects get crushed and turned into the energy of spatial curvature making the black hole matter-free and just a strange region of space with no matter in the region. What then keeps the black hole curved after all the matter has vanished? It is the enormous energy of its own warping. That energy is stored within the gravitational field which in turn produces such a high gravitational force that it curves itself until it becomes stable.

A typical black hole is produced by the death of a star whose nuclear fuel has been used up. A healthy star is an explosion. A black hole is an implosion. Matter is a physical object within space and time; a black hole is merely space and time itself. The hole might later contain matter, but this would be due only to new matter falling in after the hole was created. This matter would soon be sucked toward the center, and it, too, would disappear at the singularity and be converted into pure energy. According to Hawking’s original calculation, the information of this matter would be lost when the black hole evaporates, as all black holes do. We on the outside of a black hole know only its mass (including its gravitational activity), rotation, and electric charge. All its other information is hidden from view, as if it has disappeared from the universe. The term “gravitational activity” refers to the fact that black holes can vibrate at the resonant frequencies of their gravitational waves when matter falls into the hole and makes it ring somewhat like a bell having been struck by a hammer.

The center of a black hole is often called its singularity, but strictly speaking the center and the singularity are different. The spatial center is a crushed region of very large spatial curvature. The singularity, on the other hand, is the end of the proper time of any object that plunges into the hole. Nevertheless it is common to casually use the two terms interchangeably.

Most contemporary physicists do not believe there are any singularities of zero volume; they believe those are artifacts of relativity theory, and signs that relativity theory needs to be improved and reconciled with quantum theory. Instead, they believe the “singularity” is very small but not actually infinitesimal, and the curvature there is extremely high but not infinite. They believe this because relativity theory’s singularity is inconsistent with quantum theory’s requirement that no matter can be confined to a point; the confinement would violate Heisenberg’s Uncertainty Principle of quantum theory. On this issue, contemporary physicists trust quantum mechanics more than relativity theory.

Here is a processed photograph of a black hole surrounded by its colorful but dangerous accretion disk that is radiating electromagnetic radiation (mostly high-energy x-rays) due to particles outside the hole crashing into each other as they are gravitationally pulled toward the hole:

[Image: The M87 black hole image produced by the European Southern Observatory]

The red and orange area is the accretion disk outside the hole where incoming particles crash into each other. The black region in the center is produced by the black hole’s blocking the light from its own accretion disk that is behind it. The colors in the picture are artifacts added by a computer because the real light (when shifted from x-ray frequencies to optical frequencies) would be white and because humans can detect differences among colors better than differences in the brightness of white light. It is believed that nearly all black holes spin and spin faster the smaller they are, but even if a black hole is not spinning, its surrounding accretion disk beyond its event horizon will surely be spinning. Because of the spinning, the accretion disk is not spherical, but is pizza-shaped. It can have a temperature of from a thousand to many millions of degrees.

The event horizon is a two-dimensional fluid-like surface separating the inside from the outside of the black hole. Think of the event horizon as a two-dimensional spherical envelope. To plunge across the event horizon is to cross a point of no return. Even light generated inside cannot get back out. So, black holes are relatively dark compared to stars. However, because the accretion disk outside the horizon can eject hot, magnetized gas and shine as a quasar, some supermassive black holes are the most luminous objects in all the universe.

If you were unlucky enough to fall through the event horizon, you could see out continually, but you could not send a signal out, nor could you yourself escape even if your spaceship had an extremely powerful thrust. Also, the space around you increasingly collapses, so you would be sucked toward the singularity and squeezed on your way there—a process called “spaghettification.” Despite your soon being crushed to a tiny volume on your way to the end of time at the singularity, you would continue to affect the world outside the black hole via your contribution to its gravity. Spaghettification is not very significant as you cross the horizon of a very large black hole, but it gets very significant as you get near the singularity.

Any macroscopic object can become a black hole if sufficiently compressed. An object made of anti-matter can become a black hole, too. If you bang two rocks together fast enough, they will produce a black hole, and the black hole will begin pulling in nearby particles, including you the experimenter. Luckily even our best particle colliders in Earth’s laboratories are not powerful enough to create black holes, but the creators of those first laboratories were gambling with the future of humanity when they guessed no black hole would be created.

All massive stars will become black holes when their fuel runs out and they stop radiating so that the star’s gravity takes over and crushes the star. Our Sun is not quite massive enough to do this, but its future ash will eventually find its way into a black hole.

If an electron were a point particle, then it would have an enormous density and be dense enough to become a black hole. That electrons exist around us and do not become black holes is the best reason to believe electrons are not point particles.

The black hole M87 is pictured above. It has a mass of about 6.5 billion of our suns, so it is too big for it to have originated from the collapse of only a single star. It probably has eaten many nearby stars and much cosmic dust. It is not in the Milky Way but in a distant galaxy 55 million light years from Earth. There is another, smaller yet supermassive black hole at the center of the Milky Way. It, too, was probably formed by feeding on neighbor stars and other nearby infalling material. Almost all galaxies have a black hole at their center, but black holes also exist elsewhere. Most black holes are not powerful enough to suck in all the stars around them, just as our sun will never suck in all the planets of our solar system. This is because of the planets’ angular momentum around the Sun. Similarly, almost all the electrons and protons in the accretion disk of a black hole have angular momentum that produces enough centrifugal force to resist the black hole’s gravitational pull.

A black hole’s accretion disk also spins if the hole spins, and because of this the Doppler effect shown in the picture above requires the redness at the top to be less bright than at the bottom of the picture. The picture has been altered to remove the blurriness that would otherwise be present due to the refraction from the plasma and dust between the Earth and the black hole. The plasma close to the black hole has a temperature of hundreds of billions of degrees.

The matter orbiting the black hole is a diffuse gas of electrons and protons. …The black hole pulls that matter from the atmospheres of stars orbiting it. Not that it pulls very much. Sagittarius A* is on a starvation diet—less than 1 percent of the stuff captured by the black hole’s gravity ever makes it to the event horizon. (Seth Fletcher. Scientific American, September 2022 p. 53.)

Relativity theory implies an infalling spaceship suffers an infinite time dilation at the event horizon and so does not fall through the horizon in a finite time. Most physicists believe this implication is incorrect and is a reason that relativity theory needs to be revised. When quantum mechanics is taken into account, it seems that the unfortunate spaceship does fall through in a finite time. This is because the gravitational field produced by the spaceship itself acts on the black hole. As the spaceship gets very, very close to the event horizon such as an atom’s width away, the time dilation does radically increase, but the event horizon slightly expands enough to swallow the spaceship in a finite time—a trivially short time as judged from the spaceship, but a very long time as judged from Earth. This occurrence of slight expansion is one sign that the event horizon is fluidlike. After the spaceship is swallowed, the event horizon returns to its original shape, but it is slightly larger now because it is more massive by one spaceship-mass.

By applying quantum theory to black holes, Stephen Hawking discovered that black holes are not quite black even when the radiation of the accretion disk is not taken into account. Every black hole emits “Hawking radiation” at its horizon. This radiation reduces the mass of the black hole causing it to eventually evaporate. Large black holes take about 10^67 years to completely evaporate. To appreciate how long a black hole lives, remember that the Big Bang occurred less than twenty billion years ago (2 x 10^10 years ago). Because every black hole absorbs the cosmic background radiation, a black hole will not even start evaporating and losing energy until the hole’s absorption of the cosmic background radiation subsides enough that the hole’s immediate environment is below the temperature of the black hole. Quantum theory suggests black holes get warmer as they shrink. Hawking first realized that they do this by absorbing particles on their event horizon that have negative mass (Yes, “negative”). When a black hole shrinks to the size of a bacterium, its outgoing radiation becomes white-colored, producing a white black-hole. At the very last instant of its life, it evaporates as it explodes in a flash of extremely hot, high-energy particles.
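The orders of magnitude mentioned here can be checked with the standard textbook estimates for the Hawking temperature and evaporation time of a non-rotating, uncharged black hole. The sketch below is only a rough illustration. It assumes the black hole radiates photons alone and absorbs nothing, and it takes one solar mass as the example; the function names and the choice of example mass are assumptions made for illustration.

```python
import math

G = 6.67430e-11             # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0           # speed of light, m/s
HBAR = 1.054571817e-34      # reduced Planck constant, J s
K_B = 1.380649e-23          # Boltzmann constant, J/K
M_SUN = 1.989e30            # approximate mass of the Sun, kg
SECONDS_PER_YEAR = 3.156e7  # approximate length of a year, s
T_CMB = 2.725               # approximate temperature of the cosmic background, K


def hawking_temperature(mass_kg):
    """Temperature (kelvin) of a Schwarzschild black hole; hotter when lighter."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)


def evaporation_time_years(mass_kg):
    """Idealized evaporation time, assuming photon emission only and no infall."""
    seconds = 5120.0 * math.pi * G**2 * mass_kg**3 / (HBAR * C**4)
    return seconds / SECONDS_PER_YEAR


def is_currently_shrinking(mass_kg):
    """True only if the hole is hotter than the cosmic background it absorbs."""
    return hawking_temperature(mass_kg) > T_CMB


if __name__ == "__main__":
    print("temperature of a solar-mass hole (K):", hawking_temperature(M_SUN))
    print("evaporation time (years):", evaporation_time_years(M_SUN))
    print("shrinking today?", is_currently_shrinking(M_SUN))
```

With these assumptions a solar-mass black hole is far colder than the cosmic background radiation, so it is not yet shrinking, and its idealized evaporation time is of the order of 10^67 years. Halving the mass doubles the temperature, which is the sense in which black holes get warmer as they shrink.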

Black holes have many other strange properties. When a pair of particles is produced at the event horizon, one of the pair can fall into the hole while the other flies away from the hole. From the perspective of any particle, it always has positive energy, but from the perspective of a distant, external observer, the infalling particle can have negative energy.

Another odd property is that nearly all physical objects tend to get warmer when you shine a light on them. Think of your ice cream cone in the sunshine. A black hole is an exception. It gets colder in the sunshine.

Black holes produce startling visual effects. A light ray can circle outside a black hole once or many times depending upon its angle of incidence to the event horizon. A light ray grazing a black hole can leave at any angle, so a person viewing a black hole from outside can see multiple copies of themselves at various angles. Each copy arrives at a different time. We can see the same galaxy more than once because of this. See http://www.spacetimetravel.org/reiseziel/reiseziel1.html for some of these visual effects. An external viewer can also see the part of the accretion disk that is behind the black hole and would be obscured by the hole itself if it were not for gravitational lensing.

Every spherical black hole has the odd geometric feature that its diameter is very much larger than its circumference, which is very unlike the sphere of Euclidean geometry. The diameter “plunges” into the hole.

Additional mass makes a black hole grow. Some black holes have been detected to be 100 billion times more massive than our Sun.

Some popularizers have said that the roles of time and space are reversed within a black hole, but this is not correct. Instead it is only coordinates that reverse their roles. Given a coordinate system whose origin is outside a black hole, its timelike coordinates become spacelike coordinates inside the horizon. If you were to fall into a black hole, your clock would not begin measuring distance. See (Carroll 2022c), pp. 251-255, for more explanation of this role reversal.

Black Holes and Information

Do black holes destroy information? Stephen Hawking’s calculation was that information is lost to outsiders as an object hits the singularity; it is an irreversible process. The information is lost before any measurements are made so it is a second kind of information loss for quantum theory. By “information” Hawking meant all the details of an initial state at some time before the object plunges through the event horizon. According to Hawking’s calculation, the details are lost to outsiders because the information is destroyed at the singularity. This loss is inconsistent with standard quantum theory which says information is never lost except during measurements. The leading hypothesis in the first quarter of the 21st century is that the information is not lost, and it escapes back out through the Hawking radiation during the evaporation process of the black hole. Unfortunately, this hypothesis cannot be experimentally tested because the Hawking radiation is far too weak to be practically measured, and we still do not have a theory of quantum gravity that says whether information is or isn’t lost at the singularity. “In the absence of both theory and data, physicists produced many hypotheses about what might happen in black hole evaporation. Maybe black holes don’t entirely evaporate, but leave behind remnants. Maybe information can’t ever fall in but remain on the event horizon. Maybe it bounces back at the singularity. Maybe it comes out with the Hawking radiation.” The last of these is the favored hypothesis.

History

In 1783, John Michell speculated that there may be a star with a large enough diameter that the velocity required to escape its gravitational pull would be so great that not even Newton’s particles of light could escape. He called such stars “dark stars.” Einstein invented the general theory of relativity in 1915, and the next year the German physicist Karl Schwarzschild discovered that Einstein’s equations imply that if a non-rotating, perfectly spherical star in an otherwise empty universe were massive enough and its radius were somehow small enough so that it had extremely high density, then it would undergo an unstoppable collapse. Meanwhile, the gravitational force from the object would be so strong that not even light within the hole could escape the inward pull of gravity. In 1935, Arthur Eddington commented upon this discovery that relativity theory allowed a star to collapse this way:

I think there should be a law of nature to stop a star behaving in this absurd way.

Because of Eddington’s prestige, other physicists (with the notable exception of Subrahmanyan Chandrasekhar) agreed. Then in 1939, J. Robert Oppenheimer and his student Hartland Snyder first seriously suggested that some stars would in fact collapse into black holes, and they first clearly described the defining features of a black hole—that “The star thus tends to close itself off from any communication with a distant observer; only its gravitational field persists.” The term “black hole” was first explicitly mentioned by physicist Robert Dicke some time in the early 1960s when he made the casual comparison to a notorious dungeon of the same name in India, the Black Hole of Calcutta. The term was first published in the American magazine Science News Letter in 1964. John Wheeler subsequently promoted use of the term, following a suggestion from one of his students.

Roger Penrose won a Nobel Prize for proving that if you perturb Schwarzschild’s solution by making the black hole not quite spherical, the new hole still must have a singularity.

It is now known that most of the entropy of the universe is within black holes. Any black hole has more entropy than the material from which it was made.

18. What Is the Solution to the Twins Paradox?

The paradox is an argument that uses the theory of relativity to produce an apparent contradiction. Before giving that argument, let’s set up a typical situation that can be used to display the paradox. Consider two twins at rest on Earth with their correct clocks synchronized. One twin climbs into a spaceship, and flies far away at a high, constant speed, then stops, reverses course, and flies back at the same speed. An application of the equations of special relativity theory shows that the twin on the spaceship will return and be younger than the Earth-based twin. Their clocks disagree about the elapsed time of the trip. Now that the situation has been set up, notice that relativity theory implies that either twin could say they are the stationary twin, so there is a contradiction. Isn’t that an implication of relativity theory?

To recap, the paradoxical argument is that either twin could regard the other as the traveler and thus as the one whose time dilates. If the spaceship were considered to be stationary, then would not relativity theory imply that the Earth-based twin could race off (while attached to the Earth) and return to be the younger of the two twins? If so, then when the twins reunite, each is younger than the other. That result is paradoxical.

Herbert Dingle was the President of London’s Royal Astronomical Society in the early 1950s. He famously argued in the 1960s that this twins paradox reveals an inconsistency in special relativity. All scientists and almost all philosophers disagree with Dingle and say the twins paradox is not a true paradox, in the sense of revealing an inconsistency within relativity theory, but is merely a complex puzzle that can be adequately solved within relativity theory.

The twins paradox is an interesting puzzle that has a solution that depends on the fact that in relativity theory two people who take different paths through spacetime take different amounts of time to do this in analogy to how two hikers take different times in hiking from place A to place B if the two take different paths. The puzzle’s solution depends on noticing that the two situations are not sufficiently similar, and because of this, for reasons to be explained in a moment, the twin who stays home on Earth maximizes his or her own time (that is, proper time) and so is always the older twin when the two reunite. This solution to the paradox involves spacetime geometry, and it has nothing to do with an improper choice of the reference frame. Nor is the puzzle explained by remarking that the two twins feel different accelerations on their body, even though that is true. The resolution of the puzzle has to do with the fact that some paths in spacetime must take more proper time to complete than other paths.

Here is how to understand the paradox. Consider the Minkowski spacetime diagram below.

[Image: Minkowski spacetime diagram of the twin paradox]

The principal suggestion for solving the paradox is to note that there must be a difference in the time taken by the twins because their behaviors are different, as shown by the number and spacing of nodes along their two world lines above. The nodes represent ticks of their clocks. Notice how the space traveler’s time is stretched or dilated compared to the coordinate time, which also is the time of the stay-at-home twin. The coordinate time, that is, the time shown by clocks fixed in space in the coordinate system is the same for both travelers. Their personal times are not the same. The traveler’s personal time is less than that of the twin who stays home, and they return home with fewer memories.

For simplicity we are giving the twin in the spaceship an instantaneous initial acceleration and ignoring the enormous gravitational forces this would produce, and we are ignoring the fact that the Earth is not really stationary but moves slowly through space during the trip.

The key idea for resolving the paradox is not that one twin accelerates and the other does not, although this claim is very popular in the literature in philosophy and physics. The key idea is that, during the trip, the traveling twin experiences less time but more space. That fact is shown by how their world lines in spacetime are different. Relativity theory requires that for two paths that begin and end at the same point, the longer the path in spacetime (and thus the longer the world line in the spacetime diagram) the shorter the elapsed proper time along that path. That difference is why the spacing of nodes or clock ticks is so different for the two travelers. This is counterintuitive (because our intuitions falsely suggest that longer paths take more time even if they are spacetime paths). And nobody’s clock is speeding up or slowing down relative to its own rate a bit earlier.

A free-falling clock ticks faster and more often than any other accurate clock that is used to measure the duration between pairs of events. It is so for the event of the twins leaving each other and reuniting. This is illustrated graphically by the fact that the longer world line in the graph represents a greater distance in space and a greater interval in spacetime but a shorter duration along that world line. The number of dots in the line is a correct measure of the time taken by the traveler. The spacing of the dots represents the durations between ticks of a personal clock along that world line. If the spaceship approached the speed of light, that twin would cover an enormous amount of space before the reunion, but that twin’s clock would hardly have ticked at all before the reunion event.

To repeat this solution in other words, the diagram shows how sitting still on Earth is a way of maximizing the trip time, and it shows how flying near light speed in a spaceship away from Earth and then back again is a way of minimizing the time for the trip, even though if you paid attention only to the shape of the world lines in the diagram and not to the dot spacing within them you might mistakenly think just the reverse. This odd feature of the geometry is one reason why Minkowski geometry is different from Euclidean geometry. So, the conclusion of the analysis of the paradox is that its reasoning makes the mistake of supposing that the situation of the two twins can properly be considered to be essentially the same.
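The geometric point can be illustrated numerically. The sketch below adds up the proper time along a world line made of straight segments, using the Minkowski interval in units where the speed of light is 1 (years and light-years). The particular trip, a ten-year absence at 0.8 of light speed, and the function name are assumptions chosen only for the example.

```python
import math


def proper_time(worldline):
    """Elapsed proper time along a piecewise-straight world line.

    worldline is a list of (t, x) events in one frame, with c = 1, so each
    straight segment contributes sqrt(dt^2 - dx^2) of personal clock time.
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(worldline, worldline[1:]):
        dt = t1 - t0
        dx = x1 - x0
        interval_squared = dt * dt - dx * dx
        if dt <= 0 or interval_squared < 0:
            raise ValueError("segment must point to the future and not exceed light speed")
        total += math.sqrt(interval_squared)
    return total


# Both world lines start at the departure event and end at the reunion event.
stay_at_home = [(0.0, 0.0), (10.0, 0.0)]            # sits still for 10 years
traveler = [(0.0, 0.0), (5.0, 4.0), (10.0, 0.0)]    # out and back at 0.8 c

if __name__ == "__main__":
    print("stay-at-home twin ages (years):", proper_time(stay_at_home))
    print("traveling twin ages (years):", proper_time(traveler))
```

In this example the stay-at-home twin ages ten years and the traveler ages six, even though the traveler's world line looks longer when drawn on paper. Moving the turnaround event closer to the light cone, for instance to (5.0, 4.999), shrinks the traveler's elapsed time toward zero.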

Richard Feynman famously, but mistakenly, argued in 1975 that acceleration is the key to the paradox. As (Maudlin 2012) explains, the acceleration that occurs in the paths of the example above is not essential to the paradox because the paradox could be expressed in a spacetime obeying special relativity in which neither twin accelerates yet the twin in the spaceship always returns younger. The paradox can be described using a situation in which spacetime is compactified in the spacelike direction with no intrinsic spacetime curvature, only extrinsic curvature. To explain that remark, imagine this situation: All of Minkowski spacetime is like a very thin, flat cardboard sheet. It is “intrinsically flat.” Then roll it into a cylinder, like the tube you have after using the last paper towel on the roll. Do not stretch, tear, or otherwise deform the sheet. Let the time axis be parallel to the tube length, and let the one-dimensional space axis be a circular cross-section of the tube. The tube spacetime is still flat intrinsically, as required by special relativity, even though now it is curved extrinsically (which is allowed by special relativity). The travelling twin’s spaceship circles the universe at constant velocity, so its spacetime path is a spiral. The stay-at-home twin sits still, so its spacetime path is a straight line along the tube. The two paths start together, separate, and eventually meet (many times). During the time between separation and the first reunion, the spaceship twin travels in a spiral as viewed from a higher dimensional Euclidean space in which the tube is embedded. That twin experiences more space but less time than the stationary twin. Neither twin accelerates. There need be no Earth nor any mass nearby for either twin. Yet the spaceship twin who circles the universe comes back younger because of the spacetime geometry involved, in particular because the twin travels farther in space and less far in time than the stay-at-home twin.
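The cylinder scenario can also be put into numbers. In the sketch below, space is a circle of a given circumference, the stay-at-home twin is at rest in the frame in which that circumference is measured, and the traveling twin moves at constant speed until the two meet again. Units are chosen so that the speed of light is 1; the circumference, the speed, and the function name are hypothetical values picked for illustration.

```python
import math


def first_reunion(circumference, speed):
    """Ages of the two twins at their first reunion in a cylinder universe.

    Returns (stay_at_home_time, traveler_time) with c = 1. Neither twin
    accelerates: the traveler simply goes once around the closed space.
    """
    if not 0.0 < speed < 1.0:
        raise ValueError("speed must be between 0 and 1 (in units of c)")
    stay_at_home_time = circumference / speed
    traveler_time = stay_at_home_time * math.sqrt(1.0 - speed**2)
    return stay_at_home_time, traveler_time


if __name__ == "__main__":
    home, ship = first_reunion(circumference=4.0, speed=0.8)
    print("stay-at-home twin ages (years):", home)
    print("circling twin ages (years):", ship)
```

With a circumference of four light-years and a speed of 0.8, the stay-at-home twin ages about five years between separation and reunion while the circling twin ages about three, with no acceleration anywhere in the story.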

For more discussion of the paradox, see (Maudlin 2012), pp. 77-83, and, for the travel on the cylinder, see pp. 157-8.

19. What Is the Solution to Zeno’s Paradoxes?

See the article “Zeno’s Paradoxes” in this encyclopedia.

20. How Are Coordinates Assigned to Time?

A single point of time is not a number, but it has a number when a coordinate system is applied to time. When coordinate systems are assigned to spaces, coordinates are assigned to points. The space can be physical space or mathematical space. The coordinates hopefully are assigned in a way that a helpful metric can be defined for computing the distances between any pair of point-places, or, in the case of time, the duration between any pair of point-times. Points, including times, cannot be added, subtracted, or squared, but their coordinates can be. Coordinates applied to the space are not physically real; they are tools used by the analyst, the physicist; and they are invented, not discovered. The coordinate system gives each instant a unique name.

Technically, the question, “How do time coordinates get assigned to points in spacetime?” presupposes knowing how we coordinatize the four-dimensional manifold that we call spacetime. The manifold is a collection of points (technically, it is a topological space) which behaves as a Euclidean space in neighborhoods around any point. The focus in this section is on its time coordinates.

There is very good reason for believing that time is one-dimensional, and so, given any three different point events, one of them will happen between the other two. This feature is reflected in the fact that when real number time coordinates are assigned to three point events, one of the three coordinates is between the other two.

Every event on the world-line of the standard clock is assigned a t-coordinate by that special clock. The clock also can be used to provide measures of the duration between two point events that occur along the coordinate line. Each point event along the world-line of the master clock is assigned some t-coordinate by that clock. For example, if some event e along the time-line of the master clock occurs at the spatial location of the clock while the master clock shows, say, t = 4 seconds, then the time coordinate of the event e is declared to be 4 seconds. That is, t(e) = 4. We assume that e occurs spatially at an infinitesimal distance from the master clock, and that we have no difficulty in telling when this situation occurs. So, even though determinations of distant simultaneity are somewhat difficult to compute, determinations of local simultaneity in the coordinate system are not. In this way, every event along the master clock’s time-line is assigned a time of occurrence in the coordinate system.

In order to extend the t-coordinate to events that do not occur where the standard clock is located, we can imagine having a stationary, calibrated, and synchronized clock at every other point in the space part of spacetime at t = 0, and we can imagine using those clocks to tell the time along their world lines. In practice we do not have so many accurate clocks, so the details for assigning time to these events are fairly complicated, and they are not discussed here. The main philosophical issue is whether simultaneity may be defined for anywhere in the universe. The sub-issues involve the relativity of simultaneity and the conventionality of simultaneity. Both issues are discussed in other sections of this supplement.

Isaac Newton conceived of points of space and time as absolute in the sense that they retained their identity over time. Modern physicists do not have that conception of points; points are identified relative to events, for example, the halfway point in space between this object and that object, and ten seconds after that point-event.

In the late 16th century, the Italian mathematician Rafael Bombelli interpreted real numbers as lengths on a line and interpreted addition, subtraction, multiplication, and division as “movements” along the line. His work eventually led to our assigning real numbers to instants. Subsequently, physicists have found no reason to use complex numbers or other exotic numbers for this purpose, although some physicists believe that the future theory of quantum gravity might show that discrete numbers such as integers will suffice and the exotically structured real numbers will no longer be required.

To assign numbers to instants (the numbers being the time coordinates or dates), we use a system of clocks and some calculations, and the procedure is rather complicated the deeper one probes. For some of the details, the reader is referred to (Maudlin 2012), pp. 87-105. On pp. 88-89, Maudlin says:

Every event on the world-line of the master clock will be assigned a t-coordinate by the clock. Extending the t-coordinate to events off the trajectory of the master clock requires making use of…a collection of co-moving clocks. Intuitively, two clocks are co-moving if they are both on inertial trajectories and are neither approaching each other nor receding from each other. …An observer situated at the master clock can identify a co-moving inertial clock by radar ranging. That is, the observer sends out light rays from the master clock and then notes how long it takes (according to the master clock) for the light rays to be reflected off the target clock and return. …If the target clock is co-moving, the round-trip time for the light will always be the same. …[W]e must calibrate and synchronize the co-moving clocks.
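The radar-ranging test in this passage can be sketched in a few lines. The code below checks whether a target clock is co-moving by seeing whether repeated round-trip times stay the same, and it then assigns a time coordinate and a distance to the reflection event. The midpoint rule used for the time coordinate is the standard (Einstein) synchronization convention, which is an assumption brought in for the example and is not spelled out in the quoted passage; the function names and the tolerance are hypothetical.

```python
C = 299_792_458.0  # speed of light, m/s


def is_co_moving(round_trip_times, tolerance=1e-12):
    """True if every radar round trip took the same time on the master clock."""
    return max(round_trip_times) - min(round_trip_times) <= tolerance


def radar_coordinates(t_sent, t_received):
    """Assign (t, distance) to the event at which the light ray was reflected.

    Both arguments are master-clock readings in seconds. The reflection is
    dated halfway between sending and receiving (the assumed convention),
    and the distance is half the round trip at light speed.
    """
    if t_received < t_sent:
        raise ValueError("the echo cannot return before the signal is sent")
    t_event = (t_sent + t_received) / 2.0
    distance = C * (t_received - t_sent) / 2.0
    return t_event, distance


if __name__ == "__main__":
    # Three pings to the same target, each taking 2 seconds out and back.
    print("co-moving?", is_co_moving([2.0, 2.0, 2.0]))
    print("reflection event (t, distance):", radar_coordinates(10.0, 12.0))
```

Once the reflection event has been dated this way, the target clock can be synchronized by setting it to read that time at the moment the ray arrives.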

The master clock is the standard clock. Co-moving inertial clocks do not generally exist according to general relativity, so the issue of how to assign time coordinates is complicated in the real world. What follows are a few more comments about the assignment.

The main point of having a time coordinate is to get agreement from others about which values of times to use for which events, namely which time coordinates to use. Relativity theory implies every person and even every object has its own proper time, which is the time of the clock accompanying it. Unfortunately, these personal clocks do not usually stay in synchrony with other well-functioning clocks, although Isaac Newton falsely believed they do stay in synchrony. According to relativity theory, if you were to synchronize two perfectly-performing clocks and give one of them a speed relative to the other, then the two clocks’ readings must differ (as would be obvious if they reunited), so once you’ve moved a clock away from the standard clock you can no longer trust the clock to report the correct coordinate time at its new location.

The process of assigning time coordinates assumes that the structure of the set of instantaneous events is the same as, or is embeddable within, the structure of our time numbers. Showing that this is so is called solving the representation problem for our theory of time measurement. The problem has been solved. This article does not go into detail on how to solve this problem, but the main idea is that the assignment of coordinates should reflect the structure of the space of instantaneous times, namely its geometrical structure, which includes its topological structure, diffeomorphic structure, affine structure, and metrical structure. It turns out that the geometrical structure of our time numbers is well represented by the structure of the real numbers.

The features that a space has without its points being assigned any coordinates whatsoever are its topological features, its differential structures, and its affine structures. The topological features include its dimensionality, whether it goes on forever or has a boundary, and how many points there are. The mathematician will be a bit more precise and say the topological structure tells us which subsets of points form the open sets, the sets that have no boundaries within them. The affine structure is about which lines are straight and which are curved. The diffeomorphic structure distinguishes smooth from bent (having no derivative).

If the space has a certain geometry, then the procedure of assigning numbers to time must reflect this geometry. For example, if event A occurs before event B, then the time coordinate of event A, namely t(A), must be less than t(B). If event B occurs after event A but before event C, then we should assign coordinates so that t(A) < t(B) < t(C).
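The ordering requirement can be checked mechanically. The following is a minimal sketch in Python, assuming a hypothetical list of event names already arranged in their happens-before order and a hypothetical dictionary of candidate time coordinates; the names respects_order, good, and bad are illustrative only.

```python
def respects_order(events_in_temporal_order, t):
    """Return True if every earlier event gets a strictly smaller coordinate."""
    coords = [t[e] for e in events_in_temporal_order]
    # Compare each coordinate with the one assigned to the next event.
    return all(a < b for a, b in zip(coords, coords[1:]))

# Two hypothetical coordinate assignments for events A, B, C (A before B before C).
good = {"A": 0.0, "B": 2.5, "C": 7.0}
bad = {"A": 0.0, "B": 9.0, "C": 7.0}   # B is dated after C, violating the order

print(respects_order(["A", "B", "C"], good))
print(respects_order(["A", "B", "C"], bad))
```

Only the order of the numbers is tested here; the check says nothing about whether the differences between coordinates reflect durations.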

Consider a space as a class of fundamental entities: points. The class of points has “structure” imposed upon it, constituting it as a geometry—say the full structure of space as described by Euclidean geometry. [By assigning coordinates] we associate another class of entities with the class of points, for example a class of ordered n-tuples of real numbers [for a n-dimensional space], and by means of this “mapping” associate structural features of the space described by the geometry with structural features generated by the relations that may hold among the new class of entities—say functional relations among the reals. We can then study the geometry by studying, instead, the structure of the new associated system [of coordinates]. (Sklar 1976, p. 28)

But we always have to worry that there is structure among the numbers that is not among the entities numbered. Such structures are “mathematical artifacts.”

The goal in assigning coordinates to a space is to create a reference system; this is a reference frame plus (or that includes [the literature is ambiguous on this point]) a coordinate system. For 4D spacetime obeying special relativity with its Lorentzian geometry, a Lorentzian coordinate system is a grid of smooth timelike and spacelike curves on the spacetime that assigns to each point three space-coordinate numbers and one time-coordinate number. No two distinct points of the spacetime can have the same set of four coordinate numbers. Technically, being continuous is a weaker requirement than being smooth, but the difference is not of concern here.

As we get more global, we have to make adjustments. Consider two coordinate systems in adjacent regions. For the adjacent regions, we make sure that the ‘edges’ of the two coordinate systems match up in the sense that each point near the intersection of the two coordinate systems gets a unique set of four coordinates and that nearby points get nearby coordinate numbers. The result is an atlas on spacetime. Inertial frames can have global coordinate systems, but in general, we have to use atlases for other frames. If we are working with general relativity where spacetime can curve and we cannot assume inertial frames, then the best we can do without atlases is to assign a coordinate system to a small region of spacetime where the laws of special relativity hold to a good approximation. General relativity requires special relativity to hold locally, that is, in any infinitesimal region, and thus for space to be Euclidean locally. That means that locally the 3-d space is correctly described by 3-d Euclidean solid geometry. Adding time is a complication. Spacetime is not Euclidean in relativity theory. Infinitesimally, it is Minkowskian or Lorentzian.

Regarding any event in the atlas, we demand that nearby events get nearby coordinates. When this feature holds everywhere, the coordinate assignment is said to be monotonic or to “obey the continuity requirement.” We satisfy this requirement by using real numbers as time coordinates.

The metric of spacetime in general relativity is not global but varies from place to place due to the presence of matter and gravitation, and it varies over time as the spatial distribution of matter and energy varies with time. So, spacetime cannot be given its coordinate numbers without our knowing the distribution of matter and energy. That is the principal reason why the assignment of time coordinates to times is so complicated.

To approach the question of the assignment of coordinates to spacetime points more philosophically, consider this challenging remark:

Minkowski, Einstein, and Weyl invite us to take a microscope and look, as it were, for little featureless grains of sand, which, closely packed, make up space-time. But Leibniz and Mach suggest that if we want to get a true idea of what a point of space-time is like we should look outward at the universe, not inward into some supposed amorphous treacle called the space-time manifold. The complete notion of a point of space-time in fact consists of the appearance of the entire universe as seen from that point. Copernicus did not convince people that the Earth was moving by getting them to examine the Earth but rather the heavens. Similarly, the reality of different points of space-time rests ultimately on the existence of different (coherently related) viewpoints of the universe as a whole. Modern theoretical physics will have us believe the points of space are uniform and featureless; in reality, they are incredibly varied, as varied as the universe itself.
—From “Relational Concepts of Space and Time” by Julian B. Barbour, The British Journal for the Philosophy of Science, Vol. 33, No. 3 (Sep., 1982), p. 265.

For a sophisticated and philosophically-oriented approach to assigning time coordinates to times, see Philosophy of Physics: Space and Time by Tim Maudlin, pp. 24-34.

21. How Do Dates Get Assigned to Actual Events?

The following discussion presupposes the discussion in the previous section.

Our purpose in choosing a coordinate system or atlas is to express time-order relationships (Did this event occur between those two or before them or after them?) and magnitude-duration relationships (How long after A did B occur?) and date-time relationships (When did event A itself occur?). The date of a (point) event is the time coordinate number of the spacetime coordinate of the event. We expect all these assignments of dates to events to satisfy the requirement that event A happens before event B iff t(A) < t(B), where t(A) is the time coordinate of A, namely its date. The assignments of dates to events also must satisfy the demands of our physical theories, and in this case we face serious problems involving inconsistency if a geologist gives one date for the birth of Earth, an astronomer gives a different date, and a theologian gives yet another date.

Ideally for any reference frame, we would like to partition the set of all actual events into simultaneity equivalence classes by some reliable method. All events in one equivalence class happen at the same time in the frame, and every event is in some class or other.

This cannot be done, but it is interesting to know how close we can come to doing it and how we would go about doing it. We would like to be able to say what event near our spaceship circling Mars (or the supergiant star Betelgeuse) is happening now (at the same time as our now where we are located). More generally, how do we determine whether a nearby event and a very distant event occurred simultaneously? Here we face the problem of the relativity of simultaneity and the problem of the conventionality of simultaneity.

How do we calibrate and synchronize our own clock with the standard clock? Let’s design a coordinate system for time. Suppose we have already assigned a date of zero to the event that we choose to be at the origin of our coordinate system. To assign dates (that is, time coordinates) to other events, we must have access to information from the standard clock, our master clock, and be able to use this information to declare correctly that the time intervals between any two consecutive ticks of our own clock are the same. The second is our conventional unit of time measurement, and it is defined to be the duration required for a specific number of ticks of the standard clock.

We then hope to synchronize other clocks with the standard clock so the clocks show equal readings at the same time. We cannot do this. What are the obstacles? The time or date at which a point-event occurs is the number reading on the clock at rest there. If there is no clock there, the assignment process is more complicated. One could transport a synchronized clock to that place, but any clock speed or influence by a gravitational field during the transport will need to be compensated for. If the place is across the galaxy, then any transport is out of the question, and other means must be used.

We want to use clocks to assign a time coordinate even to very distant events, not just to events in the immediate vicinity of the clock. As has been emphasized several times throughout this rambling article, the major difficulty is that two nearby synchronized clocks, namely clocks that have been calibrated and set to show the same time when they are next to each other, will not in general stay synchronized if one is transported somewhere else. If they undergo the same motions and gravitational influences, and thus have the same world line or time line, then they will stay synchronized; otherwise, they will not. There is no privileged transportation process that we can appeal to. Einstein offered a solution to this problem.

He suggested the following method. Assume in principle that we have stationary, ideal clocks located anywhere and we have timekeepers there who keep records and adjust clocks. Assume there is an ideal clock infinitesimally near the spaceship. Being stationary in the coordinate system implies it co-moves with respect to the master clock back in London. We need to establish that the two clocks remain the same distance apart, so how could we determine that they are stationary? We determine that, each time we send a light signal from London and bounce it off the distant clock, the roundtrip travel time remains constant. That procedure also can be used to synchronize the two clocks, or at least it can in a world that obeys special relativity, provided we know how far away the distant clock is. For example, the spaceship is known to be a distance d away from London. The roundtrip travel time is, say 2t seconds. When someone at the spaceship receives a signal from London saying it is noon, the person at the spaceship sets their clock to t seconds after noon. This is an ideal method of establishing simultaneity for distant events.
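The two steps of this procedure, checking that the roundtrip time stays constant and then setting the distant clock to the emission time plus half the roundtrip, can be sketched in Python. This is only an illustration under the idealized special-relativistic assumptions just described, including the conventional assumption that light takes equally long in both directions; the function names, the tolerance, and the sample numbers are hypothetical.

```python
from datetime import datetime, timedelta

def is_comoving(round_trip_seconds, tolerance=1e-9):
    """Treat the target clock as co-moving if repeated radar round trips agree."""
    return max(round_trip_seconds) - min(round_trip_seconds) <= tolerance

def distant_clock_setting(emission_time, round_trip_seconds):
    """Reading the distant clock should show when the signal arrives.

    Assumes the outbound and return legs take equal time, so the one-way
    travel time t is half of the measured roundtrip time 2t.
    """
    return emission_time + timedelta(seconds=round_trip_seconds / 2)

# Hypothetical numbers: the signal leaves London at noon and the measured
# roundtrip time is 1200 seconds, so t = 600 seconds.
noon = datetime(2000, 1, 1, 12, 0, 0)
print(is_comoving([1200.0, 1200.0, 1200.0]))
print(distant_clock_setting(noon, 1200.0))
```

With these sample values the distant timekeeper would set the clock to ten minutes past noon on receiving the signal.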

This method has some hidden assumptions that have not been mentioned. For more about this and about how to assign dates to distant events, see the discussions of the relativity of simultaneity and the conventionality of simultaneity.

As a practical matter, dates are assigned to events in a wide variety of ways. The date of the birth of the Sun is assigned very differently from dates assigned to two successive crests of a light wave in a laboratory laser. For example, there are lasers whose successive crests of visible light waves pass by a given location in the laboratory every 10^-15 seconds. This short time is not measured with a stopwatch. It is computed from measurements of the light’s wavelength. We rely on electromagnetic theory for the equation connecting the periodic time of the wave to its wavelength and speed. Dates for other kinds of events, such as the birth of Mohammad or the origin of the Sun, are computed from historical records rather than directly measured with a clock.
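The relation in question is that the period of a wave equals its wavelength divided by its speed. As a rough illustration in Python, assuming a hypothetical green laser with a wavelength of 500 nanometers in vacuum (a value chosen for the example, not taken from any particular laboratory):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact by definition of the meter

def period_seconds(wavelength_m, speed_m_per_s=SPEED_OF_LIGHT_M_PER_S):
    """Time between successive crests passing a fixed point."""
    return wavelength_m / speed_m_per_s

# Assumed wavelength for illustration: 500 nm (green visible light).
period = period_seconds(500e-9)
print(f"{period:.3e} seconds between crests")
```

The result is on the order of 10^-15 seconds, which is why such a duration is computed rather than timed directly.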

22. What Is Essential to Being a Clock?

“Clock” is not a precisely defined term in science, but normally a clock is used for one of three purposes: to tell what time it is, to determine which of two events happened first, and to decide how long an event lasts. In order to do this, the clock needs at least two sub-systems, (1) fairly regular ticking and (2) the counting of those ticks.

Regarding (1), the goal in building the ticking sub-system is to have a tick rate that is stable for a long enough time to be useful. Stability implies it is regular in the sense of not drifting very much over time. The tick rate in clocks that use cyclic processes is called the clock’s “frequency,” and it is measured in cycles per second. So, a clock is intended to have a sufficiently stable frequency for a reasonably continuous period of time. If a clock is stable for only half a second, but you intend to use it to measure events that last about a minute, then it is not stable enough.

Regarding (2), the counting sub-system counts the ticks in order to measure how much time has elapsed between two events of interest, and to calculate what time it is, and to display the result. For one example, in a clock based upon Earth rotations, a tick might be the sun setting below the horizon (as viewed from an agreed upon location), with the count advancing one day for each observed tick.

All other things being equal, the higher the frequency of our best clocks the better. Earth rotations have a low frequency. Pendulums are better. With a quartz clock (used in all our computers and cellphones), a piece of quartz crystal is stimulated with a voltage in order to cause it to vibrate at its characteristic frequency, usually 32,768 cycles per second. So, when 32,768 ticks occur, the quartz clock advances its count by one second. Our civilization’s standard atomic clock ticks at a frequency of 9,192,631,770 ticks per second. After that many ticks, it advances its count by one second.

The longer a clock can tick without gaining or losing a second, the more useful it is for testing whether universal constants vary or drift.

The best atomic clocks are predicted to be off by less than a second in 31 billion years. Nuclear clocks of the future will depend on transitions between energy levels of an atom’s nucleus instead of its electrons. Nuclear clocks are predicted to be off by only a second in 300 billion years.

Expressed a bit technically in the language of relativity theory, what a clock does is measure its own “proper” time along its trajectory in spacetime. Commenting on this, the philosopher Tim Maudlin said:

An ideal clock is some observable physical device by means of which numbers can be assigned to events on the device’s world-line, such that the ratios of differences in the numbers are proportional to the ratios of interval lengths of segments of the world-line that have those events as endpoints.

So, for example, if an ideal clock somehow assigns the numbers 4, 6, and 10 to events p, q, and r on its world-line, then the ratio of the length of the segment pq to the segment qr is 1:2, and so on. (Maudlin 2012, 108).

An object’s world-line is its trajectory through spacetime.
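Maudlin’s example can be worked through in a few lines of Python. The sketch below is only an illustration; the function name segment_ratio is hypothetical. It also shows that rescaling or shifting the clock’s numbers (here, doubling them and adding 3) leaves the ratio unchanged, which is why only ratios of differences matter in the definition.

```python
from fractions import Fraction

def segment_ratio(p, q, r):
    """Ratio of the interval length of segment pq to that of segment qr,
    as implied by an ideal clock's readings at events p, q, and r."""
    return Fraction(q - p, r - q)

# Readings from the example: 4, 6, and 10 at events p, q, and r.
print(segment_ratio(4, 6, 10))

# The same events read by a clock using different units and a different zero.
rescaled = [2 * x + 3 for x in (4, 6, 10)]
print(segment_ratio(*rescaled))
```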

A clock’s ticking needs to be a regular process but not necessarily a repeatable process. There are two very different ways to achieve a clock’s regular ticking. The most important way is by repetition, namely by cyclic behavior. The most important goal is that any one cycle lasts just as long as any other cycle. This implies the durations between any pair of ticks are congruent. This point is sometimes expressed by saying the clock’s frequency should be constant.

A second way for a clock to contain a regular process or stable ticking is very different, and it does not require there to be any cycles or repeatable process. A burning candle can be the heart of a clock in which duration is directly correlated with, and measured by, how short the candle has become since the burning began. Two ideal candles will regularly burn down the same distance over the same duration. There will be a regular rate of burning, but no cyclic, repeatable burning because, once some part of the candle has burned, it no longer exists to be burned again. This candle timer is analogous to the behavior of sub-atomic ‘clocks’ based on radioactive decay that are used for carbon dating of ancient trees and mammoths.
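A decay-based timer of this non-cyclic kind can be sketched as follows. The sketch assumes the standard exponential decay law and a carbon-14 half-life of about 5,730 years; the function name and the sample fraction are hypothetical, and real carbon dating involves calibration steps that are ignored here.

```python
import math

C14_HALF_LIFE_YEARS = 5730.0  # approximate half-life of carbon-14

def elapsed_years(remaining_fraction, half_life=C14_HALF_LIFE_YEARS):
    """Elapsed time inferred from the fraction of the original isotope left.

    Uses N/N0 = (1/2) ** (t / half_life), solved for t.
    """
    if not 0 < remaining_fraction <= 1:
        raise ValueError("remaining_fraction must be in (0, 1]")
    return half_life * math.log2(1 / remaining_fraction)

# A sample with one quarter of its original carbon-14 is two half-lives old.
print(elapsed_years(0.25))
```

As with the candle, the quantity read off is how much has been used up, not a count of repeated cycles.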

A daily calendar alone is not a clock unless it is connected to a regular process. It could be part of a clock in which daily progress along the calendar is measured by a process that regularly takes a day per cycle, such as the process of sunrise followed by sunset. A pendulum alone is not a clock because it has no counting mechanism. Your circadian rhythm is often called your biological clock, because it produces a regular cycle of waking and sleeping, but it is not a complete clock because there is no counting of the completed cycles. A stopwatch is not a clock. It is designed to display only the duration between when it is turned on and turned off. But it could easily be converted into a clock by adding a counting and reporting mechanism. The same goes for radioactive decay, which measures the time interval between now and when a fossilized organism last absorbed Earth’s air.

Here are some examples of cyclical processes that are useful for clocks: the swings of a pendulum, repeated sunrises, cycles of a shadow on a sundial, revolutions of the Earth around the Sun, bouncing mechanical springs, and vibrations of a quartz crystal. Regularity of the repetitive process is essential because we want a second today to be equal to a second tomorrow, although as a practical matter we have to accept some margin of error or frequency drift. Note that all these repetitive processes for clocks are absolute physical quantities in the sense that they do not depend upon assigning any coordinate system, nor are they dependent on any process occurring in a living being, including any thought.

The larger enterprise of practical time-keeping for our civilization requires that clock readings be available at locations of interest, including onboard our spaceships and inside submarines. This availability can be accomplished in various ways. A standard clock sitting in a room in Paris is a practical standard only if either its times can be broadcast quickly to the desired distant location, or the clock can be copied and calibrated so that the copies stay adequately synchronized even though they are transported to different places. If the copies cannot always stay sufficiently synchronized (calibrated) with the standard clock back in Paris, then we need to know how we can compensate for this deviation from synchrony.

The count of a clock’s ticks is normally converted and displayed in seconds or in some other unit of time such as minutes, nanoseconds, hours, or years. This counting of ticks can be difficult. Our civilization’s 1964 standard clock ticks 9,192,631,770 times per second. Nobody sat down for a second and counted this number. An indirect procedure is required.

It is an arbitrary convention that we design clocks to count up to higher numbers rather than down to lower numbers. It is also a convention that we re-set our clock by one hour as we move across a time-zone on the Earth’s surface in order that the sun be nearly overhead at noons in those zones. In order to prevent noon from ever occurring when the sun is setting, we also add leap seconds. However, it is no convention that the duration from instantaneous event A to instantaneous event B plus the duration from B to instantaneous event C is equal to the duration from A to C. It is one of the objective characteristics of time, and failure for this to work out numerically for your clock is a sure sign your clock is faulty.

A clock’s ticking needs to be a practically irreversible process. Any clock must use entropy increase in quantifying time. Some entropy must be created to ensure that the clock ticks forward and does not suffer a fluctuation that causes an occasional tick backward. The more entropy produced the less likely such an unwanted fluctuation will occur.

In addition to our clocks being regular and precise, we also desire our clocks to be accurate. What that means and implies is discussed in the next section.

23. What Does It Mean for a Clock to Be Accurate?

A group of clock readings is very precise if the readings are very close to each other even if they all are wildly inaccurate because they all report that it is 12:38 when actually it is noon.
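The contrast between precision and accuracy can be made concrete with a small Python sketch. The readings below are hypothetical and are expressed as minutes past noon; spread is measured here by the population standard deviation, which is one reasonable choice among several.

```python
import statistics

def spread(readings):
    """Precision: how tightly the readings cluster around their own mean."""
    return statistics.pstdev(readings)

def mean_error(readings, true_value):
    """Accuracy: how far the average reading is from the true value."""
    return statistics.mean(readings) - true_value

# Hypothetical readings, in minutes past noon, taken when it is actually noon.
readings = [38.0, 38.1, 37.9]
true_time = 0.0

print(spread(readings))                 # small: the readings agree closely
print(mean_error(readings, true_time))  # large: they are all about 38 minutes off
```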

A clock is accurate if it reports the same time as the standard clock. A properly working clock correctly measures the interval along its own trajectory in spacetime, its so-called proper time. The interval in spacetime is the spatio-temporal length of its trajectory, so a clock is analogous to an odometer for spacetime. Just as a car’s odometer can give a different reading for the distance between two locations if the car takes a different route between two locations, so also a properly working clock can give different measures of the duration of time between two events if the clock takes different spacetime trajectories between them. That is why it is easiest to keep two clocks in synchrony if they are sitting next to each other, and that is why it is easiest to get an accurate measure of the time between two events if they occur at the same place.

Because clocks are intended to be used to measure events external to themselves, a goal in clock building is to ensure there is no difficulty in telling which clock tick is simultaneous with which external event. For most nearby situations and nearby clocks and everyday purposes, the sound made by the ticking helps us make this determination. We hear the tick just as we hear or see the brief event occur that we wish to “time.” Humans actually react faster to what they hear than what they see. Trusting what we see or hear presupposes that we can ignore the difference in time between when a sound reaches our ears and when it is consciously recognized in our brain, and it presupposes that we can safely ignore the difference between the speed of sound and the speed of light.

If a clock is synchronized with the standard clock and works properly and has the same trajectory in spacetime as the standard clock, then it will remain accurate (that is, stay in synchrony) with the standard clock. According to the general theory of relativity, if a clock takes a different trajectory from the standard clock, then its readings will deviate from those of the standard clock, and when the second clock is brought back to be adjacent to the standard clock, the two will give different readings of what time it is. That is, if your well-functioning clock were at rest adjacent to the standard clock, and the two were synchronized, then they would stay synchronized. But if your clock moved away from the standard clock and took some different path through space, then the two would not give the same readings when they were reunited, even though both continued to be correct clocks. This complicates the question of whether a clock that is distant from the standard clock is telling us standard time. To appreciate the complication, ask yourself the question: When our standard clock shows noon today, what event within a spaceship on Mars occurs simultaneously? Or ask the question: How do you “set” the correct time on the Mars clock?

The best that a designated clock can do while obeying the laws of general relativity is to accurately measure its own proper time. Time dilation will affect the readings of all other clocks and make them drift out of synchrony with the designated clock. It is up to external observers to keep track of these deviations and account for them for the purpose at hand.

There is an underlying philosophical problem and a psychological problem. If we assign a coordinate system to spacetime, and somehow operationally define what it is for a clock at one place to be in synch with a clock at another place, then we can define distant simultaneity in that coordinate system. However, whether spatiotemporally separated clocks are simultaneous is a coordinate-dependent artifact. Even when people understand this philosophical point that arises because of the truth of the general theory of relativity, they still seem unable to resist the temptation to require a correct answer to the question “What event on a spaceship circling Mars is simultaneous with noon today here on Earth” and unable to appreciate that this notion of simultaneity is a convention that exists simply for human convenience.

The quartz clock in your cellphone drifts and loses about a second every day or two, so it frequently needs to be “reset” (that is, restored to synchrony with our society’s standard clock).

Our best atomic clocks need to be reset by one second every 100 million years.

Suppose we ask the question, “Can the time shown on a properly functioning standard clock ever be inaccurate?” The answer is “no” if the target is synchrony with the current standard clock, as the conventionalists believe, but “yes” if there is another target. Objectivists can propose at least three other distinct targets: (1) synchrony with absolute time (as Isaac Newton proposed in the 17th century), (2) synchrony with the best possible clock, and (3) synchrony with the best-known clock. We do not have a way of knowing whether our current standard clock is close to target 1 or target 2. But if the best-known clock is known not yet to have been chosen to be the standard clock, then the current standard clock can be inaccurate in sense 3 and perhaps it is time to call an international convention to discuss adopting a new time standard.

Practically, a reading of ‘the’ standard clock is a report of the average value of the many conventionally-designated standard clocks, hundreds of them distributed around the globe. Any one of these clocks could fail to stay in sync with the average, and when this happens it is re-set (that is, re-calibrated, or re-set to the average reading). The re-setting occurs about once a month to restore accuracy.

There is a physical limit to the shortest duration measurable by a given clock because no clock can measure events whose duration is shorter than the time it takes a signal to travel between the components of that clock, the components in the part that generates the regular ticks. This theoretical limit places a lower limit on the margin of error of any measurement of time made with that clock.

Every physical motion of every clock is subject to disturbances. So, we want to minimize the disturbance, and we want our clock to be adjustable in case it drifts out of synchrony a bit. To achieve this goal, it helps to keep the clock isolated from environmental influences such as heat, dust, unusual electromagnetic fields, physical blows (such as dropping the clock), immersion in liquids, and differences in gravitational force. And it helps to be able to predict how much a specific influence affects the drift out of synchrony so that there can be an adjustment for this influence.

Sailors use clocks to discover the longitude of where they are in the ocean. Finding a sufficiently accurate clock was how 18th and 19th century sailors eventually were able to locate themselves when they could not see land. At sea at night, the numerical angle of the North Star above the horizon is their latitude. Without a clock, they had no way to determine their longitude except by dead reckoning, which is very error-prone. A pendulum clock does not work well when the sea is not smooth. If they had an accurate mechanical clock with them that wasn’t affected by choppy seas, they could use it to find their longitude. First, before setting sail they would synchronize it with the standard clock at zero degrees longitude. Out on the ocean or on some island, this clock would tell them the time back at zero degrees longitude. Then at sea on a particular day, the sailors could wait until the Sun was at its highest point and know the local time is 12 noon. If at that moment their clock read 0900 (that is, 9:00 A.M.), then they would know that their local time is 3 hours ahead of the time at zero degrees longitude. Because Earth turns on its axis 360 degrees of longitude every day and 15 degrees every hour, and because the Sun reaches its highest point earlier at places farther east, the sailors could compute that they were 3 x 15 degrees east of zero degrees, namely at 45 degrees east longitude. Knowing both their latitude and longitude, they could use a map to locate themselves. The first reasonably reliable mechanical clock that could be used for measuring longitude at sea was invented by British clockmaker John Harrison in 1727. It was accurate to one second a month. When mariners adopted similarly accurate mechanical clocks, the number of ships per year that crashed into rocks plummeted.
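The arithmetic behind this method is simple enough to sketch in Python. The function below is an illustration only; its name and sign convention (positive for east, negative for west) are assumptions made for the example. It relies on the fact that local solar time runs ahead of the time at zero degrees longitude for places to the east and behind it for places to the west, at 15 degrees per hour.

```python
DEGREES_PER_HOUR = 15.0  # 360 degrees of rotation in 24 hours

def longitude_degrees(local_solar_hours, reference_hours):
    """Longitude from the gap between local solar time and the time shown
    by a clock kept on zero-degrees-longitude time.

    Positive results are east of zero degrees, negative results are west.
    """
    diff = local_solar_hours - reference_hours
    # Wrap the difference into the range [-12, 12) hours.
    diff = (diff + 12) % 24 - 12
    return DEGREES_PER_HOUR * diff

# Local noon while the ship's clock reads 09:00: local time is 3 hours ahead.
print(longitude_degrees(12, 9))    # 45 degrees east
# Local noon while the ship's clock reads 15:00: local time is 3 hours behind.
print(longitude_degrees(12, 15))   # 45 degrees west, shown as a negative number
```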

24. What Is Our Standard Clock or Master Clock?

Our civilization’s standard clock or master clock is the clock that other clocks are synchronized with. It reports ‘the correct time’ because humanity agrees that it does. This standard clock is a designated cesium atomic clock in Paris, France. Your cell phone synchronizes its internal clock with this standard clock about once a week.

More specifically, the standard clock reports the proper time for the Royal Observatory Greenwich in London, England, which sits at zero degrees longitude (the prime meridian), even though the report is created in a laboratory near Paris. The report is the result of a computational average of reports supplied from a network of many designated atomic clocks situated around the globe.

a. How Does an Atomic Clock Work?

First, a one-paragraph answer to this question. Then a much more detailed answer and explanation.

An atomic clock is a very regular clock that measures the time taken for a fixed number of jumps by electrons between energy levels in atoms. There are many kinds of atomic clocks, but the one adopted worldwide in 1964 for Coordinated Universal Time relied on the very regular behavior of the cesium-133 atom. What is regular is the frequency of the microwave radiation needed for a maser pointed at cesium atoms in a vacuum chamber in order to get the cesium to radiate with a specific color (that is, to resonate). The oscillation of the wave in the maser is analogous to the swing of a pendulum.

Resonance occurs when the cesium isotope’s atoms are stimulated by a special incoming microwave frequency. If the stimulation is not at the proper frequency, the cesium will radiate very little compared to what it can radiate if properly stimulated. The stimulation causes the outer electron to transition from a low-energy ground state to a next higher-energy ground state and then to fall back down again while emitting the same microwave frequency that caused it to bump up in the first place. The oscillation or “waving” of this incoming and outgoing radiation from the maser is the ticking of the clock. Counting those ticks tells us the time. A count of 9,192,631,770 is defined to be one second. Nobody sits there and counts the waves flying by them, but the engineering details of counting them for the time interval from event A to event B are not relevant to any philosophical issues.

Pendulum clocks work by counting swings of the pendulum. Quartz clocks work by counting the shakes of a small piece of quartz crystal set in motion when voltage is applied to it. Astronomical clocks count rotations of the Earth. Atomic clocks work by producing a wave process such as a microwave, and counting a specific number of those waves that pass by a single point in space, then declaring that the time taken is one second on the clock.

The key idea for all objects that deserve to be called “clocks” is that they can be relied upon to produce nearly the same, fixed number of ticks per second. Call that number n. So, for every occurrence of n oscillations, the clock reports that a second has passed. For every 60n oscillations, it reports a minute has passed. For every 60(60n) oscillations it reports an hour, and so forth. The frequency (or, equivalently, the number of oscillations per second) is the clock’s rate of ticking. If the frequency doesn’t drift very much, it is called a “stable” frequency. The more stable the better. The reason why all the above clocks work as clocks is that they can produce relatively stable frequencies compared to that of the rest of the universe’s processes such as a president’s heartbeat or the dripping from a goatskin water bag.
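The counting scheme just described, with n ticks per second, 60n per minute, and 60(60n) per hour, can be sketched as follows. The function name is hypothetical, and the sketch assumes an ideal clock whose tick rate never drifts.

```python
CESIUM_TICKS_PER_SECOND = 9_192_631_770  # the value of n for a cesium clock


def elapsed_time(tick_count, ticks_per_second):
    """Convert a raw tick count into (hours, minutes, seconds).

    Assumes an idealized clock with a perfectly stable tick rate;
    any ticks short of a full second are discarded.
    """
    whole_seconds = tick_count // ticks_per_second
    hours, remainder = divmod(whole_seconds, 60 * 60)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


# A pendulum that swings once per second has n = 1.
print(elapsed_time(7325, 1))

# A cesium clock reporting one hour, one minute, one second.
print(elapsed_time(3661 * CESIUM_TICKS_PER_SECOND, CESIUM_TICKS_PER_SECOND))
```

The same function serves both clocks; only the value of n differs, which is the sense in which all clocks work alike.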

The advantage of using an atomic clock that relies on a specific isotope is that (1) for any isotope, all its atoms behave exactly alike, unlike any two quartz crystals or any two rotations of the Earth, (2) the atomic clock’s ticking is very regular compared to any non-atomic clock, (3) it ticks at a very fast rate (high frequency) so it is useful for measurements of events having a very brief duration, (4) the clock can easily be copied and constructed elsewhere, (5) the clock is not easily perturbed,  and (6) there is no deep mystery about why it is a better master clock than other candidates for a master clock.

An atomic clock’s stable frequency is very easy to detect because the isotope “fluoresces” or “shines” or “resonates” in a characteristic, easily-detectable narrow band of frequencies. That is, its frequency distribution has a very, very narrow central peak that clearly differs from the peaks of radiation that can be produced by electron transitions between all other energy levels in the same atom. It is these transitions from one energy level to another that emit light and produce the resonating. No radioactivity is involved in an atomic clock.

In 1879, James Clerk Maxwell was the first person to suggest using the frequency of atomic radiation as a kind of invariant natural pendulum. This remark showed great foresight, and it was made before the rest of the physics community had yet accepted the existence of atoms. Vibrations in atomic radiation are the most stable periodic events that scientists in the 21st century have been able to use for clock building.

A cesium atomic clock was adopted in 1967 as the world’s standard clock, and it remains the standard in the 2020s. At the convention, physicists agreed that when 9,192,631,770 cycles of microwave radiation in the clock’s special, characteristic process are counted, then the atomic clock should report that a duration of one atomic second has occurred.

What is this mysterious “special, characteristic process” in cesium clocks that is so stable? This question is answered assuming every cesium atom behaves according to the Bohr model of atoms. The model is easy to visualize, but it provides a less accurate description than does a description in terms of quantum theory. However, quantum theory is more difficult to understand, so mention of it is minimized in this article.

Every atom of a single isotope behaves just like any other, unlike two manufactured pendulums or even two rotations of the Earth. It is not that every atom of an isotope is in the same position or has the same energy or the same velocity, but rather that, besides those properties, they are all alike. Cesium is special because it is the most chemically reactive material in the world, so builders of atomic clocks must be careful to keep their cesium away from all other kinds of atoms.

An atom’s electrons normally stay in orbit and don’t fly away, nor do they crash into the nucleus. Electrons stay in their orbits until perturbed, and each orbit has a characteristic energy level, a specific value of its energy for any electron in that orbit. When stimulated by incoming electromagnetic radiation, such as from a laser, the electrons can absorb the incoming radiation and transition to higher, more energetic orbits. Which orbit the electron moves to depends on the energy of the incoming radiation that it absorbs. Higher orbits are orbits that are more distant from the nucleus. Also, an electron orbiting in a higher, more energetic orbit is said to be excited because it might emit some radiation spontaneously and transition into one of the lower orbits. There are an infinite number of energy levels and orbits, but they do not differ continuously. They differ by discrete steps. The various energies that can be absorbed and emitted are unique to each isotope of each element. Examining the various frequencies of the emitted radiation of an object gives sufficient information to identify which isotope and element is present. Ditto for the signature of the absorption frequencies. Famously, finding the signature for helium in sunlight was the first evidence that there was helium in the Sun.

A cesium atom’s outer electron shell contains only a single electron, making it chemically reactive to incoming microwave radiation. To take advantage of this feature in a cesium atomic clock, an outer electron in its lowest-energy orbit around the cesium-133 nucleus is targeted by some incoming microwave radiation from the atomic clock’s laser. Doing so makes the electron transition to a higher energy orbit around the cesium nucleus, thus putting the electron into an “excited” state. Properly choosing the frequency of the incoming radiation that hits the target cesium (called successfully “tuning” the laser) can control which orbit the electron transitions to. Tuning the laser is a matter of controlling the laser’s frequency with a feedback loop that keeps it generating the desired, stable frequency. Initially, the cesium is heated to produce a vapor or gas, then the cesium atoms are cooled as a group to reduce their kinetic energy, and then they are magnetically filtered to select only the atoms whose outer electrons are in the lowest possible energy state.

Our Bohr model supposes, following a suggestion from Einstein, that any electromagnetic wave such as a light wave or a microwave or a radio wave can just as well be considered to be composed of small, discrete particle-like objects called photons. The photon’s energy is directly correlated with the wave’s frequency—higher energy photons correspond to higher frequency waves. If a photon of exactly the right energy from the laser arrives and hits a cesium atom’s electron, the electron can totally absorb the photon by taking all its energy and making the electron transition up to a higher energy level. Energy is conserved during absorption and emission.
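The correlation between a photon's energy and its wave's frequency can be made concrete with the Planck relation E = hf, which the text does not state explicitly and which is assumed here. The sketch below computes the energy of one photon at the cesium clock frequency; the function names are hypothetical.

```python
PLANCK_CONSTANT_J_S = 6.62607015e-34        # joule-seconds (exact by SI definition)
ELECTRON_VOLT_J = 1.602176634e-19           # joules per electron volt (exact)
CESIUM_FREQUENCY_HZ = 9_192_631_770         # cycles per second


def photon_energy_joules(frequency_hz):
    """Energy of a single photon, assuming the Planck relation E = h * f."""
    return PLANCK_CONSTANT_J_S * frequency_hz


def photon_frequency_hz(energy_joules):
    """Inverse relation: the frequency of a photon with the given energy."""
    return energy_joules / PLANCK_CONSTANT_J_S


energy = photon_energy_joules(CESIUM_FREQUENCY_HZ)
print(f"{energy:.4e} J, or {energy / ELECTRON_VOLT_J:.4e} eV")
```

On this relation, the energy absorbed when the electron moves up a level and the energy emitted when it falls back are equal, which is why the absorbed and emitted frequencies coincide.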

Later, the electron in a higher, excited state might spontaneously fall back down to one of the various lower energy levels, while emitting a photon of some specific frequency. The value of that frequency is determined by the energy difference between the two energy levels of the transition. If it is still in an excited state, the (or an) electron might spontaneously fall again to an even lower energy level, and perhaps cascade all the way down to the lowest possible energy level. There is an infinite number of energy levels of any atom, so potentially there is an infinite number of frequencies of photons that can be absorbed and an infinite number of frequencies of photons that can be emitted in the transitions. There are an infinite number, but not just any number, because the frequencies or energies differ in small, discrete steps from each other.

If the electron in a specific energy level were hit with a sufficiently energetic incoming photon, the electron would fly away from the atom altogether, leaving the atom ionized.

For any atom of any isotope of any element with its outer electron in its lowest ground state, there is a characteristic, unique energy value for that state, and there is a characteristic minimum energy for an incoming photon to be able to knock the outer electron up to the very next higher level and no higher, and this is the same energy or frequency that is emitted when that higher-level electron spontaneously transitions back to the lowest level. This ground state behavior of transitioning to the next higher level and back down again is the key behavior of an atom that is exploited in the operation of an atomic clock.

In a cesium atomic clock using the isotope 133Cs, its cesium gas is cooled and manipulated so that nearly all its atoms are in their unexcited, lowest ground state. This manipulation uses the fact that atoms in the two different states have different magnetic properties so they can be separated magnetically. Then the laser’s frequency is tuned until the laser is able to knock the outer electrons from their ground state up to the next higher energy state (but no higher) so that the excited electrons then transition back down spontaneously to the ground level and produce radiation of exactly the same frequency as that of the laser. That is, the target cesium shines or fluoresces with the same frequency it was bombarded with. When this easily-detectable fluorescence occurs, the counting can begin, and the clock can measure elapsed time.

While the definition of a second has stayed the same since 1967, the technology of atomic clocks has not. Scientists in the 2020s can make an atomic clock so precise that it would take 30 billion years to drift by a single second. The cesium atomic clock of 1967 drifted quite a bit more. That is why the world’s 1967 time-standard using cesium atomic clocks is likely to be revised in the 21st century.

For more details on how an atomic clock works, see (Gibbs, 2002).

b. How Do We Find and Report the Standard Time?

If we were standing next to the standard clock, we could find the standard time by looking at its display of the time. Almost all countries use a standard time report that is called Coordinated Universal Time. Other names for it are UTC, and Zulu Time. It once was named Greenwich Mean Time (GMT). Some countries prefer their own, different name.

How we find out what time it is when we are not next to the standard clock is quite complicated. First, ignoring the problems of time dilation and the relativity of simultaneity raised by Einstein’s theory of relativity that are discussed above, let’s consider the details of how standard time is reported around the world for the vast majority of countries. The international standard time that gets reported is called U.T.C. time, for the initials of the French name for Coordinated Universal Time. The report of U.T.C. time is based on computations and revisions made from the time reports of the Atomic Time (A.T.) of many cesium clocks in many countries.

U.T.C. time is, by agreement, the time at zero degrees longitude. This longitude is an imaginary great circle that runs through the North Pole and South Pole and a certain astronomical observatory in London, England, although the report itself is produced near Paris, France. This U.T.C. time is used by the Internet and by the aviation industry throughout the world. Different geographical regions of the world have their own time due to the world’s being divided into time zones, approximately by the region’s longitude. Usually a time zone differs by one hour from its neighboring zone.

U.T.C. time is produced from T.A.I. time by adding or subtracting an appropriate whole number of leap seconds, which are introduced as needed. (Leap years, with their leap days, are a separate adjustment to the calendar rather than to the count of seconds.) T.A.I. time is computed from a variety of reports received of A.T. time (Atomic Time), the time of our standard, conventionally-designated cesium-based atomic clocks. All A.T. times are reported in units called S.I. seconds. A.T. time produces T.A.I. time which produces U.T.C. time.
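The leap-second step from T.A.I. time to U.T.C. time can be sketched as a simple subtraction. The sketch assumes an accumulated offset of 37 seconds, the value in force since 1 January 2017; the offset changes whenever a new leap second is announced, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: T.A.I. has been 37 seconds ahead of U.T.C. since 1 January 2017.
# This constant must be updated whenever a new leap second takes effect.
TAI_MINUS_UTC_SECONDS = 37


def tai_to_utc(tai_reading):
    """Convert a T.A.I. reading to U.T.C. by removing the leap-second offset.

    Only valid for dates on which the assumed offset applies.
    """
    return tai_reading - timedelta(seconds=TAI_MINUS_UTC_SECONDS)


tai = datetime(2024, 1, 1, 0, 0, 37)
print(tai_to_utc(tai))
```

A fuller treatment would look the offset up in a table keyed by date, since earlier years had smaller offsets.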

An S.I. second (that is, a Système International second or a second of Le Système International d’Unités) is defined to be the numerical measure of the time it takes for the motionless (motionless relative to the Greenwich-London observatory), designated, master cesium atomic clock to emit exactly 9,192,631,770 cycles of radiation. The number “9,192,631,770” was chosen rather than some other number by vote at an international convention for the purpose of making the new second be as close as scientists could come to the duration of what was called a “second” back in 1957 when the initial measurements were made on cesium-133 using the best solar-based clocks available then.

The T.A.I. scale from which U.T.C. time is computed is the average of the reports of A.T. time from about 200 designated cesium atomic clocks that are distributed around the world in about fifty selected laboratories, all reporting to Paris. One of those laboratories is the National Institute of Standards and Technology (NIST) in Boulder, Colorado, U.S.A. The calculated average time of the 200 reports is the T.A.I. time, the abbreviation of the French phrase for International Atomic Time. The International Bureau of Weights and Measures (BIPM) near Paris performs the averaging about once a month. If your designated laboratory in the T.A.I. system had sent in your clock’s reading for a certain specified event that occurred in the previous month, then in the present month the BIPM calculates the average answer for all the 200 reported clock readings and sends you a notice of how inaccurate your report was from the average, so you can reset your clock, that is, make adjustments to your atomic clock and hopefully have it be in better agreement with next month’s average for the 200. Time physicists are following the lead over time of their designated clocks because there is nothing better to follow.
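The monthly averaging and feedback described above can be sketched as follows. The sketch assumes a plain unweighted mean and uses made-up laboratory names and readings; the procedure actually used by the BIPM may weight or filter the clocks differently.

```python
from statistics import fmean


def ensemble_average(readings):
    """Average the clock readings and compute each clock's correction.

    readings: mapping from a laboratory name to its clock's reading for
        the agreed event, here in microseconds relative to a shared
        reference. Names and values are hypothetical.
    Returns (average, corrections), where each correction is the amount
    a laboratory would add to its clock to agree with the average.
    """
    average = fmean(readings.values())
    corrections = {lab: average - value for lab, value in readings.items()}
    return average, corrections


reports = {"lab_a": 1.0, "lab_b": 2.0, "lab_c": 6.0}
average, corrections = ensemble_average(reports)
print(average)
print(corrections)
```

Note that no clock outside the ensemble is consulted: the average itself serves as the standard, which is the point of the remark that there is nothing better to follow.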

A.T. time, T.A.I. time, and U.T.C. time are not kinds of physical time but rather are kinds of reports of physical time.

In the 17th century, Christiaan Huygens recommended dividing a solar day into 24 hours per day and 60 minutes per hour and 60 seconds per minute, making a second be 1/86,400 of a solar day. This is called Universal Time 1 or UT1. This is rotational time. Subsequently, the second was redefined by saying there are 31,556,925.9747 seconds in the tropical year 1900. At the 13th General Conference on Weights and Measures in 1967, the definition of a second was changed again to a specific number of periods of radiation produced by a standard cesium atomic clock (actually, the average of 200 standard atomic clocks). This second is the so-called standard second or the S.I. second. It is defined to be the duration of 9,192,631,770 periods (cycles, oscillations, vibrations) of a certain kind of microwave radiation absorbed in the standard cesium atomic clock. More specifically, the second is defined to be the duration of exactly 9,192,631,770 periods of the microwave radiation required to produce the maximum fluorescence of a small gas cloud of cesium-133 atoms as the single outer-shell electron in these atoms transitions between two specific energy levels of the atom. This is the internationally agreed-upon unit for atomic time in the T.A.I. system. In 1967 the atomic clocks were accurate to one second every 300 years. The accuracy of atomic clocks has subsequently gotten very much better.

All metrologists expect there to be an eventual change in the standard clock by appeal to higher frequency clocks.  The higher ticking rate is important for many reasons, one of which is that the more precise the clock that is used the better physicists can test the time-translation invariance of the fundamental laws of physics, such as checking whether the supposed constants of nature do in fact stay constant over time.

Leap years (with their leap days) are needed as adjustments to the calendar in order to account for the fact that the number of the Earth’s rotations per Earth revolution is not a whole number; it is roughly 365.24. Leap days are added in most years divisible by four. Separately, the Earth is spinning slower every day, but not uniformly. Without an adjustment, the time called “midnight” eventually would drift into the daylight. The effect on the period of rotation is not practically predictable, so a leap second is introduced or removed as needed whenever the time of the standard atomic clock would otherwise get behind or ahead of the old astronomical clock (Universal Time UT1) by more than 0.9 seconds.
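The 0.9-second rule for leap seconds can be sketched as a toy decision function. The function name and the sign convention are assumptions of this sketch, and the real decision is made in advance from predictions of the Earth's rotation rather than from a single measured difference.

```python
LEAP_SECOND_THRESHOLD_S = 0.9  # maximum tolerated disagreement, in seconds


def leap_second_adjustment(astronomical_minus_atomic_s):
    """Return +1 to insert a leap second, -1 to remove one, 0 to do nothing.

    astronomical_minus_atomic_s: astronomical (Earth-rotation) time minus
        the reported atomic-based time, in seconds. A negative value means
        the Earth's rotation has fallen behind the atomic clocks.
    """
    if astronomical_minus_atomic_s < -LEAP_SECOND_THRESHOLD_S:
        return +1  # Earth lags: make the reported time wait one second
    if astronomical_minus_atomic_s > LEAP_SECOND_THRESHOLD_S:
        return -1  # Earth leads: drop one second from the reported time
    return 0


print(leap_second_adjustment(-0.95))
print(leap_second_adjustment(0.3))
```

Because the Earth has mostly been slowing, the first branch is the one that has mattered in practice.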

The meter depends on the second, so time measurement is more basic than space measurement. It does not follow from this, though, that time itself is more basic than space. In 1983, scientists agreed that the meter is how far light travels in 1/299,792,458 seconds in a vacuum. This conventional number is chosen for four reasons: (i) choosing the number 299,792,458 made the new meter be very close to the old meter that was once defined to be the distance between two specific marks on a platinum bar kept in the Paris Observatory; (ii) light propagation is very stable or regular; its speed is either constant, or when not constant we know how to compensate for the influence of the medium; (iii) a light wave’s frequency can be made extremely stable (that is, little drift); and (iv) distance cannot be measured more accurately in other ways; using the platinum bar in Paris is a less accurate means of measuring distance.
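The dependence of the meter on the second can be shown in a short sketch: once a duration is measured, the corresponding distance follows from the defined speed of light. The function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by the 1983 definition of the meter


def light_travel_distance_m(duration_s):
    """Distance in meters that light covers in a vacuum in the given duration."""
    return SPEED_OF_LIGHT_M_PER_S * duration_s


# One meter is the distance covered in 1/299,792,458 of a second.
print(light_travel_distance_m(1 / 299_792_458))

# In one nanosecond light covers roughly 30 centimeters.
print(light_travel_distance_m(1e-9))
```

Any improvement in measuring durations therefore carries over directly to measuring lengths.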

Time can be measured more accurately and precisely than distance, voltage, temperature, mass, or anything else.

So why bother to improve atomic clocks? The duration of the second can already be measured to 14 or 15 decimal places, a precision 1,000 times that of any other fundamental unit. One reason to do better is that the second is increasingly the fundamental unit. Three of the six other basic units—the meter, lumen and ampere—are defined in terms of the second. (Gibbs, 2002)

One philosophical implication of the standard definition of the second and of the meter is that they fix the numerical value of the speed of light in a vacuum in all inertial frames. The speed is exactly 299,792,458 meters per second. There can no longer be any direct measurement to check whether that is how fast light really moves; it is defined to be moving that fast. Any measurement that produced a different value for the speed of light is presumed to have an error. The error would be in accounting for the influence of gravitation and acceleration, or in its assumption that the light was moving in a vacuum. This initial presumption of where the error lies comes from a deep reliance by scientists on Einstein’s general theory of relativity. However, if it were eventually decided by the community of scientists that the speed of light should not have been fixed as it was, then the scientists would call for a new world convention to re-define the second or the meter.

25. Why Are Some Standard Clocks Better than Others?

Other clocks ideally are calibrated by being synchronized to “the” standard clock, our master clock. It is normally assumed that the standard clock is the most reliable and regular clock. Physicists have chosen the currently-accepted standard clock for two reasons: (1) they believe it will tick very regularly in the sense that all periods between adjacent ticks are sufficiently congruent—they have the same duration. (2) There is no better choice of a standard clock. Choosing a standard clock that is based on the beats of a president’s heart would be a poor choice because clocks everywhere would suddenly and mysteriously get out of synchrony with the standard clock (the heartbeats) when the president goes jogging.

So, some choices of standard clock are better than others. Some philosophers of time believe one choice is better than another because the best choice is closest to a clock that tells what time it really is. Most philosophers of time argue that there is no access to what time it really is except by first having selected the standard clock.

Let’s consider the various goals we want to achieve in choosing one standard clock rather than another. One goal is to choose a clock with a precise tick rate that does not drift very much. That is, we want a clock that has a very regular period—so the durations between ticks are congruent. On many occasions throughout history, scientists have detected that their currently-chosen standard clock seemed to be drifting. In about 1700, scientists discovered that the duration from one day to the next, as determined by the duration between sunrises, varied throughout the year. They did not notice any variation in the duration of a year, so they began to rely on the duration of the year rather than the day.

As more was learned about astronomy, the definition of the second was changed. In the 19th century and before the 1950s, the standard clock was defined astronomically in terms of the mean rotation of the Earth upon its axis (solar time), and the second was defined to be 1/86,400 of the mean solar day, which is the average throughout the year of the rotational period of the Earth with respect to the Sun. For a short period in the 1950s and 1960s, the standard clock was defined in terms of the revolution of the Earth about the Sun (ephemeris time). But all these clocks were soon discovered to drift too much.

To solve these drift problems, physicists chose a certain kind of atomic clock as the standard, and they said it reported atomic time. All atomic clocks measure time in terms of the natural resonant frequencies of electromagnetic radiation absorbed and emitted from the electrons within certain atoms of the clock. The accurate dates of adoption of these standard clocks are omitted in this section because different international organizations adopted different standards in different years. The U.S.A.’s National Institute of Standards and Technology’s F-1 atomic fountain clock is so accurate that it drifts by less than one second every 30 million years. We know there is this drift because it is implied by the laws of physics, not because we have a better clock that measures this drift.

“Atomic clocks use the frequency of a specific atomic transition as an extremely stable time standard. While the second is currently defined by caesium-based clocks that operate at microwave frequencies, physicists have built much more accurate clocks that are based on light. These optical clocks tick at much higher frequencies than microwave clocks and can keep time that is accurate to about one part in 10^18, which is about 100 times better than the best caesium clocks.

The international metrology community aims to replace the microwave time standard with an optical clock, but first must choose from one of several clock designs being developed worldwide” (Hamish Johnston, Physics World, 26 March 2021).

Optical atomic clocks resonate at light frequencies rather than microwave frequencies, and this is why they tick about 100,000 times faster than the microwave atomic clocks.

To achieve the goal of restricting drift, and thus stabilizing the clock, any clock chosen to become the standard clock should be maximally isolated from outside effects. A practical goal in selecting a standard clock is to find a clock that can be well insulated from environmental impacts such as convection currents in the Earth’s molten core, comets impacting the Earth, earthquakes, stray electric fields, heavy trucks driving on nearby bumpy roads, the invasion of dust and rust into the clock, extraneous heat, variation in gravitational force, and adulteration of the clock’s gas (for example, the cesium) with other stray elements.

If not insulation, then compensation. If there is some theoretically predictable effect of an environmental influence upon the standard clock, then the clock can be regularly adjusted to compensate for this effect. For example, thanks to knowing the general theory of relativity, we know how to adjust for the difference in gravitational force between being at sea level and being a meter above sea level. Commenting on the insulation problem, Nobel Prize winner Frank Wilczek said that the basic laws of the universe are local, so:

“Thankfully, you don’t have to worry about the distant universe, what happened in the past, or what will happen in the future…and it is philosophically important to notice that it is unnecessary to take into account what people, or hypothetical superhuman beings, are thinking. Our experience with delicate, ultra-precise experiments puts severe pressure on the idea that minds can act directly on matter, through will. There’s an excellent opportunity here for magicians to cast spells, for someone with extrasensory powers to show their stuff, or for an ambitious experimenter to earn everlasting glory by demonstrating the power of prayer or wishful thinking. Even very small effects could be detected. But nobody has ever done this successfully.” Fundamentals: Ten Keys to Reality.
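The compensation for a one-meter change in height, mentioned before the quotation, can be estimated with a short sketch. It assumes the standard weak-field approximation from general relativity, in which a clock raised by height h runs fast by a fraction of about gh/c^2; neither the formula nor the function names appear in the text above.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458
STANDARD_GRAVITY_M_PER_S2 = 9.80665   # conventional value near sea level
SECONDS_PER_YEAR = 365.25 * 86_400    # Julian year, used for illustration


def fractional_rate_shift(height_m):
    """Fractional speed-up of a clock raised by height_m near Earth's surface.

    Assumes the weak-field approximation g * h / c**2, which is adequate
    only for heights small compared with the Earth's radius.
    """
    return STANDARD_GRAVITY_M_PER_S2 * height_m / SPEED_OF_LIGHT_M_PER_S**2


shift = fractional_rate_shift(1.0)
print(f"fractional shift per meter: {shift:.3e}")
print(f"seconds gained per year:    {shift * SECONDS_PER_YEAR:.3e}")
```

A shift of about one part in 10^16 per meter is far below what a cesium clock of 1967 could register, but it is within reach of the best optical clocks, which is why the correction has to be applied.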

Consider the insulation problem we would have if we were to replace the atomic clock as our standard clock and use instead the mean yearly motion of the Earth around the Sun. Can we compensate for all the relevant disturbing effects on the motion of the Earth around the Sun? Not easily nor precisely. The principal problem is that the Earth’s rate of spin varies in a practically unpredictable manner. This affects the behavior of the solar clock, but not the atomic clock.

Our civilization’s earlier-chosen standard clock once depended on the Earth’s rotations and revolutions, but this Earth-Sun clock is now known to have lost more than three hours in the last 2,000 years. Leap years and leap seconds are added or subtracted occasionally to the standard atomic clock in order to keep our atomic-based calendar in synchrony with the rotations and revolutions of the Earth. We do this because we want to keep atomic-noons occurring on astronomical-noons and ultimately because we want to prevent Northern hemisphere winters from occurring in some future July. These changes do not affect the duration of a second, but they do affect the duration of a year because not all years last the same number of seconds. In this way, we compensate for the Earth-Sun clocks falling out of synchrony with our standard atomic clock.

Another desirable feature of a standard clock is that reproductions of it stay in synchrony with each other when environmental conditions are the same. Otherwise, we may be limited to relying on a specifically-located standard clock that cannot be trusted elsewhere and that can be broken, vandalized or stolen.

The principal goal in selecting a standard clock is to reduce mystery in physics. The point is to find a clock process that, if adopted as our standard, makes the resulting system of physical laws simpler and more useful, and allows us to explain phenomena that otherwise would be mysterious. Choosing an atomic clock as standard is much better for this purpose than choosing the periodic revolution of the Earth about the Sun. If scientists were to have retained the Earth-Sun astronomical clock as the standard clock and were to say that by definition the Earth does not slow down in any rotation or in any revolution, then when a comet collides with Earth, tempting the scientists to say the Earth’s period of rotation and revolution changed, the scientists instead would be forced not to say this but to alter, among many other things, their atomic theory and to say the frequency of light emitted from cesium atoms mysteriously increases all over the universe when comets collide with the Earth. By switching to the cesium atomic standard, these alterations are unnecessary, and the mystery vanishes.

To make this point a little more simply, suppose the President’s heartbeats were chosen as our standard clock and so the count of heartbeats always showed the correct time. It would become a mystery why pendulums (and cesium radiation in atomic clocks) changed their frequency whenever the President went jogging, and scientists would have to postulate some new causal influence that joggers have on pendulums and atomic clocks across the globe.

To achieve the goal of choosing a standard clock that maximally reduces mystery, we want the clock’s readings to be consistent with the accepted laws of motion, in the following sense. Newton’s first law of motion says that a body in motion should continue to cover the same distance during the same time interval unless acted upon by an external force. If we used our standard clock to run a series of tests of the time intervals as a body coasted along a carefully measured path, and we found that the law was violated and we could not account for this mysterious violation by finding external forces to blame and we were sure that there was no problem otherwise with Newton’s law or with the measurement of the length of the path, then the problem would be with the clock. Leonhard Euler (1707-1783) was the first person to suggest this consistency requirement on our choice of a standard clock. A similar argument holds today but with using the laws of motion from Einstein’s general theory of relativity, one of the two fundamental theories of physics.

When we want to know how long a basketball game lasts, why do we subtract the start time from the end time? The answer is that we accept a metric for duration in which we subtract the two time numbers. Why do we not choose another metric and, let’s say, subtract the square root of the start time from the square root of the end time? This question is implicitly asking whether our choice of metric can be incorrect or merely inconvenient.

When we choose a standard clock, we are choosing a metric. By agreeing to read the clock so that a duration from 3:00 to 5:00 is 5-3 hours, and so 2 hours, we are making a choice about how to compare two durations in order to decide whether they are equal, that is, congruent. We suppose the duration from 3:00 to 5:00 as shown by yesterday’s reading of the standard clock was the same as the duration from 3:00 to 5:00 on the readings from two days ago and will be the same for today’s readings and tomorrow’s readings.
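The difference between the two metrics can be illustrated with a small sketch. Clock readings are treated as hours counted from an arbitrary starting point, which is an assumption of the sketch, as are the function names.

```python
import math


def standard_duration(start, end):
    """The usual metric: subtract the start reading from the end reading."""
    return end - start


def sqrt_duration(start, end):
    """The alternative metric: subtract the square roots of the readings."""
    return math.sqrt(end) - math.sqrt(start)


# 3:00 to 5:00 on day one, and 3:00 to 5:00 on day two (24 hours later).
day_one = (3, 5)
day_two = (27, 29)

for metric in (standard_duration, sqrt_duration):
    first, second = metric(*day_one), metric(*day_two)
    verdict = "congruent" if math.isclose(first, second) else "not congruent"
    print(f"{metric.__name__}: {first:.3f} vs {second:.3f} -> {verdict}")
```

Under the standard metric the two afternoons have equal durations; under the square-root metric the later one is shorter. The two metrics agree about the order of events and disagree only about which durations are congruent.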

Philosophers of time continue to dispute the extent to which the choice of metric is conventional rather than objective in the sense of being forced on us by nature. The objectivist says the choice is forced and that the success of the standard atomic clock over the standard solar clock shows that we were more accurate in our choice of the standard clock. An objectivist says it is just as forced on us as our choosing to say the Earth is round rather than flat. It would be ridiculous to insist the Earth is flat. Taking the conventional side on this issue, Adolf Grünbaum argued that time is metrically amorphous. It has no intrinsic metric. Instead, we choose the metric we do in order only to achieve the goals of reducing mystery in science, but satisfying those goals is no sign of being correct.

The conventionalist, as opposed to the objectivist, would say that if we were to require by convention that the instant at which Jesus was born and the instant at which Abraham Lincoln was assassinated are to be only 24 seconds apart, whereas the duration between Lincoln’s assassination and his burial is to be 24 billion seconds, then we could not be mistaken. It is up to us as a civilization to say what is correct when we first create our conventions about measuring duration. We can consistently assign any numerical time coordinates we wish, subject only to the condition that the assignment properly reflects the betweenness relations of the events that occur at those instants. That is, if event J (birth of Jesus) occurs before event L (Lincoln’s assassination) and this, in turn, occurs before event B (burial of Lincoln), then the time assigned to J must be numerically less than the time assigned to L, and both must be less than the time assigned to B so that t(J) < t(L) < t(B). A simple requirement. Yes, but the implication is that this relationship among J, L, and B must hold for all events simultaneous with J, and for all events simultaneous with L, and so forth.
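The difference between preserving order and fixing congruence can be made concrete with a small sketch. It compares the usual subtraction metric with the square-root metric mentioned earlier; the function names and the four sample readings are hypothetical, chosen so that the arithmetic is easy to follow.

```python
import math


def standard_duration(start, end):
    """The usual metric: subtract the two time numbers."""
    return end - start


def sqrt_duration(start, end):
    """The rival metric: subtract the square roots of the time numbers."""
    return math.sqrt(end) - math.sqrt(start)


def preserves_order(relabel, times):
    """Check that relabelling keeps earlier instants numerically smaller."""
    labels = [relabel(t) for t in times]
    return all(a < b for a, b in zip(labels, labels[1:]))


times = [1.0, 4.0, 9.0, 16.0]  # hypothetical clock readings, in hours

# Taking square roots respects the before-and-after ordering of the instants.
print(preserves_order(math.sqrt, times))  # True

# The two metrics nevertheless disagree about which intervals are congruent.
for start, end in zip(times, times[1:]):
    print(start, end, standard_duration(start, end), sqrt_duration(start, end))
```

Under the standard metric the three successive intervals last 3, 5, and 7 hours, so no two are congruent; under the square-root metric each lasts exactly one unit, so all three are congruent. Both assignments satisfy the ordering condition, which is why the conventionalist holds that ordering alone does not settle the choice of metric.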

It is other features of nature that lead us to reject the above convention about 24 seconds and 24 billion seconds. What features? There are many periodic processes in nature that have a special relationship to each other; their periods are very nearly constant multiples of each other, and this constant stays the same over a long time. For example, the period of the rotation of the Earth is a fairly constant multiple of the period of the revolution of the Earth around the Sun, and both these periods are a constant multiple of the periods of a swinging pendulum and of vibrations of quartz crystals. The class of these periodic processes is very large, so the world will be easier to describe if we choose our standard clock from one of these periodic processes. A good convention for what is regular will make it easier for scientists to find simple laws of nature and to explain what causes other events to be irregular. It is the search for regularity and simplicity and removal of mystery that leads us to adopt the conventions we do for the numerical time coordinate assignments and thus leads us to choose the standard clock we do choose. Objectivists disagree and say this search for regularity and simplicity and removal of mystery is all fine, but it is directing us toward the correct metric, not simply the useful metric.

For additional discussion of some of the points made in this section, including the issue of how to distinguish an accurate clock from an inaccurate one, see chapter 8 of (Carnap 1966).

26. What Is a Field?

The technical word “field” means something that extends throughout space and that exerts a force on various things it encounters. The most familiar field is the temperature field for a single time; it is displayed on the screen during a weather report. It shows the temperature at each place represented on the map such as 70 degrees Fahrenheit now in New York, and 71 degrees now in Washington, DC. The field is a special kind of extended object having three or more dimensions with each place having a value for some physical variable—temperature in this example.

Suppose your system of interest is a room instead of the cosmos, and you wish to understand sound in the room. This means you want to know about its air density field. Sound waves in the room are oscillations of this field due to changing air density in different places at different times.

Unlike a temperature field or an air density field, some fields always have directions at their points. A wind field has a speed and a direction at each place. A magnetic field has a strength and a direction at each place.

What is the advantage of treating the world in terms of fields? Here is a brief answer. In any field theory with the property called “locality,” any change in a field’s value at a place can directly induce changes only in infinitesimally-nearby places. Think of points in the field as interacting only with their nearest neighbors, which in turn interact with their own neighbors, and so forth. So, field theory with locality has the advantage that, if you want to know what will happen next at a place, you do not have to consider the influence of everything everywhere in the universe but only the field values at the place of interest and the rates of change of those values. Computing the effect of a change can be much simpler this way.
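Locality can be illustrated with a toy sketch. It assumes a one-dimensional field sampled at a few discrete sites and an invented update rule in which each site is nudged toward the average of its two neighbors; the rule, the `coupling` constant, and the function name are illustrative assumptions, not any particular physical field equation.

```python
def local_step(field, coupling=0.25):
    """Advance a 1-D field one time step using nearest neighbors only.

    Each interior site moves toward the average of its two neighbors;
    the two end sites are held fixed. No site ever consults a distant site.
    """
    new_field = list(field)
    for i in range(1, len(field) - 1):
        neighbors_minus_self = field[i - 1] - 2 * field[i] + field[i + 1]
        new_field[i] = field[i] + coupling * neighbors_minus_self
    return new_field


# A single disturbance in the middle of an otherwise flat field.
field = [0.0] * 9
field[4] = 1.0

after_one = local_step(field)
after_two = local_step(after_one)

# The disturbance spreads outward by one site per step.
print([i for i, v in enumerate(after_one) if v != 0.0])  # [3, 4, 5]
print([i for i, v in enumerate(after_two) if v != 0.0])  # [2, 3, 4, 5, 6]
```

To compute the next value at a site, the function reads only three numbers, however large the field is. That is the computational advantage the paragraph above describes.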

In Newton’s mechanics, two distant objects act on each other directly and instantaneously. In contemporary mechanics, the two distant objects act on each other only indirectly via the field between them. However, Newton’s theory of gravity without fields is sometimes more practical to use because gravitational forces get weaker with distance, and the gravitational influence of all the distant particles can be ignored for some practical purposes.

The universe at a time is approximately a system of particles in spacetime, but, more fundamentally, the best, educated guess of physicists is that it is a system of co-existing quantized fields acting on the vacuum or being the vacuum. We know this is so for all non-gravitational phenomena, but do not have a definitive theory of the gravitational field that involves quantum phenomena. In the early years of using the concept of fields, the fields were considered something added to systems of particles, but the modern viewpoint (influenced by quantum mechanics) is that particles themselves are only local vibrations or excitations of fields; the particles are the vibrations that are fairly stable in the sense of persisting (for the particle’s lifetime) and not occupying a large spatial region as the field itself does.

The classical concept of there being a particle at a point does not quite hold in quantum field theory. The key ontological idea is that the particles supervene on the fields. Particles are epiphenomena. Also, the particles of quantum fields do not change their values continuously as do particles in classical fields. A particle in a quantum field is able to change its energy only in discrete jumps.

The concept of a field originated with Pierre-Simon Laplace in about 1800. He suggested Newton’s theory of gravity could be treated as being a field theory. In Laplace’s field theory of gravity, the notion of action at a distance was eliminated. Newton would have been happy with this idea of a field because he always doubted that gravity worked by one object’s mass acting directly on another distant object instantaneously. In a letter to Richard Bentley, he said:

It is inconceivable that inanimate brute matter should, without the intervention of something else which is not material, operate upon and affect other matter, and have an effect upon it, without mutual contact.

Michael Faraday was the first physicist to assert that fields in this sense are real, not just mathematical artifacts.

Instantaneous actions were removed from the treatment of electricity and magnetism by Maxwell in the 1860s when he created his theory of electromagnetism as a field theory. Changes in electromagnetic forces were propagated, not instantaneously, but at the speed c of light. Instantaneous actions were eventually removed from gravitational theory in Einstein’s general theory of relativity of 1915. It was Einstein who first claimed that spacetime itself is the field associated with gravity. According to Einstein’s theory,

As the Earth moves, the direction of its gravitational pull does not change instantly throughout the universe. Rather, it changes right where the Earth is located, and then the field at that point tugs on the field nearby, which tugs on the field a little farther away, and so on in a wave moving outward at the speed of light. (Carroll 2019, p. 249)

Gravitational force, according to Einstein’s theory, is not really a force in the usual sense of the term, but is the curvature of spacetime.

Depending upon the field, a field’s value at a point in space might be a simple number (as in the Higgs field), or a vector (as in the classical electromagnetic field), or a tensor (as in Einstein’s gravitational potential field), or even a matrix. Fields obey laws, and these laws usually are systems of partial differential equations that hold at each point.

As mentioned briefly above, with the rise of quantum field theory, instead of a particle being treated as a definite-size object within spacetime it is treated as a special kind of disturbance of the field itself, a little “hill” or deviation above its average value nearby. For example, an electron is a localized disturbance in the electron field. The anti-electron is a localized disturbance in the same field, whereas a photon is a localized disturbance in the electromagnetic field. The disturbance is a fuzzy bundle of quantized energy occupying a region of space bigger than a single point, but having a maximum at a place that would classically have been called the “particle’s location.” A particle is a little hill in the field. These hills can be stationary or moving. The hills can pass by each other or pass through other hills or bounce off them, depending on the kinds of hills. Moving hills carry information and energy from one place to another. New energy put into the field can increase the size of the hill, but only in discrete sizes. Any hill has a next bigger possible size (or energy).

So, the manifest image of a particle cannot easily be reconciled with the quantum mechanical image of a particle. Although fields, not particles, are ontologically basic, it does not follow from this that particles are not real. They are real but odd because they are emergent and epiphenomenal entities having no sharply defined diameter and not being able to change their sizes gradually. Although an electron does have a greater probability of being detected at some places than at others, in any single detection at a single time the electron is detected only at a point, not a region. The electron is a disturbance that spreads throughout space, although the high-amplitude parts are in a very small region. Despite its having no sharp boundary, the electron is physically basic in the sense that it has no sub-structure particle. The proton is not basic because it is made of quarks and gluons. Particles with no sub-structure are called elementary particles.

Relativity theory’s biggest ontological impact is that whether a particle is present depends on the observer. An accelerating observer might observe (that is, detect) particles being present in a specific region while a non-accelerating observer can see no particles there. For a single region of spacetime, there can be particles in the region in one reference frame and no particles in that region for another frame, yet both frames are correct descriptions of reality!

One unusual feature of quantum mechanics is the Heisenberg Uncertainty Principle. It implies that any object, such as an electron, has complementary features. For example, it has values for its position and for the rate of change of its position, but the values are complementary in the sense that the more precisely one value is measured the less precisely the other value can be measured. Fields are objects, too, and so Heisenberg’s Uncertainty Principle applies also to fields. Fields have complementary features. The more certain you are of the value of a field at one location in space, the less certain you can be of its rate of change at that location. Thus the word “uncertainty” in the name Heisenberg Uncertainty Principle.

There are many basic quantum fields that exist together. There are four fundamental matter fields, two of which are the electron field and the quark field. There are five fundamental force-carrying fields, such as the electromagnetic field and the Higgs field. All physicists believe there are more, as yet unknown, fields, such as a dark matter field, a dark energy field, and a quantum-gravity field.

Fields often interact with other fields. The electron has the property of having an electric charge. What this means in quantum field theory is that the property of having a certain electric charge is a short description of how the electron field interacts with the electromagnetic field. The electromagnetic field interacts with the electron field whenever an energetic photon transitions into an electron and a positron (that is, an anti-electron). What it is for an electron to have a mass is that the electron field interacts with the Higgs field. Physicists presuppose that two fields can interact with each other only when they are at the same point. If this presupposition were not true, our world would be a very spooky place.

According to quantum field theory, once one of these basic fields comes into existence it does not disappear; the field exists everywhere from then on. Magnets create magnetic fields, but if you were to remove all the magnets, there would still be a magnetic field, although it would be at its minimum strength. Sources of fields are not essential for the existence of fields.

Because of the Heisenberg Uncertainty Principle, even when a field’s value is the lowest possible (called the vacuum state or unexcited state) in a region, there is always a non-zero probability that its value will spontaneously deviate from that value in the region. The most common way this happens is via virtual-pair production. This occurs when a particle and its anti-particle spontaneously come into existence in the region, then rapidly annihilate each other in a small burst of energy. You can think of space in its smallest regions as being a churning sea, a sea of pairs of these particles and their anti-particles that are continually coming into existence and then rapidly being annihilated. These virtual particles are certain compact quantum vacuum fluctuations. So, even if all the universe’s fields were to be at their lowest state, empty space would always have some activity and energy. This energy of the vacuum state is inaccessible to us; we can never use it to do work. Nevertheless, the energy of these virtual particles does contribute to the energy density of so-called “empty space.” The claim has been carefully verified experimentally.

This story or description of virtual particles is helpful but can be misleading when it is interpreted as suggesting that something is created from nothing in violation of energy conservation. However, it is correct to draw the conclusion from the story that the empty space of physics is not the metaphysician’s nothingness. So, there is no region of empty space where there could be empty time or changeless time in the sense meant by a Leibnizian relationist.

Because all these fields are quantum fields, their disturbances or excitations can occur only in quantized chunks, namely integer multiples of some baseline energy, the so-called zero-point energy, which is the lowest possible positive energy. It is these chunks, called “quanta,” that make the theory a quantum theory.

Although fields that exist cannot go out of existence, they can wake up from their slumbers and turn on. Soon after the Big Bang, the Higgs field, which had a value of zero everywhere, began to increase in value as the universe started cooling. When the universe’s temperature fell below a certain critical value, the field grew spontaneously. From then on, any particle that interacted with the Higgs field acquired a mass. Before that, all particles were massless. The more a particle interacts with the Higgs field, the heavier it is. The photon does not interact at all with the Higgs field. The neutrino interacts the least.

What is the relationship between spacetime and all these fields? Are the fields in space or, as Einstein once asked, are they properties of space, or is there a different relationship? Some physicists believe the gravitational field does reside within spacetime, but others believe it does not.

Many physicists believe that the universe is not composed of many fields; it is composed of a single field, the quantum field, which has a character such that it appears as if it is composed of various different fields. This one field is the vacuum, and all particles are really just fluctuations in the vacuum.

There is also serious speculation that fields are not the ontologically basic entities; information is the universe’s basic entity.

For an elementary introduction to quantum fields, see the video https://www.youtube.com/watch?v=X5rAGfjPSWE.

Back to the main “Time” article for references.

 

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University Sacramento
U. S. A.

Virtue Epistemology

Virtue epistemology is a collection of recent approaches to epistemology that give epistemic or intellectual virtue concepts an important and fundamental role. Virtue epistemologists can be divided into two groups, each accepting a different conception of what an intellectual virtue is.

Virtue reliabilists conceive of intellectual virtues as stable, reliable and truth-conducive cognitive faculties or powers and cite vision, introspection, memory, and the like as paradigm cases of intellectual virtue. These virtue epistemologists tend to focus on formulating virtue-based accounts of knowledge or justification. Virtue reliabilist accounts of knowledge and justification are versions of epistemological externalism. Consequently, whatever their strengths as versions of externalism, virtue reliabilist views are likely to prove unsatisfying to anyone with considerable internalist sympathies.

Virtue responsibilists conceive of intellectual virtues as good intellectual character traits, like attentiveness, fair-mindedness, open-mindedness, intellectual tenacity, and courage, traits that might be viewed as the traits of a responsible knower or inquirer. While some virtue responsibilists have also attempted to give virtue-based accounts of knowledge or justification, others have pursued less traditional projects, focusing on such issues as the nature and value of virtuous intellectual character as such, the relation between intellectual virtue and epistemic responsibility, and the relevance of intellectual virtue to the social and cross-temporal aspects of the intellectual life.

There is however a sense in which the very distinction between virtue reliabilism and virtue responsibilism is sketchier than it initially appears. Indeed, the most plausible version of virtue reliabilism will incorporate many of the character traits of interest to virtue responsibilists into its repertoire of virtues and in doing so will go significant lengths toward bridging the gap between virtue reliabilism and virtue responsibilism.

Table of Contents

  1. Introduction to Virtue Epistemology
  2. Virtue Reliabilism
    a. Key Figures
    b. Prospects for Virtue Reliabilism
  3. Virtue Responsibilism
    a. Key Figures
    b. Prospects for Virtue Responsibilism
  4. The Reliabilist/Responsibilist Divide
  5. References and Further Reading

1. Introduction to Virtue Epistemology

Virtue epistemology is a collection of recent approaches to epistemology that give epistemic or intellectual virtue concepts an important and fundamental role.

The advent of virtue epistemology was at least partly inspired by a fairly recent renewal of interest in virtue concepts among moral philosophers (see, for example, Crisp and Slote 1997). Noting this influence from ethics, Ernest Sosa introduced the notion of an intellectual virtue into contemporary epistemological discussion in a 1980 paper, “The Raft and the Pyramid.” Sosa argued in this paper that an appeal to intellectual virtue could resolve the conflict between foundationalists and coherentists over the structure of epistemic justification. Since the publication of Sosa’s paper, several epistemologists have turned to intellectual virtue concepts to address a wide range of issues, from the Gettier problem to the internalism/externalism debate to skepticism.

There are substantial and complicated differences between the various virtue epistemological views; as a result, relatively little can be said by way of generalization about the central tenets of virtue epistemology. These differences are attributable mainly to two competing conceptions of the nature of an intellectual virtue. Sosa and certain other virtue epistemologists tend to define an intellectual virtue as roughly any stable and reliable or truth-conducive property of a person. They cite as paradigm instances of intellectual virtue certain cognitive faculties or powers like vision, memory, and introspection, since such faculties ordinarily are especially helpful for getting to the truth. Epistemologists with this conception of intellectual virtue have mainly been concerned with constructing virtue-based analyses of knowledge and/or justification. Several have argued, for instance, that knowledge should be understood roughly as true belief arising from an exercise of intellectual virtue. Because of their close resemblance to standard reliabilist epistemologies, these views are referred to as instances of “virtue reliabilism.”

A second group of virtue epistemologists conceives of intellectual virtues, not as cognitive faculties or abilities like memory and vision, but rather as good intellectual character traits, traits like inquisitiveness, fair-mindedness, open-mindedness, intellectual carefulness, thoroughness, and tenacity. These character-based versions of virtue epistemology are referred to as instances of “virtue responsibilism,” since the traits they regard as intellectual virtues might also be viewed as the traits of a responsible knower or inquirer. Some virtue responsibilists have adopted an approach similar to that of virtue reliabilists by giving virtue concepts a crucial role in an analysis of knowledge or justification. Linda Zagzebski, for instance, claims that knowledge is belief arising from what she calls “acts of intellectual virtue” (1996). Other virtue responsibilists like Lorraine Code (1987) have eschewed more traditional epistemological problems. Code argues that epistemology should be oriented on the notion of epistemic responsibility and that epistemic responsibility is the chief intellectual virtue; however, she makes no attempt to offer a definition of knowledge or justification based on these concepts. Her view instead gives priority to topics like the value of virtuous cognitive character as such, the social and moral dimensions of the intellectual life, and the role of agency in inquiry.

Virtue reliabilists and virtue responsibilists alike have claimed to have the more accurate view of intellectual virtue and hence of the general form that a virtue-based epistemology should take. And both have appealed to Aristotle, one of the first philosophers to employ the notion of an intellectual virtue, in support of their claims. Some virtue responsibilists (for example, Zagzebski 1996) have argued that the character traits of interest to them are the intellectual counterpart to what Aristotle and other moral philosophers have regarded as the moral virtues and that these traits are therefore properly regarded as intellectual virtues. In response, virtue reliabilists have pointed out that, whatever his conception of moral virtue, Aristotle apparently conceived of intellectual virtues more as truth-conducive cognitive powers or faculties than as good intellectual character traits. They have claimed furthermore that these powers, but not the responsibilist’s character traits, have an important role to play in an analysis of knowledge, and that consequently, the former are more reasonably regarded as intellectual virtues (Greco 2000).

It would be a mistake, however, to view either group of virtue epistemologists as necessarily having a weightier claim than the other to the concept of an intellectual virtue, for both are concerned with traits that are genuine and important intellectual excellences and therefore can reasonably be regarded as intellectual virtues. Virtue reliabilists are interested in cognitive qualities that are an effective means to epistemic values like truth and understanding. The traits of interest to virtue responsibilists are also a means to these values, since a person who is, say, reflective, fair-minded, perseverant, intellectually careful, and thorough ordinarily is more likely than one who lacks these qualities to believe what is true, to achieve an understanding of complex phenomena, and so forth. Moreover, these qualities are “personal excellences” in the sense that one is also a better person (albeit in a distinctively intellectual rather than straightforwardly moral way) as a result of possessing them, that is, as a result of being reflective, fair-minded, intellectually courageous, tenacious, and so forth. The latter is not true of cognitive faculties or abilities like vision or memory. These traits, while contributing importantly to one’s overall intellectual well-being, do not make their possessor a better person in any relevant sense. This is entirely consistent, however, with the more general point that virtue responsibilists and virtue reliabilists alike are concerned with genuine and important intellectual excellences both sets of which can reasonably be regarded as intellectual virtues. Virtue reliabilists are concerned with traits that are a critical means to intellectual well-being or “flourishing” and virtue responsibilists with traits that are both a means to and are partly constitutive of intellectual flourishing.

A firmer grasp of the field of virtue epistemology can be achieved by considering, for each branch of virtue epistemology, how some of its main proponents have conceived of the nature of an intellectual virtue and how they have employed virtue concepts in their theories. It will also be helpful to consider the apparent prospects of each kind of virtue epistemology.

2. Virtue Reliabilism

a. Key Figures

Since introducing the notion of an intellectual virtue to contemporary epistemology, Sosa has had more to say than any other virtue epistemologist about the intellectual virtues conceived as reliable cognitive faculties or abilities. Sosa characterizes an intellectual virtue, very generally, as “a quality bound to help maximize one’s surplus of truth over error” (1991: 225). Recognizing that any given quality is likely to be helpful for reaching the truth only with respect to a limited field of propositions and only when operating in a certain environment and under certain conditions, Sosa also offers the following more refined characterization: “One has an intellectual virtue or faculty relative to an environment E if and only if one has an inner nature I in virtue of which one would mostly attain the truth and avoid error in a certain field of propositions F, when in certain conditions C” (284). Sosa identifies reason, perception, introspection, and memory as among the qualities that most obviously satisfy these conditions.

Sosa’s initial appeal to intellectual virtue in “The Raft and the Pyramid” is aimed specifically at resolving the foundationalist/coherentist dispute over the structure of epistemic justification. (Sosa has since attempted to show that virtue concepts are useful for addressing other epistemological problems as well; the focus here, however, will be limited to his seminal discussion in “The Raft and the Pyramid.”) According to Sosa, traditional formulations of both foundationalism and coherentism have fatal defects. The main problem with coherentism, he argues, is that it fails to give adequate epistemic weight to experience. The coherentist claims roughly that a belief is justified just in case it coheres with the rest of what one believes. But it is possible for a belief to satisfy this condition and yet be disconnected from or even to conflict with one’s experience. In such cases, the belief in question intuitively is unjustified, thereby indicating the inadequacy of the coherentist’s criterion for justification (1991: 184-85). Sosa also sees standard foundationalist accounts of justification as seriously flawed. The foundationalist holds that the justification of non-basic beliefs derives from that of basic or foundational beliefs and that the latter are justified on the basis of things like sensory experience, memory, and rational insight. According to Sosa, an adequate version of foundationalism must explain the apparent unity of the various foundationalist principles that connect the ultimate sources of justification with the beliefs they justify. But traditional versions of foundationalism, Sosa claims, seem utterly incapable of providing such an explanation, especially when the possibility of creatures with perceptual or cognitive mechanisms radically different from our own (and hence with radically different epistemic principles) is taken into account (187-89).

Sosa briefly sketches a model of epistemic justification that he says would provide the required kind of explanation. This model depicts justification as “stratified”: it attaches primary justification to intellectual virtues like sensory experience and memory and secondary justification to beliefs produced by these virtues. A belief is justified, according to the model, just in case it has its source in an intellectual virtue (189). Sosa’s proposed view of justification is, in effect, an externalist version of foundationalism, since a belief can have its source in an intellectual virtue and hence be justified without this fact’s being internally or subjectively accessible to the person who holds it. This model provides an explanation of the unity of foundationalist epistemic principles by incorporating the foundationalist sources of epistemic justification under the concept of an intellectual virtue and offering a unified account of why beliefs grounded in intellectual virtue are justified (namely, because they are likely to be true). If Sosa’s criticisms of traditional coherentist and foundationalist views together with his own positive proposal are plausible, virtue reliabilism apparently has the resources to deal effectively with one of the more challenging and longstanding problems in contemporary epistemology.

John Greco also gives the intellectual virtues conceived as reliable cognitive faculties or abilities a central epistemological role. Greco characterizes intellectual virtues generally as “broad cognitive abilities or powers” that are helpful for reaching the truth. He claims, more specifically, that intellectual virtues are “innate faculties or acquired habits that enable a person to arrive at truth and avoid error in some relevant field.” These include things like “perception, reliable memory, and various kinds of good reasoning” (2002: 287).

Greco offers an account of knowledge according to which one knows a given proposition just in case one believes the truth regarding that proposition because one believes out of an intellectual virtue (311). Greco breaks this definition down as follows. It requires, first, that one be subjectively justified in believing the relevant claim. According to Greco, one is subjectively justified in believing a given proposition just in case this belief is produced by dispositions that one manifests when one is motivated to believe what is true. Greco stipulates that an exercise of intellectual virtue entails the manifestation of such dispositions. Second, Greco’s definition of knowledge requires that one’s belief be objectively justified. This means that one’s belief must be produced by one or more of one’s intellectual virtues. Third, Greco’s definition requires that one believe the truth regarding the claim in question because one believes the claim out of one or more of one’s intellectual virtues. In other words, one’s being objectively justified must be a necessary and salient part of the explanation for why one believes the truth.

Greco discusses several alleged virtues of his account of knowledge. One of these is the reply it offers to the skeptic. According to one variety of skepticism, we do not and cannot have any non-question-begging reasons for thinking that any of our beliefs about the external world are true, for any such reasons inevitably depend for their force on some of the very beliefs in question (305-06). Greco replies by claiming that the skeptic’s reasoning presupposes a mistaken view of the relation between knowledge and epistemic grounds or reasons. The skeptic assumes that to know a given claim, one must be in possession of grounds or reasons which, via some inductive, deductive, or other logical or quasi-logical principle, provide one with a cogent reason for thinking that the claim is true or likely to be true. If Greco’s account of knowledge is correct, this mischaracterizes the conditions for knowledge. Greco’s account requires merely that an agent’s grounds be reliable, or rather, that an agent herself be reliable on account of a disposition to believe on reliable grounds. It follows that as long as a disposition to form beliefs about the external world on the basis of sensory experience of that world is reliable, knowledge of the external world is possible for a person who possesses this disposition. But since an agent can be so disposed and yet lack grounds for her belief that satisfy the skeptic’s more stringent demands, Greco can conclude that knowledge does not require the satisfaction of these demands (307).

b. Prospects for Virtue Reliabilism

The foregoing indicates some of the ways that virtue reliabilist accounts of knowledge and justification may, if headed in the right general direction, provide helpful ways of addressing some of the more challenging problems in epistemology. It remains, however, that one is likely to find these views plausible only to the extent that one is already convinced of a certain, not wholly uncontroversial position that undergirds and partly motivates them.

Virtue reliabilist accounts of knowledge and justification are versions of epistemological externalism: they deny that the factors grounding one’s justification must be cognitively accessible from one’s first-person or internal perspective. Consequently, whatever their strengths as versions of externalism, virtue reliabilist views are likely to prove unsatisfying to anyone with considerable internalist sympathies. Consider, for example, a version of internalism according to which one is justified in believing a given claim just in case one has an adequate reason for thinking that the claim is true. It is not difficult to see why, if this account of justification were correct, the virtue reliabilist views considered above would be less promising than they might initially appear.

Sosa, for instance, attempts to resolve the conflict between foundationalism and coherentism by offering an externalist version of foundationalism. But traditionally, the coherentist/foundationalist debate has been an in-house debate among internalists. Coherentists and foundationalists alike have generally agreed that to be justified in believing a given claim is to have a good reason for thinking that the claim is true. The disagreement has been over the logical structure of such a reason, with coherentists claiming that the structure should be characterized in terms of doxastic coherence relations and foundationalists that it should be characterized mainly in terms of relations between foundational beliefs and the beliefs they support. Sosa rejects this shared assumption. He claims that justification consists in a belief’s having its source in an intellectual virtue. But a belief can have its source in an intellectual virtue without one’s being aware of it and hence without one’s having any reason at all for thinking that the belief is true. Therefore, Sosa’s response to the coherentism/foundationalism debate is likely to strike traditional coherentists and foundationalists as seriously problematic.

(It is worth noting in passing that in later work [for example, 1991], Sosa claims that the kind of justification just described is sufficient, when combined with the other elements of knowledge, merely for “animal knowledge” and not for “reflective” or “human knowledge.” The latter requires the possession of an “epistemic perspective” on any known proposition. While Sosa is not entirely clear on the matter, this apparently requires the satisfaction of something like either traditional coherentist or traditional foundationalist conditions for justification [see, for example, BonJour 1995].)

An internalist is likely to have a similar reaction to Greco’s response to the skeptic. Greco argues against skepticism about the external world by claiming that if a disposition to reason from the appearance of an external world to the existence of that world is in fact reliable, then knowledge of the external world is possible for a person who possesses such a disposition. But this view allows for knowledge of the external world in certain cases where a person lacks any cogent or even merely non-question-begging reasons for thinking that the external world exists. As a result, Greco’s more lenient requirements for knowledge are likely to seem to internalists more like a capitulation to skepticism than a victory over it.

Of course, these considerations do not by themselves show virtue reliabilism to be implausible, as the internalist viewpoint in question is itself a matter of some controversy. Indeed, Sosa and Greco alike have argued vigorously against internalism and have lobbied for externalism as the only way out of the skeptical bog. But the debate between internalists and externalists remains a live one and the foregoing indicates that the promise of virtue reliabilism hangs in a deep and important way on the outcome of this debate.

3. Virtue Responsibilism

a. Key Figures

Virtue responsibilism contrasts with virtue reliabilism in at least two important ways. First, virtue responsibilists think of intellectual virtues, not as cognitive faculties like introspection and memory, but rather as traits of character like attentiveness, intellectual courage, carefulness, and thoroughness. Second, while virtue reliabilists tend to focus on the task of providing a virtue-based account of knowledge or justification, several virtue responsibilists have seen fit to pursue different and fairly untraditional epistemological projects.

One of the first contemporary philosophers to discuss the epistemological role of the intellectual virtues conceived as character traits is Lorraine Code (1987). Code claims that epistemologists should pay considerably more attention to the personal, active, and social dimensions of the cognitive life and she attempts to motivate and outline an approach to epistemology that does just this. The central focus of her approach is the notion of epistemic responsibility, as an epistemically responsible person is especially likely to succeed in the areas of the cognitive life that Code says deserve priority. Epistemic responsibility, she claims, is the chief intellectual virtue and the virtue “from which other virtues radiate” (44). Some of these other virtues are open-mindedness, intellectual openness, honesty, and integrity. Since Code maintains that epistemic responsibility should be the focus of epistemology and thinks of epistemic responsibility in terms of virtuous intellectual character, she views the intellectual virtues as deserving an important and fundamental role in epistemology.

Code claims that intellectual virtue is fundamentally “a matter of orientation toward the world, toward one’s knowledge-seeking self, and toward other such selves as part of the world” (20). This orientation is partly constituted by what she calls “normative realism”: “[I]t is helpful to think of intellectual goodness as having a realist orientation. It is only those who, in their knowing, strive to do justice to the object – to the world they want to know as well as possible – who can aspire to intellectual virtue … Intellectually virtuous persons value knowing and understanding how things really are” (59). To be intellectually virtuous on Code’s view is thus to regard reality as genuinely intellectually penetrable; it is to regard ourselves and others as having the ability to know and understand the world as it really is. It is also to view such knowledge as an important good, as worth having and pursuing.

Code also claims that the structure of the intellectual virtues and their role in the intellectual life are such that an adequate conception of these things is unlikely to be achieved via the standard methodologies of contemporary epistemology. She claims that an accurate and illuminating account of the intellectual virtues and their cognitive significance must draw on the resources of fiction (201) and often must be content with accurate generalizations rather than airtight technical definitions (254).

Because of its uniqueness on points of both content and method, Code’s suggested approach to epistemology is relatively unconcerned with traditional epistemological problems. But she sees this as an advantage. She believes that the scope of traditional epistemology is too narrow and that it overemphasizes the importance of analyzing abstract doxastic properties (for example, knowledge and justification) (253-54). Her view focuses alternatively on cognitive character in its own right, the role of choice in intellectual flourishing, the relation between moral and epistemic normativity, and the social and communal dimensions of the intellectual life. The result, she claims, is a richer and more “human” approach to epistemology.

A second contemporary philosopher to give considerable attention to the intellectual virtues understood as character traits is James Montmarquet. Montmarquet’s interest in these traits arises from a prior concern with moral responsibility (1993). He thinks that to make sense of certain instances of moral responsibility, an appeal must be made to a virtue-based conception of doxastic responsibility.

According to Montmarquet, the chief intellectual virtue is epistemic conscientiousness, which he characterizes as a desire to achieve the proper ends of the intellectual life, especially the desire for truth and the avoidance of error (21). Montmarquet’s “epistemic conscientiousness” bears a close resemblance to Code’s “epistemic responsibility.” But Montmarquet is quick to point out that a desire for truth is not sufficient for being fully intellectually virtuous and indeed is compatible with the possession of vices like intellectual dogmatism or fanaticism. He therefore supplements his account with three additional kinds of virtues that regulate this desire. The first are virtues of impartiality, which include “an openness to the ideas of others, the willingness to exchange ideas with and learn from them, the lack of jealousy and personal bias directed at their ideas, and the lively sense of one’s own fallibility” (23). A second set of virtues are those of intellectual sobriety: “These are the virtues of the sober-minded inquirer, as opposed to the ‘enthusiast’ who is disposed, out of sheer love of truth, discovery, and the excitement of new and unfamiliar ideas, to embrace what is not really warranted, even relative to the limits of his own evidence.” Finally, there are virtues of intellectual courage, which include “the willingness to conceive and examine alternatives to popularly held beliefs, perseverance in the face of opposition from others (until one is convinced that one is mistaken), and the determination required to see such a project through to completion” (23).

Montmarquet argues that the status of these traits as virtues cannot adequately be explained on account of their actual reliability or truth-conduciveness. He claims, first, that if we were to learn that, say, owing to the work of a Cartesian demon, the traits we presently regard as intellectual virtues actually lead us away from the truth and the traits we regard as intellectual vices lead us to the truth, we would not immediately revise our judgments about the worth or virtue of those epistemic agents we have known to possess the traits in question (for example, we would not then regard someone like Galileo as intellectually vicious) (20). Second, he points out that many of those we would regard as more or less equally intellectually virtuous (for example, Aristotle, Ptolemy, Galileo, Newton, and Einstein) were not equally successful at reaching the truth (21).

Montmarquet goes on to argue that the traits we presently regard as intellectual virtues merit this status because they are qualities that a truth-desiring person would want to have (30). The desire for truth therefore plays an important and basic normative role in Montmarquet’s account of intellectual virtue. The value or worth of this desire explains why the traits that emerge from it should be regarded as intellectual virtues.

Unlike Code, Montmarquet does not call for a reorientation of epistemology on the intellectual virtues. His concern is considerably narrower. He is interested mainly in cases in which an agent performs a morally wrong action which from her own point of view is morally justified. In some such cases, the person in question intuitively is morally responsible for her action. But this is possible, Montmarquet argues, only if we can hold the person responsible for the beliefs that permitted the action. He concludes that moral responsibility is sometimes grounded in doxastic responsibility.

Montmarquet appeals to the concept of an intellectual virtue when further clarifying the relevant sense of doxastic responsibility. He claims that in cases of the sort in question, a person can escape moral blame only if the beliefs that license her action are attributable to an exercise of intellectual virtue. Beliefs that satisfy this condition count as epistemically justified in a certain subjective sense (99). Thus, on Montmarquet’s view, the intellectual virtues are central to an account of doxastic responsibility which in turn is importantly related to the notion of moral responsibility.

Linda Zagzebski’s treatment of the intellectual virtues in her book Virtues of the Mind (1996) is one of the most thoroughly and systematically developed in the literature. Zagzebski is unquestionably a virtue responsibilist, as she clearly thinks of intellectual virtues as traits of character. That said, her view bears a notable resemblance to several virtue reliabilist views because its main component is a virtue-based account of knowledge.

Zagzebski begins this account with a detailed and systematic treatment of the structure of a virtue. She says that a virtue, whether moral or intellectual, is “a deep and enduring acquired excellence of a person” (137). She also claims that all virtues have two main components: a motivation component and a success component. Accordingly, to possess an intellectual virtue, a person must be motivated by and reliably successful at achieving certain intellectual ends. These ends are of two sorts (1999: 106). The first are ultimate or final intellectual ends like truth and understanding. Zagzebski’s account thus resembles both Code’s and Montmarquet’s, since she also views the intellectual virtues as arising fundamentally from a motivation or desire to achieve certain intellectual goods. The second set of ends consists of proximate or immediate ends that differ from virtue to virtue. The immediate end of intellectual courage, for instance, is to persist in a belief or inquiry in the face of pressure to give it up, while the immediate end of open-mindedness is to genuinely consider the merits of others’ views, even when they conflict with one’s own. Thus, on Zagzebski’s view, an intellectually courageous person, for instance, is motivated to persist in certain beliefs or inquiries out of a desire for truth and is reliably successful at doing so.

Zagzebski claims that knowledge is belief arising from “acts of intellectual virtue.” An “act of intellectual virtue” is an act that “gets everything right”: it involves having an intellectually virtuous motive, doing what an intellectually virtuous person would do in the situation, and reaching the truth as a result (1996: 270-71). One performs an act of fair-mindedness, for example, just in case one exhibits the motivational state characteristic of this virtue, does what a fair-minded person would do in the situation, and reaches the truth as a result. Knowledge is acquired when one forms a belief out of one or more acts of this sort.

As this characterization indicates, the justification or warrant condition on Zagzebski’s analysis of knowledge entails the truth condition, since part of what it is to perform an act of intellectual virtue is to reach the truth or to form a true belief, and to do so through certain virtuous motives and acts. This explains why Zagzebski characterizes knowledge simply as belief – rather than true belief – arising from acts of intellectual virtue.

Zagzebski claims that this tight connection between the warrant and truth conditions for knowledge makes her analysis immune to Gettier counterexamples (1996: 296-98). She characterizes Gettier cases as situations in which the connection between the warrant condition and truth condition for knowledge is severed by a stroke of bad luck and subsequently restored by a stroke of good luck. Suppose that during the middle of the day I look at the highly reliable clock in my office and find that it reads five minutes past 12. I form the belief that it is five past 12, and this belief is true. Unknown to me, however, the clock unexpectedly stopped exactly 12 hours prior, at 12:05 AM. My belief in this case is true, but only as a result of good luck. And this stroke of good luck cancels out an antecedent stroke of bad luck consisting in the fact that my ordinarily reliable clock has malfunctioned without my knowing it. While my belief is apparently both true and justified, it is not an instance of knowledge.

Zagzebski’s account of knowledge generates the intuitively correct conclusion in this and similar cases. My belief about the time, for instance, fails to satisfy her conditions for knowledge because what explains my reaching the truth is not any virtuous motive or activity on my part, but rather a stroke of good luck. Thus, by making it a necessary condition for knowledge that a person reach the truth through or because of virtuous motives and actions, Zagzebski apparently is able to rule out cases in which a person gets to the truth in the fortuitous manner characteristic of Gettier cases.

b. Prospects for Virtue Responsibilism

Virtue responsibilist views clearly are a diverse lot. This complicates any account of the apparent prospects of virtue responsibilism, since these prospects are likely to vary from one virtue responsibilist view to another. It does seem fairly clear, however, that as an analysis of knowledge or justification, virtue responsibilism faces a formidable difficulty. Any such analysis presumably will make something like an exercise of intellectual virtue a necessary condition either for knowledge or for justification. The problem with such a requirement is that knowledge and justification often are acquired in a more or less passive way, that is, in a way that makes few if any demands on the character of the cognitive agent in question. Suppose, for example, that I am working in my study late at night and the electricity suddenly shuts off, causing all the lights in the room to go out. I will immediately know that the lighting in the room has changed. Yet in acquiring this knowledge, it is extremely unlikely that I exercise any virtuous intellectual character traits; rather, my belief is likely to be produced primarily, if not entirely, by the routine operation of my faculty of vision. Given this and related possibilities, an exercise of intellectual virtue cannot be a necessary condition for knowledge or justification.

This point has obvious implications for a view like Zagzebski’s. In the case just noted, I do not exhibit any virtuous intellectual motives. Moreover, while I may not act differently than an intellectually virtuous person would in the circumstances, neither can I be said to act in a way that is characteristic of intellectual virtue. Finally, I get to the truth in this case, not as a result of virtuous motives or actions, but rather as a result of the more or less automatic operation of one of my cognitive faculties. Thus, on several points, my belief fails to satisfy Zagzebski’s requirements for knowledge.

This suggests that any remaining hope for virtue responsibilism must lie with views that do not attempt to offer a virtue-based analysis of knowledge or justification. But such views, which include the views of Code and Montmarquet, also face a serious and rather general challenge. Virtue epistemologists claim that virtue concepts deserve an important and fundamental role in epistemology. But once it is acknowledged that these concepts should not play a central role in an analysis of knowledge or justification, it becomes difficult to see how the virtue responsibilist’s claim about the epistemological importance of the intellectual virtues can be defended, for it is at best unclear whether there are any other traditional epistemological issues or questions that a consideration of intellectual virtue is likely to shed much light on. It is unclear, for instance, how reflection on the intellectual virtues as understood by virtue responsibilists could shed any significant light on questions about the possible limits or sources of knowledge.

Any viable version of virtue responsibilism must, then, do two things. First, it must show that there is a unified set of substantive philosophical issues and questions to be pursued in connection with the intellectual virtues and their role in the intellectual life. In the absence of such issues and questions, the philosophical significance of the intellectual virtues and the overall plausibility of virtue responsibilism itself remain questionable. Second, if these issues and questions are to form the basis of an alternative approach to epistemology, they must be the proper subject matter of epistemology itself, rather than of ethics or some other related discipline.

The views of Code and Montmarquet appear to falter with respect to either one or the other of these two conditions. Code, for instance, provides a convincing case for the claim that the possession of virtuous intellectual character is crucial to intellectual flourishing, especially when the more personal and social dimensions of intellectual flourishing are taken into account. But she fails to identify anything like a unified set of substantive philosophical issues and questions that might be pursued in connection with these traits. Nor is it obvious from her discussion what such questions and issues might be. This leaves the impression that while Code has identified an important insight about the value of the intellectual virtues, this insight does not have significant theoretical implications and therefore cannot successfully motivate anything like an alternative approach to epistemology.

Montmarquet, on the other hand, does identify several interesting philosophical questions related to intellectual virtue, for example, questions about the connection between moral and doxastic responsibility, the role of intellectual character in the kind of doxastic responsibility relevant to moral responsibility, and doxastic voluntarism as it relates to issues of moral and doxastic responsibility. The problem with Montmarquet’s view as a version of virtue responsibilism, however, is that the questions he identifies seem like the proper subject matter of ethics rather than epistemology. While he does offer a virtue-based conception of epistemic justification, he is quick to point out that this conception is not of the sort that typically interests epistemologists, but rather is aimed at illuminating one aspect of moral responsibility (1993: 104). Indeed, taken as an account of epistemic justification in any of the usual senses, Montmarquet’s view is obviously problematic, since it is possible to be justified in any of these senses without satisfying Montmarquet’s conditions, that is, without exercising any virtuous intellectual character traits. (This again is due to the fact that knowledge and justification are sometimes acquired in a more or less passive way.) Montmarquet’s view therefore apparently fails to satisfy the second of the two conditions noted above.

Jonathan Kvanvig (1992) offers a treatment of the intellectual virtues and their role in the intellectual life that comes closer than that of either Code or Montmarquet to showing that there are substantive questions concerning these traits that might reasonably be pursued by an epistemologist. Kvanvig maintains that the intellectual virtues should be the focus of epistemological inquiry but that this is impossible given the Cartesian structure and orientation of traditional epistemology. He therefore commends a radically different epistemological perspective, one that places fundamental importance on the social and cross-temporal dimensions of the cognitive life and gives a backseat to questions about the nature and limits of knowledge and justification.

While the majority of Kvanvig’s discussion is devoted to showing that the traditional framework of epistemology leaves little room for considerations of intellectual virtue (and hence that this framework should be abandoned), he does go some way toward sketching a theoretical program motivated by his proposed alternative perspective that allegedly would give the intellectual virtues a central role. One of the main themes of this program concerns how, over the course of a life, “one progresses down the path toward cognitive ideality.” Understanding this progression, Kvanvig claims, would require addressing issues related to “social patterns of mimicry and imitation,” cognitive exemplars, and “the importance of training and practice in learning how to search for the truth” (172). Another crucial issue on Kvanvig’s view concerns “accounting for the superiority from an epistemological point of view of certain communities and the bodies of knowledge they generate.” This might involve asking, for instance, “what makes physics better off than, say, astrology; or what makes scientific books, articles, addresses, or lectures somehow more respectable from an epistemological point of view than books, articles, addresses or lectures regarding astrology” (176). Kvanvig maintains that answers to these and related questions will give a crucial role to the intellectual virtues, as he, like Code, thinks that the success of a cognitive agent in the more social and diachronic dimensions of the cognitive life depends crucially on the extent to which the agent embodies these virtues (183).

Kvanvig’s discussion along these lines is suggestive and may indeed point in the direction of a plausible and innovative version of virtue responsibilism. But without seeing the issues and questions he touches on developed and addressed in considerably more detail, it is difficult to tell whether they really could support a genuine alternative approach to epistemology and whether the intellectual virtues would really be the main focus of such an approach. It follows that the viability of virtue responsibilism remains at least to some extent an open question. But if virtue responsibilism is viable, this apparently must be on account of approaches that are in the same general vein as Kvanvig’s, that is, approaches that attempt to stake out an area of inquiry regarding the nature and cognitive significance of the intellectual virtues that is at once philosophically substantial and the proper subject matter of epistemology.

4. The Reliabilist/Responsibilist Divide

Virtue reliabilists and virtue responsibilists appear to be advocating two fundamentally different and perhaps opposing kinds of epistemology. The former view certain cognitive faculties or powers as central to epistemology and the latter certain traits of intellectual character. The two approaches also sometimes differ about the proper aims or goals of epistemology: virtue reliabilists tend to uphold the importance of traditional epistemological projects like the analysis of knowledge, while some virtue responsibilists give priority to new and different epistemological concerns. The impression of a deep difference between virtue reliabilism and virtue responsibilism is reinforced by at least two additional considerations. First, by defining the notion of intellectual virtue in terms of intellectual character, virtue responsibilists seem to rule out ex hypothesi any significant role in their theories for the cognitive abilities that interest the virtue reliabilist. Second, some supporters of virtue reliabilism have claimed outright that the character traits of interest to the virtue responsibilist have little bearing on the questions that are most central to a virtue reliabilist epistemology (Goldman 1992: 162).

But the divide between virtue reliabilism and virtue responsibilism is not entirely what it seems. Minimally, the two approaches are not always incompatible. A virtue reliabilist, for instance, can hold that relative to questions concerning the nature of knowledge and justification, a faculty-based approach is most promising, while still maintaining that there are interesting and substantive epistemological questions (even if not of the traditional variety) to be pursued in connection with the character traits that interest the virtue responsibilist (see, for example, Greco 2002).

More importantly, there is a sense in which the very distinction between virtue reliabilism and virtue responsibilism is considerably sketchier than it initially appears. Virtue reliabilists conceive of intellectual virtues, broadly, as stable and reliable cognitive qualities. In developing their views, they go on to focus more or less exclusively on cognitive faculties or powers like introspection, vision, reason, and the like. To a certain extent, this approach is quite reasonable. After all, the virtue reliabilist is fundamentally concerned with those traits that explain one’s ability to get to the truth in a reliable way, and in many cases, all that is required for reaching the truth is the proper functioning of one’s cognitive faculties. For example, to reach the truth about the appearance of one’s immediate surroundings, one need only have good vision. Or to reach the truth about whether one is in pain, one need only be able to introspect. Therefore, as long as virtue reliabilists limit their attention to instances of knowledge like these, a more or less exclusive focus on cognitive faculties and related abilities seems warranted.

But reaching the truth often requires much more than the proper operation of one’s cognitive faculties. Indeed, reaching the truth about things that matter most to human beings—for example, matters of history, science, philosophy, religion, and morality—would seem frequently to depend more, or at least more saliently, on rather different qualities, many of which are excellences of intellectual character. An important scientific discovery, for example, is rarely explainable primarily in terms of a scientist’s good memory, excellent eyesight, or proficiency at drawing valid logical inferences. While these things may play a role in such an explanation, this role is likely to be secondary to the role played by other qualities, for instance, the scientist’s creativity, ingenuity, intellectual adaptability, thoroughness, persistence, courage, and so forth. And many of these are the very traits of interest to the virtue responsibilist.

It appears that since virtue reliabilists are principally interested in those traits that play a critical or salient role in helping a person reach the truth, they cannot reasonably neglect matters of intellectual character. They too should be concerned with better understanding the nature and intellectual significance of the character traits that interest the virtue responsibilist. Indeed, the most plausible version of virtue reliabilism will incorporate many of these traits into its repertoire of virtues and in doing so will go significant lengths toward bridging the gap between virtue reliabilism and virtue responsibilism.

5. References and Further Reading

  • Aristotle. 1985. Nicomachean Ethics, trans. Terence Irwin (Indianapolis: Hackett).
  • Axtell, Guy. 1997. “Recent Work in Virtue Epistemology,” American Philosophical Quarterly 34: 1-27.
  • Axtell, Guy, ed. 2000. Knowledge, Belief, and Character (Lanham, MD: Rowman & Littlefield).
  • BonJour, Laurence. 1995. “Sosa on Knowledge, Justification, and ‘Aptness’,” Philosophical Studies 78: 207-220. Reprinted in Axtell (2000).
  • Code, Lorraine. 1987. Epistemic Responsibility (Hanover, NH: University Press of New England).
  • Crisp, Roger and Michael Slote, eds. 1997. Virtue Ethics (Oxford: Oxford UP).
  • DePaul, Michael and Linda Zagzebski. 2003. Intellectual Virtue: Perspectives from Ethics and Epistemology (Oxford: Oxford UP).
  • Fairweather, Abrol and Linda Zagzebski. 2001. Virtue Epistemology: Essays on Epistemic Virtue and Responsibility (New York: Oxford UP).
  • Goldman, Alvin. 1992. “Epistemic Folkways and Scientific Epistemology,” Liaisons: Philosophy Meets the Cognitive and Social Sciences (Cambridge, MA: MIT Press).
  • Greco, John. 1992. “Virtue Epistemology,” A Companion to Epistemology, eds. Jonathan Dancy and Ernest Sosa (Oxford: Blackwell).
  • Greco, John. 1993. “Virtues and Vices of Virtue Epistemology,” Canadian Journal of Philosophy 23: 413-32.
  • Greco, John. 1999. “Agent Reliabilism,” Philosophical Perspectives 13, Epistemology, ed. James Tomberlin (Atascadero, CA: Ridgeview).
  • Greco, John. 2000. “Two Kinds of Intellectual Virtue,” Philosophy and Phenomenological Research 60: 179-84.
  • Greco, John. 2002. “Virtues in Epistemology,” Oxford Handbook of Epistemology, ed. Paul Moser (New York: Oxford UP).
  • Hookway, Christopher. 1994. “Cognitive Virtues and Epistemic Evaluations,” International Journal of Philosophical Studies 2: 211-27.
  • Kvanvig, Jonathan. 1992. The Intellectual Virtues and the Life of the Mind (Savage, MD: Rowman & Littlefield).
  • Montmarquet, James. 1992. “Epistemic Virtue,” A Companion to Epistemology, eds. Jonathan Dancy and Ernest Sosa (Oxford: Blackwell).
  • Montmarquet, James. 1993. Epistemic Virtue and Doxastic Responsibility (Lanham, MD: Rowman & Littlefield).
  • Plantinga, Alvin. 1993. Warrant and Proper Function (New York: Oxford UP).
  • Sosa, Ernest. 1980. “The Raft and the Pyramid: Coherence versus Foundations in the Theory of Knowledge,” Midwest Studies in Philosophy V: 3-25. Reprinted in Sosa (1991).
  • Sosa, Ernest. 1991. Knowledge in Perspective (Cambridge: Cambridge UP).
  • Steup, Matthias. 2001. Knowledge, Truth, and Duty (Oxford: Oxford UP).
  • Zagzebski, Linda. 1996. Virtues of the Mind (Cambridge: Cambridge UP).
  • Zagzebski, Linda. 1998. “Virtue Epistemology,” Encyclopedia of Philosophy, ed. Edward Craig (London: Routledge).
  • Zagzebski, Linda. 1999. “What Is Knowledge?” The Blackwell Guide to Epistemology, eds. John Greco and Ernest Sosa (Oxford: Blackwell).
  • Zagzebski, Linda. 2000. “From Reliabilism to Virtue Epistemology,” Axtell (2000).

 

Author Information

Jason S. Baehr
Email: Jbaehr@lmu.edu
Loyola Marymount University
U. S. A.

Paradox of Hedonism

Varieties of hedonism have been criticized from ancient to modern times. Along the way, philosophers have also considered the paradox of hedonism. The paradox is a common objection to hedonism, even if philosophers often do not give that specific name to the objection. According to the paradox of hedonism, the pursuit of pleasure is self-defeating. This article examines this objection. There are several ambiguities that surround the use of this paradox, so first, a condensed conceptual history of the paradox of hedonism is presented. Second, it is explained that prudential hedonism is the best target of the paradox, and this is made clear by considering different hedonistic theories and meanings of the word hedonism. Third, it is claimed that the overly conscious pursuit of pleasure, instead of other definitions that emerge from the literature, best captures the kind of pursuit that might generate paradoxical effects. Fourth, the implications of this definition for prudential hedonism are discussed. Fifth, different explanations of the paradox that can be traced in the literature are analyzed, and the incompetence account is identified as the most plausible. Sixth, the implications of the incompetence account for prudential hedonism are discussed. Finally, it is concluded that no version of the paradox provides a convincing objection against prudential hedonism.

Table of Contents

  1. Condensed Conceptual History
  2. Paradoxes of Hedonism
  3. Isolating the Paradox of Hedonism
  4. Defining the Paradox
    1. Definition: Conscious Pursuit of Pleasure
    2. Self-Defeatingness Objection: Conscious Pursuit of Pleasure
  5. Explanations of the Paradox
    1. Logical Paradoxes
    2. Incompetence Account
    3. Self-Defeatingness Objection: Incompetence
  6. Concluding Remarks
  7. References and Further Reading

1. Condensed Conceptual History

“I can’t get no satisfaction. ‘Cause I try, and I try, and I try, and I try.” (The Rolling Stones)

These lyrics evoke the so-called paradox of hedonism: the claim that the pursuit of pleasure leads to reduced pleasure. The worry the paradox generates for hedonistic theories is that they appear to be self-defeating. That is, if we pursue the goal of such a theory, we are less likely to achieve it. For example, Crisp states, “one will gain more enjoyment by trying to do something other than to enjoy oneself.” Veenhoven attests that this paradox strikes at the heart of hedonism. He argues that if hedonism does not bring pleasure in the end, the true hedonist should repudiate the theory. Eggleston adds that the paradox of hedonism seems to be an issue for hedonistic ethical theories, such as utilitarianism (for objections, see experience machine).

“The paradox of hedonism,” “the paradox of happiness,” “the pleasure paradox,” “the hedonistic paradox,” and so forth are a family of names given to the same paradox and are usually used interchangeably. From here on, I refer to the paradox of hedonism only, and I understand happiness as hedonists do: as interchangeable with pleasure.

Non-hedonistic accounts of happiness do not consider it a state of mind. Aristotle, for example, considers eudaimonia, sometimes translated as happiness, as an activity in accordance with virtue exercised over a lifetime and in the presence of sufficient external goods.

The word “hedonism” descends from the Ancient Greek for “pleasure.” Psychological hedonism holds that pleasure and pain are the only things that motivate us. In other words, we are motivated only by conscious or unconscious desires to experience pleasure or avoid pain. Ethical hedonism holds that only pleasure has value and only pain has disvalue.

The relation between the philosophical and non-philosophical uses of the word hedonism needs to be explained. The word hedonism is used differently in ordinary language from its use among philosophers. For a non-philosopher, a stereotypical hedonist is epitomized by the slogan “sex, drugs, and rock ‘n’ roll.” To “the folk,” a hedonist is a person who pursues pleasure shortsightedly, selfishly, or indecently, that is, without regard for her long-term pleasure, the pleasure of others, or socially appropriate conduct. Psychologists also sometimes use the word hedonism in the sense of folk hedonism.

That said, even within philosophy, the word hedonism can cause confusion. For instance, some consider hedonistic utilitarianism egoistic; some identify pleasure as necessarily non-social and purely physical. However, hedonism corresponds to a set of theories that attribute the primary role to pleasure. Ethical hedonism is the theory that identifies pleasure as the only ultimate value, not as an instrumental value or an ultimate value among several. The attainment of good through the so-called “base or disgusting pleasures” such as sex, drugs, and sadism; the indifference to long-term consequences, such as rejecting delayed gratification; and the disregard for others’ pleasure, such as taking pleasure at another’s expense, are features attached to folk hedonism but are not necessarily part of philosophical hedonism.

Prudential hedonism, the theory identified as the target of the paradox, is a kind of ethical hedonism concerning well-being. It is the claim that pleasure is the only ultimate good for an individual’s life and pain is the opposite. That is, the best life is the one with the most net pleasure, and the worst is the one with the most net pain. Net pleasure (or “pleasure minus pain”) is the result of a calculation in which dolors (units of pain) are subtracted from hedons (units of pleasure). Like ethical hedonism, prudential hedonism makes no claim about how pleasure should be pursued. In prudential hedonism, pleasure can be pursued in disparate ways, such as sensory gratification and ascetic spiritual practice; all strategies are good as long as they are successful. Prudential hedonism is also silent about the time span (immediate vs. future pleasure). Nor does prudential hedonism advise that pleasure should be pursued anti-socially. In short, prudential hedonism is not committed to any claims concerning pleasure’s source, its temporal location, or whether it can be generated by social behaviors.

According to Herman, the paradox of hedonism can be found in Aristotle. Aristotle claimed that pleasure represents the outcome of an activity by asking and answering the following question: why is it the case that no one is never-endingly pleased? Aristotle replied that human beings are unable to perpetually perform an activity. Therefore, pleasure cannot be perpetual because it derives from activity. On closer inspection, however, Aristotle’s argument does not seem to be the forerunner of the paradox. This argument does not tackle the issue of whether the pursuit of pleasure is self-defeating. Rather, Aristotle’s reflection concerns what causes pleasure (namely, activity) and the impossibility of perpetual pleasure.

Later, Butler elaborates an argument against psychological egoism, especially its hedonistic version, which can be considered the harbinger of the paradox, if not its first complete instantiation. Butler’s argument, called “Butler’s stone,” has been interpreted widely as refuting psychological hedonism. The claim is that the experience of pleasure upon the satisfaction of a desire presupposes a desire for something that is not pleasure itself. That is, it presupposes that people sometimes experience pleasure that can arise only from the satisfaction of a non-hedonistic desire. Therefore, psychological hedonism, the view that all desires are hedonistic, is false.

Austin attributes the first formulation of the paradox to J. S. Mill. After experiencing major depression in his early twenties, Mill states that happiness is not attainable directly and that happy people have their attention directed at something different than happiness. Later, Sidgwick coined the phrase “paradox of hedonism” while discussing egoistic hedonism. This form of ethical hedonism equates the moral good with the pleasure of the individual, so for Sidgwick, the overly conscious pursuit of pleasure is self-defeating because it promotes pleasure-seeking in a way that results in diminished pleasure.

2. Paradoxes of Hedonism

It is questionable to consider the paradox of folk hedonism a paradox, even in the sense of empirical irony. To be empirically ironic, the paradox should involve the psychological truth of a seemingly absurd claim. Common sense holds that certain ways to pursue pleasure, such as committing crimes to finance heroin addiction, are ineffective. Since common sense holds that folk hedonism does not lead to happiness, this “paradox” lacks the counter-intuitiveness required to be labeled as such. Furthermore, if we consider that the focus of folk hedonism is short-term gains, it is not paradoxical. For example, suppose Suzy consumes cocaine during a party; this means she has reached her aim. Suzy may encounter future displeasure, perhaps from addiction, but as a folk hedonist, Suzy does not have to care about her future self. So, neither common folk nor folk hedonists should be surprised that folk hedonism is a bad strategy for maximizing pleasure over a lifetime.

Psychological hedonism is the view that conscious or unconscious intrinsic desires are exclusively oriented towards pleasure. Individuals hold a particular desire because they believe that satisfying it will bring them pleasure. For example, Jane desires to do gardening because she believes that gardening will increase her pleasure. The paradox of psychological hedonism consists in the claim that, given the way our motivational system functions, we get less pleasure than we would have if our motivational system worked differently, specifically if it allowed things other than pleasure to motivate us. On the one hand, if psychological hedonism is a true description of our motivational system, the paradox would have no prescriptive value because it advises us to do something impossible, at least until it becomes possible to alter our motivational system. The paradox of psychological hedonism can then be seen as advice to stop being human. On the other hand, if psychological hedonism is not a true description of our motivational system, then we do not need to worry about the paradox at all. Considering this, it appears that this version of the paradox of hedonism is not particularly useful.

It seems that the above-explained ways of understanding the paradox do not capture the core idea. The paradox of folk hedonism is not counter-intuitive enough to be a paradox. For short-term gains, folk hedonism does not seem to backfire. However, any wisdom that resides in the paradox of folk hedonism collapses into the incompetence account analyzed below. Furthermore, the paradox of psychological hedonism is a descriptive claim that does not generate any useful advice.

The paradox of prudential hedonism best captures the heart of the expression “the paradox of hedonism.” (1) It is prescriptive. That is, if you do x, the result will be y, which is bad. (2) It is counter-intuitive. That is, if you try to maximize your life’s net pleasure, you end up with less. The apparent absurdity in this claim is a necessary condition for a paradox. For instance, imagine telling a musician that if she aims to produce beautiful music, she will end up producing unpleasant noises. Or consider advising a student not to study hard because aiming for good grades will be counter-productive. These ways of talking are nonsensical. Common sense tells us that if you aim at something, you will be more likely to get it.

(1) and (2) also apply to the paradox of egoistic hedonism. Consider the similarities and differences between these theories. Both egoistic hedonism and prudential hedonism are normative theories holding that one should pursue pleasure. Yet, prudential hedonism is a theory of well-being or self-interested rationality, while egoistic hedonism is a theory of morality. According to prudential hedonism, it is rational in terms of self-interest to pursue pleasure. In contrast, according to egoistic hedonism, it is a moral obligation to pursue pleasure. Given that, it becomes apparent why prudential hedonism is the best candidate for the most refined version of the paradox of hedonism. The paradox, in fact, questions whether hedonism is rational, not whether it is moral. In other words, the claim of the paradox concerns the idiocy of pursuing pleasure, not its moral blameworthiness. For these reasons, this article focuses on the paradox of prudential hedonism.

3. Isolating the Paradox of Hedonism

This article is restricted to the common understanding of the paradox, which refers to the pursuit of pleasure and does not cast light on avoiding displeasure. The points made here may not apply to both. Further research is required to understand to what extent, if any, these processes overlap. For example, it might be claimed that happiness is a mirage. Such a claim would not imply that minimizing suffering is unrealizable too. For example, a pessimist such as Schopenhauer advised avoiding suffering instead of pursuing happiness. According to him, if you keep your expectations low, you will have the most bearable life. Therefore, further research is needed to understand to what extent the reflection on the paradox of hedonism applies to the paradox of negative hedonism, the claim that the avoidance of displeasure is self-defeating. This distinction might have an important implication for prudential hedonism. If the pursuit of pleasure is paradoxical but the avoidance of displeasure is not, prudential hedonism is safe from the objection of self-defeatingness. Prudential hedonists would have to pursue the good life by minimizing displeasure rather than by maximizing pleasure.

Affects can alter decision-making, but this mechanism should be excluded from the most refined version of the paradox of hedonism. The relevant mechanism is the opposite one: decisions altering affects. The paradox is usually thought to be concerned with the relationship between pursuing pleasure and getting it, not with the relation between being pleased and its continuation. A related popular belief consists in the claim that happiness necessarily collapses into boredom. However, this cultural belief seems questionable. Certainly, some pleasures can lead to temporary satiation and loss of interest, but failing to practice these pleasures in rotation is a case of incompetence in the pursuit of pleasure. This phenomenon does not imply that pleasant states necessarily impair themselves. Relatedly, Timmermann’s “new paradox of hedonism” is based on the claim that “there can be cases in which we reject pleasure because there is too much of it.” Timmermann denies that his paradox descends from temporary satiation. However, Feldman shows that Timmermann’s new paradox of hedonism is nothing new and is based on a conflation of ethical hedonism with psychological hedonism. The psychological mechanism according to which we reject pleasure may threaten the claim that our motivation is only directed at pleasure but does not affect the claim that pleasure is good. Timmermann’s new paradox of hedonism is not a problem for prudential hedonism.

Another clarification restricts the paradox of hedonism to the mechanism that concerns decision-making and expected pleasure. In other words, the possible cases where prudential hedonism defeats itself momentarily are not included in the most refined understanding of the paradox. According to the paradox of hedonism, the agent’s decision to maximize pleasure does not optimize it in the long term. A different mechanism involves decision-making and immediately experienced pleasure or pain. Since empirical evidence supports the view that decision-making involves immediate pleasure and pain, we should consider the paradox to refer only to the paradoxical effects concerning expected utility.

Following Moore, the paradox of hedonism is distinct from the weakness of will, which occurs when a subject acts freely and intentionally but contrary to their better judgment. Consider the following example: Imagine that after years of studying philosophy, Bill concludes that prudential hedonism is true. Meanwhile, he cannot implement any change directed at his neurotic personality. Bill is an unhappy prudential hedonist, exhibiting the weakness of will. Indeed, empirical evidence suggests that when we imagine what will make us happier, we fail to be consistent with the plans that rationally follow from it. For example, people who know that flow activities facilitate happiness end up over-practicing passive leisure and underutilizing active-leisure activities that could elicit periods of flow. Nevertheless, considering that the paradox of hedonism is the pursuit of pleasure resulting in less pleasure, cases of the weakness of will are not included in the refined version of the paradox because the pursuit of pleasure is missing. Cases of the weakness of will do not represent prudential hedonism’s paradoxical effects unless the belief in the truth of prudential hedonism somehow disposes people to the weakness of will more than other beliefs do. Thus, the refined version of the paradox of hedonism excludes the paradox of negative hedonism, pleasure impairing its continuation, momentary self-defeatingness, and the weakness of will.

4. Defining the Paradox

The direct pursuit of pleasure is frequently used to express the paradox of hedonism, but how is it different from the indirect pursuit of pleasure? Imagine taking an opioid. The opiates travel through the bloodstream into the brain and attach to particular proteins, the mu-opioid receptors located on the surfaces of opiate-sensitive neurons. The union of these chemicals and the mu-opioid receptors starts the biochemical brain processes that make subjects experience feelings of pleasure. Taking an opioid seems to be the most direct way to pursue pleasure, but notice that several steps are still required, for instance, owning enough money and acquiring and taking the drug. Consequently, our pursuit of pleasure is always indirect in the sense that various actions mediate it. Thus, it seems that we cannot substantially regulate our hedonic experience at will.

However, even if the direct pursuit of pleasure is impossible, it is still possible for the pursuit of pleasure to be more or less direct. Imagine the directness of the pursuit as a spectrum where the action of consuming a psychoactive substance stands on the far right and less controversial activities, such as going to a party, stand on the left. These activities on the left also include a wide range of more or less direct paths to the goal of pleasure. For example, diving into a pool on a hot day seems to be a shortcut to pleasure compared to the challenges of studying hard and eventually securing a fulfilling job. The issue seems to lie in how long one has to wait for pleasure. Incorporating this more plausible spectrum of directness into the paradox, we get the claim that the direct pursuit of pleasure results in less pleasure. However, this formulation seems empirically questionable. Unless one endorses some form of asceticism, it does not seem that pleasure simply depends on always choosing the long and hard route. Sometimes, as for pool-owners on a very hot day, the highly direct pursuit seems to produce more net pleasure in addition to immediate pleasure. So, it seems that not all forms of the direct pursuit of pleasure uniformly generate paradoxical effects.

The formulation of the paradox as a consequence of holding pleasure as the only intrinsically valuable end seems poorly descriptive. This expression corresponds to broader definitions of prudential hedonism. By definition, every prudential hedonist considers pleasure as the ultimate goal, the intrinsic good, the sole ultimately valuable end, etc. According to this interpretation, the belief in the truth of prudential hedonism is itself the mental state that generates paradoxical effects. However, it seems more useful to identify the mental state that descends from the belief in the truth of prudential hedonism (for example, the conscious pursuit of pleasure) and that determines the paradox. In other words, the expressions at stake do not seem descriptive because it seems that the paradoxical mental state is not a philosophical belief but another mental state or behavior that the philosophical belief might be determining.

As recognized by Dietz, the definition of the paradox that emerges from holding pleasure as the only intrinsic desire configures the paradox of hedonism as a symptom of a paradox of desire-satisfaction. If we only desire desire-satisfaction, we are stuck. In Dietz’s view, this paradox threatens all theories of well-being that value satisfaction of a subject’s desires, primarily desire-satisfactionism, which is one of the main rivals of hedonism as a theory of prudence. That said, this article is silent about the plausibility of the paradox of desire-satisfaction. Nevertheless, the paradox of desire-satisfaction needs a further step to affect prudential hedonism, namely rational desire: the view that there is a rational connection between our evaluative beliefs and desires. Contrary to rational desire, Blake writes that being a hedonist does not commit one to consider pleasure as the only desire. Even if rational desire is true, this mechanism concerns ideal agents. We seem to consider things good without desiring them or desire things without considering them good.

What of the intentional pursuit of pleasure? Kant’s use of the adverb “purposely” seems to be a synonym of “intentionally.” Notice that philosophers distinguish between prior intention and intention in action, corresponding to action-initiation and action-control. Given that, the conscious pursuit of pleasure, analyzed below, appears more precise by pointing only to the paradoxical mechanism of action control.

a. Definition: Conscious Pursuit of Pleasure

All things considered, the conscious pursuit of pleasure seems to be the most appropriate definition. The conscious pursuit of pleasure can be understood as the pursuit that holds pleasure in the mind’s eye. Pleasure is kept in mind by the agent as her regulative objective. This is a case of “indirect self-defeatingness,” which occurs when the counter-productive effects of a theory are caused by conscious efforts to comply with it. Among different passages, Sidgwick advances this interpretation when he writes: “Happiness is likely to be better attained if the extent to which we set ourselves consciously to aim at it be carefully restricted.” Which share of our conscious awareness should the pursuit of pleasure occupy? Or how often should we perform a conscious recollection of the goal of pleasure? Perhaps the wisdom underlying the paradox of hedonism can be found in answers to these questions: the paradox should be regarded as advice against focusing too much on hedonic maximization.

The strategy of never being conscious of the goal of pleasure also seems irrational, especially when considering normative theories of instrumental rationality. Calculating the best means to any given end is assumed to be more effective than relying on chance to secure the end. It does not seem wise to never think of the outcome we aim for. Sometimes, we need to remember why we are acting, even in the broad sense of directing or sustaining our attention. To never aim at happiness and yet still achieve it is a case of serendipity. In fact, it is possible to find x when looking for y, such as finding pleasure while pursuing a life of moral or intellectual perfection. Still, if you enter a supermarket to buy peanuts, looking for toothpaste does not seem to be the most rational strategy; however, if you are taking a walk, you might find peanuts.

Mill goes further by trying to identify why pursuing pleasure too consciously may be ineffective. He claims that allowing pleasure to occupy our internal discourse brings about an excessive critical scrutiny of pleasures. Similarly, Sidgwick seems to have identified one paradoxical mechanism of a too conscious pursuit of pleasure when warning about the risks of pleasure’s meta-awareness. The empirical literature from the first two decades of the 21st century on the paradoxical effects of pursuing pleasure suggests that monitoring one’s hedonic experience can negatively interfere with one’s hedonic state (Zerwas and Ford).

Concerning empirical evidence on the conscious pursuit of pleasure, Schooler and colleagues instructed participants to up-regulate pleasure while listening to a hedonically ambiguous melody, while the control group was only required to listen to the melody. Subsequent experimental studies by Mauss and colleagues employed a similar methodology by making participants watch a happy film clip. This research has investigated the effects of consciously attempting to up-regulate pleasure during a neutral or pleasant experience. Importantly, these studies support the claim that the overly conscious pursuit of pleasure is paradoxical. In fact, the inductions of the experiments, in which subjects were asked to up-regulate their hedonic experience, caused the participants to consciously pursue pleasure and fail. Given the points above, it seems that the most sensible definition of the paradox of hedonism consists of the claim that the overly conscious pursuit of pleasure is self-defeating. According to Wild, it seems that hedonism’s paradox constitutes advice to maximize pleasure by temporarily forgetting about it. It is self-defeating to fix attention on pleasure too often.

Concerning the paradoxical conscious pursuit of pleasure, the strategy reported by Arnold, which aims to recast the paradox as a logical argument, does not seem successful. The argument is supposed to work this way: the pleasure kept in view (so that it can be sought) must be an idea. An idea is no longer a feeling, and the intellectual nature of ideas prevents them from being pleasurable. However, as claimed by Arnold, one of the fallacies of this argument lies in a false conception of the function of logical constructions: a hedonist aims at pleasant states, not at the idea of such states. The idea of pleasure is just a signpost, a concept that is supposed to lead to pleasure-producing choices. If keeping the signpost of pleasure in mind impairs one’s ability to experience pleasure, this seems to be an empirical claim rather than a logical necessity. Therefore, the excessively conscious pursuit seems best understood as an empirical rather than a logical paradox because the attempt to make it a logical paradox fails. Following Singer, the paradox of hedonism does not seem like a paradox in the sense of a logical contradiction; instead, it seems to represent a psychological incongruity or empirical irony about the process of pleasure-seeking.

b. Self-Defeatingness Objection: Conscious Pursuit of Pleasure

The version of the paradox identified above gives no reason to think that prudential hedonism is theoretically weakened by it. As Eggleston claims, the paradox of hedonism might turn out to be an interesting psychological mechanism with no philosophical implications. Mill seems to support this conclusion when he starts his exposition of the paradox by saying that he is not questioning the prudential primacy of pleasure.

In fact, a theory that (1) considers pleasure to be the only intrinsic prudential good is not necessarily doomed to be internally inconsistent just because it (2) acknowledges that we should forget about pleasure at some points. (1) is a claim of theoretical reason, the kind of reason concerned with the truth of propositions; (2) is a claim of practical reason, which concerns the value of actions. The former addresses beliefs, and the latter addresses intentions. Since prudential hedonism advises the maximization of pleasure, it also advises that the agent instrumentally shape the pursuit in whatever way is most effective. As Sidgwick claims, the paradox of hedonism does not seem to cause any practical problem once the possibility of it has been acknowledged. As advanced by Sheldon, pleasure can be a by-product of states that require us not to pursue pleasure overly consciously. So, pleasure may be the reason to sometimes forget about pleasure. These recommendations on how to avoid the paradox show this version of the paradox of hedonism to be a contingent practical problem for prudential hedonism, but one that can be avoided. To sum up, considering the best definition of the paradox, the argument based on the paradox does not constitute a valid objection to prudential hedonism.

5. Explanations of the Paradox

Based on Butler’s reflections, Dietz discusses an older explanation of the paradox of hedonism that takes the paradox to arise from pleasure itself and its relation to the satisfaction of desire. This explanation and Dietz’s spin on it, the evidentialist account, are supposed to represent logical paradoxes. The evidentialist account relies on a desire-belief condition for pleasure and on evidentialism. The desire-belief condition claims that pleasure requires the subject to believe she is getting what she wants. This account is based on Heathwood’s view that pleasure consists in having an intrinsic desire for some state of affairs and believing that this state of affairs is the case. Evidentialism is an epistemological theory according to which a rational agent will hold beliefs only if they are justified by the evidence. This theory is supposed to dictate the rules for the formation of the belief about whether the desire is satisfied.

According to this account, prudential hedonism is self-defeating if the subject is epistemically rational and not deceived. Unlike the incompetence account, which arises from our irrationality and lack of self-knowledge, the evidentialist account arises for ideal agents. According to Dietz, if we suppose that I will experience pleasure only if I believe in my own pleasure and that I am going to be rational and well-informed, there will be no option for me to find independent support for this belief; thus, I will not be able to form such a belief, and I will never experience pleasure. In other words, as an evidentialist, I will only believe what I have good evidence to believe. To be pleased, I have to believe I am pleased. But, to believe I am pleased, I need good evidence that I am pleased. Unfortunately, the only evidence of my pleasure is the belief that I am pleased. So, no pleasure beliefs ever get off the ground because the evidence is tightly circular and therefore not compelling. The underlying reasoning of the evidentialist account has the same structure as Cave’s placebo paradox. Cave imagines a sick person who receives a placebo. This person will regain his health only if he believes that he will regain his health. Similarly, a hedonist, following this account, will be pleased only if he believes that he is or will be pleased. But if the sick person is rational, he will only have the belief that he will regain his health if he has solid evidence that this is the case. Likewise, if a hedonist holds that his pleasure itself is the only thing in which he will take pleasure, the belief that he will experience pleasure is not independently true, and if he is rational, he cannot form this belief.

a. Logical Paradoxes

Butler’s account, which is based on the view that pleasure consists in the satisfaction of non-hedonistic desires (desires for anything but pleasure), is implausible. For instance, we can take delight in pleasure itself and not only in gratifying non-hedonistic desires. The concepts of meta-emotions (emotions about emotions) and meta-moods (moods about moods) have been adopted and explored by researchers within both philosophy and psychology. It is possible to feel, for example, content about being relaxed, hopeful about being relieved, and grateful about being euphoric (positive-positive secondary emotions). These are counter-examples to Butler’s account because they involve feeling good about feeling good, precisely what is supposed to be impossible in Butler’s view. Thus, Butler’s theory of pleasure seems implausible.

Dietz’s evidentialist account is weakened by the fact that it concerns ideal agents: given that human beings are not the ideal agents this account presupposes, it has scant practical utility. The evidentialist account also assumes a questionable theory of pleasure (see Katz for problems in desire-based theories of pleasure). For example, pleasant surprises constitute prima facie counter-examples to holding desire-satisfaction necessary for pleasure. Also, solid neuroscientific evidence confutes the reduction of pleasure to desire.

To summarize, Butler’s and the evidentialist’s accounts do not seem reliable explanations of the paradox of hedonism because they are built on implausible theories of pleasure.

b. Incompetence Account

A closer inspection reveals that the special goods account collapses into the incompetence account, the belief that by atomistically pursuing our pleasure, we will maximize it. The special goods account collapses into the incompetence account because pursuing pleasure atomistically seems a fallacious strategy in terms of self-interest. Accordingly, if only individuals were well-informed about what leads to pleasure, they would cultivate special goods as means to pleasure.

Having rejected Butler’s and the evidentialist’s accounts and reduced the special goods account to the incompetence account, we are left with the “overly conscious” definition versions of the paradox. The incompetence account claims that we are so prone to making mistakes in pursuing pleasure, that by not aiming at it we are more successful in securing it. Following Haybron, much empirical evidence has been amassed on the ways in which humans are likely to make errors in pursuing their interests, including happiness. We possess compelling empirical evidence confirming that individuals are systematically unskillful at forecasting what will bring them pleasure. Individuals seem to suffer from several cognitive biases that undermine their capacity to elaborate accurate predictions about what will please them. This inability to make accurate predictions about the affective impact of future events might be problematic for prudential hedonism, especially what Sidgwick calls the “empirical-reflective method.” The empirical-reflective method consists of: (1) forecasting the affects resulting from different lines of conduct; (2) evaluating, considering probabilities, which affects are preferable; (3) undertaking the matching line of conduct. As Sidgwick already recognized, to imagine future pleasures and pains, sub (1), is an unreliable operation, so our confidence in the empirical-reflective method should be restricted.
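The three steps of the empirical-reflective method can be pictured as a small expected-value calculation. The following is only a minimal sketch under stated assumptions, not Sidgwick’s own formalism: the names Outcome, expected_affect and choose_conduct, the two career options, and every number are hypothetical and chosen purely for illustration. The second table of "actual" affects is likewise invented, to show how an error at step (1) can propagate to the final choice.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    probability: float  # estimated chance that this outcome occurs
    affect: float       # pleasure (+) or pain (-), in arbitrary units


def expected_affect(outcomes):
    """Step 2: weigh each affect by its probability and sum."""
    return sum(o.probability * o.affect for o in outcomes)


def choose_conduct(options):
    """Step 3: pick the line of conduct with the highest expected affect."""
    return max(options, key=lambda name: expected_affect(options[name]))


# Step 1: hypothetical forecasts of the affects of two lines of conduct.
forecast = {
    "academic career": [Outcome(0.6, 5.0), Outcome(0.4, -2.0)],
    "finance career": [Outcome(0.8, 4.0), Outcome(0.2, -1.0)],
}

# Hypothetical "actual" affects: the forecast overestimated how pleasant
# the good outcome of the finance career would be (4.0 instead of 2.0).
actual = {
    "academic career": [Outcome(0.6, 5.0), Outcome(0.4, -2.0)],
    "finance career": [Outcome(0.8, 2.0), Outcome(0.2, -1.0)],
}

chosen = choose_conduct(forecast)      # what the method recommends
best_in_fact = choose_conduct(actual)  # what would in fact have been best
```

In this toy case the method recommends the finance career on the basis of the forecast, while the academic career would in fact have yielded more expected pleasure. The procedure itself is followed correctly; the failure lies entirely in the unreliable forecast at step (1), which is the point Sidgwick presses.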

Kant seems to have explained the paradox of hedonism similarly. Incidentally, for him, morality must always be given normative priority over happiness. The moral person acts to obey the moral law irrespectively of what might be prudentially good. He claims we do not have an accurate idea of what will make us happy. According to him, pursuing wealth can generate troubles and anxiety, pursuing knowledge can determine a sense of tragedy, pursuing health can highlight the pains of ill health in advanced age, and so forth.

Kant’s understanding of the paradox seems to rely on the incompetence account and especially on the failures of affective forecasting. Many life-defining choices are based on affective forecasts. Should you get married? Have children? Pursue a career as an academic or as a financer? These important decisions are heavily influenced by forecasts about how the different scenarios will make you feel.

Consequently, the aforementioned line of empirical research shows that, in pursuing pleasure, we are not rational agents: we make mistakes, and we can fail miserably. Perhaps this is not surprising. Who has not at some time chosen a job, holiday, partner, etc., only to find out that the choice did not bring nearly as much pleasure as we had expected?

To summarize, in this section, we explored affective forecasting failures as examples of our ineptitude in pursuing pleasure. Given this evidence of human incompetence in the pursuit of pleasure, it seems we lack the skills and knowledge required to effectively grasp and sustain this elusive feeling. This weakness in our psychology seems a plausible cause of the paradox, a case Parfit labels as “direct self-defeatingness” when the counter-productive effects of a theory are caused by compliance with it. Pity that we are so inept in our pursuit of pleasure that pursuing it destines us to fail, and perhaps fail so catastrophically that we might find ourselves less pleased than when we started.

c. Self-Defeatingness Objection: Incompetence

Having identified a plausible causal relation underpinning the paradox, this section explores whether the incompetence account represents a theoretical issue for prudential hedonism. Recall that according to the argument based on the paradox of hedonism, prudential hedonism is a self-defeating theory.

Parfit elaborates on self-interest theory (the name under which he includes several theories of well-being) and the problem of self-defeatingness. For Parfit, a self-defeating theory “fails even in its own terms. And thus condemns itself.” Furthermore, the incompetence account corresponds to a peculiar category of self-defeatingness, a category that Parfit considers very unproblematic. In fact, in setting the boundaries of his study, he excludes cases where the paradoxical effects are mistakenly caused by what the agent does. For Parfit, incompetence is not a legitimate objection to a theory because the fault is not in the theory but in the agent.

Once again, as in the “overly conscious” definition of the paradox, the incompetence account can be seen as a practical problem that does not affect prudential hedonism as a theory. The possible practical self-defeatingness of prudential hedonism does not disprove any of prudential hedonism’s claims. Our incompetence in pursuing pleasure does not affect the validity of a theory that holds pleasure as the ultimate prudential good. If the paradox of hedonism emerges merely because of some contingent mechanisms in our psychology, prudential hedonists have no reason to reject the theory.

6. Concluding Remarks

This article analyzed the paradox of hedonism, which is the objection that prudential hedonism is self-defeating. First, the most plausible definition of the paradox was pointed out. The overly conscious pursuit of pleasure was identified as the behavior that might produce paradoxical effects in a hedonistic prudential agent. This constitutes a plausible case of prudential hedonism’s indirect self-defeatingness when the conscious effort to comply with the theory defeats its aims. Secondly, the explanations of different versions of the paradox identifiable in the literature were assessed. The incompetence account emerged as a plausible causal mechanism behind the paradox of hedonism. This is a case of prudential hedonism’s direct self-defeatingness when acting in accordance with the theory defeats its aims. However, both versions of the paradox end up being contingent on psychological mechanisms. The possible practical problems that were identified, overly conscious and incompetent pursuits of pleasure, do not theoretically affect the plausibility of prudential hedonism, which concerns prudential value and not practical rationality. Moreover, both problems seem avoidable. In practice, prudential hedonism does not seem to imply a necessarily self-defeating pursuit.

7. References and Further Reading

  • Aristotle. (1975). Nicomachean ethics, In H. Rackham (Transl.), Aristotle in 23 Volumes, Vol 19. Harvard University Press.
  • Arnold, F. (1906). The so-called hedonist paradox. International Journal of Ethics, 16(2), 228–234.
  • Austin, L. (2009). John Stuart Mill, the Autobiography, and the paradox of happiness. World Picture, 3, http://www.worldpicturejournal.com/WP_3/Austin.html
  • Besser, L. L. (2021). The philosophy of happiness: An interdisciplinary introduction. Routledge.
  • Blackburn, S. (2016). Hedonism, paradox of. In The Oxford dictionary of philosophy. Oxford University Press.
  • Blake, R. M. (1926). Why not hedonism? A protest. The International Journal of Ethics, 37(1), 1–18.
  • Butler, J. (1991). Fifteen sermons preached at the Rolls Chapel. In D. D. Raphael (Ed.), British Moralists, 1650–1800, 374–435. Hackett.
  • Cave, P. (2001). Too self-fulfilling. Analysis, 61(270), 141–146.
  • Crisp, R. (2001). Well-being. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Crisp, R. (2006). Hedonism reconsidered. Philosophy and Phenomenological Research, 73(3), 619–645.
  • Dietz, A. (2019). Explaining the paradox of hedonism. Australasian Journal of Philosophy, 97(3), 497–510.
  • Dietz, A. (2021). How to use the paradox of hedonism. Journal of Moral Philosophy, 18(4), 387–411.
  • Eggleston, B. (2013). Paradox of happiness. International Encyclopedia of Ethics, 3794–3799. Wiley-Blackwell.
  • Feldman, F. (2006). Timmermann’s new paradox of hedonism: Neither new nor paradoxical. Analysis, 66(1), 76–82.
  • Haybron, D. M. (2011). Happiness. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Heathwood, C. (2006). Desire satisfactionism and hedonism. Philosophical Studies, 128(3), 539–63.
  • Herman, A. L. (1980). Ah, but there is a paradox of desire in Buddhism—A reply to Wayne Alt. Philosophy East and West, 30(4), 529–532.
  • Hewitt, S. (2010). What do our intuitions about the experience machine really tell us about hedonism? Philosophical Studies, 151(3), 331–349.
  • Kant, I. (1996). Practical Philosophy. M. J. Gregor (Ed.), Cambridge University Press.
  • Martin, M. W. (2008). Paradoxes of happiness. Journal of Happiness Studies, 9(2), 171–184.
  • Mauss, I. B., et al. (2011). Can seeking happiness make people unhappy? Paradoxical effects of valuing happiness. Emotion, 11(4), 807–815.
  • Mill, J. S. (1924). Autobiography. Columbia University Press.
  • Moore, A. (2004). Hedonism. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Parfit, D. (1986). Reasons and Persons. Oxford University Press.
  • Schooler, J. W., et al. (2003). The pursuit and assessment of happiness can be self-defeating. In J. C. I. Brocas (Ed.), The Psychology of Economic Decisions, 41–70. Oxford University Press.
  • Sheldon, W. H. (1950). The absolute truth of hedonism. The Journal of Philosophy, 47(10), 285–304.
  • Sidgwick, H. (1907). The methods of ethics. Macmillan & Co. https://www.gutenberg.org/files/46743/46743-h/46743-h.htm
  • Silverstein, M. (2000). In defense of happiness. Social Theory and Practice, 26(2), 279–300.
  • Singer, P. (2011). Practical Ethics. Cambridge University Press.
  • Stocker, M. (1976). The schizophrenia of modern ethical theories. The Journal of Philosophy, 73(14), 453–466.
  • Timmermann, J. (2005). Too much of a good thing? Another paradox of hedonism. Analysis, 65(2), 144–146.
  • Veenhoven, R. (2003). Hedonism and happiness. Journal of Happiness Studies, 4(4), 437–457.
  • Wild, J. (1927). The resurrection of hedonism. International Journal of Ethics, 38(1), 11–26.
  • Zerwas, F. K. and Ford, B. Q. (2021). The paradox of pursuing happiness. Current Opinion in Behavioral Sciences, 39, 106–112.


Author Information

Lorenzo Buscicchi
Email: lorenzobuscicchi@hotmail.it
The University of Waikato
New Zealand

Certainty

The following article provides an overview of the philosophical debate surrounding certainty. It does so in light of distinctions that can be drawn between objective, psychological, and epistemic certainty. Certainty consists of a valuable cognitive standing, which is often seen as an ideal. It is indeed natural to evaluate lesser cognitive standings, in particular beliefs and opinions, in light of one’s intuitions regarding what is certain. Providing an account of what certainty is has however proven extremely difficult; in one part because certainty comes in varieties that may easily be conflated, and in another part because of looming skeptical challenges.

Is certainty possible in the domain of the contingent? Or is it restricted, as Plato and Aristotle thought, to the realm of essential truths? The answer to this question depends heavily on whether or not a distinction can be drawn between the notion of objective certainty and the notion of epistemic certainty. How are we to characterize the epistemic position of a subject for whom a particular proposition is certain? Intuitively, if a proposition is epistemically certain for a subject, that subject is entitled to be psychologically certain of that proposition. Yet, as outlined by philosophers such as Unger, depending on how psychological certainty is conceived of, skeptical implications loom: it is not clear that a subject can ever be entitled to be psychologically certain of a proposition. Generally, it has proven challenging to articulate a notion of epistemic certainty that preserves core intuitions regarding what one is entitled to think and regarding what characterizes, psychologically, the attitude of certainty.

Table of Contents

  1. Varieties of Certainty
    1. Objective, Epistemic and Psychological Certainty
    2. Certainty and Knowledge
  2. Psychological Certainty
    1. Certainty and Belief
    2. A Feeling of Conviction
    3. The Operational Model
  3. Epistemic Certainty
    1. The Problem of Epistemic Certainty
    2. Skeptical Theories of Epistemic Certainty
      1. Radical Infallibilism
      2. Invariantist Maximalism
      3. Classical Infallibilism
      4. A Worry for Skeptical Theories of Certainty
    3. Moderate Theories of Epistemic Certainty
      1. Moderate Infallibilism
      2. Fallibilism
      3. Epistemic Immunity and Incorrigibility
    4. Weak Theories of Epistemic Certainty
      1. The Relativity of Certainty
      2. Contextualism
      3. Pragmatic Encroachment
  4. Connections to Other Topics in Epistemology
  5. References and Further Readings

1. Varieties of Certainty

a. Objective, Epistemic and Psychological Certainty

As a property, certainty can be attributed to a proposition or a subject. When attributed to a proposition, certainty can be understood metaphysically (objectively) or relative to a subject’s epistemic position (Moore 1959, DeRose 1998). Objective certainties consist of propositions that are necessarily true. The relevant types of necessities are logical, metaphysical and physical. For instance, the proposition “It rains in Paris”, even if true, cannot be regarded as objectively certain. This is because it is possible that it does not rain in Paris. On the other hand, the proposition “All bachelors are unmarried” can be regarded as objectively certain, for it is logically impossible that this proposition be false.

Epistemic certainties are propositions that are certain relative to the epistemic position of a subject. The notion of epistemic certainty ought to be distinguished from that of psychological certainty, which denotes a property attributed to a subject relative to a given proposition (Moore 1959, Unger 1975: 62sq, Klein 1981: 177sq, 1998, Audi 2003: 224 sq, Stanley 2008, DeRose 2009, Littlejohn 2011, Reed 2008, Petersen 2019, Beddor 2020a, 2020b, Vollet 2020). Consider the statement “It is certain for Peter that John is not sick”. Note that this statement is ambiguous, as “for Peter” could refer to Peter’s epistemic position, for example, the evidence Peter possesses. If Peter states “It is certain that John is not sick, because the doctor told me so”, he can be understood as stating that “John is not sick” is certain given his epistemic position, which comprises what the doctor told him. But “for Peter” can also denote an attitude adopted by Peter toward the proposition “John is not sick”. The attitude at issue is the type of attitude that falls under the concept of psychological certainty.

Epistemic certainty or epistemic uncertainty are often expressed by the use of modals such as “may” or “impossible” which are understood in an epistemic sense (see Moore 1959, DeRose 1998: 69, Littlejohn 2011, Petersen 2019, Beddor 2020b: sect. 5). To express epistemic certainty one can say, for instance, “It is impossible that John is sick, because the doctor said he wasn’t”. Likewise, to express epistemic uncertainty, one can say “John may be sick, as his temperature is high”. Used in such a way, these modals describe the epistemic position of a subject relative to a proposition toward which she may or may not adopt an attitude of certainty.

Even if it is intuitively correct that the epistemic position of a subject for whom some proposition is certain is a favorable epistemic position, it is an open question whether a proposition being epistemically certain for a subject entails that that proposition is true. Depending on how one conceives of the epistemic position relative to which a proposition is epistemically certain for a subject, epistemic certainty may not turn out to be factive (DeRose 1998, Hawthorne 2004: 28, Huemer 2007, von Fintel and Gillies 2007, Littlejohn 2011, Petersen 2019, Beddor 2020b, Vollet 2020).

Psychological certainty, for its part, is generally regarded as being non-factive. For example, John can be psychologically certain that it is raining in Paris even if it is not raining in Paris. In addition, psychological certainty does not require that a subject be in a favorable epistemic position. It is possible for John to have no reason to believe that it is raining in Paris, and yet be psychologically certain that it is raining in Paris.

Despite being conceptually distinct, the notions of objective, epistemic and psychological certainty are significantly related. From Antiquity to the end of the Middle Ages, the idea that true science (epistémè), that is, epistemic certainty, could only pertain to necessary truths whose object was either intelligible forms or essences seemed to be widely accepted. In The Republic books V, VI and VII, Plato endorses the view that sensible reality is merely an imperfect and mutable copy of an ideal, perfect and immutable realm of existence. As a result, sensible reality can only be the object of uncertain opinions (doxa). For his part, Aristotle defines epistemic certainty, or “scientific knowledge,” as the syllogistic demonstration of essential truths. It is through such syllogisms that one can comprehend what belongs essentially to objects of knowledge (Aristotle Organon IV. Posterior Analytics, I, 2, Metaphysics VII.2, 1027a20). It was during the Scientific Revolution that the idea of a science of the contingent emerged, and with it the possibility of distinguishing epistemic certainty from objective certainty.

In addition, epistemic certainty has an important normative relationship to psychological certainty (Klein 1998). For instance, Descartes states that one should not adopt an attitude of certainty toward propositions that are not entirely certain and indubitable (Descartes Meditations on First Philosophy § 2). Similarly, Locke’s evidentialist principle prescribes that a subject should proportion her opinion to the evidence she possesses (Locke Essay, IV, 15 §5). Indeed, it seems that if a proposition is not epistemically certain for a subject, that subject is not justified in being certain that that proposition is true (Ayer 1956: 40, Unger 1975).

b. Certainty and Knowledge

According to several philosophers, the notions of psychological and epistemic certainty are closely connected to the notion of knowledge. One could regard the propositional attitude involved in knowing something to be the case as consisting of the attitude of psychological certainty and therefore take epistemic certainty to be a condition on knowledge (Ayer 1956, Moore 1959, Unger 1975, Klein 1981). Such a view would explain why concessive knowledge attributions such as “I know that I have hands, but I might be a handless brain in a vat” appear to be inconsistent (Lewis 1996).

However, there are reasons to draw a more substantial distinction between certainty and knowledge. First, as epistemic certainty is intuitively very demanding, taking it as a condition on knowledge could easily lead to the conclusion that ordinary knowledge is beyond one’s reach (Unger 1975). Second, there seem to be cases where some knowledge is attributed to a particular subject without that subject being described as psychologically certain (Stanley 2008, McGlynn 2014, Petersen 2019, Beddor 2020b, Vollet 2020). For instance, consider the statements, “I know that p for certain” or, “I know that p with certainty”. These statements are not redundant, and express something stronger than “I know that p” (Malcolm 1952, Beddor 2020b, Vollet 2020, Descartes distinguishes, for his part, cognitio and scientia: see Pasnau 2013).

In addition, concessive knowledge attributions can be explained by other means. According to some philosophers, a pragmatic implicature in tension with the attribution of knowledge is communicated whenever an epistemic possibility is explicitly considered during a conversation. The implicature could be, for instance, that this possibility is relevant when it comes to determining whether the subject knows that p (Rysiew 2001, Fantl and McGrath 2009, Dougherty and Rysiew 2009, 2011; for difficulties raised by this type of explanation see Dodd 2010). Other philosophers explain the apparent inconsistency of concessive knowledge attributions by claiming that epistemic certainty is the norm of assertion (Stanley 2008, Petersen 2019, Beddor 2020b, Vollet 2020).

Whether or not knowledge can be conceived of in terms of certainty, because of the close connection between these notions and the centrality of questions pertaining to knowledge in epistemology, the philosophical discussion has been primarily focused on the notions of psychological and epistemic certainty. Therefore, the primary aim of this entry is to present the different ways in which these notions have been understood by philosophers. Note that the question of the relationship between epistemic and objective certainty will nevertheless be addressed in relation to infallibilist conceptions of epistemic certainty.

2. Psychological Certainty

a. Certainty and Belief

There is an intuitive connection between psychological certainty and belief. Whenever a subject is certain of p, she takes p to be true, which is characteristic of the attitude of belief. Yet, one can believe that p without being certain of p. One can reasonably state, “I believe that it will rain tomorrow, but I am not certain that it will” (Hawthorne, Rothschild and Spectre 2016, Rothschild 2020). Thus, being certain of p does not (merely) amount to believing that p (for versions of the claim that believing that p involves being certain that p, see Finetti 1990, Roorda 1995, Wedgwood 2012, Clarke 2013, Greco 2015, Dodd 2017 and Kauss 2020).

One way to conceive of the relationship between psychological certainty and belief is to introduce a “graded” notion of belief. Traditionally, philosophical discussion has relied on a binary notion of belief: either a subject believes a proposition, or she does not. But a subject can also be said to have a certain degree of belief in a proposition. Given a graded notion of belief, psychological certainty can plausibly be conceived as the maximum degree of belief one could have in a proposition. For its part, the attitude of belief or outright belief is conceivable as a high, yet non-maximal degree of belief in a proposition (Foley 1992, Ganson 2008, Weatherson 2005, Sturgeon 2008: 154–160 and Leitgeb 2013, 2014, 2017).
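The graded picture can be stated schematically. The following is only an illustrative sketch: the credence function Cr, its scale from 0 to 1, and the threshold t are assumptions introduced here for the sake of illustration, and whether degrees of belief can be quantified in this way is itself open to question.

```latex
\[
  \mathrm{Cr} : \mathcal{P} \to [0,1], \qquad \tfrac{1}{2} < t < 1
\]
\[
  S \text{ is psychologically certain of } p \iff \mathrm{Cr}_S(p) = 1
\]
\[
  S \text{ (merely) believes } p \text{ outright} \iff t \le \mathrm{Cr}_S(p) < 1
\]
```

On this sketch, “I believe that it will rain tomorrow, but I am not certain that it will” simply reports a degree of belief that reaches the threshold t without reaching the maximum.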

Such a conception of psychological certainty raises, however, three important questions. First, what does it mean for a subject to have a particular degree of belief in a proposition? Second, how should degrees of belief be quantified? Third, what does it take for a subject to have the maximum degree of belief in a proposition?

b. A Feeling of Conviction


A first possibility is to consider that a subject’s degree of belief in p consists of an internally discernable feeling of conviction toward p’s truth (for a discussion of certainty as a metacognitive or epistemic feeling, see Dokic 2012, 2014; Vazard 2019; Vollet 2022). For example, consider the propositions “2+2=4” and “It will rain tomorrow.” Presumably, there is a discernable difference in one’s feeling of conviction toward the truth of each proposition. One’s conviction toward the truth of “2+2=4” is stronger than one’s conviction toward “It will rain tomorrow” and, given the view under examination, one’s degree of belief in “2+2=4” is thereby higher than one’s degree of belief in “It will rain tomorrow”. This is the case even if one believes that both “2+2=4” and “It will rain tomorrow” are true. Such a conception seems to prevail among philosophers such as Descartes (see also Locke Essay, and Hume Enquiry). If a subject’s degree of belief in p consists of a certain feeling of conviction toward p’s truth, then psychological certainty is naturally conceived of as a maximally strong feeling of conviction toward the truth of a proposition.

Such a conception of psychological certainty is problematic, however. First, many propositions which a particular subject is certain of at a given time may not be associated with a discernable feeling of conviction. One might be absolutely certain that 2+2=4 at t without having any particular feeling toward that proposition, for example, if one is not entertaining the thought that 2+2=4 at t. Second, it is not clear that the type of feeling that is supposed to illuminate the notion of psychological certainty has the required structure. In particular, it is not clear that such a feeling has upper and lower bounds. In light of these complications, it might be preferable, in order to explicate the notion of psychological certainty, to exploit the intuitive connection between that notion and the notion of doubt.

It is intuitively correct that if a subject is absolutely convinced that a proposition is true, her attitude toward that proposition leaves no room for doubt. The proposition, for that subject, is indubitable. For instance, one may have that degree of conviction toward the proposition “I think, therefore I am” because one finds it difficult, perhaps even impossible, to doubt it (see Ayer 1956: 44-45 for discussion). Note, however, that this characterization of psychological certainty remains ambiguous. It can be understood either synchronically or diachronically (Reed 2008), assuming that one distinguishes the degree of belief a subject can have in a proposition from the stability of this degree of belief (see Levi 1983: 165; Gärdenfors and Makinson 1988: 87, Skyrms 1980: 11f., Leitgeb 2013: 1348, 1359, Moon 2017 and Kauss 2020). From a synchronic perspective, it follows from this characterization that if a subject is certain that p at t, that subject has absolutely no doubt regarding p’s truth at t. However, it does not follow that the subject who is certain that p at t could not doubt that p at a later time t’. In contrast, from a diachronic perspective, it follows from this characterization of psychological certainty that if a subject is certain that p at t, then p is indubitable in the sense that any doubt concerning p’s truth is excluded for that subject, both at t, and at a later time t’.

Understood diachronically, the characterization of psychological certainty under examination is thus stronger and possibly better suited to explicate this type of attitude. On the synchronic reading, after all, a very easily revisable belief could qualify as a psychological certainty. Yet, it seems that psychological certainty consists of something stronger than this. If a subject is absolutely convinced that p is true, one expects her attitude toward p to be stable.

Several ways of understanding what it takes for a proposition to be indubitable for a subject have been put forward. According to Peirce, the notion of doubt is not related to the mere contemplation of the possibility of a proposition being false (Peirce 1877 and 2011). Instead, doubt is characterized as a “mental irritation” resulting from the acquisition of unforeseen information which provides motivation to investigate a proposition’s truth further. According to this pragmatic conception of doubt, the exclusion of doubt regarding p’s truth is the result of a rational inquiry into the question of p’s truth and does not simply consist of a psychological impossibility to contemplate the possibility of p being false (Vazard 2019).

In contrast, Malcolm and Unger adopt a Cartesian conception of doubt (Malcolm 1963: 67-68, Unger 1975: 30-31, 105 sq). For them, doubt regarding p’s truth is excluded for a subject at a time t whenever her attitude toward p at t is such that no information or experiences will lead her to change her mind regarding p. When a subject is certain that p and doubt regarding p’s truth is excluded for her, her attitude toward p is such that, for her, any consideration in favor of the conclusion that p is false is misleading. This means, for instance, that if a subject is certain that there is an ink bottle in front of her at t, her attitude toward the proposition “There is an ink bottle in front of me” is such that, at t, if she felt or had a visual experience as of her hand passing through that ink bottle, this would not provide reason for her to think that there is not an ink bottle in front of her. Rather, it would provide reason for the subject to think that such sensory experiences are illusory.

For other philosophers, if a subject is certain that p, no data or experience could lead her to doubt that p is true without putting the totality of her beliefs into question, including beliefs concerning the data or experience prompting her to doubt that p is true in the first place (Miller 1978). That is, if a subject is certain that p, then any reason to doubt that p is true would constitute, for her, a reason to doubt absolutely anything, including the very existence of that reason.

This characterization of psychological certainty manages to capture the idea that this attitude differs from the attitude of belief in that it is not revisable. Note, however, that this characterization does not entail that no psychological event could alter one’s attitude of certainty. Even if any doubt regarding p’s truth is excluded for a subject at t, circumstances could lead her to doubt p’s truth. In addition, this characterization does not exclude the possibility of the subject acquiring, at a later time t’, evidence supporting the conclusion that p is false, and of losing, as a result, her conviction regarding p’s truth.

Additionally, the exclusion of doubt regarding p’s truth is not only related to the attitude adopted by a subject who is certain of p toward what counts as a reason to think that p is false. As noted by Unger, a subject’s absolute conviction that p is true manifests itself in the subject’s readiness to use p as a premise of practical or theoretical reasoning without hesitation (Unger 1975: 116). Of course, this aspect of psychological certainty is not in conflict with the characterization of this attitude outlined above. As a matter of fact, if any doubt regarding p’s truth is excluded for a subject, it is plausible that she is ready to use p as a premise of reasoning without hesitation.

While the characterization of psychological certainty just outlined captures central features of this attitude, it also faces certain difficulties. Given such a characterization of psychological certainty, one could be led to think, following Unger, that psychological certainty is a fundamentally dogmatic attitude which should not be adopted (Unger 1975). Yet, philosophers such as Dicker, Carrier, Douven and Olders reject the idea that psychological certainty consists of a dogmatic attitude (Dicker 1974: 166, 168, Carrier 1983, Douven and Olders 2008: 248), and philosophers such as Miller argue that psychological certainty is in fact compatible with a feeling of doubt (Miller 1978: 48, 53-54).

c. The Operational Model

As noted in the previous section, explicating the notion of psychological certainty in terms of an internally discernible feeling of conviction raises serious problems. This has led several philosophers to favor an operationalist or functionalist approach to psychological certainty. According to De Finetti, an operational definition of a subject’s degree of belief can be given in terms of her betting behavior (De Finetti 1937 and 1990). More precisely, a subject’s degree of belief in p can be conceived of as the odds that the subject regards as fair for a bet on p’s truth that would be rewarded with one monetary unit if p were true. For instance, suppose one is offered a bet on whether or not the proposition “Berlin is the capital of the Federal Republic of Germany” is true. Suppose, in addition, that if one were to be right concerning that proposition, one would be rewarded $1. If one is ready to pay $0.80 to bet on the truth of that proposition, then, given De Finetti’s model, one’s degree of belief in the proposition “Berlin is the capital of the Federal Republic of Germany” can be represented as a function which assigns the value 0.8 to that proposition.
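The betting construal just described can be illustrated with a minimal Python sketch. The function name `degree_of_belief` and its parameters are illustrative assumptions rather than De Finetti’s own notation; the sketch simply divides the price a subject regards as fair by the payoff she would receive if p were true.

```python
from fractions import Fraction

def degree_of_belief(fair_price, payoff=1):
    """Betting quotient: the price regarded as fair divided by the payoff if p is true.

    Both names are hypothetical labels chosen for this illustration.
    """
    fair_price = Fraction(fair_price)
    payoff = Fraction(payoff)
    if payoff <= 0:
        raise ValueError("the payoff must be positive")
    if not 0 <= fair_price <= payoff:
        raise ValueError("a fair price lies between 0 and the payoff")
    return fair_price / payoff

# The Berlin example from the text: $0.80 paid for a bet that returns $1 if p is true.
berlin = degree_of_belief("0.80", 1)
print(float(berlin))
```

On this construal, a degree of belief of 1 corresponds to a subject who regards paying the full payoff as a fair price for the bet.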

Ramsey generalizes the relationship between a subject’s expectations (her degrees of belief regarding the truth of a set of propositions) and her behavior to any type of preference (Ramsey 1929). According to him, whenever a subject determines whether she prefers to do A or B, she relies on her degree of belief in the propositions representing the states of affairs on which the possible results of each option depend. Thus, a subject’s expectations allow her to determine the value she can expect from each option and to rationally determine whether she prefers to do A or B.

Several representation theorems have been formulated to show that if a subject’s preferences satisfy a set of intuitively acceptable constraints, they can be represented by a probability function, corresponding to the subject’s expectations, together with a utility function, such that the subject prefers the options that maximize expected utility relative to these two functions (Ramsey 1926, Savage 1954, Jeffrey 1965, Joyce 1999). This formalization of the relationship between rational expectations and rational preferences is central to both Bayesian Epistemology and Bayesian Decision Theory, as it lays the groundwork for an analysis of epistemic rationality in light of the assumption that rational expectations can be represented as a probability distribution over a given set of propositions.
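The link between expectations and preferences can be sketched in Python as follows. This is a toy illustration under stated assumptions, not a rendering of any particular representation theorem: the states, options, credences and utilities are invented for the example, and the names `expected_utility` and `preferred` are hypothetical.

```python
from fractions import Fraction

def expected_utility(option, credence):
    """Weight the utility of each outcome by the credence in the state producing it."""
    return sum(credence[state] * utility for state, utility in option.items())

def preferred(options, credence):
    """Return the name of the option with the highest expected utility."""
    return max(options, key=lambda name: expected_utility(options[name], credence))

# Invented credences over two mutually exclusive and exhaustive states.
credence = {"rain": Fraction(3, 10), "dry": Fraction(7, 10)}

# Invented utilities of each option in each state.
options = {
    "take umbrella": {"rain": 5, "dry": 3},
    "leave umbrella": {"rain": -10, "dry": 6},
}

for name, option in options.items():
    print(name, expected_utility(option, credence))
print("preferred:", preferred(options, credence))
```

With these numbers, taking the umbrella has the higher expected utility, so it is the option a subject with these credences and utilities would be represented as preferring.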

The connection between a subject’s degree of belief and her preferences suggests that psychological certainty can be conceived of as a subject’s propensity to act in a certain way. If one relies on betting behavior to explicate degree of belief, psychological certainty could be conceived of as a subject’s readiness to accept any bet concerning p’s truth as long as the bet at issue can result in potential gains. Such a conception would be similar to the one presented in the previous section, which views psychological certainty as an attitude toward p characterized by the exclusion of doubt regarding p’s truth. If doubt regarding p’s truth is excluded for a subject, what reason could that subject have to refuse a bet on p’s truth that could result in potential gains? If any doubt regarding p’s truth is excluded for her, then nothing could lead her to doubt that p is true, not even the stakes involved in a particular bet concerning p’s truth.

But, of course, we are not certain of many propositions in that sense. If the stakes related to a bet that is offered to us concerning the truth of a proposition we regard as being certain are high, we hesitate. Additionally, it seems that we are right not to be certain of many propositions in that sense, for the evidence we normally have access to does not warrant adopting such an attitude. This is not perceived, however, as being fundamentally problematic by the proponents of Bayesian conceptions of epistemic and practical rationality, as such conceptions purport to model epistemic and practical rationality in a context of generalized uncertainty. If such conceptions fundamentally aim at showing how it can be reasonable for a subject to think or act in a certain way in a context of generalized uncertainty, the fact that, given these conceptions, there is almost nothing of which we are certain, or of which we can reasonably be certain, can hardly count as a drawback. This is true as long as one is ready to concede that this construal of psychological certainty is not necessarily equivalent to our ordinary concept of psychological certainty (Jeffrey 1970: 161).

3. Epistemic Certainty

a. The Problem of Epistemic Certainty

According to the Lockean principle, which requires that one proportion one’s degree of belief to the evidence, a subject is justified in being psychologically certain of a proposition if, and only if, this proposition is epistemically certain. This principle is widely accepted as it explains why statements such as “It is certain that p, but I’m not certain that p” sound incoherent (Stanley 2008, Beddor 2020b, Vollet 2020). The main question is then: are there epistemically certain propositions, and if so, what grounds their epistemic certainty?

To tackle this question, let us consider the following propositions:

(1) It will rain in Paris in exactly one month at 5 p.m.

(2) The lottery ticket I’ve just bought is a losing ticket. (Context: It only has a chance of one in a million to win.)

(3) My car is currently parked in the parking lot. (Context: I parked my car in the parking lot five minutes ago.)

(4) The world was not created five minutes ago.

(5) It hurts. (Context: I just dropped a hammer on my foot.)

(6) All bachelors are unmarried men.

As previously mentioned, epistemic certainty is relative to the epistemic position of a particular subject. Considering the six propositions above, the question at hand is therefore whether or not a subject’s epistemic position can be such that these propositions are certain for her. One can reasonably doubt that a subject can be in an epistemic position such that (1) is certain for her. Considering the evidence a subject normally has access to, it seems that (1) can be, at best, highly probable, but not certain. Likewise, (2) appears to constitute a typical example of an uncertain proposition, the sort that tends to illustrate the difference between certainty and high probability.

On the other hand, it is intuitive to think that propositions such as (3), (4), (5) or (6) can be epistemically certain. In a typical situation, if a subject comes to doubt that her car is still parked in the spot she left it five minutes ago, that the world was not created five minutes ago, that it hurts when she drops a hammer on her foot, or that all bachelors are unmarried, one would presumably consider this doubt as ill-founded and absurd. Yet, one might think it is possible that one’s car was stolen two minutes ago, or that the world was created five minutes ago. In fact, it is unclear if the evidence one possesses allows dismissing such scenarios. What about propositions such as (5) and (6)? Some philosophers suggest that it is reasonable to doubt the truth of such propositions in cases where one is offered a bet with extremely high stakes, for example a bet in which, if the proposition is false, one’s family is tortured to death. In such cases, it seems reasonable to decline the bet. Now, given the Lockean principle mentioned above, this may be interpreted as providing good evidence to think that even these kinds of propositions are not actually certain.

These considerations show how problematic the notion of epistemic certainty can be. We easily admit that, given the evidence normally available to a subject, propositions such as (1) and (2) are uncertain. In contrast, we are inclined to think that propositions such as (3), (4), (5) and (6) are, or can be, epistemically certain. Yet, minimal reflection suffices to put this inclination into question. This has been highlighted by Hume: when one does not pay specific enough attention, one considers many propositions to be certain (Hume Treatise III, 1. 1. 1.). However, minimal philosophical reflection suffices to shake this conviction, which reappears as soon as one comes back to one’s daily life. The question of the nature, possibility and extension of epistemic certainty is in fact nothing other than the problem of skepticism, which rests at the heart of important debates in epistemology (Firth 1967).

The challenge, then, consists in articulating and defending a criterion for epistemic certainty, while also explaining the problematic cases which arise from this criterion. If we consider the propositions (1)-(6), three families of theories of epistemic certainty can be distinguished.

In the following list, the term “skeptical” is used with respect to epistemic certainty, rather than knowledge:

Skeptical:

o Radically skeptical: none of the considered propositions are, or can be, certain.

o Strongly skeptical: only propositions such as (6) are, or can be, certain.

o Moderately skeptical: only propositions such as (5) or (6) are, or can be, certain.

Moderate:

o Strong moderate: only propositions such as (4), (5) and (6) are, or can be, certain.

o Weak moderate: only propositions such as (3), (4), (5) and (6) are, or can be, certain.

Weak:

o Propositions such as (2), (3), (4), (5) and (6) are, or can be, certain.

In the remaining sections, the focus is on the theories listed above, with radically skeptical theories considered as opponents that these theories are designed to respond to.

b. Skeptical Theories of Epistemic Certainty

i. Radical Infallibilism

One way of explaining epistemic certainty appeals to infallibility. In general, one is infallible with regard to p if and only if it is impossible for one to be mistaken about p’s truth (Audi 2003: 301 sq):

Certainty-RI: p is certain for S if, and only if, S can believe that p, and it is absolutely impossible for S to believe that p and to be mistaken about p’s truth.

According to this definition, at least two kinds of propositions can be certain. First, there are necessarily true propositions such as (6). Indeed, if a proposition is necessarily true, it is impossible for a subject to believe that this proposition is true and, at the same time, to be mistaken about that proposition’s truth. Second, there are propositions that are true in virtue of being believed to be so by a subject. For example, a subject cannot believe the propositions “I have a belief” or “I think, therefore I am” to be true and, at the same time, be mistaken concerning their truth. This is because these propositions are true in virtue of being believed to be so by the subject.

As should be clear, this conception of epistemic certainty excludes propositions such as (1), (2), (3), (4) and (5). Indeed, these propositions are contingent, and their truth is independent of whether or not they are believed to be true by a subject (Ayer 1956: 19). Certainty-RI is therefore a strongly skeptical conception of epistemic certainty. Given that conception, very few informative propositions can be certain, or even known if one maintains that knowledge requires epistemic certainty.

A major difficulty raised by this conception of epistemic certainty is, however, that it entails that any logically or metaphysically necessary proposition is epistemically certain. For instance, a mathematical conjecture such as Goldbach’s, if it is true, is necessarily true. As a result, according to this conception of epistemic certainty, any mathematical conjecture, if it is true, is epistemically certain. Yet, it seems clear that one can have reasons to doubt the truth of a mathematical conjecture (Plantinga 1993: ch. 8). For example, a recognized expert might wrongly assert that a given conjecture is false. A related worry is the well-known problem of logical omniscience: since we are not logically omniscient, it is implausible to consider that every logical or metaphysical truth is certain for us (Hintikka 1962, Stalnaker 1991). In order for a logical or metaphysical necessity to be epistemically certain for a subject, it seems that the subject should at least grasp the nature of that necessity.

One crucial aspect that this conception of epistemic certainty fails to capture is related to the absence of good reasons to doubt. Intuitively, what makes a logical or metaphysical truth epistemically certain is not the fact that it is necessarily true, but that we have very strong reasons to regard it as necessarily true.

ii. Invariantist Maximalism

The above considerations suggest that epistemic certainty should rather be explicated in terms of the absence of good reasons to doubt:

Certainty-IND: p is certain for S if and only if p is epistemically indubitable for S. That is, if and only if it is impossible for S to have a good reason to doubt that p.

According to a first version of Certainty-IND, epistemic certainty depends on a subject having the highest possible degree of justification for believing a proposition (Russell 1948, Firth 1967: 8-12). If a subject’s justification is absolutely maximal with respect to p, no proposition q can be more justified than p. It follows that if p is certain, no proposition can provide a good reason to doubt that p, as any consideration q speaking against the truth of p would have a lower degree of justification. Let us label this conception “Invariantist Maximalism.”

Certainty-IM: p is epistemically certain for S if and only if there is no proposition q which can be more justified, for S, than p.

Invariantist Maximalism relies on the thought that the term “certain” is an absolute term which applies in light of an invariant and maximal criterion: if p is certain for S, nothing can be more certain than p, no matter the context of epistemic appraisal (Unger 1975; for criticisms, see Barnes 1973, Cargile 1972, Klein 1981, Stanley 2008, and Vollet 2020). An advantage of this view is that it does not entail that all necessary truths are epistemically certain. Even if one assumes that if a proposition is maximally justified for a subject, the subject is then infallible with respect to it, infallibility, on its own, is not sufficient for epistemic certainty (Fantl and McGrath 2009: ch. 1, Firth 1967: 9). For example, one cannot incorrectly believe that water is H2O, though one’s justification for believing that water is H2O need not be maximal.

Nonetheless, Invariantist Maximalism easily leads to radical skepticism. A first reason for this is that one might think it is impossible to identify a maximal threshold of justification. Indeed, it always seems possible to raise the degree of justification one has for believing that a given proposition is true, either by acquiring new evidence from a different source, or by acquiring higher-order justification (Brown 2011; 2018, Fantl 2003). Taking this into account, the Invariantist Maximalist conception of epistemic certainty predicts that no proposition is epistemically certain.

Furthermore, this approach leads one to classify as epistemically uncertain any proposition less justified than those such as “I exist.” This is the case even with propositions that can be logically deduced from such propositions. For it is plausible that the degree of justification one has for believing logically stronger propositions than “I exist” is slightly lower than the degree of justification one has for believing the proposition “I exist.”

One way of avoiding these skeptical consequences is to restrict the comparison class, that is, the set of propositions to which p is compared in determining its epistemic certainty (see Firth 1967: 12 for a presentation of various possibilities). However, the difficulty is to propose a criterion for restricting this set that is neither too strong nor too weak. For instance, Chisholm proposes to restrict the comparison class to the propositions that a subject, at a given time t, can reasonably believe (Chisholm 1976: 27):

Certainty-Chisholm 1: p is certain for S at t if and only if

(i) Accepting p is more reasonable for S at t than withholding p, and

(ii) there is no q such that accepting q is more reasonable for S at t than accepting p.

Yet, this criterion seems too weak. If no proposition is justified to a high degree, some propositions with a very low degree of justification could be said to be epistemically certain given Certainty-Chisholm 1 (Reed 2008).
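The weakness of this criterion can be made vivid with a toy Python sketch. It rests on a strong simplifying assumption that is not in Chisholm’s text: that how reasonable it is to accept or to withhold a proposition can be represented by a number. The function name, the proposition labels and the values are all hypothetical, and clause (ii) is read as requiring that no proposition be more reasonable to accept than p.

```python
def certain_chisholm1(p, accept, withhold):
    """Toy reading of Certainty-Chisholm 1 over numeric degrees of reasonableness.

    accept[x] and withhold[x] are hypothetical scores for accepting x and for
    withholding x; higher means more reasonable.
    """
    # Clause (i): accepting p is more reasonable than withholding p.
    beats_withholding = accept[p] > withhold[p]
    # Clause (ii): no proposition q is more reasonable to accept than p.
    nothing_more_reasonable = all(accept[q] <= accept[p] for q in accept)
    return beats_withholding and nothing_more_reasonable

# A subject for whom no proposition is reasonable to accept to a high degree.
accept = {"p1": 0.2, "p2": 0.1}
withhold = {"p1": 0.15, "p2": 0.5}

print(certain_chisholm1("p1", accept, withhold))
print(certain_chisholm1("p2", accept, withhold))
```

Here p1 satisfies both clauses even though its score is low in absolute terms, simply because nothing else in the comparison class does better.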

Consequently, consider the criterion proposed by Chisholm (1989: 12):

Certainty-Chisholm 2: p is epistemically certain for S if and only if, for any proposition q:

(i) believing p is more justified for S than withholding judgement concerning q, and

(ii) believing p is at least as justified for S as is believing q.

Even if this criterion is stronger, the problem is that there are many propositions which one has absolutely no evidence for, and, accordingly, concerning which one is absolutely justified to suspend judgement. For instance, it does not seem more reasonable to believe the proposition, “I think, therefore I am” than it is to withhold judgement concerning the proposition, “There are an even number of planets.” As a result, according to Certainty-Chisholm 2, “I think, therefore I am” is not epistemically certain (Reed 2008).

iii. Classical Infallibilism

According to another version of Certainty-IND, epistemic certainty does not require having a maximal justification for believing a proposition (in contrast to Certainty-IM) or being infallible regarding a proposition’s truth in the sense of Certainty-RI. Instead, epistemic certainty requires that one’s justification be infallible. To say that the justification S has for p is infallible is to say that it is impossible, logically or metaphysically, for S to have this justification while p is false. In addition, this requirement is traditionally understood along internalist lines, in such a way that whenever a subject possesses an infallible justification for p, she is in a position, because of the access she has to the justifiers, to rule out, herself, the possibility of p being false. Thus, consider the following formulation of the classical infallibilist conception of epistemic certainty:

Certainty-CI: p is certain for S if and only if S has (internal) access to a justification for p which implies p.

According to this conception, epistemic certainty requires an infallible guarantee that is (reflexively) accessible to the subject (Dutant 2015). For instance, Descartes maintains that clear and distinct ideas are guaranteed to be true, and they are therefore epistemically indubitable (Meditations II). Russell states that through introspection, one can directly access one’s sense data, and that thereby, one’s introspective beliefs are guaranteed to be true (Russell 1912).

This kind of approach can avoid the problem of propositions that are necessarily true, but epistemically uncertain. This is the case if one maintains that one consideration can justify another only if what makes the first consideration true also (in part) makes the other consideration true (Fumerton 2005: sec. 2.2). Alternatively, one can think of an infallible guarantee as a ground or method of belief formation which cannot produce false beliefs (Dutant 2016). Note that this conception can allow propositions of type (5) to be certain if it is assumed, for example, that what justifies my belief that “My foot hurts” is the fact that my foot itself hurts, which is accessible via introspection.

The Bayesian approach, on a standard interpretation, can be considered as a formalized version of such a conception. One of its main tenets is that a subject’s expectations regarding a set of propositions can be, if rational, represented as a probability distribution over this set. In other words, a subject’s expectations relative to the truth of a set of propositions can be, if rational, represented as a function assigning to each of these propositions a numerical value which satisfies the definition of a probability function given by Kolmogorov (Kolmogorov 1956). If a subject’s rational expectations regarding a set of propositions are represented as such, the expectations a subject should have, given the evidence she possesses, can be represented in terms of conditional probability, which is standardly defined as a ratio of unconditional probabilities and can be computed using Bayes’ theorem. Epistemic certainty is thus conceived of as the maximal probability (probability 1) that a proposition can have, given the available evidence:

Certainty-Prob: p is epistemically certain for S if and only if Pr(p|E) = 1

This conception of certainty can be viewed as a version of classical infallibilism if it is assumed that no false proposition can be evidence, that evidence is always accessible as such to the subject, and that whatever constitutes the evidence possessed by a subject is itself epistemically certain. This is a strongly skeptical conception if, following orthodox Bayesians like Jeffrey, one thinks that only logically necessary propositions should receive maximal probability (Jeffrey 2004). If, on the other hand, some contingent propositions, in particular about our own mental states, can be considered as evidence in this sense, then this conception can be regarded as moderately skeptical. The main advantage of this kind of approach is that it offers a way of accounting for practical and epistemic rationality in the absence of epistemic certainty (and knowledge) of a great number of propositions.
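Certainty-Prob can be illustrated with a small Python sketch over a finite set of possible worlds. The modeling choices are assumptions made for the example only: propositions and evidence are represented as sets of worlds, the prior values are invented, and the names `conditional` and `certain_prob` are hypothetical. Conditional probability is computed by the ratio formula Pr(p|E) = Pr(p and E) / Pr(E).

```python
from fractions import Fraction

def conditional(prior, p, evidence):
    """Pr(p | E) by the ratio formula, with p and E given as sets of worlds."""
    pr_e = sum(prior[w] for w in evidence)
    if pr_e == 0:
        raise ValueError("Pr(E) = 0: the conditional probability is undefined")
    return sum(prior[w] for w in p & evidence) / pr_e

def certain_prob(prior, p, evidence):
    """Certainty-Prob: p is certain if and only if Pr(p | E) = 1."""
    return conditional(prior, p, evidence) == 1

# Invented prior over three mutually exclusive worlds for the parked-car case.
prior = {
    "parked": Fraction(9, 10),
    "stolen": Fraction(1, 20),
    "towed": Fraction(1, 20),
}

# Hypothetical evidence that excludes only the world in which the car was towed.
E = {"parked", "stolen"}

car_in_lot = {"parked"}
not_towed = {"parked", "stolen"}

print(conditional(prior, car_in_lot, E))
print(certain_prob(prior, car_in_lot, E))
print(certain_prob(prior, not_towed, E))
```

In this toy model a proposition receives probability 1 exactly when every positive-probability world left open by the evidence is one in which it is true, which is why “the car is in the lot” remains highly probable but uncertain while “the car was not towed” counts as certain.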

Still, we may think we can have reasons to doubt the evidence we have and what our evidence does or does not entail (Lasonen-Aarnio 2020). For example, one may have reason to doubt the infallible character of clear and distinct perceptions, or doubt what qualifies as clear and distinct (Descartes Meditations I, Ayer 1956: 42-44). Furthermore, the standard Bayesian conception has it that it is rational to take a bet on any logical truth, no matter the stakes or the odds. According to this framework, it would be irrational (probabilistically incoherent) to assign a non-maximal probability to a logical truth. Yet, it seems that it is sometimes irrational to take such a bet (Fantl and McGrath 2009: ch. 1, Hawthorne 2004). As previously noted, it is plausible to think that some logical truths can be epistemically uncertain.

iv. A Worry for Skeptical Theories of Certainty

Whether or not a satisfactory skeptical account of certainty can be offered, it is in tension with our intuitive judgements and the ordinary way in which we use the word ‘certainty’ and epistemic modals (Huemer 2007, Beddor 2020b, Vollet 2020). Suppose that it is epistemically certain that p if and only if it is epistemically impossible that not-p (DeRose 1998). According to Invariantist Maximalism, for example, if I lost my wallet this morning, my wife tells me, “It might be in the restaurant,” and I answer, “No, that’s impossible, I didn’t go to the restaurant,” then my wife says something which is, strictly speaking, true, and I say something which is, strictly speaking, false. Indeed, I do not satisfy the absolutely maximal criteria of justification with respect to the proposition “My wallet is not in the restaurant.” Similarly, if my evidence does not logically exclude this possibility, then the probability of that proposition on my evidence is lower than 1. Even more surprisingly, we should admit that my wife would, strictly speaking, say something true were she to say, “Your wallet might be on the Moon” (Huemer 2007).

A pragmatic explanation of the (in)appropriateness of these utterances might be advanced. One might say that, in the above context, it is (practically) appropriate to ignore the epistemic possibilities in question (see the treatment of concessive knowledge attributions above). Still, explaining the intuitive (in)appropriateness of an utterance does not amount to accounting for its intuitive truth value. Another option is to rely on the distinction between epistemic and moral or practical certainty (Descartes Principles of Philosophy, IV, § 205, Wedgwood 2012, Locke 2015). The latter can be understood as a degree of epistemic justification sufficiently high to treat the proposition as certain in one’s actions and practical decisions. One may suggest that the ordinary way in which people use ‘certain’ and the associated epistemic modals, as well as our intuitions, primarily tracks certainty as understood in the latter sense (Kattsoff 1965: 264).

The Bayesian approach has the advantage of providing a general framework in which practical and epistemic rationality in a context of generalized uncertainty can be modeled in a precise way. The claim that the subject’s expectations, if rational, can be represented as a probability distribution allows formulating conditionalization rules describing exactly how a subject should adapt her expectations to the evidence in the absence of certainties. That said, either the concept of certainty that this approach uses is a technical one, in that it does not correspond to our ordinary concept of certainty, or this approach must provide an explanation of the way in which we ordinarily use this concept and the associated epistemic modals.

c. Moderate Theories of Epistemic Certainty

i. Moderate Infallibilism

It is often thought that requiring an infallible justification leads to skepticism about certainty (and knowledge). However, a non-skeptical and infallibilist account can be offered if one rejects the (internalist) accessibility requirement of Certainty-CI (Dutant 2016). For example, one could say that infallibility should not be cashed out in terms of the evidence possessed by a subject but instead in terms of a modal relation between the belief and the truth of its propositional content, such as the sensitivity of a belief or its safety (Dretske 1971, Nozick 1981, Williamson 2000). A reliabilist could also maintain that epistemic certainty requires that the belief-forming processes be maximally reliable (in the circumstances in which the belief is formed). Such approaches are infallibilist in the sense that they state it is impossible for a belief to be false if the guarantee (sensitivity, safety or maximal reliability) required for a proposition to be certain holds.

Another option consists in maintaining that infallibility depends on the subject’s evidence, while also opting for a more generous view of evidence. One may hold that propositions such as (4) can be certain if one thinks that the set of evidence a subject possesses can include any propositions about the external world (see Brown 2018: ch. 3 for further discussion). This option will typically exclude propositions such as (2), whose epistemic probability, although high, is not maximal. It comes in a stronger or a weaker version, depending on whether propositions such as (3) can receive a maximal probability on the evidence.

Williamson defends a weak version of moderate infallibilism about knowledge, according to which one can know a proposition such as (3) (Williamson 2000). In addition to a safety condition allowing for a margin of error — in which a subject knows that p only if p is true in relevantly close situations — Williamson proposes that the evidence of a subject is only constituted by what she knows (see also McDowell 1982). If epistemic certainty is evidential probability 1, it follows that:

Certainty-Prob/K: p is epistemically certain for S if and only if Pr(p|K) = 1, where K stands for the set of propositions known by S.

This view is part of a broader “knowledge-first” research program in which Williamson assumes that (the concept of) knowledge is primitive. If this approach is correct, it can provide a reductive analysis of epistemic certainty in terms of knowledge by considering that all and only known propositions (or all and only propositions one is in a position to know) are epistemic certainties. This fits well with the traditional view of epistemic modals, according to which “It is impossible that p” (and, therefore, “It is certain that not-p”) is true if and only if p is incompatible with what the subject (or a potentially contextually determined relevant group of subjects) knows, or is in a position to know (DeRose 1991).

However, one can subscribe to a moderately infallibilist view of certainty without subscribing to the claim that knowledge entails certainty. As a matter of fact, the very idea that knowledge is moderately infallible is controversial. According to a widespread view, often called “logical fallibilism about knowledge,” S can know that p even if S’s evidence does not entail that p, or even if Pr(p|E) is less than one (Cohen 1988, Rysiew 2001, Reed 2002, Brown 2011, 2018, Fantl and McGrath 2009: chap. 1). In some versions, this kind of fallibilism concedes that knowing that p requires having entailing evidence for p, but rejects that all evidence must be maximally probable (Dougherty 2011: 140). According to this fallibilist view of knowledge, propositions such as (2) can typically be known. However, proponents of this view generally deny that propositions such as (2), and even propositions such as (3), (4), (5) or (6), can be certain. Therefore, logical fallibilism about knowledge is compatible with a moderately infallibilist conception of epistemic certainty, in which epistemic certainty requires that a subject’s evidence entails that p, or that p’s epistemic probability be maximal (Reed 2008, Dougherty 2011). In brief, logical fallibilism about knowledge states that p can be known even if p is not epistemically certain, even in the sense of the moderate infallibilist view of certainty.

Regardless of whether one endorses logical fallibilism about knowledge, a moderate infallibilist view of certainty which relies on a generous account of evidence accepts that p is epistemically certain for S if and only if S’s evidence rules out all the possibilities in which p is false, where a possibility is “ruled out” by evidence when it is logically incompatible with it (or with the fact that S has this evidence: Lewis 1996). According to this approach, the certainty-conferring evidence must have probability 1 and be true; otherwise, the entailed proposition could be false and receive a probability lower than 1 (Brown 2018: 28). If one endorses logical fallibilism about knowledge, then epistemic certainties are a subset of what the subject knows, or is in a position to know (Littlejohn 2011, Petersen 2019, Beddor 2020b, Vollet 2020).

A general concern for this type of approach is that it can seem circular or insufficient. Indeed, the propositions (evidence) that grant epistemic certainty must themselves be epistemically certain. If not, what can be deduced from them will not be certain. Therefore, according to such approaches, one must assume that there are primitive epistemic certainties, that is, propositions whose prior probability is 1 (Russell 1948, Van Cleve 1977). However, the question remains: in virtue of what do these propositions have such a high probability? In addition, as logical truths are logically entailed by any set of evidence, this approach fails to account for the fact that logical truths can be epistemically uncertain.
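The last point can be illustrated with a small sketch. It assumes, purely for illustration, that possibilities are truth-value assignments to two hypothetical atomic propositions and that evidence "rules out" exactly the assignments incompatible with it. The helper names `all_worlds` and `entails` are not drawn from the literature.

```python
from itertools import product

# Hypothetical atomic propositions; a "world" assigns each one a truth value.
ATOMS = ("hands", "raining")


def all_worlds():
    """Enumerate every truth-value assignment to the atoms."""
    for values in product([True, False], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def entails(evidence, p):
    """True iff p holds at every world not ruled out by the evidence."""
    return all(p(w) for w in all_worlds() if all(e(w) for e in evidence))


def has_hands(w):
    return w["hands"]


def rain_or_not_rain(w):
    # A logical truth: true at every world, whatever the evidence is.
    return w["raining"] or not w["raining"]


print(entails([], has_hands))           # False: a contingent claim needs evidence
print(entails([has_hands], has_hands))  # True: certain only if the evidence includes it
print(entails([], rain_or_not_rain))    # True: entailed even by empty evidence
```

On this way of modelling the view, a logical truth comes out as entailed by any body of evidence, including an empty one, so the model cannot represent a subject for whom a complicated logical truth is epistemically uncertain.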

ii. Fallibilism

According to other philosophers, while epistemic certainty depends on the evidence a subject has, that evidence does not need to entail that p in order for p to be certain. To express that claim in terms of epistemic probability: the probability of p being true conditional on the evidence possessed by a subject does not need to be maximal for p to be epistemically certain for that subject (see Reed 2002: 145-146 for further discussion).

According to Moore, if S knows that p, then it is epistemically impossible for S that p is false (Moore 1959). In other words, p is epistemically certain for S. However, in Moore’s view, one can know that p based on evidence that does not entail p’s truth. For instance, a subject can know that she has hands based on a visual experience of her hands so that it is epistemically impossible for her, given that experience, that she does not have hands. Yet, it is logically possible for the subject to undergo that visual experience without having hands. The logical possibility of undergoing that experience without having hands is simply conceived of by Moore as being compatible with the epistemic certainty regarding the fact that she has hands (see also DeRose 1991, Stanley 2005b). Thus, Moore offers a fallibilist conception of certainty based on a fallibilist conception of knowledge.

According to Moore’s conception of epistemic certainty, propositions such as (3) or (4) can be certain provided that their negation is incompatible with what a subject knows and propositions such as (5) can be uncertain if their negation is compatible with what a subject knows. In addition, this framework opens up the possibility of a weak conception of epistemic certainty, according to which propositions such as (2) can be epistemically certain.

In Moore’s approach, epistemic certainty is identified with knowledge. Nevertheless, Moore himself acknowledges that one may want to draw a distinction between knowledge and certainty (see also Firth 1967: 10, Miller 1978: 46n3, Lehrer 1974, Stanley 2008). As previously noted, it is common to draw such a distinction by endorsing a logical fallibilist conception of knowledge, while also maintaining an infallibilist (either moderate or skeptical) conception of certainty. But is it possible to endorse, with Moore, a fallibilist conception of epistemic certainty and still draw a distinction between knowledge and epistemic certainty?

That is possible if one endorses another version of fallibilism with respect to knowledge, which is known as epistemic fallibilism. According to epistemic fallibilism, S can know that p even if S cannot rule out every possibility of p being false, where a possibility of p being false is “ruled out” whenever it is logically incompatible with what S knows to be true (Dretske 1981: 371). Endorsing such a conception of knowledge involves rejecting epistemic closure in that it involves accepting that S can know that p, and that p entails q, without thereby knowing (or being in a position to know) whether q is true. It involves accepting that S can know that she has hands, and that her having hands entails that she is not a handless brain in a vat, without thereby being in a position to know whether she is a handless brain in a vat (Nozick 1981, Dretske 1970).

Thus, even if one endorses a logical fallibilist conception of epistemic certainty, one can maintain that, in contrast to knowledge, certainty requires epistemic infallibility in this sense. That is, epistemic certainty requires having a justification for p such that every possibility of p being false is ruled out. This is illustrated by the fallibilist conception of epistemic certainty in terms of immunity, as presented below. Given this approach, it is possible to claim that propositions such as (2) can be known, while also claiming that they cannot be epistemically certain. Though a subject can know that her lottery ticket is a losing one, it is not certain to her that it is a losing ticket. Indeed, the justification that the subject has for her belief does not rule out every possibility that her ticket is a winning one.

Another way to draw a distinction between knowledge and epistemic certainty which is compatible with a fallibilist conception of epistemic certainty is to argue that certainty involves, in addition to knowing a particular proposition to be true, having a specific epistemic perspective on that knowledge. For instance, Descartes acknowledges the fact that an atheist mathematician can possess a cognitio of mathematical truths but claims that she could not possess a scientia of that domain (Descartes, Meditations on First Philosophy, Second Replies). The atheist mathematician does not recognize the divine guarantee that anything which is conceived of clearly and distinctly is thereby true. Thus, whenever she conceives mathematical truths clearly and distinctly, she cannot know that she knows. Accordingly, her knowledge of mathematical truths remains uncertain (Carrier 1993). Likewise, Williamson endorses the idea that a form of epistemic uncertainty can result from a lack of second-order knowledge (Williamson 2005; 2009, Rysiew 2007: 636-37, 657–58, n. 13, Turri 2010).

iii. Epistemic Immunity and Incorrigibility

According to moderate conceptions of epistemic certainty, propositions such as (3) or (4) can be certain for a subject. This is because either these propositions can be deduced from other propositions which are themselves certain, or the justification one has for these propositions is, although fallible, sufficient for certainty. Yet, if certainty depends on neither complete infallibility, nor on maximal justification, what does it depend on, precisely? What makes propositions such as (3) and (4), or the propositions from which they can be deduced, certain for a subject?

The plausibility of strong conceptions of epistemic certainty results, at least partly, from the fact that they attribute a type of epistemic immunity to propositions which are certain. As previously noted, if a proposition p is maximally justified for a subject, then that proposition is immune to refutation. This is because there is no proposition q, such that q is incompatible with p, which can defeat or diminish p’s justification. A proposition also seems to be immune to refutation so long as the evidence one has for that proposition is itself infallible. One aim of moderate conceptions of epistemic certainty is therefore to offer a conception of epistemic certainty that attributes a form of epistemic immunity to propositions which are considered certain, without making this immunity dependent on complete infallibility or maximal justification.

Incorrigibility consists of a type of epistemic immunity that depends neither on complete infallibility, nor on maximal justification. The justification one has for p is considered incorrigible if it constitutes an ultimate authority on the question as to whether p and it cannot be defeated (Ayer 1963: 70-73, Firth 1967: 21, Alston 1992, Reed 2002: 144). Propositions concerning one’s mental states (such as one’s intentions, feelings, thoughts and immediate experiences) are typical examples of propositions for which one can have an incorrigible justification (Malcolm 1963: 77-86; Ayer 1963: 68-73, Firth 1967: 21). For instance, if a subject sincerely asserts that she undergoes an experience as of something being red, this assertion seems to provide an incorrigible justification for the proposition “That subject undergoes an experience of something being red” (see Armstrong 1963 for a critical discussion).

However, that the justification one has for such a proposition is incorrigible does not entail that the proposition which is thus justified is true (Ayer 1963: 71-73). For incorrigibility does not require infallibility (Firth 1967: 25). Some philosophers suggest that with respect to our own mental states, the incorrigibility of the justification we have for a proposition such as “I am in pain” depends on the fact that we are infallible regarding our mental states (Malcolm 1963: 85). On the contrary, one can suppose that incorrigibility results from enjoying a privileged, yet fallible access to our mental states. For example, if I consider two lines of similar length, I can wonder which line appears to be the longer. If I can doubt this particular appearance, then I can consider being wrong about it. I can also falsely believe that the second line appears to be longer than the first one (Ayer 1956: 65). Additionally, incorrigible justification does not require maximal justification. In a world where no one can be wrong about their mental states, propositions concerning one’s mental states would be better justified if asserted sincerely. This suggests that, in the actual world, the incorrigible justification one has for such propositions is not maximal.

According to this approach, propositions about the external world such as (4) are considered epistemically certain. This is because it seems unreasonable to question the truth of propositions such as “The world was not created five minutes ago”, “The external world exists” or “I have hands.” According to hinge theorists following Wittgenstein, the position of such propositions within one’s conceptual scheme is what makes them immune or incorrigible, and thereby certain (Wittgenstein 1969, Coliva 2015, Pritchard 2016). The truth of these propositions must be assumed in order to be able to operate with the notions of doubt and justification. Outside of very specific contexts, such as those involving someone who lost her hands in an accident, attempting to justify or doubt the truth of having hands is simply nonsensical. In Wittgenstein’s view, this incorrigibility is what distinguishes knowable propositions from certain propositions; the former can be the object of doubt and justification, while the latter cannot.

Another approach proposes that propositions about the external world can be certain even if all their logical consequences have not been verified, so long as their justification is sufficiently immune (contra C. I. Lewis 1946: 80, 180). A specification of sufficient immunity relies on the concept of ideal irrefutability. A proposition p is ideally irrefutable at a time t for a subject S if and only if there is no conceivable event such that if at t S is justified in believing that this event will occur, S is also justified in believing that p is false at t (Firth 1967: 16). In other words, a proposition p is certain if and only if one is justified in believing that p and no future test that one is justified in believing will occur (or that one could imagine) would provide a justification for believing that p is false (Malcolm 1963: 68).

For example, suppose I see that there is an ink bottle here. This is compatible with the possibility of me having the future sensation of my hand passing through the ink bottle. I may even be justified in believing I will undergo such a sensation (suppose, for example, that a reliable prediction has been made). Yet, there is a way in which the present sensation of there being an ink bottle here justifies treating, at the moment of my seeing, any future sensation indicating that there is not an ink bottle here as misleading. After all, for any future such sensation, there exists a possible explanation that is compatible with the claim “There is an ink bottle here.” For example, a future sensation of my hand passing through the ink bottle might be explained as a hallucination. If I am, at the moment of my seeing, justified in believing that there is actually an ink bottle here, it seems that, at that very moment, I am also justified in believing that any future sensation is explainable in a way that is compatible with the claim “There is an ink bottle here.” If so, at the moment of my seeing, the proposition that there is an ink bottle here can be said to be ideally irrefutable and, according to the view of epistemic certainty under examination, epistemically certain for me.

Note that this does not mean the propositions that are epistemically certain for S in the present will remain epistemically certain for S in the future. If, in the future, S has the sensation that her hand passes through the ink bottle and her vision of the ink bottle is different, the proposition “There is, or was, an ink bottle here” can become epistemically uncertain for S (Klein 1981: 91). What matters here is that, at the moment of S’s seeing that there is an ink bottle, S is justified in believing any future sensation can be explained as compatible with the claim “There is an ink bottle here.”

In comparison, Miller defends a weaker account of immunity and certainty. According to him, the justification possessed by S makes p certain only if there can be no other proposition q which is justified enough to show that S should not believe p, in spite of S’s current reasons to believe p (Miller 1978). In other words, p is certain for S if it is always permissible for S to believe p in light of S’s current and future evidence. According to this view, it does not matter if the new evidence will make the belief that not-p permissible, or if the hypothesis that not-p will constitute the best available explanation for the new set of evidence. For example, suppose that a scientist and everyone around me say that there have never been cars, and that I’m just waking from a dream caused by drugs. If I add this experience to my memories of cars, it still seems permissible for me to believe that there are cars, and to doubt the testimony of the people telling me the contrary, even if I can find no good explanation for my new experience.

According to a stronger characterization of immunity, certainty requires the proposition p be ideally immune to any decrease in justification. In this sense, it is not clear that the proposition “There is an ink bottle here” is certain. For it seems that if I were justified in believing that my hand will pass through the supposed ink bottle, my justification for believing that there is an ink bottle would diminish (Firth 1967: 17, Miller 1978, contra Malcolm 1963: 93).

A slightly different approach proposed by Klein also requires immunity against a decrease in justification (Klein 1981; 1992). According to him, p is absolutely certain for S if and only if (a) p is warranted for S, (b) S is warranted in denying every proposition q such that if q were added to S’s beliefs, the warrant for p would be reduced (subjective immunity) and (c) there is no true proposition d, such that if d were added to S’s true beliefs the warrant for p would be reduced (objective immunity). The satisfaction of condition (a) does not entail that p is true, for the set of justified beliefs can include false beliefs, but condition (c) can be satisfied only if p is true.

This approach proposes that epistemic certainty requires immunity against all attacks in the actual world, but not immunity against all attacks in all possible worlds (in particular, in the worlds in which the considered proposition is false). The fact that it is not certain (for S) that a proposition is certain, or the fact that this proposition is not certain in all possible worlds, does not make this proposition uncertain in the actual world (Klein 1981: 181-189).

One can apprehend the distinction between certainty in the actual world and certainty in all possible worlds with the notion of relative certainty. When speaking of relative certainty, one may want to characterize a degree of justification more or less close to absolute certainty in the actual world, which implies some uncertainty for the proposition in the actual world. But one may also want to designate a degree of justification more or less close to absolute certainty in all possible worlds. If, in this second sense, relative certainty implies uncertainty for the proposition in some possible worlds, it does not imply uncertainty for the proposition in the actual world (Klein 1981: 189).

Most theories of certainty based on the notion of epistemic immunity are strong moderate theories. They take p to be epistemically certain for S only if, for any contrary proposition q (which implies not-p or decreases the justification for p), S is permitted to deny q. Hence, these theories exclude that propositions such as (3) are certain, for it is not difficult to imagine a situation in which the police call you to say that your car was stolen. In such a situation, it does not seem that you are allowed to deny that your car has been stolen.

However, a question remains regarding the kind of certainty this approach assigns to propositions such as (4). In virtue of what would S be allowed to deny a contrary proposition q, for example, the proposition that the world was created five minutes ago and we were born two minutes ago with false memories? If it is in virtue of the fact that p is certain, it appears that immunity is a logical consequence of epistemic certainty, rather than its grounds (Klein 1981: 30, Reed 2008). If it is in virtue of the fact that p occupies a specific place in our conceptual or linguistic scheme (Wittgenstein 1969), or that one cannot imagine or conceive of a possible refutation or invalidation for p, it is not clear that the certainty attached to propositions such as (4) is epistemic, rather than merely psychological.

d. Weak Theories of Epistemic Certainty

i. The Relativity of Certainty

According to weak theories of certainty, propositions such as (2) can be certain. As outlined in the previous section, one may understand the notion of certainty in relation to a class of propositions justified or true in the actual world or in relation to a class of propositions justified or true in all possible worlds. Yet, there are many other ways of relativising the notion of certainty (Firth 1967: 10-12). For example, there are Chisholm’s views previously mentioned (Chisholm 1976, 1989). Malcolm suggests that there are various kinds of justification associated with different kinds of propositions, which can give rise to various criteria for certainty (Malcolm 1963). For instance, to see in full light that a given object is a plate seems to provide one with a maximally strong justification for believing that the object is a plate. That is because, according to Malcolm, no one “has any conception of what would be a better proof that it is a plate” (Malcolm 1963: 92). Still, as Firth (1967: 19-20) notes, that depends on the criterion used to define “better.” According to a Cartesian criterion, we would have a better justification in a world in which vision is infallible and our senses are never misleading. Although it is possible to defend a weak invariantist account of epistemic certainty, the fact that various criteria of epistemic certainty can be conceived of may suggest that these criteria are, in fact, shifty.

ii. Contextualism

A first way of elaborating the thought that the standards of certainty are shifty consists in suggesting that ascriptions of certainty are, with respect to their truth-conditions, context-sensitive (Lewis 1976: 353-354, Stanley 2008, Petersen 2019, Beddor 2020a, b, Vollet 2020). A theory of this kind has notably been defended regarding ascriptions of knowledge. On this view, the epistemic standards that a subject S must satisfy with respect to a proposition p for a statement such as “S knows that p” to be true depend on the conversational context (Cohen 1988, Lewis 1996, DeRose 2009). Some relevant features of the context are the salience of various error possibilities, as well as possibilities that one must take into account given the stakes related to being wrong about p. In the same way as the question of whether S is tall cannot be answered without (implicitly) invoking a reference class, relative to which a standard is fixed (for instance, tall “for a basketball player” or “for a child”), the question of whether or not a proposition is certain might not be answerable independently of the context in which the word “certain” is used. There could be contexts in which the statement “It is certain that my lottery ticket is a losing ticket” is true (for example, a context in which we are discussing the advisability of making an important financial investment) and contexts in which that statement is false even though the evidential situation remains the same (for example, a context in which we are discussing the fact that at least one lottery ticket will be the winning one).

However, adopting a contextualist view of certainty does not suffice to vindicate a weak theory of certainty. For example, Beddor proposes that the ascription “It is (epistemically) certain for S that p” is true if and only if p is true in all the contextually relevant worlds compatible with S’s epistemic situation, where the space of contextually relevant worlds includes all worlds near the actual world. Under the assumption that there is always a nearby world where one’s ticket wins, (2) cannot, given such a view, qualify as certain (see also Lewis’ rule of resemblance, 1996: 557).
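The truth condition just described can be sketched in a few lines. This is a toy rendering under stated assumptions, not a reconstruction of the formal semantics in the literature: worlds are hypothetical pairs recording which of three tickets wins and whether the world was created five minutes ago, S holds ticket 0 and has not seen the draw, and every world in the model is treated as compatible with S’s epistemic situation. The names `World`, `NEARBY`, `FAR`, `relevant_worlds` and `is_certain` are illustrative only.

```python
from collections import namedtuple

# Hypothetical toy worlds: which ticket wins, and whether the world was
# created five minutes ago (a far-fetched skeptical scenario).
World = namedtuple("World", ["winner", "recent_creation"])

TICKETS = range(3)
NEARBY = [World(t, False) for t in TICKETS]  # assumed close to the actual world
FAR = [World(t, True) for t in TICKETS]      # assumed remote from the actual world


def compatible(w):
    # Assumption: S has not seen the draw, so nothing in S's epistemic
    # situation discriminates between the worlds of this toy model.
    return True


def relevant_worlds(extra=()):
    """Every context includes all nearby worlds; a context may add further ones."""
    return NEARBY + list(extra)


def is_certain(p, relevant):
    """p is certain in a context iff p is true at every relevant, compatible world."""
    return all(p(w) for w in relevant if compatible(w))


def ticket_loses(w):
    return w.winner != 0


def world_is_old(w):
    return not w.recent_creation


ordinary = relevant_worlds()
skeptical = relevant_worlds(extra=FAR)

print(is_certain(world_is_old, ordinary))   # True
print(is_certain(world_is_old, skeptical))  # False
print(is_certain(ticket_loses, ordinary))   # False
```

Because the nearby worlds are relevant in every context, the lottery proposition fails the test whatever the context, whereas the verdict on the proposition about the age of the world shifts with the set of relevant worlds. This is the sense in which such a view is contextualist without being a weak theory of certainty.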

iii. Pragmatic Encroachment

Some authors claim that the epistemic standards a subject must satisfy to know a proposition are partially determined by the question as to whether it is rational for this subject to act on the proposition’s truth given her overall practical situation (Stanley 2005a, Hawthorne 2004, Fantl and McGrath 2009). One may suggest that this “pragmatic encroachment” concerns also, or rather, epistemic certainty. For example, Stanley argues for the existence of a pragmatic encroachment on knowledge and maintains that knowledge determines epistemic certainties, that is, the epistemic possibilities relative to which a proposition can be considered to be epistemically certain (Stanley 2005a). Fantl and McGrath, for their part, defend the existence of a pragmatic encroachment on knowledge-level justification but reject the claim that knowledge-level justification determines epistemic certainties (Fantl and McGrath 2009). A third option would be to reject pragmatic encroachment on knowledge as well as the idea that knowledge determines epistemic certainties, while allowing pragmatic encroachment on epistemic certainties.

The conceptions according to which the criteria of epistemic certainty shift with the conversational context or the practical cost of error are compatible with a weak conception of epistemic certainty. Indeed, they can easily grant that there are contexts in which one says something true when one says of a proposition like (2) that it is certain for S.

4. Connections to Other Topics in Epistemology

The notion of certainty is connected to various epistemological debates. In particular, it is connected to philosophical issues concerning norms of assertions, actions, beliefs and credences. It also concerns central questions regarding the nature of evidence, evidential probability, and the current debate regarding epistemic modals (Beddor 2020b).

For example, some philosophers distinguish knowledge and certainty and propose to deal with concessive knowledge attributions by embracing the view that certainty is the epistemic norm of assertion (Stanley 2008, Petersen 2019, Beddor 2020b, Vollet 2020). A prominent argument for such a certainty norm of assertion comes from the infelicity of Moorean assertions involving certainty, such as, “p, but it’s not / I’m not certain that p.” In a similar vein, some philosophers defend a certainty norm of action and practical reasoning (Beddor 2020a, Vollet 2020). This is in part because such a norm can easily handle some of the counterexamples raised against competing knowledge norms (for such counterexamples, see Brown 2008, Reed 2010 and Roebert 2018; for an overview on knowledge norms, see Benton 2014).

For instance, Beddor argues that, with respect to the nature of evidence and evidential probability, we should analyze evidence in terms of epistemic certainty (Beddor 2020b). Such a view is supported by the oddity of the utterance, “It is certain that smoking causes cancer, but the evidence leaves open the possibility that smoking does not cause cancer.” This suggests that if p is epistemically certain, then p is entailed by the available evidence. In addition, the oddity of the utterance, “The medical evidence entails that smoking causes cancer, but it isn’t certain that smoking causes cancer” suggests that p is entailed by the available evidence only if p is epistemically certain.

Thus, given the relations it bears to other important philosophical notions, it is clear that certainty is central to epistemological theorizing. The difficulty of providing a fully satisfactory analysis of this notion might then suggest that certainty should, in fact, be treated as primitive.

5. References and Further Readings

  • Alston, W. (1992). Incorrigibility. In Dancy, Jonathan & Sosa, Ernest (Eds.), A Companion to Epistemology. Wiley-Blackwell.
  • Aristotle (1984). The Complete Works of Aristotle, Volumes I and II, ed. and tr. J. Barnes, Princeton: Princeton University Press.
  • Armstrong, D. M. (1963). Is Introspective Knowledge Incorrigible? Philosophical Review 72 (4): 417.
  • Armstrong, D. M. (1981). The Nature of Mind and Other Essays. Ithaca: Cornell University Press.
  • Audi, R. (2003). Epistemology: A Contemporary Introduction to the Theory of Knowledge. Routledge.
  • Ayer, A.J. (1956). The Problem of Knowledge. London: Penguin.
  • Ayer, A. J. (1963). The Concept of a Person and Other Essays. New York: St. Martin’s Press.
  • Barnes, G. W. (1973). Unger’s Defense of Skepticism. Philosophical Studies 24 (2): 119-124.
  • Beddor, B. (2020a). Certainty in Action. Philosophical Quarterly 70 (281): 711-737.
  • Beddor, B. (2020b). New Work for Certainty. Philosophers’ Imprint 20 (8).
  • Benton, M. A. (2014). Knowledge Norms. Internet Encyclopedia of Philosophy.
  • Brown, J. (2008). Subject‐Sensitive Invariantism and the Knowledge Norm for Practical Reasoning. Noûs 42 (2): 167-189.
  • Brown, J. (2011). Fallibilism and the Knowledge Norm for Assertion and Practical Reasoning. In Brown, J. & Cappelen, H. (Eds.), Assertion: New Philosophical Essays. Oxford University Press.
  • Brown, J. (2018). Fallibilism: Evidence and Knowledge. Oxford University Press.
  • Cargile, J. (1972). In Reply to A Defense of Skepticism. Philosophical Review 81 (2): 229-236.
  • Carnap, R. (1947). Meaning and Necessity. University of Chicago Press.
  • Carrier, L. S. (1983). Skepticism Disarmed. Canadian Journal of Philosophy. 13 (1): 107-114.
  • Carrier, L. S. (1993). How to Define a Nonskeptical Fallibilism. Philosophia 22 (3-4): 361-372.
  • Chisholm, R. (1976). Person and Object. La Salle, IL: Open Court.
  • Chisholm, R. (1989). Theory of Knowledge. 3rd. ed. Englewood Cliffs. NJ: Prentice-Hall.
  • Clarke, R. (2013). Belief Is Credence One (in Context). Philosophers’ Imprint 13:1-18.
  • Cohen, S. (1988). How to Be a Fallibilist. Philosophical Perspectives 2: 91-123.
  • Coliva, A. (2015). Extended Rationality: A Hinge Epistemology. Palgrave-Macmillan.
  • DeRose, K. (1991). Epistemic Possibilities. Philosophical Review 100 (4): 581-605.
  • DeRose, K. (1998). Simple ‘might’s, indicative possibilities and the open future. Philosophical Quarterly 48 (190): 67-82.
  • DeRose, K. (2009). The Case for Contextualism: Knowledge, Skepticism, and Context, Vol. 1. Oxford University Press.
  • Descartes, R. (1999). Rules for the Direction of the Natural Intelligence: A Bilingual Edition of the Cartesian Treatise on Method, ed. and tr. George Heffernan. Amsterdam: Editions Rodopi.
  • Descartes, R. (2008). Meditations on First Philosophy: With Selections from the Objections and Replies, trans. Michael Moriarty. Oxford: Oxford University Press.
  • Dicker, G. (1974). Certainty without Dogmatism: a Reply to Unger’s ‘An Argument for Skepticism’. Philosophic Exchange 5 (1): 161-170.
  • Dodd, D. (2010). Confusion about concessive knowledge attributions. Synthese 172 (3): 381-396.
  • Dodd, D. (2017). Belief and certainty. Synthese 194 (11): 4597-4621.
  • Dokic, J. (2012). Seeds of self-knowledge: noetic feelings and metacognition. Foundations of metacognition 6: 302–321.
  • Dokic, J. (2014). Feelings of (un)certainty and margins for error. Philosophical Inquiries 2(1): 123–144.
  • Dokic, J. & Engel, P. (2001). Frank Ramsey: Truth and Success. London: Routledge.
  • Dougherty, T. & Rysiew, P. (2009). Fallibilism, Epistemic Possibility, and Concessive Knowledge Attributions. Philosophy and Phenomenological Research 78 (1): 123-132.
  • Dougherty, T. & Rysiew, P. (2011). Clarity about concessive knowledge attributions: reply to Dodd. Synthese 181 (3): 395-403.
  • Dougherty, T. (2011). Fallibilism. In Duncan Pritchard & Sven Bernecker (eds.), The Routledge Companion to Epistemology. Routledge.
  • Douven, I. & Olders, D. (2008). Unger’s Argument for Skepticism Revisited. Theoria 74 (3): 239-250.
  • Dretske, F. (1970). Epistemic Operators. Journal of Philosophy 67: 1007-1023.
  • Dretske, F. (1971). Conclusive reasons. Australasian Journal of Philosophy 49 (1):1-22.
  • Dretske, F. (1981). The Pragmatic Dimension of Knowledge. Philosophical Studies 40: 363-378.
  • Dutant, J. (2015). The legend of the justified true belief analysis. Philosophical Perspectives 29 (1): 95-145.
  • Dutant, J. (2016). How to be an Infallibilist. Philosophical Issues 26 (1): 148-171.
  • Fantl, J. (2003). Modest Infinitism. Canadian Journal of Philosophy 33 (4): 537-562.
  • Fantl, J. & McGrath, M. (2009). Knowledge in an Uncertain World. Oxford University Press.
  • de Finetti, B. (1937). La Prévision: Ses Lois Logiques, Ses Sources Subjectives. Annales de l’Institut Henri Poincaré 7: 1–68.
  • de Finetti, B. (1990). Theory of Probability (Volume I). New York: John Wiley.
  • Firth, R. (1967). The Anatomy of Certainty. Philosophical Review 76: 3-27.
  • Foley, R. (1992). Working Without a Net: A Study of Egocentric Epistemology. New York: Oxford University Press.
  • Fumerton, R. (2005). Theories of justification. In Paul K. Moser (Ed.), The Oxford Handbook of Epistemology. Oxford University Press: 204–233.
  • Ganson, D. (2008). Evidentialism and pragmatic constraints on outright belief. Philosophical Studies 139 (3): 441-458.
  • Gärdenfors, P. and D. Makinson (1988). Revisions of Knowledge Systems Using Epistemic Entrenchment. In Theoretical Aspects of Reasoning About Knowledge, Moshe Vardi (Ed.) (Morgan Kaufmann): 83–95.
  • Greco, D. (2015). How I learned to stop worrying and love probability 1. Philosophical Perspectives 29 (1): 179-201.
  • Hawthorne, J. (2004). Knowledge and Lotteries. Oxford University Press.
  • Hawthorne, J., Rothschild, D. & Spectre, L. (2016). Belief is weak. Philosophical Studies 173 (5): 1393-1404.
  • Hintikka, J. (1962). Knowledge and Belief: An Introduction to the Logic of the Two Notions. V. Hendriks and J. Symons (Eds.). London: College Publications.
  • Huemer, M. (2007). Epistemic Possibility. Synthese 156 (1): 119-142.
  • Hume, D. (1975). A Treatise of Human Nature. ed. by L. A. Selby-Bigge, 2nd ed. rev. by P. H. Nidditch. Oxford: Clarendon Press.
  • Hume, D. (1993). An Enquiry Concerning Human Understanding. ed. Eric Steinberg. Indianapolis: Hackett Publishing Co.
  • Jeffrey, R. (1965). The Logic of Decision. New York: McGraw-Hill.
  • Jeffrey, R. (1970). Dracula meets Wolfman: Acceptance vs. Partial Belief. In Induction, Acceptance, and Rational Belief. Marshall Swain (Ed.) Dordrecht: D. Reidel Publishing Company: 157-85.
  • Jeffrey, R. (2004). Subjective Probability. The Real Thing. Cambridge: Cambridge University Press.
  • Joyce, J. M. (1999). The Foundations of Causal Decision Theory. New York: Cambridge University Press.
  • Kattsoff, L. O. (1965). Malcolm on knowledge and certainty. Philosophy and Phenomenological Research 26 (2): 263-267.
  • Kauss, D. (2020). Credence as doxastic tendency. Synthese 197 (10): 4495-4518.
  • Klein, P. (1981). Certainty: A Refutation of Scepticism. Minneapolis: University of Minnesota Press.
  • Klein, P. (1992). Certainty. In J. Dancy and E. Sosa (Eds.), A Companion to Epistemology. Oxford: Blackwell: 61-4.
  • Kolmogorov, A. N. (1956). Foundations of the Theory of Probability. New York: Chelsea Publishing Company.
  • Lasonen-Aarnio, M. (2020). Enkrasia or evidentialism? Learning to love mismatch. Philosophical Studies 177 (3): 597-632.
  • Lehrer, K. (1974). Knowledge. Oxford: Clarendon Press.
  • Leitgeb, H. (2013). Reducing belief simpliciter to degrees of belief. Annals of Pure and Applied Logic 164 (12): 1338-1389.
  • Leitgeb, H. (2014). The Stability Theory of Belief. Philosophical Review 123 (2): 131-171.
  • Leitgeb, H. (2017). The Stability of Belief: How Rational Belief Coheres with Probability. Oxford University Press.
  • Levi, I. (1983). Truth, fallibility and the growth of knowledge. In R. S. Cohen & M. W. Wartofsky (Eds.), Boston studies in the philosophy of science (Vol. 31, pp. 153–174). Dordrecht: Springer.
  • Lewis, C. I. (1929). Mind and the World Order. New York: Dover.
  • Lewis, C. I. (1946). An Analysis of Knowledge and Valuation. Open Court.
  • Lewis, D. (1979). Scorekeeping in a Language Game. Journal of Philosophical Logic 8 (1): 339-359.
  • Lewis, D. (1996). Elusive Knowledge. Australasian Journal of Philosophy 74 (4): 549-567.
  • Littlejohn, C. (2011). Concessive Knowledge Attributions and Fallibilism. Philosophy and Phenomenological Research 83 (3): 603-619.
  • Locke, D. (2015). Practical Certainty. Philosophy and Phenomenological Research 90 (1): 72-95.
  • Locke, J. (1975). An Essay Concerning Human Understanding, Peter H. Nidditch (Ed.), Oxford: Clarendon Press.
  • Malcolm, N. (1952). Knowledge and belief. Mind 61 (242): 178-189.
  • Malcolm, N. (1963). Knowledge and Certainty. Englewood Cliffs, NJ: Prentice-Hall.
  • McDowell, J. H. (1982). Criteria, Defeasibility, and Knowledge. Proceedings of the British Academy, 68: 455–479.
  • Miller, R. W. (1978). Absolute certainty. Mind 87 (345): 46-65.
  • Moore, G. E. (1959). Certainty. In Philosophical Papers. London: George Allen & Unwin, 227-251.
  • Nozick, R. (1981). Philosophical Explanations. Cambridge: Cambridge University Press.
  • Pasnau, R. (2013). Epistemology Idealized. Mind 122 (488): 987-1021.
  • Peirce, C. (1877/2011). The Fixation of Belief. In R. Talisse & S. Aikin (Eds.). The Pragmatism Reader: From Peirce Through the Present. Princeton University Press: 37-49.
  • Petersen, E. (2019). A case for a certainty norm of assertion. Synthese 196 (11): 4691-4710.
  • Plantinga, A. (1993). Warrant and Proper Function. Oxford University Press.
  • Plato (1997). Republic. In J. M. Cooper (Ed.). Plato: Complete Works. Indianapolis: Hackett.
  • Pritchard, D. (2008). Certainty and Scepticism. Philosophical Issues 18 (1): 58-67.
  • Pritchard, D. (2016). Epistemic Angst. Radical Scepticism and the Groundlessness of Our Believing, Princeton University Press.
  • Ramsey, F. P. (1926). Truth and Probability. In R. B. Braithwaite (Ed.). Foundations of Mathematics and Other Logical Essays. London: Kegan, Paul, Trench, Trubner & Co., New York: Harcourt, Brace and Company: 156–198.
  • Reed, B. (2002). How to Think about Fallibilism. Philosophical Studies 107: 143-57.
  • Reed, B. (2008). Certainty. Stanford Encyclopedia of Philosophy.
  • Reed, B. (2010). A defense of stable invariantism. Noûs 44 (2): 224-244.
  • Roeber, B. (2018). The Pragmatic Encroachment Debate. Noûs 52 (1): 171-195.
  • Roorda, J. (1997). Fallibilism, Ambivalence, and Belief. Journal of Philosophy 94 (3): 126.
  • Rothschild, D. (2020). What it takes to believe. Philosophical Studies 177 (5): 1345-1362.
  • Russell, B. (1912). The Problems of Philosophy. London: Williams & Norgate.
  • Russell, B. (1948). Human Knowledge: Its Scope and Limits. New York: Simon and Schuster.
  • Rysiew, P. (2001). The Context-sensitivity of Knowledge Attributions. Noûs 35 (4): 477–514.
  • Rysiew, P. (2007). Speaking of Knowledge. Noûs 41: 627–62.
  • Savage, L. J. (1954). The Foundations of Statistics. New York: John Wiley.
  • Skyrms, B. (1980). Causal Necessity: A Pragmatic Investigation of the Necessity of Laws. Yale University Press.
  • Stalnaker, R. (1991). The problem of logical omniscience, I. Synthese 89 (3): 425–440.
  • Stanley, J. (2005a). Knowledge and Practical Interests. Oxford University Press.
  • Stanley, J. (2005b). Fallibilism and concessive knowledge attributions. Analysis 65 (2): 126-131.
  • Stanley, J. (2008). Knowledge and Certainty. Philosophical Issues 18 (1): 35-57.
  • Sturgeon, S. (2008). Reason and the grain of belief. Noûs 42 (1): 139–165.
  • Turri, J. (2010). Prompting Challenges. Analysis 70 (3): 456-462.
  • Unger, P. (1975). Ignorance: A Case for Scepticism. Oxford: Clarendon Press.
  • Van Cleve, J. (1977). Probability and Certainty: A Reexamination of the Lewis-Reichenbach Debate. Philosophical Studies 32: 323-34.
  • Vazard, J. (2019). Reasonable doubt as affective experience: Obsessive–compulsive disorder, epistemic anxiety and the feeling of uncertainty. Synthese https://doi.org/10.1007/s11229-019-02497-y
  • Vollet, J.-H. (2020). Certainty and Assertion. Dialectica, 74 (3).
  • Vollet, J.-H. (2022). Epistemic Excuses and the Feeling of Certainty, Analysis.
  • von Fintel, K. and A. Gillies (2007). An Opinionated Guide to Epistemic Modality. In T. Gendler and J. Hawthorne (ed.), Oxford Studies in Epistemology, Volume 2. New York: Oxford University Press.
  • Weatherson, B. (2005). Can we do without pragmatic encroachment. Philosophical Perspectives 19 (1): 417–443.
  • Wedgwood, R. (2012). Outright Belief. Dialectica 66 (3): 309–329.
  • Williamson, T. (2000). Knowledge and Its Limits. Oxford University Press.
  • Williamson, T. (2009). Reply to Mark Kaplan. In Pritchard, D. and Greenough, P. (ed.) Williamson on Knowledge. Oxford: Oxford University Press.
  • Wittgenstein, L. (1969). On Certainty. G.E.M. Anscombe & G.H. von Wright (Eds.). New York: Harper & Row.

 

Author Information

Miloud Belkoniene
Email: miloud@belkoniene.org
University of Glasgow
United Kingdom

and

Jacques-Henri Vollet
Email: jacquesvollet@yahoo.fr
University Paris-Est Créteil
France

Bodily Awareness

Most of us agree that we are conscious, and we can be consciously aware of public things such as mountains, tables, foods, and so forth; we can also be consciously aware of our own psychological states and episodes such as emotions, thoughts, perceptions, and so forth. Each of us can be aware of our body via vision, sound, smell, and so on. We also can be aware of our own body “from the inside,” via proprioception, kinaesthesis, the sense of balance, and interoception. When you are reading this article, in addition to your visual experiences of many words, you might feel that your legs are crossed, that one of your hands is moving toward a coffee mug, and that you are a bit hungry, without ever seeing or hearing your limbs and your stomach. We all have these experiences. The situation can get peculiar, intriguing, and surprising if we reflect upon it a bit more: the body and its parts are objective, public things, and that is why in principle everyone else can perceive our bodies. But the body and its parts also have a subjective dimension. This is why many believe that in principle only one’s own self can be aware of one’s own body “from the inside.” Consciousness of, or awareness of, one’s own body, then, can generate many interesting and substantive philosophical and empirical questions due to the objective-subjective dual aspects, as is seen below. The beginning of section 1 introduces the structure of this article and presents some caveats. Having these early on can be daunting, but they occur there because this is a complicated area of study.

Table of Contents

  1. Varieties of Bodily Awareness
    1. Touch
    2. Proprioception, Kinaesthesis, and the Vestibular Sense
    3. Thermal Sensation, Pain, and Interoception
    4. Bodily Feelings
    5. Bodily Representations: Body Image, Body Schema, and Peripersonal Space
  2. Contemporary Issues
    1. Is There a Tactile Field?
    2. Does Bodily Immunity to Error Through Misidentification Hold?
    3. How Do Body Ownership and Mental Ownership Relate?
    4. Must Bodily Awareness Be Bodily Self-Awareness?
    5. What Does Body Blindness, Actual or Imagined, Show?
  3. Phenomenological Insights: The Body as a Subjective Object and an Objective Subject
    1. Two Notions of the Body
    2. Non-Perceptual Bodily Awareness
  4. Conclusion
  5. References and Further Reading

1. Varieties of Bodily Awareness

Bodily awareness, or bodily consciousness, covers a wide range of experiences. It is closely related to, though crucially different from, bodily representation (1.e) and bodily self-awareness (2.d). Another related notion is bodily self-knowledge, which includes immunity to error through misidentification (2.b). What follows covers broad territory, and it is unrealistic to claim comprehensiveness. It is divided in the following way: section 1 discusses varieties of bodily awareness, without committing to the view that this represents the definitive classification of bodily awareness (Armstrong, 1962): different researchers would carve things up in slightly different ways, but the most important elements are covered here. Section 2 surveys several contemporary issues in Anglo-Saxon philosophy and cognitive sciences. Note that the divide between sections 1 and 2 is somewhat artificial: in introducing varieties of bodily awareness, we will of course discuss theoretical issues and questions in those areas; otherwise, it would become pure reportage. However, this divide between sections 1 and 2 is not entirely arbitrary, since while section 1 will be primarily on different varieties of bodily awareness, section 2 will be explicitly question-oriented. They will be mutually complementary and not repetitive. Section 3 discusses some insights from the phenomenological tradition with a specific focus on the lived body as a subjective object and an objective subject. The divide between sections 2 and 3 can also be seen as somewhat artificial: it would be perfectly sensible to spread those or even more phenomenological insights along the way in sections 1 and 2. This will not be the strategy because in practice, these traditions work in parallel most of the time, and seek to communicate when there are opportunities. It will be conceptually cleaner if we proceed in a way that separates them first.
Also, the phenomenological insights covered below seem especially suitable for the larger issues in section 3, so we will save them mostly for that section, with the proviso that many ideas in section 3 will rely on various elements in the previous sections, and that considerations from the analytic tradition will creep back toward the end. Note that the discussions of section 3 are highly selective; after all, this article is mostly written from the analytical point of view. Many phenomenologists have studied the body and bodily awareness intensively, but for the flow of the narrative and the scope of the article, they are not included below. Notable names that we will not discuss include Aron Gurwitsch (1964), Michel Henry (1965), Dorothée Legrand (2007a, 2007b), and Dan Zahavi (2021). Section 4 concludes and summarises.

a. Touch

What is touch? This question is surprisingly difficult to answer if what we are looking for is a precise definition. Examples are easy to give: we (and other animals) touch things with our hands, feet, and/or other parts of the body when we make contact with those things with body parts. Things quickly become murkier when we consider specific conditions; for example, is skin necessary for touch? Many animals do not have skin, at least under common understandings of what skin is, but they can touch things and have tactile experiences, at least according to most. Even humans seem to be able to touch things with lips, tongues, and eyes, which are not covered by skin, and thereby have tactile experiences. Some would even claim that when one’s stomach is in contact with food, one can sometimes feel tactile sensations, though see discussions of interoception below (1.c). So even if we only focus on examples, it is difficult to differentiate touch from non-touch. Moreover, many touches or tactile experiences seem to involve indirect contact: for example, your hands can touch your shoulders even when wearing clothes or gloves; one’s hands can receive tactile feedback when using crutches to walk. Exactly how to conceive the relation between touch and contact remains controversial.

What about definitions then? This often appears under the heading of “individuating the senses” (for example, Macpherson, 2011): what are the individuation conditions of, say, vision, audition, olfaction, gustation, touch, and perhaps other senses? Aristotle in De Anima proposed the “proper object account”: colours are only for vision, sounds are only for audition, smells are only for olfaction, tastes are only for gustation, and so on. But what about touch? There does not seem to be any proper object for it. With touch we can take in information about objects’ sizes and shapes, but this information can also be taken in by sight, or perhaps even by audition: we seem to be able to hear (to some extent) whether the rolling rocks are huge or small, or what the shape of a room roughly is (for example, Plumbley, 2013). Some have argued that pressure is the proper object of touch (Vignemont and Massin, 2015), though controversies have not been settled. Researchers have proposed many other candidate criteria, including the representational criterion, the phenomenal character criterion, the proximal stimulus criterion, the sense-organ criterion, and so on. Each has its strengths and weaknesses. Still, there are difficult questions to answer such as: are ventral and dorsal vision separate senses? How about orthonasal and retronasal olfaction (Wilson, 2021)? Do neutral touch, thermoception, and nociception form a unitary sense (Fulkerson, 2013)? To acknowledge touch as one element of bodily awareness, though, one does not need to resolve these difficult questions first.

Setting aside the above controversies, a basic distinction within touch is between haptic/active and passive touch. While in daily life, creatures often actively explore objects in the environment, they also experience passive touch all the time; consider the contacts between your body and the chair you sit on, or the clothing that covers different parts of your body. This distinction is closely related to, though not perfectly mapped onto, the distinction between kinaesthesis and proprioception (see the next subsection). In experimental work, laboratories tend to specialise in either haptic or passive touch, focusing on their temporal and/or spatial profiles. For example, in the famous cutaneous rabbit illusion (a.k.a. cutaneous saltation), where participants feel a tactile illusion induced by tapping multiple separate regions of the skin (often on a forearm) in rapid succession (Geldard and Sherrick, 1972), participants are asked not to move their body; the same is true of the perhaps even more famous rubber hand illusion, in which the feeling that a rubber hand belongs to one’s body is generated by stroking a visible rubber hand synchronously with the participant’s own hidden hand (Ehrsson, Spence, and Passingham, 2004; also see a related four-hand illusion in Chen, Huang, Lee, and Liang, 2018, where each participant has the illusory experience of owning four hands). Varieties of tactile and body illusions are important entry points for researchers to probe the distinctive properties of touch. Vignemont (2018) offers an excellent list of bodily illusions with informative descriptions (pp. 207-211).

An important approach to studying touch is to look into cases in which the subjects have no sight, whether congenitally or otherwise (Morash, Pensky, Alfaro, and McKerracher, 2012). This also includes experimental conditions where participants are blindfolded or situated in a dark room. This is a useful method because crossmodal or multisensory interactions can greatly influence tactile experiences; therefore, blocking the influence from vision (and other senses) helps ensure that what is being studied is touch itself. This is one reason why Molyneux’s question is so theoretically relevant and intriguing (Locke, 1693/1979; Cheng, 2020; Ferretti and Glenney, 2020). Molyneux’s question hypothesizes that it is possible to restore the vision of those who are born completely blind. It then asks whether the subjects who obtain this new visual capability can immediately tell which shapes are which, solely by vision. The answer depends on how we think of the structural similarities between sight and touch, how amodal spatial representation works in transforming spatial representations in different modalities, and so on. The same consideration about blocking crossmodal effects applies to audition: in experiments on touch, participants are often asked to put on earplugs or headphones playing white noise. The relations between sight, touch, and multimodality have been important in the literature, but this goes beyond the scope of this article.

Touch is a form of perception, and in many philosophical and empirical studies of touch, researchers focus primarily on the “cold” aspect of it; that is, sometimes people talk as if touch is primarily about gathering information about the immediate environment and one’s own body. But touch also has a “hot” aspect, which is often called “affective touch.” This cold/hot distinction is also applicable to other sense modalities, and even to cognition. While “cold” perceptions or cognitions are often said to be receptive and descriptive, “hot” perceptions and cognitions are by contrast evaluative and motivational. Affective perceptions involve conscious experiences, emotions, and evaluative judgments. Another way to pick out this “hot” aspect is to label these perceptions as “valenced.” Focusing on touch, it is notable that tactile experiences often, if not always, have a felt pleasant or unpleasant phenomenal character. Phenomenologically speaking, these valences might feel as if they are integral to tactile experiences themselves, though physiologically, specialised afferent nerve channels, the “CT-afferents,” might be distinctively responsible for pleasantness (McGlone, Wessberg, and Olausson, 2014). Affective perceptions, touch included, seem to be essential to varieties of social relations and aesthetic experiences, and this makes them a much wider topic of study in philosophy, psychology, and beyond (Nanay, 2016; Korsmeyer, 2020).

Touch carries information both about the external world and about the body itself (Katz, 1925/1989). It is related to other forms of bodily awareness, such as proprioception and kinaesthesis, thermal sensation and pain, interoception, and so on. These will be discussed in some detail in the following subsections. For other philosophical discussions concerning touch, for example, varieties of tangible qualities, the nature of pleasant touch, and the relation between touch and action, see for example Fulkerson (2015/2020).

b. Proprioception, Kinaesthesis, and the Vestibular Sense

The term “proprioception” can be traced back at least to Sherrington (1906): “In muscular receptivity, we see the body itself acting as a stimulus to its own receptors – the proprioceptors.” This definition has been refined many times in the past century, and the term has at least a broad and a narrow meaning. Broadly construed, this term is interchangeable with “kinaesthesis,” and they jointly refer to the sense through which subjects can perceive or sense the position and movement of their bodies (Tuthill and Azim, 2018). Narrowly construed, “proprioception” refers to the perception or at least sensing of the positions of our body parts, whereas “kinaesthesis” refers to the perception or at least sensing of the movement of our body parts. The reservation here concerning perception is that some would think perception is necessarily exteroceptive and can be about multiple objects, while some might regard proprioception and kinaesthesis as interoceptive and as being about only one specific object (note that Sherrington himself clearly distinguishes proprioception from interoception; for more on interoception and related issues, see also 1.c and 3.b). With this narrower usage, one can see that proprioception and kinaesthesis can sometimes be dissociated, but they often occur together: when we sit or stand without any obvious movement, we still feel where our limbs are and how they stretch, and so forth, so this can be a case of having proprioception without kinaesthesis. In other cases, where we move around or use our hands to grab things, we feel at the same time the positions and movements of our body parts.

Proprioception and kinaesthesis raise some distinctive philosophical issues (for example, Fridland, 2011); specifically, some have argued that, surprisingly, one can proprioceive someone else’s movements in some sense (Montero, 2006); proprioception has also been explored as an aesthetic sense (Schrenk, 2014) and an affective sense (Cole and Montero, 2007). In considering deafferented subjects, who lack proprioceptive awareness of much of their bodies (that is, who are “body blind”; see 2.e), some have considered the role of proprioceptive awareness in our self-conscious unity as practical subjects (Howe, 2018). Relatedly, it has been argued that the possibility of bodily action is provided by multimodal body representations for action (Wong, 2017a). Also based on deafferented patients, some have argued that proprioception is necessary for body schema plasticity (Cardinali, Brozzoli, Luauté, Roy, and Farnè, 2016). Moreover, some have argued that proprioception is our direct, immediate knowledge of the body (Hamilton, 2005). It has also been identified as a crucial element in many other senses (O’Dea, 2011). And there is much more. To put it bluntly, proprioception is almost everywhere in our conscious life, though this might not be obvious before being pointed out. It is worth noting that the above contributions are from both philosophers and empirical researchers, and sometimes it is hard to figure out whether a specific work is by philosophers or scientists.

The vestibular sense or system in the inner ear is often introduced with proprioception and kinaesthesis as bodily senses; it is our sense of balance, including sensations of body rotation, gravitation, acceleration, and movement. The system includes two structures of the bony labyrinth of the inner ear – the vestibule and the semicircular canals. When it goes wrong, we feel dizziness or vertigo. The basic functions of the vestibular system include stabilising postures and gazes and providing the gravitational or geocentric frame of reference (Berthoz, 1991). It is multisensory in the sense that it is often or even always implicated in other sense perceptions. Whether it has “proprietary phenomenology,” that is, phenomenology specific to it, is a matter of dispute (Wong, 2017b). It has been discussed less often in philosophical contexts, though in recent years it has begun to figure in philosophy as well. What are the distinctive features of the vestibular sense or system? Here are some potential candidates: vestibular afferents are constantly active even when we are motionless; there is “no overt, readily recognizable, localizable, conscious sensation from [the vestibular] organs” (Day and Fitzpatrick, 2005, p. R583); it enables an absolute frame of reference for self-motion, particularly absolute head motion in a head-centered frame of reference; and vestibular information and processing in the central nervous system is highly multisensory (Wong, 2017b). It can be argued, however, that some of these characteristics are shared with other senses. For example, the first point might be applicable to proprioception, and the fourth point might be applicable to some cases of touch. Still, even if these four points are not exclusive to the vestibular sense, they are at least important characteristics of it. One major philosophical import of the vestibular sense is the ways in which it relates self, body, and world.
More specifically, the vestibular system plays crucial roles “in agentive self-location…, in anchoring the self to its body…, and in orienting the subject to the world… balance is being-in-my-body-in-the-world” (ibid., pp. 319-320, 328). Note that self-location is often but not always bound up with body-location: in the case of out-of-body experience (Lenggenhager, Tadi, Metzinger, and Blanke, 2007), for example, the two are dissociated. It has also been proposed that there should be a three-way distinction here: in addition to self-location and body-location, there is also “1PP-location”: “the sense of where my first-person perspective is located in space” (Huang, Lee, Chen, and Liang, 2017).

c. Thermal Sensation, Pain, and Interoception

Another crucial factor in bodily awareness is thermal sensation or thermoception, which is necessarily implicated in every tactile experience: people often do not notice the thermal aspect of touch, but it can become salient when, for example, the coffee is too hot, or the bathing water is too cold. Thermal sensations also occur without touch: people feel environmental temperatures without touch (exteroceptive), and they feel body temperature in body parts that have no contact with things (interoceptive; for more on the exteroceptive and the interoceptive characters of thermal perception, see Cheng, 2020). Thermal illusions are also ways of probing the nature of bodily awareness (for example, thermal referral, Cataldo, Ferrè, di Pellegrino, and Haggard, 2016; thermal grill, Fardo, Finnerup, and Haggard, 2018). Connecting back to the discussion of the individuation of the senses, there is a question concerning how many senses there are within the somatosensory system. More specifically, are touch, thermal sensation, and nociception (see below) different senses? Or should they be grouped as one sense modality? Or perhaps this question has no proper theoretical answer (Ratcliffe, 2012)? Besides, there are questions specific to thermal perception. For example, what do experiences of heat and cold represent, if they represent anything at all? Do they represent states or processes of things? Gray (2013) argues that experiences of heat and cold do not represent states of things; they represent processes instead. More specifically, he develops the “heat exchange model of heat perception,” according to which experiences of heat and cold represent “the opposite processes of thermal energy being transmitted to and from the body, respectively” (p. 131). Relating this back to general considerations in philosophy of mind and metaphysics should help us understand what is at stake: some have argued that the senses do not have intentional content, that is, they do not represent (Travis, 2014).
Many philosophers demur and hold the “content view” of experience instead (Siegel, 2010), but within the content view the major variant is that sensory experiences represent objects such as tables, chairs, mountains, and rivers; they also represent states of things, such as how crowded the room is, or the temperatures of things people are in contact with. Gray’s view is that experiences of heat and cold do represent, but what they represent are not states but a certain kind of process (for more on the ontological differences between events, processes, and states, see Steward, 1997). This view is controversial, to be sure, but it opens up a new theoretical possibility that should be considered seriously. Philosophical discussions of thermal perception or the thermal sense have been quite limited so far, and there might be some more potential in this area.

Pain is often regarded as having similar status as thermal perception, that is, subjective and (at least often) interoceptive, though pain seems to have drawn more attention at least in philosophy (for example, the toy example with pain and C-fibre firing). In the empirical literature, “pain” tends to occur with another term “nociception,” but they are strictly speaking different: “Pain is a product of higher brain center processing, whereas nociception can occur in the absence of pain” (National Research Council, 2009). This is not to deny they have large physiological overlaps, but since we do not aim to cover physiology, readers are encouraged to look for relevant resources elsewhere. Pain sometimes appears in the context of touch, for example, under specific circumstances where touch is multisensory (Fulkerson, 2015/2020); it also occurs in the context of thermal pain. But pain also has its own distinctive philosophical issues: do pains represent at all? Are painful qualities exhausted by representational properties (for example, Lycan, 1987)? Do pains have physical locations (for example, Bain, 2007)? How should we explain clinical cases such as pain asymbolia, that is, the syndrome with which subjects can feel pain but do not care to remove them (Berthier, Starkstein, and Leiguarda, 1988)? Is pain a natural kind (Corns, 2020)? Amongst these significant questions, arguably the most central question concerning the nature of pain is epitomised by the so-called “paradox of pain,” that is, according to the folk conception of pain, it is both mental and bodily (Hill, 2005, 2017; Aydede, 2013; Reuter, Phillips, and Sytsma, 2014; Reuter, 2017; Borg, Harrison, Stazicker, & Salomons, 2020). On the one hand, pains seem to allow privileged access for the subject in question and admits no appearance/reality distinction (Kripke, 1980; Searle, 1992), while on the other hand, pains seem to be bodily states, processes, or activities, just like bodily damages are. 
In addition to these two opposing views, there is also the “polyeidic view,” according to which our concept of pain is polyeidic or multi-dimensional, “containing a number of different strands or elements (with the bodily/mental dimension being just one strand among others)” (Borg, Harrison, Stazicker, & Salomons, 2020, pp. 30-31). Moreover, there is also the “polysemy view,” according to which pain terms are polysemous, referring to both mental and bodily states (Liu, 2021). Without going into the details, three observations are on offer. Firstly, some have argued that the above discussions tend to be conducted in English, but other languages might reflect different conceptions of pain (Liu and Klein, 2020). Secondly, it is easy to run two debates together, one about the nature or metaphysics of pain, and the other about folk notions or concepts of pain. And thirdly, it can sometimes seem that the above debate is at least partially about consciousness in general, not about pain. For example, when people disagree about whether one can draw the distinction between appearance and reality for pain, it seems that the disagreement is actually about consciousness, whether it is painful experience or otherwise.

Apart from the above controversies, there is a relatively new category that has not been widely recognised in the literature as an independent sense, although the experience itself is familiar enough: as Lin, Hung, Han, Chen, Lee, Sun, and Chen (2018) point out, “acid or soreness sensation is a characteristic sensory phenotype of various acute and chronic pain syndromes” (p. 1). The question is whether such sensations should be classified under nociception, or whether they should be singled out as a distinct sense, which the authors call “sngception.” What is sngception exactly? In a certain variant of Chinese, acid pain is called “sng” (「痠」), “meaning a combination of soreness and pain, and is much more commonly reported than ‘pain’ among patients with chronic pain, especially for those with musculoskeletal pain” (ibid., p. 5). The authors introduced this term “specifically to describe the response of the somatosensory nervous system to sense tissue acidosis or the activation of acid-sensitive afferent neurons” (ibid., p. 6). The authors’ reason for distinguishing it from other elements of bodily awareness is primarily physiological, and as indicated above we will not go into those biological details. As far as individuating the senses is concerned, physiology is an important consideration, but it is far from decisive (Macpherson, 2011). Whether sngception should really be distinguished from pain and nociception is an open empirical question.

Interoception is to be contrasted with exteroception: the contrast concerns whether the senses in question are directed toward the outside or the inside, to put it crudely. One major difficulty is how to draw the inner/outer boundary, since not every part of our body is covered by skin, but there seems to be an intuitive sense in which we want to classify specific senses as exteroceptive or interoceptive. For example, the classical five senses – vision, audition, olfaction, gustation, and touch – are exteroceptive, while proprioception, kinaesthesis, feelings of heartbeats and the gut, and so forth, are interoceptive. A more technical definition is this: “Interoception is the body-to-brain axis of signals originating from the internal body and visceral organs (such as gastrointestinal, respiratory, hormonal, and circulatory systems)” (Tsakiris and de Preester, 2019, p. v; some use “visceroception” to refer to the sensing of visceral organs). But in the very same piece, in fact in the next two sentences, the authors say that it “refers to the sensing of the state of the inner body and its homeostatic needs, to the ever-fluctuating state of the body beneath its sensory (exteroceptive) and musculoskeletal sheath” (ibid., p. v). These two definitions or characterisations are already not identical, and this shows that interoception is a rich territory that covers a lot of ground. More classically, and also from Sherrington (1906), interoception “is based on cardiovascular, respiratory, gastrointestinal, and urogenital systems, [which] provides information about the physiological condition of the body in order to maintain optimal homeostasis” (Vignemont, 2020, p. 83).
Defining interoception has proven to be extremely difficult: in the literature, there have been the sole-object definition (“interoception consists of information that is exclusively about one’s body”), the insider definition (“interoception consists of information about what is internal to the body and not about what is at its surface”), and the regulator definition (“interoception consists of information that plays a role in internal regulation and monitoring”). Each of them has a certain initial plausibility, but they all face difficult challenges and potential counterexamples (Vignemont, 2018a).

Interoception provides many good examples of the philosophical relevance of bodily awareness: can interoception provide priors for the Bayesian predictive model? How does interoception shape the perception of time? In what way and to what extent is the brain-gut axis mental? What is the relation between interoception and emotion? And there are many more, especially if we consider how interoception interacts with other elements of bodily awareness, and with exteroceptive senses such as vision, audition, and olfaction (Tsakiris and de Preester, 2019). In the past two decades, interoception has been thought to be connected with (bodily or otherwise) self-awareness, as in the proto-self (Damasio, 1999), the sentient self (Craig, 2003), the embodied self (Seth, 2013), and the material me (Tsakiris, 2017). However, Vignemont (2018a) argues that interoceptive feelings by themselves cannot distinguish self and non-self, but that they provide an affective background for bodily sensations (more on “feelings” in the next subsection).

d. Bodily Feelings

It is quite important to note that there is a group of bodily experiences that is recognisably different from all of the above. According to Vignemont (2020), they are different specifically in that these feelings are relatively permanent features of bodily awareness. In the literature, the following three are the most prominent:

The feeling of bodily presence: The body in the world.

The feeling of bodily capacities: The body in action.

The feeling of bodily ownership: The body and the self.

The notion of “presence” here is derived from the sensorimotor approach, and primarily from the case of vision (for example, Noë, 2004): when one sees a tree in front of her, for example, her sensorimotor skills or knowledge enable her to have a sense of the visual presence of the sides and the back of the tree. Quite independently of the plausibility of the sensorimotor approach itself, that understanding of presence can be appropriated to characterise bodily experiences. For example, when one feels a tickle in her left wrist, she feels not only that specific spot, but also the nearby areas of skin, muscles, and joints. There is a sense in which the body is there (presence, rather than absence), though not all parts of it are in the foreground of our awareness. This feeling of presence can sometimes be replaced by the feeling of absence, for example, in the case of depersonalisation (more on this in 3.d), and this is sometimes classified as a sensory problem.

Bodily capacities include feelings of being able and unable to do things with one’s own body. In the literature this is sometimes called the “sense of agency,” but that normally refers to “the awareness of oneself as the cause of a particular action” (Vignemont, 2020, p. 85; emphasis added). By “bodily capacities,” here we mean something more permanent, that is, long-term capacities to do various things with one’s body. Where do these feelings come from? They might come from monitoring our past capacities for doing things, and hence involve certain metacognitive abilities, which need to be studied themselves. This sense of bodily capacities can sometimes be replaced by the feeling of bodily incapacities, for example, in the case of hysterical conversion (roughly, wrongly assuming that parts of one’s body are paralysed).

Bodily or body ownership is probably the most discussed phenomenon in this area, so it will also be covered in 2.b and 2.c below. In most cases, “one does not normally experience a body that immediately responds to one’s intentions; one normally experiences one’s own body” (Vignemont, 2020, p. 86). Bermúdez (2011/2018) argues that this kind of body ownership involves only judgments, not feelings, but this remains controversial. This sense of ownership can sometimes be replaced by the feeling of bodily disownership, for example, in the case of somatoparaphrenia (more on this in 3.b). It has been argued that bodily ownership crucially involves affective consciousness (Vignemont, 2018b).

In a slightly different context, Matthew Ratcliffe (2005, 2008, 2016) has developed a sophisticated theory of existential feelings, which are both bodily and affective. Feelings of this kind shape humans’ space of possible actions. They are pre-structuring backgrounds of all human experiences, and they are themselves parts of experiences as well. Ratcliffe argues that these bodily existential feelings are different from emotions and moods; they are sui generis. How this kind of feeling relates to other mental phenomena, such as thoughts and self-consciousness, remains to be seen (Kreuch, 2019). For our purposes, the most relevant question might be: in what way and to what extent do these existential feelings overlap with the three kinds of bodily feelings Vignemont identifies?

e. Bodily Representations: Body Image, Body Schema, and Peripersonal Space

Bodily awareness is closely related to bodily representations, just as awareness or consciousness in general is closely related to representations. The three notions introduced in this subsection are often understood in terms of mental representations, though they do not have to be (for anti-representational alternatives specific to some of these notions, see for example Gallagher, 2008). No matter how they are understood, there is a consensus that they play significant roles in understanding bodily awareness. Let’s begin with body image, which refers, very generally speaking, to the subject’s mental representation of the configuration of their own body. In philosophy, Brian O’Shaughnessy’s works (1980, 1995) have brought it into focus. He posits a long-term body image that sustains the spatial structure of our bodily awareness. This is a rather static notion: the spatial structure can be quite relevant to possible actions, but the notion does not mention actions explicitly. Body schema, by contrast, is defined as consisting in sensory-motor capacities and actions. It is worth noting that the discussions we are familiar with today are already quite different from the original discussions in the early 20th century, notably those of Head and Holmes (1911). For example, they did not mention action at all, and they distinguished two types of body schema, one that keeps track of postural changes and the other that represents the surface of the body. Also note that in other disciplines a broader notion of body image is sometimes invoked to refer to one’s thoughts and feelings about the attractiveness of one’s own body, but in philosophy we tend to stick to its narrower meanings. Note also that there is a group of questions concerning whether bodily awareness requires action (Briscoe, 2014; see also 3.b) and whether action requires bodily awareness (O’Shaughnessy, 2000; Wong, 2015), which we do not review here.

This pair of notions can seem to be intuitively clear, but when researchers make claims about them, things can get complicated and controversial. For example, O’Shaughnessy (1989) holds that our body image consists in a collection of information from our bodily senses, such as proprioception, but this seems to miss the important fact that blind subjects tend to have less accurate representations of the sizes of their own bodies (Kinsbourne and Lempert, 1980), which shows that sight also plays a crucial role in our body image. Gallagher (1986) once stated that “[t]he body image is a conscious image or representation, owned, but abstract and disintegrated, and appears to be something in-itself, differentiated from its environment” (p. 541). This obviously goes beyond what many would want to mean by “body image.” The same goes for body schema. For example, Gallagher (2008) holds that body schema is in effect a sensorimotor function, which is not itself a mental representation. Moreover, both Head and Holmes (1911) and Gallagher (1986) regard body schema as unconscious, but it can be argued that under certain circumstances it can be brought into consciousness, at least in principle. To trace the history of how these terms have been used in the past century is itself an interesting and useful project (Ataria, Tanaka, and Gallagher, 2021), but since this article is not primarily a historical one, we will stick to the key idea that while body image is one’s own mental representation of the spatial structure of one’s own body, body schema is a corresponding representation that explicitly incorporates elements relating to the possibility of action. For potential double dissociations, see Paillard (1999). There is a long history of using the two terms interchangeably, but nowadays it is advised not to do so.
Vignemont (2011/2020) offers a very useful list of potential differences that researchers have invoked to distinguish between body schema and body image, and she also points out that different taxonomies even “sometimes lead to opposite interpretations of the very same bodily disorders” (section 3.2). The situation is thorny and disappointing, and there seems to be no easy way out. To give one example, Bermúdez (1995) critically evaluates O’Shaughnessy’s arguments (1980, 1989, 1995) for the view that “it is a necessary condition of being capable of intentional action that one should have an immediate sensation-based awareness of one’s body” (Bermúdez, 1995, p. 382). Here he follows O’Shaughnessy’s conception of body image, but since the view is about intentional action, considerations about (some notions of) body schema might be relevant; exactly how the discussion would go remains unclear. A related distinction between A-location and B-location is proposed by Bermúdez (2011/2018): while “[t]he A-location of a bodily event is fixed relative to an abstract map of the body,” “the B-location of a bodily event does take into account how the body is disposed” (pp. 177-178). In introducing this distinction, the author does not mention body image or body schema.

What about peripersonal space (PPS)? This notion was introduced only in the early 1980s (Rizzolatti, Scandolara, Matelli, and Gentilucci, 1981). It has also gone through many conceptual refinements and empirical investigations. A recent definition goes like this: it is “the space surrounding the body where we can reach or be reached by external entities, including objects or other individuals” (Rabellino, Frewen, McKinnon, and Lanius, 2020). Note that this kind of definition would not be accepted by those who clearly differentiate peripersonal space from reaching space; for example, “Human-environment interactions normally occur in the physical milieu and thus by medium of the body and within the space immediately adjacent to and surrounding the body, the peripersonal space (PPS)” (Serino, et al., 2018). Peripersonal space has been regarded as an index of multisensory body-environment interactions in real, virtual, and mixed realities. Some recent studies have supported the idea that PPS should be understood as a set of different graded fields that are affected by many factors other than stimulus proximity (Bufacchi and Iannetti, 2018). A basic distinction between appetitive and defensive PPS has been made (Vignemont and Iannetti, 2015), but further experimental and conceptual work is called for to substantiate this and other potential distinctions. A further question is in what ways body image, body schema, and peripersonal space relate to one another (for example, Merleau-Ponty on projection and intentional arc, 1945/2013).

It has been argued that awareness of peripersonal space facilitates a sense of being here (Vignemont, 2021). This is different from the bodily presence discussed in 1.d, as the presence in question now is hereness, that is, self-location, which is different from bodily location (1.b). One implication of this view is that “depersonalized patients fail to process their environment as being peripersonal” (ibid., p. 192). Peripersonal awareness gives a specific sense of presence, which is not given by other forms of awareness such as interoception and proprioception. This also relates bodily awareness to traditional philosophical discussions of indexicals (Perry, 1990, 2001).

Similar complications concerning representation can be found in this area too. For example, in their introductory piece, Vignemont, Serino, Wong, and Farnè have “a special way of representing space” as their subtitle, but whether and in what sense PPS is indeed representational is debatable. Another critical point concerns in what way and to what extent issues surrounding PPS are philosophically significant, given that so much work in this area is empirical or experimental. This is indeed a difficult question, and similar worries can be raised for other interdisciplinary discussions in philosophy and the cognitive sciences. Without going into the theoretical disagreements concerning a priori/armchair/a posteriori, here is a selective list of the relevant issues: What are the relations between egocentric space, allocentric space, and peripersonal space? How does peripersonal space help us understand self-location, body ownership, and bodily self-awareness? How does attention affect our experiences of peripersonal space? Is peripersonal space a set of contact-related action fields? How does peripersonal space contribute to our sense of bodily presence (see various chapters in Vignemont, Serino, Wong, and Farnè, 2021)? No matter what the verdict is, it is hard to deny the relevance of peripersonal space for philosophical issues concerning bodily awareness in general, which will hopefully be clearer in the following sections.

Now, is bodily awareness a unified sense modality? Given how diverse its elements are, the answer is probably going to be negative; though as Vignemont (2018) points out, what these diverse elements “have in common is that they seem to all guarantee that bodily judgments are immune to error through misidentification relative to the first-person” (see 2.b). She then goes on to elaborate on several puzzles about body ownership, exploring varieties of bodily experiences and body representations and proposing a positive solution to those puzzles: the “bodyguard hypothesis,” which has it that “only the protective body map can ground the sense of bodily ownership” (Vignemont, 2018b, p. 167). However, “bodily awareness” can also be construed narrowly: for example, when Martin (1992) and others argue that spatial touch depends on bodily awareness (see 2.a below), they intend a narrower meaning of the term, including proprioception and kinaesthesis only. So, one can still sensibly ask: is proprioception a sense modality on its own? Is the vestibular sense a sense modality on its own? These are all open questions for future research. The next section is about some contemporary issues concerning aspects of bodily awareness. Some might hold that before asking whether bodily awareness is a unified sense modality, we should first decide whether the various experiences described above are perceptual or not. Others might hold that this is not the case, as the senses do not have to be exteroceptive, and therefore perceptual. More positively, one can ask whether proprioception itself is a natural kind, without committing to the claim that it is perceptual.

2. Contemporary Issues

In section 1, it was shown that bodily awareness has many varieties. In considering them, one has also seen that many questions arise along the way, for example, how to individuate the senses within bodily awareness, how to draw the distinction between interoception and exteroception, and so forth. However, many philosophical questions deserve further consideration given the complexities involved; this section discusses some of these questions.

a. Is There a Tactile Field?

This question would not make sense unless it is situated within a wider context. Consider the visual field: in daily life, we know that when we close one eye, our visual field is cut roughly in half. In clinical contexts, we are sometimes told that due to strokes or other conditions, our visual fields shrink and as a result we bump into things more often and will need to readjust. When we say blindsight patients have a “blind field,” we are already presupposing the existence of visual fields. In psychology, we can measure the boundaries of our visual fields, and there are of course individual differences. Different philosophical theories of perception might attribute different metaphysical natures to visual fields. For example, an often-quoted passage states that a visual field is the “spatial array of visual sensations available to observation in introspectionist psychological experiments” (Smythies, 1996, p. 369). This obviously commits to something – visual sensations – that is not acknowledged by many researchers in this area, though it can be regarded as the standard understanding of the sensationalist tradition (for example, Peacocke, 1983). A naïve realist might prefer characterising visual fields as constituted by external objects and phenomena themselves (Martin, 1992). A representationalist would presumably invoke mental representations to characterise visual fields. On this occasion we do not need to deal with the metaphysics of visual fields; suffice it to say that visual fields seem to be indispensable, or at least quite important, for spatial vision. So, our question is: what about other spatial senses? Do they also rely on the relevant sensory fields? Specifically, does spatial touch rely on a tactile field or tactile fields?

Questions concerning tactile fields arise explicitly in the context of P. F. Strawson’s essay on descriptive metaphysics:

Evidently the visual field is necessarily extended at any moment… The case of touch is less obvious: it is not, for example, clear what one would mean by a “tactual field” (Strawson, 1959, p. 65, emphasis added)

Strawson’s challenge here is moderate in the sense that he only invites those who believe in tactile fields to say more about what they mean by the term. More challenging moves can be found in Brian O’Shaughnessy, M. G. F. Martin, and Matthew Soteriou. O’Shaughnessy writes, “There is in touch no analogue of the visual field of visual sensations” (1989, p. 38, emphasis added). This is more challenging because, unlike Strawson, O’Shaughnessy here asserts that there is no analogue. Notice that when he writes this, he has a rather specific view on vision, which involves visual sensations or visual sense-data. Soteriou makes a more specific claim that “The structural feature of normal visual experience that accounts for the existence of its spatial sensory field is lacking in the form of bodily awareness involved when one feels a located bodily sensation” (2013, p. 120, emphasis added).

Countering the above line of thought, Patrick Haggard and colleagues (2008, 2011) have attempted to empirically test the hypothesis that tactile fields exist and that they sustain tactile pattern perception. Following earlier works, Fardo, Beck, Cheng, and Haggard (2018) argue that “integration of continuous sensory inputs across several tactile RFs [receptive fields] provides an intrinsic mechanism for spatial perception” (p. 236). For a more detailed summary of this series of works, see Cheng (2019), where it is also noted that in the case of thermal perception and nociception, there seems to be no such field (Mancini, Stainitz, Steckelmacher, Iannetti, and Haggard, 2015). Further characteristics of tactile fields include, for example, that we can perceive space between multiple stimuli (Mac Cumhaill 2017; also compare Evans 1980 on the simultaneous spatial concept). For touch, the sensory array has a distinctive spatial organisation due to the arrangement of receptive fields on the receptor surface of the skin.

Recently, discussions of tactile fields have gone beyond the above contexts. For example, in comparing shape representations in sight and touch, E. J. Green (2020) discusses various responses to Molyneux’s question, classifies the tactile field proposal under what he calls the “structural correspondence view,” and argues against it (also see Cheng, 2020). In investigating the spatial content of painful sensations, Błażej Skrzypulec (2021) argues that cutaneous pains “do not have field-like content, as they do not present distance relations between painful sensations” (p. 1). In what sense there are tactile fields seems to be a theoretically fruitful question, and further studies need to be done to explore ramifications in this area. Similar work has been done for other sensory modalities, such as olfaction (Aasen, 2019) and audition.

b. Does Bodily Immunity to Error Through Misidentification Hold?

“Immunity to error through misidentification relative to the first-person” (IEM) is a putative phenomenon identified by Sydney Shoemaker (1968), who also attributes it to Wittgenstein (1958) (also see Salje, 2017). “Error through misidentification” is a specific kind of error; let’s illustrate IEM via an example. When I say “I see a canary,” even if I am sincere, I can still be wrong about what I see, or even about whether I really have visual experiences at that time. But it seems that I cannot be wrong about the first-person: it is me who thinks and judges that I see a canary, and there is no doubt about it (beyond reasonable doubt). Shoemaker regards this as a logical truth, though a further complication here is that Shoemaker himself draws a distinction between de facto IEM and logical IEM, which is about the scope of the IEM claim. If we regard IEM as a logical thesis, then we are after the broader, logical thesis.

Now, setting aside whether IEM is true in general, it is about the self-ascription of mental states. Independently, one might reasonably wonder whether IEM is applicable to the self-ascription of bodily states (that is, physical bodily properties, including body size, weight, posture, and so forth). Again, let’s illustrate this with an example. Suppose I form the judgement that I am doing a power pose. If I formed this judgement via my vision, it is possible (though unlikely) for me to be wrong about who is doing this pose, as I might confuse someone else’s arms with my own. By contrast, if I form this judgement by proprioception, I might be wrong about the pose itself, but I cannot be wrong about who is doing the power pose, or so it is sometimes argued (Evans, 1982). But things are not so simple; Vignemont (2018) usefully distinguishes between bodily immunity from the inside and from the outside. In what follows we briefly discuss their contents and potential problems respectively.

Bodily immunity from the inside is in a way more standard: bodily senses seem to guarantee IEM in this sense because they provide privileged informational access to one’s own body. This is not to say that bodily senses do not provide information about other things – touch of course brings in information about other objects – but in retrieving information about those other things, the privileged bodily information is always implicated. As Vignemont states, “Proprioceptive experiences suffice to justify bodily self-ascriptions such that no intermediary process of self-identification is required” (2018, p. 51). Details aside, this thesis faces at least two kinds of putative counterexamples: in false negative errors, “one does not self-ascribe properties that are instantiated by one’s own body” (ibid., p. 51, emphasis added), while in false positive errors, “one self-ascribes properties that are instantiated by another’s body” (ibid., p. 51, emphasis added).

For false negative errors, the clinical condition of somatoparaphrenia is a salient example (Bottini, Bisiach, Sterzi, and Vallar, 2002). In a famous case, patient FB can feel tactile experiences when her hand is touched, but she would judge that it is her niece’s hand that is touched. That is to say, she has trouble with body ownership with respect to her left hand. She does not self-ascribe properties, in this case being touched, that are instantiated by her own left hand. Whether this kind of case really constitutes a counterexample to bodily IEM is a matter of dispute. For example, Vignemont (2018) has argued that somatoparaphrenia is actually irrelevant to bodily IEM, because “The bodily IEM thesis claims that if the judgement derives from the right ground, then it is immune to error” (pp. 52-3, emphasis added). However, one might worry that this move makes bodily IEM too weak. After all, philosophical theses like this tend to lose their significance when they are not universal claims. To this Vignemont might reply that “immunity to error” is quite a strong claim, so even if it needs to be qualified as above, it is still a significant thesis. For comparison, consider the parallel claim that if a perception derives from the right ground, then it is immune to error. This seems to be false because even when perceptions have right or good grounds, they can still be subject to errors.

For false positive errors, an obvious candidate is the rubber hand illusion (RHI). In such a case, participants (to some extent) identify rubber hands as their hands. That is to say, they self-ascribe properties, in this case being their hands, that are not instantiated by their bodies. There are, to be sure, many controversies concerning the interpretation of this illusion, and whether it really constitutes a counterexample here. As Vignemont points out, “it is questionable whether participants actually self-attribute the rubber hand… they feel as if the rubber hand were part of their own body, but they do not believe it” (2018, p. 53, emphasis added). Arguably, those subjects do not make mistakes here; they rightly believe that those rubber hands are not their own hands. Another potential false positive case is the embodied hand illusion, which is sometimes also derived from somatoparaphrenia. Some though not all somatoparaphrenia patients would also self-attribute another person’s hand, either spontaneously or induced via the RHI paradigm (Bolognini, Ronchi, Casati, Fortis, and Vallar, 2014; note that a similar condition can be found in those who do not have somatoparaphrenia). Basically, when the embodied hand moves, if the subjects see it, they might report feeling their own hands moving. These are all tricky examples, and many individual differences are involved. What is crucial, methodologically, is to recognise that these are actual-world clinical or experimental examples, rather than thought experiments. With these actual examples, we need to look into the details of different cases, and be careful in making sweeping claims about them.

What about bodily immunity from the outside? This putative phenomenon is less well known, but the cases for it might be familiar. It is less well known because we tend to think that information from outside (of the body) is fallible and so cannot be immune to error. But consider this passage from J. J. Gibson:

[A]ll the perceptual systems are propriosensitive as well as exterosensitive, for they all provide information in their various ways about the observer’s activities… Information that is specific to the self is picked up as such, no matter what sensory nerve is delivering impulses to the brain. (1979, p. 115, emphasis added)

By “propriosensitive” Gibson means “information about one’s body.” This part of Gibson’s idea – self-specifying information – is less known than affordance, but is actually integral to his view, under the label of “visual kinesthesis”: due “to self-specific invariants in the optic flow of visual information (for example, rapid expansion of the entire optic array), we can see whether we are moving, even though we do not directly see our body moving” (Vignemont, 2018, p. 58). Relatedly, Evans (1982) and Cassam (1995) have argued that self-locating judgements enjoy the same status if they represent the immediate environment within an egocentric frame of reference, because this frame carries self-specifying information concerning the location of the perceiver. As Evans puts it, when I am standing in front of a tree, I cannot sincerely entertain this doubt: “someone is standing in front of a tree, but is it I?” (1982, p. 222). Again, these ideas have clear roots in Wittgenstein. What is introduced above is visual experiences of the environment grounding bodily IEM; Vignemont (2018) also discusses the possibility that visual experiences of the body ground bodily IEM (pp. 58-61). Note that self-specificity is weaker than self-reference, as the former does not imply awareness of one’s body as one’s own (Vignemont, 2018a).

Relatively independent of IEM, philosophers also disagree about how to model body ownership. The questions include: it seems to make an experiential difference whether one is aware of one’s body as one’s own or not, but how should we account for this difference in consciousness or phenomenology? What are the grounds of the sense of body ownership? Is there a distinct feeling of myness (Bermúdez, 2011/2018; Alsmith, 2015; Guillot, 2017; Chadha, 2018)? Different answers have been proposed, including the deflationary account, the cognitive account, the agentive account, and the affective account. The deflationary account has it that the sense of body ownership can be reduced to the spatiality of bodily sensations and judgements of ownership about one’s own body (Martin, 1992, 1995; Bermúdez, 2011). One potential problem is that one seems to be able to become aware of the boundaries of one’s own body without being aware of the boundaries of one’s own body qua one’s own (Dokic, 2003; Serrahima, forthcoming). Another potential problem is that bodily sensations might be dissociable from the sense of body ownership: patients with disownership syndromes remain able to have at least some bodily experiences; whether this decisively refutes the deflationary account remains to be determined (Moro, Massimiliano, and Salvatore, 2004; Bradley, 2021). The cognitive account has it that “one experiences something as one’s own only if one thinks of something as one’s own” (Alsmith, 2015, p. 881). Whether this account is successful depends on how we account for the apparent cognitive impenetrability of the sense of body ownership: if there are cases of body ownership or disownership that cannot be altered by thinking or other propositional attitudes, it will be difficult for this account to explain what is really going on.

The agentive account has it that body ownership has a certain constitutive connection with the body schema (Vignemont, 2007), agentive feelings (Baier and Karnath, 2008), or agentive abilities (Peacocke, 2017). One major potential problem with this is that, for example, participants with the rubber hand illusion might feel that the rubber hand is their own, without feeling that they can act with that very rubber hand. Finally, the affective account has it that there is a specific affective phenomenological quality that is over and above sensory phenomenological qualities of bodily awareness (Vignemont, 2018b). As we mentioned in discussing affective touch, this kind of quality is valenced or valued, and in this specific case the quality signifies the unique value of the body for the self. This kind of affective quality is key to survival. One concern is that it might be unclear whether it is affective phenomenology that explains body ownership, or the other way around. Another concern is that evolutionary explanations always risk being just-so stories. These are all very substantive issues that we do not go into, but the general shape of this rich terrain should be clear enough.

c. How Do Body Ownership and Mental Ownership Relate?

Above we have seen that somatoparaphrenia and other conditions have been regarded as test cases for body ownership. Relatedly, they might cause a parallel problem for mental or experiential ownership (Lane, 2012). Let’s recall the case of patient FB: when she judges that her left hand belongs to her niece, she is confused about body ownership, as the left hand is a body part. By contrast, when she judges that the relevant tactile sensations belong to her niece as well, she is confused about mental ownership, as a tactile sensation is a mental state or episode. This corresponds to Evans’ distinction between mental self-ascription and bodily self-ascription (1982, pp. 220-235), which also brings us back to the original formulation of IEM in Shoemaker.

How do cases of somatoparaphrenia, such as that of patient FB, threaten IEM with regard to mental ownership? Since FB gets the who wrong in mental self-ascription, her case does look like a counterexample to IEM. Consider some original formulations:

(1) To ask “are you sure it’s you who have pains?” would be nonsensical. (Wittgenstein, 1958, p. 67)

(2) [T]here is no room for the thought “Someone is hungry all right, but is it me?” (Shoemaker, 1996, p. 211)

“Nonsensical” in (1) and “no room” in (2) both refer to the “immunity” part of IEM. “Are you sure it’s you who have pains?” in (1) and “Someone is hungry all right, but is it me?” in (2) refer to the “error through misidentification” part. For Wittgenstein, the question in (1) looks like a query in response to the subject’s spontaneous report of his sensational state, saying, “it is me who is in pain.” Here Wittgenstein argues that when a subject sincerely reports that she is in pain, it is nonsensical to question whether the subject is wrong about who is the subject. In the case of FB, she did not spontaneously report that she was experiencing a certain sensation; moreover, she reported that the sensation belongs to someone else. This makes no contact with what Wittgenstein has in mind. However, this is not true in Shoemaker’s case. The question “Someone is hungry all right, but is it me?” allows two kinds of cases. First, the subject is not hungry, but she suspects she is the subject of that experience. Second, the subject is truly hungry, but she suspects she is not the subject of that experience. FB fits the second case, so proponents of IEM will have a hard time reconciling this second case with the case of FB. How about the first case? Since by hypothesis the subject is not hungry in the first place, FB’s case would be irrelevant. So, if we read Shoemaker’s question in the first sense, it would be easier for proponents of IEM to face empirical cases like FB.

How, then, do body ownership and mental ownership relate? There seems to be no straightforward answer. Consider bodily IEM and the original IEM: as discussed above, Vignemont argues that the bodily IEM thesis claims that if the judgement derives from the right ground, then it is immune to error; presumably this strategy, if acceptable, can apply to the original IEM. Indeed, Shoemaker states that “if I have my usual access to my hunger, there is no room for the thought ‘Someone is hungry alright, but is it me?’” (Shoemaker, 1996, p. 210, emphasis added). So, in this sense, bodily IEM and the original IEM can be dealt with in the same way. This does not show, to be sure, that there are no crucial differences between them. What about mental ownership and body ownership in general, independent of IEM, bodily or not? They seem to go together very often: on the one hand, in normal cases one would correctly recognise one’s own limbs as one’s own, and would correctly recognise one’s own sensations as one’s own; on the other hand, FB and some other somatoparaphrenia patients wrongly recognise their own limbs as others’, and also wrongly recognise their own sensations as others’ (this is debatable, to be sure). Is it possible to doubly dissociate them? For correct body ownership and wrong mental ownership, the claim “I feel your pain” might be a possible case, since in this case one gets the body right but the sensation wrong when it comes to the who question: when you sympathise with someone else’s pain so that you feel pain too, it is your pain then. What about wrong body ownership and correct mental ownership? These are all open empirical questions that need to be further explored.

d. Must Bodily Awareness Be Bodily Self-Awareness?

On the face of it, this question might make no good sense: “Of course it must; bodily awareness is a matter of becoming aware of one’s own body through proprioception, kinaesthesis, pain, and so forth; one is aware of one’s own body from the inside, as it were. How can it fail to be bodily self-awareness?” (see O’Shaughnessy, 1980; “highly unusual relation”). Indeed, in the empirical literature, researchers do not normally distinguish between them (for example, Blanke, 2012). In philosophy, sometimes “bodily self-awareness” refers to something more specific, for example, being aware of this body as mine, being aware of this bodily self qua subject (Cassam, 1997; also see Salje, 2019; Longuenesse, 2006, 2021), or perhaps being aware of oneself as a bodily presence in the world (McDowell, 1996; or “existential feelings,” various writings by Ratcliffe, and Vignemont, 2020, p. 83, as discussed in 1.d). This is not to accuse scientists of committing a conceptual confusion; it is just that philosophers are sometimes concerned with questions that have no clear empirical bearing, at least for the time being. Below we briefly review this stricter usage of “bodily self-awareness,” and the philosophical implications surrounding it.

In Self and World (1997), Cassam seeks to identify necessary conditions for self-consciousness. One line he takes is called the “objectivity argument,” which has it that objective experience requires “awareness of oneself, qua subject of experience, as a physical object” (p. 28; “the materialist conception”; also see his 2019). For our current purpose, that is, distinguishing bodily awareness and bodily self-awareness, we only need to get clear about what “qua subject” means. One can be aware of oneself, or one’s own body, not qua subject, but just qua (say) an animal, or even a thing. To illustrate this, consider the case in which you see yourself in a mirror from a rather strange angle (“from the outside”), and without realising that it is you. In that case, there is no self-awareness under this stricter meaning. To apply this to the body case, proprioception might automatically tell the subject the locations of the limbs, but without a proper sense of mineness, to put it vaguely, it does not automatically follow that those bodily awarenesses are also cases of bodily self-awareness. Note that Cassam sometimes defends different though related theses on different occasions; for example, he also defends the “bodily awareness thesis” that “awareness of one’s own body is a necessary condition for the acquisition and possession of concepts of primary qualities such as force and shape” (2002, p. 315).

This point can be further illustrated by the “Missing Self” problem, explained below. Here is how Joel Smith formulates the target:

[I]n bodily awareness, one is not simply aware of one’s body as one’s body, but one is aware of one’s body as oneself. That is, when I attend to the object of bodily awareness I am presented not just with my body, but with my “bodily self.” [the bodily-self thesis] (2006, p. 49)

Smith’s argument against this view is based on two claims about imagination, which he defends in turn. To retain our focus, here we will assume that those two claims are cogent. They are as follows:

(1) “[T]o imagine sensorily a ψ is to imagine experiencing a ψ” (Martin, 2002, p. 404; the “dependency thesis”).

(2) “When we engage in imagining being someone else, we do not imagine anything about ourselves” (Smith, 2006, p. 56).

With these two claims about imagination, Smith launches his argument as follows:

The argument…begins with the observation that I can imagine being Napoleon feeling a bodily sensation such as a pain in the left foot. According to [1], when I imagine a pain I imagine experiencing a pain. It follows from this that the content of perceptual awareness will be “mirrored” by the content of sensory imagination…Now, [given 2], then imagining being Napoleon having a pain in the left foot will not contain me as an object. The only person in the content of this imagining is Napoleon…Thus, when I simply imagine a pain, but without specifying whose pain, the imagined experience is not first personal. (2006, p. 57, emphasis added)

What should we say about this argument? For Smith, the bodily-self thesis requires getting the who right. Therefore, imagining being other people is relevant. But it is unclear whether getting the who right is crucial for Cassam (1997), for example. Suppose that I am engaging in the kind of imagination Smith has in mind. In that scenario, according to his view, I am not part of the imagination. Napoleon is. Smith believes that this is sufficient for rejecting the bodily-self thesis, but this hardly places any pressure on Cassam’s view. All we need here is that in having a certain kind of bodily awareness, this awareness is not only about the body, but also about the mind that is associated with that body. Whether the mind is Tony’s or Napoleon’s is beside the point here. Perhaps I get the subject wrong. Perhaps, as Smith has it, in the imagination the subject is Napoleon, not Tony. Still, all we need is that bodily awareness is not only about the body, but about the minded body. If so, even if Smith’s argument is sound, the Cassam picture is not one of its targets, since Cassam’s view does not require getting right who the subject is.

Another way to see the current point is to consider an analogous point concerning the first-person pronoun: the reference of “I” is token reflexive in Reichenbach’s sense (1947): any token of “I” refers to whoever produces that expression. When I produce a sentence containing “I”, it refers to me. Whether I correctly identify myself as Tony Cheng or misidentify myself as Napoleon is irrelevant. Likewise, in the case of bodily awareness, the subject is aware of him- or herself as the person who is experiencing the bodily experience in question. Whether the subject can correctly identify who he or she is – Napoleon or not – is irrelevant. The reason might be that what is unique about the first person is the token reflexivity. The identity of the subject, though important, is always an additional question. It is interesting to compare Bernard Williams’ thought experiments concerning torture and personal identity (1970; see also Williams 2006): when I am tortured and want to escape from the situation, what is crucial is that I am being tortured and I want to escape. Whether I am Tony Cheng or Napoleon is a further, and less important, question. One outcome of this view is that one then has no absolute authority about what one is imagining. This might not be a theoretical cost, as the general trend in contemporary epistemology has it that all sorts of first-person authority are less robust than philosophers have thought in the past.

The moral is that no matter who I am, who I will be, what I will remember, or what I can imagine, as long as what is going to be tortured is me, then I have every reason to fear. In his later writings, Smith is more sympathetic to the bodily-self thesis. For example, he writes: “if bodily sensations are given as located properties of one’s lived body and, further, bodily sensations are presented as properties of oneself, then bodily awareness is an awareness of one’s body as oneself” (Smith 2016, p. 157). So, must bodily awareness be bodily self-awareness? Philosophers still seem to disagree, and it is to date unclear how this can be resolved directly with the help of empirical work.

e. What Does Body Blindness, Actual or Imagined, Show?

Cases of partial body or proprioceptive blindness have been found in the actual world, whereby the subject has no touch or proprioception below the neck but is still able to see the world roughly in the way we do, and can experience temperatures, pain, and muscle fatigue (Cole, 1991). Such rare subjects need to make use of information from other modalities, mostly vision, to coordinate actions. What this shows is that proprioception and touch play extremely important roles in our daily lives. Although it is still possible to maintain minimal functions, it is extremely laborious to conduct bodily actions without appropriate bodily awareness.

What about the imagined case? Consider this thought experiment: “[E]ven someone who lacks the form of bodily awareness required for tactile perception can still see the surrounding world as a world of physical objects” (Aquila 1979, p. 277). This is a suggestion of disembodiment: most people agree that having bodily awareness is very important for navigating the world, but in this imagined case, call it “total body blindness,” the subject seems to be able to have basic cognition without any bodily awareness. This seems to contradict Cassam’s claim that objective experiences require bodily self-awareness (1997). More specifically, Aquila argues that if I am body blind, “I experience no bodily sensations, or at least none which I am able to identify in connection with some particular body I perceive, and I perceive no body at all which I would identify as my own” (Aquila, 1979, p. 277). If this thought experiment is coherent, it suggests a limit to the importance of bodily awareness: true, bodily awareness is crucial for cognition and action, but it is not a necessary condition.

It is worth considering a real-life example that might pose a similar threat. Depersonalisation Disorder, or DPD, denotes a specific feeling of being unreal. In the newest edition of the DSM (Diagnostic and Statistical Manual of Mental Disorders, fifth edition), it has been renamed as Depersonalisation/Derealisation Disorder, or DDD. In what follows we use “depersonalisation” for this syndrome not only because it is handier, but also because depersonalisation and derealisation are closely connected but different phenomena: while depersonalisation is directed toward oneself or at least one’s own body, derealisation is directed toward the outside world. Here we will only discuss depersonalisation.

The first reported case of depersonalisation was presented by the otolaryngologist M. Krishaber under the label “cerebro-cardial neuropathy” in 1873. The term “depersonalisation” was coined later in 1880 by the Swiss researcher Henri Amiel in response to his own experiences. Like all the other mental disorders or illnesses, the evaluation and diagnosis of depersonalisation have a long and convoluted history, and the exact causes have not been entirely disentangled. What is crucial for our current purposes is the feeling involved in the phenomenon: patients with this disorder might feel themselves to be “a floating mind in space with blunted, blurry thoughts” (Bradshaw 2016). To be sure, there are individual differences amongst patients, just as there are individual differences amongst so-called “healthy subjects.” Still, this description seems to fit many such patients, and more importantly, even if it does not fit many, the very fact that some patients feel like this is sufficient to generate worries here. Here is why: presumably most, if not all, of these patients retain the capacity for object cognition and perception. They still (perceptually) understand that objects are solid, shaped, and sized, and that things can persist even when occluded, for example. But they seem to lack the kind of bodily awareness in question: the very description of a “floating mind in space” signifies this feeling of disembodiment. If this is right, these patients are real-life counterexamples to the Cassamian thesis: they have the capacity for basic perception/cognition, while lacking awareness of themselves as physical objects.

So, what does body blindness, actual or imagined, show? In the actual case, one sees in detail how a lack of robust bodily awareness can make daily life very difficult; in the imagined case, where the subject in question does not even have bodily awareness above the neck, the subject seems to be still able to have basic awareness of the world. There can be worries here, to be sure. For example, if someone has zero bodily awareness, she or he would have no muscle feedback around the eyes, which would impair the visual capacities to a substantial extent. Still, it would presumably not render the subject blind, so again, bodily awareness is very important, but perhaps not strictly speaking necessary for basic cognition or objective experience. For more on embodiment, the self, and self-awareness, see Cassam (2011).

3. Phenomenological Insights: The Body as a Subjective Object and an Objective Subject

From the above discussions, we have seen that the body seems to be both subjective and objective, in some sense. What should we make of this? Or more fundamentally, how is that even possible? Let’s consider the possibility that with bodily awareness one can be aware of one’s body as a subjective object and an objective subject: the bodily self can be aware of itself as an object, but it is not just another object in the world. It is a subjective object (Husserl, 1989), that is, the object that sustains one’s subjective states and episodes. It is also an objective subject, that is, the subject that is situated in an objective world. There seems to be no inherent incompatibility within this distinction between object and subject; they are not mutually exclusive. To channel Joel Smith, “in bodily awareness my body is given as lived – as embodied subjectivity – but it is also co-presented as a thing – as the one thing I constantly see” (2016, p. 159). By contrast, this line is at odds with Sartre’s idea that one’s body is either “a thing among other things, or it is that by which things are revealed to me. But it cannot be both at the same time” (1943/2003, p. 304).

a. Two Notions of the Body

If bodily self-awareness can be about a subjective object and an objective subject, this comes close to Merleau-Ponty’s notion of subject-object (Merleau-Ponty, 1945/2013). However, in both his and Husserl’s works, and in the phenomenological tradition more broadly, the general consensus is that we are never aware of ourselves as physical objects. In order to incorporate their insights without committing to this latter point, we need to look into some of the details of their views. For Husserl, the Body (Leib) is the “animated flesh of an animal or human being,” that is, a bodily self, while a mere body (Körper) is simply “inanimate physical matter” (1913/1998, p. xiv). The Body presents itself as “a bearer of sensations” (ibid., p. 168). A similar distinction emerges in Merleau-Ponty’s work between the phenomenal/lived body and the objective body that is made of muscles, bones, and nerves (1945/2013). There is a debate over whether the distinction should be interpreted as between different entities or different perspectives on the same entity (Baldwin, 1988). As in the case of Kant’s transcendental idealism, the two-world/entity view is in general more difficult to defend, so for our purposes we will assume the less contentious two-perspective view. The idea, then, is that the human body can be viewed in at least two ways: as phenomenal, and as objective. From the first-person point of view, the body presents itself to us only as phenomenal, not objective. For a detailed comparison of Husserl and Merleau-Ponty in this regard, see Carman (1999).

For Merleau-Ponty, “[t]he body is not one more among external objects” (1945/2013, p. 92). One can only be aware of oneself as the phenomenal self in one’s pre-reflective awareness. As Vignemont explains,

[T]he lived body is not an object that can be perceived from various perspectives, left aside or localized in objective space. More fundamentally, the lived body cannot be an object at all because it is what makes our awareness of objects possible…The objectified body could then no longer anchor the way we perceive the world…The lived body is understood in terms of its practical engagement with the world…[Merleau-Ponty] illustrates his view with a series of dissociations between the lived body and the objective body. For instance, the patient Schneider was unable to scratch his leg where he was stung. (2011/2020, pp. 17-18, emphasis added)

Another gloss is that the lived body is “the location of bodily sensation” (Smith 2016, p. 148, original emphasis; Merleau-Ponty, “sensible sentient,” 1968, p. 137). Compare Cassam’s characterisation of the physical or material body as the “bearer of primary qualities” (2002, p. 331). Now, pathological cases like that of Schneider, who exemplified a “dissociation of the act of pointing from reactions of taking or grasping” (Merleau-Ponty, 1945/2013, pp. 103-4), show only that the phenomenal/lived body is not the same as the objective body. They do not show that one cannot be aware of oneself as an objective body. For Merleau-Ponty, one’s body is “a being of two leaves, from one side a thing among things and otherwise what sees and touches them” (Merleau-Ponty, 1968, p. 137). The human body has a “double belongingness to the order of the ‘object’ and to the order of the ‘subject’” (ibid., p. 137). Our notion of the subjective object and objective subject, then, is intended to capture, or at least echo, Merleau-Ponty’s “Subject-Object,” and Husserl’s intriguing idea that the human body is “simultaneously a spatial externality and a subjective internality” (1913/1998).

This phenomenological approach has an analytic ally called the “sensorimotor approach” (for example, see Noë, 2004; for details, see Vignemont, 2011/2020). Its major rival is the “representational approach,” which has it that “in order to account for bodily awareness one needs to appeal to mental representations of the body” (ibid., p. 12). To rehearse, reasons for postulating these representations include: 1) explaining the disturbances of bodily awareness such as phantom limbs; 2) accounting for the spatial organisation of bodily awareness (O’Shaughnessy 1980, 1995); and 3) understanding the ability to move one’s own body. Even if one agrees with the representational approach, it has proven extremely difficult, ever since the classic work by Head and Holmes (1911) (1.e), to decide how many and what kinds of body representations we should postulate. The reason for discussing the representational approach here is that, while crucially different from the sensorimotor approach, it can raise a similar objection in its own terms: what one is aware of is one’s body schema or body image, but not one’s objective body. Within the phenomenological tradition, there is a branch called “neurophenomenology,” which “aimed at bridging the explanatory gap between first-person subjective experience and neurophysiological third-person data, through an embodied and enactive approach to the biology of consciousness” (Khachouf, Poletti, and Pagnoni, 2013). What these neurophenomenologists would say about the current case is not immediately clear.

This formulation of the problem might have some initial plausibility. Consider the case of phantom limb, in which the patient feels pain in a limb that has been amputated. The representational explanation says that the patient represents the pain in his body schema/image, which still retains the amputated limb. This shows, so the thought goes, that one is aware of only one’s own body schema/image. A similar line of thought can be found in Thomas Reid’s work, for instance when he argues that bodily awareness, such as sensations, is the result of purely subjective states or episodes (1863/1983, Ch. 5; see Martin, 1993 for discussion). If this is correct, then that kind of awareness can only be about something subjective, for example, the represented body, as opposed to the objective or physical body.

This inference might be too hasty. Assuming representationalism in this domain, it is sensible to hold the kind of explanation of phantom limb described above. However, the right thing to say might be that one is aware of one’s objective body through one’s body schema/image. They function as modes of presentation of the body. Why is this the right thing to say? One reason is that one can be aware of one’s own body objectively. If the representational approach needs to fit in, the only sensible place is in the modes of presentation.

In sum, the force behind the phenomenological or representational considerations should be fully acknowledged, but the right thing to say seems to be this: what one is aware of is the physical body, but one is not aware of it simply as a mere body or just yet another physical object. Rather, as explained above, one is aware of it as a subjective object, that is, the object that sustains one’s subjective states and episodes. The bodily self is aware of itself as a subjective object, and as an object in the weighty sense, that is, something that can persist without being perceived (Strawson, 1959). Here we can echo Martin’s view of sensation: “Sensations are not… purely subjective events internal to the mind; they are experiences of one’s body, itself a part of the objective world” (1993, p. 209).

b. Non-Perceptual Bodily Awareness

These discussions from the phenomenological perspective also interact with the debate in the analytic tradition over whether awareness of one’s own body is perceptual (Mandrigin and Thompson, 2015). According to Vignemont (2018b), “bodily presence” refers to the idea that “one’s body is perceived as being one object among others” (Vignemont, 2018b, p. 44). This is, to be sure, a matter of controversy given different criteria or conceptions of perception. McGinn (1996) holds that “bodily sensations do not have an intentional object in the way perceptual experiences do” (p. 8), and one potential reason can be found in the above Merleau-Ponty view, namely that “the body is not one more among external objects” (1945/2013, p. 92), and being an external object seems to be a necessary condition for perception. This seems also to echo Wittgenstein’s distinction between the self as a subject and as an object, where the former cannot be an object of perception by oneself. However, this might not preclude the latter use of the self as an object, which can be an object of perception by oneself (Dokic, 2003). Another potential reason comes from the analytic tradition; Shoemaker (1996) has argued that one necessary condition of perception is that its way of gaining information needs to make room for identification and reidentification of the perceived objects, but bodily awareness seems to gain information from only one object, that is, one’s own body. Martin (1995) has argued that this sole-object view does not preclude bodily awareness being perceptual; Schwenkler (2013) instead argues that bodily awareness conveys information about multiple objects, since those pieces of information are about different parts of one’s own body.

This is a huge topic that deserves further investigation; as Vignemont (2011/2020) points out, the perceptual model of bodily awareness has faced challenges from many directions. In addition to the above considerations, some have argued that the distinctive spatiality of bodily awareness precludes it from being perceptual: its spatiality violates basic spatial rules in the so-called “external world” (for example, O’Shaughnessy, 1980); some go further to argue that bodily awareness itself is not intrinsically spatial (Noordhof, 2002). As the once famous local sign theory has it (Lotze, 1888), “each sensible nerve gives rise to its own characteristic sensation that is specific to the body part that is stimulated but spatial ascription does not derive from the spatial content of bodily sensations themselves” (Vignemont, 2011/2020). This goes against the tactile field view introduced in 2.a, which argues that some intrinsic spatiality of touch is held by tactile fields as sustained by skin-space, that is, a flattened receptor surface or sheet (derma-topic, Cheng, 2019). Skin-space is to be contrasted with body-space, understood as torsos, limbs, joints, and their connections (somato-topic), and external-space, including peripersonal space, understood as coordinates in an egocentric representation that will update when the body parts move (spatio-topic). Relatedly, A. D. Smith (2002) has argued that bodily sensations are mere sensations and therefore non-perceptual, on the grounds that they do not meet his criteria of being objective. Other challenges toward the perceptual model of bodily awareness include the so-called “knowledge without observation” (Anscombe, 1962), enactivist perspectives (Noë, 2004), and other action-based theories of perception (for example, Evans, 1982; Briscoe, 2014).

Now we are in a much better position to see why bodily awareness is philosophically important and intriguing. There can be many answers to this, as we have seen throughout, but one major reason is that philosophy seeks to understand the convoluted relations between the subjective and the objective, and the body is one’s medium through which the objective can be reached by the subjective. This can be said to be a bodily quest for objectivity: the body, as the seat of the subjective, is also objective itself, and is one’s way towards the rest of the objective world. It is worth emphasising that the body is itself also part of the so-called “external world”; it is itself a denizen of the objective, that is, the mind-independent. One might think that the body is external to the mind, though this spatial metaphor is not uncontroversial: others might think that the body is internal to the mind in the sense that the body is represented by the mind. Above we have touched on many aspects of the body and bodily awareness, and thereby seen how we can make progress in thinking about difficult philosophical issues in this area.

4. Conclusion

Bodily awareness is an extremely rich area of research that defies comprehensive introductions. Even if we double the word count here, there will still be territories that are not covered. Above we have surveyed varieties of bodily awareness, including touch, proprioception, kinaesthesis, the vestibular sense, thermal sensation, pain, interoception, a relatively new category called “sngception,” and bodily feelings. We have also discussed some contemporary issues that involve tactile fields, bodily IEM and IEM in general, mental ownership, bodily self-awareness, and body blindness. Finally, going beyond the Anglo-Saxon tradition, we have also selectively discussed insights from the phenomenological tradition, notably on the possibility of being aware of one’s bodily self as a subjective object and an objective subject, and whether bodily awareness is perceptual. Together they cover a huge ground under the general heading of “bodily awareness.” It would be an exaggeration to say that bodily awareness has become a heated area in the early twenty-first century, but it should be safe and accurate to state that it has been undergoing a resurgence or revival in the first quarter of the twenty-first century, as this article shows. This impressive lineup should guarantee the continuing importance of topics in this area, and there is much to follow up on in this rich area of research.

5. References and Further Reading

  • Aasen, S. (2019). Spatial aspects of olfactory experience. Canadian Journal of Philosophy, 49(8), 1041-1061.
  • Alsmith, A. J. T. (2015). Mental activity and the sense of ownership. Review of Philosophy and Psychology, 6(4), 881-896.
  • Alsmith, A. J. T. (forthcoming). Bodily self-consciousness. London: Routledge.
  • Anscombe, G. E. M. (1962). On sensation of position. Analysis, 22(3), 55-58.
  • Aquila, R. (1979). Personal identity and Kant’s “refutation of idealism.” Kant Studien, 70, 257-278.
  • Aristotle (1987). De Anima (On the soul). London: Penguin Classics.
  • Armstrong, D. M. (1962). Bodily sensations. London: Routledge.
  • Ataria, Y., Tanaka, S., & Gallagher, S. (2021). (Eds.) Body schema and body image: New directions. Oxford: Oxford University Press.
  • Aydede, M. (2013). Pain. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Baier, B., & Karnath, H-O. (2008). Tight link between our sense of limb ownership and self-awareness of actions. Stroke, 39(2), 486-488.
  • Bain, D. (2007). The location of pains. Philosophical Papers, 36(2), 171-205.
  • Baldwin, T. (1988). Phenomenology, solipsism, and egocentric thought. Aristotelian Society Supplementary Volume, 62(1), 27-60.
  • Bermúdez, J. L. (1995). Transcendental arguments and psychology: The example of O’Shaughnessy on intentional action. Metaphilosophy, 26(4), 379-401.
  • Bermúdez, J. L. (2011/2018). Bodily awareness and self-consciousness. In The bodily self: Selected essays. Cambridge, MA: MIT Press.
  • Berthier, M., Starkstein, S., & Leiguarda, R. (1988). Asymbolia for pain: A sensory-limbic disconnection syndrome. Annals of Neurology, 24(1), 41-49.
  • Berthoz, A. (1991). Reference frames for the perception and control of movement. In J. Paillard (Ed.), Brain and space. Oxford: Oxford University Press.
  • Blanke, O. (2012). Multisensory brain mechanisms of bodily self-consciousness. Nature Reviews Neuroscience, 13, 556-571.
  • Bolognini, N., Ronchi, R., Casati, C., Fortis, P., & Vallar, G. (2014). Multisensory remission of somatoparaphrenic delusion: My hand is back! Neurology: Clinical Practice, 4(3), 216-225.
  • Borg, E., Harrison, R., Stazicker, J., & Salomons, T. (2020). Is the folk concept of pain polyeidic? Mind and Language, 35, 29-47.
  • Bottini, G., Bisiach, E., Sterzi, R., & Vallar, G. (2002). Feeling touches in someone else’s hand. Neuroreport, 13(2), 249-252.
  • Bradley, A. (2021). The feeling of bodily ownership. Philosophy and Phenomenological Research, 102(2), 359-379.
  • Bradshaw, M. (2016). A return to self: Depersonalization and how to overcome it. Seattle, WA: Amazon Services International.
  • Briscoe, R. (2014). Spatial content and motoric significance. AVANT: The Journal of the Philosophical-Interdisciplinary Vanguard, 5(2), 199-217.
  • Bufacchi, R. J., & Iannetti, G. D. (2018). An action field theory of peripersonal space. Trends in Cognitive Sciences, 22(12), 1076-1090.
  • Cardinali, L., Brozzoli, C., Luauté, J., Roy, A. C., & Farnè, A. (2016). Proprioception is necessary for body schema plasticity: Evidence from a deafferented patient. Frontiers in Human Neuroscience, 10, 272.
  • Carman, T. (1999). The body in Husserl and Merleau-Ponty. Philosophical Topics, 27(2), 205-226.
  • Cassam, Q. (1995). Introspection and bodily self-ascription. In J. L. Bermúdez, A. J. Marcel, and N. M. Eilan (Eds.), The body and the self. Cambridge, MA: MIT Press.
  • Cassam, Q. (1997). Self and world. Oxford: Oxford University Press.
  • Cassam, Q. (2002). Representing bodies. Ratio, 15(4), 315-334.
  • Cassam, Q. (2011). The embodied self. In S. Gallagher (Ed.), The Oxford handbook of the self. Oxford: Oxford University Press.
  • Cassam, Q. (2019). Consciousness of oneself as subject. Philosophy and Phenomenological Research, 98(3), 736-741.
  • Cataldo, A., Ferrè, E. R., di Pellegrino, G., & Haggard, P. (2016). Thermal referral: Evidence for a thermoceptive uniformity illusion without touch. Scientific Reports, 6, 35286.
  • Chadha, M. (2018). No-self and the phenomenology of ownership. Australasian Journal of Philosophy, 96(1), 114-127.
  • Chen, W. Y., Huang, H. C., Lee, Y. T., & Liang, C. (2018). Body ownership and four-hand illusion. Scientific Reports, 8, 2153.
  • Cheng, T. (2019). On the very idea of a tactile field. In Cheng, T., Deroy, O., and Spence, C. (Eds.), Spatial senses: Philosophy of perception in an age of science. London: Routledge.
  • Cheng, T. (2020). Molyneux’s question and somatosensory spaces. In Ferretti, G., and Glenney, B. (Eds.), Molyneux’s question and the history of philosophy. London: Routledge.
  • Cheng, T., & Cataldo, A. (2022). Touch and other somatosensory senses. In Brigard, F. D. and Sinnott-Armstrong, W. (Eds.), Neuroscience and philosophy. Cambridge, MA: MIT Press.
  • Cole, J. (1991). Pride and a daily marathon. Cambridge, MA: MIT Press.
  • Cole, J., & Montero, B. (2007). Affective proprioception. Janus Head, 9(2), 299-317.
  • Corns, J. (2020). The complex reality of pain. New York: Routledge.
  • Craig, A. D. (2003). Interoception: The sense of the physiological condition of the body. Current Opinion in Neurobiology, 13(4), 500-505.
  • Damasio, A. (1999). The feeling of what happens: Body and emotion in the making of consciousness. London: William Heinemann.
  • Day, B. L., & Fitzpatrick, R. C. (2005). Virtual head rotation reveals a process of route reconstruction from human vestibular signals. Journal of Physiology, 567(Pt 2), 591-597.
  • Dokic, J. (2003). The sense of ownership: An analogy between sensation and action. In J. Roessler and N. Eilan (Eds.), Agency and self-awareness: Issues in philosophy and psychology. Oxford: Oxford University Press.
  • Ehrsson, H. H., Spence, C., & Passingham, R. E. (2004). That’s my hand! Activity in premotor cortex reflects feeling of ownership of a limb. Science, 305(5685), 875-877.
  • Evans, G. (1980). Things without the mind. In Z. V. Straaten (Ed.), Philosophical subjects. Oxford: Oxford University Press.
  • Evans, G. (1982). The varieties of reference. Oxford: Oxford University Press.
  • Fardo, F., Beck, B., Cheng, T., & Haggard, P. (2018). A mechanism for spatial perception on human skin. Cognition, 178, 236-243.
  • Fardo, F., Finnerup, N. B., & Haggard, P. (2018). Organization of the thermal grill illusion by spinal segments. Annals of Neurology, 84(3), 463-472.
  • Ferretti, G., & Glenney, B. (2020). (Eds.) Molyneux’s question and the history of philosophy. London: Routledge.
  • Fridland, E. (2011). The case for proprioception. Phenomenology and Cognitive Sciences, 10(4), 521-540.
  • Fulkerson, M. (2013). The first sense: A philosophical study of human touch. Cambridge, MA: MIT Press.
  • Fulkerson, M. (2015/2020). Touch. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Gallagher, S. (1986). Body image and body schema: A conceptual clarification. Journal of Mind and Behavior, 7(4), 541-554.
  • Gallagher, S. (2008). Are minimal representations still representations? International Journal of Philosophical Studies, 16(3), 351-369.
  • Geldard, F. A., & Sherrick, C. E. (1972). The cutaneous “rabbit”: A perceptual illusion. Science, 178(4057), 178-179.
  • Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
  • Gray, R. (2013). What do our experiences of heat and cold represent? Philosophical Studies, 166(S1), 131-151.
  • Green, E. J. (2020). Representing shape in sight and touch. Mind and Language, online first.
  • Guillot, M. (2017). I, me, mine: On a confusion concerning the subjective character of experience. Review of Philosophy and Psychology, 8(1), 23-53.
  • Gurwitsch, A. (1964). The field of consciousness. Pittsburgh, PA: Duquesne University Press.
  • Haggard, P., & Giovagnoli, G. (2011). Spatial patterns in tactile perception: Is there a tactile field? Acta Psychologica, 137(1), 65-75.
  • Hamilton, A. (2005). Proprioception as basic knowledge of the body. In R. van Woudenberg, S. Roeser, and R. Rood (Eds.), Basic belief and basic knowledge. Ontos-Verlag.
  • Head, H., & Holmes, G. (1911). Sensory disturbances from cerebral lesions. Brain, 34(2-3), 102-254.
  • Henry, M. (1965/1975). Philosophy and phenomenology of the body. (G. Etzkorn, trans.). The Hague: Nijhoff.
  • Hill, C. (2005). Ow! The paradox of pain. In M. Aydede (Ed.), Pain: New essays on the nature of pain and the methodology of its study. Cambridge, MA: MIT Press.
  • Hill, C. (2017). Fault lines in familiar concepts of pain. In J. Corns (Ed.), The Routledge handbook of philosophy of pain. New York: Routledge.
  • Howe, K. A. (2018). Proprioceptive awareness and practical unity. Theorema: International Journal of Philosophy, 37(3), 65-81.
  • Huang, H. C., Lee, Y. T., Chen, W. Y., & Liang, C. (2017). The sense of 1PP-location contributes to shaping the perceived self-location together with the sense of body-location. Frontiers in Psychology, 8, 370.
  • Husserl, E. (1913/1998). Ideas pertaining to a pure phenomenology and to a phenomenological philosophy – first book: general introduction to a pure phenomenology. (F. Kersten, trans.). Dordrecht: Kluwer Academic Publishers.
  • Husserl, E. (1989). Ideas pertaining to a pure phenomenology and to a phenomenological philosophy – Second book: studies in the phenomenology of constitution. (R. Rojcewicz and A. Schuwer, trans.). Dordrecht: Kluwer Academic Publishers.
  • Katz, D. (1925/1989). The world of touch. Krueger, L. E. (trans.) Hillsdale, NJ: Erlbaum.
  • Khachouf, O. T., Poletti, S., & Pagnoni, G. (2013). The embodied transcendental: A Kantian perspective on neurophenomenology. Frontiers in Human Neuroscience, 7, 611.
  • Kinsbourne, M., & Lempert, H. (1980). Human figure representation by blind children. The Journal of General Psychology, 102(1), 33-37.
  • Korsmeyer, C. (2020). Things: In touch with the past. New York: Oxford University Press.
  • Kreuch, G. (2019). Self-feeling: Can self-consciousness be understood as a feeling? Springer.
  • Kripke, S. (1980). Naming and necessity. Cambridge, MA: Harvard University Press.
  • Lane, T. (2012). Toward an explanatory framework for mental ownership. Phenomenology and Cognitive Sciences, 11(2), 251-286.
  • Legrand, D. (2007a). Pre-reflective self-consciousness: On being bodily in the world. Janus Head, 9(2), 493-519.
  • Legrand, D. (2007b). Subjectivity and the body: Introducing basic forms of self-consciousness. Consciousness and Cognition, 16(3), 577-582.
  • Lenggenhager, B., Tadi, T., Metzinger, T., & Blanke, O. (2007). Video ergo sum: Manipulating bodily self-consciousness. Science, 317(5841), 1096-1099.
  • Lin, J. H., Hung, C. H., Han, D. S., Chen, S. T., Lee, C. H., Sun, W. Z., & Chen, C. C. (2018). Sensing acidosis: Nociception or sngception? Journal of Biomedical Science, 25, 85.
  • Liu, M. (2021). The polysemy view of pain. Mind and Language, Online first.
  • Liu, M., & Klein, C. (2020). Analysis, 80(2), 262-272.
  • Locke, J. (1693/1979). Letter to William Molyneux, 28 March. In de Beer, E. S. (Ed.), The correspondence of John Locke (vol. 9). Oxford: Clarendon Press.
  • Longuenesse, B. (2006). Self-consciousness and consciousness of one’s own body: Variations on a Kantian theme. Philosophical Topics, 34(1/2), 283-309.
  • Longuenesse, B. (2021). Revisiting Quassim Cassam’s Self and world. Analytic Philosophy, 62(1), 70-83.
  • Lotze, H. (1888). Logic, in three books: Of thought, of investigation, and of knowledge. Oxford: Clarendon Press.
  • Lycan, W. G. (1987). Consciousness. Cambridge, MA: MIT Press.
  • Mac Cumhaill, C. (2017). The tactile ground, immersion, and the “space between.” Southern Journal of Philosophy, 55(1), 5-31.
  • Macpherson, F. (2011). (Ed.) The senses: Classic and contemporary philosophical perspectives. Oxford: Oxford University Press.
  • Mancini, F., Stainitz, H., Steckelmacher, J., Iannetti, G. D., & Haggard, P. (2015). Poor judgment of distance between nociceptive stimuli. Cognition, 143, 41-47.
  • Mandrigin, A., & Thompson, E. (2015). Own-body perception. In M. Matthen (Ed.), Oxford handbook of the philosophy of perception. Oxford: Oxford University Press.
  • Martin, M. G. F. (1992). Sight and touch. In Crane, T. (Ed.). The contents of experience: Essays on perception. New York: Cambridge University Press.
  • Martin, M. G. F. (1993). Sensory modalities and spatial properties. In N. Eilan, R. McCarty, and B. Brewer (Eds.), Spatial representation: Problems in philosophy and psychology. Oxford: Basil Blackwell.
  • Martin, M. G. F. (1995). Bodily awareness: A sense of ownership. In J. L. Bermúdez, A. Marcel, and N. Eilan (Eds.), The body and the self. Cambridge, MA: MIT Press.
  • Martin, M. G. F. (2002). The transparency of experience. Mind and Language, 17(4), 376-425.
  • McDowell, J. (1996). Mind and world. Cambridge, MA: Harvard University Press.
  • McGinn, C. (1996). The character of mind: An introduction to the philosophy of mind. Oxford: Oxford University Press.
  • McGlone, F., Wessberg, J., & Olausson, H. (2014). Discriminative and affective touch: Sensing and feeling. Neuron, 82(4), 737-755.
  • Merleau-Ponty, M. (1945/2013). Phenomenology of perception. (D. A. Landes, trans.) London: Routledge.
  • Merleau-Ponty, M. (1968). The visible and the invisible. (A. Lingis, trans.). Evanston: Northwestern University Press.
  • Montero, B. (2006). Proprioceiving someone else’s movement. Philosophical Explorations: An International Journal for the Philosophy of Mind and Action, 9(2), 149-161.
  • Morash, V., Pensky, A. E. C., Alfaro, A. U., & McKerracher, A. (2012). A review of haptic spatial abilities in the blind. Spatial Cognition and Computation, 12(2-3), 83-95.
  • Moro, V., Massimiliano, Z., & Salvatore, M. A. (2004). Changes in spatial position of hands modify tactile extinction but not disownership of contralesional hand in two right brain-damaged patients. Neurocase, 10(6), 437-443.
  • Nanay, B. (2016). Aesthetics as philosophy of perception. Oxford: Oxford University Press.
  • National Research Council (US) Committee on Recognition and Alleviation of Pain in Laboratory Animals. (2009). Recognition and alleviation of pain in laboratory animals. Washington, DC: National Academies Press.
  • Noë, A. (2004). Action in perception. Cambridge, MA: MIT Press.
  • Noordhof, P. (2002). In pain. Analysis, 61(2), 95-97.
  • O’Dea, J. (2011). A proprioceptive account of the sense modalities. In Macpherson, F. (Ed.), The senses: Classic and contemporary philosophical perspectives. Oxford: Oxford University Press.
  • O’Shaughnessy, B. (1980). The will, vol. 1. Cambridge: Cambridge University Press.
  • O’Shaughnessy, B. (1989). The sense of touch. Australasian Journal of Philosophy, 67(1), 37-58.
  • O’Shaughnessy, B. (1995). Proprioception and the body image. In Bermúdez, J. L., Marcel, A., & Eilan, N. (Eds.), The body and the self. Cambridge, MA: MIT Press.
  • O’Shaughnessy, B. (2000). Consciousness and the world. Oxford: Oxford University Press.
  • Paillard, J. (1999). Body schema and body image: A double dissociation in deafferented patients. In Gantchev, G. N., Mori, S., and Massion, J. (Eds.), Motor control today and tomorrow. Sofia: Professor Marius Drinov Academic Publishing House.
  • Peacocke, C. (1983). Sense and content: Experience, thought, and their relations. Oxford: Oxford University Press.
  • Peacocke, C. (2017). Philosophical reflections on the first person, the body, and agency. The subject’s matter: Self-consciousness and the body. Cambridge, MA: MIT Press.
  • Penfield, W., & Rasmussen, T. (1950). The cerebral cortex of man: A clinical study of localization of function. New York: Macmillan.
  • Perry, J. (1990). Self-location. Logos, 11, 17-31.
  • Perry, J. (2001). Reference and reflexivity. Stanford: CSLI Publications.
  • Plumbley, M. D. (2013). Hearing the shape of a room. Proceedings of the National Academy of Sciences of the United States of America, 201309932.
  • Rabellino, D., Frewen, P. A., McKinnon, M. C., & Lanius, R. A. (2020). Peripersonal space and bodily self-consciousness: Implications for psychological trauma-related disorders. Frontiers in Neuroscience, 14, 586605.
  • Ratcliffe, M. (2005). The feeling of being. Journal of Consciousness Studies, 12(8-10), 43-60.
  • Ratcliffe, M. (2008). Feelings of being: Phenomenology, psychiatry and the sense of reality. Oxford: Oxford University Press.
  • Ratcliffe, M. (2012). What is touch? Australasian Journal of Philosophy, 90(3), 413-432.
  • Ratcliffe, M. (2016). Existential feeling and narrative. In Muller, O. and Breyer, T. (Eds.), Funktionen des Lebendigen. Berlin: De Gruyter.
  • Reichenbach, H. (1947). Elements of symbolic logic. New York: Free Press.
  • Reid, T. (1863/1983). Inquiry and essays. Indiana: Hackett Publishing Company.
  • Reuter, K. (2017). The developmental challenge of the paradox of pain. Erkenntnis, 82, 265-283.
  • Reuter, K., Phillips, D., & Sytsma, J. (2014). In J. Sytsma (Ed.), Advances in experimental philosophy of mind. London: Bloomsbury Academic.
  • Richardson, L. (2010). Seeing empty space. European Journal of Philosophy, 18(2), 227-243.
  • Rizzolatti, G., Scandolara, C., Matelli, M., & Gentilucci, M. (1981). Afferent properties of periarcuate neurons in macaque monkeys. I. Somatosensory responses. Behavioural Brain Research, 2, 125-146.
  • Rosenthal, D. M. (2010). Consciousness, the self and bodily location. Analysis, 70(2), 270-276.
  • Salje, L. (2017). Crossed wires about crossed wires: Somatosensation and immunity to error through misidentification. Dialectica, 71(1), 35-56.
  • Salje, L. (2019). The inside-out binding problem. In Cheng, T., Deroy, O., and Spence, C. (Eds.), Spatial senses: Philosophy of perception in an age of science. London: Routledge.
  • Sartre, J-P. (1943/2003). Being and nothingness: An essay on phenomenological ontology. (H. E. Barnes, trans.). Oxford: Routledge.
  • Searle, J. (1992). The rediscovery of the mind. Cambridge, MA: MIT Press.
  • Seth, A. (2013). Interoceptive inference, emotion, and the embodied self. Trends in Cognitive Sciences, 17(11), 565-573.
  • Schrenk, M. (2014). Is proprioceptive art possible? In Priest, G. and Young, D. (Eds.), Philosophy and the Martial Arts. New York: Routledge.
  • Schwenkler, J. (2013). The objects of bodily awareness. Philosophical Studies, 162(2), 465-472.
  • Serino, A., Giovagnoli, G., Vignemont, de. V., & Haggard, P. (2008). Acta Psychologica,
  • Serino, A., Noel, J-P., Mange, R., Canzoneri, E., Pellencin, E., Ruiz, J. B., Bernasconi, F., Blanke, O., & Herbelin, B. (2018). Peripersonal space: An index of multisensory body-environment interactions in real, virtual, and mixed realities. Frontiers in ICT, 4, 31.
  • Serrahima, C. (forthcoming). The bounded body: On the sense of bodily ownership and the experience of space. In Garcia-Carpintero, M. and Guillot, M. (Eds.), The sense of mineness. Oxford: Oxford University Press.
  • Shoemaker, S. (1968). Self-reference and self-awareness. The Journal of Philosophy, 65(19), 555-567.
  • Shoemaker, S. (1996). The first-person perspective and other essays. Cambridge: Cambridge University Press.
  • Sherrington, C. S. (1906). (Ed.) The integrative action of the nervous system. Cambridge: Cambridge University Press.
  • Siegel, S. (2010). The contents of visual experience. Oxford: Oxford University Press.
  • Skrzypulec, B. (2021). Spatial content of painful sensations. Mind and Language, online first.
  • Smith, A. D. (2002). The problem of perception. Cambridge, MA: Harvard University Press.
  • Smith, J. (2006). Bodily awareness, imagination and the self. European Journal of Philosophy, 14(1), 49-68.
  • Smith, J. (2016). Experiencing phenomenology: An introduction. New York: Routledge.
  • Smythies, J. (1996). A note on the concept of the visual field in neurology, psychology, and visual neuroscience. Perception, 25(3), 369-371.
  • Soteriou, M. (2013). The mind’s construction: The ontology of mind and mental action. Oxford: Oxford University Press.
  • Steward, H. (1997). The ontology of mind: Events, processes, and states. Oxford: Clarendon Press.
  • Strawson, P. F. (1959). Individuals: An essay in descriptive metaphysics. London: Routledge.
  • Travis, C. (2004). The silence of the senses. Mind, 113(449), 57-94.
  • Tsakiris, M. (2017). The material me: Unifying the exteroceptive and interoceptive sides of the bodily self. In F. D. Vignemont and A. J. T. Alsmith (Eds.), The subject’s matter: Self-consciousness and the body. Cambridge, MA: MIT Press.
  • Tsakiris, M., & de Preester, H. (2019). The interoceptive mind: From homeostasis to awareness. Oxford: Oxford University Press.
  • Tuthill, J. C., & Azim, E. (2018). Proprioception. Current Biology, 28(5), R194-R203.
  • Vignemont, F. (2007). Habeas Corpus: The sense of ownership of one’s own body. Mind and Language, 22(4), 427-449.
  • Vignemont, F. (2011/2020). Bodily awareness. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Vignemont, F. (2018a). Was Descartes right after all? An affective background for bodily awareness. In M. Tsakiris and H. de Preester (Eds.), The interoceptive mind: From homeostasis to awareness. Oxford: Oxford University Press.
  • Vignemont, F. (2018b). Mind the body: An exploration of bodily self-awareness. Oxford: Oxford University Press.
  • Vignemont, F. (2020). Bodily feelings: Presence, agency, and ownership. In U. Kriegel (Ed.), The Oxford handbook of the philosophy of consciousness. Oxford: Oxford University Press.
  • Vignemont, F. (2021). Feeling the world as being here. In F. de Vignemont, A. Serino, H. Y. Wong, and A. Farnè (Eds.), The world at our fingertips: A multidisciplinary exploration of peripersonal space. Oxford: Oxford University Press.
  • Vignemont, F. (forthcoming). Bodily awareness. Cambridge: Cambridge University Press.
  • Vignemont, F., & Alsmith, A. (2017). (Eds.) The subject’s matter: Self-consciousness and the body. Cambridge, MA: MIT Press.
  • Vignemont, F., & Iannetti, G. D. (2015). How many peripersonal spaces? Neuropsychologia, 70, 327-334.
  • Vignemont, F., & Massin, O. (2015). Touch. In Matthen, M. (Ed.) The Oxford Handbook of Philosophy of Perception. Oxford: Oxford University Press.
  • Vignemont, F., Serino, A., Wong, H. Y., & Farnè, A. (2021). (Eds.) The world at our fingertips: A multidisciplinary exploration of peripersonal space. Oxford: Oxford University Press.
  • Williams, B. (1970). The self and the future. The Philosophical Review, 79(2), 161-187.
  • Williams, B. (2006). Ethics and the limits of philosophy. London: Routledge.
  • Wilson, K. (2021). Individuating the senses of “smell”: Orthonasal versus retronasal olfaction. Synthese, 199, 4217-4242.
  • Wittgenstein, L. (1958). The blue and brown books. Oxford: Blackwell.
  • Wong, H. Y. (2015). On the significance of bodily awareness for bodily action. The Philosophical Quarterly, 65(261), 790-812.
  • Wong, H. Y. (2017a). On proprioception in action: Multimodality versus deafferentation. Mind and Language, 32(3), 259-282.
  • Wong, H. Y. (2017b). In and out of balance. In de Vignemont, F. and Alsmith, A. (Eds.), The subject’s matter: Self-consciousness and the body. Cambridge, MA: MIT Press.
  • Zahavi, D. (2021). Embodied subjectivity and objectifying self-consciousness: Cassam and phenomenology. Analytic Philosophy, 62, 97-105.

 

Author Information

Tony Cheng
Email: h.cheng.12@alumni.ucl.ac.uk
National Chengchi University
Taiwan

George Orwell (1903—1950)

Eric Arthur Blair, better known by his pen name George Orwell, was a British essayist, journalist, and novelist. Orwell is most famous for his dystopian works of fiction, Animal Farm and Nineteen Eighty-Four, but many of his essays and other books have remained popular as well. His body of work provides one of the twentieth century’s most trenchant and widely recognized critiques of totalitarianism.

Orwell did not receive academic training in philosophy, but his writing repeatedly focuses on philosophical topics and questions in political philosophy, epistemology, philosophy of language, ethics, and aesthetics. Some of Orwell’s most notable philosophical contributions include his discussions of nationalism, totalitarianism, socialism, propaganda, language, class status, work, poverty, imperialism, truth, history, and literature.

Orwell’s writings map onto his intellectual journey. His earlier writings focus on poverty, work, and money, among other themes. Orwell examines poverty and work not only from an economic perspective, but also socially, politically, and existentially, and he rejects moralistic and individualistic accounts of poverty in favor of systemic explanations. In so doing, he provides the groundwork for his later championing of socialism.

Orwell’s experiences in the 1930s, including reporting on the living conditions of the poor and working class in Northern England as well as fighting as a volunteer soldier in the Spanish Civil War, further crystallized his political and philosophical outlook. This led him to write in 1946 that, “Every line of serious work I have written since 1936 has been, directly or indirectly, against totalitarianism and for democratic Socialism” (“Why I Write”).

For Orwell, totalitarianism is a political order focused on power and control. Much of Orwell’s effectiveness in writing against totalitarianism stems from his recognition of the epistemic and linguistic dimensions of totalitarianism. This is exemplified by Winston Smith’s claim as the protagonist in Nineteen Eighty-Four: “Freedom is the freedom to say that two plus two makes four. If that is granted, all else follows.” Here Orwell uses, as he often does, a particular claim to convey a broader message. Freedom (a political state) rests on the ability to retain the true belief that two plus two makes four (an epistemic state) and the ability to communicate that truth to others (via a linguistic act).

Orwell also argues that political power is dependent upon thought and language. This is why the totalitarian, who seeks complete power, requires control over thought and language. In this way, Orwell’s writing can be viewed as philosophically ahead of its time for the way it brings together political philosophy, epistemology, and philosophy of language.

Table of Contents

  1. Biography
  2. Political Philosophy
    1. Poverty, Money, and Work
    2. Imperialism and Oppression
    3. Socialism
    4. Totalitarianism
    5. Nationalism
  3. Epistemology and Philosophy of Mind
    1. Truth, Belief, Evidence, and Reliability
    2. Ignorance and Experience
    3. Embodied Cognition
    4. Memory and History
  4. Philosophy of Language
    1. Language and Thought
    2. Propaganda
  5. Philosophy of Art and Literature
    1. Value of Art and Literature
    2. Literature and Politics
  6. Orwell’s Relationship to Academic Philosophy
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Eric Arthur Blair was born on June 25, 1903, in India. His English father worked as a member of the British specialized services in colonial India, where he oversaw local opium production for export to China. When Blair was less than a year old, his mother, of English and French descent, returned to England with him and his older sister. He saw relatively little of his father until he was eight years old.

Blair described his family as part of England’s “lower-upper-middle class.” Blair had a high degree of class consciousness, which became a common theme in his work and a central concern in his autobiographical essay, “Such, Such Were the Joys” (facetiously titled) about his time at the English preparatory school St. Cyprian’s, which he attended from ages eight to thirteen on a merit-based scholarship. After graduating from St. Cyprian’s, from ages thirteen to eighteen Orwell attended the prestigious English public school, Eton, also on a merit-based scholarship.

After graduating from Eton, where he had not been a particularly successful student, Blair decided to follow in his father’s footsteps and join the specialized services of the British Empire rather than pursue higher education. Blair was stationed in Burma (now Myanmar) where his mother had been raised. He spent five unhappy years with the Imperial Police in Burma (1922-1927) before leaving the position to return to England in hopes of becoming a writer.

Partly out of need and partly out of desire, Blair spent several years living in or near poverty both in Paris and London. His experiences formed the basis for his first book, Down and Out in Paris and London, which was published in 1933. Blair published the book under the pen name George Orwell, which became the moniker he would use for his published writings for the rest of his life.

Orwell’s writing was often inspired by personal experience. He used his experiences working for imperial Britain in Burma as the foundation for his second book, Burmese Days, first published in 1934, and his frequently anthologized essays, “A Hanging” and “Shooting an Elephant,” first published in 1931 and 1936 respectively.

He drew on his experiences as a hop picker and schoolteacher in his third book, the novel A Clergyman’s Daughter, first published in 1935. His next novel, Keep the Aspidistra Flying, published in 1936, featured a leading character who had given up a middle-class job for the subsistence pay of a bookseller and the chance to try to make it as a writer. At the end of the novel, the protagonist gets married and returns to his old middle-class job. Orwell wrote this book while he himself was working as a bookseller who would soon be married.

The years 1936-1937 included several major events for Orwell, which would influence his writing for the rest of his life. Orwell’s publisher, the socialist Victor Gollancz, suggested that Orwell spend time in the industrial north of England in order to gather experience about the conditions there for journalistic writing. Orwell did so during the winter of 1936. Those experiences formed the foundation for his 1937 book, The Road to Wigan Pier. The first half of Wigan Pier reported on the poor working conditions and poverty that Orwell witnessed. The second half focused on the need for socialism and the reasons why Orwell thought the British left intelligentsia had failed to convince the poor and working class of the need for socialism. Gollancz published Wigan Pier as part of his Left Book Club, which gave the book a larger platform and better sales than any of Orwell’s previous books.

In June 1936, Orwell married Eileen O’Shaughnessy, an Oxford graduate with a degree in English who had worked various jobs including those of teacher and secretary. Shortly thereafter, Orwell became a volunteer soldier fighting on behalf of the left-leaning Spanish Republicans against Francisco Franco and the Nationalist right in the Spanish Civil War. His wife joined him in Spain later. Orwell’s experiences in Spain further entrenched his shift towards overtly political writing. He experienced first-hand the infighting between various factions opposed to Franco on the political left. He also witnessed the control that the Soviet Communists sought to exercise over both the war and, perhaps more importantly, the narratives told about the war.

Orwell fought with the POUM (Partido Obrero de Unificación Marxista) militia that was later maligned by Soviet propaganda. The Soviets leveled a range of accusations against the militia, including that its members were Trotskyists and spies for the other side. As a result, Spain became an unsafe place for him and Eileen. They escaped Spain by train to France in the summer of 1937. Orwell later wrote about his experiences in the Spanish Civil War in Homage to Catalonia, published in 1938.

Just as Wigan Pier had signaled the shift to an abiding focus on politics and political ideas in Orwell’s writing, Homage to Catalonia signaled the shift to an abiding focus on epistemology and language in his work. Orwell’s time in Spain helped him understand how language shapes beliefs and how beliefs, in turn, shape the contours of power. Thus, Homage to Catalonia does not mark a mere epistemic and linguistic turn in Orwell’s thinking. It also marks a significant development in Orwell’s views about the complex relationship between language, thought, and power.

Orwell’s experiences in Spain also further cemented his anti-Communism and his role as a critic of the left operating within the left. After a period of ill health upon returning from Spain due to his weak lungs from having been shot in his throat during battle, Orwell took on a grueling pace of literary production, publishing Coming Up for Air in 1939, Inside the Whale and Other Essays in 1940, and his lengthy essay on British Socialism, “The Lion and the Unicorn: Socialism and the English Genius” in 1941, as well as many other essays and reviews.

Orwell would have liked to have served in the military during the Second World War, but his ill health prevented him from doing so. Instead, from 1941 to 1943 he worked for the British Broadcasting Corporation (BBC). His job was meant, in theory, to aid Britain’s war efforts. Orwell was tasked with creating and delivering radio content to listeners on the Indian subcontinent in hopes of creating support for Britain and the Allied Powers. There were, however, relatively few listeners, and Orwell came to consider the job a waste of his time. Nevertheless, his experiences of bureaucracy and censorship at the BBC would later serve as one of the inspirations for the “Ministry of Truth,” which played a prominent role in the plot of Nineteen Eighty-Four (Sheldon 1991, 380-381).

Orwell’s final years were a series of highs and lows. After leaving the BBC, Orwell was hired as the literary editor at the democratic socialist magazine, the Tribune. As part of his duties, he wrote a regular column titled “As I Please.” He and Eileen, who herself was working for the BBC, adopted a baby boy named Richard in 1944. Shortly before they adopted Richard, Orwell had finished work on what was to be his breakthrough work, Animal Farm. Orwell originally had trouble finding someone to publish Animal Farm due to its anti-Communist message and publishers’ desires not to undermine Britain’s war effort, given that the United Kingdom was allied with the USSR against Nazi Germany at the time. The book was eventually published in August 1945, a few months after Eileen had died unexpectedly during an operation at age thirty-nine.

Animal Farm was a commercial success in both the United States and the United Kingdom. This gave Orwell both wealth and literary fame. Orwell moved with his sister Avril and Richard to the Scottish island of Jura, where Orwell hoped to be able to write with less interruption and to provide a good environment in which to raise Richard. During this time, living without electricity on the North Atlantic coast, Orwell’s health continued to decline. He was eventually diagnosed with tuberculosis.

Orwell pressed ahead on completing what was to be his last book, Nineteen Eighty-Four. In the words of one of Orwell’s biographers, Michael Sheldon, Nineteen Eighty-Four is a book in which “Almost every aspect of Orwell’s life is in some way represented.” Published in 1949, Nineteen Eighty-Four was in many ways the culmination of Orwell’s life work: it dealt with all the major themes from his writing—poverty, social class, war, totalitarianism, nationalism, censorship, truth, history, propaganda, language, and literature, among others.

Orwell died less than a year after the publication of Nineteen Eighty-Four. Shortly before his death, he had married Sonia Brownell, who had worked for the literary magazine Horizon. Brownell, who later went by Sonia Brownell Orwell, became one of Orwell’s literary executors. Her efforts to promote her late husband’s work included establishing the George Orwell Archive at University College London and co-editing with Ian Angus a four-volume collection of Orwell’s essays, journalism, and letters, first published in 1968. The publication of this collection further increased interest in Orwell and his work, which has yet to abate in the more than seventy years since his death.

2. Political Philosophy

Orwell’s claim that “Every line of serious work I have written since 1936 has been, directly or indirectly, against totalitarianism and for democratic Socialism,” divides Orwell’s work into two parts: pre-1936 and 1936-and-after.

Orwell’s second period (1936-and-after) is characterized by his strong views on politics and his focus on the interconnections between language, thought, and power. Orwell’s first period (pre-1936) focuses on two sets of interrelated themes: (1) poverty, money, work, and social status, and (2) imperialism and its ethical costs.

a. Poverty, Money, and Work

Orwell frequently wrote about poverty. It is a central topic in his books Down and Out and Wigan Pier and many of his essays, including “The Spike” and “How the Poor Die.” In writing about poverty, Orwell does not adopt an objective “view from nowhere”: rather, he writes as a member of the middle class to readers in the upper and middle classes. In doing so, he seeks to correct common misconceptions about poverty held by those in the upper and middle classes. These correctives deal with both the phenomenology of poverty and its causes.

His overall picture of poverty is less dramatic but more benumbing than his audience might initially imagine: one’s spirit is not crushed by poverty but rather withers away underneath it.

Orwell’s phenomenology of poverty is exemplified in the following passage from Down and Out:

It is altogether curious, your first contact with poverty. You have thought so much about poverty it is the thing you have feared all your life, the thing you knew would happen to you sooner or later; and it is all so utterly and prosaically different. You thought it would be quite simple; it is extraordinarily complicated. You thought it would be terrible; it is merely squalid and boring. It is the peculiar lowness of poverty that you discover first; the shifts that it puts you to, the complicated meanness, the crust-wiping (Down and Out, 16-17).

This account tracks Orwell’s own experiences by assuming the perspective of one who encounters poverty later in life, rather than the perspective of someone born into poverty. At least for those who “come down” into poverty, Orwell identifies a silver lining in poverty: that the fear of poverty in a hierarchical capitalist society is perhaps worse than poverty itself. Once you realize that you can survive poverty (which is something Orwell seemed to think most middle-class people in England who later become impoverished could), there is “a feeling of relief, almost of pleasure, at knowing yourself at last genuinely down and out” (Down and Out, 20-21). This silver lining, however, seems to be limited to those who enter poverty after having received an education. Orwell concludes that it is the person who has always been down and out who deserves pity, because such a person “faces poverty with a blank, resourceless mind” (Down and Out, 180). This latter statement invokes controversial assumptions in the philosophy of mind and is indicative of the ways in which Orwell was never able to overcome certain class biases from his own education. Orwell’s views on the working class and the poor have been critiqued by some scholars, including Raymond Williams (1971) and Beatrix Campbell (1984).

Much of Orwell’s discussion about poverty is aimed at humanizing poor people and at rooting out misconceptions about poor people. Orwell saw no inherent difference of character between rich and poor. It was their circumstances that differed, not their moral goodness. He identifies the English as having “a strong sense of the sinfulness of poverty” (Down and Out, 202). Through personal narratives, Orwell seeks to undermine this sense, concluding instead that “The mass of the rich and the poor are differentiated by their incomes and nothing else, and the average millionaire is only the average dishwasher dressed in a new suit” (Down and Out, 120). Orwell blames poverty instead on systemic factors, which the rich have the ability to change. Thus, insofar as Orwell assigns blame for the existence of poverty, he does not assign it to the poor.

If poverty is erroneously associated with vice, Orwell notes that money is also erroneously associated with virtue. This theme is taken up most directly in his 1936 novel, Keep the Aspidistra Flying, which highlights the central role that money plays in English life through the failures of the novel’s protagonist to live a fulfilling life that does not revolve around money. Orwell is careful to note that the significance of money is not merely economic, but also social. In Wigan Pier, Orwell notes that English class stratification is a “money-stratification” but that it is also a “shadowy caste-system” that “is not entirely explicable in terms of money” (122). Thus, both money and culture seem to play a role in Orwell’s account of class stratification in England.

Orwell’s view on the social significance of money helped shape his views about socialism. For example, in “The Lion and the Unicorn,” Orwell argued in favor of a socialist society in which income disparities were limited on the grounds that a “man with £3 a week and a man with £1500 a year can feel themselves fellow creatures, which the Duke of Westminster and the sleepers on the Embankment benches cannot.”

Orwell was attuned to various ways in which money impacts work and vice versa. For example, in Keep the Aspidistra Flying, the protagonist, Gordon Comstock, leaves his job in order to have time to write, only to discover that the discomforts of living on very little money have drained him of the motivation and ability to write. This is in keeping with Orwell’s view that creative work, such as making art or writing stories, requires a certain level of financial comfort. Orwell expresses this view in Wigan Pier, writing that, “You can’t command the spirit of hope in which anything has got to be created, with that dull evil cloud of unemployment hanging over you” (82).

Orwell sees this inability to do creative or other meaningful work as itself one of the harmful consequences of poverty. This is because Orwell views engaging in satisfying work as a meaningful part of human experience. He argues that human beings need work and seek it out (Wigan Pier, 197) and even goes so far as to claim that being cut off from the chance to work is being cut off from the chance of living (Wigan Pier, 198). He holds this view because he sees work as a way in which we can meaningfully engage both our bodies and our minds. For Orwell, work is valuable when it contributes to human flourishing.

But this does not mean that Orwell thinks all work has such value. Orwell is often critical of various social circumstances that require people to engage in work that they find degrading, menial, or boring. He shows particular distaste for working conditions that combine undesirability with inefficiency or exploitation, such as the conditions of low-level staff in Paris restaurants and coal miners in Northern England. Orwell recognizes that workers tolerate such conditions out of necessity and desperation, even though such working conditions often rob the workers of many aspects of a flourishing human life.

b. Imperialism and Oppression

By the time he left Burma at age 24, Orwell had come to strongly oppose imperialism. His anti-imperialist works include his novel Burmese Days, his essays “Shooting an Elephant” and “A Hanging,” and chapter 9 of Wigan Pier, in which he wrote of the time when he left his position with the Imperial Police in Burma: “I hated the imperialism I was serving with a bitterness which I probably cannot make clear” (Wigan Pier, 143).

In keeping with Orwell’s tendency to write from experience, Orwell focused mostly on the damage that he saw imperialism causing the imperialist oppressor rather than the oppressed. One might critique Orwell for failing to better account for the damage imperialism causes the oppressed, but one might also credit Orwell for discussing the evils of imperialism in a manner that might make its costs seem real to his audience, which, at least initially, consisted mostly of beneficiaries of British imperialism.

In writing about the experience of imperialist oppression from the perspective of the oppressor, Orwell often returns to several themes.

The first is the role of experience. Orwell argues that one can only really come to hate imperialism by being a part of imperialism (Wigan Pier, 144). One can doubt this is true, while still granting Orwell the emotional force of the point that experiencing imperialism firsthand can give one a particularly vivid understanding of imperialism’s “tyrannical injustice,” because one is, as Orwell put it, “part of the actual machinery of despotism” (Wigan Pier, 145).

Playing such a role in the machinery of despotism connects to a second theme in Orwell’s writing on imperialism: the guilt and moral damage caused by being an imperialist oppressor. In Wigan Pier, for example, Orwell writes the following about his state of mind after working for five years for the British Imperial Police in Burma:

I was conscious of an immense weight of guilt that I had got to expiate. I suppose that sounds exaggerated; but if you do for five years a job that you thoroughly disapprove of, you will probably feel the same. I had reduced everything to a simple theory that the oppressed are always right and the oppressors always wrong: a mistaken theory, but the natural result of being one of the oppressors yourself (Wigan Pier, 148).

A third theme in Orwell’s writing about imperialism is about ways in which imperialist oppressors—despite having economic and political power over the oppressed—themselves become controlled, in some sense, by those whom they oppress. For example, in “Shooting an Elephant” Orwell presents himself as having shot an elephant that got loose in a Burmese village merely in order to satisfy the local people’s expectations, even though he doubted shooting the elephant was necessary. Orwell writes of the experience that “I perceived in this moment that when the white man turns tyrant it is his own freedom that he destroys…For it is the condition of his rule that he shall spend his life trying to impress the ‘natives’ and so in every crisis he has got to do what the ‘natives’ expect of him.”

Thus, on Orwell’s account, no one is free under conditions of imperialist oppression, neither the oppressors nor the oppressed. The oppressed experience what Orwell calls in Wigan Pier “double oppression” because imperialist power leads not only to substantive injustice being committed against oppressed people, but also to that injustice being committed by unwanted foreign invaders (Wigan Pier, 147). Oppressors, on the other hand, feel the need to conform to their role as oppressors despite their guilt, shame, and a desire to do otherwise (which Orwell seemed to think were near universal feelings among the British imperialists of his day).

Notably, some of Orwell’s earliest articulations of how pressures to socially conform can lead to suppression of freedom of speech occur in the context of his discussions of the lack of freedom experienced by imperialist oppressors. For example, in “Shooting an Elephant,” Orwell wrote that he “had to think out [his] problems in the utter silence that is imposed on every Englishman in the East.” And in Wigan Pier, he wrote that for British imperialists in India there was “no freedom of speech” and that “merely to be overheard making a seditious remark may damage [one’s] career” (144).

c. Socialism

From the mid-1930s until the end of his life, Orwell advocated for socialism. In doing so, he sought to defend socialism against mischaracterization. Thus, to understand Orwell’s views on socialism, one must understand both what Orwell thought socialism was and what he thought it was not.

Orwell offers his most succinct definition of socialism in Wigan Pier as meaning “justice and liberty.” The sense of justice he had in mind included not only economic justice, but also social and political justice. Inclusion of the word “liberty” in his definition of socialism helps explain why elsewhere Orwell specifies that he is a democratic socialist. For Orwell, democratic socialism is a political order that provides social and economic equality while also preserving robust personal freedom. Orwell was particularly concerned to preserve what we might call the intellectual freedoms: freedom of thought, freedom of expression, and freedom of the press.

Orwell’s most detailed account of socialism, at least as he envisioned it for Great Britain, is included in his essay “The Lion and the Unicorn.” Orwell notes that socialism is usually defined as “common ownership of the means of production” (Part II, Section I), but he takes this definition to be insufficient. For Orwell, socialism also requires political democracy, the removal of hereditary privilege in the United Kingdom’s House of Lords, and limits on income inequality (Part II, Section I).

For Orwell, one of the great benefits of socialism seems to be the removal of class-based prejudice. Orwell saw this as necessary for the creation of fellow feeling between people within a society. Given his experiences within socially stratified early twentieth century English culture, Orwell saw the importance of removing both economic and social inequality in achieving a just and free society.

This is reflected in specific proposals that Orwell suggested England adopt going into World War II. (In “The Lion and the Unicorn,” Orwell typically refers to England or Britain, rather than the United Kingdom as a whole. This is true of much of Orwell’s work.) These proposals included:

I. Nationalization of land, mines, railways, banks and major industries.
II. Limitation of incomes, on such a scale that the highest tax-free income in Britain does not exceed the lowest by more than ten to one.
III. Reform of the educational system along democratic lines.
IV. Immediate Dominion status for India, with power to secede when the war is over.
V. Formation of an Imperial General Council, in which the colored peoples are to be represented.
VI. Declaration of formal alliance with China, Abyssinia and all other victims of the Fascist powers. (Part III, Section II)

Orwell viewed these as steps that would turn England into a “socialist democracy.”

In the latter half of Wigan Pier, Orwell argues that many people are turned off by socialism because they associate it with things that are not inherent to socialism. Orwell contends that socialism does not require the promotion of mechanical progress, nor does it require a disinterest in parochialism or patriotism. Orwell also views socialism as distinct from both Marxism and Communism, viewing the latter as a form of totalitarianism that at best puts on a socialist façade.

Orwell contrasts socialism with capitalism, which he defines in “The Lion and the Unicorn” as “an economic system in which land, factories, mines and transport are owned privately and operated solely for profit.” Orwell’s primary reason for opposing capitalism is his contention that capitalism “does not work” (Part II, Section I). Orwell offers some theoretical reasons to think capitalism does not work (for example, “It is a system in which all the forces are pulling in opposite directions and the interests of the individual are as often as not totally opposed to those of the State” (Part II, Section I)). But the core of Orwell’s argument against capitalism is grounded in claims about experience. In particular, he argues that capitalism left Britain ill-prepared for World War II and led to unjust social inequality.

d. Totalitarianism

Orwell conceives of totalitarianism as a political order focused on absolute power and control. The totalitarian attitude is exemplified by the antagonist, O’Brien, in Nineteen Eighty-Four. The fictional O’Brien is a powerful government official who uses torture and manipulation to gain power over the thoughts and actions of the protagonist, Winston Smith, a low-ranking official working in the propaganda-producing “Ministry of Truth.” Significantly, O’Brien treats his desire for power as an end in itself. O’Brien represents power for power’s sake.

Orwell recognized that because totalitarianism seeks complete power and total control, it is incompatible with the rule of law—that is, that totalitarianism is incompatible with stable laws that apply to everyone, including political leaders themselves. In “The Lion and the Unicorn,” Orwell writes of “[t]he totalitarian idea that there is no such thing as law, there is only power.” While law limits a ruler’s power, totalitarianism seeks to obliterate the limits of law through the uninhibited exercise of power. Thus, the fair and consistent application of law is incompatible with the kind of complete centralized power and control that is the final aim of totalitarianism.

Orwell sees totalitarianism as a distinctly modern phenomenon. For Orwell, Soviet Communism, Italian Fascism, and German Nazism were the first political orders seeking to be truly totalitarian. In “Literature and Totalitarianism,” Orwell describes the way in which totalitarianism differs from previous forms of tyranny and orthodoxy as follows:

The peculiarity of the totalitarian state is that though it controls thought, it doesn’t fix it. It sets up unquestionable dogmas, and it alters them from day to day. It needs the dogmas, because it needs absolute obedience from its subjects, but it can’t avoid the changes, which are dictated by the needs of power politics (“Literature and Totalitarianism”).

In pursuing complete power, totalitarianism seeks to bend reality to its will. This requires treating political power as prior to objective truth.

But Orwell denies that truth and reality can bend in the ways that the totalitarian wants them to. Objective truth itself cannot be obliterated by the totalitarian (although perhaps the belief in objective truth can be). It is for this reason that Orwell writes in “Looking Back on the Spanish War” that “However much you deny the truth, the truth goes on existing, as it were, behind your back, and you consequently can’t violate it in ways that impair military efficiency.” Orwell considers this to be one of the two “safeguards” against totalitarianism. The other safeguard is “the liberal tradition,” by which Orwell means something like classical liberalism and its protection of individual liberty.

Orwell understood that totalitarianism could be found on the political right and left. For Orwell, both Nazism and Communism were totalitarian (see, for example, “Raffles and Miss Blandish”). What united both the Soviet Communist and the German Nazi under the banner of totalitarianism was a pursuit of complete power and the ideological conformity that such power requires. Orwell recognized that such power required extensive capacity for surveillance, which explains why means of surveillance such as the “telescreen” and the “Thought Police” play a large role in the plot of Nineteen Eighty-Four. (For a discussion of Orwell as an early figure in the ethics of surveillance, see the article on surveillance ethics.)

e. Nationalism

One of Orwell’s more often cited contributions to political thought is his development of the concept of nationalism. In “Notes on Nationalism,” Orwell describes nationalism as “the habit of identifying oneself with a single nation or other unit, placing it beyond good and evil and recognizing no other duty than that of advancing its interests.” In “The Sporting Spirit,” Orwell adds that nationalism is “the lunatic modern habit of identifying oneself with large power units and seeing everything in terms of competitive prestige.”

In both these descriptions Orwell describes nationalism as a “habit.” Elsewhere, he refers to nationalism more specifically as a “habit of mind.” This habit of mind has at least two core features for Orwell—namely, (1) rooting one’s identity in group membership rather than in individuality, and (2) prioritizing advancement of the group one identifies with above all other goals. It is worth examining each of these features in more detail.

For Orwell, nationalism requires subordination of individual identity to group identity, where the group one identifies with is a “large power unit.” Importantly, for Orwell this large power unit need not be a nation. Orwell considered nationalism to be prevalent in movements as varied as “Communism, political Catholicism, Zionism, Antisemitism, Trotskyism and Pacifism” (“Notes on Nationalism”). What is required is that the large power unit be something that individuals can adopt as the center of their identity. This can happen via positive attachment (that is, by identifying with a group), but it can also happen via negative rejection (that is, by identifying as against a group). This is how, for example, Orwell’s list of movements with nationalistic tendencies could include both Zionism and Antisemitism.

But making group membership the center of one’s identity is not on its own sufficient for nationalism as Orwell understood it. Nationalists make advancement of their group their top priority. For this reason, Orwell states that nationalism “is inseparable from the desire for power” (“Notes on Nationalism”). The nationalist stance is aggressive. It seeks to overtake all else. Orwell contrasts the aggressive posture taken by nationalism with a merely defensive posture that he refers to as patriotism. For Orwell, patriotism is “devotion to a particular place and a particular way of life, which one believes to be the best in the world but has no wish to force on other people” (“Notes on Nationalism”). He sees patriotism as laudable but sees nationalism as dangerous and harmful.

In “Notes on Nationalism,” Orwell writes that the “nationalist is one who thinks solely, or mainly, in terms of competitive prestige.” As a result, the nationalist “may use his mental energy either in boosting or in denigrating—but at any rate his thoughts always turn on victories, defeats, triumphs and humiliations.” In this way, Orwell’s analysis of nationalism can be seen as a forerunner for much of the contemporary discussion about political tribalism and negative partisanship, which occurs when one’s partisan identity is primarily driven by dislike of one’s outgroup rather than support for one’s ingroup (Abramowitz and Webster).

It is worth noting that Orwell takes his own definition of nationalism to be somewhat stipulative. Orwell started with a concept that he felt needed to be discussed and decided that nationalism was the best name for this concept. Thus, his discussions of nationalism (and patriotism) should not be considered conceptual analysis: rather, these discussions are more akin to what is now often called conceptual ethics or conceptual engineering.

3. Epistemology and Philosophy of Mind

Just as 1936-37 marked a shift toward the overtly political in Orwell’s writing, so too those years marked a shift toward the overtly epistemic. Orwell was acutely aware of how powerful entities, such as governments and the wealthy, were able to influence people’s beliefs. Witnessing both the dishonesty and success of propaganda about the Spanish Civil War, Orwell worried that these entities had become so successful at controlling others’ beliefs that “The very concept of objective truth [was] fading out of the world” (“Looking Back on the Spanish War”). Orwell’s desire to defend truth, alongside his worries that truth could not be successfully defended in a completely totalitarian society, culminates in the frequent epistemological ruminations of Winston Smith, the fictional protagonist in Nineteen Eighty-Four.

a. Truth, Belief, Evidence, and Reliability

Orwell’s writing routinely employs many common epistemic terms from philosophy, including “truth,” “belief,” “knowledge,” “facts,” “evidence,” “testimony,” “reliability,” and “fallibility,” among others, yet he also seems to have taken for granted that his audience would understand these terms without defining them. Thus, one must look at how Orwell uses these terms in context in order to figure out what he meant by them.

To start with the basics, Orwell distinguishes between belief and truth and rejects the view that group consensus makes something true. For example, in his essay on Rudyard Kipling, Orwell writes “I am not saying that that is a true belief, merely that it is a belief which all modern men do actually hold.” Such a statement assumes that truth is a property that can be applied to beliefs, that truth is not grounded on acceptance by a group, and that just because someone believes something does not make it true.

On the contrary, Orwell seems to think that truth is, in an important way, mind-independent. For example, he writes that, “However much you deny the truth, the truth goes on existing, as it were, behind your back, and you consequently can’t violate it in ways that impair military efficiency” (“Looking Back on the Spanish War”). For Orwell, truth is derived from the way the world is. Because the world is a certain way, when our beliefs fail to accord with reality, our actions fail to align with the way the world is. This is why rejecting objective truth wholesale would, for instance, “impair military efficiency.” You can claim there are enough rations and munitions for your soldiers, but if, in fact, there are not enough rations and munitions for your soldiers, you will suffer military setbacks. Orwell recognizes this as a pragmatic reason to pursue objective truth.

Orwell does not talk about justification for beliefs as academic philosophers might. However, he frequently appeals to quintessential sources of epistemic justification—such as evidence and reliability—as indicators of a belief’s worthiness of acceptance and its likelihood of being true. For example, Orwell suggests that anyone who wonders whether he harbors antisemitic attitudes should “start his investigation in the one place where he could get hold of some reliable evidence—that is, in his own mind” (“Antisemitism”). Regardless of what one thinks of Orwell’s strategy for detecting antisemitism, this passage shows Orwell’s assumption that, at least some of the time, we can obtain reliable evidence via introspection.

Orwell’s writings on the Spanish Civil War provide a particularly rich set of texts from which to learn about the conditions under which Orwell thinks we can obtain reliable evidence. This is because Orwell was seeking to help readers (and perhaps also himself) separate truth from lies about what happened during that war. In so doing, Orwell offers an epistemology of testimony. For example, he writes:

Nor is there much doubt about the long tale of Fascist outrages during the last ten years in Europe. The volume of testimony is enormous, and a respectable proportion of it comes from the German press and radio. These things really happened, that is the thing to keep one’s eye on (“Looking Back on the Spanish War”).

Here, Orwell appeals to both the volume and the source of testimony as reason to have little doubt that fascist outrages had been occurring in Europe. Orwell also sometimes combines talk of evidence via testimony with other sources of evidence—like first-hand experience—writing, for example, “I have had accounts of the Spanish jails from a number of separate sources, and they agree with one another too well to be disbelieved; besides I had a few glimpses into one Spanish jail myself” (Homage to Catalonia, 179).

While recognizing the epistemic challenges posed by propaganda and self-interest, Orwell was no skeptic about knowledge. He was comfortable attributing knowledge to agents and referring to states of affairs as facts, writing, for example: “These facts, known to many journalists on the spot, went almost unmentioned in the British press” (“The Prevention of Literature”). Orwell was less sanguine about our ability to know with certainty, writing, for example, “[It] is difficult to be certain about anything except what you have seen with your own eyes, and consciously or unconsciously everyone writes as a partisan” (Homage to Catalonia, 195). This provides reason to think that Orwell was a fallibilist about knowledge—that is, someone who thinks you can know a proposition even while lacking certainty about the truth of that proposition. (For example, a fallibilist might claim to know she has hands but still deny that she is certain that she has hands.)

Orwell saw democratic socialism as responsive to political and economic facts, whereas he saw totalitarianism as seeking to bend the facts to its will. Thus, Orwell’s promotion of objective truth is closely tied to his promotion of socialism over totalitarianism. This led Orwell to confess that he was frightened by “the feeling that the very concept of objective truth is fading out of the world.” For Orwell, acknowledging objective truth requires acknowledging reality and the limitations reality places on us. Reality says that 2 + 2 = 4 and not that 2 + 2 = 5.

In this way, Orwell uses the protagonist of Nineteen Eighty-Four, Winston Smith, to express his views on the relationship between truth and freedom. An essential part of freedom for Orwell is the ability to think and to speak the truth. Orwell was especially prescient in identifying hindrances to the recognition of truth and the freedom that comes with it. These threats include nationalism, propaganda, and technology that can be used for constant surveillance.

b. Ignorance and Experience

Writing was a tool Orwell used to try to dispel his readers’ ignorance. He was a prolific writer who wrote many books, book reviews, newspaper editorials, magazine articles, radio broadcasts, and letters during a relatively short career. In his writing, he sought to disabuse the rich of ignorant notions about the poor; he sought to correct mistaken beliefs about the Spanish Civil War that had been fueled by fascist propaganda; and he sought to counteract inaccurate portrayals of democratic socialism and its relationship to Soviet Communism.

Orwell’s own initial ignorance on these matters had been dispelled by life experience. As a result, he viewed experience as capable of overcoming ignorance. He seemed to believe that testimony about experience also had the power to help those who received testimony to overcome their ignorance. Thus, Orwell sought to testify to his experiences in a way that might help counteract the inaccurate perceptions of those who lacked experience about matters to which he testified in his writing.

As discussed earlier, Orwell believed that middle- and upper-class people in Britain were largely ignorant about the character and circumstances of those living in poverty, and that what such people imagined poverty to be like was often inaccurate. Concerning his claim that the rich and poor do not have different natures or different moral character, Orwell writes that “Everyone who has mixed on equal terms with the poor knows this quite well. But the trouble is that intelligent, cultivated people, the very people who might be expected to have liberal opinions, never do mix with the poor” (Down and Out, 120).

Orwell made similar points about many other people and circumstances. He argued that the job of low-level kitchen staff in French restaurants that appeared easy from the outside was actually “astonishingly hard” (Down and Out, 62), that actually watching coal miners work could cause a member of the English to doubt their status as “a superior person” (Wigan Pier, 35), and that working in a bookshop was a good way to disabuse oneself of the fantasy that working in a bookshop was a paradise (see “Bookshop Memories”).

There is an important metacommentary that is hard to overlook concerning Orwell’s realization that experience is often necessary to correct ignorance. During his lifetime, Orwell amassed an eclectic set of experiences that helped him to understand better the perspective of those in a variety of professions and social classes. This allowed him to empathize with the plight of a wide variety of white men. However, try as he might, Orwell could not ever experience what it was like to be a woman, person of color, or queer-identified person in any of these circumstances. Feminist critics have rightfully called attention to the misogyny and racism that is common in Orwell’s work (see, for example, Beddoe 1984, Campbell 1984, and Patai 1984). Orwell’s writings were also often homophobic (see, for example, Keep the Aspidistra Flying, chapter 1; Taylor 2003, 245). In addition, critics have pointed to antisemitism and anti-Catholicism in his writing (see, for example, Brennan 2017). Thus, Orwell’s insights about the epistemic power of experience also help explain significant flaws in his corpus, due to the limits of his own experience and imagination, or perhaps more simply due to his own prejudices.

c. Embodied Cognition

Orwell’s writing is highly consonant with philosophical work emphasizing that human cognition is embodied. For Orwell, unlike Descartes, we are not first and foremost a thinking thing. Rather, Orwell writes that “A human being is primarily a bag for putting food into; the other functions and faculties may be more godlike, but in point of time they come afterwards” (Wigan Pier, 91-92).

The influence of external circumstances and physical conditions on human cognition plays a significant role in all of Orwell’s nonfiction books as well as in Animal Farm and Nineteen Eighty-Four. In Homage to Catalonia, Orwell relays how, due to insufficient sleep as a soldier in the Spanish Republican Army, “One grew very stupid” (43). In Down and Out, Orwell emphasized how the physical conditions of a poor diet make it so that you can “interest yourself in nothing” so that you become “only a belly with a few accessory organs” (18-19). And in Wigan Pier, Orwell argues that even the “best intellect” cannot stand up against the “debilitating effect of unemployment” (81). This, he suggests, is why it is hard for the unemployed to do things like write books. They have the time, but according to Orwell, writing books requires peace of mind in addition to time. And Orwell believed that the living conditions for most unemployed people in early twentieth-century England did not afford such peace of mind.

Orwell’s emphasis on embodied cognition is another way in which he recognizes the tight connection between the political and the epistemic. In Animal Farm, for example, the animals are initially pushed toward their rebellion against the farmer after they are left unfed, and their hunger drives them to action. And Napoleon, the aptly named pig who eventually gains dictatorial control over the farm, keeps the other animals overworked and underfed as a way of making them more pliable and controllable. Similarly, in Nineteen Eighty-Four, while food is rationed, gin is in abundance for party members. And the physical conditions of deprivation and torture are used to break the protagonist Winston Smith’s will to the point that his thoughts become completely malleable. Epistemic control over citizens’ minds gives the Party power over the physical conditions of the citizenry, with control over the physical conditions of the citizenry in turn helping cement the Party’s epistemic control over citizens.

d. Memory and History

Orwell treats memory as a deeply flawed yet invaluable faculty, because it is often the best or only way to obtain many truths about the past. The following passage is paradigmatic of his position: “Treacherous though memory is, it seems to me the chief means we have of discovering how a child’s mind works. Only by resurrecting our own memories can we realize how incredibly distorted is the child’s vision of the world” (“Such, Such Were the Joys”).

In his essay “My Country Right or Left,” Orwell expresses wariness about the unreliability of memories, yet he also seems optimistic about our ability to separate genuine memories from false interpolations with concentration and reflection. Orwell argued that over time British recollection of World War I had become distorted by nostalgia and post hoc narratives. He encouraged his readers to “Disentangle your real memories from later accretions,” which suggests he thinks such disentangling is at least possible. This is reinforced by his later claim that he was able to “honestly sort out my memories and disregard what I have learned since” about World War I (“My Country Right or Left”).

As these passages foreshadow, Orwell sees both the power and limitation of memory as politically significant. Accurate memories can refute falsehoods and lies, including falsehoods and lies about history. But memories are also susceptible to corruption, and cognitive biases may allow our memories to be corrupted in predictable and useful ways by those with political power. Orwell worried that totalitarian governments were pushing a thoroughgoing skepticism about the ability to write “true history.” At the same time, Orwell also noted that these same totalitarian governments used propaganda to try to promote their own accounts of history—accounts which often were wildly discordant with the facts (see, for example, “Looking Back on the Spanish War,” Section IV).

The complex relationship between truth, memory, and history in a totalitarian regime is a central theme in Nineteen Eighty-Four. One of the protagonist’s primary ways of resisting the patent lies told by the Party was clinging to memories that contradicted the Party’s false claims about the past. The primary antagonist, O’Brien, sought to eliminate Winston’s trust in his own memories by convincing him to give up on the notion of objective truth completely. Like many of the key themes in Nineteen Eighty-Four, Orwell discussed the relationship between truth, memory, and history under totalitarianism elsewhere. Notable examples include his essays “Looking Back on the Spanish War,” “Notes on Nationalism,” and “The Prevention of Literature.”

4. Philosophy of Language

Orwell had wide-ranging interests in language. These interests spanned the simple “joy of mere words” to the political desire to use language “to push the world in a certain direction” (“Why I Write”). Orwell studied how language could both obscure and clarify, and he sought to identify the political significance language had as a result.

a. Language and Thought

For Orwell, language and thought significantly influence one another. Our thought is a product of our language, which in turn is a product of our thought.

“Politics and the English Language” contains Orwell’s most explicit writing about this relationship. In the essay, Orwell focuses primarily on language’s detrimental effects on thought and vice versa, writing, for example, that the English language “becomes ugly and inaccurate because our thoughts are foolish, but the slovenliness of our language makes it easier for us to have foolish thoughts” and that “If thought corrupts language, language can also corrupt thought.” But despite this focus on detrimental effects, Orwell’s purpose in “Politics and the English Language” is ultimately positive. His “point is that the process [of corruption] is reversible.” Orwell believed the bad habits of thought and writing he observed could “be avoided if one is willing to take the necessary trouble.” Thus, the essay functions, in part, as a guide for doing just that.

This relationship between thought and language is part of a larger three-part relationship Orwell identified between language, thought, and politics (hence the essay’s title, “Politics and the English Language”). Just as thought and language mutually influence one another, so too do thought and politics. Thus, through the medium of thought, politics and language influence one another too. Orwell argues that if one writes well, “One can think more clearly,” and in turn that “To think clearly is a necessary first step toward political regeneration.” This makes good writing a political task. Therefore, Orwell concludes that for those in English-speaking political communities, “The fight against bad English is not frivolous and is not the exclusive concern of professional writers.” An analogous principle holds for those living in political communities that use other languages. For example, based on his theory about the bi-directional influence that language, thought, and politics have upon one another, Orwell wrote that he expected “that the German, Russian and Italian languages have all deteriorated in the last ten or fifteen years, as a result of dictatorship.” (“Politics and the English Language” was published shortly after the end of World War II.)

Orwell’s desire to avoid bad writing is not the desire to defend “standard English” or rigid rules of grammar. Rather, Orwell’s chief goal is for language users to aspire “to let the meaning choose the word, and not the other way about.” Communicating clearly and precisely requires conscious thought and intention. Writing in a way that preserves one’s meaning takes work. Simply selecting the words, metaphors, and phrases that come most easily to mind can obscure our meaning from others and from ourselves. Orwell describes a speaker who is taken over so completely by stock phrases, stale metaphors, and an orthodox party line as someone who:

Has gone some distance toward turning himself into a machine. The appropriate noises are coming out of his larynx, but his brain is not involved, as it would be if he were choosing his words for himself. If the speech he is making is one that he is accustomed to make over and over again, he may be almost unconscious of what he is saying.

Orwell explores this idea in Nineteen Eighty-Four with the concept of “duckspeak,” which describes speech in which the speaker merely quacks like a duck when repeating orthodox platitudes.

b. Propaganda

Like many terms that were important to him, Orwell never defines what he means by “propaganda,” and it is not clear that he always used the term consistently. Still, Orwell was an insightful commentator on how propaganda functioned and why understanding it mattered.

Orwell often used the term “propaganda” pejoratively. But this does not mean that Orwell thought propaganda was always negative. Orwell wrote that “All art is propaganda,” while denying that all propaganda was art (“Charles Dickens”). He held that the primary aim of propaganda is “to influence contemporary opinion” (“Notes on Nationalism”). Thus, Orwell’s sparsest conception of propaganda seems to be messaging aimed at influencing opinion. Such messages need not be communicated only with words. For example, Orwell frequently pointed out the propagandistic properties of posters, which likely inspired his prose about the posters of Big Brother in Nineteen Eighty-Four. This sparse conception of propaganda does not include conditions that other accounts may include, such as that the messaging must be in some sense misleading or that the attempt to influence must be in some sense manipulative (compare with Stanley 2016).

Orwell found much of the propaganda of his age troubling because of the deleterious effects he believed propaganda was having on individuals and society. Propaganda functions to control narratives and, more broadly, thought. Orwell observed that sometimes this was done by manipulating the effect language was apt to have on audiences.

He noted that dictators like Hitler and Stalin committed callous murders, but never referred to them as such, preferring instead to use terms like “liquidation,” “elimination,” or “some other soothing phrase” (“Inside the Whale”). But at other times, he noted that propaganda consisted of outright lies. In lines reminiscent of the world he would create in Nineteen Eighty-Four, Orwell described the situation he observed as follows: “Much of the propagandist writing of our time amounts to plain forgery. Material facts are suppressed, dates altered, quotations removed from their context and doctored so as to change their meaning” (“Notes on Nationalism”). Orwell also noted that poorly done propaganda could not only fail but could also backfire and repel the intended audience. He was often particularly hard on his allies on the political left for propaganda that he thought most working-class people found off-putting.

5. Philosophy of Art and Literature

Orwell viewed aesthetic value as distinct from other forms of value, such as moral and economic. He most often discussed aesthetic value while discussing literature, which he considered a category of art. Importantly, Orwell did not think that the only way to assess literature was on its aesthetic merits. He thought literature (along with other kinds of art and writing) could be assessed morally and politically as well. This is perhaps unsurprising given his desire “to make political writing into an art” (“Why I Write”).

a. Value of Art and Literature

That Orwell views aesthetic value as distinct from moral value is clear. Orwell wrote in an essay on Salvador Dali that one “ought to be able to hold in one’s head simultaneously the two facts that Dali is a good draughtsman and a disgusting human being” (“Benefit of Clergy”). What is less clear is what Orwell considers the grounds for aesthetic value. Orwell appears to have been of two minds about this. At times, Orwell seemed to view aesthetic values as objective but ineffable. At other times, he seemed to view aesthetic value as grounded subjectively on the taste of individuals.

For example, Orwell writes that his own age was one “in which the average human being in the highly civilized countries is aesthetically inferior to the lowest savage” (“Poetry and the Microphone”). This suggests some culturally neutral perspective from which aesthetic refinement can be assessed. In fact, Orwell seems to think that one’s cultural milieu can enhance or corrupt one’s aesthetic sensitivity, writing that the “ugliness” of his society had “spiritual and economic causes,” and that “Aesthetic judgements, especially literary judgements, are often corrupted in the same way as political ones” (“Poetry and the Microphone”; “Notes on Nationalism”). Orwell even held that some people “have no aesthetic feelings whatever,” a condition to which he thought the English were particularly susceptible (“The Lion and the Unicorn”). On the other hand, Orwell also wrote that “Ultimately there is no test of literary merit except survival, which is itself an index to majority opinion” (“Lear, Tolstoy, and the Fool”). This suggests that perhaps aesthetic value bottoms out in intersubjectivity.

There are ways of softening this tension, however, by noting the different ways in which Orwell thinks literary merit can be assessed. For example, Orwell writes the following:

Supposing that there is such a thing as good or bad art, then the goodness or badness must reside in the work of art itself—not independently of the observer, indeed, but independently of the mood of the observer. In one sense, therefore, it cannot be true that a poem is good on Monday and bad on Tuesday. But if one judges the poem by the appreciation it arouses, then it can certainly be true, because appreciation, or enjoyment, is a subjective condition which cannot be commanded (“Politics vs. Literature”).

This suggests literary merit can be assessed either in terms of artistic merit or in terms of subjective appreciation and that these two forms of assessment need not generate matching results.

This solution, however, still leaves the question of what justifies artistic merit unanswered. Perhaps the best answer available comes in Orwell’s essay on Charles Dickens. There, Orwell concluded that “As a rule, an aesthetic preference is either something inexplicable or it is so corrupted by non-aesthetic motives as to make one wonder whether the whole of literary criticism is not a huge network of humbug.” Here, Orwell posits two potential sources of aesthetic preference: one of which is humbug and one of which is inexplicable. This suggests that Orwell may favor a view of aesthetic value that is ultimately ineffable. But even if the grounding of aesthetic merit is inexplicable, Orwell seems to think we can still judge art on aesthetic, as well as moral and political, grounds.

b. Literature and Politics

Orwell believed that there was “no such thing as genuinely non-political literature” (“The Prevention of Literature”). This is because Orwell thought that all literature sent a political message, even if the message was as simple as reinforcing the status quo. This is part of what Orwell means when he says that all art is propaganda. For Orwell, all literature—like all art—seeks to influence contemporary opinion. For this reason, all literature is political.

Because all literature is political, Orwell thought that a work of literature’s political perspective often influenced the level of merit a reader assigned to it. More specifically, people tend to think well of literature that agrees with their political outlook and think poorly of literature that disagrees with it. Orwell defended this position by pointing out “the extreme difficulty of seeing any literary merit in a book that seriously damages your deepest beliefs” (“Inside the Whale”).

But just as literature could influence politics through its message, so too politics could and did influence literature. Orwell argued that all fiction is “censored in the interests of the ruling class” (“Boys’ Weeklies”). For Orwell, this was troubling under any circumstances, but was particularly troublesome when the state exhibited totalitarian tendencies. Orwell thought that the writing of literature became impossible in a state that was genuinely totalitarian. This was because in a totalitarian regime there is no intellectual freedom and there is no stable set of shared facts. As a result, Orwell held that “The destruction of intellectual liberty cripples the journalist, the sociological writer, the historian, the novelist, the critic, and the poet, in that order” (“The Prevention of Literature”).

Thus, Orwell’s views on the mutual connections between politics, thought, and language extend to art—especially written art. These things affect literature so thoroughly that certain political orders make writing literature impossible. But literature, in turn, has the power to affect these core aspects of human life.

6. Orwell’s Relationship to Academic Philosophy

Orwell’s relationship to academic philosophy has never been a simple matter. Orwell admired Bertrand Russell, yet he wrote in response to a difficulty he encountered reading one of Russell’s books that it was “the sort of thing that makes me feel that philosophy should be forbidden by law” (Barry 2021). Orwell considered A. J. Ayer a “great friend,” yet Ayer said that Orwell “wasn’t interested in academic philosophy in the very least” and believed that Orwell thought academic philosophy was “rather a waste of time” (Barry 2022; Wadhams 2017, 205). And Orwell referred to Jean-Paul Sartre as “a bag of wind” to whom he was going to give “a good [metaphorical] boot” (Tyrrell 1996).

Some have concluded that Orwell was uninterested in or incapable of doing rigorous philosophical work. Bernard Crick, one of Orwell’s biographers who was himself a philosopher and political theorist, concluded that Orwell would “have been incapable of writing a contemporary philosophical monograph, scarcely of understanding one,” observing that “Orwell chose to write in the form of a novel, not in the form of a philosophical tractatus” (Crick 1980, xxvii). This is probably all true. But this does not mean that Orwell’s work was not influenced by academic philosophy. It was. This also does not mean that Orwell’s work is not valuable for academic philosophers. It is.

Aside from critical comments about Marx, Orwell tended not to reference philosophers by name in his work (compare with Tyrrell 1996). As such, it can be hard to determine the extent to which he was familiar with or was influenced by such thinkers. Crick concludes that Orwell was “innocent of reading either J.S. Mill or Karl Popper,” yet seemed independently to reach some similar conclusions (Crick 1980, 351). But while there is little evidence of Orwell’s knowledge of the history of philosophy, there is plenty of evidence of his familiarity with at least some philosophical work written during his own lifetime. Orwell reviewed books by both Sartre and Russell (Tyrrell 1996, Barry 2021), and Orwell’s library at the time of his death included several of Russell’s books (Barry 2021). By examining Orwell’s knowledge of, interactions with, and writing about Russell, Peter Brian Barry has offered compelling arguments that Russell influenced Orwell’s views on moral psychology, metaethics, and metaphysics (Barry 2021; Barry 2022). And as others have noted, there is a clear sense in which Orwell’s writing deals with philosophical themes and seeks to work through philosophical ideas (Tyrrell 1996; Dwan 2010, 2018; Quintana 2018, 2020; Satta 2021a, 2021c).

These claims can be made consistent by distinguishing being an academic philosopher and being a philosophical thinker in some other sense. Barry puts the point well, noting that Orwell’s lack of interest in “academic philosophy” is “consistent with Orwell being greatly interested in normative public philosophy, including social and political philosophy.” David Dwan makes a similar point, preferring to call Orwell a “political thinker” rather than a “political philosopher” and arguing that we “can map the challenges he [Orwell] presents for political philosophy without ascribing to him a rigour to which he never aspired” (Dwan 2018, 4).

Philosophers working in political philosophy, philosophy of language, epistemology, ethics, and metaphysics, among other fields, have used and discussed Orwell’s writing. Richard Rorty, for example, devoted a chapter to Orwell in his 1989 book Contingency, Irony, and Solidarity, where he claimed that Orwell’s “description of our political situation—of the dangers and options at hand—remains as useful as any we possess” (Rorty 1989, 170). For Rorty, part of Orwell’s value was that he “sensitized his [readers] to a set of excuses for cruelty,” which helped reshape our political understanding (Rorty 1989, 171). Rorty also saw Orwell’s work as helping show readers that totalitarian figures like 1984’s O’Brien were possible (Rorty 1989, 175-176).

But perhaps the chief value Rorty saw in Orwell’s work was the way in which it showed the deep human value in having the ability to say what you believe and the “ability to talk to other people about what seems true to you” (Rorty 1989, 176). That is to say, Rorty recognized the value that Orwell placed on intellectual freedom. That said, Rorty here seeks to morph Orwell into his own image by suggesting that Orwell cares merely about intellectual freedom and not about truth. Rorty argues that, for Orwell, “It does not matter whether ‘two plus two is four’ is true” and that Orwell’s “question about ‘the possibility of truth’ is a red herring” (Rorty 1989, 176, 182). Rorty’s claims that Orwell was not interested in truth have not been widely adopted. In fact, his position has prompted philosophical defense of the much more plausible view that Orwell cared about truth and considered truth to be, in some sense, real and objective (see, for example, van Inwagen 2008; Dwan 2018, 160-163; compare with Conant 2020).

In philosophy of language, Derek Ball has identified Orwell as someone who recognized that “A particular metasemantic fact might have certain social and political consequences” (Ball 2021, 45). Ball also notes that on one plausible reading, Orwell seems to accept both linguistic determinism—“the claim that one’s language influences or determines what one believes, in such a way that speakers of different languages will tend to possess different (and potentially incompatible) beliefs precisely because they speak different languages”—and linguistic relativism—“the claim that one’s language influences or determines what concepts one possesses, and hence what thoughts one is capable of entertaining, in such a way that speakers of different languages frequently possess entirely different conceptual repertoires precisely because they speak different languages” (Ball 2021, 47).

Ball’s points are useful ways to frame some of Orwell’s key philosophical commitments about the interrelationship between language, thought, and politics. Ball’s observations accord with Judith Shklar’s claim that the plot of 1984 “is not really just about totalitarianism but rather about the practical implications of the notion that language structures all our knowledge of the phenomenal world” (Shklar 1984). Similarly, in his work on manipulative speech, Justin D’Ambrosio has noted the significance of Orwell’s writing for politically relevant philosophy of language (D’Ambrosio unpublished manuscript). These kinds of observations about Orwell’s views may become increasingly significant in academic philosophy, given the current development of political philosophy of language as an area of study (see, for example, Khoo and Sterken 2021).

Philosophers have also noted the value of Orwell’s work for epistemology. Martin Tyrrell argues that much of Orwell’s “later and better writing amounts to an attempt at working out the political consequences of what are essentially philosophical questions,” citing specifically epistemological questions like “When and what should we doubt?” and “When and what should we believe?” (Tyrrell 1996). Simon Blackburn has noted the significance of Orwell’s worries about truth for political epistemology, concluding that “The answer to Orwell’s worry [about the possibility of truth] is not to give up inquiry, but to conduct it with even more care, diligence, and imagination” (Blackburn 2021, 70). Mark Satta has documented Orwell’s recognition of the epistemic point that our physical circumstances as embodied beings influence our thoughts and beliefs (Satta 2021a).

As noted earlier, Orwell treats moral value as a domain distinct from other types of value, such as the aesthetic. Academic philosophers have studied and productively used Orwell’s views in the field of ethics. Barry argues that Orwell’s moral views are a form of threshold deontology, on which certain moral norms (such as telling the truth) must be followed, except on occasions where not following such norms is necessary to prevent horrendous results. Barry also argues that Orwell’s moral norms come from Orwell’s humanist account of moral goodness, which grounds moral goodness in what is good for human beings. This account of Orwell’s ethical commitments accords with Dwan’s view that, while Orwell engaged in broad criticism of moral consequentialism, there were limits to Orwell’s rejection of consequentialism, such as Orwell’s acceptance that some killing is necessary in war (Dwan 2018, 17-19).

Philosophers have also employed Orwell’s writing at the intersection of ethics and political philosophy. For example, Martha Nussbaum identifies the ethical and political importance given to emotions in 1984. She examines how Winston Smith looks back longingly at a world which contained free expression of emotions like love, compassion, pity, and fellow feeling, while O’Brien seeks to establish a world in which the dominant (perhaps only) emotions are fear, rage, triumph, and self-abasement (Nussbaum 2005). Oriol Quintana has identified the importance of human recognition in Orwell’s corpus and has used this in an account of the ethics of solidarity (Quintana 2018). Quintana has also argued that there are parallels between the work of George Orwell and the French philosopher Simone Weil, especially the importance they both attached to “rootedness”—that is, “a feeling of belonging in the world,” in contrast to asceticism or detachment (Quintana 2020, 105). Felicia Nimue Ackerman has emphasized the ways in which 1984 is a novel about a love affair, which addresses questions about the nature of human agency and human relationships under extreme political circumstances (Ackerman 2019). David Dwan examines Orwell’s understanding of and frequent appeals to several important moral and political terms including “equality,” “liberty,” and “justice” (Dwan 2012, 2018). Dwan holds that Orwell is “a great political educator, but less for the solutions he proffered than for the problems he embodied and the questions he allows us to ask” (Dwan 2018, 2).

Thus, although he was never a professional philosopher or member of the academy, Orwell has much to offer those interested in philosophy. An increasing number of philosophers seem to have recognized this in recent years. Although limited by his time and his prejudices, Orwell was an insightful critic of totalitarianism and many other ways in which political power can be abused. Part of his insight was the interrelationship between our political lives and other aspects of our individual and collective experiences, such as what we believe, how we communicate, and what we value. Both Orwell’s fiction and his essays provide much that is worthy of reflection for those interested in such aspects of human experience and political life.

7. References and Further Reading

a. Primary Sources

  • Down and Out in Paris and London. New York: Harcourt Publishing Company, 1933/1961.
  • Burmese Days. Boston: Mariner Books, 1934/1974.
  • “Shooting an Elephant.” New Writing, 1936 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/shooting-an-elephant/.
  • Keep the Aspidistra Flying. New York: Harcourt Publishing Company, 1936/1956.
  • The Road to Wigan Pier. New York: Harcourt Publishing Company, 1937/1958.
  • Homage to Catalonia. Boston: Mariner Books, 1938/1952.
  • “My Country Right or Left.” Folios of New Writing, 1940 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/my-country-right-or-left/.
  • “Inside the Whale.” Published in Inside the Whale and Other Essays. London: Victor Gollancz Ltd., 1940 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/inside-the-whale/.
  • “Boys’ Weeklies.” Published in Inside the Whale and Other Essays. London: Victor Gollancz Ltd., 1940 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/boys-weeklies/.
  • “Charles Dickens.” Published in Inside the Whale and Other Essays. London: Victor Gollancz Ltd., 1940 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/charles-dickens/.
  • “Rudyard Kipling.” Horizon, 1941 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/rudyard-kipling/.
  • “The Lion and the Unicorn: Socialism and the English Genius.” Searchlight Books, 1941 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/the-lion-and-the-unicorn-socialism-and-the-english-genius/.
  • “Literature and Totalitarianism.” Listener (originally broadcast on the BBC Overseas Service). June 19, 1941. (Reprinted in The Collected Essays, Journalism and Letters of George Orwell, Vol 2. Massachusetts: Nonpareil Books, 2007.)
  • “Looking Back on the Spanish War.” New Road, 1943 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/looking-back-on-the-spanish-war/.
  • “Benefit of Clergy: Some Notes on Salvador Dali.” 1944. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/benefit-of-clergy-some-notes-on-salvador-dali/.
  • “Antisemitism in Britain.” Contemporary Jewish Record, 1945 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/antisemitism-in-britain/.
  • “Notes on Nationalism.” Polemic, 1945 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/notes-on-nationalism/.
  • “The Sporting Spirit.” Tribune, 1945 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/the-sporting-spirit/.
  • “Poetry and the Microphone.” The New Saxon Pamphlet, 1945 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/poetry-and-the-microphone/.
  • “The Prevention of Literature.” Polemic, 1946 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/the-prevention-of-literature/.
  • “Why I Write.” Gangrel, 1946 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/why-i-write/.
  • “Politics and the English Language.” Horizon, 1946 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/politics-and-the-english-language/.
  • “Politics vs. Literature: An Examination of Gulliver’s Travels.” Polemic, 1946 https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/politics-vs-literature-an-examination-of-gullivers-travels/.
  • “Lear, Tolstoy, and the Fool.” Polemic, 1947 (Reprinted in The Collected Essays, Journalism and Letters of George Orwell, Vol 4. Massachusetts: Nonpareil Books, 2002.)
  • Animal Farm. New York: Signet Classics, 1945/1956.
  • 1984. New York: Signet Classics, 1949/1950.
  • “Such, Such Were the Joys.” Posthumously published in Partisan Review, 1952.

b. Secondary Sources

  • Abramowitz, Alan I. and Steven W. Webster. (2018). “Negative Partisanship: Why Americans Dislike Parties but Behave Like Rabid Partisans.” Advances in Political Psychology 39 (1): 119-135.
  • Ackerman, Felicia Nimue. (2019). “The Twentieth Century’s Most Underrated Novel.” George Orwell: His Enduring Legacy. University of New Mexico Honors College / University of New Mexico Libraries: 46-52.
  • Ball, Derek. (2021). “An Invitation to Social and Political Metasemantics.” In The Routledge Handbook of Social and Political Philosophy of Language, Justin Khoo and Rachel Katharine Sterken (eds.). New York and Abingdon: Routledge.
  • Barry, Peter Brian. (2021). “Bertrand Russell and the Forgotten Fallacy in Nineteen Eighty-Four.” George Orwell Studies 6 (1): 121-129.
  • Barry, Peter Brian. (2022). “Orwell and Bertrand Russell.” In Oxford Handbook of George Orwell, Nathan Waddell (ed.). Oxford: Oxford University Press.
  • Beddoe, Deirdre. (1984). “Hindrances and Help-Meets: Women and the Writings of George Orwell.” In Inside the Myth, Orwell: Views from the Left, Christopher Norris (ed.). London: Lawrence and Wishart: 139-154.
  • Blackburn, Simon. (2021). “Politics, Truth, Post-Truth, and Post-Modernism.” In The Routledge Handbook of Political Epistemology, Michael Hannon and Jeroen de Ridder (eds.). Abingdon and New York: Routledge: 65-73.
  • Bowker, Gordon. (2003). George Orwell. Time Warner Books UK.
  • Brennan, Michael G. (2017). George Orwell and Religion. London: Bloomsbury Academic.
  • Campbell, Beatrix. (1984). “Orwell – Paterfamilias or Big Brother?” In Inside the Myth, Orwell: Views from the Left, Christopher Norris (ed.). London: Lawrence and Wishart: 126-138.
  • Conant, James. (2000). “Freedom, Cruelty, and Truth: Rorty versus Orwell.” In Rorty and His Critics, Robert Brandom (ed.). Oxford: Blackwell: 268-342.
  • Crick, Bernard. (1987). George Orwell: A Life. Sutherland House.
  • D’Ambrosio, Justin. (unpublished manuscript). “A Theory of Manipulative Speech.”
  • Dwan, David. (2010). “Truth and Freedom in Orwell’s Nineteen Eighty-Four.” Philosophy and Literature 34 (2): 381-393.
  • Dwan, David. (2012). “Orwell’s Paradox: Equality in ‘Animal Farm’.” ELH 79 (3): 655-683.
  • Dwan, David. (2018). Liberty, Equality & Humbug: Orwell’s Political Ideals. Oxford: Oxford University Press.
  • Ingle, Stephen. (2008). The Social and Political Thought of George Orwell. Routledge.
  • Khoo, Justin and Rachel Katharine Sterken (eds.). (2021). The Routledge Handbook of Social and Political Philosophy of Language. New York and Abingdon: Routledge.
  • Nussbaum, Martha. (2005). “The Death of Pity: Orwell and American Political Life.” In On Nineteen Eighty-Four: Orwell and Our Future, Abbott Gleason, Jack Goldsmith, and Martha Nussbaum (eds.). Princeton NJ: Princeton University Press: 279-299.
  • Patai, Daphne. (1984). The Orwell Mystique: A Study in Male Ideology. Amherst: University of Massachusetts Press.
  • Quintana, Oriol. (2018). “What Makes Help Helpful? Some Thoughts on Ethics of Solidarity Through George Orwell’s Writings.” Ramon Llull Journal of Applied Ethics 9: 137-153.
  • Quintana, Oriol. (2020). “The Politics of Rootedness: On Simone Weil and George Orwell.” In Simone Weil, Beyond Ideology? Sophie Bourgault and Julie Daigle (eds.). Switzerland: Palgrave Macmillan: 103-121.
  • Rodden, John (ed.). (2007). Cambridge Companion to George Orwell. Cambridge: Cambridge University Press.
  • Satta, Mark. (2021a). “George Orwell on the Relationship Between Food and Thought.” George Orwell Studies 5 (2): 76-89.
  • Satta, Mark. (2021b). “Orwell’s ideas remain relevant 75 years after ‘Animal Farm’ was published.” The Conversation US, https://theconversation.com/orwells-ideas-remain-relevant-75-years-after-animal-farm-was-published-165431.
  • Satta, Mark. (2021c). “George Orwell’s Philosophical Views.” 1000-Word Philosophy, https://1000wordphilosophy.com/2021/12/17/george-orwell/.
  • Scrivener, Michael and Louis Finkelman. (1994). “The Politics of Obscurity: The Plain Style and Its Detractors.” Philosophy and Literature 18 (1): 18-37.
  • Shelden, Michael. (1991). George Orwell: The Authorized Biography. HarperCollins.
  • Shklar, Judith N. (1985). “Nineteen Eighty-Four: Should Political Theory Care?” Political Theory 13 (1): 5-18.
  • Taylor, D.J. (2003). Orwell: The Life. New York: Vintage Books.
  • Tyrrell, Martin. (1996). “Orwell and Philosophy.” Philosophy Now 16.
  • Waddell, Nathan (ed.). (2020). The Cambridge Companion to Nineteen Eighty-Four. Cambridge: Cambridge University Press.
  • Wadhams, Stephen. (2017). The Orwell Tapes. Vancouver, Canada: Locarno Press.
  • Williams, Raymond. (1971). Orwell. London: Fontana Paperbacks.
  • Woloch, Alex. (2016). Or Orwell: Writings and Democratic Socialism. Harvard University Press.

 

Author Information

Mark Satta
Email: mark.satta@wayne.edu
Wayne State University
U. S. A.

Epistemic Value

Epistemic value is a kind of value which attaches to cognitive successes such as true beliefs, justified beliefs, knowledge, and understanding. These kinds of cognitive success do often have practical value: true beliefs about local geography help us get to work on time; knowledge of mechanics allows us to build vehicles; understanding of general annual weather patterns helps us to plant our fields at the right time of year to ensure a good harvest. By contrast, false beliefs can and do lead us astray both in trivial and in colossally important ways.

It is fairly uncontroversial that we tend to care about having various cognitive or epistemic goods, at least for their practical value, and perhaps also for their own sakes as cognitive successes. But this uncontroversial point raises a number of important questions. For example, it is natural to wonder whether there really are all these different kinds of things (true beliefs, knowledge, and so on) which have distinct value from an epistemic point of view, or whether the value of some of them is reducible to, or depends on, the value of others.

It is also natural to think that knowledge is more valuable than mere true belief, but it has proven to be a challenge explaining where the extra value of knowledge comes from. Similarly, it is natural to think that understanding is more valuable than any other epistemic state which falls short of understanding, such as true belief or knowledge. But there is disagreement about what makes understanding the highest epistemic value, or what makes it distinctly valuable, or even whether it is distinctly valuable.

Indeed, it is no easy task saying just what makes something an epistemic value in the first place. Perhaps epistemic values just exist on their own, independent of other kinds of value? Or perhaps cognitive goods are valuable because we care about having them for their own sakes? Or perhaps they are valuable because they help us to achieve other things which we care about for their own sakes?

Furthermore, if we accept that there are things that are epistemically valuable, then we might be tempted to accept a kind of instrumental (or consequentialist, or teleological) conception of epistemic rationality or justification, according to which a belief is epistemically rational just in case it appropriately promotes the achievement of an epistemic goal, or it complies with rules which tend to produce overall epistemically valuable belief-systems. If this idea is correct, then we need to know which epistemic values to include in the formulation of the epistemic goal, where the epistemic goal is an epistemically valuable goal in light of which we evaluate beliefs as epistemically rational or irrational.

Table of Contents

  1. Claims about Value
    1. Instrumental and Final Value
    2. Subjective and Objective Value
    3. Pro Tanto and All-Things-Considered Value
  2. The Value Problem
    1. The Primary Value Problem
      1. Knowledge as Mere True Belief
      2. Stability
      3. Virtues
      4. Reliabilism
      5. Contingent Features of Knowledge
      6. Derivative Non-Instrumental Value
    2. The Secondary Value Problem
      1. No Extra Value
      2. Virtues
      3. Knowledge and Factive Mental States
      4. Internalism and the Basing Relation
    3. The Tertiary Value Problem
  3. Truth and other Epistemic Values
    1. Truth Gets Us What We Want
    2. What People Ought to Care About
    3. Proper Functions
    4. Assuming an Epistemic Value/Critical Domains of Evaluation
    5. Anti-Realism
    6. Why the Focus on Truth?
  4. Understanding
    1. Understanding: Propositions and Domains; Subjective and Objective
    2. The Location of the Special Value of Understanding
    3. The Value of Understanding
    4. Alternatives to the Natural Picture of the Value of Understanding
  5. Instrumentalism and Epistemic Goals
    1. The Epistemic Goal as a Subset of the Epistemic Values
    2. Common Formulations of the Epistemic Goal
    3. Differences between the Goals: Interest and Importance
    4. Differences between the Goals: Synchronic and Diachronic Formulations
  6. Conclusion
  7. References and Further Reading

1. Claims about Value

Philosophers working on questions of value typically draw a number of distinctions which are good to keep in mind when we’re thinking about particular kinds of value claims. We’ll look at three particularly useful distinctions here before getting into the debates about epistemic value.

a. Instrumental and Final Value

The first important distinction to keep in mind is between instrumental and final value. An object (or state, property, and so forth) is instrumentally valuable if and only if it brings about something else that is valuable. An object is finally valuable if and only if it’s valuable for its own sake.

For example, it’s valuable to have a hidden pile of cash in your mattress: when you have a pile of cash readily accessible, you have the means to acquire things which are valuable, such as clothing, food, and so on. And, depending on the kind of person you are, it might give you peace of mind to sleep on a pile of cash. But piles of cash are not valuable for their own sake—money is obviously only good for what it can get you. So money is only instrumentally valuable.

By contrast, being healthy is something we typically think of as finally valuable. Although being healthy is instrumentally good because it enables us to do other valuable things, we also care about being healthy just because it’s good to be healthy, whether or not our state of health allows us to achieve other goods.

The existence of instrumental value depends on and derives from the existence of final value. But it’s possible for final value to exist without any instrumental value. There are possible worlds where there simply are no causal relations at all, for example. In some worlds like that, there could exist some final value (for instance, there could be sentient beings who feel great pleasure), but nothing would ever count as a means for bringing about anything else, and there would be no instrumental value. In the actual world, though, it’s pretty clear that there is both instrumental and final value.

b. Subjective and Objective Value

The second distinction is between subjective and objective value. Subjective value is a matter of the satisfaction of people’s desires (or the fulfillment of their plans, intentions, and so forth). Objective value is a kind of value which doesn’t depend on what people desire, care about, plan to do, and so forth. (To say that an object or event O is subjectively valuable for a subject S is not to say anything about why S thinks that O is valuable; O can be subjectively valuable in virtue of S’s desiring to bring O about, even if the reason S desires to bring O about is precisely because S thinks that O is objectively valuable. In a case like that, if O really is objectively valuable, then it is both objectively and subjectively valuable; but if S is mistaken, and O is not objectively valuable, then O is only subjectively valuable.)

Some philosophers think that there is really only subjective value (and correspondingly, subjective reasons, obligations, and so on); others think that there is only objective value, and that there is value in achieving one’s actual desires only when the desires are themselves objectively good. Still other philosophers allow both kinds of value. Many of the views which we’ll see below can be articulated in terms of either subjective or objective value, and when a view is committed to allowing only one type of value, the context will generally make it clear whether it’s subjective or objective. So, to keep things simple, claims about value will not be qualified as subjective or objective in what follows.

c. Pro Tanto and All-Things-Considered Value

Suppose that God declares that it is maximally valuable, always and everywhere, to feed the hungry. Assuming that God is omniscient and doesn’t lie, it necessarily follows that it’s true that it’s maximally valuable, always and everywhere, to feed the hungry. So there’s nothing that could ever outweigh the value of feeding the hungry. This would be an indefeasible kind of value: it is a kind of value that cannot be defeated by any contrary values or considerations.

Most value, however, is defeasible: it can be defeated, either by being overridden by contrary value-considerations, or else by being undermined. For an example of undermining: it’s instrumentally valuable to have a policy of getting an annual physical exam done, because that’s the kind of thing that normally helps catch medical issues before they become serious. But suppose that Sylvia becomes invulnerable to medically diagnosable illnesses. In this case, nothing medically valuable comes about as a result of Sylvia’s policy of getting her physical done. The instrumental medical value which that policy would have enjoyed is undermined by the fact that annual physicals no longer contribute to keeping Sylvia in good health.

By contrast, imagine that Roger goes to the emergency room for a dislocated shoulder. The doctors fix his shoulder, but while sitting in the waiting room, Roger inhales droplets from another patient’s sneeze, and he contracts meningitis as a result, which ends up causing him brain damage. In this case, there is some medical value which resulted from Roger’s visit to the emergency room: his shoulder was fixed. But because brain damage is more disvaluable than a fixed shoulder is valuable, the value of having a fixed shoulder is outweighed, or overridden, by the disvalue of having brain damage. So all things considered, Roger’s visit to the emergency room is disvaluable. But at least there is still something positive to be said for it.

In cases where some value V1 of an object O (or action, event, and so forth) is overridden by some contrary value V2, but where V1 still at least counts in favour of O’s being valuable, we can say that V1 is a pro tanto kind of value (that is, value “so far as it goes” or “to that extent”). So the value of Roger’s fixed shoulder is pro tanto: it counts in favour of the value of his visit to the emergency room, even though it is outweighed by the disvalue of his resulting brain damage. The disvalue of getting brain damage is also pro tanto: there can be contrary values which would outweigh it, though in Roger’s case, the disvalue of the brain damage is the stronger of the competing value-considerations. So we can say that, all things considered, Roger’s visit to the emergency room is disvaluable.

2. The Value Problem

a. The Primary Value Problem

Knowledge and true belief both tend to be things we want to have, but all else being equal, we tend to prefer to have knowledge over mere true belief. The “Primary Value Problem” is the problem of explaining why that should be the case. Many epistemologists think that we should take it as a criterion of adequacy for theories of knowledge that they be able to explain the fact that we prefer knowledge to mere true belief, or at least that they be consistent with a good explanation of why that should be the case.

To illustrate: suppose that Steve believes that the Yankees are a good baseball team, because he thinks that their pinstriped uniforms are so sharp-looking. Steve’s belief is true – the Yankees always field a good team – but he holds his belief for such a terrible reason that we are very reluctant to think of it as an item of knowledge.

Cases like that one motivate the view that knowledge consists of more than just true belief. In order to count as knowledge, a belief has to be well justified in some suitable sense, and it should also meet a suitable Gettier-avoidance condition (see the article on Gettier Problems). But not only do beliefs like Steve’s motivate the view that knowledge consists of more than mere true belief: they also motivate the view that knowledge is better to have than true belief. For suppose that Yolanda knows the Yankees’ stats, and on that basis she believes that the Yankees are a good team. It seems that Yolanda’s belief counts as an item of knowledge. And if we compare Steve and Yolanda, it seems that Yolanda is doing better than Steve; we’d prefer to be in Yolanda’s epistemic position rather than in Steve’s. This seems to indicate that we prefer knowledge over mere true belief.

The challenge of the Primary Value Problem is to explain why that should be the case. Why should we care about whether we have knowledge instead of mere true belief? After all, as is often pointed out, true beliefs seem to bring us the very same practical benefits as knowledge. (Steve would do just as well as Yolanda betting on the Yankees, for example.) Socrates makes this point in the Meno, arguing that if someone wants to get to Larisa, and he has a true belief but not knowledge about which road to take, then he will get to Larisa just as surely as if he had knowledge of which road to take. In response to Socrates’s argument, Meno is moved to wonder why anyone should care about having knowledge instead of mere true belief. (Hence, the Primary Value Problem is sometimes called the Meno Problem.)

So in short, the problem is that mere true beliefs seem to be just as likely as knowledge to guide us well in our actions. But we still seem to have the persistent intuition that there is something better about any given item of knowledge than the corresponding item of mere true belief. The challenge is to explain this intuition. Strategies for addressing this problem can either try to show that knowledge really is always more valuable than corresponding items of mere true belief, or else they can allow that knowledge is sometimes (or even always) no more valuable than mere true belief. If we adopt the latter kind of response to the problem, it is incumbent on us to explain why we should have the intuition that knowledge is more valuable than mere true belief, in cases where it turns out that knowledge isn’t in fact more valuable. Following Pritchard (2008; 2009), we can call strategies of the first kind vindicating, and we can call strategies of the second kind revisionary.

There isn’t a received view among epistemologists about how we ought to respond to the Primary Value Problem. What follows is an explanation of some important proposals from the literature, and a discussion of their challenges and prospects.

i. Knowledge as Mere True Belief

A very straightforward way to respond to the problem is to deny one of the intuitions on which the problem depends, the intuition that knowledge is distinct from true belief. Meno toys with this idea in the Meno, though Socrates disabuses him of the idea. (Somewhat more recently, Sartwell (1991; 1992) has defended this approach to knowledge.) If knowledge is identical with true belief, then we can simply reject the value problem as resting on a mistaken view of knowledge. If knowledge is true belief, then there’s no discrepancy in value to explain.

The view that knowledge is just true belief is almost universally rejected, however. Cases where subjects have true beliefs but lack knowledge are so easy to construct and so intuitively obvious that identifying knowledge with true belief represents an extreme departure from what most epistemologists and laypeople think of knowledge. Consider once again Steve’s belief that the Yankees are a good baseball team, which he holds because he thinks their pinstriped uniforms are so sharp. It seems like an abuse of language to call Steve’s belief an item of knowledge. At the very least, we should be hesitant to accept such an extreme view until we’ve exhausted all other theoretical options.

It could still be the case that knowledge is no more valuable than mere true belief, even though knowledge is not identical with true belief. But, as we’ve seen, there is a widespread and resilient intuition that knowledge is more valuable than mere true belief (recall, for instance, that we tend to think that Yolanda’s epistemic state is better than Steve’s). If knowledge were identical with true belief, then we would have to take that intuition to be mistaken; but, since we can see that knowledge is more than mere true belief, we can continue looking for an acceptable account which would explain why knowledge is more valuable than mere true belief.

ii. Stability

Most attempts to explain why knowledge is more valuable than mere true belief proceed by identifying some condition which must be added to true belief in order to yield knowledge, and then explaining why that further condition is valuable. Socrates’s own view, at least as presented in the Meno, is that knowledge is true opinion plus an account of why the opinion is true (where the account of why it is true is itself already present in the soul, and it must only be recalled from memory). So, Socrates proposes, a known true belief will be more stable than a mere true belief, because having an account of why a belief is true helps to keep us from losing it. If you don’t have an account of why a proposition is true, you might easily forget it, or abandon your belief in it when you come across some reason for doubting it. But if you do have an account of why a proposition is true, you likely have a greater chance of remembering it, and if you come across some reason for doubting it, you’ll have a reason available to you for continuing to believe it.

A worry for this solution is that it seems to be entirely possible for a subject S to have some entirely unsupported beliefs, which do not count as knowledge, but where S clings to these beliefs dogmatically, even in the face of good counterevidence. S’s belief in a case like this can be just as stable as many items of knowledge – indeed, dogmatically held beliefs can even be more stable than knowledge. For if you know that p, then presumably your belief is a response to some sort of good reason for believing that p. But if your belief is a response to good reasons, then you’d likely be inclined to revise your belief that p, if you were to come across some good evidence for thinking that p is false, or for thinking that you didn’t have any good reason for believing that p in the first place. On the other hand, if p is something you cling to dogmatically (contrary evidence be damned), then you’ll likely retain p even when you get good reason for doubting it. So, even though having stable true beliefs is no doubt a good thing, knowledge isn’t always more stable than mere true belief, and an appeal to stability does not seem to give us an adequate explanation of the extra value of knowledge over mere true belief.

One way to defend the stability response to the value problem is to hold that knowledge is more stable than mere true beliefs, but only for people whose cognitive faculties are in good working order, and to deny that the cognitive faculties of people who cling dogmatically to evidentially unsupported beliefs are in good working order (Williamson 2000). This solution invites the objection, however, that our cognitive faculties are not all geared to the production of true beliefs. Some cognitive faculties are geared towards ensuring our survival, and the outputs of these latter faculties might be held very firmly even if they are not well supported by evidence. For example, there could be subjects with cognitive mechanisms which take as input sudden sounds and generate as output the belief that there’s a predator nearby. Mechanisms like these might very well generate a strong conviction that there’s a predator nearby. Such mechanisms would likely yield many more false positive predator-identifications than they would yield correct identifications, but their poor true-to-false output-ratio doesn’t prevent mechanisms of this kind from having a very high survival value, as long as they do correctly identify predators when they are present. So it’s not really clear that knowledge is more stable than mere true beliefs, even for mere true beliefs which have been produced by cognitive systems which are in good working order, because it’s possible for beliefs to be evidentially unsupported, and very stable, and produced by properly functioning cognitive faculties, all at the same time. (See Kvanvig 2003, ch. 1, for a critical discussion of Williamson’s appeal to stability.)
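
The point about a poor true-to-false output-ratio coexisting with high survival value can be made concrete with a small calculation. What follows is only an illustrative sketch: the function name detector_stats and all three numbers (how often a predator is actually present, how often the mechanism fires when one is present, and how often it fires when none is present) are assumptions invented for the example, not figures from the literature.

```python
# Illustrative sketch only: every number below is a made-up assumption.

def detector_stats(p_predator, hit_rate, false_alarm_rate):
    """Return (share of alarms that are correct, chance a present predator is detected).

    p_predator       -- assumed probability that a sudden sound comes from a predator
    hit_rate         -- assumed probability the mechanism fires when a predator is present
    false_alarm_rate -- assumed probability the mechanism fires when no predator is present
    """
    true_alarms = p_predator * hit_rate
    false_alarms = (1 - p_predator) * false_alarm_rate
    share_correct = true_alarms / (true_alarms + false_alarms)
    return share_correct, hit_rate


share_correct, detection = detector_stats(
    p_predator=0.01, hit_rate=0.99, false_alarm_rate=0.20
)

# Most "predator nearby" beliefs are false, yet nearly every real predator is noticed.
print(f"share of predator-beliefs that are true: {share_correct:.3f}")
print(f"chance a real predator is detected:      {detection:.2f}")
```

On these assumed numbers, fewer than one in ten of the mechanism’s predator-beliefs is true, even though it almost never misses a predator that is really there. That is the structure the objection relies on: a faculty can be working exactly as it should while producing firmly held beliefs that are mostly false.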

iii. Virtues

Virtue epistemologists are, roughly, those who think that knowledge is true belief which is the product of intellectual virtues. (See the article on Virtue Epistemology.) Virtue Epistemology seems to provide a plausible solution to the Primary (and, as we’ll see, to the Secondary) Value Problem.

According to a prominent strand of virtue epistemology, knowledge is true belief for which we give the subject credit (Greco 2003), or true belief which is a cognitive success because of the subject’s exercise of her relevant cognitive ability (Greco 2008; Sosa 2007). For example (to adapt Sosa’s analogy): an archer, in firing at a target, might shoot well or poorly. If she shoots poorly but hits the target anyway (say, she takes aim very poorly but sneezes at the moment of firing, and luckily happens to hit the target), her shot doesn’t display skill, and her hitting the target doesn’t reflect well on her. If she shoots well, on the other hand, then she might hit the target or miss the target. If she shoots well and misses the target, we will still credit her with having made a good shot, because her shot manifests skill. If she shoots well and hits the target, then we will credit her success to her having made a good shot – unless there were intervening factors which made it the case that the shot hit the mark just as a matter of luck. For example: if a trickster moves the target while the arrow is in mid-flight, but a sudden gust of wind moves the arrow to the target’s new location, then in spite of the fact that the archer makes a good shot, and she hits the target, she doesn’t hit the target because she made a good shot. She was just lucky, even though she was skillful. But when strange factors don’t intervene, and the archer hits the target because she made a good shot, we give her credit for having hit the target, since we think that performances which succeed because they are competent are the best kind of performances. 

And, similarly, when it comes to belief-formation, we give people credit for getting things right as a result of the exercise of their intellectual virtues: we think it’s an achievement to get things right as the result of one’s cognitive competence, and so we tend to think that there’s a sense in which people who get things right because of their intellectual competence deserve credit for getting things right.

According to another strand of virtue epistemology (Zagzebski 2003), we don’t think of knowledge as true belief which meets some further condition. Rather, we should think of knowledge as a state which a subject can be in, which involves having the propositional attitude of belief, but which also includes the motivations for which the subject has the belief. Virtuous motivations might include things like diligence, integrity, and a love of truth. And, just as we think that, in ethics, virtuous motives make actions better (saving a drowning child because you don’t want children to suffer and die is better than saving a drowning child because you don’t want to have to give testimony to the police, for example), we should also think that the state of believing because of a virtuous motive is better than believing for some other reason.

Some concerns have been raised for both strands of virtue epistemology, however. Briefly, a worry for the Sosa/Greco type of virtue epistemology is that (as we’ll see in section 3) knowledge might not after all in general be an achievement – it might be something we can come by in a relatively easy or even lazy fashion. A worry for Zagzebski’s type of virtue epistemology is that there seem to be possible cases where subjects can acquire knowledge even though they lack virtuous intellectual motives. Indeed, it seems possible to acquire knowledge even if one has only the darkest of motives: if a torturer is motivated by the desire to waterboard people until they go insane, for example, he can thereby gain knowledge of how long it takes to break a person by waterboarding.

iv. Reliabilism

The Primary Value Problem is sometimes thought to be especially bad for reliabilists about knowledge. Reliabilism in its simplest form is the view that beliefs are justified if and only if they’re produced by reliable processes, and they count as knowledge if and only if they’re true, and produced by reliable processes, and they’re not Gettiered. (See, for example, Goldman and Olsson (2009, p. 22), as well as the article on Reliabilism.) The apparent trouble for reliabilism is that reliability only seems to be valuable as a means to truth – so, in any given case where we have a true belief, it’s not clear that the reliability of the process which produced the belief is able to add anything to the value that the belief already has in virtue of being true. The value which true beliefs have in virtue of being true completely “swamps” the value of the reliability of their source, if reliability is only valuable as a means to truth. (Hence the Primary Value Problem for reliabilism has often been called the “swamping problem.”)

To illustrate (Zagzebski 2003): the value of a cup of coffee seems to be a matter of how good the coffee tastes. And we value reliable coffeemakers because we value good cups of coffee. But when it comes to the value of any particular cup of coffee, its value is just a matter of how good it tastes; whether the coffee was produced by a reliable coffeemaker doesn’t add to or detract from the value of the cup of coffee. Similarly, we value true beliefs, and we value reliable belief-forming processes because we care about getting true beliefs. So we have reason to prefer reliable processes over unreliable ones. But whether a particular belief was reliably or unreliably produced doesn’t seem to add to or detract from the value of the belief itself.

Responses have been offered on behalf of reliabilism. Brogaard (2006) points out that critics of reliabilism seem to have been presupposing a Moorean conception of value, according to which the value of an object (or state, condition, and so forth) is entirely a function of the internal properties of the object. (The value of the cup of coffee is determined entirely by its internal properties, not by the reliability of its production, or by the fineness of a particular morning when you enjoy your coffee.) But this seems to be a mistaken view about value in general. External features can add value to objects. We value a genuine Picasso painting more than a flawless counterfeit, for example. If that’s correct, then extra value can be conferred on an object, if it has a valuable source, and perhaps the value of reliable processes can transfer to the beliefs which they produce. Goldman and Olsson (2009) offer two further responses on behalf of reliabilism. Their first response is that we can hold that true belief is always valuable, and that reliability is only valuable as a means to true belief, but that it is still more valuable to have knowledge (understood as reliabilists understand knowledge, that is, as reliably-produced and unGettiered true belief) than a mere true belief. For if S knows that p in circumstances C, then S has formed the belief that p through some reliable process in C. So S has some reliable process available to her, and it generated a belief in C. This makes it more likely that S will have a reliable process available to her in future similar circumstances, than it would be if S had an unreliably produced true belief in C. So, when we’re thinking about how valuable it is to be in circumstances C, it seems to be better for S to be in C if S has knowledge in C than if she has mere true belief in C, because having knowledge in C makes it likelier that she’ll get more true beliefs in future similar circumstances.
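
The comparison at the heart of Goldman and Olsson's first response can be stated compactly. The notation is introduced here for illustration and is not theirs: let K_C stand for S's knowing that p in C, let T_C stand for S's having a mere (unreliably produced) true belief that p in C, and let F stand for S's acquiring further true beliefs in future similar circumstances.

```latex
\[
  \Pr(F \mid K_C) \;>\; \Pr(F \mid T_C)
\]
```

Read this way, the extra value of knowledge in C is the value of the higher probability of F, so the response succeeds only in cases where the inequality actually holds.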

This response, Goldman and Olsson think, accounts for the extra value which knowledge has in many cases. But there will still be cases where S’s having knowledge in C doesn’t make it likelier that she’ll get more true beliefs in the future. For example, C might be a unique set of circumstances which is unlikely to come up again. Or S might be employing a reliable process which is available to her in C, but which is likely to become unavailable to her very soon. Or S might be on her deathbed. So this response isn’t a completely validating solution to the value problem, and it’s incumbent on Goldman and Olsson to explain why we should tend to think that knowledge is more valuable than mere true belief in those cases when it’s not.

So Goldman and Olsson offer a second response to the Primary Value Problem: when it comes to our intuitions about the value of knowledge, they argue, it’s plausible that these intuitions began long ago with the recognition that true belief is always valuable in some sense to have, and that knowledge is usually valuable because it involves both true belief and the probability of getting more true beliefs; and then, over time, we have come to simply think that knowledge is valuable, even in cases when having knowledge doesn’t make it more probable that the subject will get more true beliefs in the future. (For some critical discussions and defenses of Goldman and Olsson’s treatment of the value problem, see Horvath (2009); Kvanvig (2010); and Olsson (2009; 2011).)

v. Contingent Features of Knowledge

An approach similar to Goldman and Olsson’s is to consider the values of contingent features of knowledge, rather than the value of its necessary and/or sufficient conditions. Although we might think that the natural way to account for the value of some state or condition S1, which is composed of other states or conditions S2-Sn, is in terms of the values of S2-Sn, perhaps S1 can be valuable in virtue of some other conditions which typically (but not always) accompany S1, or in terms of some valuable result which S1 is typically (but not always) able to get us. For example: it’s normal to think that air travel is valuable, because it typically enables people to cover great distances safely and quickly. Sometimes airplanes are diverted, and slow travellers down, and sometimes airplanes crash. But even so, we might continue to think, air travel is typically a valuable thing, because in ordinary cases, it gets us something good.

Similarly, we might think that knowledge is valuable because we need to rely on the information which people give us in order to accomplish just about anything in this life, and being able to identify people as having knowledge means being able to rely on them as informants. And we also might think that there’s value in being able to track whether our own beliefs are held on the basis of good reasons, and we typically have good reasons available to us for believing p when we know that p. We are not always in a position to identify when other people have knowledge, and if externalists about knowledge are right, then we don’t always have good reasons available to us when we have knowledge ourselves. Nevertheless, we can typically identify people as knowers, and we can typically identify good reasons for the things we know. These things are valuable, so they make typical cases of knowledge valuable, too. (See Craig (1990) for an account of the value of knowledge in terms of the characteristic function of knowledge-attribution. Jones (1997) further develops the view.)

Like Goldman and Olsson’s responses, this strategy for responding to the value problem doesn’t give us an account of why knowledge is always more valuable than mere true belief. For those who think that knowledge is always preferable to mere true belief, and who therefore seek a validating solution to the Primary Value Problem, this strategy will not be satisfactory. But for those who are willing to accept a somewhat revisionist response, according to which knowledge is only usually or characteristically preferable to mere true belief, this strategy seems promising.

vi. Derivative Non-Instrumental Value

Sylvan (2018) proposes the following principle as a way to explain the extra value that justification adds to true belief:

(The Extended Hurka Principle) When V is a non-instrumental value from the point of view of domain D, fitting ways of valuing V in D and their manifestations have some derivative non-instrumental value in D.

For instance, in the aesthetic domain, beauty is fundamentally valuable; but it’s also derivatively good to value or respect beauty, and it’s bad to disvalue or disrespect beauty. In the moral domain, beneficence is good; and it’s derivatively good to value or respect beneficence, and it’s bad to value or respect maleficence. And in the epistemic domain, true belief is good; but it’s also derivatively good to value or respect truth (by having justified beliefs), and it’s bad to disvalue or disrespect truth (by having unjustified beliefs).

In these domains, the derivatively valuable properties are not valuable because they promote or generate more of what is fundamentally valuable; rather, they are valuable because it’s just a good thing to manifest respect for what is fundamentally valuable. Still, Sylvan argues that the value of justification in the epistemic domain depends on and derives from the epistemic value of truth, because if truth were not epistemically valuable, then neither would respecting the truth be epistemically valuable.

A possible worry for this approach is that although respecting a fundamentally valuable thing might be good, it’s not clear that it adds domain-relative value to the thing itself (Bondy 2022). For instance, an artist passionately manifesting her love of beauty as she creates a sculpture does not necessarily make the sculpture itself better. The same might go for belief: perhaps the fact that a believer manifests respect for the truth in holding a belief does not necessarily make the belief itself any better.

b. The Secondary Value Problem

Suppose you’ve applied for a new position in your company, but your boss tells you that your co-worker Jones is going to get the job. Frustrated, you glance over at Jones, and see that he has ten coins on his desk, and you then watch him put the coins in his pocket. So you form the belief that the person who will get the job has at least ten coins in his or her pocket (call this belief “B”). But it turns out that your boss was just toying with you; he just wanted to see how you would react to bad news. He’s going to give you the job. And it turns out that you also have at least ten coins in your pocket.

So, you have a justified true belief, B, which has been Gettiered. In cases like this, once you’ve found out that you were Gettiered, it’s natural to react with annoyance or intellectual embarrassment: even though you got things right (about the coins, though not about who would get the job), and even though you had good reason to think you had things right, you were just lucky in getting things right.

If this is correct – if we do tend to prefer to have knowledge over Gettiered justified true beliefs – then this suggests that there’s a second value problem to be addressed. We seem to prefer having knowledge over having any proper subset of the parts of knowledge. But why should that be the case? What value is added to justified true beliefs, when they meet a suitable anti-Gettier condition?

i. No Extra Value

An initial response is to deny that knowledge is more valuable than mere justified true belief. If we’ve got true beliefs, and good reasons for them, we might be Gettiered, if for some reason it turns out that we’re just lucky in having true beliefs. When we inquire into whether p, we want to get to the truth regarding p, and we want to do so in a rationally defensible way. If it turns out that we get to the truth in a rationally defensible way, but strange factors of the case undermine our claim to knowing the truth about p, perhaps it just doesn’t matter that we don’t have knowledge.

Few epistemologists have defended this view, however (though Kaplan (1985) is an exception). We do after all find it irritating when we find out that we’ve been Gettiered; and when we are considering corresponding cases of knowledge and of Gettiered justified true belief, we tend to think that the subject who has knowledge is better off than the subject who is Gettiered. We might be mistaken; there might be nothing better in knowledge than in mere justified true belief. But the presumption seems to be that knowledge is more valuable, and we should try to explain why that is so. Skepticism about the extra value of knowledge over mere justified true belief might be acceptable if we fail to find an adequate explanation, but we shouldn’t accept skepticism before searching for a good explanation.

ii. Virtues

We saw above that some virtue epistemologists think of knowledge in terms of the achievement of true beliefs as a result of the exercise of cognitive skills or virtues. And we do generally seem to value success that results from our efforts and skills (that is, we value success that’s been achieved rather than stumbled into; see, for example, Sosa (2003; 2007) and Pritchard (2009)). So, because we have a cognitive aim of getting to the truth, and we can achieve that aim either as a result of luck or as a result of our skillful cognitive performance, it seems that the value of achieving our aims as a result of a skillful performance can help explain why knowledge is more valuable than mere true belief.

That line of thought works just as well as a response to the Secondary Value Problem as to the Primary Value Problem. For in a Gettier case, the subject has a justified true belief, but it’s just as a result of luck that she arrived at a true belief rather than a false one. By contrast, when a subject arrives at a true belief because she has exercised a cognitive virtue, it’s plausible to think that it’s not just lucky that she’s arrived at a true belief; she gets credit for succeeding in the aim of getting to the truth as a result of her skillful performance. So cases of knowledge do, but Gettier cases do not, exemplify the value of succeeding in achieving our aims as a result of a skillful performance.

iii. Knowledge and Factive Mental States

“Knowledge-first epistemology” (beginning with Williamson 2000) is the approach to epistemology that does not attempt to analyze knowledge in terms of other more basic concepts; rather, it takes knowledge to be fundamental, and it analyzes other concepts in terms of knowledge. Knowledge-first epistemologists still want to say informative things about what knowledge is, but they don’t accept the traditional idea that knowledge can be analyzed in terms of informative necessary and sufficient conditions.

Williamson argues that knowledge is the most general factive mental state. At least some mental states have propositional contents (the belief that p has the content p; the desire that p has the content p; and so on). Factive mental states are mental states which you can only be in when their contents are true. Belief isn’t a factive mental state, because you can believe p even if p is false. By contrast, knowledge is a factive mental state, because you can only know that p if p is true. Other factive mental states include seeing that (for example, you can only see that the sun is up if the sun really is up) and remembering that. Knowledge is the most general factive mental state, for Williamson, because any time you are in a factive mental state with the content that p, you must know that p. If you see that it’s raining outside, then you know that it’s raining outside. Otherwise – say, if you have a mere true belief that it’s raining, or if your true belief that it’s raining is justified but Gettiered – you only seem to see that it’s raining outside.

If Williamson is right, and knowledge really is the most general factive mental state, then it is easy enough to explain the value of knowledge over mere justified true belief. We care, for one thing, about having true beliefs, and we dislike being duped. We would especially dislike it if we found out that we were victims of widespread deception. (Imagine your outrage and intellectual embarrassment, for example, if you were to discover that you were living in your own version of The Truman Show!) But not only that: we also care about being in the mental states we think we’re in (we care about really remembering what we think we remember, for example), and we would certainly dislike being duped about our own mental states, including when we take ourselves to be in factive mental states. So if having a justified true belief that p which is Gettiered prevents us from being in the factive mental states we think we’re in, but having knowledge enables us to be in these factive mental states, then it seems that we should care about having knowledge.

iv. Internalism and the Basing Relation

Finally, internalists about knowledge have an interesting response to offer to the Secondary Value Problem. Internalism about knowledge is the view that a necessary condition on S’s knowing that p is that S must have good reasons available for believing that p (where this is usually taken to mean that S must be able to become aware of those reasons, just by reflecting on what reasons she has). Internalists will normally hold that you have to have good reasons available to you, and you have to hold your belief on the basis of those reasons, in order to have knowledge.

Brogaard (2006) argues that the fact that beliefs must be held on the basis of good reasons gives the internalist her answer to the Secondary Value Problem. Roughly, the idea is that, if you hold the belief that p on the basis of a reason q, then you must believe (at least dispositionally) that in your current circumstances, q is a reliable indicator of p’s truth. So you have a first-order belief, p, and you have a reason for believing p, which is q, and you have a second-order belief, r, to the effect that q is a reliable indicator of p’s truth. And when your belief that p counts as knowledge, your reason q must in fact be a reliable indicator of p’s truth in your current circumstances – which means that your second-order belief r is true. So, assuming that the extra-belief requirement for basing beliefs on reasons is correct, it follows that when you have knowledge, you also have a correct picture of how things stand more broadly speaking.

When you are in a Gettier situation, by contrast, there is some feature of the situation which makes it the case that your belief that q is not a reliable indicator of the truth of p. That means that your second-order belief r is false. So, even though you’ve got a true first-order belief, you have an incorrect picture of how things stand more broadly speaking. Assuming that it’s better to have a correct picture of how things stand, including a correct picture of what reasons are reliable indicators of the truth of our beliefs, knowledge understood in an internalist sense is more valuable than Gettiered justified true belief.

c. The Tertiary Value Problem

Pritchard (2007; 2010) suggests that there’s a third value problem to address (cf. also Zagzebski 2003). We often think of knowledge as distinctively valuable – that it’s a valuable kind of thing to have, and that its value isn’t the same kind of value as (for example) the value of true belief. If that’s correct, then simply identifying a kind of value which true beliefs have, and showing that knowledge has that same kind of value but to a greater degree, does not yield a satisfactory solution to this value problem.

By analogy, think of two distinct kinds of value: moral and financial. Suppose that both murders and mediocre investments are typically financially disvaluable, and suppose that murders are typically more financially disvaluable than mediocre investments. Even if we understand the greater financial disvalue of murders over the financial disvalue of mediocre investments, if we do not also understand that murders are disvaluable in a distinctively moral sense, then we will fail to grasp something fundamental about the disvalue of murder.

If knowledge is valuable in a way that is distinct from the way that true beliefs are valuable, then the kind of solution to the Primary Value Problem offered by Goldman and Olsson which we saw above isn’t satisfactory, because the extra value they identify is just the extra value of having more true beliefs. By contrast, as Pritchard suggests, if knowledge represents a cognitive achievement, in the way that virtue theorists often suggest, then because we do seem to think of achievements as being valuable just insofar as they are achievements (we value the overcoming of obstacles, and we value success which is attributable to a subject’s exercise of her skills or abilities), it follows that thinking of knowledge as an achievement provides a way to solve the Tertiary Value Problem. (Though, as we’ll see in section 3, Pritchard doesn’t think that knowledge in general represents an achievement.)

However, it’s not entirely clear that the Tertiary Value Problem is a real problem which needs to be addressed. (Haddock (2010) explicitly denies it, and Carter, Jarvis, and Rubin (2013) also register a certain skepticism before going on to argue that if there is a Tertiary Value Problem, it’s easy to solve.) Certainly most epistemologists who have attempted to solve the value problem have not worried about whether the extra value they were identifying in knowledge was different in kind from the value of mere true belief, or of mere justified true belief. Perhaps it is fair to say that it would be an interesting result if knowledge turned out to have a distinctive kind of value; maybe that would even be a mark in favour of an epistemological theory which had that result. But the consensus seems to be that, if we can identify extra value in knowledge, then that is enough to solve the value problem, even if the extra value is just a greater degree of the same kind of value which we find in the proper parts of knowledge such as true belief.

3. Truth and other Epistemic Values

We have been considering ways to try to explain why knowledge is more valuable than its proper parts. More generally, though, we might wonder what sorts of things are epistemically valuable, and just what makes something an epistemic value in the first place.

A natural way to proceed is simply to identify some state which epistemologists have traditionally been interested in, or which seems like it could or should be important for a flourishing cognitive life – such as the states of having knowledge, true belief, justification, wisdom, empirically adequate theories, and so on – and try to give some reason for thinking that it’s valuable to be in such a state.

Epistemologists who work on epistemic value usually want to explain either why true beliefs are valuable, or why knowledge is valuable, or both. Some also seek to explain the value of other states, such as understanding, and some seek to show that true beliefs and knowledge are not always as valuable as we might think.

Sustained arguments for the value of knowledge are easy to come by; the foregoing discussion of the Value Problem was a short survey of such arguments. Sustained arguments for the value of true belief, on the other hand, are not quite so plentiful. But it is especially important that we be able to show that true belief is valuable, if we are going to allow true belief to play a central role in epistemological theories. It is, after all, very easy to come up with apparently trivial true propositions, which no one is or ever will be interested in. Truths about how many grains of sand there are on some random beach, for example, seem to be entirely uninteresting. Piller suggests that “the string of letters we get, when we combine the third letters of the first ten passenger’s family names who fly on FR2462 to Bydgoszcz no more than seventeen weeks after their birthday with untied shoe laces” is an uninteresting truth, which no one would care about (2009, p. 415). (Though see Treanor (2014) for an objection to arguments that proceed by comparing what appear to be more and less interesting truths.) What is perhaps even worse, it is easy to construct cases where having a true belief is positively disvaluable. For example, if someone tells you how a movie will end before you see it, you will probably not enjoy the movie very much when you do get around to seeing it (Kelly 2003). Now, maybe these apparently trivial or disvaluable truths are after all at least a little bit valuable, in an epistemic sense – but on the face of them, these truths don’t seem valuable, so the claim that they are valuable needs to be argued for. We’ll see some such arguments shortly.

Keep in mind that although epistemologists often talk about the value of having true beliefs, this is usually taken to be short for the value of having true beliefs and avoiding false beliefs (though see Pritchard 2014 and Hutchinson 2021, who think that truth itself is what is valuable). These two aspects of what is usually referred to as a truth-goal are clearly related, but they are distinct, and sometimes they can pull in opposite directions. An extreme desire to avoid false beliefs can lead us to adopt some form of skepticism, for example, where we abandon all or nearly all of our beliefs, if we’re not careful. But in giving up all of our beliefs, we do not only avoid having false beliefs; we also lose all of the true beliefs we would have had. When the goals of truth-achievement and error-avoidance pull in opposite directions, we need to weigh the importance of getting true beliefs against the importance of avoiding false ones, and decide how much epistemic risk we’re willing to take on in our body of beliefs (cf. James 1949, Riggs 2003).
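
One simple way to picture this weighing is offered here only as an illustration, not as James's or Riggs's own formulation. Suppose a subject has credence c in p, assigns a positive value t to believing p when p is true, assigns a positive disvalue f to believing p when p is false, and counts suspending judgment as worth 0. The symbols c, t, and f are assumptions introduced for this sketch. Believing p then has a higher expected value than suspending judgment just in case:

```latex
\[
  c\,t - (1 - c)\,f > 0
  \quad\Longleftrightarrow\quad
  c > \frac{f}{t + f}
\]
```

The more weight the subject places on avoiding error relative to gaining truth (the larger f is compared with t), the higher the threshold; as f grows without bound the threshold approaches 1, which corresponds to the skeptical extreme of believing almost nothing.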

Still, because the twin goals of achieving true beliefs and avoiding errors are so closely related, and because they are so often counted as a single truth-goal, we can continue to refer to them collectively as a truth-goal. We just need to be careful to keep the twin aspects of the goal in mind.

a. Truth Gets Us What We Want

One argument for thinking that true beliefs are valuable is that without true beliefs, we cannot succeed in any of our projects. Since even the most unambitious of us care about succeeding in a great many things (even making breakfast is a kind of success, which requires a great many true beliefs), we should all think that it’s important to have true beliefs, at least when it comes to subjects that we care about.

An objection to this argument for the value of true beliefs is that, as we’ve already seen, there are many true propositions which seem not to be worth caring about, and some which can be positively harmful. So although true beliefs are good when they can get us things we want, that is not always the case. So this argument doesn’t establish that we should always care about the truth.

A response to this worry is that we will all be faced with new situations in the future, and we will need to have a broad range of true beliefs, and as few false beliefs mixed in with the true ones as we can, in order to have a greater chance of succeeding when such situations come up (Foley 1993, ch.1). So it’s a good idea to try to get as many true beliefs as we can. This line of argument gives us a reason to think that it’s always at least pro tanto valuable to have true beliefs (that is, there’s always something positive to be said for true beliefs, even if that pro tanto value can sometimes be overridden by other considerations).

This is a naturalistically acceptable kind of value for true beliefs to enjoy. Although it doesn’t ground the value of true beliefs in the fact that people always desire to have true beliefs, it does ground their value in their instrumental usefulness for getting us other things which we do in fact desire. The main drawback for this approach, however, is that when someone positively desires not to have a given true belief – say, because it will cause him pain, or prevent him from having an enjoyable experience at the movies – it doesn’t seem like his desires can make it at all valuable for him to have the true belief in question. The idea here was to try to ground the value of truths in their instrumental usefulness, in the way that they are good for getting us what we want. But if there are true beliefs which we know will not be useful in that way (indeed, if there are true beliefs which we know will be harmful to us), then those beliefs don’t seem to have anything to be said in favour of them – which is to say that they aren’t even pro tanto valuable.

Whether we think that this is a serious problem will depend on whether we think that the claim that true beliefs are valuable entails that true beliefs must always have at least pro tanto value. Sometimes epistemologists (for example White 2007) explicitly claim that true beliefs are not always valuable in any real sense, since we just don’t always care about having them; but, just as money is valuable even though it isn’t something that we always care about having, so too, true beliefs are still valuable, in a hypothetical sense: when we do want to have true beliefs, or when true beliefs are necessary for getting us what we want, they are valuable. So we can always say that they have value; it’s just that the kind of value in question is only hypothetical in nature. (One might worry, however, that “hypothetical” seems to be only a fancy way to say “not real.”)

b. What People Ought to Care About

A similar way to motivate the claim that true beliefs are valuable is to say that there are some things that we morally ought to care about, and we need to have true beliefs in order to achieve those things (Zagzebski 2003; 2009). For example, I ought to care about whether my choices as a consumer contribute to painful and degrading living and working conditions for people who produce what I am consuming. (I do care about that, but even if I did not, surely, I ought to care about it.) But in order to buy responsibly, and avoid supporting corporations that abuse their workers, I need to have true beliefs about the practices of various corporations.

So, since there are things we should care about, and since we need true beliefs to successfully deal with things which we should care about, it follows that we should care about having true beliefs.

This line of argument is unavailable to anyone who wants to avoid positing the existence of objective values which exist independently of what people actually desire or care about, and it doesn’t generate any value for true beliefs which aren’t relevant to things we ought to care about. But if there are things which we ought to care about, then it seems correct to say that at least in many cases, true beliefs are valuable, or worth caring about.

Lynch (2004) gives a related argument for the objective value of truth. Although he doesn’t ground the value of true beliefs in things that we morally ought to care about, his central argument is that it’s important to care about the truth for its own sake, because caring for the truth for its own sake is part of what it is to have intellectual integrity, and intellectual integrity is an essential part of a healthy, flourishing life. (He also argues that a concern for the truth for its own sake is essential for a healthy democracy.)

c. Proper Functions

Some epistemologists (for example Plantinga 1993; Bergmann 2006; Graham 2011) invoke the proper functions of our cognitive systems in order to argue for (or to explain) the value of truth, and to explain the connection between truth and justification or warrant. Proper functions are usually given a selected-effects gloss, following Millikan (1984). The basic idea is that an organ or a trait (T), which produces an effect (E), has the production of effects of type E as its proper function just in case the ancestors of T also produced effects of type E, and the fact that they produced effects of type E is part of a correct explanation of why the Ts (or the organisms which have Ts) survived and exist today. For example, hearts have the proper function of pumping blood because hearts were selected for their ability to pump blood – the fact that our ancestors had hearts that pumped blood is part of a correct explanation of why they survived, reproduced, and why we exist today and have hearts that pump blood.

Similarly, the idea goes, we have cognitive systems which have been selected for producing true beliefs. And if that’s right, then our cognitive systems have the proper function of producing true beliefs, which seems to mean that there is always at least some value in having true beliefs.

It’s not clear whether selected-effects functions are in fact normative, however (in the sense of being able by themselves to generate reasons or value). Millikan, at least, thought that proper functions are normative. Others have disagreed (for example Godfrey-Smith 1998). Whether we can accept this line of argument for the value of true beliefs will depend on whether we think that selected-effects functions are capable of generating value by themselves, or whether they only generate value when taken in a broader context which includes reference to the desires and the wellbeing of agents.

A further potential worry with the proper-function explanation of the value of true beliefs is that there seem to be cognitive mechanisms which have been selected for, and which systematically produce, false beliefs. (See Hazlett (2013), for example, who considers cognitive biases such as the self-enhancement bias at considerable length.) Plantinga (1993) suggests that we should distinguish truth-directed cognitive mechanisms from others, and say that it’s only the proper functioning of well-designed, truth-conducive mechanisms that yields warranted beliefs. But if this response works, it’s only because there’s some way to explain why truth is valuable, other than saying that our cognitive mechanisms have been selected for producing true beliefs; otherwise there would be no reason to suggest that it’s only the truth-directed mechanisms that are relevant to warranted and epistemically valuable beliefs.

d. Assuming an Epistemic Value/Critical Domains of Evaluation

Many epistemologists don’t think that we need to argue that truth is a valuable thing to have (for example BonJour 1985, Alston 1985; 2005, Sosa 2007). All we need to do is to assume that there is a standpoint which we take when we are doing epistemology, or when we’re thinking about our cognitive lives, and stipulate that the goal of achieving true beliefs and avoiding errors is definitive of that standpoint. We can simply assume that truth is a real and fundamental epistemic value, and proceed from there.

Proponents of this approach still sometimes argue for the claim that achieving the truth and avoiding error is the fundamental epistemic value. But when they do, their strategy is to assume that there must be some distinctively epistemic value which is fundamental (that is, which orients our theories of justification and knowledge, and which explains why we value other things from an epistemic standpoint), and then to argue that achieving true beliefs does a better job as a fundamental epistemic value than other candidate values do.

The strategy here isn’t to argue that true beliefs are always valuable, all things considered. The strategy is to argue only that true belief is of fundamental value insofar as we are concerned with evaluating beliefs (or belief-forming processes, practices, institutions, and so forth) from an epistemic point of view. True beliefs are indeed sometimes bad to have, all things considered (as when you know how a movie will end), and not everyone always cares about having true beliefs. But enough of us care about having true beliefs in a broad enough range of cases that a critical domain of evaluation has arisen, which takes true belief as its fundamental value.

In support of this picture of epistemology and epistemic value, Sosa (2007) compares epistemology to the critical domain of evaluation which centers on good coffee. That domain takes the production and consumption of good cups of coffee as its fundamental value, and it has a set of evaluative practices in light of that goal. Many people take that goal seriously, and we have enormous institutional structures in place which exist entirely for the purpose of achieving the goal of producing good cups of coffee. But there are people who detest coffee, and perhaps coffee isn’t really valuable at all. (Perhaps…) But even so, enough people take the goal of producing good coffee to be valuable that we have generated a critical domain of evaluation centering on the value of producing good coffee, and even people who don’t care about coffee can still recognize good coffee, and they can engage in the practices which go with taking good coffee as a fundamental value of a critical domain. And for Sosa, the value of true belief is to epistemology as the value of good cups of coffee is to the domain of coffee production and evaluation.

One might worry, however, that this sort of move cannot accommodate the apparently non-optional nature of epistemic evaluation. It’s possible to opt out of the practice of making evaluations of products and processes in terms of the way that they promote the goal of producing tasty cups of coffee, but our epistemic practices don’t seem to be optional in that way. Even if I were to forswear any kind of commitment to the importance of having epistemically justified beliefs, for example, you could appropriately level criticism at me if my beliefs were to go out of sync with my evidence.

e. Anti-Realism

An important minority approach to epistemic value and epistemic normativity is a kind of anti-realism, or conventionalism. The idea is that there is no sense in which true beliefs are really valuable, nor is there a sense in which we ought to try to have true beliefs, except insofar as we (as individuals, or as a community) desire to have true beliefs, or we are willing to endorse the value of having true beliefs.

One reason for being anti-realist about epistemic value is that you might be dissatisfied with all of the available attempts to come up with a convincing argument for thinking that truth (or anything else) is something which we ought to value. Hazlett (2013) argues against the “eudaimonic ideal” of true belief, which is the idea that even though true beliefs can be bad for us in exceptional circumstances, still, as a rule, true beliefs systematically promote human flourishing better than false beliefs do. One of Hazlett’s main objections to this idea is that there are types of cases where true beliefs are systematically worse for us than false beliefs. For example, people who have an accurate sense of what other people think of them tend to be more depressed than people who have an inflated sense of what others think of them. When it comes to beliefs about what others think about us, then, true beliefs are systematically worse for our wellbeing than corresponding false beliefs would be.

Because Hazlett thinks that the problems facing a realist account of epistemic value and epistemic norms are too serious, he adopts a form of conventionalism, according to which epistemic norms are like club rules. Just as a club might adopt the rule that they will not eat peas with spoons, so too, we humans have adopted epistemic rules such as the rule that we should believe only what the evidence supports. The justification for this rule isn’t that it’s valuable in any real sense to believe what the evidence supports; rather, the justification is just that the rule of believing in accord with the evidence is in fact a rule that we have adopted. (A worry for this approach, however, is that epistemic rules seem to be non-optional in a way that club rules are not. Clubs can change their rules by taking a vote, for example, whereas it doesn’t seem as though epistemic agents can do any such thing.)

f. Why the Focus on Truth?

We’ve been looking at some of the main approaches to the question of whether and why true beliefs are epistemically valuable. For a wide range of epistemologists, true beliefs play a fundamental role in their theories, so it’s important to try to see why we should think that truth is valuable. But, given that we tend to value knowledge more than we value true belief, one might wonder why true belief is so often taken to be a fundamental value in the epistemic domain. Indeed, not only do many of us think that knowledge is more valuable than mere true belief; we also think that there are a number of other things which should also count as valuable from the epistemic point of view: understanding, justification, simplicity, empirical adequacy of theories, and many other things too, seem to be important kinds of cognitive successes. These seem like prime candidates for counting as epistemically valuable – so why do they so often play such a smaller role in epistemological theories than true belief plays?

There are three main reasons why truth is often invoked as a fundamental epistemic value, and why these other things are often relegated to secondary roles. The first reason is that, as we saw in section 2(a), true beliefs do at least often seem to enable us to accomplish our goals and achieve what we want. And they typically enable us to do so whether or not they count as knowledge, or even whether or not they’re justified, or whether they represent relatively simple hypotheses. This seems like a reason to care about having true beliefs, which doesn’t depend on taking any other epistemic states to be valuable.

The second reason is that, if we take true belief to be the fundamental epistemic value, we will still be able to explain why we should think of many other things aside from true beliefs as epistemically valuable. If justified beliefs tend to be true, for example, and having true beliefs is the fundamental epistemic value, then justification will surely also be valuable, as a means to getting true beliefs (this is suggested in a widely cited passage in BonJour 1985, pp. 7-8). Similarly, we might be able to explain the epistemic value of simplicity in terms of the value of truth, because the relative simplicity of a hypothesis can be evidence that the hypothesis is more likely than other competing hypotheses to be true. On one common way of thinking about simplicity, a hypothesis H1 is simpler than another hypothesis H2 if H1 posits fewer theoretical entities. Understanding simplicity in that way, it’s plausible to think that simpler hypotheses are likelier to be true, because there are fewer ways for them to go wrong (there are fewer entities for them to be mistaken about).

By contrast, it is not so straightforward to try to explain the value of truth in terms of other candidate epistemic values, such as simplicity or knowledge. If knowledge were the fundamental (as opposed to the highest, or one of the highest) epistemic value, so that the value of true beliefs would have to be dependent on the value of knowledge, then it seems that it would be difficult to explain why unjustified true beliefs should be more valuable than unjustified false beliefs, which they seem to be.

And the third reason why other candidate epistemic values are not often invoked in setting out epistemic theories is that, even if there are epistemically valuable things which do not get all of their epistemic value from their connection with true belief, there is a particular theoretical role which many epistemologists want the central epistemic goal or value to play, and it can only play that role if it’s understood in terms of achieving true beliefs and avoiding false ones (David 2001; cf. Goldman 1979). Briefly, the role in question is that of providing a way to explain our epistemic notions, including especially the notions of knowledge and epistemic rationality, in non-epistemic terms. Since truth is not itself an epistemic term, it can play this role. But other things which seem to be epistemically valuable, like knowledge and rationality, cannot play this role, because they are themselves epistemic terms. We will come back to the relation between the analysis of epistemic rationality and the formulation of the epistemic goal in the final section of this article.

Still, “veritism,” or “truth-value-monism” (the view that truth, or true belief, is the sole or the fundamental epistemic value) has come in for heavy criticism in recent years. Pluralists argue that there are multiple states or properties that have independent epistemic value (for example, DePaul 2001; Kvanvig 2005; Brogaard 2009; Madison 2017); some argue that truth is not particularly valuable, or not particularly epistemically valuable (for example Feldman 2000; Wrenn 2017); and as we saw above, some epistemologists argue that knowledge is what is primarily valuable, and that the attempt to explain the value of knowledge in terms of the value of truth is misguided (for example, Littlejohn 2018; Aschliman 2020). For defenses of veritism from some of its challenges, see (Ahlstrom-Vij 2013; Pritchard 2014; 2021).

4. Understanding

There is growing support among epistemologists for the idea that understanding is the highest epistemic value, more valuable even than knowledge. There are various ways of fleshing out this view, depending on what kind of understanding we have in mind, and depending on whether we want to remain truth-monists about what’s fundamentally epistemically valuable or not.

a. Understanding: Propositions and Domains; Subjective and Objective

If you are a trained mechanic, then you understand how automobiles work. This is an understanding of a domain, or of a kind of object. To have an understanding of a domain, you need to have a significant body of beliefs about that domain, which fits together in a coherent way, and which involves beliefs about what would explain why things happen as they do in that domain. When you have such a body of beliefs, we can say that you have a subjective understanding of the domain (Grimm 2012). When, in addition, your beliefs about the domain are mostly correct, we can say that you have an objective understanding of the domain.

In addition to understanding a domain, you might also understand that p – you might understand that some proposition is true. There are several varieties of propositional understanding: there is simply understanding that p; there is understanding why p, which involves understanding that p because q; there is understanding when p, which involves understanding that p happens at time t, and understanding why p happens at time t; and so on, for other wh- terms, such as who and where. In what follows, we’ll talk in general in terms of propositional understanding, or understanding that p, to cover all these cases.

Understanding that p entails having at least some understanding of a domain. To borrow an example of Pritchard’s (2009): imagine that you come home to find your house burnt to the ground. You ask the fire chief what caused the fire, and he tells you that it was faulty wiring. Now you know why your house burnt to the ground (you know that it burnt down because of the faulty wiring), and you also understand why your house burnt to the ground (you know that the house burnt down because of faulty wiring, and you have some understanding of the kinds of things that tend to start fires, such as sparks, or overheating, both of which can be caused by faulty wiring). You understand why the house burnt down, in other words, only because you have some understanding of how fires are caused.

As Kvanvig (2003) notes, it’s plausible that you only genuinely understand that p if you have a mostly correct (that is, an objective) understanding of the relevant domain. For suppose that you have a broad and coherent body of beliefs about celestial motion, but one which centrally involves the belief that the earth is at the center of the universe. Because your body of beliefs involves mistaken elements at its core, we would normally say that you misunderstand celestial motions, and you misunderstand why (for example) we can observe the sun rising every day. In a case like this, where you misunderstand why p (for example why the sun comes up), we can say that you have a subjective propositional understanding: your belief that the sun comes up every day because the earth is at the center of the universe, and the celestial bodies all rotate around it, can be coherent with a broader body of justified beliefs, and it can provide explanations of celestial motions. But because your understanding of the domain of celestial motion involves false beliefs at its core, you have an incorrect understanding of the domain, and your explanatory propositional understanding, as a result, is also a misunderstanding.

By contrast, when your body of beliefs about a domain is largely correct, and your understanding of the domain leads you to believe that p is true because q is true, we can say that you have an objective understanding of why p is true. In what follows, except where otherwise specified, “understanding” refers to objective propositional understanding.

b. The Location of the Special Value of Understanding

It seems natural to think that understanding that p involves knowing that p, plus something extra, where the extra bit is something like having a roughly correct understanding of some relevant domain to do with p: you understand that p when (and only when) you know that p, and your belief that p fits into a broader, coherent, explanatory body of beliefs, where this body of beliefs is largely correct. So the natural place to look for the special epistemic value of understanding is in the value of this broader body of beliefs.

Some authors (Kvanvig 2003; Hills 2009; Pritchard 2009) have argued that propositional understanding does not require the corresponding propositional knowledge: S can understand that p, they argue, even if S doesn’t know that p. The main reason for this view is that understanding seems to be compatible with a certain kind of luck, environmental luck, which is incompatible with knowledge. For example, think again of the case where you ask the fire chief the cause of the fire, but now imagine that there are many pretend fire chiefs all walking around the area in uniform, and it’s just a matter of luck that you asked the real fire chief. In this case, it seems fairly clear that you lack knowledge of the cause of the fire, since you could so easily have asked a fake fire chief, and formed a false belief as a result. But, the argument goes, you do gain understanding of the cause of the fire from the fire chief. After all, you have gained a true belief about what caused the fire, and your belief is justified, and it fits in with your broader understanding of the domain of fire-causing. What we have here is a case of a justified true belief, where that belief fits in with your understanding of the relevant domain, but where you have been Gettiered, so you lack knowledge.

So, it’s controversial whether understanding that p really presupposes knowing that p. But when it comes to the value of understanding, we can set this question aside. For even if there are cases of propositional understanding without the corresponding propositional knowledge, still, most cases of propositional understanding involve the corresponding propositional knowledge, and in those cases, the special value of understanding will lie in what is added to the propositional knowledge to yield understanding. In cases where there is Gettierizing environmental luck, so that S has a Gettierized justified true belief which fits in with her understanding of the relevant domain, the special value of understanding will lie in what is added to justified true belief. In other words, whether or not propositional understanding presupposes the corresponding propositional knowledge, the special value of propositional understanding will be located in the subject’s understanding of the relevant domain.

c. The Value of Understanding

There are a few plausible accounts of why understanding should be thought of as distinctively epistemically valuable, and perhaps even as the highest epistemic value. One suggestion, which would be friendly to truth-monists about epistemic value, is that we can consistently hold both that truth is the fundamental epistemic value and that understanding is the highest epistemic value. Because understanding that p typically involves both knowing that p and having a broader body of beliefs, where this body of beliefs is coherent and largely correct, it follows from the fundamental value of true beliefs that in any case where S understands that p, S’s cognitive state involves greater epistemic value than if S were merely to truly believe that p, because S has many other true beliefs too. On this picture, understanding doesn’t have a distinctive kind of value, but it does have a greater quantity of value than true belief, or even than knowledge. But, for a truth-monist about epistemic value, this is just the result that should be desired – otherwise, the view would no longer be monistic.

An alternative suggestion, which does not rely on truth-monism about epistemic value, is that the value of having a broad body of beliefs which provide an explanation for phenomena is to be explained by the fact that whether you have such a body of beliefs is transparent to you: you can always tell whether you have understanding (Zagzebski 2001). And surely, if it’s always transparent to you whether you understand something, that is a source of extra epistemic value for understanding on top of the value of having true belief or even knowledge, since we can’t in general tell whether we are in those states.

The problem with this suggestion, though, as Grimm (2006; 2012) points out, is that we cannot always tell whether we have understanding. It often happens that we think we understand something, when in fact we gravely misunderstand it. It might be the case that we can always tell whether we have a subjective understanding – we might always be able to tell whether we have a coherent, explanatory body of beliefs – but we are not in general in a position to be able to tell whether our beliefs are largely correct. The subjective kind of understanding doesn’t entail the objective kind. Still, it is worth noting that there seems to be a kind of value in being aware of the coherence and explanatory power of one’s beliefs on a given topic, even if it’s never transparent whether one’s beliefs are largely correct. (See Kvanvig 2003 for more on the value of internal awareness and of having coherent bodies of beliefs.)

A third suggestion about the value of understanding, which is also not committed to truth-monism, is that having understanding can plausibly be thought of as a kind of success which is properly attributable to one’s exercise of a relevant ability, or in other words, an achievement. As we saw above, a number of virtue epistemologists think that we can explain the distinctive value of knowledge by reference to the fact that knowledge is a cognitive achievement. But others (notably, Lackey 2006 and 2009) have denied that subjects in general deserve credit for their true belief in cases of knowledge. Cases of testimonial knowledge are popular counterexamples to the view that knowledge is in general an achievement: when S learns some fact about local geography from a random bystander, for example, S can gain knowledge, but if anyone deserves credit for S’s true belief, it seems to be the bystander. So, if that’s right, then it’s not after all always much of an achievement to gain knowledge.

Pritchard (2009) also argues that knowledge is not in general an achievement, but he claims that understanding is. For when S gains an understanding that p, it seems that S must bring to bear significant cognitive resources, unlike when S only gains knowledge that p. Suppose, for example, that S asks bystander B where the nearest tourist information booth is, and B tells him. Now let’s compare S’s and B’s cognitive states. S has gained knowledge of how to get to the nearest information booth, but S doesn’t have an understanding of the location of the nearest information booth, since S lacks knowledge of the relevant domain (that is, local geography). B, on the other hand, both knows and understands the location of the nearest booth. And B’s understanding of the local geography, and her consequent understanding of the location of the nearest booth, involves an allocation of significant cognitive resources. (Anyone who has had to quickly memorize the local geography of a new city will appreciate how much cognitive work goes into having a satisfactory understanding of this kind of domain.)

d. Alternatives to the Natural Picture of the Value of Understanding

If understanding that p requires both knowing that p (or having a justified true belief that p) and having a broader body of beliefs which is coherent, explanatory, and largely correct, then it’s plausible to think that the special value of understanding is in the value of having such a body of beliefs. But it’s possible to resist this view of the value of understanding in a number of ways. One way to resist it would be to deny that understanding is ever any different from knowing. Reductivists about understanding think that it’s not possible to have knowledge without having understanding, or understanding without knowledge. Sliwa (2015) argues, for example, that when S knows that p, S must understand that p at least to some extent. S has a better understanding that p when S has a better understanding of the relevant domain, in the form of knowledge of more related propositions, but S knows that p if and only if S has some understanding that p.

For reductivists about understanding, there can obviously be no value in understanding beyond the value of having knowledge. There are better and worse understandings, but any genuine (objective) understanding involves at least some knowledge, and better understanding just involves more knowledge. If that’s right, then we don’t need to say that understanding has more value than knowledge.

A second way to resist the approach to the value of understanding presented in the previous section is to resist the claim that understanding requires that one’s beliefs about a domain must be mostly correct. Elgin (2007; 2009), for example, points out that in the historical progression of science, there have been stages at which scientific understanding, while useful and epistemically good, centrally involved false beliefs about the relevant domains. Perhaps even more importantly, scientists regularly employ abstract or idealized models, which are known to be strictly false – but they use these models to gain a good understanding of the domain or phenomenon in question. And the resulting understanding is better, rather than worse, because of the use of these models, which are strictly speaking false. So the elimination of all falsehoods from our theories is not even desirable, on Elgin’s view. (In the language of subjective and objective understanding, we might say that Elgin thinks that subjective understanding can be every bit as good to have as objective understanding. We need to keep in mind, though, that Elgin would reject the view that subjective understandings which centrally involve false beliefs are necessarily misunderstandings.)

5. Instrumentalism and Epistemic Goals

The final topic we need to look at now is the relation between epistemic values and the concept of epistemic rationality or justification. According to one prominent way of analyzing epistemic rationality, the instrumental conception of epistemic rationality, beliefs are epistemically rational when and just to the extent that they appropriately promote the achievement of a distinctively epistemic goal. This approach can measure the epistemic rationality of individual beliefs by how well they themselves do with respect to the epistemic goal (for example, Foley 1987); or it can measure the rationality of whole belief-systems by how accurate they are, according to some appropriate formal rule that scores bodies of beliefs in light of the epistemic goal (for example, Joyce 1998).
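
To make the idea of a formal scoring rule concrete, here is a minimal sketch of one commonly discussed candidate, the Brier score, which measures the inaccuracy of a body of degrees of belief as the mean squared distance between each credence and the truth value (1 for true, 0 for false) of the proposition in question. It is offered only as an illustration of what a rule of this kind might look like, not as the specific rule any of the authors cited above endorses; the function name, the proposition labels, and the numbers are all hypothetical.

```python
def brier_inaccuracy(credences, truths):
    """Mean squared distance between credences and truth values.

    credences: dict mapping a proposition label to a degree of belief in [0, 1]
    truths:    dict mapping the same labels to True or False
    Lower scores mean a more accurate body of beliefs.
    """
    if credences.keys() != truths.keys():
        raise ValueError("credences and truths must cover the same propositions")
    total = 0.0
    for prop, credence in credences.items():
        truth_value = 1.0 if truths[prop] else 0.0
        total += (credence - truth_value) ** 2
    return total / len(credences)


# Hypothetical example: two agents with beliefs about the same two propositions.
truths = {"rain": True, "snow": False}
cautious = {"rain": 0.7, "snow": 0.2}   # leans toward the truth on both
reckless = {"rain": 0.1, "snow": 0.9}   # leans away from the truth on both

cautious_score = brier_inaccuracy(cautious, truths)
reckless_score = brier_inaccuracy(reckless, truths)
```

On a rule like this, the first agent's belief-system counts as doing better with respect to the goal of accuracy than the second's, because its inaccuracy score is lower. Whether that suffices for epistemic rationality is exactly what instrumentalists and their critics dispute.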

The instrumental conception has been endorsed by many epistemologists over the past several decades (for example BonJour 1985; Alston 1985, 2005; Foley 1987, 1993, 2008), though a number of important criticisms of it have emerged in recent years (for example Kelly 2003; Littlejohn 2012; Hazlett 2013). For instrumentalists, getting the right accounts of epistemic goals and epistemic rationality are projects which constrain each other. Whether or not we want to accept instrumentalism in the end, it’s important to see the way that instrumentalists think of the relation of epistemic goals and epistemic rationality.

a. The Epistemic Goal as a Subset of the Epistemic Values

The first thing to note about the instrumentalist’s notion of an epistemic goal is that it has to do with what is valuable from an epistemic or cognitive point of view. But instrumentalists typically are not concerned to identify a set of goals which is exhaustive of what is epistemically valuable. Rather, they are concerned with identifying an epistemically valuable goal which is capable of generating a plausible, informative, and non-circular account of epistemic rationality in instrumental terms, and it’s clear that not all things that seem to be epistemically valuable can be included in an epistemic goal which is going to play that role. David (2001) points out that if we take knowledge or rationality (or, we might also add here, understanding) to be part of the epistemic goal, then the instrumental account of epistemic rationality becomes circular. This is most obvious with rationality: rationality is no doubt something we think is epistemically valuable, but if we include rationality in the formulation of the epistemic goal, and we analyze epistemic rationality in terms of achieving the epistemic goal, then we’ve analyzed epistemic rationality as the appropriate promotion of the goal of getting epistemically rational beliefs – an unhelpfully circular analysis, at best. And, if knowledge and understanding presuppose rationality, we also cannot include knowledge or understanding in the formulation of the epistemic goal.

This is one important reason why many epistemologists have taken the epistemic goal to be about achieving true beliefs and avoiding false ones. That seems to be a goal which is valuable from an epistemic point of view, and it stands a good chance at grounding a non-circular analysis of epistemic rationality.

David in fact goes a step further, and claims that because true belief is the only thing that is epistemically valuable that is capable of grounding an informative and non-circular analysis of epistemic rationality, truth is the only thing that’s really valuable from an epistemic point of view; knowledge, he thinks, is an extra-epistemic value. But it’s possible for pluralists about epistemic value to appreciate David’s point that only some things that are epistemically valuable (such as having true beliefs) are suitable for being taken up in the instrumentalist’s formulation of the epistemic goal. In other words, pluralism about epistemic values is consistent with monism about the epistemic goal.

b. Common Formulations of the Epistemic Goal

Now, there are two further important constraints on how to formulate the epistemic goal. First, it must be plausible to take as a goal – that is, as something we do in fact care about, or at least something that seems to be worth caring about even if people don’t in fact care about it. We might express this constraint by saying that the epistemic goal must be at least pro tanto valuable in either a subjective or an objective sense. And second, the goal should enable us to categorize clear cases of epistemically rational and irrational beliefs correctly. We can close this discussion of epistemic values and goals by considering three oft-invoked formulations of the epistemic goal, and noting the important differences between them. According to these formulations, the epistemic goal is:

(1) “to amass a large body of beliefs with a favorable truth-falsity ratio” (Alston 1985, p.59);

(2) “maximizing true beliefs and minimizing false beliefs about matters of interest and importance” (Alston 2005, p.32); and

(3) “now to believe those propositions that are true and now not to believe those propositions that are false” (Foley 1987, p.8).

Each of these formulations of the epistemic goal emphasizes the achievement of true beliefs and the avoidance of false ones. But there are two important dimensions along which they diverge.

c. Differences between the Goals: Interest and Importance

The first difference is with respect to whether the epistemic goal includes all propositions (or, perhaps, all propositions which a person could conceivably grasp), or whether it includes only propositions about matters of interest or importance. Formulation (2) includes an “interest and importance” clause, whereas (1) and (3) do not. The reason for including a reference to interest and importance is that it makes the epistemic goal much more plausible to take as a goal which is pro tanto valuable. For, as we have seen, there are countless examples of apparently utterly trivial or even harmful true propositions, which one might think are not worth caring about. This seems like a reason to restrict the epistemic goal to having true beliefs and avoiding false ones about matters of interest and importance: we want to have true beliefs, but only when it is interesting or important to us to have them.

The drawback of an interest and importance clause in the epistemic goal is that it seems to prevent the instrumental approach from providing a fully general account of epistemic rationality. For it seems possible to have epistemically rational or irrational beliefs about utterly trivial or even harmful propositions. Suppose I were to come across excellent evidence about the number of times the letter “y” appears in the seventeenth space on all lines in the first three and the last three sections of this article. Even though that strikes me as an utterly trivial truth, which I don’t care about believing, I might still come to believe what my evidence supports regarding it. And if I do, then it’s plausible to think that my belief will count as epistemically rational, because it’s based on good evidence. If it is not part of the epistemic goal that we should achieve true beliefs about even trivial or harmful matters, then it doesn’t seem like instrumentalists have the tools to account for our judgments of epistemic rationality or irrationality in such cases. This seems to give us a reason to make the epistemic goal include all true propositions, or at least all true propositions which people can conceivably grasp. (Such a view might be supported by appeal to the arguments for the general value of truth which we saw above, in section 2.)

d. Differences between the Goals: Synchronic and Diachronic Formulations

The second difference between the three formulations of the epistemic goal is regarding whether the goal is synchronic or diachronic. Formulation (3) is synchronic: it is about now having true beliefs and avoiding false ones. (Or, if we are considering a subject S’s beliefs at a time t other than the present, the goal is to believe true propositions and not believe false ones, at t.) Formulations (1) and (2) are neutral on that question.

A reason for accepting a diachronic formulation of the epistemic goal is that it is, after all, plausible to think that we do care about having true beliefs and avoiding false beliefs over the long run. Having true beliefs now is a fine thing, but having true beliefs now and still having them ten minutes from now is surely better. A second reason for adopting a diachronic formulation of the goal, offered by Vahid (2003), is to block Maitzen’s (1995) argument that instrumentalists who think that the epistemic goal is about having true beliefs cannot say that there are justified false beliefs, or unjustified true beliefs. Briefly, Maitzen argues that false beliefs can never, and true beliefs can never fail to, promote the achievement of the goal of getting true beliefs and avoiding false ones. Vahid replies that if the epistemic goal is about having true beliefs over the long run, then false beliefs can count as justified, in virtue of their truth-conducive causal histories.

The reason why instrumentalists like Foley formulate the epistemic goal instead in synchronic terms is to avoid the counterintuitive result that the epistemic status of a subject’s beliefs at t can depend on what happens after t. For example: imagine that you have very strong evidence at time t for thinking that you are a terrible student, but you are extremely confident in yourself anyway, and you hold the belief at t that you are a good student. At t+1, you consider whether to continue your studies or to drop out of school. Because of your belief about your abilities as a student, you decide to continue with your studies. And in continuing your studies, you go on to become a better student, and you learn all sorts of new things.

In this case, your belief at t that you are a good student does promote the achievement of a large body of beliefs with a favorable truth-falsity ratio over the long run. But by hypothesis, your belief is held contrary to very strong evidence at time t. The intuitive verdict in such cases seems to be that your belief at t that you are a good student is epistemically irrational. So, since the belief promotes the achievement of a diachronic epistemic goal, but not a synchronic one, we should make the epistemic goal synchronic. Or, if we want to maintain that the epistemic goal is diachronic, we can do so, as long as we are willing to accept the cost of adopting a partly revisionary view about what’s epistemically rational to believe in some cases where beliefs are held contrary to good available evidence.

6. Conclusion

We’ve gone through some of the central problems to do with epistemic value here. We’ve looked at attempts to explain why and in what sense knowledge is more valuable than any of its proper parts, and we’ve seen attempts to explain the special epistemic value of understanding. We’ve also looked at some attempts to argue for the fundamental epistemic value of true belief, and the role that the goal of achieving true beliefs and avoiding false ones plays when epistemologists give instrumentalist accounts of the nature of epistemic justification or rationality. Many of these are fundamental and important topics for epistemologists to address, both because they are intrinsically interesting, and also because of the implications that our accounts of knowledge and justification have for philosophy and inquiry more generally (for example, implications for norms of assertion, for norms of practical deliberation, and for our conception of ourselves as inquirers, to name just a few).

7. References and Further Reading

  • Ahlstrom-Vij, Kristoffer (2013). In Defense of Veritistic Value Monism. Pacific Philosophical Quarterly. 94: 1, 19-40.
  • Ahlstrom-Vij, Kristoffer, and Jeffrey Dunn (2018). Epistemic Consequentialism. Oxford: Oxford University Press.
    • This useful volume contains essays that develop, criticize, and defend consequentialist (instrumentalist) accounts of epistemic norms. Much of the volume concerns formal approaches to scoring beliefs and belief-systems in light of the epistemic goal of achieving true beliefs and avoiding false beliefs.
  • Alston, William (1985). Concepts of Epistemic Justification. The Monist. 68. Reprinted in his Epistemic Justification: Essays in the Theory of Knowledge. Ithaca, NY: Cornell University Press, 1989.
    • Discusses concepts of epistemic justification. Espouses an instrumentalist account of epistemic evaluation.
  • Alston, William (2005). Beyond Justification: Dimensions of Epistemic Evaluation. Ithaca, NY: Cornell University Press.
    • Abandons the concept of epistemic justification as too simplistic; embraces the pluralist idea that there are many valuable ways to evaluate beliefs. Continues to endorse the instrumentalist approach to epistemic evaluations.
  • Aschliman, Lance (2020). Is True Belief Really a Fundamental Value? Episteme. 17: 1, 88-104.
  • Bergmann, Michael (2006). Justification without Awareness. Oxford: Oxford University Press.
  • Bondy, Patrick (2018). Epistemic Rationality and Epistemic Normativity. Routledge.
    • Considers three strategies for explaining the normativity of epistemic reasons; criticizes instrumentalism about the nature of epistemic reasons and rationality; defends instrumentalism about the normativity of epistemic reasons.
  • Bondy, Patrick (2022). Avoiding Epistemology’s Swamping Problem: Instrumental Normativity without Instrumental Value. Southwest Philosophy Review.
    • Argues that the normativity of epistemic reasons is instrumental. Also raises worries for Sylvan’s (2018) derivative but non-instrumental approach to the epistemic value of justification and knowledge.
  • BonJour, Laurence (1985). The Structure of Empirical Knowledge. Cambridge, Mass: Harvard University Press.
    • Develops a coherentist internalist account of justification and knowledge. Gives a widely-cited explanation of the connection between epistemic justification and the epistemic goal.
  • Brogaard, Berit (2006). Can Virtue Reliabilism Explain the Value of Knowledge? Canadian Journal of Philosophy. 36: 3, 335-354.
    • Defends generic reliabilism from the Primary Value Problem; proposes an internalist response to the Secondary Value Problem.
  • Brogaard, Berit (2008). The Trivial Argument for Epistemic Value Pluralism, or, How I Learned to Stop Caring About Truth. In: Adrian Haddock, Alan Millar, and Duncan Pritchard, eds. Epistemic Value. Oxford: Oxford University Press. 284-308.
  • Carter, J. Adam, Benjamin Jarvis, and Katherine Rubin (2013). Knowledge: Value on the Cheap. Australasian Journal of Philosophy. 91: 2, 249-263.
    • Presents the promising proposal that because knowledge is a continuing state rather than something that is achieved and then set aside, there are easy solutions to the Primary, Secondary, and even Tertiary Value Problems for knowledge.
  • Craig, Edward (1990). Knowledge and the State of Nature. Oxford: Oxford University Press.
  • David, Marian (2001). Truth as the Epistemic Goal. In Matthias Steup, ed., Knowledge, Truth, and Duty: Essays on Epistemic Justification, Responsibility, and Virtue. New York and Oxford: Oxford University Press. 151-169.
    • A thorough discussion of how instrumentalists about epistemic rationality or justification ought to formulate the epistemic goal.
  • David, Marian (2005). Truth as the Primary Epistemic Goal: A Working Hypothesis. In Matthias Steup and Ernest Sosa, eds. Contemporary Debates in Epistemology. Malden, MA: Blackwell. 296-312.
  • DePaul, Michael (2001). Value Monism in Epistemology. In: Matthias Steup, ed. Knowledge, Truth, and Duty: Essays on Epistemic Justification, Responsibility, and Virtue. Oxford: Oxford University Press. 170-183.
  • Dogramaci, Sinan (2012). Reverse Engineering Epistemic Evaluations. Philosophy and Phenomenological Research. 84: 3, 513-530.
    • Accepts the widely-endorsed thought that justification or rationality are only instrumentally valuable for getting us true beliefs. The paper inquires into what function our epistemic practices could serve, in cases where what’s rational to believe is false, or what’s irrational to believe is true.
  • Elgin, Catherine (2007). Understanding and the Facts. Philosophical Studies. 132, 33-42.
  • Elgin, Catherine (2009). Is Understanding Factive? In A. Haddock, A. Millar, and D. Pritchard, eds. Epistemic Value. Oxford: Oxford University Press. 322-330.
  • Feldman, Richard (2000). The Ethics of Belief. Philosophy and Phenomenological Research. 60: 3, 667-695.
  • Field, Hartry (2001). Truth and the Absence of Fact. Oxford: Oxford University Press.
    • Among other things, argues that there are no objectively correct epistemic goals which can ground objective judgments of epistemic reasonableness.
  • Foley, Richard (1987). The Theory of Epistemic Rationality. Cambridge, Mass: Harvard University Press.
    • A very thorough development of an instrumentalist and egocentric account of epistemic rationality.
  • Foley, Richard (1993). Working Without a Net: A Study of Egocentric Rationality. New York and Oxford: Oxford University Press.
    • Develops and defends the instrumental approach to rationality generally and to epistemic rationality in particular.
  • Foley, Richard (2008). An Epistemology that Matters. In P. Weithman, ed. Liberal Faith: Essays in Honor of Philip Quinn. Notre Dame, Indiana: University of Notre Dame Press. 43-55.
    • Clear and succinct statement of Foley’s instrumentalism.
  • Godfrey-Smith, Peter (1998). Complexity and the Function of Mind in Nature. Cambridge: Cambridge University Press.
  • Goldman, Alvin (1979). What Is Justification? In George Pappas, ed. Justification and Knowledge. Dordrecht: D. Reidel Publishing Company, 1-23.
  • Goldman, Alvin (1999). Knowledge in a Social World. Oxford: Oxford University Press.
    • Adopts a veritist approach to epistemic value; describes and evaluates a number of key social institutions and practices in light of the truth-goal.
  • Goldman, Alvin and Olsson, Erik (2009). Reliabilism and the Value of Knowledge. In A. Haddock, A. Millar, and D. Pritchard, eds. Epistemic Value. Oxford: Oxford University Press. 19-41.
    • Presents two reliabilist responses to the Primary Value Problem.
  • Graham, Peter (2011). Epistemic Entitlement. Noûs. 46: 3, 449-482.
  • Greco, John (2003). Knowledge as Credit for True Belief. In Michael DePaul and Linda Zagzebski, eds. Intellectual Virtue: Perspectives from Ethics and Epistemology. Oxford: Oxford University Press. 111-134.
    • Sets out the view that attributions of knowledge are attributions of praiseworthiness, when a subject gets credit for getting to the truth as a result of the exercise of intellectual virtues. Discusses praise, blame, and the pragmatics of causal explanations.
  • Greco, John (2008). Knowledge and Success from Ability. Philosophical Studies. 142, 17-26.
    • Elaboration of ideas in Greco (2003).
  • Grimm, Stephen (2006). Is Understanding a Species of Knowledge? British Journal for the Philosophy of Science. 57, 515–35.
  • Grimm, Stephen (2012). The Value of Understanding. Philosophy Compass. 7: 2, 103-117.
    • Good survey article of work on the value of understanding up to 2012.
  • Haddock, Adrian (2010). Part III: Knowledge and Action. In Duncan Pritchard, Alan Millar, and Adrian Haddock, The Nature and Value of Knowledge: Three Investigations. Oxford: Oxford University Press.
  • Hazlett, Allan (2013). A Luxury of the Understanding: On the Value of True Belief. Oxford: Oxford University Press.
    • An extended discussion of whether true belief is valuable. Presents a conventionalist account of epistemic normativity.
  • Hills, Alison (2009). Moral Testimony and Moral Epistemology. Ethics. 120: 1, 94-127.
  • Horvath, Joachim (2009). Why the Conditional Probability Solution to the Swamping Problem Fails. Grazer Philosophische Studien. 79: 1, 115-120.
  • Hutchinson, Jim (2021). Why Can’t What Is True Be Valuable? Synthese. 198, 6935-6954.
  • Hyman, John (2010). The Road to Larissa. Ratio. 23: 4, 393-414.
    • Contains detailed explanatory and critical discussion of the Primary and Secondary Value Problems, and Plato’s and Williamson’s stability solutions. Proposes that knowledge is the ability to be guided by the facts; and that knowledge is expressed when we guide ourselves by the facts—when we “do things for reasons that are facts” (p.411); and mere true belief is insufficient for this kind of guidance.
  • James, William (1949). The Will to Believe. In his Essays in Pragmatism. New York: Hafner. pp. 88-109. Originally published in 1896.
  • Jones, Ward (1997). Why Do We Value Knowledge? American Philosophical Quarterly. 34: 4, 423-439.
    • Argues that reliabilists and other instrumentalists cannot handle the Primary Value Problem. Proposes that we solve the problem by appealing to the value of contingent features of knowledge.
  • Joyce, James (1998). A Nonpragmatic Vindication of Probabilism. Philosophy of Science. 65: 4, 575-603.
    • Assumes an epistemic goal of truth or accuracy; shows that credal systems that conform to the axioms of probability do better than systems that violate those axioms.
  • Kaplan, Mark (1985). It’s Not What You Know that Counts. The Journal of Philosophy. 82: 7, 350-363.
    • Denies that knowledge is any more important than justified true belief.
  • Kelly, Thomas (2003). Epistemic Rationality as Instrumental Rationality: A Critique. Philosophy and Phenomenological Research. 66: 3, 612-640.
    • Criticizes the instrumental conception of epistemic rationality, largely on the grounds that beliefs can be epistemically rational or irrational in cases where there is no epistemic goal which the subject desires to achieve.
  • Kornblith, Hilary (2002). Knowledge and its Place in Nature. Oxford: Clarendon Press of Oxford University Press.
    • Develops the idea that knowledge is a natural kind which ought to be studied empirically rather than through conceptual analysis. Grounds epistemic norms, including the truth-goal, in the fact that we desire anything at all.
  • Kvanvig, Jonathan (2003). The Value of Knowledge and the Pursuit of Understanding. Cambridge: Cambridge University Press.
    • Considers and rejects various arguments for the value of knowledge. Argues that understanding rather than knowledge is the primary epistemic value.
  • Kvanvig, Jonathan (2005). Truth is Not the Primary Epistemic Goal. In Matthias Steup and Ernest Sosa, eds. Contemporary Debates in Epistemology. Malden, MA: Blackwell. 285-296.
    • Criticizes epistemic value monism.
  • Kvanvig, Jonathan (2010). The Swamping Problem Redux: Pith and Gist. In Adrian Haddock, Alan Millar, and Duncan Pritchard, eds. Social Epistemology. 89-112.
  • Lackey, Jennifer (2007). Why We Don’t Deserve Credit for Everything We Know. Synthese. 158: 3, 345-361.
  • Lackey, Jennifer (2009). Knowledge and Credit. Philosophical Studies. 142: 1, 27-42.
    • Lackey argues against the virtue-theoretic idea that when S knows that p, S’s getting a true belief is always creditable to S.
  • Littlejohn, Clayton (2012). Justification and the Truth-Connection. Cambridge: Cambridge University Press.
    • Contains an extended discussion of internalism and externalism, and argues against the instrumental conception of epistemic justification. Also argues that there are no false justified beliefs.
  • Littlejohn, Clayton (2018). The Right in the Good: A Defense of Teleological Non-Consequentialism. In: Kristoffer Ahlstrom-Vij and Jeffrey Dunn, eds. Epistemic Consequentialism. Oxford: Oxford University Press. 23-47.
  • Lynch, Michael (2004). True to Life: Why Truth Matters. Cambridge, Mass: MIT Press.
    • Argues for the objective value of true beliefs.
  • Lynch, Michael (2009). Truth, Value and Epistemic Expressivism. Philosophy and Phenomenological Research. 79: 1, 76-97.
    • Argues against expressivism and anti-realism about the value of true beliefs.
  • Madison, B.J.C. (2017). Epistemic Value and the New Evil Demon. Pacific Philosophical Quarterly. 98: 1, 89-107.
    • Argues that justification is valuable for its own sake, not just as a means to truth.
  • Maitzen, Stephen (1995). Our Errant Epistemic Aim. Philosophy and Phenomenological Research. 55: 4, 869-876.
    • Argues that if we take the epistemic goal to be achieving true beliefs and avoiding false ones, then all and only true beliefs will count as justified. Suggests that we need to adopt a different formulation of the goal.
  • Millikan, Ruth (1984). Language, Thought, and other Biological Categories. Cambridge, Mass.: MIT Press.
    • Develops and applies the selected-effect view of the proper functions of organs and traits.
  • Olsson, Erik (2009). In Defense of the Conditional Reliability Solution to the Swamping Problem. Grazer Philosophische Studien. 79: 1, 93-114.
  • Olsson, Erik (2011). Reply to Kvanvig on the Swamping Problem. Social Epistemology. 25: 2, 173-182.
  • Piller, Christian (2009). Valuing Knowledge: A Deontological Approach. Ethical Theory and Moral Practice. 12, 413-428.
  • Plantinga, Alvin (1993). Warrant and Proper Function. New York: Oxford University Press.
    • Develops a proper function analysis of knowledge.
  • Plato. Meno. Trans. G. M. A. Grube. In Plato, Complete Works. J. M. Cooper and D. S. Hutchinson, eds. Indianapolis and Cambridge: Hackett, 1997. 870-897.
  • Pritchard, Duncan. (2007). Recent Work on Epistemic Value. American Philosophical Quarterly. 44: 2, 85-110.
    • Survey article on problems of epistemic value. Distinguishes Primary, Secondary, and Tertiary value problems.
  • Pritchard, Duncan. (2008). Knowing the Answer, Understanding, and Epistemic Value. Grazer Philosophische Studien. 77, 325–39.
  • Pritchard, Duncan. (2009). Knowledge, Understanding, and Epistemic Value. Epistemology (Royal Institute of Philosophy Lectures). Ed. Anthony O’Hear. New York: Cambridge University Press. 19–43.
  • Pritchard, Duncan. (2010). Part I: Knowledge and Understanding. In Duncan Pritchard, Alan Millar, and Adrian Haddock, The Nature and Value of Knowledge: Three Investigations. Oxford: Oxford University Press.
  • Pritchard, Duncan. (2014). Truth as the Fundamental Epistemic Good. In: Jonathan Matheson and Rico Vitz, eds. The Ethics of Belief: Individual and Social. Oxford: Oxford University Press. 112-129.
  • Pritchard, Duncan. (2021). Intellectual Virtues and the Epistemic Value of Truth. Synthese. 198, 5515-5528.
  • Riggs, Wayne (2002). Reliability and the Value of Knowledge. Philosophy and Phenomenological Research. 64, 79-96.
  • Riggs, Wayne (2003). Understanding Virtue and the Virtue of Understanding. In Michael DePaul & Linda Zagzebski, eds. Intellectual Virtue: Perspectives from Ethics and Epistemology. Oxford University Press.
  • Riggs, Wayne (2008). Epistemic Risk and Relativism. Acta Analytica. 23: 1, 1-8.
  • Sartwell, Crispin (1991). Knowledge is Merely True Belief. American Philosophical Quarterly. 28: 2, 157-165.
  • Sartwell, Crispin (1992). Why Knowledge is Merely True Belief. The Journal of Philosophy. 89: 4, 167-180.
    • These two articles by Sartwell are the only places in contemporary epistemology where the view that knowledge is just true belief is seriously defended.
  • Sliwa, Paulina (2015). Understanding and Knowing. Proceedings of the Aristotelian Society. 115, part 1, pp.57-74.
    • Defends the reductivist thesis that the various types of understanding (understanding a domain, understanding that p, understanding a person, and so on) are no different from the corresponding types of knowing.
  • Sosa, Ernest (2003). The Place of Truth in Epistemology. In Michael DePaul and Linda Zagzebski, eds. Intellectual Virtue: Perspectives from Ethics and Epistemology. Oxford: Clarendon Press; New York: Oxford University Press.
  • Sosa, Ernest (2007). A Virtue Epistemology: Apt Belief and Reflective Knowledge, Volume 1. Oxford: Clarendon Press; New York: Oxford University Press.
    • Sets out a virtue-theoretic analysis of knowledge. Distinguishes animal knowledge from reflective knowledge. Responds to dream-skepticism. Argues that true belief is the fundamental epistemic value.
  • Sylvan, Kurt (2018). Veritism Unswamped. Mind. 127: 506, 381-435.
    • Proposes that justification is non-instrumentally, but still derivatively, valuable.
  • Treanor, Nick (2014). Trivial Truths and the Aim of Inquiry. Philosophy and Phenomenological Research. 89: 3, 552-559.
    • Argues against an argument for the popular claim that some truths are more interesting than others. Points out that the standard comparisons between what are apparently more and less interesting true sentences are unfair, because the sentences might not involve or express the same number of true propositions.
  • Vahid, Hamid (2003). Truth and the Aim of Epistemic Justification. Teorema. 22: 3, 83-91.
    • Discusses justification and the epistemic goal. Proposes that accepting a diachronic formulation of the epistemic goal solves the problem raised by Stephen Maitzen (1995).
  • Weiner, Matthew (2009). Practical Reasoning and the Concept of Knowledge. In A. Haddock, A. Millar, and D. Pritchard, eds. Epistemic Value. Oxford: Oxford University Press. 163-182.
    • Argues that knowledge is valuable in the same way as a Swiss Army Knife is valuable. A Swiss Army Knife contains many different blades which are useful in different situations; they’re not always all valuable to have, but it’s valuable to have them all collected in one easy-to-carry package. Similarly, the concept of knowledge has a number of parts which are useful in different situations; they’re not always all valuable in all cases, but it’s useful to have them collected together in one easy-to-use concept.
  • Williamson, Timothy (2000). Knowledge and its Limits. Oxford: Oxford University Press.
    • Among many other things, Williamson sets out and defends knowledge-first epistemology, adopts a stability-based solution to the Primary Value Problem, and suggests that his view of knowledge as the most general factive mental state solves the Secondary Value Problem.
  • White, R. (2007). Epistemic Subjectivism. Episteme: A Journal of Social Epistemology. 4: 1, 115-129.
  • Whiting, Daniel (2012). Epistemic Value and Achievement. Ratio. 25, 216-230.
    • Argues against the view that the value of epistemic states in general should be thought of in terms of achievement (or success because of ability). Also argues against Pritchard’s achievement-account of the value of understanding in particular.
  • Wrenn, Chase (2017). True Belief Is Not (Very) Intrinsically Valuable. Pacific Philosophical Quarterly. 98, 108-128.
  • Zagzebski, Linda (2001). Recovering Understanding. In Knowledge, Truth, and Duty: Essays on Epistemic Justification, Responsibility, and Virtue. Ed. Matthias Steup. New York: Oxford University Press, 2001. 235–56.
  • Zagzebski, Linda (2003). The Search for the Source of Epistemic Good. Metaphilosophy. 34, 12-28.
    • Gives a virtue-theoretic explanation of knowledge and the value of knowledge. Claims that it is morally important to have true beliefs, when we are performing morally important actions. Claims that knowledge is motivated by a love of the truth, and explains the value of knowledge in terms of that love and the value of that love.
  • Zagzebski, Linda (2009). On Epistemology. Belmont, CA: Wadsworth.
    • Accessible introduction to contemporary epistemology and to Zagzebski’s preferred views in epistemology. Useful for students and professional philosophers.

 

Author Information

Patrick Bondy
Email: patrbondy@gmail.com
Cornell University
U. S. A.

The Problem of Induction

This article discusses the problem of induction, including its conceptual and historical perspectives from Hume to Reichenbach. Given the prominence of induction in everyday life as well as in science, we should be able to tell whether inductive inference amounts to sound reasoning or not, or at least we should be able to identify the circumstances under which it ought to be trusted. In other words, we should be able to say what, if anything, justifies induction: are beliefs based on induction trustworthy? The problem(s) of induction, in their most general setting, reflect our difficulty in providing the required justifications.

Philosophical folklore has it that David Hume identified a severe problem with induction, namely, that its justification is either circular or question-begging. As C. D. Broad put it, Hume found a “skeleton” in the cupboard of inductive logic. What is interesting is that (a) induction and its problems were thoroughly debated before Hume; (b) Hume rarely spoke of induction; and (c) before the twentieth century, almost no one took it that Hume had a “problem” with induction, a.k.a. inductive scepticism.

This article tells the story of the problem(s) of induction, focusing on the conceptual connections and differences among the accounts offered by Hume and all the major philosophers who dealt with induction until Hans Reichenbach. Hence, after Hume, there is a discussion of what Kant thought Hume’s problem was. It moves on to the empiricist-vs-rationalist controversy over induction as it was instantiated by the views of J. S. Mill and W. Whewell in the nineteenth century. It then casts light on important aspects of the probabilistic approaches to induction, which have their roots in Pierre Laplace’s work on probability and which dominated most of the twentieth century. Finally, there is an examination of important non-probabilistic treatments of the problem of induction, such as Peter Strawson’s view that the “problem” rests on a conceptual misunderstanding, Max Black’s self-supporting justification of induction, Karl Popper’s “anathema” of induction, and Nelson Goodman’s new riddle of induction.

Table of Contents

  1. Reasoning
    1. Two Kinds of Reasoning
      1. Deductive Reasoning
      2. Inductive Reasoning
    2. The Skeleton in the Cupboard of Induction
    3. Two Problems?
  2. What was Hume’s Problem?
    1. “Rules by which to judge causes and effects”
    2. The Status of the Principle of Uniformity of Nature
    3. Taking a Closer Look at Causal Inference
    4. Causal Inference is Non-Demonstrative
    5. Against Natural Necessity
      1. Malebranche on Necessity
      2. Leibniz on Induction
    6. Can Powers Help?
    7. Where Does the Idea of Necessity Come From?
  3. Kant on Hume’s Problem
    1. Hume’s Problem for Kant
    2. Kant on Induction
  4. Empiricist vs Rationalist Conceptions of Induction (After Hume and Kant)
    1. Empiricist Approaches
      1. John Stuart Mill: “The Problem of Induction”
      2. Mill on Enumerative Induction
      3. Mill’s Methods
      4. Alexander Bain: The “Sole Guarantee” of the Inference from a Fact Known to a Fact Unknown
    2. Rationalist Approaches
      1. William Whewell on “Collecting General Truths from Particular Observed Facts”
        1. A Short Digression: Francis Bacon
        2. Back to Whewell
      2. Induction as Conception
    3. The Whewell-Mill Controversy
      1. On Kepler’s Laws
      2. On the Role of Mind in Inductive Inferences
    4. Early Appeals to Probability: From Laplace to Russell via Venn
      1. Venn: Induction vs Probability
      2. Laplace: A Probabilistic Rule of Induction
      3. Russell’s Principle of Induction
  5. Non-Probabilistic Approaches
    1. Induction and the Meaning of Rationality
    2. Can Induction Support Itself?
      1. Premise-Circularity vs Rule-Circularity
      2. Counter-Induction?
    3. Popper Against Induction
    4. Goodman and the New Riddle of Induction
  6. Reichenbach on Induction
    1. Statistical Frequencies and the Rule of Induction
    2. The Pragmatic Justification
    3. Reichenbach’s Views Criticized
  7. Appendix
  8. References and Further Reading

1. Reasoning

a. Two Kinds of Reasoning

Reasoning in general is the process by which one draws conclusions from a set of premises. Reasoning is guided by rules of inference, that is, rules which entitle the reasoner to draw the conclusion, given the premises. There are, broadly speaking, two kinds of rules of inference and hence two kinds of reasoning: Deductive (or demonstrative) and Inductive (or non-demonstrative).

i. Deductive Reasoning

Deductive inference is such that the rule used is logically valid. A logically valid argument is one whose premises are inconsistent with the negation of its conclusion; that is, if the premises are true, the conclusion has to be true. Deductive arguments can be valid without being sound. A sound argument is a deductively valid argument with true premises. For example, the argument {All human beings are mortal; Madonna is a human being; therefore, Madonna is mortal} is valid, but whether or not it is sound depends on whether or not its premises are true. If at least one of them fails to be true, the argument is unsound. So, soundness implies validity, whereas validity does not imply soundness. Logically valid rules of inference include modus ponens and modus tollens, the hypothetical and the disjunctive syllogism, and categorical syllogisms.
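The definition of validity can be illustrated with a small truth-table check. The following is a minimal sketch for propositional arguments only; the helper names `is_valid` and `implies` are introduced here for illustration and are not part of the article's apparatus. An argument counts as valid when no assignment of truth values makes every premise true and the conclusion false.

```python
from itertools import product

def is_valid(premises, conclusion, num_vars):
    """Return True if no assignment makes all premises true and the conclusion false."""
    for values in product([True, False], repeat=num_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # this row of the truth table is a counterexample
    return True

def implies(a, b):
    # Material conditional: false only when a is true and b is false
    return (not a) or b

# Modus ponens: P -> Q, P, therefore Q
modus_ponens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p],
    lambda p, q: q,
    2,
)

# Affirming the consequent: P -> Q, Q, therefore P
affirming_consequent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p,
    2,
)

print(modus_ponens)          # True
print(affirming_consequent)  # False
```

Note that the check concerns validity only. Whether a valid argument is also sound depends on the actual truth of its premises, which no truth table can settle.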

The essential property of a valid deductive argument is known as truth-transmission. This is meant to capture the fact that in a valid argument the truth of the premises is “transferred” to the conclusion: if the premises are true, the conclusion has to be true. Yet this feature comes at a price: deductive arguments are not content-increasing. The information contained in the conclusion is already present, albeit in an implicit form, in the premises. Thus, deductive reasoning is non-ampliative, or explicative, as the American philosopher Charles Peirce put it. The non-ampliative character of deductive reasoning has an important consequence regarding its function in language and thought: deductive reasoning unpacks the information content of the premises. In mathematics, for instance, the axioms of a theory contain all the information that is unraveled by proofs into the theorems.

ii. Inductive Reasoning

Not all reasoning is deductive, however, for the simple reason that the truth of the premises of a deductive argument cannot, as a rule, be established deductively. As John Stuart Mill put it, “Truth can only be successfully pursued by drawing inferences from experience,” and these are non-deductive. Following Mill, let us call Induction (with capital I) the mode of reasoning which moves from “particulars to generals,” or equivalently, the rule of inference in which “The conclusion is more general than the largest of the premises.” A typical example is enumerative Induction: if one has observed n As being B and no As being not-B, and if the evidence is sufficient and varied, then one should infer that “All As are B.”

Inductive arguments are logically invalid: the truth of the premises is consistent with the falsity of the conclusion. Thus, the rules of inductive inference are not truth-preserving precisely because they are ampliative: the content of the conclusion of the argument exceeds (and hence amplifies) the content of its premises. A typical case of it is:

All observed individuals who have the property A also have the property B;

therefore, All individuals who have the property A also have the property B.

It is perfectly consistent with the fact that All observed individuals who have the property A also have the property B, that there are some As which are not B (among the unobserved individuals).
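The consistency claim can be made concrete with a toy countermodel. The sketch below assumes a hypothetical three-member domain, invented purely for illustration, in which the premise of the enumerative argument is true and its conclusion false.

```python
# Hypothetical toy domain: each individual records whether it is A, is B, and was observed.
domain = [
    {"A": True, "B": True, "observed": True},
    {"A": True, "B": True, "observed": True},
    {"A": True, "B": False, "observed": False},  # an unobserved A that is not B
]

# Premise: all observed individuals that are A are also B
premise = all(x["B"] for x in domain if x["A"] and x["observed"])

# Conclusion: all individuals that are A are also B
conclusion = all(x["B"] for x in domain if x["A"])

print(premise)     # True
print(conclusion)  # False
```

A single such model suffices to show that the inference is not truth-preserving, although it does not show that the premise lends no support to the conclusion.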

And yet, the logical invalidity of Induction is not, per se, reason for indictment. The conclusion of an ampliative argument is adopted on the basis that the premises offer some reason to accept it as true. The idea here is that the premises inductively support the conclusion, even if they do not prove it to be true. This is the outcome of the fact that Induction by enumeration is ampliative. It is exactly this feature of inductive inference that makes it useful for empirical sciences, where next-instance predictions or general laws are inferred on the basis of a finite number of observational or experimental facts.

b. The Skeleton in the Cupboard of Induction

Induction has a problem associated with it. In a nutshell, it is motivated by the following question: on what grounds is one justified in believing that the conclusion of an inductive inference is true, given the truth of its premises? The skeptical challenge to Induction is that any attempt to justify Induction, either by the lights of reason only or with reason aided by (past) experience, will be circular and question-begging.

In fact, the problem concerns ampliative reasoning in general. Since the conclusion Q of an ampliative argument can be false, even though all of its premises are true, the following question arises: what makes it the case that the ampliative reasoning conveys whatever epistemic warrant the premises might have to the intended conclusion Q, rather than to its negation not-Q? The defender of ampliative reasoning will typically reply that Induction relies on some substantive and contingent assumptions (for example, that the world has a natural-kind structure, that the world is governed by universal regularities, or that the course of nature will remain uniform, etc.); hence some argue that these assumptions back up Induction in all cases. But the sceptic will retort that these very assumptions can only be established as true by means of ampliative reasoning. Arguing in a circle, the sceptic notes, is inevitable and this simply means, she concludes, that the alleged defense carries no rational compulsion with it.

It is typically, but not quite rightly, accepted that the Problem of Induction was noted for the first time by David Hume in his A Treatise of Human Nature (1739). (For an account of Induction and its problem(s) before Hume, see Psillos 2015.) In section 2, this article discusses Hume’s version of the Problem of Induction (and his solution to this problem) in detail. For the time being, it is important to note that Hume’s Problem of Induction as it appears in standard textbooks, and in particular the thought that Induction needs a special justification, was formed distinctly as a philosophical problem only in the twentieth century. It was expressed by C. D. Broad in an address delivered in 1926 at Cambridge on the occasion of Francis Bacon’s tercentenary. There, Broad raised the following question: “Did Bacon provide any logical justification for the principles and methods which he elicited and which scientists assume and use?” His reply is illuminating: “He did not, and he never saw that it was necessary to do so. There is a skeleton in the cupboard of Inductive Logic, which Bacon never suspected and Hume first exposed to view.” (1952: 142-3) This skeleton is the Problem of Induction. Another Cambridge philosopher, J. M. Keynes, explains in his A Treatise of Probability why Hume’s criticism of Induction never became prominent in the eighteenth and nineteenth centuries:

Between Bacon and Mill came Hume (…) Hume showed, not that inductive methods were false, but that their validity had never been established and that all possible lines of proof seemed equally unpromising. The full force of Hume’s attack and the nature of the difficulties which it brought to light were never appreciated by Mill, and he makes no adequate attempt to deal with them. Hume’s statement of the case against induction has never been improved upon; and the successive attempts of philosophers, led by Kant, to discover a transcendental solution have prevented them from meeting the hostile arguments on their own ground and from finding a solution along lines which might, conceivably, have satisfied Hume himself. (1921: 312-313)

c. Two Problems?

Indeed, hardly ever does anyone mention Hume’s name in relation to the Problem of Induction before the Cambridge Apostles, with the exception of John Venn (see section 4.4). Bertrand Russell, in his famous book The Problems of Philosophy in 1912, devoted a whole chapter to Induction (interestingly, without making any reference to Hume). There, he took it that there should be a distinction between two different issues, and hence two different types of justification that one may provide for Induction, a distinction “without which we should soon become involved in hopeless confusions.” (1912: 34) The first issue is a fact about human and animal lives, namely, that expectations about the future course of events or about hitherto unobserved objects are formed on the basis of (and are caused by) past uniformities. In this case, “The frequent repetition of some uniform succession or coexistence has been a cause of our expecting the same succession or coexistence on the next occasion” (ibid.). Thus, the justification (better put, exculpation) would be of the following sort: since, as a matter of fact, the mind works in such and such a way, we expect the conclusion of induction to be true. The second issue is about the justification of the inferences that lie at the basis of the transition from the past regularities (or the hitherto observed pattern among objects) to a generalization (that is, to their extension to the future or to the hitherto unobserved). This second issue, Russell thought, revolves around the problem of whether there is “any reasonable ground for giving weight” to such expectations of uniformity after “the question of their validity has been raised.” (1912: 35) Hence, the Problem of Induction is a problem that arises upon reflection on a practice, namely, the practice of forming expectations about the future on the basis of whatever has happened in the past; or, in other words, the practice of learning from experience.

Later on, Karl Popper distinguished between the psychological problem of Induction, which can be formulated in terms of the following question: “How is it that nevertheless all reasonable people expect and believe that instances of which they have had no experience will conform to those of which they have had experience?” (Popper 1974: 1018), and the logical problem of Induction, which is expressed in the question: “Are we rationally justified in reasoning from repeated instances of which we have had experience to instances of which we have had no experience?” (ibid.)

To show the difference between the two types of problems, Popper (1974: 1019) referred to an example from Russell (1948): consider a person who, out of mental habit, does not follow the rules of inductive inference. If the only justification of the rule is based on how the mind works, we cannot explain why that person’s way of thinking is irrational. The only thing we can tell is that the person does not follow the way that most people think. Can we do better than that? Can we solve the logical problem of induction? And, more importantly, is there a logical problem to solve?

A fairly recent, but very typical, formulation by Gerhard Schurz clarifies the logical problem of Induction. The Problem of Induction is that:

There is no epistemic justification [of induction], meaning a system of arguments showing that inductive methods are useful or the right means for the purpose of acquiring true and avoiding false beliefs. […] Hume did not only say that we cannot prove that induction is successful or reliable; he argued that induction is not capable of any rational justification whatsoever. (2019: 7)

2. What was Hume’s Problem?

a. “Rules by which to judge causes and effects”

Suppose that you started to read Hume’s A Treatise of Human Nature from section XV, of part III of book I, titled Rules by which to judge causes and effects. You read:

[…] There are no objects, which by the mere survey, without consulting experience, we can determine to be the causes of any other; and no objects, which we can certainly determine in the same manner not to be the causes. Any thing may produce any thing. Where objects are not contrary, nothing hinders them from having that constant conjunction, on which the relation of cause and effect totally depends. (1739: 173)

Fair enough, you may think. Hume claims that only experience can teach us what causes what, and without any reference to (prior) experience anything can be said to cause anything else to happen; that is, no causal connections can be found by the lights of reason only. Reason imposes no constraints on what constant conjunctions among non-contrary (that is, not mutually exclusive) objects or properties there are in nature. Then, you read on: “Since therefore ’tis possible for all objects to become causes or effects to each other, it may be proper to fix some general rules, by which we may know when they really are so.”

Fair enough again, you may think. If only experience can teach us what constant conjunctions of objects there are in the world, then we had better have some ways to find out which among the possible constant conjunctions (possible if only Reason were in operation) are actual. And Hume does indeed go ahead to give eight rules, the first six of which are:

    1. The cause and effect must be contiguous in space and time;
    2. The cause must be prior to the effect;
    3. There must be a constant union between the cause and effect;
    4. The same cause always produces the same effect, and the same effect never arises but from the same cause;
    5. When several different objects produce the same effect, it must be by means of some quality, which is common amongst them;
    6. If two resembling objects produce different effects, then the difference in the effects must proceed from something in which the causes differ.

It is not the aim of this article to discuss these rules. Suffice it to say that they are hardly controversial. Rules 1 and 2 state that causes are spatio-temporally contiguous with and temporally prior to their effects. Rule 3 states that cause and effect form a regular succession. Rule 4, perhaps the most controversial, states a fundamental principle about causation (which encapsulates the principle of uniformity of nature) which Mill defended too. Rules 5 and 6 are early versions of the methods of agreement and difference, which became central features of Mill’s epistemology of causation. Hume readily acknowledges that the application of these rules is not easy, since most natural phenomena are complex and complicated. But all this is very natural and is in no way related to any Problem of Induction, apart from the issue of how to distinguish between good and bad inductive inferences.

There is something even more surprising in Hume’s Treatise. He notes:

’Tis certain, that not only in philosophy, but even in common life, we may attain the knowledge of a particular cause merely by one experiment, provided it be made with judgement, and after a careful removal of all foreign and superfluous circumstances. Now as after one experiment of this kind, the mind, upon the appearance of the cause or the effect, can draw an inference concerning the existence of its correlative; and as a habit can never be acquir’d merely by one instance; it may be thought that belief cannot in this case be esteem’d the effect of custom. (1739: 104-5)

Hume certainly allows that a single experiment may be enough for causal knowledge (which is always general), provided, as he says, the experiment is “made with judgement, and after a careful removal of all foreign and superfluous circumstances.” Now, strictly speaking, it makes no sense to say that in a single experiment “all foreign and superfluous circumstances” can be removed. A single experiment is a one-off act: it includes all the factors it actually does. To remove or change some factors (circumstances) is to change the experiment, or to perform a different, but related, one. So, what Hume has in mind when he says that we can draw causal conclusions from single experiments is that we have to perform a certain type of experiment a few times, each time removing or changing a certain factor, in order to see whether the effect is present (or absent) under the changed circumstances. In the end, it will be a single experiment that will reveal the cause. But this revelation will depend on having performed the type of experiment a few times, each under changed circumstances. Indeed, this thought is captured by Hume’s Rule 5 above. This rule urges the experimenter to remove the “foreign and superfluous circumstances” in a certain type of experiment by removing a factor each time it is performed until the common factor in all of them is revealed.
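The procedure just described, repeating a type of experiment under varied circumstances until the common factor is revealed, can be sketched as a simple set intersection. This is a deliberately crude illustration: the factor names and trial data are hypothetical, and the function `common_factors` is introduced here only for the example; it is not a reconstruction of anything in Hume's text.

```python
def common_factors(trials):
    """Return the factors present in every trial in which the effect occurred."""
    positive = [set(factors) for factors, effect in trials if effect]
    if not positive:
        return set()
    return set.intersection(*positive)

# Each trial pairs the circumstances present with whether the effect was observed.
# Factor names are hypothetical placeholders.
trials = [
    ({"heat", "moisture", "light"}, True),
    ({"heat", "moisture"}, True),
    ({"heat", "light"}, True),
    ({"moisture", "light"}, False),
]

candidates = common_factors(trials)
print(candidates)  # {'heat'}
```

The sketch leaves out exactly what Hume stresses as difficult, namely knowing which circumstances are relevant in the first place and whether the list of recorded factors is complete.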

But Hume’s main concern in the quotation above is to resist the claim that generalizing on the basis of a single experiment is a special non-inductive procedure. He goes on to explain that even though in a certain case we may have to rely on a single experiment to form a general belief, we in fact rely on a principle for which we have “millions” of experiments in support: “That like objects, plac’d in like circumstance, will always produce like effects.” (1739: 105) So, when general causal conclusions are drawn from single experiments, this activity is “comprehended under [this higher-order] principle,” which is clearly a version of the Principle of Uniformity of Nature. This higher-order principle “bestows an evidence and firmness on any opinion, to which it can be apply’d.” (1739: 105)

Note that section XV, of part III of book I reveals hardly any sign of inductive skepticism from Hume. Instead, it offers methods for judging the circumstances under which Induction is legitimate.

b. The Status of the Principle of Uniformity of Nature

So, what is the issue of Hume’s skepticism about Induction? Note, for a start, what he adds to what he has already said. This higher-order principle (the principle of uniformity of nature) is “habitual”; that is, it is the product of habit or custom and not of Reason. The status of this principle is then the real issue that Hume is concerned with.

Hume rarely uses the term “induction,” but when he does use it, it is quite clear that he has in mind something like generalization on the basis of observing cases or instances. But on one occasion, in his Enquiry Concerning the Principles of Morals, he says something more:

There has been a controversy started of late, much better worth examination, concerning the general foundation of MORALS; whether they be derived from REASON, or from SENTIMENT; whether we attain the knowledge of them by a chain of argument and induction, or by an immediate feeling and finer internal sense. (1751: 170)

It seems that Hume contrasts “induction” to argument (demonstration); hence he seems to take it to be an inferential process based on experience.

With this in mind, let us discuss Hume’s “problem of induction.” In the Treatise, Hume aims to discover the locus of the idea of necessary connection, which is taken to be part of the idea of causation. One of the central questions he raises is this: “Why we conclude, that such particular causes must necessarily have such particular effects; and what is the nature of that inference we draw from the one to the other, and of the belief we repose in it?” (1739: 78).

When it comes to the inference from cause to effect, Hume’s approach is captivatingly simple. We have memory of past co-occurrences of types of events C and E, where Cs and Es have been directly perceived, or remembered to have been perceived. This co-occurrence is “a regular order of contiguity and succession” among tokens of C and tokens of E. (1739: 87) So, when in a fresh instance we perceive or remember a C, we “infer the existence” of an E. Although in all past instances of co-occurrence, both Cs and Es “have been perceiv’d by the senses and are remember’d,” in the fresh instance, E is not yet perceived, but its idea is nonetheless “supply’d in conformity to our past experience” (ibid.). He then adds: “Without any further ceremony, we call the one [C] cause and the other [E] effect, and infer the existence of the one from that of the other” (ibid.). What is important in this process of causal inference is that it reveals “a new relation betwixt cause and effect,” a relation that is different from contiguity, succession and necessary connection, namely, constant conjunction. It is this “CONSTANT CONJUNCTION” (1739: 87) that is involved in our “pronouncing” a sequence of events to be causal. Hume says that contiguity and succession “are not sufficient to make us pronounce any two objects to be cause and effect, unless we perceive, that these two relations are preserv’d in several instances” (ibid.). The “new relation” (constant conjunction) is a relation among sequences of events. Its content is captured by the claim: “Like objects have always been plac’d in like relations of contiguity and succession.” (1739: 88)

Does that mean that Hume identifies the sought-after necessary connection with the constant conjunction? By no means! The observation of a constant conjunction generates no new impression in the objects perceived. Hume points out that the mere multiplication of sequences of tokens of C being followed by tokens of E adds no new impressions to those we have had from observing a single sequence. Observing, for instance, a single collision of two billiard balls, we have impressions of the two balls, of their collision, and of their flying apart. These are exactly the impressions we have no matter how many times we repeat the collision of the balls. The impressions we had from the single sequence did not include any impression that would correspond to the idea of necessary connection. But since the observation of the multiple instances generates no new impressions in the objects perceived, it cannot possibly add a new impression which might correspond to the idea of necessary connection. As Hume puts it:

From the mere repetition of any past impression, even to infinity, there never will arise any new original idea, such as that of necessary connexion; and the number of impressions has in this case no more effect than if we confin’d ourselves to one only. (1739: 88)

The reason why constant conjunction is important (even though it cannot directly account for the idea of necessary connection by means of an impression) is that it is the source of the inference we make from causes to effects. Looking more carefully at this inference might cast some new light on what exactly is involved when we call a sequence of events causal. As he put it: “Perhaps ‘twill appear in the end, that the necessary connexion depends on the inference, instead of the inference’s depending on the necessary connexion.” (1739: 88)

c. Taking a Closer Look at Causal Inference

The inference of which Hume wants to unravel the “nature” is this: “After the discovery of the constant conjunction of any objects, we always draw an inference from one object to another.” (1739: 88) This, it should be noted, is what might be called an inductive inference. To paraphrase what Hume says, its form is:

(I)

(CC): A has been constantly conjoined with B (that is, all As so far have been followed by Bs)

(FI): a is A (a fresh instance of A)

Therefore, a is B (the fresh instance of A will be followed by a fresh instance of B).

Hume’s target is all those philosophers who think that this kind of inference is (or should be) demonstrative. In particular, his target is all those who think that the fresh instance of A must necessarily be followed by a fresh instance of B. Recall his question cited above: “Why we conclude, that such particular causes must necessarily have such particular effects.”

What, he asks, determines us to draw inference (I)? If it were Reason that determined us, then this would have to be a demonstrative inference: the conclusion would have to follow necessarily from the premises. But then an extra premise would be necessary, namely, “Instances, of which we have had no experience, must resemble those, of which we have had experience, and that the course of nature continues always uniformly the same” (ibid.).

Let us call this the Principle of Uniformity of Nature (PUN). If indeed this principle were added as an extra premise to (I), then the new inference:

(PUN-I)

(CC): A has been constantly conjoined with B (that is, all As so far have been followed by Bs)

(FI): a is A (a fresh instance of A)

(PUN): The course of nature continues always uniformly the same.

Therefore, a is B (the fresh instance of A will be followed by a fresh instance of B).

would be demonstrative and the conclusion would necessarily follow from the premises. Arguably then, the logical necessity by means of which the conclusion follows from the premises would mirror the natural necessity by means of which causes bring about the effects (a thought already prevalent in Aristotle). But Hume’s point is that for (PUN-I) to be a sound argument, PUN needs to be provably true. There are two options here.

The first is that PUN is proved itself by a demonstrative argument. But this, Hume notes, is impossible since “We can at least conceive a change in the course of nature; which sufficiently proves that such a change is not absolutely impossible.” (1739: 89) Here what does the work is Hume’s separability principle, namely, that if we can conceive A without conceiving B, then A and B are distinct and separate entities and one cannot be inferred from the other. Hence, since one can conceive the idea of past constant conjunction without having to conceive the idea of the past constant conjunction being extended in the future, these two ideas are distinct from each other. So, PUN cannot be demonstrated a priori by pure Reason. It is not a conceptual truth, nor a principle of Reason.

The other option is that PUN is proved by recourse to experience. But, Hume notes, any attempt to base the Principle of Uniformity of Nature on experience would be circular. From the observation of past uniformities in nature, it cannot be inferred that nature is uniform, unless it is assumed what was supposed to be proved, namely, that nature is uniform,  that there is “a resemblance betwixt those objects, of which we have had experience [i.e. past uniformities in nature] and those, of which we have had none [i.e. future uniformities in nature].” (1739: 90) In his first Enquiry, Hume is even more straightforward: “To endeavour, therefore the proof of this last supposition [that the future will be conformable to the past] by probable arguments, or arguments regarding existence, must evidently be going in a circle, and taking that for granted, which is the very point in question.” (1748: 35-6) As he explains in his Treatise, “The same principle cannot be both the cause and effect of another.” (1739: 89-90) PUN would be the “cause” (read: “premise”) for the “presumption of resemblance” between the past and the future, but it would also be the “effect” (read: “conclusion”) of the “presumption of resemblance” between the past and the future.

d. Causal Inference is Non-Demonstrative

What then is Hume’s claim? It is that (PUN-I) cannot be a demonstrative argument. Neither Reason alone, nor Reason “aided by experience” can justify PUN, which is necessary for (PUN-I) being demonstrative. Hence, causal inference—that is (I) above—is genuinely non-demonstrative.
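The contrast between (I) and (PUN-I) can be illustrated by brute-force search over small toy worlds. The sketch below rests on a strong simplifying assumption: PUN is read as the conditional "if every observed A is B, then every A whatsoever is B." That reading, the two-observed-plus-one-fresh setup, and all function names are introduced here for illustration only.

```python
from itertools import product

def models():
    """Yield every toy world of two observed individuals plus one fresh instance."""
    # Each individual is a pair (is_A, is_B); the last pair is the fresh, unobserved instance.
    for flags in product([True, False], repeat=6):
        observed = [(flags[0], flags[1]), (flags[2], flags[3])]
        fresh = (flags[4], flags[5])
        yield observed, fresh

def cc(observed):
    # (CC): every observed A has been B
    return all(b for a, b in observed if a)

def fi(fresh):
    # (FI): the fresh instance is an A
    return fresh[0]

def pun(observed, fresh):
    # Assumed reading of (PUN): if every observed A is B, then every A whatsoever is B
    everything = observed + [fresh]
    return (not cc(observed)) or all(b for a, b in everything if a)

def has_countermodel(use_pun):
    """Search for a world where the premises hold but the fresh A is not B."""
    for observed, fresh in models():
        premises = cc(observed) and fi(fresh)
        if use_pun:
            premises = premises and pun(observed, fresh)
        if premises and not fresh[1]:
            return True
    return False

print(has_countermodel(use_pun=False))  # True: (I) is not demonstrative
print(has_countermodel(use_pun=True))   # False: (PUN-I) is demonstrative
```

The search shows only that adding PUN as a premise buys validity. It is silent on the question Hume presses, which is whether PUN itself can be established without circularity.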

Hume summed up this point as follows:

Thus not only our reason fails us in the discovery of the ultimate connexion of causes and effects, but even after experience has inform’d us of their constant conjunction, ‘tis impossible for us to satisfy ourselves by our reason, why we shou’d extend that experience beyond those particular instances, which have fallen under our observation. We suppose, but are never able to prove, that there must be a resemblance betwixt those objects, of which we have had experience, and those which lie beyond the reach of our discovery. (1739: 91-92)

Note well Hume’s point: “We suppose, but are never able to prove” the uniformity of nature. Indeed, Hume goes on to add that there is causal inference in the form of (I), but it is not (and cannot be) governed by Reason, but “by certain principles, which associate together the ideas of these objects, and unite them in the imagination” (1739: 92). These principles are general psychological principles of resemblance, contiguity and causation by means of which the mind works. Hume is adamant that the “supposition” of PUN “is deriv’d entirely from habit, by which we are determin’d to expect for the future the same train of objects, to which we have been accustom’d.” (1739: 134)

Hume showed that (I) is genuinely non-demonstrative. In summing up his view, he says:

According to the hypothesis above explain’d [his own theory] all kinds of reasoning from causes or effects are founded on two particulars, viz. the constant conjunction of any two objects in all past experience, and the resemblance of a present object to any one of them. (1739: 142)

In effect, Hume says that (I) supposes (but does not explicitly use) a principle of resemblance (PUN).

It is a nice question in what sense Hume’s approach is skeptical. For Hume does not deny that the mind is engaged in inductive inferences; he denies that these inferences are governed by Reason. To see the sense in which this is a skeptical position, let us think of someone who would reply to Hume by saying that there is more to Reason’s performances than demonstrative arguments. The thought could be that there is a sense in which Reason governs non-demonstrative inference according to which the premises of a non-demonstrative argument give us good reasons to rationally accept the conclusion. Argument (I) above is indeed genuinely non-demonstrative, but there is still a way to show that it offers reasons to accept the conclusion. Suppose, for instance, that one argued as follows:

(R-I)

(CC): A has been constantly conjoined with B (that is, all As so far have been followed by Bs)

(FI): a is A (a fresh instance of A)

(R): CC and FI are reasons to believe that a is B

Therefore, (probably) a is B (the fresh instance of A will be followed by a fresh instance of B).

Following Stroud (1977: 59-65), it can be argued that Hume’s reaction to this would be that principle (R) cannot be a good reason for the conclusion. Not because (R) is not a deductively sufficient reason, but because any defense of (R) would be question-begging in the sense noted above. To say, as (R) in effect does, that a past constant conjunction between As and Bs is reason enough to make the belief in their future constant conjunction reasonable is just to assume what needs to be defended by further reason and argument.

Be that as it may, Hume’s so-called inductive skepticism is a corollary of his attempt to show that the idea of necessary connection cannot stem from the supposed necessity that governs causal inference. For, whichever way you look at it, talk of necessity in causal inference is unfounded.

e. Against Natural Necessity

In the Abstract, Hume considers a billiard-ball collision which is “as perfect an instance of the relation of cause and effect as any which we know, either by sensation or reflection” (1740: 649) and suggests we examine it. He concludes that experience dictates three features of the cause-effect relation: contiguity in time and place; priority of the cause in time; constant conjunction of the cause and the effect; and nothing further. However, as we have already seen, Hume did admit that, over and above these three features, causation involves necessary connection of the cause and the effect.

The view that causation implies necessary connections between distinct existences had been the dominant one ever since Aristotle put it forward. It was tied to the idea that things possess causal powers, where power is “a principle of change in something else or in itself qua something else.” Principles are causes, hence powers are causes. Powers are posited for explanatory reasons—they are meant to explain activity in nature: change and motion. Action requires agency. For X to act on Y, X must have the (active) power to bring a change to Y, and Y must have the (passive) power to be changed (in the appropriate way) by X. Powers have modal force: they ground facts about necessity and possibility. Powers necessitate their effects: when a (natural) power acts (at some time and in the required way), and if there is “contact” with the relative passive power, the effect necessarily (that is, inevitably) follows. Here is Aristotle’s example: “And that that which can be hot must be made hot, provided the heating agent is there, i.e. comes near.” (324b8) (1985: 530)

i. Malebranche on Necessity

Before Hume, Father Nicolas Malebranche had emphatically rejected as “pure chimera” the idea that things have natural powers in virtue of which they necessarily behave the way they do. When someone says that, for instance, the fire burns by its nature, they do not know what they mean. For him, the very notion of such a “force,” “power,” or “efficacy,” was completely inconceivable: “Whatever effort I make in order to understand it, I cannot find in me any idea representing to me what might be the force or the power they attribute to creatures.” (1674-5: 658) Moreover, he challenged the view that there are necessary connections between worldly existences (either finite minds or bodies) based on the claim that the human mind can only perceive the existence of a necessary connection between God’s Will and his willed actions. In a famous passage in his La Recherche de la Vérité, he noted:

A true cause as I understand it is one such that the mind perceives a necessary connection between it and its effect. Now the mind perceives a necessary connection between the will of an infinite being and its effect. Therefore, it is only God who is the true cause and who truly has the power to move bodies. (1674-5: 450)

Drawing a distinction between real causes and natural causes (or occasions), he claimed that natural causes are merely the occasions on which God causes something to happen, typically by general volitions which are the laws of nature. Malebranche and, following him, a number of radical thinkers argued that a coherent Cartesianism should adopt occasionalism, namely, the view that a) bodies lack motor force and b) God acts on nature via general laws. Since, according to Cartesianism, a body’s nature is exhausted by its extension, Malebranche argued, bodies cannot have the power to move anything, and hence to cause anything to happen. He added, however, that precisely because causality involves a necessary connection between the cause and the effect, and since no such necessary connection is perceived in cases of alleged worldly causality (where, for instance, it is said that a billiard ball causes another one to move), there is no worldly causality: all there is in the world is regular sequences of events, which, strictly speaking, are not causal. Hume, as is well known, was very much influenced by Malebranche, to such an extent that Hume’s own approach can be described as Occasionalism minus God.

ii. Leibniz on Induction

But by the time of Hume’s Treatise, causal powers and necessary connections had been resuscitated by Leibniz. He distinguished between two kinds of necessity. Some principles are necessary because opposing them implies a contradiction. This is what he called “logical, metaphysical or geometrical” necessity. In Theodicy he associated this kind of necessity with the “‘Eternal Verities’, which are altogether necessary, so that the opposite implies contradiction.” But both in Theodicy and the New Essays on Human Understanding (which were composed at roughly the same time), he spoke of truths which are “only necessary by a physical necessity.” (1896: 588) These are not absolutely necessary in that they can be denied without contradiction. And yet they are necessary because, ultimately, they are based on the wisdom of God. In Theodicy Leibniz says that we learn these principles either a posteriori based on experience or “by reason and a priori, that is, by considerations of the fitness of things which have caused their choice.” (1710: 74) In the New Essays he states that these principles are known by Induction, and hence that physical necessity is “founded upon induction from that which is customary in nature, or upon natural laws which, so to speak, are of divine institution.” (1896: 588) Physical necessity constitutes the “order in Nature” and “lies in the rules of motion and in some other general laws which it pleased God to lay down for things when he gave them being.” (1710: 74) So, denying these principles entails that nature is disorderly (and hence unknowable).

Leibniz does discuss Induction in various places in his corpus. In his letter to Queen Sophie Charlotte of Prussia, On what is Independent of Sense and Matter in 1702, he talks of “simple induction,” and claims that it can never assure us of the “perfect generality” of truth arrived at by it. He notes: “Geometers have always held that what is proved by induction or by example in geometry or in arithmetic is never perfectly proved.” (1989: 190) To be sure, in this particular context, he wants to make the point that mathematical truths are truths of reason, known either a priori or by means of demonstration. But his point about induction is perfectly general. The “senses and induction” as he says, “can never teach us truths that are fully universal, nor what is absolutely necessary, but only what is, and what is found in particular examples.” (1989: 191) Since, however, Leibniz does not doubt that “We know universal and necessary truth in the sciences,” there must be a way of knowing them which is non-empirical. They are known by “an inborn light within us;” we have “derived these truths, in part, from what is within us” (ibid.).

In his New Essays, he allows that “Propositions of fact can also become general,” by means of “induction or observation.” For instance, he says, we can find out by Induction that “All mercury is evaporated by the action of fire.” But Induction, he thought, can never deliver more than “a multitude of similar facts.” In the mercury case, the generality achieved is never perfect, the reason being that “We can’t see its necessity.” For Leibniz, only Reason can come to know that a truth is necessary: “Whatever number of particular experiences we may have of a universal truth, we could not be assured of it forever by induction without knowing its necessity through the reason.” (1896: 81)

For Leibniz, Induction, therefore, suffers from an endemic “imperfection.” But what exactly is the problem? In an early unpublished piece (Preface to an Edition of Nizolius, 1670), Leibniz offers perhaps his most systematic treatment of the problem of the imperfection of Induction.

The problem: Induction is essentially incomplete.

(1) Perfectly universal propositions can never be established on this basis [through collecting individuals or by induction] because “You are never certain in induction that all individuals have been considered.” (1989a: 129)

(2) Since, then, “No true universality is possible, it will always remain possible that countless other cases which you have not examined are different” (ibid.).

However, the following objection may be put forward: from the fact that entity A with nature N has regularly caused B in the past, we infer (with moral certainty) that universally entity A with nature N causes B. As Leibniz put it:

“Do we not say universally that fire, that is, a certain luminous, fluid, subtle body, usually flares up and burns when wood is kindled, even if no one has examined all such fires, because we have found it to be so in those cases we have examined?” (op.cit.)

“We infer from them, and believe with moral certainty, that all fires of this kind burn and will burn you if you put your hand to them.” (op.cit.)

Leibniz endorses this objection, and hence he does not aim to discredit Induction. Rather, he aims to ground it properly by asking what is the basis for true universality? What is the basis for blocking the possibility of exceptions?

Leibniz’s reply is that the ground for true universality is the (truly universal) principle that nature is uniform. But this principle cannot itself depend on Induction, because that would lead to an infinite regress, and moral certainty would not be possible.

Induction yields at best moral (and not perfect) certainty. But this moral certainty:

Is not based on induction alone and cannot be wrested from it by main force but only by the addition or support of the following universal propositions, which do not depend on induction but on a universal idea or definition of terms:

(1) if the cause is the same or similar in all cases, the effect will be the same or similar in all;

(2) the existence of a thing which is not sensed is not assumed; and, finally,

(3) whatever is not assumed, is to be disregarded in practice until it is proved.

From these principles arises the practical or moral certainty of the proposition that all such fire burns…. (op.cit.)

So here is how we would reason “inductively” according to Leibniz.

(L)

Fires have so far burned.

Hence, (with moral certainty) “All fire burns.”

This inference rests on “the addition or support” of the universal proposition (1): “If the cause is the same or similar in all cases, the effect will be the same or similar in all.” In making this inference, we do not assume anything about fires we have not yet seen or touched (hence, we do not beg the question concerning unseen fires); instead, we prove something about unseen fires, namely, that they too burn.

Note Leibniz’s reference to the “addition or support” of proposition (1), which amounts to a Uniformity Principle. We may think of (L) as an elliptical demonstrative argument which requires the addition of (1), or we can think of it as a genuine inductive argument, “supported” by a Uniformity principle. In either case, the resulting generalization is naturally necessary, and hence truly universal, though the supporting uniformity principle is not metaphysically necessary. The resulting generalization (“All fire burns”) is known by “practical or moral certainty,” which rests on the three principles supplied by Reason.

It is noteworthy that Leibniz is probably the first to note explicitly that any attempt to justify the required principles by means of Induction would lead to an infinite regress, since if these principles were to be arrived at by Induction, further principles would be required for their derivation, “and so on to infinity, and moral certainty would never be attained.” (1989a: 130) So, these principles are regress-stoppers, and for them to play this role they cannot be inductively justified.

Let us be clear on Leibniz’s “problem of induction”: Induction is required for learning from experience, but experience cannot establish the universal necessity of a principle, which requires the uniform course of nature. If Induction is to be possible, it must be based on principles which are not founded on experience. It is Reason that supplies the missing rationale for Induction by providing the principles that are required for the “connection of the phenomena.” (1896: 422) Natural necessity is precisely this “connection of the phenomena” that Reason supplies and makes Induction possible.

Though Induction (suitably aided by principles of reason) can and does lead to moral certainty about matters of fact, only demonstrative knowledge is knowledge proper. And this, Leibniz concludes, can only be based on reason and the Principle of Non-Contradiction. But this is precisely the problem. For if this is the standard of knowledge, then even the basic principles by means of which induction can yield moral certainty cannot be licensed by the Principle of Non-Contradiction. So, the space is open for an argument to the effect that they are not, properly speaking, principles of reason.

f. Can Powers Help?

It is no accident, then, that Hume takes pains to show that the Principle of Uniformity of Nature is not a principle of Reason. What is even more interesting is that Hume makes an extra effort to block an attempt to offer a certain metaphysical foundation to the Principle of Uniformity of Nature based on the claim that so-called physically necessary truths are made true by the causal powers of things. Here is how this metaphysical grounding would go: a certain object A has the power to produce an object B. If this were the case, then the necessity of causal claims would be a consequence of a power-based ontology, according to which “The power necessarily implies the effect.” (1739: 90) Hume even allowed that the positing of powers might be based on experience in the following sense: after having seen A and B being constantly conjoined, we conclude that A has the power to produce B. Either way, the relevant inference would become thus:

(P-I)

(CC): A has been constantly conjoined with B (that is, all As so far have been followed by Bs)

(P): A has the power to produce B

(FI): a is A (a fresh instance of A)

Therefore, a is B (the fresh instance of A will be followed by a fresh instance of B).

Here is how Hume put it: “The past production implies a power: The power implies a new production: And the new production is what we infer from the power and the past production.” (1739: 90) If this argument were to work, PUN would be grounded in the metaphysical structure of the world, and, more particularly, in powers and their productive relations with their effects. Hume’s strategy against this argument is that even if powers were allowed (a thing with which Hume disagrees), (P-I) would be impotent as a demonstrative argument since it would require proving that powers are future-oriented (namely, that a power which has been manifested in a certain manner in the past will continue to manifest itself in the same way in the future), and this is a claim that neither reason alone nor reason aided with experience can prove.

g. Where Does the Idea of Necessity Come From?

Hume then denies necessity in the workings of nature. He criticizes Induction insofar as it is taken to be related to PUN, that is, insofar as it was meant to yield (naturally) necessary truths, based on Reason and past experiences. Here is how he summed it up:

That it is not reasoning which engages us to suppose the past resembling the future, and to expect similar effects from causes, which are, to appearance, similar. This is the proposition which I intended to enforce in the present section. (1748: 39)

Instead of being products of Reason, “All inferences from experience, therefore, are effects of custom.” (1748: 43)

For Hume, causality, as it is in the world, is regular succession of event-types: one thing invariably following another. His famous first definition of causality runs as follows:

We may define a CAUSE to be “An object precedent and contiguous to another, and where all the objects resembling the former are plac’d in like relations of precedency and contiguity to those objects, that resemble the latter.” (1739: 170)

And yet, Hume agrees that not only do we have the idea of necessary connection, but also that it is part of the concept of causation. As noted already, it would be wrong to think that Hume identified the necessary connection with the constant conjunction. After all, the observation of a constant conjunction generates no new impression in the objects perceived. What it does do, however, is cause a certain feeling of determination in the mind. After a point, the mind does not treat the repeated and sequence-resembling phenomenon of tokens of C being followed by tokens of E as independent anymore—the more it perceives, the more determined it is to expect that they will occur again in the future. This determination of the mind is the source of the idea of necessity and power: “The necessity of the power lies in the determination of the mind…” Hence, the alleged natural necessity is something that exists only in the mind, not in nature! Instead of ascribing the idea of necessity to a feature of the natural world, Hume took it to arise from within the human mind when it is conditioned by the observation of a regularity in nature to form an expectation of the effect when the cause is present. Indeed, Hume offered a second definition of causality: “A CAUSE is an object precedent and contiguous to another, and so united with it, that the idea of the one determines the mind to form the idea of the other, and the impression of the one to form a more lively idea of the other.” (1739: 170) Hume thought that he had unpacked the “essence of necessity”: it “is something that exists in the mind, not in the objects.” (1739: 165) He claimed that the supposed objective necessity in nature is spread by the mind onto the world. Hume can be seen as offering an objective theory of causality in the world (since causation amounts to regular succession), which was however accompanied by a mind-dependent view of necessity.

3. Kant on Hume’s Problem

Kant, rather bravely, acknowledged in the Prolegomena that “The remembrance of David Hume was the very thing that many years ago first interrupted my dogmatic slumber and gave a completely different direction to my researches in the field of speculative philosophy.” (1783: 10) In fact, his magnum opus, the Critique of Pure Reason, was “the elaboration of the Humean problem in its greatest possible amplification.”

a. Hume’s Problem for Kant

But what was Hume’s problem for Kant? It was not inductive skepticism and the like. Rather, it was the origin and justification of necessary connections among distinct and separate existences. Hume, Kant noted, “indisputably proved” that Reason cannot be the foundation of the judgment that “Because something is, something else necessarily must be.” (B 288) But that is exactly what the concept of causation says. Hence, the very idea of causal connections, far from being introduced a priori, is the “bastard” of imagination and experience which, ultimately, disguises mere associations and habits as objective necessities.

Kant took it upon himself to show that the idea of necessary connections is a synthetic a priori principle and hence that it has “an inner truth independent of all experience.” Synthetic a priori truths are not conceptual truths of reason; rather, they are substantive claims which are necessary and are presupposed for the very possibility of experience. Kant tried to demonstrate that the principle of causality, namely, “Everything that happens, that is, begins to be, presupposes something upon which it follows by rule” (A 189), is a precondition for the very possibility of objective experience.

He took the principle of causality to be a requirement for the mind to make sense of the temporal irreversibility in certain sequences of impressions. So, whereas we can have the sequence of impressions that correspond to the sides of a house in any order we please, the sequence of impressions that correspond to a ship going downstream cannot be reversed: it exhibits a certain temporal order (or direction). This temporal order by which certain impressions appear can be taken to constitute an objective happening only if the later event is taken to be necessarily determined by the earlier one (that is, to follow by rule from its cause). For Kant, objective events are not “given”: they are constituted by the organizing activity of the mind and, in particular, by the imposition of the principle of causality on the phenomena. Consequently, the principle of causality is, for Kant, a synthetic a priori principle.

b. Kant on Induction

What about Induction then? Kant distinguished between two kinds of universality when it comes to judgements (propositions): strict and comparative. Comparative universal propositions are those that derive from experience and are made general by Induction. An inductively arrived at proposition is liable to exceptions; it comes with the proviso, as Kant put it: “As far as we have yet perceived, there is no exception to this or that rule.” (B 4) Strictly universal propositions are thought of without being liable to any exceptions. Hence, they are not derived from experience or by induction. Rather, as Kant put it, they are “valid absolutely a priori.” That is an objective distinction, Kant thought, which we discover rather than invent. Strictly universal propositions are essentially so. For Kant, strict universality and necessity go together, since experience can teach us how things are but not that they could not be otherwise. Hence, strictly universal propositions are necessary propositions, while comparatively universal propositions are contingent. Necessity and strict universality are then the marks of a priority, whereas comparative universality and contingency are the marks of empirical-inductive knowledge. Naturally, Kant is not a sceptic about inductive knowledge; yet he wants to demarcate it properly from a priori knowledge: “[Rules] cannot acquire anything more through induction than comparative universality, i.e., widespread usefulness.” (A92/B124) It follows that the concept of cause “must be grounded completely a priori in the understanding,” precisely because experience can only show a regular succession of events A and B, and never that event B must follow from A. As Kant put it: “To the synthesis of cause and effect there [the rule] attaches a dignity that can never be expressed empirically, namely, that the effect does not merely come along with the cause, but is posited through it and follows from it.” (A91/B124)

Not only is there not a Problem of Induction in Kant, but he discussed Induction in his various lectures on Logic. In the so-called Blomberg Logic (dating back to the early 1770s) he noted of Induction that it is indispensable (“We cannot do without it”) and that it yields knowledge (were we to abolish it, “Along with it most of our cognitions would have to be abolished at the same time”), despite the fact that it is non-demonstrative. Induction is a kind of inference where “We infer from the particular to the universal.” (1992: 232) It is based on the following rule: “What belongs to as many things as I have ever cognized must also belong to all things that are of this species and genus.” Natural kinds have properties shared by all of their members; hence if a property P has been found to be shared by all examined members of kind K, then the property P belongs to all members of K.

Now, a principle like this is fallible, as Kant knew very well. Not all properties of an individual are shared by all of its fellow kind members; only those that are constitutive of the kind. But what are they? It was partly to highlight this problem that Kant drew the distinction between “empirical universality” (what in the Critique he called “comparative universality”) and “rational” or “strict” universality, in which a property is attributed to all things of a kind without the possibility of exception. For instance, the judgment “All matter is extended” is rationally universal whereas the judgement “All matter has weight” is empirically universal. All and only empirically universal propositions are formed by Induction; hence they are uncertain. And yet, as already noted, Induction is indispensable, since “Without universal rules we cannot draw a universal inference.” (1992: 409) In other words, if our empirical knowledge is to be extended beyond the past and the seen, we must rely on Induction (and analogy). They are “inseparable from our cognitions, and yet errors for the most part arise from them.” Induction is a fallible “crutch” to human understanding.

Later on, this “crutch” was elevated to the “reflective power of judgement.” In his third Critique (Critique of Judgement) Kant focused on the power of judgement, where judgement is a cognitive faculty, namely, that of subsuming the particular under the universal. The power of judgement is reflective, as opposed to determining, when the particular is known and the universal (the rule, the law, the principle) is sought. Hence, the reflective power of judgement denotes the inductive use of judgement, that is, looking for laws or general principles under which the particulars can be subsumed. These laws will never be known with certainty; they are empirical laws. But, as Kant admits, they can be tolerated in empirical natural science. Uncertainty in pure natural science, as well as in metaphysics, of course cannot be tolerated. Hence, knowledge proper must be grounded in the apodictic certainty of synthetic a priori principles, such as the causal maxim. Induction can only be a crutch for human reason and understanding, but, given that we (are bound to) learn from experience, it is an indispensable crutch.

4. Empiricist vs Rationalist Conceptions of Induction (After Hume and Kant)

a. Empiricist Approaches

i. John Stuart Mill: “The Problem of Induction”

It might be ironic that John Stuart Mill was the first who spoke of “the problem of Induction.” (1879: 228) But by this he meant the problem of distinguishing between good and bad inductions. In particular, he thought that there are cases in which a single instance might be enough for “a complete induction,” whereas in other cases, “Myriads of concurring instances, without a single exception known or presumed, go such a very little way towards establishing an universal proposition.” Solving this problem, Mill suggested, amounts to solving the Problem of Induction.

Mill took Induction to be both a method of generating generalizations and a method of proving they are true. In his System of Logic, first published in 1843, he defined Induction as “The operation of discovering and proving general propositions” (1879: 208). As a nominalist, he thought that “generals,” which many of his predecessors had thought of as universals, are collections of particulars “definite in kind but indefinite in number.” So, Induction is the operation of discovering and proving relations among (members of) kinds, where kinds are taken to be characterized by relations of resemblance “in certain assignable respects” among their members. The basic form of Induction, then, is by enumeration: “This and that A are B, therefore every A is B.” The key point behind enumerative Induction is that it cannot be paraphrased as a conjunction of instances. It yields “really general propositions,” namely, propositions in which the predicate is affirmed or denied of “an unlimited number of individuals.” Mill was ready to add that this unlimited number of individuals includes actual and possible instances of a generalization, “existing or capable of existing.” This suggests that inductive generalizations have modal or counterfactual force: if all As are B, then if a were an A it would be a B.

It is then important for Mill to show how Induction acquires this modal force. His answer is tied to his attempt to distinguish between good and bad inductions and connects good inductions with establishing (and latching onto) laws of nature. But there is a prior question to be dealt with, namely, what is the “warrant” for Induction? (1879: 223) Mill makes no reference to Hume when he raises this issue. But he does take it that the root of the problem of the warrant for Induction is the status of the Principle of Uniformity of Nature. This is a principle according to which “The universe, so far as known to us, is so constituted, that whatever is true in any one case, is true in all cases of a certain description; the only difficulty is, to find what description.” (1879: 223)

This, he claims, is “a fundamental principle, or general axiom, of Induction” (1879: 224) and yet, it is itself an empirical principle (a generalization itself based on Induction): “This great generalization is itself founded on prior generalizations.” If this principle were established and true, it could appear as a major premise in all inductions; hence all inductions would turn into deductions. But how can it be established? For Mill there is no other route to it than experience: “I regard it as itself [the Principle of Uniformity of Nature] a generalization from experience.” (1879: 225) Mill claims that the Principle of Uniformity of Nature emerges as a second-order induction over successful first-order inductions, the successes of which support each other and the general principles.

There may be different ways to unpack this claim, but it seems that the most congenial to Mill’s own overall strategy is to note that past successes of inductions offer compelling reasons to believe that there is uniformity in nature. In a lengthy footnote (1879: 407) in which he aimed to tackle the standard objection attributed to Reid and Stewart that experience gives us knowledge only of the past and the present but never of the future, he stressed: “Though we have had no experience of what is future, we have had abundant experience of what was future.” Differently put, there is accumulated future-oriented evidence for uniformity in nature. Induction is not a “leap in the dark.”

In another lengthy footnote, this time in his An Examination of Sir William Hamilton’s Philosophy (1865: 537), he favored a kind of reflective equilibrium justification of PUN. After expressing his dismay at the constant reminder that “The uniformity of the course of nature cannot be itself an induction, since every inductive reasoning assumes it, and the premise must have been known before the conclusion,” he stressed that those who are moved by this argument have missed the point of the continuous “giving and taking, in respect of certainty” between PUN and “all the narrower truths of experience,” that is, all first-order inductions. This “reciprocity” mutually enhances the certainty of the PUN and the certainty of first-order inductions. In other words, first-order inductions support PUN, but having been supported by them, PUN, in its turn, “raises the proof of them to a higher level.”

ii. Mill on Enumerative Induction

Recall that in formulating the Principle of Uniformity of Nature, Mill takes it to be a principle about the “constitution” of the universe, being such that it contains regularities: “Whatever is true in any one case, is true in all cases of a certain description.” But he meaningfully adds: “The only difficulty is, to find what description.” This should be taken to imply that the task of inductive logic is to find the regularities there are in the universe, and that this task is not as obvious as it may sound, since finding the kinds (that is, the description of collections of individuals) that fall under certain regularities is far from trivial and may require extra methods. Indeed, though Mill thinks that enumerative induction is indispensable as a form of reasoning (since true universality in space and time can be had only through it, if one starts from experience, as Mill recommends), he also thinks that various observed patterns in nature may not be as uniform as a simple operation of enumerative induction would imply.

To Europeans, not many years ago, the proposition, All swans are white, appeared an equally unequivocal instance of uniformity in the course of nature. Further experience has proved (…) that they were mistaken; but they had to wait fifty centuries for this experience. During that long time, mankind believed in a uniformity of the course of nature where no such uniformity really existed. (1879: 226)

The “true theory of induction” should aim to find the laws of nature. As Mill says:

Every well-grounded inductive generalization is either a law of nature, or a result of laws of nature, capable, if those laws are known, of being predicted from them. And the problem of Inductive Logic may be summed up in two questions: how to ascertain the laws of nature; and how, after having ascertained them, to follow them into their results. (1879: 231)

The first question—much more significant in itself—requires the introduction of new methods of Induction, namely, methods of elimination. Here is the rationale behind these methods:

Before we can be at liberty to conclude that something is universally true because we have never known an instance to the contrary, we must have reason to believe that if there were in nature any instances to the contrary, we should have known of them. (1879: 227)

Note the counterfactual claim behind Mill’s assertion: enumerative Induction on its own (though ultimately indispensable) cannot yield the modal force required for empirical generalizations that can be deemed laws of nature. What is required are methods which would show how, were there exceptions, they could be (or would have been) found. Given these methods, Induction acquires modal force: in a good induction—that is, in an induction such that if there were negative instances, they would have been found—the conclusion is not just “All As are B”; implicit in it is the further claim: if there were an extra A, it would be B.

iii. Mill’s Methods

These methods are Mill’s famous methods of agreement and difference, which Mill presents as methods of Induction (1879: 284).

Suppose that we know of a factor C, and we want to find out its effect. We vary the factors we conjoin with C and examine what the effects are in each case. Suppose that, in a certain experiment, we conjoin C with A and B, and what follows is abe. Then, in a new experiment, we conjoin C, not with A and B, but with D and F, and what follows is dfe. Both experiments agree only on the factor C and on the effect e. Hence, the factor C is the cause of the effect e. AB is not the cause of e since the effect was present even when AB was absent. Nor is DF the cause of e since e was present when DF was absent. This is then the Method of Agreement. The cause is the common factor in a number of otherwise different cases in which the effect occurs. As Mill put it: “If two or more instances of the phenomenon under investigation have only one circumstance in common, the circumstance in which alone all the instances agree is the cause (or effect) of the given phenomenon.” (1879: 280)

The Method of Difference proceeds in an analogous fashion. Suppose that we run an experiment, and we find that an antecedent ABC has the effect abe. Suppose also that we run the experiment once more, this time with AB only as the antecedent factors. So, factor C is absent. If, this time, we only find the part ab of the effect, that is, if e is absent, we conclude that C was the cause of e. In the Method of Difference, then, the cause is the factor that is different in two cases, which are similar except that in the one the effect occurs, while in the other it does not. In Mill’s words:

If an instance in which the phenomenon under investigation occurs, and an instance in which it does not occur, have every circumstance in common save one, that one occurring only in the former; the circumstance in which alone the two instances differ is the effect, or the cause, or an indispensable part of the cause, of the phenomenon. (1879: 280)
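Read procedurally, the two methods amount to simple set operations on lists of circumstances. The following Python sketch is only an illustration under assumptions that Mill does not state in this form: each instance is represented as a set of antecedent labels, and the function names `method_of_agreement` and `method_of_difference` are hypothetical. It reuses the factors A, B, C, D and F from the example above.

```python
def method_of_agreement(positive_instances):
    """Return the circumstances shared by every instance in which
    the phenomenon occurs (assumes at least one instance)."""
    common = set(positive_instances[0])
    for circumstances in positive_instances[1:]:
        common &= set(circumstances)
    return common


def method_of_difference(positive_instance, negative_instance):
    """Return the circumstances present when the phenomenon occurs
    and missing when it does not.

    The instance without the phenomenon must contain no circumstance
    that is absent from the instance with it; otherwise the two cases
    are not alike enough for the comparison to be informative.
    """
    positive = set(positive_instance)
    negative = set(negative_instance)
    if not negative <= positive:
        raise ValueError("instances differ in more than the removed circumstances")
    return positive - negative


# Effect e follows ABC and also CDF: only C is common to both cases.
agreement = method_of_agreement([{"A", "B", "C"}, {"C", "D", "F"}])

# Effect e follows ABC but not AB: only C distinguishes the cases.
difference = method_of_difference({"A", "B", "C"}, {"A", "B"})

print(agreement, difference)
```

Both calls single out C, as in the example. The sketch only performs the elimination; it says nothing about whether the recorded circumstances are the relevant ones, which is a matter of how the instances were selected and described in the first place.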

It is not difficult to see that what Mill has described are cases of controlled experiments. In such cases, we find causes (or effects) by creating circumstances in which the presence (or the absence) of a factor makes the only difference to the production (or the absence) of an effect. The effect is present (or absent) if and only if a certain causal factor is present (or absent). Mill is adamant that his methods work only if certain metaphysical assumptions are already in place. First, it must be the case that events have causes. Second, it must be the case that events have a limited number of possible causes. In order for the eliminative methods he suggested to work, it must be the case that the number of causal hypotheses considered is relatively small. Third, it must be the case that same causes have same effects, and conversely. Fourth, it must be the case that the presence or absence of causes makes a difference to the presence or absence of their effects. Indeed, Mill (1879: 279) made explicit reference to two “axioms” on which his two Methods depend. The axiom for the Method of Agreement is this:

Whatever circumstances can be excluded, without prejudice to the phenomenon, or can be absent without its presence, is not connected with it in the way of causation. The casual circumstance being thus eliminated, if only one remains, that one is the cause we are in search of: if more than one, they either are, or contain among them, the cause…. (ibid.)

The axiom for the Method of Difference is:

Whatever antecedent cannot be excluded without preventing the phenomenon, is the cause or a condition of that phenomenon: Whatever consequent can be excluded, with no other difference in the antecedent than the absence of the particular one, is the effect of that one. (1879: 280)

What is important to stress is that although only a pair of carefully controlled experiments (or even just a single one) might get us to the causes of certain effects, what, for Mill, makes this inference possible is that causal connections and laws of nature are embodied in regularities, and these ultimately rely on enumerative induction.

iv. Alexander Bain: The “Sole Guarantee” of the Inference from a Fact Known to a Fact Unknown

The Millian Alexander Bain (1818-1903), Professor of Logic in the University of Aberdeen, in his Logic: Deductive and Inductive (1887), undertook the task of explaining the role of the Principle of Uniformity of Nature in Inductive Logic. He took this principle to be the “sole guarantee” of the inference from a fact known to a fact unknown. He claimed that when it comes to uniformities of succession, the Law of Cause and Effect, or Causation, is a version of the PUN: “Every event is uniformly preceded by some other event. To every event there is some antecedent, which happening, it will happen.” (1887: 20) He actually took it that this particular formulation of PUN has an advantage over more controversial modal formulations of the Principle, such as “every effect must have a cause.” The advantage is precisely that this is a non-modal formulation of the Principle in that it states a meta-regularity.

Bain’s treatment of Induction is interesting because he takes it that induction proper should be incomplete: it should not enumerate all relevant instances or facts, because then it would yield a summation and not a proper generalization. For Bain, Induction essentially involves the move from some instances to a generalization because only this move constitutes an “advance beyond” the particulars that prompted the Induction. In fact, the scope of an inductive generalization is sweeping. It involves:

The extension of the concurrence from the observed to the unobserved cases—to the future which has not yet come within observation, to the past before observation began, to the remote where there has been no access to observe. (1887: 232)

And precisely because of this sweeping scope, Induction involves a “leap” which is necessary to complete the process. This leap is “the hazard of Induction,” which is, however, inevitable as “an instrument for multiplying and extending knowledge.” So, Induction has to be completed in the end, in that the generalization it delivers expresses “what is conjoined everywhere, and at all times, superseding for ever the labour of fresh observation.” But it is not completed through enumeration of particulars; rather, the completion is achieved by PUN.

Bain then discusses briefly “a more ambitious form of the Inductive Syllogism” offered by Henry Aldrich and Richard Whately in the Elements of Logic (1860). According to this, a proper Induction has the following form:

The magnets that I have observed, together with those that I have not observed, attract iron.

These magnets are all magnets.

All magnets attract iron.

Bain says that this kind of inference begs the question, since it assumes what needs to be proved, namely, that the unobserved magnets attract iron. As he says: “No formal logician is entitled to lay down a premise of this nature.” (1887: 234)

Does, however, the very same problem not arise for Bain’s PUN? Before we attempt to answer this, let us address a prior question: how many instances are required for a legitimate generalization? Here Bain states what he calls the principle of Universal Agreement, which he takes to be the sole evidence for inductive truth. According to this principle, “We must go through the labour of a full examination of instances, until we feel assured that our search is complete, that if contrary cases existed, they must have been met with.” (1887: 276) Note that the application of this principle does not require exhaustive enumeration; rather, it requires a careful search for negative instances. Once this search has been conducted thoroughly, Bain claims that the generalization can be accepted as true (until exceptions are discovered) based on the further claim that “What has never been contradicted (after sufficient search) is to be received as true.” (1887: 237) This kind of justification is not obvious. But it does point to the view that beliefs are epistemically innocent until proven guilty. It is a reflexive principle in that it urges the active search for counter-instances.

Bain accepts the Millian idea that PUN is “the ultimate major premise of every inductive inference.” (1887: 238) The thought here is that an argument of the following form would be a valid syllogism:

All As observed so far have been B

What has been in the past will continue

Therefore, the unobserved As are B.

What then is the status of PUN itself? Bain takes it to be a Universal Postulate. Following Spencer, he does not take it that a Universal Postulate has to be a logical or conceptual truth. That is, a Universal Postulate does not have to be such that it cannot be denied without contradiction. Rather, he takes it that a Universal Postulate is an ultimate principle on which all reasoning of a sort should be based. As such, it is a Principle such that some might say it begs the question, while others might say that it has to be granted for reasoning to be possible. But this dual stance is exactly what is expected when it comes to ultimate principles. And that is why he thinks that, unlike Aldrich and Whately’s case above, his own reliance on PUN is not necessarily question begging.

Besides, unlike Aldrich and Whately, Bain never asserts indiscriminately that whatever holds of the observed As also holds of the unobserved As. (Recall Aldrich and Whately’s premise above: The magnets that I have observed, together with those that I have not observed, attract iron.) Bain, taking a more cautious stance towards PUN, talks about uniformities as opposed to Uniformity. We have evidence for uniformities in nature, and these are the laws of nature, according to Bain. More importantly, however, we have evidence for exceptions in natural uniformities. This “destructive evidence,” Bain says, entitles us to accept the uniformities for which there has not been found destructive evidence, despite our best efforts to find it. As he put it:

We go forward in blind faith, until we receive a check; our confidence grows with experience; yet experience has only a negative force, it shows us what has never been contradicted; and on that we run the risk of going forward in the same course. (1887: 672)

So PUN—in the form “What has never been contradicted in any known instance (there being ample means and opportunities of search) will always be true”—is an Ultimate Postulate, which, however, is not arbitrary in that there is ample evidence for and lack of destructive evidence against uniformities in nature.

In fact, Bain takes PUN to be an Ultimate Postulate, alongside the Principle of Non-Contradiction. Here is how he puts it:

The fact, generally expressed as Nature’s Uniformity, is the guarantee, the ultimate major premise, of all Induction. ‘What has been, will be’, justifies the inference that water will assuage thirst in after times. We can give no reason, or evidence, for this uniformity; and, therefore, the course seems to be to adopt this as the finishing postulate. And, undoubtedly, there is no other issue possible. We have a choice of modes of expressing the assumption, but whatever be the expression, the substance is what is conveyed by the fact of Uniformity. (1887: 671)

Does that mean that Bain takes it that PUN is justified as a premise to all inductive inference? Strikingly, he takes the issue to be practical as opposed to theoretical. He admits that it can be seen as question begging from the outset but claims that it is a folly to try to avoid this charge by proposing reasons for its justification. For,

If there be a reason, it is not theoretical, but practical. Without the assumption, we could not take the smallest steps in practical matters; we could not pursue any object or end in life. Unless the future is to reproduce the past, it is an enigma, a labyrinth. Our natural prompting is to assume such identity, to believe it first, and prove it afterwards. (1887: 672)

Bain then presages the trend to offer practical or pragmatic “justifications” of Induction.

b. Rationalist Approaches

i. William Whewell on “Collecting General Truths from Particular Observed Facts”

William Whewell (1794-1866) was perhaps the most systematic writer on Induction after Francis Bacon.

1. A Short Digression: Francis Bacon

In his Novum Organum (1620), Bacon spoke of “inductio legitima et vera” in order to characterize his own method. The problem, Bacon thought, lay with the way Induction was supposed to proceed, namely, via simple enumeration without taking “account of the exceptions and distinctions that nature is entitled to.” Having the Aristotelians in mind, he called enumerative Induction “a childish thing” in that it “jumps to conclusions, is exposed to the danger of instant contradiction, observes only familiar things and reaches no result.” (2000: 17) His new form of Induction differed from Aristotle’s (and from that of Bacon’s predecessors in general) in the following: it is a general method for arriving at all kinds of general truths (not just the first principles, but also at the “lesser middle axioms” as he put it); it surveys not only affirmative or positive instances, but also negative ones. It therefore “separate(s) out a nature through appropriate rejections and exclusions.” (2000: 84)

As is well known, Bacon’s key innovation was that he divided his true and legitimate Induction into three stages, only the third of which was Induction. Stage I is experimental and natural history: a complete inventory of all instances of natural things and their effects. Here, observation and experiment rule. Then at Stage II, tables of presences, absences and degrees of comparison are constructed. Finally, Stage III is Induction. Whatever is present when the nature under investigation is present, absent when this nature is absent, and decreases when this nature decreases (and conversely) is the form of this nature.
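The third stage can be pictured as a filter applied to the three tables of Stage II. The sketch below is a toy illustration in Python, not a reconstruction of Bacon’s own procedure: the function name `candidate_forms`, the data layout, and the sample observations are all assumptions made for the example.

```python
def candidate_forms(presences, absences, degrees):
    """Filter features through three tables.

    presences: list of feature sets observed where the nature is present
               (assumed non-empty)
    absences:  list of feature sets observed where the nature is absent
    degrees:   list of (nature_level, {feature: feature_level}) pairs
    """
    # Table of presence: keep what occurs in every positive instance.
    candidates = set.intersection(*map(set, presences))

    # Table of absence: reject what also occurs where the nature is absent.
    for instance in absences:
        candidates -= set(instance)

    # Table of degrees: keep features whose level never falls as the
    # nature increases (a missing feature counts as level 0).
    ordered = sorted(degrees, key=lambda row: row[0])

    def co_varies(feature):
        levels = [features.get(feature, 0) for _, features in ordered]
        return all(a <= b for a, b in zip(levels, levels[1:]))

    return {feature for feature in candidates if co_varies(feature)}


# Hypothetical observations, invented purely for illustration.
presences = [{"motion", "light"}, {"motion", "friction"}]
absences = [{"light"}]
degrees = [(1, {"motion": 1}), (2, {"motion": 3}), (3, {"motion": 5})]

forms = candidate_forms(presences, absences, degrees)
print(forms)
```

With these invented tables only "motion" survives all three rejections. The point of the illustration is that the result is reached by exclusion across the tables rather than by counting positive instances.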

What is really noteworthy is that in denying that all instances have to be surveyed, Bacon reconceptualised how particulars give rise to the universal. By taking a richer view about experience, he did not have to give to the mind a special role in bridging the gap between the particulars and the general.

2. Back to Whewell

Whewell was a central figure of Victorian science. He was among the founders of the British Association for the Advancement of Science, a fellow of the Royal Society, president of the Geological Society, and Master of Trinity College, Cambridge. He was elected Professor of Mineralogy in 1828, and of Moral theology in 1837. Whewell coined the word “scientist” in 1833.

In The Philosophy of the Inductive Sciences, Founded Upon Their History (1840), he took Induction to be the “common process of collecting general truths from particular observed facts,” (1840 v.1: 2) which is such that, as long as it is “duly and legitimately performed,” it yields real substantial truth. Inductive truths are not demonstrative truths. They are “proved, like the guess which answers a riddle, by [their] agreeing with the facts described;” (1840 v.1: 23) they capture relations among existing things and not relations among ideas. They are contingent and not necessary truths. (1840 v.1: 57)

Whewell insisted that experience can never deliver (and justify) necessary truths. Knowledge derived from experience “can only be true as far as experience goes, and can never contain in itself any evidence whatever of its necessity.” (1840 v.1: 166) What is the status of a principle such as “Every event must have a cause”? Of this principle, Whewell argues that it is “rigorously necessary and universal.” Hence, it cannot be based on experience. This kind of principle, which Whewell re-describes as a principle of invariable succession of the form “Every event must have a certain other event invariably preceding it,” is required for inductive extrapolation. Given that we have seen a case of a stone ascending after it was thrown upwards, we have no hesitation to conclude that another stone that will be thrown upwards will ascend. Whewell argues that for this kind of judgement to be possible, the mind should take it that there is a connection between the invariably related events and not a mere succession. And then he concludes that “The cause is more than the prelude, the effect is more than the sequel, of the fact. The cause is conceived not as a mere occasion; it is a power, an efficacy, which has a real operation.” (1840 v.1: 169)

This is a striking observation because it introduces a notion of natural necessity between the cause, qua power, and the effect. But this only accentuates the problem of the status of the principle “Every event must have a cause.” For the latter is supposed to be universal and necessary—logically necessary, that is. The logical necessity which underwrites this principle is supposed to give rise to the natural necessity by means of which the effect follows from the cause. In the end, logical and natural necessity become one. And if necessary truths such as the above cannot be known from experience, how are they known?

In briefly recounting the history of this problem, Whewell noted that it was conceived as the co-existence of two “irreconcilable doctrines”: the one was “the indispensable necessity of a cause of every event,” and the other was “the impossibility of our knowing such a necessity.” (1840 v.1: 172) He paid special attention to the thought of Scottish epistemologists, such as Thomas Brown and Dugald Stewart, that a principle of the form “Every event must have a cause” is an “instinctive law of belief, or a fundamental principle of the human mind.” He was critical of this approach precisely because it failed to explain the necessity of this principle. He contrasted this approach to Kant’s, according to which a principle such as the above is a condition for the possibility of experience, being a prerequisite for our understanding of events as objective events. Whewell’s Kantian sympathies were no secret. As he put it: “The Scotch metaphysicians only assert the universality of the relation; the German attempts further to explain its necessity.” (1840 v.1: 174) But in the end, he chose an even stronger line of response. He took it that the Causal Maxim is such that “We cannot even imagine the contrary”—hence it is a truth of reason, which is grounded in the Principle of Non-Contradiction.

Whewell offered no further explanation of this commitment. In the next paragraph, he assumes a softer line by noting that there are necessary truths concerning causes and that “We find such truths universally established and assented to among the cultivators of science, and among speculative men in general.” (1840 v.1: 180) This is a far cry from the claim that their negation is inconceivable. In fact, Mill was quick to point out that this kind of point amounts to claiming that some habitual associations, after having been entrenched, are given the “appearance of necessary ones.” And that is not something that Mill would object to, provided it was not taken to imply that these principles are absolutely necessary. It is fair to say that, though Whewell was struggling with this point, he wanted to argue that some principles are constitutive of scientific inquiry and that the evidence for this is their universal acceptance. But Mill’s persistent (and correct) point was that if the inconceivability criterion is taken as a strict logical criterion, then the negation of the principles Whewell appeals to is not inconceivable; hence they cannot be absolutely necessary, and that is the end of it.

It was the search for the ground of universal and necessary principles that led Whewell to accept that there are Fundamental Ideas (like the one of cause noted above) which yield universality and necessity. Whewell never doubted that universal and necessary principles are known and that they cannot be known from experience. But Induction proceeds on the basis of experience. Hence, it cannot, on its own, yield universal and necessary truths. The thought, however, is that Induction does play a significant role in generating truths which can then be the premises of demonstrative arguments. According to Whewell, each science grows through three stages. It begins with a “prelude” in which a mass of unconnected facts is collected. It then enters an “inductive epoch” in which useful theories put order to these facts through the creative role of the scientists—an act of “colligation.” Finally, a “sequel” follows where the successful theory is extended, refined, and applied.

ii. Induction as Conception

The key element of Induction, for Whewell, is that it is not a mere generalization of singular facts. The general proposition is not the result of “a mere juxtaposition of the cases” or of a mere conjunction and extension of them. (1840 v.2: 47) The proper Induction introduces a new element, which Whewell calls “conception,” that is actively introduced by the mind and was not there in the observed facts. This conceptual novelty is supposed to exhibit the common property (the universal) under which all the singular facts fall. It is supposed to provide a “Principle of Connexion” of the facts that prompted it but did not dictate it. Whewell’s typical example of a Conception is Kepler’s notion of an ellipse. Observing the motion of Mars and trying to understand it, Kepler did not merely juxtapose the known positions. He introduced the notion of an ellipse, namely, that the motion of Mars is an ellipse. This move, Whewell suggested, was inductive but not enumerative. So, the mind plays an active role in Induction: it does not merely observe and generalize, it introduces conceptual novelties which act as principles of connection. In this sense, the mind does not have to survey all instances. Insofar as it invents the conception that connects them, it is entitled to the generalization. Whewell says:

In each inference made by Induction, there is introduced some General Conception, which is given, not by the phenomena, but by the mind. The conclusion is not contained in the premises, but includes them by the introduction of a New Generality. In order to obtain our inference, we travel beyond the cases which we have before us; we consider them as mere exemplifications of some Ideal Case in which the relations are complete and intelligible. We take a Standard, and measure the facts by it; and this Standard is constructed by us, not offered by Nature. (1840 v.2: 49)

Induction is then genuinely ampliative: not only does it go beyond the observed instances, but it introduces new conceptual content as well, which is not directly suggested by the observed instances. Whewell calls this type of ampliation “superinduction,” because “There is some Conception superinduced upon the Facts” and takes it that this is the proper understanding of Induction. So, proper Induction requires, as he put it, “an idea from within, facts from without, and a coincidence of the two.” (1840 v.2: 619)

c. The Whewell-Mill Controversy

Whewell takes it that this dual aspect is his own important contribution to the Logic of Induction. His account of Induction landed him in a controversy with Mill. Whewell summarized his views and criticized Mill in a little book titled Of Induction, with a Special Reference to John Stuart Mill’s System of Logic, which appeared in 1849. In this, he first stressed the basic elements of his own views. More specifically: Reason plays an ineliminable role in Induction, since Induction requires the Mind’s conscious understanding of the general form under which the individual instances are subsumed. Hence, Whewell insists, Induction cannot be based on instinct, since the latter operates “blindly and unconsciously in particular cases.” The role of Mind is indispensable, he thought, in inventing the right “conception.” Once this is hit upon by the mind, the facts “are seen in a new point of view.” This point of view puts the facts (the inductive basis) in a certain unity and order. Before the conception, “The facts are seen as detached, separate, lawless; afterwards, they are seen as connected, simple, regular; as parts of one general fact, and thereby possessing innumerable new relations before unseen.” (1849: 29) The point here is that the conception is supposed to bridge the gap between the various instances and the generalization; it provides the universal under which all particular instances, seen and unseen, are subsumed.

Mill objected to this view that what Whewell took to be a proper Induction was a mere description of the facts. The debate was focused on Kepler’s first law, namely, that all planets move in ellipses (or, for that matter, that Mars describes an ellipse). We have already seen Whewell arguing that the notion of “ellipse” is not to be found in the facts of Mars’s motion around the sun. Rather, it is a new point of view, a new conception introduced by the mind, and it is such that it provided a “principle of connexion” among the individual facts, that is, the various positions of Mars in the firmament. This “ellipse,” Whewell said, is superinduced on the facts, and this superinduction is an essential element of Induction.

i. On Kepler’s Laws

For Mill, when Kepler introduced the concept of “ellipse” he described the motion of Mars (and of the rest of the planets). Whewell had used the term “colligation” to capture the idea that the various facts are connected under a new conception. For Mill, colligation is just description and not Induction. More specifically, Kepler collected various observations about the positions occupied by Mars, and then he inquired about what sort of curve these points would make. He did end up with an ellipse. But for Mill, this was a description of the trajectory of the planet. There is no doubt that this operation was not easy, but it was not an induction. It is no more an induction than drawing the shape of an island on a map based on observations of successive points of the coast.

What, then, is Induction? As we have already seen, Mill took Induction to involve a transition from the particular to the general. As such, it involves a generalization to the unobserved and a claim that whatever holds for the observed holds for the unobserved too. Then, the inductive move in Kepler’s first law is not the idea of an ellipse, but rather the commitment to the view that when Mars is not observed its positions lie on the ellipse; that is, the inductive claim is that Mars has described and will keep describing an ellipse. Here is how Mill put it:

The only real induction concerned in the case, consisted in inferring that because the observed places of Mars were correctly represented by points in an imaginary ellipse, therefore Mars would continue to revolve in that same ellipse; and in concluding (before the gap had been filled up by further observations) that the positions of the planet during the time which intervened between two observations, must have coincided with the intermediate points of the curve.

In fact, Kepler did not even make the induction, according to Mill, because it was known that the planets periodically return to their positions. Hence, “Knowing already that the planets continued to move in the same paths; when [Kepler] found that an ellipse correctly represented the past path, he knew that it would represent the future path.”

Part of the problem with Whewell’s approach, Mill thought, was that it was verging on idealism. He took Whewell to imply that the mind imposes the conception on the facts. For Mill, the mind simply discovers it (and hence, it describes it). Famously, Mill said that if “the planet left behind it in space a visible track,” it could be seen that it is an ellipse. So, for Mill, Whewell was introducing hypotheses by means of his idea of conception and was not describing Induction. Colligation is the method of hypothesis, he thought, and not of Induction.

Whewell replied that Kepler’s laws are based on Induction in the sense that “The separate facts of any planet (Mars, for instance) being in certain places at certain times, are all included in the general proposition which Kepler discovered, that Mars describes an ellipse of a certain form and position.” (1840: 18)

What can we make of this exchange? Mill and Whewell do agree on some basic facts about Induction. They both agree that Induction is a process that moves from particulars to the universal, from observed instances to a generalization. Mill says, “Induction may be defined the operation of discovering and forming general propositions,” and Whewell agrees with this and emphasizes that generality is essential for Induction, since only this can make Induction create excess content.

Generality is conceived of as true universality. As Mill makes clear (and he credits this thought to all those who have discussed induction in the past), Induction:

  • involves “inferences from known cases to unknown”;
  • affirms “of a class, a predicate which has been found true of some cases belonging to the class”;
  • concludes that “Because some things have a certain property, that other things which resemble them have the same property”;
  • concludes that “Because a thing has manifested a property at a certain time, that it has and will have that property at other times.”

So, inductive generalizations are spatio-temporal universalities. They extend a property possessed by some observed members of a kind to all other (unobserved or unobservable) members of the kind (in different times and different spaces); they extend a property being currently possessed by an individual to its being possessed at all times. There is no doubt that Whewell shares this view too. So where is the difference?

ii. On the Role of Mind in Inductive Inferences

The difference is in the role of the principles of connection in Induction and, concomitantly, in the role of mind in inductive inferences, and this difference is reflected in how exactly Induction is described. Whewell takes it that the only way in which the inductively arrived-at proposition is truly universal is when the Intellect provides the principle of connection (that is, the conception) of the observed instances. In other words, the principles of connection are necessary for Induction, and, since they cannot be found in experience, the Mind has to provide them. If a principle of connection is provided, and if it is the correct one, then the resulting proposition captures within itself, as it were, its true universality (aka its future extendibility). In the case of Mars, the principle of connection is that Mars describes an ellipse, that is, that an ellipse binds together “particular observations of separate places of Mars.” If Mars does describe an ellipse, or if all planets do describe ellipses, then there is no (need for) further assurance that this claim is truly universal. Its universality follows from its capturing a principle of connection between the various instances (past, present and future).

In this sense, Whewell sees Induction as a one-stage process. The observation of particulars leads the mind to search for a principle of connection (the “conception” that binds them together into a general claim about all particulars of this kind). This is where Induction ends. But Inquiry does not end there for Whewell—for further testing is necessary for finding out whether the introduced principle of connection is the correct one. Recall his point: Induction requires “an idea from within, facts from without, and a coincidence of the two.” The coincidence of the two is precisely a matter of further testing. The well-known consilience of inductions is precisely how the further testing works and secures, if successfully performed, that the principle of connection was the correct one. Consilience, Whewell argued, “is another kind of evidence of theories, very closely approaching to the verification of untried predictions.” (1849: 61) It occurs when “Inductions from classes of facts altogether different have thus jumped together,” (1840 v.2: 65) that is, when a theory is supported by facts that it was not intended to explain. His example is the theory of universal gravitation, which, though obtained by Induction from the motions of the planets, “was found to explain also that peculiar motion of the spheroidal earth which produces the Precession of the Equinoxes.” Whewell thought that the consilience of inductions is a criterion of truth, a “stamp of truth,” or, as he put it, “the point where truth resides.”
Mill objected that no predictions could prove the truth of a theory. But the important point here is that Whewell took it that the principles of connection that the Mind supplies in Induction require further proof to be accepted as true.

For Mill, there are no such principles of connection, only universal and invariant successions, and the mind has neither the power nor the inclination to find them. Indeed, there are no such connections to be found. So, Induction is, in essence, described as a two-stage process. In the first stage, there is a description of a regularity; in the second stage, there is a proper universalization, so to speak, of this regularity. The genuinely inductive “Mars’ trajectory is an ellipse” asserts a regularity. But this regularity is truly universal only if it asserts that it holds for all past, present, and future trajectories of Mars. In criticizing Whewell, Mill agreed that the assertion “The successive places of Mars are points in an ellipse” is “not the sum of the observations merely,” since the idea of an ellipse is involved in it. Still, he thought, “It was not the sum of more than the observations, as a real induction is.” That is, it rested only on the actual observations and did not extend to the unobserved positions of Mars. “It took in no cases but those which had been actually observed…There was not that transition from known cases to unknown, which constitutes Induction in the original and acknowledged meaning of the term.” (1879: 221) Differently put, the description of the regularity, according to Mill, should be something like: Mars has described an ellipse. The Inductive move should be “Mars describes an ellipse.”

What was at stake, in the end, were two rival metaphysical conceptions of the world. Not only did Whewell take it that “Metaphysics is a necessary part of the inductive movement,” (1858: vii) but he also thought the inductive movement is grounded on the existence of principles of connection in nature, which the mind (and human reason) succeeds in discovering. Mill, on the other hand, warned us against the notion of causation that is deemed, by the schools of metaphysics most in vogue at the present moment,

to imply a mysterious and most powerful tie, such as cannot, or at least does not, exist between any physical fact and that other physical fact on which it is invariably consequent, and which is popularly termed its cause: and thence is deduced the supposed necessity of ascending higher, into the essences and inherent constitution of things, to find the true cause, the cause which is not only followed by, but actually produces, the effect.

Mill was adamant that “No such necessity exists for the purposes of the present inquiry…. The only notion of a cause, which the theory of induction requires, is such a notion as can be gained from experience.” (1879: 377)

d. Early Appeals to Probability: From Laplace to Russell via Venn

i. Venn: Induction vs Probability

Induction, for John Venn (1834–1923), “involves a passage from what has been observed to what has not been observed.” (1889: 47) But the very possibility of such a move requires that Nature is such that it enables knowing the unobserved. Hence, Venn asks the key question: “What characteristics then ought we to demand in Nature in order to enable us to effect this step?” Answering this question requires a principle which is both universal (that is, it has universal applicability) and objective (that is, it must express some regularity in the world itself and not something about our beliefs).

Interestingly, Venn took this principle to be the Principle of Uniformity of Nature. But Venn was perhaps the first to associate Hume’s critique of causation with a critique of Induction and, in particular, with a critique of the status of PUN. To be sure, Venn credited Hume with a major shift in the “signification of Cause and Effect” from the once dominant account of causation as efficiency to the new account of causation as regularity. (1889:49) But this shift brought with it the question: what is the foundation of our belief in the regularity? To which Hume answered, according to Venn, by showing that the foundation of this belief is Induction based on past experience. In this setting, Venn took it that the problem of induction is the problem of establishing the foundation of the belief in the uniformity of nature.

Hence, for Venn, Hume moved smoothly from causation, to regularity, to Induction. Moreover, he took the observed Uniformity of Nature as “the ultimate logical ground of our induction” (1889: 128). And yet, the belief in the Uniformity of Nature is the result of Induction. Hume had shown that the process for extending to the future a past association of two events cannot possibly be based on reasoning, but it is instead a matter of custom or habit. (op.cit. 131)

Venn emphatically claims there is no logical solution to the problem of uniformity. And yet, this is no cause for despair. For inductive reasoning requires taking the Uniformity of Nature as a postulate: “It must be assumed as a postulate, so far as logic is concerned, that the belief in the Uniformity of Nature exists.” (op.cit. 132) This postulate of Uniformity (same antecedents are followed by the same consequents) finds its natural expression in the Law of Causation (same cause, same effect). The Law of Causation captures “a certain permanence in the order of nature.” This permanence is “clearly essential” if we are to generalize from the observed to the unobserved. Hence, “The truth of the law [of causation] is clearly necessary to enable us to obtain our generalisations: in other words, it is necessary for the Inductive part of the process.” (1888: 212)

These inductively-established generalizations are deemed the laws of nature. The laws are regularities; they suggest that some events are “connected together in a regular way.” Induction enables the mind to move from the known to the unknown and hence to acquire knowledge of new facts. As Venn put it:

[The] mind […] dart[s] with its inferences from a few facts completely through a whole class of objects, and thus [it] acquire[s] results the successive individual attainment of which would have involved long and wearisome investigation, and would indeed in multitudes of instances have been out of the question. (1888: 206)

The intended contrast here is between inductive generalizations and next-instance inductions. There are obviously two routes to the conclusion that the next raven will be black, given that all observed ravens have been black. The first is to start with the observed ravens being black and, passing through the generalization that All ravens are black, to conclude that the next raven will be black. The other route is to start with the observed ravens being black and to conclude directly that the next raven will be black. Hence, we can always avoid the generalization and “make our inference from the data afforded by experience directly to the conclusion,” namely, to the next instance. And though Venn adds that “It is a mere arrangement of convenience” (1888: 207) to pass through the generalization, the convenience is so extreme that generalization is forced upon us when we infer from past experience. The inductive generalizations are not established “with absolute certainty, but with a degree of conviction that is of the utmost practical use” (1888: 207). Nor is the existence of laws of nature “a matter of a priori necessity.” (op.cit.)

Now, following Laplace, Venn thought there is a link between Induction and probability, though he did think that “Induction is quite distinct from Probability,” the latter being, by and large, a mathematical theory. Yet, “[Induction] co-operates [with probability] in almost all its inferences.”

To see the distinction Venn has in mind, we first have to take into account the difference between establishing a generalization and drawing conclusions from it. Following Mill, Venn argued that the first task requires Induction, while the second requires logic. Now suppose the generalization is universal: all As are B. We can use logic to determine what follows from it, for instance, that the next A will be B. But not all generalizations are universal. There are generalizations which assert that a “certain proportion” prevails “among the events in the long run.” (1888: 18) These are what are today called statistical generalizations. Venn thinks of them as expressing “proportional propositions” and claims that probability is needed to “determine what inferences can be made from and by them” (1888: 207). The key point, then, is this: no matter whether a generalization is universal or statistical, it has to rely on the Principle of Uniformity of Nature. For only the latter can render valid the claim that either the regular succession found so far among factors A and B or the statistical correlation found so far among A and B is stable and can be extended to the unobserved As and Bs.

That is a critical point. Take the very sad fact that Venn refers to, namely, that three out of ten infants die in their first four years. It is a matter for Induction, and not for Probability, Venn says, to examine whether the available evidence justifies the generalization that All infants die in that proportion.

Venn distanced himself from those, like Laplace, who thought of a tight link between probability and Induction. He took issue with Laplace’s attempt to forge this tight link by devising a probabilistic rule of Induction, namely, what Venn dubbed the “Rule of Succession.” He could not be more upfront: “The opinion therefore according to which certain Inductive formulae are regarded as composing a portion of Probability, and which finds utterance in the Rule of Succession (…) cannot, I think, be maintained.” (1888: 208)

ii. Laplace: A Probabilistic Rule of Induction

Now, the mighty Pierre-Simon, Marquis de Laplace (1749–1827) published in 1814 a book titled A Philosophical Essay on Probabilities, in which he developed a formal mathematical theory of probability based, roughly put, on the idea that, given a partition of a space of events, equiprobability is equipossibility. This account, which defined probability as the degree of ignorance in the occurrence of the event, became known as the classical interpretation of probability. (For a presentation of this interpretation and its main problems, see the IEP article on Probability and Induction.) For the time being, it is worth stressing that part of the motivation for developing the probability calculus was to show that Induction, “the principal means for ascertaining truth,” is based on probability. (1814: 1)

In his attempt to put probability into Induction, Laplace put forward an inductive-probabilistic rule, which Venn called the “Rule of Succession.” It was a rule for the estimation of the probability of an event, given a past record of failures and successes in the occurrence of that event-type:

An event having occurred successively any number of times, the probability that it will happen again the next time is equal to this number increased by unity divided by the same number, increased by two units. (1814: 19)

The rule tells us how to calculate the conditional probability (see section 6.1) of an event occurring, given the evidence that the same event (type) has occurred N times in a row in the past. This probability is:

(N+1)/(N+2).

In the more general case where an event has occurred N times and failed to occur M times in the past, the probability of a subsequent occurrence is:

(N+1)/(N+M+2).

(See Keynes 1921: 423.)
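The two formulas above can be checked with a few lines of exact arithmetic. The following is only a minimal sketch: the function names `rule_of_succession` and `odds_in_favor` are illustrative labels, not anything found in Laplace, Venn, or Keynes.

```python
from fractions import Fraction


def rule_of_succession(successes: int, failures: int = 0) -> Fraction:
    """Probability of a further occurrence after N occurrences and M failures.

    Implements (N+1)/(N+M+2); with M = 0 this reduces to (N+1)/(N+2).
    """
    if successes < 0 or failures < 0:
        raise ValueError("counts must be non-negative")
    return Fraction(successes + 1, successes + failures + 2)


def odds_in_favor(p: Fraction) -> Fraction:
    """Convert a probability p < 1 into odds in favor, p / (1 - p)."""
    return p / (1 - p)


# A single observed occurrence (N = 1, M = 0).
single_case = rule_of_succession(1)
print(single_case)                 # 2/3
print(odds_in_favor(single_case))  # 2, that is, odds of 2 to 1

# Three occurrences and one failure (N = 3, M = 1).
print(rule_of_succession(3, 1))    # 2/3
```

Using exact fractions rather than floating-point numbers keeps the link between the probability and the corresponding odds transparent.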

The derivation of the rule in mathematical probability theory is based on two assumptions. The first one is that the only information available is that related to the number of successes and failures of the event examined. And the second one is what Venn called the “physical assumption that the universe may be likened to…a bag” of black and white balls from which we draw independently (1888: 197), that is, the success or failure in the occurrence of an event has no effect on subsequent tests of the same event.
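One standard way of reconstructing the derivation is sketched below. It reads the first assumption as a uniform prior over the unknown chance p of the event (this reading is an assumption of the sketch, not something stated in those terms above) and the second as independence of the trials given p. With N occurrences and M failures:

```latex
\[
P(\text{next occurrence} \mid N, M)
  = \frac{\int_0^1 p^{N+1}(1-p)^{M}\,dp}{\int_0^1 p^{N}(1-p)^{M}\,dp}
  = \frac{(N+1)!\,M!\,/\,(N+M+2)!}{N!\,M!\,/\,(N+M+1)!}
  = \frac{N+1}{N+M+2}.
\]
```

The middle step uses the identity \(\int_0^1 p^{a}(1-p)^{b}\,dp = a!\,b!/(a+b+1)!\) for non-negative integers a and b. Setting M = 0 gives (N+1)/(N+2).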

Famously, Laplace applied the rule to calculate the probability of the sun rising tomorrow, given the history of observed sunrises, and concluded that it is extremely likely that the sun will rise tomorrow:

Placing the most ancient epoch of history at five thousand years ago, or at 1826213 days, and the sun having risen constantly in the interval at each revolution of twenty-four hours, it is a bet of 1826214 to one that it will rise again tomorrow. (1814: 19)
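The arithmetic behind the bet can be spelled out as follows. The sketch takes N = 1826213 days, which is what five thousand years of about 365.2426 days come to (the year length is an assumption of this sketch) and what the stated odds of 1826214 to one presuppose:

```latex
\[
N = 5000 \times 365.2426 \approx 1826213, \qquad
P = \frac{N+1}{N+2} = \frac{1826214}{1826215}, \qquad
\frac{P}{1-P} = \frac{1826214/1826215}{1/1826215} = 1826214 .
\]
```

So the odds in favor of another sunrise are N+1 to one, exactly as the rule prescribes.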

Venn claimed that “It is hard to take such a rule as this seriously.” (1888: 197) The basis of his criticism is that we cannot have a good estimate of the probability of a future recurrence of an event if the event has happened just a few times, and still less if it has happened just once. However, the rule of succession suggests that on the first occasion the odds are 2 to 1 in favor of the event’s recurrence. Commenting on an example suggested by Jevons, Venn claimed that saying anything about the event’s recurrence requires more information than is available from observing its first occurrence alone:

For instance, Jevons (Principles of Science p. 258) says “Thus on the first occasion on which a person sees a shark, and notices that it is accompanied by a little pilot fish, the odds are 2 to 1 that the next shark will be so accompanied.” To say nothing of the fact that recognizing and naming the fish implies that they have often been seen before, how many of the observed characteristics of that single ‘event’ are to be considered essential? Must the pilot precede; and at the same distance? Must we consider the latitude, the ocean, the season, the species of shark, as matter also of repetition on the next occasion? and so on. (1888: 198 n.1)

Thus, he concluded that “I cannot see how the Inductive problem can be even intelligibly stated, for quantitative purposes, on the first occurrence of any event.” (1888: 198, n.1)

In a similar vein, Keynes pointed out “the absurdity of supposing that the odds are 2 to 1 in favor of a generalization based on a single instance—a conclusion which this formula would seem to justify.” (1921: 29 n.1) However, his criticism, as we shall see, goes well beyond noticing the problem of the single case.

iii. Russell’s Principle of Induction

Could Induction not be defended on synthetic a priori grounds? This was attempted by Russell (1912) in his famous The Problems of Philosophy. He took the Principle of Induction to assert the following: (1) the greater the number of cases in which A has been found associated with B, the more probable it is that A is always associated with B (if no instance is known of A not associated with B); (2) a sufficient number of cases of association between A and B will make it nearly certain that A is always associated with B.

Clearly, thus stated, the Principle of Induction cannot be refuted by experience, even if an A is actually found not to be followed by B. But neither can it be proved on the basis of experience. Russell’s claim was that without a principle like this, science is impossible and that this principle should be accepted on the ground of its intrinsic evidence. Russell, of course, said this in a period in which the synthetic a priori was still a live option. But, as Keynes observed, Russell’s Principle of Induction requires that the Principle of Limited Variety holds. Though synthetic, this last principle is hardly a priori.

5. Non-Probabilistic Approaches

a. Induction and the Meaning of Rationality

P. F. Strawson discussed the Problem of Induction in the final section of his Introduction to Logical Theory (1952), entitled “The ‘Justification’ of Induction.” After arguing that any attempt to justify Induction in terms of deductive standards is not viable, he went on to argue that the inductive method is the standard of rationality when we reason from experience.

Strawson invited us to consider “the demand that induction shall be shown to be really a kind of deduction.” (1952: 251) This demand stems from considering the ideal of rationality in terms of deductive standards as realized in formal logic. Thus, to justify Induction, one should show its compliance with these standards. He examined two attempts along this line of thought, both of which he found problematic. The first consists in finding “the supreme premise of inductions” that would turn an inductive argument into a deductive one. What would be the logical status of such a premise, he wondered? If the premise were a non-necessary proposition, then the problem of justification would reappear in a different guise. If it were a necessary truth that, along with the evidence, would yield the conclusion, then there is no need for it since the evidence would entail the conclusion by itself without the need of the extra premise and the problem would disappear. A second (more sophisticated) attempt to justify Induction on deductive grounds rests on probability theory. In this case, the justification takes the form of a mathematical theorem. However, Strawson points out that mathematical modelling of an inductive process requires assumptions that are not of a mathematical nature, and they need, in turn, to be justified. Hence, the problem of justification is simply moved instead of being solved. As Strawson commented, “This theory represents our inductions as the vague sublunary shadows of deductive calculations we cannot make.” (1952: 256)

Strawson’s major contribution to the problem is related to the conceptual clarification of the meaning of rationality: what do we mean by being rational when we argue about matters of fact? If we answer that question we can (dis-)solve the problem of the rational justification of Induction, since the rationality of Induction is not a “fact about the constitution of the world. It is a matter of what we mean by the word ‘rational’….” (1952: 261) We suggest the following reconstruction of Strawson’s argument: (1952: 256-257)

(1) If someone is rational, then they “have a degree of belief in a statement which is proportional to the strength of the evidence in its favour.”

(2) If someone has “a degree of belief in a statement which is proportional to the strength of the evidence in its favour,” then they have a degree of belief in a generalization as high as “the number of favourable instances, and the variety of circumstances in which they have been found, is great.”

(3) If someone has a degree of belief in a generalization as high as “the number of favourable instances, and the variety of circumstances in which they have been found, is great,” then they apply inductive methodology.

Therefore,

(C) If someone is rational, then they apply inductive methodology.

According to Strawson, all three premises in the above reconstruction are analytic propositions stemming from the definition of rationality, its application to the case of a generalization, and, finally, our understanding of Induction. Hence, that Induction exemplifies rationality when arguing about matters of fact is an inevitable conclusion. Of course, this does not mean that Induction is always successful, that is, the evidence may not be sufficient to assign a high degree of belief to the generalization.
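The validity of the reconstruction, as distinct from the status of its premises, is a matter of chaining three conditionals. A minimal sketch in Lean 4 makes this explicit; the proposition names R, P, G and I are hypothetical labels for “is rational,” “proportions belief to the evidence,” “has a correspondingly high degree of belief in well-supported generalizations,” and “applies inductive methodology.”

```lean
-- Propositional skeleton of the reconstruction (labels are illustrative).
theorem rational_implies_inductive {R P G I : Prop}
    (h1 : R → P)   -- premise (1)
    (h2 : P → G)   -- premise (2)
    (h3 : G → I)   -- premise (3)
    : R → I :=     -- conclusion (C)
  fun hr => h3 (h2 (h1 hr))
```

The proof uses nothing beyond function composition, so whatever force the argument has lies in the claim that each premise is analytic.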

When it comes to the success of Induction, Strawson claimed that to deem successful a method of prediction about the unobserved, Induction is required, since the success of any method is justified in terms of a past record of successful predictions. Thus, the proposition “any successful method of finding about the unobserved is justified by induction” is an analytic proposition, and “Having, or acquiring, inductive support is a necessary condition of the success of a method.” (1952: 259)

However, those who discuss the success of induction have in mind something quite different. To consider Induction a successful method of inference, the premises of an inductive argument should confer a high degree of belief on its conclusion. But this is not something that should be taken for granted. In a highly disordered, chancy world, the favorable cases for a generalization may be comparable with the unfavorable. Thus, there would be no strong evidence for the conclusion of an inductive argument. (1952: 262) Hence, assumptions that guarantee the success of Induction need to be imposed if Induction is to be considered a successful method. Such conditions, Strawson claimed, are factual, not necessary, truths about the universe. Given a past record of successful predictions about the unobserved, such factual claims are taken to have a good inductive support and speak for the following claim: “[The universe is such that] induction will continue to be successful.” (1952: 261)

Nevertheless, Strawson insisted that we should not confuse the success of Induction with its being rational; hence, it would be utterly senseless and absurd to attempt to justify the rationality of Induction in terms of its being successful. To Strawson, that Induction is rational is an analytic truth, known a priori and independently of our ability to predict unobserved facts successfully. Making successful predictions about the unobserved, by contrast, rests on contingent facts about the world, which can be inductively supported but cannot fortify or impair the rationality of Induction. Thus, Strawson concludes, questions of the following sort: “Is the universe such that inductive procedures are rational?” or “what must the universe be like in order for inductive procedures to be rational?”, are confused and senseless, on a par with statements like “The uniformity of nature is a presupposition of the validity of induction.” (1952: 262) In this way, Strawson explains the emergence of the Problem of Induction as a result of a conceptual misunderstanding.

b. Can Induction Support Itself?

Can there be an inductive justification of Induction? For many philosophers the answer is a resounding NO! The key argument for this builds on the well-known sceptical challenge: subject S asserts that she knows that p, where p is some proposition. The sceptic asks her: how do you know that p? S replies: because I have used criterion c (or method m, or whatever). The sceptic then asks: how do you know that criterion c (or whatever) is sufficient for knowledge? It is obvious that this strategy leads to a trilemma: either infinite regress (S replies: because I have used another criterion c’), or circularity (S replies: because I have used criterion c itself) or dogmatism (S replies: because criterion c is sufficient for knowledge). So, the idea is that if Induction is used to vindicate Induction, this move would be infinitely regressive, viciously circular, or merely dogmatic.

What would such a vindication be like? It would rest on what Max Black has called self-supporting inductive arguments (1958). Roughly put, the argument would be: Induction has led to true beliefs in the past (or so far); therefore Induction is reliable, where reliability, in the technical epistemic conception, is a property of a rule of inference such that if it is fed with true premises, it tends to generate true conclusions. So:

(I) Induction has yielded true conclusions in the past; therefore, Induction is likely to work in the future, and hence to be reliable.

A more exact formulation of this argument would use as premises lots of successful individual instances of Induction and would conclude (by a meta-induction or a second-order Induction) the reliability of Induction simpliciter. Or, as Black put it, about a rule of Induction R:

In most instances of the use of R in arguments with true premises examined in a wide variety of conditions, R has been successful. Hence (probably): In the next instance to be encountered of the use of R in an argument with a true premise, R will be successful. The rule of inductive inference R is the following: “Most instances of A’s examined in a wide variety of conditions have been B; hence (probably) The next A to be encountered will be B.” (1958: 719-20)

Arguments such as these have been employed by many philosophers, such as Braithwaite (1953), van Cleve (1984), Papineau (1992), Psillos (1999), and others. What is wrong with them? There is an air of circularity in them, since the rule R is employed in an argument which concludes that R is trustworthy or reliable.

i. Premise-Circularity vs Rule-Circularity

In his path-breaking work, Richard Braithwaite (1953) distinguished between two kinds of circularity: premise-circularity and rule-circularity.

“Premise-circular” describes an argument such that its conclusion is explicitly one of its premises. Suppose you want to prove P, and you deploy an argument with P among its premises. This would be a viciously circular argument. The charge of vicious circularity is an epistemic charge—a viciously circular argument has no epistemic force: It cannot offer reasons to believe its conclusion, since it presupposes it; hence, it cannot be persuasive. Premise-circularity is vicious! But (I) above (even in the rough formulation offered) is not premise-circular.

There is, however, another kind of circularity. This, as Braithwaite put it, “is the circularity involved in the use of a principle of inference being justified by the truth of a proposition which can only be established by the use of the same principle of inference” (1953: 276). It can be called rule-circularity. In general, an argument has a number of premises, P1,…,Pn. Qua argument, it rests on (employs/uses) a rule of inference R, by virtue of which a certain conclusion Q follows. It may be that Q has a certain content: it asserts or implies something about the rule of inference R used in the argument, in particular, that R is reliable. So, rule-circular arguments are such that the argument itself is an instance, or involves essentially an application, of the rule of inference whose reliability is asserted in the conclusion.

If anything, (I) is rule-circular. Is rule-circularity vicious? Obviously, rule-circularity is not premise-circularity. But, one may wonder, is it still vicious in not having any epistemic force? This issue arises already when it comes to the justification of deductive logic. In the case of the justification of modus ponens (or any other genuinely fundamental rule of logic), if logical scepticism is to be avoided, there is only rule-circular justification. Indeed, any attempt to justify modus ponens by means of an argument has to employ modus ponens itself (see Dummett 1974).

ii. Counter-Induction?

But, one may wonder, could any mode of reasoning (no matter how crazy or invalid) not be justified by rule-circular arguments? A standard worry is that a rule-circular argument could be offered in defense of “counter-induction.” This moves from the premise that “Most observed As are B” to the conclusion “The next A will be not-B.” A “counter-inductivist” might support this rule by the following rule-circular argument: since most counter-inductions so far have failed, conclude, by counter-induction, that the next counter-induction will succeed.

The right reply here is that the employment of rule-circular arguments rests on or requires the absence of specific reasons to doubt the reliability of a rule of inference. We can call this the Fair-Treatment Principle: a doxastic/inferential practice is innocent until proven guilty. This puts the onus on those who want to show guilt. The rationale for this principle is that justification has to start from somewhere and there is no other point to start apart from where we currently are, that is, from our current beliefs and inferential practices. Accordingly, unless there are specific reasons to doubt the reliability of induction, there is no reason to forego its uses in justificatory arguments. Nor is there reason to search for an active justification of it. Things are obviously different with counter-induction, since there are plenty of reasons to doubt its reliability, the chief being that counter-inductions have typically led to false conclusions.
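The contrast in track records can be illustrated with a toy simulation. This is only a sketch under strong assumptions: the “world” below is a hypothetical, highly regular sequence in which nine out of every ten As are B, and the function names are illustrative. It shows what a poor track record for counter-induction looks like in such a world; it does not settle the philosophical question.

```python
def track_record(world, predict):
    """Count correct next-instance predictions made by `predict` along `world`."""
    hits = 0
    for i in range(1, len(world)):
        # Predict instance i from the instances observed before it.
        if predict(world[:i]) == world[i]:
            hits += 1
    return hits


def induction(observed):
    # Predict B (True) when at least half of the observed As have been B.
    return 2 * sum(observed) >= len(observed)


def counter_induction(observed):
    # Predict the opposite of what the inductive rule predicts.
    return not induction(observed)


# Hypothetical world: every tenth A fails to be B, all the others are B.
world = [(i % 10) != 9 for i in range(100)]

inductive_hits = track_record(world, induction)
counter_hits = track_record(world, counter_induction)
print(inductive_hits, counter_hits)  # 89 10
```

In a world assumed to be this regular, the inductive rule succeeds in 89 of 99 predictions and the counter-inductive rule in 10, which is the kind of specific evidence against a rule that the Fair-Treatment Principle asks for.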

It may be objected that we have no reasons to rely on certain inferential rules. But this is not quite so. Our basic inferential rules (including Induction, of course) are rules we value. And we value them because they are our rules, that is, rules we employ and rely upon to form beliefs. Part of the reason why we value these rules is that they have tended to generate true beliefs; hence, we have some reason to think they are reliable, or at least more reliable than competing rules (say counter-induction).

Rule-circularity is endemic in any kind of attempt to justify basic methods of inference and basic cognitive processes, such as perception and memory. In fact, as Frank Ramsey noted, it is only via memory that we can examine the reliability of memory (1926). Even if we were to carry out experiments to examine it, we would still have to rely on memory: we would have to remember their outcomes. But there is nothing vicious in using memory to determine and enhance the degree of accuracy of memory, for there is no reason to doubt its general reliability and we have some reasons to trust it.

If epistemology is not to be paralysed, if inferential scepticism is not to be taken as the default reasonable position, we have to rely on rule-circular arguments for the justification of basic methods and cognitive processes.

c. Popper Against Induction

In the first chapter of the book Objective Knowledge: An evolutionary approach, Popper presented his solution of the Problem of Induction. His reading of Hume distinguished between the logical Problem of Induction (1972: 4),

HL: Are we justified in reasoning from [repeated] instances of which we have experience to other instances [conclusion] of which we have no experience?

and the psychological Problem of Induction,

HPs: Why, nevertheless, do all reasonable people expect, and believe, that instances of which they have no experience will conform to those of which they have experience? That is, Why do we have expectations in which we have great confidence?

Hume, Popper claimed, answered the logical problem in the negative—no number of observed instances can justify unobserved ones—while he answered the psychological problem positively—custom and habit are responsible for the formation of our expectations. In this way, Popper observes, a huge gap is opened up between rationality and belief formation and, thus, “Hume (…) was turned into a sceptic and, at the same time, into a believer: a believer in an irrationalist epistemology.” (ibid.)

In his own attempt to solve the logical Problem of Induction, Popper suggested the following three reformulations of it (1972: 7-8):

L1: Can the claim that an explanatory universal theory is true be justified by “empirical reasons”; that is, by assuming the truth of certain test statements or observation statements (which, it may be said, are “based on experience”)?

L2: Can the claim that an explanatory universal theory is true or that it is false be justified by “empirical reasons”; that is can the assumption of the truth of test statements justify either the claim that a universal theory is true or the claim that it is false?

L3: Can a preference, with respect to truth or falsity, for some competing universal theories over others ever be justified by such “empirical reasons”?

Popper considers L2 to be a generalization of L1 and L3 an equivalent formulation of L2. In addition, Popper’s formulations of the logical problem, L1–L3, differ from his original formulation of the Humean problem, HL, since, in L1–L3, the conclusion is an empirical generalization and the premises are “observation” or “test” statements, as opposed to instances of experience (1972: 12). In deductive logic, the truth of a universal statement cannot be established by any finite number of true observation or test statements. However, Popper, in L2, added an extra disjunct so as to treat the falsity of universal statements on empirical grounds. He can then point out that a universal statement can always be falsified by a test statement (1972: 7). Hence, by the very (re)formulation of the logical Problem of Induction, as in L2, in such a way as to include both the (impossible) verification of a universal statement and its (possible) falsification, Popper thinks he has “solved” the logical Problem of Induction. The “solution” merely states the “asymmetry between verification and falsification by experience” from the point of view of deductive logic.

After having “solved” the logical Problem of Induction, Popper applies a heuristic conjecture, called the principle of transference, to transfer the logical solution of the Problem of Induction to the realm of psychology and to remove the clash between the answers provided by Hume to the two aspects of the Problem of Induction. This principle states roughly that “What is true in logic is true in psychology.” (1972: 6) Firstly, Popper noticed that “Induction—the formation of a belief by repetition—is a myth”: people have an inborn, instinctual inclination to impose regularities upon their environment and to make the world conform with their expectation in the absence of or prior to any repetitions of phenomena. As a consequence, Hume’s answer to HPs that bases belief formation on custom and habit is considered inadequate. Having disarmed Hume’s answer to the psychological Problem of Induction, Popper applies the principle of transference to align logic and psychology in terms of the following problem and answer:

Ps1: If we look at a theory critically, from the point of view of sufficient evidence rather than from any pragmatic point of view, do we always have the feeling of complete assurance or certainty of its truth, even with respect to the best-tested theories, such as that the sun rises every day? (1972: 26)

Popper’s answer to Ps1 is negative: the feeling of certainty we may experience is not based on evidence; it has its source in pragmatic considerations connected with our instincts and with the assurance of an expectation that one needs to engage in goal-oriented action. The critical examination of a universal statement shows that such a certainty is not justified, although, for pragmatic reasons related to action, we may not take seriously possibilities that are against our expectations. In this way, Popper aligns his answer to the logical Problem of Induction with his treatment of its psychological counterpart.

d. Goodman and the New Riddle of Induction

In Fact, Fiction and Forecast (1955: 61ff), Goodman argued that the “old,” as he called it, Problem of Induction is a pseudo-problem based on a presumed peculiarity of Induction which, nevertheless, does not exist. Both in deduction and in Induction, an inference is correct if it conforms with accepted rules, and rules are accepted if they codify our inferential practices. Hence, we should not seek after a reason that would justify Induction in a non-circular way any more than we do so for deduction, and the noted circularity is, as Goodman says, a “virtuous” one. The task of the philosopher is to find those rules that best codify our inferential practices in order to provide a systematic description of what a valid inference is. As a result, the only problem about Induction that remains is that, contrary to deductive inference, such rules have not been consolidated. The search for such rules is what Goodman called “the constructive task of confirmation theory.”

The new riddle of Induction appeared in the attempt to explicate the relation of confirmation of a general hypothesis by a particular instance of it. It reflects the realization that the confirmation relation is not purely syntactic: while a positive instance of a generalization may confirm it, if it is a lawlike generalization, it does not bear upon its truth if it is an accidental generalization. To illustrate this fact, Goodman used the following examples: firstly, consider the statement, “This piece of copper conducts electricity” that confirms the lawlike generalization, “All pieces of copper conduct electricity.” Secondly, consider the statement, “The man in the room is a third son” that does not confirm the accidental generalization, “All men in the room are third sons.” Obviously, the difference in these examples is not couched in terms of syntax since in both cases the observation statements and the generalizations have the same syntactic form. The new riddle of Induction shows the difficulty of making the required distinction between lawlike and accidental generalizations.

Consider two hypotheses, H1 and H2, that have the form of a universal generalization: “All S is P.” Let H1 be “All emeralds are green” and H2 be “All emeralds are grue,” where “grue” is a one-place predicate defined as follows:

x is grue if and only if x is examined before time T and is green, or x is not examined before T and is blue.

At time T, both H1 and H2 are equally well confirmed by reports of observations of green emeralds made before time T. The two hypotheses differ with respect to the predictions they make about the color of the emeralds observed after time T: the predictions “The next emerald to be observed after time T is green” and “The next emerald to be observed after time T is grue” are inconsistent. In addition, it may occur that the same prediction made at a time T is equally well supported by diverse collections of evidence collected before T, as long as these collections of evidence are reflected in different hypotheses formulated in terms of appropriately constructed predicates. However, Goodman claims that “…only the predictions subsumed under law-like hypotheses are genuinely confirmed.” (1955: 74-75) Thus, to distinguish the predictions that are genuinely confirmed from the ones that are not is to distinguish between lawlike generalizations and accidental ones.
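The way the two hypotheses agree on the evidence and diverge on the prediction can be made concrete in a few lines of Python. This is a minimal sketch that assumes the standard reading of Goodman's predicate (examined before T and green, or not examined before T and blue); the cut-off year, the evidence list and the function names are hypothetical, and "not examined before T" is simplified to "first examined at or after T".

```python
T = 2030  # hypothetical cut-off time; any fixed value would do

def is_green(color, examined_at):
    return color == "green"

def is_grue(color, examined_at):
    # Assumed standard reading: examined before T and green,
    # or not examined before T and blue.
    if examined_at < T:
        return color == "green"
    return color == "blue"

# Hypothetical evidence: emeralds examined before T, all found to be green.
evidence = [("green", year) for year in range(2000, 2025)]

h1_fits = all(is_green(c, t) for c, t in evidence)  # "All emeralds are green"
h2_fits = all(is_grue(c, t) for c, t in evidence)   # "All emeralds are grue"

# Colours each hypothesis allows for an emerald first examined after T.
h1_allows = [c for c in ("green", "blue") if is_green(c, T + 1)]
h2_allows = [c for c in ("green", "blue") if is_grue(c, T + 1)]
print(h1_fits, h2_fits, h1_allows, h2_allows)  # True True ['green'] ['blue']
```

Every item of evidence satisfies both hypotheses, yet they license incompatible expectations about the first emerald examined after T.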

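The most popular suggestion is to demand that lawlike generalizations should not contain any reference to particular individuals or involve any spatial or temporal restrictions (Goodman 1955: 77). In the new riddle, the predicate “grue” used in H2 violates this criterion, since it refers to a particular time T; it is a positional predicate. Hence, one may claim that H2 does not qualify as a lawlike generalization. However, this analysis can be challenged as follows. Specify a grue-like predicate, bleen, as follows:

x is bleen if and only if x is examined before time T and is blue, or x is not examined before T and is green.

Now notice, we can define green (and blue) in terms of grue and bleen as follows:

x is green if and only if x is examined before time T and is grue, or x is not examined before T and is bleen; x is blue if and only if x is examined before time T and is bleen, or x is not examined before T and is grue.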

“Thus qualitativeness is an entirely relative matter,” concludes Goodman, “[t]his relativity seems to be completely overlooked by those who contend that the qualitative character of a predicate is a criterion for its good behavior.” (1955: 80)
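The symmetry Goodman points to can be checked exhaustively. The sketch below assumes the usual definitions (grue: examined before T and green, otherwise blue; bleen: examined before T and blue, otherwise green) and restricts attention to two colours; all function names are illustrative. It verifies that "green" and "blue", defined positionally from "grue" and "bleen", coincide with the ordinary predicates in every case.

```python
from itertools import product

def grue(color, before_T):
    return (before_T and color == "green") or (not before_T and color == "blue")

def bleen(color, before_T):
    return (before_T and color == "blue") or (not before_T and color == "green")

def green_from_grue_bleen(color, before_T):
    # "Green": examined before T and grue, or not examined before T and bleen.
    return (before_T and grue(color, before_T)) or (not before_T and bleen(color, before_T))

def blue_from_grue_bleen(color, before_T):
    # "Blue": examined before T and bleen, or not examined before T and grue.
    return (before_T and bleen(color, before_T)) or (not before_T and grue(color, before_T))

# All four combinations of colour and examination time.
cases = list(product(["green", "blue"], [True, False]))
green_ok = all(green_from_grue_bleen(c, b) == (c == "green") for c, b in cases)
blue_ok = all(blue_from_grue_bleen(c, b) == (c == "blue") for c, b in cases)
print(green_ok, blue_ok)  # True True
```

From the standpoint of a speaker who takes "grue" and "bleen" as primitive, it is "green" that carries the reference to T, which is why positionality alone cannot separate the two hypotheses.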

Goodman solves the problem in terms of the entrenchment of a predicate. Entrenchment measures the size of the past record of hypotheses formulated using a predicate that have actually been projected, that is, adopted after some of their instances have been examined and found true. Hence, the predicate “grue” is less entrenched than the predicate “green,” since it has not been used to construct hypotheses licensing predictions about as yet unexamined objects as many times as “green.” Roughly, Goodman’s idea is that lawlike or projectible hypotheses use only well-entrenched predicates. On this account, only hypothesis H1 is lawlike or projectible and not H2, and only H1 can be confirmed in the light of evidence.

Goodman’s account of lawlikeness is pragmatic, since it rests on the use of predicates in language, and so is the suggested solution to his new riddle, which is restricted to universal hypotheses. Entrenchment has been criticized as an imprecise concept, “a crude measure” says Teller (1969), which has not been properly defined. Anyone who attempts to measure entrenchment faces the problem of dealing with two predicates having the same extension but different past records of actual projections: although the two predicates apply to exactly the same things, their records of projection differ. Finally, entrenchment seems to suggest an excessively conservative policy for scientific practice that undermines the possibility of progress, since no new predicate would be well-entrenched on the basis of past projections, and “Science could never confirm anything new.” (ibid.)

6. Reichenbach on Induction

a. Statistical Frequencies and the Rule of Induction

Hans Reichenbach distinguished between classical and statistical Induction, with the first being a special case of the latter. Classical Induction is what is ordinarily called Induction by enumeration, where an initial section of a given sequence of objects or events is found to possess a given attribute, and it is assumed that the attribute persists in the entire sequence. On the other hand, statistical Induction does not presuppose the uniform appearance of an attribute in any section of the sequence. In statistical Induction it is assumed that in an initial section of a sequence, an attribute is manifested with relative frequency f, and we infer that “The relative frequency observed will persist approximately for the rest of the sequence; or, in other words, that the observed value represents, within certain limits of exactness, the value of the limit for the whole sequence.” (1934: 351) Classical Induction as a special case of statistical induction results for f = 1.

Consider a sequence of events or objects and an attribute A, which is exhibited by some events of the sequence. Suppose that you flip a coin several times forming a sequence of “Heads” (H) and “Tails” (T), and you focus your attention on the outcome H.

H H T T H T T T H …

By examining the first six elements of the sequence you can calculate the relative frequency of exhibiting H in the six flips by dividing the number of H, that is, three, by the total number of trials, that is, six: hence,

$$f_6 = \frac{3}{6} = \frac{1}{2}.$$

Generally, by inspecting the first n elements of the sequence, we may calculate the relative frequency,

$$f_n = \frac{\text{number of the first } n \text{ elements that exhibit } A}{n}.$$

In this way, we may define a mathematical sequence, {fn}n∈ℕ, with elements fn representing the relative frequency of appearance of the attribute A in the first n elements of the sequence of events. In the coin-flipping example we have:

n | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Outcome | H | H | T | T | H | T | T | T | H
fn | 1 | 1 | 2/3 | 2/4 | 3/5 | 3/6 | 3/7 | 3/8 | 4/9
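The running relative frequencies can be reproduced mechanically. The following is a minimal Python sketch; the function name `relative_frequencies` and the encoding of the outcomes as a string are illustrative choices, not part of Reichenbach's text. Exact fractions are used, so values appear in lowest terms (for instance 1/2 rather than 2/4).

```python
from fractions import Fraction

def relative_frequencies(outcomes, attribute="H"):
    """Return [f_1, ..., f_N], where f_n is the share of `attribute`
    among the first n outcomes."""
    freqs = []
    hits = 0
    for n, outcome in enumerate(outcomes, start=1):
        if outcome == attribute:
            hits += 1
        freqs.append(Fraction(hits, n))
    return freqs

flips = "HHTTHTTTH"  # the nine flips of the example
freqs = relative_frequencies(flips)
for n, f in enumerate(freqs, start=1):
    print(n, flips[n - 1], f)
```

After nine flips the sequence contains four heads, so the last value printed is 4/9.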

 

According to Reichenbach (1934: 445), the rule or principle of Induction makes the following posit (for the concept of posit, see below):

For any given δ > 0, no matter how small we choose it,

$$|f_n - f_{n_0}| < \delta$$

for all n > n0.

To apply the rule of Induction to the coin-flipping example we need to fix a δ, say δ = 0.05, and to conjecture at each trial n0 the relative frequency of H for the flips n > n0 to a δ-degree of approximation.

n | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Outcome | H | H | T | T | H | T | T | T | H
fn | 1 | 1 | 2/3 | 2/4 | 3/5 | 3/6 | 3/7 | 3/8 | 4/9
Conjectured fn | 1 ± 0.05 | 1 ± 0.05 | 2/3 ± 0.05 | 2/4 ± 0.05 | 3/5 ± 0.05 | 3/6 ± 0.05 | 3/7 ± 0.05 | 3/8 ± 0.05 | 4/9 ± 0.05
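Whether a given posit survives the flips that follow it can be checked directly. The sketch below is an illustration only: it reads the posit made at trial n0 as the claim that every later relative frequency stays strictly within δ of the one observed at n0, and the helper names `running_frequencies` and `first_refutation` are hypothetical.

```python
from fractions import Fraction

def running_frequencies(outcomes, attribute="H"):
    hits, freqs = 0, []
    for n, outcome in enumerate(outcomes, start=1):
        hits += outcome == attribute
        freqs.append(Fraction(hits, n))
    return freqs

def first_refutation(outcomes, n0, delta, attribute="H"):
    """Return the first n > n0 with |f_n - f_n0| >= delta, or None if the
    posit made at n0 survives all the outcomes observed so far."""
    freqs = running_frequencies(outcomes, attribute)
    posited = freqs[n0 - 1]
    for n in range(n0 + 1, len(freqs) + 1):
        if abs(freqs[n - 1] - posited) >= delta:
            return n
    return None

flips = "HHTTHTTTH"
delta = Fraction(5, 100)
for n0 in range(1, len(flips) + 1):
    print(n0, first_refutation(flips, n0, delta))
```

On this short sequence the posit made at the third flip (2/3 ± 0.05) is refuted at the fourth, where the relative frequency is 1/2. A result of `None` means only that the posit has not yet been refuted by the available data, not that it is true.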

 

The sequence of relative frequencies, {fn}n∈ℕ, may converge to a limiting relative frequency p or not. This limiting relative frequency, if it exists, expresses the probability of occurrence of attribute A in this sequence of events, according to the frequency interpretation of probability. For a fair coin in the coin-flipping experiment, the sequence of relative frequencies converges to p = ½. Generally, however, we do not know whether such a limit exists, and it is non-trivial to assume its existence. Reichenbach formulated the rule of induction in terms of such a limiting frequency (for further discussion, consult the Appendix):

Rule of Induction. If an initial section of n elements of a sequence $x_i$ is given, resulting in the frequency $f_n$, and if, furthermore, nothing is known about the probability of the second level for the occurrence of a certain limit p, we posit that the frequency $f_i$ (i > n) will approach a limit p within $f_n \pm \delta$ when the sequence is continued. (1934: 446)

Two remarks are in order here. The first is about Reichenbach’s reference to “probability of the second level.” He examined higher-level probabilities in Ch. 8 of his book on probability theory. If the first-level probabilities are limits of relative frequencies in a given sequence of events, expressing the probability of an attribute to be manifested in this sequence, second-level probabilities refer to different sequences of events, and they express the probability of a sequence of events to exhibit a particular limiting relative frequency for that attribute. By means of second-level probabilities, Reichenbach discussed probability implications that have, as a consequent, a probability implication. In the example of coin flips, this would amount to having an infinite pool of coins, not all of which are fair. The probability of picking out a coin with a limiting relative frequency of ½ to bring “Heads” is a second-level probability. In the Rule of Induction, it is assumed that we have no information about “the probability of the second level for the occurrence of a certain limit p” and the posit we make is a blind one (1934: 446); namely, we have no evidence to know how good it is.

Secondly, it is worthwhile to highlight the analogy with classical Induction. An enumerative inductive argument either predicts what will happen in the next occurrence of a similar event or yields a universal statement that claims what happens in all cases. Similarly, statistical Induction either predicts something about the behavior of the relative frequencies that follow the ones already observed, or it yields what corresponds to the universal claim, namely, that the sequence of frequencies as a whole converges to a limiting value that lies within certain bounds of exactness from an already calculated relative frequency.

b. The Pragmatic Justification

Reichenbach claims that the problem of justification of Induction is a problem of justification of a rule of inference. A rule does not state a matter of fact, so it cannot be proved to be true or false; a rule is a directive that tells us what is permissible to do, and it requires justification. But what did Reichenbach mean by justification?

He writes, “It must be shown that the directive serves the purpose for which it is established, that it is a means to a specific end” (1934: 24), and “The recognition of all rules as directives makes it evident that a justification of the rules is indispensable and that justifying a rule means demonstrating a means-end relation.” (1934: 25)

Feigl called this kind of justification, which is based on showing that the use of a rule is appropriate for the attainment of a goal, vindication, to distinguish it from validation, a different kind of justification that is based on deriving a rule from a more fundamental principle (Feigl 1950).

In the case of deductive inferences, a rule is vindicated if it can be proven that its application serves the purpose of truth-preservation, that is, if the rule of inference is applied to true statements, it yields a true statement. This proof is a proof of a meta-theorem. Consider, for instance, modus ponens; by applying this rule to the well-formed formulas φ, φ → ψ we get ψ. It is easy to verify that φ and φ → ψ cannot both have the value “True” while ψ has the value “False.” Reichenbach might have had this kind of justification in mind for deductive rules of inference.
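The verification mentioned here is a finite truth-table check. A minimal Python sketch, assuming the classical two-valued reading of the conditional (the helper `implies` is an illustrative name), searches for an assignment on which both premises are true and the conclusion false:

```python
from itertools import product

def implies(a, b):
    # Classical material conditional.
    return (not a) or b

# Assignments on which both premises (phi, phi -> psi) are true
# but the conclusion psi is false.
counterexamples = [
    (phi, psi)
    for phi, psi in product([True, False], repeat=2)
    if phi and implies(phi, psi) and not psi
]
print(counterexamples)  # []
```

No such assignment exists, which is what truth-preservation requires. The check presupposes the classical truth table for the conditional, so it illustrates the meta-theorem rather than giving a presupposition-free justification of the rule.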

What is the end that would justify the rule of induction as a means to it? The end is to determine within a desired approximation the limiting relative frequency of an attribute in a given sequence, if that limiting relative frequency exists: “The aim is predicting the future—to formulate it as finding the limit of a frequency is but another version of the same aim.” (1951: 246)

And, as we have seen, the rule of induction is the most effective means for accomplishing this goal: “If a limit of the frequency exists, positing the persistence of the frequency is justified because this method, applied repeatedly, must finally lead to true statements;” (1934: 472) “So if you want to find a limit of the frequency, use the inductive inference – it is the best instrument you have, because, if your aim can be reached, you will reach it that way.” (1951: 244)

Does this sort of justification presuppose that the limit of the relative frequency exists in a given sequence of events? Reichenbach says “No!”: “If [your aim] cannot be reached, your attempt was in vain; but any other attempt must also break down.” (1951: 244)

In the last two passages quoted from The Rise of Scientific Philosophy, we find Reichenbach’s argument for the justification of Induction:

  1. Either the limit of the relative frequency exists, or it does not exist.
  2. If it does exist, then, by applying the rule of induction, we can find it.
  3. If it does not exist, then no method can find it.
  4. Therefore, either we find the limit of the frequency by induction or by no method at all.

The failure of any method in premise #3 follows from the consideration that if there were a successful alternative method, then the limit of the frequency would exist, and the rule of induction would be successful too. Reichenbach does not deny in principle that methods other than induction may succeed in accomplishing the aim set in certain circumstances; what he claims is that induction is maximally successful in accomplishing this aim.

The statement that there is a limit of a frequency is synthetic, since it says something non-trivial about the world and, Reichenbach claims, “that sequences of events converge toward a limit of the frequency, may be regarded as another and perhaps more precise version of the uniformity postulate.” (1934: 473) With regard to its truth, the principle is commonly taken either as postulated and self-warranted or as inferred from other premises. If postulated, then, Reichenbach says, we are introducing in epistemology a form of synthetic a priori principles. Russell is criticized for having introduced synthetic a priori principles in his theory of probability of Induction and is called to “revise his views.” (1951: 247) On the other hand, if inferred, we are attempting to justify the principle by proving it from other statements, which may lead to circularity or infinite regress.

Reichenbach did not undertake the job of proving that inductive inference concludes true or even probable beliefs from any more fundamental principle. He was convinced that this cannot be done. (1951: 94) Instead, he claimed that knowledge consists of assertions for which we have no proof of their truth, although we treat them as true, as posits. As he put it:

The word “posit” is used here in the same sense as the word “wager” or “bet” in games of chance. When we bet on a horse we do not want to say by such a wager that it is true that the horse will win; but we behave as though it were true by staking money on it. A posit is a statement with which we deal as true, although the truth value is unknown. (1934: 373)

And elsewhere he stressed, “All knowledge is probable knowledge and can be asserted only in the sense of posits.” (1951: 246) Thus, as a posit, a predictive statement does not require a proof of its truth. And the classical problem of induction is not a problem for knowledge anymore: we do not need to prove from ‘higher’ principles that induction yields true conclusions. Since, for a posit, “All that can be asked for is a proof that it is a good posit, or even the best posit available.” (1951: 242)

Induction is justified as the instrument for making good posits:

Thesis θ. The rule of induction is justified as an instrument of positing because it is a method of which we know that if it is possible to make statements about the future we shall find them by means of this method. (1934: 475)

c. Reichenbach’s Views Criticized

One objection to Reichenbach’s vindication of Induction questions the epistemic end of finding the limit of the frequency asymptotically, since, as Keynes’s famous slogan put it, “In the long run we are all dead.” (1923: 80) What we should care about, say the critics, is to justify Induction as a means to the end of finding truth, or the correct limiting frequency, in a finite number of steps, in the short run. This is the only legitimate epistemic end, and in this respect Reichenbach’s convergence to truth has not much to say.

Everyone agrees that reaching a goal set, in a finite number of steps, would be a desideratum for any methodology. However, we should notice that any method that can be successful in the short run will be successful in the long run as well. Or, by contraposition, if a method does not guarantee success in the long run, then it will not be successful in the short run either. Hence, although success in the long run is not the optimum one could request from a method, it is still a desirable epistemic end. And Induction is the best candidate for being successful in the short run, since it is maximally successful in the long run. (Glymour 2015: 249) To stress this point, Huber made an analogy with deductive logic. Just as eternal life is impossible, so it is impossible to live in any logically possible world other than the actual one. Yet, this does not prevent us from requiring our belief system to be logically consistent, that is, to have an epistemic virtue that is defined in every logically possible world, as a minimum requirement of having true beliefs about the actual world. (Huber 2019: 211)

A second objection rests on the fact that Reichenbach’s rule of Induction is not the only rule that converges to the limit of relative frequency if the limit exists. Thus, there are many rules, actually an infinite number of rules, that are vindicated. Any rule that would posit that the limit of the relative frequency p is found within a δ-interval around $f_{n_0} + c_{n_0}$,

$$|p - (f_{n_0} + c_{n_0})| < \delta,$$

for any given δ > 0 and $c_{n_0} \to 0$ when $n_0 \to \infty$, would yield a successful prediction if the limiting frequency p existed.

For instance, let

$$c_{n_0} = \frac{1 - 2f_{n_0}}{n_0}.$$

Then in the coin-flipping example, we obtain the following different conjectures according to Reichenbach’s rule and the $c_{n_0}$-rule:

n | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Outcome | H | H | T | T | H | T | T | T | H
Conjectured fn | 1 ± 0.05 | 1 ± 0.05 | 2/3 ± 0.05 | 2/4 ± 0.05 | 3/5 ± 0.05 | 3/6 ± 0.05 | 3/7 ± 0.05 | 3/8 ± 0.05 | 4/9 ± 0.05
cn0-Conjectured fn | 0 ± 0.05 | 1/2 ± 0.05 | 5/9 ± 0.05 | 2/4 ± 0.05 | 14/25 ± 0.05 | 3/6 ± 0.05 | 22/49 ± 0.05 | 13/32 ± 0.05 | 37/81 ± 0.05
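The two families of conjectures can be computed side by side. The Python sketch below rests on an assumption that should be stated plainly: the corrective term is taken to be (1 - 2·f_n0)/n0, a form chosen here because it reproduces tabulated entries such as 5/9, 14/25, 22/49 and 13/32 and because it tends to 0 as n0 grows. The function names are illustrative.

```python
from fractions import Fraction

def running_frequencies(outcomes, attribute="H"):
    hits, freqs = 0, []
    for n, outcome in enumerate(outcomes, start=1):
        hits += outcome == attribute
        freqs.append(Fraction(hits, n))
    return freqs

def c_correction(f, n0):
    # Assumed corrective term; |c| <= 1/n0, so it vanishes as n0 grows.
    return (1 - 2 * f) / n0

flips = "HHTTHTTTH"
freqs = running_frequencies(flips)

straight = freqs  # Reichenbach's rule: c = 0 at every step
corrected = [f + c_correction(f, n0) for n0, f in enumerate(freqs, start=1)]

for n0, (s, c) in enumerate(zip(straight, corrected), start=1):
    print(n0, s, c)
```

The centres of the two conjectures differ noticeably at first (1 versus 0 after the first flip) and by at most 1/n0 thereafter, which is why both rules share the same limit whenever one exists.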

Despite the differences in the short run, the two rules converge to the same relative frequency asymptotically; hence, both rules are vindicated. Why, then, should one choose Reichenbach’s rule (cno = 0) rather than the cno-rule to make predictions?

Reichenbach was aware of the problem, and he employed descriptive simplicity to select among the rival rules. (1934: 447) According to Reichenbach, descriptive simplicity is a characteristic of a description of the available data that has no bearing on its truth. Using this criterion, we may choose among different hypotheses, not on the basis of their predictions, but on the basis of their convenience or easiness to handle: “…The inductive posit is simpler to handle.” (ibid.)

Thus, since all rules converge in the limit of empirical investigation, when all available evidence has been taken into consideration, the more convenient choice is the rule of Induction with cno = 0 for all n0 ∈ ℕ.

Huber claims that all the different rules that converge to the same limiting frequency and are associated with the same sequence of events are functionally equivalent since they serve the same end, that of finding the limit of the relative frequency. So, an epistemic agent can pick out any of these methods to attain this goal, but only one at a time. Yet, he argues, this is not a peculiar feature of Induction; the situation in deductive logic is similar. There are different systems of rules of inference in classical logic, and all of them justify the same particular inferences. Every time one uses a language, they are committed to a system of rules of inference. If one does not demand a justification of the system of rules in deductive logic, why should they require such a justification of the inductive rule? (Huber 2019: 212)

7. Appendix

This appendix shows the asymptotic and self-corrective nature of the inductive method that establishes its success and the truth of the posit made in Reichenbach’s rule of Induction for a convergent sequence of relative frequencies.

Firstly, assume that the sequence of relative frequencies $\{f_n\}_{n\in\mathbb{N}}$ is convergent to a value p. Then $\{f_n\}_{n\in\mathbb{N}}$ is a Cauchy sequence:

$$\forall \varepsilon > 0,\ \exists N(\varepsilon) \in \mathbb{N} \text{ such that } \forall n, n_0 \in \mathbb{N},\ n > n_0 > N(\varepsilon) \implies |f_n - f_{n_0}| < \varepsilon.$$

Setting ε = δ, where δ is the desired accuracy of our predictions, we conclude that there is always a number of trials, N(δ), after which our conjectured relative frequency $f_{n_0}$, for $n_0 > N(\delta)$, approximates the frequencies that will be observed, $f_n$ with $n > n_0 > N(\delta)$, to within an error of δ.

Of course, this mathematical fact does not entail that the inductive posit is necessarily true. It will be true only if the number of items inspected, $n_0$, is sufficient (that is, $n_0 > N(\delta)$) to establish deductively the truth of

$$|f_n - f_{n_0}| < \delta \quad \text{for } n > n_0.$$

In the coin-flipping example, as we see in the relevant table, for δ = 0.05, the relative frequency of H conjectured at the 3rd trial lies between 185/300 and 215/300 for every n > 3. However, at the fourth trial the conjecture is proved false, since the relative frequency is 150/300.

Now, if the posit is false, we may inspect more elements of the sequence and correct our posit. Hence, for $n_1 > n_0$ our posit may become

$$|f_n - f_{n_1}| < \delta$$

for all $n > n_1$. Again, if the new posit is false we may correct it anew, and so on. However, since $\{f_n\}_{n\in\mathbb{N}}$ is convergent, after a finite number (k + 1) of steps, for some $n_k$, our posit,

$$|f_n - f_{n_k}| < \delta$$

for all $n > n_k > N(\delta)$, will become true.

This is what Reichenbach meant when he called the inductive method self-corrective, or asymptotic:

The inductive procedure, therefore, has the character of a method of trial and error so devised that, for sequences having a limit of the frequency, it will automatically lead to success in a finite number of steps. It may be called a self-corrective method, or an asymptotic method. (1934: 446)
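The trial-and-error character of the procedure can be sketched in Python. The example is illustrative only: the strictly alternating sequence is a hypothetical input chosen because its relative frequencies visibly converge (to 1/2), a posit counts as refuted once a later relative frequency differs from it by δ or more, and the function name `self_correcting_posits` is invented for the purpose.

```python
from fractions import Fraction

def self_correcting_posits(outcomes, delta, attribute="H"):
    """Posit f_n0 +/- delta, keep the posit until some later f_n falls
    outside the interval, then re-posit at that n. Returns the list of
    trial numbers at which a posit was made."""
    hits = 0
    posit_points = []
    posited = None
    for n, outcome in enumerate(outcomes, start=1):
        hits += outcome == attribute
        f_n = Fraction(hits, n)
        if posited is None or abs(f_n - posited) >= delta:
            posited = f_n  # first posit, or correction after a refutation
            posit_points.append(n)
    return posit_points

# Hypothetical convergent sequence: strictly alternating outcomes, limit 1/2.
alternating = "HT" * 200
points = self_correcting_posits(alternating, Fraction(5, 100))
print(points)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

With δ = 0.05 the posit is corrected at each of the first ten flips and never again in the remaining 390. On finite data one can never certify that the current posit is the final one; the guarantee is only that, for a convergent sequence, the corrections stop after finitely many steps.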

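Secondly, we show that for a sequence of relative frequencies $\{f_n\}_{n\in\mathbb{N}}$ that converges to a number p, the posit that Reichenbach makes in his rule of induction is true. Namely, we will show that for every desired degree of accuracy δ > 0, there is an $N(\delta) \in \mathbb{N}$ such that for every $n > n_0 > N$, $f_n$ approaches p, which lies within $f_{n_0} \pm \delta$; that is, $|p - f_n| < \delta$ and $|p - f_{n_0}| < \delta$.

We start from the inequality,

$$|p - f_{n_0}| \le |p - f_n| + |f_n - f_{n_0}|.$$

From the convergence of $\{f_n\}_{n\in\mathbb{N}}$ it holds that

$$\exists N_1 \in \mathbb{N} \text{ such that } \forall n \in \mathbb{N},\ n > N_1 \implies |p - f_n| < \delta/2$$

and

$$\exists N_2 \in \mathbb{N} \text{ such that } \forall n, n_0 \in \mathbb{N},\ n > n_0 > N_2 \implies |f_n - f_{n_0}| < \delta/2.$$

Let $N = \max\{N_1, N_2\}$; then for every $n > n_0 > N$,

$$|p - f_{n_0}| < \delta \quad \text{and} \quad |p - f_n| < \delta/2.$$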
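The final step can be spelled out as a single chain of inequalities. This is a sketch of the standard argument, combining the triangle inequality with the two δ/2 bounds; it uses only the symbols already introduced ($N_1$, $N_2$, $N$, $f_n$, $f_{n_0}$, p).

```latex
\[
|p - f_{n_0}| \;\le\; |p - f_n| + |f_n - f_{n_0}|
\;<\; \frac{\delta}{2} + \frac{\delta}{2} \;=\; \delta,
\qquad \text{for all } n > n_0 > N = \max\{N_1, N_2\}.
\]
```

The first bound holds because $n > N_1$ and the second because $n > n_0 > N_2$, so once $n_0$ exceeds N the limit p lies within δ of the posited frequency.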

8. References and Further Reading

  • Aristotle, (1985). “On Generation and Corruption,” H. H. Joachim (trans.). In Barnes, J. (ed.). Complete Works of Aristotle v. 1: 512-555. Princeton: Princeton University Press.
  • Bacon, F., (2000). The New Organon. Cambridge: Cambridge University Press.
  • Bain, A., (1887). Logic: Deductive and Inductive. New York: D. Appleton and Company.
  • Black, M., (1958). “Self-Supporting Inductive Arguments.” The Journal of Philosophy 55(17): 718-725.
  • Braithwaite, R. B., (1953). Scientific Explanation: A Study of the Function of Theory, Probability and Law in Science. Cambridge: Cambridge University Press.
  • Broad, C. D., (1952). Ethics and The History of Philosophy: Selected essays. London: Routledge.
  • Dummett, M., (1974). “The Justification of Deduction.” In Dummett, M. (ed.). Truth and Other Enigmas. Oxford: Oxford University Press.
  • Feigl, H., (1950 [1981]). “De Principiis Non Disputandum…? On the Meaning and the Limits of Justification.” In Cohen, R.S. (ed.). Herbert Feigl Inquiries and Provocations: Selected Writings 1929-1974. 237-268. Dordrecht: D. Reidel Publishing Company.
  • Glymour, C., (2015). Thinking Things Through: An Introduction to Philosophical Issues and Achievements. Cambridge, MA: The MIT Press.
  • Goodman, N., (1955 [1981]). Fact, Fiction and Forecast. Cambridge, MA: Harvard University Press.
  • Huber, F., (2019). A Logical Introduction to Probability and Induction. Oxford: Oxford University Press.
  • Hume, D., (1739 [1978]). A Treatise of Human Nature. Selby-Bigge, L. A. & Nidditch, P. H. (eds). Oxford: Clarendon Press.
  • Hume, D., (1740 [1978]). “An Abstract of A Treatise of Human Nature.” In Selby-Bigge, L. A. & Nidditch, P. H., (eds). A Treatise of Human Nature. Oxford: Clarendon Press.
  • Hume, D., (1748 [1975]). “An Enquiry concerning Human Understanding.” In Selby-Bigge, L. A. & Nidditch, P. H., (eds). Enquiries concerning Human Understanding and concerning the Principles of Morals. Oxford: Clarendon Press.
  • Hume, D., (1751 [1975]). “An Enquiry concerning the Principles of Morals.” In Selby-Bigge, L. A. & Nidditch, P. H., (eds). Enquiries concerning Human Understanding and concerning the Principles of Morals. Oxford: Clarendon Press.
  • Jeffrey, R., (1992). Probability and the Art of Judgment. Cambridge: Cambridge University Press.
  • Kant, I., (1783 [2004]). Prolegomena to any Future Metaphysics That Will Be Able to Come Forward as Science. Revised edition. G. Hatfield (trans. and ed.). Cambridge: Cambridge University Press.
  • Kant, I., (1781-1787 [1998]). Critique of Pure Reason. Guyer, P. and Wood, A. W. (trans. and eds). Cambridge: Cambridge University Press.
  • Kant, I., (1992). Lectures on Logic. Young, J. M. (trans. and ed.). Cambridge: Cambridge University Press.
  • Keynes, J. M., (1921). A Treatise on Probability. London: Macmillan and Company.
  • Keynes, J. M., (1923). A Tract on Monetary Reform. London: Macmillan and Company.
  • Laplace, P. S., (1814 [1951]). A Philosophical Essay on Probabilities. New York: Dover Publications, Inc.
  • Leibniz, G. W. (1989). Philosophical Essays. Ariew, R. and Garber, D. (trans.). Indianapolis & Cambridge: Hackett P.C.
  • Leibniz, G. W. (1989a). Philosophical Papers and Letters. Loemker, L. (trans.), Dordrecht: Kluwer.
  • Leibniz, G. W. (1896). New Essays on Human Understanding. New York: The Macmillan Company.
  • Leibniz, G. W. (1710 [1985]). Theodicy: Essays on the Goodness of God, the Freedom of Man and the Origin of Evil. La Salle, IL: Open Court.
  • Malebranche, N. (1674-5 [1997]). The Search after Truth and Elucidations of the Search after Truth. Lennon, T. M. and Olscamp, P. J. (eds). Cambridge: Cambridge University Press.
  • Mill, J. S. (1865). An Examination of Sir William Hamilton’s Philosophy. London: Longman, Roberts and Green.
  • Mill, J. S. (1879). A System of Logic, Ratiocinative and Inductive: Being a Connected View of The Principles of Evidence and the Methods of Scientific Investigation. New York: Harper & Brothers, Publishers.
  • Papineau, D., (1992). “Reliabilism, Induction and Scepticism.” The Philosophical Quarterly 42(66): 1-20.
  • Popper, K., (1972). Objective Knowledge: An evolutionary approach. Oxford: Oxford University Press.
  • Popper, K., (1974). “Replies to My Critics.” In Schilpp, P. A., (ed.). The Philosophy of Karl Popper. 961-1174. Library of Living Philosophers, Volume XIV Book II. La Salle, IL: Open Court Publishing Company.
  • Psillos, S. (1999). Scientific Realism: How Science Tracks Truth. London: Routledge.
  • Psillos, S. (2015). “Induction and Natural Necessity in the Middle Ages.” Philosophical Inquiry 39(1): 92-134.
  • Ramsey, F., (1926). “Truth and Probability.” In Braithwaite, R. B. (ed.). The Foundations of Mathematics and other essays. London: Routledge.
  • Reichenbach, H., (1934 [1949]). The Theory of Probability: An Inquiry into the Logical and Mathematical Foundations of the Calculus of Probability. Berkeley and Los Angeles: University of California Press.
  • Reichenbach, H., (1951). The Rise of Scientific Philosophy. Berkeley and Los Angeles: University of California Press.
  • Russell, B., (1912). The Problems of Philosophy. London: Williams and Norgate; New York: Henry Holt and Company.
  • Russell, B., (1948 [1992]). Human Knowledge—Its Scope and Limits. London: Routledge.
  • Schurz, G., (2019). Hume’s Problem Solved. The Optimality of Meta-Induction. Cambridge, MA: The MIT Press.
  • Strawson, P. F., (1952 [2011]). Introduction to Logical Theory. London: Routledge.
  • Teller, P., (1969). “Goodman’s Theory of Projection.” The British Journal for the Philosophy of Science, 20(3): 219-238.
  • van Cleve, J., (1984). “Reliability, Justification, and the Problem of Induction.” Midwest Studies in Philosophy 9(1): 555-567.
  • Venn, J., (1888). The Logic of Chance. London: Macmillan and Company.
  • Venn, J., (1889). The Principles of Empirical or Inductive Logic. London: Macmillan and Company.
  • Whewell, W., (1840). The Philosophy of the Inductive Sciences, Founded Upon Their History, vol. I, II. London: John W. Parker, West Strand.
  • Whewell, W., (1858). Novum Organum Renovatum. London: John W. Parker, West Strand.
  • Whewell, W., (1849). Of Induction with especial reference to John Stuart Mill’s System of Logic. London: John W. Parker, West Strand.

 

Author Information

Stathis Psillos
Email: psillos@phs.uoa.gr
University of Athens
Greece

and

Chrysovalantis Stergiou
Email: cstergiou@acg.edu
The American College of Greece
Greece
